A new benchmarking platform, Alpha Arena, tested six leading artificial intelligence (AI) models in autonomous cryptocurrency trading in real perpetual markets on Hyperliquid.
The experiment assigned each model $10,000 in real capital and a single, identical system prompt, then let them operate without any human intervention.
The initial results after just three days were astounding. DeepSeek Chat V3.1 saw its portfolio grow by over 35%, reaching a total value of $13,502.62.
Not only did it outperform all other AI traders, but it also beat the Bitcoin "Buy & Hold" benchmark, which gained only 4%. Grok 4 ranked second with a return of 30%, while Claude Sonnet 4.5 made 28%.
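As a quick check on the headline figure, the percentage return can be recomputed from the reported starting capital and ending portfolio value. The snippet below is a minimal sketch; the function name `pct_return` is illustrative and not part of Alpha Arena.

```python
def pct_return(start_value: float, end_value: float) -> float:
    """Simple percentage return between two portfolio values."""
    return (end_value - start_value) / start_value * 100.0


start_capital = 10_000.00   # starting capital per model, as reported
deepseek_value = 13_502.62  # DeepSeek's reported portfolio value after three days
btc_hold_return = 4.0       # reported Bitcoin Buy & Hold return, in percent

deepseek_return = pct_return(start_capital, deepseek_value)
excess = deepseek_return - btc_hold_return

print(f"DeepSeek return: {deepseek_return:.2f}%")
print(f"Excess over BTC Buy & Hold: {excess:.2f} percentage points")
```

The result is consistent with the "over 35%" figure quoted above.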

The experiment, which runs until 3 November 2025, aims to assess the risk management, timing and decision-making ability of the LLMs in live market conditions.
DeepSeek's Winning Strategy
DeepSeek triumphed due to a combination of factors:
- Position Diversification and Management: Maintained long positions on all six planned assets (ETH, SOL, XRP, BTC, DOGE and BNB) with moderate leverage (10x-20x), maximising exposure to the altcoin rally that occurred on 19-20 October.
- Strict Discipline: Unlike the other models, DeepSeek consistently adhered to the "invalidation not hit → HOLD" rule, allowing profits to compound without overtrading.
- Balanced Risk Management: No single asset dominated the total returns of $2,719, a sign of solid risk allocation.
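The hold rule mentioned in the list above can be sketched as a simple decision function. This is only one possible reading of the rule, assuming each position carries a price level at which the trade thesis is considered invalid. The `Position` class, the `decide` function and the prices are hypothetical and are not taken from Alpha Arena's prompt or code.

```python
from dataclasses import dataclass


@dataclass
class Position:
    symbol: str
    side: str            # "long" or "short"
    invalidation: float  # hypothetical price level at which the thesis is wrong


def decide(position: Position, price: float) -> str:
    """Return "CLOSE" if the invalidation level has been hit, otherwise "HOLD"."""
    if position.side == "long":
        invalidated = price <= position.invalidation
    else:
        invalidated = price >= position.invalidation
    return "CLOSE" if invalidated else "HOLD"


# Illustrative prices only, not market data from the experiment.
eth_long = Position(symbol="ETH", side="long", invalidation=3800.0)
print(decide(eth_long, 3950.0))  # price above the invalidation level
print(decide(eth_long, 3790.0))  # price below the invalidation level
```

The point of such a rule is that the only trigger for action is the predefined invalidation level, so ordinary price noise does not lead to extra trades.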
The Errors of the Competitors
Not all AIs were successful. Gemini 2.5 Pro suffered the biggest loss, down 33%, due to a costly mistake: opening a short on BNB in a rising market. GPT-5 also struggled, losing 27% due to 'operational errors' such as not setting stop-losses.
Qwen3 Max, on the other hand, was far too conservative, trading only BTC and closing at -0.25%.
The Alpha Arena organisers stress that the results are purely for educational purposes, but DeepSeek's 35% gain in just 72 hours is a powerful sign of the intersection of AI and finance.
Anyone who wants to try a similar approach as a learning exercise can do so safely on testnet or paper-trading platforms, adopting the same minimalist prompt to focus on discipline and risk management.
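For readers who want to experiment without real money, the effect of leverage on a simulated long position can be tracked with a few lines of code. The sketch below is a simplified assumption: it ignores fees, funding payments and liquidation, and the `PaperPosition` class and the prices are hypothetical, not tied to any exchange API.

```python
from dataclasses import dataclass


@dataclass
class PaperPosition:
    symbol: str
    entry_price: float
    margin: float    # simulated capital committed to the trade
    leverage: float  # e.g. 10.0 for 10x

    def pnl(self, price: float) -> float:
        """Unrealised profit or loss of a long position at the given price."""
        notional = self.margin * self.leverage
        return notional * (price - self.entry_price) / self.entry_price

    def return_on_margin(self, price: float) -> float:
        """Profit or loss as a percentage of the committed margin."""
        return self.pnl(price) / self.margin * 100.0


# Illustrative numbers only: a 10x long with $1,000 of simulated margin.
pos = PaperPosition(symbol="ETH", entry_price=4000.0, margin=1000.0, leverage=10.0)
for price in (4200.0, 3800.0):  # a 5% rise and a 5% fall
    print(f"{price:.0f}: PnL {pos.pnl(price):+.2f}, "
          f"return on margin {pos.return_on_margin(price):+.1f}%")
```

At 10x leverage, a 5% price move in either direction becomes a 50% gain or loss on the committed margin, which is why the leverage range cited above amplifies both rallies and mistakes.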