AI Model Intelligence
Benchmarks, pricing, and capabilities for today's leading AI models. Curated by DVNx with AI-generated analysis, updated daily.
Last updated: 25 May 2026, 08:00 UTC
DVNx Picks
Our take on today's best models, by use case.
GPT-5.5 (xhigh)
OpenAI
GPT-5.5 (xhigh) leads the Intelligence Index at 60, a full 3 points ahead of the next tier, with an Arena score of 1481, 85 tok/sec, and a 922K context window at $11.25/1M blended. When you need the most capable model on the board and cost is secondary, nothing else competes.
Gemini 3.1 Pro Preview
Google
Gemini 3.1 Pro Preview hits an Intelligence Index of 57 and an Arena score of 1488, all for $4.50/1M blended — less than half the cost of GPT-5.5 and a fraction of Claude Opus 4.7, with 140 tok/sec throughput and a full 1M token context window. For production workloads where you need frontier-class quality without frontier-class bills, nothing touches it.
Grok 4.3
xAI
Grok 4.3 delivers an Intelligence Index of 53 at just $1.56/1M blended, with 110 tok/sec throughput and a 1M token context window: near-frontier intelligence at a budget price with real speed. If you're optimizing for capability per dollar at scale, this is one of the most efficient picks in the dataset.
GPT-5.4 mini (xhigh)
OpenAI
GPT-5.4 mini (xhigh) tops the speed chart at 178 tok/sec with an Intelligence Index of 49 and an Arena score of 1480, all at $1.69/1M blended. For latency-sensitive applications that still need real capability, this is the clear call.
MiMo-V2.5-Pro
Xiaomi
MiMo-V2.5-Pro takes the open-weight crown with an Intelligence Index of 54 and an Arena score of 1465 at just $1.50/1M blended, outscoring every other open-weight model in the dataset while keeping a 1M token context window. If you need open weights with frontier-adjacent capability, this is your pick.
Gemini 3.1 Pro Preview
Google
Gemini 3.1 Pro Preview ties for the largest context window at 1M tokens while also carrying an Arena score of 1488 and a price of $4.50/1M blended, less than half of GPT-5.5's $11.25, so you're not sacrificing intelligence or budget to get the long context. For massive codebases, long-horizon agent loops, or document-heavy workloads, this is the best all-in option.
Leaderboard
All models ranked by Artificial Analysis Intelligence Index. Search or sort by any column.
| # | Model | Index |
|---|---|---|