Live data from Artificial Analysis API

AI Model Benchmarks

Compare 454+ AI models across 12 benchmarks: Intelligence, Coding, Math, Science, and more. Data is updated hourly.

454 models
#
Model
Speed (tokens/s)
$/1M tokens
AA Index
GPQA
MMLU-Pro
LiveCode
AIME
HLE
1
77 t/s
$5.6
57.2
92.0%
41.6%
2
118 t/s
$4.5
57.2
94.1%
44.7%
3
66 t/s
$4.8
54.0
91.5%
39.9%
4
48 t/s
$10.0
53.0
89.6%
36.7%
5
56 t/s
$6.0
51.7
87.5%
30.0%
6
71 t/s
$4.8
51.3
90.3%
87.4%
88.9%
99.0%
35.4%
7
60 t/s
$1.6
49.8
82.0%
27.2%
8
52 t/s
$10.0
49.7
86.6%
89.5%
87.1%
91.3%
28.4%
9
40 t/s
$0.53
49.6
87.4%
28.1%
10
$1.5
49.2
87.0%
28.3%
11
101 t/s
$4.8
49.0
89.9%
33.5%
12
248 t/s
$3.0
48.5
88.5%
30.0%
13
128 t/s
$4.5
48.4
90.8%
89.8%
91.7%
95.7%
37.2%
14
188 t/s
$1.7
48.1
87.5%
26.6%
15
96 t/s
$3.4
47.7
87.3%
87.0%
86.8%
94.0%
26.5%
16
34 t/s
$1.2
46.8
87.9%
29.4%
17
$0.00
46.8
84.7%
25.4%
18
$4.8
46.6
86.4%
85.9%
89.4%
96.7%
24.9%
19
47 t/s
$10.0
46.5
84.0%
18.6%
20
194 t/s
$1.1
46.4
89.8%
89.0%
90.8%
97.0%
34.7%
21
57 t/s
$1.4
45.0
89.3%
27.3%
22
96 t/s
$3.4
44.6
85.4%
87.1%
84.6%
94.3%
26.5%
23
166 t/s
$3.4
44.6
83.7%
86.5%
84.0%
98.7%
25.6%
24
206 t/s
$0.46
44.4
81.7%
26.5%
25
47 t/s
$6.0
44.4
79.9%
13.2%
26
99 t/s
$0.53
43.8
85.5%
16.0%
27
$0.00
43.4
82.8%
19.9%
28
168 t/s
$3.4
43.1
86.0%
86.0%
84.9%
95.7%
23.4%
29
51 t/s
$10.0
43.1
81.0%
88.9%
73.8%
62.7%
12.9%
30
49 t/s
$6.0
43.0
83.4%
87.5%
71.4%
88.0%
17.3%
31
$0.00
42.9
80.9%
15.8%
32
50 t/s
$6.0
42.6
79.7%
10.8%
33
91 t/s
$0.82
42.1
85.8%
22.2%
34
74 t/s
$1.0
42.1
85.9%
85.6%
89.4%
95.0%
25.1%
35
86 t/s
$3.4
42.0
84.2%
86.7%
70.3%
91.7%
23.5%
36
36 t/s
$30.0
42.0
80.9%
88.0%
65.4%
80.3%
11.9%
37
59 t/s
$0.53
41.9
84.8%
19.1%
38
37 t/s
$0.32
41.7
84.0%
86.2%
86.2%
92.0%
22.2%
39
138 t/s
$1.1
41.6
85.7%
23.4%
40
124 t/s
$0.15
41.5
83.5%
20.0%
41
50 t/s
$6.0
41.5
87.7%
86.6%
81.9%
92.7%
23.9%
42
$4.5
41.3
88.7%
89.5%
85.7%
86.7%
27.6%
43
78 t/s
$0.69
41.2
82.8%
83.7%
83.8%
90.7%
19.7%
44
100 t/s
$1.1
40.9
83.8%
84.8%
85.3%
94.7%
22.3%
45
o3-pro
OpenAI
19 t/s
$35.0
40.7
84.5%
46
57 t/s
$1.6
40.6
66.6%
7.2%
47
52 t/s
$1.4
40.1
86.1%
18.8%
48
36 t/s
$2.4
39.9
86.1%
26.2%
49
62 t/s
$0.53
39.4
83.0%
87.5%
81.0%
82.7%
22.2%
50
36 t/s
$0.00
39.2
85.7%
22.7%
51
78 t/s
$3.4
39.2
80.8%
86.0%
76.3%
83.0%
18.4%
52
128 t/s
$0.15
39.2
84.6%
84.3%
86.8%
96.3%
21.1%
53
37 t/s
$30.0
39.0
79.6%
87.3%
63.6%
73.3%
11.7%
54
82 t/s
$0.69
38.9
80.3%
82.8%
69.2%
85.0%
14.6%
55
47 t/s
$6.0
38.7
77.7%
84.2%
65.5%
74.3%
9.6%
56
50 t/s
$1.5
38.6
82.6%
13.9%
57
174 t/s
$0.69
38.6
81.3%
82.0%
83.6%
91.7%
16.9%
58
131 t/s
$0.28
38.6
85.3%
85.4%
82.2%
89.3%
17.6%
59
o3
OpenAI
86 t/s
$3.5
38.4
82.7%
85.3%
80.8%
88.3%
20.0%
60
196 t/s
$0.46
38.1
76.1%
14.7%
61
86 t/s
$0.15
37.8
83.1%
19.1%
62
183 t/s
$1.7
37.7
82.3%
17.1%
63
33 t/s
$1.2
37.3
78.9%
12.3%
64
87 t/s
$0.82
37.2
84.2%
13.2%
65
126 t/s
$2.0
37.1
67.2%
76.0%
61.5%
83.7%
9.7%
66
128 t/s
$0.69
37.1
84.5%
19.7%
67
48 t/s
$6.0
37.1
72.7%
86.0%
59.0%
37.0%
7.1%
68
MiniMax-M2
MiniMax
60 t/s
$0.53
36.1
77.7%
82.0%
82.6%
78.3%
12.5%
69
155 t/s
$0.41
36.0
80.0%
19.2%
70
96 t/s
$0.53
36.0
76.4%
81.3%
74.7%
94.7%
33.4%
71
35 t/s
$30.0
36.0
72
157 t/s
$1.1
35.9
82.7%
14.8%
73
127 t/s
$3.4
35.7
78.5%
83.0%
73.0%
89.0%
8.9%
74
65 t/s
$5.6
35.4
74.8%
10.6%
75
150 t/s
$0.28
35.1
84.7%
85.0%
83.2%
89.7%
17.0%
76
169 t/s
$1.1
35.0
81.2%
88.2%
79.7%
55.7%
14.1%
77
$6.0
34.7
77.2%
83.7%
47.3%
56.3%
10.3%
78
130 t/s
$3.4
34.6
84.4%
86.2%
80.1%
87.7%
21.1%
79
76 t/s
$0.94
34.2
66.4%
79.4%
56.2%
48.0%
6.1%
80
$0.80
33.9
79.2%
85.1%
79.8%
89.7%
15.2%
81
70 t/s
$4.8
33.6
71.2%
81.4%
66.9%
51.0%
7.3%
82
205 t/s
$0.56
33.5
82.2%
16.2%
83
Doubao Seed Code
ByteDance Seed
$0.00
33.5
76.4%
85.4%
76.6%
79.3%
13.3%
84
234 t/s
$0.26
33.3
78.2%
80.8%
87.8%
93.4%
18.5%
85
126 t/s
$1.9
33.1
78.4%
83.2%
85.9%
90.7%
17.5%
86
51 t/s
$6.0
33.0
68.3%
83.7%
44.9%
38.0%
4.0%
87
36 t/s
$30.0
33.0
70.1%
86.0%
54.2%
36.3%
5.9%
88
37 t/s
$0.32
32.9
79.7%
85.0%
78.9%
87.7%
13.8%
89
Mercury 2
Inception
770 t/s
$0.38
32.8
77.0%
15.5%
90
65 t/s
$0.98
32.5
78.0%
82.9%
69.5%
86.0%
13.3%
91
43 t/s
$2.4
32.5
77.6%
82.4%
53.5%
82.3%
12.0%
92
172 t/s
$0.10
32.4
80.6%
13.3%
93
41 t/s
$0.32
32.1
75.1%
83.7%
59.3%
59.0%
10.5%
94
197 t/s
$0.35
32.1
79.1%
82.8%
69.6%
84.7%
11.1%
95
K-EXAONE (Reasoning)
LG AI Research
$0.00
32.1
78.3%
83.8%
76.8%
90.3%
13.1%
96
122 t/s
$3.4
31.9
75.1%
82.2%
63.8%
63.3%
5.2%
97
Qwen3 Max
Alibaba
33 t/s
$2.4
31.4
76.4%
84.1%
76.7%
80.7%
11.1%
98
$0.20
31.2
79.2%
18.3%
99
103 t/s
$2.0
31.1
64.6%
80.0%
51.1%
39.0%
4.3%
100
$0.00
31.1
79.3%
84.2%
71.3%
78.3%
12.7%
Showing top 100 of 454 models.
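As a rough illustration of comparing models on price and score together, the sketch below ranks a few rows from the table above by AA Index points per dollar. The `Row` type and the `index_per_dollar` metric are hypothetical helpers introduced here for illustration; they are not part of the leaderboard or of the Artificial Analysis API. Rows are identified by their leaderboard rank only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    rank: int         # leaderboard position ("#" column)
    price_usd: float  # "$/1M" column
    aa_index: float   # "AA Index" column


# A few rows copied from the table above (ranks 1, 2, 7, and 9).
ROWS = [
    Row(1, 5.6, 57.2),
    Row(2, 4.5, 57.2),
    Row(7, 1.6, 49.8),
    Row(9, 0.53, 49.6),
]


def index_per_dollar(row: Row) -> float:
    """Hypothetical value metric: AA Index points per dollar of the $/1M price."""
    return row.aa_index / row.price_usd


def rank_by_value(rows):
    """Sort rows from best to worst value, skipping rows priced at $0.00."""
    # Rows listed at $0.00 are excluded to avoid dividing by zero.
    priced = [r for r in rows if r.price_usd > 0]
    return sorted(priced, key=index_per_dollar, reverse=True)


for r in rank_by_value(ROWS):
    print(f"rank {r.rank}: {index_per_dollar(r):.1f} index points per $")
```

A single ratio like this hides a lot: it ignores speed, and it treats the composite index as linear in usefulness, so it is best read as a first-pass filter and not as a recommendation.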

Benchmark Guide

Intelligence

Composite score across math, science, coding

GPQA

Graduate-level science Q&A (Diamond subset)

MMLU-Pro

Knowledge & reasoning across 14 subject areas

LiveCodeBench

Live coding benchmark with new problems

AIME 2025

American Invitational Mathematics Examination

MATH-500

Competition-level math problems

HLE

Humanity's Last Exam: expert-level questions across many subjects

Composite coding benchmark score

Composite math benchmark score

SciCode

Scientific coding problems

IFBench

Instruction following benchmark

TerminalBench

Terminal/CLI task completion
