Live speed data from the Artificial Analysis API

AI Model Speed Rankings

Compare 454+ AI models by response speed, latency, and throughput. Find the fastest models for your use case.

454 models
| # | Model | Throughput | TTFT | $/1M | $/Speed | Price×TTFT |
|---|---|---|---|---|---|---|
| 1 | Mercury 2 (Inception) | 770 t/s | 3.46s | $0.38 | $0.000 | 1298.3 |
| 2 | | 661 t/s | 18.28s | $0.09 | $0.000 | 1553.7 |
| 3 | | 526 t/s | 8.69s | $0.11 | $0.000 | 929.9 |
| 4 | | 450 t/s | 233ms | $0.02 | $0.000 | 4.7 |
| 5 | | 361 t/s | 538ms | $0.04 | $0.000 | 21.5 |
| 6 | | 322 t/s | 19.76s | $0.17 | $0.001 | 3458.3 |
| 7 | | 309 t/s | 237ms | $0.02 | $0.000 | 4.7 |
| 8 | | 284 t/s | 359ms | $0.06 | $0.000 | 21.9 |
| 9 | | 279 t/s | 297ms | $0.10 | $0.000 | 29.7 |
| 10 | | 274 t/s | 280ms | $0.17 | $0.001 | 49.0 |
| 11 | | 272 t/s | 535ms | $0.04 | $0.000 | 21.4 |
| 12 | | 262 t/s | 429ms | $0.09 | $0.000 | 40.3 |
| 13 | | 248 t/s | 10.20s | $3.0 | $0.012 | 30585.0 |
| 14 | | 248 t/s | 459ms | $0.09 | $0.000 | 43.1 |
| 15 | | 234 t/s | 498ms | $0.26 | $0.001 | 131.0 |
| 16 | | 227 t/s | 275ms | $0.06 | $0.000 | 16.5 |
| 17 | | 226 t/s | 308ms | $3.0 | $0.013 | 924.0 |
| 18 | | 225 t/s | 513ms | $0.26 | $0.001 | 134.9 |
| 19 | | 220 t/s | 246ms | $0.06 | $0.000 | 14.8 |
| 20 | LFM2 24B A2B (Liquid AI) | 216 t/s | 219ms | $0.05 | $0.000 | 11.4 |
| 21 | | 213 t/s | 13.44s | $0.85 | $0.004 | 11425.7 |
| 22 | | 213 t/s | 678ms | $0.85 | $0.004 | 576.3 |
| 23 | | 210 t/s | 10.13s | $0.85 | $0.004 | 8607.1 |
| 24 | | 210 t/s | 437ms | $0.85 | $0.004 | 371.4 |
| 25 | | 207 t/s | 337ms | $0.15 | $0.001 | 50.6 |
| 26 | | 206 t/s | 2.58s | $0.46 | $0.002 | 1195.5 |
| 27 | | 206 t/s | 4.34s | $0.85 | $0.004 | 3692.4 |
| 28 | | 205 t/s | 7.12s | $0.56 | $0.003 | 4010.8 |
| 29 | | 201 t/s | 543ms | $0.85 | $0.004 | 461.6 |
| 30 | | 197 t/s | 344ms | $0.35 | $0.002 | 120.4 |
| 31 | | 196 t/s | 1.36s | $0.46 | $0.002 | 631.1 |
| 32 | | 196 t/s | 847ms | $0.40 | $0.002 | 337.1 |
| 33 | | 194 t/s | 5.49s | $1.1 | $0.006 | 6177.4 |
| 34 | | 190 t/s | 480ms | $0.46 | $0.002 | 222.2 |
| 35 | | 188 t/s | 5.07s | $1.7 | $0.009 | 8558.2 |
| 36 | Nova Lite (Amazon) | 188 t/s | 386ms | $0.10 | $0.001 | 40.5 |
| 37 | | 187 t/s | 848ms | $0.19 | $0.001 | 159.4 |
| 38 | | 183 t/s | 6.98s | $1.7 | $0.009 | 11789.0 |
| 39 | | 182 t/s | 726ms | $0.25 | $0.001 | 181.5 |
| 40 | | 179 t/s | 441ms | $0.10 | $0.001 | 44.1 |
| 41 | | 178 t/s | 542ms | $0.08 | $0.000 | 43.4 |
| 42 | | 177 t/s | 308ms | $0.25 | $0.001 | 77.0 |
| 43 | | 176 t/s | 1.18s | $0.00 | | |
| 44 | | 174 t/s | 3.89s | $0.69 | $0.004 | 2674.9 |
| 45 | | 172 t/s | 516ms | $0.10 | $0.001 | 49.5 |
| 46 | | 172 t/s | 983ms | $0.88 | $0.005 | 860.1 |
| 47 | | 172 t/s | 3.26s | $0.53 | $0.003 | 1711.0 |
| 48 | | 169 t/s | 571ms | $1.7 | $0.010 | 963.8 |
| 49 | | 169 t/s | 347ms | $0.75 | $0.004 | 260.3 |
| 50 | | 169 t/s | 8.00s | $1.1 | $0.007 | 9000.0 |
| 51 | | 168 t/s | 938ms | $0.09 | $0.001 | 80.7 |
| 52 | | 168 t/s | 1.09s | $1.9 | $0.011 | 2053.1 |
| 53 | | 168 t/s | 5.17s | $3.4 | $0.021 | 17771.0 |
| 54 | | 166 t/s | 7.27s | $3.4 | $0.021 | 24980.5 |
| 55 | | 165 t/s | 191ms | $0.07 | $0.000 | 13.4 |
| 56 | | 165 t/s | 310ms | $0.15 | $0.001 | 46.5 |
| 57 | | 162 t/s | 590ms | $0.26 | $0.002 | 155.2 |
| 58 | | 160 t/s | 1.09s | $0.60 | $0.004 | 656.4 |
| 59 | | 157 t/s | 996ms | $1.1 | $0.007 | 1095.6 |
| 60 | | 156 t/s | 175ms | $0.00 | | |
| 61 | | 155 t/s | 708ms | $0.41 | $0.003 | 291.7 |
| 62 | | 153 t/s | 523ms | $0.15 | $0.001 | 78.5 |
| 63 | | 153 t/s | 460ms | $0.15 | $0.001 | 69.0 |
| 64 | | 152 t/s | 520ms | $0.30 | $0.002 | 156.0 |
| 65 | | 151 t/s | 71.37s | $0.14 | $0.001 | 9849.2 |
| 66 | | 150 t/s | 661ms | $0.26 | $0.002 | 173.8 |
| 67 | | 150 t/s | 3.62s | $0.28 | $0.002 | 996.1 |
| 68 | | 150 t/s | 475ms | $3.4 | $0.023 | 1633.1 |
| 69 | | 150 t/s | 969ms | $0.10 | $0.001 | 101.7 |
| 70 | | 150 t/s | 532ms | $1.5 | $0.010 | 798.0 |
| 71 | | 147 t/s | 298ms | $0.28 | $0.002 | 82.0 |
| 72 | | 143 t/s | 22.81s | $1.9 | $0.013 | 43909.3 |
| 73 | | 143 t/s | 1.00s | $0.31 | $0.002 | 311.2 |
| 74 | | 142 t/s | 1.08s | $0.75 | $0.005 | 809.3 |
| 75 | | 142 t/s | 1.04s | $0.69 | $0.005 | 713.5 |
| 76 | | 142 t/s | 37.73s | $0.14 | $0.001 | 5206.2 |
| 77 | | 142 t/s | 597ms | $3.4 | $0.024 | 2052.5 |
| 78 | | 141 t/s | 887ms | $0.19 | $0.001 | 166.8 |
| 79 | | 141 t/s | 565ms | $0.30 | $0.002 | 169.5 |
| 80 | | 141 t/s | 404ms | $0.17 | $0.001 | 70.7 |
| 81 | | 140 t/s | 893ms | $0.40 | $0.003 | 355.4 |
| 82 | | 138 t/s | 1.06s | $1.1 | $0.008 | 1163.8 |
| 83 | | 138 t/s | 1.25s | $0.00 | | |
| 84 | | 138 t/s | 1.04s | $0.66 | $0.005 | 688.4 |
| 85 | o3-mini (OpenAI) | 137 t/s | 7.25s | $1.9 | $0.014 | 13948.6 |
| 86 | | 133 t/s | 454ms | $0.50 | $0.004 | 227.0 |
| 87 | | 133 t/s | 458ms | $0.29 | $0.002 | 133.7 |
| 88 | | 132 t/s | 350ms | $0.20 | $0.002 | 70.0 |
| 89 | | 132 t/s | 449ms | $0.80 | $0.006 | 359.2 |
| 90 | | 131 t/s | 7.62s | $0.28 | $0.002 | 2095.8 |
| 91 | | 130 t/s | 21.82s | $3.4 | $0.026 | 75006.8 |
| 92 | | 129 t/s | 1.01s | $0.75 | $0.006 | 754.5 |
| 93 | | 128 t/s | 474ms | $0.30 | $0.002 | 142.2 |
| 94 | | 128 t/s | 1.06s | $0.69 | $0.005 | 727.2 |
| 95 | | 128 t/s | 968ms | $0.35 | $0.003 | 338.8 |
| 96 | | 128 t/s | 23.50s | $4.5 | $0.035 | 105772.5 |
| 97 | | 128 t/s | 1.53s | $0.15 | $0.001 | 228.9 |
| 98 | | 127 t/s | 537ms | $0.49 | $0.004 | 261.5 |
| 99 | | 127 t/s | 18.86s | $3.4 | $0.027 | 64823.5 |
| 100 | | 127 t/s | 836ms | $0.14 | $0.001 | 115.4 |
Showing top 100 of 454 models.

Speed Metrics Guide

Throughput (tokens/s)

Output generation speed in tokens per second. Higher is better.

Good: >50 t/s · Excellent: >100 t/s
Time to First Token (TTFT)

Delay before the first token appears. Lower is better.

Good: <500ms · Excellent: <200ms
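
The throughput and TTFT thresholds in this guide can be applied mechanically. The sketch below is one possible way to do it, not the site's own code: the function names and the "below good" label are hypothetical, and it assumes TTFT is written as in the table, either in milliseconds ("233ms") or seconds ("3.46s").

```python
import re


def parse_ttft_ms(text: str) -> float:
    """Convert a TTFT string such as '233ms' or '3.46s' to milliseconds."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(ms|s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized TTFT value: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value if unit == "ms" else value * 1000.0


def rate_throughput(tokens_per_second: float) -> str:
    """Guide thresholds: Good >50 t/s, Excellent >100 t/s."""
    if tokens_per_second > 100:
        return "excellent"
    if tokens_per_second > 50:
        return "good"
    return "below good"  # hypothetical label; the guide names no lower tier


def rate_ttft(ttft_ms: float) -> str:
    """Guide thresholds: Good <500ms, Excellent <200ms."""
    if ttft_ms < 200:
        return "excellent"
    if ttft_ms < 500:
        return "good"
    return "below good"  # hypothetical label; the guide names no lower tier


if __name__ == "__main__":
    # Row 1 of the table: 770 t/s throughput, 3.46s TTFT.
    print(rate_throughput(770), rate_ttft(parse_ttft_ms("3.46s")))
```

Rating the two metrics separately matters here because, as row 1 shows, a model can lead on throughput while still starting slowly.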
Price/Performance

Cost efficiency ratios. Lower values indicate better value.

$/Speed: price per 1M tokens divided by throughput in t/s · Price×TTFT: price per 1M tokens multiplied by TTFT in milliseconds, a latency penalty
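
The two ratios can be reproduced from the other columns. The following sketch assumes, based on the table values rather than any published formula, that $/Speed is the $/1M price divided by throughput in t/s and that Price×TTFT is the $/1M price multiplied by TTFT in milliseconds. The function names are hypothetical.

```python
def price_per_speed(price_per_million: float, throughput_tps: float) -> float:
    """Assumed formula: dollars per 1M tokens divided by tokens per second."""
    if throughput_tps <= 0:
        raise ValueError("throughput must be positive")
    return price_per_million / throughput_tps


def price_times_ttft(price_per_million: float, ttft_ms: float) -> float:
    """Assumed formula: dollars per 1M tokens multiplied by TTFT in milliseconds."""
    return price_per_million * ttft_ms


if __name__ == "__main__":
    # Row 17 of the table: 226 t/s, 308ms, $3.0 per 1M tokens.
    print(f"${price_per_speed(3.0, 226):.3f}")   # table shows $0.013
    print(f"{price_times_ttft(3.0, 308):.1f}")   # table shows 924.0
```

Because the table displays rounded prices, recomputing from the displayed columns will not always match the listed ratios exactly (row 1, for instance, lists 1298.3 where $0.38 × 3460ms gives about 1314.8).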
