Live speed data from the Artificial Analysis API

AI Model Speed Rankings

Compare 511+ AI models by response speed, latency, and throughput. Find the fastest models for your use case.

| # | Model | Throughput | TTFT | $/1M | $/Speed | Price×TTFT |
|---|-------|------------|------|------|---------|------------|
| 1 | Mercury 2 (Inception) | 818 t/s | 4.90s | $0.38 | $0.000 | 1837.9 |
| 2 | | 426 t/s | 21.98s | $0.09 | $0.000 | 1868.0 |
| 3 | | 364 t/s | 8.74s | $0.11 | $0.000 | 935.5 |
| 4 | | 319 t/s | 663ms | $0.06 | $0.000 | 40.4 |
| 5 | | 318 t/s | 223ms | $0.04 | $0.000 | 8.9 |
| 6 | | 307 t/s | 585ms | $0.13 | $0.000 | 76.6 |
| 7 | | 295 t/s | 235ms | $1.2 | $0.004 | 282.0 |
| 8 | | 288 t/s | 4.82s | $0.56 | $0.002 | 2713.7 |
| 9 | | 276 t/s | 509ms | $0.26 | $0.001 | 133.4 |
| 10 | | 272 t/s | 524ms | $0.26 | $0.001 | 137.3 |
| 11 | | 270 t/s | 363ms | $0.09 | $0.000 | 31.9 |
| 12 | | 264 t/s | 462ms | $0.10 | $0.000 | 43.9 |
| 13 | | 238 t/s | 19.33s | $0.17 | $0.001 | 3382.4 |
| 14 | | 232 t/s | 1.16s | $0.28 | $0.001 | 318.4 |
| 15 | | 230 t/s | 687ms | $0.30 | $0.001 | 206.1 |
| 16 | | 227 t/s | 9.75s | $3.4 | $0.015 | 32892.8 |
| 17 | | 225 t/s | 1.01s | $0.40 | $0.002 | 403.6 |
| 18 | | 223 t/s | 838ms | $0.85 | $0.004 | 712.3 |
| 19 | | 223 t/s | 893ms | $0.19 | $0.001 | 167.9 |
| 20 | | 218 t/s | 12.92s | $0.85 | $0.004 | 10981.1 |
| 21 | | 203 t/s | 2.86s | $0.17 | $0.001 | 500.5 |
| 22 | Nova Lite (Amazon) | 202 t/s | 634ms | $0.10 | $0.001 | 66.6 |
| 23 | | 196 t/s | 6.17s | $1.1 | $0.006 | 6946.9 |
| 24 | | 194 t/s | 469ms | $0.10 | $0.001 | 46.9 |
| 25 | | 193 t/s | 572ms | $0.85 | $0.004 | 486.2 |
| 26 | | 193 t/s | 905ms | $0.15 | $0.001 | 135.8 |
| 27 | | 192 t/s | 414ms | $0.15 | $0.001 | 62.1 |
| 28 | | 187 t/s | 868ms | $0.00 | | |
| 29 | | 187 t/s | 758ms | $0.25 | $0.001 | 189.5 |
| 30 | | 186 t/s | 878ms | $1.1 | $0.006 | 987.8 |
| 31 | | 180 t/s | 339ms | $0.10 | $0.001 | 33.9 |
| 32 | | 177 t/s | 5.33s | $0.69 | $0.004 | 3667.0 |
| 33 | | 177 t/s | 1.15s | $0.00 | | |
| 34 | | 176 t/s | 1.46s | $0.56 | $0.003 | 810.4 |
| 35 | | 175 t/s | 10.15s | $0.85 | $0.005 | 8631.7 |
| 36 | | 173 t/s | 959ms | $0.41 | $0.002 | 395.1 |
| 37 | | 173 t/s | 233ms | $0.06 | $0.000 | 14.0 |
| 38 | | 173 t/s | 19.51s | $0.85 | $0.005 | 16586.9 |
| 39 | | 172 t/s | 3.14s | $3.4 | $0.020 | 10798.8 |
| 40 | | 170 t/s | 550ms | $3.4 | $0.020 | 1890.9 |
| 41 | | 170 t/s | 3.60s | $3.4 | $0.020 | 12380.2 |
| 42 | | 169 t/s | 17.80s | $1.9 | $0.011 | 34265.0 |
| 43 | | 166 t/s | 7.06s | $3.4 | $0.021 | 24262.0 |
| 44 | | 166 t/s | 1.43s | $0.84 | $0.005 | 1203.5 |
| 45 | | 165 t/s | 249ms | $0.06 | $0.000 | 14.9 |
| 46 | | 164 t/s | 72.48s | $0.14 | $0.001 | 10002.2 |
| 47 | | 162 t/s | 21.93s | $1.9 | $0.012 | 42205.6 |
| 48 | o3-mini (OpenAI) | 162 t/s | 7.29s | $1.9 | $0.012 | 14027.5 |
| 49 | | 160 t/s | 1.10s | $1.1 | $0.007 | 1206.7 |
| 50 | | 160 t/s | 5.06s | $0.85 | $0.005 | 4304.4 |
| 51 | | 158 t/s | 1.19s | $0.88 | $0.006 | 1040.4 |
| 52 | | 157 t/s | 4.34s | $1.7 | $0.011 | 7327.6 |
| 53 | | 155 t/s | 799ms | $3.0 | $0.019 | 2397.0 |
| 54 | | 155 t/s | 3.13s | $0.46 | $0.003 | 1449.7 |
| 55 | | 155 t/s | 708ms | $0.10 | $0.001 | 68.0 |
| 56 | | 154 t/s | 522ms | $1.7 | $0.011 | 881.1 |
| 57 | | 154 t/s | 718ms | $0.14 | $0.001 | 99.1 |
| 58 | | 154 t/s | 15.08s | $3.4 | $0.022 | 51845.0 |
| 59 | | 154 t/s | 525ms | $0.46 | $0.003 | 243.1 |
| 60 | | 152 t/s | 1.05s | $1.1 | $0.007 | 1152.8 |
| 61 | | 151 t/s | 37.69s | $0.14 | $0.001 | 5200.5 |
| 62 | | 151 t/s | 1.15s | $1.9 | $0.012 | 2160.0 |
| 63 | | 150 t/s | 3.84s | $0.46 | $0.003 | 1776.1 |
| 64 | | 150 t/s | 1.45s | $0.15 | $0.001 | 217.5 |
| 65 | | 150 t/s | 5.74s | $1.7 | $0.011 | 9689.1 |
| 66 | | 149 t/s | 578ms | $0.26 | $0.002 | 151.4 |
| 67 | | 149 t/s | 483ms | $4.4 | $0.029 | 2113.1 |
| 68 | | 148 t/s | 416ms | $0.17 | $0.001 | 72.8 |
| 69 | | 148 t/s | 1.33s | $0.15 | $0.001 | 199.3 |
| 70 | | 147 t/s | 945ms | $0.31 | $0.002 | 292.9 |
| 71 | | 146 t/s | 1.48s | $0.15 | $0.001 | 221.5 |
| 72 | | 144 t/s | 1.14s | $0.67 | $0.005 | 768.6 |
| 73 | LFM2 24B A2B (Liquid AI) | 142 t/s | 294ms | $0.05 | $0.000 | 15.3 |
| 74 | | 142 t/s | 519ms | $1.5 | $0.011 | 778.5 |
| 75 | | 141 t/s | 516ms | $0.30 | $0.002 | 154.8 |
| 76 | | 141 t/s | 25.18s | $4.5 | $0.032 | 113314.5 |
| 77 | | 140 t/s | 524ms | $0.26 | $0.002 | 137.3 |
| 78 | | 139 t/s | 946ms | $0.19 | $0.001 | 177.8 |
| 79 | | 139 t/s | 548ms | $0.10 | $0.001 | 57.0 |
| 80 | | 139 t/s | 958ms | $0.40 | $0.003 | 381.3 |
| 81 | | 138 t/s | 1.09s | $0.69 | $0.005 | 750.6 |
| 82 | | 138 t/s | 1.18s | $0.00 | | |
| 83 | | 138 t/s | 702ms | $0.09 | $0.001 | 60.4 |
| 84 | | 136 t/s | 509ms | $0.14 | $0.001 | 70.2 |
| 85 | | 135 t/s | 1.17s | $0.66 | $0.005 | 768.9 |
| 86 | | 135 t/s | 19.85s | $4.5 | $0.033 | 89316.0 |
| 87 | | 134 t/s | 825ms | $3.4 | $0.026 | 2836.3 |
| 88 | | 132 t/s | 230ms | $0.02 | $0.000 | 4.6 |
| 89 | | 131 t/s | 1.26s | $0.00 | | |
| 90 | | 131 t/s | 1.29s | $0.69 | $0.005 | 888.9 |
| 91 | | 128 t/s | 602ms | $0.40 | $0.003 | 237.8 |
| 92 | | 127 t/s | 474ms | $0.06 | $0.000 | 29.9 |
| 93 | | 126 t/s | 296ms | $0.00 | | |
| 94 | | 126 t/s | 1.03s | $0.34 | $0.003 | 348.8 |
| 95 | | 125 t/s | 21.89s | $3.4 | $0.027 | 75275.0 |
| 96 | | 125 t/s | 1.06s | $0.21 | $0.002 | 225.1 |
| 97 | | 123 t/s | 2.43s | $0.20 | $0.002 | 486.2 |
| 98 | | 121 t/s | 1.01s | $0.30 | $0.002 | 304.5 |
| 99 | | 121 t/s | 671ms | $0.47 | $0.004 | 318.7 |
| 100 | | 120 t/s | 2.86s | $0.20 | $0.002 | 572.2 |
Showing the top 100 of 511 models, ranked by throughput (highest first).

Speed Metrics Guide

Throughput (tokens/s)

Output generation speed in tokens per second. Higher is better.

Good: >50 t/s · Excellent: >100 t/s

Time to First Token (TTFT)

Delay before the first token appears. Lower is better.

Good: <500ms · Excellent: <200ms
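
As a rough illustration of how the throughput and TTFT thresholds above could be applied, the sketch below parses TTFT strings in the two formats used in the rankings ("663ms" and "4.90s") and assigns a tier. The function names and the "below good" label are hypothetical, chosen for this example; the guide itself only defines "Good" and "Excellent".

```python
def parse_ttft_ms(text: str) -> float:
    """Convert a TTFT string such as '663ms' or '4.90s' to milliseconds."""
    text = text.strip()
    if text.endswith("ms"):
        return float(text[:-2])
    if text.endswith("s"):
        return float(text[:-1]) * 1000.0
    raise ValueError(f"unrecognized TTFT format: {text!r}")


def throughput_tier(tokens_per_second: float) -> str:
    """Tier by the guide's thresholds: >100 t/s excellent, >50 t/s good."""
    if tokens_per_second > 100:
        return "excellent"
    if tokens_per_second > 50:
        return "good"
    return "below good"  # hypothetical label, not defined in the guide


def ttft_tier(ttft_ms: float) -> str:
    """Tier by the guide's thresholds: <200ms excellent, <500ms good."""
    if ttft_ms < 200:
        return "excellent"
    if ttft_ms < 500:
        return "good"
    return "below good"  # hypothetical label, not defined in the guide


if __name__ == "__main__":
    # Rank 1 in the table: 818 t/s with a 4.90s TTFT.
    print(throughput_tier(818), ttft_tier(parse_ttft_ms("4.90s")))
    # Rank 5 in the table: 318 t/s with a 223ms TTFT.
    print(throughput_tier(318), ttft_tier(parse_ttft_ms("223ms")))
```

Under these thresholds a model can be excellent on throughput and still fall outside the "Good" TTFT band, as the rank 1 entry does.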

Price/Performance

Cost efficiency ratios. Lower values indicate better value.

$/Speed: price ($/1M) divided by throughput (t/s) · Price×TTFT: price ($/1M) multiplied by TTFT in milliseconds, a latency penalty
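
The two ratios can be reproduced with a short sketch. The formulas here are inferred from the table values rather than taken from any published definition: $/Speed matches price divided by throughput, and Price×TTFT matches price times TTFT in milliseconds. The function names are hypothetical. Because the table shows rounded prices, recomputed values can differ slightly from the listed ones.

```python
def dollars_per_speed(price_per_million: float, tokens_per_second: float) -> float:
    """Assumed formula: price ($/1M) divided by throughput (t/s)."""
    return price_per_million / tokens_per_second


def price_times_ttft(price_per_million: float, ttft_ms: float) -> float:
    """Assumed formula: price ($/1M) multiplied by TTFT in milliseconds."""
    return price_per_million * ttft_ms


if __name__ == "__main__":
    # Rank 7 in the table: $1.2 per 1M, 295 t/s, 235ms TTFT.
    print(f"${dollars_per_speed(1.2, 295):.3f}")  # table lists $0.004
    print(f"{price_times_ttft(1.2, 235):.1f}")    # table lists 282.0
```

Both ratios reward low price, so a cheap but slow model can still score well. They are best read alongside the raw throughput and TTFT columns.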
