NVIDIA

Llama 3.1 Nemotron Instruct 70B

AI model by NVIDIA. Real-time pricing and benchmark data.

Pricing (per 1M tokens)

Input: $1.20
Output: $1.20
Blended (3:1): $1.20

Source: Artificial Analysis
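
The blended figure can be reproduced with a weighted average. The sketch below assumes that "3:1" means three input tokens for every output token, which is an interpretation of the label rather than something this page states; the function name `blended_price` is hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted average price per 1M tokens.

    Assumption: the 3:1 ratio weights input tokens 3x relative to output tokens.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Prices from the listing above, in USD per 1M tokens.
print(f"${blended_price(1.20, 1.20):.2f}")
```

Because the input and output prices are equal here, any weighting gives the same blended price; the ratio only matters for models whose two prices differ.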

Performance

Output Speed: 48 tok/s
Time to First Token: 418 ms

Median values from Artificial Analysis
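
As a rough illustration of how these two figures combine, the sketch below estimates the wall-clock time of a streamed response as time to first token plus generation time. It assumes a constant output speed at the median value and ignores network and queueing variance, so treat the result as a ballpark only; the function name `estimate_response_seconds` is hypothetical.

```python
def estimate_response_seconds(output_tokens: int,
                              ttft_ms: float = 418.0,
                              tokens_per_second: float = 48.0) -> float:
    """Estimate total response time in seconds.

    Assumption: generation proceeds at a constant median speed after the
    first token arrives. Real latency varies with load and prompt length.
    """
    return ttft_ms / 1000.0 + output_tokens / tokens_per_second


# A 500-token completion at the median figures listed above.
print(f"{estimate_response_seconds(500):.1f} s")
```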

Compare with similar models

| Model | Input ($/1M tokens) | Output ($/1M tokens) | Speed (tok/s) |
|---|---|---|---|
| Llama 3.1 Nemotron Instruct 70B (current) | $1.20 | $1.20 | 48 |
| MiniMax-M2.7 | $0.30 | $1.20 | 40 |
| KAT-Coder-Pro V1 | $0.30 | $1.20 | 95 |
| KAT Coder Pro V2 | $0.30 | $1.20 | 99 |
| Qwen3 Coder Next | $0.35 | $1.20 | 160 |
| MiniMax-M2.5 | $0.30 | $1.20 | 59 |

Example Costs

Single Request: $0.0018 (1.0K in / 500 out)
1K Requests/day: $1.80 (1.0M in / 500.0K out)
10K Requests/day: $18.00 (10.0M in / 5.0M out)
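
The example costs above follow directly from the per-1M-token prices. A minimal sketch of the arithmetic, assuming every request uses 1,000 input tokens and 500 output tokens as in the examples; the function names `request_cost` and `daily_cost` are hypothetical.

```python
INPUT_PRICE_PER_M = 1.20   # USD per 1M input tokens, from the pricing listed above
OUTPUT_PRICE_PER_M = 1.20  # USD per 1M output tokens, from the pricing listed above


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-1M-token prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


def daily_cost(requests_per_day: int, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a day's traffic, assuming identically sized requests."""
    return requests_per_day * request_cost(input_tokens, output_tokens)


print(f"${request_cost(1_000, 500):.4f}")        # single request
print(f"${daily_cost(1_000, 1_000, 500):.2f}")   # 1K requests/day
print(f"${daily_cost(10_000, 1_000, 500):.2f}")  # 10K requests/day
```

Real traffic rarely has uniform request sizes, so for budgeting it is usually more accurate to sum actual token counts than to multiply an average request.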