Jamba 1.5 Large vs Qwen3 VL 32B (Reasoning)

Side-by-side comparison of pricing, 12 benchmarks, and generation speed.

Jamba 1.5 Large (AI21 Labs)

Input: $2/M
Output: $8/M
Speed: not reported
TTFT: not reported

Qwen3 VL 32B (Reasoning) (Alibaba)

Input: $0.7/M
Output: $8.4/M
Speed: 97 tok/s
TTFT: 1.10s

Winner by Category

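Cheaper: Qwen3 VL 32B (Reasoning)
Faster (tok/s): Qwen3 VL 32B (Reasoning), the only model with a measured speed
Lower Latency: no comparison available (no TTFT is reported for Jamba 1.5 Large)
Benchmarks (11 wins to 1): Qwen3 VL 32B (Reasoning)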

Pricing Comparison

Metric                 Jamba 1.5 Large    Qwen3 VL 32B (Reasoning)
Input ($/M tokens)     $2                 $0.7
Output ($/M tokens)    $8                 $8.4

Cost for 1M input + 100K output tokens:
Jamba 1.5 Large: $2.80
Qwen3 VL 32B (Reasoning): $1.54
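
The workload cost above follows directly from the per-million-token prices. A minimal Python sketch of that arithmetic, where the function name `workload_cost` and its parameters are illustrative and not part of any provider API:

```python
def workload_cost(input_price_per_m: float, output_price_per_m: float,
                  input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload given $/M-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Prices taken from the pricing table ($ per million tokens).
jamba = workload_cost(2.0, 8.0, 1_000_000, 100_000)
qwen = workload_cost(0.7, 8.4, 1_000_000, 100_000)

print(f"Jamba 1.5 Large: ${jamba:.2f}")
print(f"Qwen3 VL 32B (Reasoning): ${qwen:.2f}")
```

Changing the token counts shows how the gap moves with the input/output mix of a given workload.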

Speed Comparison

Output Speed (tokens/s, higher is better)
Jamba 1.5 Large: not reported
Qwen3 VL 32B (Reasoning): 97 tok/s

Time to First Token (seconds, lower is better)
Jamba 1.5 Large: not reported
Qwen3 VL 32B (Reasoning): 1.10s

Benchmark Comparison

Data from Artificial Analysis API — 12 benchmarks

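Benchmark               Jamba 1.5 Large    Qwen3 VL 32B (Reasoning)
Intelligence Index      10.7               24.7
GPQA Diamond            42.7%              73.3%
MMLU-Pro                57.2%              81.8%
LiveCodeBench           14.3%              73.8%
Humanity's Last Exam    4.0%               9.6%
SciCode                 16.3%              28.5%

Benchmarks with a score for only one of the two models (the listing does not indicate which):
Coding Index: 14.5
Math Index: 84.7
AIME 2025: 84.7%
MATH-500: 60.6%
IFBench: 59.4%
TerminalBench: 7.6%

Wins: Jamba 1.5 Large 1, Qwen3 VL 32B (Reasoning) 11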

Frequently Asked Questions

Which is cheaper, Jamba 1.5 Large or Qwen3 VL 32B (Reasoning)?

Qwen3 VL 32B (Reasoning) is cheaper overall. Its blended price (3:1 input/output ratio) is $2.63/M tokens vs $3.50/M for Jamba 1.5 Large.
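
The blended figure is a weighted average of the input and output prices at a 3:1 ratio. A short Python sketch of that calculation, assuming the weights apply to token volume (the function name `blended_price` is hypothetical):

```python
def blended_price(input_price_per_m: float, output_price_per_m: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted-average $/M-token price for a given input:output ratio."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price_per_m
            + output_weight * output_price_per_m) / total_weight


# Listed prices in $ per million tokens.
jamba_blended = blended_price(2.0, 8.0)   # (3*2 + 8) / 4
qwen_blended = blended_price(0.7, 8.4)    # (3*0.7 + 8.4) / 4

print(f"Jamba 1.5 Large: ${jamba_blended:.3f}/M")
print(f"Qwen3 VL 32B (Reasoning): ${qwen_blended:.3f}/M")
```

The Qwen result is 2.625 before rounding, which the answer above quotes as $2.63/M.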

Which model performs better on benchmarks?

Qwen3 VL 32B (Reasoning) wins 11 out of 12 benchmarks compared to 1 for Jamba 1.5 Large. See the detailed benchmark chart above for per-category results.

Which is faster for real-time applications?

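Qwen3 VL 32B (Reasoning) is the only one of the two with measured speed data: 97 tok/s output and a 1.10s time to first token. No output speed or TTFT is reported for Jamba 1.5 Large, so a value of 0 tok/s or 0.00s for it should be read as missing data rather than a measurement, and the two models cannot be compared on speed or latency from this data.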

When should I use Jamba 1.5 Large vs Qwen3 VL 32B (Reasoning)?

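On the data shown here, Qwen3 VL 32B (Reasoning) is the stronger choice: it costs less at typical input-heavy mixes, wins 11 of the 12 benchmarks, and is the only one with measured generation speed (97 tok/s). Jamba 1.5 Large has a slightly lower output price ($8 vs $8.4/M), so it only becomes cheaper when output tokens exceed roughly 3.25 times the input tokens. For latency-sensitive apps, note that no TTFT is reported for Jamba 1.5 Large, so measure it yourself before deciding.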
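
Which model is cheaper depends on the input/output mix, since Jamba 1.5 Large has the higher input price but the slightly lower output price. The sketch below solves for the break-even output-to-input token ratio from the listed prices. It assumes cost is linear in tokens and that both models emit the same number of output tokens for a task, which may not hold for a reasoning model; the function name is illustrative.

```python
from typing import Optional


def breakeven_output_to_input_ratio(in_a: float, out_a: float,
                                    in_b: float, out_b: float) -> Optional[float]:
    """Output/input token ratio at which models A and B cost the same.

    Solves in_a*i + out_a*o == in_b*i + out_b*o for o/i.
    Returns None when the output prices are equal (no single crossover).
    A negative result means one model is cheaper at every mix.
    """
    if out_a == out_b:
        return None
    return (in_a - in_b) / (out_b - out_a)


# A = Jamba 1.5 Large ($2 in, $8 out), B = Qwen3 VL 32B (Reasoning) ($0.7 in, $8.4 out)
ratio = breakeven_output_to_input_ratio(2.0, 8.0, 0.7, 8.4)
print(f"Break-even output/input ratio: {ratio:.2f}")
```

Below that ratio (about 3.25 output tokens per input token) Qwen3 VL 32B (Reasoning) is cheaper; above it, Jamba 1.5 Large is.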