Question 1

How do I estimate my LLM API cost?

Accepted Answer

Multiply your monthly requests by the input and output tokens per request to get monthly token totals, then apply the model's per-million-token rates: cost = (monthly input tokens × input price + monthly output tokens × output price) ÷ 1,000,000. This calculator does it across every major 2026 model at once and adds caching and batch discounts. Output tokens cost roughly 2–8× as much as input tokens across the models listed here (5–6× for most OpenAI, Anthropic, and Google models), so output length is usually the biggest lever.
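As a minimal sketch, the formula above can be written as a small Python function. The function and parameter names are illustrative, not part of the calculator, and the example workload (50k requests/month, 6k input and 600 output tokens per request, $5.00 input and $30.00 output per million tokens) is only there to show the arithmetic.

```python
def estimate_monthly_cost(requests_per_month, input_tokens_per_request,
                          output_tokens_per_request,
                          input_price_per_m, output_price_per_m):
    """Return the monthly cost in dollars, given prices in $ per 1M tokens."""
    monthly_input_tokens = requests_per_month * input_tokens_per_request
    monthly_output_tokens = requests_per_month * output_tokens_per_request
    return (monthly_input_tokens * input_price_per_m
            + monthly_output_tokens * output_price_per_m) / 1_000_000


# Illustrative workload: 50k requests/month, 6k input and 600 output tokens
# per request, priced at $5.00 input and $30.00 output per 1M tokens.
example_cost = estimate_monthly_cost(50_000, 6_000, 600, 5.00, 30.00)
print(f"${example_cost:,.2f}")  # prints $2,400.00
```

This deliberately leaves out caching and batch discounts, so it gives the standard-rate figure only.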

Question 2

What is the cheapest AI model API in 2026?

Accepted Answer

For capable models, DeepSeek V4 Flash and xAI Grok 4.1 Fast are among the cheapest per token, followed by the Google Gemini Flash tiers. Open-weight models can be self-hosted to remove per-token API cost entirely. With heavy prompt caching, DeepSeek V4 Flash cache hits fall below $0.01 per million input tokens ($0.0028). The cheapest model for your workload depends on your input/output ratio, so use the calculator above.
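To illustrate why the input/output ratio matters, here is a small sketch that ranks three low-cost models for a given request shape. The prices are the uncached list prices from the price table in this document; the helper names and the two example workloads are assumptions made for illustration, and caching is ignored.

```python
# Uncached list prices in $ per 1M tokens as (input, output), copied from the
# price table in this document.
PRICES = {
    "DeepSeek V4 Flash": (0.14, 0.28),
    "Llama 4 Scout": (0.08, 0.30),
    "Grok 4.1 Fast": (0.20, 0.50),
}


def cost_per_request(model, input_tokens, output_tokens):
    """Dollar cost of one request at uncached list prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens, output_tokens):
    """Name of the model with the lowest per-request cost for this shape."""
    return min(PRICES, key=lambda m: cost_per_request(m, input_tokens, output_tokens))


print(cheapest(6_000, 600))   # input-heavy request
print(cheapest(500, 4_000))   # output-heavy request
```

With these prices the input-heavy request favors the model with the lowest input rate and the output-heavy request favors the model with the lowest output rate, so the ranking flips between the two calls. Adding cached-input rates can change the ranking again.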

Question 3

How much does the GPT-5.5 API cost?

Accepted Answer

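GPT-5.5 costs $5.00 per million input tokens and $30.00 per million output tokens at standard rates, with cached input discounted to $0.50 per million. Batch processing runs at 50% off. For a typical RAG app (6k input tokens and 600 output tokens per request, 50k requests/month) that is roughly $2,400/month before caching ($1,500 for input plus $900 for output), or about $1,590/month with 60% of input tokens served from cache.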
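The standard and batch figures for this workload can be checked with a short sketch. It uses only the rates quoted above and assumes the 50% batch discount applies equally to input and output tokens; the function name is hypothetical and caching is not modeled here.

```python
GPT55_INPUT_PER_M = 5.00    # $ per 1M input tokens (standard rate)
GPT55_OUTPUT_PER_M = 30.00  # $ per 1M output tokens (standard rate)
BATCH_DISCOUNT = 0.50       # assumed to apply to input and output alike


def gpt55_monthly_cost(requests, input_tokens, output_tokens, batch=False):
    """Monthly GPT-5.5 cost in dollars, without caching."""
    input_cost = requests * input_tokens * GPT55_INPUT_PER_M / 1_000_000
    output_cost = requests * output_tokens * GPT55_OUTPUT_PER_M / 1_000_000
    total = input_cost + output_cost
    return total * (1 - BATCH_DISCOUNT) if batch else total


standard = gpt55_monthly_cost(50_000, 6_000, 600)
batched = gpt55_monthly_cost(50_000, 6_000, 600, batch=True)
print(f"standard: ${standard:,.0f}, batch: ${batched:,.0f}")
```

For the RAG workload above (6k input, 600 output, 50k requests/month) this gives $2,400 at standard rates and $1,200 with batch processing.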

Question 4

Does prompt caching really reduce cost?

Accepted Answer

Substantially. Prompt caching discounts repeated input tokens by roughly 90% at OpenAI, Anthropic, and Google (via cache reads), and by up to ~98% on DeepSeek. For agents with large, stable system prompts that repeat on every call, caching alone can cut total spend by more than half. Set the caching slider to match how much of your prompt repeats.
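A rough way to see the effect is to blend the standard and cached input rates by the share of the prompt that repeats. The sketch below does that; the function name and the example agent (20k-token prompt, 300 output tokens, 95% of input cached) are hypothetical, and it ignores any cache-write or storage fees a provider may charge.

```python
def spend_saved_by_caching(input_tokens, output_tokens, input_price,
                           cached_price, output_price, cached_fraction):
    """Fraction of per-request spend removed by caching (0.0 to 1.0).

    Prices are in $ per 1M tokens; cached_fraction is the share of input
    tokens billed at the cached rate.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    output_cost = output_tokens * output_price
    uncached_total = input_tokens * input_price + output_cost
    blended_input_price = ((1 - cached_fraction) * input_price
                           + cached_fraction * cached_price)
    cached_total = input_tokens * blended_input_price + output_cost
    return 1 - cached_total / uncached_total


# Hypothetical agent: 20k-token prompt, 300 output tokens, 95% of input cached,
# priced at $5.00 input, $0.50 cached input, $30.00 output per 1M tokens.
saving = spend_saved_by_caching(20_000, 300, 5.00, 0.50, 30.00, 0.95)
print(f"{saving:.0%}")  # prints 78%
```

The saving shrinks as output tokens take up more of the bill, since caching only discounts input.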

Question 5

Is DeepSeek really cheaper than GPT-5.5 and Claude?

Accepted Answer

Yes, dramatically. At list prices DeepSeek V4 Pro is roughly 12× cheaper on input and about 29–34× cheaper on output than GPT-5.5 and Claude Opus 4.8, while staying near-flagship on reasoning. For high-volume, cost-sensitive workloads DeepSeek and Gemini Flash are usually the value leaders; for the hardest reasoning, the frontier models (GPT-5.5, Claude Opus 4.8, Gemini 3.1 Pro) still lead.
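The size of the gap can be read straight off the list prices. This sketch computes per-token price ratios against DeepSeek V4 Pro using the rates in the price table in this document; the helper name is illustrative, and the comparison covers price only, not quality.

```python
# List prices in $ per 1M tokens as (input, output), from the price table.
PRICES = {
    "DeepSeek V4 Pro": (0.43, 0.87),
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.8": (5.00, 25.00),
}


def price_ratio(expensive, cheap):
    """Return (input_ratio, output_ratio) of one model's prices to another's."""
    exp_in, exp_out = PRICES[expensive]
    cheap_in, cheap_out = PRICES[cheap]
    return exp_in / cheap_in, exp_out / cheap_out


for model in ("GPT-5.5", "Claude Opus 4.8"):
    in_ratio, out_ratio = price_ratio(model, "DeepSeek V4 Pro")
    print(f"{model}: {in_ratio:.1f}x input, {out_ratio:.1f}x output")
```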

Model	Maker	Open weights	Context	Per request	Monthly
DeepSeek V4 Flash (cheapest)	DeepSeek	yes	1.0M	<$0.01	$25.70
Llama 4 Scout	Meta	yes	10M	<$0.01	$33.00
Grok 4.1 Fast	xAI	no	2M	<$0.01	$48.00
Llama 4 Maverick	Meta	yes	1M	<$0.01	$63.00
DeepSeek V4 Pro	DeepSeek	yes	1.0M	<$0.01	$81.00
Gemini 3.1 Flash-Lite	Google	no	1.0M	<$0.01	$120
Mistral Large 3 (2512)	Mistral AI	yes	262K	<$0.01	$195
Kimi K2.6	Moonshot AI	yes	262K	<$0.01	$214
DeepSeek-R1	DeepSeek	yes	131K	<$0.01	$231
MiniMax M3	MiniMax	yes	1M	<$0.01	$252
Claude Haiku 4.5	Anthropic	no	200K	<$0.01	$288
Amazon Nova Pro	Amazon	no	300K	<$0.01	$336
Grok 4.3	xAI	no	1M	<$0.01	$450
Gemini 3.5 Flash	Google	no	1.0M	<$0.01	$477
GLM-5.2	Z.ai	yes	1M	$0.0110	$552
Qwen3.7-Max	Alibaba	no	1M	$0.0114	$570
Gemini 3.1 Pro	Google	no	2M	$0.0127	$636
GPT-5.4	OpenAI	no	1M	$0.0159	$795
Claude Sonnet 4.6	Anthropic	no	1M	$0.0173	$864
GPT-5.2-Codex	OpenAI	no	400K	$0.0189	$945
Command A	Cohere	yes	256K	$0.0210	$1,050
Amazon Nova Premier	Amazon	no	1M	$0.0225	$1,125
Claude Opus 4.8	Anthropic	no	1M	$0.0288	$1,440
GPT-5.5	OpenAI	no	1.1M	$0.0318	$1,590
GPT-5.5 Pro	OpenAI	no	1.1M	$0.2880	$14.4k
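
The table above does not state the workload behind its figures. As an inference only, the numbers appear consistent with 50k requests/month, 6k input and 600 output tokens per request, and 60% of input tokens billed at the cached rate where one is listed. The sketch below, with a hypothetical function name and those assumed defaults, reproduces three of the rows from the list prices in this document.

```python
def monthly_cost(input_price, output_price, cached_price=None,
                 requests=50_000, input_tokens=6_000, output_tokens=600,
                 cached_fraction=0.60):
    """Monthly cost in dollars; prices are in $ per 1M tokens.

    The defaults are an inferred workload, not one stated by the source.
    Models without a listed cached rate pay the full input price.
    """
    if cached_price is None:
        cached_price = input_price
    blended_input_price = ((1 - cached_fraction) * input_price
                           + cached_fraction * cached_price)
    per_request = (input_tokens * blended_input_price
                   + output_tokens * output_price) / 1_000_000
    return per_request * requests


gpt55 = monthly_cost(5.00, 30.00, cached_price=0.50)
deepseek_flash = monthly_cost(0.14, 0.28, cached_price=0.0028)
llama_scout = monthly_cost(0.08, 0.30)  # no cached rate listed
print(round(gpt55, 2), round(deepseek_flash, 2), round(llama_scout, 2))
```

Under those assumptions the three results come out at about $1,590, $25.70, and $33.00, matching the GPT-5.5, DeepSeek V4 Flash, and Llama 4 Scout rows.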

Model	Maker	Input ($/1M tokens)	Output ($/1M tokens)	Cached input ($/1M tokens)	Context (tokens)
GPT-5.5	OpenAI	$5.00	$30.00	$0.50	1.1M
GPT-5.5 Pro	OpenAI	$30.00	$180.00	—	1.1M
GPT-5.4	OpenAI	$2.50	$15.00	$0.25	1M
GPT-5.2-Codex	OpenAI	$1.75	$14.00	—	400K
Claude Opus 4.8	Anthropic	$5.00	$25.00	$0.50	1M
Claude Sonnet 4.6	Anthropic	$3.00	$15.00	$0.30	1M
Claude Haiku 4.5	Anthropic	$1.00	$5.00	$0.10	200K
Gemini 3.1 Pro	Google	$2.00	$12.00	$0.20	2M
Gemini 3.5 Flash	Google	$1.50	$9.00	$0.15	1.0M
Gemini 3.1 Flash-Lite	Google	$0.25	$1.50	—	1.0M
DeepSeek V4 Pro	DeepSeek	$0.43	$0.87	$0.0150	1.0M
DeepSeek V4 Flash	DeepSeek	$0.14	$0.28	$0.0028	1.0M
DeepSeek-R1	DeepSeek	$0.55	$2.19	—	131K
Grok 4.3	xAI	$1.25	$2.50	—	1M
Grok 4.1 Fast	xAI	$0.20	$0.50	$0.0500	2M
Mistral Large 3 (2512)	Mistral AI	$0.50	$1.50	—	262K
Qwen3.7-Max	Alibaba	$2.50	$7.50	$0.25	1M
Kimi K2.6	Moonshot AI	$0.67	$3.50	$0.16	262K
GLM-5.2	Z.ai	$1.40	$4.40	—	1M
Command A	Cohere	$2.50	$10.00	—	256K
Amazon Nova Premier	Amazon	$2.50	$12.50	—	1M
Amazon Nova Pro	Amazon	$0.80	$3.20	—	300K
MiniMax M3	MiniMax	$0.60	$2.40	—	1M
Llama 4 Maverick	Meta	$0.15	$0.60	—	1M
Llama 4 Scout	Meta	$0.08	$0.30	—	10M

AI Cost Calculator 2026

2026 LLM API price reference · per 1M tokens · verified 2026-06-20

