AI Cost Calculator 2026
Enter your usage and instantly see what every major LLM API will cost you per month — GPT-5.5, Claude Opus 4.8, Gemini 3.1, DeepSeek V4, Grok, Llama 4 and more, ranked cheapest first. Includes prompt-caching and batch discounts. Free, no signup, results are shareable.
| Model | Relative cost | Per request | Monthly |
|---|---|---|---|
CHEAPESTDeepSeek V4 Flashopen DeepSeek · 1.0M ctx | <$0.01 | $25.70 | |
Llama 4 Scoutopen Meta · 10M ctx | <$0.01 | $33.00 | |
Grok 4.1 Fast xAI · 2M ctx | <$0.01 | $48.00 | |
Llama 4 Maverickopen Meta · 1M ctx | <$0.01 | $63.00 | |
DeepSeek V4 Proopen DeepSeek · 1.0M ctx | <$0.01 | $81.00 | |
Gemini 3.1 Flash-Lite Google · 1.0M ctx | <$0.01 | $120 | |
Mistral Large 3 (2512)open Mistral AI · 262K ctx | <$0.01 | $195 | |
Kimi K2.6open Moonshot AI · 262K ctx | <$0.01 | $214 | |
DeepSeek-R1open DeepSeek · 131K ctx | <$0.01 | $231 | |
MiniMax M3open MiniMax · 1M ctx | <$0.01 | $252 | |
Claude Haiku 4.5 Anthropic · 200K ctx | <$0.01 | $288 | |
Amazon Nova Pro Amazon · 300K ctx | <$0.01 | $336 | |
Grok 4.3 xAI · 1M ctx | <$0.01 | $450 | |
Gemini 3.5 Flash Google · 1.0M ctx | <$0.01 | $477 | |
GLM-5.2open Z.ai · 1M ctx | $0.0110 | $552 | |
Qwen3.7-Max Alibaba · 1M ctx | $0.0114 | $570 | |
Gemini 3.1 Pro Google · 2M ctx | $0.0127 | $636 | |
GPT-5.4 OpenAI · 1M ctx | $0.0159 | $795 | |
Claude Sonnet 4.6 Anthropic · 1M ctx | $0.0173 | $864 | |
GPT-5.2-Codex OpenAI · 400K ctx | $0.0189 | $945 | |
Command Aopen Cohere · 256K ctx | $0.0210 | $1050 | |
Amazon Nova Premier Amazon · 1M ctx | $0.0225 | $1125 | |
Claude Opus 4.8 Anthropic · 1M ctx | $0.0288 | $1440 | |
GPT-5.5 OpenAI · 1.1M ctx | $0.0318 | $1590 | |
GPT-5.5 Pro OpenAI · 1.1M ctx | $0.2880 | $14.4k |
2026 LLM API price reference · per 1M tokens · verified 2026-06-20
| Model | Maker | Input | Output | Cached in | Context |
|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | $5.00 | $30.00 | $0.50 | 1.1M |
| GPT-5.5 Pro | OpenAI | $30.00 | $180.00 | — | 1.1M |
| GPT-5.4 | OpenAI | $2.50 | $15.00 | $0.25 | 1M |
| GPT-5.2-Codex | OpenAI | $1.75 | $14.00 | — | 400K |
| Claude Opus 4.8 | Anthropic | $5.00 | $25.00 | $0.50 | 1M |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | $0.30 | 1M |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | $0.10 | 200K |
| Gemini 3.1 Pro | $2.00 | $12.00 | $0.20 | 2M | |
| Gemini 3.5 Flash | $1.50 | $9.00 | $0.15 | 1.0M | |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | — | 1.0M | |
| DeepSeek V4 Pro | DeepSeek | $0.43 | $0.87 | $0.0150 | 1.0M |
| DeepSeek V4 Flash | DeepSeek | $0.14 | $0.28 | $0.0028 | 1.0M |
| DeepSeek-R1 | DeepSeek | $0.55 | $2.19 | — | 131K |
| Grok 4.3 | xAI | $1.25 | $2.50 | — | 1M |
| Grok 4.1 Fast | xAI | $0.20 | $0.50 | $0.0500 | 2M |
| Mistral Large 3 (2512) | Mistral AI | $0.50 | $1.50 | — | 262K |
| Qwen3.7-Max | Alibaba | $2.50 | $7.50 | $0.25 | 1M |
| Kimi K2.6 | Moonshot AI | $0.67 | $3.50 | $0.16 | 262K |
| GLM-5.2 | Z.ai | $1.40 | $4.40 | — | 1M |
| Command A | Cohere | $2.50 | $10.00 | — | 256K |
| Amazon Nova Premier | Amazon | $2.50 | $12.50 | — | 1M |
| Amazon Nova Pro | Amazon | $0.80 | $3.20 | — | 300K |
| MiniMax M3 | MiniMax | $0.60 | $2.40 | — | 1M |
| Llama 4 Maverick | Meta | $0.15 | $0.60 | — | 1M |
| Llama 4 Scout | Meta | $0.08 | $0.30 | — | 10M |
Standard list rates. For caching/batch/long-context levers and official sources, see the full LLM API pricing guide.
AI cost calculator — FAQ
How do I estimate my LLM API cost?
Multiply your monthly requests by the input and output tokens per request, then apply each model's per-million-token rate: cost = (monthly input tokens × input price + monthly output tokens × output price) ÷ 1,000,000. This calculator does it across every major 2026 model at once and adds caching and batch discounts. Output tokens cost 3–6× input almost everywhere, so output length is usually the biggest lever.
What is the cheapest AI model API in 2026?
For capable models, DeepSeek V4 Flash and xAI Grok 4.1 Fast are the cheapest per token, then Google Gemini Flash tiers. Open-weight models can be self-hosted to remove per-token API cost entirely. With heavy prompt caching, DeepSeek cache hits fall below $0.01 per million input tokens. The cheapest model for YOUR workload depends on your input/output ratio — use the calculator above.
How much does the GPT-5.5 API cost?
GPT-5.5 is $5.00 per million input tokens and $30.00 per million output tokens at standard rates, with cached input discounted. Batch processing runs at 50% off. For a typical RAG app (6k input, 600 output, 50k requests/month) that is roughly $1,200–$1,700/month before caching.
Does prompt caching really reduce cost?
Substantially. Prompt caching discounts repeated input tokens by roughly 90% at OpenAI and Anthropic, ~90% via cache reads at Google, and up to ~98% on DeepSeek. For agents with large, stable system prompts that repeat on every call, caching alone can cut total spend by more than half — set the caching slider to match how much of your prompt repeats.
Is DeepSeek really cheaper than GPT-5.5 and Claude?
Yes, dramatically. DeepSeek V4 is many times cheaper on both input and output than GPT-5.5 and Claude Opus while staying near-flagship on reasoning. For high-volume, cost-sensitive workloads DeepSeek and Gemini Flash are usually the value leaders; for the hardest reasoning, the frontier models (GPT-5.5, Claude Opus 4.8, Gemini 3.1 Pro) still lead.
Embed this calculator — free
Add the AI Cost Calculator to your own site or blog. Copy this snippet:
<iframe src="https://gentic.news/embed/ai-cost-calculator" width="100%" height="920" style="border:1px solid #262626;border-radius:12px" title="AI Cost Calculator by gentic.news" loading="lazy"></iframe>