Cursor Launches Composer 2 with $0.50/M Input Token Pricing, Claims Major Benchmark Gains

Cursor has released Composer 2, a coding AI model priced at $0.50 per million input tokens and $2.50 per million output tokens. The company reports significant benchmark improvements over previous versions across CursorBench, Terminal-Bench 2.0, and SWE-bench Multilingual.

Ala AYADI & AI Research Desk · Mar 19, 2026 · AI-Generated

Source: x.com via @kimmonismus (single source)

What Happened

Cursor has launched Composer 2, a new version of its coding AI model, making it available within the Cursor IDE. The announcement came via a tweet from @kimmonismus, highlighting both pricing and performance claims.

The key pricing details:

Input tokens: $0.50 per million tokens
Output tokens: $2.50 per million tokens
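
To make the rates concrete, here is a minimal sketch of a per-request cost estimate. It assumes billing is strictly linear in token count at the two quoted rates, which the announcement does not confirm. The function name `estimate_cost` and the example workload are hypothetical.

```python
# Rates quoted in the announcement, in USD per million tokens.
INPUT_RATE_PER_M = 0.50
OUTPUT_RATE_PER_M = 2.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming linear per-token billing."""
    total = input_tokens * INPUT_RATE_PER_M + output_tokens * OUTPUT_RATE_PER_M
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
cost = estimate_cost(200_000, 20_000)
print(f"${cost:.2f}")  # prints $0.15
```

Output tokens cost five times as much as input tokens, so the output side of a workload can dominate the bill even when it is much smaller than the input side.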

Performance Claims

According to the announcement, Composer 2 shows "major jumps over prior versions" on three coding benchmarks:

CursorBench: 61.3
Terminal-Bench 2.0: 61.7
SWE-bench Multilingual: 73.7

The tweet describes this as "a notably very very strong price-performance launch for coding AI" and adds that "pressure on OpenAI and Anthropic increased a lot."

Context

Cursor is an AI-powered code editor that has gained popularity among developers for its deep integration of AI assistance directly into the development workflow. The original Composer model was positioned as a coding-specific alternative to general-purpose models like GPT-4 and Claude.

The pricing places Composer 2 well below the list prices of earlier frontier models from OpenAI and Anthropic:

OpenAI's GPT-4 Turbo: $10.00 per million input tokens, $30.00 per million output tokens
Anthropic's Claude 3 Opus: $15.00 per million input tokens, $75.00 per million output tokens

However, the announcement does not provide:

Baseline scores for previous Composer versions
Direct comparisons to competing models on these benchmarks
Details about model architecture, training data, or context window
Information about availability outside the Cursor IDE

What's Next

The launch represents Cursor's attempt to compete on both price and specialized coding performance. Developers using Cursor will now have access to what the company claims is a significantly improved model at aggressive pricing.

Until the benchmark claims are independently verified and more is known about the model's capabilities, its effect on the competitive landscape will depend on real-world developer adoption and performance in production coding workflows.

Source: gentic.news · Mar 19, 2026 · Ala AYADI

AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala AYADI.


AI Analysis

The pricing is disruptive if the performance claims hold up. At $0.50 per million input tokens, Composer 2 costs 5% of GPT-4 Turbo's input price and 3.3% of Claude 3 Opus's. For output tokens, it costs 8.3% of GPT-4 Turbo's price and 3.3% of Claude 3 Opus's. This is a compelling price-performance proposition for coding tasks, assuming the benchmark improvements translate to real-world utility.

The benchmark scores are difficult to evaluate without context. CursorBench is Cursor's own benchmark, so cross-model comparison depends on Cursor publishing results for competing models. Terminal-Bench 2.0 and SWE-bench Multilingual are public benchmarks, but the announcement gives no competitor scores for either. The SWE-bench Multilingual score of 73.7 is the most interpretable of the three, yet it should be compared only against results from Anthropic and OpenAI models on that same benchmark; scores on SWE-bench Verified, a different task set, are not directly comparable.

Practitioners should test Composer 2 against their specific coding workflows rather than relying solely on benchmark claims. The real question is whether the model's specialized training for coding tasks provides tangible advantages over general-purpose models on complex, domain-specific problems. The aggressive pricing makes experimentation low-risk, but developers should verify that the model's capabilities match their needs before committing to significant architectural changes.
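
The price ratios quoted in this analysis can be reproduced with a few lines of arithmetic. The sketch below uses only the per-million-token prices listed in the article. The `PRICES` table and the `price_ratio_percent` helper are illustrative names, not part of any vendor API.

```python
# (input, output) list prices in USD per million tokens, as given in the article.
PRICES = {
    "Composer 2": (0.50, 2.50),
    "GPT-4 Turbo": (10.00, 30.00),
    "Claude 3 Opus": (15.00, 75.00),
}


def price_ratio_percent(model: str, baseline: str) -> tuple[float, float]:
    """Return the model's input and output price as a percentage of the baseline's."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return (
        round(100 * model_in / base_in, 1),
        round(100 * model_out / base_out, 1),
    )


for baseline in ("GPT-4 Turbo", "Claude 3 Opus"):
    in_pct, out_pct = price_ratio_percent("Composer 2", baseline)
    print(f"vs {baseline}: input {in_pct}%, output {out_pct}%")
# prints:
# vs GPT-4 Turbo: input 5.0%, output 8.3%
# vs Claude 3 Opus: input 3.3%, output 3.3%
```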

