gentic.news — AI News Intelligence Platform
A bar chart comparing Cursor Composer 2.5, Opus 4.7, and GPT-5.5 performance scores and costs, with Composer 2.5…
Products & Launches · Breakthrough · Score: 95

Cursor's Composer 2.5 matches Opus 4.7, GPT-5.5 at fraction of cost

Cursor's Composer 2.5 scores 79.8% on SWE-Bench Multilingual at $0.50/M input tokens, matching Opus 4.7 and GPT-5.5 at up to 30x lower input cost.

Source: the-decoder.com (via the_decoder) · Widely Reported
How does Cursor's Composer 2.5 compare to Opus 4.7 and GPT-5.5 on benchmarks and cost?

Cursor's Composer 2.5 scores 79.8% on SWE-Bench Multilingual and 63.2% on CursorBench v3.1, matching Anthropic's Opus 4.7 and OpenAI's GPT-5.5, but costs $0.50 per million input tokens versus Opus 4.7's $15.

TL;DR

Composer 2.5 scores 79.8% on SWE-Bench Multilingual · Built on Kimi K2.5 with 25x more synthetic tasks · Costs $0.50/M input tokens versus Opus 4.7's $15

Cursor's Composer 2.5 scores 79.8% on SWE-Bench Multilingual, matching Anthropic's Opus 4.7 and OpenAI's GPT-5.5. It costs $0.50 per million input tokens versus Opus 4.7's $15.

Key facts

  • Composer 2.5 scores 79.8% on SWE-Bench Multilingual
  • Costs $0.50/M input tokens vs Opus 4.7's $15/M
  • Trained on 25x more synthetic tasks than Composer 2
  • 85% of compute budget went to extra training and RL
  • Successor model training on Colossus-2 with 1M H100 equivalents

Cursor shipped Composer 2.5, a major upgrade to its in-house AI coding model built on the open-source Kimi K2.5 checkpoint from Moonshot AI. According to the company's blog post, the model was trained on 25 times more synthetic tasks than its predecessor, Composer 2, and 85 percent of the compute budget went toward extra training and reinforcement learning.

On SWE-Bench Multilingual (79.8 percent) and CursorBench v3.1 (63.2 percent), Composer 2.5 matches Anthropic's Opus 4.7 and OpenAI's GPT-5.5. What stands out is not that a smaller model matches frontier labs, but that Cursor does so at a 30x lower input-token price than Opus 4.7, which changes the economics of AI coding tools.

Pricing and variants

The standard Composer 2.5 costs $0.50 per million input tokens and $2.50 per million output tokens. A faster variant with the same benchmark performance costs $3.00 per million input tokens and $15.00 per million output tokens. For comparison, Anthropic charges $15 per million input tokens for Opus 4.7 [according to Anthropic's pricing page]. OpenAI's GPT-5.5 costs $10 per million input tokens [per OpenAI's API docs].
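The arithmetic behind the headline cost comparison can be checked with a short sketch. It uses only the per-million-token prices quoted above. The helper names and the 200-million-token workload are hypothetical, chosen for illustration, and output prices for Opus 4.7 and GPT-5.5 are left unset because the article does not state them.

```python
# Per-million-token prices in USD, as quoted in the article.
# Output prices for Opus 4.7 and GPT-5.5 are not given there, so they stay None.
PRICES = {
    "Composer 2.5": {"input": 0.50, "output": 2.50},
    "Composer 2.5 (fast)": {"input": 3.00, "output": 15.00},
    "Opus 4.7": {"input": 15.00, "output": None},
    "GPT-5.5": {"input": 10.00, "output": None},
}


def input_price_ratio(model, baseline="Composer 2.5"):
    """How many times more expensive `model` is than `baseline` per input token."""
    return PRICES[model]["input"] / PRICES[baseline]["input"]


def input_cost(model, input_tokens):
    """Input-side cost in USD for a given number of input tokens."""
    return PRICES[model]["input"] * input_tokens / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload size, for illustration only.
    workload_tokens = 200_000_000
    for name in PRICES:
        ratio = input_price_ratio(name)
        cost = input_cost(name, workload_tokens)
        print(f"{name}: {ratio:.0f}x baseline input price, ${cost:,.2f} for the workload")
```

On these figures the 30x gap applies to Opus 4.7 input pricing; against GPT-5.5 the input-price ratio is 20x, and the faster Composer 2.5 variant is 6x the standard one.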

Composer 2.5 is live in Cursor now. The company did not disclose the exact training cost but noted that a much larger successor model is already in training with SpaceX and xAI, using ten times the compute on the Colossus-2 cluster with one million H100 equivalents. SpaceX had previously announced plans to acquire Cursor for $60 billion.

What this means for the coding tools market

Cursor's approach — fine-tuning an open-source checkpoint with massive synthetic data and RL — directly challenges the narrative that only frontier labs with $10B+ training runs can produce top-tier coding models. If Composer 2.5 maintains this performance in production, it pressures Anthropic and OpenAI to justify their premium pricing for coding tasks.

What to watch

Watch for independent benchmark reproductions of Composer 2.5 on SWE-Bench Verified, and whether Cursor publishes a technical report detailing the synthetic data generation pipeline. The successor model trained with SpaceX and xAI on Colossus-2 could set a new bar for coding benchmarks in Q3 2026.


Sources cited in this article

  1. Anthropic's pricing page
  2. OpenAI's API docs
Source: gentic.news

AI-assisted reporting. Generated by gentic.news from 2 verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.


AI Analysis

Cursor's achievement is less about raw benchmark numbers and more about the cost structure. The company has demonstrated that with sufficient synthetic data and RL compute, an open-source checkpoint can match frontier models at a fraction of the API cost. This mirrors what DeepSeek showed with V3 in late 2024: the gap between open and closed models narrows rapidly when you optimize for a specific domain.

The 30x input-price advantage over Opus 4.7 is the structural story. If Composer 2.5 performs similarly in production, it forces Anthropic and OpenAI to either cut prices for coding workloads or justify the premium with reliability gains. Given that Cursor already has distribution as a code editor, this could accelerate the shift away from general-purpose frontier APIs for coding.

The successor model with SpaceX and xAI is a wildcard. Using ten times the compute on Colossus-2 suggests Cursor is not content with being a cost leader and is aiming for absolute performance leadership. The $60 billion acquisition by SpaceX, if completed, would give Cursor access to far larger compute resources.