Technique · reasoning
Test-Time Compute Scaling
Allocating more compute at inference time (longer reasoning chains, or drawing multiple samples and selecting among them with a verifier) can outperform scaling model parameters; this is the basis for o1-style reasoning models.
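One common form of test-time compute scaling is self-consistency: sample several independent reasoning chains and take a majority vote over their final answers. The sketch below is illustrative only; `sample_answer` is a hypothetical stub standing in for a real model call with temperature > 0.

```python
import random
from collections import Counter

def sample_answer(question: str, rng: random.Random) -> str:
    """Hypothetical stub for one sampled reasoning chain.
    A real system would call an LLM and extract the final answer."""
    # Simulate a model that answers correctly about 70% of the time.
    return "42" if rng.random() < 0.7 else str(rng.randint(0, 9))

def majority_vote(question: str, n_samples: int, seed: int = 0) -> str:
    """Self-consistency: draw n independent samples and return the
    modal answer. More samples = more inference compute."""
    rng = random.Random(seed)
    answers = [sample_answer(question, rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(majority_vote("What is 6 * 7?", n_samples=32))
```

Increasing `n_samples` spends more compute per query and makes the modal answer more reliable, without touching model parameters.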
Deployment timeline
- DeepSeek-R1 · high
Deployed 2026-03-17 · Velocity 1.6y
“Employs iterative refinement and multiple reasoning samples at inference time.”
- Claude 3.5 Opus · medium
Deployed 2026-03-18 · Velocity 1.6y
“Claude 3.5 Opus uses longer reasoning chains for complex problems, allocating more inference compute.”
- GPT-Rosalind · high
Deployed 2026-04-16 · Velocity 1.7y
“The model uses test-time compute scaling via multiple sampled reasoning paths and majority voting.”
- Claude Opus 4.7 · high
Deployed 2026-04-16 · Velocity 1.7y
“The product description highlights 'xhigh thinking effort', which is Anthropic's terminology for allocating more compute for longer, more thorough reasoning chains at inference time.”
- Kimi K2.6 · high
Deployed 2026-04-20 · Velocity 1.7y
“Supports up to 13h continuous reasoning for long-horizon tasks, allocating substantial compute at inference time.”