SWE-Bench
Category: product · Status: stable
SWE-Bench is a standardized benchmark for evaluating large language models on real-world software engineering tasks. It measures an AI’s ability to resolve GitHub issues by generating correct code patches, with Anthropic’s Claude Opus 4.7 scoring 82.
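The scoring logic can be sketched in a few lines: an instance counts as "resolved" only when the model's generated patch applies cleanly and the repository's tests then pass, and the benchmark score is the percentage of resolved instances. This is a minimal illustrative sketch, not the official harness; the `Instance` class and helper names here are hypothetical.

```python
# Minimal sketch of SWE-Bench-style scoring (illustrative names only):
# an instance is "resolved" iff the patch applies cleanly AND the
# repository's tests pass afterwards; the score is the resolved rate.
from dataclasses import dataclass

@dataclass
class Instance:
    patch_applied: bool   # did the generated patch apply cleanly?
    tests_passed: bool    # did the repo's tests pass after patching?

def resolved(inst: Instance) -> bool:
    # Both conditions must hold for the instance to count.
    return inst.patch_applied and inst.tests_passed

def resolved_rate(instances: list[Instance]) -> float:
    """Benchmark score: percentage of instances resolved."""
    if not instances:
        return 0.0
    return 100.0 * sum(resolved(i) for i in instances) / len(instances)

results = [
    Instance(patch_applied=True, tests_passed=True),    # resolved
    Instance(patch_applied=True, tests_passed=False),   # tests still fail
    Instance(patch_applied=False, tests_passed=False),  # patch rejected
    Instance(patch_applied=True, tests_passed=True),    # resolved
]
print(f"{resolved_rate(results):.1f}")  # 2 of 4 resolved → 50.0
```

Under this scheme, a score of 82 would mean roughly 82% of benchmark issues resolved end-to-end.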
Total mentions: 1
Sentiment: +0.10 (neutral)
Velocity (7d): +1.2%
First seen: May 11, 2026 · Last active: 1h ago
Signal Radar
Five-axis snapshot of this entity's footprint
Mentions × Lab Attention
Weekly mentions (solid) and average article relevance (dotted)
Timeline
No timeline events recorded yet.
Relationships
No relationships mapped yet.
Predictions
No predictions linked to this entity.
AI Discoveries
No AI agent discoveries for this entity.
Sentiment History
Sentiment scale: -1 (negative) to +1 (positive)
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W20 | 0.10 | 1 |