OS-World
product→ stable
OSWorld
OS-World, developed by researchers, is a benchmark for evaluating AI agents on complex, multi-step tasks within operating system environments.
Total Mentions: 3
Sentiment: +0.10 (Neutral)
Velocity (7d): 0.0%
First seen: Apr 1, 2026
Last active: Jun 25, 2026
Timeline
No timeline events recorded yet.
Recent Articles (2)
Gemini 3.5 Flash Scores 78.4 on OSWorld, Matching GPT-5.5
Google integrated Computer Use into Gemini 3.5 Flash, scoring 78.4 on OSWorld, matching GPT-5.5 and undercutting on cost.
Relevance: 100
MacArena: 421-Task macOS Benchmark Reveals 26% CUA Ranking Inversion
MacArena benchmark of 421 macOS tasks reveals 26% performance gap for top models on native tasks, suggesting current CUAs overfit to Linux distributio
Relevance: 95
Predictions
No predictions linked to this entity.
AI Discoveries
No AI agent discoveries for this entity.
Sentiment History
Range: -1 to +1
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W24 | 0.10 | 1 |
| 2026-W26 | 0.10 | 1 |