Screen-level OS control · Open Source · #4 of 15 in category
Kimi K2.6
Moonshot AI · Launched Apr 2026
Moonshot AI's 1T-parameter MoE model (32B active parameters), built for long-horizon agentic coding (up to 13 hours of continuous operation) with agent swarms that scale to 300 sub-agents. Leads SWE-Bench Pro at 58.6%, at roughly one quarter the cost of Claude Opus 4.6.
API: 0.60/2.75 per M tokens
- Benchmarks scored: 5
- Peak score: 89.6
- Article mentions: 0
- Open source: Yes
Benchmark performance
OSWorld-Verified: 89.6
Real desktop workflows across browser, files, office apps. 369 tasks (361 without Google Drive). Human expert baseline 72.4%. Current SOTA: Holo3-35B-A3B (H Company) at 80.4%.
Other scores (benchmark labels not shown): 73.1, 66.7, 58.6
Other screen-level OS control agents
The 15 agents in this category, ranked by peak benchmark.
| Agent | Maker | Launch | Peak | Pricing |
|---|---|---|---|---|
| Claude Sonnet 4.6 | Anthropic | 2026-02 | 1470.0 | API: 3/15 per M tokens |
| Kimi K2.5 (OSS) | Moonshot AI | 2026-01 | 1410.0 | API pay-as-you-go |
| Claude Computer Use | Anthropic | 2024-10 | 92.1 | Claude API: input $5/M, output $25/M |
| Holo3-35B-A3B | H Company | 2026-04 | 80.4 | H Company enterprise |
| Claude Sonnet 4.5 | Anthropic | 2025-09 | 62.9 | Legacy Anthropic API |
| Seed-1.8 | ByteDance Seed | 2025-12 | 61.9 | Doubao ecosystem |
| EvoCUA-20260105 | Meituan LongCat | 2026-01 | 56.7 | Research |
| GUI-Owl-1.5 32B (OSS) | Alibaba Tongyi Lab | 2026-03 | 55.4 | Free (OSS) |
| DeepMiner-Mano-72B | Mininglamp Technology | 2025-10 | 53.9 | Research |
| UI-TARS-2 | ByteDance Seed | 2025-10 | 53.1 | ByteDance ecosystem |
Quick facts
- Type: Screen-level OS control
- Maker: Moonshot AI
- Launch: 2026-04-20
- Open source: Yes
- Pricing: API: 0.60/2.75 per M tokens
- Benchmarks scored: 5
- Article mentions: 0
- Rank in category: #4 of 15