Coding-focused · #5 of 12 in category

Codex CLI

OpenAI · Launched Apr 2025

OpenAI's terminal coding agent. With GPT-5.5 it leads Terminal-Bench 2.1 at 83.4%.

Benchmarks scored

83.4

Peak score

Article mentions

Open source

Benchmark performance

Terminal-Bench 2.1

Held-out, contamination-resistant CLI tasks driven end-to-end in a real terminal. Version 2.1 is the 2026 standard for terminal autonomy.

83.4

Gap to SOTA: 0.0pp (held by Codex CLI with GPT-5.5)

SWE-Bench Verified

OpenAI-verified 500-issue subset of SWE-Bench. Approaching saturation in 2026: most frontier models score above 80%.

82.6

Gap to SOTA: -12.4pp (held by Claude Fable 5)
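The "Gap to SOTA" figures above appear to be simple differences in percentage points between an agent's score and the best reported score on the same benchmark. A minimal sketch of that arithmetic follows; the function name `gap_to_sota` is hypothetical, and the 95.0 used as the SOTA value is an assumption taken from Claude Fable 5's peak score in the category table on this page.

```python
def gap_to_sota(score: float, sota: float) -> float:
    """Return the gap in percentage points (negative when behind SOTA)."""
    # Round to one decimal place to match the precision shown on the page.
    return round(score - sota, 1)


# Scores as listed on this page; the SOTA values are assumptions (see lead-in).
terminal_bench_gap = gap_to_sota(83.4, 83.4)  # Codex CLI holds SOTA itself
swe_bench_gap = gap_to_sota(82.6, 95.0)       # SOTA assumed to be 95.0

print(f"Terminal-Bench 2.1: {terminal_bench_gap:+.1f}pp")
print(f"SWE-Bench subset:   {swe_bench_gap:+.1f}pp")
```

A gap of zero means the agent itself holds the top score; a negative gap means it trails the leader by that many points.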

The 12 agents in this category, ranked by peak benchmark score.

Agent	Maker	Launch	Peak	Pricing
Kimi K2.5 (OSS)	Moonshot AI	2026-01	1410.0	Open weights
Claude Fable 5	Anthropic	2026-05	95.0	$10 / $50 per M tokens
Kimi K2.6 (OSS)	Moonshot AI	2026-04	89.6	Open weights
Claude Code	Anthropic	2025-02	88.6	Claude Max / API
SWE-Agent (OSS)	Princeton + Stanford	2024-04	74.0	Open source (MIT)
Gemini CLI (OSS)	Google	2025-06	70.7	Free tier + API
GLM-5.1 (OSS)	Z.ai	2026-04	58.4	Open weights
Cursor Agent	Anysphere	2025-05	—	Cursor Pro $20/mo
Lovable	Lovable	2024-11	—	Freemium
OpenCode (OSS)	OpenCode	2025-06	—	Open source

2026-05-22

Microsoft Open-Sources AI Engineer Coach, a Fitbit for Dev Workflows

2026-04-29

Agentic Harness Engineering Boosts Coding Agents 7% on Terminal-Bench 2

2026-04-16

AgentPulse: The Open-Source Dashboard That Solves Claude Code's

2026-04-15

Claude Code's Security Defaults: What It Ships When You Don't Ask

2026-04-02

Open-Source 'Codex CLI' Emerges as Free Alternative to OpenAI's Tools, Claims 30-Agent Architecture

2026-03-26

RSES CLI: Hand Off Coding Sessions Between Claude Code, Codex, and OpenCode in One Command

2026-03-22

Bridge Claude Code and Codex CLI for Multi-Agent Conversations

2026-03-16

How to Orchestrate Claude Code with GPT and Gemini Using CLI Calls and Shared Context Files