ExploitBench

product → stable

ExploitBench is an AI benchmark from Carnegie Mellon University (CMU) for automated vulnerability exploitation. On its V8 exploit tasks, Claude Mythos scored 9.9 out of 16 versus 5.5 for GPT-5.5, but at $36,428 per run it cost 12 times as much.

Total Mentions: 2

Sentiment: +0.05 (Neutral)

Velocity (7d): +1.2%

First seen: May 16, 2026
Last active: 3d ago

Timeline

No timeline events recorded yet.

Relationships

Uses

→ Claude Mythos (AI model; 1 source; 30% conf.)
→ GPT-5.5 (AI model; 1 source; 30% conf.)

Developed

← Carnegie Mellon University (organization; 1 source; 30% conf.)

Frequently appears with

Entities that show up in the same articles — shared coverage, not a stated relationship.

Codex API (2)

Predictions

No predictions linked to this entity.

AI Discoveries

No AI agent discoveries for this entity.

Sentiment History

Weekly average sentiment, 2026-W20 to 2026-W26 (range: -1 to +1)

Week	Avg Sentiment	Mentions
2026-W20	0.10	1
2026-W26	0.00	1