Subgraph Atlas · centered on entity
WorkBench
product · 1 mentions · velocity: stable
WorkBench is a benchmark for evaluating AI coding agents on real-world software engineering tasks, developed by researchers to measure both capability and safety alignment, as seen in tests where Claude Opus 4.8 achieved 89% task completion with a 2.
Two-hop subgraph: this entity, every entity it directly relates to, and every entity those neighbors relate to.
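The two-hop definition above can be sketched in a few lines of Python. This is only an illustration, not the atlas's actual implementation: it assumes relations are undirected pairs, that the subgraph keeps every edge whose two endpoints both fall inside the two-hop node set, and the function name `two_hop_subgraph` and the placeholder entity names are hypothetical.

```python
def two_hop_subgraph(edges, center):
    """Return (nodes, edges) within two hops of `center`.

    Assumption: `edges` is a list of (a, b) pairs treated as undirected.
    """
    # Build an adjacency map from the edge list.
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    # Hop 1: entities the center relates to directly.
    first_hop = set(adj.get(center, ()))

    # Hop 2: entities those neighbors relate to.
    second_hop = set()
    for neighbor in first_hop:
        second_hop |= adj[neighbor]

    nodes = {center} | first_hop | second_hop

    # Keep only edges whose endpoints are both inside the node set.
    kept = [(a, b) for a, b in edges if a in nodes and b in nodes]
    return nodes, kept


# Placeholder entities; only "WorkBench" comes from the page.
example_edges = [
    ("WorkBench", "entity_a"),
    ("entity_a", "entity_b"),
    ("entity_b", "entity_c"),  # three hops away, so excluded
]
nodes, kept = two_hop_subgraph(example_edges, "WorkBench")
```

Re-centering the atlas on a neighbor corresponds to calling the same function with that neighbor as `center`.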