Subgraph Atlas · centered on entity
GDPval
benchmark · 3 mentions · velocity: stable
OpenAI's economic-impact benchmark, built from professional work tasks across 44 occupations. The main metric is a blinded pairwise judgment of deliverables by experts (70.8% inter-rater human agreement). It tests whether agents can do actual white-collar work.
Two-hop subgraph: this entity, every entity it directly relates to, and every entity those neighbors relate to. Drag a node, scroll to zoom, click to inspect — or click any neighbor and re-center the atlas there.
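The two-hop definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the atlas's actual implementation: relations are treated as undirected pairs, the function name `two_hop_subgraph` is hypothetical, and every entity in the sample data other than GDPval and OpenAI is a made-up placeholder.

```python
from collections import defaultdict


def two_hop_subgraph(edges, center):
    """Return (nodes, kept_edges) for the two-hop neighborhood of `center`.

    Assumption: `edges` is a list of (a, b) pairs and relations are undirected.
    """
    # Build an adjacency map so each entity knows what it relates to.
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Hop 1: every entity the center directly relates to.
    first_hop = set(adjacency[center])

    # Hop 2: every entity those neighbors relate to.
    second_hop = set()
    for neighbor in first_hop:
        second_hop |= adjacency[neighbor]

    nodes = {center} | first_hop | second_hop

    # Keep only relations that touch the center or a first-hop neighbor;
    # a relation between two second-hop entities is outside the definition.
    core = {center} | first_hop
    kept_edges = [(a, b) for a, b in edges if a in core or b in core]
    return nodes, kept_edges


# Hypothetical sample data; only the GDPval/OpenAI relation comes from the page.
example_edges = [
    ("GDPval", "OpenAI"),
    ("OpenAI", "example_model"),
    ("example_model", "example_framework"),
    ("example_framework", "far_away_lab"),
]
nodes, kept_edges = two_hop_subgraph(example_edges, "GDPval")
```

Re-centering the atlas on a neighbor corresponds to calling the same function with that neighbor as `center`.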
company · person · ai_model · product · research_lab · benchmark · framework