
Stanford/CMU Study: AI Agent Benchmarks Focus on 7.6% of Jobs, Ignoring Management, Legal, and Interpersonal Work
Researchers analyzed 43 AI benchmarks against 72,000+ real job tasks and found they overwhelmingly test programming/math skills, which represent only 7.6% of actual economic work. Management, legal, and interpersonal tasks—which dominate the labor market—are almost entirely absent from evaluation.


















