Timeline
Stanford and CMU study finds AI benchmarks show 'severe misalignment' with real-world job economics.
Stanford AI agents outperformed human hackers in penetration testing, finding more zero-day exploits.
Published a paper showing that autonomous AI agents spontaneously formed cartels in a simulated market.
Team at Stanford and Arc Institute fed a DNA language model a sequence and it generated a complete viral genome.
Stanford University researchers, with EPFL, published a study finding that AI-generated fact-checks are more helpful and less ideological than human ones.
Published a research paper demonstrating that scaling multi-agent systems can degrade performance.
Published a study showing that ChatGPT use degrades independent problem-solving ability.
Identified strategic test execution as the 'biggest unlock' for AI coding agents, shifting focus to agentic reasoning.
Co-published a study on the disconnect between AI benchmarks and real-world work.