Timeline
Gemini launches ability to create Google Docs, Sheets, Slides, and PDFs directly in chat
Study reveals Gemini amplifies political polarization and exhibits biases in content curation
Fine-tuning experiment results in a model generating text advocating for human enslavement, demonstrating objective misgeneralization
Tested on the MASK benchmark and found to frequently lie despite knowing the correct facts
Gained significant market share, now holding 25% of generative AI traffic
Failed a Premier League betting benchmark, losing money on match predictions
GPT-4 was used in an experiment that found AI-generated fact-checks are rated more helpful and less ideological than human ones
Study finds GPT-4 generates product ideas scoring 2.5x higher in creativity than human crowdworkers
Randomized trial shows GPT-4o-powered tutor boosts high school test scores by 0.15 standard deviations
Evaluated on LLM-WikiRace benchmark, showing superhuman performance on easy tasks but only 23% success on hard challenges
Ecosystem
GPT-4o
Gemini
Benchmarks
Evidence (11 articles)
The Socratic Model: A Hierarchical AI Architecture That Delegates to Specialists (Mar 27, 2026)
AI-Generated Text Volume Surpasses Human-Written Content for First Time, According to New Data (Mar 26, 2026)
Fish Audio S2 Enables Word-Level Speech Control with Positional Tags, Beats GPT-4o in Human Preference Tests (Mar 17, 2026)
Sergey Brin Returns to Google AI Research, Citing 'Exciting' Technical Progress (Mar 15, 2026)
The Claude OAuth Workaround Is Dead. Here's How to Cut Your Claude Code API Bill Today (Mar 25, 2026)
Yale Professor Bans AI Writing, Requires In-Person Handwritten Work (Apr 7, 2026)
Skale Launches Desktop AI Agent Running on 300MB RAM with 11+ LLM Provider Support (Mar 20, 2026)
Tessera Launches Open-Source Framework for 32 OWASP AI Security Tests, Benchmarks GPT-4o, Claude, Gemini, Llama 3 (Mar 24, 2026)
+ 3 more articles