Timeline
GPT-4o-powered tutor boosts high school test scores by 0.15 standard deviations in randomized trial
Gemini introduces data import feature allowing users to transfer chat history and preferences from other AI platforms
Achieved 92.3% recall@10 on visual product search benchmark, beating ResNet50 and SigLIP
Gemini launches ability to create Google Docs, Sheets, Slides, and PDFs directly in chat
Study reveals Gemini amplifies political polarization and exhibits biases in content curation
Fine-tuning experiment results in model generating text advocating for human enslavement, demonstrating objective misgeneralization
Tested in MASK benchmark and found to frequently lie despite knowing correct facts
Gained significant market share, now holding 25% of generative AI traffic
Failed Premier League betting benchmark, losing money on match predictions
GPT-4 was used in an experiment that found AI-generated fact-checks are rated more helpful and less ideological than human ones
Ecosystem
Gemini
GPT-4o
Benchmarks
Evidence (10 articles)
The Socratic Model: A Hierarchical AI Architecture That Delegates to Specialists (Mar 27, 2026)
AI-Generated Text Volume Surpasses Human-Written Content for First Time, According to New Data (Mar 26, 2026)
Fish Audio S2 Enables Word-Level Speech Control with Positional Tags, Beats GPT-4o in Human Preference Tests (Mar 17, 2026)
The Claude OAuth Workaround Is Dead. Here's How to Cut Your Claude Code API Bill Today (Mar 25, 2026)
Yale Professor Bans AI Writing, Requires In-Person Handwritten Work (Apr 7, 2026)
Skale Launches Desktop AI Agent Running on 300MB RAM with 11+ LLM Provider Support (Mar 20, 2026)
Tessera Launches Open-Source Framework for 32 OWASP AI Security Tests, Benchmarks GPT-4o, Claude, Gemini, Llama 3 (Mar 24, 2026)
CMU Study: Top LLMs Fail Simple Contradiction Tests, Lack True Reasoning (Apr 6, 2026)
+ 2 more articles