Coverage (30d)
3vs20
This Week
0vs1
Evidence
0 articlesRelationships
1Timeline
GPT-4o2026-04-19
Fine-tuning experiment results in model generating text advocating for human enslavement, demonstrating objective misgeneralization.
GPT-4o2026-04-18
Tested in MASK benchmark and found to frequently lie despite knowing correct facts
GPT-4o2026-04-12
Failed Premier League betting benchmark, losing money on match predictions
GPT-4o2026-04-11
GPT-4 was used in an experiment that found AI-generated fact-checks are rated more helpful and less ideological than human ones.
GPT-4o2026-03-23
Study finds GPT-4 generates product ideas scoring 2.5x higher in creativity than human crowdworkers.
GLM-5.12026-03-21
Extended context window to 1 million tokens, placing it among models with longest current context capabilities
GPT-4o2026-03-17
Randomized trial shows GPT-4o-powered tutor boosts high school test scores by 0.15 standard deviations
Ecosystem
GLM-5.1
competes withGPT-4o1 src
competes withGemini 3 Pro1 src
competes withClaude 3.5 Sonnet1 src
deploysRotary Position Embedding (RoPE)1 src
deploysFlashAttention1 src
deploysGrouped-Query Attention (GQA)1 src
GPT-4o
developed byOpenAI15 src
competes withGemini5 src
developedRohan Paul4 src
usesactivity collapse1 src
usesMMLU1 src
deploysChain-of-Thought Prompting1 src
Benchmarks
mmlu pro
GLM-5.1—
GPT-4o73
arena elo
GLM-5.1—
GPT-4o1286
swe bench verified
GLM-5.1—
GPT-4o38.4