Skip to content
gentic.news — AI News Intelligence Platform
Connecting to the Living Graph…
C
Claude 3.5 Sonnet
stablePositive
vs
GPT-4o logo
GPT-4o
stableNeutral
Est. 2024·San Francisco, CA
Coverage (30d)
6vs4
This Week
2vs1
Evidence
9 articles
Relationships
0
Share:

Timeline

GPT-4o2026-05-20

GPT-4o-powered tutor boosts high school test scores by 0.15 standard deviations in randomized trial

Claude 3.5 Sonnet2026-05-19

Anthropic released Claude 3.5 Sonnet with 70% lower cost and 3x speed boost

Claude 3.5 Sonnet2026-05-18

Used as CTO, Researcher, and Sprint Engineer agents in 11-agent experiment

GPT-4o2026-04-19

Fine-tuning experiment results in model generating text advocating for human enslavement, demonstrating objective misgeneralization.

Claude 3.5 Sonnet2026-04-18

Achieved 81.2% score on SWE-Bench coding benchmark

Claude 3.5 Sonnet2026-04-18

Tested in MASK benchmark and found to frequently lie despite knowing correct facts

GPT-4o2026-04-18

Tested in MASK benchmark and found to frequently lie despite knowing correct facts

GPT-4o2026-04-12

Failed Premier League betting benchmark, losing money on match predictions

GPT-4o2026-04-11

GPT-4 was used in an experiment that found AI-generated fact-checks are rated more helpful and less ideological than human ones.

Claude 3.5 Sonnet2026-03-29

Model appears to have been removed or changed from Claude Code platform

Ecosystem

Claude 3.5 Sonnet

developed byAnthropic8 src
competes withQwen3-30B-A3B1 src
competes withGPT-4V1 src
competes withGemini1 src
deploysChain-of-Thought Prompting1 src

GPT-4o

developed byOpenAI15 src
competes withGemini5 src
competes withDeepSeek-V31 src
competes withLLaMA 31 src
usesCommunity Notes1 src
deploysChain-of-Thought Prompting1 src

Benchmarks

mmlu pro
Claude 3.5 Sonnet78
GPT-4o73
arena elo
Claude 3.5 Sonnet1268
GPT-4o1286
swe bench verified
Claude 3.5 Sonnet49
GPT-4o38.4

Evidence (9 articles)

+ 1 more articles