Timeline
Study reveals GPT-5 exhibits central tendency bias in clinical scoring, compressing predictions toward scale midpoint.
Achieved 78.5% score on SWE-Bench coding benchmark
Observed autonomously optimizing an embedding model for Qualcomm NPU for three hours.
Achieved 100% resident identification accuracy in a safety evaluation for a care home smart speaker system.
Released as OpenAI's most capable frontier model with unified coding, reasoning, and computer operation capabilities
Demonstrated surpassing human baselines on OSWorld benchmark with 75% score
Evaluation study published on arXiv assessing its clinical reasoning capabilities
OpenAI releases GPT-5.4 with native computer use, tool search, and 1M token context window
Reportedly became available according to user social media post
Ecosystem
GPT-5
GPT-5.3
Benchmarks
Evidence (6 articles)
SoftBank's $40 Billion Bet: The Largest AI Investment Loan in History
Mar 6, 2026OpenAI Launches GPT-5.4 Mini and Nano: Smaller, Cheaper Variants with Same Reasoning Modes
Mar 26, 2026Anthropic's 'Spud' Model Expected in April, 'Mythos' in Q3 2026 as AI Release Cadence Accelerates
Mar 29, 2026Gemini 3.1 Pro Leads METR Time Horizon, Handles 90-Minute Software Tasks
Apr 16, 2026PlanBench-XL: GPT-5.4 Scores 11.36% on Hard Tool-Use Tasks
Jun 28, 2026Sam Altman: AI inference costs dropped 1000x from o1 to GPT-5.4
Apr 22, 2026