scientific discovery
30 articles about scientific discovery in AI news
ResearchGym Exposes AI's 'Capability-Reliability Gap' in Scientific Discovery
A new benchmark called ResearchGym reveals that while frontier AI agents can occasionally achieve state-of-the-art scientific results, they fail to do so reliably. In controlled evaluations, agents completed only 26.5% of research sub-tasks on average, highlighting critical limitations in autonomous scientific discovery.
Altman: Next-Gen AI Models to Aid 'Career-Defining' Scientific Discovery
OpenAI CEO Sam Altman stated that upcoming AI models will assist researchers in making 'career-defining' discoveries, though he tempered expectations of immediate Nobel-level breakthroughs.
AI Bridges the Gap Between Data and Discovery: New Framework Aligns Scientific Observations with Decades of Literature
Researchers have developed a novel AI framework that aligns X-ray spectra with scientific literature using contrastive learning. This multimodal approach improves physical variable estimation by 16-18% and identifies high-priority astronomical targets, demonstrating how AI can accelerate scientific discovery by connecting data with domain knowledge.
AI Crosses the Rubicon: From Scientific Tool to Active Discovery Partner
This week marked a paradigm shift as AI systems transitioned from research tools to active participants in scientific discovery. OpenAI's GPT-5.2 Pro helped conjecture a new formula in particle physics, while Google's Gemini 3 Deep Think achieved unprecedented results on reasoning benchmarks. These developments signal AI's growing capacity for genuine scientific contribution.
From Code to Discovery: The Next Frontier of AI Agents in Research
AI researcher Omar Saray predicts a shift from 'agentic coding' to 'agentic research', in which AI systems autonomously conduct scientific discovery. This evolution promises to accelerate innovation across disciplines.
EmbodiedAct: How Active AI Agents Are Revolutionizing Scientific Simulation
Researchers have developed EmbodiedAct, a framework that transforms scientific software into active AI agents with real-time perception. This breakthrough addresses critical limitations in how LLMs interact with physical simulations, enabling more reliable scientific discovery through embodied actions.
PRL-Bench: LLMs Score Below 50% on End-to-End Physics Research Tasks
Researchers introduced PRL-Bench, a benchmark built from 100 recent Physical Review Letters papers, testing LLMs on end-to-end physics research. Top models scored below 50%, exposing a significant capability gap for autonomous scientific discovery.
Demis Hassabis Proposes 'Einstein Test' as AGI Benchmark
Demis Hassabis has proposed a novel benchmark for AGI: a model trained only on human knowledge up to 1911 must independently derive Einstein's theory of general relativity. This shifts the definition of AGI from an abstract capability to a specific, historical scientific discovery.
BloClaw: New AI4S 'Operating System' Cuts Agent Tool-Calling Errors to 0.2% with XML-Regex Protocol
Researchers introduced BloClaw, a unified operating system for AI-driven scientific discovery that replaces fragile JSON tool-calling with a dual-track XML-Regex protocol, cutting error rates from 17.6% to 0.2%. The system autonomously captures dynamic visualizations and provides a morphing UI, benchmarked across cheminformatics, protein folding, and molecular docking.
Anthropic Launches Dedicated Science Blog to Chronicle AI Research and Applications
Anthropic has launched a new Science Blog to publish its research and case studies on using AI to accelerate scientific discovery, aligning with its mission to increase the pace of scientific progress.
DrugPlayGround Benchmark Tests LLMs on Drug Discovery Tasks
A new framework called DrugPlayGround provides the first standardized benchmark for evaluating large language models on key drug discovery tasks, including predicting drug-protein interactions and chemical properties. This addresses a critical gap in objectively assessing LLMs' potential to accelerate pharmaceutical research.
Nature Astronomy Paper Argues LLMs Threaten Scientific Authorship, Sparking AI Ethics Debate
A paper in Nature Astronomy posits a novel criterion for scientific contribution: if an LLM can easily replicate a piece of work, that work may not be sufficiently novel. This directly challenges the perceived value of incremental, LLM-augmented research.
Ethan Mollick Critiques Scientific Publishing's AI Inertia: PDFs Still Dominate in 2026
Wharton professor Ethan Mollick highlights that scientific papers in 2026 are still primarily uploaded as formatted PDFs to restrictive academic archives, signaling slow adaptation to AI's potential for accelerating research.
Revieve Launches AI Skin Advisor for ChatGPT, Expanding Generative AI Beauty Discovery
Beauty tech platform Revieve has launched an AI Skin Advisor as a ChatGPT plugin, enabling conversational skin analysis and product discovery. The launch represents a strategic expansion into generative AI platforms for beauty brands and retailers.
Stanford-Princeton Team Open-Sources LabClaw: The 'Skill OS' for Scientific AI
Researchers from Stanford and Princeton have open-sourced LabClaw, a 'Skill Operating Layer' for LabOS that transforms natural language commands into executable lab workflows. This breakthrough promises to dramatically accelerate scientific experimentation by bridging human intent with robotic execution.
Annealed Co-Generation: A New AI Framework Tackles Scientific Complexity Through Pairwise Modeling
Researchers propose Annealed Co-Generation, a novel AI framework that simplifies multivariate generation in scientific applications by modeling variables in pairs rather than jointly. The approach reduces computational burden and data imbalance while maintaining coherence across complex systems.
Beyond General AI: How Liquid Foundation Models Are Revolutionizing Drug Discovery
Researchers have developed MMAI Gym, a specialized training platform that teaches AI the 'language of molecules' to create more efficient drug discovery models. The resulting Liquid Foundation Models outperform larger general-purpose AI models while requiring fewer computational resources.
RxnNano: How a Tiny AI Model Outperforms Giants in Chemical Discovery
Researchers have developed RxnNano, a compact 0.5B-parameter AI model that outperforms models ten times larger in predicting chemical reactions. Using innovative training techniques that prioritize chemical understanding over brute-force scaling, it achieves 23.5% better accuracy on key benchmarks for drug discovery applications.
XtalPi's Profit Milestone Signals AI's Transformative Impact on Pharmaceutical Discovery
Chinese AI drug discovery firm XtalPi projects its first annual profit in 2025 following a 193% revenue surge, marking a pivotal moment for AI-driven pharmaceutical research. The company's turnaround demonstrates the commercial viability of AI in accelerating drug development pipelines.
OpenAI Launches GPT-Rosalind for Drug Discovery, GPT-5.4-Cyber for Security
OpenAI launched GPT-Rosalind, a life sciences model performing above the 95th percentile of human experts on novel biological data, and GPT-5.4-Cyber, a cybersecurity variant. These releases, alongside a major Agents SDK update, signal a pivot from general AI to specialized, high-stakes enterprise domains.
Bridging Language and Logic: How LLMs Are Revolutionizing Causal Discovery
Researchers introduce DMCD, a novel framework that combines LLM semantic reasoning with statistical validation to uncover causal relationships from data. This hybrid approach outperforms traditional methods on real-world benchmarks, promising more accurate AI-driven decision-making.
Columbia Prof: LLMs Can't Generate New Science, Only Map Known Data
Columbia CS Professor Vishal Misra argues LLMs cannot generate new scientific ideas because they learn structured maps of known data and fail outside those boundaries. True discovery requires creating new conceptual maps, a capability current architectures lack.
Researchers Achieve Ultra-Long-Horizon Agentic Science with Cohesive AI Agents
A research team has developed AI agents capable of executing and maintaining coherent, long-horizon scientific research workflows. This addresses a core challenge in creating autonomous systems for complex discovery.
Anthropic's AI Researchers Outperform Humans, Discover Novel Science
Anthropic reports its AI systems for alignment research are surpassing human scientists in performance and generating novel scientific concepts, broadening the exploration space for AI safety.
Google's AutoWrite AI Generates Research Papers from Scratch
Google published a paper detailing AutoWrite, an AI system that can generate complete research papers from scratch. This represents a significant step toward automating the scientific writing process.
AI Firms Target Biotech for High-Impact, High-Margin Applications
A trend analysis notes AI companies are shifting focus to biotech, where accurate prediction models can be monetized through drug discovery and synthetic biology, creating a new competitive frontier.
Sam Altman Outlines 3 AI Futures: Research, Operations, Personal Agents
OpenAI CEO Sam Altman outlined three potential outcomes for AI development: systems that conduct scientific research, accelerate company operations, and serve as trusted personal agents. This vision frames the strategic direction for OpenAI and the broader industry.
Nature Report: China's Public R&D Spending Nears US Levels, Shifting Global Science Funding Landscape
A new Nature report indicates China is close to surpassing the US in public R&D spending. This shift in funding could alter which nation sets the global pace for scientific research, though China still lags in fundamental research output.
OpenAI's 'Autonomous AI Researchers' Vision Sparks Debate on Biology's 'ChatGPT Moment'
A tweet highlights OpenAI's repeated references to 'autonomous AI researchers' as signaling a 'ChatGPT moment for biology,' suggesting AI could accelerate drug discovery by orders of magnitude. The claim draws a direct analogy to AlphaFold's impact on structural biology.
Mirendil: Ex-Anthropic Scientists Launch $1B Venture to Build AI That Thinks Like a Scientist
Former Anthropic researchers are raising $175M at a $1B valuation for Mirendil, a startup aiming to build AI systems for long-term scientific reasoning. The goal is to accelerate breakthroughs in biology and materials science, aligning with a broader industry push toward autonomous AI researchers.