mathematics

30 articles about mathematics in AI news

Terence Tao Demonstrates AI's Growing Role in Formal Mathematics with Claude and Lean

Fields Medalist Terence Tao has released a video showing how Claude Code can be used to formalize mathematical proofs in Lean, highlighting AI's expanding capabilities in high-level mathematics.

Mar 8, 202685% relevant

OpenAI Internal Model Reportedly Solves Three New Erdős Problems, Marking AI Advance in Pure Mathematics

An internal AI model at OpenAI has reportedly solved three previously unsolved mathematical problems from the Erdős collection. This development signals a potential leap in AI's capacity for abstract reasoning and formal theorem proving.

Apr 1, 202685% relevant

Terence Tao on AI and Mathematics: Collaboration, Not Replacement, for Problems Like Riemann Hypothesis

Fields Medalist Terence Tao suggests AI may not solve problems like the Riemann Hypothesis alone, but through a 'collaboration we can't yet imagine' blending AI power with human insight.

Mar 21, 202685% relevant

A Deep Dive into LoRA: The Mathematics, Architecture, and Deployment of Low-Rank Adaptation

A technical guide explores the mathematical foundations, memory architecture, and structural consequences of Low-Rank Adaptation (LoRA) for fine-tuning LLMs. It provides critical insights for practitioners implementing efficient model customization.

Mar 17, 202695% relevant

Mathematics Enters New Era as AI Generates Novel Proofs, Says Fields Medalist Terence Tao

Fields Medalist Terence Tao reveals AI is now producing unique mathematical proofs, though verification remains a bottleneck. He argues that to fully leverage AI, mathematicians must design problems that are easily checkable by both humans and machines.

Mar 11, 202685% relevant

AI's 'Cheap Wins' in Mathematics Signal a New Era of Human-Machine Collaboration

Fields Medalist Terence Tao reveals AI is solving easier Erdős problems, but the real breakthrough is AI as a tireless junior co-author accelerating mathematical discovery through tedious work automation.

Feb 26, 202685% relevant

Mathematics Enters New Era as Terence Tao Declares AI's Research Breakthroughs Are Real

Fields Medalist Terence Tao states AI has moved beyond hype to become a genuine tool for mathematical discovery, marking a paradigm shift in how research is conducted. His endorsement signals AI's maturation from experimental assistant to collaborative partner in solving complex problems.

Feb 17, 202685% relevant

AI Engineer Publishes Free Open-Source Textbook Compiling Math, CS, and AI Concepts

An AI engineer has compiled a comprehensive, free open-source textbook covering mathematics, computer science, and AI concepts. The resource is built with an intuitive, visual-first approach to aid learning.

Mar 27, 202689% relevant

AI Engineer Henry Ndubuaku Releases Open-Source 'Maths, CS & AI Compendium' Textbook

AI engineer Henry Ndubuaku has published a free, open-source textbook compiling mathematics, computer science, and AI concepts. The resource emphasizes intuitive understanding over notation and has reportedly helped users land roles at DeepMind, OpenAI, and Nvidia.

Mar 27, 202685% relevant

Terence Tao: LLM Math is Simple Undergraduate Linear Algebra, But Why They Work Remains a Mystery

Fields Medalist Terence Tao explains that the mathematics to build and run LLMs is straightforward linear algebra. The real puzzle is why they perform unpredictably across tasks, a gap in theory for 'meso-scale' natural data.

Mar 15, 202685% relevant

ChatGPT-5.2 Proves Mathematical Conjecture in Groundbreaking 'Vibe-Proving' Case Study

Researchers demonstrate ChatGPT-5.2 (Thinking) successfully resolving a mathematical conjecture about spectral regions through iterative 'vibe-proving' workflows. The case study reveals where AI assistance proves most valuable in research mathematics and where human expertise remains irreplaceable.

Feb 24, 202670% relevant

ByteDance Finds AI Agents Double Learning Speed Every 3 Months

ByteDance's Seed AI team discovered that AI agents double learning speed every three months via real-world interaction, per a Thursday paper. EdgeBench benchmark with 134 tasks ≥12 hours each underpins the finding.

Jul 3, 202690% relevant

Bernard Arnault Funds New Math Institute at École Polytechnique

Bernard Arnault funds new math institute at École Polytechnique. Donation amount undisclosed.

Jun 26, 202685% relevant

Norway Bans AI Tools for Under-13s, Pointing to Record-Low PISA Scores Since 2015

Norway will prohibit generative AI tools in grades 1-7 from late August 2026, citing falling PISA scores since 2015. Secondary students may use AI only under supervision. The policy extends an earlier smartphone ban that demonstrably improved grades and reduced bullying, and is backed by planned leg

Jun 19, 202695% relevant

MA-ProofBench: GPT-5.5 Hits 16% on Math Analysis, Most Models Near 0%

MA-ProofBench, a new theorem-proving benchmark for mathematical analysis, shows GPT-5.5 achieving 16% on undergraduate problems and 5% on PhD-level, with most models near 0% on the harder set.

Jun 15, 202682% relevant

OpenAI Model Disproves Erdős Conjecture, First AI to Solve Open Math Problem

OpenAI reasoning model disproves 1946 Erdős conjecture, first AI to solve open math problem. Cross-domain proof verified by Gowers.

May 21, 202697% relevant

Halupedia: Open-Source Wikipedia Clone Generates Every Article via AI Hallucination

Halupedia generates fake Wikipedia articles via AI hallucination on click. Open-source backend vibeserver lets anyone deploy a similar project.

May 12, 202679% relevant

40-Author Survey Unveils 'Levels × Laws' Framework for Agent World Models

A 40-author survey introduces a 'levels × laws' framework for world models in AI agents, spanning 3 capability levels and 4 law regimes, synthesizing 400+ works. It provides a shared vocabulary for designing and evaluating world models across traditionally siloed research communities.

Apr 27, 202685% relevant

LLM-as-a-Judge Framework Fixes Math Evaluation Failures

Researchers propose an LLM-as-a-judge framework for evaluating math reasoning that beats rule-based symbolic comparison, fixing failures in Lighteval and SimpleRL. This enables more accurate benchmarking of LLM math abilities.

Apr 27, 202682% relevant

Anthropic Launches STEM Fellows Program to Pair Experts with AI Research

Anthropic announced the Anthropic STEM Fellows Program, a new initiative to bring science and engineering experts into its research teams for collaborative, months-long projects aimed at accelerating progress with AI.

Apr 20, 202689% relevant

Ethan Mollick: OpenAI's O1 Release Was Second Most Important LLM Launch

Ethan Mollick tweeted that OpenAI's O1 launch was the second most important LLM release after GPT-3.5, featuring a pivotal chart. He expressed surprise that OpenAI disclosed its biggest AI advance rather than keeping it proprietary.

Apr 20, 202693% relevant

Ethan Mollick: AI Judgment & Problem-Solving Are Skills, Not Human Exclusives

Ethan Mollick contends that skills like judgment and problem-solving, often cited as uniquely human, are domains where AI can and does demonstrate competence, reframing them as learnable capabilities.

Apr 19, 202675% relevant

Demis Hassabis Proposes 'Einstein Test' as AGI Benchmark

Demis Hassabis has proposed a novel benchmark for AGI: a model trained only on human knowledge up to 1911 must independently derive Einstein's theory of general relativity. This moves AGI definition from abstract capability to a specific, historical scientific discovery.

Apr 19, 202687% relevant

Google's PaperBanana AI Generates Academic Diagrams, Beats Human Designs 3:1

Google released PaperBanana, an AI system that transforms raw methodology text into publication-ready academic diagrams using a 5-agent creative pipeline. In blind evaluations, humans preferred its outputs nearly 3 out of 4 times over manually designed figures.

Apr 16, 202695% relevant

Altman: Next-Gen AI Models to Aid 'Career-Defining' Scientific Discovery

OpenAI CEO Sam Altman stated that upcoming AI models will assist researchers in making 'career-defining' discoveries, though he tempered expectations of immediate Nobel-level breakthroughs.

Apr 16, 202687% relevant

SPPO: Sequence-Level PPO Cuts RL Training Time 5.9x for Math Reasoning

Researchers introduced SPPO, a sequence-level PPO algorithm that reformulates reasoning as a contextual bandit. It achieves a 5.9x speedup over GRPO while matching performance on AIME, AMC, and MATH benchmarks at 1.5B and 7B scales.

Apr 15, 202691% relevant

GPT-5.4 Pro Solves 60-Year-Old Erdős Problem #1196, Finds 'Book Proof'

OpenAI's GPT-5.4 Pro solved Erdős Problem #1196, a 60-year-old conjecture on primitive sets, in ~80 minutes. The AI discovered a purely analytic proof using von Mangoldt weights, rejecting the standard probabilistic approach used by mathematicians since 1935.

Apr 15, 2026100% relevant

Stanford 2026 AI Index: Models Beat Human Baselines, U.S.-China Gap Narrows

The 423-page Stanford 2026 AI Index Report reveals frontier AI models now match or exceed human baselines on hard coding, science, and math tests. Global AI adoption has hit ~53% in just three years, while the U.S.-China capability gap shrinks.

Apr 14, 202697% relevant

VMLOps Publishes 2026 AI Engineer Roadmap for Software Engineers

VMLOps published a comprehensive 2026 roadmap detailing the skills and knowledge software engineers need to transition into AI engineering. The guide reflects the current industry demand for engineers who can build and deploy production AI systems.

Apr 12, 202685% relevant

AI Engineer Gurisingh Turns Ed Thorp's Trading System into 10 ChatGPT Prompts

AI engineer Gurisingh has distilled the quantitative, probabilistic trading system of Ed Thorp—who beat blackjack and ran a 29-year winning hedge fund—into 10 actionable prompts for AI agents.

Apr 12, 202687% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety