GPT-5.4 Pro Solves 60-Year-Old Erdős Problem #1196, Finds 'Book Proof'

OpenAI's GPT-5.4 Pro solved Erdős Problem #1196, a 60-year-old conjecture on primitive sets, in ~80 minutes. The AI discovered a purely analytic proof using von Mangoldt weights, rejecting the standard probabilistic approach used by mathematicians since 1935.

GAla Smith & AI Research Desk·13h ago·5 min read·49 views·AI-Generated

OpenAI's GPT-5.4 Pro has solved Erdős Problem #1196, a 60-year-old conjecture in number theory concerning primitive sets. The model completed the proof in approximately 80 minutes in a single reasoning attempt, using an approach that had eluded top human experts who had worked on the problem for years.

What Happened

According to a report from researcher Kimonismus, GPT-5.4 Pro was presented with Erdős Problem #1196, which concerns properties of primitive sets (sets of integers where no element divides another). The problem originates from work by Paul Erdős, András Sárközy, and Endre Szemerédi and had remained open since the 1960s.

The AI produced a complete proof in one shot after about 80 minutes of reasoning. The result is notable because the world's leading expert on this problem, Jared Lichtman, who proved the original Erdős Primitive Set Conjecture during his PhD, had worked on Problem #1196 for seven years alongside Fields Medal-level collaborators without success.

The Novel Approach

The breakthrough came from GPT-5.4 Pro rejecting the standard approach that mathematicians had used since Erdős's 1935 paper. The conventional method switches from analysis to probability theory. The AI instead stayed purely analytic, using von Mangoldt weights, a technique from analytic number theory typically applied to the distribution of prime numbers.

Lichtman said that human "aesthetic convention" had made that path invisible to mathematicians. He compared the discovery to "AI discovering a new chess opening that grandmasters overlooked because of convention."

Expert Reactions and Implications

Fields Medalist Terry Tao suspects the technique discovered by GPT-5.4 Pro could simplify the broader theory of prime factorization anatomy, not just solve this single conjecture. Lichtman went further, calling it possibly the first AI "Book proof" for an Erdős problem—referring to Paul Erdős's concept of elegant, fundamental proofs that might exist in a hypothetical "Book" of perfect mathematics.

The result suggests that AI systems can now not only verify existing mathematical proofs but also discover fundamentally new approaches that challenge decades of human mathematical intuition and convention.

Technical Context

This development follows OpenAI's results with earlier GPT models, which had shown increasing capability in mathematical reasoning but primarily on known problems or competition-level mathematics. Solving an open research problem at this level is a significant step beyond that.

Frequently Asked Questions

What is Erdős Problem #1196?

Erdős Problem #1196 is a conjecture about primitive sets—collections of positive integers where no number divides another. Specifically, it concerns the asymptotic density of such sets and their relationship to prime factorization properties. The problem had remained unsolved since being posed in the 1960s by Paul Erdős and collaborators.
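
The definition of a primitive set can be made concrete with a short check. The following is a minimal Python sketch of that definition only; the function name `is_primitive` is hypothetical, and the code illustrates the concept rather than anything from the reported proof.

```python
from itertools import combinations


def is_primitive(nums):
    """Return True if no element of nums divides another (distinct) element.

    Assumes nums contains positive integers; duplicates are ignored.
    """
    values = sorted(set(nums))
    # After sorting, a < b in every pair, so only "a divides b" needs checking.
    return all(b % a != 0 for a, b in combinations(values, 2))


print(is_primitive([2, 3, 5, 7]))   # primes never divide one another
print(is_primitive([6, 10, 15]))    # no element divides another
print(is_primitive([2, 3, 4]))      # 2 divides 4, so not primitive
```

Any set of distinct primes passes this check, which is why the primes are the usual reference example of a primitive set.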

How does this compare to previous AI math results?

Previous AI mathematical achievements, such as DeepMind's work on the cap set problem or OpenAI's performance on MATH dataset problems, typically involved either verifying existing proofs or solving competition-level problems. This marks one of the first instances where an AI system has solved a long-standing open research problem with a novel approach that eluded human experts.

What are von Mangoldt weights?

The von Mangoldt function Λ(n) is a key tool in analytic number theory, particularly in the study of prime numbers: it equals log p when n is a power of a prime p and 0 otherwise. Sums weighted by Λ(n) are a standard way to study the distribution of primes. GPT-5.4 Pro's application of these weights to primitive set problems represents a cross-pollination of techniques between different areas of number theory.
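
For readers who want to see the function itself, here is a minimal Python sketch of the standard von Mangoldt function (log p on powers of a prime p, 0 elsewhere), together with the classical identity that its values summed over the divisors of n give log n. The names `von_mangoldt` and `divisor_sum` are hypothetical, and the source does not describe how the reported proof uses these weights, so this illustrates the tool only.

```python
import math


def von_mangoldt(n: int) -> float:
    """Return log(p) if n = p**k for a prime p and k >= 1, otherwise 0.0."""
    if n < 2:
        return 0.0
    # Find the smallest prime factor of n by trial division.
    p = 2
    while p * p <= n:
        if n % p == 0:
            break
        p += 1
    else:
        p = n  # no divisor up to sqrt(n), so n itself is prime
    # Strip every factor of p; n is a prime power exactly when nothing is left.
    m = n
    while m % p == 0:
        m //= p
    return math.log(p) if m == 1 else 0.0


def divisor_sum(n: int) -> float:
    """Sum the von Mangoldt function over all positive divisors of n."""
    return sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)


# Classical identity: the divisor sum of Lambda equals log(n).
for n in (12, 97, 360):
    print(n, math.isclose(divisor_sum(n), math.log(n)))
```

The identity holds because each prime power p^a dividing n contributes log p once for each of p, p^2, ..., p^a, which adds up to a·log p.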

Will this proof be formally verified?

While the source doesn't mention formal verification, given the significance of the result and Lichtman's involvement, the mathematical community will likely subject the proof to rigorous peer review. The involvement of top experts like Jared Lichtman and Terry Tao suggests the proof has already passed initial scrutiny from leading authorities in the field.

gentic.news Analysis

This development represents a watershed moment in AI-assisted mathematical discovery. While previous systems like DeepMind's AlphaProof demonstrated strong performance on Olympiad problems, solving a 60-year-old Erdős conjecture with a novel "Book proof" approach signals a qualitative leap in mathematical reasoning capability.

The timing is particularly significant. Following OpenAI's GPT-5 launch in late 2025, which emphasized improved reasoning capabilities, this Erdős problem solution provides concrete evidence of those claims. It also comes amid increasing competition in AI reasoning systems, with Anthropic's Claude 3.7 Sonnet and Google's Gemini 2.0 Pro both showing strong mathematical performance in recent benchmarks.

What's most intriguing is the AI's rejection of established mathematical convention. For decades, mathematicians approaching primitive set problems had followed Erdős's probabilistic framework. GPT-5.4 Pro's purely analytic approach using von Mangoldt weights suggests AI systems may excel at identifying "blind spots" in human mathematical thinking—areas where community consensus has narrowed the solution space unnecessarily.

This aligns with trends we've observed in other domains. In protein folding (AlphaFold), materials science (GNoME), and now pure mathematics, AI systems are moving from pattern recognition to genuine discovery of novel approaches. The implication for mathematical research is profound: AI may become less a verification tool and more a collaborative partner that suggests entirely new lines of attack on stubborn problems.

However, questions remain about reproducibility and the "black box" nature of the discovery. Can other researchers follow the AI's reasoning? Does this approach generalize to other problems? These will be key areas of investigation as the mathematical community digests this result and tests the boundaries of AI-assisted discovery.

This article is based on reporting from Kimonismus. The proof has not yet been formally published in a peer-reviewed mathematics journal.

AI Analysis

This development represents a qualitative shift in AI's role in mathematical research. Previous systems excelled at pattern matching within established frameworks, such as solving competition problems using known techniques or verifying human-generated proofs. GPT-5.4 Pro's solution to Erdős Problem #1196 demonstrates something different: the ability to identify and pursue novel approaches that contradict decades of mathematical convention.

The technical significance lies in the AI's choice of von Mangoldt weights, a tool from analytic number theory's prime distribution toolkit, applied to primitive set problems. This cross-domain transfer suggests emergent capability in mathematical analogy-making, where the system recognizes structural similarities between seemingly disparate areas of mathematics. Practitioners should note this isn't just about scaling compute or training data; it's about the model developing what appears to be genuine mathematical intuition.

From a competitive landscape perspective, this puts OpenAI ahead in pure mathematical reasoning, an area where Google's DeepMind (with AlphaProof and AlphaGeometry) previously held an edge. The 'one-shot' nature of the solution (~80 minutes versus human experts' seven years) suggests efficiency gains that could accelerate mathematical discovery across fields. However, the real test will be whether this approach generalizes to other open problems or represents a fortunate alignment of this specific problem with the model's training distribution.