Over the weekend, Neel Somani stumbled onto something unexpected while stress-testing the mathematical abilities of a new OpenAI model. Somani, a former quant researcher and startup founder, fed a difficult math problem into ChatGPT and let it run. When he checked back roughly 15 minutes later, the model had produced a full solution.
To check the result, Somani carefully reviewed the proof and then formalized it using Harmonic’s verification tools. The result held up.
“I wanted to get a clearer sense of where large language models can genuinely solve open math problems, and where they still fail,” Somani said. What surprised him was that, with the latest model, that boundary appears to have shifted.
The model’s reasoning process was strikingly sophisticated, drawing on results such as Legendre’s formula, Bertrand’s postulate, and the Star of David theorem. Along the way, it even surfaced a 2013 MathOverflow discussion in which Harvard mathematician Noam Elkies outlined an elegant solution to a related problem. Yet the AI’s final proof diverged in meaningful ways from Elkies’ approach and offered a more complete answer to a version of the problem originally posed by Paul Erdős.
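For readers unfamiliar with the first of those results: Legendre’s formula gives the exponent of a prime p in the factorization of n! as the sum of the floors of n/p, n/p², n/p³, and so on. The sketch below is only an illustration of that formula, not part of the model’s proof; the function names are hypothetical and chosen for this example.

```python
from math import factorial


def legendre_exponent(n: int, p: int) -> int:
    """Exponent of the prime p in n!, computed with Legendre's formula.

    Sums floor(n / p**i) for i = 1, 2, ... until p**i exceeds n.
    The caller is assumed to pass a prime p; primality is not checked.
    """
    if n < 0 or p < 2:
        raise ValueError("expected n >= 0 and p >= 2")
    total = 0
    power = p
    while power <= n:
        total += n // power
        power *= p
    return total


def exponent_by_division(m: int, p: int) -> int:
    """Exponent of p in m, found by repeated division (used as a cross-check)."""
    count = 0
    while m % p == 0:
        m //= p
        count += 1
    return count


if __name__ == "__main__":
    # Compare the formula against a brute-force count on 10!.
    for prime in (2, 3, 5, 7):
        print(prime, legendre_exponent(10, prime), exponent_by_division(factorial(10), prime))
```

The brute-force check is practical only for small n, which is the point of the formula: it avoids computing n! at all.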
Erdős’ vast catalog of unsolved conjectures has long served as a proving ground for human mathematicians. Now, it is increasingly becoming a testbed for artificial intelligence as well.
For skeptics of machine intelligence, these results are hard to dismiss. AI tools are already deeply embedded in modern mathematics, from formal proof systems to literature review assistants. But since the release of GPT-5.2, which Somani describes as “anecdotally stronger at mathematical reasoning than earlier versions,” the pace of progress has become difficult to ignore. The question is no longer whether AI can assist mathematicians, but how far it can push the boundaries of discovery.
Somani has been focusing on the Erdős Problems, a collection of more than 1,000 conjectures maintained online. The problems span a wide range of topics and difficulty levels, making them a natural target for AI experimentation. In November, the first wave of autonomous progress came from a Gemini-based system called AlphaEvolve. More recently, GPT-5.2 has shown surprising strength in tackling advanced mathematical challenges.
Since Christmas alone, 15 problems on the Erdős site have shifted from “open” to “solved.” In 11 of those cases, the published solutions explicitly credit AI models as part of the process.
Leading mathematicians are watching closely. On his GitHub page, Terence Tao offers a more cautious assessment, identifying eight Erdős problems where AI systems made substantial autonomous progress, along with six more where models contributed by locating and extending prior research. While this falls well short of fully autonomous mathematics, it highlights a growing and meaningful role for large language models.
Writing on Mastodon, Tao suggested that AI’s scalability makes it particularly well suited for tackling the “long tail” of lesser-known Erdős problems. Many of these, he argued, likely have straightforward solutions that humans have simply not prioritized.
“As such,” Tao wrote, “many of these easier Erdős problems are now more likely to be solved by purely AI-based methods than by human or hybrid approaches.”
Another factor accelerating progress is a renewed emphasis on formalization, the painstaking process of translating mathematical reasoning into a rigorously verifiable form. While formalization predates computers, modern proof assistants have dramatically lowered the barrier. Tools such as Lean, originally developed at Microsoft Research, have become standard in parts of the field, while newer AI-driven systems aim to automate much of the remaining work.
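To give a sense of what a formalized statement looks like, here is a minimal sketch in Lean 4. These are toy theorems chosen for illustration, not excerpts from any Erdős proof, and the theorem names are hypothetical. Each statement is accepted only if the proof assistant can check the proof mechanically.

```lean
-- A concrete arithmetic fact; `rfl` asks Lean to verify it by computation.
theorem two_add_two : 2 + 2 = 4 := rfl

-- A general statement about every natural number n.
-- It holds by the definition of addition on Nat, so `rfl` suffices.
theorem add_zero_nat (n : Nat) : n + 0 = n := rfl
```

A formalized research proof follows the same pattern at a much larger scale, with each step reduced to statements the checker can confirm.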
For Tudor Achim, founder of Harmonic, the headline numbers matter less than who is adopting these tools. “What matters to me is that leading math and computer science professors are actually using them,” he said. “These are people with reputations to protect. When they say they rely on tools like Aristotle or ChatGPT, that’s meaningful evidence.”
AI may not yet be doing mathematics on its own. But its growing role in solving long-standing problems suggests that the frontier of mathematical discovery is beginning to shift, and faster than many expected.