In a remarkable milestone for artificial intelligence, OpenAI has announced that one of its latest experimental models has achieved gold-medal-level performance at the 2025 International Mathematical Olympiad (IMO), widely regarded as the world’s most prestigious mathematics competition for high school students.
The AI model earned a score of 35 out of a possible 42 points, enough to meet the gold-medal threshold and place it among the top contestants in the world. The result represents a stunning display of mathematical reasoning by an AI system, and it signals a dramatic advance in machine intelligence that goes far beyond simple computation.
A New Benchmark for AI Reasoning
The International Mathematical Olympiad is known for its extreme difficulty. Participants are presented with six problems across two days, each requiring not just accurate computation but deep insight, creativity, and extended logical reasoning. The problems span subjects such as algebra, combinatorics, geometry, and number theory, and are often designed to challenge the world’s most brilliant young minds.

OpenAI’s model was given the same problems faced by human competitors and, according to OpenAI, worked under the same constraints: two 4.5-hour exam sessions, with no tools or internet access. It solved five of the six problems, producing complete, detailed solutions comparable in structure and quality to those written by human gold medalists.
What makes this achievement so significant is not just the accuracy of the final answers, but the way the model reached them. Rather than relying on specialized symbolic solvers or handcrafted heuristics, the model used a general-purpose large language model framework that combined natural language understanding with step-by-step logical reasoning. The solutions were written in coherent, human-readable prose, reflecting deep conceptual understanding and mathematical fluency.
From Language Models to Mathematical Thinkers
Until recently, artificial intelligence systems excelled primarily at tasks involving pattern recognition, such as image classification or language generation, while tasks requiring extended chains of reasoning, abstract thinking, or multi-step problem-solving proved much more difficult. Mathematics, particularly at the Olympiad level, has long been considered a “last frontier” for AI: a domain that demands not only correctness but elegance and creativity.
OpenAI’s success suggests that large language models have begun to cross this frontier. The model was not specifically trained to solve IMO problems; instead, it was designed as a general reasoning system with the capacity to understand complex instructions, plan solutions, and verify its own reasoning in real time. It approached the Olympiad problems much like a human would — reading carefully, forming hypotheses, testing them, and adjusting its approach as needed.
This type of reasoning, sometimes referred to as “chain-of-thought,” allows the model to simulate the deliberative, reflective process characteristic of human mathematicians. In fact, many of the model’s written solutions resembled those found in top math journals or Olympiad training materials, complete with explanations, justifications, and proofs.
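To make the idea concrete, here is a minimal sketch of chain-of-thought prompting using the publicly available OpenAI Python SDK. The model name, prompts, and sample problem are illustrative assumptions; the experimental IMO system has not been released, and nothing here should be read as its actual implementation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

problem = "Let n be a positive integer. Prove that n^2 + n is always even."

# Chain-of-thought prompting: ask for the reasoning, not just the answer.
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name; the IMO system is unreleased
    messages=[
        {"role": "system",
         "content": "You are a careful mathematician. Before answering, "
                    "work through the problem step by step: restate what is "
                    "given, form a plan, carry it out, and check each step."},
        {"role": "user", "content": problem},
    ],
)
print(response.choices[0].message.content)
```

The key design choice sits in the system prompt: rather than requesting a bare answer, it asks the model to restate, plan, execute, and check, which is roughly what “chain-of-thought” amounts to in practice.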
Implications for Education and Research
The implications of this achievement extend far beyond competitions. Experts believe that AI systems capable of solving Olympiad-level problems could be used to enhance mathematics education by offering personalized tutoring, automatically generating new problem sets, or even providing feedback on student solutions. These systems could also play a role in mathematical research by helping to verify proofs, explore conjectures, or suggest novel approaches to longstanding open problems.
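As a rough illustration of the tutoring idea, the following sketch asks a general-purpose chat model to grade a student’s attempted proof against a simple rubric. Again, the model name, prompts, and rubric are assumptions chosen for the example, not a description of any deployed product.

```python
from openai import OpenAI

client = OpenAI()

problem = "Show that the sum of two odd integers is even."
student_attempt = (
    "Let the numbers be 2a+1 and 2b+1. Their sum is 2a+2b+2, "
    "which is 2(a+b+1), so it is even."
)

# Ask for feedback keyed to a simple rubric: correctness, rigor, clarity.
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; any capable chat model would do
    messages=[
        {"role": "system",
         "content": "You are a patient math tutor. Grade the student's proof "
                    "for correctness, rigor, and clarity, then suggest one "
                    "concrete improvement."},
        {"role": "user",
         "content": f"Problem: {problem}\n\nStudent proof: {student_attempt}"},
    ],
)
print(response.choices[0].message.content)
```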
More broadly, this milestone demonstrates that language-based AI systems are beginning to grasp abstract reasoning in a meaningful way. While previous AI math systems relied on narrow, domain-specific tools, OpenAI’s model represents a shift toward more flexible, general-purpose intelligence — the kind that could eventually assist with advanced scientific discovery, theorem proving, or other highly intellectual tasks.
Challenges and Cautions
Despite the impressive results, OpenAI emphasized that the model remains experimental and is not yet publicly available. The team behind the project is continuing to test and refine the system to improve its reliability, transparency, and safety. One challenge in particular is preventing the model from “hallucinating,” that is, producing confident errors in its reasoning, a problem that still plagues many large language models today.
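A common mitigation for confident reasoning errors is a generate-check-revise loop: one pass drafts a solution, a second pass audits it, and the draft is revised until the audit passes or a retry budget runs out. The sketch below shows this generic pattern, under the assumption that a separate verifier prompt catches many slips; it is not OpenAI’s disclosed method.

```python
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"  # illustrative model name

def ask(system: str, user: str) -> str:
    """One chat completion call with a system and a user message."""
    out = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
    )
    return out.choices[0].message.content

def solve_with_verification(problem: str, max_rounds: int = 3) -> str:
    draft = ask("Solve step by step, justifying every claim.", problem)
    for _ in range(max_rounds):
        verdict = ask(
            "Audit this proof. Reply exactly 'OK' if it is sound; otherwise "
            "list the first flaw you find.",
            f"Problem:\n{problem}\n\nProof:\n{draft}",
        )
        if verdict.strip() == "OK":
            return draft  # the audit found no flaw
        # Revise the draft using the auditor's objection.
        draft = ask(
            "Repair the proof, addressing the flaw noted by the auditor.",
            f"Problem:\n{problem}\n\nProof:\n{draft}\n\nFlaw:\n{verdict}",
        )
    return draft  # best effort after exhausting the retry budget

print(solve_with_verification("Prove that sqrt(2) is irrational."))
```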
Furthermore, while the model’s solutions were reviewed by expert mathematicians and deemed correct, the IMO itself has not officially recognized or ranked AI participants. The competition remains strictly for human contestants. Still, the achievement is being widely celebrated as a sign of how far AI has come — and how close it may be to achieving human-like reasoning across a wide range of disciplines.
A Glimpse of the Future
This breakthrough at the IMO may well be remembered as a turning point in the history of artificial intelligence. It marks a moment when machines, once seen as purely mechanical calculators, began to demonstrate the kind of creative, structured thinking that underpins human intellectual achievement.
As AI systems continue to improve, collaborations between humans and machines in fields like mathematics, science, and engineering could become the norm. Rather than replacing human thinkers, such systems may enhance our ability to explore, understand, and build — acting as partners in thought, not just tools.

For now, OpenAI’s gold medal moment at the IMO stands as a powerful reminder of what AI can already do — and a tantalizing preview of what may come next.