Google DeepMind AI models solved math Olympiad problems at the level of a silver medalist

Google DeepMind, Google's London-based artificial intelligence (AI) research subsidiary, has introduced AlphaProof and AlphaGeometry 2, two AI models that can solve complex mathematical problems beyond the reach of current AI models.

Solving mathematical problems that demand advanced reasoning is still beyond most AI systems, for several reasons. Such problems require forming and using abstractions. They also require complex hierarchical planning, setting subgoals, backtracking, and trying new paths, all of which are difficult for AI.

Both new models can perform the advanced mathematical reasoning needed to solve such problems. AlphaProof was trained with reinforcement learning to prove mathematical statements in the formal language Lean. To create it, the researchers combined a pre-trained language model with AlphaZero, the reinforcement learning algorithm that previously taught itself to play chess, shogi and Go. AlphaGeometry 2, in turn, is an improved version of AlphaGeometry, an AI system introduced in January and designed to solve geometry problems.
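To give a sense of what proving a statement in a formal language such as Lean looks like, here is a deliberately tiny example in Lean 4. It is only an illustration and not AlphaProof output: the theorem name add_comm_example is made up for this sketch, and the proof simply appeals to the standard library lemma Nat.add_comm. Olympiad problems are far harder to state and prove.

```lean
-- Toy statement: addition of natural numbers is commutative.
-- The name `add_comm_example` is hypothetical, chosen for this illustration.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  -- Close the goal with the existing core lemma `Nat.add_comm`.
  exact Nat.add_comm a b
```

The point of such a language is that a proof checker accepts a proof only if every step is valid, so a result can be verified mechanically.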

While AlphaProof was trained to solve problems across a wide range of mathematical topics, AlphaGeometry 2 is optimized for problems about the movement of objects and for equations involving angles, ratios and distances. Because AlphaGeometry 2 was trained on significantly more synthetic data than its predecessor, it can handle much more complex geometry problems.

To test the capabilities of the new AI systems, Google DeepMind researchers tasked them with solving six problems from this year’s International Mathematical Olympiad (IMO) and proving the answers were correct. AlphaProof solved two algebra problems and one number theory problem, one of which was the hardest in the Olympiad, while AlphaGeometry 2 solved a geometry problem. Two problems in combinatorics remained unsolved.

Two renowned mathematicians, Tim Gowers and Joseph Myers, checked the solutions the systems produced. They awarded each of the four correct solutions the maximum number of points (seven out of seven), giving the systems a total of 28 points out of a possible 42. A contestant with the same score would have been awarded a silver medal and would have fallen just short of gold, which is awarded to those who score 29 points or more.
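The arithmetic behind that score can be checked with a few lines of Python. This is a minimal sketch that uses only the figures quoted above (seven points per problem, six problems, a gold threshold of 29); the constant and function names are invented for the illustration.

```python
# Figures taken from the article; all names here are illustrative.
POINTS_PER_PROBLEM = 7   # maximum marks for one problem
TOTAL_PROBLEMS = 6       # problems set at the Olympiad
GOLD_CUTOFF = 29         # gold threshold reported in the article


def score(fully_solved: int) -> int:
    """Total for full marks on `fully_solved` problems and zero on the rest."""
    return fully_solved * POINTS_PER_PROBLEM


achieved = score(4)                    # four problems solved with full marks
maximum = score(TOTAL_PROBLEMS)        # best possible total
gap_to_gold = GOLD_CUTOFF - achieved   # points missing for a gold medal

print(f"{achieved}/{maximum} points, {gap_to_gold} short of gold")
```

Four full-mark solutions give 28 of 42 points, one point below the gold threshold.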

For the first time, an AI system was able to achieve medal-level results in solving IMO mathematical problems. “As a mathematician, I find this very impressive and a significant leap over what was previously possible,” Gowers said during a press conference.

Creating AI systems that can solve complex mathematical problems could pave the way for exciting human-AI collaborations, says Katie Collins, a researcher at the University of Cambridge. This, in turn, can help us learn more about how we humans do math. “There’s still a lot we don’t know about how people solve complex math problems,” she says.

Published by
admin
