Classic platformer Super Mario Bros. puts AI to the test

Comparing AI models is notoriously difficult: their creators are often accused of bias and of presenting test results in ways that ordinary people struggle to understand. So rather than relying on abstract math and logic tests, one research team proposed testing AI models with Nintendo’s classic platformer Super Mario Bros.

Image source: Hao AI Lab

The experiment ran Super Mario Bros. in an emulator integrated with GamingAgent, a custom framework from researchers at the Hao AI Lab at the University of California, San Diego. The framework let AI models control Mario by generating Python code. All models received the same basic instructions, such as “Jump over this enemy,” along with screenshots showing the current state of the game.
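To make the setup more concrete, here is a minimal sketch of what such a control loop might look like. It is not GamingAgent's actual code: `FakeController`, `stub_model`, and `run_step` are hypothetical names invented for illustration, and the model call is replaced by a stub that returns a fixed snippet.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FakeController:
    """Hypothetical stand-in for an emulator input interface.

    It only records button presses so the loop can run without an emulator.
    """
    log: List[str] = field(default_factory=list)

    def press(self, button: str, frames: int = 1) -> None:
        self.log.append(f"{button}x{frames}")


def stub_model(instruction: str, screenshot: bytes) -> str:
    """Stand-in for a real model call: returns Python source as text."""
    return "controller.press('right', 10)\ncontroller.press('A', 15)\n"


def run_step(
    model: Callable[[str, bytes], str],
    controller: FakeController,
    instruction: str,
    screenshot: bytes,
) -> None:
    """Ask the model for code, then run it with only `controller` exposed."""
    code = model(instruction, screenshot)
    # Emptying __builtins__ limits what the snippet can reach, but this is
    # NOT a real sandbox; untrusted code needs proper isolation.
    exec(code, {"__builtins__": {}}, {"controller": controller})


controller = FakeController()
run_step(stub_model, controller, "Jump over this enemy", b"")
print(controller.log)
```

In a real setup the stub would be replaced by a request to a model, and the controller would forward the presses to the emulator.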

While Super Mario Bros. may look like a simple 2D platformer, the researchers found that the classic Nintendo game forces AI models to plan complex movement sequences and adapt their strategy on the fly.

According to the researchers, the best performer was Anthropic’s Claude 3.7, which demonstrated impressive reflexes, stringing together precise jumps and skillfully avoiding enemies. Its predecessor, Claude 3.5, also showed decent results, while OpenAI’s GPT-4o and Google’s Gemini 1.5 Pro lagged behind the competition.

As it turns out, the key to success in Super Mario Bros. is timing, not logical reasoning. Even a small delay can send Mario back to a previous checkpoint. The researchers suggest that the more “deliberate” reasoning models may have taken too long to decide on their next move, leading to frequent failures.
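A rough back-of-the-envelope sketch shows why response time matters so much. The numbers here are illustrative assumptions, not measurements from the experiment: the game is assumed to run at 60 frames per second without pausing, and the latency values and the 45-frame window are invented for the example.

```python
import math


def frames_elapsed(latency_seconds: float, fps: float = 60.0) -> int:
    """Number of game frames that pass while the model is still deciding."""
    return math.ceil(latency_seconds * fps)


def reacts_in_time(
    latency_seconds: float, frames_until_contact: int, fps: float = 60.0
) -> bool:
    """True if the decision arrives before Mario reaches the hazard."""
    return frames_elapsed(latency_seconds, fps) <= frames_until_contact


# Hypothetical case: an enemy is 45 frames (0.75 s) away from Mario.
for latency in (0.5, 2.0):
    print(latency, frames_elapsed(latency), reacts_in_time(latency, 45))
```

Under these assumptions, a model that answers in half a second still has time to jump, while one that spends two seconds reasoning responds long after the collision has already happened.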

Of course, using retro games to evaluate AI is largely an experiment. An AI’s ability to beat Super Mario Bros. says little about how useful it really is, though watching models with billions of parameters battle (and often lose to) a seemingly childish game is certainly entertaining.

For those who want to run their own experiment, Hao AI Lab has published the source code of GamingAgent on GitHub.
