Generative AI models are capturing the imagination of many business leaders with promises of automating workflows and replacing millions of jobs. However, scientists at the Massachusetts Institute of Technology (MIT) warn that although AI produces plausible answers, it does not actually understand complex systems and is limited to making predictions. In real-world tasks, whether reasoning, navigation, chemistry, or gaming, AI exhibits significant limitations.

Image source: HUNGQUACH679PNG / Pixabay

Modern large language models (LLMs) such as GPT-4 give the impression of thoughtfully answering complex user queries, when in fact they only predict which words are most likely to follow the preceding ones in a given context. To test whether AI models can truly “understand” the real world, MIT scientists developed metrics designed to objectively test their intelligence.
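To make the next-word-prediction idea concrete, here is a minimal toy sketch (not the study's code, and far simpler than a real transformer): it counts which word tends to follow which in a tiny corpus and always picks the most frequent continuation. The corpus and function names are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy next-token predictor built from bigram counts. Real LLMs use
# transformers trained on huge corpora, but the objective is the same:
# pick the statistically most likely next token.
corpus = "the cat sat on the mat the cat ate the fish".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent token observed after `token`."""
    counts = bigrams.get(token)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # -> "cat": the most frequent continuation
```

The point of the MIT metrics is precisely that a system like this can produce fluent continuations without holding any model of what the words refer to.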

One objective of the experiment was to evaluate the ability of AI to generate step-by-step instructions for navigating the streets of New York. While generative AI exhibits some degree of “implicit” learning of the laws of the world around it, this is not equivalent to true understanding. To make the assessment more rigorous, the researchers created formalized metrics that let them analyze how correctly an AI perceives and interprets real situations.

The MIT study focused on transformers, the type of generative AI model behind popular services such as GPT-4. Transformers are trained on vast amounts of text, which allows them to predict likely word sequences with high accuracy and produce believable text.

To explore the capabilities of such systems further, the scientists used a class of problems known as deterministic finite automata (DFAs), which cover areas such as logic, geographic navigation, chemistry, and even game strategy. For the experiment, the researchers chose two different tasks, navigating the streets of New York by car and playing Othello, to test the AI's ability to correctly understand the underlying rules.
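A DFA is useful as a test bed because the “true” world model is fully known: it has a fixed set of states and exactly one successor state for each (state, input) pair. The sketch below illustrates the idea with a tiny navigation-flavored automaton; the intersections and turns are invented for illustration and are not taken from the study.

```python
# A minimal deterministic finite automaton (DFA): every (state, move)
# pair has exactly one defined successor, so the correct world model
# is known in advance and a model's routes can be checked against it.
TRANSITIONS = {
    # (intersection, turn) -> next intersection
    ("A", "left"): "B",
    ("A", "right"): "C",
    ("B", "left"): "C",
    ("B", "right"): "A",
    ("C", "left"): "A",
    ("C", "right"): "B",
}

def run_dfa(start, moves):
    """Follow a sequence of moves; an undefined move means an invalid route."""
    state = start
    for move in moves:
        key = (state, move)
        if key not in TRANSITIONS:
            return None  # the route is invalid under the true world model
        state = TRANSITIONS[key]
    return state

# A model "understands" the map only if every route it proposes stays
# consistent with these transitions.
print(run_dfa("A", ["left", "left", "right"]))  # -> "B"
```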

As Harvard University postdoc Keyon Vafa noted, the key goal of the experiment was to test the ability of AI models to reconstruct the internal logic of complex systems: “We needed test beds where we knew exactly what the model of the world looked like. Now we can think rigorously about what it means to recover that model of the world.”

Test results showed that transformers can provide correct routes and suggest valid moves in Othello when task conditions are precisely defined. However, when complicating factors such as street closures and detours in New York were added, the models began generating counter-intuitive routes, proposing random overpasses that didn't actually exist.
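The kind of stress test described here can be sketched in a few lines. The map, the routes, and the “detour” below are invented for illustration (this is not the study's code): one street is closed, and each proposed route is checked against the streets that actually exist.

```python
# Hedged sketch of a detour stress test on an invented toy map.
STREETS = {
    ("A", "B"), ("B", "C"), ("C", "D"),  # the direct route A -> B -> C -> D
    ("A", "E"), ("E", "D"),              # a real alternative via E
}

def route_is_valid(route, streets):
    """Every consecutive pair in the route must be an existing street."""
    return all((a, b) in streets for a, b in zip(route, route[1:]))

# Simulate a detour: the street B -> C is closed.
detoured = STREETS - {("B", "C")}

print(route_is_valid(["A", "B", "C", "D"], detoured))  # False: uses the closed street
print(route_is_valid(["A", "E", "D"], detoured))       # True: the real detour
print(route_is_valid(["A", "F", "D"], STREETS))        # False: street A -> F never existed
```

A system with a correct internal map passes this check trivially; the study's finding was that the transformers' proposed routes often failed it once the map was perturbed.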

The MIT study thus demonstrated fundamental limitations of generative AI models, especially in tasks that require flexibility and the ability to adapt to real-world conditions. However impressive existing AI models are at generating plausible answers, they remain predictive tools rather than fully intelligent systems.
