A team of researchers from Google and the University of California, Berkeley has proposed a new method for scaling artificial intelligence (AI) called “inference-time search,” in which a model generates multiple answers to a query and selects the best one. The approach can improve a model’s performance without additional training, though outside experts have questioned how much it really achieves.


Until now, the main way to improve AI has been to train large language models (LLMs) on ever larger datasets and to apply more computing power at inference (test) time. This has become the norm, almost a law, for most leading AI labs. Under the newly proposed method, the model instead generates many candidate answers to a user’s query and then selects the best one. As TechCrunch notes, this can significantly improve answer accuracy even for smaller, older models.

As an example, the researchers cited Google’s Gemini 1.5 Pro model, released in early 2024. Using inference-time search, this model reportedly outperformed OpenAI’s powerful o1-preview on mathematical and scientific benchmarks. One of the paper’s authors, Eric Zhao, emphasized: “By simply randomly selecting 200 answers and checking them, Gemini 1.5 clearly outperforms o1-preview and even approaches o1.”
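The sample-then-verify idea described above can be sketched in a few lines. The following is a minimal toy illustration, not the researchers’ actual system: the “model” and “verifier” here are hypothetical stand-ins (a noisy number guesser and an arithmetic checker) chosen because verifying an answer is much cheaper than producing it, which is the regime where best-of-N selection pays off.

```python
import random


def best_of_n(generate, score, n=200, seed=0):
    """Sample n candidate answers and return the highest-scoring one.

    generate: callable taking an RNG and returning one candidate answer.
    score: a verifier; higher scores mean a better (more correct) answer.
    """
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=score)


# Toy stand-ins (NOT a real LLM): the task is "find x with x*x = 1764".
# The "model" just guesses integers; the "verifier" checks how close
# the square of a guess is to the target -- checking is easy even
# though guessing correctly is unlikely on any single try.
def generate(rng):
    return rng.randint(1, 100)


def score(x):
    return -abs(x * x - 1764)  # 0 only for the exact answer, x = 42


if __name__ == "__main__":
    print(best_of_n(generate, score, n=200, seed=0))
```

With one sample the guess is almost always wrong; with 200 samples the verifier will usually have a (near-)exact answer to pick, mirroring the paper’s observation that more sampled answers plus cheap verification lifts accuracy without retraining the model.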

Outside experts, however, found the results predictable and saw no revolutionary breakthrough in the method. Matthew Guzdial, an AI researcher at the University of Alberta, noted that it works only when the correct answer can be verified unambiguously, which is not the case for most real-world problems.

Mike Cook, a researcher at King’s College London, agrees. He says the new method doesn’t improve AI’s reasoning abilities but rather helps it work around existing limitations. “If a model is wrong 5% of the time, then by testing 200 options, those errors will just become more noticeable,” he explains. The core problem is that the method doesn’t make models smarter; it simply multiplies the amount of computation needed to find the best answer, which in real-world settings could prove too expensive to be practical.

Despite this, the search for new ways to scale AI continues, as current models require enormous computational resources and researchers strive to find methods that will improve the level of AI reasoning without incurring excessive costs.
