Researchers from Stanford University and the University of Washington have created an AI model that surpasses OpenAI's in solving mathematical problems. The model, called s1, was trained via distillation on a small dataset of just 1,000 questions. This made it possible to achieve high performance with minimal resources, and it suggests that large companies such as OpenAI, Microsoft, Meta and Google may not need to build huge data centers packed with thousands of NVIDIA GPUs.

Image source: Growtika / Unsplash

The distillation method the scientists applied was key to the experiment. This approach allows small models to learn from answers provided by larger AI models. In this case, as The Verge writes, s1 quickly improved its abilities by training on answers from Gemini 2.0 Flash Thinking Experimental, an artificial intelligence model developed by Google.
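In outline, this kind of distillation means querying the larger "teacher" model and keeping its answers as supervised fine-tuning data for the smaller "student". A minimal sketch of the dataset-building step, where `teacher_answer` is a hypothetical stand-in for a real API call to the teacher model:

```python
def build_distillation_set(questions, teacher_answer):
    """Collect (prompt, completion) pairs from a teacher model.

    `teacher_answer` is a callable standing in for a real teacher-model
    API call (hypothetical); the resulting pairs would then be used to
    fine-tune the smaller student model.
    """
    return [{"prompt": q, "completion": teacher_answer(q)} for q in questions]


# Illustrative use with a mocked teacher:
dataset = build_distillation_set(
    ["What is 2 + 2?"],
    lambda q: "2 + 2 = 4",  # a real teacher would return a full reasoning trace
)
```

The student never sees the teacher's weights, only its outputs, which is why a small question set like s1's 1,000 examples can go a long way.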

The s1 model was built on Qwen2.5, an open-source model from Alibaba Cloud. The researchers initially used a dataset of 59,000 questions, but during their experiments they concluded that increasing the amount of data did not yield significant improvements, so for the final training they used only a small set of 1,000 questions. Just 16 NVIDIA H100 GPUs were used.

s1 also used a technique called test-time scaling, which allows the model to "reflect" before generating an answer. The researchers additionally prompted the model to double-check its conclusions by appending the word "Wait", which forced the AI to continue reasoning and correct errors in its answers.
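The idea can be sketched as a generation loop that intercepts the model's attempt to stop: if the model has not yet spent its thinking budget, the stop is suppressed and "Wait" is appended so generation continues. This is a toy illustration, not the researchers' actual implementation; `model_step` is a hypothetical function mapping the tokens so far to the next token:

```python
def generate_with_wait(model_step, prompt, min_thinking_tokens=8, max_tokens=64):
    """Toy sketch of test-time scaling via a forced 'Wait'.

    `model_step` stands in for one decoding step of a language model
    (hypothetical). Whenever it tries to emit the stop token before the
    thinking budget is spent, we append 'Wait' instead, pushing the model
    to keep reasoning.
    """
    tokens = list(prompt)
    while len(tokens) < max_tokens:
        nxt = model_step(tokens)
        if nxt == "<end>":
            if len(tokens) - len(prompt) < min_thinking_tokens:
                tokens.append("Wait")  # suppress the stop, force more reasoning
                continue
            break  # budget spent: allow the model to finish
        tokens.append(nxt)
    return tokens
```

With a real model, the extra "reasoning" tokens generated after each "Wait" give it a chance to spot and correct mistakes before committing to a final answer.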

The s1 model is claimed to have shown impressive results, surpassing OpenAI's o1-preview by up to 27% on mathematical problems. DeepSeek's recently celebrated R1 model also used a similar approach at relatively low cost, although OpenAI now accuses DeepSeek of extracting information from its models in violation of its terms of service. It is worth noting that Google Gemini's terms of use likewise forbid using its API to create competing chatbots.

According to experts, the rise of smaller and cheaper models could upend the entire industry and prove that there is no need to invest billions of dollars in AI training, build huge data centers, or purchase GPUs in bulk.
