Researchers create an analogue of OpenAI's AI model in 26 minutes using distillation

Researchers from Stanford and the University of Washington have created an AI model that surpasses an OpenAI model at solving mathematical problems. The model, called S1, was trained on a limited dataset of 1,000 questions using distillation. This achieved high performance with minimal resources and suggests that large companies such as OpenAI, Microsoft, Meta, and Google may not need to build huge data centers filled with thousands of NVIDIA graphics processors.

Image source: Growtika / Unsplash

The distillation method the researchers applied was the key to the experiment. This approach lets a small model learn from answers produced by a larger AI model. In this case, as The Verge writes, S1 quickly improved its abilities by training on answers from Gemini 2.0 Flash Thinking Experimental, an artificial-intelligence model developed by Google.
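Distillation of this kind amounts to supervised fine-tuning on a teacher model's outputs: the teacher answers each question with a reasoning trace, and the student is trained to reproduce that text. A minimal sketch follows; the function name, prompt template, and sample data are illustrative assumptions, not details from the S1 work:

```python
# Sketch of distillation as supervised fine-tuning on teacher traces.
# A large "teacher" model answers each question with a reasoning trace;
# the small "student" is then fine-tuned on (prompt, target) pairs built
# from those traces as ordinary next-token-prediction data.

def build_sft_example(question: str, teacher_reasoning: str, teacher_answer: str) -> dict:
    """Pack one teacher-generated sample into a supervised training example.
    The prompt/target split here is illustrative; real chat templates differ."""
    prompt = f"Question: {question}\nThink step by step."
    target = f"{teacher_reasoning}\nAnswer: {teacher_answer}"
    return {"prompt": prompt, "target": target}

# A tiny hypothetical "dataset" of teacher outputs:
teacher_traces = [
    ("What is 12 * 13?", "12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156.", "156"),
]

sft_dataset = [build_sft_example(q, r, a) for q, r, a in teacher_traces]
# The student model would then be fine-tuned to generate each "target"
# given its "prompt" -- no access to the teacher's weights is needed.
```

The point of the technique is that only the teacher's *text outputs* are required, which is why it is cheap and why it raises the terms-of-service questions mentioned below.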

The S1 model was built on the open-source Qwen2.5 model from Alibaba Cloud. The researchers initially used a dataset of 59,000 questions, but during their experiments concluded that increasing the amount of data did not yield significant improvements, so for the final training they used only a small set of 1,000 questions. Just 16 NVIDIA H100 GPUs were used.

S1 also employed a technique called test-time scaling, which lets the model "reflect" before generating an answer. The researchers additionally prompted the model to double-check its conclusions by appending the word "Wait", which forced the AI to continue reasoning and correct errors in its answers.
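The "Wait" trick can be sketched as a decoding loop that suppresses the model's end-of-reasoning marker until a minimum thinking budget has been spent. Everything below is a toy illustration under that assumption: the marker string, function names, and the stand-in "model" are hypothetical, not a real model API:

```python
# Toy sketch of forcing extra reasoning: when the model tries to stop
# thinking before the budget is spent, append "Wait" so it keeps going.

END_THINKING = "</think>"  # illustrative end-of-reasoning marker


def force_budget(generate_step, min_steps: int, max_steps: int) -> list:
    """generate_step(tokens) -> next token; returns the reasoning trace.

    The end-of-thinking token is only accepted once at least min_steps
    reasoning tokens exist; before that it is replaced with "Wait".
    """
    tokens = []
    while len(tokens) < max_steps:
        tok = generate_step(tokens)
        if tok == END_THINKING:
            if len(tokens) >= min_steps:
                break                # budget spent: allow the model to stop
            tokens.append("Wait")    # too early: force more reasoning
            continue
        tokens.append(tok)
    return tokens


# Stand-in "model" that tries to stop after every couple of tokens:
def fake_model(tokens):
    return END_THINKING if len(tokens) % 3 == 2 else "step"


trace = force_budget(fake_model, min_steps=5, max_steps=10)
# trace -> ["step", "step", "Wait", "step", "step"]
```

The design choice is that intervention happens purely at decoding time: no retraining is needed, only control over when the stop marker is accepted.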

The S1 model reportedly showed impressive results, outperforming OpenAI's o1-preview by up to 27% on mathematical problems. The recently sensational R1 model from DeepSeek used a similar approach, also at relatively little cost. However, OpenAI now accuses DeepSeek of extracting information from its models in violation of its terms of service. It is worth noting that Google Gemini's terms of use state that its API may not be used to create competing chatbots.

According to experts, the rise of smaller and cheaper models could upend the entire industry, showing that there is no need to invest billions of dollars in AI training, build huge data centers, and purchase vast numbers of GPUs.
