OpenAI Deep Research set a record on the extremely difficult “Humanity’s Last Exam”

Image source: scale.com

The benchmark, created by experts from around the world, contains extremely difficult knowledge and reasoning questions and tasks; some people cannot even understand individual questions, let alone answer them. Soon after its release, the leaderboard was headed by the DeepSeek R1 reasoning model, which answered 9.4% of the questions correctly. OpenAI's o3-mini, with 10.5%, and o3-mini-high, with 13%, managed to overtake it; the latter is indeed more powerful, but it also runs slower. The OpenAI Deep Research agent posted a far more impressive result: it scored 26.6%, overtaking the previous leaders in less than 10 days.

Published by
admin
