OpenAI Deep Research sets a record on the toughest benchmark, “Humanity’s Last Exam”

Image source: scale.com

The benchmark, created by experts from around the world, contains extremely difficult questions and tasks that test knowledge and reasoning; some people cannot even understand individual questions, let alone answer them. Soon after its release, the leaderboard was headed by the DeepSeek R1 reasoning model, which answered 9.4% of the questions correctly. It was then overtaken by OpenAI's o3-mini, with 10.5%, and o3-mini-high, with 13%; the latter is more capable but also slower. The OpenAI Deep Research agent posted a far more impressive result: it scored 26.6%, beating the previous record in less than 10 days.

Published by
admin
