Image source: scale.com
The benchmark, created by experts from around the world, contains extremely difficult questions and tasks on knowledge and reasoning. Some people cannot even understand individual questions in it, let alone answer them. Soon after its release, the leaderboard was headed by the DeepSeek R1 reasoning model, which answered 9.4% of the questions correctly. OpenAI's o3-mini, with 10.5%, and o3-mini-high, with 13%, managed to overtake it; the latter is indeed more powerful, but it also runs slower. The OpenAI Deep Research agent showed a far more impressive result: it scored 26.6%, overtaking the previous leader in less than 10 days.