Deliveries of $3 million Nvidia GB200 AI servers are in jeopardy due to leaks in the liquid cooling system

An unexpected problem has hit Nvidia’s latest GB200 NVL72 and NVL36 server systems, which are built around the GB200 compute accelerators designed for artificial intelligence workloads. Shortly before mass production and launch, a serious defect was discovered in the liquid cooling system.


As a reminder, a GB200 NVL72 system is an entire server rack holding 18 1U nodes. Each node carries a pair of GB200 accelerators, and each GB200 in turn combines a pair of Nvidia B200 chips with one 72-core Arm-based Grace processor. In total, the system includes 72 B200 chips and 36 Grace processors connected by the NVLink 5 bus. The whole rack consumes about 120 kW and is equipped with a liquid cooling system and a single DC power bus. The GB200 NVL36 is a system with half the number of GB200 accelerators. According to preliminary data, the GB200 NVL72 system will cost $3 million.
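The chip counts in the paragraph above follow from simple multiplication. The short Python sketch below reproduces that arithmetic using only the figures quoted in the article; the function and variable names are illustrative, and the per-accelerator power figure is only a rough average of the whole-rack budget (which also covers switches and other components), not a specification.

```python
# Figures quoted in the article; names here are illustrative only.
NODES_PER_NVL72 = 18      # 1U nodes per NVL72 rack
GB200_PER_NODE = 2        # GB200 accelerators per node
B200_PER_GB200 = 2        # B200 chips per GB200
GRACE_PER_GB200 = 1       # Grace processors per GB200
NVL72_POWER_KW = 120      # approximate rack power


def totals(gb200_count):
    """Return chip counts for a system with the given number of GB200 accelerators."""
    return {
        "gb200": gb200_count,
        "b200": gb200_count * B200_PER_GB200,
        "grace": gb200_count * GRACE_PER_GB200,
    }


nvl72 = totals(NODES_PER_NVL72 * GB200_PER_NODE)
# The article describes the NVL36 only as having half the number of GB200.
nvl36 = totals(nvl72["gb200"] // 2)

# Rough average of total rack power per GB200 (assumption: an even split).
kw_per_gb200 = NVL72_POWER_KW / nvl72["gb200"]

print("NVL72:", nvl72)
print("NVL36:", nvl36)
print(f"Average rack power per GB200: {kw_per_gb200:.2f} kW")
```

The counts for the NVL72 match the 72 B200 chips and 36 Grace processors stated in the article.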

As TweakTown reports, citing the Taiwanese publication UDN, leaks have been detected in the GB200 NVL72 liquid cooling systems. According to preliminary data, they are linked to components from third-party manufacturers. Nvidia had previously outsourced the production of some cooling system components, such as pipes, quick connectors and hoses, to its partners, which are large international manufacturers.


The leaks were discovered before mass production of the NVL36 and NVL72 AI systems began, giving manufacturers time to fix the problems. Despite the difficulties and the threat of missed delivery dates for key customers, the product is still expected to ship on time.

However, the incident has raised concerns among major cloud service providers, who are worried about the reliability of Nvidia’s new servers. In response, Taiwanese manufacturers such as Shuanghong and Qihong have begun ramping up production of liquid cooling components to give Nvidia alternative suppliers.

Certifying pipes, quick-release couplings and hoses is a complex process that requires specialized knowledge and experience. Taiwanese companies did not previously specialize in such components, but Nvidia’s decision to use liquid cooling for its AI chips pushed them to develop new technologies. Work to eliminate the problem is currently underway. Server racks with GB200 accelerators and the corrected cooling system are expected to begin shipping to customers in the near future.
