Cerebras Systems, in collaboration with the US Department of Energy's (DOE) Sandia National Laboratories (SNL), conducted a successful experiment in training an AI model with 1 trillion parameters on a single CS-3 system with a WSE-3 wafer-scale accelerator and 55 TB of external MemoryX memory.
Training models of this scale typically requires thousands of GPU-based accelerators consuming megawatts of power, dozens of specialists, and weeks of hardware and software tuning, Cerebras says. However, SNL scientists were able to train the model on a single system without modifying either the model or the infrastructure software. Moreover, they achieved near-linear scaling: 16 CS-3 systems delivered a 15.3-fold increase in training speed.
Image source: Cerebras
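As a quick sanity check on the scaling claim, a 15.3-fold speedup on 16 systems works out to roughly 96% of ideal linear scaling. The short sketch below simply does that arithmetic using the figures reported above; it is an illustration, not an independent measurement.

```python
# Scaling efficiency implied by the reported numbers:
# a 15.3x speedup on 16 CS-3 systems vs. a single system.
systems = 16
speedup = 15.3

efficiency = speedup / systems  # fraction of ideal linear scaling
print(f"Scaling efficiency: {efficiency:.1%}")  # ~95.6%
```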
A model of this scale requires terabytes of memory, thousands of times more than is available on a single GPU. In other words, a classical cluster of thousands of accelerators must be correctly interconnected before training can even begin. Cerebras systems instead store the model weights in external MemoryX memory built from 1U nodes with commodity DDR5, which, the company says, makes training a trillion-parameter model as easy as training a small model on a single accelerator.
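To see why terabytes are needed, here is a rough back-of-the-envelope estimate. The per-parameter byte counts assume mixed-precision training with Adam-style optimizer state and are illustrative assumptions, not figures from Cerebras or SNL.

```python
# Rough estimate of the memory footprint of training a
# 1-trillion-parameter model (illustrative assumptions only).
PARAMS = 1e12  # one trillion parameters

# Assumed per-parameter storage during mixed-precision training:
#   2 bytes - FP16/BF16 weights
#   2 bytes - FP16/BF16 gradients
#  12 bytes - FP32 master weights + two Adam optimizer moments
BYTES_PER_PARAM = 2 + 2 + 12

total_tb = PARAMS * BYTES_PER_PARAM / 1e12  # decimal terabytes
print(f"Estimated training state: ~{total_tb:.0f} TB")  # ~16 TB

# A single 80 GB GPU (assumed capacity) holds well under 1% of that,
# which is why conventional approaches shard the model across
# thousands of accelerators, while MemoryX keeps the full state in
# one external 55 TB memory pool.
print(f"Fraction that fits on one 80 GB GPU: {0.080 / total_tb:.2%}")
```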
Previously, SNL and Cerebras deployed the Kingfisher cluster based on CS-3 systems, which will be used as a test platform for the development of AI technologies for national security.