AI can do without Nvidia: Amazon launches Trainium2-based systems, with Trainium3 to follow a year later

Amazon Web Services (AWS) announced at its re:Invent conference that customers of its cloud platform can now use systems powered by Trainium2 accelerators, designed for training and running large AI language models.

Image source: aws.amazon.com

The chips, introduced last year, are four times faster than their predecessors: a single EC2 instance with 16 Trainium2 accelerators delivers up to 20.8 Pflops. AWS says that when a model on the scale of Meta Llama 405B is deployed on the Amazon Bedrock platform, the customer will see a "3x increase in token generation speed compared to other available offerings from major cloud providers." There is also the EC2 Trn2 UltraServer, which combines 64 Trainium2 accelerators for 83.2 Pflops. Note that the 20.8 Pflops figure refers to dense models at FP8 precision, while the 83.2 Pflops figure refers to sparse models at FP8. Accelerators within UltraServer systems communicate over the NeuronLink interconnect.
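The quoted figures can be sanity-checked with simple arithmetic — a minimal sketch, taking the article's numbers at face value (the per-chip value it implies is an inference, not an official AWS figure):

```python
# Back-of-the-envelope check of the figures quoted above (FP8, per the article).
TRN2_CHIPS_PER_INSTANCE = 16   # accelerators in one EC2 Trn2 instance
INSTANCE_PFLOPS = 20.8         # dense FP8, per the article
ULTRASERVER_CHIPS = 64
ULTRASERVER_PFLOPS = 83.2      # sparse FP8, per the article

per_chip = INSTANCE_PFLOPS / TRN2_CHIPS_PER_INSTANCE
print(f"Implied dense FP8 per Trainium2 chip: {per_chip:.2f} Pflops")  # 1.30

# The 64-chip UltraServer figure scales linearly from the 16-chip instance:
assert abs(ULTRASERVER_PFLOPS - per_chip * ULTRASERVER_CHIPS) < 1e-9
```

The linear scaling (4× the chips, 4× the Pflops) suggests the two figures come from the same per-chip rating, despite the dense/sparse caveat.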

Together with its partner Anthropic, OpenAI's main competitor in the field of large language models, AWS intends to build a huge cluster of UltraServer systems with "hundreds of thousands of Trainium2 chips," on which the startup can train its models. It will be five times more powerful than the cluster on which Anthropic trained its current generation of models; AWS estimates it will "be the world's largest AI compute cluster reported to date." The project should help the company surpass the performance of current Nvidia accelerators, which are still in high demand and remain in short supply. Meanwhile, early next year Nvidia is preparing to launch its new generation of Blackwell accelerators, which, with 72 chips per rack, will offer up to 720 Pflops at FP8.
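Putting the two vendors' numbers side by side gives a rough per-chip comparison — a sketch using only the figures quoted in the article (vendors count dense vs sparse FP8 differently, so treat the ratio as indicative, not a benchmark):

```python
# Per-chip throughput implied by the rack/instance figures quoted above.
blackwell_rack_pflops = 720      # FP8, 72 chips per rack, per the article
blackwell_chips = 72
trainium2_instance_pflops = 20.8  # dense FP8, 16 chips
trainium2_chips = 16

blackwell_per_chip = blackwell_rack_pflops / blackwell_chips      # 10.0
trainium2_per_chip = trainium2_instance_pflops / trainium2_chips  # 1.3

print(f"Blackwell:  {blackwell_per_chip:.1f} Pflops/chip")
print(f"Trainium2:  {trainium2_per_chip:.1f} Pflops/chip")
```

On these quoted numbers, a Blackwell chip is several times faster per unit, which is why AWS competes on cluster scale ("hundreds of thousands of chips") rather than per-chip performance.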

Perhaps that is why AWS has already announced the next generation, Trainium3, promising another fourfold increase in performance for UltraServer systems. The accelerators will be manufactured on a 3 nm process, and their deployment will begin in late 2025. The company justifies the need for the new generation by the fact that modern AI models are approaching trillions of parameters in scale. Trn2 instances are currently available only in AWS's US East region but will soon appear in others; UltraServer systems are currently available in preview.
