AI will get by without Nvidia: Amazon has launched Trainium2-based systems, with Trainium3 to follow a year later

Amazon’s cloud division, Amazon Web Services (AWS), announced at its re:Invent conference that customers of its cloud platform can now use systems built on Trainium2 accelerators, which are designed to train and run large language models.

Image source: aws.amazon.com

The chips, introduced last year, are four times faster than their predecessors: a single EC2 Trn2 instance with 16 Trainium2 accelerators delivers up to 20.8 Pflops. According to Amazon, this means that when deploying a model on the scale of Meta’s Llama 405B on the Amazon Bedrock platform, customers get a “3x increase in token generation speed compared to other available offerings from major cloud providers.” There is also the EC2 Trn2 UltraServer, which combines 64 Trainium2 accelerators for 83.2 Pflops of performance. Note that the 20.8 Pflops figure refers to dense models at FP8 precision, while 83.2 Pflops refers to sparse models at FP8. Accelerators within UltraServer systems communicate over the NeuronLink interconnect.
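For context, the two quoted figures are consistent with simple linear scaling: the 64-chip UltraServer quotes exactly four times the 16-chip instance’s throughput. A quick back-of-the-envelope check (the per-chip numbers below are derived from the quoted totals, not official AWS specifications):

```python
# Sanity-check the throughput figures quoted above (derived arithmetic only;
# the per-chip numbers are not official AWS specifications).

TRN2_INSTANCE_PFLOPS = 20.8  # EC2 Trn2 instance: 16 chips, dense FP8
ULTRASERVER_PFLOPS   = 83.2  # Trn2 UltraServer: 64 chips, sparse FP8

# Per-chip throughput implied by each quoted total.
print(TRN2_INSTANCE_PFLOPS / 16)  # 1.3 Pflops per chip (dense FP8)
print(ULTRASERVER_PFLOPS / 64)    # 1.3 Pflops per chip (sparse FP8)

# The 64-chip UltraServer quotes exactly 4x the 16-chip instance,
# i.e. the quoted performance scales linearly with chip count.
assert abs(ULTRASERVER_PFLOPS - 4 * TRN2_INSTANCE_PFLOPS) < 1e-9
```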

Together with its partner Anthropic, OpenAI’s main competitor in the field of large language models, AWS intends to build a large cluster of UltraServer systems with “hundreds of thousands of Trainium2 chips,” on which the startup can train its models. It will be five times more powerful than the cluster Anthropic used to train its current generation of models; AWS expects it to “be the world’s largest AI compute cluster reported to date.” The project should help the company rival the performance of current Nvidia accelerators, which remain in high demand and in short supply, even though early next year Nvidia plans to launch its new generation of Blackwell accelerators, which, with 72 chips per rack, will offer up to 720 Pflops at FP8.

Perhaps that is why AWS has already announced the next generation, Trainium3, which promises another fourfold increase in performance for UltraServer systems; the accelerators will be manufactured on a 3 nm process, and deployments are expected to begin in late 2025. The company justifies the need for the new generation by noting that modern AI models are approaching trillions of parameters in size. Trn2 instances are currently available only in AWS’s US East region but will soon appear in others; UltraServer systems are currently in preview.
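Taking the article’s figures at face value, a rough projection shows what that fourfold step implies relative to Nvidia’s quoted rack numbers (an illustrative sketch: the projected Trainium3 value and the per-chip Blackwell figure are derived here, not vendor specifications, and the article does not say whether the Blackwell figure is for dense or sparse math, so the comparison is indicative only):

```python
# Rough comparison using only the figures quoted in this article.
# Derived values are illustrative projections, not vendor specifications.

TRN2_ULTRASERVER_PFLOPS = 83.2  # 64 x Trainium2, sparse FP8
TRN3_GAIN               = 4     # AWS's promised UltraServer speedup

BLACKWELL_RACK_PFLOPS = 720     # Nvidia Blackwell rack at FP8, per the article
BLACKWELL_RACK_CHIPS  = 72

# Projected Trainium3 UltraServer throughput.
print(TRN2_ULTRASERVER_PFLOPS * TRN3_GAIN)           # 332.8 Pflops
# Implied per-chip FP8 throughput of a Blackwell rack.
print(BLACKWELL_RACK_PFLOPS / BLACKWELL_RACK_CHIPS)  # 10.0 Pflops
```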
