Work for us: AWS to give scientists clusters of up to 40,000 Trainium AI accelerators

AWS wants to attract more people to develop AI applications and frameworks using Amazon’s Trainium family of accelerators. Under the new Build on Trainium initiative, backed by $110 million in funding, academic researchers will be given access to UltraCluster systems with up to 40,000 accelerators, The Register reports.

The Build on Trainium program will provide cluster access to university researchers working on new AI algorithms that improve accelerator utilization and the scaling of computations across large distributed systems. It is not specified whether the clusters will be built on first-generation Trainium or Trainium2 chips.

Image source: AWS

As AWS explains on its own blog, researchers may devise new AI model architectures or new performance-optimization techniques, yet lack access to the HPC resources needed for large-scale experiments. Just as importantly, the fruits of this labor are expected to be distributed under an open-source model, so the entire machine-learning ecosystem stands to benefit.

However, there is little altruism on AWS’s part. First, the $110 million will be issued to selected projects in the form of cloud credits, and this is not the first time the company has done so. Second, AWS is effectively offloading some of its own work onto others. Its custom chips, including AI accelerators for training and inference, were originally developed to make the company’s internal workloads more efficient, so the low-level frameworks and other software were never designed for free use by a wide audience, as is the case with NVIDIA’s CUDA, for example.

In other words, to popularize Trainium, AWS needs software that is easier to learn and, better still, ready-made solutions for applied problems. It is no coincidence that Intel and AMD prefer to offer developers established frameworks such as PyTorch and TensorFlow optimized for their accelerators, rather than trying to force them into fairly low-level programming. AWS does the same with products like SageMaker.

The project is made possible largely by the new Neuron Kernel Interface (NKI) for AWS Trainium and Inferentia, which provides direct access to the chips’ instruction set and lets researchers build optimized compute kernels for new models, performance tuning, and innovation in general. And unlike ordinary developers, scientists are often precisely the ones interested in working with low-level systems.
