Work for us: AWS to give scientists clusters of up to 40,000 Trainium AI accelerators

AWS intends to attract more people to develop AI applications and frameworks on Amazon's Trainium family of accelerators. As part of the new Build on Trainium initiative, backed by $110 million in funding, academic researchers will be given access to UltraCluster clusters of up to 40,000 accelerators, The Register reports.

Under the Build on Trainium program, cluster access will be offered to university researchers developing new AI algorithms that can improve accelerator utilization and the scaling of computations in large distributed systems. It is not specified which generation of chips, Trainium1 or Trainium2, the clusters will be built on.

Image source: AWS

As AWS explains on its blog, researchers may devise new AI model architectures or new performance-optimization techniques, yet lack access to the HPC resources needed for large-scale experiments. Just as importantly, the results of this work are expected to be distributed under an open-source model, so the entire machine-learning ecosystem stands to benefit.

However, there is little altruism on AWS's part. First, the $110 million will be issued to selected projects in the form of cloud credits, which is not a new practice. Second, the company is effectively offloading some of its own work onto others. AWS's custom chips, including its AI accelerators for training and inference, were originally designed to make the company's internal workloads more efficient, so the low-level frameworks and supporting software were never intended for broad public use, unlike, for example, NVIDIA's CUDA.

In other words, to popularize Trainium, AWS needs software that is easier to learn and, better still, ready-made solutions for applied problems. It is no coincidence that Intel and AMD prefer to offer developers ready-made frameworks such as PyTorch and TensorFlow optimized for their accelerators rather than push them into fairly low-level programming. AWS does the same with products like SageMaker.

The project is made possible largely by the new Neuron Kernel Interface (NKI) for AWS Trainium and Inferentia, which provides direct access to the chips' instruction set and lets researchers build optimized compute kernels for new models, performance tuning, and innovation in general. And scientists, unlike ordinary developers, are often interested in working at exactly this low level.
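To give a sense of what kernel development with NKI looks like, here is a rough sketch modeled on the element-wise addition example in AWS's NKI documentation. The module paths and function names (`neuronxcc.nki`, `nl.load`, `nl.store`, `nl.shared_hbm`) are assumptions based on that documentation, and the kernel only compiles and runs with the Neuron SDK on Trainium or Inferentia hardware, so treat this as an illustrative sketch rather than a definitive implementation:

```python
# Sketch of an NKI kernel (assumed API, styled after AWS Neuron docs);
# requires the Neuron SDK and Trainium/Inferentia hardware to actually run.
from neuronxcc import nki
import neuronxcc.nki.language as nl

@nki.jit
def tensor_add_kernel(a_input, b_input):
    # Allocate the output tensor in device HBM.
    c_output = nl.ndarray(a_input.shape, dtype=a_input.dtype,
                          buffer=nl.shared_hbm)
    # Load input tiles from HBM into on-chip SBUF memory.
    a_tile = nl.load(a_input)
    b_tile = nl.load(b_input)
    # The element-wise addition executes on the accelerator's compute engines.
    c_tile = a_tile + b_tile
    # Store the result tile back to HBM.
    nl.store(c_output, value=c_tile)
    return c_output
```

The point of the example is the level of control on display: the programmer explicitly manages data movement between HBM and on-chip memory, which is precisely the kind of low-level tuning the article says researchers want and ordinary application developers usually avoid.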
