IBM has announced a new generation of computing systems for artificial intelligence: the Telum II processor and the IBM Spyre accelerator. Both products are designed to accelerate AI workloads and improve the performance of enterprise applications. Telum II brings a larger cache and high-performance cores, while the Spyre accelerator complements it with additional capacity for AI-based applications.
According to the company's blog, the new IBM Telum II processor, manufactured on Samsung's 5-nanometer process, will have eight high-performance cores running at 5.5 GHz. On-chip cache capacity has increased by 40%, with the virtual L3 cache growing to 360 MB and the L4 cache to 2.88 GB. Other additions are an integrated data processing unit (DPU) that accelerates I/O operations and a next-generation built-in AI accelerator.
Telum II offers significant performance improvements over the previous generation. The built-in AI accelerator provides four times the processing power, reaching 24 trillion operations per second (TOPS). The accelerator architecture is optimized for large language models and supports a wide range of AI models for complex analysis of structured and text data. The new processor also supports the INT8 data type to improve computational efficiency. At the system level, Telum II allows each processor core to access any of the eight AI accelerators within a single module, which distributes load more efficiently and yields an aggregate 192 TOPS (8 × 24 TOPS).
IBM also introduced the Spyre accelerator, developed jointly by the IBM Research and IBM Infrastructure development teams. Spyre has 32 AI accelerator cores whose architecture is similar to that of the accelerator integrated into the Telum II chip. Multiple Spyre accelerators can be connected to the IBM Z I/O subsystem via PCIe, significantly increasing the resources available for AI workloads.
Telum II and Spyre are designed to support a wide range of AI use cases, including ensemble AI. This method uses multiple AI models together to improve overall prediction performance and accuracy. One example is insurance fraud detection, where traditional neural networks are combined with large language models to make the analysis more effective.
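To make the ensemble idea concrete, here is a minimal sketch in Python. It is not IBM's implementation: the two scoring functions, the claim fields, and the weighted-average rule are all hypothetical stand-ins chosen only to show how scores from a structured-data model and a text model might be combined into one fraud estimate.

```python
from typing import Callable, Sequence

# Hypothetical: each "model" maps a claim record to a fraud probability in [0, 1].
Model = Callable[[dict], float]


def tabular_model(claim: dict) -> float:
    """Toy stand-in for a traditional neural network over structured fields."""
    return min(1.0, claim["amount"] / 100_000)


def text_model(claim: dict) -> float:
    """Toy stand-in for a language model scoring the free-text description."""
    suspicious = ("cash", "urgent", "lost receipt")
    text = claim["description"].lower()
    hits = sum(phrase in text for phrase in suspicious)
    return hits / len(suspicious)


def ensemble_score(claim: dict,
                   models: Sequence[Model],
                   weights: Sequence[float]) -> float:
    """Combine the model scores with a weighted average (one simple ensembling rule)."""
    if len(models) != len(weights) or not models:
        raise ValueError("models and weights must be non-empty and the same length")
    total = sum(weights)
    return sum(w * m(claim) for m, w in zip(models, weights)) / total


if __name__ == "__main__":
    claim = {"amount": 50_000, "description": "Urgent cash payout requested"}
    score = ensemble_score(claim, [tabular_model, text_model], [1.0, 1.0])
    print(f"combined fraud score: {score:.3f}")
```

In practice the weights would be tuned on labeled data, and other combination rules (voting, stacking, or routing only low-confidence cases to the larger model) are equally valid forms of ensembling.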
Both products were presented on August 26 at the Hot Chips 2024 conference in Palo Alto (California, USA). Their release is planned for 2025.