NVIDIA has announced the H200 NVL accelerator, a dual-slot PCIe expansion card aimed at highly configurable, air-cooled enterprise systems for AI and HPC workloads.

Like the SXM version of the NVIDIA H200, the new accelerator carries 141 GB of HBM3e memory with 4.8 TB/s of bandwidth, while the maximum TDP has been lowered from 700 W to 600 W. Up to four cards can be linked via NVIDIA NVLink bridges with up to 900 GB/s of bandwidth per GPU; each accelerator connects to the host system over PCIe 5.0 x16.

A single server can host two such four-card groups, for a total of eight H200 NVL accelerators and 1128 GB of HBM3e memory, a substantial amount for inference workloads. The stated FP8 performance of the H200 NVL is 3.34 Pflops, versus roughly 4 Pflops for the SXM version; FP32 and FP64 throughput are 60 and 30 Tflops, respectively, and INT8 performance likewise reaches 3.34 peta-operations per second. A license for the NVIDIA AI Enterprise software platform is included with the cards.
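The headline totals follow from simple arithmetic on the per-card figures quoted above; a short Python sketch makes the calculation explicit (the constants below just restate the article's numbers):

```python
# Aggregate figures for a server with two four-card H200 NVL groups.
CARDS_PER_NVLINK_GROUP = 4    # H200 NVL cards joined via NVLink bridges
GROUPS_PER_SERVER = 2         # two such groups fit in one server
HBM3E_PER_CARD_GB = 141       # per-card HBM3e capacity
FP8_PER_CARD_PFLOPS = 3.34    # stated FP8 peak per card

cards = CARDS_PER_NVLINK_GROUP * GROUPS_PER_SERVER
total_hbm_gb = cards * HBM3E_PER_CARD_GB
total_fp8_pflops = cards * FP8_PER_CARD_PFLOPS

print(cards)            # 8 accelerators
print(total_hbm_gb)     # 1128 GB of HBM3e
print(round(total_fp8_pflops, 2))
```

Note that 8 × 141 GB works out to 1128 GB, which is why the total is often rounded to "over 1.1 TB" of HBM3e per server.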

Image source: NVIDIA

In addition, NVIDIA announced the liquid-cooled GB200 NVL4 accelerator. It combines two Grace Blackwell superchips, giving two 72-core Grace processors and four Blackwell GPUs. The system carries 960 GB of LPDDR5X ECC memory and 768 GB of HBM3e. The NVLink-C2C interconnect, with up to 900 GB/s of bandwidth, links all six CPU and GPU chips in a single coherent domain.
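The memory layout of the GB200 NVL4 can likewise be broken down with a little arithmetic; the sketch below only rearranges the figures given above (per-GPU HBM3e is derived, not quoted, in the source):

```python
# Memory breakdown for the GB200 NVL4 node described above.
SUPERCHIPS = 2            # two Grace Blackwell superchips
GPUS_PER_SUPERCHIP = 2    # each superchip pairs one Grace CPU with two GPUs
LPDDR5X_TOTAL_GB = 960    # CPU-attached LPDDR5X ECC memory
HBM3E_TOTAL_GB = 768      # GPU-attached HBM3e memory

gpus = SUPERCHIPS * GPUS_PER_SUPERCHIP
hbm_per_gpu_gb = HBM3E_TOTAL_GB // gpus
coherent_total_gb = LPDDR5X_TOTAL_GB + HBM3E_TOTAL_GB

print(gpus)                # 4 GPUs
print(hbm_per_gpu_gb)      # 192 GB of HBM3e per GPU
print(coherent_total_gb)   # 1728 GB addressable across the NVLink-C2C domain
```

Because all six chips sit in one NVLink-C2C domain, the combined 1728 GB of LPDDR5X and HBM3e is accessible across the node rather than siloed per device.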

The GB200 NVL4 system is equipped with two M.2 22110/2280 connectors for PCIe 5.0 SSDs, eight bays for E1.S NVMe drives (PCIe 5.0), six slots for FHFL PCIe 5.0 x16 cards, a USB port, an RJ45 network connector (IPMI), and a Mini-DisplayPort interface. The device comes in a 2U form factor measuring 440 × 88 × 900 mm and weighs 45 kg. Its TDP is configurable from 2.75 kW to 5.5 kW.
