The NVIDIA Grace CPU is an Arm Neoverse-based processor designed for AI infrastructure and high-performance computing. It offers the highest performance and twice the memory bandwidth and energy efficiency of today’s leading server chips.
The NVIDIA Grace CPU Superchip is made up of two CPU chips that are linked together using NVLink-C2C, a new high-speed, low-latency chip-to-chip interconnect.
The Grace CPU Superchip is a companion to NVIDIA’s first CPU-GPU integrated module, the Grace Hopper Superchip, which was announced last year and is designed to run large-scale HPC and AI applications alongside an NVIDIA Hopper architecture-based GPU. Both superchips share the same underlying CPU architecture and the same NVLink-C2C interconnect.
The Grace CPU Superchip brings together the highest performance, high memory bandwidth, and NVIDIA’s software platforms in a single socket, positioning it as the CPU for AI infrastructure.
The Grace CPU Superchip packs 144 Arm cores into a single socket for industry-leading performance on the SPECrate 2017_int_base benchmark, with an estimated score of 740. As estimated in NVIDIA’s labs with the same class of compilers, this is more than 1.5x higher than the dual-CPU configuration shipping with the DGX A100.
The Grace CPU Superchip’s memory subsystem uses LPDDR5x memory with Error Correction Code (ECC) for the best balance of speed and power consumption, and it provides industry-leading energy efficiency and memory bandwidth. At 1 terabyte per second, the LPDDR5x subsystem delivers twice the bandwidth of traditional DDR5 designs while consuming significantly less power: the entire CPU, memory included, consumes only 500 watts.
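To put these figures side by side, the short Python sketch below derives two ratios from the numbers quoted above. The 1 TB/s and 500 W values come from the text. The function and constant names are illustrative, and the DDR5 number is only what "twice the bandwidth" implies, not a measured or published figure.

```python
# Figures quoted in the text, in decimal units (1 TB/s = 1000 GB/s).
GRACE_BANDWIDTH_GBPS = 1000.0  # LPDDR5x memory bandwidth in GB/s
GRACE_POWER_WATTS = 500.0      # power for the entire CPU plus memory, in W


def implied_ddr5_bandwidth_gbps(grace_gbps: float, ratio: float = 2.0) -> float:
    """Bandwidth a DDR5 design would have if Grace delivers `ratio` times as much.

    This is an inference from the "twice the bandwidth" claim, not a spec value.
    """
    return grace_gbps / ratio


def bandwidth_per_watt(gbps: float, watts: float) -> float:
    """Memory bandwidth per watt of total (CPU plus memory) power."""
    return gbps / watts


ddr5_gbps = implied_ddr5_bandwidth_gbps(GRACE_BANDWIDTH_GBPS)
gbps_per_watt = bandwidth_per_watt(GRACE_BANDWIDTH_GBPS, GRACE_POWER_WATTS)

print(f"Implied DDR5 baseline: {ddr5_gbps:.0f} GB/s")
print(f"Bandwidth per watt:    {gbps_per_watt:.1f} GB/s per W")
```

Because the 500 W figure covers the CPU and memory together, the per-watt ratio describes the whole superchip rather than the memory subsystem alone.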
The Grace CPU Superchip is built on Armv9, Arm’s most recent data center architecture. It combines the highest single-threaded core performance with support for Arm’s new generation of vector extensions, bringing immediate benefits to a wide range of applications.
NVIDIA’s computing software stacks, including NVIDIA RTX, NVIDIA HPC, NVIDIA AI, and Omniverse, will all run on the Grace CPU Superchip. Customers can configure servers with the Grace CPU Superchip and NVIDIA ConnectX-7 NICs as standalone CPU-only systems or as GPU-accelerated servers with one, two, four, or eight Hopper-based GPUs, allowing them to optimize performance for their specific workloads while maintaining a single software stack.
With its high performance, memory bandwidth, energy efficiency, and configurability, the Grace CPU Superchip will excel at the most demanding HPC, AI, data analytics, scientific computing, and hyperscale computing applications.
The Grace CPU Superchip’s 144 cores and 1 TB/s of memory bandwidth will give CPU-based high-performance computing applications unprecedented performance. HPC applications are compute-intensive, requiring the most powerful cores, the highest memory bandwidth, and the appropriate memory capacity per core to achieve the best results.
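As a rough illustration of how cores and bandwidth relate, the Python sketch below divides the quoted 1 TB/s across the 144 cores and estimates a lower bound on the time to stream a dataset once through memory. The core count and bandwidth come from the text. The 256 GB working set is a hypothetical value chosen only for the example, and the calculation assumes peak bandwidth is shared evenly, which real workloads rarely achieve.

```python
# Figures quoted in the text, in decimal units (1 TB/s = 1000 GB/s).
CORES = 144
BANDWIDTH_GBPS = 1000.0


def bandwidth_per_core_gbps(total_gbps: float, cores: int) -> float:
    """Average peak memory bandwidth available to each core, in GB/s."""
    return total_gbps / cores


def min_stream_time_seconds(working_set_gb: float, total_gbps: float) -> float:
    """Lower bound on the time to read a working set once at peak bandwidth."""
    return working_set_gb / total_gbps


per_core = bandwidth_per_core_gbps(BANDWIDTH_GBPS, CORES)

# Hypothetical working set size, not a figure from the article.
working_set_gb = 256.0
stream_time = min_stream_time_seconds(working_set_gb, BANDWIDTH_GBPS)

print(f"Bandwidth per core: {per_core:.2f} GB/s")
print(f"Minimum time to stream {working_set_gb:.0f} GB: {stream_time:.3f} s")
```

Bandwidth-bound HPC codes are limited by the second number rather than by core count, which is why the per-core share of memory bandwidth matters alongside raw core performance.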