Nvidia's new Rubin CPX GPU delivers 30 petaFLOPs compute and 128GB memory for inference


Nvidia announces Rubin CPX GPU with 128GB memory built for enterprise AI workloads
Vera Rubin NVL144 CPX rack delivers 8 exaFLOPs compute and 100TB fast memory
Shipments planned for late 2026 with Rubin Ultra and Feynman already on roadmap

Nvidia has announced a brand new GPU built on the Rubin architecture and designed for long-context AI workloads.

Rubin CPX, as it’s known, includes 128GB of GDDR7 memory, making it the company’s first GPU at that capacity.

There were rumors of a 128GB RTX gaming card, but this is 100% not that. This GPU is a compute engine aimed at inference in areas such as software development, research, and high-definition video generation. It will not be running Metal Gear Solid Delta: Snake Eater any time soon.

Vera Rubin NVL144 CPX rack

The GPU delivers up to 30 petaFLOPs of NVFP4 compute and integrates hardware attention acceleration that Nvidia says is three times faster than on GB300 NVL72 systems.

It also incorporates four NVENC and four NVDEC units to accelerate video workflows.

As part of Nvidia’s broader push toward disaggregated inference, Rubin CPX is designed to handle the compute-heavy context phase, while other Rubin GPUs and Vera CPUs address generation tasks.

By concentrating Rubin CPX on context processing, Nvidia aims to improve throughput while lowering the cost of deploying high-value inference workloads.

Nvidia’s Dynamo software will manage things behind the scenes, handling low-latency cache transfers and routing across components.
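To make the split between the context phase and the generation phase concrete, here is a toy Python sketch of a least-loaded router. It is not Dynamo's API and does not reflect how Nvidia implements routing. The Worker class, the route function, and the worker names are all hypothetical, chosen only to illustrate the idea of sending each request to one pool for context processing and another for generation.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Hypothetical stand-in for a device pool member (not a real Nvidia type)."""
    name: str
    role: str  # "context" or "generation"
    queue: list = field(default_factory=list)


def pick_least_loaded(workers, role):
    """Return the worker with the given role that has the shortest queue."""
    candidates = [w for w in workers if w.role == role]
    return min(candidates, key=lambda w: len(w.queue))


def route(request_id, workers):
    """Pair one request with a context worker and a generation worker."""
    ctx = pick_least_loaded(workers, "context")
    gen = pick_least_loaded(workers, "generation")
    ctx.queue.append(request_id)
    gen.queue.append(request_id)
    # A real system would move the cache produced during the context phase
    # over to the generation worker; this sketch only records the pairing.
    return ctx.name, gen.name


# Assumed toy topology: two context workers and one generation worker.
workers = [
    Worker("cpx-0", "context"),
    Worker("cpx-1", "context"),
    Worker("rubin-0", "generation"),
]
assignments = [route(i, workers) for i in range(3)]
print(assignments)
```

In this sketch the three requests alternate between the two context workers while sharing the single generation worker. A production scheduler would also have to weigh cache locality and transfer latency, which the toy version ignores.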

The company’s largest deployment model is the Vera Rubin NVL144 CPX rack. Each unit integrates 144 Rubin CPX GPUs, 144 Rubin GPUs, and 36 Vera CPUs.

Together they deliver 8 exaFLOPs of NVFP4 compute, 100TB of high-speed memory, and 1.7PB/s of memory bandwidth.
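For a sense of how much of those rack totals comes from the Rubin CPX GPUs alone, here is a quick back-of-the-envelope calculation in Python. It uses only the figures quoted in this article and makes two assumptions that Nvidia has not confirmed here: that the rack totals are simple sums across components, and that the units are decimal (1 exaFLOP = 1,000 petaFLOPs, 1TB = 1,000GB).

```python
# Figures quoted in the article.
CPX_GPUS_PER_RACK = 144
CPX_PFLOPS_EACH = 30        # NVFP4, quoted as "up to"
CPX_MEMORY_GB_EACH = 128    # GDDR7
RACK_TOTAL_EFLOPS = 8
RACK_TOTAL_MEMORY_TB = 100

# Contribution of the Rubin CPX GPUs alone (decimal units assumed).
cpx_eflops = CPX_GPUS_PER_RACK * CPX_PFLOPS_EACH / 1000
cpx_memory_tb = CPX_GPUS_PER_RACK * CPX_MEMORY_GB_EACH / 1000

# Assumption: rack totals are plain sums, so the remainder would come
# from the other components in the rack.
other_eflops = RACK_TOTAL_EFLOPS - cpx_eflops
other_memory_tb = RACK_TOTAL_MEMORY_TB - cpx_memory_tb

print(f"Rubin CPX compute: {cpx_eflops:.2f} EF, remainder: {other_eflops:.2f} EF")
print(f"Rubin CPX memory: {cpx_memory_tb:.1f} TB, remainder: {other_memory_tb:.1f} TB")
```

Under those assumptions, the 144 Rubin CPX GPUs would account for roughly 4.32 exaFLOPs of the 8 exaFLOPs and about 18.4TB of the 100TB, with the rest coming from the other Rubin GPUs and Vera CPUs in the rack.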

Quantum-X800 InfiniBand or Spectrum-X Ethernet with ConnectX-9 SuperNICs provide the connectivity.

Shipments of Rubin CPX and the NVL144 CPX racks are currently penciled in for late 2026, following the recent tape-out at TSMC.

Nvidia’s roadmap includes Rubin Ultra, now expected in 2027, and Feynman, slated for 2028.

Those designs will extend the Rubin architecture with higher density modules, HBM4E memory, and faster networking.

Via Videocardz

Nvidia Vera Rubin NVL144 CPX rack and tray — (Image credit: Nvidia)

Wayne Williams is a freelancer writing news for TechRadar Pro. He has been writing about computers, technology, and the web for 30 years. In that time he wrote for most of the UK’s PC magazines, and launched, edited and published a number of them too.
