
Nvidia abandoned its planned quad-die Rubin Ultra GPU architecture in favor of a dual-die design due to manufacturing execution concerns.

The change signals advanced-packaging constraints in Nvidia's GPU roadmap and may affect timelines for H200/B200 successors and near-term AI accelerator supply.

Trade press · Slicast · July 1, 2026 · Global · Source: Tom's Hardware


In a bid to offer unbeatable performance, Nvidia had planned to use four GPU chiplets in its Rubin Ultra AI accelerator, due in 2027. However, because of concerns about the manufacturability of such a solution, the company canceled that design in favor of a dual-chiplet one that is easier to produce, according to SemiAnalysis.

The Rubin Ultra GPU with four compute chiplets was arguably one of Nvidia's most ambitious projects in recent years. It would have doubled the performance of the original Rubin, which uses two compute chiplets, while also raising the complexity of Nvidia's data center GPUs to unprecedented levels. However, connecting four near-reticle-sized dies using existing advanced packaging technologies presented a tremendous engineering challenge, and cooling four complex dies and 16 HBM4E modules proved both difficult and costly. As a result, citing "manufacturing execution concerns," Nvidia reportedly canceled the four-compute-die design in favor of one with two compute chiplets.

The revised Rubin Ultra would be approximately half as powerful as the originally planned configuration, which could make it less competitive against AMD's Instinct MI500-series offerings. However, Nvidia is likely to optimize the dual-chiplet Rubin Ultra design to extract additional performance and justify the upgrade. Notably, the revised Rubin Ultra uses HBM4E memory instead of the HBM4 used by the original Rubin. Starting with Rubin GPUs, Nvidia plans to offer liquid-cooled Kyber rack-scale systems that increase GPU count per scale-up domain to at least 144 packages, thereby increasing overall compute performance.

The reduction in HBM modules has broader market implications. The canceled four-chiplet design would have used 16 HBM4E packages, while the revised dual-chiplet version will use only eight, potentially affecting the HBM market overall. The dual-chiplet configuration will also be less expensive than the original design. Since Nvidia focuses primarily on selling rack-scale solutions rather than individual GPUs, the ultimate impact on customer spending remains uncertain: customers who buy more systems to reach the same total compute may spend more overall than they would have on fewer systems with more chiplets per GPU.
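The trade-off above can be sketched as back-of-the-envelope arithmetic. This minimal Python sketch uses only the per-package figures reported here (four chiplets with 16 HBM4E packages versus two chiplets with eight). The `PackageConfig` class, the helper functions, and the target of 576 compute chiplets (144 packages of the canceled design) are hypothetical and purely illustrative. It also assumes that every chiplet delivers the same performance in either design, which the report does not confirm.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageConfig:
    """Per-package figures as reported; the class itself is illustrative."""
    name: str
    compute_chiplets: int
    hbm_stacks: int


QUAD = PackageConfig("quad-die (canceled)", compute_chiplets=4, hbm_stacks=16)
DUAL = PackageConfig("dual-die (revised)", compute_chiplets=2, hbm_stacks=8)


def packages_for_chiplets(cfg: PackageConfig, target_chiplets: int) -> int:
    """GPU packages needed to reach a target number of compute chiplets."""
    return math.ceil(target_chiplets / cfg.compute_chiplets)


def hbm_for_chiplets(cfg: PackageConfig, target_chiplets: int) -> int:
    """Total HBM stacks across all packages needed for the target."""
    return packages_for_chiplets(cfg, target_chiplets) * cfg.hbm_stacks


# Hypothetical target: the chiplet count of 144 quad-die packages.
TARGET_CHIPLETS = 144 * QUAD.compute_chiplets

if __name__ == "__main__":
    for cfg in (QUAD, DUAL):
        pkgs = packages_for_chiplets(cfg, TARGET_CHIPLETS)
        hbm = hbm_for_chiplets(cfg, TARGET_CHIPLETS)
        print(f"{cfg.name}: {pkgs} packages, {hbm} HBM stacks")
```

Under these assumptions, both designs carry four HBM stacks per compute chiplet, so matching the same chiplet count takes twice as many dual-die packages and the same total number of HBM stacks. HBM demand only halves if the number of packages deployed stays fixed.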
