
Nvidia Vera Rubin computing architecture redefines GPU design for efficiency and scale.

Vera Rubin roadmap signals Nvidia's long-term shift toward heterogeneous compute; maintains roadmap credibility.

Trade press · Slicast · July 1, 2026 · US · Source: Google News


Agentic AI is driving key technology providers to rethink the computing architecture required to run rapidly expanding autonomous systems. In June, CoreWeave Inc. and Nvidia Corp. announced the first bring-up and validation of Nvidia Vera Rubin NVL72 on CoreWeave Cloud. The system represents a fundamentally different approach to infrastructure, designed for workloads that reason continuously, scale unpredictably and operate in production around the clock.

The system comprises 72 Rubin GPUs, 36 Vera CPUs, and 260 terabytes per second of NVLink 6 bandwidth inside a single rack—more data bandwidth than is used by the entire global internet, according to Chen Goldberg, executive vice president of product and engineering at CoreWeave. "The world is shifting from asking AI questions to having AI actually do things continuously at scale without stopping," Goldberg said. "Agents are writing code, running experiments and executing multi-step reasoning loops. This is exactly what Vera Rubin was architected for."
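To put the rack-level figures in per-device terms, the arithmetic can be sketched as below. This is only a back-of-the-envelope calculation: it assumes the quoted 260 terabytes per second is an aggregate that divides evenly across the 72 GPUs, which the announcement does not state, and the function name is illustrative.

```python
# Figures quoted in the article for one Vera Rubin NVL72 rack.
RACK_NVLINK_TBPS = 260  # aggregate NVLink 6 bandwidth, terabytes per second
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36


def per_gpu_bandwidth_tbps(rack_tbps: float, gpus: int) -> float:
    """Assumption: the aggregate bandwidth is split evenly across GPUs."""
    return rack_tbps / gpus


per_gpu = per_gpu_bandwidth_tbps(RACK_NVLINK_TBPS, GPUS_PER_RACK)
gpus_per_cpu = GPUS_PER_RACK // CPUS_PER_RACK

print(f"{per_gpu:.2f} TB/s per GPU, {gpus_per_cpu} GPUs per CPU")
# prints "3.61 TB/s per GPU, 2 GPUs per CPU"
```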

Vera Rubin NVL72 is designed to support large-scale inference, persistent reasoning sessions and production AI workloads that require more than raw GPU density. Nvidia's chip architecture has allowed CoreWeave to provide several system-level solutions around it, including liquid cooling, rack control, networking and secure multi-tenant operations.

CoreWeave's liquid cooling solution, Valvey, monitors flow rate, temperature and pressure and detects leaks in real time. "We can control a single valve at a sub-second timescale," Goldberg explained. "If we detect any leak, we take action immediately."
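The kind of decision such a cooling controller makes on each reading can be sketched roughly as follows. This is not CoreWeave's implementation: the class names, the `decide_valve_action` function, the action strings and every threshold value are hypothetical placeholders chosen for illustration. The only behavior taken from the article is that a detected leak triggers immediate action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoolantReading:
    """One sample from a (hypothetical) coolant loop sensor set."""
    flow_lpm: float        # flow rate, litres per minute
    temperature_c: float   # coolant temperature, degrees Celsius
    pressure_kpa: float    # loop pressure, kilopascals
    leak_detected: bool


@dataclass(frozen=True)
class Limits:
    """Placeholder thresholds; real values would come from the hardware vendor."""
    min_flow_lpm: float = 20.0
    max_temperature_c: float = 45.0
    max_pressure_kpa: float = 300.0


def decide_valve_action(reading: CoolantReading, limits: Limits = Limits()) -> str:
    """Return 'close', 'open_wider' or 'hold' for a single valve."""
    # A leak takes priority over every other signal.
    if reading.leak_detected:
        return "close"
    # Over-pressure is also treated as a reason to shut the valve.
    if reading.pressure_kpa > limits.max_pressure_kpa:
        return "close"
    # Too little flow or too much heat: ask for more coolant.
    if (reading.flow_lpm < limits.min_flow_lpm
            or reading.temperature_c > limits.max_temperature_c):
        return "open_wider"
    return "hold"


print(decide_valve_action(CoolantReading(30.0, 35.0, 200.0, leak_detected=True)))
# prints "close"
```

A production controller would run this check in a loop at the sub-second cadence Goldberg describes; the sketch covers only the per-sample decision.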

A new unified rack control appliance, Racky, aggregates power, cooling and environmental sensors into a standardized management surface. This allows each Vera Rubin rack to be managed as a cloud resource rather than a custom one-off build, according to Peter Salanki, chief technology officer of CoreWeave. "It takes in telemetry from the GPUs themselves, telemetry from the power systems, leak sensors and the building management system, and ties all these things together," Salanki said.
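A minimal sketch of what tying several telemetry feeds into one management surface could look like is shown below. The function name, the source labels and the metric names are all assumptions made for illustration. The article names the four telemetry sources but does not describe Racky's data model.

```python
from typing import Dict, Mapping, Union

Value = Union[float, bool, str]


def aggregate_rack_telemetry(
    rack_id: str,
    sources: Mapping[str, Mapping[str, Value]],
) -> Dict[str, Value]:
    """Flatten per-source readings into one record keyed as 'source.metric'."""
    merged: Dict[str, Value] = {"rack_id": rack_id}
    for source_name, readings in sources.items():
        for metric, value in readings.items():
            merged[f"{source_name}.{metric}"] = value
    return merged


# Hypothetical sample readings from the four sources named in the article.
record = aggregate_rack_telemetry(
    "rack-001",
    {
        "gpu": {"max_temperature_c": 71.0},
        "power": {"draw_kw": 118.5},
        "leak_sensors": {"any_leak": False},
        "building_management": {"supply_water_c": 24.0},
    },
)
print(record["power.draw_kw"])
# prints "118.5"
```

Namespacing each metric by its source keeps readings with the same name, such as temperatures, from overwriting one another in the merged record.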

With multiple CPUs and GPUs in a single rack, communication becomes particularly important. The announcement includes multi-rail and multi-plane networking with support for both Nvidia Quantum-X800 InfiniBand and Nvidia Spectrum-X Ethernet with RDMA over Converged Ethernet (RoCE). "The genius of this rack scale system is that it allows you to scale memory, compute and all the fabrics so that GPU number 1 can talk to GPU number 72 at the exact same speed," said Dion Harris, product leader at Nvidia. "This gives you a very consistent, reliable way to scale your workload across the entire rack."

CoreWeave is leveraging Nvidia BlueField-4 DPUs (data processing units) to enable secure, multi-tenant AI cloud operations with faster data access and lower latency. BlueField-4 allows tenants to run workloads across the full Vera Rubin computing platform while preserving control and security. "We brought it all together and went through the validation flow to make sure that the Vera CPUs work with the Rubin chips, with ConnectX NICs, and with the BlueField-4 DPUs," said Harshdeep Banwait, director of product at CoreWeave.

To implement the rack-scale platform, CoreWeave drew from its partner ecosystem. Dell provided the architectural backbone through its high-performance PowerEdge XE9812 servers. "As AI models expand at trillion-parameter scale and context windows encompass millions of tokens, compute density is going to grow in importance," according to Ihab Tarazi, senior vice president and chief technology officer of Dell. "All the new models that really matter are trillion parameter models. They no longer fit on an 8-way GPU standard server. They really need those NVL72 GPU systems."
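Tarazi's point about 8-way servers can be illustrated with rough arithmetic. The numbers below are assumptions, not figures from the article: weights stored at 2 bytes per parameter (16-bit precision) and an illustrative 80 GB of memory per GPU. The estimate also ignores activations and key-value cache, which would add to the total.

```python
def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def fits_in_server(params: float, bytes_per_param: float,
                   gpus: int, gb_per_gpu: float) -> bool:
    """True if the weights fit in the combined GPU memory of one server."""
    return weight_memory_gb(params, bytes_per_param) <= gpus * gb_per_gpu


# Assumed inputs: 1 trillion parameters, 2 bytes each, 8 GPUs x 80 GB.
weights_gb = weight_memory_gb(1e12, 2)
server_gb = 8 * 80

print(f"weights: {weights_gb:.0f} GB, server: {server_gb} GB, "
      f"fits: {fits_in_server(1e12, 2, 8, 80)}")
# prints "weights: 2000 GB, server: 640 GB, fits: False"
```

Different precision or per-GPU memory assumptions change the margin, so the function takes them as parameters rather than fixing them.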

CoreWeave's validation underscores the need for inference performance that can support agentic AI in production. The inference market has grown rapidly in recent years, presenting an opportunity for Vera Rubin to support both training and massive-scale inference at cost and performance levels that were previously out of reach. "Brand new workloads are coming to life as part of it," said Corey Sanders, senior vice president of product at CoreWeave.

These new workloads are being driven by increased adoption of agentic AI, and CoreWeave has taken a series of actions to build cloud infrastructure that supports it. This has included the acquisition of AI model development firm Weights & Biases Inc. in 2025. New Weights & Biases agentic AI tools have been added to the CoreWeave platform since then. "For the first time, there is an agent inside Weights & Biases that helps AI users train models and build AI applications," said Shawn Lewis, founder and chief technology officer of Weights & Biases at CoreWeave. "W&B Launch connects the Weights & Biases toolkit back to infrastructure, allowing agents to launch experiments onto CoreWeave infrastructure instead of requiring a human to do so."
