Data Centers · Report

AI workload surge has exposed switch fabric as the critical network bottleneck, leaving GPUs idle despite massive capex.

Switch capacity, not GPU count, now constrains data center throughput, forcing a rearchitecture of interconnect topology and higher networking spend in hyperscaler builds.

Trade press · Slicast · July 3, 2026 · Global · Source: Data Center Knowledge


The rapid growth of AI workloads is exposing a critical bottleneck in infrastructure: the network. While compute power has surged, the ability of networks to keep pace has not. This imbalance is leaving some of the world's most advanced chips underutilized, driving up costs and energy consumption.

A study of Model FLOPs Utilization (MFU) shows AI labs achieving just 35–40% utilization on Nvidia H100s during trillion-parameter training runs. The world's most expensive chips sit idle more than half the time, waiting for data to arrive over the network.
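As a rough illustration of what an MFU figure in that range means, the sketch below computes MFU as achieved model FLOP/s divided by the cluster's aggregate peak FLOP/s. The rule of roughly 6 FLOPs per parameter per token is a common approximation for dense transformer training. The cluster size, token throughput, and per-GPU peak are round hypothetical numbers chosen for illustration, not figures from the study.

```python
def model_flops_utilization(params, tokens_per_second, num_gpus, peak_flops_per_gpu):
    """Return achieved model FLOP/s divided by the cluster's aggregate peak FLOP/s."""
    # Approximation: ~6 FLOPs per parameter per token (forward + backward pass).
    achieved_flops = 6 * params * tokens_per_second
    peak_flops = num_gpus * peak_flops_per_gpu
    return achieved_flops / peak_flops


# Hypothetical run: a 1-trillion-parameter model on 10,000 GPUs,
# each assumed to peak at 1e15 FLOP/s (a round illustrative figure).
mfu = model_flops_utilization(
    params=1e12,
    tokens_per_second=625_000,
    num_gpus=10_000,
    peak_flops_per_gpu=1e15,
)
unused_fraction = 1 - mfu
print(f"MFU: {mfu:.1%}, unused peak compute: {unused_fraction:.1%}")
```

With these assumed inputs the script reports an MFU of 37.5%, leaving 62.5% of the cluster's peak compute unused.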

The network fabric connecting compute has become the binding constraint on what AI systems can actually achieve. The architectural decisions being made now—stitching together components designed in isolation—will determine the cost, energy efficiency, and competitive viability of future AI infrastructure.

AI training workloads are already moving beyond 400 Gb/s to 800 Gb/s, with 1.6 Tb/s line rates on the near-term roadmap. But raw link speed is only part of the problem. As clusters scale to thousands of GPUs, the challenge shifts from connection speed to how efficiently the switching fabric coordinates data movement across all nodes. This is a fundamentally harder engineering problem. Network technology must reach the 1.6 Tb/s line rate by 2027; vendors that miss that window will find the ecosystem routing around them.

This is why networking's share of data center capex is rising from roughly 5–10% today toward 15–20% by 2030. Networking is now a primary cost driver, not infrastructure overhead.

The instinctive response (faster transceivers, denser cables, higher line rates) does not solve the underlying problem. As per-link bandwidth increases, so do the demands on every switching node. A switch that was marginal at 400 Gb/s becomes a hard ceiling at 800 Gb/s. Upgrading the interconnect simply exposes the switching layer as the component holding back the whole system.

Building cluster scale purely through point-to-point interconnects to route around the switching layer multiplies laser sources required, drives power consumption up nonlinearly, and compounds complexity with every node added. The switch is unavoidable. The only question is whether it performs well enough to no longer be the bottleneck.
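The scaling argument can be made concrete with a toy link count. This is a minimal sketch under simplifying assumptions: a full mesh connects every node directly to every other node, and the switched fabric is an idealized non-blocking design with one link per node per switching tier. The function names and the cluster sizes are hypothetical, and the model counts links only, not lasers or watts.

```python
def full_mesh_links(nodes):
    """Links needed to connect every node directly to every other node."""
    return nodes * (nodes - 1) // 2


def switched_links(nodes, tiers=2):
    """Links in an idealized non-blocking switched fabric: one per node per tier."""
    return nodes * tiers


# Compare how the two approaches grow with cluster size.
for n in (8, 64, 1024):
    print(f"{n:>5} nodes: mesh={full_mesh_links(n):>7}, switched={switched_links(n):>5}")
```

At 1,024 nodes the mesh needs 523,776 links against 2,048 for the two-tier fabric, which is why point-to-point scaling becomes impractical well before cluster sizes reach the thousands.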

The AI infrastructure stack evolved as independently optimized components—accelerators, transceivers, interconnects, switches—each developed to its own performance envelope, then handed to architects expected to make them work together. The result is overengineering and wasted capacity. Network fabrics are specified for generic workloads that match no actual deployment.

The industry is trying to solve a system-level problem with component-level thinking. Wasted switching capacity leads to underutilized compute and power budgets that, by design, exceed actual needs. Trillion-dollar investments end up substantially less productive than they should be.

Closing the gap between raw compute and delivered performance demands a different starting point. Rather than assembling a fabric from the best available parts, AI network architecture must begin with the workload and reason backward to the switching, interconnect, and interposer design that actually serves it. Three things follow:

Co-optimization across the stack. The interposer, interconnect, and switching layer are not independent variables. The network's performance envelope is set by how these layers interact; gains in one are routinely offset by constraints in another.

Architecture-specific design. AI training, inference, and HPC workloads have fundamentally different traffic patterns, latency tolerances, and bandwidth utilization profiles. A reference architecture for training looks materially different from one designed for inference; generic designs serve neither well.

Photonic packet-level, reconfigurable switching. Electronic packet switches hit hard limits at scale: power dissipation grows, latency floors don't move, and silicon hits physical constraints. Photonic switching offers a path through these limits, but architecture matters as much as medium. Circuit switching suits predictable, long-duration flows. AI traffic is asymmetric, shifting dynamically between training and inference. Photonic circuit switching cannot reconfigure quickly enough to avoid the idle periods that defeat the purpose of optical systems. Packet-level, reconfigurable photonic architectures solve this, preserving the low latency, high bandwidth, and energy efficiency of optical media.
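The reconfiguration argument in the third point can be sketched with a simple duty-cycle model. It assumes, purely for illustration, that a circuit must be re-pointed after every flow and carries no traffic while it reconfigures. The 10 ms reconfiguration time and the flow durations are hypothetical values, not measurements of any product.

```python
def circuit_utilization(flow_duration_s, reconfig_time_s):
    """Fraction of time a circuit carries traffic if it is re-pointed after every flow."""
    return flow_duration_s / (flow_duration_s + reconfig_time_s)


RECONFIG_TIME_S = 10e-3  # assumed 10 ms reconfiguration time, illustrative only

# Long, predictable flows amortize the reconfiguration cost; short bursts do not.
for flow_s in (10.0, 0.1, 0.001):
    utilization = circuit_utilization(flow_s, RECONFIG_TIME_S)
    print(f"{flow_s:>7.3f} s flow -> {utilization:.1%} utilized")
```

Under these assumptions a 10-second flow keeps the circuit busy more than 99% of the time, while a 1-millisecond burst leaves it idle for most of each cycle.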

Nvidia made networking one of its biggest divisions for good reason: returns on compute depend on whether the network can deliver data at the required speed, without excess latency, congestion, or underutilization. The switching layer must be a first-class design input. Existing switching architectures were built for a different era, and AI traffic patterns have already outgrown them.

The winners will be those who design from the workload outward, adopting architectures that match the nature of AI traffic as it continues to grow. The rest will continue paying for compute they can't use.
