Chips & Hardware · Report

Panmnesia boosts CXL scale with fabric switching; Meta repurposes old DRAM using CXL.

CXL (Compute Express Link) fabric scales memory tiering; enables hyperscalers to extend memory hierarchy and optimize power/cost ratio.

Trade press · Slicast · June 27, 2026 · Global · Source: Blocks & Files


Panmnesia, a Korean fabless semiconductor company, is advancing combined CXL and UAL/Ethernet technology with a fusion chip integrating PCIe and CXL as a step toward CXL/UAL unification. Simultaneously, Meta is leveraging CXL to attach old DRAM recycled from decommissioned servers to newer systems, boosting memory capacity and system performance.

CXL (Compute Express Link) extends the PCIe bus outside a server's chassis. It was initially based on the PCIe 5.0 standard and is now absorbing PCIe generations 6 and 7. The fundamental challenge with CXL switches has been a persistent perception that placing a switch between the CPU and devices makes it difficult to meet the memory-access latency these systems require. As a result, directly attached Multi-Headed Devices (MHDs) have remained the standard despite their scalability limitations.

Panmnesia CEO Myoungsoo Jung stated: "There has been a perception that putting a switch between the CPU and devices makes it hard to meet the memory-access latency these systems expect, so directly attached MHDs stayed the norm even though they were harder to scale. Our work shows this is not an inherent limit of CXL or CXL switches—it is a trait of early-stage CXL, and one that fades as the standard and the products around it mature. With a fabric switch that carries our next-stage CXL controller, scalability, low latency, and stable performance can come together."

Panmnesia's next-stage CXL controller pairs a latency-optimized design with Port-Based Routing (PBR), which forwards data using port identifiers assigned to each device. This contrasts with Hierarchy-Based Routing (HBR), used in early PCIe and CXL designs, which constrains devices to tree-like hierarchies. PBR enables mesh interconnections, the woven-cloth topology from which "fabric" switching derives its name. The new switch supports both PBR and HBR.
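The routing contrast can be illustrated with a minimal Python sketch. It is not Panmnesia's implementation and does not follow the CXL specification's table formats. The topology, the node names (`host0`, `sw1`, `sw2`, `root`, `mem0`), and the helpers `build_tables` and `route` are all hypothetical. The sketch assumes a fabric manager that precomputes, for each destination identifier, the next hop at every node. A tree-shaped topology stands in for HBR, and the same topology plus one switch-to-switch link stands in for a PBR mesh.

```python
from collections import deque

# Hypothetical topologies: a strict tree (HBR-style) and the same nodes
# with one extra switch-to-switch link (a mesh, which PBR permits).
TREE_LINKS = [("root", "sw1"), ("root", "sw2"), ("sw1", "host0"), ("sw2", "mem0")]
MESH_LINKS = TREE_LINKS + [("sw1", "sw2")]


def shortest_next_hops(links, dest):
    """Breadth-first search from dest; returns {node: next hop toward dest}."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    next_hop = {dest: None}
    queue = deque([dest])
    while queue:
        node = queue.popleft()
        for nbr in sorted(adj[node]):  # sorted for deterministic output
            if nbr not in next_hop:
                next_hop[nbr] = node
                queue.append(nbr)
    return next_hop


def build_tables(links, endpoints):
    """Forwarding tables keyed by destination ID (assumed fabric-manager role)."""
    return {dest: shortest_next_hops(links, dest) for dest in endpoints}


def route(tables, src, dest):
    """Forward hop by hop, looking up only the destination ID at each node."""
    path = [src]
    while path[-1] != dest:
        path.append(tables[dest][path[-1]])
    return path


tree_path = route(build_tables(TREE_LINKS, ["mem0"]), "host0", "mem0")
mesh_path = route(build_tables(MESH_LINKS, ["mem0"]), "host0", "mem0")
print(tree_path)  # ['host0', 'sw1', 'root', 'sw2', 'mem0']
print(mesh_path)  # ['host0', 'sw1', 'sw2', 'mem0']
```

In the tree, traffic between the two leaf switches has to climb through the root. With destination-keyed lookups, the direct `sw1`-`sw2` link becomes usable and the path is one hop shorter.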

The controller reduces latency through architectural changes. Early CXL designs typically maintained separate buffers at each layer and managed timing independently, which imposed significant synchronization overhead. Panmnesia's controller instead shares buffers across layers, eliminating much of that penalty, and applies per-layer optimizations to reduce latency further. Testing demonstrated stable performance when scaling to as many as 64 nodes, a substantial expansion over the usual approach of directly attaching MHDs to CPUs.

Panmnesia has released pre-production PCIe 6.4-CXL 3.2 Fusion Switch chips and has advanced the controller with CXL 4.0 features, now available as the PCIe 7.0-CXL 4.0 Combo IP. The company presented this work alongside a paper at ISCA 2026 titled "A Silicon-Proven Unified Low-Latency CXL Controller and Port-Based Routing Switch for Memory-Centric Fabrics."

Meta's parallel effort, Vistara, addresses a complementary challenge. The technology recycles DDR4 DIMMs from decommissioned servers, attaching them as expanded memory to newer systems via CXL. This approach delivers "near zero-cost memory expansion through recycling, performance gains from higher memory capacity, and a reduced carbon footprint." However, CXL adoption has been limited by low bandwidth, high latency, and high runtime overhead—Meta's expanded memory in production exhibits approximately 10 times lower bandwidth and approximately 60 percent higher latency than local memory. Most existing CXL solutions bundle DRAM with the controller, preventing DIMM reuse, and often lack DDR4 support.
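A back-of-the-envelope calculation shows why the latency gap makes page placement matter. The sketch below is a simplified model, not Meta's methodology. It takes the roughly 60 percent latency penalty from the figures above, while the 100 ns local latency and the function name `blended_latency_ns` are assumptions chosen for illustration. It ignores the bandwidth gap entirely.

```python
def blended_latency_ns(local_ns, expanded_fraction, penalty=0.60):
    """Average access latency when a fraction of accesses hit expanded memory.

    penalty=0.60 reflects the ~60 percent higher latency cited in the text;
    local_ns is an assumed input, not a figure from the source.
    """
    if not 0.0 <= expanded_fraction <= 1.0:
        raise ValueError("expanded_fraction must be between 0 and 1")
    expanded_ns = local_ns * (1.0 + penalty)
    return (1.0 - expanded_fraction) * local_ns + expanded_fraction * expanded_ns


# Assumed 100 ns local latency; sweep the share of accesses served by CXL memory.
for fraction in (0.0, 0.2, 0.5):
    print(fraction, round(blended_latency_ns(100.0, fraction), 1))
```

Under these assumptions, sending 20 percent of accesses to expanded memory raises average latency by about 12 percent, and sending half raises it by about 30 percent. That is why keeping frequently accessed pages in local memory is central to the approach.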

Meta's hardware-software co-design addresses these constraints. On the hardware side, the team developed an in-house CXL ASIC, Vistara, optimized for DRAM reuse, power efficiency, and low latency. On the software side, they built an optimized solution leveraging Transparent Page Placement (TPP), determining appropriate local-to-expanded memory ratios for each workload and automating per-workload configuration, including disabling expanded memory for latency-sensitive applications.
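The per-workload configuration described above can be sketched in a few lines of Python. This is an illustration only. It is not TPP, which operates inside the Linux kernel's memory management, and it is not Meta's configuration system. The `WorkloadPolicy` fields, the `place_pages` helper, and the access counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadPolicy:
    """Hypothetical per-workload settings."""
    latency_sensitive: bool  # if True, expanded memory is disabled
    expanded_share: float    # fraction of pages allowed on expanded memory


def place_pages(access_counts, policy):
    """Split page IDs into (local, expanded) lists; the hottest pages stay local."""
    # Sort by descending access count, then by page ID for stable ties.
    pages = sorted(access_counts, key=lambda p: (-access_counts[p], p))
    if policy.latency_sensitive:
        return pages, []
    n_expanded = int(len(pages) * policy.expanded_share)
    split = len(pages) - n_expanded
    return pages[:split], pages[split:]


# Made-up access counts for four pages.
counts = {"a": 90, "b": 5, "c": 40, "d": 1}

cache_policy = WorkloadPolicy(latency_sensitive=False, expanded_share=0.5)
print(place_pages(counts, cache_policy))    # (['a', 'c'], ['b', 'd'])

trading_policy = WorkloadPolicy(latency_sensitive=True, expanded_share=0.5)
print(place_pages(counts, trading_policy))  # (['a', 'c', 'b', 'd'], [])
```

The design choice this mirrors is that the ratio of local to expanded memory is a property of the workload rather than of the server, with an opt-out for applications that cannot tolerate the added latency.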

The results are significant. Meta's solution achieved up to a 25 percent reduction in server count for disaggregated ML inference and a 29 percent reduction in average latency for distributed caches. Meta presented this work at ISCA 2026 in a paper titled "Vistara: Making CXL Real—Full Path from ASIC Design and IS Support to Hyperscale Deployment."

Both papers were presented in the ISCA 2026 Industry Session on June 29. The conference was held in Raleigh, North Carolina, from June 27 to July 1.

Panmnesia partners can request pre-release chips and pilot systems for the PCIe 6.4-CXL 3.2 Fusion Switch and the PCIe 7.0-CXL 4.0 Combo IP by contacting sales@panmnesia.com.
