Data Centers · Report

China's domestically developed LineShine supercomputer captured the #1 position on the TOP500 list with 2.198 ExaFLOPS of sustained FP64 performance, becoming the first all-CPU system to exceed 2 ExaFLOPS and reclaiming global supercomputing leadership for China after 9 years.

Geopolitical shift in the compute hierarchy: China surpasses the U.S. (El Capitan) in FP64 performance, reshaping global assumptions about AI infrastructure geography and potential supply-chain vulnerabilities.

Trade press · Slicast · June 24, 2026 · Global · Source: Tom's Hardware

importance 93

China's LineShine supercomputer has dethroned El Capitan as the world's number one supercomputer, going straight to the top of the TOP500 list after the National Supercomputing Center in Shenzhen submitted its results. LineShine reached 2.198 ExaFLOPS of FP64 performance in the Linpack benchmark, making it the first machine on the TOP500 list to sustain more than 2 ExaFLOPS of double-precision performance using only CPUs. The system is deployed at the center and was built by the Shenzhen Cloud Computing Center using semi-custom 304-core LX2 processors that are based on the Armv9 instruction set architecture and run at 1.55 GHz. The machine has 13.79 million cores in total, uses a proprietary LingQi interconnect, and consumes 42.2 MW of power.

In terms of performance per watt, LineShine delivers 52.07 GFLOPS/W, which is below El Capitan's 60.94 GFLOPS/W. However, LineShine is far more efficient than Fugaku, another CPU-only supercomputer that held the number one spot several years ago, which delivers only 14.78 to 16.84 GFLOPS/W depending on whether the run is optimized for efficiency.
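The efficiency figure follows directly from the sustained performance and power draw quoted above, so it can be sanity-checked with a few lines of Python. The sketch below is only illustrative: the helper name and dictionary labels are made up for this example, and the inputs are the rounded values from the article, which is presumably why the recomputed result lands close to, but not exactly on, the published 52.07 GFLOPS/W.

```python
# Efficiency figures recomputed from the numbers quoted in the article.
# Helper and label names are illustrative, not taken from any official tool.

def gflops_per_watt(rmax_eflops: float, power_mw: float) -> float:
    """Convert sustained EFLOPS and megawatts into GFLOPS per watt."""
    gflops = rmax_eflops * 1e9   # 1 EFLOPS = 1e9 GFLOPS
    watts = power_mw * 1e6       # 1 MW = 1e6 W
    return gflops / watts

# Recomputed from the rounded Linpack result and power draw in the article.
lineshine_recomputed = gflops_per_watt(2.198, 42.2)

# Published efficiency figures as quoted in the article (GFLOPS/W).
published = {
    "LineShine": 52.07,
    "El Capitan": 60.94,
    "Fugaku (high end of range)": 16.84,
    "Fugaku (low end of range)": 14.78,
}

print(f"LineShine recomputed: {lineshine_recomputed:.2f} GFLOPS/W")
for name, value in published.items():
    if name == "LineShine":
        continue
    ratio = published["LineShine"] / value
    print(f"LineShine efficiency relative to {name}: {ratio:.2f}x")
```

A ratio below 1 means LineShine is less efficient than the system it is compared with, and a ratio above 1 means it is more efficient.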

LineShine also moved to the top of the HPCG ranking with 22.00 PFLOPS. In the mixed-precision HPL-MxP benchmark, however, the supercomputer achieved 7.92 EFLOPS, which puts it behind El Capitan, Frontier, and Aurora. That limits LineShine's appeal for AI training and inference, a trade-off offset by its exceptional performance on traditional supercomputing workloads.

Each LX2 CPU relies on two compute chiplets and has a total of 304 CPU cores organized into eight CPU clusters containing 38 cores each. Every core includes Arm SVE (Scalable Vector Extension) and SME (Scalable Matrix Extension) units that accelerate vector and matrix operations used in AI training and scientific computing, supporting FP64, FP32, BF16, FP16, and INT8 data formats. The chip features an unusual memory architecture that pairs 32 GB of on-package HBM, offering up to 4 TB/s of bandwidth, with as much as 256 GB of external DDR5 memory to maximize both bandwidth and capacity.
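The core topology and memory figures above lend themselves to a quick back-of-the-envelope check. The sketch below assumes decimal units (1 TB = 1000 GB) and assumes that every core in the 13.79 million total reported for the system is an LX2 core, which the article does not confirm. The derived processor count and per-core bandwidth are therefore rough estimates, not published specifications.

```python
# Back-of-the-envelope figures for the LX2 as described in the article.
# Assumptions (not confirmed by the source): decimal units, and every core
# in the 13.79 million system total belongs to an LX2 processor.

CLUSTERS_PER_CPU = 8
CORES_PER_CLUSTER = 38
HBM_GB = 32
HBM_BANDWIDTH_TB_PER_S = 4.0   # "up to" figure, so treat it as a ceiling
DDR5_GB_MAX = 256
TOTAL_CORES = 13.79e6

# Eight clusters of 38 cores give the quoted 304 cores per processor.
cores_per_cpu = CLUSTERS_PER_CPU * CORES_PER_CLUSTER

# Rough processor count implied by the system-wide core total.
approx_cpu_count = TOTAL_CORES / cores_per_cpu

# Peak HBM bandwidth available per core if shared evenly, in GB/s.
hbm_gb_per_s_per_core = HBM_BANDWIDTH_TB_PER_S * 1000 / cores_per_cpu

# Maximum memory attached to one processor (HBM plus DDR5), in GB.
max_memory_per_cpu_gb = HBM_GB + DDR5_GB_MAX

print(f"Cores per CPU: {cores_per_cpu}")
print(f"Approximate CPU count: {approx_cpu_count:,.0f}")
print(f"HBM bandwidth per core: {hbm_gb_per_s_per_core:.2f} GB/s")
print(f"Maximum memory per CPU: {max_memory_per_cpu_gb} GB")
```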

The processor gains roughly 3.6X in performance when moving from FP64 to mixed-precision data (7.92 versus 2.198 EFLOPS), which is a smaller uplift than that of systems that integrate low-precision accelerators, such as AMD's Instinct MI300A or Intel's Ponte Vecchio. While an Armv9 CPU with SVE/SME can accelerate FP16/BF16/INT8 workloads, its mixed-precision uplift remains limited compared with accelerator-based systems for several reasons, including memory bandwidth, software maturity, and interconnect efficiency. That said, it may be too early to draw final conclusions about the LX2 and its suitability for mixed-precision workloads.
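The uplift figure can be reproduced from the benchmark results quoted earlier in the article. The following sketch uses illustrative variable names and also expresses the HPCG result as a share of the Linpack result, a derived ratio that the article itself does not state.

```python
# Benchmark results as quoted in the article.
HPL_FP64_EFLOPS = 2.198    # Linpack, double precision
HPL_MXP_EFLOPS = 7.92      # HPL-MxP, mixed precision
HPCG_PFLOPS = 22.00        # HPCG

# Mixed-precision uplift: HPL-MxP result divided by the FP64 Linpack result.
mxp_uplift = HPL_MXP_EFLOPS / HPL_FP64_EFLOPS

# HPCG as a fraction of Linpack, after converting PFLOPS to EFLOPS.
hpcg_share_of_hpl = (HPCG_PFLOPS / 1000) / HPL_FP64_EFLOPS

print(f"Mixed-precision uplift: {mxp_uplift:.1f}x")
print(f"HPCG as a share of HPL: {hpcg_share_of_hpl:.1%}")
```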

In any case, a Chinese supercomputer achieving this level of FP64 performance is remarkable. Furthermore, the fact that the National Supercomputing Center in Shenzhen submitted results to the TOP500 at all suggests that the organization is confident that LineShine relies exclusively on domestic technologies whose production the U.S. government cannot affect.
