Chips & Hardware · Report

Chinese AI chipmaker Shuiyuan Technology, the fourth of the 'GPU four dragons', wins listing committee approval for its STAR Market IPO. Backed by Tencent as its largest shareholder and a major customer, the company plans to raise 6 billion yuan, strengthening the domestic AI chip supply chain.

Supply chain diversification away from NVIDIA dependency; Chinese AI infrastructure gains independent GPU production capacity amid geopolitical AI competition.

Trade press · Slicast · June 28, 2026 · China · Source: Google News

This afternoon, the Shanghai Stock Exchange published its review results: Shuiyuan Technology, a cloud AI computing power manufacturer, has obtained approval from the listing committee for its STAR Market IPO application.

Looking back at its IPO timeline: the STAR Market formally accepted the company's application on January 22, after which it went through two rounds of regulatory inquiries before reaching the listing committee review.

Shuiyuan is one of the "four dragons" in the first tier of domestic GPU manufacturers, which makes its approval highly significant. Moore Threads and Muxi Technology have already listed on the STAR Market, and Biren Technology has gone public in Hong Kong. Once Shuiyuan completes its listing, all four leading domestic computing power companies will be present in the capital markets.

Shuiyuan Technology was founded in March 2018. According to the prospectus, over its eight-year existence, the company has independently developed and iterated through four generations of architecture and five cloud AI chips, forming a product ecosystem encompassing AI chips, AI accelerator cards and modules, intelligent computing systems and clusters, and AI computing and programming software platforms.

Unlike the general-purpose GPU route commonly compared to NVIDIA's GPU ecosystem, Shuiyuan has chosen a DSA (Domain-Specific Architecture) path. Simply put, general-purpose GPUs emphasize stronger general-purpose parallel computing capabilities, while DSA focuses on architecture optimization for specific computing workloads.

For Shuiyuan, the technology bet is not on replicating the CUDA ecosystem but on building an independent system for cloud AI training and inference through a self-developed instruction set, computing units, interconnect technology, and software stack. The company is targeting demand for domestic chip substitution in large model training, inference, and intelligent computing center construction. The obstacles are clear: NVIDIA's long-established barriers in hardware, software, developer base, and customer migration costs.

The prospectus categorizes domestic cloud AI chip manufacturers into two types: those using DSA architecture, exemplified by Huawei HiSilicon, Cambricon, and Shuiyuan; and those using general-purpose GPU architecture, represented by Moore Threads, Muxi Technology, Tsingdata Intellitech, and Biren Technology.

Shuiyuan's core hardware architecture includes the GCU-CARE accelerated computing unit and GCU-LARE inter-card high-speed interconnect technology. According to the prospectus, GCU-CARE corresponds to NVIDIA's Tensor Core accelerated computing unit, while GCU-LARE corresponds to NVIDIA's NVLink inter-card interconnect technology.

At the software level, rather than following the CUDA ecosystem, the company developed its own full-stack AI computing and programming software platform, "TopsRider," comprising drivers, compiled languages and compilers, operator libraries, and toolchains.

The advantage of this approach is that Shuiyuan can perform software-hardware coordinated optimization around actual AI training and inference workloads. Particularly on the inference side, the industry's reliance on the CUDA ecosystem is relatively lower than on the training side, and cost, energy efficiency, and deployment efficiency become more critical. The prospectus notes that as the AI inference market grows, DSA architecture is expected to demonstrate superior cost-effectiveness in specific scenarios.

However, the difficulties are equally clear: software ecosystem migration, model adaptation, operator coverage, and customer validation cycles all become commercial deployment barriers. AI chips are not a business won by hardware parameters alone; whether customers' models, frameworks, and business can run stably is often more critical than single-chip peak performance.

In 2019, the company launched its first-generation Shuisi 1.0 architecture, corresponding to Yunyuan T1x training series and Yunyuan i1x inference series products. In 2021, it launched the second-generation Shuisi 2.0 architecture, corresponding to the Yunyuan T2x training series; that same year it launched the Shuisi 2.5 architecture, corresponding to the Yunyuan i2x inference series. In 2024, the company launched the third-generation Shuisi 320 architecture, corresponding to the Yunyuan S60 inference card, primarily targeting large model inference scenarios. In 2025, it launched the fourth-generation Shuisi 400 architecture, corresponding to the Yunyuan L600 training-inference integrated module.

According to the prospectus, Shuisi 400 supports FP8 low-precision computing and targets ultra-large-scale cluster expansion requirements of over ten thousand cards. The Yunyuan L600 adopts an OAM module form, supporting high-density, high-interconnect AI server deployment.

Beyond chips and accelerator cards, Shuiyuan is advancing toward the systems and cluster level. Its intelligent computing system brand is Cloud Blazer POD, typically comprising four to eight AI-dedicated servers and multiple network switches, with each POD generally integrating 32 to 64 AI accelerator cards and modules. Larger intelligent computing clusters comprise multiple PODs, general-purpose CPU servers, high-speed networking equipment, independent storage servers, and the company's self-developed system software.
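The POD figures above imply a simple sizing rule. The sketch below is a rough illustration only, assuming cards are spread evenly across servers and that a cluster is assembled from whole PODs; the function names are hypothetical and do not come from the prospectus.

```python
import math


def cards_per_server(cards_per_pod: int, servers_per_pod: int) -> float:
    """Average accelerator cards per server, assuming an even spread."""
    return cards_per_pod / servers_per_pod


def pods_needed(target_cards: int, cards_per_pod: int) -> int:
    """Whole PODs required to reach at least target_cards."""
    return math.ceil(target_cards / cards_per_pod)


# POD range quoted in the article: 4 to 8 servers, 32 to 64 cards.
small_pod = {"servers": 4, "cards": 32}
large_pod = {"servers": 8, "cards": 64}

for pod in (small_pod, large_pod):
    print(pod, "->", cards_per_server(pod["cards"], pod["servers"]), "cards per server")

# Illustrative target: the ten-thousand-card scale cited for Shuisi 400.
target = 10_000
print(
    pods_needed(target, large_pod["cards"]),
    "to",
    pods_needed(target, small_pod["cards"]),
    "PODs for",
    target,
    "cards",
)
```

Both ends of the quoted range work out to eight cards per server, and under these assumptions a 10,000-card cluster would need roughly 157 to 313 PODs.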

In 2025, the company's operating revenue reached 986 million yuan, of which AI accelerator cards and modules contributed 856 million yuan, accounting for 86.83%; intelligent computing systems and clusters contributed 128 million yuan, accounting for 13.00%; IP licensing and other revenue contributed 1.642 million yuan, accounting for 0.17%. This indicates that although the company's product forms range from single cards and modules to PODs and clusters, its current commercial focus remains the delivery of AI accelerator cards and modules. Whether the systems and cluster business can sustain growth depends on major customers' project schedules and on subsequent demand from domestic intelligent computing center construction.

Shuiyuan's ability to complete multiple generations of chip architecture iteration and bring products to the cloud AI computing market within years of its founding is closely related to its founding team's long-accumulated experience in chip design, engineering management, and industrialization.

According to the prospectus, founder Zhao Lidong was born in 1966 and holds a bachelor's degree in electronic engineering from Tsinghua University and a master's degree in electronics and computer engineering from Utah State University. He has over 30 years of experience in chip design and management, having worked at S3, Juniper Networks, AMD, and other companies, and participated in the establishment of AMD's China R&D center. In March 2018, he co-founded Shuiyuan Technology and currently serves as the company's chairman, CEO, and board secretary.

Co-founder Zhang Yalin was born in 1978 and holds a bachelor's degree in electronic engineering and information systems from Fudan University, with 25 years of chip design and management experience. According to the prospectus, he previously worked at AMD as a senior chip development manager and as chip technology director of the China R&D center, leading chip design projects including the Xbox One main chip and the Subor Z+ ("Little Overlord") chip. In March 2018, he co-founded Shuiyuan Technology with Zhao Lidong and currently serves as the company's director, general manager, and COO.

The prospectus discloses that core technical personnel include Chai Jing, Luo Wei, and Chen Songtao. Chai Jing previously worked as a senior chip development manager at AMD, and after joining Shuiyuan leads the hardware chip department, having participated in the R&D process from architecture design to mass production for the company's four generations of architecture and five cloud AI chips. Luo Wei previously worked at NVIDIA Shanghai as a senior manager for CUDA testing, development, and quality assurance; after joining Shuiyuan, he leads the software R&D system and championed building the TopsRider software stack from scratch. Chen Songtao previously worked at Teradyne, Avago, Marvell, and other companies, and after joining Shuiyuan leads the products and systems engineering division.

Judging from the composition of its founding team and core personnel, Shuiyuan never set out simply to "make a chip"; from the start it aimed to build capabilities in chip architecture, boards and modules, systems and clusters, and software stacks at the same time.

The prospectus discloses that from 2023 to 2025, the company accumulated R&D investment of 3.676 billion yuan against cumulative operating revenue of 2.014 billion yuan, with R&D investment over the past three years representing 182.55% of cumulative operating revenue.

As of the end of 2025, the company employed 838 people, of which 643 were R&D personnel, representing 76.73% of the workforce.

Due to massive R&D investment and lengthy return cycles, Shuiyuan, despite rapid revenue growth, remains in a continuous loss-making phase. From 2023 to 2025, the company's operating revenue was 301 million yuan, 722 million yuan, and 990 million yuan respectively; net losses were 1.665 billion yuan, 1.510 billion yuan, and 1.164 billion yuan respectively. During the same period, R&D expenses were 1.229 billion yuan, 1.312 billion yuan, and 1.135 billion yuan, representing 408.01%, 181.66%, and 114.63% of operating revenue respectively.
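The R&D intensity percentages above are R&D expense divided by operating revenue. The following is a minimal sketch of that arithmetic using the rounded figures quoted in the article; because the inputs are rounded to the nearest million yuan, the results differ from the prospectus percentages by a fraction of a point. The function name is illustrative.

```python
# Figures in millions of yuan, as quoted in the article for 2023 to 2025.
revenue = {2023: 301, 2024: 722, 2025: 990}
rd_expense = {2023: 1229, 2024: 1312, 2025: 1135}


def rd_intensity(rd: float, rev: float) -> float:
    """R&D expense as a percentage of operating revenue."""
    return rd / rev * 100


# Per-year intensity, then the three-year cumulative ratio.
yearly = {year: rd_intensity(rd_expense[year], revenue[year]) for year in revenue}
cumulative = rd_intensity(sum(rd_expense.values()), sum(revenue.values()))

for year, pct in yearly.items():
    print(f"{year}: {pct:.1f}%")
print(f"2023-2025 cumulative: {cumulative:.1f}%")
```

The downward trend from roughly four times revenue to just above revenue reflects fast revenue growth against a broadly flat R&D budget, rather than a cut in R&D spending.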

In the first quarter of 2026, the company realized operating revenue of 287 million yuan, a year-on-year increase of 1474.85%; net losses attributable to parent company shareholders were 444 million yuan, with losses expanding year-on-year.

The prospectus explains that in the first quarter of 2026, the company's products saw further volume growth at downstream customers, but due to maintained high R&D intensity, along with interest expenses provisioned on customer advance payments and resulting taxes and surcharges, the current period's profit was affected.

In September 2023, Shuiyuan completed debt-to-equity conversion and Series D financing. Investors including Tencent Technology and Guofang Jinpu collectively subscribed 767 million yuan in previously issued convertible bonds and 890 million yuan in cash for new registered capital. The Series D cash capital increase referenced a pre-money valuation of approximately 14.1 billion yuan; the aforementioned convertible bond investors converted at the lower of approximately 12.343 billion yuan pre-money valuation or the next round's pre-money valuation.

In December that year, the company completed Series D+ financing. Eight investors including Huai'an Tierong, Cloud Creation Intelligent Computing, Anhui Zhongan, and Zhejiang Fund subscribed approximately 680 million yuan in cash for new company equity, with this round referencing a pre-money valuation of approximately 15.9 billion yuan.

Entering 2024, capital continued to increase its stake. In June 2024, Shuiyuan completed Series D++ financing, with 12 investors including Tencent Technology subscribing approximately 753 million yuan in cash for new company equity, with this round referencing a pre-money valuation of approximately 16.6 billion yuan.

In December 2024, Shuiyuan completed Series E financing. A total of 36 investors, including Shanghai Production and Tencent Technology, collectively subscribed approximately 2.72 billion yuan in cash for new company equity, with this round referencing a pre-money valuation of approximately 17.5 billion yuan. Shanghai Production and Tencent Technology each invested 300 million yuan, National Investment Juli invested 200 million yuan, Yangzhou State Fund invested approximately 198 million yuan, and Wuyue Peak Period III invested 185 million yuan.
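The relationship between the pre-money valuations and cash amounts quoted for these rounds can be sketched as follows. This is a simplified illustration, assuming post-money valuation equals pre-money valuation plus new cash and ignoring any other adjustments; Series D is left out because its convertible bonds converted at a separate valuation. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Round:
    name: str
    pre_money: float    # billions of yuan, as quoted in the article
    cash_raised: float  # billions of yuan, as quoted in the article

    @property
    def post_money(self) -> float:
        """Simplified post-money valuation: pre-money plus new cash."""
        return self.pre_money + self.cash_raised

    @property
    def new_investor_stake(self) -> float:
        """Fraction of the company held by the round's new money."""
        return self.cash_raised / self.post_money


rounds = [
    Round("Series D+", 15.9, 0.680),
    Round("Series D++", 16.6, 0.753),
    Round("Series E", 17.5, 2.720),
]

for r in rounds:
    print(f"{r.name}: post-money {r.post_money:.2f}bn, new stake {r.new_investor_stake:.1%}")
```

On these assumptions each round's post-money figure lands close to the next round's pre-money valuation, which suggests modest step-ups between rounds, while Series E sold a noticeably larger stake than the two rounds before it.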

Behind successive financing rounds lies, on one hand, the capital-intensive nature of the AI chip industry itself, and on the other, capital's betting on the window for domestic AI computing power substitution.

One piece of information particularly worth noting: Tencent is not only a major shareholder of Shuiyuan but also a significant customer.

The prospectus shows that as of the prospectus signing date, Tencent Technology and its concerted action party Suzhou Paiyi collectively hold 20.258% of Shuiyuan's shares, making them the company's largest shareholder. Zhao Lidong and Zhang Yalin, through direct shareholding and employee shareholding platforms, collectively control 28.1357% of the company's voting rights. The company has no controlling shareholder, with relatively dispersed share ownership.

In 2025, the company sold primarily through a direct sales model, with direct sales accounting for 98.74% of operating revenue. Among its top five customers, Tencent Technology (Shenzhen) Co., Ltd. accounted for sales of 768 million yuan, or 74.90%; Chengdu High-Tech Electronic Information Industry Co., Ltd. accounted for 145 million yuan, or 14.15%; and customers A, B, and C contributed 55.0593 million yuan, 13.4098 million yuan, and 11.9526 million yuan respectively. Together the top five customers represented 96.89% of total revenue.

The prospectus further explains that the company's high customer concentration stems partly from large single contracts in the systems and cluster business and partly from concentrated demand among internet end customers. Beyond direct sales to Tencent, the company also uses an AVAP model, under which it sells AI accelerator cards or modules to server manufacturers designated by internet customers, at prices agreed with those customers. As a result, part of what server manufacturers procure may in fact reflect end-user demand from internet customers.

For this IPO, Shuiyuan plans to raise 6 billion yuan, continuing investment in fifth-generation AI chip series products R&D and industrialization, sixth-generation AI chip series products R&D and industrialization, and advanced AI software-hardware coordinated innovation projects.

In other words, from past funding rounds through this rush toward the STAR Market, Shuiyuan's financing trajectory has remained unchanged: continuously trading capital for R&D cycles, trading R&D iteration for product volume, and through major customer scenario validation, striving to establish itself firmly in the domestic AI computing power market.

Like the other three "four dragons" of domestic GPU makers, which have already completed their IPOs, Shuiyuan has caught a strategic window that domestic AI chip companies cannot afford to ignore.

Over recent years, large model training and inference demand has grown rapidly; particularly in the past year, new applications exemplified by "lobster agents" have dramatically increased token consumption, transforming computing power from internet giants' technical reserves into foundational infrastructure jointly invested in by cloud providers, operators, local intelligent computing centers, and industry customers.

As AI applications progress from model training toward large-scale deployment, inference computing demand continues to expand, providing new entry opportunities for domestic AI chip companies.

The prospectus cites Zhikuai Consulting data indicating that the global AI accelerator card market reached approximately 119.028 billion dollars in 2024, with projections for growth to 525.77 billion dollars by 2028. China's AI accelerator card market grew from 12.254 billion yuan in 2020 to 216.477 billion yuan in 2024, with projections reaching 1,107.646 billion yuan by 2028.
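The growth implied by these market-size figures can be expressed as a compound annual growth rate (CAGR). The sketch below applies the standard CAGR formula to the numbers quoted above; the variable names are illustrative, and the rates are derived here rather than taken from the prospectus.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end / start) ** (1 / years) - 1


# Global market, billions of US dollars, 2024 to 2028 (projected).
global_cagr = cagr(119.028, 525.77, 2028 - 2024)

# China market, billions of yuan: 2020 to 2024 (historical), 2024 to 2028 (projected).
china_hist = cagr(12.254, 216.477, 2024 - 2020)
china_proj = cagr(216.477, 1107.646, 2028 - 2024)

print(f"Global 2024-2028: {global_cagr:.1%}")
print(f"China 2020-2024:  {china_hist:.1%}")
print(f"China 2024-2028:  {china_proj:.1%}")
```

Under this formula the quoted figures imply growth of roughly 45% a year for the global market and about 50% a year for China through 2028, a slowdown from China's pace over 2020 to 2024, when the market roughly doubled each year.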

In the past, discussions of AI computing power were nearly synonymous with large model training. But training is ultimately cyclical, whereas inference demand persists continually alongside application deployment.

Data indicates that by 2028, China's inference AI accelerator card market is projected to reach 808.582 billion yuan, accounting for over 70% of the overall market.

This trend is particularly significant for companies like Shuiyuan employing DSA (Domain-Specific Architecture). In the training market, NVIDIA has built extraordinarily deep moats through its CUDA ecosystem, particularly in software adaptation and cluster stability, making it very difficult for newcomers to achieve breakthroughs in the short term. By contrast, while the inference scenario equally values reliability, customers show higher sensitivity to cost, energy efficiency ratio, and deployment density. In sectors where models are relatively fixed and application scenarios clear, as long as domestic chips can deliver verified cost-effectiveness, there are opportunities to pry open customer procurement doors.

The prospectus discloses that China's total AI accelerator card shipments in 2025 reached approximately 4 million units, of which NVIDIA shipped approximately 2.2 million units, or about 55%. During the same period, Shuiyuan sold 66,300 AI accelerator cards and modules, representing approximately 1.7% of the Chinese market by shipment volume.
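The shipment shares above follow from simple division. As a minimal sketch, assuming the rounded totals and percentages quoted in the article, the share arithmetic and the portion left to all other vendors can be worked out as follows; the variable names are illustrative.

```python
# Rounded 2025 figures quoted in the article.
total_units = 4_000_000    # approximate total shipments in China
nvidia_units = 2_200_000   # approximate NVIDIA shipments
shuiyuan_share = 0.017     # Shuiyuan's stated share by shipment volume

# NVIDIA's share of total shipments.
nvidia_share = nvidia_units / total_units

# Unit count implied by Shuiyuan's stated share of the rounded total.
implied_shuiyuan_units = shuiyuan_share * total_units

# Share left for every vendor other than NVIDIA and Shuiyuan.
other_share = 1 - nvidia_share - shuiyuan_share

print(f"NVIDIA share: {nvidia_share:.1%}")
print(f"Units implied by a 1.7% share: {implied_shuiyuan_units:,.0f}")
print(f"All other vendors: {other_share:.1%}")
```

On these rounded inputs, a 1.7% share corresponds to a unit count in the high tens of thousands, and roughly 43% of shipments went to vendors other than NVIDIA and Shuiyuan.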

This data directly illustrates the situation facing domestic AI chip companies: the demand window has indeed opened, but the market leader remains NVIDIA. Domestic manufacturers are currently completing validation and substitution more in specific customers, specific scenarios, and specific projects; genuine large-scale adoption still requires crossing multiple thresholds in product iteration, software ecosystems, customer migration, and cluster stability.

Therefore, the most critical issue for Shuiyuan before and after its IPO is not "can we make domestic AI chips." From the prospectus's disclosure of product iteration and sales circumstances, the company has already completed commercial deployment from chips and accelerator cards through intelligent computing systems. The more critical question is whether it can enable these products to run stably over the long term in major customers' real business operations, and continuously reduce customer migration costs from the NVIDIA ecosystem to the domestic AI computing power platform.

In other words, opportunities in domestic AI computing power have emerged, but the real test is just beginning. For Shuiyuan, the market offers not a ready-made ticket but a long-term validation regarding product reliability, software ecosystems, and scaled delivery capability.
