Chips & Hardware · Report

OpenAI announces its first self-developed AI inference chip, marking the company's entry into custom silicon design.

Custom inference chips from leading AI companies could reduce NVIDIA dependency and diversify the AI compute supply chain.

Trade press · Slicast · June 26, 2026 · US · Source: Google News

OpenAI and semiconductor giant Broadcom jointly released their first custom AI inference chip, Jalapeño, on June 24, 2026. The application-specific integrated circuit (ASIC) is designed specifically for large language model inference. Under their division of labor, OpenAI is responsible for the underlying architecture design, Broadcom handles silicon implementation and network hardware, and Canadian electronics manufacturing services provider Celestica manages board and rack system integration.

OpenAI stated that Jalapeño's performance per watt will surpass the current state of the art. Engineering samples are already running in the laboratory at the frequencies and power levels targeted for mass production, handling a range of machine learning workloads that includes GPT5.3, Codex, and Spark.

In its early stages, OpenAI relied primarily on exclusive partnerships with cloud providers, exchanging capital for computing power. The company accepted investment from Microsoft and leased a dedicated Microsoft Azure cluster for model training and hosting, composed of tens of thousands of NVIDIA GPUs and optimized for high-speed communication between chips. OpenAI emphasizes that the world is moving toward a compute-centric economy and that Jalapeño is part of the company's long-term full-stack infrastructure strategy. The decision to launch Jalapeño in mid-2026 is a natural consequence of high operating costs, market competition, and supply chain pressures, and is intended to let OpenAI deliver more powerful intelligence with greater efficiency.

At NVIDIA's 2026 annual shareholder meeting on June 24, CEO Jensen Huang said that the "era of useful AI" has arrived and expressed confidence in continued AI infrastructure development. Huang stated that every industry is racing to adopt agentic AI, and described the AI industry as a five-layer cake comprising energy, chips and systems, infrastructure, models, and applications. Whereas traditional data centers provided storage and other services, he said, today's AI factories exist to produce tokens, which explains the strong demand for computing power. Huang added that Hopper was built for pre-training, Blackwell enables rack-scale inference, and Vera Rubin is designed for intelligent agents. Vera Rubin is now in full production, with every major model developer and hyperscale cloud provider preparing to build on it.

Elsewhere in the industry, Google has shown that its in-house TPU series can deliver significant margin benefits through compute cost control and hardware-software co-optimization. Anthropic has tied its capital and computing power closely to Amazon and Google, while also working with multiple compute and hardware vendors to build a diversified computing base.

WiMi, an AI vision company, has built a full-chain investment model covering chip architecture design, cluster system optimization, and quantum AI chip R&D. The company has been building out AI chip clusters, focusing on integrating its computing base with its technology, exploring low-power chips and edge computing optimization, and improving its AI cloud computing infrastructure. According to its 2025 annual report, WiMi posted a net profit of 347 million yuan, up 235.9% year on year, which it attributed primarily to rapid development in quantum technology, artificial intelligence, and holographic AR. Going forward, WiMi plans to keep tracking open-source large-model technology, optimizing algorithms to reduce application costs, and opening up computing resources and technical interfaces to accelerate commercialization.

Competition in AI has essentially become a competition for computing power. For OpenAI to maintain its technological and commercial lead, filling in the chip piece of the puzzle is essential. Initial deployment of OpenAI's multi-generational computing platform is planned for the end of 2026. Whoever first secures a supply of low-cost, highly stable, zero-carbon computing power will hold a decisive voice in the global AI industry and a key ticket to the AGI era.
