Chips & Hardware · Report

OpenAI and Broadcom unveil Jalapeño, a custom AI inference chip designed for gigawatt-scale deployment with half the inference cost of alternatives.

OpenAI's direct entry into chip design signals a fundamental shift toward vertical integration in AI infrastructure; inference cost parity unlocks new workload economics.

Trade pressSlicast · June 26, 2026 · US · Source: Google News

importance 95

OpenAI spent years buying other people's silicon, but it just showed off its own chip, and the codename packs a little kick. The company teamed with Broadcom to reveal Jalapeño, a custom AI chip built specifically for inference. In a statement of intent that doubled as a photo op, Broadcom CEO Hock Tan walked into OpenAI's offices and handed the first wafer to Sam Altman and Greg Brockman.

Running ChatGPT at planetary scale on rented hardware is expensive, and most of that hardware traces back to one supplier. Leaning on NVIDIA's GPUs has worked well enough to this point, but it leaves OpenAI paying a premium on every query and jockeying with the entire industry for supply. A custom-designed chip helps solve both problems.

OpenAI built an unusually large compute element and paired it with high-bandwidth memory to maximize compute, minimize data movement, and keep those cores fed. TSMC reportedly manufactures the chip, Broadcom handles the silicon implementation and networking, and Celestica builds the server racks.

The development timeline is a jaw-dropper. A high-performance chip manufactured on a leading edge node usually takes two to three years from concept to tape-out. Jalapeño supposedly got there in nine months. OpenAI leaned on its own models to help grind through layout and optimization, and engineering samples are already humming in the lab, including one running the GPT-5.3-Codex-Spark model.

Leaked financials point to OpenAI having pulled in roughly $13.07 billion in revenue last year against an operating loss near $20.9 billion. Recently published OpenAI research found its most active internal users now run more than 60 hours of Codex agent work in a single day—exactly the token-hungry workload Jalapeño is built to serve cheaply.

Broadcom CEO Tan has said early samples cut inference costs by about half versus a typical AI GPU, and he told one source that surging demand could let the partnership "do better" than his earlier deployment forecast. OpenAI hardware chief Richard Ho called Jalapeño "a very general purpose device" tuned for language models. Every performance figure so far is self-reported, with a full technical report promised later this year.

OpenAI still depends heavily on NVIDIA GPUs for training, and any chip is only as good as the supply chain feeding it. Jalapeño is the first rung of a planned multi-generation ladder, with the next version reportedly due in 2028 and yearly updates after that.

Read the original