OpenAI partners with Broadcom to unveil Jalapeño, a custom LLM inference accelerator chip designed for production deployment by late 2026, marking a major step toward full-stack AI compute independence with potential ~50% cost reduction.
OpenAI is adding custom hardware to its tech stack. The company and Broadcom have unveiled "Jalapeño"—OpenAI's first "Intelligence Processor"—a custom accelerator built specifically for large language model inference and the first chip in a multi-generation platform the two companies are developing together. Broadcom CEO Hock Tan and President Charlie Kawwas handed the first wafer to OpenAI CEO Sam Altman and President Greg Brockman, marking OpenAI's first step into custom hardware after years of focusing on models and products.
Unlike modified general-purpose chips, Jalapeño was designed from scratch for modern LLM inference. OpenAI handles the chip design, Broadcom contributes silicon manufacturing and networking technology including its Tomahawk networking chips, and Celestica manages boards, racks, and system integration. Early tests showed performance per watt "substantially better" than current state-of-the-art hardware, but these figures are self-reported and not yet final. A technical report is forthcoming. It remains unclear which chips Jalapeño was tested against, which tasks were used, and under what conditions.
The architecture reportedly cuts data movement and pushes utilization closer to its theoretical maximum. Engineering samples are already running machine learning workloads in the lab, including the GPT-5.3-Codex-Spark model, which currently runs on hardware from Cerebras, another company specializing in inference.
The development process from design to tape-out took just nine months, which OpenAI calls the fastest ASIC development cycle for high-performance semiconductors it is aware of. OpenAI's own models helped accelerate parts of the design process. Rumors about the company's chip plans have circulated since 2023.
The announcement reflects OpenAI's broader argument that controlling the full stack from chip to product enables faster, more reliable model operation at lower cost. Broadcom CEO Tan says the first deployment is planned for late 2026 at gigawatt scale, in collaboration with Microsoft and other partners. Broadcom has reportedly required Microsoft to guarantee the purchase of 40 percent of the chips to secure the first phase.