OpenAI unveils Jalapeño, its first custom AI inference chip co-developed with Broadcom, designed to deliver faster LLM inference with optimized power efficiency.
OpenAI has unveiled Jalapeño, its first custom AI inference chip, developed in partnership with Broadcom. The move marks OpenAI's entry into the competitive inference-silicon market alongside established players like Groq and Cerebras.
The chip is purpose-built for LLM inference, with a focus on speed and power efficiency. By optimizing for inference rather than training, OpenAI targets one of the highest-volume workloads in deployed AI systems, where latency and operating costs directly affect service margins.
The collaboration with Broadcom gives OpenAI access to chip design and manufacturing scale, which could lower inference costs across its product stack. Custom inference hardware typically allows tighter latency budgets and lower energy per token, and both factors are likely to intensify competition among inference-silicon suppliers as AI deployment scales.
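To make the per-token economics concrete, here is a minimal Python sketch of how energy per token and electricity cost per million tokens follow from a chip's power draw and throughput. The function names and every number below are hypothetical placeholders chosen for illustration; no power, throughput, or pricing figures for Jalapeño appear in this article.

```python
def energy_per_token_joules(power_watts: float, tokens_per_second: float) -> float:
    """Energy spent per generated token: watts are joules per second,
    so dividing by tokens per second gives joules per token."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


def electricity_cost_per_million_tokens(
    power_watts: float, tokens_per_second: float, usd_per_kwh: float
) -> float:
    """Electricity cost in USD to generate one million tokens."""
    joules = energy_per_token_joules(power_watts, tokens_per_second) * 1_000_000
    kwh = joules / 3_600_000  # 1 kWh = 3.6 million joules
    return kwh * usd_per_kwh


# Hypothetical accelerators, NOT real specifications of any product.
general_purpose = {"power_watts": 700.0, "tokens_per_second": 2000.0}
custom_inference = {"power_watts": 400.0, "tokens_per_second": 2500.0}

for name, chip in (("general-purpose", general_purpose),
                   ("custom inference", custom_inference)):
    joules = energy_per_token_joules(**chip)
    cost = electricity_cost_per_million_tokens(**chip, usd_per_kwh=0.10)
    print(f"{name}: {joules:.3f} J/token, ${cost:.4f} per million tokens")
```

The sketch covers electricity only. A fuller cost model would also include hardware amortization, cooling, and utilization, none of which are quantified here.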