OpenAI and Broadcom unveiled Jalapeño, OpenAI's first custom-built AI inference chip. Designed from scratch for LLM inference and targeting a 50% reduction in inference costs, it is scheduled for gigawatt-scale deployment by the end of 2026, after nine months of development accelerated by OpenAI's own models.
OpenAI and Broadcom have unveiled Jalapeño, OpenAI's first custom-built AI inference chip. Designed from scratch specifically for LLM inference, the chip represents a major step toward vertically integrated AI infrastructure as companies move beyond relying on existing semiconductor designs.
The chip targets a 50% reduction in inference costs, a significant gain in an area where operating expenses directly affect deployment margins. Development took nine months and was accelerated by OpenAI's own models, which provided real-world optimization targets during the design process.
Deployment is scheduled to reach gigawatt scale by the end of 2026, signaling confidence in the chip's readiness and OpenAI's commitment to scaling custom silicon across its inference workloads. The move fits a broader industry trend: major AI companies are building proprietary chips to reduce their dependence on GPU vendors and to improve unit economics as inference becomes the dominant compute demand.