Chips & Hardware · Report

OpenAI and Broadcom have unveiled Jalapeño, a custom inference processor optimized for gigawatt-scale AI data center deployments, designed to handle large language model inference workloads at production scale.

Hyperscalers are transitioning from commodity GPU inference to custom silicon, reducing reliance on NVIDIA and accelerating the industry shift toward specialized inference accelerators.

Trade pressSlicast · June 25, 2026 · US · Source: Google News

importance 95

OpenAI and Broadcom have announced Jalapeño, a custom-designed inference processor built for large-scale AI data center deployments. The chip targets the production inference workloads that power services like ChatGPT.

Jalapeño is optimized for gigawatt-scale AI data center environments, focusing specifically on the inference phase where trained models serve user requests. By building a custom chip tailored to this workload—rather than relying solely on general-purpose GPUs—OpenAI addresses a key efficiency bottleneck in running inference at production scale.

The move signals OpenAI's strategy to verticalize its infrastructure stack, following competitors like Meta and Google in developing proprietary silicon. For the broader AI buildout, custom inference chips reduce operational costs and latency, enabling more efficient serving of frontier models as demand continues to scale.

Read the original