OpenAI unveiled Jalapeño, its first custom AI inference chip, developed with Broadcom. The companies say it is built from scratch for LLM inference and delivers a roughly 50% cost reduction and performance-per-watt beyond the current state of the art, with deployment scaling by late 2026.
OpenAI and Broadcom this morning unveiled their first custom AI accelerator chip named "Jalapeño," positioning it as a purpose-built processor for large language model inference, rather than the more general GPUs offered by companies like Nvidia or AMD. According to its creators, Jalapeño is designed to support workloads behind ChatGPT, Codex, the API and future agentic products. Broadcom's news release notably positions it as a product that could be available to external AI firms as well—"built from the ground up for current and future LLMs across the industry."
The engineering timeline for Jalapeño set a blistering pace for the semiconductor industry, moving from early schematics to fabrication readiness within nine months, when new processor development cycles are typically measured in years. The companies attributed this speed to a deep software-hardware co-development process that used OpenAI's own models to accelerate parts of the chip design. After receiving an early physical model on Wednesday, OpenAI outlined plans to begin rolling out these processors across active data centers by the end of this year. OpenAI says it has already begun testing at least one of its prior-generation models, GPT-5.3-Codex-Spark, on the chips under a production workload, though in a test environment.
The release marks a major strategic expansion for the ChatGPT creator as it attempts to build the full computational stack required to make advanced AI faster, more reliable, and more accessible. Questions remain, including how the new Jalapeño chip performs against direct competitors, what it costs, and whether it can be manufactured at scale.
To understand why OpenAI is moving into chip design, it helps to examine the architecture. Jalapeño is an Application-Specific Integrated Circuit, or ASIC. Unlike a GPU, which can handle many types of workloads, an ASIC is tuned for narrower uses. That narrower focus can make it cheaper and more efficient for specific AI tasks, though less adaptable than Nvidia-style GPUs. In Jalapeño's case, OpenAI is starting from a clean design focused on modern LLM serving, instead of adapting a broader accelerator to fit its needs. The company says the architecture is shaped by its experience running large-scale AI products and is meant to reduce unnecessary data movement while better matching compute, memory and networking resources.
Broadcom is contributing core silicon implementation and networking technology, including Tomahawk networking silicon, while Celestica is helping with board, rack and system integration. The goal is to move the chip closer to its practical performance ceiling in real workloads, not just improve theoretical benchmarks.
However, OpenAI's pivot into proprietary hardware is not merely a quest for technical supremacy; it may also make its core unit economics far more sustainable. Recently posted audited financial documents revealed that while OpenAI generated $13.07 billion in revenue in 2025, its total operating expenses for the year ballooned to $34 billion, resulting in an operating loss of about $20.92 billion. The primary driver of these losses is compute, though more is likely attributable to training than inference. In 2025 alone, research and development costs, driven largely by the infrastructure required to train and serve massive language models, accounted for $19.18 billion, or approximately 56 percent of the company's total spending. OpenAI reportedly paid Microsoft over $10.59 billion just for R&D and compute infrastructure last year.
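The relationships among those figures can be checked with simple arithmetic. The short Python sketch below is illustrative only: the function and variable names are hypothetical, and the inputs are the rounded figures quoted in this article. Because total expenses are given only as a rounded $34 billion, the computed loss can differ from the reported $20.92 billion by about $0.01 billion.

```python
# Figures as quoted in the article, in billions of US dollars (rounded).
REVENUE_2025 = 13.07
TOTAL_EXPENSES_2025 = 34.0   # reported only as a rounded "$34 billion"
RND_COSTS_2025 = 19.18


def summarize(revenue: float, expenses: float, rnd: float) -> dict:
    """Derive the loss and ratios discussed in the text from the raw figures."""
    return {
        # Operating loss is expenses minus revenue.
        "operating_loss": expenses - revenue,
        # Share of total spending that went to research and development.
        "rnd_share": rnd / expenses,
        # Dollars spent for every dollar of revenue earned.
        "spend_per_revenue_dollar": expenses / revenue,
    }


summary = summarize(REVENUE_2025, TOTAL_EXPENSES_2025, RND_COSTS_2025)

print(f"Operating loss: ${summary['operating_loss']:.2f}B")
print(f"R&D share of spending: {summary['rnd_share']:.1%}")
print(f"Spend per revenue dollar: ${summary['spend_per_revenue_dollar']:.2f}")
```

On these inputs, R&D comes to a little over 56 percent of spending, in line with the figure above, and the company spent roughly $2.60 for every dollar of revenue.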
Still, as OpenAI lays the groundwork for a heavily anticipated public offering in 2026, the Jalapeño inference chip may reassure private investors and public markets that OpenAI has a plan for moving toward profitability. If it can drive down the cost of AI inference, it may offset some of the money spent on costly training runs.
"By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access," said Greg Brockman, OpenAI's president and co-founder, in a statement included in Broadcom's release.
The introduction of Jalapeño immediately raises questions about OpenAI's strategic positioning within the fiercely competitive semiconductor and GPU market. Since kicking off the generative AI boom in late 2022, OpenAI has remained one of the largest customers for GPU market leader Nvidia's premium products, while also taking billions in investment dollars from the firm and striking deals with rival chipmakers to feed its appetite for compute.
In February 2026, Nvidia finalized a $30 billion direct investment into OpenAI as part of a massive $110 billion funding round. This deal secured an agreement to deploy 10 gigawatts of computing systems—including 3 gigawatts of dedicated inference capacity and 2 gigawatts of training capacity—utilizing Nvidia's next-generation Vera Rubin platform. As part of the same February 2026 funding round, Amazon invested $50 billion into OpenAI, including a commitment for OpenAI to consume approximately two gigawatts of AWS's proprietary Trainium computing capacity over the next eight years.
OpenAI also signed agreements with AMD to use its Instinct MI450 Series GPUs and struck a pact with Cerebras, an AI chipmaker that executed its initial public offering in May 2026. This sprawling web of vendor agreements highlights the sheer scale of OpenAI's infrastructure ambitions. The ultimate goal of the OpenAI and Broadcom partnership is to deploy gigawatt-scale data centers with Microsoft and other partners beginning in 2026, that is, facilities whose compute draws power on the order of a city's consumption.
For Broadcom, the partnership is a significant reputational boost. The company has been among the biggest beneficiaries of the generative AI boom, helping hyperscalers and frontier labs engineer custom silicon. Broadcom shares reflect this momentum, showing an 18 percent year-over-year increase in the first part of 2026 and a nearly sevenfold gain since the end of 2022.
Ultimately, Jalapeño confirms that OpenAI believes it is ready to move beyond software into custom hardware. By controlling the silicon beneath its inference pipeline, while simultaneously leveraging the capital and hardware of Nvidia, Amazon, AMD, and Cerebras, OpenAI is attempting to rapidly rewrite its unit economics for AI.