OpenAI and Broadcom announced Jalapeno, OpenAI's first custom AI inference ASIC, engineered in a nine-month development sprint to challenge Nvidia's dominance.
OpenAI and Broadcom jointly developed an AI inference server chip called Jalapeno, the companies announced on June 24. The custom chip is part of OpenAI's strategy to reduce its reliance on Nvidia and secure its own hardware.
According to SiliconANGLE, OpenAI plans to bring its first Jalapeno servers online by year-end and then expand the chip's use. Unlike Nvidia's flagship GPUs, which handle both training and inference, Jalapeno is designed exclusively for inference. Early tests showed Jalapeno delivered significantly higher performance per watt than "current state-of-the-art" products, OpenAI said.
OpenAI disclosed limited details about the chip's design, noting only that its underlying architecture is optimized to reduce data movement. Jalapeno-based inference clusters will use Broadcom's Tomahawk series of Ethernet switch chips, which manage data movement between servers in the same rack and between racks. The latest, Tomahawk 6, handles up to 1.6 terabits per second of traffic per port and reduces network bottlenecks with a built-in congestion management engine.
OpenAI is also developing custom server racks equipped with Jalapeno and Broadcom networking gear. It is working on this effort with Celestica, a Toronto-based company that provides design services for data centre equipment.