OpenAI has unveiled Jalapeño, a custom AI inference chip built in partnership with Broadcom, marking OpenAI's entry into proprietary silicon design.

OpenAI enters the custom AI chip space alongside Nvidia and Broadcom, diversifying its inference hardware supply.

Trade press · Slicast · June 27, 2026 · US · Source: Google News

OpenAI has unveiled Jalapeño, its first custom-built AI inference chip, developed in partnership with Broadcom. The move represents a significant step in reducing the company's dependence on third-party chipmakers as AI systems grow more advanced and attract expanding user bases.

OpenAI is not alone in this transition. Meta has similarly advanced its in-house silicon ambitions through its Meta Training and Inference Accelerator (MTIA) program. Like Meta's MTIA chips, Jalapeño is designed primarily for AI inference rather than model training.

Inference is the stage where an AI model processes requests and generates responses. Every time someone asks ChatGPT a question, an inference system is at work behind the scenes. Because these requests happen continuously, running AI services at scale requires enormous computing power, making inference one of the industry's biggest operating expenses.

Traditional processors such as CPUs and GPUs—particularly those supplied by Nvidia, AMD and Intel—are powerful enough to handle these workloads, but they are designed for a wide range of computing tasks. That flexibility comes at a cost, both financially and in terms of energy consumption. Inference chips take a different approach. They are purpose-built to run AI models more efficiently, reducing power consumption, lowering operating costs and delivering faster responses under heavy workloads. By developing specialized hardware, companies such as OpenAI can optimize performance while reducing reliance on external chip suppliers.

Cloud-scale inference processors are fundamentally different from the chips found in consumer electronics. They are designed for data centers, consume hundreds of watts of power and require sophisticated cooling systems to operate efficiently. By contrast, smartphones and laptops already include their own AI accelerators in the form of Neural Processing Units (NPUs), which handle tasks such as facial recognition, voice processing, image enhancement and on-device AI features including text generation. NPUs are not replacements for CPUs or GPUs but work alongside them, with each processor handling the tasks it is best suited for.

In flagship smartphones released in 2026, such as the Galaxy S26 series, NPUs are integrated into the same System-on-Chip (SoC) as the CPU and GPU, enabling faster and more efficient on-device AI without relying entirely on cloud processing.

The rise of inference chips is unlikely to solve the industry's growing memory demands. AI models continue to require vast amounts of memory in data centers, while consumer devices also need more memory as manufacturers add increasingly sophisticated AI features. As a result, demand for higher-capacity memory is expected to remain strong, keeping pressure on component costs.

Even so, dedicated inference processors are expected to play a key role in the future of AI-powered services. From autonomous logistics and predictive manufacturing to robotics and large-scale AI assistants, they promise faster performance, lower operating costs and improved energy efficiency. Most consumers will never see or interact directly with chips like Jalapeño, but they are likely to benefit from the faster, cheaper and more responsive AI services these processors are designed to power.
