OpenAI and Broadcom Unveil Jalapeño, a Custom LLM Inference Chip

OpenAI's Strategic Leap into Custom Silicon

OpenAI and Broadcom have unveiled Jalapeño, OpenAI's first Intelligence Processor, a custom chip designed specifically for large language model (LLM) inference. The announcement marks a milestone in OpenAI's long-term strategy to build a full-stack AI platform spanning models, products, and the underlying hardware. The stated goal is to make advanced AI more efficient, reliable, and broadly accessible.

The collaboration with Broadcom, first announced in October 2025, aims to deploy 10 gigawatts of these custom AI accelerators. Designing its own chip lets OpenAI build what it has learned about LLM workloads directly into the hardware and target the main bottlenecks of inference at scale: data movement, compute-memory balance, and networking efficiency.

Jalapeño: A Purpose-Built Inference Engine

Unlike general-purpose AI accelerators, Jalapeño is a clean-sheet design optimized for modern LLM inference. OpenAI emphasizes that it is not a repurposed training accelerator but a chip architected around the demands of interactive LLM products. The design draws on the systems OpenAI operates daily for ChatGPT, Codex, and its API, and it is intended to meet the inference needs of current and future AI models across the industry.

Early testing indicates that Jalapeño will deliver substantially better performance per watt than current state-of-the-art chips, although detailed performance figures have not yet been published. The architecture is designed to reduce data movement and to bring realized utilization much closer to theoretical peak performance, which improves both cost and power efficiency.

Rapid Development and Future Deployment

Jalapeño went from initial design to manufacturing tape-out in nine months, an exceptionally fast schedule. The pace is attributed in part to close software-hardware co-development between OpenAI's engineering teams and Broadcom's silicon implementation experts and, notably, to the use of OpenAI's own models to optimize parts of the chip's design.

Jalapeño is the first step in a multi-generation compute platform, with initial deployment slated for late 2026. The platform will combine OpenAI-designed accelerators with Broadcom's silicon implementation, networking, and connectivity technologies (including Tomahawk networking silicon) and with Celestica's expertise in board, rack, and system integration. The goal is to enable gigawatt-scale data center deployments with partners like Microsoft.

Implications for the AI Landscape

OpenAI's foray into custom silicon with Jalapeño is a strategic move to gain greater control over its infrastructure and to curb the rising cost of LLM inference. By owning more of the technology stack, OpenAI aims to offer faster, more reliable, and more affordable AI services, which could give it a competitive edge in a rapidly evolving market. The initiative also signals a move to diversify away from general-purpose GPUs from vendors such as Nvidia.

Because Jalapeño is described as suitable for LLMs in general rather than only OpenAI's own models, OpenAI might also be positioning itself as an infrastructure provider, potentially selling its hardware to third parties. The approach mirrors the vertical integration pursued by other tech giants, such as Google with its Tensor Processing Units (TPUs). If the initiative succeeds, it could significantly change the economics of AI, potentially lowering the cost of compute across the industry and broadening access to advanced AI.
