OpenAI and Broadcom unveil custom AI chip designed to slash processing costs by roughly 50%

OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom AI inference chip, on June 24, marking the company’s entry into semiconductor design as it aims to cut the cost of running large language models at scale.

Jalapeño is purpose-built for LLM inference, the process of running trained models to answer user queries, rather than being a general-purpose accelerator adapted from earlier AI workloads. Early testing shows the chip delivers roughly 50% lower inference cost per token than typical AI GPUs, according to Broadcom CEO Hock Tan.
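To make the per-token claim concrete, here is a minimal arithmetic sketch in Python. Only the roughly 50% reduction comes from the reporting above. The baseline price per million tokens and the monthly token volume are hypothetical placeholders, not figures from OpenAI or Broadcom.

```python
def monthly_inference_cost(tokens_per_month: int, cost_per_million_tokens: float) -> float:
    """Total monthly serving cost for a given token volume and per-million-token price."""
    return tokens_per_month / 1_000_000 * cost_per_million_tokens


# Hypothetical inputs chosen only for illustration (not from the article).
baseline_cost_per_million = 2.00      # assumed GPU serving cost, dollars per 1M tokens
tokens_per_month = 10_000_000_000     # assumed workload: 10 billion tokens per month

# The one figure taken from the article: roughly 50% lower cost per token.
claimed_reduction = 0.50

custom_cost_per_million = baseline_cost_per_million * (1 - claimed_reduction)

baseline_bill = monthly_inference_cost(tokens_per_month, baseline_cost_per_million)
custom_bill = monthly_inference_cost(tokens_per_month, custom_cost_per_million)

# Halving the cost per token doubles the tokens served per dollar.
tokens_per_dollar_gain = 1 / (1 - claimed_reduction)

print(f"Baseline monthly bill:    ${baseline_bill:,.2f}")
print(f"Custom-chip monthly bill: ${custom_bill:,.2f}")
print(f"Tokens per dollar:        {tokens_per_dollar_gain:.1f}x")
```

Under these assumed inputs the monthly bill halves. Equivalently, the same budget serves twice as many tokens, which is why a per-token saving matters more as query volume grows.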

The partnership represents a significant step in OpenAI’s strategy to own more of its infrastructure stack. OpenAI designed the chip from scratch using its deep understanding of LLM fundamentals, while Broadcom handled silicon implementation and networking technologies. Celestica contributed board, rack, and system integration expertise.

OpenAI’s hardware chief Richard Ho said the architecture was optimized around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. “Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits,” Ho stated.

The development speed was striking: Jalapeño moved from initial design to manufacturing tape-out in just nine months—what the companies describe as the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors. OpenAI’s own models helped accelerate parts of the design and optimization process, demonstrating how AI tools can speed hardware engineering.

A Broader Shift Toward Custom Silicon

OpenAI’s move follows a wider industry trend. Google has built custom Tensor Processing Units (TPUs) for years and now rents them to other companies; Amazon developed Trainium chips for training and Inferentia chips for inference; and Meta rolled out its MTIA (Meta Training and Inference Accelerator) chips in early 2026. According to research firm TrendForce, custom AI chip sales are projected to grow 45% in 2026, compared to 16% for standard GPUs.
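The gap between the two projected growth rates can be illustrated with a short sketch of how market share would shift in one year. The 45% and 16% rates are the TrendForce figures cited above. The starting share of 25% is a hypothetical placeholder. The sketch also assumes that both rates apply to the same sales measure and that the two segments make up the whole market, neither of which the article confirms.

```python
def share_after_growth(custom_share: float, custom_growth: float, gpu_growth: float) -> float:
    """Custom-chip share of a two-segment market after one year of growth."""
    custom = custom_share * (1 + custom_growth)
    gpu = (1 - custom_share) * (1 + gpu_growth)
    return custom / (custom + gpu)


# Growth rates cited in the article (TrendForce projections for 2026).
custom_growth = 0.45
gpu_growth = 0.16

# Hypothetical starting share, for illustration only (not from the article).
starting_custom_share = 0.25

new_share = share_after_growth(starting_custom_share, custom_growth, gpu_growth)
print(f"Custom-chip share: {starting_custom_share:.1%} -> {new_share:.1%}")
```

Whatever the starting share, the faster-growing segment gains ground. How much it gains depends on the assumed starting point.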

Broadcom CEO Hock Tan said the Jalapeño partnership represents “a fundamental commitment to scaling the physical infrastructure required for the next decade of AI.” The company plans to deploy the chip at gigawatt scale with data center partners, including Microsoft, starting by the end of 2026.

OpenAI plans to release a detailed technical report once testing is complete. The company emphasizes that inference—where AI reaches users through every ChatGPT response and API call—is where cost and speed improvements translate directly into faster, cheaper, and more reliable AI products.

Sources

OpenAI — Official announcement of Jalapeño chip design, nine-month development timeline, and deployment plans
Broadcom — CEO Hock Tan’s statements on 50% cost savings and gigawatt-scale deployment with Microsoft
Bloomberg — Reporting on the chip’s performance claims and cost comparisons to Nvidia’s Blackwell
CNBC — Coverage of the Jalapeño announcement and Broadcom stock implications
Reuters — Reporting on the broader trend of hyperscalers designing custom AI chips
Forbes — Analysis of custom AI chip market growth projections for 2026

About the author, Chris Martin

Chris Martin is a US economics and current affairs journalist covering the intersection of policy, markets, and everyday financial life. With a background in financial reporting and a sharp eye for the stories behind the numbers, Chris brings clarity to some of the most complex issues shaping the American economy today. At ECIKS.org, Chris covers breaking developments across domestic economic policy, business strategy, Wall Street movements, and political decisions that ripple through financial markets. His reporting blends rigorous data analysis with accessible storytelling, making critical information useful for investors, entrepreneurs, and engaged citizens alike.