Broadcom partners with OpenAI on Jalapeño AI chip to cut inference costs

OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom AI inference chip, on June 24, marking OpenAI’s entry into semiconductor design as it aims to cut the cost of running large language models at scale.

Jalapeño is purpose-built for LLM inference, the process of running trained models to answer user queries, rather than being a general-purpose accelerator adapted from earlier AI workloads. Early testing shows the chip delivers roughly 50% lower inference cost per token than typical AI GPUs, according to Broadcom CEO Hock Tan.
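
To put the per-token claim in concrete terms, here is a minimal arithmetic sketch in Python. Only the 50% reduction comes from the announcement; the baseline price per million tokens and the monthly token volume are hypothetical placeholders, not figures reported by OpenAI or Broadcom.

```python
def monthly_inference_cost(tokens_per_month: float, cost_per_million_tokens: float) -> float:
    """Return the serving bill for a given token volume and per-million-token price."""
    return tokens_per_month / 1_000_000 * cost_per_million_tokens


# Hypothetical inputs chosen only for illustration.
baseline_cost_per_million = 2.00   # assumed GPU cost in dollars per million tokens
tokens_per_month = 10_000_000_000  # assumed monthly volume: 10 billion tokens

# The one figure taken from the article: roughly 50% lower cost per token.
reduction = 0.50

baseline_bill = monthly_inference_cost(tokens_per_month, baseline_cost_per_million)
reduced_bill = monthly_inference_cost(
    tokens_per_month, baseline_cost_per_million * (1 - reduction)
)

# Halving the cost per token doubles the tokens served per dollar.
tokens_per_dollar_gain = 1 / (1 - reduction)

print(f"Baseline bill: ${baseline_bill:,.2f}")
print(f"Reduced bill:  ${reduced_bill:,.2f}")
print(f"Tokens per dollar multiplier: {tokens_per_dollar_gain:.1f}x")
```

Under these assumed inputs the bill halves, which is the same as saying each dollar serves twice as many tokens. Real savings would depend on utilization, power, and networking costs that the announcement does not detail.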

The partnership represents a significant step in OpenAI’s strategy to own more of its infrastructure stack. OpenAI designed the chip from scratch using its deep understanding of LLM fundamentals, while Broadcom handled silicon implementation and networking technologies. Celestica contributed board, rack, and system integration expertise.

OpenAI’s hardware chief Richard Ho said the architecture was optimized around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. “Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits,” Ho stated.

The development speed was striking: Jalapeño moved from initial design to manufacturing tape-out in just nine months—what the companies describe as the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors. OpenAI’s own models helped accelerate parts of the design and optimization process, demonstrating how AI tools can speed hardware engineering.

A Broader Shift Toward Custom Silicon

OpenAI’s move follows a wider industry trend. Google has built custom Tensor Processing Units (TPUs) for years and now rents them to other companies; Amazon developed Trainium chips for training and Inferentia chips for inference; and Meta rolled out its MTIA (Meta Training and Inference Accelerator) chips in early 2026. According to research firm TrendForce, custom AI chip sales are projected to grow 45% in 2026, compared to 16% for standard GPUs.

Tan said the Jalapeño partnership represents “a fundamental commitment to scaling the physical infrastructure required for the next decade of AI.” The company plans to deploy the chip at gigawatt scale with data center partners, including Microsoft, with deployment beginning by the end of 2026.

OpenAI plans to release a detailed technical report once testing is complete. The company emphasizes that inference—where AI reaches users through every ChatGPT response and API call—is where cost and speed improvements translate directly into faster, cheaper, and more reliable AI products.

