OpenAI and Broadcom announce chip designed for LLM inference at scale
OpenAI and Broadcom have unveiled a custom ASIC called “Jalapeño” for large language model inference in data centers, claiming substantially better performance per watt than the current state of the art, with more details to come.

OpenAI, the company behind ChatGPT and Codex, and Broadcom, an established chip supplier, have announced a new chip named “Jalapeño” designed specifically for large language model (LLM) inference. The chip is intended for deployment in large-scale data centers, and both companies describe it as the first generation of a long-term project that will be improved iteratively.
Broadcom states that this application-specific integrated circuit (ASIC) was built from scratch based on “detailed insights” from discussions with OpenAI researchers, and its development was informed by OpenAI’s own roadmap for future models and products. The design and production process took nine months.
OpenAI claims that early testing shows “Jalapeño” will deliver “substantially better” performance per watt than current state-of-the-art systems, but notes that performance measurements are not yet complete. A detailed technical report is expected in the coming months.


