OpenAI has partnered with Cerebras to add 750 megawatts of ultra-low-latency AI compute to OpenAI’s platform, expanding the company’s inference capacity with systems designed to deliver faster, more responsive model outputs. Multiple reports place the deal’s value at more than $10 billion.
Cerebras builds AI systems purpose-built to accelerate long-form outputs from AI models. The company’s approach centers on placing large amounts of compute, memory, and bandwidth onto a single, giant chip to reduce the bottlenecks that can slow inference on conventional hardware.
OpenAI said the integration is intended to improve real-time performance across its products by shortening the request-response loop behind complex tasks such as answering difficult questions, generating code, creating images, and running AI agents. The company argues that faster, more natural response times can increase engagement and support higher-value workloads.
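To make the latency claim concrete, the sketch below shows one common way such responsiveness is measured in practice: time to first token over a streaming request using OpenAI’s Python SDK. This is an illustration only, not part of the announcement; the model name and prompt are placeholders, and nothing here is specific to Cerebras-backed capacity.

```python
# Minimal sketch: measure time to first token on a streaming chat completion.
# Assumes OPENAI_API_KEY is set in the environment; model name is a placeholder.
import time
from openai import OpenAI

client = OpenAI()

start = time.monotonic()
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model, for illustration only
    messages=[{"role": "user", "content": "Summarize wafer-scale inference in one sentence."}],
    stream=True,
)

first_token_at = None
for chunk in stream:
    # Each chunk carries an incremental delta; the first non-empty one marks first-token latency.
    if chunk.choices and chunk.choices[0].delta.content:
        if first_token_at is None:
            first_token_at = time.monotonic()
            print(f"time to first token: {first_token_at - start:.3f}s")

print(f"total response time: {time.monotonic() - start:.3f}s")
```

Lower time to first token is what "shortening the request-response loop" translates to for interactive workloads such as chat, coding assistance, and agents.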
According to OpenAI, the Cerebras capacity will be integrated into its inference stack in phases and expanded across workloads over time. The additional compute is expected to come online in multiple tranches through 2028.
The partnership aligns with OpenAI’s broader compute strategy of maintaining a portfolio of systems matched to specific workload needs, with Cerebras positioned as a dedicated option for low-latency inference.
KEY QUOTES:
“OpenAI’s compute strategy is to build a resilient portfolio that matches the right systems to the right workloads. Cerebras adds a dedicated low-latency inference solution to our platform. That means faster responses, more natural interactions, and a stronger foundation to scale real-time AI to many more people.”
Sachin Katti, OpenAI
“We are delighted to partner with OpenAI, bringing the world’s leading AI models to the world’s fastest AI processor. Just as broadband transformed the internet, real-time inference will transform AI, enabling entirely new ways to build and interact with AI models.”
Andrew Feldman, Co-Founder and CEO, Cerebras