Qualcomm Technologies unveiled its latest breakthroughs in data center artificial intelligence with the launch of the Qualcomm AI200 and AI250 chip-based accelerator cards and rack systems. These next-generation solutions, optimized for AI inference at scale, are engineered to deliver superior memory capacity, performance, and energy efficiency, setting new standards for total cost of ownership (TCO) and enabling seamless deployment of large-scale generative AI across data centers.
The new AI200 is a purpose-built, rack-level AI inference solution designed for low TCO and optimized performance across large language and multimodal model workloads. With 768 GB of LPDDR per card, the AI200 offers high memory capacity and flexibility, delivering scalable, cost-efficient performance for enterprises deploying advanced AI models.
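For a sense of scale, the back-of-the-envelope sketch below shows roughly how much model weight capacity 768 GB per card buys. The model sizes and numeric precisions are illustrative assumptions, not Qualcomm specifications; only the per-card memory figure comes from the announcement.

```python
# Rough capacity sketch: what 768 GB of on-card memory means for hosting
# LLM weights. Model sizes and byte widths below are assumed for
# illustration, not Qualcomm specs.

CARD_MEMORY_GB = 768  # AI200 per-card LPDDR capacity (from the announcement)

def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate footprint of model weights alone: 1e9 params at
    N bytes/param is about N GB per billion parameters (decimal GB)."""
    return params_billions * bytes_per_param

for params_b, dtype, width in [(70, "FP16", 2), (70, "FP8", 1), (405, "FP8", 1)]:
    need = weights_gb(params_b, width)
    fits = "fits" if need <= CARD_MEMORY_GB else "does not fit"
    print(f"{params_b}B @ {dtype}: ~{need:.0f} GB of weights -> {fits} on one card")
```

Even a 405B-parameter model quantized to FP8 would occupy only about half the card's memory in this sketch, leaving headroom for KV caches and activations.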
Building on this foundation, the upcoming AI250 introduces an innovative near-memory computing architecture that delivers more than 10x higher effective memory bandwidth while significantly reducing power consumption. This represents a generational leap in AI inference performance and efficiency, enabling disaggregated inference that maximizes hardware utilization while maintaining flexibility and cost efficiency.
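Effective memory bandwidth matters because the decode phase of generative inference typically streams the full weight set from memory for every generated token, making throughput bandwidth-bound rather than compute-bound. The roofline-style sketch below illustrates this; the baseline bandwidth and model footprint are placeholder assumptions, not AI250 figures.

```python
# Minimal sketch of why effective memory bandwidth dominates decode-phase
# LLM inference. Bandwidth and model-size numbers are placeholders.

def decode_tokens_per_sec(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound assuming each decoded token streams the full weight
    set from memory once (batch size 1, no cache reuse assumed)."""
    return bandwidth_gb_s / weights_gb

MODEL_WEIGHTS_GB = 70.0    # e.g. a 70B-parameter model at FP8 (assumed)
BASELINE_BW_GB_S = 500.0   # hypothetical baseline memory bandwidth

for multiplier in (1, 10):  # the AI250 claims >10x effective bandwidth
    bw = BASELINE_BW_GB_S * multiplier
    rate = decode_tokens_per_sec(bw, MODEL_WEIGHTS_GB)
    print(f"{bw:>6.0f} GB/s -> ~{rate:.1f} tokens/s upper bound")
```

Under these assumptions, a 10x bandwidth gain translates almost directly into a 10x higher decode-throughput ceiling for memory-bound workloads.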
Both solutions incorporate direct liquid cooling for thermal efficiency, PCIe for scale-up, Ethernet for scale-out, and confidential computing for secure AI workloads. At the rack level, each system supports up to 160 kW of power, enabling hyperscaler-grade deployment across large-scale data centers.
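To make the 160 kW envelope concrete, here is a hedged power-budget sketch; only the rack-level figure comes from the announcement, while the per-card draw and overhead share are hypothetical assumptions.

```python
# Hedged rack power-budget sketch. The 160 kW figure is from the
# announcement; per-card power and overhead share are assumptions.

RACK_POWER_KW = 160.0     # announced rack-level power envelope
CARD_POWER_KW = 1.5       # hypothetical per-accelerator-card draw
OVERHEAD_FRACTION = 0.25  # assumed share for host CPUs, NICs, cooling, etc.

budget_for_cards = RACK_POWER_KW * (1 - OVERHEAD_FRACTION)
cards_per_rack = int(budget_for_cards // CARD_POWER_KW)
print(f"~{cards_per_rack} cards per {RACK_POWER_KW:.0f} kW rack "
      f"at {CARD_POWER_KW} kW/card with {OVERHEAD_FRACTION:.0%} overhead")
```

The point of the exercise is that a 160 kW envelope is dense enough to pack tens of accelerator cards per rack, which is why direct liquid cooling is part of the design.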
Qualcomm’s solutions are powered by a comprehensive software stack optimized for AI inference, spanning from the application to the system layer. The stack supports major machine learning frameworks, inference engines, and generative AI tools, with seamless onboarding for Hugging Face models through the Qualcomm Efficient Transformers Library and AI Inference Suite. Developers can deploy models with one-click simplicity and access tools, APIs, and libraries for operationalizing AI at scale.
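As a hedged illustration of that onboarding flow, the sketch below follows the pattern published by the open-source Qualcomm Efficient Transformers Library (github.com/quic/efficient-transformers); the exact method signatures, compile options, and the model chosen here are assumptions to verify against current documentation rather than confirmed AI200/AI250 APIs.

```python
# Sketch of onboarding a Hugging Face model via the Qualcomm Efficient
# Transformers Library. Model choice and compile options are illustrative
# assumptions; check the library's current docs for exact signatures.

from transformers import AutoTokenizer
from QEfficient import QEFFAutoModelForCausalLM  # Qualcomm's HF-style wrapper

model_id = "meta-llama/Llama-3.1-8B"  # placeholder model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Pull the model straight from the Hugging Face Hub, then compile it
# for the Qualcomm inference accelerator.
model = QEFFAutoModelForCausalLM.from_pretrained(model_id)
model.compile(num_cores=16)  # assumed compile configuration

# Run generation on-device.
model.generate(
    prompts=["Summarize the key specs of a rack-scale inference system."],
    tokenizer=tokenizer,
)
```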
The AI200 will be commercially available in 2026, followed by the AI250 in 2027. Qualcomm confirmed that these products mark the beginning of a multi-generation AI data center roadmap with an annual release cadence, emphasizing energy efficiency, scalability, and cost leadership.
KEY QUOTES
“With Qualcomm AI200 and AI250, we’re redefining what’s possible for rack-scale AI inference. These innovative new AI infrastructure solutions empower customers to deploy generative AI at unprecedented TCO, while maintaining the flexibility and security modern data centers demand. Our rich software stack and open ecosystem support make it easier than ever for developers and enterprises to integrate, manage, and scale already trained AI models on our optimized AI inference solutions. With seamless compatibility for leading AI frameworks and one-click model deployment, Qualcomm AI200 and AI250 are designed for frictionless adoption and rapid innovation.”
Durga Malladi, Senior Vice President and General Manager, Technology Planning, Edge Solutions & Data Center, Qualcomm Technologies