Baseten, an AI inference company focused on running modern models in production, has raised $300 million in new financing at a $5 billion valuation. IVP, CapitalG, and NVIDIA anchored the round, as demand accelerates for high-performance infrastructure that can reliably deploy and operate AI models at scale.
The funding marks Baseten’s third capital raise in the past year, underscoring the surge in customer demand for production-grade inference, including real-time performance, reliability, cost controls, and operational tooling. Baseten said its platform is used by a growing set of AI application developers, naming customers including Cursor, OpenEvidence, Abridge, Notion, and Clay.
The company positioned inference as the defining AI narrative heading into 2026, arguing that the industry’s center of gravity is shifting from training larger models to delivering fast, dependable, and economical inference in real workflows. Baseten cited analyst estimates that inference will represent roughly two-thirds of all AI compute by the end of 2026, up from about one-third in 2023, and said its infrastructure is built to support an expanding universe of specialized models.
Baseten is emphasizing what it calls a “multi-model future,” where organizations run many custom or domain-specific models rather than relying on a small set of generalized, centralized systems. The company said it provides an independent inference layer designed for strong guardrails, security, observability, and flexibility across cloud environments, aiming to help customers control their infrastructure choices while protecting their intellectual property.
Baseten also highlighted product principles intended to reduce vendor lock-in: open runtimes rather than proprietary constraints on model weights, no restrictions tying customer models to its platform, and multi-cloud flexibility to optimize for reliability and cost. The company framed these capabilities, alongside developer experience and performance, as the reasons customers standardize on its platform as they scale.
Founded in 2019 and based in San Francisco, Baseten said it has raised $585 million to date. In addition to the anchor investors in the new round, the company listed other investors, including Conviction, Bond, Greylock, and Spark Capital.
KEY QUOTES:
“If cloud was the foundation that enabled the last generation of great technology companies, inference is the foundation for the next. Every breakout AI application depends on fast, reliable, and cost-effective inference. We’ve spent six years building the infrastructure to make that possible—and we’re ready for this next chapter of hundreds and then thousands of new models.”
“Our customers are building precise models for everything from elite software development to medical documentation to high-stakes legal reasoning, and they need a platform that will allow them to deliver deep specificity, expertise, and performance. This is the type of user experience — and the promise of AI — we’re dedicated to making possible.”
Tuhin Srivastava, Co-founder and CEO, Baseten
“Baseten lets us run the models we need, the way we need to run them. The performance is best-in-class, but what sets them apart is everything else: the reliability, the developer experience, the fact that they’re constantly finding ways to lower our costs. They’re a partner, not a vendor.”
Shiv Rao, Co-founder and CEO, Abridge
“Baseten is quickly becoming default infrastructure. In a world where every ambitious AI team wants to run many models and fully own its IP, Baseten gives them the freedom, reliability, and economics to do that at scale. That combination—open runtimes, multi-cloud resilience, and a deeply considered developer experience—is the new standard the best companies expect.”
Sarah Guo, General Partner, Conviction

