Gimlet Labs: $80 Million Raised For AI Inference Software Platform Scaling Multi-Silicon Infrastructure

By Amit Chowdhry • Today at 1:54 PM

Gimlet Labs has raised $80 million in a Series A funding round led by Menlo Ventures, bringing the company’s total funding to $92 million as it looks to scale its multi-silicon AI inference platform.

The San Francisco-based applied AI research and product company is focused on improving the performance and efficiency of AI inference workloads, particularly for agentic AI systems. Since emerging from stealth five months ago, the company reports eight-figure revenue, a tripling of its customer base, and the addition of major customers, including a top frontier AI lab and a leading hyperscaler.

Gimlet Labs’ core offering is its inference cloud, which orchestrates AI workloads across heterogeneous hardware environments. The platform is designed to address growing inefficiencies in traditional AI infrastructure, where reliance on homogeneous hardware has created bottlenecks in latency, power consumption, and overall performance.

The company’s software stack automatically maps workloads to the most suitable chips across a mix of GPUs, CPUs, and emerging architectures, and can even split a single model across different hardware types. This approach enables faster execution and improved efficiency, with the company citing performance gains of three to ten times at the same cost and power levels.
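Gimlet's actual placement logic is proprietary and not described in detail. As a rough illustration of the general idea of mapping workloads to the most suitable silicon, here is a toy cost-model scheduler in Python; all chip names, throughput figures, and the latency-vs-efficiency heuristic are invented for the sketch and are not Gimlet's method.

```python
from dataclasses import dataclass

@dataclass
class Chip:
    name: str
    tokens_per_sec: float   # sustained inference throughput (hypothetical)
    watts: float            # power draw under load (hypothetical)

@dataclass
class Workload:
    name: str
    latency_sensitive: bool

def tokens_per_joule(chip: Chip) -> float:
    """Efficiency metric: throughput per watt of power draw."""
    return chip.tokens_per_sec / chip.watts

def place(workload: Workload, fleet: list[Chip]) -> Chip:
    """Toy placement rule: latency-sensitive jobs go to the fastest
    chip; batch jobs go to the most power-efficient one."""
    if workload.latency_sensitive:
        return max(fleet, key=lambda c: c.tokens_per_sec)
    return max(fleet, key=tokens_per_joule)

# A heterogeneous fleet with made-up numbers.
fleet = [
    Chip("gpu-a", tokens_per_sec=12000, watts=700),
    Chip("cpu-b", tokens_per_sec=900, watts=120),
    Chip("accel-c", tokens_per_sec=8000, watts=250),
]

print(place(Workload("chat-agent", latency_sensitive=True), fleet).name)   # gpu-a
print(place(Workload("batch-eval", latency_sensitive=False), fleet).name)  # accel-c
```

In this sketch the interactive job lands on the raw-throughput leader while the batch job lands on the chip with the best tokens-per-joule, which is the flavor of trade-off the article's "same cost and power levels" framing implies; a production orchestrator would of course weigh far more factors (memory, interconnect, model partitioning).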

Gimlet Labs also operates its own multi-silicon data centers while allowing customers to deploy its software within their own infrastructure. The company is working with leading chipmakers, including NVIDIA, AMD, Intel, ARM, Cerebras, and d-Matrix, to support its heterogeneous computing approach.

The new funding will be used to expand the company’s team and accelerate the deployment of its inference cloud as demand increases from AI labs seeking more efficient ways to run large-scale models and agent-based systems.

KEY QUOTES:

“We’ve entered a fundamentally new era of computing where the speed of intelligence has become the critical bottleneck. In order to unlock the next 10-100X performance increases needed in use cases like coding agents, we’ve identified how to leverage heterogeneous hardware for faster, more efficient inference. At Gimlet, we’re seeing this approach deliver an order of magnitude better performance per watt for our customers which is critical for anyone operating at scale given today’s datacenter capacity bottlenecks,”

Zain Asgar, Co-Founder And CEO Of Gimlet Labs

“Heterogeneity is inevitable, and Gimlet Labs is ahead of it. Most infrastructure was built for a homogeneous world — and the industry is paying hundreds of billions in CapEx for it. Gimlet built the only infrastructure designed from the ground up to embrace heterogeneity, purpose-built for agentic AI at scale. The research pedigree and deployment experience this team brings is unmatched,”

Tim Tully, Partner At Menlo Ventures

“From our vantage point, the world’s largest AI infrastructure buildouts, such as from foundation model labs to sovereign cloud environments, etc., are all converging on the same realization: homogeneous silicon cannot deliver the performance and efficiency these deployments require. Gimlet’s multi-silicon orchestration is the missing layer in the stack, and we believe their technology will become a foundational layer for AI at scale in the world’s largest AI deployments,”

Abhishek Shukla, Managing Director At Prosperity7 Ventures US