Predibase: Helping Engineers Fine-Tune And Serve Open-Source AI Models Across Clouds

By Amit Chowdhry • Feb 8, 2024

Built by developers, for developers, Predibase enables any software engineer to do the work of an ML engineer in an easy-to-understand declarative way. Pulse 2.0 interviewed Predibase Co-founder and CEO Devvret Rishi to learn more about the company.

Dev Rishi’s Background


Rishi’s academic background is in AI: he earned a bachelor’s and a master’s in computer science from Harvard University, with a focus on ML. Rishi said:

“I enjoyed seeing the full lifecycle around technology, from ideation through dissemination, and so I joined Google as a product manager in 2016, initially working on Firebase – an acquired startup and developer platform. I then spent time as a PM with Google Research and finally a team that became Vertex AI, Google Cloud’s machine learning platform. I was also the first PM on Kaggle, a data science community that grew from about a million users to ten million users, and helped me see firsthand the promise of democratized machine learning.”

“In 2017, one of my co-founders, Piero Molino, was part of an acquired startup that became the foundation of Uber’s AI organization. Piero wasn’t happy reinventing the wheel for each new ML project he worked on at the company, so he created a framework that helped him deliver machine learning projects much more quickly by abstracting out end-to-end pipelines behind just a few lines of configuration. The project was widely adopted internally, so he worked with Uber to open-source it and then saw a strong response from the open-source community on GitHub as well. In 2020, he knew he wanted to start a company and met Travis and our final co-founder, Professor Chris Ré, at Stanford. Travis had been a tech lead at Uber as well, working on an open-source project called Horovod that made it easier to train deep-learning models at scale. I had been a PM at Google, most recently on Vertex AI and also the first PM on Kaggle. I had seen both the massive growth in practitioner interest in machine learning and the struggles organizations onboarding to GCP had in productionizing ML. We decided to take our combined experiences and go all in on making machine learning massively simpler for developers.”
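The “few lines of configuration” Rishi describes is the declarative idea behind Ludwig: you declare the input and output columns of your data, and the framework assembles the preprocessing, model architecture, and training loop for you. As a minimal sketch (the column names “review_text” and “sentiment” are illustrative placeholders, not from the article), a config in Ludwig’s documented input_features/output_features shape looks roughly like:

```python
# A Ludwig-style declarative config: declare the data schema, and the
# framework infers the preprocessing, model, and training pipeline.
# Column names here are hypothetical placeholders.
config = {
    "input_features": [
        {"name": "review_text", "type": "text"},
    ],
    "output_features": [
        {"name": "sentiment", "type": "category"},
    ],
}

# With Ludwig installed, training would then be roughly:
#   from ludwig.api import LudwigModel
#   model = LudwigModel(config)
#   model.train(dataset="reviews.csv")
print(config)
```

The point of the design is that swapping the feature types (text, image, numerical, category) is enough to retarget the same pipeline to a different task, which is what made the framework reusable across projects at Uber.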

“Today, Ludwig has over 10k stars on GitHub, and we all see a massive opportunity to democratize AI by moving it from the walled garden of expert data scientists to a world where engineers can build best-in-class AI applications in minutes.”

Core Products

What are Predibase’s core products and features? Rishi explained:

“Predibase is the fastest, most efficient way to productionize open-source AI. As the first commercially available AI platform designed for developers, Predibase makes it easy for engineering teams to fine-tune and serve any open-source LLM or deep learning model on state-of-the-art managed infra in the cloud—at a much lower cost than using out-of-the-box models from commercial vendors like OpenAI.”

“Built by members of the teams that created the internal AI platforms at Apple and Uber, Predibase is highly scalable and built for production AI. We pair an easy-to-use declarative interface with high-end GPU capacity on serverless managed infra to ensure engineers have a cost-effective solution for the complete ML lifecycle. Most importantly, Predibase is built on open-source foundations and can be deployed in your private cloud, so all of your data and models stay in your control.”

“In use with both Fortune 500 and high-growth tech companies, Predibase is helping engineering teams deliver AI-driven value back to their organization in days, not months.”

Challenges Faced

When asked about challenges the company has faced, Rishi acknowledged:

“Our bet is that over the next year the market will graduate from single, behemoth general-purpose APIs to smaller, task-specific models deployed to solve an organization’s use case. We’re in the early innings of that shift, so the number of organizations that have both the data and the expertise to fine-tune their own smaller LLM is a bottleneck for our business, but we are working hard to continually lower that barrier and grow the pool of organizations that could deploy their own mini-LLM to solve a task.”

Evolution Of Predibase’s Technology

How has the company’s technology evolved since launching? Rishi noted:

“We set out to build an AI platform that made it easy for teams to build state-of-the-art deep learning applications. With the massive interest and adoption of LLMs (which are a form of deep learning models), we’ve turned our attention to building industry-leading capabilities that make it easy and cost-effective for teams to customize open-source LLMs that can outperform commercial LLMs at a fraction of the cost. We also made it possible to train and serve models at scale on widely available commodity GPUs, which lets any organization integrate AI even in the face of an industry-wide shortage of high-end GPUs.”

Significant Milestones

What have been some of the company’s most significant milestones? Rishi cited:

“1.) Building a platform to extend the functionality of the open source Ludwig low code AI framework (which was originally developed by our co-founder Piero Molino).

2.) Raising $28 million in seed and Series A funding.

3.) Launching innovations like LoRAX, which are game-changing for our customers, enabling them to serve hundreds of custom models in production at the cost of serving one.

4.) Our open-source project Ludwig recently reached over 10K stars on GitHub.

5.) Customers have trained several hundred models on Predibase.”
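The LoRAX milestone above refers to Predibase’s open-source multi-adapter serving project: many LoRA fine-tunes share a single base model on one deployment, and each request names the adapter that should handle it. As a hedged sketch (the endpoint and adapter IDs are hypothetical; the inputs/parameters/adapter_id request shape follows LoRAX’s generate API), the per-request routing looks roughly like:

```python
import json


def make_generate_payload(prompt, adapter_id=None, max_new_tokens=64):
    """Build a request body in the shape LoRAX's /generate endpoint accepts.

    adapter_id selects which fine-tuned LoRA adapter serves this request;
    omitting it falls back to the shared base model.
    """
    params = {"max_new_tokens": max_new_tokens}
    if adapter_id is not None:
        params["adapter_id"] = adapter_id
    return {"inputs": prompt, "parameters": params}


# Two requests to the same deployment, each routed to a different
# customer-specific adapter (adapter names are made up for illustration):
payload_a = make_generate_payload(
    "Summarize this support ticket: ...", adapter_id="acme/support-summarizer"
)
payload_b = make_generate_payload(
    "Classify the sentiment of: ...", adapter_id="acme/sentiment-v2"
)
print(json.dumps(payload_a, indent=2))
```

Because only the small LoRA weight deltas differ between adapters, the server keeps one copy of the base model in GPU memory and swaps adapters per request, which is what makes “hundreds of models at the cost of one” feasible.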

Customer Success Stories

When asked about customer success stories, Rishi highlighted:

“Quotes from our customers:

Damian Cristian, Co-Founder and CEO of Koble.ai, an investment platform that uses AI to identify early-stage companies that outperform the market: ‘We adopted Predibase to save our team months of effort developing infrastructure for training and serving complex open source LLMs. With Predibase, we can experiment and iterate faster with less custom work and have the option to deploy models in our own cloud. Now we don’t need to worry about scaling our infrastructure as we grow because Predibase supports efficient fine-tuning and serving of even the largest models like LLaMA-2-70B in production on A100 GPUs.'”

“Anand Gomes, Co-Founder and CEO, Paradigm: ‘With over $200B in trades, Paradigm is the largest global liquidity network for cryptocurrencies. One of our top priorities is helping traders make smarter decisions with AI. By adopting Predibase and their declarative approach to ML, our team of engineers has built new product capabilities that were previously not possible, and best of all, the time it takes to build production models on top of Snowflake has been reduced from months to minutes and at a fraction of the cost. With this technology, we’ve built powerful relevance scoring and in-platform intelligence that helps our customers identify trading opportunities and capture edge.'”

“Lindsay Ng, Sr Data Scientist, Payscale: ‘As a leader in compensation software, we provide our thousands of clients with insights to improve pay equity, transparency and compensation planning. Declarative machine learning is critical to how we build and deploy some of the ML services that make this possible. Our data science team relies on declarative ML to rapidly iterate on our tools which utilize multi-modal datasets and state-of-the-art language models.'”

“Harsh Singhal, Head of Machine Learning & AI, Koo: ‘Every day, millions of users share their experiences with each other on our global social media platform. Declarative ML makes it easy for our team of data scientists and engineers to train state-of-the-art language models on top of these large volumes of unstructured data. With these insights, we’re able to build more personalized user experiences. Furthermore, the flexibility of declarative ML enables us to rapidly experiment with many different data modalities and model architectures for a broad range of use cases.'”

“David Thau, Global Data and Technology Lead Scientist, World Wildlife Fund: ‘We see a massive opportunity for customized open-source LLMs to help our teams generate real-time insights across our large corpus of project reports. The insights generated by this effort have big potential to improve the outcomes of our conservation efforts. We’re excited to partner with Predibase on this initiative.'”

Funding

When asked about the company’s funding and revenue, Rishi revealed:

“To date, we’ve raised $28 million in seed and Series A funding. As a private company, we don’t disclose revenue. We can say we have several customers, spanning high-growth tech companies and larger enterprises.”

Total Addressable Market

What total addressable market (TAM) size is the company pursuing? Rishi assessed:

“We see a future in which every engineering organization within a company builds and deploys task-specific AI models using a platform like Predibase.”

Differentiation From The Competition

What differentiates the company from its competition? Rishi affirmed:

“Our key differentiators today are:

1.) Our declarative ML framework, which makes it much easier and faster to get AI projects into production with minimal coding.

2.) The flexibility of our platform, which provides guided co-pilot support for data scientists and engineers who are less experienced with ML while giving sophisticated ML teams fine-grained control over their projects. It also enables organizations to work with nearly any open-source or commercial LLM.

3.) Our ability to train and serve models on widely available commodity GPUs, which dramatically lowers the cost and reduces the time required for model fine-tuning and serving. This brings AI within reach of organizations that may not have the resources to access high-end GPUs.”

Future Company Goals

What are some of the company’s future goals? Rishi concluded:

“We want to be the unified interface for machine learning. We start by making it easier to fine-tune and deploy your own large language model to solve a specific task, but we want to broaden the kinds of models that we are able to support to be multi-modal (support images, audio, etc.) and increase the set of tasks we can help organizations solve.”