Bauplan, a Python-first serverless data platform that reduces complex infrastructure processes to a few lines of code over data lakes, announced its launch with $7.5 million in seed funding. Innovation Endeavors led the round, which also included experienced operators Wes McKinney, Aditya Agarwal, and Chris Ré.
The data infrastructure industry has long been dominated by complex platforms and specialized teams. This has limited data-intensive operations to costly, highly skilled professionals, hindered innovation, and distracted developers with intricate cloud setups. Industry needs are changing, however: AI is increasingly deployed through APIs, and data is moving to object storage. As a result, data infrastructure must become accessible to software engineers who are skilled in application development but not necessarily in big-data systems.
These developers need code-first solutions that integrate with CI/CD workflows and prioritize Python, free from SQL or Spark limitations.
Recently launched, Bauplan offers a serverless runtime that processes large datasets directly on object storage in pure Python. Developers can build powerful applications using simple serverless Python functions and familiar git-like concepts such as branch, commit, and merge. The programmable, code-first platform is built for automation, aiming to empower the next generation of data developers and to eliminate the need for Kubernetes, Spark, and specialized infrastructure management.
Bauplan was created by Ciro Greco (CEO), Jacopo Tagliabue, and Mattia Pavoni, who took their previous company, Tooso, from inception to IPO through its acquisition by Coveo. The team has long worked on data and machine learning: its members have published more than 60 research papers, earned thousands of citations, and created popular open-source contributions with over 50 million downloads and 10k GitHub stars.
Bauplan is ideal for infrastructure and data science teams in medium- and large-sized enterprises with data-intensive use cases involving machine learning and AI applications, especially in B2B software, media, financial services, and healthcare tech. Enterprise design partners, such as MFE-MediaForEurope, a leading European broadcaster, already use it.
The platform offers serverless Python functions vertically integrated with object storage, with native support for Iceberg tables and git-like operations over data lakes, such as zero-copy branches and automatic data versioning. Developers can build pipelines and applications over data branches without managing Kubernetes, Spark, or any other infrastructure. Using a simple Python SDK, they can also manage the entire data lifecycle as a CI/CD workflow.
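To make the idea of zero-copy branches concrete, here is a minimal sketch in plain Python. It is not Bauplan's SDK or API: the Catalog and Commit classes, their methods, and the s3:// paths are all hypothetical, invented only to illustrate the general technique. The sketch assumes that a branch is just a named pointer to an immutable snapshot of table metadata, so that creating a branch copies no data files and a merge only moves a pointer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """Immutable snapshot: maps a table name to a tuple of data file paths."""
    tables: dict
    parent: Optional["Commit"] = None


class Catalog:
    """Toy catalog (hypothetical, not Bauplan's API): branch name -> head commit."""

    def __init__(self):
        self.branches = {"main": Commit(tables={})}

    def branch(self, name, from_branch="main"):
        # Zero-copy: the new branch is only another pointer to the same commit.
        self.branches[name] = self.branches[from_branch]

    def commit(self, branch, table, files):
        head = self.branches[branch]
        tables = dict(head.tables)  # copies file references, never the files
        tables[table] = tuple(files)
        self.branches[branch] = Commit(tables=tables, parent=head)

    def merge(self, source, into="main"):
        # Fast-forward only: `into` must be an ancestor of `source`.
        target = self.branches[into]
        node = self.branches[source]
        while node is not None and node is not target:
            node = node.parent
        if node is None:
            raise ValueError("branches diverged; this toy supports fast-forward only")
        self.branches[into] = self.branches[source]


# Hypothetical object-storage paths, used only for illustration.
catalog = Catalog()
catalog.commit("main", "orders", ["s3://lake/orders/part-0.parquet"])

catalog.branch("dev")
same_snapshot = catalog.branches["dev"] is catalog.branches["main"]

catalog.commit(
    "dev",
    "orders",
    ["s3://lake/orders/part-0.parquet", "s3://lake/orders/part-1.parquet"],
)
main_before = catalog.branches["main"].tables["orders"]  # main is unaffected by dev

catalog.merge("dev", into="main")
main_after = catalog.branches["main"].tables["orders"]
```

In this toy model, every commit keeps a link to its parent, which is what gives automatic versioning: earlier states of a table remain reachable by walking the parent chain. Table formats such as Iceberg apply a similar principle through snapshot metadata, though the details differ considerably from this sketch.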
The new funding will be used for product development and initial market validation with early customers.
New board member: Ihab Ilyas joins as an advisor.
KEY QUOTES:
“Data today looks a lot like DevOps a decade ago. Back then, infrastructure-as-code allowed all kinds of developers to automate a lot of stuff. Data is going through the same process today. We had a revelatory moment at the beginning of this year when a large infrastructure team put the system in production and we went from zero to 40,000 jobs per week.”
– Ciro Greco, CEO and co-founder of Bauplan
“Bauplan has rapidly taken hold across our organization. Developers who have the expertise to work on data can now focus on the actual work and never deal with infrastructure. At the same time, developers who are green or have a traditional backend and software engineering background can now build production-grade solutions with data. We went from being stuck with infrastructure to unlocking a wide range of new use cases in just weeks.”
– Fabio Melen, Head of Data Technology at MFE-MediaForEurope
“Bauplan has done pioneering infrastructure work to create a Lambda-like experience for complex data and AI workloads. By removing all the infrastructure complexity & abstraction overhead of tools like Spark, they allow any software engineer to be a data engineer. This is an essential shift as all companies become AI-driven.”
– Davis Treybig, Partner at Innovation Endeavors