Starburst: How This Rapidly Growing Data Lakehouse Provider Is Transforming A $30 Billion Market

By Amit Chowdhry • May 15, 2024

Starburst is a company that offers a full-featured open data lakehouse platform built on open-source Trino. The company’s end-to-end analytics platform includes the capabilities needed to discover, organize, consume, and share data with industry-leading price performance for cloud and on-premises workloads. Pulse 2.0 interviewed Starburst co-founder and CEO Justin Borgman to learn more about the company.

Justin Borgman’s Background

What is Justin Borgman’s background? Borgman said:

“My background is primarily in technology and entrepreneurship, particularly data management and analytics. I began my career as a software engineer, gaining practical experience developing software solutions. This shaped my understanding of technology and its applications, leading to my founding of Hadapt, a company that created ‘SQL on Hadoop,’ turning Hadoop from a file system to an analytic database accessible by any BI tool. In essence, this was the earliest version of what is now known as a ‘Lakehouse architecture.’ In 2014, Hadapt was acquired by Teradata, where I became Vice President & GM responsible for the company’s portfolio of Hadoop products before founding Starburst in 2017.”

Formation Of Starburst

How did the idea for the company come together? Borgman shared:

“The genesis of Starburst was the recognition of a pressing need within the industry for a more efficient and scalable solution to accessing and analyzing large volumes of data across disparate sources. Emerging trends such as the exponential growth of big data, the increasing adoption of cloud computing, and now AI, further reinforced this, highlighting the limitations and costly lock-in of traditional data warehousing approaches.”

“The vision for Starburst took hold when I encountered Trino (initially called Presto), an open-source distributed SQL query engine developed at Meta/Facebook to address their massive data processing needs. The Trino co-creators, two of my peers at Teradata, and I came together to form Starburst. Together, we created a platform that leverages cutting-edge technologies to democratize data access and empower organizations to derive actionable insights from their data more effectively and with unprecedented price performance. Simply put, we free our customers by allowing them to perform data warehousing analytics without the proprietary data warehouse.”

Favorite Memory

What has been Borgman’s favorite memory working for the company so far? Borgman reflected:

“There are so many, but one of my earliest memories is back in the early days of the company’s founding when our global headquarters was a Caffè Nero in Boston. It wasn’t uncommon for folks to walk in and see this ambitious and sometimes frantic group of individuals coming together to start laying the foundation for something iconic to be created. Like most startups at this stage, we were a small, agile, and scrappy team, hungry for someone just to hear us out and, more often than not, treading for our dear lives in the deep end while learning how to swim. The best part was that a $2 coffee could get you a whole day of free Wi-Fi and meeting space, which for a bootstrapped startup was invaluable.”

Core Products

What are the company’s core products and features? Borgman explained:

“Starburst offers an open data lakehouse to provide an end-to-end analytics platform with industry-leading price-performance for both cloud and on-premises workloads.”

“At Starburst, we believe freeing our customers from vendor lock-in while providing best-in-class value is mission-critical. We also fundamentally believe that the best materialization of a lakehouse is a concept we’ve defined as Icehouse—a data lakehouse built on Apache Iceberg and Trino. With Icehouse, not only is the data portable because of the open file and table formats, but the customer now owns SQL because it is not locked in some proprietary language.”

“To bring Icehouse to life, we offer two products to enable our customers to use Starburst within their multi-cloud, hybrid, or on-premises data estate. First, Starburst Galaxy is our fully managed SaaS for end-to-end data and analytics. Second, Starburst Enterprise is self-managed software that can be deployed anywhere (on-premises, in the cloud, or hybrid) and can run in some of the world’s most security-sensitive environments. With both products, we’ve streamlined data access and analytics for organizations dealing with large and complex datasets.”

Challenges Faced

What challenges has Borgman faced in building the company? Borgman acknowledged:

“These past few years have been challenging for many companies in the data and analytics space. AI has taken center stage, leading to a frenzy among companies to catch up, alongside ever-changing geopolitical headwinds and market volatility.”

“I’m pleased to share that through these past couple of years, Starburst has fared well, and it’s not because we got lucky. It’s because we are helping businesses solve real problems in a way they haven’t been able to before. Like everyone, yes, we are anticipating future speed bumps. Still, we are more focused on continuing to build compelling products that solve meaningful business problems while building a well-oiled execution machine across Starburst.”

Evolution Of Starburst’s Technology

How has the company’s technology evolved since launching? Borgman noted:

“Starburst’s technology has evolved significantly since its inception, driven by a commitment to innovation and a deep understanding of the evolving needs of data companies everywhere. Over the past short but exhilarating seven years, Starburst has gone from primarily offering solutions and services for Trino, to realizing we can actually help solve the analytics problem better by building ready-to-use software on top of Trino.”

“This enables organizations that don’t have the in-house capabilities to build complete stacks using open source to now benefit from a commercially viable and ready-to-go solution. Then we started seeing the trend, driven by customer demand, towards SaaS, which led to the introduction of Starburst Galaxy. As a SaaS offering, Galaxy is a full-fledged end-to-end analytics platform optimized for an Icehouse-centric data lakehouse.”

Significant Milestones

What have been some of the company’s most significant milestones recently? Borgman cited:

“Starburst has achieved several significant milestones that have shaped its trajectory as a leading data lakehouse provider. Some of the more recent ones include:

  1. The scaling of Starburst Galaxy, our cloud-native platform that centralizes management and deployment capabilities for Starburst clusters across multi-cloud and hybrid cloud environments. It simplifies cluster provisioning, configuration, and monitoring, empowering organizations to scale their data infrastructure seamlessly and efficiently.
  2. We are trusted by some of the most iconic brands in the world, including seven of the top ten banks, six of the top ten pharmaceutical companies, and four of the top seven US telecommunications firms.
  3. This year, two strategic partnerships were announced. The first is the launch of Dell’s Data Lakehouse, which is powered by Starburst. The second is that Starburst is the only enterprise-grade analytics engine certified to operate within Google’s Distributed Cloud for highly security-conscious organizations needing air-gapped data architectures.”

Customer Success Stories

Asked about customer success stories, Borgman highlighted:

“Resilience is a remarkable biopharmaceutical manufacturing company with a mission to revolutionize how medicines are made. To achieve this, a focus of its data team was to eliminate inefficiencies within data processing by using Starburst’s data product capabilities. Initially intended for R&D, Starburst’s data products saw significant adoption, prompting its expanded role within the organization. Resilience now supports over 30 purpose-built data products across six domains. This enabled instant reporting across 20+ key applications, resulting in over 200,000 unique dataset accesses and a 20-fold increase in BI tool usage at Resilience in only 4 months. Now Resilience benefits from powerful analytics powered by Starburst on a consistent set of trusted data products, securely democratized across the company to make informed decisions faster.”

“Volkswagen Group, which I don’t think needs an introduction, embarked on a mission to bring the data mesh concept to the company. VW built an open lakehouse architecture with Iceberg and Delta Parquet as its foundation and Starburst for price-performant querying, data federation, data product creation, and attribute-based access controls for data governance. As a result, the use of Starburst and Iceberg has successfully connected different data ecosystems into a data mesh. This integration includes one petabyte of data across traditional data warehouses, S3-based storage in the cloud and on-prem, and a cloud-based data lakehouse.”

“Lastly, Citi implemented Starburst across their Hybrid Cloud solution, spanning over 100 markets and managing around 25 trillion dollars of Assets Under Custody and Administration. This has enabled enhanced efficiency and innovation across the organization.”

Funding/Revenue

Asked about the company’s funding and revenue information, Borgman revealed:

“Starburst has experienced significant growth supported by several rounds of successful funding, propelling us to unicorn status and helping to solidify our position as a leader in the data access and analytics space.”

“Since its inception, Starburst has attracted substantial investment from prominent venture capital firms and strategic investors. Notable funding rounds include Series A, B, and C rounds led by top-tier investors such as Index Ventures, Coatue Management, and Andreessen Horowitz. With each round of funding, Starburst has demonstrated strong investor confidence in its vision and potential to revolutionize data access and analytics.”

“This robust financial backing has enabled Starburst to accelerate product development, expand its customer base, and invest in key strategic initiatives. As a result, Starburst has reached a valuation of over $3 billion and continues to experience rapid growth in terms of revenue, customer acquisition, and market penetration. Our impressive growth metrics underscore our ability to deliver tangible value to customers and capitalize on the growing demand for modern data solutions in today’s data-driven world.”

Total Addressable Market

What total addressable market (TAM) size is the company pursuing? Borgman assessed:

“Starburst targets a significant portion of the Total Addressable Market (TAM) for data analytics and cloud computing solutions. This TAM encompasses a broad spectrum of industries, including but not limited to e-commerce, finance, technology, healthcare, and marketing. As more businesses across various sectors recognize the importance of leveraging data for informed decision-making, the TAM for solutions like our data lake platform continues to expand rapidly.”

“Today, we estimate the opportunity size to be upward of $30 billion annually. With the increasing volume and complexity of data being generated globally, we aim to capture a substantial share of this growing market by providing simple, scalable, cost-effective, and high-performance solutions to meet the evolving needs of modern data-centric businesses.”

Differentiation From The Competition

What differentiates the company from its competition? Borgman affirmed:

“Because the TAM that Starburst participates in is massive, it also means we are going head-to-head against competitors significantly larger and more mature than us; remember, Starburst is just about 7 years old. And here is why we can hold our ground against these giants and win.

  1. Data Acceleration Technology: Starburst specializes in data acceleration, offering a highly efficient and scalable platform that enables organizations to query vast amounts of data at petabyte scale across disparate sources with lightning-fast speed.
  2. Cloud-Native Architecture: Unlike traditional data warehouses, Starburst is built on a cloud-native architecture, which means it seamlessly integrates with popular cloud platforms like AWS, Azure, and Google Cloud, as well as on-premises architectures, ensuring optimal performance, scalability, and flexibility.
  3. Cost-Effectiveness: Starburst offers a cost-effective alternative to traditional data warehouses and legacy Hadoop-based architectures by utilizing modern open-source technologies and efficient query optimization techniques. 
  4. Compatibility and True Platform Openness: Starburst is compatible with a wide range of data sources, governance, analytics, BI, visualization, and AI tools, enabling interoperability with customers’ existing investments and ensuring seamless integration with their existing workflows. Also, Starburst goes beyond other lakehouse approaches by supporting all the popular open file and table formats and providing an enhanced SQL query engine built on open-source technology.”

Balancing Open-Source Trino With Commercial Offering

How do you and your co-founders balance the demands of continuing to contribute back to open-source Trino while improving your commercial offering? Borgman concluded:

“You’re right. This is a big challenge faced by companies that start from an open-source project. In fact, at times, Starburst has open-sourced things that raised eyebrows about why we did not monetize them, but in the long run, those decisions have been better for the community and Starburst.

“I think a few factors have allowed Starburst to maintain a healthy balance between open source and commercial development.

  1. For open source, it’s important to nurture a rich, vibrant community surrounding the project and actively contribute. This allows for innovation in the project to be fostered from anywhere in the community and allows everyone to benefit. 
  2. Co-creators of Trino are Starburst employees. We have been fortunate that Martin, Dain, and David have been with Starburst as CTOs since its inception and continue to be actively involved in the open-source project as well. We believe having the trio on the team has helped ensure Starburst continues to contribute to the community while forcing us to innovate around Trino’s core engine.
  3. Next, do not lose sight of what the core open-source project is and what the commercial offering is. I think in many organizations, the lines blur and then it’s hard to keep things exclusive for commercial use without disenfranchising the open-source community. So for us, it’s important to know how we are making the engine better and then to have an honest debate about where the enhancements can benefit users the most. The second part is this: beyond the core open-source project, what capabilities are you building around the open source that further enhance the project as used in the commercial offering?
  4. Lastly, and this is probably the hardest thing for many founding teams to internalize, not every user of the open-source project can or should be converted into a paid customer. There are some teams and businesses out there that simply love to or, out of necessity, have to build everything in-house, and that is the beauty of open source. In many cases, those teams are probably not a good fit. So, getting your ideal customer profile (ICP) right is also important.”