Cube: Interview With Co-Founder And CEO Artyom Keydunov About The Universal Semantic Layer Company

By Amit Chowdhry • Nov 13, 2024

Cube brings consistency, context, and trust to the next generation of data experiences. Cube Cloud is a leading AI-powered universal semantic layer platform, helping companies of any size to manage and deliver trusted data with a single source of truth. Any data source can be unified, governed, optimized, and integrated with any data application: AI, BI, spreadsheets, and embedded analytics. Pulse 2.0 interviewed Cube co-founder and CEO Artyom Keydunov to learn more about the company.

Formation Of Cube

How did the idea for Cube come together? Keydunov said:

“Pavel Tiunov (CTO) and I started Cube to solve a real problem: data models are spread across too many BI tools and cloud data warehouse silos, and data engineers need an open and universal semantic layer – a single source of truth – that can feed metrics and defined data to any data app, BI tool or AI bot.”

Core Products

What are the company’s core products and features? Keydunov explained:

“Cube Cloud is a universal semantic layer that makes it easy to connect siloed data, create consistent metrics, and make them accessible to any data experience. Our solution provides unmatched integrations and interoperability, supporting a robust set of deployment options, data connectivity, coding languages, and native APIs so that customers can build solutions on the modern data stack to their unique requirements. Cube Cloud provides the following benefits:

Unify fragmented business definitions: Consolidate data modeling workflows and create a single source of truth with consistent metrics defined once.

Centralize and enforce fine-grained governance and security policies: Use data access controls to grant row and column level permissions and mask sensitive data upstream.

Achieve faster, more cost-efficient results: Optimize query performance with Cube Store’s powerful caching layer and advanced pre-aggregation capabilities.

Integrate with any endpoint: Use Cube’s AI, GraphQL, REST, SQL, MDX, and Orchestration APIs to deliver trusted data.”
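To make "consistent metrics defined once" concrete, here is a minimal Python sketch. It is not Cube's data model format or API: the `Measure` and `Cube` classes, the `orders` table, and the member names are all hypothetical. It only shows the general idea that every consumer requests a measure by name and the SQL is generated from one shared definition.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Measure:
    """A named aggregate, defined once (hypothetical structure)."""
    name: str
    sql: str  # aggregate expression, e.g. "SUM(amount)"


@dataclass
class Cube:
    """A toy semantic model: one table plus its named members."""
    table: str
    dimensions: dict[str, str] = field(default_factory=dict)  # name -> column
    measures: dict[str, Measure] = field(default_factory=dict)

    def add_measure(self, name: str, sql: str) -> None:
        self.measures[name] = Measure(name, sql)

    def to_sql(self, measures: list[str], dimensions: list[str]) -> str:
        """Compile a request for named members into one SQL statement."""
        dim_cols = [f"{self.dimensions[d]} AS {d}" for d in dimensions]
        agg_cols = [f"{self.measures[m].sql} AS {m}" for m in measures]
        sql = f"SELECT {', '.join(dim_cols + agg_cols)} FROM {self.table}"
        if dimensions:
            # Group by the positions of the selected dimensions.
            group_by = ", ".join(str(i + 1) for i in range(len(dimensions)))
            sql += f" GROUP BY {group_by}"
        return sql


# Hypothetical model: revenue is defined in exactly one place.
orders = Cube(table="orders", dimensions={"status": "status"})
orders.add_measure("revenue", "SUM(amount)")

# Two different consumers ask for the same members and get the same SQL.
dashboard_sql = orders.to_sql(["revenue"], ["status"])
chatbot_sql = orders.to_sql(["revenue"], ["status"])
print(dashboard_sql)
```

Because both consumers go through the same definition, changing the `revenue` expression in one place changes it for every downstream tool at once.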

“Data engineers and application developers use Cube Cloud’s code-first, developer-oriented platform to organize and govern data from cloud data platforms into centralized, consistent data models and business definitions and deliver it to every downstream tool via its APIs. Apply software engineering best practices and processes to data management: CI/CD, isolated environments, and version control with Git integration. Use intelligent capabilities, such as Cube Copilot, data model code generation, and front-end embedded analytics code generation, to increase engineering productivity. We’ve recently launched the preview of Cube Visual Modeler, so that data teams can work more efficiently together while building data models and metrics visually. With Cube Cloud, business data becomes consistent, accurate, easy to access, and, most importantly, trusted.”

“Cube accelerates trusted data-driven decisions, delivering better experiences to employees inside the organization, customers outside the organization, and machines with our native OpenAI integrations. Customers can build their own GenAI experience with the AI API. For internal BI use cases, Cube Cloud provides a semantic catalog, now generally available, and no-code Generative AI capabilities, in preview, to simplify discovery, exploration, and access to modeled data and downstream, connected BI content for data analysts and business users. Customers can add unlimited named user accounts to allow anyone to search and reuse trusted data products and perform natural language queries in a simplified, business-user-friendly interface.”

Challenges Faced

What challenges has Keydunov faced in building the company? Keydunov acknowledged:

“I think everyone will agree that 2023 started the avalanche of AI marketing and hype, and it will continue. This makes it hard for a data message to break through all the noise. We are delivering AI products that our customer base is asking for. This approach ensures that we are meeting the needs of our users while staying relevant in the AI sphere.”

“We believe AI will provide a substantial competitive edge by enhancing product capabilities and improving user experiences built on the flexible foundation of Cube Cloud’s universal semantic layer. Cube’s AI strategy is based on three pillars: Assist, Augment, and Automate.”

Evolution Of Cube’s Technology

How has the company’s technology evolved since launching? Keydunov noted:

“We first built Statsbot, a Slack data app, and after tremendous interest, we expanded and launched the first version of the open-source Cube Core on Hacker News in 2019. Cube Core grew rapidly in the first several months, crossing 10,000 stars on GitHub, with thousands of data engineers and developers joining the community almost overnight.”

“The team built and launched the commercial version, Cube Cloud, in 2021. The technology of Cube Cloud has evolved rapidly over the last few years. We add features every month. The number of integrations, developer productivity capabilities, and AI capabilities increased. Here are a few highlights:

Semantic Layer Sync improves interoperability by closing the gap between semantic layers and BI tools. With Semantic Layer Sync, you can develop the data model and surface metrics from Cube to one or many BI tools in seconds, while ensuring that untested changes never break dashboards in a BI tool. Over the past year, we built integrations with Tableau and Power BI, in addition to Preset, Superset, and Metabase.

Data modeling can now be done through Python and Jinja in Cube, not just YAML and JavaScript. Jinja serves as the template engine and Python as the programming language for defining dynamic data models in YAML.

AI API (preview). Cube provides a standard interface for interacting with OpenAI as a turnkey solution for text-to-semantic layer queries using LLM-generated SQL.

Pre-aggregation index – Cube Cloud highlights queries that could benefit from pre-aggregation indexes in Query History and Performance Insights, hinting that you can get an additional performance boost by defining the correct indexes.”

Next-Generation Data Modeling Engine (Tesseract) (preview): This feature boosts performance, starting with the new SQL Planner. It adds support for multi-stage calculations, aggregating from query results so that organizations can tackle more complex analytical scenarios. 

Data Access Policies (preview): Cube lets users define comprehensive role-based access controls, including policies and roles. They can also integrate with external directory services and ensure secure and scalable data access tailored to their security and governance requirements.

Cube Copilot (preview): Cube Cloud provides intelligent recommendations and streamlines data modeling tasks in code, allowing engineers to focus on higher-value activities.

Cube Visual Modeler (preview): Cube has a no-code option now, with new clicks-into-code data modeling capabilities. This enables more contributors, such as data stewards and analysts, to participate in data definitions, fostering cross-team collaboration and improving communication.

Semantic Catalog: Cube users can search a unified view of trusted data assets—now with column/member-level lineage. This makes it easier for teams to find and use the right data for their needs and ensure that everyone is working with unified data, reducing errors and increasing trust. Available in Cube Cloud Premium and above.
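The pre-aggregation idea that runs through these highlights can be illustrated with a small Python sketch. This is a conceptual toy, not Cube Store's implementation: the sample rows and the `build_rollup` and `revenue_by_status` helpers are hypothetical. It shows why a rollup helps: once raw rows are summarized by a few dimensions, coarser questions can be answered from the much smaller summary.

```python
from collections import defaultdict

# Hypothetical raw fact rows: (status, month, amount).
RAW_ORDERS = [
    ("completed", "2024-01", 120.0),
    ("completed", "2024-01", 80.0),
    ("completed", "2024-02", 50.0),
    ("refunded", "2024-01", 30.0),
]


def build_rollup(rows):
    """Pre-aggregate raw rows into one total per (status, month)."""
    rollup = defaultdict(float)
    for status, month, amount in rows:
        rollup[(status, month)] += amount
    return dict(rollup)


def revenue_by_status(rollup):
    """Answer a coarser query from the rollup instead of the raw rows.

    SUM is additive, so totals per (status, month) can be re-aggregated
    into totals per status without touching the raw table.
    """
    totals = defaultdict(float)
    for (status, _month), amount in rollup.items():
        totals[status] += amount
    return dict(totals)


rollup = build_rollup(RAW_ORDERS)
print(revenue_by_status(rollup))  # {'completed': 250.0, 'refunded': 30.0}
```

The same re-aggregation trick does not hold for every measure (a distinct count, for example, is not additive), which is one reason a real engine has to check whether a given rollup can actually serve a query.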

Significant Milestones

What have been some of the company’s most significant milestones? Keydunov cited:

“The launch of Cube Cloud in 2021 has been our most significant milestone. It is never easy for a company to move from open-source to commercially available, but we were able to do that in two years. Cube Cloud makes it easier and more cost-effective for data teams to manage their semantic layer by providing a robust and growing feature set, access to chart support, and flexible deployment options from fully-managed SaaS to Bring Your Own Cloud (BYOC).”

Customer Success Stories

When asked about customer success stories, Keydunov highlighted:

“We have several customers who have shared the success they have found with Cube:

– Webflow is a Website Experience Platform (WXP) that empowers modern marketing teams to visually build, manage, and optimize stunning websites, with over 300,000 customers in 200+ countries. Webflow faced a growing need to streamline its data infrastructure. As the company scaled, it became essential to provide consistent data access and interpretation across both engineering and non-engineering teams inside and outside the organization, while maintaining a high level of performance and security – see public story here.

– Breakthrough, a leading provider of sustainable fuel and freight solutions, uses Cube Cloud to improve the performance of the embedded dashboards it provides to customers. They have been very happy to find that Cube is a single place to create metrics for reuse; this simplifies the data engineers’ workflow and increases productivity – see public story here.

– Quantatec specializes in tracking, logistics control, and fleet management and uses Cube Cloud to help non-technical fleet managers access data via natural language queries. With AI on top of the Cube semantic layer, they found that ‘Cube’s semantic layer played a pivotal role in giving the SQL Agent the context it needs to reason about which tables can answer the questions posed’ – see public story.”

Funding/Revenue

When asked about the company’s funding and revenue details, Keydunov revealed:

“As of June 7, 2024, we announced that we completed a $25 million funding round with new investor Databricks Ventures, joined by all the previous investors, including Decibel, Bain Capital Ventures, Eniac Ventures, and 645 Ventures. This brings us to $48 million in funding to deliver the leading universal semantic layer that unifies business logic, centralizes governance and security, optimizes query performance, and integrates with any data endpoint.”

“From a revenue standpoint, in fiscal year 2024, we saw a 4x growth in the number of customers and a 3x increase in both bookings and the average deal size.”

Total Addressable Market

What total addressable market (TAM) size is the company pursuing? Keydunov assessed:

“We looked at Fortune Business Insights for a rough estimate: the global business intelligence market was valued at $29.42 billion in 2023 and is projected to grow from $31.98 billion in 2024 to $63.76 billion by 2032. This excludes the other areas a semantic layer covers: Data Mgmt/Integration, BI tools, CDWs, Data Gov/Sec.”

“When we look to include a conservative portion of those additional markets, we get a TAM for a universal semantic layer in the modern data stack estimated to be around $34.75 billion by 2025: $16 billion (Data Management) + $8.75 billion (BI Tools) + $6 billion (Cloud Data Warehousing) + $4 billion (Data Governance).”
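The arithmetic behind these figures can be checked with a few lines of Python. The component values are the ones quoted above; the growth-rate calculation is a derived figure that does not appear in the interview, included only as a sanity check on the quoted 2024 and 2032 market sizes.

```python
# Component estimates quoted in the interview, in billions of USD.
tam_components = {
    "Data Management": 16.0,
    "BI Tools": 8.75,
    "Cloud Data Warehousing": 6.0,
    "Data Governance": 4.0,
}

total_tam = sum(tam_components.values())
print(f"TAM estimate: ${total_tam:.2f} billion")

# Implied compound annual growth rate of the BI market figure,
# from $31.98B (2024) to $63.76B (2032), i.e. over 8 years.
start, end, years = 31.98, 63.76, 2032 - 2024
cagr = (end / start) ** (1 / years) - 1
print(f"Implied BI market CAGR: {cagr:.1%}")
```

The components sum to the stated $34.75 billion, and the quoted BI market projection works out to roughly 9 percent annual growth.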

Differentiation From The Competition

What differentiates the company from its competition? Keydunov affirmed:

“Cube Cloud is different from competitors in that it is purpose-built for cloud-native architecture and designed to be optimized for speed and scale. Here are specific features that set Cube Cloud apart from the competition:

– Cube Store is a large-scale, in-memory, aggregate-aware caching layer designed to scale up to several billions of rows per single rollup table. It provides very efficient cost trade-offs due to a hybrid in-memory and a persisted columnar parquet-based query engine built on top of Apache DataFusion.

– SQL API boosts your data’s performance, makes it consistent, and centralizes its caching and security upstream of every BI platform. Cube’s versatile, Postgres-compliant SQL API lets you skip the new language learning curve, making it incredibly easy to connect data sources and the universal semantic layer to any downstream tools.

– Data Access Controls implement governance and security policies for every entity of the data modeling layer, be it a cube, a view, a measure, or a dimension. Engineers have complete control of what is surfaced and available via APIs to any actor: a user, a role, or an application. Employ row- and column-level security and data masking. Advanced query rewrite capabilities inject the current user’s security context regardless of where the data is consumed.”
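The combination of row-level filtering, column masking, and query rewriting described in the last item can be sketched in plain Python. This is a conceptual illustration, not Cube's configuration API: the `rewrite_query` and `apply_policies` functions, the `MASKED_COLUMNS` policy table, the query shape, and the sample rows are all hypothetical, and only an "equals" filter is supported.

```python
from copy import deepcopy

# Hypothetical policy: columns to mask for each role.
MASKED_COLUMNS = {"analyst": {"email"}, "admin": set()}

# Hypothetical rows belonging to two different tenants.
ROWS = [
    {"tenant_id": "acme", "email": "a@acme.test", "amount": 10},
    {"tenant_id": "globex", "email": "g@globex.test", "amount": 20},
]


def rewrite_query(query: dict, security_context: dict) -> dict:
    """Return a copy of the query with the caller's tenant filter injected."""
    rewritten = deepcopy(query)
    rewritten.setdefault("filters", []).append(
        {
            "member": "tenant_id",
            "operator": "equals",
            "values": [security_context["tenant_id"]],
        }
    )
    return rewritten


def apply_policies(rows: list[dict], query: dict, security_context: dict) -> list[dict]:
    """Enforce row-level "equals" filters, then mask restricted columns."""
    secured = rewrite_query(query, security_context)
    # Unknown roles raise KeyError instead of silently getting open access.
    masked = MASKED_COLUMNS[security_context["role"]]
    result = []
    for row in rows:
        if all(row[f["member"]] in f["values"] for f in secured["filters"]):
            result.append({k: ("***" if k in masked else v) for k, v in row.items()})
    return result


ctx = {"tenant_id": "acme", "role": "analyst"}
user_query = {"measures": ["amount"]}
visible = apply_policies(ROWS, user_query, ctx)
print(visible)  # [{'tenant_id': 'acme', 'email': '***', 'amount': 10}]
```

Because the filter is injected on the server side from the security context, the caller's query never needs to mention the tenant, and the same rule applies whichever tool issued the query.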

“In addition to Cube Cloud’s differentiated features, we set out to build a product for the data engineer.

– Code-first to empower the data engineer: We have applied software engineering best practices and processes to data management, including CI/CD, isolated environments, and version control.

– Interoperable to model data once and deliver it anywhere: Robust deployment options, data connectivity, and extensive native APIs connect data sources to data endpoints.

– Intelligent to automate data workflows: Cube Cloud capabilities support the development workflow. Cube Cloud can act as the data engineer’s co-pilot with code generation and aggregate awareness to deliver performance at optimal cost.”

Future Company Goals

What are some of the company’s future goals? Keydunov concluded:

“We have two focus areas of innovation and growth for the coming year.

1.) We are focused on increasing our already extensive list of seamless integrations, improving BI connectivity, and developing new ways to drive greater operational productivity. These areas will expand market share by meeting fundamental enterprise needs.

2.) We are developing new ways to simplify discovery, exploration, and access to trusted, modeled data for everyone in the organization, alongside time-saving, AI-powered data engineering capabilities.”