DataPelago: Interview With CEO Rajan Goyal About The Universal Data Processing Engine Company

By Amit Chowdhry • Mar 12, 2025

DataPelago is on a mission to make it viable to extract value from all data in the world so people can capture every insight, cure, invention, and opportunity. Pulse 2.0 interviewed DataPelago CEO Rajan Goyal to learn more about the company.

Rajan Goyal’s Background

What is Rajan Goyal’s background? Goyal said:

“I am the co-founder and CEO of DataPelago, and have a deep background in data processing and accelerated computing. Prior to founding DataPelago, I was the CTO of Fungible, where I led the development of the Data Processing Unit (DPU) architecture, a key innovation that transformed modern data centers by introducing a third essential computing element alongside the CPU and GPU.”

“I’ve also held leadership roles at tech companies like Cisco, Oracle, and Cavium. At Cavium, I contributed to the advancements in multi-core processor technologies and domain-specific accelerators, particularly for the OCTEON processor line. My work across these roles focused on pushing the boundaries of networking, security, data movement, and data storage technologies.”

“I hold over 150 patents and have driven numerous products to multi-billion-dollar success. I have a bachelor’s degree in mechanical engineering from the Thapar Institute of Engineering and Technology and a master’s degree in computer science from Stanford University.”

Formation Of DataPelago

How did the idea for the company come together? Goyal shared:

“The idea for DataPelago came from my experience in accelerated computing and my realization that traditional CPU-based systems couldn’t handle the growing complexity and volume of data. While I was at Fungible, I saw that compute, not I/O, would become the next major bottleneck, especially with the rise of AI and unstructured data. I founded DataPelago to create a universal data processing engine that could process any data type on any hardware, overcoming the performance and scalability challenges of today’s data-driven world.”

Core Products

What are the company’s core products and features? Goyal explained:

“DataPelago’s core product is our Universal Data Processing Engine (UDPE), a software solution that seamlessly integrates into an organization’s tech/data stack to supercharge their current data processing methods.”

“Our UDPE is a versatile system designed to support all data types – structured and unstructured – and all processors – both existing (CPU, GPU, FPGA, custom silicon) and future. This is the new data processing standard for the accelerated computing era.”

“The engine comprises three layers:

  1. DataVM – the industry’s first virtual machine with a domain-specific Instruction Set Architecture (ISA) for data operators providing a common abstraction for execution on accelerated computing hardware, spanning CPU, GPU, FPGA, and custom silicon.
  2. DataOS – the operating system layer that maps data operations to heterogeneous accelerated computing elements and manages them dynamically to optimize performance at scale.
  3. DataApp – a pluggable layer that enables integration with engines such as Spark and Trino to deliver acceleration capabilities to these engines.”
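As a rough illustration of how the three layers described above could fit together, here is a minimal Python sketch. Every class, method, and operator name in it is hypothetical and invented for illustration. It does not reflect DataPelago's actual interfaces, which the interview does not describe.

```python
from dataclasses import dataclass


# Hypothetical operator in a query plan; all names are illustrative only.
@dataclass(frozen=True)
class Operator:
    name: str       # e.g. "scan", "filter", "aggregate"
    preferred: str  # device the operator would ideally run on


class DataVM:
    """Stand-in for a common execution abstraction across device types."""

    def __init__(self, devices):
        self.devices = set(devices)

    def execute(self, op: Operator, device: str) -> str:
        if device not in self.devices:
            raise ValueError(f"unknown device: {device}")
        # A real engine would run the operator; here we only record placement.
        return f"{op.name}@{device}"


class DataOS:
    """Stand-in for the layer that maps operators onto available hardware."""

    def __init__(self, vm: DataVM, fallback: str = "cpu"):
        self.vm = vm
        self.fallback = fallback

    def schedule(self, op: Operator) -> str:
        # Use the preferred device when it is present, otherwise fall back.
        device = op.preferred if op.preferred in self.vm.devices else self.fallback
        return self.vm.execute(op, device)


class DataApp:
    """Stand-in for the pluggable layer that accepts plans from a host engine."""

    def __init__(self, os_layer: DataOS):
        self.os_layer = os_layer

    def run_plan(self, plan):
        return [self.os_layer.schedule(op) for op in plan]


def run_example():
    vm = DataVM(devices=["cpu", "gpu"])
    app = DataApp(DataOS(vm))
    plan = [
        Operator("scan", "cpu"),
        Operator("filter", "gpu"),
        Operator("aggregate", "fpga"),  # no FPGA available, so it falls back to CPU
    ]
    return app.run_plan(plan)
```

The point of the sketch is only the separation of concerns: the top layer receives a plan, the middle layer decides placement, and the bottom layer hides device differences behind one interface.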

“This technology enables organizations to achieve faster data processing with better price-performance, supporting use cases such as GenAI, analytics, and AI model training. The engine also eliminates the need for data migration or vendor lock-in, seamlessly integrating with existing data stores and lakehouse platforms.”

“The engine is modular and composable, and leverages open-source technologies and standards. The engine uses Apache Gluten and Substrait to accelerate query plans from widely used open-source engines such as Spark and Trino. This modularity enables a plug-and-play model for enterprises to start reaping the benefits of acceleration for their existing investments without any additional effort. The composability allows enterprises to use the best-of-breed stack for all their data pipelines, whether business analytics or GenAI/LLM.”

Evolution Of The Company’s Technology

How has the company’s technology evolved since launching? Goyal noted:

“DataPelago began with a vision to handle the complexity of data processing across structured, semi-structured, and unstructured data. Generative AI emerged as the killer use case for extracting insights from unstructured data and validated our thesis. With our versatile engine at the core of our mission, we are continually expanding our product ecosystem to meet the evolving needs of modern organizations.”

“As data volumes double every two years, organizations have access to more information than ever, empowering data teams to drive business-critical insights. Recent advances in accelerated computing hardware, like GPUs and FPGAs, now make it possible to process and analyze data faster than before. Yet, with 90% of new data being unstructured, data teams are increasingly finding themselves limited by the constraints of current data processing systems.”

“Our engine is purpose-built for the era of accelerated computing at a time when efficient data processing is needed most. This engine offers unprecedented levels of flexibility and composability for organizations. It is designed to be available on all major cloud platforms, including AWS, Azure and Google Cloud, as well as for on-premises setups and specialized GPU providers supporting GenAI and AI Factory workloads.”

Significant Milestones

What have been some of the company’s most significant milestones? Goyal cited:

“Building our Universal Data Processing Engine from the ground up has been a significant milestone for the company. It’s the world’s first of its kind, capable of accelerating any engine, processing any type of data, operating on any hardware, and handling massive data volumes quickly and cost-effectively.”

“We have also partnered closely with design partners like McAfee to refine and validate the engine. These partnerships have been essential in testing the engine’s flexibility, scalability, and compatibility across different data environments, helping us tailor our solution to real-world needs and strengthen its foundation in the industry. We’ve also secured early validation from a growing base of early customers, like Akad Seguros.”

Customer Success Stories

When asked about customer success stories, Goyal highlighted:

“One of our early success stories comes from Akad Seguros, an insurance company that leveraged our engine to unify its data processing pipelines for GenAI and analytics. Using our solution, Akad Seguros was able to process various data formats – including PDFs, text, and parquet files – on a single platform and reduce its data processing costs by over 50%.”

Funding

When asked about the company’s funding details, Goyal revealed:

“We have raised $47 million to date.”

Total Addressable Market

What total addressable market (TAM) size is the company pursuing? Goyal assessed:

“We are pursuing a global TAM for data processing, estimated to be $14+ billion today with a compound annual growth rate of 14%. We are also pursuing a global TAM for GenAI and LLM data processing, expected to reach $10+ billion by 2032 with a compound annual growth rate of 65%.”
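To put the quoted growth rate in concrete terms, the standard compound-growth formula can be applied to the data processing figure. The sketch below is illustrative only: the function name and the five-year horizon are assumptions, and the interview gives no projected value for this market.

```python
def project(value_billions: float, cagr: float, years: int) -> float:
    """Compound a starting value forward: value * (1 + cagr) ** years."""
    return value_billions * (1.0 + cagr) ** years


# $14 billion growing at 14% per year, the figures quoted in the interview.
# The 5-year horizon is an arbitrary choice made for illustration.
five_year = project(14.0, 0.14, 5)
print(f"{five_year:.1f}")  # roughly 27 (billions of dollars)
```

Because the quoted starting figure is a floor ("$14+ billion"), any such projection is a lower-bound estimate at best.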

Differentiation From The Competition

What differentiates the company from its competition? Goyal affirmed:

“All data processing platforms designed in the last decade were designed for structured data and general-purpose computing. It has become evident that the current approach for building data platforms is untenable. The legacy vendors are attempting, or will attempt, to create band-aid solutions, but systems designed in the last decade will not be sufficient for the next decade of computing. Some new companies are attempting to build data processing solutions for accelerated computing, but with a narrower focus. There is a need for an entirely new data processing approach, built from the ground up for the evolution of data and hardware.”

“Our solution stands apart by offering the first Universal Data Processing Engine specifically designed for accelerated computing. We deliver a significant performance advantage by processing data one to two orders of magnitude faster than traditional systems. The engine’s architecture is highly flexible and composable, allowing businesses to process diverse data formats and workloads on a single platform, with significant cost reductions. Additionally, unlike other engines, our open-source enablement gives customers the flexibility to accelerate existing tools without vendor lock-in or complex migrations.”

Future Company Goals

What are some of the company’s future goals? Goyal concluded:

“We are committed to ongoing innovation for our customers. Our launch has generated significant interest within the data processing ecosystem, and we’re actively collaborating with prospective customers and partners to bring our groundbreaking data processing engine to market, driving a quantum leap forward for the entire industry.”