Recce: $4 Million Secured For Bringing Data-Native Code Review To Systems

By Amit Chowdhry • Apr 29, 2025

Recce, a provider of data-native code review tools for data transformation projects and AI systems, announced it has launched a cloud platform for its popular open-source toolkit and raised $4 million in funding. Heavybit led the funding round, with participation from Vertex Ventures US, Hive Ventures, and angels Visionary, SVT Angels, Brighter Capital, Ventek Ventures, and Scott Breitenother and Tim Chen of Essence VC.

Recce’s open-source project now receives 3,600 downloads every week on GitHub, and its users include The Philadelphia Inquirer, telecoms companies, healthcare tech startups, and government entities in Brazil and Australia.

Value Proposition: Recce brings the best-practice workflows data professionals are used to, including data diffing, validation checklists, and query result comparison, natively into data transformation workflows alongside existing tools like dbt, so data engineers, data scientists, and other stakeholders can streamline data validation across the software lifecycle.

Recce Founder: Recce was created as an open-source project in 2023 by CL Kao, a pioneer in code versioning systems who built Git-precursor SVK, which was widely adopted by Apple, Ubisoft, and others.

Version 1.0

Recce also released version 1.0 of the open-source project, which provides the foundation for the company’s new collaborative SaaS offering, Recce Cloud. The release includes:
1.) Column-Level Impact Lineage Analysis for precise and granular downstream change impact scoping and visualization
2.) Change Exploration between production and development data with flexible row-by-row, profiling, value distribution, and arbitrary query result comparison
3.) Evidence Collection for capturing exploration insights into shareable and reusable validation checklists
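
To make the idea of change exploration and evidence collection concrete, here is a minimal sketch in plain Python of a row-by-row comparison between a production and a development result set, with the findings turned into a simple checklist. It is an illustration only and does not use or represent Recce's actual API. The `diff_rows` helper, the `order_id` key, and the sample rows are all hypothetical.

```python
from typing import Any

Row = dict[str, Any]


def diff_rows(prod: list[Row], dev: list[Row], key: str) -> dict[str, list]:
    """Compare two row sets by primary key (hypothetical helper, not Recce's API)."""
    prod_by_key = {row[key]: row for row in prod}
    dev_by_key = {row[key]: row for row in dev}

    # Keys present on only one side are added or removed rows.
    added = sorted(dev_by_key.keys() - prod_by_key.keys())
    removed = sorted(prod_by_key.keys() - dev_by_key.keys())

    # For keys present on both sides, record which columns differ.
    changed = []
    for k in sorted(prod_by_key.keys() & dev_by_key.keys()):
        before, after = prod_by_key[k], dev_by_key[k]
        cols = sorted(
            c for c in before.keys() | after.keys() if before.get(c) != after.get(c)
        )
        if cols:
            changed.append((k, cols))

    return {"added": added, "removed": removed, "changed": changed}


# Made-up sample data standing in for production and development query results.
prod_rows = [
    {"order_id": 1, "amount": 100, "status": "paid"},
    {"order_id": 2, "amount": 250, "status": "paid"},
    {"order_id": 3, "amount": 75, "status": "refunded"},
]
dev_rows = [
    {"order_id": 1, "amount": 100, "status": "paid"},
    {"order_id": 2, "amount": 275, "status": "paid"},
    {"order_id": 4, "amount": 60, "status": "paid"},
]

result = diff_rows(prod_rows, dev_rows, key="order_id")

# Turn the findings into a shareable checklist of (description, passed) items.
checklist = [
    ("No rows removed relative to production", not result["removed"]),
    ("No value changes in existing rows", not result["changed"]),
]

for description, passed in checklist:
    print(f"[{'x' if passed else ' '}] {description}")
```

In this sketch a reviewer would see both checks unticked, because one row was removed and one `amount` value changed between the two environments.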

Recce Cloud

Recce Cloud also launched in private beta to enable team sharing and collaboration:
1.) Full data-validation context sharing with teams including lineage diffs, custom query results, and structured checklists
2.) Data workflow integration with GitHub to ensure code is merged only when all checks are approved
3.) Free tier in the pricing plan

KEY QUOTES:

“A company’s proprietary data set is a key differentiator in the advent of AI commoditization, and extracting value from that data is a critical priority. Tools like dbt unlocked data analytics for software; now, we’re in an era where the data itself is managed programmatically, and you have to continuously validate not just the logic of the data, but also the data being generated. We believe most code reviews in the future will become data reviews as data correctness becomes a defining element for success. Recce’s mission is to ensure the stability and accuracy of complex data systems as AI and specifically LLMs drive more data transformation.”

Founder and CEO CL Kao

“Recce has become essential to our analytics workflow at The Inquirer. Recce automates data validation across 50+ downstream consumers of our data models, supports ad-hoc impact analysis, and integrates cleanly into our CI/CD pipeline. It’s helped us move faster and smarter without compromising data integrity.”

Brian Waligorski, Lead Data Engineer at The Philadelphia Inquirer

“Data pipelines are the New Secret Sauce for every company building with AI, enabling teams to create and improve high-quality training data from their own IP. Recce provides the essential toolkit for unlocking the full value of their data with iteration, refinement, and monitoring, while mitigating the risk of errors and corruption. Heavybit is thrilled to support them as they grow the ecosystem for data pipeline validation in the age of AI as part of our ongoing mission of 10+ years: Bringing critical enterprise infrastructure to market.”

Heavybit General Partner and DevOps trailblazer Jesse Robbins, who is joining Recce’s board

“AI models bring a large degree of randomness to software development, especially for data-intensive applications. This raises the premium on data-forward testing tools to get closer to predictability. Until now, that’s been done in a bespoke manner and largely by hand. CL, who has been a longtime collaborator of mine on open-source projects, is solving this problem beautifully with the Recce toolkit, and I’m glad to be supporting him.”

Brian Behlendorf, an open-source pioneer and founding member of the Apache Software Foundation, who has also joined Recce’s board