Databricks: AI Data Platform Launches Genie Code Autonomous Data Engineering Agent

By Amit Chowdhry • Today at 3:57 PM

Databricks has introduced Genie Code, an autonomous artificial intelligence agent designed to help data teams automate complex analytics, engineering, and machine learning workflows across enterprise data systems. Genie Code expands Databricks’ broader “Genie” family of AI tools and aims to move data teams beyond code assistance toward agent-driven execution. Rather than merely generating code, the system is designed to plan, execute, and maintain production data workflows under human supervision.

According to Databricks, Genie Code integrates directly with the company’s data intelligence platform and governance layer, enabling the agent to interpret enterprise data context and business semantics. Through this integration, the agent can autonomously build pipelines, debug failures, generate dashboards, and maintain production systems.

The company said the system is built specifically for data and analytics work, which often requires understanding data lineage, historical usage patterns, and governance policies rather than simply reading source code. By integrating with Unity Catalog, Genie Code can surface relevant datasets, enforce access controls, and apply governance rules during task execution.

Databricks said the agent can handle the full lifecycle of data work, including training machine learning models, creating production-ready pipelines, and generating visualizations. It can also monitor pipelines and AI models in the background, diagnose anomalies, and suggest fixes before engineers intervene.

The system uses an agent architecture that routes tasks across multiple models and tools rather than relying on a single AI model. According to internal benchmarking shared by Databricks, Genie Code solved 77.1% of real-world data science tasks, compared with 32.1% for a leading coding agent equipped with Databricks Model Context Protocol servers.
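
To make the routing idea concrete, here is a minimal Python sketch of dispatching tasks to different handlers by task type. It is an illustration only and not Databricks' implementation: the `Task` class, the handler functions, the task kinds, and the `ROUTES` table are all hypothetical names invented for this example, and each handler merely stands in for a separate model or tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Task:
    kind: str    # hypothetical task category, e.g. "sql" or "pipeline"
    prompt: str  # natural-language description of the work


# Hypothetical handlers; each stands in for a distinct model or tool.
def sql_handler(task: Task) -> str:
    return f"[sql-model] {task.prompt}"


def pipeline_handler(task: Task) -> str:
    return f"[pipeline-tool] {task.prompt}"


def fallback_handler(task: Task) -> str:
    return f"[general-model] {task.prompt}"


# Assumed routing table mapping task kinds to handlers.
ROUTES: Dict[str, Callable[[Task], str]] = {
    "sql": sql_handler,
    "pipeline": pipeline_handler,
}


def route(task: Task) -> str:
    """Send the task to its matching handler, or to the fallback."""
    handler = ROUTES.get(task.kind, fallback_handler)
    return handler(task)


if __name__ == "__main__":
    print(route(Task("sql", "count rows in orders")))
    print(route(Task("dashboard", "plot weekly revenue")))
```

A lookup table with a fallback is only one simple way to route work. A production agent would more plausibly choose a model or tool dynamically, but the article does not describe how Genie Code makes that choice.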

Genie Code is also designed to connect with external enterprise tools such as Jira, Confluence, and GitHub through the Model Context Protocol, allowing the agent to perform autonomous workflows across systems. Databricks said the platform can learn from user interactions through persistent memory, enabling the system to improve over time and adapt to organizational coding practices.

Beyond development, the company positioned Genie Code as an operational agent capable of maintaining production workloads. The tool can analyze system logs, evaluate model performance, diagnose failures, and recommend infrastructure adjustments such as provisioning and autoscaling.

Databricks said Genie Code is now generally available within the Databricks workspace and can be accessed from notebooks, the SQL editor, and Lakeflow pipeline tools without additional configuration.

KEY QUOTE:

“Genie Code can autonomously carry out complex tasks such as building pipelines, debugging failures, shipping dashboards, and maintaining production systems.”

Patrick Wendell — Co-Founder And Vice President Of Engineering At Databricks; Matei Zaharia — Co-Founder And Chief Technologist At Databricks; Weston Hutchins — Product Lead At Databricks; Gal Oshri — Engineering Leader At Databricks