

At its Data + AI Summit today, Databricks announced that it’s open sourcing Unity Catalog, the metadata catalog that governs how users and compute engines can access data. Coming off of last week’s news around Apache Iceberg, the move marks an important shift for Databricks as it seeks to maintain momentum as customers increasingly demand open lakehouse platforms.
Databricks unveiled Unity Catalog back in 2021 as a way to govern and secure access to data stored in Delta, the table format that Databricks created in 2017 as the linchpin of its lakehouse strategy. It has remained a proprietary product at Databricks since.
But in recent years, a competing table format, Apache Iceberg, has gained momentum in the big data ecosystem. Databricks addressed Iceberg’s rise last week with the planned acquisition of Tabular, the lakehouse company founded by Iceberg’s creator. Databricks’ strategy is to gradually move the Iceberg and Delta specifications closer together over time, thereby eliminating the differences between them.
That left the humble metadata catalog as the last piece standing between customers and their dream of a truly open data lakehouse. Databricks’ rival, Snowflake, addressed the potential lock-in of the metadata catalog last week with the launch of Polaris, which is based on Iceberg’s REST-based API. The company tells Datanami that it plans to donate the Polaris project to open source, likely the Apache Software Foundation, within 90 days.
That left the still-proprietary Unity Catalog as the odd man out at the metadata catalog layer, just as a new era of open lakehouses suddenly arrives. To address that strategic shift in the market, Databricks decided to open source Unity Catalog.
The move creates the “USB” for data access, Databricks CEO Ali Ghodsi said during his keynote address at Databricks’ Data + AI Summit in San Francisco.
“All the silos that you had before, they can just access one copy of the data that’s in a standardized USB format under your ownership,” Ghodsi said. “It goes through one governance layer that’s just standardized–that’s Unity Catalog–for all of your data.”
Unity Catalog previously supported Delta and Iceberg, in addition to Apache Hudi, another open table format, via Databricks’ Delta Lake UniForm format. In fact, Unity Catalog also supports Iceberg’s REST-based API, Ghodsi pointed out.
“We basically standardized the data layer and the security layer so that you own your data and everything goes through these open interfaces,” he said. “And I think that’s going to be awesome for the community, for everybody in here. Because we just have way more use cases. We’re going to be able to do much more innovation, and we’ll just expand this market for everybody involved.”

Databricks CEO Ali Ghodsi announced the open sourcing of Unity Catalog at Data + AI Summit, June 12, 2024
Databricks customers applauded the move, including AT&T and Nasdaq.
“With the announcement of Unity Catalog’s open sourcing, we are encouraged by Databricks’ step to make lakehouse governance and metadata management possible through open standards,” said Matt Dugan, AT&T’s vice president for data platforms. “The flexibility to utilize interoperable tools with our data and AI assets, with consistent governance, is core to the AT&T data platform strategy.”
“Databricks’ decision to open source Unity Catalog provides a solution that helps eliminate data silos and we look forward to further scaling our platform, enhancing our governance, and modernizing our data applications as we continue to deliver for our clients,” said Lenny Rosenfeld, Nasdaq’s vice president of capital access platforms.
It’s not clear what open source foundation Databricks will choose for Unity Catalog OSS, nor what the timeline will be. Databricks has previously chosen The Linux Foundation to open source various internally developed products, including Delta and MLflow.
Unity Catalog will be posted to GitHub on Thursday during Databricks CTO Matei Zaharia’s keynote at Data + AI Summit, the company said.