

Dremio today announced that the metadata catalog at the heart of its Apache Iceberg-based data lakehouse now supports other popular metadata catalog services, including Snowflake’s Apache Polaris-based catalog and Databricks Unity Catalog. The lakehouse provider says the move in its Project Nessie-based metadata catalog will bolster architectural flexibility in the cloud, on-prem, and everywhere in between.
Before metadata catalogs suddenly jumped into the big data consciousness earlier this year, Dremio had been quietly backing its own metadata catalog, dubbed Project Nessie, to provide the necessary housekeeping that a lakehouse based on Apache Iceberg tables requires.
So when Snowflake announced the open source Polaris metadata catalog during its user conference in early June, Dremio executives applauded the announcement and the openness that it could foster in the big data community. Seeing close alignment between Polaris and Nessie, which began development in 2020, Dremio executives pledged to work with the Polaris community to merge the two projects.
The Nessie-Polaris merger has yet to happen, but it is still in the plans. “Our goal is to merge the capabilities of Project Nessie into Apache Polaris (Incubating) to create a single, unified catalog,” says James Rowland-Jones, vice president of product at Dremio. “We believe this will become the default catalog for the open-source community. Dremio will continue to focus on seamless enterprise services built around it.”
In the meantime, Dremio is moving forward with development of its own catalog service for technical metadata, dubbed the Dremio Enterprise Data Catalog. Specifically, Dremio today announced several new capabilities in the metadata catalog, which is based on Nessie.
The new bits include integration with Snowflake’s metadata catalog service based on Apache Polaris, as well as with Unity Catalog, the metadata catalog that Databricks built for managing data stored in Delta Lake tables. (Unity Catalog does quite a bit more, including lineage tracking, semantic modeling, security, and governance, and it also functions as a regular, user-focused data catalog, but that’s another story.)
Dremio’s move is noteworthy for a couple of reasons. For starters, with its acquisition of Iceberg maker Tabular, reportedly valued at between $1 billion and $2 billion, and its commitment to essentially merge the Delta Lake and Iceberg specs, Databricks helped ease the fears of CFOs who worried they would pick the “wrong” table format.
However, while Databricks committed earlier this year to supporting Iceberg tables in a future release of Unity Catalog, that support is not available yet. Dremio’s support for Unity Catalog ensures that Databricks customers who use that metadata catalog can achieve Iceberg interoperability today.
“Flexibility is essential for modern organizations looking to maximize the value of their data,” said Tomer Shiran, Founder of Dremio. “With expanded Iceberg catalog support across all environments, Dremio empowers businesses to deploy their lakehouse architecture wherever it’s most effective. We’re 100% committed to giving customers the freedom to choose the best tools and infrastructure while reducing fears of vendor lock-in.”
Dremio’s product, which is officially called the Dremio Enterprise Data Catalog for Apache Iceberg, supports all Iceberg engines through the Iceberg REST API. In addition to supporting Dremio’s own SQL query engine, it supports other Iceberg-compatible query engines, including Apache Spark, Flink, and others.
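Because the catalog speaks the standard Iceberg REST protocol, any REST-capable engine can attach to it through ordinary Iceberg catalog properties. The sketch below shows what that wiring typically looks like on the Spark side; the catalog name (`lakehouse`) and endpoint URI are placeholders, while the property keys themselves are standard Apache Iceberg Spark settings.

```python
# Spark configuration for attaching an Iceberg catalog over the REST protocol.
# The catalog name ("lakehouse") and the endpoint URI are placeholders; the
# property keys are standard Apache Iceberg Spark catalog settings.
conf = {
    # Enable Iceberg's SQL extensions (procedures, branching, time travel).
    "spark.sql.extensions":
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    # Register a catalog named "lakehouse" backed by Iceberg's Spark catalog.
    "spark.sql.catalog.lakehouse": "org.apache.iceberg.spark.SparkCatalog",
    # Use the REST catalog implementation -- any REST-compliant service
    # (Dremio's catalog, Polaris, etc.) can sit behind this endpoint.
    "spark.sql.catalog.lakehouse.type": "rest",
    "spark.sql.catalog.lakehouse.uri": "https://catalog.example.com/api/iceberg",
}

def spark_submit_args(conf):
    """Render the properties as spark-submit --conf flags."""
    return [f"--conf {key}={value}" for key, value in conf.items()]
```

The same properties (catalog implementation, `type=rest`, `uri`) apply with minor renaming in Flink and other Iceberg-compatible engines, which is what makes a REST-based catalog engine-agnostic.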
Dremio’s catalog automates many of the housekeeping tasks required to keep an Iceberg-based data lakehouse running at peak efficiency, including table optimization routines such as compaction and garbage collection. It also provides “Git”-like branching and version control, enabling users to access data as it existed at particular moments in time (so-called “time travel”). The catalog also provides centralized data governance and role-based access control (RBAC), enabling fine-grained access to data and preventing unauthorized access to sensitive data.
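The Git-like model behind branching and time travel can be illustrated with a toy in-memory catalog: every commit produces an immutable snapshot, branches are just named pointers to snapshots, and time travel is simply reading an older snapshot. This is a conceptual sketch only, not Nessie’s or Iceberg’s actual implementation.

```python
class ToyCatalog:
    """Toy illustration of Git-like catalog versioning. Commits create
    immutable snapshots; branches are named pointers to snapshot ids.
    Conceptual only -- not how Nessie or Iceberg store metadata."""

    def __init__(self):
        self.snapshots = []             # snapshot id -> table state
        self.branches = {"main": None}  # branch name -> snapshot id

    def commit(self, branch, table_state):
        """Append a new immutable snapshot and advance the branch to it."""
        self.snapshots.append(dict(table_state))
        self.branches[branch] = len(self.snapshots) - 1
        return self.branches[branch]

    def create_branch(self, name, from_branch="main"):
        """Branching copies no data: it is just a new pointer."""
        self.branches[name] = self.branches[from_branch]

    def read(self, branch=None, snapshot_id=None):
        """Read the head of a branch, or 'time travel' to an old snapshot."""
        sid = snapshot_id if snapshot_id is not None else self.branches[branch]
        return self.snapshots[sid]


cat = ToyCatalog()
v0 = cat.commit("main", {"orders": 100})
cat.create_branch("etl")              # isolated work happens on "etl"
cat.commit("etl", {"orders": 250})    # "main" is untouched by this commit
```

After the `etl` commit, `main` still reads the original state, and snapshot `v0` remains readable forever, which is exactly the property that makes time travel and safe experimental branches possible.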
Kevin Petrie, vice president of research at BARC, says Dremio’s move helps enterprises deal with the “extraordinary pressure to access, prepare, and govern distributed datasets for consumption by analytics and AI applications.”
“To meet this demand, they need to catalog diverse data and metadata across data centers, regions, and clouds,” Petrie said in Dremio’s press release. “Dremio is taking a logical step to enable this with an open catalog that is based on Apache Iceberg, the emerging standard for flexible table formats, and by integrating with an ecosystem of popular platforms.”
Related Items:
Polaris Catalog, To Be Merged With Nessie, Now Available on GitHub
What the Big Fuss Over Table Formats and Metadata Catalogs Is All About