

Collate, the startup behind the open source project OpenMetadata, has raised $10 million in Series A funding to tackle a growing challenge in enterprise data: managing metadata across an increasingly complex stack.
While much of the industry’s attention is on training larger AI models, Collate is focused on the underlying infrastructure: building tools that help data teams govern, document, and make sense of the systems those models rely on.
Its approach points to a broader shift in how companies are rethinking data readiness, not just in terms of access and storage but in structure, context, and usability. Collate aims to help teams move beyond fragmented documentation and manual processes by offering tools that promote more consistent governance and faster insight across departments.
The company positions its platform as a bridge between today’s disconnected data environments and the operational needs of AI systems. It places particular focus on supporting workflows at what it calls “agentic speed,” the pace at which autonomous systems increasingly make decisions.
At its core, Collate’s product focuses on reducing the operational burden that comes with managing metadata across dozens of tools. Instead of relying on engineers to manually update documentation or enforce access rules, the platform captures metadata automatically as pipelines run and schemas evolve. Policies are stored as code, so access controls and data classifications are checked and enforced before any query runs, whether it’s from a person or a machine.
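As a rough illustration of what "policies as code" can look like, the Python sketch below checks a request against column-level rules before any query is built. It is not Collate's or OpenMetadata's API: the names `ColumnPolicy`, `POLICIES`, `check_query`, and `run_query`, along with the example roles and classifications, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnPolicy:
    """One hypothetical policy record: who may read a classified column."""
    table: str
    column: str
    classification: str
    allowed_roles: frozenset


# Policies live in version-controlled code rather than in a wiki page.
POLICIES = [
    ColumnPolicy("customers", "email", "PII", frozenset({"privacy_officer"})),
    ColumnPolicy("customers", "country", "Public",
                 frozenset({"analyst", "privacy_officer", "agent"})),
]


def check_query(role, table, columns, policies=POLICIES):
    """Return a list of violations; an empty list means the query may run."""
    index = {(p.table, p.column): p for p in policies}
    violations = []
    for col in columns:
        policy = index.get((table, col))
        if policy is None:
            # Assumption for this sketch: unclassified columns are denied.
            violations.append(f"{table}.{col}: no policy defined")
        elif role not in policy.allowed_roles:
            violations.append(
                f"{table}.{col}: role '{role}' not allowed "
                f"({policy.classification})"
            )
    return violations


def run_query(role, table, columns):
    """Enforce policies first, then build the query text."""
    violations = check_query(role, table, columns)
    if violations:
        raise PermissionError("; ".join(violations))
    return f"SELECT {', '.join(columns)} FROM {table}"
```

The same check applies whether the caller is a person or an automated agent, since the role is the only input that distinguishes them. Denying columns that have no policy is a design choice made for this sketch, not something the company has described.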
The company says this model helps organizations keep their data systems more reliable and transparent without slowing down development. Everything from lineage tracking to data quality checks happens in the background, giving teams a real-time view into how data is flowing, who is using it, and whether it complies with internal rules.
Open source is central to that effort. Collate is built on OpenMetadata, a fast-growing project with an active community and wide integration support. Rather than replacing existing infrastructure, the platform connects to existing data warehouses, lakes, dashboards, and machine learning tools. This lets teams enrich what they already use with field-level documentation, usage insights, and governance controls.
The team behind Collate has worked on large-scale data systems for more than a decade. Before launching the company, the founders led infrastructure efforts at Yahoo, Hortonworks, and Uber. They were involved in building some of the most widely used open source tools in the space, including Hadoop, Kafka, and Storm. That background has shaped how they think about scale, flexibility, and automation.
Collate says this experience has helped the team design a platform that supports what it calls a virtuous cycle. The idea is simple: better metadata helps teams get more out of AI, and AI can be used to improve metadata in return.
This feedback loop is a core part of how Collate sees its role. The company believes it can give data teams a system that improves over time, without relying on constant manual work.
“Collate’s Series A couldn’t come at a more critical time,” said Suresh Srinivas, CEO of Collate. “We’re in the midst of an AI race, not just for getting data ready for AI, but for how AI itself helps prepare that data. The winners will be organizations with highly functioning data teams augmented by AI.
“Our agentic approach is uniquely powered by richer metadata context from our knowledge graph and open source core,” continued Srinivas. “This is changing the game for our enterprise customers by solving the last mile of data challenges, helping them innovate faster with AI and data.”
That message is landing with customers working to modernize their data practices without slowing down operations. At Fundcraft, a financial services platform, Collate is now part of the company’s core data workflows. “Collate has been a game changer for strengthening our data culture,” said Victor Martin, CTO of Fundcraft. “The biggest impact we’ve seen is accelerated development speed and improved cycle times, since our teams can now focus quickly on what matters most.”
Mango, the global fashion retailer, is also using the platform across its data organization. According to Collate, the company has seen 3× faster integration and a 20% increase in data team productivity. Improvements in data quality have also contributed to better performance in the company’s ML-driven pricing models. “Collate has proven to be the cornerstone of our data strategy,” said Jordi Orriols Torras, Mango’s Head of Data Governance.
With new funding in place and a growing customer base, Collate is turning its attention to scaling adoption and expanding automation within the metadata layer. Recent updates include Collate AutoPilot, a growing suite of AI agents that assist with documentation, data tiering, quality monitoring, and ingestion.
The platform now also offers enterprise-grade Model Context Protocol (MCP) support, which allows metadata to flow in both directions. This means systems can not only read metadata but also write changes back, closing the loop between insight and action.
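The read-and-write-back idea can be sketched in a few lines of Python. This is a toy in-memory model, not the MCP specification or Collate's implementation: `MetadataStore`, `fill_missing`, and the `suggest` callback are hypothetical names introduced for illustration.

```python
class MetadataStore:
    """Toy in-memory catalog that supports both reads and writes."""

    def __init__(self):
        self._descriptions = {}
        self.audit_log = []  # records who changed which asset

    def read(self, asset):
        """Return the stored description, or None if there is none."""
        return self._descriptions.get(asset)

    def write(self, asset, description, author):
        """Store a description and record the change for later review."""
        self._descriptions[asset] = description
        self.audit_log.append((asset, author))


def fill_missing(store, assets, suggest):
    """Read each asset's metadata and write a draft where none exists."""
    updated = []
    for asset in assets:
        if store.read(asset) is None:
            store.write(asset, suggest(asset), author="agent")
            updated.append(asset)
    return updated
```

Here the agent only fills gaps and never overwrites a human-written description, and every write is logged with its author. Both are assumptions of this sketch rather than documented behavior of the platform.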