Anomalo Expands Data Quality Platform for Enhanced Unstructured Data Monitoring
The success of enterprise AI is closely tied to the quality and accuracy of the data used to train its models. Numerous reports have underscored the critical role of data quality.
Historically, enterprises worked primarily with structured data, which is clean, well-organized, and easily analyzed. This includes data such as customer databases or transaction records. However, the rise of GenAI has shifted the landscape. It is pushing organizations to harness vast amounts of unstructured data, which comes in diverse formats and lacks a predefined framework.
One of the key challenges of unstructured data is quality. Poor quality can result from inconsistencies, inaccuracies, missing information, or irrelevant content.
Anomalo aims to address this issue through its data quality platform, which has so far been used for structured data. However, the company has announced an expansion of its platform to better support unstructured data quality monitoring.
The platform leverages AI to automatically identify data issues, enabling teams to address them before making decisions, managing operations, or powering AI and machine learning workflows.
Anomalo shared insights from a McKinsey survey revealing that 65% of companies worldwide now use GenAI regularly. That is double the adoption rate from the previous year. However, there is no one-size-fits-all GenAI model for enterprises. Companies must bring their own data to the models to get accurate results. This is what makes enterprise data quality a major barrier to GenAI adoption.
“Generative AI is the next frontier, but there is no playbook for data quality when it comes to determining the quality of unstructured data feeding Generative AI workflows and LLMs,” explained Elliot Shmukler, co-founder and CEO of Anomalo.
“Enterprises need to understand what they have inside their unstructured data collections and which parts of those collections are suitable for Generative AI use. At Anomalo, we’re building this playbook and are working with the world’s largest and most innovative companies to solve this challenge together.”
Anomalo’s updates let enterprises define custom data quality checks and set severity levels for both their custom issues and Anomalo’s out-of-the-box issues. The platform also supports approved models from AWS, Google, and Microsoft, which Anomalo says gives enterprises full control over their data while reducing the risk of external misuse.
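To make the idea of custom checks with severity levels concrete, here is a minimal Python sketch. It is not Anomalo's API: the names `Severity`, `Check`, `Issue`, `run_checks`, and the three example checks are all hypothetical, chosen only to illustrate how a team might pair each check with a severity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Severity(Enum):
    """Hypothetical severity scale; a real platform may use different levels."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Check:
    name: str
    severity: Severity
    # Returns True when the document exhibits the issue.
    predicate: Callable[[str], bool]


@dataclass
class Issue:
    check: str
    severity: Severity


def run_checks(text: str, checks: List[Check]) -> List[Issue]:
    """Run every check against one document and collect the issues found."""
    return [Issue(c.name, c.severity) for c in checks if c.predicate(text)]


# Illustrative checks only; thresholds and names are assumptions.
CHECKS = [
    Check("empty_document", Severity.HIGH, lambda t: not t.strip()),
    Check("too_short", Severity.MEDIUM, lambda t: 0 < len(t.split()) < 5),
    Check("contains_placeholder", Severity.LOW, lambda t: "lorem ipsum" in t.lower()),
]

if __name__ == "__main__":
    for issue in run_checks("Lorem ipsum dolor", CHECKS):
        print(issue.check, issue.severity.name)
```

Keeping severity alongside each check lets downstream consumers decide, for example, to block on high-severity issues while only logging low-severity ones.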
There is currently no established framework for assessing the quality of unstructured data, such as customer order forms and call transcripts, according to Anomalo. The company aims to address this gap by leveraging its platform to accelerate various aspects of enterprise AI deployments.
Anomalo states that its expanded platform enables teams to integrate data quality monitoring into the data preparation phase. This approach highlights potential quality issues before data is sent to a model or vector database.
Anomalo’s data quality monitoring can also integrate with data pipelines feeding into retrieval-augmented generation (RAG). In this use case, unstructured data is ingested into vector databases, and metadata is used to filter, rank, and curate the data so that high-quality information is used for generating outputs.
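As a rough illustration of gating documents on quality before they reach a vector database, consider the Python sketch below. Everything in it is an assumption made for the example: the `Document` type, the toy `quality_score` heuristic, the `gate` function, and the `0.8` threshold are hypothetical and do not reflect how Anomalo scores data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Document:
    doc_id: str
    text: str
    metadata: Dict[str, object] = field(default_factory=dict)


def quality_score(text: str) -> float:
    """Toy heuristic: share of whitespace-separated tokens that are alphabetic words."""
    tokens = text.split()
    if not tokens:
        return 0.0
    wordlike = sum(1 for t in tokens if t.strip(".,;:!?\"'()").isalpha())
    return wordlike / len(tokens)


def gate(docs: List[Document], min_score: float = 0.8) -> Tuple[List[Document], List[Document]]:
    """Attach a quality score as metadata and split docs into accepted and rejected."""
    accepted: List[Document] = []
    rejected: List[Document] = []
    for doc in docs:
        score = quality_score(doc.text)
        doc.metadata["quality_score"] = round(score, 2)
        (accepted if score >= min_score else rejected).append(doc)
    return accepted, rejected


if __name__ == "__main__":
    docs = [
        Document("a", "The customer asked about a late order."),
        Document("b", "@@ ## 0x1f ???"),
        Document("c", ""),
    ]
    accepted, rejected = gate(docs)
    # Only `accepted` would be embedded and written to the vector store.
    print([d.doc_id for d in accepted], [d.doc_id for d in rejected])
```

Storing the score in each document's metadata means it can later be used at retrieval time to filter or rank results, not just at ingestion.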
Additionally, Anomalo’s platform can help mitigate compliance risks by tagging and monitoring data for quality. This process ensures that sensitive information is identified and filtered out before it is used in GenAI models.
Anomalo isn’t the only company working on improving unstructured data quality. Several other players in the market, such as Collibra, Monte Carlo Data, and Qlik, have various solutions focused on unstructured data quality. Anomalo states that it differentiates itself by analyzing raw unstructured data before any pipeline is set up. This method enables broader exploration and greater flexibility, going beyond traditional RAG approaches.
Along with the announcement of its expanded platform, Anomalo shared that it has raised an additional $10 million in Series B funding from Smith Point Capital. This brings its total raised to $82 million. The new funding will go toward more R&D for unstructured data quality monitoring.
According to Keith Block, founder and CEO of Smith Point Capital, “Anomalo is rewriting the enterprise playbook for data quality in the AI era. The complexity in managing the enterprise data estate is growing dramatically, driven by a step function change in the proliferation of structured and unstructured data.”
“Maximizing the quality of data in the enterprise has become mission-critical and an important area of investment for Fortune 500 executives. We are proud to lead Anomalo’s Series B extension as they emerge as the leading platform in this space.”