Anomalo Secures $10M to Advance Data Quality Solutions for Generative AI
PALO ALTO, Calif., Nov. 25, 2024 — Anomalo has announced that it has added the ability to identify common and business-specific quality and compliance issues in unstructured data targeted for Generative AI workflows. Anomalo’s platform uses AI to automatically detect issues in both unstructured and structured data, letting teams resolve any hiccups with their data before making decisions, running operations or powering AI and machine learning workflows.
This announcement comes as Anomalo sees record enterprise demand for its data quality product as Generative AI surges. In the last year, Anomalo has more than doubled its customers in the Fortune 500. Anomalo has also deepened its partnerships with Databricks, Snowflake and Google and received the prestigious Emerging Partner of the Year award from Databricks, which also made a strategic investment in the company.
Elliot Shmukler, co-founder and CEO of Anomalo, said: “Generative AI is the next frontier, but there is no playbook for data quality when it comes to determining the quality of unstructured data feeding Generative AI workflows and LLMs. Enterprises need to understand what they have inside their unstructured data collections and which parts of those collections are suitable for Generative AI use. At Anomalo, we’re building this playbook and are working with the world’s largest and most innovative companies to solve this challenge together.”
A recent McKinsey Global Survey found that 65 percent of companies across sizes, geographies and industries now use Generative AI regularly, twice as many as last year. But no off-the-shelf Generative AI model will “just work” for enterprises: whether they are building a RAG workflow or powering a customer support chatbot, they need enterprise-specific data to get correct outputs from the LLM. That means they need a way to bring their data to the Generative AI models and, of course, to make sure that data is high-quality and compliant.
The challenge is that most of this data is unstructured, such as documents, call transcripts and order forms, and, unlike structured data, unstructured data has no established framework for determining its quality. These documents are often cluttered with duplicates, errors, private information and even abusive language. Organizations that want to leverage their unstructured data need to be able to identify and resolve quality issues in that data before it is incorporated into Generative AI workflows and affects their performance or customer experience.
This challenge led Anomalo to expand its data quality platform from structured data to unstructured data in June. With its unstructured data quality monitoring capability, unstructured text documents can be evaluated against out-of-the-box issues including document length, duplicates, topics, tone, language, abusive language, PII and sentiment. Users are then able to quickly evaluate the quality of a document collection and identify issues in individual documents, dramatically reducing the time needed to profile, curate and leverage high-value unstructured text data.
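To make a few of these issue types concrete, here is a minimal Python sketch of three of them: document length, duplicates and PII. It is not Anomalo's implementation, which the announcement describes as AI-based. The function name `find_issues`, the `MIN_WORDS` threshold and the email-only PII pattern are assumptions chosen purely for illustration.

```python
import hashlib
import re

# Hypothetical threshold and pattern, chosen only for illustration.
MIN_WORDS = 5
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # email addresses as a stand-in for PII


def find_issues(docs):
    """Return {document index: [issue names]} for a list of text documents."""
    seen = {}    # content hash -> index of the first document with that content
    issues = {}
    for i, text in enumerate(docs):
        found = []
        # Length check: flag documents with very few words.
        if len(text.split()) < MIN_WORDS:
            found.append("too_short")
        # Duplicate check: normalize case and whitespace, then compare hashes.
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            found.append("duplicate")
        else:
            seen[digest] = i
        # PII check: only email addresses are detected in this sketch.
        if EMAIL_RE.search(text):
            found.append("pii_email")
        if found:
            issues[i] = found
    return issues


docs = [
    "Customer called about a late order and asked for a refund.",
    "customer called about a late   order and asked for a refund.",
    "Contact me at jane.doe@example.com about the invoice for last month.",
    "ok",
]
print(find_issues(docs))
```

Issues such as topic, tone, sentiment or abusive language generally cannot be caught with rules this simple and would call for a language model or classifier.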
With today’s announcement, Anomalo is expanding on these capabilities with two major advancements:
- Enterprises can now define custom issues that describe any criteria they want to look for within the document collection, and assign severity weightings to both their custom issues and Anomalo’s out-of-the-box issues.
- With Anomalo’s support for cloud-hosted model-as-a-service offerings, enterprises can now use models that are approved to run within their own cloud environment and hosted by AWS Bedrock, Google Vertex and Microsoft Azure AI. Paired with Anomalo’s existing ability to seamlessly integrate with cloud providers and run entirely within a Virtual Private Cloud (VPC), this keeps data within enterprise data teams’ control and minimizes the risk that data is ever used to train or fine-tune models.
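As a rough illustration of how severity weightings on custom and out-of-the-box issues might feed a per-document decision, consider the Python sketch below. The announcement does not describe how Anomalo combines weights, so everything here is an assumption: the issue names, the weight values, the rule that a document's score is driven by its most severe issue, and the 0.6 threshold.

```python
# Hypothetical issue names and severity weights (0 = ignore, 1 = most severe).
SEVERITY = {
    "pii": 1.0,
    "abusive_language": 0.8,
    "duplicate": 0.3,
    "mentions_competitor": 0.5,  # example of a custom, business-specific issue
}


def document_score(detected_issues, severity=SEVERITY):
    """Return a 0-1 quality score: 1.0 minus the weight of the worst detected issue."""
    # Issue names without a configured weight are treated as weight 0.0.
    worst = max((severity.get(name, 0.0) for name in detected_issues), default=0.0)
    return 1.0 - worst


def suitable_for_genai(detected_issues, threshold=0.6):
    """Assumed rule: keep a document only if its score meets the threshold."""
    return document_score(detected_issues) >= threshold


print(suitable_for_genai(["duplicate"]))         # low-severity issue only
print(suitable_for_genai(["pii", "duplicate"]))  # includes a high-severity issue
```

Scoring by the worst issue is only one option; summing weights or averaging them would rank documents differently, and the right choice depends on how the collection will be used.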
Anomalo Raises $10M Extension Series B Funding, Bringing the Total Raised to $82M
Anomalo also announced $10 million in an extension Series B round from Smith Point Capital. Anomalo plans to use the new funding to accelerate investment in R&D for unstructured data monitoring and to deliver the future of data quality for Generative AI applications.
“Anomalo is rewriting the enterprise playbook for data quality in the AI era. The complexity in managing the enterprise data estate is growing dramatically, driven by a step function change in the proliferation of structured and unstructured data. Maximizing the quality of data in the enterprise has become mission-critical and an important area of investment for Fortune 500 executives. We are proud to lead Anomalo’s Series B extension as they emerge as the leading platform in this space,” said Keith Block, founder and CEO of Smith Point Capital.
About Anomalo
Anomalo helps enterprises build confidence in the data they use to make decisions and build products. Enterprises can simply connect Anomalo’s complete data quality platform to their data warehouse or lakehouse and begin monitoring their data in less than 5 minutes, all with minimal configuration and without a single line of code. Then, they can automatically detect data issues and understand their root cause before anyone else. Anomalo is backed by SignalFire, Norwest Venture Partners, Foundation Capital, Two Sigma Ventures, Village Global, First Round Capital, Smith Point Capital and Databricks Ventures.
Source: Anomalo