May 5, 2025

Databricks and KPMG Invest in LlamaIndex to Unlock Scalable Enterprise AI

Databricks and KPMG have made minority equity investments in LlamaIndex, a platform for agentic document workflows. These investments aim to accelerate the development and adoption of LlamaIndex’s flagship products, including LlamaCloud and LlamaParse.

Valuable business information is often buried in unstructured formats that AI struggles to process. LlamaIndex is designed to solve this problem, bridging the gap between legacy data and AI applications.

LlamaParse focuses on extracting machine-readable data from messy documents, whether it’s legal files, invoices, or tables embedded in PDFs. LlamaCloud handles the heavy lifting behind the scenes, managing indexing, embedding, and retrieval at scale so businesses can integrate this data into AI workflows without reinventing their infrastructure.

Together, these tools make unstructured knowledge searchable, accessible, and ready for AI-powered applications, allowing businesses to unlock insights from information that was previously difficult to use. As demand for AI-driven automation grows, LlamaIndex is positioning itself at the center of an essential transformation in enterprise data intelligence.

Databricks has spent the better part of the last decade building its Data Intelligence Platform to offer enterprises a unified environment for analytics, machine learning (ML), and now GenAI. However, to truly unlock enterprise-grade GenAI, Databricks needs to process unstructured data, which accounts for 80-90% of all enterprise data, and the platform is not natively optimized for handling it.

LlamaIndex allows Databricks to offer more advanced tools for handling unstructured data. It also helps facilitate the development of Retrieval-Augmented Generation (RAG) applications, enabling enterprises to build AI solutions that can reason over complex documents. 

Databricks is making its investment through the Databricks Ventures AI Fund, a program designed to back startups that are advancing AI or enhancing its capabilities within the Databricks Data Intelligence Platform. 

The minority stake in LlamaIndex follows a series of investments in other AI-driven companies, including Cleanlab, Mistral AI, and Perplexity. Databricks appears keen to solidify its role at the center of the AI and data ecosystem, using venture capital as a strategic lever to support emerging technologies that align with its broader vision.

For KPMG, LlamaIndex provides the technical backbone to handle unstructured data more effectively. Like other professional services firms, KPMG works with large volumes of documents, including audit reports, regulatory filings, contracts, and client communications. With LlamaIndex, KPMG will have new tools to organize and analyze this data in ways that could improve efficiency across audits, tax analysis, and advisory services. 

KPMG’s involvement comes through KPMG Ventures, which is focused on supporting early-stage startups in emerging areas such as agentic AI, data infrastructure, and cybersecurity. The company’s investment in LlamaIndex is part of its broader strategy, following previous investments in AI-focused startups like Ema, Wokelo, and Rhino.ai.

The investments by Databricks and KPMG in LlamaIndex signal a broader push to make enterprise data more accessible. Both investors see a growing demand for tools that help businesses extract value from proprietary data.

“LlamaIndex’s technology addresses a critical need in the enterprise AI stack, enabling companies to quickly build production-ready AI applications that leverage their proprietary data,” said Patrick Wendell, Co-Founder & VP Engineering at Databricks. “This investment aligns perfectly with our mission to help customers drive innovation through data intelligence.”

LlamaIndex announced a $19 million Series A funding round in March 2025. While the financial terms of the latest investments were not disclosed, Co-Founder and CEO Jerry Liu stated that the funding will help LlamaIndex accelerate product development and expand its go-to-market strategy.

A key advantage of LlamaIndex is that it does not compete with language models. It complements them by providing the underlying infrastructure that allows large language models to operate effectively across fragmented and unstructured enterprise data. Its focus on infrastructure, rather than flashy applications, makes it a natural fit for major platforms like Databricks.

“As we continue to innovate and push boundaries in applied AI, a robust data foundation is essential for building effective AI systems, particularly sophisticated knowledge assistants and agentic solutions,” said Swami Chandrasekaran, Principal, AI & Data Labs Leader at KPMG. “LlamaCloud and LlamaIndex provide the frameworks necessary to access, curate, and ingest data at scale, enabling KPMG to develop differentiated, industry-specific solutions that deliver measurable business outcomes for our clients.”

BigDATAwire