(Andy Chipus/Shutterstock)
The large language model (LLM) revolution has transformed vector databases from obscure search tech into must-have products for AI success. But which vector database features should you look for, and which vendors are innovating? The analysts at Forrester recently dug into the field to provide answers in a new report.
Vector databases are designed to manage and process one particular data type called a vector embedding, which is a numerical representation of words, documents, images, or even sound. A vector database indexes and stores these embeddings in a multi-dimensional space, which allows users or applications to retrieve a given embedding along with the nearby embeddings that most resemble it. This similarity search function is what enabled users to get much better search results than straightforward keyword-matching, and led to the creation of so-called “AI search engines.”
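To make the idea of "nearby" embeddings concrete, here is a minimal sketch of similarity search in plain Python. The three-dimensional vectors and their labels are made up for illustration (real embeddings come from a model and typically have hundreds or thousands of dimensions), and the brute-force scan stands in for the approximate nearest-neighbor indexes that production vector databases generally rely on.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, index, k=2):
    """Brute-force search: score every stored vector, keep the k best."""
    scored = [(cosine_similarity(query, vec), label) for label, vec in index.items()]
    scored.sort(reverse=True)
    return [label for _, label in scored[:k]]

# Hypothetical toy embeddings, invented for this example.
index = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.85, 0.15, 0.05],
    "truck": [0.0, 0.2, 0.95],
}

query = [0.88, 0.12, 0.02]  # an embedding that sits close to "cat" and "kitten"
top = nearest(query, index, k=2)
print(top)
```

The query shares no keywords with anything, since there are no keywords at all. It is matched purely on how close it sits to the stored vectors, which is the property that separates this kind of search from keyword matching.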
When ChatGPT dropped the LLM bomb on the world in late 2022, a new use for vector databases was quickly discovered. GenAI users found they could store a set of source documents as embeddings in a vector database, then call on the database at runtime to serve information from those documents via similarity search as part of the prompt engineering or retrieval-augmented generation (RAG) process. Doing so greatly improved the quality of the responses generated by chatbots, co-pilots, and other forms of AI interactions enabled by LLMs like ChatGPT.
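The RAG flow described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory `store` list stands in for a vector database, and the two documents are invented. The step of actually sending the prompt to an LLM is left out.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: word counts as a sparse vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, store, k=1):
    """Similarity search at runtime: return the k most similar documents."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question, passages):
    """Prepend the retrieved passages to the question before calling the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical source documents, embedded once and stored ahead of time.
documents = [
    "Refunds are issued within 14 days of a return.",
    "Our support line is open on weekdays.",
]
store = [(doc, embed(doc)) for doc in documents]

question = "How many days do refunds take?"
passages = retrieve(question, store, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

The model never sees the whole document set, only the passages that similarity search judged relevant to the question, which is why retrieval quality has such a direct effect on answer quality.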
Just a few “native” vector databases existed prior to ChatGPT, such as Pinecone, Milvus, and Zilliz. But almost overnight, many existing database vendors adapted their wares to be able to store, index, and process vector data, too, including Elastic, DataStax, Couchbase, MongoDB, and even Teradata. For NoSQL and relational databases that were already multi-modal in nature, the addition of the vector data type was a no-brainer.
However, as the market for vector databases exploded, it also created some confusion among users about the best approach to adopting them. “Is the pgvector plug-in for Postgres sufficient for my GenAI needs? What benefits does a native vector database bring that multi-modal databases can’t match? Do these vector databases only run in the cloud or can I run them on-prem too?”
Enter Forrester, the longtime IT analyst group based in Cambridge, Massachusetts. In its “Vector Databases Landscape, Q2 2024” report, Forrester analyst Noel Yuhanna and several of his colleagues dug into the burgeoning market for vector databases while slicing and dicing the vector database capabilities from 24 vendors.
Forrester started out by defining its terms. A vector database is “a database management system that provides storage, indexing, processing, and access for data represented by vectors to support similarity searches, RAG apps, modern generative AI/LLM apps, and vector-based analytics,” the company states.
“Customers leverage vector databases to support customer experiences, RAG applications, image similarity search, real-time anomaly data detection, optimized recommendation engines, and fraud detection,” it continues. “Despite being in the nascent stages of this market, we anticipate a surge in diverse use cases in the near term.”
Forrester sees the market for vector databases divided into two main segments: native vector DBs and multi-modal vector DBs.
The key difference between the camps, Forrester says, is the greater scalability of native vector DBs, “particularly when handling large volumes of vectors.” The main advantage of a multi-modal vector DB, meanwhile, is that it can store other types of data, potentially eliminating the need for two or more separate databases.
The challenges of scale in vector databases have not been entirely solved, and the high end “is still a work in progress,” Forrester says. “High-end scale and performance still require considerable effort, especially when supporting tens of billions of data points (vectors).”
Forrester didn’t rank the vector databases by their capabilities to tackle standard vector database duties (perhaps that will be the subject of an upcoming Forrester Wave). But it did look into which databases are being positioned for some of the emerging use cases for vector databases, which is handy to know (see image to the right).
The market for vector databases has seen a large number of entrants over the past 12 months, which makes for interesting dynamics that observers and customers should closely watch, Forrester says.
For instance, the capabilities expected of vector databases are changing. Core functions, like vector storage, indexing, and processing, are being augmented with more advanced features, “including enhanced security measures, optimized processing capabilities, and seamless integration with diverse vector embedding transformers and data streaming engines,” the analyst group says.
Another thing to look out for is market bleed-over. Cloud data platforms, including data fabrics and data lakehouses, are also adopting vector capabilities, Forrester says, which could further disrupt the market for vector databases.
“This trend underscores a shift toward comprehensive data management solutions that seamlessly integrate vector functionality, potentially reshaping the landscape of specialized vector databases,” Yuhanna and the Forrester analysts write.
Related Items:
Vector Databases Emerge to Fill Critical Role in AI
Home Depot Finds DIY Success with Vector Search
Can Thought Vectors Deliver Human-Level Reasoning?