

(Tee11/Shutterstock)
Companies that are running into performance walls as they scale up their vector databases may want to check out the latest update to Zilliz Cloud, a hosted version of the Milvus database from Zilliz. The database maker says the update delivers a 10x performance gain that can be spent on higher throughput or lower latency, three new search algorithms that improve search accuracy from around 70% to 95%, and a new AutoIndexer that eliminates the need to manually tune the database for peak performance on each data set.
Interest in vector databases is booming at the moment, thanks in large part to the explosion in use of large language models (LLMs) to create human-like interactions, as well as increasing adoption of AI search. By caching relevant documents as vectorized embeddings, a vector database can feed more relevant data into AI models (or return better results in a search), thereby lowering the frequency of hallucinations and creating a better overall customer experience.
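To make that retrieval step concrete, here is a minimal sketch assuming pymilvus’s MilvusClient quick-start interface; the embed() function is a stand-in for a real embedding model, and the endpoint, collection name, and documents are illustrative only.

```python
# Minimal retrieval sketch: store document embeddings in Milvus/Zilliz and
# pull back the most relevant passages to ground an LLM prompt.
import numpy as np
from pymilvus import MilvusClient

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model (e.g., a sentence-transformer);
    # returns a deterministic pseudo-random 768-dimensional vector.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(768).tolist()

client = MilvusClient(uri="http://localhost:19530")  # or a Zilliz Cloud URI + token

# Quick setup: an "id" primary key and a 768-dimensional "vector" field.
client.create_collection(collection_name="docs", dimension=768)

docs = [
    "Refund policy: returns are accepted within 30 days of purchase.",
    "Standard shipping takes three to five business days.",
]
client.insert(
    collection_name="docs",
    data=[{"id": i, "vector": embed(d), "text": d} for i, d in enumerate(docs)],
)

# At query time, retrieve the nearest documents and hand them to the LLM as context.
hits = client.search(
    collection_name="docs",
    data=[embed("How long do I have to return an item?")],
    limit=2,
    output_fields=["text"],
)
context = "\n".join(hit["entity"]["text"] for hit in hits[0])
```

In a GenAI application, that context string would be prepended to the LLM prompt so the model answers from retrieved documents rather than from memory alone.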
Zilliz is among the vector databases riding the GenAI wave. As the commercial outfit behind the open source Milvus database, the Redwood City, California company is actively working to carve out the high-end segment of the vector database market. Zilliz CEO and Founder Charles Xie says the company has more than 10,000 enterprise users, and counts large enterprises like Walmart, Target, Salesforce, Intuit, Fidelity, Nvidia, IBM, PayPal, and Roblox as customers.
With today’s update to Zilliz Cloud, customers will be able to push the size and performance of their vector database installations even further. According to Xie, customers can use the 10x performance boost either to increase throughput or to lower latency.
“A lot of these vector databases are running queries at subsecond latency,” Xie tells BigDATAwire. “They’re running somewhere from one second to 500 milliseconds. But in terms of latency, a lot of customers may expect more real-time latency. They want the query to be running in milliseconds, basically in tens of milliseconds. They want to get the results in 10 milliseconds or in 20 milliseconds.”
Customers that need more throughput can instead configure the database to favor query volume. According to Xie, vector databases often deliver 50 to 100 queries per second. With the update to Zilliz Cloud, the company is able to offer far more, Xie says.
“There are a lot of these online services, they want 10,000 queries per second,” he says. “If you get a super popular application, you get hundreds of millions of users, you’d probably like somewhere from 10,000 per second to even 30,000 per second. With our new release, we can support up to 50,000 queries per second.”
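As a rough way to sanity-check latency and queries-per-second figures like these in your own environment, the sketch below times a batch of single-vector searches against the “docs” collection from the earlier example. It is a single-threaded, illustrative measurement, not a benchmark of Zilliz Cloud; real throughput depends on concurrency, index type, hardware, and network.

```python
# Illustrative latency/QPS measurement against the "docs" collection above.
import time
import numpy as np

latencies = []
for i in range(200):
    start = time.perf_counter()
    client.search(collection_name="docs", data=[embed(f"query {i}")], limit=10)
    latencies.append(time.perf_counter() - start)

lat = np.array(latencies)
print(f"p50: {np.percentile(lat, 50) * 1e3:.1f} ms, "
      f"p99: {np.percentile(lat, 99) * 1e3:.1f} ms, "
      f"~{len(lat) / lat.sum():.0f} queries/sec from one client")
```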
The performance boost comes from work Zilliz has done to expand support for parallel processor deployments. It also added support for ARM CPU deployments, to go along with its previous support for Intel and AMD CPUs and Nvidia GPUs. It’s currently working with AWS to support its ARM-based Graviton processors, Xie says.
“We are using the parallel processing instruction set of modern processors, either the ARM CPU or Intel CPU, to unlock the full potential of the parallel data execution,” Xie says.
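The idea Xie describes, applying one instruction to many data elements at once (AVX on Intel and AMD chips, NEON/SVE on ARM), can be illustrated by comparing a per-row distance computation with a batched one. NumPy’s batched matrix-vector product dispatches to SIMD-enabled kernels; this is an analogy for the data-parallel principle, not Zilliz’s actual code path.

```python
# Batched vs. per-row inner products over 100,000 candidate vectors.
import time
import numpy as np

rng = np.random.default_rng(0)
query = rng.random(768, dtype=np.float32)
candidates = rng.random((100_000, 768), dtype=np.float32)

start = time.perf_counter()
per_row = [float(np.dot(query, c)) for c in candidates]  # one row at a time
t_loop = time.perf_counter() - start

start = time.perf_counter()
batched = candidates @ query  # one data-parallel pass over contiguous memory
t_batch = time.perf_counter() - start

print(f"per-row: {t_loop:.3f}s  batched: {t_batch:.3f}s")
```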
As companies move GenAI applications from development to production, the size of their vector databases is increasing. A year ago, many vector databases held on the order of a million vector embeddings, Xie says, but at the beginning of 2023 it was becoming more common to see databases storing 100 million to several billion vectors. Zilliz’ largest deployment currently supports 100 billion vectors, he says.
Zilliz Cloud customers will be able to get more use out of all that high-dimensional data with the addition of new search algorithms. In previous releases, Zilliz Cloud supported dense vector search, including approximate nearest neighbor (ANN) search. Now it supports four search types.
“We introduced a sparse index search, or basic sparse embedding search. And we also introduced scalar search, so you can do data filtering on top of a scalar property. And also we have this multi-vector search, so basically you can put a number of vectors in a vector array, to get more context in this search,” Xie explains.
“So combining these four searches–dense vector search, sparse vector search, scalar search, and also multi-vector search–we can bring the accuracy of the search result to another level, from around 70% to 80% accuracy to 95% and above in terms of recall accuracy,” he continues. “That’s huge.”
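Here is a small sketch of how two of those search types, dense vector search and scalar filtering, combine in a single query, again assuming the pymilvus MilvusClient interface and the “docs” collection above. The “category” field is a hypothetical metadata attribute stored alongside each document; sparse-embedding and multi-vector search use additional index and search APIs not shown here.

```python
# Dense vector search combined with a scalar filter; "category" is a
# hypothetical metadata field stored with each document.
results = client.search(
    collection_name="docs",
    data=[embed("return policy for electronics")],  # dense query vector
    filter='category == "policy"',                  # scalar filter on metadata
    limit=5,
    output_fields=["text", "category"],
)
for hit in results[0]:
    print(hit["distance"], hit["entity"]["text"])
```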
All those new search types could add a lot more complexity to Zilliz Cloud, further putting the database out of reach of organizations that can’t afford an army of administrators. But thanks to the new AutoIndexer added with this release, customers don’t have to worry about getting 500 to 1,000 parameters just right to achieve optimal performance, because the product sets the configuration automatically.
“A vector database is very complex because it’s basically managing high-dimensional data. There are a lot of parameters and configurations, so the challenge is that a lot of our customers have to hire a bunch of vector database administrators to do all this configuration, with a lot of trial and error, to get the best configuration for their usage pattern and their workload,” Xie says.
“But with AutoIndex, they don’t need that anymore,” he continues. “It’s autonomous driving mode. We’re using AI algorithms behind the scenes to make sure that you get the best configuration out of the box. And the other thing is that it’s also beneficial for them in reducing the total cost of ownership.”
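For a sense of what AutoIndex replaces, the sketch below contrasts hand-tuned index parameters with an AUTOINDEX index type, using pymilvus’s index-management calls on a collection whose vector field has not yet been indexed. The HNSW parameter values are illustrative only, and the exact knobs Zilliz Cloud tunes internally are not public.

```python
# Hand-tuned index parameters vs. AutoIndex.
index_params = client.prepare_index_params()

# Manual route: pick an index type and tune its knobs for each dataset/workload.
# index_params.add_index(field_name="vector", index_type="HNSW",
#                        metric_type="COSINE", params={"M": 16, "efConstruction": 200})

# AutoIndex route: let the service choose and tune the index behind the scenes.
index_params.add_index(field_name="vector", index_type="AUTOINDEX", metric_type="COSINE")

client.create_index(collection_name="docs", index_params=index_params)
```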
A year ago, it was common for customers to spend $10,000 to $20,000 per month on a vector database solution. But as data volumes increase, they find themselves spending upwards of $1 million a month. “They’re definitely looking for a solution that can provide a better total cost of ownership,” he says. “So that’s why cost reduction has been very important to them.”
Zilliz Cloud is available on AWS, Microsoft Azure, and Google Cloud. For more information, see www.zilliz.com.
Related Items:
Zilliz Unveils Game-Changing Features for Vector Search
How Real-Time Vector Search Can Be a Game-Changer Across Industries