

Rockset today unveiled new vector database capabilities, including approximate nearest neighbor (ANN) search and native support for LlamaIndex and LangChain, that it says will help companies efficiently scale their GenAI applications once they’re in production.
As companies experiment with the new generative AI capabilities delivered via large language models (LLMs) and vector search, they’re getting good early results, says Rockset co-founder and CEO Venkat Venkataramani.
“We’re not educating people on what can vector search do for you,” he says. “They’ve already tinkered it at very small scale, built prototypes, and they already see the magic.”
While vector search and GenAI prototypes tease a tantalizing future, companies often run into trouble when they try to make the leap from development to production.
“Not a week goes by where somebody calls me and says, ‘Venkat, I started with this toy open source vector database and we did a shadow launch and a scale test, and it just bombed,’” Venkataramani says. “Other vector databases may have good vector support, but the database part is very shaky. Is it scalable? Is it reliable? It gets very expensive and very hard to operate very quickly.”
Rockset rolled out its initial support for vector search and storing vectorized embeddings earlier this year. Like the makers of many other SQL and NoSQL databases, the Silicon Valley firm experienced a surge in demand for these data types, which are instrumental for enabling vector search as well as other types of GenAI applications built atop LLMs and computer vision models.
The addition today of ANN and native support for LlamaIndex and LangChain, which are open source tools for automating prompt engineering and other critical behind-the-scenes GenAI data workflows, bolsters Rockset’s existing capabilities for serving scalable GenAI apps.
The ANN algorithm is critical for quickly matching GenAI app user input to pre-generated vector embeddings stored in a vector database. It’s used both in vector search, where it powers the similarity search, as well as other GenAI use cases for text and computer vision.
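To make the matching step concrete, here is a minimal sketch of the exact, brute-force search that ANN algorithms approximate. It is an illustration only, not Rockset's implementation: the function names, document IDs, and three-dimensional vectors are all hypothetical, and real embeddings typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn_search(query, embeddings, k):
    """Exact search: score every stored vector, return the k most similar ids."""
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in embeddings.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical pre-generated embeddings (assumption: toy 3-d vectors).
EMBEDDINGS = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
    "doc_d": [0.0, 0.0, 1.0],
}

# The embedded user input is compared against every stored vector.
nearest = knn_search([1.0, 0.05, 0.0], EMBEDDINGS, k=2)
print(nearest)
```

Because this scan touches every stored vector, its cost grows linearly with the size of the collection. ANN indexes trade a small amount of accuracy for the ability to skip most of those comparisons.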
Rockset’s implementation of ANN is unique, Venkataramani says, because it rebuilds the ANN index in real time as new data arrives, versus as a batch job that requires downtime.
“Other vector databases require you to rebuild the entire ANN index and all of that in batch mode, and so you don’t really get a real time application,” he says. “Rebuilding these indexes also is actually way more computationally expensive, but if you can incrementally maintain it, it is a lot cheaper and also more real-time.”
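The difference between incremental maintenance and batch rebuilds can be sketched with a toy inverted-file style index. This is an assumption-laden illustration, not a description of how Rockset or any other product builds its index: the `ToyIVFIndex` class, its fixed centroids, and the sample points are all hypothetical. The point is only that a new vector can be placed into an existing index structure and become searchable immediately, without reprocessing the vectors already stored.

```python
import math

class ToyIVFIndex:
    """Toy ANN index: vectors are grouped into buckets by nearest centroid."""

    def __init__(self, centroids):
        # Assumption: centroids are fixed up front; real systems learn them.
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def _nearest_centroids(self, vec, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda i: math.dist(vec, self.centroids[i]))
        return order[:n]

    def add(self, item_id, vec):
        """Incremental insert: touches one bucket, no full rebuild."""
        bucket = self._nearest_centroids(vec, 1)[0]
        self.buckets[bucket].append((item_id, vec))

    def search(self, query, k, n_probe=1):
        """Approximate search: scan only the n_probe closest buckets."""
        candidates = []
        for b in self._nearest_centroids(query, n_probe):
            candidates.extend(self.buckets[b])
        candidates.sort(key=lambda item: math.dist(query, item[1]))
        return [item_id for item_id, _ in candidates[:k]]

index = ToyIVFIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add("a", [0.5, 0.5])
index.add("b", [9.0, 9.5])

before = index.search([9.8, 9.9], k=1)  # only "b" is near the query so far
index.add("c", [9.9, 9.9])              # new data arrives
after = index.search([9.8, 9.9], k=1)   # "c" is searchable immediately
print(before, after)
```

A batch-oriented design would instead regenerate the bucket assignments for every stored vector on a schedule, which is the more expensive pattern the quote describes.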
Rockset’s support for compute-compute separation enables it to run workloads such as index rebuilding, compaction, and ongoing maintenance without impacting the application’s main vector query workload, Venkataramani says. Compute-compute separation gives the database a big advantage when it comes to scaling GenAI applications, he says.
“You can have one or more compute instances for searches and similarity searches and vector searches and other real-time analytics and reporting–whatever applications you have,” the Datanami 2022 Person to Watch says. “They’re completely decoupled. They’re fully independently scalable and isolated from each other. But they work on the same copy of the data, and new data coming in–new updates, inserts, and deletes–will be available for your searches within single-digit milliseconds.”
The fact that Rockset, as a distributed relational database, can store all of a customer’s data as opposed to just storing vectors, as a dedicated vector database does, is another big advantage, Venkataramani says.
“You can have one column that’s basically vector embeddings, and all the other columns and other structured data available right there,” he says. “Building these kinds of hybrid searches across vectors and other metadata that you have is as simple as a SQL where clause. It’s not like you have a vector database and then you put all the other metadata and other structured data in a second separate database and you have to somehow in the application wire them together.”
Having all of the data in one place turns out to be very important in some GenAI use cases, such as powering a song recommendation engine, Venkataramani says. Running an ANN search or a k-nearest neighbor (KNN) search, the latter a brute-force approach that delivers exact answers, is just one of many steps that happen behind the scenes in a recommendation engine. Developers may also apply pre- and post-filtering using other metadata to get the best song recommendations in front of the user.
“You want to push the computation close to where the data lives, but the optimizer needs to be able to know which filters to apply first and which filters to apply second,” he says. “Imagine I have all the vectors in the vector database and all the metadata in the second database. Which one do I do first? If I go and get the 10 songs that are closest in the vector database, all of them might be in my recent playlist. If I go and look at all the songs from all these artists, none of them might be nearest neighbors. So I have to be able to combine them in the same SQL WHERE clause to be able to do this efficiently on the same data set.”
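The ordering problem in the song example can be shown with a small sketch. Everything here is hypothetical (the song records, field names, and two-dimensional vectors), and the code uses plain Python rather than any database API. It contrasts running the vector search first and filtering afterward, as an application stitching two systems together might, with filtering and ranking over the same rows in one pass, which is roughly what a single query with a WHERE clause expresses.

```python
import math

# Hypothetical catalog: each row holds metadata and an embedding together.
SONGS = [
    {"id": "s1", "artist": "A", "recently_played": True,  "vec": [1.0, 0.0]},
    {"id": "s2", "artist": "A", "recently_played": True,  "vec": [0.9, 0.1]},
    {"id": "s3", "artist": "B", "recently_played": False, "vec": [0.7, 0.3]},
    {"id": "s4", "artist": "C", "recently_played": False, "vec": [0.0, 1.0]},
]

def nearest(query, songs, k):
    """Exact nearest-neighbor ranking by Euclidean distance."""
    return sorted(songs, key=lambda s: math.dist(query, s["vec"]))[:k]

def post_filter(query, k):
    """Vector search first, metadata filter second (two-system pattern)."""
    top_k = nearest(query, SONGS, k)
    return [s["id"] for s in top_k if not s["recently_played"]]

def combined(query, k):
    """Filter and rank over the same rows, as one filtered query would."""
    eligible = [s for s in SONGS if not s["recently_played"]]
    return [s["id"] for s in nearest(query, eligible, k)]

query = [1.0, 0.0]
post_result = post_filter(query, k=2)      # both top hits were recently played
combined_result = combined(query, k=2)     # filter applied before ranking
print(post_result, combined_result)
```

In this toy data the post-filter approach returns nothing, because the two nearest songs are both in the recent playlist, while the combined approach still returns two recommendations. Which filter a real optimizer applies first depends on selectivity and cost, which this sketch does not model.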
Since OpenAI ignited the GenAI storm a year ago with the launch of ChatGPT, the need for vector capabilities has exploded in the database market. Rockset’s vector capabilities are attracting attention among existing customers as well as prospects that are building GenAI applications, ranging from chatbots to recommendation engines to vector search, Venkataramani says.
“It’s really hot. It’s very, very significant,” he says. “AI applications are not like…a separate category of apps. Every application will have parts of their application powered by AI models and AI kind of capabilities, and it’ll be invisible…You’re not going to have a separate one-off side database to build your AI apps. Every single app in the world right now is going to get enhanced and have some components of it.”
One of the companies adopting Rockset’s vector capabilities is JetBlue. The airline, which recently participated in the vendor’s one-day conference, did a bake-off between Rockset and several other vector databases, and picked Rockset to power GenAI and other applications.
“We saw the immense power of real-time analytics and AI to transform JetBlue’s real-time decision augmentation and automation, since stitching together three to four database solutions would have slowed down application development,” Sai Ravuru, JetBlue’s senior manager of data science and analytics, says in a recent case study. “With Rockset, we found a database that could keep up with the fast pace of innovation at JetBlue.”
Related Items:
Rockset Says It’s Ready for Real-Time AI
Rockset Looks to Compute-Compute Isolation for Real-Time Advantage