

(Summit Art Creations/Shutterstock)
Kinetica got its start building a GPU-powered database to serve fast SQL queries and visualizations for US government and military clients. But with a pair of announcements at Nvidia’s GTC show last week, the company is showing it’s prepared for the coming wave of generative AI applications, particularly those utilizing retrieval augmented generation (RAG) techniques to tap unique data sources.
Companies today are hunting for ways to leverage the power of large language models (LLMs) with their own proprietary data. Some companies are sending their data to OpenAI’s cloud or other cloud-based AI providers, while others are building their own LLMs.
However, many more companies are adopting the RAG approach, which has emerged as perhaps the best middle ground: it doesn't require building your own model (time-consuming and expensive) or sending your data to the cloud (risky from a privacy and security standpoint).
With RAG, relevant data is injected directly into the context window before the prompt is sent to the LLM, thereby providing more personalization and context in the LLM's response. Along with prompt engineering, RAG has emerged as a low-risk and fruitful method for juicing GenAI returns.
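The pattern described above can be sketched in a few lines. This is a minimal illustration, not Kinetica's implementation: `retrieve` is a toy word-overlap retriever standing in for a real embedding-based search, and the corpus strings are made up.

```python
# Minimal sketch of the RAG pattern: retrieve relevant snippets, then
# inject them into the context window ahead of the user's question.
def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    # Toy retriever: rank documents by word overlap with the query.
    # A real RAG stack would use vector embeddings and similarity search.
    q_words = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query: str, corpus: list[str]) -> str:
    # The retrieved text becomes context that the LLM sees before the question.
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Kinetica is a GPU-accelerated analytics database.",
    "The cafeteria closes at 3 pm on Fridays.",
]
prompt = build_prompt("What kind of database is Kinetica?", corpus)
print(prompt)
```

The prompt that comes out the other end is what actually gets sent to the model, which is why retrieval quality directly determines answer quality.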

The VRAM boost in Nvidia's Blackwell GPU will help Kinetica keep the processor fed with data, Negahban said.
Kinetica is also now getting into the RAG game with its database by essentially turning it into a vector database that can store and serve vector embeddings to LLMs, as well as by performing vector similarity search to optimize the data it sends to the LLM.
According to its announcement last week, Kinetica is able to serve vector embeddings 5x faster than other databases, a figure it attributes to the VectorDBBench benchmark. The company says it's able to achieve that speed by leveraging Nvidia's RAPIDS RAFT technology.
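The core operation being accelerated here is vector similarity search: scoring a query embedding against every stored embedding and returning the nearest neighbors. The sketch below is a CPU-side NumPy analogue for illustration only; the vectors are random toy data, and a GPU engine like RAFT parallelizes the same math across thousands of cores.

```python
# Brute-force cosine-similarity search over an embedding table.
# Toy data; real systems run this on GPU for large vector sets.
import numpy as np

def top_k_cosine(query: np.ndarray, vectors: np.ndarray, k: int = 3) -> np.ndarray:
    # Normalize rows, then one matrix-vector product scores every vector at once.
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q
    return np.argsort(scores)[::-1][:k]  # indices of the k most similar rows

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 64)).astype(np.float32)
# A query that is a slightly perturbed copy of row 42 should match row 42 first.
query = embeddings[42] + 0.01 * rng.standard_normal(64)
print(top_k_cosine(query, embeddings, k=1))
```

Because the whole scan is a single dense matrix product, it maps naturally onto GPU hardware, which is the speed advantage the benchmark number reflects.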
That GPU-based speed advantage will help Kinetica customers by enabling them to scan more of their data, including real-time data that has just been added to the database, without doing a lot of extra work, said Nima Negahban, co-founder and CEO of Kinetica.
“It’s hard for an LLM or a traditional RAG stack to be able to answer a question about something that’s happening right now, unless they’ve done a lot of pre-planning for specific data types,” Negahban told Datanami at the GTC conference last week, “whereas with Kinetica, we’ll be able to help you by looking at all the relational data, generate the SQL on the fly, and ultimately what we put just back in the context for the LLM is a simple text payload that the LLM will be able to understand to use to give the answer to the question.”
This essentially gives users the capability to talk to their complete corpus of relational enterprise data, without doing any preplanning.
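The flow Negahban describes can be sketched end to end: a question becomes SQL, the database executes it against live relational data, and the result is flattened into a plain-text payload for the LLM's context. In this illustration, sqlite3 stands in for Kinetica, and the SQL is hand-written where a model would generate it on the fly; the table and values are invented.

```python
# Sketch of a text-to-SQL RAG step against live operational data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (vehicle TEXT, speed REAL)")
conn.executemany("INSERT INTO trips VALUES (?, ?)",
                 [("truck-1", 54.0), ("truck-2", 61.5)])

# Step 1: SQL generated from the user's question
# ("Which vehicle is fastest right now?") -- by an LLM in the real stack.
generated_sql = "SELECT vehicle, speed FROM trips ORDER BY speed DESC LIMIT 1"

# Step 2: execute against the relational data as it exists this moment.
row = conn.execute(generated_sql).fetchone()

# Step 3: flatten the result into a simple text payload for the LLM's context.
payload = f"Fastest vehicle right now: {row[0]} at {row[1]} mph"
print(payload)
```

The point of the text payload in step 3 is that the LLM never needs to understand the schema or the query, only the answer, which is what lets this work without per-dataset pre-planning.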
“That’s the big advantage,” he continued, “because the traditional RAG pipelines right now, that part of it still requires a good amount of work as far as you have to have the right embedding model, you have to test it, you have to make sure it’s working for your use case.”
Kinetica can also talk to other databases and function as a generative federated query engine, as well as perform the traditional vectorization of data that customers put inside of Kinetica, Negahban said. The database is designed to be used for operational data, such as time-series, telemetry, or telco data. Thanks to its support for NVIDIA NeMo Retriever microservices, the company is able to position that data in a RAG workflow.
But for Kinetica, it all comes back to the GPU. Without the extreme computational power of the GPU, the company would have just another RAG offering.
“Basically you need that GPU-accelerated engine to make it all work at the end of the day, because it’s got to have the speed,” said Negahban, a 2018 Datanami Person to Watch. “And we then put all that orchestration on top of it as far as being able to have the metadata necessary, being able to connect to other databases, having all that to make it easy for the end user, so basically they can start taking advantage of all that relational enterprise data in their LLM interaction.”
Related Items:
Bank Replaces Hundreds of Spark Streaming Nodes with Kinetica
Kinetica Aims to Broaden Appeal of GPU Computing
Preventing the Next 9/11 Goal of NORAD’s New Streaming Data Warehouse