

(Summit Art Creations/Shutterstock)
Kinetica got its start building a GPU-powered database to serve fast SQL queries and visualizations for US government and military clients. But with a pair of announcements at Nvidia’s GTC show last week, the company is showing it’s prepared for the coming wave of generative AI applications, particularly those utilizing retrieval augmented generation (RAG) techniques to tap unique data sources.
Companies today are hunting for ways to leverage the power of large language models (LLMs) with their own proprietary data. Some companies are sending their data to OpenAI’s cloud or other cloud-based AI providers, while others are building their own LLMs.
However, many more companies are adopting the RAG approach, which has emerged as perhaps the best middle ground: it doesn't require building your own model (time-consuming and expensive) or sending your data to the cloud (risky from a privacy and security standpoint).
With RAG, relevant data is injected directly into the context window before the prompt is sent to the LLM, thereby providing more personalization and context in the LLM's response. Along with prompt engineering, RAG has emerged as a low-risk and fruitful method for juicing GenAI returns.
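The pattern described above can be sketched in a few lines. This is a minimal illustration, not Kinetica's implementation: `retrieve` is a toy word-overlap retriever standing in for a real embedding-based search, and the corpus strings are made up.

```python
# Minimal sketch of the RAG pattern: retrieve relevant snippets, then
# inject them into the context window ahead of the user's question.
def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    # Toy retriever: rank documents by word overlap with the query.
    # A real RAG stack would use vector embeddings and similarity search.
    q_words = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query: str, corpus: list[str]) -> str:
    # The retrieved text becomes context that the LLM sees before the question.
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

corpus = [
    "Kinetica is a GPU-accelerated analytics database.",
    "The cafeteria closes at 3 pm on Fridays.",
]
prompt = build_prompt("What kind of database is Kinetica?", corpus)
print(prompt)
```

The prompt that comes out the other end is what actually gets sent to the model, which is why retrieval quality directly determines answer quality.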

The VRAM boost in Nvidia's Blackwell GPU will help Kinetica keep the processor fed with data, Negahban said.
Kinetica is also now getting into the RAG game with its database by essentially turning it into a vector database that can store and serve vector embeddings to LLMs, as well as by performing vector similarity search to optimize the data it sends to the LLM.
According to its announcement last week, Kinetica is able to serve vector embeddings 5x faster than other databases, a figure it attributes to the VectorDBBench benchmark. The company says it's able to achieve that speed by leveraging Nvidia's RAPIDS RAFT technology.
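The core operation being accelerated here is vector similarity search: scoring a query embedding against every stored embedding and returning the nearest neighbors. The sketch below is a CPU-side NumPy analogue for illustration only; the vectors are random toy data, and a GPU engine like RAFT parallelizes the same math across thousands of cores.

```python
# Brute-force cosine-similarity search over an embedding table.
# Toy data; real systems run this on GPU for large vector sets.
import numpy as np

def top_k_cosine(query: np.ndarray, vectors: np.ndarray, k: int = 3) -> np.ndarray:
    # Normalize rows, then one matrix-vector product scores every vector at once.
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q
    return np.argsort(scores)[::-1][:k]  # indices of the k most similar rows

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 64)).astype(np.float32)
# A query that is a slightly perturbed copy of row 42 should match row 42 first.
query = embeddings[42] + 0.01 * rng.standard_normal(64)
print(top_k_cosine(query, embeddings, k=1))
```

Because the whole scan is a single dense matrix product, it maps naturally onto GPU hardware, which is the speed advantage the benchmark number reflects.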
That GPU-based speed advantage will help Kinetica customers by enabling them to scan more of their data, including real-time data that has just been added to the database, without doing a lot of extra work, said Nima Negahban, co-founder and CEO of Kinetica.
“It’s hard for an LLM or a traditional RAG stack to be able to answer a question about something that’s happening right now, unless they’ve done a lot of pre-planning for specific data types,” Negahban told Datanami at the GTC conference last week, “whereas with Kinetica, we’ll be able to help you by looking at all the relational data, generate the SQL on the fly, and ultimately what we put just back in the context for the LLM is a simple text payload that the LLM will be able to understand to use to give the answer to the question.”
This essentially gives users the capability to talk to their complete corpus of relational enterprise data, without doing any preplanning.
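The flow Negahban describes can be sketched end to end: a question becomes SQL, the database executes it against live relational data, and the result is flattened into a plain-text payload for the LLM's context. In this illustration, sqlite3 stands in for Kinetica, and the SQL is hand-written where a model would generate it on the fly; the table and values are invented.

```python
# Sketch of a text-to-SQL RAG step against live operational data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (vehicle TEXT, speed REAL)")
conn.executemany("INSERT INTO trips VALUES (?, ?)",
                 [("truck-1", 54.0), ("truck-2", 61.5)])

# Step 1: SQL generated from the user's question
# ("Which vehicle is fastest right now?") -- by an LLM in the real stack.
generated_sql = "SELECT vehicle, speed FROM trips ORDER BY speed DESC LIMIT 1"

# Step 2: execute against the relational data as it exists this moment.
row = conn.execute(generated_sql).fetchone()

# Step 3: flatten the result into a simple text payload for the LLM's context.
payload = f"Fastest vehicle right now: {row[0]} at {row[1]} mph"
print(payload)
```

The point of the text payload in step 3 is that the LLM never needs to understand the schema or the query, only the answer, which is what lets this work without per-dataset pre-planning.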
“That’s the big advantage,” he continued, “because the traditional RAG pipelines right now, that part of it still requires a good amount of work as far as you have to have the right embedding model, you have to test it, you have to make sure it’s working for your use case.”
Kinetica can also talk to other databases and function as a generative federated query engine, as well as perform the traditional vectorization of data that customers put inside of Kinetica, Negahban said. The database is designed to be used for operational data, such as time-series, telemetry, or telco data. Thanks to its support for NVIDIA NeMo Retriever microservices, the company is able to position that data in a RAG workflow.
But for Kinetica, it all comes back to the GPU. Without the extreme computational power of the GPU, the company would have just another RAG offering.
“Basically you need that GPU-accelerated engine to make it all work at the end of the day, because it’s got to have the speed,” said Negahban, a 2018 Datanami Person to Watch. “And we then put all that orchestration on top of it as far as being able to have the metadata necessary, being able to connect to other databases, having all that to make it easy for the end user, so basically they can start taking advantage of all that relational enterprise data in their LLM interaction.”
Related Items:
Bank Replaces Hundreds of Spark Streaming Nodes with Kinetica
Kinetica Aims to Broaden Appeal of GPU Computing
Preventing the Next 9/11 Goal of NORAD’s New Streaming Data Warehouse