Build Agents with Real-time Context


Most agents reason over stale snapshots. Kinetica resolves vector, SQL, graph, and spatial in one GPU query plan — on live streams, across billions of rows.

Real-time · GPU-accelerated · converged · at scale

The category

Three ways teams build agent retrieval today. Two stitch separate systems together — one runs it all in a single query plan.

[Diagram: Real-time analytics · no retrieval. The agent and LLM query an analytics database (SQL · OLAP, columnar; no vector, no graph), while vector, graph, and spatial retrieval each live in an external system. The agent juggles 4 systems · 4 latencies · 4 auth surfaces.]
CLICKHOUSE · SINGLESTORE · DRUID

Fast OLAP. No vector, no graph, no spatial.

Built for sub-second analytics on structured tables. Excellent at filtering and aggregating, but the agent's retrieval surface — semantic search over documents, graph traversal, geospatial joins — lives in other systems. The agent has to choreograph all of them.

Fast structured queries on streaming data
No native vector search, graph, or geospatial
Agent stitches results across systems in tool-call code
The mechanism

Converged is easy to claim. GPU is why it's fast — and fresh.

Kinetica architecture for agentic retrieval & analytics
Ingest · Engine · Access

Ingest: streaming sources, high-speed parallel ingest
Kafka · streaming
Data lake · S3 / GCS
Data warehouse
Operational DBs · CDC

Engine: fully vectorized processing engine, one query plan across every retrieval mode
Vector: CAGRA · HNSW, similarity · ANN
SQL · OLAP: Postgres-compatible, columnar · joins
Spatial: ST_* · geofences, tracks · WMS
Graph: DLS · Cypher, solve · match
Time series: ASOF · WINDOW, continuous views
Generative: NL2SQL · embed, inline LLM calls
Runs on NVIDIA GPU and SIMD CPU
Tiered storage · columnar · lockless

Access: agent access surfaces, whatever the agent already speaks
MCP: Model Context Protocol
Natural language → SQL: NL2SQL · agent writes its own queries
LangChain · LangGraph: native integrations
Postgres wire · REST · SDK: Python · Java · JDBC
Callers: AI agents (LLM) · Co-pilots (app) · Dashboards (BI)

Every mode · every caller · one transactional store
Streaming data queryable in seconds: no rebuild, no re-embed, no second cluster.

Hybrid retrieval, one SQL

A single SELECT can run ANN vector lookup, filter on tabular predicates, traverse a graph, and apply a geofence — all on the same scan, all on GPU.
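To make the idea of "one pass, several predicates" concrete, here is a minimal pure-Python sketch, not Kinetica SQL and not its API. It assumes a hypothetical in-memory table, uses exact cosine similarity in place of ANN, a bounding box in place of a real geofence, and leaves out graph traversal entirely. All row values and names (`ROWS`, `hybrid_search`, `SF_BOX`) are invented for illustration.

```python
import math

# Hypothetical rows standing in for a live table (values are made up).
ROWS = [
    {"id": 1, "tier": "gold", "lon": -122.41, "lat": 37.77, "emb": [1.0, 0.0]},
    {"id": 2, "tier": "gold", "lon": -73.98, "lat": 40.75, "emb": [0.9, 0.1]},
    {"id": 3, "tier": "free", "lon": -122.40, "lat": 37.78, "emb": [0.8, 0.2]},
    {"id": 4, "tier": "gold", "lon": -122.42, "lat": 37.76, "emb": [0.0, 1.0]},
]


def cosine(a, b):
    """Exact cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def in_box(row, box):
    """Bounding-box test used here as a stand-in for a geofence."""
    min_lon, min_lat, max_lon, max_lat = box
    return min_lon <= row["lon"] <= max_lon and min_lat <= row["lat"] <= max_lat


def hybrid_search(rows, query_emb, tier, box, k):
    """Filter on a tabular predicate and a box, then rank by similarity,
    all in a single pass over the same rows."""
    scored = [
        (cosine(r["emb"], query_emb), r["id"])
        for r in rows
        if r["tier"] == tier and in_box(r, box)
    ]
    scored.sort(reverse=True)
    return [row_id for _, row_id in scored[:k]]


SF_BOX = (-122.52, 37.70, -122.35, 37.83)  # rough box around San Francisco
result = hybrid_search(ROWS, [1.0, 0.0], "gold", SF_BOX, k=2)
print(result)
```

Row 2 is dropped by the box and row 3 by the tier predicate, so only rows 1 and 4 are ever scored. The point of the sketch is that filtering and ranking see the same rows at the same moment, which is what a federated stack has to reassemble across systems.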

Streaming-fresh embeddings

Embeddings are computed and indexed as data arrives. There is no nightly re-embed batch and no drift between operational facts and what the agent retrieves.

Speaks every agent surface

MCP for tool-using agents, NL2SQL for LLMs writing their own queries, LangChain for orchestration, Postgres wire for everything else.

The numbers

Independent benchmarks, same hardware, published. The engine that removes the network hops also wins on raw speed.

Structured retrieval

TPC-DS SF-200 · More is better
99 of 99

Can the engine run the full enterprise SQL suite an agent's structured queries resemble? JOIN-heavy, aggregation-heavy, at scale.

Kinetica: 98 / 99 run
ClickHouse: partial suite

Latency | Kinetica | Suite
< 1s | 22 queries | of 98
< 10s | 80 queries | of 98
Total | 98 run | of 99

Semantic retrieval

VectorDBBench · Less is better
faster

How fast can new embeddings be ingested and made queryable — so the agent's recall stays fresh inside the turn?

Kinetica: 5× ingest
Prior leader: 1× baseline

Stage | Kinetica | Vector DB
Index | GPU, live | batch
Fresh | immediate | re-index
No index | exact NN | n/a

In the loop

Converged execution · One plan
1 query plan

When a turn needs vector + filter + join + traverse together, how many systems and network hops does it cross?

Kinetica: 0 hops
Federated stack: 3–5 hops

Per turn | Kinetica | Federated
Engines | 1 | 3–5
Auth surfaces | 1 | 3–5
Consistency | 1 view | N windows

Methodology
Structured retrieval: independent Radiant Advisors analysis of TPC-DS SF-200, comparing Kinetica 7.2 with ClickHouse 25.10 on identical hardware. Semantic retrieval: VectorDBBench, NVIDIA GTC 2024. GPU acceleration adds further headroom on both. "In the loop" counts systems crossed, not timings.
Full SQL and data →
The loop

An agent makes six to twelve blocking retrievals per turn, each gating the next. It's a new kind of user — with a latency budget.
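Because each retrieval gates the next, their latencies add rather than overlap. The arithmetic below is a small sketch with invented numbers: the per-retrieval engine time and per-hop network cost are assumptions chosen only to show how the sum behaves, not measurements of any system.

```python
def turn_latency_ms(retrievals, engine_ms, hop_ms):
    """Total wall-clock time for a chain of blocking retrievals.

    Each retrieval waits for the previous one, so the costs simply add.
    """
    return retrievals * (engine_ms + hop_ms)


# Hypothetical figures: 20 ms of engine work per retrieval,
# 15 ms of network/auth/serialization overhead per cross-system hop.
with_hops = turn_latency_ms(retrievals=12, engine_ms=20, hop_ms=15)
without_hops = turn_latency_ms(retrievals=12, engine_ms=20, hop_ms=0)

print(with_hops, without_hops)  # 420 240
```

Under these assumed numbers, hop overhead alone accounts for 180 ms of a 12-retrieval turn, before the LLM has produced a single token.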

As Janakiram MSV observed in Forbes after OpenAI's Rockset acquisition: production AI didn't need another vector database — it needed real-time retrieval over operational data. Pure vector stores are a feature, not a product.

Agent loop · single turn · left to right
Legend: in-loop hop · federated cost · Kinetica native

Start: prompt (user · agent)
01 · KV · Session (working memory): load tool state, prior steps, user prefs from this run. If federated: network hop, auth · serialize.
02 · Vector · Recall (semantic ANN): embed prompt, retrieve similar docs / memories. If federated: + network hop, stale embeddings.
03 · SQL · Filter (join · predicate): narrow by account, tier, permission; join live tables. If federated: + network hop, consistency gap.
04 · Graph · Traverse (entity · hop): walk relations · entity resolution · dependency. If federated: + separate db, re-shape data.
05 · Time · Weight (recency · ASOF): weight by recency, ASOF join streams that drift. If federated: + ETL window, snapshot drift.
06 · Spatial · Locate (geofence · ST_*): filter by region, geofence, route, proximity. If federated: + PostGIS hop, geometry shuffle.
End: answer (LLM · synth)

If federated across six systems: 5 network hops · 5 auth surfaces · 5 consistency boundaries · per agent turn
On Kinetica · one query plan · zero hops: all six retrieval modes execute against the same tables, in the same scan, on GPU.

01 · Key-value
Working memory

Parallel KV lookups at 100k+ reads/sec against the same tables — no separate Redis.

02 · Vector
Semantic recall

CAGRA & HNSW on GPU. 5× faster ingest on VectorDBBench; embeddings stay fresh.

03 · SQL
Filter & join

All 99 TPC-DS queries where ClickHouse runs a partial suite — vector + filter + join in one statement.

04 · Graph
Traversal

Native graph over the same tables as your SQL. Solve, match, traverse in one query.

05 · Time-series
Recency & ASOF

Vectorized ASOF and WINDOW operators on GPU. Continuous views as ticks arrive.

06 · Spatial
Location & geofence

Native ST_* operators and in-database tile rendering, fused with vector and SQL.
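For readers unfamiliar with the ASOF join named in card 05: it pairs each row with the most recent row from another stream at or before its timestamp. The sketch below shows that semantics in plain Python with a binary search. It is not Kinetica's operator or syntax, and the `trades` and `quotes` data are invented.

```python
from bisect import bisect_right


def asof_join(left, right):
    """For each (ts, value) in left, attach the value of the latest
    right row whose timestamp is <= ts, or None if there is none."""
    right_sorted = sorted(right)
    right_ts = [ts for ts, _ in right_sorted]
    joined = []
    for ts, value in left:
        # Index of the first right timestamp strictly greater than ts.
        i = bisect_right(right_ts, ts)
        match = right_sorted[i - 1][1] if i > 0 else None
        joined.append((ts, value, match))
    return joined


# Hypothetical streams: (timestamp, payload)
trades = [(3, "buy"), (7, "sell"), (1, "buy")]
quotes = [(2, 100.0), (5, 101.5), (6, 101.0)]

joined = asof_join(trades, quotes)
print(joined)
```

The trade at timestamp 1 arrives before any quote, so it joins to `None`. The trade at 7 picks up the quote from 6 rather than the earlier one from 5.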

Federated · the hidden cost

Every modality boundary is a network hop, an auth surface, and a consistency window. Errors compound: a stale read at step 3 of a 12-step agent loop is confidently wrong by step 12. Strong consistency isn't a nice-to-have for agents — it's a correctness requirement.
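The compounding effect can be put in rough numbers. The sketch below assumes each step independently has some small chance of reading stale data. Both the independence assumption and the 2% rate are illustrative, not measured.

```python
def p_any_stale(p_stale_per_step, steps):
    """Probability that at least one of `steps` independent reads is stale."""
    return 1.0 - (1.0 - p_stale_per_step) ** steps


# Hypothetical: a 2% chance of a stale read at each of 12 steps.
p12 = p_any_stale(0.02, 12)
print(round(p12, 3))
```

With those assumed inputs, roughly one turn in five contains at least one stale read, even though each individual read looks 98% reliable.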

Unified · the architectural payoff

All six retrieval modes run as column types in the same engine. The agent issues one SQL statement; Kinetica fans out across vector, structured, graph, time-series, and spatial inside a single query plan. One auth surface. One transactional view. No drift.

The toolkit

However your agent talks, Kinetica answers on the same engine and the same tables — the framework never constrains the database.

Model Context Protocol · native

Kinetica is an MCP server.

Any MCP-capable agent — Claude, Cursor, Copilot, Codex — discovers Kinetica's tables, schemas, and tools and queries them through the standard protocol. No glue code, no custom adapter.

Hybrid retrieval (vector + filter + join + traverse) returns as a single tool result.

# point any MCP client at the server
connect mcp://kinetica.your-domain.com

# the agent now sees tables + tools
tools: query_sql, vector_search,
       graph_solve, st_filter
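Under the hood, MCP clients talk to servers with JSON-RPC 2.0 messages. The sketch below builds the two requests a client would typically send: `tools/list` to discover tools and `tools/call` to invoke one. The tool name `query_sql` comes from the listing above, but the argument name `sql` and the table name `orders` are assumptions. A real client would read the tool's input schema from the `tools/list` response rather than guess it.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request object as used by MCP."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# Step 1: ask the server which tools it exposes.
list_tools = jsonrpc_request(1, "tools/list")

# Step 2: call one tool. The "sql" argument key and the query text are
# hypothetical; the real schema comes from the tools/list response.
call_tool = jsonrpc_request(
    2,
    "tools/call",
    {
        "name": "query_sql",
        "arguments": {"sql": "SELECT COUNT(*) FROM orders"},
    },
)

wire = json.dumps(call_tool)
print(wire)
```

The transport that carries these messages (stdio or HTTP) is configured in the MCP client and is left out of this sketch.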
NVIDIA NIM · colocated

Embedding and inference next to the data.

Kinetica invokes NIM-hosted embedding and LLM models directly from SQL. Embeddings are generated as data arrives — no separate pipeline, no extra hop, no model server round-trip per row. The retrieval path and the generation path run on the same GPU fabric.

The skills

Your coding agent already knows how to use Kinetica.

Two skills. One install command. Eleven agent platforms.

kinetica-execute teaches agents SQL analytics, geospatial, graph, time-series, security, and admin — with a live dual-runtime CLI for running queries directly. kinetica-code teaches the Python SDK and embedded SQL for application developers. Both install in one command and activate based on what the agent is being asked to do.

Claude Code · Cursor · OpenAI Codex · Windsurf · Gemini CLI · GitHub Copilot · Roo Code · Cline · Aider · Continue · Amazon Q · and any agent that reads SKILL.md
$ npx skills add kineticadb/agent-skills
github.com/kineticadb/agent-skills

One engine.
Every retrieval mode.

Frequently asked questions

How is this different from a vector database like Pinecone or Milvus?
Pure vector databases solve one retrieval shape — approximate nearest-neighbor over embeddings. Real agent turns also need to filter by account/permission, join to live operational tables, traverse relationships, weight by recency, and apply geofences. Kinetica runs vector similarity as one of several operators in the same vectorized query plan — alongside SQL, graph, time-series, and spatial — so a hybrid retrieval is one statement, not five tool calls. Independent benchmarks measure 5–14× faster ANN than pure vector DBs on VectorDBBench.
Do I need to maintain a separate embedding pipeline?
No. Embeddings are generated in-database as rows arrive — Kinetica invokes NVIDIA NIM-hosted embedding models from SQL on the same GPU fabric that serves queries. There is no nightly re-embed batch, no embedding store to keep in sync with operational data, and no round trip to an external model server per row. The agent retrieves against embeddings that reflect the last few seconds of writes.
How does an agent connect to Kinetica?
Any way it wants. Kinetica is a native MCP server — Claude, Cursor, Copilot, Codex, and any MCP-capable agent discover its tables and tools through the standard protocol. It also speaks Postgres wire (psql, JDBC, ODBC, SQLAlchemy), exposes a REST API, ships Python and Java SDKs, integrates with LangChain and LangGraph as a first-class tool node, and supports NL2SQL for agents that prefer to write their own queries. The choice of agent framework never constrains the choice of database.
What is hybrid retrieval, exactly?
Inside one SQL statement, the planner runs vector similarity (CAGRA / HNSW on GPU), filters by structured predicates, joins to live operational tables, traverses a graph, applies ASOF / WINDOW operators for recency, and evaluates ST_* spatial predicates — against the same tables, in the same scan, with one transactional view. In a federated stack, those steps become network hops, auth boundaries, and consistency windows; on Kinetica they're column types in the same engine.
Will my coding agent know how to write Kinetica SQL?
Yes — we ship two open-source skill plugins. kinetica-execute teaches agents SQL analytics, geospatial, graph, time-series, security, and admin (with a dual-runtime CLI for running queries directly). kinetica-code teaches the Python SDK and embedded SQL. One install command — npx skills add kineticadb/agent-skills — activates them across Claude Code, Cursor, OpenAI Codex, Windsurf, Gemini CLI, GitHub Copilot, Roo Code, Cline, Aider, Continue, Amazon Q, and any agent that reads SKILL.md.
Can I run Kinetica next to my existing data warehouse?
Yes. Kinetica ingests from Kafka, S3 / GCS data lakes, data warehouses, and operational databases via CDC. Most agentic deployments leave the warehouse where it is and use Kinetica as the real-time retrieval layer in front — the warehouse continues to serve BI; Kinetica serves the agent's live retrieval and analytics in one engine with sub-second response times on streaming data.
