Monday, September 1, 2025

What is Zep GraphRAG memory?

Zep’s Graph RAG is a dynamic, temporally aware retrieval system built on a continuously updated knowledge graph (Graphiti). Unlike standard RAG (Retrieval-Augmented Generation), which usually works with static documents, Zep’s Graph RAG is designed to handle real-time business and conversational data, including customer interactions, support tickets, and preferences, without expensive batch recomputation.


Claims and Strengths of Zep’s Approach

Continuous Data Integration: Ingests JSON, text, chat, or structured business data in real time; new data immediately becomes part of the knowledge graph.

Temporal Fact Management: Keeps a historical timeline of facts and tracks which are current and which have been invalidated, helping agents reason about evolving situations.

Relationship-Aware Retrieval: Retrieves context across related entities; for example, given a customer, it fetches their support tickets, purchases, and preferences.

Shared Knowledge Graphs: Supports graphs shared across users or agents, giving them centralized, domain-wide context.

Custom Entity Types: Developers can model domain-specific entities (customers, products, projects) and define relationships relevant to their business.

Low-Latency, Scalable: Handles large-scale datasets at millisecond-scale latencies by combining multiple retrieval strategies.

Temporal Agent Memory Layer: Built for agent memory, not just RAG. The architecture, powered by Graphiti, handles both unstructured conversations and structured data with temporal insight.

 



In Summary:


Zep’s Graph RAG Memory evolves standard RAG by making it:

Dynamic (real-time updates)

Temporal (historical context and fact validity)

Fast (millisecond-level retrieval)

Context-rich (relationship-aware and domain-aware)


This enables AI agents to retrieve up-to-date, nuanced, temporal context rather than static snapshots, which improves accuracy, reduces hallucinations, and supports smarter decision-making.


What is Graphiti?

Graphiti addresses static RAG’s limitations with dynamic data. It is a real-time, temporally aware knowledge graph engine that incrementally processes incoming data, updating entities, relationships, and communities as the data arrives, without batch recomputation. Graphiti isn’t just another retrieval tool; it’s a persistent source of context for agents, continuously available and updated.

Graphiti’s real-time incremental architecture is built for frequent updates. It continuously ingests new data episodes (events or messages), extracting and immediately resolving entities and relationships against existing nodes.
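
As a rough illustration of this incremental pattern, the Python sketch below merges each episode into an in-memory graph without rebuilding it. The Graph class, its resolve and ingest methods, and the name-normalization rule are hypothetical stand-ins, not Graphiti’s API; a real pipeline would extract the triples automatically and resolve entities far more robustly than by lowercasing names.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    # Maps a normalized entity name to the label it was first seen with.
    nodes: dict[str, str] = field(default_factory=dict)
    # Stores (source, relation, target) triples using normalized node keys.
    edges: set[tuple[str, str, str]] = field(default_factory=set)

    def resolve(self, name: str) -> str:
        """Return the key of a matching node, creating the node only if needed."""
        key = " ".join(name.lower().split())  # assumed, deliberately naive matching rule
        self.nodes.setdefault(key, name)
        return key

    def ingest(self, episode: list[tuple[str, str, str]]) -> None:
        """Merge one episode's already-extracted triples into the graph in place."""
        for source, relation, target in episode:
            self.edges.add((self.resolve(source), relation, self.resolve(target)))


graph = Graph()
graph.ingest([("Alice Chen", "OPENED", "Ticket 42")])
graph.ingest([("alice  chen", "PURCHASED", "Pro Plan")])  # resolves to the existing node
```

The second episode touches only the nodes and edges it mentions, so the cost of an update depends on the size of the episode rather than the size of the graph.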

A key feature is Graphiti’s bi-temporal model, which tracks both when an event occurred and when it was ingested. Every graph edge (or relationship) includes an explicit validity interval (t_valid, t_invalid). Graphiti uses semantic, keyword, and graph search to determine whether new knowledge conflicts with existing knowledge. When conflicts arise, Graphiti uses the temporal metadata to update or invalidate, rather than discard, outdated information, preserving historical accuracy without large-scale recomputation.
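
A minimal sketch of the invalidate-but-keep idea follows. Only the t_valid and t_invalid field names come from the description above; the Edge class, the ingested_at field, the add_fact and facts_at helpers, and the conflict rule (one current target per source and relation) are assumptions made for illustration and should not be read as Graphiti’s implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Edge:
    source: str
    relation: str
    target: str
    t_valid: datetime                       # event time: when the fact became true
    t_invalid: Optional[datetime] = None    # event time: when it stopped being true
    ingested_at: Optional[datetime] = None  # system time: when the fact was recorded


def add_fact(edges: list[Edge], new: Edge, ingested_at: datetime) -> None:
    """Append a fact, closing any current edge it contradicts instead of deleting it."""
    new.ingested_at = ingested_at
    for edge in edges:
        conflicts = (
            edge.source == new.source
            and edge.relation == new.relation
            and edge.target != new.target
            and edge.t_invalid is None
            and edge.t_valid <= new.t_valid
        )
        if conflicts:
            edge.t_invalid = new.t_valid  # the old fact stays queryable as history
    edges.append(new)


def facts_at(edges: list[Edge], when: datetime) -> list[Edge]:
    """Return the edges that were valid at a given point in event time."""
    return [
        e for e in edges
        if e.t_valid <= when and (e.t_invalid is None or when < e.t_invalid)
    ]


edges: list[Edge] = []
add_fact(edges, Edge("alice", "SUBSCRIBED_TO", "Basic", datetime(2025, 1, 1)), datetime(2025, 1, 2))
add_fact(edges, Edge("alice", "SUBSCRIBED_TO", "Pro", datetime(2025, 6, 1)), datetime(2025, 6, 3))
```

Both edges remain in the list after the second call. Querying facts_at for a date in March returns the Basic subscription, while a date in July returns Pro, which is the kind of historical question a store that overwrites facts cannot answer.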

Fast Query Speeds: Instant Retrieval Without LLM Calls

Graphiti is built for speed. Zep’s own Graphiti implementation returns results at a P95 latency of 300 ms. This is enabled by a hybrid search approach that combines semantic embeddings, keyword (BM25) search, and direct graph traversal, avoiding any LLM calls during retrieval.
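
One common way to merge several ranked result lists without calling an LLM is reciprocal rank fusion (RRF). The text above does not say which fusion method is used, so the sketch below is an assumption chosen for illustration; the function name, the constant k=60, and the sample fact IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists; items ranked highly in several lists score best."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID so the order is deterministic.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical result lists from three independent retrievers.
semantic = ["f3", "f1", "f2"]    # embedding similarity
keyword = ["f1", "f3", "f4"]     # BM25
graph_hops = ["f1", "f5"]        # neighbors reached by graph traversal

fused = reciprocal_rank_fusion([semantic, keyword, graph_hops])
```

Here f1 ranks first because all three retrievers return it near the top. The fusion step is plain arithmetic over ranks, so it adds very little latency on top of the searches themselves.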

Graphiti represents a meaningful departure from traditional RAG methods, specifically because it was built from the ground up as a memory infrastructure for dynamic agentic systems. Graphiti offers incremental, real-time updates through its temporally aware knowledge graph. This design means engineers no longer need to recompute entire graphs when data changes. Instead, Graphiti incrementally integrates updates, resolves conflicts based on temporal metadata, and maintains an accurate historical state.

By removing the bottleneck of LLM-driven summarization at query time, Graphiti achieves the practical latency levels that engineers require for interactive real-world applications. Its hybrid indexing system, which combines semantic embeddings, keyword search, and graph traversal, allows rapid retrieval in near-constant time, independent of graph scale. With intuitive tools like custom entity types implemented through familiar structures such as Pydantic models, Graphiti addresses a significant capability gap in agent development, equipping engineers with a robust, performant, and genuinely dynamic memory layer.
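
To make the custom entity type idea concrete, here is a small sketch of domain entities declared as Pydantic models. The Customer and Product classes, their fields, and the entity_types dictionary are hypothetical examples; how such models are registered with Graphiti is not described here, so the sketch stops at defining and validating them.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Customer(BaseModel):
    """A customer the agent should remember."""
    tier: Optional[str] = Field(None, description="Subscription tier, e.g. 'pro'")
    lifetime_value: Optional[float] = Field(None, description="Total spend in USD")


class Product(BaseModel):
    """A product a customer can buy."""
    sku: str = Field(..., description="Stock keeping unit")
    price: float = Field(..., description="List price in USD")


# Hypothetical registry mapping entity type names to their models.
entity_types = {"Customer": Customer, "Product": Product}

# Pydantic validates and coerces attribute values when an entity is created.
product = entity_types["Product"](sku="PRO-1", price="49.00")

try:
    Product(sku="PRO-2", price="not a number")
    rejected = False
except ValidationError:
    rejected = True  # malformed attributes are caught before they reach the graph
```

Declaring entities this way gives each type a documented, validated schema, which is the main appeal of reusing a structure that Python developers already know.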