Friday, May 16, 2025

Basics of the GraphRAG Python package

This package contains the official Neo4j GraphRAG features for Python.

It gives developers a first-party package for which Neo4j can guarantee long-term commitment and maintenance, while shipping new features and high-performing patterns and methods quickly.

This package is a renamed continuation of neo4j-genai. The neo4j-genai package is deprecated and will no longer be maintained. All users are encouraged to migrate to the new package to keep receiving updates and support.

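To install the latest stable version:

pip install neo4j-graphrag

Optional dependencies are installed as extras. For example, to add OpenAI support:

pip install "neo4j-graphrag[openai]"

The available extras are: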


LLM providers (at least one is required for RAG and KG Builder Pipeline):

ollama: LLMs from Ollama

openai: LLMs from OpenAI (including AzureOpenAI)

google: LLMs from Vertex AI

cohere: LLMs from Cohere

anthropic: LLMs from Anthropic

mistralai: LLMs from MistralAI


sentence-transformers: to use embeddings from the sentence-transformers Python package


Vector database (to use External Retrievers):

weaviate: store vectors in Weaviate


pinecone: store vectors in Pinecone


qdrant: store vectors in Qdrant


experimental: experimental features mainly from the Knowledge Graph creation pipelines.


nlp:

spaCy: load spaCy trained models for nlp pipelines, used by SpaCySemanticMatchResolver component from the Knowledge Graph creation pipelines.


fuzzy-matching:

rapidfuzz: apply fuzzy matching using string similarity, used by FuzzyMatchResolver component from the Knowledge Graph creation pipelines.
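
Since each extra pulls in a third-party library, a quick way to see which ones are present in the current environment is to probe for their import names. The sketch below is only illustrative: the extra-to-module mapping is my assumption about the import name each extra provides (for example, that the qdrant extra exposes qdrant_client and the google extra exposes vertexai), and extras_installed is a hypothetical helper, not part of neo4j-graphrag.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to the top-level module it installs.
# These import names are assumptions, not taken from the package docs.
EXTRA_MODULES = {
    "ollama": "ollama",
    "openai": "openai",
    "google": "vertexai",
    "cohere": "cohere",
    "anthropic": "anthropic",
    "mistralai": "mistralai",
    "sentence-transformers": "sentence_transformers",
    "weaviate": "weaviate",
    "pinecone": "pinecone",
    "qdrant": "qdrant_client",
    "nlp": "spacy",
    "fuzzy-matching": "rapidfuzz",
}


def extras_installed(mapping=EXTRA_MODULES):
    """Return {extra: bool}, True when the extra's module can be located."""
    return {extra: find_spec(module) is not None for extra, module in mapping.items()}


if __name__ == "__main__":
    for extra, present in extras_installed().items():
        print(f"{extra:22} {'installed' if present else 'missing'}")
```

Anything reported as missing can be added by reinstalling with the matching extra in square brackets.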



The samples below walk through creating a vector index, populating it, and querying it.


Creating the vector index
=========================


from neo4j import GraphDatabase
from neo4j_graphrag.indexes import create_vector_index

URI = "neo4j://localhost:7687"
AUTH = ("neo4j", "password")

INDEX_NAME = "vector-index-name"

# Connect to the Neo4j database
driver = GraphDatabase.driver(URI, auth=AUTH)

# Create the index. `dimensions` must equal the length of the vectors
# produced by the embedding model: text-embedding-3-large, used in the
# retrieval sample further down, returns 3072-dimensional vectors by default.
create_vector_index(
    driver,
    INDEX_NAME,
    label="Document",
    embedding_property="vectorProperty",
    dimensions=3072,
    similarity_fn="euclidean",
)
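
The similarity_fn argument decides how the index scores a candidate against the query vector. As a rough illustration, the sketch below computes the two scores in plain Python, assuming the normalisations described in Neo4j's vector index documentation (1 / (1 + d^2) for euclidean, (1 + cos) / 2 for cosine). The function names are mine, and the database computes these scores itself; this is only meant to build intuition.

```python
import math


def euclidean_score(a, b):
    """Assumed Neo4j euclidean similarity: 1 / (1 + squared distance)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 / (1.0 + squared_distance)


def cosine_score(a, b):
    """Assumed Neo4j cosine similarity: (1 + cosine of the angle) / 2."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return (1.0 + dot / norm) / 2.0


if __name__ == "__main__":
    query = [1.0, 0.0]
    for candidate in ([1.0, 0.0], [0.0, 1.0], [3.0, 0.0]):
        print(candidate, euclidean_score(query, candidate), cosine_score(query, candidate))
```

Under these formulas both scores fall between 0 and 1, and higher means more similar. Cosine ignores vector length, so [1, 0] and [3, 0] score as identical, while euclidean penalises the difference in magnitude.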



Populating the vector index
===========================


from neo4j import GraphDatabase
from neo4j_graphrag.indexes import upsert_vectors
from neo4j_graphrag.types import EntityType

URI = "neo4j://localhost:7687"
AUTH = ("neo4j", "password")

# Connect to the Neo4j database
driver = GraphDatabase.driver(URI, auth=AUTH)

# Upsert the vector.
# `vector` is left as a placeholder here: supply the embedding (a list of
# floats) whose length matches the `dimensions` of the index.
vector = ...
upsert_vectors(
    driver,
    ids=["1234"],
    embedding_property="vectorProperty",
    embeddings=[vector],
    entity_type=EntityType.NODE,
)
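
An upsert is only useful if every vector has the length the index was created with. A small pre-flight check, sketched below in plain Python, can catch a mismatch before anything is written. check_dimensions is a hypothetical helper, not part of neo4j-graphrag, and expected_dim stands for whatever was passed as dimensions when the index was created.

```python
def check_dimensions(ids, embeddings, expected_dim):
    """Raise ValueError if ids and embeddings do not line up or a vector has the wrong length."""
    if len(ids) != len(embeddings):
        raise ValueError(
            f"got {len(ids)} ids but {len(embeddings)} embeddings"
        )
    for id_, vector in zip(ids, embeddings):
        if len(vector) != expected_dim:
            raise ValueError(
                f"embedding for id {id_!r} has {len(vector)} dimensions, "
                f"expected {expected_dim}"
            )


if __name__ == "__main__":
    # Toy 3-dimensional vectors, purely for illustration.
    check_dimensions(["a", "b"], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], expected_dim=3)
    print("all vectors have the expected length")
```

Calling it right before upsert_vectors with the same ids and embeddings keeps the failure local and the error message readable.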


Retrieving the documents
========================


from neo4j import GraphDatabase
from neo4j_graphrag.embeddings.openai import OpenAIEmbeddings
from neo4j_graphrag.retrievers import VectorRetriever

URI = "neo4j://localhost:7687"
AUTH = ("neo4j", "password")

INDEX_NAME = "vector-index-name"

# Connect to the Neo4j database
driver = GraphDatabase.driver(URI, auth=AUTH)

# Create the embedder object
# Note: an OPENAI_API_KEY environment variable is required here
embedder = OpenAIEmbeddings(model="text-embedding-3-large")

# Initialize the retriever
retriever = VectorRetriever(driver, INDEX_NAME, embedder)

# Run the similarity search
query_text = "How do I do similarity search in Neo4j?"
response = retriever.search(query_text=query_text, top_k=5)
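
To inspect what the search returned, the result can be flattened into (score, text) pairs. The sketch below assumes, based on my reading of the package's result types, that the response exposes an items list whose entries carry content and an optional metadata dict containing a score key. top_hits is a hypothetical helper, and the demo uses stand-in objects rather than a live database.

```python
from types import SimpleNamespace


def top_hits(result, max_chars=80):
    """Return [(score, truncated_content), ...] from a retriever-style result."""
    hits = []
    for item in result.items:
        # metadata may be None, so fall back to an empty dict
        score = (item.metadata or {}).get("score")
        hits.append((score, str(item.content)[:max_chars]))
    return hits


if __name__ == "__main__":
    # Stand-in for a real response; the attribute names are assumptions.
    fake_response = SimpleNamespace(
        items=[
            SimpleNamespace(content="Vector indexes in Neo4j", metadata={"score": 0.91}),
            SimpleNamespace(content="Graph data modelling", metadata=None),
        ]
    )
    for score, text in top_hits(fake_response):
        print(score, text)
```

With a real connection, passing the response from retriever.search to top_hits should give the same kind of list, provided the attribute names match the installed version.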



References:

https://neo4j.com/docs/neo4j-graphrag-python/current/

