This project uses uv as the package manager. After cloning the repository and installing uv, run the following commands:
```bash
cd EventRAG
uv venv
source .venv/bin/activate
uv sync
uv pip install -e .
```
Use the Python snippet below (in a script) to initialize EventRAG and perform queries:
```python
import os

from eventrag import EventRAG, QueryParam
from eventrag.llm import gpt_4o_mini_complete, gpt_4o_complete

#########
# Uncomment the below two lines if running in a jupyter notebook to handle the async nature of rag.insert()
# import nest_asyncio
# nest_asyncio.apply()
#########

WORKING_DIR = "./dickens"

if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)

rag = EventRAG(
    working_dir=WORKING_DIR,
    llm_model_func=gpt_4o_mini_complete  # Use gpt_4o_mini_complete LLM model
    # llm_model_func=gpt_4o_complete  # Optionally, use a stronger model
)

with open("./book.txt") as f:
    rag.insert(f.read())

# Perform multi-event reasoning
print(rag.query("What are the top themes in this story?", param=QueryParam(mode="agent")))
```
The code can be found in the `./reproduce` directory. In all experiments, we use the `gpt_4o_complete` model for LLM generation and the `openai_embedding` function for embedding.
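For reference, a minimal sketch of what this experiment configuration could look like in code. The working directory is hypothetical, and importing `openai_embedding` from `eventrag.llm` is an assumption (only the two completion functions are imported in the snippet above):

```python
import os

from eventrag import EventRAG
# Assumption: openai_embedding is exposed alongside the completion functions in eventrag.llm.
from eventrag.llm import gpt_4o_complete, openai_embedding

WORKING_DIR = "./experiments_cache"  # hypothetical working directory
os.makedirs(WORKING_DIR, exist_ok=True)

# Mirror the experimental setup described above: gpt_4o_complete for LLM generation,
# openai_embedding for embeddings (both are EventRAG constructor parameters; see the table below).
rag = EventRAG(
    working_dir=WORKING_DIR,
    llm_model_func=gpt_4o_complete,
    embedding_func=openai_embedding,
)
```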
We use Neo4J as the graph storage and Milvus as the vector storage for all experiments. You can set them up using the docker-compose file in the `./dockers` directory.
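Storage backends are selected by class name through the string-valued `graph_storage` and `vector_storage` parameters (see the table below). The sketch below is only an assumption about how that wiring looks: `Neo4JStorage` is listed as a supported graph storage type, but the Milvus-backed vector storage class name is guessed, as is the way connection details are supplied.

```python
from eventrag import EventRAG
from eventrag.llm import gpt_4o_complete

rag = EventRAG(
    working_dir="./experiments_cache",       # hypothetical path, as in the sketch above
    llm_model_func=gpt_4o_complete,
    graph_storage="Neo4JStorage",            # listed as a supported graph storage type
    vector_storage="MilvusVectorDBStorage",  # assumed class name; not listed in the table below
)
# Connection details for the Neo4j and Milvus containers started by the docker-compose
# file are assumed to come from the environment; the exact mechanism is not shown here.
```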
Parameter | Type | Explanation | Default |
---|---|---|---|
`working_dir` | `str` | Directory where the cache will be stored | `eventrag_cache+timestamp` |
`kv_storage` | `str` | Storage type for documents and text chunks. Supported types: `JsonKVStorage`, `OracleKVStorage` | `JsonKVStorage` |
`vector_storage` | `str` | Storage type for embedding vectors. Supported types: `NanoVectorDBStorage`, `OracleVectorDBStorage` | `NanoVectorDBStorage` |
`graph_storage` | `str` | Storage type for graph edges and nodes. Supported types: `NetworkXStorage`, `Neo4JStorage`, `OracleGraphStorage` | `NetworkXStorage` |
`log_level` | | Log level for application runtime | `logging.DEBUG` |
`chunk_token_size` | `int` | Maximum token size per chunk when splitting documents | `1200` |
`chunk_overlap_token_size` | `int` | Overlap token size between two chunks when splitting documents | `100` |
`tiktoken_model_name` | `str` | Model name for the Tiktoken encoder used to calculate token numbers | `gpt-4o-mini` |
`entity_extract_max_gleaning` | `int` | Number of loops in the entity extraction process, appending history messages | `1` |
`entity_summary_to_max_tokens` | `int` | Maximum token size for each entity summary | `500` |
`node_embedding_algorithm` | `str` | Algorithm for node embedding (currently not used) | `node2vec` |
`node2vec_params` | `dict` | Parameters for node embedding | `{"dimensions": 1536, "num_walks": 10, "walk_length": 40, "window_size": 2, "iterations": 3, "random_seed": 3}` |
`embedding_func` | `EmbeddingFunc` | Function to generate embedding vectors from text | `openai_embedding` |
`embedding_batch_num` | `int` | Maximum batch size for embedding processes (multiple texts sent per batch) | `32` |
`embedding_func_max_async` | `int` | Maximum number of concurrent asynchronous embedding processes | `16` |
`llm_model_func` | `callable` | Function for LLM generation | `gpt_4o_mini_complete` |
`llm_model_name` | `str` | LLM model name for generation | `meta-llama/Llama-3.2-1B-Instruct` |
`llm_model_max_token_size` | `int` | Maximum token size for LLM generation (affects entity relation summaries) | `32768` |
`llm_model_max_async` | `int` | Maximum number of concurrent asynchronous LLM processes | `16` |
`llm_model_kwargs` | `dict` | Additional parameters for LLM generation | |
`vector_db_storage_cls_kwargs` | `dict` | Additional parameters for vector database (currently not used) | |
`enable_llm_cache` | `bool` | If `TRUE`, stores LLM results in cache; repeated prompts return cached responses | `TRUE` |
`addon_params` | `dict` | Additional parameters, e.g., `{"example_number": 1, "language": "Simplified Chinese", "entity_types": ["organization", "person", "geo", "event"]}`: sets example limit and output language | `example_number: all examples, language: English` |
`convert_response_to_json_func` | `callable` | Not used | `convert_response_to_json` |
`embedding_cache_config` | `dict` | Configuration for question-answer caching. Contains three parameters: `enabled` (bool) enables/disables cache lookup; when enabled, the system checks cached responses before generating new answers. `similarity_threshold` (float, 0-1) is the similarity threshold; when a new question's similarity with a cached question exceeds it, the cached answer is returned directly without calling the LLM. `use_llm_check` (bool) enables/disables LLM similarity verification; when enabled, the LLM is used as a secondary check on question similarity before a cached answer is returned. | `{"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False}` |
MIT License