
[Bug]: Token tracker always return 0 in any RAGAnything doc process and aquery func #2325

@Peefy

Description

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • I believe this is a legitimate bug, not just a question or feature request.

Describe the bug

TokenTracker always returns 0 for every RAGAnything document-processing and aquery call.

Steps to reproduce

from lightrag.utils import TokenTracker

async def test_blob_get_or_create_rag_instance():
    rag = await get_or_create_rag_instance(user_id)
    token_tracker = TokenTracker()
    with token_tracker:
        await rag.process_document_complete("md.md")
    print(token_tracker.get_usage())

    with token_tracker:
        print(await rag.aquery("Who is Timi"))
    print(token_tracker.get_usage())

The logs show that the LLM call count is zero:

Content Information:
INFO: * Total blocks in content_list: 4
INFO: * Content block types:
INFO:   - text: 4
INFO: Content separation complete:
INFO:   - Text content length: 1016 characters
INFO:   - Multimodal items count: 0
INFO: Starting text content insertion into LightRAG...
INFO: Processing 1 document(s)
INFO: Extracting stage 1/1: md.md
INFO: Processing d-id: doc-6326028b4ca1cf0b066c1cda95221af1
INFO: Embedding func: 8 new workers initialized (Timeouts: Func: 30s, Worker: 60s, Health Check: 75s)
INFO: LLM func: 4 new workers initialized (Timeouts: Func: 180s, Worker: 360s, Health Check: 375s)
INFO:  == LLM cache == saving: default:extract:6d8865786e257fde6b26919023bc73fe
INFO:  == LLM cache == saving: default:extract:b7dda58774e34bb806b907b1dd6dd6ec
INFO: Chunk 1 of 1 extracted 15 Ent + 14 Rel chunk-f49cb571a13444ccc029bda23ec1b5e4
INFO: Merging stage 1/1: md.md
INFO: Phase 1: Processing 15 entities from doc-6326028b4ca1cf0b066c1cda95221af1 (async: 8)
INFO: Phase 2: Processing 14 relations from doc-6326028b4ca1cf0b066c1cda95221af1 (async: 8)
INFO: Phase 3: Updating final 15(15+0) entities and  14 relations from doc-6326028b4ca1cf0b066c1cda95221af1
INFO: Completed merging: 15 entities, 0 extra entities, 14 relations
INFO: [_] Writing graph with 15 nodes, 14 edges
INFO: In memory DB persist to disk
INFO: Completed processing file 1/1: md.md
INFO: Enqueued document processing pipeline stopped
INFO: Text content insertion complete
INFO: Document md.md processing complete!
LLM call count: 0, Prompt tokens: 0, Completion tokens: 0, Total tokens: 0

When I process a different file, the log still reports LLM call count: 0.
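For context on why the counts can stay at zero even though the logs show cache saves (i.e. real LLM calls), here is a minimal sketch of the mechanism, assuming lightrag's TokenTracker behaves like this simplified stand-in: entering the context manager records nothing by itself; usage is only counted when the LLM wrapper explicitly reports it to the tracker. `MiniTracker`, `llm_without_tracker`, and `llm_with_tracker` are hypothetical names for illustration, not lightrag APIs.

```python
# Simplified stand-in for lightrag.utils.TokenTracker (assumption: the real
# class likewise only counts usage that is explicitly reported to it).
class MiniTracker:
    def __init__(self):
        self.usage = {"prompt_tokens": 0, "completion_tokens": 0,
                      "total_tokens": 0, "call_count": 0}

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False

    def add_usage(self, counts):
        # The LLM wrapper is expected to call this after each completion.
        self.usage["prompt_tokens"] += counts.get("prompt_tokens", 0)
        self.usage["completion_tokens"] += counts.get("completion_tokens", 0)
        self.usage["total_tokens"] += counts.get("total_tokens", 0)
        self.usage["call_count"] += 1

    def get_usage(self):
        return dict(self.usage)


def llm_without_tracker(prompt):
    return "answer"  # usage is never reported, so counts stay at zero


def llm_with_tracker(prompt, tracker):
    tracker.add_usage({"prompt_tokens": 12, "completion_tokens": 5,
                       "total_tokens": 17})
    return "answer"


tracker = MiniTracker()
with tracker:
    llm_without_tracker("Who is Timi")
print(tracker.get_usage())  # call_count is still 0, as in the bug report

with tracker:
    llm_with_tracker("Who is Timi", tracker)
print(tracker.get_usage())  # call_count is 1 once usage is forwarded
```

If the custom llm_model_func passed to RAGAnything never receives the tracker, the behavior above (counts stuck at zero) is exactly what the logs show.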

Expected Behavior

The log should report a non-zero LLM call count and non-zero token usage.

LightRAG Config Used

async def memory_from_dir(data_dir: str) -> RAGAnything:
    """Create a RAGAnything memory instance from a specified directory.

    Args:
        data_dir (str): The directory containing the data to be used for memory.
    Returns:
        RAGAnything: An instance of RAGAnything configured with the specified data directory.
    """
    rag = RAGAnything(
        config=RAGAnythingConfig(
            working_dir=data_dir,
            parser="mineru",
            parse_method="auto",
            enable_table_processing=True,
        ),
        llm_model_func=llm_model_func,
        vision_model_func=vision_model_func,
        embedding_func=embedding_func,
    )
    await rag._ensure_lightrag_initialized()
    return rag
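Note that in LightRAG's own token-tracking examples, the tracker is wired into the llm_model_func itself (e.g. passed as a `token_tracker` keyword to the backend completion helper such as `openai_complete_if_cache`); the config above never hands the tracker to `llm_model_func`, which would explain the zeros. A hedged sketch of that wiring, with the backend call injected as `complete_fn` so the pattern stays self-contained (`make_llm_model_func` is a hypothetical helper name; check that your LightRAG version's completion helper actually accepts `token_tracker`):

```python
# Sketch: forward the tracker into every LLM call so usage gets recorded.
# `complete_fn` stands in for a backend helper like
# lightrag.llm.openai.openai_complete_if_cache (assumption: it accepts a
# `token_tracker` keyword, as in LightRAG's examples).
def make_llm_model_func(complete_fn, token_tracker):
    async def llm_model_func(prompt, system_prompt=None,
                             history_messages=None, **kwargs):
        # Without this forwarding, the tracker never sees any usage.
        return await complete_fn(
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages or [],
            token_tracker=token_tracker,
            **kwargs,
        )

    return llm_model_func
```

The resulting function would then be passed as `llm_model_func=...` when constructing RAGAnything, using the same tracker instance as the `with token_tracker:` blocks in the repro.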

Logs and screenshots

INFO: Created working directory: /tmp/tmpqte57j3a/rag_storage
INFO: RAGAnything initialized with config:
INFO: Working directory: /tmp/tmpqte57j3a/rag_storage
INFO: Parser: mineru
INFO: Parse method: auto
INFO: Multimodal processing - Image: True, Table: True, Equation: True
INFO: Max concurrent files: 1
INFO: Parser 'mineru' installation verified
INFO: Initializing LightRAG with parameters: {'working_dir': '/tmp/tmpqte57j3a/rag_storage'}
INFO: [] Created new empty graph file: /tmp/tmpqte57j3a/rag_storage/graph_chunk_entity_relation.graphml
INFO: [] Process 85228 KV load full_docs with 0 records
INFO: [] Process 85228 KV load text_chunks with 0 records
INFO: [] Process 85228 KV load full_entities with 0 records
INFO: [] Process 85228 KV load full_relations with 0 records
INFO: [] Process 85228 KV load entity_chunks with 0 records
INFO: [] Process 85228 KV load relation_chunks with 0 records
INFO: [] Process 85228 KV load llm_response_cache with 0 records
INFO: [] Process 85228 doc status load doc_status with 0 records
INFO: [] Process 85228 KV load parse_cache with 0 records
INFO: Multimodal processors initialized with context support
INFO: Available processors: ['image', 'table', 'equation', 'generic']
INFO: Context configuration: ContextConfig(context_window=1, context_mode='page', max_context_tokens=2000, include_headers=True, include_captions=True, filter_content_types=['text'])
INFO: LightRAG, parse cache, and multimodal processors initialized
Info: Found blob info False for user test_user_123
INFO: Created working directory: /tmp/tmp2c9nw0f0/rag_storage
INFO: RAGAnything initialized with config:
INFO: Working directory: /tmp/tmp2c9nw0f0/rag_storage
INFO: Parser: mineru
INFO: Parse method: auto
INFO: Multimodal processing - Image: True, Table: True, Equation: True
INFO: Max concurrent files: 1
INFO: Parser 'mineru' installation verified
INFO: Initializing LightRAG with parameters: {'working_dir': '/tmp/tmp2c9nw0f0/rag_storage'}
INFO: [] Created new empty graph file: /tmp/tmp2c9nw0f0/rag_storage/graph_chunk_entity_relation.graphml
INFO: Multimodal processors initialized with context support
INFO: Available processors: ['image', 'table', 'equation', 'generic']
INFO: Context configuration: ContextConfig(context_window=1, context_mode='page', max_context_tokens=2000, include_headers=True, include_captions=True, filter_content_types=['text'])
INFO: LightRAG, parse cache, and multimodal processors initialized
INFO: Starting complete document processing: md.md
INFO: Starting document parsing: md.md
INFO: Using mineru parser with method: auto
INFO: Using generic parser for .md file (method=auto)...
INFO: Parsing md.md complete! Extracted 4 content blocks
INFO: Stored parsing result in cache: 448988466d5125027b32166caf7752b2
INFO: Content Information:
INFO: * Total blocks in content_list: 4
INFO: * Content block types:
INFO: - text: 4
INFO: Content separation complete:
INFO: - Text content length: 1016 characters
INFO: - Multimodal items count: 0
INFO: Starting text content insertion into LightRAG...
INFO: Processing 1 document(s)
INFO: Extracting stage 1/1: md.md
INFO: Processing d-id: doc-6326028b4ca1cf0b066c1cda95221af1
INFO: Embedding func: 8 new workers initialized (Timeouts: Func: 30s, Worker: 60s, Health Check: 75s)
INFO: LLM func: 4 new workers initialized (Timeouts: Func: 180s, Worker: 360s, Health Check: 375s)
INFO: == LLM cache == saving: default:extract:6d8865786e257fde6b26919023bc73fe
INFO: == LLM cache == saving: default:extract:b7dda58774e34bb806b907b1dd6dd6ec
INFO: Chunk 1 of 1 extracted 15 Ent + 14 Rel chunk-f49cb571a13444ccc029bda23ec1b5e4
INFO: Merging stage 1/1: md.md
INFO: Phase 1: Processing 15 entities from doc-6326028b4ca1cf0b066c1cda95221af1 (async: 8)
INFO: Phase 2: Processing 14 relations from doc-6326028b4ca1cf0b066c1cda95221af1 (async: 8)
INFO: Phase 3: Updating final 15(15+0) entities and 14 relations from doc-6326028b4ca1cf0b066c1cda95221af1
INFO: Completed merging: 15 entities, 0 extra entities, 14 relations
INFO: [] Writing graph with 15 nodes, 14 edges
INFO: In memory DB persist to disk
INFO: Completed processing file 1/1: md.md
INFO: Enqueued document processing pipeline stopped
INFO: Text content insertion complete
INFO: Document md.md processing complete!
LLM call count: 0, Prompt tokens: 0, Completion tokens: 0, Total tokens: 0
{'prompt_tokens': 0, 'completion_tokens': 0, 'total_tokens': 0, 'call_count': 0}
INFO: Executing VLM enhanced query: Who is Timi...
INFO: == LLM cache == saving: mix:keywords:3abca453c4747416a8ce13ff65000772
INFO: Query nodes: Timi (top_k:40, cosine:0.2)
INFO: Local query: 15 entites, 14 relations
INFO: Query edges: Person identification (top_k:40, cosine:0.2)
INFO: Global query: 15 entites, 14 relations
INFO: Naive query: 1 chunks (chunk_top_k:20 cosine:0.2)
INFO: Raw search results: 15 entities, 14 relations, 1 vector chunks
INFO: After truncation: 15 entities, 14 relations
INFO: Selecting 1 from 1 entity-related chunks by vector similarity
INFO: Find no additional relations-related chunks from 14 relations
INFO: Round-robin merged chunks: 2 -> 1 (deduplicated 1)
WARNING: Rerank is enabled but no rerank model is configured. Please set up a rerank model or set enable_rerank=False in query parameters.
INFO: Final context: 15 entities, 14 relations, 1 chunks
INFO: Final chunks S+F/O: E15/1
INFO: Found 0 image path matches in prompt
INFO: No valid images found, falling back to normal query
INFO: Query nodes: Timi (top_k:40, cosine:0.2)
INFO: Local query: 15 entites, 14 relations
INFO: Query edges: Person identification (top_k:40, cosine:0.2)
INFO: Global query: 15 entites, 14 relations
INFO: Naive query: 1 chunks (chunk_top_k:20 cosine:0.2)
INFO: Raw search results: 15 entities, 14 relations, 1 vector chunks
INFO: After truncation: 15 entities, 14 relations
INFO: Selecting 1 from 1 entity-related chunks by vector similarity
INFO: Find no additional relations-related chunks from 14 relations
INFO: Round-robin merged chunks: 2 -> 1 (deduplicated 1)
WARNING: Rerank is enabled but no rerank model is configured. Please set up a rerank model or set enable_rerank=False in query parameters.
INFO: Final context: 15 entities, 14 relations, 1 chunks
INFO: Final chunks S+F/O: E15/1
INFO: == LLM cache == saving: mix:query:7ca3fab27b7b08aef1188957f656a241
I do not have enough information to answer.
LLM call count: 0, Prompt tokens: 0, Completion tokens: 0, Total tokens: 0
{'prompt_tokens': 0, 'completion_tokens': 0, 'total_tokens': 0, 'call_count': 0}

Additional Information

[project]
name = "lumis"
version = "0.1.0"
description = "Lumis memory storage and retrieval module"
readme = "README.md"
requires-python = ">=3.11"
dependencies = [
    "dotenv>=0.9.9",
    "fastapi>=0.120.1",
    "mangum>=0.19.0",
    "pydantic>=2.11.10",
    "python-multipart>=0.0.20",
    "raganything>=1.2.8",
    "ruff>=0.14.2",
    "uvicorn>=0.38.0",
    "vercel-blob>=0.4.2",
]

[dependency-groups]
dev = [
    "pytest>=8.4.2",
    "pytest-asyncio>=1.2.0",
]

[tool.setuptools]
packages = ["lumis"] 

[tool.pytest.ini_options]
testpaths = ["tests"]
pythonpath = ["."]
asyncio_mode = "auto"

Metadata

Assignees

No one assigned

Labels

bug (Something isn't working)
