[Feature Request]: Automatic merging of the same entity under different names #1323

@danielaskdd

Description

Background

LightRAG currently merges entities only when their names match exactly (including captions). As a result, the same entity can end up as multiple disconnected nodes under different names, and can even form isolated subgraphs, which ultimately degrades query performance.

Automated Entity Merging for Variant Names

To address this, we propose an automated entity merging approach for differently named but identical entities:

  1. Vector Node Database Utilization:

    • Modify the node vector DB implementation to store the embedding vector of each entity name.
  2. Similarity Threshold Configuration:

    • Set a minimum cosine similarity threshold (e.g., 0.8) for candidate selection.
  3. Candidate Retrieval:

    • During merging, retrieve the top 10 most relevant nodes based on cosine similarity (above the threshold).
  4. LLM-Based Merge Validation:

    • Submit the current entity’s name/description along with candidate entities’ names/descriptions to an LLM.
    • Task the LLM to:
      • Determine whether merging is justified,
      • If merging is approved, select the best candidate for merging and return the consolidated entity name and description.
  5. Iterative Merging With Depth Limitation (optional):

    • Repeat the merging validation process for the newly consolidated entity returned by the LLM.
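
The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not LightRAG's actual API: `Entity`, `MergeDecision`, `retrieve_candidates`, and `merge_entity` are hypothetical names, a plain in-memory dict stands in for the node vector DB, and `judge` and `embed` are caller-supplied callables standing in for the LLM call and the embedding function.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional

Vector = list[float]


@dataclass
class Entity:  # hypothetical container, not a LightRAG type
    name: str
    description: str
    vector: Vector  # embedding of the entity name (step 1)


@dataclass
class MergeDecision:  # hypothetical shape of the LLM's answer (step 4)
    merge: bool
    candidate_name: Optional[str] = None
    merged_name: Optional[str] = None
    merged_description: Optional[str] = None


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_candidates(query, name_vectors, threshold=0.8, top_k=10):
    """Steps 2-3: keep names at or above the threshold, best first, at most top_k."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in name_vectors.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


def merge_entity(
    entity: Entity,
    store: dict[str, Entity],
    judge: Callable[[Entity, list[Entity]], MergeDecision],  # assumed LLM wrapper
    embed: Callable[[str], Vector],  # assumed embedding function
    max_depth: int = 2,
    threshold: float = 0.8,
    top_k: int = 10,
) -> Entity:
    """Steps 4-5: ask the judge whether to merge, repeating up to max_depth times."""
    current = entity
    for _ in range(max_depth):
        vectors = {n: e.vector for n, e in store.items() if n != current.name}
        hits = retrieve_candidates(current.vector, vectors, threshold, top_k)
        if not hits:
            break
        decision = judge(current, [store[name] for name, _ in hits])
        if not decision.merge or decision.candidate_name not in store:
            break
        # Drop the absorbed node and continue with the consolidated entity.
        del store[decision.candidate_name]
        merged_name = decision.merged_name or current.name
        merged_description = decision.merged_description or current.description
        current = Entity(merged_name, merged_description, embed(merged_name))
    store[current.name] = current
    return current
```

The depth limit bounds the number of LLM calls per inserted entity. A real implementation would also need to redirect the edges of the absorbed node to the consolidated one, which this sketch leaves out.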

Metadata


    Labels: Core, discuss, enhancement, tracked
