[BUG]: Multiple download of Xenova/all-MiniLM-L6-v2 on initial embed #1262

@alexanderMal

How are you running AnythingLLM?

AnythingLLM desktop app

What happened?

Embedding the first document downloads the embedding model Xenova/all-MiniLM-L6-v2 once per chunk instead of only once. See the log:

[NativeEmbedder] Embedded Chunk 245 of 248
[NativeEmbedder] The native embedding model has never been run and will be downloaded right now. Subsequent runs will be faster. (~23MB)
[NativeEmbedder] Downloading Xenova/all-MiniLM-L6-v2 from https://huggingface.co/
[NativeEmbedder - Downloading model] tokenizer_config.json 100%
[NativeEmbedder - Downloading model] config.json 100%
[NativeEmbedder - Downloading model] tokenizer.json 100%
[NativeEmbedder - Downloading model] onnx/model_quantized.onnx 100%
[NativeEmbedder] Embedded Chunk 246 of 248
[NativeEmbedder] The native embedding model has never been run and will be downloaded right now. Subsequent runs will be faster. (~23MB)
[NativeEmbedder] Downloading Xenova/all-MiniLM-L6-v2 from https://huggingface.co/
[NativeEmbedder - Downloading model] tokenizer_config.json 100%
[NativeEmbedder - Downloading model] config.json 100%
[NativeEmbedder - Downloading model] tokenizer.json 100%
[NativeEmbedder - Downloading model] onnx/model_quantized.onnx 100%
[NativeEmbedder] Embedded Chunk 247 of 248
[NativeEmbedder] The native embedding model has never been run and will be downloaded right now. Subsequent runs will be faster. (~23MB)
[NativeEmbedder] Downloading Xenova/all-MiniLM-L6-v2 from https://huggingface.co/
[NativeEmbedder - Downloading model] tokenizer_config.json 100%
[NativeEmbedder - Downloading model] config.json 100%
[NativeEmbedder - Downloading model] tokenizer.json 100%

Subsequent embeddings use the existing model.
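
For illustration only: if the repeated downloads come from several embed calls each starting their own model load before the first one has finished (an assumption; the log above does not confirm the cause), a common way to guarantee a single download is to cache the in-flight promise rather than the finished result. The sketch below is not AnythingLLM code. `downloadModel`, `getModel`, `embedChunks`, and `downloadCount` are hypothetical names, and the download is simulated.

```typescript
// Hypothetical stand-in for the real model download; counts how often it runs.
let downloadCount = 0;

async function downloadModel(name: string): Promise<string> {
  downloadCount += 1;
  // Yield once so that overlapping callers really do overlap.
  await Promise.resolve();
  return `loaded:${name}`;
}

// Cache the pending promise, keyed by model name, so every caller that
// arrives while a download is running waits on that same download.
const inFlight = new Map<string, Promise<string>>();

function getModel(name: string): Promise<string> {
  let pending = inFlight.get(name);
  if (!pending) {
    pending = downloadModel(name).catch((err) => {
      // Drop the failed entry so a later call can retry.
      inFlight.delete(name);
      throw err;
    });
    inFlight.set(name, pending);
  }
  return pending;
}

// Hypothetical per-chunk embedding: each chunk asks for the model,
// but only the first request triggers a download.
async function embedChunks(chunks: string[]): Promise<string[]> {
  return Promise.all(
    chunks.map(async (chunk) => {
      const model = await getModel("Xenova/all-MiniLM-L6-v2");
      return `${model}:${chunk.length}`;
    })
  );
}
```

Calling `embedChunks` with any number of chunks leaves `downloadCount` at 1 in this sketch, because the check and the cache write happen synchronously before the first `await`.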

Are there known steps to reproduce?

  1. Install the desktop app for Windows from the official website.
  2. Start embedding a large file.
  3. View the logs.
