
[FEAT]: num_ctx for Ollama embedder #3122

@alpilotx

Description

How are you running AnythingLLM?

Docker (remote machine)

What happened?

I started a new document embedding run and wanted to take a look at the machine running Ollama, both to see how it uses the cores and to check which parameters were used to start the ollama process.

Then I saw this:
/usr/lib/ollama/runners/cpu_avx2/ollama_llama_server runner --model /root/.ollama/models/blobs/sha256-daec91ffb5dd0c27411bd71f29932917c49cf529a641d0168496c3a501e3062c --ctx-size 2048 --batch-size 512 --threads 64 --no-mmap --parallel 1

And I was surprised to see --ctx-size 2048!

This surprised me because the embedding model I use is bge-m3, which supports larger context windows, and because I explicitly set "Max Embedding Chunk Length=4096" in the configuration.

I also let my documents be split into chunks of up to 4096 tokens.
But with Ollama running at --ctx-size 2048, it will, to my understanding, not "see" anything in my chunks beyond the first 2048 tokens.
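For reference, Ollama's embeddings API accepts per-request model parameters through an options object, including num_ctx, so in principle the embedder could forward the configured chunk length with every call. Here is a minimal sketch of what that might look like (this is not AnythingLLM's actual embedder code; the host URL is a placeholder and embedChunk is a hypothetical helper):

```ts
const OLLAMA_URL = "http://localhost:11434"; // placeholder for your Ollama host

// Hypothetical helper: embed one chunk, optionally overriding the context
// window for this request via Ollama's `options.num_ctx` parameter.
async function embedChunk(text: string, numCtx?: number): Promise<number[]> {
  const res = await fetch(`${OLLAMA_URL}/api/embed`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "bge-m3",
      input: text,
      // When omitted, Ollama falls back to its default context size,
      // i.e. the --ctx-size 2048 visible in the runner process above.
      ...(numCtx ? { options: { num_ctx: numCtx } } : {}),
    }),
  });
  if (!res.ok) throw new Error(`Ollama embed failed: ${res.status}`);
  const data = (await res.json()) as { embeddings: number[][] };
  return data.embeddings[0];
}
```

With such a helper, embedChunk(chunkText, 4096) would match the configured "Max Embedding Chunk Length". In the meantime, a per-model workaround should be possible by creating a derived model from a Modelfile containing PARAMETER num_ctx 4096.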

QUESTION: So, is this a bug, i.e. that AnythingLLM does not run the embedding model with the context size configured via "Max Embedding Chunk Length=xxx" (in my case 4096)?
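One rough way to test this suspicion (a diagnostic sketch reusing the hypothetical embedChunk helper above; the filler text and tail marker are arbitrary):

```ts
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function checkTruncation(): Promise<void> {
  // Filler long enough to exceed 2048 tokens, then two inputs that differ
  // only in their tails, both embedded with Ollama's default context size.
  const filler = "lorem ipsum dolor sit amet ".repeat(600);
  const plain = await embedChunk(filler);
  const tailed = await embedChunk(filler + " zebra quantum piano");
  // If the default 2048-token window silently truncates the input, the
  // differing tail never reaches the model and the similarity is ~1.0.
  console.log("cosine similarity:", cosine(plain, tailed));
}
```

A similarity of ~1.0 despite the different tails would mean the tail never reached the model, which is exactly the silent data loss described above.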

Are there known steps to reproduce?

No response

Metadata

Labels: enhancement (New feature or request)
