Closed
Labels
possible bug: Bug was reported but is not confirmed or is unable to be replicated.
Description
How are you running AnythingLLM?
All versions
What happened?
The function tokenizeString is very CPU-intensive. The only use of it I found is here, where it estimates embedding costs for OpenAI:
anything-llm/frontend/src/components/Modals/ManageWorkspace/Documents/index.jsx
Lines 145 to 146 in e1af72d:

```jsx
// Do not do cost estimation unless the embedding engine is OpenAi.
if (systemSettings?.EmbeddingEngine === "openai") {
```
When running against a local embedding provider, this estimation isn't needed, so skipping it would save significant time and energy.
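The idea can be sketched as a guard that returns early before any tokenization happens. This is a minimal illustration, not the actual AnythingLLM code; the helper name, the pricing constant, and the injected tokenizer are all hypothetical:

```javascript
// Hypothetical sketch: only run the CPU-intensive tokenization when the
// embedding engine is OpenAI, since local providers (e.g. Ollama) have no
// per-token cost to estimate. Names and the price constant are illustrative.
function estimateEmbeddingCost(systemSettings, text, tokenizeString) {
  // Early return skips tokenization entirely for non-OpenAI engines.
  if (systemSettings?.EmbeddingEngine !== "openai") return null;

  const tokenCount = tokenizeString(text).length;
  const COST_PER_1K_TOKENS = 0.0001; // illustrative, not a quoted price
  return (tokenCount / 1000) * COST_PER_1K_TOKENS;
}
```

With `EmbeddingEngine` set to `ollama`, the expensive tokenizer is never invoked at all, which is the behavior this issue asks for.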
Are there known steps to reproduce?
I'm using this configuration:
- EMBEDDING_ENGINE=ollama
- EMBEDDING_BASE_PATH=http://ollama:11434
- EMBEDDING_MODEL_PREF=nomic-embed-text:latest
When I upload an 80 KiB .xlsx file, processing takes so long that the request times out.
Without the token estimation, the file is embedded within 1.1 s.
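To confirm where the time goes, one could time the estimation step in isolation. A rough sketch, where the tokenizer is a naive whitespace stand-in rather than the real tokenizer AnythingLLM uses:

```javascript
// Rough sketch of timing the token-count step for an uploaded document.
// naiveTokenize is a stand-in; a real BPE tokenizer is far more expensive.
function naiveTokenize(text) {
  return text.split(/\s+/).filter(Boolean);
}

function timedTokenCount(text) {
  const start = process.hrtime.bigint();
  const count = naiveTokenize(text).length;
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { count, elapsedMs };
}
```

Comparing this measurement with and without the estimation step would isolate it from the embedding call itself.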