Description
How are you running AnythingLLM?
All versions
What happened?
When uploading a large document that gets split into more than 500 chunks (the default Qdrant batch limit), the embedding process fails. The addDocumentToNamespace function in server/utils/vectorDbProviders/qdrant/index.js correctly splits the vectors into chunks but then incorrectly attempts to upsert all vectors at once, rather than iterating through the prepared chunks.
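For reference, here is an approximation of the pattern described above (a sketch, not a verbatim copy of the file): the vectors are split into chunks with toChunks, but the final client.upsert request is still built from the accumulated submission object, so everything is sent at once.

```js
// Approximate shape of the current behavior (identifiers assumed from the
// description: toChunks, client, namespace, and vector entries with
// id / vector / payload fields).
const submission = { ids: [], vectors: [], payloads: [] };
for (const chunk of toChunks(vectors, 500)) {
  // Chunks are prepared, but only accumulated into `submission`.
  submission.ids.push(...chunk.map((c) => c.id));
  submission.vectors.push(...chunk.map((c) => c.vector));
  submission.payloads.push(...chunk.map((c) => c.payload));
}
// All vectors are sent in a single request, which fails once a document
// produces more than 500 chunks.
await client.upsert(namespace, { wait: true, batch: submission });
```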
Are there known steps to reproduce?
To Reproduce
- Set up AnythingLLM with Qdrant as the vector database.
- Use an embedding model with a small chunk size (e.g., 1024).
- Upload a large document that will generate more than 500 vector chunks.
- Observe the server logs for an error during the client.upsert operation.
Expected behavior
The document should be successfully embedded by sending the vector chunks to Qdrant in multiple batches, each no larger than the batch limit.
Additional context
The issue is in the loop that processes toChunks(vectors, 500). The fix is to use the chunk variable inside the loop to create the batch for client.upsert, rather than using the full submission object.
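A minimal sketch of the corrected loop, assuming the same names from the file (toChunks, client, namespace) and that each entry in vectors carries id, vector, and payload fields; the batch object follows the Qdrant JS client's upsert format:

```js
for (const chunk of toChunks(vectors, 500)) {
  // Build the batch from the current chunk only, so each upsert request
  // stays within the 500-point limit.
  const submission = {
    ids: chunk.map((c) => c.id),
    vectors: chunk.map((c) => c.vector),
    payloads: chunk.map((c) => c.payload),
  };
  await client.upsert(namespace, { wait: true, batch: submission });
}
```

With this change, a document that produces more than 500 chunks is written to Qdrant across several sequential upsert calls instead of one oversized request.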