
Conversation

@timothycarambat (Member)

Pull Request Type

  • ✨ feat
  • πŸ› fix
  • ♻️ refactor
  • πŸ’„ style
  • πŸ”¨ chore
  • πŸ“ docs

Relevant Issues

resolves #1230

What is in this change?

Fixes an issue where the prompt would be erroneously split by the embedder during vector search, resulting in worse semantic similarity.
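
For illustration, here is a minimal TypeScript sketch of the failure mode and the fix. The `Embedder` interface and `embedChunks` signature are hypothetical stand-ins, not the actual AnythingLLM API; the sketch only assumes the reported behavior, where a query string ended up being treated as a list of characters instead of a single text.

```typescript
// Hypothetical embedder interface, for illustration only.
interface Embedder {
  embedChunks(texts: string[]): Promise<number[][]>;
}

// Buggy shape: spreading a string yields one element per character,
// so "hello" becomes ["h", "e", "l", "l", "o"] and each character
// is embedded on its own, destroying semantic similarity.
async function embedQueryBuggy(embedder: Embedder, query: string) {
  return embedder.embedChunks([...query]);
}

// Fixed shape: pass the whole prompt as a single chunk so the
// embedder sees the full text at once.
async function embedQueryFixed(embedder: Embedder, query: string) {
  return embedder.embedChunks([query]);
}
```

With the fixed shape, the nearest-neighbor search compares against one vector representing the whole query rather than many near-meaningless single-character vectors.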

Additional Information

Important

We also need to ensure that the given prompt (or its chunks) does not exceed the embedder model's maximum input length, or the prompt search will crash.
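
A hedged sketch of the kind of guard this note describes. It assumes a character-based limit for simplicity; real embedder limits are usually measured in tokens, so a production guard would use the model's tokenizer. `chunkToMaxLength` is an illustrative name, not a function from this codebase.

```typescript
// Split an over-long prompt into pieces no longer than the embedder's
// maximum input length before requesting embeddings. Character counts
// are used for simplicity; a token-aware splitter would be more accurate.
function chunkToMaxLength(prompt: string, maxLength: number): string[] {
  if (maxLength <= 0) throw new Error("maxLength must be positive");
  if (prompt.length <= maxLength) return [prompt];
  const chunks: string[] = [];
  for (let i = 0; i < prompt.length; i += maxLength) {
    chunks.push(prompt.slice(i, i + maxLength));
  }
  return chunks;
}

// Example: chunkToMaxLength("a".repeat(10), 4) => ["aaaa", "aaaa", "aa"]
```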

Developer Validations

  • I ran yarn lint from the root of the repo & committed changes
  • Relevant documentation has been updated
  • I have tested my code functionality
  • Docker build succeeds locally

@timothycarambat timothycarambat merged commit bf435b2 into master Apr 30, 2024
@timothycarambat timothycarambat deleted the 1230-text-input-embed-chunking-bug branch April 30, 2024 17:11


Successfully merging this pull request may close these issues:

[BUG]: User Query embeddings are being chunked per character when using LM Studio embedding models.
