What would you like to see?
Right now, when you send a prompt that is going to overflow a model's context window:

```js
if (tokenManager.statsFrom(messages) + tokenBuffer < llm.promptWindowLimit())
```
or are pinning documents that will overflow the system prompt budget (anything-llm/server/utils/chats/stream.js, line 110 at 42e1d8e):

```js
maxTokens: LLMConnector.limits.system,
```
We then begin to truncate the messages. This becomes an issue when the user wishes to pin many documents and is willing to accept a more constrained history and user prompt in exchange (a worked example of the window check follows below).
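For concreteness, here is a minimal sketch of how that first window check triggers truncation. All numbers, and the stand-in for `tokenManager.statsFrom(messages)`, are purely illustrative, not AnythingLLM's actual values:

```js
// Illustrative numbers only. With an 8K-token window, a 500-token
// safety buffer, and ~7.8K tokens of messages, the fit check fails
// and truncation kicks in.
const promptWindowLimit = 8192;
const tokenBuffer = 500;
const messageTokens = 7800; // stand-in for tokenManager.statsFrom(messages)

const fits = messageTokens + tokenBuffer < promptWindowLimit;
console.log(fits); // false -> messages begin getting truncated
```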
Ideally, the user should not have the system prompt constrained to a fixed 15% of the overall window. This substantially limits high-context models such as Gemini 1.5, which can only devote 150K tokens of their 1M-token context to the system prompt and pinned documents.
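One way to address this would be to make the split configurable rather than hard-coded. Below is a minimal sketch of the idea; the `TokenBudget` helper, the `systemShare` setting, and the 50/50 history/user split are all assumptions for illustration, not AnythingLLM's actual API:

```js
// Hypothetical sketch: derive per-section token limits from a
// user-configurable system-prompt share instead of a fixed 15%.
class TokenBudget {
  constructor(promptWindowLimit, systemShare = 0.15) {
    this.window = promptWindowLimit;
    // Clamp so the history and user prompt always keep some room.
    this.systemShare = Math.min(Math.max(systemShare, 0.05), 0.95);
  }

  get limits() {
    const system = Math.floor(this.window * this.systemShare);
    const rest = this.window - system;
    const history = Math.floor(rest / 2);
    return {
      system,               // budget for system prompt + pinned documents
      history,              // budget for prior chat history
      user: rest - history, // budget for the current user prompt
    };
  }
}

// With Gemini 1.5's ~1M-token window, raising the share to 0.8 frees
// ~800K tokens for pinned documents instead of the fixed 150K.
console.log(new TokenBudget(1_000_000, 0.8).limits);
// -> { system: 800000, history: 100000, user: 100000 }
```

How the remaining budget is divided between history and the user prompt is a separate question; the point here is only that the system-prompt share should not be hard-coded.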