[FEAT]: System prompt allocation window too constraining. #1244

@timothycarambat

Description

What would you like to see?

Right now, when you send a prompt that would overflow the model's context window

```js
if (tokenManager.statsFrom(messages) + tokenBuffer < llm.promptWindowLimit())
```

or pin documents that would overflow the system prompt budget

```js
maxTokens: LLMConnector.limits.system,
```

we then begin to truncate the messages. This becomes an issue when the user wishes to pin a large number of documents and is willing to have the history and user prompt be more constrained.
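
For illustration only, here is a minimal sketch of that flow. `statsFrom` and `promptWindowLimit` mirror the snippets above, but `TOKEN_BUFFER`, `tokensFrom`, and `textFrom` are hypothetical stand-ins, not AnythingLLM's actual API:

```js
// Sketch of the flow described above; anything marked "assumed" is an
// illustrative stand-in, not the project's real implementation.
const TOKEN_BUFFER = 50; // assumed safety margin

function fitsInWindow(tokenManager, messages, llm) {
  // The estimated token count plus a buffer must stay under the window.
  return tokenManager.statsFrom(messages) + TOKEN_BUFFER < llm.promptWindowLimit();
}

function truncateToBudget(text, tokenManager, maxTokens) {
  // Cut pinned-document text down to the fixed system prompt budget.
  const tokens = tokenManager.tokensFrom(text); // assumed: text -> token array
  if (tokens.length <= maxTokens) return text;
  return tokenManager.textFrom(tokens.slice(0, maxTokens)); // assumed: tokens -> text
}
```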

Ideally, the user should not have the system prompt constrained to a fixed 15% of the overall window. This substantially limits high-context models such as Gemini 1.5, which can use only 150K tokens of their 1M-token context for the system prompt.
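
As a hedged sketch of the requested behavior (the reserve fraction and names are illustrative, not a concrete proposal for AnythingLLM's config), the system prompt budget could track what the pinned documents actually need instead of a fixed slice:

```js
// Illustrative only: size the system prompt budget from the pinned
// content, capped so history and the user prompt always keep a reserve.
const HISTORY_RESERVE = 0.25; // assumed minimum share for history + user prompt

function systemPromptBudget(llm, pinnedTokenCount) {
  const windowLimit = llm.promptWindowLimit(); // e.g. 1,000,000 for Gemini 1.5
  const ceiling = Math.floor(windowLimit * (1 - HISTORY_RESERVE));
  return Math.min(pinnedTokenCount, ceiling);
}
```

With a 1M-token window and a 25% reserve, pinned documents could take up to 750K tokens when they need them, rather than always stopping at 150K.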
