
Add option to control KoboldCPP max response tokens #3746


Conversation

shatfield4
Collaborator

Pull Request Type

  • ✨ feat
  • πŸ› fix
  • ♻️ refactor
  • πŸ’„ style
  • πŸ”¨ chore
  • πŸ“ docs

Relevant Issues

resolves #3708

What is in this change?

After some trial and error, it was discovered that KoboldCPP uses the max_tokens param to control the maximum number of tokens in the response (this is not documented anywhere in their documentation).

KoboldCPP defaults to a maximum of 512 response tokens if max_tokens is not explicitly specified in the API call to its OpenAI-compatible server.
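Because of that 512-token fallback, the request body should always carry max_tokens explicitly. A minimal sketch of an OpenAI-compatible chat completion payload, assuming a placeholder model name (the endpoint, model, and helper name here are illustrative, not taken from this PR):

```python
import json


def build_payload(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-compatible request body for KoboldCPP.

    KoboldCPP falls back to 512 response tokens when max_tokens is
    omitted, so we always include it explicitly (default mirrors
    KoboldCPP's own fallback).
    """
    return {
        "model": "koboldcpp",  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # caps the length of the response
    }


# Example: raise the response cap well above the 512 default.
payload = build_payload("Hello", max_tokens=2048)
print(json.dumps(payload))
```

Sending this body to the server's `/v1/chat/completions` route (e.g. via an HTTP POST) is all the change amounts to: the new setting just controls the value placed in `max_tokens`.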

  • Add an option to the KoboldCPP settings to allow setting the maximum response tokens
  • Update .env.example to allow configuring this option via .env

Additional Information

Developer Validations

  • I ran yarn lint from the root of the repo and committed changes
  • Relevant documentation has been updated
  • I have tested my code functionality
  • Docker build succeeds locally

@timothycarambat timothycarambat merged commit 8912d0f into master May 2, 2025
@timothycarambat timothycarambat deleted the 3708-bug-when-using-koboldcpp-the-max-response-length-is-always-512-regardless-of-context-size branch May 2, 2025 21:12
cabwds pushed a commit to cabwds/anything-llm that referenced this pull request Jul 3, 2025
add option to control koboldcpp max response tokens
Development

Successfully merging this pull request may close these issues.

[BUG]: When using KoboldCPP, the max response length is always 512, regardless of context size.