[BUG]: llama3.1 8B Context Size Max Tokens Ignored in Both Performance Modes

### How are you running AnythingLLM?

AnythingLLM desktop app

### What happened?

<img width="779" alt="anythingfllm_context" src="https://github.com/user-attachments/assets/bf3afd42-94fa-4716-a85a-24555377b17f">

When Performance Mode is set to "Base", the Max Tokens setting is ignored and Llama 3.1 is invoked with an 8K context size. When Performance Mode is set to "Maximum", the Max Tokens setting is also ignored and Llama 3.1 is invoked with a 128K context size. I also created a modelfile to enforce a 32K context size, but the result was still 128K. The workspace was set to use the system-defined LLM settings.
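For reference, here is a minimal sketch of the kind of modelfile meant above. It assumes Ollama's Modelfile syntax (`FROM` and `PARAMETER num_ctx`). The base tag `llama3.1:8b` and the helper name `render_modelfile` are illustrative assumptions, not the exact file used in this report.

```python
def render_modelfile(base_model: str, num_ctx: int) -> str:
    """Return Ollama-style Modelfile text that pins the context window size."""
    # FROM selects the base model; PARAMETER num_ctx sets the context length in tokens.
    return f"FROM {base_model}\nPARAMETER num_ctx {num_ctx}\n"


# 32K context expressed in tokens.
CONTEXT_32K = 32 * 1024

# Hypothetical base tag; substitute whatever tag the local install actually uses.
modelfile_text = render_modelfile("llama3.1:8b", CONTEXT_32K)
print(modelfile_text)
```

With a file like this, the expected context size is 32768 tokens, whichever Performance Mode is selected.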

### Are there known steps to reproduce?

See above
