Closed
Labels: core-team-only, enhancement (New feature or request), possible bug (Bug was reported but is not confirmed or is unable to be replicated)
Description
How are you running AnythingLLM?
All versions
What happened?
The TPS measurement for Gemini models is incorrect. This is likely because we count tokens manually per request, which has always been a rough estimate. If possible, this value should come directly from the API's streamed chunks.
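For reference, a minimal sketch of reading the API-reported token counts from the stream rather than estimating them, assuming the `@google/generative-ai` Node SDK; the helper name, model choice, and TPS math are illustrative and not AnythingLLM's actual connector code:

```ts
// Sketch: take token counts from Gemini's streamed chunks instead of
// estimating them locally. Assumes the @google/generative-ai Node SDK.
import { GoogleGenerativeAI } from "@google/generative-ai";

async function streamWithUsage(apiKey: string, prompt: string): Promise<void> {
  const genAI = new GoogleGenerativeAI(apiKey);
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

  const start = Date.now();
  const result = await model.generateContentStream(prompt);

  let completionTokens = 0;
  for await (const chunk of result.stream) {
    // Streamed chunks carry usageMetadata with API-reported counts;
    // later chunks hold the running/final completion token total.
    if (chunk.usageMetadata?.candidatesTokenCount) {
      completionTokens = chunk.usageMetadata.candidatesTokenCount;
    }
    process.stdout.write(chunk.text());
  }

  const seconds = (Date.now() - start) / 1000;
  console.log(`\nTPS: ${(completionTokens / seconds).toFixed(1)}`);
}
```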
Are there known steps to reproduce?
1. Use Google Gemini as the LLM provider (any model)
2. In a chat, send a simple prompt like `Tell me a short story`
3. Observe that the reported metrics are clearly wrong (~3-8 TPS when it should be 100+)
More Context:
#4459 (review)