Set up and run a local LLM and chatbot using consumer-grade hardware.
LLM Performance Testing: a tiny toolkit for load testing and benchmarking OpenAI-like inference endpoints using K6, Grafana, and InfluxDB.
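A minimal k6 load-test script for an OpenAI-like chat endpoint might look like the sketch below; the endpoint URL, model name, and load profile are placeholder assumptions, not values from the toolkit itself. (k6 scripts run under the k6 runtime, not Node.)

```javascript
import http from 'k6/http';
import { check } from 'k6';

// Placeholder load profile: 5 virtual users for 30 seconds.
export const options = { vus: 5, duration: '30s' };

export default function () {
  // Hypothetical OpenAI-compatible endpoint and model name; adjust for your setup.
  const res = http.post(
    'http://localhost:8000/v1/chat/completions',
    JSON.stringify({
      model: 'my-model',
      messages: [{ role: 'user', content: 'Hello' }],
    }),
    { headers: { 'Content-Type': 'application/json' } }
  );
  // Basic correctness check; k6 reports latency percentiles automatically.
  check(res, { 'status is 200': (r) => r.status === 200 });
}
```

Run with `k6 run script.js`; pointing k6's output at InfluxDB lets Grafana chart the results.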
A proxy for vLLM that exposes token usage metrics.
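One way such a proxy might derive per-request metrics is by reading the `usage` block of each OpenAI-compatible response body (which vLLM emits in non-streaming mode). A small sketch under that assumption; `usageMetrics` is a hypothetical helper, not part of any listed project:

```javascript
// Extract the token-usage counters from an OpenAI-style completion response.
// Field names follow the OpenAI-compatible schema that vLLM also produces.
function usageMetrics(response) {
  const usage = response.usage || {};
  return {
    prompt_tokens: usage.prompt_tokens || 0,
    completion_tokens: usage.completion_tokens || 0,
    total_tokens: usage.total_tokens || 0,
  };
}

// Example: a proxy would call this on each upstream response body
// before forwarding it, accumulating the counts into its metrics store.
const sample = {
  usage: { prompt_tokens: 12, completion_tokens: 34, total_tokens: 46 },
};
console.log(usageMetrics(sample));
```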