Let's collect the remaining known issues related to sampler performance:

- [x] Small regression (a drop of 1 req/sec as measured by `benchmark_throughput.py`) after https://github.com/octoml/mlc-llm/pull/192 when only greedy sampling is used.
- [ ] Logprobs and JSON are extremely slow.