llm_eval

A suite of LLM evaluation experiments: benchmarks and other tests.

For verifiable domains:
- Guided outputs (constrained decoding; multiple choice, JSON schema)
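
As a rough illustration of guided output for multiple choice, the sketch below restricts the answer to a fixed set of choice tokens and renormalises over them. It is not taken from the scripts in this repo. The function name `constrained_choice` and the example log-probabilities are hypothetical, and a real run would take the scores from the model's next-token distribution.

```python
import math


def constrained_choice(logprobs, choices=("A", "B", "C", "D")):
    """Pick the most likely answer among `choices`, ignoring all other tokens.

    `logprobs` maps candidate next tokens to log-probabilities
    (hypothetical input format, assumed for this sketch).
    """
    # Keep only the tokens that are valid answers.
    allowed = {c: logprobs[c] for c in choices if c in logprobs}
    if not allowed:
        raise ValueError("none of the allowed choices has a score")

    # Softmax restricted to the allowed set (shifted by the max for stability).
    top = max(allowed.values())
    total = sum(math.exp(v - top) for v in allowed.values())
    probs = {c: math.exp(v - top) / total for c, v in allowed.items()}

    best = max(probs, key=probs.get)
    return best, probs


# Made-up scores: the unconstrained top token is "The", which is not a valid answer.
example = {"The": -0.4, "B": -1.9, "A": -2.5, "C": -4.0, "D": -5.1}
answer, probs = constrained_choice(example)
print(answer, probs)
```

Because the output is always one of the allowed letters, it can be compared with the reference answer directly, without parsing free-form text.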

Repository contents:
- output
- .gitignore
- README.md
- eval-mmlu-cot.py
- eval-mmlu.py
