Highlights
Starred repositories
UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection
Code and data to accompany "Racing Thoughts" by Lepori et al., 2025
PAIR.withgoogle.com and friends' work on interpretability methods
Reproducing Anthropic’s tracing-the-thoughts interpretability work on open models
General Reasoner: Advancing LLM Reasoning Across All Domains
Repository for developing reasoning models in the financial domain, aiming to enhance model capabilities on financial reasoning tasks.
Just a plain, simple and elegant one-page theme for research/academia.
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, and DeepSeek R1-Zero reproduction.
Fully open reproduction of DeepSeek-R1
Minimal reproduction of DeepSeek R1-Zero
Training Large Language Model to Reason in a Continuous Latent Space
Search-o1: Agentic Search-Enhanced Large Reasoning Models
A series of technical reports on Slow Thinking with LLMs
🤗 smolagents: a barebones library for agents that think in code.
[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
FeatureAlignment = Alignment + Mechanistic Interpretability
Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"
Efficient Triton Kernels for LLM Training
Arena-Hard-Auto: an automatic LLM benchmark.