Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python 7,567 577 Updated Oct 9, 2025

NovaSky-AI / SkyRL

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,025 133 Updated Oct 13, 2025

PeterGriffinJin / Search-R1

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 3,315 280 Updated Oct 4, 2025

Jiayi-Pan / TinyZero

Minimal reproduction of DeepSeek R1-Zero

Python 12,261 1,510 Updated Apr 24, 2025

qiancheng0 / ToolRL

Python 358 26 Updated Jun 10, 2025

PRIME-RL / PRIME

Scalable RL solution for advanced reasoning of language models

Python 1,751 99 Updated Mar 18, 2025

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,157 609 Updated Oct 13, 2025

Yuliang-Liu / MonkeyOCR

A lightweight LMM-based Document Parsing Model

Python 6,058 413 Updated Oct 9, 2025

sierra-research / tau-bench

Code and Data for Tau-Bench

Python 881 135 Updated Aug 28, 2025

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 60,095 7,285 Updated Oct 13, 2025

om-ai-lab / VLM-R1

Solve Visual Understanding with Reinforced VLMs

Python 5,611 358 Updated Aug 29, 2025

QwenLM / Qwen-Agent

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 11,927 1,084 Updated Sep 26, 2025

PrimeIntellect-ai / verifiers

Environments for LLM Reinforcement Learning

Python 3,290 388 Updated Oct 12, 2025

GeeeekExplorer / nano-vllm

Nano vLLM

Python 7,053 902 Updated Aug 31, 2025

google-research / tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

29,250 2,394 Updated Jun 18, 2024

huggingface / smolagents

🤗 smolagents: a barebones library for agents that think in code.

Python 23,366 2,048 Updated Oct 13, 2025

sail-sg / oat

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python 532 40 Updated Oct 2, 2025

huggingface / parler-tts

Inference and training library for high-quality TTS models.

Python 5,441 579 Updated Dec 10, 2024

Muriz Murgio

Achievements

Lists (1)

Frontend

Stars

huggingface / lighteval

openai / openai-guardrails-python

swiss-ai / pretrain-data

swiss-ai / apertus-tech-report

swiss-ai / apertus-format

denizsafak / abogen

nottelabs / open-operator-evals

openai / gpt-oss

vgel / repeng

hkust-nlp / llm-compression-intelligence

rllm-org / rllm

temporalio / sdk-python

OpenPipe / ART