Stars
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
Hierarchical Reasoning Model Official Release
Model release for "Sundial: A Family of Highly Capable Time Series Foundation Models" (ICML 2025 Oral)
2026 AI/ML internship & new graduate job list updated daily
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
A series of math-specific large language models of our Qwen2 series.
An adaptive sampling framework for Reinforce-style LLM post-training.
Fully open data curation for reasoning models
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
Official Repository of "Learning to Reason under Off-Policy Guidance"
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight)
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)
Toward Effective Tool-Integrated Reasoning via Self-Evolved Preference Learning
A simple Google Search Engine Crawler.
ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].
Muon is an optimizer for hidden layers in neural networks
The official repository of ICCV 2025 paper "CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning".
[ICML 2024] Reducing Tool Hallucination via Reliability Alignment
A dataset for training and evaluating LLMs on decision making about "when (not) to call" functions
This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation"
Official implementation of the WMARK@ICLR paper "Detection Limits and Statistical Separability of Tree Ring Watermarks in Rectified Flow-based Text-to-Image Generation Models".