JamesSand

Zhizhou Sha JamesSand

Tsinghua University

75 followers · 228 following

https://jamessand.github.io/

Stars

unslothai / unsloth

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 46,816 3,824 Updated Oct 8, 2025

sapientinc / HRM

Hierarchical Reasoning Model Official Release

Python 10,933 1,615 Updated Sep 9, 2025

thuml / Sundial

About model release for "Sundial: A Family of Highly Capable Time Series Foundation Models" (ICML 2025 Oral)

131 9 Updated Sep 12, 2025

speedyapply / 2026-AI-College-Jobs

2026 AI/ML internship & new graduate job list updated daily

3,615 146 Updated Oct 11, 2025

VITA-Group / WeLore

From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang

Python 49 1 Updated Apr 21, 2025

QwenLM / Qwen2.5-Math

A series of math-specific large language models of our Qwen2 series.

Python 1,015 142 Updated Jan 11, 2025

RLHFlow / Reinforce-Ada

An adaptive sampling framework for Reinforce-style LLM post training.

Python 57 6 Updated Oct 11, 2025

centerforaisafety / hle

Humanity's Last Exam

Python 1,128 70 Updated Oct 7, 2025

open-thoughts / open-thoughts

Fully open data curation for reasoning models

Python 2,113 175 Updated Sep 3, 2025

mlfoundations / dclm

DataComp for Language Models

HTML 1,372 126 Updated Sep 9, 2025

THUNLP-MT / StableToolBench

A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.

Python 187 18 Updated Apr 15, 2025

OpenBMB / ToolBench

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Python 5,272 447 Updated May 21, 2025

GavinZhengOI / LiveCodeBench-Pro

Python 140 3 Updated Sep 28, 2025

ElliottYan / LUFFY

Official Repository of "Learning to Reason under Off-Policy Guidance"

Python 340 39 Updated Oct 4, 2025

MARIO-Math-Reasoning / MARIO_EVAL

Python 52 4 Updated Mar 5, 2025

QwenLM / Qwen-Agent

Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.

Python 11,896 1,082 Updated Sep 26, 2025

HKUDS / LightRAG

[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python 21,695 3,244 Updated Oct 11, 2025

starsuzi / Adaptive-RAG

Jsonnet 333 49 Updated May 2, 2024

GraphPKU / PiSSA

PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight)

Jupyter Notebook 386 19 Updated Jun 30, 2025

Agenta-AI / agenta

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

Python 3,234 382 Updated Oct 10, 2025

RUC-NLPIR / FlashRAG

⚡FlashRAG: A Python Toolkit for Efficient RAG Research (WWW2025 Resource)

Python 3,040 260 Updated Sep 25, 2025

asilverlight / Tool-Light

Toward Effective Tool-Integrated Reasoning via Self-Evolved Preference Learning

Python 15 1 Updated Sep 30, 2025

ZubinGou / llm-agent-web-tools

A simple Google Search Engine Crawler.

Python 19 2 Updated Feb 16, 2024

microsoft / ToRA

ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].

Python 1,095 77 Updated Feb 22, 2024

KellerJordan / Muon

Muon is an optimizer for hidden layers in neural networks

Python 1,830 86 Updated Jul 12, 2025

duowuyms / OpenCATP-LLM

The official repository of ICCV 2025 paper "CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning".

Python 15 2 Updated Aug 13, 2025

X-LANCE / ToolHallucination

[ICML 2024] Reducing Tool Hallucination via Reliability Alignment

Python 6 1 Updated Jun 17, 2025

NVIDIA / When2Call

A dataset for training and evaluating LLMs on decision making about "when (not) to call" functions

Python 38 3 Updated Apr 29, 2025

OliverRensu / xAR

This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation"

Python 237 8 Updated Apr 30, 2025

dsgiitr / flux-watermarking

Official implementation of the WMARK@ICLR paper "Detection Limits and Statistical Separability of Tree Ring Watermarks in Rectified Flow-based Text-to-Image Generation Models".

Jupyter Notebook 6 1 Updated Apr 15, 2025

Lists (6)

Interests

Interview

low rank

NLP

RL

Vision
