hiyouga
🕊️
Coo coo coo

Organizations

@llm-factory

Starred repositories

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 64,597 6,136 Updated Jul 27, 2025

qwen-code is a coding agent that lives in the digital world.

TypeScript 5,696 418 Updated Jul 25, 2025

Text-audio foundation model from Boson AI

Python 5,384 327 Updated Jul 23, 2025

An intelligent dataset construction and evaluation platform for training multimodal large models

TypeScript 80 3 Updated Jul 22, 2025

[ICLR 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,624 70 Updated Jul 26, 2025

Kimi K2 is the large language model series developed by the Moonshot AI team

7,164 460 Updated Jul 23, 2025

Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI (Kunlun Inc.), specializing in vision-language reasoning.

Python 2,903 263 Updated Jul 16, 2025

What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?

TypeScript 10,407 677 Updated Jul 17, 2025

GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation

Python 264 23 Updated Jul 22, 2025

Fused Qwen3 MoE layer for faster training, compatible with HF Transformers, LoRA, 4-bit quant, Unsloth

Python 131 2 Updated Jul 14, 2025

⚙️🗑️ A GitHub Action to free disk space on an Ubuntu runner.

488 90 Updated Aug 6, 2024

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

Python 909 24 Updated Jul 21, 2025

Scaling RL on advanced reasoning models

Python 530 32 Updated Jul 21, 2025

Hands-on with a Chinese domestic accelerator card, the Hygon DCU (large-model training, fine-tuning, inference, etc.)

Python 35 1 Updated Jul 27, 2025

slime is an LLM post-training framework aiming for RL Scaling.

Python 692 51 Updated Jul 27, 2025

Bridge Megatron-Core to Hugging Face/Reinforcement Learning

Python 63 2 Updated Jul 26, 2025

The official Python SDK for Model Context Protocol servers and clients

Python 16,723 2,154 Updated Jul 26, 2025

Visual Planning: Let's Think Only with Images

Python 261 8 Updated May 20, 2025

patches for huggingface transformers to save memory

Python 27 4 Updated Jun 2, 2025

🚀 Efficient implementations of state-of-the-art linear attention models

Python 2,952 223 Updated Jul 27, 2025

🚀ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum—cold-start pre-training, multimodal reinforcement learning, and text-only reinforcement learning—to …

Python 169 2 Updated Jul 9, 2025

Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model

Python 228 14 Updated May 27, 2025

Get started with building Fullstack Agents using Gemini 2.5 and LangGraph

Jupyter Notebook 15,900 2,636 Updated Jun 18, 2025

Python 669 29 Updated Jul 7, 2025

🚀 The fast, Pythonic way to build MCP servers and clients

Python 15,280 974 Updated Jul 25, 2025

Efficient Agent Training for Computer Use

Python 117 3 Updated Jun 6, 2025

[ICCV 2025] Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning

Python 171 12 Updated Jun 26, 2025

OO for LLMs

Python 818 61 Updated Jul 26, 2025

Production ready LLM model compression/quantization toolkit with hw accelerated inference support for both cpu/gpu via HF, vLLM, and SGLang.

Python 700 102 Updated Jul 17, 2025

👷 Build compute kernels

Rust 79 11 Updated Jul 27, 2025