  • Tsinghua University
  • Beijing, China

Generative Universal Verifier as Multimodal Meta-Reasoner

Python 18 2 Updated Oct 16, 2025

High-Fidelity Visual Reasoning on Structured Images

6 1 Updated Sep 30, 2025
Dockerfile 8 Updated Feb 13, 2025

verl-agent is an extension of veRL, designed for training LLM/VLM agents via RL. verl-agent is also the official code for paper "Group-in-Group Policy Optimization for LLM Agent Training"

Python 1,042 89 Updated Oct 15, 2025

MiroThinker is a series of open-source agentic models trained for deep research and complex tool-use scenarios.

Python 461 40 Updated Oct 15, 2025

(ArXiv25) Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning

Python 58 5 Updated Sep 30, 2025

A Benchmark for Evaluating MLLMs' Geometry Performance on Long-Step Problems Requiring Auxiliary Lines

Python 28 Updated Sep 11, 2025

A version of verl to support diverse tool use

Python 607 43 Updated Oct 17, 2025

My learning notes/codes for ML SYS.

Python 3,892 234 Updated Oct 6, 2025

A unified framework for controllable caption generation across images, videos, and audio. Supports multi-modal inputs and customizable caption styles.

Python 51 1 Updated Jul 24, 2025

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,032 36 Updated Oct 4, 2025

[NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning"

Python 24 2 Updated Sep 26, 2025

[ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.

Python 97 5 Updated Dec 20, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 6,303 466 Updated Aug 7, 2024

Description for MV-MATH

Python 15 Updated Jul 20, 2025

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…

Python 10,408 906 Updated Oct 17, 2025

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Python 838 50 Updated Jul 22, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 60,358 7,312 Updated Oct 17, 2025

Fully open reproduction of DeepSeek-R1

Python 25,553 2,395 Updated Sep 8, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 14,440 2,286 Updated Oct 18, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 3,812 289 Updated Oct 16, 2025

Latest Advances on System-2 Reasoning

Python 1,252 70 Updated Jun 8, 2025

This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based reasoning MLLMs here!

1,221 57 Updated Oct 12, 2025

OctoTools: An agentic framework with extensible tools for complex reasoning

Python 1,370 179 Updated Oct 11, 2025

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,186 394 Updated Oct 17, 2025

Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks

Python 3,212 511 Updated Oct 17, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,402 162 Updated Mar 20, 2025

Witness the aha moment of VLM with less than $3.

Python 3,959 290 Updated May 19, 2025

ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting with tools [ICLR'24].

Python 1,098 77 Updated Feb 22, 2024

Scalable RL solution for advanced reasoning of language models

Python 1,750 99 Updated Mar 18, 2025