Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
Benchmarking Knowledge Transfer in Lifelong Robot Learning
ULMEvalKit: One-Stop Eval ToolKit for Image Generation
“FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching”. FlowAR employs the simplest scale design and is compatible with any VAE.
Code release for Ming-UniVision: Joint Image Understanding and Generation with a Continuous Unified Tokenizer
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
WorldVLA: Towards Autoregressive Action World Model
Official repo for NeurIPS 2025 poster: Unveiling the Spatial-temporal Effective Receptive Fields of Spiking Neural Networks
[NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs".
[NeurIPS 2025] The official implementation of paper "Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking"
[CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Large World Model -- Modeling Text and Video with Millions of Tokens of Context
Fully Open Framework for Democratized Multimodal Training
Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.
OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation
Code for our paper "Next Visual Granularity Generation".
[ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
A video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
[NeurIPS 2025] Native-resolution diffusion Transformer