Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
Benchmarking Knowledge Transfer in Lifelong Robot Learning
ULMEvalKit: One-Stop Eval ToolKit for Image Generation
“FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching”. FlowAR employs the simplest scale design and is compatible with any VAE.
Code release for Ming-UniVision: Joint Image Understanding and Generation with a Continuous Unified Tokenizer
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
WorldVLA: Towards Autoregressive Action World Model
Official repo for NeurIPS 2025 poster: Unveiling the Spatial-temporal Effective Receptive Fields of Spiking Neural Networks
[NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs".
[NeurIPS 2025] The official implementation of paper "Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking"
[CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Large World Model -- Modeling Text and Video with Millions of Tokens of Context
Fully Open Framework for Democratized Multimodal Training
Multi-Level Triton Runner supporting Python, IR, PTX, and cubin.
OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation
Code for our paper "Next Visual Granularity Generation".
[ICCV2025] Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
A video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
[NeurIPS 2025] Native-resolution diffusion Transformer