- Tencent
- Shenzhen, China
- https://xinntao.github.io/
Stars
Repo for SeedVR2 & SeedVR (CVPR2025 Highlight)
[ARXIV’25] Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control
An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction"
PyTorch implementation of DiffMoE, TC-DiT, EC-DiT and Dense DiT
[ICCV'25 Oral] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
MoBA: Mixture of Block Attention for Long-Context LLMs
Improving Video Generation with Human Feedback
[ICCV 2025] GameFactory: Creating New Games with Generative Interactive Videos
[ICLR'25] 3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
[ICLR'25] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
[CVPR'25] StyleMaster: Stylize Your Video with Artistic Generation and Translation
Official implementation of the paper "Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content".
Excalidraw app for mac. Powered by pure SwiftUI.
Let your Claude be able to think
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
SEED-Voken: A Series of Powerful Visual Tokenizers
A PyTorch native platform for training generative AI models
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
Translate PDF, EPub, webpage, metadata, annotations, notes to the target language. Support 20+ translate services.
[ECCV 2024] The official implementation of paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
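Xiaohongshu notes | comment crawler, Douyin videos | comment crawler, Kuaishou videos | comment crawler, Bilibili videos | comment crawler, Weibo posts | comment crawler, Baidu Tieba posts | Baidu Tieba comment-reply crawler, Zhihu Q&A articles | comment crawler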
[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers