


RealDPO: Real or Not Real, that is the Preference

Guo Cheng3*,  Danni Yang1*,  Ziqi Huang2†,  Jianlou Si5,  Chenyang Si4,  Ziwei Liu2✉︎

(* Equal Contributions)    († Project Lead)    (✉︎ Corresponding Author)

1 Shanghai Artificial Intelligence Laboratory   2 S-Lab, Nanyang Technological University  
3 University of Electronic Science and Technology of China   4 Nanjing University  
5 SenseTime Research

Paper Dataset Project Page

RealDPO Demo video

🎬 Overview

  • We propose RealDPO, a novel training pipeline for action-centric video generation that leverages real-world data as preference signals to contrastively reveal and correct the model's inherent mistakes, addressing the limitations of existing reward models and preference alignment methods.
  • We design a tailored DPO loss for our video generation training objective, enabling efficient and effective preference alignment without the scalability and bias issues of prior approaches (see the illustrative sketch after this list).
  • We introduce RealAction-5K, a compact yet high-quality curated dataset focused on human daily actions, specifically crafted to advance preference learning for video generation models and broader applications.
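For intuition, below is a minimal, illustrative sketch of a Diffusion-DPO-style preference loss of the kind this pipeline builds on. It is not the exact RealDPO objective (that is defined in the paper and the finetune code); the function name, arguments, and beta value are placeholder assumptions.

# Illustrative only: a generic Diffusion-DPO-style preference loss.
# The exact RealDPO objective lives in the paper / finetune code;
# the names below (dpo_preference_loss, beta, etc.) are hypothetical.
import torch
import torch.nn.functional as F

def dpo_preference_loss(pred_win, pred_lose, ref_win, ref_lose,
                        target_win, target_lose, beta=500.0):
    # pred_*  : denoising predictions of the model being trained
    # ref_*   : predictions of the frozen reference model
    # target_*: denoising targets for the preferred and rejected clips
    def err(pred, tgt):
        # Per-sample squared denoising error, averaged over non-batch dims.
        return ((pred - tgt) ** 2).flatten(1).mean(dim=1)

    model_diff = err(pred_win, target_win) - err(pred_lose, target_lose)
    ref_diff = err(ref_win, target_win) - err(ref_lose, target_lose)

    # The preferred clip should incur lower denoising error than the
    # rejected clip, relative to the frozen reference model.
    logits = -beta * (model_diff - ref_diff)
    return -F.logsigmoid(logits).mean()

Here "win" corresponds to the real-world clip used as the preferred sample and "lose" to the model's own generation, matching the idea of using real data as the preference signal.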

🚀 Installation

Make sure you have conda installed; the environment below uses Python 3.10.

# Create environment
conda create -n real_dpo python=3.10
conda activate real_dpo

# Install PyTorch
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118

# Install dependencies
pip install -r requirements.txt

🧠 Training

Enter the finetune directory and run the training script.

cd finetune
sh train_zero_i2v_real_dpo.sh

🔍 Inference

  1. Generate negative samples
sh scripts/i2v_reject_sampling
  2. Run inference using the trained model
# (Optional) Manually load model checkpoint
# pipe.transformer.load_state_dict(torch.load("../output/real_dpo/mp_rank_00_model_states.pt", map_location='cpu')['module'])

# Run inference
python scripts/i2v_val_sampling.py
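If you need to load the trained weights manually (the commented-out line above), a minimal sketch could look like the following. The base pipeline id is a placeholder, and the checkpoint layout (weights stored under the 'module' key, as in DeepSpeed checkpoints) is taken from the comment above.

# Illustrative sketch of the manual checkpoint load; the model id is a placeholder.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "<base-i2v-model>",  # placeholder: the base image-to-video model used by the scripts
    torch_dtype=torch.bfloat16,
)

# DeepSpeed-style checkpoints keep the weights under the 'module' key.
state = torch.load("../output/real_dpo/mp_rank_00_model_states.pt", map_location="cpu")
pipe.transformer.load_state_dict(state["module"])
pipe.to("cuda")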

📦 Dataset

By default, the scripts use the demo dataset for quick testing.

For actual training, replace it with the full RealAction-5K dataset.

📚 Citation

If you find our repo useful for your research, please consider citing our paper:

@misc{guo2025realdpo,
      title={RealDPO: Real or Not Real, that is the Preference}, 
      author={Guo Cheng and Danni Yang and Ziqi Huang and Jianlou Si and Chenyang Si and Ziwei Liu},
      year={2025},
      eprint={2510.14955},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.14955},
}
