


RealDPO: Real or Not Real, that is the Preference

Guo Cheng3*,  Danni Yang1*,  Ziqi Huang2†,  Jianlou Si5,  Chenyang Si4,  Ziwei Liu2✉︎

(* Equal Contributions)    († Project Lead)    (✉︎ Corresponding Author)

1 Shanghai Artificial Intelligence Laboratory   2 S-Lab, Nanyang Technological University  
3 University of Electronic Science and Technology of China   4 Nanjing University  
5 SenseTime Research

Paper Dataset Project Page

RealDPO Demo video

🎬 Overview

  • We propose RealDPO, a novel training pipeline for action-centric video generation that leverages real-world data as preference signals to contrastively reveal and correct the model's inherent mistakes, addressing the limitations of existing reward models and preference alignment methods.
  • We design a tailored DPO loss for our video generation training objective, enabling efficient and effective preference alignment without the scalability and bias issues of prior approaches (see the illustrative sketch after this list).
  • We introduce RealAction-5K, a compact yet high-quality curated dataset focused on human daily actions, specifically crafted to advance preference learning for video generation models and broader applications.
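For intuition, below is a minimal, illustrative sketch of a Diffusion-DPO-style preference loss of the kind this pipeline builds on. It is not the exact RealDPO objective (that is defined in the paper and the finetune code); the function name, arguments, and beta value are placeholder assumptions.

# Illustrative only: a generic Diffusion-DPO-style preference loss.
# The exact RealDPO objective lives in the paper / finetune code;
# the names below (dpo_preference_loss, beta, etc.) are hypothetical.
import torch
import torch.nn.functional as F

def dpo_preference_loss(pred_win, pred_lose, ref_win, ref_lose,
                        target_win, target_lose, beta=500.0):
    # pred_*  : denoising predictions of the model being trained
    # ref_*   : predictions of the frozen reference model
    # target_*: denoising targets for the preferred and rejected clips
    def err(pred, tgt):
        # Per-sample squared denoising error, averaged over non-batch dims.
        return ((pred - tgt) ** 2).flatten(1).mean(dim=1)

    model_diff = err(pred_win, target_win) - err(pred_lose, target_lose)
    ref_diff = err(ref_win, target_win) - err(ref_lose, target_lose)

    # The preferred clip should incur lower denoising error than the
    # rejected clip, relative to the frozen reference model.
    logits = -beta * (model_diff - ref_diff)
    return -F.logsigmoid(logits).mean()

Here "win" corresponds to the real-world clip used as the preferred sample and "lose" to the model's own generation, matching the idea of using real data as the preference signal.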

🚀 Installation

Make sure you have conda installed; the environment below uses Python 3.10.

# Create environment
conda create -n real_dpo python=3.10
conda activate real_dpo

# Install PyTorch
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118

# Install dependencies
pip install -r requirements.txt

🧠 Training

Enter the finetune directory and run the training script.

cd finetune
sh train_zero_i2v_real_dpo.sh

🔍 Inference

  1. Generate negative samples
sh scripts/i2v_reject_sampling
  2. Run inference using the trained model
# (Optional) Manually load model checkpoint
# pipe.transformer.load_state_dict(torch.load("../output/real_dpo/mp_rank_00_model_states.pt", map_location='cpu')['module'])

# Run inference
python scripts/i2v_val_sampling.py
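If you need to load the trained weights manually (the commented-out line above), a minimal sketch could look like the following. The base pipeline id is a placeholder, and the checkpoint layout (weights stored under the 'module' key, as in DeepSpeed checkpoints) is taken from the comment above.

# Illustrative sketch of the manual checkpoint load; the model id is a placeholder.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "<base-i2v-model>",  # placeholder: the base image-to-video model used by the scripts
    torch_dtype=torch.bfloat16,
)

# DeepSpeed-style checkpoints keep the weights under the 'module' key.
state = torch.load("../output/real_dpo/mp_rank_00_model_states.pt", map_location="cpu")
pipe.transformer.load_state_dict(state["module"])
pipe.to("cuda")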

📦 Dataset

By default, the scripts use the demo dataset for quick testing.

For actual training, replace it with the full RealAction-5K dataset.

📚 Citation

If you find our repo useful for your research, please consider citing our paper:

@misc{guo2025realdpo,
      title={RealDPO: Real or Not Real, that is the Preference}, 
      author={Guo Cheng and Danni Yang and Ziqi Huang and Jianlou Si and Chenyang Si and Ziwei Liu},
      year={2025},
      eprint={2510.14955},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.14955},
}
