Guo Cheng3*, Danni Yang1*, Ziqi Huang2†, Jianlou Si5, Chenyang Si4, Ziwei Liu2✉︎
(* Equal Contributions) († Project Lead) (✉︎ Corresponding Author)
1 Shanghai Artificial Intelligence Laboratory
2 S-Lab, Nanyang Technological University
3 University of Electronic Science and Technology of China
4 Nanjing University
5 SenseTime Research
- We propose RealDPO, a novel training pipeline for action-centric video generation that leverages real-world data as preference signals to contrastively reveal and correct the model's inherent mistakes, addressing the limitations of existing reward models and preference alignment methods.
- We design a tailored DPO loss for our video generation training objective, enabling efficient and effective preference alignment without the scalability and bias issues of prior approaches (a minimal sketch of this objective follows the list).
- We introduce RealAction-5K, a compact yet high-quality curated dataset focused on human daily actions, specifically crafted to advance preference learning for video generation models and broader applications.
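The core idea can be summarized as a DPO-style contrastive loss over preference pairs, where a real clip is the preferred sample and the model's own generation is the rejected one. Below is a minimal, illustrative sketch following the general Diffusion-DPO formulation; the model call signature, the 5D latent shape `(B, C, T, H, W)`, and the `beta` hyperparameter are assumptions, not the exact loss from the paper.

```python
import torch
import torch.nn.functional as F

def real_dpo_loss(model, ref_model, noisy_win, noisy_lose,
                  noise_win, noise_lose, timesteps, cond, beta=1.0):
    """Illustrative DPO-style denoising loss over a preference pair.

    `noisy_win`/`noisy_lose` are noised latents of the preferred (real) and
    rejected (generated) clips; `noise_win`/`noise_lose` are the noises that
    were added. This is a sketch, not the paper's exact objective.
    """
    dims = [1, 2, 3, 4]  # reduce over (C, T, H, W), keep the batch dim

    # Per-sample denoising errors under the trainable model.
    err_w = F.mse_loss(model(noisy_win, timesteps, cond), noise_win, reduction="none").mean(dim=dims)
    err_l = F.mse_loss(model(noisy_lose, timesteps, cond), noise_lose, reduction="none").mean(dim=dims)

    # The same errors under a frozen reference copy of the model.
    with torch.no_grad():
        ref_w = F.mse_loss(ref_model(noisy_win, timesteps, cond), noise_win, reduction="none").mean(dim=dims)
        ref_l = F.mse_loss(ref_model(noisy_lose, timesteps, cond), noise_lose, reduction="none").mean(dim=dims)

    # Reward margin: the model should reduce error on the real clip more than
    # on its own rejected sample, relative to the reference model.
    logits = -beta * ((err_w - ref_w) - (err_l - ref_l))
    return -F.logsigmoid(logits).mean()
```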
Make sure you have conda installed; the environment below uses Python 3.10.
# Create environment
conda create -n real_dpo python=3.10
conda activate real_dpo
# Install PyTorch
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118
# Install dependencies
pip install -r requirements.txt
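Optionally, a quick sanity check that the CUDA build of PyTorch is active:

```python
# Optional sanity check: should print the installed version and True on a CUDA machine.
import torch
print(torch.__version__, torch.cuda.is_available())
```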
Enter the `finetune` directory and run the training script.
cd finetune
sh train_zero_i2v_real_dpo.sh
- Generate negative samples (rejected clips) via rejection sampling:
sh scripts/i2v_reject_sampling.sh
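Conceptually, this step pairs each real clip (preferred) with a sample generated by the current model (rejected). The sketch below is hypothetical: the directory layout, file naming, and output format are assumptions for illustration only.

```python
import json
from pathlib import Path

def build_preference_pairs(real_dir, generated_dir, out_file):
    """Pair real clips (chosen) with model generations (rejected).

    Hypothetical layout: clips for the same prompt share a filename stem,
    e.g. real_dir/0001.mp4 and generated_dir/0001.mp4.
    """
    pairs = []
    for real_clip in sorted(Path(real_dir).glob("*.mp4")):
        rejected = Path(generated_dir) / real_clip.name
        if rejected.exists():
            pairs.append({"chosen": str(real_clip), "rejected": str(rejected)})
    Path(out_file).write_text(json.dumps(pairs, indent=2))
    return pairs
```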
- Run inference using the trained model:
# (Optional) Manually load model checkpoint
# pipe.transformer.load_state_dict(torch.load("../output/real_dpo/mp_rank_00_model_states.pt", map_location='cpu')['module'])
# Run inference
sh scripts/i2v_val_sampling.sh
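If you load the DPO-tuned weights manually (as in the commented line above), here is a sketch assuming a diffusers pipeline whose denoiser lives under `pipe.transformer`; the base-model ID is a placeholder:

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder ID: substitute the base image-to-video model this repo fine-tunes.
pipe = DiffusionPipeline.from_pretrained("<base-i2v-model>", torch_dtype=torch.bfloat16)

# Load the DeepSpeed-saved DPO checkpoint into the transformer backbone.
state = torch.load("../output/real_dpo/mp_rank_00_model_states.pt", map_location="cpu")
pipe.transformer.load_state_dict(state["module"])
pipe.to("cuda")
```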
By default, the demo dataset is used for quick testing. For actual training, replace it with the full RealAction-5K dataset.
If you find our repo useful for your research, please consider citing our paper:
@misc{guo2025realdpo,
      title={RealDPO: Real or Not Real, that is the Preference},
      author={Guo Cheng and Danni Yang and Ziqi Huang and Jianlou Si and Chenyang Si and Ziwei Liu},
      year={2025},
      eprint={2510.14955},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.14955},
}