Visual Imitation Enables Contextual Humanoid Control. arXiv, 2025.
Arthur Allshire*, Hongsuk Choi*, Junyi Zhang*, David McAllister*,
Anthony Zhang, Chung Min Kim, Trevor Darrell, Pieter Abbeel, Jitendra Malik, Angjoo Kanazawa (*Equal contribution)
University of California, Berkeley
- Jul 6, 2025: Initial real-to-sim pipeline release.
- Release real-to-sim pipeline (July 15th, 2025)
- Release the video dataset (July 15th, 2025)
- Release sim-to-real pipeline (September 15th, 2025)
VideoMimic’s real-to-sim pipeline reconstructs 3D environments and human motion from single-camera videos and retargets the motion to humanoid robots for imitation learning. It extracts human poses in world coordinates, maps them to robot configurations, and reconstructs each environment as a point cloud that is later converted to a mesh.
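To make "human poses in world coordinates" concrete, the sketch below shows the kind of rigid transform that moves joint positions from a camera frame into a shared world frame. It is an illustration only, not VideoMimic's code: `CameraPose`, `camera_to_world`, and the sample values are hypothetical names and numbers chosen for this example, and the actual pipeline may represent cameras and poses differently.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class CameraPose:
    """Hypothetical camera-to-world rigid transform: x_world = R @ x_cam + t."""
    rotation: Tuple[Vec3, Vec3, Vec3]  # row-major 3x3 rotation matrix
    translation: Vec3                  # camera origin expressed in the world frame


def camera_to_world(joints_cam: Sequence[Vec3], pose: CameraPose) -> List[Vec3]:
    """Map joint positions from the camera frame into the world frame."""
    joints_world = []
    for joint in joints_cam:
        # Rotate: each output coordinate is a row of R dotted with the joint.
        rotated = tuple(
            sum(r * c for r, c in zip(row, joint)) for row in pose.rotation
        )
        # Translate by the camera position in the world frame.
        joints_world.append(
            tuple(a + b for a, b in zip(rotated, pose.translation))
        )
    return joints_world


# Assumed example: camera rotated 90 degrees about the z-axis and
# raised 1.5 units above the world origin.
pose = CameraPose(
    rotation=((0.0, -1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)),
    translation=(0.0, 0.0, 1.5),
)
joints_world = camera_to_world([(1.0, 0.0, 0.0), (0.0, 2.0, 0.5)], pose)
print(joints_world)
```

Once every frame's joints and the reconstructed scene geometry are expressed in the same world frame, the motion can be compared against the environment and retargeted to a robot in that shared frame.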