Computer Science > Computer Vision and Pattern Recognition
[Submitted on 1 May 2024 (v1), last revised 1 May 2025 (this version, v3)]
Title:F2M-Reg: Unsupervised RGB-D Point Cloud Registration with Frame-to-Model Optimization
Abstract: This work studies the problem of unsupervised RGB-D point cloud registration, which aims to train a robust registration model without ground-truth pose supervision. Existing methods usually leverage unposed RGB-D sequences and adopt a frame-to-frame framework based on differentiable rendering, which supervises the registration model by enforcing photometric and geometric consistency between two frames. However, this frame-to-frame framework is vulnerable to inconsistent factors across frames, e.g., lighting changes, geometric occlusion, and reflective materials, which leads to suboptimal convergence of the registration model. In this paper, we propose a novel frame-to-model optimization framework named F2M-Reg for unsupervised RGB-D point cloud registration. We leverage a neural implicit field as a global model of the scene and optimize the estimated poses of the frames by registering them to this global model; the registration model is subsequently trained with the optimized poses. Thanks to the global encoding capability of the neural implicit field, our frame-to-model framework is significantly more robust to inconsistent factors between frames and thus provides better supervision for the registration model. Besides, we demonstrate that F2M-Reg can be further enhanced by a simple synthetic warming-up strategy. To this end, we construct a photorealistic synthetic dataset named Sim-RGBD to initialize the registration model before the frame-to-model optimization on real-world RGB-D sequences. Extensive experiments on four challenging benchmarks show that our method surpasses the previous state-of-the-art counterparts by a large margin, especially under severe lighting changes and low overlap. Our code and models are available at this https URL.
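To make the pipeline concrete, below is a minimal, heavily simplified sketch in PyTorch of the frame-to-model optimization step described in the abstract. It is not the authors' implementation: the class and function names (NeuralImplicitField, apply_delta, refine_pose) are hypothetical placeholders, the implicit field is reduced to a color-only MLP, and the pose update perturbs translation only. The intent is solely to illustrate the idea of refining a frame's estimated pose against a global scene model instead of against another frame.

import torch
import torch.nn as nn

class NeuralImplicitField(nn.Module):
    """Toy global scene model: maps 3D world points to RGB color (placeholder
    for a full neural implicit field)."""
    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self, pts):            # pts: (N, 3) world coordinates
        return self.mlp(pts)           # (N, 3) predicted colors

def apply_delta(pose, delta):
    # Toy pose update: perturb translation only, to keep the sketch short.
    # A real implementation would update the full SE(3) pose (e.g. via the
    # se(3) exponential map).
    new_pose = pose.clone()
    new_pose[:3, 3] = pose[:3, 3] + delta
    return new_pose

def refine_pose(field, init_pose, frame_rgb, frame_pts, steps=50, lr=1e-2):
    """Frame-to-model optimization: adjust the frame pose so colors queried
    from the global model match the observed frame (consistency against the
    scene model rather than against another frame)."""
    delta = torch.zeros(3, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        pose = apply_delta(init_pose, delta)
        R, t = pose[:3, :3], pose[:3, 3]
        pts_world = frame_pts @ R.T + t          # camera frame -> world frame
        loss = (field(pts_world) - frame_rgb).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return apply_delta(init_pose, delta.detach())

if __name__ == "__main__":
    field = NeuralImplicitField()
    init_pose = torch.eye(4)                     # e.g. pose predicted by the registration model
    frame_pts = torch.randn(1024, 3)             # back-projected depth points (camera frame)
    frame_rgb = torch.rand(1024, 3)              # corresponding pixel colors
    refined_pose = refine_pose(field, init_pose, frame_rgb=frame_rgb, frame_pts=frame_pts)
    print(refined_pose)

In the full method, as the abstract states, the refined poses are also used to update the neural implicit field and ultimately serve as supervision for training the registration model on real-world RGB-D sequences.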
Submission history
From: Zhinan Yu
[v1] Wed, 1 May 2024 13:38:03 UTC (48,168 KB)
[v2] Thu, 20 Jun 2024 07:28:42 UTC (48,169 KB)
[v3] Thu, 1 May 2025 04:19:18 UTC (44,408 KB)