Abstract
The occlusion problem remains a crucial challenge in optical flow estimation (OFE). Despite the significant progress brought by deep learning, most existing deep OFE methods still struggle with occlusions; in particular, two-frame methods cannot handle them correctly because occluded regions have no visual correspondences. Multi-frame settings, in contrast, can potentially mitigate the occlusion issue. Unfortunately, multi-frame OFE (MOFE) remains underexplored: the few existing studies are either specially designed for pyramid backbones, or obtain the previous frame's aligned features, such as the correlation volume and optical flow, through time-consuming backward flow computation or non-differentiable forward warping. This study proposes an efficient MOFE framework named SplatFlow to address these shortcomings. SplatFlow introduces a differentiable splatting transformation to align the previous frame's motion feature and designs a Final-to-All embedding method to feed the aligned motion feature into the current frame's estimation, thereby remodeling existing two-frame backbones. The resulting SplatFlow is efficient yet more accurate, as it handles occlusions properly. Extensive experimental evaluations show that SplatFlow substantially outperforms all published methods on the KITTI2015 and Sintel benchmarks. On the Sintel benchmark in particular, SplatFlow achieves errors of 1.12 (clean pass) and 2.07 (final pass), reducing the previous best submitted results by a remarkable 19.4% and 16.2%, respectively. The code for SplatFlow is available at https://github.com/wwsource/SplatFlow.
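To make the core operation concrete, the following is a minimal sketch of differentiable forward splatting, the kind of transformation SplatFlow builds on to align the previous frame's motion feature with the current frame. It is not the authors' implementation: it shows plain average splatting (bilinear scatter-add followed by weight normalization), a simpler relative of the softmax splatting family, and all function and variable names are illustrative only.

```python
# Hedged sketch of differentiable forward splatting (NOT the authors' code).
# Average splatting: each source pixel distributes its feature over the four
# integer neighbours of its flow target; gradients flow to feature and flow.
import torch


def forward_splat(feature: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Forward-warp `feature` (B, C, H, W) along `flow` (B, 2, H, W)."""
    b, c, h, w = feature.shape
    # Sub-pixel target coordinate of every source pixel.
    gy, gx = torch.meshgrid(
        torch.arange(h, dtype=feature.dtype, device=feature.device),
        torch.arange(w, dtype=feature.dtype, device=feature.device),
        indexing="ij",
    )
    x = gx.unsqueeze(0) + flow[:, 0]          # (B, H, W)
    y = gy.unsqueeze(0) + flow[:, 1]
    x0, y0 = x.floor(), y.floor()

    out = feature.new_zeros(b, c, h * w)      # splatted features
    acc = feature.new_zeros(b, 1, h * w)      # accumulated weights

    # Distribute over the four integer neighbours with bilinear weights.
    for dx in (0.0, 1.0):
        for dy in (0.0, 1.0):
            xi, yi = x0 + dx, y0 + dy
            wgt = (1 - (x - xi).abs()) * (1 - (y - yi).abs())
            inside = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
            wgt = (wgt * inside).unsqueeze(1)                 # (B, 1, H, W)
            idx = (yi.clamp(0, h - 1) * w + xi.clamp(0, w - 1)).long()
            idx = idx.reshape(b, 1, h * w)
            out.scatter_add_(2, idx.expand(-1, c, -1),
                             (feature * wgt).reshape(b, c, h * w))
            acc.scatter_add_(2, idx, wgt.reshape(b, 1, h * w))

    # Normalise where several sources land on the same target; pixels that
    # receive nothing stay zero (genuinely disoccluded regions).
    return (out / acc.clamp_min(1e-6)).view(b, c, h, w)
```

With `feature` as the previous frame's motion feature and `flow` as the previous frame's forward flow, the output is that feature expressed in the current frame's coordinates; a Final-to-All style embedding could then inject it into the current frame's estimation.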
Data Availability
All datasets used are publicly available. Code is available at https://github.com/wwsource/SplatFlow.
Acknowledgements
This work was partially supported by the National Key Research and Development Program of China (No. 2021YFB3100800), the National Natural Science Foundation of China under Grants 61973311, 62376283, and 62006239, the Defense Industrial Technology Development Program (JCKY2020550B003), and the Key Stone grant (JS2023-03) of the National University of Defense Technology (NUDT).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Yasuyuki Matsushita.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, B., Zhang, Y., Li, J. et al. SplatFlow: Learning Multi-frame Optical Flow via Splatting. Int J Comput Vis 132, 3023–3045 (2024). https://doi.org/10.1007/s11263-024-01993-0