Abstract
CEDFlow introduced a latent contour enhancement method for dark optical flow estimation and achieved strong performance. However, it addresses motion boundaries largely in a local manner and falls short when facing significant variations or large-scale degraded scenes. This paper introduces CEDFlow++, which features three new modules that address these key limitations of CEDFlow. First, we introduce a decomposition-based feature encoder (DBFE) that captures both fine-grained and large-scale features through a local encoder and a specially designed sparse attention-based global encoder, which suppresses the noise and interference that arise only in the dark. Second, for reliable motion analysis, we propose a customized dual cost-volume reasoning (DCVR) module that integrates the important high-contrast feature correlations of the global cost volume into the local cost volume, effectively capturing salient yet holistic motion information while mitigating the motion ambiguity caused by darkness. Third, we present a contour-guided attention (CGA) module that enables context-adaptive extraction of contour features by modifying the sign properties of the Sobel kernel parameters in latent space, specifically targeting the large-scale contours suited to motion boundaries. Experimental results on the FCDN and VBOF datasets show that CEDFlow++ outperforms state-of-the-art methods in terms of EPE and produces more accurate and robust optical flow.
Data Availability
The data that support the findings of this study (the FCDN and VBOF datasets) are openly available at https://github.com/mf-zhang/Optical-Flow-in-the-Dark.
References
Bayramli, B., Hur, J., & Lu, H. (2023). Raft-msf: Self-supervised monocular scene flow using recurrent optimizer. IJCV, 131(11), 2757–2769.
Butler, D.J., Wulff, J., Stanley, G.B., & Black, M.J. (2012). A naturalistic open source movie for optical flow evaluation. In: ECCV, pp. 611–625. Springer
Cai, Y., Bian, H., Lin, J., Wang, H., Timofte, R., & Zhang, Y. (2023). Retinexformer: One-stage retinex-based transformer for low-light image enhancement. In: ICCV, pp. 12504–12513
Cai, B., Xu, X., Guo, K., Jia, K., Hu, B., & Tao, D. (2017). A joint intrinsic-extrinsic prior model for retinex. In: ICCV, pp. 4000–4009
Cao, B., Sun, Y., Zhu, P., & Hu, Q. (2023). Multi-modal gated mixture of local-to-global experts for dynamic image fusion. In: ICCV, pp. 23555–23564
Chi, C., Hao, T., Wang, Q., Guo, P., & Yang, X. (2022). Subspace-pnp: A geometric constraint loss for mutual assistance of depth and optical flow estimation. IJCV, 130(12), 3054–3069.
Chobola, T., Liu, Y., Zhang, H., Schnabel, J.A., & Peng, T. (2024). Fast context-based low-light image enhancement via neural implicit representations. In: ECCV, pp. 413–430 . Springer
Conde, M. V., Vazquez-Corral, J., Brown, M. S., & Timofte, R. (2024). Nilut: Conditional neural implicit 3d lookup tables for image enhancement. AAAI, 38, 1371–1379.
Dong, Q., Cao, C., & Fu, Y. (2023). Rethinking optical flow from geometric matching consistent perspective. In: CVPR, pp. 1337–1347
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., Van Der Smagt, P., Cremers, D., & Brox, T. (2015). Flownet: Learning optical flow with convolutional networks. In: ICCV, pp. 2758–2766
Geiger, A., Lenz, P., & Urtasun, R. (2012). Are we ready for autonomous driving? the kitti vision benchmark suite. In: CVPR, pp. 3354–3361. IEEE
Guo, X., Li, Y., & Ling, H. (2016). Lime: Low-light image enhancement via illumination map estimation. IEEE TIP, 26(2), 982–993.
Huang, Z., Shi, X., Zhang, C., Wang, Q., Cheung, K.C., Qin, H., Dai, J., & Li, H. (2022). Flowformer: A transformer architecture for optical flow. In: ECCV, pp. 668–685 . Springer
Ilg, E., Mayer, N., Saikia, T., Keuper, M., Dosovitskiy, A., & Brox, T. (2017). Flownet 2.0: Evolution of optical flow estimation with deep networks. In: CVPR, pp. 2462–2470
Jiang, S., Campbell, D., Lu, Y., Li, H., & Hartley, R. (2021). Learning to estimate hidden motions with global motion aggregation. In: ICCV, pp. 9772–9781
Jiang, S., Lu, Y., Li, H., & Hartley, R. (2021). Learning optical flow from a few matches. In: CVPR, pp. 16592–16600
Li, H., Luo, K., & Liu, S. (2021). Gyroflow: Gyroscope-guided unsupervised optical flow learning. In: ICCV, pp. 12869–12878
Li, H., Luo, K., Zeng, B., & Liu, S. (2024). Gyroflow+: Gyroscope-guided unsupervised deep homography and optical flow learning. IJCV, 132(6), 2331–2349.
Lin, Z., Liang, T., Xiao, T., Wang, Y., & Yang, M.-H. (2024). Flownas: neural architecture search for optical flow estimation. IJCV, 132(4), 1055–1074.
Loshchilov, I., & Hutter, F. (2017). Fixing weight decay regularization in Adam. arXiv preprint arXiv:1711.05101
Luo, A., Li, X., Yang, F., Liu, J., Fan, H., & Liu, S. (2024). Flowdiffuser: Advancing optical flow estimation with diffusion models. In: CVPR, pp. 19167–19176
Luo, K., Wang, C., Liu, S., Fan, H., Wang, J., & Sun, J. (2021). Upflow: Upsampling pyramid for unsupervised optical flow learning. In: CVPR, pp. 1045–1054
Luo, A., Yang, F., Li, X., & Liu, S. (2022). Learning optical flow with kernel patch attention. In: CVPR, pp. 8906–8915
Luo, A., Yang, F., Li, X., Nie, L., Lin, C., Fan, H., & Liu, S. (2023). Gaflow: Incorporating gaussian attention into optical flow. In: ICCV, pp. 9642–9651
Luo, A., Yang, F., Luo, K., Li, X., Fan, H., & Liu, S. (2022). Learning optical flow with adaptive graph reasoning. AAAI, 36, 1890–1898.
Menze, M., & Geiger, A. (2015). Object scene flow for autonomous vehicles. In: CVPR, pp. 3061–3070
Menze, M., Heipke, C., & Geiger, A. (2015). Joint 3d estimation of vehicles and scene flow. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2, 427–434.
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., & Lerer, A. (2017). Automatic differentiation in PyTorch.
Ren, S., Zhou, D., He, S., Feng, J., & Wang, X. (2022). Shunted self-attention via multi-scale token aggregation. In: CVPR, pp. 10853–10862
Ren, Z., Luo, W., Yan, J., Liao, W., Yang, X., Yuille, A., & Zha, H. (2020). Stflow: Self-taught optical flow estimation using pseudo labels. IEEE TIP, 29, 9113–9124.
Shi, X., Huang, Z., Li, D., Zhang, M., Cheung, K.C., See, S., Qin, H., Dai, J., & Li, H. (2023). Flowformer++: Masked cost volume autoencoding for pretraining optical flow estimation. In: CVPR, pp. 1599–1610
Shi, Y., Liu, D., Zhang, L., Tian, Y., Xia, X., & Fu, X. (2024). Zero-ig: zero-shot illumination-guided joint denoising and adaptive enhancement for low-light images. In: CVPR, pp. 3015–3024
Smith, L.N., & Topin, N. (2019). Super-convergence: Very fast training of neural networks using large learning rates. In: Artificial Intelligence and Machine Learning for Multi-domain Operations Applications, vol. 11006, pp. 369–386. SPIE
Sui, X., Li, S., Geng, X., Wu, Y., Xu, X., Liu, Y., Goh, R., & Zhu, H. (2022). Craft: Cross-attentional flow transformer for robust optical flow. In: CVPR, pp. 17602–17611
Sun, D., Yang, X., Liu, M.-Y., & Kautz, J. (2018). Pwc-net: Cnns for optical flow using pyramid, warping, and cost volume. In: CVPR, pp. 8934–8943
Sun, S., Chen, Y., Zhu, Y., Guo, G., & Li, G. (2022). Skflow: Learning optical flow with super kernels. NeurIPS, 35, 11313–11326.
Teed, Z., & Deng, J. (2020). Raft: Recurrent all-pairs field transforms for optical flow. In: ECCV, pp. 402–419. Springer
Wang, R., Xu, X., Fu, C.-W., Lu, J., Yu, B., & Jia, J. (2021). Seeing dynamic scene in the dark: A high-quality video dataset with mechatronic alignment. In: ICCV, pp. 9700–9709
Wang, W., Yang, H., Fu, J., & Liu, J. (2024). Zero-reference low-light enhancement via physical quadruple priors. In: CVPR, pp. 26057–26066
Wang, Y., Yu, Y., Yang, W., Guo, L., Chau, L.-P., Kot, A.C., & Wen, B. (2023). Exposurediffusion: Learning to expose for low-light image enhancement. In: ICCV, pp. 12438–12448
Wang, W., Wang, X., Yang, W., & Liu, J. (2022). Unsupervised face detection in the dark. IEEE TPAMI, 45(1), 1250–1266.
Wang, Y., Wan, R., Yang, W., Li, H., Chau, L.-P., & Kot, A. (2022). Low-light image enhancement with normalizing flow. AAAI,36, 2604–2612.
Wang, B., Zhang, Y., Li, J., Yu, Y., Sun, Z., Liu, L., & Hu, D. (2024). Splatflow: Learning multi-frame optical flow via splatting. IJCV, 132(8), 3023–3045.
Wei, C., Wang, W., Yang, W., & Liu, J. (2018). Deep retinex decomposition for low-light enhancement. arXiv preprint arXiv:1808.04560
Xu, X., Wang, R., & Lu, J. (2023). Low-light image enhancement via structure modeling and guidance. In: CVPR, pp. 9893–9903
Xu, X., Wang, R., Fu, C.-W., & Jia, J. (2022). Snr-aware low-light image enhancement. In: CVPR, pp. 17714–17724
Xu, H., Yang, J., Cai, J., Zhang, J., & Tong, X. (2021). High-resolution optical flow from 1d attention and correlation. In: ICCV, pp. 10498–10507
Xu, H., Zhang, J., Cai, J., Rezatofighi, H., & Tao, D. (2022). Gmflow: Learning optical flow via global matching. In: CVPR, pp. 8121–8130
Young, S. I., Naman, A. T., & Taubman, D. (2019). Graph laplacian regularization for robust optical flow estimation. IEEE TIP, 29, 3970–3983.
Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., & Yang, M.-H. (2022). Restormer: Efficient transformer for high-resolution image restoration. In: CVPR, pp. 5728–5739
Zhang, F., Li, Y., You, S., & Fu, Y. (2021). Learning temporal consistency for low light video enhancement from single images. In: CVPR, pp. 4967–4976
Zhao, S., Zhao, L., Zhang, Z., Zhou, E., & Metaxas, D. (2022). Global matching with overlapping attention for optical flow estimation. In: CVPR, pp. 17592–17601
Zheng, Y., Lu, F., & Zhang, M. (2022). Optical flow in the dark. IEEE TPAMI, 44(12), 9464–9476.
Zheng, Y., Lu, F., & Zhang, M. (2020). Optical flow in the dark. In: CVPR, pp. 6749–6757
Zhou, H., Chang, Y., Liu, H., Yan, W., Duan, Y., Shi, Z., & Yan, L. (2024). Exploring the common appearance-boundary adaptation for nighttime optical flow. arXiv preprint arXiv:2401.17642
Zuo, F., Xiao, Z., Jin, H., & Su, H. (2024). Cedflow: Latent contour enhancement for dark optical flow estimation. AAAI, 38, 7909–7916.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (62272383, 62371389, 62031023) and the Doctoral Dissertation Innovation Fund of Xi'an University of Technology (252072206).
Additional information
Communicated by Ming-Hsuan Yang.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Basic and Context Backbones. We follow a design similar to the well-known RAFT (Teed & Deng, 2020) and incorporate a motion backbone and a context backbone into our PsFE framework. The motion backbone maps the input \(\textbf{x} \in \mathbb {R}^{3 \times H \times W} \) to features at 1/8 resolution in \(\mathbb {R}^{C \times H/8 \times W/8} \), where we set C = 256. It consists of 6 residual blocks: 2 at 1/2 resolution, 2 at 1/4 resolution, and 2 at 1/8 resolution. The structure of the context backbone is identical to that of the feature extraction network, except that BatchNorm regularization is used in the context branch and InstanceNorm is used in the basic branch.
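For concreteness, a minimal PyTorch sketch of such a RAFT-style backbone is given below. The intermediate channel widths, the 7×7 stem, and the module names are our own assumptions; only the block counts per resolution, the 256-dim output at 1/8 resolution, and the BatchNorm/InstanceNorm split between the context and basic branches follow the description above.

```python
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block with a configurable normalization layer (illustrative)."""
    def __init__(self, in_ch, out_ch, stride=1, norm=nn.InstanceNorm2d):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, 1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, 1)
        self.norm1, self.norm2 = norm(out_ch), norm(out_ch)
        self.relu = nn.ReLU(inplace=True)
        self.down = (nn.Conv2d(in_ch, out_ch, 1, stride)
                     if stride != 1 or in_ch != out_ch else nn.Identity())

    def forward(self, x):
        y = self.relu(self.norm1(self.conv1(x)))
        y = self.norm2(self.conv2(y))
        return self.relu(y + self.down(x))

class Backbone(nn.Module):
    """3 -> 256 channels at 1/8 resolution: two residual blocks each at 1/2, 1/4, 1/8."""
    def __init__(self, out_dim=256, norm=nn.InstanceNorm2d):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 64, 7, 2, 3), norm(64), nn.ReLU(True))         # 1/2
        self.layer1 = nn.Sequential(ResBlock(64, 64, 1, norm), ResBlock(64, 64, 1, norm))      # 1/2
        self.layer2 = nn.Sequential(ResBlock(64, 128, 2, norm), ResBlock(128, 128, 1, norm))   # 1/4
        self.layer3 = nn.Sequential(ResBlock(128, 192, 2, norm), ResBlock(192, 192, 1, norm))  # 1/8
        self.head = nn.Conv2d(192, out_dim, 1)

    def forward(self, x):  # x: (B, 3, H, W)
        return self.head(self.layer3(self.layer2(self.layer1(self.stem(x)))))  # (B, 256, H/8, W/8)

motion_net = Backbone(norm=nn.InstanceNorm2d)   # basic (motion) branch
context_net = Backbone(norm=nn.BatchNorm2d)     # context branch
```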
The Local Feature Encoder. We retain the local encoder of CEDFlow. In detail, the local encoder employs three 2D residual convolutional blocks (with a small receptive field) to encode the local properties of each point, each followed by a ReLU activation to ensure robust propagation of the fine-grained local features.
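A minimal sketch of such a local encoder is shown below; the channel width and the exact internals of each residual block are assumptions, as the text only specifies three small-receptive-field 2D residual convolutional blocks followed by ReLU.

```python
import torch.nn as nn

class LocalEncoder(nn.Module):
    """Three 3x3 residual conv blocks with ReLU, preserving fine-grained local detail."""
    def __init__(self, dim=128):  # `dim` is an illustrative choice
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Conv2d(dim, dim, 3, 1, 1), nn.ReLU(inplace=True),
                          nn.Conv2d(dim, dim, 3, 1, 1))
            for _ in range(3)
        ])
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        for blk in self.blocks:
            x = self.relu(x + blk(x))  # residual connection keeps the local properties of each point
        return x
```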
The Global Feature Encoder. Our global encoder consists of a sparse attention module and an interaction layer. As shown in Fig. 19, the interaction layer first predicts a set of scale weights from the local feature \(\hat{f^L}\), and then enhances the appearance information in the coarse \(f^H\) while retaining its high-contrast features. The process is as follows,
where ‘\(\cdot \)’ denotes element-wise multiplication and \(\mathcal{E}\mathcal{A}(\cdot)\) is a spatial attention module, consisting of a convolutional block and a sigmoid function, that extracts an importance weight for each point, thereby improving the spatial expressiveness of the features.
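A rough PyTorch sketch of this interaction layer follows. The residual form \(f^H + \mathcal{E}\mathcal{A}(\hat{f^L}) \cdot f^H\), the two-layer convolutional block inside \(\mathcal{E}\mathcal{A}(\cdot)\), and the matching spatial sizes of the two features are assumptions rather than the exact formulation; only the structure of predicting scale weights from the local feature with a conv block and sigmoid, then applying them to the coarse global feature by element-wise product, is taken from the description above.

```python
import torch.nn as nn

class InteractionLayer(nn.Module):
    """Predicts per-pixel scale weights from the local feature and rescales the
    coarse global feature with them (residual combination is an assumption)."""
    def __init__(self, dim):
        super().__init__()
        # EA(): convolutional block + sigmoid -> importance weight for each point
        self.ea = nn.Sequential(
            nn.Conv2d(dim, dim, 3, 1, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(dim, 1, 1),
            nn.Sigmoid(),
        )

    def forward(self, f_local, f_global):
        w = self.ea(f_local)            # (B, 1, H, W) spatial importance weights
        return f_global + w * f_global  # enhance appearance while keeping high-contrast cues
```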
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zuo, F., Jin, H., Xiao, Z. et al. CEDFlow++: Latent Contour Enhancement for Dark Optical Flow Estimation. Int J Comput Vis 133, 7222–7241 (2025). https://doi.org/10.1007/s11263-025-02528-x
DOI: https://doi.org/10.1007/s11263-025-02528-x