
Active Stereo in the Wild through Virtual Pattern Projection

International Journal of Computer Vision

Abstract

This paper presents a novel general-purpose guided stereo paradigm that mimics the active stereo principle by replacing the unreliable physical pattern projector with a depth sensor. It works by projecting virtual patterns, consistent with the scene geometry, onto the left and right images acquired by a conventional stereo camera, using sparse hints obtained from a depth sensor to facilitate visual correspondence. By design, any depth sensing device can be seamlessly plugged into our framework, enabling the deployment of a virtual active stereo setup in any possible environment and overcoming the severe limitations of physical pattern projection, such as its limited working range and sensitivity to environmental conditions. Exhaustive experiments on indoor and outdoor datasets covering both long and close range, including those providing raw, unfiltered depth hints from off-the-shelf depth sensors, highlight the effectiveness of our approach in notably boosting the robustness and accuracy of both stereo algorithms and deep stereo networks, without any code modification and even without re-training. Additionally, we assess the performance of our strategy on active stereo evaluation datasets with conventional pattern projection. In all these scenarios, our virtual pattern projection paradigm achieves state-of-the-art performance. The source code is available at: https://github.com/bartn8/vppstereo.
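In essence, each sparse depth hint is converted to a disparity d, and the same random patch is painted onto both views at the corresponding locations (x in the left image, x − d in the right), making the two views trivially distinctive there for any matcher. The following is a minimal NumPy sketch of this core idea; the function name, patch size, and alpha blending parameter are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def project_virtual_pattern(left, right, sparse_disp, patch=3, alpha=1.0, seed=0):
    """Paint identical random patches onto a rectified stereo pair at
    locations given by sparse disparity hints (0 = no hint).

    left, right: HxW grayscale images; sparse_disp: HxW disparity hints.
    alpha blends the virtual pattern with the original content (1.0 = replace).
    Returns the two augmented images as float arrays.
    """
    rng = np.random.default_rng(seed)
    h, w = sparse_disp.shape
    out_l = left.astype(np.float64).copy()
    out_r = right.astype(np.float64).copy()
    r = patch // 2
    for y, x in zip(*np.nonzero(sparse_disp)):
        d = int(round(sparse_disp[y, x]))
        xr = x - d  # corresponding column in the right image
        # Skip hints whose patch would fall outside either image.
        if y < r or y >= h - r or x < r or x >= w - r or xr < r or xr >= w - r:
            continue
        pat = rng.uniform(0.0, 255.0, size=(patch, patch))
        out_l[y-r:y+r+1, x-r:x+r+1] = (
            (1 - alpha) * out_l[y-r:y+r+1, x-r:x+r+1] + alpha * pat)
        out_r[y-r:y+r+1, xr-r:xr+r+1] = (
            (1 - alpha) * out_r[y-r:y+r+1, xr-r:xr+r+1] + alpha * pat)
    return out_l, out_r
```

Because the pattern is injected consistently with the hinted geometry, any downstream stereo method sees a textured, easy-to-match region at each hint, with no change to the matcher itself.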


Figures 1–12


Data Availability

The datasets and benchmark results supporting the findings of this study are publicly available. The KITTI 2012, 2015 and derived datasets can be accessed at https://www.cvlibs.net/datasets/kitti/eval_stereo.php. The Middlebury 2014 and 2021 datasets are available at https://vision.middlebury.edu/stereo/data/. The ETH3D dataset can be downloaded at https://www.eth3d.net/datasets. The DSEC dataset can be accessed at https://dsec.ifi.uzh.ch/. The M3ED dataset is available at https://m3ed.io/. The SIMSTEREO dataset can be accessed at https://ieee-dataport.org/open-access/active-passive-simstereo. All the pre-processed data used in our experiments are also available at https://github.com/bartn8/vppstereo/?tab=readme-ov-file#floppy_disk-datasets.


Acknowledgements

This study was carried out within the MOST – Sustainable Mobility National Research Center and received funding from the European Union Next-GenerationEU – PIANO NAZIONALE DI RIPRESA E RESILIENZA (PNRR) – MISSIONE 4 COMPONENTE 2, INVESTIMENTO 1.4 – D.D. 1033 17/06/2022, CN00000023. This manuscript reflects only the authors' views and opinions; neither the European Union nor the European Commission can be held responsible for them. We acknowledge the CINECA award under the ISCRA initiative for the availability of high-performance computing resources and support.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Luca Bartolomei.

Additional information

Communicated by Luca Magri.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article


Cite this article

Bartolomei, L., Poggi, M., Tosi, F. et al. Active Stereo in the Wild through Virtual Pattern Projection. Int J Comput Vis 133, 7242–7269 (2025). https://doi.org/10.1007/s11263-025-02511-6

