
Beyond Learned Metadata-Based Raw Image Reconstruction

Published in: International Journal of Computer Vision

Abstract

While raw images possess distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels, they are not widely adopted by general users due to their substantial storage requirements. Very recent studies propose to compress raw images by designing sampling masks in the pixel space of the raw image. However, these approaches often leave room for more effective image representations and more compact metadata. In this work, we propose a novel framework that learns a compact representation in the latent space, serving as metadata, in an end-to-end manner. Compared with lossy image compression, we analyze how the raw image reconstruction task intrinsically differs owing to the rich information available from the sRGB image. Based on this analysis, we propose a novel backbone design with asymmetric and hybrid spatial feature resolutions, which significantly improves rate-distortion performance. In addition, we propose a novel sRGB-guided context model, which better predicts the order masks for encoding/decoding based on both the sRGB image and the masks of already processed features. Benefiting from better modeling of the correlation between order masks, the already processed information can be utilized more effectively. Moreover, a novel sRGB-guided adaptive quantization precision strategy, which dynamically assigns varying levels of quantization precision to different regions, further enhances the representation ability of the model. Finally, based on the iterative properties of the proposed context model, we propose a novel strategy to achieve variable bit rates with a single model, allowing continuous coverage of a wide range of bit rates. We demonstrate how our raw image compression scheme effectively allocates more bits to image regions of greater global importance.
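The sRGB-guided context model above iteratively decides which latent positions to encode/decode next, conditioned on the sRGB image and the masks of already processed features. The toy sketch below is not the paper's implementation: the `next_order_mask` helper, the scalar guidance map standing in for sRGB-derived features, and the fixed 25%-of-remaining schedule are all illustrative assumptions used only to show the iterative order-mask mechanics.

```python
import numpy as np

def next_order_mask(srgb_guide, processed_mask, frac=0.25):
    """Toy order-mask predictor: among still-unprocessed latent positions,
    select the fraction with the highest sRGB-derived guidance score."""
    scores = srgb_guide.astype(float).copy()
    scores[processed_mask] = -np.inf  # exclude already-processed positions
    k = int(np.ceil(frac * (~processed_mask).sum()))
    top = np.argsort(scores, axis=None)[::-1][:k]  # flat indices, best first
    step_mask = np.zeros_like(processed_mask)
    step_mask[np.unravel_index(top, processed_mask.shape)] = True
    return step_mask

# Iterative encoding/decoding order over a toy 8x8 latent grid.
guide = np.random.default_rng(0).random((8, 8))  # stand-in for sRGB guidance
done = np.zeros((8, 8), dtype=bool)
steps = 0
while not done.all():
    done |= next_order_mask(guide, done)
    steps += 1
```

Because each step conditions on the mask of positions processed so far, the selection order is data-dependent rather than fixed (as in, e.g., a checkerboard schedule), which is the property the paper's context model exploits.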
Extensive experimental results validate the superior performance of the proposed method, which achieves high-quality raw image reconstruction with a smaller metadata size than existing state-of-the-art methods.
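The sRGB-guided adaptive quantization precision strategy assigns finer quantization to regions deemed more important. The following is a hedged illustration only: the median-threshold rule, the two step sizes, and the `adaptive_quantize` helper are assumptions for exposition, not the paper's method.

```python
import numpy as np

def adaptive_quantize(latent, importance, coarse=0.5, fine=0.25):
    """Toy region-adaptive quantizer: positions whose importance is at or
    above the median get the fine step size, the rest the coarse one."""
    use_fine = importance >= np.median(importance)
    q = np.where(use_fine, fine, coarse)      # per-position step size
    return np.round(latent / q) * q, q        # quantize to multiples of q

rng = np.random.default_rng(1)
latent = rng.normal(size=(4, 4))              # stand-in latent features
importance = rng.random((4, 4))               # stand-in sRGB-derived map
recon, q = adaptive_quantize(latent, importance)
max_err = np.abs(recon - latent).max()        # bounded by coarse / 2
```

The quantization error at any position is at most half its step size, so spending finer precision only where the sRGB-derived importance is high trades bits for distortion region by region, in the spirit of the bit-allocation behavior the abstract describes.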



Data availability

This work does not propose a new dataset. All the datasets we used are publicly available.


Funding

This work was done at Rapid-Rich Object Search (ROSE) Lab, Nanyang Technological University. This research is supported in part by the NTU-PKU Joint Research Institute (a collaboration between the Nanyang Technological University and Peking University that is sponsored by a donation from the Ng Teng Fong Charitable Foundation), the Basic and Frontier Research Project of PCL, the Major Key Project of PCL, and the MOE AcRF Tier 1 (RG61/22) and Start-Up Grant.

Author information

Corresponding author

Correspondence to Bihan Wen.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Code availability

The code of this work is released at https://github.com/wyf0912/R2LCM.

Additional information

Communicated by Seon Joo Kim.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Y., Yu, Y., Yang, W. et al. Beyond Learned Metadata-Based Raw Image Reconstruction. Int J Comput Vis 132, 5514–5533 (2024). https://doi.org/10.1007/s11263-024-02143-2
