
Blind Multimodal Quality Assessment of Low-Light Images

Published in: International Journal of Computer Vision

Abstract

Blind image quality assessment (BIQA) aims to automatically and accurately predict objective quality scores for visual signals. It is widely used to monitor product and service quality in low-light applications such as smartphone photography, video surveillance, and autonomous driving. Recent developments in this field are dominated by unimodal solutions that are inconsistent with human subjective rating patterns, in which visual perception is shaped by multiple sensory cues simultaneously. In this article, we present a blind multimodal quality assessment (BMQA) framework for low-light images, spanning subjective evaluation to objective scoring. To investigate the multimodal mechanism, we first establish a multimodal low-light image quality (MLIQ) database with authentic low-light distortions, containing image-text modality pairs. We then design the key modules of BMQA, covering multimodal quality representation, latent feature alignment and fusion, and hybrid self-supervised and supervised learning. Extensive experiments show that BMQA achieves state-of-the-art accuracy on the proposed MLIQ benchmark database. In addition, we build an independent single-modality Dark-4K image database to verify the applicability and generalization of BMQA in mainstream unimodal settings. Qualitative and quantitative results on Dark-4K show that BMQA outperforms existing BIQA approaches, provided that a pre-trained model is available to generate text descriptions. The proposed framework, the two databases, and the collected BIQA methods and evaluation metrics are publicly available at https://charwill.github.io/bmqa.html.
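The pipeline described above (per-modality quality representation, alignment in a shared latent space, and fusion into a single score) can be illustrated with a minimal sketch. This is NOT the paper's implementation: the projection matrices, dimensions, and the linear scoring head below are toy values invented for illustration, and cosine similarity stands in for whatever alignment objective the authors actually use.

```python
# Illustrative sketch (not the paper's method): project an image feature
# vector and a text feature vector into a shared latent space, measure
# their alignment, and fuse them with a linear head into a scalar score.
import math
import random

random.seed(0)  # deterministic toy weights

def project(features, weights):
    """Linear projection of a feature vector into the shared latent space."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def cosine_similarity(a, b):
    """Alignment measure between the two projected modalities, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def fuse_and_score(img_latent, txt_latent, head):
    """Concatenate the aligned latents and apply a linear head -> score."""
    fused = img_latent + txt_latent
    return sum(w * f for w, f in zip(head, fused))

# Toy inputs: 4-D image descriptor, 3-D text descriptor, 2-D shared space.
img_feat = [0.8, 0.1, 0.3, 0.5]
txt_feat = [0.6, 0.2, 0.9]
W_img = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
W_txt = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
head = [random.uniform(-1, 1) for _ in range(4)]

z_img = project(img_feat, W_img)
z_txt = project(txt_feat, W_txt)
alignment = cosine_similarity(z_img, z_txt)
score = fuse_and_score(z_img, z_txt, head)

print(f"alignment={alignment:.3f}, score={score:.3f}")
```

In a real system the projections would be learned (e.g. the alignment term as a self-supervised objective and the score head supervised by subjective ratings, mirroring the hybrid training mentioned above), but the data flow is the same.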



Data Availability

The datasets that support the findings of this work are available from the corresponding author upon reasonable request, and are also available at https://charwill.github.io/bmqa.html.


Funding

This work was supported in part by the National Natural Science Foundation of China under Grant 62472290 and Grant 62372306, in part by the Natural Science Foundation of Guangdong Province under Grant 2024A1515011972, Grant 2023A1515011197, and Grant 2022A1515011245, and in part by the Natural Science Foundation of Shenzhen City under Grant 20220809160139001.

Author information

Corresponding author

Correspondence to Miaohui Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by Jean-François Lalonde.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, M., Xu, Z., Xu, M. et al. Blind Multimodal Quality Assessment of Low-Light Images. Int J Comput Vis 133, 1665–1688 (2025). https://doi.org/10.1007/s11263-024-02239-9
