Abstract
Blind image quality assessment (BIQA) aims to automatically and accurately predict objective quality scores for visual signals, and has been widely used to monitor product and service quality in low-light applications such as smartphone photography, video surveillance, and autonomous driving. Recent developments in this field are dominated by unimodal solutions, which are inconsistent with human subjective rating patterns, where visual perception is simultaneously shaped by multiple sources of sensory information. In this article, we present a blind multimodal quality assessment (BMQA) framework for low-light images, spanning subjective evaluation to objective scoring. To investigate the multimodal mechanism, we first establish a multimodal low-light image quality (MLIQ) database with authentic low-light distortions, containing paired image and text modalities. We then design the key modules of BMQA, addressing multimodal quality representation, latent feature alignment and fusion, and hybrid self-supervised and supervised learning. Extensive experiments show that BMQA achieves state-of-the-art accuracy on the proposed MLIQ benchmark database. We also build an independent, image-only Dark-4K database to verify the applicability and generalization of BMQA in mainstream unimodal applications. Qualitative and quantitative results on Dark-4K show that BMQA outperforms existing BIQA approaches, provided that a pre-trained model is available to generate text descriptions. The proposed framework, the two databases, and the collected BIQA methods and evaluation metrics are publicly available at https://charwill.github.io/bmqa.html.
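To make the described pipeline concrete, below is a minimal PyTorch sketch of a blind multimodal quality model of the kind outlined in the abstract: image and text features are projected into a shared latent space, aligned, fused, and regressed to a quality score under a hybrid supervised plus self-supervised objective. This is an illustration only, not the authors' BMQA implementation; the encoder feature dimensions, fusion MLP, and loss weight `alpha` are assumptions made for exposition.

```python
# Illustrative sketch of a blind multimodal quality assessment pipeline.
# NOT the authors' BMQA code: encoder choices, fusion scheme, and loss
# weighting are assumptions for exposition only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultimodalQualityModel(nn.Module):
    def __init__(self, img_dim=512, txt_dim=512, shared_dim=256):
        super().__init__()
        # Stand-ins for features from pre-trained image/text encoders
        # (e.g., CLIP-style towers); projections align both modalities
        # into a shared latent space.
        self.img_proj = nn.Linear(img_dim, shared_dim)
        self.txt_proj = nn.Linear(txt_dim, shared_dim)
        self.fusion = nn.Sequential(                 # fuses aligned features
            nn.Linear(2 * shared_dim, shared_dim), nn.ReLU())
        self.head = nn.Linear(shared_dim, 1)         # regresses a quality score

    def forward(self, img_feat, txt_feat):
        zi = F.normalize(self.img_proj(img_feat), dim=-1)
        zt = F.normalize(self.txt_proj(txt_feat), dim=-1)
        fused = self.fusion(torch.cat([zi, zt], dim=-1))
        return self.head(fused).squeeze(-1), zi, zt

def hybrid_loss(score_pred, score_gt, zi, zt, alpha=0.1):
    # Supervised regression on subjective scores (MOS) plus a
    # self-supervised alignment term pulling matched image/text
    # embeddings together via cosine similarity.
    reg = F.mse_loss(score_pred, score_gt)
    align = 1.0 - F.cosine_similarity(zi, zt, dim=-1).mean()
    return reg + alpha * align

# Toy usage with random tensors standing in for encoder outputs.
model = MultimodalQualityModel()
img_feat, txt_feat = torch.randn(4, 512), torch.randn(4, 512)
mos = torch.rand(4) * 5
pred, zi, zt = model(img_feat, txt_feat)
loss = hybrid_loss(pred, mos, zi, zt)
loss.backward()
print(pred.detach(), loss.item())
```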
Data Availability
The datasets that support the findings of this work are available from the corresponding author upon reasonable request, and are also available at https://charwill.github.io/bmqa.html.
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 62472290 and Grant 62372306, in part by the Natural Science Foundation of Guangdong Province under Grant 2024A1515011972, Grant 2023A1515011197, and Grant 2022A1515011245, and in part by the Natural Science Foundation of Shenzhen City under Grant 20220809160139001.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by Jean-François Lalonde.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, M., Xu, Z., Xu, M. et al. Blind Multimodal Quality Assessment of Low-Light Images. Int J Comput Vis 133, 1665–1688 (2025). https://doi.org/10.1007/s11263-024-02239-9