
CN103347196A - Method for evaluating stereo image vision comfort level based on machine learning - Google Patents

Method for evaluating stereo image vision comfort level based on machine learning

Info

Publication number
CN103347196A
Authority
CN
China
Prior art keywords
value
image
pixel
coordinate position
denoted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013102649568A
Other languages
Chinese (zh)
Other versions
CN103347196B (en)
Inventor
邵枫
姜求平
蒋刚毅
郁梅
李福翠
彭宗举
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ape Point Technology Beijing Co ltd
Zheng Juan
Original Assignee
Ningbo University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo University filed Critical Ningbo University
Priority to CN201310264956.8A priority Critical patent/CN103347196B/en
Publication of CN103347196A publication Critical patent/CN103347196A/en
Application granted granted Critical
Publication of CN103347196B publication Critical patent/CN103347196B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a machine-learning-based method for evaluating the visual comfort of stereoscopic images. A visually important region mask of the stereoscopic image is first extracted from the saliency map of the right-viewpoint image and the right disparity image. Using this mask, feature vectors reflecting the disparity magnitude features, the disparity gradient features, and the spatial frequency features are extracted and combined into the feature vector of the stereoscopic image. Support vector regression is then used to train on the feature vectors of all stereoscopic images in a stereoscopic image set, and the trained support vector regression model is finally used to test each stereoscopic image in the set, yielding a predicted visual comfort evaluation value for each image. The advantage is that the extracted feature vector information is highly stable and reflects changes in the visual comfort of stereoscopic images well, thereby effectively improving the correlation between objective evaluation results and subjective perception.

Description

A method for evaluating the visual comfort of stereoscopic images based on machine learning

Technical Field

The invention relates to an image quality evaluation method, and in particular to a machine-learning-based method for evaluating the visual comfort of stereoscopic images.

Background Art

With the rapid development of stereoscopic display technology and high-quality stereoscopic content acquisition technology, the quality of experience (QoE) of stereoscopic video has become an important issue in the design of stereoscopic video systems, and visual comfort (VC) is an important factor affecting it. At present, research on the quality evaluation of stereoscopic video/images mainly considers the influence of content distortion on image quality and seldom considers factors such as visual comfort. Therefore, in order to improve the viewer's quality of visual experience, studying objective visual comfort evaluation models for stereoscopic video/images plays a very important role in guiding the production and post-processing of 3D content.

Traditional methods for evaluating the visual comfort of stereoscopic images mainly use global disparity statistics to predict visual comfort. However, according to the attention characteristics of human stereoscopic vision, the human eye is only sensitive to the visual comfort or discomfort of certain visually important regions; predicting the visual comfort of these regions from global disparity statistics therefore fails to yield accurate objective evaluation values. How to effectively extract visual comfort features from visually important regions during evaluation, so that the objective results better match the human visual system, is thus a problem that needs to be studied and solved in the objective visual comfort evaluation of stereoscopic images.

Summary of the Invention

The technical problem to be solved by the present invention is to provide a machine-learning-based method for evaluating the visual comfort of stereoscopic images that can effectively improve the correlation between objective evaluation results and subjective perception.

The technical solution adopted by the present invention to solve the above technical problem is a machine-learning-based method for evaluating the visual comfort of stereoscopic images, characterized in that it comprises the following steps:

① Denote the left-viewpoint image of the stereoscopic image to be evaluated as {I_L(x,y)}, its right-viewpoint image as {I_R(x,y)}, and its right disparity image as {d_R(x,y)}, where (x,y) denotes the coordinate position of a pixel in {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, 1≤x≤W, 1≤y≤H, W denotes the width of {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, H denotes their height, I_L(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {I_L(x,y)}, I_R(x,y) denotes the pixel value of the pixel at (x,y) in {I_R(x,y)}, and d_R(x,y) denotes the pixel value of the pixel at (x,y) in {d_R(x,y)};

② Extract the saliency map of {I_R(x,y)}; then, from the saliency map of {I_R(x,y)} and {d_R(x,y)}, obtain the visual saliency map of {I_R(x,y)}; divide the visual saliency map of {I_R(x,y)} into a visually important region and a non-visually important region; finally, from these two regions, obtain the visually important region mask of the stereoscopic image to be evaluated, denoted {M(x,y)}, where M(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {M(x,y)};

③ From {d_R(x,y)} and {M(x,y)}, obtain the disparity mean μ, disparity variance δ, maximum negative disparity θ and disparity range χ of the pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; then arrange μ, δ, θ and χ in order to form the feature vector reflecting the disparity magnitude features of {d_R(x,y)}, denoted F_1, F_1 = (μ, δ, θ, χ);

④ Compute the disparity gradient edge image of {d_R(x,y)} from its disparity gradient magnitude image and disparity gradient direction image; then, from the disparity gradient edge image and {M(x,y)}, compute the gradient mean ψ of all pixels of the disparity gradient edge image in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; finally take ψ as the feature vector reflecting the disparity gradient features of {d_R(x,y)}, denoted F_2;

⑤ Obtain the spatial frequency image of {I_R(x,y)}; then, from the spatial frequency image and {M(x,y)}, obtain the spatial frequency mean ν, spatial frequency variance ρ, spatial frequency range ζ and spatial frequency sensitivity factor τ of the pixels of the spatial frequency image in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; then arrange ν, ρ, ζ and τ in order to form the feature vector reflecting the spatial frequency features of {I_R(x,y)}, denoted F_3, F_3 = (ν, ρ, ζ, τ);

⑥ Combine F_1, F_2 and F_3 into a new feature vector, denoted X, X = [F_1, F_2, F_3], and take X as the feature vector of the stereoscopic image to be evaluated, where the symbol "[]" denotes a vector and [F_1, F_2, F_3] means concatenating F_1, F_2 and F_3 into a new feature vector;

⑦ Establish a stereoscopic image set from n different stereoscopic images and their corresponding right disparity images, and use a subjective quality evaluation method to obtain the mean opinion score of the visual comfort of each stereoscopic image in the set, denoted MOS, where n≥1 and MOS ∈ [1,5]; then, following the operations of steps ① to ⑥ for computing the feature vector X of the stereoscopic image to be evaluated, compute in the same way the feature vector of each stereoscopic image in the set, denoting the feature vector of the i-th stereoscopic image as X_i, where 1≤i≤n and n denotes the number of stereoscopic images in the set;

⑧ Divide all the stereoscopic images in the set into a training set and a test set; form the training sample data set from the feature vectors and mean opinion scores of all stereoscopic images in the training set, and the test sample data set from those of the test set; then use support vector regression as the machine learning method to train on the feature vectors of all stereoscopic images in the training sample data set, such that the error between the regression function values obtained through training and the mean opinion scores is minimized, fitting the optimal weight vector w_opt and the optimal bias term b_opt; construct the support vector regression training model from w_opt and b_opt; then, with this model, test the feature vector of each stereoscopic image in the test sample data set and predict the objective visual comfort evaluation value of each such image, denoting the predicted value of the k'-th stereoscopic image in the test sample data set as Q_{k'}, Q_{k'} = f(X_{k'}),

$$f(X_{k'}) = (w_{opt})^T \varphi(X_{k'}) + b_{opt},$$

where f() is the function notation, X_{k'} denotes the feature vector of the k'-th stereoscopic image in the test sample data set, (w_opt)^T is the transpose of w_opt, φ(X_{k'}) denotes the linear function of the k'-th stereoscopic image in the test sample data set, 1≤k'≤n−t, and t denotes the number of stereoscopic images in the training set; then, by reassigning the training set and the test set, re-predict the objective visual comfort evaluation value of each stereoscopic image in the test sample data set; after N iterations, compute the average of the objective visual comfort evaluation values of each stereoscopic image in the set, and take this average as the final objective visual comfort evaluation value of that image, where N is taken greater than 100.

The specific process of step ② is:

②-1. Extract the saliency map of {I_R(x,y)} using the graph-theory-based visual saliency model, denoted {SM_R(x,y)}, where SM_R(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {SM_R(x,y)};

②-2. From {SM_R(x,y)} and {d_R(x,y)}, obtain the visual saliency map of {I_R(x,y)}, denoted {D_R(x,y)}, with the pixel value of the pixel at coordinate position (x,y) denoted D_R(x,y) and computed as the weighted combination

$$D_R(x,y) = \omega_{SM} \times SM_R(x,y) + \omega_d \times d_R(x,y),$$

where ω_SM denotes the weight of SM_R(x,y) and ω_d denotes the weight of d_R(x,y);

②-3. According to the pixel value of each pixel in {D_R(x,y)}, divide {D_R(x,y)} into a visually important region and a non-visually important region: the pixel value of every pixel in the visually important region is greater than an adaptive threshold T_1, and the pixel value of every pixel in the non-visually important region is less than or equal to T_1, where T_1 is the threshold obtained by applying Otsu's method to {D_R(x,y)};

②-4. From the visually important and non-visually important regions of {D_R(x,y)}, obtain the visually important region mask of the stereoscopic image to be evaluated, denoted {M(x,y)}, with the pixel value of the pixel at coordinate position (x,y) denoted M(x,y):

$$M(x,y) = \begin{cases} 1, & D_R(x,y) > T_1 \\ 0, & D_R(x,y) \le T_1 \end{cases}$$

In step ②-2, fixed values are taken for the weights ω_SM and ω_d (the specific values are given as an image in the original).

The specific process of step ③ is:

③-1. From {d_R(x,y)} and {M(x,y)}, compute the disparity mean of all pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted μ:

$$\mu = \frac{\sum_{(x,y)\in\Omega} d_R(x,y) \times M(x,y)}{\sum_{(x,y)\in\Omega} M(x,y)},$$

where Ω denotes the image domain;

③-2. From {d_R(x,y)}, {M(x,y)} and μ, compute the disparity variance of all pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted δ:

$$\delta = \frac{\sum_{(x,y)\in\Omega} (d_R(x,y)-\mu)^2 \times M(x,y)}{\sum_{(x,y)\in\Omega} M(x,y)};$$

③-3. Compute the maximum negative disparity of the pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted θ, where θ is the mean disparity of the 1% of pixels with the smallest disparity values in that region;

③-4. Compute the disparity range of the pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted χ, χ = d_max − d_min, where d_max denotes the mean disparity of the 1% of pixels with the largest disparity values in that region, and d_min denotes the mean disparity of the 1% of pixels with the smallest disparity values in that region;

③-5. Arrange μ, δ, θ and χ in order to form the feature vector reflecting the disparity magnitude features of {d_R(x,y)}, denoted F_1, F_1 = (μ, δ, θ, χ); the dimension of F_1 is 4.
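A minimal sketch of the F_1 computation in step ③, assuming `disparity` and `mask` are NumPy arrays of equal shape (with `mask` produced as in step ②):

```python
import numpy as np

def disparity_magnitude_features(disparity, mask):
    """F1 = (mu, delta, theta, chi) over the visually important region."""
    d = disparity[mask > 0]                    # disparities inside the mask
    mu = d.mean()                              # disparity mean
    delta = ((d - mu) ** 2).mean()             # disparity variance
    d_sorted = np.sort(d)
    k = max(1, int(0.01 * d_sorted.size))      # 1% of the masked pixels
    theta = d_sorted[:k].mean()                # max negative disparity = mean of
                                               # the 1% smallest disparities (d_min)
    d_max = d_sorted[-k:].mean()               # mean of the 1% largest disparities
    chi = d_max - theta                        # disparity range
    return np.array([mu, delta, theta, chi])   # F1
```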

The specific process of step ④ is:

④-1. Compute the disparity gradient magnitude image of {d_R(x,y)}, denoted {m(x,y)}, with the gradient magnitude of the pixel at coordinate position (x,y) denoted m(x,y):

$$m(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2},$$

where G_x(x,y) denotes the horizontal gradient value of the pixel at coordinate position (x,y) and G_y(x,y) denotes its vertical gradient value;

④-2. Compute the disparity gradient direction image of {d_R(x,y)}, denoted {θ(x,y)}, with the gradient direction value of the pixel at coordinate position (x,y) denoted θ(x,y), θ(x,y) = arctan(G_y(x,y)/G_x(x,y)), where arctan() is the arctangent function;

④-3. From {m(x,y)} and {θ(x,y)}, compute the disparity gradient edge image of {d_R(x,y)}, denoted {E(x,y)}, with the gradient edge value of the pixel at coordinate position p denoted E(p); E(p) accumulates, over the neighborhood window N(p) centered at p, the product of a spatial Gaussian G_s(||p−q||), an orientation Gaussian G_o(||θ⃗(p)−θ⃗(q)||), and the gradient magnitude m(q) normalized over the neighborhood window N(q) by means of m(q') and the control parameter ε_g (the full expression is given as an image in the original). Here G_s(||p−q||) denotes a Gaussian function with standard deviation σ_s,

$$G_s(\|p-q\|) = \exp\left(-\frac{\|p-q\|^2}{2\sigma_s^2}\right),$$

||p−q|| denotes the Euclidean distance between coordinate positions p and q, the symbol "|| ||" denoting the Euclidean distance, and G_o(||θ⃗(p)−θ⃗(q)||) denotes a Gaussian function with standard deviation σ_o,

$$G_o(\|\vec{\theta}(p)-\vec{\theta}(q)\|) = \exp\left(-\frac{\|\vec{\theta}(p)-\vec{\theta}(q)\|^2}{2\sigma_o^2}\right),$$

where ||θ⃗(p)−θ⃗(q)|| denotes the Euclidean distance between θ⃗(p) and θ⃗(q), θ⃗(p) = [sin(θ(p)), cos(θ(p))], θ⃗(q) = [sin(θ(q)), cos(θ(q))], θ(p) denotes the gradient direction value of the pixel at coordinate position p in {θ(x,y)}, θ(q) denotes the gradient direction value of the pixel at coordinate position q in {θ(x,y)}, m(q) denotes the gradient magnitude of the pixel at coordinate position q in {m(x,y)}, m(q') denotes the gradient magnitude of the pixel at coordinate position q' in {m(x,y)}, ε_g is a control parameter, the symbol "[]" denotes a vector, exp() denotes the exponential function with base e, e = 2.71828183, N(p) denotes the neighborhood window centered on the pixel at coordinate position p, and N(q) denotes the neighborhood window centered on the pixel at coordinate position q;

④-4. From {E(x,y)} and {M(x,y)}, compute the gradient mean of all pixels of {E(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted ψ:

$$\psi = \frac{\sum_{(x,y)\in\Omega} E(x,y) \times M(x,y)}{\sum_{(x,y)\in\Omega} M(x,y)},$$

where Ω denotes the image domain and E(x,y) denotes the gradient edge value of the pixel at coordinate position (x,y) in {E(x,y)};

④-5. Take ψ as the feature vector reflecting the disparity gradient features of {d_R(x,y)}, denoted F_2; the dimension of F_2 is 1.

In step ④-3, σ_s = 0.4, σ_o = 0.4 and ε_g = 0.5 are taken.

In step ④-3, the sizes of the neighborhood windows N(p) and N(q) are both 3×3.
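A simplified sketch of step ④; since the exact E(p) expression is only partially recoverable here, the neighborhood accumulation below, and in particular the normalization of m(q) by the window maximum plus ε_g, is an assumption, while G_s, G_o, the 3×3 windows and the parameter values follow the text:

```python
import numpy as np

def disparity_gradient_feature(disparity, mask,
                               sigma_s=0.4, sigma_o=0.4, eps_g=0.5):
    # Steps 4-1 and 4-2: gradient magnitude and direction of the disparity map.
    gy, gx = np.gradient(disparity.astype(float))
    m = np.hypot(gx, gy)
    theta = np.arctan2(gy, gx)
    h, w = m.shape
    e = np.zeros_like(m)
    # Step 4-3 (approximate): accumulate spatially- and orientation-weighted,
    # normalized gradient magnitudes over the 3x3 window N(p).
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            tp = np.array([np.sin(theta[y, x]), np.cos(theta[y, x])])
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    qy, qx = y + dy, x + dx
                    gs = np.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                    tq = np.array([np.sin(theta[qy, qx]), np.cos(theta[qy, qx])])
                    go = np.exp(-np.sum((tp - tq) ** 2) / (2 * sigma_o ** 2))
                    win = m[max(qy - 1, 0):qy + 2, max(qx - 1, 0):qx + 2]  # N(q)
                    acc += gs * go * m[qy, qx] / (win.max() + eps_g)  # assumed norm.
            e[y, x] = acc
    # Step 4-4: gradient mean over the visually important region -> F2.
    return (e * mask).sum() / mask.sum()
```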

The specific process of step ⑤ is:

⑤-1. Compute the spatial frequency image of {I_R(x,y)}, denoted {SF(x,y)}, with the spatial frequency value of the pixel at coordinate position (x,y) denoted SF(x,y):

$$SF(x,y) = \sqrt{(HF(x,y))^2 + (VF(x,y))^2 + (DF(x,y))^2},$$

where HF(x,y) denotes the horizontal frequency value of the pixel at coordinate position (x,y) in {I_R(x,y)},

$$HF(x,y) = \sqrt{\frac{1}{3\times 2}\sum_{m=-1}^{1}\sum_{n=0}^{1}\left(I_R(x+m,y+n)-I_R(x+m,y+n-1)\right)^2},$$

VF(x,y) denotes the vertical frequency value of the pixel at coordinate position (x,y) in {I_R(x,y)},

$$VF(x,y) = \sqrt{\frac{1}{2\times 3}\sum_{m=0}^{1}\sum_{n=-1}^{1}\left(I_R(x+m,y+n)-I_R(x+m-1,y+n)\right)^2},$$

and DF(x,y) denotes the diagonal frequency value of the pixel at coordinate position (x,y) in {I_R(x,y)},

$$DF(x,y) = \sqrt{\frac{1}{2\times 2}\sum_{m=0}^{1}\sum_{n=0}^{1}\left(I_R(x+m,y+n)-I_R(x+m-1,y+n-1)\right)^2 + \frac{1}{2\times 2}\sum_{m=-1}^{0}\sum_{n=0}^{1}\left(I_R(x+m,y+n)-I_R(x+m+1,y+n-1)\right)^2},$$

where I_R(x+m,y+n) denotes the pixel value of the pixel at coordinate position (x+m,y+n) in {I_R(x,y)}, and likewise for the other shifted coordinates. Coordinates falling outside the image are clamped to the boundary: a horizontal index below 1 (e.g. x+m<1 or x+m−1<1) is replaced by 1 and one above W (e.g. x+m>W or x+m+1>W) by W; a vertical index below 1 (e.g. y+n<1 or y+n−1<1) is replaced by 1 and one above H by H. For example, if x+m<1, the value of I_R(x+m,y+n) is replaced by the value of I_R(1,y+n);

⑤-2. From {SF(x,y)} and {M(x,y)}, compute the spatial frequency mean of all pixels of {SF(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted ν:

$$\nu = \frac{\sum_{(x,y)\in\Omega} SF(x,y) \times M(x,y)}{\sum_{(x,y)\in\Omega} M(x,y)},$$

where Ω denotes the image domain;

⑤-3. From {SF(x,y)}, {M(x,y)} and ν, compute the spatial frequency variance of all pixels of {SF(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted ρ:

$$\rho = \frac{\sum_{(x,y)\in\Omega} (SF(x,y)-\nu)^2 \times M(x,y)}{\sum_{(x,y)\in\Omega} M(x,y)};$$

⑤-4. Compute the spatial frequency range of the pixels of {SF(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted ζ, ζ = SF_max − SF_min, where SF_max denotes the mean spatial frequency of the 1% of pixels with the largest spatial frequency values in that region, and SF_min denotes the mean spatial frequency of the 1% of pixels with the smallest spatial frequency values in that region;

⑤-5. Compute the spatial frequency sensitivity factor of the pixels of {SF(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted τ, τ = ν/μ;

⑤-6. Arrange ν, ρ, ζ and τ in order to form the feature vector reflecting the spatial frequency features of {I_R(x,y)}, denoted F_3, F_3 = (ν, ρ, ζ, τ); the dimension of F_3 is 4.
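A sketch of the F_3 computation in step ⑤, under a simplifying assumption: the patent's HF/VF/DF average squared neighbor differences over a small window, whereas the sketch uses a single first difference per direction on an edge-padded (boundary-clamped) image; `mu` is the disparity mean from F_1:

```python
import numpy as np

def spatial_frequency_features(image, mask, mu):
    """F3 = (nu, rho, zeta, tau) over the visually important region."""
    i = np.pad(image.astype(float), 1, mode='edge')   # clamp at the borders
    hf = i[1:-1, 1:-1] - i[1:-1, :-2]                 # horizontal difference
    vf = i[1:-1, 1:-1] - i[:-2, 1:-1]                 # vertical difference
    df1 = i[1:-1, 1:-1] - i[:-2, :-2]                 # diagonal differences
    df2 = i[1:-1, 1:-1] - i[:-2, 2:]
    sf = np.sqrt(hf**2 + vf**2 + df1**2 + df2**2)     # spatial frequency map SF
    s = sf[mask > 0]
    nu = s.mean()                                     # spatial frequency mean
    rho = ((s - nu) ** 2).mean()                      # spatial frequency variance
    s_sorted = np.sort(s)
    k = max(1, int(0.01 * s_sorted.size))
    zeta = s_sorted[-k:].mean() - s_sorted[:k].mean() # range: top 1% - bottom 1%
    tau = nu / mu                                     # sensitivity factor (mu from F1)
    return np.array([nu, rho, zeta, tau])             # F3
```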

The specific process of step ⑧ is:

⑧-1. Randomly select t stereoscopic images from the stereoscopic image set to form the training set (t is obtained with the ceiling operator ⌈ ⌉, which rounds up; the exact expression is given as an image in the original), and let the remaining n−t stereoscopic images form the test set;

⑧-2. Form the training sample data set from the feature vectors and mean opinion scores of all stereoscopic images in the training set, denoted Ω_t, {X_k, MOS_k} ∈ Ω_t, where X_k denotes the feature vector of the k-th stereoscopic image in Ω_t, MOS_k denotes the mean opinion score of the k-th stereoscopic image in Ω_t, and 1≤k≤t;

⑧-3. Construct the regression function of the feature vector of each stereoscopic image in the training sample data set Ω_t, denoting the regression function of X_k as f(X_k):

$$f(X_k) = w^T \varphi(X_k) + b,$$

where f() is the function notation, w is the weight vector, w^T is the transpose of w, b is the bias term, φ(X_k) denotes the linear function of X_k, and D(X_k, X_l) is the kernel function in support vector regression,

$$D(X_k, X_l) = \exp\left(-\gamma \|X_k - X_l\|^2\right),$$

where X_l is the feature vector of the l-th stereoscopic image in Ω_t, 1≤l≤t, γ is the kernel parameter, exp() denotes the exponential function with base e, e = 2.71828183, and the symbol "|| ||" denotes the Euclidean distance;

⑧-4. Use support vector regression to train on the feature vectors of all stereoscopic images in the training sample data set Ω_t, such that the error between the regression function values obtained through training and the mean opinion scores is minimized, fitting the optimal weight vector w_opt and the optimal bias term b_opt; denoting the combination of the two as (w_opt, b_opt),

$$(w_{opt}, b_{opt}) = \arg\min_{(w,b)\in\Psi} \sum_{k=1}^{t} \left(f(X_k) - MOS_k\right)^2;$$

then use w_opt and b_opt to construct the support vector regression training model, denoted

$$f(X_{inp}) = (w_{opt})^T \varphi(X_{inp}) + b_{opt},$$

where Ψ denotes the set of all combinations of weight vectors and bias terms used in training on the feature vectors of all stereoscopic images in Ω_t, $\arg\min_{(w,b)\in\Psi} \sum_{k=1}^{t}(f(X_k)-MOS_k)^2$ denotes the values of w and b that minimize $\sum_{k=1}^{t}(f(X_k)-MOS_k)^2$, X_inp denotes the input vector of the support vector regression training model, (w_opt)^T is the transpose of w_opt, and φ(X_inp) denotes the linear function of the input vector X_inp;

⑧-5. Form the test sample data set from the feature vectors and mean opinion scores of all stereoscopic images in the test set; then, according to the support vector regression training model, test the feature vector of each stereoscopic image in the test sample data set and predict the objective visual comfort evaluation value of each such image, denoting the predicted value of the k'-th stereoscopic image in the test sample data set as Q_{k'}, Q_{k'} = f(X_{k'}) = (w_opt)^T φ(X_{k'}) + b_opt, where X_{k'} denotes the feature vector of the k'-th stereoscopic image in the test sample data set, φ(X_{k'}) denotes its linear function, and 1≤k'≤n−t;

⑧-6. Randomly select t stereoscopic images from the stereoscopic image set again to form a new training set, let the remaining n−t stereoscopic images form the test set, and return to step ⑧-2; after N iterations, compute the average of the objective visual comfort evaluation values of each stereoscopic image in the set, and take this average as the final objective visual comfort evaluation value of that image, where N is taken greater than 100.

In step ⑧-3, γ = 54 is taken.
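A sketch of the training/testing loop of step ⑧ using scikit-learn's RBF-kernel SVR (the patent describes its own SVR fit; sklearn is swapped in here as an off-the-shelf substitute). The split size t is assumed to be 80% of the set, since the exact ceiling expression is given only as an image; `gamma` follows step ⑧-3:

```python
import numpy as np
from sklearn.svm import SVR

def predict_visual_comfort(features, mos, n_iters=100, gamma=54):
    """Repeat random train/test splits, fit an RBF SVR on each training set,
    and average each image's predictions over the iterations (the patent
    takes N > 100 iterations).

    `features`: (n, d) array of per-image feature vectors X_i;
    `mos`: (n,) array of mean opinion scores."""
    n = len(features)
    t = int(np.ceil(0.8 * n))                       # assumed split ratio
    preds = np.zeros(n)
    counts = np.zeros(n)
    rng = np.random.default_rng(0)
    for _ in range(n_iters):
        idx = rng.permutation(n)
        train, test = idx[:t], idx[t:]
        model = SVR(kernel='rbf', gamma=gamma)
        model.fit(features[train], mos[train])      # fit w_opt, b_opt
        preds[test] += model.predict(features[test])
        counts[test] += 1
    return preds / np.maximum(counts, 1)            # final objective scores
```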

Compared with the prior art, the present invention has the following advantages:

1) The method of the present invention takes into account the influence of visually important regions on visual comfort; it therefore extracts the visually important region mask of the stereoscopic image from the saliency map of the right-viewpoint image and the right disparity image, and then evaluates only the visually important region according to the mask, thereby effectively improving the correlation between objective evaluation results and subjective perception.

2) The method of the present invention obtains the feature vector of the stereoscopic image from the feature vector reflecting the disparity magnitude features of the right disparity image, the feature vector reflecting its disparity gradient features, and the feature vector reflecting the spatial frequency features of the right-viewpoint image; it then trains on the feature vectors of all stereoscopic images in the set with support vector regression and computes the objective visual comfort evaluation value of each stereoscopic image in the set. Because the extracted feature vector information is highly stable and reflects changes in the visual comfort of stereoscopic images well, the correlation between objective evaluation results and subjective perception is effectively improved.

Description of the Drawings

Fig. 1 is the overall implementation block diagram of the method of the present invention;

Fig. 2a is the right-viewpoint image of "camera";

Fig. 2b is the right disparity image of "camera";

Fig. 2c is the saliency map of the right-viewpoint image of "camera";

Fig. 2d is the visual saliency map of the right-viewpoint image of "camera";

Fig. 2e is the visually important region mask of "camera";

Fig. 3a is the right-viewpoint image of "cup";

Fig. 3b is the right disparity image of "cup";

Fig. 3c is the saliency map of the right-viewpoint image of "cup";

Fig. 3d is the visual saliency map of the right-viewpoint image of "cup";

Fig. 3e is the visually important region mask of "cup";

Fig. 4a is the right-viewpoint image of "infant";

Fig. 4b is the right disparity image of "infant";

Fig. 4c is the saliency map of the right-viewpoint image of "infant";

Fig. 4d is the visual saliency map of the right-viewpoint image of "infant";

Fig. 4e is the visually important region mask of "infant";

Fig. 5 is the scatter plot of the objective visual comfort evaluation values, obtained using the two feature vectors F_1 and F_2, versus the mean opinion scores;

Fig. 6 is the scatter plot of the objective visual comfort evaluation values, obtained using the two feature vectors F_1 and F_3, versus the mean opinion scores;

Fig. 7 is the scatter plot of the objective visual comfort evaluation values, obtained using the two feature vectors F_2 and F_3, versus the mean opinion scores;

Fig. 8 is the scatter plot of the objective visual comfort evaluation values, obtained using the three feature vectors F_1, F_2 and F_3, versus the mean opinion scores.

Detailed Description of Embodiments

The present invention is described in further detail below with reference to the embodiments in the accompanying drawings.

The overall implementation block diagram of the machine-learning-based method for evaluating the visual comfort of stereoscopic images proposed by the present invention is shown in Fig. 1; the method comprises the following steps:

① Denote the left-viewpoint image of the stereoscopic image to be evaluated as {I_L(x,y)}, its right-viewpoint image as {I_R(x,y)}, and its right disparity image as {d_R(x,y)}, where (x,y) denotes the coordinate position of a pixel in {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, 1≤x≤W, 1≤y≤H, W denotes the width of {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, H denotes their height, I_L(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {I_L(x,y)}, I_R(x,y) denotes the pixel value of the pixel at (x,y) in {I_R(x,y)}, and d_R(x,y) denotes the pixel value of the pixel at (x,y) in {d_R(x,y)}.

② Extract the saliency map of {I_R(x,y)}; then, from the saliency map of {I_R(x,y)} and {d_R(x,y)}, obtain the visual saliency map of {I_R(x,y)}; divide the visual saliency map of {I_R(x,y)} into a visually important region and a non-visually important region; finally, from these two regions, obtain the visually important region mask of the stereoscopic image to be evaluated, denoted {M(x,y)}, where M(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {M(x,y)}.

In this embodiment, the specific process of step ② is:

②-1. Extract the saliency map of {I_R(x,y)} using the graph-based visual saliency (GBVS) model, denoted {SM_R(x,y)}, where SM_R(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {SM_R(x,y)}.

②-2. From {SM_R(x,y)} and {d_R(x,y)}, obtain the visual saliency map of {I_R(x,y)}, denoted {D_R(x,y)}, with the pixel value of the pixel at coordinate position (x,y) denoted D_R(x,y) and computed as the weighted combination

$$D_R(x,y) = \omega_{SM} \times SM_R(x,y) + \omega_d \times d_R(x,y),$$

where ω_SM denotes the weight of SM_R(x,y) and ω_d denotes the weight of d_R(x,y); here fixed values are taken for ω_SM and ω_d (the specific values are given as an image in the original).

②-3. According to the pixel value of each pixel in {D_R(x,y)}, divide {D_R(x,y)} into a visually important region and a non-visually important region: the pixel value of every pixel in the visually important region is greater than an adaptive threshold T_1, and the pixel value of every pixel in the non-visually important region is less than or equal to T_1, where T_1 is the threshold obtained by applying Otsu's method to {D_R(x,y)}.

②-4. From the visually important and non-visually important regions of {D_R(x,y)}, obtain the visually important region mask of the stereoscopic image to be evaluated, denoted {M(x,y)}, with the pixel value of the pixel at coordinate position (x,y) denoted M(x,y):

$$M(x,y) = \begin{cases} 1, & D_R(x,y) > T_1 \\ 0, & D_R(x,y) \le T_1 \end{cases}$$

Here, three groups of typical stereoscopic images are used to illustrate the performance of the visually important region mask obtained by the method of the present invention for the stereoscopic image to be evaluated. Fig. 2a and Fig. 2b show the right-viewpoint image and the right disparity image of "camera" respectively, Fig. 2c shows the saliency map of the right-viewpoint image of "camera", Fig. 2d shows the visual saliency map of the right-viewpoint image of "camera", and Fig. 2e shows the visually important region mask of "camera"; Fig. 3a and Fig. 3b show the right-viewpoint image and the right disparity image of "cup" respectively, Fig. 3c shows the saliency map of the right-viewpoint image of "cup", Fig. 3d shows the visual saliency map of the right-viewpoint image of "cup", and Fig. 3e shows the visually important region mask of "cup"; Fig. 4a and Fig. 4b show the right-viewpoint image and the right disparity image of "infant" respectively, Fig. 4c shows the saliency map of the right-viewpoint image of "infant", Fig. 4d shows the visual saliency map of the right-viewpoint image of "infant", and Fig. 4e shows the visually important region mask of "infant". As can be seen from Fig. 2e, Fig. 3e and Fig. 4e, the visually important regions obtained by the method of the present invention reflect the visual comfort of the human eye well.

③ From {d_R(x,y)} and {M(x,y)}, obtain the disparity mean μ, disparity variance δ, maximum negative disparity θ and disparity range χ of the pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; then arrange μ, δ, θ and χ in order to form the feature vector reflecting the disparity magnitude features of {d_R(x,y)}, denoted F_1, F_1 = (μ, δ, θ, χ).

In this embodiment, the specific process of step ③ is:

③-1. From {d_R(x,y)} and {M(x,y)}, compute the disparity mean of all pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted μ:

$$\mu = \frac{\sum_{(x,y)\in\Omega} d_R(x,y) \times M(x,y)}{\sum_{(x,y)\in\Omega} M(x,y)},$$

where Ω denotes the image domain.

③-2. From {d_R(x,y)}, {M(x,y)} and μ, compute the disparity variance of all pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted δ:

$$\delta = \frac{\sum_{(x,y)\in\Omega} (d_R(x,y)-\mu)^2 \times M(x,y)}{\sum_{(x,y)\in\Omega} M(x,y)}.$$

③-3. Compute the maximum negative disparity of the pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted θ, where θ is the mean disparity of the 1% of pixels with the smallest disparity values in that region.

③-4. Compute the disparity range of the pixels of {d_R(x,y)} in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted χ, χ = d_max − d_min, where d_max denotes the mean disparity of the 1% of pixels with the largest disparity values in that region, and d_min denotes the mean disparity of the 1% of pixels with the smallest disparity values in that region.

③-5. Arrange μ, δ, θ and χ in order to form the feature vector reflecting the disparity magnitude features of {d_R(x,y)}, denoted F_1, F_1 = (μ, δ, θ, χ); the dimension of F_1 is 4.

④ Compute the disparity gradient edge image of {d_R(x,y)} from its disparity gradient magnitude image and disparity gradient direction image; then, from the disparity gradient edge image and {M(x,y)}, compute the gradient mean ψ of all pixels of the disparity gradient edge image in the region corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; finally take ψ as the feature vector reflecting the disparity gradient features of {d_R(x,y)}, denoted F_2.

In this embodiment, the specific process of step ④ is:

④-1. Compute the disparity gradient magnitude image of {d_R(x,y)}, denoted {m(x,y)}, with the gradient magnitude of the pixel at coordinate position (x,y) denoted m(x,y):

$$m(x,y) = \sqrt{G_x(x,y)^2 + G_y(x,y)^2},$$

where G_x(x,y) denotes the horizontal gradient value of the pixel at coordinate position (x,y) and G_y(x,y) denotes its vertical gradient value.

④-2. Compute the disparity gradient direction image of {d_R(x,y)}, denoted {θ(x,y)}, with the gradient direction value of the pixel at coordinate position (x,y) denoted θ(x,y), θ(x,y) = arctan(G_y(x,y)/G_x(x,y)), where arctan() is the arctangent function.

④-3. From {m(x,y)} and {θ(x,y)}, compute the disparity gradient edge image of {d_R(x,y)}, denoted {E(x,y)}, with the gradient edge value of the pixel at coordinate position p denoted E(p); E(p) accumulates, over the neighborhood window N(p) centered at p, the product of a spatial Gaussian G_s(||p−q||), an orientation Gaussian G_o(||θ⃗(p)−θ⃗(q)||), and the gradient magnitude m(q) normalized over the neighborhood window N(q) by means of m(q') and the control parameter ε_g (the full expression is given as an image in the original). Here G_s(||p−q||) denotes a Gaussian function with standard deviation σ_s, taken here as σ_s = 0.4,

$$G_s(\|p-q\|) = \exp\left(-\frac{\|p-q\|^2}{2\sigma_s^2}\right),$$

||p−q|| denotes the Euclidean distance between coordinate positions p and q, the symbol "|| ||" denoting the Euclidean distance, and G_o(||θ⃗(p)−θ⃗(q)||) denotes a Gaussian function with standard deviation σ_o, taken here as σ_o = 0.4,

$$G_o(\|\vec{\theta}(p)-\vec{\theta}(q)\|) = \exp\left(-\frac{\|\vec{\theta}(p)-\vec{\theta}(q)\|^2}{2\sigma_o^2}\right),$$

where ||θ⃗(p)−θ⃗(q)|| denotes the Euclidean distance between θ⃗(p) and θ⃗(q), θ⃗(p) = [sin(θ(p)), cos(θ(p))], θ⃗(q) = [sin(θ(q)), cos(θ(q))], θ(p) denotes the gradient direction value of the pixel at coordinate position p in {θ(x,y)}, θ(q) denotes the gradient direction value of the pixel at coordinate position q in {θ(x,y)}, m(q) denotes the gradient magnitude of the pixel at coordinate position q in {m(x,y)}, m(q') denotes the gradient magnitude of the pixel at coordinate position q' in {m(x,y)}, ε_g is a control parameter, taken here as ε_g = 0.5, the symbol "[]" denotes a vector, exp() denotes the exponential function with base e, e = 2.71828183, N(p) denotes the neighborhood window centered on the pixel at coordinate position p, and N(q) denotes the neighborhood window centered on the pixel at coordinate position q; here the sizes of N(p) and N(q) are both 3×3.

④-4. From $\{E(x,y)\}$ and $\{M(x,y)\}$, compute the mean gradient value of all pixels in the region of $\{E(x,y)\}$ corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$, denoted $\psi$:
$$\psi=\frac{\sum_{(x,y)\in\Omega}E(x,y)\times M(x,y)}{\sum_{(x,y)\in\Omega}M(x,y)},$$
where $\Omega$ denotes the image domain and $E(x,y)$ is the gradient edge value of the pixel at coordinate position $(x,y)$ in $\{E(x,y)\}$.

④-5. Take $\psi$ as the feature vector reflecting the disparity gradient feature of $\{d_R(x,y)\}$, denoted $F_2$; the dimension of $F_2$ is 1.
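A minimal sketch of steps ④-3 and ④-4 follows. The exact composite rule for $E(p)$ is given only as an image in the original, so the combination used here (a spatially and orientation-weighted sum of neighbouring gradient magnitudes, normalised by the local maximum magnitude plus $\varepsilon_g$) is an assumption; the kernels $G_s$ and $G_o$ and the parameter values do match the text. The $[\sin\theta,\cos\theta]$ embedding makes the orientation distance insensitive to angle wrap-around.

```python
import numpy as np

def gradient_edge_image(m, theta, sigma_s=0.4, sigma_o=0.4, eps_g=0.5, r=1):
    """Assumed form of step 4-3: weighted neighbourhood sum of gradient magnitudes."""
    H, W = m.shape
    vec = np.stack([np.sin(theta), np.cos(theta)], axis=-1)  # theta_vec(p) per pixel
    E = np.zeros_like(m, dtype=np.float64)
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - r), min(H, y + r + 1)        # 3x3 window N(p)
            x0, x1 = max(0, x - r), min(W, x + r + 1)
            acc = 0.0
            for qy in range(y0, y1):
                for qx in range(x0, x1):
                    d_sp = np.hypot(qy - y, qx - x)                    # ||p - q||
                    d_or = np.linalg.norm(vec[y, x] - vec[qy, qx])     # ||theta_vec(p) - theta_vec(q)||
                    acc += (np.exp(-d_sp ** 2 / (2 * sigma_s ** 2)) *  # G_s
                            np.exp(-d_or ** 2 / (2 * sigma_o ** 2)) *  # G_o
                            m[qy, qx])
            # Assumed normalisation by the local maximum magnitude plus eps_g
            E[y, x] = acc / (m[y0:y1, x0:x1].max() + eps_g)
    return E

def masked_mean(E, M):
    """Step 4-4: psi, the mean edge value over the visually important region."""
    return (E * M).sum() / M.sum()
```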

⑤ Obtain the spatial frequency image of $\{I_R(x,y)\}$; then, from this spatial frequency image and $\{M(x,y)\}$, obtain the spatial frequency mean $\nu$, spatial frequency variance $\rho$, spatial frequency range $\zeta$ and spatial frequency sensitivity factor $\tau$ of the pixels in the region corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$; then arrange $\nu$, $\rho$, $\zeta$ and $\tau$ in order to form the feature vector reflecting the spatial frequency features of $\{I_R(x,y)\}$, denoted $F_3$, $F_3=(\nu,\rho,\zeta,\tau)$.

In this specific embodiment, the concrete process of step ⑤ is as follows:

⑤-1. Compute the spatial frequency image of $\{I_R(x,y)\}$, denoted $\{SF(x,y)\}$; the spatial frequency value of the pixel at coordinate position $(x,y)$ is denoted $SF(x,y)$,
$$SF(x,y)=\sqrt{(HF(x,y))^2+(VF(x,y))^2+(DF(x,y))^2},$$
where $HF(x,y)$ is the horizontal-direction frequency value of the pixel at $(x,y)$ in $\{I_R(x,y)\}$,
$$HF(x,y)=\sqrt{\frac{\sum_{m=-1}^{1}\sum_{n=0}^{1}\left(I_R(x+m,y+n)-I_R(x+m,y+n-1)\right)^2}{3\times 2}},$$
$VF(x,y)$ is the vertical-direction frequency value,
$$VF(x,y)=\sqrt{\frac{\sum_{m=0}^{1}\sum_{n=-1}^{1}\left(I_R(x+m,y+n)-I_R(x+m-1,y+n)\right)^2}{2\times 3}},$$
and $DF(x,y)$ is the diagonal-direction frequency value,
$$DF(x,y)=\sqrt{\frac{\sum_{m=0}^{1}\sum_{n=0}^{1}\left(I_R(x+m,y+n)-I_R(x+m-1,y+n-1)\right)^2}{2\times 2}+\frac{\sum_{m=-1}^{0}\sum_{n=0}^{1}\left(I_R(x+m,y+n)-I_R(x+m+1,y+n-1)\right)^2}{2\times 2}}.$$
Here $I_R(x+m,y+n)$ denotes the pixel value of the pixel at coordinate position $(x+m,y+n)$ in $\{I_R(x,y)\}$, and likewise for the other shifted coordinates. Out-of-range coordinates are clamped to the image border: if a horizontal coordinate falls below 1 it is replaced by 1, and if it exceeds $W$ it is replaced by $W$; if a vertical coordinate falls below 1 it is replaced by 1, and if it exceeds $H$ it is replaced by $H$.
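A minimal sketch of step ⑤-1, assuming each directional term is the root mean square of the corresponding pixel differences (the radical signs are lost in the extracted formulas) and implementing the border clamping with edge padding:

```python
import numpy as np

def spatial_frequency_image(I):
    """Step 5-1: SF image. I is indexed as I[x, y] to mirror I_R(x, y)."""
    P = np.pad(I.astype(np.float64), 2, mode='edge')  # clamp out-of-range coordinates
    W, H = I.shape
    SF = np.zeros((W, H))
    for x in range(W):
        for y in range(H):
            px, py = x + 2, y + 2                     # position of (x, y) inside P
            # hf, vf, df are the squared directional terms HF^2, VF^2, DF^2
            hf = sum((P[px + m, py + n] - P[px + m, py + n - 1]) ** 2
                     for m in (-1, 0, 1) for n in (0, 1)) / 6.0
            vf = sum((P[px + m, py + n] - P[px + m - 1, py + n]) ** 2
                     for m in (0, 1) for n in (-1, 0, 1)) / 6.0
            df = (sum((P[px + m, py + n] - P[px + m - 1, py + n - 1]) ** 2
                      for m in (0, 1) for n in (0, 1)) / 4.0 +
                  sum((P[px + m, py + n] - P[px + m + 1, py + n - 1]) ** 2
                      for m in (-1, 0) for n in (0, 1)) / 4.0)
            SF[x, y] = np.sqrt(hf + vf + df)          # sqrt(HF^2 + VF^2 + DF^2)
    return SF
```

The double loop is written for clarity; the same differences can be vectorised with array slicing when speed matters.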

⑤-2. From $\{SF(x,y)\}$ and $\{M(x,y)\}$, compute the spatial frequency mean of all pixels in the region of $\{SF(x,y)\}$ corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$, denoted $\nu$: $\nu=\frac{\sum_{(x,y)\in\Omega}SF(x,y)\times M(x,y)}{\sum_{(x,y)\in\Omega}M(x,y)}$, where $\Omega$ denotes the image domain.

⑤-3. From $\{SF(x,y)\}$, $\{M(x,y)\}$ and $\nu$, compute the spatial frequency variance of all pixels in the same region, denoted $\rho$: $\rho=\frac{\sum_{(x,y)\in\Omega}(SF(x,y)-\nu)^2\times M(x,y)}{\sum_{(x,y)\in\Omega}M(x,y)}$.

⑤-4. Compute the spatial frequency range of the pixels in the region of $\{SF(x,y)\}$ corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$, denoted $\zeta$: $\zeta=SF_{max}-SF_{min}$, where $SF_{max}$ is the mean spatial frequency of the 1% of pixels with the largest spatial frequency values in that region, and $SF_{min}$ is the mean spatial frequency of the 1% of pixels with the smallest spatial frequency values in that region.

⑤-5. Compute the spatial frequency sensitivity factor of the pixels in the same region, denoted $\tau$: $\tau=\nu/\mu$, where $\mu$ is the disparity mean obtained in step ③.

⑤-6. Arrange $\nu$, $\rho$, $\zeta$ and $\tau$ in order to form the feature vector reflecting the spatial frequency features of $\{I_R(x,y)\}$, denoted $F_3$, $F_3=(\nu,\rho,\zeta,\tau)$; the dimension of $F_3$ is 4.
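A sketch of steps ⑤-2 to ⑤-6 under the same conventions, where `M` is the binary visually-important-region mask and `mu` is the disparity mean from step ③:

```python
import numpy as np

def spatial_frequency_features(SF, M, mu):
    """Steps 5-2 to 5-6: masked statistics of the spatial frequency image."""
    vals = SF[M > 0]                                # SF values inside the mask
    nu = vals.mean()                                # spatial frequency mean
    rho = ((vals - nu) ** 2).mean()                 # spatial frequency variance
    k = max(1, int(round(0.01 * vals.size)))        # 1% of the masked pixels
    s = np.sort(vals)
    zeta = s[-k:].mean() - s[:k].mean()             # range: top-1% mean minus bottom-1% mean
    tau = nu / mu                                   # sensitivity factor
    return np.array([nu, rho, zeta, tau])           # feature vector F_3
```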

⑥ Combine $F_1$, $F_2$ and $F_3$ into a new feature vector, denoted $X$, $X=[F_1,F_2,F_3]$, and take $X$ as the feature vector of the stereoscopic image to be evaluated, where the symbol "[]" denotes a vector and $[F_1,F_2,F_3]$ means concatenating $F_1$, $F_2$ and $F_3$ to form a new feature vector.

⑦ Establish a stereoscopic image set from $n$ different stereoscopic images and their corresponding right disparity images; use an existing subjective quality evaluation method to compute the mean opinion score of the visual comfort of each stereoscopic image in the set, denoted MOS, where $n\ge 1$ and $MOS\in[1,5]$; then, following the operations of steps ① to ⑥ for computing the feature vector $X$ of the stereoscopic image to be evaluated, compute the feature vector of each stereoscopic image in the set in the same way, the feature vector of the $i$-th stereoscopic image being denoted $X_i$, where $1\le i\le n$ and $n$ is the number of stereoscopic images in the set.

In this embodiment, the stereoscopic image database provided by the Image and Video Systems Laboratory of the Korea Advanced Institute of Science and Technology is used as the stereoscopic image set. It contains 120 stereoscopic images and their corresponding right disparity images, covering indoor and outdoor scenes of various depths, and it provides the mean opinion score of the visual comfort of each stereoscopic image.

⑧ Divide all stereoscopic images in the set into a training set and a test set; form the training sample data set from the feature vectors and mean opinion scores of all stereoscopic images in the training set, and the test sample data set from those of the test set. Then, using support vector regression as the machine learning method, train on the feature vectors of all stereoscopic images in the training sample data set so that the error between the regression function values obtained through training and the mean opinion scores is minimized, fitting the optimal weight vector $w_{opt}$ and the optimal bias term $b_{opt}$; construct the support vector regression training model from $w_{opt}$ and $b_{opt}$. Then, according to this model, test the feature vector of each stereoscopic image in the test sample data set and predict its objective visual comfort evaluation value; the predicted value for the $k'$-th stereoscopic image in the test sample data set is denoted $Q_{k'}$, $Q_{k'}=f(X_{k'})$, $f(X_{k'})=(w_{opt})^T\varphi(X_{k'})+b_{opt}$, where $f(\cdot)$ is the function representation, $X_{k'}$ is the feature vector of the $k'$-th stereoscopic image in the test sample data set, $(w_{opt})^T$ is the transpose of $w_{opt}$, $\varphi(X_{k'})$ denotes the linear function of the $k'$-th stereoscopic image in the test sample data set, $1\le k'\le n-t$, and $t$ is the number of stereoscopic images in the training set. Afterwards, by reassigning the training and test sets, the objective visual comfort evaluation value of each stereoscopic image in the test sample data set is predicted anew; after $N$ iterations, the average of the objective visual comfort evaluation values of each stereoscopic image is computed and taken as the final objective visual comfort value of that image, where $N$ is taken greater than 100 so that every stereoscopic image in the set receives a predicted value; in this embodiment $N=200$.

In this specific embodiment, the concrete process of step ⑧ is as follows:

⑧-1. Randomly select $t$ stereoscopic images from the stereoscopic image set to form the training set [the expression for $t$, which uses the round-up operation, appears only as an image in the original], and let the remaining $n-t$ stereoscopic images form the test set, where the symbol $\lceil\,\rceil$ denotes rounding up.

⑧-2. Form the training sample data set from the feature vectors and mean opinion scores of all stereoscopic images in the training set, denoted $\Omega_t$, $\{X_k,MOS_k\}\in\Omega_t$, where $X_k$ is the feature vector of the $k$-th stereoscopic image in $\Omega_t$, $MOS_k$ is the mean opinion score of the $k$-th stereoscopic image in $\Omega_t$, and $1\le k\le t$.

⑧-3. Construct the regression function of the feature vector of each stereoscopic image in the training sample data set $\Omega_t$; the regression function of $X_k$ is denoted $f(X_k)$, $f(X_k)=w^T\varphi(X_k)+b$, where $f(\cdot)$ is the function representation, $w$ is the weight vector, $w^T$ is the transpose of $w$, $b$ is the bias term, and the values of $w$ and $b$ are obtained through training. $\varphi(X_k)$ denotes the linear function of $X_k$, defined through $D(X_k,X_l)$, the kernel function in support vector regression [the expressions for $\varphi(X_k)$ and $D(X_k,X_l)$ appear only as images in the original; $D$ is a Gaussian function of the Euclidean distance $\|X_k-X_l\|$]. $X_l$ is the feature vector of the $l$-th stereoscopic image in $\Omega_t$, $1\le l\le t$; $\gamma$ is the kernel parameter, which reflects the range of the input sample values (the larger the range, the larger $\gamma$), and in this embodiment $\gamma=54$; $\exp(\cdot)$ is the exponential function with base $e$, $e=2.71828183$; the symbol "$\|\ \|$" denotes the Euclidean distance.

⑧-4. Use support vector regression to train on the feature vectors of all stereoscopic images in $\Omega_t$ so that the error between the regression function values obtained through training and the mean opinion scores is minimized, fitting the optimal weight vector $w_{opt}$ and the optimal bias term $b_{opt}$; their combination is denoted $(w_{opt},b_{opt})$,
$$(w_{opt},b_{opt})=\arg\min_{(w,b)\in\Psi}\sum_{k=1}^{t}\left(f(X_k)-MOS_k\right)^2,$$
and the support vector regression training model constructed from $w_{opt}$ and $b_{opt}$ is $f(X_{inp})=(w_{opt})^T\varphi(X_{inp})+b_{opt}$, where $\Psi$ denotes the set of all combinations of weight vectors and bias terms considered when training on the feature vectors of all stereoscopic images in $\Omega_t$, $\arg\min_{(w,b)\in\Psi}\sum_{k=1}^{t}(f(X_k)-MOS_k)^2$ denotes the values of $w$ and $b$ that minimize $\sum_{k=1}^{t}(f(X_k)-MOS_k)^2$, $X_{inp}$ is the input vector of the model, $(w_{opt})^T$ is the transpose of $w_{opt}$, and $\varphi(X_{inp})$ denotes the linear function of the input vector $X_{inp}$.

⑧-5. Form the test sample data set from the feature vectors and mean opinion scores of all stereoscopic images in the test set; then, according to the support vector regression training model, test the feature vector of each stereoscopic image in the test sample data set and predict its objective visual comfort evaluation value; the predicted value for the $k'$-th stereoscopic image is denoted $Q_{k'}$, $Q_{k'}=f(X_{k'})=(w_{opt})^T\varphi(X_{k'})+b_{opt}$, where $X_{k'}$ is the feature vector of the $k'$-th stereoscopic image in the test sample data set, $\varphi(X_{k'})$ denotes its linear function, and $1\le k'\le n-t$.

⑧-6. Randomly re-select the same number of stereoscopic images from the set to form a new training set [the count expression appears only as an image in the original], let the remaining $n-t$ stereoscopic images form the test set, and return to step ⑧-2. After $N$ iterations, compute the average of the objective visual comfort evaluation values of each stereoscopic image in the set and take this average as the final objective visual comfort evaluation value of that image, where $N$ is taken greater than 100.
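A sketch of this step-⑧ protocol using scikit-learn's RBF-kernel SVR in place of the explicit $(w_{opt},b_{opt})$ formulation. The training-set fraction is an assumption, since the exact count expression appears only as an image in the original; `X` is an $(n,d)$ feature array and `mos` the matching score array.

```python
import numpy as np
from sklearn.svm import SVR

def predict_comfort(X, mos, n_iter=200, train_frac=0.8, seed=0):
    """Average SVR predictions over n_iter random train/test splits (step 8)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    t = int(np.ceil(train_frac * n))          # round-up training-set size (assumed fraction)
    pred_sum = np.zeros(n)
    pred_cnt = np.zeros(n)
    for _ in range(n_iter):
        idx = rng.permutation(n)
        tr, te = idx[:t], idx[t:]
        model = SVR(kernel='rbf', gamma='scale')  # kernel parameter left to sklearn
        model.fit(X[tr], mos[tr])
        pred_sum[te] += model.predict(X[te])      # accumulate test-set predictions
        pred_cnt[te] += 1
    # Final objective comfort score: mean prediction over the iterations
    return pred_sum / np.maximum(pred_cnt, 1)
```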

In this embodiment, four objective criteria commonly used for assessing image quality evaluation methods are adopted as evaluation indicators: the Pearson linear correlation coefficient (PLCC) under nonlinear regression, the Spearman rank-order correlation coefficient (SROCC), the Kendall rank-order correlation coefficient (KROCC), and the root mean squared error (RMSE). PLCC and RMSE reflect the accuracy of the objective predicted values, while SROCC and KROCC reflect their monotonicity. The objective visual comfort predictions computed for the 120 stereoscopic images are fitted with a five-parameter logistic function; higher PLCC, SROCC and KROCC values and a lower RMSE value indicate better correlation between the results of the proposed objective evaluation method and the mean opinion scores. Table 1 lists the correlations between the visual comfort predictions obtained with different feature vectors and the mean opinion scores. As Table 1 shows, the correlation obtained with only two of the feature vectors is never optimal, and the feature vector composed of disparity magnitude features influences the evaluation performance more than the other two. This indicates that the three feature vectors extracted by the method of the invention are effective, and that combining the disparity magnitude, disparity gradient and spatial frequency features yields predictions that correlate more strongly with the mean opinion scores, which demonstrates the effectiveness of the method.
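A sketch of this evaluation protocol with SciPy; the five-parameter logistic form below is the standard VQEG mapping, assumed here since the original does not spell it out:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr, spearmanr, kendalltau

def logistic5(q, b1, b2, b3, b4, b5):
    """Standard five-parameter logistic mapping of objective scores."""
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (q - b3)))) + b4 * q + b5

def evaluate(pred, mos):
    """PLCC / SROCC / KROCC / RMSE of predictions against MOS values."""
    p0 = [np.ptp(mos), 1.0, np.mean(pred), 0.0, np.mean(mos)]   # initial guess
    beta, _ = curve_fit(logistic5, pred, mos, p0=p0, maxfev=20000)
    fitted = logistic5(pred, *beta)
    return {
        'PLCC': pearsonr(fitted, mos)[0],
        'SROCC': spearmanr(pred, mos)[0],   # rank metrics need no logistic fitting
        'KROCC': kendalltau(pred, mos)[0],
        'RMSE': float(np.sqrt(np.mean((fitted - mos) ** 2))),
    }
```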

Fig. 5 shows the scatter plot of the objective visual comfort predictions obtained with feature vectors $F_1$ and $F_2$ against the mean opinion scores; Fig. 6 shows the same for $F_1$ and $F_3$, Fig. 7 for $F_2$ and $F_3$, and Fig. 8 for all three feature vectors $F_1$, $F_2$ and $F_3$. The more concentrated the points in a scatter plot, the better the consistency between the objective results and subjective perception. As Figs. 5 to 8 show, the scatter points obtained with the method of the invention are relatively concentrated and agree well with the subjective evaluation data.

Table 1. Correlation between the visual comfort predictions obtained with different feature vectors and the mean opinion scores

| | F1+F2 | F1+F3 | F2+F3 | F1+F2+F3 |
|---|---|---|---|---|
| PLCC | 0.8348 | 0.8548 | 0.7543 | 0.8716 |
| SROCC | 0.7966 | 0.8045 | 0.7093 | 0.8329 |
| KROCC | 0.6084 | 0.6120 | 0.5231 | 0.6454 |
| RMSE | 0.4430 | 0.4176 | 0.5283 | 0.3944 |

Claims (10)

1. A method for evaluating the visual comfort of stereoscopic images based on machine learning, characterized by comprising the following steps:

① Denote the left-viewpoint image of the stereoscopic image to be evaluated as $\{I_L(x,y)\}$, its right-viewpoint image as $\{I_R(x,y)\}$, and its right disparity image as $\{d_R(x,y)\}$, where $(x,y)$ denotes the coordinate position of a pixel in $\{I_L(x,y)\}$, $\{I_R(x,y)\}$ and $\{d_R(x,y)\}$, $1\le x\le W$, $1\le y\le H$, $W$ is the width and $H$ the height of $\{I_L(x,y)\}$, $\{I_R(x,y)\}$ and $\{d_R(x,y)\}$, and $I_L(x,y)$, $I_R(x,y)$ and $d_R(x,y)$ denote the pixel values of the pixel at coordinate position $(x,y)$ in the respective images;

② Extract the saliency map of $\{I_R(x,y)\}$; then, from this saliency map and $\{d_R(x,y)\}$, obtain the visual saliency map of $\{I_R(x,y)\}$; divide the visual saliency map into a visually important region and a non-visually-important region; finally, from these two regions, obtain the visually important region mask of the stereoscopic image to be evaluated, denoted $\{M(x,y)\}$, where $M(x,y)$ denotes the pixel value of the pixel at coordinate position $(x,y)$ in $\{M(x,y)\}$;

③ From $\{d_R(x,y)\}$ and $\{M(x,y)\}$, obtain the disparity mean $\mu$, disparity variance $\delta$, maximum negative disparity $\theta$ and disparity range $\chi$ of the pixels in the region of $\{d_R(x,y)\}$ corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$; then arrange $\mu$, $\delta$, $\theta$ and $\chi$ in order to form the feature vector reflecting the disparity magnitude features of $\{d_R(x,y)\}$, denoted $F_1$, $F_1=(\mu,\delta,\theta,\chi)$;

④ By computing the disparity gradient magnitude image and the disparity gradient direction image of $\{d_R(x,y)\}$, compute the disparity gradient edge image of $\{d_R(x,y)\}$; then, from this edge image and $\{M(x,y)\}$, compute the mean gradient value $\psi$ of all pixels in the region of the edge image corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$; finally take $\psi$ as the feature vector reflecting the disparity gradient feature of $\{d_R(x,y)\}$, denoted $F_2$;

⑤ Obtain the spatial frequency image of $\{I_R(x,y)\}$; then, from this spatial frequency image and $\{M(x,y)\}$, obtain the spatial frequency mean $\nu$, spatial frequency variance $\rho$, spatial frequency range $\zeta$ and spatial frequency sensitivity factor $\tau$ of the pixels in the region corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$; then arrange $\nu$, $\rho$, $\zeta$ and $\tau$ in order to form the feature vector reflecting the spatial frequency features of $\{I_R(x,y)\}$, denoted $F_3$, $F_3=(\nu,\rho,\zeta,\tau)$;

⑥ Combine $F_1$, $F_2$ and $F_3$ into a new feature vector, denoted $X$, $X=[F_1,F_2,F_3]$, and take $X$ as the feature vector of the stereoscopic image to be evaluated, where the symbol "[]" denotes a vector and $[F_1,F_2,F_3]$ means concatenating $F_1$, $F_2$ and $F_3$ to form a new feature vector;

⑦ Establish a stereoscopic image set from $n$ different stereoscopic images and their corresponding right disparity images; use a subjective quality evaluation method to compute the mean opinion score of the visual comfort of each stereoscopic image in the set, denoted MOS, where $n\ge 1$ and $MOS\in[1,5]$; then, following the operations of steps ① to ⑥, compute the feature vector of each stereoscopic image in the set in the same way, the feature vector of the $i$-th stereoscopic image being denoted $X_i$, where $1\le i\le n$ and $n$ is the number of stereoscopic images in the set;

⑧ Divide all stereoscopic images in the set into a training set and a test set; form the training sample data set from the feature vectors and mean opinion scores of the training set, and the test sample data set from those of the test set; then use support vector regression as the machine learning method to train on the feature vectors of all stereoscopic images in the training sample data set so that the error between the regression function values obtained through training and the mean opinion scores is minimized, fitting the optimal weight vector $w_{opt}$ and the optimal bias term $b_{opt}$; construct the support vector regression training model from $w_{opt}$ and $b_{opt}$; then, according to this model, test the feature vector of each stereoscopic image in the test sample data set and predict its objective visual comfort evaluation value, the predicted value for the $k'$-th stereoscopic image being denoted $Q_{k'}$, $Q_{k'}=f(X_{k'})$, $f(X_{k'})=(w_{opt})^T\varphi(X_{k'})+b_{opt}$, where $f(\cdot)$ is the function representation, $X_{k'}$ is the feature vector of the $k'$-th stereoscopic image in the test sample data set, $(w_{opt})^T$ is the transpose of $w_{opt}$, $\varphi(X_{k'})$ denotes the linear function of the $k'$-th stereoscopic image, $1\le k'\le n-t$, and $t$ is the number of stereoscopic images in the training set; afterwards, by reassigning the training and test sets, predict anew the objective visual comfort evaluation value of each stereoscopic image in the test sample data set; after $N$ iterations, compute the average of the objective visual comfort evaluation values of each stereoscopic image and take it as the final objective visual comfort value of that image, where $N$ is taken greater than 100.
2. The machine-learning-based stereoscopic image visual comfort evaluation method according to claim 1, characterized in that the concrete process of step ② is:

②-1. Use the visual saliency model based on graph theory to extract the saliency map of $\{I_R(x,y)\}$, denoted $\{SM_R(x,y)\}$, where $SM_R(x,y)$ denotes the pixel value of the pixel at coordinate position $(x,y)$ in $\{SM_R(x,y)\}$;

②-2. From $\{SM_R(x,y)\}$ and $\{d_R(x,y)\}$, obtain the visual saliency map of $\{I_R(x,y)\}$, denoted $\{D_R(x,y)\}$; the pixel value of the pixel at coordinate position $(x,y)$ is denoted $D_R(x,y)$ and is a weighted combination of $SM_R(x,y)$ and $d_R(x,y)$ [the expression and the two weights appear only as images in the original];

②-3. According to the pixel value of each pixel in $\{D_R(x,y)\}$, divide $\{D_R(x,y)\}$ into a visually important region and a non-visually-important region: the pixel value of every pixel in the visually important region is greater than the adaptive threshold $T_1$, and the pixel value of every pixel in the non-visually-important region is less than or equal to $T_1$, where $T_1$ is the threshold obtained by applying the Otsu method to $\{D_R(x,y)\}$;

②-4. From the visually important and non-visually-important regions of $\{D_R(x,y)\}$, obtain the visually important region mask of the stereoscopic image to be evaluated, denoted $\{M(x,y)\}$; the pixel value of the pixel at coordinate position $(x,y)$ is
$$M(x,y)=\begin{cases}1, & D_R(x,y)>T_1\\ 0, & D_R(x,y)\le T_1.\end{cases}$$
3. The machine-learning-based stereoscopic image visual comfort evaluation method according to claim 2, characterized in that in step ②-2 the weights of $SM_R(x,y)$ and $d_R(x,y)$ take fixed values [given only as an image in the original].
4. The machine-learning-based stereoscopic image visual comfort evaluation method according to any one of claims 1 to 3, characterized in that the concrete process of step ③ is:

③-1. From $\{d_R(x,y)\}$ and $\{M(x,y)\}$, compute the disparity mean of all pixels in the region of $\{d_R(x,y)\}$ corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$, denoted $\mu$: $\mu=\frac{\sum_{(x,y)\in\Omega}d_R(x,y)\times M(x,y)}{\sum_{(x,y)\in\Omega}M(x,y)}$, where $\Omega$ denotes the image domain;

③-2. From $\{d_R(x,y)\}$, $\{M(x,y)\}$ and $\mu$, compute the disparity variance of all pixels in the same region, denoted $\delta$: $\delta=\frac{\sum_{(x,y)\in\Omega}(d_R(x,y)-\mu)^2\times M(x,y)}{\sum_{(x,y)\in\Omega}M(x,y)}$;

③-3. Compute the maximum negative disparity of the pixels in the same region, denoted $\theta$, where $\theta$ is the mean disparity of the 1% of pixels in that region with the smallest disparity values;

③-4. Compute the disparity range of the pixels in the same region, denoted $\chi$: $\chi=d_{max}-d_{min}$, where $d_{max}$ is the mean disparity of the 1% of pixels in that region with the largest disparity values and $d_{min}$ the mean disparity of the 1% of pixels with the smallest disparity values;

③-5. Arrange $\mu$, $\delta$, $\theta$ and $\chi$ in order to form the feature vector reflecting the disparity magnitude features of $\{d_R(x,y)\}$, denoted $F_1$, $F_1=(\mu,\delta,\theta,\chi)$; the dimension of $F_1$ is 4.

5. The machine-learning-based stereoscopic image visual comfort evaluation method according to claim 4, characterized in that the concrete process of step ④ is:

④-1. Compute the disparity gradient magnitude image of $\{d_R(x,y)\}$, denoted $\{m(x,y)\}$; the gradient magnitude of the pixel at coordinate position $(x,y)$ is denoted $m(x,y)$, $m(x,y)=\sqrt{G_x(x,y)^2+G_y(x,y)^2}$, where $G_x(x,y)$ and $G_y(x,y)$ denote the horizontal and vertical gradient values of the pixel at coordinate position $(x,y)$ in $\{d_R(x,y)\}$;

④-2. Compute the disparity gradient direction image of $\{d_R(x,y)\}$, denoted $\{\theta(x,y)\}$; the gradient direction value of the pixel at coordinate position $(x,y)$ is denoted $\theta(x,y)$, $\theta(x,y)=\arctan(G_y(x,y)/G_x(x,y))$, where $\arctan(\cdot)$ is the arctangent function;

④-3. From $\{m(x,y)\}$ and $\{\theta(x,y)\}$, compute the disparity gradient edge image of $\{d_R(x,y)\}$, denoted $\{E(x,y)\}$, and denote the gradient edge value of the pixel at coordinate position $p$ in $\{E(x,y)\}$ as $E(p)$ [the composite expression for $E(p)$ appears only as an image in the original]. In that expression, $G_s(\|p-q\|)$ is a Gaussian function with standard deviation $\sigma_s$, $G_s(\|p-q\|)=\exp\left(-\frac{\|p-q\|^2}{2\sigma_s^2}\right)$, where $\|p-q\|$ is the Euclidean distance between coordinate positions $p$ and $q$ and the symbol "$\|\ \|$" denotes the Euclidean distance; $G_o(\|\vec{\theta}(p)-\vec{\theta}(q)\|)$ is a Gaussian function with standard deviation $\sigma_o$, $G_o(\|\vec{\theta}(p)-\vec{\theta}(q)\|)=\exp\left(-\frac{\|\vec{\theta}(p)-\vec{\theta}(q)\|^2}{2\sigma_o^2}\right)$, where $\|\vec{\theta}(p)-\vec{\theta}(q)\|$ is the Euclidean distance between $\vec{\theta}(p)=[\sin(\theta(p)),\cos(\theta(p))]$ and $\vec{\theta}(q)=[\sin(\theta(q)),\cos(\theta(q))]$, and $\theta(p)$ and $\theta(q)$ are the gradient direction values of the pixels at coordinate positions $p$ and $q$ in $\{\theta(x,y)\}$; $m(q)$ is the gradient magnitude of the pixel at coordinate position $q$ in $\{m(x,y)\}$ and $m(q')$ that of the pixel at coordinate position $q'$; $\varepsilon_g$ is a control parameter; the symbol "[]" denotes a vector; $\exp(\cdot)$ is the exponential function with base $e$, $e=2.71828183$; $\mathcal{N}(p)$ and $\mathcal{N}(q)$ denote the neighborhood windows centered on the pixels at coordinate positions $p$ and $q$;

④-4. From $\{E(x,y)\}$ and $\{M(x,y)\}$, compute the mean gradient value of all pixels in the region of $\{E(x,y)\}$ corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$, denoted $\psi$: $\psi=\frac{\sum_{(x,y)\in\Omega}E(x,y)\times M(x,y)}{\sum_{(x,y)\in\Omega}M(x,y)}$, where $\Omega$ denotes the image domain and $E(x,y)$ is the gradient edge value of the pixel at coordinate position $(x,y)$ in $\{E(x,y)\}$;

④-5. Take $\psi$ as the feature vector reflecting the disparity gradient feature of $\{d_R(x,y)\}$, denoted $F_2$; the dimension of $F_2$ is 1.
6. The machine-learning-based stereoscopic image visual comfort evaluation method according to claim 5, characterized in that in step ④-3, $\sigma_s=0.4$, $\sigma_o=0.4$ and $\varepsilon_g=0.5$.

7. The machine-learning-based stereoscopic image visual comfort evaluation method according to claim 6, characterized in that in step ④-3 the neighborhood windows $\mathcal{N}(p)$ and $\mathcal{N}(q)$ are both of size $3\times 3$.
8. The machine-learning-based stereoscopic image visual comfort evaluation method according to claim 7, characterized in that the concrete process of step ⑤ is:

⑤-1. Compute the spatial frequency image of $\{I_R(x,y)\}$, denoted $\{SF(x,y)\}$; the spatial frequency value of the pixel at coordinate position $(x,y)$ is denoted $SF(x,y)$,
$$SF(x,y)=\sqrt{(HF(x,y))^2+(VF(x,y))^2+(DF(x,y))^2},$$
where $HF(x,y)$ is the horizontal-direction frequency value of the pixel at $(x,y)$ in $\{I_R(x,y)\}$,
$$HF(x,y)=\sqrt{\frac{\sum_{m=-1}^{1}\sum_{n=0}^{1}\left(I_R(x+m,y+n)-I_R(x+m,y+n-1)\right)^2}{3\times 2}},$$
$VF(x,y)$ is the vertical-direction frequency value,
$$VF(x,y)=\sqrt{\frac{\sum_{m=0}^{1}\sum_{n=-1}^{1}\left(I_R(x+m,y+n)-I_R(x+m-1,y+n)\right)^2}{2\times 3}},$$
and $DF(x,y)$ is the diagonal-direction frequency value,
$$DF(x,y)=\sqrt{\frac{\sum_{m=0}^{1}\sum_{n=0}^{1}\left(I_R(x+m,y+n)-I_R(x+m-1,y+n-1)\right)^2}{2\times 2}+\frac{\sum_{m=-1}^{0}\sum_{n=0}^{1}\left(I_R(x+m,y+n)-I_R(x+m+1,y+n-1)\right)^2}{2\times 2}},$$
where $I_R(x+m,y+n)$ denotes the pixel value of the pixel at coordinate position $(x+m,y+n)$ in $\{I_R(x,y)\}$, and likewise for the other shifted coordinates; out-of-range coordinates are clamped to the image border, a horizontal coordinate below 1 being replaced by 1 and one above $W$ by $W$, and a vertical coordinate below 1 being replaced by 1 and one above $H$ by $H$;

⑤-2. From $\{SF(x,y)\}$ and $\{M(x,y)\}$, compute the spatial frequency mean of all pixels in the region of $\{SF(x,y)\}$ corresponding to the visually important region of the visual saliency map of $\{I_R(x,y)\}$, denoted $\nu$: $\nu=\frac{\sum_{(x,y)\in\Omega}SF(x,y)\times M(x,y)}{\sum_{(x,y)\in\Omega}M(x,y)}$, where $\Omega$ denotes the image domain;

⑤-3. From $\{SF(x,y)\}$, $\{M(x,y)\}$ and $\nu$, compute the spatial frequency variance of all pixels in the same region, denoted $\rho$: $\rho=\frac{\sum_{(x,y)\in\Omega}(SF(x,y)-\nu)^2\times M(x,y)}{\sum_{(x,y)\in\Omega}M(x,y)}$;

⑤-4. Compute the spatial frequency range of the pixels in the same region, denoted $\zeta$: $\zeta=SF_{max}-SF_{min}$, where $SF_{max}$ is the mean spatial frequency of the 1% of pixels with the largest spatial frequency values in that region and $SF_{min}$ the mean spatial frequency of the 1% with the smallest;

⑤-5. Compute the spatial frequency sensitivity factor of the pixels in the same region, denoted $\tau$: $\tau=\nu/\mu$, where $\mu$ is the disparity mean obtained in step ③;

⑤-6. Arrange $\nu$, $\rho$, $\zeta$ and $\tau$ in order to form the feature vector reflecting the spatial frequency features of $\{I_R(x,y)\}$, denoted $F_3$, $F_3=(\nu,\rho,\zeta,\tau)$; the dimension of $F_3$ is 4.

9. The machine-learning-based stereoscopic image visual comfort evaluation method according to claim 8, characterized in that the concrete process of step ⑧ is:

⑧-1. Randomly select $t$ stereoscopic images from the stereoscopic image set to form the training set [the expression for $t$, which uses the round-up operation, appears only as an image in the original], and let the remaining $n-t$ stereoscopic images form the test set, where the symbol $\lceil\,\rceil$ denotes rounding up;
⑧-2. Form the training sample data set from the feature vectors and mean opinion scores of all stereoscopic images in the training set, denoted $\Omega_t$, $\{X_k,MOS_k\}\in\Omega_t$, where $X_k$ is the feature vector of the $k$-th stereoscopic image in $\Omega_t$, $MOS_k$ is the mean opinion score of the $k$-th stereoscopic image in $\Omega_t$, and $1\le k\le t$;

⑧-3. Construct the regression function of the feature vector of each stereoscopic image in $\Omega_t$; the regression function of $X_k$ is denoted $f(X_k)$, $f(X_k)=w^T\varphi(X_k)+b$, where $f(\cdot)$ is the function representation, $w$ is the weight vector, $w^T$ is the transpose of $w$, and $b$ is the bias term; $\varphi(X_k)$ denotes the linear function of $X_k$, defined through $D(X_k,X_l)$, the kernel function in support vector regression [the expressions for $\varphi(X_k)$ and $D(X_k,X_l)$ appear only as images in the original]; $X_l$ is the feature vector of the $l$-th stereoscopic image in $\Omega_t$, $1\le l\le t$; $\gamma$ is the kernel parameter; $\exp(\cdot)$ is the exponential function with base $e$, $e=2.71828183$; the symbol "$\|\ \|$" denotes the Euclidean distance;
⑧-4. Use support vector regression to train on the feature vectors of all stereoscopic images in $\Omega_t$ so that the error between the regression function values obtained through training and the mean opinion scores is minimized, fitting the optimal weight vector $w_{opt}$ and the optimal bias term $b_{opt}$, with
$$(w_{opt},b_{opt})=\arg\min_{(w,b)\in\Psi}\sum_{k=1}^{t}\left(f(X_k)-MOS_k\right)^2;$$
construct the support vector regression training model $f(X_{inp})=(w_{opt})^T\varphi(X_{inp})+b_{opt}$ from $w_{opt}$ and $b_{opt}$, where $\Psi$ denotes the set of all combinations of weight vectors and bias terms considered during training, $\arg\min_{(w,b)\in\Psi}\sum_{k=1}^{t}(f(X_k)-MOS_k)^2$ denotes the values of $w$ and $b$ that minimize $\sum_{k=1}^{t}(f(X_k)-MOS_k)^2$, $X_{inp}$ is the input vector of the model, $(w_{opt})^T$ is the transpose of $w_{opt}$, and $\varphi(X_{inp})$ denotes the linear function of the input vector $X_{inp}$;
⑧-5. The feature vectors and mean opinion scores of all the stereoscopic images in the test set form the test sample data set; then, according to the support vector regression training model, the feature vector of each stereoscopic image in the test sample data set is tested, and the objective visual comfort evaluation prediction value of each stereoscopic image in the test sample data set is obtained. The objective visual comfort evaluation prediction value of the k'-th stereoscopic image in the test sample data set is denoted as Q_k', Q_k' = f(X_k') = (w_opt)^T·φ(X_k') + b_opt, where X_k' represents the feature vector of the k'-th stereoscopic image in the test sample data set, φ(X_k') represents a linear function of X_k', and 1 ≤ k' ≤ n−t;
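Taken together, steps ⑧-4 and ⑧-5 amount to fitting a kernel support vector regressor to the training feature vectors and their mean opinion scores, then scoring the held-out images. A minimal sketch using scikit-learn's SVR follows; the claim names no library, and the default epsilon-insensitive loss and the gamma re-parameterization are assumptions:

```python
import numpy as np
from sklearn.svm import SVR

def train_and_predict(train_features, train_mos, test_features, gamma=54.0):
    # Step 8-4: fit an RBF-kernel support vector regressor so that its
    # outputs track the mean opinion scores MOS_k of the training images.
    # Step 8-5: predict the objective visual comfort score Q_k' for each
    # test image. scikit-learn writes the RBF kernel as
    # exp(-g * ||x - x'||^2), so the claim's exp(-||x - x'||^2 / gamma)
    # corresponds to g = 1 / gamma.
    model = SVR(kernel="rbf", gamma=1.0 / gamma)
    model.fit(np.asarray(train_features), np.asarray(train_mos))
    return model.predict(np.asarray(test_features))
```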
⑧-6. Randomly re-select t stereoscopic images from the stereoscopic image set to form the training set, let the remaining n−t stereoscopic images in the set form the test set, and then return to step ⑧-2 and continue; after N iterations, compute the average of the objective visual comfort evaluation prediction values obtained for each stereoscopic image in the stereoscopic image set, and take this average as the final objective visual comfort evaluation prediction value of the corresponding stereoscopic image, where the value of N is greater than 100.
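The iteration loop of step ⑧-6 can be sketched as below, reusing the hypothetical train_and_predict from the previous sketch; treating the feature vectors as an (n, d) NumPy array and averaging each image's predictions only over the iterations in which it fell into the test set are implementation assumptions:

```python
import numpy as np

def final_comfort_scores(features, mos, t, n_iters=200, seed=0):
    # Step 8-6: repeatedly split the n images at random into a training
    # set of t images and a test set of n - t images, re-train and re-test
    # (steps 8-2 to 8-5), and average each image's test-set predictions
    # across iterations; the claim requires N > 100.
    features, mos = np.asarray(features, float), np.asarray(mos, float)
    n = features.shape[0]
    sums, counts = np.zeros(n), np.zeros(n)
    rng = np.random.default_rng(seed)
    for _ in range(n_iters):
        order = rng.permutation(n)
        train_idx, test_idx = order[:t], order[t:]
        preds = train_and_predict(features[train_idx], mos[train_idx],
                                  features[test_idx])
        sums[test_idx] += preds
        counts[test_idx] += 1
    return sums / np.maximum(counts, 1)  # final objective comfort scores
```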
10. The machine-learning-based method for evaluating the visual comfort of stereoscopic images according to claim 9, characterized in that γ = 54 is taken in step ⑧-3.
CN201310264956.8A 2013-06-27 2013-06-27 Method for evaluating stereo image vision comfort level based on machine learning Active CN103347196B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310264956.8A CN103347196B (en) 2013-06-27 2013-06-27 Method for evaluating stereo image vision comfort level based on machine learning

Publications (2)

Publication Number Publication Date
CN103347196A true CN103347196A (en) 2013-10-09
CN103347196B CN103347196B (en) 2015-04-29

Family

ID=49281967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310264956.8A Active CN103347196B (en) 2013-06-27 2013-06-27 Method for evaluating stereo image vision comfort level based on machine learning

Country Status (1)

Country Link
CN (1) CN103347196B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101742353B (en) * 2008-11-04 2012-01-04 工业和信息化部电信传输研究所 No-reference video quality evaluating method
CN102137271A (en) * 2010-11-04 2011-07-27 华为软件技术有限公司 Method and device for evaluating image quality
CN102945552A (en) * 2012-10-22 2013-02-27 西安电子科技大学 No-reference image quality evaluation method based on sparse representation in natural scene statistics
CN103096125A (en) * 2013-02-22 2013-05-08 吉林大学 Stereoscopic video visual comfort evaluation method based on region segmentation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ye Bi and Jun Zhou: "Visual comfort assessment metric based on motion features in salient motion regions for stereoscopic 3D video", Communications in Computer and Information Science *

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103581661B (en) * 2013-10-28 2015-06-03 宁波大学 Method for evaluating visual comfort degree of three-dimensional image
CN103581661A (en) * 2013-10-28 2014-02-12 宁波大学 Method for evaluating visual comfort degree of three-dimensional image
WO2015062149A1 (en) * 2013-10-30 2015-05-07 清华大学 Method for acquiring degree of comfort of motion-sensing binocular stereoscopic video
CN103595990B (en) * 2013-10-30 2015-05-20 清华大学 Method for acquiring comfort degree of motion-sensing binocular stereoscopic video
US10091484B2 (en) 2013-10-30 2018-10-02 Tsinghua University Method for acquiring comfort degree of motion-sensing binocular stereoscopic video
CN104036502B (en) * 2014-06-03 2016-08-24 宁波大学 No-reference quality evaluation method for blur-distorted stereoscopic images
CN104469355A (en) * 2014-12-11 2015-03-25 西安电子科技大学 Visual comfort prediction based on saliency adaptation and visual comfort enhancement method based on nonlinear mapping
CN104469355B (en) * 2014-12-11 2016-09-28 西安电子科技大学 Visual comfort prediction based on saliency adaptation and visual comfort enhancement based on nonlinear mapping
CN104581141B (en) * 2015-01-09 2016-06-22 宁波大学 Stereoscopic image visual comfort evaluation method
CN104581141A (en) * 2015-01-09 2015-04-29 宁波大学 Three-dimensional picture visual comfort evaluation method
CN104811693A (en) * 2015-04-14 2015-07-29 宁波大学 A Method for Objective Evaluation of Visual Comfort of Stereo Image
CN105208374A (en) * 2015-08-24 2015-12-30 宁波大学 Non-reference image quality objective evaluation method based on deep learning
CN105243385B (en) * 2015-09-23 2018-11-09 宁波大学 Image quality evaluation method based on unsupervised learning
CN105243385A (en) * 2015-09-23 2016-01-13 宁波大学 Unsupervised learning based image quality evaluation method
CN106683072A (en) * 2015-11-09 2017-05-17 上海交通大学 PUP (Percentage of Un-linked pixels) diagram based 3D image comfort quality evaluation method and system
CN105407349B (en) * 2015-11-30 2017-05-03 宁波大学 No-reference objective three-dimensional image quality evaluation method based on binocular visual perception
CN105407349A (en) * 2015-11-30 2016-03-16 宁波大学 No-reference objective three-dimensional image quality evaluation method based on binocular visual perception
CN106210710A (en) * 2016-07-25 2016-12-07 宁波大学 Stereoscopic image visual comfort evaluation method based on multi-scale dictionary
CN106604012A (en) * 2016-10-20 2017-04-26 吉林大学 Method for evaluating comfort level of 3D video according to vertical parallax
CN106686377A (en) * 2016-12-30 2017-05-17 佳都新太科技股份有限公司 Algorithm for determining video key area based on deep neural network
CN106993183A (en) * 2017-03-28 2017-07-28 天津大学 A Quantitative Method of Comfortable Brightness Based on Salient Regions of Stereo Image
CN107909565A (en) * 2017-10-29 2018-04-13 天津大学 Stereo-picture Comfort Evaluation method based on convolutional neural networks
CN109754391A (en) * 2018-12-18 2019-05-14 北京爱奇艺科技有限公司 Image quality evaluation method and device, and electronic equipment
CN109754391B (en) * 2018-12-18 2021-10-22 北京爱奇艺科技有限公司 Image quality evaluation method and device and electronic equipment
CN111669563A (en) * 2020-06-19 2020-09-15 福州大学 A method for enhancing the visual comfort of stereo images based on reinforcement learning
CN111669563B (en) * 2020-06-19 2021-06-25 福州大学 Stereo image visual comfort enhancement method based on reinforcement learning
CN119785267A (en) * 2024-12-25 2025-04-08 杭州电子科技大学 A comfort prediction method for stereoscopic panoramic video based on deep learning

Also Published As

Publication number Publication date
CN103347196B (en) 2015-04-29

Similar Documents

Publication Publication Date Title
CN103347196B (en) Method for evaluating stereo image vision comfort level based on machine learning
CN103581661B (en) Method for evaluating visual comfort degree of three-dimensional image
CN104811693B (en) Objective evaluation method for visual comfort of stereoscopic images
CN105407349B (en) No-reference objective three-dimensional image quality evaluation method based on binocular visual perception
CN104023230B (en) No-reference image quality assessment method based on gradient correlation
CN104581143A (en) Reference-free three-dimensional picture quality objective evaluation method based on machine learning
CN104036501A (en) Three-dimensional image quality objective evaluation method based on sparse representation
CN105389554A (en) Live body discrimination method and device based on face recognition
CN103942525A (en) Real-time face optimal selection method based on video sequence
CN105357519B (en) Non-reference stereo image quality objective evaluation method based on self-similarity characteristics
CN104581141B (en) Stereoscopic image visual comfort evaluation method
CN104036502B (en) No-reference quality evaluation method for blur-distorted stereoscopic images
CN106162162B (en) Objective quality evaluation method for retargeted images based on sparse representation
CN104394403A (en) Objective quality evaluation method for compression-distorted stereoscopic video
CN106791822A (en) No-reference stereoscopic image quality evaluation method based on monocular and binocular feature learning
CN108805825A (en) Retargeted image quality evaluation method
CN103338379A (en) Stereoscopic video objective quality evaluation method based on machine learning
CN105809182A (en) Image classification method and device
CN104574363A (en) Full reference image quality assessment method in consideration of gradient direction difference
CN104361583A (en) A Method for Objective Quality Evaluation of Asymmetric Distorted Stereo Images
CN104243956B (en) Stereoscopic image visual saliency map extraction method
CN106210710B (en) Stereoscopic image visual comfort evaluation method based on multi-scale dictionary
CN108848365B (en) Retargeted stereoscopic image quality evaluation method
CN103065302B (en) Image significance detection method based on stray data mining
CN102708568A (en) A Stereoscopic Image Objective Quality Evaluation Method Based on Structural Distortion

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20191213

Address after: Room 1020, Nanxun Science and Technology Pioneering Park, No. 666 Chaoyang Road, Nanxun District, Huzhou City, Zhejiang Province, 313000

Patentee after: Huzhou You Yan Intellectual Property Service Co.,Ltd.

Address before: No. 818 Fenghua Road, Jiangbei District, Ningbo, Zhejiang Province, 315211

Patentee before: Ningbo University

TR01 Transfer of patent right

Effective date of registration: 20230824

Address after: Room 111, 1st Floor, Building 4, Yard 5, Shangdi East Road, Haidian District, Beijing, 100000

Patentee after: Ape Point Technology (Beijing) Co.,Ltd.

Patentee after: Zheng Juan

Address before: Room 1020, Science and Technology Pioneer Park, No. 666 Chaoyang Road, Nanxun Town, Nanxun District, Huzhou, Zhejiang, 313000

Patentee before: Huzhou You Yan Intellectual Property Service Co.,Ltd.
