CN103347196A - Method for evaluating stereo image vision comfort level based on machine learning
- Publication number: CN103347196A
- Authority: CN (China)
- Legal status: Granted
Abstract
The invention discloses a machine-learning-based method for evaluating the visual comfort of stereoscopic images. The method first extracts a visually-important-region mask of a stereoscopic image from the saliency map of its right-viewpoint image and its right disparity image. Using this mask, it then extracts feature vectors reflecting the disparity magnitude features and the disparity gradient feature, as well as a feature vector reflecting the spatial frequency features, and combines them into the feature vector of the stereoscopic image. Support vector regression is then used to train on the feature vectors of all stereoscopic images in a stereoscopic image set, and the trained support vector regression model is used to test each stereoscopic image in the set, yielding a predicted visual comfort score for each image. The advantage of the method is that the extracted feature vector is highly stable and reflects changes in the visual comfort of stereoscopic images well, which effectively improves the correlation between the objective evaluation results and subjective perception.
Description
Technical Field
The invention relates to an image quality evaluation method, and in particular to a machine-learning-based method for evaluating the visual comfort of stereoscopic images.
Background Art
With the rapid development of stereoscopic video display technology and of technology for acquiring high-quality stereoscopic video content, the quality of experience (QoE) of stereoscopic video has become an important issue in the design of stereoscopic video systems, and visual comfort (VC) is an important factor affecting the QoE of stereoscopic video. At present, research on the quality evaluation of stereoscopic video and images mainly considers the influence of content distortion on image quality and seldom considers factors such as visual comfort. Therefore, in order to improve the viewer's quality of visual experience, studying objective models for evaluating the visual comfort of stereoscopic video and images plays a very important role in guiding the production and post-processing of 3D content.
Traditional methods for evaluating the visual comfort of stereoscopic images mainly use global disparity statistics to predict visual comfort. However, according to the attention characteristics of human stereoscopic vision, the human eye is sensitive to visual comfort or discomfort only in certain visually important regions; predicting the visual comfort of these regions from global disparity statistics therefore fails to yield accurate objective scores. How to effectively extract visual comfort features from visually important regions during evaluation, so that the objective results better match the human visual system, is thus a problem that needs to be studied and solved in the objective visual comfort evaluation of stereoscopic images.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a machine-learning-based method for evaluating the visual comfort of stereoscopic images that can effectively improve the correlation between objective evaluation results and subjective perception.
The technical solution adopted by the present invention to solve the above technical problem is a machine-learning-based method for evaluating the visual comfort of stereoscopic images, characterized in that it comprises the following steps:
① Denote the left-viewpoint image of the stereoscopic image to be evaluated as {I_L(x,y)}, its right-viewpoint image as {I_R(x,y)}, and its right disparity image as {d_R(x,y)}, where (x,y) denotes the coordinate position of a pixel in {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, 1 ≤ x ≤ W, 1 ≤ y ≤ H, W denotes the width and H the height of {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, and I_L(x,y), I_R(x,y) and d_R(x,y) denote the pixel values of the pixel at coordinate position (x,y) in {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, respectively;
② Extract the saliency map of {I_R(x,y)}; then, from the saliency map of {I_R(x,y)} and {d_R(x,y)}, obtain the visual saliency map of {I_R(x,y)}; divide the visual saliency map of {I_R(x,y)} into a visually important region and a non-visually-important region; finally, from these two regions, obtain the visually-important-region mask of the stereoscopic image to be evaluated, denoted {M(x,y)}, where M(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {M(x,y)};
③ From {d_R(x,y)} and {M(x,y)}, obtain the disparity mean μ, disparity variance δ, maximum negative disparity θ and disparity range χ of the pixels in the region of {d_R(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; then arrange μ, δ, θ and χ in order to form the feature vector reflecting the disparity magnitude features of {d_R(x,y)}, denoted F_1, F_1 = (μ, δ, θ, χ);
④ Compute the disparity gradient magnitude image and the disparity gradient direction image of {d_R(x,y)}, and from them compute the disparity gradient edge image of {d_R(x,y)}; then, from the disparity gradient edge image and {M(x,y)}, compute the gradient mean ψ of all pixels in the region of the disparity gradient edge image corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; finally, take ψ as the feature vector reflecting the disparity gradient feature of {d_R(x,y)}, denoted F_2;
⑤ Obtain the spatial frequency image of {I_R(x,y)}; then, from the spatial frequency image and {M(x,y)}, obtain the spatial frequency mean ν, spatial frequency variance ρ, spatial frequency range ζ and spatial frequency sensitivity factor τ of the pixels in the region of the spatial frequency image corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; then arrange ν, ρ, ζ and τ in order to form the feature vector reflecting the spatial frequency features of {I_R(x,y)}, denoted F_3, F_3 = (ν, ρ, ζ, τ);
⑥ Combine F_1, F_2 and F_3 into a new feature vector, denoted X, X = [F_1, F_2, F_3], and take X as the feature vector of the stereoscopic image to be evaluated, where the symbol "[]" denotes a vector and [F_1, F_2, F_3] means concatenating F_1, F_2 and F_3 into a new feature vector;
⑦ Build a stereoscopic image set from n different stereoscopic images and their corresponding right disparity images, and use a subjective quality evaluation method to obtain the mean opinion score of the visual comfort of each stereoscopic image in the set, denoted MOS, where n ≥ 1 and MOS ∈ [1, 5]; then, following the operations of steps ① to ⑥ for computing the feature vector X of the stereoscopic image to be evaluated, compute in the same way the feature vector of each stereoscopic image in the set, denoting the feature vector of the i-th stereoscopic image as X_i, where 1 ≤ i ≤ n and n denotes the number of stereoscopic images in the set;
⑧ Divide all stereoscopic images in the set into a training set and a test set; let the feature vectors and mean opinion scores of all images in the training set form the training sample data set, and those of all images in the test set form the test sample data set; then use support vector regression as the machine learning method to train on the feature vectors of all stereoscopic images in the training sample data set, such that the error between the regression function values obtained through training and the mean opinion scores is minimized, and fit the optimal weight vector w_opt and the optimal bias term b_opt; next, construct the support vector regression training model from w_opt and b_opt, and use it to test the feature vector of each stereoscopic image in the test sample data set, predicting the objective visual comfort score of each test image; denote the predicted score of the k'-th test image as Q_k', Q_k' = f(X_k') = (w_opt)^T φ(X_k') + b_opt, where f() is the function form, X_k' denotes the feature vector of the k'-th stereoscopic image in the test sample data set, (w_opt)^T is the transpose of w_opt, φ(X_k') denotes the linear function of the k'-th test image, 1 ≤ k' ≤ n−t, and t denotes the number of stereoscopic images in the training set; afterwards, reassign the training and test sets and re-predict the objective visual comfort score of each image in the test sample data set; after N iterations, compute the average of the predicted objective visual comfort scores of each stereoscopic image in the set, and take this average as the final objective visual comfort score of that stereoscopic image, where N is greater than 100.
The specific process of step ② is as follows:
②-1. Use a graph-based visual saliency model to extract the saliency map of {I_R(x,y)}, denoted {SM_R(x,y)}, where SM_R(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {SM_R(x,y)};
②-2. From {SM_R(x,y)} and {d_R(x,y)}, obtain the visual saliency map of {I_R(x,y)}, denoted {D_R(x,y)}; the pixel value of the pixel at coordinate position (x,y) in {D_R(x,y)} is D_R(x,y) = ω_1 × SM_R(x,y) + ω_2 × d_R(x,y), where ω_1 denotes the weight of SM_R(x,y) and ω_2 denotes the weight of d_R(x,y);
②-3. According to the pixel value of each pixel in {D_R(x,y)}, divide {D_R(x,y)} into a visually important region and a non-visually-important region: each pixel in the visually important region of {D_R(x,y)} has a pixel value greater than an adaptive threshold T_1, and each pixel in the non-visually-important region of {D_R(x,y)} has a pixel value less than or equal to T_1, where T_1 is the threshold obtained by applying the Otsu method to {D_R(x,y)};
②-4. From the visually important and non-visually-important regions of {D_R(x,y)}, obtain the visually-important-region mask of the stereoscopic image to be evaluated, denoted {M(x,y)}; the pixel value of the pixel at coordinate position (x,y) in {M(x,y)} is M(x,y) = 1 if (x,y) belongs to the visually important region of {D_R(x,y)}, and M(x,y) = 0 otherwise.
In step ②-2, fixed values are taken for ω_1 and ω_2.
The specific process of step ③ is as follows:
③-1. From {d_R(x,y)} and {M(x,y)}, calculate the disparity mean of all pixels in the region of {d_R(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted μ, μ = Σ_{(x,y)∈Ω} d_R(x,y) M(x,y) / Σ_{(x,y)∈Ω} M(x,y), where Ω denotes the image domain;
③-2. From {d_R(x,y)}, {M(x,y)} and μ, calculate the disparity variance of all pixels in that region, denoted δ, δ = Σ_{(x,y)∈Ω} (d_R(x,y) − μ)² M(x,y) / Σ_{(x,y)∈Ω} M(x,y);
③-3. Calculate the maximum negative disparity of the pixels in the region of {d_R(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted θ, where θ is the mean disparity of the 1% of pixels in that region with the smallest disparity values;
③-4. Calculate the disparity range of the pixels in that region, denoted χ, χ = d_max − d_min, where d_max denotes the mean disparity of the 1% of pixels in that region with the largest disparity values and d_min denotes the mean disparity of the 1% of pixels with the smallest disparity values;
③-5. Arrange μ, δ, θ and χ in order to form the feature vector reflecting the disparity magnitude features of {d_R(x,y)}, denoted F_1, F_1 = (μ, δ, θ, χ); the dimension of F_1 is 4.
The specific process of step ④ is as follows:
④-1. Calculate the disparity gradient magnitude image of {d_R(x,y)}, denoted {m(x,y)}; the gradient magnitude of the pixel at coordinate position (x,y) is m(x,y) = √(G_x(x,y)² + G_y(x,y)²), where G_x(x,y) denotes the horizontal gradient value and G_y(x,y) the vertical gradient value of the pixel at coordinate position (x,y);
④-2. Calculate the disparity gradient direction image of {d_R(x,y)}, denoted {θ(x,y)}; the gradient direction value of the pixel at coordinate position (x,y) is θ(x,y) = arctan(G_y(x,y)/G_x(x,y)), where arctan() is the arctangent function;
④-3. From {m(x,y)} and {θ(x,y)}, calculate the disparity gradient edge image of {d_R(x,y)}, denoted {E(x,y)}, and denote the gradient edge value of the pixel at coordinate position p in {E(x,y)} as E(p); E(p) is obtained by accumulating the gradient magnitudes of the pixels q in a neighborhood of p, weighted by a spatial Gaussian function G_s(||p−q||) with standard deviation σ_s and by an orientation Gaussian function with standard deviation σ_o applied to the gradient direction values, and thresholded by ε_g, where ||p−q|| denotes the Euclidean distance between coordinate positions p and q and the symbol "|| ||" is the Euclidean distance operator;
④-4. From {E(x,y)} and {M(x,y)}, calculate the gradient mean of all pixels in the region of {E(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted ψ, ψ = Σ_{(x,y)∈Ω} E(x,y) M(x,y) / Σ_{(x,y)∈Ω} M(x,y), where Ω denotes the image domain and E(x,y) denotes the gradient edge value of the pixel at coordinate position (x,y) in {E(x,y)};
④-5. Take ψ as the feature vector reflecting the disparity gradient feature of {d_R(x,y)}, denoted F_2; the dimension of F_2 is 1.
In step ④-3, take σ_s = 0.4, σ_o = 0.4 and ε_g = 0.5.
In step ④-3, the neighborhood windows used for the spatial and orientation Gaussian functions are both of size 3×3.
The specific process of step ⑤ is as follows:
⑤-1. Calculate the spatial frequency image of {I_R(x,y)}, denoted {SF(x,y)}, and denote the spatial frequency value of the pixel at coordinate position (x,y) in {SF(x,y)} as SF(x,y);
⑤-2. From {SF(x,y)} and {M(x,y)}, calculate the spatial frequency mean of all pixels in the region of {SF(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted ν, ν = Σ_{(x,y)∈Ω} SF(x,y) M(x,y) / Σ_{(x,y)∈Ω} M(x,y), where Ω denotes the image domain;
⑤-3. From {SF(x,y)}, {M(x,y)} and ν, calculate the spatial frequency variance of all pixels in that region, denoted ρ, ρ = Σ_{(x,y)∈Ω} (SF(x,y) − ν)² M(x,y) / Σ_{(x,y)∈Ω} M(x,y);
⑤-4. Calculate the spatial frequency range of the pixels in that region, denoted ζ, ζ = SF_max − SF_min, where SF_max denotes the mean spatial frequency of the 1% of pixels in that region with the largest spatial frequency values and SF_min denotes the mean spatial frequency of the 1% of pixels with the smallest spatial frequency values;
⑤-5. Calculate the spatial frequency sensitivity factor of the pixels in that region, denoted τ, τ = ν/μ;
⑤-6. Arrange ν, ρ, ζ and τ in order to form the feature vector reflecting the spatial frequency features of {I_R(x,y)}, denoted F_3, F_3 = (ν, ρ, ζ, τ); the dimension of F_3 is 4.
The specific process of step ⑧ is as follows:
⑧-1. Randomly select t stereoscopic images from the stereoscopic image set to form the training set, where t is obtained by rounding up a fixed fraction of n (the symbol "⌈ ⌉" denotes rounding up), and let the remaining n−t stereoscopic images form the test set;
⑧-2. Let the feature vectors and mean opinion scores of all stereoscopic images in the training set form the training sample data set, denoted Ω_t, {X_k, MOS_k} ∈ Ω_t, where X_k denotes the feature vector of the k-th stereoscopic image in Ω_t, MOS_k denotes its mean opinion score, and 1 ≤ k ≤ t;
⑧-3. Construct the regression function of the feature vector of each stereoscopic image in the training sample data set Ω_t; the regression function of X_k is denoted f(X_k), f(X_k) = w^T φ(X_k) + b, where f() is the function form, w is the weight vector, w^T is the transpose of w, b is the bias term, φ(X_k) denotes the linear function of X_k, D(X_k, X_l) is the kernel function in the support vector regression, taken as a radial basis function of the Euclidean distance between X_k and X_l with kernel parameter γ, X_l is the feature vector of the l-th stereoscopic image in Ω_t, 1 ≤ l ≤ t, exp() denotes the exponential function with base e, e = 2.71828183, and the symbol "|| ||" is the Euclidean distance operator;
⑧-4. Use support vector regression to train on the feature vectors of all stereoscopic images in Ω_t, such that the error between the regression function values obtained through training and the mean opinion scores is minimized, and fit the optimal weight vector w_opt and the optimal bias term b_opt; denote the combination of w_opt and b_opt as (w_opt, b_opt); then use w_opt and b_opt to construct the support vector regression training model f(X_inp) = (w_opt)^T φ(X_inp) + b_opt, where X_inp denotes an input feature vector of the training model and φ(X_inp) denotes its linear function;
⑧-5. Let the feature vectors and mean opinion scores of all stereoscopic images in the test set form the test sample data set; then, using the support vector regression training model, test the feature vector of each stereoscopic image in the test sample data set and predict its objective visual comfort score; denote the predicted score of the k'-th test image as Q_k', Q_k' = f(X_k') = (w_opt)^T φ(X_k') + b_opt, where X_k' denotes the feature vector of the k'-th stereoscopic image in the test sample data set, φ(X_k') denotes its linear function, and 1 ≤ k' ≤ n−t;
⑧-6. Randomly reselect t stereoscopic images from the set to form the training set, let the remaining n−t images form the test set, and return to step ⑧-2; after N iterations, compute the average of the predicted objective visual comfort scores of each stereoscopic image in the set, and take this average as the final objective visual comfort score of that stereoscopic image, where N is greater than 100.
In step ⑧-3, take γ = 54.
Compared with the prior art, the advantages of the present invention are:
1) The method of the present invention takes into account the influence of visually important regions on visual comfort; it therefore extracts the visually-important-region mask of the stereoscopic image from the saliency map of its right-viewpoint image and its right disparity image, and then evaluates only the visually important region according to the mask, which effectively improves the correlation between objective evaluation results and subjective perception.
2) The method of the present invention obtains the feature vector of the stereoscopic image from the feature vector reflecting the disparity magnitude features of the right disparity image, the feature vector reflecting the disparity gradient feature of the right disparity image, and the feature vector reflecting the spatial frequency features of the right-viewpoint image; it then trains on the feature vectors of all stereoscopic images in the set using support vector regression and computes the objective visual comfort score of each stereoscopic image. Because the extracted feature vector is highly stable and reflects changes in the visual comfort of stereoscopic images well, the correlation between the objective evaluation results and subjective perception is effectively improved.
Brief Description of the Drawings
Fig. 1 is the overall implementation block diagram of the method of the present invention;
Fig. 2a is the right-viewpoint image of “camera”;
Fig. 2b is the right disparity image of “camera”;
Fig. 2c is the saliency map of the right-viewpoint image of “camera”;
Fig. 2d is the visual saliency map of the right-viewpoint image of “camera”;
Fig. 2e is the visually-important-region mask of “camera”;
Fig. 3a is the right-viewpoint image of “cup”;
Fig. 3b is the right disparity image of “cup”;
Fig. 3c is the saliency map of the right-viewpoint image of “cup”;
Fig. 3d is the visual saliency map of the right-viewpoint image of “cup”;
Fig. 3e is the visually-important-region mask of “cup”;
Fig. 4a is the right-viewpoint image of “infant”;
Fig. 4b is the right disparity image of “infant”;
Fig. 4c is the saliency map of the right-viewpoint image of “infant”;
Fig. 4d is the visual saliency map of the right-viewpoint image of “infant”;
Fig. 4e is the visually-important-region mask of “infant”;
Fig. 5 is the scatter plot of the objective visual comfort scores obtained from the two feature vectors F_1 and F_2 versus the mean opinion scores;
Fig. 6 is the scatter plot of the objective visual comfort scores obtained from the two feature vectors F_1 and F_3 versus the mean opinion scores;
Fig. 7 is the scatter plot of the objective visual comfort scores obtained from the two feature vectors F_2 and F_3 versus the mean opinion scores;
Fig. 8 is the scatter plot of the objective visual comfort scores obtained from the three feature vectors F_1, F_2 and F_3 versus the mean opinion scores.
Detailed Description of the Embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments.
The overall implementation block diagram of the machine-learning-based method for evaluating the visual comfort of stereoscopic images proposed by the present invention is shown in Fig. 1; the method comprises the following steps:
① Denote the left-viewpoint image of the stereoscopic image to be evaluated as {I_L(x,y)}, its right-viewpoint image as {I_R(x,y)}, and its right disparity image as {d_R(x,y)}, where (x,y) denotes the coordinate position of a pixel in {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, 1 ≤ x ≤ W, 1 ≤ y ≤ H, W denotes the width and H the height of {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, and I_L(x,y), I_R(x,y) and d_R(x,y) denote the pixel values of the pixel at coordinate position (x,y) in {I_L(x,y)}, {I_R(x,y)} and {d_R(x,y)}, respectively.
② Extract the saliency map of {I_R(x,y)}; then, from the saliency map of {I_R(x,y)} and {d_R(x,y)}, obtain the visual saliency map of {I_R(x,y)}; divide the visual saliency map of {I_R(x,y)} into a visually important region and a non-visually-important region; finally, from these two regions, obtain the visually-important-region mask of the stereoscopic image to be evaluated, denoted {M(x,y)}, where M(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {M(x,y)}.
In this specific embodiment, the specific process of step ② is as follows:
②-1. Use the graph-based visual saliency (GBVS) model to extract the saliency map of {I_R(x,y)}, denoted {SM_R(x,y)}, where SM_R(x,y) denotes the pixel value of the pixel at coordinate position (x,y) in {SM_R(x,y)}.
②-2. From {SM_R(x,y)} and {d_R(x,y)}, obtain the visual saliency map of {I_R(x,y)}, denoted {D_R(x,y)}; the pixel value of the pixel at coordinate position (x,y) in {D_R(x,y)} is D_R(x,y) = ω_1 × SM_R(x,y) + ω_2 × d_R(x,y), where ω_1 denotes the weight of SM_R(x,y) and ω_2 denotes the weight of d_R(x,y), both taken as fixed values here.
②-3. According to the pixel value of each pixel in {D_R(x,y)}, divide {D_R(x,y)} into a visually important region and a non-visually-important region: each pixel in the visually important region of {D_R(x,y)} has a pixel value greater than an adaptive threshold T_1, and each pixel in the non-visually-important region of {D_R(x,y)} has a pixel value less than or equal to T_1, where T_1 is the threshold obtained by applying the Otsu method to {D_R(x,y)}.
②-4. From the visually important and non-visually-important regions of {D_R(x,y)}, obtain the visually-important-region mask of the stereoscopic image to be evaluated, denoted {M(x,y)}; the pixel value of the pixel at coordinate position (x,y) in {M(x,y)} is M(x,y) = 1 if (x,y) belongs to the visually important region of {D_R(x,y)}, and M(x,y) = 0 otherwise.
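For illustration, the following is a minimal Python sketch of step ② under stated assumptions: the fusion of {SM_R(x,y)} and {d_R(x,y)} is taken to be a normalized weighted sum with assumed weights w1 = w2 = 0.5 (the patent's values for ω_1 and ω_2 are not given in this text), and the saliency map is assumed to be precomputed, since the GBVS model itself is not reproduced here. The Otsu threshold T_1 is obtained with OpenCV.

```python
import cv2
import numpy as np

def visual_importance_mask(saliency, disparity, w1=0.5, w2=0.5):
    # Normalize the right-view saliency map SM_R and right disparity map d_R
    # to [0, 1] so that the two terms are comparable.
    sm = cv2.normalize(saliency.astype(np.float32), None, 0, 1, cv2.NORM_MINMAX)
    d = cv2.normalize(disparity.astype(np.float32), None, 0, 1, cv2.NORM_MINMAX)
    # Assumed fusion D_R = w1*SM_R + w2*d_R (the patent's weights were lost).
    D = w1 * sm + w2 * d
    # Otsu's method supplies the adaptive threshold T_1 of step 2-3.
    D8 = np.uint8(D * 255)
    T1, mask = cv2.threshold(D8, 0, 1, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # M(x,y) = 1 inside the visually important region, 0 elsewhere (step 2-4).
    return mask
```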
Here, three groups of typical stereoscopic images are taken to illustrate the performance of the visually-important-region masks obtained by the method of the present invention. Figs. 2a and 2b show the right-viewpoint image and right disparity image of “camera”, Fig. 2c shows the saliency map of the right-viewpoint image of “camera”, Fig. 2d shows its visual saliency map, and Fig. 2e shows the visually-important-region mask of “camera”; Figs. 3a and 3b show the right-viewpoint image and right disparity image of “cup”, Fig. 3c shows the saliency map of the right-viewpoint image of “cup”, Fig. 3d shows its visual saliency map, and Fig. 3e shows the visually-important-region mask of “cup”; Figs. 4a and 4b show the right-viewpoint image and right disparity image of “infant”, Fig. 4c shows the saliency map of the right-viewpoint image of “infant”, Fig. 4d shows its visual saliency map, and Fig. 4e shows the visually-important-region mask of “infant”. As can be seen from Figs. 2e, 3e and 4e, the visually important regions obtained by the method of the present invention reflect the visual comfort of the human eye well.
③ From {d_R(x,y)} and {M(x,y)}, obtain the disparity mean μ, disparity variance δ, maximum negative disparity θ and disparity range χ of the pixels in the region of {d_R(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; then arrange μ, δ, θ and χ in order to form the feature vector reflecting the disparity magnitude features of {d_R(x,y)}, denoted F_1, F_1 = (μ, δ, θ, χ).
In this specific embodiment, the specific process of step ③ is as follows:
③-1. From {d_R(x,y)} and {M(x,y)}, calculate the disparity mean of all pixels in the region of {d_R(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted μ, μ = Σ_{(x,y)∈Ω} d_R(x,y) M(x,y) / Σ_{(x,y)∈Ω} M(x,y), where Ω denotes the image domain.
③-2. From {d_R(x,y)}, {M(x,y)} and μ, calculate the disparity variance of all pixels in that region, denoted δ, δ = Σ_{(x,y)∈Ω} (d_R(x,y) − μ)² M(x,y) / Σ_{(x,y)∈Ω} M(x,y).
③-3. Calculate the maximum negative disparity of the pixels in the region of {d_R(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted θ, where θ is the mean disparity of the 1% of pixels in that region with the smallest disparity values.
③-4. Calculate the disparity range of the pixels in that region, denoted χ, χ = d_max − d_min, where d_max denotes the mean disparity of the 1% of pixels in that region with the largest disparity values and d_min denotes the mean disparity of the 1% of pixels with the smallest disparity values.
③-5. Arrange μ, δ, θ and χ in order to form the feature vector reflecting the disparity magnitude features of {d_R(x,y)}, denoted F_1, F_1 = (μ, δ, θ, χ); the dimension of F_1 is 4.
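Step ③ can be illustrated with the short sketch below; it assumes the lost mean and variance formulas are the ordinary sample mean and variance over the masked pixels, and uses the stated bottom/top 1% means for θ, d_min and d_max.

```python
import numpy as np

def disparity_magnitude_features(d_R, M):
    d = np.sort(d_R[M > 0].astype(np.float64))  # region disparities, ascending
    k = max(1, int(np.ceil(0.01 * d.size)))     # 1% of the region pixels
    mu = d.mean()                               # step 3-1: disparity mean
    delta = d.var()                             # step 3-2: disparity variance
    theta = d[:k].mean()                        # step 3-3: maximum negative disparity
    chi = d[-k:].mean() - d[:k].mean()          # step 3-4: range chi = d_max - d_min
    return np.array([mu, delta, theta, chi])    # step 3-5: F_1
```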
④ Compute the disparity gradient magnitude image and the disparity gradient direction image of {d_R(x,y)}, and from them compute the disparity gradient edge image of {d_R(x,y)}; then, from the disparity gradient edge image and {M(x,y)}, compute the gradient mean ψ of all pixels in the region of the disparity gradient edge image corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; finally, take ψ as the feature vector reflecting the disparity gradient feature of {d_R(x,y)}, denoted F_2.
In this specific embodiment, the specific process of step ④ is as follows:
④-1. Calculate the disparity gradient magnitude image of {d_R(x,y)}, denoted {m(x,y)}; the gradient magnitude of the pixel at coordinate position (x,y) is m(x,y) = √(G_x(x,y)² + G_y(x,y)²), where G_x(x,y) denotes the horizontal gradient value and G_y(x,y) the vertical gradient value of the pixel at coordinate position (x,y).
④-2. Calculate the disparity gradient direction image of {d_R(x,y)}, denoted {θ(x,y)}; the gradient direction value of the pixel at coordinate position (x,y) is θ(x,y) = arctan(G_y(x,y)/G_x(x,y)), where arctan() is the arctangent function.
④-3. From {m(x,y)} and {θ(x,y)}, calculate the disparity gradient edge image of {d_R(x,y)}, denoted {E(x,y)}, and denote the gradient edge value of the pixel at coordinate position p in {E(x,y)} as E(p); E(p) is obtained by accumulating the gradient magnitudes of the pixels q in a neighborhood of p, weighted by a spatial Gaussian function G_s(||p−q||) with standard deviation σ_s (here σ_s = 0.4) and by an orientation Gaussian function with standard deviation σ_o (here σ_o = 0.4) applied to the gradient direction values, and thresholded by ε_g, where ||p−q|| denotes the Euclidean distance between coordinate positions p and q and the symbol "|| ||" is the Euclidean distance operator.
④-4. From {E(x,y)} and {M(x,y)}, calculate the gradient mean of all pixels in the region of {E(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted ψ, ψ = Σ_{(x,y)∈Ω} E(x,y) M(x,y) / Σ_{(x,y)∈Ω} M(x,y), where Ω denotes the image domain and E(x,y) denotes the gradient edge value of the pixel at coordinate position (x,y) in {E(x,y)}.
④-5. Take ψ as the feature vector reflecting the disparity gradient feature of {d_R(x,y)}, denoted F_2; the dimension of F_2 is 1.
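Since the exact formula for the edge image {E(x,y)} is not reproduced in this text, the sketch below substitutes the plain gradient magnitude averaged over the mask for step ④; it is a simplified stand-in, not the patented combination of G_s, the orientation Gaussian and ε_g.

```python
import numpy as np

def disparity_gradient_feature(d_R, M):
    # Steps 4-1 / 4-2: gradient magnitude and direction of the disparity map.
    Gy, Gx = np.gradient(d_R.astype(np.float64))
    m = np.sqrt(Gx ** 2 + Gy ** 2)          # m(x,y)
    theta = np.arctan2(Gy, Gx)              # theta(x,y), unused in this sketch
    # Step 4-4 stand-in: average the magnitude (not the lost edge image E)
    # over the visually important region.
    return m[M > 0].mean()                  # psi = F_2
```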
⑤ Obtain the spatial frequency image of {I_R(x,y)}; then, from the spatial frequency image and {M(x,y)}, obtain the spatial frequency mean ν, spatial frequency variance ρ, spatial frequency range ζ and spatial frequency sensitivity factor τ of the pixels in the region of the spatial frequency image corresponding to the visually important region of the visual saliency map of {I_R(x,y)}; then arrange ν, ρ, ζ and τ in order to form the feature vector reflecting the spatial frequency features of {I_R(x,y)}, denoted F_3, F_3 = (ν, ρ, ζ, τ).
In this specific embodiment, the specific process of step ⑤ is as follows:
⑤-1. Calculate the spatial frequency image of {I_R(x,y)}, denoted {SF(x,y)}, and denote the spatial frequency value of the pixel at coordinate position (x,y) in {SF(x,y)} as SF(x,y).
⑤-2. From {SF(x,y)} and {M(x,y)}, calculate the spatial frequency mean of all pixels in the region of {SF(x,y)} corresponding to the visually important region of the visual saliency map of {I_R(x,y)}, denoted ν, ν = Σ_{(x,y)∈Ω} SF(x,y) M(x,y) / Σ_{(x,y)∈Ω} M(x,y), where Ω denotes the image domain.
⑤-3. From {SF(x,y)}, {M(x,y)} and ν, calculate the spatial frequency variance of all pixels in that region, denoted ρ, ρ = Σ_{(x,y)∈Ω} (SF(x,y) − ν)² M(x,y) / Σ_{(x,y)∈Ω} M(x,y).
⑤-4. Calculate the spatial frequency range of the pixels in that region, denoted ζ, ζ = SF_max − SF_min, where SF_max denotes the mean spatial frequency of the 1% of pixels in that region with the largest spatial frequency values and SF_min denotes the mean spatial frequency of the 1% of pixels with the smallest spatial frequency values.
⑤-5. Calculate the spatial frequency sensitivity factor of the pixels in that region, denoted τ, τ = ν/μ.
⑤-6. Arrange ν, ρ, ζ and τ in order to form the feature vector reflecting the spatial frequency features of {I_R(x,y)}, denoted F_3, F_3 = (ν, ρ, ζ, τ); the dimension of F_3 is 4.
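The definition of SF(x,y) is likewise not reproduced here; the sketch below assumes a local-difference spatial frequency (horizontal and vertical luminance differences combined in quadrature) and then derives ν, ρ, ζ and τ as described in steps ⑤-2 to ⑤-5, with τ reusing the disparity mean μ from F_1.

```python
import numpy as np

def spatial_frequency_features(I_R, M, mu):
    I = I_R.astype(np.float64)
    # Assumed per-pixel spatial frequency: horizontal and vertical luminance
    # differences combined in quadrature (the patent's formula was lost).
    dx = np.zeros_like(I); dx[:, 1:] = np.diff(I, axis=1)
    dy = np.zeros_like(I); dy[1:, :] = np.diff(I, axis=0)
    SF = np.sqrt(dx ** 2 + dy ** 2)
    s = np.sort(SF[M > 0])                  # region values, ascending
    k = max(1, int(np.ceil(0.01 * s.size)))
    nu, rho = s.mean(), s.var()             # steps 5-2 / 5-3
    zeta = s[-k:].mean() - s[:k].mean()     # step 5-4: SF_max - SF_min
    tau = nu / mu                           # step 5-5: uses disparity mean mu from F_1
    return np.array([nu, rho, zeta, tau])   # step 5-6: F_3
```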
⑥ Combine F_1, F_2 and F_3 into a new feature vector, denoted X, X = [F_1, F_2, F_3], and take X as the feature vector of the stereoscopic image to be evaluated, where the symbol "[]" denotes a vector and [F_1, F_2, F_3] means concatenating F_1, F_2 and F_3 into a new feature vector.
⑦ Build a stereoscopic image set from n different stereoscopic images and their corresponding right disparity images, and use an existing subjective quality evaluation method to obtain the mean opinion score of the visual comfort of each stereoscopic image in the set, denoted MOS, where n ≥ 1 and MOS ∈ [1, 5]; then, following the operations of steps ① to ⑥ for computing the feature vector X of the stereoscopic image to be evaluated, compute in the same way the feature vector of each stereoscopic image in the set, denoting the feature vector of the i-th stereoscopic image as X_i, where 1 ≤ i ≤ n and n denotes the number of stereoscopic images in the set.
In this embodiment, the stereoscopic image database provided by the Image and Video Systems Laboratory of the Korea Advanced Institute of Science and Technology (KAIST) is used as the stereoscopic image set. This database contains 120 stereoscopic images and their corresponding right disparity images, covers indoor and outdoor scenes of various depths, and provides the mean opinion score of the visual comfort of each stereoscopic image.
⑧ Divide all stereoscopic images in the set into a training set and a test set; let the feature vectors and mean opinion scores of all images in the training set form the training sample data set, and those of all images in the test set form the test sample data set; then use support vector regression as the machine learning method to train on the feature vectors of all stereoscopic images in the training sample data set, such that the error between the regression function values obtained through training and the mean opinion scores is minimized, and fit the optimal weight vector w_opt and the optimal bias term b_opt; next, construct the support vector regression training model from w_opt and b_opt, and use it to test the feature vector of each stereoscopic image in the test sample data set, predicting the objective visual comfort score of each test image; denote the predicted score of the k'-th test image as Q_k', Q_k' = f(X_k') = (w_opt)^T φ(X_k') + b_opt, where f() is the function form, X_k' denotes the feature vector of the k'-th stereoscopic image in the test sample data set, (w_opt)^T is the transpose of w_opt, φ(X_k') denotes the linear function of the k'-th test image, 1 ≤ k' ≤ n−t, and t denotes the number of stereoscopic images in the training set; afterwards, reassign the training and test sets and re-predict the objective visual comfort score of each image in the test sample data set; after N iterations, compute the average of the predicted objective visual comfort scores of each stereoscopic image in the set, and take this average as the final objective visual comfort score of that stereoscopic image, where N is greater than 100 so as to ensure that every stereoscopic image in the set receives an objective visual comfort score; in this embodiment N = 200.
In this specific embodiment, the specific process of step ⑧ is as follows:
⑧-1. Randomly select t stereoscopic images from the stereoscopic image set to form the training set, where t is obtained by rounding up a fixed fraction of n (the symbol "⌈ ⌉" denotes rounding up), and let the remaining n−t stereoscopic images form the test set.
⑧-2. Let the feature vectors and mean opinion scores of all stereoscopic images in the training set form the training sample data set, denoted Ω_t, {X_k, MOS_k} ∈ Ω_t, where X_k denotes the feature vector of the k-th stereoscopic image in Ω_t, MOS_k denotes its mean opinion score, and 1 ≤ k ≤ t.
⑧-3. Construct the regression function of the feature vector of each stereoscopic image in the training sample data set Ω_t; the regression function of X_k is denoted f(X_k), f(X_k) = w^T φ(X_k) + b, where f() is the function form, w is the weight vector, w^T is the transpose of w, b is the bias term, the values of w and b are obtained through training, φ(X_k) denotes the linear function of X_k, D(X_k, X_l) is the kernel function in the support vector regression, taken as a radial basis function of the Euclidean distance between X_k and X_l with kernel parameter γ, X_l is the feature vector of the l-th stereoscopic image in Ω_t, 1 ≤ l ≤ t, exp() denotes the exponential function with base e, e = 2.71828183, and the symbol "|| ||" is the Euclidean distance operator; γ reflects the range of the input sample values (the larger the range of the sample values, the larger γ), and in this embodiment γ = 54.
⑧-4. Use support vector regression to train on the feature vectors of all stereoscopic images in Ω_t, such that the error between the regression function values obtained through training and the mean opinion scores is minimized, and fit the optimal weight vector w_opt and the optimal bias term b_opt; denote the combination of w_opt and b_opt as (w_opt, b_opt); then use w_opt and b_opt to construct the support vector regression training model f(X_inp) = (w_opt)^T φ(X_inp) + b_opt, where X_inp denotes an input feature vector of the training model and φ(X_inp) denotes its linear function.
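Steps ⑧-3 and ⑧-4 amount to fitting an ε-SVR with an RBF kernel; a sketch using scikit-learn follows. Note that scikit-learn's gamma multiplies the squared Euclidean distance, so mapping the patent's kernel parameter γ = 54 to gamma = 1/54 is an assumption about the exact form of D(X_k, X_l).

```python
import numpy as np
from sklearn.svm import SVR

def train_svr(X_train, mos_train):
    # RBF-kernel epsilon-SVR; scikit-learn's kernel is exp(-gamma*||x - x'||^2),
    # so gamma = 1/54 is an assumed mapping of the patent's kernel parameter 54.
    model = SVR(kernel="rbf", gamma=1.0 / 54)
    model.fit(np.asarray(X_train), np.asarray(mos_train))
    return model

# Step 8-5: predicted comfort scores Q for the test feature vectors.
# Q = train_svr(X_train, mos_train).predict(X_test)
```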
⑧-5. Let the feature vectors and mean opinion scores of all stereoscopic images in the test set form the test sample data set; then, using the support vector regression training model, test the feature vector of each stereoscopic image in the test sample data set and predict its objective visual comfort score; denote the predicted score of the k'-th test image as Q_k', Q_k' = f(X_k') = (w_opt)^T φ(X_k') + b_opt, where X_k' denotes the feature vector of the k'-th stereoscopic image in the test sample data set, φ(X_k') denotes its linear function, and 1 ≤ k' ≤ n−t.
⑧-6. Randomly reselect t stereoscopic images from the set to form the training set, let the remaining n−t images form the test set, and return to step ⑧-2; after N iterations, compute the average of the predicted objective visual comfort scores of each stereoscopic image in the set, and take this average as the final objective visual comfort score of that stereoscopic image, where N is greater than 100.
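Steps ⑧-1 and ⑧-6 describe repeated random train/test splits with per-image averaging over N = 200 iterations; the sketch below assumes an 80% training fraction, since the exact fraction was lost from step ⑧-1, and reuses train_svr from the previous sketch.

```python
import numpy as np

def averaged_comfort_scores(X, mos, n_iters=200, train_frac=0.8, seed=0):
    X, mos = np.asarray(X, dtype=np.float64), np.asarray(mos, dtype=np.float64)
    n = len(X)
    t = int(np.ceil(train_frac * n))            # training-set size (step 8-1)
    rng = np.random.default_rng(seed)
    sums, counts = np.zeros(n), np.zeros(n)
    for _ in range(n_iters):                    # N iterations (step 8-6)
        perm = rng.permutation(n)
        tr, te = perm[:t], perm[t:]
        model = train_svr(X[tr], mos[tr])       # from the previous sketch
        sums[te] += model.predict(X[te])
        counts[te] += 1
    # Final score of each image: mean of its held-out predictions.
    return sums / np.maximum(counts, 1)
```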
In this embodiment, four objective parameters commonly used for assessing image quality evaluation methods are used as evaluation indicators: the Pearson linear correlation coefficient (PLCC) under nonlinear regression, the Spearman rank-order correlation coefficient (SROCC), the Kendall rank-order correlation coefficient (KROCC) and the root mean squared error (RMSE). PLCC and RMSE reflect the accuracy of the objective predictions, while SROCC and KROCC reflect their monotonicity. The objective visual comfort scores computed for the 120 stereoscopic images are fitted nonlinearly with a five-parameter logistic function; higher PLCC, SROCC and KROCC values and a lower RMSE value indicate a better correlation between the results of the objective visual comfort evaluation method of the present invention and the mean opinion scores. Table 1 gives the correlation between the visual comfort scores obtained with different feature vectors and the mean opinion scores. As can be seen from Table 1, the scores obtained with only two feature vectors are never optimally correlated with the mean opinion scores, and the feature vector composed of the disparity magnitude features influences the evaluation performance more than the other two feature vectors; this shows that the three feature vectors extracted by the method of the present invention are effective, and that combining the disparity magnitude, disparity gradient and spatial frequency feature vectors yields scores that correlate even more strongly with the mean opinion scores, which suffices to show that the method of the present invention is effective.
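The four indicators can be computed with SciPy as sketched below; for brevity the sketch omits the five-parameter logistic fitting that the experiment applies before computing PLCC and RMSE.

```python
import numpy as np
from scipy import stats

def evaluation_metrics(pred, mos):
    pred, mos = np.asarray(pred, dtype=float), np.asarray(mos, dtype=float)
    plcc = stats.pearsonr(pred, mos)[0]     # accuracy
    srocc = stats.spearmanr(pred, mos)[0]   # monotonicity
    krocc = stats.kendalltau(pred, mos)[0]  # monotonicity
    rmse = float(np.sqrt(np.mean((pred - mos) ** 2)))
    return plcc, srocc, krocc, rmse
```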
Fig. 5 shows the scatter plot of the objective visual comfort scores obtained with the two feature vectors F_1 and F_2 versus the mean opinion scores; Fig. 6 shows the scatter plot for F_1 and F_3; Fig. 7 shows the scatter plot for F_2 and F_3; and Fig. 8 shows the scatter plot for the three feature vectors F_1, F_2 and F_3. The more concentrated the points in a scatter plot, the better the consistency between the objective results and subjective perception. As can be seen from Figs. 5 to 8, the points in the scatter plots obtained by the method of the present invention are concentrated and match the subjective evaluation data well.
Table 1. Correlation between the visual comfort scores obtained with different feature vectors and the mean opinion scores
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201310264956.8A | 2013-06-27 | 2013-06-27 | Method for evaluating stereo image vision comfort level based on machine learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN103347196A true CN103347196A (en) | 2013-10-09 |
| CN103347196B CN103347196B (en) | 2015-04-29 |
Family
ID=49281967
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201310264956.8A Active CN103347196B (en) | 2013-06-27 | 2013-06-27 | Method for evaluating stereo image vision comfort level based on machine learning |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN103347196B (en) |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101742353B (en) * | 2008-11-04 | 2012-01-04 | 工业和信息化部电信传输研究所 | No-reference video quality evaluating method |
| CN102137271A (en) * | 2010-11-04 | 2011-07-27 | 华为软件技术有限公司 | Method and device for evaluating image quality |
| CN102945552A (en) * | 2012-10-22 | 2013-02-27 | 西安电子科技大学 | No-reference image quality evaluation method based on sparse representation in natural scene statistics |
| CN103096125A (en) * | 2013-02-22 | 2013-05-08 | 吉林大学 | Stereoscopic video visual comfort evaluation method based on region segmentation |
Non-Patent Citations (1)
| Title |
|---|
| Ye Bi and Jun Zhou: "Visual comfort assessment metric based on motion features in salient motion regions for stereoscopic 3D video", Communications in Computer and Information Science * |
Cited By (27)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103581661B (en) * | 2013-10-28 | 2015-06-03 | 宁波大学 | Method for evaluating visual comfort degree of three-dimensional image |
| CN103581661A (en) * | 2013-10-28 | 2014-02-12 | 宁波大学 | Method for evaluating visual comfort degree of three-dimensional image |
| WO2015062149A1 (en) * | 2013-10-30 | 2015-05-07 | 清华大学 | Method for acquiring degree of comfort of motion-sensing binocular stereoscopic video |
| CN103595990B (en) * | 2013-10-30 | 2015-05-20 | 清华大学 | Method for obtaining binocular stereoscopic video comfort level of motion perception |
| US10091484B2 (en) | 2013-10-30 | 2018-10-02 | Tsinghua University | Method for acquiring comfort degree of motion-sensing binocular stereoscopic video |
| CN104036502B (en) * | 2014-06-03 | 2016-08-24 | 宁波大学 | A kind of without with reference to fuzzy distortion stereo image quality evaluation methodology |
| CN104469355A (en) * | 2014-12-11 | 2015-03-25 | 西安电子科技大学 | Visual comfort prediction based on saliency adaptation and visual comfort enhancement method based on nonlinear mapping |
| CN104469355B (en) * | 2014-12-11 | 2016-09-28 | 西安电子科技大学 | Based on the prediction of notable adaptive euphoropsia and the euphoropsia Enhancement Method of nonlinear mapping |
| CN104581141B (en) * | 2015-01-09 | 2016-06-22 | 宁波大学 | A kind of stereo image vision comfort level evaluation methodology |
| CN104581141A (en) * | 2015-01-09 | 2015-04-29 | 宁波大学 | Three-dimensional picture visual comfort evaluation method |
| CN104811693A (en) * | 2015-04-14 | 2015-07-29 | 宁波大学 | A Method for Objective Evaluation of Visual Comfort of Stereo Image |
| CN105208374A (en) * | 2015-08-24 | 2015-12-30 | 宁波大学 | Non-reference image quality objective evaluation method based on deep learning |
| CN105243385B (en) * | 2015-09-23 | 2018-11-09 | 宁波大学 | A kind of image quality evaluating method based on unsupervised learning |
| CN105243385A (en) * | 2015-09-23 | 2016-01-13 | 宁波大学 | Unsupervised learning based image quality evaluation method |
| CN106683072A (en) * | 2015-11-09 | 2017-05-17 | 上海交通大学 | PUP (Percentage of Un-linked pixels) diagram based 3D image comfort quality evaluation method and system |
| CN105407349B (en) * | 2015-11-30 | 2017-05-03 | 宁波大学 | No-reference objective three-dimensional image quality evaluation method based on binocular visual perception |
| CN105407349A (en) * | 2015-11-30 | 2016-03-16 | 宁波大学 | No-reference objective three-dimensional image quality evaluation method based on binocular visual perception |
| CN106210710A (en) * | 2016-07-25 | 2016-12-07 | 宁波大学 | A kind of stereo image vision comfort level evaluation methodology based on multi-scale dictionary |
| CN106604012A (en) * | 2016-10-20 | 2017-04-26 | 吉林大学 | Method for evaluating comfort level of 3D video according to vertical parallax |
| CN106686377A (en) * | 2016-12-30 | 2017-05-17 | 佳都新太科技股份有限公司 | Algorithm for determining video key area based on deep neural network |
| CN106993183A (en) * | 2017-03-28 | 2017-07-28 | 天津大学 | A Quantitative Method of Comfortable Brightness Based on Salient Regions of Stereo Image |
| CN107909565A (en) * | 2017-10-29 | 2018-04-13 | 天津大学 | Stereo-picture Comfort Evaluation method based on convolutional neural networks |
| CN109754391A (en) * | 2018-12-18 | 2019-05-14 | 北京爱奇艺科技有限公司 | A kind of image quality evaluating method, device and electronic equipment |
| CN109754391B (en) * | 2018-12-18 | 2021-10-22 | 北京爱奇艺科技有限公司 | Image quality evaluation method and device and electronic equipment |
| CN111669563A (en) * | 2020-06-19 | 2020-09-15 | 福州大学 | A method for enhancing the visual comfort of stereo images based on reinforcement learning |
| CN111669563B (en) * | 2020-06-19 | 2021-06-25 | 福州大学 | Stereo image visual comfort enhancement method based on reinforcement learning |
| CN119785267A (en) * | 2024-12-25 | 2025-04-08 | 杭州电子科技大学 | A comfort prediction method for stereoscopic panoramic video based on deep learning |
Also Published As
| Publication number | Publication date |
|---|---|
| CN103347196B (en) | 2015-04-29 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN103347196B (en) | Method for evaluating stereo image vision comfort level based on machine learning | |
| CN103581661B (en) | Method for evaluating visual comfort degree of three-dimensional image | |
| CN104811693B (en) | A kind of stereo image vision comfort level method for objectively evaluating | |
| CN105407349B (en) | No-reference objective three-dimensional image quality evaluation method based on binocular visual perception | |
| CN104023230B (en) | A kind of non-reference picture quality appraisement method based on gradient relevance | |
| CN104581143A (en) | Reference-free three-dimensional picture quality objective evaluation method based on machine learning | |
| CN104036501A (en) | Three-dimensional image quality objective evaluation method based on sparse representation | |
| CN105389554A (en) | Live body discrimination method and device based on face recognition | |
| CN103942525A (en) | Real-time face optimal selection method based on video sequence | |
| CN105357519B (en) | Non-reference stereo image quality objective evaluation method based on self-similarity characteristics | |
| CN104581141B (en) | A kind of stereo image vision comfort level evaluation methodology | |
| CN104036502B (en) | A kind of without with reference to fuzzy distortion stereo image quality evaluation methodology | |
| CN106162162B (en) | A kind of reorientation method for objectively evaluating image quality based on rarefaction representation | |
| CN104394403A (en) | A compression-distortion-oriented stereoscopic video quality objective evaluating method | |
| CN106791822A (en) | It is a kind of based on single binocular feature learning without refer to stereo image quality evaluation method | |
| CN108805825A (en) | A kind of reorientation image quality evaluating method | |
| CN103338379A (en) | Stereoscopic video objective quality evaluation method based on machine learning | |
| CN105809182A (en) | Image classification method and device | |
| CN104574363A (en) | Full reference image quality assessment method in consideration of gradient direction difference | |
| CN104361583A (en) | A Method for Objective Quality Evaluation of Asymmetric Distorted Stereo Images | |
| CN104243956B (en) | A kind of stereo-picture visual saliency map extracting method | |
| CN106210710B (en) | A kind of stereo image vision comfort level evaluation method based on multi-scale dictionary | |
| CN108848365B (en) | A kind of reorientation stereo image quality evaluation method | |
| CN103065302B (en) | Image significance detection method based on stray data mining | |
| CN102708568A (en) | A Stereoscopic Image Objective Quality Evaluation Method Based on Structural Distortion |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| C14 | Grant of patent or utility model | ||
| GR01 | Patent grant | ||
| TR01 | Transfer of patent right | ||
| TR01 | Transfer of patent right |
Effective date of registration: 20191213
Address after: Room 1,020, Nanxun Science and Technology Pioneering Park, No. 666 Chaoyang Road, Nanxun District, Huzhou City, Zhejiang Province, 313000
Patentee after: Huzhou You Yan Intellectual Property Service Co.,Ltd.
Address before: 315211 Zhejiang Province, Ningbo Jiangbei District Fenghua Road No. 818
Patentee before: Ningbo University
| TR01 | Transfer of patent right | ||
| TR01 | Transfer of patent right |
Effective date of registration: 20230824
Address after: Room 111, 1st Floor, Building 4, Yard 5, Shangdi East Road, Haidian District, Beijing, 100000
Patentee after: Ape Point Technology (Beijing) Co.,Ltd.
Patentee after: Zheng Juan
Address before: 313000 room 1020, science and Technology Pioneer Park, 666 Chaoyang Road, Nanxun Town, Nanxun District, Huzhou, Zhejiang.
Patentee before: Huzhou You Yan Intellectual Property Service Co.,Ltd.