CN101521819B

CN101521819B - Method for optimizing rate distortion in video image compression

Info

Publication number: CN101521819B
Application number: CN 200810065439
Authority: CN
Inventors: 马国强
Original assignee: SHENZHEN RONGCHUANG TIANXIA TECHNOLOGY DEVELOPMENT Co Ltd
Current assignee: Shenzhen Temobi Science and Technology Co Ltd
Priority date: 2008-02-27
Filing date: 2008-02-27
Publication date: 2010-12-01
Anticipated expiration: 2028-02-27
Also published as: WO2009105948A1; CN101521819A

Abstract

The invention relates to the field of video image processing and provides a method for optimizing rate distortion in video image compression. The method comprises the following steps: A. the encoder transforms the acquired image from the spatial domain to the frequency domain and generates the frequency-domain energy distribution after the transform; B. the encoder obtains the rate-distortion optimization value according to the coding input parameter and the frequency-domain energy distribution; and C. the encoder controls the encoding compression of the image according to the rate-distortion optimization value. Compared with the prior art, when estimating rate distortion the invention introduces an evaluation parameter that combines the coding input parameter with the frequency-domain energy distribution after the transform, and obtains the rate-distortion optimization value from this evaluation parameter, which improves the evaluation precision and further enhances image compression performance.

Description

Method for optimizing rate distortion in video image compression

Technical field

The present invention relates to the field of video image processing and, more particularly, to a method for optimizing rate distortion in video image compression.

Background art

Modern video compression techniques generally adopt a hybrid coding framework. Such a framework typically provides a series of tools and algorithms such as motion search, spatial texture prediction, transform coding and entropy coding. According to information theory, the entropy of image regions with different characteristics varies greatly, so the theoretical maximum compression ratio fluctuates and different regions need to be compressed with different tools. For selecting the coding mode and the compression tools, modern coding theory and the accepted international standards such as H.264 and MPEG-4 all use rate-distortion optimization (RDO) methods.

The rate-distortion optimization process of the prior art comprises the following steps: (1) estimating the output bit rate under a given coding mode with the rate function R(Qp); (2) estimating the distortion produced under that coding mode with the distortion function D(Qp); (3) performing the optimization selection according to the rate-distortion function J = D(Qp) + λR(Qp). In this process the coding input parameter Qp is the only quantization parameter, and λ is calculated from Qp using empirical values. The prior art therefore approximates the bit rate and distortion produced under each coding mode or parameter by empirical estimation from this single parameter, and generates the rate-distortion curve from those estimates.
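As an illustration of the prior-art selection in step (3), the following is a minimal sketch rather than the method of any particular encoder. The multiplier formula λ = 0.85 · 2^((Qp − 12)/3) is an assumption borrowed from the mode decision of the H.264 JM reference software; the text above only states that λ is derived empirically from Qp. The mode names and the rate and distortion figures are hypothetical.

```python
def lagrange_multiplier(qp: int) -> float:
    """Assumed empirical mapping from Qp to lambda (JM-style mode decision)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def select_mode_prior_art(candidates, qp):
    """Return the candidate that minimizes J = D + lambda * R.

    `candidates` is a list of dicts with hypothetical keys
    'name', 'distortion' and 'rate'.
    """
    lam = lagrange_multiplier(qp)
    return min(candidates, key=lambda m: m["distortion"] + lam * m["rate"])


if __name__ == "__main__":
    modes = [
        {"name": "mode_a", "distortion": 100.0, "rate": 50.0},
        {"name": "mode_b", "distortion": 150.0, "rate": 10.0},
    ]
    for qp in (12, 24):
        print(qp, select_mode_prior_art(modes, qp)["name"])
```

Because λ grows with Qp, the cheaper but more distorted candidate wins at the higher Qp in this toy data; the whole trade-off is driven by the single parameter Qp, which is the limitation the invention addresses.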

It follows that the prior art derives the bit rate and the distortion from Qp alone, so the rate-distortion function is not precise enough; the deviation is particularly large when the encoder performs lossless compression or over-quantization.

A new method for optimizing rate distortion in video image compression is therefore needed, one that improves the rate-distortion evaluation precision of the encoder and thereby further enhances image compression performance.

Summary of the invention

The object of the present invention is to provide a method for optimizing rate distortion in video image compression, intended to solve the problem that the prior art evaluates rate distortion with low precision in video image compression, which results in poor image compression performance.

To achieve this object, the method for optimizing rate distortion in video image compression comprises the following steps:

A. the encoder transforms the acquired image from the spatial domain to the frequency domain and generates the frequency-domain energy distribution after the transform;

B. the encoder obtains the rate-distortion optimization value according to the coding input parameter and the frequency-domain energy distribution;

C. the encoder controls the encoding compression of the image according to the rate-distortion optimization value.

Preferably, step A further comprises transforming the video image from the spatial domain to the frequency domain by a Fourier transform or a discrete cosine transform.

Preferably, step B further comprises:

B1. calculating an evaluation parameter according to the coding input parameter and the frequency-domain energy distribution;

B2. calculating the rate-distortion value according to the evaluation parameter;

B3. traversing the coding modes and obtaining the parameters at which the rate-distortion value reaches its optimum, i.e. the rate-distortion optimization value.

Preferably, the evaluation parameter in step B1 is calculated as:

p = \alpha \cdot (1 - \gamma) + \beta \cdot \left( \frac{a}{Qp} + \frac{b}{Qp^{2}} \right);

where α and β are empirical constants, a and b are correction factors, Qp is the coding input parameter, and γ is the frequency-domain energy distribution after the transform.

Preferably, the rate-distortion value in step B2 is calculated as:

J = R(p) + pD(p);

where J is the rate-distortion value, p is the evaluation parameter, R(p) is the bit rate, and D(p) is the distortion.

Preferably, in the formula for the rate-distortion value in step B2:

the bit rate is calculated as R(p) = p + (1/δ-1)R(o), where δ is an empirical adjustment factor and R(o) is the bit rate obtained under the coding mode;

the distortion is calculated as

D(p) = \left( \sum_{x, y} \left| DiffT(x, y) \right| \right) / 2,

where (x, y) denotes the position coordinates of each pixel in the video image, and DiffT(x, y) is the coefficient obtained by transforming into the frequency domain the energy difference between the original image and the target image at the pixel at position (x, y).

Preferably, step C further comprises:

traversing the various coding modes and applying Lagrangian approximation to the rate-distortion value to obtain the parameters at which the rate-distortion value reaches its minimum, i.e. the rate-distortion optimization value.

To better achieve the object of the invention, the foregoing method is implemented on an encoder. The encoder comprises a rate-distortion optimization unit that obtains the rate-distortion optimization value according to the coding input parameter, and an encoding compression unit that controls image encoding according to the rate-distortion optimization value. The encoder further comprises an image domain transform unit that exchanges data with the rate-distortion optimization unit, transforms the video image from the spatial domain to the frequency domain, and feeds the frequency-domain energy distribution after the transform into the rate-distortion optimization unit;

the rate-distortion optimization unit obtains the rate-distortion optimization value according to the coding input parameter and the frequency-domain energy distribution, and sends it to the encoding compression unit.

Preferably, the image domain transform unit transforms the video image from the spatial domain to the frequency domain by a Fourier transform or a discrete cosine transform.

Preferably, the rate-distortion optimization unit further comprises an evaluation parameter determination module, a rate-distortion calculation module and a polling optimization module;

the evaluation parameter determination module calculates the evaluation parameter according to the coding input parameter and the frequency-domain energy distribution;

the rate-distortion calculation module exchanges data with the evaluation parameter determination module and calculates the rate-distortion value according to the evaluation parameter;

the polling optimization module exchanges data with the rate-distortion calculation module, traverses the coding modes, and obtains the parameters at which the rate-distortion value reaches its optimum.

When estimating rate distortion, the present invention differs from the prior art in that the evaluation parameter incorporates both the coding input parameter and the frequency-domain energy distribution after the transform. The rate-distortion optimization value is obtained from this evaluation parameter, which improves the evaluation precision and therefore further enhances image compression performance.

Brief description of the drawings

Fig. 1 is a flow chart of the method of the present invention for optimizing rate distortion in video image compression;

Fig. 2 is a flow chart of the method for optimizing rate distortion in video image compression according to one embodiment of the present invention;

Fig. 3 is a structural diagram of the coding system of the present invention;

Fig. 4 shows the rate-distortion curves obtained with the prior art and with one embodiment of the present invention, respectively.

Detailed description

To make the object, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it.

When estimating rate distortion, the present invention incorporates both the coding input parameter and the frequency-domain energy distribution after the transform into an evaluation parameter, calculates the rate-distortion value from this evaluation parameter, and then applies Lagrangian linear approximation to obtain the rate-distortion optimization value, which improves the precision of rate-distortion evaluation.

Fig. 1 shows the flow of the method of the present invention for optimizing rate distortion in video image compression. The process is as follows:

In step S101, the acquired image is transformed from the spatial domain to the frequency domain, and the frequency-domain energy distribution after the transform is generated.

In step S102, the rate-distortion optimization value is obtained according to the coding input parameter and the frequency-domain energy distribution.

In step S103, the encoding compression of the image is controlled according to the rate-distortion optimization value.

Fig. 2 shows the flow of the method for optimizing rate distortion in video image compression according to one embodiment of the present invention. This flow is based on the flow shown in Fig. 1, and the detailed process is as follows:

In step S201, the acquired image is transformed from the spatial domain to the frequency domain, and the frequency-domain energy distribution after the transform is generated.

In one exemplary scenario, the video image is transformed from the spatial domain to the frequency domain by a Fourier transform. In an embodiment under this scenario, the frequency-domain energy distribution obtained after the Fourier transform is as follows:

where γ is the frequency-domain energy distribution, (x, y) denotes the position coordinates of each pixel in the video image, X(x, y) is the spectrum of the image region after the transform, and h(x, y) is a correction factor whose specific value can be adjusted experimentally as the case requires; in one embodiment h(x, y) = 1 may be taken. The formula also includes a normalization factor, and A is the number of pixels. It should be noted that the above formula is only one example of the present invention; other variations of this formula should also fall within the scope of protection of the present invention.

In another exemplary scenario, the video image is transformed from the spatial domain to the frequency domain by a discrete cosine transform (DCT). In an embodiment under this scenario, the frequency-domain energy distribution obtained after the DCT is as follows:

\gamma = \frac{1}{A} \sum_{x, y \in X(x, y)} X(x, y) \cdot h(x, y).

where X(x, y) and h(x, y) have the same meaning as in the preceding formula, and A is the number of pixels. It should be noted that the above formula is only one example of the present invention; other variations of this formula should also fall within the scope of protection of the present invention.
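The DCT-based formula can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which DCT normalization is used, how the pixel values are scaled, or whether γ is expected to lie in [0, 1], so the sketch assumes an orthonormal 2-D DCT-II on a square block and the default h(x, y) = 1. The function names `dct_2d` and `energy_distribution` are hypothetical.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of lists (assumed variant)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for x in range(n):
                for y in range(n):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


def energy_distribution(block, h=None):
    """gamma = (1/A) * sum of X(x, y) * h(x, y), with A the number of pixels."""
    n = len(block)
    spectrum = dct_2d(block)
    total = 0.0
    for u in range(n):
        for v in range(n):
            weight = 1.0 if h is None else h[u][v]  # h(x, y) = 1 as in the embodiment
            total += spectrum[u][v] * weight
    return total / (n * n)


if __name__ == "__main__":
    flat = [[1.0] * 4 for _ in range(4)]
    print(energy_distribution(flat))
```

For a flat 4 × 4 block all energy sits in the DC coefficient, which makes it a convenient sanity check; real blocks would spread energy across the higher-frequency coefficients.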

In step S202, the evaluation parameter is calculated according to the coding input parameter and the frequency-domain energy distribution. In one embodiment, the evaluation parameter p is calculated as:

p = \alpha \cdot (1 - \gamma) + \beta \cdot \left( \frac{a}{Qp} + \frac{b}{Qp^{2}} \right).

where α and β are empirical constants, a and b are correction factors, Qp is the coding input parameter, and γ is the frequency-domain energy distribution after the transform. The values of α, β, a and b can be adjusted in application as the case requires; in one embodiment they may be taken as α = 0.7231, β = 0.2769, a = b = 1. It should be noted that the above formula is only one example of the present invention; other variations of this formula should also fall within the scope of protection of the present invention.
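A minimal sketch of this calculation, using the example constants of the embodiment as defaults. The function name `evaluation_parameter` and the guard against non-positive Qp are assumptions added for illustration.

```python
def evaluation_parameter(qp, gamma, alpha=0.7231, beta=0.2769, a=1.0, b=1.0):
    """p = alpha * (1 - gamma) + beta * (a / Qp + b / Qp^2)."""
    if qp <= 0:
        # Assumption: the formula divides by Qp, so Qp must be positive.
        raise ValueError("qp must be positive")
    return alpha * (1.0 - gamma) + beta * (a / qp + b / qp ** 2)


if __name__ == "__main__":
    # gamma = 0.25 is a hypothetical energy-distribution value.
    for qp in (28, 32, 36, 40):
        print(qp, round(evaluation_parameter(qp, gamma=0.25), 6))
```

With these constants the Qp-dependent term is small at typical Qp values, so p is dominated by α · (1 − γ); the frequency-domain content of the region therefore has a direct effect on the rate-distortion trade-off.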

In step S203, the rate-distortion value is calculated according to the evaluation parameter. In one embodiment, the rate-distortion value is calculated as:

J = R(p) + pD(p), where J is the rate-distortion value, p is the evaluation parameter, R(p) is the bit rate, and D(p) is the distortion. It should be noted that the above formula is only one example of the present invention; other variations of this formula should also fall within the scope of protection of the present invention.

In the above formula, the bit rate R(p) is predicted differently in different coding systems when it is computed for different coding modes. In one embodiment, for example during motion search, the bit rate after residual quantization and entropy coding must be considered, and the number of bits spent on coding the specific motion vector must be added, which gives the bit rate R(o) under that coding mode. For example, in this embodiment R(p) is calculated as:

R(p) = p + (1/δ-1)R(o).

where δ is an empirical adjustment factor that depends on the specific coding system. In one embodiment, for example in encoders such as H.264 and MPEG-4, it may take the value

\delta = Qp^{\frac{2}{3}},

where Qp is the input parameter of the rate-distortion module, passed in by the encoder when the rate-distortion module is entered. It should be noted that the above formula is only one example of the present invention; other variations of this formula should also fall within the scope of protection of the present invention.

In the above formula, SATD (Sum of Absolute Transformed Differences, the sum of absolute differences in the transform domain) is used to measure the energy difference between image regions, so in one embodiment the distortion D(p) is calculated as:

D(p) = \left( \sum_{x, y} \left| DiffT(x, y) \right| \right) / 2.

where (x, y) denotes the position coordinates of each pixel in the video image, and DiffT(x, y) is the coefficient obtained by transforming into the frequency domain the energy difference between the original image and the target image at the pixel at position (x, y). It should be noted that the above formula is only one example of the present invention; other variations of this formula should also fall within the scope of protection of the present invention.

In this embodiment,

DiffT(x, y) = H × Diff(x, y) × H, where Diff(x, y) = Original(x, y) - Prediction(x, y).

Here H is the Hadamard transform matrix. Computing H × Diff(x, y) × H moves the distortion measurement into the frequency domain. Because the present invention already includes frequency-domain information in p, using DiffT(x, y) rather than Diff(x, y) directly yields higher precision.

In the above embodiment, the Hadamard transform matrix H is as follows:
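The matrix itself is not preserved in this text. As an illustration only, the sketch below assumes the common symmetric 4 × 4 Hadamard matrix used for SATD in H.264-style encoders; the matrix `H4`, the 4 × 4 block size, and the function names are assumptions, not values taken from the patent.

```python
# Assumed 4x4 Hadamard matrix (symmetric, entries +1 / -1).
H4 = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]


def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def satd_4x4(original, prediction):
    """D = (sum of |DiffT(x, y)|) / 2 with DiffT = H x Diff x H."""
    diff = [[o - p for o, p in zip(row_o, row_p)]
            for row_o, row_p in zip(original, prediction)]
    diff_t = matmul(matmul(H4, diff), H4)
    return sum(abs(c) for row in diff_t for c in row) / 2


if __name__ == "__main__":
    original = [[10, 10, 10, 10] for _ in range(4)]
    prediction = [[9, 9, 9, 9] for _ in range(4)]
    print(satd_4x4(original, prediction))
```

A uniform difference of 1 concentrates in the DC coefficient, while a difference at a single pixel spreads evenly over all sixteen coefficients; both give the same SATD here, which shows that the measure reacts to transformed energy rather than to pixel positions.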

In step S204, the coding modes are traversed to obtain the parameters at which the rate-distortion value reaches its optimum, i.e. the rate-distortion optimization value. In one exemplary scenario, the various coding modes are traversed, Lagrangian approximation is applied to the rate-distortion value, and the parameters at which the rate-distortion value reaches its minimum are obtained; this set of parameters is taken as the rate-distortion optimization value. It follows that the rate-distortion optimization value is not obtained directly by calculation; it is a set of optimal values found by traversal based on the algorithm described here.
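The traversal can be sketched as a plain exhaustive search over candidate modes that minimizes J = R(p) + pD(p). This is a simplified illustration: the text does not detail the Lagrangian approximation step, so none is modelled, and the mode names and the rate and distortion figures are hypothetical inputs that an encoder would supply.

```python
def rd_cost(p, rate, distortion):
    """Rate-distortion value J = R + p * D for one candidate mode."""
    return rate + p * distortion


def best_mode(modes, p):
    """Traverse all candidate modes and return (name, cost) of the minimum-J mode.

    `modes` maps a mode name to a (rate, distortion) pair.
    """
    best_name, best_cost = None, float("inf")
    for name, (rate, distortion) in modes.items():
        cost = rd_cost(p, rate, distortion)
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


if __name__ == "__main__":
    candidates = {
        "intra_4x4": (40.0, 10.0),
        "inter_16x16": (12.0, 30.0),
        "skip": (1.0, 80.0),
    }
    print(best_mode(candidates, p=0.55))
    print(best_mode(candidates, p=0.10))
```

Since p multiplies the distortion term, a smaller p lets cheaper but more distorted modes win, which is how the frequency-domain content of a region can change the selected mode even at a fixed Qp.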

A rate-distortion curve generally uses the bit rate (kbit/s) as the abscissa and the peak signal-to-noise ratio (PSNR, in dB) as the ordinate. The points on the curve are generally the encoding bit rate and encoding quality at the four Qp values 28, 32, 36 and 40; the higher the curve, the better the performance. In one specific application scenario, for example in the H.264 reference software JM 7.6, the rate-distortion curves obtained by applying the present invention (using standard test sequences as samples) are shown in Fig. 4. The lower curve is the rate-distortion curve of the prior art and the upper curve is the rate-distortion curve obtained with the present invention. The data for each point are given in the following table:

Prior-art rate-distortion curve		Rate-distortion curve of the present invention
PSNR (dB)	Bit rate (kbit/s)	PSNR (dB)	Bit rate (kbit/s)
38.71	115.34	38.68	112.34
37.14	89.95	37.28	87.26
35.71	70.64	36.05	69.54
34.21	54.45	34.95	53.05
32.75	41.66	33.85	39.26
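To read the table row by row, the short sketch below computes the PSNR and bit-rate differences between the two curves. The figures are copied from the table; treating each row as a matched pair of operating points is an assumption, and the function name `row_deltas` is hypothetical.

```python
# (PSNR in dB, bit rate in kbit/s) pairs copied from the table above.
PRIOR_ART = [(38.71, 115.34), (37.14, 89.95), (35.71, 70.64),
             (34.21, 54.45), (32.75, 41.66)]
INVENTION = [(38.68, 112.34), (37.28, 87.26), (36.05, 69.54),
             (34.95, 53.05), (33.85, 39.26)]


def row_deltas(prior, new):
    """Per-row (PSNR delta, bit-rate delta), invention minus prior art."""
    return [(round(n_psnr - p_psnr, 2), round(n_rate - p_rate, 2))
            for (p_psnr, p_rate), (n_psnr, n_rate) in zip(prior, new)]


if __name__ == "__main__":
    for d_psnr, d_rate in row_deltas(PRIOR_ART, INVENTION):
        print(f"PSNR {d_psnr:+.2f} dB, bit rate {d_rate:+.2f} kbit/s")
```

On these figures the bit rate is lower in every row and the PSNR is higher in every row except the first, where it is 0.03 dB lower.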

In step S205, the encoding compression of the image is controlled according to the rate-distortion optimization value. The specific encoding process may follow the prior art and is not repeated here.

In one application scenario, the above method of the present invention is implemented on a coding system. This coding system can be applied in a variety of encoders, for example any H.120, H.261, H.263, H.264, MPEG-1 or MPEG-4 encoder, or an encoder of any other hybrid framework.

Fig. 3 shows the structure of the coding system of the present invention, which comprises the following logical functional units: an image domain transform unit 100, a rate-distortion optimization unit 200, and an encoding compression unit 300. It should be noted that in a specific application these logical functional units may be realized by a variety of devices, components, or combinations thereof, so the scope of protection is not limited to specific physical devices. In addition, the connections between devices or logical functional units in all drawings of the present invention serve to explain the information exchange and control process clearly; they should be regarded as logical connections and are not limited to physical connections. In particular:

(1) The image domain transform unit 100 exchanges data with the rate-distortion optimization unit 200. It transforms the video image from the spatial domain to the frequency domain and feeds the frequency-domain energy distribution after the transform into the rate-distortion optimization unit 200.

In one exemplary scenario, the image domain transform unit 100 transforms the video image from the spatial domain to the frequency domain by a Fourier transform. In an embodiment under this scenario, the frequency-domain energy distribution obtained after the Fourier transform is as follows:

where γ is the frequency-domain energy distribution, (x, y) denotes the position coordinates of each pixel in the video image, X(x, y) is the spectrum of the image region after the transform, and h(x, y) is a correction factor whose specific value can be adjusted experimentally as the case requires; in one embodiment h(x, y) = 1 may be taken. The formula also includes a normalization factor, and A is the number of pixels.

In another exemplary scenario, the image domain transform unit 100 transforms the video image from the spatial domain to the frequency domain by a discrete cosine transform (DCT). In an embodiment under this scenario, the frequency-domain energy distribution obtained after the DCT is as follows:

\gamma = \frac{1}{A} \sum_{x, y \in X(x, y)} X(x, y) \cdot h(x, y).

where X(x, y) and h(x, y) have the same meaning as in the foregoing formula (1), and A is the number of pixels.

(2) The rate-distortion optimization unit 200 obtains the rate-distortion optimization value according to the coding input parameter and the frequency-domain energy distribution, and sends it to the encoding compression unit 300. Its internal structure comprises an evaluation parameter determination module 201, a rate-distortion calculation module 202 and a polling optimization module 203, wherein:

The evaluation parameter determination module 201 calculates the evaluation parameter (denoted p) according to the coding input parameter and the frequency-domain energy distribution. In one embodiment, the evaluation parameter p is calculated as:

p = \alpha \cdot (1 - \gamma) + \beta \cdot \left( \frac{a}{Qp} + \frac{b}{Qp^{2}} \right).

where α and β are empirical constants, a and b are correction factors, Qp is the coding input parameter, and γ is the frequency-domain energy distribution after the transform. The values of α, β, a and b can be adjusted in application as the case requires; in one embodiment they may be taken as α = 0.7231, β = 0.2769, a = b = 1.

The rate-distortion calculation module 202 exchanges data with the evaluation parameter determination module 201 and calculates the rate-distortion value according to the evaluation parameter. In one embodiment, the rate-distortion value is calculated as:

J = R(p) + pD(p).

R(p) = p + (1/δ-1)R(o).

\delta = Qp^{\frac{2}{3}},

where Qp is the input parameter of the rate-distortion module, passed in by the encoder when the rate-distortion module is entered.

D(p) = \left( \sum_{x, y} \left| DiffT(x, y) \right| \right) / 2.

where (x, y) denotes the position coordinates of each pixel in the video image, and DiffT(x, y) is the coefficient obtained by transforming into the frequency domain the energy difference between the original image and the target image at the pixel at position (x, y). In this embodiment,

DiffT(x, y) = H × Diff(x, y) × H.

where Diff(x, y) = Original(x, y) - Prediction(x, y).

H is the Hadamard transform matrix. Computing H × Diff(x, y) × H moves the distortion measurement into the frequency domain. Because the present invention already includes frequency-domain information in p, using DiffT(x, y) rather than Diff(x, y) directly yields higher precision.

The polling optimization module 203 exchanges data with the rate-distortion calculation module 202, traverses the coding modes, and obtains the parameters at which the rate-distortion value reaches its optimum. In one exemplary scenario, the polling optimization module 203 traverses the various coding modes, applies Lagrangian approximation to the rate-distortion value, and obtains the parameters at which the rate-distortion value reaches its minimum; this set of parameters is taken as the rate-distortion optimization value. It follows that the rate-distortion optimization value is not obtained directly by calculation; it is a set of optimal values found by traversal based on the algorithm described here.

The rate-distortion model itself does not depend on a specific coding framework and can be applied to H.264, MPEG-4 and other encoders that use a rate-distortion optimization strategy. In one specific application scenario, for example in the H.264 reference software JM 7.6, the rate-distortion curves obtained by applying the present invention (using standard test sequences as samples) are shown in Fig. 4.

(3) The encoding compression unit 300 controls image encoding according to the rate-distortion optimization value supplied by the rate-distortion optimization unit 200. The specific encoding process may follow the prior art and is not repeated here.
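The data flow between units 100, 200 and 300 can be sketched as three small classes wired together. This is a structural illustration only: the class and method names are hypothetical, the transform and the entropy coder are injected as stand-in callables, and the rate and distortion of each candidate mode are assumed to be supplied by the encoder.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Block = Sequence[Sequence[float]]
Modes = Dict[str, Tuple[float, float]]  # mode name -> (rate, distortion)


@dataclass
class ImageDomainTransformUnit:  # unit 100
    energy_fn: Callable[[Block], float]  # stand-in for the Fourier/DCT-based gamma

    def energy_distribution(self, block: Block) -> float:
        return self.energy_fn(block)


@dataclass
class RateDistortionOptimizationUnit:  # unit 200
    alpha: float = 0.7231
    beta: float = 0.2769
    a: float = 1.0
    b: float = 1.0

    def evaluation_parameter(self, qp: float, gamma: float) -> float:  # module 201
        return self.alpha * (1.0 - gamma) + self.beta * (self.a / qp + self.b / qp ** 2)

    @staticmethod
    def cost(p: float, rate: float, distortion: float) -> float:  # module 202
        return rate + p * distortion

    def best_mode(self, qp: float, gamma: float, modes: Modes) -> str:  # module 203
        p = self.evaluation_parameter(qp, gamma)
        return min(modes, key=lambda name: self.cost(p, *modes[name]))


@dataclass
class EncodingCompressionUnit:  # unit 300
    encode_fn: Callable[[Block, str], bytes]  # stand-in for the actual coder

    def encode(self, block: Block, mode: str) -> bytes:
        return self.encode_fn(block, mode)


def encode_block(block, qp, modes, transform, rdo, coder):
    """Unit 100 feeds gamma to unit 200, whose chosen mode drives unit 300."""
    gamma = transform.energy_distribution(block)
    mode = rdo.best_mode(qp, gamma, modes)
    return mode, coder.encode(block, mode)
```

Keeping the transform and the coder behind injected callables mirrors the statement that the units are logical rather than physical: any concrete transform or codec back end can be substituted without changing the optimization unit.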

It should be noted that the present invention can be applied in a variety of encoders, for example any H.120, H.261, H.263, H.264, MPEG-1 or MPEG-4 encoder, or an encoder of any other hybrid framework.

The above are only preferred embodiments of the present invention and do not limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for optimizing rate distortion in video image compression, characterized in that the method comprises the following steps:

C. the encoder controls the encoding compression of the image according to the rate-distortion optimization value;

wherein step B further comprises:

2. The method for optimizing rate distortion in video image compression according to claim 1, characterized in that step A further comprises transforming the video image from the spatial domain to the frequency domain by a Fourier transform or a discrete cosine transform.

3. The method for optimizing rate distortion in video image compression according to claim 1, characterized in that the evaluation parameter in step B1 is calculated as:

P = \alpha \cdot (1 - \gamma) + \beta \cdot \left( \frac{a}{Qp} + \frac{b}{Qp^{2}} \right);

where P is the evaluation parameter, α and β are empirical constants, a and b are correction factors, Qp is the coding input parameter, and γ is the frequency-domain energy distribution after the transform.

4. The method for optimizing rate distortion in video image compression according to claim 1, characterized in that the rate-distortion value in step B2 is calculated as:

J = R(p) + pD(p);

5. The method for optimizing rate distortion in video image compression according to claim 4, characterized in that, in the formula for the rate-distortion value in step B2:

the distortion is calculated as

6. The method for optimizing rate distortion in video image compression according to claim 4, characterized in that step C further comprises: