WO2010078758A1

WO2010078758A1 - Method for encoding video signal

Info

Publication number: WO2010078758A1
Application number: PCT/CN2009/073589
Authority: WO
Inventors: 马国强
Original assignee: 深圳市融创天下科技发展有限公司
Priority date: 2009-01-09
Filing date: 2009-08-28
Publication date: 2010-07-15
Also published as: CN101778296B; CN101778296A

Abstract

A method for encoding a video signal is provided. Within the current slice, the method analyzes the change in complexity of the previous macroblock (MB) relative to the complexity of the MBs already encoded in the entire slice, and predicts the number of output bits b_n of the MB n currently being encoded. If b_n exceeds Formula (I), encoding of the current slice stops; if b_n does not exceed Formula (I), encoding of the MB continues. Here s is the predicted number of NAL bytes.

Description

Video signal coding method

The present invention relates to video signal processing, and more particularly to a video signal encoding method.

Background art

Mobile wireless channels are error-prone. Moreover, a picture compressed with a modern coding technique such as H.264 is very small, which makes the coded stream particularly sensitive to packet loss and bit errors. The bit error rate of a wireless channel depends on factors such as the speed of movement, the bit rate, the time span of a packet, and the packet size. Table 1 lists six example application modes that represent different movement speeds, bit rates, and packet time spans.

Table 1: Six possible application modes in wireless transmission

*BER refers to the bit error ratio.

FIG. 1 shows how the bit error rate of the six modes in the wireless channel changes with packet size. In FIG. 1, BER refers to the bit error ratio and BEP refers to the bit error pattern. As FIG. 1 shows, the bit error rate of all six modes increases almost linearly as the packet size increases (that is, as the number of packets decreases). The experiment of FIG. 1 therefore indicates that reducing the packet size benefits transmission efficiency. From the perspective of the encoder's rate-distortion performance, however, a smaller packet size means a larger number of NAL (network abstraction layer) units. Because each NAL unit must be decodable independently, redundant information increases, with two consequences: the slice header inside each slice NAL unit must repeat the slice header syntax elements, and slices in different slice NAL units cannot refer to each other, which reduces the rate-distortion performance of intra-coded macroblocks.

In wireless channel transmission, the choice of packet size is an important factor. The aim is to find a balance point at which the rate-distortion performance of the encoder is not significantly reduced and adequate network passability is still obtained. This first requires the ability to control the size of the encoder's output packets at the NAL layer arbitrarily.

Figure 2 shows the number of bits output for each frame when the FOREMAN test sequence is encoded at 120 kbps in CBR (constant bit rate) mode. Because the complexity of natural video sequences always fluctuates and the coding modes of individual frames differ, the number of bits generated per frame still fluctuates considerably even when coding in CBR mode. H.264 defines the slice structure, which allows each frame to be cut into several slices and thus provides the conditions for the present invention. One challenge that remains, however, is how to predict and control the size of each slice accurately and in real time.

Summary of the invention

The technical problem to be solved by the present invention is the defect of the prior art that the packet size cannot be controlled dynamically, so that the increase in redundant information and the loss of rate-distortion performance cannot be balanced. To address this defect, a video signal encoding method capable of dynamically controlling the packet size is provided.

The technical solution adopted by the present invention is to provide a video signal encoding method in which, within the current slice, the complexity of the previous macroblock is analyzed relative to the complexity of the macroblocks already encoded in the entire slice in order to predict the number of output bits $b_n$ of the macroblock n currently being encoded. If $b_n$ exceeds $(s \cdot 8 - b_m)$, the current slice is cut off; if $b_n$ does not exceed $(s \cdot 8 - b_m)$, encoding of macroblock n continues. Here s is the predicted number of NAL bytes.

In the video signal encoding method of the present invention,

$$b_n = b_{n-1} \cdot \frac{R_n}{R_{n-1}} \cdot \alpha$$

where $b_n$ is the predicted number of coded bits of macroblock n;

$$R_n = \sum_{i=0}^{3} \sum_{j=0}^{3} \sum_{k=0}^{3} \sum_{l=0}^{3} C_{n,i,j,k,l}^{2}$$

denotes the sum of the squares of the residual coefficients of all 4×4 small blocks inside macroblock n; $C_{n,i,j,k,l}$ ($0 \le i, j, k, l \le 3$) is the residual coefficient at coordinates $(i, j, k, l)$ inside macroblock n; and $\alpha$ is an adjustment factor whose value ranges from 0.5 to 2;

$$s = \frac{t}{8 \cdot f} \cdot \frac{\overline{Qp}_n}{21}$$

where t is the target average bit rate and f is the frame rate;

$$\overline{Qp}_n = \begin{cases} Qp_n^i - 4, & \text{I frame} \\ Qp_n^i - 2, & \text{P frame} \\ Qp_n^i, & \text{B frame} \end{cases}$$

where $Qp_n^i$ is the quantization parameter of the i-th macroblock of the n-th frame image, and $\overline{Qp}_n$ is the normalized quantization parameter of the i-th macroblock of the n-th frame image.

In the video signal encoding method of the present invention, $\alpha = 1.1$.

In the video signal encoding method of the present invention, s = min(s, 1024).

The video signal encoding method of the present invention has the following beneficial effects. The encoder can predict the NAL length under the current bit rate, frame rate, and network parameters; gather statistics on the complexity and output bit counts of the macroblocks already coded; predict in real time the complexity and expected number of coded bits of the macroblock currently being coded; and determine the slice cutoff condition. It thereby controls the slice size and achieves an adaptive balance between rate-distortion performance and the bit error rate of the data packets.

DRAWINGS

The present invention will be further described with reference to the accompanying drawings and embodiments in which:

Figure 1 shows the relationship between the bit error rate and the packet length;

Figure 2 is a schematic diagram showing the fluctuation of the number of bits per frame over 300 frames of the FOREMAN sequence;

Figure 3 is a flow chart of the video signal encoding method of the present invention.

Detailed description

Referring to FIG. 3, the video signal encoding method of the present invention dynamically determines a target packet size for each type of picture and uses it to guide the workflow of the encoder. The method includes (1) a NAL length prediction method and (2) an adaptive slice cutoff method.

The NAL length prediction method predicts the NAL length under the current bit rate, frame rate, and network parameters, so as to balance rate-distortion performance against the channel error rate.

The adaptive slice cutoff method gathers statistics on the complexity and output bit counts of the macroblocks already coded, predicts in real time the complexity and expected number of coded bits of the macroblock currently being coded, and determines the slice cutoff condition, thereby controlling the slice size.

Let t be the target average bit rate, f the frame rate, and s the predicted number of NAL bytes for the corresponding picture. Then:

$$s = \beta \cdot \frac{t}{8 \cdot f} \qquad (1)$$

In equation (1), $\beta$ is the adjustment factor, which is determined according to the picture type, the degree of over-compression of the picture, and similar factors. Since compression loss arises essentially in the quantization stage, the RD (rate-distortion) performance of a picture can be roughly estimated from its quantization parameter values.

Let the current coded picture be $Frame_n$ with macroblock quantization parameters $Qp_n^i$, and let the quantization parameters of the macroblocks of the previous picture $Frame_{n-1}$ be $Qp_{n-1}^i$. To reduce the amount of calculation, the rate-distortion performance of $Frame_{n-1}$ is roughly estimated by analyzing $Qp_{n-1}^i$, and the situation of $Frame_n$ is then predicted from it. If the quantization parameter is too high, $Frame_{n-1}$ is heavily over-compressed, so the NAL size (that is, the number of bits) in $Frame_n$ is increased to compensate for the rate-distortion performance. Conversely, if the quantization parameter of $Frame_{n-1}$ is low, the quality of $Frame_{n-1}$ is good, so the NAL size (that is, the number of bits) in $Frame_n$ is reduced, which lowers the bit error rate in the channel and improves the network passability of the code stream.

In actual coding, the macroblocks of I, P, and B pictures are of different types and use different quantization strategies. In the encoder, the I picture serves as the motion reference source for the entire GOP (Group of Pictures) and needs the highest rate-distortion performance; its quantization parameter is generally 2 to 3 higher than that of the P picture. The P picture also has a high temporal reference value compared with the B picture, and its quantization parameter is usually 2 higher than that of the B picture. A similar normalization can therefore be applied to $Qp_n^i$:

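$$\overline{Qp}_n = \begin{cases} Qp_n^i - 4, & \text{I frame} \\ Qp_n^i - 2, & \text{P frame} \\ Qp_n^i, & \text{B frame} \end{cases} \qquad (2)$$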

The adjustment factor $\beta$ takes its value according to equation (3). The model described by equation (3) has the advantages of being computationally light, simple, and accurate. The value 21 is used as the threshold because 21 is close to the starting point of the quantization parameter at low bit rates; when the encoder works in the low bit rate range, the quantization parameter hardly ever falls below 21.

When $\overline{Qp}_n > 21$, the encoder follows the NAL bit-count principle analyzed above: the larger the quantization parameter, the greater the distortion, so the NAL size is enlarged to improve rate-distortion performance; the smaller the quantization parameter, the higher the rate-distortion quality, so the number of NAL bytes can be reduced to lower the network error rate. When $\overline{Qp}_n < 21$, this model does not apply.

$$\beta = \frac{\overline{Qp}_n}{21}, \quad \overline{Qp}_n > 21 \qquad (3)$$

After $\beta$ is determined, the final expression for s is obtained:

$$s = \frac{t}{8 \cdot f} \cdot \frac{\overline{Qp}_n}{21} \qquad (4)$$

Finally, s must satisfy s = min(s, 1024), where min(·, ·) denotes the minimum function: when s is greater than 1024, s is set to 1024; when s is less than 1024, s is unchanged. This cap is needed because routers in IP networks impose an MTU (Maximum Transmission Unit) limit.
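The NAL length prediction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the name `predict_nal_bytes` and its parameters are hypothetical, the adjustment factor is assumed to equal the normalized quantization parameter divided by 21 when that parameter exceeds 21, and, because the text only says the model does not apply below 21, the sketch assumes a factor of 1 in that case.

```python
MTU_CAP_BYTES = 1024  # upper bound on s, from s = min(s, 1024)
QP_THRESHOLD = 21     # threshold on the normalized quantization parameter


def predict_nal_bytes(bitrate_bps: float, frame_rate: float, qp_norm: float) -> float:
    """Predict the NAL size s (in bytes) for one picture.

    bitrate_bps -- target average bit rate t, in bits per second
    frame_rate  -- frame rate f, in frames per second
    qp_norm     -- normalized quantization parameter of the picture
    """
    if bitrate_bps <= 0 or frame_rate <= 0:
        raise ValueError("bit rate and frame rate must be positive")

    # Average number of bytes available per frame: t / (8 * f).
    bytes_per_frame = bitrate_bps / (8.0 * frame_rate)

    if qp_norm > QP_THRESHOLD:
        # Heavier quantization -> larger NAL to recover rate-distortion performance.
        factor = qp_norm / QP_THRESHOLD
    else:
        # Assumption: the text gives no value for this case, so leave s unscaled.
        factor = 1.0

    # Cap the result so that a NAL unit stays within the MTU-motivated limit.
    return min(bytes_per_frame * factor, MTU_CAP_BYTES)
```

For example, at 120 kbps and 30 frames per second the per-frame average is 500 bytes, so a normalized quantization parameter of 42 doubles the prediction to 1000 bytes, while any result above 1024 bytes is clipped to 1024.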

The main idea of the adaptive slice cutoff method is to predict the number of output bits of the macroblock currently being coded by analyzing the complexity of the previous macroblock in the current slice relative to the complexity of the macroblocks already coded in the entire slice, and from that prediction to determine where the slice is cut off.

The advantage of this method is that the cutoff can be determined in advance, without actually coding each macroblock. Let n be the index of the macroblock being coded in the current slice, and let $C_{n,i,j,k,l}$ ($0 \le i, j, k, l \le 3$) be the residual coefficient at coordinates $(i, j, k, l)$ inside macroblock n.

$$R_n = \sum_{i=0}^{3} \sum_{j=0}^{3} \sum_{k=0}^{3} \sum_{l=0}^{3} C_{n,i,j,k,l}^{2}$$

$R_n$ denotes the sum of the squares of the residual coefficients of all 4×4 small blocks within macroblock n.

Equation (5) gives the predicted number of coded bits of a macroblock. Let $b_n$ be the number of coded bits of the n-th macroblock; $b_n$ is predicted from $b_{n-1}$, weighted by complexity:

$$b_n = b_{n-1} \cdot \frac{R_n}{R_{n-1}} \cdot \alpha \qquad (5)$$
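As a rough illustration of the complexity measure and the bit prediction, the Python sketch below assumes that the prediction scales the previous macroblock's bit count by the ratio of the two complexities and by the adjustment factor. The function names, the nested-list representation of a macroblock, and the fallback used when the previous complexity is zero are all hypothetical choices made for the example.

```python
from typing import Sequence

Block4x4 = Sequence[Sequence[float]]  # one 4x4 block of residual coefficients


def macroblock_complexity(blocks: Sequence[Block4x4]) -> float:
    """R_n: sum of squared residual coefficients over all 4x4 blocks."""
    return float(sum(c * c for block in blocks for row in block for c in row))


def predict_macroblock_bits(prev_bits: float, prev_complexity: float,
                            cur_complexity: float, alpha: float = 1.1) -> float:
    """Predict b_n as b_(n-1) * (R_n / R_(n-1)) * alpha, with alpha in [0.5, 2]."""
    if not 0.5 <= alpha <= 2.0:
        raise ValueError("alpha is expected to lie between 0.5 and 2")
    if prev_complexity <= 0:
        # Assumption (not in the text): with no usable previous complexity,
        # scale the previous bit count by alpha only.
        return prev_bits * alpha
    return prev_bits * (cur_complexity / prev_complexity) * alpha
```

With the default factor of 1.1, a macroblock whose complexity is twice that of its predecessor is predicted to need 2.2 times as many bits.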

In equation (5), $\alpha$ is the adjustment factor. In experiments, 1.1 proved to be a suitable value; it can be adjusted as needed in a specific implementation. The slice cutoff condition is as follows: if $b_n > (s \cdot 8 - b_m)$, the current slice is cut off; otherwise, the n-th macroblock is encoded. Here $b_m$ is the sum of the numbers of output bits of the first n−1 macroblocks.

With the video signal encoding method of the present invention, the encoder can predict the NAL length under the current bit rate, frame rate, and network parameters; gather statistics on the complexity and output bit counts of the macroblocks already coded; predict in real time the complexity and expected number of coded bits of the macroblock currently being coded; and determine the slice cutoff condition. It thereby controls the slice size and achieves an adaptive balance between rate-distortion performance and the bit error rate of the data packets.
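The cutoff rule can be sketched as a simple loop. The Python example below is a simplified illustration, not the encoder's actual control flow: `split_into_slices` is a hypothetical helper, it uses the predicted bit counts as a stand-in for the real output bits when accumulating the running total, and it assumes that every slice holds at least one macroblock so that an oversized macroblock cannot produce an empty slice.

```python
from typing import List, Sequence


def split_into_slices(predicted_bits: Sequence[float], nal_bytes: float) -> List[List[int]]:
    """Group macroblock indices into slices using the rule b_n > s*8 - b_m.

    predicted_bits -- predicted bit count b_n for each macroblock, in coding order
    nal_bytes      -- predicted NAL size s, in bytes
    """
    budget_bits = nal_bytes * 8   # s * 8
    slices: List[List[int]] = []
    current: List[int] = []
    used_bits = 0.0               # b_m: bits already spent in the current slice

    for index, bits in enumerate(predicted_bits):
        # Cut the slice when the next macroblock would not fit in what is left.
        # Assumption: never cut an empty slice, so each slice has >= 1 macroblock.
        if current and bits > budget_bits - used_bits:
            slices.append(current)
            current = []
            used_bits = 0.0
        current.append(index)
        used_bits += bits

    if current:
        slices.append(current)
    return slices
```

With a budget of 100 bytes (800 bits) and four macroblocks predicted at 300 bits each, the third macroblock exceeds the 200 bits left in the first slice, so the sketch yields two slices of two macroblocks each.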

Claims

1. A video signal encoding method, comprising the steps of:

within the current slice, analyzing the complexity of the previous macroblock relative to the complexity of the macroblocks already coded in the entire slice, to predict the number of output bits $b_n$ of the macroblock n currently being coded; if $b_n$ exceeds $(s \cdot 8 - b_m)$, cutting off the current slice; if $b_n$ does not exceed $(s \cdot 8 - b_m)$, continuing to encode the n-th macroblock; wherein s is the predicted number of NAL bytes, and $b_m$ represents the sum of the numbers of output bits of the first n−1 macroblocks.

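2. The video signal encoding method according to claim 1, wherein

$$b_n = b_{n-1} \cdot \frac{R_n}{R_{n-1}} \cdot \alpha$$

where $b_n$ is the predicted number of coded bits of the current macroblock n;

$$R_n = \sum_{i=0}^{3} \sum_{j=0}^{3} \sum_{k=0}^{3} \sum_{l=0}^{3} C_{n,i,j,k,l}^{2}$$

denotes the sum of the squares of the residual coefficients of all 4×4 small blocks in the n-th macroblock; $C_{n,i,j,k,l}$ ($0 \le i, j, k, l \le 3$) is the residual coefficient at coordinates $(i, j, k, l)$ inside the macroblock; $\alpha$ is an adjustment factor whose value ranges from 0.5 to 2;

$$s = \frac{t}{8 \cdot f} \cdot \frac{\overline{Qp}_n}{21}$$

wherein t is the target average bit rate and f is the frame rate;

$$\overline{Qp}_n = \begin{cases} Qp_n^i - 4, & \text{I frame} \\ Qp_n^i - 2, & \text{P frame} \\ Qp_n^i, & \text{B frame} \end{cases}$$

$Qp_n^i$ is the quantization parameter of the i-th macroblock of the n-th frame image, and $\overline{Qp}_n$ is the normalized quantization parameter of the i-th macroblock of the n-th frame image.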

3. The video signal encoding method according to claim 2, wherein $\alpha = 1.1$.

4. The video signal encoding method according to claim 1, wherein s = min(s, 1024).