WO1999044369A1

WO1999044369A1 - Device and method for coding image

Info

Publication number: WO1999044369A1
Application number: PCT/JP1999/000891
Authority: WO
Inventors: Eiji Ogura; Masatoshi Takashima; Daisuke Hiranaka; Takeshi Miura
Original assignee: Sony Corporation
Priority date: 1998-02-25
Filing date: 1999-02-25
Publication date: 1999-09-02
Also published as: JPH11243546A; KR20010012110A

Abstract

A stored search field is input from a memory to a simple motion detection circuit. The simple motion detection circuit outputs a motion amount, which is input to a control decision circuit. A motion detection circuit receives the current frame and a search frame and obtains the final motion vector actually used for motion compensation. The control decision circuit sends search-range setting parameters to the motion detection circuit according to the motion amount predicted from the simple motion vector, and the search range is set accordingly. This solves the problem that the conventional full-search block matching method requires a very large circuit scale when handling images with very fast motion, such as sports programs.

Description

TECHNICAL FIELD

The present invention relates to an image encoding device and method compliant with, for example, MPEG (Moving Picture Experts Group).

BACKGROUND ART

The MPEG system is a coding system for compressing image data by combining orthogonal transform (Discrete Cosine Transform; DCT), motion compensation prediction, and variable length coding.

Fig. 1 shows the configuration of an image encoding device that complies with the MPEG system. This image encoding device is supplied with image data via an input terminal T1. This image data is input to the motion vector detection circuit 21 and the subtraction circuit 22. The motion vector detection circuit 21 obtains a motion vector between the current frame and a reference frame (for example, the previous frame) using the input image data and supplies the motion vector to the motion compensation circuit 23.

The image data of the reference frame is also stored in the frame memory 24. This image data is supplied to the motion compensation circuit 23. The motion compensation circuit 23 uses the motion vector sent from the motion vector detection circuit 21 to perform motion compensation on the image data sent from the frame memory 24. The output of the motion compensation circuit 23 is sent to the subtraction circuit 22 and the addition circuit 25.

The subtraction circuit 22 takes the difference between the image data of the current frame supplied from the input terminal T1 and the image data of the motion-compensated reference frame supplied from the motion compensation circuit 23 to obtain prediction error data, which is supplied to the DCT circuit 26.

The DCT circuit 26 performs DCT processing on the prediction error data and sends it to the quantizer 27. The quantizer 27 quantizes the output of the DCT circuit 26 and sends it to a variable length coding circuit (not shown).

The output of the quantizer 27 is also supplied to the inverse quantizer 29. Then, it undergoes inverse quantization processing, and its output undergoes inverse DCT processing in the inverse DCT circuit 30, is returned to the original prediction error data, and is supplied to the addition circuit 25.

The adder circuit 25 adds the prediction error data to the output data of the motion compensation circuit 23 to obtain the image data of the current frame. The obtained image data is stored in the frame memory 24 as the image data of the next reference frame.

For the motion compensation circuit 23 to perform motion compensation of a moving image, the motion vector detection circuit 21 must first detect the motion vector. Normally, the reference frame is divided into equal blocks (reference blocks), a search block of the same size as the reference block is moved within the search range in a past or future frame (the search frame), and the best matching block is searched for. The displacement to that block is taken as the motion vector. In general, when searching for the best matching block, the differences between all corresponding pixels in the reference block and the search block are taken, and the sum of their absolute values or the sum of their squares is calculated. This is known as the full-search block matching method.

Block sizes used in the full-search block matching method include 8 pixels horizontally by 8 pixels vertically (hereinafter abbreviated as 8 × 8), 16 × 16, and the like. Next, the block matching method will be described with reference to FIG. 2.

In FIG. 2, a reference block RB of M × N pixels is set in a reference frame 41. In addition, an inspection block SB having the same size as the reference block RB is set in the search frame 42. The inspection block SB is moved within a predetermined search range 43 of ±m × ±n centered on the same position as the reference block RB. Then, the degree of coincidence between the reference block RB and the inspection block SB is calculated, the inspection block having the highest degree of coincidence is taken as the matching block, and a motion vector is obtained from this matching block.

In other words, if the inspection block SBk at the position shifted by (u, v) from the inspection block SB0, which is at the same position as the reference block RB, has the highest degree of coincidence, the motion vector of the reference block RB is (u, v). Here, the inspection block that minimizes the sum of the absolute differences, or the sum of the squared differences, between the pixels at the same positions in the reference block RB and the inspection block SB is defined as the inspection block with the highest degree of coincidence.
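The full-search procedure described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the list-of-rows frame layout, and the choice to skip candidate positions that fall outside the frame are all assumptions made for illustration. The sum of absolute differences is used as the matching criterion.

```python
def sad(ref_block, cand_block):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_r, row_c in zip(ref_block, cand_block)
               for a, b in zip(row_r, row_c))


def get_block(frame, top, left, height, width):
    """Cut a height x width block out of a frame (a list of pixel rows)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def full_search(ref_frame, search_frame, top, left, height, width, m, n):
    """Return (u, v, cost) for the best match of the reference block RB.

    RB is the height x width block at (top, left) in ref_frame. The
    inspection block SB is moved over u in [-m, m] and v in [-n, n] around
    the same position in search_frame. Candidates that would leave the
    frame are skipped (an assumption of this sketch).
    """
    rb = get_block(ref_frame, top, left, height, width)
    rows, cols = len(search_frame), len(search_frame[0])
    best = None
    for v in range(-n, n + 1):        # vertical displacement
        for u in range(-m, m + 1):    # horizontal displacement
            t, l = top + v, left + u
            if t < 0 or l < 0 or t + height > rows or l + width > cols:
                continue
            cost = sad(rb, get_block(search_frame, t, l, height, width))
            if best is None or cost < best[2]:
                best = (u, v, cost)
    return best
```

Every candidate position costs one full block comparison, which is why the work grows quickly as the search range ±m × ±n is widened.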

In the MPEG method, one sequence of a moving image is divided into GOPs (Groups of Pictures), each composed of a plurality of frames (pictures), and encoded. A GOP consists of intra-coded images (I pictures), inter-frame coded images predicted from an already-coded, temporally previous frame (P pictures), and inter-frame coded images predicted from two already-coded frames, one temporally before and one after (B pictures).

For example, in FIG. 3, motion detection is first performed using P3, a P picture, as the reference frame and I0, an I picture, as the search frame. Next, motion detection in both directions is performed using B1, a B picture, as the reference frame and I0 and P3 as the search frames. Then, motion detection in both directions is performed using B2, a B picture, as the reference frame and I0 and P3 as the search frames.

As shown in FIGS. 4A to 4C, it is generally desirable that the search range required for motion detection increase in proportion to the frame interval between the reference frame and the search frame. Here, the case where the block size is 16 × 16 will be described. For example, assuming that the search range is ±16 in the horizontal and vertical directions when the frames are one frame apart, as shown in FIG. 4A, it is desirable to set the search range to ±32 when they are two frames apart, as shown in FIG. 4B, and to ±48 when they are three frames apart, as shown in FIG. 4C.

However, when the search range is expanded in proportion to the frame interval in this way, the amount of hardware required for motion detection also increases, to four times and nine times, respectively, the amount required when the frames are one frame apart. In other words, a very large amount of hardware is required to perform motion detection three frames apart, such as motion detection using P3 as the reference frame and I0 as the search frame. Therefore, in order to reduce the amount of hardware, there are other methods that estimate the motion between frames from the history of past motion vectors and control the search range accordingly, effectively expanding it.
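The fourfold and ninefold figures can be checked with a short calculation. The sketch below assumes, purely for illustration, that the hardware cost is proportional to the number of candidate block positions, counted as (2r + 1)² for a ±r search range; the function name is hypothetical.

```python
def candidate_positions(r):
    """Number of block positions in a +/-r full search on both axes."""
    return (2 * r + 1) ** 2


base = candidate_positions(16)
for r in (16, 32, 48):
    count = candidate_positions(r)
    print(f"+/-{r}: {count} positions, {count / base:.2f}x the +/-16 case")
```

Under this counting the ratios come out at about 3.88 and 8.64, which is consistent with the approximate factors of four and nine stated above.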

When handling images with very fast motion, such as sports programs, the conventional full-search block matching method has the problem that the circuit scale becomes extremely large.

Also, the other conventional methods, which control the search range based on the history of past motion vectors in order to reduce the circuit scale, cannot set the search range correctly when a large motion suddenly occurs, for example when the camera suddenly starts a large pan. In addition, when the speed of the motion changes irregularly, or when there is acceleration or deceleration, the prediction accuracy similarly decreases.

DISCLOSURE OF THE INVENTION

The present invention has been made in view of the above circumstances. It is an object of the present invention to provide an image encoding device and method capable of appropriately setting the search range, performing highly accurate motion prediction, and improving image quality even when there is a large motion, when the speed of the motion changes irregularly over time, or when there is acceleration or deceleration.

In order to solve the above-mentioned problems, in an image encoding apparatus according to the present invention, control determining means controls, based on a detection result from simple first motion detecting means that operates earlier in time, the search range of second motion detecting means that calculates a motion vector by operating on the image data of a reference block and the image data of an inspection block within the search range. Here, the amount of calculation of the first motion detecting means is smaller than that of the second motion detecting means. Further, the first motion detecting means performs simplified motion detection in which the amount of calculation is reduced by a projection that converts two dimensions into one dimension. Further, the control determining means controls the search range of the second motion detecting means based on the motion amount from the first motion detecting means and the motion vector already obtained by the second motion detecting means.

Here, the second motion detection means may have at least two motion detection blocks capable of setting an independent search range.

In order to solve the above-mentioned problems, in an image encoding method according to the present invention, a control determination step controls, based on a detection result from a simple first motion detection step that operates earlier in time, the search range of a second motion detection step that calculates a motion vector by operating on the image data of a reference block and the image data of an inspection block within the search range.

Here, the amount of calculation of the first motion detection step is smaller than that of the second motion detection step. Also, in the first motion detection step, simplified motion detection is performed in which the amount of calculation is reduced by a projection that converts two dimensions into one dimension. Further, the control determination step controls the search range of the second motion detection step based on the motion amount from the first motion detection step and the motion vector already obtained in the second motion detection step.

Here, the second motion detection step may include at least two motion detection blocks capable of setting an independent search range.

As described above, according to the present invention, when there is a large motion in an image, the motion can be detected in advance, so that the search range of the high-precision motion detection circuit finally used can be set appropriately, and the image quality can be improved. In addition, even if the speed of the motion changes irregularly over time, or if there is acceleration or deceleration, the motion can be predicted accurately.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a block diagram showing a configuration of a general image encoding device conforming to the MPEG system.

FIG. 2 is a diagram for explaining a block matching method.

FIG. 3 is a diagram illustrating an example of motion detection in MPEG.

FIGS. 4A to 4C are diagrams showing the relationship between the frame interval and a desirable search range.

FIG. 5 is a block diagram of the image encoding device according to the embodiment of the present invention.

FIGS. 6A to 6D are diagrams for explaining the operation of the above embodiment.

FIG. 7 is a diagram showing the relationship between the current field and the search field that are input to the simple motion detection circuit included in the image coding device shown in FIG. 5.

FIGS. 8A to 8C are diagrams for explaining an operation in which the motion detection circuit sets the search range using the setting parameters sent from the control determination circuit constituting the above embodiment.

FIG. 9A and FIG. 9B are a diagram and a circuit configuration block diagram for explaining a horizontal vector detection operation of the simple motion detection circuit configuring the above embodiment.

FIG. 10 is a diagram for explaining the vertical vector detection operation of the simple motion detection circuit.

FIG. 11 is a block diagram of another embodiment of the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

Hereinafter, embodiments of an image encoding apparatus and method according to the present invention will be described. This embodiment is an image coding apparatus conforming to the MPEG system. As shown in FIG. 5, it comprises a simple motion detection circuit 2 that operates earlier in time, a motion detection circuit 3 that calculates a motion vector by operating on the image data of a reference block and the image data of an inspection block within a search range, and a control decision circuit 4 that controls the search range of the motion detection circuit 3 based on the motion amount detected by the simple motion detection circuit 2. Various modifications and application examples can be considered without departing from the spirit of the present invention. Therefore, the gist of the present invention is not limited to the embodiments.

The simple motion detection circuit 2 requires less computation than the motion detection circuit 3. Specifically, the amount of calculation is reduced by using a projection process for converting two dimensions to one dimension as described later.

In FIG. 5, an input image (current field) 101 is input from an input terminal to a memory 1 and a simple motion detection circuit 2. Here, a case where the simple motion detection circuit 2 operates in units of fields will be described. Simple motion detection may also be performed in units of frames. However, in the case of interlaced images, detection accuracy is higher when it is performed in units of fields.

From the memory 1, the stored search field 102 is input to the simple motion detection circuit 2. The simple motion detection circuit 2 outputs a motion amount 105, which is a simple motion vector, and this is input to the control decision circuit 4. The control decision circuit 4 estimates, by calculation, the amount of motion between the frames (between the current frame 103 and the search frame 104) actually used by the motion detection circuit 3. The calculation actually performed is, for example, accumulating the one-field motion amounts 105 over the required number of frames.

The motion detection circuit 3 receives the current frame 103 and the search frame 104 and obtains the final motion vector 107 actually used for motion compensation. At this time, the control decision circuit 4 sends a search range setting parameter 106 to the motion detection circuit 3 in accordance with the motion amount predicted from the simple motion vector 105, and the search range is set accordingly.

Next, the operation of the image encoding apparatus will be described with reference to FIGS. 6A to 6D in comparison with a conventional example. In FIGS. 6A to 6D, "I" is an intra frame, "B" is a bidirectionally predicted frame, and "P" is a forward predicted frame. The number following each letter indicates the frame number, counted from 0.

First, the case where the encoding is performed in the order shown in FIG. 6A will be described. That is, motion detection is first performed using P3, a P picture, as the reference frame and I0, an I picture, as the search frame. Next, motion detection in both directions is performed using B1, a B picture, as the reference frame and I0 and P3 as the search frames. Then, motion detection in both directions is performed using B2, a B picture, as the reference frame and I0 and P3 as the search frames, and encoding is carried out.

Each frame has a field structure consisting of a top field and a bottom field as shown in Fig. 6B.

In the simple motion detection circuit 2 shown in FIG. 5, the current field 101 input through the input terminal T1 and the search field 102 read out from the memory 1 are related as shown in FIG. 7: they are shifted from each other by one field in time. The simple motion detection circuit 2 then obtains a motion amount 105 that indicates, for example, how the entire image is moving.

This is shown in FIG. 6C. For example, the motion amount 105 can be regarded as the motion vector of the entire frame. In practice, the frame may be divided into regions and a motion amount obtained for each region.

The leftmost data in FIG. 6C is the amount of motion between I0t (top field) and I0b (bottom field), and is obtained at time 1. At time 2, the amount of motion between I0b and B1t is obtained. In the same way, the amounts of motion between all the fields from I0 to P3 are obtained from time 1 to time 7. By adding these, the amount of motion between I0 and P3, for example, is obtained by time 7. The values obtained by actually adding the six field-to-field amounts are shown by the white bars in FIG. 6D.

The motion detection circuit 3 in FIG. 5 performs the detection between I0 and P3 at times 8 and 9, so the simple motion detection circuit 2 and the control decision circuit 4 can predict the motion amount before that time, and the search range can be set in advance. An example of setting the search range will be described later.

The black bars in FIG. 6C are the values obtained as predictions from the motion vector distribution by the conventional method. For example, in FIG. 6C, the leftmost black value is the amount of motion from B1t to I0b, and is obtained from the distribution of a part of the motion vectors from B1 to I0 found by the main motion vector detection. As described above, the conventional method uses only a part of the motion vector distribution. For example, when the motion vector between fields is known, as in this example, the conventional prediction method multiplies the value by 6 to predict the motion between three frames. The predicted values are shown by the black bars in FIG. 6D. The values on the left side are those obtained by accumulating the motion amounts found by the simple detection of the present invention. With the conventional method, when the motion is constant, such as between P3 and P6, the prediction is almost the same as the actual motion amount. However, when there is a large change in the motion, for example from I0 to P3, or when the motion decelerates, as from P6 to P9 or from P9 to P12, the predicted value according to the present invention is closer to the actual amount of motion. With the conventional method, a large error may occur between I0 and P3.
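The difference between the two prediction styles can be illustrated with a small Python sketch. The numbers are hypothetical and are not read from FIG. 6C or FIG. 6D; the function names are likewise assumptions. The sketch models a pan that starts abruptly between I0 and P3, so that an earlier field-to-field value scaled by 6 misses the motion while the accumulated field-to-field amounts follow it.

```python
def accumulate(field_motions):
    """Prediction by adding up the field-to-field motion amounts."""
    return sum(field_motions)


def extrapolate(single_field_motion, field_count):
    """History-style prediction: scale one field-to-field value."""
    return single_field_motion * field_count


# Hypothetical horizontal motion (pixels per field) over the six field
# intervals between I0 and P3; the camera pan starts part way through.
field_motions = [0, 0, 1, 4, 6, 6]

predicted_by_sum = accumulate(field_motions)             # 17 pixels
predicted_by_history = extrapolate(field_motions[0], 6)  # 0 pixels
print(predicted_by_sum, predicted_by_history)
```

When the per-field motion is constant, both functions return the same value, which matches the observation that the two approaches agree for steady motion.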

Next, how to use the motion amount predicted by the simple motion detection circuit 2 and the control determination circuit 4 as described above will be described.

The case where the motion detection circuit 3 has two motion detection circuits ME1 and ME2 whose search ranges can be set independently will be described with reference to FIGS. 8A to 8C. SMV1 and SMV2 denote the center vectors of the search ranges of the motion detection circuits ME1 and ME2, respectively.

FIG. 8A shows an example of the search range when a large amount of motion is not detected. In this case, the search ranges of the two motion detection circuits ME1 and ME2 are arranged side by side. FIG. 8B shows an example in which a large amount of motion is predicted in the horizontal direction. In this case, ME1 is set at the center of the coordinates to cover small motion vectors, and ME2 is set so that the center SMV2 of its search range lies at the large predicted horizontal motion amount. Thus, the more accurate the predicted motion amount, the more efficient the motion compensation and the better the encoded image quality. One motion detection circuit always covers small motion vectors because, if there is a motion vector with small motion, missing it would greatly degrade the image quality.

In this example, the motion detection circuit 3 has two motion detection circuits whose search ranges can be set independently. When there are three or more motion detection circuits, each search range is likewise set according to the predicted motion amount. In particular, when the motion in a frame is complicated and there are a plurality of predicted motion amounts, it is effective to have three or more motion detection circuits.

FIG. 8C shows an example in which there are two predicted motion amounts and three motion detection circuits. The motion in this example corresponds to the case where the upper half of the screen moves to the right and the lower half moves in the opposite direction, to the left.
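The placement rule for two detectors can be sketched as follows. This is only an illustration under stated assumptions: the function name, the threshold that separates small from large motion, and the exact side-by-side centers at plus and minus one half-width are hypothetical and are not given in the description.

```python
def place_search_ranges(predicted, half_width, large_threshold):
    """Return the center vectors (SMV1, SMV2) for detectors ME1 and ME2.

    predicted       -- (x, y) motion amount predicted for the frame pair
    half_width      -- horizontal half-width of one detector's search range
    large_threshold -- hypothetical magnitude above which motion is "large"
    """
    px, py = predicted
    if max(abs(px), abs(py)) < large_threshold:
        # No large motion: put the two ranges side by side around the origin.
        return (-half_width, 0), (half_width, 0)
    # Large motion: ME1 stays at the origin to catch small vectors,
    # while ME2 is centered on the predicted motion amount.
    return (0, 0), (px, py)


print(place_search_ranges((2, 0), 16, 16))   # small motion
print(place_search_ranges((40, 0), 16, 16))  # large horizontal motion
```

Keeping one range at the origin in the large-motion case reflects the stated reason that missing a small motion vector would greatly degrade the image quality.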

Next, the simple motion detection circuit 2 will be described. The simple motion detection circuit 2 needs to be able to determine the motion of a large area in a frame with a simple circuit configuration; it does not need to find a motion vector for each small macroblock as the main motion detection circuit does. Here, a specific example will be described in which horizontal and vertical projections are obtained for almost the entire frame and used to detect the motion vector of the entire frame. The motion vector obtained by the simple motion detection circuit 2 is not used directly for motion compensation but only for setting the search range of the conventional motion detection circuit, so high accuracy is not necessary and a simple motion vector is sufficient. In practice, the screen may be divided and a simple motion vector (motion amount) obtained for each region; a simple motion vector that indicates which part of the screen is moving in which way is sufficient. In this specific example, a simple method is used in which the horizontal and vertical motion vectors are detected independently.

FIG. 9A shows the horizontal vector detection method, and FIG. 9B shows the circuit diagram. First, the vertical projection 13 is obtained by adding up all the pixels of each vertical line for the portion of the current field excluding three lines on each side. This vertical projection 13 converts the two-dimensional image information of the current field into one-dimensional image information. For the search field, the vertical projection 14 is likewise obtained by adding up all the pixels of each vertical line. The operation then finds the position in the vertical projection of the search field that best matches the vertical projection of the current field. The position of the bold frame shown in the vertical projection 14 of the search field corresponds to a motion vector of zero. In this example, the sum of absolute differences is calculated at each of the seven positions from −3 to +3, and the position with the minimum value is taken as the horizontal motion vector. A motion vector with one-pixel accuracy is used in this example, but in practice a much coarser accuracy, such as four-pixel accuracy, is often sufficient.

As an example of the circuit configuration, as shown in FIG. 9B, the current field is input to the adder circuit 5, and the vertical projection 13 is accumulated in the register 8. Similarly, the data of the search field is input to the adder circuit 6, and the vertical projection 14 is stored in the register 9. The difference/absolute value calculation circuit 11 and the adder circuit 7 calculate the sum of absolute differences using the vertical projections 13 and 14 read from the registers 8 and 9, and store the sum in the register 10. The minimum value circuit 12 then finds the relative position of the vertical projections 13 and 14 that minimizes the sum of absolute differences, and the horizontal motion vector 108 is obtained from it.

In the vertical vector detection, as shown in FIG. 10, the horizontal projection 15 of the current field is obtained by adding up all the pixel values of each horizontal line, the horizontal projection 16 of the search field is obtained in the same way, the position that minimizes the sum of absolute differences between them is found, and the vertical motion vector is obtained from it. Since the circuit has the same configuration as that of FIG. 9B, its description is omitted.
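The projection-based detection described for FIGS. 9A, 9B, and 10 can be sketched in Python as follows. This is an illustrative sketch rather than the circuit itself: the function names and the list-of-rows field layout are assumptions, and the sign convention (a positive result means the content of the current field is found further right or further down in the search field) is chosen here for convenience.

```python
def column_sums(field):
    """Vertical projection: add all pixels of each vertical line."""
    return [sum(col) for col in zip(*field)]


def row_sums(field):
    """Horizontal projection: add all pixels of each horizontal line."""
    return [sum(row) for row in field]


def match_1d(cur_proj, search_proj, max_shift):
    """Shift in [-max_shift, max_shift] with the smallest sum of absolute differences.

    The current projection is trimmed by max_shift samples at both ends so
    that every shifted window stays inside the search projection.
    """
    core = cur_proj[max_shift:len(cur_proj) - max_shift]
    best_shift, best_cost = 0, None
    for s in range(-max_shift, max_shift + 1):
        start = max_shift + s
        window = search_proj[start:start + len(core)]
        cost = sum(abs(a - b) for a, b in zip(core, window))
        if best_cost is None or cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift


def simple_motion(cur_field, search_field, max_shift=3):
    """Return the (horizontal, vertical) shift found independently per axis."""
    h = match_1d(column_sums(cur_field), column_sums(search_field), max_shift)
    v = match_1d(row_sums(cur_field), row_sums(search_field), max_shift)
    return h, v
```

With max_shift set to 3, the trimming reproduces the exclusion of three lines on each side of the current field and the seven candidate positions from −3 to +3. Each axis needs only seven one-dimensional comparisons instead of a two-dimensional block search.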

As a modification of this embodiment, a method can be considered in which, for example, one field is limited to its low-frequency band and reduced in size by decimation, and a motion vector is obtained for the reduced image.

Next, another embodiment will be described with reference to FIG. 11. This embodiment is also an image coding apparatus conforming to MPEG, but it differs from the image coding apparatus shown in FIG. 5 in that a path is newly provided through which a motion vector 110 is input from the motion detection circuit 3 to the control decision circuit 4. The control decision circuit 4 makes its decision using both the motion amount 105 and the motion vector 110.

The motion vector 110 corresponds to the value used in the conventional method, shown by the black bars in FIG. 6C. In general, in MPEG and the like, processing is performed for each block of a size as small as 16 × 16, so the number of motion vectors obtained is large and the spatial resolution is high. In addition, the motion vectors have a high accuracy of one pixel. However, as shown by the black bars in FIG. 6C, motion vectors are not obtained between all the fields, so the temporal resolution is low. Therefore, the control decision circuit 4 predicts the motion amount using the motion amount 105 when temporal accuracy is required and, conversely, predicts it using the motion vector 110 when temporal accuracy is not required or when spatial accuracy is required.

An example of a case where temporal accuracy is required is one in which the motion changes greatly, as in "I0 to P3" in FIG. 6D. In this case, the motion amount 105 is used.

A case where temporal accuracy is not required is one in which the motion is almost constant, as in "P3 to P6" in FIG. 6D. In this case, the motion vector 110, which has high vector accuracy, is used for the decision. A case where spatial accuracy is required is one in which there are various complicated motions in the screen and the motion amount 105 cannot be calculated well. In this case also, the motion amount is predicted using the motion vector 110.

By using both of these two values with different properties, the motion amount 105 and the motion vector 110, the motion amount can be predicted with higher accuracy than when either is used alone.
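One way to picture this selection is the following Python sketch. It is an assumption-laden illustration, not the decision logic of the control decision circuit 4: the function name, the use of the spread of field-to-field amounts as the test for changing motion, the threshold, and the reliability flag are all hypothetical.

```python
def predict_motion(field_motions, mv_prediction, change_threshold,
                   simple_reliable=True):
    """Pick the source of the predicted motion amount.

    field_motions    -- field-to-field motion amounts 105 for the frame pair
    mv_prediction    -- prediction derived from earlier motion vectors 110
    change_threshold -- hypothetical spread above which motion is "changing"
    simple_reliable  -- False when the simple detector found no clear motion
    """
    spread = max(field_motions) - min(field_motions)
    if simple_reliable and spread >= change_threshold:
        # Motion is changing: temporal accuracy matters, so accumulate
        # the field-to-field motion amounts 105.
        return sum(field_motions)
    # Nearly constant or spatially complex motion: rely on the finer
    # motion vectors 110 instead.
    return mv_prediction


print(predict_motion([0, 0, 1, 4, 6, 6], 0, 2))   # changing motion
print(predict_motion([5, 5, 5, 5, 5, 5], 31, 2))  # steady motion
```

The two branches correspond to the "I0 to P3" and "P3 to P6" cases discussed above, and the reliability flag stands in for the case where complicated motion prevents the motion amount 105 from being calculated well.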

According to the present invention, when there is a large motion, the motion can be detected in advance, so that the search range of the high-precision motion detection circuit finally used can be set appropriately. As a result, the image quality can be greatly improved. Also, compared with the conventional approach of using the history of past motion vectors, the motion prediction accuracy is improved, and as a result the encoded image quality is improved. Especially for images with high-speed camera panning or tilting, the improved motion prediction accuracy greatly improves the image quality.

Claims


1. An image encoding device comprising:

first motion detecting means that operates earlier in time;

second motion detecting means for calculating a motion vector by operating on the image data of a reference block and the image data of an inspection block within a search range; and

control determining means for controlling the search range of the second motion detecting means based on the detection result of the first motion detecting means.

2. The image encoding device according to claim 1, wherein the first motion detecting means is a simple type having a smaller amount of calculation than the second motion detecting means.

3. The image encoding apparatus according to claim 1, wherein the first motion detection means performs motion detection by projecting a two-dimensional image into a one-dimensional image.

4. The image encoding apparatus according to claim 1, wherein the second motion detection means has at least two motion detection blocks capable of setting an independent search range.

5. The image encoding device according to claim 1, wherein the control determining means controls the search range of the second motion detecting means based on the motion amount from the first motion detecting means and the motion vector already obtained by the second motion detecting means.

6. An image encoding method comprising:

a first motion detection step that operates earlier in time;

a second motion detection step of calculating a motion vector by operating on the image data of a reference block and the image data of an inspection block within a search range; and

a control determination step of controlling the search range of the second motion detection step based on the detection result of the first motion detection step.

7. The image encoding method according to claim 6, wherein the first motion detection step has a smaller amount of calculation than the second motion detection step.

8. The image encoding method according to claim 6, wherein in the first motion detection step, motion is detected by a projection that converts two dimensions into one dimension.

9. The image encoding method according to claim 6, wherein the second motion detection step uses at least two motion detection blocks capable of setting an independent search range.

10. The image encoding method according to claim 6, wherein the control determination step controls the search range of the second motion detection step based on the motion amount from the first motion detection step and the motion vector already obtained in the second motion detection step.