
WO2006104357A1 - Method for compressing/decompressing motion vectors of an unsynchronized picture and apparatus using the same - Google Patents

Method for compressing/decompressing motion vectors of an unsynchronized picture and apparatus using the same

Info

Publication number
WO2006104357A1
WO2006104357A1 (PCT/KR2006/001171)
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
picture
unsynchronized
base
predicted
Prior art date
Application number
PCT/KR2006/001171
Other languages
English (en)
Inventor
Kyo-Hyuk Lee
Sang-Chang Cha
Woo-Jin Han
Original Assignee
Samsung Electronics Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020050044594A external-priority patent/KR100763179B1/ko
Application filed by Samsung Electronics Co., Ltd.
Publication of WO2006104357A1

Links

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/31: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/513: Processing of motion vectors
    • H04N19/517: Processing of motion vectors by encoding
    • H04N19/52: Processing of motion vectors by encoding by predictive encoding
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • Methods and apparatuses consistent with the present invention relate to video compression and, more particularly, to improving the compression efficiency of motion vectors of an unsynchronized picture by efficiently predicting those motion vectors from motion vectors of a lower layer.
  • data compression is applied to remove data redundancy.
  • Data can be compressed by removing spatial redundancy (repetition of the same color or object within a picture), temporal redundancy (little or no change between adjacent frames of moving pictures, or continuous repetition of sounds in audio), and visual/perceptual redundancy, which exploits human visual and perceptive insensitivity to high frequencies.
  • the temporal redundancy is removed by a temporal prediction based on motion compensation
  • the spatial redundancy is removed by a spatial transform.
  • Multimedia data is transmitted over transmission media or communication networks that differ in performance, as existing transmission media have varying transmission speeds.
  • an ultra high-speed communication network can transmit several tens of megabits of data per second, while a mobile communication network has a transmission speed of 384 kilobits per second.
  • To support such diverse transmission environments, a scalable video encoding method is implemented.
  • Such a scalable video encoding method makes it possible to truncate a portion of the compressed bit stream and to adjust the resolution, frame rate, and signal-to-noise ratio (SNR) of the video reconstructed from the truncated bit stream.
  • Standardization work on MPEG-4 Part 10 has already progressed. In particular, much research into implementing scalability in a multilayer-based video encoding method has been carried out.
  • For example, a multilayer structure is composed of a base layer, a first enhancement layer, and a second enhancement layer, and the respective layers have different resolutions (QCIF, CIF, and 2CIF) and different frame rates.
  • the motion vectors may be separately searched and used for each layer, or may be searched in one layer and used (as they are or after being up/down-sampled) in other layers.
  • The former case has the advantage of searching for and obtaining exact motion vectors, but the disadvantage that the motion vectors generated for each layer act as overhead. Accordingly, in the former case, it is important to remove the redundancy between the motion vectors of the respective layers more efficiently.
  • The Joint Video Team (JVT) of the Moving Picture Experts Group (MPEG) of the International Organization for Standardization / International Electrotechnical Commission (ISO/IEC) and the Video Coding Experts Group (VCEG) of the International Telecommunication Union (ITU) is working on the 'Joint Scalable Video Model' (JSVM).
  • The JSVM 1.0 standard uses a multilayer-based scalable video coding method.
  • An H.264 method is adopted as the encoding method for each layer constituting the multilayer, combined with motion compensated temporal filtering (MCTF).
  • FIG. 1 illustrates an example of a scalable video coding structure having two layers.
  • In FIG. 1, a white tetragon indicates a low frequency picture, and a black tetragon indicates a high frequency picture.
  • The upper layer has a frame rate of 30 Hz and includes four temporal levels according to the hierarchical MCTF separating process.
  • The lower layer has a frame rate of 15 Hz and includes three temporal levels.
  • The JSVM 1.0 standard discloses a technique for predicting the motion vectors of a picture of an upper layer using a picture of a lower layer whose temporal position, i.e., picture order count (POC), coincides with the POC of that upper layer picture.
  • motion vectors of high frequency pictures 15 and 16 of the upper layer of FIG. 1 can be efficiently predicted from motion vectors of high frequency pictures 17 and 18 of the lower layer each having the same temporal position. Since they have the same temporal positions, their motion vectors can also be expected to be similar to each other.
  • The present invention provides a method and apparatus capable of more efficiently encoding the motion vectors of an unsynchronized picture, that is, a picture having no corresponding lower layer picture.
  • the present invention also provides a method and apparatus capable of efficiently predicting motion vectors of an unsynchronized picture from motion vectors of a lower layer in a scalable video codec based on a multilayer having MCTF structures.
  • the present invention also provides a syntax that is modified to adapt a motion vector prediction technique of an unsynchronized picture to a JSVM.
  • A method for compressing motion vectors of an unsynchronized picture belonging to a current layer in a video encoder based on a multilayer having at least the current layer and a lower layer of the current layer, the method including: selecting a base picture for the unsynchronized picture; generating a predicted motion vector of the current layer from a motion vector of the base picture; subtracting the predicted motion vector from a motion vector of the unsynchronized picture; and encoding the result of the subtraction.
  • A method for decompressing motion vectors of an unsynchronized picture belonging to a current layer in a video decoder based on a multilayer having at least the current layer and a lower layer of the current layer, the method including: selecting a base picture for the unsynchronized picture; generating a predicted motion vector of the current layer from a motion vector of the base picture; and adding a motion vector difference for the unsynchronized picture to the predicted motion vector.
  • FIG. 1 illustrates an example of a scalable video coding structure having two layers;
  • FIG. 2 illustrates a method for selecting a lower layer picture used to predict motion vectors of an unsynchronized picture according to an exemplary embodiment of the present invention;
  • FIG. 3 illustrates a case where both a base picture and an unsynchronized picture have bidirectional motion vectors while a picture order count (POC) difference is positive;
  • FIG. 4 illustrates a case where both a base picture and an unsynchronized picture have bidirectional motion vectors while a POC difference is negative;
  • FIG. 5 illustrates a case where a base picture has only a reverse motion vector in an environment as in FIG. 3;
  • FIG. 6 illustrates a case where a base picture has only a reverse motion vector in an environment as in FIG. 4;
  • FIG. 7 is a diagram illustrating a corresponding relation between interlayer motion vectors;
  • FIG. 8 illustrates measures to be implemented when a corresponding region of a base picture is not consistent with a block to which a motion vector is allocated;
  • FIG. 9 illustrates an example of a motion search area and an initial position when motion is estimated;
  • FIG. 10 illustrates measures to be implemented when a macroblock pattern is not consistent between layers;
  • FIG. 11 is a block diagram illustrating the construction of a video encoder according to an exemplary embodiment of the present invention;
  • FIG. 12 is a block diagram illustrating the construction of a video decoder according to an exemplary embodiment of the present invention;
  • FIG. 13 is a diagram illustrating the construction of a system for performing an operation of a video encoder or a video decoder according to an exemplary embodiment of the present invention.
  • The present invention provides a method for selecting a lower layer picture (a 'base picture') used to predict the motion vectors of an unsynchronized picture having no corresponding lower layer picture. Further, the present invention provides a method for predicting the motion vector of the unsynchronized picture using a motion vector of the selected lower layer picture.
  • FIG. 2 illustrates a method for selecting a lower layer picture used to predict motion vectors of an unsynchronized picture according to an exemplary embodiment of the present invention. Because the unsynchronized pictures 11, 12, 13, and 14 have no corresponding lower layer picture, it must be determined which of the lower layer pictures, satisfying which conditions, should be selected as the 'base picture'.
  • Selection of the base picture can be based on whether the following three conditions are satisfied: (1) the candidate is a high frequency picture existing at the uppermost temporal level of the lower layer; (2) it has the smallest POC difference from the current unsynchronized picture; and (3) it exists within the same GOP as the current unsynchronized picture.
  • Condition 1 targets pictures existing at the uppermost temporal level because the reference lengths of the motion vectors of such pictures are the shortest; as the reference length grows, the error in predicting the motion vector of the unsynchronized picture increases. Further, limiting condition 1 to high frequency pictures ensures that the motion vector is predicted only when the base picture actually has a motion vector.
  • Condition 2, requiring the smallest POC difference from the current unsynchronized picture, provides the smallest temporal distance between the current unsynchronized picture and the base picture, since pictures at a close temporal distance are more likely to have similar motion vectors. If two or more lower layer pictures have the same POC difference under condition 2, the lower layer picture having the smaller POC can be selected as the base picture.
  • Condition 3, requiring the base picture to exist within the same GOP as the current unsynchronized picture, prevents the delay that would arise in the encoding process if a base picture beyond the GOP were referred to. Accordingly, condition 3 can be omitted in an environment in which such a delay does not present a problem.
  • A process of selecting the base picture for the unsynchronized pictures 11, 12, 13, and 14 of FIG. 2 according to the above three conditions is as follows.
  • For the unsynchronized picture 11, the high frequency picture 17, which has the smaller POC difference, that is, the closer temporal position, is selected from the high frequency pictures 17 and 18 existing at the uppermost temporal level (temporal level 2) of the lower layer.
  • the high frequency picture 17 is selected as the base picture of the unsynchronized picture 12
  • the high frequency picture 18 is selected as the base picture of the unsynchronized pictures 13 and 14.
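As a sketch of the three selection conditions above, the rule can be written as a small routine. All names here (Picture, select_base_picture) are illustrative, not from the patent, and the POC, temporal-level, and GOP bookkeeping is assumed to be available to the encoder:

```python
# Hedged sketch of the base-picture selection rule described above.
from dataclasses import dataclass

@dataclass
class Picture:
    poc: int             # picture order count (temporal position)
    temporal_level: int  # MCTF temporal level within the lower layer
    gop: int             # GOP index
    is_high_freq: bool   # True for a high-frequency (H) picture

def select_base_picture(unsync_poc, unsync_gop, lower_layer_pictures):
    # Condition 1: high-frequency pictures only.
    # Condition 3: within the same GOP as the unsynchronized picture.
    candidates = [p for p in lower_layer_pictures
                  if p.is_high_freq and p.gop == unsync_gop]
    if not candidates:
        return None
    # Condition 1 (continued): keep only the uppermost temporal level.
    top = max(p.temporal_level for p in candidates)
    candidates = [p for p in candidates if p.temporal_level == top]
    # Condition 2: smallest POC difference; ties go to the smaller POC.
    return min(candidates, key=lambda p: (abs(p.poc - unsync_poc), p.poc))
```

With a lower layer as in FIG. 2 (high frequency pictures 17 and 18 at temporal level 2, at assumed POCs 2 and 6), the routine picks the temporally closer of the two for each unsynchronized POC, matching the selections described above.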
  • FIGs. 3 to 6 illustrate processes of predicting the motion vectors of an unsynchronized picture 31 using the motion vector of a base picture 32, that is, processes of generating predicted motion vectors for the unsynchronized picture 31.
  • Since the base picture 32 may have two motion vectors, a forward motion vector M_0f and a reverse motion vector M_0b, it must also be determined whether the forward or the reverse motion vector is selected; the motion vector whose reference picture is positioned at the temporally closer side is used when the motion vector is predicted.
  • FIG. 3 illustrates a case where both the base picture and the unsynchronized picture have bidirectional motion vectors while the POC difference is positive.
  • In this case, the motion vectors M_f and M_b of the unsynchronized picture 31 are predicted from the forward motion vector M_0f of the base picture 32, producing a predicted motion vector P(M_f) for the forward motion vector M_f and a predicted motion vector P(M_b) for the reverse motion vector M_b.
  • In general, an object moves in a predetermined direction and at a predetermined velocity. In particular, when a background moves continuously or a specific object is observed for a short time, this property holds in many cases. Accordingly, it can be predicted that the result of M_f - M_b is similar to the forward motion vector M_0f.
  • From Equation (1), it can be appreciated that M_f is predicted using M_0f, and M_b is predicted using both M_f and M_0f (the reference distance of the base motion vector being twice that of the unsynchronized picture):

    P(M_f) = M_0f / 2, P(M_b) = M_f - M_0f (1)
  • a video codec may adaptively select the most appropriate one from forward, reverse, and bidirectional references depending on the efficiency of compression.
  • Substituting Equation (1), the difference between M_b and its prediction value P(M_b) can be expressed as 'M_b - M_f + M_0f' (Equation (2)).
  • FIG. 4 illustrates a case where both the base picture and the unsynchronized picture have bidirectional motion vectors while the POC difference is negative.
  • In this case, the motion vectors M_f and M_b of the unsynchronized picture 31 are predicted from the reverse motion vector M_0b of the base picture 32; as a result, the predicted motion vector P(M_f) for the forward motion vector M_f and the predicted motion vector P(M_b) for the reverse motion vector M_b are respectively obtained.
  • P(M_f) and P(M_b) can be defined as follows in Equation (3).
  • In Equation (3), M_b is predicted using M_0b, and M_f is predicted using both M_b and M_0b:

    P(M_b) = M_0b / 2, P(M_f) = M_b - M_0b (3)
  • The base picture 32 may instead have a unidirectional motion vector. In this case, since only one motion vector is available for reference, it is not required to calculate the POC difference to select the motion vector to be referred to.
  • FIG. 5 illustrates a case where the base picture has only the reverse motion vector in an environment as in FIG. 3.
  • In this case, the predicted motion vectors P(M_f) and P(M_b) for the motion vectors M_f and M_b of the unsynchronized picture 31 can be determined by the same relational expression as Equation (3).
  • Likewise, in the case of FIG. 6, the predicted motion vectors P(M_f) and P(M_b) for the motion vectors M_f and M_b of the unsynchronized picture 31 can be determined by the same relational expression as Equation (1).
  • In FIGs. 3 to 6, the reference distance of the motion vector of the base picture (the temporal distance between a picture and its reference picture, expressible as a POC difference) is twice the reference distance of the unsynchronized picture; however, this may vary.
  • In general, the predicted motion vector P(M_f) for the forward motion vector M_f of the unsynchronized picture can be obtained by multiplying the motion vector M_0 of the base picture by a reference distance coefficient d.
  • The magnitude of the reference distance coefficient d is the reference distance of the unsynchronized picture divided by the reference distance of the base picture. If the reference directions are the same, d is positive; otherwise, it is negative.
  • Likewise, the predicted motion vector P(M_b) for the reverse motion vector M_b of the unsynchronized picture can be obtained by subtracting the motion vector of the base picture from the forward motion vector M_f of the unsynchronized picture when the motion vector of the base picture is a forward motion vector.
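The two prediction rules just described (scaling by the reference distance coefficient d, and subtracting the base motion vector from the forward motion vector) can be sketched as follows. The function names and the (x, y) tuple representation of vectors are illustrative, not from the patent:

```python
# Hedged sketch of the motion-vector prediction rules described above,
# for the two-layer case where the base picture's reference distance is
# twice that of the unsynchronized picture (so d = 1/2).

def predict_forward(m0, d):
    """P(M_f) = d * M_0, with d = unsync ref distance / base ref distance,
    taken negative when the reference directions differ."""
    return tuple(d * c for c in m0)

def predict_backward(m_f, m0):
    """P(M_b) = M_f - M_0 when the base motion vector M_0 is forward
    (the span M_f - M_b is expected to resemble M_0)."""
    return tuple(f - c for f, c in zip(m_f, m0))
```

For a base motion vector (8, -4) and d = 1/2, the forward prediction is (4, -2), and the reverse prediction derived from that forward vector is (-4, 2), i.e. the symmetric opposite half-span.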
  • FIGs. 3 to 6 illustrate different cases where the motion vectors of the unsynchronized picture are predicted through the motion vector of the base picture.
  • Beyond the temporal correspondence between the frame 31 at the low temporal level and the frame 32 at the high temporal level, it must also be determined which motion vectors, positioned at which partitions, mutually correspond to each other within one picture. This determination can be implemented by the following methods.
  • First, a method of simply making the motion vectors correspond to each other at the same position can be used.
  • A motion vector allocated to a block 52 within the base picture 32 can be used to predict the motion vectors 41 and 42 allocated to a block 51 at the same position in the unsynchronized picture 31.
  • Second, a method of predicting the motion vector after correcting the inconsistent temporal position can be considered.
  • In that case, the motion vector 46 of a corresponding region 54 of the base picture 32 can be used to predict the motion vectors 41 and 43 of the unsynchronized picture 31.
  • The region 54 may not be aligned with the unit blocks to which motion vectors are allocated; in that case, an area weighted average or a median can be taken over the overlapped blocks, thereby obtaining one representative motion vector 46.
  • The representative motion vector MV for the region 54 can be obtained by Equation (5) when using an area weighted average, and by Equation (6) when using a median operation. When bidirectional motion vectors exist, the operation is performed for each of the two motion vectors:

    MV = Σ_i (S_i × MV_i) / Σ_i S_i (5)

    MV = median(MV_i) (6)

    where MV_i denotes the motion vector of the i-th block overlapped by the region 54, and S_i denotes the overlapped area.
  • Here, 'a', denoting the forward distance rate, is the forward reference distance divided by the sum of the forward and reverse reference distances, and 'b', denoting the reverse distance rate, is the reverse reference distance divided by the same sum.
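A minimal sketch of Equations (5) and (6), assuming each overlapped block is given as an (area, motion vector) pair; the names are illustrative:

```python
# Hedged sketch of deriving one representative motion vector for a region
# from the block motion vectors it overlaps, per Equations (5) and (6).
import statistics

def area_weighted_mv(blocks):
    """blocks: list of (area, (mvx, mvy)) pairs; Equation (5)."""
    total = sum(area for area, _ in blocks)
    x = sum(area * mv[0] for area, mv in blocks) / total
    y = sum(area * mv[1] for area, mv in blocks) / total
    return (x, y)

def median_mv(blocks):
    """Equation (6): component-wise median of the overlapped vectors."""
    return (statistics.median(mv[0] for _, mv in blocks),
            statistics.median(mv[1] for _, mv in blocks))
```

When bidirectional motion vectors exist, the same routine is simply run once for the forward set and once for the reverse set.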
  • The predicted motion vectors P(M_f) and P(M_b) can be obtained considering the reference distance.
  • In this case, Equations (1) to (4) should be modified.
  • In FIGs. 3 to 6, the reference distance of the base picture 32 is twice the reference distance of the unsynchronized picture 31, and the resolutions of the base picture 32 and the unsynchronized picture 31 are the same; therefore, P(M_f) can be expressed as M_0f/2 or -M_0b/2.
  • When the resolutions of the two layers differ, Equations (1) to (4) should be modified as in Equations (7) to (10), respectively: in Equations (1) to (4), the motion vectors M_0f and M_0b of the base picture are substituted with r × M_0f and r × M_0b. That is because, when the resolution multiple is 'r', the magnitude of the corresponding motion vector should also be larger accordingly.
  • FIG. 9 illustrates an example of a motion search area 23 and an initial position 24 when a motion is estimated.
  • As methods for searching for the motion vector, there are a full area search method, which searches a whole picture for the motion vector, and a local area search method, which searches a predetermined search area.
  • The motion vector is used to reduce the texture difference by adopting a more similar texture block.
  • Since the motion vector itself is part of the data transmitted to the decoder side, and since a lossless encoding method is mainly used for it, a considerable number of bits is allocated to the motion vector. Accordingly, reducing the bit amount of the motion vectors influences the video compression performance no less than reducing the bit amount of the texture data. For this reason, most video codecs limit the magnitude of the motion vector, mainly using the local area search method.
  • If the motion vector search is performed within the motion search area 23 with a more accurate predicted motion vector 24 provided as the initial value, both the amount of calculation performed for the search and the difference 25 between the predicted motion vector and the actual motion vector can be reduced.
  • The predicted motion vectors P(M_f) and P(M_b) obtained using the base picture can be used as the initial values for performing motion estimation on the unsynchronized picture; through this motion estimation, the actual motion vectors M_f and M_b of the unsynchronized picture are obtained.
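A hedged sketch of a local area search initialized at the predicted motion vector: the search is confined to a small window around the prediction, so both the search effort and the residual to be coded stay small. Here 'cost' stands for any block matching error measure (e.g., SAD) passed in as a function; the names and the default radius are illustrative:

```python
# Hedged sketch: local search over a window centred on the predicted
# motion vector, so the coded difference (actual minus predicted) is small.

def local_motion_search(predicted_mv, cost, radius=2):
    px, py = predicted_mv
    best, best_cost = (px, py), cost(px, py)
    # Examine every displacement within the search window.
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            c = cost(px + dx, py + dy)
            if c < best_cost:
                best, best_cost = (px + dx, py + dy), c
    return best
```

The more accurate the initial prediction, the smaller the radius can be without missing the true minimum of the matching cost.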
  • Thereafter, a process of encoding the motion vectors M_f and M_b so as to reduce the amount of motion vector data, that is, a process of quantizing the motion vectors of the unsynchronized picture, is performed.
  • The quantization of the motion vectors M_f and M_b can be performed simply by taking the difference between each motion vector and its predicted motion vector.
  • The result of quantizing M_f can be expressed as 'M_f - P(M_f)', and the result of quantizing M_b as 'M_b - P(M_b)'. If the unsynchronized picture uses only a unidirectional reference, only one of the two motion vectors M_f and M_b needs to be quantized.
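The motion vector quantization described above reduces to transmitting a difference, and the decoder side (as in the decompression method summarized earlier) simply adds the difference back. A minimal sketch with illustrative names:

```python
# Hedged sketch of motion-vector "quantization" by prediction differencing.

def encode_mv(actual, predicted):
    # Encoder side: transmit only the difference from the prediction.
    return tuple(a - p for a, p in zip(actual, predicted))

def decode_mv(difference, predicted):
    # Decoder side: reconstruct the motion vector by adding the
    # difference back to the same prediction.
    return tuple(d + p for d, p in zip(difference, predicted))
```

Since both sides generate the same predicted motion vector from the base picture, only the (typically small) difference needs to be entropy-coded.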
  • In this case, the predicted motion vector P(M) of the unsynchronized picture can be obtained as the area weighted average of the respective predicted motion vectors P(M_i) (where 'i' denotes an index) of the corresponding partitions of the base picture, as in Equation (11):

    P(M) = Σ_i (S_i × P(M_i)) / Σ_i S_i (11)
  • Conversely, when a partition of the macroblock pattern of the base picture includes several partitions of the corresponding unsynchronized picture, in contrast to the case of FIG. 10, the predicted motion vector of that partition of the base picture is used as it is as the predicted motion vector for those partitions.
  • A parameter 'base_layer_id_plus1' has a value obtained by adding 1 to a parameter 'base_layer_id'. If the macroblock of the lower layer picture corresponding to a macroblock of the current layer picture exists, the identification number of the lower layer is allocated to base_layer_id; otherwise, '-1' is allocated. The identification number of a layer is 0, 1, or 2, in order starting from the lowermost layer.
  • The condition that the parameter 'base_layer_id_plus1' is not zero means that the macroblock of the lower layer picture corresponding to the macroblock of the current layer picture exists, i.e., that the current layer picture is a synchronized picture.
  • A parameter 'adaptive_prediction_flag' decides whether a layer is encoded with reference to the lower layer: when the parameter is zero, the layer is encoded independently without reference to the lower layer; when it is not zero, the layer is encoded with reference to the lower layer.
  • The condition expressed in the first column and second row of Table 1 means that 'base_layer_id_plus1' is not zero and 'adaptive_prediction_flag' is not zero.
  • Conventionally, 'base_layer_mode_flag' is set only for the synchronized picture, where 'base_layer_id_plus1' is not zero. Accordingly, it can be appreciated that a macroblock where 'base_layer_mode_flag' is set is encoded using information of the lower layer in the subsequent encoder or decoder process.
  • 'intra_base_mb(CurrMbAddr)' is a syntax element added so that the motion prediction method for the unsynchronized picture proposed in the present invention can be used, despite the differing POCs, when the macroblock of the current layer picture is an inter block. Accordingly, the two added conditions are connected to each other using 'or (||)', and are connected with the other conditions using 'and (&&)'.
  • When the condition of Table 2 is satisfied, a value of 'TRUE' is allocated to 'base_layer_mode_flag' for the macroblock of the current layer picture.
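A hedged sketch of the flag logic described above, simplified from the actual JSVM syntax tables (which are not reproduced here). The parameter names follow the syntax elements quoted above, while 'unsync_inter_mb' is an illustrative stand-in for the two added (||-connected) conditions covering an inter macroblock of an unsynchronized picture:

```python
# Hedged sketch of the Table 1/Table 2 condition: base_layer_mode_flag is
# set when adaptive prediction is enabled and either the macroblock is
# synchronized (base_layer_id_plus1 != 0) or the added unsynchronized-
# inter-macroblock conditions hold.

def base_layer_mode_flag(base_layer_id_plus1, adaptive_prediction_flag,
                         unsync_inter_mb):
    synchronized = base_layer_id_plus1 != 0
    return bool(adaptive_prediction_flag) and (synchronized or unsync_inter_mb)
```

The point of the modification is visible in the second argument pattern: with the conventional syntax only the synchronized case could set the flag, whereas here an unsynchronized inter macroblock qualifies as well.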
  • FIG. 11 is a block diagram illustrating the construction of a video encoder 100 according to an exemplary embodiment of the present invention.
  • FIG. 11 exemplifies a case where only one current layer and one lower layer are provided, but those skilled in the art will appreciate that a similar application can be made even where three or more layers exist.
  • An initially input picture is input to a separation unit 111 of the current layer; it is also temporally (or temporally and spatially) down-sampled in a down-sampling unit 101 and then input to a separation unit 211 of the lower layer.
  • For example, the down-sampling unit 101 can make the frame rate of the lower layer half that of the current layer, or make both the frame rate and the resolution half those of the current layer.
  • The separation unit 111 separates an input picture O_i into a picture at a high frequency picture position (H position) and a picture at a low frequency picture position (L position).
  • In general, a high frequency picture is positioned at an odd-numbered position 2i+1 and a low frequency picture at an even-numbered position 2i, where 'i', the index representing the picture number, is an integer value of 0 or greater.
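The even/odd separation just described can be sketched in one step; the names are illustrative:

```python
# Hedged sketch of the MCTF separation step: even positions 2i become
# low-frequency (L) positions, odd positions 2i+1 become high-frequency
# (H) positions.

def separate(pictures):
    low = pictures[0::2]   # even positions 2i  -> L positions
    high = pictures[1::2]  # odd positions 2i+1 -> H positions
    return low, high
```

Repeating this separation on the updated L pictures yields the hierarchy of temporal levels shown in FIG. 1, until a single final L picture remains.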
  • the pictures at the H position pass through a temporal prediction process (Here, temporal prediction means prediction of the texture, not prediction of the motion vector), and pictures at the L position pass through an updating process.
  • the picture at the H position is input to a motion estimator 115, a motion compensator 112, and a difference engine 118.
  • The motion estimator 115 performs motion estimation of a picture at the H position (hereinafter referred to as the 'current picture') with reference to a peripheral picture (a picture in the same layer but at a different temporal position), thereby obtaining the motion vector.
  • The peripheral picture referred to in this manner is called a 'reference picture'.
  • A displacement at which the error is minimized is estimated as the motion vector.
  • A fixed-size block matching method or a hierarchical method using hierarchical variable size block matching (HVSBM) may be used.
  • the efficiency of motion estimation can be enhanced using the motion vector previously obtained from the picture belonging to the lower layer (hereinafter referred to as 'lower layer picture').
  • the predicted motion vector 24 predicted in the lower layer is provided as an initial value, and the motion vector search is performed within the motion search area 23.
  • For a synchronized picture, the motion vector at the corresponding block position of the lower layer picture at the same temporal position can be used as it is. If the resolutions of the current layer and the lower layer differ, the lower layer motion vector is multiplied by the resolution multiple before being used as the predicted motion vector 24.
  • For an unsynchronized picture, the base picture for the current picture should first be selected. This selection is performed in a base picture selector 111 according to an inquiry command of the motion estimator 115, and information on the selected base picture (base picture information) is provided to the predicted motion vector generator 114 to generate the predicted motion vector.
  • A picture which exists in the same GOP as the current unsynchronized picture among the lower layer pictures, and which has the least POC difference from the current unsynchronized picture among the high frequency pictures existing at the uppermost temporal level, is selected as the base picture.
  • If two candidates have the same POC difference, the picture whose POC is smaller is selected.
  • The predicted motion vector generator 114 requests the motion vector M_0 of the base picture from a motion vector buffer 213 of the lower layer using the provided base picture information (for example, the picture number (POC) and the GOP number of the selected base picture), and generates the predicted motion vector for a predetermined block (macroblock or sub-macroblock) of the current layer picture using the motion vector M_0.
  • The method for generating the predicted motion vector has been described with reference to FIGs. 3 to 6, and thus explanation thereof will be omitted.
  • The predicted motion vector may be generated directly from the motion vector M_0 of the base picture, or with reference to a previously generated motion vector of the other reference direction, stored in the motion vector buffer 113 (as when M_b is predicted using M_f and M_0f).
  • an optimum motion vector can be determined by obtaining the motion vector having a minimum cost function and, together with this, an optimal macroblock pattern can also be determined in the case where the HVSBM is used.
  • the cost function can also employ, for example, a rate-distortion function.
  • the motion vector buffer 113 stores the motion vector obtained in the motion estimator 115, and provides the stored motion vector to the predicted motion vector generator 114 according to an inquiry command of the predicted motion vector generator 114.
  • The motion compensator 112 performs motion compensation for the current picture using the motion vector obtained in the motion estimator 115 and the reference picture. Further, the difference engine 118 obtains the difference between the current picture and the motion-compensated picture provided by the motion compensator 112, thereby generating a high frequency picture (H picture).
  • Such a high frequency picture is also called a residual picture, in the sense of being the residual result.
  • the generated high frequency pictures are provided to an updating unit 116 and a transformer 120.
  • the updating unit 116 updates pictures at an L position using the generated high frequency picture.
  • if bidirectional reference is used, pictures at the L position are updated using the two temporally adjacent high frequency pictures; if unidirectional (forward or reverse direction) reference is used in generating the high frequency picture, the updating process is likewise performed unidirectionally.
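As one conventional formulation (the exact relational expression is omitted in the text, and this is not necessarily the one the patent intends), the 5/3 lifting update adds a quarter of each adjacent high frequency picture in the bidirectional case, and half of the single available one in the unidirectional case:

```python
def update_l_picture(l_pic, h_prev=None, h_next=None):
    """Update an L-position picture with temporally adjacent H pictures.

    Bidirectional: L' = L + (H_prev + H_next) / 4  (5/3 lifting weights).
    Unidirectional: L' = L + H / 2 when only one H picture is available.
    Pictures are flat lists of pixel values for simplicity.
    """
    out = list(l_pic)
    for i in range(len(out)):
        if h_prev is not None and h_next is not None:
            out[i] += (h_prev[i] + h_next[i]) / 4
        elif h_prev is not None:
            out[i] += h_prev[i] / 2
        elif h_next is not None:
            out[i] += h_next[i] / 2
    return out
```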
  • a more detailed relational expression for an MCTF updating process is well known in the art and thus the detailed explanation thereof will be omitted.
  • the updating unit 116 stores the updated pictures at the L position in a frame buffer 117, and the frame buffer 117 provides the stored picture at the L position to a separating unit 111 for a process of separating the MCTF at a subsequent lower temporal level.
  • if the picture at the L position is a final L picture, a lower temporal level no longer exists; therefore, the final L picture is provided to the transformer 120.
  • the separating unit 111 separates the pictures provided from the frame buffer 117 into a picture at an H position and a picture at an L position at the subsequent lower temporal level. In a similar manner, a temporal prediction process and an updating process are performed at the subsequent lower temporal level. This MCTF separation can be repeated until only one L picture finally remains.
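The repeated separation described above can be sketched as a loop that, at each temporal level, forms H pictures from the odd-indexed pictures and carries the even-indexed (L) pictures down to the next level until one remains. This simplified sketch substitutes a plain average of the two neighboring L pictures for motion-compensated prediction, omits the update step, and treats each picture as a single number:

```python
def mctf_separate(pictures):
    """Recursively separate pictures into per-level H pictures plus one
    final L picture. H = picture - prediction, with the prediction here
    standing in for motion-compensated prediction."""
    h_all = []
    level = list(pictures)
    while len(level) > 1:
        evens, odds = level[0::2], level[1::2]
        h_pics = []
        for i, odd in enumerate(odds):
            left = evens[i]
            right = evens[i + 1] if i + 1 < len(evens) else evens[i]
            h_pics.append(odd - (left + right) / 2)
        h_all.append(h_pics)
        level = evens      # L pictures go to the next lower temporal level
    return h_all, level[0]  # per-level H pictures and the final L picture
```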
  • the transformer 120 performs spatial transform and generates a transform coefficient C for the provided final L picture and H picture.
  • the spatial transform process may employ methods such as a discrete cosine transform (DCT) and a wavelet transform.
  • if the wavelet transform is used, a wavelet coefficient is used as the transform coefficient.
  • a quantizer 130 quantizes the transform coefficient C.
  • quantization refers to the process of representing the transform coefficient, expressed as a predetermined real value, by a discrete value. For example, the quantizer 130 divides the real-valued transform coefficient by a predetermined quantization step, and the quantization can be performed in such a manner that the division result is rounded off to an integer value.
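The divide-and-round quantization described above, together with the inverse mapping used on the decoder side, can be sketched as follows (the quantization step would in practice come from the quantization table mentioned below):

```python
def quantize(coeff, step):
    """Represent a real-valued transform coefficient as a discrete index:
    divide by the quantization step and round to the nearest integer."""
    return round(coeff / step)

def dequantize(index, step):
    """Inverse quantization: map the index back to a representative value.
    The difference from the original coefficient is the quantization error."""
    return index * step
```

For example, with step 8 a coefficient of 37.6 quantizes to index 5, which dequantizes to 40, leaving a quantization error of 2.4.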
  • the quantization step can be provided from a predefined quantization table.
  • a motion vector difference calculator 150 calculates a difference ΔM1 between the motion vector M1 from the motion vector buffer 113 and the predicted motion vector P(M1) for the motion vector M1 provided from the predicted motion vector generator 114, and provides the calculated difference ΔM1 to an entropy encoder 140.
  • the lower layer itself has no referable lower layer, and thus may not include the base picture selection, predicted motion vector generation, and motion vector difference calculation processes according to the present invention.
  • although the motion vector of the lower layer may itself be differentially coded for more efficient compression, the motion vector of the lower layer will be described here as being encoded without loss.
  • the motion vector of the current layer M1, the quantization result of the current layer T1, the motion vector of the lower layer M0, and the quantization result of the lower layer T0 are each provided to the entropy encoder 140.
  • the entropy encoder 140 encodes the provided M1, T1, M0, and T0 without loss, and generates a bit stream.
  • as the lossless encoding method, Huffman coding, arithmetic coding, variable length coding, and various other coding methods may be used.
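As one concrete instance of such a variable-length code, chosen for illustration and not mandated by the patent, signed Exp-Golomb coding (used by H.264 for motion vector differences) maps a signed integer to a bit string:

```python
def exp_golomb_signed(v):
    """Signed Exp-Golomb code: map a signed integer to an unsigned code
    number (positive v -> 2v - 1, non-positive v -> -2v), then emit
    leading zeros followed by the binary representation of code_num + 1."""
    code_num = 2 * v - 1 if v > 0 else -2 * v
    bits = bin(code_num + 1)[2:]           # binary form of code_num + 1
    return "0" * (len(bits) - 1) + bits    # zero prefix + binary suffix
```

Small motion vector differences (the common case when prediction works well) thus receive short codewords: 0 codes as "1", while ±1 code as three bits.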
  • FIG. 12 is a block diagram illustrating the construction of a video decoder 300 according to an exemplary embodiment of the present invention.
  • An entropy decoder 310 performs lossless decoding: it decodes from the input bit stream the motion vector difference ΔM1 and the texture data T1 of the current layer, and the motion vector M0 and the texture data T0 of the lower layer. First, the operation performed on the current layer will be described.
  • the decoded texture data T1 is provided to an inverse quantizer 350, and the motion vector difference ΔM1 is provided to a motion vector buffer 330.
  • a base picture selector 311 selects the base picture for the current layer picture according to the inquiry command from a motion compensator 340. The selection of the base picture can be performed using the same algorithm as in the base picture selector of the video encoder 100. Information on the selected base picture (for example, the picture number POC and GOP number of the selected base picture) is provided to a predicted motion vector generator 314.
  • the predicted motion vector generator 314 requests the motion vector M0 of the base picture from the motion vector buffer 430 of the lower layer through the provided base picture information, and generates the predicted motion vector P(M1) for a predetermined block (macroblock or sub-macroblock) of the current layer picture using the motion vector M0.
  • the predicted motion vector can be generated directly from M0 depending on the reference direction; in other cases, the motion vector M1' of a different reference direction, previously generated and stored in the motion vector buffer 330, is also required.
  • thereby, the predicted motion vector P(M1) of M1 can be obtained from the determined M0 and M1'.
  • a motion vector decompression unit 315 adds the predicted motion vector P(M1) generated by the predicted motion vector generator 314 and the motion vector difference ΔM1 decoded in the entropy decoder 310, thereby decompressing the motion vector M1.
  • the decompressed motion vector M1 is again temporarily stored in the motion vector buffer 330, and is provided to the motion compensator 340 according to the inquiry command from the motion compensator 340.
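The decompression step mirrors the encoder's motion vector difference calculation: the encoder transmits the difference between the motion vector and its prediction, and the decoder adds the prediction back. A minimal component-wise sketch (names illustrative):

```python
def compress_mv(mv, predicted):
    """Encoder side: the transmitted difference dM1 = M1 - P(M1)."""
    return (mv[0] - predicted[0], mv[1] - predicted[1])

def decompress_mv(diff, predicted):
    """Decoder side: recover M1 = P(M1) + dM1."""
    return (diff[0] + predicted[0], diff[1] + predicted[1])
```

Because encoder and decoder derive P(M1) with the same algorithm from the same base picture, the round trip is exact: the better the prediction, the smaller (and cheaper to entropy-code) the transmitted difference.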
  • the inverse quantizer 350 inversely quantizes the texture data T1 provided from the entropy decoder 310. During the inverse quantization, the value matched to each index generated through the quantization process is decompressed, using the same quantization table as that used in the quantization process.
  • An inverse transformer 360 performs inverse transform on the result of the inverse quantization.
  • the inverse transform is performed by a method corresponding to the transformer 120 of the video encoder 100; in detail, an inverse DCT, an inverse wavelet transform, and the like can be employed.
  • the result of the inverse transform, that is, a decompressed high frequency picture, is provided to an adder 370.
  • the motion compensator 340 generates the motion compensation picture using the reference picture (previously decompressed and stored in a frame buffer 380) for the current layer picture, and provides the generated motion compensation picture to the adder 370.
  • Such a motion compensation process can be performed in the reverse order to the order of an MCTF separation, i.e., in the reverse order to the temporal level according to the MCTF decompression order.
  • the adder 370 adds the high frequency picture provided from the inverse transformer 360 and the motion compensated picture, thereby decompressing a picture of the current temporal level, and stores the decompressed picture in the frame buffer 380.
  • in a similar manner, a decoding process of the lower layer is performed, using the motion vector buffer 430 and the corresponding components of the lower layer.
  • FIG. 13 is a diagram illustrating the construction of a system for performing an operation of the video encoder 100 or the video decoder 300 according to an exemplary embodiment of the present invention.
  • the system can be embodied as a television (TV) set, a set-top box, a desktop computer, a laptop computer, a palmtop computer, a personal digital assistant (PDA), or a video or picture storage device (e.g., a video cassette recorder (VCR) or a digital video recorder (DVR)).
  • the system may also be a combination of the above devices, or a device partially included in another device.
  • the system can include at least one video source 910, at least one input/output device 920, a processor 940, a memory 950, and a display device 930.
  • the video source 910 may be a TV receiver, a VCR, or other video storage units.
  • the video source 910 may be at least one network connection for receiving a video from a server connected to the Internet, a wide area network (WAN), a local area network (LAN), a terrestrial broadcast system, a cable network, a satellite communication network, a wireless network, or a telephone network.
  • the source can also be a combination of the networks or a network partially included in other networks.
  • the input/output device 920, the processor 940, and the memory 950 communicate through a communication medium 960.
  • the communication medium 960 may be a communication bus, a communication network, or at least one internal connection circuit.
  • video data received from the video source 910 can be processed by the processor 940 according to at least one software program stored in the memory 950, in order to generate an output video picture to be provided to the display device 930.
  • the software program stored in the memory 950 may include a scalable video codec executing a method according to the present invention.
  • the encoder or the codec can be stored in the memory 950, and may be read from a storage medium such as a CD-ROM or a floppy disk, or downloaded from a predetermined server through a variety of networks. It may also be replaced by a hardware circuit, or by a combination of software and a hardware circuit.
  • the present invention can provide a method for efficiently predicting motion vectors of an unsynchronized picture using motion vectors of a lower layer in a multilayer-based scalable video codec where each layer has an MCTF structure.
  • the present invention can improve the performance of a Joint Scalable Video Model (JSVM) by applying the motion vector prediction technique of the unsynchronized picture to the JSVM.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This invention concerns a method and an apparatus for improving the compression efficiency of the motion vectors of an unsynchronized picture by efficiently predicting the motion vectors using motion vectors of a lower layer. The method compresses the motion vectors of an unsynchronized picture belonging to a current layer in a video encoder based on multiple layers comprising at least the current layer and a lower layer of the current layer. The method comprises selecting a base picture for the unsynchronized picture, generating a predicted motion vector of the current layer from a motion vector of the base picture, subtracting the predicted motion vector from a motion vector of the unsynchronized picture, and encoding the result of the subtraction.
PCT/KR2006/001171 2005-04-01 2006-03-30 Procede pour la compression/decompression des vecteurs de mouvement d'une image non synchronisee et appareil utilisant ce procede WO2006104357A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US66709205P 2005-04-01 2005-04-01
US60/667,092 2005-04-01
KR10-2005-0044594 2005-05-26
KR1020050044594A KR100763179B1 (ko) 2005-04-01 2005-05-26 비동기 픽쳐의 모션 벡터를 압축/복원하는 방법 및 그방법을 이용한 장치

Publications (1)

Publication Number Publication Date
WO2006104357A1 true WO2006104357A1 (fr) 2006-10-05

Family

ID=37053587

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2006/001171 WO2006104357A1 (fr) 2005-04-01 2006-03-30 Procede pour la compression/decompression des vecteurs de mouvement d'une image non synchronisee et appareil utilisant ce procede

Country Status (1)

Country Link
WO (1) WO2006104357A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6510177B1 (en) * 2000-03-24 2003-01-21 Microsoft Corporation System and method for layered video coding enhancement
WO2004056121A1 (fr) * 2002-12-17 2004-07-01 Koninklijke Philips Electronics N.V. Procede de codage de flux video destine a des descriptions multiples a faible cout au niveau des passerelles
US20040131121A1 (en) * 2003-01-08 2004-07-08 Adriana Dumitras Method and apparatus for improved coding mode selection
WO2004082293A1 (fr) * 2003-03-06 2004-09-23 Thomson Licensing Procede de codage d'une image video prenant en compte la partie relative a une composante du vecteur de mouvement
KR20060043209A (ko) * 2004-10-21 2006-05-15 삼성전자주식회사 다 계층 기반의 모션 벡터를 효율적으로 부호화하는 방법및 장치

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2866441A4 (fr) * 2012-06-26 2016-03-02 Mitsubishi Electric Corp Dispositifs et procédés de codage et de décodage d'image mobile
US10264289B2 (en) 2012-06-26 2019-04-16 Mitsubishi Electric Corporation Video encoding device, video decoding device, video encoding method, and video decoding method
CN118470598A (zh) * 2024-05-14 2024-08-09 中国科学技术大学 基于运动向量的监控视频流低冗余推理方法

Similar Documents

Publication Publication Date Title
US8085847B2 (en) Method for compressing/decompressing motion vectors of unsynchronized picture and apparatus using the same
US8817872B2 (en) Method and apparatus for encoding/decoding multi-layer video using weighted prediction
US20060209961A1 (en) Video encoding/decoding method and apparatus using motion prediction between temporal levels
KR100714696B1 (ko) 다계층 기반의 가중 예측을 이용한 비디오 코딩 방법 및장치
JP4891234B2 (ja) グリッド動き推定/補償を用いたスケーラブルビデオ符号化
US20060280372A1 (en) Multilayer-based video encoding method, decoding method, video encoder, and video decoder using smoothing prediction
US20060104354A1 (en) Multi-layered intra-prediction method and video coding method and apparatus using the same
US20060120448A1 (en) Method and apparatus for encoding/decoding multi-layer video using DCT upsampling
US20070047644A1 (en) Method for enhancing performance of residual prediction and video encoder and decoder using the same
JP2009532979A (ja) 加重平均合を用いてfgs階層をエンコーディングおよびデコーディングする方法および装置
KR20060135992A (ko) 다계층 기반의 가중 예측을 이용한 비디오 코딩 방법 및장치
JP2009533938A (ja) 多階層基盤のビデオエンコーディング方法および装置
JP2006304307A (ja) エントロピーコーディングのコンテキストモデルを適応的に選択する方法及びビデオデコーダ
JP2006304307A5 (fr)
US20060165303A1 (en) Video coding method and apparatus for efficiently predicting unsynchronized frame
WO2006078115A1 (fr) Procede et appareil de codage video pour la prediction efficace de trames non synchronisees
US20060165301A1 (en) Video coding method and apparatus for efficiently predicting unsynchronized frame
US20060250520A1 (en) Video coding method and apparatus for reducing mismatch between encoder and decoder
EP1878252A1 (fr) Procede et appareil destine a coder/decoder une video a couches multiples en utilisant une prediction ponderee
WO2007024106A1 (fr) Procede permettant d'ameliorer le rendement de la prediction residuelle, codeur et decodeur video utilisant ledit procede
WO2006132509A1 (fr) Procede de codage video fonde sur des couches multiples, procede de decodage, codeur video, et decodeur video utilisant une prevision de lissage
WO2006104357A1 (fr) Procede pour la compression/decompression des vecteurs de mouvement d'une image non synchronisee et appareil utilisant ce procede
JP2005236459A (ja) 動画像符号化装置、その方法及びそのプログラム
WO2006098586A1 (fr) Procede et dispositif de codage/decodage video utilisant une prediction de mouvement entre des niveaux temporels
WO2006083107A1 (fr) Procede et dispositif pour comprimer un vecteur de mouvement multicouche

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: RU

122 Ep: pct application non-entry in european phase

Ref document number: 06732746

Country of ref document: EP

Kind code of ref document: A1

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载