WO2006118383A1 - Video coding method and apparatus supporting fast fine granular scalability - Google Patents
- Publication number
- WO2006118383A1 (PCT/KR2006/001471)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frame
- base layer
- fgs
- residual
- motion vector
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/34—Scalability techniques involving progressive bit-plane based encoding of the enhancement layer, e.g. fine granular scalability [FGS]
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/523—Motion estimation or motion compensation with sub-pixel accuracy
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- Methods and apparatuses consistent with the present invention relate to video coding, and more particularly, to video coding which reduces the amount of computations required for a multilayer-based Progressive Fine Granular Scalability (PFGS) algorithm.
- Multimedia data usually requires a large-capacity storage medium and a wide bandwidth for transmission since the amount of multimedia data is large. Accordingly, a compression coding method is a requisite for transmitting multimedia data including text, video, and audio.
- A basic principle of data compression is removing data redundancy.
- Data can be compressed by removing spatial redundancy, in which the same color or object is repeated in an image; temporal redundancy, in which there is little change between neighboring frames in a moving image or the same sound is repeated in audio; or perceptual visual redundancy, which takes into account human vision's limited perception of high frequencies.
- temporal redundancy is removed by temporal filtering based on motion compensation
- spatial redundancy is removed by spatial transformation.
- To transmit the multimedia data thus generated, transmission media are required. Different types of transmission media for multimedia have different performance. Currently used transmission media have various transmission rates: for example, an ultrahigh-speed communication network can transmit data at several tens of megabits per second, while a mobile communication network has a transmission rate of 384 kilobits per second. To support transmission media having various speeds, data coding methods having scalability may be suitable for such a multimedia environment.
- Scalability indicates the ability to partially decode a single compressed bitstream.
- Scalability includes spatial scalability indicating a video resolution, Signal-to-Noise Ratio (SNR) scalability indicating a video quality level, and temporal scalability indicating a frame rate.
- Standardization work to implement multi-layer scalability based on the H.264 Scalable Extension (hereinafter referred to as 'H.264 SE') is currently in progress by the Joint Video Team (JVT) of the Moving Picture Experts Group (MPEG) and the International Telecommunication Union (ITU).
- FIG. 1 is a diagram for explaining a conventional Fine Granular Scalability (FGS) technique.
- An FGS-based codec performs coding by dividing a video bitstream into a base layer and an FGS layer.
- In FIG. 1, a prime (') notation is used to denote a reconstructed image obtained after quantization/inverse quantization. More specifically, a block P_B predicted from a block M_B' in a reconstructed left base layer frame 11 and a block N_B' in a reconstructed right base layer frame 13 using a motion vector is subtracted from a block O in an original current frame 12 to obtain a difference block R_B.
- The difference block R_B can be defined by Equation (1): R_B = O - (M_B' + N_B')/2.
- The difference block R_B is quantized with a base layer quantization step size QP_B, yielding R_B^Q, and then inversely quantized to obtain a reconstructed difference block R_B'.
- In the FGS layer, the residual Δ between the unquantized difference block R_B and the reconstructed difference block R_B' is quantized with a quantization step size QP_F smaller than the base layer quantization step size QP_B (a compression rate decreases as the quantization step size becomes smaller, while quality improves).
- FIG. 2 is a diagram for explaining a conventional progressive fine granular scalability (PFGS) technique.
- The PFGS technique uses the fact that the quality of the left and right reference frames in the FGS layer is also improved by the FGS technique. That is, the PFGS technique involves calculating a new difference block R_F using the newly updated left and right reference frames 21 and 23 and quantizing a residual between the new difference block R_F and the reconstructed base layer difference block R_B', thereby improving coding performance.
- The new difference block R_F is defined by Equation (2): R_F = O - (M_F' + N_F')/2.
- The PFGS technique has an advantage over the FGS technique in that the amount of data in the FGS layer can be reduced owing to the higher quality of the left and right reference frames. However, because the FGS layer also requires separate motion compensation, the amount of computations increases. That is, while PFGS improves performance over conventional FGS, it requires a large amount of computations because motion compensation is performed for each FGS layer to generate a predicted signal and a residual signal between the predicted signal and the original signal. Recently developed video codecs interpolate an image signal at 1/2 or 1/4 pixel accuracy for motion compensation. When motion compensation is performed at 1/4 pixel accuracy, an image with a size corresponding to quadruple the resolution of the original image must be generated.
Disclosure of Invention
- The H.264 SE technique uses a six-tap filter as its 1/2-pixel interpolation filter, which involves considerable computational complexity and thus requires a large amount of computations for motion compensation. This complicates the encoding and decoding processes and requires greater system resources. This drawback is most problematic in fields requiring real-time encoding and decoding, such as real-time broadcasting or video conferencing.
- the present invention provides a method and apparatus for reducing an amount of computations required for motion compensation while maintaining the performance of a progressive fine granular scalability (PFGS) algorithm.
- a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, quantizing a residual between the current frame and the predicted image, inversely quantizing the quantized residual and generating a reconstructed image for the current frame, performing motion compensation on an FGS layer reference frame and a base layer reference frame using the estimated motion vector, calculating a residual between the motion-compensated FGS layer reference frame and the motion-compensated base layer reference frame, subtracting the reconstructed image for the current frame and the calculated residual from the current frame, and encoding the result of subtraction.
- a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, quantizing a residual between the current frame and the predicted image, inversely quantizing the quantized residual, and generating a reconstructed image for the current frame, performing motion compensation on an FGS layer reference frame and a base layer reference frame using the estimated motion vector and generating a predicted frame for the FGS layer and a predicted frame for the base layer, respectively, calculating a residual between the predicted frame for the FGS layer and the predicted frame for the base layer, subtracting the reconstructed image and the residual from the current frame, and encoding the result of subtraction.
- a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, quantizing a residual between the current frame and the predicted image, inversely quantizing the quantized residual, and generating a reconstructed image for the current frame, calculating a residual between an FGS layer reference frame and a base layer reference frame, performing motion compensation on the residual using the estimated motion vector, subtracting the reconstructed image and the motion-compensated result from the current frame, and encoding the result of subtraction.
- a video encoding method supporting fine granular scalability including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, performing motion compensation on an FGS layer reference frame and a base layer reference frame using a motion vector with lower accuracy than that of the estimated motion vector, calculating a residual between the motion-compensated FGS layer and base layer reference frames, subtracting the predicted image and the residual from the current frame, and encoding the result of subtraction.
- a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, performing motion compensation on an FGS layer reference frame and a base layer reference frame using a motion vector with lower accuracy than that of the estimated motion vector and generating a predicted frame for the FGS layer and a predicted frame for the base layer, respectively, calculating a residual between the predicted frame for the FGS layer and the predicted frame for the base layer, subtracting the predicted image and the calculated residual from the current frame, and encoding the result of subtraction.
- a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, calculating a residual between an FGS layer reference frame and a base layer reference frame, performing motion compensation on the residual using a motion vector with lower accuracy than that of the estimated motion vector, subtracting the predicted image and the motion-compensated result from the current frame, and encoding the result of subtraction.
- a video decoding method supporting FGS including extracting base layer texture data, FGS layer texture data, and motion vectors from an input bitstream, reconstructing a base layer frame from the base layer texture data, performing motion compensation on an FGS layer reference frame and a base layer reference frame using the motion vectors, calculating a residual between the motion-compensated FGS layer reference frame and the motion-compensated base layer reference frame, and adding together the base layer frame, the FGS layer texture data, and the residual.
- an FGS-based video encoder including an element obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, an element quantizing a residual between the current frame and the predicted image, inversely quantizing the quantized residual, and generating a reconstructed image for the current frame, an element performing motion compensation on an FGS layer reference frame and a base layer reference frame using the estimated motion vector, an element calculating a residual between the motion-compensated FGS layer and base layer reference frames, an element subtracting the reconstructed image and the residual from the current frame, and an element encoding the result of subtraction.
- a video decoder including an element extracting base layer texture data, FGS layer texture data, and motion vectors from an input bitstream, an element reconstructing a base layer frame from the base layer texture data, an element performing motion compensation on an FGS layer reference frame and a base layer reference frame using the motion vectors and generating a predicted FGS layer frame and a predicted base layer frame, an element calculating a residual between the predicted FGS layer frame and the predicted base layer frame, and an element adding together the FGS layer texture data, the reconstructed base layer frame, and the residual.
- FIG. 1 is a diagram for explaining a conventional FGS technique;
- FIG. 2 is a diagram for explaining a conventional PFGS technique;
- FIG. 3 is a diagram illustrating fast progressive fine granular scalability (PFGS) according to an exemplary embodiment of the present invention;
- FIG. 4 is a block diagram of a video encoder according to an exemplary embodiment of the present invention;
- FIG. 5 is a block diagram of a video encoder according to another exemplary embodiment of the present invention;
- FIGS. 6 and 7 are block diagrams of video encoders according to a further exemplary embodiment of the present invention;
- FIG. 8 is a block diagram of a video decoder according to an exemplary embodiment of the present invention;
- FIG. 9 is a block diagram of a video decoder according to another exemplary embodiment of the present invention;
- FIGS. 10 and 11 are block diagrams of video decoders according to a further exemplary embodiment of the present invention; and
- FIG. 12 is a block diagram of a system for performing an encoding or decoding process according to an exemplary embodiment of the present invention.
- FIG. 3 is a diagram illustrating PFGS according to a first exemplary embodiment of the present invention.
- Referring to FIG. 3, as in FIG. 2, the data Δ to be quantized in the FGS layer according to the PFGS algorithm can be defined simply by Equation (3): Δ = R_F - R_B'.
- R_F is defined by the above Equation (2), and R_B' is defined by Equation (4): R_B' = O' - (M_B' + N_B')/2.
- O' is an image reconstructed by quantizing an original image O with a base layer quantization step size QP_B and then inversely quantizing the quantized image.
- Substituting Equations (2) and (4) into Equation (3) gives Equation (5): Δ = (O - O') - {(M_F' - M_B') + (N_F' - N_B')}/2.
- Δ_M and Δ_N denote the residual between the left reference frames M_F' and M_B' and the residual between the right reference frames N_F' and N_B', respectively, as defined by Equation (6): Δ_M = M_F' - M_B', Δ_N = N_F' - N_B'.
- By substituting Equation (6) into Equation (5), Δ can be defined by Equation (7): Δ = O - O' - (Δ_M + Δ_N)/2.
- Thus, an encoder can obtain Δ by subtracting, from the original image O, the reconstructed base layer image O' (obtained by quantizing the original image O with the base layer quantization step size QP_B and then inversely quantizing the quantized image) and the average (Δ_M + Δ_N)/2 of the residuals between the FGS layer reference frames and the base layer reference frames.
- Conversely, a decoder reconstructs the original image O by adding together the reconstructed base layer image O', Δ, and the average of the residuals between the base layer reference frames and the FGS layer reference frames, as in the sketch below.
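- The encode/decode relationship of Equation (7) can be sketched numerically as follows; this is a minimal illustration with synthetic 4×4 blocks, and the variable names are stand-ins for the quantities in the text rather than part of any codec API.

```python
import numpy as np

rng = np.random.default_rng(0)
O   = rng.integers(0, 256, (4, 4)).astype(np.float64)  # original current block O
O_p = O + rng.normal(0.0, 2.0, (4, 4))                 # base layer reconstruction O'
d_M = rng.normal(0.0, 1.0, (4, 4))                     # delta_M = M_F' - M_B'
d_N = rng.normal(0.0, 1.0, (4, 4))                     # delta_N = N_F' - N_B'

# Encoder side, Equation (7): the FGS layer quantizes and codes delta.
delta = O - O_p - (d_M + d_N) / 2.0

# Decoder side: add the same three terms back to recover the original image.
O_rec = O_p + delta + (d_M + d_N) / 2.0
assert np.allclose(O, O_rec)
```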
- motion compensation is performed using a motion vector with one pixel or sub-pixel (1/2 pixel or 1/4 pixel) accuracy obtained by motion estimation.
- Motion estimation and compensation are typically performed at various pixel accuracies, such as half-pixel or quarter-pixel accuracy.
- a predicted image generated by motion compensation with, e.g., 1/4 pixel accuracy is packed into integer pixels.
- quantization is performed on a residual between an original image and the predicted image.
- Here, packing is the process of restoring a reference image that was interpolated by a factor of four for motion estimation with 1/4 pixel accuracy back to the original image size. For example, one of every four pixels may be selected during the packing process, as in the sketch below.
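- A minimal sketch of the packing step, assuming a reference image interpolated by a factor of four in each dimension; the function name `pack` and the phase arguments `dx` and `dy` are hypothetical names used only for illustration.

```python
import numpy as np

def pack(interpolated: np.ndarray, dx: int, dy: int) -> np.ndarray:
    """Restore the original-size image from a 4x-interpolated one by keeping
    one of every four samples at the sub-pixel phase (dx, dy)."""
    return interpolated[dy::4, dx::4]

up = np.arange(16 * 16, dtype=np.float64).reshape(16, 16)  # 4x-interpolated 4x4 image
print(pack(up, dx=1, dy=2).shape)  # -> (4, 4)
```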
- The data Δ in the FGS layer to be quantized for fast PFGS according to the present invention need not be subjected to motion estimation with high pixel accuracy.
- Motion estimation and compensation are applied only to the third term (Δ_M + Δ_N)/2 on the right-hand side of Equation (7).
- Because the third term is represented as interlayer residuals between reference frames, performing motion estimation and compensation on it with high pixel accuracy is not highly effective.
- Accordingly, fast PFGS allows motion estimation and compensation at lower pixel accuracy than conventional PFGS.
- Δ in Equation (5) in the first exemplary embodiment can also be represented as a residual between predicted signals P_F and P_B, as shown in Equation (8): Δ = O - O' - (P_F - P_B).
- P_F and P_B are equal to (M_F' + N_F')/2 and (M_B' + N_B')/2, respectively.
- the first and second exemplary embodiments are distinguished from each other as follows.
- In the first exemplary embodiment, the residuals Δ_M and Δ_N between the FGS layer reference images and the base layer reference images are first calculated and then averaged.
- In the second exemplary embodiment, the residual between the predicted FGS layer image P_F and the predicted base layer image P_B is calculated after calculating the predicted images P_F and P_B in the two layers. That is to say, although the fast PFGS algorithms according to the first and second exemplary embodiments are implemented in different ways, the same calculation result (Δ) is obtained, as the sketch below confirms numerically.
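- The equivalence holds because the average of the two interlayer residuals equals the residual between the two layer predictions; a quick numerical check with synthetic 4×4 frames (all names illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
M_F, N_F, M_B, N_B = (rng.normal(size=(4, 4)) for _ in range(4))

# First exemplary embodiment: average the per-reference residuals.
avg_of_residuals = ((M_F - M_B) + (N_F - N_B)) / 2.0

# Second exemplary embodiment: residual between the per-layer predictions.
residual_of_predictions = (M_F + N_F) / 2.0 - (M_B + N_B) / 2.0

assert np.allclose(avg_of_residuals, residual_of_predictions)
```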
- In the first and second exemplary embodiments, motion compensation is first performed and a residual between images is then calculated.
- In a third exemplary embodiment, a residual between reference images in different layers may first be calculated, followed by motion compensation.
- Here, boundary padding is the process of duplicating pixels at frame boundaries into the area surrounding them, considering that block matching at a frame boundary is otherwise restricted during motion estimation.
- In this case, the residual Δ can be defined by Equation (9): Δ = O - O' - {mc(M_F' - M_B') + mc(N_F' - N_B')}/2, where mc(·) denotes motion compensation.
- As described above, the fast PFGS algorithms calculate a residual between predicted images or motion-compensate a residual between reference images.
- Accordingly, the fast PFGS performance of the present invention is only slightly affected by, or insensitive to, the interpolation used to increase the pixel accuracy of the motion vector.
- Thus, quarter- or half-pixel interpolation may be skipped.
- Alternatively, a bi-linear filter requiring a smaller amount of computations may be used instead of the half-pixel interpolation filter used in the H.264 standard, which requires a large amount of computations.
- In other words, a bi-linear filter may be applied to the third terms on the right-hand sides of Equations (7) through (9). This may reduce degradation in performance compared to when a bi-linear filter is directly applied to a predicted signal for obtaining R_F and R_B as in a conventional PFGS algorithm; the two filters are contrasted in the sketch below.
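- For illustration, a 1-D sketch of the two half-pixel filters follows. The taps (1, -5, 20, 20, -5, 1)/32 are the H.264 half-pixel filter coefficients; the bi-linear variant is a simple two-sample average. Border handling and clipping are omitted, so this is a sketch rather than a standard-conformant implementation.

```python
import numpy as np

def halfpel_sixtap(row: np.ndarray) -> np.ndarray:
    """H.264-style six-tap half-pel interpolation (1-D sketch)."""
    taps = np.array([1, -5, 20, 20, -5, 1], dtype=np.float64)
    return np.convolve(row, taps, mode="valid") / 32.0

def halfpel_bilinear(row: np.ndarray) -> np.ndarray:
    """Bi-linear half-pel interpolation: average of the two neighbours."""
    return (row[:-1] + row[1:]) / 2.0

row = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
print(halfpel_sixtap(row))    # [25. 35.] - six multiply-adds per sample
print(halfpel_bilinear(row))  # [ 5. 15. 25. 35. 45. 55.] - one add per sample
```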
- The principle of the first through third exemplary embodiments of the present invention is based on Equation (3).
- In other words, implementation of these exemplary embodiments starts from the assumption that a residual between an FGS layer residual R_F and a reconstructed base layer residual R_B' is to be coded.
- In some cases, however, the above fast PFGS algorithms according to the first through third exemplary embodiments may rather degrade coding performance.
- In such cases, coding only the residual obtained from the FGS layer, i.e., R_F in Equation (3), may offer better coding performance. That is, according to a fourth exemplary embodiment of the present invention, Equations (7) through (9) may be modified into Equations (10) through (12), respectively: Δ = O - P_B - (Δ_M + Δ_N)/2 (10), Δ = O - P_B - (P_F - P_B) (11), and Δ = O - P_B - {mc(M_F' - M_B') + mc(N_F' - N_B')}/2 (12).
- In Equations (10) through (12), the reconstructed base layer image O' is replaced with a predicted image P_B for the base layer image.
- Likewise, interpolation may not be applied to the third terms on the right-hand sides of Equations (10) through (12), or a bi-linear filter requiring a smaller amount of computations may be used for the interpolation.
- Note that the predicted image P_B occurring twice in Equation (11) is not necessarily generated in the same way.
- An estimated motion vector may be used during motion compensation to generate the predicted image P_B in the second term.
- Meanwhile, a motion vector with lower accuracy than the estimated motion vector, or a filter requiring a small amount of computations, may be used during motion compensation to generate P_F and P_B in the third term.
- In PFGS, drift error can be reduced by a leaky prediction method using a predicted image created as a weighted sum of a predicted image obtained from both reference frames and a predicted image obtained from a base layer.
- In this case, a value being coded in the FGS layer is expressed by Equation (13): Δ = O - {α × P_F + (1 - α) × P_B}, where the weighting factor α satisfies 0 ≤ α ≤ 1.
- Equation (13) can be converted into Equation (14) according to a fifth exemplary embodiment of the present invention: Δ = O - P_B - α × (P_F - P_B).
- In other words, the weighting factor α need only be applied to the residual (P_F - P_B) in Equation (11).
- Therefore, the present invention can also be applied to a leaky prediction method. That is, interpolation may be skipped, or interpolation may be applied to the residual (P_F - P_B) using a bi-linear filter requiring a smaller amount of computations. In the latter case, the result of interpolation is multiplied by the weighting factor α. A numerical sketch of Equations (13) and (14) follows.
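- A minimal numerical sketch of the leaky prediction value with synthetic data; it also verifies that Equations (13) and (14) are algebraically identical. The array names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
O, P_F, P_B = (rng.normal(size=(4, 4)) for _ in range(3))
alpha = 0.75  # weighting factor, 0 <= alpha <= 1

# Fifth exemplary embodiment, Equation (14):
delta_14 = O - P_B - alpha * (P_F - P_B)

# Conventional leaky prediction form, Equation (13):
delta_13 = O - (alpha * P_F + (1.0 - alpha) * P_B)

assert np.allclose(delta_13, delta_14)
```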
- FIG. 4 is a block diagram of a video encoder 100 according to a first exemplary embodiment of the present invention.
- While fast PFGS has been described with reference to FIGS. 1 through 3 using each block as a basic unit of motion estimation, the description that follows is given with regard to each frame containing such blocks.
- An identifier of a block is indicated as a subscript to 'F', which denotes a frame.
- For example, a frame containing a block labeled R_B is denoted by F_RB.
- As before, a prime (') notation is used to denote reconstructed data obtained after quantization/inverse quantization.
- A current frame F_O is fed into a motion estimator 105, a subtracter 115, and a residual calculator 170.
- the motion estimator 105 performs motion estimation on the current frame F using neighboring frames to obtain motion vectors MVs.
- the neighboring frames that are referred to during motion estimation are hereinafter called 'reference frames'.
- a block matching algorithm (BMA) is commonly used to estimate the motion of a given block.
- In the BMA, a given block is moved within a search area in a reference frame at pixel or sub-pixel accuracy, and the displacement with a minimum error is determined as the motion vector.
- Hierarchical variable size block matching (HVSBM) may also be used for motion estimation.
- When motion estimation is performed at sub-pixel accuracy, reference frames need to be upsampled or interpolated to a predetermined resolution. For example, when motion estimation is performed at 1/2 and 1/4 pixel accuracy, reference frames must be upsampled or interpolated by a factor of two and four, respectively.
- the motion vectors MVs calculated by the motion estimator 105 are provided to a motion compensator 110.
- The motion compensator 110 performs motion compensation on the reference frames F_MB' and F_NB' using the motion vectors MVs and generates a predicted frame F_PB for the current frame.
- the predicted image can be calculated as an average of motion-compensated reference frames.
- When a unidirectional reference frame is used, the predicted image may be the same as the motion-compensated reference frame. While it is assumed hereinafter that motion estimation and compensation use bidirectional reference frames, it will be apparent to those skilled in the art that the present invention may also use a unidirectional reference frame.
- The subtracter 115 calculates a residual F_RB between the current frame and the predicted image for transmission to a transformer 120.
- The transformer 120 performs spatial transform on the residual F_RB to create transform coefficients.
- the spatial transform method may include a discrete cosine transform (DCT), or wavelet transform. Specifically, DCT coefficients may be created in a case where DCT is employed, and wavelet coefficients may be created in a case where wavelet transform is employed.
- A quantizer 125 applies quantization to the transform coefficients.
- Quantization is the process of expressing transform coefficients, which take arbitrary real values, as discrete values, and matching the discrete values with indices according to a predetermined quantization table.
- In particular, the quantizer 125 may divide the real-valued transform coefficient by a predetermined quantization step size and round the resulting value to the nearest integer, as in the sketch below.
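- A minimal sketch of this divide-and-round quantizer and its matching inverse quantizer; the function names are illustrative and not part of any standard API.

```python
import numpy as np

def quantize(coeffs: np.ndarray, step: float) -> np.ndarray:
    """Divide by the quantization step size and round to the nearest integer."""
    return np.rint(coeffs / step).astype(np.int64)

def dequantize(levels: np.ndarray, step: float) -> np.ndarray:
    """Inverse quantization: multiply the integer levels back by the step size."""
    return levels.astype(np.float64) * step

c = np.array([12.7, -3.2, 0.4])
print(dequantize(quantize(c, step=4.0), step=4.0))  # [12. -4.  0.]
```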
- the quantization step size of a base layer is greater than that of an FGS layer.
- Inverse quantization restores the values matched to the indices generated during quantization, using the same quantization step size as used in the quantization.
- An inverse transformer 135 receives the inverse quantization result and performs an inverse transform on it.
- Inverse spatial transform may be, for example, inverse DCT or inverse wavelet transform, performed in a reverse order to that of transformation performed by the transformer 120.
- An adder 140 adds the inversely transformed result to the predicted image F_PB obtained from the motion compensator 110 in order to generate a reconstructed image F_O' for the current frame.
- A buffer 145 stores the addition result received from the adder 140.
- That is, the buffer 145 stores the reconstructed image F_O' for the current frame as well as the previously reconstructed base layer reference frames F_MB' and F_NB'.
- A motion vector modifier 155 changes the accuracy of the received motion vector MV.
- the motion vector MV with 1/4 pixel accuracy may have a value of 0, 0.25, 0.5, or 0.75.
- the motion vector modifier 155 changes the motion vector MV with 1/4 pixel accuracy into motion vector MV with pixel accuracy lower than the 1/4 pixel accuracy, such as 1/2 pixel or 1 pixel.
- Such a change can be performed by simply truncating or rounding off the fractional part of the original motion vector, as in the sketch below.
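- Assuming quarter-pel motion vector components are stored as integers in units of 1/4 pixel (a common convention, not mandated by the text), the accuracy change can be sketched as follows; `lower_accuracy` is a hypothetical helper name.

```python
def lower_accuracy(mv_qpel: int, target: str = "half") -> int:
    """Truncate a quarter-pel motion vector component to half- or full-pel.

    Floor division stands in for 'truncating the fractional part'; a real
    codec would choose its rounding rule explicitly."""
    if target == "half":
        return (mv_qpel // 2) * 2  # drop the 1/4-pel bit
    return (mv_qpel // 4) * 4      # drop both fractional bits (full-pel)

print(lower_accuracy(7, "half"))  # 6, i.e. 6/4 = 1.5 pixels
print(lower_accuracy(7, "full"))  # 4, i.e. 4/4 = 1.0 pixel
```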
- A buffer 165 temporarily stores FGS layer reference frames F_MF' and F_NF'.
- Reconstructed FGS layer frames F_MF' and F_NF', or original frames adjacent to the current frame, may be used as the FGS layer reference frames.
- A motion compensator 160 uses the modified motion vector to perform motion compensation on the reconstructed base layer reference frames F_MB' and F_NB' received from the buffer 145 and the reconstructed FGS layer reference frames F_MF' and F_NF' received from the buffer 165, and provides the motion-compensated frames mc(F_MB'), mc(F_NB'), mc(F_MF'), and mc(F_NF') to the residual calculator 170.
- F_MF' and F_NF' denote forward and backward reference frames in the FGS layer, respectively.
- F_MB' and F_NB' denote forward and backward reference frames in the base layer, respectively.
- For this interpolation, the motion compensator 160 may use a different type of interpolation filter than that used by the motion estimator 105 or the motion compensator 110.
- For example, a bi-linear filter requiring a small amount of computations may be used for interpolation instead of the six-tap filter used in the H.264 standard. Because the residual between a motion-compensated base layer frame and a motion-compensated FGS layer frame is calculated after interpolation, the interpolation process has little effect on compression efficiency.
- The residual calculator 170 calculates the residuals between the motion-compensated FGS layer reference frames and the motion-compensated base layer reference frames:
- Δ_M = mc(F_MF') - mc(F_MB')
- Δ_N = mc(F_NF') - mc(F_NB')
- The residual calculator 170 then calculates the average of the residuals Δ_M and Δ_N and subtracts the reconstructed image F_O' and that average from the current frame F_O, as sketched below.
- When a unidirectional reference frame is used, the process of calculating the average is not required.
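- The subtraction performed by the residual calculator 170 can be sketched as follows; the argument names are illustrative, and the unidirectional case simply skips the averaging, as noted above.

```python
import numpy as np

def residual_calculator(F_O, F_O_rec, mc_MF, mc_MB, mc_NF=None, mc_NB=None):
    """Subtract the reconstructed image and the average interlayer residual
    from the current frame, per Equation (7)."""
    d_M = mc_MF - mc_MB                     # delta_M
    if mc_NF is None or mc_NB is None:      # unidirectional reference frame
        return F_O - F_O_rec - d_M
    d_N = mc_NF - mc_NB                     # delta_N
    return F_O - F_O_rec - (d_M + d_N) / 2.0

# Example with constant 2x2 blocks: 10 - 9 - ((3 - 2) + (5 - 4))/2 = 0.
blocks = [np.full((2, 2), v) for v in (10.0, 9.0, 3.0, 2.0, 5.0, 4.0)]
print(residual_calculator(*blocks))
```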
- The subtraction result F_Δ obtained by the residual calculator 170 is subjected to spatial transform by a transformer 175 and then quantized by a quantizer 180.
- The quantized result F_Δ^Q is transmitted to an entropy coding unit 150.
- the quantization step size used in the quantizer 180 is typically less than that used in the quantizer 125.
- The entropy coding unit 150 losslessly encodes the motion vector MV estimated by the motion estimator 105, the quantized coefficients F_RB^Q received from the quantizer 125, and the quantized result F_Δ^Q received from the quantizer 180 into a bitstream.
- Lossless coding methods include arithmetic coding, variable length coding, and the like.
- A video encoder according to a second exemplary embodiment of the present invention may have the same configuration and operation as the video encoder 100 shown in FIG. 4, except for the residual calculator.
- the residual calculator generates a predicted frame for each layer before calculating a residual between frames in different layers.
- the residual calculator generates the predicted FGS layer frame and a predicted base layer frame using a motion-compensated FGS layer reference frame and a motion-compensated base layer reference frame.
- the predicted frame can be calculated by simply averaging the two motion-compensated reference frames.
- the motion-compensated frame may be the predicted frame itself.
- the residual calculator then calculates a residual between the predicted frames, and subtracts a reconstructed image and the calculated residual from a current frame.
- FIG. 5 is a block diagram of a video encoder 300 according to a third exemplary embodiment of the present invention.
- The illustrated video encoder 300 performs motion compensation after calculating the residual between reference frames in the two layers. To avoid repetitive explanation, the following description will focus on features that distinguish it from the first exemplary embodiment.
- A subtracter 390 subtracts reconstructed base layer reference frames F_MB' and F_NB', which are received from a buffer 345, from FGS layer reference frames F_MF' and F_NF', which are received from a buffer 365, and provides the subtraction results F_MF' - F_MB' and F_NF' - F_NB' to a motion compensator 360.
- When a unidirectional reference frame is used, only one residual exists.
- The motion compensator 360 uses a modified motion vector received from a motion vector modifier 355 to perform motion compensation on the residuals F_MF' - F_MB' and F_NF' - F_NB'.
- FIGS. 6 and 7 are block diagrams of examples of video encoders 400 and 600 according to a fourth exemplary embodiment of the present invention. Referring first to FIG. 6, unlike in the first exemplary embodiment shown in FIG. 4, a residual calculator 470 in the video encoder 400 of this exemplary embodiment subtracts a predicted base layer frame F_PB, instead of the reconstructed base layer frame F_O', from the current frame F_O.
- The video encoders 400 and 600 according to the fourth exemplary embodiment shown in FIGS. 6 and 7 correspond to the video encoders 100 and 300 according to the first and third exemplary embodiments shown in FIGS. 4 and 5, respectively.
- Referring to FIG. 6, the residual calculator 470 subtracts the predicted base layer image F_PB received from a motion compensator 410, instead of the reconstructed base layer image F_O', from the current frame.
- That is, the residual calculator 470 subtracts the predicted image F_PB and an average of the residuals Δ_M and Δ_N from the current frame F_O to obtain a subtraction result F_Δ.
- Referring to FIG. 7, the residual calculator 670 subtracts the predicted image F_PB and an average of the motion-compensated residuals mc(F_MF' - F_MB') and mc(F_NF' - F_NB') from the current frame F_O to obtain a subtraction result F_Δ.
- An example of a video encoder according to the fourth exemplary embodiment corresponding to the second exemplary embodiment may have the same configuration and perform the same operation as shown in FIG. 6, except for the operation of the residual calculator 470.
- In this case, the residual calculator 470 generates a predicted FGS layer frame F_PF and a predicted base layer frame F_PB using the motion-compensated FGS layer reference frames mc(F_MF') and mc(F_NF') and the motion-compensated base layer reference frames mc(F_MB') and mc(F_NB'), respectively.
- The residual calculator 470 also calculates a residual F_PF - F_PB between the predicted frames F_PF and F_PB and subtracts the predicted image F_PB and the residual F_PF - F_PB from the current frame F_O to obtain a subtraction result F_Δ.
- When leaky prediction is applied, the residual calculator 470 multiplies the residual F_PF - F_PB by a weighting factor α and subtracts the predicted image F_PB and the product α × (F_PF - F_PB) from the current frame F_O to obtain a subtraction result F_Δ.
- FIG. 8 is a block diagram of a video decoder 700 according to a first exemplary embodiment of the present invention.
- An entropy decoding unit 701 losslessly decodes an input bitstream to extract base layer texture data F_RB^Q, FGS layer texture data F_Δ^Q, and motion vectors MVs.
- the lossless decoding is an inverse process of lossless encoding.
- The base layer texture data F_RB^Q and the FGS layer texture data F_Δ^Q are provided to inverse quantizers 705 and 745, respectively, and the motion vectors MVs are provided to a motion compensator 720 and a motion vector modifier 730.
- The inverse quantizer 705 applies inverse quantization to the base layer texture data F_RB^Q received from the entropy decoding unit 701.
- The inverse quantization restores the values matched to the indices generated during quantization, according to the predetermined quantization step size used in the quantization.
- An inverse transformer 710 performs inverse transform on the inverse quantized result.
- The inverse transformation is performed in a reverse order to that of the transformation performed by the transformer of the encoder. Specifically, inverse DCT or inverse wavelet transform may be used.
- The reconstructed residual F_RB' is provided to an adder 715.
- The motion compensator 720 performs motion compensation on previously reconstructed base layer reference frames F_MB' and F_NB' stored in a buffer 725 using the extracted motion vectors MVs to generate a predicted image F_PB, which is then sent to the adder 715.
- The predicted image F_PB is calculated by averaging the motion-compensated reference frames.
- When unidirectional prediction is used, the predicted image F_PB is obtained as the motion-compensated reference frame.
- The adder 715 adds together the inputs F_RB' and F_PB to output a reconstructed base layer image F_O' that is then stored in the buffer 725.
- An inverse quantizer 745 applies inverse quantization to the FGS layer texture data F_Δ^Q.
- An inverse transformer 750 performs inverse transform on the inversely quantized result to obtain a reconstructed residual frame F_Δ' that is then provided to a frame reconstructor 755.
- The motion vector modifier 730 lowers the accuracy of the extracted motion vector MV.
- a motion vector MV with 1/4 pixel accuracy may have a value of 0, 0.25, 0.5, or 0.75.
- the motion vector modifier 730 changes the motion vector MV with 1/4 pixel accuracy into a motion vector MV with pixel accuracy lower than the 1/4 pixel accuracy, such as 1/2 pixel or 1 pixel.
- A motion compensator 735 uses the modified motion vector to perform motion compensation on the reconstructed base layer reference frames F_MB' and F_NB' received from the buffer 725 and the reconstructed FGS layer reference frames F_MF' and F_NF' received from a buffer 740.
- The residuals between the motion-compensated FGS layer and base layer reference frames are then calculated:
- Δ_M = mc(F_MF') - mc(F_MB')
- Δ_N = mc(F_NF') - mc(F_NB')
- The frame reconstructor 755 adds together the reconstructed base layer image F_O', the texture data F_Δ', and the average of Δ_M and Δ_N to obtain a reconstructed FGS layer image F_OF'.
- The buffer 740 then stores the reconstructed image F_OF'.
- Likewise, the previously reconstructed images F_MF' and F_NF' can be stored in the buffer 740.
- A video decoder according to a second exemplary embodiment may have the same configuration and perform the same operation as shown in FIG. 8, except for the operation of the frame reconstructor. That is, the frame reconstructor according to the second exemplary embodiment generates a predicted frame for each layer before calculating a residual between the frames in the two layers. That is to say, the frame reconstructor generates a predicted FGS layer frame and a predicted base layer frame using the motion-compensated FGS layer reference frames and the motion-compensated base layer reference frames. The predicted frames can be generated by simply averaging the two motion-compensated reference frames. Of course, when unidirectional prediction is used, the predicted frame is the motion-compensated frame itself.
- the frame reconstructor then calculates a residual between the predicted frames, and adds together the texture data, the reconstructed base layer frame, and the residual.
- FIG. 9 is a block diagram of a video decoder 900 according to a third exemplary embodiment of the present invention.
- the video decoder 900 performs motion compensation after calculating a residual between the reference frames in the two layers.
- To avoid repetitive explanation, the following description will focus on features that distinguish it from the first exemplary embodiment shown in FIG. 8.
- A subtracter 960 subtracts reconstructed base layer reference frames F_MB' and F_NB' received from a buffer 925 from FGS layer reference frames F_MF' and F_NF' and provides the subtraction results F_MF' - F_MB' and F_NF' - F_NB' to a motion compensator 935.
- The motion compensator 935 uses a modified motion vector received from a motion vector modifier 930 to perform motion compensation on the residuals F_MF' - F_MB' and F_NF' - F_NB'.
- A frame reconstructor 955 calculates an average of the motion-compensated residuals mc(F_MF' - F_MB') and mc(F_NF' - F_NB') and adds together the calculated average, F_Δ' received from an inverse transformer 950, and a reconstructed base layer image F_O'.
- When a unidirectional reference frame is used, the averaging process is not required.
- FIGS. 10 and 11 are block diagrams of examples of video decoders 1000 and 1200 according to a fourth exemplary embodiment of the present invention.
- Unlike in the first and third exemplary embodiments, frame reconstructors 1055 and 1255 add a predicted base layer frame F_PB instead of the reconstructed base layer frame F_O'.
- the video decoders 1000 and 1200 according to the fourth exemplary embodiment shown in FIGS. 10 and 11 correspond to those according to the first and third exemplary embodiments shown in FIGS. 8 and 9, respectively.
- Referring to FIG. 10, the motion compensator 1020 provides the predicted base layer image F_PB to the frame reconstructor 1055, instead of the reconstructed image F_O'.
- The frame reconstructor 1055 adds together F_Δ' received from the inverse transformer 1050, the predicted base layer image F_PB, and an average of the interlayer residuals Δ_M and Δ_N to obtain a reconstructed FGS layer image F_OF'.
- Referring to FIG. 11, the frame reconstructor 1255 adds together F_Δ' received from an inverse transformer 1250, the predicted base layer image F_PB received from a motion compensator 1220, and an average of the motion-compensated residuals mc(F_MF' - F_MB') and mc(F_NF' - F_NB') to obtain a reconstructed FGS layer image F_OF'.
- An example of a video decoder according to the fourth exemplary embodiment corresponding to the second exemplary embodiment may have the same configuration and perform the same operation as shown in FIG. 10, except for the operation of the frame reconstructor 1255.
- In this case, the frame reconstructor 1255 generates a predicted FGS layer frame F_PF and a predicted base layer frame F_PB using the motion-compensated FGS layer reference frames mc(F_MF') and mc(F_NF') and the motion-compensated base layer reference frames mc(F_MB') and mc(F_NB').
- The frame reconstructor 1255 also calculates a residual F_PF - F_PB between the predicted FGS layer frame F_PF and the predicted base layer frame F_PB, and adds together F_Δ' received from the inverse transformer 1250, the predicted image F_PB received from the motion compensator 1220, and the residual F_PF - F_PB to obtain the reconstructed image F_OF'.
- When leaky prediction (the fifth exemplary embodiment) is applied, the frame reconstructor 1255 multiplies the interlayer residual F_PF - F_PB by a weighting factor α and adds together F_Δ', F_PB, and the product α × (F_PF - F_PB) to obtain F_OF'.
- FIG. 12 is a block diagram of a system for performing an encoding or decoding process using a video encoder 100, 300, 400, 600 or a video decoder 700, 900, 1000, 1200, according to an exemplary embodiment of the present invention.
- The system may be a TV, a set-top box (STB), a desktop, laptop, or palmtop computer, a personal digital assistant (PDA), or a video or image storage device (e.g., a video cassette recorder (VCR) or a digital video recorder (DVR)).
- The system may also be a combination of the devices listed above, or one of the devices incorporating a part of another.
- the system includes at least one video source 1310, at least one input/output unit 1320, a processor 1340, a memory 1350, and a display unit 1330.
- the video source 1310 may be a TV receiver, a VCR, or other video storing apparatus.
- Alternatively, the video source 1310 may indicate at least one network connection for receiving a video or an image from a server using the Internet, a wide area network (WAN), a local area network (LAN), a terrestrial broadcast system, a cable network, a satellite communication network, a wireless network, a telephone network, or the like.
- In addition, the video source 1310 may be a combination of such networks or one network including a part of another network among them.
- the input/output device 1320, the processor 1340, and the memory 1350 communicate with one another through a communication medium 1360.
- the communication medium 1360 may be a communication bus, a communication network, or at least one internal connection circuit.
- Input video data received from the video source 1310 can be processed by the processor 1340 according to at least one software program stored in the memory 1350 to generate an output video provided to the display unit 1330.
- the software program stored in the memory 1350 includes a scalable wavelet-based codec performing a method of the present invention.
- the codec may be stored in the memory 1350, may be read from a storage medium such as a compact disc-read only memory (CD-ROM) or a floppy disc, or may be downloaded from a predetermined server through a variety of networks.
- As described above, the present invention provides video coding that can significantly reduce the amount of computations required to implement a PFGS algorithm. Since the decoding process is modified according to the video coding process of the present invention, the present invention can be applied to the H.264 SE standard document. While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP06747382A EP1878261A1 (en) | 2005-04-29 | 2006-04-20 | Video coding method and apparatus supporting fast fine granular scalability |
BRPI0611142-4A BRPI0611142A2 (en) | 2005-04-29 | 2006-04-20 | Supported video encoding method for fine granularity scalability, video encoder based on fine granularity scalability |
CA002609648A CA2609648A1 (en) | 2005-04-29 | 2006-04-20 | Video coding method and apparatus supporting fast fine granular scalability |
JP2008508745A JP2008539646A (en) | 2005-04-29 | 2006-04-20 | Video coding method and apparatus for providing high-speed FGS |
AU2006241637A AU2006241637A1 (en) | 2005-04-29 | 2006-04-20 | Video coding method and apparatus supporting fast fine granular scalability |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US67592105P | 2005-04-29 | 2005-04-29 | |
US60/675,921 | 2005-04-29 | ||
KR1020050052428A KR100703778B1 (en) | 2005-04-29 | 2005-06-17 | Video coding method and apparatus supporting high speed FGS |
KR10-2005-0052428 | 2005-06-17 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006118383A1 true WO2006118383A1 (en) | 2006-11-09 |
Family
ID=37308152
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2006/001471 WO2006118383A1 (en) | 2005-04-29 | 2006-04-20 | Video coding method and apparatus supporting fast fine granular scalability |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP1878261A1 (en) |
JP (1) | JP2008539646A (en) |
AU (1) | AU2006241637A1 (en) |
CA (1) | CA2609648A1 (en) |
RU (1) | RU2340115C1 (en) |
WO (1) | WO2006118383A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9369728B2 (en) | 2012-09-28 | 2016-06-14 | Sharp Kabushiki Kaisha | Image decoding device and image encoding device |
US9386322B2 (en) | 2007-07-02 | 2016-07-05 | Nippon Telegraph And Telephone Corporation | Scalable video encoding method and decoding method, apparatuses therefor, programs therefor, and storage media which store the programs |
US20220217372A1 (en) * | 2019-03-20 | 2022-07-07 | V-Nova International Limited | Modified upsampling for video coding technology |
WO2023087159A1 (en) * | 2021-11-16 | 2023-05-25 | 广东博华超高清创新中心有限公司 | DVS data generation method based on AVS motion estimation coding |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101624649B1 (en) * | 2009-08-14 | 2016-05-26 | 삼성전자주식회사 | Method and apparatus for video encoding considering hierarchical coded block pattern, and method and apparatus for video decoding considering hierarchical coded block pattern |
JP5381571B2 (en) * | 2009-09-29 | 2014-01-08 | 株式会社Jvcケンウッド | Image encoding device, image decoding device, image encoding method, and image decoding method |
CN103329532B (en) | 2011-03-10 | 2016-10-26 | 日本电信电话株式会社 | Quantization controls apparatus and method and quantization controls program |
WO2013145642A1 (en) * | 2012-03-28 | 2013-10-03 | 株式会社Jvcケンウッド | Image encoding device, image encoding method, image encoding program, transmission device, transmission method, transmission program, image decoding device, image decoding method, image decoding program, reception device, reception method, and reception program |
GB2501535A (en) * | 2012-04-26 | 2013-10-30 | Sony Corp | Chrominance Processing in High Efficiency Video Codecs |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6614936B1 (en) * | 1999-12-03 | 2003-09-02 | Microsoft Corporation | System and method for robust video coding using progressive fine-granularity scalable (PFGS) coding |
- 2006
- 2006-04-20 JP JP2008508745A patent/JP2008539646A/en active Pending
- 2006-04-20 CA CA002609648A patent/CA2609648A1/en not_active Abandoned
- 2006-04-20 AU AU2006241637A patent/AU2006241637A1/en not_active Abandoned
- 2006-04-20 WO PCT/KR2006/001471 patent/WO2006118383A1/en active Application Filing
- 2006-04-20 RU RU2007139817/09A patent/RU2340115C1/en not_active IP Right Cessation
- 2006-04-20 EP EP06747382A patent/EP1878261A1/en not_active Withdrawn
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6614936B1 (en) * | 1999-12-03 | 2003-09-02 | Microsoft Corporation | System and method for robust video coding using progressive fine-granularity scalable (PFGS) coding |
Non-Patent Citations (3)
Title |
---|
DING G.G. AND GUO B.L.: "Improvements to progressive fine granularity scalable video coding", PROC. FIFTH INT'L CONF. COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS (ICCIMA), 27 September 2003 (2003-09-27), pages 249 - 253, XP010661660 * |
V.D. SCHAAR M. AND RADHA H.: "Adaptive motion-compensation fine-granular-scalability (AMC-FGS) for wireless video", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, vol. 12, no. 6, June 2002 (2002-06-01), pages 360 - 371, XP001059187 * |
WANG Q. ET AL.: "Optimal rate allocation for progressive fine granularity scalable video coding", IEEE SIGNAL PROCESSING LETTERS, vol. 9, no. 2, February 2002 (2002-02-01), pages 33 - 39, XP001059186 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9386322B2 (en) | 2007-07-02 | 2016-07-05 | Nippon Telegraph And Telephone Corporation | Scalable video encoding method and decoding method, apparatuses therefor, programs therefor, and storage media which store the programs |
US9369728B2 (en) | 2012-09-28 | 2016-06-14 | Sharp Kabushiki Kaisha | Image decoding device and image encoding device |
JPWO2014050948A1 (en) * | 2012-09-28 | 2016-08-22 | シャープ株式会社 | Image decoding apparatus, image decoding method, and image encoding apparatus |
JP2017055458A (en) * | 2012-09-28 | 2017-03-16 | シャープ株式会社 | Image decoding device, image decoding method, image encoding device, and image encoding method |
US20220217372A1 (en) * | 2019-03-20 | 2022-07-07 | V-Nova International Limited | Modified upsampling for video coding technology |
US12177468B2 (en) * | 2019-03-20 | 2024-12-24 | V-Nova International Limited | Modified upsampling for video coding technology |
WO2023087159A1 (en) * | 2021-11-16 | 2023-05-25 | 广东博华超高清创新中心有限公司 | Dvs data generation methdo based on avs motion estimation coding |
Also Published As
Publication number | Publication date |
---|---|
RU2340115C1 (en) | 2008-11-27 |
JP2008539646A (en) | 2008-11-13 |
CA2609648A1 (en) | 2006-11-09 |
EP1878261A1 (en) | 2008-01-16 |
AU2006241637A1 (en) | 2006-11-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20060245495A1 (en) | Video coding method and apparatus supporting fast fine granular scalability | |
US8817872B2 (en) | Method and apparatus for encoding/decoding multi-layer video using weighted prediction | |
US20070047644A1 (en) | Method for enhancing performance of residual prediction and video encoder and decoder using the same | |
CN101208958B (en) | Video coding method and apparatus using multi-layer based weighted prediction | |
KR100703760B1 (en) | Method and apparatus for video encoding / decoding using temporal level motion vector prediction | |
KR100703788B1 (en) | Multi-layered Video Encoding Method Using Smooth Prediction, Decoding Method, Video Encoder and Video Decoder | |
US20060165302A1 (en) | Method of multi-layer based scalable video encoding and decoding and apparatus for the same | |
US20060120448A1 (en) | Method and apparatus for encoding/decoding multi-layer video using DCT upsampling | |
WO2006118383A1 (en) | Video coding method and apparatus supporting fast fine granular scalability | |
KR20060135992A (en) | Method and apparatus for coding video using weighted prediction based on multi-layer | |
EP1782631A1 (en) | Method and apparatus for predecoding and decoding bitstream including base layer | |
KR100763179B1 (en) | Method for compressing/Reconstructing motion vector of unsynchronized picture and apparatus thereof | |
US20070160143A1 (en) | Motion vector compression method, video encoder, and video decoder using the method | |
EP1878252A1 (en) | Method and apparatus for encoding/decoding multi-layer video using weighted prediction | |
WO2007024106A1 (en) | Method for enhancing performance of residual prediction and video encoder and decoder using the same | |
KR20050012755A (en) | Improved efficiency FGST framework employing higher quality reference frames | |
WO2006132509A1 (en) | Multilayer-based video encoding method, decoding method, video encoder, and video decoder using smoothing prediction | |
WO2006078109A1 (en) | Method of multi-layer based scalable video encoding and decoding and apparatus for the same | |
US20060088100A1 (en) | Video coding method and apparatus supporting temporal scalability | |
WO2006104357A1 (en) | Method for compressing/decompressing motion vectors of unsynchronized picture and apparatus using the same | |
WO2006098586A1 (en) | Video encoding/decoding method and apparatus using motion prediction between temporal levels | |
WO2006043754A1 (en) | Video coding method and apparatus supporting temporal scalability |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| WWE | Wipo information: entry into national phase | Ref document number: 200680019114.4; Country of ref document: CN |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| DPE2 | Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101) | |
| WWE | Wipo information: entry into national phase | Ref document number: 2006241637; Country of ref document: AU. Ref document number: 2006747382; Country of ref document: EP |
| WWE | Wipo information: entry into national phase | Ref document number: 2609648; Country of ref document: CA |
| WWE | Wipo information: entry into national phase | Ref document number: 2007139817; Country of ref document: RU |
| ENP | Entry into the national phase | Ref document number: 2008508745; Country of ref document: JP; Kind code of ref document: A |
| NENP | Non-entry into the national phase | Ref country code: DE |
| WWP | Wipo information: published in national office | Ref document number: 2006241637; Country of ref document: AU |
| WWE | Wipo information: entry into national phase | Ref document number: 1968/MUMNP/2007; Country of ref document: IN |
| WWP | Wipo information: published in national office | Ref document number: 2006747382; Country of ref document: EP |
| ENP | Entry into the national phase | Ref document number: PI0611142; Country of ref document: BR; Kind code of ref document: A2 |