
WO2006118383A1 - Video coding method and apparatus supporting fast fine granular scalability - Google Patents

Video coding method and apparatus supporting fast fine granular scalability

Info

Publication number
WO2006118383A1
WO2006118383A1 PCT/KR2006/001471 KR2006001471W
Authority
WO
WIPO (PCT)
Prior art keywords
frame
base layer
fgs
residual
motion vector
Prior art date
Application number
PCT/KR2006/001471
Other languages
English (en)
Inventor
Woo-Jin Han
Kyo-Hyuk Lee
Sang-Chang Cha
Original Assignee
Samsung Electronics Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from KR1020050052428A external-priority patent/KR100703778B1/ko
Application filed by Samsung Electronics Co., Ltd. filed Critical Samsung Electronics Co., Ltd.
Priority to EP06747382A priority Critical patent/EP1878261A1/fr
Priority to BRPI0611142-4A priority patent/BRPI0611142A2/pt
Priority to CA002609648A priority patent/CA2609648A1/fr
Priority to JP2008508745A priority patent/JP2008539646A/ja
Priority to AU2006241637A priority patent/AU2006241637A1/en
Publication of WO2006118383A1 publication Critical patent/WO2006118383A1/fr


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/34Scalability techniques involving progressive bit-plane based encoding of the enhancement layer, e.g. fine granular scalability [FGS]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/523Motion estimation or motion compensation with sub-pixel accuracy
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding

Definitions

  • Methods and apparatuses consistent with the present invention relate to video coding, and more particularly, to video coding which reduces the amount of computations required for a multilayer-based Progressive Fine Granular Scalability (PFGS) algorithm.
  • PFGS Progressive Fine Granular Scalability
  • Multimedia data requires a large capacity of storage media and a wide bandwidth for transmission since the amount of multimedia data is usually large. Accordingly, a compression coding method is a requisite for transmitting multimedia data including text, video, and audio.
  • a basic principle of data compression is removing data redundancy.
  • Data can be compressed by removing spatial redundancy, in which the same color or object is repeated in an image; temporal redundancy, in which there is little change between neighboring frames of a moving image or the same sound is repeated in audio; or psychovisual redundancy, which takes into account human vision's limited perception of high frequencies.
  • temporal redundancy is removed by temporal filtering based on motion compensation
  • spatial redundancy is removed by spatial transformation.
  • To transmit multimedia data, transmission media are required. Different types of transmission media for multimedia have different performance. Currently used transmission media have various transmission rates. For example, an ultrahigh-speed communication network can transmit data at several tens of megabits per second, while a mobile communication network has a transmission rate of 384 kilobits per second. To support transmission media having various speeds, or to transmit multimedia, data coding methods having scalability may be suitable for such a multimedia environment.
  • Scalability indicates the ability to partially decode a single compressed bitstream.
  • Scalability includes spatial scalability indicating a video resolution, Signal-to-Noise Ratio (SNR) scalability indicating a video quality level, and temporal scalability indicating a frame rate.
  • SNR Signal-to-Noise Ratio
  • Standardization work on multilayer scalability based on the H.264 Scalable Extension (hereinafter referred to as 'H.264 SE') is currently in progress by the Joint Video Team (JVT) of MPEG (Moving Picture Experts Group) and the ITU (International Telecommunication Union).
  • JVT Joint Video Team
  • MPEG Moving Picture Experts Group
  • ITU International Telecommunication Union
  • FGS Fine Granular Scalability
  • FIG. 1 is a diagram for explaining a conventional Fine Granular Scalability (FGS) technique.
  • An FGS-based codec performs coding by dividing a video bitstream into a base layer and an FGS layer.
  • A prime (') notation is used to denote a reconstructed image obtained after quantization/inverse quantization. More specifically, a block P_B, predicted from a block M_B' in a reconstructed left base layer frame 11 and a block N_B' in a reconstructed right base layer frame 13 using a motion vector, is subtracted from a block O in an original current frame 12 to obtain a difference block R_B.
  • The difference block R_B can be defined by Equation (1): R_B = O - P_B = O - (M_B' + N_B')/2.
  • The difference block R_B is quantized with a base layer quantization step size QP_B, giving R_B^Q, and then inversely quantized to obtain a reconstructed difference block R_B'.
  • In the FGS layer, a block Δ corresponding to the residual between the unquantized difference block R_B and the reconstructed difference block R_B' is quantized with a quantization step size QP_F smaller than the base layer quantization step size QP_B (the compression rate decreases as the quantization step size decreases).
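  • As a toy illustration of this two-layer refinement (a minimal sketch with synthetic values and plain uniform quantization, not the patent's actual codec), the following Python snippet codes a residual block coarsely in the base layer and then codes the leftover difference Δ with a finer step in the FGS layer:

```python
import numpy as np

def quantize(x, qp):
    """Uniform scalar quantization: divide by the step size and round."""
    return np.round(x / qp)

def dequantize(xq, qp):
    """Inverse quantization: multiply the indices back by the step size."""
    return xq * qp

# Hypothetical 4x4 base layer difference block R_B.
rng = np.random.default_rng(0)
R_B = rng.normal(0.0, 10.0, (4, 4))

QP_B, QP_F = 8.0, 2.0   # the FGS layer uses a finer step than the base layer

R_B_rec = dequantize(quantize(R_B, QP_B), QP_B)   # base layer reconstruction R_B'
delta = R_B - R_B_rec                             # refinement coded in the FGS layer
delta_rec = dequantize(quantize(delta, QP_F), QP_F)

# Adding the FGS refinement to the base reconstruction lowers the error.
print(np.abs(R_B - R_B_rec).mean())               # base-layer-only error
print(np.abs(R_B - (R_B_rec + delta_rec)).mean()) # error after FGS refinement
```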
  • FIG. 2 is a diagram for explaining a conventional progressive fine granular scalability (PFGS) technique.
  • PFGS progressive fine granular scalability
  • A PFGS technique exploits the fact that the quality of the left and right reference frames is also improved by the FGS technique. That is, the PFGS technique involves calculating a new difference block R_F using the newly updated left and right reference frames 21 and 23 and quantizing a residual between the new difference block R_F and the reconstructed base layer block R_B', thereby improving coding performance.
  • The new difference block R_F is defined by Equation (2): R_F = O - (M_F' + N_F')/2.
  • A PFGS technique has an advantage over an FGS technique in that the amount of data in the FGS layer can be reduced owing to the higher quality of the left and right reference frames. However, because the FGS layer also requires separate motion compensation, the amount of computations increases. That is, while PFGS improves performance over conventional FGS, it requires a large amount of computations because motion compensation is performed for each FGS layer to generate a predicted signal and a residual signal between the predicted signal and the original signal. Recently developed video codecs interpolate an image signal at 1/2 or 1/4 pixel accuracy for motion compensation. When motion compensation is performed at 1/4 pixel accuracy, an interpolated image four times the resolution of the original image in each dimension must be generated.

Disclosure of Invention
  • The H.264 SE technique uses a six-tap filter for 1/2-pixel interpolation, which involves considerable computational complexity and thus requires a large amount of computations for motion compensation. This complicates the encoding and decoding processes and requires more system resources. This drawback is most problematic in fields requiring real-time encoding and decoding, such as real-time broadcasting or video conferencing.
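  • To give a sense of the cost gap, the sketch below contrasts the H.264 half-pel six-tap filter (taps (1, -5, 20, 20, -5, 1)/32) with a bilinear filter on one row of made-up samples; it is illustrative only, not the standard's full interpolation process:

```python
import numpy as np

def half_pel_six_tap(row):
    """H.264-style six-tap half-pel interpolation along one row (interior samples)."""
    taps = np.array([1, -5, 20, 20, -5, 1], dtype=np.float64) / 32.0
    return np.convolve(row, taps, mode="valid")   # 6 multiply-adds per sample

def half_pel_bilinear(row):
    """Bilinear half-pel interpolation: average of the two neighboring pixels."""
    return (row[:-1] + row[1:]) / 2.0             # 1 add and 1 shift per sample

row = np.array([10, 12, 15, 20, 26, 33, 41, 50], dtype=np.float64)
print(half_pel_six_tap(row))
print(half_pel_bilinear(row))
```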
  • the present invention provides a method and apparatus for reducing an amount of computations required for motion compensation while maintaining the performance of a progressive fine granular scalability (PFGS) algorithm.
  • PFGS progressive fine granular scalability
  • a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, quantizing a residual between the current frame and the predicted image, inversely quantizing the quantized residual and generating a reconstructed image for the current frame, performing motion compensation on an FGS layer reference frame and a base layer reference frame using the estimated motion vector, calculating a residual between the motion-compensated FGS layer reference frame and the motion-compensated base layer reference frame, subtracting the reconstructed image for the current frame and the calculated residual from the current frame, and encoding the result of subtraction.
  • a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, quantizing a residual between the current frame and the predicted image, inversely quantizing the quantized residual, and generating a reconstructed image for the current frame, performing motion compensation on an FGS layer reference frame and a base layer reference frame using the estimated motion vector and generating a predicted frame for the FGS layer and a predicted frame for the base layer, respectively, calculating a residual between the predicted frame for the FGS layer and the predicted frame for the base layer, subtracting the reconstructed image and the residual from the current frame, and encoding the result of subtraction.
  • a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, quantizing a residual between the current frame and the predicted image, inversely quantizing the quantized residual, and generating a reconstructed image for the current frame, calculating a residual between an FGS layer reference frame and a base layer reference frame, performing motion compensation on the residual using the estimated motion vector, subtracting the reconstructed image and the motion- compensated result from the current frame, and encoding the result of subtraction.
  • a video encoding method supporting fine granular scalability including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, performing motion compensation on an FGS layer reference frame and a base layer reference frame using a motion vector with lower accuracy than that of the estimated motion vector, calculating a residual between the motion-compensated FGS layer reference frame and the motion-compensated base layer reference frame, subtracting the predicted image and the residual from the current frame, and encoding the result of subtraction.
  • FGS fine granular scalability
  • a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, performing motion compensation on an FGS layer reference frame and a base layer reference frame using a motion vector with lower accuracy than that of the estimated motion vector and generating a predicted frame for the FGS layer and a predicted frame for the base layer, respectively, calculating a residual between the predicted frame for the FGS layer and the predicted frame for the base layer, subtracting the predicted image and the calculated residual from the current frame, and encoding the result of subtraction.
  • a video encoding method supporting FGS including obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, calculating a residual between an FGS layer reference frame and a base layer reference frame, performing motion compensation on the residual using a motion vector with lower accuracy than that of the estimated motion vector, subtracting the reconstructed image and the motion-compensated result from the current frame, and encoding the result of subtraction.
  • a video decoding method supporting FGS including extracting base layer texture data and FGS layer texture data and motion vectors from an input bitstream, reconstructing a base layer frame from the base layer texture data, performing motion compensation on an FGS layer reference frame and a base layer reference frame using the motion vectors, calculating a residual between the motion- compensated FGS layer reference frame and the motion-compensated base layer reference frame, and adding together the base layer frame, the FGS layer texture data, and the residual.
  • an FGS-based video encoder including an element obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy, an element quantizing a residual between the current frame and the predicted image, inversely quantizing the quantized residual, and generating a reconstructed image for the current frame, an element performing motion compensation on an FGS layer reference frame and a base layer reference frame using the estimated motion vector, an element calculating a residual between the motion-compensated FGS layer reference frame and the motion-compensated base layer reference frame, an element subtracting the reconstructed image and the residual from the current frame, and an element encoding the result of subtraction.
  • an FGS-based video decoder including an element extracting base layer texture data, FGS layer texture data, and motion vectors from an input bitstream, an element reconstructing a base layer frame from the base layer texture data, an element performing motion compensation on an FGS layer reference frame and a base layer reference frame using the motion vectors and generating a predicted FGS layer frame and a predicted base layer frame, an element calculating a residual between the predicted FGS layer frame and the predicted base layer frame, and an element adding together the texture data, the reconstructed base layer frame, and the residual.
  • FIG. 1 is a diagram for explaining a conventional FGS technique
  • FIG. 2 is a diagram for explaining a conventional PFGS technique
  • FIG. 3 is a diagram illustrating fast progressive fine granular scalability (PFGS) according to an exemplary embodiment of the present invention
  • FIG. 4 is a block diagram of a video encoder according to an exemplary embodiment of the present invention.
  • FIG. 5 is a block diagram of a video encoder according to another exemplary embodiment of the present invention.
  • FIGS. 6 and 7 are block diagrams of video encoders according to a further exemplary embodiment of the present invention.
  • FIG. 8 is a block diagram of a video decoder according to an exemplary embodiment of the present invention.
  • FIG. 9 is a block diagram of a video decoder according to another exemplary embodiment of the present invention.
  • FIGS. 10 and 11 are block diagrams of video decoders according to a further exemplary embodiment of the present invention.
  • FIG. 12 is a block diagram of a system for performing an encoding or decoding process according to an exemplary embodiment of the present invention.
  • FIG. 3 is a diagram illustrating PFGS according to a first exemplary embodiment of the present invention.
  • Referring to FIG. 3, as in FIG. 2, the data Δ to be quantized in the FGS layer according to the PFGS algorithm is defined by Equation (3): Δ = R_F - R_B'.
  • R_F is defined by the above Equation (2), and R_B' is defined by Equation (4): R_B' = O' - (M_B' + N_B')/2.
  • O' is an image reconstructed by quantizing an original image O with a base layer quantization step size QP B and then inversely quantizing the quantized image.
  • Substituting Equations (2) and (4) into Equation (3) gives Equation (5): Δ = (O - O') - (M_F' + N_F')/2 + (M_B' + N_B')/2.
  • Δ_M and Δ_N denote the residuals between the left reference frames M_F' and M_B' and between the right reference frames N_F' and N_B', respectively, as given by Equation (6): Δ_M = M_F' - M_B', Δ_N = N_F' - N_B'.
  • By substituting Equation (6) into Equation (5), Δ can be defined by Equation (7): Δ = (O - O') - (Δ_M + Δ_N)/2.
  • In other words, an encoder can obtain Δ by subtracting, from the original image O, both the reconstructed base layer image O' (obtained by quantizing the original image O with the base layer quantization step size QP_B and then inversely quantizing it) and the average (Δ_M + Δ_N)/2 of the residuals between the base layer reference frames and the FGS layer reference frames.
  • Conversely, a decoder reconstructs the original image O by adding together the reconstructed base layer image O', Δ, and the average of the residuals between the base layer and FGS layer reference frames.
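  • A minimal NumPy sketch of this round trip under Equation (7), with synthetic frames and the FGS-layer quantization of Δ omitted so that the identity is exact:

```python
import numpy as np

rng = np.random.default_rng(1)
shape = (4, 4)
O = rng.normal(size=shape)                        # original current frame
O_rec = O + rng.normal(scale=0.1, size=shape)     # reconstructed base layer image O'
M_F, N_F = rng.normal(size=shape), rng.normal(size=shape)  # FGS layer references
M_B, N_B = M_F + 0.05, N_F - 0.05                 # base layer references

# Encoder side (Equation (7)): interlayer reference residuals, then delta.
d_M, d_N = M_F - M_B, N_F - N_B
delta = (O - O_rec) - (d_M + d_N) / 2.0

# Decoder side: the same residual average is available, so O is recovered.
O_dec = O_rec + delta + (d_M + d_N) / 2.0
assert np.allclose(O_dec, O)
```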
  • motion compensation is performed using a motion vector with one pixel or sub-pixel (1/2 pixel or 1/4 pixel) accuracy obtained by motion estimation.
  • motion estimation and compensation are typically performed according to various pixel accuracies such as half pixel accuracy, or quarter pixel accuracy.
  • a predicted image generated by motion compensation with, e.g., 1/4 pixel accuracy is packed into integer pixels.
  • quantization is performed on a residual between an original image and the predicted image.
  • Packing restores a reference image that was interpolated by a factor of four for motion estimation with 1/4 pixel accuracy to the original image size. For example, one of every four pixels may be selected during the packing process.
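  • A sketch of one plausible packing step, assuming the reference was interpolated by a factor of four in each dimension and the fractional part of the quarter-pel motion vector selects the sampling phase (the function and values are illustrative, not the exact procedure of any standard):

```python
import numpy as np

def pack_quarter_pel(ref4x, mv_qpel, out_h, out_w):
    """Select one of every four samples (per axis) from a 4x-interpolated
    reference, starting at the phase given by a quarter-pel motion vector."""
    dy, dx = mv_qpel            # motion vector components in quarter-pel units
    y0, x0 = dy % 4, dx % 4     # sub-pel phase picks the starting sample
    oy, ox = dy // 4, dx // 4   # integer-pel part of the displacement
    return ref4x[4 * oy + y0 : 4 * (oy + out_h) : 4,
                 4 * ox + x0 : 4 * (ox + out_w) : 4]

ref4x = np.arange(32 * 32, dtype=np.float64).reshape(32, 32)  # 4x-interpolated 8x8 frame
block = pack_quarter_pel(ref4x, mv_qpel=(1, 3), out_h=4, out_w=4)
print(block.shape)  # (4, 4): back on the original sampling grid
```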
  • The data Δ to be quantized in the FGS layer for fast PFGS according to the present invention need not be subjected to motion estimation with high pixel accuracy.
  • Motion estimation and compensation are applied only to the third term, (Δ_M + Δ_N)/2, on the right-hand side of Equation (7).
  • Because the third term represents interlayer residuals between reference frames, applying motion estimation and compensation to it with high pixel accuracy is not highly effective.
  • the fast PFGS allows lower pixel accuracy motion estimation and compensation than the conventional PFGS.
  • Δ in Equation (5) in the first exemplary embodiment can also be represented as a residual between the predicted signals P_F and P_B, as shown in Equation (8): Δ = (O - O') - (P_F - P_B).
  • Here, P_F and P_B are equal to (M_F' + N_F')/2 and (M_B' + N_B')/2, respectively.
  • the first and second exemplary embodiments are distinguished from each other as follows.
  • In the first exemplary embodiment, the residuals Δ_M and Δ_N between the FGS layer reference images and the base layer reference images are first calculated and then averaged.
  • In the second exemplary embodiment, the residual between the predicted FGS layer image P_F and the predicted base layer image P_B is calculated after the predicted images P_F and P_B in the two layers are generated. That is to say, although the fast PFGS algorithms according to the first and second exemplary embodiments are implemented in different ways, the same calculation result (Δ) is obtained.
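  • The equivalence follows from linearity: the average of the interlayer residuals equals the residual of the per-layer averages. A quick numerical check with random blocks (illustrative values only):

```python
import numpy as np

rng = np.random.default_rng(2)
M_F, N_F, M_B, N_B = (rng.normal(size=(4, 4)) for _ in range(4))

# First embodiment: average the interlayer residuals.
avg_of_residuals = ((M_F - M_B) + (N_F - N_B)) / 2.0

# Second embodiment: build the per-layer predictions, then take their residual.
P_F, P_B = (M_F + N_F) / 2.0, (M_B + N_B) / 2.0
residual_of_predictions = P_F - P_B

assert np.allclose(avg_of_residuals, residual_of_predictions)
```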
  • In the first and second exemplary embodiments, motion compensation is first performed and a residual between images is then calculated.
  • In the third exemplary embodiment, a residual between reference images in different layers may first be calculated, followed by motion compensation.
  • Here, boundary padding is the process of duplicating boundary pixels into the vicinity outside the frame, given that block matching at a frame boundary is otherwise restricted during motion estimation.
  • In this case, the residual Δ can be defined by Equation (9): Δ = (O - O') - [mc(M_F' - M_B') + mc(N_F' - N_B')]/2, where mc(·) denotes motion compensation of the bracketed residual.
  • the fast PFGS algorithms are used to calculate a residual between predicted images or predict a residual between reference images.
  • The fast PFGS performance of the present invention is only slightly affected by, or insensitive to, the interpolation used to increase the pixel accuracy of the motion vector.
  • quarter or half pixel interpolation may be skipped.
  • Alternatively, a bi-linear filter requiring a smaller amount of computations may be used instead of the computationally expensive half-pixel interpolation filter used in the H.264 standard.
  • A bi-linear filter may be applied to the third terms on the right-hand sides of Equations (7) through (9). This causes less degradation in performance than directly applying a bi-linear filter to the predicted signal used for obtaining R_F and R_B, as in a conventional PFGS algorithm.
  • The principle of the first through third exemplary embodiments of the present invention is based on Equation (3).
  • That is, implementation of these exemplary embodiments starts from the assumption that a residual between an FGS layer residual R_F and a base layer residual R_B is to be coded.
  • the above fast PFGS algorithms according to the first through third exemplary embodiments may rather degrade coding performance.
  • In such a case, coding only the residual obtained from the FGS layer, i.e., R_F in Equation (3), may offer better coding performance. That is, according to a fourth exemplary embodiment of the present invention, Equations (7) through (9) may be modified into Equations (10) through (12), respectively:
  • Equation (10): Δ = (O - P_B) - (Δ_M + Δ_N)/2
  • Equation (11): Δ = (O - P_B) - (P_F - P_B)
  • Equation (12): Δ = (O - P_B) - [mc(M_F' - M_B') + mc(N_F' - N_B')]/2
  • In Equations (10) through (12), the reconstructed base layer image O' is replaced with the predicted image P_B for the base layer image.
  • Likewise, interpolation may not be applied to the third terms on the right-hand sides of Equations (10) through (12), or a bi-linear filter requiring a smaller amount of computations may be used for the interpolation.
  • Note that the predicted image P_B occurring twice in Equation (11) is not necessarily the same one.
  • An estimated motion vector may be used during motion compensation to generate the predicted image P B in a second term.
  • A motion vector with lower accuracy than the estimated motion vector, or a filter requiring a small amount of computations, may be used during motion compensation to generate P_F and P_B in the third term.
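  • To make the distinction concrete, here is a toy sketch in which the second-term P_B of Equation (11) is built with the full-accuracy motion vector while the third-term P_F and P_B use a coarser one; integer shifts stand in for motion compensation, and all values are hypothetical:

```python
import numpy as np

def predict(m_ref, n_ref, shift):
    """Toy bidirectional prediction: average two references displaced by an
    integer shift (a stand-in for motion compensation at some MV accuracy)."""
    return (np.roll(m_ref, shift, axis=1) + np.roll(n_ref, shift, axis=1)) / 2.0

rng = np.random.default_rng(5)
O = rng.normal(size=(4, 8))
M_B, N_B = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
M_F, N_F = M_B + 0.05, N_B - 0.05   # FGS references: refined base references

# Second term of Equation (11): P_B from the full-accuracy motion vector.
P_B_full = predict(M_B, N_B, shift=2)

# Third term: P_F and P_B from a lower-accuracy (truncated) motion vector.
P_F_low = predict(M_F, N_F, shift=1)
P_B_low = predict(M_B, N_B, shift=1)

delta = (O - P_B_full) - (P_F_low - P_B_low)   # Equation (11), mixed accuracies
print(delta.shape)
```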
  • The drift error can be reduced by a leaky prediction method, which uses a predicted image created as a weighted sum of the predicted image obtained from the FGS layer reference frames and the predicted image obtained from the base layer.
  • In this case, the value being coded in the FGS layer is expressed by Equation (13): Δ = O - [α × P_F + (1 - α) × P_B], where α (0 ≤ α ≤ 1) is a weighting factor.
  • Equation (13) can be rearranged into Equation (14) according to a fifth exemplary embodiment of the present invention: Δ = (O - P_B) - α × (P_F - P_B).
  • Compared with Equation (11), the weighting factor α is simply applied to the residual (P_F - P_B) between the predicted images.
  • Thus, the present invention can also be applied to a leaky prediction method. That is, interpolation of the residual (P_F - P_B) may be skipped, or it may be performed using a bi-linear filter requiring a smaller amount of computations. In the latter case, the result of the interpolation is multiplied by the weighting factor α.
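  • A minimal sketch of Equation (14), with a hypothetical helper and synthetic inputs, showing how the weighting factor α trades FGS-layer coding gain against drift robustness:

```python
import numpy as np

def fgs_residual_leaky(O, P_F, P_B, alpha):
    """Equation (14): code (O - P_B) - alpha * (P_F - P_B) in the FGS layer.
    alpha = 0 relies on the base layer prediction only (no drift);
    alpha = 1 fully trusts the FGS layer references (maximum drift risk)."""
    return (O - P_B) - alpha * (P_F - P_B)

rng = np.random.default_rng(3)
O, P_F, P_B = (rng.normal(size=(4, 4)) for _ in range(3))
for alpha in (0.0, 0.5, 1.0):
    print(alpha, np.abs(fgs_residual_leaky(O, P_F, P_B, alpha)).mean())
```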
  • FIG. 4 is a block diagram of a video encoder 100 according to a first exemplary embodiment of the present invention.
  • While each block has been described as the basic unit of motion estimation with reference to FIGS. 1 through 3, the fast PFGS that follows will be described with regard to the frame containing the blocks.
  • The identifier of a block is indicated as a subscript to an 'F' denoting a frame.
  • For example, a frame containing a block labeled R_B is denoted by F_RB.
  • As before, a prime (') notation is used to denote reconstructed data obtained after quantization/inverse quantization.
  • A current frame F_O is fed into a motion estimator 105, a subtracter 115, and a residual calculator 170.
  • the motion estimator 105 performs motion estimation on the current frame F using neighboring frames to obtain motion vectors MVs.
  • the neighboring frames that are referred to during motion estimation are hereinafter called 'reference frames'.
  • a block matching algorithm (BMA) is commonly used to estimate the motion of a given block.
  • BMA block matching algorithm
  • a given block is moved within a search area in a reference frame at pixel or sub-pixel accuracy and a displacement with a minimum error is determined as a motion vector.
  • HVSBM hierarchical variable size block matching
  • When motion estimation is performed at sub-pixel accuracy, reference frames need to be upsampled or interpolated to a predetermined resolution. For example, when motion estimation is performed at 1/2 or 1/4 pixel accuracy, reference frames must be upsampled or interpolated by a factor of two or four, respectively.
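  • For reference, a bare-bones full-search block matching routine at integer-pel accuracy (illustrative only; practical codecs add sub-pel refinement and variants such as HVSBM):

```python
import numpy as np

def block_match(cur_block, ref, cy, cx, search=4):
    """Full search: slide the block over a +/-`search` window centered at
    (cy, cx) in the reference and return the displacement with minimum SAD."""
    h, w = cur_block.shape
    best, best_sad = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = cy + dy, cx + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue  # candidate block falls outside the frame
            sad = np.abs(cur_block - ref[y:y + h, x:x + w]).sum()
            if sad < best_sad:
                best_sad, best = sad, (dy, dx)
    return best, best_sad

rng = np.random.default_rng(4)
ref = rng.integers(0, 255, (32, 32)).astype(np.float64)
cur = ref[10:18, 13:21].copy()        # the block at (8, 10) moved by (2, 3)
print(block_match(cur, ref, cy=8, cx=10))  # ((2, 3), 0.0)
```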
  • the motion vectors MVs calculated by the motion estimator 105 are provided to a motion compensator 110.
  • The motion compensator 110 performs motion compensation on the reference frames F_MB' and F_NB' using the motion vectors MVs and generates a predicted frame F_PB for the current frame.
  • the predicted image can be calculated as an average of motion-compensated reference frames.
  • the predicted image may be the same as the motion-compensated reference frame. While it is assumed hereinafter that motion estimation and compensation use bidirectional reference frames, it will be apparent to those skilled in the art that the present invention may use a unidirectional reference frame.
  • The subtracter 115 calculates a residual F_RB between the predicted image and the current image and transmits it to a transformer 120.
  • The transformer 120 performs a spatial transform on the residual F_RB to create transform coefficients.
  • the spatial transform method may include a discrete cosine transform (DCT), or wavelet transform. Specifically, DCT coefficients may be created in a case where DCT is employed, and wavelet coefficients may be created in a case where wavelet transform is employed.
  • A quantizer 125 applies quantization to the transform coefficients.
  • Quantization is the process of expressing the transform coefficients, which take arbitrary real values, as discrete values and matching those discrete values to indices according to a predetermined quantization table.
  • For example, the quantizer 125 may divide the real-valued transform coefficient by a predetermined quantization step size and round the resulting value to the nearest integer.
  • the quantization step size of a base layer is greater than that of an FGS layer.
  • Inverse quantization restores the values matched to the indices generated during quantization, using the same quantization step size used in the quantization.
  • An inverse transformer 135 receives the inverse quantization result and performs an inverse transform on the received result.
  • Inverse spatial transform may be, for example, inverse DCT or inverse wavelet transform, performed in a reverse order to that of transformation performed by the transformer 120.
  • An adder 140 adds the inversely transformed result to the predicted image F_PB obtained from the motion compensator 110 in order to generate a reconstructed image F_O' for the current frame.
  • a buffer 145 stores the addition result received from the adder 140.
  • The buffer 145 stores the reconstructed image F_O' for the current frame as well as the previously reconstructed base layer reference frames F_MB' and F_NB'.
  • A motion vector modifier 155 lowers the accuracy of the received motion vector MV.
  • For example, the fractional part of a motion vector MV with 1/4 pixel accuracy may have a value of 0, 0.25, 0.5, or 0.75.
  • The motion vector modifier 155 changes the motion vector MV with 1/4 pixel accuracy into a motion vector with lower pixel accuracy, such as 1/2 pixel or 1 pixel accuracy.
  • Such a change can be performed by simply truncating or rounding off the finer fractional part of the original motion vector.
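  • A sketch of this accuracy reduction, representing motion vector components in quarter-pel units and truncating the finer fractional part (rounding would work analogously):

```python
def lower_mv_accuracy(mv_qpel, target="half"):
    """Reduce a quarter-pel motion vector component (in quarter-pel units)
    to half- or full-pel accuracy by truncating the finer fraction.
    Floor division truncates toward minus infinity for negative components."""
    step = 2 if target == "half" else 4   # quarter-pel units per target unit
    return (mv_qpel // step) * step

# 7 quarter-pel units = 1.75 pixels -> 1.5 px at half-pel, 1 px at full-pel.
print(lower_mv_accuracy(7, "half"), lower_mv_accuracy(7, "full"))  # 6 4
```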
  • A buffer 165 temporarily stores the FGS layer reference frames F_MF' and F_NF'.
  • Reconstructed FGS layer frames F_MF' and F_NF', or original frames adjacent to the current frame, may be used as the FGS layer reference frames.
  • The motion compensator 160 uses the modified motion vector MV to perform motion compensation on the reconstructed base layer reference frames F_MB' and F_NB' received from the buffer 145 and on the reconstructed FGS layer reference frames F_MF' and F_NF' received from the buffer 165, and provides the motion-compensated frames mc(F_MB'), mc(F_NB'), mc(F_MF'), and mc(F_NF') to the residual calculator 170.
  • Here, F_MF' and F_NF' denote the forward and backward reference frames in the FGS layer, respectively, while F_MB' and F_NB' denote the forward and backward reference frames in the base layer, respectively.
  • The motion compensator 160 may use a different type of interpolation filter from that used by the motion estimator 105 or the motion compensator 110.
  • For example, a bi-linear filter requiring a small amount of computations may be used for interpolation instead of the six-tap filter used in the H.264 standard. Because the residual between a motion-compensated base layer frame and a motion-compensated FGS layer frame is calculated after the interpolation, the interpolation process has little effect on compression efficiency.
  • The residual calculator 170 calculates the residuals between the motion-compensated FGS layer reference frames and the motion-compensated base layer reference frames:
  • Δ_M = mc(F_MF') - mc(F_MB')
  • Δ_N = mc(F_NF') - mc(F_NB')
  • The residual calculator 170 then calculates the average of the residuals Δ_M and Δ_N and subtracts the reconstructed image F_O' and that average from the current frame F_O.
  • When a unidirectional reference frame is used, the process of calculating the average is not required.
  • The subtraction result F_Δ obtained by the residual calculator 170 is subjected to a spatial transform by a transformer 175 and then quantized by a quantizer 180.
  • The quantized result F_Δ^Q is transmitted to an entropy coding unit 150.
  • the quantization step size used in the quantizer 180 is typically less than that used in the quantizer 125.
  • The entropy coding unit 150 losslessly encodes the motion vector MV estimated by the motion estimator 105, the quantization coefficient F_RB^Q received from the quantizer 125, and the quantized result F_Δ^Q received from the quantizer 180 into a bitstream.
  • Lossless coding methods include arithmetic coding, variable length coding, and the like.
  • A video encoder according to a second exemplary embodiment of the present invention may have the same configuration and operation as the video encoder 100 shown in FIG. 4, except for the residual calculator.
  • the residual calculator generates a predicted frame for each layer before calculating a residual between frames in different layers.
  • the residual calculator generates the predicted FGS layer frame and a predicted base layer frame using a motion-compensated FGS layer reference frame and a motion-compensated base layer reference frame.
  • the predicted frame can be calculated by simply averaging the two motion-compensated reference frames.
  • the motion-compensated frame may be the predicted frame itself.
  • the residual calculator then calculates a residual between the predicted frames, and subtracts a reconstructed image and the calculated residual from a current frame.
  • FIG. 5 is a block diagram of a video encoder 300 according to a third exemplary embodiment of the present invention.
  • The illustrated video encoder 300 performs motion compensation after calculating the residual between the reference frames in the two layers. To avoid repetitive explanation, the following description will focus on the features that distinguish it from the first exemplary embodiment.
  • A subtracter 390 subtracts the reconstructed base layer reference frames F_MB' and F_NB', which are received from a buffer 345, from the FGS layer reference frames F_MF' and F_NF', which are received from a buffer 365, and provides the subtraction results F_MF' - F_MB' and F_NF' - F_NB' to a motion compensator 360.
  • When a unidirectional reference frame is used, only one residual exists.
  • The motion compensator 360 uses the modified motion vector MV received from a motion vector modifier 355 to perform motion compensation on the residuals F_MF' - F_MB' and F_NF' - F_NB'.
  • FIGS. 6 and 7 are block diagrams of examples of video encoders 400 and 600 according to a fourth exemplary embodiment of the present invention. Referring first to FIG. 6, unlike in the first exemplary embodiment shown in FIG. 4, a residual calculator 470 in the video encoder 400 subtracts a predicted base layer frame F_PB, instead of the reconstructed base layer frame F_O', from the current frame F_O.
  • the video encoders 400 and 600 according to the fourth exemplary embodiment shown in FIGS. 6 and 7 correspond to FIGS. 4 and 5 illustrating video encoders 100 and 300 according to the first and third exemplary embodiments.
  • That is, the residual calculator 470 subtracts the predicted base layer image F_PB received from a motion compensator 410, instead of the reconstructed base layer image F_O', from the current frame.
  • The residual calculator 470 subtracts the predicted image F_PB and the average of the residuals Δ_M and Δ_N from the current frame F_O to obtain the subtraction result F_Δ.
  • Likewise, referring to FIG. 7, the residual calculator 670 subtracts the predicted image F_PB and the average of the motion-compensated residuals mc(F_MF' - F_MB') and mc(F_NF' - F_NB') from the current frame F_O.
  • An example of a video encoder according to the fourth exemplary embodiment corresponding to the second exemplary embodiment may have the same configuration and perform the same operation as shown in FIG. 6, except for the operation of the residual calculator 470.
  • In this case, the residual calculator 470 generates a predicted FGS layer frame F_PF and a predicted base layer frame F_PB using the motion-compensated FGS layer reference frames mc(F_MF') and mc(F_NF') and the motion-compensated base layer reference frames mc(F_MB') and mc(F_NB'), respectively.
  • The residual calculator 470 also calculates the residual F_PF - F_PB between the predicted frames F_PF and F_PB and subtracts the reconstructed image F_O' and the residual F_PF - F_PB from the current frame F_O to obtain the subtraction result F_Δ.
  • When leaky prediction is applied, the residual calculator 470 multiplies the weighting factor α by the residual F_PF - F_PB and subtracts the reconstructed image F_O' and the product α × (F_PF - F_PB) from the current frame F_O to obtain the subtraction result F_Δ.
  • FIG. 8 is a block diagram of a video decoder 700 according to a first exemplary embodiment of the present invention.
  • An entropy decoding unit 701 losslessly decodes an input bitstream to extract base layer texture data F_RB^Q, FGS layer texture data F_Δ^Q, and motion vectors MVs.
  • the lossless decoding is an inverse process of lossless encoding.
  • The base layer texture data F_RB^Q and the FGS layer texture data F_Δ^Q are provided to inverse quantizers 705 and 745, respectively, and the motion vectors MVs are provided to a motion compensator 720 and a motion vector modifier 730.
  • The inverse quantizer 705 applies inverse quantization to the base layer texture data F_RB^Q received from the entropy decoding unit 701.
  • The inverse quantization restores the values matched to the indices generated during quantization, using the predetermined quantization step size that was used in the quantization.
  • An inverse transformer 710 performs inverse transform on the inverse quantized result.
  • The inverse transformation is performed in a reverse order to that of the transformation performed by the transformer at the encoder. Specifically, inverse DCT or inverse wavelet transform may be used.
  • The reconstructed residual F_RB' is provided to an adder 715.
  • The motion compensator 720 performs motion compensation on the previously reconstructed base layer reference frames F_MB' and F_NB' stored in a buffer 725, using the extracted motion vectors MVs, to generate a predicted image F_PB, which is then sent to the adder 715.
  • When bidirectional prediction is used, the predicted image F_PB is calculated by averaging the motion-compensated reference frames.
  • When unidirectional prediction is used, the predicted image F_PB is the motion-compensated reference frame itself.
  • The adder 715 adds together the inputs F_RB' and F_PB to output a reconstructed base layer image F_O', which is then stored in the buffer 725.
  • An inverse quantizer 745 applies inverse quantization to the FGS layer texture data F_Δ^Q, and an inverse transformer 750 performs an inverse transform on the inversely quantized result to obtain a reconstructed residual F_Δ', which is then provided to a frame reconstructor 755.
  • The motion vector modifier 730 lowers the accuracy of the extracted motion vector MV.
  • For example, the fractional part of a motion vector MV with 1/4 pixel accuracy may have a value of 0, 0.25, 0.5, or 0.75.
  • The motion vector modifier 730 changes the motion vector MV with 1/4 pixel accuracy into a motion vector with lower pixel accuracy, such as 1/2 pixel or 1 pixel accuracy.
  • A motion compensator 735 uses the modified motion vector MV to perform motion compensation on the reconstructed base layer reference frames F_MB' and F_NB' received from the buffer 725 and on the reconstructed FGS layer reference frames F_MF' and F_NF' received from a buffer 740.
  • The frame reconstructor 755 calculates the residuals between the motion-compensated FGS layer and base layer reference frames, Δ_M = mc(F_MF') - mc(F_MB') and Δ_N = mc(F_NF') - mc(F_NB'), and adds together F_Δ' received from the inverse transformer 750, the reconstructed base layer image F_O', and the average of Δ_M and Δ_N to obtain a reconstructed FGS layer image F_OF'.
  • The buffer 740 then stores the reconstructed image F_OF'.
  • The previously reconstructed images F_MF' and F_NF' are also stored in the buffer 740.
  • A video decoder according to the second exemplary embodiment may have the same configuration and perform the same operation as shown in FIG. 8, except for the operation of the frame reconstructor. That is, the frame reconstructor according to the second exemplary embodiment generates a predicted frame for each layer before calculating a residual between frames in the two layers. That is to say, the frame reconstructor generates a predicted FGS layer frame and a predicted base layer frame using the motion-compensated FGS layer reference frames and the motion-compensated base layer reference frames. The predicted frames can be generated by simply averaging the two motion-compensated reference frames. Of course, when unidirectional prediction is used, the predicted frame is the motion-compensated frame itself.
  • the frame reconstructor then calculates a residual between the predicted frames, and adds together the texture data, the reconstructed base layer frame, and the residual.
  • FIG. 9 is a block diagram of a video decoder 900 according to a third exemplary embodiment of the present invention.
  • the video decoder 900 performs motion compensation after calculating a residual between the reference frames in the two layers.
  • To avoid repetitive explanation, the following description will focus on the features that distinguish it from the first exemplary embodiment shown in FIG. 8.
  • A subtracter 960 subtracts the reconstructed base layer reference frames F_MB' and F_NB' received from a buffer 925 from the FGS layer reference frames F_MF' and F_NF' and provides the subtraction results F_MF' - F_MB' and F_NF' - F_NB' to a motion compensator 935.
  • The motion compensator 935 uses the modified motion vector MV received from a motion vector modifier 930 to perform motion compensation on the residuals F_MF' - F_MB' and F_NF' - F_NB'.
  • The frame reconstructor 955 calculates the average of the motion-compensated residuals mc(F_MF' - F_MB') and mc(F_NF' - F_NB') and adds together the calculated average, F_Δ' received from an inverse transformer 950, and the reconstructed base layer image F_O'.
  • When unidirectional prediction is used, the averaging process is not required.
  • FIGS. 10 and 11 are block diagrams of examples of video decoders 1000 and 1200 according to a fourth exemplary embodiment of the present invention.
  • In FIGS. 10 and 11, frame reconstructors 1055 and 1255 add a predicted base layer frame F_PB instead of the reconstructed base layer frame F_O'.
  • the video decoders 1000 and 1200 according to the fourth exemplary embodiment shown in FIGS. 10 and 11 correspond to those according to the first and third exemplary embodiments shown in FIGS. 8 and 9, respectively.
  • The motion compensator 1020 provides the predicted base layer image F_PB to the frame reconstructor 1055 instead of the reconstructed image F_O'.
  • The frame reconstructor 1055 adds together F_Δ' received from the inverse transformer 1050, the predicted base layer image F_PB, and the average of the interlayer residuals Δ_M and Δ_N to obtain a reconstructed FGS layer image F_OF'.
  • Referring to FIG. 11, the frame reconstructor 1255 adds together F_Δ' received from an inverse transformer 1250, the predicted base layer image F_PB received from a motion compensator 1220, and the average of the motion-compensated residuals mc(F_MF' - F_MB') and mc(F_NF' - F_NB') to obtain a reconstructed FGS layer image F_OF'.
  • A video decoder according to the fourth exemplary embodiment corresponding to the second exemplary embodiment may have the same configuration and perform the same operation as shown in FIG. 10, except for the operation of the frame reconstructor 1255.
  • In this case, the frame reconstructor 1255 generates a predicted FGS layer frame F_PF and a predicted base layer frame F_PB using the motion-compensated FGS layer reference frames mc(F_MF') and mc(F_NF') and the motion-compensated base layer reference frames mc(F_MB') and mc(F_NB').
  • The frame reconstructor 1255 also calculates the residual F_PF - F_PB between the predicted FGS layer frame F_PF and the predicted base layer frame F_PB and adds together F_Δ' received from the inverse transformer 1250, the predicted image F_PB received from the motion compensator 1220, and the residual F_PF - F_PB to obtain the reconstructed image F_OF'.
  • When leaky prediction (the fifth exemplary embodiment) is applied, the frame reconstructor 1255 multiplies the weighting factor α by the interlayer residual F_PF - F_PB and adds together F_Δ', F_O', and the product α × (F_PF - F_PB) to obtain F_OF'.
  • FIG. 12 is a block diagram of a system for performing an encoding or decoding process using a video encoder 100, 300, 400, 600 or a video decoder 700, 900, 1000, 1200, according to an exemplary embodiment of the present invention.
  • The system may be a TV, a set-top box (STB), a desktop, laptop, or palmtop computer, a personal digital assistant (PDA), or a video or image storage device (e.g., a video cassette recorder (VCR) or a digital video recorder (DVR)).
  • The system may also be a combination of the devices listed above, or one such device incorporating part of another.
  • the system includes at least one video source 1310, at least one input/output unit 1320, a processor 1340, a memory 1350, and a display unit 1330.
  • the video source 1310 may be a TV receiver, a VCR, or other video storing apparatus.
  • Alternatively, the video source 1310 may indicate at least one network connection for receiving video or images from a server over the Internet, a wide area network (WAN), a local area network (LAN), a terrestrial broadcast system, a cable network, a satellite communication network, a wireless network, a telephone network, or the like.
  • The video source 1310 may also be a combination of such networks, or one network including part of another.
  • the input/output device 1320, the processor 1340, and the memory 1350 communicate with one another through a communication medium 1360.
  • the communication medium 1360 may be a communication bus, a communication network, or at least one internal connection circuit.
  • Input video data received from the video source 1310 can be processed by the processor 1340 according to at least one software program stored in the memory 1350, and executed by the processor 1340 to generate an output video provided to the display unit 1330.
  • the software program stored in the memory 1350 includes a scalable wavelet-based codec performing a method of the present invention.
  • the codec may be stored in the memory 1350, may be read from a storage medium such as a compact disc-read only memory (CD-ROM) or a floppy disc, or may be downloaded from a predetermined server through a variety of networks.
  • As described above, the present invention provides video coding that can significantly reduce the amount of computations required to implement a PFGS algorithm. Since the decoding process is modified according to the video coding process of the present invention, the present invention can be applied to the H.264 SE standard. While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A method for reducing the amount of computations required for a multilayer-based progressive fine granular scalability (PFGS) algorithm, and a video coding method and apparatus employing the same, are provided. The video coding method supporting fine granular scalability (FGS) includes obtaining a predicted image for a current frame using a motion vector estimated at predetermined accuracy; quantizing a residual between the current frame and the predicted image, inversely quantizing the quantized residual, and generating a reconstructed image for the current frame; performing motion compensation on an FGS layer reference frame and a base layer reference frame using the estimated motion vector; calculating a residual between the motion-compensated FGS layer reference frame and the motion-compensated base layer reference frame; subtracting the reconstructed image for the current frame and the calculated residual from the current frame; and encoding the result of the subtraction.
PCT/KR2006/001471 2005-04-29 2006-04-20 Video coding method and apparatus supporting fast fine granular scalability WO2006118383A1 (fr)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP06747382A EP1878261A1 (fr) 2005-04-29 2006-04-20 Procede et dispositif de codage video permettant l'echelonnement rapide a granularite fine
BRPI0611142-4A BRPI0611142A2 (pt) 2005-04-29 2006-04-20 método de codificação de vìdeo com suporte para escalabilidade de granularidade fina, codificador de vìdeo baseado em escabilidade de granularidade fina
CA002609648A CA2609648A1 (fr) 2005-04-29 2006-04-20 Procede et dispositif de codage video permettant l'echelonnement rapide a granularite fine
JP2008508745A JP2008539646A (ja) 2005-04-29 2006-04-20 高速fgsを提供するビデオコーディング方法及び装置
AU2006241637A AU2006241637A1 (en) 2005-04-29 2006-04-20 Video coding method and apparatus supporting fast fine granular scalability

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US67592105P 2005-04-29 2005-04-29
US60/675,921 2005-04-29
KR1020050052428A KR100703778B1 (ko) 2005-04-29 2005-06-17 고속 fgs를 지원하는 비디오 코딩 방법 및 장치
KR10-2005-0052428 2005-06-17

Publications (1)

Publication Number Publication Date
WO2006118383A1 true WO2006118383A1 (fr) 2006-11-09

Family

ID=37308152

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2006/001471 WO2006118383A1 (fr) 2005-04-29 2006-04-20 Procede et dispositif de codage video permettant l'echelonnement rapide a granularite fine

Country Status (6)

Country Link
EP (1) EP1878261A1 (fr)
JP (1) JP2008539646A (fr)
AU (1) AU2006241637A1 (fr)
CA (1) CA2609648A1 (fr)
RU (1) RU2340115C1 (fr)
WO (1) WO2006118383A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9369728B2 (en) 2012-09-28 2016-06-14 Sharp Kabushiki Kaisha Image decoding device and image encoding device
US9386322B2 (en) 2007-07-02 2016-07-05 Nippon Telegraph And Telephone Corporation Scalable video encoding method and decoding method, apparatuses therefor, programs therefor, and storage media which store the programs
US20220217372A1 (en) * 2019-03-20 2022-07-07 V-Nova International Limited Modified upsampling for video coding technology
WO2023087159A1 (fr) * 2021-11-16 2023-05-25 广东博华超高清创新中心有限公司 Procédé de génération de données de capteur de vision dynamique basé sur un codage d'estimation du mouvement avs

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101624649B1 (ko) * 2009-08-14 2016-05-26 삼성전자주식회사 계층적인 부호화 블록 패턴 정보를 이용한 비디오 부호화 방법 및 장치, 비디오 복호화 방법 및 장치
JP5381571B2 (ja) * 2009-09-29 2014-01-08 株式会社Jvcケンウッド 画像符号化装置、画像復号化装置、画像符号化方法、及び画像復号化方法
CN103329532B (zh) 2011-03-10 2016-10-26 日本电信电话株式会社 量子化控制装置和方法以及量子化控制程序
WO2013145642A1 (fr) * 2012-03-28 2013-10-03 株式会社Jvcケンウッド Dispositif de codage d'image, procédé de codage d'image et programme de codage d'image ; dispositif de transmission, procédé de transmission et programme de transmission ; dispositif de décodage d'image, procédé de décodage d'image et programme de décodage d'image ; et dispositif de réception, procédé de réception et programme de réception
GB2501535A (en) * 2012-04-26 2013-10-30 Sony Corp Chrominance Processing in High Efficiency Video Codecs

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6614936B1 (en) * 1999-12-03 2003-09-02 Microsoft Corporation System and method for robust video coding using progressive fine-granularity scalable (PFGS) coding

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6614936B1 (en) * 1999-12-03 2003-09-02 Microsoft Corporation System and method for robust video coding using progressive fine-granularity scalable (PFGS) coding

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DING G.G. AND GUO B.L.: "Improvements to progressive fine granularity scalable video coding", PROC. FIFTH INT'L CONF. COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS (ICCIMA), 27 September 2003 (2003-09-27), pages 249 - 253, XP010661660 *
V.D. SCHAAR M. AND RADHA H.: "Adaptive motion-compensation fine-granular-scalability (AMC-FGS) for wireless video", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, vol. 12, no. 6, June 2002 (2002-06-01), pages 360 - 371, XP001059187 *
WANG Q. ET AL.: "Optimal rate allocation for progressive fine granularity scalable video coding", IEEE SIGNAL PROCESSING LETTERS, vol. 9, no. 2, February 2002 (2002-02-01), pages 33 - 39, XP001059186 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9386322B2 (en) 2007-07-02 2016-07-05 Nippon Telegraph And Telephone Corporation Scalable video encoding method and decoding method, apparatuses therefor, programs therefor, and storage media which store the programs
US9369728B2 (en) 2012-09-28 2016-06-14 Sharp Kabushiki Kaisha Image decoding device and image encoding device
JPWO2014050948A1 (ja) * 2012-09-28 2016-08-22 シャープ株式会社 画像復号装置、画像復号方法及び画像符号化装置
JP2017055458A (ja) * 2012-09-28 2017-03-16 シャープ株式会社 画像復号装置、画像復号方法、画像符号化装置及び画像符号化方法
US20220217372A1 (en) * 2019-03-20 2022-07-07 V-Nova International Limited Modified upsampling for video coding technology
US12177468B2 (en) * 2019-03-20 2024-12-24 V-Nova International Limited Modified upsampling for video coding technology
WO2023087159A1 (fr) * 2021-11-16 2023-05-25 广东博华超高清创新中心有限公司 Procédé de génération de données de capteur de vision dynamique basé sur un codage d'estimation du mouvement avs

Also Published As

Publication number Publication date
RU2340115C1 (ru) 2008-11-27
JP2008539646A (ja) 2008-11-13
CA2609648A1 (fr) 2006-11-09
EP1878261A1 (fr) 2008-01-16
AU2006241637A1 (en) 2006-11-09

Similar Documents

Publication Publication Date Title
US20060245495A1 (en) Video coding method and apparatus supporting fast fine granular scalability
US8817872B2 (en) Method and apparatus for encoding/decoding multi-layer video using weighted prediction
US20070047644A1 (en) Method for enhancing performance of residual prediction and video encoder and decoder using the same
CN101208958B (zh) 使用基于多层的加权预测的视频编码方法和装置
KR100703760B1 (ko) 시간적 레벨간 모션 벡터 예측을 이용한 비디오인코딩/디코딩 방법 및 장치
KR100703788B1 (ko) 스무딩 예측을 이용한 다계층 기반의 비디오 인코딩 방법,디코딩 방법, 비디오 인코더 및 비디오 디코더
US20060165302A1 (en) Method of multi-layer based scalable video encoding and decoding and apparatus for the same
US20060120448A1 (en) Method and apparatus for encoding/decoding multi-layer video using DCT upsampling
WO2006118383A1 (fr) Procede et dispositif de codage video permettant l'echelonnement rapide a granularite fine
KR20060135992A (ko) 다계층 기반의 가중 예측을 이용한 비디오 코딩 방법 및장치
EP1782631A1 (fr) Procede et appareil de decodage prealable et de decodage d'un train de bits comprenant une couche de base
KR100763179B1 (ko) 비동기 픽쳐의 모션 벡터를 압축/복원하는 방법 및 그방법을 이용한 장치
US20070160143A1 (en) Motion vector compression method, video encoder, and video decoder using the method
EP1878252A1 (fr) Procede et appareil destine a coder/decoder une video a couches multiples en utilisant une prediction ponderee
WO2007024106A1 (fr) Procede permettant d'ameliorer le rendement de la prediction residuelle, codeur et decodeur video utilisant ledit procede
KR20050012755A (ko) 더 높은 질의 참조 프레임들을 이용하는 향상된 효율의미세 입상 계위 시간 프레임워크
WO2006132509A1 (fr) Procede de codage video fonde sur des couches multiples, procede de decodage, codeur video, et decodeur video utilisant une prevision de lissage
WO2006078109A1 (fr) Procede et dispositif d'encodage et decodage video echelonnable multicouche
US20060088100A1 (en) Video coding method and apparatus supporting temporal scalability
WO2006104357A1 (fr) Procede pour la compression/decompression des vecteurs de mouvement d'une image non synchronisee et appareil utilisant ce procede
WO2006098586A1 (fr) Procede et dispositif de codage/decodage video utilisant une prediction de mouvement entre des niveaux temporels
WO2006043754A1 (fr) Procede de video codage et appareil prenant en charge une extensibilite temporelle

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200680019114.4

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application
DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2006241637

Country of ref document: AU

Ref document number: 2006747382

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2609648

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 2007139817

Country of ref document: RU

ENP Entry into the national phase

Ref document number: 2008508745

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

WWP Wipo information: published in national office

Ref document number: 2006241637

Country of ref document: AU

WWE Wipo information: entry into national phase

Ref document number: 1968/MUMNP/2007

Country of ref document: IN

WWP Wipo information: published in national office

Ref document number: 2006747382

Country of ref document: EP

ENP Entry into the national phase

Ref document number: PI0611142

Country of ref document: BR

Kind code of ref document: A2

点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载