US20040105586A1 - Method and apparatus for estimating and controlling the number of bits output from a video coder - Google Patents
- Publication number
- US20040105586A1 US20040105586A1 US10/617,625 US61762503A US2004105586A1 US 20040105586 A1 US20040105586 A1 US 20040105586A1 US 61762503 A US61762503 A US 61762503A US 2004105586 A1 US2004105586 A1 US 2004105586A1
- Authority
- US
- United States
- Prior art keywords
- blocks
- bits
- group
- groups
- luminance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/152—Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
- H04N19/126—Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/149—Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- the present invention relates to signal processing, and in particular, to a method and apparatus for estimating and controlling the number of bits output from a video coder.
- MPEG-1 defines a bitstream for compressed video and audio optimized to fit into a bandwidth of 1.5 Mbits/sec. This rate is special because it is the data rate of uncompressed audio CDs and DATs.
- MPEG-1 is defined to begin with a relatively low-resolution video sequence of about 352×240 pixels at 30 frames/sec., but uses original high (CD) quality audio.
- the images are in color, but are converted into YUV space (a color space represented by luminance (Y) and two color differences (U and V)).
- the basic scheme of MPEG-1 is to predict motion from frame-to-frame in the temporal direction, and then to use discrete cosine transforms (DCTs) to organize the redundancy in the spatial directions.
- the DCTs are performed on 8×8 blocks, and the motion prediction is done in the luminance channel (Y) on 16×16 blocks (each of the 16×16 Y and the corresponding 8×8 U and V block pairs is considered to be a macroblock).
- the DCT coefficients of either the actual data, or the difference between the block and the close match, are “quantized,” in that they are coarsely represented by fewer bits by means of (shifting and) integer dividing by a quantization parameter to yield quantization levels. Through quantization, it is intended that many of these DCT coefficients will become “0” and drop out.
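The quantization step above can be sketched as follows; the 2·q step size is an H.263-style assumption, and the coefficient values are illustrative only:

```python
def quantize_block(coeffs, q):
    """Coarsely represent DCT coefficients as integer levels by dividing
    by a quantizer step (2*q here, an H.263-style choice). Many small
    coefficients map to 0 and drop out of the bitstream."""
    step = 2 * q
    return [int(c / step) for c in coeffs]  # truncation toward zero

def dequantize_block(levels, q):
    """Approximate reconstruction at the decoder (inverse quantizer)."""
    step = 2 * q
    return [l * step for l in levels]

levels = quantize_block([310, -45, 12, 3, -2, 0, 1, 0], q=8)
# every coefficient with magnitude below the step (16) becomes 0
```

Note that the reconstruction is lossy: `dequantize_block` only recovers multiples of the step, which is exactly the coarse representation the passage describes.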
- the result of the coding including the motion vectors and the quantization levels are variable length coded using fixed tables.
- the quantization levels are zigzag scanned and ordered into a one-dimensional array.
- Each nonzero level is represented by a codeword indicating a run-length of zeros preceding in the scan order, the nonzero value of the level that ended the run and whether more nonzero levels are to be coded in the block. Compression is achieved by assigning shorter codewords to frequent events and longer codewords to less frequent events.
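The zigzag scan and run-length event formation described above can be sketched as follows (the zigzag generator and the example levels are illustrative, not taken from the patent):

```python
def zigzag_order(n):
    """Generate the (row, col) zigzag scan order for an n x n block:
    anti-diagonals in order, alternating direction on each diagonal."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def run_length_events(levels):
    """Map a zigzag-ordered list of quantization levels to
    (run_of_preceding_zeros, nonzero_level, is_last) events, matching
    the codeword description above."""
    nz = [i for i, l in enumerate(levels) if l != 0]
    events, prev = [], -1
    for k, i in enumerate(nz):
        events.append((i - prev - 1, levels[i], k == len(nz) - 1))
        prev = i
    return events

events = run_length_events([19, -2, 0, 0, 1, 0, 0, 0])
# three events; the trailing zeros after the last nonzero level cost nothing
```

Each event would then be looked up in the fixed variable-length-code table, with the common (short run, small level) events receiving the shortest codewords.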
- each macroblock in a P-frame can either be characterized by a motion vector from a close match in the last I or P-frame and blocks of DCT coefficients of the motion compensated difference values associated with the motion vector (inter coded), or simply be characterized by the blocks of DCT coefficients of the macroblock itself (intra-coded), if no suitable match exists.
- for “B” (bidirectional) frames, matching blocks are searched for in the past and/or future I or P-frames.
- the macroblock can be motion compensated by only the forward vector and using DCT blocks from the past frames, or by only the backward vector and using DCT blocks from the future frames or by both forward and backward vectors and using the average of the DCT blocks from past and future frames.
- the macroblock can also be simply intra-coded.
- a typical frame sequence may resemble the following sequence: IBBPBBPBBPBBIBBPBBPB . . . , where there are 12 frames from I to I.
- MPEG-2 can represent interlaced or progressive video sequences.
- the MPEG-2 concept is similar to MPEG-1, but includes extensions to cover a wider range of applications.
- the primary application targeted by MPEG-2 is the all-digital transmission of broadcast television quality video at coded bit rates between 4 and 9 Mbit/sec.
- the most significant enhancement in MPEG-2 is the addition of syntax for efficient coding of interlaced video (16×8 block size motion compensation).
- several other enhancements, such as alternate scan, intra VLC and nonuniform quantization, resulted in improved coding efficiency for MPEG-2.
- Other key features of MPEG-2 are the scalable extensions that permitted the division of a continuous video signal into two or more coded bit streams representing the video at different resolutions, picture quality or picture rates.
- H.261 is a video coding standard designed for data rates that are multiples of 64 Kbit/sec. This standard is specifically designed to suit ISDN lines.
- the coding algorithm utilized is a hybrid of inter-picture prediction, transform coding and motion compensation.
- the data rate of the coding algorithm can be set between 40 Kbit/sec. and 2 Mbit/sec.
- Inter-picture prediction aids in the removal of temporal redundancy
- transform coding removes spatial redundancy and motion vectors are used to help the codec compensate for motion.
- variable length coding is utilized.
- H.261 allows the DCT coefficients to be either intra coded or inter coded from previous frames.
- the 8×8 blocks of DCT coefficients of the actual data or the motion compensated difference values are quantized and variable length coded. They are multiplexed onto a hierarchical bitstream along with the variable length coded motion vectors.
- H.263 is a compression standard originally designed for low bit rate communication, but can use a wide range of bit rates.
- the coding algorithm is similar to that of H.261, but improves on H.261 in certain areas. Specifically, half-pixel precision is used for motion compensation, as opposed to the full-pixel precision and loop filter used by H.261.
- H.263 includes unrestricted motion vectors, syntax-based arithmetic coding, advance prediction and forward and backward frame prediction similar to MPEG, called P-B frames. This results in the ability to achieve the same video quality as in H.261 at a drastically lower bit rate.
- unrestricted motion vectors may point outside the picture. That is, the edge pixels are used as predictions for the “not existing” pixels. A significant gain is achieved if there is movement along the edge of the picture.
- overlapped block motion compensation is used for the P-frames. That is, four 8×8 vectors, instead of one 16×16 vector, are used for some of the macroblocks in the picture, and motion vectors are allowed to point outside the picture. Four vectors require more bits, but give better prediction.
- a “P-B” frame consists of two pictures being coded as one unit.
- the name P-B actually was derived from the name of picture types in MPEG (P-frames and B-frames).
- a P-B-frame consists of one P-frame that is predicted from the last decoded P-frame and one B-frame that is predicted from both the last decoded P-frame and the P-frame currently being decoded.
- the last picture is called a B-picture because parts of it may be bi-directionally predicted from the past and future P-frames.
- the frame rate can be doubled with this mode without greatly increasing the bit rate.
- P-B-frames do not work as well as B-frames in MPEG, since there are no separate forward and backward vectors in H.263.
- a motion vector for the P-frame is scaled to yield the backward vector for the B frame and scaled and augmented by a delta vector to yield the forward vector for the B frame.
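The vector derivation described above can be sketched as follows; the temporal-reference weighting (`trb`, `trd`) follows H.263-style scaling, and the integer rounding is simplified relative to the standard:

```python
def pb_vectors(mv, trb, trd, delta):
    """Derive forward and backward vectors for the B part of a P-B frame
    from the P-frame vector mv, per the scaling described above.
    trb: temporal distance from the previous P picture to the B picture;
    trd: temporal distance between the two P pictures;
    delta: the small transmitted correction (delta) vector.
    (H.263-style formulas; rounding details are simplified here.)"""
    forward = (trb * mv) // trd + delta
    backward = ((trb - trd) * mv) // trd
    return forward, backward
```

For example, with the B picture midway between the two P pictures (`trb=1`, `trd=2`) and no delta, a P vector of 6 splits into a forward vector of 3 and a backward vector of −3, which is the scaling the text describes.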
- MPEG-4 is closely related to H.263 and MPEG-1.
- MPEG-4 video compression uses the hybrid block DCT and motion compensation video coding techniques found in MPEG-1, MPEG-2, H.261 and H.263.
- the DCT is used in transform coding of the macroblock or the motion compensated prediction error (the displaced frame difference, or DFD) of the macroblock.
- MPEG-4 additionally represents a scene as a composition of Video Object Planes (VOPs).
- one VOP could be a speaker, such as a newscaster, in the foreground
- another VOP could be a static background, such as a news studio.
- VOPs could be coded separately including shape and transparency information. Since a VOP can be a rectangular plane, such as a single monolithic frame in MPEG-1, or have an arbitrary shape, this allows for separate encoding, decoding, and manipulation of various visual objects that make up a scene.
- a single quantization parameter q controls the scale of the quantizer bin size, which is proportional to the difference between the decision levels of the scalar quantizer applied to each DCT coefficient.
- the spatial data content of a group of one or more luminance or chrominance blocks along with the coding mode and the quantization parameter for the group determine the number of bits that are expended for the quantization of the group.
- the number of quantization bits, combined with the number of overhead bits expended for the representation of the motion vectors, coding modes, coding block patterns of the blocks and the quantization parameter yields the total number of bits used for coding of that group.
- the rate control method adopted by MPEG-4 estimates the number of coding bits of a data entity for each quantization parameter before the coding process.
- the quantization parameters associated with an estimate for the number of coding bits that is closest to the targeted number of coding bits (bit budget) for the data entities are selected for the data entities.
- the quantization parameters for the remaining data entities are updated such that the estimate for the number of coding bits for the remaining entities closely approximates the remaining bit budget.
- the relation between the estimate for the number of coding bits for a data entity and the quantization parameter is established by means of a rate-distortion function which incorporates a sample statistic of the data entity.
- the quantization parameter and the actual number of coding bits observed after coding a data entity with that quantization parameter are used to update the parameters of the rate distortion function by linear regression.
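The linear-regression update of the rate-distortion function can be sketched as follows, assuming the quadratic model R(q) ≈ a/q + b/q² of the MPEG-4 verification model (the function names are illustrative):

```python
def fit_rd_model(samples):
    """Fit bits ~ a/q + b/q^2 to observed (q, bits) pairs by ordinary
    least squares: the model is linear in the unknowns (a, b) with
    regressors 1/q and 1/q^2, so the 2x2 normal equations suffice.
    This sketches the linear-regression parameter update described
    above; the quadratic form itself is an assumption of this sketch."""
    s11 = sum((1 / q) ** 2 for q, _ in samples)
    s12 = sum((1 / q) ** 3 for q, _ in samples)
    s22 = sum((1 / q) ** 4 for q, _ in samples)
    t1 = sum(b / q for q, b in samples)
    t2 = sum(b / q ** 2 for q, b in samples)
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (t2 * s11 - t1 * s12) / det
    return a, b

def estimate_bits(a, b, q):
    """Predicted coding bits for quantization parameter q."""
    return a / q + b / q / q
```

After each data entity is coded, its observed (q, bits) pair is appended to `samples` and the model is refitted, so the estimate tracks the statistics of the sequence.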
- a generic coder/decoder pair 100, 200 is shown in FIGS. 1A and 1B, respectively.
- a frame or field of data is partitioned into groups of square blocks, herein referred to as macroblocks, of pixel luminance intensity values and corresponding pixel chrominance intensity values.
- either the intensity values of the pixels, or the error 120, 130 of their temporal prediction from one or more temporally local frames, is transformed by means of a two-dimensional orthogonal transform, such as a discrete cosine transform (DCT) 140.
- the transform coefficients of the chrominance and luminance blocks of the macroblock are quantized, usually one at a time, with a uniform scalar quantizer (Q) 150 .
- the quantized bits of data of each block are further compressed by a variable length coder (VLC) 160 that maps the quantized bits to a series of codewords of bits by means of a look-up table.
- at the decoder 200, a variable length decoder (VLD) recovers the quantization levels from the codewords, an inverse uniform scalar quantizer (IQ) reconstructs the transform coefficients, and an inverse discrete cosine transform (IDCT) recovers the pixel or prediction-error values.
- the present invention derives a model of the relation between the number of bits used by the quantizer to quantize a group of blocks and the quantization parameter for that group given the spatial data content of the group and the coding mode.
- the invention uses the model to precisely estimate the number of bits that will be expended for the quantization of a future group of blocks for a chosen quantization parameter, a known spatial data content, and a known coding mode.
- a feature extractor lowers the computational and design complexities by extracting the significant part of the data based on the coding mode.
- a classifier then acts on the features to yield a class for the group of blocks.
- a conditional estimator maps the class information and the quantizer parameter to an estimate for the number of quantization bits for the group of blocks. The estimates for the quantization and overhead bits are combined to give an estimate for the number of coding bits of the group of blocks.
- This invention enables a targeted number of coding bits for a data entity, consisting of one or more groups of blocks, to be closely approximated.
- the target number of bits is usually determined by the constraints on transmission bandwidth, latency, and buffer capacity.
- the estimates for the number of quantization bits of all of the groups of the data entity are combined to yield an estimate for the number of quantization bits for the data entity.
- the number of quantization bits for each group decreases monotonically with the quantization parameter for that group. Assuming that the number of overhead bits for the data entity does not increase with the average quantization parameter for the data entity, the estimate for the number of coding bits for the data entity also decreases monotonically with the average quantization parameter. This allows the system to control the number of bits output for the data entity by selecting a combination of quantization parameters which corresponds to an estimate for the number of coding bits of the data entity that is closest to the targeted number of coding bits of the data entity.
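The selection step above can be sketched as follows; `estimate` stands in for any monotonically decreasing bit-count model and is an assumption of this sketch, not the patent's estimator:

```python
def pick_quantizer(estimate, q_min, q_max, target):
    """Select the quantization parameter whose estimated bit count is
    closest to the target. Monotone decrease of estimate(q) in q
    guarantees a single best crossing point; a plain scan over the
    (small) legal range finds it."""
    return min(range(q_min, q_max + 1),
               key=lambda q: abs(estimate(q) - target))

# hypothetical monotone bit model for illustration only
q = pick_quantizer(lambda q: 120000 // q, 1, 31, 9000)
```

With a per-group estimator, the same idea extends to choosing a combination of parameters for all groups whose summed estimate best matches the entity's bit budget.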
- FIG. 1A is a block diagram of a conventional video encoder.
- FIG. 1B is a block diagram of a conventional video decoder.
- FIG. 2 is a block diagram of a circuit to estimate the number of quantization bits according to an embodiment of the invention.
- FIG. 3 is a block diagram of a circuit with a look-up table for a memory write operation during the estimator training according to an embodiment of the invention.
- FIG. 4 is a block diagram of a circuit for the approximation of the targeted number of coding bits according to an embodiment of the invention.
- FIG. 5 is a flowchart for the approximation of the targeted number of coding bits showing the initial assignment of quantization parameters before the encoding of macroblocks and the parameter adjustment after the encoding of each macroblock.
- FIG. 6 is a block diagram of a video coder incorporating the rate control method according to an embodiment of the present invention.
- FIG. 2 shows a high-level functional block diagram of a circuit operating the method according to an embodiment of the present invention.
- the circuit illustrated in FIG. 2 includes a feature extractor 300 , a classifier 310 , and an estimator 320 .
- let G = {g_1, . . . , g_N} denote a group of luminance and chrominance blocks and let d denote the index of the coding mode of G.
- the invention facilitates the design of the feature extractor 300, the classifier 310 and the estimator 320 in such a way that the estimate B̂(g_1, . . . , g_N, d, q) closely approximates the actual number of quantization bits B(g_1, . . . , g_N, d, q) in a statistical sense.
- the invention is accomplished according to the following statistical determination. Let the cost of estimating B(g_1, . . . , g_N, d, q) by B̂(g_1, . . . , g_N, d, q) be represented as C(B(g_1, . . . , g_N, d, q), B̂(g_1, . . . , g_N, d, q)).
- E is the statistical expectation of its argument with respect to {g_1, . . . , g_N}, and dp(g_1, . . . , g_N) measures the probability of observing the group of blocks {g_1, . . . , g_N}.
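The design criterion implied by the two passages above can be written out as follows (a reconstruction using the symbols of the text, with T, V and U denoting the feature extractor, classifier and estimator stages being designed):

```latex
\min_{T,\,V,\,U}\;
\mathbb{E}\!\left[\, C\!\left( B(g_1,\ldots,g_N,d,q),\; \hat{B}(g_1,\ldots,g_N,d,q) \right) \right]
\;=\;
\min_{T,\,V,\,U} \int C\!\left( B(g_1,\ldots,g_N,d,q),\; \hat{B}(g_1,\ldots,g_N,d,q) \right)
\,\mathrm{d}p(g_1,\ldots,g_N)
```

With a squared-error cost C, the minimizing estimator U is the conditional mean of the bit count given the class and quantization parameter, which is the form the estimator takes later in the description.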
- a sequential design approach involves designing each one of the stages once based on the data supplied to each stage from the preceding stages.
- the feature extractor 300 is designed with a prior knowledge of the significant part of the data in the group of chrominance and luminance blocks.
- the mapping also provides the most desired tradeoff between the reduction of the dimensionality of its input space and the preservation of the significant information in the group of blocks.
- the feature extractor may yield a sample statistic such as sample variance or sample mean absolute value of the data in the group of chrominance and luminance blocks as the one dimensional feature vector.
- the rate-distortion bounds for Laplacian and Gaussian sources, which are commonly used for modelling the operational rate-distortion functions for the scalar quantization of DCT coefficients, are parameterized by the source variance.
- the classifier 310 is designed so that any output feature vector (obtained from the operation of the feature extractor 300 ) is in the domain of V and the classification operation does not lead to substantial loss of the extracted significant information (representative of the chrominance and the luminance).
- the invention will be described in operation with coding for a baseline H.263 compliant bitstream and decoder.
- the video sequence consists of I and P pictures. I and P pictures are further partitioned into groups of four luminance and two chrominance blocks (macroblocks).
- a macroblock has 384 luminance and chrominance data elements.
- the I picture macroblocks are always intra-coded, while P picture macroblocks are either intra-coded, inter-coded, or not coded at all.
- Intra-coding implies that the macroblock is coded without subtracting from it a temporal prediction from the past temporally local frames.
- Inter-coding implies that the temporal prediction error of the macroblock is coded.
- the macroblock type, coded block pattern, and differential quantization parameter between macroblocks are coded and transmitted. Motion vector information is also coded and transmitted for inter-coded macroblocks.
- This embodiment of the invention exemplifies 1) how the three circuits T, V and U (300, 310 and 320 respectively) can be designed sequentially; 2) how an estimate for the number of quantization bits can be obtained for a macroblock (group of blocks); and 3) how a targeted number of coding bits can be approximated for a single picture (data entity).
- the feature extractor 300 operates according to the following principle.
- let G^R = {g_1^R, . . . , g_6^R} be the R-th macroblock to be coded.
- let I(x,y) denote the intensity value at location (x,y) of a coded picture. This could represent a luminance or a chrominance intensity value, or the motion compensated error value thereof, depending on the coding mode of the macroblock.
- the coding-mode indicator is d_R = 0 if the macroblock is inter-coded, and d_R = 1 if it is intra-coded.
- the feature extractor computes σ_R = ( (1/384) Σ_{j ∈ {1, . . . , 6}} Σ_{(x,y) ∈ g_j^R} ( I(x,y) − d_R Ī_j^R )² )^{1/2},
- where Ī_j^R = (1/64) Σ_{(x,y) ∈ g_j^R} I(x,y) is the sample mean of block g_j^R.
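A minimal sketch of this feature computation (function and variable names are illustrative):

```python
def macroblock_feature(blocks, intra):
    """Compute the one-dimensional feature sigma_R from the equations
    above: a sample standard deviation over the 384 luminance and
    chrominance values of the macroblock, with each 8x8 block's sample
    mean subtracted only in intra mode (d_R = 1).
    `blocks` is a list of six 8x8 blocks, each a list of 64 values."""
    d = 1 if intra else 0
    total = 0.0
    for blk in blocks:
        mean = sum(blk) / 64.0       # I-bar for this block
        total += sum((v - d * mean) ** 2 for v in blk)
    return (total / 384.0) ** 0.5
```

In inter mode the values are prediction errors, so no mean is removed; in intra mode removing the block mean keeps the feature comparable across the two coding modes.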
- the quantizer employed in the classifier 310 is different than the quantizer employed in the main coding loop.
- the estimator 320 employs the expected value of the number of quantization bits conditioned on the class, c, and the quantization parameter, q, as the closest bit count estimate, U(c, q), for a macroblock of class c quantized with quantization parameter q. For the R'th macroblock the estimate is obtained according to the following equation:
- U_R(c,q) is the estimate of the number of quantization bits for a macroblock of class c quantized with quantization parameter q, computed prior to the R-th macroblock.
- P_{c,q}^X is the number of macroblocks prior to and including the X-th macroblock which are of class c and are coded with parameter q.
- FIG. 3 is a detailed circuit diagram of the estimator 320 with a look-up table for a memory write operation that occurs during the Estimator module 320 training. This function of the Estimator 320 illustrates the computation of the estimated number of quantization bits for the group of macroblocks.
- P_{c,q}^max is a threshold.
- the value of Z shown in FIG. 3 is set equal to the number of macroblocks in a picture. Further, it is preferred that the actual number of quantization bits observed for a particular class and the quantization parameters are used to determine the estimates for that class and quantization parameters.
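The estimator's look-up-table update can be sketched as a running conditional average (class and method names are illustrative; the cap `p_max` plays the role of the threshold P_{c,q}^max, limiting the effective count so the average keeps adapting):

```python
class BitCountEstimator:
    """Look-up-table sketch of the conditional-mean estimator U(c, q):
    the estimate for class c and parameter q is the running average of
    the actual bit counts observed for that (c, q) pair."""
    def __init__(self, p_max=64):
        self.u = {}          # (c, q) -> current estimate
        self.n = {}          # (c, q) -> observation count
        self.p_max = p_max

    def estimate(self, c, q, default=0.0):
        return self.u.get((c, q), default)

    def update(self, c, q, observed_bits):
        key = (c, q)
        n = min(self.n.get(key, 0), self.p_max - 1)   # capped count
        u = self.u.get(key, 0.0)
        self.u[key] = (n * u + observed_bits) / (n + 1)
        self.n[key] = n + 1
```

Each time a macroblock is coded, its observed bit count updates the table entry for its (class, parameter) pair, which is the memory write operation of FIG. 3.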
- the present invention determines a combination of quantization parameters for the groups of blocks comprising a data entity prior to the coding of the groups of blocks so that the sum of the estimates for the number of coding bits of all the groups of blocks closely approximates the targeted number of coding bits for the data entity. This is performed by initiating an exhaustive search over the set of all possible combinations of quantization parameters.
- the set of all possible combinations is restricted to the set of combinations for which the first Z_0 of the Z macroblocks are quantized with a quantization parameter of q, and the remaining Z − Z_0 macroblocks are quantized with a quantization parameter of q ± 1, where the sign is alternated from frame to frame.
- This restriction is based on the assumption that in order to achieve near optimal coding performance, the quantization parameter should not be varied greatly across a picture.
- the method for obtaining the optimum pair (q, Z_0) is described here for the case with the positive sign (i.e., Z − Z_0 macroblocks quantized with q + 1).
- the search is initialized by setting the quantization parameter to the largest value allowed by the video coding standard for all macroblocks of the picture. For example, in the H.263 video compression standard, the initialization is performed as:
- the macroblocks are scanned in raster-scan order. That is, a picture is scanned by scanning each row from left to right and then scanning the row below it. The picture scan order is applied repetitively, so that the last macroblock of the last row is followed by the first macroblock of the first row. Only the quantization parameter of the current (scanned) macroblock is decremented. The new quantization parameter and class of the current macroblock are then mapped to a new bit count estimate for the current macroblock.
- the bitcount estimate is overridden by an estimate of zero if the macroblock is deemed not to be coded.
- the decision of whether or not to code is made by comparing a feature derived from the data of the macroblock against a threshold.
- this feature is taken to be the sample frame difference replenishment (temporal prediction with a zero motion vector) error variance of the luminance and chrominance values of the macroblock.
- a macroblock is not coded if the inequality (1/384) Σ_{j ∈ {1, . . . , 6}} Σ_{(x,y) ∈ g_j^R} ( I_FD(x,y) )² < q²/3 holds, where I_FD(x,y) is the frame-difference value at location (x,y).
- accordingly, the overhead bit estimate for the R-th macroblock is B̂_OV^R(q) = 1 if (1/384) Σ_{j ∈ {1, . . . , 6}} Σ_{(x,y) ∈ g_j^R} ( I_FD(x,y) )² < q²/3, and B̂_OV^R(q) = B̄_OV otherwise.
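The overall search of FIGS. 4 and 5 can be sketched as follows (a simplified sketch: the not-coded override and the ± sign alternation are omitted, and `estimate(c, q)` is any class/parameter-to-bits model):

```python
def assign_quantizers(classes, estimate, b_target, q_max=31, q_min=1):
    """Start every macroblock at the largest allowed quantization
    parameter, then repeatedly scan the macroblocks in order,
    decrementing the current one's parameter and re-estimating, until
    the summed bit estimate reaches the target B_TR. Because each pass
    decrements each macroblock at most once, the parameters never
    differ by more than one across the picture, as the restriction
    above requires."""
    qs = [q_max] * len(classes)
    bits = [estimate(c, q_max) for c in classes]
    while sum(bits) < b_target and min(qs) > q_min:
        for r, c in enumerate(classes):
            if qs[r] == q_min:
                continue
            qs[r] -= 1
            bits[r] = estimate(c, qs[r])
            if sum(bits) >= b_target:
                return qs          # target reached mid-pass: stop here
    return qs
```

The point where the scan stops determines Z_0: the macroblocks already visited in the final pass carry parameter q, and the rest carry q + 1.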
- a block diagram of the system performing the search for the combination of quantization parameters achieving the target number of bits B_TR, described above, is shown in FIG. 4.
- the block diagram shows how the above calculation is determined from the scan order generator 42 .
- the signal from the comparator 40 switches OFF the search process and directs the ⁇ Q R ⁇ to the encoder 41 .
- FIG. 5 shows a flowchart for the approximation of the targeted number of coding bits, B TR , by the estimation of the number of quantization bits.
- the quantization parameters are initialized for the macroblocks and the first macroblock in the scan order becomes the current macroblock (step S 1 ).
- a query determines whether the macroblock is coded with the current quantization parameter (step S 2 ). If the macroblock is coded with the current quantization parameter, then a bitcount estimate is performed (step S 3 ), and another query determines whether the targeted bitcount, B TR , is reached or exceeded (step S 4 ). If the targeted bitcount has been reached or exceeded, the macroblocks are encoded with the final set of quantization parameters ⁇ Q R ⁇ (step S 5 ).
- step S 6 determines whether the macroblock is at the end of the scan order. If the macroblock is at the end of the scan order, then the first macroblock in the scan order becomes the current macroblock (step S 7 ) and the quantization parameter of the current macroblock is updated (decremented) (step S 8 ). At which time the process continues with the query of step S 2 .
- step S 9 the next macroblock in the scan order becomes the current macroblock (step S 9 ) and the quantization parameter of the current macroblock is updated (decremented) (step S 8 ).
- step S 9 the quantization parameter of the current macroblock is updated (decremented) (step S 8 ).
- step S 8 the quantization parameter of the current macroblock is updated (decremented) (step S 8 ).
- step S 9 the quantization parameter of the current macroblock is updated (decremented)
- step S 8 the quantization parameter of the current macroblock is updated (decremented)
- FIG. 6 is a block diagram of a video coder 60 incorporating the rate-control method of the current invention 600.
- The addition of the rate-control method 600 is the primary difference between the video coder 60 of FIG. 6 and the prior-art video coder 100 of FIG. 1A.
- The rate control method of the present invention differs from that of the MPEG-4 video standard, and other similar video standards, in that the spatial data content of the group of blocks, as well as its coding mode, is factored into the estimation process through features extracted from the data.
- The rate control method of the current invention maps each unique pair of feature class and quantization parameter to a unique estimate for the number of coding bits.
- The estimate for a particular class and quantization parameter is designed and updated by using the actual number of coding bits observed for previously coded data entities (groups of blocks) that were mapped to that class and quantized with that parameter.
- By contrast, the parameters of the rate-distortion function of the MPEG-4 rate control method are designed and updated by using the quantization parameter and the actual number of coding bits observed for all of the previously coded data entities.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Algebra (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
A method and apparatus is provided for estimating the number of bits output from a video coder given a known spatial data content, G={g1, . . . , gN}, of a group of luminance and chrominance blocks, and a known coding mode, d, where d represents the index of said coding mode. The method comprises the steps of extracting a significant part of the spatial data content, G, in relation to the coding mode, d, to yield a feature vector F, the feature vector representing statistics and signal components of the luminance and chrominance data of the luminance and chrominance blocks; mapping the feature vector to yield a class index, c, for said respective group of luminance and chrominance blocks; mapping the class index, c, in relation to a quantization parameter, q, where the quantization parameter controls the scale of the bin size of the quantizer applied to the transform coefficients, to yield an estimated number of quantization bits for the group of luminance and chrominance blocks; and determining an estimated total number of coding bits for the group of luminance and chrominance blocks from the combination of the estimated number of quantization bits and an estimated number of overhead bits, wherein the overhead bits represent the additional bits expended to represent respective portions of the bitstream.
Description
- 1. Field of the Invention
- The present invention relates to signal processing, and in particular, to a method and apparatus for estimating and controlling the number of bits output from a video coder.
- 2. Description of the Related Art
- Numerous international video coding standards have been established over the last decade. MPEG-1, for example, defines a bitstream for compressed video and audio optimized to fit into a bandwidth of 1.5 Mbits/sec. This rate is special because it is the data rate of uncompressed audio CDs and DATs.
- MPEG-1 is defined to begin with a relatively low-resolution video sequence of about 352×240 pixels at 30 frames/sec., but use original high (CD) quality audio. The images are in color, but are converted into YUV space (a color space represented by luminance (Y) and two color differences (U and V)).
- The basic scheme of MPEG-1 is to predict motion from frame-to-frame in the temporal direction, and then to use discrete cosine transforms (DCTs) to organize the redundancy in the spatial directions. The DCTs are performed on 8×8 blocks, and the motion prediction is done in the luminance channel (Y) on 16×16 blocks (each of the 16×16 Y and the corresponding 8×8 U and V block pairs is considered to be a macroblock).
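The 8×8 DCT at the heart of this scheme can be sketched directly. The following illustrative Python (the direct form is for clarity only; production coders use fast factorizations) computes an orthonormal 2-D DCT-II of one 8×8 block:

```python
import math

def dct2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (direct, separable form)."""
    N = 8
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

# A flat block concentrates all of its energy in the DC coefficient.
flat = [[128] * 8 for _ in range(8)]
coeffs = dct2d(flat)
```

For a flat block every AC coefficient vanishes; this energy compaction is what makes the subsequent quantization effective.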
- In other words, given the 16×16 block in a current frame to be coded, a close match to that block in a previous or future frame (there are backward prediction modes where later frames are sent first to allow interpolating between frames) is desired.
- The DCT coefficients of either the actual data, or the difference between the block and the close match, are "quantized," in that they are coarsely represented by a smaller number of bits by means of (shifting and) integer dividing by a quantization parameter to yield quantization levels. By quantization, it is desired that many of these DCT coefficients will become "0" and drop out.
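The integer-division quantization just described can be illustrated as follows. This is a minimal sketch assuming an H.263-flavored step of 2·q; the rounding offsets, dead zones, and clipping of the real standards are omitted:

```python
def quantize(coeff, q):
    """Integer-divide a transform coefficient by a step proportional to the
    quantization parameter q (simplified sketch, not a standard's exact rule)."""
    sign = -1 if coeff < 0 else 1
    return sign * (abs(coeff) // (2 * q))

def dequantize(level, q):
    """Approximate inverse: reconstruct toward the middle of the bin."""
    if level == 0:
        return 0
    sign = -1 if level < 0 else 1
    return sign * q * (2 * abs(level) + 1)

coeffs = [345, -78, 12, 3, -2, 0]
levels = [quantize(c, q=8) for c in coeffs]  # a large q drives most levels to 0
```

With q=8 only the two largest coefficients survive, showing how a larger quantization parameter directly lowers the bit count.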
- The results of the coding, including the motion vectors and the quantization levels, are variable-length coded using fixed tables. The quantization levels are zigzag-scanned and ordered into a one-dimensional array. Each nonzero level is represented by a codeword indicating the run length of zeros preceding it in the scan order, the nonzero value of the level that ended the run, and whether more nonzero levels remain to be coded in the block. Compression is achieved by assigning shorter codewords to frequent events and longer codewords to less frequent events.
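The zigzag scan and the (run, level, last) events described above can be sketched as below. One common scan orientation is assumed, and the actual VLC table look-up is omitted:

```python
def zigzag_order(n=8):
    """Index pairs of an n x n block in zigzag order: anti-diagonals,
    alternating traversal direction (one common orientation)."""
    idx = [(r, c) for r in range(n) for c in range(n)]
    return sorted(idx, key=lambda p: (p[0] + p[1],
                                      p[0] if (p[0] + p[1]) % 2 else -p[0]))

def run_level_events(levels):
    """(run_of_zeros, level, last) triples for the nonzero quantization
    levels of a scanned block, as fed to the variable length coder."""
    nz = [i for i, v in enumerate(levels) if v != 0]
    events, run_start = [], 0
    for k, i in enumerate(nz):
        events.append((i - run_start, levels[i], k == len(nz) - 1))
        run_start = i + 1
    return events
```

For example, the scanned levels [21, 0, 0, -4, 1, 0, 0] produce three events, the final one flagged as the last nonzero level in the block.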
- In the MPEG standard, there are three types of coded frames. There are “I” frames, or intra-coded frames, that are simply a frame coded as a still image, without using any past history. Then there are “P” frames, or predicted frames. P-frames are predicted from the most recently reconstructed I- or P-frame (from the point of view of the decompressor). Further, each macroblock in a P-frame can either be characterized by a motion vector from a close match in the last I or P-frame and blocks of DCT coefficients of the motion compensated difference values associated with the motion vector (inter coded), or simply be characterized by the blocks of DCT coefficients of the macroblock itself (intra-coded), if no suitable match exists.
- In "B" (bidirectional) frames, matching blocks are searched for in the past and/or future I- or P-frames. The macroblock can be motion compensated by only the forward vector, using DCT blocks from the past frames; by only the backward vector, using DCT blocks from the future frames; or by both forward and backward vectors, using the average of the DCT blocks from past and future frames. The macroblock can also be simply intra-coded. Thus, after coding, a typical frame sequence may resemble the following sequence: IBBPBBPBBPBBIBBPBBPB . . . , where there are 12 frames from I to I.
- Unlike MPEG-1, which is strictly meant for progressive sequences, another standard, MPEG-2, was developed. MPEG-2 can represent interlaced or progressive video sequences. The MPEG-2 concept is similar to MPEG-1, but includes extensions to cover a wider range of applications. The primary application targeted by MPEG-2 is the all-digital transmission of broadcast-television-quality video at coded bit rates between 4 and 9 Mbit/sec. The most significant enhancement in MPEG-2 is the addition of syntax for efficient coding of interlaced video (16×8 block size motion compensation).
- Several other enhancements, such as alternate scan, intra VLC, and nonuniform quantization, resulted in improved coding efficiency for MPEG-2. Other key features of MPEG-2 are the scalable extensions, which permit the division of a continuous video signal into two or more coded bitstreams representing the video at different resolutions, picture qualities, or picture rates.
- H.261 is a video coding standard designed for data rates that are multiples of 64 Kbit/sec. This standard is specifically designed to suit ISDN lines.
- As in MPEG standards the coding algorithm utilized is a hybrid of inter-picture prediction, transform coding and motion compensation. The data rate of the coding algorithm can be set between 40 Kbit/sec. and 2 Mbit/sec. Inter-picture prediction aids in the removal of temporal redundancy, while transform coding removes spatial redundancy and motion vectors are used to help the codec compensate for motion. To remove any further redundancy in the bitstream, variable length coding is utilized.
- As in the MPEG standards, H.261 allows the DCT coefficients to be either intra coded or inter coded from previous frames. In other words the 8×8 blocks of DCT coefficients of the actual data or the motion compensated difference values are quantized and variable length coded. They are multiplexed onto a hierarchical bitstream along with the variable length coded motion vectors.
- A similar standard, H.263, is a compression standard originally designed for low bit rate communication, but it can use a wide range of bit rates. The coding algorithm is similar to that of H.261, but improves on H.261 in certain areas. Specifically, half-pixel precision is used for motion compensation, as opposed to the full-pixel precision and loop filter used by H.261. Additionally, H.263 includes unrestricted motion vectors, syntax-based arithmetic coding, advanced prediction, and forward and backward frame prediction similar to MPEG, called P-B frames. This results in the ability to achieve the same video quality as H.261 at a drastically lower bit rate.
- Unrestricted motion vectors are allowed to point outside the picture. That is, the edge pixels are used as predictions for the "not existing" pixels. A significant gain is achieved if there is movement along the edge of the picture.
- Through advance prediction, overlapped block motion compensation is used for the P-frames. That is, four 8×8 vectors, instead of one 16×16 vector are used for some of the macroblocks in the picture, and motion vectors are allowed to point outside the picture. Four vectors require more bits, but give better prediction.
- A “P-B” frame consists of two pictures being coded as one unit. The name P-B actually was derived from the name of picture types in MPEG (P-frames and B-frames). Thus, a P-B-frame consists of one P-frame that is predicted from the last decoded P-frame and one B-frame that is predicted from both the last decoded P-frame and the P-frame currently being decoded. The last picture is called a B-picture because parts of it may be bi-directionally predicted from the past and future P-frames.
- As a result of the above characteristics, for relatively simple sequences, the frame rate can be doubled with this mode without greatly increasing the bit rate. For sequences with a lot of motion, P-B-frames do not work as well as B-frames in MPEG, since there are no separate forward and backward vectors in H.263. A motion vector for the P-frame is scaled to yield the backward vector for the B frame and scaled and augmented by a delta vector to yield the forward vector for the B frame.
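The vector derivation for the B part of a P-B frame can be sketched as below. This is an illustrative reading of the scaling idea only: trb and trd stand for the temporal distance from the previous P-frame to the B-picture and between the two P-frames, and real H.263 applies integer rounding rules and a per-component delta vector that are not reproduced here:

```python
def pb_vectors(mv_p, trb, trd, delta=0.0):
    """Sketch: derive the forward and backward B vectors from the P-frame
    vector mv_p by temporal scaling, plus an optional small delta on the
    forward vector (per component; rounding rules of H.263 omitted)."""
    forward = trb * mv_p / trd + delta
    backward = (trb - trd) * mv_p / trd
    return forward, backward
```

For a B-picture halfway between two P-frames (trb=1, trd=2), a P vector of 6 yields a forward vector of 3 and a backward vector of -3, i.e., symmetric interpolation when the delta is zero.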
- Another compression standard is MPEG-4. From a video compression perspective, MPEG-4 is closely related to H.263 and MPEG-1. MPEG-4 video compression uses the hybrid block DCT and motion compensation video coding techniques found in MPEG-1, MPEG-2, H.261 and H.263. As in MPEG and H.263, the DCT is used in transform coding of the macroblock or the motion compensated prediction error (the displaced frame difference, or DFD) of the macroblock. Each of the I, P and P-B frames are supported.
- Additionally, as in H.263, MPEG-4 supports unrestricted motion vectors, syntax-based arithmetic coding, and advanced prediction with 8×8-pixel, block-based overlapped motion compensation. The DCT coefficients are quantized, run-length encoded, and variable-length coded using the same tables as H.263 and MPEG-1.
- The major improvement in MPEG-4 did not lie in the video compression algorithm, but instead was in support of multiple video layers in the image sequence (instances of which in a frame are Video Object Planes, or VOPs). For example, one VOP could be a speaker, such as a newscaster, in the foreground, and another VOP could be a static background, such as a news studio. These VOPs could be coded separately including shape and transparency information. Since a VOP can be a rectangular plane, such as a single monolithic frame in MPEG-1, or have an arbitrary shape, this allows for separate encoding, decoding, and manipulation of various visual objects that make up a scene.
- Typically, under these international video coding standards, a single quantization parameter q controls the scale of the quantizer bin size, which is proportional to the difference between the decision levels of the scalar quantizer applied to each DCT coefficient. The spatial data content of a group of one or more luminance or chrominance blocks along with the coding mode and the quantization parameter for the group determine the number of bits that are expended for the quantization of the group. In turn, the number of quantization bits, combined with the number of overhead bits expended for the representation of the motion vectors, coding modes, coding block patterns of the blocks and the quantization parameter yields the total number of bits used for coding of that group.
- In the early reference rate control methods developed for MPEG-2 and H.263, the error between the cumulative actual and cumulative targeted number of coding bits is computed for the previously coded data entities (a single macroblock, a group of macroblocks, and pictures). This error is negatively fed back to the most recent quantization parameter to determine the quantization parameter for the current data entity. Thus, the error between the actual and targeted number of coding bits for the current data entity has no effect on the selection process for the quantization parameter for the current data entity. The delay in the response time to the errors results in large deviations from targeted rate profiles. Even for constant bit rate applications, such large deviations usually lead to large buffer requirements.
- More recent rate control methods adopted by MPEG-4 Verification Model and ITU-T Test Model TMN8 achieve more accurate rate control. For example, the rate control method adopted by MPEG-4 estimates the number of coding bits of a data entity for each quantization parameter before the coding process. The quantization parameters associated with an estimate for the number of coding bits that is closest to the targeted number of coding bits (bit budget) for the data entities are selected for the data entities. After the encoding of each data entity the quantization parameters for the remaining data entities are updated such that the estimate for the number of coding bits for the remaining entities closely approximates the remaining bit budget. The relation between the estimate for the number of coding bits for a data entity and the quantization parameter is established by means of a rate-distortion function which incorporates a sample statistic of the data entity. The quantization parameter and the actual number of coding bits observed after coding a data entity with that quantization parameter are used to update the parameters of the rate distortion function by linear regression.
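The MPEG-4-style estimation step can be sketched as a quadratic model in 1/Q whose two parameters are refit from past observations by least squares. The function names and the use of a per-entity complexity statistic (here called mad) are assumptions for illustration, in the spirit of the rate-distortion function described above:

```python
def predict_bits(x1, x2, mad, q):
    """Quadratic rate model (in the spirit of the MPEG-4 VM):
    bits ~ mad * (x1/q + x2/q^2)."""
    return mad * (x1 / q + x2 / (q * q))

def fit_model(samples):
    """Least-squares fit of (x1, x2) from past (q, mad, bits) observations,
    solving the 2x2 normal equations directly."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for q, mad, bits in samples:
        u, v, r = mad / q, mad / (q * q), float(bits)
        a11 += u * u; a12 += u * v; a22 += v * v
        b1 += u * r;  b2 += v * r
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
```

With noise-free synthetic observations the fit recovers the generating parameters exactly, which is the regression update the text refers to.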
- Conventional video coders that operate under one of these compression standards process a sequence of video frames or fields and output a bitstream representing the significant data contained in these frames or fields. A video decoder inputting such a bitstream can reconstruct these frames or fields with a certain fidelity.
- A generic coder/decoder pair is shown in FIGS. 1A and 1B. In operation of the coder 100, a frame or field of data is partitioned into groups of square blocks, herein referred to as macroblocks, of pixel luminance intensity values and corresponding pixel chrominance intensity values.
- For each macroblock, one of the intensity values of the pixels, and the error of their temporal prediction from one or more temporally local frames, is selected and transformed, according to the coding mode.
- The transform coefficients of the chrominance and luminance blocks of the macroblock are quantized, usually one at a time, with a uniform scalar quantizer (Q) 150. The quantized bits of data of each block are further compressed by a variable length coder (VLC) 160 that maps the quantized bits to a series of codewords of bits by means of a look-up table.
- Similarly, in operation of the decoder 200, by means of a look-up table, the quantized bits of data of each block are initially decompressed by a variable length decoder (VLD) 210. Further, an inverse discrete cosine transform (IDCT) 220 and an inverse uniform scalar quantizer (IQ) 230 operate upon these quantized bits of data to reproduce the intensity values of the pixels, and the error of their temporal prediction from one or more temporally local frames, with a certain error from their original values.
- Due to the significant length of the bitstreams involved in compression/decompression, there is a need for a method that can accurately determine and control the number of bits expected to be expended for the quantization of a future group of blocks.
- The present invention derives a model of the relation between the number of bits used by the quantizer to quantize a group of blocks and the quantization parameter for that group given the spatial data content of the group and the coding mode. The invention uses the model to precisely estimate the number of bits that will be expended for the quantization of a future group of blocks for a chosen quantization parameter, a known spatial data content, and a known coding mode.
- However, it is not feasible to precisely model the relation between all possible spatial data content and corresponding number of quantization bits due to the high computational and storage complexity required for the design and storage of such a model. To help avoid this problem, a feature extractor lowers the computational and design complexities by extracting the significant part of the data based on the coding mode. A classifier then acts on the features to yield a class for the group of blocks. A conditional estimator maps the class information and the quantizer parameter to an estimate for the number of quantization bits for the group of blocks. The estimates for the quantization and overhead bits are combined to give an estimate for the number of coding bits of the group of blocks.
- This invention facilitates a targeted number of coding bits for a data entity consisting of one or more groups of blocks to be closely approximated. The target number of bits is usually determined by the constraints on transmission bandwidth, latency, and buffer capacity. The estimates for the number of quantization bits of all of the groups of the data entity are combined to yield an estimate for the number of quantization bits for the data entity.
- The number of quantization bits for each group decreases monotonically with the quantization parameter for that group. Assuming that the number of overhead bits for the data entity does not increase with the average quantization parameter for the data entity, the estimate for the number of coding bits for the data entity also decreases monotonically with the average quantization parameter for the data entity. This allows the system to control the number of bits output for the data entity by selecting a combination of quantization parameters which corresponds to an estimate for the number of coding bits of the data entity that is closest to the targeted number of coding bits of the data entity.
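Because the estimate is monotone in the quantization parameter, the selection can be done with a simple binary search. In the sketch below, "estimate" is any hypothetical callable mapping a quantization parameter q to an estimated bit count that is non-increasing in q:

```python
def pick_quantizer(estimate, target, q_min=1, q_max=31):
    """Binary search for the smallest q whose estimated bit count does not
    exceed the target; valid because estimate(q) is non-increasing in q."""
    lo, hi = q_min, q_max
    while lo < hi:
        mid = (lo + hi) // 2
        if estimate(mid) > target:
            lo = mid + 1   # too many bits at mid: need a coarser quantizer
        else:
            hi = mid       # fits the budget: try a finer quantizer
    return lo
```

For a toy estimate of 96000//q bits and an 8000-bit budget, the search settles on q=12, the finest parameter that still fits the budget.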
- FIG. 1A is a block diagram of a conventional video encoder.
- FIG. 1B is a block diagram of a conventional video decoder.
- FIG. 2 is a block diagram of a circuit to estimate the number of quantization bits according to an embodiment of the invention.
- FIG. 3 is a block diagram of a circuit with a look-up table for a memory write operation during the estimator training according to an embodiment of the invention.
- FIG. 4 is a block diagram of a circuit for the approximation of the targeted number of coding bits according to an embodiment of the invention.
- FIG. 5 is a flowchart for the approximation of the targeted number of coding bits showing the initial assignment of quantization parameters before the encoding of macroblocks and the parameter adjustment after the encoding of each macroblock.
- FIG. 6 is a block diagram of a video coder incorporating the rate control method according to an embodiment of the present invention.
- Invention Theory
- FIG. 2 shows a high-level functional block diagram of a circuit operating the method according to an embodiment of the present invention. The circuit illustrated in FIG. 2 includes a feature extractor 300, a classifier 310, and an estimator 320.
- Still referring to FIG. 2, in operation, let G={g1, . . . ,gN} denote a group of luminance and chrominance blocks and let d denote the index of the coding mode of G. The feature extractor 300 acts on G and d and yields a feature vector F=T(G,d), where T is the feature extraction mapping.
- After obtaining the feature vector, the classifier 310 maps the feature vector to a class index c=V(F), c∈{1, . . . , L}, where V is the classification mapping and L is the number of classes. There is no need to specify an upper limit to L.
- A final two-to-one mapping is performed on the class index by an estimator 320 that provides the estimate for the number of quantization bits {circumflex over (B)}(G,d,q)=U(c,q) for the group of blocks (of transform coefficients), where U is the nonlinear estimation mapping and q is the quantization parameter.
- The invention facilitates the design of the feature extractor 300, the classifier 310, and the estimator 320 in such a way that the estimate {circumflex over (B)}(g1, . . . ,gN,d,q) closely approximates the actual number of quantization bits B(g1, . . . ,gN,d,q) in a statistical sense, i.e., it minimizes the expected cost:
- E[C(B(g1, . . . ,gN,d,q),{circumflex over (B)}(g1, . . . ,gN,d,q))]=∫C(B(g1, . . . ,gN,d,q),{circumflex over (B)}(g1, . . . ,gN,d,q))dp(g1, . . . ,gN)
- The minimization of the above expression for the expected cost generally requires the joint optimal design of the
feature extractor 300, theclassifier 310, and theestimator 320. However, this is not generally feasible due to the high computational complexity required to perform such a joint optimization. - A sequential design approach involves designing each one of the stages once based on the data supplied to each stage from the preceding stages.
- Through this approach, the
feature extractor 300, T, is designed with a prior knowledge of the significant part of the data in the group of chrominance and luminance blocks. The mapping also provides the most desired tradeoff between the reduction of the dimensionality of its input space and the preservation of the significant information in the group of blocks. For example, the feature extractor may yield a sample statistic such as sample variance or sample mean absolute value of the data in the group of chrominance and luminance blocks as the one dimensional feature vector. On the other hand, the rate-distortion bounds for Laplacian and Gaussians source which are commonly used for modelling the operational rate-distortion functions for the scalar quantization of DCT coefficients are parameterized by source variance. - The
classifier 310, V, is designed so that any output feature vector (obtained from the operation of the feature extractor 300) is in the domain of V and the classification operation does not lead to substantial loss of the extracted significant information (representative of the chrominance and the luminance). -
- for every possible combination of quantization parameter q and coding mode d.
- In another embodiment, the invention will be described in operation with coding for a baseline H.263 compliant bitstream and decoder. The video sequence consists of I and P pictures. I and P pictures are further partitioned into groups of four luminance and two chrominance blocks (macroblocks). A macroblock has 384 luminance and chrominance data elements. The I picture macroblocks are always intra-coded, and P picture macroblocks are either intra-coded, inter-coded, or not coded at all. Intra-coding implies that the macroblock is coded without subtracting from it a temporal prediction from the past temporally local frames. Inter-coding implies that the temporal prediction error of the macroblock is coded.
- The macroblock type, coded block pattern, and differential quantization parameter between macroblocks are coded and transmitted. Motion vector information is also coded and transmitted for inter-coded macroblocks. The cost function employed in this embodiment is determined as the square difference, given as C(a,b)=(a−b)2.
- This embodiment of the invention exemplifies how 1) the three circuits T, V and U (300, 310 and 320 respectively) can be designed sequentially; 2) how an estimate for the number of quantization bits can be obtained for a macroblock (group of blocks); and 3) how a targeted number of coding bits can be approximated for a single picture (data entity).
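The three stages T, V and U can be sketched in sequence as follows. The variance feature, the threshold classifier, and the table-driven estimator below are illustrative assumptions consistent with the description, not the patent's exact design:

```python
def extract_feature(blocks, mode):
    """T: reduce the group's data to a low-dimensional feature. Here the
    sample variance of all samples is used (mode would select between pixel
    data and prediction-error data in a full design)."""
    samples = [s for b in blocks for s in b]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

def classify(feature, thresholds=(10.0, 100.0, 1000.0)):
    """V: scalar-quantize the feature into one of L = 4 classes."""
    return sum(1 for t in thresholds if feature >= t) + 1

def estimate_bits(table, c, q):
    """U: look up the learned bit-count estimate for (class, parameter)."""
    return table[(c, q)]
```

A flat macroblock has zero variance and lands in class 1; the estimator then returns whatever bit count was learned for class 1 at the chosen quantization parameter.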
- Feature Extractor
- The feature extractor 300 operates according to the following principle. Suppose G R={g1 R, . . . ,g6 R} is the Rth macroblock to be coded. Let I(x,y) denote the intensity value at location (x,y) of a coded picture. This could represent a luminance or a chrominance intensity value, or the motion-compensated error value thereof, depending on the coding mode of the macroblock.
-
-
- Classifier
-
- In general, the quantizer employed in the classifier 310 is different than the quantizer employed in the main coding loop.
- Estimator
- The estimator 320 employs the expected value of the number of quantization bits conditioned on the class, c, and the quantization parameter, q, as the closest bit count estimate, U(c, q), for a macroblock of class c quantized with quantization parameter q. For the R'th macroblock the estimate is obtained according to the following equation:
- {circumflex over (B)}(g 1 R, . . . ,g 6 R,d R,q)=U(c R,q)=E[B(g 1, . . . ,g 6,d,q)|V(T(g 1, . . . ,g 6,d))=c R]
-
- where UR(c,q) is the estimate of the number of quantization bits for a macroblock of class c quantized with quantization parameter q, prior to the R'th macroblock, and Pc,q X is the number of macroblocks, prior to and including the X'th macroblock, which are of class c and are coded with parameter q.
- The estimate UR(c,q) changes with the number of coded macroblocks. In order to refrain from repeating the summation when R is large, an update form of the above equation is used that is given by the following:
- U R(c,q)=U kZ(c,q) for kZ<R≦(k+1)Z
-
- where the update term in the second recursive equation is a sum over Pc,q kZ−Pc,q (k−1)Z macroblocks.
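The batched update of UR(c,q) can be sketched as a running conditional mean that is folded in only once every Z macroblocks, so the full summation is never repeated. The key layout and fold-in schedule below are illustrative:

```python
class BitCountEstimator:
    """Running conditional mean of observed quantization bits per
    (class, q) pair, refreshed once every Z macroblocks (a sketch of the
    update form described in the text)."""
    def __init__(self, z, initial=0.0):
        self.z, self.initial = z, initial
        self.table = {}     # (c, q) -> (count, mean) currently in use
        self.pending = {}   # (c, q) -> (count, sum) observed since last refresh
        self.seen = 0

    def estimate(self, c, q):
        return self.table.get((c, q), (0, self.initial))[1]

    def observe(self, c, q, bits):
        n, s = self.pending.get((c, q), (0, 0.0))
        self.pending[(c, q)] = (n + 1, s + bits)
        self.seen += 1
        if self.seen % self.z == 0:          # end of a picture: fold batch in
            for key, (dn, ds) in self.pending.items():
                n0, m0 = self.table.get(key, (0, 0.0))
                self.table[key] = (n0 + dn, (n0 * m0 + ds) / (n0 + dn))
            self.pending.clear()
```

Estimates stay frozen within a picture and jump to the new conditional mean at the picture boundary, matching the kZ < R ≤ (k+1)Z schedule.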
- FIG. 3 is a detailed circuit diagram of the estimator 320 with a look-up table for a memory write operation that occurs during the training of the estimator 320. This function of the estimator 320 illustrates the computation of the estimated number of quantization bits for the group of macroblocks.
- where Pc,q max is a threshold. In a preferred embodiment, the value of Z shown in FIG. 3 is set equal to the number of macroblocks in a picture. Further, it is preferred that the actual number of quantization bits observed for a particular class and the quantization parameters are used to determine the estimates for that class and quantization parameters.
- Optimal Macroblock/Quantization Parameter Pairing
- The present invention determines a combination of quantization parameters for the groups of blocks comprising a data entity prior to the coding of the groups of blocks so that the sum of the estimates for the number of coding bits of all the groups of blocks closely approximates the targeted number of coding bits for the data entity. This is performed by initiating an exhaustive search over the set of all possible combinations of quantization parameters.
- In order to reduce the complexity of such an exhaustive search, it is preferred that the set of all possible combinations be restricted to the set of combinations for which the first, Z0, of the Z macroblocks are quantized with a quantization parameter of q, and the remaining Z−Z0 macroblocks are quantized with a quantization parameter of q±1 where the sign is alternated from frame to frame. This restriction is based on the assumption that in order to achieve near optimal coding performance, the quantization parameter should not be varied greatly across a picture.
- The method for obtaining the optimum pair q, Z0 is described here for the case with the positive sign (i.e. Z−Z0 macroblocks quantized with q+1). The search is initialized by setting the quantization parameter to the largest value allowed by the video coding standard for all macroblocks of the picture. For example, in the H.263 video compression standard, the initialization is performed as:
- Q R=31 for kZ<R≦(k+1)Z
- The macroblocks are scanned in the raster-scan order. That is, a picture is scanned by scanning each row from left to right, scanning the row below it after it is completed. The picture scan order is applied repetitively; that is, the last macroblock of the last row is followed by the first macroblock of the first row. Only the quantization parameter of the current (scanned) macroblock is decremented. The new quantization parameter and class of the current macroblock are mapped to a new bit count estimate for the current macroblock.
- The bitcount estimate is overridden by an estimate of zero if the macroblock is deemed not to be coded. The decision of whether or not to code is made by comparing a feature derived from the data of the macroblock against a threshold. Preferably, this feature is taken to be the sample frame difference replenishment (temporal prediction with a zero motion vector) error variance of the luminance and chrominance values of the macroblock.
- A macroblock is not coded if the inequality (1/384)·Σ j∈{1, . . . ,6} Σ (x,y)∈g j R (I FD(x,y))^2 < q^2/3
- is satisfied where IFD(x,y) is the frame difference replenishment error. Otherwise the macroblock is coded.
- A block diagram of the system performing the search for the combination of quantization parameters achieving the target number of bits BTR, described above, is shown in FIG. 4. The block diagram shows how the above calculation is determined from the scan order generator 42. When the target number of bits BTR is reached, the signal from the comparator 40 switches OFF the search process and directs the {QR} to the encoder 41.
- The corresponding flowchart for the operation of the system shown in FIG. 4 is illustrated in FIG. 5. FIG. 5 shows a flowchart for the approximation of the targeted number of coding bits, BTR, by the estimation of the number of quantization bits.
- Initially, the quantization parameters are initialized for the macroblocks and the first macroblock in the scan order becomes the current macroblock (step S1).
- Next a query determines whether the macroblock is coded with the current quantization parameter (step S2). If the macroblock is coded with the current quantization parameter, then a bitcount estimate is performed (step S3), and another query determines whether the targeted bitcount, BTR, is reached or exceeded (step S4). If the targeted bitcount has been reached or exceeded, the macroblocks are encoded with the final set of quantization parameters {QR} (step S5).
- If the result of either the query performed in step S2 or the query performed in step S4 is NO, then an additional query determines whether the macroblock is at the end of the scan order (step S6). If the macroblock is at the end of the scan order, then the first macroblock in the scan order becomes the current macroblock (step S7) and the quantization parameter of the current macroblock is updated (decremented) (step S8), at which time the process continues with the query of step S2.
- If the macroblock is not at the end of the scan order, then the next macroblock in the scan order becomes the current macroblock (step S9) and the quantization parameter of the current macroblock is updated (decremented) (step S8), at which time the process continues with the query of step S2. The system of FIG. 2 is used only in step S3. Since the quantization parameter of the current macroblock is changed, a new bitcount estimate for the current macroblock is obtained by using the system of FIG. 2. Note that during initialization (path S1→S2→S3→S4), steps 300, 310 and 320 may need to be performed. If the class information is stored in memory as suggested in FIG. 4, only step 320 needs to be performed at a later time (path S8→S2→S3→S4).
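Steps S1 through S9 can be sketched as a single loop. Here `bit_estimate` and `coded` are hypothetical callables standing in for the FIG. 2 estimation system and the coding decision of step S2:

```python
def search_quantizers(num_mbs, bit_estimate, coded, target_bits,
                      q_max=31, q_min=1):
    """Greedy search sketched from FIG. 5: initialize every quantizer at
    q_max (step S1), then sweep the cyclic scan order decrementing the
    current macroblock's quantizer until the estimated picture total
    reaches target_bits (step S4)."""
    Q = [q_max] * num_mbs
    est = [bit_estimate(mb, q_max) if coded(mb) else 0
           for mb in range(num_mbs)]                      # initial pass
    mb = 0
    while sum(est) < target_bits and any(q > q_min for q in Q):
        if Q[mb] > q_min:
            Q[mb] -= 1                                    # step S8
            if coded(mb):                                 # step S2
                est[mb] = bit_estimate(mb, Q[mb])         # step S3
        mb = (mb + 1) % num_mbs                           # steps S6/S7/S9
    return Q
```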
- FIG. 6 is a block diagram of a video coder 60 incorporating the rate-control method of the current invention 600. The addition of 600 is the primary difference between the video coder 60 of FIG. 6 and the video coder 100 of the prior art of FIG. 1A.
- As a result of the rate control method incorporated into the video coder 60, the present invention derives a model of the relation between the number of bits used by the quantizer to quantize a group of blocks and the quantization parameter for that group, given the spatial data content of the group and the coding mode. The invention uses the model to precisely estimate the number of bits that will be expended for the quantization of a future group of blocks for a chosen quantization parameter, a known spatial data content, and a known coding mode.
- The rate control method of the present invention differs from that of the MPEG-4 video standard, and other similar video standards, in that the spatial data content of the group of blocks, as well as its coding mode, is factored into the estimation process by the utilization of features extracted from the data.
- Unlike the rate control method of MPEG-4, or other similar standards, in which the quantization parameter is mapped to the estimate of the number of coding bits by a continuous function with few degrees of freedom, the rate control method of the current invention maps each unique pair of feature class and quantization parameter to a unique estimate of the number of coding bits.
- In the current invention, the estimate for a particular class and quantization parameter is designed and updated by using the actual number of coding bits observed for previously coded data entities (groups of blocks) that are mapped to that class and quantized with that parameter.
- In the MPEG-4 standard, or other similar standards, the parameters of the rate-distortion function of the MPEG-4 rate control method are designed and updated by using the quantization parameter and the actual number of coding bits observed for all the previously coded data entities.
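The per-(class, quantizer) update described above, where each table entry tracks the actual bit counts of previously coded groups of blocks, can be sketched as a running mean (the table structure and default value are illustrative assumptions):

```python
from collections import defaultdict

class BitEstimateTable:
    """Each (class, q) pair maps to the running mean of the actual bit
    counts observed for groups of blocks that fell in that class and
    were quantized with that parameter."""
    def __init__(self):
        self.sums = defaultdict(int)
        self.counts = defaultdict(int)

    def update(self, c, q, actual_bits):
        """Record the actual bit count of a just-coded group of blocks."""
        self.sums[(c, q)] += actual_bits
        self.counts[(c, q)] += 1

    def estimate(self, c, q, default=0):
        """Estimated quantization bits for class c at quantizer q."""
        n = self.counts[(c, q)]
        return self.sums[(c, q)] / n if n else default
```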
- The above-described embodiment is described merely as one possible realization of the design, estimation and control methods in a general framework, and is not meant to be limiting. The invention is also capable of being practiced according to additional embodiments.
Claims (27)
1. A method for estimating the number of bits output from a video coder given a known spatial data content, G={g1, . . . ,gN}, of a group of luminance and chrominance blocks, and a known coding mode, d, where d represents the index of said coding mode, the method comprising the steps of:
(a) extracting a significant part of said spatial data content, G, in relation to said coding mode, d, to yield a feature vector F, said feature vector representing statistics and signal components of the luminance and chrominance data of said luminance and chrominance blocks;
(b) mapping said feature vector to yield a class index, c, for said respective group of luminance and chrominance blocks;
(c) mapping said class index, c, in relation to a quantization parameter, q, where said quantization parameter controls the scale of quantizer bin size, to an estimate of the number of quantization bits for said group of luminance and chrominance blocks; and
(d) determining an estimated total number of coding bits for said group of luminance and chrominance blocks from the combination of said estimated number of quantization bits and an estimated number of overhead bits, wherein said overhead bits represent the additional bits expended to represent respective portions of the bitstream.
2. The method of claim 1 , wherein said class index mapping step is performed by a two-to-one mapping.
3. The method of claim 1 , wherein said extracting step comprises the following steps:
(a) assigning a first predetermined feature representative of the coding mode to one component of said feature vector; and
(b) computing a second feature representative of said spatial content data and assigning said second feature to one component of said feature vector.
4. The method of claim 3 , wherein said computing step determines said second feature according to the following equation:
where Ij represents the mean of the j-th block (j∈{1, . . . ,N}), and is defined as
with I representing the value of either luminance or chrominance (d=1) or the motion compensated value thereof (d=0), |.| denoting the cardinality of its operand and L≧1.
5. The method of claim 1 , wherein said class index mapping step operates with a uniform scalar quantizer.
6. The method of claim 1 , wherein said estimator mapping step determines said estimated number of quantization bits according to the following equation:
{circumflex over (B)}(g 1 , . . . ,g N ,d,q)=U(c,q)=E[B(g 1 , . . . ,g N ,d,q)|V(T(g 1 , . . . ,g N ,d))=c]
7. The method of claim 6 , wherein the expected value in the equation is further estimated from the actual number of quantization bits for previously encoded groups of blocks.
13. A method for assigning quantization parameters to the groups of blocks of a picture comprising the steps of:
i. setting the quantization parameters of all groups of blocks of the picture equal to the largest value allowed by the video coding standard;
ii. scanning said groups of blocks according to a certain scanning order, where the last group of blocks in the scanning order is followed by the first group of blocks;
iii. determining whether to code the next group of blocks in said scanning order with the quantization parameter for the group of blocks;
iv. decrementing the quantization parameter of said group of blocks;
v. repeating steps (ii)-(iv) until the sum of the estimates for the number of coding bits of all of said groups of blocks exceeds the targeted number of coding bits, BTR, for the picture.
14. The method of claim 13 , wherein the first, Z0, of a number Z of groups of blocks are quantized with a quantization parameter of q, and the remaining number, Z−Z0, of groups of blocks are quantized with a quantization parameter of q+1.
17. A signal coding apparatus, comprising:
(a) partitioning means for dividing a field of data into a plurality of data groups (macroblocks);
(b) transform means for encoding respective ones of said plurality of data groups, said data groups represented by respective transform coefficients;
(c) a quantizing means for compressing said respective transform coefficients representing said plurality of data groups;
(d) a compressing means for further compressing said quantized transform coefficients; and
(e) a rate control means for mapping each unique pair of a class of features of said groups of data, and a quantization parameter to a unique estimate for a number of coding bits.
18. The apparatus of claim 17 , wherein said features of said groups of data comprise data indicating pixel luminance intensity values and corresponding pixel chrominance intensity values.
19. The apparatus of claim 17 , wherein said transform means comprises a two-dimensional orthogonal transform.
20. The apparatus of claim 17 , wherein said compressing means comprises a run-length coder and a variable length coder.
21. The apparatus of claim 19 , wherein said orthogonal transform comprises a discrete cosine transform operating on one of the intensity values of the pixels of a group of data, and the error of the temporal prediction from one or more temporally local groups of data.
22. The apparatus of claim 17 , wherein said quantizing means comprises a uniform scalar quantizer.
23. A method for estimating the number of bits output from a video coder given a known spatial data content, G={g1, . . . ,gN}, of a group of luminance and chrominance blocks, and a known coding mode, d, where d represents the index of said coding mode, the method comprising the steps of:
(a) extracting a significant part of said spatial data content, G, in relation to said coding mode, d, to yield a feature vector F, said feature vector representing statistics and signal components of the luminance and chrominance data of said luminance and chrominance blocks;
(b) mapping said feature vector to yield a class index, c, for said respective group of luminance and chrominance blocks; and
(c) mapping said class index, c, in relation to a quantization parameter, q, where said quantization parameter controls the scale of quantizer bin size, to an estimate of the number of coding bits for said group of luminance and chrominance blocks, wherein said coding bits comprise the quantization and overhead bits and said overhead bits represent the additional bits expended to represent respective portions of the bitstream.
24. A method for assigning quantization parameters to the groups of blocks of a picture comprising the steps of:
(a) setting the quantization parameters of all groups of blocks of the picture equal to the smallest value allowed by the video coding standard;
(b) scanning said groups of blocks according to a certain scanning order, where the last group of blocks in the scanning order is followed by the first group of blocks;
(c) determining whether to code the next group of blocks in said scanning order with the quantization parameter for the group of blocks;
(d) incrementing the quantization parameter of said group of blocks;
(e) repeating steps (b)-(d) until the sum of the estimates for the number of coding bits of all of said groups of blocks falls below the targeted number of coding bits, BTR, for the picture.
25. The method of claim 24 , wherein the first, Z0, of a number Z of groups of blocks are quantized with a quantization parameter of q, and the remaining number, Z−Z0, of groups of blocks are quantized with a quantization parameter of q−1.
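The estimation pipeline recited in claims 1 and 23, steps (a) through (d), can be sketched as a composition of mappings; the `extract`, `classify`, `table`, and `overhead` callables are hypothetical stand-ins for the claimed feature extraction, classification, estimate table, and overhead model:

```python
def estimate_total_bits(G, d, q, extract, classify, table, overhead):
    """Sketch of the claimed estimation method for a group of blocks G
    with coding mode d and quantization parameter q."""
    F = extract(G, d)                    # (a) data + mode -> feature vector
    c = classify(F)                      # (b) feature vector -> class index
    quant_bits = table(c, q)             # (c) (class, q) -> quantization bits
    return quant_bits + overhead(G, d)   # (d) add estimated overhead bits
```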
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/617,625 US20040105586A1 (en) | 1999-10-21 | 2003-07-10 | Method and apparatus for estimating and controlling the number of bits output from a video coder |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/425,274 US6639942B1 (en) | 1999-10-21 | 1999-10-21 | Method and apparatus for estimating and controlling the number of bits |
US10/617,625 US20040105586A1 (en) | 1999-10-21 | 2003-07-10 | Method and apparatus for estimating and controlling the number of bits output from a video coder |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/425,274 Division US6639942B1 (en) | 1999-10-21 | 1999-10-21 | Method and apparatus for estimating and controlling the number of bits |
Publications (1)
Publication Number | Publication Date |
---|---|
US20040105586A1 true US20040105586A1 (en) | 2004-06-03 |
Family
ID=29251291
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/425,274 Expired - Lifetime US6639942B1 (en) | 1999-10-21 | 1999-10-21 | Method and apparatus for estimating and controlling the number of bits |
US10/617,625 Abandoned US20040105586A1 (en) | 1999-10-21 | 2003-07-10 | Method and apparatus for estimating and controlling the number of bits output from a video coder |
US10/618,344 Expired - Fee Related US7272181B2 (en) | 1999-10-21 | 2003-07-10 | Method and apparatus for estimating and controlling the number of bits output from a video coder |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/425,274 Expired - Lifetime US6639942B1 (en) | 1999-10-21 | 1999-10-21 | Method and apparatus for estimating and controlling the number of bits |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/618,344 Expired - Fee Related US7272181B2 (en) | 1999-10-21 | 2003-07-10 | Method and apparatus for estimating and controlling the number of bits output from a video coder |
Country Status (1)
Country | Link |
---|---|
US (3) | US6639942B1 (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040022518A1 (en) * | 2002-07-30 | 2004-02-05 | Kabushiki Kaisha Toshiba | Picture data recording method, picture data storage medium, picture data playback apparatus, and picture data playback method |
US20040136458A1 (en) * | 2001-11-30 | 2004-07-15 | Achim Dahlhoff | Method for conducting a directed prediction of an image block |
US20060018552A1 (en) * | 2004-07-08 | 2006-01-26 | Narendranath Malayath | Efficient rate control techniques for video encoding |
US20060176953A1 (en) * | 2005-02-04 | 2006-08-10 | Nader Mohsenian | Method and system for video encoding with rate control |
US20080317116A1 (en) * | 2006-08-17 | 2008-12-25 | Samsung Electronics Co., Ltd. | Method, medium, and system compressing and/or reconstructing image information with low complexity |
US20100201549A1 (en) * | 2007-10-24 | 2010-08-12 | Cambridge Silicon Radio Limited | Bitcount determination for iterative signal coding |
US20140140410A1 (en) * | 2012-06-29 | 2014-05-22 | Wenhao Zhang | Systems, methods, and computer program products for scalable video coding based on coefficient sampling |
US20150229952A1 (en) * | 2011-11-07 | 2015-08-13 | Infobridge Pte. Ltd. | Method of decoding video data |
US10516898B2 (en) | 2013-10-10 | 2019-12-24 | Intel Corporation | Systems, methods, and computer program products for scalable video coding based on coefficient sampling |
Families Citing this family (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
ES2282307T3 (en) * | 2000-05-31 | 2007-10-16 | Thomson Licensing | DEVICE AND PROCEDURE OF VIDEO CODING WITH RECURSIVE FILTER COMPENSATED IN MOTION. |
JP4224662B2 (en) * | 2000-08-09 | 2009-02-18 | ソニー株式会社 | Image encoding apparatus and method, image decoding apparatus and method, and image processing apparatus |
US6904094B1 (en) * | 2000-09-20 | 2005-06-07 | General Instrument Corporation | Processing mode selection for channels in a video multi-processor system |
US6763068B2 (en) * | 2001-12-28 | 2004-07-13 | Nokia Corporation | Method and apparatus for selecting macroblock quantization parameters in a video encoder |
US7418037B1 (en) * | 2002-07-15 | 2008-08-26 | Apple Inc. | Method of performing rate control for a compression system |
US7769084B1 (en) | 2002-07-15 | 2010-08-03 | Apple Inc. | Method for implementing a quantizer in a multimedia compression and encoding system |
US7804897B1 (en) | 2002-12-16 | 2010-09-28 | Apple Inc. | Method for implementing an improved quantizer in a multimedia compression and encoding system |
US7940843B1 (en) | 2002-12-16 | 2011-05-10 | Apple Inc. | Method of implementing improved rate control for a multimedia compression and encoding system |
EP1641269A4 (en) * | 2003-06-30 | 2011-04-27 | Mitsubishi Electric Corp | Image encoding device and image encoding method |
US8094720B2 (en) * | 2003-08-25 | 2012-01-10 | Agency For Science Technology And Research | Mode decision for inter prediction in video coding |
US20060062478A1 (en) * | 2004-08-16 | 2006-03-23 | Grandeye, Ltd., | Region-sensitive compression of digital video |
WO2006070787A1 (en) * | 2004-12-28 | 2006-07-06 | Nec Corporation | Moving picture encoding method, device using the same, and computer program |
US7970219B2 (en) * | 2004-12-30 | 2011-06-28 | Samsung Electronics Co., Ltd. | Color image encoding and decoding method and apparatus using a correlation between chrominance components |
US7974193B2 (en) | 2005-04-08 | 2011-07-05 | Qualcomm Incorporated | Methods and systems for resizing multimedia content based on quality and rate information |
US8126283B1 (en) | 2005-10-13 | 2012-02-28 | Maxim Integrated Products, Inc. | Video encoding statistics extraction using non-exclusive content categories |
US8149909B1 (en) | 2005-10-13 | 2012-04-03 | Maxim Integrated Products, Inc. | Video encoding control using non-exclusive content categories |
US8081682B1 (en) | 2005-10-13 | 2011-12-20 | Maxim Integrated Products, Inc. | Video encoding mode decisions according to content categories |
US20070201388A1 (en) * | 2006-01-31 | 2007-08-30 | Qualcomm Incorporated | Methods and systems for resizing multimedia content based on quality and rate information |
US8792555B2 (en) * | 2006-01-31 | 2014-07-29 | Qualcomm Incorporated | Methods and systems for resizing multimedia content |
CN101553988B (en) * | 2006-12-14 | 2012-10-17 | 日本电气株式会社 | Video encoding method, video encoding device, and video encoding program |
CN100566427C (en) * | 2007-07-31 | 2009-12-02 | 北京大学 | The choosing method and the device that are used for the intraframe predictive coding optimal mode of video coding |
WO2009157827A1 (en) * | 2008-06-25 | 2009-12-30 | Telefonaktiebolaget L M Ericsson (Publ) | Row evaluation rate control |
US8374442B2 (en) * | 2008-11-19 | 2013-02-12 | Nec Laboratories America, Inc. | Linear spatial pyramid matching using sparse coding |
US9172960B1 (en) * | 2010-09-23 | 2015-10-27 | Qualcomm Technologies, Inc. | Quantization based on statistics and threshold of luminance and chrominance |
US8737464B1 (en) * | 2011-07-21 | 2014-05-27 | Cisco Technology, Inc. | Adaptive quantization for perceptual video coding |
US9042073B2 (en) | 2012-03-16 | 2015-05-26 | Eaton Corporation | Electrical switching apparatus with embedded arc fault protection and system employing same |
CN105426929B (en) * | 2014-09-19 | 2018-11-27 | 佳能株式会社 | Object shapes alignment device, object handles devices and methods therefor |
CA3137206A1 (en) * | 2019-04-26 | 2020-10-29 | Sergey Yurievich IKONIN | Method and apparatus for signaling of mapping function of chroma quantization parameter |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5214507A (en) * | 1991-11-08 | 1993-05-25 | At&T Bell Laboratories | Video signal quantization for an mpeg like coding environment |
US5291282A (en) * | 1990-04-19 | 1994-03-01 | Olympus Optical Co., Ltd. | Image data coding apparatus and method capable of controlling amount of codes |
US5305115A (en) * | 1989-08-05 | 1994-04-19 | Matsushita Electric Industrial Co., Ltd. | Highly efficient picture coding method with motion-adaptive zonal sampling enabling optimized image signal compression |
US5333012A (en) * | 1991-12-16 | 1994-07-26 | Bell Communications Research, Inc. | Motion compensating coder employing an image coding control method |
US5416604A (en) * | 1992-05-27 | 1995-05-16 | Samsung Electronics Co., Ltd. | Image compression method for bit-fixation and the apparatus therefor |
US5631644A (en) * | 1993-12-22 | 1997-05-20 | Sharp Kabushiki Kaisha | Image encoding apparatus |
US5959675A (en) * | 1994-12-16 | 1999-09-28 | Matsushita Electric Industrial Co., Ltd. | Image compression coding apparatus having multiple kinds of coefficient weights |
US5986710A (en) * | 1996-06-26 | 1999-11-16 | Samsung Electronics Co., Ltd. | Image encoding method and apparatus for controlling the number of bits generated using quantization activities |
US6044115A (en) * | 1996-12-13 | 2000-03-28 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus for coding and decoding digital image data using image quantization modification |
US6046774A (en) * | 1993-06-02 | 2000-04-04 | Goldstar Co., Ltd. | Device and method for variable length coding of video signals depending on the characteristics |
US6081781A (en) * | 1996-09-11 | 2000-06-27 | Nippon Telegraph And Telephone Corporation | Method and apparatus for speech synthesis and program recorded medium |
US6144763A (en) * | 1997-03-24 | 2000-11-07 | Fuji Photo Film Co., Ltd. | Method and apparatus for compression coding of image data representative of a color image and digital camera including the same |
US6330369B1 (en) * | 1998-07-10 | 2001-12-11 | Avid Technology, Inc. | Method and apparatus for limiting data rate and image quality loss in lossy compression of sequences of digital images |
US6430222B1 (en) * | 1998-08-31 | 2002-08-06 | Sharp Kabushiki Kaisha | Moving picture coding apparatus |
US6496607B1 (en) * | 1998-06-26 | 2002-12-17 | Sarnoff Corporation | Method and apparatus for region-based allocation of processing resources and control of input image formation |
US6507616B1 (en) * | 1998-10-28 | 2003-01-14 | Lg Information & Communications, Ltd. | Video signal coding method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100495716B1 (en) * | 1996-04-12 | 2005-11-25 | 소니 가부시끼 가이샤 | Apparatus and method for encoding images and medium in which image encoding program has been recorded |
JP3356629B2 (en) * | 1996-07-15 | 2002-12-16 | 日本電気株式会社 | Method of manufacturing lateral MOS transistor |
-
1999
- 1999-10-21 US US09/425,274 patent/US6639942B1/en not_active Expired - Lifetime
-
2003
- 2003-07-10 US US10/617,625 patent/US20040105586A1/en not_active Abandoned
- 2003-07-10 US US10/618,344 patent/US7272181B2/en not_active Expired - Fee Related
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040136458A1 (en) * | 2001-11-30 | 2004-07-15 | Achim Dahlhoff | Method for conducting a directed prediction of an image block |
US7379499B2 (en) * | 2001-11-30 | 2008-05-27 | Robert Bosch Gmbh | Method for conducting a directed prediction of an image block |
US7423672B2 (en) * | 2002-07-30 | 2008-09-09 | Kabushiki Kaisha Toshiba | Picture data recording method, picture data storage medium, picture data playback apparatus, and picture data playback method |
US20040022518A1 (en) * | 2002-07-30 | 2004-02-05 | Kabushiki Kaisha Toshiba | Picture data recording method, picture data storage medium, picture data playback apparatus, and picture data playback method |
US20060018552A1 (en) * | 2004-07-08 | 2006-01-26 | Narendranath Malayath | Efficient rate control techniques for video encoding |
US7606427B2 (en) | 2004-07-08 | 2009-10-20 | Qualcomm Incorporated | Efficient rate control techniques for video encoding |
US20060176953A1 (en) * | 2005-02-04 | 2006-08-10 | Nader Mohsenian | Method and system for video encoding with rate control |
US9232221B2 (en) | 2006-08-17 | 2016-01-05 | Samsung Electronics Co., Ltd. | Method, medium, and system compressing and/or reconstructing image information with low complexity |
US20080317116A1 (en) * | 2006-08-17 | 2008-12-25 | Samsung Electronics Co., Ltd. | Method, medium, and system compressing and/or reconstructing image information with low complexity |
US9554135B2 (en) | 2006-08-17 | 2017-01-24 | Samsung Electronics Co., Ltd. | Method, medium, and system compressing and/or reconstructing image information with low complexity |
US8705635B2 (en) * | 2006-08-17 | 2014-04-22 | Samsung Electronics Co., Ltd. | Method, medium, and system compressing and/or reconstructing image information with low complexity |
US20100201549A1 (en) * | 2007-10-24 | 2010-08-12 | Cambridge Silicon Radio Limited | Bitcount determination for iterative signal coding |
US8217811B2 (en) * | 2007-10-24 | 2012-07-10 | Cambridge Silicon Radio Limited | Bitcount determination for iterative signal coding |
US20150229952A1 (en) * | 2011-11-07 | 2015-08-13 | Infobridge Pte. Ltd. | Method of decoding video data |
US9641860B2 (en) * | 2011-11-07 | 2017-05-02 | Infobridge Pte. Ltd. | Method of decoding video data |
US20140140410A1 (en) * | 2012-06-29 | 2014-05-22 | Wenhao Zhang | Systems, methods, and computer program products for scalable video coding based on coefficient sampling |
US9955154B2 (en) * | 2012-06-29 | 2018-04-24 | Intel Corporation | Systems, methods, and computer program products for scalable video coding based on coefficient sampling |
US10516898B2 (en) | 2013-10-10 | 2019-12-24 | Intel Corporation | Systems, methods, and computer program products for scalable video coding based on coefficient sampling |
Also Published As
Publication number | Publication date |
---|---|
US6639942B1 (en) | 2003-10-28 |
US20040105491A1 (en) | 2004-06-03 |
US7272181B2 (en) | 2007-09-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6639942B1 (en) | Method and apparatus for estimating and controlling the number of bits | |
US11089311B2 (en) | Parameterization for fading compensation | |
US6993078B2 (en) | Macroblock coding technique with biasing towards skip macroblock coding | |
US7974340B2 (en) | Adaptive B-picture quantization control | |
US7463684B2 (en) | Fading estimation/compensation | |
EP1359770B1 (en) | Signaling for fading compensation in video encoding | |
US6687296B1 (en) | Apparatus and method for transforming picture information | |
WO2008140949A1 (en) | Methods and systems for rate-distortion optimized quantization of transform blocks in video encoding | |
WO2004038921A2 (en) | Method and system for supercompression of compressed digital video | |
US5844607A (en) | Method and apparatus for scene change detection in digital video compression | |
KR20010071689A (en) | Image processing circuit and method for modifying a pixel value | |
KR20050122275A (en) | System and method for rate-distortion optimized data partitioning for video coding using parametric rate-distortion model | |
US6823015B2 (en) | Macroblock coding using luminance date in analyzing temporal redundancy of picture, biased by chrominance data | |
Frimout et al. | Forward rate control for MPEG recording | |
JP4532607B2 (en) | Apparatus and method for selecting a coding mode in a block-based coding system | |
Chung et al. | A new approach to scalable video coding | |
McVeigh et al. | Comparative study of partial closed-loop versus open-loop motion estimation for coding of HDTV | |
Milicevic et al. | RD optimization and skip prediction for H. 264/AVC standard | |
KR0174444B1 (en) | Motion compensated apparatus for very low speed transmission | |
EP1746840A2 (en) | Parameterization for fading compensation | |
Pereira | Video coding in a broadcast environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: TOSHIBA AMERICA ELECTRONIC COMPONENTS, INC., CALIF Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BAYAZIT, ULUG;REEL/FRAME:019755/0927 Effective date: 19991214 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |