WO2001033860A1

WO2001033860A1 - Improved cascaded compression method and system for digital video and images

Info

Publication number: WO2001033860A1
Application number: PCT/EP2000/010158
Authority: WO
Inventors: Santhana Krishnamachari
Original assignee: Koninklijke Philips Electronics N.V.
Priority date: 1999-11-03
Filing date: 2000-10-13
Publication date: 2001-05-10
Also published as: KR100744442B1; CN1186940C; EP1145562A1; JP2003513563A; CN1342369A; KR20010089765A

Abstract

A system and method are disclosed for reducing quantization error in a system using a cascaded compression scheme. An expected quantization error introduced by the cascaded compression scheme is determined. The expected quantization errors based upon two or more second or higher stage test quantizers are then compared. Based on this comparison, a quantizer for the second or higher stage of compression is selected to reduce and/or minimize the quantization error caused by the cascaded compression scheme.

Description

Improved Cascaded Compression Method and System for Digital Video and Images.

FIELD OF THE INVENTION

The present invention pertains generally to the field of video/image compression, and in particular, the invention relates to systems and methods for reducing the quantization error introduced in cascaded compression of digital video and images.

BACKGROUND OF THE INVENTION

The videophone, digital television, teleconferencing, and the information highway are just a few of the elements of the emerging digital age. Developments in the processing of digital images and video have aided the progression into the digital age. In particular, digital image compression methods have played a key role in this evolution. Image compression reduces the amount of data required to represent a digital image. For example, color, grey scale, or binary images may be compressed and then decompressed to yield an accurate representation of the original image.

Compression is usually performed before storage or transmission of the data. This allows vast amounts of information to be stored in an economical manner and/or transferred quickly. As should be clear, image compression is usually a two-way process involving compression and decompression. These processes may not be symmetrical, i.e., the time taken and/or computing power required for one process may differ from the other, depending on the type of compression algorithm used. There are generally two types of image compression: lossy and lossless. In lossy compression, the decompressed image is similar to, but not exactly the same as, the original image. This is because at least a portion of the original data has been changed or discarded. Lossy compression techniques include subsampling, differential pulse code modulation (DPCM), and quantization of discrete cosine transform (DCT) coefficients. Lossless compression, on the other hand, retains all of the data of the original image, i.e., it is essentially a completely reversible coding process. Lossless compression techniques include variable-length coding (VLC) and run-length coding (RLC).

The compression ratio is typically defined as the ratio between the data content to be compressed and the data that results after compression. Lossy compression methods can provide compression ratios over 100:1, whereas lossless compression methods generally achieve ratios of approximately 3:1. The trade-off is that, in general, as the lossy compression ratio increases, the degradation of the image also increases.

The compression ratio may be achieved by one stage of compression or multiple cascaded stages of compression. An image or video source undergoes cascaded compression when the input source signal is subjected to multiple stages of compression in a serial manner. For example, in cascaded compression, the source signal (image or video) is first compressed to a particular compression ratio and this compressed data is then further subjected to a second or more level(s) of compression to achieve a higher compression ratio. In practice, the higher compression may be required for efficient storage or bandwidth restricted transmission.

The Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG) standards are the most prevalently used compression schemes for images and video respectively. The JPEG standard is intended for compression of color or grey-scale images of natural real-world scenes. While the JPEG standard includes both a lossless and a lossy mode, it is usually used in lossy mode to achieve a greater compression ratio. Typically, an image is transformed to the frequency domain using a discrete cosine transform (DCT). The resulting smaller-valued frequency components are rejected, which leaves behind the larger-valued components. These larger-valued components are then differential pulse code modulation (DPCM) coded and Huffman coded. The adjustable nature of JPEG compression allows for variable compression ratios and for fine-tuning the algorithm to a particular application's requirements.

The MPEG standard uses DCT and Huffman coding methods in conjunction with interframe coding techniques which are used to yield better compression ratios. MPEG-1 and MPEG-2 are typically used for low-resolution image sequences and higher-resolution sequences respectively. MPEG-4 focuses on unified audio-visual objects and scenes rather than frames. MPEG-7 aids in the location of audio-visual content.

The above compression techniques utilize a DCT transformation followed by a quantization of the DCT coefficients and a variable length coding to achieve data compression. The quantization of the DCT coefficients makes these compression techniques lossy. As discussed above, a lossy compression scheme is one in which the decompressed data is not an exact replica of the original data. For example, in the JPEG or MPEG compression schemes, each stage of compression in the cascaded compression is lossy in nature. In addition, performing multiple stages of compression in cascade introduces additional loss.

To illustrate this additional loss, reference is made to Fig. 1, which shows two lossy compression scenarios (a) and (b). In scenario (a), source data 10 is compressed to a ratio of 20:1 by compression system 11. In scenario (b), the source data 10 first undergoes a first stage of compression of 10:1, followed by a second stage of 2:1 compression by cascaded compression system 12. In scenario (b), the second compression stage does not have access to the original source data 10, but only to the compressed signal obtained as the output of the first compression stage. Nevertheless, both of the scenarios (a) and (b) achieve the same 20:1 compression. However, the mean square error (MSE) introduced in scenario (b) will always be greater than or equal to that of scenario (a) because of the cascaded compression. This additional error is due in part to the selection of quantizer values in the second or higher level compression stages.
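The comparison between scenarios (a) and (b) can be illustrated with a minimal Python sketch. It is not part of the disclosed system: the compression stages are reduced to plain uniform quantizers, and the step sizes 3 and 5, the sample grid and the function names are assumptions chosen only for illustration.

```python
import math

def quantize(x, step):
    """Uniform quantizer: nearest multiple of step (ties rounded up)."""
    return math.floor(x / step + 0.5) * step

def mse(samples, outputs):
    """Mean square error between the inputs and their quantized values."""
    return sum((s - o) ** 2 for s, o in zip(samples, outputs)) / len(samples)

q1, q2 = 3.0, 5.0  # hypothetical first- and second-stage step sizes
samples = [i / 100.0 for i in range(3000)]  # evenly spaced values in [0, 30)

# Scenario (a): one stage using only the coarse quantizer.
single = [quantize(x, q2) for x in samples]
# Scenario (b): the second stage sees only the first stage's output.
cascaded = [quantize(quantize(x, q1), q2) for x in samples]

mse_single = mse(samples, single)
mse_cascaded = mse(samples, cascaded)
print(f"single-stage MSE: {mse_single:.4f}")
print(f"cascaded MSE:     {mse_cascaded:.4f}")
```

Both paths end on a multiple of the second step size, but only the single-stage path is guaranteed to pick the nearest one, which is why the cascaded MSE cannot be smaller in this sketch.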

Some conventional cascaded compression systems use a non-degenerative set of quantizing factors for successive generations of compression. The set is selected using numerical relations such as q value = K*(3**n). If one of the quantizers is used in the first generation, then any of the other quantizers can be used in the subsequent generation. However, these systems do not provide any insight into how to choose quantizers in the subsequent generation that will minimize the cascading quantization error. There thus exists in the art a need for improved systems, methods and techniques to minimize the MSE introduced in cascaded compression, in particular to minimize the loss by appropriate selection of quantizers in the second or higher order stages of the cascaded compression schemes.
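The relation q value = K*(3**n) mentioned above can be checked with a short Python sketch. The integer rounding rule, the value K = 2 and the helper names are assumptions made for illustration; the sketch only tests the case in which a finer quantizer of the family is followed by a coarser one.

```python
def quantize(x, step):
    """Integer uniform quantizer for x >= 0: nearest multiple of step, ties rounded up."""
    return (2 * x + step) // (2 * step) * step

def mismatches(q1, q2, limit=1000):
    """Inputs in [0, limit) where cascading q1 then q2 differs from q2 alone."""
    return [x for x in range(limit)
            if quantize(quantize(x, q1), q2) != quantize(x, q2)]

K = 2  # hypothetical base step size
family = [K * 3 ** n for n in range(4)]  # 2, 6, 18, 54

# Every finer-then-coarser pair from the family behaves like the coarse quantizer alone.
for i, q1 in enumerate(family):
    for q2 in family[i + 1:]:
        print(q1, q2, "mismatching inputs:", len(mismatches(q1, q2)))

# A pair outside the family (2 then 4) does introduce extra error.
print(2, 4, "mismatching inputs:", len(mismatches(2, 4)))
```

Under this rounding rule the decision boundaries of the coarser quantizer coincide with boundaries of the finer one whenever the step ratio is an odd integer, so the first stage never moves a value across a second-stage boundary.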

BRIEF SUMMARY OF THE INVENTION

It is an object of the present invention to address the limitations of the conventional cascaded compression system discussed above.

It is another object of the invention to provide a method to compute an error introduced by cascaded compression for a given pair of quantizers. It is yet another object of the invention to provide a method to minimize the loss introduced by cascaded compression by appropriate quantizer selection.

One preferred embodiment relates to reducing the quantization error introduced within the framework of JPEG and MPEG compression schemes. One aspect of the invention relates to a method for a cascaded compression system including the steps of determining an expected quantization error introduced by a second or higher stage of the cascaded compression system and comparing the expected quantization error of at least two quantizers for the second or higher stage. The method also includes the step of selecting one of the quantizers in accordance with a result of the comparison to minimize the expected quantization error for the cascaded compression system.

In one preferred embodiment of the invention, a probability distribution function is used to determine the expected quantization error. Another aspect of the invention relates to a memory medium and an apparatus for carrying out the above method.

These and other embodiments and aspects of the present invention are exemplified in the following detailed disclosure.

BRIEF DESCRIPTION OF DRAWINGS

The features and advantages of the present invention can be understood by reference to the detailed description of the preferred embodiments set forth below taken with the drawings, in which:

Fig. 1 is a schematic of a non-cascaded compression system and a cascaded compression system.

Fig. 2 is a diagram of quantizers in a cascaded compression system.

Fig. 3 is a diagram showing quantization error in a cascaded compression system.

Fig. 4 is a block diagram of an exemplary computer system in accordance with one aspect of the invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The additional MSE introduced by cascaded compression is explained with reference to the diagram shown in Figure 2. Figure 2(a) shows the reconstruction points and decision boundaries for a uniform quantizer with a step size of $Q_1$. The reconstruction points are indicated by dark circles and the decision boundaries by short vertical lines. The nth reconstruction point is located at $Q_1^n$ (not shown) and the decision boundaries on either side of $Q_1^n$ are located at $D_1^{n-1}$ and $D_1^n$ (not shown). The decision boundaries lie approximately halfway between two successive reconstruction points. For a uniform quantizer with a step size $Q_1$, the reconstruction points lie at multiples of $Q_1$. Any value of an input signal source falling between two decision boundaries is quantized to the reconstruction point lying between those two decision boundaries.

The quantizer $Q_1$ is used in the first stage (a) of cascaded compression and quantizer $Q_2$ is used in the second stage (b) of compression. For a single stage of compression, for example, the quantizer $Q_2$ is used. In this embodiment, the step size of quantizer $Q_2$ is shown to be greater than that of quantizer $Q_1$. It is noted that the larger the quantizer step size, the higher the compression ratio that can be achieved, albeit at the expense of introducing more loss. Preferably, uniform quantizers are used such as those used in MPEG I-frames and JPEG compression schemes. However, other step sizes and non-uniform quantizers may be used.

In Fig. 2, x denotes the value of the input source signal that is to be quantized. If x falls in the range $[Q_2^0, D_2^0)$, then with the single stage quantizer (in this case using only stage (b)), this value will be quantized to $Q_2^0$. It is noted that the symbol "[" indicates the value is included in the range and the symbol ")" indicates that the value is not included in the range. In the case of the two stage quantizer (both (a) and (b)), the output of the first stage quantizer is $Q_1^0$ if x falls in the range $[0, D_1^0)$, and $Q_1^1$ if it falls in the range $[D_1^0, D_2^0)$. The output from stage (a) is then passed through the second quantizer stage (b), which quantizes $Q_1^0$ to $Q_2^0$, and $Q_1^1$ to $Q_2^1$. Therefore, with cascaded compression, x values in the range $[0, D_1^0)$ are quantized to $Q_2^0$ and x values in the range $[D_1^0, D_2^0)$ are quantized to $Q_2^1$. Thus the values of x in the range $[D_1^0, D_2^0)$ are (incorrectly) quantized with larger mean square error using cascaded compression as compared to a single stage quantizer. Similarly, it can be observed that any value of x in the range $[D_1^1, D_2^1)$ will be incorrectly quantized to a value of $Q_2^2$ by the two stage quantizers, whereas the single stage quantizer would quantize these values to $Q_2^1$. This additional error is a direct result of the cascaded compression.
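The two incorrectly quantized ranges described above can be reproduced numerically. The following Python sketch is illustrative only: the step sizes 3 and 4 are assumed values chosen so that the second step is larger than the first, exact fractions are used to avoid rounding artifacts at the boundaries, and the function names are not taken from the disclosure.

```python
from fractions import Fraction
import math

def reconstruction_point(step, n):
    """Q^n: the n-th reconstruction point of a uniform quantizer."""
    return n * step

def decision_boundary(step, n):
    """D^n: the boundary halfway between Q^n and Q^(n+1)."""
    return Fraction(2 * n + 1, 2) * step

def quantize(x, step):
    """Map x in [D^(n-1), D^n) to Q^n; the lower boundary is inclusive."""
    return math.floor(Fraction(x) / step + Fraction(1, 2)) * step

Q1, Q2 = 3, 4  # hypothetical step sizes with Q2 > Q1

# One value from [D_1^0, D_2^0) = [1.5, 2) and one from [D_1^1, D_2^1) = [4.5, 6).
for x in (Fraction(7, 4), Fraction(5)):
    single = quantize(x, Q2)
    cascaded = quantize(quantize(x, Q1), Q2)
    print(f"x={float(x)}: single stage -> {single}, cascaded -> {cascaded}")
```

With these assumed steps, the first value is sent to the first non-zero reconstruction point of the second quantizer instead of zero, and the second value is sent one reconstruction point too high, matching the two situations described in the text.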

To reduce and/or remove this additional error, the value ranges of x that would be incorrectly quantized must first be determined, given the quantizers $Q_1$ and $Q_2$. In this regard, a particular decision boundary of the $Q_2$ quantizer at $D_2^n$ is considered. For this decision boundary, the two closest decision boundaries of the quantizer $Q_1$, one being larger ($D_1^m$) and one being smaller ($D_1^{m-1}$) than $D_2^n$, are located as shown in Figs. 3(a) and 3(b). Illustratively, the following cases are considered:

Case 1 (Fig. 3(a)): $Q_1^m < D_2^n$ and $x \in [D_2^n, D_1^m)$

The single stage $Q_2$ quantizer will quantize x to $Q_2^{n+1}$. In the two stage quantizer ($Q_1$ and $Q_2$), the $Q_1$ quantizer will quantize x to $Q_1^m$ and the following $Q_2$ quantizer will quantize it to $Q_2^n$, thereby introducing additional quantization error.

Case 2 (Fig. 3(b)): $Q_1^m \geq D_2^n$ and $x \in [D_1^{m-1}, D_2^n)$

The single stage $Q_2$ quantizer will quantize x to $Q_2^n$. In the two stage quantizer ($Q_1$ and $Q_2$), the $Q_1$ quantizer will quantize x to $Q_1^m$ and the following $Q_2$ quantizer will quantize it to $Q_2^{n+1}$, thereby introducing additional quantization error.
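The two cases can be turned into a small range-finding routine. The Python sketch below is one possible reading of the cases, not the implementation of the disclosure: it assumes uniform quantizers whose reconstruction points are multiples of the step size, a second step size no smaller than the first, and hypothetical step sizes 3 and 4 in the usage line.

```python
from fractions import Fraction
import math

def misquantized_ranges(q1, q2, n_boundaries):
    """Half-open ranges [lo, hi) of x >= 0 that the cascade q1 -> q2 quantizes
    differently from q2 alone, found by examining the first n_boundaries
    decision boundaries of the second quantizer. Assumes q2 >= q1."""
    assert q2 >= q1, "the two cases below assume the second step is not smaller"
    half = Fraction(1, 2)
    ranges = []
    for n in range(n_boundaries):
        d2 = (n + half) * q2            # decision boundary D_2^n
        t = d2 / q1 - half              # position of D_2^n on the Q1 boundary grid
        if t.denominator == 1:
            continue                    # D_2^n coincides with a Q1 boundary: no extra error
        m = math.ceil(t)                # D_1^(m-1) < D_2^n < D_1^m
        d1_lower, d1_upper = (m - half) * q1, (m + half) * q1
        if m * q1 < d2:                 # Case 1: Q_1^m < D_2^n
            ranges.append((d2, d1_upper))
        else:                           # Case 2: Q_1^m >= D_2^n
            ranges.append((d1_lower, d2))
    return ranges

print([(float(lo), float(hi)) for lo, hi in misquantized_ranges(3, 4, 3)])
```

For the assumed steps 3 and 4, the first three boundaries of the second quantizer give the ranges [1.5, 2), [4.5, 6) and [10, 10.5); the first two arise from Case 2 and the third from Case 1.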

To identify all of the ranges of values of x that will be incorrectly quantized, this computation must be repeated for all decision boundaries of $Q_2$. However, in the case of uniform quantizers as used in JPEG compression, for example, the above computation need only be performed for the decision boundaries of $Q_2$ in the range of 0 to $\mathrm{LCM}(Q_1, Q_2)$, where $\mathrm{LCM}(Q_1, Q_2)$ is the least common multiple of the two numbers $Q_1$ and $Q_2$. The result obtained from this computation is used to obtain the ranges of values of x that lie outside the range $[0, \mathrm{LCM}(Q_1, Q_2)]$ and that are subject to incorrect quantization. As can be seen, the computation is much simpler in the uniform quantizer case.

For each combination of pixel values in a context, the probability distribution of black and white pixels can be different. For example, in an all white context, the probability of coding a white pixel will be much greater than that of coding a black pixel. Given a probability distribution f(x) on the input source signal (x), the expected quantization error introduced by the cascaded quantization using the quantizers $Q_1$ and $Q_2$ can be computed as follows:

$$E(Q_1, Q_2) = \int_{x \in \xi} f(x)\, dx$$

The symbol ξ represents the set containing all ranges of values of x that are incorrectly quantized as determined above.
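A numerical evaluation of this quantity might look as follows. The sketch is written under several assumptions that are not stated in the text: a zero-mean Laplacian density with a hypothetical rate parameter `lam`, quantizers that act symmetrically on negative values (so only x >= 0 is integrated and the result doubled), and a simple midpoint rule. Alongside the probability mass of the incorrectly quantized set, it also accumulates a squared-error-weighted variant, which is an additional illustrative measure rather than a formula taken from the disclosure.

```python
import math

def quantize(x, step):
    """Uniform quantizer: nearest multiple of step (ties rounded up)."""
    return math.floor(x / step + 0.5) * step

def laplacian_pdf(x, lam):
    """Zero-mean Laplacian density with an assumed rate parameter lam."""
    return 0.5 * lam * math.exp(-lam * abs(x))

def cascade_error_measures(q1, q2, lam, x_max=200.0, steps=20000):
    """Midpoint-rule estimates over the incorrectly quantized set:
    (probability mass, extra squared error relative to q2 alone)."""
    h = x_max / steps
    mass = 0.0
    extra_sq = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        single = quantize(x, q2)
        cascaded = quantize(quantize(x, q1), q2)
        if cascaded != single:  # x belongs to the incorrectly quantized set
            w = laplacian_pdf(x, lam) * h
            mass += w
            extra_sq += w * ((x - cascaded) ** 2 - (x - single) ** 2)
    # Double the half-line result: density and quantizers are treated as symmetric.
    return 2.0 * mass, 2.0 * extra_sq

mass, extra = cascade_error_measures(3, 4, lam=0.2)
print(f"mass of misquantized inputs: {mass:.4f}, extra squared error: {extra:.4f}")
```

The integration limit `x_max` and the number of steps are arbitrary choices; for a quickly decaying density the truncated tail contributes very little.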

The computation of quantization error in the above equation is used to select appropriate quantizers for the second stage of cascaded compression. For example, if the quantizer $Q_1$ is used in the first stage and there are two possible quantizers $Q_2$ and $Q_2'$ for the second stage, then the quantization errors can be computed for both of these quantizers. These two possible quantizers are test quantizers. The minimum quantization error value is used to decide the most appropriate selection of the quantizer, i.e., $Q_2$ or $Q_2'$.
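The selection between test quantizers can be sketched in Python as shown below. The error measure is taken here to be the probability mass of incorrectly quantized inputs under an assumed zero-mean Laplacian density; the first-stage step 3, the test steps 8 and 9, the parameter `lam` and all function names are hypothetical and serve only to illustrate the comparison.

```python
import math

def quantize(x, step):
    """Uniform quantizer: nearest multiple of step (ties rounded up)."""
    return math.floor(x / step + 0.5) * step

def expected_error(q1, q2, lam, x_max=200.0, steps=20000):
    """Midpoint-rule estimate of the Laplacian probability mass of inputs
    that the cascade q1 -> q2 quantizes differently from q2 alone."""
    h = x_max / steps
    mass = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        if quantize(quantize(x, q1), q2) != quantize(x, q2):
            mass += 0.5 * lam * math.exp(-lam * x) * h
    return 2.0 * mass  # density and quantizers treated as symmetric about zero

def select_second_stage(q1, test_quantizers, lam):
    """Return the test quantizer with the smallest expected error, plus all errors."""
    errors = {q2: expected_error(q1, q2, lam) for q2 in test_quantizers}
    return min(errors, key=errors.get), errors

best, errors = select_second_stage(q1=3, test_quantizers=[8, 9], lam=0.2)
print("errors:", errors)
print("selected second-stage quantizer:", best)
```

In this example the step 9 is selected because its decision boundaries coincide with boundaries of the first-stage quantizer, so the cascade introduces no additional error for it, whereas the step 8 does.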

If the quantizer $Q_2$ is expected to offer a bit rate $r$ (the larger the rate, the lower the compression) with a quantization error of $E(Q_2) + E(Q_1, Q_2)$, and the quantizer $Q_2'$ is expected to offer a rate of $r'$ with a quantization error of $E(Q_2') + E(Q_1, Q_2')$, then the ratio of the rate to the quantization error can be used as a measure in the selection of the quantizer. Here $E(Q_2)$ and $E(Q_2')$ are the quantization errors inherently generated by the respective quantizers and are not related to the additional error caused by the cascaded quantization.

Of course, quantizers other than $Q_2$ or $Q_2'$ discussed immediately above may be used as a starting point. The selection of quantizers is based in part on the overall compression ratio that is desired. Also, some trial and error may be used in selecting the initial quantizers. The quantization error for several quantizers may be computed, as discussed above, and the most appropriate quantizer is then selected.

As discussed above, one embodiment of the present invention relates to applications using the JPEG and MPEG compression schemes. Both of these compression schemes divide the input source data spatially into contiguous blocks of size 8x8 which are subjected to a DCT transformation resulting in 64 DCT coefficients. This is followed by a quantization of the DCT coefficients. The DC coefficient is differentially coded. The 63 remaining AC coefficients are coded by specifying the run length of zero coefficients followed by the coding of the following non-zero coefficient's value.
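The differential coding of the DC coefficient and the run-length description of the AC coefficients can be illustrated with a simplified Python sketch. It is not a JPEG or MPEG encoder: the coefficient list is assumed to be already in the codec's scan order, trailing zeros are simply dropped instead of being signalled with an end-of-block symbol, no limit is placed on the run length, and the function names are hypothetical.

```python
def differential_dc(dc_values):
    """Code each block's DC coefficient as the difference from the previous block's DC."""
    diffs, previous = [], 0
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def run_length_pairs(ac_coefficients):
    """Describe quantized AC coefficients as (zero_run, value) pairs:
    the number of zeros skipped, then the non-zero value that follows."""
    pairs, run = [], 0
    for value in ac_coefficients:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs  # trailing zeros are dropped in this simplified sketch

ac = [5, 0, 0, -2, 0, 1] + [0] * 57  # 63 quantized AC coefficients of one block
print(run_length_pairs(ac))
print(differential_dc([10, 12, 11]))
```

A coarser quantizer produces longer runs of zeros and therefore fewer pairs, which is how the choice of quantizer influences the resulting bit rate.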

In the case of JPEG, the entries of a quantization table determine the quantizer used for different DCT coefficients. Different quantization tables can be used for different bands (e.g., luminance and chrominance), but the quantization tables are fixed for a single band. To apply the quantizer selection method of the present invention to recompress already JPEG-compressed data, knowledge of the probability distribution f(x) for each DCT coefficient is necessary. Experimentally it has been found that the distribution of the AC DCT coefficients follows a Laplacian distribution. It is noted that the parameter associated with the Laplacian distribution is different for different DCT coefficients. This parameter may be estimated, or a different distribution, such as Rayleigh or Gaussian, may be obtained, from the available compressed data.
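One way the Laplacian parameter could be estimated is sketched below in Python. The sketch uses the standard maximum-likelihood estimate for a zero-mean Laplacian density $f(x) = (\lambda/2)\,e^{-\lambda|x|}$, namely the reciprocal of the mean absolute value. Applying it to dequantized coefficients taken from the compressed data is an assumption of this example and only approximates the statistics of the original coefficients; the function name and sample values are hypothetical.

```python
def estimate_laplacian_lambda(coefficients):
    """Maximum-likelihood estimate of lambda for a zero-mean Laplacian
    f(x) = (lambda / 2) * exp(-lambda * |x|): one over the mean absolute value."""
    total = sum(abs(c) for c in coefficients)
    if total == 0:
        raise ValueError("cannot estimate lambda: no non-zero coefficient values")
    return len(coefficients) / total

# Hypothetical dequantized values of one AC coefficient position across several blocks.
values_for_one_position = [6, -3, 0, 3, -9, 0, 3, 0]
lam = estimate_laplacian_lambda(values_for_one_position)
print(f"estimated lambda: {lam:.4f}")
```

Because the parameter differs between DCT coefficients, such an estimate would be computed separately for each coefficient position.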

It should be understood that a Laplacian distribution may be used for both the luminance and chrominance channels of the DCT encoded image and MPEG error terms. The MPEG compression scheme uses DCT to encode error terms as well as picture information. The error terms are obtained from the MPEG motion compensation algorithm. An error term is obtained by subtracting an image block from a block on another picture in the sequence and applying a DCT to the difference. This allows the picture to be encoded using fewer bits if there are only a small number of changes in the images.

Also, for the MPEG case, a preferred embodiment focuses on I- (intracoded) frames in the MPEG format. I-frames are composed of intrablocks only, without reference to other pictures. These frames can serve as random access points in the sequence. Other MPEG frame types such as P- (predictive coded) and B- (bidirectionally interpolated) frames may also be used.

For example, for each (I) frame in an MPEG video, a quantizer value is decided by a quantizer_scale and a quantization table. Different quantization tables can be used for chrominance and luminance. The quantization tables are fixed for each frame, but the quantizer_scale can be changed for each macroblock. In one embodiment, the quantizer selection methods discussed above are used to select the quantizer_scale for each MPEG frame macroblock. It is noted, however, that the quantizer_scale cannot be changed for each DCT coefficient. This value is fixed for the whole macroblock. Since the human visual system is more sensitive to quantization errors of the low frequency coefficients, preferably the quantizer_scale is selected to minimize the average quantization error of selected low frequency coefficients.

FIG. 4 shows a video/image processing system 20 in which the present invention may be implemented.
By way of example, the system 20 may represent a television, a set-top box, a desktop, laptop or palmtop computer, a personal digital assistant (PDA), a video/image storage device such as a video cassette recorder (VCR), a digital video recorder (DVR), a TiVo device, etc., as well as portions or combinations of these and other devices. The system 20 includes one or more video/image sources 22, one or more input/output devices 24, a processor 25 and a memory 26. The video/image source(s) 22 may represent, e.g., a television receiver, a VCR or other video/image storage device. The source(s) 22 may alternatively represent one or more network connections for receiving video/images from a server or servers over, e.g., a global computer communications network such as the Internet, a wide area network, a metropolitan area network, a local area network, a terrestrial broadcast system, a cable network, a satellite network, a wireless network, or a telephone network, as well as portions or combinations of these and other types of networks. The input/output devices 24, processor 25 and memory 26 communicate over a communication medium 27. The communication medium 27 may represent, e.g., a bus, a communication network, one or more internal connections of a circuit, circuit card or other device, as well as portions and combinations of these and other communication media. Input video/images from the source(s) 22 are processed in accordance with one or more software programs stored in memory 26 and executed by processor 25 in order to generate output video/images which are supplied to a display device 28 such as a television display, a computer monitor, etc.

In a preferred embodiment, the computation of the expected quantization error due to the cascaded compression and selection of appropriate quantizers is implemented by computer readable code executed by the system 20. The code may be stored in the memory 26 or read/downloaded from a memory medium such as a CD-ROM or floppy disk. In other embodiments, hardware circuitry may be used in place of, or in combination with, software instructions to implement the invention.

It should be understood that the particular configuration of system 20 as shown in FIG. 4 is by way of example only. Those skilled in the art will recognize that the invention can be implemented using a wide variety of alternative system configurations. While the present invention has been described above in terms of specific embodiments, it is to be understood that the invention is not intended to be confined or limited to the embodiments disclosed herein. For example, the invention is not limited to any specific compression scheme, frame type or probability distribution. On the contrary, the present invention is intended to cover various structures and modifications thereof included within the spirit and scope of the appended claims.

Claims

CLAIMS:

1. A method for a cascaded compression system (12) comprising the steps of: determining an expected quantization error introduced by a second or higher stage of the cascaded compression system; comparing the expected quantization error of at least two quantizers for the second or higher stage; and selecting one of the at least two quantizers in accordance with a result of the comparison.

2. The method according to Claim 1, wherein the expected quantization error is determined using a probability distribution function.

3. The method according to Claim 2, wherein the probability distribution function is a Laplacian distribution.

4. The method according to Claim 2, wherein said selecting step selects the one quantizer in accordance with a ratio of an expected bit rate and the expected quantization error.

5. The method according to Claim 2, wherein the probability distribution function is determined in accordance with input data (10) to be compressed.

6. The method according to Claim 1, wherein the at least two quantizers are determined in accordance with a desired compression ratio of the cascaded compression system.

7. The method according to Claim 1, wherein the cascaded compression system comprises at least one of a JPEG or a MPEG system.

8. The method according to Claim 1, wherein said determining step includes determining a set of ranges for values of input data that will be incorrectly quantized by the cascaded compression system.

9. A memory medium including code for a cascaded compression system (12) the code comprising: determining code to compute an expected quantization error introduced by a second or higher stage of the cascaded compression system; comparing code to compare the expected quantization error of at least two quantizers for the second or higher stage; and selecting code to allow for selection of one of the at least two quantizers in accordance with a result of the comparing code.

10. The memory medium according to Claim 9, wherein the expected quantization error is determined using a probability distribution function.

11. The memory medium according to Claim 9, wherein the probability distribution function is a Laplacian distribution.

12. The memory medium according to Claim 9, wherein the cascaded compression system comprises at least one of a JPEG or a MPEG system.

13. A cascaded compression apparatus (20) comprising: a memory (26) which stores executable code; and a processor (25) which executes the code stored in the memory (26) so as to (i) determine an expected quantization error introduced by a second or higher stage of the cascaded compression system, (ii) compare the expected quantization error of at least two quantizers for the second or higher stage, and (iii) enable selection of one of the at least two quantizers in accordance with a result of the comparison.

14. The apparatus (20) according to Claim 13, wherein the expected quantization error is determined using a probability distribution function.

15. The apparatus (20) according to Claim 14, wherein the probability distribution function is a Laplacian distribution.

16. The apparatus (20) according to Claim 13, wherein the cascaded compression system (12) comprises at least one of a JPEG or a MPEG system.

17. A cascaded compression system (12) comprising: means (25, 26) for determining an expected quantization error introduced by the cascaded compression system; means (25, 26) for comparing the expected quantization error based upon possible quantizers for a second or higher stage of the cascaded compression system; and means (25, 26) for selecting one of the possible quantizers to reduce the expected quantization error.