US20020021756A1 - Video compression using adaptive selection of groups of frames, adaptive bit allocation, and adaptive replenishment - Google Patents
- Publication number
- US20020021756A1 (application US09/902,976)
- Authority
- US
- United States
- Prior art keywords
- picture
- video stream
- pictures
- gop
- input video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/007—Transform coding, e.g. discrete cosine transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/403—Edge-driven scaling; Edge-based scaling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4092—Image resolution transcoding, e.g. by using client-server architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/114—Adapting the group of pictures [GOP] structure, e.g. number of B-frames between two anchor frames
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/152—Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/527—Global motion vector estimation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20172—Image enhancement details
- G06T2207/20192—Edge enhancement; Edge preservation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/14—Picture signal circuitry for video frequency region
- H04N5/147—Scene change detection
Definitions
- the present invention relates to the processing of a video stream and more specifically relates to the improvement of video stream compression by adaptively selecting a group of pictures based on video stream content, by adaptively allocating bits to generate a compressed video stream, and by adaptively replenishing macroblocks.
- Data compression is a well-known means for conserving transmission resources when transmitting large amounts of data or conserving storage resources when storing large amounts of data.
- data compression involves minimizing or reducing the size of a data signal (e.g., a data file) in order to yield a more compact digital representation of that data signal.
- data compression is virtually a necessary step in the process of widespread distribution of digital representations of audio and video signals.
- video signals are typically well suited for standard data compression techniques. Most video signals include significant data redundancy. Within a single video frame (image), there typically exists significant correlation among adjacent portions of the frame, referred to as spatial correlation. Similarly, adjacent video frames tend to include significant correlation between corresponding image portions, referred to as temporal correlation. Moreover, there is typically a considerable amount of data in an uncompressed video signal that is irrelevant. That is, the presence or absence of that data will not perceivably affect the quality of the output video signal. Because video signals often include large amounts of such redundant and irrelevant data, video signals are typically compressed prior to transmission and then decompressed again after transmission.
- the distribution of a video signal includes a transmission unit and a receiving unit.
- the transmission unit will receive a video signal as input and will compress the video signal and transmit the signal to the receiving unit. Compression of a video signal is usually performed by an encoder.
- the encoder typically reduces the data rate of the input video signal to a level that is predetermined by the capacity of the transmission medium. For example, for a typical video file transfer, the required data rate can be reduced from about 30 Megabits per second to about 384 kilobits per second.
- the compression ratio is defined as the ratio between the size of the input video signal and the size of the compressed video signal. If the transmission medium is capable of a high transmission rate, then a lower compression ratio can be used. On the other hand, if the transmission medium is capable of only a relatively low transmission rate, then a higher compression ratio must be used.
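- As a worked example of the definition above, the following minimal sketch computes the compression ratio for the data rates quoted earlier (about 30 Megabits per second reduced to about 384 kilobits per second). It assumes the input and output signals cover the same duration, so the ratio of data rates equals the ratio of sizes; the function name is illustrative and not taken from the patent.

```python
def compression_ratio(input_rate_bps: float, output_rate_bps: float) -> float:
    """Ratio of input size to compressed size, using data rates as a proxy.

    Assumes both rates describe the same duration of video.
    """
    if output_rate_bps <= 0:
        raise ValueError("output rate must be positive")
    return input_rate_bps / output_rate_bps


# Figures quoted in the text: about 30 Mbit/s in, about 384 kbit/s out.
ratio = compression_ratio(30e6, 384e3)
print(f"compression ratio is roughly {ratio:.1f}:1")
```

A slower channel lowers the permitted output rate and therefore raises the ratio the encoder has to achieve.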
- After the receiving unit receives the compressed video signal, the signal must be decompressed before it can be adequately displayed.
- the decompression process is performed by a decoder.
- the decoder is used to decompress the compressed video signal so that it is identical to the original input video signal. This is referred to as lossless compression, because no data is lost in the compression and decompression processes.
- By contrast, compression is referred to as lossy compression when some predefined amount of the original data is irretrievably lost in the compression and expansion process. In order to decompress the video stream to its original (pre-encoding) data size, the lost data must be replaced by new data.
- Video signal degradation typically manifests itself as a perceivable flaw in a displayed video image. These flaws are typically referred to as noise.
- Well-known kinds of video noise include blockiness, mosquito noise, salt-and-pepper noise, and fuzzy edges.
- the data rate (or bit rate) often determines the quality of the decoded video stream. A video stream that was encoded with a high bit rate is generally a higher quality video stream than one encoded at a lower bit rate.
- There is a need for video signal compression that efficiently groups pictures in a video stream and provides for lower output signal bit rates and higher output signal quality.
- the video signal compression also should maximize the output signal quality by appropriately allocating bits among pictures and picture groups in the output signal.
- the video signal compression also should apply compression methods that reduce noise in the output signal.
- the method should enable the use of various sampling techniques and should enable the selection of an output stream, based on the sampling technique providing the best video stream.
- the present invention provides video signal compression that efficiently groups pictures in a video stream into variably-sized groups of pictures (GOPs) thereby providing lower achievable output signal bit rates and higher output signal quality.
- the video signal compression maximizes the output signal quality by appropriately allocating bits among pictures and picture groups in the output signal.
- An adaptive method of bit allocation among picture groups and within the pictures in those picture groups enables the efficient allocation of bits, according to the relative sizes of the picture groups.
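- One way to read "according to the relative sizes of the picture groups" is sketched below: each GOP receives a target bit budget proportional to its number of pictures. The formula (channel bit rate times GOP duration), the function name, and the example GOP sizes are assumptions made for illustration; the text here does not state the exact allocation rule.

```python
def gop_bit_targets(gop_sizes, bit_rate, frame_rate):
    """Return a target bit budget per GOP, proportional to its picture count.

    Assumption (not from the source): a GOP of n pictures shown at
    frame_rate pictures per second is entitled to bit_rate * n / frame_rate bits.
    """
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return [bit_rate * n // frame_rate for n in gop_sizes]


# Hypothetical variably-sized GOPs of 6, 12 and 30 pictures on a 384 kbit/s channel.
targets = gop_bit_targets([6, 12, 30], bit_rate=384_000, frame_rate=30)
print(targets)
```

Under this rule a longer GOP gets a larger budget, while the average bit rate over the whole stream stays at the channel rate.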
- the video signal compression of the present invention also applies compression methods that reduce noise in the output signal, by utilizing a macroblock-based tunable conditional replenishment technique.
- the conditional replenishment technique exploits the similarities among images in the variably-sized GOPs to further minimize output bit rate and maximize the output signal quality.
- An analysis-by-synthesis method is also provided to select a best asynchronous sampling method among candidate sampling procedures.
- a method for processing an input video stream comprising a series of pictures.
- a first scene change is detected between a first scene in the input video stream and a second scene in the input video stream.
- the method classifies the first picture following the first scene change as an intra-picture (I-picture).
- the input stream processing method determines whether there are a predetermined number of pictures between the first I-picture and a second scene change.
- a second picture in the input video stream is classified as a second I-picture, where it is determined that the predetermined number of pictures exist between the first intra-picture and the second scene change, wherein the second picture coincides with the predetermined number of pictures.
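- The I-picture selection described above can be sketched as follows: a picture that follows a scene change starts a new GOP, and if a predetermined number of pictures passes without a scene change, another I-picture is forced. This is a minimal sketch; the function name, the representation of scene changes as picture indices, the choice to start the stream with an I-picture, and the exact counting convention are assumptions rather than details given in the text.

```python
def select_i_pictures(num_pictures, scene_changes, max_gop_size):
    """Return indices of pictures classified as I-pictures.

    scene_changes: indices of the first picture following each scene change
                   (assumed to come from a separate detector).
    max_gop_size:  the predetermined number of pictures after which a new
                   I-picture is forced even without a scene change.
    """
    changes = set(scene_changes)
    i_pictures = []
    pictures_in_gop = 0  # pictures already placed in the current GOP
    for idx in range(num_pictures):
        starts_stream = idx == 0  # assumption: the stream begins with an I-picture
        if starts_stream or idx in changes or pictures_in_gop >= max_gop_size:
            i_pictures.append(idx)
            pictures_in_gop = 1
        else:
            pictures_in_gop += 1
    return i_pictures


# Hypothetical 20-picture stream with one scene change before picture 5.
i_pics = select_i_pictures(20, scene_changes=[5], max_gop_size=8)
print(i_pics)
```

In the example the GOP that begins at picture 5 reaches eight pictures with no further scene change, so picture 13 is classified as the next I-picture.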
- a system for organizing a series of pictures in an input video stream into at least one group of pictures (GOP).
- the system includes a picture grouping module for detecting a scene change in the series of pictures and for classifying a first picture following the scene change as a first intra-picture (I-picture).
- the picture grouping module also can classify at least one other picture following the scene change as a predicted picture (P-picture) and can classify at least one second picture as a bi-directionally predicted picture (B-picture).
- the system also includes a bit allocation module for determining whether a first GOP uses less than a predetermined target number of bits and further operative to allocate an unneeded bit to a second GOP in response to a determination that the first GOP uses less than the predetermined target number of bits.
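- The behavior of the bit allocation module can be illustrated with a small sketch in which bits left unused by one GOP are carried forward to the next. The function name, the single fixed per-GOP target, and the idea that each GOP's bit need is known in advance are simplifying assumptions for illustration only.

```python
def allocate_gop_bits(target_bits, bits_needed):
    """Carry unneeded bits from one GOP to the next.

    target_bits: predetermined target number of bits per GOP.
    bits_needed: bits each GOP would use if unconstrained (hypothetical input).
    Returns the number of bits actually granted to each GOP.
    """
    granted = []
    surplus = 0
    for needed in bits_needed:
        budget = target_bits + surplus  # target plus bits saved earlier
        used = min(needed, budget)
        surplus = budget - used         # unneeded bits move to the next GOP
        granted.append(used)
    return granted


# The first GOP needs only 600 of its 1000 bits, so the second may use up to 1400.
granted = allocate_gop_bits(1000, [600, 1500, 1000])
print(granted)
```

The total granted never exceeds the sum of the targets, so the long-run bit rate is preserved while demanding GOPs receive more bits.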
- FIG. 1 is a block diagram depicting an exemplary video stream comprised of a series of video pictures.
- FIG. 2 is a flowchart depicting an exemplary method for coding, transmitting, and decoding a video stream.
- FIG. 3 is a block diagram depicting a system for encoding a video stream that is an exemplary embodiment of the present invention.
- FIG. 4 depicts a conventional decoding system for receiving an encoded video stream and providing decoded video and audio output.
- FIG. 5 is a block diagram depicting an exemplary selection of picture encoding modes in a GOP.
- FIG. 6 is a block diagram depicting an exemplary timeline comparing the occurrence of scene changes in a video stream with alternative GOP size formats.
- FIG. 7 is a flowchart depicting an exemplary method for creating GOPs of varying sizes.
- FIG. 8 is a graph depicting a typical relationship between the bits generated by a conventional compression method and a conventional group of pictures.
- FIG. 9 is a series of block diagrams and graphs comparing the generated bit graph of a conventional compression method with a generated bit graph of an exemplary embodiment of the present invention.
- FIG. 10 a is a flow chart depicting an exemplary method for adaptively allocating bits among variable-sized groups of pictures.
- FIG. 10 b is a flow chart depicting an exemplary method for adaptively allocating bits among pictures within a GOP.
- FIG. 11 is a simplified illustration depicting successive pictures in an exemplary GOP divided into macroblocks.
- FIG. 12 is a flowchart depicting an exemplary method for performing conditional replenishment on a macroblock-basis.
- FIG. 13 is a flowchart depicting an exemplary method for generating and selecting between two sampling methods.
- the present invention provides video signal compression that efficiently groups pictures in a video stream into variably-sized groups of pictures (GOPs) thereby providing lower achievable output signal bit rates and higher output signal quality.
- the video signal compression maximizes the output signal quality by appropriately allocating bits among pictures and picture groups in the output signal.
- An adaptive method of bit allocation among picture groups and within the pictures in those picture groups enables the efficient allocation of bits, according to the relative sizes of the picture groups.
- the video signal compression of the present invention also applies compression methods that reduce noise in the output signal, by utilizing a macroblock-based tunable conditional replenishment technique.
- the conditional replenishment technique exploits the similarities among images in the variably-sized GOPs to further minimize output bit rate and maximize the output signal quality.
- An analysis-by-synthesis method is also provided to select a best asynchronous sampling method among multiple non-uniform and/or uniform sampling procedures.
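- The analysis-by-synthesis idea can be sketched in a simplified form: each candidate sampling procedure is applied, the result is reconstructed (synthesized), and the candidate whose reconstruction is closest to the original is selected. Everything specific below is an assumption for illustration: the one-dimensional signal, the hold-last-sample reconstruction, the mean squared error metric, and all names.

```python
def reconstruct(signal, kept):
    """Synthesize a full-length signal by holding the last kept sample.

    kept: sorted indices of the samples retained by a sampling procedure;
          assumed to include index 0.
    """
    kept_set = set(kept)
    out, last = [], signal[kept[0]]
    for i, value in enumerate(signal):
        if i in kept_set:
            last = value
        out.append(last)
    return out


def select_sampling(signal, candidates):
    """Return (name, error) of the candidate with the lowest mean squared error."""
    best_name, best_err = None, float("inf")
    for name, kept in candidates.items():
        rec = reconstruct(signal, kept)
        err = sum((a - b) ** 2 for a, b in zip(signal, rec)) / len(signal)
        if err < best_err:
            best_name, best_err = name, err
    return best_name, best_err


# Hypothetical signal with a single abrupt change at index 5.
signal = [0, 0, 0, 0, 0, 10, 10, 10]
candidates = {"uniform": [0, 2, 4, 6], "non_uniform": [0, 5]}
best, err = select_sampling(signal, candidates)
print(best, err)
```

Here the non-uniform candidate places a sample exactly at the change and reconstructs the signal without error, so it is selected even though it keeps fewer samples.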
- FIG. 1 is a block diagram depicting an exemplary video stream comprised of a series of video pictures.
- a video stream is simply a collection of related images that have been connected in a series to create the perception that objects in the image series are moving. Because of the large number of separate images that are required to produce a video stream, it is common that the series of images will be digitized and compressed, so that the entire video stream requires less space for transmission or storage. The process of compressing such a digitized video stream is often referred to as “encoding.” Among other things, encoding a video stream typically involves removing the irrelevant and/or redundant digital data from the digitized video stream. Once the video stream has been so compressed, a video stream must usually be decompressed before it can be properly rendered or displayed.
- the video stream 100 depicted in FIG. 1 includes six separate images or pictures 102-112.
- a video stream is displayed to a viewer at about 30 frames per second. Therefore, the video stream 100 depicted in FIG. 1 would provide about 0.2 seconds of playback at the typical display rate.
- Video stream compression is one means for reducing the size of a video stream.
- video stream compression involves the elimination of irrelevant and/or redundant video data from the video stream.
- many compression methods store only enough video data on a frame-by-frame basis to represent the differences from one frame to the next.
- many compression methods store an intra-picture (I-Picture) that includes all or most of the video data for a particular frame/picture in a video stream.
- Subsequent pictures can be represented by predicted pictures (P-pictures) or by bi-directionally predicted pictures (B-pictures).
- P-pictures are encoded using motion-compensated prediction from a previous I-Picture or a previous P-Picture.
- B-pictures are encoded using motion-compensated prediction from either previous or subsequent I-pictures or P-pictures. B-pictures are not used in the prediction of other B-pictures or other P-pictures. Accordingly, I-pictures require the most video data and can be compressed the least. P-pictures require less video data than I-pictures and can be significantly compressed. B-pictures require the least video data and can be compressed the most.
- the first picture 102 is an I-Picture. Accordingly, much of the video data of the image of the first picture 102 would be used to represent the first picture 102.
- the second picture 104 may be a B-Picture and, thus, may be represented in terms of video data differences with the I-Picture 102. Because the B-Picture 104 is bi-directionally predicted, it may also be represented in terms of differences with the P-Picture 106.
- the P-Picture 106 is predicted in terms of differences with the I-Picture 102.
- the P-Picture 106 is not represented in terms of differences with the B-Picture 104.
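- The prediction relationships among I-, P-, and B-pictures can be made concrete with a short sketch that, given picture types in display order, lists which pictures each one is predicted from. The string encoding of picture types and the function name are illustrative assumptions; the rule itself follows the description above (P-pictures reference the previous I- or P-picture, B-pictures reference the nearest I- or P-picture on either side, and B-pictures are never used as references).

```python
def reference_pictures(types):
    """Map each picture index to the indices it is predicted from.

    types: picture types in display order, e.g. "IBP".
    Assumes the sequence begins with an I-picture.
    """
    anchors = [i for i, t in enumerate(types) if t in "IP"]  # B is never a reference
    refs = {}
    for i, t in enumerate(types):
        earlier = [a for a in anchors if a < i]
        later = [a for a in anchors if a > i]
        if t == "I":
            refs[i] = []                      # intra-coded, no references
        elif t == "P":
            refs[i] = earlier[-1:]            # previous I- or P-picture
        else:
            refs[i] = earlier[-1:] + later[:1]  # nearest anchor on each side
    return refs


# Index 0, 1, 2 correspond to pictures 102, 104, 106 in the example above.
refs = reference_pictures("IBP")
print(refs)
```

The output shows the B-picture depending on both of its neighbors while the P-picture depends only on the I-picture, matching the description of pictures 102, 104, and 106.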
- Motion vectors are well-known mathematical representations of the movement and/or expected movement of visual “objects” in a series of pictures in a video stream.
- pictures are divided into picture elements (pels).
- A pel may be a video pixel or some other definable division of a picture.
- object motion can be tracked by reference to corresponding pels in a series of related video pictures.
- a video picture (or other digitized picture) is encoded as a collection of blocks 116.
- Each block is typically an 8-by-8-square of pels.
- video pictures also are commonly divided into macroblocks that usually contain 6 blocks (4 blocks for luminance and 2 blocks for chrominance signal).
- the division of video pictures into blocks and macroblocks is arbitrary, but helpful to the creation of video compression standards.
- the division of pictures into such blocks enables the representation of P-pictures and B-pictures in terms of other pictures in the video stream.
- This block/macroblock-based representation facilitates picture comparisons, based on corresponding portions of successive pictures. As described above, this representation further facilitates the compression of a video stream.
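- To give a sense of scale for the block and macroblock division, the sketch below counts the macroblocks and blocks in a picture. It assumes that the four 8-by-8 luminance blocks of a macroblock cover a 16-by-16 pel area, which is the usual arrangement but is not stated in the text; the 352-by-288 picture size and the names are likewise hypothetical.

```python
BLOCKS_PER_MACROBLOCK = 6  # 4 luminance blocks + 2 chrominance blocks


def macroblock_grid(width, height, mb_size=16):
    """Return (columns, rows) of macroblocks needed to cover a picture.

    mb_size=16 assumes each macroblock spans 16x16 luminance pels.
    Partial macroblocks at the right and bottom edges are counted as whole.
    """
    cols = -(-width // mb_size)   # ceiling division
    rows = -(-height // mb_size)
    return cols, rows


# Hypothetical 352x288 picture.
cols, rows = macroblock_grid(352, 288)
macroblocks = cols * rows
blocks = macroblocks * BLOCKS_PER_MACROBLOCK
print(cols, rows, macroblocks, blocks)
```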
- FIG. 2 is a flowchart depicting an exemplary method for coding, transmitting, and decoding a video stream.
- One application for which the described exemplary embodiment of the present invention is particularly suited is that of video stream processing. Because of the large number of separate images that are required to produce a video stream, it is common that the series of images will be digitized and compressed (encoded), so that the entire video stream requires less space for transmission or storage. Once the video stream has been so compressed, the video stream must usually be decompressed before it can be properly displayed.
- the flow chart of FIG. 2 depicts the steps that are generally followed to encode, decode, and display a video stream.
- At step 202, the input video stream is prepared for encoding.
- Step 202 may be performed by an encoder or prior to sending the video stream to an encoder.
- the video stream can be modified to facilitate encoding. Indeed various exemplary embodiments of the present invention are directed to various aspects of performing this step. The following Figures and accompanying text are drawn to describing those embodiments.
- At step 204, the input video stream is encoded.
- the encoding process involves, among other things, the compression of the digitized data making up the input video stream.
- the terms “encoding” and “compression” are used interchangeably.
- Once the video stream has been encoded, it can be transmitted or stored in its compressed form.
- the encoded video bit stream is transmitted. Often this transmission can be made over conventional broadcast infrastructure, but could also be over broadband communication resources and/or internet-based communication resources.
- At step 208, the received, encoded video stream is stored.
- the compressed video stream is significantly smaller than the input video stream. Accordingly, the storage of the received, encoded video stream requires fewer memory resources than storage of the input video stream would require.
- This storage step may be performed, for example, by a computer receiving the encoded video stream over the Internet.
- step 208 could be performed by a variety of well-known means and could even be eliminated from the method depicted in FIG. 2. For example, in a real-time streaming video application, the video stream is typically not stored prior to display.
- At step 210, the video stream is decoded.
- Decoding a video stream includes, among other things, expanding (decompressing) the encoded video stream to its original data size. That is, the encoded video stream is expanded so that it is the same size as the input video stream. The irrelevant and/or redundant video data that was removed in the encoding process is replaced with new data.
- Various, well-known algorithms are available for decoding an encoded video stream. Unfortunately, these algorithms are typically unable to return the encoded video stream to its original form without some image degradation. Consequently, a decoded video stream is typically filtered by a post-processing filter to reduce flaws (e.g., noise) in the decoded video stream.
- At step 212, the enhanced video stream is displayed.
- the method then proceeds to end block 214 and terminates.
- FIG. 3 is a block diagram depicting a system for encoding a video stream that is an exemplary embodiment of the present invention.
- the encoding system 300 receives a video input signal 302 and an audio input signal 304.
- the video input 302 is typically a series of digitized images that are linked together in series.
- the audio input 304 is simply the audio signal that is associated with the series of images making up the video input 302.
- the video input 302 is first passed through a pre-processing filter 306 that, among other things, filters noise from the video input 302 to prepare the input video stream for encoding.
- the input video stream is then passed to the video encoder 310 .
- the video encoder compresses the video signal by eliminating irrelevant and/or redundant data from the input video signal.
- the video encoder 310 may reduce the input video signal to a predetermined size to match the transmission requirements of the encoding system 300.
- the video encoder 310 may simply be configured to minimize the size of the encoded video signal. This configuration might be used, for example, to maximize the storage capacity of a storage medium (e.g., hard drive).
- the audio input 304 is compressed by the audio encoder 308.
- the encoded audio signal is then passed with the encoded video signal to the video stream multiplexer 312.
- the video stream multiplexer 312 combines the encoded audio signal and the encoded video signal so that the signals can be separated and played-back substantially simultaneously.
- the encoding system outputs the combined signal as an encoded video stream 314.
- the encoded video stream 314 is thus prepared for transmission, storage, or other processing as needed by a particular application. Often, the encoded video stream 314 will be transmitted to a decoding system that will decode the encoded video stream 314 and prepare it for subsequent display.
- the video input stream 302 can be further processed prior to encoding.
- the exemplary encoding system 300 can prepare the input video stream 302 for encoding by generating a control signal for the input video stream to facilitate compression.
- a rate controller 320 can be used to match the output bit rate of the encoder to the capacity of the transmission channel or storage device.
- the rate controller 320 can be used to control the output video quality.
- the exemplary encoding system 300 includes a picture grouping module 316, a bit allocation module 318, and a bit rate controller 320.
- the picture grouping module 316 can process a video input stream by selecting and classifying I-pictures in the video stream.
- the picture grouping module 316 can also select and classify P-pictures in the video stream.
- the picture grouping module 316 can significantly improve the quality of the encoded video stream.
- Conventional encoding systems arbitrarily select I-pictures, by adhering to fixed-size picture groups.
- the exemplary coding system 300 can adaptively select I-pictures to maximize the encoded video stream quality.
- the bit allocation module 318 can be used to enhance the quality of the encoded video bit stream by adaptively allocating bits among the groups of pictures defined by the picture grouping module 316 and by allocating bits among the pictures within a given group of pictures. Whereas conventional encoding systems often allocate bits in an arbitrary manner, the allocation module 318 can reallocate bits from the picture groups requiring less video data to picture groups requiring more video data. Consequently, the quality of the encoded video bit stream is enhanced by improving the quality of the groups of pictures requiring more video data for high quality representation.
- the bit rate controller 320 uses an improved method of conditional replenishment to further reduce the presence of noise in an encoded video bit stream.
- Conditional replenishment is a well-known aspect of video data compression.
- a picture element or a picture block will be encoded in a particular picture if the picture element or block has changed when compared to a previous picture.
- the encoder will typically set a flag or send an instruction to the decoder to simply replenish the picture element or block with the corresponding picture element or block from the previous picture.
- the bit rate controller 320 of an exemplary embodiment of the present invention instead focuses on macroblocks and may condition the replenishment of a macroblock on the change of one or more picture elements and/or blocks within the macroblock.
- the bit rate controller 320 may condition the replenishment of a macroblock on a quantification of the change within the macroblock (e.g., the average change of each block) meeting a certain threshold requirement.
- the objective of the bit rate controller 320 is to further reduce the presence of noise in video data and to simplify the encoding of a video stream.
- FIG. 4 depicts a conventional decoding system for receiving an encoded video stream and providing decoded video and audio output.
- the decoding system 400 receives an encoded video stream 402 as input to a video stream demultiplexer 404 .
- the video stream demultiplexer separates the encoded video signal and the encoded audio signal from the encoded video stream 402 .
- the encoded video signal is passed from the video stream demultiplexer 404 to the video decoder 406 .
- the encoded audio signal is passed from the video stream demultiplexer 404 to the audio decoder 410 .
- the video decoder 406 and the audio decoder 410 expand the video signal and the audio signal to a size that is substantially identical to the size of the video input and audio input described above in connection with FIG. 3.
- various well-known algorithms and processes exist for decoding an encoded video and/or audio signal. It will also be appreciated that most encoding and decoding processes are lossy, in that some of the data in the original input signal is lost. Accordingly, the video decoder 406 will reconstruct the video signal with some signal degradation, which is often perceivable as flaws in the output image.
- the post-processing filter 408 is used to counteract noise found in a decoded video signal that has been encoded and/or decoded using a lossy process. Examples of well-known noise types include mosquito noise, salt-and-pepper noise, and blockiness.
- the conventional post-processing filter 408 includes well-known algorithms to detect and counteract these and other known noise problems.
- the post-processing filter 408 generates a filtered, decoded video output 412 .
- the audio decoder 410 generates a decoded audio output 414 .
- the video output 412 and the audio output 414 may be fed to appropriate ports on a display device, such as a television, or may be provided to some other display means such as a software-based media playback component on a computer. Alternatively, the video output 412 and the audio output 414 may be stored for subsequent display.
- the video decoder 406 decompresses or expands the encoded video signal 402 . While there are various well-known methods for encoding and decoding a video signal, in all of the methods, the decoder must be able to interpret the encoded signal. The typical decoder is able to interpret the encoded signal received from an encoder, as long as the encoded signal conforms to an accepted video signal encoding standard, such as the well-known MPEG-1 and MPEG-2 standards. In addition to raw video data, the encoder typically encodes instructions to the decoder as to how the raw video data should be interpreted and represented (i.e., displayed).
- an encoded video stream may include instructions that a subsequent video picture is identical to a previous picture in a video stream.
- the encoded video stream can be further compressed, because the encoder need not send any raw video data for the subsequent video picture.
- When the decoder receives the instruction, the decoder will simply represent the subsequent picture using the same raw video data provided for the previous picture.
- Such instructions can be provided in a variety of ways, including setting a flag or bit within a data stream.
- FIG. 5 is a block diagram depicting an exemplary selection of picture encoding modes in a GOP.
- the video stream can be described in terms of I-pictures 502 , B-pictures 504 , and P-pictures 506 .
- a video stream can be represented by a series of groups of pictures (GOPs). Each GOP begins with an I-Picture and includes one or more P-pictures and/or B-pictures. As described above, the I-Picture requires the most video data and is represented without reference to any other picture in the video stream.
- the P-Picture 506 can be represented in terms of differences with the I-Picture 502 .
- the B-Picture 504 can be represented in terms of differences with the I-Picture 502 and/or the P-Picture 506 .
- the size of the GOP 508 is arbitrarily set to a specific number of pictures. Consequently, during the encoding process, the first picture is classified as the I-Picture and is followed by a collection of P-pictures and B-pictures. When the predetermined number of pictures have been collected into a GOP, a new GOP can be started. The new GOP is started by identifying a next picture as an I-Picture.
- the size of each GOP may be variable.
- I-Frames coincide with scene changes in the input video stream.
- a scene change can be detected by significant changes and/or structural breakdown of motion vectors from one picture to the next. Once a scene change has been detected, the picture following the scene change (i.e., first picture of the new scene) may be classified as an I-Picture.
- FIG. 6 is a block diagram depicting an exemplary timeline comparing the occurrence of scene changes in a video stream with alternative GOP size formats.
- the video stream 600 is represented as a series of four scenes. Scene changes occur at times 608 , 610 , and 612 .
- the GOP is set at a constant number of frames, as depicted by GOP series 604 .
- the I-Frames in GOP format 604 occur at times 616 , 618 , 620 , and 622 . None of these times correspond with the times of the scene changes in the video stream 600 .
- variable GOP format 602 is an exemplary embodiment of the present invention.
- the I-Frames of the variable GOP format coincide with the scene changes in the video stream 600 .
- the variable GOP format 602 will default to a constant GOP size and insert an I-Picture as needed, as shown at time 606 . Consequently, some GOPs of the variable GOP format 602 will be shorter than the typical size of constant GOP format 604 .
- Other GOPs of the variable GOP format 602 (e.g., GOP 614 ) will be significantly longer than the typical size of the constant GOP format 604 .
- One objective of the variable GOP format 602 of an exemplary embodiment of the present invention is to make I-pictures coincide with scene changes. Because both I-pictures and scene changes require the greatest amount of video data storage, the coincidence of these frames reduces the amount of data required to represent an encoded video stream.
- Another major objective of the variable GOP format 602 of an exemplary embodiment of the present invention is to maximize the benefit of novel adaptive bit allocation and conditional replenishment methods that are described in more detail in connection with FIGS. 8 - 12 .
- FIG. 7 is a flowchart depicting an exemplary method for creating GOPs of varying sizes.
- the method begins at start block 700 and proceeds to step 702 .
- the first GOP is created and a first picture from an input video stream is retrieved.
- the method proceeds to step 704 , wherein the first picture is classified as the I-Picture and is added to the first GOP.
- the method proceeds from step 704 to decision block 706 .
- At decision block 706 , a determination is made as to whether more pictures exist in the input video stream. If a determination is made that more pictures exist in the video stream, the method branches to step 710 . If, on the other hand, a determination is made that no more pictures exist in the video stream, the method branches to end block 708 and terminates.
- At step 710 , the next picture from the video stream is retrieved.
- the method then proceeds to decision block 712 .
- At decision block 712 , a determination is made as to whether the predefined GOP picture limit has been reached. As described above in connection with FIG. 6, in the case where a scene is longer than the predefined GOP size, the method will create a new GOP rather than allow the variable GOP to reach an indefinite size. If the predefined GOP picture limit has been reached, the method branches to step 716 and a new GOP is started. If, on the other hand, the predefined GOP picture limit has not been reached, the method branches to decision block 714 .
- pictures from an input video stream are added to a GOP until either a scene change occurs or the predefined GOP size is reached.
- Exemplary GOP sizes range from a minimum of 15 frames to a maximum of 60 frames.
- GOPs of widely varying sizes could be used within the scope of the present invention.
- the objective of the exemplary method is to make scene changes coincide with I-Frames so as to minimize the number of I-Frames and scene change frames stored in an encoded video stream.
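The grouping rule described for FIG. 7 (add pictures to a GOP until a scene change occurs or the predefined size limit is reached) can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: `group_pictures`, `is_scene_change`, and `max_gop_size` are hypothetical names, the scene-change test is supplied by the caller, and the exemplary minimum GOP size is not enforced.

```python
from typing import Callable, List, Sequence, TypeVar

Picture = TypeVar("Picture")


def group_pictures(
    pictures: Sequence[Picture],
    is_scene_change: Callable[[Picture, Picture], bool],
    max_gop_size: int = 60,
) -> List[List[Picture]]:
    """Split a picture sequence into variable-sized GOPs.

    The first picture of every returned GOP plays the role of the I-picture.
    A new GOP starts at a scene change or when the size limit is reached.
    """
    gops: List[List[Picture]] = []
    current: List[Picture] = []
    previous = None
    for picture in pictures:
        start_new = (
            not current  # very first picture of the stream
            or len(current) >= max_gop_size  # predefined GOP picture limit
            or is_scene_change(previous, picture)  # hypothetical detector
        )
        if start_new and current:
            gops.append(current)
            current = []
        current.append(picture)
        previous = picture
    if current:
        gops.append(current)
    return gops
```

For example, with pictures labelled by scene number and a limit of 4 pictures, a 5-picture scene is split into GOPs of 4 and 1, while a 3-picture scene forms a single GOP.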
- FIG. 8 is a graph depicting a typical relationship between the bits generated by a conventional compression method and a conventional group of pictures.
- the graph 800 is divided into three groups of pictures (GOPs) 802 , 804 , 806 .
- Each GOP 802 , 804 , 806 begins with an I-picture 808 , 810 , 812 .
- most conventional compression methods remove irrelevant, redundant, and/or expendable bits from a video stream. This is done by removing as much video data as possible from each picture in an input video stream.
- conventional compression methods encode pictures such that the content of the encoded pictures can be predicted from previous and/or subsequent pictures in the encoded video stream.
- I-pictures 808 , 810 , 812 are used to predict the video data content of other pictures (e.g., B-pictures, P-pictures) and typically contain more video data than other pictures in an encoded video stream.
- For I-pictures 808 , 810 , 812 , more bits are generated during the compression process than for non-I-pictures 814 , 816 , 818 .
- conventional compression methods select pictures in an input video stream as I-pictures in an arbitrary fashion, based primarily on the number of pictures in a particular GOP.
- I-pictures 808 , 810 , 812 can be selected to coincide with scene changes.
- scene-change pictures and I-pictures require the compression process to generate more bits than for non-scene change pictures or for non-I-pictures.
- an exemplary embodiment of the present invention reduces the overall number of bits generated by the compression process. Because a large number of bits must be stored with an I-picture, regardless of the picture content, classifying scene-change pictures as I-pictures simply capitalizes on this feature to reduce the overall number of bits generated by the compression process.
- FIG. 9 is a series of block diagrams and graphs comparing the generated bit graph of a conventional compression method with a generated bit graph of an exemplary embodiment of the present invention.
- An input video stream is represented as a block diagram 900 divided into scenes.
- a conventional compression method divides groups of pictures on a fixed basis (i.e., the same number of pictures per group).
- a fixed-sized GOP structure is depicted as a block diagram 904 .
- each GOP begins with an I-picture 910 - 916 .
- the fixed GOP Graph 908 has generated bit peaks that coincide with the I-frames 910 - 916 of each of the fixed-sized GOPs in the block diagram 904 .
- the fixed-sized GOP graph 908 also includes peaks coinciding with the scene changes between Scene 1 and Scene 2, between Scene 2 and Scene 3, and between Scene 3 and Scene 4. Accordingly, the conventional, fixed-size GOP compression method generates output bit peaks for both I-pictures and scene-change pictures. Therefore, the bit budget for the remaining P-pictures and B-pictures is decreased. The encoding quality of the remaining P-pictures and B-pictures is, therefore, compromised or degraded.
- variable size GOP graph 906 depicts output bit peaks coinciding primarily with scene changes in the input video stream 900 . Accordingly, the variable-sized GOP compression method of an exemplary embodiment of the present invention reduces the number of output bit peaks in the encoded video stream. More specifically, the variable-sized GOP compression method minimizes the number of double output bit peaks. These double peaks are present in the fixed-sized GOP graph 908 and are created when scene changes occur within a GOP, instead of coinciding with an I-picture of the GOP. As a result, the overall number of output bits generated by the fixed-sized GOP compression method is greater than the overall number of bits generated by the variable-sized GOP compression method of an exemplary embodiment of the present invention.
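The "double peak" argument can be made concrete by counting the pictures that are expensive to encode under each scheme. The sketch below is only an illustration: the function name and all picture positions are made up for the example, and it simply counts pictures that are either an I-picture or the first picture of a new scene.

```python
def count_bit_peaks(i_picture_positions, scene_change_positions):
    """Count pictures that produce an output bit peak.

    A picture is counted once if it is an I-picture, a scene-change picture,
    or both, which is why coinciding positions reduce the total.
    """
    return len(set(i_picture_positions) | set(scene_change_positions))


# Hypothetical 150-picture stream with scenes starting at these positions.
scene_starts = [0, 40, 95, 130]

# Fixed-size GOPs of 30 pictures: I-pictures ignore the scene boundaries.
fixed_i_pictures = list(range(0, 150, 30))

# Variable-size GOPs: every I-picture is placed on a scene start
# (all scenes here are shorter than a 60-picture limit).
variable_i_pictures = list(scene_starts)

fixed_peaks = count_bit_peaks(fixed_i_pictures, scene_starts)
variable_peaks = count_bit_peaks(variable_i_pictures, scene_starts)
```

With these made-up positions the fixed scheme yields eight peaks and the variable scheme four, because only the first I-picture of the fixed scheme happens to fall on a scene start.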
- the exemplary compression method results in a smaller number of generated compression bits.
- This advantage provides various benefits to an encoding/decoding process.
- the resultant, smaller encoded video stream can be stored and/or transmitted in its smaller state, thereby conserving system resources.
- the encoding quality can be improved by re-allocating bits from smaller GOPs to larger GOPs. This is referred to as adaptive bit allocation, because the bits allocated to a given GOP can be adapted to the GOP size, which varies depending on the scene changes in the input video stream. This benefit is described in more detail in connection with FIG. 10.
- FIG. 10 a is a flow chart depicting an exemplary method for adaptively allocating bits among variable-sized groups of pictures (GOPs).
- bits can be allocated among the variable-sized GOPs.
- bits may be allocated among the pictures within a single GOP.
- the method of FIG. 10 a begins at start block 1000 and proceeds to step 1002 .
- the target bit number of a first GOP is determined. This step may be performed prior to encoding a GOP. For example, after an input stream has been segregated into GOPs, the GOPs may be stored in a buffer. Because the GOPs in the buffer may have different sizes (i.e., contain variable numbers of pictures), they also may have different numbers of bits allocated thereto.
- the method of FIG. 10 a provides a means for adaptively allocating bits among GOPs, depending on the relative sizes of the GOPs.
- At step 1004 , the number of bits actually generated for the pictures in the GOP is determined.
- At step 1006 , a determination is made as to whether the bit size of the first GOP is less than the target bit number. If the GOP bit size is less than the target bit number, the method branches to step 1010 . If, on the other hand, the GOP size is not less than the target bit number, the method branches to end block 1016 and terminates.
- At step 1010 , the size and target bit number of a second GOP are determined.
- the method proceeds from step 1010 to step 1014 .
- At step 1014 , bits from the first GOP are allocated to the second GOP. That is, bits that would otherwise be assigned to the first GOP are reassigned to the second GOP, so that the quality of the second GOP is enhanced.
- the picture quality of the encoded video stream is directly related to the bit rate of the encoded video stream. Accordingly, by reallocating bits between GOPs in a video stream, an exemplary embodiment of the present invention can maximize the quality of the GOPs having bit sizes larger than the target size, while retaining the picture quality of GOPs having bit sizes less than the target bit size.
- Conventional encoding methods cap the bit size of any given GOP at the target bit size. Thus, for GOPs having a larger bit size, the picture quality is reduced as compared to those GOPs having smaller bit sizes.
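The reallocation between GOPs described for FIG. 10 a can be sketched as a running carry of unused bits. This is a simplified illustration under stated assumptions: `reallocate_surplus` is a hypothetical name, the per-GOP targets and generated bit counts are given as plain lists, and any surplus is carried to the next GOP in stream order (the text only requires that it go to a second GOP).

```python
def reallocate_surplus(target_bits, generated_bits):
    """Return the bit budget available to each GOP after reallocation.

    target_bits[k]    -- target bit number of GOP k
    generated_bits[k] -- bits GOP k actually generates (its demand)
    A GOP that generates fewer bits than its budget passes the unused
    bits on to the following GOP.
    """
    budgets = []
    carry = 0
    for target, generated in zip(target_bits, generated_bits):
        budget = target + carry  # own target plus inherited surplus
        budgets.append(budget)
        used = min(generated, budget)  # a GOP cannot exceed its budget
        carry = budget - used  # zero when the GOP uses everything
    return budgets
```

For instance, with three GOPs each targeted at 100 bits and demands of 60, 130, and 100 bits, the second GOP receives a budget of 140 bits and the third a budget of 110 bits.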
- FIG. 10 b is a flow chart depicting an exemplary method for adaptively allocating bits among pictures within a GOP.
- bits can be adaptively allocated between pictures within a GOP.
- N-1 bit values can be allocated to the non-I-picture frames.
- the bit allocation can be based on a per-picture target bit size.
- the bits may be allocated using the Root Mean Square (RMS) of the difference between the successive frames.
- T_p(i) represents the target bit rate for the current (i-th) picture
- R represents the target bit rate for the remaining pictures in the GOP
- RMS(i) represents the RMS value of the difference between the i-th picture and the (i-1)-th picture in the GOP.
- the target bit rate for the remaining pictures in the GOP (R) can be updated by subtracting the number of actually generated bits for each picture.
- the bits may be made available for allocation to pictures in other GOPs.
- bits can be allocated on a picture-by-picture basis within a GOP, so as to maximize the picture quality on a picture-by-picture basis.
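The allocation formula itself does not survive in this text; only its symbol definitions do. One form consistent with those definitions, offered purely as an assumption and not as the formula of the original disclosure, allocates the remaining bits in proportion to each picture's RMS difference. Here N (the number of pictures in the GOP) and b(i) (the bits actually generated for picture i) are names introduced for illustration:

```latex
T_p(i) = R \cdot \frac{\mathrm{RMS}(i)}{\sum_{k=i}^{N} \mathrm{RMS}(k)},
\qquad
R \leftarrow R - b(i)
```

Under this assumed form, a picture that differs strongly from its predecessor receives a larger share of the remaining budget, and the update of R after each picture matches the subtraction of actually generated bits described above.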
- In FIG. 10 b , an exemplary method is depicted, wherein bits are adaptively allocated among the pictures in a GOP.
- the method of FIG. 10 b may be implemented at the time that the picture size (i.e., number of pictures) for a subject (current) GOP has been defined, for example, by the Picture Grouping Module 316 described in connection with FIG. 3.
- the method begins at start block 1050 and proceeds to step 1052 .
- the size of the GOP is determined. This step may be performed by the Picture Grouping Module 316 or the pictures in the GOP may simply be re-counted.
- the method then proceeds to step 1054 , wherein the target bit number for the current GOP is determined.
- a compression process is implemented for a particular application wherein an overall bit rate is predetermined. Those skilled in the art will appreciate that this overall bit rate may be used to determine a bit rate on a per-picture basis.
- At step 1056 , the Root Mean Square (RMS) of the difference between a current picture and a previous picture is determined. Initially, the current picture will be the first picture in the GOP. This step can be performed using the formula described above. The method then proceeds to step 1058 , wherein the appropriate number of bits is actually allocated to the current picture. The method then proceeds to decision block 1060 , wherein a determination is made as to whether all of the pictures in the GOP have been encoded. If a determination is made that all of the pictures in the GOP have been encoded, the method branches to decision block 1062 . If, on the other hand, a determination is made that not all of the pictures in the GOP have been encoded, the method branches to step 1068 .
- At step 1068 , the current picture is incremented. That is, the next picture in the GOP is identified for bit allocation consideration. The method then returns to step 1056 and proceeds as described above.
- At decision block 1062 , a determination is made as to whether the number of bits actually generated by encoding all of the pictures in the GOP is less than the target bit total for all of the pictures in the GOP. If the number of bits actually generated by encoding the pictures in the GOP is not less than the target bit total for all of the pictures in the GOP, then the method branches to end block 1066 and terminates. If, on the other hand, the number of bits actually allocated to the pictures in the GOP is less than the target bit total for all of the pictures in the GOP, then the method branches to step 1064 . At step 1064 , the remaining bits (not allocated) are made available to the next GOP (or some other subsequently processed GOP) to be considered for bit allocation. The method proceeds from step 1064 to end block 1066 and terminates.
- the method efficiently allocates bits among pictures within a GOP. Where a surplus of bits exists, the method can make those bits available for subsequent GOPs, for which such a surplus does not exist. Because the GOP size is variable in accordance with exemplary embodiments of the present invention, this bit allocation method capitalizes on bit surpluses that are created by using variable GOP sizes.
- the described bit allocation methods can be used to significantly improve the output quality of an encoding system by efficiently using bits that might otherwise be imprudently allocated.
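The per-picture loop of FIG. 10 b can be sketched as below. This is a hedged illustration, not the disclosed method: the proportional weighting by RMS difference is an assumption (the source formula is not reproduced in this text), and `rms_difference`, `allocate_gop_bits`, and the `encode` callback are hypothetical names. Pictures are modelled as flat lists of pixel values.

```python
import math


def rms_difference(current, previous):
    """Root mean square of the pixel-wise difference between two pictures."""
    squared = sum((c - p) ** 2 for c, p in zip(current, previous))
    return math.sqrt(squared / len(current))


def allocate_gop_bits(rms_values, gop_target_bits, encode):
    """Allocate bits picture by picture and report any unused surplus.

    rms_values[i]   -- RMS difference between picture i and its predecessor
    gop_target_bits -- target bit number for the pictures being allocated
    encode(i, t)    -- hypothetical callback returning the bits actually
                       generated when picture i is encoded with target t
    """
    remaining = gop_target_bits
    targets = []
    for i, rms in enumerate(rms_values):
        weight_left = sum(rms_values[i:])
        if weight_left > 0:
            # Assumed rule: share of the remaining bits proportional to RMS.
            target = remaining * rms / weight_left
        else:
            # No measurable change left: split the remainder evenly.
            target = remaining / (len(rms_values) - i)
        targets.append(target)
        remaining -= encode(i, target)  # update R with generated bits
    surplus = max(remaining, 0)  # made available to a later GOP
    return targets, surplus
```

If every picture generates exactly its target, the surplus is zero; if pictures come in under target, the returned surplus is what step 1064 would pass on to a subsequent GOP.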
- Conditional replenishment is a well-known aspect of conventional compression methods. Generally, conditional replenishment refers to the elimination of redundant video data in a condition wherein video data remains unchanged between successive pictures in a GOP. More specifically, conditional replenishment is a method of “re-using” (i.e., replenishing) previously encoded video data to populate an area of a video image that is unchanged from a previous video image. When possible, such replenishment reduces the amount of new video data that must be encoded, thereby reducing the output bit rate and increasing output quality.
- FIG. 11 is a simplified illustration depicting successive pictures in an exemplary GOP divided into macroblocks.
- Picture 1100 is divided into macroblocks 1102 - 1114 .
- picture 1150 is divided into macroblocks 1152 - 1164 .
- Although the image in picture 1100 is different from the image in picture 1150 , only certain macroblocks are different.
- macroblocks 1102 - 1110 of picture 1100 are different than macroblocks 1152 - 1160 of picture 1150 .
- macroblocks 1112 - 1114 of picture 1100 are identical to macroblocks 1162 - 1164 of picture 1150 .
- picture 1150 may be represented (i.e., encoded) as being identical to picture 1100 , except for changes to macroblocks 1152 - 1160 .
- the differences can be stored or transmitted in connection with the corresponding picture. If, on the other hand, it is determined that no difference exists between corresponding coded pixels, then a flag can be set to indicate (or other instruction provided) that the pixel from the previous picture can be used, thereby eliminating a need to store additional information for the successive picture graph.
- conditional replenishment is determined by examining the results of the encoding process. If the encoding results (quantized DCT coefficients) are exactly the same between the macroblocks of the current frame and the previous frame, replenishment is used.
- conditional replenishment is performed intelligently by the encoder, based on a calculation of relevant criteria. Accordingly, if the encoder does not detect a replenishment condition, any change detected between corresponding macroblocks in successive pictures may be stored or transmitted. On the other hand, when the encoder detects a replenishment condition, then an instruction and/or flag can be used to indicate that the macroblock should be replenished using the video data from the previous picture.
- conditional replenishment on a macroblock basis enables noise reduction in an encoded video stream.
- noise is commonly detectable in a displayed video stream as a flickering or otherwise perceivable image. Often, such noise is more perceivable when it occurs in a background region (i.e., a region of substantially constant image intensity).
- conditional replenishment is processed on a macroblock basis, utilizing two-part criteria and selectable thresholds for modifying the criteria. As a result, slight differences resulting from noise in a particular macroblock can be muted (i.e., filtered).
- the first criterion can be used to determine the differences between an original macroblock and a previous macroblock.
- org(i,j) represents the (i,j)-th pixel of the original (subject) macroblock and prev(i,j) represents the (i,j)-th pixel of the original macroblock of the previous frame.
- the second criterion may be used to evaluate the effect of the decoder, by reference to the original macroblock.
- org(i,j) represents the (i,j)-th pixel of the original (subject) macroblock and coded(i,j) represents the (i,j)-th pixel of the decoded macroblock of the previous frame.
- Criterion 1 is the measurement of similarity of the corresponding macroblocks of the current frame and the previous frame.
- Criterion 2 serves as a double check of the similarity, using the decoded macroblock.
- threshold values may be selected for the two criteria, to set the sensitivity of the conditional replenishment process.
- the threshold may be automatically set such that it is adaptive to a particular bit rate.
- the threshold value for Criterion 2 may be set manually or automatically (an exemplary value for Threshold 2 is 8).
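The two criteria compare the subject macroblock against the previous original macroblock and against the previous decoded macroblock. Their exact formulas are not reproduced in this text, so the sketch below assumes a mean absolute pixel difference for both; that choice, and the function names, are assumptions made for illustration only. Macroblocks are modelled as lists of pixel rows.

```python
def mean_abs_diff(block_a, block_b):
    """Mean absolute difference between two equally sized pixel blocks."""
    total = 0
    count = 0
    for row_a, row_b in zip(block_a, block_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def criterion_1(org, prev):
    """Assumed form: difference between org(i,j) and prev(i,j)."""
    return mean_abs_diff(org, prev)


def criterion_2(org, coded):
    """Assumed form: difference between org(i,j) and coded(i,j)."""
    return mean_abs_diff(org, coded)
```

A small value for either criterion indicates that the subject macroblock closely resembles the reference it is compared against, which is the condition under which replenishment becomes attractive.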
- FIG. 12 is a flowchart depicting an exemplary method for performing conditional replenishment on a macroblock-basis.
- the method of FIG. 12 begins at start block 1200 and proceeds to step 1202 , wherein a first macroblock is compared to a second macroblock. The method then proceeds to decision block 1204 , wherein a determination is made as to whether Criterion 1 (C1) is less than Threshold 1. If at decision block 1204 , a determination is made that Criterion 1 is not less than Threshold 1, the method branches to step 1210 . At step 1210 , a flag can be set for an instruction providing that the second macroblock should be encoded using the data from the first macroblock, rather than simply replenished. The method proceeds from step 1210 to end block 1212 and terminates.
- Returning to decision block 1204 , if a determination is made that Criterion 1 is less than Threshold 1, the method branches to decision block 1206 .
- At decision block 1206 , a determination is made as to whether Criterion 2 is less than Threshold 2. If a determination is made at decision block 1206 that Criterion 2 is not less than Threshold 2, the method branches to step 1210 and proceeds as described above. If, on the other hand, a determination is made at decision block 1206 that Criterion 2 is less than Threshold 2, the method branches to step 1208 . At step 1208 , the replenishment flag is set for the second macroblock. The method proceeds from step 1208 to end block 1212 and terminates.
- the method of FIG. 12 can be used to utilize selectable criteria to reduce the encoding, decoding and display of noise.
- the replenishment of an exemplary embodiment of the present invention can thus be used to filter noise from a displayed video stream.
- criteria and/or threshold values may be used within the scope of the described embodiments of the present invention.
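The two-stage decision of FIG. 12 reduces to a pair of threshold tests. The sketch below assumes the criterion values have already been computed; `should_replenish` is a hypothetical name, and the default of 8 for Threshold 2 is the exemplary value mentioned in the text, while Threshold 1 must be supplied by the caller.

```python
def should_replenish(c1, c2, threshold_1, threshold_2=8):
    """Return True when the macroblock should be replenished.

    c1 -- Criterion 1 value (decision block 1204)
    c2 -- Criterion 2 value (decision block 1206)
    Replenishment is chosen only when both criteria fall below their
    thresholds; otherwise the macroblock is encoded (step 1210).
    """
    if c1 >= threshold_1:
        return False  # too different from the previous macroblock
    if c2 >= threshold_2:
        return False  # double check against the decoded macroblock failed
    return True  # step 1208: set the replenishment flag
```

Raising either threshold makes replenishment more likely, which mutes more of the small, noise-like differences at the cost of ignoring more genuine change.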
- a low bit rate e.g., less than 128 kbps
- Sampling is roughly defined as the determination of which pictures in a video stream will be encoded as I-pictures, B-pictures, and P-pictures.
- optimum sampling can be non-uniform (asynchronous) in one or both of the space and time domains.
- Various asynchronous techniques are well known to those skilled in the art and can be used to implement various embodiments of the present invention.
- an analysis-by-synthesis method of selecting an asynchronous sampling technique is provided.
- separately encoded candidate streams are generated using various sampling methods. Once generated, the separate candidate streams can be compared on virtually any basis to determine, for example, which has the best bit rate and signal quality characteristics.
- the best candidate stream can be selected and designated as the output video stream.
- the selected sampling method can be identified to the receiver (decoder) with a small overhead. For example, by using a codebook or dictionary of 16 possible sampling techniques, only 4 bits of overhead are needed to signify the selection.
- the codebook could be either predetermined or generated adaptively (and automatically) over time, based on criteria including extrapolation from a recent history of optimum sampling.
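The overhead figure follows from the size of the codebook: selecting one of 16 entries takes 4 bits. A small sketch of that arithmetic is given below; `selector_bits` is a hypothetical helper name, and a fixed-length index is assumed.

```python
def selector_bits(codebook_size):
    """Bits needed for a fixed-length index into a codebook of this size."""
    if codebook_size < 1:
        raise ValueError("codebook must contain at least one entry")
    # (n - 1).bit_length() equals ceil(log2(n)) for every n >= 1.
    return (codebook_size - 1).bit_length()
```

A codebook of 16 sampling techniques therefore needs 4 bits per selection, and growing it to 17 entries would raise the overhead to 5 bits.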
- FIG. 13 is a flowchart depicting an exemplary method for generating and selecting between two sampling methods.
- any number of sampling methods could be used and evaluated within the scope of the present invention.
- the generation of multiple candidate streams creates overhead as described above, and that the exemplary sampling selection method may be more easily applied to one-way communications (e.g., video streaming), than to two-way communications (video teleconferencing).
- the method of FIG. 13 begins at start block 1300 and proceeds to step 1302 .
- a first input video stream is encoded using a first sampling technique.
- the method then proceeds to step 1304 .
- a second input stream is encoded using a second sampling technique.
- the method then proceeds to step 1306 , wherein the encoded candidate video streams are compared. This comparison could be based on various characteristics of the candidate video streams. However, it is preferable that the characteristics are perceptually meaningful characteristics.
- An exemplary characteristic is the signal-to-noise-ratio of each encoded candidate video stream, as compared to the original uncompressed signal.
- The method proceeds from step 1306 to decision block 1308 .
- At decision block 1308 , a determination is made as to whether the signal-to-noise-ratio (SNR) for the first stream is higher than the SNR for the second stream. If the SNR for the first stream is better than the SNR for the second stream, then the method branches to step 1310 . At step 1310 , the first stream is output. Returning to decision block 1308 , if the SNR for the second stream is better than the SNR for the first stream, then the method branches to step 1312 . At step 1312 , the second stream is output. Accordingly, the encoded candidate streams having been encoded using different sampling techniques are compared and the best stream is output, for example, from an encoding system, together with the overhead information that signifies the corresponding sampling method.
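The analysis-by-synthesis selection of FIG. 13 can be sketched as choosing the candidate with the highest SNR relative to the original signal. This is an illustration under stated assumptions: signals are modelled as flat lists of samples, SNR is computed in decibels as signal energy over error energy, `snr_db` and `select_stream` are hypothetical names, and the dictionary keys stand in for codebook indices of the sampling techniques.

```python
import math


def snr_db(original, decoded):
    """Signal-to-noise ratio in dB of a decoded signal against the original."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise == 0:
        return math.inf  # perfect reconstruction
    if signal == 0:
        return -math.inf  # silent original with a non-zero error
    return 10 * math.log10(signal / noise)


def select_stream(original, candidates):
    """Pick the candidate with the highest SNR.

    candidates maps a sampling-technique index to the decoded samples of the
    stream encoded with that technique. Returns (index, snr_in_db); the index
    is what would be signalled to the decoder as overhead.
    """
    best = max(candidates, key=lambda k: snr_db(original, candidates[k]))
    return best, snr_db(original, candidates[best])
```

The same function handles more than two candidates, in line with the statement that any number of sampling methods could be evaluated; on a tie it keeps the first candidate encountered.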
Description
- The present application claims priority to provisional patent application entitled, “Video Processing Method with General and Specific Applications,” filed on Jul. 11, 2000 and assigned U.S. application Ser. No. 60/217,301. The present application is also related to non-provisional application entitled, “Adaptive Edge Detection and Enhancement for Image Processing,” (attorney docket number 07816-105003) filed on Jul. 11, 2001 and assigned U.S. application Ser. No. ______; and non-provisional application entitled, “System and Method for Calculating an Optimum Display Size for a Visual Object,” (attorney docket number 07816-105002) filed on Jul. 11, 2001 and assigned U.S. application Ser. No. ______.
- The present invention relates to the processing of a video stream and more specifically relates to the improvement of video stream compression by adaptively selecting a group of pictures based on video stream content, by adaptively allocating bits to generate a compressed video stream, and by adaptively replenishing macroblocks.
- Recent advancements in communication technologies have enabled the widespread distribution of data over communication mediums such as the Internet and broadband cable systems. This increased capability has led to increased demand for the distribution of a diverse range of content over these communication mediums. Whereas early uses of the Internet were often limited to the distribution of raw data, more recent advances include the distribution of HTML-based graphics and audio files.
- More recent efforts have been made to distribute video media over these communication mediums. However, because of the large amount of data needed to represent a video presentation, the data is typically compressed prior to distribution. Data compression is a well-known means for conserving transmission resources when transmitting large amounts of data or conserving storage resources when storing large amounts of data. In short, data compression involves minimizing or reducing the size of a data signal (e.g., a data file) in order to yield a more compact digital representation of that data signal. Because digital representations of audio and video data signals tend to be very large, data compression is virtually a necessary step in the process of widespread distribution of digital representations of audio and video signals.
- Fortunately, video signals are typically well suited for standard data compression techniques. Most video signals include significant data redundancy. Within a single video frame (image), there typically exists significant correlation among adjacent portions of the frame, referred to as spatial correlation. Similarly, adjacent video frames tend to include significant correlation between corresponding image portions, referred to as temporal correlation. Moreover, there is typically a considerable amount of data in an uncompressed video signal that is irrelevant. That is, the presence or absence of that data will not perceivably affect the quality of the output video signal. Because video signals often include large amounts of such redundant and irrelevant data, video signals are typically compressed prior to transmission and then decompressed again after transmission.
- Generally, the distribution of a video signal includes a transmission unit and a receiving unit. The transmission unit will receive a video signal as input and will compress the video signal and transmit the signal to the receiving unit. Compression of a video signal is usually performed by an encoder. The encoder typically reduces the data rate of the input video signal to a level that is predetermined by the capacity of the transmission medium. For example, for a typical video file transfer, the required data rate can be reduced from about 30 Megabits per second to about 384 kilobits per second. The compression ratio is defined as the ratio between the size of the input video signal and the size of the compressed video signal. If the transmission medium is capable of a high transmission rate, then a lower compression ratio can be used. On the other hand, if the transmission medium is capable of a relatively low transmission rate, then a higher compression ratio can be used.
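- The data-rate example above can be checked with a short calculation. The sketch assumes decimal units (1 Megabit = 1,000,000 bits, 1 kilobit = 1,000 bits) and uses the fact that, for a fixed playback duration, the ratio of data rates equals the ratio of signal sizes; the function name `compression_ratio` is illustrative only.

```python
def compression_ratio(input_size, compressed_size):
    """Ratio between the size of the input signal and the compressed signal."""
    if compressed_size <= 0:
        raise ValueError("compressed size must be positive")
    return input_size / compressed_size


# About 30 Megabits per second reduced to about 384 kilobits per second.
input_rate_bps = 30_000_000
output_rate_bps = 384_000
ratio = compression_ratio(input_rate_bps, output_rate_bps)  # 78.125
```

Under these assumptions the example corresponds to a compression ratio of roughly 78 to 1.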
- After the receiving unit receives the compressed video signal, the signal must be decompressed before it can be adequately displayed. The decompression process is performed by a decoder. In some applications, the decoder is used to decompress the compressed video signal so that it is identical to the original input video signal. This is referred to as lossless compression, because no data is lost in the compression and decompression processes. The majority of encoding and decoding applications, however, use lossy compression, wherein some predefined amount of the original data is irretrievably lost in the compression and expansion process. In order to decompress the video stream to its original (pre-encoding) data size, the lost data must be replaced by new data. Unfortunately, lossy compression of video signals will almost always result in the degradation of the output video signal when displayed after decoding, because the new data is usually not identical to the lost original data. Video signal degradation typically manifests itself as a perceivable flaw in a displayed video image. These flaws are typically referred to as noise. Well-known kinds of video noise include blockiness, mosquito noise, salt-and-pepper noise, and fuzzy edges. The data rate (or bit rate) often determines the quality of the decoded video stream. A video stream that was encoded with a high bit rate is generally a higher quality video stream than one encoded at a lower bit rate.
- Conventional methods of compressing video signals include the partitioning of the video signal into groups of pictures. Unfortunately, conventional compression techniques utilize inefficient and arbitrarily simple methods of grouping pictures that result in higher output signal bit rates and/or lower output signal quality. Moreover, because these conventional techniques use arbitrarily simple picture groupings, they do not provide the opportunity to maximize the output signal quality by appropriately allocating bits among pictures and picture groups in the output signal. Finally, these compression techniques typically apply compression methods that result in the propagation and amplification of noise, especially in background portions of a video picture.
- Therefore, there is a need in the art for video signal compression that efficiently groups pictures in a video stream and provides for lower output signal bit rates and higher output signal quality. The video signal compression also should maximize the output signal quality by appropriately allocating bits among pictures and picture groups in the output signal. In addition, the video signal compression also should apply compression methods that reduce noise in the output signal. Finally, the method should enable the use of various sampling techniques and should enable the selection of an output stream, based on the sampling technique providing the best video stream.
- The present invention provides video signal compression that efficiently groups pictures in a video stream into variably-sized groups of pictures (GOPs) thereby providing lower achievable output signal bit rates and higher output signal quality. The video signal compression maximizes the output signal quality by appropriately allocating bits among pictures and picture groups in the output signal. An adaptive method of bit allocation among picture groups and within the pictures in those picture groups enables the efficient allocation of bits, according to the relative sizes of the picture groups. The video signal compression of the present invention also applies compression methods that reduce noise in the output signal, by utilizing a macroblock-based tunable conditional replenishment technique. The conditional replenishment technique exploits the similarities among images in the variably-sized GOPs to further minimize output bit rate and maximize the output signal quality. An analysis-by-synthesis method is also provided to select a best asynchronous sampling method among candidate sampling procedures.
- In one aspect of the invention, a method is provided for processing an input video stream comprising a series of pictures. A first scene change is detected between a first scene in the input video stream and a second scene in the input video stream. The method classifies the first picture following the first scene change as an intra-picture (I-picture).
- In another aspect of the invention, the input stream processing method determines whether there are a predetermined number of pictures between the first I-picture and a second scene change. A second picture in the input video stream is classified as a second I-picture, where it is determined that the predetermined number of pictures exist between the first intra-picture and the second scene change, wherein the second picture coincides with the predetermined number of pictures.
- In yet another aspect of the invention, a system is provided for organizing a series of pictures in an input video stream into at least one group of pictures (GOP). The system includes a picture grouping module for detecting a scene change in the series of pictures and for classifying a first picture following the scene change as a first intra-picture (I-picture). The picture grouping module also can classify at least one other picture following the scene change as a predicted picture (P-picture) and can classify at least one second picture as a bi-directionally predicted picture (B-picture). The system also includes a bit allocation module for determining whether a first GOP uses less than a predetermined target number of bits and further operative to allocate an unneeded bit to a second GOP in response to a determination that the first GOP uses less than the predetermined target number of bits.
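- The reallocation of unneeded bits between GOPs can be sketched as follows. This is only an illustration under stated assumptions: every GOP starts from the same predetermined target, surplus bits move forward to later GOPs only, and the number of bits each GOP needs is known in advance. The function name `carry_over_unused_bits` and its parameters are hypothetical.

```python
def carry_over_unused_bits(target_bits, bits_needed):
    """Return the bits granted to each GOP when unused bits are carried forward.

    target_bits: predetermined target number of bits per GOP.
    bits_needed: bits each GOP would use if unconstrained (assumed known).
    """
    granted = []
    surplus = 0
    for needed in bits_needed:
        budget = target_bits + surplus  # target plus bits left over so far
        used = min(needed, budget)      # a GOP cannot exceed its budget
        granted.append(used)
        surplus = budget - used         # unneeded bits go to the next GOP
    return granted


# The first GOP needs only 600 of its 1000 bits, so the second GOP can use
# up to 1400 bits instead of being capped at 1000.
granted = carry_over_unused_bits(1000, [600, 1300, 1000])
```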
- The various aspects of the present invention may be more clearly understood and appreciated from a review of the following detailed description of the disclosed embodiments and by reference to the drawings and claims.
- FIG. 1 is a block diagram depicting an exemplary video stream comprised of a series of video pictures.
- FIG. 2 is a flowchart depicting an exemplary method for coding, transmitting, and decoding a video stream.
- FIG. 3 is a block diagram depicting a system for encoding a video stream that is an exemplary embodiment of the present invention.
- FIG. 4 depicts a conventional decoding system for receiving an encoded video stream and providing decoded video and audio output.
- FIG. 5 is a block diagram depicting an exemplary selection of picture encoding modes in a GOP.
- FIG. 6 is a block diagram depicting an exemplary timeline comparing the occurrence of scene changes in a video stream with alternative GOP size formats.
- FIG. 7 is a flowchart depicting an exemplary method for creating GOPs of varying sizes.
- FIG. 8 is a graph depicting a typical relationship between the bits generated by a conventional compression method and a conventional group of pictures.
- FIG. 9 is a series of block diagrams and graphs comparing the generated bit graph of a conventional compression method with a generated bit graph of an exemplary embodiment of the present invention.
- FIG. 10a is a flow chart depicting an exemplary method for adaptively allocating bits among variable-sized groups of pictures.
- FIG. 10b is a flow chart depicting an exemplary method for adaptively allocating bits among pictures within a GOP.
- FIG. 11 is a simplified illustration depicting successive pictures in an exemplary GOP divided into macroblocks.
- FIG. 12 is a flowchart depicting an exemplary method for performing conditional replenishment on a macroblock-basis.
- FIG. 13 is a flowchart depicting an exemplary method for generating and selecting between two sampling methods.
- The present invention provides video signal compression that efficiently groups pictures in a video stream into variably-sized groups of pictures (GOPs) thereby providing lower achievable output signal bit rates and higher output signal quality. The video signal compression maximizes the output signal quality by appropriately allocating bits among pictures and picture groups in the output signal. An adaptive method of bit allocation among picture groups and within the pictures in those picture groups enables the efficient allocation of bits, according to the relative sizes of the picture groups. The video signal compression of the present invention also applies compression methods that reduce noise in the output signal, by utilizing a macroblock-based tunable conditional replenishment technique. The conditional replenishment technique exploits the similarities among images in the variably-sized GOPs to further minimize output bit rate and maximize the output signal quality. An analysis-by-synthesis method is also provided to select a best asynchronous sampling method among multiple non-uniform and/or uniform sampling procedures.
- An Exemplary Operating Environment
- FIG. 1 is a block diagram depicting an exemplary video stream comprised of a series of video pictures. A video stream is simply a collection of related images that have been connected in a series to create the perception that objects in the image series are moving. Because of the large number of separate images that are required to produce a video stream, it is common that the series of images will be digitized and compressed, so that the entire video stream requires less space for transmission or storage. The process of compressing such a digitized video stream is often referred to as “encoding.” Among other things, encoding a video stream typically involves removing the irrelevant and/or redundant digital data from the digitized video stream. Once the video stream has been so compressed, a video stream must usually be decompressed before it can be properly rendered or displayed.
- The video stream 100 depicted in FIG. 1 includes six separate images or pictures 102-112. Typically, a video stream is displayed to a viewer at about 30 frames per second. Therefore, the video stream 100 depicted in FIG. 1 would provide about 0.2 seconds of playback at the typical display rate.
- Generally, there is little noticeable change from one picture in the series to the next. If a video stream were to be stored or transmitted without compression, large amounts of redundant data would be stored because of the significant video data overlap from one frame to the next. For video stream storage, the storage of such redundant data is consumptive of memory resources. For video stream transmission, the transmission of such redundant data significantly increases transmission time and may be impossible at certain data transmission rates.
- Video stream compression is one means for reducing the size of a video stream. In short, video stream compression involves the elimination of irrelevant and/or redundant video data from the video stream. Moreover, many compression methods store only enough video data on a frame-by-frame basis to represent the differences from one frame to the next. For example, many compression methods store an intra-picture (I-Picture) that includes all or most of the video data for a particular frame/picture in a video stream. Subsequent pictures can be represented by predicted pictures (P-pictures) or by bi-directionally predicted pictures (B-pictures). P-pictures are encoded using motion-compensated prediction from a previous I-Picture or a previous P-Picture. B-pictures are encoded using motion-compensated prediction from either previous or subsequent I-pictures or P-pictures. B-pictures are not used in the prediction of other B-pictures or other P-pictures. Accordingly, I-pictures require the most video data and can be compressed the least. P-pictures require less video data than I-pictures and can be significantly compressed. B-pictures require the least video data and can be compressed the most.
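- The prediction rules for the three picture types can be summarized in a short sketch. It assumes pictures are listed in display order and that each predicted picture refers to the nearest I-picture or P-picture in the permitted direction; the choice of the nearest one, and the function name `reference_pictures`, are assumptions made for illustration.

```python
def reference_pictures(picture_types):
    """Map each picture to the indices of the pictures it is predicted from.

    picture_types: string of 'I', 'P' and 'B' in display order.
    Returns a list of index lists, one per picture.
    """
    # Only I-pictures and P-pictures may serve as prediction references.
    anchors = [i for i, kind in enumerate(picture_types) if kind in "IP"]
    references = []
    for i, kind in enumerate(picture_types):
        previous = [a for a in anchors if a < i]
        following = [a for a in anchors if a > i]
        if kind == "I":
            references.append([])                             # no reference
        elif kind == "P":
            references.append(previous[-1:])                  # previous I or P
        elif kind == "B":
            references.append(previous[-1:] + following[:1])  # both directions
        else:
            raise ValueError(f"unknown picture type: {kind!r}")
    return references


# An I-picture, a B-picture and a P-picture in display order.
refs = reference_pictures("IBP")  # [[], [0, 2], [0]]
```

In this sketch no B-picture index ever appears in a reference list, which reflects the rule that B-pictures are not used to predict other pictures.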
- In the example of FIG. 1, the first picture 102 is an I-Picture. Accordingly, much of the video data of the image of the first picture 102 would be used to represent the first picture 102. The second picture 104 may be a B-Picture and, thus, may be represented in terms of video data differences with the I-Picture 102. Because the B-Picture 104 is bi-directionally predicted, it may also be represented in terms of differences with the P-Picture 106. The P-Picture 106, in turn, is predicted in terms of differences with the I-Picture 102. The P-Picture 106 is not represented in terms of differences with the B-Picture 104.
- Differences between video pictures are often predicted based on calculated motion vectors. Motion vectors are well-known mathematical representations of the movement and/or expected movement of visual “objects” in a series of pictures in a video stream. In order to track and predict the motion of objects, pictures are divided into picture elements (pels). A pel may be a video pixel or some other definable division of a picture. In any event, object motion can be tracked by reference to corresponding pels in a series of related video pictures.
- Often, a video picture (or other digitized picture) is encoded as a collection of blocks 116. Each block is typically an 8-by-8 square of pels. In addition, video pictures also are commonly divided into macroblocks that usually contain 6 blocks (4 blocks for the luminance signal and 2 blocks for the chrominance signal). Those skilled in the art will appreciate that the division of video pictures into blocks and macroblocks is arbitrary, but helpful to the creation of video compression standards. Moreover, the division of pictures into such blocks enables the representation of P-pictures and B-pictures in terms of other pictures in the video stream. This block/macroblock-based representation facilitates picture comparisons, based on corresponding portions of successive pictures. As described above, this representation further facilitates the compression of a video stream.
- FIG. 2 is a flowchart depicting an exemplary method for coding, transmitting, and decoding a video stream. One application for which the described exemplary embodiment of the present invention is particularly suited is that of video stream processing. Because of the large number of separate images that are required to produce a video stream, it is common that the series of images will be digitized and compressed (encoded), so that the entire video stream requires less space for transmission or storage. Once the video stream has been so compressed, the video stream must usually be decompressed before it can be properly displayed. The flow chart of FIG. 2 depicts the steps that are generally followed to encode, decode, and display a video stream.
- The method of FIG. 2 begins at start block 200 and proceeds to step 202. At step 202, the input video stream is prepared for encoding. Step 202 may be performed by an encoder or prior to sending the video stream to an encoder. In any event, the video stream can be modified to facilitate encoding. Indeed, various exemplary embodiments of the present invention are directed to various aspects of performing this step. The following Figures and accompanying text are drawn to describing those embodiments.
- The method proceeds from step 202 to step 204. At step 204, the input video stream is encoded. As described, the encoding process involves, among other things, the compression of the digitized data making up the input video stream. For the purposes of this description, the terms “encoding” and “compression” are used interchangeably. Once the video stream has been encoded, it can be transmitted or stored in its compressed form. At step 206, the encoded video bit stream is transmitted. Often this transmission can be made over conventional broadcast infrastructure, but could also be over broadband communication resources and/or internet-based communication resources.
- The method proceeds from step 206 to step 208. At step 208, the received, encoded video stream is stored. As described above, the compressed video stream is significantly smaller than the input video stream. Accordingly, the storage of the received, encoded video stream requires fewer memory resources than storage of the input video stream would require. This storage step may be performed, for example, by a computer receiving the encoded video stream over the Internet. Those skilled in the art will appreciate that step 208 could be performed by a variety of well-known means and could even be eliminated from the method depicted in FIG. 2. For example, in a real-time streaming video application, the video stream is typically not stored prior to display.
- The method proceeds from step 208 to step 210. At step 210, the video stream is decoded. Decoding a video stream includes, among other things, expanding (decompressing) the encoded video stream to its original data size. That is, the encoded video stream is expanded so that it is the same size as the input video stream. The irrelevant and/or redundant video data that was removed in the encoding process is replaced with new data. Various, well-known algorithms are available for decoding an encoded video stream. Unfortunately, these algorithms are typically unable to return the encoded video stream to its original form without some image degradation. Consequently, a decoded video stream is typically filtered by a post-processing filter to reduce flaws (e.g., noise) in the decoded video stream.
- Once the video stream has been decoded, it is suitable for displaying. The method of FIG. 2 proceeds from step 210 to step 212 and the enhanced video stream is displayed. The method then proceeds to end block 214 and terminates.
- An Exemplary Encoding System
- FIG. 3 is a block diagram depicting a system for encoding a video stream that is an exemplary embodiment of the present invention. The encoding system 300 receives a video input signal 302 and an audio input signal 304. The video input 302 is typically a series of digitized images that are linked together in series. The audio input 304 is simply the audio signal that is associated with the series of images making up the video input 302.
- The video input 302 is first passed through a pre-processing filter 306 that, among other things, filters noise from the video input 302 to prepare the input video stream for encoding. The input video stream is then passed to the video encoder 310. The video encoder compresses the video signal by eliminating irrelevant and/or redundant data from the input video signal. The video encoder 310 may reduce the input video signal to a predetermined size to match the transmission requirements of the encoding system 300. Alternatively, the video encoder 310 may simply be configured to minimize the size of the encoded video signal. This configuration might be used, for example, to maximize the storage capacity of a storage medium (e.g., hard drive).
- In a similar fashion, the audio input 304 is compressed by the audio encoder 308. The encoded audio signal is then passed with the encoded video signal to the video stream multiplexer 312. The video stream multiplexer 312 combines the encoded audio signal and the encoded video signal so that the signals can be separated and played back substantially simultaneously. After the encoded video and encoded audio signals have been combined, the encoding system outputs the combined signal as an encoded video stream 314. The encoded video stream 314 is thus prepared for transmission, storage, or other processing as needed by a particular application. Often, the encoded video stream 314 will be transmitted to a decoding system that will decode the encoded video stream 314 and prepare it for subsequent display.
- In an exemplary embodiment of the present invention, the video input stream 302 can be further processed prior to encoding. In addition to the pre-processing performed by the pre-processing filter 306, the exemplary encoding system 300 can prepare the input video stream 302 for encoding by generating a control signal for the input video stream to facilitate compression. For example, a rate controller 320 can be used to match the output bit rate of the encoder to the capacity of the transmission channel or storage device. Furthermore, the rate controller 320 can be used to control the output video quality. For efficient rate control, the exemplary encoding system 300 includes a picture grouping module 316, a bit allocation module 318 and a bit rate controller 320.
- The picture grouping module 316 can process a video input stream by selecting and classifying I-pictures in the video stream. The picture grouping module 316 can also select and classify P-pictures in the video stream. As is discussed in more detail below, the picture grouping module 316 can significantly improve the quality of the encoded video stream. Conventional encoding systems arbitrarily select I-pictures, by adhering to fixed-size picture groups. The exemplary encoding system 300 can adaptively select I-pictures to maximize the encoded video stream quality.
- The bit allocation module 318 can be used to enhance the quality of the encoded video bit stream by adaptively allocating bits among the groups of pictures defined by the picture grouping module 316 and by allocating bits among the pictures within a given group of pictures. Whereas conventional encoding systems often allocate bits in an arbitrary manner, the allocation module 318 can reallocate bits from the picture groups requiring less video data to picture groups requiring more video data. Consequently, the quality of the encoded video bit stream is enhanced by improving the quality of the groups of pictures requiring more video data for high quality representation.
- The bit rate controller 320 uses an improved method of conditional replenishment to further reduce the presence of noise in an encoded video bit stream. Conditional replenishment is a well-known aspect of video data compression. In conventional encoding systems, a picture element or a picture block will be encoded in a particular picture if the picture element or block has changed when compared to a previous picture. Where the picture element or block has not changed, the encoder will typically set a flag or send an instruction to the decoder to simply replenish the picture element or block with the corresponding picture element or block from the previous picture. The bit rate controller 320 of an exemplary embodiment of the present invention instead focuses on macroblocks and may condition the replenishment of a macroblock on the change of one or more picture elements and/or blocks within the macroblock. Alternatively, the bit rate controller 320 may condition the replenishment of a macroblock on a quantification of the change within the macroblock (e.g., the average change of each block) meeting a certain threshold requirement. In any event, the objective of the bit rate controller 320 is to further reduce the presence of noise in video data and to simplify the encoding of a video stream.
- A Conventional Decoding System
- FIG. 4 depicts a conventional decoding system for receiving an encoded video stream and providing decoded video and audio output. The decoding system 400 receives an encoded video stream 402 as input to a video stream demultiplexer 404. The video stream demultiplexer separates the encoded video signal and the encoded audio signal from the encoded video stream 402. The encoded video signal is passed from the video stream demultiplexer 404 to the video decoder 406. Similarly, the encoded audio signal is passed from the video stream demultiplexer 404 to the audio decoder 410. The video decoder 406 and the audio decoder 410 expand the video signal and the audio signal to a size that is substantially identical to the size of the video input and audio input described above in connection with FIG. 3. Those skilled in the art will appreciate that various well-known algorithms and processes exist for decoding an encoded video and/or audio signal. It will also be appreciated that most encoding and decoding processes are lossy, in that some of the data in the original input signal is lost. Accordingly, the video decoder 406 will reconstruct the video signal with some signal degradation, which is often perceivable as flaws in the output image.
- The post-processing filter 408 is used to counteract noise found in a decoded video signal that has been encoded and/or decoded using a lossy process. Examples of well-known noise types include mosquito noise, salt-and-pepper noise, and blockiness. The conventional post-processing filter 408 includes well-known algorithms to detect and counteract these and other known noise problems. The post-processing filter 408 generates a filtered, decoded video output 412. Similarly, the audio decoder 410 generates a decoded audio output 414. The video output 412 and the audio output 414 may be fed to appropriate ports on a display device, such as a television, or may be provided to some other display means such as a software-based media playback component on a computer. Alternatively, the video output 412 and the audio output 414 may be stored for subsequent display.
- As described above, the video decoder 406 decompresses or expands the encoded video signal 402. While there are various well-known methods for encoding and decoding a video signal, in all of the methods, the decoder must be able to interpret the encoded signal. The typical decoder is able to interpret the encoded signal received from an encoder, as long as the encoded signal conforms to an accepted video signal encoding standard, such as the well-known MPEG-1 and MPEG-2 standards. In addition to raw video data, the encoder typically encodes instructions to the decoder as to how the raw video data should be interpreted and represented (i.e., displayed). For example, an encoded video stream may include instructions that a subsequent video picture is identical to a previous picture in a video stream. In this case, the encoded video stream can be further compressed, because the encoder need not send any raw video data for the subsequent video picture. When the decoder receives the instruction, the decoder will simply represent the subsequent picture using the same raw video data provided for the previous picture. Those skilled in the art will appreciate that such instructions can be provided in a variety of ways, including setting a flag or bit within a data stream.
- FIG. 5 is a block diagram depicting an exemplary selection of picture encoding modes in a GOP. As described above in connection with FIG. 1, the video stream can be described in terms of I-pictures 502, B-pictures 504, and P-pictures 506. A video stream can be represented by a series of groups of pictures (GOPs). Each GOP begins with an I-Picture and includes one or more P-pictures and/or B-pictures. As described above, the I-Picture requires the most video data and is represented without reference to any other picture in the video stream. The P-Picture 506 can be represented in terms of differences with the I-Picture 502. Likewise, the B-Picture 504 can be represented in terms of differences with the I-Picture 502 and/or the P-Picture 506. In conventional encoding methods, the size of the GOP 508 is arbitrarily set to a specific number of pictures. Consequently, during the encoding process, the first picture is classified as the I-Picture and is followed by a collection of P-pictures and B-pictures. When the predetermined number of pictures has been collected into a GOP, a new GOP can be started. The new GOP is started by identifying a next picture as an I-Picture.
- In an exemplary embodiment of the present invention, the size of each GOP may be variable. In one embodiment, I-Frames coincide with scene changes in the input video stream. As is well known in the art, a scene change can be detected by significant changes and/or structural breakdown of motion vectors from one picture to the next. Once a scene change has been detected, the picture following the scene change (i.e., first picture of the new scene) may be classified as an I-Picture.
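- The selection of I-pictures for variably-sized GOPs can be sketched as follows. The sketch assumes scene changes have already been detected and are supplied as the indices of the first picture of each new scene; it also applies the fallback from the summary of the invention, in which a further I-picture is inserted once a predetermined number of pictures has passed without a scene change. The function name `classify_i_pictures` and the parameter `max_gop_size` are hypothetical.

```python
def classify_i_pictures(num_pictures, scene_change_indices, max_gop_size):
    """Return the indices of pictures classified as I-pictures.

    scene_change_indices: index of the first picture of each new scene.
    max_gop_size: predetermined number of pictures after which a new GOP
                  is started even if no scene change has occurred.
    """
    if max_gop_size < 1:
        raise ValueError("max_gop_size must be at least 1")
    scene_starts = set(scene_change_indices)
    i_pictures = []
    pictures_in_gop = 0
    for index in range(num_pictures):
        starts_new_gop = (
            index == 0                          # first picture of the stream
            or index in scene_starts            # first picture of a new scene
            or pictures_in_gop >= max_gop_size  # long scene: fixed-size fallback
        )
        if starts_new_gop:
            i_pictures.append(index)
            pictures_in_gop = 1
        else:
            pictures_in_gop += 1
    return i_pictures


# 20 pictures, new scenes starting at pictures 5 and 8, at most 6 pictures per GOP.
i_pictures = classify_i_pictures(20, [5, 8], 6)  # [0, 5, 8, 14]
```

In this example the GOPs contain 5, 3, 6 and 6 pictures, so short scenes produce short GOPs and a long scene is split at the predetermined size.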
- FIG. 6 is a block diagram depicting an exemplary timeline comparing the occurrence of scene changes in a video stream with alternative GOP size formats. The video stream 600 is represented as a series of four scenes. Scene changes occur at times GOP series 604. Notably, the I-Frames in GOP format 604 occur at times video stream 600.
- The variable GOP format 602 is an exemplary embodiment of the present invention. Typically, the I-Frames of the variable GOP format coincide with the scene changes in the video stream 600. However, where a scene is sufficiently long, the variable GOP format 602 will default to a constant GOP size and insert an I-Picture as needed, as shown at time 606. Consequently, some GOPs of the variable GOP format 602 will be longer than the typical size of constant GOP format 604. Other GOPs of the variable GOP format 602 (e.g., GOP 614) will be significantly longer than the typical size of the constant GOP format 604.
- A major objective of the variable GOP format 602 of an exemplary embodiment of the present invention is to make I-pictures coincide with scene changes. Because both I-pictures and scene changes require the most video data storage, the coincidence of these frames reduces the amount of data required to represent an encoded video stream. Another major objective of the variable GOP format 602 of an exemplary embodiment of the present invention is to maximize the benefit of novel adaptive bit allocation and conditional replenishment methods that are described in more detail in connection with FIGS. 8-12.
- An Exemplary Method for Generating Variably-sized Groups of Pictures
- FIG. 7 is a flowchart depicting an exemplary method for creating GOPs of varying sizes. The method begins at start block 700 and proceeds to step 702. At step 702, the first GOP is created and a first picture from an input video stream is retrieved. The method proceeds to step 704, wherein the first picture is classified as the I-Picture and is added to the first GOP.
- The method proceeds from step 704 to decision block 706. At decision block 706, a determination is made as to whether more pictures exist in the input video stream. If more pictures exist in the video stream, the method branches to step 710. If, on the other hand, no more pictures exist in the video stream, the method branches to end block 708 and terminates.
- At step 710, the next picture from the video stream is retrieved. The method then proceeds to decision block 712. At decision block 712, a determination is made as to whether the predefined GOP picture limit has been reached. As described above in connection with FIG. 6, in the case where a scene is longer than the predefined GOP size, the method will create a new GOP rather than allow the variable GOP to grow indefinitely. If the predefined GOP picture limit has been reached, the method branches to step 716 and a new GOP is started. If, on the other hand, the predefined GOP picture limit has not been reached, the method branches to decision block 714.
- At decision block 714, a determination is made as to whether a scene change has been reached in the video stream. As described above, a scene change can be detected by various well-known means. If a scene change has been detected, the method branches to step 716 and a new GOP is started. If, on the other hand, a scene change has not been detected, the method branches to step 718 and the retrieved picture is added to the current GOP. The method proceeds from step 718 to decision block 706 and proceeds as described above.
- Accordingly, pictures from an input video stream are added to a GOP until either a scene change occurs or the predefined GOP size is reached. Exemplary GOP sizes range from a minimum of 15 frames to a maximum of 60 frames. Those skilled in the art will appreciate that GOPs of widely varying sizes could be used within the scope of the present invention. As described above, the objective of the exemplary method is to make scene changes coincide with I-Frames so as to minimize the number of I-Frames and scene-change frames stored in an encoded video stream.
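The grouping loop of FIG. 7 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `group_pictures`, the `is_scene_change` callback, and the default limit of 60 pictures are assumptions chosen for the example, and the scene-change detector itself is left to the caller. The sketch checks the picture limit before the scene change, mirroring decision blocks 712 and 714, and does not enforce any minimum GOP size.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

Picture = TypeVar("Picture")


def group_pictures(
    pictures: Iterable[Picture],
    is_scene_change: Callable[[Picture, Picture], bool],
    max_gop_size: int = 60,  # assumed limit, taken from the exemplary maximum
) -> List[List[Picture]]:
    """Split a picture sequence into variable-sized GOPs.

    The first picture of every returned group plays the role of the I-Picture.
    """
    gops: List[List[Picture]] = []
    current: List[Picture] = []
    previous: Optional[Picture] = None
    for picture in pictures:
        if not current:
            # Steps 702/704: the very first picture opens the first GOP.
            current = [picture]
        elif len(current) >= max_gop_size or is_scene_change(previous, picture):
            # Decision blocks 712/714 -> step 716: close the GOP, start a new one.
            gops.append(current)
            current = [picture]
        else:
            # Step 718: add the retrieved picture to the current GOP.
            current.append(picture)
        previous = picture
    if current:
        gops.append(current)
    return gops
```

For example, with pictures labelled by scene, `group_pictures("aaabbbbbbbcc", lambda p, q: p != q, max_gop_size=4)` splits the long middle scene once the limit is reached and otherwise starts a new group at each scene change.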
- FIG. 8 is a graph depicting a typical relationship between the bits generated by a conventional compression method and a conventional group of pictures. The graph 800 is divided into three groups of pictures (GOPs) 802, 804, 806. Each GOP begins with an I-picture, which is followed by P-pictures and B-pictures.
- Referring again to FIG. 8, it is apparent that more bits are generated for the I-pictures than for the P-pictures and B-pictures.
- FIG. 9 is a series of block diagrams and graphs comparing the generated bit graph of a conventional compression method with a generated bit graph of an exemplary embodiment of the present invention. An input video stream is represented as a block diagram 900 divided into scenes. As described above, a conventional compression method divides groups of pictures on a fixed basis (i.e., the same number of pictures per group). A fixed-sized GOP structure is depicted as a block diagram 904. As described in connection with FIG. 8, each GOP begins with an I-picture 910-916. The fixed-sized GOP graph 908 has generated bit peaks that coincide with the I-frames 910-916 of each of the fixed-sized GOPs in the block diagram 904. In addition, the fixed-sized GOP graph 908 also includes peaks coinciding with the scene changes between Scene 1 and Scene 2, between Scene 2 and Scene 3, and between Scene 3 and Scene 4. Accordingly, the conventional, fixed-size GOP compression method generates output bit peaks for both I-pictures and scene-change pictures. Therefore, the bit budget for the remaining P-pictures and B-pictures is decreased. The encoding quality of the remaining P-pictures and B-pictures is, therefore, compromised or degraded.
- The variable-sized GOP graph 906, on the other hand, depicts output bit peaks coinciding primarily with scene changes in the input video stream 900. Accordingly, the variable-sized GOP compression method of an exemplary embodiment of the present invention reduces the number of output bit peaks in the encoded video stream. More specifically, the variable-sized GOP compression method minimizes the number of double output bit peaks. These double peaks are present in the fixed-sized GOP graph 908 and are created when scene changes occur within a GOP, instead of coinciding with an I-picture of the GOP. As a result, the overall number of output bits generated by the fixed-sized GOP compression method is greater than the overall number of bits generated by the variable-sized GOP compression method of an exemplary embodiment of the present invention.
- Accordingly, the exemplary compression method results in a smaller number of generated compression bits. This advantage provides various benefits to an encoding/decoding process. First, the resultant, smaller encoded video stream can be stored and/or transmitted in its smaller state, thereby conserving system resources. Alternatively, the encoding quality can be improved by re-allocating bits from smaller GOPs to larger GOPs. This is referred to as adaptive bit allocation, because the bits allocated to a given GOP can be adapted to the GOP size, which varies depending on the scene changes in the input video stream. This benefit is described in more detail in connection with FIG. 10.
- Exemplary Methods for Adaptive Bit Allocation
- FIG. 10a is a flow chart depicting an exemplary method for adaptively allocating bits among variable-sized groups of pictures (GOPs). In an exemplary embodiment of the present invention, bits can be allocated among the variable-sized GOPs. In addition, bits may be allocated among the pictures within a single GOP. These methods may be utilized individually or in concert to maximize the image quality of a compressed video stream and of the pictures within a GOP, while benefiting from the enhanced compression processes of exemplary embodiments of the present invention.
- The method of FIG. 10a begins at start block 1000 and proceeds to step 1002. At step 1002, the target bit number of a first GOP is determined. This step may be performed prior to encoding a GOP. For example, after an input stream has been segregated into GOPs, the GOPs may be stored in a buffer. Because the GOPs in the buffer may have different sizes (i.e., contain variable numbers of pictures), they also may have different numbers of bits allocated thereto. The method of FIG. 10a provides a means for adaptively allocating bits among GOPs, depending on the relative sizes of the GOPs.
- The method proceeds from step 1002 to step 1004. At step 1004, the number of bits actually generated for the pictures in the GOP is determined. The method proceeds from step 1004 to decision block 1006. At decision block 1006, a determination is made as to whether the bit size of the first GOP is less than the target bit number. If the GOP bit size is less than the target bit number, the method branches to step 1010. If, on the other hand, the GOP bit size is not less than the target bit number, the method branches to end block 1016 and terminates.
- At step 1010, the size and target bit number of a second GOP are determined. The method proceeds from step 1010 to step 1014. At step 1014, bits from the first GOP are allocated to the second GOP. That is, bits that would otherwise be assigned to the first GOP are reassigned to the second GOP, so that the quality of the second GOP is enhanced. As described above, the picture quality of the encoded video stream is directly related to the bit rate of the encoded video stream. Accordingly, by reallocating bits between GOPs in a video stream, an exemplary embodiment of the present invention can maximize the quality of the GOPs having bit sizes larger than the target size, while retaining the picture quality of GOPs having bit sizes less than the target bit size. Conventional encoding methods cap the bit size of any given GOP at the target bit size. Thus, for GOPs having a larger bit size, the picture quality is reduced as compared to those GOPs having smaller bit sizes.
-
- where Tp(i) represents the target bit rate for a current picture, R represents the target bit rate for the remaining pictures in the GOP, and RMS(i) represents the RMS value of the difference between the ith picture and the (i-1)th picture in the GOP. After encoding each picture in the GOP, the target bit rate for the remaining pictures in the GOP (R) can be updated by subtracting the number of actually generated bits for each picture. When the number of bits that have actually been generated for all of the pictures in the GOP is less than the target bit rate, then the bits may be made available for allocation to pictures in other GOPs. In this embodiment of the present invention, bits can be allocated on a picture-by-picture basis within a GOP, so as to maximize the picture quality on a picture-by-picture basis.
- Turning now to FIG. 10b, an exemplary method is depicted, wherein bits are adaptively allocated among the pictures in a GOP. The method of FIG. 10b may be implemented at the time that the picture size (i.e., number of pictures) for a subject (current) GOP has been defined, for example, by the Picture Grouping Module 316 described in connection with FIG. 3. The method begins at start block 1050 and proceeds to step 1052. At step 1052, the size of the GOP is determined. This step may be performed by the Picture Grouping Module 316 or the pictures in the GOP may simply be re-counted. The method then proceeds to step 1054, wherein the target bit number for the current GOP is determined. Typically, a compression process is implemented for a particular application wherein an overall bit rate is predetermined. Those skilled in the art will appreciate that this overall bit rate may be used to determine a bit rate on a per-picture basis.
- The method proceeds from step 1054 to step 1056. At step 1056, the Root Mean Square (RMS) of the difference between a current picture and a previous picture is determined. Initially, the current picture will be the first picture in the GOP. This step can be performed using the formula described above. The method then proceeds to step 1058, wherein the appropriate number of bits is actually allocated to the current picture. The method then proceeds to decision block 1060, wherein a determination is made as to whether all of the pictures in the GOP have been encoded. If all of the pictures in the GOP have been encoded, the method branches to decision block 1062. If, on the other hand, not all of the pictures in the GOP have been encoded, the method branches to step 1068.
- At step 1068, the current picture is incremented. That is, the next picture in the GOP is identified for bit allocation consideration. The method then proceeds to step 1056 and proceeds as described above. Returning now to decision block 1062, a determination is made as to whether the number of bits actually generated by encoding all of the pictures in the GOP is less than the target bit total for all of the pictures in the GOP. If the number of bits actually generated by encoding the pictures in the GOP is not less than the target bit total for all of the pictures in the GOP, then the method branches to end block 1066 and terminates. If, on the other hand, the number of bits actually generated for the pictures in the GOP is less than the target bit total for all of the pictures in the GOP, then the method branches to step 1064. At step 1064, the remaining (unallocated) bits are made available to the next GOP (or some other subsequently processed GOP) to be considered for bit allocation. The method proceeds from step 1064 to end block 1066 and terminates.
- Accordingly, the method efficiently allocates bits among pictures within a GOP. Where a surplus of bits exists, the method can make those bits available for subsequent GOPs, for which such a surplus does not exist. Because the GOP size is variable in accordance with exemplary embodiments of the present invention, this bit allocation method capitalizes on bit surpluses that are created by using variable GOP sizes. The described bit allocation methods can be used to significantly improve the output quality of an encoding system by efficiently using bits that might otherwise be imprudently allocated.
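The per-picture loop of FIG. 10b, together with the surplus hand-off to a later GOP, might look something like the sketch below. The allocation formula itself does not survive in this copy of the text, so the sketch assumes a simple proportional rule, Tp(i) = R * RMS(i) / (sum of RMS over the pictures not yet encoded); that rule is an illustrative assumption consistent with the definitions of Tp(i), R, and RMS(i), not a quotation of the patent's formula. The names `allocate_gop_bits`, `encode_picture`, and `carry_in` are likewise hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def allocate_gop_bits(
    rms: Sequence[float],
    gop_target_bits: float,
    encode_picture: Callable[[int, float], float],
    carry_in: float = 0.0,
) -> Tuple[List[float], float]:
    """Allocate bits picture by picture within one GOP.

    rms[i] is the RMS difference between picture i and its predecessor.
    encode_picture(i, target) is a caller-supplied stand-in for the encoder;
    it returns the number of bits actually generated for picture i.
    Returns (bits generated per picture, surplus available to a later GOP).
    """
    remaining = gop_target_bits + carry_in  # R, including any inherited surplus
    generated: List[float] = []
    for i in range(len(rms)):
        weight_left = sum(rms[i:])
        if weight_left > 0:
            # ASSUMED rule: share R in proportion to the picture's RMS value.
            target = remaining * rms[i] / weight_left
        else:
            # No measured change left: split the remainder evenly.
            target = remaining / (len(rms) - i)
        bits = encode_picture(i, target)
        generated.append(bits)
        remaining -= bits  # update R with the bits actually generated
    surplus = max(remaining, 0.0)  # step 1064: unused bits go to a later GOP
    return generated, surplus
```

Passing the returned surplus as `carry_in` when processing the next GOP gives the GOP-to-GOP reallocation described for FIG. 10a.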
- An Exemplary Method of Conditional Replenishment
- Conditional replenishment is a well-known aspect of conventional compression methods. Generally, conditional replenishment refers to the elimination of redundant video data in a condition wherein video data remains unchanged between successive pictures in a GOP. More specifically, conditional replenishment is a method of “re-using” (i.e., replenishing) previously encoded video data to populate an area of a video image that is unchanged from a previous video image. When possible, such replenishment reduces the amount of new video data that must be encoded, thereby reducing the output bit rate and increasing output quality.
- Because successive pictures within an exemplary variable-sized GOP are typically members of the same scene in an input video stream, the opportunity for conditional replenishment is increased within a given GOP. Accordingly, the scene-oriented GOP sizing of exemplary embodiments of the present invention enhances the performance of conventional replenishment methods. In addition, because of the similarity between successive pictures in a given GOP, a novel variation of conditional replenishment is applied in an exemplary embodiment of the present invention to further enhance video stream compression.
- FIG. 11 is a simplified illustration depicting successive pictures in an exemplary GOP divided into macroblocks. Picture 1100 is divided into macroblocks 1102-1114. Likewise, picture 1150 is divided into macroblocks 1152-1164. Although the image in picture 1100 is different from the image in picture 1150, only certain macroblocks are different. Specifically, macroblocks 1102-1110 of picture 1100 are different from macroblocks 1152-1160 of picture 1150. On the other hand, macroblocks 1112-1114 of picture 1100 are identical to macroblocks 1162-1164 of picture 1150. Accordingly, picture 1150 may be represented (i.e., encoded) as being identical to picture 1100, except for changes to macroblocks 1152-1160.
- When it is determined that a difference exists between corresponding coded pixels in the macroblock, the differences can be stored or transmitted in connection with the corresponding picture. If, on the other hand, it is determined that no difference exists between corresponding coded pixels, then a flag can be set (or other instruction provided) to indicate that the pixel from the previous picture can be used, thereby eliminating a need to store additional information for the successive picture.
- In conventional conditional replenishment, the replenishment condition is determined by examining the results of the encoding process. If the encoding results (quantized DCT coefficients) are exactly the same between the macroblocks of the current frame and the previous frame, replenishment is used. In an exemplary embodiment of the present invention, on the other hand, conditional replenishment is performed intelligently by the encoder, based on a calculation of relevant criteria. Accordingly, if the encoder does not detect a replenishment condition, any change detected between corresponding macroblocks in successive pictures may be stored or transmitted. On the other hand, when the encoder detects a replenishment condition, then an instruction and/or flag can be used to indicate that the macroblock should be replenished using the video data from the previous picture.
- Advantageously, conditional replenishment on a macroblock basis enables noise reduction in an encoded video stream. When an encoded video stream is decoded, noise is commonly detectable in a displayed video stream as a flickering or otherwise perceivable image. Often, such noise is more perceivable when it occurs in a background region (i.e., a region of substantially constant image intensity). In an exemplary embodiment of the present invention, conditional replenishment is processed on a macroblock basis, utilizing 2-part criteria and selectable thresholds for modifying the criteria. As a result, slight differences resulting from noise in a particular macroblock can be muted (i.e., filtered). The first criterion can be used to determine the differences between an original macroblock and a previous macroblock. This criterion, C1, is given by the expression:
- where org(i,j) represents the ith and jth pixel of the original (subject) macroblock and prev(i,j) represents the ith and jth pixel of the original macroblock of the previous frame.
-
- where org(i,j) represents the ith and jth pixel of the original (subject) macroblock and coded(i,j) represents the ith and jth pixel of the decoded macroblock of the previous frame. Criterion 1 measures the similarity of the corresponding macroblocks of the current frame and the previous frame. Criterion 2 double-checks that similarity against the decoded macroblock.
- In addition, threshold values may be selected for the two criteria, to set the sensitivity of the conditional replenishment process. Alternatively, the threshold may be set automatically so that it adapts to a particular bit rate. The following table provides an exemplary relationship between bit rate and Criterion 1 (C1) threshold values.
BIT RATE | THRESHOLD 1
---|---
greater than 400 k | 8
300 k-400 k | 11
200 k-300 k | 13
110 k-200 k | 14
less than 100 k | 15
- Similarly, the threshold value for Criterion 2 may be set manually or automatically (an exemplary value for Threshold 2 is 8). By applying the 2-part criteria in conjunction with the threshold values, the macroblock-based conditional replenishment method of an exemplary embodiment of the present invention can be used and fine-tuned to reduce noise in a displayed video stream.
- FIG. 12 is a flowchart depicting an exemplary method for performing conditional replenishment on a macroblock basis. The method of FIG. 12 begins at start block 1200 and proceeds to step 1202, wherein a first macroblock is compared to a second macroblock. The method then proceeds to decision block 1204, wherein a determination is made as to whether Criterion 1 (C1) is less than Threshold 1. If, at decision block 1204, a determination is made that Criterion 1 is not less than Threshold 1, the method branches to step 1210. At step 1210, a flag can be set for an instruction providing that the second macroblock should be encoded using the data from the first macroblock, rather than simply replenished. The method proceeds from step 1210 to end block 1212 and terminates.
- Returning now to decision block 1204, if a determination is made that Criterion 1 is less than Threshold 1, the method branches to decision block 1206. At decision block 1206, a determination is made as to whether Criterion 2 is less than Threshold 2. If a determination is made at decision block 1206 that Criterion 2 is not less than Threshold 2, the method branches to step 1210 and proceeds as described above. If, on the other hand, a determination is made at decision block 1206 that Criterion 2 is less than Threshold 2, the method branches to step 1208. At step 1208, the replenishment flag is set for the second macroblock. The method proceeds from step 1208 to end block 1212 and ends.
- Accordingly, the method of FIG. 12 can be used to apply selectable criteria to reduce the encoding, decoding, and display of noise. The replenishment of an exemplary embodiment of the present invention, thus, can be used to filter noise from a displayed video stream. Those skilled in the art will appreciate that various criteria and/or threshold values may be used within the scope of the described embodiments of the present invention.
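The two-stage test of FIG. 12 can be sketched as below. The expressions for C1 and C2 are not reproduced in this copy of the text, so the sketch substitutes a mean absolute pixel difference for both; that metric, and the names `mean_abs_diff` and `should_replenish`, are assumptions made only for illustration. The default Threshold 2 of 8 follows the exemplary value given above.

```python
from typing import Sequence

Macroblock = Sequence[Sequence[float]]


def mean_abs_diff(a: Macroblock, b: Macroblock) -> float:
    """ASSUMED stand-in metric: mean absolute difference between two macroblocks."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count if count else 0.0


def should_replenish(
    org: Macroblock,
    prev_org: Macroblock,
    prev_decoded: Macroblock,
    threshold1: float,
    threshold2: float = 8.0,
) -> bool:
    """Return True when the macroblock can be replenished from the previous picture."""
    # Decision block 1204: compare against the previous original macroblock.
    c1 = mean_abs_diff(org, prev_org)
    if not c1 < threshold1:
        return False  # step 1210: encode the macroblock normally
    # Decision block 1206: double-check against the previous decoded macroblock.
    c2 = mean_abs_diff(org, prev_decoded)
    return c2 < threshold2  # step 1208 when True, step 1210 otherwise
```

Because small differences fall under both thresholds, a macroblock that differs from its predecessor only by slight noise is replenished rather than re-encoded, which is the filtering effect described above.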
- An Exemplary Method for Selecting an Asynchronous Sampling Technique
- To maximize the quality of compressed video at a low bit rate (e.g., less than 128 kbps), it may be useful to sample the video at optimum points in time and space. Sampling is roughly defined as the determination of which pictures in a video stream will be encoded as I-pictures, B-pictures, and P-pictures. Generally, optimum sampling can be non-uniform (asynchronous) in one or both of the space and time domains. Various asynchronous techniques are well known to those skilled in the art and can be used to implement various embodiments of the present invention. In an exemplary embodiment of the present invention, an analysis-by-synthesis method of selecting an asynchronous sampling technique is provided. In the exemplary analysis-by-synthesis method, separately encoded candidate streams are generated using various sampling methods. Once generated, the separate candidate streams can be compared on virtually any basis to determine, for example, which has the best bit rate and signal quality characteristics. The best candidate stream can be selected and designated as the output video stream. The selected sampling method can be identified to the receiver (decoder) with a small overhead. For example, by using a codebook or dictionary of 16 possible sampling techniques, only 4 bits of overhead are needed to signify the selection. The codebook could be either predetermined or generated adaptively (and automatically) over time, based on criteria including extrapolation from a recent history of optimum sampling.
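The analysis-by-synthesis selection described above can be sketched as follows. Each candidate is modelled as a hypothetical callable that encodes and then decodes the input, returning the reconstructed samples; the candidate with the highest signal-to-noise ratio wins, and its index is what would be signalled to the decoder. The names `snr_db`, `index_bits`, and `select_candidate`, and the use of a flat list of samples to stand in for a video stream, are assumptions for illustration only.

```python
import math
from typing import Callable, List, Sequence, Tuple

Signal = Sequence[float]


def snr_db(original: Signal, reconstructed: Signal) -> float:
    """Signal-to-noise ratio of a reconstruction, in decibels."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    if noise == 0:
        return math.inf
    if signal == 0:
        return -math.inf
    return 10.0 * math.log10(signal / noise)


def index_bits(codebook_size: int) -> int:
    """Bits of overhead needed to identify one entry of the codebook."""
    return max(1, math.ceil(math.log2(codebook_size)))


def select_candidate(
    original: Signal, candidates: List[Callable[[Signal], Signal]]
) -> Tuple[int, int]:
    """Return (index of the best candidate, overhead bits to signal that index)."""
    scores = [snr_db(original, candidate(original)) for candidate in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, index_bits(len(candidates))
```

With a codebook of 16 techniques, `index_bits(16)` evaluates to 4, matching the overhead figure given in the text.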
- FIG. 13 is a flowchart depicting an exemplary method for generating and selecting between two sampling methods. Those skilled in the art will appreciate that any number of sampling methods could be used and evaluated within the scope of the present invention. It also will be appreciated that the generation of multiple candidate streams creates overhead as described above, and that the exemplary sampling selection method may be more easily applied to one-way communications (e.g., video streaming), than to two-way communications (video teleconferencing).
- The method of FIG. 13 begins at start block 1300 and proceeds to step 1302. At step 1302, a first input video stream is encoded using a first sampling technique. The method then proceeds to step 1304. At step 1304, a second input stream is encoded using a second sampling technique. The method then proceeds to step 1306, wherein the encoded candidate video streams are compared. This comparison could be based on various characteristics of the candidate video streams. However, it is preferable that the characteristics be perceptually meaningful. An exemplary characteristic is the signal-to-noise ratio of each encoded candidate video stream, as compared to the original uncompressed signal.
- The method proceeds from step 1306 to decision block 1308. At decision block 1308, a determination is made as to whether the signal-to-noise ratio (SNR) for the first stream is higher than the SNR for the second stream. If the SNR for the first stream is better than the SNR for the second stream, then the method branches to step 1310. At step 1310, the first stream is output. Returning to decision block 1308, if the SNR for the second stream is better than the SNR for the first stream, then the method branches to step 1312. At step 1312, the second stream is output. Accordingly, the candidate streams, having been encoded using different sampling techniques, are compared and the best stream is output, for example, from an encoding system, together with the overhead information that signifies the corresponding sampling method.
- Although the present invention has been described in connection with various exemplary embodiments, those of ordinary skill in the art will understand that many modifications can be made thereto within the scope of the claims that follow. Accordingly, it is not intended that the scope of the invention in any way be limited by the above description, but instead be determined entirely by reference to the claims that follow.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/902,976 US20020021756A1 (en) | 2000-07-11 | 2001-07-11 | Video compression using adaptive selection of groups of frames, adaptive bit allocation, and adaptive replenishment |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US21730100P | 2000-07-11 | 2000-07-11 | |
US09/902,976 US20020021756A1 (en) | 2000-07-11 | 2001-07-11 | Video compression using adaptive selection of groups of frames, adaptive bit allocation, and adaptive replenishment |
Publications (1)
Publication Number | Publication Date |
---|---|
US20020021756A1 true US20020021756A1 (en) | 2002-02-21 |
Family
ID=22810485
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/902,995 Abandoned US20020028024A1 (en) | 2000-07-11 | 2001-07-11 | System and method for calculating an optimum display size for a visual object |
US09/902,976 Abandoned US20020021756A1 (en) | 2000-07-11 | 2001-07-11 | Video compression using adaptive selection of groups of frames, adaptive bit allocation, and adaptive replenishment |
US09/903,028 Expired - Lifetime US7155067B2 (en) | 2000-07-11 | 2001-07-11 | Adaptive edge detection and enhancement for image processing |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/902,995 Abandoned US20020028024A1 (en) | 2000-07-11 | 2001-07-11 | System and method for calculating an optimum display size for a visual object |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/903,028 Expired - Lifetime US7155067B2 (en) | 2000-07-11 | 2001-07-11 | Adaptive edge detection and enhancement for image processing |
Country Status (3)
Country | Link |
---|---|
US (3) | US20020028024A1 (en) |
AU (3) | AU2001276876A1 (en) |
WO (3) | WO2002005121A2 (en) |
Cited By (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020028024A1 (en) * | 2000-07-11 | 2002-03-07 | Mediaflow Llc | System and method for calculating an optimum display size for a visual object |
WO2004008766A1 (en) | 2002-07-10 | 2004-01-22 | T-Mobile Deutschland Gmbh | Method for transmitting additional data within a video data transmission |
US20050195897A1 (en) * | 2004-03-08 | 2005-09-08 | Samsung Electronics Co., Ltd. | Scalable video coding method supporting variable GOP size and scalable video encoder |
US20060067410A1 (en) * | 2004-09-23 | 2006-03-30 | Park Seung W | Method for encoding and decoding video signals |
US20060233236A1 (en) * | 2005-04-15 | 2006-10-19 | Labrozzi Scott C | Scene-by-scene digital video processing |
US20060268990A1 (en) * | 2005-05-25 | 2006-11-30 | Microsoft Corporation | Adaptive video encoding using a perceptual model |
US20070237237A1 (en) * | 2006-04-07 | 2007-10-11 | Microsoft Corporation | Gradient slope detection for video compression |
US20070250893A1 (en) * | 2006-04-03 | 2007-10-25 | Yasuhiro Akiyama | Digital broadcasting receiving apparatus |
US20070248163A1 (en) * | 2006-04-07 | 2007-10-25 | Microsoft Corporation | Quantization adjustments for DC shift artifacts |
US20070258518A1 (en) * | 2006-05-05 | 2007-11-08 | Microsoft Corporation | Flexible quantization |
US20070263720A1 (en) * | 2006-05-12 | 2007-11-15 | Freescale Semiconductor Inc. | System and method of adaptive rate control for a video encoder |
US20070280349A1 (en) * | 2006-05-30 | 2007-12-06 | Freescale Semiconductor Inc. | Scalable rate control system for a video encoder |
US20080084491A1 (en) * | 2006-10-06 | 2008-04-10 | Freescale Semiconductor Inc. | Scaling video processing complexity based on power savings factor |
US20080101338A1 (en) * | 2006-11-01 | 2008-05-01 | Reynolds Douglas F | METHODS AND APPARATUS TO IMPLEMENT HIGHER DATA RATE VOICE OVER INTERNET PROTOCOL (VoIP) SERVICES |
US20080192822A1 (en) * | 2007-02-09 | 2008-08-14 | Microsoft Corporation | Complexity-based adaptive preprocessing for multiple-pass video compression |
US20080240250A1 (en) * | 2007-03-30 | 2008-10-02 | Microsoft Corporation | Regions of interest for quality adjustments |
US20080240257A1 (en) * | 2007-03-26 | 2008-10-02 | Microsoft Corporation | Using quantization bias that accounts for relations between transform bins and quantization bins |
US20080240235A1 (en) * | 2007-03-26 | 2008-10-02 | Microsoft Corporation | Adaptive deadzone size adjustment in quantization |
US20080260278A1 (en) * | 2007-04-18 | 2008-10-23 | Microsoft Corporation | Encoding adjustments for animation content |
US20080267284A1 (en) * | 2007-03-28 | 2008-10-30 | Hisayoshi Tsubaki | Moving picture compression apparatus and method of controlling operation of same |
US20090141800A1 (en) * | 2007-11-29 | 2009-06-04 | Larson Arnold W | Transmitting Video Streams |
US20090245587A1 (en) * | 2008-03-31 | 2009-10-01 | Microsoft Corporation | Classifying and controlling encoding quality for textured, dark smooth and smooth video content |
US20090296808A1 (en) * | 2008-06-03 | 2009-12-03 | Microsoft Corporation | Adaptive quantization for enhancement layer video coding |
WO2010056315A1 (en) * | 2008-11-13 | 2010-05-20 | Thomson Licensing | Multiple thread video encoding using gop merging and bit allocation |
US20110019736A1 (en) * | 2009-07-27 | 2011-01-27 | Kyohei Koyabu | Image encoding device and image encoding method |
US20110211633A1 (en) * | 2008-11-12 | 2011-09-01 | Ferran Valldosera | Light change coding |
US20110216774A1 (en) * | 2010-03-02 | 2011-09-08 | Intrusion Inc. | Packet file system |
US20110216828A1 (en) * | 2008-11-12 | 2011-09-08 | Hua Yang | I-frame de-flickering for gop-parallel multi-thread viceo encoding |
US8249145B2 (en) | 2006-04-07 | 2012-08-21 | Microsoft Corporation | Estimating sample-domain distortion in the transform domain with rounding compensation |
US8331438B2 (en) | 2007-06-05 | 2012-12-11 | Microsoft Corporation | Adaptive selection of picture-level quantization parameters for predicted video pictures |
US8412733B1 (en) * | 2003-03-15 | 2013-04-02 | SQL Stream Inc. | Method for distributed RDSMS |
US20130259123A1 (en) * | 2012-04-03 | 2013-10-03 | Xerox Coporation | System and method for identifying unique portions of videos with validation and predictive scene changes |
US8767822B2 (en) | 2006-04-07 | 2014-07-01 | Microsoft Corporation | Quantization adjustment based on texture level |
CN104038762A (en) * | 2013-03-06 | 2014-09-10 | 三星电子株式会社 | Video Encoder, Method Of Detecting Scene Change And Method Of Controlling Video Encoder |
US9094641B1 (en) * | 2011-06-08 | 2015-07-28 | Arris Enterprises, Inc. | Group of pictures size adjustment |
CN106210718A (en) * | 2016-08-08 | 2016-12-07 | 飞狐信息技术(天津)有限公司 | A kind of video sequence Scene switching detection method and device |
US9774848B2 (en) | 2002-07-01 | 2017-09-26 | Arris Enterprises Llc | Efficient compression and transport of video over a network |
US9942570B2 (en) | 2005-11-10 | 2018-04-10 | Nxp Usa, Inc. | Resource efficient video processing via prediction error computational adjustments |
CN108737838A (en) * | 2017-04-19 | 2018-11-02 | 北京金山云网络技术有限公司 | A kind of method for video coding and device |
US10497258B1 (en) * | 2018-09-10 | 2019-12-03 | Sony Corporation | Vehicle tracking and license plate recognition based on group of pictures (GOP) structure |
US20210192681A1 (en) * | 2019-12-18 | 2021-06-24 | Ati Technologies Ulc | Frame reprojection for virtual reality and augmented reality |
Families Citing this family (105)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1999057683A1 (en) | 1998-05-04 | 1999-11-11 | The Johns Hopkins University | Method and apparatus for segmenting small structures in images |
US8090619B1 (en) | 1999-08-27 | 2012-01-03 | Ochoa Optics Llc | Method and system for music distribution |
US6952685B1 (en) * | 1999-08-27 | 2005-10-04 | Ochoa Optics Llc | Music distribution system and associated antipiracy protection |
US20060212908A1 (en) * | 1999-08-27 | 2006-09-21 | Ochoa Optics Llc | Video distribution system |
US7209900B2 (en) * | 1999-08-27 | 2007-04-24 | Charles Eric Hunter | Music distribution systems |
US7647618B1 (en) | 1999-08-27 | 2010-01-12 | Charles Eric Hunter | Video distribution system |
US6647417B1 (en) | 2000-02-10 | 2003-11-11 | World Theatre, Inc. | Music distribution systems |
US7177482B2 (en) * | 1999-12-16 | 2007-02-13 | Sony Corporation | Boundary line detection apparatus and method, and image processing apparatus and method as well as recording medium |
US9252898B2 (en) | 2000-01-28 | 2016-02-02 | Zarbaña Digital Fund Llc | Music distribution systems |
JP4150947B2 (en) * | 2000-08-23 | 2008-09-17 | ソニー株式会社 | Image processing apparatus and method, and recording medium |
US20020112243A1 (en) * | 2001-02-12 | 2002-08-15 | World Theatre | Video distribution system |
US8112311B2 (en) * | 2001-02-12 | 2012-02-07 | Ochoa Optics Llc | Systems and methods for distribution of entertainment and advertising content |
JP2002297496A (en) * | 2001-04-02 | 2002-10-11 | Hitachi Ltd | Media delivery system and multimedia conversion server |
KR20020095350A (en) * | 2001-06-14 | 2002-12-26 | 엘지전자 주식회사 | Pattern-adaptive error diffusion apparatus |
US7960005B2 (en) * | 2001-09-14 | 2011-06-14 | Ochoa Optics Llc | Broadcast distribution of content for storage on hardware protected optical storage media |
US20060029281A1 (en) * | 2002-04-23 | 2006-02-09 | Koninklijke Philips Electronics N.V. | Digital image processing method for low-rate applications |
US11022982B2 (en) * | 2014-03-18 | 2021-06-01 | Transportation Ip Holdings, Llc | Optical route examination system and method |
US11124207B2 (en) * | 2014-03-18 | 2021-09-21 | Transportation Ip Holdings, Llc | Optical route examination system and method |
JP2004266744A (en) * | 2003-03-04 | 2004-09-24 | Brother Ind Ltd | Image processing apparatus, image processing method, and storage medium |
US7336845B2 (en) * | 2003-04-21 | 2008-02-26 | Transpacific Ip Ltd. | Improving modulation transfer function of an image |
SG115540A1 (en) * | 2003-05-17 | 2005-10-28 | St Microelectronics Asia | An edge enhancement process and system |
JP2005064718A (en) * | 2003-08-08 | 2005-03-10 | Toshiba Corp | Digital broadcast receiver and digital broadcast receiving method |
US8621542B2 (en) * | 2003-08-11 | 2013-12-31 | Warner Bros. Entertainment Inc. | Digital media distribution device |
US20070047658A1 (en) * | 2003-09-23 | 2007-03-01 | Alexandros Tourapis | Video comfort noise addition technique |
JP2005141477A (en) * | 2003-11-06 | 2005-06-02 | Noritsu Koki Co Ltd | Image sharpening process and image processor implementing this process |
JP2005292975A (en) * | 2004-03-31 | 2005-10-20 | Alpine Electronics Inc | Button processing method and data processor |
US7266246B2 (en) | 2004-04-29 | 2007-09-04 | Hewlett-Packard Development Company, L.P. | System and method for estimating compression noise in images |
FR2869749B1 (en) * | 2004-04-30 | 2006-10-20 | Sagem | METHOD AND SYSTEM FOR PROCESSING IMAGE BY FILTERING BLOCKS OF IMAGE PIXELS |
JP5073484B2 (en) * | 2004-05-28 | 2012-11-14 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Method, computer program, apparatus and imaging system for image processing |
KR100618849B1 (en) * | 2004-07-22 | 2006-09-01 | 삼성전자주식회사 | Apparatus and method for filtering blocking phenomenon in video |
US8712831B2 (en) | 2004-11-19 | 2014-04-29 | Repucom America, Llc | Method and system for quantifying viewer awareness of advertising images in a video source |
US8036932B2 (en) | 2004-11-19 | 2011-10-11 | Repucom America, Llc | Method and system for valuing advertising content |
JP2006264301A (en) * | 2005-02-22 | 2006-10-05 | Seiko Epson Corp | Printing apparatus, printing program, printing method, image processing apparatus, image processing program, image processing method, and recording medium recording the program |
US7680355B2 (en) * | 2005-05-02 | 2010-03-16 | Intel Corporation | Detection of artifacts resulting from image signal decompression |
US7657098B2 (en) * | 2005-05-02 | 2010-02-02 | Samsung Electronics Co., Ltd. | Method and apparatus for reducing mosquito noise in decoded video sequence |
KR101120092B1 (en) * | 2005-06-04 | 2012-03-23 | 삼성전자주식회사 | Method for improving quality of composite video signal and the apparatus therefore and method for decoding composite video signal and the apparatus therefore |
US7612792B1 (en) * | 2005-09-07 | 2009-11-03 | Avaya Inc | Method and apparatus for adaptive video encoding based on user perception and use |
US7865035B2 (en) * | 2005-10-06 | 2011-01-04 | Samsung Electronics Co., Ltd. | Video quality adaptive coding artifact reduction |
JP4455487B2 (en) * | 2005-12-16 | 2010-04-21 | 株式会社東芝 | Decoding device, decoding method, and program |
TWI332351B (en) * | 2006-10-05 | 2010-10-21 | Realtek Semiconductor Corp | Image processing method and device thereof for reduction mosquito noise |
US7881497B2 (en) * | 2007-03-08 | 2011-02-01 | Honeywell International Inc. | Vision based navigation and guidance system |
KR101180059B1 (en) * | 2007-05-11 | 2012-09-04 | 삼성전자주식회사 | Image forming apparatus and method thereof |
US20090220169A1 (en) * | 2008-02-28 | 2009-09-03 | Microsoft Corporation | Image enhancement |
US8245005B2 (en) * | 2008-03-03 | 2012-08-14 | Microsoft Corporation | Probabilistic object relocation |
US9110791B2 (en) | 2008-03-03 | 2015-08-18 | Microsoft Technology Licensing, Llc | Optimistic object relocation |
US8031963B2 (en) * | 2008-04-09 | 2011-10-04 | Eyep Inc. | Noise filter |
TWI360790B (en) * | 2008-08-11 | 2012-03-21 | Chunghwa Picture Tubes Ltd | Image compression/decompression device and method |
EP2460354A4 (en) | 2009-07-27 | 2015-11-04 | Utc Fire & Security Corp | System and method for video-quality enhancement |
US8279259B2 (en) * | 2009-09-24 | 2012-10-02 | Microsoft Corporation | Mimicking human visual system in detecting blockiness artifacts in compressed video streams |
CN102483905A (en) * | 2009-09-29 | 2012-05-30 | 松下电器产业株式会社 | Display device and display method |
US8538163B2 (en) * | 2009-10-13 | 2013-09-17 | Sony Corporation | Method and system for detecting edges within an image |
KR101114698B1 (en) * | 2010-01-29 | 2012-02-29 | 삼성전자주식회사 | Apparatus and method for edge enhancement according to image characteristics |
TWI469631B (en) * | 2010-11-05 | 2015-01-11 | Inst Information Industry | Image processing apparatus and image processing method |
US9978156B2 (en) * | 2012-10-03 | 2018-05-22 | Avago Technologies General Ip (Singapore) Pte. Ltd. | High-throughput image and video compression |
TW201437966A (en) | 2013-03-29 | 2014-10-01 | Automotive Res & Testing Ct | Adaptive image edge amending apparatus and method thereof |
KR102025184B1 (en) * | 2013-07-31 | 2019-09-25 | 엘지디스플레이 주식회사 | Apparatus for converting data and display apparatus using the same |
CN103888761A (en) * | 2013-09-22 | 2014-06-25 | 天津思博科科技发展有限公司 | Decoder for enhancing image quality |
US10262430B2 (en) * | 2014-04-28 | 2019-04-16 | Eizo Corporation | Annotation line determining unit, annotation line removing unit, medical display, and method therefor |
US9686449B1 (en) * | 2016-03-18 | 2017-06-20 | Interra Systems, Inc. | Methods and systems for detection of blur artifact in digital video due to high quantization |
US10096097B2 (en) * | 2016-08-01 | 2018-10-09 | The United States Of America As Represented By The Secretary Of The Navy | Content-aware bidirectional image edge highlighting |
US11042161B2 (en) | 2016-11-16 | 2021-06-22 | Symbol Technologies, Llc | Navigation control method and apparatus in a mobile automation system |
US11449059B2 (en) | 2017-05-01 | 2022-09-20 | Symbol Technologies, Llc | Obstacle detection for a mobile automation apparatus |
WO2018204308A1 (en) | 2017-05-01 | 2018-11-08 | Symbol Technologies, Llc | Method and apparatus for object status detection |
US11093896B2 (en) | 2017-05-01 | 2021-08-17 | Symbol Technologies, Llc | Product status detection system |
US10949798B2 (en) | 2017-05-01 | 2021-03-16 | Symbol Technologies, Llc | Multimodal localization and mapping for a mobile automation apparatus |
US11367092B2 (en) | 2017-05-01 | 2022-06-21 | Symbol Technologies, Llc | Method and apparatus for extracting and processing price text from an image set |
US10591918B2 (en) | 2017-05-01 | 2020-03-17 | Symbol Technologies, Llc | Fixed segmented lattice planning for a mobile automation apparatus |
US10663590B2 (en) | 2017-05-01 | 2020-05-26 | Symbol Technologies, Llc | Device and method for merging lidar data |
US10726273B2 (en) | 2017-05-01 | 2020-07-28 | Symbol Technologies, Llc | Method and apparatus for shelf feature and object placement detection from shelf images |
WO2018201423A1 (en) | 2017-05-05 | 2018-11-08 | Symbol Technologies, Llc | Method and apparatus for detecting and interpreting price label text |
US10489677B2 (en) * | 2017-09-07 | 2019-11-26 | Symbol Technologies, Llc | Method and apparatus for shelf edge detection |
US10521914B2 (en) | 2017-09-07 | 2019-12-31 | Symbol Technologies, Llc | Multi-sensor object recognition system and method |
US10572763B2 (en) | 2017-09-07 | 2020-02-25 | Symbol Technologies, Llc | Method and apparatus for support surface edge detection |
US10809078B2 (en) | 2018-04-05 | 2020-10-20 | Symbol Technologies, Llc | Method, system and apparatus for dynamic path generation |
US10832436B2 (en) | 2018-04-05 | 2020-11-10 | Symbol Technologies, Llc | Method, system and apparatus for recovering label positions |
US10740911B2 (en) | 2018-04-05 | 2020-08-11 | Symbol Technologies, Llc | Method, system and apparatus for correcting translucency artifacts in data representing a support structure |
US10823572B2 (en) | 2018-04-05 | 2020-11-03 | Symbol Technologies, Llc | Method, system and apparatus for generating navigational data |
US11327504B2 (en) | 2018-04-05 | 2022-05-10 | Symbol Technologies, Llc | Method, system and apparatus for mobile automation apparatus localization |
US11506483B2 (en) | 2018-10-05 | 2022-11-22 | Zebra Technologies Corporation | Method, system and apparatus for support structure depth determination |
US11010920B2 (en) | 2018-10-05 | 2021-05-18 | Zebra Technologies Corporation | Method, system and apparatus for object detection in point clouds |
US11003188B2 (en) | 2018-11-13 | 2021-05-11 | Zebra Technologies Corporation | Method, system and apparatus for obstacle handling in navigational path generation |
US11090811B2 (en) | 2018-11-13 | 2021-08-17 | Zebra Technologies Corporation | Method and apparatus for labeling of support structures |
US11079240B2 (en) | 2018-12-07 | 2021-08-03 | Zebra Technologies Corporation | Method, system and apparatus for adaptive particle filter localization |
US11416000B2 (en) | 2018-12-07 | 2022-08-16 | Zebra Technologies Corporation | Method and apparatus for navigational ray tracing |
US11100303B2 (en) | 2018-12-10 | 2021-08-24 | Zebra Technologies Corporation | Method, system and apparatus for auxiliary label detection and association |
US11015938B2 (en) | 2018-12-12 | 2021-05-25 | Zebra Technologies Corporation | Method, system and apparatus for navigational assistance |
US10731970B2 (en) | 2018-12-13 | 2020-08-04 | Zebra Technologies Corporation | Method, system and apparatus for support structure detection |
CA3028708A1 (en) | 2018-12-28 | 2020-06-28 | Zih Corp. | Method, system and apparatus for dynamic loop closure in mapping trajectories |
US11402846B2 (en) | 2019-06-03 | 2022-08-02 | Zebra Technologies Corporation | Method, system and apparatus for mitigating data capture light leakage |
US11960286B2 (en) | 2019-06-03 | 2024-04-16 | Zebra Technologies Corporation | Method, system and apparatus for dynamic task sequencing |
US11200677B2 (en) | 2019-06-03 | 2021-12-14 | Zebra Technologies Corporation | Method, system and apparatus for shelf edge detection |
US11341663B2 (en) | 2019-06-03 | 2022-05-24 | Zebra Technologies Corporation | Method, system and apparatus for detecting support structure obstructions |
US11151743B2 (en) | 2019-06-03 | 2021-10-19 | Zebra Technologies Corporation | Method, system and apparatus for end of aisle detection |
US11080566B2 (en) | 2019-06-03 | 2021-08-03 | Zebra Technologies Corporation | Method, system and apparatus for gap detection in support structures with peg regions |
US11662739B2 (en) | 2019-06-03 | 2023-05-30 | Zebra Technologies Corporation | Method, system and apparatus for adaptive ceiling-based localization |
US11507103B2 (en) | 2019-12-04 | 2022-11-22 | Zebra Technologies Corporation | Method, system and apparatus for localization-based historical obstacle handling |
US11107238B2 (en) | 2019-12-13 | 2021-08-31 | Zebra Technologies Corporation | Method, system and apparatus for detecting item facings |
US11822333B2 (en) | 2020-03-30 | 2023-11-21 | Zebra Technologies Corporation | Method, system and apparatus for data capture illumination control |
US11450024B2 (en) | 2020-07-17 | 2022-09-20 | Zebra Technologies Corporation | Mixed depth object detection |
US11593915B2 (en) | 2020-10-21 | 2023-02-28 | Zebra Technologies Corporation | Parallax-tolerant panoramic image generation |
US11392891B2 (en) | 2020-11-03 | 2022-07-19 | Zebra Technologies Corporation | Item placement detection and optimization in material handling systems |
US11847832B2 (en) | 2020-11-11 | 2023-12-19 | Zebra Technologies Corporation | Object classification for autonomous navigation systems |
CN112702616A (en) * | 2020-12-08 | 2021-04-23 | 珠海格力电器股份有限公司 | Processing method and device for playing content |
CN112967273B (en) * | 2021-03-25 | 2021-11-16 | 北京的卢深视科技有限公司 | Image processing method, electronic device, and storage medium |
US11954882B2 (en) | 2021-06-17 | 2024-04-09 | Zebra Technologies Corporation | Feature-based georegistration for mobile computing devices |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5754233A (en) * | 1995-09-04 | 1998-05-19 | Sony Corporation | Compression encoding apparatus and recording apparatus for compression-encoded data |
US5757968A (en) * | 1994-09-29 | 1998-05-26 | Sony Corporation | Method and apparatus for video data compression |
US6028633A (en) * | 1995-10-30 | 2000-02-22 | Sony Corporation | Video data compression with trial encoder |
US6057893A (en) * | 1995-12-28 | 2000-05-02 | Sony Corporation | Picture encoding method, picture encoding apparatus, picture transmitting method and picture recording medium |
US6173012B1 (en) * | 1996-04-25 | 2001-01-09 | Matsushita Electric Industrial Co., Ltd. | Moving picture encoding apparatus and method |
US6215820B1 (en) * | 1998-10-12 | 2001-04-10 | Stmicroelectronics S.R.L. | Constant bit-rate control in a video coder by way of pre-analysis of a slice of the pictures |
US6278735B1 (en) * | 1998-03-19 | 2001-08-21 | International Business Machines Corporation | Real-time single pass variable bit rate control strategy and encoder |
US6434196B1 (en) * | 1998-04-03 | 2002-08-13 | Sarnoff Corporation | Method and apparatus for encoding video information |
US6694060B2 (en) * | 2000-12-21 | 2004-02-17 | General Instrument Corporation | Frame bit-size allocation for seamlessly spliced, variable-encoding-rate, compressed digital video signals |
US6731685B1 (en) * | 2000-09-20 | 2004-05-04 | General Instrument Corporation | Method and apparatus for determining a bit rate need parameter in a statistical multiplexer |
Family Cites Families (49)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4742556A (en) * | 1985-09-16 | 1988-05-03 | Davis Jr Ray E | Character recognition method |
JP2589298B2 (en) * | 1987-01-28 | 1997-03-12 | キヤノン株式会社 | Decoding device for encoded image data |
US5237316A (en) * | 1990-02-02 | 1993-08-17 | Washington University | Video display with high speed reconstruction and display of compressed images at increased pixel intensity range and retrofit kit for same |
US5072297A (en) * | 1990-03-27 | 1991-12-10 | Nippon Hoso Kyokai | Method and system for transmitting and receiving PCM audio signals in combination with a video signal |
KR950002658B1 (en) * | 1992-04-11 | 1995-03-24 | 주식회사금성사 | Condensation encoding and decoding apparatus of image signal |
JPH05328160A (en) * | 1992-05-27 | 1993-12-10 | Victor Co Of Japan Ltd | Automatic changeover device for video display size |
US5418574A (en) * | 1992-10-12 | 1995-05-23 | Matsushita Electric Industrial Co., Ltd. | Video signal correction apparatus which detects leading and trailing edges to define boundaries between colors and corrects for bleeding |
US5621429A (en) * | 1993-03-16 | 1997-04-15 | Hitachi, Ltd. | Video data display controlling method and video data display processing system |
KR970005131B1 (en) * | 1994-01-18 | 1997-04-12 | 대우전자 주식회사 | Digital Audio Coding Device Adaptive to Human Auditory Characteristics |
KR100213015B1 (en) * | 1994-03-31 | 1999-08-02 | 윤종용 | Quantization Method and Circuit |
AU698055B2 (en) * | 1994-07-14 | 1998-10-22 | Johnson-Grace Company | Method and apparatus for compressing images |
US5566001A (en) * | 1994-07-27 | 1996-10-15 | Motorola, Inc. | Method and apparatus for fax messaging in a selective call receiver system using multiple code-book data compression |
EP0721286A3 (en) * | 1995-01-09 | 2000-07-26 | Matsushita Electric Industrial Co., Ltd. | Video signal decoding apparatus with artifact reduction |
JPH09512410A (en) * | 1995-02-15 | 1997-12-09 | フィリップス エレクトロニクス ネムローゼ フェンノートシャップ | Video signal transcoding method and apparatus |
JP3661711B2 (en) * | 1995-04-27 | 2005-06-22 | ソニー株式会社 | Image coding method and apparatus |
US5909249A (en) * | 1995-12-15 | 1999-06-01 | General Instrument Corporation | Reduction of noise visibility in a digital video system |
US5850294A (en) * | 1995-12-18 | 1998-12-15 | Lucent Technologies Inc. | Method and apparatus for post-processing images |
US5872598A (en) * | 1995-12-26 | 1999-02-16 | C-Cube Microsystems | Scene change detection using quantization scale factor rate control |
US6208689B1 (en) * | 1996-03-04 | 2001-03-27 | Mitsubishi Denki Kabushiki Kaisha | Method and apparatus for digital image decoding |
JP3773585B2 (en) * | 1996-03-29 | 2006-05-10 | 富士通株式会社 | Image encoding device |
US5940073A (en) * | 1996-05-03 | 1999-08-17 | Starsight Telecast Inc. | Method and system for displaying other information in a TV program guide |
US5883971A (en) * | 1996-10-23 | 1999-03-16 | International Business Machines Corporation | System and method for determining if a fingerprint image contains an image portion representing a smudged fingerprint impression |
US5953506A (en) * | 1996-12-17 | 1999-09-14 | Adaptive Media Technologies | Method and apparatus that provides a scalable media delivery system |
JPH10210315A (en) * | 1997-01-23 | 1998-08-07 | Fuji Xerox Co Ltd | Image-processing unit |
US6091767A (en) * | 1997-02-03 | 2000-07-18 | Westerman; Larry Alan | System for improving efficiency of video encoders |
US5903673A (en) * | 1997-03-14 | 1999-05-11 | Microsoft Corporation | Digital video signal encoder and encoding method |
JPH10304381A (en) * | 1997-05-01 | 1998-11-13 | Fujitsu Ltd | Video encoding apparatus and method |
US6014694A (en) * | 1997-06-26 | 2000-01-11 | Citrix Systems, Inc. | System for adaptive video/audio transport over a network |
US6246783B1 (en) * | 1997-09-17 | 2001-06-12 | General Electric Company | Iterative filter framework for medical images |
US6151074A (en) * | 1997-09-30 | 2000-11-21 | Texas Instruments Incorporated | Integrated MPEG decoder and image resizer for SLM-based digital display system |
US6229578B1 (en) * | 1997-12-08 | 2001-05-08 | Intel Corporation | Edge-detection based noise removal algorithm |
JPH11205718A (en) * | 1998-01-07 | 1999-07-30 | Hitachi Ltd | Information reproducing apparatus and information recording / reproducing apparatus |
AUPP128498A0 (en) * | 1998-01-12 | 1998-02-05 | Canon Kabushiki Kaisha | A method for smoothing jagged edges in digital images |
US6124893A (en) * | 1998-04-29 | 2000-09-26 | Stapleton; John J. | Versatile video transformation device |
US6442331B1 (en) * | 1998-07-08 | 2002-08-27 | Lsi Logic Corporation | Optical disk system incorporating computer graphics rendering capability to create and display three-dimensional (3-D) objects synchronized with 3-D sound |
US6259822B1 (en) * | 1998-10-30 | 2001-07-10 | Eastman Kodak Company | Edge enhancement which reduces the visibility of false contours |
US6204887B1 (en) * | 1998-12-11 | 2001-03-20 | Hitachi America, Ltd. | Methods and apparatus for decoding and displaying multiple images using a common processor |
US6310909B1 (en) * | 1998-12-23 | 2001-10-30 | Broadcom Corporation | DSL rate adaptation |
US6456305B1 (en) * | 1999-03-18 | 2002-09-24 | Microsoft Corporation | Method and system for automatically fitting a graphical display of objects to the dimensions of a display window |
US6681043B1 (en) * | 1999-08-16 | 2004-01-20 | University Of Washington | Interactive video object processing environment which visually distinguishes segmented video object |
US6563547B1 (en) * | 1999-09-07 | 2003-05-13 | Spotware Technologies, Inc. | System and method for displaying a television picture within another displayed image |
US6859802B1 (en) * | 1999-09-13 | 2005-02-22 | Microsoft Corporation | Image retrieval based on relevance feedback |
JP2001189936A (en) * | 1999-12-28 | 2001-07-10 | Canon Inc | Image display device, image display method and computer-readable recording medium |
US6473092B1 (en) * | 2000-04-07 | 2002-10-29 | Agilent Technologies, Inc. | Apparatus and method for color illumination in display devices |
US6876703B2 (en) * | 2000-05-11 | 2005-04-05 | Ub Video Inc. | Method and apparatus for video coding |
US6650705B1 (en) * | 2000-05-26 | 2003-11-18 | Mitsubishi Electric Research Laboratories Inc. | Method for encoding and transcoding multiple video objects with variable temporal resolution |
AU2001276876A1 (en) * | 2000-07-11 | 2002-01-21 | Mediaflow, Llc | Adaptive edge detection and enhancement for image processing |
US6864921B2 (en) * | 2000-10-17 | 2005-03-08 | Sony Corporation | Display control system for controlling a display screen formed of multiple display units |
US20020071031A1 (en) * | 2000-12-07 | 2002-06-13 | Philips Electronics North America Corporation | Remote monitoring via a consumer electronic appliance |
2001
- 2001-07-11 AU AU2001276876A patent/AU2001276876A1/en not_active Abandoned
- 2001-07-11 WO PCT/US2001/021756 patent/WO2002005121A2/en active Application Filing
- 2001-07-11 WO PCT/US2001/021873 patent/WO2002005214A2/en active Application Filing
- 2001-07-11 AU AU2001276871A patent/AU2001276871A1/en not_active Abandoned
- 2001-07-11 AU AU2001273326A patent/AU2001273326A1/en not_active Abandoned
- 2001-07-11 WO PCT/US2001/021848 patent/WO2002005562A2/en active Application Filing
- 2001-07-11 US US09/902,995 patent/US20020028024A1/en not_active Abandoned
- 2001-07-11 US US09/902,976 patent/US20020021756A1/en not_active Abandoned
- 2001-07-11 US US09/903,028 patent/US7155067B2/en not_active Expired - Lifetime
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5832121A (en) * | 1994-09-27 | 1998-11-03 | Sony Corporation | Method and apparatus for video data compression |
US5757968A (en) * | 1994-09-29 | 1998-05-26 | Sony Corporation | Method and apparatus for video data compression |
US5754233A (en) * | 1995-09-04 | 1998-05-19 | Sony Corporation | Compression encoding apparatus and recording apparatus for compression-encoded data |
US6028633A (en) * | 1995-10-30 | 2000-02-22 | Sony Corporation | Video data compression with trial encoder |
US6057893A (en) * | 1995-12-28 | 2000-05-02 | Sony Corporation | Picture encoding method, picture encoding apparatus, picture transmitting method and picture recording medium |
US6173012B1 (en) * | 1996-04-25 | 2001-01-09 | Matsushita Electric Industrial Co., Ltd. | Moving picture encoding apparatus and method |
US6278735B1 (en) * | 1998-03-19 | 2001-08-21 | International Business Machines Corporation | Real-time single pass variable bit rate control strategy and encoder |
US6434196B1 (en) * | 1998-04-03 | 2002-08-13 | Sarnoff Corporation | Method and apparatus for encoding video information |
US6215820B1 (en) * | 1998-10-12 | 2001-04-10 | Stmicroelectronics S.R.L. | Constant bit-rate control in a video coder by way of pre-analysis of a slice of the pictures |
US6731685B1 (en) * | 2000-09-20 | 2004-05-04 | General Instrument Corporation | Method and apparatus for determining a bit rate need parameter in a statistical multiplexer |
US6694060B2 (en) * | 2000-12-21 | 2004-02-17 | General Instrument Corporation | Frame bit-size allocation for seamlessly spliced, variable-encoding-rate, compressed digital video signals |
Cited By (79)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020028024A1 (en) * | 2000-07-11 | 2002-03-07 | Mediaflow Llc | System and method for calculating an optimum display size for a visual object |
US9774848B2 (en) | 2002-07-01 | 2017-09-26 | Arris Enterprises Llc | Efficient compression and transport of video over a network |
WO2004008766A1 (en) | 2002-07-10 | 2004-01-22 | T-Mobile Deutschland Gmbh | Method for transmitting additional data within a video data transmission |
US9049196B1 (en) | 2003-03-15 | 2015-06-02 | SQLStream, Inc. | Method for distributed RDSMS |
US8412733B1 (en) * | 2003-03-15 | 2013-04-02 | SQL Stream Inc. | Method for distributed RDSMS |
US8805819B1 (en) | 2003-03-15 | 2014-08-12 | SQLStream, Inc. | Method for distributed RDSMS |
US8521770B1 (en) | 2003-03-15 | 2013-08-27 | SQLStream, Inc. | Method for distributed RDSMS |
WO2005086493A1 (en) * | 2004-03-08 | 2005-09-15 | Samsung Electronics Co., Ltd. | Scalable video coding method supporting variable gop size and scalable video encoder |
US20050195897A1 (en) * | 2004-03-08 | 2005-09-08 | Samsung Electronics Co., Ltd. | Scalable video coding method supporting variable GOP size and scalable video encoder |
US20060067410A1 (en) * | 2004-09-23 | 2006-03-30 | Park Seung W | Method for encoding and decoding video signals |
WO2006113349A1 (en) * | 2005-04-15 | 2006-10-26 | Inlet Technologies, Inc. | Scene-by-scene digital video processing |
US20060233236A1 (en) * | 2005-04-15 | 2006-10-19 | Labrozzi Scott C | Scene-by-scene digital video processing |
US7864840B2 (en) | 2005-04-15 | 2011-01-04 | Inlet Technologies, Inc. | Scene-by-scene digital video processing |
US20060268990A1 (en) * | 2005-05-25 | 2006-11-30 | Microsoft Corporation | Adaptive video encoding using a perceptual model |
US8422546B2 (en) | 2005-05-25 | 2013-04-16 | Microsoft Corporation | Adaptive video encoding using a perceptual model |
US9942570B2 (en) | 2005-11-10 | 2018-04-10 | Nxp Usa, Inc. | Resource efficient video processing via prediction error computational adjustments |
US20070250893A1 (en) * | 2006-04-03 | 2007-10-25 | Yasuhiro Akiyama | Digital broadcasting receiving apparatus |
US20070248163A1 (en) * | 2006-04-07 | 2007-10-25 | Microsoft Corporation | Quantization adjustments for DC shift artifacts |
US8503536B2 (en) | 2006-04-07 | 2013-08-06 | Microsoft Corporation | Quantization adjustments for DC shift artifacts |
US8767822B2 (en) | 2006-04-07 | 2014-07-01 | Microsoft Corporation | Quantization adjustment based on texture level |
US8249145B2 (en) | 2006-04-07 | 2012-08-21 | Microsoft Corporation | Estimating sample-domain distortion in the transform domain with rounding compensation |
US20070237237A1 (en) * | 2006-04-07 | 2007-10-11 | Microsoft Corporation | Gradient slope detection for video compression |
US8711925B2 (en) | 2006-05-05 | 2014-04-29 | Microsoft Corporation | Flexible quantization |
US8588298B2 (en) | 2006-05-05 | 2013-11-19 | Microsoft Corporation | Harmonic quantizer scale |
US9967561B2 (en) | 2006-05-05 | 2018-05-08 | Microsoft Technology Licensing, Llc | Flexible quantization |
US20070258519A1 (en) * | 2006-05-05 | 2007-11-08 | Microsoft Corporation | Harmonic quantizer scale |
US20070258518A1 (en) * | 2006-05-05 | 2007-11-08 | Microsoft Corporation | Flexible quantization |
US8184694B2 (en) | 2006-05-05 | 2012-05-22 | Microsoft Corporation | Harmonic quantizer scale |
US8077775B2 (en) | 2006-05-12 | 2011-12-13 | Freescale Semiconductor, Inc. | System and method of adaptive rate control for a video encoder |
US20070263720A1 (en) * | 2006-05-12 | 2007-11-15 | Freescale Semiconductor Inc. | System and method of adaptive rate control for a video encoder |
US7773672B2 (en) | 2006-05-30 | 2010-08-10 | Freescale Semiconductor, Inc. | Scalable rate control system for a video encoder |
US20070280349A1 (en) * | 2006-05-30 | 2007-12-06 | Freescale Semiconductor Inc. | Scalable rate control system for a video encoder |
US9883202B2 (en) | 2006-10-06 | 2018-01-30 | Nxp Usa, Inc. | Scaling video processing complexity based on power savings factor |
US20080084491A1 (en) * | 2006-10-06 | 2008-04-10 | Freescale Semiconductor Inc. | Scaling video processing complexity based on power savings factor |
US20080101338A1 (en) * | 2006-11-01 | 2008-05-01 | Reynolds Douglas F | METHODS AND APPARATUS TO IMPLEMENT HIGHER DATA RATE VOICE OVER INTERNET PROTOCOL (VoIP) SERVICES |
US8238424B2 (en) | 2007-02-09 | 2012-08-07 | Microsoft Corporation | Complexity-based adaptive preprocessing for multiple-pass video compression |
US20080192822A1 (en) * | 2007-02-09 | 2008-08-14 | Microsoft Corporation | Complexity-based adaptive preprocessing for multiple-pass video compression |
US8498335B2 (en) | 2007-03-26 | 2013-07-30 | Microsoft Corporation | Adaptive deadzone size adjustment in quantization |
US20080240257A1 (en) * | 2007-03-26 | 2008-10-02 | Microsoft Corporation | Using quantization bias that accounts for relations between transform bins and quantization bins |
US20080240235A1 (en) * | 2007-03-26 | 2008-10-02 | Microsoft Corporation | Adaptive deadzone size adjustment in quantization |
US20080267284A1 (en) * | 2007-03-28 | 2008-10-30 | Hisayoshi Tsubaki | Moving picture compression apparatus and method of controlling operation of same |
US8681860B2 (en) * | 2007-03-28 | 2014-03-25 | Facebook, Inc. | Moving picture compression apparatus and method of controlling operation of same |
US20080240250A1 (en) * | 2007-03-30 | 2008-10-02 | Microsoft Corporation | Regions of interest for quality adjustments |
US8243797B2 (en) | 2007-03-30 | 2012-08-14 | Microsoft Corporation | Regions of interest for quality adjustments |
US8576908B2 (en) | 2007-03-30 | 2013-11-05 | Microsoft Corporation | Regions of interest for quality adjustments |
US20080260278A1 (en) * | 2007-04-18 | 2008-10-23 | Microsoft Corporation | Encoding adjustments for animation content |
US8442337B2 (en) * | 2007-04-18 | 2013-05-14 | Microsoft Corporation | Encoding adjustments for animation content |
US8331438B2 (en) | 2007-06-05 | 2012-12-11 | Microsoft Corporation | Adaptive selection of picture-level quantization parameters for predicted video pictures |
US20090141800A1 (en) * | 2007-11-29 | 2009-06-04 | Larson Arnold W | Transmitting Video Streams |
US8432804B2 (en) * | 2007-11-29 | 2013-04-30 | Hewlett-Packard Development Company, L.P. | Transmitting video streams |
US8189933B2 (en) | 2008-03-31 | 2012-05-29 | Microsoft Corporation | Classifying and controlling encoding quality for textured, dark smooth and smooth video content |
US20090245587A1 (en) * | 2008-03-31 | 2009-10-01 | Microsoft Corporation | Classifying and controlling encoding quality for textured, dark smooth and smooth video content |
US9185418B2 (en) | 2008-06-03 | 2015-11-10 | Microsoft Technology Licensing, Llc | Adaptive quantization for enhancement layer video coding |
US10306227B2 (en) | 2008-06-03 | 2019-05-28 | Microsoft Technology Licensing, Llc | Adaptive quantization for enhancement layer video coding |
US9571840B2 (en) | 2008-06-03 | 2017-02-14 | Microsoft Technology Licensing, Llc | Adaptive quantization for enhancement layer video coding |
US20090296808A1 (en) * | 2008-06-03 | 2009-12-03 | Microsoft Corporation | Adaptive quantization for enhancement layer video coding |
US8897359B2 (en) | 2008-06-03 | 2014-11-25 | Microsoft Corporation | Adaptive quantization for enhancement layer video coding |
US20110216828A1 (en) * | 2008-11-12 | 2011-09-08 | Hua Yang | I-frame de-flickering for gop-parallel multi-thread video encoding |
US20110211633A1 (en) * | 2008-11-12 | 2011-09-01 | Ferran Valldosera | Light change coding |
US20110206138A1 (en) * | 2008-11-13 | 2011-08-25 | Thomson Licensing | Multiple thread video encoding using hrd information sharing and bit allocation waiting |
CN102217309A (en) * | 2008-11-13 | 2011-10-12 | 汤姆逊许可证公司 | Multiple thread video encoding using hrd information sharing and bit allocation waiting |
US20110222604A1 (en) * | 2008-11-13 | 2011-09-15 | Thomson Licensing | Multiple thread video encoding using gop merging and bit allocation |
US9143788B2 (en) | 2008-11-13 | 2015-09-22 | Thomson Licensing | Multiple thread video encoding using HRD information sharing and bit allocation waiting |
WO2010056315A1 (en) * | 2008-11-13 | 2010-05-20 | Thomson Licensing | Multiple thread video encoding using gop merging and bit allocation |
US9210431B2 (en) * | 2008-11-13 | 2015-12-08 | Thomson Licensing | Multiple thread video encoding using GOP merging and bit allocation |
US8861596B2 (en) * | 2009-07-27 | 2014-10-14 | Sony Corporation | Image encoding device and image encoding method |
US20110019736A1 (en) * | 2009-07-27 | 2011-01-27 | Kyohei Koyabu | Image encoding device and image encoding method |
US20110216774A1 (en) * | 2010-03-02 | 2011-09-08 | Intrusion Inc. | Packet file system |
US8472449B2 (en) * | 2010-03-02 | 2013-06-25 | Intrusion, Inc. | Packet file system |
US9544664B1 (en) | 2011-06-08 | 2017-01-10 | Arris Enterprises, Inc. | Group of pictures size adjustment |
US9094641B1 (en) * | 2011-06-08 | 2015-07-28 | Arris Enterprises, Inc. | Group of pictures size adjustment |
US9014255B2 (en) * | 2012-04-03 | 2015-04-21 | Xerox Corporation | System and method for identifying unique portions of videos with validation and predictive scene changes |
US20130259123A1 (en) * | 2012-04-03 | 2013-10-03 | Xerox Corporation | System and method for identifying unique portions of videos with validation and predictive scene changes |
CN104038762A (en) * | 2013-03-06 | 2014-09-10 | 三星电子株式会社 | Video Encoder, Method Of Detecting Scene Change And Method Of Controlling Video Encoder |
CN106210718A (en) * | 2016-08-08 | 2016-12-07 | 飞狐信息技术(天津)有限公司 | Method and device for detecting scene switching in a video sequence |
CN108737838A (en) * | 2017-04-19 | 2018-11-02 | 北京金山云网络技术有限公司 | Video coding method and device |
US10497258B1 (en) * | 2018-09-10 | 2019-12-03 | Sony Corporation | Vehicle tracking and license plate recognition based on group of pictures (GOP) structure |
US20210192681A1 (en) * | 2019-12-18 | 2021-06-24 | Ati Technologies Ulc | Frame reprojection for virtual reality and augmented reality |
US12148120B2 (en) * | 2019-12-18 | 2024-11-19 | Ati Technologies Ulc | Frame reprojection for virtual reality and augmented reality |
Also Published As
Publication number | Publication date |
---|---|
WO2002005562A2 (en) | 2002-01-17 |
AU2001276876A1 (en) | 2002-01-21 |
AU2001273326A1 (en) | 2002-01-21 |
US7155067B2 (en) | 2006-12-26 |
AU2001276871A1 (en) | 2002-01-21 |
WO2002005562A3 (en) | 2003-03-20 |
US20020028024A1 (en) | 2002-03-07 |
US20020006231A1 (en) | 2002-01-17 |
WO2002005121A3 (en) | 2003-03-27 |
WO2002005214A3 (en) | 2003-09-25 |
WO2002005214A2 (en) | 2002-01-17 |
WO2002005121A2 (en) | 2002-01-17 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US20020021756A1 (en) | Video compression using adaptive selection of groups of frames, adaptive bit allocation, and adaptive replenishment | |
US6100940A (en) | Apparatus and method for using side information to improve a coding system | |
US8351513B2 (en) | Intelligent video signal encoding utilizing regions of interest information | |
US6310915B1 (en) | Video transcoder with bitstream look ahead for rate control and statistical multiplexing | |
EP1929784B1 (en) | Content driven transcoder that orchestrates multimedia transcoding using content information | |
US7200276B2 (en) | Rate allocation for mixed content video | |
US5301032A (en) | Digital image compression and decompression method and apparatus using variable-length coding | |
EP1520431B1 (en) | Efficient compression and transport of video over a network | |
US7920628B2 (en) | Noise filter for video compression | |
US20060188014A1 (en) | Video coding and adaptation by semantics-driven resolution control for transport and storage | |
US20020009143A1 (en) | Bandwidth scaling of a compressed video stream | |
US20040008899A1 (en) | Optimization techniques for data compression | |
US20100021071A1 (en) | Image coding apparatus and image decoding apparatus | |
EP3512198A1 (en) | Image decoding device, image encoding device, and method thereof | |
US6137912A (en) | Method of multichannel data compression | |
US6252905B1 (en) | Real-time evaluation of compressed picture quality within a digital video encoder | |
US9014268B2 (en) | Video encoder and its decoder | |
US7636482B1 (en) | Efficient use of keyframes in video compression | |
US7949051B2 (en) | Mosquito noise detection and reduction | |
US7864840B2 (en) | Scene-by-scene digital video processing | |
US7856054B1 (en) | Scene change identification during encoding of compressed video | |
US20100027617A1 (en) | Method and apparatus for compressing a reference frame in encoding/decoding moving images | |
KR20060043051A (en) | Method of encoding and decoding video signal | |
US20030123538A1 (en) | Video recording and encoding in devices with limited processing capabilities | |
US6040875A (en) | Method to compensate for a fade in a digital video input sequence |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: MEDIAFLOW, LLC., GEORGIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: JAYANT, NUGGEHALLY S.; JANG, SEONG H.; YOO, JANGHYUN; REEL/FRAME: 011986/0484. Effective date: 20010711 |
AS | Assignment | Owner name: EG TECHNOLOGY, INC., GEORGIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: MEDIAFLOW, LLC; REEL/FRAME: 012684/0257. Effective date: 20011207 |
AS | Assignment | Owner name: SILICON VALLEY BANK DBA SILICON VALLEY EAST, CALIF. Free format text: SECURITY INTEREST; ASSIGNOR: EG TECHNOLOGY, INC.; REEL/FRAME: 014925/0329. Effective date: 20040610 |
AS | Assignment | Owner name: SILICON VALLEY BANK, CALIFORNIA. Free format text: SECURITY AGREEMENT; ASSIGNOR: EG TECHNOLOGY, INC.; REEL/FRAME: 019000/0893. Effective date: 20070226 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
AS | Assignment | Owner name: EG TECHNOLOGY, INC., GEORGIA. Free format text: RELEASE; ASSIGNOR: SILICON VALLEY BANK; REEL/FRAME: 023973/0512. Effective date: 20100209 |
AS | Assignment | Owner name: EG TECHNOLOGY, INC., GEORGIA. Free format text: RELEASE; ASSIGNOR: SILICON VALLEY BANK; REEL/FRAME: 023998/0768. Effective date: 20100209 |
AS | Assignment | Owner name: ARRIS GROUP, INC., GEORGIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: EG TECHNOLOGY, INC.; REEL/FRAME: 024864/0491. Effective date: 20090831 |