WO2009139569A2 - Method and apparatus for decoding a video signal - Google Patents
Method and apparatus for decoding a video signal
- Publication number
- WO2009139569A2 PCT/KR2009/002490
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- motion
- view
- current block
- picture
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 51
- 230000033001 locomotion Effects 0.000 claims abstract description 301
- 230000001939 inductive effect Effects 0.000 claims abstract description 3
- 239000013598 vector Substances 0.000 claims description 93
- 230000006835 compression Effects 0.000 abstract description 10
- 238000007906 compression Methods 0.000 abstract description 10
- 238000003672 processing method Methods 0.000 description 11
- 239000000284 extract Substances 0.000 description 9
- 230000002123 temporal effect Effects 0.000 description 7
- 238000010586 diagram Methods 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 230000001965 increasing effect Effects 0.000 description 2
- 238000007726 management method Methods 0.000 description 2
- 238000013139 quantization Methods 0.000 description 2
- 230000003044 adaptive effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000011084 recovery Methods 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/593—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
Definitions
- the present invention relates to the processing of video signals, and more particularly, to a video signal processing method and apparatus for decoding a video signal.
- Compression coding refers to a series of signal processing techniques for transmitting digitized information through a communication line or storing the information in a form suitable for a storage medium.
- the objects of compression encoding include audio, video, text, and the like; in particular, a technique of performing compression encoding on video is called video compression.
- a general feature of video images is that they have spatial redundancy and temporal redundancy.
- the present invention has been made to solve the above-mentioned problem, and an object of the present invention is to provide a video signal processing method and apparatus capable of decoding a video signal from which inter-view redundancy has been removed, by identifying related inter-view pictures and decoding only the motion information of those related pictures.
- Another object of the present invention is to provide a video signal processing method and apparatus for decoding a video signal using motion information of a related picture based on the relevance of pictures between views.
- according to the present invention, the video signal can be reconstructed quickly and the compression rate is increased.
- FIG. 1 is a schematic block diagram of a video signal decoding apparatus to which the present invention is applied.
- FIG. 2 is an embodiment to which the present invention is applied and shows attribute information of a multiview image that may be added to a multiview image coded bitstream.
- FIGS. 3 and 4 illustrate syntaxes in which prediction mode identification information is implemented according to an embodiment of the present invention.
- FIG. 5 illustrates an embodiment of a method for transmitting a global motion vector.
- FIG. 6 shows an example of syntax in the case of transmitting a global motion vector.
- FIG. 7 is a view for explaining the concept of a video signal processing method according to an embodiment of the present invention.
- FIG. 8 illustrates a configuration of a video signal processing apparatus according to an embodiment of the present invention.
- FIG. 9 is a flowchart illustrating a video signal processing method according to an embodiment of the present invention.
- FIG. 10 illustrates the concept of a video signal processing method according to another embodiment of the present invention.
- FIG. 11 illustrates a configuration of a video signal processing apparatus according to another embodiment of the present invention.
- FIG. 12 illustrates a detailed configuration of the motion information acquisition unit in FIG. 11.
- FIG. 13 is a flowchart of a video signal processing method according to another embodiment of the present invention.
- FIG. 14 is a diagram illustrating a process of extracting first motion skip flag information according to an embodiment of the present invention.
- FIG. 15 is a diagram illustrating a process of extracting second motion skip flag information according to an embodiment of the present invention.
- FIG. 16 is a detailed flowchart of the motion information generation process S370 of FIG. 13.
- FIG. 17 is a table illustrating the meaning of second prediction mode identification information according to another embodiment of the present invention.
- FIG. 18 is an example of syntax for transmitting second prediction mode identification information according to another embodiment of the present invention.
- FIG. 19 is an example of syntax in which a global motion vector is transmitted according to another embodiment of the present invention.
- FIG. 20 is a diagram illustrating a process of extracting first motion skip flag information according to another embodiment of the present invention.
- FIG. 21 is a diagram for explaining a method of finding a corresponding block using additional information according to another embodiment of the present invention.

[Best Mode for Carrying Out the Invention]
- a video signal processing method comprises the steps of: acquiring viewpoint information identifying a viewpoint of a current block;
- obtaining prediction mode identification information indicating whether view information of a reference view is used for motion skip; and deriving the motion information of the current block from a reference block existing in the reference view when motion skip is performed according to the second motion skip flag information; and
- in this case, offset information indicating a position difference between the reference block and a corresponding block is obtained, wherein the corresponding block is a block pointed to by a global motion vector representing the disparity difference between the view of the current block and the view of the reference block.
- Motion information of the current block may be derived based on the offset information.
- the method may further include deriving a modified global motion vector using the global motion vector and the offset information.
- Position information of the reference block may be derived using the modified global motion vector.
- the global motion vector may be obtained based on the information representing the inter-view dependency.
- the method comprises: acquiring anchor picture flag information indicating whether the current block corresponds to an anchor picture;
- the current block may correspond to a non-anchor picture according to the anchor picture flag information.
- the global motion vector may be obtained based on the anchor picture flag information.
- the prediction mode identification information may be obtained from an extended region of a sequence parameter set.
- the video signal may be received as a broadcast signal.
- the video signal may be received through a digital medium.
- according to another aspect of the present invention, a computer-readable recording medium has recorded thereon a program for executing the method described above.
- prediction mode identification information indicating whether view information of a reference view referred to by the current block is used for motion skip or inter-view sample prediction
- a motion skip determination unit for obtaining second motion skip flag information indicating whether motion skip is performed on the current block
- when motion skip is performed, the motion information of the current block may be derived.
- the motion information in the present invention should be understood as a concept including not only temporal motion information but also inter-view motion information in the inter-view direction.
- likewise, a motion vector should be understood as a concept that includes not only a motion offset in the time direction but also a disparity offset in the inter-view direction.
- FIG. 1 is a schematic block diagram of a video signal decoding apparatus to which the present invention is applied.
- the decoding apparatus includes a parsing unit 10, an entropy decoding unit 20, an inverse quantization/inverse transform unit 30, an intra prediction unit 40, a deblocking filter unit 50, a decoded picture buffer unit 60, an inter prediction unit 70, and the like.
- the decoded picture buffer unit 60 may largely include a reference picture storage unit (not shown), a reference picture list generation unit (not shown), a reference picture management unit (not shown), and the like.
- the parsing unit 10 performs parsing in NAL units to decode the received video image.
- in general, one or more Sequence Parameter Sets (SPSs) and Picture Parameter Sets (PPSs) are sent to the decoder before the slice header and slice data are decoded.
- various attribute information may be included in the NAL header area or the extension area of the NAL header.
- since MVC is an extension of the existing AVC technology, it may be more efficient to add various attribute information only in the case of an MVC bitstream rather than adding it unconditionally. For example, flag information for identifying whether the bitstream is an MVC bitstream may be added to the NAL header region or an extension region of the NAL header.
- attribute information regarding a multiview image may be added only when the input bitstream is an MVC bitstream according to the flag information.
- the attribute information may include view identification information, anchor picture identification information, inter-view prediction flag information, priority identification information, and identification information indicating whether the picture is an instantaneous decoded picture for the view. This will be described in detail with reference to FIG. 2.
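As a rough illustration of the parsing described above, the following Python sketch reads the kinds of attribute fields listed here from a NAL header extension. The field names and bit widths are assumptions loosely modeled on the MVC draft syntax, not a normative parser.

```python
# Minimal sketch of reading multiview attribute information from a NAL
# header extension. Field names and bit widths are assumptions.

class BitReader:
    def __init__(self, data: bytes):
        self.data, self.pos = data, 0

    def u(self, n: int) -> int:
        """Read n bits as an unsigned integer (MSB first)."""
        val = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            val = (val << 1) | bit
            self.pos += 1
        return val

def parse_nal_header_extension(r: BitReader) -> dict:
    return {
        "non_idr_flag":    r.u(1),   # instantaneous-decoding indication
        "priority_id":     r.u(6),   # priority identification information
        "view_id":         r.u(10),  # view identification information
        "temporal_id":     r.u(3),
        "anchor_pic_flag": r.u(1),   # anchor picture identification information
        "inter_view_flag": r.u(1),   # inter-view prediction flag information
    }
```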
- the parsed bitstream is entropy decoded through the entropy decoding unit 20, and coefficients, motion vectors, and the like of each macroblock are extracted.
- the inverse quantization/inverse transform unit 30 multiplies the received quantized value by a predetermined constant to obtain a transformed coefficient value, and inversely transforms the coefficient value to restore the pixel value.
- the intra prediction unit 40 uses the reconstructed pixel value to perform intra prediction from the decoded samples in the current picture.
- the deblocking filter section 50 is applied to each coded macroblock to reduce block distortion.
- the filter smoothes the edges of the block to improve the quality of the decoded frame. The choice of filtering process depends on the boundary strength and the gradient of image samples around the boundary.
- the filtered pictures are output or stored in the decoded picture buffer unit 60 for use as a reference picture.
- the decoded picture buffer unit 60 stores or outputs previously coded pictures in order to perform inter prediction. The frame_num and POC (Picture Order Count) of each picture are used to store pictures in or output them from the decoded picture buffer unit 60. In addition, in MVC some of the previously coded pictures may be at a view different from that of the current picture. Therefore, in order to use such pictures as reference pictures, not only the frame_num and POC but also view information identifying the view of each picture may be used.
- the decoded picture buffer unit 60 includes a reference picture storage unit (not shown), a reference picture list generator (not shown), and a reference picture manager (not shown).
- the reference picture storage unit stores pictures that are referenced for coding of the current picture.
- the reference picture list generator generates a list of reference pictures for inter prediction. Since the inter-view prediction may be performed in multi-view video coding, when a current picture refers to a picture at a different view, it may be necessary to generate a reference picture list for inter-view prediction.
- the reference picture list generator may use information about a viewpoint to generate a reference picture list for inter-view prediction.
- inter-view reference information may be used.
- inter-view reference information refers to information used to indicate inter-view dependency. Examples include the total number of views, a view identification number, the number of inter-view reference pictures, and the view identification number of each inter-view reference picture.
- the reference picture manager manages reference pictures to realize inter prediction more flexibly. For example, an adaptive memory management method and a sliding window method may be used. The purpose is to unify the memory for reference pictures and non-reference pictures and to manage it efficiently with less memory. In multi-view video coding, since pictures in the view direction have the same picture order count, information identifying the view of each picture may be used for their marking. Reference pictures managed through this process may be used by the inter prediction unit 70.
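The following sketch illustrates, under stated assumptions, how pictures in the decoded picture buffer could be keyed by frame_num, POC, and view information and evicted with a simple sliding window; real DPB management involves many more rules than shown here.

```python
# Illustrative DPB keyed by (frame_num, POC, view_id); not a normative model.

from dataclasses import dataclass

@dataclass(frozen=True)
class PictureKey:
    frame_num: int
    poc: int       # picture order count
    view_id: int   # needed in MVC: same-POC pictures can differ in view

class DecodedPictureBuffer:
    def __init__(self, max_size: int = 16):
        self.max_size = max_size
        self.pictures: dict = {}

    def store(self, key: PictureKey, picture) -> None:
        # Simple sliding-window marking: evict the oldest entry when full.
        if len(self.pictures) >= self.max_size:
            oldest = min(self.pictures, key=lambda k: (k.poc, k.view_id))
            del self.pictures[oldest]
        self.pictures[key] = picture

    def inter_view_references(self, poc: int, view_id: int):
        # Inter-view reference candidates: same POC, different view.
        return [p for k, p in self.pictures.items()
                if k.poc == poc and k.view_id != view_id]
```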
- the inter predicted pictures and the intra predicted pictures are selected according to the prediction mode to reconstruct the current picture.
- FIG. 2 is an embodiment to which the present invention is applied and shows attribute information on a multiview image that may be added to a multiview image coded bitstream.
- the NAL unit may be composed of a NAL header, an RBSP (Raw Byte Sequence Payload), and video compression result data.
- the NAL header may include identification information (nal_ref_idc) indicating whether the NAL unit includes a slice of a reference picture and information (nal_unit_type) indicating the type of the NAL unit.
- inclusion of an extended region of the NAL header may be conditional. For example, when the information indicating the type of the NAL unit relates to multi-view video coding or indicates a prefix NAL unit, the NAL unit may also include an extended region of the NAL header.
- in the extended region of the NAL header, attribute information of a multiview image may be added according to flag information (svc_mvc_flag) for identifying whether the bitstream is an MVC bitstream.
- the RBSP may include information about a sequence parameter set.
- the RBSP may also include information about a subset sequence parameter set.
- the subset sequence parameter set may include an extended region of the sequence parameter set.
- the extended region of the sequence parameter set may include inter-view reference information indicating an inter-view dependency relationship.
- hereinafter, various attribute information on a multiview image, for example, attribute information that may be included in an extended region of a NAL header or in an extended region of a sequence parameter set, will be described in detail.
- the view identification information refers to information for distinguishing a picture at a current view from a picture at a different view.
- when a video signal is coded, POC (Picture Order Count) and frame_num are used to identify each picture.
- in the case of a multiview video sequence, however, these are not sufficient to distinguish a picture at the current view from a picture at a different view. Therefore, it is necessary to define view identification information for identifying the view of each picture.
- the view identification information may be obtained from a header area of a video signal.
- the header area may be an NAL header area or an extension area of the NAL header, or may be a slice header area.
- the view identification information may be used to obtain information of a picture at a view different from that of the current picture, and the video signal may be decoded using the information of the picture at the other view.
- the viewpoint identification information may be applied throughout the encoding / decoding process of the video signal.
- the viewpoint identification information may be used to indicate the inter-view dependence.
- for example, the number of inter-view reference pictures and the view identification information of each inter-view reference picture may be required.
- information used to indicate inter-view dependency, such as the number of inter-view reference pictures and the view identification information of the inter-view reference pictures, will be referred to as inter-view reference information.
- the view identification information may be used to indicate the view identification information of an inter-view reference picture.
- the inter-view reference picture may refer to a reference picture used when performing inter-view prediction on the current picture.
- alternatively, the above may be applied to multi-view video coding by using a frame_num defined in consideration of views, rather than a dedicated view identifier.
- the anchor picture flag information refers to information for identifying whether a current coded picture of a NAL unit is an anchor picture.
- the anchor picture refers to an encoded picture in which all slices refer only to slices in frames of the same time instant.
- that is, the encoded picture refers only to slices at different views and not to slices at the current view.
- random access between views may be possible.
- inter-view reference information is required for inter-view prediction.
- in obtaining the inter-view reference information, anchor picture flag information may be used. For example, when the current picture corresponds to an anchor picture, inter-view reference information about the anchor picture may be obtained. If the current picture corresponds to a non-anchor picture, inter-view reference information about the non-anchor picture may be obtained. This will be described in detail with reference to FIG. 3.
- the anchor picture or the non-anchor picture may refer to pictures at a plurality of viewpoints.
- a picture of a virtual view may be generated from pictures at a plurality of viewpoints, and the current picture may be predicted using the pictures of the virtual view.
- the current picture may be predicted by referring to the plurality of pictures at the plurality of viewpoints.
- the anchor picture flag information may be used when generating a reference picture list.
- the reference picture list may include a reference picture list for inter-view prediction.
- the reference picture list for the inter-view prediction may be added to the reference picture list.
- the anchor picture flag information may be used when initializing a reference picture list or when modifying the reference picture list. It may also be used to manage the added reference pictures for the inter-view prediction.
- the reference pictures may be divided into anchor pictures and non-anchor pictures, and the reference pictures that are not used when performing inter-view prediction may be marked not to be used.
- the anchor picture flag information may also be applied to a hypothetical reference decoder.
- the inter-view prediction flag information refers to information indicating whether a coded picture of a current NAL unit is used for inter-view prediction.
- the inter-view prediction flag information may be used in a part where temporal prediction or inter-view prediction is performed.
- it may be used together with identification information indicating whether the NAL unit includes a slice of a reference picture.
- the current NAL unit may be a reference picture used only for inter-view prediction.
- the current NAL unit may be used for temporal prediction and inter-view prediction.
- for example, according to the identification information, even if the NAL unit does not include a slice of a reference picture, it may be stored in the decoded picture buffer. This is because the coded picture of the current NAL unit needs to be stored when it is used for inter-view prediction according to the inter-view prediction flag information.
- alternatively, a single piece of identification information may indicate whether a coded picture of a current NAL unit is used for temporal prediction and/or inter-view prediction.
- single view decoding does not decode pictures at views not referenced by the current picture, and partially decodes only the motion information of reference pictures at the views referenced by the current picture.
- multi-view decoding is to decode all information of all pictures of a reference view including a picture referenced in decoding of a current picture.
- the single view decoding information is information indicating whether the sequence performs single view decoding or multiple view decoding.
- the prediction mode identification information indicates whether, in decoding the current picture belonging to the current view, the motion information of the current block is extracted from the bitstream or is generated using the motion information of a reference picture belonging to the reference view, that is, whether motion skip is performed. Motion skip will be described in the relevant section.
- FIGS. 3 and 4 illustrate syntaxes in which prediction mode identification information is implemented according to an embodiment of the present invention.
- since an anchor picture performs decoding only by inter-view prediction, motion skip is not used. Therefore, it is not necessary to extract the prediction mode identification information in the case of an anchor picture; the prediction mode identification information is extracted only in the case of a non-anchor picture (S100). For example, when the prediction mode identification information is '0', motion skip is performed to derive the motion information of the current block from the motion information of a block existing at a different view. On the other hand, when the prediction mode identification information is '1', the motion information of the current block is extracted from the received bitstream without performing motion skip.
- referring to FIG. 4, when the single view decoding information is '1', pictures at views not referenced by the current picture are not decoded, and only the motion information of the reference pictures at the views referenced by the current picture is partially decoded. In this case, it is not necessary to extract the prediction mode identification information.
- when the single view decoding information is '0', all information of all pictures of the reference view, including the picture referenced in decoding the current picture belonging to the current view, is decoded, and the prediction mode identification information needs to be extracted in decoding the current picture. Therefore, the prediction mode identification information is extracted only under the condition that the single view decoding information is '0' (S110).
- even when the prediction mode identification information is obtained, information indicating whether to perform motion skip may additionally be obtained at the slice level or the macroblock level, and motion skip may be performed according to that information in decoding a slice or macroblock belonging to the current view.
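As a minimal sketch of the extraction conditions of FIGS. 3 and 4, the following Python function reads the prediction mode identification information only for non-anchor pictures with single view decoding off; the function and flag names are illustrative, not normative syntax elements.

```python
# Illustrative extraction condition for prediction mode identification
# information (S100/S110). Uses the BitReader sketched earlier.

def read_prediction_mode_id(r, anchor_pic_flag: int,
                            single_view_decoding_flag: int) -> int:
    if anchor_pic_flag:
        # Anchor pictures decode only by inter-view prediction; motion skip
        # is not used, so nothing is extracted (assume '1': extract motion).
        return 1
    if single_view_decoding_flag:
        # Only motion information of the reference view is decoded, so
        # motion skip is implied rather than signalled (assume '0').
        return 0
    return r.u(1)  # '0' -> motion skip, '1' -> extract motion information
```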
- such information obtained at the slice or macroblock level is called second prediction mode identification information.
- if the prediction mode identification information in the SPS extended region indicates that motion skip is performed in decoding blocks belonging to the current view, and the first motion skip flag information in the slice header indicates that motion skip is performed for at least one block belonging to the slice, only motion skip may be used when decoding the current block (hereinafter referred to as the 'zeroth mode'). However, if the prediction mode identification information in the SPS extended region indicates that motion skip is not performed in decoding the current block, whether motion skip is used is determined according to the first motion skip flag information in the slice header.
- each of the three cases ((a), (b), (c)) may be defined as the zeroth mode, the first mode, or the second mode, which have the meanings shown in FIG. 17.
- the process of extracting the second prediction mode identification information will be described with reference to FIG. 18.
- FIG. 18 is an example of syntax for transmitting second prediction mode identification information according to another embodiment of the present invention.
- since an anchor picture performs decoding only by inter-view prediction, motion skip is not performed; therefore, it is not necessary to extract the second prediction mode identification information in the case of an anchor picture.
- in the case of a non-anchor picture, for example, when the second prediction mode identification information is '0', only motion skip is used to generate the motion information of the current block. If the second prediction mode identification information is '1', the motion information of the current block is extracted from the received bitstream without performing motion skip.
- otherwise, either motion skip is performed to derive the motion information of the current block from the motion information of a block existing at another view, or the motion information of the current block is extracted from the received bitstream without performing motion skip.
- for example, when the single view decoding information is '1', pictures at views not referenced by the current picture are not decoded, and only the motion information and block types of the reference pictures at the views referenced by the current picture are partially decoded; it is therefore not necessary to extract the second prediction mode identification information. In this case, the second prediction mode identification information may be set to '0'.
- when the single view decoding information is '0', since decoding the current picture decodes all information of all pictures of the reference view, including the reference picture referred to in decoding the current picture belonging to the current view, it is necessary to extract the second prediction mode identification information in decoding the current picture.
- in this case, the second prediction mode identification information may be extracted as '1' or '2'.
- the concept of the global motion vector and the process of deriving the global motion vector of the current picture using the global motion information of another picture when the global motion vector of the current picture (or slice) is not transmitted will be described.
- FIG. 5 illustrates an embodiment of a method for transmitting a global motion vector.
- the motion vector corresponds to a partial region (macroblock, block, pixel, etc.), whereas the global motion vector is a motion vector corresponding to the entire region including the partial region.
- the entire area may be processed in one slice, in one picture, or may correspond to an entire sequence.
- the motion vector may be expressed in pixel units.
- the global motion vector may likewise be expressed in pixel units, 4x4 units, 8x8 units, or macroblock units.
- the transmission method of the global motion vector may also be varied. It may be transmitted for every slice in the picture, may be transmitted for each picture, or may be transmitted for each slice only in the case of an anchor picture.
- a global motion vector is transmitted only in the case of an anchor picture among pictures, and a global motion vector is not transmitted in the case of a non-anchor picture.
- An example of syntax in the case of transmitting a global motion vector by the method shown in FIG. 5 is shown in FIG. 6.
- slice type information is included in the slice header; when the picture to which the current slice belongs is an anchor picture (S120), one global motion vector may be transmitted if the slice type is P, and two global motion vectors may be transmitted if the slice type is B (S130).
- FIG. 19 is an example of syntax in which a global motion vector is transmitted according to another embodiment of the present invention.
- under the conditions that the picture to which the current slice belongs is an anchor picture (S610) and that the second prediction mode identification information is not '1' (S620), one global motion vector may be transmitted when the slice type is P, and two global motion vectors may be transmitted when the slice type is B (S630). That is, when the second prediction mode identification information is '1', no motion skip is performed, so the global motion vector used for motion skip is unnecessary. Therefore, the global motion vector is transmitted only when motion skip can be performed in decoding the current block, thereby improving the efficiency of video signal processing.
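A minimal sketch of this slice-header condition (FIGS. 6 and 19) follows; the bitstream reader API (r.se for signed Exp-Golomb values) and the slice-type constants are assumptions.

```python
# Illustrative slice-header condition for transmitting global motion
# vectors: anchor pictures only, skipped when motion skip is disabled,
# one vector for P slices and two for B slices (S610-S630).

P_SLICE, B_SLICE = 0, 1

def read_global_motion_vectors(r, anchor_pic_flag, pred_mode_id, slice_type):
    gdvs = []
    if anchor_pic_flag and pred_mode_id != 1:          # S610, S620
        n = 1 if slice_type == P_SLICE else 2 if slice_type == B_SLICE else 0
        for _ in range(n):                             # S630
            gdv_x = r.se()  # assumed signed Exp-Golomb read, horizontal
            gdv_y = r.se()  # vertical component
            gdvs.append((gdv_x, gdv_y))
    return gdvs
```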
- FIG. 7 illustrates a concept of a video signal processing method according to an embodiment of the present invention
- FIG. 8 illustrates a configuration of a video signal processing device according to an embodiment of the present invention
- FIG. 9 is a flowchart of a video signal processing method according to an embodiment of the present invention.
- referring to FIG. 7, the global motion vector of the current picture is not transmitted; instead, the global motion vector (GDVA) of the forward neighboring anchor picture on the time axis and the global motion vector (GDVB) of the backward neighboring anchor picture on the time axis are transmitted. The global motion vector (GDVcur) of the current picture is generated using these.
- referring to FIGS. 8 and 9, a bitstream is input to the picture information extractor 110, and after extracting the anchor picture flag information (S210), the picture information extractor 110 uses the anchor picture flag information to determine whether the current picture is an anchor picture.
- if the current picture is an anchor picture, the global motion vector extractor 122 extracts the global motion vector of the current picture (S230). However, if the current picture is not an anchor picture, the global motion vector generator 124 searches for the anchor pictures neighboring the current picture and then extracts the global motion vectors (GDVA, GDVB) of the searched neighboring anchor pictures (S240).
- the global motion vector generator 124 generates the global motion vector GDVcur of the current picture by using the extracted global motion vectors (GDVA, GDVB) (S250).
- a method of generating a global motion vector of the current picture will be described.
- first, as shown in Equation 1 below, the value obtained by multiplying the global motion vector (GDVprev) of an anchor picture by a constant (c) can be used as the global motion vector (GDVcur) of the current picture.
- GDVprev may be the global motion vector of the most recently extracted anchor picture, or the global motion vector (GDVA, GDVB) of the forward or backward neighboring anchor picture on the time axis.
- the constant (c) may be a predetermined value or a value calculated using the output order.
- [Equation 1] GDVcur = c * GDVprev
- second, as shown in Equation 2, the degree of proximity (POCcur - POCA, POCB - POCA) between the current picture and the neighboring anchor pictures is calculated using the output order (POCcur, POCA, POCB), and the global motion vector (GDVcur) of the current picture may be generated using it together with the extracted global motion vectors (GDVA, GDVB) of the neighboring anchor pictures.
- [Equation 2] GDVcur = GDVA + [(POCcur - POCA) / (POCB - POCA)] * (GDVB - GDVA)
- Coding information of the current block may be predicted using the generated global motion vector.
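The two derivations above can be sketched as follows; the rounding and component-wise treatment are assumptions, since the text does not fix them.

```python
# Illustrative derivation of the current picture's global motion vector
# from neighboring anchor pictures, following Equations 1 and 2.

def gdv_scaled(gdv_prev, c):
    """Equation 1: GDVcur = c * GDVprev (component-wise, rounded)."""
    return tuple(round(c * v) for v in gdv_prev)

def gdv_interpolated(gdv_a, gdv_b, poc_cur, poc_a, poc_b):
    """Equation 2: interpolate between the forward (A) and backward (B)
    neighboring anchor pictures in proportion to output order (POC)."""
    w = (poc_cur - poc_a) / (poc_b - poc_a)
    return tuple(round(a + w * (b - a)) for a, b in zip(gdv_a, gdv_b))

# Example: anchor GDVs (8, 0) at POC 0 and (12, 0) at POC 8; current POC 2:
# gdv_interpolated((8, 0), (12, 0), 2, 0, 8) -> (9, 0)
```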
- however, the global motion vector may be less accurate, and thus it may be necessary to find the corresponding block more precisely.
- FIG. 21 is a diagram for explaining a method of finding a corresponding block using additional information according to another embodiment of the present invention.
- the corresponding block existing at a different point in time from the current block may be found and the coding information of the current block may be predicted using the coding information of the corresponding block.
- the corresponding block may be a block indicated by the view motion vector of the current block, and the motion vector in the view direction may mean a global motion vector. The meaning of the global motion vector has been described above. In this case, if the block most similar to the current block is found using the additional information, the coding efficiency can be improved.
- the additional information may be used to increase the accuracy of the motion vector.
- the additional information may include offset information.
- the offset information may include first offset information offset_X indicating a position difference between the corresponding block MB1 indicated by the global motion vector of the current block and the actual reference block MB2 including the motion information.
- the corresponding block MB1 and the reference block MB2 may be 16 ⁇ 16 macroblocks, and the first offset information offset_X may be obtained from the macroblock layer when motion skip is performed.
- a process of deriving a motion vector indicating the reference block MB2 by using the first offset information offset_X will be described.
- second offset information offset_Y indicating a difference between a position P1 indicated by the global motion vector of the current block and a position P2 of the macro block MB1 including the position P1 may be derived.
- the second offset information offset_Y may be a derived variable.
- the second offset information offset_Y may be derived based on the position P1 value indicated by the global motion vector of the current block.
- for example, if the remainders obtained by dividing the horizontal and vertical components of the position P1 by 2 are (0,0), the second offset information (offset_Y) is set to (0,0); if (0,1), it is set to (0,1); if (1,0), to (1,0); and if (1,1), to (1,1), respectively.
- third offset information offset_Z may then be derived from the first and second offset information. This is shown in Equation 3 below.
- [Equation 3] offset_Z[i] = offset_X[i] - offset_Y[i]
- here, i = 0 and i = 1 mean the horizontal direction and the vertical direction, respectively.
- the modified motion vector may be derived using the transmitted global motion vector GDV and the derived third offset information offset_Z.
- the modified motion vector may mean a motion vector (accGDV) indicating a reference block MB2.
- the reference block MB2 may mean the block having the optimal rate-distortion among all the candidate blocks considered at the encoder. That is, it may mean the block most similar to the current block.
- the modified motion vector may be derived as shown in Equation 4 below.
- [Equation 4] accGDV[i] = GDV[i] + offset_Z[i]
- position information of the reference block MB2 can be derived using the modified motion vector accGDV. For example, if the remainders obtained by dividing the horizontal and vertical component values of the position P3 (x, y) indicated by the modified motion vector by 2, respectively, are (0,0), this may be referred to as mode 0.
- mode 0 may mean that the position of the reference block MB2 indicates the position of the upper-left 8x8 block of the 16x16 macroblock divided into four quarters in 8x8 units.
- if the remainders obtained by dividing the horizontal and vertical component values of the position P3 (x, y) indicated by the modified motion vector by 2 are (1,0), this may be referred to as mode 1.
- mode 1 may mean that the position of the reference block MB2 indicates the position of the upper-right 8x8 block of the 16x16 macroblock divided into four quarters in 8x8 units.
- mode 2 may indicate the position of the 8 ⁇ 8 block at the lower left
- mode 3 may indicate the position of the 8 ⁇ 8 block at the lower right.
- the location information of the reference block MB2 is derived, and the motion information of the current block can be derived according to the location information of the reference block MB2.
- the motion information may include a motion vector, a reference index, a block type, and the like.
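Putting Equations 3 and 4 and the mode derivation together, a minimal sketch follows; the simple additive form of Equation 4 and the exact mode numbering are read from the description above and should be taken as assumptions.

```python
# Illustrative offset-based refinement of the motion skip vector.

def refine_motion_skip_vector(gdv, p1, offset_x):
    # p1: position pointed to by the GDV of the current block
    offset_y = (p1[0] % 2, p1[1] % 2)          # second offset (parity of P1)
    offset_z = (offset_x[0] - offset_y[0],     # Equation 3
                offset_x[1] - offset_y[1])
    acc_gdv = (gdv[0] + offset_z[0],           # Equation 4 (assumed additive)
               gdv[1] + offset_z[1])
    return acc_gdv

def block_mode(p3):
    # Remainders of P3's components mod 2 select an 8x8 quarter of the
    # 16x16 macroblock: 0 top-left, 1 top-right, 2 bottom-left, 3 bottom-right.
    return (p3[1] % 2) * 2 + (p3[0] % 2)
```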
- in the motion skip mode, the video signal processing apparatus generates the motion information of the current picture by itself using the motion information of another picture.
- in the case of a multiview video signal, the motion information of the current picture tends to be similar to the motion information of pictures at other views. Therefore, for a multiview video signal, the motion skip mode according to the present invention may be advantageous. The motion skip mode according to the present invention will now be described in detail with reference to the drawings.
- FIG. 10 illustrates a concept of a video signal processing method according to another embodiment of the present invention.
- FIG. 11 illustrates a configuration of a video signal processing device according to another embodiment of the present invention.
- FIG. 12 illustrates the detailed configuration of the motion information acquisition unit of FIG. 11.
- FIGS. 13 and 14 illustrate the procedure of a video signal processing method according to another embodiment of the present invention.
- referring to FIG. 10, the motion information of the corresponding block is obtained after searching for the corresponding block of a neighboring view using the global motion vector of the current picture.
- the video signal processing apparatus 200 includes a motion skip determination unit 210, a motion information acquisition unit 220, and a motion compensation unit 230.
- the motion information acquisition unit 220 may include a motion information extraction unit 222 and a motion information generation unit 224.
- the motion skip determination unit 210 extracts motion skip information and the like from the received bitstream to determine whether the current block is in the motion skip mode, which will be described in detail later.
- the motion information extractor 222 of the motion information acquirer 220 extracts the motion information of the current block when the current block is not in the motion skip mode.
- on the other hand, when the current block is in the motion skip mode, the motion information generator 224 skips the extraction of the motion information, searches for the corresponding block, and acquires the motion information of the corresponding block.
- the motion compensator 230 performs motion compensation using the motion information acquired by the motion information acquirer 220, thereby generating a predicted value of the current block.
- the global motion vector acquirer 320 acquires the global motion vector of the current picture to which the current block belongs. Since its configuration may be the same as that of the global motion vector acquisition unit 120 described with reference to FIG. 8, a detailed description thereof will be omitted.
- the motion skip determination unit 210 determines whether the current block is in the motion skip mode.
- anchor picture flag information is extracted (S310).
- the anchor picture flag information is information indicating whether a current picture is an anchor picture and may be included in an NAL header extension region.
- if the current picture is an anchor picture, since the motion skip mode is not used, it is not necessary to extract the second motion skip flag information indicating whether the motion skip mode is used for each block in the macroblock layer.
- the first motion skip flag information is information indicating whether the motion skip mode is used for even one of the blocks belonging to a slice, and it can be obtained from the slice header. If the first motion skip flag information indicates that none of the blocks belonging to the slice uses the motion skip mode, the second motion skip flag information need not be extracted.
- accordingly, when the current picture is a non-anchor picture (YES in S330) and the first motion skip flag information indicates that the motion skip mode is used for one or more of the blocks belonging to the slice (YES in S340), the second motion skip flag information is extracted (S350).
- the extracted second motion skip flag information is used to determine whether the current block is in a motion skip mode (S360).
- if the current block is not in the motion skip mode, the motion information of the current block is extracted (S380); if the current block is in the motion skip mode, the process of generating motion information is performed (S370).
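The determination flow S310-S380 can be sketched as follows; the reader API and the two helper stubs, which stand for the extraction (S380) and generation (S370) paths, are assumptions.

```python
# Illustrative motion skip determination flow of FIG. 13 (S310-S380).

def extract_motion_info(r):
    """S380: read the block's motion information from the bitstream."""

def generate_motion_info():
    """S370: derive motion information from the reference view (see FIG. 16)."""

def decode_block_motion(r, anchor_pic_flag: int, first_motion_skip_flag: int):
    # S330: anchor pictures never use motion skip.
    if anchor_pic_flag:
        return extract_motion_info(r)
    # S340: if no block in the slice uses motion skip, skip the per-block flag.
    if not first_motion_skip_flag:
        return extract_motion_info(r)
    second_motion_skip_flag = r.u(1)        # S350: read from macroblock layer
    if second_motion_skip_flag:             # S360
        return generate_motion_info()       # S370
    return extract_motion_info(r)           # S380
```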
- FIGS. 14 and 20 illustrate examples of extracting first motion skip flag information according to embodiments of the present invention, and FIG. 15 is an example of the process of extracting second motion skip flag information according to an embodiment of the present invention.
- referring to FIG. 14, in order for the first motion skip flag information to be extracted, the current picture must first be a non-anchor picture.
- the inter-view dependency of non-anchor pictures indicates whether the current picture depends on another view, that is, whether pictures at another view must be decoded before the current picture is decoded because the current picture may refer to them.
- the inter-view dependency may be determined according to the SPS extension information, in particular the number of inter-view reference pictures and the view identification numbers of the inter-view reference pictures.
- the number of inter-view reference pictures indicates the number of reference views that reference blocks refer to in decoding the current block belonging to the current view, and the view identification number of an inter-view reference picture indicates the view identifier of the reference view. If a non-anchor picture has no inter-view dependency, the motion information of the neighboring view is not decoded before the current picture is decoded. Therefore, since the motion information of the neighboring view cannot be used to decode the current picture, it can be determined not to use the motion skip mode when the non-anchor picture has no inter-view dependency. Thus, as shown in FIG. 14, the first motion skip flag information is acquired only when the number of inter-view reference pictures is one or more (S410).
- referring to FIG. 20, the first motion skip flag information may be extracted under the conditions that the current picture is a non-anchor picture (S710) and that the second prediction mode identification information is not '1' (S720).
- referring to FIG. 15, if the first motion skip flag information indicates that there is at least one block belonging to the slice for which the motion skip mode is used, the second motion skip flag information is extracted (S420).
- hereinafter, the process of generating motion information (S370) will be described with reference to FIGS. 12 and 16.
- FIG. 16 is a detailed flowchart of the motion information generation process S370 of FIG. 13.
- referring to FIG. 16, when the current block is in the motion skip mode, the motion information generation process is performed as follows.
- the motion information skipping unit 224a of the motion information generating unit 224 omits extracting the motion information of the current block (S510).
- the corresponding block search unit 224b searches for the corresponding block and determines the neighboring view in which the corresponding block exists (S520).
- the neighboring view is a view different from the view to which the current block belongs, and is the view of a picture having motion information suitable for use as the motion information of the current block.
- the view identifier of the neighboring view of the current block may be explicitly transmitted through a specific variable included in the slice layer or the macroblock layer, or the identifier of the neighboring view of the current block may be estimated based on the view dependency of the current picture.
- the corresponding block search unit 224b obtains a global motion vector of the current block to search for a corresponding block (S530).
- the global motion vector of the current block may be obtained in the same manner as described above, but the present invention is not limited thereto.
- the corresponding block search unit 224b determines the corresponding block by using the determined neighbor view and the global motion vector (S540).
- the motion information acquisition unit 224c extracts the motion information of the corresponding block (S550).
- the motion information of the corresponding block may be information extracted by the motion information extractor 222. If the motion information of the corresponding block cannot be extracted, for example, when the corresponding block performs only intra prediction and does not perform inter prediction, the neighboring view determined in step S520 is changed and the motion information of the corresponding block at the changed view is extracted.
- finally, the motion information acquisition unit 224c obtains the motion information of the current block using the extracted motion information (S560).
- the motion information of the corresponding block may be used as the motion information of the current block as it is, but the present invention is not limited thereto.
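Finally, a minimal sketch of the generation process S510-S560, expanding the generate_motion_info step of the previous sketch; the get_block callback and the block attributes are assumptions.

```python
# Illustrative motion information generation of FIG. 16 (S510-S560).

def generate_motion_info(cur_mb_pos, candidate_views, gdv, get_block):
    # S510: extraction of the current block's motion information is omitted.
    for view in candidate_views:                   # S520: neighboring view
        # S530/S540: the corresponding block is the one the GDV points to.
        x = cur_mb_pos[0] + gdv[0]
        y = cur_mb_pos[1] + gdv[1]
        corr = get_block(view, x, y)
        if corr is not None and corr.motion_info is not None:
            # S550/S560: reuse the corresponding block's motion information
            # (motion vector, reference index, block type) for the current block.
            return corr.motion_info
        # Intra-only corresponding block: change the neighboring view, retry.
    return None  # no usable corresponding block; fall back to extraction
```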
- the present invention can be used for multiview video encoding and decoding.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The present invention relates to a video signal processing method comprising the steps of: obtaining view information identifying the view of a current block; obtaining view information of the reference view to which the current block refers; obtaining prediction mode identification information indicating whether the view information of the reference view is used for motion skip or for inter-view sample prediction, where motion skip indicates that the motion information of the current block is derived; obtaining second motion skip flag information from a macroblock layer, the second motion skip flag information indicating whether or not motion skip is performed for the current block; deriving the motion information of the current block from a reference block existing in the reference view if the view information of the reference view is used for motion skip according to the prediction mode identification information and motion skip is performed according to the second motion skip flag information; and reconstructing the current block based on the motion information of the current block. According to the present invention, the motion information of the current block can be derived from the motion information of corresponding blocks existing in other views even if the motion information of the current block is not transmitted when a video signal is encoded. Consequently, the compression rate can be improved.
Applications Claiming Priority (12)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US5265208P | 2008-05-13 | 2008-05-13 | |
US61/052,652 | 2008-05-13 | ||
US5787008P | 2008-06-01 | 2008-06-01 | |
US61/057,870 | 2008-06-01 | ||
US5855208P | 2008-06-03 | 2008-06-03 | |
US61/058,552 | 2008-06-03 | ||
US6012808P | 2008-06-10 | 2008-06-10 | |
US61/060,128 | 2008-06-10 | ||
US7913708P | 2008-07-09 | 2008-07-09 | |
US61/079,137 | 2008-07-09 | ||
US8067008P | 2008-07-14 | 2008-07-14 | |
US61/080,670 | 2008-07-14 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2009139569A2 true WO2009139569A2 (fr) | 2009-11-19 |
WO2009139569A3 WO2009139569A3 (fr) | 2010-03-04 |
Family
ID=41319156
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2009/002490 WO2009139569A2 (fr) | 2008-05-13 | 2009-05-12 | Procédé et appareil de décodage de signal vidéo |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2009139569A2 (fr) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7561620B2 (en) * | 2004-08-03 | 2009-07-14 | Microsoft Corporation | System and process for compressing and decompressing multiple, layered, video streams employing spatial and temporal encoding |
KR100667830B1 (ko) * | 2005-11-05 | 2007-01-11 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding multi-view video |
KR100781524B1 (ko) * | 2006-04-04 | 2007-12-03 | Samsung Electronics Co., Ltd. | Encoding/decoding method and apparatus using extended macroblock skip mode |
-
2009
- 2009-05-12 WO PCT/KR2009/002490 patent/WO2009139569A2/fr active Application Filing
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110430433A (zh) * | 2014-01-03 | 2019-11-08 | University-Industry Cooperation Group of Kyung Hee University | Method and device for inducing motion information between temporal points of a sub prediction unit |
CN110430433B (zh) * | 2014-01-03 | 2022-12-20 | University-Industry Cooperation Group of Kyung Hee University | Method and device for inducing motion information between temporal points of a sub prediction unit |
US11627331B2 (en) | 2014-01-03 | 2023-04-11 | University-Industry Cooperation Group Of Kyung Hee University | Method and device for inducing motion information between temporal points of sub prediction unit |
US11711536B2 (en) | 2014-01-03 | 2023-07-25 | University-Industry Cooperation Foundation Of Kyung Hee University | Method and device for inducing motion information between temporal points of sub prediction unit |
RU2828826C2 (ru) * | 2014-01-03 | 2024-10-21 | University-Industry Cooperation Group of Kyung Hee University | Image decoding method, image encoding method, and computer-readable information storage medium |
US12184882B2 (en) | 2014-01-03 | 2024-12-31 | University-Industry Cooperation Group Of Kyung Hee University | Method and device for inducing motion information between temporal points of sub prediction unit |
Also Published As
Publication number | Publication date |
---|---|
WO2009139569A3 (fr) | 2010-03-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
- JP5475464B2 (ja) | Video signal processing method and apparatus | |
- CN111971960B (zh) | Method for processing image based on inter prediction mode and apparatus therefor | |
- EP2424247B1 (fr) | Method and apparatus for processing a multi-view video signal | |
- JP5021739B2 (ja) | Signal processing method and apparatus | |
US8699562B2 (en) | Method and an apparatus for processing a video signal with blocks in direct or skip mode | |
- KR101158491B1 (ko) | Method and apparatus for encoding and decoding multi-view video | |
- KR101619451B1 (ko) | Method and apparatus for processing a multiview video signal | |
US9860545B2 (en) | Method and apparatus for multi-view video encoding and method and apparatus for multiview video decoding | |
- KR20150109282A (ko) | Method and apparatus for processing multiview video signals | |
- KR20090113281A (ko) | Video signal processing method and apparatus | |
- RU2609753C2 (ru) | Method and device for processing a video signal | |
- WO2008054176A1 (fr) | Method and device for predictive video coding and method and device for predictive video decoding | |
- CN105122812A (zh) | Advanced merge mode for three-dimensional (3D) video coding | |
- KR20150110357A (ko) | Method and apparatus for processing multiview video signals | |
- WO2020008328A1 (fr) | Shape-dependent merge mode and AMVP mode coding | |
- CN104956676A (zh) | Inter-layer syntax prediction control | |
- WO2012128241A1 (fr) | Image processing device, image processing method, and program | |
- KR20150037847A (ko) | Video signal processing method and apparatus | |
- KR20080007086A (ko) | Method and apparatus for decoding/encoding a video signal | |
- CN111343459B (zh) | Method for decoding/encoding a video signal and readable storage medium | |
- KR20080006494A (ko) | Method and apparatus for decoding a video signal | |
- KR101366289B1 (ko) | Method and apparatus for decoding/encoding a video signal | |
- KR20080060188A (ko) | Video signal decoding method and apparatus | |
- WO2009139569A2 (fr) | Method and apparatus for decoding a video signal | |
- KR20080029788A (ko) | Method and apparatus for decoding a video signal | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 09746748 Country of ref document: EP Kind code of ref document: A2 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 09746748 Country of ref document: EP Kind code of ref document: A2 |