CN113455000B

CN113455000B - Bidirectional prediction method and video decoding equipment

Info

Publication number: CN113455000B
Application number: CN201980092691.3A
Authority: CN
Inventors: 金在一; 李善暎; 罗太英; 孙世勋; 申在燮
Original assignee: SK Telecom Co Ltd
Current assignee: SK Telecom Co Ltd
Priority date: 2018-12-27
Filing date: 2019-12-26
Publication date: 2024-04-02
Anticipated expiration: 2039-12-26
Also published as: CN113455000A; KR20200081201A

Abstract

A bidirectional inter prediction method and a video decoding device are disclosed. According to an embodiment of the present invention, there is provided a bi-prediction method for inter-predicting a current block using any one of a plurality of bi-prediction modes, the method including: decoding, from a bitstream, mode information indicating whether a first mode included in the plurality of bi-prediction modes is applied to the current block; when the mode information indicates that the first mode is applied to the current block, decoding, from the bitstream, first motion information that includes differential motion vector information and predicted motion vector information, and second motion information that excludes at least a portion of the differential motion vector information and the predicted motion vector information; deriving a first motion vector based on the first motion information, and deriving a second motion vector based on the second motion information and at least a portion of the first motion information; and predicting the current block using a reference block indicated by the first motion vector in a first reference picture and a reference block indicated by the second motion vector in a second reference picture.

Description

Bidirectional prediction method and video decoding equipment

Technical field

The present invention relates to video encoding and decoding and, more specifically, to a bidirectional prediction method and a video decoding device that improve encoding and decoding efficiency by expressing motion information efficiently.

Background art

Since video data is larger in volume than voice data or still-image data, storing or transmitting it without compression requires a large amount of hardware resources, including memory.

Therefore, video data is usually compressed by an encoder before it is stored or transmitted. A decoder then receives the compressed video data, decompresses it, and reproduces it. Compression technologies for such video include H.264/AVC and High Efficiency Video Coding (HEVC), which improves coding efficiency by about 40% over H.264/AVC.

However, the size, resolution, and frame rate of video are gradually increasing, and the amount of data to be encoded is increasing accordingly. A new compression technology with better coding efficiency and higher image quality than existing compression technologies is therefore needed.

Summary of the invention

Technical problem

An object of the present invention is to provide improved video encoding and decoding techniques and, more specifically, techniques that improve encoding and decoding efficiency by using motion information in a specific direction to infer motion information in the other direction.

Technical solution

According to at least one aspect, the present disclosure provides a method of inter-predicting a current block using any one of a plurality of bidirectional prediction modes. The method includes decoding mode information from a bitstream, the mode information indicating whether a first mode included in the plurality of bidirectional prediction modes is applied to the current block. When the mode information indicates that the first mode is applied to the current block, the method further includes decoding, from the bitstream, first motion information that includes differential motion vector information and predicted motion vector information for a first motion vector, and second motion information that does not include at least a portion of the differential motion vector information and predicted motion vector information for a second motion vector; and deriving the first motion vector based on the first motion information, and deriving the second motion vector based on at least a portion of the first motion information and on the second motion information. The method also includes predicting the current block using a reference block indicated by the first motion vector in a first reference picture and a reference block indicated by the second motion vector in a second reference picture.

According to another aspect, the present disclosure provides a video decoding device. The device includes a decoder configured to decode mode information from a bitstream, the mode information indicating whether a first mode included in a plurality of bidirectional prediction modes is applied to a current block. When the mode information indicates that the first mode is applied to the current block, the decoder decodes, from the bitstream, first motion information that includes differential motion vector information and predicted motion vector information for a first motion vector, and second motion information that does not include at least a portion of the differential motion vector information and predicted motion vector information for a second motion vector. The device also includes a predictor configured to derive the first motion vector based on the first motion information and to derive the second motion vector based on at least a portion of the first motion information and on the second motion information. The predictor is further configured to predict the current block using a reference block indicated by the first motion vector in a first reference picture and a reference block indicated by the second motion vector in a second reference picture.
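As a non-limiting illustration, the derivation of the two motion vectors may be sketched as follows. The sketch assumes one possible instance of the first mode in which the second motion information carries a predicted motion vector but no differential motion vector, and in which the differential motion vector of the first direction is mirrored for the second direction. The mirroring rule, the `MotionVector` type, and the function name `derive_bi_motion` are assumptions made for illustration only and are not stated in this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionVector:
    """Hypothetical 2-D motion vector in integer sample units."""
    x: int
    y: int

    def __add__(self, other: "MotionVector") -> "MotionVector":
        return MotionVector(self.x + other.x, self.y + other.y)

    def __neg__(self) -> "MotionVector":
        return MotionVector(-self.x, -self.y)


def derive_bi_motion(mvp0: MotionVector, mvd0: MotionVector,
                     mvp1: MotionVector) -> tuple[MotionVector, MotionVector]:
    """Derive both motion vectors from one signaled differential vector.

    mvp0, mvd0: predictor and difference of the first motion information.
    mvp1: predictor of the second motion information (no difference signaled).
    Assumption: the second difference is the mirror image of the first.
    """
    mv0 = mvp0 + mvd0        # first motion vector: predictor + signaled difference
    mv1 = mvp1 + (-mvd0)     # second motion vector: predictor + mirrored difference
    return mv0, mv1


mv0, mv1 = derive_bi_motion(MotionVector(4, -2), MotionVector(1, 3), MotionVector(-4, 2))
print(mv0, mv1)
```

Under this assumption only one differential motion vector needs to be transmitted for the two prediction directions.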

Technical effects

As described above, according to embodiments of the present invention, the bit efficiency of motion representation can be improved by using motion in a specific direction to infer motion in the other direction.

Brief description of the drawings

FIG. 1 is an exemplary block diagram of a video encoding device capable of implementing the techniques of the present disclosure.

FIG. 2 exemplarily shows a block partitioning structure using the QTBTTT structure.

FIG. 3 exemplarily shows a plurality of intra prediction modes.

FIG. 4 is an exemplary block diagram of a video decoding device capable of implementing the techniques of the present disclosure.

FIG. 5 is a diagram for describing bidirectional prediction according to an embodiment of the present invention.

FIG. 6 is a diagram for describing derivation of motion using a symmetric relationship between differential motion vectors according to an embodiment of the present invention.

FIGS. 7 and 8 are diagrams for describing derivation of motion using a linear relationship according to an embodiment of the present invention.

FIGS. 9 to 18 are diagrams for describing derivation of motion according to various embodiments of the present invention.

FIGS. 19 and 20 are flowcharts for describing derivation of motion using a reference picture determined at a high level according to an embodiment of the present invention.

Detailed description

Hereinafter, some embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. It should be noted that, when reference numerals are added to constituent elements in the drawings, like reference numerals designate like elements even when the elements are shown in different drawings. Furthermore, in the following description of the present disclosure, detailed descriptions of known functions and configurations incorporated herein will be omitted to avoid obscuring the subject matter of the present disclosure.

FIG. 1 is an exemplary block diagram of a video encoding device capable of implementing the techniques of the present disclosure. Hereinafter, the video encoding device and its elements will be described with reference to FIG. 1.

The video encoding device includes a block partitioner 110, a predictor 120, a subtractor 130, a transformer 140, a quantizer 145, an encoder 150, an inverse quantizer 160, an inverse transformer 165, an adder 170, a filter unit 180, and a memory 190.

Each element of the video encoding device may be implemented in hardware, in software, or in a combination of hardware and software. The functions of the respective elements may be implemented in software, and a microprocessor may be configured to execute the software functions corresponding to the respective elements.

A video is composed of a plurality of pictures. Each picture is split into a plurality of regions, and encoding is performed on each region. For example, a picture is split into one or more tiles and/or slices. Here, one or more tiles may be defined as a tile group. Each tile or slice is split into one or more coding tree units (CTUs). Each CTU is split into one or more coding units (CUs) according to a tree structure. Information applied to each CU is encoded as syntax of the CU, and information applied in common to the CUs contained in one CTU is encoded as syntax of the CTU. In addition, information applied in common to all blocks in one tile is encoded as syntax of the tile or as syntax of a tile group, which is a collection of tiles, and information applied to all blocks constituting one picture is encoded in a picture parameter set (PPS) or a picture header. Furthermore, information referenced in common by a plurality of pictures is encoded in a sequence parameter set (SPS), and information referenced in common by one or more SPSs is encoded in a video parameter set (VPS).

The block partitioner 110 determines the size of the coding tree unit (CTU). Information about the size of the CTU (CTU size) is encoded as syntax of the SPS or PPS and is transmitted to the video decoding device.

The block partitioner 110 splits each picture constituting the video into a plurality of CTUs having a predetermined size, and then recursively splits the CTUs using a tree structure. In the tree structure, a leaf node serves as a coding unit (CU), which is the basic unit of coding.
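As a non-limiting illustration, the first step of splitting a picture into CTUs of a predetermined size may be sketched as follows. The function name `ctu_grid` and the cropping of CTUs that extend past the picture boundary are assumptions made for illustration; the text above does not specify how partial CTUs at the boundary are handled.

```python
def ctu_grid(pic_width: int, pic_height: int, ctu_size: int) -> list[tuple[int, int, int, int]]:
    """Return (x, y, width, height) for every CTU in raster-scan order.

    Assumption: CTUs on the right and bottom edges are cropped to the picture.
    """
    cols = -(-pic_width // ctu_size)    # ceiling division without floats
    rows = -(-pic_height // ctu_size)
    grid = []
    for row in range(rows):
        for col in range(cols):
            x, y = col * ctu_size, row * ctu_size
            grid.append((x, y,
                         min(ctu_size, pic_width - x),
                         min(ctu_size, pic_height - y)))
    return grid


grid = ctu_grid(1920, 1080, 128)
print(len(grid), grid[-1])
```

For a 1920x1080 picture and a CTU size of 128, this yields 15 columns and 9 rows of CTUs, the last row being only 56 samples high.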

The tree structure may be a quadtree (QT), in which a node (or parent node) is split into four child nodes of the same size; a binary tree (BT), in which a node is split into two child nodes; a ternary tree (TT), in which a node is split into three child nodes at a ratio of 1:2:1; or a structure formed by combining two or more of the QT, BT, and TT structures. For example, a QTBT (quadtree plus binary tree) structure or a QTBTTT (quadtree plus binary tree and ternary tree) structure may be used. Here, BT and TT may be collectively referred to as a multi-type tree (MTT).
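As a non-limiting illustration, the child block sizes produced by the three split types may be sketched as follows. The function name `split_block`, the string labels, and the convention that a vertical split divides the width are assumptions made for illustration.

```python
def split_block(width: int, height: int, split_type: str,
                vertical: bool = True) -> list[tuple[int, int]]:
    """Return the (width, height) of each child block for one split.

    split_type: "QT", "BT", or "TT" (hypothetical labels).
    vertical:   for BT/TT, True divides the width, False divides the height.
    """
    if split_type == "QT":
        # Four children of equal size.
        return [(width // 2, height // 2)] * 4
    if split_type == "BT":
        # Two children of equal size.
        return [(width // 2, height)] * 2 if vertical else [(width, height // 2)] * 2
    if split_type == "TT":
        # Three children at a 1:2:1 ratio along the split axis.
        if vertical:
            q = width // 4
            return [(q, height), (2 * q, height), (q, height)]
        q = height // 4
        return [(width, q), (width, 2 * q), (width, q)]
    raise ValueError(f"unknown split type: {split_type}")


print(split_block(32, 32, "TT", vertical=True))
```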

FIG. 2 shows a QTBTTT split tree structure. As shown in FIG. 2, a CTU may initially be split in the QT structure. The QT splitting may be repeated until the size of the split block reaches the minimum block size (MinQTSize) of a leaf node allowed in the QT. A first flag (QT_split_flag) indicating whether each node of the QT structure is split into four nodes of a lower layer is encoded by the encoder 150 and signaled to the video decoding device. When a leaf node of the QT is not larger than the maximum block size (MaxBTSize) of a root node allowed in the BT, it may be further split in one or more of the BT structure and the TT structure. In the BT structure and/or the TT structure, there may be a plurality of splitting directions. For example, there may be two directions, namely horizontal splitting and vertical splitting of the block of the node. As shown in FIG. 2, when MTT splitting starts, a second flag (mtt_split_flag) indicating whether a node is split, a flag indicating the splitting direction (vertical or horizontal), and/or a flag indicating the splitting type (binary or ternary) are encoded by the encoder 150 and signaled to the video decoding device.

As another example of the tree structure, when a block is split using the QTBTTT structure, information about a CU split flag (split_cu_flag) indicating whether the block has been split and a QT split flag (split_qt_flag) indicating whether the splitting type is QT splitting is encoded by the encoder 150 and signaled to the video decoding device. When the value of split_cu_flag indicates that the block has not been split, the block of the node becomes a leaf node in the split tree structure and is used as a coding unit (CU), which is the basic unit of coding. When the value of split_cu_flag indicates that the block has been split, whether the splitting type is QT or MTT is distinguished by the value of split_qt_flag. When the splitting type is QT, there is no additional information. When the splitting type is MTT, a flag (mtt_split_cu_vertical_flag) indicating the MTT splitting direction (vertical or horizontal) and/or a flag (mtt_split_cu_binary_flag) indicating the MTT splitting type (binary or ternary) is encoded by the encoder 150 and signaled to the video decoding device.

As another example of the tree structure, when QTBT is used, there may be two splitting types, which split the block of a node horizontally (i.e., symmetric horizontal splitting) or vertically (i.e., symmetric vertical splitting) into two blocks of the same size. A split flag (split_flag) indicating whether each node of the BT structure is split into blocks of a lower layer and split type information indicating the splitting type are encoded by the encoder 150 and transmitted to the video decoding device. There may be an additional type that splits the block of a node into two asymmetric blocks. The asymmetric splitting types may include a type that splits a block into two rectangular blocks at a size ratio of 1:3 and a type that splits the block of a node diagonally.

A CU may have various sizes depending on the QTBT or QTBTTT splitting of the CTU. Hereinafter, the block corresponding to the CU to be encoded or decoded (i.e., a leaf node of the QTBTTT) is referred to as the "current block".

The predictor 120 predicts the current block to generate a prediction block. The predictor 120 includes an intra predictor 122 and an inter predictor 124.

In general, each current block in a picture may be predictively coded. Prediction of the current block may be performed using an intra prediction technique (which uses data from the picture containing the current block) or an inter prediction technique (which uses data from a picture coded before the picture containing the current block). Inter prediction includes both unidirectional prediction and bidirectional prediction.

The intra predictor 122 predicts the pixels in the current block using pixels (reference pixels) located around the current block in the current picture that includes the current block. There are a plurality of intra prediction modes according to the prediction direction. For example, as shown in FIG. 3, the plurality of intra prediction modes may include two non-directional modes, namely a planar mode and a DC mode, and 65 directional modes. The neighboring pixels and the formula to be used are defined differently for each prediction mode.

The intra predictor 122 may determine the intra prediction mode to be used in encoding the current block. In some examples, the intra predictor 122 may encode the current block using several intra prediction modes and select an appropriate intra prediction mode from among the tested modes. For example, the intra predictor 122 may calculate rate-distortion values using rate-distortion analysis of the several tested intra prediction modes and may select the intra prediction mode having the best rate-distortion characteristics among the tested modes.
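As a non-limiting illustration, the selection of a mode by rate-distortion analysis may be sketched as follows. The sketch assumes the commonly used Lagrangian cost J = D + λ·R, where D is the distortion, R is the number of bits, and λ is a weighting factor. The text above does not name a particular cost function, so this cost, the function name `select_mode`, and the numeric values are assumptions made for illustration.

```python
def select_mode(candidates: list[dict], lam: float) -> str:
    """Return the mode with the lowest assumed cost J = distortion + lam * bits."""
    best = min(candidates, key=lambda c: c["distortion"] + lam * c["bits"])
    return best["mode"]


# Hypothetical measurements for three tested intra prediction modes.
tested = [
    {"mode": "planar",     "distortion": 1200, "bits": 10},
    {"mode": "dc",         "distortion": 1000, "bits": 12},
    {"mode": "angular_34", "distortion": 700,  "bits": 40},
]

print(select_mode(tested, lam=10))   # costs: 1300, 1120, 1100
print(select_mode(tested, lam=20))   # costs: 1400, 1240, 1500
```

With these illustrative numbers, a larger λ penalizes the bit cost more heavily and shifts the choice from the low-distortion angular mode to the cheaper DC mode.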

The intra predictor 122 selects one intra prediction mode from among the plurality of intra prediction modes and predicts the current block using the neighboring pixels (reference pixels) and the formula determined according to the selected intra prediction mode. Information about the selected intra prediction mode is encoded by the encoder 150 and transmitted to the video decoding device.

The inter predictor 124 generates a prediction block for the current block through motion compensation. The inter predictor 124 searches a reference picture, which was encoded and decoded earlier than the current picture, for the block most similar to the current block, and generates the prediction block for the current block based on the block found. The inter predictor 124 then generates a motion vector corresponding to the displacement between the current block in the current picture and the prediction block in the reference picture. In general, motion estimation is performed on the luma component, and the motion vector calculated based on the luma component is used for both the luma component and the chroma component. Motion information, including information about the reference picture used to predict the current block and information about the motion vector, is encoded by the encoder 150 and transmitted to the video decoding device.
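As a non-limiting illustration, the search for the most similar block in a reference picture may be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). The text above does not specify a similarity measure or a search strategy, so the SAD criterion, the full search, and the names `sad` and `full_search` are assumptions made for illustration.

```python
def sad(cur: list[list[int]], ref: list[list[int]], rx: int, ry: int) -> int:
    """Sum of absolute differences between cur and the ref window at (rx, ry)."""
    return sum(abs(cur[j][i] - ref[ry + j][rx + i])
               for j in range(len(cur)) for i in range(len(cur[0])))


def full_search(cur, ref, bx: int, by: int, search_range: int):
    """Return ((dx, dy), cost) of the best match within +/- search_range.

    (bx, by) is the position of the current block in the current picture;
    the returned displacement plays the role of the motion vector.
    """
    bh, bw = len(cur), len(cur[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + bw > len(ref[0]) or ry + bh > len(ref):
                continue
            cost = sad(cur, ref, rx, ry)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


# 8x8 reference picture that is zero except for a 2x2 pattern at x=5, y=3.
ref = [[0] * 8 for _ in range(8)]
ref[3][5], ref[3][6], ref[4][5], ref[4][6] = 9, 8, 7, 6
cur = [[9, 8], [7, 6]]           # current block, located at (4, 4)

mv, cost = full_search(cur, ref, bx=4, by=4, search_range=2)
print(mv, cost)
```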

The subtractor 130 generates a residual block by subtracting the prediction block generated by the intra predictor 122 or the inter predictor 124 from the current block.

The transformer 140 transforms the residual signal in the residual block, which has pixel values in the spatial domain, into transform coefficients in the frequency domain. The transformer 140 may transform the residual signal in the residual block using the entire size of the current block as the transform unit. Alternatively, the transformer 140 may split the residual block into a transform-region sub-block and a non-transform-region sub-block and transform the residual signal using only the transform-region sub-block as the transform unit. Here, the transform-region sub-block may be one of two rectangular blocks having a size ratio of 1:1 along the horizontal axis (or vertical axis). In this case, a flag (cu_sbt_flag) indicating that only a sub-block has been transformed, direction (vertical/horizontal) information (cu_sbt_horizontal_flag), and/or position information (cu_sbt_pos_flag) are encoded by the encoder 150 and signaled to the video decoding device. Alternatively, the transform-region sub-block may have a size ratio of 1:3 along the horizontal axis (or vertical axis). In this case, a flag (cu_sbt_quad_flag) for distinguishing the splitting is additionally encoded by the encoder 150 and signaled to the video decoding device.

The quantizer 145 quantizes the transform coefficients output from the transformer 140 and outputs the quantized transform coefficients to the encoder 150.

The encoder 150 generates a bitstream by encoding the quantized transform coefficients using a coding method such as context-based adaptive binary arithmetic coding (CABAC). The encoder 150 encodes information related to block splitting, such as the CTU size, the CU split flag, the QT split flag, the MTT splitting direction, and the MTT splitting type, so that the video decoding device can split blocks in the same manner as the video encoding device.

In addition, the encoder 150 encodes information about a prediction type indicating whether the current block is encoded by intra prediction or by inter prediction and, according to the prediction type, encodes intra prediction information (i.e., information about the intra prediction mode) or inter prediction information (i.e., information about the reference picture and the motion vector).

The inverse quantizer 160 inversely quantizes the quantized transform coefficients output from the quantizer 145 to generate transform coefficients. The inverse transformer 165 transforms the transform coefficients output from the inverse quantizer 160 from the frequency domain to the spatial domain to reconstruct the residual block.

The adder 170 adds the reconstructed residual block to the prediction block generated by the predictor 120 to reconstruct the current block. The pixels in the reconstructed current block are used as reference pixels for intra prediction of the next block.
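As a non-limiting illustration, the path from the subtractor through quantization and back to the adder may be sketched as follows. To keep the sketch short, the transform and inverse transform are omitted and a uniform scalar quantizer with a fixed step is applied directly to the residual; both simplifications, as well as the function names and the step value, are assumptions made for illustration.

```python
def subtract(cur, pred):
    """Residual block: current block minus prediction block."""
    return [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(cur, pred)]


def quantize(block, step):
    """Uniform scalar quantization, rounding half away from zero (assumed)."""
    def q(v):
        return int(v / step + 0.5) if v >= 0 else -int(-v / step + 0.5)
    return [[q(v) for v in row] for row in block]


def dequantize(levels, step):
    """Inverse quantization: scale the levels back by the step size."""
    return [[level * step for level in row] for row in levels]


def add(residual, pred):
    """Reconstructed block: reconstructed residual plus prediction block."""
    return [[r + p for r, p in zip(rr, pr)] for rr, pr in zip(residual, pred)]


cur = [[53, 57], [61, 59]]       # hypothetical current block
pred = [[50, 50], [60, 60]]      # hypothetical prediction block
step = 4                         # hypothetical quantization step

residual = subtract(cur, pred)               # [[3, 7], [1, -1]]
levels = quantize(residual, step)            # [[1, 2], [0, 0]]
recon = add(dequantize(levels, step), pred)  # [[54, 58], [60, 60]]
print(recon)
```

The reconstructed block differs slightly from the original because quantization is lossy; the encoder reconstructs the block in the same way as the decoder so that both use identical reference pixels.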

The filter unit 180 filters the reconstructed pixels to reduce blocking artifacts, ringing artifacts, and blurring artifacts caused by block-based prediction and transform/quantization. The filter unit 180 may include a deblocking filter 182 and a sample adaptive offset (SAO) filter 184.

The deblocking filter 182 filters the boundaries between reconstructed blocks to remove blocking artifacts caused by block-by-block encoding/decoding, and the SAO filter 184 additionally filters the deblocking-filtered video. The SAO filter 184 is a filter used to compensate for the difference between reconstructed pixels and original pixels caused by lossy coding.

The reconstructed blocks filtered by the deblocking filter 182 and the SAO filter 184 are stored in the memory 190. Once all blocks in one picture have been reconstructed, the reconstructed picture is used as a reference picture for inter prediction of blocks in a picture to be encoded subsequently.

FIG. 4 is an exemplary functional block diagram of a video decoding device capable of implementing the techniques of the present disclosure. Hereinafter, the video decoding device and its elements will be described with reference to FIG. 4.

The video decoding device may include a decoder 410, an inverse quantizer 420, an inverse transformer 430, a predictor 440, an adder 450, a filter unit 460, and a memory 470.

Similarly to the video encoding device of FIG. 1, each element of the video decoding device may be implemented in hardware, in software, or in a combination of hardware and software. In addition, the function of each element may be implemented in software, and a microprocessor may be configured to execute the software function corresponding to each element.

The decoder 410 determines the current block to be decoded by decoding the bitstream received from the video encoding device and extracting information related to block splitting, and extracts the prediction information and the information about the residual signal that are needed to reconstruct the current block.

The decoder 410 extracts information about the CTU size from the sequence parameter set (SPS) or the picture parameter set (PPS), determines the size of the CTU, and splits the picture into CTUs of the determined size. The decoder 410 then treats the CTU as the uppermost layer, that is, the root node of the tree structure, and extracts split information about the CTU in order to split the CTU using the tree structure.

For example, when a CTU is split using the QTBTTT structure, a first flag (QT_split_flag) related to QT splitting is extracted first, and each node is split into four nodes of a lower layer. Then, for a node corresponding to a leaf node of the QT, a second flag (MTT_split_flag) related to MTT splitting and information about the splitting direction (vertical/horizontal) and/or the splitting type (binary/ternary) are extracted, and the leaf node is split in the MTT structure. In this way, each node below the leaf node of the QT is recursively split in the BT or TT structure.

As another example, when a CTU is split using the QTBTTT structure, a CU split flag (split_cu_flag) indicating whether the CU is split is extracted first. If the corresponding block is split, a QT split flag (split_qt_flag) is extracted. When the splitting type is MTT rather than QT, a flag (mtt_split_cu_vertical_flag) indicating the MTT splitting direction (vertical or horizontal) and/or a flag (mtt_split_cu_binary_flag) indicating the MTT splitting type (binary or ternary) is additionally extracted. In the splitting process, each node may undergo zero or more recursive QT splits followed by zero or more recursive MTT splits. For example, a CTU may be MTT-split immediately, or may only be QT-split multiple times.
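As a non-limiting illustration, the order in which the split flags are extracted for one node may be sketched as follows. The sketch assumes that a flag value of 1 means "split", "QT", "vertical", and "binary" respectively, that the direction flag precedes the type flag, and that both MTT flags are always present. These polarities and orderings, the `make_reader` helper, and the returned labels are assumptions made for illustration; an actual codec may additionally infer some flags from block-size constraints instead of reading them.

```python
def make_reader(bits):
    """Hypothetical flag reader that returns successive bits from a list."""
    it = iter(bits)

    def read_flag(name: str) -> bool:
        # 'name' only documents which syntax element is being read.
        return next(it) == 1

    return read_flag


def parse_split(read_flag) -> str:
    """Decide how one node is split from the flags described above."""
    if not read_flag("split_cu_flag"):
        return "NO_SPLIT"                      # node is a leaf, i.e. a CU
    if read_flag("split_qt_flag"):
        return "QT"                            # no further information needed
    vertical = read_flag("mtt_split_cu_vertical_flag")
    binary = read_flag("mtt_split_cu_binary_flag")
    return ("BT" if binary else "TT") + ("_VER" if vertical else "_HOR")


print(parse_split(make_reader([0])))
print(parse_split(make_reader([1, 1])))
print(parse_split(make_reader([1, 0, 1, 0])))
print(parse_split(make_reader([1, 0, 0, 1])))
```

Applying `parse_split` recursively to each resulting child node, until every node returns `NO_SPLIT`, would yield the full set of CUs of the CTU.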

As another example, when a CTU is split using the QTBT structure, a first flag (QT_split_flag) related to QT splitting is extracted, and each node is split into four nodes of a lower layer. For a node corresponding to a leaf node of the QT, a split flag (split_flag) indicating whether the node is further split in the BT structure and splitting direction information are extracted.

Once the current block to be decoded is determined through the tree-structure splitting, the decoder 410 extracts information about a prediction type indicating whether the current block is intra-predicted or inter-predicted. When the prediction type information indicates intra prediction, the decoder 410 extracts the syntax elements for the intra prediction information (intra prediction mode) of the current block. When the prediction type information indicates inter prediction, the decoder 410 extracts the syntax elements for the inter prediction information, that is, information indicating the motion vector and the reference picture referenced by the motion vector.

The decoder 410 extracts information about the quantized transform coefficients of the current block as the information about the residual signal.

逆量化器420对量化后的变换系数进行逆量化，并将逆量化后的变换系数从频域逆变换到空间域，重构残差信号，以生成当前块的残差块。The inverse quantizer 420 inversely quantizes the quantized transform coefficients, inversely transforms the inversely quantized transform coefficients from the frequency domain to the spatial domain, and reconstructs the residual signal to generate a residual block of the current block.

In addition, when the inverse transformer 430 inversely transforms only a partial region (sub-block) of the transform block, a flag (cu_sbt_flag) indicating that only a sub-block of the transform block has been transformed is extracted, together with direction information (vertical/horizontal) of the sub-block (cu_sbt_horizontal_flag) and/or position information of the sub-block (cu_sbt_pos_flag). The residual signal is then reconstructed by inversely transforming the transform coefficients of the sub-block from the frequency domain to the spatial domain. For the region that is not inversely transformed, the residual signal is filled with "0". In this way, the final residual block of the current block is created.
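The zero-filling step can be illustrated with a short sketch. It assumes the sub-block has already been inversely transformed and that its top-left position inside the block is known; how that position follows from cu_sbt_horizontal_flag and cu_sbt_pos_flag is not specified here, so `sub_x` and `sub_y` are hypothetical parameters.

```python
def assemble_residual(width, height, sub_x, sub_y, sub_block):
    """Place an inverse-transformed sub-block into a zero-filled residual block.

    width, height : size of the full residual block of the current block
    sub_x, sub_y  : assumed top-left position of the sub-block (hypothetical)
    sub_block     : 2-D list of residual samples for the transformed region
    """
    # Region that was not inversely transformed stays at 0.
    residual = [[0] * width for _ in range(height)]
    for dy, row in enumerate(sub_block):
        for dx, value in enumerate(row):
            residual[sub_y + dy][sub_x + dx] = value
    return residual
```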

The predictor 440 may include an intra predictor 442 and an inter predictor 444. The intra predictor 442 is activated when the prediction type of the current block is intra prediction, and the inter predictor 444 is activated when the prediction type of the current block is inter prediction.

The intra predictor 442 determines the intra prediction mode of the current block among a plurality of intra prediction modes based on the syntax elements for the intra prediction mode extracted by the decoder 410, and predicts the current block from the reference pixels around the current block according to that intra prediction mode.

The inter predictor 444 determines the motion vector of the current block and the reference picture to which the motion vector refers based on the syntax elements for the inter prediction information extracted by the decoder 410, and predicts the current block based on the motion vector and the reference picture.

The adder 450 reconstructs the current block by adding the residual block output from the inverse transformer and the prediction block output from the inter predictor or the intra predictor. The pixels in the reconstructed current block are used as reference pixels for intra prediction of blocks to be decoded later.

The filter unit 460 may include a deblocking filter 462 and an SAO filter 464. The deblocking filter 462 performs deblocking filtering on the boundaries between reconstructed blocks to remove blocking artifacts caused by block-by-block decoding. The SAO filter 464 performs additional filtering on the reconstructed blocks after deblocking filtering to compensate for differences between the reconstructed pixels and the original pixels caused by lossy encoding. The reconstructed blocks filtered by the deblocking filter 462 and the SAO filter 464 are stored in the memory 470. When all blocks in a picture have been reconstructed, the reconstructed picture is used as a reference picture for inter prediction of blocks in pictures to be encoded later.

The inter-picture prediction encoding/decoding methods (inter prediction methods) of the HEVC standard can be classified into a skip mode, a merge mode, and an adaptive (or advanced) motion vector predictor (AMVP) mode.

In the skip mode, an index value indicating one of the motion information candidates of neighboring blocks is signaled. In the merge mode, an index value indicating one of the motion information candidates of neighboring blocks is signaled together with information obtained by encoding the residual after prediction. In the AMVP mode, the motion information of the current block is signaled together with information obtained by encoding the residual after prediction. The motion information signaled in the AMVP mode includes the motion information of a neighboring block (motion vector predictor (mvp)) and the difference between that motion information (mvp) and the motion information (mv) of the current block (motion vector difference (mvd)).

To describe the motion information signaled in the AMVP mode in more detail, the motion information may include reference picture information (reference picture index), predicted motion vector (mvp) information, and differential motion vector (mvd) information. In the case of bidirectional prediction, this information is signaled separately for each direction. Table 1 below shows the syntax elements for the reference picture information, mvp information, and mvd information signaled for each direction.

[Table 1]

In Table 1 above, inter_pred_idc is a syntax element (prediction direction information) indicating the prediction direction, and may indicate any one of unidirectional L0 prediction (uni-L0), unidirectional L1 prediction (uni-L1), and bidirectional prediction (bi-prediction). According to the present invention, since the motion information in one direction is derived from the motion information in the other direction, inter_pred_idc indicates bidirectional prediction. ref_idx_l0 is a syntax element (reference picture information) indicating the reference picture in the L0 direction; it specifies, among the reference pictures included in reference picture list 0, the reference picture used to predict the current block. ref_idx_l1 is a syntax element (reference picture information) indicating the reference picture in the L1 direction; it specifies, among the reference pictures included in reference picture list 1, the reference picture used to predict the current block. mvp_l0_flag is a syntax element (mvp information) indicating the mvp for the L0 direction; it specifies the mvp to be used for prediction of the current block in the L0 direction. mvp_l1_flag is a syntax element (mvp information) indicating the mvp for the L1 direction; it specifies the mvp to be used for prediction of the current block in the L1 direction.

The syntax elements that constitute the mvd information are shown in Table 2 below.

[Table 2]

In Table 2 above, abs_mvd_greater0_flag is a syntax element indicating whether the absolute value (magnitude) of the mvd exceeds 0, and abs_mvd_greater1_flag is a syntax element indicating whether the absolute value of the mvd exceeds 1. In addition, abs_mvd_minus2 is a syntax element indicating the value obtained by subtracting 2 from the absolute value of the mvd, and mvd_sign_flag is a syntax element indicating the sign of the mvd.

As shown in Table 2, the mvd is represented, for each of its x component and y component, by syntax elements indicating the absolute value (abs_mvd_greater0_flag, abs_mvd_greater1_flag, abs_mvd_minus2) and a syntax element indicating the sign (mvd_sign_flag).
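As a rough illustration of how one mvd component could be rebuilt from these four syntax elements, consider the sketch below. The function name is hypothetical, and the sign polarity (mvd_sign_flag equal to 1 meaning a negative value) follows the HEVC convention rather than anything stated in this description, so it should be read as an assumption.

```python
def mvd_component(abs_mvd_greater0_flag, abs_mvd_greater1_flag=0,
                  abs_mvd_minus2=0, mvd_sign_flag=0):
    """Rebuild one mvd component (x or y) from its syntax elements."""
    if not abs_mvd_greater0_flag:
        return 0                      # |mvd| does not exceed 0
    if abs_mvd_greater1_flag:
        magnitude = abs_mvd_minus2 + 2  # |mvd| exceeds 1
    else:
        magnitude = 1                 # |mvd| exceeds 0 but not 1
    # Assumed polarity: sign flag 1 means negative.
    return -magnitude if mvd_sign_flag else magnitude


# Example: an mvd of (-5, 1) expressed through the syntax elements.
mvd = (mvd_component(1, 1, 3, 1), mvd_component(1, 0))
```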

Based on Tables 1 and 2, Table 3 below summarizes the information signaled from the video encoding device to the video decoding device for bidirectional prediction in the conventional AMVP mode.

[Table 3]

As shown in Table 3 above, in the conventional AMVP mode, the reference picture information, mvp information, mvd information, and so on are signaled separately for each direction in order to perform bidirectional prediction on the current block, which can be inefficient in terms of bits.

To improve the bit efficiency of bidirectional prediction, the present invention infers the reference pictures used to predict the current block, or uses the correlation between the motion information of the two directions to infer the motion information in the other direction from the motion information in a specific direction.

"Specific direction" denotes the direction whose motion information is inferred or derived from information signaled by the video encoding device, and "other direction" denotes the direction whose motion information is inferred or derived from the motion information in the specific direction. In inferring the motion information in the other direction, the motion information in the specific direction and/or at least some of the information signaled by the video encoding device may be used. This specification describes the specific direction as corresponding to direction L0 and the other direction as corresponding to direction L1, but the specific direction may be either of directions L0 and L1, in which case the other direction is the remaining one of the two. Hereinafter, the specific direction is referred to as the first direction, and the other direction as the second direction. In addition, the motion vector in the first direction is referred to as the first motion vector, and the motion vector in the second direction as the second motion vector.

The correlation between pieces of motion information may include a symmetric relationship, a linear relationship, a proportional relationship, a picture order count (POC) difference relationship between the reference pictures relative to the current picture, and the like. This correlation may be established for the motion information as a whole, or separately for each element included in the motion information (at least one of the reference picture information, the mvp information, and the mvd information). For example, a symmetric relationship may be established between the mvd information of the two directions, and a linear relationship may be established between the mvp information (indicated by mvp_flag) and the mvd information of the two directions. Here, establishing a linear relationship for the mvp information and mvd information of the two directions can be understood as establishing a linear relationship between the motion vectors (motions) of the two directions.

Regarding the naming of motion information in this specification, the motion information in the specific direction (first direction) is referred to as first motion information, and the motion information in the other direction (second direction) is referred to as second motion information or third motion information depending on the number or type of elements it contains. The third motion information is motion information in the second direction that includes both the mvd information and the mvp information in the second direction. The second motion information and the third motion information both correspond to motion information in the second direction; they are distinguished by whether both the mvd information and the mvp information in the second direction are included, or at least one of them is omitted.

An embodiment of the present invention for inferring motion in the second direction is illustrated in FIG. 5.

The video encoding device may signal mode information (mode_info) by including it in the bitstream. The bidirectional prediction modes proposed by the present invention may include a first mode in which the second motion information (motion_info_l1) is derived from the first motion information (motion_info_l0), a second mode in which the third motion information (motion_info_l2) is derived using signaled information, and the like.

mode_info may correspond to information indicating any one of the prediction modes included in the plurality of bidirectional prediction modes. Depending on the number of available bidirectional prediction modes, the mode information may be implemented in various forms, such as a flag or an index. The following description assumes that mode_info indicates which of the first mode and the second mode is used for bidirectional prediction of the current block. Under this assumption, mode_info may correspond to information indicating whether the first mode is applied to the current block. In addition, the case where mode_info does not indicate that the first mode is applied may be equivalent to mode_info indicating that the first mode is not applied or that the second mode is applied.

When mode_info indicates that the first mode is applied, the video encoding device may signal motion_info_l0 and motion_info_l1 by including them in the bitstream. motion_info_l0 may include the differential motion vector information in the first direction (mvd_l0) and the predicted motion vector information in the first direction (mvp_l0_flag). motion_info_l1 may include only some of mvd_l1 and mvp_l1_flag (in other words, motion_info_l1 may omit at least part of mvd_l1 and mvp_l1_flag). On the other hand, when mode_info does not indicate that the first mode is applied (when mode_info indicates that the second mode is applied), the video encoding device may signal motion_info_l0 and motion_info_l2 by including them in the bitstream. motion_info_l2 may include both mvd_l1 and mvp_l1_flag.

The video decoding device (decoding unit) may decode mode_info from the bitstream (S530). When mode_info indicates that the first mode is applied (S540), motion_info_l1 is included in the bitstream, so the video decoding device may decode motion_info_l0 and motion_info_l1 from the bitstream (S550).

The video decoding device (prediction unit) may derive the first motion vector mv_l0 based on motion_info_l0, and derive the second motion vector mv_l1 based on motion_info_l1 and at least a part of motion_info_l0 (S560). Since motion_info_l0 includes mvd_l0 and mvp_l0_flag, mv_l0 may be derived by summing mvd_l0 and mvp_l0 as in Formula 1 below.

[Formula 1]

(mvx₀, mvy₀) = (mvpx₀ + mvdx₀, mvpy₀ + mvdy₀)

In Formula 1 above, mvx₀ denotes the x component of mv_l0, and mvy₀ denotes the y component of mv_l0. mvpx₀ denotes the x component of mvp_l0, and mvpy₀ denotes the y component of mvp_l0. mvdx₀ denotes the x component of mvd_l0, and mvdy₀ denotes the y component of mvd_l0.
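The component-wise sum of the predictor and the difference can be sketched as below. The candidate list and its contents are hypothetical, and selecting the predictor by indexing that list with mvp_l0_flag is an assumption made for illustration.

```python
def derive_mv(mvp, mvd):
    """Add the motion vector predictor and the motion vector difference."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


# Hypothetical mvp candidates taken from neighboring blocks.
mvp_candidates = [(4, -2), (0, 6)]
mvp_l0_flag = 1                      # selects the second candidate
mvd_l0 = (1, 3)
mv_l0 = derive_mv(mvp_candidates[mvp_l0_flag], mvd_l0)
```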

Since motion_info_l1 does not include at least part of mvd_l1 and mvp_l1_flag, mv_l1 may be derived based on the correlation of motion. A detailed method of deriving mv_l1 is described below.

The video decoding device may predict the current block (generate a prediction block of the current block) by using a first reference block indicated by mv_l0 within a first reference picture (ref_l0), which is the reference picture in the first direction, and a second reference block indicated by mv_l1 within a second reference picture (ref_l1), which is the reference picture in the second direction (S570). ref_l0 and ref_l1 may be specified by the reference picture information (ref_idx_l0 and ref_idx_l1) signaled from the video encoding device, or may be derived based on the POC differences between the current picture and the reference pictures included in the reference picture lists. Specific embodiments are described below.
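The description does not state how the two reference blocks are combined into the prediction block. One common choice in block-based codecs is a rounded sample-wise average, and the sketch below assumes exactly that; the function name and the averaging rule are assumptions for illustration, not the method defined by this specification.

```python
def bi_predict(block0, block1):
    """Combine two reference blocks by a rounded sample-wise average (assumed rule).

    block0, block1 : 2-D lists of integer samples with identical dimensions,
                     taken from ref_l0 and ref_l1 respectively.
    """
    return [
        [(a + b + 1) >> 1 for a, b in zip(row0, row1)]
        for row0, row1 in zip(block0, block1)
    ]
```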

Meanwhile, when mode_info does not indicate in operation S540 that the first mode is applied (when mode_info indicates that the second mode is applied), motion_info_l2 is included in the bitstream, so the video decoding device may decode motion_info_l0 and motion_info_l2 from the bitstream (S590). In this case, the video decoding device may derive mv_l0 based on motion_info_l0 and derive mv_l1 based on motion_info_l2 (S560). The video decoding device may then predict the current block by using the first reference block indicated by mv_l0 and the second reference block indicated by mv_l1 (S570).

According to an embodiment, the video encoding device may further signal enabling information (enabled_flag) by including it in the bitstream. enabled_flag may correspond to information indicating whether the first mode is enabled. The video encoding device may encode enabled_flag in high-level syntax at, for example, the sequence level, picture level, tile group level, or slice level and, when enabled_flag indicates that the first mode is enabled, may signal mode_info for each prediction unit (block) by including it in the bitstream. In this way, whether the embodiments proposed by the present invention are applied can be set for each block.

When enabled_flag is encoded in high-level syntax and mode_info is encoded on a per-block basis, the video decoding device may decode enabled_flag from the high-level syntax (S510) and, when enabled_flag indicates that the first mode is enabled (S520), decode mode_info from the bitstream (S530). When enabled_flag indicates that the first mode is not enabled, mode_info may not be decoded. In this case, the video decoding device may set or infer mode_info to be "0" or "off", indicating that the first mode is not applied, and thus does not apply the first mode to the current block (S580).
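The decision flow of operations S510 to S590 can be summarized in a short sketch. It is only one reading of the flow: `read_mode_info` is a hypothetical callable standing in for decoding mode_info from the bitstream, and treating the "first mode not applied" case (S580) as falling through to the second mode is an assumption based on the equivalence mentioned earlier.

```python
def select_motion_info(enabled_flag, read_mode_info):
    """Return the names of the motion information sets to decode for a block."""
    if enabled_flag:                  # S520: first mode enabled at high level
        mode_info = read_mode_info()  # S530: decode mode_info for this block
    else:
        mode_info = 0                 # S580: mode_info inferred as "off"
    if mode_info:                     # S540: first mode applied
        return ("motion_info_l0", "motion_info_l1")  # S550
    return ("motion_info_l0", "motion_info_l2")      # S590 (second mode)
```

Note that mode_info is never read when enabled_flag is off, which mirrors the statement that mode_info may not be decoded in that case.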

Hereinafter, various embodiments proposed by the present invention are described according to whether some of the reference picture information (ref_idx_l0 and ref_idx_l1), the predicted motion vector information (mvp_l0_flag and mvp_l1_flag), and the differential motion vector information (mvd_l0 and mvd_l1) are included in the motion information.

In the embodiments described below, motion_info_l0 may include mvd_l0 and mvp_l0_flag, and motion_info_l1 may omit at least some of mvd_l1 and mvp_l1_flag. In other words, motion_info_l0 may omit ref_idx_l0, and motion_info_l1 may omit one or more of ref_idx_l1, mvd_l1, and mvp_l1_flag.

First embodiment

The first embodiment corresponds to a method of inferring motion information by inferring mvd_l1 when ref_idx_l0, mvd_l0, and mvp_l0 are all included in motion_info_l0 and ref_idx_l1 and mvp_l1 are included in motion_info_l1.

In the first embodiment, mvd_l1, which is not signaled, may be derived from mvd_l0. mvd_l1 may be derived based on a symmetric relationship established between mvd_l1 and mvd_l0. That is, mvd_l1 may be set or derived to be a value symmetric to mvd_l0 (mvd_l1 = -mvd_l0), and mv_l1 may be derived using the derived mvd_l1 and the signaled mvp_l1 (Formula 2).

[Formula 2]

(mvx₁, mvy₁) = (mvpx₁ - mvdx₀, mvpy₁ - mvdy₀)

The video encoding device may signal motion_info_l0 and motion_info_l1 (excluding mvd_l1) by including them in the bitstream through the same process as described above. As shown in FIG. 6, the video decoding device may derive mv_l0 by using mvd_l0 and mvp_l0 included in motion_info_l0. In addition, the video decoding device may derive mv_l1 by using mvd_l1 (= -mvd_l0), which is derived from mvd_l0, together with mvp_l1 included in motion_info_l1.

The video decoding device may predict the current block 620 located in the current picture 610 by using the first reference block 630 indicated by mv_l0 within ref_l0, which is indicated by ref_idx_l0, and the second reference block 640 indicated by mv_l1 within ref_l1, which is indicated by ref_idx_l1.
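A minimal sketch of the first embodiment's derivation follows, assuming motion vectors are represented as (x, y) integer tuples and that mvp_l0 and mvp_l1 have already been resolved from their flags; the function name is hypothetical.

```python
def derive_mv_pair_symmetric(mvp_l0, mvd_l0, mvp_l1):
    """Derive mv_l0 and mv_l1 when only mvd_l0 is signaled (mvd_l1 = -mvd_l0)."""
    # First direction: predictor plus signaled difference.
    mv_l0 = (mvp_l0[0] + mvd_l0[0], mvp_l0[1] + mvd_l0[1])
    # Second direction: the difference is mirrored instead of being signaled.
    mvd_l1 = (-mvd_l0[0], -mvd_l0[1])
    mv_l1 = (mvp_l1[0] + mvd_l1[0], mvp_l1[1] + mvd_l1[1])
    return mv_l0, mv_l1


mv_l0, mv_l1 = derive_mv_pair_symmetric((10, 4), (3, -2), (-8, 1))
```

With these example inputs, mv_l1 equals mvp_l1 minus mvd_l0 component by component, which is the relationship expressed in Formula 2.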

Second embodiment

The second embodiment corresponds to a method of inferring motion information by inferring ref_l0 and ref_l1 when ref_idx_l0 is not included in motion_info_l0 and ref_idx_l1 is not included in motion_info_l1.

In the second embodiment, ref_l0 and ref_l1 may each be determined or derived as the reference picture having index 0 (located at the first position) among the reference pictures included in the corresponding reference picture list, or may be determined or derived based on the POC difference between the current picture and the reference pictures included in the reference picture list. Hereinafter, a method of deriving ref_l0 and ref_l1 based on the POC difference from the current picture is described.

The video decoding device may select any one of the reference pictures included in reference picture list 0 (the reference picture list in the first direction) based on the difference in POC value between each of those reference pictures and the current picture, and set the selected reference picture as ref_l0. For example, the video decoding device may set the reference picture having the smallest POC difference from the current picture (the closest reference picture) as ref_l0.

In addition, the video decoding device may select any one of the reference pictures included in reference picture list 1 (the reference picture list in the second direction) based on the difference in POC value between each of those reference pictures and the current picture, and set the selected reference picture as ref_l1. For example, the video decoding device may set the reference picture having the smallest POC difference from the current picture (the closest reference picture) as ref_l1.

The video decoding device may compare the POC values of the reference pictures included in a reference picture list with the POC value of the current picture, sequentially or in parallel, to select one reference picture. When the closest reference picture is selected by sequentially comparing the reference pictures included in the reference picture list, the video decoding device may first virtually set the index value of the reference picture to an index value that is not assigned in the reference picture list (for example, -1) and then compare the reference pictures sequentially.
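The sequential comparison with an unassigned starting index can be sketched as follows. The reference picture list is represented simply as a list of POC values, and keeping the lowest index when two pictures are equally close is an assumption, since the description does not specify a tie-breaking rule.

```python
def closest_ref_idx(ref_list_pocs, poc_curr):
    """Return the index of the reference picture closest in POC to the current picture.

    Returns -1 (an index not assigned in the list) if the list is empty.
    """
    best_idx = -1       # virtual starting index, not assigned in the list
    best_diff = None
    for idx, poc in enumerate(ref_list_pocs):
        diff = abs(poc_curr - poc)
        # Strict comparison keeps the earlier index on ties (assumed rule).
        if best_idx == -1 or diff < best_diff:
            best_idx, best_diff = idx, diff
    return best_idx


ref_idx_l0 = closest_ref_idx([0, 4, 6], poc_curr=8)     # list 0 POC values
ref_idx_l1 = closest_ref_idx([16, 12, 10], poc_curr=8)  # list 1 POC values
```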

The reference picture selected from the reference picture list in the first direction and the reference picture selected from the reference picture list in the second direction may each have a forward or backward POC value relative to the POC value of the current picture. That is, the two selected reference pictures may form a pair consisting of a forward reference picture and a backward reference picture.
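One possible reading of the forward/backward pairing is sketched below: the closest picture that precedes the current picture is taken from list 0, and the closest picture that follows it is taken from list 1. Assigning the preceding picture to list 0 and the following picture to list 1 is an assumption, as are the function names; the description only states that the pair may consist of one forward and one backward reference picture.

```python
def closest_in_direction(ref_list_pocs, poc_curr, preceding):
    """Return the index of the closest picture on one side of the current picture.

    preceding=True  : consider only pictures with POC < poc_curr
    preceding=False : consider only pictures with POC > poc_curr
    Returns -1 if no picture lies on the requested side.
    """
    best_idx = -1
    best_diff = None
    for idx, poc in enumerate(ref_list_pocs):
        diff = poc_curr - poc if preceding else poc - poc_curr
        if diff <= 0:
            continue  # picture is on the wrong side (or is the current POC)
        if best_idx == -1 or diff < best_diff:
            best_idx, best_diff = idx, diff
    return best_idx


def select_pair(list0_pocs, list1_pocs, poc_curr):
    """Pick (ref_idx_l0, ref_idx_l1) as a preceding/following pair (assumed mapping)."""
    return (closest_in_direction(list0_pocs, poc_curr, True),
            closest_in_direction(list1_pocs, poc_curr, False))
```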

When ref_l0 and ref_l1 are derived, the video decoding device may predict the current block by using the first reference block 630 indicated by mv_l0 in ref_l0 and the second reference block 640 indicated by mv_l1 in ref_l1.

According to an embodiment, the process of determining ref_l0 and ref_l1 may be performed at a level higher than that of the current block. That is, among the elements contained in motion_info_l0 and motion_info_l1, the elements other than ref_l0 and ref_l1 may be derived or determined on a per-block basis, while ref_l0 and ref_l1 may be determined on a higher-level basis. Here, the higher level may be any level above the block level, such as the picture level, tile group level, slice level, tile level, or coding tree unit (CTU) level.

The second embodiment may be implemented in combination with the first embodiment described above or with the embodiments described below. That is, although ref_idx_l0 and ref_idx_l1 were described as being signaled in the first embodiment, when the second embodiment is applied to the first embodiment, ref_idx_l0 and ref_idx_l1 are not signaled, and the video decoding device may derive ref_l0 and ref_l1 by itself.

Third embodiment

The third embodiment corresponds to a method of inferring the second motion information from the first motion information based on a linear relationship established between the motion in the first direction and the motion in the second direction.

The video encoding device may signal motion_info_l0 to the video decoding device by including it in the bitstream. motion_info_l0 may include mvp_l0_flag, mvd_l0, and/or ref_idx_l0. The information included in motion_info_l0 may differ for each embodiment described later.

The video decoding device may decode motion_info_l0 from the bitstream (S710). The video decoding device may infer or derive mv_l0 by using mvp_l0_flag and mvd_l0 (S720). mv_l0 may be derived by adding mvp_l0 and mvd_l0 as in Formula 1 described above. Here, mvp_l0 may correspond to the motion vector of the neighboring block indicated by the decoded mvp_l0_flag.

When mv_l0 is derived, the video decoding device may derive mv_l1 by using ref_l0, ref_l1, and mv_l0 (S730). The derived mv_l1 may correspond to a motion vector having a linear relationship with mv_l0. ref_l0 may be the reference picture indicated by ref_idx_l0 signaled from the video encoding device, or a separately defined reference picture. Likewise, ref_l1 may be the reference picture indicated by ref_idx_l1 signaled from the video encoding device, or a separately defined reference picture.

mv_l1 may be derived by applying, to mv_l0, the proportional relationship between "the difference in POC value between the current picture 610 and ref_l0" and "the difference in POC value between the current picture 610 and ref_l1", as shown in Equation 3 below.

[Equation 3]

In Equation 3, mvx₁ denotes the x component of mv_l1, and mvy₁ denotes the y component of mv_l1. POC₀ denotes the POC value of ref_l0, POC₁ denotes the POC value of ref_l1, and POC_curr denotes the POC value of the current picture 610 containing the current block 620. In addition, POC_curr - POC₀ denotes the difference in POC value between ref_l0 and the current picture 610, and POC_curr - POC₁ denotes the difference in POC value between ref_l1 and the current picture 610.
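Read together with the symbol definitions above, Equation 3 can be written out as the following sketch. The symbols mvx₀ and mvy₀ (the x and y components of mv_l0) are assumed names that the text does not define, and the exact form is a reconstruction from the described proportional relationship rather than a quotation of the published formula.

```latex
\begin{aligned}
mvx_1 &= \frac{POC_{curr} - POC_1}{POC_{curr} - POC_0} \cdot mvx_0 \\
mvy_1 &= \frac{POC_{curr} - POC_1}{POC_{curr} - POC_0} \cdot mvy_0
\end{aligned}
```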

When mv_l1 is derived, the video decoding device may predict the current block 620 based on the first reference block 630 indicated by mv_l0 and the second reference block 640 indicated by mv_l1 (S740).
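Steps S720 and S730 can be sketched as follows. This is a minimal illustration, not the decoder's actual implementation: the function names are hypothetical, the scaling assumes that Equation 3 multiplies each component of mv_l0 by the ratio (POC_curr - POC₁) / (POC_curr - POC₀), and exact fractions are used in place of the fixed-point rounding a real codec would apply.

```python
from fractions import Fraction

def derive_mv_l0(mvp_l0, mvd_l0):
    """S720: mv_l0 = mvp_l0 + mvd_l0, component-wise (Equation 1)."""
    return (mvp_l0[0] + mvd_l0[0], mvp_l0[1] + mvd_l0[1])

def derive_mv_l1(mv_l0, poc_curr, poc_ref_l0, poc_ref_l1):
    """S730: scale mv_l0 by the ratio of POC distances (assumed form of Equation 3)."""
    dist_l0 = poc_curr - poc_ref_l0
    dist_l1 = poc_curr - poc_ref_l1
    if dist_l0 == 0:
        raise ValueError("ref_l0 must not share the POC of the current picture")
    scale = Fraction(dist_l1, dist_l0)
    return (mv_l0[0] * scale, mv_l0[1] * scale)

# Hypothetical values: current picture midway between ref_l0 and ref_l1.
mv_l0 = derive_mv_l0(mvp_l0=(6, -2), mvd_l0=(2, -2))
mv_l1 = derive_mv_l1(mv_l0, poc_curr=8, poc_ref_l0=4, poc_ref_l1=12)
```

With the current picture midway between the two reference pictures, the derived mv_l1 is the mirror image of mv_l0.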

Depending on the embodiment, the various embodiments proposed by the present invention may use a syntax element indicating enabling/disabling (e.g., linear_MV_coding_enabled_flag) and/or a syntax element indicating the linear relationship of motion (e.g., linear_MV_coding_flag or linear_MV_coding_idc) to determine whether they are applied to the current block 620. Here, the syntax element indicating enabling/disabling may correspond to the enable information described above, and the syntax element indicating the linear relationship may correspond to the mode information described above.

linear_MV_coding_enabled_flag is high-level syntax and may be defined at one or more of the sequence level, the picture level, the tile group level, and the slice level. linear_MV_coding_flag may be signaled for each block to be decoded.

When linear_MV_coding_enabled_flag = 1, whether the embodiments proposed by the present invention are applied may be set for each block by signaling linear_MV_coding_flag for each prediction unit. When linear_MV_coding_flag = 1, some or all of motion_info_l1 is not signaled, and motion_info_l1 may be derived using the signaled motion_info_l0 (first mode). When linear_MV_coding_flag = 0, motion_info_l1 may be signaled as in the conventional method (second mode).

Hereinafter, the embodiments of the present invention are described on the premise that linear_MV_coding_enabled_flag is defined at a high level to activate the function and that linear_MV_coding_flag is set for each block.

Embodiment 3-1

Embodiment 3-1 corresponds to a method in which, during bidirectional prediction, mvp_l1_flag and mvd_l1 of motion_info_l1 are not signaled and are instead derived from motion_info_l0 using the linear relationship of motion.

When the second direction is the L0 direction, the motion information in the L0 direction may be derived, through the linear relationship of motion, from the mvd and mvp in the L1 direction and the bidirectional reference pictures. That is, the mvp information and mvd information in the L0 direction are not signaled. When the second direction is the L1 direction, the motion information in the L1 direction may be derived, through the linear relationship of motion, from the mvd and mvp in the L0 direction and the bidirectional reference pictures. That is, the mvp information and mvd information in the L1 direction are not signaled.

When the motion vector in the L1 direction is derived using the linear relationship (the latter case), the information signaled from the video encoding device to the video decoding device is expressed in syntax as shown in Table 4 below.

[Table 4]

As shown in Table 4, motion_info_l0 may be signaled from the video encoding device to the video decoding device by being included in the bitstream. The signaled motion_info_l0 may include ref_idx_l0, mvd_l0, and mvp_l0_flag. ref_idx_l1 may also be signaled by being included in the bitstream. In Embodiment 3-1, the reference pictures (ref_l0 and ref_l1) used to derive mv_l1 correspond to the reference pictures indicated by ref_idx_l0 and ref_idx_l1 signaled from the video encoding device.

When motion_info_l0 is decoded (S910), the video decoding device may infer or derive mv_l0 by using the decoded mvp_l0_flag and mvd_l0 (S920). Equation 1 may be used in this process. In addition, ref_idx_l1 may be decoded from the bitstream (S930).

The video decoding device may use linear_MV_coding_enabled_flag to determine whether the motion vector derivation function is activated or deactivated (S940). When linear_MV_coding_enabled_flag indicates that the motion vector derivation function is activated, linear_MV_coding_flag may be decoded from the bitstream to determine whether to apply the derivation function proposed by the present invention (S950).

When the decoded linear_MV_coding_flag indicates that the linear relationship of motion is established (S960), the video decoding device may derive mv_l1 on the premise that a linear relationship holds between mv_l0 and mv_l1 (S970). The derivation of mv_l1 may be implemented by applying the reference pictures in each direction (ref_l0 and ref_l1) and mv_l0 to Equation 3.

Meanwhile, when linear_MV_coding_enabled_flag indicates in operation S940 that the motion vector derivation function is deactivated, or when linear_MV_coding_flag does not indicate in operation S960 that the linear relationship of motion is established, mv_l1 may be derived through the second mode instead of the first mode. Specifically, the video decoding device may decode mvp_l1_flag and mvd_l1 from the bitstream (S980 and S990) and derive mv_l1 by using mvp_l1_flag and mvd_l1 (S992).
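The decision flow of operations S940 to S992 can be sketched as follows. The `syntax` dictionary is a hypothetical stand-in for syntax elements already parsed from the bitstream, the function name and candidate list are illustrative, and the first-mode scaling assumes the POC-ratio form of Equation 3.

```python
from fractions import Fraction

def derive_mv_l1_3_1(syntax, enabled_flag, mv_l0, poc_curr, poc0, poc1, mvp_l1_candidates):
    """Return (mode, mv_l1) following the S940-S992 decision flow."""
    # S940-S960: the block-level flag is only meaningful when the tool is enabled.
    use_linear = enabled_flag == 1 and syntax.get("linear_MV_coding_flag", 0) == 1
    if use_linear:
        # S970 (first mode): scale mv_l0 by the ratio of POC distances.
        scale = Fraction(poc_curr - poc1, poc_curr - poc0)
        return "first", (mv_l0[0] * scale, mv_l0[1] * scale)
    # S980-S992 (second mode): conventional mvp_l1 + mvd_l1.
    mvp_l1 = mvp_l1_candidates[syntax["mvp_l1_flag"]]
    mvd_l1 = syntax["mvd_l1"]
    return "second", (mvp_l1[0] + mvd_l1[0], mvp_l1[1] + mvd_l1[1])

# Hypothetical blocks: one coded in the first mode, one in the second mode.
linear_block = {"linear_MV_coding_flag": 1}
regular_block = {"linear_MV_coding_flag": 0, "mvp_l1_flag": 1, "mvd_l1": (1, 2)}
candidates = [(0, 0), (-7, 3)]
first = derive_mv_l1_3_1(linear_block, 1, (8, -4), 8, 4, 12, candidates)
second = derive_mv_l1_3_1(regular_block, 1, (8, -4), 8, 4, 12, candidates)
```

In the first mode neither mvp_l1_flag nor mvd_l1 is read, which is where the bit saving of this embodiment comes from.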

The syntax elements of Embodiment 3-1 described above are shown in Table 5 below.

[Table 5]

FIG. 9 illustrates the operation of determining linear_MV_coding_enabled_flag (S940) and the operations of decoding and determining linear_MV_coding_flag (S950 and S960) as being performed after the operation of decoding ref_idx_l1 (S930), but operations S940 to S960 may instead be performed before the operation of decoding motion_info_l0 (S910).

An example of deriving mv_l1 based on Embodiment 3-1 is illustrated in FIG. 10. (A) and (B) of FIG. 10 each illustrate one of two arrangements of the current picture 610 and the reference pictures ref_l0 and ref_l1 in bidirectional prediction, distinguished by the magnitude of their POC values. The embodiments described below may be applied to both arrangements illustrated in FIG. 10.

In bidirectional prediction, as shown in (A) of FIG. 10, the current picture 610 may be located between the reference pictures (ref_l0 and ref_l1) in terms of POC value (i.e., (POC₀ < POC_cur) & (POC_cur < POC₁)). In addition, as shown in (B) of FIG. 10, bidirectional prediction may include the case where the POC value of the current picture 610 is greater than the POC values of both reference pictures ref_l0 and ref_l1 (i.e., (POC₀ < POC_cur) & (POC₁ < POC_cur)). Here, POC₀ denotes the POC value of ref_l0, POC₁ denotes the POC value of ref_l1, and POC_cur denotes the POC value of the current picture 610.

In both types of bidirectional prediction, mv_l1 may be derived on the premise that a linear relationship holds between mv_l0 (solid arrow) and mv_l1 (dashed arrow). In this process, mv_l0 and the reference pictures in each direction (ref_l0 and ref_l1) may be used. When mv_l1 is derived, the current block 620 may be predicted based on the reference block 630 indicated by mv_l0 and the reference block 640 indicated by the derived mv_l1.
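As a worked illustration of the two arrangements, take mv_l0 = (3, -1) and assume that the derivation multiplies mv_l0 by (POC_cur - POC₁) / (POC_cur - POC₀). The POC values below are hypothetical and are not taken from FIG. 10.

```latex
\begin{aligned}
\text{(A)}\quad & POC_0 = 2,\ POC_{cur} = 4,\ POC_1 = 8: &
\frac{4 - 8}{4 - 2} &= -2, & mv\_l1 &= -2 \cdot (3, -1) = (-6, 2) \\
\text{(B)}\quad & POC_0 = 3,\ POC_1 = 1,\ POC_{cur} = 4: &
\frac{4 - 1}{4 - 3} &= 3, & mv\_l1 &= 3 \cdot (3, -1) = (9, -3)
\end{aligned}
```

In case (A) the reference pictures lie on opposite sides of the current picture, so mv_l1 points in the opposite direction to mv_l0. In case (B) both reference pictures precede the current picture, so the two vectors point in the same direction.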

Embodiment 3-2

Embodiment 3-2 corresponds to a method of deriving mv_l1 based on the linear relationship of motion and then correcting or adjusting mv_l1. Embodiment 3-2 is the same as Embodiment 3-1 in that the motion vector is derived based on the linear relationship of motion, but differs from Embodiment 3-1 in that mv_l1 is additionally corrected or adjusted using offset information.

The offset information for motion correction corresponds to information indicating the difference between mv_l1 and the "adjusted mv_l1". In other words, the offset information indicates the difference between the motion vector derived using the linear relationship of motion (mv_l1) and the measured (actual) motion vector of the current block (the adjusted mv_l1).

The offset information may include an offset vector or an offset index. The offset vector corresponds to information indicating the position indicated by the "adjusted mv_l1" relative to the position indicated by mv_l1. The offset index corresponds to information obtained by indexing the candidates that may serve as the offset vector. Hereinafter, each of the two types of offset information is described in a separate embodiment.

Offset vector

In addition to motion_info_l0, the offset vector may be signaled by being included in the bitstream. As described above, since the offset vector corresponds to the difference between the adjusted mv_l1 and the (unadjusted) mv_l1, the offset vector may be expressed as a motion vector difference (mvd). However, since the offset vector corresponds to the difference between the motion vector derived using the linear relationship of motion and the measured motion vector of the current block, it is distinguished from the mvd used in the conventional method (the difference between the mvp derived from the motion vector of a neighboring block and the mv of the current block). In the present embodiment, the information signaled from the video encoding device to the video decoding device for bidirectional prediction is expressed in syntax as shown in Table 6 below.

[Table 6]

In Table 6 above, mvd_l1 may be either the mvd used in the conventional method or the offset vector. For the current block 620, when the linear relationship of motion is not established, the mvd used in the conventional method may be signaled as mvd_l1, and when the linear relationship of motion is established, the offset vector may be signaled as mvd_l1.

As shown in Table 6, motion_info_l0 may be signaled from the video encoding device to the video decoding device. The signaled motion_info_l0 may include ref_idx_l0, mvd_l0, and mvp_l0_flag. ref_idx_l1 may also be signaled by being included in the bitstream.

The video decoding device sets the reference pictures indicated by the signaled reference picture information (ref_idx_l0 and ref_idx_l1) as the reference pictures (ref_l0 and ref_l1) used to derive mv_l1 (that is, to predict the current block).

When motion_info_l0 is decoded (S1110), the video decoding device may infer or derive mv_l0 by using mvp_l0_flag and mvd_l0 (S1120). Equation 1 may be used in this process. In addition, the video decoding device may decode ref_idx_l1 and mvd_l1 from the bitstream (S1130 and S1140). Here, depending on whether the linear relationship is established, mvd_l1 may correspond to either the mvd of the conventional method or the offset vector.

The video decoding device may use linear_MV_coding_enabled_flag to determine whether the motion vector derivation function is activated or deactivated (S1150). When linear_MV_coding_enabled_flag indicates that the motion vector derivation function is activated, linear_MV_coding_flag may be decoded from the bitstream (S1160).

When linear_MV_coding_flag indicates that the linear relationship of motion is established (S1170), the video decoding device may derive mv_l1 on the premise that the linear relationship of motion holds (S1180). This process may be implemented by applying the reference pictures (ref_l0 and ref_l1) and mv_l0 to Equation 3.

The video decoding device may adjust or correct mv_l1 by applying the offset vector (mvd_l1) to the derived mv_l1 (S1182). Specifically, mv_l1 may be adjusted so that the adjusted mv_l1 indicates the position reached by moving by the offset vector mvd_l1 from the position indicated by mv_l1, taken as the origin. The adjustment can be understood as treating the derived mv_l1 as the predicted motion vector (mvp) in the second direction and applying the offset vector (mvd_l1) to that assumed predicted motion vector.

Meanwhile, when linear_MV_coding_enabled_flag indicates in operation S1150 that the motion vector derivation function is deactivated, or when linear_MV_coding_flag does not indicate in operation S1170 that the linear relationship of motion is established, the video decoding device may derive mv_l1 by the conventional method rather than the derivation method proposed by the present invention. Specifically, the video decoding device may decode mvp_l1_flag (S1190) and derive mv_l1 by summing the mvp_l1 indicated by mvp_l1_flag and the mvd_l1 decoded in S1140 (S1192). Here, mvd_l1 corresponds to the mvd used in the conventional method.
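The dual role of mvd_l1 in this embodiment can be sketched as follows. The function name is hypothetical, and the scaling again assumes the POC-ratio form of Equation 3 with exact fractions in place of codec-style rounding.

```python
from fractions import Fraction

def mv_l1_with_offset_vector(linear_flag, mvd_l1, mv_l0, poc_curr, poc0, poc1, mvp_l1=None):
    """Interpret the decoded mvd_l1 according to the linear-relationship flag."""
    if linear_flag:
        # S1180: derive mv_l1 from mv_l0 through the linear relationship.
        scale = Fraction(poc_curr - poc1, poc_curr - poc0)
        derived = (mv_l0[0] * scale, mv_l0[1] * scale)
        # S1182: mvd_l1 acts as an offset vector on the derived mv_l1.
        return (derived[0] + mvd_l1[0], derived[1] + mvd_l1[1])
    # S1192: mvd_l1 is a conventional mvd added to the signaled predictor.
    return (mvp_l1[0] + mvd_l1[0], mvp_l1[1] + mvd_l1[1])

# Hypothetical values for both branches.
adjusted = mv_l1_with_offset_vector(True, (1, -1), (8, -4), 8, 4, 12)
conventional = mv_l1_with_offset_vector(False, (1, -1), (8, -4), 8, 4, 12, mvp_l1=(-7, 3))
```

In both branches the same syntax element is added to a predictor. Only the source of the predictor changes: the linearly derived mv_l1 in one case and the neighbor-based mvp_l1 in the other.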

The syntax elements for the embodiment described above are shown in Table 7 below.

[Table 7]

FIG. 11 illustrates the operation of determining linear_MV_coding_enabled_flag (S1150) and the operations of decoding and determining linear_MV_coding_flag (S1160 and S1170) as being performed after the operation of decoding mvd_l1 (S1140), but operations S1150 to S1170 may instead be performed before the operation of decoding motion_info_l0 (S1110).

An example of deriving mv_l1 based on the present embodiment is illustrated in FIG. 12. As shown in FIG. 12, mv_l1 may be derived on the premise that a linear relationship holds between mv_l0 (solid arrow) and mv_l1 (dashed arrow).

Furthermore, treating the derived mv_l1 as a predicted motion vector, mv_l1 may be adjusted by moving the position indicated by mv_l1 according to the direction and magnitude indicated by the offset vector mvd_l1. The current block 620 may be predicted based on the reference block 630 indicated by mv_l0 and the reference block 640 indicated by the adjusted second motion vector (mvA_l1).

Offset index

In addition to motion_info_l0, the offset index may be signaled by being included in the bitstream. As described above, the offset index corresponds to an index indicating any one of one or more preset offset vector candidates (candidates that may serve as the offset vector).

In the present embodiment, the information signaled from the video encoding device to the video decoding device for bidirectional prediction is expressed in syntax as shown in Table 8 below.

[Table 8]

In Table 8 above, mv_offset denotes the syntax element corresponding to the offset index. motion_info_l0 may be signaled from the video encoding device to the video decoding device by being included in the bitstream. The signaled motion_info_l0 may include ref_idx_l0, mvd_l0, and mvp_l0_flag as shown in Table 8. ref_idx_l1 may also be signaled by being included in the bitstream. The video decoding device sets the reference pictures indicated by the signaled reference picture information (ref_idx_l0 and ref_idx_l1) as the reference pictures (ref_l0 and ref_l1) used to derive mv_l1.

When motion_info_l0 is decoded (S1310), the video decoding device may infer or derive mv_l0 by using the mvp_l0_flag and mvd_l0 included in motion_info_l0 (S1320). Equation 1 may be used in this process. In addition, the video decoding device may decode ref_idx_l1 (S1330).

The video decoding device may determine whether the motion vector derivation function is activated or deactivated by analyzing linear_MV_coding_enabled_flag (S1340). When linear_MV_coding_enabled_flag indicates that the motion vector derivation function is activated, linear_MV_coding_flag may be decoded from the bitstream (S1350).

When linear_MV_coding_flag indicates that the linear relationship of motion is established (S1360), the video decoding device decodes the offset index mv_offset (S1370) and may derive mv_l1 on the premise that a linear relationship holds between mv_l0 and mv_l1 (S1380). This process may be implemented by applying mv_l0 and the bidirectional reference pictures (ref_l0 and ref_l1) to Equation 3.

The video decoding device may adjust or correct mv_l1 by applying the offset vector candidate indicated by the offset index (mv_offset) to the derived mv_l1 (S1382). Specifically, mv_l1 may be adjusted by adding the offset vector candidate indicated by the offset index (mv_offset) to mv_l1. In other words, the adjustment can be understood as treating the derived mv_l1 as the predicted motion vector (mvp) in the second direction and applying the offset vector candidate indicated by mv_offset to that assumed predicted motion vector.
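Operation S1382 can be sketched as a table lookup followed by a vector addition. The candidate table below is a hypothetical 4-entry layout chosen only for illustration, and the function name is likewise an assumption.

```python
# Hypothetical candidate table: unit steps right, left, down and up.
OFFSET_CANDIDATES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def adjust_with_offset_index(derived_mv_l1, mv_offset):
    """S1382: add the candidate selected by mv_offset to the derived mv_l1."""
    dx, dy = OFFSET_CANDIDATES[mv_offset]
    return (derived_mv_l1[0] + dx, derived_mv_l1[1] + dy)

# Derived mv_l1 of (-8, 4) nudged by candidate 2, i.e. (0, 1).
adjusted = adjust_with_offset_index((-8, 4), mv_offset=2)
```

Compared with signaling a full offset vector, the index trades precision for a short, fixed number of bits.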

Meanwhile, when linear_MV_coding_enabled_flag indicates in operation S1340 that the motion vector derivation function is deactivated, or when linear_MV_coding_flag does not indicate in operation S1360 that the linear relationship of motion is established, mv_l1 may be derived by the conventional method rather than the derivation method proposed by the present invention. Specifically, the video decoding device may decode mvd_l1 and mvp_l1_flag from the bitstream (S1390 and S1392) and derive mv_l1 by summing the mvp_l1 indicated by mvp_l1_flag and mvd_l1 (S1394).

The syntax elements of the embodiment described above are shown in Table 9 below.

[Table 9]

FIG. 13 illustrates the operation of determining linear_MV_coding_enabled_flag (S1340) and the operations of decoding and determining linear_MV_coding_flag (S1350 and S1360) as being performed after the operation of decoding ref_idx_l1 (S1330), but operations S1340 to S1360 may instead be performed before the operation of decoding motion_info_l0 (S1310).

Various types of offset vector candidates used in the present embodiment are illustrated in FIG. 14. (a) of FIG. 14 illustrates the offset vector candidates (hollow circles) when motion with a 4-point offset is allowed. The filled circle represents the mv_l1 derived based on the linear relationship of motion. When motion with a 4-point offset is allowed, a 2-bit fixed-length (FL) offset index may be used to indicate any one of the offset vector candidates.

(b) of FIG. 14 illustrates the offset vector candidates when motion with an 8-point offset is allowed. The 8-point offset vector candidates may be configured by adding four offset vector candidates (circles filled with a vertical pattern) to the 4-point offset vector candidates. When motion with an 8-point offset is allowed, a 3-bit fixed-length offset index may be used to indicate any one of the offset vector candidates.

(c) of FIG. 14 illustrates the offset vector candidates when motion with a 16-point offset is allowed. The 16-point offset vector candidates may be configured by adding eight offset vector candidates (circles filled with a horizontal pattern) to the 8-point offset vector candidates. When motion with a 16-point offset is allowed, a 4-bit fixed-length offset index may be used to indicate any one of the offset vector candidates.

(d) of FIG. 14 illustrates another example of the case where motion with a 16-point offset is allowed. The 16-point offset vector candidates may be configured by combining eight offset vector candidates filled with a horizontal pattern and eight offset vector candidates filled with a diagonal pattern. When motion with a 16-point offset is allowed, a 4-bit fixed-length offset index may be used to indicate any one of the offset vector candidates.
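The relationship between the size of a candidate set and the fixed-length index width can be sketched as follows. The candidate coordinates are hypothetical, since the exact positions are given only in FIG. 14; only the set sizes (4, 8, 16) and the nesting of the smaller set inside the larger one follow the description.

```python
# Hypothetical layouts: each larger set extends the previous one.
FOUR_POINT = [(1, 0), (-1, 0), (0, 1), (0, -1)]
EIGHT_POINT = FOUR_POINT + [(1, 1), (1, -1), (-1, 1), (-1, -1)]
SIXTEEN_POINT = EIGHT_POINT + [(2 * dx, 2 * dy) for dx, dy in EIGHT_POINT]

def index_bits(candidates):
    """Number of bits of a fixed-length index that can address every candidate."""
    return (len(candidates) - 1).bit_length()

def write_index(index, candidates):
    """Render the offset index as a fixed-length binary string."""
    return format(index, "0{}b".format(index_bits(candidates)))

widths = [index_bits(c) for c in (FOUR_POINT, EIGHT_POINT, SIXTEEN_POINT)]
codeword = write_index(5, EIGHT_POINT)
```

Doubling the number of candidates costs one extra bit per block, which is the trade-off an encoder would weigh when choosing among the candidate types.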

Which of the various types of offset vector candidates described with reference to FIG. 14 is used may be determined or defined at one or more of the picture-level header, the tile group header, the tile header, and/or the CTU header. That is, the form of the offset vector candidates may be determined using information (identification information) signaled from the video encoding device, and the identification information may be defined at the various positions described above. Since the identification information identifies one of the various types of offset vector candidates, the number of offset vector candidates, the magnitude of each candidate, and the direction of each candidate may be determined by the identification information.

Alternatively, which of the various types of offset vector candidates is used may be determined in advance by applying the same rule at the video encoding device and the video decoding device.

Fourth embodiment

The fourth embodiment corresponds to a method in which, of the horizontal and vertical directions of motion, the motion in the direction for which the linear relationship is established is derived from motion_info_l0 without signaling, while the motion in the direction for which the linear relationship is not established is adjusted using additionally signaled information (offset information).

For example, when the linear relationship is established only for the horizontal-axis component of the motion, the horizontal-axis component of the derived mv_l1 may be used without modification, while the vertical-axis component, for which the linear relationship is not established, is adjusted using additionally signaled offset information. As another example, when the linear relationship is established only for the vertical-axis component of the motion, the vertical-axis component of the derived mv_l1 is used without modification, while the horizontal-axis component, for which the linear relationship is not established, is adjusted using additionally signaled offset information.

The fourth embodiment may be implemented in combination with Embodiment 3-1 or 3-2 described above. Hereinafter, the combination of the fourth embodiment with Embodiment 3-1 and the combination of the fourth embodiment with Embodiment 3-2 are described separately.

Embodiment 4-1

Embodiment 4-1 corresponds to the combination of the fourth embodiment with Embodiment 3-1. In the present embodiment, the information signaled from the video encoding device to the video decoding device for bidirectional prediction is expressed in syntax as shown in Table 10 below.

[Table 10]

In Table 10, mvd_l1 may be offset information (an offset vector) or the mvd of the conventional method. For example, when the linear relationship is not established for the horizontal-axis component, mvd_l1 is the offset vector for the horizontal-axis component, and when the linear relationship is not established for the vertical-axis component, mvd_l1 may be the offset vector for the vertical-axis component. Furthermore, when the linear relationship is established for neither the horizontal-axis component nor the vertical-axis component, mvd_l1 may be the mvd of the conventional method. When the linear relationship is established for both the horizontal-axis component and the vertical-axis component, mvd_l1 is not signaled.
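The four interpretations of mvd_l1 listed above can be sketched as a per-axis selection. The string values "none", "x", "y", and "both" are hypothetical labels for the states of linear_MV_coding_idc (the value naming an axis marks the axis for which the linear relationship holds), and the function and parameter names are illustrative only.

```python
def mv_l1_per_axis(idc, derived=None, axis_offset=None, conventional=None):
    """Combine the linearly derived mv_l1 with signaled information per axis.

    derived      : mv_l1 obtained from mv_l0 through the linear relationship
    axis_offset  : scalar offset for the axis where the relationship does not hold
    conventional : mv_l1 obtained as mvp_l1 + mvd_l1 in the conventional method
    """
    if idc == "none":
        # No linear relationship on either axis: conventional mv_l1.
        return conventional
    x, y = derived
    if idc == "x":
        # Horizontal component kept as derived; vertical component adjusted.
        return (x, y + axis_offset)
    if idc == "y":
        # Vertical component kept as derived; horizontal component adjusted.
        return (x + axis_offset, y)
    if idc == "both":
        # Linear relationship on both axes: mvd_l1 is not signaled.
        return (x, y)
    raise ValueError("unknown linear_MV_coding_idc value: {!r}".format(idc))

# Hypothetical case: only the horizontal component follows the linear relationship.
adjusted = mv_l1_per_axis("x", derived=(-8, 4), axis_offset=3)
```

Because only one scalar is signaled when a single axis deviates, this arrangement can cost fewer bits than a full two-component offset vector.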

motion_info_l0 may be signaled from the video encoding device to the video decoding device by being included in the bitstream. The signaled motion_info_l0 may include ref_idx_l0, mvd_l0, and mvp_l0_flag. ref_idx_l1 may also be signaled by being included in the bitstream. The video decoding device sets the reference pictures indicated by the signaled reference picture information (ref_idx_l0 and ref_idx_l1) as the reference pictures (ref_l0 and ref_l1) used to derive mv_l1.

When motion_info_l0 is decoded (S1510), the video decoding device may infer or derive mv_l0 by using mvp_l0_flag and mvd_l0 (S1520). Equation 1 may be used in this process. In addition, the video decoding device may decode ref_idx_l1 from the bitstream (S1530).

When linear_MV_coding_enabled_flag indicates that the motion vector derivation function is activated (S1540), the video decoding device decodes linear_MV_coding_idc from the bitstream (S1550). Here, linear_MV_coding_idc is information indicating whether the motion has a linear relationship, and it may be used to indicate which of the horizontal-axis and vertical-axis components of the motion has the linear relationship established.

When linear_MV_coding_idc = none (S1560), a linear relationship is established for neither component, so mvp_l1_flag and mvd_l1 are signaled as in the conventional method. Therefore, the video decoding device may decode mvp_l1_flag and mvd_l1 from the bitstream (S1562) and derive mv_l1 by using the decoded information (S1564). Likewise, when linear_MV_coding_enabled_flag does not indicate activation of the motion vector derivation function in operation S1540, the video decoding device may derive mv_l1 by using the decoded mvp_l1_flag and mvd_l1 (S1562 and S1564).

When linear_MV_coding_idc = x (S1570), a linear relationship is established only for the horizontal axis component (x), so an offset vector (mvd_l1, y) is signaled for the vertical axis component (y), for which no linear relationship is established. Therefore, the video decoding device decodes the offset vector (mvd_l1, y) for the vertical axis component (S1572) and derives mv_l1 using the linear relationship. The video decoding device may then adjust mv_l1 by applying the offset vector (mvd_l1, y) for the vertical axis component to the derived mv_l1 (S1576).

The video decoding device may use the derived mv_l1 without modification for the horizontal axis component and use the adjusted second motion vector (mvA_l1) for the vertical axis component. The horizontal axis component of the derived mv_l1 and the horizontal axis component of the adjusted second motion vector (mvA_l1) may be the same.

When linear_MV_coding_idc = y (S1580), a linear relationship is established only for the vertical axis component, so an offset vector (mvd_l1, x) is signaled for the horizontal axis component, for which no linear relationship is established. Therefore, the video decoding device may decode the offset vector (mvd_l1, x) for the horizontal axis component (S1582), derive mv_l1 using the linear relationship (S1584), and adjust mv_l1 by applying the offset vector (mvd_l1, x) for the horizontal axis component to the derived mv_l1 (S1586).

The video decoding device may use the derived mv_l1 without modification for the vertical axis component and use the adjusted second motion vector (mvA_l1) for the horizontal axis component. The vertical axis component of the derived mv_l1 and the vertical axis component of the adjusted second motion vector (mvA_l1) may be the same.

When linear_MV_coding_idc = (x&y) (S1580), a linear relationship is established for both the horizontal axis component and the vertical axis component, so mvd_l1 (offset information or mvd information for the second direction) is not signaled. In this case, the video decoding device derives mv_l1 by using motion_info_l0 and ref_idx_l1 (S1590).
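The four cases of Embodiment 4-1 can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the normative decoding process: the function name, the string values used for linear_MV_coding_idc, and the tuple representation of vectors are hypothetical; mv_l1_linear stands for the vector already obtained from mv_l0 through the linear relationship; and the conventional derivation is assumed to be the sum of the predictor mvp_l1 and mvd_l1.

```python
from typing import Optional, Tuple

MV = Tuple[int, int]  # (horizontal component, vertical component)


def derive_mv_l1_4_1(idc: str,
                     mv_l1_linear: Optional[MV] = None,
                     mvp_l1: Optional[MV] = None,
                     mvd_l1: Optional[MV] = None) -> MV:
    """Sketch of the mv_l1 derivation of Embodiment 4-1 (hypothetical helper).

    idc          -- "none", "x", "y" or "x&y" (assumed encoding of linear_MV_coding_idc)
    mv_l1_linear -- mv_l1 obtained from mv_l0 via the linear relationship
    mvp_l1       -- predictor selected by mvp_l1_flag (conventional path only)
    mvd_l1       -- decoded mvd or offset; only the signaled component is used
    """
    if idc == "none":
        # Conventional method, assumed here to be predictor plus difference.
        return (mvp_l1[0] + mvd_l1[0], mvp_l1[1] + mvd_l1[1])
    lx, ly = mv_l1_linear
    if idc == "x":
        # Linear for x only: keep x, adjust y by the vertical offset.
        return (lx, ly + mvd_l1[1])
    if idc == "y":
        # Linear for y only: keep y, adjust x by the horizontal offset.
        return (lx + mvd_l1[0], ly)
    if idc == "x&y":
        # Linear for both components: nothing is signaled for mvd_l1.
        return (lx, ly)
    raise ValueError(f"unknown linear_MV_coding_idc value: {idc}")
```

For example, with mv_l1_linear = (-4, 2) and a vertical offset of 3, the "x" case keeps the horizontal component and moves only the vertical one.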

The syntax elements of Embodiment 4-1 are shown in Table 11 below.

[Table 11]

FIG. 15 illustrates the operation of determining linear_MV_coding_enabled_flag (S1540) and the operations of decoding and determining linear_MV_coding_idc (S1550 to S1580) as being performed after the operation of decoding ref_idx_l1 (S1530); however, operations S1540 to S1580 may instead be performed before the operation of decoding motion_info_l0 (S1510).

Embodiment 4-2

Embodiment 4-2 corresponds to a combination of the fourth embodiment and Embodiment 3-2. In this embodiment, the information signaled from the video encoding device to the video decoding device for bidirectional prediction is expressed in syntax as shown in Table 10 above.

In Table 10, mvd_l1 may be either offset information (an offset vector) or the mvd of the conventional method. For example, when no linear relationship is established for the horizontal axis component, mvd_l1 is an offset vector for the horizontal axis component, and when no linear relationship is established for the vertical axis component, mvd_l1 may be an offset vector for the vertical axis component. When a linear relationship is established for neither the horizontal axis component nor the vertical axis component, mvd_l1 may be the mvd of the conventional method. When a linear relationship is established for both components, mvd_l1 may be an offset vector for both components.

motion_info_l0 may be signaled from the video encoding device to the video decoding device by being included in the bitstream. The signaled motion_info_l0 may include ref_idx_l0, mvd_l0, and mvp_l0_flag. ref_idx_l1 may also be signaled by being included in the bitstream. The video decoding device sets the reference pictures indicated by the signaled reference picture information (ref_idx_l0 and ref_idx_l1) as the reference pictures (ref_l0 and ref_l1) used to derive mv_l1.

When motion_info_l0 is decoded (S1610), the video decoding device may infer or derive mv_l0 by using mvp_l0_flag and mvd_l0 (S1620). Equation 1 may be used in this process. In addition, the video decoding device may decode ref_idx_l1 from the bitstream (S1630).

When linear_MV_coding_enabled_flag indicates that the motion vector derivation function is activated (S1640), the video decoding device decodes linear_MV_coding_idc from the bitstream (S1650).

When linear_MV_coding_idc = none (S1660), a linear relationship is established for neither component, so mvp_l1_flag and mvd_l1 are signaled as in the conventional method. Therefore, the video decoding device may decode mvp_l1_flag and mvd_l1 from the bitstream (S1662) and derive mv_l1 by using the decoded information (S1664). Likewise, when linear_MV_coding_enabled_flag does not indicate activation of the motion vector derivation function in operation S1640, the video decoding device may derive mv_l1 by using the decoded mvp_l1_flag and mvd_l1 (S1662 and S1664).

When linear_MV_coding_idc = x (S1670), a linear relationship is established only for the horizontal axis component, so an offset vector (mvd_l1, y) is signaled for the vertical axis component, for which no linear relationship is established. Therefore, the video decoding device decodes the offset vector (mvd_l1, y) for the vertical axis component (S1672) and derives mv_l1 using the linear relationship (S1674). The video decoding device may then adjust mv_l1 by applying the offset vector (mvd_l1, y) for the vertical axis component to the derived mv_l1 (S1676).

The video decoding device may use the derived mv_l1 without modification for the horizontal axis component and use the adjusted second motion vector (mvA_l1) for the vertical axis component. The horizontal axis component of the derived mv_l1 and the horizontal axis component of the adjusted second motion vector (mvA_l1) may be the same.

When linear_MV_coding_idc = y (S1680), a linear relationship is established only for the vertical axis component, so an offset vector (mvd_l1, x) is signaled for the horizontal axis component, for which no linear relationship is established. Therefore, the video decoding device may decode the offset vector (mvd_l1, x) for the horizontal axis component (S1682), derive mv_l1 using the linear relationship (S1684), and adjust mv_l1 by applying the offset vector (mvd_l1, x) for the horizontal axis component to the derived mv_l1 (S1686).

The video decoding device may use the derived mv_l1 without modification for the vertical axis component and use the adjusted second motion vector (mvA_l1) for the horizontal axis component. The vertical axis component of the derived mv_l1 and the vertical axis component of the adjusted second motion vector (mvA_l1) may be the same.

When linear_MV_coding_idc = (x&y) (S1680), a linear relationship is established for both the horizontal axis component and the vertical axis component, and offset vectors for both components (mvd_l1, x and y) are signaled. Therefore, the video decoding device decodes the offset vectors for both the horizontal axis component and the vertical axis component (mvd_l1, x and y) from the bitstream (S1690), and may adjust mv_l1 by applying the offset vectors (mvd_l1, x and y) to the mv_l1 derived using the linear relationship (S1692 and S1694).
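The difference between Embodiments 4-1 and 4-2 lies in what is signaled for the second direction when both components are linear. The sketch below lists, per value of linear_MV_coding_idc, which second-direction elements the decoder would parse. The function name, the string encoding of the idc values, and the labels mvd_l1.x and mvd_l1.y for the two components of mvd_l1 are hypothetical and chosen only for illustration.

```python
from typing import List


def l1_syntax_elements(idc: str, embodiment: str) -> List[str]:
    """Second-direction motion elements signaled for a given idc (illustrative).

    embodiment -- "4-1" or "4-2"
    idc        -- "none", "x", "y" or "x&y" (assumed encoding of linear_MV_coding_idc)
    """
    if embodiment not in ("4-1", "4-2"):
        raise ValueError(f"unknown embodiment: {embodiment}")
    if idc == "none":
        # Conventional method: predictor flag and a full mvd.
        return ["mvp_l1_flag", "mvd_l1.x", "mvd_l1.y"]
    if idc == "x":
        # Only the non-linear vertical component carries an offset.
        return ["mvd_l1.y"]
    if idc == "y":
        # Only the non-linear horizontal component carries an offset.
        return ["mvd_l1.x"]
    if idc == "x&y":
        # 4-1 signals nothing; 4-2 signals an offset for both components.
        return [] if embodiment == "4-1" else ["mvd_l1.x", "mvd_l1.y"]
    raise ValueError(f"unknown linear_MV_coding_idc value: {idc}")
```

The two embodiments agree for "none", "x", and "y", and differ only in the "x&y" case.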

The syntax elements of Embodiment 4-2 are shown in Table 12 below.

[Table 12]

FIG. 16 illustrates the operation of determining linear_MV_coding_enabled_flag (S1640) and the operations of decoding and determining linear_MV_coding_idc (S1650 to S1680) as being performed after the operation of decoding ref_idx_l1 (S1630); however, operations S1640 to S1680 may instead be performed before the operation of decoding motion_info_l0 (S1610).

An example of deriving mv_l1 based on the fourth embodiment is illustrated in FIG. 17. The example shown in FIG. 17 corresponds to a case in which a linear relationship is established for the vertical axis component.

As shown in FIG. 17, mv_l1 may be derived on the premise that a linear relationship is established between mv_l0 (solid arrow) and mv_l1 (dashed arrow).

Since no linear relationship is established for the horizontal axis component, mv_l1 may be adjusted by moving the position indicated by the derived mv_l1 in the horizontal axis direction by the magnitude indicated by the offset vector mvd_l1. The final motion vector in the second direction, mvA_l1, may be derived by using the vertical axis component of the unmodified mv_l1 and the horizontal axis component of the adjusted second motion vector. The current block 620 may be predicted based on the reference block 630 indicated by mv_l0 and the reference block 640 indicated by the adjusted second motion vector (mvA_l1).
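The adjustment in this example can be written out component by component. The following is a sketch that assumes "applying" the offset means adding it to the derived vector; the subscripts x and y denote the horizontal and vertical axis components, and the numeric values are purely illustrative and do not come from FIG. 17.

```latex
\begin{aligned}
\mathrm{mvA\_l1}_x &= \mathrm{mv\_l1}_x + \mathrm{mvd\_l1}_x, \\
\mathrm{mvA\_l1}_y &= \mathrm{mv\_l1}_y, \\[4pt]
\text{e.g.}\quad \mathrm{mv\_l1} = (-8,\ 4),\ \mathrm{mvd\_l1}_x = 3
  \ &\Rightarrow\ \mathrm{mvA\_l1} = (-8 + 3,\ 4) = (-5,\ 4).
\end{aligned}
```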

Fifth embodiment

The fifth embodiment corresponds to a method of using a preset reference picture as the reference picture for deriving mv_l1. The preset reference picture is a reference picture designated in advance for use when a linear relationship of motion is established.

In the fifth embodiment, the reference picture information (ref_idx_l0 and ref_idx_l1) is not signaled in units of blocks but may be signaled at a high level. Here, the high level may correspond to one or more of a picture-level header, a tile group-level header, a slice header, a tile header, and/or a CTU header. The preset reference picture may be referred to as a "representative reference picture" or a "linear reference picture", and the reference picture information signaled at the high level may be referred to as "representative reference picture information" or "linear reference picture information". When a linear relationship of motion is established, the preset linear reference picture is used for each block.

The linear reference picture information signaled in the tile group header is shown in Table 13 below.

[Table 13]

In Table 13, linear_ref_idx_l0 and linear_ref_idx_l1 each represent the linear reference picture information signaled for the corresponding direction.

FIG. 18 illustrates an example in which a reference picture is specified either by signaling reference picture information for each block, as in the conventional method, or by using a linear reference picture, as proposed in the present invention.

The linear reference picture information (linear_ref_idx_l0 and linear_ref_idx_l1) may be signaled from the video encoding device to the video decoding device at a high level. The video decoding device may set the linear reference pictures (linear_ref_l0 and linear_ref_l1) by selecting, from the reference picture lists, the reference pictures indicated by the signaled linear reference picture information (linear_ref_idx_l0 and linear_ref_idx_l1).

When linear_MV_coding_enabled_flag indicates that the motion vector derivation function is activated (S1810), the video decoding device decodes linear_MV_coding_flag from the bitstream (S1820).

When linear_MV_coding_flag indicates that a linear relationship of motion is established (S1830), the video decoding device may use the preset linear reference pictures (linear_ref_l0 and linear_ref_l1) to derive the reference pictures (ref_l0 and ref_l1) used for deriving mv_l1 (S1840 and S1850). That is, the preset linear reference pictures (linear_ref_l0 and linear_ref_l1) may be set as the reference pictures (ref_l0 and ref_l1).

Meanwhile, when linear_MV_coding_enabled_flag does not indicate activation of the motion vector derivation function in operation S1810, or linear_MV_coding_flag does not indicate that a linear relationship of motion is established in operation S1830, the reference picture information (ref_idx_l0 and ref_idx_l1) may be signaled. The video decoding device may decode the reference picture information (ref_idx_l0 and ref_idx_l1) (S1860 and S1870) and set the reference pictures using the decoded information.
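The reference picture selection described for FIG. 18 can be sketched as follows. This is a minimal illustration, not the actual parsing process: the function name is hypothetical, the block-level bitstream is modeled as a plain list of (name, value) pairs, and reference pictures are represented by their indices in the reference picture lists.

```python
from typing import Dict, Iterator, List, Tuple


def decode_ref_pictures(enabled_flag: int,
                        block_syntax: List[Tuple[str, int]],
                        linear_ref: Tuple[int, int]) -> Dict[str, int]:
    """Choose ref_l0 and ref_l1 for one block (illustrative model of FIG. 18).

    enabled_flag -- value of linear_MV_coding_enabled_flag
    block_syntax -- block-level elements in parsing order, as (name, value)
    linear_ref   -- preset linear reference pictures set from the high level
    """
    values: Iterator[Tuple[str, int]] = iter(block_syntax)

    def read(name: str) -> int:
        # Consume the next element and check that it is the expected one.
        got, value = next(values)
        if got != name:
            raise ValueError(f"expected {name}, got {got}")
        return value

    # S1810/S1820: the block-level flag is parsed only when the tool is enabled.
    linear_flag = read("linear_MV_coding_flag") if enabled_flag else 0
    if linear_flag:
        # S1830 to S1850: reuse the preset linear reference pictures.
        ref_l0, ref_l1 = linear_ref
    else:
        # S1860 and S1870: fall back to per-block reference picture indices.
        ref_l0 = read("ref_idx_l0")
        ref_l1 = read("ref_idx_l1")
    return {"ref_l0": ref_l0, "ref_l1": ref_l1}
```

With the linear flag set, no per-block reference index is consumed, which is where the bit saving of this embodiment comes from.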

The method of setting a reference picture proposed by the present invention may be implemented in combination with the embodiments described above. FIG. 19 illustrates a form in which the proposed method of setting a reference picture is combined with Embodiment 3-1 described above.

For the first direction, when linear_MV_coding_enabled_flag indicates that the motion vector derivation function is activated (S1910), linear_MV_coding_flag is decoded (S1920). When linear_MV_coding_flag indicates that a linear relationship of motion is established, the preset linear reference picture (linear_ref_l0) may be derived as the reference picture (ref_l0) (S1940). On the other hand, when linear_MV_coding_enabled_flag does not indicate activation of the motion vector derivation function, or linear_MV_coding_flag does not indicate that a linear relationship of motion is established, the reference picture (ref_l0) may be set by using the reference picture information (ref_idx_l0) decoded from the bitstream (S1962).

When the derivation or setting of the reference picture for the first direction is completed, mvd_l0 and mvp_l0_flag are decoded (S1950), and mv_l0 may be derived using the decoded information (S1960).

For the second direction, when linear_MV_coding_flag indicates that a linear relationship of motion is established (S1970), the preset linear reference picture (linear_ref_l1) may be used to derive or set the reference picture (ref_l1) (S1972). On the other hand, when linear_MV_coding_flag does not indicate that a linear relationship of motion is established, the reference picture (ref_l1) may be set using the reference picture information (ref_idx_l1) decoded from the bitstream (S1974).

When the derivation or setting of the reference picture for the second direction is completed and linear_MV_coding_flag indicates that a linear relationship of motion is established (S1980), mv_l1 having a linear relationship with mv_l0 may be derived (S1982). On the other hand, when linear_MV_coding_flag does not indicate that a linear relationship of motion is established (S1980), mv_l1 may be derived (S1994) using mvd_l1 and mvp_l1_flag decoded from the bitstream (S1990 and S1992).
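To make the combined flow of FIG. 19 easier to follow, the sketch below lists the block-level elements a decoder would read for each combination of the two flags. The function name is hypothetical, and the relative order of elements decoded within a single step (for example mvd_l0 and mvp_l0_flag in S1950) is an assumption made for illustration.

```python
from typing import List


def fig19_block_syntax(enabled_flag: bool, linear_flag: bool) -> List[str]:
    """Block-level elements read in the FIG. 19 flow (illustrative only)."""
    order: List[str] = []
    if enabled_flag:
        order.append("linear_MV_coding_flag")    # S1920
    # The linear path applies only when the tool is enabled and the flag is set.
    use_linear = enabled_flag and linear_flag
    if not use_linear:
        order.append("ref_idx_l0")               # S1962
    order += ["mvd_l0", "mvp_l0_flag"]           # S1950
    if not use_linear:
        order.append("ref_idx_l1")               # S1974
        order += ["mvd_l1", "mvp_l1_flag"]       # S1990 and S1992
    return order
```

On the linear path only the flag and the first-direction mvd and predictor flag are read; both reference indices and all second-direction motion elements are skipped.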

The syntax elements of the above embodiment are shown in Table 14 below.

[Table 14]

FIG. 20 illustrates a form in which the method of setting a reference picture proposed by the present invention is combined with Embodiment 3-2 described above.

For the first direction, when linear_MV_coding_enabled_flag indicates that the motion vector derivation function is activated (S2010), linear_MV_coding_flag is decoded (S2020). When linear_MV_coding_flag indicates that a linear relationship of motion is established (S2030), the reference picture (ref_l0) may be derived or set using the preset linear reference picture (linear_ref_l0) (S2040). On the other hand, when linear_MV_coding_enabled_flag does not indicate activation of the motion vector derivation function (S2010), or linear_MV_coding_flag does not indicate that a linear relationship of motion is established (S2030), the reference picture (ref_l0) may be set using the reference picture information (ref_idx_l0) decoded from the bitstream (S2062).

When the derivation or setting of the reference picture for the first direction is completed, mvd_l0 and mvp_l0_flag are decoded (S2050), and mv_l0 may be derived using the decoded information (S2060).

For the second direction, when linear_MV_coding_flag indicates that a linear relationship of motion is established (S2070), the preset linear reference picture (linear_ref_l1) may be used to derive or set the reference picture (ref_l1) (S2072). On the other hand, when linear_MV_coding_flag does not indicate that a linear relationship of motion is established, the reference picture (ref_l1) may be set using the reference picture information (ref_idx_l1) decoded from the bitstream (S2074).

When the derivation or setting of the reference picture for the second direction is completed, mvd_l1 is decoded from the bitstream (S2080). As in Embodiment 3-2, mvd_l1 corresponds to either an offset vector or the mvd of the conventional method.

When linear_MV_coding_flag indicates that a linear relationship of motion is established (S2090), mv_l1 having a linear relationship with mv_l0 is derived (S2092), and mv_l1 may be adjusted by applying the offset vector (mvd_l1) to the derived mv_l1 (S2094). On the other hand, when linear_MV_coding_flag does not indicate that a linear relationship of motion is established (S2090), mv_l1 may be derived using mvp_l1_flag decoded from the bitstream (S2096 and S2098). In this process, the mvp_l1 indicated by mvp_l1_flag and the decoded mvd_l1 (the mvd of the conventional method) may be used.
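In the FIG. 20 flow, mvd_l1 is always decoded and only its interpretation changes with linear_MV_coding_flag. A minimal Python sketch of that last step follows. The function name and tuple representation are hypothetical, mv_l1_linear stands for the vector obtained from mv_l0 through the linear relationship, and both "applying the offset" and the conventional derivation are assumed to be simple component-wise additions.

```python
from typing import Optional, Tuple

MV = Tuple[int, int]  # (horizontal component, vertical component)


def derive_mv_l1_fig20(linear_flag: bool,
                       mvd_l1: MV,
                       mv_l1_linear: Optional[MV] = None,
                       mvp_l1: Optional[MV] = None) -> MV:
    """Final mv_l1 for the FIG. 20 flow (illustrative sketch)."""
    # Linear path: mvd_l1 is an offset on the linearly derived vector.
    # Conventional path: mvd_l1 is a difference on the predictor mvp_l1.
    base = mv_l1_linear if linear_flag else mvp_l1
    if base is None:
        raise ValueError("the base vector for the selected path is missing")
    return (base[0] + mvd_l1[0], base[1] + mvd_l1[1])
```

A zero mvd_l1 on the linear path leaves the linearly derived vector unchanged.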

Although exemplary embodiments of the present invention have been described for illustrative purposes, those skilled in the art will appreciate that various modifications and variations are possible without departing from the idea and scope of the invention. The exemplary embodiments have been described for the sake of brevity and clarity. Accordingly, one of ordinary skill in the art will understand that the scope of the present invention is not limited by the embodiments explicitly described above, but includes the claims and their equivalents.

Cross-reference to related applications

This application claims priority to Patent Application No. 10-2018-0171254, filed in Korea on December 27, 2018, and Patent Application No. 10-2019-0105769, filed in Korea on August 28, 2019, the entire contents of which are incorporated herein by reference.

Claims

1. A method of inter-predicting a current block using any one of a plurality of bidirectional prediction modes, the method comprising the following steps:

decoding enabling information indicating whether a first mode of the plurality of bidirectional prediction modes is allowed;

when the enabling information indicates that the first mode is allowed, decoding mode information at the block level of the current block from a bitstream, the mode information indicating whether to apply the first mode to the current block;

when the mode information indicates that the first mode is applied to the current block,

decoding, from the bitstream, first motion information including differential motion vector information and predicted motion vector information for the first motion vector and second motion information excluding at least a portion of the differential motion vector information and predicted motion vector information for the second motion vector; and

deriving the first motion vector based on the first motion information, and deriving the second motion vector based on at least a portion of the first motion information and based on the second motion information; and

predicting the current block using a reference block in a first reference picture indicated by the first motion vector and a reference block in a second reference picture indicated by the second motion vector,

wherein, in the first mode, the first reference picture and the second reference picture are determined at a higher level than the block level.

2. The method of claim 1, further comprising the step of: when the mode information indicates that the first mode is not to be applied to the current block,

decoding, from the bitstream, the first motion information and third motion information including the differential motion vector information and the predicted motion vector information for the second motion vector; and

deriving the first motion vector based on the first motion information and deriving the second motion vector based on the third motion information.

3. The method of claim 1, wherein when the enabling information indicates that the first mode is not activated, the mode information is not decoded from the bitstream and is set to indicate that the first mode is not applied.

4. The method of claim 1, wherein the enabling information is decoded at a sequence level, a picture level, a tile group level, or a slice level.

5. The method of claim 1, wherein the higher level is a picture level, a tile group level, a slice level, a tile level, or a coding tree unit level.

6. The method of claim 1, wherein the first reference picture and the second reference picture are determined based on a Picture Order Count POC difference between a reference picture included in a reference picture list and a current picture.

7. The method of claim 1, further comprising the step of adjusting the second motion vector, after deriving the second motion vector, by applying offset information included in the bitstream to the second motion vector,

wherein the current block is predicted by using a reference block indicated by the adjusted second motion vector in the second reference picture and a reference block indicated by the first motion vector in the first reference picture.

8. The method of claim 7, wherein the offset information is an offset vector with a position indicated by the second motion vector as an origin, and

the adjusting includes adjusting the second motion vector to a position indicated by the offset vector.

9. The method of claim 7, wherein the offset information is an offset index indicating any one of a plurality of preset offset vector candidates, and

the adjusting includes adjusting the second motion vector by applying an offset vector candidate indicated by the offset index to the second motion vector.

10. A video encoding method that uses any one of a plurality of bidirectional prediction modes to perform inter prediction on the current block, the method comprising the following steps:

encoding enabling information indicating whether a first mode of the plurality of bidirectional prediction modes is allowed;

when the enabling information indicates that the first mode is allowed, encoding mode information at the block level of the current block, the mode information indicating whether to apply the first mode to the current block; and

encoding and signaling first motion information including differential motion vector information and predicted motion vector information for the first motion vector and second motion information excluding at least a portion of the differential motion vector information and predicted motion vector information for the second motion vector, and

encoding a residual block, the residual block being a difference between the current block and a prediction block of the current block, wherein the prediction block is generated by using a reference block indicated by the first motion vector in a first reference picture and a reference block indicated by the second motion vector in a second reference picture,

wherein, in the first mode, the first reference picture and the second reference picture are determined at a higher level than the block level.

11. A method for transmitting a bitstream containing encoded video data, the method comprising the steps of:

generating the bitstream by encoding the current block using any one of a plurality of bidirectional prediction modes; and

transmitting the bitstream to a video decoding device,

wherein generating the bitstream includes:

encoding first motion information including differential motion vector information and predicted motion vector information for the first motion vector and second motion information excluding at least a portion of differential motion vector information and predicted motion vector information for the second motion vector, and