KR102702948B1

KR102702948B1 - Video generating device and method therefor

Info

Publication number: KR102702948B1
Application number: KR1020230160557A
Authority: KR
Inventors: 권도균; 유한; 양정석
Original assignee: 주식회사 에이펀인터렉티브
Priority date: 2023-11-20
Filing date: 2023-11-20
Publication date: 2024-09-05
Anticipated expiration: 2043-11-20
Also published as: WO2025110328A1

Abstract

An image generating device is provided. The image generating device includes: a background generating module that generates a first image from which the user has been removed, using background data generated by photographing a user and the appearance of an object interacting with the user; a character implementation module that generates a second image including the movement of a metaverse character, using user motion data related to the movement of the user; a marker removal module that generates a third image including a first marker-removed object related to a first marker attached to a surface of the object, using object motion data related to the movement of the object and the background data; and an image synthesis module that generates a final image from the second image and the third image, based on the first image.

Description

{Video generating device and method therefor}

The present invention relates to an image generating device and a method therefor. Specifically, the present invention relates to a device and method that track a user's motion, convert the user into a metaverse character, and generate an image in which the converted character interacts with objects other than the user without any sense of incongruity.

Producing metaverse character videos is a highly complex process. In conventional production, the creator builds each motion of a 2D or 3D computer graphics video by hand, which costs considerable time and money and leaves the transitions between motions lacking in realism.

To overcome these shortcomings, recent metaverse character image generating devices use various methods to make characters more realistic, including attempts to maximize character realism by driving the movements of a metaverse character with the movements of a real actor.

Korean Laid-Open Patent Publication No. 10-2023-0088226

An object of the present invention is to provide an image generating device and method that maximize the realism of an image in which a metaverse character appears.

Another object of the present invention is to provide an image generating device and method that maximize the realism of an image in which a metaverse character interacts with an object other than the user.

Another object of the present invention is to provide an image generating device and method that improve efficiency by filming the acting user and the objects other than the user in the same place and at the same time.

Another object of the present invention is to provide an image generating device and method that, when the motion of an acting user is converted into a character, generate a natural image in which the portions where the converted character and an object other than the user overlap show no sense of incongruity.

The objects of the present invention are not limited to those mentioned above. Other objects and advantages of the present invention that are not mentioned can be understood from the following description and will be understood more clearly from the embodiments of the present invention. It will also be readily apparent that the objects and advantages of the present invention can be realized by the means indicated in the claims and combinations thereof.

To solve the above problem, an image generating device according to some embodiments of the present invention includes: a background generating module that generates a first image from which the user's appearance is removed, using background data that includes the user's appearance; a character implementation module that generates a second image including a motion of a metaverse character matching the motion of the user, using user motion data related to the motion of the user; and an image synthesis module that generates a final image using the first image and the second image, wherein the background data includes an outline of the user and the appearance of an object interacting with the user.

In addition, the background generating module may include an object grouping module that generates outline information of the user from the background data received from an image capturing device, a user tracking module that generates user tracking data by tracking the user based on the outline information, and a user removal module that generates the first image using the user tracking data.

In addition, the user removal module may generate masking data corresponding to the outline information using the user tracking data, generate a user-removed image by removing the outline information from the background data using the masking data, and generate the first image using the background data and the user-removed image.
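
As an illustration only, and not as part of the disclosed device, the mask-then-fill flow described above can be sketched in a few lines of Python. The sketch assumes grayscale frames stored as nested lists and a boolean mask that is true inside the user's outline. The fill strategy (copying pixels from a reference frame of the same scene) is an assumption, since the text does not say how the removed region is filled, and the names `remove_user` and `fill_holes` are hypothetical.

```python
from typing import List, Optional

Frame = List[List[int]]    # grayscale pixel values
Mask = List[List[bool]]    # True where the user's outline covers a pixel


def remove_user(frame: Frame, mask: Mask) -> List[List[Optional[int]]]:
    """Build the user-removed image: masked pixels become None (holes)."""
    return [
        [None if covered else px for px, covered in zip(row, mask_row)]
        for row, mask_row in zip(frame, mask)
    ]


def fill_holes(removed: List[List[Optional[int]]], reference: Frame) -> Frame:
    """Fill holes from a reference frame of the same scene (assumed strategy)."""
    return [
        [ref if px is None else px for px, ref in zip(row, ref_row)]
        for row, ref_row in zip(removed, reference)
    ]


# The user (value 99) occupies two pixels of a 3x3 frame.
frame = [[10, 10, 10], [10, 99, 10], [10, 99, 10]]
mask = [[False, False, False], [False, True, False], [False, True, False]]
reference = [[10, 10, 10], [10, 12, 10], [10, 11, 10]]

first_image = fill_holes(remove_user(frame, mask), reference)
```

Keeping the hole representation explicit (`None`) separates the two steps, so a different fill method could be swapped in without touching the masking step.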

Alternatively, the user removal module may generate masking data corresponding to the outline information using the user tracking data, correct the masking data using the background data to generate a user-removed image, and generate the first image using the background data and the user-removed image.

In addition, the object grouping module may group pixels around each of a plurality of markers attached to the user according to a predetermined criterion, and connect the resulting pixel groups to one another to generate the outline information of the user.
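
The marker-centered grouping can be pictured with a small Python sketch. This is a hedged illustration under stated assumptions, not the disclosed algorithm: the "predetermined criterion" is assumed to be a fixed pixel radius around each marker, groups are "connected" simply by taking their union, and the outline is taken to be the pixels of that union that touch a pixel outside it. The function names `group_around` and `outline` are hypothetical.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def group_around(marker: Pixel, radius: int, height: int, width: int) -> Set[Pixel]:
    """Group the pixels within `radius` of one marker (assumed criterion)."""
    r0, c0 = marker
    return {
        (r, c)
        for r in range(max(0, r0 - radius), min(height, r0 + radius + 1))
        for c in range(max(0, c0 - radius), min(width, c0 + radius + 1))
        if (r - r0) ** 2 + (c - c0) ** 2 <= radius ** 2
    }


def outline(markers: List[Pixel], radius: int, height: int, width: int) -> Set[Pixel]:
    """Union the per-marker groups and keep only the boundary pixels."""
    region: Set[Pixel] = set()
    for marker in markers:
        region |= group_around(marker, radius, height, width)

    def on_boundary(p: Pixel) -> bool:
        r, c = p
        neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
        return any(n not in region for n in neighbours)

    return {p for p in region if on_boundary(p)}


# Two markers whose groups share the pixel (2, 3).
contour = outline([(2, 2), (2, 4)], radius=1, height=5, width=7)
```

In this example the two marker pixels themselves are interior, so they are excluded from `contour`, while the shared pixel between them lies on the boundary.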

In addition, the character implementation module may further use the user tracking data generated by the background generating module to generate the second image, wherein the background data includes at least one portion in which the appearance of the object overlaps the appearance of the user, and the second image may include an image of the metaverse character from which the overlapping portion has been removed.

To solve the above problem, an image generating method according to some embodiments of the present invention uses an image generating device that generates a metaverse character image from a user's motion, and includes: generating outline information of the user using background data that includes the appearance of the user and the appearance of an object interacting with the user; generating user tracking data by tracking the user based on the outline information; generating a first image using the user tracking data; generating a second image including a motion of a metaverse character matching the motion of the user, using user motion data related to the motion of the user; and generating a final image using the first image and the second image.

In addition, generating the first image may include generating masking data corresponding to the outline information using the user tracking data, generating a user-removed image by removing the outline information from the background data using the masking data, and generating the first image using the background data and the user-removed image.

Alternatively, generating the first image may include generating masking data corresponding to the outline information using the user tracking data, generating a user-removed image by correcting the masking data using the background data, and generating the first image using the background data and the user-removed image.

In addition, generating the second image may include generating the metaverse character using the user tracking data.

The image generating device and method of the present invention can express the interaction between a user and an object naturally by removing the portion where the user and the object overlap and then generating the character.

In addition, the image generating device and method of the present invention reflect the motion of the acting user in the movement and staging effects of the metaverse character, and can thereby maximize the realism of the character.

In addition, the image generating device and method of the present invention are economical because they save the time and cost that would otherwise be spent filming in separate places and at separate times.

In addition, the image generating device and method of the present invention can render objects other than the user without any sense of incongruity with the character, even after the user has been converted into the character.

In addition, the image generating device and method of the present invention can produce an image of maximized quality by rendering naturally the portions where the user and an object other than the user overlap.

In addition to the above, specific effects of the present invention are described below together with the specific details for carrying out the invention.

FIG. 1 is a diagram schematically illustrating a process of generating a final image using an image generating device according to some embodiments of the present invention.
FIG. 2 is a block diagram showing the configuration of the background generating module of FIG. 1.
FIG. 3 is a flowchart showing the process by which the background generating module of FIG. 1 generates the first image.
FIG. 4 is a diagram explaining each item of data of FIG. 1 in detail.
FIG. 5a is a diagram showing one embodiment of generating the outline information of FIG. 2.
FIG. 5b is a diagram showing another embodiment of generating the outline information of FIG. 2.
FIG. 6 is a diagram explaining the process by which the user tracking module of FIG. 2 generates user tracking data.
FIG. 7a is a diagram showing one embodiment of generating the first image in the user removal module of FIG. 2.
FIG. 7b is a diagram showing another embodiment of generating the first image in the user removal module of FIG. 2.
FIG. 8 is a diagram showing the process by which the first image is generated from the background data transmitted by the image capturing device of FIG. 1.
FIG. 9 is a block diagram showing the configuration of the character implementation module of FIG. 1.
FIG. 10 is a flowchart showing the process by which the character implementation module of FIG. 8 generates the second image.
FIG. 11 is a diagram showing the process by which the second image is generated from the user motion data transmitted by the motion capture device of FIG. 1.
FIG. 12 is a diagram illustrating the process of generating the second image when the user and the object overlap.
FIGS. 13 and 14 are diagrams explaining the process of generating a final image through an image synthesis module according to some embodiments of the present invention.
FIG. 15 is a diagram explaining the process of generating a final image through an image synthesis module according to some other embodiments of the present invention.

The terms and words used in this specification and the claims should not be interpreted as limited to their ordinary or dictionary meanings. In accordance with the principle that an inventor may define the concept of a term or word in order to best explain his or her own invention, they should be interpreted with meanings and concepts consistent with the technical idea of the present invention. In addition, the embodiments described in this specification and the configurations shown in the drawings are only one embodiment in which the present invention is realized and do not represent the entire technical idea of the present invention, so it should be understood that there may be various equivalents, modifications, and applicable examples that could replace them at the time of filing this application.

Terms such as first, second, A, and B used in this specification and the claims may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be called a second component, and similarly a second component may be called a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of the plurality of related listed items.

The terms used in this specification and the claims are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" should be understood as not excluding in advance the presence or possible addition of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification.

Unless otherwise defined, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs.

Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the relevant art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this application.

In addition, each configuration, process, step, or method included in each embodiment of the present invention may be shared among embodiments to the extent that they do not technically contradict one another.

The present invention relates to a metaverse character image generating device, where "metaverse character image generation" may mean the production of an image that includes the movement and staging effects of a metaverse character used in a metaverse, virtual reality, augmented reality, or extended reality. In this specification, a "metaverse character" may include a virtual digital idol character, a game character, or the like, implemented in 2D or 3D. A "metaverse image" is an image that includes a metaverse character, and may mean an image that includes not only the metaverse character but also a metaverse background, structures, and the like.

Hereinafter, an image generating device and a method therefor according to some embodiments of the present invention are described with reference to FIGS. 1 to 15.

FIG. 1 is a diagram schematically illustrating a process of generating a final image using an image generating device according to some embodiments of the present invention. FIG. 2 is a block diagram showing the configuration of the background generating module of FIG. 1. FIG. 3 is a flowchart showing the process by which the background generating module of FIG. 1 generates the first image. FIG. 4 is a diagram explaining each item of data of FIG. 1 in detail.

Referring to FIGS. 1 to 4, an image generating device (10) according to some embodiments of the present invention may include a background generating module (100), a character implementation module (200), and an image synthesis module (300).

The background generating module (100) may receive background data (BD) from an image capturing device (20). The image capturing device (20) may capture video to generate the background data (BD). The background data (BD) may include the appearance of the user. That is, the background data (BD) may capture the state of the user while moving.

The background data (BD) generated by the image capturing device (20) may be provided to the background generating module (100). The background generating module (100) may generate a first image (V1) using the background data (BD). The first image (V1) may be an image in which the user's appearance has been removed from the background data (BD). The first image (V1) is described in detail later. The background generating module (100) may provide the generated first image (V1) to the image synthesis module (300).

The character implementation module (200) may receive user motion data (MD) from a motion capture device (30). The motion capture device (30) may sense the user's movement to generate the user motion data (MD). The user motion data (MD) may include data related to the user's motion. Specifically, the user motion data (MD) may include data related to the user's direction and speed of movement.
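
To make the direction and speed components of such motion data concrete, the following minimal Python sketch derives both from two successive position samples of one tracked point. It is an illustration under assumptions: the text does not specify the data format, so 3D positions and a sampling interval `dt` in seconds are assumed, and the function name `direction_and_speed` is hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def direction_and_speed(p0: Vec3, p1: Vec3, dt: float) -> Tuple[Vec3, float]:
    """Return (unit direction, speed) for a point moving from p0 to p1 in dt seconds."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    delta = tuple(b - a for a, b in zip(p0, p1))
    distance = math.sqrt(sum(d * d for d in delta))
    if distance == 0.0:
        # A stationary point has no defined direction; report a zero vector.
        return (0.0, 0.0, 0.0), 0.0
    direction = tuple(d / distance for d in delta)
    return direction, distance / dt


# A point that moves 5 units in 0.5 s.
direction, speed = direction_and_speed((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), 0.5)
```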

For example, the motion capture device (30) may sense markers attached to the user to generate the user motion data (MD). The user motion data (MD) may include at least one of body motion data, facial motion data, and hand motion data. In this specification the image capturing device (20) and the motion capture device (30) are presented as separate components, but the embodiments are not limited thereto. When the motion capture device (30) uses optical markers, the image capturing device (20) and the motion capture device (30) may be one and the same component. For convenience of explanation, however, this specification treats the image capturing device (20) and the motion capture device (30) as distinct.

The character implementation module (200) may use the user motion data (MD) to generate a second image (V2) that includes the movement of the metaverse character. The metaverse character may correspond to the user. Specifically, the motion of the metaverse character may match the motion of the user contained in the user motion data (MD). In other words, the second image (V2) may be an image of a metaverse character whose movement corresponds to the user motion data (MD).

In some embodiments, the second image (V2) may be a metaverse character image from which the region where the user (UM) and an object (OB) interact has been removed. The object (OB) may mean any entity other than the user (UM). In other words, the second image (V2) includes a metaverse character image, and the portion where the user (UM) and the object (OB) overlap each other may be removed from that metaverse character image. The second image (V2) is described in detail later. The character implementation module (200) may provide the second image (V2), generated using the user motion data (MD) and the user tracking data (TD), to the image synthesis module (300).
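
One way to picture removing the overlapping portion from the character image is as an operation on the character's transparency. The Python sketch below is only an illustration under assumptions that the text does not state: the rendered character is assumed to carry an alpha matte, the overlap is assumed to be available as a boolean mask, and the name `cut_overlap` is hypothetical.

```python
from typing import List

AlphaMatte = List[List[float]]  # 1.0 = character visible, 0.0 = transparent
Mask = List[List[bool]]         # True where the object (OB) overlaps the user (UM)


def cut_overlap(character_alpha: AlphaMatte, overlap: Mask) -> AlphaMatte:
    """Make the character transparent wherever the object overlaps the user."""
    return [
        [0.0 if hidden else a for a, hidden in zip(a_row, o_row)]
        for a_row, o_row in zip(character_alpha, overlap)
    ]


alpha = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
overlap = [[False, True, False], [False, True, True]]
trimmed = cut_overlap(alpha, overlap)
```

With the overlap cut out of the character, whatever the first image shows at those pixels (for example, the object) remains visible after the two images are combined.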

The image synthesis module (300) may generate a final image (VF) using the first image (V1) received from the background generating module (100) and the second image (V2) received from the character implementation module (200).
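
The text does not specify how the two images are combined. As a hedged sketch, the Python below uses ordinary alpha ("over") compositing, a common technique swapped in here for illustration: the second image (V2) is laid over the first image (V1) according to a per-pixel alpha value. The single-channel frames and the name `composite` are assumptions.

```python
from typing import List

Frame = List[List[float]]       # single-channel pixel values
AlphaMatte = List[List[float]]  # per-pixel opacity of the foreground, 0.0 to 1.0


def composite(v1: Frame, v2: Frame, alpha: AlphaMatte) -> Frame:
    """Lay the character image (v2) over the background image (v1)."""
    return [
        [a * fg + (1.0 - a) * bg for bg, fg, a in zip(bg_row, fg_row, a_row)]
        for bg_row, fg_row, a_row in zip(v1, v2, alpha)
    ]


v1 = [[100.0, 100.0]]  # first image: background with the user removed
v2 = [[200.0, 200.0]]  # second image: rendered metaverse character
alpha = [[1.0, 0.0]]   # character opaque on the left pixel only

vf = composite(v1, v2, alpha)
```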

According to some embodiments, the image generating device (10), the image capturing device (20), and the motion capture device (30) may exchange data with one another over a network. The network may include networks based on wired Internet technology, wireless Internet technology, and short-range communication technology. Wired Internet technology may include, for example, at least one of a local area network (LAN) and a wide area network (WAN).

Wireless Internet technology may include, for example, at least one of Wireless LAN (WLAN), Digital Living Network Alliance (DLNA), Wireless Broadband (WiBro), Worldwide Interoperability for Microwave Access (WiMAX), High Speed Downlink Packet Access (HSDPA), High Speed Uplink Packet Access (HSUPA), IEEE 802.16, Long Term Evolution (LTE), Long Term Evolution-Advanced (LTE-A), Wireless Mobile Broadband Service (WMBS), and 5G New Radio (NR). However, the present embodiment is not limited thereto.

Short-range communication technology may include, for example, at least one of Bluetooth, Radio Frequency Identification (RFID), infrared communication (Infrared Data Association: IrDA), Ultra-Wideband (UWB), ZigBee, Near Field Communication (NFC), Ultra Sound Communication (USC), Visible Light Communication (VLC), Wi-Fi, Wi-Fi Direct, and 5G New Radio (NR). However, the present embodiment is not limited thereto.

The image generating device (10), the image capturing device (20), and the motion capture device (30), which communicate over the network, may comply with the technical standards and standard communication methods for mobile communication. For example, the standard communication methods may include at least one of Global System for Mobile communication (GSM), Code Division Multiple Access (CDMA), CDMA2000, Evolution-Data Optimized (EV-DO), Wideband CDMA (WCDMA), High Speed Downlink Packet Access (HSDPA), High Speed Uplink Packet Access (HSUPA), Long Term Evolution (LTE), Long Term Evolution-Advanced (LTE-A), and 5G New Radio (NR). However, the present embodiment is not limited thereto.

The background generating module (100) may include an object grouping module (110), a user tracking module (120), and a user removal module (130). The object grouping module (110) may receive the background data (BD) from the image capturing device (20).

For example, the image capturing device (20) may generate the background data (BD) by filming the entire scene, including the acting user (UM), the object (OB) interacting with the user (UM), and the place where the user (UM) is being filmed.

The background data (BD) may include the user (UM). Specifically, the background data (BD) may include the appearance of the user (UM), where the appearance of the user (UM) means how the user (UM) looks as filmed by the image capturing device (20). The user (UM) may be the person performing. Specifically, the user (UM) may be a person who acts out the lines and actions of the metaverse character.

In addition, the background data (BD) may include the object (OB). The object (OB) may mean an entity other than the user (UM). The object (OB) may be a person or a thing, and may include any other tangible body that interacts with the user (UM).

Meanwhile, the object grouping module (110) may generate outline information (OI) of the user (UM) using the background data (BD) (S100). The outline information (OI) may be data that distinguishes the user (UM) from other entities. The outline information (OI) may be generated from the appearance of the user (UM) included in the background data (BD). In other words, the outline information (OI) may be a boundary line that separates the appearance of the user (UM) from other entities.

In other words, the object grouping module (110) may generate the outline information (OI) by distinguishing the user (UM) from the object (OB) in the background data (BD). That is, the object grouping module (110) may find and segment entities in the background data (BD). Specifically, the object grouping module (110) may find entities in the background data (BD) and determine which of the found entities is the user (UM) and which is the object (OB), thereby distinguishing the user (UM) from the object (OB). That is, the object grouping module (110) may find and segment the user (UM) and the object (OB) in the background data (BD) to generate the outline information (OI).

The user tracking module (120) may generate user tracking data (TD) that tracks the user (UM) based on the outline information (OI) (S110). The user tracking data (TD) may include changes in the outline information (OI) over time. In other words, the user tracking data (TD) may be data that collects the outline information (OI) of the user (UM) for each point in time. In addition, the user tracking module (120) may transmit the user tracking data (TD) to the user removal module (130). The user tracking module (120) is described in detail later.

The user removal module (130) may receive the user tracking data (TD) from the user tracking module (120). The user removal module (130) may generate a first image (V1) using the user tracking data (TD) (S120). That is, the user removal module (130) may remove the appearance of the user (UM) using the user tracking data (TD). In other words, the first image (V1) may be an image from which the appearance of the user (UM) has been removed and which includes the appearance of the object (OB).

According to some embodiments, the user removal module (130) may generate user-removed images (DUa, DUb) using the background data (BD) and the user tracking data (TD). Subsequently, the user removal module (130) may generate the first image (V1) using the user-removed images (DUa, DUb). The first image (V1) is described in detail later.

FIG. 5a is a diagram showing one embodiment of generating the outline information of FIG. 2. FIG. 5b is a diagram showing another embodiment of generating the outline information of FIG. 2.

Referring to FIGS. 2, 5a, and 5b, the object grouping module (110) may group the background data (BD) by entity using a pre-trained artificial intelligence algorithm. In other words, the object grouping module (110) may receive the background data (BD) from the image capturing device (20) and input the received background data (BD) into the pre-trained artificial intelligence algorithm, thereby segmenting and grouping the entities included in the background data (BD). For example, the object grouping module (110) may be trained by supervised learning using a dataset that includes images and annotations for the images. As another example, the object grouping module (110) may be trained by self-supervised learning or unsupervised learning using a dataset that includes images. However, this is an exemplary description, and the embodiments are not limited thereto. That is, the image segmentation technique used in the object grouping module (110) may be any of various known artificial intelligence algorithms.

According to some embodiments, the object grouping module (110) may generate the outline information (OI) using a deep learning-based object detection algorithm. For example, at least one of YOLO (You Only Look Once), a CNN (Convolutional Neural Network), and SSD (Single Shot MultiBox Detector) may be used as the deep learning-based object detection algorithm, but the embodiments are not limited to a specific image segmentation algorithm.

According to some embodiments, the object grouping module (110) may segment each entity included in the background data (BD). Specifically, the object grouping module (110) may segment the user (UM), the object (OB), and other items such as the chair on which the user (UM) sits and the chair on which the object (OB) sits into separate entities. The shape of each entity may be stored in the dataset in the form of an image or text. However, the present embodiment is not limited thereto.

According to some embodiments, the object grouping module (110) may use markers. Specifically, the object grouping module (110) may group entities using markers attached to the user (UM). The markers may be those attached for motion capture of the user. The object grouping module (110) may group similar pixels with reference to an attached marker. For example, the object grouping module (110) may evaluate the similarity of pixels near a first marker (MK1) and group the similar pixels. That is, the object grouping module (110) may determine whether pixels are similar, starting from the pixels nearest the first marker (MK1) attached to the head. Subsequently, the object grouping module (110) may group the pixels determined to be similar and segment them as the head of the user.

In addition, the object grouping module (110) may evaluate the similarity of pixels near a second marker (MK2) attached to the torso. Subsequently, the object grouping module (110) may group the pixels determined to be similar and segment them as the torso of the user. Similarly, the object grouping module (110) may group similar pixels around a marker 3a (MK3a) and a marker 3b (MK3b) attached to the arms of the user and segment them as the arms of the user. The object grouping module (110) may also group similar pixels around a marker 4a (MK4a) and a marker 4b (MK4b) attached to the legs of the user and segment them as the legs of the user. Subsequently, the object grouping module (110) may generate the outline information (OI) of the user based on the segmented head, torso, arms, and legs. Specifically, the object grouping module (110) may generate the outline information (OI) by connecting the boundaries of the grouped similar pixels.
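The marker-based grouping described above resembles seeded region growing. The following is a minimal sketch under stated assumptions, not the patented implementation: the image is a small grayscale grid, "similar" is taken to mean an intensity within a tolerance of the marker pixel, and the function names `grow_region` and `boundary` are hypothetical.

```python
from collections import deque

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))

def grow_region(image, seed, tol):
    """Collect pixels connected to `seed` whose intensity is within `tol`
    of the seed pixel, visiting the nearest pixels first (breadth-first)."""
    rows, cols = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_val) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

def boundary(region):
    """Pixels of the region with at least one 4-neighbour outside it."""
    return {(r, c) for r, c in region
            if any((r + dr, c + dc) not in region for dr, dc in NEIGHBOURS)}

# Toy frame: a bright 3x3 "head" on a dark background, marker at its centre.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 9, 9, 9, 0],
    [0, 0, 0, 0, 0],
]
head = grow_region(image, seed=(2, 2), tol=1)
outline = boundary(head)
```

Running the same routine once per marker (head, torso, arms, legs) and taking the boundary of the union of the resulting regions would give one possible form of the outline information (OI).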

FIG. 6 is a diagram for explaining the process by which the user tracking module of FIG. 2 generates the user tracking data.

Referring to FIGS. 2 and 6, the user tracking module (120) may receive the outline information (OI) from the object grouping module (110). The user tracking module (120) may generate the user tracking data (TD) based on the received outline information (OI). The user tracking data (TD) may include the change over time of the moving user (UM). In other words, the user tracking data (TD) may be data obtained by tracking the area that contains the outline information (OI) of the user (UM). That is, the user tracking module (120) may generate the user tracking data (TD) by arranging the outline information (OI) in chronological order.
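As an illustration of arranging outline information in time order, the sketch below assumes each frame supplies a timestamp and a set of outline pixels, and records a bounding box as "the area that contains the outline". The record layout and the name `build_tracking_data` are assumptions made for this example only.

```python
def build_tracking_data(outlines):
    """Sort per-frame outlines by timestamp and attach a bounding box.

    outlines: iterable of (timestamp, set of (row, col)) pairs, in any order.
    Returns a list of dicts ordered by time.
    """
    tracking = []
    for t, pixels in sorted(outlines, key=lambda item: item[0]):
        rows = [r for r, _ in pixels]
        cols = [c for _, c in pixels]
        tracking.append({
            "t": t,
            "outline": pixels,
            # (top, left, bottom, right) of the area containing the outline
            "bbox": (min(rows), min(cols), max(rows), max(cols)),
        })
    return tracking

# Frames arrive out of order; the user moves one pixel to the right.
frames = [
    (0.04, {(1, 2), (2, 2), (2, 3)}),
    (0.00, {(1, 1), (2, 1), (2, 2)}),
]
td = build_tracking_data(frames)
```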

According to some embodiments, the user tracking module (120) may not be distinct from the object grouping module (110). In other words, the user tracking module (120) may generate the user tracking data (TD) using the background data (BD) that includes the appearance of the user. That is, the user tracking module (120) may track the user (UM) using an artificial intelligence algorithm.

FIG. 7a is a diagram showing one embodiment of generating the first image in the user removal module of FIG. 2. FIG. 7b is a diagram showing another embodiment of generating the first image in the user removal module of FIG. 2.

Referring to FIGS. 2, 6, 7a, and 7b, the user removal module (130) may use the user tracking data (TD) to generate masking data (MD) corresponding to the outline information (OI). The masking data (MD) may include a partially transparent layer. Specifically, the masking data (MD) may include a layer that is opaque in the user's area (that is, the area inside the user's outline) and transparent outside the user's area. However, the embodiments of the present invention are not limited thereto.
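A layer that is opaque inside the outline and transparent outside it can be rasterised from a closed outline. The sketch below assumes the outline is available as a polygon of (x, y) vertices and uses the standard even-odd ray-casting test at pixel centres; the polygon representation and the names `point_in_polygon` and `make_mask` are assumptions for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray-casting test: is (x, y) inside the closed polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def make_mask(width, height, polygon):
    """Alpha layer: 255 (opaque) inside the outline, 0 (transparent) outside.
    Each pixel is sampled at its centre."""
    return [[255 if point_in_polygon(c + 0.5, r + 0.5, polygon) else 0
             for c in range(width)]
            for r in range(height)]

# A square outline covering the middle 2x2 pixels of a 4x4 frame.
mask = make_mask(4, 4, [(1, 1), (3, 1), (3, 3), (1, 3)])
```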

Meanwhile, the user removal module (130) may generate the first image (V1a, V1b) using the background data (BD) and the user tracking data (TD). Here, the user-removed image (DUa) may be an image obtained by removing the user's area from the background data (BD), and the portion from which the user's appearance has been removed may be empty, with no data, or filled with a single color.

According to some embodiments, the user removal module (130) may combine the masking data (MD) with the background data (BD). That is, the user removal module (130) may generate the user-removed image (DUa), from which the user's area has been removed. Using the masking data (MD), the user removal module (130) may remove the area delimited by the outline information (OI) from the background data (BD) to generate the user-removed image (DUa). Subsequently, the user removal module (130) may correct the portion from which the user's area was removed so that it matches the surrounding background area, thereby generating the first image (V1a).
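The mask-then-correct sequence can be sketched with a deliberately simple fill. The example below is not the correction method of the embodiments; it substitutes iterative averaging of known 4-neighbours for whatever correction is actually used, on a grayscale grid, and the names `remove_user` and `fill_from_surroundings` are hypothetical.

```python
def remove_user(background, mask):
    """User-removed image: pixels under an opaque mask become None (no data)."""
    return [[None if m else px for px, m in zip(b_row, m_row)]
            for b_row, m_row in zip(background, mask)]

def fill_from_surroundings(image):
    """Fill empty pixels layer by layer with the mean of their known
    4-neighbours, working inward from the edge of the hole."""
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])
    while any(px is None for row in img for px in row):
        updates = {}
        for r in range(rows):
            for c in range(cols):
                if img[r][c] is not None:
                    continue
                known = [img[nr][nc]
                         for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                         if 0 <= nr < rows and 0 <= nc < cols
                         and img[nr][nc] is not None]
                if known:
                    updates[(r, c)] = sum(known) / len(known)
        if not updates:
            raise ValueError("no known pixels to fill from")
        for (r, c), value in updates.items():
            img[r][c] = value
    return img

# Uniform background (10) with the user (99) in the centre pixel.
background = [[10, 10, 10], [10, 99, 10], [10, 10, 10]]
mask = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
du_a = remove_user(background, mask)
v1_a = fill_from_surroundings(du_a)
```

On real footage a neighbour average blurs texture, which is presumably why learned or patch-based correction would be preferred in practice.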

As another example, the user removal module (130) may correct the masking data (MD) using the background data (BD). That is, the user-removed image (DUb) may be an image obtained by correcting the user's area included in the masking data (MD). Specifically, the user removal module (130) may use the background data (BD) to correct the portion from which the user's area was removed so that it matches the background data (BD) around the user's area, thereby generating the user-removed image (DUb). Subsequently, the user removal module (130) may combine the user-removed image (DUb) with the background data (BD) to generate the first image (V1b). However, the embodiments of the present invention are not limited thereto.

Finally, the user removal module (130) may generate the first image (V1a, V1b). The first image (V1a, V1b) may be an image in which the area from which the user was removed blends with the surrounding area, as if the user had never been present. In other words, the first image (V1a, V1b) may be an image in which the user's area has been removed and the removed area has been corrected.

Next, the user removal module (130) may generate the first image (V1a, V1b) using the background data (BD) and the user-removed images (DUa, DUb).

That is, the user-removed images (DUa, DUb) may be images obtained by removing the user from the background data (BD). In other words, based on the user tracking data (TD), the user removal module (130) may generate the user-removed images (DUa, DUb) by removing the entire shape of the user, including the user's outline, from the background data (BD). The user removal module (130) may generate the first image (V1) using the user-removed images (DUa, DUb).

According to some embodiments, the user removal module (130) may generate the first image (V1a, V1b) from the user-removed images (DUa, DUb) using image synthesis or deep learning techniques. For example, the user removal module (130) may generate the first image (V1a, V1b) by correcting the user-removed images (DUa, DUb) using GANs (Generative Adversarial Networks) or CNNs (Convolutional Neural Networks). In this case, the deep learning model may be trained using the background information surrounding the removed user's area.

As another example, before the background data (BD) is generated, the user removal module (130) may generate image data in an environment in which the user is not present, and may generate the first image (V1) using this image data. That is, the user removal module (130) may generate the first image (V1a, V1b) by compositing the pre-captured image data into the removed user's area of the user-removed images (DUa, DUb). However, the use of deep learning or image synthesis techniques by the user removal module (130) is merely exemplary, and the present invention is not limited thereto. The user removal module (130) may transmit the first image (V1) generated using the background data (BD) to the image synthesis module (300).
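Compositing with footage shot in advance without the user (often called a clean plate) amounts to a per-pixel selection. The sketch below assumes the pre-captured frame is aligned pixel for pixel with the current frame, which the text does not state; `composite_clean_plate` is a hypothetical name.

```python
def composite_clean_plate(frame, mask, clean_plate):
    """Take the pre-captured pixel wherever the mask marks the user's area,
    and the current frame's pixel everywhere else."""
    return [[plate_px if m else px
             for px, m, plate_px in zip(f_row, m_row, p_row)]
            for f_row, m_row, p_row in zip(frame, mask, clean_plate)]

frame = [[10, 99, 12]]        # 99 is the user
mask = [[0, 255, 0]]          # opaque over the user's area
clean_plate = [[10, 11, 12]]  # same scene captured without the user
v1 = composite_clean_plate(frame, mask, clean_plate)
```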

FIG. 8 is a diagram showing the process by which the first image is generated from the background data transmitted by the image capturing device of FIG. 1.

Referring to FIGS. 1, 4, and 8, the image capturing device (20) may capture the user (UM) and the object (OB) in a given space. The given space may be a chroma key studio. Subsequently, the image capturing device (20) may transmit the background data (BD) obtained from the captured footage to the background generation module (100). The background generation module (100) may generate the first image (V1) from the background data (BD). In this case, the background of the first image (V1) may be the same as the space in which the footage was captured.

In addition, the background generation module (100) may use pre-captured image data as the background. The background generation module (100) may remove the entire shape of the user, including the outline of the user (UM), from the background data (BD). In doing so, the background generation module (100) may also remove the portion of the background behind the user's shape. The background generation module (100) may generate the first image (V1) by correcting the removed background portion so that it matches the surrounding background area adjacent to it. Accordingly, only the user (UM) is removed from the first image (V1), while the object (OB) and the rest of the background remain unchanged.

FIG. 9 is a block diagram showing the configuration of the character implementation module of FIG. 1. FIG. 10 is a flowchart showing the process by which the character implementation module of FIG. 9 generates a second image.

Referring to FIGS. 9 and 10, the character implementation module (200) receives user motion data (MD) from the motion capture device (30) (S200). The character implementation module (200) may include a skeleton generation module (210), a retargeting module (220), and a rendering module (230). According to some embodiments, the skeleton generation module (210) may receive the user motion data (MD) from the motion capture device (30).

The motion capture device (30) may record the user's movements as data through sensors and/or markers attached to the user's body. In other words, the motion capture device (30) may sense the user's movements and generate the user motion data (MD). The user motion data (MD) may be data representing the movement of the user's skeleton and muscles. The user motion data (MD) may include the user's direction and speed of movement.

In addition, the motion capture device (30) may record the user motion data (MD) for each part of the user's body. For example, the motion capture device (30) may include a body capture module that records data on the user's body motion, a facial capture module that records data on the user's facial expressions, and a hand capture module that records the user's hand movements. Specifically, the body capture module may sense the movement of the user's body, such as the head, neck, arms, torso, and legs, and record it as data. For example, the body capture module may sense the movement of the head, neck, arms, torso, and legs when the user performs a specific action (walking, running, crawling, sitting, standing, lying down, dancing, fighting, kicking, swinging the arms, and the like). The body capture module may sense markers attached to the user's body, but the present embodiment is not limited thereto.

Meanwhile, the facial capture module may sense the movement of the user's face and record it as data. For example, the facial capture module may sense the movement of the face in a specific expression of the user (crying, smiling, surprised, angry, contemptuous, regretful, loving, disgusted, and the like). The facial capture module may use data sensed from markers attached to the user's face and may additionally use image processing of the user's facial expression, but the present embodiment is not limited thereto.

According to some embodiments, the hand capture module may record data on the user's hand movements. For example, the hand capture module may sense the movement of the finger joints during a hand movement of the user (curling the fingers, spreading the fingers, and the like). The hand capture module may be implemented as a special glove or wearable device capable of sensing the movement of the finger joints, but the present embodiment is not limited thereto.

That is, the motion capture device (30) may use at least one of the body capture module, the facial capture module, and the hand capture module to generate user motion data (MD) related to at least one of the user's body movements, facial expressions, and hand movements.

Meanwhile, the motion capture device (30) may be implemented as an optical, magnetic, and/or inertial motion capture device. An optical motion capture device may include markers and cameras. It may be a device that images the movement of a user wearing markers with one or more cameras and generates the user motion data (MD) by computing the three-dimensional coordinates of the markers attached to the user through triangulation. A magnetic motion capture device may be a device that generates the user motion data (MD) by attaching sensors capable of measuring a magnetic field to the user's joints and the like, and then measuring movement from the change in the magnetic field at each sensor near a magnetic field generator. An inertial motion capture device may be a device that generates the user motion data (MD) by attaching inertial sensors, including an acceleration sensor, a gyro sensor, and a geomagnetic sensor, to the user's joints and the like, and reading the user's movement, rotation, and/or orientation. However, the optical, magnetic, and inertial motion capture devices described above are only examples of how the motion capture device (30) according to some embodiments of the present invention may be implemented, and the embodiments are not limited thereto. For example, the motion capture device (30) may generate the user motion data (MD) using image processing based on artificial intelligence, either alone or together with an optical, magnetic, and/or inertial motion capture device. In the following, for convenience of explanation, the motion capture device (30) is assumed to be an optical motion capture device including markers and cameras.
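For the optical case, recovering a marker's three-dimensional position from two cameras can be sketched as finding the point closest to both viewing rays. The example assumes each camera is already calibrated, so its centre and the direction of the ray toward the marker are known in a common coordinate frame; `triangulate` and its parameter names are illustrative only.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangulate(c1, d1, c2, d2):
    """Midpoint of the shortest segment between two camera rays.

    c1, c2: camera centres; d1, d2: ray directions toward the marker.
    """
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; marker depth is not observable")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))

# Two cameras 4 units apart, both looking at a marker at (1, 2, 3).
marker = triangulate((0, 0, 0), (1, 2, 3), (4, 0, 0), (-3, 2, 3))
```

With noisy measurements the two rays do not intersect exactly, which is why the midpoint of their closest approach is used rather than an exact intersection.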

Next, the skeleton generation module (210) may generate skeleton data using the user motion data (MD) obtained from the markers attached to the user's joints and the like (S210). The skeleton data may include data on the size, position, and orientation of the user's bones.
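One way to picture skeleton data is as one record per bone, derived from the markers at its two ends. The sketch assumes a bone runs between two joint markers with known 3D positions and takes "position" to be the bone's midpoint; that reading, the record fields, and the name `bone_from_markers` are assumptions.

```python
import math

def bone_from_markers(parent, child):
    """Size, position, and direction of the bone between two joint markers."""
    vec = tuple(c - p for p, c in zip(parent, child))
    length = math.sqrt(sum(v * v for v in vec))
    if length == 0:
        raise ValueError("markers coincide; bone direction is undefined")
    return {
        "size": length,
        "position": tuple((p + c) / 2 for p, c in zip(parent, child)),
        "direction": tuple(v / length for v in vec),  # unit vector
    }

# Hypothetical shoulder and elbow marker positions.
upper_arm = bone_from_markers((0.0, 0.0, 0.0), (0.0, 3.0, 4.0))
```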

According to some embodiments, the character implementation module (200) may receive the user tracking data (TD) from the background generation module (100). The character implementation module (200) may modify the skeleton data using the received user tracking data (TD) (S220). That is, the modified skeleton data (SD) may be data obtained by modifying the skeleton data using the user tracking data (TD).

Next, the retargeting module (220) may receive the modified skeleton data (SD) from the skeleton generation module (210). The retargeting module (220) may generate rigging data (RD) by rigging the modified skeleton data (SD) (S230). Here, rigging may refer to the process of applying movement to a 3D character by applying structures called joints. Specifically, it may refer to the process of connecting the bones using the modified skeleton data (SD) and assigning weights to, for example, the direction and angle in which each bone can move, so that the character can move as naturally as a human. The retargeting module (220) may transmit the rigging data (RD) generated using the modified skeleton data (SD) to the rendering module (230).
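The text does not specify how the per-bone constraints are represented. As one possible reading, the sketch below models a rig as a planar chain of bones, each with an allowed angle range relative to its parent, and computes joint positions by forward kinematics after clamping the requested angles. The 2D simplification, the limit format, and the name `pose_chain` are all assumptions.

```python
import math

def clamp(value, low, high):
    return max(low, min(high, value))

def pose_chain(bones, angles):
    """Forward kinematics for a planar bone chain with joint limits.

    bones: list of (length, min_deg, max_deg), angles relative to the parent.
    angles: requested joint angles in degrees, one per bone.
    Returns the list of joint positions, starting at the root (0, 0).
    """
    x = y = heading = 0.0
    joints = [(x, y)]
    for (length, low, high), angle in zip(bones, angles):
        heading += math.radians(clamp(angle, low, high))
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        joints.append((x, y))
    return joints

# Upper arm may swing -90..90 degrees; the elbow may only bend 0..150 degrees,
# so a requested -45 degree (hyperextended) elbow is clamped to 0.
arm = [(1.0, -90, 90), (1.0, 0, 150)]
joints = pose_chain(arm, [90, -45])
```

Clamping keeps captured motion within a plausible range for the character, which is one way a rig can keep movement looking natural.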

The rendering module (230) may implement the character by rendering the character's appearance based on the rigging data (RD) (S240). Specifically, the rendering process may include a step of determining the appearance and texture of the character, such as its skin, clothes, and hair, a step of setting the lighting around the character to produce light and shade, and a step of computing and reproducing various visual elements such as shadows. The rendering module (230) may generate a second image (V2) including the movement of the character (S250). That is, the second image (V2) is a metaverse character image generated based on the user motion data (MD), and may be an image from which the portion where the user's appearance is hidden by the object (OB) has been removed. The character implementation module (200) may transmit the second image (V2) to the image synthesis module (300).

An image generating device according to some embodiments of the present invention may remove the portion where the user and the object overlap before generating the character, and may thereby express the interaction between the user and the object naturally.

FIG. 11 is a diagram showing a process of generating a second image from user motion data transmitted by the motion capture device of FIG. 1.

Referring to FIGS. 1 and 11, the motion capture device (30) may capture the user (UM) and the object (OB) in a given space. The motion capture device (30) may then transmit motion data (MD) obtained from the captured footage to the character implementation module (200). In addition, the character implementation module (200) may receive tracking data (TD) from the background generation module (100). The character implementation module (200) may generate the second image (V2) from the motion data (MD) using the tracking data (TD).

As described above, the second image (V2) may be an image including the movement of the metaverse character. Here, the region where the user (UM) and the object (OB) interact may have been removed from the metaverse character included in the second image (V2). FIG. 12 is further referred to for an exemplary explanation.

FIG. 12 is a diagram illustrating a process of generating a second image when there is an overlapping portion between a user and an object.

Referring to FIGS. 1, 2, and 9 to 12, processes of the character implementation module (200) that are common to the case where the user (UM) and the object (OB) do not overlap are omitted, and only the differing processes are described below. Specifically, the background data (BD) may include footage in which the user (UM) and the object (OB) are in direct or indirect contact. That is, it may include footage in which the user (UM) and the object (OB) interact while touching each other or while lying on a straight line with the image capturing device (20). In other words, the object (OB) may interact with the user (UM), and the interaction may be divided into contact interaction and non-contact interaction.

Contact interaction may be, for example, a form of interaction in which parts of the body touch directly, such as shaking hands, hugging, touching a part of the body, or bumping fists. Non-contact interaction, by contrast, may be a form of communication without physical contact, such as greeting, or communicating through a terminal by phone call or text message.

Here, the background data (BD) may include a portion where the user (UM) and the object (OB) overlap each other on the screen. Specifically, when the user (UM) is farther from the image capturing device (20) than the object (OB), the image capturing device (20), the object (OB), and the user (UM) lie in that order on a straight line, so part of the shape of the user (UM) may be hidden. Because part of the shape of the user (UM) is hidden by the object (OB), outline information (OI_O) that excludes the portion of the user (UM) hidden by the object (OB) may be generated from the background data (BD).

In other words, the object grouping module (110) may generate, from the background data (BD), outline information (OI_O) that excludes the portion of the user (UM) hidden by the object (OB). Accordingly, the user tracking module (120) may generate user tracking data (TD_O) based on the outline information (OI_O) that excludes the hidden portion.

On the other hand, when the user (UM) is closer to the image capturing device (20) than the object (OB), the image capturing device (20), the user (UM), and the object (OB) lie in that order on a straight line, so part of the shape of the object (OB) may be hidden. In this case, because the background data (BD) contains the entire shape of the user, the outline information (OI_O) may include the shape of the user (UM) including the portion where the user (UM) and the object (OB) overlap.
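The two depth orderings described above may be summarized in a short sketch. It assumes that the silhouettes of the user and the object are available as sets of screen pixels and that a single distance from the image capturing device can be assigned to each of them; the function name `visible_user_pixels` and its parameters are hypothetical, and the patent does not specify how the outline information (OI_O) is computed.

```python
def visible_user_pixels(user_px, object_px, user_depth, object_depth):
    """Return the user pixels that remain visible to the camera.

    user_px, object_px: sets of (x, y) screen coordinates covered by each
        full silhouette (illustrative representation, an assumption).
    user_depth, object_depth: distance of each from the camera; one value
        per body is a simplification of per-pixel depth.
    """
    if user_depth > object_depth:
        # The object is in front: the overlapped part of the user is hidden.
        return user_px - object_px
    # The user is in front: the whole user silhouette is visible.
    return set(user_px)


user = {(0, 0), (1, 0), (2, 0)}
obj = {(2, 0), (3, 0)}
behind = visible_user_pixels(user, obj, user_depth=3.0, object_depth=2.0)
in_front = visible_user_pixels(user, obj, user_depth=1.0, object_depth=2.0)
```

In the first call the overlapping pixel is excluded from the outline, matching the case where the user is farther from the camera; in the second call the full silhouette is kept.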

The user tracking module (120) may transmit the user tracking data (TD_O) to the character implementation module (200). Here, the user tracking data (TD_O) may include the portion where the user (UM) and the object (OB) overlap. The character implementation module (200) may generate the second image (V2) using the user tracking data (TD_O) and the user motion data (MD_O).

Specifically, the skeleton generation module (210) may use the tracking data (TD_O) to extract, from the skeleton data, the portion where the appearance of the user (UM) is hidden by the object (OB). The skeleton generation module (210) may then delete the extracted portion from the skeleton data to generate the modified skeleton data (SD). That is, by reflecting the tracking data (TD_O) in the skeleton data, the skeleton generation module (210) may precisely express the portion of the user (UM) hidden by the object (OB).

Next, the skeleton generation module (210) may transmit the modified skeleton data (SD), generated using the user motion data (MD_O) and the tracking data (TD_O), to the retarget module (220). As a result, the modified skeleton data (SD) transmitted from the skeleton generation module (210) to the retarget module (220) may be data from which the data for the portion of the user's skeleton hidden by the object (OB) has been deleted.
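The deletion of hidden skeleton data described above may be sketched as follows. The sketch assumes that the skeleton data maps joint names to screen positions and that the tracking data can be reduced to the set of visible user pixels; `prune_occluded_joints` and both data layouts are hypothetical and serve only as an illustration.

```python
def prune_occluded_joints(skeleton, visible_px):
    """Drop joints whose screen position is not among the visible user pixels.

    skeleton: dict mapping joint name -> (x, y) screen position (assumed layout).
    visible_px: set of (x, y) pixels tracked as visible user area (assumed layout).
    Returns the modified skeleton and the sorted names of the removed joints.
    """
    kept = {name: pos for name, pos in skeleton.items() if pos in visible_px}
    removed = sorted(set(skeleton) - set(kept))
    return kept, removed


skeleton = {"shoulder": (0, 0), "elbow": (1, 0), "hand": (2, 0)}
visible = {(0, 0), (1, 0)}          # the hand is hidden by the object
modified, removed = prune_occluded_joints(skeleton, visible)
```

Here the hand joint is removed because its position falls outside the visible area, so the data passed on for rigging would no longer contain it.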

According to some embodiments, when the user (UM) is farther from the image capturing device (20) than the object (OB), part of the shape of the user (UM) may be omitted from the tracking data (TD_O) in the portion where the user (UM) and the object (OB) overlap. In this case, the character implementation module (200) generates the metaverse character with the overlapping portion omitted. For example, when generating a final image (VF) that includes a scene in which the metaverse character and the object (OB) shake hands, the hand of the object (OB) may rest on top of the hand of the user (UM). The image capturing device (20) may then be located, along a straight line, closer to the hand of the object (OB) than to the hand of the user (UM). Therefore, the image capturing device (20) may capture the hand of the user (UM) with part of it omitted.

The tracking data (TD_O) may reflect the omitted part of the hand of the user (UM). According to the tracking data (TD_O), the character implementation module (200) may generate the second image (V2_O) by omitting the part of the metaverse character's hand that is covered by the hand of the object (OB).
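Omitting part of the rendered character according to the tracking data may be pictured as an alpha mask. The following sketch assumes the rendered character frame is a grid of RGBA pixels and that the hidden region is given as a set of pixel coordinates; `mask_character` is a hypothetical helper, not a function of the disclosed device.

```python
def mask_character(frame, occluded_px):
    """Return a copy of the frame with alpha set to 0 on occluded pixels.

    frame: list of rows, each a list of (r, g, b, a) tuples (assumed layout).
    occluded_px: set of (x, y) pixels where the object hides the user.
    """
    out = []
    for y, row in enumerate(frame):
        out.append([
            (r, g, b, 0) if (x, y) in occluded_px else (r, g, b, a)
            for x, (r, g, b, a) in enumerate(row)
        ])
    return out


character = [[(10, 20, 30, 255), (40, 50, 60, 255)]]
masked = mask_character(character, {(1, 0)})
```

A fully transparent pixel lets whatever lies beneath it in the background layer, for example the object's hand, show through when the layers are later combined.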

The image generating device and method of the present invention reflect the motion of a user who is acting and apply it to the movement and staging effects of the metaverse character, and may thereby maximize the realism of the character.

FIGS. 13 and 14 are diagrams for explaining a process of generating a final image through an image synthesis module according to some embodiments of the present invention.

Referring to FIGS. 13 and 14, the image synthesis module (300) may generate the final image (VF) for the case where the user and the object do not overlap. The image synthesis module (300) may generate the final image (VF) using at least part of the first image (V1) generated by the background generation module (100) and the second image (V2) generated by the character implementation module (200). Specifically, the image synthesis module (300) may receive the first image (V1) and the second image (V2) (S300). The image synthesis module (300) may generate the final image (VF) by compositing the second image (V2) onto the first image (V1), which is the background from which the user has been removed (S310, S320). For example, the image synthesis module (300) may use the first image (V1) as the bottom layer and place the second image (V2) on top of the first image (V1).
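The layering described above, with the first image as the bottom layer and the second image on top, may be sketched with ordinary alpha-over compositing. This is an assumption made for illustration: the patent does not state which blending rule the image synthesis module uses, and the function `composite` and the pixel layouts are hypothetical.

```python
def composite(v1, v2):
    """Place the character layer over the background layer, pixel by pixel.

    v1: rows of (r, g, b) background pixels (user already removed).
    v2: rows of (r, g, b, a) character pixels, with alpha in 0..255.
    Both frames are assumed to have the same size.
    """
    final = []
    for bg_row, fg_row in zip(v1, v2):
        row = []
        for (br, bg, bb), (fr, fg, fb, fa) in zip(bg_row, fg_row):
            alpha = fa / 255
            row.append(tuple(
                round(f * alpha + b * (1 - alpha))
                for f, b in ((fr, br), (fg, bg), (fb, bb))
            ))
        final.append(row)
    return final


v1 = [[(0, 0, 0), (100, 100, 100)]]            # background layer
v2 = [[(255, 0, 0, 255), (255, 0, 0, 0)]]      # character layer
vf = composite(v1, v2)
```

Where the character pixel is opaque it replaces the background, and where it is fully transparent the background is left unchanged.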

According to some embodiments, the image synthesis module (300) may composite the second image (V2) onto the first image (V1) to generate an image that contains the metaverse character in place of the user.

Here, the user (UM) may exist in the first image (V1), and the metaverse character may exist in the second image (V2). In addition, the metaverse character in the second image (V2) may be generated by correcting the object (OB). That is, the final image (VF) may be generated based on footage in which the user (UM) and the object (OB) were captured.

With the image generating device and method of the present invention, the user (UM) and the object (OB) may be filmed simultaneously in the same space. That is, the present invention may film the person acting as the metaverse character and the subject interacting with that person in one take. In other words, the present invention may generate the final image (VF) without filming the user (UM) acting as the metaverse character and the object (OB) interacting with the user in separate spaces and at separate times and then compositing the results. Accordingly, the present invention may save the time and cost that filming in separate spaces and at separate times would require, and may thereby secure economic efficiency.

FIG. 15 is a diagram illustrating a process of generating a final image through an image synthesis module according to some other embodiments of the present invention.

Referring to FIGS. 1 and 15, the image synthesis module (300) may generate a final image (VF_O) for the case where the user and the object overlap.

The image synthesis module (300) may generate the final image (VF_O) using at least part of the first image (V1_O) generated by the background generation module (100) and the second image (V2_O) generated by the character implementation module (200). Hereinafter, content common to FIG. 14-1 is omitted, and the process of generating the final image (VF_O) is described.

According to some embodiments, when the user and the object overlap, the first image (V1_O) may be generated so as to reflect the positional relationship between the user and the object. An image from which the user has been removed may be generated in this way. Here, the background data (BD) may include a portion where the appearance of the object overlaps the appearance of the user. In addition, the appearance of the user corresponding to the overlapping portion may be omitted from the tracking data (TD). That is, part of the user's shape may be omitted from the tracking data (TD).

The character implementation module (200) may receive the tracking data (TD) from the background generation module (100). The character implementation module (200) may generate the second image (V2_O) including the metaverse character. Specifically, the second image (V2_O) may be a metaverse character image from which the portion where the user and the object overlap has been removed. When the second image (V2_O) is composited onto the first image (V1_O), an image in which the metaverse character and the object interact naturally may be generated. In other words, the final image (VF_O) may look as if the metaverse character, not the user, had been interacting with the object from the beginning.

Accordingly, the image synthesis module (300) may generate the final image (VF_O) by compositing the second image (V2_O) onto the first image (V1_O), thereby generating an image in which the metaverse character and the object interact.

In addition, the image generating device and method according to some embodiments of the present invention may produce an image of maximized quality in which the portion where the user and an object other than the user overlap is expressed naturally.

The above description merely illustrates the technical idea of the present embodiment, and a person of ordinary skill in the art to which the present embodiment pertains may make various modifications and variations without departing from its essential characteristics. Accordingly, the present embodiment is intended to explain, not to limit, the technical idea of the present embodiment, and the scope of that technical idea is not limited by this embodiment. The scope of protection of the present embodiment should be interpreted according to the claims below, and all technical ideas within a scope equivalent thereto should be interpreted as falling within the scope of rights of the present embodiment.

Claims

An image generating device comprising:
a background generation module that uses background data including the appearance of a user and the appearance of an object interacting with the user, photographed simultaneously in the same space, to generate a first image from which the appearance of the user is removed, except for a portion where the appearance of the user is obscured by the object;
a character implementation module that generates a second image including the appearance of a metaverse character that matches the appearance of the user and excludes the portion of the user's appearance that is obscured by the object, by using user motion data related to the user's actions and tracking data that tracks the appearance of the user for the remaining portion of the user's appearance excluding the portion obscured by the object; and
an image synthesis module that generates a final image using the first image and the second image,
wherein the final image includes an image of the metaverse character interacting with the object.

The image generating device of claim 1,
wherein the background generation module comprises:
an object grouping module that generates outline information of the user from the background data received from an image capturing device;
a user tracking module that generates the tracking data, which tracks the user, based on the outline information; and
a user removal module that generates the first image using the tracking data.

The image generating device of claim 2,
wherein the user removal module:
generates masking data corresponding to the outline information using the tracking data;
generates a user-removed image by removing the outline information from the background data using the masking data; and
generates the first image using the background data and the user-removed image.

The image generating device of claim 2,
wherein the user removal module:
generates masking data corresponding to the outline information using the tracking data;
generates a user-removed image by correcting the masking data using the background data; and
generates the first image using the background data and the user-removed image.

The image generating device of claim 2,
wherein the object grouping module:
groups pixels according to predefined criteria around each of a plurality of markers attached to the user; and
generates the outline information of the user by connecting the grouped pixels.

(Deleted)

An image generating method using an image generating device that generates a metaverse character image using a user's movements, the method comprising:
generating outline information of the user by using background data including the user's appearance and the appearance of an object interacting with the user, photographed simultaneously in the same space;
generating tracking data that tracks the user based on the outline information;
generating, using the tracking data, a first image from which the user's appearance is removed, except for a portion of the user's appearance obscured by the object;
generating, using user motion data related to the motion of the user and the tracking data, a second image including the motion of a metaverse character that matches the motion of the user and corresponds to the remaining part of the user's appearance excluding the part obscured by the object; and
generating, using the first image and the second image, a final image including an image of the metaverse character interacting with the object.

The image generating method of claim 7,
wherein generating the first image comprises:
generating masking data corresponding to the outline information using the tracking data;
generating a user-removed image by removing the outline information from the background data using the masking data; and
generating the first image using the background data and the user-removed image.

The image generating method of claim 7,
wherein generating the first image comprises:
generating masking data corresponding to the outline information using the tracking data;
generating a user-removed image by correcting the masking data using the background data; and
generating the first image using the background data and the user-removed image.

(Deleted)