JP2005250748A

JP2005250748A - Video composition device, video composition program, and video composition system

Info

Publication number: JP2005250748A
Application number: JP2004058875A
Authority: JP
Inventors: Toshikazu Ikenaga; 敏和池永
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2004-03-03
Filing date: 2004-03-03
Publication date: 2005-09-15

Abstract

PROBLEM TO BE SOLVED: To provide a video composition device, a video composition program, and a video composition system that can output a composite video in which a person behaves naturally with respect to a CG image, without the person having to wear a sensor or special glasses.
SOLUTION: A video composition device 1 superimposes a CG image, drawn according to three-dimensional CG data, on the portion of a video in which a three-dimensional model appears, and outputs the result. The video is shot by a photographing camera 4 and shows a person together with the three-dimensional model, which is formed by a three-dimensional model forming apparatus 2 according to the same three-dimensional CG data. The device comprises three-dimensional CG data storage means 3, three-dimensional CG data output control means 5, rendering means 7, three-dimensional model video portion detection means 9, and replacement means 11.
SELECTED DRAWING: FIG. 1

Description

The present invention relates to a video composition device, a video composition program, and a video composition system that composite three-dimensional CG onto captured video.

Conventionally, virtual studio technology and mixed reality technology are known as techniques for placing three-dimensional CG, that is, a three-dimensional image drawn by a computer, in live-action video shot by a photographing camera (see, for example, Patent Documents 1 and 2).

In virtual studio technology, a studio set is drawn by a computer as a CG image (assembled virtually), and live-action video of a person (a performer, for example an announcer) is inserted (placed, composited) into the drawn CG image (virtual model). This produces video that looks as if the person were inside a real studio set, even though no studio set actually exists.

Mixed reality technology seamlessly merges, in real time, a virtual world such as CG images with the real world such as actually captured video. One form of this technology uses a sensor capable of acquiring the position and line-of-sight direction of a person (user), special glasses capable of presenting to that person a composite video in which a CG image is composited onto live-action video, and a computer that stores arrangement data indicating where the CG image is placed in real space.

In this form of mixed reality technology, when the person (user) wearing the sensor and special glasses moves freely, the sensor outputs to the computer sensor data that changes with that movement. Based on the input sensor data and the arrangement data, the computer then generates, each time, the CG image as seen from the person's current position and line-of-sight direction, and the special glasses present it to the person by placing it on the live-action video in real time.

As a result, the person (user) can feel as if the CG image exists at a specific location in real space.
Japanese Patent Laid-Open No. 11-261888 (paragraphs 0009 to 0012, FIG. 1)
Japanese Patent Laid-Open No. 2003-303356 (paragraphs 0012 to 0022, FIG. 1)

However, in conventional virtual studio technology, even when the studio set is generated as a CG image (virtual model), the person (performer) checks the composite video, in which the CG image is inserted into the live-action video, on a television monitor. The person therefore cannot accurately grasp the presence or position of the CG image and cannot behave naturally with respect to it. For example, the line of sight when looking at the position of the CG image becomes unnatural, which undermines staging in which several persons (performers) appear to be looking at the same specific point in the CG image (a shared line of sight), and when a pointer or the like is used, the pointed position becomes unnatural.

Conventional mixed reality technology does allow natural behavior with respect to a CG image placed in real space, but the person (user) must wear the sensor and special glasses. If this technology is used for program production or the like, a person whose face cannot be seen because of the sensor and special glasses would have to appear on screen, which is difficult from a staging standpoint.

Furthermore, when a model to be explained by a person (performer) is physically built and used in a program, the model must be built as a version of the real object scaled down to a suitable size. In a news program covering an aircraft accident, for example, an elaborate model of the aircraft may be needed, and the more elaborate the model, the longer it takes to build.

Accordingly, an object of the present invention is to solve the above problems and to provide a video composition device, a video composition program, and a video composition system that require no sensor or special glasses to be worn, that can handle even urgent news programs without an elaborate model having to be built by hand, and that can output a composite video in which a person (performer) behaves naturally with respect to a CG image.

To solve the above problems, the video composition device according to claim 1 superimposes three-dimensional CG, drawn according to three-dimensional CG data, on the portion of a video in which a three-dimensional model appears, and outputs the result, the video being one in which photographing means has shot a person together with the three-dimensional model formed by three-dimensional model forming means according to the three-dimensional CG data. The device comprises three-dimensional CG data storage means, three-dimensional CG data output means, rendering means, three-dimensional model video portion detection means, and replacement means.

With this configuration, the video composition device uses the three-dimensional CG data output means to output the three-dimensional CG data stored in the three-dimensional CG data storage means to the three-dimensional model forming means. The three-dimensional CG data includes a group of coordinates defining the shape used to draw the three-dimensional computer graphics, texture information indicating the texture of the drawn three-dimensional CG, shading information indicating the brightness and color of the drawn CG image, and the like.

The video composition device also uses the rendering means to draw (generate), moment by moment, the CG image (two-dimensional CG image) that the photographing means would capture if the three-dimensional model were replaced by the three-dimensional CG. The drawing is based on shooting information, which includes the position at which the photographing means is installed (for example, x, y, z coordinates), the direction in which it is shooting, and the zoom amount (zoom ratio) used when shooting the three-dimensional model; on arrangement information indicating the position at which the three-dimensional model is placed; and on the three-dimensional CG data stored in the three-dimensional CG data storage means. In other words, a two-dimensional CG image of the model formed by the three-dimensional model forming means, as viewed from the photographing means, is generated successively as the shooting information (position, direction, and zoom amount) changes.

The video composition device then uses the three-dimensional model video portion detection means to detect the portion of the live-action video shot by the photographing means in which the three-dimensional model appears, and uses the replacement means to replace the detected portion with the CG image drawn by the rendering means and output the result. That is, the CG image (two-dimensional CG image) drawn by the rendering means, whose shape (appearance) changes with the shooting information (position, direction, and zoom amount of the photographing means), is substituted by the replacement means into the video portion detected by the three-dimensional model video portion detection means.

The video composition device according to claim 2 is the video composition device according to claim 1, wherein the three-dimensional model forming means uses the principle of the thin-film lamination method of rapid prototyping technology.

With this configuration, because the three-dimensional model forming means to which the three-dimensional CG data is output uses the principle of the thin-film lamination method of rapid prototyping technology, a three-dimensional model that embodies the three-dimensional CG data as a physical shape is formed from corrugated cardboard or the like.

The video composition program according to claim 3 causes a computer to function as three-dimensional CG data output means, rendering means, three-dimensional model video portion detection means, and replacement means, in order to superimpose three-dimensional CG, drawn according to three-dimensional CG data, on the portion of a video in which a three-dimensional model appears and output the result, the video being one in which photographing means has shot a person together with the three-dimensional model formed by three-dimensional model forming means according to the three-dimensional CG data.

With this configuration, the video composition program uses the three-dimensional CG data output means to output the three-dimensional CG data stored in the three-dimensional CG data storage means to the three-dimensional model forming means. The program also uses the rendering means to draw (generate), moment by moment, the CG image (two-dimensional CG image) that the photographing means would capture if the three-dimensional model were replaced by the three-dimensional CG, based on the shooting information (position, direction, and zoom amount of the photographing means), the arrangement information indicating the position at which the three-dimensional model is placed, and the three-dimensional CG data stored in the three-dimensional CG data storage means. The program then uses the three-dimensional model video portion detection means to detect the portion of the live-action video shot by the photographing means in which the three-dimensional model appears, and uses the replacement means to replace the detected portion with the CG image (two-dimensional CG image) drawn by the rendering means and output the result.

The video composition system according to claim 4 replaces, with a CG image drawn according to three-dimensional CG data, the portion of a video in which a three-dimensional model appears, and outputs the result, the video being one in which photographing means has shot the three-dimensional model together with a person who gives commentary using the model. The system comprises three-dimensional CG data storage means, three-dimensional CG data output means, three-dimensional model forming means, shooting information acquisition means, rendering means, three-dimensional model video portion detection means, and replacement means.

With this configuration, the video composition system uses the three-dimensional CG data output means to output the three-dimensional CG data stored in the three-dimensional CG data storage means to the three-dimensional model forming means. The system forms the three-dimensional model with the three-dimensional model forming means, and uses the shooting information acquisition means to acquire shooting information that includes the position and direction of the photographing means (for example, x, y, z coordinates and a direction) and the zoom amount (zoom ratio) used when the photographing means shoots the three-dimensional model. The system then uses the rendering means to draw (generate), moment by moment, the CG image (two-dimensional CG image) that the photographing means would capture if the three-dimensional model were replaced by the three-dimensional CG, based on the shooting information acquired by the shooting information acquisition means, the arrangement information indicating the position at which the three-dimensional model is placed, and the three-dimensional CG data stored in the three-dimensional CG data storage means. Finally, the system uses the three-dimensional model video portion detection means to detect the portion of the live-action video shot by the photographing means in which the three-dimensional model appears, and uses the replacement means to replace the detected portion with the CG image drawn by the rendering means and output the result.

According to the inventions of claims 1, 3, and 4, the person does not need to wear a sensor or special glasses and can act while looking at the three-dimensional model that has actually been formed. In addition, because the three-dimensional model is replaced with three-dimensional CG, a composite video can be output in which the person behaves naturally with respect to the three-dimensional CG (CG image).

According to the invention of claim 2, if the three-dimensional CG data is prepared in advance, the three-dimensional model can be obtained easily, which makes it possible, for example, to respond quickly to a live broadcast.

Next, embodiments of the present invention will be described in detail with reference to the drawings as appropriate.
(Configuration of the video composition device)
FIG. 1 is a block diagram of the video composition system. As shown in FIG. 1, the video composition system A consists of a video composition device 1, a three-dimensional model forming apparatus 2 (three-dimensional model forming means), a photographing camera 4 (photographing means), and a sensor 6 (shooting information acquisition means). The other components are described before the video composition device 1 itself.

Based on the input data (three-dimensional CG data), the three-dimensional model forming apparatus 2 forms a three-dimensional model using the principle of the thin-film lamination method (LOM method) of rapid prototyping technology: it draws a contour line on a sheet material such as thin paper (corrugated cardboard, paper strips, or similar), cuts the material along the contour line with a laser, knife, or the like, and stacks the cut sheets one after another. The material used by the three-dimensional model forming apparatus 2 is a single color so that it can be detected by the three-dimensional model video portion detection means (described later) of the video composition device 1.

In the thin-film lamination method (LOM method) of rapid prototyping technology, thin sheets of material are cut out based on the XY-plane data in the input data, and the cut sheets are stacked based on the height-direction data in the input data to produce a lump of material with a specific shape (a master model). The method is applied to the production of molds and the like.
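The layer-by-layer idea described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the patent does not give a slicing algorithm, and `slice_into_layers`, `outline_at`, and the cone shape are hypothetical names and data introduced here for the example.

```python
import math

def slice_into_layers(height, sheet_thickness, outline_at):
    """Split a solid of the given height into stacked sheets.

    outline_at(z) is a hypothetical callback returning the contour to cut
    at height z (here simply a radius). Returns (z_bottom, outline) pairs.
    """
    # Small epsilon guards against floating-point noise in the division.
    n_layers = math.ceil(height / sheet_thickness - 1e-9)
    layers = []
    for i in range(n_layers):
        z_bottom = i * sheet_thickness
        # Sample the contour at the middle of the sheet, clamped to the top.
        z_mid = min(z_bottom + sheet_thickness / 2, height)
        layers.append((z_bottom, outline_at(z_mid)))
    return layers

def cone_radius(z, base_radius=50.0, height=100.0):
    """Radius of a cone at height z (illustrative shape only)."""
    return base_radius * (1.0 - z / height)

# A 100 mm cone cut from 5 mm cardboard sheets (assumed dimensions).
layers = slice_into_layers(100.0, 5.0, cone_radius)
print(len(layers), layers[0], layers[-1])
```

Thicker sheets give fewer layers and a coarser approximation of the shape, which is in line with the text's point that cardboard trades precision for forming speed.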

Specifically, the three-dimensional model forming apparatus 2 may be a commercially available RP system (Rapid Prototyping System, a high-speed forming apparatus) of the kind applied to sand casting and precision casting. Alternatively, instead of handling only the thin paper sheets used in such an RP system, it may be a modified version capable of handling materials such as corrugated cardboard so that the three-dimensional model can be formed at high speed.

However, if a commercially available RP system is used as-is as the three-dimensional model forming apparatus 2, it generally takes a long time to form a three-dimensional model, and such a system is not expected to keep up with the production of models for urgent programs. A commercially available RP system is therefore assumed to be used for programs other than urgent ones.

A three-dimensional model formed by the three-dimensional model forming apparatus 2 from a material such as corrugated cardboard, for high-speed forming, is made by cutting away (trimming off) the unneeded parts of the cardboard sheets according to the input data (three-dimensional CG data) and stacking the trimmed sheets in many layers.

That is, in the video composition system A, the three-dimensional model forming apparatus 2 to which the three-dimensional CG data is output uses the principle of the thin-film lamination method of rapid prototyping technology. If the three-dimensional CG data is prepared in advance, a three-dimensional model approximating the shape of the three-dimensional CG can be formed at high speed from corrugated cardboard or the like, which makes it possible, for example, to respond quickly to a live broadcast.

The photographing camera 4 shoots the three-dimensional model formed by the three-dimensional model forming apparatus 2 together with a person who acts (performs, gives commentary) while looking at the model, and outputs the resulting live-action video signal to the video composition device 1. The photographing camera 4 is of the type commonly used by broadcasting stations and the like.

The sensor 6 detects (acquires) shooting information that includes the position at which the photographing camera 4 is installed (for example, x, y, z coordinates), the direction in which the photographing camera 4 is shooting, and the zoom amount (zoom ratio) used when the photographing camera 4 shoots the three-dimensional model. In this embodiment, the sensor 6 consists of the components described below, which detect the position, the direction, and the zoom amount (zoom ratio), respectively.

The sensor 6 that detects the position of the photographing camera 4 consists of a transmitter (not shown), which is mounted on the photographing camera 4 and emits a signal of at least one of magnetic, infrared, and ultrasonic type, and a position detector (not shown), which detects the position based on the signal emitted by the transmitter.

The sensor 6 that detects the direction in which the photographing camera 4 is shooting is a gyrocompass or similar direction-detecting device.

The sensor 6 that detects the zoom amount (zoom ratio) of the photographing camera 4 is a detection device that is attached to an external interface (not shown) of the photographing camera 4 and reads the zoom amount (zoom ratio) output from that interface.

Next, the configuration of the video composition device 1 will be described.
The video composition device 1 composites drawn three-dimensional CG onto the live-action video signal shot by the photographing camera 4 and outputs the result as a composite video signal. It comprises three-dimensional CG data storage means 3, three-dimensional CG data output control means 5 (three-dimensional CG data output means), rendering means 7, three-dimensional model video portion detection means 9, and replacement means 11.

The three-dimensional CG data storage means 3 stores three-dimensional CG data, which is the data used to draw three-dimensional CG, and consists of a general recording medium such as a hard disk. In accordance with the control signal output from the three-dimensional CG data output control means 5, the three-dimensional CG data storage means 3 outputs the stored three-dimensional CG data both to the three-dimensional model forming apparatus 2, which is external to the video composition device 1, and to the rendering means 7.

The three-dimensional CG data includes a group of coordinates defining the shape used to draw the three-dimensional computer graphics, texture information indicating the texture of the drawn three-dimensional CG, shading information indicating the brightness and color of the drawn three-dimensional CG, and the like. For example, if the three-dimensional CG to be drawn is a group of buildings such as 1/100-scale high-rise buildings, the coordinate group consists of at least the eight coordinates (the number of vertices of a quadrangular prism) that define the shape of a building as a quadrangular prism, the texture information indicates the surface materials of the building, for example glass and concrete, and the shading information indicates the color of the building, for example light blue.
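One way to picture the three kinds of information in the three-dimensional CG data is the small record below. It is a sketch only: the patent does not define a data format, and the `CGData` class, its field names, and the building dimensions are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class CGData:
    """Hypothetical container mirroring the three parts named in the text."""
    vertices: list   # coordinate group: (x, y, z) tuples defining the shape
    materials: list  # texture information: surface materials
    color: str       # shading information: brightness and color

def box_vertices(width, depth, height):
    """Return the eight corner coordinates of a quadrangular prism."""
    return [(x, y, z)
            for x in (0.0, width)
            for y in (0.0, depth)
            for z in (0.0, height)]

# A 1/100-scale high-rise building: a 30 m x 30 m x 200 m tower becomes
# 0.3 m x 0.3 m x 2.0 m (the real-world dimensions are assumed).
building = CGData(
    vertices=box_vertices(0.3, 0.3, 2.0),
    materials=["glass", "concrete"],
    color="light blue",
)
print(len(building.vertices), building.materials, building.color)
```

The same record would be sent both to the forming apparatus (which needs only the vertices) and to the renderer (which needs all three parts).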

The three-dimensional CG data output control means 5 outputs a control signal to the three-dimensional CG data storage means 3, based on an operation signal entered by the operator of the video composition device 1 through input means (not shown). The control signal causes the three-dimensional CG data storage means 3 to output the three-dimensional CG data to the three-dimensional model forming apparatus 2 and the rendering means 7.

The rendering means 7 draws (generates), each time, the CG image (two-dimensional CG image) as viewed from the photographing camera 4, based on the shooting information detected by the sensor 6, which includes the moment-by-moment position, direction, and zoom amount (zoom ratio) of the photographing camera 4; on the arrangement information indicating the position at which the three-dimensional model is placed; and on the three-dimensional CG data stored in the three-dimensional CG data storage means 3. The rendering means 7 draws the CG image using a general CG drawing technique. The arrangement information is stored in advance in storage means (not shown).
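How the shooting information and arrangement information determine where the CG appears in the frame can be sketched with a simple pinhole projection. The patent only says that a general CG drawing technique is used, so everything here is an assumption for illustration: the `ShootingInfo` fields, the axis convention (x right, y ahead, z up), the zoom expressed as a focal length in pixels, and the 1920x1080 image center.

```python
import math
from dataclasses import dataclass

@dataclass
class ShootingInfo:
    """Hypothetical shooting information as reported by the sensor."""
    position: tuple  # camera position (x, y, z) in studio coordinates
    pan: float       # rotation about the vertical axis, radians (0 = facing +y)
    tilt: float      # upward tilt, radians
    focal_px: float  # zoom, expressed as a focal length in pixels

def project_point(point, cam, center=(960.0, 540.0)):
    """Project a studio-space point to pixel coordinates, or None if behind."""
    dx, dy, dz = (p - c for p, c in zip(point, cam.position))
    # Express the offset in the camera's panned frame.
    right = dx * math.cos(cam.pan) + dy * math.sin(cam.pan)
    ahead = -dx * math.sin(cam.pan) + dy * math.cos(cam.pan)
    # Then account for the tilt.
    depth = ahead * math.cos(cam.tilt) + dz * math.sin(cam.tilt)
    up = -ahead * math.sin(cam.tilt) + dz * math.cos(cam.tilt)
    if depth <= 0:
        return None
    u = center[0] + cam.focal_px * right / depth
    v = center[1] - cam.focal_px * up / depth  # image v grows downward
    return (u, v)

def project_model(vertices, model_position, cam):
    """Place the model using the arrangement information, then project it."""
    placed = [tuple(v + m for v, m in zip(vertex, model_position))
              for vertex in vertices]
    return [project_point(p, cam) for p in placed]

cam = ShootingInfo(position=(0.0, 0.0, 0.0), pan=0.0, tilt=0.0, focal_px=1000.0)
print(project_model([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], (0.0, 5.0, 0.0), cam))
```

Re-running the projection whenever the sensor reports a new position, direction, or zoom is what keeps the drawn CG aligned with the physical model in the shot.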

For each frame of the live-action video signal shot by the photographing camera 4, the three-dimensional model video portion detection means 9 detects the outer edge of the video portion in which the three-dimensional model appears, based on a difference in color (the three-dimensional model is formed in a single color in advance), and outputs information on that outer edge to the replacement means 11 as a three-dimensional model video portion detection signal. This signal serves as arrangement data indicating where the CG image is to be placed in real space (in the live-action video). In other words, the three-dimensional model video portion detection means 9 applies image processing to the live-action video signal to detect the video portion of the three-dimensional model.
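The color-based detection described above can be sketched as a per-pixel comparison against the model's single color, followed by extraction of the region's outer edge. This is a simplified illustration, not the patent's actual processing: the frame representation (nested lists of RGB tuples), the per-channel tolerance, and the function names are assumptions.

```python
def detect_model_mask(frame, key_color, tolerance=60):
    """Mark pixels whose color is within tolerance of the model's color."""
    def is_model(pixel):
        return all(abs(p - k) <= tolerance for p, k in zip(pixel, key_color))
    return [[is_model(px) for px in row] for row in frame]

def outer_edge(mask):
    """Return (x, y) of masked pixels that touch a non-masked pixel or border."""
    h, w = len(mask), len(mask[0])
    edge = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            if any(not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]
                   for ny, nx in neighbours):
                edge.append((x, y))
    return edge

# Toy frame: gray studio background with a 3x3 green "model" in the middle.
GRAY, GREEN = (128, 128, 128), (0, 200, 0)
frame = [[GREEN if 1 <= x <= 3 and 1 <= y <= 3 else GRAY for x in range(5)]
         for y in range(5)]
mask = detect_model_mask(frame, key_color=(0, 255, 0))
edge = outer_edge(mask)
print(sum(map(sum, mask)), len(edge))
```

A fixed per-channel tolerance is the simplest possible test; a production chroma keyer would typically work in a color space that separates hue from brightness so that shadows on the model are still detected.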

Based on the three-dimensional model video portion detection signal output by the three-dimensional model video portion detection means 9, the replacement means 11 replaces the corresponding part of the live-action video signal shot by the photographing camera 4 with the CG image drawn (generated) by the rendering means 7, and outputs the result as a composite video signal.
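The replacement step amounts to choosing, pixel by pixel, between the live-action frame and the rendered CG image according to the detected region. The sketch below assumes the detection result is available as a boolean mask of the same size as the frame; the function name and the frame representation are hypothetical.

```python
def replace_model(frame, cg_image, mask):
    """Use the CG pixel wherever the mask is set, the camera pixel elsewhere."""
    return [[cg if m else px
             for px, cg, m in zip(frame_row, cg_row, mask_row)]
            for frame_row, cg_row, mask_row in zip(frame, cg_image, mask)]

# Toy 2x2 example: only the top-right pixel belongs to the model.
frame = [[(10, 10, 10), (0, 255, 0)],
         [(20, 20, 20), (30, 30, 30)]]
cg_image = [[(1, 1, 1), (200, 220, 255)],
            [(2, 2, 2), (3, 3, 3)]]
mask = [[False, True],
        [False, False]]
composite = replace_model(frame, cg_image, mask)
print(composite)
```

Because pixels outside the mask come straight from the camera, a performer's hand or pointer in front of the model stays in the composite as long as it is not the model's color.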

The composite video signal output from the replacement means 11 can be displayed on a display device (not shown), or input to a broadcast wave transmission device (not shown) and transmitted as a broadcast wave.

The three-dimensional model video portion detection means 9 and the replacement means 11 correspond to the processing means of a conventional chroma key device (not shown).

In the video composition device 1, the three-dimensional CG data output control means 5 causes the three-dimensional CG data stored in the three-dimensional CG data storage means 3 to be output to the three-dimensional model forming apparatus 2, which forms the three-dimensional model on receiving the data. When the photographing camera 4 then shoots the formed three-dimensional model together with a person or the like, the shooting information sensed by the sensor 6, which includes the position, direction, and zoom amount (zoom ratio) of the photographing camera 4, is input. The rendering means 7 draws a CG image (two-dimensional CG image) based on the shooting information input from the sensor 6, the arrangement information stored in advance that indicates the position at which the three-dimensional model is placed, and the three-dimensional CG data input from the three-dimensional CG data storage means 3. The three-dimensional model video portion detection means 9 detects the portion of the live-action video signal shot by the photographing camera 4 in which the three-dimensional model appears and outputs it as the three-dimensional model video portion detection signal, and the replacement means 11 outputs a composite video signal in which the detected portion has been replaced with the CG image (two-dimensional CG image) drawn by the rendering means 7. A person appearing in the program, such as a performer, therefore does not need to wear the sensor and special glasses required by conventional mixed reality technology, and can act while looking at the three-dimensional model that has actually been formed. In addition, because the three-dimensional model is replaced with the CG image, a composite video signal (composite video) can be output in which the person behaves naturally with respect to the CG image.

In other words, with this video composition device 1, a person can act while looking at and pointing to the real three-dimensional model, and the replacement means 11 replaces that three-dimensional model with the CG image drawn by the rendering means 7, so a composite video signal (composite video) in which the person behaves naturally with respect to the CG image can be obtained.

(Operation of video composition device)
Next, the operation of the video composition device 1 will be described with reference to the flowchart shown in FIG. 2 (see FIG. 1 as appropriate). This description assumes that, before the photographing camera 4 starts shooting, the three-dimensional CG data has been output from the three-dimensional CG data storage means 3 to the three-dimensional model shaping apparatus 2, and that the three-dimensional model shaping apparatus 2 has already shaped the three-dimensional model.

First, a live-action video signal in which the three-dimensional model and a person have been shot by the photographing camera 4 is input to the three-dimensional model video portion detection means 9 and the replacement means 11 of the video composition device 1 (S1).

The video composition device 1 also receives (acquires) the shooting information output from the sensor 6, including the position, direction, and zoom amount (zoom ratio) of the photographing camera 4 (S2).

The video composition device 1 then renders (draws) a CG image with the rendering means 7, based on the shooting information, the arrangement information, and the three-dimensional CG data (S3).

The video composition device 1 also uses the three-dimensional model video portion detection means 9 to detect, in the live-action video signal input from the photographing camera 4, the video portion in which the three-dimensional model appears, and outputs it to the replacement means 11 as a three-dimensional model video portion detection signal (S4).

Finally, the video composition device 1 uses the replacement means 11 to replace the video portion in which the three-dimensional model appears with the CG image drawn by the rendering means 7, and outputs the result as a composite video signal (composite video) (S5).
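Steps S1 to S5 can be summarised as a single per-frame function. The following Python sketch is only an illustration under stated assumptions: the `ShootingInfo` field layout, the toy integer frame format, and the `render` and `detect` callables are hypothetical stand-ins for the sensor 6 output, the rendering means 7, and the detection means 9, and are not defined by the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Frame = List[List[int]]    # toy frame: one integer per pixel (assumption)
Mask = List[List[bool]]


@dataclass(frozen=True)
class ShootingInfo:
    """Shooting information from the sensor 6; the field layout is an assumption."""
    position: Tuple[float, float, float]
    direction: Tuple[float, float, float]
    zoom: float


def compose_frame(
    live: Frame,
    info: ShootingInfo,
    render: Callable[[ShootingInfo], Frame],
    detect: Callable[[Frame], Mask],
) -> Frame:
    # S1, S2: the live-action frame and the shooting information arrive as arguments.
    # S3: render the CG image. The callable is assumed to already hold the
    #     arrangement information and the three-dimensional CG data.
    cg = render(info)
    # S4: find the pixels in which the three-dimensional model appears.
    mask = detect(live)
    # S5: replace those pixels with the CG image and return the composite frame.
    return [
        [c if m else p for p, c, m in zip(live_row, cg_row, mask_row)]
        for live_row, cg_row, mask_row in zip(live, cg, mask)
    ]
```

For example, with a stub `detect` that marks pixels equal to 9 and a stub `render` that returns a frame filled with 7, the frame `[[1, 9], [9, 2]]` becomes `[[1, 7], [7, 2]]`. Passing the two stages in as callables keeps the ordering of the steps visible without committing to any particular renderer or keyer.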

(Example of composite video)
Next, an example of a composite video composed and output by the video composition device 1 will be described with reference to FIG. 3, which shows such an example.
As shown in FIG. 3, the person points at the CG image of a structure such as a high-rise building while appearing to look at it (in reality, the person is looking at the three-dimensional model).

With this arrangement, a person such as a performer appearing in a program can, without wearing sensors or special glasses, act while looking at an easily produced three-dimensional model and pointing to specific parts of it, and can therefore behave naturally (natural movement and line of sight) with respect to the CG image.

In addition, an example of how the composite video composed and output by the video composition device 1 appears will be described with reference to FIG. 4 (see FIG. 1 as appropriate).
FIGS. 4(a-1) to 4(c-2) show cases in which a person such as a performer appearing in a program stands near the three-dimensional model and points at it while the viewing direction (the direction of shooting by the photographing camera 4) is changed.

Of FIGS. 4(a-1) and 4(a-2), FIG. 4(a-1) shows the three-dimensional model r of a "building" located on the left side as one faces the person. For example, when the photographing camera 4 shoots the three-dimensional model r from the direction in which the person is looking at it, a CG image is generated based on the shooting information including the position, direction, and zoom amount of the photographing camera 4 at the time of shooting, the arrangement information indicating the position where the three-dimensional model is placed, and the three-dimensional CG data used to shape the three-dimensional model r. This CG image is then composed into the live-action video signal. FIG. 4(a-2) shows the result, that is, the composite video (composite video signal) that is displayed.

Of FIGS. 4(b-1) and 4(b-2), FIG. 4(b-1) shows the three-dimensional model r of a "building" located in front of the person. For example, when the photographing camera 4 shoots the three-dimensional model r from the direction in which the person is looking at it, a CG image is generated based on the shooting information including the position, direction, and zoom amount of the photographing camera 4 at the time of shooting from the person's direction, the arrangement information indicating the position where the three-dimensional model is placed, and the three-dimensional CG data used to shape the three-dimensional model r. This CG image is then composed into the live-action video signal. FIG. 4(b-2) shows the result, that is, the composite video (composite video signal) that is displayed.

Further, of FIGS. 4(c-1) and 4(c-2), FIG. 4(c-1) shows the three-dimensional model r of a "building" located on the right side as one faces the person. For example, when the photographing camera 4 shoots the three-dimensional model r from the direction in which the person is looking at it, a CG image is generated based on the shooting information including the position, direction, and zoom amount of the photographing camera 4 at the time of shooting, the arrangement information indicating the position where the three-dimensional model is placed, and the three-dimensional CG data used to shape the three-dimensional model r. This CG image is then composed into the live-action video signal. FIG. 4(c-2) shows the result, that is, the composite video (composite video signal) that is displayed.

FIG. 4(d) shows the positional relationship between the three-dimensional model r and the person: (a) indicates the position of the person shown in FIG. 4(a-1), (b) the position shown in FIG. 4(b-1), and (c) the position shown in FIG. 4(c-1). That is, it shows that as the position, direction, and zoom amount of the photographing camera 4 change from moment to moment, the generated CG image changes to follow them.

In other words, in the video composition device 1, the rendering means 7 successively changes the CG image it draws in accordance with the shooting information input from the sensor 6, which includes the position, direction, and zoom amount of the photographing camera 4, and the composite video signal composed by the replacement means 11 can therefore also be changed successively.
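To make the dependence on position, direction, and zoom concrete, the following Python sketch projects one point of the CG model into image coordinates with a simple pinhole camera. This is an assumed model chosen for illustration, not the rendering method of the patent: the function name `project_point`, the pan-only treatment of direction, the mapping of zoom to focal length, and the default values `base_focal_px` and `center` are all hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def project_point(
    point: Vec3,
    cam_pos: Vec3,
    pan_deg: float,
    zoom: float,
    base_focal_px: float = 500.0,                  # assumed focal length at zoom 1.0
    center: Tuple[float, float] = (320.0, 240.0),  # assumed image centre in pixels
) -> Tuple[float, float]:
    """Project a world point to pixel coordinates (u, v).

    Convention (assumption): at pan 0 the camera looks along +z, and a
    positive pan turns it toward +x. Tilt and roll are ignored.
    """
    # Position: express the point relative to the camera.
    dx, dy, dz = (p - c for p, c in zip(point, cam_pos))
    # Direction: rotate into the camera frame about the vertical axis.
    a = math.radians(pan_deg)
    x_cam = dx * math.cos(a) - dz * math.sin(a)
    z_cam = dx * math.sin(a) + dz * math.cos(a)
    if z_cam <= 0.0:
        raise ValueError("point is behind the camera")
    # Zoom: scale the focal length, which scales the projected offsets.
    f = base_focal_px * zoom
    u = center[0] + f * x_cam / z_cam
    v = center[1] - f * dy / z_cam   # image v grows downward
    return (u, v)
```

Under these assumptions, a point at `(1, 1, 10)` seen from the origin at pan 0 and zoom 1 lands at roughly `(370, 190)`, and doubling the zoom moves it to roughly `(420, 140)`. Re-evaluating the projection for every frame with the latest sensor values is what makes the drawn CG image follow the camera.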

An embodiment of the present invention has been described above, but the present invention is not limited to that embodiment. For example, although this embodiment has been described as the video composition device 1, the processing of each component of the video composition device 1 can also be regarded as a video composition program written in a general-purpose computer language (a special assembler language may also be used). In this case, the same effects as those of the video composition device 1 can be obtained.

FIG. 1 is a block diagram of a video composition device according to the present invention.
FIG. 2 is a flowchart explaining the operation of the video composition device shown in FIG. 1.
FIG. 3 is a diagram explaining an example of the composite video output from the video composition device shown in FIG. 1.
FIG. 4: (a-1) shows a case where a person is located on the right side of the three-dimensional model, and (a-2) shows an example of the composite video in that case. (b-1) shows a case where a person is located in front of the three-dimensional model, and (b-2) shows an example of the composite video in that case. (c-1) shows a case where a person is located on the left side of the three-dimensional model, and (c-2) shows an example of the composite video in that case. (d) shows the positional relationship between the three-dimensional model and the person.

Explanation of symbols

1 Video composition device
2 Three-dimensional model shaping apparatus (three-dimensional model shaping means)
3 Three-dimensional CG data storage means
4 Photographing camera (shooting means)
5 Three-dimensional CG data output control means (three-dimensional CG data output means)
6 Sensor (shooting information acquisition means)
7 Rendering means
9 Three-dimensional model video portion detection means
11 Replacement means
A Video composition system
r Three-dimensional model

Claims

1. A video composition device that, in a video in which shooting means has shot a three-dimensional model shaped by three-dimensional model shaping means according to three-dimensional CG data together with a person who gives an explanation using the three-dimensional model, replaces the video portion in which the three-dimensional model appears with a CG image drawn according to the three-dimensional CG data and outputs the result, the device comprising:
three-dimensional CG data storage means for storing the three-dimensional CG data;
three-dimensional CG data output means for outputting the three-dimensional CG data stored in the three-dimensional CG data storage means to the three-dimensional model shaping means;
rendering means for rendering the CG image based on shooting information including the position, direction, and zoom amount of the shooting means, arrangement information indicating the position where the three-dimensional model is placed, and the three-dimensional CG data stored in the three-dimensional CG data storage means;
three-dimensional model video portion detection means for detecting the video portion of the video in which the three-dimensional model appears; and
replacement means for replacing the video portion detected by the three-dimensional model video portion detection means with the CG image drawn by the rendering means.

2. The video composition device according to claim 1, wherein the three-dimensional model shaping means uses the principle of a thin film stacking method in rapid prototyping technology.

3. A video composition program that, in order to replace, in a video in which shooting means has shot a three-dimensional model shaped by three-dimensional model shaping means according to three-dimensional CG data together with a person who gives an explanation using the three-dimensional model, the video portion in which the three-dimensional model appears with a CG image drawn according to the three-dimensional CG data and to output the result, causes a computer to function as:
three-dimensional CG data output means for outputting the three-dimensional CG data from three-dimensional CG data storage means, which stores the three-dimensional CG data, to the three-dimensional model shaping means;
rendering means for rendering the CG image based on shooting information including the position, direction, and zoom amount of the shooting means, arrangement information indicating the position where the three-dimensional model is placed, and the three-dimensional CG data;
three-dimensional model video portion detection means for detecting the video portion of the video in which the three-dimensional model appears; and
replacement means for replacing the video portion detected by the three-dimensional model video portion detection means with the CG image drawn by the rendering means.

4. A video composition system that, in a video in which a three-dimensional model and a person who gives an explanation using the three-dimensional model have been shot, replaces the video portion in which the three-dimensional model appears with a CG image drawn according to three-dimensional CG data and outputs the result, the system comprising:
three-dimensional CG data storage means for storing the three-dimensional CG data;
three-dimensional CG data output means for outputting the three-dimensional CG data stored in the three-dimensional CG data storage means;
three-dimensional model shaping means for shaping the three-dimensional model according to the three-dimensional CG data output by the three-dimensional CG data output means;
shooting information acquisition means for acquiring shooting information including the position, direction, and zoom amount of the shooting means;
rendering means for rendering the CG image based on the shooting information acquired by the shooting information acquisition means, arrangement information indicating the position where the three-dimensional model is placed, and the three-dimensional CG data stored in the three-dimensional CG data storage means;
three-dimensional model video portion detection means for detecting the video portion of the video in which the three-dimensional model appears; and
replacement means for replacing the video portion detected by the three-dimensional model video portion detection means with the CG image drawn by the rendering means.