US20140340404A1 - Method and apparatus for generating 3d free viewpoint video - Google Patents
Method and apparatus for generating 3D free viewpoint video
- Publication number
- US20140340404A1 (application US14/365,240)
- Authority
- US
- United States
- Prior art keywords
- graphic model
- ROI
- video content
- hybrid
- videos
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- H04N13/117—Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation, the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
- H04N13/0014
- G06T15/20—Perspective computation
- G06T15/205—Image-based rendering
- G06T19/006—Mixed reality
- H04N13/156—Mixing image signals
- H04N13/279—Image signal generators from 3D object models, e.g. computer-generated stereoscopic image signals, the virtual viewpoint locations being selected by the viewers or determined by tracking
- G06T2207/10021—Stereoscopic video; Stereoscopic image sequence
Abstract
The present invention relates to a method for generating 3D viewpoint video content. The method comprises the steps of receiving videos shot by cameras distributed to capture an object; forming a 3D graphic model of at least part of the scene of the object based on the videos; receiving information related to viewpoint and 3D region of interest (ROI) in the object; and combining the 3D graphic model and the videos related to the 3D ROI to form a hybrid 3D video content.
Description
- The present invention relates to a method and apparatus for generating 3D free viewpoint video.
- The 3D live broadcasting service with free viewpoints has been attracting a lot of interest from both industry and academia. With this service, a user can watch 3D video from any user-selected viewpoint, which gives the user a great viewing experience and opens up many possibilities for interactive virtual 3D applications.
- One conventional solution for achieving a 3D live broadcasting service with free viewpoints is to install cameras at all the popular viewpoints and to simply switch the video streams according to the users' viewpoint selections. Obviously, this solution is very expensive and hardly portable, because a large number of cameras must be installed if a service provider wants to offer enjoyable free viewpoint 3D video to users.
- Recent technological advances have introduced two other solutions for this service, namely 3D model reconstruction and 3D view synthesis. The 3D model reconstruction approach generally includes eight processing steps for each video frame: 1) capturing multi-view video frames using cameras installed around the target, 2) finding the corresponding pixels in each view using image matching algorithms, 3) calculating the disparity of each pixel and generating the disparity map for any adjacent views, 4) working out the depth value of each pixel using the disparity and the camera calibration parameters, 5) re-projecting all the pixels with their depth values into 3D space to form a point cloud, 6) estimating a 3D mesh from the point cloud, 7) merging the texture from all the views and attaching it to the 3D mesh to form a complete graphic model, and 8) finally rendering the graphic model at the user terminal from the selected viewpoint. This 3D model reconstruction approach can achieve free viewpoint navigation smoothly, but the rendering results look artificial and are not as good as video captured directly by the cameras.
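- As an illustration of steps 3) to 5) above, the following is a minimal sketch of one common way (not necessarily the one used in this patent) to turn a disparity map from a rectified camera pair into depth values and a 3D point cloud; the focal length, baseline and principal point used here are hypothetical placeholder values.

```python
import numpy as np

def disparity_to_point_cloud(disparity, focal_px, baseline_m, cx, cy):
    """Back-project a dense disparity map (in pixels) from a rectified
    camera pair into an N x 3 point cloud using depth Z = f * B / d."""
    h, w = disparity.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    valid = disparity > 0                       # skip unmatched pixels
    z = focal_px * baseline_m / disparity[valid]
    x = (us[valid] - cx) * z / focal_px         # pinhole back-projection
    y = (vs[valid] - cy) * z / focal_px
    return np.stack([x, y, z], axis=-1)

# Placeholder inputs for illustration only; real values would come from
# the image matching step and the camera calibration parameters.
disparity = np.random.uniform(1.0, 64.0, size=(480, 640)).astype(np.float32)
points = disparity_to_point_cloud(disparity, focal_px=1000.0,
                                  baseline_m=0.5, cx=320.0, cy=240.0)
print(points.shape)  # (N, 3) point cloud, the input to mesh estimation in step 6)
```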
- The other solution, 3D view synthesis approach, tries to solve the problem through view interpolation algorithms. By applying some mathematical transformations for the interpolation of the intermediate views from adjacent cameras, the virtual views can be directly generated. This 3D view synthesis approach can achieve better perceptive results if the cameras are uniformly distributed and carefully calibrated, but realistic mathematical transformations are usually difficult and require some computation power at user terminal.
- A method for synthesizing 2D free viewpoint images is shown in the technical paper: Kunihiro Hayashi and Hideo Saito, “Synthesizing free-viewpoint images from multiple view videos in soccer stadium”, Proceedings of the International Conference on Computer Graphics, Imaging and Visualization (CGIV'06), IEEE, 2006.
- These and other drawbacks and disadvantages of the above-mentioned related art are addressed by the present invention.
- According to an aspect of the present invention, there is provided a method for generating 3D viewpoint video content, the method comprising the steps of receiving videos shot by cameras distributed to capture an object; forming a 3D graphic model of at least part of the scene of the object based on the videos; receiving information related to viewpoint and 3D region of interest (ROI) in the object; and combining the 3D graphic model and the videos related to the 3D ROI to form a
hybrid 3D video content. - According to another aspect of the present invention, there is provided a method for presenting a
hybrid 3D video content including a 3D graphic model and videos related to a 3D region of interest (ROI), the method comprising the steps of receiving the hybrid 3D video content; retrieving the 3D graphic model and the videos related to the 3D ROI in the hybrid 3D video content; rendering each video frame of the 3D graphic model; synthesizing virtual 3D views in a video frame related to the 3D ROI; merging the synthesized virtual 3D views in the video frame on the 3D graphic model in the corresponding video frame to form the final view for the frame; and presenting the final view on a display.
- These and other aspects, features and advantages of the present invention will become apparent from the following description in connection with the accompanying drawings in which:
- FIG. 1 illustrates an exemplary block diagram of a system for broadcasting 3D live free viewpoint video according to an embodiment of the present invention;
- FIG. 2 illustrates an exemplary block diagram of the head-end according to an embodiment of the present invention;
- FIG. 3 illustrates an exemplary block diagram of the user terminal according to an embodiment of the present invention;
- FIGS. 4 and 5 illustrate an example of the implementation of the system according to an embodiment of the present invention;
- FIG. 6 is a flow chart showing a process for generating 3D live free viewpoint video content;
- FIG. 7 is a flow chart showing the process for creating the 3D graphic model; and
- FIG. 8 is a flow chart showing the process for presenting the hybrid 3D video content.
- In the following description, various aspects of an embodiment of the present invention will be described. For the purpose of explanation, specific configurations and details are set forth in order to provide a thorough understanding. However, it will also be apparent to one skilled in the art that the present invention may be practiced without the specific details present herein.
- FIG. 1 illustrates an exemplary block diagram of a system 100 for broadcasting 3D live free viewpoint video according to an embodiment of the present invention. The system 100 may comprise a head-end 200 and at least one user terminal 300 connected to the head-end 200 via a wired or wireless network such as a Wide Area Network (WAN). Video cameras 110a, 110b, 110c (referred to as “110” hereinafter) are connected to the head-end 200 via a wired or wireless network such as a Local Area Network (LAN). The number of the video cameras may depend on the object to capture.
- FIG. 2 illustrates an exemplary block diagram of the head-end 200 according to an embodiment of the present invention. As shown in FIG. 2, the head-end 200 comprises a CPU (Central Processing Unit) 210, an I/O (Input/Output) module 220 and storage 230. A memory 240 such as RAM (Random Access Memory) is connected to the CPU 210 as shown in FIG. 2.
- The I/O module 220 is configured to receive video image data from the cameras 110 connected to the I/O module 220. Also, the I/O module 220 is configured to receive information such as the user's selection on viewpoint and 3D region of interest (ROI), the screen resolution of the display in the user terminal 300, the processing power of the user terminal 300 and other parameters of the user terminal 300, and to transmit the video content generated by the head-end 200 to the user terminal 300.
- The storage 230 is configured to store software programs and data for the CPU 210 of the head-end 200 to perform the process which will be described below.
- FIG. 3 illustrates an exemplary block diagram of the user terminal 300 according to an embodiment of the present invention. As shown in FIG. 3, the user terminal 300 also comprises a CPU (Central Processing Unit) 310, an I/O module 320, storage 330 and a memory 340 such as RAM (Random Access Memory) connected to the CPU 310. The user terminal 300 further comprises a display 360 and a user input module 350.
- The I/O module 320 in the user terminal 300 is configured to receive the video content transmitted by the head-end 200 and to transmit information such as the user's selection on viewpoint and region of interest (ROI), the screen resolution of the display in the user terminal 300, the processing power of the user terminal 300 and other parameters of the user terminal 300 to the head-end 200.
- The storage 330 is configured to store software programs and data for the CPU 310 of the user terminal 300 to perform the process which will be described below.
- The display 360 is configured so that it can present the 3D video content provided by the head-end 200. The display 360 can be a touch-screen so that, in addition to the user input module 350, the user can input the user's selection on viewpoint and 3D region of interest (ROI) directly on the display 360.
- The user input module 350 may be a user interface such as a keyboard, a pointing device like a mouse and/or a remote controller to input the user's selection on viewpoint and region of interest (ROI). The user input module 350 can be optional if the display 360 is a touch-screen and the user terminal 300 is configured so that such user's selection can be input on the display 360.
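- The parameters exchanged between the user terminal 300 and the head-end 200 (viewpoint, 3D ROI, screen resolution, processing power) can be serialized in any convenient form. The snippet below is a purely hypothetical request payload, with made-up field names and units, only meant to show the kind of information such a message might carry.

```python
import json

# Hypothetical message from the user terminal to the head-end; the field
# names and units are illustrative and not defined by the patent.
selection_message = {
    "viewpoint": {"pan_deg": 35.0, "tilt_deg": -5.0, "zoom": 1.2},
    "roi": {"center_xyz": [52.0, 0.0, 11.0], "radius_m": 3.0},   # e.g. around a player
    "terminal": {
        "screen_resolution": [1920, 1080],
        "processing_power_score": 7,       # coarse capability hint
        "supports_stereoscopic": True,
    },
}

payload = json.dumps(selection_message)
# The head-end could parse this payload to pick cameras near the requested
# viewpoint and to choose the level of detail of the hybrid content.
print(payload)
```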
- FIGS. 4 and 5 illustrate an example of the implementation of the system 100 according to an embodiment of the present invention. FIGS. 4 and 5 illustratively show the system 100 applied to broadcasting 3D live free viewpoint video of a soccer game. As can be seen in FIGS. 4 and 5, the cameras 110 are preferably distributed so that they surround the soccer stadium. The head-end 200 can be installed in a room in the stadium and the user terminal 300 can be located at the user's home, for example.
- FIG. 6 is a flow chart showing a process for generating 3D live free viewpoint video content. The method will be described below with reference to FIGS. 1 to 6.
- At step 602, each of the on-site cameras 110 shoots live video from a different viewpoint, and those live videos are transmitted to the head-end 200 via a network such as a Local Area Network (LAN). In this step, for example, a video of a default viewpoint shot by a certain camera 110 is transmitted from the head-end 200 to the user terminal 300 and displayed on the display 360 so that a user can select at least one 3D region of interest (ROI) on the display 360. The region of interest can be a soccer player on the display 360 in this example.
- At step 604, the CPU 210 of the head-end 200 analyzes the videos using the calibrated camera parameters to form a graphic model of the whole or at least part of the scene of the stadium. The calibrated camera parameters are related to the locations and orientations of the cameras 110. For example, the calibration of each camera can be realized by capturing a reference chart, such as a mesh-like chart, with each camera and analyzing the respective captured image of the reference chart. The analysis may include analyzing the size and the distortion of the reference chart captured in the image. The calibrated camera parameters can be obtained by performing camera calibration using the on-site cameras 110 and are preliminarily stored in the storage 230.
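- One standard way to obtain such calibrated camera parameters from images of a planar reference chart is the chessboard-based calibration available in OpenCV, sketched below; this is only an example of the general idea and not necessarily the procedure used in the patent (the chart type, pattern size and square size are assumptions).

```python
import cv2
import numpy as np

def calibrate_from_chart_images(images, pattern_size=(9, 6), square_size=0.05):
    """Estimate camera intrinsics, distortion and per-image chart pose from
    several captured images of a planar chessboard reference chart."""
    # 3D corner coordinates of the chart in its own plane (Z = 0), in metres.
    objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
    objp *= square_size

    obj_points, img_points, image_size = [], [], None
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        image_size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, pattern_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    # rvecs/tvecs give the chart's rotation and translation for each image,
    # i.e. cues about the camera's orientation and location.
    ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, image_size, None, None)
    return camera_matrix, dist_coeffs, rvecs, tvecs
```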
- At step 606, the head-end 200 receives the user's selection on viewpoint and 3D region of interest (ROI). The user's selection can be input via the user input module 350 and/or the display 360 of the user terminal 300. The user's selection on viewpoint can be achieved by selecting a viewpoint using arrow keys on a remote controller, by pointing to a viewpoint using a pointing device, or by any other possible method. For example, if the user wants to see a scene of a diving save by the goalkeeper, the user can select the viewpoint towards the goalkeeper. Also, the user's selection on 3D region of interest (ROI) can be achieved by circling a pointer around an interesting object or area on the display 360 using the user input module 350, or directly on the display 360 if it is a touch-screen.
- If a user does not select the viewpoint, the CPU 210 of the head-end 200 then selects a default viewpoint with a certain camera 110. Also, if a user does not specify a 3D ROI, the CPU 210 of the head-end 200 analyzes the video of the selected or default viewpoint to estimate the possible 3D ROI within the scene of the video. The process for estimating the possible 3D ROI within the scene of the video can be performed using conventional ROI detection methods as mentioned in the technical paper: Xinding Sun, Jonathan Foote, Don Kimber and B. S. Manjunath, “Region of Interest Extraction and Virtual Camera Control Based on Panoramic Video Capturing”, IEEE Transactions on Multimedia, 2005.
- As described above, the head-end 200 acquires information related to the user's selection on the viewpoint and the 3D ROI, or the default viewpoint and the estimated 3D ROI.
- At step 608, the head-end 200 may receive additional data including the screen resolution of the display 360, the processing power of the CPU 310 and any other parameters of the user terminal 300, so as to transmit proper content to the user terminal 300 in accordance with such additional data. Such data are preliminarily stored in the storage 330 of the user terminal 300.
- At step 610, the CPU 210 of the head-end 200 then encodes the graphic model of the stadium seen from the selected or default viewpoint, together with the videos related to the selected or estimated 3D ROI, which are shot by at least two cameras 110 located close to the user's selected or default viewpoint, to form a hybrid 3D video content with a proper level of detail (resolution) according to the additional data regarding the user terminal 300. The graphic model and the videos related to the 3D ROI are encoded and combined in the hybrid 3D video content.
- For example, if the display 360 has a high resolution and the CPU 310 has high processing power, hybrid 3D video content with a high level of detail can be transmitted to the user terminal 300. In the reverse situation, the level of detail of the hybrid 3D video content to be transmitted to the user terminal 300 can be reduced in order to save network bandwidth between the head-end 200 and the user terminal 300 and processing load on the CPU 310. The level of detail of the hybrid 3D video content to be transmitted to the user terminal 300 can be determined by the CPU 210 of the head-end 200 based on the additional data regarding the user terminal 300.
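- A simple way to picture this level-of-detail decision on the head-end side is sketched below; the thresholds, the three tiers and the field names are invented for illustration and are not specified by the patent.

```python
# Hypothetical level-of-detail policy based on the additional data
# reported by the user terminal (resolution, processing power).
def choose_level_of_detail(terminal):
    width, height = terminal["screen_resolution"]
    power = terminal["processing_power_score"]
    if width * height >= 1920 * 1080 and power >= 7:
        return "high"      # full-resolution ROI videos, dense background model
    if width * height >= 1280 * 720 and power >= 4:
        return "medium"    # downscaled ROI videos, decimated background model
    return "low"           # strongly reduced detail to save bandwidth and CPU

def pack_hybrid_content(graphic_model, roi_videos, viewpoint, terminal):
    """Bundle the encoded background graphic model and the ROI videos into
    one hybrid 3D video content unit at the chosen level of detail."""
    return {
        "level_of_detail": choose_level_of_detail(terminal),
        "viewpoint": viewpoint,
        "background_model": graphic_model,   # e.g. encoded mesh plus texture
        "roi_videos": roi_videos,            # streams from cameras near the viewpoint
    }
```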
- In general, it is known that a 3D graphic model is formed from points, so-called “vertices”, which define the shape and form “polygons”, and that the 3D graphic model is generally rendered as a 2D representation. In this illustrative example, the graphic model of the hybrid 3D video content is a 3D graphic model which will be presented on the display 360 of the user terminal 300 as a 2D representation serving as the background, whereas the virtual 3D views, which will be generated from the videos related to the selected or estimated 3D ROI, will be presented on the background 3D graphic model on the display 360 as a 3D representation (a stereoscopic representation having right and left views). In this example, the 3D graphic model rendered in the 2D representation as the background is related to the scene of the soccer stadium, and the 3D ROI rendered in the 3D representation on the background is related to the soccer player.
- FIG. 7 is a flow chart showing the process for creating the 3D graphic model. The process for creating the 3D graphic model will be discussed below with reference to FIGS. 2, 5 and 7.
- At first, the videos shot by the on-site cameras 110 are received via the I/O module 220 of the head-end 200 and the calibrated camera parameters are retrieved from the storage 230 (S702). Then, video frame pre-processing such as image rectification is performed on the videos by the CPU 210 (S704).
- Following this step, a multi-view image matching process is performed by the CPU 210 to find the corresponding pixels in the videos of adjacent views (S706), a disparity map is calculated for those videos of adjacent views (S708), and a 3D point cloud and a 3D mesh are generated based on the disparity map created in step 708 (S710).
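- For a rectified pair of adjacent views, the matching and disparity steps S706 and S708 can be implemented, for example, with OpenCV's semi-global block matcher as sketched below; the matcher parameters are illustrative, and the patent does not prescribe a particular matching algorithm.

```python
import cv2

def disparity_for_adjacent_views(left_gray, right_gray):
    """Find dense correspondences between two rectified adjacent views and
    compute their disparity map using semi-global block matching."""
    matcher = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=128,      # search range; must be a multiple of 16
        blockSize=5,
        uniquenessRatio=10,
        speckleWindowSize=100,
        speckleRange=2,
    )
    # compute() returns fixed-point disparities scaled by 16.
    disparity = matcher.compute(left_gray, right_gray).astype("float32") / 16.0
    return disparity
```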
- Then, texture is synthesized based on the video images from all or at least part of the views, and the synthesized texture is attached to the 3D mesh surface by the CPU 210 (S712). Finally, a hole-filling and artifact-removing process is performed by the CPU 210 (S714). Through this process, the 3D graphic model is generated (S716). In this example, the 3D graphic model is an entire view of the soccer stadium, as shown in FIG. 5 with reference symbol “3DGM”.
- A conventional 3D graphic model reconstruction process is mentioned in the technical paper: Noah Snavely, Ian Simon, Michael Goesele, Richard Szeliski and Steven M. Seitz, “Scene Reconstruction and Visualization From Community Photo Collections”, Proceedings of the IEEE, Vol. 98, No. 8, August 2010, pp. 1370-1390.
- FIG. 8 is a flow chart showing the process for presenting the hybrid 3D video content. The process for reproducing the hybrid 3D video content will be discussed below with reference to FIGS. 3 and 8.
- At first, the I/O module 320 of the user terminal 300 receives the hybrid 3D video content from the head-end 200 (S802).
- Then, the CPU 310 of the user terminal 300 decodes the background 3D graphic model seen from the selected or default viewpoint and the videos related to the selected or estimated 3D ROI in the hybrid 3D video content (S804); as a result, the background 3D graphic model and the videos related to the 3D ROI are retrieved. Then the CPU 310 renders each video frame of the background 3D graphic model seen from the selected or default viewpoint (S806).
- Next, video frame pre-processing such as image rectification is performed by the CPU 310 on the current video frame of the videos related to the selected or estimated 3D ROI, in order to synthesize the virtual 3D views in the selected or default viewpoint (S808).
- Following step 808, a multi-view image matching process is performed by the CPU 310 to find the corresponding pixels in the videos of adjacent views (S810). If necessary, a projective transformation process for major structures in the video scene may be performed by the CPU 310 after step 810 (S812).
- Then, a view interpolation process is performed by the CPU 310 to synthesize the virtual 3D views in the selected or default viewpoint, using conventional pixel-level interpolation techniques, for example (S814), and a hole-filling and artifact-removing process is applied to the synthesized virtual 3D views by the CPU 310 (S816). In step 814, two virtual 3D views are synthesized if the virtual 3D views are generated for a stereoscopic 3D representation, and more than two virtual 3D views are synthesized if the virtual 3D views are generated for a multi-view 3D representation. Virtual 3D views are illustratively shown in FIG. 5 with reference symbols “VV1, VV2 and VV3”.
- A conventional view interpolation process is mentioned in the technical paper: S. Chen and L. Williams, “View Interpolation for Image Synthesis”, ACM SIGGRAPH'93, pp. 279-288, 1993.
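- A deliberately simplified sketch of such a pixel-level view interpolation is given below, assuming a rectified pair of ROI views and a disparity map for the left view: the left view is forward-warped towards an intermediate viewpoint, with the right view used as a crude fallback for holes. A real implementation would add occlusion handling and the hole-filling and artifact removal of step S816.

```python
import numpy as np

def interpolate_view(left, right, disparity, alpha):
    """Synthesize an intermediate view between two adjacent ROI views.
    alpha in [0, 1] selects the virtual viewpoint between left (0) and
    right (1); disparity is given for the left view in pixels."""
    h, w = disparity.shape
    out = right.astype(np.float32).copy()            # fallback where nothing lands
    ys, xs = np.mgrid[0:h, 0:w]
    new_x = np.clip((xs - alpha * disparity).round().astype(int), 0, w - 1)
    out[ys, new_x] = left[ys, xs]                    # forward-warp the left view
    return out.astype(left.dtype)

# For a stereoscopic representation, two virtual views would be synthesized,
# e.g. slightly to either side of the selected viewpoint:
# left_eye  = interpolate_view(left, right, disparity, alpha=0.45)
# right_eye = interpolate_view(left, right, disparity, alpha=0.55)
```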
- Finally, the virtual 3D views are aligned and merged by the CPU 310 onto the background 3D graphic model with the same perspective parameters, to generate the final view for the frame of the hybrid 3D video content (S818), and this frame is displayed on the display 360 (S820).
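- Assuming the virtual views were rendered with the same perspective parameters as the background, the merging of step S818 amounts to compositing the ROI pixels over the rendered background frame. The sketch below shows this with a binary ROI mask; the function and variable names are illustrative.

```python
import numpy as np

def merge_on_background(background_frame, virtual_view, roi_mask):
    """Overlay a synthesized virtual 3D view onto the rendered frame of the
    background 3D graphic model; roi_mask is 1 where ROI pixels of the
    virtual view should replace the background."""
    mask = roi_mask[..., None].astype(np.float32)     # (h, w, 1) for broadcasting
    blended = (mask * virtual_view.astype(np.float32)
               + (1.0 - mask) * background_frame.astype(np.float32))
    return blended.astype(background_frame.dtype)

# For stereoscopic output this is done once per eye:
# final_left  = merge_on_background(bg_render_left,  virtual_left,  roi_mask)
# final_right = merge_on_background(bg_render_right, virtual_right, roi_mask)
```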
- At step 825, if the process has been completed for all video frames of the hybrid 3D video content to be presented, the process terminates. If not, the CPU 310 starts the process of steps 808-820 for the next video frame.
- The user can change the user's selection on viewpoint and 3D region of interest (ROI) at the user terminal 300 while the hybrid 3D video content is presented on the display 360. When the user's selection on viewpoint and 3D region of interest (ROI) is changed, the above-described process is performed according to the new selection.
- The above-described example is discussed in a context where the background 3D graphic model is presented on the display 360 as a 2D representation and the virtual 3D views are presented on the display 360 as a 3D representation. However, the system 100 can be configured to present both the background 3D graphic model and the virtual 3D views on the display 360 as a 3D representation, if this is possible in view of conditions such as the bandwidth of the network and the processing load on the head-end 200 and the user terminal 300. Also, the system 100 can be configured to present both the background 3D graphic model and a virtual view on the display 360 as a 2D representation.
- These and other features and advantages of the present principles may be readily ascertained by one of ordinary skill in the pertinent art based on the teachings herein. It is to be understood that the teachings of the present principles may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof.
- Most preferably, the teachings of the present principles are implemented as a combination of hardware and software. Moreover, the software may be implemented as an application program tangibly embodied on a program storage unit. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (“CPU”), a random access memory (“RAM”), and input/output (“I/O”) interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit.
- It is to be further understood that, because some of the constituent system components and methods depicted in the accompanying drawings are preferably implemented in software, the actual connections between the system components or the process function blocks may differ depending upon the manner in which the present principles are programmed. Given the teachings herein, one of ordinary skill in the pertinent art will be able to contemplate these and similar implementations or configurations of the present principles.
- Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the present principles are not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present principles. All such changes and modifications are intended to be included within the scope of the present principles as set forth in the appended claims.
Claims (11)
1. A method for generating 3D viewpoint video content, the method comprising:
receiving videos in which an object is captured;
forming a 3D graphic model of at least part of the scene of the object based on the videos;
acquiring information related to viewpoint and 3D region of interest (ROI) in the object; and
combining the 3D graphic model and the videos related to the 3D ROI to form a hybrid 3D video content.
2. The method according to claim 1, wherein the method further comprises receiving additional data to determine the level of detail of the hybrid 3D video content to be formed.
3. A method for presenting a hybrid 3D video content including a 3D graphic model and videos related to a 3D region of interest (ROI), the method comprising:
receiving the hybrid 3D video content;
retrieving the 3D graphic model and the videos related to the 3D ROI in the hybrid 3D video content;
rendering each video frame of the 3D graphic model;
synthesizing virtual 3D views in a video frame related to the 3D ROI;
merging the synthesized virtual 3D views in the video frame on the 3D graphic model in the corresponding video frame to form a final view for the video frame; and
presenting the final view on a display.
4. The method according to claim 3, wherein the 3D graphic model is presented on the display in 2D representation and the virtual 3D views are presented on the display in 3D representation.
5. The method according to claim 3, wherein the rendering, synthesizing and presenting are repeated.
6. The method according to claim 3, wherein the merging includes aligning the virtual 3D views with the 3D graphic model with the same perspective parameters.
7. An apparatus for generating 3D viewpoint video content, the apparatus comprising:
a processor configured to:
receive videos in which an object is captured;
form a 3D graphic model of at least part of the scene of the object based on the videos;
acquire information related to viewpoint and 3D region of interest (ROI) in the object; and
combine the 3D graphic model and the videos related to the 3D ROI to form a hybrid 3D video content.
8. The apparatus according to claim 7, wherein the processor is further configured to receive additional data to determine the level of detail of the hybrid 3D video content to be formed.
9. An apparatus for presenting a hybrid 3D video content including a 3D graphic model and videos related to a 3D region of interest (ROI), the apparatus comprising:
a display; and
a processor configured to:
receive the hybrid 3D video content;
retrieve the 3D graphic model and the videos related to the 3D ROI in the hybrid 3D video content;
render each video frame of the 3D graphic model;
synthesize virtual 3D views in a video frame related to the 3D ROI;
merge the synthesized virtual 3D views in the video frame on the 3D graphic model in the corresponding video frame to form a final view for the video frame; and
present the final view on the display.
10. The apparatus according to claim 9, wherein the processor is further configured to present on the display the 3D graphic model in 2D representation and the virtual 3D views in 3D representation.
11. The apparatus according to claim 9, wherein the processor is further configured to align the virtual 3D views with the 3D graphic model with the same perspective parameters.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2011/084132 WO2013086739A1 (en) | 2011-12-16 | 2011-12-16 | Method and apparatus for generating 3d free viewpoint video |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140340404A1 (en) | 2014-11-20 |
Family
ID=48611837
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/365,240 Abandoned US20140340404A1 (en) | 2011-12-16 | 2011-12-16 | Method and apparatus for generating 3d free viewpoint video |
Country Status (3)
Country | Link |
---|---|
US (1) | US20140340404A1 (en) |
EP (1) | EP2791909A4 (en) |
WO (1) | WO2013086739A1 (en) |
Cited By (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140098100A1 (en) * | 2012-10-05 | 2014-04-10 | Qualcomm Incorporated | Multiview synthesis and processing systems and methods |
US20160150217A1 (en) * | 2014-11-20 | 2016-05-26 | Cappasity Inc. | Systems and methods for 3d capturing of objects and motion sequences using multiple range and rgb cameras |
US20160217591A1 (en) * | 2013-10-02 | 2016-07-28 | Given Imaging Ltd. | System and method for size estimation of in-vivo objects |
US20170148223A1 (en) * | 2014-10-31 | 2017-05-25 | Fyusion, Inc. | Real-time mobile device capture and generation of ar/vr content |
WO2018051747A1 (en) * | 2016-09-14 | 2018-03-22 | キヤノン株式会社 | Image processing device, image generating method, and program |
KR20180042386A (en) * | 2015-08-29 | 2018-04-25 | 후아웨이 테크놀러지 컴퍼니 리미티드 | Method and apparatus for playing video content from any location and any time |
US20180288447A1 (en) * | 2015-10-28 | 2018-10-04 | Sankar Jayaram | Apparatus and method for distributing mulitmedia events from a client |
US20180342267A1 (en) * | 2017-05-26 | 2018-11-29 | Digital Domain, Inc. | Spatialized rendering of real-time video data to 3d space |
US10165199B2 (en) * | 2015-09-01 | 2018-12-25 | Samsung Electronics Co., Ltd. | Image capturing apparatus for photographing object according to 3D virtual object |
US20190026945A1 (en) * | 2014-07-25 | 2019-01-24 | mindHIVE Inc. | Real-time immersive mediated reality experiences |
US20190215486A1 (en) * | 2017-08-07 | 2019-07-11 | Jaunt Inc. | Viewpoint-Adaptive Three-Dimensional (3D) Personas |
US10430995B2 (en) | 2014-10-31 | 2019-10-01 | Fyusion, Inc. | System and method for infinite synthetic image generation from multi-directional structured image array |
JP2020010347A (en) * | 2018-11-02 | 2020-01-16 | キヤノン株式会社 | Generation device, generation method, and program |
US10540773B2 (en) | 2014-10-31 | 2020-01-21 | Fyusion, Inc. | System and method for infinite smoothing of image sequences |
JP2020503792A (en) * | 2016-12-30 | 2020-01-30 | 華為技術有限公司Huawei Technologies Co.,Ltd. | Information processing method and apparatus |
CN111345035A (en) * | 2017-10-31 | 2020-06-26 | 索尼公司 | Information processing device, information processing method, and information processing program |
US10699476B2 (en) * | 2015-08-06 | 2020-06-30 | Ams Sensors Singapore Pte. Ltd. | Generating a merged, fused three-dimensional point cloud based on captured images of a scene |
US10719733B2 (en) | 2015-07-15 | 2020-07-21 | Fyusion, Inc. | Artificially rendering images using interpolation of tracked control points |
US10726560B2 (en) * | 2014-10-31 | 2020-07-28 | Fyusion, Inc. | Real-time mobile device capture and generation of art-styled AR/VR content |
US10726593B2 (en) | 2015-09-22 | 2020-07-28 | Fyusion, Inc. | Artificially rendering images using viewpoint interpolation and extrapolation |
US10818029B2 (en) | 2014-10-31 | 2020-10-27 | Fyusion, Inc. | Multi-directional structured image array capture on a 2D graph |
US10852902B2 (en) | 2015-07-15 | 2020-12-01 | Fyusion, Inc. | Automatic tagging of objects on a multi-view interactive digital media representation of a dynamic entity |
US10944960B2 (en) * | 2017-02-10 | 2021-03-09 | Panasonic Intellectual Property Corporation Of America | Free-viewpoint video generating method and free-viewpoint video generating system |
US10984589B2 (en) | 2017-08-07 | 2021-04-20 | Verizon Patent And Licensing Inc. | Systems and methods for reference-model-based modification of a three-dimensional (3D) mesh data model |
US11037364B2 (en) * | 2016-10-11 | 2021-06-15 | Canon Kabushiki Kaisha | Image processing system for generating a virtual viewpoint image, method of controlling image processing system, and storage medium |
US11074752B2 (en) * | 2018-02-23 | 2021-07-27 | Sony Group Corporation | Methods, devices and computer program products for gradient based depth reconstructions with robust statistics |
US11089284B2 (en) | 2016-09-14 | 2021-08-10 | Canon Kabushiki Kaisha | Image processing apparatus, image generating method, and storage medium |
US20210248769A1 (en) * | 2020-02-11 | 2021-08-12 | Samsung Electronics Co., Ltd. | Array-based depth estimation |
US11115644B2 (en) * | 2017-06-29 | 2021-09-07 | Sony Interactive Entertainment Inc. | Video generation method and apparatus using mesh and texture data |
US11195314B2 (en) | 2015-07-15 | 2021-12-07 | Fyusion, Inc. | Artificially rendering images using viewpoint interpolation and extrapolation |
US11202017B2 (en) | 2016-10-06 | 2021-12-14 | Fyusion, Inc. | Live style transfer on a mobile device |
US11435869B2 (en) | 2015-07-15 | 2022-09-06 | Fyusion, Inc. | Virtual reality environment based manipulation of multi-layered multi-view interactive digital media representations |
US11488380B2 (en) | 2018-04-26 | 2022-11-01 | Fyusion, Inc. | Method and apparatus for 3-D auto tagging |
US20230033201A1 (en) * | 2021-07-28 | 2023-02-02 | Canon Kabushiki Kaisha | Image processing apparatus, image processing method, and storage medium |
US11627251B2 (en) * | 2018-10-26 | 2023-04-11 | Canon Kabushiki Kaisha | Image processing apparatus and control method thereof, computer-readable storage medium |
US11632489B2 (en) * | 2017-01-31 | 2023-04-18 | Tetavi, Ltd. | System and method for rendering free viewpoint video for studio applications |
US11632533B2 (en) | 2015-07-15 | 2023-04-18 | Fyusion, Inc. | System and method for generating combined embedded multi-view interactive digital media representations |
US11636637B2 (en) | 2015-07-15 | 2023-04-25 | Fyusion, Inc. | Artificially rendering images using viewpoint interpolation and extrapolation |
US11776229B2 (en) | 2017-06-26 | 2023-10-03 | Fyusion, Inc. | Modification of multi-view interactive digital media representation |
US11783864B2 (en) | 2015-09-22 | 2023-10-10 | Fyusion, Inc. | Integration of audio into a multi-view interactive digital media representation |
US11876948B2 (en) | 2017-05-22 | 2024-01-16 | Fyusion, Inc. | Snapshots at predefined intervals or angles |
US11956412B2 (en) | 2015-07-15 | 2024-04-09 | Fyusion, Inc. | Drone based capture of multi-view interactive digital media |
US11960533B2 (en) | 2017-01-18 | 2024-04-16 | Fyusion, Inc. | Visual search using multi-view interactive digital media representations |
US12261990B2 (en) | 2015-07-15 | 2025-03-25 | Fyusion, Inc. | System and method for generating combined embedded multi-view interactive digital media representations |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102009928B1 (en) | 2012-08-20 | 2019-08-12 | 삼성전자 주식회사 | Cooperation method and apparatus |
US9473745B2 (en) | 2014-01-30 | 2016-10-18 | Google Inc. | System and method for providing live imagery associated with map locations |
JP2015187797A (en) * | 2014-03-27 | 2015-10-29 | シャープ株式会社 | Image data generation device and image data reproduction device |
BE1022580A9 (en) * | 2014-10-22 | 2016-10-06 | Parallaxter | Method of obtaining immersive videos with interactive parallax and method of viewing immersive videos with interactive parallax |
EP3038358A1 (en) * | 2014-12-22 | 2016-06-29 | Thomson Licensing | A method for adapting a number of views delivered by an auto-stereoscopic display device, and corresponding computer program product and electronic device |
KR102313485B1 (en) * | 2015-04-22 | 2021-10-15 | Samsung Electronics Co., Ltd. | Method and apparatus for transmitting and receiving image data for virtual reality streaming service
JP6236573B2 (en) * | 2015-05-01 | 2017-11-22 | Dentsu Inc. | Free-viewpoint video data distribution system
CN108154553A (en) * | 2018-01-04 | 2018-06-12 | 中测新图(北京)遥感技术有限责任公司 | Method and device for seamless integration of a three-dimensional model and surveillance video
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5850352A (en) * | 1995-03-31 | 1998-12-15 | The Regents Of The University Of California | Immersive video, including video hypermosaicing to generate from multiple video views of a scene a three-dimensional video mosaic from which diverse virtual video scene images are synthesized, including panoramic, scene interactive and stereoscopic images |
US6144375A (en) * | 1998-08-14 | 2000-11-07 | Praja Inc. | Multi-perspective viewer for content-based interactivity |
US7522186B2 (en) * | 2000-03-07 | 2009-04-21 | L-3 Communications Corporation | Method and apparatus for providing immersive surveillance |
US8027531B2 (en) * | 2004-07-21 | 2011-09-27 | The Board Of Trustees Of The Leland Stanford Junior University | Apparatus and method for capturing a scene using staggered triggering of dense camera arrays |
WO2008073563A1 (en) * | 2006-12-08 | 2008-06-19 | NBC Universal, Inc. | Method and system for gaze estimation
US8264542B2 (en) * | 2007-12-31 | 2012-09-11 | Industrial Technology Research Institute | Methods and systems for image processing in a multiview video system |
IL202460A (en) * | 2009-12-01 | 2013-08-29 | Rafael Advanced Defense Sys | Method and system of generating a three-dimensional view of a real scene |
US8558923B2 (en) * | 2010-05-03 | 2013-10-15 | Canon Kabushiki Kaisha | Image capturing apparatus and method for selective real time focus/parameter adjustment |
- 2011-12-16 US US14/365,240 patent/US20140340404A1/en not_active Abandoned
- 2011-12-16 EP EP11877189.8A patent/EP2791909A4/en not_active Withdrawn
- 2011-12-16 WO PCT/CN2011/084132 patent/WO2013086739A1/en active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7324594B2 (en) * | 2003-11-26 | 2008-01-29 | Mitsubishi Electric Research Laboratories, Inc. | Method for encoding and decoding free viewpoint videos |
US20100110069A1 (en) * | 2008-10-31 | 2010-05-06 | Sharp Laboratories Of America, Inc. | System for rendering virtual see-through scenes |
US20120026158A1 (en) * | 2010-02-05 | 2012-02-02 | Sony Computer Entertainment Inc. | Three-dimensional image generation device, three-dimensional image generation method, and information storage medium |
Non-Patent Citations (7)
Title |
---|
A. Smolic, K. Müller, P. Merkle, M. Kautzner, T. Wiegand, "3D Video Objects for Interactive Applications", September 8, 2005, IEEE, 13th European Signal Processing Conference, 2005 *
Adrian Hilton, Jean-Yves Guillemaut, Joe Kilner, Oliver Grau, Graham Thomas, "3D-TV Production From Conventional Cameras for Sports Broadcast", June 2011, IEEE, IEEE Transactions on Broadcasting, Volume: 57, Issue: 2, pages 462-476 * |
Aljoscha Smolic, "3D video and free viewpoint video - From capture to display", September 15, 2010, Elsevier, Pattern Recognition, Volume 44, Issue 9, pages 1958-1968 * |
Aljoscha Smolic, Peter Kauff, "Interactive 3-D Video Representation and Coding Technologies", January 2005, IEEE, Proceedings of the IEEE, Vol. 93, No. 1, pages 98-110 *
Kyohei Yoshikawa, Takashi Machida, Kiyoshi Kiyokawa, Haruo Takemura, "A High Presence Shared Space Communication System Using 2D Background and 3D Avatar", January 30, 2004, IEEE, Proceedings of the 2004 International Symposium on Applications and the Internet *
Luca Ballan, Gabriel J. Brostow, Jens Puwein, Marc Pollefeys, "Unstructured Video-Based Rendering: Interactive Exploration of Casually Captured Videos", July 2010, ACM, ACM Transactions on Graphics, Vol. 29, No. 4, Article 87 * |
Robert A. Akka, "Converting existing applications to support high-quality stereoscopy", May 24, 1999, SPIE, Proc. SPIE 3639, Stereoscopic Displays and Virtual Reality Systems VI *
Cited By (72)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140098100A1 (en) * | 2012-10-05 | 2014-04-10 | Qualcomm Incorporated | Multiview synthesis and processing systems and methods |
US20160217591A1 (en) * | 2013-10-02 | 2016-07-28 | Given Imaging Ltd. | System and method for size estimation of in-vivo objects |
US10521924B2 (en) * | 2013-10-02 | 2019-12-31 | Given Imaging Ltd. | System and method for size estimation of in-vivo objects |
US9911203B2 (en) * | 2013-10-02 | 2018-03-06 | Given Imaging Ltd. | System and method for size estimation of in-vivo objects |
US20180096491A1 (en) * | 2013-10-02 | 2018-04-05 | Given Imaging Ltd. | System and method for size estimation of in-vivo objects |
US20190026945A1 (en) * | 2014-07-25 | 2019-01-24 | mindHIVE Inc. | Real-time immersive mediated reality experiences |
US10699482B2 (en) * | 2014-07-25 | 2020-06-30 | mindHIVE Inc. | Real-time immersive mediated reality experiences |
US10818029B2 (en) | 2014-10-31 | 2020-10-27 | Fyusion, Inc. | Multi-directional structured image array capture on a 2D graph |
US10430995B2 (en) | 2014-10-31 | 2019-10-01 | Fyusion, Inc. | System and method for infinite synthetic image generation from multi-directional structured image array |
US10726560B2 (en) * | 2014-10-31 | 2020-07-28 | Fyusion, Inc. | Real-time mobile device capture and generation of art-styled AR/VR content |
US10719939B2 (en) * | 2014-10-31 | 2020-07-21 | Fyusion, Inc. | Real-time mobile device capture and generation of AR/VR content |
US10846913B2 (en) | 2014-10-31 | 2020-11-24 | Fyusion, Inc. | System and method for infinite synthetic image generation from multi-directional structured image array |
US10540773B2 (en) | 2014-10-31 | 2020-01-21 | Fyusion, Inc. | System and method for infinite smoothing of image sequences |
US20170148223A1 (en) * | 2014-10-31 | 2017-05-25 | Fyusion, Inc. | Real-time mobile device capture and generation of ar/vr content |
US10154246B2 (en) * | 2014-11-20 | 2018-12-11 | Cappasity Inc. | Systems and methods for 3D capturing of objects and motion sequences using multiple range and RGB cameras |
US20160150217A1 (en) * | 2014-11-20 | 2016-05-26 | Cappasity Inc. | Systems and methods for 3d capturing of objects and motion sequences using multiple range and rgb cameras |
US10852902B2 (en) | 2015-07-15 | 2020-12-01 | Fyusion, Inc. | Automatic tagging of objects on a multi-view interactive digital media representation of a dynamic entity |
US12261990B2 (en) | 2015-07-15 | 2025-03-25 | Fyusion, Inc. | System and method for generating combined embedded multi-view interactive digital media representations |
US11636637B2 (en) | 2015-07-15 | 2023-04-25 | Fyusion, Inc. | Artificially rendering images using viewpoint interpolation and extrapolation |
US11632533B2 (en) | 2015-07-15 | 2023-04-18 | Fyusion, Inc. | System and method for generating combined embedded multi-view interactive digital media representations |
US11435869B2 (en) | 2015-07-15 | 2022-09-06 | Fyusion, Inc. | Virtual reality environment based manipulation of multi-layered multi-view interactive digital media representations |
US10733475B2 (en) | 2015-07-15 | 2020-08-04 | Fyusion, Inc. | Artificially rendering images using interpolation of tracked control points |
US12020355B2 (en) | 2015-07-15 | 2024-06-25 | Fyusion, Inc. | Artificially rendering images using viewpoint interpolation and extrapolation |
US11956412B2 (en) | 2015-07-15 | 2024-04-09 | Fyusion, Inc. | Drone based capture of multi-view interactive digital media |
US10719733B2 (en) | 2015-07-15 | 2020-07-21 | Fyusion, Inc. | Artificially rendering images using interpolation of tracked control points |
US10719732B2 (en) | 2015-07-15 | 2020-07-21 | Fyusion, Inc. | Artificially rendering images using interpolation of tracked control points |
US11776199B2 (en) | 2015-07-15 | 2023-10-03 | Fyusion, Inc. | Virtual reality environment based manipulation of multi-layered multi-view interactive digital media representations |
US11195314B2 (en) | 2015-07-15 | 2021-12-07 | Fyusion, Inc. | Artificially rendering images using viewpoint interpolation and extrapolation |
US10699476B2 (en) * | 2015-08-06 | 2020-06-30 | Ams Sensors Singapore Pte. Ltd. | Generating a merged, fused three-dimensional point cloud based on captured images of a scene |
RU2679316C1 (en) * | 2015-08-29 | 2019-02-07 | Huawei Technologies Co., Ltd. | Method and device for playback of video content from any location and at any time
EP3334173A4 (en) * | 2015-08-29 | 2018-08-15 | Huawei Technologies Co., Ltd. | Method and device for playing video content at any position and time |
KR102087690B1 (en) * | 2015-08-29 | 2020-03-11 | 후아웨이 테크놀러지 컴퍼니 리미티드 | Method and apparatus for playing video content from any location and any time |
KR20180042386A (en) * | 2015-08-29 | 2018-04-25 | 후아웨이 테크놀러지 컴퍼니 리미티드 | Method and apparatus for playing video content from any location and any time |
US10165199B2 (en) * | 2015-09-01 | 2018-12-25 | Samsung Electronics Co., Ltd. | Image capturing apparatus for photographing object according to 3D virtual object |
US12190916B2 (en) | 2015-09-22 | 2025-01-07 | Fyusion, Inc. | Integration of audio into a multi-view interactive digital media representation |
US10726593B2 (en) | 2015-09-22 | 2020-07-28 | Fyusion, Inc. | Artificially rendering images using viewpoint interpolation and extrapolation |
US11783864B2 (en) | 2015-09-22 | 2023-10-10 | Fyusion, Inc. | Integration of audio into a multi-view interactive digital media representation |
US20180288447A1 (en) * | 2015-10-28 | 2018-10-04 | Sankar Jayaram | Apparatus and method for distributing multimedia events from a client
US11089284B2 (en) | 2016-09-14 | 2021-08-10 | Canon Kabushiki Kaisha | Image processing apparatus, image generating method, and storage medium |
WO2018051747A1 (en) * | 2016-09-14 | 2018-03-22 | Canon Kabushiki Kaisha | Image processing device, image generating method, and program
US11202017B2 (en) | 2016-10-06 | 2021-12-14 | Fyusion, Inc. | Live style transfer on a mobile device |
US11037364B2 (en) * | 2016-10-11 | 2021-06-15 | Canon Kabushiki Kaisha | Image processing system for generating a virtual viewpoint image, method of controlling image processing system, and storage medium |
JP2020503792A (en) * | 2016-12-30 | 2020-01-30 | Huawei Technologies Co., Ltd. | Information processing method and apparatus
JP7058273B2 (en) | 2016-12-30 | 2022-04-21 | Huawei Technologies Co., Ltd. | Information processing method and apparatus
US11960533B2 (en) | 2017-01-18 | 2024-04-16 | Fyusion, Inc. | Visual search using multi-view interactive digital media representations |
US11632489B2 (en) * | 2017-01-31 | 2023-04-18 | Tetavi, Ltd. | System and method for rendering free viewpoint video for studio applications |
US11665308B2 (en) | 2017-01-31 | 2023-05-30 | Tetavi, Ltd. | System and method for rendering free viewpoint video for sport applications |
US10944960B2 (en) * | 2017-02-10 | 2021-03-09 | Panasonic Intellectual Property Corporation Of America | Free-viewpoint video generating method and free-viewpoint video generating system |
US11876948B2 (en) | 2017-05-22 | 2024-01-16 | Fyusion, Inc. | Snapshots at predefined intervals or angles |
US20180342267A1 (en) * | 2017-05-26 | 2018-11-29 | Digital Domain, Inc. | Spatialized rendering of real-time video data to 3d space |
US10796723B2 (en) * | 2017-05-26 | 2020-10-06 | Immersive Licensing, Inc. | Spatialized rendering of real-time video data to 3D space |
US11776229B2 (en) | 2017-06-26 | 2023-10-03 | Fyusion, Inc. | Modification of multi-view interactive digital media representation |
US11115644B2 (en) * | 2017-06-29 | 2021-09-07 | Sony Interactive Entertainment Inc. | Video generation method and apparatus using mesh and texture data |
US10984589B2 (en) | 2017-08-07 | 2021-04-20 | Verizon Patent And Licensing Inc. | Systems and methods for reference-model-based modification of a three-dimensional (3D) mesh data model |
US11004264B2 (en) | 2017-08-07 | 2021-05-11 | Verizon Patent And Licensing Inc. | Systems and methods for capturing, transferring, and rendering viewpoint-adaptive three-dimensional (3D) personas |
US11580697B2 (en) | 2017-08-07 | 2023-02-14 | Verizon Patent And Licensing Inc. | Systems and methods for reconstruction and rendering of viewpoint-adaptive three-dimensional (3D) personas |
US20190215486A1 (en) * | 2017-08-07 | 2019-07-11 | Jaunt Inc. | Viewpoint-Adaptive Three-Dimensional (3D) Personas |
US10997786B2 (en) | 2017-08-07 | 2021-05-04 | Verizon Patent And Licensing Inc. | Systems and methods for reconstruction and rendering of viewpoint-adaptive three-dimensional (3D) personas |
US11461969B2 (en) | 2017-08-07 | 2022-10-04 | Verizon Patent And Licensing Inc. | Systems and methods for compression, transfer, and reconstruction of three-dimensional (3D) data meshes
US11024078B2 (en) | 2017-08-07 | 2021-06-01 | Verizon Patent And Licensing Inc. | Systems and methods for compression, transfer, and reconstruction of three-dimensional (3D) data meshes
US11386618B2 (en) | 2017-08-07 | 2022-07-12 | Verizon Patent And Licensing Inc. | Systems and methods for model-based modification of a three-dimensional (3D) mesh |
US11095854B2 (en) * | 2017-08-07 | 2021-08-17 | Verizon Patent And Licensing Inc. | Viewpoint-adaptive three-dimensional (3D) personas |
US11403057B2 (en) * | 2017-10-31 | 2022-08-02 | Sony Corporation | Information processing device, information processing method, and information processing program |
CN111345035A (en) * | 2017-10-31 | 2020-06-26 | 索尼公司 | Information processing device, information processing method, and information processing program |
US11074752B2 (en) * | 2018-02-23 | 2021-07-27 | Sony Group Corporation | Methods, devices and computer program products for gradient based depth reconstructions with robust statistics |
US11488380B2 (en) | 2018-04-26 | 2022-11-01 | Fyusion, Inc. | Method and apparatus for 3-D auto tagging |
US11967162B2 (en) | 2018-04-26 | 2024-04-23 | Fyusion, Inc. | Method and apparatus for 3-D auto tagging |
US11627251B2 (en) * | 2018-10-26 | 2023-04-11 | Canon Kabushiki Kaisha | Image processing apparatus and control method thereof, computer-readable storage medium |
JP2020010347A (en) | 2018-11-02 | 2020-01-16 | Canon Kabushiki Kaisha | Generation device, generation method, and program
US20210248769A1 (en) * | 2020-02-11 | 2021-08-12 | Samsung Electronics Co., Ltd. | Array-based depth estimation |
US11816855B2 (en) * | 2020-02-11 | 2023-11-14 | Samsung Electronics Co., Ltd. | Array-based depth estimation |
US20230033201A1 (en) * | 2021-07-28 | 2023-02-02 | Canon Kabushiki Kaisha | Image processing apparatus, image processing method, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
EP2791909A1 (en) | 2014-10-22 |
WO2013086739A1 (en) | 2013-06-20 |
EP2791909A4 (en) | 2015-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20140340404A1 (en) | Method and apparatus for generating 3d free viewpoint video | |
JP4783588B2 (en) | Interactive viewpoint video system and process | |
Uyttendaele et al. | Image-based interactive exploration of real-world environments | |
US7307654B2 (en) | Image capture and viewing system and method for generating a synthesized image | |
US7142209B2 (en) | Real-time rendering system and process for interactive viewpoint video that was generated using overlapping images of a scene captured from viewpoints forming a grid | |
EP2412161B1 (en) | Combining views of a plurality of cameras for a video conferencing endpoint with a display wall | |
US7221366B2 (en) | Real-time rendering system and process for interactive viewpoint video | |
EP2214137B1 (en) | A method and apparatus for frame interpolation | |
Inamoto et al. | Virtual viewpoint replay for a soccer match by view interpolation from multiple cameras | |
US11006141B2 (en) | Methods and systems for using atlas frames to process data representative of a scene | |
US11232625B2 (en) | Image processing | |
US10453244B2 (en) | Multi-layer UV map based texture rendering for free-running FVV applications | |
CN111294584B (en) | Three-dimensional scene model display method and device, storage medium and electronic equipment | |
CN112365407A (en) | Panoramic stitching method for camera with configurable visual angle | |
Sumantri et al. | 360 panorama synthesis from a sparse set of images with unknown field of view | |
Inamoto et al. | Free viewpoint video synthesis and presentation of sporting events for mixed reality entertainment | |
Wang et al. | Space-time light field rendering | |
Alain et al. | Introduction to immersive video technologies | |
US20240388681A1 (en) | Presentation of multi-view video data | |
CN114723873A (en) | End-to-end 3D scene reconstruction and image projection | |
CN114881898A (en) | Multi-angle free visual angle image data generation method and device, medium and equipment | |
JP2017102784A (en) | Image processing system, image processing method and image processing program | |
Tsai et al. | Two view to N-view conversion without depth | |
EP4233012A1 (en) | Lighting model | |
CN116528065A (en) | An Efficient Method for Acquiring and Generating Light Field of Content in Virtual Scene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: THOMSON LICENSING, FRANCE; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WANG, MENG;DU, LIN;MA, XIAOJUN;SIGNING DATES FROM 20140513 TO 20140521;REEL/FRAME:033119/0570 |
| STCV | Information on status: appeal procedure | Free format text: NOTICE OF APPEAL FILED |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |