Multi-faceted representation method for real-world objects in a shared augmented reality scene
Technical field
The present invention relates to the fields of computational geometry, image processing and augmented reality, and in particular to a method for the multi-faceted representation of real-world objects in the multiple video sequences of a shared augmented reality scene.
Background technology
In collaborative augmented reality, different collaborating users enter an augmented reality scene, usually perceive and interact within their own regions, and hold different viewpoints and interaction modes. To let these users jointly accomplish a predetermined collaborative task, a shared augmented reality scene must be established. The collaborative augmented reality system therefore has to describe the real environment from multiple, constantly changing orientations and to build a three-dimensional model of the scene on the basis of multiple video sequences. A problem that urgently needs to be solved is the multi-faceted representation of the real-world objects in the scene, which specifically includes arranging the multiple video sequences rationally and effectively so as to cover the shared scene, transmitting the unified observation information of each video sequence, and representing the real-world objects in the scene from multiple orientations.
To present the information of the real environment, Regenbrecht of the Chrysler technical research division in Germany not only equipped every user with a head-mounted display, but also set up cameras to acquire global information about the real environment. The system supports four or more users observing the same shared augmented reality scene from different positions, interacting with the virtual objects in the scene, carrying out various collaborative tasks, and simulating virtual illumination effects with the real light sources. Although this system obtains video information from several orientations, it does not process this information jointly, nor does it consider whether the video sequences are arranged rationally.
Regarding the arrangement of multiple video sequences, Klee posed the famous art gallery problem: given a gallery, how many cameras are needed, and where should they be placed, so that the union of the cameras' viewing areas covers the whole gallery. Chvatal first pointed out that, for a polygon with n vertices, all points inside the polygon can always be observed by ⌊n/3⌋ cameras. Fisk proved this theorem by first decomposing the polygon into triangles and then colouring all vertices with a three-colouring algorithm so that adjacent vertices receive different colours; the vertex positions of the colour used least often are the camera placement points.
On the interaction problem of multiple video sequences, Pilet et al. of the Swiss Federal Institute of Technology in Lausanne proposed a method for matching the registration information of multiple video sequences in a multi-user collaborative augmented reality, together with a method for acquiring illumination information. If the marker cannot be completely observed from one of the videos, the registration information of the other orientations can be used to complement the registration information of that orientation, so that virtual objects are still correctly registered onto the marker; however, this system does not consider how the errors produced during complementation affect the complemented result.
Schall et al. of the Graz University of Technology in Austria used fiducial markers to accomplish the stitching and modelling of a large-scale scene. The method divides the large-scale scene into several parts and places a series of fiducial markers between every two adjacent parts; these fiducial markers allow the spatial position relationship of the video sequences to be determined. Measurement starts from the space numbered 1; after the information of all acquisition points in this space has been obtained and stored in the computer, the next space is measured, so that the acquisition point information of all spaces is obtained. Because the markers in the junction area of two adjacent spaces are measured twice, the position relationship of the two spaces can be obtained through a coordinate transformation. After the whole space has been measured, the position relationships among the several spaces are known and a three-dimensional model of the whole scene can be built from them; the coordinate transformation algorithm of this project is worth drawing on.
Wang Xiyong et al. of the University of Florida in the USA incorporated real-world objects into a virtual scene using laser-scanned three-dimensional models registered with colour markers. First, a 3D scanner expresses the real-world object as a virtual model made of scan lines; then the generated noise is removed, the scan lines are arranged, and finally the gaps between scan lines are filled. To track the real-world object in real time, the system uses colour markers to obtain the position of the real-world object, from which the position of the corresponding virtual model in the virtual coordinate system is computed and drawn. The system gives the user a good experience of events in the virtual scene, but the equipment it relies on is relatively expensive and its use is limited to a small number of users.
From an analysis of the current research at home and abroad it can be concluded that, although several organisations are studying the multi-faceted representation of real-world objects in collaborative augmented reality systems, three problems remain. First, most current shared augmented reality scenes rarely consider how to obtain a larger observation range with fewer cameras; they all assume that the configured video sequences can fully acquire the required information about the real environment and do not describe how the multiple video sequences should be arranged. Secondly, after the multiple video sequences have been arranged, few existing systems study the complementation of registration information; even where this problem is considered, the influence of errors on the complemented result is seldom considered and the error analysis is not fed back into the complementation calculation. Finally, in representing real-world objects for collaborative tasks, many methods model the entire environment, which is computationally expensive and ignores the need for information sharing in the shared augmented reality scene. Since each collaborating user has a different perception and interaction region, the real-world object information each of them needs is often different.
Summary of the invention
The objective of the present invention is to obtain a larger observation range with fewer cameras, to reduce the influence of errors on the complemented result, and to realise information sharing in the shared augmented reality scene.
To this end, the invention discloses a multi-faceted representation method for real-world objects in a shared augmented reality scene. The steps of the multi-faceted representation method are as follows:
Step 1: abstract the shared augmented reality scene onto a plane to form a polygonal structure, partition this polygonal structure, and from the partition result compute the minimum number of observation points required to observe all parts of the shared augmented reality scene;
Step 2: in the shared augmented reality scene, set at least one fiducial marker for each observation point, determine the observation area of each observation point, compute the scale-invariant feature vectors in the observation area of each observation point, and use the fiducial markers and feature matching to determine the shared observation area between observation points;
Step 3: perform three-dimensional registration at each observation point under the guidance of the fiducial markers, and thereby obtain the position of each observation point;
Step 4: from the observations of a fiducial marker located in the shared observation area of two observation points, compute the spatial position relationship of these two observation points;
Step 5: repeat step 4 until the spatial position relationship between each observation point and at least one other observation point has been computed;
Step 6: extract the information of the real-world object to be modelled, and construct the silhouette map and the disparity map of this real-world object with respect to each observation point;
Step 7: from the silhouette maps and disparity maps, quickly create the real-world object model for any new observation point.
Preferably, in the multi-faceted modelling method for real-world objects of the described shared augmented reality scene, in said step 1, abstracting the shared augmented reality scene onto a plane is realised by projecting the edges of the shared augmented reality scene onto a horizontal plane.
Preferably, in the multi-faceted modelling method for real-world objects of the described shared augmented reality scene, in said step 1, the partitioning of the polygonal structure is realised by triangulation, that is, the polygonal structure is divided into a plurality of mutually non-overlapping triangles.
Preferably, in the multi-faceted modelling method for real-world objects of the described shared augmented reality scene, said observation points are camera positions.
Preferably, in the multi-faceted modelling method for real-world objects of the described shared augmented reality scene, the information of the real-world object to be modelled is obtained by manual selection.
Preferably, in the multi-faceted modelling method for real-world objects of the described shared augmented reality scene, in said step 6, the information of the real-world object to be modelled is obtained by detecting a real-world object newly entering the shared augmented reality scene and extracting that object.
Preferably, in the multi-faceted modelling method for real-world objects of the described shared augmented reality scene, in said step 6, the disparity map of the real-world object with respect to each observation point is constructed by using two camera positions at each observation point and measuring the depth distance of the real-world object through the two camera positions, thereby forming the disparity map.
The beneficial effects of the invention are:
1. Aiming at the demand of acquiring real-environment information in a shared augmented reality scene, the invention uses a relatively small number of acquisition devices while still guaranteeing that the scene environment information is fully acquired, which significantly reduces the computational overhead of subsequent shared augmented reality applications.
2. Complementation of registration information is adopted to solve the problem that a single video sequence cannot observe the entire scene, and at the same time the possibility that the three-dimensional registration of a video sequence itself fails is partly avoided.
3. For the common problem of representing real-world objects in a shared augmented reality scene, a vision-based three-dimensional convex hull method is adopted. While stating the spatial relationship between the surface points of the real-world object and the video sequences, it controls the growth of the amount of information, can respond quickly to newly joining collaborative users, and satisfies the needs of the shared augmented reality environment.
Description of the drawings
Fig. 1 is the module design diagram of the multi-faceted representation method for real-world objects in a shared augmented reality scene according to the invention;
Fig. 2 is a schematic diagram of the registration information complementation of multiple video sequences in the multi-faceted representation method for real-world objects in a shared augmented reality scene according to the invention;
Fig. 3 is a schematic diagram of adjacent video sequences transmitting registration information in the multi-faceted representation method for real-world objects in a shared augmented reality scene according to the invention;
Fig. 4 is the flow chart of registration information transmission between multiple video sequences in the multi-faceted representation method for real-world objects in a shared augmented reality scene according to the invention;
Fig. 5 is a schematic diagram of the fast three-dimensional registration of a new collaborative user in the multi-faceted representation method for real-world objects in a shared augmented reality scene according to the invention.
Embodiment
The present invention is further described below in conjunction with the accompanying drawings, so that those of ordinary skill in the art can implement it with reference to this specification.
As shown in Fig. 1, the multi-faceted representation method for real-world objects in a shared augmented reality scene according to the present invention comprises the following steps:
Step 1: a multi-video-sequence arrangement computing module abstracts the shared augmented reality scene area as a plane polygon P. Using a scan-line algorithm, the plane polygon is divided into a plurality of mutually non-overlapping triangles: the polygon is first divided into several monotone polygons, and each monotone polygon is then divided into triangles. In the process of decomposing the polygon into triangles, line segments are produced that connect different vertices of the polygon and lie entirely inside it; such segments are called diagonals. When choosing the placement points, the number of diagonals passing through each vertex is first counted and the maximum among them determined; the vertex with the largest number of diagonals is taken as a placement point. The set of triangles observable from this vertex is then computed and removed from the region P, and the diagonals of the remaining region are updated at the same time, until the remaining region is empty. When a real-world object exists in the scene P, the points inside the real-world object need not be observed, but the object occludes part of the scene; in this case the real-world object in the scene is abstracted as a polygon P' inside P, i.e. an inner "hole" of P. During initialisation the region P' is partitioned with diagonals and removed from P, so that all points inside P - P' are guaranteed to be observed; a sketch of the placement procedure follows;
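A minimal sketch of the greedy placement described above, under the simplifying assumption that a candidate vertex observes only the triangles incident to it (the full method instead removes all triangles visible from the vertex and updates the diagonals); the triangulation itself is assumed to be given, e.g. by a monotone-polygon or ear-clipping routine:

```python
# Greedy selection of observation points over a triangulated polygon.
# Assumption: each triangle is a tuple of three vertex indices, and a vertex is
# taken to observe only its incident triangles (a simplification of the text).

def place_observation_points(triangles):
    """Return vertex indices chosen as observation points covering all triangles."""
    uncovered = set(range(len(triangles)))
    points = []
    while uncovered:
        # count, for every vertex, how many still-uncovered triangles touch it
        score = {}
        for t_idx in uncovered:
            for v in triangles[t_idx]:
                score[v] = score.get(v, 0) + 1
        # pick the vertex covering the most uncovered triangles
        best = max(score, key=score.get)
        points.append(best)
        uncovered = {t for t in uncovered if best not in triangles[t]}
    return points

if __name__ == "__main__":
    # a square split into two triangles sharing the diagonal 0-2
    tris = [(0, 1, 2), (0, 2, 3)]
    print(place_observation_points(tris))  # e.g. [0] -- vertex 0 sees both triangles
```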
Step 2: as shown in Fig. 2, in the shared augmented reality scene at least one fiducial marker is set for each observation point and the observation area of each observation point is determined. Scale-invariant features are then used to extract and match feature points between two video sequence images. A scale-invariant feature is a local image feature; it is invariant to rotation, scale and brightness changes, and remains stable to a certain degree under viewpoint changes, affine transformations and noise. First a scale space is generated and, on this scale space, preliminary spatial extremum points are detected; unstable extremum points are then removed. To make the algorithm rotation-invariant, the gradient direction distribution of the pixels in the neighbourhood of each remaining extremum point is used to assign a direction parameter to each key point, and finally a 128-dimensional feature descriptor is generated for each feature point. After the video images of two orientations have been obtained, a large number of feature descriptors are generated by the scale-invariant algorithm, and matching point pairs are then computed by Euclidean-distance matching. A comparison threshold is introduced here to measure the degree of matching between feature points: if the ratio of the smallest Euclidean distance to the second smallest is greater than the comparison threshold, the match fails; otherwise the match succeeds. The smaller the comparison threshold, the more accurate the matching result and the fewer matching point pairs are obtained;
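A minimal sketch of this scale-invariant matching with the ratio test, using OpenCV's SIFT implementation; the library choice, the file names and the 0.7 threshold are illustrative assumptions, not part of the invention:

```python
import cv2

# load the two orientation images (file names are placeholders)
img1 = cv2.imread("view1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("view2.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()                      # 128-dimensional descriptors
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_L2)
knn = matcher.knnMatch(des1, des2, k=2)       # two nearest neighbours per descriptor

RATIO = 0.7                                   # comparison threshold from the text
good = [m for m, n in knn if m.distance < RATIO * n.distance]
print(f"{len(good)} matching point pairs")
```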
Step 3: consider two video sequences V(1) and V(2) that have a shared region and can both observe a certain fiducial marker m. If feature point extraction finds feature points in the region corresponding to m in the video images of V(1) and V(2), feature point matching is carried out; a large number of matching point pairs will lie in the image region corresponding to m. If the number of matching pairs is greater than a certain threshold, the marker m can be considered to lie in the shared region of V(1) and V(2). After the shared region of the different video sequences has been determined, a point p in the shared region is selected (p being expressed in the world coordinate system). According to projection theory, different video sequences have different camera coordinate systems, and because p projects onto the image planes of the different video sequences it produces different view matrices; since p lies in the shared region, its properties can be used to compute the spatial position relationship of the different video sequences. Let the view matrix of p with respect to V(1) be M_1, which transforms the world coordinate system of p into the camera coordinate system of V(1), and let the view matrix of p with respect to V(2) be M_2, which transforms the world coordinate system of p into the camera coordinate system of V(2). The transformation from the camera coordinate system of V(1) to the camera coordinate system of V(2) is then the matrix M_1^{-1} · M_2, and this matrix represents the position relationship of the two video sequences;
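A minimal sketch of this relative-pose computation with homogeneous 4×4 view matrices, assuming the column-vector convention x_cam = M · x_world (under which the relation M_1^{-1} · M_2 stated above for row vectors takes the form M_2 · M_1^{-1}):

```python
import numpy as np

def relative_transform(M1, M2):
    """Transform taking camera coordinates of V(1) to camera coordinates of V(2).

    Assumes column vectors: x_c1 = M1 @ x_w and x_c2 = M2 @ x_w,
    hence x_c2 = (M2 @ inv(M1)) @ x_c1.
    """
    return M2 @ np.linalg.inv(M1)

if __name__ == "__main__":
    # illustrative check with a shared-region point p (world coordinates)
    rng = np.random.default_rng(0)
    M1, M2 = np.eye(4), np.eye(4)
    M1[:3, 3] = rng.normal(size=3)                        # arbitrary translation
    M2[:3, :3] = np.diag([1, -1, -1])                     # 180-degree rotation about x
    M2[:3, 3] = rng.normal(size=3)
    p_w = np.append(rng.normal(size=3), 1.0)
    p_c1, p_c2 = M1 @ p_w, M2 @ p_w
    assert np.allclose(relative_transform(M1, M2) @ p_c1, p_c2)
```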
Step 4: as shown in Fig. 3, at a certain moment a point p in space lies in the observation range of video sequence V(1); the position relationship between p and V(1) is SP(V(1), p). For video sequence V(2), p may lie outside the observation range of V(2), or occlusion between real-world objects may prevent V(2) from seeing p, so that the position relationship SP(V(2), p) between p and V(2) cannot be obtained directly. From the already determined position relationship SP(V(2), V(1)) between V(1) and V(2), one derives SP(V(2), p) = SP(V(2), V(1)) · SP(V(1), p). Complementation is divided into complementation between neighbouring video sequences and complementation between non-neighbouring video sequences; the "distance" between video sequences is measured in hops, and for two video sequences that have a shared region this distance is exactly one hop. According to the complementation algorithm, when a video sequence V(0) needs to obtain the registration information of a point p in space, V(0) sends a request to the set of its neighbouring video sequences. If a video sequence V(i) receives the complementation registration request from V(i-1) and can observe the registration information of point p, it sends a reply to V(i-1); this reply contains the calibrated registration information M_{PS(i)} of point p with respect to video sequence V(i), and the reply message is returned along the query path, in reverse, to the initial querying video sequence V(0). If video sequence V(i) still cannot obtain the registration information of p, it forwards a new request to its neighbouring video sequence V(i+1). When V(0) receives the complementation registration reply, it computes the registration information of p with respect to V(0) from the spatial position relationships between the video sequences;
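A minimal sketch of this hop-by-hop complementation, representing each pairwise relation SP(·, ·) as a 4×4 matrix and each video sequence's neighbours as an adjacency list; the data structures and the breadth-first search strategy are illustrative assumptions:

```python
import numpy as np
from collections import deque

def complement_registration(v0, neighbours, relations, observations):
    """Return the registration of point p with respect to v0, or None.

    neighbours[v]     -> list of video sequences adjacent to v (one hop away)
    relations[(a, b)] -> 4x4 matrix SP(a, b) for neighbouring sequences a and b
    observations[v]   -> 4x4 matrix SP(v, p) if v observes p directly, else absent
    """
    # SP(v0, v) accumulated along the query path, starting with the identity
    queue = deque([(v0, np.eye(4))])
    visited = {v0}
    while queue:
        v, sp_v0_v = queue.popleft()
        if v in observations:                    # v can see p: the reply travels back
            return sp_v0_v @ observations[v]     # SP(v0, p) = SP(v0, v) * SP(v, p)
        for w in neighbours[v]:                  # otherwise forward the request
            if w not in visited:
                visited.add(w)
                queue.append((w, sp_v0_v @ relations[(v, w)]))
    return None                                  # no video sequence observes p
```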
Step 5: each newly acquired frame is compared with a background image by background subtraction to obtain the projected silhouette of the real-world object in a certain video sequence. Thresholding is applied to the difference of every pair of pixels, thereby determining which positions in each frame belong to the foreground; the foreground is the real-world object the user is concerned with;
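A minimal sketch of this background subtraction, using an absolute per-pixel difference with a fixed threshold (OpenCV-based; the file names and the threshold value 30 are illustrative assumptions):

```python
import cv2

background = cv2.imread("background.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

diff = cv2.absdiff(frame, background)                       # per-pixel difference
_, silhouette = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
# 'silhouette' is the binary projected outline of the foreground real-world object
cv2.imwrite("silhouette.png", silhouette)
```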
Step 6: for a video sequence, in order to compute the world coordinate values of the observed points, the corresponding disparity map of that orientation must be obtained. First, two cameras belonging to that orientation are set up so as to obtain two images. Except for the occluded areas, every pixel in one image has a corresponding matching point in the other image; the matching pixel sequences are searched and estimated along the corresponding scan lines of the two images under the epipolar constraint, and the horizontal-coordinate difference of each matching pair is computed; this difference is stored in the disparity map as the disparity value. Once the silhouette of the object in the image and the disparity map of this orientation have been obtained, the projection formula is used to compute the three-dimensional coordinates, in the world coordinate system, of the real-world object surface points visible from this orientation. Binocular stereo vision is then used at each orientation: the extrinsic parameter matrices of the left and right images are obtained, the distance b between the left and right cameras is computed, and the intrinsic parameter matrix of the camera is stored using a camera calibration method. By traversing the foreground points in the silhouette map and using the intrinsic and extrinsic parameter matrices, b and the disparity value of each point, the projection formula projects the two-dimensional pixel points of the image plane into the camera coordinate system, yielding the corresponding three-dimensional coordinate points. Because there are cameras in several orientations, the different camera coordinate systems must be unified into the world coordinate system; the projected three-dimensional points are therefore also transformed into the world coordinate system using the extrinsic parameters already obtained, so that the three-dimensional point sets computed at each orientation are all unified in the world coordinate system;
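A minimal sketch of this disparity-to-world-coordinate projection, assuming a rectified pinhole stereo pair with focal length f, principal point (cx, cy), baseline b and a camera-to-world extrinsic matrix; all symbol names are assumptions for illustration:

```python
import numpy as np

def pixel_to_world(u, v, disparity, f, cx, cy, b, cam_to_world):
    """Project one foreground pixel with its disparity into world coordinates.

    Assumes a rectified stereo pair: depth Z = f * b / disparity,
    X = (u - cx) * Z / f, Y = (v - cy) * Z / f in the left camera frame;
    the 4x4 extrinsic matrix cam_to_world then maps the point to world space.
    """
    Z = f * b / disparity
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    p_cam = np.array([X, Y, Z, 1.0])
    return (cam_to_world @ p_cam)[:3]

def surface_points(silhouette, disparity_map, f, cx, cy, b, cam_to_world):
    """Collect world-space surface points for every foreground pixel with valid disparity."""
    pts = []
    vs, us = np.nonzero(silhouette)          # rows (v) and columns (u) of foreground pixels
    for u, v in zip(us, vs):
        d = disparity_map[v, u]
        if d > 0:
            pts.append(pixel_to_world(u, v, d, f, cx, cy, b, cam_to_world))
    return np.array(pts)
```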
Step 7: after the real-world object surface points observable by a certain video sequence have been computed, what is obtained at this moment is a point set of partial surface points. The three-dimensional convex hull of the point cloud is then used to represent the surface shape of the real-world object: first, four non-coplanar points are selected to construct a tetrahedron; the remaining points are then added, in a certain order, to the polyhedron constructed so far. If a newly added point lies inside the polyhedron, it is ignored directly and the next point is processed; if the newly added point lies outside the polyhedron, new edges and faces are constructed and added to the current polyhedron, and the edges and faces that no longer belong to the hull are deleted. The model of the real-world object is obtained at the end;
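A minimal sketch of representing the surface points by their three-dimensional convex hull; here SciPy's ConvexHull (the Qhull incremental algorithm) stands in for the tetrahedron-and-insertion procedure described above, which is an assumption about tooling rather than part of the invention:

```python
import numpy as np
from scipy.spatial import ConvexHull

# surface points of the real-world object observed from one or more orientations,
# already unified in the world coordinate system (random points as a stand-in)
points = np.random.default_rng(0).normal(size=(200, 3))

hull = ConvexHull(points)                      # incremental construction internally
print("hull vertices:", len(hull.vertices))
print("hull faces   :", len(hull.simplices))   # triangular faces of the object model
```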
Step 8: as shown in Fig. 5, a newly entering user first sends an initialisation request to all existing users to determine its neighbouring video sequences. If a shared region exists between two video sequences, their spatial position relationship can be computed from a point in the shared region and this position information is saved; if no shared region exists between the video sequences, the video sequences of the other orientations return a null value over the network. After the new user has determined the left and right video sequences nearest to it, it sends requests to these two video sequences and obtains the real-world object representation information already held by those two orientations. What the new orientation obtains at this stage is still the point set of real-world object surface points; the point data returned by the two orientations are fused, and the model of the new real-world object is obtained.
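A minimal sketch of this final fusion step, in which the surface point sets returned by the two nearest orientations are merged and the object model is rebuilt as their convex hull; the function name and the rounding tolerance are illustrative assumptions:

```python
import numpy as np
from scipy.spatial import ConvexHull

def fuse_object_model(points_left, points_right):
    """Merge the surface point sets returned by the two nearest video sequences.

    Both point sets are assumed to be expressed already in the world
    coordinate system; the fused model is their joint convex hull.
    """
    merged = np.vstack([points_left, points_right])
    merged = np.unique(np.round(merged, 4), axis=0)   # drop duplicated surface points
    return ConvexHull(merged)

# usage (points from the two nearest orientations):
# model = fuse_object_model(pts_from_V1, pts_from_V2)
```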