WO1999007153A1 - Systems and methods for software control through analysis and interpretation of video information - Google Patents
Systems and methods for software control through analysis and interpretation of video information
- Publication number
- WO1999007153A1 WO1999007153A1 PCT/US1998/016046 US9816046W WO9907153A1 WO 1999007153 A1 WO1999007153 A1 WO 1999007153A1 US 9816046 W US9816046 W US 9816046W WO 9907153 A1 WO9907153 A1 WO 9907153A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- video
- computer
- pixel
- image
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 38
- 238000012545 processing Methods 0.000 claims abstract description 30
- 230000033001 locomotion Effects 0.000 claims abstract description 25
- 230000009471 action Effects 0.000 claims abstract description 20
- 230000002452 interceptive effect Effects 0.000 claims abstract description 13
- 230000008859 change Effects 0.000 claims abstract description 8
- 230000008569 process Effects 0.000 claims description 9
- 230000000306 recurrent effect Effects 0.000 claims description 6
- 230000000007 visual effect Effects 0.000 claims description 5
- 238000010304 firing Methods 0.000 claims description 3
- 238000004088 simulation Methods 0.000 claims 1
- 230000009466 transformation Effects 0.000 abstract 1
- 238000000844 transformation Methods 0.000 abstract 1
- 238000010586 diagram Methods 0.000 description 7
- 230000006870 function Effects 0.000 description 7
- 238000004891 communication Methods 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 230000003993 interaction Effects 0.000 description 3
- 238000011161 development Methods 0.000 description 2
- 239000000945 filler Substances 0.000 description 2
- 238000007654 immersion Methods 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 230000003068 static effect Effects 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 230000001131 transforming effect Effects 0.000 description 2
- 230000004308 accommodation Effects 0.000 description 1
- 230000001154 acute effect Effects 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 238000010420 art technique Methods 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 238000011109 contamination Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000007812 deficiency Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 230000007257 malfunction Effects 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 239000013618 particulate matter Substances 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 230000000737 periodic effect Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 238000012549 training Methods 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/254—Analysis of motion involving subtraction of images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
- H04N7/188—Capturing isolated or intermittent images triggered by the occurrence of a predetermined event, e.g. an object reaching a predetermined position
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/10—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals
- A63F2300/1087—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals comprising photodetecting means, e.g. a camera
- A63F2300/1093—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals comprising photodetecting means, e.g. a camera using visible light
-
- A—HUMAN NECESSITIES
- A63—SPORTS; GAMES; AMUSEMENTS
- A63F—CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
- A63F2300/00—Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
- A63F2300/60—Methods for processing data by generating or executing the game program
- A63F2300/69—Involving elements of the real world in the game world, e.g. measurement in live races, real video
Definitions
- the present invention relates to data processing by digital computers, and more particularly to a video image based user interface for computer software.
- the development of improved user interfaces is a primary concern of the computing industry.
- the conventional method to provide interaction with a computer requires user manipulation of an input device.
- Examples of commonly used input devices include absolute positioning devices such as light pens, touch screens and digitizing tablets, and relative positioning devices such as mouse devices, joy sticks, touch pads and track balls.
- the user manipulates the input device (which may involve translation or rotation of one or more active elements) such that an associated cursor or pointer is aligned with an object or area of interest displayed on a computer monitor.
- the user may then engage other active elements of the input device (such as a mouse button) to perform a function associated with the object or area of interest.
- the input device is utilized to perform functions such as selecting and editing text, selecting items from a menu, or firing at targets in a game or simulation software application.
- First, input devices may become partially or fully inoperative over time due to mechanical wear, breakage, drifting of optical or electronic elements, and contamination of working parts due to exposure to dirt and particulate matter.
- Second, the method of operation of input devices to effect various functions is not intuitive, and users may need substantial training and experience before they are comfortable with and proficient in the operation of a particular input device. This problem may be especially acute with respect to unsophisticated users, such as young children, or users who have little or no prior experience with personal computers. Further, certain users, such as elderly or physically disabled persons, may not possess the requisite dexterity and motor skills to precisely manipulate an input device so as to perform a desired function.
- prior art video image based interfaces have a number of associated disadvantages, and consequently have not possessed significant commercial appeal.
- Many of the prior art interfaces require that the user wear a distinctively colored or shaped article of clothing (such as a glove) in order to enable tracking of the user's movement by the interface software.
- Other interfaces similarly require the user to hold a specially shaped article (see Dementhon et al.) or to place markers about his or her body (see Oh).
- many prior art interfaces are extremely limited in their functionality, are computationally expensive, and/or must be implemented in a specialized environment.
- the invention comprises a video image based user interface for a computer system wherein a user, by making the appropriate movements, may interact with a computer-generated object or area of interest displayed on a monitor.
- a video camera, coupled to the computer system, generates signals representative of the image of the user. The images are stored as successive video frames.
- a video processing engine is configured to process each frame to detect a motion of the user, and to determine a user action with respect to the computer-generated object or area of interest.
- the processing engine includes a set of software devices which sequentially apply various transforms to the video frame so as to achieve the desired analysis.
- the computer-generated object or area of interest (as modified by the user action) is displayed on the monitor together with at least a portion of the user image to provide visual feedback to the user.
- in a first mode of the invention, a computer-generated object representative of a sport ball describes a trajectory on the monitor. By making the appropriate movements, the user may simulate striking the ball to change its trajectory. In this manner, the first mode of the invention facilitates playing of simulated sports or games such as volleyball.
- the first mode of the invention may also be adapted to enable a user to engage or adjust the position of a conventional user interface control object, such as a button, icon, or slider.
- in a second mode of the invention, the processing engine is configured to recognize and track an article, typically representing a weapon, held by the user. The user, by adjusting the position and orientation of the article, may perform certain actions with respect to a computer-generated object. For example, the user may aim and fire at an object representing a target.
- the article may be displayed on the monitor in a visually altered or idealized form.
- a third mode of the invention provides an interactive kiosk for selectively presenting information to a user.
- the kiosk includes a display on which is presented a set of user interface control objects.
- an interactive kiosk located in a tourist information center may display icons representative of lodging, food, entertainment, and transportation.
- a video camera disposed on or proximal to the display records an image of the user, at least a portion of which is presented on the display. Video frames captured by the video camera are then processed in the manner described above to detect a motion of the user and determine an action with respect to the user interface controls, namely, an engagement of a certain icon.
- the kiosk may be provided with a modem or network interface so as to enable remote updating of informational content.
- for each pixel in a captured video frame, a characteristic value is calculated based on its color and luminosity.
- the characteristic value may comprise, for example, the sum of the pixel's red, green, and blue color values.
- the characteristic value for the pixel is then compared with the values calculated for spatially corresponding pixels in previously captured video frames. If it is determined that the parameter is substantially equal to a value which has recurred a predetermined number of times in previous video frames, the pixel is considered to be in the background of the captured image, and is processed accordingly. If the calculated characteristic value is different from previously calculated values, then the pixel is considered to be in the foreground of the image.
- FIG. 1 is a schematic diagram showing the operating environment of the video image based user interface of the present invention
- FIG. 2 is a block diagram showing various software components of the invention stored in a computer memory
- FIG. 3 is a block diagram of a video frame
- FIG. 4 is a block diagram showing components of a video processing engine
- FIG. 5 is a block diagram showing components of a generic software device
- FIG. 6 is a block diagram of an exemplary library of software devices
- FIG. 7 is a flowchart depicting the steps of a method for determining whether a pixel lies in the foreground or background portion of a video frame
- FIG. 8 is a graph of pixel value versus time for an exemplary pixel
- FIG. 9 is a schematic of a software device train for processing of a video frame
- FIG. 10 is a schematic of a software device train for an embodiment of the invention wherein two video cameras are employed;
- FIG. 11 is a perspective view of a user and computer system according to a first mode of the video image user interface system of the invention
- FIG. 12 is a schematic of a preferred software device train for implementing the first mode of the invention
- FIG. 13 is a perspective view of a user and computer system according to a second mode of the invention
- FIG. 14 is a schematic of a preferred software device train for implementing the second mode of the invention.
- FIG. 15 is a perspective view of an interactive kiosk according to a third mode of the invention.
- FIG. 1 depicts in schematic form a typical computer system 100 for implementation of the video image user interface system and method of the present invention.
- the computer system 100 includes a monitor 102 for displaying images to a user; a memory 104 for storage of data and instructions therein; a video camera 108 for generating image data; and a processor 106 for executing instructions and for controlling and coordinating the operation of the various components.
- the computer system 100 may further include a set of conventional input devices 110, such as a keyboard and a mouse device, for receiving positional and alphanumeric input from the user.
- the monitor may comprise a conventional Cathode Ray Tube (CRT) type RGB monitor, but may alternatively comprise a Liquid Crystal Display (LCD) monitor or any other suitable substitute.
- the memory may consist of various storage device configurations, including random access memory (RAM), read-only memory (ROM), and non-volatile storage devices such as CD-ROMs, floppy disks, and hard disk drives.
- the processor may be, for example, an Intel® Pentium® microprocessor.
- the components are coupled in communication by a bus 112, which enables transfer of data and commands between and among the individual components.
- the bus 112 may optionally be connected to a communications interface (not shown), such as a modem or Ethernet® card, to permit the computer system 100 to receive and transmit information over a private or public network.
- the video camera 108 may comprise any device capable of generating electronic signals representative of an image.
- Computer-specific video cameras are generally configured to transmit digitized electronic signals via an established communications port, such as a parallel or serial port.
- Examples of video cameras of this type include the Connectix® QuickCam® (which connects to a computer via a parallel port) and the Intel® Create and Share™ camera (which connects to a computer via a serial port).
- Standard camcorders may also be utilized, but require a video frame grabber board to digitize and process the video signals prior to sending the image data to other components of the computer system.
- the video camera 108 preferably generates color video signals; however, certain features and aspects of the invention may be implemented using a monochromatic video camera.
- a second video camera 114 coupled to the system bus 112 may be provided to achieve certain objectives of the invention, namely those involving depth analysis, as described hereinbelow.
- FIG. 2 is a map of computer system memory 104 having software and data components of the video image interface system contained therein.
- the memory 104 preferably holds an operating system 202 for scheduling tasks, allocating storage, and performing various low-level functions; device drivers 204 for controlling the operation of various hardware devices; a device module layer 206 for providing an interface with the video camera 108; a video processing engine 208, including a set of software devices for analyzing and transforming the video frames so as to detect a motion of the user and infer an action relative to a computer-generated object displayed on the monitor; and a class library 210, which includes one or more application classes.
- the device module layer 206 includes a set of device modules configured to receive image data generated by the video camera 108, and to store the image data as successive video frames.
- the device module layer 206 may include one or more certified device modules designed to support a specified model of a video camera from a particular vendor, and a generic device module designed to work with a range of video camera types.
- the generic device module may cooperate with one or more device drivers 204 to achieve the function of storing video frames in the frame buffer 212.
- Video frames captured by the relevant device module are held in a video frame buffer 212 until the video processing engine 208 is ready to initiate processing thereof.
- the video frame buffer 212 stores a small number of frames, preferably two.
- the video frame buffer 212 may be of the ring type, wherein a storage pointer points to the oldest frame for overwriting thereof by an incoming video frame, or have suitable alternative structure.
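- As an illustrative sketch only (not part of the patent disclosure), a two-slot ring buffer of the kind described above might look as follows in Python; the class and method names are assumptions, and frames are assumed to be numpy arrays:

```python
import numpy as np

class RingFrameBuffer:
    """Ring-type frame buffer: a storage pointer marks the oldest
    frame, which each incoming frame overwrites."""
    def __init__(self, slots: int = 2):
        self.frames = [None] * slots
        self.oldest = 0  # storage pointer to the slot overwritten next

    def put(self, frame: np.ndarray) -> None:
        self.frames[self.oldest] = frame
        self.oldest = (self.oldest + 1) % len(self.frames)

    def newest(self) -> np.ndarray:
        # The slot just behind the pointer holds the latest frame.
        return self.frames[self.oldest - 1]
```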
- FIG. 3 is a block diagram of a video frame 300 configured in accordance with a preferred implementation of the invention.
- the video frame 300 has a two-part structure.
- the first portion of the video frame 300 is a variable format pixel array 302.
- the color and luminosity of each pixel within the array may be encoded in a variety of formats used in the art, such as 8-bit or 16-bit RGB.
- the second portion of the video frame 300 comprises a key buffer 304 having array dimensions corresponding to the variable format pixel array 302.
- the key buffer 304, which begins in an uninitialized state, is employed to store the results of processing operations effected by the software devices.
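- A minimal sketch of this two-part frame structure, assuming numpy arrays and illustrative names (the patent does not prescribe a particular encoding):

```python
import numpy as np

class VideoFrame:
    """Two-part frame: a variable-format pixel array plus a key buffer
    with matching array dimensions for processing results."""
    def __init__(self, pixels: np.ndarray):
        self.pixels = pixels  # e.g. an H x W x 3 array of 8-bit RGB
        # The key buffer starts uninitialized; software devices write
        # the results of their transforms here.
        self.key = np.empty(pixels.shape[:2], dtype=np.uint8)
```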
- Processing of the video frame 300 is performed by the video processing engine 208.
- the video processing engine 208 comprises a kernel 402 and a set of software devices 404.
- the kernel 402 operates to route the video frame 300 through a train of software devices 404, which sequentially process the video frame 300 in order to achieve a desired result.
- the kernel 402 performs the routing by executing a simple polling loop.
- the kernel queries each software device 404 to determine if device output is available. If the kernel 402 determines that output is available from a first software device, the kernel then queries a second software device known to receive input from the first software device (i.e., the next software device in the train) whether it is ready to receive input. If the second software device is ready, then the video frame 300 is passed thereto.
- This polling loop is continually executed to enable routing, on a one-at-a-time basis, of video frames 300 through the linked train of software devices 404.
- a generic software device 404 is depicted in block form in FIG. 5.
- the software device 404 includes a software routine 502 for evaluating and transforming an input video frame 504 to produce an output video frame 506.
- the software device 404 further includes input ready and output ready bits 508 and 510, which are set to signify to the kernel 402 whether the software device 404 is ready to (respectively) receive an input video frame 504 or transmit an output video frame 506.
- Software device memory 512 may be provided to store data such as previously captured video frames. As is alluded to hereinabove, individual software devices, each performing a specific transform on a video frame, are linked together to implement a desired overall result.
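- The following sketch illustrates, under assumed names, how the generic software device of FIG. 5 and the kernel's polling loop might fit together; it is a simplification, not the patented implementation:

```python
class SoftwareDevice:
    """Generic device: a transform routine plus input-ready and
    output-ready flags that the kernel inspects while polling."""
    def __init__(self):
        self.input_ready = True    # willing to accept an input frame
        self.output_ready = False  # holding a processed output frame
        self._frame = None

    def give(self, frame):
        self._frame = self.transform(frame)
        self.input_ready = False
        self.output_ready = True

    def take(self):
        self.input_ready = True
        self.output_ready = False
        return self._frame

    def transform(self, frame):
        return frame  # identity; concrete devices override this

def kernel_poll(train):
    """One pass of the kernel's polling loop: wherever a device has
    output available and its successor is ready for input, the frame
    is passed one step down the train."""
    for src, dst in zip(train, train[1:]):
        if src.output_ready and dst.input_ready:
            dst.give(src.take())
```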
- FIG. 6 depicts in block form a library 600 of exemplary software devices. It is to be noted that the collection of software devices depicted in the figure and discussed below is intended to be illustrative rather than limiting. The software devices generally embody techniques and algorithms known in the art, and hence the specific processing techniques and algorithms associated with each of the various software devices will not be discussed in detail herein, except where significant departures from such prior art techniques and algorithms are practiced by the invention.
- the capture device 602 comprises the first software device in any train of software devices.
- the capture device is configured to examine the frame buffer 212 to detect if an unprocessed video frame 300 is present, and signify the result to the kernel 402.
- the color detector device 604 is configured to examine the video frame 300 to determine if any regions of the frame 300 are of a predetermined color. Each pixel is assigned a value (typically in the range of 0-255) corresponding to how closely it matches the predetermined color, and the assigned values are written to the key buffer 304.
- the key generator device 606 is configured to determine, for each pixel, if a value associated with the pixel exceeds or is less than a threshold value, or if the value falls within a specified range. The key generator 606 then assigns a value to the pixel depending on whether or not the pixel satisfies the test applied by the key generator device 606. The assigned values are written to the key buffer 304.
- the smooth filter device 608 is configured to filter stray pixels or noise from the video frame 300.
- the smooth filter device 608 typically performs this operation by comparing each pixel to neighboring pixels to determine if values associated with the pixel are consistent with those of the neighboring pixels. If an inconsistency is found, the pixel values are reset as appropriate.
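- The patent does not name a specific smoothing algorithm; a median filter is one conventional way to reset stray pixels to values consistent with their neighbors (sketch assumes SciPy is available):

```python
import numpy as np
from scipy.ndimage import median_filter

def smooth_key(key: np.ndarray, size: int = 3) -> np.ndarray:
    """Replace each key value with the median of its neighborhood,
    so isolated stray pixels are reset to match their surroundings."""
    return median_filter(key, size=size)
```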
- the edge detector device 610 is configured to examine objects in the video frame 300 and determine the edges thereof. As is known in the art, edge determination may be performed by comparing the color and luminosity of each pixel to pixels disposed adjacent thereto. Alternatively, edge determination of dynamic objects may be performed by examining differences between successive video frames.
- the edge detector device is usually employed in connection with the edge filler device 614, which examines the output of the edge detector device and fills in any discontinuities, and the edge filter device 612, which filters out pixels erroneously designated as edge pixels by the edge detector device 610.
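- A simple neighbor-comparison edge detector of the kind alluded to above might be sketched as follows (the threshold is an assumed value):

```python
import numpy as np

def edge_key(gray: np.ndarray, threshold: int = 30) -> np.ndarray:
    """Mark a pixel as an edge when its value differs sharply from
    an adjacent pixel (forward differences along each axis)."""
    g = gray.astype(np.int32)
    dx = np.abs(np.diff(g, axis=1, prepend=g[:, :1]))
    dy = np.abs(np.diff(g, axis=0, prepend=g[:1, :]))
    return ((dx + dy) > threshold).astype(np.uint8) * 255
```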
- the foreground detector device 616 is configured to examine each pixel to determine whether the pixel lies in the foreground portion or background portion of the video frame 300.
- the operation of the foreground detector is described with reference to FIGS. 7 and 8.
- FIG. 7 depicts the steps of a preferred method for foreground detection.
- the foreground detector device selects a pixel in the video frame 300 for testing.
- a value representative of the pixel's color and luminosity is calculated, step 704.
- the pixel value comprises the sum of the pixel's red, green, and blue color values.
- the calculated pixel value is then stored, step 706.
- the pixel value is compared, step 708, to the values of spatially corresponding pixels of previously captured video frames (stored in the memory of the foreground detector device 616).
- FIG. 8 presents a graph showing a typical variation of pixel value (for spatially corresponding pixels) in successively captured video frames. It is appreciated that the graph shows certain plateaus or recurrent pixel values 802 indicative of the pixel lying in the background portion of the video frame 300. If in step 708 it is determined that the pixel's current value is substantially equal to the recurrent or plateau value 802, then the pixel is assigned a parameter indicating that the pixel is in the background portion of the video frame, step 710.
- otherwise, the pixel is assigned a parameter indicating that the pixel is in the foreground portion of the video frame 300, step 712.
- the parameter may either have two states (background or foreground), or may alternatively have a range of values corresponding to a confidence level as to whether the pixel lies in the background or foreground portion of the video frame 300.
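- A minimal sketch of the foreground test of FIGS. 7 and 8, assuming numpy arrays; the tolerance and recurrence count are illustrative constants, not values taken from the patent:

```python
import numpy as np

TOLERANCE = 8    # how close counts as "substantially equal"
RECURRENCE = 10  # recurrences needed to call a value a plateau

def characteristic(frame: np.ndarray) -> np.ndarray:
    """Per-pixel characteristic value: the sum of the pixel's red,
    green, and blue color values."""
    return frame.astype(np.int32).sum(axis=2)

def foreground_mask(frame: np.ndarray, history: list) -> np.ndarray:
    """history holds characteristic arrays from previously captured
    frames (it must be non-empty). A pixel is background where its
    current value matches, within TOLERANCE, a value that has
    recurred in enough past frames."""
    value = characteristic(frame)
    hits = sum(np.abs(value - past) <= TOLERANCE for past in history)
    return hits < RECURRENCE  # True where the pixel is foreground
```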
- the grayscale device 620 converts the pixel color values (typically in RGB format) in the video frame 300 to grayscale values. The calculated grayscale values are then placed in the key buffer 304.
- the difference detector device 622 examines the video frame 300 to determine differences from a previously stored video frame. More specifically, the difference detector device 622 subtracts pixel values (which may comprise color or gray scale values) of the stored video frame from the pixel values of the current video frame 300, and stores the results in the key buffer 304. In unchanged regions of the video frame 300, the spatially corresponding pixels in the current and stored video frames will have identical values, and hence the subtraction will yield a value of zero. In regions where change has occurred, the subtraction process will yield non-zero values.
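- A sketch of the subtraction step, assuming grayscale or color values stored as numpy arrays:

```python
import numpy as np

def difference_key(current: np.ndarray, stored: np.ndarray) -> np.ndarray:
    """Subtract the stored frame's pixel values from the current
    frame's: zero marks unchanged regions, non-zero marks change."""
    return current.astype(np.int32) - stored.astype(np.int32)
```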
- the positive displacement device 618 performs an analysis similar to the difference detector device, but is additionally configured to determine a leading edge of the movement represented by the changed regions of the video frame 300.
- the full screen device 624 is configured to cause the video frame 300, or specific portions thereof, to be displayed on the computer monitor 102.
- the class library 210 controls interactions between computer generated objects and the user image, as transformed by the software devices.
- the class library 210 includes a variety of objects having a set of attributes. Typical objects include trackers, bouncing or "sticky" objects, and user interface controls (such as buttons, sliders, and icons).
- the class objects all include in their associated attributes a position with respect to the video frame 300.
- the class library 210 further includes the software application, which is preferably configured as a software device. The application device receives the video frame 300, as transformed by software devices disposed upstream in the processing train, and examines the frame 300 to evaluate user actions with respect to one or more class objects.
- FIG. 9 depicts in block form a generic software device train for processing a video frame 300 in accordance with the present invention.
- the capture device 602 gets the video frame 300 from the frame buffer 212.
- the video frame 300 is then sequentially transformed by the set of software devices (labeled herein as software device 1 through N and collectively numbered 902) selected from the software device library 600.
- the transformed video frame 300 is then passed to the application device 904 which examines the video frame 300 to determine interactions with one or more class objects.
- the video frame 300 is passed to the full screen device 624, which causes the video frame 300 to be displayed by the monitor 102.
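- Functionally, the train of FIG. 9 reduces to routing one frame through a sequence of transforms and then to the application and display stages; a schematic sketch (all names illustrative):

```python
def run_train(frame, transforms, application, display):
    """Route one captured frame through a linear device train."""
    for transform in transforms:   # software devices 1 through N
        frame = transform(frame)
    frame = application(frame)     # evaluate object interactions
    display(frame)                 # full screen device

# Hypothetical usage, with stand-in names for concrete devices:
# run_train(frame, [to_grayscale, diff_step, smooth], ball_app, show)
```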
- FIG. 10 depicts a variation of the user interface system of the invention wherein two spaced apart video cameras 108 and 114 are utilized to simultaneously record the user's image.
- This configuration is useful in applications requiring three-dimensional tracking of the user's position.
- two software device trains are provided for simultaneous processing of the video frames generated by the video cameras 108 and 114.
- Each of the trains includes a capture device 602, a set of additional software devices 1002 and 1004, and an application device 1006 and 1008.
- the processed video frames are then passed to a triangulation device 1010, which compares the two frames and uses known triangulation algorithms to derive depth measurements.
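- The patent relies on known triangulation algorithms; for rectified, parallel cameras the standard relation is Z = f·B/d, sketched below (focal length in pixels and baseline in meters are assumed inputs):

```python
def depth_from_disparity(x_left: float, x_right: float,
                         focal_px: float, baseline_m: float) -> float:
    """Depth of a feature seen at horizontal pixel positions x_left
    and x_right in two rectified views: Z = f * B / d."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("feature must have positive disparity")
    return focal_px * baseline_m / disparity
```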
- FIGS. 11 and 12 relate to a first mode of the invention wherein a user interacts with a dynamic computer-generated object representative of a sport ball or the like.
- the user can, by making an appropriate motion with his arm or other body part, simulate "hitting" the ball to thereby change its trajectory. In this manner, various sports or games, such as volleyball or handball, can be simulated.
- FIG. 11 shows a perspective view of a user 1102 situated in front of a computer system 100. Initially, the user's arm is lowered, and the computer-generated object 1104 has a first location and direction of motion indicated in solid lines.
- the user 1102 moves his arm to a second position (indicated in phantom) wherein the locations occupied by the computer-generated object 1104 and the user's image in the video frame 300 are coincident, causing the trajectory of the object 1104 to be changed.
- FIG. 12 depicts the preferred sequential arrangement or train of software devices for applying the required image transforms to the video frame 300 to achieve the objectives of the first mode of the invention.
- the capture device 602 initializes the process by obtaining a video frame 300 from the frame buffer 212.
- the capture device 602 then passes the video frame 300 to the grayscale device 620, which calculates a grayscale value for each pixel in the video frame 300.
- the video frame 300, after processing by the grayscale device 620, is then routed to the difference detector device 622.
- the difference detector device 622 detects a user motion by comparing the current video frame 300 with a previously captured video frame.
- the difference detector subtracts, on a pixel-by-pixel basis, the grayscale values of the previously captured video frame from those of the current video frame 300, and places the results in the key buffer 304. The difference detector device will yield a value of zero for pixels in regions where no motion has occurred, and a non-zero value for regions in which motion is detected.
- the video frame 300 is then passed to the smooth filter device 608 for filtering of stray pixels.
- the video frame 300 is then routed to the application device 1202.
- the application device 1202 examines the video frame 300 to determine whether there has been a collision between the software-generated object 1104 and the region of the video frame 300 in which motion has been found. If a collision is detected, the application device 1202 accordingly adjusts the position and trajectory of the computer-generated object 1104.
- the video frame 300 is then routed to the full screen device 624 for display by the monitor 102.
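- A collision test of the kind the application device 1202 performs might be sketched as follows; the circular ball footprint and the crude rebound rule are assumptions for illustration:

```python
import numpy as np

def update_ball(pos, vel, motion_mask: np.ndarray, radius: int = 10):
    """If any motion pixels fall inside the ball's screen footprint,
    treat it as a hit and reverse the ball's direction of travel."""
    h, w = motion_mask.shape
    y, x = np.ogrid[:h, :w]
    inside = (x - pos[0]) ** 2 + (y - pos[1]) ** 2 <= radius ** 2
    if motion_mask[inside].any():
        vel = (-vel[0], -vel[1])  # crude rebound
    return (pos[0] + vel[0], pos[1] + vel[1]), vel
```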
- FIGS. 13 and 14 relate to a second mode of the invention wherein the video processing engine 208 is configured to recognize and track an article held by the user, and to infer a user action from the movement and orientation of the article.
- FIG. 13 is a perspective view of a user 1302 situated in front of a computer system 100.
- the user 1302 grasps and positions an article 1304 (which may comprise, for example, a toy weapon) which possesses a visual characteristic, such as a distinctive color, which can be recognized by the user interface system.
- the image of the user 1302 is processed to determine the direction in which the article is pointing (simulating the aiming of a weapon), to enable the user 1302 to fire at one or more computer-generated target objects 1306 displayed on the monitor 102.
- FIG. 14 depicts the preferred train of software devices for applying the required image transforms to the video frame 300 to achieve the objectives of the second mode of the invention.
- the capture device 602 initializes the process by obtaining a video frame 300 from the frame buffer 212.
- the capture device 602 then passes the video frame 300 to the color detector device 604, which ranks each pixel in the frame 300 as to how closely it matches a specified color (which is set to the characteristic color of the article 1304).
- the output frame 300 from the color detector device 604 is routed to the key generator device 606, which compares the color match values to a threshold value and creates a key based on the thresholding results for each pixel.
- the frame 300 is then routed to the smooth filter device 608 to filter out noise and stray pixels.
- the video frame 300 is then passed to the edge detector 610, which is configured to examine the video frame 300 to detect the edges of the article 1304.
- the video frame 300 is thereafter routed to the edge filter device 612 and edge filler device 614 to, respectively, filter out stray edges and fill in edge discontinuities.
- the video frame 300 is routed to the application device 1402 configured to examine the frame 300 to determine the orientation of the article 1304 and to infer a user action (i.e., firing the weapon at a computer-generated object 1306).
- the frame is passed to the full screen device 624 for display on the monitor 102.
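- The color ranking and thresholding steps of this train might be sketched as follows; the distance metric and the threshold are assumptions, since the patent does not specify them:

```python
import numpy as np

def color_match(frame: np.ndarray, target) -> np.ndarray:
    """Rank each pixel 0-255 by closeness to the target RGB color."""
    dist = np.linalg.norm(frame.astype(np.int32) - np.asarray(target),
                          axis=2)
    return (255 * (1 - dist / (np.sqrt(3) * 255))).astype(np.uint8)

def threshold_key(match: np.ndarray, threshold: int = 200) -> np.ndarray:
    """Key generator step: 255 where the match clears the threshold."""
    return np.where(match >= threshold, 255, 0).astype(np.uint8)
```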
- FIG. 15 relates to a third mode of the invention wherein the video image user interface is implemented in the form of an interactive kiosk.
- Interactive kiosks are commonly used in airports, hotels, tourist offices, and the like to interactively present information concerning accommodations, attractions, restaurants, transportation, etc.
- Such kiosks generally utilize menus to allow the user to selectively request certain information to be displayed (such as hotels within a given price range or sports events taking place on a certain day).
- Informational kiosks may also be employed to enable shoppers to identify and locate specific items, or to selectively present product information or advertising.
- Prior art informational kiosks are commonly provided with touch screens to receive user input. Touch screens, while generally facilitating an intuitive user interface, are prone to malfunction, particularly when used in a heavy-traffic environment.
- an interactive kiosk utilizes the user interface system of the present invention to display information selected by the user.
- a set of user interface controls 1502, each denoting a certain type of information, is displayed on a monitor 102.
- the user interface controls 1502 may comprise, for example, icons or textual menu choices.
- a video camera 108 captures an image of a user 1504, and the user's image is processed according to methods and techniques described above to determine a motion of the user 1504 and infer a user action with respect to the interface controls 1502.
- the user 1504 may select one of a plurality of icons 1502 by raising his hand such that it coincides with the area occupied by the selected icon. This causes information relevant to the icon to be displayed on the monitor.
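- An engagement test of this kind might be sketched as follows; the fill fraction needed to count as a selection is an assumed parameter:

```python
import numpy as np

def icon_engaged(foreground: np.ndarray, icon_rect,
                 fill_fraction: float = 0.25) -> bool:
    """An icon counts as engaged when the user's foreground pixels
    cover at least fill_fraction of its screen rectangle."""
    x, y, w, h = icon_rect
    region = foreground[y:y + h, x:x + w]  # boolean foreground mask
    return bool(region.mean() >= fill_fraction)
```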
- the interactive kiosk may be advantageously equipped with a modem or similar communications device to allow remote modification of the informational content displayed to the user.
- informational content can be updated and/or changed on a periodic basis without requiring a visit to the physical site of the kiosk.
- the video image based interface of the foregoing description may also be used, for example, for recognition and interpretation of gestures, immersion of a user's image in a static or dynamic video image, etc.
- the user interface system may be combined with other interface technologies, such as voice recognition, to enhance functionality.
- the user interface system may be implemented in connection with an appliance such as a television or audio system to control various aspects of its operation.
- the user interface system may be utilized to control operation of a toy or game.
- the user interface system may be implemented in a networked computing environment.
- the user interface system may be advantageously implemented in connection with a multi-player game application in which users located remotely from each other engage in game play.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU86824/98A AU8682498A (en) | 1997-07-31 | 1998-07-31 | Systems and methods for software control through analysis and interpretation of video information |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US5449897P | 1997-07-31 | 1997-07-31 | |
US60/054,498 | 1997-07-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1999007153A1 true WO1999007153A1 (fr) | 1999-02-11 |
Family
ID=21991516
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
- PCT/US1998/016046 WO1999007153A1 (fr) | 1998-07-31 | Systems and methods for software control through analysis and interpretation of video information
Country Status (2)
Country | Link |
---|---|
AU (1) | AU8682498A (fr) |
WO (1) | WO1999007153A1 (fr) |
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
- WO2002093465A1 (fr) * | 2001-05-15 | 2002-11-21 | Martin Lefebure | Method for controlling a cursor, computer program for implementing same, and cursor control system |
- WO2003021410A3 (fr) * | 2001-09-04 | 2004-03-18 | Koninkl Philips Electronics Nv | Computer interface system and method |
US6749432B2 (en) | 1999-10-20 | 2004-06-15 | Impulse Technology Ltd | Education system challenging a subject's physiologic and kinesthetic systems to synergistically enhance cognitive function |
- WO2004073815A1 (fr) * | 2003-02-21 | 2004-09-02 | Sony Computer Entertainment Europe Ltd | Control of data processing |
- WO2004073814A1 (fr) * | 2003-02-21 | 2004-09-02 | Sony Computer Entertainment Europe Ltd | Control of data processing |
- WO2006023285A1 (fr) * | 2004-08-19 | 2006-03-02 | Igt | Virtual input system |
- EP1645944A1 (fr) | 2004-10-05 | 2006-04-12 | Sony France S.A. | Content-management interface |
- EP1752861A1 (fr) | 2005-08-11 | 2007-02-14 | Samsung Electronics Co.,Ltd. | User input method and device for a mobile communication terminal |
- WO2009092126A1 (fr) * | 2008-01-25 | 2009-07-30 | Stumpfl, Reinhold | Interactive multimedia presentation device |
GB2463312A (en) * | 2008-09-09 | 2010-03-17 | Skype Ltd | Games system with bi-directional video communication |
- FR2972544A1 (fr) * | 2011-03-10 | 2012-09-14 | Intui Sense | Robust image acquisition and processing system for an interactive facade, and associated interactive facade and device |
- WO2014121566A1 (fr) * | 2013-02-07 | 2014-08-14 | Han Zheng | Data collection method and device for action recognition, and action recognition system |
US9087159B2 (en) | 2007-08-17 | 2015-07-21 | Adidas International Marketing B.V. | Sports electronic training system with sport ball, and applications thereof |
US9188975B2 (en) | 2011-11-21 | 2015-11-17 | Pirelli Tyre S.P.A. | Method for controlling the movement of building members of a tyre in a process for manufacturing tyres for vehicle wheels |
US9230395B2 (en) | 2004-06-18 | 2016-01-05 | Igt | Control of wager-based game using gesture recognition |
US9242142B2 (en) | 2007-08-17 | 2016-01-26 | Adidas International Marketing B.V. | Sports electronic training system with sport ball and electronic gaming features |
US10062297B2 (en) | 2007-08-17 | 2018-08-28 | Adidas International Marketing B.V. | Sports electronic training system, and applications thereof |
- CN110998594A (zh) * | 2017-08-07 | 2020-04-10 | Mitsubishi Electric Corporation | Method and system for detecting actions |
US10691949B2 (en) * | 2016-11-14 | 2020-06-23 | Axis Ab | Action recognition in a video sequence |
- CN114185424A (zh) * | 2014-05-21 | 2022-03-15 | Tangible Play, Inc. | Virtualization of tangible interface objects |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4843568A (en) * | 1986-04-11 | 1989-06-27 | Krueger Myron W | Real time perception of and response to the actions of an unencumbered participant/user |
US5297061A (en) * | 1993-05-19 | 1994-03-22 | University Of Maryland | Three dimensional pointing device monitored by computer vision |
US5423554A (en) * | 1993-09-24 | 1995-06-13 | Metamedia Ventures, Inc. | Virtual reality game method and apparatus |
US5528263A (en) * | 1994-06-15 | 1996-06-18 | Daniel M. Platzker | Interactive projected video image display system |
US5534917A (en) * | 1991-05-09 | 1996-07-09 | Very Vivid, Inc. | Video image based control system |
US5616078A (en) * | 1993-12-28 | 1997-04-01 | Konami Co., Ltd. | Motion-controlled video entertainment system |
US5795228A (en) * | 1996-07-03 | 1998-08-18 | Ridefilm Corporation | Interactive computer-based entertainment system |
-
1998
- 1998-07-31 WO PCT/US1998/016046 patent/WO1999007153A1/fr active Application Filing
- 1998-07-31 AU AU86824/98A patent/AU8682498A/en not_active Abandoned
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4843568A (en) * | 1986-04-11 | 1989-06-27 | Krueger Myron W | Real time perception of and response to the actions of an unencumbered participant/user |
US5534917A (en) * | 1991-05-09 | 1996-07-09 | Very Vivid, Inc. | Video image based control system |
US5297061A (en) * | 1993-05-19 | 1994-03-22 | University Of Maryland | Three dimensional pointing device monitored by computer vision |
US5423554A (en) * | 1993-09-24 | 1995-06-13 | Metamedia Ventures, Inc. | Virtual reality game method and apparatus |
US5616078A (en) * | 1993-12-28 | 1997-04-01 | Konami Co., Ltd. | Motion-controlled video entertainment system |
US5528263A (en) * | 1994-06-15 | 1996-06-18 | Daniel M. Platzker | Interactive projected video image display system |
US5795228A (en) * | 1996-07-03 | 1998-08-18 | Ridefilm Corporation | Interactive computer-based entertainment system |
Cited By (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6749432B2 (en) | 1999-10-20 | 2004-06-15 | Impulse Technology Ltd | Education system challenging a subject's physiologic and kinesthetic systems to synergistically enhance cognitive function |
- WO2002093465A1 (fr) * | 2001-05-15 | 2002-11-21 | Martin Lefebure | Method for controlling a cursor, computer program for implementing same, and cursor control system |
- FR2824926A1 (fr) * | 2001-05-15 | 2002-11-22 | Martin Lefebure | Method for controlling a cursor, computer program for implementing same, and cursor control system |
- WO2003021410A3 (fr) * | 2001-09-04 | 2004-03-18 | Koninkl Philips Electronics Nv | Computer interface system and method |
US8035613B2 (en) | 2003-02-21 | 2011-10-11 | Sony Computer Entertainment Europe Ltd. | Control of data processing |
- WO2004073815A1 (fr) * | 2003-02-21 | 2004-09-02 | Sony Computer Entertainment Europe Ltd | Control of data processing |
- WO2004073814A1 (fr) * | 2003-02-21 | 2004-09-02 | Sony Computer Entertainment Europe Ltd | Control of data processing |
US9798391B2 (en) | 2004-06-18 | 2017-10-24 | Igt | Control of wager-based game using gesture recognition |
US9230395B2 (en) | 2004-06-18 | 2016-01-05 | Igt | Control of wager-based game using gesture recognition |
US8398488B2 (en) | 2004-08-19 | 2013-03-19 | Igt | Virtual input system |
US9116543B2 (en) | 2004-08-19 | 2015-08-25 | Iii Holdings 1, Llc | Virtual input system |
US10564776B2 (en) | 2004-08-19 | 2020-02-18 | American Patents Llc | Virtual input system |
- WO2006023285A1 (fr) * | 2004-08-19 | 2006-03-02 | Igt | Virtual input system |
US9606674B2 (en) | 2004-08-19 | 2017-03-28 | Iii Holdings 1, Llc | Virtual input system |
US7942744B2 (en) | 2004-08-19 | 2011-05-17 | Igt | Virtual input system |
- WO2006037786A3 (fr) * | 2004-10-05 | 2006-06-15 | Sony France Sa | Content-management interface |
US7886229B2 (en) | 2004-10-05 | 2011-02-08 | Sony France S.A. | Content-management interface |
- EP1645944A1 (fr) | 2004-10-05 | 2006-04-12 | Sony France S.A. | Content-management interface |
- EP1752861A1 (fr) | 2005-08-11 | 2007-02-14 | Samsung Electronics Co.,Ltd. | User input method and device for a mobile communication terminal |
US7898563B2 (en) | 2005-08-11 | 2011-03-01 | Samsung Electronics Co., Ltd. | User input method and device of mobile communication terminal |
US10062297B2 (en) | 2007-08-17 | 2018-08-28 | Adidas International Marketing B.V. | Sports electronic training system, and applications thereof |
US9645165B2 (en) | 2007-08-17 | 2017-05-09 | Adidas International Marketing B.V. | Sports electronic training system with sport ball, and applications thereof |
US12020588B2 (en) | 2007-08-17 | 2024-06-25 | Adidas International Marketing B.V. | Sports electronic training system, and applications thereof |
US9242142B2 (en) | 2007-08-17 | 2016-01-26 | Adidas International Marketing B.V. | Sports electronic training system with sport ball and electronic gaming features |
US9759738B2 (en) | 2007-08-17 | 2017-09-12 | Adidas International Marketing B.V. | Sports electronic training system, and applications thereof |
US9625485B2 (en) | 2007-08-17 | 2017-04-18 | Adidas International Marketing B.V. | Sports electronic training system, and applications thereof |
US9087159B2 (en) | 2007-08-17 | 2015-07-21 | Adidas International Marketing B.V. | Sports electronic training system with sport ball, and applications thereof |
- AT506618B1 (de) * | 2008-01-25 | 2013-02-15 | Stumpfl Reinhold | Interactive multimedia presentation device |
- WO2009092126A1 (fr) * | 2008-01-25 | 2009-07-30 | Stumpfl, Reinhold | Interactive multimedia presentation device |
GB2463312A (en) * | 2008-09-09 | 2010-03-17 | Skype Ltd | Games system with bi-directional video communication |
- WO2012120243A3 (fr) * | 2011-03-10 | 2014-09-18 | Intui Sense | Robust image acquisition and processing system for an interactive facade, and associated interactive facade and device |
- FR2972544A1 (fr) * | 2011-03-10 | 2012-09-14 | Intui Sense | Robust image acquisition and processing system for an interactive facade, and associated interactive facade and device |
US9188975B2 (en) | 2011-11-21 | 2015-11-17 | Pirelli Tyre S.P.A. | Method for controlling the movement of building members of a tyre in a process for manufacturing tyres for vehicle wheels |
- WO2014121566A1 (fr) * | 2013-02-07 | 2014-08-14 | Han Zheng | Data collection method and device for action recognition, and action recognition system |
- CN114185424A (zh) * | 2014-05-21 | 2022-03-15 | Tangible Play, Inc. | Virtualization of tangible interface objects |
US10691949B2 (en) * | 2016-11-14 | 2020-06-23 | Axis Ab | Action recognition in a video sequence |
- CN110998594A (zh) * | 2017-08-07 | 2020-04-10 | Mitsubishi Electric Corporation | Method and system for detecting actions |
- CN110998594B (zh) * | 2017-08-07 | 2024-04-09 | Mitsubishi Electric Corporation | Method and system for detecting actions |
Also Published As
Publication number | Publication date |
---|---|
AU8682498A (en) | 1999-02-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
- WO1999007153A1 (fr) | Systems and methods for software control through analysis and interpretation of video information | |
US8165345B2 (en) | Method, system, and computer program for detecting and characterizing motion | |
US8139059B2 (en) | Object illumination in a virtual environment | |
US4843568A (en) | Real time perception of and response to the actions of an unencumbered participant/user | |
- KR101855639B1 (ko) | Camera navigation for presentations | |
US8223147B1 (en) | Method and system for vision-based interaction in a virtual environment | |
US9684968B2 (en) | Method, system and computer program for detecting and characterizing motion | |
US6195104B1 (en) | System and method for permitting three-dimensional navigation through a virtual reality environment using camera-based gesture inputs | |
US20060290663A1 (en) | Simulated training environments based upon fixated objects in specified regions | |
- CN102622774A (zh) | Living room movie creation | |
Song et al. | Vision-based 3D finger interactions for mixed reality games with physics simulation | |
WO2021102566A9 (fr) | Systèmes et procédés d'amélioration de l'interaction avec un joueur utilisant la réalité augmentée | |
Chen et al. | Using real-time acceleration data for exercise movement training with a decision tree approach | |
US20070298396A1 (en) | Computer executable dynamic presentation system for simulating human meridian points and method thereof | |
Chan et al. | Gesture-based interaction for a magic crystal ball | |
Gross et al. | Gesture Modelling: Using Video to Capture Freehand Modeling Commands | |
Gurieva et al. | Augmented reality for personalized learning technique: Climbing gym case study | |
- KR100607046B1 (ko) | Image processing method for motion-sensing games and game method using the same | |
Hillaire et al. | Using a visual attention model to improve gaze tracking systems in interactive 3d applications | |
- KR20010105012A (ko) | Visual system for golf swing comparison and analysis using the Internet | |
- KR200239844Y1 (ko) | Motion-sensing game device using artificial vision and pattern recognition. | |
Lok | Interacting with dynamic real objects in virtual environments | |
Gammon et al. | Limb-O: real-time comparison and visualization of lower limb motions | |
- CN118118643B (zh) | Video data processing method and related apparatus | |
Peng et al. | Design and Implementation of Multi-mode Natural Interaction of Game Animation Characters in Mixed Reality: A Novel User Experience Method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE GH GM HR HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 09381136 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: KR |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
122 | Ep: pct application non-entry in european phase | ||
NENP | Non-entry into the national phase |
Ref country code: CA |