US20230360079A1 - Gaze estimation system and method thereof - Google Patents
Gaze estimation system and method thereof
- Publication number
- US20230360079A1 (application No. US 17/577,758)
- Authority
- US
- United States
- Prior art keywords
- person
- camera
- audience
- signage
- parameters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0242—Determining effectiveness of advertisements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/012—Head tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G06F3/0304—Detection arrangements using opto-electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Abstract
A system to estimate audience parameters having the following features: receiving inputs from (a) a monocular camera (102) placed top-center/bottom-center of a target signage board (101); (b) a camera tilt calibration module (103); (c) camera parameters (104); (d) signage parameters (110); and (e) output from the ML systems (105) to process and analyse the monocular camera images to aid as input to the gaze estimation module (106) and produce output (107). The output gives the person's gaze at the target and additional parameters such as gender, person height, and depth from the signage, and provides data regarding user interest and engagement levels with the target signage board.
Description
- This invention relates to the field of gaze recognition, more specifically to estimating gaze at a specific pre-defined target object using feature-based methods. The present invention involves a system or method for estimating audience head gaze at a target signage board by estimating person depth and height using a monocular camera.
- The field of marketing showcases the featured product(s) in the best way possible to garner attention and convert that attention into sales. The explosion of technology and innovation in the past twenty years has led to specialized marketing techniques which rely on technology to gather real-time information about the needs, likes, and desires of the potential customer, to obtain real-time feedback on whether particular advertisements, targeted showcasing, or product carnivals achieve their expected results, to study audience reactions, and to make product changes effectively and promptly to cater to the audience.
- In the past few decades, market analysis consisted of in-person surveys, telephonic questionnaires, supermarket/store surveys, product movement studies, mailed reviews, or manually studying public footage to ascertain the reactions of the audience to particular advertisements, signage boards, etc. With the evolution of machine learning, image processing, and connectivity, real-time processing of audience information is now possible, and with the rise of social media and the tremendous amount of user behaviour information now available, instant tangible results can be achieved by capturing audience behaviour and analysing it.
- Cameras are placed at strategic locations in public places; the attention, interest, and reactions of the audience towards the targeted signage boards and advertisements are captured and analyzed using systems that employ algorithms which extract and study only certain features, to maintain the privacy of the individual, and the analysis is utilized to gauge the success of the advertisement.
- A monocular (single-eyed) system is a single camera sensor placed in a strategic position and location to capture images/videos that can be processed. A stereo vision system is a system with two cameras placed at a certain distance from each other.
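- For context, the relation below (standard computer-vision background, not part of this patent) shows why a stereo pair recovers depth directly from disparity, whereas a monocular system must infer depth from a known real-world size such as face dimensions. A minimal Python sketch:

```python
def stereo_depth_m(focal_len_px: float, baseline_m: float, disparity_px: float) -> float:
    """Classic stereo triangulation: Z = f * B / d.
    Depth Z falls as the disparity d between the two views grows; a
    monocular camera has no disparity cue and must estimate depth from
    a known object size instead (here, face dimensions)."""
    return focal_len_px * baseline_m / disparity_px

# Example: f = 800 px, baseline B = 0.10 m, disparity d = 20 px -> Z = 4.0 m.
print(stereo_depth_m(800.0, 0.10, 20.0))
```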
- Several monocular cameras are available in the market today. The requisite features of a monocular camera include lens elements, compact size, mounting features, night vision or low-light capability, connectivity to systems, etc.
- A few patents based on audience reaction estimation are given below:
- U.S. Pat. No. 8,401,248B1: Method and system for measuring emotional and attentional response to dynamic digital media content—This patent relates to a method and system to provide an automatic measurement of people's responses to dynamic digital media, based on changes in their facial expressions and attention to specific content. First, the method detects and tracks faces from the audience. It then localizes each of the faces and facial features to extract emotion-sensitive features of the face by applying emotion-sensitive feature filters, to determine the facial muscle actions of the face based on the extracted emotion-sensitive features. The changes in facial muscle actions are then converted to changes in affective state, called an emotion trajectory. In parallel, the method also estimates eye gaze based on extracted eye images and the three-dimensional facial pose of the face based on localized facial images. The gaze direction of the person is estimated based on the estimated eye gaze and the three-dimensional facial pose of the person. The gaze target on the media display is then estimated based on the estimated gaze direction and the position of the person. Finally, the response of the person to the dynamic digital media content is determined by analyzing the emotion trajectory in relation to the time and screen positions of the specific digital media sub-content that the person is watching.
- U.S. Pat. No. 7,921,036B1: Method and system for dynamically targeting content based on automatic demographics and behaviour analysis—This patent relates to a method and system for selectively executing content on a display based on the automatic recognition of predefined characteristics, including visually perceptible attributes, such as the demographic profile of people identified automatically using a sequence of image frames from a video stream. The invention detects the images of the individual or the people from captured images. It automatically extracts visually perceptible attributes, including demographic information, local behavior analysis, and emotional status, of the individual or the people from the images in real-time. The visually perceptible attributes further comprise height, skin color, hair color, the number of people in the scene, time spent by the people, and whether a person is looking at the display. A targeted media is selected from a set of media pools, according to the automatically extracted, visually perceptible attributes and the feedback from the people.
- U.S. Pat. No. 9,965,870B2: Camera calibration method using a calibration target—This patent relates to calibration methods that use a calibration target for obtaining the intrinsic and extrinsic camera parameters of one or more cameras. The methods can include acquiring, with each camera, a sequence of target images representing the calibration target in different target poses and at different acquisition times. The methods can include identifying reference images from the target images and defining volume bins, angle bins, and multi-camera bins into which the reference images are stored. The reference images can be used to determine the intrinsic and extrinsic parameters of one or more cameras. In some implementations, the calibration methods can enable a user to monitor the progress of the calibration process, for example by providing an interactive calibration target including an input/output user interface to guide the user in real-time during the acquisition of the target images and/or sensors to provide positional information about the target poses.
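- For illustration, a minimal sketch of target-based calibration using a checkerboard and OpenCV; this is the standard technique, not the bin-based method of the cited patent, and the image folder name is hypothetical:

```python
import glob
import cv2
import numpy as np

# 3D corner coordinates of a 9x6 checkerboard lying in the z = 0 plane,
# with a 25 mm square size.
objp = np.zeros((9 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:9, 0:6].T.reshape(-1, 2) * 0.025

obj_points, img_points = [], []
for path in glob.glob("calib_images/*.png"):   # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, (9, 6))
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Returns the reprojection error, the intrinsic matrix K, the distortion
# coefficients, and per-view extrinsics (rotation/translation vectors).
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```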
- US20160210503A1: Real-time eye tracking for human computer interaction—In this patent, a gaze direction determining system and method is provided. A two-camera system detects the face from a fixed, wide-angle camera, estimates a rough location for the eye region using an eye detector based on topographic features, and directs another active pan-tilt-zoom camera to focus in on this eye region. An eye gaze estimation approach employs point-of-regard (PoG) tracking on a large viewing screen. To allow for greater head pose freedom, a calibration approach is provided to find the 3D eyeball location, eyeball radius, and fovea position. Both the iris center and iris contour points are mapped to the eyeball sphere (creating a 3D iris disk) to get the optical axis; then the fovea is rotated accordingly and the final visual-axis gaze direction is computed.
- In the above-discussed patents, the systems employ stereo cameras, iris detection techniques, and personal height estimation to study the audience populace and estimate their reactions.
- Our present patent relates to a system that employs a monocular camera at either the top-center or bottom-center of the target signage board; the camera input and camera tilt calibration input are processed and analysed to produce information regarding the person's head gaze at the signage.
- The main objective of our system is to estimate the audience head gaze at a target signage board by estimating person depth and height using a monocular camera. Another objective is that the estimation process is performed completely locally, without streaming the camera data outside the system; only the metadata is sent out of the system.
- The secondary objective of our system is to estimate the person's level of interest and determine the success of the target signage board.
- The following summary is provided to facilitate a clear understanding of the new features in the disclosed embodiment; it is not intended to be a full, detailed description. A detailed description of all aspects of the disclosed invention can be understood by reviewing the full specification, the drawing, the claims, and the abstract as a whole.
- The major problem addressed by the invention is the estimation of a person's head gaze at a target signage board by estimating that person's height and depth in the environment using the monocular camera. The system or method consists of mounting a monocular camera at the appropriate position near the target signage board. The system consists of a camera and a main processing unit; the processing unit does the whole processing locally and sends the target gaze information out. The main system contains a method for calculating the head gaze at the signage board using ML (Machine Learning) based algorithms and some basic geometry. Based on the features from the ML algorithm and other inputs, such as the signage parameters (height, size, etc.), the tilt parameters, and the height and position of the camera, the system then calculates a person's head gaze or level of interest in the target signage board by estimating the person's height, depth, and head pose. The system can also perform the camera's tilt calibration to aid the accuracy of the estimation. By calculating the head gaze near the target signage board, the level of interest or other information regarding the viewer can be obtained. This can be used to determine the success or failure of the content displayed on the target signage board, to gauge audience details like age group/ethnicity/socio-economic specifics, and also to determine which factors affect the audience's interest in the content on display at the target signage board.
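- The patent's depth and height estimation approach is proprietary; the following is only a minimal sketch of one plausible geometry under a pinhole-camera assumption, using a gender-informed average interpupillary distance (IPD) as the known real-world size. All constants and function names are illustrative:

```python
import math

# Assumed average interpupillary distance in metres, keyed by the ML
# gender estimate (illustrative population averages, not patent values).
AVG_IPD_M = {"male": 0.0645, "female": 0.0620}

def person_depth_m(eye_left_px, eye_right_px, focal_len_px, gender):
    """Pinhole relation Z = f * real_size / pixel_size, using the IPD as
    the known real size selected by the estimated gender."""
    ipd_px = math.dist(eye_left_px, eye_right_px)
    return focal_len_px * AVG_IPD_M[gender] / ipd_px

def person_height_m(head_top_px_y, depth_m, focal_len_px,
                    principal_pt_y, cam_height_m, cam_tilt_rad):
    """Back-project the head-top pixel row to a world height using the
    camera mounting height and the tilt angle from the calibration
    module (tilt > 0 points the camera upward; image y grows downward)."""
    ray_angle = math.atan2(principal_pt_y - head_top_px_y, focal_len_px)
    return cam_height_m + depth_m * math.tan(cam_tilt_rad + ray_angle)

# Example: eyes 40 px apart at f = 800 px -> Z = 800 * 0.0645 / 40 ≈ 1.29 m.
print(person_depth_m((600, 300), (640, 300), 800.0, "male"))
```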
- A more particular description of the manner in which the proposed system works, briefly summarized above, may be had by reference to the components, some of which are illustrated in the appended drawing. It is to be noted, however, that the appended drawing illustrates only typical embodiments of this system and should therefore not be considered limiting of its scope, for the system may admit to other equally effective embodiments.
- Throughout the document, the same drawing reference numerals will be understood to refer to the same elements and features.
- The features and advantages of the present proposed system will become more apparent from the following detailed description along with the accompanying figures, which forms a part of this application and in which:
- FIG. 1 is a block diagram of the components and the workflow of our system in accordance with our present invention.
- 100 Audience/Person
- 101 Target Signage Board
- 102 Monocular Camera module
- 103 Camera tilt calibration module
- 104 Camera parameters
- 105 Machine Learning model/s
- 106 Head Gaze Estimation Module
- 107 Head Gaze Estimation Output
- 108 Person-Camera tilt angle, distance
- 109 Person Gaze of Interest
- 110 Signage parameters
- The principles of operation, design configurations, and evaluation values in these non-limiting examples can be varied and are merely cited to illustrate at least one embodiment of the invention, without limiting the scope thereof.
- The embodiments disclosed herein can be expressed in different forms and should not be considered as limited to the embodiments listed in the disclosed invention. The various embodiments outlined in the subsequent sections are construed such that they provide a complete and thorough understanding of the disclosed invention, by clearly describing the scope of the invention for those skilled in the art.
- The present embodiment of our invention consists of fixing a monocular camera (102) over the top or bottom centre of the digital signage board (101), programming the sensor's intrinsic and extrinsic parameters, and gauging the camera tilt parameters using the tilt calibration module (103). The camera captures videos or images of the audience and sends the videos/images to the machine learning systems, which run inference on the images to obtain head pose, gender, and face key point information. This input is fed to the Head Gaze Estimation Module (106), which also takes in inputs from the camera tilt calibration module (103), the signage parameters (110), the intrinsic and other camera parameters (104), and the output from the learning systems (105), which provides input like head pose, face key points, etc. The head gaze estimation module estimates the person depth based on the face key points and gender information from the Machine Learning Module and produces the head gaze output (107), which shows the person's interest, level of engagement, and target estimation, and also provides input for adjusting the camera parameters via the camera tilt calibration module.
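- As an illustration of the target estimation step, the sketch below checks whether a head-gaze ray, built from the estimated head pose and the person's position recovered from the depth and height estimates, intersects the signage rectangle. The coordinate convention and function names are assumptions, not the patent's disclosed method:

```python
import numpy as np

def gaze_hits_signage(person_pos, yaw_rad, pitch_rad,
                      board_center_xy, board_w_m, board_h_m):
    """Return True if the head-gaze ray from person_pos = (x, y, z) in
    metres intersects the board rectangle lying in the z = 0 plane;
    yaw = pitch = 0 is taken to mean facing the board squarely."""
    person_pos = np.asarray(person_pos, dtype=float)
    direction = np.array([
        np.sin(yaw_rad) * np.cos(pitch_rad),    # lateral component
        np.sin(pitch_rad),                      # vertical component
        -np.cos(yaw_rad) * np.cos(pitch_rad),   # toward the board (-z)
    ])
    if direction[2] >= 0.0:          # head turned away from the board plane
        return False
    t = -person_pos[2] / direction[2]           # ray parameter at z = 0
    hit = person_pos + t * direction            # intersection point
    return (abs(hit[0] - board_center_xy[0]) <= board_w_m / 2 and
            abs(hit[1] - board_center_xy[1]) <= board_h_m / 2)

# A person 3 m in front of a 2 m x 1 m board, head at 1.6 m, facing it:
print(gaze_hits_signage((0.0, 1.6, 3.0), 0.0, 0.0, (0.0, 1.5), 2.0, 1.0))  # True
```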
- In the present embodiment of our invention, the board signage parameters include the location, height or position of the board, the dimensions of the board, the relative positioning of the camera with respect to the board dimensions, etc. The extrinsic parameters are measured and programmed based on the location of the mount. The tilt parameter is computed by the tilt calibration module as part of the system. The intrinsic parameters of the monocular camera can be programmed. The distortion parameters, which provide correction for the lens distortion, are also provided as input to the system. The system utilizes multiple open-source Machine Learning (ML) models for calculating head pose, gender, and face key points based on the input from the camera module. This output is fed as one of the inputs to the Head Gaze Estimation Module, which determines the target head gaze at the signage board by estimating the person depth and height using a proprietary approach.
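- As a small example of how the programmed intrinsic and distortion parameters might be applied, the sketch below undistorts detected face key points with OpenCV before they are used for depth and height estimation. The matrix values are placeholders, not calibrated figures from the system:

```python
import numpy as np
import cv2

# Placeholder intrinsics for a 1280x720 monocular module: focal lengths
# fx, fy and principal point (cx, cy), all in pixels.
K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])
# Distortion coefficients in OpenCV order: k1, k2, p1, p2, k3.
dist = np.array([-0.12, 0.05, 0.0, 0.0, 0.0])

def undistort_keypoints(pts_px):
    """Remove lens distortion from face key points; passing P=K keeps
    the output in pixel coordinates of the original camera."""
    pts = np.asarray(pts_px, dtype=np.float32).reshape(-1, 1, 2)
    return cv2.undistortPoints(pts, K, dist, P=K).reshape(-1, 2)

# Example: undistort the two detected eye centres before depth estimation.
print(undistort_keypoints([(600.0, 300.0), (640.0, 300.0)]))
```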
- In one embodiment of our present invention, the system or method for audience head gaze estimation can be used to obtain information for the ideal location and positioning for advertisements, displays, target signage boards, design inputs for arenas for theatres, movies, conference halls to provide optimum user engagement.
- In one embodiment of our present invention, the system or method for audience head gaze estimation can be used to obtain the audience information, to study the audience specificity based on gender, age, and other parameters, and using them to design/alter products or marketing strategies.
- In one embodiment of our present invention, the system or method for audience head gaze estimation can be used to gauge the audience interest and engagement and determine the success of the advertisement/program/etc.
- In one embodiment of our present invention, the system or method for audience head gaze estimation can be used to gauge the audience and change the content of the advertisement/display/signage board/television or other display dynamically to suit the audience engagement levels.
- In one embodiment of our present invention, the system or method for audience head gaze estimation can be used to gauge the audience interest, and to switch on or off the display or to change the programs based on the interest of the viewer and to provide a personalized ranking of programs or to find different programs suited to the user's interest.
- In one embodiment of our present invention, the system or method for audience head gaze estimation can be used to display personalized user-targeted advertisements based on input from the user social media, recent search information, and head gaze positioning.
- While the foregoing written description of the invention enables one of ordinary skill to make and use what is considered presently to be the best mode thereof, those of ordinary skill will understand and appreciate the existence of variations, combinations, and equivalents of the specific embodiment, method, and examples herein. The invention should therefore not be limited by the above-described embodiment, method, and examples, but by all embodiments and methods within the scope of the invention as claimed.
Claims (6)
1. A system for audience head gaze estimation comprising:
a gaze estimation module (106) provided with the following inputs:
a. input from a monocular camera (102);
b. input of camera tilt angle from a tilt calibration module (103);
c. signage parameters (110); and
d. learning input for ML-based models (105), which process the inputs using several ML models to estimate the following:
i. person gender;
ii. person head pose; and
iii. person face key points;
wherein the person head pose is used to provide feedback for the gaze estimation module (106) to thereby gauge audience interest towards a displayed signage board (101).
2. The system for audience head gaze estimation, as claimed in claim 1, wherein the input from the monocular camera can be images or video.
3. The system for audience head gaze estimation, as claimed in claim 1, wherein the camera tilt calibration module provides tilt parameters with known intrinsic and extrinsic parameters.
4. The system for audience head gaze estimation, as claimed in claim 1, wherein the signage parameters include location, position, dimensions of the display signage board, and camera positioning height.
5. A method for audience head gaze estimation comprising the steps of:
Inputting images from a monocular camera;
Utilizing learning systems that process and analyse the images and produce output (107) to obtain head pose and gender;
Using pre-calibrated camera parameters, signage parameters, and the gender to compute person height and person depth; and
Computing person gaze at target from the head pose, the person height, and the person depth.
6. The method for audience head gaze estimation, as claimed in claim 5, wherein the audience head gaze is estimated to provide information regarding user interest and engagement levels with a target signage board.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/577,758 US20230360079A1 (en) | 2022-01-18 | 2022-01-18 | Gaze estimation system and method thereof |
US18/453,389 US20230394521A1 (en) | 2022-01-18 | 2023-08-22 | System and method to detect a gaze at an object by utilizing an image sensor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/577,758 US20230360079A1 (en) | 2022-01-18 | 2022-01-18 | Gaze estimation system and method thereof |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/453,389 Continuation US20230394521A1 (en) | 2022-01-18 | 2023-08-22 | System and method to detect a gaze at an object by utilizing an image sensor |
Publications (1)
Publication Number | Publication Date |
---|---|
US20230360079A1 (en) | 2023-11-09
Family
ID=88648903
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/577,758 Abandoned US20230360079A1 (en) | 2022-01-18 | 2022-01-18 | Gaze estimation system and method thereof |
US18/453,389 Pending US20230394521A1 (en) | 2022-01-18 | 2023-08-22 | System and method to detect a gaze at an object by utilizing an image sensor |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/453,389 Pending US20230394521A1 (en) | 2022-01-18 | 2023-08-22 | System and method to detect a gaze at an object by utilizing an image sensor |
Country Status (1)
Country | Link |
---|---|
US (2) | US20230360079A1 (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030126013A1 (en) * | 2001-12-28 | 2003-07-03 | Shand Mark Alexander | Viewer-targeted display system and method |
US7921036B1 (en) * | 2002-04-30 | 2011-04-05 | Videomining Corporation | Method and system for dynamically targeting content based on automatic demographics and behavior analysis |
US8401248B1 (en) * | 2008-12-30 | 2013-03-19 | Videomining Corporation | Method and system for measuring emotional and attentional response to dynamic digital media content |
US20100228632A1 (en) * | 2009-03-03 | 2010-09-09 | Rodriguez Tony F | Narrowcasting From Public Displays, and Related Methods |
US20150215672A1 (en) * | 2014-01-29 | 2015-07-30 | Samsung Electronics Co., Ltd. | Display apparatus and control method thereof |
US20150358594A1 (en) * | 2014-06-06 | 2015-12-10 | Carl S. Marshall | Technologies for viewer attention area estimation |
US9965870B2 (en) * | 2016-03-29 | 2018-05-08 | Institut National D'optique | Camera calibration method using a calibration target |
Also Published As
Publication number | Publication date |
---|---|
US20230394521A1 (en) | 2023-12-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11430260B2 (en) | Electronic display viewing verification | |
US10517521B2 (en) | Mental state mood analysis using heart rate collection based on video imagery | |
US20190034706A1 (en) | Facial tracking with classifiers for query evaluation | |
US11056225B2 (en) | Analytics for livestreaming based on image analysis within a shared digital environment | |
US10182720B2 (en) | System and method for interacting with and analyzing media on a display using eye gaze tracking | |
US8667519B2 (en) | Automatic passive and anonymous feedback system | |
US7922670B2 (en) | System and method for quantifying and mapping visual salience | |
US20160191995A1 (en) | Image analysis for attendance query evaluation | |
US20170238859A1 (en) | Mental state data tagging and mood analysis for data collected from multiple sources | |
US20120133754A1 (en) | Gaze tracking system and method for controlling internet protocol tv at a distance | |
US20120140069A1 (en) | Systems and methods for gathering viewership statistics and providing viewer-driven mass media content | |
US20150350730A1 (en) | Video recommendation using affect | |
WO2012105196A1 (en) | Interest estimation device and interest estimation method | |
US11430561B2 (en) | Remote computing analysis for cognitive state data metrics | |
KR20190020779A (en) | Ingestion Value Processing System and Ingestion Value Processing Device | |
US20130151333A1 (en) | Affect based evaluation of advertisement effectiveness | |
US20100046797A1 (en) | Methods and systems for audience monitoring | |
Navarathna et al. | Predicting movie ratings from audience behaviors | |
CN105339969A (en) | Linked advertisements | |
JP5225870B2 (en) | Emotion analyzer | |
JP6583996B2 (en) | Video evaluation apparatus and program | |
Modi et al. | Real-time camera-based eye gaze tracking using convolutional neural network: a case study on social media website | |
EP4213105A1 (en) | Gaze estimation system and method thereof | |
CN112163880A (en) | Intelligent advertisement putting method and system based on image processing | |
KR102477231B1 (en) | Apparatus and method for detecting interest in gazed objects |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | AS | Assignment | Owner name: E-CON SYSTEMS INDIA PRIVATE LIMITED, INDIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: RAJENDRAN, PARTHASARATHY; REEL/FRAME: 058679/0337. Effective date: 20220106
 | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED
 | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
 | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION