JP2012212325A

JP2012212325A - Visual axis measuring system, method and program

Info

Publication number: JP2012212325A
Application number: JP2011077741A
Authority: JP
Inventors: Akira Uchiumi; 章内海; Hirotake Yamazoe; 大丈山添
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 2011-03-31
Filing date: 2011-03-31
Publication date: 2012-11-01
Anticipated expiration: 2031-03-31
Also published as: JP5688514B2

Abstract

PROBLEM TO BE SOLVED: To improve the accuracy of estimating or measuring a visual axis by correcting a personal parameter and a frame parameter against the face images of all frames. SOLUTION: In step S105, a server 12 acquires position data of feature points of a face and, in step S107, acquires data on the position and posture of the face. In step S111, the server sets the personal parameter and the frame parameter for each frame image based on the feature point position data and the face position and posture data. In step S115, it calculates a score value representing the goodness of fit of the personal parameter and the frame parameter to each frame face image, and integrates the score values over all frame images. In step S121, each parameter is corrected until the integrated score value falls below a predetermined threshold. Based on the corrected personal parameter and frame parameter, the server measures the visual axis direction as a three-dimensional straight line connecting the eyeball center to the iris center (S123).

Description

The present invention relates to a line-of-sight measurement system, method, and program, and in particular to a line-of-sight measurement system, method, and program that estimate or measure the line-of-sight direction of an eye contained in a face image by processing, for example, a large number of frame face images captured by a monocular camera.

Patent Document 1 discloses a line-of-sight estimation system proposed by the present applicant. In this background art, the line-of-sight direction of a subject can be estimated by processing the subject's face image signal from a monocular camera.
JP 2008-102902 A [G06T 7/60 A61B 3/113]

In the background art of Patent Document 1, the line-of-sight direction is estimated by real-time processing of the face image signal from the camera. Because processing starts with no knowledge at all of the arrangement of parts such as the eyes, nose, and mouth in the subject's face, there is a limit to how far accuracy can be improved by sequential learning.

Therefore, a main object of the present invention is to provide a novel line-of-sight measurement system, method, and program.

Another object of the present invention is to provide a line-of-sight measurement system, method, and program capable of improved accuracy.

The present invention employs the following configurations in order to solve the above problems. Reference numerals in parentheses, supplementary explanations, and the like indicate correspondence with the embodiment described later in order to aid understanding of the present invention, and do not limit the present invention in any way.

A first invention is a line-of-sight measurement system that measures a line-of-sight direction for each frame face image based on a person's eyeball position and iris position acquired from the frame face image, comprising: feature point data acquisition means for acquiring, for each frame face image using a standard model, position data of feature points of the face image including the iris position; face data acquisition means for acquiring face position and posture data for each frame face image using the feature point position data; parameter setting means for setting a personal parameter and a frame parameter for each frame image based on the feature point position data and the face position and posture data; fitness calculation means for calculating the fitness of the personal parameter and the frame parameter to all frame face images; correction means for correcting the personal parameter and the frame parameter until the fitness reaches a predetermined threshold; and means for measuring the line-of-sight direction based on the corrected personal parameter and frame parameter.

In the first invention, feature point data acquisition means (12, 261, S105), formed for example by a computer (server 12), uses a standard model to acquire, for each frame face image, position data of feature points of the face image including the iris position. Here, the standard model includes, as an example, coordinate data indicating the three-dimensional positions of predetermined feature points of a human face (for example, the inner and outer corners of both eyes and the corners of the mouth), constructed according to anatomical knowledge, together with data on the eyeball position and eyeball radius that can be estimated anatomically from those feature points. The feature point data acquisition means (12, 261, S105) applies such a standard model to the frame face image to detect the subject's feature points. Face data acquisition means (12, 262, S107) uses the feature point position data to acquire face position and posture data for each frame face image, for example by QR decomposition of a projection matrix (P) relating to a plurality of feature points. Parameter setting means (12, S111, S121) sets a personal parameter and a frame parameter for each frame image based on the feature point position data and the face position and posture data. The personal parameter is, as an example, one data set that corresponds to the standard model described above but is unique to each subject and common to all frames: the three-dimensional coordinates of the six feature points, the eyeball position, and the eyeball radius (r). The frame parameter is, as an example, a data set specific to each frame that includes the face position and posture, the iris (pupil) position, and the iris (pupil) radius. Fitness calculation means (12, 263, S115) calculates a score value indicating how well the personal parameter and the frame parameter set by the parameter setting means fit each frame face image, and integrates the score values over all frame images. That is, the fitness calculation means (12, 263, S115) calculates the integrated score value as the fitness. Correction means (12, 264, S117, S121) corrects the personal parameter and the frame parameter until the fitness reaches a predetermined threshold. That is, the integrated score value is recalculated each time the correction means corrects the personal parameter and the frame parameter, and correction and integrated-score calculation are repeated until the score value falls below the predetermined threshold. Line-of-sight direction measurement means (12, 265, S123) then measures the line-of-sight direction based on the corrected personal parameter and frame parameter, for example as a three-dimensional straight line connecting the eyeball center and the iris center.

According to the first invention, the personal parameter and the frame parameter are corrected so as to fit all frame face images, so the accuracy of line-of-sight measurement improves.
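The correct-and-rescore cycle described for the first invention can be pictured with a toy example. This is a minimal sketch under stated assumptions, not the patented procedure: the one-dimensional data, the squared-error score, the gradient step used as the correction rule, and all function names are hypothetical, since the text does not specify how the parameters are corrected. One personal value is shared by all frames, each frame has its own value, and correction repeats until the score integrated over all frames falls below a threshold.

```python
def frame_score(personal, frame_value, observation):
    """Squared misfit of one frame; `observation` is a pair (a, b) where
    a is modelled by the frame value and b by frame value + personal value."""
    a, b = observation
    return (a - frame_value) ** 2 + (b - (frame_value + personal)) ** 2


def integrated_score(personal, frames, observations):
    """Score summed over all frames (lower means a better fit)."""
    return sum(frame_score(personal, f, o) for f, o in zip(frames, observations))


def correct_until_fit(observations, personal, frames,
                      threshold=1e-8, step=0.05, max_iterations=10000):
    """Repeat correction and rescoring until the integrated score is below
    `threshold`. The correction here is a plain gradient step (assumption)."""
    frames = list(frames)
    for _ in range(max_iterations):
        if integrated_score(personal, frames, observations) < threshold:
            break
        residuals = [(a - f, b - (f + personal))
                     for f, (a, b) in zip(frames, observations)]
        grad_personal = sum(-2.0 * rb for _, rb in residuals)
        grad_frames = [-2.0 * ra - 2.0 * rb for ra, rb in residuals]
        personal -= step * grad_personal
        frames = [f - step * g for f, g in zip(frames, grad_frames)]
    return personal, frames, integrated_score(personal, frames, observations)


# Three frames generated from personal value 0.5 and frame values 1, 2, -1.
observations = [(1.0, 1.5), (2.0, 2.5), (-1.0, -0.5)]
personal, frames, score = correct_until_fit(observations, 0.0, [0.0, 0.0, 0.0])
```

Because the personal value appears in the score of every frame, it is constrained by the whole sequence rather than by any single image, which is the property the text credits for the improved accuracy.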

A second invention is dependent on the first invention, and is a line-of-sight measurement system in which the fitness calculation means calculates, as the fitness, the total over all frame face images of the error between a projected image of the iris, generated using the standard model based on the personal parameter and the frame parameter, and the iris in each frame face image.

According to the second invention, the error between the generated projected image of the iris and the iris in the actual frame face image is integrated over all frames to give the fitness, so an improvement in the accuracy of the estimated or measured line-of-sight direction can be expected.

A third invention is dependent on the first invention, and is a line-of-sight measurement system in which the fitness calculation means calculates, as the fitness, the total over all frame face images of the error between projected images of predetermined feature points of the face including the iris, generated using the standard model based on the personal parameter and the frame parameter, and those feature points in each frame face image.

According to the third invention, the sum of the errors (distances) between the projected images of the predetermined feature points of the face, including the iris, and those feature points in the actual frame face image is integrated over all frames to give the fitness. With these inventions the fitness can be calculated accurately, and as a result an improvement in the accuracy of the estimated or measured line-of-sight direction can be expected.
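The fitness of the second and third inventions can be illustrated as a reprojection error summed over frames. The following is a sketch only: the pinhole camera with focal length `focal`, the pose reduced to a yaw angle plus a translation, and the dictionary layout of a frame are all assumptions made for brevity and are not taken from the text.

```python
import math


def transform(point, yaw, translation):
    """Rotate a 3D point about the vertical axis by `yaw`, then translate it."""
    x, y, z = point
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return (c * x + s * z + tx, y + ty, -s * x + c * z + tz)


def project(point, focal):
    """Pinhole projection onto the image plane (assumed camera model)."""
    x, y, z = point
    return (focal * x / z, focal * y / z)


def frame_error(model_points, frame, focal):
    """Sum of 2D distances between projected and observed points in one frame."""
    error = 0.0
    for point, (u_obs, v_obs) in zip(model_points, frame["observed"]):
        u, v = project(transform(point, frame["yaw"], frame["translation"]), focal)
        error += math.hypot(u - u_obs, v - v_obs)
    return error


def integrated_error(model_points, frames, focal=800.0):
    """Total error over all frames, used here as the fitness (lower is better)."""
    return sum(frame_error(model_points, frame, focal) for frame in frames)


# Hypothetical model points (mm) observed in two frames with different poses.
model = [(-30.0, 0.0, 0.0), (30.0, 0.0, 0.0), (0.0, 40.0, 10.0)]
poses = [(0.0, (0.0, 0.0, 600.0)), (0.2, (10.0, -5.0, 650.0))]
frames = [{"yaw": yaw, "translation": t,
           "observed": [project(transform(p, yaw, t), 800.0) for p in model]}
          for yaw, t in poses]
fitness = integrated_error(model, frames)  # close to zero for the true model
```

Restricting `model_points` to the iris alone would correspond to the second invention, and including the other facial feature points to the third.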

A fourth invention is dependent on any one of the first to third inventions, and is a line-of-sight measurement system further comprising update means for updating the standard model based on the corrected parameters.

According to the fourth invention, the standard model is updated with the corrected personal parameter, so the range of subjects whose feature points can be detected using the standard model expands. That is, a standard model applicable to more types of subjects is obtained.

A fifth invention is a line-of-sight measurement method for measuring a line-of-sight direction for each frame face image based on a person's eyeball position and iris position acquired from the frame face image, comprising: a feature point data acquisition step of acquiring, for each frame face image using a standard model, position data of feature points of the face image including the iris position; a face data acquisition step of acquiring face position and posture data for each frame face image using the feature point position data; a parameter setting step of setting a personal parameter and a frame parameter for each frame image based on the feature point position data and the face position and posture data; a fitness calculation step of calculating the fitness of the personal parameter and the frame parameter to all frame face images; a correction step of correcting the personal parameter and the frame parameter until the fitness reaches a predetermined threshold; and a step of measuring the line-of-sight direction based on the corrected personal parameter and frame parameter.

The fifth invention can be expected to provide the same effects as the first invention.

A sixth invention is a line-of-sight measurement program executed by a computer of a line-of-sight measurement system that measures a line-of-sight direction for each frame face image based on a person's eyeball position and iris position acquired from the frame face image, the program causing the computer to function as: feature point data acquisition means for acquiring, for each frame face image using a standard model, position data of feature points of the face image including the iris position; face data acquisition means for acquiring face position and posture data for each frame face image using the feature point position data; parameter setting means for setting a personal parameter and a frame parameter for each frame image based on the feature point position data and the face position and posture data; fitness calculation means for calculating the fitness of the personal parameter and the frame parameter to all frame face images; correction means for correcting the personal parameter and the frame parameter until the fitness reaches a predetermined threshold; and means for measuring the line-of-sight direction based on the corrected personal parameter and frame parameter.

The sixth invention can be expected to provide the same effects as the first invention.

According to the present invention, the personal parameter and the frame parameter are corrected by matching them against the face images of all frames, so the accuracy of line-of-sight estimation or measurement can be improved.

The above object, other objects, features, and advantages of the present invention will become more apparent from the following detailed description of an embodiment given with reference to the drawings.

FIG. 1 is a block diagram showing a line-of-sight measurement system according to an embodiment of the present invention.
FIG. 2 is an illustrative view showing a memory map of the memory of the server in the FIG. 1 embodiment.
FIG. 3 is a flowchart showing the line-of-sight measurement processing executed by the server.
FIG. 4 is an illustrative view showing an example of a face image of a subject used for line-of-sight measurement in the line-of-sight measurement system of the FIG. 1 embodiment.
FIG. 5 is a conceptual diagram for explaining a filter for detecting between-the-eyes candidate regions.
FIG. 6 is a conceptual diagram showing another configuration of the six-segmented rectangular filter.
FIG. 7 is an illustrative view explaining SVM modeling using an image region centered between the eyes.
FIG. 8 is an illustrative view showing an example of a face detection result.
FIG. 9 is a conceptual diagram explaining a model for determining the line-of-sight direction.
FIG. 10 is a conceptual diagram showing the relationship among the subject's iris center, eyeball center, and projection point.

The line-of-sight measurement system 10 of an embodiment of the present invention includes a server 12, which is accessed from a client 14 via a network 16. The client 14 is provided with a camera 18 that includes a solid-state image sensor such as a CCD or CMOS sensor; the camera 18 photographs the face of a subject 20 and sends the subject's face image (moving image) signal to the client 14. The client 14 sends the series of face image signals of the subject 20 acquired by the camera 18 to the server 12 through the network 16, as a series of frame face image data, so that the server can estimate and measure the line-of-sight direction of the subject 20 from the frame face images. Besides transmission over the network 16, the frame face image data may be passed from the client 14 to the server 12 by other methods, for example by loading it into the server 12 from a recording medium (not shown) on which the frame face image data is recorded.

The server 12 saves the image data input in this way in an input image storage device 22, which is a storage device such as a hard disk or semiconductor memory.

The line-of-sight direction data (line-of-sight data) measured by the server 12 is sent to the client 14 via the network 16 as necessary.

Line-of-sight measurement in the server 12 uses the line-of-sight estimation method already proposed by the present inventors in Patent Document 1 (JP 2008-102902 A), cited above as background art. In brief, this underlying technique first obtains the relative relationship between the facial feature points and the eyeball center from the relationship between the subject's facial feature points and the iris center. Based on that relative relationship, it then estimates the eyeball center position from the feature points obtained in the current face image, and estimates the line-of-sight direction from that position and the iris center position.

FIG. 2 shows a memory map of the memory 24 of the server 12 in the FIG. 1 embodiment. A program storage area 26 and a data storage area 28 are formed in the memory 24. The program storage area 26 includes a face detection/feature point extraction program 261 for detecting the subject's face in an input face image and extracting feature points of that face, and a face position/posture detection program 262 for detecting, from the face image, the position and posture of the subject's face relative to the camera 18; each is described in detail later. In the embodiment, a total of six facial feature points are used: the inner and outer corners of the subject's left and right eyes and both ends of the mouth (mouth corners). These feature points are expressed as two-dimensional coordinates.

The program storage area 26 further includes a score integration program 263 for the score integration described later, which is used to judge how well the personal parameter and the frame parameter (described later) fit the input face images, and a parameter correction program 264 for correcting the personal parameter and the frame parameter based on the result of the score integration processing.

The program storage area 26 further includes a line-of-sight estimation program 265 that estimates the line-of-sight direction based on the eyeball center included in the corrected personal parameter and the iris center included in the frame parameter, and a standard model update program 266 that updates the standard model based on the result of the parameter correction.

The data storage area 28 includes a frame face image data storage area 281 for storing the data of each frame face image obtained by separating the series of frame face images. This area 281 stores the face image data read from the input image storage device 22 (FIG. 1) and separated into individual frames.

The data storage area 28 also includes a standard model storage area 282 that stores the standard model data used by the face detection/feature point extraction program 261 and the face position/posture estimation program 262 described above. Here, the standard model is a database that includes coordinate data indicating the three-dimensional positions of the above six feature points of a human face, constructed according to anatomical knowledge, together with data on the eyeball position and eyeball radius (r) that can be estimated anatomically from those feature points.

The feature point data and the face position and posture data detected by the face detection/feature point extraction program 261 and the face position/posture estimation program 262 are stored in a feature point, face position, and posture data storage area 283, together with an image of the eye region extracted from the input image, such as that shown in FIG. 7.

The data storage area 28 further includes a personal parameter storage area 284 and a frame parameter storage area 285. The personal parameter is one data set that corresponds to the standard model described above but is unique to each subject and common to all frames: the three-dimensional coordinates of the six feature points, the eyeball position, and the eyeball radius (r). The frame parameter, by contrast, is a data set specific to each frame that includes the face position and posture, the iris (pupil) position, and the iris (pupil) radius. The frame parameter storage area 285 therefore has storage locations corresponding to the number of frames (K) in the series of face images.
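The split between one personal parameter set and K frame parameter sets can be sketched as plain data structures. This is an illustrative layout only: the class and field names, the use of three rotation angles for posture, and the choice of 3D coordinates for the iris position are assumptions, as the text lists the contents of each set but not their representation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point3D = Tuple[float, float, float]


@dataclass
class PersonalParameters:
    """Shared by every frame of one subject (storage area 284 in the text)."""
    feature_points: List[Point3D]  # six facial feature points
    eyeball_center: Point3D        # eyeball position
    eyeball_radius: float          # r


@dataclass
class FrameParameters:
    """Specific to one frame (one of K slots in storage area 285)."""
    face_position: Point3D
    face_posture: Point3D          # assumed: three rotation angles
    iris_center: Point3D           # assumed: 3D coordinates
    iris_radius: float


@dataclass
class SequenceParameters:
    """One personal set plus one frame set per frame."""
    personal: PersonalParameters
    frames: List[FrameParameters] = field(default_factory=list)


def make_sequence(personal: PersonalParameters, frame_count: int) -> SequenceParameters:
    """Allocate `frame_count` frame parameter sets with placeholder values."""
    origin = (0.0, 0.0, 0.0)
    frames = [FrameParameters(origin, origin, origin, 0.0) for _ in range(frame_count)]
    return SequenceParameters(personal, frames)


# Hypothetical starting values standing in for the standard model.
standard = PersonalParameters([(0.0, 0.0, 0.0)] * 6, (0.0, 0.0, -12.0), 12.0)
sequence = make_sequence(standard, frame_count=5)
```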

The line-of-sight data storage area 286 formed in the data storage area 28 is an area for storing, for each frame, data indicating the line-of-sight direction estimated or measured by the line-of-sight estimation program described above, for example as an angle in the pan direction (an angle in the horizontal plane) and an angle in the tilt direction (an angle in the vertical plane). This area also includes storage locations corresponding to the number of frames (K) in the series of face images.
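Expressing the line connecting the eyeball center and the iris center as a pair of angles can be sketched as follows. The axis convention (x to the right, y up, z pointing out of the eye toward the scene) and the function name are assumptions, since the text does not fix a coordinate system.

```python
import math


def gaze_angles(eyeball_center, iris_center):
    """Return (pan, tilt) in degrees for the ray from eyeball to iris center.

    Pan is measured in the horizontal x-z plane and tilt is the elevation
    above that plane (assumed axis convention).
    """
    dx = iris_center[0] - eyeball_center[0]
    dy = iris_center[1] - eyeball_center[1]
    dz = iris_center[2] - eyeball_center[2]
    pan = math.degrees(math.atan2(dx, dz))
    tilt = math.degrees(math.atan2(dy, math.hypot(dx, dz)))
    return pan, tilt


# One (pan, tilt) pair would be stored per frame.
pan, tilt = gaze_angles((0.0, 0.0, 0.0), (12.0, 0.0, 12.0))
```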

FIG. 3 is a flowchart showing the operation of the FIG. 1 embodiment executed by the server 12. In the first step S101, the server 12 separates the series of frame face image data, which has been input from, for example, the client computer 14 and saved in the input image storage device 22 (FIG. 1), into face image data for each frame, and stores it in the frame face image data storage area 281 of the memory 24 (FIG. 2).

Then, a counter (not shown) for counting the number of frames, formed in an appropriate area of the memory 24, is incremented (step S103). To process the first frame, the counter is set to "1", and it is then incremented for each successive frame. Steps S105 to S111 are executed repeatedly until it is determined in step S113 that processing has finished up to the last frame (K).

In step S105, in accordance with the face detection/feature point extraction program 261 shown in FIG. 2, the subject's face is detected in the current frame face image, and feature points are then extracted.
(Face detection)
As a premise for the line-of-sight direction estimation process, face detection is first executed using, for example, a six-segmented rectangular filter.

In processing a face image, the server 12 scans the image with a rectangular filter whose width is about the face width and whose height is about half of that. The rectangle is divided, for example, 3 × 2 into six segments. The average brightness of each segment is calculated, and when the relative brightness relationships among the segments satisfy certain conditions, the center of the rectangle is taken as a between-the-eyes candidate.

When contiguous pixels are all between-the-eyes candidates, only the center of the frame enclosing them is kept as a candidate. The remaining candidates are then compared with the standard model, for example by template matching, so that false candidates among those obtained by the above procedure are discarded and the true between-the-eyes point is extracted. This is described in more detail below.
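Keeping only the center of the frame that encloses a run of contiguous candidates can be sketched as a connected-component pass. The use of 8-connectivity, integer rounding of the center, and the function name are assumptions; the text only states that the center of the enclosing frame is retained.

```python
from collections import deque


def collapse_candidates(candidates):
    """Replace each group of touching candidate pixels by the center of its
    bounding box. `candidates` is an iterable of (x, y) pixel coordinates."""
    remaining = set(candidates)
    centers = []
    while remaining:
        seed = remaining.pop()
        queue = deque([seed])
        xs, ys = [seed[0]], [seed[1]]
        while queue:
            x, y = queue.popleft()
            # Visit the 8 neighbours; already-visited pixels are no longer
            # in `remaining`, so each pixel is queued at most once.
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (x + dx, y + dy)
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        queue.append(neighbour)
                        xs.append(neighbour[0])
                        ys.append(neighbour[1])
        centers.append(((min(xs) + max(xs)) // 2, (min(ys) + max(ys)) // 2))
    return sorted(centers)


# A horizontal run of three candidates and one isolated candidate.
centers = collapse_candidates([(1, 1), (2, 1), (3, 1), (10, 10)])
```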

FIG. 5 is a conceptual diagram for explaining the filter for detecting between-the-eyes candidate regions. FIG. 5(a) shows the rectangular filter divided 3 × 2 into six segments described above (hereinafter the "six-segmented rectangular filter").

The six-segmented rectangular filter extracts two facial characteristics, (1) the bridge of the nose is brighter than both eye regions and (2) the eye regions are darker than the cheeks, in order to find the between-the-eyes position of the face. For example, a rectangular frame of i pixels horizontally and j pixels vertically (i, j: natural numbers) is placed centered on a point (x, y). Then, as shown in FIG. 5(a), the frame is divided into three equal parts horizontally and two equal parts vertically, giving six blocks S1 to S6.

Applying this six-segmented rectangular filter to the eye regions and cheeks of a face image gives the arrangement shown in FIG. 5(b).

Although the rectangular regions of the six-segmented filter in FIG. 5 are all of equal size, the filter may be modified as shown in FIG. 6.

Considering that the bridge of the nose is usually narrower than the eye regions, it is preferable that the width w2 of blocks S2 and S5 be narrower than the width w1 of blocks S1, S3, S4, and S6. Preferably, w2 can be half of w1. FIG. 6 shows the configuration of the six-segmented rectangular filter in this case. In addition, the height h1 of blocks S1, S2, and S3 and the height h2 of blocks S4, S5, and S6 need not be the same.

In the six-segmented rectangular filter shown in FIG. 6, the average pixel luminance "bar Si" (Si with an overbar "-" above it) is obtained for each block Si (1 ≤ i ≤ 6).

Assuming that one eye and eyebrow lie in block S1 and the other eye and eyebrow lie in block S3, the following relational expressions (1) and (2) hold.

Points satisfying these relationships are therefore extracted as between-the-eyes candidates (face candidates).
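The candidate test can be sketched as follows. Relational expressions (1) and (2) are not reproduced in this text, so the inequalities in the code are an assumption derived from the two properties stated for the filter: each eye block (S1, S3) is darker than the nose-bridge block (S2) and darker than the cheek block below it (S4, S6). The block layout (S1 to S3 in the top row, S4 to S6 in the bottom row), the equal block sizes, and the function names are likewise illustrative.

```python
def block_mean(image, x0, y0, x1, y1):
    """Mean luminance of image[y0:y1][x0:x1] (image is a list of rows)."""
    values = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)


def six_segment_means(image, left, top, block_w, block_h):
    """Return the mean luminance of blocks S1..S6 of a 3 x 2 filter whose
    top-left corner is at (left, top)."""
    means = []
    for row in range(2):
        for col in range(3):
            x0 = left + col * block_w
            y0 = top + row * block_h
            means.append(block_mean(image, x0, y0, x0 + block_w, y0 + block_h))
    return means


def is_between_eyes_candidate(image, left, top, block_w, block_h):
    """Assumed form of conditions (1) and (2): eyes darker than the nose
    bridge and darker than the cheeks."""
    s1, s2, s3, s4, _s5, s6 = six_segment_means(image, left, top, block_w, block_h)
    return s1 < s2 and s1 < s4 and s3 < s2 and s3 < s6


# Synthetic 6 x 4 patch: dark eyes, bright nose bridge, mid-grey cheek row.
face_like = [[50, 50, 200, 200, 50, 50]] * 2 + [[150] * 6] * 2
flat = [[100] * 6 for _ in range(4)]
result = is_between_eyes_candidate(face_like, 0, 0, 2, 2)
```

In a full detector this test would be evaluated at every filter position while scanning the image, with the block sums taken from an integral image rather than recomputed per block.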

To compute the sum of the pixels within a rectangular frame, the fast computation technique using an integral image disclosed in a known document (P. Viola and M. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features," Proc. of IEEE Conf. CVPR, 1, pp. 511-518, 2001) can be adopted. Using an integral image allows the computation to run at high speed regardless of the filter size. Applying this technique to multi-resolution images makes it possible to extract face candidates even when the size of the face in the image changes.
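The reason the cost does not depend on the filter size is that, once the integral image is built, any rectangle sum needs only four table lookups. A minimal pure-Python sketch (function names are illustrative):

```python
def integral_image(image):
    """Build a (height + 1) x (width + 1) table in which table[y][x] is the
    sum of all pixels above and to the left of (x, y), exclusive."""
    height, width = len(image), len(image[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, x0, y0, x1, y1):
    """Sum of pixels with x0 <= x < x1 and y0 <= y < y1, in constant time."""
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


image = [[1, 2, 3],
         [4, 5, 6]]
table = integral_image(image)
total = rect_sum(table, 0, 0, 3, 2)        # whole image
right_part = rect_sum(table, 1, 0, 3, 2)   # columns 1 and 2
```

Dividing a rectangle sum by the rectangle's area gives the block average used by the six-segmented filter.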

For the between-the-eyebrows candidates (face candidates) obtained in this way, the true between-the-eyebrows position (true face region) can be identified by template matching with the standard model described above.

The face region can also be determined by applying verification with a face model based on a support vector machine (SVM) to the obtained face candidates. To avoid a drop in the recognition rate caused by differences in hairstyle, the presence or absence of a beard, and changes in facial expression, the SVM model can be built, for example, from an image region centered on the point between the eyebrows, as shown in FIG. 7. Determination of the true face region by such an SVM is disclosed in S. Kawato, N. Tetsutani and K. Hosaka: “Scale-adaptive face detection and tracking in real time with SSR filters and support vector machine”, IEICE Trans. on Info. and Sys., E88-D, 12, pp. 2857-2863 (2005). Combining fast candidate extraction by the six-segmented rectangular filter with SVM processing enables real-time face detection.
(Feature point detection)
Next, the positions of the eyes, the mouth and the irises (pupils) are detected. Feature points such as both ends of each eye and both ends of the mouth can be extracted by searching for similar points using template images, prepared in advance, of the area around each feature point. Converting the template images into low-dimensional vectors with a spatial filter such as a Gabor filter also makes the matching efficient and robust against illumination changes.

Since the face region detection described above searches for the pattern between the eyebrows, the approximate positions of both eyes can be estimated by searching again for the dark regions on both sides of that point. To estimate the gaze direction, however, the iris centers must be extracted more accurately. Here, iris edge candidates are extracted with a Laplacian from the regions around the eyes found above, and a circular Hough transform is applied to detect the irises and the projected positions of the iris centers.
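The circular Hough transform step can be sketched as below. The sketch assumes that edge candidates have already been extracted (for example by thresholding a Laplacian response) and are given as integer pixel coordinates; the function name, the candidate radius list and the number of sampled angles are illustrative assumptions.

```python
import math
from collections import Counter


def hough_circle(edge_points, radii, n_angles=72):
    """Return the (cx, cy, r) that collects the most votes.

    Every edge point votes once for each integer center that lies at
    distance r from it, for every candidate radius r.
    """
    votes = Counter()
    for x, y in edge_points:
        for r in radii:
            centers = set()  # avoid counting the same center twice per point
            for k in range(n_angles):
                angle = 2.0 * math.pi * k / n_angles
                centers.add((round(x - r * math.cos(angle)),
                             round(y - r * math.sin(angle))))
            for cx, cy in centers:
                votes[(cx, cy, r)] += 1
    (cx, cy, r), _ = votes.most_common(1)[0]
    return cx, cy, r
```

Restricting `radii` to the range of iris sizes expected at the current image resolution keeps the vote table small.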

FIG. 8 shows an example of a face detection result. In the detected face, the iris centers, the tip of the nose, the mouth and so on are also detected. For example, the outer and inner corners of the left and right eyes and both ends of the mouth can be used as feature points.

The position data of the feature points detected in this way is stored, frame by frame, in the feature point, face position and posture storage area 283 of the memory 24 (FIG. 2).
(Principle of gaze estimation)
In gaze estimation, the gaze direction is given as the three-dimensional straight line connecting the eyeball center and the iris center.

FIG. 9 is a conceptual diagram of the model used to determine the gaze direction. Let r be the eyeball radius on the image and d the distance on the image between the eyeball center and the iris center; the angle θ between the gaze direction and the camera optical axis is then given by the following equation (3).
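Equation (3) is not reproduced in this text. The sketch below assumes it has the form θ = sin⁻¹(d / r), which is what the geometry described for FIG. 9 implies under orthographic projection: the iris center lies on a sphere of radius r around the eyeball center, and d is the projected offset. The function name and the clamping of the ratio are illustrative choices.

```python
import math


def gaze_angle(d, r):
    """Angle in radians between the gaze direction and the camera optical axis.

    d: distance on the image between eyeball center and iris center
    r: eyeball radius on the image (same units as d)
    """
    if r <= 0:
        raise ValueError("eyeball radius must be positive")
    ratio = min(max(d / r, -1.0), 1.0)  # clamp against measurement noise
    return math.asin(ratio)
```

For example, an iris center offset of half the eyeball radius corresponds to a gaze about 30 degrees away from the optical axis.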

As described in more detail later, gaze direction estimation in this embodiment involves estimating the relative relationship between the eyeball centers and the facial feature points and estimating the projected positions of the eyeball centers. For this purpose, the position and posture of the face are estimated in step S107.
(Face position and posture estimation)
Between the two-dimensional observed position xj(k) (bold) = [xj(k), yj(k)]^t of a facial feature point pj and its three-dimensional position sj (bold) = [Xj, Yj, Zj]^t (j = 1, ..., M) obtained from the standard model, the following relation holds for the m feature points that are observed among the M feature points.

Here, the matrix P(k) is a 2 × 3 matrix. The matrix S(k) in the second term on the right-hand side is the submatrix of the matrix S consisting only of the elements that correspond to the observed feature points. As stated above, the camera and the face are assumed to be sufficiently far apart, so orthographic projection is assumed. If four or more feature points are observed, the matrix P(k) can be calculated as follows.
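The formula for P(k) is not reproduced in this text. One plausible reading, sketched below with NumPy, is a least-squares fit through the pseudo-inverse of the model coordinates. The sketch assumes that both point sets are first centered on their centroids so that the relation reduces to X ≈ P S; the function name and this centering step are assumptions.

```python
import numpy as np


def estimate_projection(x_obs, s_model):
    """Least-squares 2x3 matrix P such that centered x_obs ≈ P @ centered s_model.

    x_obs:   2 x m observed image coordinates of the visible feature points
    s_model: 3 x m standard-model coordinates of the same feature points
    """
    x_obs = np.asarray(x_obs, dtype=float)
    s_model = np.asarray(s_model, dtype=float)
    if x_obs.shape[1] < 4:
        raise ValueError("at least four feature points are required")
    # Assumption of this sketch: remove the centroids so that no
    # translation term is needed in the fit.
    xc = x_obs - x_obs.mean(axis=1, keepdims=True)
    sc = s_model - s_model.mean(axis=1, keepdims=True)
    return xc @ np.linalg.pinv(sc)
```

With exactly four non-coplanar points the fit is exact; with more points it averages out detection noise.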

The projected positions xr(i) (bold) and xl(i) (bold) of the eyeball centers in image frame Ik can be calculated using the matrix P(k) as follows (step S210).

Therefore, the gaze can be estimated from the projected position of the iris center extracted as a feature point in image frame Ik and this projected position of the eyeball center.

By decomposing the matrix P by QR decomposition, the face posture R and the face position can be calculated as follows.

Here, r1 and r2 are each 1 × 3 vectors. Detection of the face posture R in this way is disclosed in L. Quan: “Self-calibration of an affine camera from multiple views”, Int’l Journal of Computer Vision, 19, pp. 93-105 (1996).

Regarding the obtained r as the true value, the scale s and the translation vectors vx and vy that minimize the projection error are obtained by the least squares method according to equations (10), (11) and (12). The scale s in equation (12) indicates the size of the face, and the translation vectors vx and vy indicate the position of the face.
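Equations (10) to (12) are not reproduced in this text, so the following is only a sketch of one way to carry out the two steps described above with NumPy: a QR decomposition of the transpose of P to obtain orthonormal rows r1 and r2 (with r3 taken as their cross product), followed by a least-squares fit of the scale and translation with the rotation held fixed. The function names and the exact parameterization are assumptions.

```python
import numpy as np


def pose_from_projection(p):
    """Split a 2x3 matrix P into a 2x2 lower-triangular factor and a rotation.

    Uses P^T = Q R, so P = R^T Q^T and the rows of Q^T are orthonormal.
    """
    q, r = np.linalg.qr(np.asarray(p, dtype=float).T)  # q: 3x2, r: 2x2
    signs = np.where(np.diag(r) < 0, -1.0, 1.0)
    q = q * signs              # flip columns of Q ...
    r = r * signs[:, None]     # ... and the matching rows of R
    r1, r2 = q[:, 0], q[:, 1]
    rotation = np.vstack([r1, r2, np.cross(r1, r2)])
    return r.T, rotation


def scale_and_translation(x_obs, s_model, rotation):
    """Least-squares scale s and translation v for x ≈ s * R[:2] @ S + v."""
    x_obs = np.asarray(x_obs, dtype=float)
    proj = rotation[:2] @ np.asarray(s_model, dtype=float)
    pc = proj - proj.mean(axis=1, keepdims=True)
    xc = x_obs - x_obs.mean(axis=1, keepdims=True)
    s = float((xc * pc).sum() / (pc * pc).sum())
    v = x_obs.mean(axis=1) - s * proj.mean(axis=1)
    return s, v
```

Forcing a positive diagonal in the triangular factor removes the sign ambiguity of the QR decomposition, so the same P always yields the same rotation.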

In this way, the position and posture of the subject's face in each frame image are estimated in step S107.

The face position and posture data detected in this way is stored, frame by frame, in the feature point, face position and posture storage area 283 of the memory 24 (FIG. 2).

In the following step S109, the server 12 extracts image data of the eye region, as shown in FIG. 7, from the input image currently being processed, and stores it frame by frame in the feature point, face position and posture storage area 283 of the memory 24 together with the feature point, face position and posture data.

In the next step S111, the server 12 sets the initial parameters of the frame according to the feature point, face position and posture data obtained above. As described above, the personal parameters are the eyeball position obtained from its positional relationship relative to the feature points and the anatomically derived eyeball radius centered on that eyeball position. The personal parameters do not change from frame to frame; they are common to all frames. The initial values of the personal parameters set in step S111 are stored in the personal parameter storage area 284 (FIG. 2). The frame parameters, by contrast, change from frame to frame: the initial values of the face position, posture, iris (pupil) position and iris (pupil) radius in each frame are stored at the location for that frame in the frame parameter storage area 285 (FIG. 2).

After setting the initial parameters in this way, the server 12 integrates the score over all frames in step S115. The "score" is, for example, the error between a projected image of the iris (a computer graphics image), generated from the personal parameters (eyeball radius, eyeball position) and the frame parameters (iris radius and iris position) using the data of the standard model, and the iris in each actual frame face image. In the embodiment, the actual face image used is the eye-region image shown in FIG. 7, which was stored frame by frame in step S109 in the feature point, face position and posture storage area 283 together with the feature point, face position and posture data. The error is calculated as the displacement (distance) between the projected iris image and the iris in the frame face image and as the difference in their size (area), each expressed as a number of pixels on the image. Alternatively, the sum of the displacements (distances) between the projected images of predetermined facial feature points, including the iris, and the observed positions of those feature points in the actual frame face image may be calculated as the error. The per-frame error (score) is summed over all K frames; that is, the score is integrated over all frames. The score indicates the degree to which the personal parameters and frame parameters set in step S111, which functions as the parameter setting means, fit each frame face image (the goodness of fit). The integrated score value obtained in step S115 is therefore the goodness of fit of the personal parameters and frame parameters set by the initial parameter setting means to the face images of all frames.
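The score integration of step S115 can be sketched as follows. Each iris is represented here as a circle (center x, center y, radius) in image coordinates, and the per-frame error is taken as the center displacement plus the difference in area. The text does not say how the two quantities are combined, so the unweighted sum and the function names are assumptions of this sketch.

```python
import math


def frame_score(projected, observed):
    """Error for one frame between the projected iris and the observed iris.

    Each iris is a tuple (cx, cy, radius) in pixels.
    """
    (px, py, pr), (ox, oy, orad) = projected, observed
    displacement = math.hypot(px - ox, py - oy)
    area_difference = abs(math.pi * pr ** 2 - math.pi * orad ** 2)
    return displacement + area_difference  # unweighted sum: an assumption


def integrated_score(projected_irises, observed_irises):
    """Sum of the per-frame errors over all K frames."""
    if len(projected_irises) != len(observed_irises):
        raise ValueError("one projected and one observed iris per frame")
    return sum(frame_score(p, o)
               for p, o in zip(projected_irises, observed_irises))
```

A score of zero means that the parameters reproduce the observed iris exactly in every frame.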

Next, in step S117, the score value integrated in step S115 is compared with a predetermined threshold to determine whether the score value is smaller than the threshold, that is, whether the personal parameters common to all frames and the frame parameters for each frame, set in step S111, fit all of the actual frame images sufficiently well. In other words, step S117 determines whether the parameters are such that the error stays within an allowable range when the gaze direction is estimated with them.

If "NO" is determined in step S117, that is, if the integrated score value is determined to be equal to or greater than the threshold, then in the next step S119 the server 12 determines whether the number of times the personal parameters and frame parameters have been corrected in step S121 on the basis of the score value obtained in step S115 (the number of repetitions) has exceeded a predetermined value.

If "NO" in step S119, then in the next step S121 the server 12 corrects the personal parameters and the frame parameters using an optimization method so that the score value becomes as small as possible. Various correction methods are conceivable; the embodiment uses the steepest descent method as one example. The steepest descent method finds the parameter values that minimize the error by correcting the parameters in the direction in which the error surface descends most steeply. Another method may, however, be adopted to optimize the error correction.

In this way, steps S115 to S121 are executed repeatedly, and the personal parameters and frame parameters are corrected so that the score value ideally becomes zero (0) and in practice falls below the threshold. The personal parameters and frame parameters set in step S111 (the initial parameters) are subsequently corrected in step S121 on the basis of the comparison with the face images of all frames. The parameters corrected in step S121 are, however, corrected further in later executions of step S121, and in that sense step S121, like step S111 before it, also functions as parameter setting means.
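The loop formed by steps S115 to S121 can be sketched as a plain steepest descent with two stopping conditions. The sketch treats all personal and frame parameters as one flat list and estimates the gradient numerically; the function names, the fixed step size and the finite-difference gradient are assumptions, since the text only names the steepest descent method.

```python
def numerical_gradient(score_fn, params, eps=1e-6):
    """Central-difference estimate of the gradient of score_fn at params."""
    grad = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grad.append((score_fn(up) - score_fn(down)) / (2 * eps))
    return grad


def refine_parameters(score_fn, params, threshold, max_corrections, step=0.1):
    """Repeat scoring and correction until the score is below the threshold
    or the allowed number of corrections has been used up."""
    params = list(params)
    corrections = 0
    while True:
        score = score_fn(params)                  # S115: integrated score
        if score < threshold:                     # S117: good enough
            break
        if corrections >= max_corrections:        # S119: repetition limit
            break
        grad = numerical_gradient(score_fn, params)
        params = [p - step * g for p, g in zip(params, grad)]  # S121
        corrections += 1
    return params, score, corrections
```

The repetition limit guarantees termination even when the threshold cannot be reached with the available frames.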

The personal parameters and the frame parameters for each frame face image corrected in step S121 are stored in the personal parameter storage area 284 and the frame parameter storage area 285 shown in FIG. 2, respectively.

When "YES" is determined in step S117 or in step S119, the server 12 then, in step S123, estimates the gaze direction for each frame face image as the three-dimensional straight line (FIG. 10) connecting the eyeball position and the iris position, on the basis of the personal parameters and frame parameters corrected and stored in step S121, and stores the gaze direction data in the gaze data storage area 285 (FIG. 2). This gaze direction data is returned to the client 14 as needed, as data indicating the subject's gaze direction for each frame of the previously input series of face image signals.
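As a small illustration of step S123, the gaze direction for one frame can be represented by the unit vector pointing from the eyeball center to the iris center. The function name and the use of plain coordinate tuples are assumptions of this sketch.

```python
import math


def gaze_direction(eyeball_center, iris_center):
    """Unit vector along the line from the eyeball center to the iris center."""
    diff = [i - e for e, i in zip(eyeball_center, iris_center)]
    norm = math.sqrt(sum(c * c for c in diff))
    if norm == 0:
        raise ValueError("eyeball center and iris center coincide")
    return tuple(c / norm for c in diff)
```

Together with the eyeball center, this vector defines the three-dimensional straight line described above.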

Finally, in step S125, the server 12 updates the standard model using the personal parameters corrected in step S121. When the relevant parameter of the standard model is an average value, the average is recalculated with the corresponding personal parameter corrected in step S121 added. For example, if a parameter of the standard model is the average of that parameter over N subjects, step S125 calculates the average over N + 1 subjects and updates the parameter with the result. When updating the variance of a standard model parameter, the variance is changed on the basis of the parameter value corrected in step S121, for example taking weighting into account.
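The average update from N to N + 1 subjects needs only the stored average and the subject count, as in this short sketch (the function name is illustrative):

```python
def update_mean(mean, n, value):
    """Average over n + 1 subjects, given the average over n subjects
    and the newly corrected parameter value."""
    return (mean * n + value) / (n + 1)
```

For example, if the average eyeball radius over 4 subjects is 12.0 and the corrected value for a new subject is 17.0, the updated average over 5 subjects is 13.0.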

Updating the standard model with the corrected personal parameters widens the range of subjects whose feature points can be detected using that standard model. In other words, a standard model applicable to more types of subjects is obtained.

Thus, according to this embodiment, the parameters are corrected in step S121 so as to minimize the score value integrated over all frames (the sum of the errors between the set parameters and the actual face images), which improves the accuracy of the gaze direction estimated or measured in step S123.

DESCRIPTION OF SYMBOLS
10 ... Line-of-sight measurement system
12 ... Server
14 ... Client
16 ... Network
18 ... Camera
20 ... Subject
22 ... Input image storage device
24 ... Memory

Claims

A line-of-sight measurement system that measures a line-of-sight direction for each frame face image based on a human eyeball position and an iris position acquired from a frame face image,
Feature point data acquisition means for acquiring feature point position data including the iris position of the face image for each frame face image using a standard model;
Face data acquisition means for acquiring face position and posture data for each frame face image using the feature point position data;
Parameter setting means for setting personal parameters and frame parameters for each frame image based on the position data of the feature points and the position and posture data of the face;
Goodness-of-fit calculation means for calculating the goodness of fit of the personal parameters and frame parameters to all frame face images;
A line-of-sight measurement system comprising: correction means for correcting the personal parameters and frame parameters until the goodness of fit reaches a predetermined threshold; and means for measuring the line-of-sight direction based on the corrected personal parameters and frame parameters.

The line-of-sight measurement system according to claim 1, wherein the goodness-of-fit calculation means calculates, as the goodness of fit, the total over all frame face images of the error between a projected image of the iris, generated from the personal parameters and the frame parameters using the standard model, and the iris in each frame face image.

The line-of-sight measurement system according to claim 1, wherein the goodness-of-fit calculation means calculates, as the goodness of fit, the total over all frame face images of the error between projected images of predetermined facial feature points including the iris, generated from the personal parameters and the frame parameters using the standard model, and those feature points in each frame face image.

The line-of-sight measurement system according to claim 1, further comprising updating means for updating the standard model based on the corrected personal parameter.

A gaze measurement method for measuring a gaze direction for each frame face image based on a human eyeball position and an iris position acquired from a frame face image,
A feature point data acquisition step for acquiring feature point position data including the iris position of the face image for each frame face image using a standard model;
A face data acquisition step of acquiring face position and posture data for each frame face image using the feature point position data;
A parameter setting step of setting personal parameters and frame parameters for each frame image based on the position data of the feature points and the position and posture data of the face;
A goodness-of-fit calculation step of calculating the goodness of fit of the personal parameters and the frame parameters to all frame face images;
A line-of-sight measurement method, comprising: a correction step of correcting the personal parameters and the frame parameters until the goodness of fit reaches a predetermined threshold; and a step of measuring the line-of-sight direction based on the corrected personal parameters and frame parameters.

A line-of-sight measurement program executed by a computer of a line-of-sight measurement system that measures a line-of-sight direction for each frame face image based on a human eyeball position and iris position acquired from a frame face image, the program causing the computer to function as:
Feature point data acquisition means for acquiring feature point position data including the iris position of the face image for each frame face image using a standard model;
Face data acquisition means for acquiring face position and posture data for each frame face image using the feature point position data;
Parameter setting means for setting personal parameters and frame parameters for each frame image based on the position data of the feature points and the position and posture data of the face;
Goodness-of-fit calculation means for calculating the goodness of fit of the personal parameters and frame parameters to all frame face images;
Correction means for correcting the personal parameters and the frame parameters until the goodness of fit reaches a predetermined threshold; and means for measuring the line-of-sight direction based on the corrected personal parameters and frame parameters.