
CN112527118A - Head posture recognition method based on dynamic time warping - Google Patents

Head posture recognition method based on dynamic time warping

Info

Publication number
CN112527118A
Authority
CN
China
Prior art keywords: head, data, time, action, template
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011485090.XA
Other languages
Chinese (zh)
Other versions
CN112527118B (en)
Inventor
李淮周
王宏
李森
曹祥红
胡海燕
武东辉
温书沛
吴彦福
李晓彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou University of Light Industry
Original Assignee
Zhengzhou University of Light Industry
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou University of Light Industry
Priority to CN202011485090.XA
Publication of CN112527118A
Application granted
Publication of CN112527118B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012 Head tracking input arrangements
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C9/00 Measuring inclination, e.g. by clinometers, by levels
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01 Indexing scheme relating to G06F3/01
    • G06F2203/011 Emotion or mood input determined on the basis of sensed human body parameters such as pulse, heart rate or beat, temperature of skin, facial expressions, iris, voice pitch, brain activity patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Human Computer Interaction (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The present invention proposes a head pose recognition method based on dynamic time warping (DTW). Its steps are: collect, through an inertial sensor fixed on the head, feature data of the acceleration and angular velocity of head motion postures in the X, Y and Z directions, and store them in a data set; preprocess the data in the data set, detect the start time and end time of each head action, and extract the action interval of the head action; construct head-action templates; compute the warping path between the detected head-action data and the obtained head-action template data; the head-action type of the standard template corresponding to the minimum warping-path DTW value is the head-action type of the data to be recognized. Relying on the acceleration and angular-velocity information measured by the inertial sensor, the invention can accurately estimate the head-action type of a test subject and effectively improve the recognition accuracy of human head actions, with low cost, a small amount of data processing, fast response and high recognition accuracy.

Description

A head pose recognition method based on dynamic time warping

Technical Field

The present invention relates to the technical field of pattern recognition, and in particular to a head pose recognition method based on dynamic time warping.

Background

The development of artificial intelligence technology has greatly changed the way people produce and live, and the traditional keyboard-and-mouse input methods cannot meet everyone's needs, for example those of people with impaired upper limbs. Developing a recognition technology based on head posture actions has therefore attracted wide attention from researchers.

According to the type of device used for head pose computation, existing methods can be divided into two categories. One category is based on worn inertial sensors: for example, the invention patent with publication number CN103076045B provides a head posture sensing device and method, and the invention patent with publication number CN105943052A provides a fatigue-driving detection method and device based on deflection angles; accurate head poses are obtained through acceleration and angular-velocity sensors. The advantages of this kind of method are high precision and good real-time performance; the disadvantages are that the user needs to wear an inertial sensor, and that these works focus on pose estimation and do not provide a head-action recognition technology. The other category is based on machine vision: for example, the invention patent with publication number CN102737235A proposed by Tan et al., a head pose estimation method based on depth information and color images, estimates the head pose through a camera or a depth camera. The advantage of this kind of method is that it does not touch the test subject; the disadvantages are that camera imaging is easily affected by illumination, background and facial expressions, and that, compared with the former, image processing generally requires a large amount of computation and its accuracy is not high enough, so it needs further improvement.

Dynamic time warping (DTW) is a method based on dynamic programming that is widely used in the fields of speech and gesture recognition. The DTW algorithm can warp data along the time axis, stretching or shortening a time series to achieve better alignment, thereby improving the accuracy and robustness of an algorithm. Because of personal habits and differences in the current state, the duration of a head action may change, so head-action recognition is a typical recognition problem over time series of unequal length.

Summary of the Invention

Aiming at the technical problems that existing head pose recognition requires a large amount of computation and its accuracy is not high enough, the present invention proposes a head pose recognition method based on dynamic time warping, which uses the DTW method to evaluate the warping-path distance between the time series of different actions and standard templates to recognize different head actions, with a small amount of data processing and high recognition accuracy.

In order to achieve the above purpose, the technical solution of the present invention is realized as follows: a head pose recognition method based on dynamic time warping, whose steps are as follows:

Step S1, data acquisition: collect, through an inertial sensor fixed on the head, feature data of the acceleration and angular velocity of head motion postures in the X, Y and Z directions, and store them in a data set;

Step S2, endpoint detection of head actions: preprocess the data in the data set, detect the start time and end time of each head action according to the combined angular velocity computed from the preprocessed head inertial data, and extract the action interval of the head action;

Step S3, computation of head-action time-series templates: according to the head-action data detected by the endpoint detection of step S2 and the associated action labels, construct acceleration and angular-velocity head-action templates for the X, Y and Z directions;

Step S4, computation of the warping path: for the test set in the data set, compute the warping path between the head-action data detected in step S2 and each of the head-action templates obtained in step S3;

Step S5, judgment of the head-action type: the head-action type of the standard template corresponding to the minimum warping-path DTW value is the head-action type of the data to be recognized.

The inertial sensor is mounted on a temple of a pair of glasses near the front of the head. When collecting data, the subject wears the glasses, sits on a bench, and naturally performs the head-action postures of nodding, raising the head, shaking the head to the left, shaking the head to the right, turning the head to the left and turning the head to the right. The format of the data set is [data, label], where data is a 6-dimensional matrix containing the sensor's x-, y- and z-axis accelerations and angular velocities, with variable length under different labels, and label is a categorical variable corresponding to the 6 types of head actions.

The preprocessing method in step S2 is:

Step S21: data normalization

y'(t) = arctan(x(t)) * 2/π    (1)

where y'(t) is the normalized data and x(t) is the acceleration or angular-velocity data collected by the inertial sensor.

Step S22: sliding median filter

y(t) = median(y'(t-(l-1)/2 : t+(l-1)/2)), l = 2n-1, n ∈ N
y(t) = median(y'(t-l/2 : t+l/2-1)), l = 2n, n ∈ N    (2)

where l is the median-filter window length (l = 2n-1 for odd windows and l = 2n for even windows, with N the set of natural numbers), median() is the median function, y(t) is the median within the sliding-window length, and y'(t-(l-1)/2 : t+(l-1)/2) and y'(t-l/2 : t+l/2-1) denote the normalized data segments of length l.

The method for endpoint detection of head actions in step S2 is as follows. The start time of a head action is determined as:

tstart = min{ t : ang(t) > angmin }    (3)

where ang(t) = √(angx(t)² + angy(t)² + angz(t)²) is an overall description of the angular-velocity change in all directions at time t and reflects the overall degree of change of the head-action angle; angx(t), angy(t) and angz(t) denote the angular-velocity components in the X, Y and Z directions of the three-dimensional coordinate axes; angmin is the threshold at which the head starts to move; tstart is the start time of the head action;

The end time of the head action is determined as:

tend = min{ t > tstart : sum(ang([t-tmin, t)) < angmin) = tmin·fs }    (4)

where sum(ang([t-tmin, t)) < angmin) counts the number of samples of ang(t) in the time interval [t-tmin, t) that are smaller than the threshold angmin; tmin is the minimum duration of a head action; fs is the sampling frequency of the sensor; if, after the head action starts, the values of all sampling points within the minimum duration are smaller than the threshold angmin, the head action is considered to have ended, and the end time is tend;

The validity of a head action is judged as:

(tend - tstart > tmin) and (tend - tstart < tmax): a head action exists;

where tmin is the minimum duration of a head action and tmax is the maximum duration of a head action.

Head-action data of 26 subjects were collected and randomly divided into a training set and a test set, with 18 subjects in the training set and 8 in the test set;

If the data processed in step S2 belong to the training set or to person-dependent template acquisition, the head-action time-series template is computed through step S3; if they belong to the test set or are real-time data collected online, the head-action type to which the action sequence belongs is judged by the DTW values through steps S4 and S5.

The implementation of step S3 is:

Step S31: from the time series of head actions extracted in step S2, obtain each action time series and its label according to the set thresholds;

Step S32: for one head action in the training set, let one group of data of its time series be Sa = {s1, s2, …, sa}, where Sa is a 6 × a matrix whose row vectors correspond to the accelerations and angular velocities in the X, Y and Z directions and whose column vectors correspond to the head-motion features; then the total time-series set of the training set is S = {Sa, Sb, …, Sn}, where n is the number of instances of this action in the training set and a, b, …, n denote the lengths of the sequences Sa, Sb, …, Sn;

Step S33: let the sequence-length vector be Slen = {a, b, …, n}; then the template time-series length is Tlen = median(Slen), where median() is the median function;

Step S34: let the standard template of this head action be Ti, where i = 1, 2, …, 6, corresponding to the six head-action types; Ti is a 6 × x matrix whose row vectors correspond to the accelerations and angular velocities in the X, Y and Z directions and whose column vectors correspond to the head-motion features, with the length x determined by the data lengths in the training set; the mean formula

Tik = ( Σj=1..n Sjk ) / ( Σj=1..n binary(Sjk) )    (element-wise over sample positions)

gives Tik, and the first Tlen samples of Tik are taken as the standard-template time series of this action, where Tik denotes the k-th row of data in the i-th action template and Sjk denotes the k-th row of data of the j-th subject's instance of this action type; since the durations of Sjk are not equal across testers, the binary() function binarizes Sjk to {1, 0} so that the number of elements present at each position can be counted;

Step S35: repeat steps S32 to S34 to obtain the standard templates of the other action types.

The implementation of step S4 is:

Step S41: compute the distance matrix D. Let the time series of a head action in the test set be S = {s1, s2, …, sn} and the template time series to be matched, i.e. the standard-template data, be T = {t1, t2, …, tm}; then the Euclidean distance between any two points of the two series is

d(si, tj) = √( Σk=1..6 (sik - tjk)² )

where si is the i-th column vector in the time series S; tj is the j-th column vector in the standard template T; sik is the k-th element of the i-th column vector of S; tjk is the k-th element of the j-th column vector of T. Computing all possibilities forms an n × m distance matrix D, which turns the problem into applying dynamic programming to solve the shortest-path problem from the start point D(1,1) to the end point D(n,m).

Step S42: let the warping path be W = {w1, w2, w3, …, wy}, where we denotes the distance between one point of the time series S and one point of the standard template T, and y is the warping-path length with range max(m, n) ≤ y ≤ m + n; the optimal warping path is obtained according to the constraints on the warping path;

Step S43: the optimal warping path is solved by computing with the dynamic-programming idea of cumulative distance.

Steps S41, S42 and S43 are repeated to compute the DTW values between the head-action time series S and the action time series of the 6 standard templates, respectively.

The warping path needs to satisfy the following constraints:

① Boundary condition: the path runs from the start point w1 = D(1,1) to the end point wy = D(n,m);

② Continuity: if we-1 = D(a,b), then the next point of the path we = D(a',b') needs to satisfy |a - a'| ≤ 1 and |b - b'| ≤ 1, i.e. the matching cannot skip over any point;

③ Monotonicity: if we-1 = D(a,b), then the next point of the path we = D(a',b') needs to satisfy a' - a ≥ 0 and b' - b ≥ 0, i.e. the points on the warping path W must proceed monotonically in time;

Hence, from we-1 = D(a,b) there are only three possible paths to the next point: D(a+1,b), D(a+1,b+1) and D(a,b+1). The optimal warping path is then:

DTW(S, T) = min{ Σe=1..y we }    (6)

The cumulative distance is:

r(e, f) = d(se, tf) + min{r(e-1, f), r(e-1, f-1), r(e, f-1)};

where e = 1, 2, 3, …, n; f = 1, 2, 3, …, m; se denotes the e-th column vector in the matrix S to be detected; tf denotes the f-th column vector of a head action in the template matrix T to be matched; r(e, f) is the cumulative distance.

Beneficial effects of the present invention: the present invention provides a head-motion endpoint-detection and DTW-based head-action recognition method, which uses an inertial sensor placed at a temple of a pair of glasses or at the ear to collect the accelerations, angular velocities and so on of head actions in the X, Y and Z directions, computes the combined angular velocity from the collected angular-velocity data, performs automatic endpoint detection with a threshold method, and rejects abnormal data. The present invention supports generating person-dependent head-action templates or importing empirical head-action templates; then, for each automatically detected head-action test datum, the dynamic-time-warping path DTW to each template's data is computed, and after comparison, the template with the smallest DTW value gives the action type. Relying on the acceleration and angular-velocity information measured by the inertial sensor, the present invention can accurately estimate the head-action type of a test subject, such as nodding, raising the head, shaking the head to the left or right, and turning the head to the left or right, and can effectively improve the recognition accuracy of human head actions. Compared with head-action recognition technologies based on cameras or depth cameras, the present invention has low cost, a small amount of data processing, fast response and high recognition accuracy.

Brief Description of the Drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flow chart of the present invention.

FIG. 2 is a schematic diagram of the head-action posture types of the present invention.

FIG. 3 shows the time-series templates of the head actions of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

As shown in FIG. 1, a head pose recognition method based on dynamic time warping comprises the following steps:

Step S1, data acquisition: collect, through an inertial sensor fixed on the head, feature data of the acceleration and angular velocity of head motion postures in the X, Y and Z directions, and store them in a data set.

The present invention senses head-pose changes through an inertial sensor; the collected data types are the accelerations and angular velocities in the X, Y and Z directions, six types of feature data in total. The inertial sensor is mounted on a temple of a pair of glasses near the front of the head, as shown in the normal view in the middle of FIG. 2. When collecting data, the subject wears the glasses, sits on a bench, and naturally performs the head-action postures of nodding, raising the head, shaking the head to the left, shaking the head to the right, turning the head to the left and turning the head to the right, as shown in the outer views of FIG. 2.

In order to verify the effectiveness of the method proposed by the present invention, head-action data of 26 subjects were collected and randomly divided into a training set and a test set for testing, with 18 subjects in the training group and 8 in the test group. The data-set format is [data, label], where data is a 6-dimensional matrix containing the sensor's x-, y- and z-axis accelerations and angular velocities, with variable length under different labels, and label is a categorical variable corresponding to the 6 types of head actions.
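
As an illustration of the [data, label] record format described above, a minimal Python sketch (using NumPy; the names and the sample count are hypothetical, not taken from the patent) of how one record might be laid out:

```python
import numpy as np

# One illustrative record in the [data, label] format: "data" holds 6 rows
# (x/y/z acceleration and x/y/z angular velocity) and a variable number of
# columns (samples at fs = 100 Hz); "label" is one of six head-action types.
ACTIONS = ["nod", "head_up", "shake_left", "shake_right", "turn_left", "turn_right"]

record = {
    "data": np.zeros((6, 153)),  # 6 x T matrix; T varies per action instance
    "label": "nod",
}
```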

Step S2, endpoint detection of head actions: preprocess the data in the data set, detect the start time and end time of each head action according to the combined angular velocity computed from the preprocessed head inertial data, and extract the action interval of the head action, with the following steps:

Step S21: data normalization

y'(t) = arctan(x(t)) * 2/π    (1)

where y'(t) is the normalized data and x(t) is the acceleration or angular-velocity data collected by the inertial sensor.
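
A minimal Python sketch of the arctangent normalization of equation (1), assuming NumPy; the function name is illustrative:

```python
import numpy as np

def normalize(x):
    """Arctangent normalization of eq. (1): y'(t) = arctan(x(t)) * 2 / pi.

    Maps raw acceleration/angular-velocity samples into the interval (-1, 1).
    """
    return np.arctan(x) * 2.0 / np.pi
```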

Step S22: sliding median filter

y(t) = median(y'(t-(l-1)/2 : t+(l-1)/2)), l = 2n-1, n ∈ N
y(t) = median(y'(t-l/2 : t+l/2-1)), l = 2n, n ∈ N    (2)

where l is the median-filter window length (l = 2n-1 for odd windows and l = 2n for even windows, with N the set of natural numbers), median() is the median function, y(t) is the median within the sliding-window length, and y'(t-(l-1)/2 : t+(l-1)/2) and y'(t-l/2 : t+l/2-1) denote the normalized data segments of length l. The role of the sliding median filter is to reduce the salt-and-pepper noise of the inertial sensor, lowering the chance of misjudgment in the subsequent action recognition and endpoint detection.
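
The sliding median filter of equation (2) could be sketched as follows in Python (assuming NumPy); how the window is handled at the signal edges is not specified by the patent, so the shrinking-window treatment here is an assumption:

```python
import numpy as np

def sliding_median(y, l):
    """Sliding median filter of eq. (2) with window length l (in samples).

    For odd l the window is centered on t; for even l it extends one sample
    further into the past, matching the index ranges given above. At the
    edges the window is simply clipped to the signal (an assumption).
    """
    n = len(y)
    out = np.empty(n)
    for t in range(n):
        if l % 2 == 1:                      # l = 2n - 1, odd window
            lo, hi = t - (l - 1) // 2, t + (l - 1) // 2 + 1
        else:                               # l = 2n, even window
            lo, hi = t - l // 2, t + l // 2
        out[t] = np.median(y[max(lo, 0):min(hi, n)])
    return out
```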

Step S23: the start time of a head action is determined as:

tstart = min{ t : ang(t) > angmin }    (3)

where ang(t) = √(angx(t)² + angy(t)² + angz(t)²) is an overall description of the angular-velocity change in all directions at time t and reflects the overall degree of change of the head-action angle; angx(t), angy(t) and angz(t) denote the angular-velocity components in the X, Y and Z directions of the three-dimensional coordinate axes; angmin is the threshold at which the head starts to move; tstart is the start time of the head action.

Step S24: the end time of the head action is determined as:

tend = min{ t > tstart : sum(ang([t-tmin, t)) < angmin) = tmin·fs }    (4)

where sum(ang([t-tmin, t)) < angmin) counts the number of samples of ang(t) in the time interval [t-tmin, t) that are smaller than the threshold angmin; tmin is the minimum duration of a head action; fs is the sampling frequency of the sensor. If, after the head action starts, the values of all sampling points within the minimum duration are smaller than the threshold angmin, the head action is considered to have ended, and the end time is tend.

Step S25: the head action is judged to be valid:

(tend - tstart > tmin) and (tend - tstart < tmax): a head action exists    (5)

where tmin is the minimum duration of a head action, used to reject spike noise in the waveform, and tmax is the maximum duration of a head action, used to reject actions with abnormal duration or unfinished actions. This completes the extraction of head-action data. If the data belong to the training set or to person-dependent template acquisition, the head-action time-series template can be computed through step S3; if they belong to the test set or are real-time data collected online, the head-action type to which the action sequence belongs is judged by the DTW values through steps S4 and S5.
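
Putting equations (3) to (5) together, a straightforward Python reading of the threshold-based endpoint detection might look as follows; the default thresholds are the reference values given in step S31 below, and the implementation details (index conventions, returning None on failure) are assumptions rather than the patented implementation:

```python
import numpy as np

def detect_action(ang_xyz, fs=100, ang_min=0.2, t_min=0.6, t_max=3.0):
    """Endpoint detection sketch for eqs. (3)-(5).

    ang_xyz: 3 x T array of (filtered) angular-velocity components.
    Returns (t_start, t_end) as sample indices, or None if no valid action.
    """
    ang = np.sqrt((ang_xyz ** 2).sum(axis=0))   # combined angular velocity
    n_min = int(t_min * fs)                     # minimum-duration window
    above = np.flatnonzero(ang > ang_min)
    if above.size == 0:
        return None
    t_start = int(above[0])                     # eq. (3): first crossing
    t_end = None
    for t in range(t_start + n_min, len(ang)):
        # eq. (4): every sample in [t - t_min, t) is below the threshold
        if np.all(ang[t - n_min:t] < ang_min):
            t_end = t
            break
    if t_end is None:
        return None
    dur = (t_end - t_start) / fs
    # eq. (5): reject spikes (too short) and abnormal actions (too long)
    if t_min < dur < t_max:
        return t_start, t_end
    return None
```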

Step S3, computation of head-action time-series templates: according to the head-action data detected by the endpoint detection of step S2 and the associated action labels, construct acceleration and angular-velocity head-action templates for the X, Y and Z directions, with the following steps:

Step S31: the time series of head actions are extracted with the endpoint-detection method described in step S2 of the present invention; the reference threshold values are l = 0.2 s, angmin = 0.2 rad/s, tmin = 0.6 s, tmax = 3 s and fs = 100 Hz, giving each action time series and its label.

Step S32: taking one head action in the training set as an example, let one group of data of its time series be Sa = {s1, s2, …, sa}, where a is the length of this group of data and Sa is a 6 × a matrix whose row vectors correspond to the accelerations and angular velocities in the X, Y and Z directions and whose column vectors correspond to the head-motion features. Then the total time-series set of the training set is S = {Sa, Sb, …, Sn}, where n is the number of instances of this action in the training set and a, b, …, n denote the lengths of the corresponding sequences.

Step S33: let the sequence-length vector be Slen = {a, b, …, n}; then the template time-series length is Tlen = median(Slen), where median() is the median function.

Step S34: let the standard template of this head action be Ti, where i = 1, 2, …, 6, corresponding to the six head-action types; Ti is a 6 × x matrix whose row vectors correspond to the accelerations and angular velocities in the X, Y and Z directions and whose column vectors correspond to the head-motion features, with the length x determined by the data lengths in the training set. The mean formula

Tik = ( Σj=1..n Sjk ) / ( Σj=1..n binary(Sjk) )    (element-wise over sample positions)

gives Tik, and the first Tlen samples of Tik are taken as the standard-template time series of this action, where Tik denotes the k-th row of data in the i-th action template and Sjk denotes the k-th row of data of the j-th subject's instance of this action type; since the durations of Sjk are not equal across testers, the binary() function binarizes Sjk to {1, 0} so that the number of elements present at each position can be counted.

Step S35: repeating steps S32 to S34 gives the standard templates of the other action types, as shown in FIG. 3. In FIG. 3, accx(t), accy(t), accz(t) and acc(t) denote the accelerations in the X, Y and Z directions and the combined acceleration, respectively; angx(t), angy(t), angz(t) and angt(t) denote the angular velocities in the X, Y and Z directions and the combined angular velocity, respectively.
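
A possible Python sketch of the template construction of steps S32 to S34, under the reading that binary() marks the positions at which a sequence still has samples, so the element-wise mean divides by the number of sequences contributing to each position; this interpretation and all names are assumptions:

```python
import numpy as np

def build_template(sequences):
    """Template construction sketch for steps S32-S34.

    sequences: list of 6 x L_j arrays, all training instances of one action.
    Each row is averaged element-wise across instances, dividing by the
    per-position occupancy count (the binary() count described above), then
    truncated to the median sequence length T_len.
    """
    lengths = [s.shape[1] for s in sequences]
    t_len = int(np.median(lengths))             # T_len = median(S_len)
    x = max(lengths)                            # template buffer length
    total = np.zeros((6, x))
    count = np.zeros((6, x))                    # binary() occupancy count
    for s in sequences:
        total[:, :s.shape[1]] += s
        count[:, :s.shape[1]] += 1
    template = total / np.maximum(count, 1)     # element-wise mean
    return template[:, :t_len]                  # keep the first T_len columns
```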

The user can also, following the system's voice prompts, perform the corresponding head actions several times and compute a person-dependent head-action time-series template according to steps S31, S32, S33 and S34.

Step S4, computation of the warping path: for the test set, the warping path is computed between the head-action data detected in step S2 and each of the head-action templates obtained in step S3, with the following detailed steps:

Step S41: compute the distance matrix D. Let the time series of a head action in the test set be S = {s1, s2, …, sn}, a 6 × n matrix, and the template time series to be matched, i.e. the standard-template data, be T = {t1, t2, …, tm}, a 6 × m matrix; then the Euclidean distance between any two points of the two series is

d(si, tj) = √( Σk=1..6 (sik - tjk)² )

where si is the i-th column vector in the matrix S; tj is the j-th column vector in the matrix T; sik is the k-th element of the i-th column vector of S; tjk is the k-th element of the j-th column vector of T. Computing all possibilities forms an n × m distance matrix D. The similarity problem between the two time series is thus transformed into applying dynamic programming to solve the shortest-path problem from the start point D(1,1) to the end point D(n,m), i.e. the warping path, denoted by W.
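
The pointwise distance matrix D of step S41 can be sketched in a few lines of NumPy; this vectorized form is an illustrative choice, not the patent's own implementation:

```python
import numpy as np

def distance_matrix(S, T):
    """Pointwise Euclidean distance matrix D of step S41.

    S: 6 x n test sequence; T: 6 x m template. D[i, j] is the Euclidean
    distance between column i of S and column j of T over the 6 features.
    """
    diff = S[:, :, None] - T[:, None, :]        # shape 6 x n x m
    return np.sqrt((diff ** 2).sum(axis=0))     # shape n x m
```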

Step S42: let the warping path be W = {w1, w2, w3, …, wy}, where we denotes the distance between one point of the time series S and one point of the standard template T, and y is the warping-path length with range max(m, n) ≤ y ≤ m + n. The following constraints need to be satisfied:

① Boundary condition: the path runs from the start point w1 = D(1,1) to the end point wy = D(n,m);

② Continuity: if we-1 = D(a,b), then the next point of the path we = D(a',b') needs to satisfy |a - a'| ≤ 1 and |b - b'| ≤ 1, i.e. the matching cannot skip over any point;

③ Monotonicity: if we-1 = D(a,b), then the next point of the path we = D(a',b') needs to satisfy a' - a ≥ 0 and b' - b ≥ 0, i.e. the points on W must proceed monotonically in time;

Hence, from we-1 = D(a,b) there are only three possible paths to the next point: D(a+1,b), D(a+1,b+1) and D(a,b+1). The optimal warping path is then:

DTW(S, T) = min{ Σe=1..y we }    (6)

Step S43: to solve the optimal warping path, i.e. to solve equation (6), computation follows the dynamic-programming idea of cumulative distance; the cumulative-distance formula is defined as:

r(e, f) = d(se, tf) + min{r(e-1, f), r(e-1, f-1), r(e, f-1)}    (7)

where e = 1, 2, 3, …, n; f = 1, 2, 3, …, m; se denotes the e-th column vector in the matrix S to be detected; tf denotes the f-th column vector of a head action in the template matrix T to be matched; r(e, f) is the cumulative distance. The cumulative distance is in fact a recursive relation, so the optimal warping-path distance between the two time series S and T is DTW(S, T) = r(n, m), which solves the problem of measuring the similarity of time series whose lengths are inconsistent and whose feature positions are not aligned.
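
A compact Python sketch of the cumulative-distance recursion of equation (7), reusing the distance_matrix sketch from step S41 above; the padded row and column of infinities are an implementation convenience, not part of the patent:

```python
import numpy as np

def dtw(S, T):
    """Cumulative-distance recursion of eq. (7); returns DTW(S, T) = r(n, m).

    Implements r(e, f) = d(s_e, t_f) + min(r(e-1, f), r(e-1, f-1), r(e, f-1))
    under the boundary, continuity and monotonicity constraints above.
    Requires distance_matrix() from the earlier sketch.
    """
    D = distance_matrix(S, T)
    n, m = D.shape
    r = np.full((n + 1, m + 1), np.inf)         # padding enforces the boundary
    r[0, 0] = 0.0
    for e in range(1, n + 1):
        for f in range(1, m + 1):
            r[e, f] = D[e - 1, f - 1] + min(r[e - 1, f],
                                            r[e - 1, f - 1],
                                            r[e, f - 1])
    return r[n, m]
```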

Steps S41, S42 and S43 are repeated to compute the DTW values between the head-action time series S and the action time series of the 6 standard templates, respectively.

Step S5, judgment of the head-action type: the head-action type of the standard template corresponding to the minimum warping-path DTW value is the head-action type of the data to be recognized.
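
Step S5 then reduces to an argmin over the six template DTW values; a short sketch reusing the dtw function above, with illustrative names:

```python
def classify(S, templates):
    """Return the action label whose template gives the smallest DTW value.

    templates: dict mapping action label -> 6 x m template matrix.
    """
    return min(templates, key=lambda label: dtw(S, templates[label]))
```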

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (8)

1. A head posture recognition method based on dynamic time warping, characterized by comprising the following steps:
step S1, data acquisition: acquiring feature data of the acceleration and angular velocity of head motion postures in the X direction, the Y direction and the Z direction through an inertial sensor fixed on the head, and storing the feature data in a data set;
step S2, endpoint detection of head actions: preprocessing the data in the data set, detecting the start time and the end time of a head action according to the combined angular velocity computed from the preprocessed head inertial data, and extracting the action interval of the head action;
step S3, calculating head-action time-series templates: constructing acceleration and angular-velocity head-action templates for the X direction, the Y direction and the Z direction according to the head-action data detected by the endpoint detection of step S2 and the associated action labels;
step S4, calculating the warping path: calculating, for the test set in the data set, the warping path between the head-action data detected in step S2 and the head-action template data obtained in step S3;
step S5, judging the head-action type: the head-action type of the standard template corresponding to the minimum warping-path DTW value is the head-action type of the data to be recognized.
2. The head posture recognition method based on dynamic time warping according to claim 1, characterized in that the inertial sensor is mounted on a temple of a pair of glasses near the front of the head, and when collecting data, the subject wears the glasses, sits on a stool and naturally performs the head actions of nodding, raising the head, shaking the head to the left, shaking the head to the right, turning the head to the left and turning the head to the right; the format of the data set is [data, label], wherein data is a 6-dimensional matrix containing the x-, y- and z-axis accelerations and angular velocities of the sensor, with variable length under different labels, and label is a categorical variable corresponding to the 6 types of head actions.
3. The head posture recognition method based on dynamic time warping according to claim 1, characterized in that the preprocessing in step S2 comprises:
step S21: data normalization
y'(t) = arctan(x(t)) * 2/π    (1)
where y'(t) is the normalized data and x(t) is the acceleration or angular-velocity data collected by the inertial sensor;
step S22: sliding median filtering
y(t) = median(y'(t-(l-1)/2 : t+(l-1)/2)), l = 2n-1, n ∈ N
y(t) = median(y'(t-l/2 : t+l/2-1)), l = 2n, n ∈ N    (2)
where l is the median-filter window length, l = 2n-1 representing an odd number and l = 2n an even number with N the set of natural numbers, median() is the median function, y(t) is the median within the length of the sliding window, and y'(t-(l-1)/2 : t+(l-1)/2) and y'(t-l/2 : t+l/2-1) respectively denote the normalized data segments of length l.
4. The head posture recognition method based on dynamic time warping according to claim 1 or 3, characterized in that the method for detecting the endpoints of a head action in step S2 is as follows: the start time of the head action is determined as:
tstart = min{ t : ang(t) > angmin }    (3)
where ang(t) = √(angx(t)² + angy(t)² + angz(t)²) is an overall description of the angular-velocity change in all directions at time t and reflects the overall degree of change of the angle of the head action; angx(t), angy(t) and angz(t) respectively denote the angular-velocity components in the X direction, the Y direction and the Z direction of the three-dimensional coordinate axes; angmin is the threshold for the head to start moving; tstart is the start time of the head action;
the end time of the head action is determined as:
tend = min{ t > tstart : sum(ang([t-tmin, t)) < angmin) = tmin·fs }    (4)
where sum(ang([t-tmin, t)) < angmin) counts the number of samples of ang(t) in the time interval [t-tmin, t) that are smaller than the threshold angmin; tmin is the minimum duration of a head action; fs is the sampling frequency of the sensor; if, after the head action starts, the values of all sampling points within the minimum duration are smaller than the threshold angmin, the head action is considered to have ended, and the end time is tend;
the validity of the head action is judged as:
(tend - tstart > tmin) and (tend - tstart < tmax): a head action exists;
where tmin is the minimum duration of a head action and tmax is the maximum duration of a head action.
5. The head posture recognition method based on dynamic time warping according to claim 4, characterized in that head-action data of 26 subjects are collected and randomly divided into a training set and a test set, with 18 subjects in the training set and 8 in the test set;
if the data processed in step S2 belong to the training set or to person-dependent template acquisition, the head-action time-series template is calculated through step S3; if they belong to the test set or are real-time data collected online, the head-action type to which the action sequence belongs is judged by the DTW values through steps S4 and S5.
6. The head posture recognition method based on dynamic time warping according to claim 5, characterized in that step S3 is implemented as follows:
step S31: according to the time series of head actions extracted in step S2, each action time series and its label are obtained according to the set thresholds;
step S32: for one head action in the training set, let one group of data of its time series be Sa = {s1, s2, …, sa}, where Sa is a 6 × a matrix whose row vectors respectively correspond to the accelerations and angular velocities in the X direction, the Y direction and the Z direction and whose column vectors correspond to the head-motion features; the total time-series set of the training set is S = {Sa, Sb, …, Sn}, where n is the number of instances of the action in the training set and a, b, …, n denote the lengths of the sequences Sa, Sb, …, Sn;
step S33: let the sequence-length vector be Slen = {a, b, …, n}; then the template time-series length is Tlen = median(Slen), where median() is the median function;
step S34: let the standard template of the head action be Ti, where i = 1, 2, …, 6, corresponding to the six head-action types; Ti is a 6 × x matrix whose row vectors respectively correspond to the accelerations and angular velocities in the X direction, the Y direction and the Z direction and whose column vectors correspond to the head-motion features, the length x being determined according to the data lengths in the training set; the mean formula
Tik = ( Σj=1..n Sjk ) / ( Σj=1..n binary(Sjk) )
gives Tik, and the first Tlen samples of Tik are taken as the standard-template time series of the action, where Tik denotes the k-th row of data in the i-th action template and Sjk denotes the k-th row of data of the j-th subject's instance of the action type; since the durations of Sjk are not equal among testers, the binary() function binarizes Sjk to {1, 0}, so that the number of elements present at each position can be counted;
step S35: repeating steps S32 to S34 gives the standard templates of the other action types.
7. The head posture recognition method based on dynamic time warping according to claim 5, characterized in that step S4 is implemented as follows:
step S41: calculating the distance matrix D: let the time series of a head action in the test set be S = {s1, s2, …, sn} and the template time series to be matched, i.e. the standard-template data, be T = {t1, t2, …, tm}; the Euclidean distance between any two points is
d(si, tj) = √( Σk=1..6 (sik - tjk)² )
where si is the i-th column vector in the time series S; tj is the j-th column vector in the standard template T; sik is the k-th element of the i-th column vector of S; tjk is the k-th element of the j-th column vector of T; all the possibilities are calculated to form an n × m distance matrix D; the problem is transformed into solving, with a dynamic-programming method, the shortest-path problem from the start point D(1,1) to the end point D(n,m);
step S42: let the warping path be W = {w1, w2, w3, …, wy}, where we denotes the distance between one point of the time series S and one point of the standard template T, and y is the warping-path length with range max(m, n) ≤ y ≤ m + n; the optimal warping path is obtained according to the constraints on the warping path;
step S43: the optimal warping path is solved by calculating with the dynamic-programming idea of cumulative distance;
steps S41, S42 and S43 are repeated to calculate the DTW values between the head-action time series S and the action time series of the 6 standard templates, respectively.
8. The head posture recognition method based on dynamic time warping according to claim 7, characterized in that the warping path needs to satisfy the following constraints:
① boundary condition: the path runs from the start point w1 = D(1,1) to the end point wy = D(n,m);
② continuity: if we-1 = D(a,b), then the next point of the path we = D(a',b') needs to satisfy |a - a'| ≤ 1 and |b - b'| ≤ 1, i.e. the matching cannot skip over any point;
③ monotonicity: if we-1 = D(a,b), then the next point of the path we = D(a',b') needs to satisfy a' - a ≥ 0 and b' - b ≥ 0, i.e. the points on the warping path W must proceed monotonically in time;
hence, from we-1 = D(a,b) there are only three possible paths to the next point: D(a+1,b), D(a+1,b+1) and D(a,b+1); the optimal warping path is then:
DTW(S, T) = min{ Σe=1..y we }    (6)
the cumulative distance is:
r(e, f) = d(se, tf) + min{r(e-1, f), r(e-1, f-1), r(e, f-1)};
where e = 1, 2, 3, …, n; f = 1, 2, 3, …, m; se denotes the e-th column vector in the matrix S to be detected; tf denotes the f-th column vector of a head action in the template matrix T to be matched; r(e, f) is the cumulative distance.
CN202011485090.XA 2020-12-16 2020-12-16 Head posture recognition method based on dynamic time warping Active CN112527118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011485090.XA CN112527118B (en) 2020-12-16 2020-12-16 Head posture recognition method based on dynamic time warping

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011485090.XA CN112527118B (en) 2020-12-16 2020-12-16 Head posture recognition method based on dynamic time warping

Publications (2)

Publication Number Publication Date
CN112527118A true CN112527118A (en) 2021-03-19
CN112527118B CN112527118B (en) 2022-11-25

Family

ID=75000581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011485090.XA Active CN112527118B (en) 2020-12-16 2020-12-16 Head posture recognition method based on dynamic time warping

Country Status (1)

Country Link
CN (1) CN112527118B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104731307A (en) * 2013-12-20 2015-06-24 孙伯元 Somatic action identifying method and man-machine interaction device
CN105184325A (en) * 2015-09-23 2015-12-23 歌尔声学股份有限公司 Human body action recognition method and mobile intelligent terminal
WO2017050140A1 (en) * 2015-09-23 2017-03-30 歌尔股份有限公司 Method for recognizing a human motion, method for recognizing a user action and smart terminal
CN109091150A (en) * 2017-11-29 2018-12-28 惠州市德赛工业研究院有限公司 Recognition methods, sleep quality appraisal procedure and the intelligent wearable device that body of sleeping moves
CN110348275A (en) * 2018-04-08 2019-10-18 中兴通讯股份有限公司 Gesture identification method, device, smart machine and computer readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
EDUARD WALL et al.: "Online nod detection in human-robot interaction", 2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN) *
LIU Yan et al.: "A human action recognition method using 3D skeleton segment representation", Journal of Chinese Computer Systems *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113158833A (en) * 2021-03-31 2021-07-23 电子科技大学 Unmanned vehicle control command method based on human body posture
CN113158833B (en) * 2021-03-31 2023-04-07 电子科技大学 Unmanned vehicle control command method based on human body posture
CN113936337A (en) * 2021-11-18 2022-01-14 成都天奥电子股份有限公司 Motion gesture recognition method and device, wearable device and storage medium
CN116453157A (en) * 2023-03-28 2023-07-18 上海交通大学 A method and device for flight task identification based on BDTW and sequence alignment

Also Published As

Publication number Publication date
CN112527118B (en) 2022-11-25

Similar Documents

Publication Publication Date Title
Zhang et al. EgoGesture: A new dataset and benchmark for egocentric hand gesture recognition
Devanne et al. 3-d human action recognition by shape analysis of motion trajectories on riemannian manifold
CN112527118B (en) Head posture recognition method based on dynamic time warping
Dikovski et al. Evaluation of different feature sets for gait recognition using skeletal data from Kinect
Frolova et al. Most probable longest common subsequence for recognition of gesture character input
CN101599127B (en) Method for extracting and identifying characteristics of electro-ocular signal
US20060228005A1 (en) Information processing apparatus and information processing method
Bao et al. Dynamic hand gesture recognition based on SURF tracking
CN105740779B (en) Method and device for detecting living human face
Ahmed Kinect-based gait recognition using sequences of the most relevant joint relative angles
CN105740781A (en) Three-dimensional human face in-vivo detection method and device
WO1999039302A1 (en) Camera-based handwriting tracking
CN101558996A (en) Gait recognition method based on orthogonal projection three-dimensional reconstruction of human motion structure
CN110458235B (en) Motion posture similarity comparison method in video
CN113920568A (en) Face and human body posture emotion recognition method based on video image
CN103310191B (en) The human motion recognition method of movable information image conversion
CN112883922B (en) Sign language identification method based on CNN-BiGRU neural network fusion
JP2013003706A (en) Facial-expression recognition device, method, and program
CN111914643A (en) A Human Action Recognition Method Based on Skeletal Keypoint Detection
CN112801859A (en) Cosmetic mirror system with cosmetic guiding function
Li et al. Posture recognition technology based on kinect
CN109993116B (en) Pedestrian re-identification method based on mutual learning of human bones
CN111695520A (en) High-precision child sitting posture detection and correction method and device
CN109308118B (en) EOG-based Chinese eye-writing signal recognition system and its recognition method
Xu et al. Using trajectory features for tai chi action recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant