CN117671704B - A method, device and computer storage medium for handwritten digit recognition - Google Patents
- Publication number
- CN117671704B (application CN202410130100.XA)
- Authority
- CN
- China
- Prior art keywords
- matrix
- data
- projection matrix
- training data
- weight vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/22—Character recognition characterised by the type of writing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19127—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19147—Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19173—Classification techniques
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a handwritten digit recognition method comprising the following steps: normalizing the collected samples to obtain training data that include labeled data and unlabeled data; computing the within-class scatter matrix and the between-class scatter matrices from the labeled data; constructing a neighbor graph from the labeled and unlabeled data to compute a manifold regularization term; learning a projection matrix from the training data by Laplacian adaptive weighted discriminant analysis, with the optimization problem solved iteratively to obtain the optimal projection matrix; and normalizing a sample to be recognized, projecting it with the optimal projection matrix, and obtaining its recognition label with a nearest neighbor classifier. The invention also discloses a device based on the method and a computer storage medium. The invention addresses multi-class problems with few labeled data, improves data utilization, and improves classification performance.
Description
Technical Field
The present invention relates to the field of image recognition, and in particular to a handwritten digit recognition method, a device, and a computer storage medium.
Background Art
Linear discriminant analysis (LDA) is a classic supervised learning algorithm used mainly for dimensionality reduction and classification. Its main idea is to project the data into a new space in which samples of the same class lie as close together as possible and samples of different classes lie as far apart as possible. It can be applied to image classification problems such as handwritten digit recognition, and projecting the data onto the optimal linear discriminant directions can improve classification accuracy. For multi-class problems, however, LDA is not an optimal choice. Under the homoscedastic Gaussian assumption, the LDA projection is obtained by maximizing the weighted arithmetic mean of the Kullback-Leibler (KL) divergences between classes, so the projection directions are dominated by class pairs with large KL divergence. Class pairs with small KL divergence then overlap in the projected space, and classification accuracy degrades significantly. To address this class separation problem, many researchers have proposed weighting schemes to improve LDA. These supervised discriminant analysis methods fall into two main groups: one replaces the arithmetic mean of the pairwise KL divergences and assigns different weights to class pairs with different KL divergences; the other focuses on separating nearby class pairs by emphasizing pairs with small KL divergence. All of these methods are supervised, however: they need enough labeled data to train a model, and they are prone to overfitting.
Data collection techniques and tools keep improving, so large amounts of data are available, but labeling data still requires considerable manpower and resources. How to use unlabeled data to improve the performance of existing algorithms has therefore become an active research topic. Semi-supervised learning uses a large amount of unlabeled data to assist a small amount of labeled data, yielding a model with stronger generalization ability. Extending supervised discriminant analysis to semi-supervised learning to obtain a more effective classification model is one of the tasks that urgently needs to be solved.
Summary of the Invention
In view of the above shortcomings of the prior art, the present invention provides a handwritten digit recognition method that extends supervised discriminant analysis to semi-supervised learning and addresses the class separation problem that conventional discriminant analysis suffers from in multi-class tasks with few labeled data. Another object of the invention is to provide a handwritten digit recognition device and a corresponding computer storage medium.
The technical solution of the invention is a handwritten digit recognition method comprising the following steps:
Step S1: normalize the collected samples to obtain training data, the training data comprising labeled data and unlabeled data;
Step S2: compute the within-class scatter matrix and the between-class scatter matrices from the labeled data in the training data;
Step S3: construct a neighbor graph from the labeled data and the unlabeled data in the training data and compute the manifold regularization term;
Step S4: learn the optimal projection matrix from the training data by Laplacian adaptive weighted discriminant analysis (LapAWDA), which includes setting the LapAWDA optimization objective to
[objective function and constraints: formulas not reproduced]
where the objective is expressed in terms of the within-class scatter matrix, the between-class scatter matrices, the L2,1 norm of the projection matrix, the manifold regularization term, the trade-off parameters, the identity matrix, and the number of classes of label information in the training data; m is the number of features and d is the dimension of the projection space. The projection matrix and the weight vector are solved by iterative optimization to obtain the optimal projection matrix;
Step S5: normalize the sample to be recognized, project it with the optimal projection matrix, and obtain the recognition label with a nearest neighbor classifier.
The invention also provides a handwritten digit recognition device comprising:
Preprocessing module: normalizes the collected samples to obtain training data, the training data comprising labeled data and unlabeled data;
First calculation module: computes the within-class scatter matrix and the between-class scatter matrices from the labeled data in the training data;
Second calculation module: constructs a neighbor graph from the labeled data and the unlabeled data in the training data and computes the manifold regularization term;
Optimal projection matrix solution module: learns the optimal projection matrix from the training data by Laplacian adaptive weighted discriminant analysis (LapAWDA), which includes setting the LapAWDA optimization objective to
[objective function and constraints: formulas not reproduced]
where the objective is expressed in terms of the within-class scatter matrix, the between-class scatter matrices, the L2,1 norm of the projection matrix, the manifold regularization term, the trade-off parameters, the identity matrix, and the number of classes of label information in the training data; m is the number of features and d is the dimension of the projection space. The projection matrix and the weight vector are solved by iterative optimization to obtain the optimal projection matrix;
Recognition module: normalizes the sample to be recognized, projects it with the optimal projection matrix, and obtains the recognition label with a nearest neighbor classifier.
Further, step S3 and the second calculation module comprise the following calculation steps:
Step S3.1: construct a neighbor graph over the training data, which consist of the labeled data and the unlabeled data, to obtain the neighbor matrix. The neighbor matrix is defined from the nearest-neighbor set of each sample [formula not reproduced];
Step S3.2: compute the Laplacian matrix of the manifold regularization term from the neighbor matrix and a diagonal matrix [formulas not reproduced], and obtain the manifold regularization term in the projection space [formula not reproduced], which is expressed with the L2 norm and the image of each training sample in the low-dimensional projection space.
Further, solving for the projection matrix and the weight vector by iterative optimization to obtain the optimal projection matrix comprises the following steps:
Step S4.1: initialize the weights and solve for the projection matrix. With the weights fixed, the LapAWDA objective reduces to a subproblem in the projection matrix [formulas not reproduced], which involves a constant and trade-off coefficients. A matrix is first computed from these quantities, giving a simplified objective [formula not reproduced]. Applying the method of Lagrange multipliers converts the optimization problem into an eigendecomposition problem [formula not reproduced], which involves a diagonal matrix whose diagonal elements are defined from the row vectors of the projection matrix. The optimal projection matrix consists of the eigenvectors corresponding to the largest eigenvalues;
Step S4.2: fix the projection matrix and solve for the weight vector. The LapAWDA objective then becomes a subproblem in the weight vector [formulas not reproduced], and the Cauchy inequality gives the solution for the weight vector [formula not reproduced];
Step S4.3: update the weight vector and solve for the projection matrix again as in step S4.1. Once the optimal projection matrix of the current round is obtained, start the next round of alternating iteration: fix the projection matrix and update the weight vector as in step S4.2. Repeat step S4.3 until the stopping condition is met, which yields the optimal projection matrix.
Because the LapAWDA optimization problem is not a classical quadratic optimization problem, it is solved with a fast and effective iterative optimization algorithm whose convergence can be proved theoretically.
Further, the stopping condition in step S4.3 is [formula not reproduced].
Further, the within-class scatter matrix is computed from the samples of each class and the mean vector of that class [formula not reproduced].
The between-class scatter matrices are computed as follows: [formula not reproduced].
The invention also provides a computer storage medium storing a computer program which, when executed by a processor, implements the above handwritten digit recognition method.
Compared with the prior art, the technical solution of the invention has the following advantages:
The invention introduces the structural information of the unlabeled data through the manifold regularization term, and uses adaptive weights to balance the KL divergence between class pairs so that class pairs with small KL divergence do not vanish in the projection space. In addition, an L2,1 norm constraint is imposed on the projection matrix to obtain a sparse discriminative projection matrix, which further improves classification accuracy and makes the method better suited to multi-class tasks. The optimal solution of the LapAWDA problem extracts information that is useful for the subsequent classification task.
By combining discriminant analysis with the manifold regularization term, the semi-supervised feature extraction method of the invention, together with nearest neighbor classification, yields a multi-class model that can be used for semi-supervised multi-class problems with few labeled data, improving data utilization and classification performance.
Brief Description of the Drawings
FIG. 1 is a flow chart of the handwritten digit recognition method.
FIG. 2 is a flow chart of the Laplacian adaptive weighted discriminant analysis method.
FIG. 3 shows samples from the MNIST dataset.
FIG. 4 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 10 labeled samples.
FIG. 5 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 20 labeled samples.
FIG. 6 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 30 labeled samples.
Detailed Description
The invention is further described below with reference to embodiments. These embodiments only illustrate the invention and do not limit its scope; after reading this description, modifications of equivalent form made by those skilled in the art fall within the scope defined by the claims appended to this application.
The handwritten digit recognition device of this embodiment comprises:
Preprocessing module: normalizes the collected samples to obtain training data, the training data comprising labeled data and unlabeled data;
First calculation module: computes the within-class scatter matrix and the between-class scatter matrices from the labeled data in the training data;
Second calculation module: constructs a neighbor graph from the labeled data and the unlabeled data in the training data and computes the manifold regularization term;
Optimal projection matrix solution module: learns the optimal projection matrix from the training data by Laplacian adaptive weighted discriminant analysis;
Recognition module: normalizes the sample to be recognized, projects it with the optimal projection matrix, and obtains the recognition label with a nearest neighbor classifier.
Specifically, with reference to FIG. 1 and FIG. 2, the handwritten digit recognition method used by the device comprises the following steps:
Step S1: normalize the collected samples to obtain training data. Normalization divides each pixel value of an image by 255, mapping it to the range [0, 1]. The training data comprise labeled data, whose label vector assigns each labeled sample one of the class labels, and unlabeled data; the total number of training samples is the number of labeled samples plus the number of unlabeled samples [symbol definitions not reproduced].
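The normalization in step S1 can be sketched as follows. This is a minimal pure-Python illustration, assuming 8-bit grayscale pixels stored as flat lists; the function names `normalize_image` and `normalize_dataset` are hypothetical and do not come from the patent.

```python
def normalize_image(pixels):
    """Map 8-bit grayscale pixel values (0..255) to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]


def normalize_dataset(images):
    """Normalize every image; labeled and unlabeled images are treated alike."""
    return [normalize_image(img) for img in images]


# Three-pixel toy "image" (assumed data, for illustration only).
print(normalize_image([0, 51, 255]))
```

The same function would be applied to a sample to be recognized before projection, so that training and test data share one scale.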
Step S2: compute the within-class scatter matrix and the between-class scatter matrices from the labeled data in the training data.
For the multi-class data, the total within-class scatter matrix is computed from each sample of each class and the mean vector of that class [formula not reproduced].
Every two of the classes form a class pair, and each class pair has its own between-class scatter matrix, computed for a given pair of classes as follows: [formula not reproduced].
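Because the formulas of step S2 are not reproduced in this text, the sketch below assumes the standard definitions: the within-class scatter is the sum over classes of (x - mean)(x - mean)^T, and the between-class scatter of a class pair is the outer product of the difference of the two class means. Any normalization constants in the patent's own formulas may differ, and all function names are hypothetical.

```python
from itertools import combinations


def mean_vector(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[a * b for b in v] for a in u]


def within_class_scatter(X, y):
    """Sum over classes of (x - mu_class)(x - mu_class)^T (assumed form)."""
    m = len(X[0])
    Sw = [[0.0] * m for _ in range(m)]
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        mu = mean_vector(members)
        for x in members:
            diff = [a - b for a, b in zip(x, mu)]
            for i, row in enumerate(outer(diff, diff)):
                for j, val in enumerate(row):
                    Sw[i][j] += val
    return Sw


def pairwise_between_class_scatter(X, y):
    """One matrix (mu_i - mu_j)(mu_i - mu_j)^T per class pair (assumed form)."""
    means = {label: mean_vector([x for x, t in zip(X, y) if t == label])
             for label in sorted(set(y))}
    scatter = {}
    for i, j in combinations(sorted(means), 2):
        diff = [a - b for a, b in zip(means[i], means[j])]
        scatter[(i, j)] = outer(diff, diff)
    return scatter


# Toy labeled data: two classes, two features (assumed, for illustration).
X = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [2.0, 4.0]]
y = [0, 0, 1, 1]
print(within_class_scatter(X, y))
print(pairwise_between_class_scatter(X, y))
```

Keeping one matrix per class pair, rather than a single summed between-class matrix, is what lets a later step weight the pairs individually.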
Step S3: construct a neighbor graph from the labeled data and the unlabeled data in the training data and compute the manifold regularization term. Specifically:
Step S3.1: construct a neighbor graph over the training data to obtain the neighbor matrix. The neighbor matrix is defined from the nearest-neighbor set of each sample [formula not reproduced].
Step S3.2: compute the Laplacian matrix of the manifold regularization term from the neighbor matrix and a diagonal matrix [formulas not reproduced], and obtain the manifold regularization term in the projection space [formula not reproduced], which is expressed with the L2 norm and the image of each training sample in the low-dimensional projection space; d is the dimension of the projection space.
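Steps S3.1 and S3.2 can be illustrated with a small sketch. Since the patent's formulas are not reproduced here, it assumes a binary, symmetrized k-nearest-neighbor matrix W, the usual graph Laplacian L = D - W with D the diagonal degree matrix, and the regularizer 0.5 * sum_ij W_ij * ||z_i - z_j||^2; the patent may use different edge weights or scaling. All names are hypothetical.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def knn_adjacency(X, k):
    """Binary neighbor matrix: W[i][j] = 1 if j is among the k nearest
    neighbors of i or vice versa (assumed construction)."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: squared_distance(X[i], X[j]))
        for j in others[:k]:
            W[i][j] = 1.0
            W[j][i] = 1.0  # symmetrize the graph
    return W


def graph_laplacian(W):
    """L = D - W, where D is diagonal with D[i][i] = sum_j W[i][j]."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def manifold_regularizer(Z, W):
    """0.5 * sum_ij W[i][j] * ||z_i - z_j||^2 over projected samples Z."""
    n = len(Z)
    return 0.5 * sum(W[i][j] * squared_distance(Z[i], Z[j])
                     for i in range(n) for j in range(n))


# Three one-dimensional samples (assumed data), one neighbor each.
X = [[0.0], [1.0], [3.0]]
W = knn_adjacency(X, k=1)
print(graph_laplacian(W))
print(manifold_regularizer(X, W))
```

The graph uses labeled and unlabeled samples together, which is how the unlabeled data enter the objective. For one-dimensional projections the regularizer equals z^T L z, which is why the Laplacian appears in the eigenproblem of step S4.1.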
Step S4: the expression of the manifold regularization term shows that it depends on the projection vector, so the manifold regularization term is introduced into Laplacian adaptive weighted discriminant analysis (LapAWDA), which solves for the projection vector during optimization. The weight vector is not predefined but learned from the low-dimensional projection space, and the LapAWDA objective is non-smooth, so the projection vector and the weight vector cannot be solved for directly at the same time. An iterative optimization algorithm can nevertheless reach an approximately optimal solution. The LapAWDA problem is therefore solved by alternating two steps, 1) fix the weight vector and update the projection vector, and 2) fix the projection vector and update the weight vector, until the stopping condition is met.
Specifically, the optimization objective of LapAWDA is set to
[objective function and constraints: formulas not reproduced]
where the objective is expressed in terms of the within-class scatter matrix, the between-class scatter matrices, the L2,1 norm of the projection matrix, the trade-off parameters, and the identity matrix; m is the number of features.
The iterative optimization proceeds as follows:
Step S4.1: initialize the weights and solve for the projection matrix. With the weights fixed, the LapAWDA objective reduces to a subproblem in the projection matrix [formulas not reproduced], which involves a constant and trade-off coefficients. A matrix is first computed from these quantities, giving a simplified objective [formula not reproduced]. Applying the method of Lagrange multipliers converts the optimization problem into an eigendecomposition problem [formula not reproduced], which involves a diagonal matrix whose diagonal elements are defined from the row vectors of the projection matrix. The optimal projection matrix consists of the eigenvectors corresponding to the largest eigenvalues;
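Step S4.1 mentions a diagonal matrix built from the rows of the projection matrix, which is how an L2,1 penalty is usually handled by iterative reweighting. The sketch below computes the L2,1 norm and a reweighting diagonal with entries 1 / (2 * ||row_i||_2). That entry formula is the common choice in the L2,1 literature and is an assumption here, since the patent's own definition is not reproduced; the function names and the `eps` guard are hypothetical.

```python
import math


def l21_norm(P):
    """L2,1 norm: sum of the Euclidean norms of the rows of P."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in P)


def reweighting_diagonal(P, eps=1e-12):
    """Diagonal entries 1 / (2 * ||row_i||_2) (assumed form).
    eps guards against division by zero for all-zero rows."""
    return [1.0 / (2.0 * max(math.sqrt(sum(v * v for v in row)), eps))
            for row in P]


# Toy projection matrix with m = 3 features and d = 2 dimensions (assumed).
P = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
print(l21_norm(P))
print(reweighting_diagonal(P))
```

Rows with a small norm receive a large diagonal entry and are pushed further toward zero in the next eigendecomposition, which is what makes the learned projection row-sparse, so that whole features are discarded.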
Step S4.2: fix the projection matrix and solve for the weight vector. The LapAWDA objective then becomes a subproblem in the weight vector [formulas not reproduced], and the Cauchy inequality gives the solution for the weight vector [formula not reproduced];
Step S4.3: update the weight vector and solve for the projection matrix again as in step S4.1. Once the optimal projection matrix of the current round is obtained, start the next round of alternating iteration: fix the projection matrix and update the weight vector as in step S4.2. Repeat step S4.3 until the stopping condition [formula not reproduced] is met, which yields the optimal projection matrix.
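The weight update of step S4.2 cites the Cauchy inequality without the formula being reproduced here. As an illustration only, the derivation below shows how that inequality yields a closed-form weight vector for one common form of such a subproblem. The form of the subproblem, the costs c_k, and the weights alpha_k are assumptions, not the patent's notation.

```latex
% Assumed subproblem (not taken from the patent):
%   minimise  \sum_k c_k / \alpha_k
%   subject to  \sum_k \alpha_k = 1,\ \alpha_k > 0,
% with one fixed cost c_k > 0 per class pair.
\[
\Bigl(\sum_k \frac{c_k}{\alpha_k}\Bigr)\Bigl(\sum_k \alpha_k\Bigr)
\;\ge\; \Bigl(\sum_k \sqrt{c_k}\Bigr)^{2}
\quad\Longrightarrow\quad
\sum_k \frac{c_k}{\alpha_k} \;\ge\; \Bigl(\sum_k \sqrt{c_k}\Bigr)^{2},
\]
\[
\text{with equality if and only if}\quad
\alpha_k = \frac{\sqrt{c_k}}{\sum_l \sqrt{c_l}} .
\]
```

Under this assumed form each weight depends only on the current projection through c_k, so the update costs one pass over the class pairs and the alternation with step S4.1 stays cheap.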
Step S5: normalize the sample to be recognized and project it with the optimal projection matrix to obtain its image in the projection space [formula not reproduced]; a nearest neighbor classifier then assigns the recognition label.
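Step S5 can be sketched as a projection followed by a 1-nearest-neighbor lookup. The sketch assumes the projection matrix is stored as m rows of d entries and that a sample is projected as z = P^T x; it also assumes the neighbors are searched among the projected labeled training samples with Euclidean distance. The names `project` and `nearest_neighbor_label` are hypothetical.

```python
def project(P, x):
    """Return z = P^T x for an m x d matrix P (list of m rows) and an m-vector x."""
    d = len(P[0])
    return [sum(P[i][k] * x[i] for i in range(len(x))) for k in range(d)]


def nearest_neighbor_label(z, train_Z, train_y):
    """Label of the projected training sample closest to z (Euclidean distance)."""
    best = min(range(len(train_Z)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(z, train_Z[i])))
    return train_y[best]


# Toy setup (assumed): a projection that keeps the first two of three features.
P = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
train_X = [[0.0, 0.0, 9.0], [1.0, 1.0, 9.0]]
train_y = [3, 7]
train_Z = [project(P, x) for x in train_X]

query = [0.9, 0.8, 0.0]  # already normalized to [0, 1]
print(nearest_neighbor_label(project(P, query), train_Z, train_y))
```

Classification happens entirely in the low-dimensional space, so the cost of each distance computation scales with d rather than with the number of pixels.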
The demonstration experiments use the MNIST handwritten digit image dataset.
MNIST is a classic machine learning dataset of 60,000 training samples and 10,000 test samples, each a 28 × 28 pixel grayscale image of a handwritten digit, as shown in FIG. 3. In the experiments, the training set consists of 100 images drawn at random from the training images of each of the 10 digit classes (0 to 9), and the test set is the 10,000 test samples. To verify the effectiveness of the invention in semi-supervised learning, training uses different numbers of labeled samples, and the evaluation metric is the average test-set accuracy over 10 trials. The invention is a feature extraction method, so a nearest neighbor classifier is used to show its classification performance on MNIST.
Hardware: a MacBook Pro with an Intel Core i5 (2.7 GHz) processor and 8 GB of memory. Software: MATLAB R2015b. The experimental results are as follows:
To verify the effectiveness and advantages of the invention, it was compared with five supervised discriminant analysis methods (LDA, LFDA, aPAC, LADA, and MDAAWS) and four semi-supervised discriminant analysis methods (SLDA, SMMC, SSDR, and SELF). The number of nearest neighbors was set to 5, and the regularization parameters of each method were chosen by grid search over a parameter range [range not reproduced]. Table 1 records the classification accuracy of the invention and the other nine methods for different numbers of labeled samples, at a feature dimension of 20. The table shows that the semi-supervised discriminant analysis methods are generally more accurate than the corresponding supervised methods, which indicates that the unlabeled data provide useful information. As the number of labeled samples increases, some supervised methods lose classification performance because of overfitting, whereas none of the semi-supervised methods shows this behavior, so introducing information from unlabeled data can improve generalization. The proposed LapAWDA method obtains the highest classification accuracy with 10, 20, and 30 labeled samples, clearly outperforming the other discriminant analysis methods.
Table 1. Average classification accuracy (%) of the 10 methods for different numbers of labeled samples [table not reproduced]
To study how the number of features and the number of labeled samples affect the projection matrix obtained by LapAWDA, 10, 20, and 30 labeled samples were selected at random from each class of the training data, and the remaining training data were treated as unlabeled. FIGS. 4 to 6 show the accuracy of the discriminant analysis methods on MNIST as the dimension varies from 5 to 50. The invention achieves the highest accuracy at every dimension, and its advantage over the other methods is largest for the first 20 dimensions. Its classification accuracy also rises as the number of labeled samples increases. These results indicate that, through the projection matrix, the invention obtains more discriminative information from the training data in classification tasks while making better use of the labeled data.
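The labeled/unlabeled split described in the experiments can be sketched as below. It assumes a training set of 100 images per digit class and draws a fixed number of labeled indices per class at random, treating the rest as unlabeled. The function name, the fixed seed, and the synthetic label list are illustrative assumptions, not the patent's experimental code.

```python
import random


def split_labeled_unlabeled(y, labeled_per_class, seed=0):
    """Pick `labeled_per_class` indices per class at random; the rest are unlabeled."""
    rng = random.Random(seed)
    labeled = []
    for label in sorted(set(y)):
        indices = [i for i, t in enumerate(y) if t == label]
        labeled.extend(rng.sample(indices, labeled_per_class))
    labeled_set = set(labeled)
    unlabeled = [i for i in range(len(y)) if i not in labeled_set]
    return sorted(labeled), unlabeled


# Synthetic labels standing in for 100 training images per digit class.
y = [digit for digit in range(10) for _ in range(100)]
for count in (10, 20, 30):
    labeled, unlabeled = split_labeled_unlabeled(y, count)
    print(count, len(labeled), len(unlabeled))
```

Repeating the split with different seeds and averaging the test accuracy would correspond to the 10-trial average used as the evaluation metric.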
It should be noted that the methods of the above embodiments can be embodied as a computer program product, which can be stored on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage).
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410130100.XA CN117671704B (en) | 2024-01-31 | 2024-01-31 | A method, device and computer storage medium for handwritten digit recognition |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN117671704A | 2024-03-08 |
| CN117671704B | 2024-04-26 |
Family
ID=90079208
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410130100.XA Active CN117671704B (en) | 2024-01-31 | 2024-01-31 | A method, device and computer storage medium for handwritten digit recognition |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117671704B (en) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118097396B (en) * | 2024-04-23 | 2024-07-12 | 南京信息工程大学 | Underwater optical image recognition method and device |
| CN118918595A (en) * | 2024-07-15 | 2024-11-08 | 长春大学 | Handwriting digital classification method, device, medium and equipment |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104992166A (en) * | 2015-07-28 | 2015-10-21 | 苏州大学 | Robust measurement based handwriting recognition method and system |
| CN106845358A (en) * | 2016-12-26 | 2017-06-13 | 苏州大学 | A kind of method and system of handwritten character characteristics of image identification |
| CN109376796A (en) * | 2018-11-19 | 2019-02-22 | 中山大学 | Image classification method based on active semi-supervised learning |
| CN112861929A (en) * | 2021-01-20 | 2021-05-28 | 河南科技大学 | Image classification method based on semi-supervised weighted migration discriminant analysis |
Also Published As
| Publication number | Publication date |
|---|---|
| CN117671704A (en) | 2024-03-08 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN117671704B (en) | A method, device and computer storage medium for handwritten digit recognition | |
| Ghanem et al. | Maximum margin distance learning for dynamic texture recognition | |
| CN105354595B (en) | A robust visual image classification method and system | |
| CN105488536A (en) | Agricultural pest image recognition method based on multi-feature deep learning technology | |
| WO2015143580A1 (en) | Method and system for verifying facial data | |
| CN111860494A (en) | Optimal method, device, electronic device and storage medium for image target detection | |
| CN105760821A (en) | Classification and aggregation sparse representation face identification method based on nuclear space | |
| CN113887661A (en) | An Image Set Classification Method and System Based on Representation Learning Reconstruction Residual Analysis | |
| CN106557782A (en) | Hyperspectral image classification method and device based on class dictionary | |
| CN108664986B (en) | Multi-task learning image classification method and system based on lp norm regularization | |
| CN103310208B (en) | The distinctive human face posture recognition methods of describing based on local geometric vision phrase | |
| CN107633065A (en) | A kind of recognition methods based on cartographical sketching | |
| CN110659608A (en) | Scene classification method based on multi-feature fusion | |
| CN116188900A (en) | Small sample image classification method based on global and local feature augmentation | |
| CN107145841A (en) | A matrix-based low-rank sparse face recognition method and system | |
| CN114357200A (en) | A Cross-modal Hash Retrieval Method Based on Supervised Graph Embedding | |
| CN109978064A (en) | Lie group dictionary learning classification method based on image set | |
| Ahlawat et al. | A genetic algorithm based feature selection for handwritten digit recognition | |
| Zhang et al. | Semantically modeling of object and context for categorization | |
| CN108108769B (en) | Data classification method, device and storage medium | |
| Aggarwal et al. | Object detection based approaches in image classification: a brief overview | |
| CN113762151A (en) | A fault data processing method, system and fault prediction method | |
| Sadeghi et al. | Fast template evaluation with vector quantization | |
| CN111178533A (en) | Method and device for realizing automatic semi-supervised machine learning | |
| CN104573727B (en) | A kind of handwriting digital image dimension reduction method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |