CN115761851A - Optimization Method of Cosine Optimal Loss Function Based on Global Information - Google Patents
- Publication number
- CN115761851A (application CN202211442334.5A)
- Authority
- CN
- China
- Prior art keywords
- class
- cosine
- loss function
- optimal
- range
- Prior art date
- Legal status: Granted
Landscapes
- Image Analysis (AREA)
Abstract
The present invention proposes an optimization method for a cosine optimal loss function based on global information, comprising: S1. combining the advantages of existing loss functions with several important new properties and applying L2 weight normalization; S2. explicitly following the two objectives of minimizing intra-class variation and maximizing inter-class variation, relying on a new algorithm to learn the cosine similarity between class centers and class edges, and proposing two lightweight versions of the cosine optimal loss function; S3. integrating the two lightweight versions to create the standard version of the cosine optimal loss function. The invention mainly addresses the problem that existing loss functions either do not apply weight and feature normalization or do not explicitly follow the objectives of minimizing intra-class variation and maximizing inter-class variation. Using global information as feedback for face recognition, it proposes a cosine optimal loss function based on global information that is more effective than existing loss functions and achieves state-of-the-art performance.
Description
Technical Field
The invention relates to the technical fields of artificial intelligence, machine learning, and face recognition, and in particular to an optimization method for a cosine optimal loss function that is applicable to face recognition and based on global information.
Background Art
Convolutional neural networks (CNNs) have shown impressive performance in face recognition, and the loss function plays an important role in this process. To learn highly discriminative features, many different loss functions have been proposed in recent years. At present, the best-performing loss functions in face recognition fall into two categories: loss functions based on Euclidean distance and loss functions based on cosine similarity.
The softmax loss can be formulated as

$$L_S = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{W_{y_i}^{T} f_i + b_{y_i}}}{\sum_{j=1}^{P} e^{W_j^{T} f_i + b_j}}$$

where N is the batch size, P is the number of classes in the entire training set, f_i ∈ R^d is the feature vector of the i-th sample, which belongs to the y_i-th class, W_j ∈ R^d is the j-th column of the weight matrix W of the last fully connected layer, and b_j is the bias term of the j-th class. Typical Euclidean-distance-based losses include the center loss, the margin loss, and the range loss. They all add an extra penalty for joint supervision with the softmax loss and are designed around two objectives: minimizing intra-class variation and maximizing inter-class variation. Both objectives contribute to performance gains. Cosine-similarity-based loss functions include the L-Softmax, A-Softmax, and AM-Softmax losses; they are derived from the softmax loss by adding an extra margin constraint. L2 weight normalization improves performance, although the improvement is very limited. Feature normalization brings advantages that include better performance and a better geometric interpretation.
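For reference, a minimal PyTorch sketch of this softmax loss; the function name and tensor shapes are illustrative, not taken from the patent:

```python
import torch
import torch.nn.functional as F

def softmax_loss(features, labels, W, b):
    """Plain softmax loss as formulated above.

    features: (N, d) feature vectors f_i
    labels:   (N,)   class indices y_i in [0, P)
    W:        (d, P) weight matrix of the last fully connected layer
    b:        (P,)   bias terms b_j
    """
    logits = features @ W + b               # W_j^T f_i + b_j for every class j
    return F.cross_entropy(logits, labels)  # -(1/N) * sum_i log softmax(logits_i)[y_i]
```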
The loss functions proposed so far either do not apply weight and feature normalization, such as the contrastive loss, triplet loss, center loss, range loss, and margin loss, or do not explicitly follow the two objectives of improving discriminative ability, such as the L-Softmax, A-Softmax, AM-Softmax, and ArcFace losses.
At present, deep neural networks are trained by iteratively updating the network parameters based on feedback information from each mini-batch. This is the workable solution because two constraints exist: the computing power and the memory size of GPUs, TPUs, or other similar processing units. Without the computing-power limit, a deep neural network could be trained with the entire training set as the source of feedback information, directly optimizing the sample distribution of the whole training set. Without the memory-size limit, a deep neural network could load the entire training set into memory instead of processing the data mini-batch by mini-batch. Perhaps precisely because of these two constraints, no existing loss uses the entire dataset as the source of feedback information to optimize CNNs for face recognition.
We propose a new loss function, the cosine optimal loss function based on global information. It possesses all four properties: optimizing intra-class and inter-class variation, and applying weight and feature normalization. Moreover, it is guided by the distribution information of the whole training set. Compared with previously proposed loss functions, the cosine optimal loss function is more effective and exhibits state-of-the-art performance.
Summary of the Invention
(1) Technical Problem to Be Solved
The loss function plays an important role in convolutional neural networks (CNNs). However, existing loss functions either do not apply weight and feature normalization or do not explicitly follow the two objectives of improving discriminative ability: minimizing intra-class variation and maximizing inter-class variation. Moreover, all of these functions consider only the feedback information of each mini-batch and ignore the distribution information of the whole training set.
(2) Technical Solution
A cosine optimal loss function applicable to face recognition and based on global information, comprising the following steps:
a) The softmax loss is the most commonly used loss function in deep learning and can be formulated as

$$L_S = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{W_{y_i}^{T} f_i + b_{y_i}}}{\sum_{j=1}^{P} e^{W_j^{T} f_i + b_j}}$$

where N is the batch size, P is the number of classes in the entire training set, f_i ∈ R^d is the feature vector of the i-th sample, which belongs to the y_i-th class, W_j ∈ R^d is the j-th column of the weight matrix W of the last fully connected layer, and b_j is the bias term of the j-th class.
Fix b_j = 0 and ||W_j|| = 1 in the softmax loss to apply L2 weight normalization. At the same time, apply L2 normalization to the feature vector f_i and rescale ||f_i|| to S, then combine with the AM-Softmax loss. The resulting total loss is L = L_AM + λL_G, where S is a specified constant, L_G is the proposed cosine optimal loss function, λ is a hyperparameter that balances the influence of the two losses, and L_AM denotes the AM-Softmax loss.
b) To minimize intra-class variation, a lightweight version of the cosine optimal loss function, denoted L_G1 here, is first proposed. It is built on the cosine range of each class, R(j) = cos(c_j, e_j), where P is the number of classes in the entire training set, c_j is the center of class j, and e_j is the edge of class j (i.e., the sample of class j farthest from its center). R(j) is the cosine range of class j, that is, the cosine similarity between the class center and the class edge. We use W_j as an approximate substitute for c_j and propose an algorithm to recursively update the range of each class.
c) In the algorithm mentioned in step b), R(j) is first initialized to 1. The recorded ranges are then updated iteratively: for each input sample i,

$$R(j) \leftarrow \begin{cases} \cos(f_i, W_j), & \phi(y_i, j) = 1 \text{ and } \cos(f_i, W_j) < R(j) \\ \beta \cos(f_i, W_j), & \phi(y_i, j) = 1 \text{ and } \cos(f_i, W_j) \ge R(j) \\ R(j), & \phi(y_i, j) = 0 \end{cases} \qquad j = 1, 2, \ldots, P$$

where φ(y_i, j) = 1 when y_i = j and φ(y_i, j) = 0 otherwise, and β is the shrinkage rate, which adjusts how fast the learned class ranges shrink.
The basic idea of the learning algorithm proposed in step b) covers two cases: ① if the cosine similarity between an input sample and its corresponding class center is smaller than the recorded class range, the class range is replaced directly with that cosine similarity; ② conversely, if the cosine similarity between an input sample and its corresponding class center is not smaller than the recorded class range, the class range is shrunk by scaling that cosine similarity with β. Case ① keeps the learned class range up to date; as training progresses, the true class range becomes smaller and smaller. Case ② helps the learned class range shrink toward the true value.
d) To maximize inter-class variation, another lightweight version of the cosine optimal loss function, denoted L_G2 here, is proposed:

$$L_{G_2} = \sum\nolimits_{\mathrm{Top}}\bigl(\{\cos(W_a, W_b) \mid 1 \le a < b \le P\},\, K\bigr)$$

where Σ_Top(A, K) denotes the sum of the K largest elements of the set A, and W_a and W_b are approximate substitutes for the class centers of any two different classes. The purpose of L_G2 is to find the K pairs of nearest class centers in the whole training set and compute the sum of their distances; the nearest pairs are exactly those with the largest cosine similarities. Compared with non-adjacent class centers, the classes of adjacent centers are more likely to have small margins between them or to overlap. If all adjacent classes are properly separated, non-adjacent classes will be separated by even larger margins, so there is no need to consider all center pairs; the most efficient approach is to optimize the distances of all adjacent centers. Here K is set to P, the number of classes, because when all class centers are arranged in a circle on the hypersphere, the minimum number of adjacent center pairs is P.
e) Integrating the two lightweight versions proposed in steps b) and d) creates the standard version of the cosine optimal loss function, L_G, which combines L_G1 and L_G2.
(3) Beneficial Effects
The cosine optimal loss function of the present invention combines the advantages of the best loss functions proposed for face recognition in recent years, and makes the first attempt to use global information as feedback for face recognition. It employs a new algorithm to learn the cosine similarity between class centers and class edges. The proposed cosine optimal loss function was evaluated in extensive experiments on the LFW, SLLFW, and YTF datasets; the results demonstrate its effectiveness and show that it achieves state-of-the-art performance.
Detailed Description of the Embodiments
The present invention is further described below.
A design method for the cosine optimal loss function applicable to face recognition and based on global information comprises the following steps:
a) Fix b_j = 0 and ||W_j|| = 1 in the softmax loss to apply L2 weight normalization. At the same time, apply L2 normalization to the feature vector f_i and rescale ||f_i|| to S, then combine with the AM-Softmax loss. The resulting total loss is L = L_AM + λL_G, where S is a specified constant, L_G is the proposed cosine optimal loss function, λ is a hyperparameter that balances the influence of the two losses, and L_AM denotes the AM-Softmax loss.
The softmax loss is the most commonly used loss function in deep learning and can be formulated as

$$L_S = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{W_{y_i}^{T} f_i + b_{y_i}}}{\sum_{j=1}^{P} e^{W_j^{T} f_i + b_j}}$$

where N is the batch size, P is the number of classes in the entire training set, f_i ∈ R^d is the feature vector of the i-th sample, which belongs to the y_i-th class, W_j ∈ R^d is the j-th column of the weight matrix W of the last fully connected layer, and b_j is the bias term of the j-th class.
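A minimal PyTorch sketch of step a), assuming the published AM-Softmax margin form; the scale s (standing in for the constant S) and the margin m are typical illustrative values, not values specified in this document:

```python
import torch
import torch.nn.functional as F

def am_softmax_loss(features, labels, W, s=30.0, m=0.35):
    """AM-Softmax with b_j = 0, ||W_j|| = 1, and ||f_i|| rescaled to s.

    The margin form follows the published AM-Softmax loss; s and m are
    typical values, not taken from this document.
    """
    W_norm = F.normalize(W, dim=0)          # L2 weight normalization: ||W_j|| = 1
    f_norm = F.normalize(features, dim=1)   # L2 feature normalization
    cos = f_norm @ W_norm                   # (N, P) cosine similarities cos(theta_j)
    onehot = F.one_hot(labels, num_classes=cos.size(1)).to(cos.dtype)
    logits = s * (cos - m * onehot)         # subtract margin m on the target class only
    return F.cross_entropy(logits, labels)
```

The total loss of step a) is then L = L_AM + λ·L_G, with L_G constructed in steps b) through e) below.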
b) To minimize intra-class variation, a lightweight version of the cosine optimal loss function, denoted L_G1 here, is first proposed. It is built on the cosine range of each class:

R(j) = cos(c_j, e_j)

where P is the number of classes in the entire training set, c_j is the center of class j, and e_j is the edge of class j (i.e., the sample of class j farthest from its center). R(j) is the cosine range of class j, that is, the cosine similarity between the class center and the class edge. We use W_j as an approximate substitute for c_j and propose an algorithm to recursively update the range of each class.
c) In the algorithm mentioned in step b), R(j) is first initialized to 1. The recorded ranges are then updated iteratively: for each input sample i,

$$R(j) \leftarrow \begin{cases} \cos(f_i, W_j), & \phi(y_i, j) = 1 \text{ and } \cos(f_i, W_j) < R(j) \\ \beta \cos(f_i, W_j), & \phi(y_i, j) = 1 \text{ and } \cos(f_i, W_j) \ge R(j) \\ R(j), & \phi(y_i, j) = 0 \end{cases} \qquad j = 1, 2, \ldots, P$$

where φ(y_i, j) = 1 when y_i = j and φ(y_i, j) = 0 otherwise, and β is the shrinkage rate, which adjusts how fast the learned class ranges shrink.
The basic idea of the learning algorithm proposed in step b) covers two cases: ① if the cosine similarity between an input sample and its corresponding class center is smaller than the recorded class range, the class range is replaced directly with that cosine similarity; ② conversely, if the cosine similarity between an input sample and its corresponding class center is not smaller than the recorded class range, the class range is shrunk by scaling that cosine similarity with β. Case ① keeps the learned class range up to date; as training progresses, the true class range becomes smaller and smaller. Case ② helps the learned class range shrink toward the true value.
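The sketch below implements the range-learning rule of steps b) and c). The piecewise update mirrors cases ① and ② above; aggregating the intra-class term L_G1 as the mean of 1 - R(j) over the batch, and the value of β, are assumptions rather than details given here:

```python
import torch
import torch.nn.functional as F

def range_update_and_intra_loss(R, features, labels, W, beta=0.99):
    """Update the learned cosine range R(j) and form the intra-class term L_G1.

    R: (P,) running range estimates, initialized to 1 as in step c).
    beta: shrinkage rate (illustrative value; the text leaves it a hyperparameter).
    """
    cos = F.normalize(features, dim=1) @ F.normalize(W, dim=0)      # (N, P)
    idx = torch.arange(features.size(0), device=features.device)
    own = cos[idx, labels]              # cos(f_i, W_{y_i}); W_j stands in for c_j
    recorded = R[labels]                # currently recorded ranges for this batch
    # case 1: a farther sample appeared -> replace the range with its cosine;
    # case 2: otherwise shrink the learned range by scaling the cosine with beta
    updated = torch.where(own < recorded, own, beta * own)
    intra_loss = (1.0 - updated).mean() # assumed aggregation: push ranges toward 1
    R = R.clone()
    R[labels] = updated.detach()        # persist ranges without tracking gradients
    return intra_loss, R
```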
d) To maximize inter-class variation, another lightweight version of the cosine optimal loss function, denoted L_G2 here, is proposed:

$$L_{G_2} = \sum\nolimits_{\mathrm{Top}}\bigl(\{\cos(W_a, W_b) \mid 1 \le a < b \le P\},\, K\bigr)$$

where Σ_Top(A, K) denotes the sum of the K largest elements of the set A, and W_a and W_b are approximate substitutes for the class centers of any two different classes. The purpose of L_G2 is to find the K pairs of nearest class centers in the whole training set and compute the sum of their distances; the nearest pairs are exactly those with the largest cosine similarities. Compared with non-adjacent class centers, the classes of adjacent centers are more likely to have small margins between them or to overlap. If all adjacent classes are properly separated, non-adjacent classes will be separated by even larger margins, so there is no need to consider all center pairs; the most efficient approach is to optimize the distances of all adjacent centers. Here K is set to P, the number of classes, because when all class centers are arranged in a circle on the hypersphere, the minimum number of adjacent center pairs is P.
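A sketch of the inter-class term of step d). The Top-K sum is reconstructed from the definition above (the nearest center pairs are those with the largest cosine similarities); the function and variable names are illustrative:

```python
import torch
import torch.nn.functional as F

def inter_class_loss(W, K=None):
    """L_G2: sum of the K largest cosine similarities between distinct class centers.

    W: (d, P) last-layer weights, with W_j approximating the class center c_j.
    K defaults to P, as chosen in step d).
    """
    W_norm = F.normalize(W, dim=0)
    cos = W_norm.t() @ W_norm                                 # (P, P) pairwise cos(W_a, W_b)
    P = cos.size(0)
    iu = torch.triu_indices(P, P, offset=1, device=W.device)  # indices of the pairs a < b
    pair_cos = cos[iu[0], iu[1]]                              # all P*(P-1)/2 similarities
    K = P if K is None else K
    top = torch.topk(pair_cos, k=min(K, pair_cos.numel())).values
    return top.sum()    # minimizing this pushes the nearest centers apart
```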
e) Integrating the two lightweight versions proposed in steps b) and d) creates the standard version of the cosine optimal loss function, L_G, which combines L_G1 and L_G2.
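Putting the pieces together according to steps a) and e), reusing the helper sketches above. Reading the integration in step e) as a plain sum of the two lightweight terms, along with the values of λ and β, is an assumption:

```python
def cosine_optimal_total_loss(features, labels, W, R, lam=0.1, beta=0.99):
    """Total loss L = L_AM + lambda * L_G, with L_G assumed to be L_G1 + L_G2."""
    L_G1, R = range_update_and_intra_loss(R, features, labels, W, beta)
    L_G2 = inter_class_loss(W)
    return am_softmax_loss(features, labels, W) + lam * (L_G1 + L_G2), R
```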
The cosine optimal loss function combines the advantages of the best loss functions proposed for face recognition in recent years, and makes the first attempt to use global information as feedback for face recognition. It employs a new algorithm to learn the cosine similarity between class centers and class edges. The proposed cosine optimal loss function was evaluated in extensive experiments on the LFW, SLLFW, and YTF datasets; the results demonstrate its effectiveness and show that it achieves state-of-the-art performance.
Claims (6)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211442334.5A | 2022-11-16 | 2022-11-16 | Optimization method of cosine optimal loss function based on global information |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN115761851A | 2023-03-07 |
| CN115761851B | 2025-07-11 |
Family
ID=85372857
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211442334.5A | Optimization method of cosine optimal loss function based on global information | 2022-11-16 | 2022-11-16 |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN115761851B (en) |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190279091A1 (en) * | 2018-03-12 | 2019-09-12 | Carnegie Mellon University | Discriminative Cosine Embedding in Machine Learning |
| CN110598603A (en) * | 2019-09-02 | 2019-12-20 | 深圳力维智联技术有限公司 | Face recognition model acquisition method, device, equipment and medium |
| CN113052261A (en) * | 2021-04-22 | 2021-06-29 | 东南大学 | Image classification loss function design method based on cosine space optimization |
| CN114627533A (en) * | 2022-03-10 | 2022-06-14 | 厦门熵基科技有限公司 | Face recognition method, face recognition device, face recognition equipment and computer-readable storage medium |
Non-Patent Citations (1)
| Title |
|---|
| XU JIANFENG; HE YUFAN; LIU LAN: "Research on the Relationship and Reasoning of Cost Objective Functions in Three-Way Decisions", Computer Science, 9 July 2018 (2018-07-09) * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN115761851B | 2025-07-11 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | | |
| SE01 | Entry into force of request for substantive examination | | |
| GR01 | Patent grant | | |