CN114077755B - Controllable and lightweight federated learning method, system and detection method for privacy protection - Google Patents


Info

Publication number
CN114077755B
Authority
CN
China
Prior art keywords
model
subnet
global
data
snapshot
Prior art date
Legal status
Active
Application number
CN202210057267.9A
Other languages
Chinese (zh)
Other versions
CN114077755A (en)
Inventor
孙知信
徐玉华
赵学健
孙哲
胡冰
宫婧
汪胡青
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202210057267.9A
Publication of CN114077755A
Application granted
Publication of CN114077755B
Legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/60: Protecting data
    • G06F 21/62: Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F 21/6218: Protecting access to data via a platform to a system of files or objects, e.g. local or distributed file system or database
    • G06F 21/6245: Protecting personal data, e.g. for financial or medical purposes
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods


Abstract

The invention discloses a privacy-preserving, controllable and lightweight federated learning method, system and detection method. Each subnet trains the initial model on its local data and computes a compressed snapshot of the local model according to an adjustment factor. The compressed model of the optimal node is taken as the baseline, and a first round of model aggregation and compression is performed according to the snapshots and a set channel recovery threshold to form a global compressed model. The subnets then train the global compressed model on local data, the trained subnet models are aggregated directly, and the process iterates until convergence. The training-set data of the subnets is never collected or stored, so the privacy of local data is guaranteed, while controllable lightweight federated learning reduces both the network communication load and the local computation load.

Description

Controllable and lightweight federated learning method, system and detection method for privacy protection

Technical Field

The invention relates to network security fields such as network traffic detection and network privacy protection, and to big data fields such as federated learning and data model compression; in particular, it relates to a privacy-preserving, controllable and lightweight federated learning method, system and detection method.

Background

With the continuous expansion of network scale, network traffic keeps increasing, and distributed network traffic detection technology keeps advancing. The training data of different subnets are often not independent and identically distributed, and differences among local datasets may lead to poorly trained local and global models, producing large numbers of false alarms. Distributed traffic detection with multiple cooperating detectors is therefore necessary. Cross-domain collaborative detection may require detailed network data from every involved domain, but exchanging network data directly leaks network privacy. Research teams have already applied federated learning to network traffic detection, exchanging model parameters instead of the data itself to protect local network privacy, for example the published patents "Traffic identification method and device based on federated learning" (Chinese Patent Publication No. CN111970277A) and "A traffic classification method and system based on federated learning" (Chinese Patent Announcement No. CN111865815B). However, these studies aggregate the local model parameters directly without any lightweight processing. Deep neural network models are large, and transmitting massive numbers of model parameters directly imposes a heavy network burden and limits the scalability of distributed collaborative traffic detection.

Some teams have also studied model compression in federated learning, for example the patent "Target detection method, apparatus and device based on federated learning" (Chinese Patent Publication No. CN112257774A). That approach, however, first collects the data of each distributed network at a global server in order to compress the model, and this very process may leak the data privacy of each subnet.

Summary of the Invention

In view of the technical problem in the prior art that the data privacy of each subnet may be leaked, the present invention provides a privacy-preserving, controllable and lightweight federated learning method, system and detection method.

To achieve the above object, the present invention provides the following technical solutions:

In a first aspect, the present invention provides a privacy-preserving, controllable and lightweight federated learning method, comprising: each subnet node trains a local model based on a local training set and a preset model and parameters; using the set adjustment factors, the output channels to be pruned in each layer of the local model are computed and a compressed snapshot of the local model is generated;

the optimal subnet node is determined according to the amount of data in each subnet node's training set; taking the compressed snapshot of the optimal subnet node as the baseline snapshot, the global snapshot is determined from the baseline snapshot, a channel recovery threshold, and the compressed snapshots of the subnet nodes other than the optimal one;

the models of the subnet nodes are aggregated with weights to form an aggregated global model; according to the global snapshot, the output channels of each layer of the aggregated global model are pruned to form a global compressed model;

each subnet node trains the global compressed model on its local training set, until the weighted aggregation of the models trained by the subnet nodes converges, yielding the final aggregated model.

Further, the model is a convolutional neural network model.

Still further, computing the output channels to be pruned in each layer of the local model using the set adjustment factors comprises determining the c-th output channel of the i-th layer as a channel to be pruned when it satisfies both of the following conditions (the original formula images are not recoverable; the expressions below are reconstructed from the symbol definitions in the text):

$$APZ_c^i > \alpha \cdot \overline{APZ}^i, \qquad \bar{a}_c^i < \beta \cdot \bar{a}^i$$

where $\bar{a}_c^i$ denotes the average feature-map value output through the activation function by the c-th output channel of the i-th layer of the model; $\bar{a}^i$ denotes the mean of the average feature-map values output through the activation function by all output channels of the i-th layer; $\alpha$ and $\beta$ are adjustment factors, both greater than 0; $APZ_c^i$ denotes the average zero-activation percentage of the c-th output channel of the i-th layer; and $\overline{APZ}^i$ is the mean of the average zero-activation percentages of all output channels of the i-th layer.

Still further, the compressed snapshot records, for each output channel to be pruned, the index of the neural network layer it belongs to and the ID number of the output channel.

Further, taking the compressed snapshot of the optimal subnet node as the baseline snapshot and determining the global snapshot from the baseline snapshot, the channel recovery threshold and the compressed snapshots of the subnet nodes other than the optimal one comprises: scanning all output channels marked for pruning in the baseline snapshot against the compressed snapshots of the other subnet nodes, and checking whether each such channel also appears in those snapshots;

when an output channel does not appear in the compressed snapshot of some other subnet node, that subnet node is recorded; when the number of recorded subnet nodes exceeds the set output-channel recovery threshold, the channel is deleted from the baseline snapshot; the result is the global snapshot.
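The scan-and-recover procedure above can be sketched as follows. This is a minimal illustration, not the claimed implementation: a snapshot is modeled as a dict mapping layer index to the set of output-channel IDs to prune, and all names are assumptions.

```python
def build_global_snapshot(baseline, others, recovery_threshold):
    """Derive the global snapshot from the baseline snapshot (the
    optimal node's) and the other nodes' compressed snapshots.

    A channel stays pruned only if the number of other nodes that do
    NOT also prune it is at or below the recovery threshold; otherwise
    the channel is removed from the baseline, i.e. "recovered"."""
    global_snapshot = {}
    for layer, channels in baseline.items():
        kept = set()
        for ch in channels:
            # Count nodes whose snapshot does not mark this channel.
            missing = sum(1 for snap in others
                          if ch not in snap.get(layer, set()))
            if missing <= recovery_threshold:
                kept.add(ch)  # consensus: keep it pruned
        if kept:
            global_snapshot[layer] = kept
    return global_snapshot
```

For example, with baseline {0: {1, 2}}, two other nodes that only prune channel 1, and a threshold of 1, channel 2 is recovered and only channel 1 remains pruned.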

Further, the optimal subnet node is determined by the following formula (the original formula image is not recoverable; the expression below is one reconstruction consistent with the text, which favours a large data share and a small imbalance):

$$s^{*} = \arg\max_i \frac{\rho_i}{D_i}$$

where $D_i$ is the data imbalance degree of the model of the i-th subnet node, $\rho_i$ is the proportion of the i-th subnet node's local training-set data volume in all training data, and $s^{*}$ is the optimal subnet node.

Further, the method for judging whether the weighted-aggregation model has converged comprises: determining the loss functions of the models trained by the subnet nodes, together with their mean and standard deviation;

when both the mean of the subnet losses and their standard deviation are less than or equal to the set threshold, the weighted aggregation of the models trained by the subnet controllers is judged to have converged.
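The convergence test can be sketched as below. The original wording is ambiguous about how the per-subnet losses are summarized, so applying a single scalar threshold to both the mean loss and its population standard deviation is an assumption.

```python
import statistics

def has_converged(subnet_losses, threshold):
    """Declare convergence when both the mean of the subnet models'
    losses and the spread across subnets are at or below the given
    threshold (population standard deviation; an assumption)."""
    mean_loss = statistics.mean(subnet_losses)
    spread = statistics.pstdev(subnet_losses)
    return mean_loss <= threshold and spread <= threshold
```

Uniformly small losses pass; one outlier subnet with a large loss keeps the loop running.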

In a second aspect, the present invention provides a privacy-preserving, controllable and lightweight federated learning system, comprising a data layer, a subnet control layer and a global control layer;

the data layer is used by each subnet for data-forwarding communication;

the subnet control layer is provided with multiple subnet controllers, and the global control layer is provided with a global controller;

the global controller is used to transmit the preset model and parameters, and the adjustment factors required for model compression, to all subnet controllers;

the subnet controller is used to collect data and extract features to form a local training set; to receive the model, parameters and adjustment factors transmitted by the global controller; to train the local model using the local training set, model and parameters; and, using the set adjustment factors, to compute the output channels to be pruned in each layer of the local model and generate a compressed snapshot of the local model; each subnet controller transmits the amount of data in its local training set, the model and the compressed snapshot to the global controller;

the global controller determines the optimal subnet controller according to the amount of training-set data reported by each subnet controller; taking the compressed snapshot of the optimal subnet controller as the baseline snapshot, it determines the global snapshot from the baseline snapshot, the channel recovery threshold and the compressed snapshots generated by the other subnet controllers;

the global controller aggregates the models trained by the subnet controllers with weights to form an aggregated global model; according to the global snapshot, the output channels of each layer of the aggregated global model are pruned to form a global compressed model.

Further, the subnet controller computing, using the set adjustment factors, the output channels to be pruned in each layer of the local model comprises determining the c-th output channel of the i-th layer as a channel to be pruned when it satisfies both of the following conditions (reconstructed as above from the symbol definitions; the original formula images are not recoverable):

$$APZ_c^i > \alpha \cdot \overline{APZ}^i, \qquad \bar{a}_c^i < \beta \cdot \bar{a}^i$$

where $\bar{a}_c^i$ denotes the average feature-map value output through the activation function by the c-th output channel of the i-th layer of the model; $\bar{a}^i$ denotes the mean of the average feature-map values output through the activation function by all output channels of the i-th layer; $\alpha$ and $\beta$ are adjustment factors, both greater than 0; $APZ_c^i$ denotes the average zero-activation percentage of the c-th output channel of the i-th layer; and $\overline{APZ}^i$ is the mean of the average zero-activation percentages of all output channels of the i-th layer.

The present invention also provides a privacy-preserving, controllable and lightweight federated learning detection method, in which a model is obtained using the privacy-preserving, controllable and lightweight federated learning method provided by any possible implementation of the technical solution of the first aspect;

the collected network traffic data is input, and the finally obtained model is used to perform traffic detection.

The present invention achieves the following beneficial technical effects: no subnet data needs to be collected; instead, each subnet node performs model training and compression locally, and the compressed models of the subnet nodes are then processed and aggregated to form a global compressed model that reflects the characteristics of the global data.

Moreover, during this process the system can control the scale of the global compressed model through parameter adjustment, achieving controllable lightweight federated learning, and the global controller never collects or holds specific private traffic information.

Local traffic is detected with the finally trained global compressed model, and the local training data never has to be transmitted throughout the entire process, protecting the privacy of the local data in the distributed network.

Brief Description of the Drawings

Fig. 1 is a schematic diagram of the framework of the privacy-preserving, controllable and lightweight federated learning system provided by a specific embodiment;

Fig. 2 is a schematic flowchart of the privacy-preserving, controllable and lightweight federated learning method provided by a specific embodiment.

Detailed Description

The present invention is further described below with reference to the accompanying drawings and specific embodiments.

Embodiment 1: a privacy-preserving, controllable and lightweight federated learning method, whose flow is shown in Fig. 2, comprising:

each subnet node trains a local model based on its local training set and the model and parameters preset by the global node; using the set adjustment factors, the output channels to be pruned in each layer of the local model are computed and a compressed snapshot of the local model is generated;

the global node determines the optimal subnet node according to the amount of data in each subnet node's training set; taking the compressed snapshot of the optimal subnet node as the baseline snapshot, it determines the global snapshot from the baseline snapshot, the channel recovery threshold and the compressed snapshots of the other subnet nodes;

the global node aggregates the models of the subnet nodes with weights to form an aggregated global model; according to the global snapshot, the output channels of each layer of the aggregated global model are pruned to form a global compressed model;

each subnet node trains the global compressed model on its local training set, until the weighted aggregation of the models trained by the subnet nodes converges, yielding the final aggregated model.

The switches of the subnet nodes forward data for communication. The global node transmits and deploys to all subnet nodes the initialized convolutional neural network deep learning model, its parameters, and the adjustment factors required for model compression. Each subnet node trains the initial model on its local training data and computes a compressed snapshot of the local model according to the adjustment factors. The global node selects the optimal subnet node among the subnet nodes and takes its compressed snapshot as the baseline snapshot; it then determines the global snapshot from the baseline snapshot, the channel recovery threshold and the compressed snapshots of the subnet nodes;

the global node aggregates the models of the subnet nodes with weights to form an aggregated global model; according to the global snapshot and the channel recovery threshold, the output channels of each layer of the aggregated global model are pruned to form a global compressed model;

the subnet nodes then train the global compressed model on local data, the global node aggregates the trained subnet models directly, and the process iterates until the aggregated model converges. Since the training-set data of the subnets is never collected or stored, the privacy of local data is guaranteed, and controllable lightweight federated learning reduces the network communication burden and the local computation load. The federated learning method provided by this embodiment specifically comprises the following steps:
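The direct weighted aggregation of the trained subnet models can be sketched as follows. The patent only states that models are aggregated with weights, so FedAvg-style weighting by each node's data share is an assumption, and models are flattened to parameter lists for brevity.

```python
def weighted_aggregate(models, weights):
    """Aggregate subnet models into a global model.

    models  : list of models, each a flat list of parameter values
    weights : per-node aggregation weights, e.g. each node's share
              of the total training data (assumed)
    Returns the weighted average of the parameters.
    """
    total = sum(weights)
    n_params = len(models[0])
    return [
        sum(w * m[p] for m, w in zip(models, weights)) / total
        for p in range(n_params)
    ]
```

For example, two equally weighted nodes with parameters [1.0, 2.0] and [3.0, 4.0] aggregate to [2.0, 3.0].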

Step 1: the global node predetermines the initialized deep learning model, its parameters, and the compression parameters that control the model scale, i.e. the adjustment factors.

Step 2: the subnet nodes collect data through their local switches and extract features to form local training sets. Each subnet node trains the initialized model on its local training set. Before the first round of model-aggregation communication, to reduce redundant model parameters and improve communication efficiency, each subnet node computes, according to the preset adjustment factors, the output channels to be pruned in each layer of the locally trained deep learning model, and generates a compressed snapshot that records the specific pruned channels of that subnet's model.

In this embodiment, the model is a convolutional neural network.

In step 2, each subnet node computes the output channels to be pruned and forms the compressed snapshot as follows:

A convolutional neural network uses multiple convolutional layers; each convolutional layer contains multiple convolution kernels, i.e. multiple output channels. The input training-instance data is convolved with the kernels of a convolutional layer, and the resulting values are passed through the ReLU (Rectified Linear Units) activation function to output multiple feature-map values. Controllable compression is performed by comparing each output channel's average post-activation feature-map value and average zero-activation percentage against the corresponding layer-wise means, which serve as thresholds.

The Average Percentage of Zeros (APZ) is defined to measure the percentage of zero activations of each layer's output-channel neurons after the ReLU mapping. Let $O_{c,j}^{(i,k)}$ denote the $j$-th feature-map value output by the $c$-th output channel of the $i$-th layer for the $k$-th training instance after the ReLU function. The average zero-activation percentage $APZ_c^i$ of the $c$-th output channel of the $i$-th layer is then expressed as (reconstructed from the symbol definitions; the original formula image is not recoverable):

$$APZ_c^i = \frac{\sum_{k=1}^{N}\sum_{j=1}^{M} f\!\left(O_{c,j}^{(i,k)}\right)}{N \cdot M}$$

where $f(\cdot)$ equals 1 if the ReLU-mapped value is 0 and equals 0 otherwise, $M$ is the total number of output feature-map values of the channel, and $N$ is the total number of training instances.
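The APZ statistic can be computed directly from a channel's post-ReLU outputs. Below is a minimal sketch in plain Python; the nested-list layout of the feature maps and the function name are assumptions.

```python
def apz(post_relu_maps):
    """Average Percentage of Zeros for one output channel.

    post_relu_maps : one flat list of the channel's post-ReLU
                     feature-map values per training instance
                     (N instances of M values each).
    Returns the fraction of values that are exactly zero.
    """
    total = sum(len(fmap) for fmap in post_relu_maps)
    zeros = sum(1 for fmap in post_relu_maps for v in fmap if v == 0.0)
    return zeros / total
```

A channel whose two instance maps are [0.0, 1.0] and [0.0, 0.0] has APZ 0.75.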

To judge whether the parameter redundancy of an output channel of the $i$-th layer of the convolutional neural network is excessive, its $APZ_c^i$ is compared against the layer's $\overline{APZ}^i$, the mean of the average zero-activation percentages of all output channels of the $i$-th layer, computed as follows, where $H$ is the number of output channels of the layer:

$$\overline{APZ}^i = \frac{1}{H}\sum_{c=1}^{H} APZ_c^i$$

Although the average zero-activation percentage measures the redundancy of each layer's output channels, the contribution of each output channel must also be measured. Therefore the average feature-map value of each output channel after the ReLU function is also computed. $\bar{a}_c^i$ denotes the average feature-map value output through the ReLU function by the $c$-th output channel of the $i$-th layer, expressed as:

$$\bar{a}_c^i = \frac{\sum_{k=1}^{N}\sum_{j=1}^{M} O_{c,j}^{(i,k)}}{N \cdot M}$$

where $O_{c,j}^{(i,k)}$ is the $j$-th feature-map value output by the $c$-th output channel of the $i$-th layer for the $k$-th training instance after the ReLU function. The larger $\bar{a}_c^i$ is, the larger the weight contribution of the output channel and the greater its influence on data classification.

To retain useful output channels, $\bar{a}_c^i$ is compared against the layer's $\bar{a}^i$, the mean of the average feature-map values output through the ReLU function by all output channels of the $i$-th layer, computed as:

$$\bar{a}^i = \frac{1}{H}\sum_{c=1}^{H} \bar{a}_c^i$$

When computing the output channels of the local model to be pruned, channels with a high average zero-activation percentage and a low weight contribution can be pruned; the pruning conditions are as follows, where $\alpha$ and $\beta$ are adjustment factors, both greater than 0, and adjusting their values adjusts the size of the compressed model:

$$APZ_c^i > \alpha \cdot \overline{APZ}^i, \qquad \bar{a}_c^i < \beta \cdot \bar{a}^i$$

The $c$-th output channel of the $i$-th layer that satisfies both of the above conditions is determined as a channel to be pruned.

The output channels of the locally trained model that need pruning are then recorded in a key-value snapshot, where the key records the index of the neural network layer and the value records the ID numbers of the output channels to be pruned.
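The two pruning conditions and the key-value snapshot can be put together as below. This is a sketch: alpha and beta are the adjustment factors, the strictness of the inequalities follows the reconstruction above, and the per-layer statistics are assumed precomputed.

```python
def channels_to_prune(apz_by_channel, act_by_channel, alpha, beta):
    """One layer: prune channel c when its APZ exceeds alpha times the
    layer-mean APZ AND its mean post-ReLU activation falls below beta
    times the layer-mean activation (both factors > 0)."""
    mean_apz = sum(apz_by_channel) / len(apz_by_channel)
    mean_act = sum(act_by_channel) / len(act_by_channel)
    return [
        c
        for c, (p, a) in enumerate(zip(apz_by_channel, act_by_channel))
        if p > alpha * mean_apz and a < beta * mean_act
    ]

def make_snapshot(per_layer_stats, alpha, beta):
    """Key-value snapshot: layer index -> IDs of channels to prune.

    per_layer_stats : list of (apz_by_channel, act_by_channel) pairs,
                      one pair per convolutional layer.
    """
    snapshot = {}
    for layer, (apzs, acts) in enumerate(per_layer_stats):
        pruned = channels_to_prune(apzs, acts, alpha, beta)
        if pruned:
            snapshot[layer] = pruned
    return snapshot
```

Raising alpha or lowering beta makes the conditions harder to satisfy, shrinking the pruned set, which is how the factors make the compression controllable.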

Step 3: the global node computes each subnet's training performance from the per-class data counts of each subnet's training set and selects the optimal subnet node. The compressed snapshot of the optimal subnet node is taken as the baseline snapshot and a channel recovery threshold is set; the compressed snapshots of all other subnet nodes are then scanned. When a pruned output channel in the baseline snapshot does not appear in the snapshots of some other nodes, and the number of such nodes exceeds the set channel recovery threshold, that output channel is deleted from the baseline snapshot, forming the global snapshot. Next, the global controller aggregates the training models of the subnet controllers with weights to form a global model, and then, according to the global snapshot, prunes the output channels of each layer of the aggregated global model to form the global compressed model.

In step 3, the training-set performance of each subnet is computed from the total amount of training data of the subnet nodes and their data imbalance degree, and the optimal subnet node is selected according to that performance. The specific steps are as follows:

The larger the share P_i of the i-th subnet node's local training data in all training data, and the smaller the data imbalance degree IB_i of the i-th subnet node, the higher the accuracy of that subnet node's model is likely to be. P_i and IB_i are computed as follows (the original imbalance formula is embedded as an image; the normalized mean absolute deviation below is one consistent form):

P_i = D_i / Σ_{j=1..K} D_j

IB_i = (1/n) · Σ_{j=1..n} |D_ij − D̄_i| / D̄_i

where D_i is the number of training samples of the i-th subnet node, D_j is the number of training samples of the j-th subnet node, n is the number of data classes, D_ij is the number of samples of the j-th class at the i-th subnet node, and D̄_i is the average per-class sample count at the i-th subnet node.

Therefore, the share of each node's training data in the total training data and its data imbalance degree can be combined into a performance evaluation value V_i for the i-th node's training data; V_i grows with P_i and shrinks with IB_i, and the optimal node is the one maximizing V_i. However, the compressed model structure of the optimal subnet node reflects only the training-data characteristics of its own subnet.
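One way to sketch the node selection is to combine the two quantities as V_i = P_i / (1 + IB_i). This particular combination is an assumption, since the patent's evaluation formula is embedded as an image, but it has the required behavior: V grows with the data share and shrinks with the imbalance.

```python
def select_optimal_node(class_counts):
    """class_counts: dict node_id -> list of per-class sample counts.
    Returns the node with the largest assumed evaluation value V_i."""
    total = sum(sum(c) for c in class_counts.values())
    best_id, best_v = None, -1.0
    for node, counts in class_counts.items():
        d_i = sum(counts)
        p_i = d_i / total                 # share of all training data
        mean = d_i / len(counts)          # average per-class count
        # normalized mean absolute deviation as the imbalance degree
        ib_i = sum(abs(c - mean) for c in counts) / (len(counts) * mean)
        v_i = p_i / (1.0 + ib_i)          # assumed combination of the two
        if v_i > best_v:
            best_id, best_v = node, v_i
    return best_id
```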

In step 3, model aggregation and compression are performed with the compression model of the optimal subnet node as the baseline. The specific steps are as follows:

First, the compressed snapshot of the optimal node is taken as the baseline snapshot. Every output channel marked for pruning in the baseline snapshot is scanned for in the compressed snapshots of the other subnet nodes, counting whether the pruned channel is present there. When a pruned output channel of the baseline snapshot is absent from another subnet node's compressed snapshot, that node is recorded; when the number of recorded subnet nodes exceeds the set channel recovery threshold Z (0 < Z < K, where K is the total number of subnet controller nodes), the output channel is deleted from the baseline snapshot. The result is the global snapshot. The larger the set value of Z, the fewer channels need to be deleted from the baseline snapshot.
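The baseline scan and channel-recovery step can be sketched as follows (snapshots are assumed to be dicts mapping layer index to a list of pruned-channel IDs, as in the key-value snapshot above; names are illustrative):

```python
def build_global_snapshot(baseline, other_snapshots, z):
    """Delete from the baseline every pruned channel that is absent from
    more than z of the other nodes' compressed snapshots (0 < z < K)."""
    global_snap = {}
    for layer, channels in baseline.items():
        kept = []
        for ch in channels:
            # count the other nodes whose snapshot does NOT prune this channel
            absent = sum(1 for s in other_snapshots
                         if ch not in s.get(layer, []))
            if absent <= z:          # too few nodes disagree: keep it pruned
                kept.append(ch)
        if kept:
            global_snap[layer] = kept
    return global_snap
```

A larger z keeps more channels pruned, i.e. fewer deletions from the baseline, matching the description above.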

Then the weights of the models trained by all subnet nodes are aggregated by the weighted average formula to compute the global model.

w_global^(1) = Σ_{i=1..K} ( D_i / Σ_{j=1..K} D_j ) · w_i^(1)

where D_i is the number of training samples of the i-th subnet controller and w_i^(1) denotes the model weights of the i-th subnet controller in the first aggregation round.

Next, according to the global snapshot, the output channels of each layer of the global model are pruned to form the global compressed model.
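The weighted aggregation is a FedAvg-style average weighted by each node's data count; a minimal sketch over flattened per-layer weight lists (the layout is an assumption):

```python
def aggregate(models, data_counts):
    """models: list of dicts layer_name -> list of floats (flattened weights);
    data_counts: training-sample count D_i of each subnet node.
    Returns the data-count-weighted average of the models."""
    total = sum(data_counts)
    agg = {}
    for name in models[0]:
        agg[name] = [
            sum(d / total * m[name][j] for m, d in zip(models, data_counts))
            for j in range(len(models[0][name]))
        ]
    return agg
```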

The system administrator can therefore control the scale of the global compressed model through the set adjustment factors and channel recovery threshold.

Step 4: Each subnet node trains the global compressed model on its local data. After the prescribed number of training iterations, each subnet computes the mean μ_j^k and the standard deviation σ_j^k of the loss values of its t training iterations:

μ_j^k = (1/t) · Σ_{i=1..t} L_i

σ_j^k = sqrt( (1/t) · Σ_{i=1..t} (L_i − μ_j^k)² )

where σ_j^k is the standard deviation of the loss values of the t training iterations of the k-th subnet controller in round j, L_i is the loss of the i-th iteration, and μ_j^k is the mean loss of the t training iterations in round j.

Step 5: The models trained by the subnet nodes are aggregated by weighted averaging; then the loss sum S of the subnet models in this round of aggregated communication and the average σ̄ of their loss standard deviations are computed as follows, where K is the number of subnet nodes:

S = Σ_{k=1..K} μ_j^k

σ̄ = (1/K) · Σ_{k=1..K} σ_j^k

If the loss sum S of the subnet models and the average σ̄ of the loss standard deviations are both less than or equal to their set thresholds, the aggregated global model has converged; otherwise the subnet nodes repeat step 4.
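The round-level convergence test can be sketched as follows (threshold names are illustrative; the population standard deviation matches the 1/t formula above):

```python
import statistics

def converged(loss_histories, s_threshold, sigma_threshold):
    """loss_histories: per-subnet lists of the t loss values of this round.
    Converged when both the sum S of the per-subnet loss means and the
    average of the per-subnet loss standard deviations are within threshold."""
    means = [statistics.fmean(h) for h in loss_histories]
    stds = [statistics.pstdev(h) for h in loss_histories]  # population std, 1/t
    s = sum(means)
    sigma_bar = sum(stds) / len(stds)
    return s <= s_threshold and sigma_bar <= sigma_threshold
```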

Optionally, step 6 is included: each subnet node fine-tunes the converged global model with local data to obtain the latest fine-tuned model.

Embodiment 2: Corresponding to Embodiment 1, this embodiment provides a privacy-preserving, controllable and lightweight federated learning system. As shown in Figure 1, the framework is divided into a data layer, a subnet control layer, and a global control layer. At the data layer, the switches of each subnet forward traffic. The subnet-layer controllers manage the local switches and detect local data (network traffic data in this embodiment); they do not communicate with one another, and no subnet controller transmits local data outward, which prevents privacy leakage. At the global layer, the global controller transmits to all subnet controllers, and deploys, the initialized traffic-detection convolutional neural network deep learning model, its parameters, and the adjustment factors required for model compression. Each subnet controller trains the initial model on its local traffic training data and computes a local model compression snapshot from the adjustment factors. The global controller selects the compressed snapshot of the optimal node as the baseline snapshot; it performs the first round of model aggregation and compression according to the baseline snapshot and the set channel recovery threshold to form the global compressed model, and sends the global compressed model to the subnet controllers. The subnet controllers then train the global compressed model on local data, the global controller aggregates directly, and the process iterates until convergence. Each subnet controller uses the converged model for local detection. Since the global controller never collects or stores any subnet's training-set data, the privacy of local data is guaranteed, while controllable lightweight federated learning reduces the network communication burden and the local computing load. This embodiment comprises the following:

1) Each subnet controller registers with the global controller, which manages the subnet controllers in a unified way; the global controller then transmits to all subnet controllers, and deploys, the initialized traffic-detection deep learning model parameters and the compression parameters that control the model scale.

2) Each subnet controller collects traffic data through its local switches and performs feature extraction to form a local traffic-detection training set, on which it trains the initialized detection model. Before the first round of model-aggregation communication, in order to reduce redundant model parameters and improve communication efficiency, the subnet controller uses the received compression parameters to compute the output channels to be pruned in each layer of the locally trained deep learning model, and generates a compressed snapshot recording the specific pruned output channels of its model. The local controller then transmits the per-class counts of its local training data, its trained model, and its pruning snapshot to the global controller.

The method by which each subnet controller computes the output channels to be pruned and forms a compressed snapshot is as follows. A convolutional neural network uses multiple convolutional layers; each convolutional layer contains multiple convolution kernels, i.e. multiple output channels. The input training-instance data is convolved with the kernels of a convolutional layer, and the resulting values pass through the ReLU (Rectified Linear Units) activation function to output multiple feature map values. Each output channel is compressed controllably by using as thresholds the layer means of the average feature map value and of the average zero-activation percentage output through the activation function.

The average percentage of zeros (APZ) is defined to measure the fraction of zero activations of each layer's channels after the ReLU mapping. Let O(i, c) denote the output of the c-th output channel of the i-th layer after the ReLU function. The average zero-activation percentage of the c-th channel of the i-th layer, APZ(i, c), is expressed by the following formula:

APZ(i, c) = ( Σ_{k=1..N} Σ_{j=1..M} f( O_j^k(i, c) = 0 ) ) / (N · M)

where O_j^k(i, c) is the j-th feature map value output, after the ReLU function, by the c-th output channel of the i-th layer of the model for the k-th training instance; f(·) equals 1 when the ReLU output is 0 and equals 0 otherwise; M is the total number of output feature map values of O(i, c); and N is the total number of training instances.

To judge whether the parameter redundancy of an output channel of the i-th layer of the convolutional neural network is excessive, its APZ is compared against APZ_avg(i), the mean of the average zero-activation percentages of all output channels of the i-th layer, computed as follows, where H is the number of channels in the layer:

APZ_avg(i) = (1/H) · Σ_{c=1..H} APZ(i, c)

Although the average percentage of zeros measures the redundancy of each layer's channels, the contribution of each channel must also be measured. Therefore the average feature map value of each output channel after the ReLU function is also computed.

A(i, c) denotes the average feature map value output through the ReLU function by the c-th output channel of the i-th layer of the model, expressed as follows:

A(i, c) = ( Σ_{k=1..N} Σ_{j=1..M} O_j^k(i, c) ) / (N · M)

where O_j^k(i, c) is the j-th feature map value output, after the ReLU function, by the c-th output channel of the i-th layer of the model for the k-th training instance.

The larger A(i, c) is, the greater the weight contribution of that output channel, and the greater its influence on traffic classification.

To preserve useful output channels, A(i, c) is compared against A_avg(i), the mean of the average feature map values output through the ReLU function by all output channels of the i-th layer, computed as follows:

A_avg(i) = (1/H) · Σ_{c=1..H} A(i, c)
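Under the assumed layout of one channel's ReLU outputs as an N × M nested list, the two per-channel statistics above can be computed directly (the function name is illustrative):

```python
def channel_stats(relu_out):
    """relu_out: N x M nested lists of one channel's ReLU outputs
    (N training instances, M feature-map values each).
    Returns (APZ, average feature-map value) for that channel."""
    n = len(relu_out)
    m = len(relu_out[0])
    zeros = sum(1 for inst in relu_out for v in inst if v == 0)   # f(.) = 1 on zeros
    total = sum(v for inst in relu_out for v in inst)
    return zeros / (n * m), total / (n * m)
```

Averaging these values over the H channels of a layer yields the thresholds APZ_avg(i) and A_avg(i) used in the pruning conditions below.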

When computing the output channels of the local model to be pruned, channels whose average percentage of zeros is high and whose weight contribution is low are pruned. The pruning conditions are the following formulas, where α and β are adjustment factors, both greater than 0; adjusting the values of α and β adjusts the size of the compressed model:

APZ(i, c) > α · APZ_avg(i)

A(i, c) < β · A_avg(i)

The c-th output channel of the i-th layer that satisfies both of the above formulas simultaneously is determined to be a channel that needs pruning.

Then a key-value snapshot records the output channels of the locally trained model that need pruning: the key records the neural network layer index, and the value records the ID numbers of the output channels to be pruned. The subnet controller then transmits the pruning snapshot, the trained model's parameter matrix, the per-class counts of its local training data, and the post-training loss value to the global controller.

3) In the first round of aggregated communication, after the global controller receives the trained model, the snapshot, and the training-data count parameters from each subnet controller, it computes the training performance of each subnet from the per-class counts of each subnet's training set and selects the optimal subnet controller. Taking the compressed snapshot of the optimal subnet controller as the baseline and setting the output-channel recovery threshold, it scans the compressed snapshots of all subnets except the optimal one: when a pruned output channel in the baseline snapshot is absent from the snapshots of other nodes, and the number of such nodes exceeds the set channel recovery threshold, that output channel is deleted from the baseline snapshot, forming the global snapshot. Next, the global controller performs weighted aggregation of the subnet controllers' trained models to form the global model, prunes the output channels of each layer of the aggregated global model according to the global snapshot to form the global compressed model, and finally sends the global compressed model to each subnet controller.

In 3), the global controller receives the total data amounts and data imbalance degrees of all subnet controller nodes, computes the training-set performance of each subnet, and selects the optimal subnet controller node according to that performance. The specific steps are as follows:

The larger the share P_i of a subnet controller node's training data in all training data, and the smaller the data imbalance degree IB_i of the i-th subnet node, the higher the accuracy of that node's subnet model is likely to be. P_i and IB_i are computed as follows (the original imbalance formula is embedded as an image; the normalized mean absolute deviation below is one consistent form):

P_i = D_i / Σ_{j=1..K} D_j

IB_i = (1/n) · Σ_{j=1..n} |D_ij − D̄_i| / D̄_i

where D_i is the number of training samples of the i-th subnet controller; n is the number of data classes; D_ij is the number of samples of the j-th class at the i-th subnet controller; and D̄_i is the average per-class sample count at the i-th subnet controller.

Therefore, the global controller can combine each node's share of the total training data with its data imbalance degree into a performance evaluation value V_i for the i-th node's training data; V_i grows with P_i and shrinks with IB_i, and the optimal node is the one maximizing V_i. However, the subnet compression model structure of the optimal node reflects only the training-data characteristics of its corresponding subnet.

The global controller performs model aggregation and compression with the compression model of the optimal subnet controller as the baseline. The specific steps are as follows:

First, the global controller takes the snapshot of the optimal node as the baseline. Every output channel marked for pruning in the baseline snapshot is scanned for in the snapshots of the other subnet controllers, counting whether the pruned channel is present there. When a pruned output channel of the baseline snapshot is absent from another node's snapshot, that node is recorded; when the number of recorded nodes exceeds the set channel recovery threshold Z (0 < Z < K, where K is the total number of subnet controller nodes), the output channel is deleted from the baseline snapshot. The result is the global snapshot. The larger the set value of Z, the fewer channels need to be deleted from the baseline snapshot.

Then, using the weighted average formula, the global controller aggregates the weights of the models trained by all subnet controllers to compute the global model.

w_global^(1) = Σ_{i=1..K} ( D_i / Σ_{j=1..K} D_j ) · w_i^(1)

where D_i is the number of training samples of the i-th subnet controller and w_i^(1) denotes the model weights of the i-th subnet controller in the first aggregation round.

Next, according to the global snapshot, the output channels of each layer of the global model are pruned to form the global compressed model, which is sent to each subnet controller.

The system administrator can therefore control the scale of the global compressed model by setting the compression parameters and the channel recovery threshold.

4) Each subnet controller trains the global compressed model on its local data. After the prescribed number of training iterations, it computes the mean μ_j^k and the standard deviation σ_j^k of the loss values of its t training iterations:

μ_j^k = (1/t) · Σ_{i=1..t} L_i

σ_j^k = sqrt( (1/t) · Σ_{i=1..t} (L_i − μ_j^k)² )

where σ_j^k is the standard deviation of the loss values of the t training iterations of the k-th subnet controller in round j, L_i is the loss of the i-th iteration, and μ_j^k is the mean loss of the t training iterations in round j. The subnet controller then sends the mean of the loss values, the standard deviation of the loss values, and the locally trained model to the global controller.

5) To bring the convergence of the subnet models into a balanced state, the global controller performs weighted aggregation of the subnet controllers' trained models, and then computes the loss sum S of the subnet training models in this round of aggregated communication and the average σ̄ of their loss standard deviations, where K is the number of subnet controllers:

S = Σ_{k=1..K} μ_j^k

σ̄ = (1/K) · Σ_{k=1..K} σ_j^k

If the loss sum S of the subnet detection models and the average σ̄ of the loss standard deviations are both less than or equal to their set thresholds, the aggregated model and a convergence notification are sent to the subnet controllers; otherwise the aggregated model and a non-convergence notification are sent to each subnet controller, and the subnet controllers repeat step 4).

Optionally, 6) is included: each subnet controller fine-tunes the global model with local data, then uses the latest fine-tuned model to detect local traffic. When abnormal traffic is detected, each subnet controller takes effective measures to mitigate the abnormal attack.

As will be appreciated by those skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in that computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the specific embodiments described, which are illustrative rather than restrictive. Under the inspiration of the present invention, those of ordinary skill in the art may devise many further forms without departing from the spirit of the present invention and the scope protected by the claims, and all of these fall within the protection of the present invention.

Claims (8)

1. A controllable lightweight federated learning method for protecting privacy is characterized by comprising the following steps:
each subnet node trains a local model based on a local training set and a model and parameters preset by the global node; each sub-network node calculates an output channel needing pruning in each layer of the local model by using a set adjusting factor and generates a compressed snapshot of the local model;
the global node determines an optimal subnet node according to the number of data in the training set of each subnet node and the data imbalance degree, takes the compressed snapshot of the optimal subnet node as a reference snapshot, and determines a global snapshot according to the reference snapshot, a channel recovery threshold and the compressed snapshots of other subnet nodes except the optimal subnet node;
the global node performs weighted aggregation on the models of the sub-network nodes to form an aggregated global model; according to the global snapshot, pruning each layer of output channels of the aggregated global model to form a global compression model;
each subnet node trains the global compression model by using a local training set until the model obtained by weighted aggregation of the subnet-node-trained models converges, and a final aggregated model is obtained;
the calculating, using the set adjustment factors, of the output channels needing pruning in each layer of the local model comprises: determining the c-th output channel of the i-th layer as an output channel to be pruned when it simultaneously satisfies the following formula:

[pruning-condition formula, shown only as an image in the original publication]

where F(i, c) represents the average feature mapping value output through the activation function by the c-th output channel of the i-th layer of the model; F̄(i) represents the average of the average feature mapping values output through the activation function by all output channels of the i-th layer; the two adjustment factors are both greater than 0; Z(i, c) represents the average zero-activation percentage of the c-th output channel of the i-th layer; and Z̄(i) is the average of the average zero-activation percentages of all output channels of the i-th layer.
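The pruning criterion of claim 1 can be illustrated with a minimal Python sketch. The exact inequality is reproduced only as an image in the original publication, so the condition below (low mean activation combined with a high zero-activation rate, scaled by the two adjustment factors) is an assumed reading, and all names are illustrative:

```python
import numpy as np

def channels_to_prune(feature_maps, lam1=0.9, lam2=1.1):
    """Select output channels of one layer to prune.

    feature_maps: array of shape (channels, H, W) holding the layer's
    post-activation outputs averaged over the local training set.
    Assumed condition: a channel is pruned when its average feature
    mapping value is below lam1 times the layer mean AND its average
    zero-activation percentage exceeds lam2 times the layer average.
    lam1 and lam2 stand in for the patent's two adjustment factors.
    """
    mean_act = feature_maps.mean(axis=(1, 2))        # F(i, c)
    layer_mean = mean_act.mean()                     # mean of F(i, c) over c
    apoz = (feature_maps == 0).mean(axis=(1, 2))     # Z(i, c)
    layer_apoz = apoz.mean()                         # mean of Z(i, c) over c
    prune = (mean_act < lam1 * layer_mean) & (apoz > lam2 * layer_apoz)
    return [c for c in range(feature_maps.shape[0]) if prune[c]]
```

Under this reading, a dead channel (all-zero activations) satisfies both parts of the condition and is selected, while a uniformly active channel is kept.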
2. The privacy-protecting controllable light-weight federal learning method as claimed in claim 1, wherein the step of determining the global snapshot according to the reference snapshot, the channel restoration threshold and the compressed snapshots of other subnet nodes except the optimal subnet node by taking the compressed snapshot of the optimal subnet node as the reference snapshot comprises the steps of: scanning all output channels needing to be pruned in the reference snapshot in the compressed snapshots of the other subnet nodes, and counting whether the pruned output channels exist in the compressed snapshots of the other subnet nodes;
when an output channel to be pruned in the reference snapshot is absent from the compressed snapshot of another subnet node, recording that subnet node; when the number of recorded subnet nodes is greater than the set channel recovery threshold, deleting that output channel from the reference snapshot; and finally obtaining the global snapshot.
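The merge rule of claim 2 reduces to a count over the other nodes' snapshots. A minimal Python sketch, assuming a snapshot is represented as a set of (layer, channel ID) pairs (an illustrative representation, not the patent's):

```python
def build_global_snapshot(reference, others, recovery_threshold):
    """reference: pruned channels of the optimal node's compressed snapshot.
    others: compressed snapshots of the remaining subnet nodes.
    A channel of the reference snapshot stays in the global snapshot only
    when the number of other nodes that did NOT also prune it is at or
    below the channel recovery threshold; otherwise the channel is
    recovered, i.e. deleted from the snapshot and not pruned globally.
    """
    global_snapshot = set()
    for channel in reference:
        not_pruned_elsewhere = sum(1 for snap in others if channel not in snap)
        if not_pruned_elsewhere <= recovery_threshold:
            global_snapshot.add(channel)
    return global_snapshot
```

Raising the recovery threshold keeps more channels pruned (a smaller global model); lowering it recovers channels that other nodes still rely on.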
3. The privacy-preserving controllable lightweight federated learning method of claim 2, wherein the compressed snapshot comprises, for each output channel needing pruning, the number of the neural-network layer it belongs to and the ID number of that output channel.
4. The privacy-preserving, controllable, lightweight federated learning method of claim 1, wherein the optimal subnet node is determined by the following formula:

[optimal-node selection formula, shown only as an image in the original publication]

where γ(i) is the data imbalance degree of the i-th subnet node, p(i) is the proportion of the data volume of the i-th subnet node's local training set in all training data, V(i) denotes the performance evaluation value of the training data of the i-th node, and the result is the optimal subnet node.
5. The privacy-preserving, controllable, lightweight federated learning method of claim 4, wherein the proportion p(i) of the data volume of the i-th subnet node's local training set in all training data and the data imbalance degree γ(i) of the i-th subnet node are calculated as follows:

[formulas for p(i) and γ(i), shown only as images in the original publication]

where D(i) is the amount of training data of the i-th subnet controller, D(j) is the amount of training data of the j-th subnet controller, n is the number of data classes, K is the number of subnet controller nodes, D(i, j) is the number of class-j data at the i-th subnet controller, and D̄(i) is the average number of data of each class at the i-th subnet controller.
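The two quantities of claim 5 can be sketched in Python. Both formulas are shown only as images in the original publication, so the mean-absolute-deviation form of the imbalance below is an illustrative choice, not necessarily the patented one:

```python
def data_stats(class_counts_per_node):
    """class_counts_per_node[i][j]: number of class-j samples at node i.
    Returns (p, gamma): p[i] is node i's share of all training data;
    gamma[i] measures how far the node's per-class counts deviate from
    the node's per-class average (assumed form of the imbalance degree).
    """
    totals = [sum(counts) for counts in class_counts_per_node]
    grand_total = sum(totals)
    p = [t / grand_total for t in totals]
    gamma = []
    for counts in class_counts_per_node:
        avg = sum(counts) / len(counts)   # average count per class at the node
        gamma.append(sum(abs(c - avg) for c in counts) / len(counts))
    return p, gamma
```

A node holding data in perfectly equal class proportions gets an imbalance of 0, which, all else being equal, favors it in the optimal-node selection of claim 4.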
6. The privacy-preserving controllable lightweight federated learning method of claim 1, wherein the method of determining whether the weighted-aggregated model has converged comprises: determining the mean and the standard deviation of the loss functions of the models trained by the subnet nodes;
if the mean and the standard deviation of the loss functions of the models trained by the subnet nodes are less than or equal to a set threshold, the model obtained by weighted aggregation of the models trained by the subnet controllers is determined to have converged.
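The convergence test of claim 6 can be read as thresholding both the mean and the standard deviation of the per-node losses; a minimal Python sketch of that reading (the claim wording also admits thresholding their average):

```python
import statistics

def has_converged(node_losses, threshold):
    """node_losses: latest training loss reported by each subnet node.
    Assumed reading of claim 6: converged when the mean loss and the
    (population) standard deviation are both at or below the threshold,
    i.e. losses are uniformly low across nodes, not merely low on average.
    """
    mean_loss = statistics.mean(node_losses)
    std_loss = statistics.pstdev(node_losses)
    return mean_loss <= threshold and std_loss <= threshold
```

Including the standard deviation prevents declaring convergence when a few nodes still have high loss that a low average would mask.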
7. The controllable light-weight federal learning system for protecting privacy is characterized by comprising a data layer, a subnet control layer and a global control layer;
the data layer is used for carrying out data forwarding communication on each subnet;
the subnet control layer is provided with a plurality of subnet controllers, and the global control layer is provided with a global controller;
the global controller is used for transmitting to all the subnet controllers the preset model and parameters, as well as the adjustment factors required for model compression;
the subnet controller is used for acquiring data and extracting features to form a local training set; receiving the model, the parameters and the adjustment factors transmitted by the global controller; training a local model by using a local training set, a model and parameters, calculating output channels needing pruning in each layer of the local model by using set adjustment factors, and generating a compressed snapshot of the local model; each subnet controller transmits the number of data in the local training set, the model and the compressed snapshot to a global controller;
the global controller determines an optimal subnet controller according to the number of data in the training set obtained by each subnet controller and the data imbalance degree, takes the compressed snapshot of the optimal subnet controller as a reference snapshot, and determines the global snapshot according to the reference snapshot, the channel recovery threshold, and the compressed snapshots generated by the subnet controllers other than the optimal subnet controller;
the global controller carries out weighted aggregation on the models trained by the sub-network controllers to form an aggregated global model; according to the global snapshot, pruning each layer of output channels of the aggregated global model to form a global compression model;
the subnet controller calculates, using the set adjustment factors, the output channels needing pruning in each layer of the local model, which comprises: determining the c-th output channel of the i-th layer as a channel to be pruned when it simultaneously satisfies the following formula:

[pruning-condition formula, shown only as an image in the original publication]

where F(i, c) represents the average feature mapping value output through the activation function by the c-th output channel of the i-th layer of the model; F̄(i) represents the average of the average feature mapping values output through the activation function by all output channels of the i-th layer; the two adjustment factors are both greater than 0; Z(i, c) represents the average zero-activation percentage of the c-th output channel of the i-th layer; and Z̄(i) is the average of the average zero-activation percentages of all output channels of the i-th layer.
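The weighted aggregation performed by the global controller is a FedAvg-style average. A minimal Python sketch, assuming each model is a dict of equally shaped numpy arrays (an illustrative representation, not the patent's):

```python
def weighted_aggregate(models, data_counts):
    """models: per-node parameter dicts (name -> numpy array).
    data_counts: number of local training samples per node.
    Each parameter of the aggregated model is the average of the nodes'
    parameters, weighted by each node's share of all training data.
    """
    total = sum(data_counts)
    weights = [n / total for n in data_counts]
    aggregated = {}
    for name in models[0]:
        aggregated[name] = sum(w * m[name] for w, m in zip(weights, models))
    return aggregated
```

The aggregated global model is then pruned per the global snapshot to yield the global compression model.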
8. A privacy-preserving controllable lightweight federated learning traffic data detection method, characterized in that a model is obtained using the privacy-preserving controllable lightweight federated learning method of any one of claims 1-6;
the collected network traffic data are input, and traffic detection is performed using the finally obtained model.
CN202210057267.9A 2022-01-19 2022-01-19 Controllable and lightweight federated learning method, system and detection method for privacy protection Active CN114077755B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210057267.9A CN114077755B (en) 2022-01-19 2022-01-19 Controllable and lightweight federated learning method, system and detection method for privacy protection


Publications (2)

Publication Number Publication Date
CN114077755A CN114077755A (en) 2022-02-22
CN114077755B true CN114077755B (en) 2022-05-31

Family

ID=80284532


Country Status (1)

Country Link
CN (1) CN114077755B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114492847B (en) * 2022-04-18 2022-06-24 奥罗科技(天津)有限公司 Efficient personalized federal learning system and method
CN117592094A (en) * 2023-10-20 2024-02-23 深圳信息职业技术学院 Privacy data set processing method, device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111355739A (en) * 2020-03-06 2020-06-30 深圳前海微众银行股份有限公司 Data transmission method, device, terminal equipment and medium for horizontal federal learning
CN112070207A (en) * 2020-07-31 2020-12-11 华为技术有限公司 Model training method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200272945A1 (en) * 2019-02-21 2020-08-27 Hewlett Packard Enterprise Development Lp System and method of decentralized model building for machine learning and data privacy preserving using blockchain


Also Published As

Publication number Publication date
CN114077755A (en) 2022-02-22

Similar Documents

Publication Publication Date Title
CN114077755B (en) Controllable and lightweight federated learning method, system and detection method for privacy protection
CN114091356A (en) A federated learning method and device
CN105577685A (en) Autonomous analysis intrusion detection method and system in cloud computing environment
CN110113353A (en) A kind of intrusion detection method based on CVAE-GAN
CN113469376B (en) Blockchain-based federated learning backdoor attack defense method and device
CN112420187A (en) A medical disease analysis method based on transfer federated learning
CN106412912A (en) Node trust assessment method facing car networking
CN108632269A (en) Detecting method of distributed denial of service attacking based on C4.5 decision Tree algorithms
CN105791213A (en) A strategy optimization device and method
CN110022531A (en) A kind of localization difference privacy municipal refuse data report and privacy calculation method
CN104270372A (en) A Parameter Adaptive Network Security Situation Quantitative Evaluation Method
CN116502171B (en) Network security information dynamic detection system based on big data analysis algorithm
CN113904872A (en) A feature extraction method and system for fingerprinting attacks on anonymous service websites
CN111698241A (en) Internet of things cloud platform system, verification method and data management method
US20210051170A1 (en) Method and apparatus for determining a threat using distributed trust across a network
CN116861994A (en) A privacy-preserving federated learning method that resists Byzantine attacks
CN115834409B (en) Secure aggregation method and system for federated learning
CN116597498A (en) A Fair Face Attribute Classification Method Based on Blockchain and Federated Learning
CN119997005B (en) A satellite internet traffic security protection method and system
CN116319437A (en) Network connectivity detection method and device
Aljammal et al. Performance Evaluation of Machine Learning Approaches in Detecting IoT-Botnet Attacks.
CN116132081A (en) Software defined network DDOS attack cooperative defense method based on ensemble learning
KR102561702B1 (en) Method and apparatus for monitoring fault of system
TWI797808B (en) Machine learning system and method
Chen et al. Edge-based protection against malicious poisoning for distributed federated learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant