
CN112364981B - Differentiable searching method and device for mixed precision neural network - Google Patents


Info

Publication number: CN112364981B
Application number: CN202011249481.1A
Authority: CN (China)
Prior art keywords: network, hyper, sub, hardware, super
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN112364981A
Inventors: 常成, 朱雪娟, 余浩, 毛伟, 代柳瑶, 李凯, 王宇航
Current assignee: Shenzhen Maitexin Technology Co ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: Southern University of Science and Technology
Events: application filed by Southern University of Science and Technology; priority to CN202011249481.1A; publication of CN112364981A; application granted; publication of CN112364981B

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a differentiable search method and device for a mixed-precision neural network. The method comprises the following steps: obtaining an initialized supernetwork, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters; updating the hyperparameters based on a differentiable search method to obtain a first supernetwork; performing hardware performance evaluation on the subnetworks included in the first supernetwork, and updating the hyperparameters of the first supernetwork according to the evaluation results to obtain a second supernetwork; and determining whether an update termination condition is satisfied: if so, the second supernetwork is determined as the target neural network; otherwise, the method returns to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork. With this method, automatic model quantization can be performed, and a neural network can be searched for and constructed for a specific hardware platform.

Description

A Differentiable Search Method and Device for a Mixed-Precision Neural Network

Technical Field

Embodiments of the present invention relate to the field of computer technology, and in particular to a differentiable search method and device for a mixed-precision neural network.

Background

Deep learning can automatically learn useful features without relying on feature engineering, and has surpassed other algorithms on tasks such as image recognition, video understanding, and natural language processing. This success is largely due to the emergence of new neural network structures such as ResNet, Inception, DenseNet, and MobileNet.

Neural Architecture Search (NAS) is a technology for automatically designing neural networks. Using algorithms, it can automatically design high-accuracy, high-performance network structures from a dataset, matching or even exceeding the level of human experts on certain tasks, and it can discover network structures that humans have not previously proposed. Compared with manually designing network structures and hyperparameters, neural architecture search can effectively reduce the cost of designing and using neural networks.

However, as the complexity of real tasks increases, larger and deeper neural network architectures need to be designed, and models need to be deployed on a wider range of hardware platforms. In addition, although traditional NAS methods based on reinforcement learning or evolutionary algorithms can design high-accuracy, high-performance networks, the search algorithms themselves consume excessive computing resources, which hinders the large-scale adoption and application of NAS.

Therefore, a neural architecture search method that searches quickly, consumes little computation and memory, performs model quantization automatically, and can search for a specific hardware platform is an urgent technical problem to be solved.

Summary of the Invention

Embodiments of the present invention provide a differentiable search method and device for a mixed-precision neural network, capable of automatic model quantization and of searching for and constructing a neural network for a specific hardware platform.

In a first aspect, an embodiment of the present invention provides a differentiable search method for a mixed-precision neural network, including:

obtaining an initialized supernetwork, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters;

updating the hyperparameters based on a differentiable search method to obtain a first supernetwork;

performing hardware performance evaluation on the subnetworks included in the first supernetwork, and updating the hyperparameters of the first supernetwork according to the evaluation results to obtain a second supernetwork; and

determining whether an update termination condition is satisfied; if so, determining the second supernetwork as the target neural network; otherwise, returning to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork.

In a second aspect, an embodiment of the present invention further provides a differentiable search device for a mixed-precision neural network, including:

an acquisition module, configured to obtain an initialized supernetwork, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters;

a first update module, configured to update the hyperparameters based on a differentiable search method to obtain a first supernetwork;

a second update module, configured to perform hardware performance evaluation on the subnetworks included in the first supernetwork, and to update the hyperparameters of the first supernetwork according to the evaluation results to obtain a second supernetwork; and

a judgment module, configured to determine whether an update termination condition is satisfied; if so, to determine the second supernetwork as the target neural network; otherwise, to return to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork.

In a third aspect, an embodiment of the present invention further provides a computer device, including:

one or more processors; and

a storage device configured to store one or more programs,

where, when the one or more programs are executed by the one or more processors, the one or more processors implement the differentiable search method for a mixed-precision neural network according to any embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the differentiable search method for a mixed-precision neural network provided in any embodiment of the present invention.

Embodiments of the present invention provide a differentiable search method and device for a mixed-precision neural network. First, an initialized supernetwork is obtained, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters; the hyperparameters are then updated based on a differentiable search method to obtain a first supernetwork; next, hardware performance evaluation is performed on the subnetworks included in the first supernetwork, and the hyperparameters of the first supernetwork are updated according to the evaluation results to obtain a second supernetwork; finally, it is determined whether an update termination condition is satisfied; if so, the second supernetwork is determined as the target neural network; otherwise, the method returns to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork. With this technical solution, automatic model quantization can be performed, and a neural network can be searched for and constructed for a specific hardware platform.

Brief Description of the Drawings

FIG. 1 is a schematic flowchart of a differentiable search method for a mixed-precision neural network provided in Embodiment 1 of the present invention;

FIG. 2 is a schematic flowchart of a differentiable search method for a mixed-precision neural network provided in Embodiment 2 of the present invention;

FIG. 3 is a schematic structural diagram of a differentiable search device for a mixed-precision neural network provided in Embodiment 3 of the present invention; and

FIG. 4 is a schematic structural diagram of a computer device provided in Embodiment 4 of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.

Before the exemplary embodiments are discussed in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations (or steps) as sequential processing, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. A process may be terminated when its operations are completed, but it may also have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, or the like. Furthermore, the embodiments of the present invention and the features in the embodiments can be combined with one another provided there is no conflict.

The term "including" and its variants as used in the present invention are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment".

Embodiment 1

FIG. 1 is a schematic flowchart of a differentiable search method for a mixed-precision neural network provided in Embodiment 1 of the present invention. The method is applicable to searching for a high-performance neural network and can be executed by a differentiable search device for a mixed-precision neural network, where the device can be implemented by software and/or hardware and is generally integrated on a computer device.

As shown in FIG. 1, the differentiable search method for a mixed-precision neural network provided in Embodiment 1 of the present invention includes the following steps:

S110. Obtain an initialized supernetwork, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters.

In this embodiment, the initialized supernetwork may be a supernetwork with initial value settings; for example, the initialized supernetwork may be obtained from the search space.

A subnetwork can be understood as a subnetwork of the above initialized supernetwork; one supernetwork may be composed of multiple subnetworks, and one subnetwork may include multiple network layers.

A subnetwork carries hyperparameters, where the hyperparameters are continuously differentiable. The hyperparameters of a subnetwork may be a space formed by multiple groups of vectors. For example, the hyperparameters of one network layer in a subnetwork may consist of two different groups of vectors: one group may consist of the influence factors of that layer's convolution kernel configuration parameters, and the other group may consist of the influence factors of that layer's quantization bit values. The number of influence factors in each group may be determined by the number of convolution kernel configuration parameters and the number of quantization bit values, respectively; if a layer has three convolution kernel configuration parameters, that layer also has three corresponding influence factors for the convolution kernel configuration parameters.

The convolution kernel configuration parameters may include one or more of the following: the number of convolution kernels in each layer, the kernel height of each convolution kernel in each layer, the kernel width of each convolution kernel in each layer, the stride height of each convolution kernel in each layer, and the stride width of each convolution kernel in each layer. The convolution kernel configuration parameters of each network layer may be a vector composed of any number of parameters selected from the above.

The quantization bit values may include one or more of the following: the data bit width of the feature maps of each convolutional layer, the data bit width of the weights of each convolutional layer, and the data bit width of the activation function of each layer. The quantization bit value of each network layer may be a vector composed of at least one value arbitrarily selected from the above.

The influence factor of a convolution kernel configuration parameter characterizes the degree of influence of each parameter among the convolution kernel configuration parameters; for example, the larger the influence factor corresponding to a parameter, the more influential that parameter is among all convolution kernel configuration parameters of the network layer. Correspondingly, the influence factor of a quantization bit value characterizes the degree of influence of that quantization bit value; for example, the larger the influence factor corresponding to a given value, the more influential that value is among all quantization bit values of the network layer.
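
The patent gives no code, but the per-layer hyperparameter structure described above can be sketched as follows. This is a minimal illustration assuming PyTorch; the candidate lists `kernel_options` and `bit_options` and all names are hypothetical stand-ins for the kernel configuration parameters and quantization bit values just described.

```python
import torch

# Hypothetical candidate lists; in the patent's terms, each layer chooses
# among kernel configuration parameters and quantization bit values.
kernel_options = [(3, 3, 1), (5, 5, 1), (3, 3, 2)]  # (height, width, stride)
bit_options = [2, 4, 6, 8]                          # candidate bit widths

class LayerHyperParams(torch.nn.Module):
    """One group of influence factors per candidate list, per layer.

    Three kernel configurations yield three influence factors, matching the
    correspondence described in the text; likewise for the bit widths.
    """
    def __init__(self, n_kernel: int, n_bit: int):
        super().__init__()
        self.alpha_kernel = torch.nn.Parameter(torch.zeros(n_kernel))
        self.alpha_bit = torch.nn.Parameter(torch.zeros(n_bit))

layer_hp = LayerHyperParams(len(kernel_options), len(bit_options))
```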

S120. Update the hyperparameters based on a differentiable search method to obtain a first supernetwork.

In this embodiment, the differentiable search method may be a neural network search method that can receive accuracy feedback for the network. Updating the hyperparameters based on the differentiable search method to obtain the first supernetwork can be understood as updating, based on the differentiable search method, the hyperparameters carried by several subnetworks sampled from the supernetwork.

Sampling several subnetworks from the supernetwork may include sampling with gumbel-softmax. For example, the hyperparameters are transformed into probability vectors via gumbel-softmax, and multiple subnetworks are sampled from the supernetwork according to the magnitudes of the probability vectors. A probability vector can be understood as a vector formed by the products of each layer's influence factors for the convolution kernel configuration parameters and for the quantization bit values; the larger the probability, the more likely the corresponding subnetwork is to be sampled.
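
As a sketch of the sampling step, using the real PyTorch function `torch.nn.functional.gumbel_softmax` together with the hypothetical `LayerHyperParams` structure from the earlier sketch; the product-of-influence-factors rule follows the description above:

```python
import torch.nn.functional as F

def sample_layer(hp, tau=1.0):
    """Sample one (kernel config, bit width) choice for a single layer."""
    # gumbel-softmax maps the influence factors to a near one-hot
    # probability vector while remaining differentiable.
    p_kernel = F.gumbel_softmax(hp.alpha_kernel, tau=tau, hard=True)
    p_bit = F.gumbel_softmax(hp.alpha_bit, tau=tau, hard=True)
    # The joint probability of a (kernel, bit) pair is driven by the product
    # of the two distributions; larger products are sampled more often.
    return int(p_kernel.argmax()), int(p_bit.argmax())

def sample_subnetwork(layer_hyperparams):
    """A subnetwork is one sampled choice per layer of the supernetwork."""
    return [sample_layer(hp) for hp in layer_hyperparams]
```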

Updating the hyperparameters based on the differentiable search method to obtain the first supernetwork can be understood as follows: first, keep the hyperparameters carried by the subnetworks fixed, train the subnetworks on the training set, and update the subnetworks' weight parameters; then, keeping the weight parameters fixed, forward-propagate the validation set through the trained subnetworks to obtain the value of the network's target loss function, differentiate the target loss function value with respect to the hyperparameters to obtain gradient values, and update the subnetworks' hyperparameters according to the gradient values.
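
One way to render this two-phase update in code is sketched below, again under PyTorch assumptions; `subnet`, the optimizers, and the data loaders are hypothetical, and the subnetwork's forward pass is assumed to use the gumbel-softmax weights so the validation loss is differentiable with respect to the hyperparameters:

```python
def search_step(subnet, arch_params, train_loader, val_loader,
                weight_opt, arch_opt, loss_fn):
    # Phase 1: hyperparameters fixed, update the subnetwork weights on the
    # training set.
    subnet.train()
    for x, y in train_loader:
        weight_opt.zero_grad()
        loss_fn(subnet(x), y).backward()
        weight_opt.step()

    # Phase 2: weights fixed, forward-propagate the validation set and
    # differentiate the target loss with respect to the hyperparameters.
    x, y = next(iter(val_loader))   # one validation batch, for the sketch
    arch_opt.zero_grad()
    val_loss = loss_fn(subnet(x), y)
    val_loss.backward()             # gradients reach arch_params
    arch_opt.step()                 # gradient-descent update of hyperparameters
```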

The training set is used to fit the model: the classification model is trained by setting the classifier's parameters. Later, in conjunction with the validation set, different values of the same parameter are selected to fit multiple classifiers.

The role of the validation set is that, after multiple models have been trained on the training set, each model is validated on the validation set data in order to update the hyperparameters, and the hyperparameters are updated based on the loss on the validation set.

The first supernetwork may be the new supernetwork composed of the updated hyperparameters after the hyperparameters are updated by the differentiable search method.

S130. Perform hardware performance evaluation on the subnetworks included in the first supernetwork, and update the hyperparameters of the first supernetwork according to the evaluation results to obtain a second supernetwork.

In this embodiment, before the hardware performance evaluation, the first supernetwork needs to be sampled to obtain several second sampled subnetworks, where the sampling method is the same as that in step S120.

This step can be implemented by an evolutionary algorithm. The evolutionary algorithm in this embodiment can receive and process non-differentiable hardware feedback and, based on it, update the hyperparameters of the first supernetwork. Hardware feedback can be understood as the hardware evaluation model's feedback on the hardware performance metrics of each network layer in a subnetwork.

The hardware evaluation model can be used to evaluate the hardware performance of a network; for example, the hardware evaluation model can evaluate the hardware performance of the second sampled subnetworks in the first supernetwork. Hardware performance evaluation can be understood as evaluating hardware performance metrics on the target hardware.

The hardware evaluation model can be used to evaluate the hardware performance metrics of a neural network model on the target hardware; specifically, it can evaluate the hardware performance metrics of the second sampled subnetwork models on the target hardware. The hardware performance metrics may include one or more of, for example, power consumption, latency, and model size.

The evaluation may be performed in either of the following two ways:

Method 1: deploy several second sampled subnetwork models on the target hardware to obtain the hardware performance metrics;

Method 2: determine the hardware performance metrics according to the convolution kernel configuration parameters and the magnitudes of the quantization bit values of the several second sampled subnetwork models.

The hyperparameters of the first supernetwork are updated according to the evaluation results, where the evaluation results can be understood as the evaluation results of the hardware performance metrics of the second sampled subnetworks. If one or more second sampled subnetworks have the best evaluation results, this indicates that the convolution kernel configuration parameters and quantization bit values of those subnetworks are optimal. It should be noted that a second sampled subnetwork may include many network layers, so the optimality can be attributed to the convolution kernel configuration parameters and quantization bit values of specific network layers.

After the optimal second sampled subnetwork is determined, the values of the influence factors corresponding to the convolution kernel configuration parameters and the quantization bit values of its specific network layers can be increased. Once the influence factor values are increased, the optimal second sampled subnetwork becomes a new subnetwork, and the supernetwork containing it is correspondingly updated to a new supernetwork.
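
A minimal sketch of this evolutionary reward step, reusing the hypothetical `LayerHyperParams` structure; the increment `delta` is an assumed constant, not a value from the patent:

```python
import torch

def reward_best_subnet(layer_hyperparams, best_choices, delta=0.1):
    """Increase the influence factors of the winning choices so the optimal
    subnetwork is sampled with higher probability in later rounds."""
    with torch.no_grad():
        for hp, (k_idx, b_idx) in zip(layer_hyperparams, best_choices):
            hp.alpha_kernel[k_idx] += delta  # reward the winning kernel config
            hp.alpha_bit[b_idx] += delta     # reward the winning bit width
```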

The second supernetwork may be the network obtained after updating the influence factors of the convolution kernel configuration parameters and the influence factors of the quantization bit values of the network layers inside the optimal second sampled subnetwork.

S140. Determine whether the update termination condition is satisfied; if so, execute step S150; otherwise, return to step S120.

In this embodiment, the update termination condition may be the condition for terminating updates to the supernetwork, that is, the condition for stopping the neural architecture search. Specifically, the termination condition may include the following two: first, the condition may be that the subnetworks' validation set feedback and hardware feedback both meet preset targets; second, the condition may be that the number of supernetwork updates has reached a set threshold.

Validation set feedback can be understood as the differentiable search method's feedback on subnetwork model accuracy, and hardware feedback can be understood as feedback on the results of the hardware performance evaluation.

The preset targets may be values set by the user according to the actual situation before using the method provided in this embodiment. When the hyperparameters of the updated supernetwork all meet the preset targets, updating of the supernetwork's hyperparameters stops, and the supernetwork is determined as the target neural network; the target neural network can be understood as a supernetwork that meets the user's requirements.

The set threshold may be a user-defined number of updates to the supernetwork's hyperparameters. When the number of updates reaches the set threshold, the neural network construction method provided in this embodiment can automatically stop updating the hyperparameters and determine the supernetwork obtained in the last update as the target neural network.

If the second supernetwork does not meet the update termination condition, steps S120 and S130 continue to be executed to update the supernetwork, after which it is again determined whether the updated supernetwork meets the termination condition. This loop continues until the updated supernetwork meets the termination condition, at which point updating stops.

S150. Determine the second supernetwork as the target neural network.
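
Read together, steps S110 to S150 form the loop sketched below; every helper name here is hypothetical and simply restates the flow described above, including both termination criteria:

```python
def search(supernet, max_updates, targets):
    for _ in range(max_updates):                  # condition 2: update budget
        acc_fb = differentiable_update(supernet)  # S120, validation feedback
        hw_fb = evolutionary_update(supernet)     # S130, hardware feedback
        # Condition 1: validation feedback and hardware feedback both meet
        # the user's preset targets.
        if acc_fb >= targets["accuracy"] and hw_fb <= targets["latency"]:
            break
    return supernet                               # S150: target neural network
```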

The differentiable search method for a mixed-precision neural network provided in Embodiment 1 of the present invention first obtains an initialized supernetwork, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters; next, it updates the hyperparameters based on a differentiable search method to obtain a first supernetwork; it then performs hardware performance evaluation on the subnetworks included in the first supernetwork and updates the hyperparameters of the first supernetwork according to the evaluation results to obtain a second supernetwork; finally, it determines whether the update termination condition is satisfied; if so, the second supernetwork is determined as the target neural network; otherwise, it returns to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork. This method effectively reduces the cost of designing and using neural networks: compared with traditional neural architecture search methods, the search speed is significantly improved, and compared with full-precision neural architecture search methods, it can find lightweight networks of lower complexity.

Further, the supernetwork is a convolutional neural network, and a subnetwork includes at least one network layer; the hyperparameters include, for each network layer, the influence factors of the convolution kernel configuration parameters and the influence factors of the quantization bit values, and the hyperparameters are continuously differentiable.

Specifically, the supernetwork may be a convolutional neural network, and a subnetwork may include one or more network layers; the numbers of network layers in the subnetworks and the supernetwork are not specifically limited here.

The better a subnetwork's hardware feedback and validation set feedback, the better its convolution kernel configuration parameters and quantization bit values; the corresponding hyperparameters in that subnetwork can then be increased to obtain a new, updated supernetwork.

It should be noted that the hyperparameters may be continuously differentiable so that they can be used for validation set feedback. For example, the neural network's loss function can be differentiated with respect to the hyperparameters, and gradient descent can then be used to iteratively optimize them.

Further, updating the hyperparameters based on the differentiable search method to obtain the first supernetwork may proceed as follows: sample the subnetworks included in the initialized supernetwork to obtain at least one first sampled subnetwork; train the first sampled subnetwork on the training set; forward-propagate the validation set through the trained first sampled subnetwork to obtain a target loss function value; differentiate the target loss function value to obtain gradient values; and update the hyperparameters according to the gradient values to obtain the first supernetwork.

After sampling the subnetworks included in the initialized supernetwork, one or more sampled subnetworks can be obtained according to the sampling probabilities; a sampled subnetwork can be understood as a subnetwork sampled from the initialized supernetwork using gumbel-softmax.

Training the first sampled subnetwork based on the training set may include training one or more first sampled subnetworks on the training set several times to update their weight parameters, and then keeping the weight parameters fixed.

The target loss function value may be the value of the function that ultimately needs to be optimized. Differentiating the target loss function value can be understood as differentiating the target loss function with respect to the hyperparameters to obtain the gradient values of the hyperparameters. Gradient descent can then be used to iteratively optimize the hyperparameters.

For example, the current network can be validated on the validation set to obtain the value of the network loss function; this value is then differentiated with respect to the above hyperparameters to obtain gradient values; finally, the hyperparameters are updated according to these gradient values.

Further, the process of training the first sampled subnetwork based on the training set may be: training the first sampled subnetwork based on the training set to update the weight parameters of the first sampled subnetwork.

The weight parameters are updated to new values after each round of training; by training the first sampled subnetwork on the training set, the weight parameters of the trained first sampled subnetwork are updated.

Further, performing hardware performance evaluation on the subnetworks included in the first supernetwork and updating the hyperparameters of the first supernetwork according to the evaluation results to obtain the second supernetwork includes: sampling the subnetworks included in the first supernetwork to obtain at least one second sampled subnetwork; performing hardware evaluation on the at least one second sampled subnetwork to obtain the optimal second sampled subnetwork; and increasing the influence factors of the convolution kernel configuration parameters and the influence factors of the quantization bit values of every network layer in the optimal second sampled subnetwork to obtain the second supernetwork.

The hardware evaluation of the second sampled subnetworks may use the hardware evaluation model to evaluate their hardware performance metrics; the subnetwork with the best evaluation result is the optimal second sampled subnetwork.

It should be noted that the optimal second sampled subnetwork having the best hardware performance metrics indicates that its hyperparameters, i.e., its convolution kernel configuration parameters and quantization bit values, are optimal. Here, the convolution kernel configuration parameters and quantization bit values may be the hyperparameters of each network layer in the optimal second sampled subnetwork.

After the optimal convolution kernel configuration parameters and quantization bit values are determined, the influence factor of the optimal convolution kernel configuration parameters and the influence factor of the optimal quantization bit values can be increased. The increased hyperparameters update the corresponding optimal subnetwork, and the supernetwork containing it is updated as well.

Further, performing hardware evaluation on the at least one second sampled subnetwork includes: deploying the at least one second sampled subnetwork on the target hardware to obtain hardware performance metrics; or determining the hardware performance metrics according to the convolution kernel configuration parameters and the magnitudes of the quantization bit values of the at least one second sampled subnetwork.

There may be two ways to perform hardware evaluation on the sampled subnetworks. The first way may include directly deploying the several sampled second sampled subnetworks on the target hardware to obtain the hardware performance metrics, which here can be understood as obtaining the hardware performance metrics of each network layer of the multiple second sampled subnetworks.

The second way may include, during the neural architecture search, estimating the hardware performance metrics of each network layer of a second sampled subnetwork according to its network layers and a correspondence table. The obtained hardware performance metrics can be used to characterize the quality of the convolution kernel configuration parameters and quantization bit values in that second sampled subnetwork.

Further, before performing hardware evaluation on the at least one second sampled subnetwork, the method also includes: establishing a correspondence table between the selections of convolution kernel configuration parameters and quantization bit values of the at least one second sampled subnetwork and the hardware performance metrics. Correspondingly, determining the hardware performance metrics according to the selected convolution kernel configuration parameters and quantization bit values of each layer of the at least one second sampled subnetwork includes: looking up the corresponding hardware performance metrics in the correspondence table according to the selected convolution kernel configuration parameters and quantization bit values of the at least one second sampled subnetwork.

The correspondence table may record, for multiple second sampled subnetworks, the correspondence between selections of convolution kernel configuration parameters and quantization bit values and the hardware performance metrics; the hardware performance metrics may be the estimated metrics of all network layers in all second sampled subnetworks. For example, a given network layer of a second sampled subnetwork may have many combinations of convolution kernel configuration parameters and quantization bit values, and each combination can have a corresponding hardware performance metric; this one-to-one relationship is recorded as the correspondence table.

It should be noted that the correspondence table is established before the hyperparameters of the second sampled subnetworks are updated: before the neural architecture search, a large number of different second sampled subnetworks with known structures are deployed on the target hardware, the hardware performance metrics of each second sampled subnetwork are determined, and the correspondence table is then built from those metrics.

When evaluating the hardware of a second sampled subnetwork, any one or more of the power consumption, latency, and model size of each network layer can be estimated according to the network layers of the subnetwork to be estimated and the above correspondence lookup table.

Based on the selected convolution kernel configuration parameters and quantization bit values of each network layer in the second sampled subnetwork, the hardware performance metric corresponding to that layer's combination of convolution kernel configuration parameters and quantization bit values can be looked up directly in the correspondence table. For example, a layer may have multiple convolution kernel configuration parameters and multiple quantization bit values, and these can be combined pairwise in arbitrary groups, so there can be many combinations, each with its corresponding hardware performance metric.
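
Such a correspondence table can be sketched as a dictionary keyed by the (kernel configuration, bit width) combination; all entries below are made-up placeholder measurements, not data from the patent:

```python
# Built offline by deploying known layer configurations on the target
# hardware and recording power (W), latency (ms), and model size (MB).
lut = {
    ((3, 3, 1), 8): {"power": 1.2, "latency": 0.80, "size": 0.50},
    ((3, 3, 1), 4): {"power": 0.7, "latency": 0.45, "size": 0.25},
    ((5, 5, 1), 8): {"power": 2.1, "latency": 1.30, "size": 1.10},
}

def estimate_subnetwork(layers):
    """Sum per-layer table entries to estimate a whole sampled subnetwork."""
    totals = {"power": 0.0, "latency": 0.0, "size": 0.0}
    for kernel_cfg, bits in layers:
        entry = lut[(kernel_cfg, bits)]
        for metric in totals:
            totals[metric] += entry[metric]
    return totals

print(estimate_subnetwork([((3, 3, 1), 8), ((5, 5, 1), 8)]))
```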

Further, after the second supernetwork is obtained, the method also includes: determining the hardware resources required by the second supernetwork and the preset resource limit; and, if the required hardware resources exceed the preset resource limit, reducing the quantization bit values of each subnetwork so that the required hardware resources are less than or equal to the preset resource limit.

The required hardware resources can be understood as the performance of the subnetworks on the target hardware; for example, the required hardware resources may include power consumption and latency.

The preset resource limit can be understood as a limit on the resources the updated supernetwork model may use on the target hardware; that is, the hardware resources required by the updated supernetwork must not exceed the preset limit. The preset resource limit can be customized by the user according to actual needs before the supernetwork is updated.

For example, a network model necessarily consumes power when running on the target hardware, and different network models consume different amounts of power; in general, more complex network models consume more. If a power consumption cap is set before the supernetwork's hyperparameters are updated, for example filtering out networks above 10 W, the entire update process will be guided toward a supernetwork with low power consumption that satisfies the user.

When the required hardware resources exceed the preset resource limit, the quantization bit values of every network layer in each subnetwork can be reduced simultaneously; for example, the quantization bit value of each layer can be lowered from 10 bits to 8 bits, then to 6 bits, and so on, until the required hardware resources are less than or equal to the preset resource limit. Reducing the quantization bit values can also be understood as reducing the bit-width values of the network layers.
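
The progressive bit-width reduction can be sketched as the loop below; the descending ladder and the `required_resources` callback are assumptions standing in for the 10-to-8-to-6-bit example in the text:

```python
def shrink_to_budget(layer_bits, budget, required_resources,
                     ladder=(10, 8, 6, 4, 2)):
    """Lower every layer's quantization bit value one step at a time until
    the estimated resource usage fits within the preset limit."""
    while required_resources(layer_bits) > budget:
        lowered = False
        for i, bits in enumerate(layer_bits):
            smaller = [b for b in ladder if b < bits]
            if smaller:
                layer_bits[i] = max(smaller)  # e.g. 10 -> 8 -> 6 -> ...
                lowered = True
        if not lowered:      # every layer already at the minimum bit width
            break
    return layer_bits
```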

Embodiment 2

FIG. 2 is a schematic flowchart of a differentiable search method for a mixed-precision neural network provided in Embodiment 2 of the present invention. With the development of artificial intelligence applications, the demand for deploying neural network models on edge hardware devices (such as mobile phones and Internet of Things devices) keeps growing. However, hardware deployment of neural networks often faces many hardware resource constraints, such as power consumption and latency, so lightweight networks must be designed that both preserve model accuracy and perform well in hardware implementations. Model quantization is a neural network compression and optimization technique that represents the data in a neural network with lower bit widths; it can greatly reduce a model's computation and complexity while incurring little loss in accuracy, allowing deployment on resource-constrained hardware platforms. Current model quantization can be divided into manual and automated model quantization. In manual model quantization, human experts rely on experience and repeated experiments to decide the quantization bit value of each network layer; although significant optimization gains have been achieved this way, it consumes a great deal of manpower and time and easily falls into local optima. Automated model quantization treats low-bit quantization of a neural network as an optimization problem and uses various optimization algorithms to quantize each network layer automatically, achieving better results at lower cost than manual quantization.

Traditional NAS methods based on reinforcement learning and on evolutionary algorithms consume large amounts of time, computation, and memory during the search. This embodiment provides a differentiable search method for a mixed-precision neural network that performs automatic model quantization and can search for a specific hardware platform; the method has the advantages of high performance, fast speed, and low resource consumption.

As shown in FIG. 2, the differentiable search method for a mixed-precision neural network provided in this embodiment adopts a search strategy combining a differentiable search method with an evolutionary algorithm. First, the supernetwork hyperparameters are obtained, that is, the hyperparameters of the initialized supernetwork. Based on these hyperparameters, several subnetworks (the first sampled subnetworks) are sampled using gumbel-softmax. The differentiable search method is then used to update the supernetwork hyperparameters as follows: keeping the subnetworks' hyperparameters fixed, train the subnetworks several times on the training set and update their weight parameters to obtain the trained subnetworks; send the trained subnetworks to the validation set evaluator for evaluation, obtaining validation set feedback; and update the supernetwork hyperparameters according to the validation set feedback to obtain the first supernetwork.

After the hyperparameters of the initialized supernetwork are updated based on the differentiable search method, the evolutionary algorithm is used to update the supernetwork hyperparameters as follows: based on the updated hyperparameters, gumbel-softmax is used again to sample several subnetworks (the second sampled subnetworks); then, through hardware deployment of the model or a query of the lookup table (i.e., the correspondence table), the hardware evaluation model is used to evaluate the hardware performance metrics of the second sampled subnetworks on the target hardware, obtaining hardware feedback; and the supernetwork hyperparameters are updated according to the hardware feedback.

The above two update methods are applied alternately in a loop to update the supernetwork's hyperparameters: for validation set accuracy feedback, the hyperparameters are updated in a gradient-based manner, while for hardware feedback, an evolutionary algorithm is used.

The differentiable search method is used to receive network accuracy feedback, on the basis of which the supernetwork hyperparameters are updated. The validation set evaluator evaluates the loss/accuracy of the trained subnetworks on the validation set.

Compared with other methods, when exploring the architecture space, the differentiable search method for a mixed-precision neural network provided in this embodiment uses a gradient-based differentiable search algorithm to explore the search space and combines architecture search with mixed-precision quantization, achieving better performance and efficiency in practical tasks. During the network search, actual hardware feedback is added as constraint information, and an evolutionary algorithm handles the non-differentiable hardware feedback, so architectures that satisfy the hardware resource requirements can be found and the global optimum is easier to reach. The method achieves high accuracy on computer vision tasks while using very little computation, making automated network architecture design practical for individual researchers, small companies, and university research teams.

The differentiable search method for a mixed-precision neural network provided by the embodiments of the present invention adopts a differentiable search approach: it makes the search space continuously differentiable, treats the selection of each network layer's quantization bit value as part of the neural architecture search problem, and uses a gradient-based algorithm to search the architecture parameters, namely the convolution kernel configuration parameters and the network layer quantization bit values. On the one hand, compared with the reinforcement learning and evolutionary algorithms used in traditional neural architecture search, the gradient-based differentiable search algorithm searches significantly faster. On the other hand, the network found by the search is a quantized mixed-precision network, which consumes far less computation and memory than a full-precision neural network, making deep learning possible on resource-constrained hardware.

Embodiment 3

FIG. 3 is a schematic structural diagram of a differentiable search device for a mixed-precision neural network provided in Embodiment 3 of the present invention. The device is applicable to searching for a high-performance neural network, can be implemented by software and/or hardware, and is generally integrated on a computer device.

As shown in FIG. 3, the device includes:

an acquisition module 310, configured to obtain an initialized supernetwork, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters;

a first update module 320, configured to update the hyperparameters based on a differentiable search method to obtain a first supernetwork;

a second update module 330, configured to perform hardware performance evaluation on the first supernetwork, and to update the hyperparameters of the first supernetwork according to the evaluation results to obtain a second supernetwork; and

a judgment module 340, configured to determine whether an update termination condition is satisfied; if so, to determine the second supernetwork as the target neural network; otherwise, to return to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork.

In this embodiment, the device first uses the acquisition module to obtain an initialized supernetwork, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters; next, the first update module updates the hyperparameters based on a differentiable search method to obtain a first supernetwork; then, the second update module performs hardware performance evaluation on the subnetworks included in the first supernetwork and updates the hyperparameters of the first supernetwork according to the evaluation results to obtain a second supernetwork; finally, the judgment module determines whether the update termination condition is satisfied; if so, the second supernetwork is determined as the target neural network; otherwise, the device returns to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork.

This embodiment provides a differentiable search apparatus for a mixed-precision neural network that performs automatic model quantization and can search for and construct a neural network for a specific hardware platform.

Further, the supernetwork is a convolutional neural network, and each subnetwork includes at least one network layer; the hyperparameters include, for each layer, an influence factor for the convolution kernel configuration parameters and an influence factor for the quantization bit value, and the hyperparameters are continuously differentiable.

Further, the first update module 320 is also configured to: sample the subnetworks included in the initialized supernetwork to obtain at least one first sampled subnetwork; train the first sampled subnetwork on a training set; forward-propagate a validation set through the trained first sampled subnetwork to obtain a target loss function value; differentiate the target loss function value to obtain gradient values; and update the hyperparameters according to the gradient values to obtain the first supernetwork.
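A minimal sketch of one such search iteration is shown below. It assumes the weight parameters and the influence factors are held in two separate optimizers; `supernet`, `w_opt`, `alpha_opt`, and `criterion` are illustrative names, and `supernet` is any supernetwork built from layers like the one sketched above.

```python
def search_step(supernet, train_batch, val_batch, w_opt, alpha_opt, criterion):
    # 1) Train the sampled subnetwork's weight parameters on the training set.
    x, y = train_batch
    w_opt.zero_grad()
    criterion(supernet(x), y).backward()
    w_opt.step()

    # 2) Forward-propagate the validation set, differentiate the target loss,
    #    and update the influence factors using the resulting gradients.
    xv, yv = val_batch
    alpha_opt.zero_grad()
    val_loss = criterion(supernet(xv), yv)
    val_loss.backward()
    alpha_opt.step()
    return val_loss.item()
```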

Further, a training module is configured to train the first sampled subnetwork on the training set so as to update the weight parameters of the first sampled subnetwork.

Further, the second update module 330 is also configured to: sample the subnetworks included in the first supernetwork to obtain at least one second sampled subnetwork; perform hardware evaluation on the at least one second sampled subnetwork to obtain an optimal second sampled subnetwork; and increase, for each layer of the optimal second sampled subnetwork, the influence factor of the convolution kernel configuration parameters and the influence factor of the quantization bit value to obtain the second supernetwork.
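One way to realize this update is sketched below; the additive `boost` step size and the convention that a higher hardware score is better are assumptions, not specified by the patent.

```python
import torch

def harden_best(layer_alphas, sampled_choices, hw_scores, boost=0.5):
    """Raise the influence factors of the sampled subnetwork with the best
    hardware score. `layer_alphas` is a list of per-layer architecture
    parameter vectors; `sampled_choices[i]` lists the chosen index per layer
    for sample i; `hw_scores[i]` is that sample's hardware performance score."""
    best = max(range(len(sampled_choices)), key=lambda i: hw_scores[i])
    with torch.no_grad():
        for alpha, idx in zip(layer_alphas, sampled_choices[best]):
            alpha[idx] += boost  # larger factor -> larger softmax weight
```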

Further, a hardware evaluation module is configured to deploy the at least one second sampled subnetwork onto target hardware to obtain a hardware performance index, or to determine the hardware performance index from the convolution kernel configuration parameters and quantization bit values of the at least one second sampled subnetwork.

Further, before the hardware evaluation module performs the evaluation, the second update module 330 is also configured to: establish a correspondence table mapping the convolution kernel configuration parameters and quantization bit value choices of the at least one second sampled subnetwork to hardware performance indexes; and look up, for each layer's convolution kernel configuration parameters and quantization bit value, the corresponding hardware performance index in the correspondence table.
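Such a correspondence table can be as simple as a dictionary keyed by (kernel size, bit value); the latency numbers below are hypothetical, standing in for values profiled once on the target hardware.

```python
# Hypothetical per-layer latency table (ms), keyed by (kernel_size, bit_width).
LATENCY_LUT = {
    (3, 2): 0.8, (3, 4): 1.1, (3, 8): 1.9,
    (5, 2): 1.6, (5, 4): 2.3, (5, 8): 4.0,
}

def estimate_latency(layers):
    """Look up and sum per-layer latencies for a sampled subnetwork given as
    a list of (kernel_size, bit_width) pairs."""
    return sum(LATENCY_LUT[(k, b)] for k, b in layers)

print(estimate_latency([(3, 4), (5, 2), (3, 8)]))  # 1.1 + 1.6 + 1.9 = 4.6
```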

Further, the second update module 330 is also configured to determine the hardware resources required by the second supernetwork and the preset limit of actually available resources; if the required hardware resources exceed the preset limit, the quantization bit values of the subnetworks are reduced so that the required hardware resources are less than or equal to the preset limit.
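The patent does not fix a policy for which bit values to lower first; one plausible greedy sketch, assuming weight storage is the constrained resource, is:

```python
def fit_to_budget(layer_bits, weight_counts, budget_bits):
    """Halve the bit value of the most storage-hungry layer until the total
    weight storage fits the preset limit (the greedy policy is an assumption)."""
    def required(bits):
        return sum(b * n for b, n in zip(bits, weight_counts))
    bits = list(layer_bits)
    while required(bits) > budget_bits and max(bits) > 1:
        shrinkable = [j for j in range(len(bits)) if bits[j] > 1]
        i = max(shrinkable, key=lambda j: bits[j] * weight_counts[j])
        bits[i] //= 2  # e.g. 8 -> 4 -> 2 -> 1
    return bits

# Example: three layers at 8 bits with 1e6, 2e6, 4e6 weights, 40 Mbit budget.
print(fit_to_budget([8, 8, 8], [1_000_000, 2_000_000, 4_000_000], 40_000_000))
# -> [8, 8, 4]
```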

The above differentiable search apparatus for a mixed-precision neural network can execute the differentiable search method for a mixed-precision neural network provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to the executed method.

Embodiment Four

Fig. 4 is a schematic structural diagram of a computer device provided by Embodiment 4 of the present invention. As shown in Fig. 4, the computer device includes one or more processors 41 and a storage device 42; the device may have one or more processors 41, with one processor 41 taken as an example in Fig. 4. The storage device 42 is configured to store one or more programs, which are executed by the one or more processors 41 so that the one or more processors 41 implement the differentiable search method for a mixed-precision neural network described in any embodiment of the present invention.

The computer device may further include an input device 43 and an output device 44.

The processor 41, storage device 42, input device 43, and output device 44 in the computer device may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 4.

As a computer-readable storage medium, the storage device 42 in the computer device can store one or more programs, which may be software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the method provided in Embodiment 1 or 2 of the present invention (for example, the modules in the apparatus shown in Fig. 3, including the first update module 320 and the second update module 330). By running the software programs, instructions, and modules stored in the storage device 42, the processor 41 executes the various functional applications and data processing of the computer device, that is, implements the differentiable search method for a mixed-precision neural network in the above method embodiments.

The storage device 42 may include a program storage area and a data storage area, where the program storage area may store an operating system and applications required by at least one function, and the data storage area may store data created through use of the device, and the like. In addition, the storage device 42 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the storage device 42 may further include memory located remotely from the processor 41 and connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 43 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the device. The output device 44 may include a display device such as a display screen.

When the one or more programs included in the above device are executed by the one or more processors 41, the programs perform the following operations:

acquiring an initialized supernetwork, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters;

updating the hyperparameters based on a differentiable search method to obtain a first supernetwork;

performing hardware performance evaluation on the subnetworks included in the first supernetwork, and updating the hyperparameters of the first supernetwork according to the evaluation results to obtain a second supernetwork; and

judging whether an update termination condition is satisfied; if so, determining the second supernetwork as the target neural network; otherwise, returning to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork.

Embodiment Five

Embodiment 5 of the present invention provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program performs a differentiable search method for a mixed-precision neural network, the method including:

acquiring an initialized supernetwork, where the supernetwork includes a plurality of subnetworks and each subnetwork carries hyperparameters;

updating the hyperparameters based on a differentiable search method to obtain a first supernetwork;

performing hardware performance evaluation on the subnetworks included in the first supernetwork, and updating the hyperparameters of the first supernetwork according to the evaluation results to obtain a second supernetwork; and

judging whether an update termination condition is satisfied; if so, determining the second supernetwork as the target neural network; otherwise, returning to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork.

Optionally, when executed by a processor, the program may also execute the differentiable search method for a mixed-precision neural network provided by any embodiment of the present invention.

The computer storage medium of the embodiments of the present invention may use any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer disk, a hard disk, Random Access Memory (RAM), Read-Only Memory (ROM), Erasable Programmable Read-Only Memory (EPROM), flash memory, optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above. A computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code contained on a computer-readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), and the like, or any suitable combination of the above.

Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).

Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described herein, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them and may include many other equivalent embodiments without departing from the concept of the present invention, the scope of which is determined by the appended claims.

Claims (7)

1. A differentiable search method for a mixed-precision neural network, characterized by comprising the following steps:
acquiring an initialized supernetwork, where the supernetwork comprises a plurality of subnetworks and each subnetwork carries hyperparameters;
updating the hyperparameters based on a differentiable search method to obtain a first supernetwork;
sampling the subnetworks included in the first supernetwork to obtain at least one second sampled subnetwork;
establishing a correspondence table mapping the convolution kernel configuration parameters and quantization bit value choices of the at least one second sampled subnetwork to hardware performance indexes;
deploying the at least one second sampled subnetwork onto target hardware to obtain a hardware performance index, or looking up the corresponding hardware performance index in the correspondence table according to the convolution kernel configuration parameters and quantization bit values of the at least one second sampled subnetwork, so as to obtain an optimal second sampled subnetwork;
increasing the influence factor of the convolution kernel configuration parameters and the influence factor of the quantization bit value of each layer of the optimal second sampled subnetwork to obtain a second supernetwork; and
judging whether an update termination condition is satisfied; if so, determining the second supernetwork as the target neural network; otherwise, returning to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork.
2. The method of claim 1, wherein the supernetwork is a convolutional neural network and each subnetwork comprises at least one network layer; the hyperparameters include, for each layer, an influence factor for the convolution kernel configuration parameters and an influence factor for the quantization bit value, and the hyperparameters are continuously differentiable.
3. The method of claim 2, wherein updating the hyperparameters based on a differentiable search method to obtain a first supernetwork comprises:
sampling the subnetworks included in the initialized supernetwork to obtain at least one first sampled subnetwork;
training the first sampled subnetwork on a training set;
forward-propagating a validation set through the trained first sampled subnetwork to obtain a target loss function value;
differentiating the target loss function value to obtain gradient values; and
updating the hyperparameters according to the gradient values to obtain the first supernetwork.
4. The method of claim 3, wherein training the first sampled subnetwork on a training set comprises:
training the first sampled subnetwork on the training set to update the weight parameters of the first sampled subnetwork.
5. The method of claim 2, further comprising, after obtaining the second supernetwork:
determining the hardware resources required by the second supernetwork and a preset resource limit; and
if the required hardware resources exceed the preset limit, reducing the quantization bit values of the subnetworks so that the required hardware resources are less than or equal to the preset limit.
6. A differentiable search apparatus for a mixed-precision neural network, comprising:
an acquisition module, configured to acquire an initialized supernetwork, where the supernetwork comprises a plurality of subnetworks and each subnetwork carries hyperparameters;
a first update module, configured to update the hyperparameters based on a differentiable search method to obtain a first supernetwork;
a second update module, configured to perform hardware performance evaluation on the first supernetwork and update the hyperparameters of the first supernetwork according to the evaluation result to obtain a second supernetwork; and
a judging module, configured to judge whether an update termination condition is satisfied and, if so, determine the second supernetwork as the target neural network; otherwise, return to the operation of updating the hyperparameters based on the differentiable search method to obtain the first supernetwork;
wherein the second update module is further configured to: sample the subnetworks included in the first supernetwork to obtain at least one second sampled subnetwork; perform hardware evaluation on the at least one second sampled subnetwork to obtain an optimal second sampled subnetwork; and increase the influence factor of the convolution kernel configuration parameters and the influence factor of the quantization bit value of each layer of the optimal second sampled subnetwork to obtain the second supernetwork;
a hardware evaluation module is configured to deploy the at least one second sampled subnetwork onto target hardware to obtain a hardware performance index, or to determine the hardware performance index from the convolution kernel configuration parameters and quantization bit values of the at least one second sampled subnetwork; and
before the hardware evaluation module performs the evaluation, the second update module is further configured to establish a correspondence table mapping the convolution kernel configuration parameters and quantization bit value choices of the at least one second sampled subnetwork to hardware performance indexes, and to look up the corresponding hardware performance index in the correspondence table according to each layer's convolution kernel configuration parameters and quantization bit value.
7. The apparatus of claim 6, wherein the supernetwork is a convolutional neural network and each subnetwork comprises at least one network layer; the hyperparameters include, for each layer, an influence factor for the convolution kernel configuration parameters and an influence factor for the quantization bit value, and the hyperparameters are continuously differentiable.
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载