
CN116824342A - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number
CN116824342A
Authority
CN
China
Prior art keywords
image
dimension information
processed
preset
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210260514.5A
Other languages
Chinese (zh)
Inventor
屠震元
周智强
叶挺群
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN202210260514.5A
Publication of CN116824342A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Software Systems
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Artificial Intelligence
  • Life Sciences & Earth Sciences
  • Health & Medical Sciences
  • Biomedical Technology
  • Biophysics
  • Computational Linguistics
  • Data Mining & Analysis
  • Evolutionary Computation
  • General Health & Medical Sciences
  • Molecular Biology
  • Computing Systems
  • Mathematical Physics
  • Image Analysis

Abstract

Embodiments of the application provide an image processing method and device, relating to the technical field of deep learning. The method comprises: when an image needs to be processed, calling a preset model library file, wherein the preset model library file includes target model files of target deep learning models respectively corresponding to a plurality of pieces of preset image dimension information, and each target model file contains the weight parameters of the corresponding target deep learning model; allocating a target memory space for processing each image to be processed, according to the memory size required by the target deep learning model to process an image with the largest preset image dimension information; for each image to be processed, determining, from the preset model library file, the target model file corresponding to the preset image dimension information that matches the image to be processed; and processing the image to be processed based on the determined target model file and the target memory space. In this way, the efficiency of image processing can be improved.

Description

An image processing method and device

Technical field

This application relates to the field of deep learning technology, and in particular to an image processing method and device.

Background

In the related art, processing an image based on a deep learning model may include the following steps: step one, memory allocation; step two, model loading; step three, forward inference; and step four, memory release. The allocated memory stores the input data and output data of the deep learning model, as well as the deep learning model itself.

When an image is processed based on a deep learning model, the size of the required memory space depends on the image dimension information of the input image. Therefore, when the input images to be processed have different image dimension information, the image processing device must execute steps one through four in a loop; that is, it must allocate and release memory frequently to accommodate images to be processed with different image dimension information. Memory allocation and memory release are time-consuming, which in turn reduces the efficiency of image processing.

Summary of the invention

The purpose of the embodiments of the present application is to provide an image processing method and device that can improve the efficiency of image processing. The specific technical solutions are as follows:

In a first aspect, to achieve the above purpose, an embodiment of the present application discloses an image processing method, the method comprising:

when an image needs to be processed, calling a preset model library file, wherein the preset model library file includes target model files of target deep learning models respectively corresponding to a plurality of pieces of preset image dimension information, and each target model file contains the weight parameters of the corresponding target deep learning model;

allocating a target memory space for processing each image to be processed, according to the memory size required by the target deep learning model to process an image with the largest preset image dimension information;

for each image to be processed, determining, from the preset model library file, the target model file corresponding to the preset image dimension information that matches the image to be processed;

processing the image to be processed based on the determined target model file and the target memory space.

Optionally, before determining, from the preset model library file, the target model file corresponding to the preset image dimension information that matches the image to be processed, the method further includes:

for each image to be processed, judging whether the preset image dimension information includes first image dimension information that is identical to the image dimension information of the image to be processed;

if the first image dimension information exists, determining that the first image dimension information matches the image to be processed;

if the first image dimension information does not exist, determining that second image dimension information among the plurality of pieces of preset image dimension information matches the image to be processed, wherein, among the plurality of pieces of preset image dimension information, the second image dimension information is not smaller than the image dimension information of the image to be processed and has the smallest difference from it.

Optionally, the image dimension information includes the batch number, the number of channels, the image height, and the image width;

determining that the second image dimension information among the plurality of pieces of preset image dimension information matches the image to be processed includes:

determining, from the plurality of pieces of preset image dimension information, the image dimension information that is not smaller than that of the image to be processed in each of the batch number, channel number, image height, and image width dimensions, as third image dimension information;

determining, from the third image dimension information, the second image dimension information that matches the image to be processed, based on the differences from the image to be processed in the batch number, channel number, image height, and image width dimensions.

Optionally, determining, from the third image dimension information, the second image dimension information that matches the image to be processed, based on the differences from the image to be processed in the batch number, channel number, image height, and image width dimensions, includes:

determining, from the third image dimension information, the image dimension information with the smallest difference from the image to be processed in the batch number dimension, as fourth image dimension information;

determining, from the fourth image dimension information, the image dimension information with the smallest difference from the image to be processed in the channel number dimension, as fifth image dimension information;

determining, from the fifth image dimension information, the image dimension information with the smallest difference from the image to be processed in the image height and image width dimensions, as the second image dimension information that matches the image to be processed.
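
As an illustration only, the matching procedure described above could be sketched in Python as follows. The names `Dims` and `match_preset` are hypothetical and do not come from the application, and summing the height and width differences in the last step is an assumption, since the text does not say how the two are combined.

```python
from typing import NamedTuple, Optional, Sequence


class Dims(NamedTuple):
    """Image dimension information (hypothetical container)."""
    n: int  # batch number
    c: int  # number of channels
    h: int  # image height
    w: int  # image width


def match_preset(image: Dims, presets: Sequence[Dims]) -> Optional[Dims]:
    """Return the preset that matches `image`, or None if no preset is large enough."""
    # First image dimension information: an identical preset, if one exists.
    if image in presets:
        return image

    # Third image dimension information: presets not smaller in any dimension.
    candidates = [p for p in presets
                  if all(pv >= iv for pv, iv in zip(p, image))]
    if not candidates:
        return None

    # Fourth, fifth, and second image dimension information: successive
    # filtering by batch difference, then channel difference, then
    # height/width difference is the same as a lexicographic minimum.
    # Assumption: height and width differences are combined by summing.
    return min(
        candidates,
        key=lambda p: (p.n - image.n,
                       p.c - image.c,
                       (p.h - image.h) + (p.w - image.w)),
    )
```

For example, with presets of 1x3x224x224, 1x3x512x512, and 4x3x512x512, a 1x3x300x300 image would be matched to the 1x3x512x512 preset under these assumptions.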

Optionally, the construction process of the preset model library file includes:

obtaining each piece of preset image dimension information, and an initial model file representing the weight parameters of an initial deep learning model;

for each piece of preset image dimension information, generating, according to the initial model file, a model file of the target deep learning model that matches the preset image dimension information, as a target model file, wherein the target deep learning model that matches the preset image dimension information is a deep learning model obtained by optimizing the initial deep learning model according to that preset image dimension information;

encapsulating the target model files to obtain the preset model library file.

Optionally, encapsulating the target model files to obtain the preset model library file includes:

encapsulating the target model files together with a model loading instruction to obtain the preset model library file, wherein the model loading instruction is used to call, from the preset model library file, the target model file that needs to be run, and the preset model library file contains the part common to all target model files and the private part of each target model file.
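
The idea of storing the common part once and a private part per target model file could be sketched as below. This is only a rough illustration under strong assumptions: each model file is represented as a dictionary of named byte blobs, and `package_library` and `load_model` are hypothetical helpers, not the packaging format or loading instruction of the application.

```python
from typing import Dict, Tuple

Dims = Tuple[int, int, int, int]   # (N, C, H, W), hypothetical key type
ModelFile = Dict[str, bytes]       # hypothetical model file: named blobs


def package_library(model_files: Dict[Dims, ModelFile]) -> dict:
    """Split model files into one shared part and a private part per preset."""
    files = list(model_files.values())
    shared: ModelFile = {}
    if files:
        # A blob is shared if every model file has it with identical content.
        for key, value in files[0].items():
            if all(f.get(key) == value for f in files[1:]):
                shared[key] = value
    private = {
        dims: {k: v for k, v in f.items() if k not in shared}
        for dims, f in model_files.items()
    }
    return {"shared": shared, "private": private}


def load_model(library: dict, dims: Dims) -> ModelFile:
    """Stand-in for the model loading instruction: rebuild one model file."""
    return {**library["shared"], **library["private"][dims]}
```

In this sketch, weights that are identical across presets are stored once, while shape-specific data stays private to each preset.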

Optionally, the preset model library file is a dynamic library file or a static library file.

In a second aspect, to achieve the above purpose, an embodiment of the present application discloses an image processing device, the device comprising:

a preset model library file calling module, configured to call a preset model library file when an image needs to be processed, wherein the preset model library file includes target model files of target deep learning models respectively corresponding to a plurality of pieces of preset image dimension information, and each target model file contains the weight parameters of the corresponding target deep learning model;

a target memory space allocation module, configured to allocate a target memory space for processing each image to be processed, according to the memory size required by the target deep learning model to process an image with the largest preset image dimension information;

a target model file determination module, configured to determine, for each image to be processed, the target model file corresponding to the preset image dimension information that matches the image to be processed, from the preset model library file;

an image processing module, configured to process the image to be processed based on the determined target model file and the target memory space.

Optionally, the device further includes:

a judgment module, configured to judge, for each image to be processed and before the target model file corresponding to the preset image dimension information that matches the image to be processed is determined from the preset model library file, whether the preset image dimension information includes first image dimension information that is identical to the image dimension information of the image to be processed;

a first processing module, configured to determine, if the first image dimension information exists, that the first image dimension information matches the image to be processed;

a second processing module, configured to determine, if the first image dimension information does not exist, that second image dimension information among the plurality of pieces of preset image dimension information matches the image to be processed, wherein, among the plurality of pieces of preset image dimension information, the second image dimension information is not smaller than the image dimension information of the image to be processed and has the smallest difference from it.

Optionally, the image dimension information includes the batch number, the number of channels, the image height, and the image width;

the second processing module includes:

a third image dimension information determination submodule, configured to determine, from the plurality of pieces of preset image dimension information, the image dimension information that is not smaller than that of the image to be processed in each of the batch number, channel number, image height, and image width dimensions, as third image dimension information;

a second image dimension information determination submodule, configured to determine, from the third image dimension information, the second image dimension information that matches the image to be processed, based on the differences from the image to be processed in the batch number, channel number, image height, and image width dimensions.

Optionally, the second image dimension information determination submodule is specifically configured to: determine, from the third image dimension information, the image dimension information with the smallest difference from the image to be processed in the batch number dimension, as fourth image dimension information;

determine, from the fourth image dimension information, the image dimension information with the smallest difference from the image to be processed in the channel number dimension, as fifth image dimension information;

determine, from the fifth image dimension information, the image dimension information with the smallest difference from the image to be processed in the image height and image width dimensions, as the second image dimension information that matches the image to be processed.

Optionally, the device further includes:

a preset model library file construction module, configured to: obtain each piece of preset image dimension information, and an initial model file representing the weight parameters of an initial deep learning model;

for each piece of preset image dimension information, generate, according to the initial model file, a model file of the target deep learning model that matches the preset image dimension information, as a target model file, wherein the target deep learning model that matches the preset image dimension information is a deep learning model obtained by optimizing the initial deep learning model according to that preset image dimension information;

encapsulate the target model files to obtain the preset model library file.

Optionally, the preset model library file construction module is specifically configured to encapsulate the target model files together with a model loading instruction to obtain the preset model library file, wherein the model loading instruction is used to call, from the preset model library file, the target model file that needs to be run, and the preset model library file contains the part common to all target model files and the private part of each target model file.

Optionally, the preset model library file is a dynamic library file or a static library file.

In another aspect of the present application, to achieve the above purpose, an embodiment of the present application further discloses an electronic device. The electronic device includes a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;

the memory is configured to store a computer program;

the processor is configured to implement any of the image processing methods described above when executing the program stored in the memory.

In yet another aspect of the present application, a computer-readable storage medium is further provided. The computer-readable storage medium stores a computer program which, when executed by a processor, implements any of the image processing methods described above.

An embodiment of the present application further provides a computer program product containing instructions which, when run on a computer, causes the computer to execute any of the image processing methods described above.

Beneficial effects of the embodiments of this application:

In the image processing method provided by the embodiments of the present application, when an image needs to be processed, a preset model library file is called, wherein the preset model library file includes target model files of target deep learning models respectively corresponding to a plurality of pieces of preset image dimension information, and each target model file contains the weight parameters of the corresponding target deep learning model; a target memory space for processing each image to be processed is allocated according to the memory size required by the target deep learning model to process an image with the largest preset image dimension information; for each image to be processed, the target model file corresponding to the preset image dimension information that matches the image to be processed is determined from the preset model library file; and the image to be processed is processed based on the determined target model file and the target memory space.

With the above processing, because the allocated target memory space is sized according to the memory required to process an image with the largest preset image dimension information, the target memory space is sufficient to process any input image that is smaller than the largest preset image dimension information. That is, a single memory allocation suffices to process multiple images to be processed of different sizes that are smaller than the largest preset image dimension information, without frequent memory allocation and release, which improves the efficiency of image processing.

Of course, implementing any product or method of the present application does not necessarily require achieving all of the advantages described above at the same time.

Description of the drawings

To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; those of ordinary skill in the art can obtain other embodiments based on these drawings.

Figure 1 is a flow chart of an image processing method provided by an embodiment of the present application;

Figure 2 is a flow chart of another image processing method provided by an embodiment of the present application;

Figure 3A is a flow chart of another image processing method provided by an embodiment of the present application;

Figure 3B is a flow chart of another image processing method provided by an embodiment of the present application;

Figure 4 is a schematic diagram of the principle of image processing provided by an embodiment of the present application;

Figure 5 is a flow chart of image processing provided by an embodiment of the present application;

Figure 6 is a flow chart, provided by an embodiment of the present application, for determining the preset image dimension information closest to an input image;

Figure 7 is a structural diagram of an image processing device provided by an embodiment of the present application;

Figure 8 is a structural diagram of an electronic device provided by an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application fall within the scope of protection of this application.

An embodiment of the present application provides an image processing method that can be applied to an electronic device, and the electronic device can process images based on a deep learning model. The deep learning model involved in the embodiments of this application may be a CNN (Convolutional Neural Network) model or an RNN (Recurrent Neural Network) model, but is not limited thereto. Correspondingly, processing an image may mean identifying the category of an object in the image (for example, a person or a vehicle), identifying the image region to which an object in the image belongs, or identifying whether a specified object is present in the image, but is not limited thereto.

Referring to Figure 1, which is a flow chart of an image processing method provided by an embodiment of the present application, the method may include the following steps:

S101: When an image needs to be processed, call a preset model library file.

The preset model library file includes target model files of target deep learning models respectively corresponding to a plurality of pieces of preset image dimension information. Each target model file contains the weight parameters of the corresponding target deep learning model.

S102: Allocate a target memory space for processing each image to be processed, according to the memory size required by the target deep learning model to process an image with the largest preset image dimension information.

S103: For each image to be processed, determine, from the preset model library file, the target model file corresponding to the preset image dimension information that matches the image to be processed.

S104: Process the image to be processed based on the determined target model file and the target memory space.

With the image processing method provided by the embodiments of the present application, because the allocated target memory space is sized according to the memory required to process an image with the largest preset image dimension information, the target memory space is sufficient to process any input image that is smaller than the largest preset image dimension information. That is, a single memory allocation suffices to process multiple images to be processed of different sizes that are smaller than the largest preset image dimension information, without frequent memory allocation and release, which improves the efficiency of image processing.

Regarding step S101, each target model file may also include the relationships between the network layers of the corresponding target deep learning model, and the network parameters of each network layer.

The plurality of pieces of preset image dimension information can be set as required; for example, they can be set to the image dimension information of images commonly used in the current business.

In one embodiment, the image dimension information includes the batch number, the number of channels, the image height, and the image width. That is, the image dimension information may include N (Num, batch number), C (Channel, number of channels), H (Height), and W (Width). N is the batch number, that is, the number of image frames or feature maps that the deep learning model processes simultaneously per unit time; C is the number of channels; H is the height of the image frame; and W is the width of the image frame.

针对步骤S102,在基于深度学习模型对图像进行处理时,需要获取输入数据(即输入的图像),且会输出数据(即处理结果),以及中间缓存数据(例如,网络层生成的特征图)。因此,在基于深度学习模型对图像进行处理之前,需要预先为深度学习模型分配内存空间,以存储上述数据。深度学习模型对图像进行处理所需的内存空间,与输入的图像的图像维度信息相关,即,输入的图像的图像维度信息越大,则所需的内存空间越大;输入的图像的图像维度信息越小,则所需的内存空间越小。Regarding step S102, when processing an image based on a deep learning model, it is necessary to obtain input data (ie, the input image), and output data (ie, the processing result), as well as intermediate cache data (for example, the feature map generated by the network layer) . Therefore, before processing images based on the deep learning model, memory space needs to be allocated to the deep learning model in advance to store the above data. The memory space required by the deep learning model to process images is related to the image dimension information of the input image. That is, the greater the image dimension information of the input image, the greater the memory space required; the image dimension of the input image The smaller the information, the smaller the memory space required.

In the embodiments of the present application, the memory size required to process an image with the largest preset image dimension information based on the deep learning model can be calculated, and memory space (the target memory space) is allocated according to that size. Consequently, for any input image whose image dimension information is smaller than the largest preset image dimension information, the target memory space is sufficient for processing that image.
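As a rough illustration of sizing the target memory space from the largest preset, the following Python sketch computes only the input buffer for an N*C*H*W tensor. The `Dims` class, the rule that "largest" means the largest element count, and the one-byte-per-element default are assumptions made for this example; a real allocation would also have to cover output data and intermediate feature maps, which the description does not quantify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dims:
    """Image dimension information (hypothetical helper type)."""
    n: int  # batch size
    c: int  # number of channels
    h: int  # image height
    w: int  # image width

    def elements(self) -> int:
        return self.n * self.c * self.h * self.w


def input_buffer_bytes(dims: Dims, bytes_per_element: int = 1) -> int:
    # Assumption: only the input tensor is counted here.
    return dims.elements() * bytes_per_element


presets = [Dims(1, 3, 540, 960), Dims(8, 3, 720, 1280), Dims(32, 3, 1080, 1920)]

# Assumption: the "largest" preset is the one with the most elements.
largest = max(presets, key=Dims.elements)
target_bytes = input_buffer_bytes(largest)
```

Every other preset in the list needs no more input space than `target_bytes`, which is what allows the single allocation to be reused.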

In one embodiment, before step S102, the electronic device may also perform other initialization processing, such as loading a software image.

Regarding steps S103 and S104, an image to be processed is an input image that needs to be processed. It may be an image captured by another device, or it may be a feature map.

After the target memory space is allocated, the electronic device can use it to process multiple images to be processed that have different image dimension information.

In one implementation, the electronic device can process each acquired image to be processed in turn. For example, for each image to be processed, the electronic device can determine, from the preset image dimension information, the image dimension information that matches the image to be processed, that is, the one that matches the image dimension information of that image. The electronic device can then obtain the target model file corresponding to the determined image dimension information from the preset model library file, load that target model file to run the corresponding target deep learning model, and process the image to be processed using the allocated target memory space. In this way, the memory space is reused across multiple rounds of image processing.

In one embodiment, after all images to be processed have been processed, the electronic device can release the target memory space to avoid wasting memory.

In one embodiment, referring to Figure 2 and building on Figure 1, the method may further include the following steps before step S103:

S105: For each image to be processed, determine whether the preset image dimension information contains first image dimension information that is identical to the image dimension information of the image to be processed; if so, perform step S106; if not, perform step S107.

S106: Determine that the first image dimension information matches the image to be processed.

S107: Determine that second image dimension information among the multiple pieces of preset image dimension information matches the image to be processed.

Here, among the multiple pieces of preset image dimension information, the second image dimension information is not smaller than the image dimension information of the image to be processed and has the smallest difference from it.

In the embodiments of the present application, if the preset image dimension information contains image dimension information identical to that of the image to be processed (the first image dimension information), the electronic device can determine that the first image dimension information matches the image to be processed. In other words, the image to be processed matches the target deep learning model corresponding to the first image dimension information. When processing the image, the electronic device can then load the target model file corresponding to the first image dimension information and run the corresponding target deep learning model on the image. This improves the efficiency with which the target deep learning model processes the image to be processed.

If no first image dimension information exists, the electronic device determines that the second image dimension information among the preset image dimension information matches the image to be processed. Because the second image dimension information is not smaller than the image dimension information of the image to be processed and has the smallest difference from it, it is the preset closest to the image. When processing the image, the electronic device can therefore load the target model file corresponding to the second image dimension information and run the corresponding target deep learning model on the image. Because the two are close, only a small amount of adaptation (such as padding) is needed, which improves the efficiency with which the target deep learning model processes the image to be processed.

In one embodiment, if the image dimension information includes the batch size, the number of channels, the image height, and the image width, then, referring to Figure 3A and building on Figure 2, step S107 may include the following steps:

S1071: From the multiple pieces of preset image dimension information, determine those that are not smaller than the image dimension information of the image to be processed in every one of the batch size, channel number, image height, and image width dimensions, as third image dimension information.

S1072: Based on the differences from the image to be processed in the batch size, channel number, image height, and image width dimensions, determine, from the third image dimension information, the second image dimension information that matches the image to be processed.

There may be one or more pieces of third image dimension information. If there is only one, the electronic device directly determines that this third image dimension information matches the image to be processed.

If there are multiple pieces of third image dimension information, each of them is not smaller than the image dimension information of the image to be processed in all four dimensions (batch size, number of channels, image height, and image width). The electronic device can therefore select any one of them and process the image to be processed based on the corresponding target deep learning model.

For example, the electronic device can compare each piece of preset image dimension information with the image dimension information of the image to be processed and keep those presets that are not smaller than the image in the batch size, channel number, image height, and image width dimensions, thereby obtaining the third image dimension information.

In one implementation, for each piece of third image dimension information, the electronic device can calculate its difference from the image to be processed in each of the four dimensions (batch size, number of channels, image height, and image width), and then calculate, from these four differences, the total difference between that third image dimension information and the image dimension information of the image to be processed. For example, the electronic device may take the weighted sum of the four differences as the total difference. The electronic device can then determine the third image dimension information with the smallest total difference as the second image dimension information that matches the image to be processed.
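The weighted-sum selection can be sketched in Python as follows. Dimension information is represented as an `(N, C, H, W)` tuple, and the function names and the equal default weights are assumptions for illustration; the description does not specify weight values.

```python
def covers(preset, image):
    """True if the preset is not smaller than the image in every dimension."""
    return all(p >= i for p, i in zip(preset, image))


def match_by_weighted_sum(presets, image, weights=(1.0, 1.0, 1.0, 1.0)):
    # S105/S106: an identical preset matches directly.
    if image in presets:
        return image
    # S1071: keep presets that are not smaller in N, C, H and W.
    candidates = [p for p in presets if covers(p, image)]
    if not candidates:
        return None  # no preset can hold this image

    # S1072: total difference as a weighted sum of per-dimension differences.
    def total_difference(preset):
        return sum(w * (p - i) for w, p, i in zip(weights, preset, image))

    return min(candidates, key=total_difference)


presets = [(1, 3, 540, 960), (1, 3, 720, 1280), (4, 3, 720, 1280), (4, 3, 1080, 1920)]
best = match_by_weighted_sum(presets, (1, 3, 600, 1000))
```

With equal weights, height and width differences dominate because their magnitudes are much larger than batch or channel differences; a larger weight on the batch dimension would shift the choice toward presets with a closer batch size.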

In another implementation, after the third image dimension information is determined, the electronic device may instead randomly select one piece of third image dimension information and process the image to be processed based on the target deep learning model corresponding to the selected image dimension information.

In yet another implementation, after the third image dimension information is determined, the electronic device may determine the second image dimension information that matches the image to be processed from the third image dimension information according to a specified dimension order.

In one embodiment, the specified dimension order may be batch size, number of channels, image height, image width. Accordingly, referring to Figure 3B and building on Figure 3A, step S1072 may include the following steps:

S10721: From the third image dimension information, determine the image dimension information with the smallest difference from the image to be processed in the batch size dimension, as fourth image dimension information.

S10722: From the fourth image dimension information, determine the image dimension information with the smallest difference from the image to be processed in the channel number dimension, as fifth image dimension information.

S10723: From the fifth image dimension information, determine the image dimension information with the smallest difference from the image to be processed in the image height and image width dimensions, as the second image dimension information that matches the image to be processed.

In one implementation, because the batch size has the greatest impact on memory usage, the electronic device can determine the second image dimension information in the order of batch size, number of channels, image height, and image width.

For example, the electronic device determines, from the third image dimension information, the image dimension information with the smallest difference from the image to be processed in the batch size dimension (the fourth image dimension information). If there is only one piece of fourth image dimension information, the electronic device directly determines that it matches the image to be processed.

If there are multiple pieces of fourth image dimension information, the electronic device determines, from among them, the image dimension information with the smallest difference from the image to be processed in the channel number dimension (the fifth image dimension information). If there is only one piece of fifth image dimension information, the electronic device directly determines that it matches the image to be processed.

If there are multiple pieces of fifth image dimension information, the electronic device determines that the one with the smallest difference from the image to be processed in the image height and image width dimensions matches the image to be processed.
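Steps S10721 to S10723 amount to a successive narrowing of the candidate set, which can be sketched in Python as below. The input list is assumed to already contain only third image dimension information (presets not smaller than the image in any dimension), and combining the height and width differences by simple addition is an assumption, since the description does not say how the two are combined.

```python
def narrow(candidates, image, axis):
    """Keep the candidates whose difference on one axis is smallest."""
    best = min(c[axis] - image[axis] for c in candidates)
    return [c for c in candidates if c[axis] - image[axis] == best]


def match_in_order(third, image):
    # S10721: smallest batch-size difference -> fourth image dimension information.
    fourth = narrow(third, image, axis=0)
    if len(fourth) == 1:
        return fourth[0]
    # S10722: smallest channel difference -> fifth image dimension information.
    fifth = narrow(fourth, image, axis=1)
    if len(fifth) == 1:
        return fifth[0]
    # S10723: smallest height/width difference (assumed: sum of both).
    return min(fifth, key=lambda c: (c[2] - image[2]) + (c[3] - image[3]))


third = [(4, 3, 720, 1280), (2, 3, 1080, 1920), (2, 3, 720, 1280), (2, 4, 540, 960)]
chosen = match_in_order(third, (2, 3, 500, 900))
```

In this example the batch step removes `(4, 3, 720, 1280)`, the channel step removes `(2, 4, 540, 960)`, and the height/width step picks the 720x1280 preset over the 1080x1920 one.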

With the above processing, it is not necessary to enumerate every possible image dimension information; that is, the preset model library file does not need to contain a target model file for every possible image dimension information. Through dynamic matching, the nearest preset image dimension information can be selected for forward inference, which improves image processing efficiency while allowing memory space to be reused.

In addition, the specified dimension order may also be batch size, image height, image width, number of channels, or image height, image width, number of channels, batch size, but it is not limited to these. The second image dimension information is then determined in a manner analogous to steps S10721 to S10723.

The second image dimension information determined in this way is not identical to the image dimension information of the image to be processed; that is, it is larger. Therefore, when the image to be processed is processed based on the target deep learning model corresponding to the second image dimension information, adaptation processing can be performed.

For example, if the second image dimension information is larger than the image dimension information of the image to be processed in the batch size dimension, the input data can be supplemented (for example, with image frames whose data is all 0). Accordingly, after the output data is obtained, the output data corresponding to the supplemented data can be cropped away to obtain the final output result.

As another example, if the second image dimension information is larger than the image dimension information of the image to be processed in the image height and image width dimensions, each image frame in the image to be processed can be padded (for example, with padding data of 0). Accordingly, after the output data is obtained, the output data corresponding to the padded region can be cropped away to obtain the final output result.
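The padding and cropping described above can be sketched with plain Python lists. Each frame here is a single-channel 2D list, padding is placed on the bottom and right, and the crop assumes the output has the same spatial layout as the input; all three are simplifying assumptions, since the description only states that padded or supplemented parts are filled with 0 and later cropped.

```python
def pad_frame(frame, target_h, target_w, fill=0):
    """Pad a 2D frame with `fill` on the right and bottom."""
    h, w = len(frame), len(frame[0])
    padded = [row + [fill] * (target_w - w) for row in frame]
    padded += [[fill] * target_w for _ in range(target_h - h)]
    return padded


def pad_batch(frames, target_n, target_h, target_w):
    """Pad every frame, then append all-zero frames up to the batch size."""
    batch = [pad_frame(f, target_h, target_w) for f in frames]
    batch += [[[0] * target_w for _ in range(target_h)]
              for _ in range(target_n - len(frames))]
    return batch


def crop_batch(batch, n, h, w):
    """Drop supplemented frames and the padded region of each frame."""
    return [[row[:w] for row in frame[:h]] for frame in batch[:n]]


frame = [[1, 2], [3, 4]]
batch = pad_batch([frame], target_n=2, target_h=3, target_w=4)
restored = crop_batch(batch, n=1, h=2, w=2)
```

For models whose output shape differs from the input shape (for example, classification or downsampled feature maps), the crop would need to be mapped to the output coordinates instead.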

With the above processing, because the second image dimension information differs least from the image dimension information of the image to be processed, the electronic device needs to perform only a small amount of adaptation when processing the image based on the target deep learning model corresponding to the second image dimension information, which improves the efficiency with which that model processes the image.

In one embodiment, the process of constructing the preset model library file includes:

Step 1: Obtain each piece of preset image dimension information, and an initial model file representing the weight parameters of an initial deep learning model.

Step 2: For each piece of preset image dimension information, generate, from the initial model file, a model file of the target deep learning model that matches that preset image dimension information, as a target model file.

Step 3: Package the target model files to obtain the preset model library file.

Here, the target deep learning model that matches a piece of preset image dimension information is the deep learning model obtained by optimizing the initial deep learning model according to that preset image dimension information.

In the embodiments of the present application, the preset model library file may be generated in advance by the electronic device itself, or it may be generated in advance by another device.

In one implementation, the electronic device can train a deep learning model with an initial structure on a training sample set (sample images) in advance to obtain the initial deep learning model. The weight parameters of the initial deep learning model can then be obtained.

To improve the efficiency of processing images based on the target deep learning models, the preset image dimension information can be set according to a preset step size. For example, across the preset image dimension information, the batch size can be set to 1, 2, 4, 8, 16, 24, or 32, and the image height and image width can be set to commonly used resolutions (for example, 540P, 720P, and 1080P). If the largest preset image dimension information is 32*3*1080*1920, then the largest supported image processing size is 32*3*1080*1920.
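A preset grid of this kind can be enumerated with a few lines of Python. The batch sizes and the 1080*1920 resolution come from the text above; the 16:9 widths assumed for 540P and 720P (960 and 1280) and the fixed channel count of 3 are assumptions for this example.

```python
from itertools import product

BATCH_SIZES = [1, 2, 4, 8, 16, 24, 32]
# (height, width) per resolution label; 540P and 720P widths are assumed.
RESOLUTIONS = {"540P": (540, 960), "720P": (720, 1280), "1080P": (1080, 1920)}
CHANNELS = 3  # assumed: three-channel input

# One (N, C, H, W) preset per combination of batch size and resolution.
presets = [
    (n, CHANNELS, h, w)
    for n, (h, w) in product(BATCH_SIZES, RESOLUTIONS.values())
]

largest = max(presets)  # tuple comparison: N first, then C, H, W
```

Each entry of `presets` would correspond to one target model file in the preset model library.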

In one implementation, the electronic device can generate the preset model library file through a model generation engine. Accordingly, the preset image dimension information can be recorded in a preset json file so that the model generation engine can load it. The json file can also record the data format of the input image, which may be, for example, Int8, FP16, or Float.
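The description does not give the layout of the json file, so the following Python sketch uses a hypothetical schema (the keys `data_format`, `presets`, `n`, `c`, `h`, and `w` are invented for illustration) to show how such a configuration might be loaded and validated.

```python
import json

# Hypothetical configuration; the real schema is not specified in the text.
CONFIG_TEXT = """
{
  "data_format": "FP16",
  "presets": [
    {"n": 1, "c": 3, "h": 720, "w": 1280},
    {"n": 8, "c": 3, "h": 1080, "w": 1920}
  ]
}
"""

ALLOWED_FORMATS = {"Int8", "FP16", "Float"}  # formats named in the text


def load_presets(text):
    """Parse the config and return (data_format, list of (N, C, H, W))."""
    config = json.loads(text)
    data_format = config["data_format"]
    if data_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported data format: {data_format}")
    presets = [(p["n"], p["c"], p["h"], p["w"]) for p in config["presets"]]
    return data_format, presets


data_format, presets = load_presets(CONFIG_TEXT)
```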

Then, for each piece of preset image dimension information, the model generation engine can optimize the initial deep learning model according to the initial model file. In one implementation, the electronic device can fuse network layers in the initial deep learning model based on the above data. For example, it can fuse the Conv (convolution) layer, the BN (Batch Normalization) layer, and the ReLU (Rectified Linear Unit) layer; and/or fuse the Concat (fully connected) layer and the convolution layer.

With the above processing, weights can be shared among the network layers of the deep learning model, which reduces the space occupied by each target model file and therefore the space occupied by the preset model library file. Accordingly, if image processing is performed on an NPU (Neural-network Processing Unit, an embedded neural network processor), the utilization of NPU on-chip resources is also improved.

In addition, for each piece of preset image dimension information, the electronic device can determine the dimensions of the input and output data of each network layer in the corresponding target deep learning model and, taking the relationships among the network layers into account, determine the memory reuse scheme among the network layers and record it in the corresponding target model file.

In one embodiment, for each piece of preset image dimension information, the electronic device can determine the dimensions of the input and output data of each network layer in the corresponding target deep learning model and, taking the current hardware characteristics (such as the on-chip memory size) into account, determine the data loading mode of each network layer. For example, for each network layer, if the dimension of its input data is greater than a threshold, the data loading mode recorded in the corresponding target model file can be block loading, for example loading 1/4 or 1/8 of the input data at a time. The threshold can be determined from the on-chip memory size. If the dimension of the input data of the network layer is less than the threshold, the data loading mode recorded in the corresponding target model file can be whole loading, that is, the entire input data is loaded directly.
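A minimal Python sketch of this loading-mode decision follows. The function name, the returned dictionary, the treatment of an input exactly equal to the threshold as whole loading, and the rule of picking the smallest block count from (4, 8) that fits are all assumptions; the description only names whole loading and block loading of 1/4 or 1/8 of the input.

```python
import math


def loading_plan(input_elements, threshold, block_counts=(4, 8)):
    """Choose whole or block loading for one network layer's input."""
    if input_elements <= threshold:
        # Input fits within the threshold: load it in one piece.
        return {"mode": "whole", "blocks": 1}
    for blocks in block_counts:
        # Assumed rule: use the fewest blocks whose size fits the threshold.
        if math.ceil(input_elements / blocks) <= threshold:
            return {"mode": "blocked", "blocks": blocks}
    raise ValueError("input too large for the supported block counts")


small = loading_plan(100, threshold=200)
quarter = loading_plan(1000, threshold=300)
eighth = loading_plan(1000, threshold=130)
```

The resulting plan is the kind of per-layer record that could be stored in the target model file alongside the memory reuse scheme.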

In one embodiment, the model generation engine can also support user-defined plug-ins and include the configuration information of user-defined network layers (for example, weight information and implementation details) when generating the preset model library file. This improves the flexibility of the generated preset model library file and therefore of image processing. Here, the implementation details may include the memory reuse scheme among network layers and the data loading mode of each network layer.

In one embodiment, to further reduce the space occupied by the preset model library file, Step 3 above may include: packaging the target model files together with a model loading instruction to obtain the preset model library file.

Here, the model loading instruction is used to call the target model file that needs to be run from the preset model library file. The preset model library file contains the shared part common to the target model files and the private part of each target model file.

In the embodiments of the present application, the target deep learning models corresponding to the different preset image dimension information have identical parts. Therefore, to further reduce the space occupied by the preset model library file, only one copy of the parts that are identical across target model files (the shared part) needs to be recorded in it. The parts of each target model file that differ from the other target model files (the private parts) are recorded separately.

For example, if the shared part is a network layer that is identical in multiple target model files, then the preset model library file records only one copy of that network layer for those target model files and marks it as their shared part. The preset model library file can also record the other network layers of those target model files and mark which target model file each of them belongs to.
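The shared/private split can be sketched in Python as a deduplication over layers. Model files are modeled as dictionaries from layer name to serialized bytes, and treating a layer as shared when both its name and its bytes are identical is an assumption made for this example, as are the function name and the record layout.

```python
from collections import defaultdict


def pack_library(model_files):
    """Store each identical layer once and record which models own it."""
    owners = defaultdict(list)
    for model_name, layers in model_files.items():
        for layer_name, blob in layers.items():
            owners[(layer_name, blob)].append(model_name)

    library = {"shared": [], "private": []}
    for (layer_name, blob), names in owners.items():
        entry = {"layer": layer_name, "data": blob, "owners": names}
        # A layer owned by more than one model file is a shared part.
        kind = "shared" if len(names) > 1 else "private"
        library[kind].append(entry)
    return library


models = {
    "model_1x3x720x1280": {"conv1": b"AAAA", "fc": b"B1"},
    "model_8x3x720x1280": {"conv1": b"AAAA", "fc": b"B2"},
}
library = pack_library(models)
stored = sum(len(e["data"]) for kind in library.values() for e in kind)
```

Here the two model files hold 12 bytes of layer data in total, while the packed library stores 8, because `conv1` is kept only once.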

In one implementation, the electronic device can read the shared and private parts in binary form and write them into a C file, and then package a model loading instruction (for example, a vload instruction) together with this data to obtain the preset model library file. For example, the electronic device can package the data in Fatbin form. Alternatively, the electronic device can read the shared and private parts in binary form and write them into a C++ file to generate the preset model library file. In this application, the type of file written is not limited to C and C++ files; other types may also be used.

In one embodiment, the preset model library file may be a dynamic library file; that is, the electronic device can compile the above data into a dynamic library file to obtain the preset model library file.

For example, the dynamic library file can be generated with the following command:

gcc -shared -fpic XXX.c -o libXXX.so

In the above command, -shared indicates that the output is a shared library; -fpic indicates that position-independent code is generated for the output file; and XXX stands for the name of the generated dynamic library file.

In one implementation, the code for the forward inference process (that is, processing the input image) can be packaged as a library file to obtain an inference library. The electronic device performs forward inference by calling the inference library. If the preset model library file is a dynamic library file, then while calling the inference library the electronic device can link to the preset model library file through dynamic linking in order to process images based on the target deep learning model. With dynamic linking, the inference library only needs to record the entry address of the preset model library file rather than the entire file, which saves the memory space occupied by the inference library.

In one embodiment, the preset model library file may be a static library file; that is, the electronic device can compile the above data into a static library file to obtain the preset model library file.

For example, the static library file can be generated with the following commands:

gcc -c XXX.c

ar -cr XXX.a XXX.o

In the above commands, ar -cr packages the compiled object files into a single static library file, and XXX stands for the name of the generated static library file.

Refer to Figure 4, which is a schematic diagram of the principle of image processing provided by an embodiment of the present application.

The offline part covers the process of generating the preset model library file before the deep learning model goes online. For example, the model generation engine can generate a model file for each piece of preset image dimension information based on the json file package (which records the preset image dimension information) and the weight file. The model files and the model loading instruction are then packaged to generate a so library (a dynamic library), which is the preset model library file. In addition, when generating the model files, the model generation engine can support user-defined plug-ins and include the configuration information of user-defined network layers (for example, weight information and implementation details) in the preset model library file, which improves the flexibility of the generated file.

The forward inference part covers the process of processing images after the deep learning model goes online. For example, after input data (such as an image) is obtained, the inference library interface can be called to invoke the inference library (the library obtained by packaging the code for the forward inference process) to process the input data. Specifically, during processing, the generated so library can be called so that the input data is processed with the model file that matches it, yielding the output result.

Refer to Figure 5, which is a flow chart of image processing provided by an embodiment of the present application.

First, the model is loaded; that is, the preset model library file is called, which contains the model files corresponding to the multiple pieces of preset image dimension information. Next, the maximum memory space is allocated; that is, memory is allocated according to the size required to process an image with the largest preset image dimension information. It is then determined whether the preset image dimension information contains one that is identical to the image dimension information of the input image.

If so, the input image is processed, using the allocated memory space, with the model file corresponding to the consistent preset image dimension information.

If not, the model file corresponding to the preset image dimension information closest to the input image is selected to process the input image, using the allocated memory space. The preset image dimension information closest to the input image is the preset image dimension information that is not smaller than the image dimension information of the input image and differs from it the least. In addition, the output data may be cropped: when the input image is processed, the input image may be padded and/or the input data may be supplemented, and correspondingly, after the output data is obtained, the output data corresponding to the padded part and/or the supplemented part may be cropped.
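The padding and cropping described above can be sketched as follows. This is an illustration under stated assumptions rather than the method of the present application: the image is a plain 2-D list of rows, padding is added at the bottom and right with a fill value of 0, and the stand-in `model` callable is assumed to return an output with the same height and width as its input. All function names are hypothetical.

```python
def pad_to(image, target_h, target_w, fill=0):
    """Pad a 2-D image (list of rows) at the bottom and right up to the target size."""
    h = len(image)
    w = len(image[0]) if h else 0
    if h > target_h or w > target_w:
        raise ValueError("image is larger than the target shape")
    padded = [row + [fill] * (target_w - w) for row in image]
    padded += [[fill] * target_w for _ in range(target_h - h)]
    return padded


def crop_to(output, h, w):
    """Keep only the top-left h x w region, discarding results for the padding."""
    return [row[:w] for row in output[:h]]


def run_padded(image, target_h, target_w, model):
    """Pad a non-empty image to the preset size, run the model, crop the output."""
    h, w = len(image), len(image[0])
    padded = pad_to(image, target_h, target_w)
    return crop_to(model(padded), h, w)
```

For example, `run_padded([[1, 2], [3, 4]], 3, 4, model)` pads the 2 x 2 image to 3 x 4, runs `model` on the padded image, and returns only the 2 x 2 region that corresponds to the original pixels.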

After the processing result is output, the allocated memory space may be released.

Referring to FIG. 6, FIG. 6 is a flowchart, provided by an embodiment of the present application, for determining the preset image dimension information closest to the input image.

After the image dimension information of the input image (i.e., N, C, H, W) is obtained, N, C, H, and W may be compared one by one to determine whether the preset image dimension information includes third image dimension information, that is, preset image dimension information that is not smaller than the image dimension information of the input image in all four dimensions.

If not, it is determined that processing of the input image is not supported.

If so, fourth image dimension information having the smallest difference from the input image in the N dimension is determined from the third image dimension information. Then, fifth image dimension information having the smallest difference from the input image in the C dimension is determined from the fourth image dimension information. Finally, the image dimension information having the smallest difference from the input image in the H and W dimensions is determined from the fifth image dimension information and used as the preset image dimension information closest to the input image.
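The selection procedure of FIG. 6 can be sketched in Python as below. The function name and the example shapes are hypothetical. The text does not say how the H and W differences are combined in the last step; the sketch assumes their sum, which is one possible choice.

```python
def closest_preset(input_dims, preset_dims):
    """Return the preset (N, C, H, W) closest to input_dims, or None if unsupported."""
    n, c, h, w = input_dims

    # Third image dimension information: presets no smaller than the input
    # in every one of the four dimensions.
    candidates = [
        p for p in preset_dims
        if all(preset >= actual for preset, actual in zip(p, input_dims))
    ]
    if not candidates:
        return None  # processing of this input is not supported

    # Fourth image dimension information: smallest difference in N.
    best_n = min(p[0] - n for p in candidates)
    candidates = [p for p in candidates if p[0] - n == best_n]

    # Fifth image dimension information: smallest difference in C.
    best_c = min(p[1] - c for p in candidates)
    candidates = [p for p in candidates if p[1] - c == best_c]

    # Final choice: smallest difference in H and W (summed; an assumption).
    return min(candidates, key=lambda p: (p[2] - h) + (p[3] - w))
```

With presets `[(1, 3, 224, 224), (1, 3, 512, 512), (4, 3, 256, 256), (1, 3, 256, 320)]`, an input of shape `(1, 3, 240, 300)` is matched to `(1, 3, 256, 320)`, while an input of shape `(8, 3, 10, 10)` yields `None` because no preset has a batch size of at least 8. An input whose shape equals a preset is matched to that preset, since all of its differences are zero.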

Based on the same inventive concept, an embodiment of the present application further provides an image processing apparatus. Referring to FIG. 7, FIG. 7 is a structural diagram of an image processing apparatus provided by an embodiment of the present application. The apparatus may include:

a preset model library file calling module 701, configured to call a preset model library file when an image needs to be processed; wherein the preset model library file includes: target model files of the target deep learning model corresponding to each of a plurality of pieces of preset image dimension information; and each target model file contains the weight parameters of the target deep learning model;

a target memory space allocation module 702, configured to allocate a target memory space for processing each image to be processed, according to the memory size required by the target deep learning model to process an image with the largest preset image dimension information;

a target model file determination module 703, configured to determine, for each image to be processed, from the preset model library file, the target model file corresponding to the preset image dimension information that matches the image to be processed;

an image processing module 704, configured to process the image to be processed based on the determined target model file and the target memory space.

Optionally, the apparatus further includes:

a judgment module, configured to, before the target model file corresponding to the preset image dimension information that matches the image to be processed is determined from the preset model library file, judge, for each image to be processed, whether the preset image dimension information includes first image dimension information consistent with the image dimension information of the image to be processed;

a first processing module, configured to, if the first image dimension information exists, determine that the first image dimension information matches the image to be processed;

a second processing module, configured to, if the first image dimension information does not exist, determine that second image dimension information among the plurality of pieces of preset image dimension information matches the image to be processed; wherein, among the plurality of pieces of preset image dimension information, the second image dimension information is not smaller than the image dimension information of the image to be processed and has the smallest difference from the image dimension information of the image to be processed.

Optionally, the image dimension information includes the batch size, the number of channels, the image height, and the image width;

the second processing module includes:

a third image dimension information determination submodule, configured to determine, from the plurality of pieces of preset image dimension information, the image dimension information that is not smaller than that of the image to be processed in the batch size, number of channels, image height, and image width dimensions, as third image dimension information;

a second image dimension information determination submodule, configured to determine, from the third image dimension information, the second image dimension information that matches the image to be processed, based on the differences from the image to be processed in the batch size, number of channels, image height, and image width dimensions.

Optionally, the second image dimension information determination submodule is specifically configured to determine, from the third image dimension information, the image dimension information having the smallest difference from the image to be processed in the batch size dimension, as fourth image dimension information;

determine, from the fourth image dimension information, the image dimension information having the smallest difference from the image to be processed in the number of channels dimension, as fifth image dimension information;

and determine, from the fifth image dimension information, the image dimension information having the smallest difference from the image to be processed in the image height and image width dimensions, as the second image dimension information that matches the image to be processed.

Optionally, the apparatus further includes:

a preset model library file construction module, configured to obtain each piece of preset image dimension information and an initial model file representing the weight parameters of an initial deep learning model;

for each piece of preset image dimension information, generate, according to the initial model file, a model file of the target deep learning model that matches the preset image dimension information, as a target model file; wherein the target deep learning model that matches the preset image dimension information is the deep learning model obtained by optimizing the initial deep learning model according to the preset image dimension information;

and obtain the preset model library file by encapsulating the target model files.

Optionally, the preset model library file construction module is specifically configured to encapsulate the target model files and a model loading instruction to obtain the preset model library file; wherein the model loading instruction is used to call, from the preset model library file, the target model file that needs to be run; and the preset model library file contains the part shared by the target model files and the part private to each target model file.
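The idea of storing a shared part once alongside a private part per preset can be sketched as follows. This is a loose illustration in Python, not the so library described in the present application: the JSON key `preset_dims`, the dictionary layout, and the choice of treating the weights as the shared part are all assumptions, and `load_model` is only a stand-in for the model loading instruction.

```python
import json


def build_library(dims_json, weights):
    """Bundle one entry per preset shape, storing the shared part only once."""
    presets = [tuple(d) for d in json.loads(dims_json)["preset_dims"]]
    return {
        # Part common to every target model file (assumed here to be the weights).
        "shared": {"weights": weights},
        # Part private to each target model file, keyed by its (N, C, H, W) shape.
        "private": {dims: {"input_shape": dims} for dims in presets},
    }


def load_model(library, dims):
    """Stand-in for the model loading instruction: assemble one model on request."""
    private = library["private"][dims]  # raises KeyError for an unknown shape
    return {**library["shared"], **private}
```

Keeping the shared part in one place means the size of the library grows with the private parts only, rather than with a full copy of the model per preset.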

Optionally, the preset model library file is a dynamic library file or a static library file.

An embodiment of the present application further provides an electronic device, as shown in FIG. 8, including a processor 801, a communication interface 802, a memory 803, and a communication bus 804, wherein the processor 801, the communication interface 802, and the memory 803 communicate with one another through the communication bus 804;

the memory 803 is configured to store a computer program;

the processor 801 is configured to implement the following steps when executing the program stored in the memory 803:

calling a preset model library file when an image needs to be processed; wherein the preset model library file includes: target model files of the target deep learning model corresponding to each of a plurality of pieces of preset image dimension information; and each target model file contains the weight parameters of the target deep learning model;

allocating a target memory space for processing each image to be processed, according to the memory size required by the target deep learning model to process an image with the largest preset image dimension information;

for each image to be processed, determining, from the preset model library file, the target model file corresponding to the preset image dimension information that matches the image to be processed;

processing the image to be processed based on the determined target model file and the target memory space.

The communication bus mentioned for the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.

The communication interface is used for communication between the above electronic device and other devices.

The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

The above processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In yet another embodiment provided by the present application, a computer-readable storage medium is further provided. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps of any of the above image processing methods are implemented.

In yet another embodiment provided by the present application, a computer program product containing instructions is further provided, which, when run on a computer, causes the computer to execute any of the image processing methods in the above embodiments.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Furthermore, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The embodiments in this specification are described in a related manner; for the parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus, electronic device, computer-readable storage medium, and computer program product embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.

The above are only preferred embodiments of the present application and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application falls within the scope of protection of the present application.

Claims (10)

1. An image processing method, characterized in that the method comprises:
calling a preset model library file when an image needs to be processed; wherein the preset model library file includes: target model files of the target deep learning model corresponding to each of a plurality of pieces of preset image dimension information; and each target model file contains the weight parameters of the target deep learning model;
allocating a target memory space for processing each image to be processed, according to the memory size required by the target deep learning model to process an image with the largest preset image dimension information;
for each image to be processed, determining, from the preset model library file, the target model file corresponding to the preset image dimension information that matches the image to be processed;
processing the image to be processed based on the determined target model file and the target memory space.

2. The method according to claim 1, characterized in that, before determining, from the preset model library file, the target model file corresponding to the preset image dimension information that matches the image to be processed, the method further comprises:
for each image to be processed, judging whether the preset image dimension information includes first image dimension information consistent with the image dimension information of the image to be processed;
if the first image dimension information exists, determining that the first image dimension information matches the image to be processed;
if the first image dimension information does not exist, determining that second image dimension information among the plurality of pieces of preset image dimension information matches the image to be processed; wherein, among the plurality of pieces of preset image dimension information, the second image dimension information is not smaller than the image dimension information of the image to be processed and has the smallest difference from the image dimension information of the image to be processed.

3. The method according to claim 2, characterized in that the image dimension information includes the batch size, the number of channels, the image height, and the image width;
the determining that second image dimension information among the plurality of pieces of preset image dimension information matches the image to be processed comprises:
determining, from the plurality of pieces of preset image dimension information, the image dimension information that is not smaller than that of the image to be processed in the batch size, number of channels, image height, and image width dimensions, as third image dimension information;
determining, from the third image dimension information, the second image dimension information that matches the image to be processed, based on the differences from the image to be processed in the batch size, number of channels, image height, and image width dimensions.

4. The method according to claim 3, characterized in that the determining, from the third image dimension information, the second image dimension information that matches the image to be processed, based on the differences from the image to be processed in the batch size, number of channels, image height, and image width dimensions, comprises:
determining, from the third image dimension information, the image dimension information having the smallest difference from the image to be processed in the batch size dimension, as fourth image dimension information;
determining, from the fourth image dimension information, the image dimension information having the smallest difference from the image to be processed in the number of channels dimension, as fifth image dimension information;
determining, from the fifth image dimension information, the image dimension information having the smallest difference from the image to be processed in the image height and image width dimensions, as the second image dimension information that matches the image to be processed.

5. The method according to claim 1, characterized in that the construction process of the preset model library file comprises:
obtaining each piece of preset image dimension information and an initial model file representing the weight parameters of an initial deep learning model;
for each piece of preset image dimension information, generating, according to the initial model file, a model file of the target deep learning model that matches the preset image dimension information, as a target model file; wherein the target deep learning model that matches the preset image dimension information is the deep learning model obtained by optimizing the initial deep learning model according to the preset image dimension information;
obtaining the preset model library file by encapsulating the target model files.

6. The method according to claim 5, characterized in that the obtaining the preset model library file by encapsulating the target model files comprises:
encapsulating the target model files and a model loading instruction to obtain the preset model library file; wherein the model loading instruction is used to call, from the preset model library file, the target model file that needs to be run; and the preset model library file contains the part shared by the target model files and the part private to each target model file.

7. The method according to any one of claims 1 to 6, characterized in that the preset model library file is a dynamic library file or a static library file.

8. An image processing apparatus, characterized in that the apparatus comprises:
a preset model library file calling module, configured to call a preset model library file when an image needs to be processed; wherein the preset model library file includes: target model files of the target deep learning model corresponding to each of a plurality of pieces of preset image dimension information; and each target model file contains the weight parameters of the target deep learning model;
a target memory space allocation module, configured to allocate a target memory space for processing each image to be processed, according to the memory size required by the target deep learning model to process an image with the largest preset image dimension information;
a target model file determination module, configured to determine, for each image to be processed, from the preset model library file, the target model file corresponding to the preset image dimension information that matches the image to be processed;
an image processing module, configured to process the image to be processed based on the determined target model file and the target memory space.

9. An electronic device, characterized in that it comprises a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the method steps of any one of claims 1 to 7 when executing the program stored in the memory.

10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the method steps of any one of claims 1 to 7 are implemented.
CN202210260514.5A 2022-03-16 2022-03-16 Image processing method and device Pending CN116824342A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210260514.5A CN116824342A (en) 2022-03-16 2022-03-16 Image processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210260514.5A CN116824342A (en) 2022-03-16 2022-03-16 Image processing method and device

Publications (1)

Publication Number Publication Date
CN116824342A true CN116824342A (en) 2023-09-29

Family

ID=88126236

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210260514.5A Pending CN116824342A (en) 2022-03-16 2022-03-16 Image processing method and device

Country Status (1)

Country Link
CN (1) CN116824342A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109829542A (en) * 2019-01-29 2019-05-31 武汉星巡智能科技有限公司 Method and device for reconstruction of multivariate deep network model based on multi-core processor
CN112712097A (en) * 2019-10-25 2021-04-27 杭州海康威视数字技术股份有限公司 Image identification method and device based on open platform and user side
WO2021162273A1 (en) * 2020-02-11 2021-08-19 삼성전자 주식회사 Electronic device and method for performing image processing
CN113963175A (en) * 2021-05-13 2022-01-21 北京市商汤科技开发有限公司 Image processing method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN109919308B (en) A neural network model deployment method, prediction method and related equipment
US20220076123A1 (en) Neural network optimization method, electronic device and processor
KR102631381B1 (en) Convolutional neural network processing method and apparatus
CN216053088U (en) Processing device for performing convolutional neural network operations
CN106250310A (en) A kind of method for generating test case and device
US20230350676A1 (en) Tensor Processing Method, Apparatus, and Device, and Computer-Readable Storage Medium
US20210082119A1 (en) Image semantic segmentation method, programmable logic circuit, system and electronic device
CN116151994B (en) Structured data computing method, computing engine, device and readable storage medium
CN111767243A (en) Data processing method, related apparatus and computer readable medium
CN116824342A (en) Image processing method and device
CN114612025B (en) Waybill data clustering method, device, computer equipment and storage medium
CN117828155B (en) Anti-crawler processing method and device, electronic equipment and storage medium
CN118733104A (en) API calling method, system, device and medium based on metamodel
CN114065123A (en) Sparse matrix calculation method and acceleration device
CN116976432A (en) Chip simulation method and device supporting task parallel processing and chip simulator
CN114662485B (en) A translation model compression method, translation method and related device
CN115034351B (en) Data processing method, convolutional neural network training method and device and FPGA
CN111767246B (en) Data processing methods, related equipment and computer-readable media
US11526709B2 (en) Information processing apparatus, information processing method, and storage medium for classifying object of interest
CN110443746B (en) Picture processing method and device based on generation countermeasure network and electronic equipment
CN110968832B (en) Data processing method and device
CN119396347B (en) Data dictionary processing method, device, computer equipment, readable storage medium and program product
CN111158940A (en) Method and device for docking and dynamically loading different devices in field of Internet of things
CN120296284B (en) Reduction processing method, reduction processing device, reduction processing equipment, reduction processing storage medium and reduction processing program product for artificial intelligent chip
CN118195887B (en) Training method and device for picture foreground keeping model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination