
CN111950551A - A target detection method based on convolutional neural network - Google Patents


Info

Publication number
CN111950551A
CN111950551A
Authority
CN
China
Prior art keywords
feature map
region
neural network
convolution
convolutional neural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010816397.7A
Other languages
Chinese (zh)
Other versions
CN111950551B (en)
Inventor
李松江
吴宁
王鹏
Current Assignee
Changchun University of Science and Technology
Original Assignee
Changchun University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Changchun University of Science and Technology
Priority to CN202010816397.7A
Publication of CN111950551A
Application granted
Publication of CN111950551B
Legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/25: Fusion techniques
    • G06F 18/253: Fusion techniques of extracted features
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods


Abstract

The invention relates to a target detection method based on a convolutional neural network, comprising: performing feature extraction based on a residual convolutional neural network to obtain layer-by-layer basic feature maps; fusing the basic feature maps in order from shallow to deep to obtain a fused feature map; extracting candidate boxes from the fused feature map based on a region proposal network to obtain a candidate target region feature map; obtaining a region-of-interest feature map according to the fused feature map and the candidate target region feature map; and obtaining classification scores and bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map. The present invention achieves high detection accuracy for small targets and occluded targets.

Description

A target detection method based on a convolutional neural network

TECHNICAL FIELD

The invention relates to the technical field of image information processing, and in particular to a target detection method based on a convolutional neural network.

BACKGROUND

With the increasing pressure of road traffic, intelligent management and control of road vehicles through computer technology has become a research hotspot. Detecting vehicle targets with road monitoring equipment and mastering the vehicle data and driving trajectories of the road network are prerequisites for optimizing traffic and relieving traffic pressure; vehicle target detection is also the research basis for autonomous driving, vehicle tracking, and vehicle feature recognition.

At present, convolutional neural networks are widely used in vehicle target detection. Common algorithms fall into single-stage and two-stage detection algorithms: a single-stage detection algorithm is a regression-based target detection algorithm, while a two-stage detection algorithm first generates candidate regions and then classifies and refines them. Owing to this structural difference, two-stage detection algorithms achieve higher detection accuracy but lower detection speed than single-stage algorithms, making them suitable for scenarios that demand high detection accuracy.

Existing two-stage target detection algorithms have the following problem: because occluded targets and small targets have few features, existing algorithms make insufficient use of shallow position information and context information, so detection accuracy for small and occluded targets is low.

SUMMARY OF THE INVENTION

The purpose of the present invention is to provide a target detection method based on a convolutional neural network that has high detection accuracy for small targets and occluded targets.

To achieve the above purpose, the present invention provides the following scheme:

A target detection method based on a convolutional neural network, comprising:

performing feature extraction based on a residual convolutional neural network to obtain layer-by-layer basic feature maps;

fusing the basic feature maps in order from shallow to deep to obtain a fused feature map;

extracting candidate boxes from the fused feature map based on a region proposal network to obtain a candidate target region feature map;

obtaining a region-of-interest feature map according to the fused feature map and the candidate target region feature map; and

obtaining classification scores and bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map.

Preferably, the basic feature maps include a first feature map, a second feature map, a third feature map, and a fourth feature map.

Preferably, fusing the basic feature maps in order from shallow to deep to obtain the fused feature map includes:

performing downsampling on the first feature map to obtain a downsampled feature map;

performing convolutional dimension reduction on the second feature map to obtain a dimension-reduced feature map, where the number of channels of the dimension-reduced feature map is the same as the number of channels of the downsampled feature map; and

fusing the downsampled feature map with the dimension-reduced feature map to obtain an initial fused feature map, the fused feature map being finally obtained in the same way.

Preferably, performing downsampling on the first feature map to obtain the downsampled feature map includes:

downsampling the first feature map separately through n branches of dilated convolution, where n is a positive integer greater than 1; and

fusing the first feature maps downsampled by the dilated convolution of each branch to obtain the downsampled feature map.

Preferably, n is 3, and the dilation rates of the three branches are 1, 2, and 3, respectively.

Preferably, extracting candidate boxes from the fused feature map based on the region proposal network to obtain the candidate target region feature map includes:

performing convolution on the fused feature map with a first set convolution kernel to obtain a first convolutional feature map;

performing convolution on the first convolutional feature map with a second set convolution kernel to obtain a second convolutional feature map;

performing convolution on the second convolutional feature map with the second set convolution kernel to obtain a third convolutional feature map; and

inputting the second convolutional feature map and the third convolutional feature map into two parallel fully connected layers, respectively, and processing them based on set anchor boxes to obtain the candidate target region feature map.

Preferably, obtaining the classification scores and bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map includes:

obtaining initial classification scores and initial bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map; and

replacing the set anchor boxes with the initial bounding-box regressions and repeating the subsequent steps; by setting m thresholds and repeating this process m times, the classification scores and the bounding-box regressions are obtained, where m is a positive integer greater than or equal to 1.

Preferably, the first set convolution kernel is 3×3 and the second set convolution kernel is 1×1.

Preferably, obtaining the region-of-interest feature map according to the fused feature map and the candidate target region feature map includes:

fusing the fused feature map and the candidate target region feature map based on ROI Align to obtain an initial region-of-interest feature map;

enlarging the initial region-of-interest feature map by a set multiple to obtain an enlarged region-of-interest feature map;

performing global context extraction on the initial region-of-interest feature map based on the enlarged region-of-interest feature map to obtain context information; and

fusing the initial region-of-interest feature map with the context information based on ROI Align to obtain the region-of-interest feature map.

Preferably, the residual convolutional neural network is a ResNet-101 network.

According to the specific embodiments provided by the present invention, the present invention discloses the following technical effects:

The invention relates to a target detection method based on a convolutional neural network, comprising: performing feature extraction based on a residual convolutional neural network to obtain layer-by-layer basic feature maps; fusing the basic feature maps in order from shallow to deep to obtain a fused feature map; extracting candidate boxes from the fused feature map based on a region proposal network to obtain a candidate target region feature map; obtaining a region-of-interest feature map according to the fused feature map and the candidate target region feature map; and obtaining classification scores and bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map. The present invention achieves high detection accuracy for small targets and occluded targets.

BRIEF DESCRIPTION OF THE DRAWINGS

To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required by the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.

FIG. 1 is a flowchart of the target detection method based on a convolutional neural network according to the present invention.

DETAILED DESCRIPTION

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

The purpose of the present invention is to provide a target detection method based on a convolutional neural network that has high detection accuracy for small targets and occluded targets.

To make the above objects, features, and advantages of the present invention more comprehensible, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

FIG. 1 is a flowchart of the target detection method based on a convolutional neural network according to the present invention. As shown in FIG. 1, the present invention provides a target detection method based on a convolutional neural network, comprising:

Step S1: perform feature extraction based on the residual convolutional neural network ResNet-101 to obtain layer-by-layer basic feature maps, specifically a first feature map, a second feature map, a third feature map, and a fourth feature map. In this embodiment, the convolutional layers of ResNet-101 are detailed in Table 1.

Table 1. Convolutional layers of ResNet-101

(Table 1 appears as an image in the original publication and is not reproduced here.)

where w is the width of the region of interest and h is the height of the region of interest.

Step S2: fuse the basic feature maps in order from shallow to deep to obtain the fused feature map.

The fusion of the first feature map and the second feature map is taken as an example; the specific process is as follows:

Downsample the first feature map separately through n branches of dilated convolution, where n is a positive integer greater than 1. In this embodiment, n is 3, the convolution kernel size is 3×3, the convolution stride is 2, and the dilation rates of the three branches are 1, 2, and 3, respectively.

Fuse the first feature maps downsampled by the dilated convolution of each branch to obtain the downsampled feature map. The specific calculation formula is:

F = H_{3,1}(x) + H_{3,2}(x) + H_{3,3}(x)

where F denotes the fused downsampled feature map, H_{k,r}(x) denotes a dilated convolution, k denotes the convolution kernel size, r denotes the dilation rate, and x is the first feature map.
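The three-branch sum above can be sketched numerically. The following is a minimal single-channel, pure-Python sketch, not the patent's implementation: the 8×8 all-ones input, the 3×3 averaging kernel, and zero "same" padding are illustrative assumptions, and a real system would use a deep-learning framework's dilated convolution.

```python
def dilated_conv2d(x, k, rate, stride=2):
    """Single-channel, zero-padded ('same') dilated convolution: a toy
    stand-in for one branch H_{k,r} with stride 2, as in the embodiment."""
    n, kk = len(x), len(k)
    pad = rate * (kk // 2)
    # zero-pad the square input
    p = [[0.0] * (n + 2 * pad) for _ in range(n + 2 * pad)]
    for i in range(n):
        for j in range(n):
            p[i + pad][j + pad] = x[i][j]
    out = []
    for i in range(0, n, stride):          # stride-2 downsampling
        row = []
        for j in range(0, n, stride):
            s = 0.0
            for a in range(kk):            # taps spaced `rate` apart
                for b in range(kk):
                    s += k[a][b] * p[i + pad + (a - kk // 2) * rate][
                                     j + pad + (b - kk // 2) * rate]
            row.append(s)
        out.append(row)
    return out

x = [[1.0] * 8 for _ in range(8)]          # dummy 8x8 "first feature map"
k = [[1.0 / 9] * 3 for _ in range(3)]      # 3x3 averaging kernel
branches = [dilated_conv2d(x, k, rate) for rate in (1, 2, 3)]
# element-wise sum of the branches: F = H_{3,1}(x) + H_{3,2}(x) + H_{3,3}(x)
F = [[sum(b[i][j] for b in branches) for j in range(4)] for i in range(4)]
```

Note how the three branches share one spatial grid (8×8 halved to 4×4), so their outputs can be added element-wise, which is what makes the formula well defined.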

Apply a 1×1 convolution kernel to the second feature map for dimension reduction to obtain the dimension-reduced feature map; the number of channels of the dimension-reduced feature map is the same as the number of channels of the downsampled feature map.

Fuse the downsampled feature map with the dimension-reduced feature map to obtain the initial fused feature map.

Carry out fusion in sequence according to the above steps to obtain the fused feature map.
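Tracked at the level of tensor shapes, the shallow-to-deep chain of Step S2 can be sketched as follows. This only checks shape bookkeeping; the four stage shapes and the choice to keep the shallow map's channel count through the branch sum are assumptions, since the patent fixes neither.

```python
# Hypothetical ResNet-101 stage outputs as (channels, height, width);
# the patent names four basic feature maps but gives no sizes.
maps = [(256, 200, 200), (512, 100, 100), (1024, 50, 50), (2048, 25, 25)]

def fuse_pair(shallow, deep):
    """Shape-level fusion of one shallow/deep pair: the shallow map is
    downsampled 2x by the stride-2 branches, the deep map is 1x1-reduced
    to the shallow map's channel count, and the two are added."""
    c_s, h_s, w_s = shallow
    c_d, h_d, w_d = deep
    down = (c_s, h_s // 2, w_s // 2)   # stride-2 dilated-conv branches
    reduced = (c_s, h_d, w_d)          # 1x1 conv matches channels to c_s
    assert down[1:] == reduced[1:], "spatial sizes must align before adding"
    return down                        # element-wise sum keeps this shape

fused = maps[0]
for deep in maps[1:]:                  # fuse in order, shallow to deep
    fused = fuse_pair(fused, deep)
```

Each pairwise fusion halves the running spatial size while the deeper map's 1×1 reduction keeps channel counts compatible, so the chain terminates at the deepest stage's resolution.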

Step S3: extract candidate boxes from the fused feature map based on the region proposal network to obtain the candidate target region feature map.

As an optional implementation, step S3 of the present invention includes:

Step S31: perform convolution on the fused feature map with the first set convolution kernel to obtain the first convolutional feature map. In this embodiment, the first set convolution kernel is 3×3.

Step S32: perform convolution on the first convolutional feature map with the second set convolution kernel to obtain the second convolutional feature map. In this embodiment, the second set convolution kernel is 1×1.

Step S33: perform convolution on the second convolutional feature map with the second set convolution kernel to obtain the third convolutional feature map.

Step S34: input the second convolutional feature map and the third convolutional feature map into two parallel fully connected layers, respectively, and process them based on the set anchor boxes to obtain the candidate target region feature map.
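A shape walkthrough of steps S31 to S33, under stated assumptions: the patent gives kernel sizes (3×3, then 1×1) but not channel counts or the input size, so the 256/512 channels and the 38×50 fused map below are placeholders chosen only to make the bookkeeping concrete.

```python
def conv_shape(shape, k, out_ch, stride=1):
    """Output shape (C, H, W) of a 'same'-padded convolution with an odd
    kernel size k; padding of k // 2 preserves H and W when stride is 1."""
    c, h, w = shape
    pad = k // 2
    return (out_ch,
            (h + 2 * pad - k) // stride + 1,
            (w + 2 * pad - k) // stride + 1)

fused = (256, 38, 50)           # hypothetical fused feature map
f1 = conv_shape(fused, 3, 512)  # step S31: 3x3 conv
f2 = conv_shape(f1, 1, 256)     # step S32: 1x1 conv on f1
f3 = conv_shape(f2, 1, 256)     # step S33: 1x1 conv on f2
# f2 and f3 then feed the two parallel layers of step S34
```

With "same" padding neither kernel changes the spatial grid, so every anchor location in the fused map has a corresponding position in f2 and f3 for the parallel heads of step S34.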

Step S4: obtain the region-of-interest feature map according to the fused feature map and the candidate target region feature map.

Specifically, step S4 includes:

Step S41: fuse the fused feature map and the candidate target region feature map based on ROI Align to obtain the initial region-of-interest feature map.

Step S42: enlarge the initial region-of-interest feature map by a set multiple to obtain the enlarged region-of-interest feature map. In this embodiment, the set multiple is 1.5.
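The 1.5x enlargement of step S42 can be sketched on box coordinates. The 1.5 factor comes from this embodiment; the (x1, y1, x2, y2) box representation, the centre-anchored scaling, and the clipping to assumed image bounds are illustrative choices not specified by the patent.

```python
def enlarge_roi(box, scale=1.5, img_w=1000, img_h=600):
    """Enlarge an (x1, y1, x2, y2) ROI about its centre by `scale`,
    then clip the result to the image; img_w/img_h are assumed bounds."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2          # box centre
    hw = (x2 - x1) * scale / 2                     # enlarged half-width
    hh = (y2 - y1) * scale / 2                     # enlarged half-height
    return (max(0.0, cx - hw), max(0.0, cy - hh),
            min(float(img_w), cx + hw), min(float(img_h), cy + hh))

big = enlarge_roi((100, 100, 300, 200))            # -> (50.0, 75.0, 350.0, 225.0)
```

The enlarged box is what step S43 samples for surrounding context, which is why it is anchored on the same centre as the original ROI.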

Step S43: based on the enlarged region-of-interest feature map, perform global context extraction on the initial region-of-interest feature map in the four directions (up, down, left, and right) to obtain the context information.

Step S44: based on ROI Align, map the initial region-of-interest feature map and the context information into rectangular boxes of the same size and fuse them to obtain the region-of-interest feature map.

Step S5: obtain the classification scores and bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map.

Specifically, obtain initial classification scores and initial bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map.

Replace the set anchor boxes with the initial bounding-box regressions and repeat the subsequent steps; by setting m thresholds and repeating this process m times, the classification scores and the bounding-box regressions are obtained, where m is a positive integer greater than or equal to 1. In this embodiment, m is 3, and the three thresholds are 0.5, 0.6, and 0.7, respectively.
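The m-stage refinement of step S5 (m = 3 with thresholds 0.5, 0.6, 0.7 in this embodiment) resembles a cascade detector. The sketch below is a toy stand-in: the per-stage IoU gating follows the text's thresholds, but the "regressor" that moves the box halfway toward the ground truth is a made-up placeholder for the learned bounding-box regression.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def cascade_refine(proposal, gt, thresholds=(0.5, 0.6, 0.7)):
    """Toy cascade: each stage keeps the box only if its IoU with the
    ground truth clears that stage's rising threshold, then 'refines' it
    by moving it halfway toward the ground truth (mock regression)."""
    box = proposal
    for t in thresholds:
        if iou(box, gt) < t:
            return None                                   # rejected here
        box = tuple((p + g) / 2 for p, g in zip(box, gt))  # mock regressor
    return box
```

The rising thresholds mean each stage sees progressively better-aligned boxes, which is the usual rationale for cascaded refinement.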

The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.

Specific examples are used herein to explain the principles and implementations of the present invention. The description of the above embodiments is only intended to help understand the method and core idea of the present invention. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementation and application scope according to the idea of the present invention. In conclusion, the contents of this specification should not be construed as limiting the present invention.

Claims (10)

1. A target detection method based on a convolutional neural network, comprising:
performing feature extraction based on a residual convolutional neural network to obtain layer-by-layer basic feature maps;
fusing the basic feature maps in order from shallow to deep to obtain a fused feature map;
extracting candidate boxes from the fused feature map based on a region proposal network to obtain a candidate target region feature map;
obtaining a region-of-interest feature map according to the fused feature map and the candidate target region feature map; and
obtaining classification scores and bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map.

2. The method according to claim 1, wherein the basic feature maps comprise a first feature map, a second feature map, a third feature map, and a fourth feature map.

3. The method according to claim 2, wherein fusing the basic feature maps in order from shallow to deep to obtain the fused feature map comprises:
performing downsampling on the first feature map to obtain a downsampled feature map;
performing convolutional dimension reduction on the second feature map to obtain a dimension-reduced feature map, wherein the number of channels of the dimension-reduced feature map is the same as the number of channels of the downsampled feature map; and
fusing the downsampled feature map with the dimension-reduced feature map to obtain an initial fused feature map, the fused feature map being finally obtained in the same way.

4. The method according to claim 3, wherein performing downsampling on the first feature map to obtain the downsampled feature map comprises:
downsampling the first feature map separately through n branches of dilated convolution, wherein n is a positive integer greater than 1; and
fusing the first feature maps downsampled by the dilated convolution of each branch to obtain the downsampled feature map.

5. The method according to claim 4, wherein n is 3, and the dilation rates of the three branches are 1, 2, and 3, respectively.

6. The method according to claim 1, wherein extracting candidate boxes from the fused feature map based on the region proposal network to obtain the candidate target region feature map comprises:
performing convolution on the fused feature map with a first set convolution kernel to obtain a first convolutional feature map;
performing convolution on the first convolutional feature map with a second set convolution kernel to obtain a second convolutional feature map;
performing convolution on the second convolutional feature map with the second set convolution kernel to obtain a third convolutional feature map; and
inputting the second convolutional feature map and the third convolutional feature map into two parallel fully connected layers, respectively, and processing them based on set anchor boxes to obtain the candidate target region feature map.

7. The method according to claim 6, wherein obtaining the classification scores and bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map comprises:
obtaining initial classification scores and initial bounding-box regressions based on fully convolutional layers according to the region-of-interest feature map; and
replacing the set anchor boxes with the initial bounding-box regressions and repeating the subsequent steps, wherein by setting m thresholds and repeating this process m times, the classification scores and the bounding-box regressions are obtained, m being a positive integer greater than or equal to 1.

8. The method according to claim 6, wherein the first set convolution kernel is 3×3 and the second set convolution kernel is 1×1.

9. The method according to claim 1, wherein obtaining the region-of-interest feature map according to the fused feature map and the candidate target region feature map comprises:
fusing the fused feature map and the candidate target region feature map based on ROI Align to obtain an initial region-of-interest feature map;
enlarging the initial region-of-interest feature map by a set multiple to obtain an enlarged region-of-interest feature map;
performing global context extraction on the initial region-of-interest feature map based on the enlarged region-of-interest feature map to obtain context information; and
fusing the initial region-of-interest feature map with the context information based on ROI Align to obtain the region-of-interest feature map.

10. The method according to claim 1, wherein the residual convolutional neural network is a ResNet-101 network.
CN202010816397.7A 2020-08-14 2020-08-14 Target detection method based on convolutional neural network Active CN111950551B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010816397.7A CN111950551B (en) 2020-08-14 2020-08-14 Target detection method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010816397.7A CN111950551B (en) 2020-08-14 2020-08-14 Target detection method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN111950551A (en) 2020-11-17
CN111950551B (en) 2024-03-08

Family

ID=73342163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010816397.7A Active CN111950551B (en) 2020-08-14 2020-08-14 Target detection method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111950551B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180068198A1 (en) * 2016-09-06 2018-03-08 Carnegie Mellon University Methods and Software for Detecting Objects in an Image Using Contextual Multiscale Fast Region-Based Convolutional Neural Network
CN109165644A (en) * 2018-07-13 2019-01-08 北京市商汤科技开发有限公司 Object detection method and device, electronic equipment, storage medium, program product
CN110348384A (en) * 2019-07-12 2019-10-18 沈阳理工大学 A kind of Small object vehicle attribute recognition methods based on Fusion Features
CN111461145A (en) * 2020-03-31 2020-07-28 中国科学院计算技术研究所 Method for detecting target based on convolutional neural network
CN111507998A (en) * 2020-04-20 2020-08-07 南京航空航天大学 Depth cascade-based multi-scale excitation mechanism tunnel surface defect segmentation method


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHANG TANG et al.: "DeFusionNET: Defocus Blur Detection via Recurrently Fusing and Refining Multi-Scale Deep Features", PROCEEDINGS OF THE IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 31 December 2019 (2019-12-31), pages 2700 - 2709 *
吕俊奇; 邱卫根; 张立臣; 李雪武: "Pedestrian Detection with Multi-Layer Convolutional Feature Fusion", Computer Engineering and Design, no. 11 *
裴伟; 许晏铭; 朱永英; 王鹏乾; 鲁明羽; 李飞: "An Improved SSD Method for Aerial Target Detection", Journal of Software, no. 003, 31 December 2019 (2019-12-31), pages 738 - 758 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112419292A (en) * 2020-11-30 2021-02-26 深圳云天励飞技术股份有限公司 Pathological image processing method and device, electronic equipment and storage medium
CN112419292B (en) * 2020-11-30 2024-03-26 深圳云天励飞技术股份有限公司 Pathological image processing method and device, electronic equipment and storage medium
CN114782676A (en) * 2022-04-02 2022-07-22 北京广播电视台 Method and system for extracting region of interest of video
CN114782676B (en) * 2022-04-02 2023-01-06 北京广播电视台 Method and system for extracting region of interest of video

Also Published As

Publication number Publication date
CN111950551B (en) 2024-03-08

Similar Documents

Publication Publication Date Title
CN112884064B (en) A target detection and recognition method based on a neural network
CN107169421A (en) A driving-scene object detection method based on deep convolutional neural networks
CN114639042A (en) Video target detection algorithm based on improved CenterNet backbone network
CN111402292B (en) Image sequence optical flow calculation method based on characteristic deformation error occlusion detection
CN115147745A (en) Small target detection method based on urban unmanned aerial vehicle image
CN112766123B (en) A crowd counting method and system based on vertical and horizontal cross attention network
CN108960115A (en) A multi-directional text detection method based on corner points
CN116524189A (en) High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization
CN114519819A (en) Remote sensing image target detection method based on global context awareness
CN112016489A (en) Pedestrian re-identification method capable of retaining global information and enhancing local features
CN111612825B (en) Motion occlusion detection method for image sequences based on optical flow and multi-scale context
CN109426773A (en) A road recognition method and device
Nie et al. MIGN: Multiscale image generation network for remote sensing image semantic segmentation
CN111242026A (en) Remote sensing image target detection method based on spatial hierarchy perception module and metric learning
CN110443142A (en) A deep-learning vehicle counting method based on road surface extraction and segmentation
CN111291760A (en) Semantic segmentation method and device for images and electronic equipment
CN109993772B (en) Example level feature aggregation method based on space-time sampling
CN111950551A (en) A target detection method based on convolutional neural network
CN111753714A (en) A multi-directional natural scene text detection method based on character segmentation
CN110751076A (en) Vehicle detection method
CN110544268A (en) A multi-target tracking method based on structured light and SiamMask network
CN117036412A (en) Twin network infrared pedestrian target tracking method integrating deformable convolution
CN109859222A (en) Edge extraction method and system based on a cascade neural network
CN110516640B (en) Vehicle re-identification method based on feature pyramid joint representation
CN114926682A (en) Local outlier factor-based industrial image anomaly detection and positioning method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant