CN117456398A - Image recognition methods, devices, electronic equipment and storage media for high-risk operations in gas stations

Info

Publication number: CN117456398A
Application number: CN202210854564.6A
Authority: CN
Inventors: 崔靖文; 董平军; 穆波; 程思嘉; 张国之
Original assignee: China Petroleum and Chemical Corp; Sinopec Safety Engineering Research Institute Co Ltd
Current assignee: China Petroleum and Chemical Corp; Sinopec Safety Engineering Research Institute Co Ltd
Priority date: 2022-07-15
Filing date: 2022-07-15
Publication date: 2024-01-26

Abstract

The invention discloses an image recognition method and device for high-risk operations at a gas station. The method comprises the following steps: A. collecting original video of the gas station area and, based on a lightweight convolutional neural network, preprocessing the video frames that contain key monitoring targets; B. extracting features from the preprocessed video frames by personnel, vehicle and hazard source, and performing classification and recognition training; C. using the models obtained from the classification and recognition training, fusing personnel features with hazard source features and personnel features with vehicle features, and raising violation alarms through personnel trajectory intention analysis and personnel identity analysis. Through image preprocessing, feature extraction and feature fusion, combined with a lightweight recognition model, the invention can improve the accuracy of identifying high-risk operations, give timely early warning, and effectively reduce the computational load at the edge.

Description

Image recognition methods, devices, electronic equipment and storage media for high-risk operations in gas stations

Technical field

The invention relates to the technical field of petrochemical safety engineering, and in particular to an image recognition method, device, electronic equipment and storage medium for high-risk operations at a gas station.

Background

As society's demand for oil keeps growing, accidents at gas stations are becoming more frequent, and gas station safety management is attracting increasing public attention. There are now nearly 100,000 gas stations nationwide, which has made travel far more convenient. The risk sources at a gas station are as follows: 1. Vehicle injury: vehicles driving too fast in the refueling lane, or congestion in the lane, can easily injure people; 2. Fire and explosion: because gasoline and diesel are volatile, they catch fire easily on contact with an ignition source. Fires mainly arise in three ways. First, vehicles carrying flammable or explosive items can start a fire at the station. Second, oil vapor leaking at the fuel dispensers or in the unloading area accumulates and can readily burn or explode when it meets an ignition source; ignition sources mainly come from dangerous personnel behavior such as smoking, making or answering phone calls, and allowing insufficient oil settling time. Third, during inspection and maintenance work in the tank area or on the canopy, violations such as not wearing a safety harness, entering a tank without a work permit, and not wearing a protective mask are common. When an accident does occur, absent on-site supervisors and missing emergency rescue equipment allow it to escalate, causing huge economic losses to the gas station. Supervising high-risk personnel behavior within gas stations has therefore become particularly important.

In summary, gas stations need cameras in each area for risk prevention and control. Prevention and control that relies on manually retrieving surveillance video has a serious shortcoming in timeliness: safety warnings cannot be delivered to managers promptly and completely. With the continuing development of artificial intelligence, machine learning algorithms have been widely applied to image recognition. Common machine learning classification algorithms include logistic regression and support vector machines; their advantages are fast computation and small models, and their disadvantage is limited classification accuracy. Deep learning algorithms exploit the strong generalization ability of neural networks to achieve good recognition results, but the networks have many parameters and take a long time to train.

Chinese patent application CN109271938A discloses a safety monitoring method for the oil unloading process at a gas station based on intelligent video analysis. That scheme trains a deep neural network to build a model that recognizes the key feature items related to the safe operating procedures in the oil unloading scene; all key feature items can share one neural network model. The scheme generates alarm information automatically, which avoids the uncertainty of the currently common manual remote video monitoring and reduces the risk of accidents caused by the negligence of safety managers.

In addition, because monitoring points at a gas station are widely distributed, many video analysis algorithms are needed, and some events can only be judged by combining several algorithms, for example unloading port state detection, smoke and fire detection, and detection of absent supervisors. This easily exhausts the computing resources of the front end (that is, the edge). Furthermore, real scene data from a gas station is highly redundant: a large share of video frames contain no target, so analyzing the raw data directly may waste computing resources. Finally, intelligent early warning algorithms for high-risk abnormal events demand both fast recognition and high recognition accuracy, which common machine learning or deep learning algorithms alone cannot satisfy well.

Therefore, there is an urgent need for an edge-side image recognition method and device for high-risk gas station operations that better addresses the computing resource problem.

The information disclosed in this Background section is intended only to improve understanding of the general background of the invention, and should not be taken as an acknowledgment or any form of suggestion that this information constitutes prior art already known to a person of ordinary skill in the art.

Summary of the invention

The purpose of the present invention is to provide an edge-analysis-based image recognition method and device for high-risk operations at a gas station. Through image preprocessing, feature extraction and feature fusion, combined with a lightweight recognition model, it can not only improve the accuracy of identifying high-risk operations and give timely warning, but also effectively reduce the amount of computation at the edge.

To achieve the above purpose, according to a first aspect, the present invention provides an image recognition method for high-risk operations at a gas station, comprising the following steps: A. collecting original video of the gas station area and, based on a lightweight convolutional neural network, preprocessing the video frames that contain key monitoring targets; B. extracting features from the preprocessed video frames by personnel, vehicle and hazard source, and performing classification and recognition training; C. using the models obtained from the classification and recognition training, fusing personnel features with hazard source features and personnel features with vehicle features, and raising violation alarms through personnel trajectory intention analysis and personnel identity analysis.

Further, in the above technical solution, the personnel feature extraction and recognition training in step B may specifically include: detecting and tracking personnel in the video frames by identifying the movement trajectory of each personnel target; detecting personnel action features in the video frames from the positions of key skeleton points of the human body; and, by recognizing local attribute features of personnel, judging whether their protective measures are in place and determining their identity.

Further, in the above technical solution, the vehicle feature extraction and recognition training in step B may specifically include: identifying vehicles in the refueling area and the oil unloading area and labeling the oil unloading vehicles; and training a model with the labeled video frames of oil unloading vehicles.

Further, in the above technical solution, the hazard source feature extraction and recognition training in step B may specifically include: identifying and labeling potentially risky object targets such as tank openings in the tank area, electric sparks from work operations, and cranes; and training a model with the labeled video frames of these object targets.

Further, in the above technical solution, fusing personnel features with hazard source features in step C may specifically include: marking the area around the detected set of human body pixels to form the movement trajectory of the personnel target; and judging, from the direction of the trajectory and the distance to the hazard source, whether the personnel target is approaching the hazard source.

Further, in the above technical solution, fusing personnel features with vehicle features in step C may specifically include: once a vehicle is detected to have arrived, continuously monitoring the personnel targets around the vehicle and classifying their identity; and, within a preset time threshold, judging whether a staff member approaches the vehicle and, from the staff member's action feature detection results, whether the vehicle is guided in accordance with regulations.
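The personnel and vehicle fusion rule described here can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the observation tuple layout, the `"staff"` label, and the parameters `time_limit` and `near_dist` are all assumptions introduced for the example.

```python
from typing import Iterable, Tuple

# Hypothetical per-frame observation of one person near the vehicle:
# (timestamp in seconds, identity label, distance to vehicle, guiding gesture detected)
Observation = Tuple[float, str, float, bool]


def guidance_violation(
    arrival_t: float,
    observations: Iterable[Observation],
    time_limit: float,
    near_dist: float,
) -> bool:
    """Return True if no staff member both came within near_dist of the
    vehicle and showed a guiding gesture within time_limit seconds of arrival."""
    for t, identity, dist, guiding in observations:
        in_window = arrival_t <= t <= arrival_t + time_limit
        if in_window and identity == "staff" and dist <= near_dist and guiding:
            return False  # compliant guidance was observed in time
    return True  # nothing compliant seen inside the window
```

For example, a staff member who reaches the vehicle and signals at 25 s satisfies a 30 s window but not a 20 s one.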

Further, in the above technical solution, the personnel identity analysis in step C may specifically include: judging from the clothing features among a person's local attributes whether the person is a staff member, a supervisor or a non-staff member.

Further, in the above technical solution, the data preprocessing in step A may include key frame extraction and image compression.

Further, in the above technical solution, key frame extraction may specifically include: taking the first frame of the original video as the start frame, extracting the local features of the start frame, and setting a corresponding threshold according to those features; extracting the local feature points of subsequent video frames in turn and comparing them with the feature points of the start frame, and setting a frame as the end frame when its number of non-similar feature points exceeds the threshold; taking, from the frames between the start frame and the end frame, the frame with the most points similar to the start and end frames as the candidate key frame of that video segment; and feeding the candidate key frames in turn into the lightweight convolutional neural network, detecting and retaining those that contain key monitoring targets.

According to a second aspect, the present invention provides an image recognition device for high-risk operations at a gas station, comprising: a collection and preprocessing module, used to collect original video of the gas station area and, based on a lightweight convolutional neural network, preprocess the video frames that contain key monitoring targets; a feature extraction and training module, used to extract features from the preprocessed video frames by personnel, vehicle and hazard source and to perform classification and recognition training; and a feature fusion and alarm module, used to fuse personnel features with hazard source features and personnel features with vehicle features according to the models obtained from the classification and recognition training, and to raise violation alarms through personnel trajectory intention analysis and personnel identity analysis.

According to a third aspect, the present invention provides an electronic device for image recognition of high-risk operations at a gas station, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the aforementioned image recognition method for high-risk operations at a gas station.

According to a fourth aspect, the present invention provides a non-transitory computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to cause a computer to perform the aforementioned image recognition method for high-risk operations at a gas station.

Compared with the prior art, the present invention has one or more of the following beneficial effects:

1) Through feature extraction and feature fusion, the image recognition method of the present invention can not only classify and separately recognize personnel targets, vehicle targets and hazard source targets, but can also raise violation alarms more accurately by combining and fusing target features;

2) In the preprocessing after video collection, the present invention selects video key frames by judging the similarity between preceding and following frames, and further refines the selection with a target detection algorithm based on a convolutional neural network, which effectively removes redundancy and improves computational efficiency;

3) The present invention compresses video frames that consume large data resources, reducing image quality and image width and height and thereby effectively reducing data size;

4) The present invention can embed the lightweight recognition model in an edge analysis hardware chip and extract local features at key point locations, enabling fast de-duplication and key frame selection and effectively improving the speed and accuracy of the algorithm;

5) By normalizing image scale in the model training stage, the present invention can effectively reduce the computational complexity of processing high-resolution images.

The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other purposes, technical features and advantages are easier to understand, one or more preferred embodiments are given below and described in detail with reference to the accompanying drawings.

Description of the drawings

Figure 1 is a schematic flow chart of the image recognition method for high-risk operations at a gas station according to Embodiment 1 of the present invention.

Figure 2 is a schematic structural diagram of the image recognition device for high-risk operations at a gas station according to Embodiment 2 of the present invention.

Figure 3 is a schematic structural diagram of the electronic device for image recognition of high-risk operations at a gas station according to Embodiment 4 of the present invention.

Detailed description

Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings, but it should be understood that the scope of protection of the present invention is not limited by the specific embodiments.

Unless expressly stated otherwise, throughout the specification and claims the term "comprise" or variations such as "comprises" or "comprising" will be understood to include the stated elements or components without excluding other elements or components.

In this document, for convenience of description, spatially relative terms such as "below", "beneath", "lower", "above", "over" and "upper" may be used to describe the relationship of one element or feature to another element or feature in the drawings. It should be understood that spatially relative terms are intended to cover orientations of the item in use or operation other than the orientation depicted in the figures. For example, if the item in the figures is turned over, elements described as "below" or "beneath" other elements or features would then be oriented "above" those elements or features. The exemplary term "below" can therefore cover both the below and the above orientation. The item may also be oriented otherwise (rotated 90 degrees or at other orientations), and the spatially relative terms used herein should be interpreted accordingly.

In this document, the terms "first", "second" and so on are used to distinguish two different elements or parts, not to define a specific position or relative relationship. In other words, in some embodiments the terms "first", "second" and so on may be interchanged.

The method, system, electronic device and storage medium of the present invention are described in more detail below by way of specific embodiments. It should be understood that the embodiments are only exemplary and the present invention is not limited to them.

By performing edge analysis at the gas station, the present invention can effectively recognize images of high-risk operations. Through video collection and image preprocessing, image feature extraction and edge-side model training, and feature fusion and violation alarms, it can not only classify and separately recognize personnel targets, vehicle targets and hazard source targets, but also raise violation alarms by combining and fusing target features. At the same time it effectively reduces the computation required by the models, which creates favorable conditions for deploying image recognition at the edge, as the special application scenario of a gas station requires.

Embodiment 1

As shown in Figure 1, this embodiment provides an image recognition method for high-risk operations at a gas station, comprising the following steps:

Step S101: collect original video of the gas station area and, based on a lightweight convolutional neural network, preprocess the video frames that contain key monitoring targets. The details of this step are as follows:

In this step, the original video is obtained mainly by connecting to video surveillance cameras or a video surveillance platform to acquire video image data. Video access supports the ONVIF and RTSP protocols.

Further, preferably but without limitation, the data preprocessing in this step includes key frame extraction and image compression. Key frame extraction specifically includes:

1) First, set the first frame of the video as the start frame (for example, the video frame at time t1). Extract the local features of the start frame (for example, f(t1)). These features are scale-invariant: they are based mainly on points of interest on the target and are insensitive to image scale, image rotation, illumination changes, noise and slight changes of viewing angle. They are therefore used to extract local characteristics of the video frame such as position, scale and rotation invariants, and a corresponding threshold (for example, Sthre) is set;

2) Next, extract the local feature points of subsequent video frames in turn and compare them with those of the start frame; when the number of non-similar feature points exceeds the set threshold, set that frame as the end frame. For example, with a frame sampling interval Δt, extract the key point local features of the subsequent frame at tn = t1 + (n-1)Δt and compare them with the key points of the start frame to obtain the feature similarity Sn = f(tn)/f(t1); when the similarity falls below the threshold, set that frame as the end frame;

3) Then, among the frames between the start frame and the end frame, find the frame that shares the most similar key points with the start frame and the end frame, and take it as the candidate key frame of that video segment;

4) Finally, feed the candidate key frames in turn into the lightweight convolutional neural network to detect whether they contain key targets such as personnel, vehicles and hazard sources. A frame that does is retained as a key frame and passed to the next step; a frame that does not is discarded. Specifically, the lightweight convolutional neural network can detect key targets such as vehicles, personnel, oil unloading ports, fire extinguishers, electrostatic clamps and tank openings in the key frames. The initially screened key frames are thus screened again: frames in which no key target is detected are ignored and only frames containing key targets are kept, which refines the selection of video key frames.
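The four steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions: each frame's local features are modeled as a set of integer descriptor IDs (a stand-in for real scale-invariant descriptors), similarity is taken as the fraction of the start frame's feature points that reappear in the later frame, and `contains_target` is a hypothetical callback standing in for the lightweight CNN detector.

```python
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical stand-in for a frame's local features: a set of descriptor IDs.
Features = FrozenSet[int]


def similarity(start: Features, other: Features) -> float:
    """Fraction of the start frame's feature points also found in `other`."""
    return len(start & other) / len(start) if start else 0.0


def find_segment_end(frames: Sequence[Features], start: int, s_thre: float) -> int:
    """First frame after `start` whose similarity to it drops below s_thre;
    the last frame if the video ends first."""
    for i in range(start + 1, len(frames)):
        if similarity(frames[start], frames[i]) < s_thre:
            return i
    return len(frames) - 1


def candidate_key_frame(frames: Sequence[Features], start: int, end: int) -> int:
    """Frame between start and end sharing the most feature points with both;
    falls back to the start frame when nothing lies in between."""
    candidates = list(range(start + 1, end)) or [start]
    return max(
        candidates,
        key=lambda i: len(frames[i] & frames[start]) + len(frames[i] & frames[end]),
    )


def extract_key_frames(
    frames: Sequence[Features],
    s_thre: float,
    contains_target: Callable[[int], bool],
) -> List[int]:
    """Indices of key frames: one candidate per segment, kept only if the
    detector callback reports a key monitoring target in it."""
    keys: List[int] = []
    start = 0
    while start < len(frames) - 1:
        end = find_segment_end(frames, start, s_thre)
        k = candidate_key_frame(frames, start, end)
        if contains_target(k):
            keys.append(k)
        start = end  # the end frame becomes the next segment's start frame
    return keys
```

Reusing the end frame as the next start frame is one possible reading of how segments chain together; the text does not state this explicitly.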

Further, the image compression in preprocessing compresses the video frame images that consume large data resources, reducing both their storage footprint and their dimensions. That is, an image generation model is used to compress the video frame images, which lowers the storage consumed by the input images; in addition, several downsampling algorithms are fused to process the images, which lowers the size of the input images and the computational complexity.
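The text does not say which downsampling algorithms are fused or how. Purely as an illustration, the sketch below blends two common ones, block averaging and decimation, with an assumed weight `alpha`; the choice of these two methods and the linear blend are assumptions, and the generative compression model is not covered.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def decimate(img: Image, k: int) -> Image:
    """Keep every k-th pixel in both directions."""
    return [row[::k] for row in img[::k]]


def average_pool(img: Image, k: int) -> Image:
    """Mean over non-overlapping k x k blocks (partial edge blocks dropped)."""
    h, w = len(img) // k, len(img[0]) // k
    return [
        [
            sum(img[r * k + dr][c * k + dc] for dr in range(k) for dc in range(k))
            / (k * k)
            for c in range(w)
        ]
        for r in range(h)
    ]


def fused_downsample(img: Image, k: int, alpha: float = 0.5) -> Image:
    """Weighted blend of block averaging and decimation, cropped to the
    size of the block-averaged result."""
    pooled, picked = average_pool(img, k), decimate(img, k)
    return [
        [alpha * pooled[r][c] + (1 - alpha) * picked[r][c] for c in range(len(pooled[0]))]
        for r in range(len(pooled))
    ]
```

With `k = 2` a 4 x 4 frame becomes 2 x 2, so the pixel count passed to later stages drops by a factor of four.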

Step S102: extract features from the video frames preprocessed in step S101 by personnel, vehicle and hazard source, and perform classification and recognition training. This step is carried out as follows:

Feature extraction means extracting the feature information of the key frame images. The preprocessed video frame images are first scale-normalized, which lowers the computational complexity of high-resolution images. The scale-normalized images are then used to train several models that detect and recognize different key targets. After training, the models are embedded in an edge computing chip. The chip may use a high-performance industrial-grade processor whose plug-in modules increase the number of applications that can run in parallel, so that it can support detecting several key targets at once and the feature extraction tasks for behaviors that gas station management focuses on, such as special operations, refueling and oil unloading. Feature extraction may cover personnel features, vehicle features and hazard source features. Personnel features may in turn include personnel target detection and tracking, personnel action features, and personnel local attribute features.

Further, personnel target detection and tracking may use the YOLO-DeepSort model to identify the movement trajectory of each personnel target, which is later used to judge the intention behind the trajectory.

Further, personnel action feature detection may use the HRNet model as the baseline recognition network, extracting the positions of 17 key skeleton points of the human body for detection and recognition. Specifically, the focus is on whether a person's action involves an obvious change of posture; the angles between the torso and the legs and arms can be used to represent the specific action of the body. For example, let a joint point be O and the joint points connected to it be A and B; then the joint angle at that point can be expressed by the following formula:
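The formula itself is not reproduced in this text. A standard expression for the angle at O between the segments OA and OB, assumed here rather than taken from the source, is the arccosine of the normalized dot product:

```latex
\theta = \arccos\left(
  \frac{\overrightarrow{OA} \cdot \overrightarrow{OB}}
       {\lVert \overrightarrow{OA} \rVert \, \lVert \overrightarrow{OB} \rVert}
\right)
```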

Suppose an action is composed of n spatial angles; then the frame at time t₁ can be represented as [θ₁, θ₂, θ₃, …, θₙ]. Labeled video frames of personnel actions are used as the training set to train the action feature detection model, which supports the analysis and judgment of detailed actions in a video sequence.
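A per-frame angle vector of this kind can be computed from 2D keypoints as sketched below. The angle is taken as the arccosine of the normalized dot product, which is an assumed form since the source formula is not reproduced. The joint triples and the COCO-style keypoint names in `ANGLE_TRIPLES` are illustrative choices, not taken from the source.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Hypothetical (joint O, neighbor A, neighbor B) triples, COCO-style names.
ANGLE_TRIPLES = [
    ("left_elbow", "left_shoulder", "left_wrist"),
    ("right_elbow", "right_shoulder", "right_wrist"),
    ("left_knee", "left_hip", "left_ankle"),
    ("right_knee", "right_hip", "right_ankle"),
]


def joint_angle(o: Point, a: Point, b: Point) -> float:
    """Angle in degrees at joint o between the segments o->a and o->b."""
    ax, ay = a[0] - o[0], a[1] - o[1]
    bx, by = b[0] - o[0], b[1] - o[1]
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    if na == 0 or nb == 0:
        return 0.0  # degenerate case: a neighbor coincides with the joint
    cos_t = (ax * bx + ay * by) / (na * nb)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding drift
    return math.degrees(math.acos(cos_t))


def pose_vector(keypoints: Dict[str, Point]) -> List[float]:
    """Represent one frame as the list of joint angles [theta_1 .. theta_n]."""
    return [joint_angle(keypoints[o], keypoints[a], keypoints[b]) for o, a, b in ANGLE_TRIPLES]
```

A sequence of such vectors, one per key frame, is the kind of input an action classifier could be trained on.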

Further, recognition of personnel local attribute features builds on the personnel target detection and tracking described above. The video frames in which a personnel target is detected are selected, and the person's local attribute information is examined to check whether protective measures are in place, that is, whether a protective mask and a safety helmet are worn. Clothing features are also analyzed to determine the person's identity (staff member, supervisor or non-staff member), for example whether the person wears a staff uniform, or a supervisor's reflective vest or armband.
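Once an attribute recognizer has produced a set of detected attributes for a person, the identity and protection checks reduce to simple rules. The sketch below assumes hypothetical attribute labels (`"work_uniform"`, `"reflective_vest"`, `"armband"`, `"safety_helmet"`, `"protective_mask"`) and assumes that supervisor markings take priority over a staff uniform; neither is specified in the text.

```python
from typing import Iterable, List, Set


def classify_identity(attrs: Set[str]) -> str:
    """Map detected clothing attributes to an identity class."""
    if "reflective_vest" in attrs or "armband" in attrs:
        return "supervisor"  # assumed priority: supervisor markings win
    if "work_uniform" in attrs:
        return "staff"
    return "non-staff"


def missing_protection(
    attrs: Set[str],
    required: Iterable[str] = ("safety_helmet", "protective_mask"),
) -> List[str]:
    """List the required protective items that were not detected."""
    return [item for item in required if item not in attrs]
```

A non-empty result from `missing_protection` for a person classified as staff would be a candidate for a protective-equipment alarm.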

Further, vehicle detection and recognition identifies the vehicles operating in the refueling area and the oil unloading area, so that refueling and unloading tasks at each point of the gas station can be monitored. Taking oil unloading vehicles as an example: video image data containing oil unloading vehicles is collected; the LabelImg annotation tool is used to draw a detection box around the oil unloading vehicle in each video frame and label it as an oil unloading vehicle; the object detection network YOLOv5 is selected as the baseline network for oil unloading vehicle recognition; and the labeled vehicle video frames are used as the training set to train the vehicle recognition model.
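YOLO-style training labels store one line per box in the form `class x_center y_center width height`, with all four values normalized by the image size; LabelImg can export this format directly. As a small illustration of what such a label contains, the helper below (a hypothetical name, not part of any tool) converts a pixel bounding box to that line format.

```python
from typing import Tuple


def to_yolo_line(
    class_id: int,
    box: Tuple[float, float, float, float],
    img_w: int,
    img_h: int,
) -> str:
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

Here class 0 could stand for the oil unloading vehicle; the class numbering is an assumption of the example.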

Further, hazard source detection and classification identifies potentially risky object targets such as tank openings in the tank area, electric sparks generated during operations, and cranes. The object detection network YOLOv5 is selected as the baseline network for hazard source recognition, and the annotated video frames of the targets of interest are used as the training set to train the hazard source detection and classification model.

Step S103: based on the models obtained from classification and recognition training, feature fusion is performed separately on personnel features with hazard source features and on personnel features with vehicle features, and violation alarms are raised through personnel trajectory intention analysis and personnel identity analysis. This step is carried out as follows:

The violation alarm in this step relies on further analysis of the features extracted in step S102, namely personnel trajectory intention analysis and personnel identity classification analysis. That is, after the features of personnel (including personnel target detection and tracking, personnel action features, and personnel local attribute features), vehicles, and hazard sources have been extracted and the models trained, the different types of extracted features are further combined and fused so that subsequent violation alarms can be raised more reliably.

Further, personnel trajectory intention analysis can calibrate around the human-body pixel set detected by the model. Let the set of pixels at the midpoint between the two feet be P = {(x₁, y₁), (x₂, y₂), …, (x_p, y_p)}; this set forms the movement trajectory of the personnel target. The judgment is based on the direction of the trajectory and the distance to the hazard source: if the trajectory shows an intention to approach a hazard source target, the personnel target is determined to be approaching the hazard source in a certain direction. In other words, personnel trajectory intention analysis combines and fuses personnel features with hazard source features, and comprehensively judges the person's intention and any violation from the person's movement trajectory and direction, the person's local attribute features, and the distance to the hazard source.
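The direction-and-distance judgment could be sketched as below. This is a minimal illustration under stated assumptions, not the disclosed implementation: the hazard source is reduced to a single pixel coordinate, the direction of travel is taken from the first and last trajectory points, and the thresholds `dist_threshold` and `min_cos` are hypothetical parameters.

```python
import math

def is_approaching(track, hazard, dist_threshold, min_cos=0.5):
    """Return True if the person is near the hazard and heading toward it.

    track:  list of (x, y) foot-midpoint pixels ordered in time (the set P).
    hazard: (x, y) pixel position of the hazard source (assumed to be a point).
    """
    if len(track) < 2:
        return False  # no direction can be derived from a single point
    (x0, y0), (x1, y1) = track[0], track[-1]
    move = (x1 - x0, y1 - y0)                     # overall direction of travel
    to_hazard = (hazard[0] - x1, hazard[1] - y1)  # from current position to hazard
    step = math.hypot(*move)
    dist = math.hypot(*to_hazard)
    if step == 0:
        return False  # the person has not moved
    if dist == 0:
        return True   # the person has reached the hazard position
    # Cosine of the angle between travel direction and direction to the hazard
    cos_heading = (move[0] * to_hazard[0] + move[1] * to_hazard[1]) / (step * dist)
    return dist <= dist_threshold and cos_heading >= min_cos

# Walking along the x axis toward a hazard at (50, 0)
approaching = is_approaching([(0, 0), (10, 0), (20, 0)], (50, 0), dist_threshold=40)
```

In practice the result would be combined with the person's local attribute features before an alarm is raised, as the paragraph above describes.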

Further, personnel identity analysis can be carried out once the model has detected the local identity attribute features. A person wearing a work uniform is determined to be staff, a person wearing a work uniform and a reflective vest is determined to be a guardian, and a person not wearing a work uniform is determined to be non-staff. For example, using these identity results, once a vehicle is recognized as having arrived in the refueling area, the personnel targets around the vehicle are continuously monitored and classified by identity. If no staff member is detected approaching the vehicle within the set time threshold, a behavior alarm is raised for "vehicle arrived but staff not present". Once a staff member is recognized, the staff member's action feature detection results are used to determine whether the vehicle is being guided with the prescribed guiding posture; if no guiding behavior occurs, an alarm is raised for "staff failed to guide the vehicle as required". In other words, the analysis here combines and fuses personnel features with vehicle features, and comprehensively judges whether a violation exists from the person's movement trajectory and response time, the person's local attribute features, and the vehicle features.
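The identity rules and the vehicle-arrival check can be illustrated with the following sketch. The data layout (a list of observation dicts), the field names, and the alarm strings are assumptions made for the example. It also assumes that only persons classified as "staff" satisfy the arrival check, which the source does not state explicitly.

```python
def classify_identity(has_uniform, has_reflective_vest):
    """Map clothing attributes to an identity, following the rules above."""
    if has_uniform and has_reflective_vest:
        return "guardian"
    if has_uniform:
        return "staff"
    return "non-staff"

def vehicle_alarms(arrival_time, observations, time_threshold):
    """Return alarm messages for one arrived vehicle.

    observations: list of dicts with hypothetical keys
        't' (timestamp in seconds), 'identity', 'near_vehicle', 'guiding_pose'.
    """
    deadline = arrival_time + time_threshold
    staff_obs = [o for o in observations
                 if o["identity"] == "staff" and o["near_vehicle"]
                 and arrival_time <= o["t"] <= deadline]
    if not staff_obs:
        return ["vehicle arrived but staff not present"]
    if not any(o["guiding_pose"] for o in staff_obs):
        return ["staff failed to guide the vehicle as required"]
    return []

# A staff member arrives within 30 s but never shows the guiding posture
obs = [{"t": 12.0, "identity": classify_identity(True, False),
        "near_vehicle": True, "guiding_pose": False}]
alarms = vehicle_alarms(arrival_time=0.0, observations=obs, time_threshold=30.0)
```

Keeping the rules in plain functions like these makes it straightforward to match fused recognition results against a list of violation scenarios.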

In this step, after the recognition results of the individual models are obtained, feature fusion is performed and the fused results are matched against the violation scenario rules, which enables more accurate judgment and triggers recognition alarms for genuine violations.

The image recognition method for high-risk gas station operations provided by this embodiment uses feature extraction and feature fusion. It can not only classify and separately identify personnel targets, vehicle targets, and hazard source targets, but also raise violation alarms by combining and fusing target features. In the preprocessing stage after video capture, this embodiment selects video key frames by comparing the similarity of preceding and following frames, and further refines the key frame selection with an object detection algorithm based on a convolutional neural network, which effectively removes redundancy and improves computational efficiency. This embodiment compresses video frames that consume large data resources, lowering image quality and reducing image height and width, which effectively reduces the data size. This embodiment can embed the lightweight recognition model in an edge analysis hardware chip; extracting local features at key point positions allows fast de-duplication and key frame selection, which effectively improves the computation speed and accuracy of the algorithm. Normalizing the image scale in the model training stage effectively reduces the computational complexity of high-resolution images.

Embodiment 2

As shown in Figure 2, this embodiment provides an image recognition device for high-risk gas station operations, comprising an acquisition and preprocessing module 201, a feature extraction and training module 202, and a feature fusion and alarm module 203. The acquisition and preprocessing module 201 collects the original video of the gas station area and, based on a lightweight convolutional neural network, performs data preprocessing on video frames containing key monitoring targets. The feature extraction and training module 202 extracts features from the preprocessed video frames by personnel, vehicles, and hazard sources, and performs classification and recognition training. The feature fusion and alarm module 203, based on the models obtained from classification and recognition training, performs feature fusion separately on personnel features with hazard source features and on personnel features with vehicle features, and raises violation alarms through personnel trajectory intention analysis and personnel identity analysis.

The image recognition device for high-risk gas station operations in this embodiment corresponds to the method of Embodiment 1 and achieves the same technical effects.

Embodiment 3

Figure 3 is a schematic diagram of the hardware structure of the electronic device for image recognition of high-risk gas station operations in this embodiment. The device (for example, a terminal or a server) includes one or more processors 610 and a memory 620; one processor 610 is taken as an example. The device may further include an input device 630 and an output device 640.

The processor 610, the memory 620, the input device 630, and the output device 640 may be connected by a bus or by other means.

The memory 620, as a non-transitory computer-readable storage medium, can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules. By running the non-transitory software programs, instructions, and modules stored in the memory 620, the processor 610 executes the various functional applications and data processing of the electronic device, that is, it implements the processing method of the above method embodiment.

The memory 620 may include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required for at least one function; the data storage area may store data and the like. In addition, the memory 620 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 620 optionally includes memory located remotely from the processor 610, and such remote memory may be connected to the processing device over a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 630 may receive input numeric or character information and generate signal inputs. The output device 640 may include a display device such as a display screen.

The one or more modules are stored in the memory 620 and, when executed by the one or more processors 610, perform the following steps: A. collecting the original video of the gas station area and, based on a lightweight convolutional neural network, performing data preprocessing on video frames containing key monitoring targets; B. extracting features from the preprocessed video frames by personnel, vehicles, and hazard sources, and performing classification and recognition training; C. based on the models obtained from classification and recognition training, performing feature fusion separately on personnel features with hazard source features and on personnel features with vehicle features, and raising violation alarms through personnel trajectory intention analysis and personnel identity analysis.

The above electronic device can execute the method provided by the embodiments of the present invention, and has the functional modules and beneficial effects corresponding to executing the method. For technical details not described exhaustively in this embodiment, refer to the methods provided by the other embodiments of the present invention.

The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a general-purpose hardware platform, and of course also by hardware. Based on this understanding, the essence of the above technical solutions, or the part that contributes to the related art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.

Embodiment 4

This embodiment provides a non-transitory computer-readable storage medium storing computer-executable instructions, the instructions being used to cause a computer to execute the following image recognition method for high-risk gas station operations: A. collecting the original video of the gas station area and, based on a lightweight convolutional neural network, performing data preprocessing on video frames containing key monitoring targets; B. extracting features from the preprocessed video frames by personnel, vehicles, and hazard sources, and performing classification and recognition training; C. based on the models obtained from classification and recognition training, performing feature fusion separately on personnel features with hazard source features and on personnel features with vehicle features, and raising violation alarms through personnel trajectory intention analysis and personnel identity analysis.

The foregoing description of specific exemplary embodiments of the present invention is presented for purposes of illustration and exemplification. It is not intended to limit the invention to the precise forms disclosed, and many modifications and variations are clearly possible in light of the above teachings. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and their practical application, thereby enabling those skilled in the art to implement and use the various exemplary embodiments of the invention as well as various alternatives and modifications. Any simple modifications, equivalent changes, and refinements made to the above exemplary embodiments shall fall within the protection scope of the present invention.

Claims

1. A high-risk operation image recognition method for a gas station, characterized by comprising the following steps:

A. collecting an original video of a gas station area, and carrying out data preprocessing on a video frame containing a key monitoring target based on a lightweight convolutional neural network;

B. extracting the characteristics of the preprocessed video frames according to personnel, vehicles and dangerous sources, and performing classification recognition training;

C. performing feature fusion separately on the personnel features with the dangerous source features, and on the personnel features with the vehicle features, according to the model after classification and recognition training, and carrying out illegal behavior alarm through personnel track intention analysis and personnel identity analysis.

2. The method for recognizing high risk operation images of gas stations according to claim 1, wherein the feature extraction and recognition training of the person in step B specifically comprises:

detecting and tracking personnel in the video frame by identifying the action track of the personnel target;

detecting the action characteristics of personnel in the video frame through the positions of key skeleton points of the human body;

and judging whether the protective measures of the personnel are in place or not and judging the identity of the personnel through the local attribute feature identification of the personnel.

3. The method for recognizing high risk operation images of gas stations according to claim 1, wherein the feature extraction and recognition training of the vehicle in the step B specifically comprises:

identifying vehicles in the oiling area and the oil unloading area and marking the oil unloading vehicles; and performing model training by using the marked oil discharge vehicle video frames.

4. The method for recognizing high risk operation images of gas stations according to claim 1, wherein the feature extraction and recognition training of the hazard source in the step B specifically comprises:

identifying and marking object targets with potential risks, including tank mouths in the tank area, operation electric sparks and cranes, and performing model training by using the marked object target video frames.

5. The method for recognizing a high risk job image of a gas station according to claim 1, wherein the feature fusion of the personnel feature and the risk source feature in the step C specifically comprises:

calibrating the periphery of the detected human body pixel set of the person to form a movement track of the person target;

and judging whether the personnel target is approaching to the dangerous source according to the direction of the action track and the distance between the personnel target and the dangerous source.

6. The method for recognizing a high risk job image of a gas station according to claim 1, wherein the feature fusion of the personnel features and the vehicle features in the step C specifically comprises:

when the arrival of the vehicle is detected, continuously monitoring personnel targets around the vehicle and classifying personnel identities;

and judging whether a worker approaches the vehicle within a preset time threshold, and judging whether to carry out regular vehicle guiding according to the action characteristic detection result of the worker.

7. The method for identifying high risk operation images of gas stations according to claim 1, wherein the step C of personnel identity analysis specifically comprises:

and judging whether the personnel are working personnel, guardianship personnel or non-working personnel according to the dressing characteristics in the local personnel attributes.

8. The method for recognizing a high risk job image of a gas station according to claim 1, wherein the data preprocessing in the step A includes key frame extraction and image compression.

9. The method for identifying a high risk job image of a gas station according to claim 8, wherein the key frame extraction specifically comprises:

taking a first frame of the original video as a starting frame, extracting local characteristics of the starting frame, and setting a corresponding threshold according to the local characteristics;

extracting local characteristic points of a subsequent video frame in sequence, comparing the local characteristic points with the characteristic points of the initial frame, and setting the frame as an end frame when the number of dissimilar characteristic points is greater than the threshold value;

acquiring a frame with the most similar points with the initial frame and the end frame between the initial frame and the end frame as a candidate key frame of the original video;

and sequentially inputting the candidate key frames into the lightweight convolutional neural network, and detecting and retaining the candidate key frames containing the key monitoring targets.

10. A high risk job image recognition apparatus for a gas station, comprising:

the acquisition and preprocessing module is used for acquiring an original video of a gas station area and preprocessing data of a video frame containing a key monitoring target based on a lightweight convolutional neural network;

the feature extraction and training module is used for extracting the features of the preprocessed video frames according to personnel, vehicles and dangerous sources and performing classification recognition training;

and the feature fusion and alarm module is used for performing feature fusion separately on the personnel features with the dangerous source features, and on the personnel features with the vehicle features, according to the model after classification and recognition training, and carrying out illegal behavior alarm through personnel track intention analysis and personnel identity analysis.

11. A high risk job image recognition electronic device of a gas station, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the gas station high risk job image recognition method of any one of claims 1 to 9.

12. A non-transitory computer-readable storage medium storing computer-executable instructions for causing the computer to perform the gas station high risk job image recognition method according to any one of claims 1 to 9.