
CN119399795A - A finger key point detection method for factory operation safety system - Google Patents

A finger key point detection method for factory operation safety system

Info

Publication number
CN119399795A
CN119399795A (application CN202411604718.1A)
Authority
CN
China
Prior art keywords
detection
key point
target
finger
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411604718.1A
Other languages
Chinese (zh)
Inventor
徐辰楠
王战
何星慰
贾晓燕
俞荣栋
孟瑜炜
骆洲
李泽易
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Zheneng Digital Technology Co ltd
Original Assignee
Zhejiang Zheneng Digital Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Zheneng Digital Technology Co ltd filed Critical Zhejiang Zheneng Digital Technology Co ltd
Priority to CN202411604718.1A priority Critical patent/CN119399795A/en
Publication of CN119399795A publication Critical patent/CN119399795A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/0985Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computational Linguistics (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a finger key point detection method for a factory operation safety system. The method comprises acquiring input data and performing target detection and key point detection on the input data through a target detection network. The target detection network comprises a network backbone structure, a neck structure and a detection structure; the detection structure comprises a target detection head and a key point detection head, and the key point detection head comprises a first convolution layer, a convolution and self-attention fusion module ACmix, and a second convolution layer connected in sequence. The beneficial effect of the method is that optimizing the target detection network with the convolution and self-attention fusion module ACmix retains the advantages of the convolutional neural network in capturing local image features while significantly enhancing the model's perception of global image semantics.

Description

Finger key point detection method for factory operation safety system
Technical Field
The invention relates to the technical field of image processing, in particular to a finger key point detection method for a factory operation safety system.
Background
In the field of computer vision, YOLO (You Only Look Once) is a fast and accurate target detection algorithm that has evolved through multiple versions, each improving and optimizing on the previous one. YOLO excels in both the accuracy and the real-time performance of target detection, and its relatively simple model structure and smaller number of parameters make it easy to deploy on a variety of hardware platforms. These advantages have led to the widespread use of YOLO in various downstream computer vision tasks, mainly including target classification, target detection, image segmentation and keypoint detection. Finger key point detection, however, is an important module in certain specific industrial scenarios, such as intelligent systems designed to ensure property and personnel safety. To guarantee high accuracy, such systems require further improvement of the robustness of the YOLO key point detection model.
The original YOLO key point detection model has a certain level of performance, but it still has limitations and room for improvement. The main reasons for its limited performance are as follows:
1. Model structure limitations: YOLO models were originally designed primarily for target detection, and their key point detection capability is relatively weak, especially when complex spatial relationships exist between key points. YOLO keypoint detection predicts the relative positions of keypoints within a detected target frame, which may not be sufficient to capture the fine spatial layout and dynamic changes between fingers.
2. Dataset limitations: finger keypoint detection requires high-quality, diversified annotation data to train the model. If the training data is insufficient or the labeling is inaccurate, the model struggles to learn effective feature representations, causing detection performance to degrade.
3. Target posture changes: in practical applications, the position and posture of the finger may change significantly. YOLO models may perform poorly when dealing with such complex geometric changes, so the robustness of the model needs to be increased through a stronger feature extraction network or an attention mechanism.
Disclosure of Invention
The invention aims at overcoming the defects of the prior art, and provides a finger key point detection method for a factory operation safety system.
In a first aspect, a finger keypoint detection method for a factory operation safety system is provided, comprising:
step 1, acquiring input data;
step 2, performing target detection and key point detection on the input data through a target detection network;
The target detection network comprises a network backbone structure, a neck structure and a detection structure, wherein the detection structure comprises a target detection head and a key point detection head, and the key point detection head comprises a first convolution layer, a convolution and self-attention fusion module ACmix and a second convolution layer which are sequentially connected.
Preferably, in step 2, the operation of the convolution and self-attention fusion module ACmix includes:
Acquiring an input feature map, and performing linear mapping on the input feature map using convolution kernels to generate weight matrices;
extracting local features of the input feature map by convolution operation according to the weight matrix, and obtaining a first feature map;
Capturing long-distance dependency relations among different positions in the feature map by using a self-attention mechanism, and acquiring a second feature map according to the long-distance dependency relations;
and carrying out weighted summation on the first feature map and the second feature map through the learnable hyperparameters alpha and beta to obtain an output feature map, wherein the dimensions of the output feature map are consistent with those of the input feature map.
Preferably, in step 2, the target detection network performs feature enhancement processing in a training stage, where the feature enhancement processing includes:
Training the RT-DETR model by utilizing a palm target detection data set to obtain deviations dx and dy of a target frame predicted by the RT-DETR and a real target frame in the directions of an x axis and a y axis of the image;
modeling dx and dy by adopting a Gaussian mixture model GMM, and sampling to obtain a supplementary characteristic value;
Adding the supplementary feature values into an original training set to obtain an extended training set, and training a target detection network according to the extended training set.
Preferably, in the step 1, the method further comprises the step of carrying out data enhancement processing on the input data, wherein the data enhancement processing comprises the step of mixing the actual application scene picture with the finger key point data set.
In a second aspect, a finger keypoint detection device for a factory operation safety system is provided, for performing any one of the finger keypoint detection methods of the first aspect, comprising:
the acquisition module is used for acquiring input data;
The detection module is used for carrying out target detection and key point detection on the input data through a target detection network;
The target detection network comprises a network backbone structure, a neck structure and a detection structure, wherein the detection structure comprises a target detection head and a key point detection head, and the key point detection head comprises a first convolution layer, a convolution and self-attention fusion module ACmix and a second convolution layer which are sequentially connected.
In a third aspect, a vision-based factory error prevention system is provided, comprising an image input unit, a detection unit, a calculation unit and a judgment unit;
The image input unit is used for acquiring an operation image;
The detection unit is used for executing the finger key point detection method according to any one of the first aspect, and detecting the positions of the target and the finger key points in the operation image;
The calculating unit calculates the distance between the target and the finger key point according to the positions of the target and the finger key point in the operation image, and determines the position and the state of the target closest to the finger key point;
And the judging unit judges whether the executed operation is accurate or not according to the target position closest to the finger key point and the state thereof, and alarms if the executed operation is not accurate.
In a fourth aspect, there is provided a computer storage medium having a computer program stored therein, which when run on a computer causes the computer to perform the method of any of the first aspects.
In a fifth aspect, there is provided an electronic device comprising:
A memory for storing a computer program;
a processor for executing the computer program to implement the method according to any of the first aspects.
The beneficial effects of the invention are as follows:
1. according to the invention, the target detection network is optimized through the convolution and self-attention fusion module ACmix, so that the advantages of the convolution neural network in the aspect of capturing local features of the image are maintained, and the perception capability of the model on the global image semantics is obviously enhanced. In the present invention, the convolution and self-attention fusion module ACmix can effectively characterize the precise location information of the finger keypoints and understand the interrelationship between them.
2. The invention carries out characteristic enhancement processing on the target detection network in a training stage, uses the RT-DETR target detector to carry out palm targeted training, and obtains the deviation of a prediction frame and a real frame of the RT-DETR. In the key point detection model training stage, according to deviation distribution in the directions of the x axis and the y axis, the GMM is used for generating additional offset of a real target frame through dense sampling, so that the target detection performance is improved, the condition of missing detection or false detection is reduced, and the influence on the subsequent key point detection is avoided.
3. The invention creates the finger key point data set in the industrial scene, combines the open-source MHP data set to carry out data enhancement, and improves the generalization capability of the target detection network.
Drawings
FIG. 1 is a schematic diagram of an overall framework of a key point detection model according to an embodiment of the present invention;
fig. 2 is a schematic diagram of specific implementation details of ACmix modules provided in an embodiment of the present invention;
FIG. 3 is a flow chart of a factory error prevention system provided by an embodiment of the invention.
Detailed Description
The invention is further described below with reference to examples. The following examples are presented only to aid in the understanding of the invention. It should be noted that it will be apparent to those skilled in the art that modifications can be made to the present invention without departing from the principles of the invention, and such modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.
Example 1:
In order to solve the problems in the prior art, as shown in fig. 1, embodiment 1 of the present application provides a finger key point detection method for a factory operation security system, including:
and step 1, acquiring input data.
Step 2, performing target detection and key point detection on the input data through a target detection network;
The target detection network comprises a network backbone structure, a neck structure and a detection structure, wherein the detection structure comprises a target detection head and a key point detection head, and the key point detection head comprises a first convolution layer, a convolution and self-attention fusion module ACmix and a second convolution layer which are sequentially connected.
Specifically, the object detection network of the application uses the open-source YOLOv backbone network as the network backbone structure. The neck structure adopts a PAN (Path Aggregation Network) feature fusion module, an effective multi-scale feature fusion technique that transfers information between features of different levels, enhancing the model's ability to detect objects of different sizes; it is widely used as a general neck module in the YOLO series. The target detection head uses the open-source YOLOv detection head, which contains convolution layers for regressing the target frame position.
The key point detection head in the original YOLO model has a simple structure, comprising only three convolution layers. To improve the accuracy of key point detection, the present application replaces the intermediate convolution layer with the ACmix module, as shown in fig. 1. The ACmix module skillfully combines a convolution mechanism and a multi-head self-attention mechanism; this design gives the model the ability to efficiently encode the spatial position and interrelationship of each finger joint. In this way, the model can accurately capture fine changes of the finger during complex motions, achieving deep capture and fusion of local detail and the global visual representation in the feature map. This fusion mechanism greatly enhances the accuracy and robustness of the model's finger key point detection.
Specifically, as shown in fig. 2, the operation of the convolution and self-attention fusion module ACmix is designed into multiple stages to fully exploit the advantages of the convolution and self-attention mechanism. The following is a detailed description of the stages:
In stage A, an input feature map is acquired. Three convolution kernels first linearly map the input feature map to generate the query, key and value weight matrices respectively. These weight matrices provide the necessary input for the subsequent self-attention mechanism, so that the model can focus on key regions in the feature map.
In the subsequent stages B and C, the module processes the feature map further by applying the convolution and self-attention mechanisms in parallel. Stage B focuses on extracting local features via convolution operations, strengthening the model's ability to capture detailed information. Stage C uses a self-attention mechanism to capture long-distance dependencies between different positions in the feature map, so that the model can better understand the spatial relationships between finger joints. Combining the two stages enables the model to capture global context information while maintaining local detail.
Finally, by introducing the learnable hyperparameters alpha and beta, the feature maps of stages B and C are weighted and summed, effectively integrating the information from both branches. The resulting output feature map has the same dimensions as the input feature map.
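The staged fusion above can be sketched in NumPy. This is a minimal illustration of the weighted-sum idea, not the patent's implementation: the feature map is flattened to (positions, channels), the "convolution" branch is stood in for by a simple local 3-tap smoothing, and the projection matrices and kernel are hypothetical names.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def acmix_fusion(x, w_q, w_k, w_v, alpha=0.5, beta=0.5):
    """Sketch of the ACmix-style fusion described above (hypothetical shapes).

    x: (N, C) flattened feature map (N spatial positions, C channels).
    w_q, w_k, w_v: (C, C) projection matrices playing the role of the
    stage-A convolution kernels that produce query/key/value.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Stage B stand-in: local aggregation of each position with its
    # neighbours (a 3-tap average over the flattened positions).
    kernel = np.array([0.25, 0.5, 0.25])
    padded = np.pad(v, ((1, 1), (0, 0)), mode="edge")
    f_conv = (kernel[0] * padded[:-2] + kernel[1] * padded[1:-1]
              + kernel[2] * padded[2:])
    # Stage C: self-attention captures long-distance dependencies
    # between positions.
    attn = softmax(q @ k.T / np.sqrt(x.shape[1]), axis=-1)
    f_att = attn @ v
    # Learnable scalars alpha/beta weight the two branches; the output
    # keeps the input dimensions, as the description requires.
    out = alpha * f_conv + beta * f_att
    return out
```

In a trained model, alpha and beta would be optimized jointly with the network weights rather than fixed at 0.5.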
Example 2:
on the basis of embodiment 1, embodiment 2 of the present application provides a more specific finger key point detection method for a factory operation safety system, comprising:
and step 1, acquiring input data.
In step 1, the input data is subjected to data enhancement processing. The data enhancement processing comprises training the model on pictures of the actual application scene, mixed in a certain proportion with part of an open-source finger key point dataset for subsequent training and verification. Illustratively, data collected in an industrial scene is mixed with a screened portion of the open-source Multiview Handpose dataset; removing pictures of similar scenes or similar gestures enhances the detection capability of the model from the data side. The dataset for model training and verification contains 2091 pictures in total, involving 2115 hand instances.
Step 2, performing target detection and key point detection on the input data through a target detection network;
The target detection network comprises a network backbone structure, a neck structure and a detection structure, wherein the detection structure comprises a target detection head and a key point detection head, and the key point detection head comprises a first convolution layer, a convolution and self-attention fusion module ACmix and a second convolution layer which are sequentially connected.
The object detection network in this embodiment uses the open-source YOLOv backbone network as the network backbone structure. It should be noted that, for this part of the YOLO network structure, a suitable scaling scale can be flexibly selected according to the specific hardware resources, particularly the video memory of the graphics card. The scale of the network can be changed through scaling: the larger the network parameters, the more video memory is occupied, but the better the effect.
Furthermore, the YOLOv network architecture is not fixed; it is highly flexible and can easily be replaced with other YOLO series versions, including but not limited to YOLOv, YOLOv7 and YOLOv. Other backbone networks can also be substituted according to specific requirements; verification shows that applying the invention to three backbone networks, EfficientNet, ResNet and HRNet, also improves model performance.
It should be noted that the YOLO-based keypoint detection method follows a top-down strategy: it first detects the target and determines the target frame, and then accurately regresses the finger keypoints according to the position of the target frame. The accuracy of target detection is therefore critical, as it directly affects the final quality of finger keypoint detection. To enhance detection performance, the application performs feature enhancement processing on the target detection network in the training stage.
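The top-down strategy just described can be sketched as follows. The two callables are hypothetical stand-ins for the trained detection and keypoint heads; the point is the control flow of detecting boxes first and then mapping box-relative keypoint predictions back to image coordinates.

```python
import numpy as np

def topdown_keypoints(image, detect_boxes, regress_keypoints):
    """Top-down keypoint detection: detect target frames first, then
    regress finger keypoints inside each frame.

    detect_boxes(image) -> list of (x1, y1, x2, y2) boxes.
    regress_keypoints(image, box) -> (K, 2) keypoints relative to the
    box, with coordinates in [0, 1].  Both are hypothetical interfaces.
    """
    results = []
    for (x1, y1, x2, y2) in detect_boxes(image):
        rel = regress_keypoints(image, (x1, y1, x2, y2))
        # Map box-relative predictions back to absolute image coordinates.
        abs_kpts = rel * np.array([x2 - x1, y2 - y1]) + np.array([x1, y1])
        results.append(((x1, y1, x2, y2), abs_kpts))
    return results
```

Because every keypoint depends on its box, a missed or misplaced box propagates directly into the keypoint output, which is why the patent invests in improving box detection.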
The feature enhancement processing comprises: first, using an open-source palm target detection dataset to conduct targeted training of a Real-time Detection Transformer (RT-DETR) model, which is difficult to train but performs excellently. Through this training process, the deviations dx and dy between the target frame predicted by RT-DETR and the real target frame along the x and y axes of the image are obtained; the distribution of these deviations approximately follows a Gaussian distribution. To further exploit these deviation data, this design models dx and dy with a Gaussian Mixture Model (GMM) and samples from it. This process generates a series of supplementary feature values that are blended into the real data during training as an extension of the original training set. This feature enhancement effectively improves the target detection capability of the model, enabling more accurate detection of finger key points.
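The GMM modelling-and-sampling step can be sketched with scikit-learn. The deviation data here are synthetic stand-ins: in the patent's pipeline, dx/dy would come from comparing RT-DETR predictions with ground-truth palm boxes, and the box format and component count are assumptions for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the measured prediction/ground-truth offsets.
rng = np.random.default_rng(0)
deviations = np.column_stack([
    rng.normal(0.0, 2.0, 500),   # dx: x-axis offsets (pixels, synthetic)
    rng.normal(0.0, 3.0, 500),   # dy: y-axis offsets (pixels, synthetic)
])

# Fit a GMM to the joint (dx, dy) distribution and sample supplementary
# offsets with which real target frames can be jittered during training.
gmm = GaussianMixture(n_components=2, random_state=0).fit(deviations)
samples, _ = gmm.sample(200)

def augment_box(box, offset):
    """Shift an (x, y, w, h) ground-truth box by a sampled (dx, dy)."""
    x, y, w, h = box
    return (x + offset[0], y + offset[1], w, h)

augmented = [augment_box((100.0, 50.0, 40.0, 40.0), s) for s in samples]
```

Sampling from the fitted distribution, rather than adding uniform noise, keeps the synthetic offsets statistically consistent with the detector's real error profile.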
The experimental results obtained by applying the innovative approach in this design to YOLOv, YOLOv8 and YOLOv9 are shown in the following table. The evaluation index in table 1 is mAP (mean Average Precision), a common performance index in target detection and keypoint detection tasks, computed from precision, recall and the intersection-over-union ratio (IoU). The box mAP metrics measure the model's ability to detect hand bounding boxes, and the keypoint mAP metrics measure its ability to detect finger key points; mAP at an IoU threshold of 0.5 refers to the accuracy of the model's detection results at that threshold, while mAP over 0.5 to 0.95 refers to the average accuracy across IoU thresholds from 0.5 to 0.95. After applying the GMM-based feature enhancement method and the ACmix module of this design, the performance of the YOLO model is improved.
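The IoU quantity underlying these metrics can be sketched directly; this is a generic illustration of the standard definition, not code from the patent.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over Union for two (x1, y1, x2, y2) boxes — the
    quantity thresholded at 0.5 (and averaged over 0.5:0.95) in mAP."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

# A detection counts as correct at the 0.5 setting when IoU >= 0.5;
# the 0.5:0.95 setting averages over thresholds 0.50, 0.55, ..., 0.95.
thresholds = np.arange(0.5, 1.0, 0.05)
```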
Table 1 performance of the application on YOLOv, YOLOv8, YOLOv9
In this embodiment, the same or similar parts as those in embodiment 1 may be referred to each other, and will not be described in detail in the present disclosure.
Example 3:
on the basis of the embodiments 1 and 2, the embodiment 3 of the application provides a vision-based factory error prevention system, and in the actual production environment of a factory, the correct execution of a workflow is a key link for ensuring the safe and stable operation of the factory, and the personnel safety and the production efficiency are directly related. With the rapid development of Artificial Intelligence (AI) and computer vision technology, vision-based anti-misoperation systems provide new solutions to improve operation accuracy and safety. The system effectively prevents the occurrence of human misoperation accidents by analyzing the behaviors of operators in real time.
The vision-based factory error prevention system comprises an image input unit, a detection unit, a calculation unit and a judgment unit;
Wherein the image input unit is used for acquiring an operation image. For example, as shown in fig. 3, the image input unit may be a wearable device (such as AR glasses), and may acquire an operation image in real time, such as acquiring any video frame from an operation video as the operation image.
The detection unit is used for executing the finger key point detection method, detecting the positions of the target and the finger key points in the operation image. This function is critical to accurately judging the operator's actions. In addition, the system integrates various algorithms, including detection algorithms for multi-target states such as factory switches and personnel, OCR algorithms for recognizing text or digital content on an electrical cabinet, and two-dimensional code recognition algorithms.
The calculating unit calculates the distance between the target and the finger key point according to the positions of the target and the finger key point in the operation image, and determines the position and the state of the target closest to the finger key point.
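The calculating unit's nearest-target logic can be sketched as follows; the (name, state, position) target format is a hypothetical representation of the detection unit's output, used only for illustration.

```python
import numpy as np

def nearest_target(targets, fingertip):
    """Return the target closest to a finger key point and its distance.

    targets: list of (name, state, (x, y)) entries — a hypothetical
    encoding of the detection unit's output.
    fingertip: (x, y) position of the detected finger key point.
    """
    pts = np.array([t[2] for t in targets], dtype=float)
    dists = np.linalg.norm(pts - np.asarray(fingertip, dtype=float), axis=1)
    i = int(np.argmin(dists))
    return targets[i], float(dists[i])
```

The judging unit would then compare the returned target's name and state against the workflow specification and raise an alarm on any mismatch.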
The judging unit judges whether the executed operation is accurate or not according to the target position closest to the finger key point and the state thereof. Once the target pointed by the finger is found to be inconsistent with the target appointed in the workflow or the state of the target is inconsistent with the operation requirement, the system can immediately send out a warning, and misoperation is effectively prevented.
Example 4:
On the basis of embodiments 1 and 2, embodiment 4 of the present application provides a finger key point detection device for a factory operation safety system, including:
the acquisition module is used for acquiring input data;
The detection module is used for carrying out target detection and key point detection on the input data through a target detection network;
The target detection network comprises a network backbone structure, a neck structure and a detection structure, wherein the detection structure comprises a target detection head and a key point detection head, and the key point detection head comprises a first convolution layer, a convolution and self-attention fusion module ACmix and a second convolution layer which are sequentially connected.
Specifically, the system provided in this embodiment is a system corresponding to the method provided in embodiments 1 and 2, so that the portions in this embodiment that are the same as or similar to those in embodiments 1 and 2 may be referred to each other, and will not be described in detail in this disclosure.

Claims (9)

1. A finger key point detection method for a factory operation safety system, comprising:
step 1, acquiring input data;
step 2, performing target detection and key point detection on the input data through a target detection network;
The target detection network comprises a network backbone structure, a neck structure and a detection structure, wherein the detection structure comprises a target detection head and a key point detection head, and the key point detection head comprises a first convolution layer, a convolution and self-attention fusion module ACmix and a second convolution layer which are sequentially connected.
2. The method for finger keypoint detection for a plant operation safety system according to claim 1, wherein in step 2, the operation of the convolution and self-attention fusion module ACmix comprises:
Acquiring an input feature map, and performing linear mapping on the input feature map using convolution kernels to generate weight matrices;
extracting local features of the input feature map by convolution operation according to the weight matrix, and obtaining a first feature map;
Capturing long-distance dependency relations among different positions in the feature map by using a self-attention mechanism, and acquiring a second feature map according to the long-distance dependency relations;
and carrying out weighted summation on the first feature map and the second feature map through the learnable hyperparameters alpha and beta to obtain an output feature map, wherein the dimensions of the output feature map are consistent with those of the input feature map.
3. The finger keypoint detection method for a plant operation safety system according to claim 2, wherein in step 2, the object detection network performs a feature enhancement process in a training phase, the feature enhancement process comprising:
Training the RT-DETR model by utilizing a palm target detection data set to obtain deviations dx and dy of a target frame predicted by the RT-DETR and a real target frame in the directions of an x axis and a y axis of the image;
modeling dx and dy by adopting a Gaussian mixture model GMM, and sampling to obtain a supplementary characteristic value;
And adding the supplementary feature values into an original training set, obtaining an extended training set, and training a target detection network according to the extended training set.
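The GMM modelling and sampling step of claim 3 can be illustrated with a minimal 1D expectation-maximization fit. This is a hedged sketch, not the patented pipeline: the deviation data is synthetic, the component count and iteration budget are arbitrary, and a production system would more likely use a library fitter (e.g. scikit-learn's GaussianMixture) on the real RT-DETR/ground-truth offsets.

```python
import numpy as np

def fit_gmm_1d(data, k=2, iters=50, seed=0):
    """Minimal EM fit of a k-component 1D Gaussian mixture (illustrative).

    Stands in for the GMM of claim 3 that models the per-axis deviations
    dx, dy between RT-DETR's predicted boxes and the ground-truth boxes.
    """
    rng = np.random.default_rng(seed)
    mu = rng.choice(data, size=k, replace=False).astype(float)
    var = np.full(k, data.var() + 1e-6)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each data point
        d2 = (data[:, None] - mu[None, :]) ** 2
        log_p = -0.5 * (np.log(2 * np.pi * var) + d2 / var) + np.log(pi)
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update mixture weights, means and variances
        nk = r.sum(axis=0) + 1e-12
        pi = nk / len(data)
        mu = (r * data[:, None]).sum(axis=0) / nk
        var = (r * d2).sum(axis=0) / nk + 1e-6
    return pi, mu, var

def sample_gmm(pi, mu, var, n, seed=1):
    """Draw n supplementary deviation values from the fitted mixture."""
    rng = np.random.default_rng(seed)
    comp = rng.choice(len(pi), size=n, p=pi)
    return rng.normal(mu[comp], np.sqrt(var[comp]))

# Synthetic dx deviations (pixels): two clusters of prediction offsets
dx = np.concatenate([np.random.default_rng(2).normal(-1, 0.3, 200),
                     np.random.default_rng(3).normal(4, 0.5, 200)])
pi, mu, var = fit_gmm_1d(dx)
extra_dx = sample_gmm(pi, mu, var, 100)   # supplementary feature values
assert extra_dx.shape == (100,)
```

The sampled values would then perturb ground-truth boxes (or otherwise extend the training set) so the detector sees deviation patterns matching those RT-DETR actually produces.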
4. The finger key point detection method for a factory operation safety system according to claim 3, wherein step 1 further comprises performing data enhancement processing on the input data, the data enhancement processing comprising mixing actual application scene pictures with a finger key point data set.
5. The finger key point detection method for a factory operation safety system according to claim 4, wherein in step 2 the neck structure is a feature fusion module PAN.
6. A finger key point detection device for a factory operation safety system, configured to perform the finger key point detection method according to any one of claims 1 to 5, comprising:
an acquisition module for acquiring input data; and
a detection module for performing target detection and key point detection on the input data through a target detection network;
wherein the target detection network comprises a backbone structure, a neck structure and a detection structure; the detection structure comprises a target detection head and a key point detection head; and the key point detection head comprises a first convolution layer, a convolution and self-attention fusion module ACmix, and a second convolution layer connected in sequence.
7. A vision-based factory error-prevention system, comprising an image input unit, a detection unit, a calculation unit and a judgment unit;
wherein the image input unit is used to acquire an operation image;
the detection unit performs the finger key point detection method according to any one of claims 1 to 5 to detect the positions of targets and finger key points in the operation image;
the calculation unit calculates the distances between the targets and the finger key points from their positions in the operation image, and determines the position and state of the target closest to a finger key point;
and the judgment unit judges, from the position and state of the target closest to the finger key point, whether the performed operation is correct, and raises an alarm if it is not.
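The calculation unit's nearest-target search in claim 7 reduces to a distance query between a finger key point and the detected target boxes. The sketch below is an assumption-laden illustration, not the patented unit: it takes targets as (x1, y1, x2, y2) boxes, uses box centres and Euclidean distance, and ignores the target-state lookup that the judgment unit would consume.

```python
import numpy as np

def nearest_target(finger_kp, target_boxes):
    """Find the target box whose centre is closest to a finger key point.

    Sketch of the calculation unit in claim 7: targets are (x1, y1, x2, y2)
    pixel boxes, the finger key point is an (x, y) coordinate, and the
    function returns the index of, and distance to, the nearest centre.
    """
    boxes = np.asarray(target_boxes, dtype=float)
    # centre of each detected target box
    centres = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                        (boxes[:, 1] + boxes[:, 3]) / 2], axis=1)
    # Euclidean distance from the key point to every centre
    d = np.linalg.norm(centres - np.asarray(finger_kp, dtype=float), axis=1)
    i = int(np.argmin(d))
    return i, float(d[i])

# fingertip at (100, 100); two candidate switch targets in the image
idx, dist = nearest_target((100, 100), [(90, 90, 110, 110), (300, 300, 340, 340)])
assert idx == 0 and dist == 0.0  # fingertip sits on the first box's centre
```

The judgment unit would then compare the state of target `idx` against the expected operation and trigger an alarm on a mismatch.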
8. A computer storage medium storing a computer program which, when run on a computer, causes the computer to perform the finger key point detection method according to any one of claims 1 to 5.
9. An electronic device, comprising:
a memory for storing a computer program; and
a processor for executing the computer program to implement the finger key point detection method according to any one of claims 1 to 5.
CN202411604718.1A 2024-11-12 2024-11-12 A finger key point detection method for factory operation safety system Pending CN119399795A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411604718.1A CN119399795A (en) 2024-11-12 2024-11-12 A finger key point detection method for factory operation safety system

Publications (1)

Publication Number Publication Date
CN119399795A true CN119399795A (en) 2025-02-07

Family

ID=94420936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411604718.1A Pending CN119399795A (en) 2024-11-12 2024-11-12 A finger key point detection method for factory operation safety system

Country Status (1)

Country Link
CN (1) CN119399795A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120279452A (en) * 2025-06-03 2025-07-08 浙江浙能数字科技有限公司 Factory switch error prevention method and system based on image recognition


Similar Documents

Publication Publication Date Title
CN110084299B (en) Target detection method and device based on multi-head fusion attention
CN109376631B (en) Loop detection method and device based on neural network
CN109522963A (en) A kind of the feature building object detection method and system of single-unit operation
CN113269089A (en) Real-time gesture recognition method and system based on deep learning
Geng et al. An improved helmet detection method for YOLOv3 on an unbalanced dataset
CN112861678B (en) Image recognition method and device
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
KR20240144139A (en) Facial pose estimation method, apparatus, electronic device and storage medium
CN119399795A (en) A finger key point detection method for factory operation safety system
CN114399729B (en) Monitoring object movement identification method, system, terminal and storage medium
CN113706481A (en) Sperm quality detection method, sperm quality detection device, computer equipment and storage medium
CN112651294A (en) Method for recognizing human body shielding posture based on multi-scale fusion
Zhou et al. Object detection in low-light conditions based on DBS-YOLOv8
Wang et al. YOLO-RLC: An Advanced Target-Detection Algorithm for Surface Defects of Printed Circuit Boards Based on YOLOv5.
WO2021169642A1 (en) Video-based eyeball turning determination method and system
CN119516161B (en) Personnel state recognition method and system based on target recognition detection
Sun et al. PIDNet: An efficient network for dynamic pedestrian intrusion detection
CN115984712A (en) Method and system for small target detection in remote sensing images based on multi-scale features
CN118298513B (en) Power operation violation detection method and system based on machine vision
Lee et al. Data and model uncertainty aware salient object detection
CN118115540A (en) Three-dimensional target tracking method, device, equipment and storage medium
Moseva et al. Algorithm for Predicting Pedestrian Behavior on Public Roads
Liu et al. Research on an improved yolov5s algorithm for detecting helmets on construction sites
Tao et al. Detection research of insulating gloves wearing status based on improved YOLOv8s algorithm
CN111353349B (en) Human body key point detection method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination