
CN108549878B - Depth information-based hand detection method and system - Google Patents


Info

Publication number
CN108549878B
CN108549878B (application CN201810391268.0A)
Authority
CN
China
Prior art keywords
hand
depth image
depth
image
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810391268.0A
Other languages
Chinese (zh)
Other versions
CN108549878A (en)
Inventor
王行
李骊
盛赞
周晓军
李朔
杨淼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing HJIMI Technology Co Ltd
Original Assignee
Beijing HJIMI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing HJIMI Technology Co Ltd filed Critical Beijing HJIMI Technology Co Ltd
Priority to CN201810391268.0A
Publication of CN108549878A
Application granted
Publication of CN108549878B
Legal status: Active

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a hand detection method and system based on depth information. The method comprises the following steps: acquiring a current depth image of a human body; marking initial position coordinates of the hand in the current depth image; normalizing the current depth image; obtaining each selected area that may include a hand, based on a preset multilayer convolutional neural network model, from the normalized current depth image; performing a non-maximum suppression operation on the selected areas to obtain an optimal selected area; and performing image post-processing on the optimal selected area to obtain the final position coordinates of the hand in the current depth image. The method removes the limitation of traditional depth-map hand detection, which requires the hand to be the object closest to the camera; it can accurately locate the hand, and the non-maximum suppression operation and image post-processing further improve the positional accuracy of the palm center.

Description

Depth information-based hand detection method and system
Technical Field
The invention relates to the technical field of hand detection, in particular to a depth information-based hand detection method and a depth information-based hand detection system.
Background
With the continuous development of new interaction technologies and applications, especially the rapid growth of AR (augmented reality) and VR (virtual reality), gesture-related technologies are widely regarded as the most representative interaction techniques of the next generation, enabling contactless remote human-computer interaction. In the smart-home field, household appliances and robots can be controlled and commanded through gestures; in VR and AR, gestures enable a realistic hands-on experience that greatly enhances games and teaching applications. All gesture interaction techniques depend on a hand detection algorithm. Common hand detection algorithms either detect on 2D color images or use traditional image processing to find the region containing the closest point in a depth map, which is assumed by default to be the hand. The former relies entirely on pixel information and has poor robustness to interference; the latter restricts where the hand may be located. When the hand moves over a large range, detection is often not accurate enough.
Disclosure of Invention
The invention aims to at least solve one of the technical problems in the prior art and provides a hand detection method based on depth information and a hand detection system based on depth information.
In order to achieve the above object, a first aspect of the present invention provides a hand detection method based on depth information, including:
step S120, obtaining a current depth image of a human body;
step S130, marking initial position coordinates of the hand in the current depth image according to the current depth image;
step S140, carrying out normalization processing on the current depth image;
s150, obtaining each selected area including hand detection based on a preset multilayer convolutional neural network model according to the current depth image after normalization processing;
step S160, carrying out non-maximum value inhibition operation according to each selected area to obtain an optimal selected area;
and S170, performing image post-processing on the optimal selected area to obtain the final position coordinate of the hand in the current depth image.
Optionally, the method further comprises:
s110: and updating the preset multilayer convolutional neural network model.
Optionally, the step S110 includes:
acquiring each depth image of a hand of a human body, wherein each depth image comprises depth images of various postures of the hand;
marking initial position coordinates of the hand in the corresponding depth images according to the depth images;
normalizing each depth image;
and inputting each depth image after normalization processing into a convolutional neural network model, training until iteration times or loss convergence is met, and finishing updating the preset multilayer convolutional neural network model.
Optionally, the step of normalizing each depth image includes:
and normalizing each depth image in a preset depth range of [ -1,1] or [0,1 ].
Optionally, the image post-processing the optimal selected region includes:
at least one of smoothing filtering, dilation-erosion, and contour detection.
In a second aspect of the present invention, there is provided a hand detection system based on depth information, comprising:
the acquisition module is used for acquiring a current depth image of a human body;
the marking module is used for marking the initial position coordinates of the hand in the current depth image according to the current depth image;
the normalization module is used for performing normalization processing on the current depth image;
the selection module is used for obtaining each selected area that may include a hand, based on a preset multilayer convolutional neural network model, from the normalized current depth image;
the evaluation module is used for performing a non-maximum suppression operation on each selected area to obtain an optimal selected area;
and the image processing module is used for carrying out image post-processing on the optimal selected area to obtain the final position coordinate of the hand in the current depth image.
Optionally, the method further comprises:
and the updating module is used for updating the preset multilayer convolutional neural network model.
Optionally, the acquiring module is further configured to acquire each depth image of a hand of a human body, where each depth image includes depth images of various gestures of the hand;
the marking module is further used for marking the initial position coordinates of the hand in the corresponding depth images according to the depth images;
the normalization module is also used for performing normalization processing on each depth image;
and the updating module is used for inputting each normalized depth image into the convolutional neural network model and training until the preset iteration count is reached or the loss converges, thereby completing the update of the preset multilayer convolutional neural network model.
Optionally, the normalization module is configured to normalize each depth image into a preset depth range of [-1, 1] or [0, 1].
Optionally, the acquisition module comprises a depth camera or a depth video camera.
The hand detection method based on depth information removes the limitation of traditional depth-map hand detection, which requires the hand to be the object closest to the camera; it can accurately locate the hand (including both hands), and the non-maximum suppression operation and image post-processing further improve the positional accuracy of the palm center.
The hand detection system based on depth information likewise removes this limitation, can accurately locate the hand (including both hands), and further improves the positional accuracy of the palm center through the non-maximum suppression operation and image post-processing.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of a hand detection method based on depth information according to a first embodiment of the present invention;
fig. 2 is a schematic structural diagram of a hand detection system based on depth information according to a second embodiment of the present invention.
Description of the reference numerals
100: a hand detection system based on depth information;
110: an acquisition module;
120: a marking module;
130: a normalization module;
140: selecting a module;
150: an evaluation module;
160: image processing module
170: and updating the module.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
Referring to fig. 1, a first aspect of the present invention relates to a method S100 for detecting a hand based on depth information, comprising:
and step S120, acquiring a current depth image of the human body.
Specifically, in this step, a current depth image of the human body may be acquired using a depth image device such as a depth camera or a depth camera.
And S130, marking the initial position coordinates of the hand in the current depth image according to the current depth image.
It should be noted that no specific limitation is made on how to mark the initial position coordinates of the hand in the current depth image, and for example, the initial position coordinates may be determined by depth information of each pixel point in the depth image. Of course, the calibration of the position coordinates of the hand part can also be realized according to other modes.
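As a concrete illustration of one such mode, a minimal sketch (not the patent's specified procedure) might mark the initial hand coordinates as the centroid of the pixels nearest the camera — the closest-point heuristic the background section attributes to traditional depth-map methods. The band width and the zero-means-invalid convention below are illustrative assumptions:

```python
def mark_initial_hand_position(depth, band=50, invalid=0):
    """Estimate an initial (row, col) hand coordinate from a depth image.

    Heuristic sketch: collect all valid pixels whose depth lies within
    `band` units of the minimum depth (the region nearest the camera)
    and return their centroid.  `depth` is a 2D list of integers;
    `invalid` marks pixels with no depth reading.
    """
    nearest = min(v for row in depth for v in row if v != invalid)
    pts = [(r, c)
           for r, row in enumerate(depth)
           for c, v in enumerate(row)
           if v != invalid and v - nearest <= band]
    rows = sum(p[0] for p in pts) / len(pts)
    cols = sum(p[1] for p in pts) / len(pts)
    return rows, cols

# Example: a 4x4 depth map where the nearest blob sits in the top-left.
depth = [
    [400, 410, 900, 900],
    [405, 415, 900, 900],
    [900, 900, 900, 900],
    [900, 900, 900,   0],   # 0 = no depth reading
]
print(mark_initial_hand_position(depth))  # centroid of the four near pixels
```

As the description notes, this is only one possible mode; the depth values of individual pixels could be weighted or clustered in other ways.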
Step S140, normalizing the current depth image.
Step S150, obtaining each selected area that may include a hand, based on the preset multilayer convolutional neural network model, from the normalized current depth image.
It should be noted that the preset multilayer convolutional neural network model is essentially a classifier model: the normalized current depth image is fed into this classifier, and the selected areas of candidate hand detections are determined from its output.
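The patent does not disclose the network architecture, but the role the classifier plays can be sketched as a scorer over candidate windows of the normalized depth image. In the sketch below, `score_fn` is a toy stand-in for the trained multilayer CNN, and the window size, stride, and threshold are all illustrative assumptions:

```python
def propose_regions(image, win, stride, score_fn, thresh):
    """Slide a window over a normalized depth image and keep windows the
    scorer rates at or above `thresh`.  `score_fn` stands in for the
    trained multilayer CNN: any callable mapping a window (list of rows)
    to a confidence in [0, 1].  Returns (x, y, w, h, score) candidates.
    """
    h, w = len(image), len(image[0])
    boxes = []
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            patch = [row[x:x + win] for row in image[y:y + win]]
            s = score_fn(patch)
            if s >= thresh:
                boxes.append((x, y, win, win, s))
    return boxes

# Toy scorer: mean intensity of the window (a real system runs the CNN).
mean_score = lambda patch: sum(map(sum, patch)) / (len(patch) * len(patch[0]))

img = [[0.0] * 6 for _ in range(6)]
for r in range(2, 4):
    for c in range(2, 4):
        img[r][c] = 1.0          # bright 2x2 "hand" blob
cands = propose_regions(img, win=2, stride=1, score_fn=mean_score, thresh=0.9)
print(cands)                      # only the window exactly covering the blob
```

A detection CNN would normally emit many overlapping high-scoring windows around the true hand, which is why the next step applies non-maximum suppression.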
Step S160, performing a non-maximum suppression operation on the selected areas to obtain an optimal selected area.
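Non-maximum suppression itself is a standard operation; a minimal greedy IoU-based sketch, with an illustrative overlap threshold, looks like this:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: repeatedly keep the highest-scoring
    box and discard any remaining box that overlaps it by more than
    `iou_thresh`.  Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep

boxes = [(10, 10, 20, 20), (12, 12, 20, 20), (60, 60, 15, 15)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```

Because detections of the same hand cluster tightly, the surviving box is the "optimal selected area" the patent refers to.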
Step S170, performing image post-processing on the optimal selected area to obtain the final position coordinates of the hand in the current depth image.
In this step, the final position coordinates of the hand obtained are more accurate than the initial position coordinates of the hand.
The depth information-based hand detection method S100 in this embodiment eliminates the limitation of the conventional depth map-related hand detection method on the condition that the hand must be located at the forefront of the camera, and can accurately find the hand position (including both hands), and further improve the position accuracy of the palm of the hand by non-maximum suppression operation and image post-processing.
Optionally, as shown in fig. 1, the method further includes:
s110: and updating the preset multilayer convolutional neural network model.
Specifically, step S110 includes:
acquiring each depth image of a hand of a human body, wherein each depth image comprises depth images of various postures of the hand.
And marking the initial position coordinates of the hand in the corresponding depth images according to the depth images.
And carrying out normalization processing on each depth image.
And inputting each normalized depth image into the convolutional neural network model and training until the preset iteration count is reached or the loss converges, thereby completing the update of the preset multilayer convolutional neural network model.
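The two stopping rules named here — an iteration budget or loss convergence — can be sketched generically. The toy quadratic objective below stands in for the real CNN loss, and the learning rate and tolerance values are illustrative assumptions:

```python
def train(step_fn, max_iters=1000, tol=1e-6):
    """Run `step_fn` (one training step returning the current loss) until
    either the iteration budget is exhausted or the loss has converged
    (change between consecutive iterations below `tol`) -- the two
    stopping rules of the training step.  Returns (iterations, last loss)."""
    prev = float("inf")
    for it in range(1, max_iters + 1):
        loss = step_fn()
        if abs(prev - loss) < tol:
            return it, loss
        prev = loss
    return max_iters, prev

# Toy stand-in for a CNN update: gradient descent on (w - 3)^2.
state = {"w": 0.0}
def step():
    grad = 2 * (state["w"] - 3)
    state["w"] -= 0.1 * grad
    return (state["w"] - 3) ** 2

iters, loss = train(step)
print(iters, loss)  # stops well before the 1000-iteration budget
```

In the patent's setting `step_fn` would run one optimization pass of the multilayer CNN over the labelled, normalized depth images.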
In the hand detection method S100 based on depth information in this embodiment, the preset multilayer convolutional neural network model may be updated, that is, the preset multilayer convolutional neural network model may be trained, so that the position of the hand (including both hands) may be further accurately found, and the position accuracy of the palm center of the hand may be further improved.
Optionally, the step of normalizing each depth image includes:
and normalizing each depth image in a preset depth range of [ -1,1] or [0,1 ]. Of course, normalization processing can be performed in other preset depth ranges according to actual needs.
In the hand detection method S100 based on depth information in this embodiment, the positions of the hands (including both hands) can be further accurately found, and the position accuracy of the palm center of the hand can be further improved.
Optionally, the image post-processing the optimal selected region includes:
at least one of smoothing filtering, dilation-erosion, and contour detection.
In a second aspect of the present invention, as shown in fig. 2, there is provided a hand detection system 100 based on depth information, comprising:
an obtaining module 110, configured to obtain a current depth image of a human body;
the marking module 120 is configured to mark an initial position coordinate of the hand in the current depth image according to the current depth image;
a normalization module 130, configured to perform normalization processing on the current depth image;
the selecting module 140 is configured to obtain each selected area that may include a hand, based on a preset multilayer convolutional neural network model, from the normalized current depth image;
the evaluation module 150 is configured to perform a non-maximum suppression operation on each selected area to obtain an optimal selected area;
and the image processing module 160 is configured to perform image post-processing on the optimal selected area to obtain a final position coordinate of the hand in the current depth image.
The hand detection system 100 based on depth information in this embodiment eliminates the limitation of the conventional method for detecting the hand by using a depth map on the condition that the hand must be located at the forefront of the camera, and can accurately find the position of the hand (including both hands), and further improve the position accuracy of the palm center of the hand by non-maximum suppression operation and image post-processing.
Optionally, the method further comprises:
and the updating module 170 is configured to update the preset multilayer convolutional neural network model.
Optionally, the obtaining module 110 is further configured to obtain each depth image of a hand of a human body, where each depth image includes depth images of various gestures of the hand;
the marking module 120 is further configured to mark initial position coordinates of the hand in the corresponding depth images according to the depth images;
the normalization module 130 is further configured to perform normalization processing on each depth image;
and the updating module 170 is configured to input each normalized depth image into the convolutional neural network model and train until the preset iteration count is reached or the loss converges, thereby completing the update of the preset multilayer convolutional neural network model.
The hand detection system 100 based on depth information in this embodiment can update the preset multilayer convolutional neural network model, that is, can train the preset multilayer convolutional neural network model, so that the position of the hand (including both hands) can be further accurately found, and the position accuracy of the palm of the hand can be further improved.
Optionally, the normalization module 130 is configured to normalize each depth image into a preset depth range of [-1, 1] or [0, 1].
Optionally, the acquisition module 110 comprises a depth camera or a depth video camera.
It will be understood that the above embodiments are merely exemplary embodiments taken to illustrate the principles of the present invention, which is not limited thereto. It will be apparent to those skilled in the art that various modifications and improvements can be made without departing from the spirit and substance of the invention, and these modifications and improvements are also considered to be within the scope of the invention.

Claims (10)

1. A hand detection method based on depth information is characterized by comprising the following steps:
step S120, obtaining a current depth image of a human body;
step S130, marking initial position coordinates of the hand in the current depth image according to the current depth image;
step S140, carrying out normalization processing on the current depth image;
s150, obtaining each selected area including hand detection based on a preset multilayer convolutional neural network model according to the current depth image after normalization processing;
step S160, carrying out non-maximum value inhibition operation according to each selected area to obtain an optimal selected area;
and S170, performing image post-processing on the optimal selected area to obtain the final position coordinate of the hand in the current depth image.
2. The hand detection method of claim 1, further comprising:
s110: and updating the preset multilayer convolutional neural network model.
3. The hand detection method according to claim 2, wherein the step S110 includes:
acquiring each depth image of a hand of a human body, wherein each depth image comprises depth images of various postures of the hand;
marking initial position coordinates of the hand in the corresponding depth images according to the depth images;
normalizing each depth image;
and inputting each depth image after normalization processing into a convolutional neural network model, training until iteration times or loss convergence is met, and finishing updating the preset multilayer convolutional neural network model.
4. The hand detection method according to claim 3, wherein the step of normalizing each depth image includes:
and normalizing each depth image in a preset depth range of [ -1,1] or [0,1 ].
5. A hand detection method as claimed in any of claims 1 to 4, wherein said image post-processing of said optimal selected area comprises:
at least one of smoothing filtering, dilation-erosion, and contour detection.
6. A depth information based hand detection system, comprising:
the acquisition module is used for acquiring a current depth image of a human body;
the marking module is used for marking the initial position coordinates of the hand in the current depth image according to the current depth image;
the normalization module is used for performing normalization processing on the current depth image;
the selection module is used for obtaining each selected area that may include a hand, based on a preset multilayer convolutional neural network model, from the normalized current depth image;
the evaluation module is used for performing a non-maximum suppression operation on each selected area to obtain an optimal selected area;
and the image processing module is used for carrying out image post-processing on the optimal selected area to obtain the final position coordinate of the hand in the current depth image.
7. The hand detection system of claim 6, further comprising:
and the updating module is used for updating the preset multilayer convolutional neural network model.
8. The hand detection system of claim 7,
the acquisition module is further configured to acquire each depth image of a hand of a human body, where each depth image includes depth images of various gestures of the hand;
the marking module is further used for marking the initial position coordinates of the hand in the corresponding depth images according to the depth images;
the normalization module is also used for performing normalization processing on each depth image;
and the updating module is used for inputting each normalized depth image into the convolutional neural network model and training until the preset iteration count is reached or the loss converges, thereby completing the update of the preset multilayer convolutional neural network model.
9. The hand detection system of claim 8, wherein the normalization module is configured to normalize each of the depth images into a preset depth range of [-1, 1] or [0, 1].
10. A hand detection system as claimed in any of claims 6 to 9, wherein the acquisition module comprises a depth camera or a depth video camera.
CN201810391268.0A 2018-04-27 2018-04-27 Depth information-based hand detection method and system Active CN108549878B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810391268.0A CN108549878B (en) 2018-04-27 2018-04-27 Depth information-based hand detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810391268.0A CN108549878B (en) 2018-04-27 2018-04-27 Depth information-based hand detection method and system

Publications (2)

Publication Number Publication Date
CN108549878A CN108549878A (en) 2018-09-18
CN108549878B true CN108549878B (en) 2020-03-24

Family

ID=63512809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810391268.0A Active CN108549878B (en) 2018-04-27 2018-04-27 Depth information-based hand detection method and system

Country Status (1)

Country Link
CN (1) CN108549878B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109243248A (en) * 2018-09-29 2019-01-18 南京华捷艾米软件科技有限公司 A virtual piano based on a 3D depth camera module and its implementation method
CN109696958A (en) * 2018-11-28 2019-04-30 南京华捷艾米软件科技有限公司 A kind of gestural control method and system based on depth transducer gesture identification
CN111462548A (en) * 2019-01-21 2020-07-28 北京字节跳动网络技术有限公司 A paragraph reading method, apparatus, device and readable medium
CN111459443A (en) * 2019-01-21 2020-07-28 北京字节跳动网络技术有限公司 Character point-reading method, device, equipment and readable medium
WO2021098666A1 (en) 2019-11-20 2021-05-27 Oppo广东移动通信有限公司 Hand gesture detection method and device, and computer storage medium
CN111462234B (en) * 2020-03-27 2023-07-18 北京华捷艾米科技有限公司 A method and device for determining a location

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102999152A (en) * 2011-09-09 2013-03-27 康佳集团股份有限公司 Method and system for gesture recognition
CN106648103A (en) * 2016-12-28 2017-05-10 歌尔科技有限公司 Gesture tracking method for VR headset device and VR headset device
CN106778435A (en) * 2016-11-30 2017-05-31 上海正雅齿科科技有限公司 Feature extracting method based on Image neighborhood structure tensor equation
CN107292904A (en) * 2016-03-31 2017-10-24 北京市商汤科技开发有限公司 A kind of palm tracking and system based on depth image

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10320437B2 (en) * 2014-10-24 2019-06-11 Usens, Inc. System and method for immersive and interactive multimedia generation

Also Published As

Publication number Publication date
CN108549878A (en) 2018-09-18

Similar Documents

Publication Publication Date Title
CN108549878B (en) Depth information-based hand detection method and system
CN106598227B (en) Gesture identification method based on Leap Motion and Kinect
US10043308B2 (en) Image processing method and apparatus for three-dimensional reconstruction
JP4278979B2 (en) Single camera system for gesture-based input and target indication
CN104598915B (en) A kind of gesture identification method and device
WO2023000119A1 (en) Gesture recognition method and apparatus, system, and vehicle
CN105528082A (en) Three-dimensional space and hand gesture recognition tracing interactive method, device and system
CN105500370B (en) A kind of robot off-line teaching programing system and method based on body-sensing technology
CN102982557A (en) Method for processing space hand signal gesture command based on depth camera
CN107357428A (en) Man-machine interaction method and device based on gesture identification, system
JP2014501011A (en) Method, circuit and system for human machine interface with hand gestures
CN102999152A (en) Method and system for gesture recognition
CN113192138B (en) A robot autonomous repositioning method, device, robot and storage medium
CN109839827B (en) Gesture recognition intelligent household control system based on full-space position information
EP3127586B1 (en) Interactive system, remote controller and operating method thereof
CN112488059B (en) Spatial gesture control method based on deep learning model cascade
CN111597987B (en) Method, apparatus, device and storage medium for generating information
CN113971835A (en) A control method, device, storage medium and terminal device for household electrical appliances
CN110456904B (en) Augmented reality glasses eye movement interaction method and system without calibration
Xompero et al. Multi-view shape estimation of transparent containers
CN111002289A (en) Robot online teaching method and device, terminal device and storage medium
JP2017033556A (en) Image processing method and electronic apparatus
CN113918013A (en) Gesture directional interaction system and method based on AR glasses
CN107391289A (en) A kind of three-dimensional pen-based interaction Interface Usability appraisal procedure
CN109934155B (en) Depth vision-based collaborative robot gesture recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Hand detection method and system based on depth information

Effective date of registration: 20220311

Granted publication date: 20200324

Pledgee: Industrial Bank Co.,Ltd. Beijing West Railway Station sub branch

Pledgor: BeiJing Hjimi Technology Co.,Ltd.

Registration number: Y2022110000054

PP01 Preservation of patent right

Effective date of registration: 20231101

Granted publication date: 20200324
