Disclosure of Invention
In view of the foregoing drawbacks or shortcomings of the prior art, it is desirable to provide a campus security monitoring method and system based on multi-modal awareness to solve the foregoing problems.
The first aspect of the invention provides a campus security monitoring method based on multi-mode sensing, which comprises the following steps:
Collecting multi-mode data in a target area in real time through a front-end sensing unit, wherein the multi-mode data comprise human body action data, environment sound data and personnel physiological data;
Independently analyzing the multi-mode data to respectively generate a first early warning signal, a second early warning signal and a third early warning signal;
Uploading the three early warning signals to a cloud analysis platform for fusion analysis, so that the cloud analysis platform judges the comprehensive alarm level based on the signal intensity, the superposition weight and the space-time dynamic threshold value of the three early warning signals, wherein the space-time dynamic threshold value is dynamically adjusted based on the occurrence time and the occurrence place;
And distributing early warning information to a preset terminal in a directional manner according to the comprehensive alarm level and the accident site, and triggering emergency response operation.
According to the technical scheme provided by the invention, the independent analysis of the multi-mode data respectively generates a first early warning signal, a second early warning signal and a third early warning signal, and the method comprises the following steps:
invoking a first preset database to match dangerous action labels according to the human action data, and outputting a first early warning signal when the matching is judged to be successful;
calling a second preset database to match dangerous vocabulary labels according to the environmental sound data, and outputting a second early warning signal when the matching is judged to be successful;
and calling a third preset database to compare the physiological data of the person against the respiratory frequency threshold value and the heartbeat frequency threshold value, and outputting a third early warning signal when the threshold value is exceeded.
According to the technical scheme provided by the invention, the three early warning signals are uploaded to the cloud analysis platform for fusion analysis, so that the cloud analysis platform judges the comprehensive alarm level based on the signal intensity, the superposition weight and the space-time dynamic threshold of the three early warning signals, and the method comprises the following steps:
Determining signal strengths of the three early warning signals, wherein the signal strengths comprise action matching degree, vocabulary matching degree and physiological abnormality index;
giving weight values to the signal intensities of the three early warning signals based on the occurrence time and the occurrence place respectively, and calculating a weighted sum to obtain risk assessment parameters;
and calling a risk level database according to the occurrence time and the occurrence place to acquire the space-time dynamic threshold value, and judging the comprehensive alarm level according to the matching result of the risk assessment parameter and the space-time dynamic threshold value range.
According to the technical scheme provided by the invention, the acquisition of the human motion data comprises the following steps:
Acquiring a plurality of frames of infrared images through an infrared imaging device, performing heat source segmentation and time sequence analysis on the plurality of frames of infrared images, and identifying the contours of the trunk, the limbs and the head of a target person;
Judging the contact event types among target persons according to the area change rate and the temperature gradient distribution of a heat source contact area between adjacent frames in the infrared image, wherein the contact event types comprise dynamic contact events and instantaneous contact events;
And determining a pre-trained action classification model according to the contact event type, inputting the area change rate and the temperature gradient distribution into the action classification model to output a preliminary action label, and generating human action data.
According to the technical scheme provided by the invention, the method for judging the contact event type between target persons according to the area change rate and the temperature gradient distribution of the heat source contact area between adjacent frames in the infrared image comprises the following steps:
calculating the area change rate of the heat source contact area between continuous frames in the infrared image, and judging that the heat source contact area between continuous frames is an effective contact event if the area change rate of the contact area between continuous frames exceeds a first set value;
and extracting the temperature gradient distribution of the contact area to construct a three-dimensional thermodynamic characteristic matrix, and dividing the contact event types by combining the contact duration.
According to the technical scheme provided by the invention, the method further comprises the following steps:
Based on the heat source profile, reversely tracking a motion track of a preset duration before contact, and extracting an acceleration change rate, a direction deflection angle and a relative speed;
Calculating an action rationality score through the movement intention analysis model, and correcting the confidence coefficient of the preliminary action label, wherein the confidence coefficient of the preliminary action label is used for adjusting the action intensity.
According to the technical scheme provided by the invention, the method further comprises the following steps:
calculating the dynamic energy density of the contact area according to the three-dimensional thermodynamic characteristic matrix;
And calling a confidence correction database, matching a confidence correction value based on the dynamic energy density and the preliminary action label, and correcting the confidence of the preliminary action label with the matched confidence correction value.
According to the technical scheme provided by the invention, the method further comprises the following steps:
and generating a report of the bullying event including the event type and the injury level.
According to the technical scheme provided by the invention, the method further comprises the following steps:
And monitoring the power supply state of the front-end sensing unit, starting a standby power supply when the power is off, and generating a power-off alarm signal.
According to the technical scheme provided by the invention, a campus security monitoring system based on multi-mode sensing is provided for executing the above campus security monitoring method based on multi-mode sensing, and the system comprises:
The data acquisition module is configured to acquire multi-modal data in the target area in real time through the front-end sensing unit, wherein the multi-modal data comprises human body action data, environment sound data and personnel physiological data;
the data analysis module is configured to independently analyze the multi-mode data and respectively generate a first early warning signal, a second early warning signal and a third early warning signal;
the data transmission module is configured to upload the three early warning signals to the cloud analysis platform for fusion analysis, so that the cloud analysis platform can judge the comprehensive alarm level based on the signal intensity, the superposition weight and the space-time dynamic threshold value of the three early warning signals;
The cloud response module is configured to directionally distribute early warning information to a preset terminal according to the comprehensive alarm level and the accident site and trigger emergency response operation.
Compared with the prior art, the invention has the following beneficial effects. Human body actions, environment sounds and personnel physiological data are synchronously acquired through the front-end sensing unit, analyzed independently and then fused, so that multi-dimensional cross validation is realized and the false alarm rate is remarkably reduced. The alarm threshold value is dynamically adjusted based on the time and place of occurrence, which realizes scene self-adaptive monitoring and avoids the rigidity caused by a fixed threshold value. The comprehensive alarm level is judged through fusion analysis of the signal intensity, the weight and the dynamic threshold value, and early warning information is directionally distributed to a preset terminal, which realizes hierarchical emergency response and optimizes resource allocation and response efficiency. The data fusion architecture of the cloud analysis platform supports model training and multi-terminal expansion, which enables intelligent upgrading and function expansion of the system.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the invention are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present invention and features of the embodiments may be combined with each other. The invention will be described in detail below with reference to the drawings in connection with embodiments.
Referring to the figure, the embodiment provides a campus security monitoring method based on multi-mode sensing, which includes:
S100, acquiring multi-mode data in a target area in real time through a front end sensing unit, wherein the multi-mode data comprises human body action data, environment sound data and personnel physiological data.
In step S100, the front-end sensing unit includes an action pickup device, a sound pickup device and a physiological data pickup device. The action pickup device only picks up human actions in the target area and does not collect optical images of the area, which avoids invading the privacy of students, so it can be deployed in private places such as toilets and dormitories. The sound pickup device collects environmental sounds, in particular the voices of students in the area. The physiological data pickup device is optionally a millimeter wave radar, which detects the respiration and heartbeat frequencies of students in the area. The front-end sensing units are deployed at a plurality of places in the school and are in signal connection with the monitoring system at the rear end through the Internet of things or the Internet; the human body action data, environment sound data and personnel physiological data they collect are sent to the monitoring system in real time for processing. When uploading data, each front-end sensing unit also transmits the time and the place of data acquisition to the monitoring system, so that the time and the place of occurrence can be tracked.
The human motion data comprises a plurality of preliminary motion labels, the preliminary motion labels indicate that a certain part of the body of a person A acts on a certain part of the body of a person B and are used for judging whether body conflict occurs, the environment sound data comprises a plurality of words recognized from voice and are used for judging whether language conflict occurs, and the physiological data of the person comprises respiratory frequency and heartbeat frequency and are used for monitoring whether physiological characteristic abnormal phenomena occur.
Further, since the motion pickup device picks up only the motion of the human body in the target area, an optical camera cannot be employed, and thus in the present embodiment, human motion data is acquired by the infrared imaging device;
The acquisition of the human motion data comprises the following steps:
S110, acquiring a plurality of frames of infrared images through an infrared imaging device, performing heat source segmentation and time sequence analysis on the plurality of frames of infrared images, and identifying the trunk, limbs and head contours of a target person.
In step S110, the infrared imaging devices are arranged at a plurality of locations on the campus, and each infrared imaging device acquires infrared images of the environment in its field of view at a set time interval to obtain multi-frame infrared images. In this embodiment, the infrared imaging device is integrated with a person recognition function and starts acquiring infrared images at the set time interval only when it detects that a person has appeared in the area; starting under this set condition reduces the power consumption of the infrared imaging device. When more than one target person exists in the infrared image, heat source segmentation is carried out so that the heat source in the infrared image is divided into a plurality of independent target persons. Time sequence analysis of the multi-frame infrared images then characterizes how the action state of each target person changes over time, so that the actions of the target persons can be identified.
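As a rough illustration of the heat source segmentation in step S110, the sketch below thresholds one infrared frame and groups warm pixels into connected regions, one per candidate person. The 30 °C threshold, the use of 4-connectivity and the name `segment_heat_sources` are assumptions made for illustration and are not taken from the embodiment.

```python
from collections import deque


def segment_heat_sources(frame, threshold=30.0):
    """Group pixels warmer than `threshold` into 4-connected regions.

    `frame` is a list of rows of temperatures (degrees Celsius).
    Returns a list of regions, each a set of (row, col) pixels.
    The threshold value is an illustrative assumption.
    """
    rows, cols = len(frame), len(frame[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or frame[r][c] <= threshold:
                continue
            # Breadth-first flood fill starting from this warm pixel.
            region, queue = set(), deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and frame[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


# Two warm blobs separated by cooler background pixels.
frame = [
    [20, 20, 20, 20, 20],
    [20, 34, 35, 20, 33],
    [20, 34, 20, 20, 34],
    [20, 20, 20, 20, 20],
]
regions = segment_heat_sources(frame)
```

Each returned region could then be tracked across frames for the time sequence analysis; body-part contour identification is outside the scope of this sketch.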
And S120, judging the contact event types among target persons according to the area change rate and the temperature gradient distribution of the heat source contact area between adjacent frames in the infrared image, wherein the contact event types comprise dynamic contact events and instant contact events.
Step S120 is used to identify the actions of the target persons in the infrared image. After the multi-frame infrared images are obtained, whether two target persons in the infrared images are in contact can be judged by comparing the area change rate of the contact area between different heat source contours in two adjacent frames of infrared images, and the contact event types among the target persons can be classified by combining the temperature gradient distribution and the temperature change rate within the contact area in the two adjacent frames. Classifying the contact event types facilitates targeted subsequent processing for each contact event type.
Further, step S120 specifically includes:
S121, calculating the area change rate of the heat source contact area between continuous frames in the infrared image, and judging that an effective contact event has occurred if the area change rate of the heat source contact area between the continuous frames exceeds a first set value;
In step S121, after the multi-frame infrared images are obtained, the area change rate of the contact area between continuous frames is calculated. When the area change rate does not exceed the first set value, step S122 is not performed; for example, with a first set value of 10%, an area change rate of 8% over five continuous frames of infrared images suggests that the two target persons are merely walking hand in hand or with an arm over the shoulder. When the area change rate exceeds the first set value, an effective contact event is determined and step S122 is performed; for example, if the contact area between a palm and a face increases by 18% over five continuous frames of infrared images, one target person may be slapping the other.
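The effective-contact test of step S121 can be sketched as follows. The text does not fix how the rate is computed over a window, so this sketch assumes it is the relative change from the first to the last frame of the window; the function names are hypothetical, and the 10% default matches the example first set value.

```python
def area_change_rate(areas):
    """Relative change of contact area from the first to the last frame.

    `areas` holds the contact area measured in each consecutive frame.
    Measuring first-to-last is an assumption of this sketch.
    """
    first, last = areas[0], areas[-1]
    if first <= 0:
        raise ValueError("initial contact area must be positive")
    return (last - first) / first


def is_effective_contact(areas, first_set_value=0.10):
    """True when the area change rate exceeds the first set value."""
    return area_change_rate(areas) > first_set_value


# Five consecutive frames: an 8% change is ignored, an 18% change is kept.
walking = [10.0, 10.2, 10.4, 10.6, 10.8]
striking = [10.0, 10.5, 11.0, 11.4, 11.8]
```

With these inputs, only the second window would proceed to step S122.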
S122, extracting temperature gradient distribution of a contact area to construct a three-dimensional thermodynamic characteristic matrix, and dividing the contact event types by combining the contact duration.
In step S122, the contact area in the infrared image is obtained from the result of the heat source segmentation, and its boundary is smoothed by the morphological operations of OpenCV to eliminate noise. The contact area is then divided into 1 mm × 1 mm grids, each of which records a temperature value $T(x, y)$, to obtain a sampling matrix of the contact area; for example, a contact area of 10 cm × 10 cm generates a 100 × 100 sampling matrix. The transverse and longitudinal temperature gradients are then calculated by the central difference method:
$$\frac{\partial T}{\partial x} \approx \frac{T(x+\Delta x,\, y) - T(x-\Delta x,\, y)}{2\,\Delta x}, \qquad \frac{\partial T}{\partial y} \approx \frac{T(x,\, y+\Delta y) - T(x,\, y-\Delta y)}{2\,\Delta y}$$
where $\Delta x$ and $\Delta y$ are the spatial resolution, 1 mm by default;
A three-dimensional thermodynamic characteristic matrix is then constructed. The first dimension is defined as the space coordinate and records the position $(x, y)$ of each grid point; the second dimension is defined as the temperature parameters, including the contact center point temperature $T_c$, the edge temperature decay coefficient $k$ and the temperature change rate $\partial T/\partial t$; the third dimension is defined as the time sequence, with the time stamp $t$ recording continuous frame data. Each single-frame matrix holds, for every grid point $(x, y)$, the parameters $(T_c, k, \partial T/\partial t)$ at time stamp $t$, and multiple frames are stacked to form the three-dimensional structure. The coefficient $k$ is obtained by exponential fitting of $T(r) = T_c\, e^{-kr}$, where $r$ is the distance from the contact center, and the temperature change rate $\partial T/\partial t$ is calculated from the temperature difference between adjacent frames;
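A minimal sketch of the two numerical steps in this part of step S122: central-difference gradients on the sampling grid, and estimation of the edge temperature decay coefficient by exponential fitting. It assumes the decay model $T(r) = T_c e^{-kr}$ with temperatures expressed as the excess over the background, so that the logarithm is defined, and it fits the model by least squares on the logarithm. The function names are hypothetical.

```python
import math


def central_gradients(T, dx=1.0, dy=1.0):
    """Central-difference gradients at interior grid points.

    T[i][j] is the temperature at row i (y direction) and column j
    (x direction); dx and dy are the grid spacing in mm.
    Border points are left at 0.0 in this sketch.
    """
    rows, cols = len(T), len(T[0])
    gx = [[0.0] * cols for _ in range(rows)]
    gy = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            gx[i][j] = (T[i][j + 1] - T[i][j - 1]) / (2 * dx)
            gy[i][j] = (T[i + 1][j] - T[i - 1][j]) / (2 * dy)
    return gx, gy


def fit_decay_coefficient(radii, temps):
    """Fit ln T = ln T_c - k * r by least squares; returns (T_c, k)."""
    n = len(radii)
    logs = [math.log(t) for t in temps]
    mean_r = sum(radii) / n
    mean_l = sum(logs) / n
    sxy = sum((r - mean_r) * (l - mean_l) for r, l in zip(radii, logs))
    sxx = sum((r - mean_r) ** 2 for r in radii)
    slope = sxy / sxx
    return math.exp(mean_l - slope * mean_r), -slope


# A linear temperature field has constant gradients 2 (x) and 1 (y).
grid = [[30.0 + 2.0 * j + 1.0 * i for j in range(4)] for i in range(4)]
gx, gy = central_gradients(grid)

# Synthetic radial profile generated with T_c = 8 and k = 0.5.
radii = [0.0, 1.0, 2.0, 3.0, 4.0]
temps = [8.0 * math.exp(-0.5 * r) for r in radii]
T_c, k = fit_decay_coefficient(radii, temps)
```

Fitting on the logarithm keeps the sketch dependency-free; a nonlinear least-squares fit would weight the samples differently and may be preferable for noisy data.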
The contact duration is then determined. The start point is detected first: when the contact area exceeds the second set value for a first set number of consecutive frames, the contact start time is marked. The end point is then detected: when the contact area is below the second set value for a second set number of consecutive frames, the contact end time is marked. In this embodiment, the contact start time $t_{start}$ is marked when the contact area exceeds 10 cm² for 3 consecutive frames (time window 0.1 s), the contact end time $t_{end}$ is marked when the contact area is lower than 10 cm² for 5 consecutive frames (0.17 s), and the duration is calculated as $\Delta t = t_{end} - t_{start}$;
The contact event type is then judged according to the duration $\Delta t$ and the temperature change rate $\partial T/\partial t$. When the duration (in seconds) and the temperature change rate satisfy the instantaneous-contact condition, the event is determined to be an instantaneous contact event, that is, a rapid violent behavior such as slapping, for which the edge temperature decay coefficient $k$ is higher. When they satisfy the dynamic-contact condition, the event is determined to be a dynamic contact event, that is, a continuous behavior such as dragging or pressing, for which the edge temperature decay coefficient $k$ is lower.
Judging the contact event type from the contact duration and the temperature change rate of the contact area reduces the probability of misjudging rapid non-violent actions such as hand clapping. The edge temperature decay coefficient is also indicative of the contact event type, so in other embodiments its influence on the judgment of the contact event type may also be considered.
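The duration detection and type judgment described above can be sketched as follows. The 10 cm² area threshold and the 3-frame and 5-frame windows come from the embodiment; the 30 frames-per-second rate is inferred from "3 frames in 0.1 s". Marking each time at the first frame of its run, the 0.5 s duration threshold, the 2.0 rate threshold and the direction of the rate comparison are all assumptions of this sketch, and the names are hypothetical.

```python
def contact_interval(areas, fps=30.0, area_threshold=10.0,
                     start_frames=3, end_frames=5):
    """Return (t_start, t_end) in seconds, or None if contact is incomplete.

    Start: first frame of the first run of `start_frames` consecutive
    frames with area above the threshold. End: first frame of the next
    run of `end_frames` consecutive frames with area below the threshold.
    """
    start = end = None
    run = 0
    for i, area in enumerate(areas):
        if start is None:
            run = run + 1 if area > area_threshold else 0
            if run == start_frames:
                start = i - start_frames + 1
                run = 0
        else:
            run = run + 1 if area < area_threshold else 0
            if run == end_frames:
                end = i - end_frames + 1
                break
    if start is None or end is None:
        return None
    return start / fps, end / fps


def classify_contact(duration_s, temp_rate,
                     max_instant_s=0.5, min_instant_rate=2.0):
    """Classify a contact event; both thresholds are hypothetical."""
    if duration_s < max_instant_s and temp_rate > min_instant_rate:
        return "instantaneous"
    if duration_s >= max_instant_s and temp_rate <= min_instant_rate:
        return "dynamic"
    return "undetermined"


# Contact area in cm^2 per frame: 2 idle frames, 6 in contact, 6 idle.
areas = [0.0] * 2 + [12.0] * 6 + [0.0] * 6
t_start, t_end = contact_interval(areas)
duration = t_end - t_start
```

Here the contact spans frames 2 to 8, giving a duration of 0.2 s, which this sketch would treat as instantaneous if the temperature change rate is also high.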
S130, determining a pre-trained action classification model according to the contact event type, and inputting the area change rate and the temperature gradient distribution into the action classification model to output a preliminary action label so as to generate human action data.
In step S130, the action classification models are trained in advance. In this embodiment there are two action classification models, used respectively to identify the specific actions of dynamic contact events and of instantaneous contact events. Using a separate model for each contact event type reduces the parameter count of a single model, reduces the inference latency that a high-complexity model would incur on an edge device, and ensures real-time action recognition. Specifically, while the contact event type is being judged, the body parts in contact are determined from the contour positions of the target persons in the contact area, giving a contact part combination. After the three-dimensional thermodynamic characteristic matrix is obtained according to steps S121-S122, it is input into the corresponding action classification model together with the contact part combination; the action classification model outputs a preliminary action label, and the human action data is generated from the preliminary action label.
To facilitate understanding of steps S110-S130, the preliminary action labels are exemplified here. In this embodiment, the preliminary action labels comprise four types of violent contact labels and two types of normal contact labels: the violent contact labels are slap, push, drag and press, and the normal contact labels are both clapping-type contacts. The purpose of judging the contact event type is to select the corresponding action classification model, and judging the contact event type amounts to evaluating the contact duration and the temperature change rate in the contact area, so steps S110-S130 can be summarized by the following table 1:
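The selection of a classification model by contact event type might be organized as a simple registry, sketched below. The two classifier functions are rule-based stand-ins for the pre-trained models and their rules are invented for illustration; the feature keys and all names are hypothetical.

```python
def classify_instantaneous(features):
    """Stand-in for the trained instantaneous-contact model."""
    # Illustrative rule only: palm-to-face contact is labeled a slap.
    return "slap" if features["parts"] == ("palm", "face") else "clap"


def classify_dynamic(features):
    """Stand-in for the trained dynamic-contact model."""
    # Illustrative rule only: a fast-growing contact area suggests dragging.
    return "drag" if features["area_change_rate"] > 0.15 else "press"


# One model per contact event type keeps each model small.
MODELS = {
    "instantaneous": classify_instantaneous,
    "dynamic": classify_dynamic,
}


def preliminary_label(contact_type, features):
    """Dispatch the features to the model for this contact event type."""
    try:
        model = MODELS[contact_type]
    except KeyError:
        raise ValueError(f"unknown contact event type: {contact_type!r}")
    return model(features)


label = preliminary_label(
    "instantaneous",
    {"parts": ("palm", "face"), "area_change_rate": 0.18},
)
```

In a real deployment the dictionary values would be loaded model objects; the registry shape stays the same.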
TABLE 1
And S200, independently analyzing the multi-mode data to respectively generate a first early warning signal, a second early warning signal and a third early warning signal.
In step S200, each type of data is independently analyzed according to the multi-modal data acquired in step S100, a first early warning signal is generated for the human body motion data when the human body motion data is determined to have a physical conflict, a second early warning signal is generated for the environmental sound data when the environmental sound data is determined to have a language conflict, and a third early warning signal is generated for the physiological data of the person when the physiological data of the person is determined to have a physiological characteristic abnormal phenomenon. Independent analysis based on different types of data can simplify the processing process and reduce the resource occupation in the calculation process.
Further, the step S200 specifically includes:
S201, calling a first preset database to match dangerous action labels according to the human action data, and outputting a first early warning signal when the matching is judged to be successful.
In step S201, a plurality of dangerous action labels are pre-stored in the first preset database. After the first preset database is called with the human action data, the preliminary action labels in the human action data are matched against the dangerous action labels. Note that the human action data may contain more than one type of preliminary action label; when at least one type of preliminary action label matches a dangerous action label, the first early warning signal is output. The signal intensity of the first early warning signal depends on the number of preliminary action labels that match dangerous action labels: the larger the number of matches, the stronger the signal intensity.
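The matching in step S201 reduces to a set intersection, as in this sketch. The label set stands in for the first preset database, and taking the signal strength to be the raw match count is an assumption; the names are hypothetical.

```python
# Stand-in for the first preset database of dangerous action labels.
DANGEROUS_ACTION_LABELS = frozenset({"slap", "push", "drag", "press"})


def first_warning(preliminary_labels, dangerous=DANGEROUS_ACTION_LABELS):
    """Return the first early warning signal, or None if nothing matches.

    The strength is the number of distinct matched label types.
    """
    matched = sorted(set(preliminary_labels) & dangerous)
    if not matched:
        return None
    return {"signal": "first", "matched": matched, "strength": len(matched)}


warning = first_warning(["slap", "push", "clap", "slap"])
```

Duplicates are collapsed before counting, so repeated detections of the same label type do not inflate the strength in this sketch.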
S202, calling a second preset database to match dangerous vocabulary labels according to the environmental sound data, and outputting a second early warning signal when the matching is judged to be successful.
In step S202, a plurality of dangerous vocabulary labels are pre-stored in the second preset database. After the second preset database is called with the environmental sound data, the words in the environmental sound data are matched against the dangerous vocabulary labels. The environmental sound data generally contains more than one word; when at least one word matches a dangerous vocabulary label, the second early warning signal is output. The signal intensity of the second early warning signal depends on the number of matched words: the larger the number of matches, the stronger the signal intensity.
S203, calling a third preset database to compare the physiological data of the person against the respiratory frequency threshold value and the heartbeat frequency threshold value, and outputting a third early warning signal when the threshold value is exceeded.
In step S203, a plurality of respiratory frequency ranges and a plurality of heartbeat frequency ranges are pre-stored in the third preset database. After the third preset database is called with the personnel physiological data, the respiratory frequency is compared with the respiratory frequency range; a respiratory frequency beyond the threshold value of that range indicates that a person in the area shows abnormal respiration. Likewise, the heartbeat frequency is compared with the heartbeat frequency range; a heartbeat frequency beyond the threshold value of that range indicates that a person in the area shows an abnormal heartbeat. When both the heartbeat and the respiration are identified as abnormal, the third early warning signal is output. The signal intensity of the third early warning signal depends on the amount by which the heartbeat frequency and the respiratory frequency exceed their corresponding thresholds: the larger the excess, the stronger the signal intensity.
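A minimal sketch of the threshold comparison in step S203. The ranges of 12 to 20 breaths per minute and 60 to 100 beats per minute are illustrative placeholders for entries of the third preset database, and defining the strength as the summed relative excess is an assumption; the names are hypothetical.

```python
def excess(value, low, high):
    """Relative distance of `value` outside [low, high]; 0.0 when inside."""
    if value > high:
        return (value - high) / high
    if value < low:
        return (low - value) / low
    return 0.0


def third_warning(resp_rate, heart_rate,
                  resp_range=(12, 20), heart_range=(60, 100)):
    """Return the third early warning signal, or None.

    Following the embodiment, a signal is produced only when both the
    respiration and the heartbeat are outside their ranges.
    """
    resp_excess = excess(resp_rate, *resp_range)
    heart_excess = excess(heart_rate, *heart_range)
    if resp_excess == 0.0 or heart_excess == 0.0:
        return None
    return {"signal": "third", "strength": resp_excess + heart_excess}


calm = third_warning(resp_rate=16, heart_rate=130)
alarmed = third_warning(resp_rate=30, heart_rate=130)
```

In the first call only the heartbeat is abnormal, so no signal is produced; in the second both are abnormal and the strength grows with the size of the excess.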
And S300, uploading the three early warning signals to a cloud analysis platform for fusion analysis, so that the cloud analysis platform judges the comprehensive alarm level based on the signal intensity, the superposition weight and the space-time dynamic threshold value of the three early warning signals, and the space-time dynamic threshold value is dynamically adjusted based on the occurrence time and the occurrence place.
In step S300, the cloud analysis platform is the core processor; it is deployed at the school or at an operator and communicates through the Internet of things or the Internet. After the three early warning signals are output, they are processed by the cloud analysis platform, which carries out fusion analysis on them along the two dimensions of the time and the place of occurrence of the event and then judges the comprehensive alarm level, which represents the severity of the bullying event.
Further, the step S300 specifically includes:
And S310, determining signal strengths of the three early warning signals, wherein the signal strengths comprise action matching degree, vocabulary matching degree and physiological abnormality index.
In step S310, after receiving the three early warning signals, the cloud analysis platform obtains their respective signal intensities and normalizes them. The higher the action matching degree, the more preliminary action labels match dangerous action labels, that is, the higher the signal intensity of the first early warning signal and the more serious the physical conflict. The higher the vocabulary matching degree, the more words in the environmental sound data match dangerous vocabulary labels, that is, the higher the intensity of the second early warning signal and the more serious the language conflict. The higher the physiological abnormality index, the further the respiratory frequency and heartbeat frequency of persons in the area deviate from normal values, that is, the higher the intensity of the third early warning signal, which can corroborate the physical conflict or the language conflict.
And S320, respectively giving weight values to the signal intensities of the three early warning signals based on the occurrence time and the occurrence place, and calculating a weighted sum to obtain risk assessment parameters.
In step S320, the signal intensities of the three early warning signals are given weight values determined by the time and place of occurrence. Specifically, in this embodiment a weight value database is established in advance based on time and place; it includes a plurality of occurrence times, a plurality of occurrence places corresponding to each occurrence time, and the weight values of the three signal intensities corresponding to each occurrence place. The weight value database is shown in the following table 2:
TABLE 2
In table 2, α, β and γ are the signal intensity weight values of the first early warning signal, the second early warning signal and the third early warning signal, respectively. Table 2 gives only some example cases; the occurrence times, occurrence places and weight values are set according to the actual situation of the school.
Referring to table 2, take the playground in the 08:00-17:00 period as an example. This period is usually class time or break time, during which students may be on the playground during class or for break-time activities with classmates. Because activity on the playground is frequent, the weight value of the human action data is set highest; because the environmental sound data is susceptible to interference from environmental noise, its weight value is set low. Take the dormitory in the 18:00-22:00 period as another example: because this is a night-time dormitory environment, the weight value of the personnel physiological data is set high, with the human action data second.
In step S320, the occurrence time and the occurrence place corresponding to the three early warning signals are matched against the weight value database to obtain a unique set of weight values, which realizes dynamic adjustment of the weight values based on the occurrence time and the occurrence place. After the signal intensity weight values of the three early warning signals are obtained, the weighted sum of the three signal intensities is calculated to obtain the risk assessment parameter. Calculating the weighted sum provides multi-dimensional cross validation of the bullying event, which improves the accuracy of judging whether a bullying event has occurred and remarkably reduces the false alarm rate.
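The weight lookup and weighted sum of step S320 can be sketched as below. The weight triples are invented placeholders that only respect the ordering described for the playground and dormitory examples; scaling the result to 0-100 and assuming the input strengths are already normalized to [0, 1] are also assumptions, and all names are hypothetical.

```python
from datetime import time

# (period, place) -> (alpha, beta, gamma); values are illustrative only.
WEIGHTS = {
    ("08:00-17:00", "playground"): (0.5, 0.2, 0.3),
    ("18:00-22:00", "dormitory"): (0.3, 0.2, 0.5),
}


def in_period(t, period):
    """True if time `t` lies in 'HH:MM-HH:MM', allowing midnight wrap."""
    start, end = (time.fromisoformat(s) for s in period.split("-"))
    if start < end:
        return start <= t < end
    return t >= start or t < end


def lookup_weights(when, place, weights=WEIGHTS):
    """Find the weight triple for the occurrence time and place."""
    for (period, p), triple in weights.items():
        if p == place and in_period(when, period):
            return triple
    raise LookupError(f"no weights for {place!r} at {when}")


def risk_parameter(strengths, when, place):
    """Weighted sum of three normalized strengths, scaled to 0-100."""
    alpha, beta, gamma = lookup_weights(when, place)
    action, vocabulary, physiology = strengths
    return 100 * (alpha * action + beta * vocabulary + gamma * physiology)


score = risk_parameter((0.8, 0.5, 0.6), time(10, 30), "playground")
```

With these placeholder weights the example evaluates to 100 × (0.5 × 0.8 + 0.2 × 0.5 + 0.3 × 0.6), that is, approximately 68.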
S330, calling a risk level database according to the occurrence time and the occurrence place to acquire the space-time dynamic threshold value, and judging the comprehensive alarm level according to the matching result of the risk assessment parameter and the space-time dynamic threshold value range.
In step S330, the risk level database includes a plurality of occurrence times, a plurality of occurrence places corresponding to each occurrence time, a plurality of spatiotemporal dynamic ranges corresponding to each occurrence place, and a comprehensive alarm level corresponding to each spatiotemporal dynamic range, where the threshold of the spatiotemporal dynamic range is a spatiotemporal dynamic threshold, and the risk level database is as shown in table 3 below:
Table 3
Table 3 gives only some example cases; the occurrence times, the space-time dynamic ranges for each place of occurrence, and the comprehensive alarm levels are set according to the actual conditions of the school. Referring to Table 3, take a classroom and a playground in the 08:00-17:00 time period as examples: in the classroom scene, where activity is limited, the risk is low when the risk assessment parameter falls within the space-time dynamic range 0-50; in the playground scene, where physical activity is expected, the risk is low when the risk assessment parameter falls within the space-time dynamic range 0-60. Taking a dormitory in the 22:00-06:00 time period as an example, because it is late at night, the risk is low only when the risk assessment parameter falls within the space-time dynamic range 0-20.
In step S330, the occurrence time, the occurrence place and the risk assessment parameter are matched against the risk level database to obtain a unique comprehensive alarm level, thereby dynamically adjusting the space-time dynamic range based on the occurrence time and the occurrence place. Dynamically adjusting the alarm threshold based on the occurrence time and the occurrence place achieves scene-adaptive monitoring, avoids the rigidity of a fixed threshold, better matches actual conditions, and makes the judgment of a bullying event more accurate.
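The matching of step S330 can be sketched as a table lookup. In this minimal example only the low-risk upper bounds (50, 60 and 20) come from the examples above; the medium and high bands, the 0-100 scale, and the names `RISK_LEVEL_DB`, `in_slot` and `alarm_level` are assumptions made for illustration.

```python
from datetime import time

# Hypothetical risk level database. Each row: start, end, place, bands, where
# bands are (upper bound, level). Only the "low" bounds follow the text.
RISK_LEVEL_DB = [
    (time(8, 0), time(17, 0), "classroom", [(50, "low"), (80, "medium"), (100, "high")]),
    (time(8, 0), time(17, 0), "playground", [(60, "low"), (85, "medium"), (100, "high")]),
    (time(22, 0), time(6, 0), "dormitory", [(20, "low"), (60, "medium"), (100, "high")]),
]


def in_slot(t, start, end):
    """True if t lies in [start, end), allowing slots that wrap past midnight."""
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def alarm_level(risk, t, place):
    """Match the risk assessment parameter to a comprehensive alarm level."""
    for start, end, row_place, bands in RISK_LEVEL_DB:
        if row_place == place and in_slot(t, start, end):
            for upper, level in bands:
                if risk <= upper:
                    return level
    raise KeyError(f"no alarm level for risk {risk} at {place}, {t}")
```

The same risk assessment parameter therefore yields different levels in different scenes: under these assumed bands a value of 55 is low risk on the playground during the day but medium risk in a classroom.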
S400, directionally distributing early warning information to a preset terminal according to the comprehensive alarm level and the accident site, and triggering emergency response operation.
In step S400, the terminal may be the mobile phone of a teacher, a security guard, a head teacher or even the police, and the corresponding terminal is preset for each combination of comprehensive alarm level and place of occurrence. For example, for a classroom in the 08:00-17:00 time period, if the comprehensive alarm level is medium risk, a minor conflict may be occurring, so the comprehensive alarm level is sent to the head teacher's mobile phone to notify the head teacher to check; if the comprehensive alarm level is high risk, a serious conflict may be occurring, so the comprehensive alarm level is sent to the security guard's mobile phone. The comprehensive alarm level and the place of occurrence are sent together, and an emergency response operation is triggered, which includes triggering a campus broadcast alarm, directly reporting to the police, and the like. Directionally distributing the early warning information to preset terminals achieves a graded emergency response and optimizes resource allocation and response efficiency.
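The directional distribution of step S400 can be sketched as a routing table keyed by place and alarm level. This is only an illustrative sketch: the terminal identifiers, the `ROUTING` table, and the `dispatch` function with its `send` callback are hypothetical, and only the two classroom routes follow the example above.

```python
# Hypothetical routing table: (place, alarm level) -> preset terminals.
ROUTING = {
    ("classroom", "medium"): ["head_teacher_phone"],
    ("classroom", "high"): ["security_phone"],
}


def dispatch(level, place, send):
    """Send the alarm level and place to every terminal preset for this case.

    `send` is a caller-supplied function taking (terminal, message); it stands
    in for whatever push or SMS channel a real deployment would use.
    """
    recipients = ROUTING.get((place, level), [])
    message = f"alarm level: {level}; place: {place}"
    for terminal in recipients:
        send(terminal, message)
    return recipients
```

Combinations without a preset terminal return an empty list here, so no notification is sent for them.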
Further, the method further comprises:
Acquiring event types and injury grades according to the intermediate data of the comprehensive alarm grades;
Generating a bullying event report containing the event type and the injury level.
Specifically, the event types include physical bullying events and verbal bullying events, and the injury levels include the level of the physical bullying event and the level of the verbal bullying event, respectively. Intermediate data is obtained in the process of determining the comprehensive alarm level and is input into a pre-trained bullying classification model, which identifies the human action data and the environmental sound data in the intermediate data and outputs the event type and the corresponding injury level. Finally, a bullying event report is generated from the event type and the corresponding injury level and stored, so that teachers can trace the bullying event afterwards.
Further, the method further comprises:
And monitoring the power supply state of the front-end sensing unit, starting a standby power supply when the power is off, and generating a power-off alarm signal.
Specifically, by monitoring the power supply state of all front-end sensing units, it can be determined whether each front-end sensing unit is working normally. When a front-end sensing unit is determined to have lost power, the standby power supply is started to power that unit, which prevents the multi-modal data from being unavailable in time and campus safety monitoring from being affected as a result.
Further, the outside of each front end sensing unit is sleeved with a protective cover, the protective cover is electrically connected with the monitoring system, and an alarm signal is automatically generated when the protective cover is detected to be damaged.
In addition, the infrared imaging device integrates an anti-occlusion function: when the infrared imaging device detects that the heat source distribution is uniform across all positions of the infrared image in its field of view, it is determined to be occluded, and an alarm signal is generated.
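One way to test for a uniform heat source distribution is to check how much the per-pixel temperatures vary across the frame. The sketch below is an assumption about how such a check could be done, not the method of the infrared imaging device itself: the function name `is_occluded`, the use of the population standard deviation, and the 0.5 degree tolerance are all illustrative.

```python
from statistics import pstdev


def is_occluded(frame, tolerance=0.5):
    """Return True when the temperature spread over the whole frame is negligible.

    `frame` is a list of rows of per-pixel temperatures; `tolerance` is an
    assumed threshold on the standard deviation, in the same unit as the pixels.
    """
    pixels = [value for row in frame for value in row]
    return pstdev(pixels) < tolerance
```

A frame covered by an object at a single temperature gives a near-zero spread, while a normal scene containing even one warm body gives a much larger spread.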
Example 2
Based on the content of embodiment 1, the present embodiment provides another campus security monitoring method based on multi-mode sensing, and the content of this embodiment identical to that of embodiment 1 is not described in detail, except that:
the method further comprises the steps of:
Based on the heat source profile, reversely tracking a motion track of a preset duration before contact, and extracting an acceleration change rate, a direction deflection angle and a relative speed;
Calculating an action rationality score through the movement intention analysis model, and correcting the confidence coefficient of the preliminary action label, wherein the confidence coefficient of the preliminary action label is used for adjusting the action intensity.
Specifically, in embodiment 1, while outputting the preliminary action labels through the action classification model, a corresponding confidence coefficient is generated for each preliminary action label, so when determining the signal strength of the first early warning signal in step S300, not only the matching number of the labels but also the confidence coefficient of each preliminary action label need to be considered, and the signal strength of the first early warning signal is determined together through the matching number of the labels and the confidence coefficient, so that the signal strength is more accurate.
In embodiment 1, actions can be recognized effectively from the current contact behavior, but in many cases the "motivation" behind an action is more informative. The present embodiment therefore improves the accuracy of action recognition by judging the motivation behind the target person's action.
Specifically, based on the time-series change of the heat source contour coordinates in the multi-frame infrared images acquired by the infrared imaging device, the motion trajectories of the two target persons are tracked backward over a preset duration before contact, where the preset duration may be 1 second, and kinematic parameters such as the acceleration change rate, the movement direction deflection angle and the relative speed within the preset duration are extracted. The acceleration change rate is the absolute change in acceleration per unit time, and the calculation formula is as follows:
$$J = \frac{\left| a_t - a_{t-1} \right|}{\Delta t}$$
Wherein, $a_t$ is the acceleration of the current frame, $a_{t-1}$ is the acceleration of the previous frame, and $\Delta t$ is the frame interval time;
The movement direction deflection angle is the included angle between movement directions, obtained from the displacement vectors of the heat source contour in adjacent frames, and the calculation formula is as follows:
$$\theta = \arccos \frac{\vec{v}_t \cdot \vec{v}_{t-1}}{\lVert \vec{v}_t \rVert \, \lVert \vec{v}_{t-1} \rVert}$$
Wherein, $\vec{v}_t$ and $\vec{v}_{t-1}$ are the velocity vectors of the current frame and the previous frame, respectively;
The relative speed is the speed of relative movement between the target persons, and the calculation formula is as follows:
$$v_{rel} = \lVert \vec{v}_A - \vec{v}_B \rVert$$
Wherein, $\vec{v}_A$ and $\vec{v}_B$ are the velocity vectors of target persons A and B, respectively;
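The three kinematic parameters can be computed from per-frame values as in the sketch below. It assumes two-dimensional velocity vectors given as `(x, y)` tuples, scalar accelerations, and the standard definitions that match the descriptions above: absolute acceleration difference over the frame interval, the angle between consecutive velocity vectors, and the norm of the velocity difference. The function names are illustrative.

```python
import math


def acceleration_change_rate(a_curr, a_prev, dt):
    """Absolute change in acceleration between two frames per unit time."""
    return abs(a_curr - a_prev) / dt


def deflection_angle_deg(v_curr, v_prev):
    """Angle in degrees between the current and previous velocity vectors."""
    norm_product = math.hypot(*v_curr) * math.hypot(*v_prev)
    if norm_product == 0:
        return 0.0  # assumed convention: no direction change if either vector is zero
    cosine = (v_curr[0] * v_prev[0] + v_curr[1] * v_prev[1]) / norm_product
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cosine))


def relative_speed(v_a, v_b):
    """Magnitude of the velocity difference between target persons A and B."""
    return math.hypot(v_a[0] - v_b[0], v_a[1] - v_b[1])
```

In practice the velocities and accelerations would themselves be finite differences of the heat source contour centroids between adjacent frames.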
After the acceleration change rate, the movement direction deflection angle and the relative speed are obtained, these kinematic parameters are input into a preset movement intention analysis model, which outputs the action rationality score in the range of 0-100 points. When training the movement intention analysis model, violent contact labels and normal contact labels in historical data are used as positive and negative samples, respectively. The scoring rules of the movement intention analysis model are exemplified as follows: when the acceleration change rate is greater than 1 m/s³, 20 points are added; when the movement direction deflection angle is greater than 45°, 15 points are added; when the relative speed is greater than 2 m/s, 25 points are added.
The action rationality score is then mapped to a correction of the confidence coefficient. For example:
When the score is greater than 50, the confidence is increased by k1 × score / 100 (k1 is a correction factor, optionally 0.3);
When the score is less than or equal to 50, the confidence is reduced by k2 × (50 − score) / 50 (k2 is a correction factor, optionally 0.2).
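The example scoring rules and the confidence mapping can be written out as the following sketch. The thresholds, point values and the factors 0.3 and 0.2 are the example values given above; the function names, the rule-based stand-in for the trained movement intention analysis model, and the clamping of the result to the range 0 to 1 are assumptions.

```python
def rationality_score(accel_change_rate, angle_deg, rel_speed):
    """Rule-based stand-in for the movement intention analysis model."""
    score = 0
    if accel_change_rate > 1.0:
        score += 20
    if angle_deg > 45.0:
        score += 15
    if rel_speed > 2.0:
        score += 25
    return score


def corrected_confidence(confidence, score, k1=0.3, k2=0.2):
    """Raise or lower the preliminary action label confidence from the score."""
    if score > 50:
        confidence += k1 * score / 100
    else:
        confidence -= k2 * (50 - score) / 50
    # Assumed: confidence is a probability-like value kept within [0, 1].
    return min(1.0, max(0.0, confidence))
```

For instance, a contact preceded by a sharp acceleration change, a large turn and a high closing speed scores 60 under these rules, which raises a confidence of 0.6 by 0.18; a contact that triggers only the acceleration rule scores 20, which lowers the same confidence by 0.12.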
After the confidence coefficient of the preliminary action label is corrected by the method provided in this embodiment, the signal strength of the first early warning signal is determined more accurately, which further improves the accuracy of detecting bullying behavior.
Example 3
Based on the content of embodiment 2, the present embodiment provides another campus security monitoring method based on multi-mode sensing, and the content of this embodiment identical to that of embodiment 2 is not described in detail, except that:
the method further comprises the steps of:
calculating the dynamic energy density of the contact area according to the three-dimensional thermodynamic characteristic matrix;
Calling a confidence correction database, matching a confidence correction value based on the dynamic energy density and the preliminary action label, and correcting the confidence coefficient of the preliminary action label with the confidence correction value.
Based on the above embodiment 2, the embodiment further corrects the confidence of the preliminary action label by analyzing the thermodynamic characteristics of the contact area, so as to distinguish the contact behaviors with different forces.
Specifically, firstly, the energy density in the contact area is calculated according to the three-dimensional thermodynamic characteristic matrix obtained in step S122, and the dynamic energy density calculation formula is as follows:
Wherein, $\Delta T(t)$ represents the temperature change of the contact area at time $t$, and $A$ represents the area of the contact region;
After the dynamic energy density is obtained, a confidence correction database is called according to the dynamic energy density and the preliminary action tags, wherein the confidence correction database comprises a plurality of preliminary action tags, a plurality of different energy density ranges corresponding to each preliminary action tag and a confidence correction value corresponding to each energy density range, and the confidence correction database is shown in the following table 4:
Table 4
A unique confidence correction value can be obtained from Table 4 according to the dynamic energy density and the preliminary action label, and the confidence coefficient is further corrected with this value. This multi-level verification improves the accuracy of recognizing the human action data, and thereby improves the accuracy and robustness of campus monitoring. It should be noted that Table 4 is only an example, and the specific preliminary action labels and energy density ranges are set according to the actual conditions of each school.
Example 4
Referring to fig. 2, the present embodiment provides a multi-mode sensing-based campus security monitoring system for executing the multi-mode sensing-based campus security monitoring method described in embodiments 1 to 3, where the system includes:
The data acquisition module is configured to acquire multi-modal data in the target area in real time through the front-end sensing unit, wherein the multi-modal data comprises human body action data, environment sound data and personnel physiological data;
the data analysis module is configured to independently analyze the multi-mode data and respectively generate a first early warning signal, a second early warning signal and a third early warning signal;
the data transmission module is configured to upload the three early warning signals to the cloud analysis platform for fusion analysis, so that the cloud analysis platform can judge the comprehensive alarm level based on the signal intensity, the superposition weight and the space-time dynamic threshold value of the three early warning signals;
The cloud response module is configured to directionally distribute early warning information to a preset terminal according to the comprehensive alarm level and the accident site and trigger emergency response operation.
The above description is only illustrative of the preferred embodiments of the present invention and of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the invention is not limited to the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of the technical features described above or their equivalents without departing from the inventive concept, for example, technical solutions formed by mutually replacing the above features with technical features having similar functions disclosed in (but not limited to) the present invention.