Disclosure of Invention
The invention aims to provide a method, a device, equipment and a storage medium for identifying the running state of equipment.
In order to solve the problems, the technical scheme of the invention is as follows:
an equipment operation state identification method comprises the following steps:
establishing an unsupervised learning model based on a K-means clustering algorithm, and clustering historical data of the equipment state to obtain a plurality of clustering states;
acquiring service state data of equipment, and defining and classifying service states;
clustering and labeling the service state data, matching the clustering state with the service state of the equipment, determining a clustering center and a distance algorithm, and obtaining a semi-supervised learning model based on a K-means clustering algorithm and service state labeling;
based on the operation data and the operation principle of the equipment, the migration of the service state of the equipment is marked by combining a finite state automaton, the state migration rule of the equipment is limited, and an equipment operation state identification model is obtained;
and acquiring the running data of the equipment in the state to be identified, inputting the running data into the equipment running state identification model, and outputting the running state of the equipment.
According to an embodiment of the present invention, the creating an unsupervised learning model based on a K-means clustering algorithm further includes:
a. obtaining historical data of n equipment states to form a sample set $\{x_1, x_2, x_3, \ldots, x_n\}$, and randomly selecting k sample points in the sample set as the center points $\{\mu_1, \mu_2, \ldots, \mu_k\}$ of the sample clusters $\{c_1, c_2, \ldots, c_k\}$;
b. calculating the distance between every sample point and the center of each cluster, and assigning each sample point to the cluster whose center is nearest;
c. recalculating the center of each cluster as the mean of the sample points currently in that cluster, $\mu_i = \frac{1}{|c_i|} \sum_{x \in c_i} x$;
d. repeating steps b and c until the cluster centers no longer move.
According to an embodiment of the present invention, the acquiring the service state data of the device, and defining and classifying the service state further includes:
collecting service state data of different types of equipment, and defining and classifying service states;
recording working time periods in different service states;
and unifying and storing the service state data of different kinds of equipment in a data format.
According to an embodiment of the present invention, the clustering and labeling the service state data, and the matching the clustering state with the service state of the device further includes:
based on the low-density separation hypothesis, clustering and labeling the service state data;
and correspondingly matching the clustering data cluster in the theoretical state with the real service state of the equipment.
According to an embodiment of the present invention, the marking the migration of the service state of the device based on the operation data and the operation principle of the device in combination with the finite state automata, and the limiting the state migration rule of the device further includes:
dividing the service state of the equipment into a standby state, a starting state, a plurality of working states and a shutdown state;
when the equipment is switched from a shutdown state to a startup state, the equipment needs to be switched from the shutdown state to a standby state and then from the standby state to the startup state;
when the equipment is switched from one working state to another working state, the equipment needs to be switched to a standby state first and then switched to another working state from the standby state.
According to an embodiment of the present invention, the inputting the operation data into the device operation state recognition model, and outputting the operation state of the device further includes:
acquiring state data of different devices of the same type, and taking the state data as prediction data;
randomly selecting a plurality of time points to verify the equipment operation state recognition model, comparing the operation state output by the model with the prediction data, and if the accuracy of the comparison result does not reach the preset standard, merging the prediction data into the training data of the model to continue training until the prediction accuracy reaches the preset standard.
A device operation state identification apparatus, comprising:
the initial model creating module is used for creating an unsupervised learning model based on a K-means clustering algorithm, and clustering historical data of the equipment state to obtain a plurality of clustering states;
the data acquisition module is used for acquiring the service state data of the equipment and defining and classifying the service state;
the clustering and labeling module is used for clustering and labeling the service state data, matching the clustering state with the service state of the equipment, determining a clustering center and a distance algorithm, and obtaining a semi-supervised learning model based on a K-means clustering algorithm and service state labeling;
the state migration marking module is used for marking the migration of the service state of the equipment based on the operation data and the operation principle of the equipment, limiting the state migration rule of the equipment and obtaining an equipment operation state identification model;
and the state identification module is used for acquiring the running data of the equipment in the state to be identified, inputting the running data into the equipment running state identification model and outputting the running state of the equipment.
According to an embodiment of the present invention, the device operation state identification apparatus further includes:
the model verification module is used for acquiring state data of different devices of the same type and taking the state data as prediction data; and randomly selecting a plurality of time points to verify the equipment running state identification model.
A device operation state identification device, comprising:
a memory having instructions stored therein and at least one processor, the memory and the at least one processor interconnected by a line;
the at least one processor calls the instruction in the memory to enable the device operation state identification device to execute the device operation state identification method in an embodiment of the invention.
A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements a device operation state identification method in an embodiment of the present invention.
Due to the adoption of the technical scheme, compared with the prior art, the invention has the following advantages and positive effects:
Existing methods for identifying the operating state of equipment rely on manually set thresholds and therefore suffer from low reliability and a high false recognition rate. To address these problems, the equipment operation state identification method in an embodiment of the invention creates a semi-supervised learning model based on a K-means clustering algorithm and service state labeling, clusters the historical data of the equipment state to obtain a plurality of clustering states, matches the clustering states with the service states of the equipment, and determines a clustering center and a distance algorithm. Based on the on-site state switching of the equipment and the operating principle of the equipment, the method restricts the identification range during state switching and predicts the switched state within a finite probability set according to historical state switching data, thereby improving the accuracy of state identification.
Detailed Description
The following describes a method, an apparatus, a device, and a storage medium for identifying an operating state of a device according to the present invention in detail with reference to the accompanying drawings and specific embodiments. Advantages and features of the present invention will become apparent from the following description and from the claims.
Embodiment One
As shown in fig. 1, the present invention provides a method for identifying an operating state of a device, including:
S1: establishing an unsupervised learning model based on a K-means clustering algorithm, and clustering historical data of the equipment state to obtain a plurality of clustering states;
S2: acquiring service state data of equipment, and defining and classifying service states;
S3: clustering and labeling the service state data, matching the clustering state with the service state of the equipment, determining a clustering center and a distance algorithm, and obtaining a semi-supervised learning model based on a K-means clustering algorithm and service state labeling;
S4: based on the operation data and the operation principle of the equipment, the migration of the service state of the equipment is marked by combining a finite state automaton, the state migration rule of the equipment is limited, and an equipment operation state identification model is obtained;
S5: and acquiring the running data of the equipment in the state to be identified, inputting the running data into the equipment running state identification model, and outputting the running state of the equipment.
Specifically, in step S1, an unsupervised learning model based on a K-means clustering algorithm is created, and the history data of the device states are clustered to obtain a plurality of cluster states. The unsupervised learning model is a machine learning module.
Machine learning for classification problems can be divided into supervised learning, unsupervised learning and semi-supervised learning.
Supervised learning: the process of adjusting the parameters of a classifier, using a set of samples of known classes, until the required performance is achieved; it is also called supervised training or learning with a teacher. Supervised learning is a machine learning task that infers a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example consists of an input object (usually a vector) and a desired output value (also called a supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function that can be used to map new examples. In the optimal case, the algorithm correctly determines the class labels of unseen instances, which requires it to generalize from the training data to unseen situations in a "reasonable" way.
Unsupervised learning: solving various problems in pattern recognition from training samples whose classes are unknown (unlabeled) is referred to as unsupervised learning. At present, unsupervised learning in deep learning falls mainly into two types. One is the deterministic autoencoder and its improved variants, whose main goal is to recover the original data from the abstracted data with as little loss as possible. The other is the probabilistic restricted Boltzmann machine and its improved variants, whose main goal is to maximize the probability of the original data when the restricted Boltzmann machine reaches a stable state.
Clustering is a typical example of unsupervised learning. Its purpose is to group similar things together, without regard to what each group actually is. A clustering algorithm therefore usually only needs to know how to calculate similarity in order to start working.
Semi-supervised learning (SSL) is a key problem in the fields of pattern recognition and machine learning, and is a learning method that combines supervised learning and unsupervised learning. Semi-supervised learning uses a large amount of unlabeled data together with labeled data to perform pattern recognition. Because it requires as little manual work as possible while still achieving high accuracy, semi-supervised learning is receiving increasing attention.
The main algorithmic strategy of the invention is based on semi-supervised learning under the cluster assumption: when two samples lie in the same cluster, they have the same class label with high probability. An equivalent formulation is the low-density separation assumption, i.e., the classification decision boundary should pass through sparse data regions and avoid splitting the samples of a dense data region onto both sides of the boundary.
The clustering algorithm used is the K-means clustering algorithm. The K in K-means denotes the number of clusters, and "means" denotes the mean of the data objects in a cluster (the mean being a description of the cluster center), hence the name of the algorithm. The K-means algorithm is a partition-based clustering algorithm that uses distance as the measure of similarity between data objects: the smaller the distance between two data objects, the higher their similarity and the more likely they are to belong to the same cluster. The K-means algorithm typically uses the Euclidean distance to calculate the distance between data objects. The specific steps of the K-means algorithm are as follows:
a. obtaining historical data of n device states to form a sample set $\{x_1, x_2, x_3, \ldots, x_n\}$, and randomly selecting k sample points in the sample set as the center points $\{\mu_1, \mu_2, \ldots, \mu_k\}$ of the sample clusters $\{c_1, c_2, \ldots, c_k\}$;
b. calculating the distance between every sample point and the center of each cluster, and assigning each sample point to the cluster whose center is nearest;
c. recalculating the center of each cluster as the mean of the sample points currently in that cluster, $\mu_i = \frac{1}{|c_i|} \sum_{x \in c_i} x$;
d. repeating steps b and c until the cluster centers no longer move.
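Steps a to d can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation of the invention; the function names, the fixed random seed, the iteration cap and the handling of an empty cluster are assumptions introduced here.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def k_means(samples, k, seed=0, max_iter=100):
    """Cluster `samples` (a list of tuples) into k clusters, steps a-d."""
    rng = random.Random(seed)
    # Step a: pick k sample points at random as the initial centers.
    centers = rng.sample(samples, k)
    assignment = []
    for _ in range(max_iter):
        # Step b: assign every sample to its nearest center.
        assignment = [
            min(range(k), key=lambda j: euclidean(x, centers[j]))
            for x in samples
        ]
        # Step c: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [x for x, a in zip(samples, assignment) if a == j]
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(mean_point(members) if members else centers[j])
        # Step d: stop once no center moves any more.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assignment
```

Calling `k_means(samples, k)` returns the final cluster centers together with the cluster index of every sample; the centers are what the later steps compare new operating data against.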
The resulting effect of the K-means clustering algorithm is shown in FIG. 2. Specifically, the invention groups (clusters) the states of the equipment from the perspective of data distribution by learning from and training on the historical data of the equipment. According to the actual operation of the equipment, its state can generally be regarded as a shutdown state, a standby state, a startup state or a working state, where the working state can be further divided into a plurality of different working states. For example, consider a gynecological color ultrasound diagnostic device of brand H0, model HH0 and serial number HHH in the obstetrics and gynecology department of a certain hospital. From the product description of device HHH, its states are expected to fall roughly into 5 types, namely {shutdown, standby, work-function A, work-function B, work-function C}. The historical time-series data of current, voltage and power of device HHH are selected and exported, and are then processed and converted to generate secondary feature data such as "maximum current per unit time" and "power difference per unit time". K-means modeling is then performed on these data.
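As a rough illustration of how such secondary features might be derived, the sketch below splits the raw series into fixed-size windows and computes one feature pair per window. The window size, the function name and, in particular, the reading of "power difference" as last-minus-first reading in the window are assumptions; the source does not define these quantities precisely.

```python
def window_features(currents, powers, window):
    """Return one (max_current, power_difference) pair per full window.

    `currents` and `powers` are equally long lists of readings sampled
    at a fixed rate; `window` is the number of readings per unit time.
    """
    if len(currents) != len(powers):
        raise ValueError("currents and powers must have the same length")
    features = []
    for start in range(0, len(currents) - window + 1, window):
        cur = currents[start:start + window]
        pw = powers[start:start + window]
        # Assumption: "power difference" = last reading minus first
        # reading within the window.
        features.append((max(cur), pw[-1] - pw[0]))
    return features
```

Each returned pair can serve as one sample point for the clustering step; trailing readings that do not fill a whole window are ignored in this sketch.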
First, preliminary modeling is performed according to steps a to d. Other k values are then selected, for example values in the range of 2 to 8, modeling is performed again, and the average silhouette coefficient is calculated for each k value, i.e., the mean over all sample points $x_i$ of the per-sample silhouette coefficient, which compares the average distance from $x_i$ to the other sample points in the same cluster with its average distance to the sample points of the nearest other cluster. The average silhouette coefficient takes values in the range [-1, 1]; the closer the samples within a cluster and the farther apart the samples of different clusters, the larger the average silhouette coefficient and the better the clustering effect. The k value with the largest average silhouette coefficient is therefore the optimal number of clusters.
Through the above calculation, the distribution of the average silhouette coefficient is obtained, as shown in FIG. 3. According to the figure, the theoretically optimal k value is 2; however, for device HHH more than 5 states actually need to be identified, so k is set to 7 here, i.e., there are 7 clusters, corresponding to 7 clustering states.
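The selection of k by average silhouette coefficient can be sketched as follows. For each sample, a is its mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the per-sample coefficient is (b - a) / max(a, b). The function names and the convention that a single-member cluster contributes 0 are assumptions of this sketch; `labelings` is a hypothetical mapping from each candidate k to the cluster labels obtained for that k.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_silhouette(samples, labels):
    """Mean silhouette coefficient over all samples, in [-1, 1]."""
    clusters = {}
    for x, lab in zip(samples, labels):
        clusters.setdefault(lab, []).append(x)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")
    total = 0.0
    for x, lab in zip(samples, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # convention: a singleton contributes 0
        # a: mean distance to the other members of the same cluster
        a = sum(euclidean(x, y) for y in own) / (len(own) - 1)
        # b: smallest mean distance to the members of another cluster
        b = min(
            sum(euclidean(x, y) for y in pts) / len(pts)
            for other, pts in clusters.items() if other != lab
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(samples)


def pick_k(samples, labelings):
    """Return the k whose labeling has the largest average silhouette."""
    scores = {k: average_silhouette(samples, labels)
              for k, labels in labelings.items()}
    return max(scores, key=scores.get), scores
```

As the text notes, the score only gives the theoretically best k; the number of states that must actually be distinguished can justify choosing a larger k than the one `pick_k` returns.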
In step S2, the service state data of the device are collected, and the service states are defined and classified. Specifically, service state data of different kinds of equipment can be collected and the service states defined and classified; the working time periods in the different service states are recorded; and the service state data of the different kinds of equipment are stored in a unified data format.
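One possible shape for such a unified record is sketched below. The source does not specify the data format, so every field name here is hypothetical, and one JSON object per line is only one of many reasonable storage choices.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ServiceStateRecord:
    """One recorded working period of one device (hypothetical schema)."""
    device_id: str    # e.g. the serial number of the device
    device_type: str  # kind of equipment
    state: str        # one of the defined service states
    start: str        # start of the working period, ISO 8601 text
    end: str          # end of the working period, ISO 8601 text


def to_json_line(record):
    """Serialize a record to a single JSON line for storage."""
    return json.dumps(asdict(record), sort_keys=True)


def from_json_line(line):
    """Rebuild a record from a line written by to_json_line."""
    return ServiceStateRecord(**json.loads(line))
```

Because all equipment types share the same fields, records from different devices can be stored together and later joined with the operating data by time period.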
In this embodiment, one day of operating data of device HHH is collected, as shown in the following table.
The table lists part of the collected data. The relationship between the collected data and the device state is plotted in fig. 4.
In step S3, clustering and labeling are performed on the service state data, the clustering state is matched with the service state of the device, a clustering center and a distance algorithm are determined, and a semi-supervised learning model based on the K-means clustering algorithm and the service state labeling is obtained.
After the device data have been collected, they are matched with the cluster data: the data clusters obtained in the theoretical state are put into one-to-one correspondence with the real device states on site. The matching accuracy is required to reach over 90%. Data that clearly fail to match can, however, be traced. For example, as indicated by mark a in fig. 5, the collector may have been off site during a certain period, so that a certain state was missed during collection; with the clustering labels of this embodiment, the theoretical device operating state is not erroneously matched with the actually recorded "standby state". The matching accuracy can therefore reach over 95%. The curve in fig. 5 is the device data, and the bar chart below the curve shows the clustering label of each operating state. Because fig. 5 has been converted to grayscale, the labels of the operating states are not easy to distinguish; in practical application, fig. 5 is a color image in which different operating states are represented by different colors.
After clustering labeling is performed on the service state data, the unsupervised learning model based on the K-means clustering algorithm in step S1 is converted into a semi-supervised learning model.
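The matching of clustering states to service states can be illustrated with a simple majority vote: each cluster takes the service state that was recorded most often for its samples, and the matching accuracy is the share of recorded samples that agree with that mapping. This is a sketch under assumptions; the source does not state how the matching is computed, and the function names are hypothetical. Samples with no recorded state (for instance when the collector was off site) are passed as `None` and skipped.

```python
from collections import Counter


def map_clusters_to_states(cluster_ids, observed_states):
    """Map each cluster id to the service state seen most often in it.

    observed_states[i] is the recorded state for sample i, or None
    when no state was recorded at that time.
    """
    votes = {}
    for cid, state in zip(cluster_ids, observed_states):
        if state is not None:
            votes.setdefault(cid, Counter())[state] += 1
    return {cid: c.most_common(1)[0][0] for cid, c in votes.items()}


def matching_accuracy(cluster_ids, observed_states, mapping):
    """Share of recorded samples whose mapped state equals the record."""
    pairs = [(c, s) for c, s in zip(cluster_ids, observed_states)
             if s is not None]
    hits = sum(1 for c, s in pairs if mapping.get(c) == s)
    return hits / len(pairs) if pairs else 0.0
```

With more clusters than service states, several clusters may map to the same state under this scheme.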
In step S4, based on the operation data and the operation principle of the device, the finite state automaton is combined to label the service state transition of the device, and the state transition rule of the device is defined to obtain the device operation state identification model.
In terms of defining the state transition rule of the device, the method specifically includes: when the equipment is switched from a shutdown state to a startup state, the equipment needs to be switched from the shutdown state to a standby state and then from the standby state to the startup state; when the equipment is switched from one working state to another working state, the equipment needs to be switched to a standby state first and then switched to another working state from the standby state.
Specifically, in this embodiment, after the clustering labeling is completed, the state transitions of device HHH are labeled and the state switching is constrained according to the actual operating states of device HHH, as shown in fig. 6 (C5-1, C10-3V, etc. are ultrasound probes). The finite state automaton restricts the state switching paths of the device, which prevents errors such as a result in which C10-3V is identified as switching directly to the eL18-4 state, and thereby improves the matching accuracy.
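A finite state automaton of this kind can be represented as a table of allowed transitions. The table below is hypothetical: it uses the five example states of device HHH and encodes the rule that one working state can only be left via standby; whether other transitions (for example working state directly to shutdown) are legal is an assumption made here, not something the source specifies.

```python
# Hypothetical transition table: state -> states that may follow it.
# Remaining in the same state is always allowed in this sketch.
ALLOWED = {
    "shutdown": {"shutdown", "standby"},
    "standby": {"standby", "shutdown", "work-A", "work-B", "work-C"},
    "work-A": {"work-A", "standby"},
    "work-B": {"work-B", "standby"},
    "work-C": {"work-C", "standby"},
}


def is_allowed(prev_state, next_state):
    """True if the automaton permits prev_state -> next_state."""
    return next_state in ALLOWED.get(prev_state, set())


def first_violation(states):
    """Index of the first illegal transition in `states`, or None."""
    for i in range(1, len(states)):
        if not is_allowed(states[i - 1], states[i]):
            return i
    return None
```

Running `first_violation` over a sequence of identified states flags results such as a direct jump between two working states, which the automaton rules out.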
After the state transition labeling is completed, the device operation state identification model is complete. To demonstrate its accuracy, the model needs to be verified. The verification process is shown in fig. 7. Parameters are first trained in conjunction with the semi-supervised learning model to obtain model A0. New data are then collected one or more times from different devices of the same type to obtain the 2nd to nth sets of new device data, which are used as prediction data. A plurality of time points are randomly selected for verification, and the operating state output by the model is compared with the prediction data. If the prediction accuracy meets expectations, the model is put into application. If it does not, the new prediction data are merged with the previous training data to form new training data, the model is trained further to obtain A1, ..., Ax, and the above process is repeated until the prediction accuracy meets expectations.
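The verify-and-retrain loop can be sketched as below. The `train` callable stands in for whatever training procedure produces the model and is purely hypothetical, as are the function names, the default number of time points and the fixed random seed.

```python
import random


def accuracy_at_random_points(predict, samples, true_states, n_points, seed=0):
    """Accuracy of `predict` on randomly chosen time points."""
    rng = random.Random(seed)
    idx = rng.sample(range(len(samples)), min(n_points, len(samples)))
    hits = sum(1 for i in idx if predict(samples[i]) == true_states[i])
    return hits / len(idx)


def train_until_accurate(train, training_data, new_batches, target, n_points=50):
    """Retrain on merged data until the accuracy target is met.

    train(data) returns a predict function for a list of
    (sample, state) pairs; new_batches yields (samples, true_states)
    collected from other devices of the same type.
    Returns the final predict function and the number of models trained.
    """
    data = list(training_data)
    predict = train(data)  # model A0
    rounds = 1
    for samples, states in new_batches:
        acc = accuracy_at_random_points(predict, samples, states, n_points)
        if acc >= target:
            break  # accuracy meets expectations
        data.extend(zip(samples, states))  # merge the prediction data
        predict = train(data)  # model A1, ..., Ax
        rounds += 1
    return predict, rounds
```

The loop ends either when a batch is predicted accurately enough or when no further batches are available; in the latter case the caller still has to decide whether the last model is acceptable.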
After the accuracy verification of the device operation state identification model is completed, state identification can be performed on the device under test and its operating state monitored in real time: as stated in step S5, the operating data of the device under test, including current, voltage and power, are input, and the actual operating state of the device is output by the model.
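One way to combine the cluster centers with the state transition rules at identification time is sketched below: among the clusters whose state may follow the previous state, the one with the nearest center is chosen. This is an illustrative reading, not the stated implementation; the function name and all parameter names are hypothetical.

```python
import math


def identify_state(sample, prev_state, centers, cluster_state, allowed):
    """Return the state of the nearest cluster reachable from prev_state.

    centers: list of cluster center points;
    cluster_state: cluster index -> service state;
    allowed: state -> set of states that may follow it.
    """
    candidates = [
        j for j in range(len(centers))
        if cluster_state[j] in allowed[prev_state]
    ]
    if not candidates:
        raise ValueError("no cluster is reachable from " + prev_state)
    # Euclidean distance to each permitted center; keep the nearest.
    best = min(candidates, key=lambda j: math.dist(sample, centers[j]))
    return cluster_state[best]
```

Restricting the candidates before taking the nearest center is what keeps a noisy sample from being identified as a state the device cannot reach from its current one.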
Embodiment Two
The present embodiment provides a device operation state identification apparatus. As shown in fig. 8, the device operation state identification apparatus includes:
the initial model creating module 1 is used for creating an unsupervised learning model based on a K-means clustering algorithm, and clustering historical data of the equipment state to obtain a plurality of clustering states;
the data acquisition module 2 is used for acquiring the service state data of the equipment, and defining and classifying the service state;
the clustering and labeling module 3 is used for clustering and labeling the service state data, matching the clustering state with the service state of the equipment, determining a clustering center and a distance algorithm, and obtaining a semi-supervised learning model based on a K-means clustering algorithm and service state labeling;
the state transition marking module 4 is used for marking the transition of the service state of the equipment based on the operation data and the operation principle of the equipment, limiting the state transition rule of the equipment and obtaining an equipment operation state identification model;
the model verification module 5 is used for acquiring state data of different devices of the same type and taking the state data as prediction data; randomly selecting a plurality of time points to carry out accuracy verification on the equipment running state identification model;
and the state identification module 6 is used for acquiring the running data of the equipment in the state to be identified, inputting the running data into the equipment running state identification model and outputting the running state of the equipment.
The specific contents and implementation methods of the modules in the device operation state identification apparatus are as described in the first embodiment, and are not described herein again.
Embodiment Three
The second embodiment describes the device operation state identification apparatus in detail from the perspective of modular functional entities; this embodiment describes the device operation state identification device in detail from the perspective of hardware processing.
Referring to fig. 9, the device operation state identification device 500 may vary considerably with configuration or performance, and may include one or more processors (central processing units, CPUs) 510 and a memory 520, as well as one or more storage media 530 (e.g., one or more mass storage devices) storing applications 533 or data 532. The memory 520 and the storage media 530 may be transient or persistent storage. The program stored in the storage medium 530 may include one or more modules (not shown), and each module may include a series of instruction operations for the device operation state identification device 500.
Further, the processor 510 may be configured to communicate with the storage medium 530, and execute a series of instruction operations in the storage medium 530 on the device operation state identification device 500.
The device operation state identification device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input-output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Vista, and the like.
Those skilled in the art will appreciate that the configuration shown in fig. 9 does not constitute a limitation of the device operation state identification device, which may include more or fewer components than those shown, may combine some components, or may use a different arrangement of components.
The present invention also provides a computer-readable storage medium, which may be a non-volatile computer-readable storage medium, and which may also be a volatile computer-readable storage medium. The computer-readable storage medium stores instructions that, when executed on a computer, cause the computer to perform the steps of the device operation state identification method in the first embodiment.
The modules in the second embodiment, if implemented in the form of software functional modules and sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be implemented, in whole or in part, in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and devices may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments. Even if various changes are made to the present invention, it is still within the scope of the present invention if they fall within the scope of the claims of the present invention and their equivalents.