CN119723973A - Industrial robot assembly training system and method based on virtual simulation - Google Patents
Industrial robot assembly training system and method based on virtual simulation
- Publication number
- CN119723973A (application CN202410658729.1A)
- Authority
- CN
- China
- Prior art keywords
- virtual
- assembly
- industrial robot
- feature map
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Manipulator (AREA)
Abstract
An industrial robot assembly training system and method based on virtual simulation are disclosed. The system comprises a virtual simulation platform for providing a virtual scene and virtual equipment for practical training of industrial robot assembly, virtual reality equipment for interacting with the virtual simulation platform to realize operation and control of the virtual scene and the virtual equipment, a data processing module for receiving input signals of the virtual reality equipment, analyzing and processing the input signals to generate corresponding feedback signals and sending the feedback signals to the virtual reality equipment, and an evaluation module for evaluating the assembly operation of a user. In this way, an objective assessment of the user's assembly operation can be achieved.
Description
Technical Field
The application relates to the field of industrial robots, and in particular to an industrial robot assembly training system and method based on virtual simulation.
Background
An industrial robot is a mechanical device capable of automatically executing industrial production tasks. It has the advantages of high speed, high precision, high flexibility, and high reliability, and is widely applied in fields such as automobiles, electronics, and machinery. However, the assembly of an industrial robot is a complex technical activity that requires professional knowledge and skill, and the conventional practical training mode has problems such as high cost, low efficiency, and poor safety.
Accordingly, an industrial robot assembly training system and method based on virtual simulation is desired.
Disclosure of Invention
In view of the above, the application provides an industrial robot assembly training system and method based on virtual simulation, which can evaluate the degree of deviation between a user's assembly operation and the standard operation, and thus realize objective evaluation of the user's assembly operation.
According to an aspect of the present application, there is provided an industrial robot assembly training system based on virtual simulation, including:
The virtual simulation platform is used for providing a virtual scene and virtual equipment for the industrial robot assembly training;
the virtual reality equipment is used for interacting with the virtual simulation platform to realize the operation and control of the virtual scene and the virtual equipment;
A data processing module for receiving the input signal of the virtual reality device, analyzing and processing the input signal, generating a corresponding feedback signal, and transmitting the feedback signal to the virtual reality device, and
The evaluation module is used for evaluating the assembly operation of the user.
In the above-mentioned industrial robot assembly training system based on virtual simulation, the evaluation module includes:
The data acquisition unit is used for acquiring the assembly operation video of the user, collected by a camera, and a reference operation video;
The time sequence analysis unit is used for performing time sequence analysis on the assembly operation video and the reference operation video to obtain an assembly operation time sequence feature map and a reference operation time sequence feature map;
A semantic difference extraction unit for extracting operation semantic difference features between the assembly operation time sequence feature map and the reference operation time sequence feature map to obtain an operation difference semantic transfer characterization feature map, and
The deviation analysis unit is used for determining, based on the operation difference semantic transfer characterization feature map, whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold.
In the above-mentioned industrial robot assembly training system based on virtual simulation, the time sequence analysis unit includes:
A preprocessing subunit for preprocessing the assembly operation video and the reference operation video to obtain a sequence of assembly operation key frames and a sequence of reference operation key frames, and
The time sequence correlation feature extraction subunit is used for passing the sequence of assembly operation key frames and the sequence of reference operation key frames through an operation time sequence correlation feature extractor based on a three-dimensional convolution network model to obtain the assembly operation time sequence feature map and the reference operation time sequence feature map.
In the industrial robot assembly training system based on virtual simulation, the preprocessing subunit is configured to:
Sparse sampling is performed on the assembly operation video and the reference operation video to obtain a sequence of the assembly operation key frames and a sequence of the reference operation key frames.
In the above-mentioned industrial robot assembly training system based on virtual simulation, the semantic difference extraction unit is configured to:
Calculating a transfer matrix between each pair of feature matrices of corresponding channel dimensions of the assembly operation time sequence feature map and the reference operation time sequence feature map to obtain the operation difference semantic transfer characterization feature map.
In the above-mentioned industrial robot assembly training system based on virtual simulation, the deviation analysis unit includes:
A local operation saliency subunit for passing the operation difference semantic transfer characterization feature map through a local operation saliency device based on an adaptive attention layer to obtain a salient operation difference semantic transfer characterization feature map, and
The classification subunit is used for passing the salient operation difference semantic transfer characterization feature map through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold.
In the above-mentioned industrial robot assembly training system based on virtual simulation, the local operation saliency subunit is configured to:
Processing the operation difference semantic transfer characterization feature map with an adaptive attention formula to obtain the salient operation difference semantic transfer characterization feature map, wherein the adaptive attention formula is as follows:
v_c = pool(F)
A = σ(W_a · v_c + b_a)
F′ = A′ ⊙ F
wherein F is the operation difference semantic transfer characterization feature map; pool denotes the pooling operation and v_c is the pooled feature vector; W_a is a weight matrix and b_a is a bias vector; σ is the activation function; A is the initial meta-weight feature vector, with a_i the feature value at the i-th position of A; A′ is the correction meta-weight feature vector derived from A; F′ is the salient operation difference semantic transfer characterization feature map; and ⊙ denotes multiplying each feature matrix of the operation difference semantic transfer characterization feature map along the channel dimension by the corresponding feature value of the correction meta-weight feature vector as a weight.
In the above-mentioned industrial robot assembly training system based on virtual simulation, the classifying subunit includes:
A feature map optimization secondary subunit for optimizing the salient operation difference semantic transfer characterization feature map to obtain an optimized salient operation difference semantic transfer characterization feature map, and
The deviation degree classification secondary subunit is used for passing the optimized salient operation difference semantic transfer characterization feature map through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold.
In the above-mentioned industrial robot assembly training system based on virtual simulation, the deviation classification secondary subunit is configured to:
Expanding the optimized salient operation difference semantic transfer characterization feature map into a classification feature vector by row vectors or column vectors;
Performing full-connection encoding on the classification feature vector using a fully connected layer of the classifier to obtain an encoded classification feature vector, and
Inputting the encoded classification feature vector into the Softmax classification function of the classifier to obtain the classification result.
According to another aspect of the present application, there is provided an industrial robot assembly training method based on virtual simulation, including:
Providing a virtual scene and virtual equipment of the industrial robot assembly training through a virtual simulation platform;
The virtual reality equipment interacts with the virtual simulation platform to realize the operation and control of the virtual scene and the virtual equipment;
Receiving an input signal of the virtual reality device through a data processing module, analyzing and processing the input signal, generating a corresponding feedback signal, and transmitting the feedback signal to the virtual reality device, and
Evaluating the assembly operation of the user through an evaluation module.
The system comprises a virtual simulation platform for providing a virtual scene and virtual equipment for industrial robot assembly training, virtual reality equipment for interacting with the virtual simulation platform and realizing operation and control of the virtual scene and the virtual equipment, a data processing module for receiving input signals of the virtual reality equipment, analyzing and processing the input signals to generate corresponding feedback signals and sending the feedback signals to the virtual reality equipment, and an evaluation module for evaluating assembly operation of a user. In this way, an objective assessment of the user's assembly operation can be achieved.
Other features and aspects of the present application will become apparent from the following detailed description of the application with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features and aspects of the application and together with the description, serve to explain the principles of the application.
FIG. 1 shows a block diagram of an industrial robot assembly training system based on virtual simulation in accordance with an embodiment of the present application.
FIG. 2 shows a block diagram of the evaluation module in a virtual simulation based industrial robot assembly training system, according to an embodiment of the application.
Fig. 3 shows a flow chart of an industrial robot assembly training method based on virtual simulation according to an embodiment of the present application.
Fig. 4 shows a schematic diagram of an architecture of a sub-step S140 in the industrial robot assembly training method based on virtual simulation according to an embodiment of the present application.
FIG. 5 illustrates an application scenario diagram of a virtual simulation based industrial robot assembly training system, according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort also fall within the scope of the application.
As used in the specification and in the claims, the terms "a," "an," and/or "the" are not specific to the singular and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may also include other steps or elements.
Various exemplary embodiments, features and aspects of the application will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
In addition, numerous specific details are set forth in the following description in order to provide a better illustration of the application. It will be understood by those skilled in the art that the present application may be practiced without some of these specific details. In some instances, well known methods, procedures, components, and circuits have not been described in detail so as not to obscure the present application.
The application provides an industrial robot assembly training system based on virtual simulation, and fig. 1 is a block diagram of the industrial robot assembly training system based on virtual simulation according to an embodiment of the application. As shown in fig. 1, an industrial robot assembly training system 100 based on virtual simulation according to an embodiment of the present application includes a virtual simulation platform 110 for providing a virtual scene and virtual equipment for the industrial robot assembly training, virtual reality equipment 120 for interacting with the virtual simulation platform to implement operation and control of the virtual scene and the virtual equipment, a data processing module 130 for receiving an input signal of the virtual reality equipment, analyzing and processing the input signal, generating a corresponding feedback signal, and transmitting the feedback signal to the virtual reality equipment, and an evaluation module 140 for evaluating the assembly operation of a user.
Wherein, the skill level of the user can be measured by evaluating the assembly operation of the user, and the existing problems are found, so that the user can be helped to improve the operation efficiency and accuracy. In addition, damage to equipment or safety accidents due to erroneous operations can be prevented by evaluating the user's assembly operation. In the prior art, the evaluation of the assembly operations of a user is generally performed manually according to the degree to which the user has completed a task. For example, according to the set task requirements and time limits, it is assessed whether the user has completed the assembly task as required and a corresponding score or feedback is given. However, the evaluation result obtained in this way may be affected by subjective factors of the individual, and factors such as mood swings and fatigue may cause deviation in the evaluation result. Thus, an optimized solution is desired.
Aiming at the above technical problems, the technical concept of the application is to perform joint analysis of the user's assembly operation video and the reference operation video using a deep learning algorithm, so as to learn and characterize the difference between the two videos, that is, the difference between the user's assembly operation and the standard operation, thereby evaluating the degree of deviation between the user's assembly operation and the standard operation and realizing objective evaluation of the user's assembly operation.
Based on this, as shown in fig. 2, the evaluation module 140 includes a data acquisition unit 141 for acquiring the assembly operation video of the user, collected by a camera, and a reference operation video; a timing analysis unit 142 for performing timing analysis on the assembly operation video and the reference operation video to obtain an assembly operation timing feature map and a reference operation timing feature map; a semantic difference extraction unit 143 for extracting operation semantic difference features between the assembly operation timing feature map and the reference operation timing feature map to obtain an operation difference semantic transfer characterization feature map; and a deviation degree analysis unit 144 for determining, based on the operation difference semantic transfer characterization feature map, whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold.
It should be appreciated that the evaluation module 140 is the part of the industrial robot assembly training system that evaluates the user's assembly operation. In the data acquisition unit 141, the processes of the user's actual operation and of the reference operation are recorded by acquiring video. The timing analysis unit 142 performs timing analysis on the assembly operation video and the reference operation video to obtain the assembly operation timing feature map and the reference operation timing feature map; these timing feature maps reflect the temporal order and timing characteristics of the operations and can be used to compare the differences between the user's operation and the reference operation. The semantic difference extraction unit 143 is configured to extract the operation semantic difference features between the assembly operation timing feature map and the reference operation timing feature map; by extracting these semantic difference features, the difference between the user's operation and the standard operation can be captured. The deviation degree analysis unit 144 determines, based on the operation difference semantic transfer characterization feature map, whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold; by analyzing the deviation, it can be judged whether the user's operation meets the standard requirements. By integrating the functions of these units, the evaluation module can comprehensively evaluate the user's assembly operation, including timing analysis, semantic difference extraction, and deviation analysis, helping the user improve operation skill and assembly quality.
The encoding process of the evaluation module first acquires the assembly operation video of the user, collected by a camera, and a reference operation video. Here, the user's assembly operation video records the actual operation of the user during assembly. That is, by collecting the user's assembly operation video through the camera, information such as the user's actions, postures, and operation sequence during actual operation can be obtained. This video can serve as an important data source for assessing the user's operation skill level and for finding problems. The reference operation video refers to a previously recorded video of a normative or standard assembly operation process. It shows the correct assembly steps, operation sequence, posture requirements, and so on, and serves as the comparison and reference for evaluating the difference and degree of deviation between the user's assembly operation and the standard operation. That is, by comparing the user's assembly operation video with the reference operation video, errors, inaccurate actions, or operations that violate the specification can be identified in the user's assembly operation, and corresponding feedback can be provided.
It should be appreciated that the assembly operation video and the reference operation video are typically continuous video streams containing a large number of frame images. In actual assembly operations, however, many times critical actions and posture changes occur in certain specific key frames, and other frames may contain repeated or unimportant information. Therefore, in the technical scheme of the application, the assembly operation video and the reference operation video are subjected to sparse sampling to select a part of key frames to represent the whole video data, so that a sequence of the assembly operation key frames and a sequence of the reference operation key frames are obtained.
Next, passing the sequence of assembly operation key frames and the sequence of reference operation key frames through an operation timing correlation feature extractor based on a three-dimensional convolutional network model to obtain an assembly operation timing feature map and a reference operation timing feature map. It will be appreciated by those of ordinary skill in the art that three-dimensional convolutional network models are widely used in the field of video processing, which can effectively capture timing information in a video sequence. By inputting the assembly operation key frame sequence and the reference operation key frame sequence into the operation time sequence associated feature extractor based on the three-dimensional convolution network model, time sequence feature extraction can be carried out on the assembly operation key frame sequence and the reference operation key frame sequence, so that key features such as action sequences, gesture changes, motion tracks and the like in the operation sequence are learned. These features may reflect dynamic changes and laws of the assembly operations, helping to more fully understand and characterize the user's assembly operations and reference operations.
Accordingly, the timing analysis unit 142 includes a preprocessing subunit for preprocessing the assembly operation video and the reference operation video to obtain a sequence of assembly operation key frames and a sequence of reference operation key frames, and a timing correlation feature extraction subunit for passing the sequence of assembly operation key frames and the sequence of reference operation key frames through an operation timing correlation feature extractor based on a three-dimensional convolutional network model to obtain the assembly operation timing feature map and the reference operation timing feature map.
It is worth mentioning that a three-dimensional convolutional network (3D Convolutional Neural Network) is a deep learning model for processing data that has a time dimension. It extends the two-dimensional convolutional network (2D Convolutional Neural Network) and can effectively process data with time-series characteristics, such as video and time series. In the timing analysis unit 142, the timing correlation feature extraction subunit extracts the assembly operation timing feature map and the reference operation timing feature map using an operation timing correlation feature extractor based on the three-dimensional convolutional network model. Specifically, a three-dimensional convolutional network captures the characteristics and patterns of data over time by performing convolution operations in the time dimension. A three-dimensional convolutional network model is typically composed of multiple convolutional layers, pooling layers, and fully connected layers: it performs feature extraction and representation learning on the input data through convolution operations in three dimensions (width, height, and time). The convolutional layers identify local timing patterns, the pooling layers reduce the size of the feature maps and the number of parameters, and the fully connected layers map the extracted features to specific categories or feature spaces. Three-dimensional convolutional networks are widely used in tasks such as video analysis, action recognition, and behavior recognition: they can extract temporal features from a video sequence and capture the evolution and timing patterns of motion, playing an important role in the analysis and understanding of time-series data. In the industrial robot assembly training system, the three-dimensional convolutional network model can effectively extract the timing features of the assembly operation and the reference operation for subsequent analysis and evaluation.
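Purely as an illustrative sketch, such an operation timing correlation feature extractor might be written in PyTorch as below. The layer count, channel widths, and pooling sizes are assumptions for the example; the application does not disclose the exact architecture.

```python
import torch
import torch.nn as nn

class TimingFeatureExtractor(nn.Module):
    """Illustrative 3D-CNN over a key-frame sequence.

    Input:  a clip tensor of shape (batch, channels=3, time, height, width).
    Output: a timing feature map of shape (batch, 128, T', H', W').
    """
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1),   # convolve jointly over time, H, W
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),          # pool spatially, keep the time axis
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2),                  # pool over time and space
            nn.Conv3d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        return self.features(clip)

# The same shared-weight extractor would be applied to both key-frame sequences,
# yielding the assembly operation timing feature map and the reference one.
```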
In one example, the preprocessing subunit is configured to sparsely sample the assembly operation video and the reference operation video to obtain a sequence of the assembly operation key frames and a sequence of the reference operation key frames.
Notably, sparse sampling is a method of selecting a small number of key frames from a continuous data sequence. In video processing, sparse sampling selects a small number of key frames from a video sequence to reduce the amount of data while preserving the key information. In the preprocessing subunit, sparse sampling is applied to the assembly operation video and the reference operation video to obtain the sequence of assembly operation key frames and the sequence of reference operation key frames. Typically, a video is made up of successive image frames, and the purpose of sparse sampling is to select a portion of representative key frames from these successive frames. A sparse sampling algorithm selects key frames according to a certain strategy, and the selected key frames are typically characterized by: 1) representativeness, in that the key frames effectively represent the entire video sequence, including the important information and actions in the video; 2) diversity, in that the key frames should differ from one another enough to cover the different scenes and actions in the video sequence; and 3) compressibility, in that the number of selected key frames is relatively small, reducing the amount of data and the computational complexity. Sparse sampling may employ different strategies, such as sampling based on time intervals or sampling based on motion information, depending on the needs and application scenario. Through sparse sampling, a continuous video sequence can be converted into a small sequence of key frames, facilitating subsequent processing and analysis.
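As a concrete illustration of the time-interval-based strategy, the minimal sketch below selects a fixed number of evenly spaced key frames from a decoded clip; the function name and the frame budget of 16 are assumptions for the example, not values disclosed by the application.

```python
import numpy as np

def sparse_sample_keyframes(frames: np.ndarray, num_keyframes: int = 16) -> np.ndarray:
    """Uniform time-interval sparse sampling of a decoded video.

    frames: array of shape (T, H, W, C) holding T consecutive frames.
    Returns num_keyframes evenly spaced frames of shape (num_keyframes, H, W, C).
    """
    total = frames.shape[0]
    if total <= num_keyframes:
        return frames  # short clip: keep every frame
    indices = np.linspace(0, total - 1, num_keyframes).astype(int)
    return frames[indices]
```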
Then, a transfer matrix is calculated between each pair of feature matrices of corresponding channel dimensions of the assembly operation time sequence feature map and the reference operation time sequence feature map to obtain an operation difference semantic transfer characterization feature map. That is, the difference between the user's assembly operation and the reference operation is quantified by calculating the transfer matrices. In particular, a transfer matrix reflects the transfer relationship, or transfer probability, between a pair of feature matrices of corresponding channel dimensions, and can be used to measure the correlation and the difference between the assembly operation and the reference operation.
Correspondingly, the semantic difference extraction unit 143 is configured to calculate a transfer matrix between each pair of feature matrices of corresponding channel dimensions of the assembly operation time sequence feature map and the reference operation time sequence feature map to obtain the operation difference semantic transfer characterization feature map.
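The application does not spell out how the transfer matrix is computed. Under one natural reading, for each pair of corresponding channel-dimension feature matrices F1_c and F2_c, a matrix M_c is sought with M_c · F1_c ≈ F2_c; the least-squares construction below is an assumption made for this sketch, not the disclosed method.

```python
import torch

def transfer_characterization(assembly_map: torch.Tensor,
                              reference_map: torch.Tensor) -> torch.Tensor:
    """Per-channel transfer matrices between two timing feature maps.

    Both inputs are assumed to have shape (C, H, W) with H == W, so that each
    channel holds a square feature matrix.  For each channel c we solve
    M_c @ assembly[c] ≈ reference[c] in the least-squares sense and stack the
    M_c into an operation difference characterization map of shape (C, H, H).
    """
    matrices = []
    for a, r in zip(assembly_map, reference_map):
        # M @ a = r  is equivalent to  a.T @ M.T = r.T, which lstsq solves for M.T.
        solution = torch.linalg.lstsq(a.T, r.T).solution
        matrices.append(solution.T)
    return torch.stack(matrices)
```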
Further, the operation difference semantic transfer characterization feature map is passed through a local operation saliency device based on an adaptive attention layer to obtain a salient operation difference semantic transfer characterization feature map. Here, the adaptive attention layer can dynamically adjust the weights of different positions according to the characteristics of the input data, so as to attend selectively to the input. That is, through the local operation saliency device based on the adaptive attention layer, the operation difference semantic transfer characterization feature map can be weighted so as to highlight the operation differences and important semantic information of the key channels. This increases the perceptibility of the operation differences while reducing attention to irrelevant or secondary information. Then, the salient operation difference semantic transfer characterization feature map is passed through a classifier to obtain a classification result, wherein the classification result is used for indicating whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold.
Accordingly, the deviation degree analysis unit 144 includes a local operation saliency subunit for passing the operation difference semantic transfer characterization feature map through a local operation saliency device based on an adaptive attention layer to obtain a salient operation difference semantic transfer characterization feature map, and a classification subunit for passing the salient operation difference semantic transfer characterization feature map through a classifier to obtain a classification result, where the classification result indicates whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold.
It should be appreciated that the primary function of the local operation saliency subunit is to process the operation difference semantic transfer characterization feature map through a local operation saliency device based on an adaptive attention layer to obtain a salient operation difference semantic transfer characterization feature map; this subunit highlights the salient differences in the assembly operation so that the user's assembly operation can be better understood and analyzed. The adaptive attention layer is a deep learning module for learning the importance weights of different positions in the input feature map. Through the adaptive attention layer, the local operation saliency subunit can automatically focus attention on the important regions according to the content of the operation difference semantic transfer characterization feature map, thereby generating the salient operation difference semantic transfer characterization feature map. The main function of the classification subunit is to process the salient operation difference semantic transfer characterization feature map through a classifier to obtain a classification result; this subunit determines whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold. A classifier is a machine learning model that classifies by learning the relationship between the salient operation difference semantic transfer characterization feature map and the different classes (for example, normal and abnormal operations). According to the information in the feature map, the classifier can judge the degree of difference between the user's assembly operation and the standard operation and give the corresponding classification result. Through the classification subunit, the assembly operation can be compared with the standard operation and it can be judged whether the degree of deviation exceeds the predetermined threshold, so that the user's operation errors or abnormalities can be found in time and the accuracy and quality of the assembly operation improved.
In one example, the local operation saliency subunit is configured to process the operation difference semantic transfer characterization feature map with an adaptive attention formula to obtain the salient operation difference semantic transfer characterization feature map, where the adaptive attention formula is:
v_c = pool(F)
A = σ(W_a · v_c + b_a)
F′ = A′ ⊙ F
wherein F is the operation difference semantic transfer characterization feature map; pool denotes the pooling operation and v_c is the pooled feature vector; W_a is a weight matrix and b_a is a bias vector; σ is the activation function; A is the initial meta-weight feature vector, with a_i the feature value at the i-th position of A; A′ is the correction meta-weight feature vector derived from A; F′ is the salient operation difference semantic transfer characterization feature map; and ⊙ denotes multiplying each feature matrix of the operation difference semantic transfer characterization feature map along the channel dimension by the corresponding feature value of the correction meta-weight feature vector as a weight.
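Read literally, the formulas above can be sketched as the PyTorch module below. The application does not state how the correction meta-weight feature vector A′ is obtained from A, so the normalization used here (dividing each a_i by the sum over all positions) is an assumption for illustration only.

```python
import torch
import torch.nn as nn

class LocalOperationSaliency(nn.Module):
    """Adaptive attention weighting: v_c = pool(F), A = sigma(W_a * v_c + b_a), F' = A' ⊙ F."""
    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)      # pool(F): one value per channel
        self.fc = nn.Linear(channels, channels)  # weight matrix W_a and bias b_a
        self.act = nn.Sigmoid()                  # the activation sigma

    def forward(self, f: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = f.shape
        v_c = self.pool(f).view(b, c)              # v_c = pool(F)
        a = self.act(self.fc(v_c))                 # A = sigma(W_a * v_c + b_a)
        a_prime = a / a.sum(dim=1, keepdim=True)   # assumed correction A' (not specified)
        return a_prime.view(b, c, 1, 1) * f        # F' = A' ⊙ F: channel-wise weighting
```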
In the technical solution of the application, the assembly operation timing feature map and the reference operation timing feature map respectively express the timing-associated image semantic features of the sequence of assembly operation key frames and of the sequence of reference operation key frames, and therefore exhibit differences in feature distribution information representation caused by the timing-associated differences in the semantic distributions of the source images.
Thus, when calculating the transfer matrices between the feature matrices of each pair of corresponding channel dimensions of the assembly operation timing feature map and the reference operation timing feature map, the differences in feature distribution information representation between the two feature maps may cause poor consistency of the channel-association distribution information representation among the feature matrices of the operation difference semantic transfer characterization feature map. This in turn causes sparse channel distribution after passing through the local operation saliency device based on the adaptive attention layer, affecting the expression effect of the salient operation difference semantic transfer characterization feature map. Based on this, in order to promote the channel distribution constraint of the salient operation difference semantic transfer characterization feature map, the salient operation difference semantic transfer characterization feature map is optimized.
The classification subunit comprises a feature map optimization secondary subunit for optimizing the salient operation difference semantic transfer characterization feature map to obtain an optimized salient operation difference semantic transfer characterization feature map, and a deviation degree classification secondary subunit for passing the optimized salient operation difference semantic transfer characterization feature map through a classifier to obtain a classification result, wherein the classification result indicates whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold.
The feature map optimization secondary subunit is configured to: expand the salient operation difference semantic transfer characterization feature map into a salient operation difference semantic transfer characterization feature vector; construct a first weight matrix, in which the value at each position is the mean of the two feature values at the corresponding positions of the salient operation difference semantic transfer characterization feature vector, and a second weight matrix, in which the value at each position is the variance of the two feature values at the corresponding positions of that feature vector; multiply the transpose of the salient operation difference semantic transfer characterization feature vector, taken as a row feature vector, by the first weight matrix to obtain a first intermediate vector; multiply the second weight matrix by the salient operation difference semantic transfer characterization feature vector to obtain a second intermediate vector; compute the point-wise sum of the first intermediate vector and the transpose of the second intermediate vector to obtain a third intermediate vector; multiply the transposed salient operation difference semantic transfer characterization feature vector by a matrix to obtain a fourth intermediate vector; compute the point-wise sum of the transpose of the third intermediate vector and the transpose of the fourth intermediate vector to obtain the optimized salient operation difference semantic transfer characterization feature vector; and restore the optimized salient operation difference semantic transfer characterization feature vector into the optimized salient operation difference semantic transfer characterization feature map.
In this way, the weight matrices perform a group-aggregated statistical evaluation of the local distribution, at feature-value granularity, of the salient operation difference semantic transfer characterization feature map, acting as query-style and response-style distribution enhancements of that feature map. On the basis of this group aggregation, a reference-free distribution query-response framework is constructed over the open domain of the feature distribution, so that the distribution response redundancy caused by locally overflowing features of the salient operation difference semantic transfer characterization feature map is avoided through response superposition, and a faithful constraint on the self-aggregated statistical correlation of the responses, from the salient operation difference semantic transfer characterization feature map to the classification target domain, is realized, thereby improving the expression effect of the salient operation difference semantic transfer characterization feature map.
Further, the deviation degree classification secondary subunit is configured to expand the optimized salient operation difference semantic transfer characterization feature map into a classification feature vector by row vectors or column vectors, perform full-connection encoding on the classification feature vector using a fully connected layer of the classifier to obtain an encoded classification feature vector, and input the encoded classification feature vector into the Softmax classification function of the classifier to obtain the classification result.
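As a hedged sketch of this classification head, assuming a single fully connected encoding layer whose hidden width of 256 is an illustrative choice rather than a disclosed value:

```python
import torch
import torch.nn as nn

class DeviationClassifier(nn.Module):
    """Flatten the optimized feature map, fully-connected encode, Softmax over two labels."""
    def __init__(self, channels: int, height: int, width: int, hidden: int = 256):
        super().__init__()
        self.encode = nn.Linear(channels * height * width, hidden)  # full-connection encoding
        self.head = nn.Linear(hidden, 2)  # two labels: exceeds / does not exceed threshold

    def forward(self, feature_map: torch.Tensor) -> torch.Tensor:
        v = torch.flatten(feature_map, start_dim=1)        # expand into a classification vector
        encoded = torch.relu(self.encode(v))               # encoded classification feature vector
        return torch.softmax(self.head(encoded), dim=1)    # probabilities (p1, p2), p1 + p2 = 1
```

Here p1 and p2 correspond to the first and second labels discussed next.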
That is, in the technical solution of the present application, the labels of the classifier are that the degree of deviation of the user's assembly operation from the standard operation exceeds the predetermined threshold (first label) and that it does not exceed the predetermined threshold (second label), and the classifier determines, through a Softmax function, to which classification label the optimized salient operation difference semantic transfer characterization feature map belongs. It should be noted that the first label p1 and the second label p2 do not carry a humanly assigned concept; in fact, during training, the computer model has no notion of "whether the deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold". There are simply two classification labels, and the probabilities of the output feature under these two labels sum to one, that is, p1 + p2 = 1. Therefore, the classification result of whether the degree of deviation exceeds the predetermined threshold is in fact converted, through the classification labels, into a classification probability distribution conforming to natural law; what is essentially used is the physical meaning of the labels' natural probability distribution, rather than the linguistic meaning of "whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold".
It should be appreciated that the role of the classifier is to learn classification rules from given, labeled training data and then to classify (or predict) unknown data. Logistic regression, SVMs, and the like are commonly used to solve binary classification problems. For multi-class classification, logistic regression or SVMs can also be used, but multiple binary classifiers must be combined to form the multi-class decision, which is error-prone and inefficient; the commonly used multi-class method is the Softmax classification function.
In summary, a virtual simulation-based industrial robot assembly training system 100 is illustrated that enables objective assessment of a user's assembly operation in accordance with an embodiment of the present application.
As described above, the virtual simulation-based industrial robot assembly training system 100 according to the embodiment of the present application may be implemented in various terminal devices, for example, a server having a virtual simulation-based industrial robot assembly training algorithm, etc. In one example, the virtual simulation based industrial robot assembly training system 100 may be integrated into the terminal device as a software module and/or hardware module. For example, the virtual simulation based industrial robot assembly training system 100 may be a software module in the operating system of the terminal device or may be an application developed for the terminal device, although the virtual simulation based industrial robot assembly training system 100 may also be one of a plurality of hardware modules of the terminal device.
Alternatively, in another example, the virtual simulation-based industrial robot assembly training system 100 and the terminal device may be separate devices, and the virtual simulation-based industrial robot assembly training system 100 may be connected to the terminal device through a wired and/or wireless network and transmit interactive information in an agreed data format.
Fig. 3 shows a flow chart of an industrial robot assembly training method based on virtual simulation according to an embodiment of the present application. As shown in fig. 3, the industrial robot assembly training method based on virtual simulation according to the embodiment of the application includes: S110, providing a virtual scene and virtual equipment for the industrial robot assembly training through a virtual simulation platform; S120, interacting with the virtual simulation platform through virtual reality equipment to realize operation and control of the virtual scene and the virtual equipment; S130, receiving an input signal of the virtual reality equipment through a data processing module, analyzing and processing the input signal, generating a corresponding feedback signal, and sending the feedback signal to the virtual reality equipment; and S140, evaluating the assembly operation of the user through an evaluation module.
In one possible implementation, as shown in fig. 4, the evaluation of the assembly operation of the user through the evaluation module includes obtaining an assembly operation video of the user acquired by a camera and a reference operation video, performing time sequence analysis on the assembly operation video and the reference operation video to obtain an assembly operation time sequence characteristic diagram and a reference operation time sequence characteristic diagram, extracting operation semantic difference characteristics between the assembly operation time sequence characteristic diagram and the reference operation time sequence characteristic diagram to obtain an operation difference semantic transfer characterization characteristic diagram, and determining whether the deviation degree of the assembly operation of the user from a standard operation exceeds a preset threshold value based on the operation difference semantic transfer characterization characteristic diagram.
Here, it will be understood by those skilled in the art that the specific operations of the respective steps in the above-described virtual simulation-based industrial robot assembly training method have been described in detail in the above description of the virtual simulation-based industrial robot assembly training system with reference to fig. 1 to 2, and thus, repetitive descriptions thereof will be omitted.
FIG. 5 illustrates an application scenario diagram of a virtual simulation based industrial robot assembly training system, according to an embodiment of the present application. As shown in fig. 5, in this application scenario, first, an assembly operation video of a user (e.g., D1 illustrated in fig. 5) collected by a camera is acquired, together with a reference operation video (e.g., D2 illustrated in fig. 5). The assembly operation video and the reference operation video are then input into a server (e.g., S illustrated in fig. 5) in which an industrial robot assembly training algorithm based on virtual simulation is deployed, where the server is capable of processing the assembly operation video and the reference operation video using the algorithm to obtain a classification result indicating whether the degree of deviation of the user's assembly operation from the standard operation exceeds a predetermined threshold.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as a memory including computer program instructions executable by a processing component of an apparatus to perform the above-described method.
The present application may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present application.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical encoding device such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. Computer-readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., optical pulses through fiber optic cables), or electrical signals transmitted through wires.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of embodiments of the application has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410658729.1A | 2024-05-24 | 2024-05-24 | Industrial robot assembly training system and method based on virtual simulation |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410658729.1A | 2024-05-24 | 2024-05-24 | Industrial robot assembly training system and method based on virtual simulation |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN119723973A | 2025-03-28 |
Family
ID=95077544
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410658729.1A | Industrial robot assembly training system and method based on virtual simulation | 2024-05-24 | 2024-05-24 |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN119723973A (en) |
Citations (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2004029298A2 (en) * | 2002-09-26 | 2004-04-08 | Applera Corporation | Mitochondrial dna autoscoring system |
| WO2016040506A1 (en) * | 2014-09-13 | 2016-03-17 | Advanced Elemental Technologies, Inc. | Methods and systems for secure and reliable identity-based computing |
| CN108646926A (en) * | 2018-08-29 | 2018-10-12 | 常州天眼星图光电科技有限公司 | Machine-building mould virtual assembles training system and Training Methodology |
| US20190164308A1 (en) * | 2017-11-27 | 2019-05-30 | Huntercraft Limited | Intelligent shooting training management system |
| US20200167712A1 (en) * | 2018-11-08 | 2020-05-28 | Apprentice FS, Inc. | Method for augmenting procedures of a locked, regulated document |
| CN111652078A (en) * | 2020-05-11 | 2020-09-11 | 浙江大学 | A computer vision-based yoga action guidance system and method |
| CN115346413A (en) * | 2022-08-19 | 2022-11-15 | 南京邮电大学 | An assembly guidance method and system based on virtual-real fusion |
| CN115840376A (en) * | 2022-12-23 | 2023-03-24 | 途为新能(苏州)科技有限公司 | A digital simulation system and method for robot simulation |
| US20230269377A1 (en) * | 2022-02-24 | 2023-08-24 | Arizona Board Of Regents On Behalf Of Arizona State University | Systems and methods for a multi-resolution visual sensing architecture for augmented reality |
| WO2023203890A1 (en) * | 2022-04-22 | 2023-10-26 | ソニーグループ株式会社 | Information-processing device, information-processing method, and program |
| CN117593934A (en) * | 2024-01-17 | 2024-02-23 | 长春职业技术学院 | Virtual simulation training system based on industrial robot |
| CN117852954A (en) * | 2024-01-04 | 2024-04-09 | 北京和气聚力教育科技有限公司 | An adaptive teaching quality evaluation system based on machine learning |
| CN117935165A (en) * | 2024-01-30 | 2024-04-26 | 湖州俊诚拉链有限公司 | Zipper head assembly shop production monitoring system based on Internet of things |
| CN118038227A (en) * | 2024-03-12 | 2024-05-14 | 哈尔滨工业大学 | Method and device for evaluating video quality without reference based on semantic information |
Similar Documents
| Publication | Title |
|---|---|
| CN109891897B (en) | Methods for analyzing media content |
| CN110929622A (en) | Video classification method, model training method, device, equipment and storage medium |
| US20170177972A1 (en) | Method for analysing media content |
| JP2023549579A (en) | Temporal Bottleneck Attention Architecture for Video Behavior Recognition |
| Wu et al. | Blind image quality assessment based on rank-order regularized regression |
| CN110598603A (en) | Face recognition model acquisition method, device, equipment and medium |
| CN113111968A (en) | Image recognition model training method and device, electronic equipment and readable storage medium |
| CN112200057A (en) | Face living body detection method and device, electronic equipment and storage medium |
| CN115761900B (en) | Internet of things cloud platform for practical training base management |
| WO2019167784A1 (en) | Position specifying device, position specifying method, and computer program |
| CN116704431A (en) | On-line monitoring system and method for water pollution |
| CN117576781A (en) | Training intensity monitoring system and method based on behavior recognition |
| CN119202826A (en) | SKU intelligent classification and label generation method based on visual pre-training model |
| CN109101984B (en) | Image identification method and device based on convolutional neural network |
| CN114170484B (en) | Picture attribute prediction method and device, electronic equipment and storage medium |
| KR102413588B1 (en) | Object recognition model recommendation method, system and computer program according to training data |
| CN117852954B (en) | An adaptive teaching quality evaluation system based on machine learning |
| CN112651412A (en) | Multi-label classification method and device based on deep learning and storage medium |
| CN113516182A (en) | Visual question answering model training, visual question answering method and device |
| CN116245157B (en) | Facial expression representation model training method, facial expression recognition method and device |
| CN119723973A (en) | Industrial robot assembly training system and method based on virtual simulation |
| Dimaridou et al. | Deep active robotic perception for improving face recognition under occlusions |
| JP7044164B2 (en) | Information processing equipment, information processing method, program |
| CN112990145A (en) | Group-sparse-based age estimation method and electronic equipment |
| CN111949791A (en) | Text classification method, device and equipment |
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |