WO2018187948A1

WO2018187948A1 - Local repairing method for machine learning model

Info

Publication number: WO2018187948A1
Application number: PCT/CN2017/080172
Authority: WO
Inventors: 邹霞
Original assignee: 邹霞
Priority date: 2017-04-12
Filing date: 2017-04-12
Publication date: 2018-10-18

Abstract

A local repairing method for a machine learning model, comprising: collection and analysis of feedback data: collecting user feedback data and extracting incorrectly predicted data samples; spatial transformation: converting an original data space to a new data space by means of scale learning, reducing the distance between the incorrectly predicted data samples as much as possible in the new data space, and increasing the distance between the incorrectly predicted data samples and the correctly predicted data samples as much as possible; and learning the incorrectly predicted data samples to establish a patch model in the new data space and defining an application range of the patch model. The local repairing method for the machine learning model can improve the performance of the machine learning model.

Description

Specification

Name of Invention: Local Repair Method of Machine Learning Model

Technical Field

[0001] The present invention relates to a local repair method of a machine learning model, and belongs to the field of Internet search.

Background technique

[0002] With the rapid development of the Internet, search engines have become an important tool for making use of Internet information resources. With the rise and development of search engines such as Google, Yahoo!, Bing, and Baidu, the relevance of query results has attracted more and more attention. The quality of the ranking of query results has also become a main indicator for evaluating a search engine.

[0003] With the rapid development and wide application of information technology, the Internet has become the world's largest information resource and occupies an important position in people's lives. It has also become an important platform for sharing and exchanging information. Finding the information one needs in such a large and disorganized resource is like looking for a needle in a haystack, and the search engine addresses exactly this problem. A search engine is a tool, built on the Internet platform, that provides network information retrieval services, and search engines have become one of the most important applications of Internet technology. The user submits keywords as a query request; the search engine queries its index database according to the user query and returns the retrieval results, after sorting and relevance analysis, to the user. This helps the user filter out a large amount of irrelevant information, so that the search engine plays the role of information navigation. A massive amount of information also means a massive number of search results. In practice, most search engine users browse only the first few pages of the returned results and rarely look at lower-ranked pages. Strongly relevant search results should therefore be ranked higher, and weakly relevant results lower. Sorting the query results by relevance is thus one of the core problems of search engines, and the relevance ranking of search results has become an important indicator for evaluating search engine performance.

[0004] In the search engine ranking problem, a multidimensional feature vector is used to represent the relevant attributes and information of each data pair (user query, query result). Some data pairs are extracted from the data set, and the relevance of the query result to the user query in each pair is labeled manually. A machine learning model is trained using the labeled data as the training data set, and the resulting model is used to predict the relevance between unseen queries and query results. However, no matter how strong the theoretical foundation of a machine learning model is, errors are always found when it is applied. There are many reasons why a machine learning model may make prediction errors in application, such as noisy or extreme training data, an unstable data distribution, and defects in the machine learning model itself.

[0005] In order to improve the performance of machine learning models, it is common practice to continuously collect incorrectly predicted user feedback data as additional training data and to re-establish a new learning model. However, the original learning model already achieves good results on most of the test data. Re-establishing a new learning model because of a small amount of feedback data greatly reduces the efficiency of the search. Moreover, once a learning model has been established, modifying it is difficult.

technical problem

[0006] Researchers have proposed many solutions to the problem of repairing machine learning models. The most intuitive method is to merge the user feedback data with the original training data into a new training data set and to train a new machine learning model on it. However, this approach has two main problems:

[0007] 1. The feedback data set is much smaller than the original training data. The re-learned model is determined mainly by the original training data set, so the room for performance improvement is very limited.

[0008] 2. Every time a small amount of user feedback is obtained, a new machine learning model must be re-learned, which inevitably and greatly reduces search efficiency. This is not what users want.

Problem solution

Technical solution

[0009] In view of the above deficiencies of the prior art, it is an object of the present invention to provide a local repair method for a machine learning model.

[0010] The purpose of the present invention is to modify the original machine learning model from a local perspective, to make up for the deficiencies of the retraining model, incremental learning, and the like, and to improve the performance of the machine learning model. In order to achieve the above object, the present invention adopts the following technical solutions:

[0011] The present invention provides a local repair method of a machine learning model, comprising the following steps:

[0012] Step 1: Collecting and analyzing feedback data: collecting user feedback data and extracting the incorrectly predicted data samples;

[0013] Step 2: Spatial transformation: converting the original data space to a new data space through scale learning; in the new data space, the distance between the incorrectly predicted data samples is reduced as much as possible, and the distance between the incorrectly predicted data samples and the correctly predicted data samples is increased as much as possible;

[0014] Step 3: In the new data space, learn the wrong data sample to establish a patch model, and define the application scope of the patch model;

[0015] Step 4: In the new data space, learn the wrong data sample to establish a patch model, and define the application scope of the patch model.

[0016] Preferably, in the above step 1, the user feedback data is a series of data pairs, and the degree of relevance of each query result is evaluated by the established machine learning model.

[0017] Preferably, in the above step 2, in the new feature space the distance between the incorrectly predicted data samples is reduced as much as possible, and the distance between the incorrectly predicted data samples and the correctly predicted data samples is increased as much as possible.

[0018] Preferably, in the above step 3, after mapping the incorrectly predicted data set to the new feature space, the patch model is established by learning from the data samples.

[0019] Preferably, the process of establishing the patch model in the above step 3 is a training process of the supervised machine learning model.

[0020] Preferably, after obtaining the N patch models in the above step 4 and defining the scope of the patch model, the machine learning model is used to predict the ordering of the query results.

Advantageous effects of the invention

Beneficial effect

[0021] Compared with the prior art, the local repair method of the machine learning model provided by the present invention does not change the original learning model. Instead, according to the incorrectly predicted data fed back by users, it learns only the local subspace to which a patch applies and the patch model itself; the original learning model and the generated patch model together form a new learning model. The original machine learning model is thus modified from a local perspective, which makes up for the shortcomings of retraining, incremental learning, and similar approaches and improves the performance of the machine learning model.

Embodiments of the invention

[0022] The present invention provides a method for locally repairing a machine learning model. To make the objects, technical solutions, and effects of the present invention clearer, the invention is described in further detail in the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.

[0023] In processing massive amounts of information, machine learning models have been widely applied to many problems and have played a huge role thanks to their automation and speed. Machine learning models, especially supervised ones, achieve increasingly high prediction accuracy when supported by large amounts of training data. However, machine learning models have some drawbacks. Once built, a model is like a black box: only its input and output are visible. Even if incorrectly predicted data are found, the original machine learning model is difficult to adjust. Moreover, no matter how powerful a machine learning model is, there is no guarantee that its prediction accuracy will be 100%. The original model therefore needs to be adjusted continually according to user feedback data in order to keep improving prediction accuracy.

[0024] The machine learning model addressed by this embodiment is an ensemble of multiple decision trees. Each decision tree represents a submodel, and the weighted sum of the prediction results of all submodels is the final prediction result. Each user query has a collection of query results, and in this machine learning model a feature vector is used to represent each query result. In a decision tree, each non-leaf node evaluates some attribute of the query result and determines the path of the query result through the tree according to a set threshold. When a leaf node is reached, the classification result of the query result is obtained, represented by a score. The final score of the query result is the weighted sum of its classification results over all decision trees. The score determines how relevant the query result is to the user's query: the higher the score, the stronger the relevance; the lower the score, the weaker the relevance.
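The scoring scheme described in this paragraph can be illustrated with a minimal Python sketch. The names `Node`, `tree_score`, and `ensemble_score`, the use of `<=` at the threshold, and the example trees and weights are all assumptions made for illustration; the specification does not give the tree structure or the weights.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """One decision tree node (hypothetical layout, for illustration only)."""
    feature: int = 0                 # index of the attribute tested at a non-leaf node
    threshold: float = 0.0           # set threshold that decides the path
    left: Optional["Node"] = None    # followed when x[feature] <= threshold
    right: Optional["Node"] = None   # followed otherwise
    score: Optional[float] = None    # set only on leaf nodes


def tree_score(root: Node, x: Sequence[float]) -> float:
    """Walk one tree from the root to a leaf and return the leaf score."""
    node = root
    while node.score is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.score


def ensemble_score(trees: Sequence[Node], weights: Sequence[float],
                   x: Sequence[float]) -> float:
    """Final score: weighted sum of the per-tree scores."""
    return sum(w * tree_score(t, x) for t, w in zip(trees, weights))


# Two one-split trees with assumed weights 0.5 and 1.0.
tree_a = Node(feature=0, threshold=0.5, left=Node(score=1.0), right=Node(score=3.0))
tree_b = Node(feature=1, threshold=2.0, left=Node(score=0.0), right=Node(score=2.0))
final = ensemble_score([tree_a, tree_b], [0.5, 1.0], [0.7, 1.0])
```

Here the query result with feature vector `[0.7, 1.0]` reaches the right leaf of the first tree and the left leaf of the second, so its final score is the weighted sum of those two leaf scores.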

[0025] As shown in FIG. 1, the local repair method of the machine learning model provided by the present invention

[0026] In the local repair method provided by this embodiment, the user's feedback data is first collected and the incorrectly predicted samples in it are extracted. A patch model is then established by learning from these samples, to make up for the defects of the original model. While correcting the incorrectly predicted samples, it must be ensured that there is no negative impact on the correctly predicted data. Therefore, the local repair method must not only establish the patch model but also define the scope in which the patch model applies.

[0027] In the user feedback data, only the incorrectly predicted data is of concern, but the distribution of the incorrectly predicted data and the correctly predicted data is complicated. A patch model established to correct the errors may therefore also have a negative impact on the correctly predicted data. For this reason, the method uses scale learning to transform the original data space into a new space in which the incorrectly predicted data samples are clustered together as much as possible and kept away from the correctly predicted data samples. After this learning is completed, a patch model is created by learning from the incorrectly predicted data samples. Once the patch model has been built, the region in which it is applied must also be defined.

[0028] Specifically, the local repair method provided by this embodiment is mainly divided into the following four steps:

[0029] 1. Collecting user feedback data, and extracting data samples of predicted errors;

[0030] 2. Spatial transformation: The original data space is transformed into a new data space by scale learning. In the new data space, the distance between the incorrectly predicted data samples is reduced as much as possible, and the distance between the incorrectly predicted data samples and the correctly predicted data samples is increased as much as possible;

[0031] 3. In the new data space, learn the wrong data sample to establish a patch model, and define the application scope of the patch model;

[0032] 4. In the new data space, learn the wrong data samples to build a patch model, and define the application scope of the patch model.

[0033] First, the feedback data is analyzed. The user feedback data $D$ is composed of a series of data pairs and can be defined mathematically as $D = \{(\langle q, d_i \rangle, r_i) \mid i = 1, 2, \ldots\}$, where $q$ represents the user query, $d_i$ represents a query result for that query, and $0 \le r_i \le 5$ indicates the degree of relevance between the query result and the user query: $r_i = 5$ indicates the strongest relevance and $r_i = 0$ the weakest. If $r_i < r_j$, then the relevance between $d_i$ and $q$ is lower than the relevance between $d_j$ and $q$, and $d_j$ should be ranked ahead of $d_i$. For any single data pair $(\langle q, d_i \rangle, r_i)$, it is meaningless to ask whether it is predicted incorrectly or correctly. Assume that the machine learning model is $f$, and that the model's evaluation of the query result $d_i$ is denoted $f(d_i)$. For any two data pairs $(\langle q, d_i \rangle, r_i)$ and $(\langle q, d_j \rangle, r_j)$ with $r_i < r_j$, if $f(d_i) > f(d_j)$, then $(\langle q, d_i \rangle, r_i)$ and $(\langle q, d_j \rangle, r_j)$ are considered a pair of incorrectly predicted data pairs.
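As a minimal sketch of the pairwise error criterion described above, the following Python function lists, for one user query, every pair of query results whose relevance labels and model scores disagree. The function name `mispredicted_pairs` and the representation of the feedback data as parallel lists of feature vectors and relevance labels are assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def mispredicted_pairs(features: Sequence[Sequence[float]],
                       relevance: Sequence[int],
                       model: Callable[[Sequence[float]], float]
                       ) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) with r_i < r_j but f(d_i) > f(d_j)."""
    scores = [model(x) for x in features]
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        # Order the pair so that `low` carries the smaller relevance label.
        low, high = (i, j) if relevance[i] < relevance[j] else (j, i)
        # Pairs with equal relevance carry no ordering information.
        if relevance[low] < relevance[high] and scores[low] > scores[high]:
            pairs.append((low, high))
    return pairs


# Three results for one query, with relevance labels in the range 0..5.
features = [[1.0], [2.0], [3.0]]
relevance = [0, 3, 5]
reversed_model = lambda x: -x[0]      # scores the results in the wrong order
errors = mispredicted_pairs(features, relevance, reversed_model)
```

The samples that appear in at least one returned pair are the incorrectly predicted data samples that the later steps operate on.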

[0034] Second, the spatial transformation matrix is learned. To prevent the established patch model from affecting the set of correctly predicted data samples, the feature space of the original data needs to be transformed. The purpose of the spatial transformation is that, in the new feature space, the distance between the incorrectly predicted data samples is as small as possible and the distance between the incorrectly predicted data samples and the correctly predicted data samples is as large as possible. This minimizes the impact of the patch model on the correctly predicted data, thereby ensuring the prediction accuracy of the new machine learning model.
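The specification does not give the objective function used to learn the transformation matrix. The sketch below therefore swaps in a simple, well-known stand-in: a linear whitening transform computed only from the incorrectly predicted samples, in the style of Relevant Component Analysis. It assumes that "scale learning" refers to what is commonly called distance metric learning and that the transform is linear; the names `whitening_transform` and `mean_distance` and the toy data are hypothetical. Whitening shrinks the directions in which the error samples are widely spread and stretches the others, so it moves correctly predicted samples away only when they differ from the error samples along those other directions.

```python
import numpy as np


def whitening_transform(error_samples, eps=1e-9):
    """Return a matrix L so that z = L @ x whitens the error samples.

    Assumption: stand-in for the learned transformation matrix, computed
    from the incorrectly predicted samples only.
    """
    X = np.asarray(error_samples, dtype=float)
    centered = X - X.mean(axis=0)
    cov = centered.T @ centered / len(X)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Inverse square root of the covariance matrix (eps avoids division by zero).
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T


def mean_distance(A, B):
    """Mean Euclidean distance over all pairs (a, b) with a in A and b in B."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    return float(np.mean(np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)))


# Toy data: error samples spread along the first axis, correct samples
# separated from them along the second axis.
errors = np.array([[-2.0, -0.5], [2.0, -0.5], [-2.0, 0.5], [2.0, 0.5]])
correct = np.array([[0.0, 3.0], [0.0, -3.0]])

L = whitening_transform(errors)
errors_new = errors @ L.T
correct_new = correct @ L.T
```

On this toy data the error samples end up closer to one another and farther from the correct samples than in the original space. A full metric learning objective that also uses the correctly predicted samples would be needed to obtain that behaviour in general.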

[0035] Third, the patch model is learned. After the incorrectly predicted data set has been mapped to the new feature space, the patch model is built by learning from the data samples. The process of building a patch model is in fact the training process of a supervised machine learning model.

[0036] Consider two different query results under the same user query that have different relevance to the query; the patch model is updated continually by analyzing such pairs of query results. Suppose that, for a user query $q$, any two query results $d_i$ and $d_j$ with different relevance are extracted, with relevance $r_i$ and $r_j$ to the user query. Define $s_i = g(\langle q, d_i \rangle)$ and $s_j = g(\langle q, d_j \rangle)$ as the evaluation scores of the machine learning model, and define $P_{ij}$ as the probability that the evaluation score $s_i$ is greater than the evaluation score $s_j$. The calculation formula is shown in (1):

[0037] $$P_{ij} = \sigma(s_i - s_j) = \frac{1}{1 + e^{-(s_i - s_j)}} \tag{1}$$

[0038] where $\sigma$ represents the sigmoid function. The ideal distribution $\bar{P}_{ij}$ of the probability $P_{ij}$ is defined as shown in (2):

[0039]

(2)

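[0040] Based on the above probability function, a cross-entropy loss function is constructed as shown in (3), where $P_{ij}$ is the probability from (1) and $\bar{P}_{ij}$ is its ideal value from (2):

[0041] $$C_{ij} = -\bar{P}_{ij} \log P_{ij} - (1 - \bar{P}_{ij}) \log (1 - P_{ij}) \tag{3}$$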

[0042] From the above formula it follows that when a data sample is far from the center point of the patch model, the sample has no influence on the parameters of the patch model. This ensures that the patch model is used primarily on the incorrectly predicted sample data without affecting the correctly predicted data samples.
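The pairwise training signal described in [0036] to [0041] can be sketched as follows, assuming the pair probability is the sigmoid of the score difference. The source text does not reproduce the ideal distribution of formula (2), so the target used here (1 when the first result is more relevant, 0 otherwise) is an assumption borrowed from common pairwise ranking practice. The sketch also omits any weighting by distance from the patch center mentioned in [0042], and all function names are hypothetical.

```python
import math


def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))


def pair_probability(s_i: float, s_j: float) -> float:
    """Probability that score s_i is greater than score s_j."""
    return sigmoid(s_i - s_j)


def target_probability(r_i: int, r_j: int) -> float:
    """ASSUMED ideal probability for two results with different relevance."""
    return 1.0 if r_i > r_j else 0.0


def pairwise_cross_entropy(s_i: float, s_j: float, r_i: int, r_j: int,
                           eps: float = 1e-12) -> float:
    """Cross-entropy between the ideal and the predicted pair probability."""
    p = min(max(pair_probability(s_i, s_j), eps), 1.0 - eps)  # keep log finite
    p_bar = target_probability(r_i, r_j)
    return -p_bar * math.log(p) - (1.0 - p_bar) * math.log(1.0 - p)


# The first result is more relevant (5 versus 0) in both cases.
loss_right_order = pairwise_cross_entropy(2.0, 0.0, 5, 0)
loss_wrong_order = pairwise_cross_entropy(0.0, 2.0, 5, 0)
```

Under these assumptions a pair that the model orders wrongly yields a larger loss than a pair it orders correctly, which is what drives the update of the patch model parameters during supervised training.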

[0043] Finally, the patch model is applied. After obtaining a patch model and defining the scope of the patch model, use the new machine learning model

[number]

to predict the ordering of query results. For a user query, suppose that one of its query results has a corresponding feature vector. The spatial transformation is first applied to obtain a new feature vector, and the new machine learning model then makes a prediction on the new feature vector to obtain the final evaluation score. After the evaluation scores of all the query results have been obtained, the query results are sorted according to their scores. Adding a patch model makes it possible to repair, locally and effectively, the error samples found in the user feedback data without affecting the correct samples during the repair, thereby improving the performance and prediction accuracy of the machine learning model.
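The application step can be sketched as follows. The formula for the new machine learning model is not reproduced in the source text, so the combination rule here is an assumption: each patch has a center and a radius in the transformed space, and a patch adds its correction to the original model's score only for samples that fall inside its radius. The names `patched_score` and `rank_results` and the tuple layout of a patch are hypothetical.

```python
import numpy as np


def patched_score(x, base_model, transform, patches):
    """Score one query result with the original model plus applicable patches.

    patches: list of (center, radius, patch_model) tuples defined in the
    transformed space (assumed representation of a patch and its scope).
    """
    z = transform @ np.asarray(x, dtype=float)   # spatial transformation
    score = base_model(x)                        # original model is unchanged
    for center, radius, patch_model in patches:
        if np.linalg.norm(z - center) <= radius:  # inside the patch scope
            score += patch_model(z)
    return score


def rank_results(feature_vectors, base_model, transform, patches):
    """Return result indices sorted by descending score, and the scores."""
    scores = [patched_score(x, base_model, transform, patches)
              for x in feature_vectors]
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return order, scores


# Toy setup: identity transform and one patch around the origin.
base_model = lambda x: float(x[0])
transform = np.eye(2)
patches = [(np.array([0.0, 0.0]), 1.0, lambda z: 5.0)]
results = [[0.5, 0.0], [3.0, 0.0]]

order_patched, scores_patched = rank_results(results, base_model, transform, patches)
order_original, _ = rank_results(results, base_model, transform, [])
```

In this toy setup only the first result lies inside the patch scope, so only its score changes and the ranking of the two results is reversed relative to the original model, while the second result keeps its original score.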

[0044] This embodiment uses scale learning to map the original data into a new feature space. The objective function considers only the incorrectly predicted data samples in the data space. This is because the model repair algorithm is used mainly to repair the incorrectly predicted data samples, whereas the correctly predicted data samples need no processing. Moreover, in the user feedback data set, the number of incorrectly predicted data samples is much smaller than the number of correctly predicted data samples. Considering only the incorrectly predicted data samples therefore greatly improves the efficiency of the algorithm.

[0045] Compared with the prior art, the local repair method of the machine learning model provided by the present invention does not change the original learning model. Instead, according to the incorrectly predicted data fed back by users, it learns only the local subspace to which a patch applies and the patch model itself; the original learning model and the generated patch model together form a new learning model. The original machine learning model is thus modified from a local perspective, which makes up for the shortcomings of retraining, incremental learning, and similar approaches and improves the performance of the machine learning model.

[0046]

[0047] It is to be understood that those skilled in the art can make equivalent substitutions or changes in accordance with the technical solutions of the present invention and the inventive concepts thereof, and all such changes or substitutions shall fall within the scope of protection of the appended claims.

Claims

Claim

[Claim 1] A method for locally repairing a machine learning model, characterized in that the local repair method comprises the following steps:

Step 1: Collect and analyze the feedback data: collect user feedback data, and extract the incorrectly predicted data samples;

Step 2: Spatial transformation: convert the original data space to a new data space through scale learning; in the new data space, the distance between the incorrectly predicted data samples is reduced as much as possible, and the distance between the incorrectly predicted data samples and the correctly predicted data samples is increased as much as possible;

Step 3: In the new data space, learn the wrong data sample to build a patch model, and define the application scope of the patch model;

Step 4: In the new data space, learn the wrong data samples to build a patch model, and define the application scope of the patch model.

[Claim 2] The method for locally repairing a machine learning model according to claim 1, wherein: in step 1, the user feedback data is a series of data pairs, and the degree of relevance of each query result is evaluated by the established machine learning model.

[Claim 3] The method for locally repairing a machine learning model according to claim 1, wherein: in step 2, in the new feature space, the distance between the incorrectly predicted data samples is reduced as much as possible, and the distance between the incorrectly predicted data samples and the correctly predicted data samples is increased as much as possible.

[Claim 4] The method for locally repairing a machine learning model according to claim 1, wherein: in step 3, after the incorrectly predicted data set is mapped to the new feature space, the patch model is established by learning from the data samples.

[Claim 5] The method for locally repairing a machine learning model according to claim 4, wherein: the process of establishing the patch model in step 3 is a training process of a supervised machine learning model.

[Claim 6] The method for locally repairing a machine learning model according to claim 1, wherein: in step 4, after N patch models are obtained and the scope of each patch model is defined, the machine learning model is used to predict the ordering of query results.