CN119646224A - Hybrid expert multi-classification method and system combined with embedded dual-model organizational architecture - Google Patents
Hybrid expert multi-classification method and system combined with embedded dual-model organizational architecture
- Publication number
- CN119646224A (application CN202411805132.1A)
- Authority
- CN
- China
- Prior art keywords
- value
- model
- scientific
- corpus
- sentence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a mixed expert multi-classification method and system combined with an embedded double-model organization architecture, applied to the technical field of data processing. Deep learning is performed on a pre-trained model based on a data distribution consistency threshold and a scientific value sentence training corpus to construct a scientific value sentence recognition model. A scientific value sentence multi-classification model is built based on a mixed expert mechanism and comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network. The scientific value sentence recognition model and the scientific value sentence multi-classification model are packaged into a scientific value sentence detection double model, and scientific literature is obtained. The scientific literature is input into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences. This solves the technical problems that existing scientific literature value sentence classification methods in the prior art have low processing efficiency and accuracy that is difficult to guarantee when processing scientific literature.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a mixed expert multi-classification method and system combined with an embedded double-model organization architecture.
Background
With the rapid growth of scientific and technological literature, efficiently and accurately identifying and classifying scientific value sentences in documents, such as academic value, application value and innovation value sentences, has become an important subject in scientific research and literature management. Traditional document classification methods rely on manual labeling and rule setting; they can partially meet the requirements, but their efficiency is low and their accuracy is difficult to guarantee in large-scale document processing.
Therefore, the existing scientific literature value sentence classification methods in the prior art suffer from the technical problems of low processing efficiency and accuracy that is difficult to guarantee when processing scientific literature.
Disclosure of Invention
The application solves the technical problems of low processing efficiency and hard-to-guarantee accuracy in prior-art processing of scientific literature by providing a mixed expert multi-classification method and system combined with an embedded double-model organization architecture. By introducing a mixed expert mechanism and optimizing data distribution consistency, the classification accuracy of scientific value sentences can be effectively improved, achieving the technical effects of efficient and accurate analysis of multidimensional academic, application and innovation values.
The application provides a mixed expert multi-classification method combined with an embedded double-model organization architecture, which comprises the following steps: and performing deep learning on the pre-training model based on the data distribution consistency threshold and the scientific value sentence training corpus, and constructing a scientific value sentence recognition model. Based on a mixed expert mechanism, a scientific value sentence multi-classification model is built, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network. And packaging the scientific value sentence recognition model and the scientific value sentence multi-classification model into a scientific value sentence detection double model. Obtaining scientific and technological literature. And inputting the scientific and technological literature into the scientific and valuable sentence detection double model to obtain a comprehensive classification result of the scientific and valuable sentence.
In an implementation mode, deep learning is conducted on the pre-training model based on a data distribution consistency threshold and scientific value sentence training corpus, and a scientific value sentence recognition model is built, wherein the scientific value sentence training corpus comprises a scientific value sentence sample set and a non-scientific value sentence sample set. Dividing the scientific value sentence training corpus according to a preset proportion to obtain a first corpus training set, a first corpus testing set and a first corpus verification set. And carrying out data distribution optimization on the first corpus training set, the first corpus testing set and the first corpus verification set based on the data distribution consistency threshold value to obtain a second corpus training set, a second corpus testing set and a second corpus verification set. And training, testing and verifying the pre-training model based on the second corpus training set, the second corpus testing set and the second corpus verification set to generate the scientific value sentence identification model.
In an implementation mode, based on the data distribution consistency threshold, data distribution optimization is performed on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a second corpus training set, a second corpus testing set and a second corpus verification set, wherein positive and negative sample distribution calculation is performed on the first corpus training set, the first corpus testing set and the first corpus verification set respectively to obtain a first data distribution coefficient, a second data distribution coefficient and a third data distribution coefficient. And based on the first data distribution coefficient, the second data distribution coefficient and the third data distribution coefficient, performing data distribution consistency evaluation on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a data distribution consistency coefficient. And judging whether the data distribution consistency coefficient is larger than the data distribution consistency threshold value. And if the data distribution consistency coefficient is larger than the data distribution consistency threshold, carrying out data distribution adjustment on the first corpus training set, the first corpus testing set and the first corpus verification set according to the data distribution consistency threshold to generate the second corpus training set, the second corpus testing set and the second corpus verification set.
In an implementation mode, a scientific value sentence multi-classification model is built based on a mixed expert mechanism, and the method comprises the steps of obtaining scientific value sentence multi-classification indexes, wherein the scientific value sentence multi-classification indexes comprise academic value, application value and innovation value. Based on the scientific value sentence multi-classification index, academic value sentence scoring corpus, application value sentence scoring corpus and innovation value sentence scoring corpus are loaded. Training the academic value expert sub-model based on the academic value sentence scoring corpus. And training the application value expert sub-model based on the application value sentence scoring corpus. And training the innovation value expert sub-model based on the innovation value sentence scoring corpus. And constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model based on a gating loss function. And taking the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model as multi-classification parallel nodes. And connecting the multi-classification parallel nodes with the gating network to generate the scientific value sentence multi-classification model.
In an implementation, constructing the gating network based on a gating loss function according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model comprises collecting output samples of the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain an expert output sample set. And acquiring the discrimination probability parameters corresponding to the expert output sample set to obtain discrimination probability distribution. The minimum gating loss is used as a gating network training target. And performing unsupervised training on the expert output sample set and the discrimination probability distribution based on the gating loss function and the gating network training target to generate the gating network.
In an implementation, the gating loss function is:
L_gate = L_JS + λ · R_g
where L_gate denotes the gating loss function, L_JS denotes the unsupervised loss, λ denotes the predetermined balance weight, and R_g denotes the regularization term to be minimized.
In an implementation mode, the scientific literature is input into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences, wherein the method comprises the steps of positioning scientific value sentence enrichment regions based on the scientific literature to obtain a value sentence enrichment region. And inputting the value sentence enrichment area into the scientific value sentence recognition model to obtain a scientific value sentence recognition result. And inputting the scientific value sentence recognition result into the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain a multi-type value discrimination result. Inputting the multi-type value discrimination results into the gating network to generate the comprehensive classification result of the scientific value sentence.
The application also provides a hybrid expert multi-classification system incorporating an embedded double-model organizational architecture, characterized in that the system comprises:
The scientific value sentence recognition model construction module is used for carrying out deep learning on the pre-training model based on the data distribution consistency threshold value and the scientific value sentence training corpus to construct a scientific value sentence recognition model.
The scientific value sentence multi-classification model construction module is used for constructing a scientific value sentence multi-classification model based on a mixed expert mechanism, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network.
And the model packaging module is used for packaging the scientific value sentence identification model and the scientific value sentence multi-classification model into a scientific value sentence detection double model.
And the scientific literature acquisition module is used for acquiring scientific literature.
And the comprehensive classification result acquisition module is used for inputting the scientific and technological literature into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences.
The mixed expert multi-classification method and system combining the embedded double-model organization architecture proposed by the application perform deep learning on the pre-trained model based on the data distribution consistency threshold and the scientific value sentence training corpus to construct the scientific value sentence recognition model. Based on a mixed expert mechanism, a scientific value sentence multi-classification model is built, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network. The scientific value sentence recognition model and the scientific value sentence multi-classification model are packaged into a scientific value sentence detection double model. Scientific literature is obtained and input into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences. This solves the technical problems that the existing scientific literature value sentence classification methods in the prior art have low processing efficiency and accuracy that is difficult to guarantee when processing scientific literature. By introducing a mixed expert mechanism and optimizing data distribution consistency, the classification accuracy of scientific value sentences can be effectively improved, and the technical effects of efficient and accurate analysis of multidimensional academic, application and innovation values are realized.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings of the embodiments of the present disclosure will be briefly described below. It is apparent that the figures in the following description relate only to some embodiments of the present disclosure and are not limiting of the present disclosure.
FIG. 1 is a flow chart of a hybrid expert multi-classification method incorporating embedded dual model organization architecture according to the present invention;
FIG. 2 is a schematic diagram of a hybrid expert multi-classification system with embedded dual-model organization architecture according to an embodiment of the present application;
Reference numerals illustrate a scientific value sentence recognition model construction module 11, a scientific value sentence multi-classification model construction module 12, a model packaging module 13, a scientific literature acquisition module 14 and a comprehensive classification result acquisition module 15.
Detailed Description
The foregoing is only an overview of the technical solutions of the present application. In order that the technical means of the application may be more clearly understood and implemented in accordance with the description, and in order to make the above and other objects, features and advantages of the application more readily apparent, the application is described in detail below with reference to specific embodiments.
The present application will be described in further detail below with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present application more apparent, and the described embodiments should not be construed as limiting the present application, but all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments; it is to be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments and may be combined with each other without conflict. The term "first/second" is used merely to distinguish similar objects and does not represent a particular ordering of the objects. The terms "comprises", "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article or apparatus. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains. The terminology used herein is for the purpose of describing embodiments of the application only.
The embodiment of the application provides a mixed expert multi-classification method and a system combined with an embedded double-model organization architecture, as shown in fig. 1, wherein the method comprises the following steps:
And performing deep learning on the pre-training model based on the data distribution consistency threshold and the scientific value sentence training corpus, and constructing a scientific value sentence recognition model.
Based on a mixed expert mechanism, a scientific value sentence multi-classification model is built, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network.
In the scientific literature, knowledge is not uniformly distributed, but rather exhibits a certain concentration and regularity. In order to accurately classify academic value sentences, application value sentences and innovation value sentences of scientific and technical literature, deep learning is carried out on a pre-training model based on a data distribution consistency threshold and scientific value sentence training corpus, and a scientific value sentence recognition model is constructed, wherein the data distribution consistency threshold is a preset numerical value and is used for evaluating whether sample distribution of training, testing and verifying a data set is balanced or not. When the category distribution of the dataset does not coincide with the category distribution of the overall dataset, the dataset is adjusted to meet the threshold. The scientific sentence training corpus is a specially prepared text set for training a scientific sentence recognition model, and comprises sentence samples marked as having or not having scientific value. And then, based on a mixed expert mechanism, constructing a scientific value sentence multi-classification model, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network. The hybrid expert mechanism is an integrated learning framework in which multiple sub-models (experts) evaluate and classify different classes of scientific value sentences according to their expertise. By combining multiple expert models and learning how to adaptively assign to different experts based on input, adaptive decomposition and modeling of tasks is achieved. Through the cooperation of a plurality of expert models, the input is subjected to characterization learning and decision output from different perspectives.
The method provided by the embodiment of the application further comprises the step that the scientific value sentence training corpus comprises a scientific value sentence sample set and a non-scientific value sentence sample set. Dividing the scientific value sentence training corpus according to a preset proportion to obtain a first corpus training set, a first corpus testing set and a first corpus verification set. And carrying out data distribution optimization on the first corpus training set, the first corpus testing set and the first corpus verification set based on the data distribution consistency threshold value to obtain a second corpus training set, a second corpus testing set and a second corpus verification set. And training, testing and verifying the pre-training model based on the second corpus training set, the second corpus testing set and the second corpus verification set to generate the scientific value sentence identification model.
The scientific value sentence training corpus consists of example sentences selected from scientific literature and labeled as having or not having scientific value, i.e., it comprises a scientific value sentence sample set and a non-scientific value sentence sample set. The scientific value sentence training corpus is divided according to a predetermined proportion to obtain a first corpus training set, a first corpus testing set and a first corpus verification set, which are respectively used for training, testing and verifying the model. The predetermined proportion is a preset data division ratio, for example 70% training, 15% testing and 15% verification. Further, based on the data distribution consistency threshold, data distribution optimization is performed on the first corpus training set, the first corpus testing set and the first corpus verification set so as to ensure that each category is evenly distributed across the training, testing and verification sets, yielding a second corpus training set, a second corpus testing set and a second corpus verification set. The pre-trained model, an initial neural network model, is then trained on the second corpus training set, tested on the second corpus testing set, and verified with the second corpus verification set; verification passes when the output meets a preset accuracy, and the scientific value sentence recognition model is obtained. The scientific value sentence recognition model is mainly used for judging whether an input sentence is a scientific value sentence. This layer of the model mainly filters out a large number of irrelevant sentences and provides a high-quality corpus for the subsequent classification tasks.
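As a minimal, non-authoritative sketch of the predetermined-proportion split described above (the 70/15/15 ratio comes from this paragraph; the shuffling, random seed and sample format are assumptions):

```python
import random

def split_corpus(samples, train_ratio=0.70, test_ratio=0.15, seed=42):
    """Split labelled value-sentence samples into the first corpus training,
    testing and verification sets according to a predetermined proportion."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    n_test = int(len(shuffled) * test_ratio)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]  # remaining ~15% used for verification
    return train, test, val
```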
And respectively carrying out positive and negative sample distribution calculation on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a first data distribution coefficient, a second data distribution coefficient and a third data distribution coefficient. And based on the first data distribution coefficient, the second data distribution coefficient and the third data distribution coefficient, performing data distribution consistency evaluation on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a data distribution consistency coefficient. And judging whether the data distribution consistency coefficient is larger than the data distribution consistency threshold value. And if the data distribution consistency coefficient is larger than the data distribution consistency threshold, carrying out data distribution adjustment on the first corpus training set, the first corpus testing set and the first corpus verification set according to the data distribution consistency threshold to generate the second corpus training set, the second corpus testing set and the second corpus verification set.
Based on the data distribution consistency threshold, data distribution optimization is performed on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain the second corpus training set, the second corpus testing set and the second corpus verification set. Positive and negative sample distribution calculation is performed on the first corpus training set, the first corpus testing set and the first corpus verification set respectively to obtain a first data distribution coefficient, a second data distribution coefficient and a third data distribution coefficient, wherein the first data distribution coefficient = (number of scientific value sentence samples in the first corpus training set) ÷ (number of non-scientific value sentence samples in the first corpus training set); the second and third data distribution coefficients are calculated in the same way for the testing set and the verification set. Data distribution consistency evaluation is then performed on the first corpus training set, the first corpus testing set and the first corpus verification set based on the three coefficients: the differences between the first data distribution coefficient and, respectively, the second and third data distribution coefficients are calculated to obtain the data distribution consistency coefficient. Whether the data distribution consistency coefficient is larger than the data distribution consistency threshold is then judged. The data distribution consistency threshold is a preset maximum threshold on the distribution difference; when the data distribution consistency coefficient is smaller than or equal to the threshold, the data distribution consistency of the data sets is high, and when it is larger, the consistency is low and data distribution optimization is needed. If the data distribution consistency coefficient is larger than the data distribution consistency threshold, distribution compensation adjustment is carried out on the first corpus training set, the first corpus testing set and the first corpus verification set, supplementing data for the existing difference proportion until it is smaller than the data distribution consistency threshold, thereby generating the second corpus training set, the second corpus testing set and the second corpus verification set.
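How the distribution coefficients and the consistency check above might be computed is sketched below. The way the pairwise differences are aggregated into a single consistency coefficient (here, the larger of the two gaps), the threshold value and the binary label convention are assumptions, since the text does not fix them.

```python
def distribution_coefficient(samples):
    """Positive/negative ratio of one subset: number of scientific value sentence
    samples divided by number of non-scientific value sentence samples."""
    pos = sum(1 for s in samples if s["label"] == 1)
    neg = sum(1 for s in samples if s["label"] == 0)
    return pos / max(neg, 1)

def consistency_coefficient(train, test, val):
    """Data distribution consistency coefficient: the larger gap between the
    training-set ratio and the testing-/verification-set ratios."""
    c1, c2, c3 = (distribution_coefficient(s) for s in (train, test, val))
    return max(abs(c1 - c2), abs(c1 - c3))

def needs_rebalancing(train, test, val, threshold=0.05):
    """True when the subsets should be supplemented/adjusted, i.e. the
    consistency coefficient exceeds the preset consistency threshold."""
    return consistency_coefficient(train, test, val) > threshold
```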
The method provided by the embodiment of the application further comprises the step of obtaining scientific value sentence multi-classification indexes, wherein the scientific value sentence multi-classification indexes comprise academic value, application value and innovation value. Based on the scientific value sentence multi-classification index, academic value sentence scoring corpus, application value sentence scoring corpus and innovation value sentence scoring corpus are loaded. Training the academic value expert sub-model based on the academic value sentence scoring corpus. And training the application value expert sub-model based on the application value sentence scoring corpus. And training the innovation value expert sub-model based on the innovation value sentence scoring corpus. And constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model based on a gating loss function. And taking the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model as multi-classification parallel nodes. And connecting the multi-classification parallel nodes with the gating network to generate the scientific value sentence multi-classification model.
Based on a mixed expert mechanism, a scientific value sentence multi-classification model is built, which comprises the steps of obtaining scientific value sentence multi-classification indexes, wherein the scientific value sentence multi-classification indexes comprise academic value, application value and innovation value. The academic value reflects the contribution of a sentence to the academic community, such as the proposal of a theory or the importance of experimental results. The application value reflects the potential or real benefit, in practical application, of the research or discovery described in the sentence. The innovation value evaluates the degree of innovation and originality of the study, method or result mentioned in the sentence.
And then, based on the scientific value sentence multi-classification index, loading academic value sentence scoring corpus, application value sentence scoring corpus and innovation value sentence scoring corpus, wherein the academic value sentence scoring corpus comprises sentences marked as high academic value and corresponding value scoring identifiers, such as sentences deeply discussing specific scientific questions. The application value sentence scoring corpus comprises sentences marked as high application value and corresponding value scoring identifiers, such as sentences describing research results with practical application prospects. The innovation value sentence scoring corpus comprises sentences marked as high innovation value and corresponding value scoring identifiers, such as sentences proposing a new theory or a new method. Further, the academic value expert sub-model is trained based on the academic value sentence scoring corpus. And training the application value expert sub-model based on the application value sentence scoring corpus. And training the innovation value expert sub-model based on the innovation value sentence scoring corpus. The academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model are all obtained after supervised training of a neural network model. And constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model based on a gating loss function. And taking the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model as multi-classification parallel nodes. And connecting the multi-classification parallel nodes with the gating network to generate the scientific value sentence multi-classification model. Given an input x, the scientific value sentence multi-classification model output y can be expressed as:
y = Σ_j g(x)_j · f_j(x)
where f_j(x) represents the output of the j-th expert model on input x and g(x)_j is the corresponding gating weight. The gating mechanism acts as a soft router, adaptively assigning weight to the different experts based on the characteristics of the input and combining their outputs in a weighted sum. g(x) is typically implemented using a Softmax function:
g(x)_j = exp(e_j) / Σ_k exp(e_k)
where e_j represents the gating logit of the j-th expert, which can be obtained by a linear transformation of x or a feed-forward network mapping. In the scientific value multi-classification task, three expert models, for academic value, application value and innovation value respectively, are set up in this scheme.
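The patent does not disclose the internal architecture of the expert sub-models or the gating network; the sketch below is only one minimal PyTorch realisation of the weighted combination y = Σ_j g(x)_j · f_j(x) with softmax gating, in which the linear experts and the embedding dimension are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ValueSentenceMoE(nn.Module):
    """Minimal mixture-of-experts head over sentence embeddings: three expert
    scorers (academic / application / innovation value) and a gating network
    that produces softmax weights g(x) from the same embedding."""

    def __init__(self, embed_dim: int, num_experts: int = 3):
        super().__init__()
        # Each expert f_j(x) maps the sentence embedding to one value score.
        self.experts = nn.ModuleList(
            [nn.Linear(embed_dim, 1) for _ in range(num_experts)]
        )
        # The gating network produces the logits e_j; g(x) = softmax(e).
        self.gate = nn.Linear(embed_dim, num_experts)

    def forward(self, x: torch.Tensor):
        expert_scores = torch.cat([f(x) for f in self.experts], dim=-1)  # f_j(x)
        gate_weights = F.softmax(self.gate(x), dim=-1)                   # g(x)_j
        combined = (gate_weights * expert_scores).sum(dim=-1)            # y = Σ_j g(x)_j f_j(x)
        return combined, expert_scores, gate_weights
```

For a batch of embeddings x with shape (batch, embed_dim), expert_scores and gate_weights have shape (batch, 3) and combined has shape (batch,).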
The method provided by the embodiment of the application further comprises the steps of collecting output samples of the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model, and obtaining an expert output sample set. And acquiring the discrimination probability parameters corresponding to the expert output sample set to obtain discrimination probability distribution. The minimum gating loss is used as a gating network training target. And performing unsupervised training on the expert output sample set and the discrimination probability distribution based on the gating loss function and the gating network training target to generate the gating network.
Based on a gating loss function, constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model, wherein the method comprises the steps of collecting output samples of the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain an expert output sample set. The expert output sample set is a set of output samples obtained from each expert sub-model (academic value, application value, innovation value expert sub-model). Each sample set contains scoring or classification results for a particular scientific value sentence that reflect the sentence's attributes in the respective value dimension. And acquiring the discrimination probability parameters corresponding to the expert output sample set to obtain discrimination probability distribution. And taking minimized gating loss as a gating network training target, performing unsupervised training on the expert output sample set and the discrimination probability distribution based on the gating loss function and the gating network training target, and generating the gating network.
The gating network plays a key routing role in the scientific value sentence multi-classification model: it aims to adaptively route each value sentence to the different experts according to its characteristics. Unlike the supervised training of the expert models, it is difficult for the gating network to obtain a direct supervisory signal, because the true expert assignment proportions for each value sentence are not known. Therefore, an unsupervised collaborative training mode is adopted, in which the outputs of the expert models serve as soft labels to guide the learning of the gating network. In each training batch, the input value sentences x_1, x_2, ..., x_b are first inferred separately with the three expert models, yielding their discrimination probabilities in the respective value dimensions: p_ij = Sigmoid(f_j(x_i)), with i = 1, ..., b and j = 1, 2, 3.
Then, taking these probability outputs as soft labels, JS-divergence matching is performed against the outputs of the gating network to obtain the unsupervised loss function of the gating network:
L_JS = (1/b) · Σ_{i=1..b} JS(p_i ∥ g(x_i))
where p_i = (p_i1, p_i2, p_i3) is the soft label formed by the expert discrimination probabilities for sentence x_i and g(x_i) is the corresponding gating output. The JS divergence is defined as:
JS(P ∥ Q) = ½ · KL(P ∥ M) + ½ · KL(Q ∥ M), with M = ½ · (P + Q)
where KL(· ∥ ·) denotes the Kullback-Leibler divergence.
By minimizing this loss, the gating network learns to adaptively weight the experts according to the characteristics of the problem, so that its weighted output is as close as possible to the discrimination probabilities of the individual experts. This collaborative learning mechanism lets the gating network and the expert models adapt to each other through iterative updates during training, ultimately improving the overall performance of the model.
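A rough sketch of this JS-divergence matching is given below; the renormalisation of the sigmoid outputs into a per-sentence soft-label distribution is an assumption the text leaves implicit, and the function names are illustrative.

```python
import torch

def js_divergence(p: torch.Tensor, q: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Jensen-Shannon divergence between two batches of discrete distributions
    (rows summing to 1): JS(P||Q) = 0.5*KL(P||M) + 0.5*KL(Q||M), M = (P+Q)/2."""
    p = p.clamp_min(eps)
    q = q.clamp_min(eps)
    m = 0.5 * (p + q)
    kl_pm = (p * (p / m).log()).sum(dim=-1)
    kl_qm = (q * (q / m).log()).sum(dim=-1)
    return 0.5 * kl_pm + 0.5 * kl_qm

def unsupervised_gate_loss(expert_probs: torch.Tensor, gate_weights: torch.Tensor) -> torch.Tensor:
    """L_JS: match the gating distribution g(x_i) to the expert soft labels p_i,
    with the sigmoid outputs p_ij renormalised per sentence so that both sides
    are probability distributions, then averaged over the batch."""
    soft_labels = expert_probs / expert_probs.sum(dim=-1, keepdim=True)
    return js_divergence(soft_labels, gate_weights).mean()
```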
During training, a regularization term based on expert entropy is also introduced to encourage the output of the gating network to be more selective over experts:
R_g = (1/b) · Σ_{i=1..b} H(g(x_i))
where H(·) denotes the entropy function. Minimizing the regularization term R_g pushes the gating network to select as few experts as possible for each problem, avoiding uniform averaging and improving expert utilization efficiency.
Finally, the overall training objective of the gating network is to minimize the unsupervised collaborative loss together with the expert entropy regularization term, giving the gating loss function:
L_gate = L_JS + λ · R_g
where L_gate denotes the gating loss function, L_JS denotes the unsupervised loss, λ denotes the predetermined balance weight, and R_g denotes the regularization term to be minimized. In summary, the expert models discriminate the different value types through supervised training on scientific value sentences, the gating network realizes an adaptive expert routing strategy through collaborative learning and entropy regularization, and, through collaborative optimization, the expert models and the gating network together form the multi-classification method for scientific value sentences.
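Building on the helpers from the previous sketch, the gating loss could be assembled as below; the value of λ and the batch-averaged form of R_g are assumptions, the text only calling λ a predetermined balance weight.

```python
import torch

# js_divergence and unsupervised_gate_loss are as defined in the preceding sketch.

def entropy_regularizer(gate_weights: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """R_g: mean entropy H(g(x_i)) of the gating distribution over the batch;
    minimising it encourages the gate to pick few experts per sentence."""
    g = gate_weights.clamp_min(eps)
    return -(g * g.log()).sum(dim=-1).mean()

def gating_loss(expert_probs: torch.Tensor, gate_weights: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    """L_gate = L_JS + lambda * R_g (lambda = 0.1 is only an illustrative value)."""
    return unsupervised_gate_loss(expert_probs, gate_weights) + lam * entropy_regularizer(gate_weights)
```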
And packaging the scientific value sentence recognition model and the scientific value sentence multi-classification model into a scientific value sentence detection double model. Obtaining scientific and technological literature. And inputting the scientific and technological literature into the scientific and valuable sentence detection double model to obtain a comprehensive classification result of the scientific and valuable sentence.
And packaging the scientific value sentence recognition model and the scientific value sentence multi-classification model into a scientific value sentence detection double model. Subsequently, scientific literature is obtained, for example an uploaded research paper on biodiversity. The scientific literature is input into the scientific value sentence detection double model and analyzed by it to obtain a comprehensive classification result of the scientific value sentences. This solves the technical problems that the existing scientific literature value sentence classification methods in the prior art have low processing efficiency and accuracy that is difficult to guarantee when processing scientific literature. By introducing a mixed expert mechanism and optimizing data distribution consistency, the classification accuracy of scientific value sentences can be effectively improved, and the technical effects of efficient and accurate analysis of multidimensional academic, application and innovation values are realized. Illustratively, given the embedding representation x of one scientific value sentence, the three expert models f_1(x), f_2(x) and f_3(x) output their discrimination scores in the three value dimensions, respectively. The gating network g(x) calculates the weight of each expert according to the semantic features of the sentence, and the comprehensive classification result is finally obtained:
y = g(x)_1 · f_1(x) + g(x)_2 · f_2(x) + g(x)_3 · f_3(x)
where f_1(x), f_2(x) and f_3(x) respectively represent the outputs of the academic value, application value and innovation value experts.
The method provided by the embodiment of the application further comprises the step of positioning the scientific value sentence enrichment region based on the scientific literature to obtain the value sentence enrichment region. And inputting the value sentence enrichment area into the scientific value sentence recognition model to obtain a scientific value sentence recognition result. And inputting the scientific value sentence recognition result into the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain a multi-type value discrimination result. Inputting the multi-type value discrimination results into the gating network to generate the comprehensive classification result of the scientific value sentence.
Inputting the scientific literature into the scientific value sentence detection double model to obtain the comprehensive classification result of the scientific value sentences proceeds as follows. Knowledge is not uniformly distributed in scientific literature, but shows a certain concentration and regularity: certain sections or locations often contain a large amount of key information and core knowledge, and are characterized by high knowledge density, large information content and important content; these areas are called knowledge enrichment regions. Scientific value sentence enrichment region positioning is performed based on the scientific literature to obtain the value sentence enrichment region. The value sentence enrichment region is input into the scientific value sentence recognition model to obtain the scientific value sentence recognition result, i.e., the sentences with scientific value. The scientific value sentence recognition result is input into the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain the multi-type value discrimination results. Finally, the multi-type value discrimination results are input into the gating network to generate the comprehensive classification result of the scientific value sentences, which indicates, for each sentence with scientific value, whether it is an academic value sentence, an application value sentence or an innovation value sentence.
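This end-to-end flow could be wired together as in the following sketch; every callable here (the enrichment-region locator, the recognition model, the sentence encoder) is an assumed interface standing in for components whose implementations the text does not specify, and the classifier refers to the earlier illustrative ValueSentenceMoE module.

```python
def classify_document(document_text, locate_enriched_regions, recognition_model,
                      moe_classifier, encode):
    """Dual-model pipeline sketch: locate value-sentence enrichment regions,
    keep only sentences the recognition model accepts, then let the three
    experts and the gating network produce the comprehensive result."""
    results = []
    for region_sentences in locate_enriched_regions(document_text):
        for sentence in region_sentences:
            if not recognition_model(sentence):      # filter out non-value sentences
                continue
            combined, expert_scores, gate_weights = moe_classifier(encode(sentence))
            results.append({
                "sentence": sentence,
                "combined_score": float(combined),
                # academic / application / innovation value scores and gate weights
                "expert_scores": [float(s) for s in expert_scores],
                "gate_weights": [float(w) for w in gate_weights],
            })
    return results
```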
In the above, a hybrid expert multi-classification method in combination with an embedded dual-model organization architecture according to an embodiment of the present invention is described in detail with reference to fig. 1. Next, a hybrid expert multi-classification system incorporating an embedded dual-model organization architecture according to an embodiment of the present invention will be described with reference to fig. 2.
According to the mixed expert multi-classification system combined with the embedded double-model organization architecture, the technical problems that the processing efficiency is low and the accuracy is difficult to guarantee when the scientific literature value sentence classification method in the prior art is used for processing scientific literature are solved. By introducing a mixed expert mechanism and optimizing data distribution consistency, the classification accuracy of scientific value sentences can be effectively improved, and the technical effects of high-efficiency and accurate analysis of multidimensional academic, application and innovation values are realized. The mixed expert multi-classification system combined with the embedded double-model organization architecture comprises a scientific value sentence identification model construction module 11, a scientific value sentence multi-classification model construction module 12, a model encapsulation module 13, a scientific literature acquisition module 14 and a comprehensive classification result acquisition module 15.
The scientific value sentence recognition model construction module 11 is configured to perform deep learning on the pre-training model based on the data distribution consistency threshold and the scientific value sentence training corpus, and construct a scientific value sentence recognition model.
The scientific value sentence multi-classification model construction module 12 is configured to construct a scientific value sentence multi-classification model based on a hybrid expert mechanism, where the scientific value sentence multi-classification model includes an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model, and a gating network.
The model packaging module 13 is configured to package the scientific value sentence recognition model and the scientific value sentence multi-classification model into a scientific value sentence detection dual model.
A scientific literature acquisition module 14 for acquiring a scientific literature.
And the comprehensive classification result acquisition module 15 is used for inputting the scientific and technological literature into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences.
Next, the specific configuration of the scientific value sentence recognition model construction module 11 will be described in detail. The scientific value sentence recognition model construction module 11 may be further configured to perform deep learning on the pre-training model based on a data distribution consistency threshold and a scientific value sentence training corpus to construct a scientific value sentence recognition model, wherein the scientific value sentence training corpus comprises a scientific value sentence sample set and a non-scientific value sentence sample set. Dividing the scientific value sentence training corpus according to a preset proportion to obtain a first corpus training set, a first corpus testing set and a first corpus verification set. And carrying out data distribution optimization on the first corpus training set, the first corpus testing set and the first corpus verification set based on the data distribution consistency threshold value to obtain a second corpus training set, a second corpus testing set and a second corpus verification set. And training, testing and verifying the pre-training model based on the second corpus training set, the second corpus testing set and the second corpus verification set to generate the scientific value sentence identification model.
Next, the specific configuration of the scientific value sentence recognition model construction module 11 will be described in further detail. The scientific value sentence recognition model building module 11 further performs data distribution optimization on the first corpus training set, the first corpus testing set and the first corpus verification set based on the data distribution consistency threshold value to obtain a second corpus training set, a second corpus testing set and a second corpus verification set, and includes performing positive and negative sample distribution calculation on the first corpus training set, the first corpus testing set and the first corpus verification set respectively to obtain a first data distribution coefficient, a second data distribution coefficient and a third data distribution coefficient. And based on the first data distribution coefficient, the second data distribution coefficient and the third data distribution coefficient, performing data distribution consistency evaluation on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a data distribution consistency coefficient. And judging whether the data distribution consistency coefficient is larger than the data distribution consistency threshold value. And if the data distribution consistency coefficient is larger than the data distribution consistency threshold, carrying out data distribution adjustment on the first corpus training set, the first corpus testing set and the first corpus verification set according to the data distribution consistency threshold to generate the second corpus training set, the second corpus testing set and the second corpus verification set.
Next, the specific configuration of the scientific value sentence multi-classification model construction module 12 will be described in detail. The scientific value sentence multi-classification model construction module 12 may further include constructing a scientific value sentence multi-classification model based on a hybrid expert mechanism, including obtaining a scientific value sentence multi-classification index, wherein the scientific value sentence multi-classification index includes an academic value, an application value, and an innovation value. Based on the scientific value sentence multi-classification index, academic value sentence scoring corpus, application value sentence scoring corpus and innovation value sentence scoring corpus are loaded. Training the academic value expert sub-model based on the academic value sentence scoring corpus. And training the application value expert sub-model based on the application value sentence scoring corpus. And training the innovation value expert sub-model based on the innovation value sentence scoring corpus. And constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model based on a gating loss function. And taking the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model as multi-classification parallel nodes. And connecting the multi-classification parallel nodes with the gating network to generate the scientific value sentence multi-classification model.
Next, the specific configuration of the scientific value sentence multi-classification model construction module 12 will be described in detail. The scientific value sentence multi-classification model construction module 12 further includes constructing the gating network from the academic value expert sub-model, the application value expert sub-model, and the innovation value expert sub-model based on a gating loss function, including collecting output samples of the academic value expert sub-model, the application value expert sub-model, and the innovation value expert sub-model, to obtain an expert output sample set. And acquiring the discrimination probability parameters corresponding to the expert output sample set to obtain discrimination probability distribution. The minimum gating loss is used as a gating network training target. And performing unsupervised training on the expert output sample set and the discrimination probability distribution based on the gating loss function and the gating network training target to generate the gating network.
The specific configuration of the scientific value sentence-multi-classification model construction module 12 will be described in further detail below. The scientific value sentence multi-classification model construction module 12 further includes the gating loss function:
L_gate = L_JS + λ · R_g
where L_gate denotes the gating loss function, L_JS denotes the unsupervised loss, λ denotes the predetermined balance weight, and R_g denotes the regularization term to be minimized.
Next, the specific configuration of the comprehensive classification result acquisition module 15 will be described in detail. The comprehensive classification result acquisition module 15 is further configured to input the scientific literature into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences, which includes performing scientific value sentence enrichment region positioning based on the scientific literature to obtain a value sentence enrichment region. And inputting the value sentence enrichment area into the scientific value sentence recognition model to obtain a scientific value sentence recognition result. And inputting the scientific value sentence recognition result into the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain a multi-type value discrimination result. Inputting the multi-type value discrimination results into the gating network to generate the comprehensive classification result of the scientific value sentence.
The mixed expert multi-classification system combined with the embedded double-model organization architecture provided by the embodiment of the invention can execute the mixed expert multi-classification method combined with the embedded double-model organization architecture provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the method.
Although the present application makes various references to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a user terminal and/or server, including individual units and modules that are merely partitioned by functional logic, but are not limited to the above-described partitioning, as long as the corresponding functionality is enabled. In addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present application.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411805132.1A CN119646224B (en) | 2024-12-10 | 2024-12-10 | Hybrid expert multi-classification method and system combined with embedded dual-model organization architecture |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411805132.1A CN119646224B (en) | 2024-12-10 | 2024-12-10 | Hybrid expert multi-classification method and system combined with embedded dual-model organization architecture |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN119646224A true CN119646224A (en) | 2025-03-18 |
| CN119646224B CN119646224B (en) | 2025-07-22 |
Family
ID=94939625
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411805132.1A Active CN119646224B (en) | 2024-12-10 | 2024-12-10 | Hybrid expert multi-classification method and system combined with embedded dual-model organization architecture |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN119646224B (en) |
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115510971A (en) * | 2022-09-26 | 2022-12-23 | 北京智谱华章科技有限公司 | Paper classification method based on graph neural network with mixed expert structure |
| WO2024114659A1 (en) * | 2022-11-29 | 2024-06-06 | 华为技术有限公司 | Summary generation method and related device |
| CN116362240A (en) * | 2023-01-13 | 2023-06-30 | 北京百度网讯科技有限公司 | Text processing method, device, equipment and medium |
| CN117612520A (en) * | 2023-12-01 | 2024-02-27 | 京东城市(北京)数字科技有限公司 | Abnormal session problem identification method, device, electronic device and readable medium |
| CN118468138A (en) * | 2024-06-27 | 2024-08-09 | 中国科学技术大学 | A multimodal sentiment analysis model construction method, analysis model and analysis method |
Non-Patent Citations (2)
| Title |
|---|
| Wang Jue, Dai Ruwei: "A hybrid expert system integrating connectionist and symbolic mechanisms", Information and Control, no. 06, 23 December 1994 (1994-12-23), pages 321 - 325 * |
| Lu Jun; Liang Yinghong; Lu Yuqing; Li Bin; Yao Jianmin: "Application of multi-classifier fusion technology in automatic essay scoring", Microelectronics & Computer, no. 10, 5 October 2009 (2009-10-05), pages 75 - 79 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN119646224B (en) | 2025-07-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN113704389A (en) | Data evaluation method and device, computer equipment and storage medium | |
| CN118520035B (en) | Meteorological service platform data management method and system based on artificial intelligence | |
| CN112200208B (en) | Cloud workflow task execution time prediction method based on multi-dimensional feature fusion | |
| CN113468291A (en) | Patent network representation learning-based automatic patent classification method | |
| CN113032367A (en) | Dynamic load scene-oriented cross-layer configuration parameter collaborative tuning method and system for big data system | |
| CN120011864B (en) | Emotion metaphor recognition method and device based on knowledge extraction and collaborative evolution reasoning | |
| CN114897451A (en) | Double-layer clustering correction method and device considering key features of demand response user | |
| CN118335189A (en) | Single-cell deep clustering method fused with variational graph attention autoencoder | |
| CN107066328A (en) | The construction method of large-scale data processing platform | |
| CN112487819A (en) | Method, system, electronic device and storage medium for identifying homonyms among enterprises | |
| Li et al. | Symbolic expression transformer: A computer vision approach for symbolic regression | |
| CN112668633A (en) | Adaptive graph migration learning method based on fine granularity field | |
| CN112860882B (en) | Book concept front-rear order relation extraction method based on neural network | |
| CN119646224B (en) | Hybrid expert multi-classification method and system combined with embedded dual-model organization architecture | |
| CN109614486B (en) | A service automatic push system and method based on natural language processing technology | |
| CN114529191B (en) | Method and device for risk identification | |
| CN111127184B (en) | A Distributed Combination Credit Evaluation Method | |
| CN110348479A (en) | A kind of Prediction of Stock Index method, system, device and medium propagated based on neighbour | |
| CN115964953A (en) | Power grid digital resource modeling management method based on meta-learning | |
| CN115269844A (en) | Model processing method and device, electronic equipment and storage medium | |
| CN112215297A (en) | Production and manufacturing data hierarchical clustering method based on factor analysis | |
| CN112015659A (en) | Prediction method and device based on network model | |
| Li et al. | Using Machine Learning to Optimize Resource Allocation and Management Strategies in Colleges and Universities | |
| CN120561291B (en) | Low-altitude economics teaching text classification method and system based on topic model | |
| CN120450041A (en) | Industry large model automatic optimization method and system based on artificial intelligence |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |