CN119646224A - Hybrid expert multi-classification method and system combined with embedded dual-model organizational architecture - Google Patents
Hybrid expert multi-classification method and system combined with embedded dual-model organizational architecture
- Publication number
- CN119646224A (application CN202411805132.1A)
- Authority
- CN
- China
- Prior art keywords
- value
- model
- scientific
- corpus
- sentence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a mixed expert multi-classification method and system combined with an embedded double-model organization architecture, applied to the technical field of data processing. Deep learning is performed on a pre-trained model based on a data distribution consistency threshold and a scientific value sentence training corpus to construct a scientific value sentence recognition model. A scientific value sentence multi-classification model is built based on a mixed expert mechanism and comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network. The scientific value sentence recognition model and the scientific value sentence multi-classification model are packaged into a scientific value sentence detection double model, and scientific literature is obtained. The scientific literature is input into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences. This solves the technical problems that existing scientific literature value sentence classification methods in the prior art have low processing efficiency and accuracy that is difficult to guarantee when processing scientific literature.
Description
Technical Field
The invention relates to the technical field of data processing, in particular to a mixed expert multi-classification method and system combined with an embedded double-model organization architecture.
Background
With the rapid growth of scientific and technological literature, efficiently and accurately identifying and classifying scientific value sentences in documents, such as academic value, application value and innovation value sentences, has become an important subject in scientific research and literature management. Traditional document classification methods rely on manual labeling and rule setting; they can partially meet the requirements, but their efficiency is low and their accuracy is difficult to guarantee in large-scale document processing.
Therefore, the existing scientific literature value sentence classification methods in the prior art suffer from the technical problems of low processing efficiency and accuracy that is difficult to guarantee when processing scientific literature.
Disclosure of Invention
The application solves the technical problems of low processing efficiency and hard-to-guarantee accuracy in prior-art processing of scientific literature by providing a mixed expert multi-classification method and system combined with an embedded double-model organization architecture. By introducing a mixed expert mechanism and optimizing data distribution consistency, the classification accuracy of scientific value sentences can be effectively improved, achieving the technical effects of efficient and accurate analysis of multidimensional academic, application and innovation values.
The application provides a mixed expert multi-classification method combined with an embedded double-model organization architecture, which comprises the following steps: and performing deep learning on the pre-training model based on the data distribution consistency threshold and the scientific value sentence training corpus, and constructing a scientific value sentence recognition model. Based on a mixed expert mechanism, a scientific value sentence multi-classification model is built, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network. And packaging the scientific value sentence recognition model and the scientific value sentence multi-classification model into a scientific value sentence detection double model. Obtaining scientific and technological literature. And inputting the scientific and technological literature into the scientific and valuable sentence detection double model to obtain a comprehensive classification result of the scientific and valuable sentence.
In an implementation mode, deep learning is conducted on the pre-training model based on a data distribution consistency threshold and scientific value sentence training corpus, and a scientific value sentence recognition model is built, wherein the scientific value sentence training corpus comprises a scientific value sentence sample set and a non-scientific value sentence sample set. Dividing the scientific value sentence training corpus according to a preset proportion to obtain a first corpus training set, a first corpus testing set and a first corpus verification set. And carrying out data distribution optimization on the first corpus training set, the first corpus testing set and the first corpus verification set based on the data distribution consistency threshold value to obtain a second corpus training set, a second corpus testing set and a second corpus verification set. And training, testing and verifying the pre-training model based on the second corpus training set, the second corpus testing set and the second corpus verification set to generate the scientific value sentence identification model.
In an implementation mode, based on the data distribution consistency threshold, data distribution optimization is performed on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a second corpus training set, a second corpus testing set and a second corpus verification set, wherein positive and negative sample distribution calculation is performed on the first corpus training set, the first corpus testing set and the first corpus verification set respectively to obtain a first data distribution coefficient, a second data distribution coefficient and a third data distribution coefficient. And based on the first data distribution coefficient, the second data distribution coefficient and the third data distribution coefficient, performing data distribution consistency evaluation on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a data distribution consistency coefficient. And judging whether the data distribution consistency coefficient is larger than the data distribution consistency threshold value. And if the data distribution consistency coefficient is larger than the data distribution consistency threshold, carrying out data distribution adjustment on the first corpus training set, the first corpus testing set and the first corpus verification set according to the data distribution consistency threshold to generate the second corpus training set, the second corpus testing set and the second corpus verification set.
In an implementation mode, a scientific value sentence multi-classification model is built based on a mixed expert mechanism, and the method comprises the steps of obtaining scientific value sentence multi-classification indexes, wherein the scientific value sentence multi-classification indexes comprise academic value, application value and innovation value. Based on the scientific value sentence multi-classification index, academic value sentence scoring corpus, application value sentence scoring corpus and innovation value sentence scoring corpus are loaded. Training the academic value expert sub-model based on the academic value sentence scoring corpus. And training the application value expert sub-model based on the application value sentence scoring corpus. And training the innovation value expert sub-model based on the innovation value sentence scoring corpus. And constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model based on a gating loss function. And taking the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model as multi-classification parallel nodes. And connecting the multi-classification parallel nodes with the gating network to generate the scientific value sentence multi-classification model.
In an implementation, constructing the gating network based on a gating loss function according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model comprises collecting output samples of the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain an expert output sample set. And acquiring the discrimination probability parameters corresponding to the expert output sample set to obtain discrimination probability distribution. The minimum gating loss is used as a gating network training target. And performing unsupervised training on the expert output sample set and the discrimination probability distribution based on the gating loss function and the gating network training target to generate the gating network.
In an implementation, the gating loss function is:
L_gate = L_JS + λ · R_g
where L_gate denotes the gating loss function, L_JS denotes the unsupervised loss, λ denotes the predetermined balance weight, and R_g denotes the regularization term to be minimized.
In an implementation mode, the scientific literature is input into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences, wherein the method comprises the steps of positioning scientific value sentence enrichment regions based on the scientific literature to obtain a value sentence enrichment region. And inputting the value sentence enrichment area into the scientific value sentence recognition model to obtain a scientific value sentence recognition result. And inputting the scientific value sentence recognition result into the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain a multi-type value discrimination result. Inputting the multi-type value discrimination results into the gating network to generate the comprehensive classification result of the scientific value sentence.
The application also provides a hybrid expert multi-classification system incorporating an embedded double-model organizational architecture, characterized in that the system comprises:
The scientific value sentence recognition model construction module is used for carrying out deep learning on the pre-training model based on the data distribution consistency threshold value and the scientific value sentence training corpus to construct a scientific value sentence recognition model.
The scientific value sentence multi-classification model construction module is used for constructing a scientific value sentence multi-classification model based on a mixed expert mechanism, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network.
And the model packaging module is used for packaging the scientific value sentence identification model and the scientific value sentence multi-classification model into a scientific value sentence detection double model.
And the scientific literature acquisition module is used for acquiring scientific literature.
And the comprehensive classification result acquisition module is used for inputting the scientific and technological literature into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences.
The mixed expert multi-classification method and system combining the embedded double-model organization architecture proposed by the application perform deep learning on the pre-trained model based on the data distribution consistency threshold and the scientific value sentence training corpus to construct the scientific value sentence recognition model. Based on a mixed expert mechanism, a scientific value sentence multi-classification model is built, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network. The scientific value sentence recognition model and the scientific value sentence multi-classification model are packaged into a scientific value sentence detection double model. Scientific literature is obtained and input into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences. This solves the technical problems that the existing scientific literature value sentence classification methods in the prior art have low processing efficiency and accuracy that is difficult to guarantee when processing scientific literature. By introducing a mixed expert mechanism and optimizing data distribution consistency, the classification accuracy of scientific value sentences can be effectively improved, and the technical effects of efficient and accurate analysis of multidimensional academic, application and innovation values are realized.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings of the embodiments of the present disclosure will be briefly described below. It is apparent that the figures in the following description relate only to some embodiments of the present disclosure and are not limiting of the present disclosure.
FIG. 1 is a flow chart of a hybrid expert multi-classification method incorporating embedded dual model organization architecture according to the present invention;
FIG. 2 is a schematic diagram of a hybrid expert multi-classification system with embedded dual-model organization architecture according to an embodiment of the present application;
Reference numerals illustrate a scientific value sentence recognition model construction module 11, a scientific value sentence multi-classification model construction module 12, a model packaging module 13, a scientific literature acquisition module 14 and a comprehensive classification result acquisition module 15.
Detailed Description
The foregoing is only an overview of the technical solutions of the present application. In order that the technical means of the application may be more clearly understood and implemented in accordance with the description, and in order to make the above and other objects, features and advantages of the application more readily apparent, the application is described in detail below with reference to specific embodiments.
The present application will be described in further detail below with reference to the accompanying drawings, in order to make the objects, technical solutions and advantages of the present application more apparent, and the described embodiments should not be construed as limiting the present application, but all other embodiments obtained by those skilled in the art without making any inventive effort are within the scope of the present application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments; it is to be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments and may be combined with each other without conflict. The term "first/second" is used merely to distinguish similar objects and does not represent a particular ordering of the objects. The terms "comprises", "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article or apparatus. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains. The terminology used herein is for the purpose of describing embodiments of the application only.
The embodiment of the application provides a mixed expert multi-classification method and a system combined with an embedded double-model organization architecture, as shown in fig. 1, wherein the method comprises the following steps:
And performing deep learning on the pre-training model based on the data distribution consistency threshold and the scientific value sentence training corpus, and constructing a scientific value sentence recognition model.
Based on a mixed expert mechanism, a scientific value sentence multi-classification model is built, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network.
In the scientific literature, knowledge is not uniformly distributed, but rather exhibits a certain concentration and regularity. In order to accurately classify academic value sentences, application value sentences and innovation value sentences of scientific and technical literature, deep learning is carried out on a pre-training model based on a data distribution consistency threshold and scientific value sentence training corpus, and a scientific value sentence recognition model is constructed, wherein the data distribution consistency threshold is a preset numerical value and is used for evaluating whether sample distribution of training, testing and verifying a data set is balanced or not. When the category distribution of the dataset does not coincide with the category distribution of the overall dataset, the dataset is adjusted to meet the threshold. The scientific sentence training corpus is a specially prepared text set for training a scientific sentence recognition model, and comprises sentence samples marked as having or not having scientific value. And then, based on a mixed expert mechanism, constructing a scientific value sentence multi-classification model, wherein the scientific value sentence multi-classification model comprises an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model and a gating network. The hybrid expert mechanism is an integrated learning framework in which multiple sub-models (experts) evaluate and classify different classes of scientific value sentences according to their expertise. By combining multiple expert models and learning how to adaptively assign to different experts based on input, adaptive decomposition and modeling of tasks is achieved. Through the cooperation of a plurality of expert models, the input is subjected to characterization learning and decision output from different perspectives.
The method provided by the embodiment of the application further comprises the step that the scientific value sentence training corpus comprises a scientific value sentence sample set and a non-scientific value sentence sample set. Dividing the scientific value sentence training corpus according to a preset proportion to obtain a first corpus training set, a first corpus testing set and a first corpus verification set. And carrying out data distribution optimization on the first corpus training set, the first corpus testing set and the first corpus verification set based on the data distribution consistency threshold value to obtain a second corpus training set, a second corpus testing set and a second corpus verification set. And training, testing and verifying the pre-training model based on the second corpus training set, the second corpus testing set and the second corpus verification set to generate the scientific value sentence identification model.
The scientific value sentence training corpus consists of example sentences selected from scientific literature and labeled as having or not having scientific value, i.e., it comprises a scientific value sentence sample set and a non-scientific value sentence sample set. The scientific value sentence training corpus is divided according to a predetermined proportion to obtain a first corpus training set, a first corpus testing set and a first corpus verification set, which are respectively used for training, testing and verifying the model. The predetermined proportion is a preset data division ratio, for example 70% training, 15% testing and 15% verification. Further, based on the data distribution consistency threshold, data distribution optimization is performed on the first corpus training set, the first corpus testing set and the first corpus verification set so as to ensure that each category is evenly distributed across the training, testing and verification sets, yielding a second corpus training set, a second corpus testing set and a second corpus verification set. The pre-trained model, an initial neural network model, is then trained on the second corpus training set, tested on the second corpus testing set, and verified with the second corpus verification set; verification passes when the output meets a preset accuracy, and the scientific value sentence recognition model is obtained. The scientific value sentence recognition model is mainly used for judging whether an input sentence is a scientific value sentence. This layer of the model mainly filters out a large number of irrelevant sentences and provides a high-quality corpus for the subsequent classification tasks.
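As a minimal, non-authoritative sketch of the predetermined-proportion split described above (the 70/15/15 ratio comes from this paragraph; the shuffling, random seed and sample format are assumptions):

```python
import random

def split_corpus(samples, train_ratio=0.70, test_ratio=0.15, seed=42):
    """Split labelled value-sentence samples into the first corpus training,
    testing and verification sets according to a predetermined proportion."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    n_test = int(len(shuffled) * test_ratio)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]  # remaining ~15% used for verification
    return train, test, val
```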
And respectively carrying out positive and negative sample distribution calculation on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a first data distribution coefficient, a second data distribution coefficient and a third data distribution coefficient. And based on the first data distribution coefficient, the second data distribution coefficient and the third data distribution coefficient, performing data distribution consistency evaluation on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a data distribution consistency coefficient. And judging whether the data distribution consistency coefficient is larger than the data distribution consistency threshold value. And if the data distribution consistency coefficient is larger than the data distribution consistency threshold, carrying out data distribution adjustment on the first corpus training set, the first corpus testing set and the first corpus verification set according to the data distribution consistency threshold to generate the second corpus training set, the second corpus testing set and the second corpus verification set.
Based on the data distribution consistency threshold, data distribution optimization is performed on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain the second corpus training set, the second corpus testing set and the second corpus verification set. Positive and negative sample distribution calculation is performed on the first corpus training set, the first corpus testing set and the first corpus verification set respectively to obtain a first data distribution coefficient, a second data distribution coefficient and a third data distribution coefficient, wherein the first data distribution coefficient = (number of scientific value sentence samples in the first corpus training set) ÷ (number of non-scientific value sentence samples in the first corpus training set); the second and third data distribution coefficients are calculated in the same way for the testing set and the verification set. Data distribution consistency evaluation is then performed on the first corpus training set, the first corpus testing set and the first corpus verification set based on the three coefficients: the differences between the first data distribution coefficient and, respectively, the second and third data distribution coefficients are calculated to obtain the data distribution consistency coefficient. Whether the data distribution consistency coefficient is larger than the data distribution consistency threshold is then judged. The data distribution consistency threshold is a preset maximum threshold on the distribution difference; when the data distribution consistency coefficient is smaller than or equal to the threshold, the data distribution consistency of the data sets is high, and when it is larger, the consistency is low and data distribution optimization is needed. If the data distribution consistency coefficient is larger than the data distribution consistency threshold, distribution compensation adjustment is carried out on the first corpus training set, the first corpus testing set and the first corpus verification set, supplementing data for the existing difference proportion until it is smaller than the data distribution consistency threshold, thereby generating the second corpus training set, the second corpus testing set and the second corpus verification set.
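How the distribution coefficients and the consistency check above might be computed is sketched below. The way the pairwise differences are aggregated into a single consistency coefficient (here, the larger of the two gaps), the threshold value and the binary label convention are assumptions, since the text does not fix them.

```python
def distribution_coefficient(samples):
    """Positive/negative ratio of one subset: number of scientific value sentence
    samples divided by number of non-scientific value sentence samples."""
    pos = sum(1 for s in samples if s["label"] == 1)
    neg = sum(1 for s in samples if s["label"] == 0)
    return pos / max(neg, 1)

def consistency_coefficient(train, test, val):
    """Data distribution consistency coefficient: the larger gap between the
    training-set ratio and the testing-/verification-set ratios."""
    c1, c2, c3 = (distribution_coefficient(s) for s in (train, test, val))
    return max(abs(c1 - c2), abs(c1 - c3))

def needs_rebalancing(train, test, val, threshold=0.05):
    """True when the subsets should be supplemented/adjusted, i.e. the
    consistency coefficient exceeds the preset consistency threshold."""
    return consistency_coefficient(train, test, val) > threshold
```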
The method provided by the embodiment of the application further comprises the step of obtaining scientific value sentence multi-classification indexes, wherein the scientific value sentence multi-classification indexes comprise academic value, application value and innovation value. Based on the scientific value sentence multi-classification index, academic value sentence scoring corpus, application value sentence scoring corpus and innovation value sentence scoring corpus are loaded. Training the academic value expert sub-model based on the academic value sentence scoring corpus. And training the application value expert sub-model based on the application value sentence scoring corpus. And training the innovation value expert sub-model based on the innovation value sentence scoring corpus. And constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model based on a gating loss function. And taking the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model as multi-classification parallel nodes. And connecting the multi-classification parallel nodes with the gating network to generate the scientific value sentence multi-classification model.
Based on a mixed expert mechanism, a scientific value sentence multi-classification model is built, which comprises the steps of obtaining scientific value sentence multi-classification indexes, wherein the scientific value sentence multi-classification indexes comprise academic value, application value and innovation value. The academic value reflects the contribution of a sentence to the academic community, such as the proposal of a theory or the importance of experimental results. The application value reflects the potential or real benefit, in practical application, of the research or discovery described in the sentence. The innovation value evaluates the degree of innovation and originality of the study, method or result mentioned in the sentence.
And then, based on the scientific value sentence multi-classification index, loading academic value sentence scoring corpus, application value sentence scoring corpus and innovation value sentence scoring corpus, wherein the academic value sentence scoring corpus comprises sentences marked as high academic value and corresponding value scoring identifiers, such as sentences deeply discussing specific scientific questions. The application value sentence scoring corpus comprises sentences marked as high application value and corresponding value scoring identifiers, such as sentences describing research results with practical application prospects. The innovation value sentence scoring corpus comprises sentences marked as high innovation value and corresponding value scoring identifiers, such as sentences proposing a new theory or a new method. Further, the academic value expert sub-model is trained based on the academic value sentence scoring corpus. And training the application value expert sub-model based on the application value sentence scoring corpus. And training the innovation value expert sub-model based on the innovation value sentence scoring corpus. The academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model are all obtained after supervised training of a neural network model. And constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model based on a gating loss function. And taking the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model as multi-classification parallel nodes. And connecting the multi-classification parallel nodes with the gating network to generate the scientific value sentence multi-classification model. Given an input x, the scientific value sentence multi-classification model output y can be expressed as:
y = Σ_j g(x)_j · f_j(x)
where f_j(x) represents the output of the j-th expert model on input x and g(x)_j is the corresponding gating weight. The gating mechanism acts as a soft router, adaptively assigning weight to the different experts based on the characteristics of the input and combining their outputs in a weighted sum. g(x) is typically implemented using a Softmax function:
g(x)_j = exp(e_j) / Σ_k exp(e_k)
where e_j represents the gating logit of the j-th expert, which can be obtained by a linear transformation of x or a feed-forward network mapping. In the scientific value multi-classification task, three expert models, for academic value, application value and innovation value respectively, are set up in this scheme.
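The patent does not disclose the internal architecture of the expert sub-models or the gating network; the sketch below is only one minimal PyTorch realisation of the weighted combination y = Σ_j g(x)_j · f_j(x) with softmax gating, in which the linear experts and the embedding dimension are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ValueSentenceMoE(nn.Module):
    """Minimal mixture-of-experts head over sentence embeddings: three expert
    scorers (academic / application / innovation value) and a gating network
    that produces softmax weights g(x) from the same embedding."""

    def __init__(self, embed_dim: int, num_experts: int = 3):
        super().__init__()
        # Each expert f_j(x) maps the sentence embedding to one value score.
        self.experts = nn.ModuleList(
            [nn.Linear(embed_dim, 1) for _ in range(num_experts)]
        )
        # The gating network produces the logits e_j; g(x) = softmax(e).
        self.gate = nn.Linear(embed_dim, num_experts)

    def forward(self, x: torch.Tensor):
        expert_scores = torch.cat([f(x) for f in self.experts], dim=-1)  # f_j(x)
        gate_weights = F.softmax(self.gate(x), dim=-1)                   # g(x)_j
        combined = (gate_weights * expert_scores).sum(dim=-1)            # y = Σ_j g(x)_j f_j(x)
        return combined, expert_scores, gate_weights
```

For a batch of embeddings x with shape (batch, embed_dim), expert_scores and gate_weights have shape (batch, 3) and combined has shape (batch,).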
The method provided by the embodiment of the application further comprises the steps of collecting output samples of the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model, and obtaining an expert output sample set. And acquiring the discrimination probability parameters corresponding to the expert output sample set to obtain discrimination probability distribution. The minimum gating loss is used as a gating network training target. And performing unsupervised training on the expert output sample set and the discrimination probability distribution based on the gating loss function and the gating network training target to generate the gating network.
Based on a gating loss function, constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model, wherein the method comprises the steps of collecting output samples of the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain an expert output sample set. The expert output sample set is a set of output samples obtained from each expert sub-model (academic value, application value, innovation value expert sub-model). Each sample set contains scoring or classification results for a particular scientific value sentence that reflect the sentence's attributes in the respective value dimension. And acquiring the discrimination probability parameters corresponding to the expert output sample set to obtain discrimination probability distribution. And taking minimized gating loss as a gating network training target, performing unsupervised training on the expert output sample set and the discrimination probability distribution based on the gating loss function and the gating network training target, and generating the gating network.
The gating network plays a key routing role in the scientific value sentence multi-classification model: it aims to adaptively route each value sentence to the different experts according to its characteristics. Unlike the supervised training of the expert models, it is difficult for the gating network to obtain a direct supervisory signal, because the true expert assignment proportions for each value sentence are not known. Therefore, an unsupervised collaborative training mode is adopted, in which the outputs of the expert models serve as soft labels to guide the learning of the gating network. In each training batch, the input value sentences x_1, x_2, ..., x_b are first inferred separately with the three expert models, yielding their discrimination probabilities in the respective value dimensions: p_ij = Sigmoid(f_j(x_i)), with i = 1, ..., b and j = 1, 2, 3.
Then, taking these probability outputs as soft labels, JS-divergence matching is performed against the outputs of the gating network to obtain the unsupervised loss function of the gating network:
L_JS = (1/b) · Σ_{i=1..b} JS(p_i ∥ g(x_i))
where p_i = (p_i1, p_i2, p_i3) is the soft label formed by the expert discrimination probabilities for sentence x_i and g(x_i) is the corresponding gating output. The JS divergence is defined as:
JS(P ∥ Q) = ½ · KL(P ∥ M) + ½ · KL(Q ∥ M), with M = ½ · (P + Q)
where KL(· ∥ ·) denotes the Kullback-Leibler divergence.
By minimizing this loss, the gating network learns to adaptively weight the experts according to the characteristics of the problem, so that its weighted output is as close as possible to the discrimination probabilities of the individual experts. This collaborative learning mechanism lets the gating network and the expert models adapt to each other through iterative updates during training, ultimately improving the overall performance of the model.
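A rough sketch of this JS-divergence matching is given below; the renormalisation of the sigmoid outputs into a per-sentence soft-label distribution is an assumption the text leaves implicit, and the function names are illustrative.

```python
import torch

def js_divergence(p: torch.Tensor, q: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Jensen-Shannon divergence between two batches of discrete distributions
    (rows summing to 1): JS(P||Q) = 0.5*KL(P||M) + 0.5*KL(Q||M), M = (P+Q)/2."""
    p = p.clamp_min(eps)
    q = q.clamp_min(eps)
    m = 0.5 * (p + q)
    kl_pm = (p * (p / m).log()).sum(dim=-1)
    kl_qm = (q * (q / m).log()).sum(dim=-1)
    return 0.5 * kl_pm + 0.5 * kl_qm

def unsupervised_gate_loss(expert_probs: torch.Tensor, gate_weights: torch.Tensor) -> torch.Tensor:
    """L_JS: match the gating distribution g(x_i) to the expert soft labels p_i,
    with the sigmoid outputs p_ij renormalised per sentence so that both sides
    are probability distributions, then averaged over the batch."""
    soft_labels = expert_probs / expert_probs.sum(dim=-1, keepdim=True)
    return js_divergence(soft_labels, gate_weights).mean()
```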
During training, a regularization term based on expert entropy is also introduced to encourage the output of the gating network to be more selective over experts:
R_g = (1/b) · Σ_{i=1..b} H(g(x_i))
where H(·) denotes the entropy function. Minimizing the regularization term R_g pushes the gating network to select as few experts as possible for each problem, avoiding uniform averaging and improving expert utilization efficiency.
Finally, the overall training objective of the gating network is to minimize the unsupervised collaborative loss together with the expert entropy regularization term, giving the gating loss function:
L_gate = L_JS + λ · R_g
where L_gate denotes the gating loss function, L_JS denotes the unsupervised loss, λ denotes the predetermined balance weight, and R_g denotes the regularization term to be minimized. In summary, the expert models discriminate the different value types through supervised training on scientific value sentences, the gating network realizes an adaptive expert routing strategy through collaborative learning and entropy regularization, and, through collaborative optimization, the expert models and the gating network together form the multi-classification method for scientific value sentences.
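Building on the helpers from the previous sketch, the gating loss could be assembled as below; the value of λ and the batch-averaged form of R_g are assumptions, the text only calling λ a predetermined balance weight.

```python
import torch

# js_divergence and unsupervised_gate_loss are as defined in the preceding sketch.

def entropy_regularizer(gate_weights: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """R_g: mean entropy H(g(x_i)) of the gating distribution over the batch;
    minimising it encourages the gate to pick few experts per sentence."""
    g = gate_weights.clamp_min(eps)
    return -(g * g.log()).sum(dim=-1).mean()

def gating_loss(expert_probs: torch.Tensor, gate_weights: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    """L_gate = L_JS + lambda * R_g (lambda = 0.1 is only an illustrative value)."""
    return unsupervised_gate_loss(expert_probs, gate_weights) + lam * entropy_regularizer(gate_weights)
```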
And packaging the scientific value sentence recognition model and the scientific value sentence multi-classification model into a scientific value sentence detection double model. Obtaining scientific and technological literature. And inputting the scientific and technological literature into the scientific and valuable sentence detection double model to obtain a comprehensive classification result of the scientific and valuable sentence.
And packaging the scientific value sentence recognition model and the scientific value sentence multi-classification model into a scientific value sentence detection double model. Subsequently, scientific literature is obtained, for example an uploaded research paper on biodiversity. The scientific literature is input into the scientific value sentence detection double model and analyzed by it to obtain a comprehensive classification result of the scientific value sentences. This solves the technical problems that the existing scientific literature value sentence classification methods in the prior art have low processing efficiency and accuracy that is difficult to guarantee when processing scientific literature. By introducing a mixed expert mechanism and optimizing data distribution consistency, the classification accuracy of scientific value sentences can be effectively improved, and the technical effects of efficient and accurate analysis of multidimensional academic, application and innovation values are realized. Illustratively, given the embedding representation x of one scientific value sentence, the three expert models f_1(x), f_2(x) and f_3(x) output their discrimination scores in the three value dimensions, respectively. The gating network g(x) calculates the weight of each expert according to the semantic features of the sentence, and the comprehensive classification result is finally obtained:
y = g(x)_1 · f_1(x) + g(x)_2 · f_2(x) + g(x)_3 · f_3(x)
where f_1(x), f_2(x) and f_3(x) respectively represent the outputs of the academic value, application value and innovation value experts.
The method provided by the embodiment of the application further comprises the step of positioning the scientific value sentence enrichment region based on the scientific literature to obtain the value sentence enrichment region. And inputting the value sentence enrichment area into the scientific value sentence recognition model to obtain a scientific value sentence recognition result. And inputting the scientific value sentence recognition result into the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain a multi-type value discrimination result. Inputting the multi-type value discrimination results into the gating network to generate the comprehensive classification result of the scientific value sentence.
Inputting the scientific literature into the scientific value sentence detection double model to obtain the comprehensive classification result of the scientific value sentences proceeds as follows. Knowledge is not uniformly distributed in scientific literature, but shows a certain concentration and regularity: certain sections or locations often contain a large amount of key information and core knowledge, and are characterized by high knowledge density, large information content and important content; these areas are called knowledge enrichment regions. Scientific value sentence enrichment region positioning is performed based on the scientific literature to obtain the value sentence enrichment region. The value sentence enrichment region is input into the scientific value sentence recognition model to obtain the scientific value sentence recognition result, i.e., the sentences with scientific value. The scientific value sentence recognition result is input into the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain the multi-type value discrimination results. Finally, the multi-type value discrimination results are input into the gating network to generate the comprehensive classification result of the scientific value sentences, which indicates, for each sentence with scientific value, whether it is an academic value sentence, an application value sentence or an innovation value sentence.
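This end-to-end flow could be wired together as in the following sketch; every callable here (the enrichment-region locator, the recognition model, the sentence encoder) is an assumed interface standing in for components whose implementations the text does not specify, and the classifier refers to the earlier illustrative ValueSentenceMoE module.

```python
def classify_document(document_text, locate_enriched_regions, recognition_model,
                      moe_classifier, encode):
    """Dual-model pipeline sketch: locate value-sentence enrichment regions,
    keep only sentences the recognition model accepts, then let the three
    experts and the gating network produce the comprehensive result."""
    results = []
    for region_sentences in locate_enriched_regions(document_text):
        for sentence in region_sentences:
            if not recognition_model(sentence):      # filter out non-value sentences
                continue
            combined, expert_scores, gate_weights = moe_classifier(encode(sentence))
            results.append({
                "sentence": sentence,
                "combined_score": float(combined),
                # academic / application / innovation value scores and gate weights
                "expert_scores": [float(s) for s in expert_scores],
                "gate_weights": [float(w) for w in gate_weights],
            })
    return results
```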
In the above, a hybrid expert multi-classification method in combination with an embedded dual-model organization architecture according to an embodiment of the present invention is described in detail with reference to fig. 1. Next, a hybrid expert multi-classification system incorporating an embedded dual-model organization architecture according to an embodiment of the present invention will be described with reference to fig. 2.
According to the mixed expert multi-classification system combined with the embedded double-model organization architecture, the technical problems that the processing efficiency is low and the accuracy is difficult to guarantee when the scientific literature value sentence classification method in the prior art is used for processing scientific literature are solved. By introducing a mixed expert mechanism and optimizing data distribution consistency, the classification accuracy of scientific value sentences can be effectively improved, and the technical effects of high-efficiency and accurate analysis of multidimensional academic, application and innovation values are realized. The mixed expert multi-classification system combined with the embedded double-model organization architecture comprises a scientific value sentence identification model construction module 11, a scientific value sentence multi-classification model construction module 12, a model encapsulation module 13, a scientific literature acquisition module 14 and a comprehensive classification result acquisition module 15.
The scientific value sentence recognition model construction module 11 is configured to perform deep learning on the pre-training model based on the data distribution consistency threshold and the scientific value sentence training corpus, and construct a scientific value sentence recognition model.
The scientific value sentence multi-classification model construction module 12 is configured to construct a scientific value sentence multi-classification model based on a hybrid expert mechanism, where the scientific value sentence multi-classification model includes an academic value expert sub-model, an application value expert sub-model, an innovation value expert sub-model, and a gating network.
The model packaging module 13 is configured to package the scientific value sentence recognition model and the scientific value sentence multi-classification model into a scientific value sentence detection dual model.
A scientific literature acquisition module 14 for acquiring a scientific literature.
And the comprehensive classification result acquisition module 15 is used for inputting the scientific and technological literature into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences.
Next, the specific configuration of the scientific value sentence recognition model construction module 11 will be described in detail. The scientific value sentence recognition model construction module 11 may be further configured to perform deep learning on the pre-training model based on a data distribution consistency threshold and a scientific value sentence training corpus to construct a scientific value sentence recognition model, wherein the scientific value sentence training corpus comprises a scientific value sentence sample set and a non-scientific value sentence sample set. Dividing the scientific value sentence training corpus according to a preset proportion to obtain a first corpus training set, a first corpus testing set and a first corpus verification set. And carrying out data distribution optimization on the first corpus training set, the first corpus testing set and the first corpus verification set based on the data distribution consistency threshold value to obtain a second corpus training set, a second corpus testing set and a second corpus verification set. And training, testing and verifying the pre-training model based on the second corpus training set, the second corpus testing set and the second corpus verification set to generate the scientific value sentence identification model.
Next, the specific configuration of the scientific value sentence recognition model construction module 11 will be described in further detail. The scientific value sentence recognition model building module 11 further performs data distribution optimization on the first corpus training set, the first corpus testing set and the first corpus verification set based on the data distribution consistency threshold value to obtain a second corpus training set, a second corpus testing set and a second corpus verification set, and includes performing positive and negative sample distribution calculation on the first corpus training set, the first corpus testing set and the first corpus verification set respectively to obtain a first data distribution coefficient, a second data distribution coefficient and a third data distribution coefficient. And based on the first data distribution coefficient, the second data distribution coefficient and the third data distribution coefficient, performing data distribution consistency evaluation on the first corpus training set, the first corpus testing set and the first corpus verification set to obtain a data distribution consistency coefficient. And judging whether the data distribution consistency coefficient is larger than the data distribution consistency threshold value. And if the data distribution consistency coefficient is larger than the data distribution consistency threshold, carrying out data distribution adjustment on the first corpus training set, the first corpus testing set and the first corpus verification set according to the data distribution consistency threshold to generate the second corpus training set, the second corpus testing set and the second corpus verification set.
Next, the specific configuration of the scientific value sentence multi-classification model construction module 12 will be described in detail. The scientific value sentence multi-classification model construction module 12 may further include constructing a scientific value sentence multi-classification model based on a hybrid expert mechanism, including obtaining a scientific value sentence multi-classification index, wherein the scientific value sentence multi-classification index includes an academic value, an application value, and an innovation value. Based on the scientific value sentence multi-classification index, academic value sentence scoring corpus, application value sentence scoring corpus and innovation value sentence scoring corpus are loaded. Training the academic value expert sub-model based on the academic value sentence scoring corpus. And training the application value expert sub-model based on the application value sentence scoring corpus. And training the innovation value expert sub-model based on the innovation value sentence scoring corpus. And constructing the gating network according to the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model based on a gating loss function. And taking the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model as multi-classification parallel nodes. And connecting the multi-classification parallel nodes with the gating network to generate the scientific value sentence multi-classification model.
Next, the specific configuration of the scientific value sentence multi-classification model construction module 12 will be described in detail. The scientific value sentence multi-classification model construction module 12 further includes constructing the gating network from the academic value expert sub-model, the application value expert sub-model, and the innovation value expert sub-model based on a gating loss function, including collecting output samples of the academic value expert sub-model, the application value expert sub-model, and the innovation value expert sub-model, to obtain an expert output sample set. And acquiring the discrimination probability parameters corresponding to the expert output sample set to obtain discrimination probability distribution. The minimum gating loss is used as a gating network training target. And performing unsupervised training on the expert output sample set and the discrimination probability distribution based on the gating loss function and the gating network training target to generate the gating network.
The specific configuration of the scientific value sentence-multi-classification model construction module 12 will be described in further detail below. The scientific value sentence multi-classification model construction module 12 further includes the gating loss function:
L_gate = L_JS + λ · R_g
where L_gate denotes the gating loss function, L_JS denotes the unsupervised loss, λ denotes the predetermined balance weight, and R_g denotes the regularization term to be minimized.
Next, the specific configuration of the comprehensive classification result acquisition module 15 will be described in detail. The comprehensive classification result acquisition module 15 is further configured to input the scientific literature into the scientific value sentence detection double model to obtain a comprehensive classification result of the scientific value sentences, which includes performing scientific value sentence enrichment region positioning based on the scientific literature to obtain a value sentence enrichment region. And inputting the value sentence enrichment area into the scientific value sentence recognition model to obtain a scientific value sentence recognition result. And inputting the scientific value sentence recognition result into the academic value expert sub-model, the application value expert sub-model and the innovation value expert sub-model to obtain a multi-type value discrimination result. Inputting the multi-type value discrimination results into the gating network to generate the comprehensive classification result of the scientific value sentence.
The mixed expert multi-classification system combined with the embedded double-model organization architecture provided by the embodiment of the invention can execute the mixed expert multi-classification method combined with the embedded double-model organization architecture provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the method.
Although the present application makes various references to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a user terminal and/or server, including individual units and modules that are merely partitioned by functional logic, but are not limited to the above-described partitioning, as long as the corresponding functionality is enabled. In addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present application.
The above embodiments do not limit the scope of the present application. It will be apparent to those skilled in the art that various modifications, combinations, and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application should be included in the scope of the present application.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411805132.1A CN119646224B (en) | 2024-12-10 | 2024-12-10 | Hybrid expert multi-classification method and system combined with embedded dual-model organization architecture |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202411805132.1A CN119646224B (en) | 2024-12-10 | 2024-12-10 | Hybrid expert multi-classification method and system combined with embedded dual-model organization architecture |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN119646224A true CN119646224A (en) | 2025-03-18 |
| CN119646224B CN119646224B (en) | 2025-07-22 |
Family
ID=94939625
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202411805132.1A Active CN119646224B (en) | 2024-12-10 | 2024-12-10 | Hybrid expert multi-classification method and system combined with embedded dual-model organization architecture |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN119646224B (en) |
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115510971A (en) * | 2022-09-26 | 2022-12-23 | 北京智谱华章科技有限公司 | Paper classification method based on graph neural network with mixed expert structure |
| WO2024114659A1 (en) * | 2022-11-29 | 2024-06-06 | 华为技术有限公司 | Summary generation method and related device |
| CN116362240A (en) * | 2023-01-13 | 2023-06-30 | 北京百度网讯科技有限公司 | Text processing method, device, equipment and medium |
| CN117612520A (en) * | 2023-12-01 | 2024-02-27 | 京东城市(北京)数字科技有限公司 | Abnormal session problem identification method, device, electronic device and readable medium |
| CN118468138A (en) * | 2024-06-27 | 2024-08-09 | 中国科学技术大学 | A multimodal sentiment analysis model construction method, analysis model and analysis method |
Non-Patent Citations (2)
| Title |
|---|
| Wang Jue, Dai Ruwei: "A hybrid expert system integrating connectionist and symbolic mechanisms", Information and Control, no. 06, 23 December 1994 (1994-12-23), pages 321 - 325 * |
| Lu Jun; Liang Yinghong; Lu Yuqing; Li Bin; Yao Jianmin: "Application of multi-classifier fusion technology in automatic essay scoring", Microelectronics & Computer, no. 10, 5 October 2009 (2009-10-05), pages 75 - 79 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN119646224B (en) | 2025-07-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN113704389A (en) | Data evaluation method and device, computer equipment and storage medium | |
| CN118520035B (en) | Meteorological service platform data management method and system based on artificial intelligence | |
| CN112200208B (en) | Cloud workflow task execution time prediction method based on multi-dimensional feature fusion | |
| CN113468291A (en) | Patent network representation learning-based automatic patent classification method | |
| CN113032367A (en) | Dynamic load scene-oriented cross-layer configuration parameter collaborative tuning method and system for big data system | |
| CN120011864B (en) | Emotion metaphor recognition method and device based on knowledge extraction and collaborative evolution reasoning | |
| CN114897451A (en) | Double-layer clustering correction method and device considering key features of demand response user | |
| CN118335189A (en) | Single-cell deep clustering method fused with variational graph attention autoencoder | |
| CN107066328A (en) | The construction method of large-scale data processing platform | |
| CN112487819A (en) | Method, system, electronic device and storage medium for identifying homonyms among enterprises | |
| Li et al. | Symbolic expression transformer: A computer vision approach for symbolic regression | |
| CN112668633A (en) | Adaptive graph migration learning method based on fine granularity field | |
| CN112860882B (en) | Book concept front-rear order relation extraction method based on neural network | |
| CN119646224B (en) | Hybrid expert multi-classification method and system combined with embedded dual-model organization architecture | |
| CN109614486B (en) | A service automatic push system and method based on natural language processing technology | |
| CN114529191B (en) | Method and device for risk identification | |
| CN111127184B (en) | A Distributed Combination Credit Evaluation Method | |
| CN110348479A (en) | A kind of Prediction of Stock Index method, system, device and medium propagated based on neighbour | |
| CN115964953A (en) | Power grid digital resource modeling management method based on meta-learning | |
| CN115269844A (en) | Model processing method and device, electronic equipment and storage medium | |
| CN112215297A (en) | Production and manufacturing data hierarchical clustering method based on factor analysis | |
| CN112015659A (en) | Prediction method and device based on network model | |
| Li et al. | Using Machine Learning to Optimize Resource Allocation and Management Strategies in Colleges and Universities | |
| CN120561291B (en) | Low-altitude economics teaching text classification method and system based on topic model | |
| CN120450041A (en) | Industry large model automatic optimization method and system based on artificial intelligence |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |