
CN116306923A - Evaluation weight calculation method based on knowledge graph - Google Patents

Info

Publication number
CN116306923A
Authority
CN
China
Prior art keywords
evaluation
knowledge
graph
evaluation index
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211647320.7A
Other languages
Chinese (zh)
Inventor
邢健豪
刘剑慰
冒泽慧
付鑫华
方志军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202211647320.7A
Publication of CN116306923A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/02: Knowledge representation; Symbolic representation
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/04: Inference or reasoning models
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a knowledge-graph-based method for calculating evaluation weights, comprising the following steps: building an evaluation knowledge graph and a keyword vector library from evaluation-related data; retrieving the matching indices from the index library, using the set evaluation target and the computed keyword vector similarity, to form an evaluation index system; converting the evaluation index system into a directed graph and, from the keyword vector similarity between nodes of the directed graph, computing a knowledge-reasoning probability that replaces the random-walk probability of the conventional PageRank algorithm, yielding an improved PageRank algorithm; and computing the PageRank value of each node with the improved PageRank algorithm and taking it as the objective weight of the corresponding evaluation index. The invention makes full use of expert knowledge and other data to uncover latent relationships among evaluation indices and automatically computes the weight of every index in the evaluation index system, so that the weights do not depend on subjective human judgment and the evaluation decision process becomes more efficient and objective.

Description

An Evaluation Weight Calculation Method Based on a Knowledge Graph

Technical Field

The invention relates to the technical fields of weight calculation and knowledge graphs, and in particular to a method for calculating evaluation weights based on a knowledge graph.

Background

A knowledge graph is a knowledge base built by collecting data from multiple sources; it is a collection of knowledge. The concept was formally introduced by Google in 2012 to strengthen its search engine and improve the user search experience. A knowledge graph is essentially a semantic network that represents real-world entities and the relationships between them in graph form: nodes represent entities, and edges between nodes represent relationships between entities. Knowledge graph technology comprises three main parts, namely knowledge representation, knowledge graph construction, and knowledge graph application, of which construction is the most critical step. Knowledge graphs have already shown great value in natural language understanding, big data analysis, intelligent question answering, the Internet of Things, and other fields, and are an important driver of future artificial intelligence development.

Evaluation weight calculation is the process of determining the importance of each index in an evaluation index system. The methods fall into three main categories: subjective methods, objective methods, and combined subjective-objective methods. Subjective methods include the analytic hierarchy process (AHP), the analytic network process (ANP), and the decision-making trial and evaluation laboratory (DEMATEL) method; objective methods include the entropy weight method and principal component analysis; combined methods use the two together to obtain a comprehensive weight. Weight calculation is an important step in the evaluation decision process and strongly affects the final result; reasonable weights make the evaluation more scientific and sound.

Most current research adopts combined subjective-objective methods, with subjective weights as the primary component and objective weights as a supplement; that is, it emphasizes the human role in the evaluation decision and applies an objective correction to it. However, most existing objective methods assign weights from the data of individual indices alone, by looking for patterns in their distribution and variation, which makes it difficult to uncover deeper relationships between indices.

Summary of the Invention

Purpose of the invention: to overcome the deficiencies of the prior art, the invention provides a knowledge-graph-based method for calculating evaluation weights that makes full use of expert knowledge and other data, uncovers latent relationships among evaluation indices, and automatically computes the weight of each index in the evaluation index system, so that the weights do not depend on subjective human judgment and the evaluation decision process becomes more efficient and objective.

Technical solution: to achieve the above purpose, the invention provides a knowledge-graph-based method for calculating evaluation weights, comprising the following steps:

S1: Build an evaluation knowledge graph and a keyword vector library from evaluation-related data; the evaluation knowledge graph serves as the index library of the evaluation index system, and the keyword vector library serves as the basis for index retrieval;

S2: Using the set evaluation target and the computed keyword vector similarity, retrieve the matching indices from the index library to form the evaluation index system;

S3: Convert the evaluation index system into a directed graph and, from the keyword vector similarity between nodes of the directed graph, compute a knowledge-reasoning probability that replaces the random-walk probability of the existing PageRank algorithm, forming an improved PageRank algorithm;

S4: Compute the PageRank value of each node with the improved PageRank algorithm and take it as the objective weight of the corresponding evaluation index.

Further, in step S1 the evaluation knowledge graph is built as follows: relevant text data are collected for the evaluation task, and the evaluation knowledge graph is built using knowledge graph construction techniques;

The keyword vector library is built as follows: a word embedding algorithm is applied to the collected text data to obtain a word vector library; the keywords of each knowledge graph node are then extracted, and their word vectors are looked up to form the keyword vector library.

Further, in step S2 the evaluation index system is formed as follows: according to the set evaluation target, nodes with specific labels are selected from the knowledge graph, and the keyword vector similarity between each such node and the evaluation target is computed to refine the retrieval; all evaluation indices that satisfy the conditions, together with the dependencies between them, are then collected to form the evaluation index system.

Further, the evaluation target is set by experts according to the evaluation task;

The keyword vector similarity is computed as:

Figure SMS_1

where the symbol shown in Figure SMS_2 denotes the keyword vector of the evaluation target, and the symbol shown in Figure SMS_3 denotes the keyword vector of a knowledge graph node.
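The similarity formula survives here only as an image reference (Figure SMS_1). A common choice for comparing word vectors is cosine similarity; the block below writes it out as an assumption, not as a transcription of the patent's formula. The symbols $\mathbf{a}$ (keyword vector of the evaluation target), $\mathbf{b}$ (keyword vector of a knowledge graph node) and $n$ (vector dimension) are illustrative names.

```latex
\mathrm{sim}(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{k=1}^{n} a_k b_k}
         {\sqrt{\sum_{k=1}^{n} a_k^2} \, \sqrt{\sum_{k=1}^{n} b_k^2}}
```

For instance, with $\mathbf{a} = (1, 0, 1)$ and $\mathbf{b} = (1, 1, 0)$ this gives $1 / (\sqrt{2} \cdot \sqrt{2}) = 0.5$.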

Further, step S3 is specifically:

A1: Construct the directed graph of the evaluation index system

Once the evaluation index system has been built, each evaluation index is mapped to a node of a directed graph and each directional dependency between indices is mapped to a directed edge, which converts the evaluation index system into a directed graph;

A2: Establish the improved random walk model

Define the directed graph of the evaluation index system as:

$G = (V, E)$

where $V$ and $E$ denote the sets of nodes and directed edges respectively, and each directed edge corresponds to an ordered pair of nodes;

A random walk model is essentially a Markov chain defined on a directed graph. Its Markov property means that the probability of moving from the current node state to the next node state depends only on the current node state. In the improved random walk model, a node no longer moves to the nodes it points to with equal probability but with the knowledge-reasoning probability. The directed graph of the evaluation index system is built on the knowledge graph, so each node of the directed graph (an evaluation index) corresponds to a node of the knowledge graph, and its keyword vector can be queried from the knowledge graph;

Define the knowledge similarity of nodes $v_i$ and $v_j$ in the directed graph $G$ of the evaluation index system as:

Figure SMS_4

where the vectors shown in Figure SMS_5 are the highest-ranked keyword vectors of nodes $v_i$ and $v_j$ in the knowledge graph;

Suppose node $j$ of the directed graph has $k$ outgoing directed edges; the state transition probability from node $j$ to each node can then be expressed through the knowledge similarity as:

Figure SMS_6

where $p'_{ij}$ is the knowledge-reasoning probability, i.e. the probability of moving from evaluation index $j$ to evaluation index $i$ when knowledge is used as the measure;

Suppose the directed graph $G$ has $m$ nodes; the state transition probability matrix of the improved random walk model is then:

Figure SMS_7

Clearly $p'_{ij} \ge 0$ and the condition shown in Figure SMS_8 holds, which are the properties required of a state transition probability matrix.
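The construction of the transition matrix can be sketched in a few lines of Python. The patent gives the transition formula only as an image (Figure SMS_6), so this sketch assumes the natural reading: the similarity between node $j$ and each of its $k$ successors is divided by the sum of those similarities, which makes column $j$ sum to 1. The function name `transition_matrix`, the edge encoding and the toy similarity values are all hypothetical, and similarities are assumed to be non-negative.

```python
def transition_matrix(m, edges, sim):
    """Build the m x m matrix P with P[i][j] = p'_ij (move from node j to node i).

    edges: iterable of (j, i) pairs, each a directed edge from node j to node i.
    sim:   function sim(i, j) returning a non-negative knowledge similarity.
    Assumption: p'_ij is sim(i, j) normalized over the successors of j.
    """
    P = [[0.0] * m for _ in range(m)]
    successors = {}
    for j, i in edges:
        successors.setdefault(j, set()).add(i)
    for j, targets in successors.items():
        total = sum(sim(i, j) for i in targets)
        if total <= 0.0:
            continue  # no usable similarity; column j stays zero
        for i in targets:
            P[i][j] = sim(i, j) / total
    return P


# Hypothetical 3-node example; the similarity values are made up.
toy_sim = {(1, 0): 0.6, (2, 0): 0.2, (2, 1): 0.5}
edges = [(0, 1), (0, 2), (1, 2)]  # 0 -> 1, 0 -> 2, 1 -> 2
P = transition_matrix(3, edges, lambda i, j: toy_sim[(i, j)])
```

Node 2 has no outgoing edge in this toy graph, so its column stays all zero; a node of this kind is exactly the case that the smoothing term of the PageRank step is meant to handle.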

Further, the PageRank value in step S4 is computed as follows:

A directed graph of the evaluation index system built from the knowledge graph does not necessarily satisfy the conditions of aperiodicity and strong connectivity, so its Markov chain does not necessarily have a stationary distribution.

For a general directed graph $G$ of the evaluation index system with $m$ nodes, let $P$ be the state transition probability matrix obtained from the improved random walk model. To obtain a stationary distribution of its Markov chain, a completely random walk model is defined on $G$ in addition to the improved random walk model; every element of its state transition probability matrix $P'$ equals the value shown in Figure SMS_9. A linear combination of $P$ and $P'$ forms a new state transition probability matrix, and the resulting Markov chain can be shown to have a stationary distribution $R$, which satisfies:

Figure SMS_10

where the linear combination coefficient $d$ ($0 \le d \le 1$) is the damping factor, chosen empirically; $E$ is the $m \times m$ all-ones matrix; $l$ is the $m$-dimensional vector whose components are all 1; and $R$ is the stationary distribution in the general case. The components of $R$ are the PageRank values of the nodes, i.e. the objective weights of the corresponding evaluation indices:

$R = [PR(v_1), PR(v_2), \ldots, PR(v_m)]^T, \quad v_1, v_2, \ldots, v_m \in V$

where $PR(v_i) > 0$ and the condition shown in Figure SMS_11 holds. The first term of the formula is the probability distribution over nodes, at stationarity, of a walk that follows the state transition probability matrix $P$; it carries weight $d$. The second term is the probability distribution over nodes, at stationarity, of a walk that follows $P'$ and moves to every node with equal probability; it carries weight $1-d$.

Intuitively, the second term acts as a smoothing term: with probability $1-d$ a node makes a completely random move, going to any node with the equal probability shown in Figure SMS_12. This guarantees that nodes with no outgoing edges can also make a state transition, which in turn ensures that the Markov chain has a stationary distribution.
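The stationary equation is available here only as an image reference (Figure SMS_10). Given the definitions of $d$, $E$, $l$ and $m$ in the text, it is plausibly the standard damped PageRank equation. The block below is a reconstruction under that assumption, taking the completely random walk to be $P' = \frac{1}{m}E$ and the components of $R$ to sum to 1, as expected of a probability distribution.

```latex
\begin{aligned}
R &= \left( dP + \frac{1-d}{m}\,E \right) R \\
  &= dPR + \frac{1-d}{m}\,ER \\
  &= dPR + \frac{1-d}{m}\,l,
  \qquad \text{since } ER = \Big( \sum_{i=1}^{m} PR(v_i) \Big)\, l = l .
\end{aligned}
```

The last step is what lets the all-ones matrix $E$ collapse to the all-ones vector $l$, which would explain why the text defines both symbols.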

Further, the calculation of the PageRank value in step S4 comprises the following steps:

B1: Compute the knowledge-based state transition probability matrix $P$ of the directed graph $G$ of the evaluation index system;

B2: Set $t = 0$ and choose an initial state distribution $x_0$;

B3: Compute the general state transition probability matrix $A$ of the directed graph

Figure SMS_13

B4: Iterate and normalize the result vector

$y_{t+1} = A x_t$

Figure SMS_14

B5: If $\|x_{t+1} - x_t\| \le \varepsilon$, set $R = x_t$ and stop iterating;

B6: Otherwise, set $t = t + 1$ and return to step B4;

B7: Normalize $R$ so that it represents a probability distribution;

The $R$ obtained after these steps gives the PageRank value of each node of the directed graph of the evaluation index system, i.e. the objective weight of each index.
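Steps B1 to B7 amount to a power iteration, which can be sketched as follows. Several details are assumptions rather than statements from the patent: the matrix in B3 is taken to be $A = dP + \frac{1-d}{m}E$ (the source shows it only as Figure SMS_13), the normalization in B4 divides by the sum of the components, the distance in B5 is the largest absolute component difference, and the defaults `d=0.85`, `eps=1e-8` and the safety cap `max_iter` are illustrative choices. The function name `improved_pagerank` and the example matrix are hypothetical.

```python
def improved_pagerank(P, d=0.85, eps=1e-8, max_iter=1000):
    """Power iteration over a column-stochastic matrix P (P[i][j]: move j -> i)."""
    m = len(P)
    # B3: combine the knowledge-based walk with the completely random walk.
    A = [[d * P[i][j] + (1.0 - d) / m for j in range(m)] for i in range(m)]
    # B2: start from the uniform distribution.
    x = [1.0 / m] * m
    for _ in range(max_iter):
        # B4: one multiplication step, then normalization.
        y = [sum(A[i][j] * x[j] for j in range(m)) for i in range(m)]
        total = sum(y)
        x_next = [value / total for value in y]
        # B5: stop once successive iterates are close enough, keeping R = x_t.
        if max(abs(a - b) for a, b in zip(x_next, x)) <= eps:
            break
        # B6: otherwise continue with the next iterate.
        x = x_next
    # B7: make sure the result is a probability distribution.
    total = sum(x)
    return [value / total for value in x]


# Hypothetical 3-node matrix: node 0 -> {1, 2}, node 1 -> {2}, node 2 has no out-edge.
P_example = [
    [0.00, 0.0, 0.0],
    [0.75, 0.0, 0.0],
    [0.25, 1.0, 0.0],
]
weights = improved_pagerank(P_example)
```

In this toy graph node 2 receives transitions from both other nodes and ends up with the largest weight, while node 0, which nothing points to, ends up with the smallest.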

Beneficial effects: compared with the prior art, the invention has the following advantages:

1. The invention builds an evaluation knowledge graph from data related to the evaluation task, together with a word vector library of the keywords of each node, and uses them as the basis for constructing the evaluation index system and calculating weights. This reduces human involvement in the evaluation process and improves evaluation efficiency.

2. The invention defines the conversion of an evaluation index system into a directed graph, so that graph-theoretic algorithms can be applied to evaluation index systems, providing a new approach to weight calculation.

3. The invention improves the PageRank algorithm: the knowledge-reasoning probability, computed from the similarity of keyword vectors between nodes of the directed graph of the evaluation index system, replaces the equal transition probabilities of the PageRank random walk model. The resulting PageRank value of each index reflects an objective, knowledge-based weight and avoids the topic drift problem of the traditional PageRank algorithm.

Brief Description of the Drawings

Fig. 1 is a flow diagram of the knowledge-graph-based evaluation weight calculation of the invention.

Fig. 2 is a schematic diagram of the knowledge graph built by the invention.

Fig. 3 is a schematic diagram of the evaluation index system built from the knowledge graph.

Fig. 4 shows how the error changes over the iterations of the improved PageRank algorithm proposed by the invention.

Fig. 5 shows the bottom-level evaluation weights obtained with the improved PageRank algorithm.

Detailed Description

The invention is further explained below with reference to the drawings and specific embodiments. These embodiments only illustrate the invention and do not limit its scope; after reading this disclosure, modifications of the invention in various equivalent forms made by those skilled in the art all fall within the scope defined by the claims appended to this application.

The invention provides a knowledge-graph-based method for calculating evaluation weights which, as shown in Fig. 1, comprises the following steps:

S1: Build an evaluation knowledge graph and a keyword vector library from evaluation-related data; the evaluation knowledge graph serves as the index library of the evaluation index system, and the keyword vector library serves as the basis for index retrieval;

S2: Using the set evaluation target and the computed keyword vector similarity, retrieve the matching indices from the index library to form the evaluation index system;

S3: Convert the evaluation index system into a directed graph and, from the keyword vector similarity between nodes of the directed graph, compute a knowledge-reasoning probability that replaces the random-walk probability of the existing PageRank algorithm, forming an improved PageRank algorithm;

S4: Compute the PageRank value of each node with the improved PageRank algorithm and take it as the objective weight of the corresponding evaluation index.

In step S1, the evaluation knowledge graph is built as follows: relevant text data are collected for the evaluation task, and the evaluation knowledge graph is built using knowledge graph construction techniques;

The keyword vector library is built as follows: a word embedding algorithm is applied to the collected text data to obtain a word vector library; the keywords of each knowledge graph node are then extracted, and their word vectors are looked up to form the keyword vector library.

In step S2, the evaluation index system is formed as follows: according to the set evaluation target, nodes with specific labels are selected from the knowledge graph, and the keyword vector similarity between each such node and the evaluation target is computed to refine the retrieval; all evaluation indices that satisfy the conditions, together with the dependencies between them, are then collected to form the evaluation index system.
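The two-stage retrieval in step S2 (filter by label, then refine by similarity) might look like the sketch below. It is illustrative only: the patent does not state how the similarity is turned into a selection rule, so the fixed `threshold` is an assumption, cosine similarity stands in for the formula the source gives only as an image, and the node names, labels, vectors and the function `select_indices` are all made up. Collecting the dependencies between the selected indices is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity; an assumed stand-in for the patent's similarity formula."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0


def select_indices(nodes, target_vec, label, threshold):
    """Return names of nodes that carry `label` and whose keyword vector has
    similarity >= `threshold` with the evaluation target's keyword vector."""
    return [
        name
        for name, info in nodes.items()
        if info["label"] == label and cosine(info["vec"], target_vec) >= threshold
    ]


# Toy knowledge graph nodes; names, labels and vectors are hypothetical.
nodes = {
    "motor overheating": {"label": "fault_cause", "vec": [0.9, 0.1, 0.0]},
    "sensor drift": {"label": "fault_cause", "vec": [0.1, 0.9, 0.1]},
    "line supervisor": {"label": "person", "vec": [0.9, 0.1, 0.0]},
}
target_vec = [1.0, 0.0, 0.0]  # hypothetical keyword vector of the evaluation target
selected = select_indices(nodes, target_vec, label="fault_cause", threshold=0.5)
```

Here "line supervisor" is dropped by the label filter even though its vector matches the target, and "sensor drift" is dropped by the similarity check, which is the point of applying the two criteria in sequence.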

The evaluation target is set by experts according to the evaluation task;

The keyword vector similarity is computed as:

Figure SMS_15

where the symbol shown in Figure SMS_16 denotes the keyword vector of the evaluation target, and the symbol shown in Figure SMS_17 denotes the keyword vector of a knowledge graph node.

The specific process of step S3 is:

A1: Construct the directed graph of the evaluation index system

Once the evaluation index system has been built, each evaluation index is mapped to a node of a directed graph and each directional dependency between indices is mapped to a directed edge, which converts the evaluation index system into a directed graph;

A2: Establish the improved random walk model

Define the directed graph of the evaluation index system as:

$G = (V, E)$

where $V$ and $E$ denote the sets of nodes and directed edges respectively, and each directed edge corresponds to an ordered pair of nodes;

A random walk model is essentially a Markov chain defined on a directed graph. Its Markov property means that the probability of moving from the current node state to the next node state depends only on the current node state. In the improved random walk model, a node no longer moves to the nodes it points to with equal probability but with the knowledge-reasoning probability. The directed graph of the evaluation index system is built on the knowledge graph, so each node of the directed graph (an evaluation index) corresponds to a node of the knowledge graph, and its keyword vector can be queried from the knowledge graph;

Define the knowledge similarity of nodes $v_i$ and $v_j$ in the directed graph $G$ of the evaluation index system as:

Figure SMS_18

where the vectors shown in Figure SMS_19 are the highest-ranked keyword vectors of nodes $v_i$ and $v_j$ in the knowledge graph;

Suppose node $j$ of the directed graph has $k$ outgoing directed edges; the state transition probability from node $j$ to each node can then be expressed through the knowledge similarity as:

Figure SMS_20

where $p'_{ij}$ is the knowledge-reasoning probability, i.e. the probability of moving from evaluation index $j$ to evaluation index $i$ when knowledge is used as the measure;

Suppose the directed graph $G$ has $m$ nodes; the state transition probability matrix of the improved random walk model is then:

Figure SMS_21

Clearly $p'_{ij} \ge 0$ and the condition shown in Figure SMS_22 holds, which are the properties required of a state transition probability matrix.

The PageRank value in step S4 is computed as follows:

A directed graph of the evaluation index system built from the knowledge graph does not necessarily satisfy the conditions of aperiodicity and strong connectivity, so its Markov chain does not necessarily have a stationary distribution.

For a general directed graph $G$ of the evaluation index system with $m$ nodes, let $P$ be the state transition probability matrix obtained from the improved random walk model. To obtain a stationary distribution of its Markov chain, a completely random walk model is defined on $G$ in addition to the improved random walk model; every element of its state transition probability matrix $P'$ equals the value shown in Figure SMS_23. A linear combination of $P$ and $P'$ forms a new state transition probability matrix, and the resulting Markov chain can be shown to have a stationary distribution $R$, which satisfies:

Figure SMS_24

where the linear combination coefficient $d$ ($0 \le d \le 1$) is the damping factor, chosen empirically; $E$ is the $m \times m$ all-ones matrix; $l$ is the $m$-dimensional vector whose components are all 1; and $R$ is the stationary distribution in the general case. The components of $R$ are the PageRank values of the nodes, i.e. the objective weights of the corresponding evaluation indices:

$R = [PR(v_1), PR(v_2), \ldots, PR(v_m)]^T, \quad v_1, v_2, \ldots, v_m \in V$

where $PR(v_i) > 0$ and the condition shown in Figure SMS_25 holds. The first term of the formula is the probability distribution over nodes, at stationarity, of a walk that follows the state transition probability matrix $P$; it carries weight $d$. The second term is the probability distribution over nodes, at stationarity, of a walk that follows $P'$ and moves to every node with equal probability; it carries weight $1-d$.
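A small worked example may help show why the second term matters when the graph is periodic. It assumes the combined matrix takes the usual form $A = dP + \frac{1-d}{m}E$ (the source gives it only as an image) and uses a hypothetical two-node graph in which each node points only to the other, with $d = 0.85$ and starting distribution $x_0 = (0.8, 0.2)^T$, all chosen purely for illustration.

```latex
P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
A = 0.85\,P + \frac{0.15}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}
  = \begin{pmatrix} 0.075 & 0.925 \\ 0.925 & 0.075 \end{pmatrix}

% Undamped walk: the distribution oscillates and never settles.
P x_0 = (0.2,\ 0.8)^T, \qquad P^2 x_0 = (0.8,\ 0.2)^T = x_0

% Damped walk: the deviation from (0.5, 0.5) shrinks by a factor of 0.85 per step.
A x_0 = (0.245,\ 0.755)^T, \qquad
A^2 x_0 = (0.71675,\ 0.28325)^T, \qquad
\lim_{t \to \infty} A^t x_0 = (0.5,\ 0.5)^T
```

Without damping the chain has period 2 and the iterates alternate forever; with damping the iterates still alternate around the uniform distribution but converge to it, since the deviations 0.3, 0.255, 0.21675 decrease geometrically.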

The PageRank values are calculated in the following steps:

B1: Calculate the knowledge-based state transition probability matrix P of the evaluation index system directed graph G;

B2: Let t = 0 and select the initial state distribution x_0;

B3: Calculate the general state transition probability matrix A of the directed graph:

Figure SMS_26

B4: Iterate and normalize the result vector:

y_{t+1} = A x_t

Figure SMS_27

B5: When ||x_{t+1} - x_t|| ≤ ε, let R = x_t and stop iterating;

B6: Otherwise, let t = t + 1 and return to step B4;

B7: Normalize R so that it represents a probability distribution.

The R obtained after completing these steps gives the PageRank value of each node in the evaluation index system directed graph, i.e., the objective weight of each index.
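Steps B2 to B7 can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name is hypothetical, A is assumed to be dP + (1-d)/m·E (the formula in Figure SMS_26 is not reproduced in the text), normalization uses the sum of components, the initial distribution is uniform, and the latest iterate is returned on convergence.

```python
def pagerank_power_iteration(P, d=0.85, eps=1e-5, max_iter=1000):
    """Power iteration for a damped PageRank (hypothetical helper).

    P[i][j] is the probability of moving from node j to node i,
    so every column of P is assumed to sum to 1.
    """
    m = len(P)
    # B3 (assumed form): A = d*P + (1-d)/m * E, with E the all-ones matrix.
    A = [[d * P[i][j] + (1.0 - d) / m for j in range(m)] for i in range(m)]
    # B2: uniform initial distribution (an assumption; any distribution works).
    x = [1.0 / m] * m
    for _ in range(max_iter):
        # B4: y = A x, then normalize (sum of components, an assumption).
        y = [sum(A[i][j] * x[j] for j in range(m)) for i in range(m)]
        norm = sum(y)
        x_next = [v / norm for v in y]
        # B5: stop once successive iterates differ by at most eps (max norm).
        err = max(abs(a - b) for a, b in zip(x_next, x))
        x = x_next  # B6: otherwise continue with the next iterate
        if err <= eps:
            break
    # B7: final normalization so the result is a probability distribution.
    total = sum(x)
    return [v / total for v in x]
```

Calling `pagerank_power_iteration(P)` with a column-stochastic P returns one weight per node; the weights are positive and sum to 1.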

To verify the practical effect of the above scheme, this embodiment applies the evaluation weight calculation method to an example: calculating the evaluation weights of the evaluation indices for the evaluation target "failure risk of production-line manipulators at an automobile manufacturer". The details are as follows:

Step 1: Collect data related to the evaluation task, and construct the evaluation knowledge graph and the keyword vector library of its nodes.

The data in this embodiment come from an automobile manufacturing company. Over a certain period of production, the company generated a large number of equipment problem-solving reports. These reports describe in detail the common faults on the company's production lines, the causes of those faults, and the corresponding repair processes. They contain specific evaluation indices related to the evaluation target as well as rich expert knowledge.

Part of the data in the company's equipment problem-solving reports is annotated and otherwise processed to serve as the training, validation, and test datasets for the Casrel knowledge extraction model; the remaining part is fed to the trained Casrel model for knowledge extraction. Manual annotation is used here to ensure dataset quality and thereby improve model training.

Analysis of the textual descriptions of production-line faults in the equipment problem-solving reports identified the following five types of triples, which describe the entity categories in the reports and the relations between them: (specific fault, traced to, fault cause), (fault cause, leads to, fault cause), (handler, uses, repair method), (fault cause, involves, component), (equipment model, belongs to, equipment category).

The text of the equipment problem-solving reports is annotated in the above triple form to obtain a triple dataset, and the annotated dataset is split into training, validation, and test sets in an 8:1:1 ratio. The trained Casrel model then extracts knowledge from the remaining data, and the extracted triples are stored in a neo4j graph database to form the evaluation knowledge graph shown in Figure 2.
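The 8:1:1 split of the annotated triple dataset could look like the following sketch. The function name, the shuffling step, and the fixed seed are assumptions added for illustration; the text only specifies the ratio.

```python
import random


def split_dataset(samples, ratios=(8, 1, 1), seed=0):
    """Split annotated samples into train/validation/test sets by ratio."""
    items = list(samples)
    # Shuffle with a fixed seed so the split is reproducible (an assumption).
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```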

The Word2vec word embedding algorithm is applied to the text of the equipment problem-solving reports to obtain a word vector library. For the node information in the knowledge graph, a keyword algorithm for knowledge graph nodes is defined by analogy with the TF-IDF document keyword algorithm, as follows:

Figure SMS_28

Figure SMS_29

TF-IDF = TF × IDF

where c is the number of times a word appears in the node, m is the total number of words in the node, N is the total number of nodes in the knowledge graph, and n is the number of nodes containing the word.

The calculated TF-IDF values are sorted in descending order, and the top-ranked word, or the several highest-ranked words, are taken as keywords. Looking up the word vectors of these keywords in the word vector library yields the keyword vector library.
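The node-level keyword selection can be sketched as below. The TF and IDF formulas are images (Figure SMS_28, Figure SMS_29) and are not reproduced in the text, so this sketch assumes the common forms TF = c/m and IDF = log(N/n); the function name, the input layout, and the alphabetical tie-break are likewise illustrative assumptions.

```python
import math
from collections import Counter


def node_keywords(node_tokens, top_k=1):
    """Pick the top_k TF-IDF keywords per node.

    node_tokens maps a node id to the list of words in that node.
    Assumes TF = c/m and IDF = log(N/n), with c, m, N, n as defined in the text.
    """
    N = len(node_tokens)
    # n: number of nodes containing each word.
    node_freq = Counter()
    for tokens in node_tokens.values():
        node_freq.update(set(tokens))
    keywords = {}
    for node, tokens in node_tokens.items():
        counts = Counter(tokens)  # c: occurrences of each word in this node
        m = len(tokens)           # m: total number of words in this node
        scores = {w: (c / m) * math.log(N / node_freq[w])
                  for w, c in counts.items()}
        # Descending by score; ties broken alphabetically (an assumption).
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        keywords[node] = ranked[:top_k]
    return keywords
```

A word that appears in every node gets IDF = 0 under this assumed form, so generic terms such as "manipulator" are never selected ahead of node-specific ones.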

Step 2: Based on the evaluation target and the keyword retrieval algorithm, retrieve suitable indices from the evaluation knowledge graph to construct the evaluation index system.

The evaluation target, the failure risk of production-line manipulators, is analyzed first. Its next-level evaluation indices correspond to the "specific fault" entity category of the knowledge graph. The evaluation knowledge graph is therefore queried for nodes that carry the "specific fault" label and whose keyword vectors are highly similar to that of the keyword "manipulator". The five nodes most similar to the evaluation target are: manipulator clamping fault, manipulator zero-return fault, manipulator material-grabbing fault, manipulator power trip, and manipulator not moving. These form the specific-fault element set. The nodes related to these five nodes are then queried, and after analysis the nodes labeled "fault cause" are selected as the bottom-level evaluation indices, 30 in total. The weights of these bottom-level indices are the evaluation weights to be calculated. The original relations among the retrieved evaluation indices in the knowledge graph are retained, and the resulting evaluation index system is shown in Figure 3.
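The label-filtered similarity retrieval described in this step can be sketched as follows. Cosine similarity is an assumption (the patent's similarity formula is an image and is not reproduced in the text), and the function names and the node dictionary layout are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_nodes(target_vec, nodes, label, top_k=5):
    """Return names of the top_k nodes with the given label, most similar first.

    nodes is a list of dicts with keys "name", "label", and "vector"
    (a hypothetical layout for the keyword vector library).
    """
    candidates = [n for n in nodes if n["label"] == label]
    ranked = sorted(
        candidates,
        key=lambda n: cosine_similarity(target_vec, n["vector"]),
        reverse=True,
    )
    return [n["name"] for n in ranked[:top_k]]
```

In the embodiment, `label` would be "specific fault", `target_vec` the word vector of "manipulator", and `top_k` 5.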

Step 3: Calculate the objective evaluation weights of the bottom-level evaluation indices with the improved PageRank algorithm.

Following the improved PageRank algorithm, the evaluation index system is converted into the corresponding directed graph. The keyword vector similarity between each pair of related indices is calculated first, which gives the state transition matrix based on knowledge reasoning probabilities.
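One way to turn the pairwise similarities into a state transition matrix is sketched below. It assumes that the probability of moving from node j to node i is the similarity of (i, j) divided by the sum of similarities over all edges leaving j; the exact formula is an image in the patent and is not reproduced in the text. The function name, the assumption that similarities are positive, and the uniform treatment of nodes without outgoing edges are illustrative choices.

```python
def knowledge_transition_matrix(m, edges, sim):
    """Build a column-stochastic matrix P from similarity-weighted edges.

    m      number of nodes, indexed 0..m-1
    edges  list of (j, i) pairs, each a directed edge from node j to node i
    sim    function sim(i, j) returning a positive similarity (assumed)
    """
    P = [[0.0] * m for _ in range(m)]
    out_neighbours = {j: [] for j in range(m)}
    for j, i in edges:
        out_neighbours[j].append(i)
    for j, targets in out_neighbours.items():
        if not targets:
            # Node without outgoing edges: spread uniformly (an assumption).
            for i in range(m):
                P[i][j] = 1.0 / m
            continue
        total = sum(sim(i, j) for i in targets)
        for i in targets:
            # Probability of moving from j to i, proportional to similarity.
            P[i][j] += sim(i, j) / total
    return P
```

Each column of the returned matrix sums to 1, so it can be used directly as P in a power iteration of the form y = A x.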

In the iterative calculation of the improved PageRank, the allowable error ε is set to 0.00001 and the damping factor d to 0.85. Because the final stationary distribution does not depend on the initial distribution, the initial distribution is chosen to be practically meaningful: based on the number of evaluation indices, it is set to a 35-dimensional vector whose components all equal

Figure SMS_30

representing an initially equal-weight state. The change in the error during the iteration is shown in Figure 4: the error decreases steadily and levels off, indicating that the final PageRank values form a stationary distribution.

The stationary distribution is thus obtained. The PageRank values of the bottom-level evaluation indices are extracted from it and normalized, giving the knowledge-based objective evaluation weights shown in Figure 5.
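The final extraction and normalization step can be sketched as below. The function name and the dictionary layout are hypothetical, and normalization is assumed to mean dividing by the sum over the bottom-level nodes.

```python
def bottom_level_weights(pagerank, bottom_nodes):
    """Extract the PageRank values of the bottom-level indices and rescale
    them so that they sum to 1.

    pagerank      dict mapping node name to PageRank value
    bottom_nodes  names of the bottom-level evaluation indices
    """
    total = sum(pagerank[name] for name in bottom_nodes)
    return {name: pagerank[name] / total for name in bottom_nodes}
```

In the embodiment, `bottom_nodes` would hold the 30 "fault cause" nodes out of the 35 nodes in the directed graph.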

This embodiment also provides a knowledge-graph-based evaluation weight calculation system comprising a network interface, a memory, and a processor. The network interface receives and sends signals when exchanging information with other external network elements; the memory stores computer program instructions that can run on the processor; and the processor, when running the computer program instructions, executes the steps of the above method.

This embodiment also provides a computer storage medium storing a computer program; when a processor executes the computer program, the method described above is implemented. The computer-readable medium may be considered tangible and non-transitory. Non-limiting examples of non-transitory tangible computer-readable media include non-volatile memory circuits (such as flash memory circuits, erasable programmable read-only memory circuits, or mask read-only memory circuits), volatile memory circuits (such as static random-access memory circuits or dynamic random-access memory circuits), magnetic storage media (such as analog or digital magnetic tape or hard disk drives), and optical storage media (such as CD, DVD, or Blu-ray discs). The computer program includes processor-executable instructions stored on at least one non-transitory tangible computer-readable medium. The computer program may also include or rely on stored data. The computer program may include a basic input/output system (BIOS) that interacts with the hardware of a special-purpose computer, device drivers that interact with specific devices of the special-purpose computer, one or more operating systems, user applications, background services, background applications, and so on.

Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.

The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to its embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in them, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Claims (7)

1. A knowledge-graph-based evaluation weight calculation method, characterized by comprising the following steps:
S1: establishing an evaluation knowledge graph and a keyword vector library from evaluation-related data, the evaluation knowledge graph serving as the evaluation index library of an evaluation index system and the keyword vector library serving as the basis for evaluation index retrieval;
S2: retrieving suitable indices from the index library by combining the set evaluation target with the calculated keyword vector similarity, to form the evaluation index system;
S3: converting the established evaluation index system into a directed graph, and using the keyword vector similarity between nodes of the directed graph to calculate knowledge reasoning probabilities that replace the random walk probabilities of the conventional PageRank algorithm, thereby forming an improved PageRank algorithm;
S4: calculating the PageRank value of each node with the improved PageRank algorithm, and taking the PageRank value as the objective weight of each evaluation index.
2. The knowledge-graph-based evaluation weight calculation method according to claim 1, wherein the evaluation knowledge graph in step S1 is established as follows: related text data are collected according to the evaluation task, and the evaluation knowledge graph is established using knowledge graph construction techniques;
the keyword vector library is established as follows: a word embedding algorithm is applied to the collected text data to obtain a word vector library, the keywords of each node of the knowledge graph are extracted, and the corresponding word vectors are looked up to form the keyword vector library.
3. The knowledge-graph-based evaluation weight calculation method according to claim 1, wherein the evaluation index system in step S2 is formed as follows: nodes with specific labels in the knowledge graph are screened out according to the set evaluation target, the keyword vector similarity between these nodes and the evaluation target is calculated for further retrieval, and all evaluation indices that meet the conditions, together with the dependency relations between them, are then collected to form the evaluation index system.
4. The knowledge-graph-based evaluation weight calculation method according to claim 3, wherein the evaluation target is set by an expert according to the evaluation task;
the keyword vector similarity is calculated as follows:
Figure FDA0004010270220000011
where Figure FDA0004010270220000012 denotes the keyword vector of the evaluation target, and Figure FDA0004010270220000013 denotes the keyword vector of a knowledge graph node.
5. The knowledge-graph-based evaluation weight calculation method according to claim 1, wherein step S3 specifically comprises:
A1: constructing the directed graph of the evaluation index system
after the evaluation index system has been constructed, each evaluation index in it is converted into a node of the directed graph, and each directional dependency between indices is converted into a directed edge of the directed graph, which completes the conversion from the evaluation index system to the directed graph;
A2: establishing the improved random walk model
the evaluation index system directed graph is defined as:
G=(V,E)
where V and E denote the sets of nodes and directed edges, respectively, and each directed edge corresponds to an ordered pair of nodes;
the knowledge similarity of nodes v_i and v_j in the evaluation index system directed graph G is defined as:
Figure FDA0004010270220000021
where
Figure FDA0004010270220000022
are the highest-ranked keyword vectors of nodes v_i and v_j in the knowledge graph;
assuming that k directed edges leave node j in the graph, the probability of a state transition from node j to each node can be expressed by the knowledge similarity:
Figure FDA0004010270220000023
where p'_ij is the knowledge reasoning probability, i.e., the knowledge-based probability of transferring from evaluation index j to evaluation index i;
assuming that the graph G has m nodes, the state transition probability matrix of the improved random walk model is expressed as:
Figure FDA0004010270220000024
clearly, p'_ij ≥ 0 and
Figure FDA0004010270220000025
so the properties of a state transition probability matrix are satisfied.
6. The knowledge-graph-based evaluation weight calculation method according to claim 5, wherein the PageRank value in step S4 is calculated as follows:
for the evaluation index system directed graph G, assumed to have m nodes, the state transition probability matrix obtained from the improved random walk model is P; to obtain the stationary distribution of the Markov chain, a completely random walk model is defined on G in addition to the improved random walk model, and every element of its state transition probability matrix P' equals
Figure FDA0004010270220000026
a new state transition probability matrix is formed as a linear combination of P and P', and the Markov chain defined by it can be shown to have a stationary distribution R, which satisfies the following equation:
Figure FDA0004010270220000031
where the linear combination coefficient d (0 ≤ d ≤ 1) is the damping factor, chosen empirically; E is the m×m all-ones matrix; l is the m-dimensional vector whose components are all 1; R is the stationary distribution in the general case, and each component of R is the PageRank value of a node, i.e., the objective weight of the corresponding evaluation index:
R = [PR(v_1), PR(v_2), …, PR(v_m)]^T, v_1, v_2, …, v_m ∈ V
where PR(v_i) > 0 and
Figure FDA0004010270220000032
the first term of the formula is the probability distribution, at stationarity, of the model walking to each node according to the state transition probability matrix P, and it carries weight d; the second term is the probability distribution, at stationarity, of the model walking to each node with equal probability according to the state transition probability matrix P', and it carries weight 1-d.
7. The knowledge-graph-based evaluation weight calculation method according to claim 6, wherein the calculation of the PageRank value in step S4 comprises the following steps:
B1: calculating the knowledge-based state transition probability matrix P of the evaluation index system directed graph G;
B2: letting t = 0 and selecting the initial state distribution x_0;
B3: calculating the general state transition probability matrix A of the directed graph:
Figure FDA0004010270220000033
B4: iterating and normalizing the result vector:
y_{t+1} = A x_t
Figure FDA0004010270220000034
B5: when ||x_{t+1} - x_t|| ≤ ε, letting R = x_t and stopping the iteration;
B6: otherwise, letting t = t + 1 and returning to step B4;
B7: normalizing R so that it represents a probability distribution;
the R obtained after completing these steps gives the PageRank value of each node of the evaluation index system directed graph, i.e., the objective weight of each index.
CN202211647320.7A 2022-12-21 2022-12-21 Evaluation weight calculation method based on knowledge graph Pending CN116306923A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211647320.7A CN116306923A (en) 2022-12-21 2022-12-21 Evaluation weight calculation method based on knowledge graph


Publications (1)

Publication Number Publication Date
CN116306923A true CN116306923A (en) 2023-06-23

Family

ID=86778579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211647320.7A Pending CN116306923A (en) 2022-12-21 2022-12-21 Evaluation weight calculation method based on knowledge graph

Country Status (1)

Country Link
CN (1) CN116306923A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117196394A (en) * 2023-09-14 2023-12-08 上海鱼尔网络科技有限公司 Evaluation index processing method, device, computer equipment and storage medium
CN118036902A (en) * 2024-04-11 2024-05-14 中国科学院自动化研究所 Knowledge graph-based ocean typical scene evaluation index system construction method and device, electronic equipment and storage medium
CN119761383A (en) * 2025-03-10 2025-04-04 北京工业大学 Ontology-based composite semantic relevance quantification method and system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117196394A (en) * 2023-09-14 2023-12-08 上海鱼尔网络科技有限公司 Evaluation index processing method, device, computer equipment and storage medium
CN118036902A (en) * 2024-04-11 2024-05-14 中国科学院自动化研究所 Knowledge graph-based ocean typical scene evaluation index system construction method and device, electronic equipment and storage medium
CN118036902B (en) * 2024-04-11 2024-09-13 中国科学院自动化研究所 Knowledge graph-based ocean typical scene evaluation index system construction method and device, electronic equipment and storage medium
CN119761383A (en) * 2025-03-10 2025-04-04 北京工业大学 Ontology-based composite semantic relevance quantification method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination