CN113987206A - Abnormal user identification method, device, equipment and storage medium - Google Patents
- Publication number
- CN113987206A, CN202111268127.8A
- Authority
- CN
- China
- Prior art keywords
- abnormal
- abnormal user
- user set
- users
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0282—Rating or review of business operators or products
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/02—Banking, e.g. interest calculation or account maintenance
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/03—Data mining
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Development Economics (AREA)
- Strategic Management (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Marketing (AREA)
- Databases & Information Systems (AREA)
- General Business, Economics & Management (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Economics (AREA)
- General Health & Medical Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Technology Law (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to big data technology and discloses a method for identifying abnormal users, comprising: collecting historical complaint work order information and extracting appeal details to obtain appeal detail texts of multiple users; extracting keywords from each appeal detail text and generating a first abnormal user set based on the keywords; obtaining device information of the multiple users based on the historical complaint work order information, and generating a second abnormal user set according to the device information of the multiple users and preset device information conditions; and obtaining a user knowledge graph constructed from user information, identifying and generating a third abnormal user set from the user knowledge graph, and obtaining the abnormal user set. The invention further relates to blockchain technology: the historical complaint work order information may be stored in a node of a blockchain. The invention also provides an abnormal user identification apparatus, an electronic device, and a storage medium. The invention can improve the efficiency of identifying abnormal users.
Description
Technical field
The present invention relates to the technical field of big data, and in particular to a method, apparatus, electronic device, and computer-readable storage medium for identifying abnormal users.
Background art
As society develops, people expect more from their quality of life and from the services they use, and dissatisfaction leads to complaints. For example, some customers complain to a bank because they are dissatisfied with its service attitude or its fees. Some abnormal users, however, file malicious complaints against the bank for their own financial gain: they default on repayments or refuse to repay at all, and entrust third parties to complain to the bank on their behalf. Such behavior can seriously disrupt the bank's normal business.
At present, such abnormal users are identified mainly by hand: staff review the work order content and listen to the call recordings to decide whether a user is a black-market ("black industry") user. This approach is time-consuming and labor-intensive, inefficient, and error-prone, so a method that can identify abnormal users efficiently is needed.
Summary of the invention
The present invention provides a method, apparatus, and computer-readable storage medium for identifying abnormal users, the main purpose of which is to improve the efficiency of identifying abnormal users.
To achieve the above purpose, the method for identifying abnormal users provided by the present invention comprises:
collecting historical complaint work order information, and extracting appeal details from the historical complaint work order information to obtain appeal detail texts of multiple users;
extracting keywords from each appeal detail text, and generating a first abnormal user set based on the keywords;
obtaining device information of the multiple users based on the historical complaint work order information, and generating a second abnormal user set according to the device information of the multiple users and preset device information conditions;
obtaining a user knowledge graph constructed from user information, identifying and generating a third abnormal user set from the user knowledge graph according to the first abnormal user set and the second abnormal user set, and aggregating the third abnormal user set with the first abnormal user set and the second abnormal user set to obtain the abnormal user set.
Optionally, extracting the keywords from each appeal detail text comprises:
segmenting each appeal detail text into words to obtain a word set corresponding to each appeal detail text;
calculating the term frequency and the inverse document frequency of each word in each word set;
calculating the weight of each word from the term frequency and the inverse document frequency;
sorting the words in each word set by weight in descending order, and selecting the top-ranked words up to a preset threshold number to obtain the keywords of each appeal detail text.
Optionally, calculating the term frequency of each word in each word set comprises:
counting the number of times each word appears in the corresponding appeal detail text to obtain its number of occurrences;
counting all the words in the word set to obtain the total number of words;
generating the term frequency of each word from the number of occurrences and the total number of words using a preset first formula.
Optionally, calculating the inverse document frequency of each word in each word set comprises:
counting the total number of appeal detail texts corresponding to the word sets to obtain the total number of documents;
for each word in the word set, counting the number of appeal detail texts that contain the word to obtain the number of documents containing the term;
calculating the inverse document frequency of each word from the total number of documents and the number of documents containing the term using a preset second formula.
Optionally, the device information includes one or more of the account information, battery level, and number of pictures of the device used when the user submitted the complaint work order, and generating the second abnormal user set according to the device information of the multiple users and the preset device information conditions comprises:
if any item of the device information of the multiple users satisfies a preset device information condition, determining the user corresponding to that device information to be an abnormal user, all abnormal users so determined forming the second abnormal user set.
Optionally, before obtaining the user knowledge graph constructed from user information, the method further comprises:
taking each user as an entity, and using the entities as nodes of the knowledge graph;
extracting the user information in the historical complaint work order information as the attributes of each entity;
analyzing the association relationships between the entities, and constructing multiple triples according to the attributes of the entities and the association relationships between them;
visualizing the multiple triples to obtain the user knowledge graph.
Optionally, identifying and generating the third abnormal user set from the user knowledge graph according to the first abnormal user set and the second abnormal user set, and aggregating it with the first abnormal user set and the second abnormal user set to obtain the abnormal user set, comprises:
marking the users of the first abnormal user set and the second abnormal user set in the user knowledge graph;
searching the user knowledge graph for other users who have the same phone attribute as the abnormal users in the first abnormal user set and the second abnormal user set, and marking those users as abnormal users to obtain the third abnormal user set;
aggregating the first abnormal user set, the second abnormal user set, and the third abnormal user set to obtain the abnormal user set, and performing transfer processing on the users in the abnormal user set.
To solve the above problem, the present invention further provides an apparatus for identifying abnormal users, the apparatus comprising:
a text collection module, configured to collect historical complaint work order information and extract appeal details from the historical complaint work order information to obtain appeal detail texts of multiple users;
a keyword extraction module, configured to extract keywords from each appeal detail text and generate a first abnormal user set based on the keywords;
a device information acquisition module, configured to obtain device information of the multiple users based on the historical complaint work order information, and generate a second abnormal user set according to the device information of the multiple users and preset device information conditions;
an abnormal user generation module, configured to obtain a user knowledge graph constructed from user information, identify and generate a third abnormal user set from the user knowledge graph according to the first abnormal user set and the second abnormal user set, and aggregate it with the first abnormal user set and the second abnormal user set to obtain the abnormal user set.
To solve the above problem, the present invention further provides an electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can perform the method for identifying abnormal users described above.
To solve the above problem, the present invention further provides a computer-readable storage medium storing at least one computer program, the at least one computer program being executed by a processor in an electronic device to implement the method for identifying abnormal users described above.
Embodiments of the present invention extract the appeal details from historical complaint work order information, which preserves the completeness of the information. Keywords are extracted from the appeal detail texts and matched against preconfigured rules to identify the first abnormal users, while the second abnormal users are identified from device information; this reduces the time spent reading appeal detail texts and effectively improves efficiency. A user knowledge graph is built from user information with users directly as entities, which is more intuitive and clear and makes abnormal user information easier to obtain. The method, apparatus, electronic device, and computer-readable storage medium for identifying abnormal users proposed by the present invention can therefore improve the efficiency of identifying abnormal users.
Brief description of the drawings
FIG. 1 is a schematic flowchart of a method for identifying abnormal users according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of keyword extraction according to an embodiment of the present invention;
FIG. 3 is a functional block diagram of an apparatus for identifying abnormal users according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device that implements the method for identifying abnormal users according to an embodiment of the present invention.
The realization of the purpose, the functional features, and the advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.
Detailed description
It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it.
An embodiment of the present application provides a method for identifying abnormal users. The method may be executed by, without limitation, at least one of the electronic devices that can be configured to execute the method provided by the embodiments of the present application, such as a server or a terminal. In other words, the method may be executed by software or hardware installed on a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to a single server, a server cluster, a cloud server, or a cloud server cluster. The server may be an independent server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (CDN), and big data and artificial intelligence platforms.
FIG. 1 is a schematic flowchart of a method for identifying abnormal users according to an embodiment of the present invention. In this embodiment, the method for identifying abnormal users comprises:
S1. Collect historical complaint work order information, and extract appeal details from the historical complaint work order information to obtain appeal detail texts of multiple users.
In this embodiment of the present invention, the historical complaint work order information is the complaint information that users submitted out of dissatisfaction in various business scenarios, for example dissatisfaction with a bank's response or with the outcome of its handling. The historical complaint work order information includes but is not limited to name, gender, phone number, ID number, card number, appeal details, and call recording. This embodiment may obtain the historical complaint work order information from a preset back-end database.
Optionally, to further ensure the privacy and security of the historical complaint work order information, it may also be obtained from a node of a blockchain.
Further, this embodiment extracts the text data of the appeal details field from the historical complaint work order information to obtain multiple appeal detail texts.
In one embodiment of the present invention, the appeal detail text of each user may be extracted from the historical complaint work order information by the user's ID number, yielding the appeal detail texts of multiple users.
Optionally, this embodiment may instead extract the complaint details of each complaint work order from the historical complaint work order information and group them by user name, yielding the appeal detail texts of multiple users.
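Step S1 can be sketched as a simple grouping of work orders by a user key. This is a minimal illustration only: the record layout and the field names `id_number`, `name`, and `appeal_details` are assumptions made for the example, not names given in the patent.

```python
from collections import defaultdict


def collect_appeal_texts(work_orders, key="id_number"):
    """Group the appeal-details field of each work order by a user key.

    `work_orders` is a list of dicts; the field names are hypothetical.
    """
    grouped = defaultdict(list)
    for order in work_orders:
        details = order.get("appeal_details", "").strip()
        if details:  # skip work orders without appeal details
            grouped[order[key]].append(details)
    # One appeal detail text per user: that user's complaints joined together.
    return {user: "\n".join(texts) for user, texts in grouped.items()}


work_orders = [
    {"id_number": "ID-001", "name": "Zhang", "appeal_details": "Fee dispute."},
    {"id_number": "ID-002", "name": "Li", "appeal_details": "Slow response."},
    {"id_number": "ID-001", "name": "Zhang", "appeal_details": "Refund demanded."},
]
appeal_texts = collect_appeal_texts(work_orders)
```

Passing `key="name"` groups by user name instead, which corresponds to the optional variant; grouping by ID number avoids merging different users who share a name.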
S2. Extract keywords from each appeal detail text, and generate a first abnormal user set based on the keywords.
This embodiment uses a keyword extraction algorithm to extract the keywords from each appeal detail text. The keyword extraction algorithm is a weighting technique used in information retrieval and data mining that can mine the keywords of a document, for example the TF-IDF (term frequency-inverse document frequency) algorithm.
In detail, referring to FIG. 2, extracting the keywords from each appeal detail text comprises:
S21. Segment each appeal detail text into words to obtain a word set corresponding to each appeal detail text;
S22. Calculate the term frequency and the inverse document frequency of each word in each word set;
S23. Calculate the weight of each word from the term frequency and the inverse document frequency;
S24. Sort the words in each word set by weight in descending order, and select the top-ranked words up to a preset threshold number to obtain the keywords of each appeal detail text.
In one embodiment of the present invention, the jieba word segmentation method (结巴分词) may be used to segment each appeal detail text into words.
Further, the term frequency (TF) is how often a word (keyword) appears in a text. The inverse document frequency (IDF) of a given word is obtained by dividing the total number of documents by the number of documents containing the word and taking the logarithm of the quotient. The fewer documents contain the word, the larger its IDF, which indicates that the term distinguishes well between categories.
In detail, calculating the term frequency of each word in each word set comprises:
counting the number of times each word appears in the corresponding appeal detail text to obtain its number of occurrences;
counting all the words in the word set to obtain the total number of words;
generating the term frequency of each word from the number of occurrences and the total number of words using a preset first formula.
The preset first formula is: TF(c) = number of occurrences of word c / total number of words.
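The first formula can be sketched in a few lines. The function name `term_frequencies` and the sample words are illustrative assumptions; the word set is taken to be the list of segmented words of one appeal detail text, repeats included.

```python
from collections import Counter


def term_frequencies(words):
    """TF(c) = occurrences of word c / total number of words in the text."""
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    return {word: count / total for word, count in counts.items()}


tf = term_frequencies(["threat", "fee", "threat", "refund"])
```

Here "threat" appears twice among four words, so its term frequency is 0.5, while "fee" and "refund" each get 0.25.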
Further, calculating the inverse document frequency of each word in each word set comprises:
counting the total number of appeal detail texts corresponding to the word sets to obtain the total number of documents;
for each word in the word set, counting the number of appeal detail texts that contain the word to obtain the number of documents containing the term;
calculating the inverse document frequency of each word from the total number of documents and the number of documents containing the term using a preset second formula.
The preset second formula is: IDF(c) = log(total number of documents / (number of documents containing word c + 1)).
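The second formula can be sketched as follows. The sketch reads the "+1" as smoothing added to the denominator, which is the common convention but remains an interpretation of the formula as printed; the natural logarithm and the function name are likewise assumptions.

```python
import math


def inverse_document_frequencies(word_sets):
    """IDF(c) = log(total documents / (documents containing c + 1))."""
    total_docs = len(word_sets)
    doc_counts = {}
    for words in word_sets:
        for word in set(words):  # count each document once per word
            doc_counts[word] = doc_counts.get(word, 0) + 1
    return {
        word: math.log(total_docs / (count + 1))
        for word, count in doc_counts.items()
    }


word_sets = [["threat", "fee"], ["fee", "refund"], ["fee", "service"]]
idf = inverse_document_frequencies(word_sets)
```

Under this reading, a word that appears in every document (such as "fee" above) gets a slightly negative IDF, so it sorts below every rarer word.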
Further, this embodiment calculates the weight of each word by the following formula: TF-IDF = term frequency (TF) × inverse document frequency (IDF).
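Steps S22 to S24 can be combined into one short sketch: compute the TF-IDF weight of every word and keep the highest-weighted words of each text. The function name, the `top_k` parameter (standing in for the preset threshold number), the alphabetical tie-break, and the sample data are assumptions for illustration; the input is assumed to be already segmented into words.

```python
import math
from collections import Counter


def top_keywords(word_sets, top_k=3):
    """Return the top_k words of each word set ranked by TF-IDF weight."""
    total_docs = len(word_sets)
    # Number of documents that contain each word.
    doc_counts = Counter(w for words in word_sets for w in set(words))
    keywords = []
    for words in word_sets:
        counts = Counter(words)
        weights = {
            w: (n / len(words)) * math.log(total_docs / (doc_counts[w] + 1))
            for w, n in counts.items()
        }
        # Sort by weight, largest first; ties are broken alphabetically.
        ranked = sorted(weights, key=lambda w: (-weights[w], w))
        keywords.append(ranked[:top_k])
    return keywords


word_sets = [
    ["threat", "threat", "fee", "service"],
    ["fee", "service", "refund"],
    ["fee", "service", "slow"],
    ["fee", "card"],
]
keywords = top_keywords(word_sets, top_k=2)
```

In the first text, "threat" is both frequent in that text and rare across the collection, so it ranks first; "fee" occurs in every text and ranks last.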
In detail, generating the first abnormal user set based on the keywords comprises:
obtaining preset template keywords;
matching the keywords of each appeal detail text against the template keywords to obtain a matching result;
if the matching result is a match, marking the user corresponding to the keywords as an abnormal user to obtain the first abnormal user set.
The template keywords are words that abnormal users habitually use when complaining, such as "threat" (威胁) and "filing a case" (立案). The matching result is either a match or no match at all. After keywords have been extracted from all the historical complaint work order information and matched, the users marked as abnormal are collected to obtain the first abnormal user set.
For example, after a user submits a complaint work order, the keywords extracted from it are "bad attitude", "call the police", and "threat". Because "threat" is identical to one of the template keywords, the matching result is a match, and the user corresponding to the current complaint work order is marked as an abnormal user.
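The template matching can be sketched as a set intersection: a user is marked abnormal when at least one extracted keyword is identical to a template keyword. The function name, the user identifiers, and the English keyword strings are illustrative assumptions.

```python
def first_abnormal_users(user_keywords, template_keywords):
    """Return users whose keywords contain at least one template keyword."""
    templates = set(template_keywords)
    return {
        user
        for user, keywords in user_keywords.items()
        if templates & set(keywords)  # exact match on any keyword
    }


template_keywords = ["threat", "filing a case"]
user_keywords = {
    "user_1": ["bad attitude", "call the police", "threat"],
    "user_2": ["fee", "slow response"],
}
first_set = first_abnormal_users(user_keywords, template_keywords)
```

Only `user_1` is marked, because "threat" matches a template keyword exactly; `user_2` has no keyword in common with the templates.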
S3. Obtain device information of the multiple users based on the historical complaint work order information, and generate a second abnormal user set according to the device information of the multiple users and preset device information conditions.
In this embodiment of the present invention, the user device information is information about the device a user used when logging in to the system to complain, including but not limited to the account information, battery level, and number of pictures of the device used when the user submitted the complaint work order. When a user submits a complaint work order, the information about the device in use can be obtained through preset permissions and saved.
Further, generating the second abnormal user set according to the device information of the multiple users and the preset device information conditions comprises:
if any item of the device information of the multiple users satisfies a preset device information condition, determining the user corresponding to that device information to be an abnormal user, all abnormal users so determined forming the second abnormal user set.
The preset device information conditions include but are not limited to: multiple accounts have been logged in on the same device; the device's battery level does not change; the number of pictures on the device is zero.
Normally, a user uses only one account, logs in at irregular times with a battery level that varies, and has pictures stored on the device. Abnormal users, by contrast, complain maliciously, so they use multiple accounts or malicious complaint software. In that case the device information obtained is virtual: the battery level reads as a constant value and the number of pictures is 0. Device information can therefore be used to judge whether a user is abnormal.
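The three example conditions can be sketched as a rule check over device records. The `DeviceRecord` layout, its field names, and the choice to treat "battery level does not change" as two or more identical readings are assumptions for this illustration; the patent does not specify how the readings are stored.

```python
from dataclasses import dataclass


@dataclass
class DeviceRecord:
    user: str
    device_id: str
    account: str
    battery_levels: list  # readings captured across submissions (assumed)
    picture_count: int


def second_abnormal_users(records):
    """Mark a user abnormal if any preset device condition is satisfied."""
    accounts_per_device = {}
    for r in records:
        accounts_per_device.setdefault(r.device_id, set()).add(r.account)

    abnormal = set()
    for r in records:
        multiple_accounts = len(accounts_per_device[r.device_id]) > 1
        constant_battery = (
            len(r.battery_levels) > 1 and len(set(r.battery_levels)) == 1
        )
        no_pictures = r.picture_count == 0
        if multiple_accounts or constant_battery or no_pictures:
            abnormal.add(r.user)
    return abnormal


records = [
    DeviceRecord("u1", "d1", "acc1", [80, 64], 120),    # normal
    DeviceRecord("u2", "d2", "acc2", [50, 50, 50], 35),  # constant battery
    DeviceRecord("u3", "d3", "acc3", [90, 70], 0),       # no pictures
    DeviceRecord("u4", "d4", "acc4", [40, 22], 8),       # shared device
    DeviceRecord("u5", "d4", "acc5", [33, 20], 8),       # shared device
]
second_set = second_abnormal_users(records)
```

Because any single condition is sufficient, the rules are combined with `or`; a stricter variant could require several conditions at once to reduce false positives.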
S4、获取基于用户信息构建的用户知识图谱,根据所述第一异常用户集和所述第二异常用户集从所述用户知识图谱中识别生成第三异常用户集,并与所述第一异常用户集、所述第二异常用户集汇总得到异常用户集。S4. Acquire a user knowledge graph constructed based on user information, identify and generate a third abnormal user set from the user knowledge graph according to the first abnormal user set and the second abnormal user set, and generate a third abnormal user set with the first abnormal user set. The user set and the second abnormal user set are aggregated to obtain the abnormal user set.
本发明实施例中,所述用户信息包括但不限于姓名、性别、电话、证件号、卡号等。In this embodiment of the present invention, the user information includes, but is not limited to, name, gender, phone number, certificate number, card number, and the like.
In detail, before the user knowledge graph constructed from user information is obtained, the method further includes:
taking each user as an entity, and taking the entities as nodes of the knowledge graph;
extracting the user information in the historical complaint work order information as the attributes of each entity;
analyzing the association relationships between the entities, and constructing a plurality of triples according to the attributes of the entities and the association relationships between them;
visualizing the plurality of triples to obtain the user knowledge graph.
In this embodiment of the present invention, analyzing the association relationships between the entities means performing association mining on the entity data using data mining techniques. For example, if two users have the same attribute value (such as the same phone number), the two users are associated.
A triple can be written as "(A, B, C)", where B is the relationship and A and C are graph nodes. For example, an association between user A and user C is expressed as the triple "(user A, association, user C)".
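The triple construction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_triples`, the relation label `same_phone`, and the sample user records are all assumptions, and only one shared attribute (the phone number) is used to derive associations.

```python
from collections import defaultdict
from itertools import combinations


def build_triples(users, attribute="phone"):
    """Return (user_a, relation, user_c) triples for users sharing an attribute value.

    `users` maps a user id to a dict of attributes (hypothetical layout).
    """
    # Group user ids by the value of the chosen attribute.
    by_value = defaultdict(list)
    for user_id, attrs in users.items():
        value = attrs.get(attribute)
        if value:  # skip users that have no value for this attribute
            by_value[value].append(user_id)

    # Every pair of users in the same group is considered associated.
    triples = []
    for ids in by_value.values():
        for a, c in combinations(sorted(ids), 2):
            triples.append((a, f"same_{attribute}", c))
    return sorted(triples)


# Hypothetical sample data: user_a and user_c share a phone number.
users = {
    "user_a": {"phone": "555-0100"},
    "user_b": {"phone": "555-0199"},
    "user_c": {"phone": "555-0100"},
}
triples = build_triples(users)
```

Each resulting triple has the "(node, relationship, node)" shape described above and could be loaded into any graph store or visualization tool.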
Further, identifying the third abnormal user set from the user knowledge graph according to the first abnormal user set and the second abnormal user set, and aggregating it with the first abnormal user set and the second abnormal user set to obtain the abnormal user set, includes:
marking the users in the first abnormal user set and the second abnormal user set in the user knowledge graph;
searching the user knowledge graph for the remaining users that have the same phone attribute as the abnormal users in the first abnormal user set and the second abnormal user set, and marking those remaining users as abnormal users to obtain the third abnormal user set;
aggregating the first abnormal user set, the second abnormal user set, and the third abnormal user set to obtain the abnormal user set, and handing over the users in the abnormal user set for further handling.
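The three steps above can be sketched in a few lines. This is an illustrative sketch only: the function names, the dictionary layout of `users`, and the sample data are assumptions, and the search is a single hop (users who directly share a phone number with an already flagged user), which is how the text reads.

```python
def third_abnormal_set(users, first_set, second_set, attribute="phone"):
    """Find unflagged users sharing `attribute` with any already flagged user."""
    flagged = set(first_set) | set(second_set)
    # Attribute values (phone numbers) that belong to flagged users.
    flagged_values = {users[u].get(attribute) for u in flagged if u in users}
    flagged_values.discard(None)
    return {
        u
        for u, attrs in users.items()
        if u not in flagged and attrs.get(attribute) in flagged_values
    }


def aggregate(first_set, second_set, third_set):
    """Union of the three abnormal user sets."""
    return set(first_set) | set(second_set) | set(third_set)


# Hypothetical sample data.
users = {
    "u1": {"phone": "555-0100"},
    "u2": {"phone": "555-0101"},
    "u3": {"phone": "555-0100"},  # shares a phone with u1
    "u4": {"phone": "555-0102"},
    "u5": {"phone": "555-0101"},  # shares a phone with u2
}
first, second = {"u1"}, {"u2"}
third = third_abnormal_set(users, first, second)
abnormal_users = aggregate(first, second, third)
```

With the sample data, `u4` stays unflagged because its phone number is not shared with any flagged user.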
This embodiment of the present invention can identify abnormal customers in batches, which reduces the difficulty of identification for staff and improves identification efficiency. It also allows further countermeasures to be taken against abnormal customers, such as handing them over to the relevant authorities, which reduces risks such as group complaints and adverse public opinion.
This embodiment of the present invention extracts appeal details from historical complaint work order information, which preserves the completeness of the information. It extracts keywords from the appeal detail texts and matches them against preconfigured rules to identify the first abnormal users, and it identifies the second abnormal users from device information; together these reduce the time spent reading appeal detail texts and improve efficiency. It constructs the user knowledge graph from user information with users as the entities, which is more intuitive and makes abnormal user information easier to obtain. The method, apparatus, electronic device, and computer-readable storage medium for identifying abnormal users proposed by the present invention can therefore improve the efficiency of identifying abnormal users.
FIG. 3 is a functional block diagram of an apparatus for identifying abnormal users provided by an embodiment of the present invention.
The abnormal user identification apparatus 100 of the present invention may be installed in an electronic device. According to the functions implemented, the abnormal user identification apparatus 100 may include a text collection module 101, a keyword extraction module 102, a device information acquisition module 103, and an abnormal user generation module 104. A module in the present invention, which may also be called a unit, refers to a series of computer program segments that can be executed by the processor of the electronic device to perform a fixed function, and that are stored in the memory of the electronic device.
In this embodiment, the functions of the modules/units are as follows:
The text collection module 101 is configured to collect historical complaint work order information and extract appeal details from it to obtain the appeal detail texts of multiple users.
In this embodiment of the present invention, the historical complaint work order information is complaint information submitted by users out of dissatisfaction in various business scenarios, for example dissatisfaction with a bank's response or handling result. It includes, but is not limited to, name, gender, phone number, ID number, card number, appeal details, and audio recordings. The historical complaint work order information may be obtained from a preset back-end database.
Further, this embodiment extracts the text data corresponding to the appeal details field from the historical complaint work order information to obtain a plurality of appeal detail texts.
The keyword extraction module 102 is configured to extract the keywords in each appeal detail text using a keyword extraction algorithm, and to generate the first abnormal user set based on the keywords.
This embodiment of the present invention uses a keyword extraction algorithm to extract the keywords in each appeal detail text. The keyword extraction algorithm is a weighting technique used in information retrieval and data mining that can be used to mine keywords from an article, for example the TF-IDF (term frequency-inverse document frequency) algorithm.
In detail, the keyword extraction module 102 is specifically configured to:
segment each appeal detail text into words to obtain the word set corresponding to each appeal detail text;
calculate the term frequency and the inverse document frequency of each word in each word set;
calculate the weight of each word from the term frequency and the inverse document frequency;
sort the words in each word set by weight in descending order and select the top-ranked words, up to a preset threshold number, to obtain the keywords of each appeal detail text.
In detail, calculating the term frequency of each word in each word set includes:
counting the number of times each word appears in the corresponding appeal detail text to obtain its occurrence count;
counting the number of all words in the word set to obtain the total word count;
generating the term frequency of each word from the occurrence count and the total word count using a preset first formula.
The preset first formula is: TF(c) = number of occurrences of word c / total word count.
Further, calculating the inverse document frequency of each word in each word set includes:
counting the total number of appeal detail texts corresponding to the word sets to obtain the total document count;
for each word in the word set, counting the number of appeal detail texts that contain the word to obtain the containing-document count;
calculating the inverse document frequency of each word from the total document count and the containing-document count using a preset second formula.
The preset second formula is: IDF(c) = log(total document count / (number of documents containing word c + 1)).
Further, this embodiment of the present invention calculates the weight of each word according to the following formula: TF-IDF = term frequency (TF) * inverse document frequency (IDF).
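The term frequency, inverse document frequency, and weighting steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the texts are already segmented into word lists (Chinese text would need a word segmenter first, which is not shown), the "+1" in the second formula is read as part of the denominator, the logarithm is natural, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tf(words):
    """TF(c) = occurrences of c in the text / total number of words in the text."""
    counts = Counter(words)
    total = len(words)
    return {w: c / total for w, c in counts.items()}


def idf(documents):
    """IDF(c) = log(total documents / (documents containing c + 1))."""
    n_docs = len(documents)
    doc_freq = Counter()
    for words in documents:
        doc_freq.update(set(words))  # count each word once per document
    return {w: math.log(n_docs / (df + 1)) for w, df in doc_freq.items()}


def top_keywords(documents, k=2):
    """Return the k highest-weighted words (TF * IDF) for each document."""
    idf_scores = idf(documents)
    result = []
    for words in documents:
        weights = {w: f * idf_scores[w] for w, f in tf(words).items()}
        # Sort by weight descending; break ties alphabetically for determinism.
        ranked = sorted(weights, key=lambda w: (-weights[w], w))
        result.append(ranked[:k])
    return result


# Hypothetical pre-segmented appeal detail texts.
docs = [
    ["refund", "refund", "delay"],
    ["delay", "card"],
    ["card", "fee"],
    ["fee", "delay"],
]
keywords = top_keywords(docs, k=1)
```

Note that with this smoothing a word that appears in every document receives a slightly negative IDF, so it ranks below all other words.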
In detail, generating the first abnormal user set based on the keywords includes:
obtaining preset template keywords;
matching the keywords of each appeal detail text against the template keywords to obtain a matching result;
if the matching result is a match, marking the user corresponding to the keywords as an abnormal user to obtain the first abnormal user set.
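The matching step above can be sketched as a set intersection. The source does not define what counts as a match, so this sketch assumes that a user is flagged when at least one extracted keyword appears in the template list; the function name, the template keywords, and the sample data are hypothetical.

```python
def first_abnormal_set(user_keywords, template_keywords):
    """Flag users whose extracted keywords overlap the template keywords.

    `user_keywords` maps a user id to the keywords extracted from that
    user's appeal detail text (hypothetical layout).
    """
    templates = set(template_keywords)
    return {
        user
        for user, keywords in user_keywords.items()
        if templates & set(keywords)  # non-empty intersection counts as a match
    }


# Hypothetical template keywords and extracted keywords.
template_keywords = ["compensation", "media"]
user_keywords = {
    "u1": ["compensation", "refund"],
    "u2": ["card", "fee"],
    "u3": ["media", "delay"],
}
first_set = first_abnormal_set(user_keywords, template_keywords)
```

A stricter rule, such as requiring several template keywords to match, would only change the condition inside the comprehension.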
The device information acquisition module 103 is configured to obtain the device information of the multiple users based on the historical complaint work order information, and to generate the second abnormal user set according to the device information of the multiple users and preset device information conditions.
In this embodiment of the present invention, the user device information refers to information about the device a user uses when logging in to the system to file a complaint, including but not limited to the account information, battery level, and number of pictures of the device used when the complaint work order was submitted. When a user submits a complaint work order, information about the user's device can be obtained through preset permissions and saved.
Further, the device information acquisition module 103 is specifically configured to: if any device information among the device information of the multiple users satisfies a preset device information condition, determine the user corresponding to that device information to be an abnormal user; all abnormal users so determined form the second abnormal user set.
The preset device information conditions include, but are not limited to: multiple accounts have been logged in on the same device; the device's battery level does not change; and the number of pictures on the device is zero.
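The device information check above can be sketched as a simple rule function. Several details are assumptions rather than statements from the source: the `DeviceInfo` fields, the idea that battery level is sampled across several submissions, and the choice to treat any single condition as sufficient to flag a user.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class DeviceInfo:
    accounts_on_device: int      # accounts that have logged in on this device
    battery_readings: List[int]  # battery level (%) at each recorded submission
    picture_count: int           # number of pictures stored on the device


def is_suspicious(info: DeviceInfo) -> bool:
    """Return True if the device meets any of the three listed conditions."""
    multiple_accounts = info.accounts_on_device > 1
    # A constant battery level needs at least two readings to be observable.
    constant_battery = (
        len(info.battery_readings) > 1 and len(set(info.battery_readings)) == 1
    )
    no_pictures = info.picture_count == 0
    return multiple_accounts or constant_battery or no_pictures


def second_abnormal_set(devices: Dict[str, DeviceInfo]) -> Set[str]:
    """Collect the users whose device information satisfies a condition."""
    return {user for user, info in devices.items() if is_suspicious(info)}


# Hypothetical sample data.
devices = {
    "u1": DeviceInfo(accounts_on_device=1, battery_readings=[80, 64, 91], picture_count=230),
    "u2": DeviceInfo(accounts_on_device=5, battery_readings=[70, 55], picture_count=12),
    "u3": DeviceInfo(accounts_on_device=1, battery_readings=[100, 100, 100], picture_count=0),
}
second_set = second_abnormal_set(devices)
```

Requiring two or more conditions at once would reduce false positives, for example for a new phone with no pictures yet; the source leaves that choice open.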
The abnormal user generation module 104 is configured to obtain a user knowledge graph constructed from user information, identify a third abnormal user set from the user knowledge graph according to the first abnormal user set and the second abnormal user set, and aggregate it with the first abnormal user set and the second abnormal user set to obtain the abnormal user set.
In this embodiment of the present invention, the user information includes, but is not limited to, name, gender, phone number, ID number, and card number.
In detail, the abnormal user generation module 104 is specifically configured to:
mark the users in the first abnormal user set and the second abnormal user set in the user knowledge graph;
search the user knowledge graph for the remaining users that have the same phone attribute as the abnormal users in the first abnormal user set and the second abnormal user set, and mark those remaining users as abnormal users to obtain the third abnormal user set;
aggregate the first abnormal user set, the second abnormal user set, and the third abnormal user set to obtain the abnormal user set, and hand over the users in the abnormal user set for further handling.
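FIG. 4 is a schematic structural diagram of an electronic device that implements the abnormal user identification method provided by an embodiment of the present invention.
The electronic device 1 may include a processor 10, a memory 11, a communication bus 12, and a communication interface 13, and may further include a computer program stored in the memory 11 and executable on the processor 10, such as an abnormal user identification program.
In some embodiments, the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including combinations of one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and various control chips. The processor 10 is the control core (Control Unit) of the electronic device. It connects the components of the entire electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing the programs or modules stored in the memory 11 (for example, the abnormal user identification program) and by calling the data stored in the memory 11.
The memory 11 includes at least one type of readable storage medium, including flash memory, removable hard disks, multimedia cards, card-type memory (for example SD or DX memory), magnetic memory, magnetic disks, and optical discs. In some embodiments the memory 11 may be an internal storage unit of the electronic device, for example a removable hard disk of the electronic device. In other embodiments the memory 11 may be an external storage device of the electronic device, for example a plug-in removable hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the electronic device. The memory 11 may also include both an internal storage unit and an external storage device of the electronic device. The memory 11 can be used not only to store application software installed on the electronic device and various kinds of data, such as the code of the abnormal user identification program, but also to temporarily store data that has been output or is about to be output.
The communication bus 12 may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to provide connection and communication between the memory 11 and the at least one processor 10, among other components.
The communication interface 13 is used for communication between the electronic device and other devices, and includes a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a display or an input unit (such as a keyboard); optionally, the user interface may also be a standard wired or wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be called a display screen or display unit, and is used to display the information processed in the electronic device and to display a visual user interface.
FIG. 4 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in FIG. 4 does not limit the electronic device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
For example, although not shown, the electronic device may further include a power supply (such as a battery) that powers the components. Preferably, the power supply is logically connected to the at least one processor 10 through a power management apparatus, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management apparatus. The power supply may also include one or more DC or AC power sources, a recharging apparatus, a power failure detection circuit, a power converter or inverter, a power status indicator, and any other such components. The electronic device may further include various sensors, a Bluetooth module, a Wi-Fi module, and so on, which are not described further here.
It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.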
The abnormal user identification program stored in the memory 11 of the electronic device 1 is a combination of multiple instructions which, when run on the processor 10, can implement:
collecting historical complaint work order information, and extracting appeal details from the historical complaint work order information to obtain the appeal detail texts of multiple users;
extracting the keywords in each appeal detail text, and generating a first abnormal user set based on the keywords;
obtaining the device information of the multiple users based on the historical complaint work order information, and generating a second abnormal user set according to the device information of the multiple users and preset device information conditions;
obtaining a user knowledge graph constructed from user information, identifying a third abnormal user set from the user knowledge graph according to the first abnormal user set and the second abnormal user set, and aggregating it with the first abnormal user set and the second abnormal user set to obtain the abnormal user set.
Specifically, for how the processor 10 implements the above instructions, refer to the description of the relevant steps in the embodiments corresponding to the drawings; details are not repeated here.
Further, if the modules/units integrated in the electronic device 1 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, or a read-only memory (ROM).
The present invention further provides a computer-readable storage medium that stores a computer program which, when executed by the processor of an electronic device, can implement:
collecting historical complaint work order information, and extracting appeal details from the historical complaint work order information to obtain the appeal detail texts of multiple users;
extracting the keywords in each appeal detail text, and generating a first abnormal user set based on the keywords;
obtaining the device information of the multiple users based on the historical complaint work order information, and generating a second abnormal user set according to the device information of the multiple users and preset device information conditions;
obtaining a user knowledge graph constructed from user information, identifying a third abnormal user set from the user knowledge graph according to the first abnormal user set and the second abnormal user set, and aggregating it with the first abnormal user set and the second abnormal user set to obtain the abnormal user set.
In the several embodiments provided by the present invention, it should be understood that the disclosed device, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be apparent to those skilled in the art that the present invention is not limited to the details of the exemplary embodiments above, and that the present invention can be implemented in other specific forms without departing from its spirit or essential characteristics.
The embodiments should therefore be regarded in every respect as exemplary and non-limiting. The scope of the present invention is defined by the appended claims rather than by the description above, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced by the present invention. No reference sign in the claims shall be construed as limiting the claim concerned.
The blockchain referred to in the present invention is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated and linked using cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and so on.
The embodiments of the present application may acquire and process the relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, methods, technology, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
Furthermore, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or apparatuses recited in the system claims may also be implemented by a single unit or apparatus through software or hardware. Words such as "first" and "second" are used as names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present invention may be modified or replaced with equivalents without departing from the spirit and scope of those technical solutions.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111268127.8A CN113987206A (en) | 2021-10-29 | 2021-10-29 | Abnormal user identification method, device, equipment and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111268127.8A CN113987206A (en) | 2021-10-29 | 2021-10-29 | Abnormal user identification method, device, equipment and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN113987206A true CN113987206A (en) | 2022-01-28 |
Family
ID=79743991
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202111268127.8A Pending CN113987206A (en) | 2021-10-29 | 2021-10-29 | Abnormal user identification method, device, equipment and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN113987206A (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108076032A (en) * | 2016-11-15 | 2018-05-25 | 中国移动通信集团广东有限公司 | A kind of abnormal behaviour user identification method and device |
| CN109034661A (en) * | 2018-08-28 | 2018-12-18 | 腾讯科技(深圳)有限公司 | User identification method, device, server and storage medium |
| CN111949803A (en) * | 2020-08-21 | 2020-11-17 | 深圳供电局有限公司 | A method, device and device for detecting abnormal network users based on knowledge graph |
| CN113255929A (en) * | 2021-05-27 | 2021-08-13 | 支付宝(杭州)信息技术有限公司 | Method and device for acquiring interpretable reasons of abnormal user |
-
2021
- 2021-10-29 CN CN202111268127.8A patent/CN113987206A/en active Pending
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115775178A (en) * | 2022-12-27 | 2023-03-10 | 平安银行股份有限公司 | Method, device, electronic equipment and storage medium for obtaining root causes of bank complaints |
| CN117094688A (en) * | 2023-10-20 | 2023-11-21 | 国网信通亿力科技有限责任公司 | Digital control method and system for power supply station |
| CN117094688B (en) * | 2023-10-20 | 2023-12-19 | 国网信通亿力科技有限责任公司 | Digital control method and system for power supply station |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112541745B (en) | User behavior data analysis method and device, electronic equipment and readable storage medium | |
| CN113364753B (en) | Anti-crawler method and device, electronic equipment and computer readable storage medium | |
| CN113836131A (en) | Big data cleaning method and device, computer equipment and storage medium | |
| CN113807553B (en) | Quantity analysis method, device, equipment and storage medium for reservation service | |
| CN113792089B (en) | Illegal behavior detection method, device, equipment and medium based on artificial intelligence | |
| CN112507230B (en) | Webpage recommendation method and device based on browser, electronic equipment and storage medium | |
| CN112380859A (en) | Public opinion information recommendation method and device, electronic equipment and computer storage medium | |
| CN114066533A (en) | Product recommendation method, device, electronic device and storage medium | |
| CN113707302A (en) | Service recommendation method, device, equipment and storage medium based on associated information | |
| CN113987206A (en) | Abnormal user identification method, device, equipment and storage medium | |
| CN113919738A (en) | Business handling window distribution method and device, electronic equipment and readable storage medium | |
| CN115238179A (en) | Project pushing method and device, electronic equipment and computer readable storage medium | |
| CN116795777A (en) | Enterprise document management method, device, equipment and storage medium | |
| CN114840531B (en) | Data model reconstruction method, device, equipment and medium based on blood edge relation | |
| CN114461630B (en) | Smart attribution analysis method, device, equipment and storage medium | |
| CN114840660A (en) | Service recommendation model training method, device, equipment and storage medium | |
| CN114547696A (en) | File desensitization method and device, electronic equipment and storage medium | |
| CN114003720A (en) | Business document classification method, device, equipment and storage medium | |
| CN111563527A (en) | Abnormal event detection method and device | |
| CN117609368A (en) | A genealogy analysis system, method, equipment and medium based on off-chain storage | |
| CN114996386A (en) | Business role identification method, device, equipment and storage medium | |
| CN116680449A (en) | Method, device, equipment and medium for carrying out same user identification on multi-source data | |
| CN114510695A (en) | Malicious account detection method, device, device and medium | |
| CN116578696A (en) | Text abstract generation method, device, equipment and storage medium | |
| CN116303645A (en) | Method, device, equipment and storage medium for quickly locating interpersonal relationship paths |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20220128 |