
CN119515520A - Method and device for locating the root cause of event failure - Google Patents


Publication number
CN119515520A
Authority
CN
China
Prior art keywords
event
fault
processed
source
primary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311024638.4A
Other languages
Chinese (zh)
Inventor
练婉利
徐玉梅
姜妙怡
徐颖
徐雅静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202311024638.4A priority Critical patent/CN119515520A/en
Publication of CN119515520A publication Critical patent/CN119515520A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02: Banking, e.g. interest calculation or account maintenance
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present invention discloses an event fault root source positioning method and device, which can be used in the field of artificial intelligence technology. The method comprises: obtaining a pending fault event in a banking business handling process; determining a first-level event fault root source matching the pending fault event according to an inspection method associated with each first-level event fault root source in a first-level event fault root source list; the inspection method associated with the first-level event fault root source is used to verify whether the corresponding first-level event fault root source occurs; determining a second-level event fault root source list subordinate to the first-level event fault root source according to the first-level event fault root source matching the pending fault event; when a second-level event fault root source matching the pending fault event exists in the second-level event fault root source list, sending the second-level event fault root source matching the pending fault event to a user; the present invention realizes rapid positioning of event fault root sources, improves the accuracy of event fault root source positioning, and improves governance efficiency.

Description

Event fault source positioning method and device
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an event fault source positioning method and device.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
As new technologies are continuously updated and new requirements emerge rapidly, large multi-module software systems must keep pace with technical changes, policy updates, market changes and the like, which results in frequent fault events. Multi-layer linkage of the architecture, distributed applications, data fragmentation and frequent version iteration lead to long delays in locating and repairing events, and the temporary workarounds are mostly to close the affected transactions or to roll back the version.
In summary, there is a need for a method for locating the source of an event fault to solve the above-mentioned problems.
Disclosure of Invention
The embodiment of the invention provides an event fault source positioning method, which is used for improving the accuracy and efficiency of event fault source positioning, and comprises the following steps:
acquiring a fault event to be processed in the banking business handling process;
determining a primary event fault source matched with the to-be-processed fault event according to the checking mode associated with each primary event fault source in the primary event fault source list;
determining a secondary event fault source list subordinate to the primary event fault source according to the primary event fault source matched with the to-be-processed fault event;
when a secondary event fault source matched with the to-be-processed fault event exists in the secondary event fault source list, sending the secondary event fault source matched with the to-be-processed fault event to a user;
after receiving a fault treatment instruction of the user, retrieving corresponding treatment scheme information according to the secondary event fault source matched with the to-be-processed fault event, wherein the treatment scheme information comprises a treatment script, and executing the treatment script;
determining an event description corresponding to the to-be-processed fault event when no primary event fault source matched with the to-be-processed fault event exists;
determining the similarity between the to-be-processed fault event and each historical event according to the event description corresponding to the to-be-processed fault event;
and executing the treatment script contained in the treatment scheme information corresponding to the historical event with the highest similarity.
The embodiment of the invention also provides an event fault source positioning device, which is used for improving the accuracy and efficiency of event fault source positioning, and comprises:
The matching module is used for acquiring a fault event to be processed in the banking business handling process; determining a first-level event fault source matched with a to-be-processed fault event according to an inspection mode associated with each first-level event fault source in a first-level event fault source list, wherein the inspection mode associated with the first-level event fault source is used for verifying whether the corresponding first-level event fault source occurs or not;
The management module is used for retrieving corresponding management scheme information according to a secondary event fault source matched with a to-be-processed fault event after receiving a fault management instruction of a user, wherein the management scheme information comprises a management script, executing the management script, determining event description corresponding to the to-be-processed fault event when the primary event fault source matched with the to-be-processed fault event does not exist, determining similarity between the to-be-processed fault event and each historical event according to the event description corresponding to the to-be-processed fault event, determining management scheme information corresponding to the historical event with the highest similarity, and executing the management script contained in the management scheme information corresponding to the historical event with the highest similarity.
The embodiment of the invention also provides computer equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the event fault source positioning method when executing the computer program.
The embodiment of the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the event fault source positioning method when being executed by a processor.
The embodiment of the invention also provides a computer program product, which comprises a computer program, wherein the computer program is executed by a processor to realize the event fault source positioning method.
In the embodiment of the invention, a fault event to be processed in the banking business handling process is acquired. A primary event fault source matched with the to-be-processed fault event is determined according to the inspection mode associated with each primary event fault source in the primary event fault source list, where the inspection mode associated with a primary event fault source is used to verify whether that primary event fault source has occurred. A secondary event fault source list subordinate to the matched primary event fault source is then determined. When the secondary event fault source list contains a secondary event fault source matched with the to-be-processed fault event, that secondary event fault source is sent to the user. After a fault treatment instruction of the user is received, the corresponding treatment scheme information, which comprises a treatment script, is retrieved according to the matched secondary event fault source, and the treatment script is executed. When no primary event fault source matches the to-be-processed fault event, the event description corresponding to the fault event is determined, the similarity between the fault event and each historical event is determined according to that event description, the treatment scheme information corresponding to the historical event with the highest similarity is determined, and the treatment script contained in that treatment scheme information is executed.
Compared with the prior art, after the primary event fault source matched with the to-be-processed fault event is determined according to the checking mode of the association of each primary event fault source in the primary event fault source list, the to-be-processed fault event is continuously matched with the secondary event fault source, so that the event fault source is rapidly positioned, the positioning accuracy of the event fault source is improved, a treatment scheme is provided, and the treatment efficiency is improved.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. In the drawings:
FIG. 1 is a flow chart of an event fault source positioning method provided by the invention;
FIG. 2 is a flow chart of the event fault source positioning method provided by the invention;
FIG. 3 is a flow chart of the event fault source positioning method provided by the invention;
FIG. 4 is a flow chart of the event fault source localization method provided by the invention;
FIG. 5 is a schematic structural diagram of an event fault source positioning device provided by the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings. The exemplary embodiments of the present invention and their descriptions herein are for the purpose of explaining the present invention, but are not to be construed as limiting the invention.
Fig. 1 is a flow chart corresponding to an event fault source positioning method according to an embodiment of the present invention, as shown in fig. 1, where the method includes:
Step 101, obtaining a fault event to be processed in a banking business handling process.
Step 102, determining the primary event fault source matched with the to-be-processed fault event according to the checking mode of each primary event fault source association in the primary event fault source list.
It should be noted that, the checking mode of the primary event fault source association is used to verify whether the corresponding primary event fault source occurs.
For example, the primary event fault source list is searched for the fault event to be processed, and the primary event fault sources q1, q2, ..., qn are polled to check whether each one matches. Suppose q1 is "date cut failure". The checking mode associated with q1, an SQL statement, queries the database and finds that the current date is T-1 when it should be T. This yields the conclusion that the date cut failed, so the q1 check passes and q1 is matched as the primary event fault source.
If the date obtained from the database is T when q1 is checked, the q1 check fails, and it can be inferred that the to-be-processed fault event was not caused by q1. The match fails, and the steps are repeated for each qi (1 ≤ i ≤ n) until the traversal is finished.
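The q1 check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the table name `system_date`, the column `current_date_value`, the function name and the sample dates are all assumptions made up for the example, and an in-memory SQLite database stands in for the production database.

```python
import sqlite3


def check_date_cut_failure(conn, expected_date):
    """Hypothetical checking mode for q1 ("date cut failure").

    The check passes (returns True) when the date stored in the database
    differs from the expected business date T, i.e. the date cut did not happen.
    """
    row = conn.execute("SELECT current_date_value FROM system_date").fetchone()
    return row is not None and row[0] != expected_date


# Illustrative data only: the database still holds T-1 while T is expected.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE system_date (current_date_value TEXT)")
conn.execute("INSERT INTO system_date VALUES ('2023-08-14')")

q1_matched = check_date_cut_failure(conn, expected_date="2023-08-15")
print(q1_matched)  # True: the date cut failed, so q1 is matched
```

If the stored date already equals T, the function returns False and the traversal would move on to the next primary event fault source.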
Step 103, determining a secondary event fault source list subordinate to the primary event fault source according to the primary event fault source matched with the to-be-processed fault event.
Step 104, when the secondary event fault source list contains a secondary event fault source matched with the to-be-processed fault event, sending that secondary event fault source to the user.
For example, the secondary event fault source list is polled to check for a match. If the q1 check passes, the secondary root cause list q1-1, q1-2, ..., q1-i, ..., q1-n corresponding to q1 is matched in turn. The secondary root cause list may be empty, which means that q1 is both the apparent root cause and the deepest root cause. If all secondary sources are traversed and no match is found, the apparent cause is known but the specific deepest cause is not, and manual intervention is required. For example, if q1 is "date cut failure" and q1-1 is "batch not completed", it is necessary to check whether q1-1 holds. The "checking script" associated with q1-1 queries the database and finds that unexecuted batches exist, so the q1-1 check passes.
For another example, q1 is "communication area error, cardtype type does not match" and q1-1 is "downstream communication area transformation". A more complex checking script is needed to check whether q1-1 matches, with the following steps: 1, pull the failing service from the log; 2, pull its downstream services from the production calling rule library; 3, search the software requirements to confirm whether a downstream service has a transformation step; 4, if a matching software requirement contains the transformation, the q1-1 check passes. If no transformation requirement is found among the software requirements, the q1-1 check fails. Here the semantic similarity to the software requirement must reach 80% for the check to be marked as passed.
The matched q1-1 check result is recorded and is returned to the front end when the root cause is returned.
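The three outcomes of the secondary traversal (empty list, a match, no match) can be sketched as below. The function name, the result dictionary and the representation of each checking script as a zero-argument callable are assumptions for illustration only.

```python
def locate_secondary(primary_name, secondary_checks):
    """Traverse the secondary sources under an already matched primary source.

    secondary_checks is a list of (name, check) pairs, where check is a
    callable returning True when the secondary source is confirmed.
    """
    if not secondary_checks:
        # Empty list: the primary source is also the deepest root cause.
        return {"root": primary_name, "level": "primary", "manual": False}
    for name, check in secondary_checks:
        if check():
            return {"root": name, "level": "secondary", "manual": False}
    # Apparent cause known, deepest cause unknown: manual intervention needed.
    return {"root": primary_name, "level": "primary", "manual": True}


result = locate_secondary(
    "date cut failure",
    [("batch not completed", lambda: True)],
)
print(result["root"])  # batch not completed
```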
Step 105, after receiving the fault treatment instruction of the user, retrieving corresponding treatment scheme information according to the secondary event fault source matched with the to-be-processed fault event.
The treatment scheme information comprises a treatment script, and the treatment script is executed.
Step 106, determining an event description corresponding to the to-be-processed fault event when no primary event fault source matched with the to-be-processed fault event exists.
Step 107, determining the similarity between the to-be-processed fault event and each historical event according to the event description corresponding to the to-be-processed fault event.
Step 108, determining the treatment scheme information corresponding to the historical event with the highest similarity, and executing the treatment script contained in that treatment scheme information.
According to the scheme, after the primary event fault sources matched with the to-be-processed fault event are determined according to the detection mode of the association of each primary event fault source in the primary event fault source list, the to-be-processed fault event is continuously matched with the secondary event fault source, so that the event fault sources are rapidly positioned, the positioning accuracy of the event fault sources is improved, the treatment scheme is provided, and the treatment efficiency is improved.
In the embodiment of the invention, the event description comprises an event name, an event ID, an event level, an occurrence time, a solution time, an event phenomenon, a related application module, an influence range, a primary event fault source, a secondary event fault source, a primary checking script, a secondary treatment scheme, a secondary treatment script, a primary treatment scheme, a primary treatment script and the like.
The solution time, the event sources, the inspection scripts, the treatment schemes and the treatment scripts are left blank when the event is first recorded, and are supplemented by the user after the event has been received and confirmed.
In one possible implementation, the checking mode confirms whether the event fault source holds by means of defined SQL statements, log searches and environment screening, and returns a Y/N indication of whether the event fault source is successfully matched.
The treatment script contained in the treatment scheme information eliminates the event fault source by executing database, container, network and application changes through the shell. When both the primary event fault source and the secondary event fault source are matched, the secondary treatment scheme and the secondary treatment script are output preferentially. When the primary event fault source is matched but no secondary event fault source is matched, the primary treatment scheme information and the primary treatment script are output.
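The output priority described above can be sketched as a small selection function. The field names (`secondary_scheme`, `secondary_script`, `primary_scheme`, `primary_script`) and the sample values are hypothetical; the text only states which level is preferred.

```python
def select_treatment(record, primary_matched, secondary_matched):
    """Pick the treatment scheme and script to output for a matched event.

    Returns a (scheme, script) tuple, or None when no primary source matched.
    """
    if primary_matched and secondary_matched:
        # Secondary level is the more specific root cause, so it is preferred.
        return record["secondary_scheme"], record["secondary_script"]
    if primary_matched:
        return record["primary_scheme"], record["primary_script"]
    return None


record = {
    "primary_scheme": "rerun the date cut",
    "primary_script": "run_date_cut.sh",
    "secondary_scheme": "finish the pending batch, then rerun the date cut",
    "secondary_script": "finish_batch.sh",
}
print(select_treatment(record, True, True)[1])   # finish_batch.sh
print(select_treatment(record, True, False)[1])  # run_date_cut.sh
```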
After obtaining a fault event to be processed in the banking business handling process, the embodiment of the invention generates a primary event fault source list and displays it at the front end.
In the embodiment of the invention, the event analysis system is divided into a foreground management system and a background retrieval system. The foreground management system receives the user's standardized event description, performs conventional text word segmentation, intention recognition and synonym replacement, and passes the optimized event description to the background retrieval system, which generates the event root source.
The background retrieval system mainly comprises the following parts S01 to S06.
S01, pulling the event library to form an event rule base.
The event library comprises event names, event IDs, event levels, occurrence times, solution times, event phenomena, related application modules, influence ranges, primary event sources, secondary event sources, primary inspection scripts, secondary inspection scripts, primary treatment schemes, primary treatment scripts, secondary treatment schemes and secondary treatment scripts. The events are aggregated and supplemented on this basis to form the event rule base, which comprises the events, the number of occurrences of each primary event source and the number of occurrences of each secondary event source.
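A minimal sketch of the aggregation step, assuming each event-library record is a dictionary with hypothetical keys `primary_source` and `secondary_source`; the resulting counts are what a frequency-ordered traversal of root sources would consume.

```python
from collections import Counter


def build_event_rule_base(event_library):
    """Aggregate event records into occurrence counts per root source."""
    primary_counts = Counter()
    secondary_counts = Counter()
    for event in event_library:
        primary = event.get("primary_source")
        secondary = event.get("secondary_source")
        if primary:
            primary_counts[primary] += 1
            if secondary:
                # Secondary sources are counted under their primary source.
                secondary_counts[(primary, secondary)] += 1
    return {
        "events": list(event_library),
        "primary_counts": primary_counts,
        "secondary_counts": secondary_counts,
    }


# Illustrative records only.
library = [
    {"event_id": "E1", "primary_source": "date cut failure", "secondary_source": "batch not completed"},
    {"event_id": "E2", "primary_source": "date cut failure", "secondary_source": ""},
    {"event_id": "E3", "primary_source": "communication area error", "secondary_source": ""},
]
rule_base = build_event_rule_base(library)
print(rule_base["primary_counts"].most_common(1))  # [('date cut failure', 2)]
```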
S02, pulling the business data to form a business rule base.
The business architecture is sorted out by the enterprise and subdivided into a product model, a flow model and an entity model. The product model describes service conditions and processing rules, the flow model contains standardized processing logic, and the entity model contains the rules for how the enterprise stores data. Together they describe the business rules of a software product. For example, opening a card at a branch requires the user to provide elements such as an identity card, a mobile phone number and an address; after the teller verifies the materials, the information is entered, the card is opened and the card is handed to the user.
By warehousing related product models, process models and entity models, normal and abnormal business processes and business rule requirements of transactions can be provided.
S03, pulling the change record to form a change rule base.
Changes made in the production environment to the system environment or to business rules are generally not reflected in the software requirements or the business rules, so they need to be recorded and analyzed independently. System environment changes include, for example, disaster recovery, database migration, network changes, container replica changes, and CPU or memory expansion or contraction.
The change rule base generally includes date, module, background, execution content, business impact.
S04, pulling the software requirement records to form a soft requirement rule base.
The software requirement records capture the increment of each version change. Because version changes are recorded in chronological order, analyzing the sequence of transformations can provide first-hand information about why an event occurred. A software requirement transformation record includes a general overview (describing the transformation background), logic processing (the transformation implementation flow), system menus and portals (the transaction columns involved), and technical and business checkpoints (the verification points involved).
Causal relationship reasoning is formed from the software requirement records based on SCM and is recorded in the soft requirement rule base.
S05, pulling the production full-link log library to form a module calling rule library.
To store the call logic between production services, the module calling rule library typically includes information such as a timestamp, a globally unique tracking id, a module id, a parent module id and the request/return message.
From the tracking id and the timestamp, a call chain of the form A service -> B service -> C service can be formed. This helps to locate the calling relations between module services and, finally, the deepest event source.
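A call chain of this kind can be reconstructed roughly as follows. The record keys (`trace_id`, `timestamp`, `module_id`, `parent_module_id`) mirror the fields listed above but their exact names are assumptions, and the sketch only follows a single linear chain rather than a full call tree.

```python
def build_call_chain(records, trace_id):
    """Rebuild a linear call chain for one tracking id from log records."""
    spans = sorted(
        (r for r in records if r["trace_id"] == trace_id),
        key=lambda r: r["timestamp"],
    )
    # Map each parent module to the modules it called, in time order.
    children = {}
    for span in spans:
        children.setdefault(span["parent_module_id"], []).append(span["module_id"])

    chain, current, seen = [], None, set()
    while children.get(current):
        nxt = children[current][0]  # follow the earliest call only
        if nxt in seen:             # guard against cyclic records
            break
        chain.append(nxt)
        seen.add(nxt)
        current = nxt
    return chain


# Illustrative log records; the entry module has no parent (None).
logs = [
    {"trace_id": "t1", "timestamp": 3, "module_id": "C", "parent_module_id": "B"},
    {"trace_id": "t1", "timestamp": 1, "module_id": "A", "parent_module_id": None},
    {"trace_id": "t1", "timestamp": 2, "module_id": "B", "parent_module_id": "A"},
    {"trace_id": "t2", "timestamp": 1, "module_id": "X", "parent_module_id": None},
]
print(" -> ".join(build_call_chain(logs, "t1")))  # A -> B -> C
```

The last module in the chain is the natural place to start looking for the deepest event source.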
S06, pulling the code library and the table structures to form a code rule library.
In order to map the code library to its semantics, the code and its comments are analyzed line by line to form the code rule library.
In step 102, according to the inspection mode associated with each primary event fault source in the primary event fault source list, the primary event fault source matched with the to-be-processed fault event is determined, and the step flow is shown in fig. 2, and specifically includes the following steps:
Step 201, sorting the plurality of primary event fault sources according to the number of occurrences of each primary event fault source in the primary event fault source list. For example, with occurrence counts of 100, 20, ..., 1, the sources are sorted from high to low, and the match of each qi is checked in that order.
Step 202, sequentially matching the to-be-processed fault event with each first-level event fault source according to the sequencing result until the matching is successful, and obtaining the first-level event fault source matched with the to-be-processed fault event.
In the embodiment of the invention, when the current primary event fault source is checked using its associated checking mode and the check passes, it is determined that the current primary event fault source is successfully matched with the to-be-processed fault event.
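Steps 201 and 202 amount to a frequency-ordered search that stops at the first passing check. The sketch below assumes each source is a dictionary with hypothetical keys `name`, `count` and `check` (a callable standing in for the associated checking mode).

```python
def match_primary(sources):
    """Check primary sources from most to least frequent; stop at first match."""
    ordered = sorted(sources, key=lambda s: s["count"], reverse=True)
    for source in ordered:
        if source["check"]():
            return source["name"]
    return None  # no primary source matched


checked = []  # records the order in which checks were run


def make_check(name, passes):
    def check():
        checked.append(name)
        return passes
    return check


sources = [
    {"name": "q3", "count": 1, "check": make_check("q3", True)},
    {"name": "q1", "count": 100, "check": make_check("q1", False)},
    {"name": "q2", "count": 20, "check": make_check("q2", True)},
]
print(match_primary(sources))  # q2
print(checked)                 # ['q1', 'q2']
```

Checking the most frequent sources first means the common cases are confirmed with the fewest checks; q3 is never evaluated here because q2 already matched.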
In the embodiment of the invention, after a secondary event fault source list subordinate to a primary event fault source is determined according to the primary event fault source matched with a fault event to be processed, the step flow is shown in a figure 3, and the method specifically comprises the following steps:
step 301, when the secondary event fault source list is empty, the primary event fault source matched with the to-be-processed fault event is sent to the user.
Step 302, after receiving a fault governance instruction of a user, retrieving corresponding governance scheme information according to a primary event fault source matched with a to-be-processed fault event, and executing a governance script contained in the governance scheme information.
In the embodiment of the invention, the system automatically saves an event snapshot, which comprises the event description and the recommended event sources. Some hours or days later, once the operation and maintenance personnel have verified that the event was indeed caused by the recommended event source, they return to the system and search for the event snapshot by date and keywords.
In one possible implementation, the user clicks "generate event governance scheme" in the snapshot, and the system retrieves the governance scheme and governance script based on the event root source in the snapshot and outputs them to the foreground for the operation and maintenance personnel's reference.
In the embodiment of the invention, the specific treatment scheme is described in Chinese, including but not limited to executing SQL statements, rolling back the container version, expanding the container CPU and memory, increasing the number of replicas, closing a program and the like.
The user clicks "script editing" to query and edit the treatment script. The script is the historical script of a historical event. After the user associates the application database environment, and the container registration environment and network environment are automatically corrected, clicking "submit" applies the script to the production environment.
In step 107, according to the event description corresponding to the fault event to be processed, the embodiment of the present invention determines the similarity between the fault event to be processed and each historical event, and the step flow is shown in fig. 4, and specifically includes the following steps:
step 401, determining an event description vector of the fault event to be processed according to the event description corresponding to the fault event to be processed by using a word2vec model.
Step 402, determining the similarity between the fault event to be processed and each historical event according to the event description vector of the fault event to be processed.
For a newly input event a, the event description is extracted and converted into an event description vector using word2vec, and the similarity between a and each historical event b is calculated using the following similarity formula r<a,b>.
For example:
r<a,b>=cov(a,b)
In the embodiment of the invention, the smaller the r value, the higher the similarity, and the best-matching historical event can be found by comparing the r values.
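The computation can be sketched as below, under several loud assumptions: a tiny hand-written embedding table stands in for a trained word2vec model, the description vector is taken as the mean of its word vectors, and r is computed as the sample covariance of the two vectors, as the formula is written. The ranking follows the text's stated convention that a smaller r is a better match; the `smaller_is_better` flag is an added, hypothetical switch in case the intended score is oriented the other way.

```python
def describe_vector(text, embeddings):
    """Average the vectors of known words (stand-in for a word2vec lookup)."""
    words = [w for w in text.lower().split() if w in embeddings]
    if not words:
        raise ValueError("no known words in event description")
    dim = len(next(iter(embeddings.values())))
    return [sum(embeddings[w][i] for w in words) / len(words) for i in range(dim)]


def covariance(a, b):
    """Sample covariance of two equal-length vectors: r<a,b> = cov(a, b)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def rank_history(new_vec, history_vecs, smaller_is_better=True):
    """Return historical event ids ordered from best to worst match, plus scores."""
    scores = {eid: covariance(new_vec, vec) for eid, vec in history_vecs.items()}
    order = sorted(scores, key=scores.get, reverse=not smaller_is_better)
    return order, scores


# Toy vectors for illustration only.
history = {"h1": [1.0, 2.0, 3.0], "h2": [3.0, 2.0, 1.0]}
order, scores = rank_history([1.0, 2.0, 3.0], history)
print(scores)  # {'h1': 1.0, 'h2': -1.0}
print(order)   # ['h2', 'h1'] under the "smaller r is better" convention
```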
According to this scheme, when the event description is matched but no event fault source is matched, word2vec is used to match against historical events, the best-matching historical event is found, and its secondary treatment scheme is used to display the event treatment scheme and treatment script.
In the embodiment of the invention, the user can supplement the event source, the treatment scheme and the treatment script, and this information is written back to the event rule base for persistent storage.
The event snapshot is divided into a "pending closure" state and a "closed" state according to the completeness of its information. When the corresponding treatment scheme information is retrieved according to the secondary event fault source matched with the to-be-processed fault event, the snapshot is initialized in the "pending closure" state, and it is updated to the "closed" state after the event information has been completed.
Events in the "pending closure" state continuously send alarm mails to the user, so that the user can track and process them in time and bring them to the "closed" state as soon as possible.
For events in the "closed" state, the event complexity can be calculated from the time taken to close the case, the modules involved, the event level and manual scoring, and events of higher complexity are used for learning and production improvement.
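The two snapshot states and the alarm rule can be sketched as below. The state labels `"open"` and `"closed"`, the class and field names, and the choice of which fields count as "complete" are all assumptions for illustration; the text only says the state depends on information completeness.

```python
from dataclasses import dataclass

# Fields assumed to be required before a snapshot counts as complete.
REQUIRED_FIELDS = ("event_source", "treatment_scheme", "treatment_script")


@dataclass
class EventSnapshot:
    description: str
    event_source: str = ""
    treatment_scheme: str = ""
    treatment_script: str = ""

    @property
    def state(self):
        """'closed' once every required field is filled in, otherwise 'open'."""
        complete = all(getattr(self, name) for name in REQUIRED_FIELDS)
        return "closed" if complete else "open"


def needs_alarm_mail(snapshot):
    """Incomplete snapshots keep triggering alarm mails until they are closed."""
    return snapshot.state == "open"


snap = EventSnapshot(description="date cut failed at branch system")
print(snap.state, needs_alarm_mail(snap))  # open True

snap.event_source = "batch not completed"
snap.treatment_scheme = "finish the pending batch"
snap.treatment_script = "finish_batch.sh"
print(snap.state, needs_alarm_mail(snap))  # closed False
```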
The embodiment of the invention also provides an event fault source positioning device, as described in the following embodiment. The device is shown in FIG. 5 and comprises:
The matching module 501 is configured to obtain a fault event to be processed in a banking transaction process; determining a first-level event fault source matched with a to-be-processed fault event according to an inspection mode associated with each first-level event fault source in a first-level event fault source list, wherein the inspection mode associated with the first-level event fault source is used for verifying whether the corresponding first-level event fault source occurs or not;
the governance module 502 is configured to retrieve corresponding governance scheme information according to a second-level event fault source matched with a to-be-processed fault event after receiving a fault governance instruction of a user, where the governance scheme information includes a governance script, execute the governance script, determine an event description corresponding to the to-be-processed fault event when there is no first-level event fault source matched with the to-be-processed fault event, determine similarity between the to-be-processed fault event and each historical event according to the event description corresponding to the to-be-processed fault event, determine governance scheme information corresponding to the historical event with the highest similarity, and execute the governance script included in the governance scheme information corresponding to the historical event with the highest similarity.
In the embodiment of the present invention, the matching module 501 is specifically configured to:
sort the primary event fault sources according to the number of occurrences of each primary event fault source in the primary event fault source list;
and match the to-be-processed fault event against each primary event fault source in turn, following the sorted order, until a match succeeds, thereby obtaining the primary event fault source that matches the to-be-processed fault event.
In the embodiment of the present invention, the matching module 501 is specifically configured to:
check the current primary event fault source using the inspection mode associated with it and, when the check passes, determine that the current primary event fault source successfully matches the to-be-processed fault event.
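The ranked matching performed by the matching module can be sketched as follows. This is a minimal illustration, not the patented implementation: the `PrimaryFaultSource` type, the `match_primary` function, the event fields, and the sample checks are all hypothetical, and sorting in descending order of occurrence count is an assumption (the text only says the sources are sorted by number of occurrences).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical event representation; the source does not define a data model.
Event = dict


@dataclass
class PrimaryFaultSource:
    name: str
    occurrences: int                 # how many times this source has occurred
    check: Callable[[Event], bool]   # inspection mode: True if the source occurred


def match_primary(event: Event,
                  sources: List[PrimaryFaultSource]) -> Optional[PrimaryFaultSource]:
    """Try sources from most to least frequent; return the first whose check passes."""
    # Assumption: most frequent first, so common causes are tested early.
    ranked = sorted(sources, key=lambda s: s.occurrences, reverse=True)
    for source in ranked:
        if source.check(event):
            return source
    return None  # no primary event fault source matched


# Illustrative catalog with made-up checks.
sources = [
    PrimaryFaultSource("database", 3, lambda e: e.get("error_code") == "DB_DOWN"),
    PrimaryFaultSource("network", 12, lambda e: e.get("error_code") == "TIMEOUT"),
    PrimaryFaultSource("parameter", 7, lambda e: "bad_param" in e),
]
matched = match_primary({"id": "E1", "error_code": "TIMEOUT"}, sources)
unmatched = match_primary({"id": "E2"}, sources)
```

Here `matched` is the "network" source and `unmatched` is `None`, which is the case that would hand the event over to similarity-based lookup.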
In the embodiment of the present invention, the matching module 501 is further configured to:
after the secondary event fault source list subordinate to the primary event fault source has been determined according to the primary event fault source that matches the to-be-processed fault event, and when that secondary event fault source list is empty, send the matched primary event fault source to the user;
after receiving a fault treatment instruction from the user, retrieve the corresponding treatment scheme information according to the primary event fault source that matches the to-be-processed fault event, and execute the treatment script contained in the treatment scheme information.
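The two-level lookup, including the empty-list case just described, might look like the sketch below. The `SECONDARY` and `SCHEMES` tables, the `locate` and `treat` functions, and the event fields are hypothetical names introduced for illustration; treatment scripts are modelled as plain Python callables, which is an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

Event = dict
Check = Callable[[Event], bool]

# Hypothetical catalog: primary source name -> list of (secondary name, check).
SECONDARY: Dict[str, List[Tuple[str, Check]]] = {
    "network": [
        ("dns_failure", lambda e: e.get("detail") == "dns"),
        ("link_timeout", lambda e: e.get("detail") == "timeout"),
    ],
    "parameter": [],  # no secondary sources registered for this primary source
}

# Hypothetical treatment schemes: fault source name -> treatment script.
SCHEMES: Dict[str, Callable[[Event], str]] = {
    "link_timeout": lambda e: f"restarted link for {e['id']}",
    "parameter": lambda e: f"reset parameters for {e['id']}",
}


def locate(event: Event, primary: str) -> Optional[str]:
    """Return the fault source to report for an already-matched primary source."""
    secondary = SECONDARY.get(primary, [])
    if not secondary:
        return primary  # empty secondary list: report the primary source itself
    for name, check in secondary:
        if check(event):
            return name
    return None  # non-empty list with no match; the text does not cover this case


def treat(event: Event, source: str, user_confirmed: bool) -> Optional[str]:
    """Run the treatment script only after the user's fault treatment instruction."""
    if not user_confirmed or source not in SCHEMES:
        return None
    return SCHEMES[source](event)


event = {"id": "E7", "detail": "timeout"}
located = locate(event, "network")
outcome = treat(event, located, user_confirmed=True)
```

Keeping `locate` and `treat` separate mirrors the text: the fault source is first sent to the user, and the script runs only once the user issues the treatment instruction.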
In the embodiment of the present invention, the treatment module 502 is specifically configured to:
determine an event description vector of the to-be-processed fault event from the event description corresponding to the to-be-processed fault event by using a word2vec model;
and determine the similarity between the to-be-processed fault event and each historical event according to the event description vector of the to-be-processed fault event.
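The vector-based similarity step can be illustrated with the sketch below. Several things here are assumptions rather than details from the text: a small hand-written `WORD_VECTORS` table stands in for a trained word2vec model, a description vector is taken as the average of its word vectors, and cosine similarity is used as the similarity measure (the text names neither the pooling method nor the metric).

```python
import math
from typing import Dict, List, Sequence, Tuple

# Toy word vectors standing in for a trained word2vec model (values are made up).
WORD_VECTORS: Dict[str, List[float]] = {
    "transfer": [1.0, 0.0, 0.0],
    "timeout": [0.0, 1.0, 0.0],
    "login": [0.0, 0.0, 1.0],
    "failed": [0.5, 0.5, 0.5],
}
DIM = 3


def describe_vector(description: str) -> List[float]:
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [WORD_VECTORS[w] for w in description.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def most_similar(description: str, history: Dict[str, str]) -> Tuple[str, float]:
    """Return (historical event id, similarity) for the closest historical event."""
    target = describe_vector(description)
    scored = [(event_id, cosine(target, describe_vector(text)))
              for event_id, text in history.items()]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical historical events keyed by id.
history = {"H1": "login failed", "H2": "transfer timeout failed"}
best_id, score = most_similar("transfer timeout", history)
```

With this toy data the closest historical event is `H2`; in practice the vectors would come from a word2vec model trained on real event descriptions.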
Because the device solves the problem on a principle similar to that of the event fault source positioning method, its implementation may refer to the implementation of that method, and repeated details are omitted.
The embodiment of the invention also provides a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the event fault source positioning method when executing the computer program.
The embodiment of the invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the event fault source positioning method.
The embodiment of the invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the event fault source positioning method.
In the embodiment of the invention, a to-be-processed fault event arising during banking business handling is acquired. The primary event fault source that matches the to-be-processed fault event is determined according to the inspection mode associated with each primary event fault source in the primary event fault source list, wherein the inspection mode associated with a primary event fault source is used to verify whether that primary event fault source has occurred. The secondary event fault source list subordinate to the matched primary event fault source is then determined; when that list contains a secondary event fault source that matches the to-be-processed fault event, the matched secondary event fault source is sent to the user. After a fault treatment instruction is received from the user, the corresponding treatment scheme information, which includes a treatment script, is retrieved according to the matched secondary event fault source, and the treatment script is executed. When no primary event fault source matches the to-be-processed fault event, the event description corresponding to the to-be-processed fault event is determined, the similarity between the to-be-processed fault event and each historical event is determined according to that event description, the treatment scheme information corresponding to the historical event with the highest similarity is determined, and the treatment script contained in that treatment scheme information is executed.
Compared with the prior art, after the primary event fault source that matches the to-be-processed fault event has been determined according to the inspection mode associated with each primary event fault source in the primary event fault source list, the to-be-processed fault event is further matched against secondary event fault sources, so that the event fault source is located quickly and more accurately; a treatment scheme is also provided, which improves treatment efficiency.
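The overall control flow summarized above, a rule-based path with a fallback to historical events, can be sketched as a single dispatcher. This is only an outline under stated assumptions: `handle_fault` and every callable passed to it are hypothetical placeholders for the steps named in the text, and the behaviour when a non-empty secondary list yields no match is not specified in the source, so the sketch simply returns `None` there.

```python
from typing import Callable, Optional

Event = dict


def handle_fault(
    event: Event,
    match_primary: Callable[[Event], Optional[str]],
    resolve_source: Callable[[Event, str], Optional[str]],
    confirm: Callable[[str], bool],
    run_script_for_source: Callable[[str, Event], str],
    run_script_for_similar_history: Callable[[str], str],
) -> Optional[str]:
    """Rule-based path first; fall back to the most similar historical event."""
    primary = match_primary(event)
    if primary is None:
        # No primary event fault source matched: use the event description.
        return run_script_for_similar_history(event.get("description", ""))
    # Secondary source if one matches, or the primary when its secondary list is empty.
    source = resolve_source(event, primary)
    if source is None:
        return None  # case not covered by the text; left unhandled in this sketch
    # The located source is sent to the user, who issues the treatment instruction.
    if confirm(source):
        return run_script_for_source(source, event)
    return None


# Stub callables, purely for illustration.
fallback_result = handle_fault(
    {"description": "transfer timeout"},
    match_primary=lambda e: None,
    resolve_source=lambda e, p: None,
    confirm=lambda s: True,
    run_script_for_source=lambda s, e: f"ran script for {s}",
    run_script_for_similar_history=lambda d: f"ran script of history closest to '{d}'",
)
rule_result = handle_fault(
    {"description": "transfer timeout"},
    match_primary=lambda e: "network",
    resolve_source=lambda e, p: "link_timeout",
    confirm=lambda s: True,
    run_script_for_source=lambda s, e: f"ran script for {s}",
    run_script_for_similar_history=lambda d: "unused",
)
```

Passing the individual steps in as callables keeps the dispatcher independent of how matching, confirmation, and script execution are actually implemented.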
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing description of the embodiments has been provided to illustrate the general principles of the invention. The embodiments are merely specific examples and are not intended to limit the scope of the invention; any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. An event fault source positioning method, characterized by comprising:
acquiring a to-be-processed fault event arising during banking business handling;
determining the primary event fault source that matches the to-be-processed fault event according to the inspection mode associated with each primary event fault source in a primary event fault source list;
determining, according to the primary event fault source that matches the to-be-processed fault event, the secondary event fault source list subordinate to that primary event fault source;
when the secondary event fault source list contains a secondary event fault source that matches the to-be-processed fault event, sending the matched secondary event fault source to a user;
after receiving a fault treatment instruction from the user, retrieving the corresponding treatment scheme information according to the secondary event fault source that matches the to-be-processed fault event, wherein the treatment scheme information comprises a treatment script;
when no primary event fault source matches the to-be-processed fault event, determining the event description corresponding to the to-be-processed fault event;
determining the similarity between the to-be-processed fault event and each historical event according to the event description corresponding to the to-be-processed fault event;
and executing the treatment script contained in the treatment scheme information corresponding to the historical event with the highest similarity.
2. The event fault source positioning method according to claim 1, wherein determining the primary event fault source that matches the to-be-processed fault event according to the inspection mode associated with each primary event fault source in the primary event fault source list comprises:
sorting the primary event fault sources according to the number of occurrences of each primary event fault source in the primary event fault source list;
and matching the to-be-processed fault event against each primary event fault source in turn, following the sorted order, until a match succeeds, thereby obtaining the primary event fault source that matches the to-be-processed fault event.
3. The event fault source positioning method according to claim 2, wherein matching the to-be-processed fault event against each primary event fault source in turn, following the sorted order, until a match succeeds comprises:
checking the current primary event fault source using the inspection mode associated with it and, when the check passes, determining that the current primary event fault source successfully matches the to-be-processed fault event.
4. The event fault source positioning method according to claim 1, further comprising, after determining the secondary event fault source list subordinate to the primary event fault source according to the primary event fault source that matches the to-be-processed fault event:
when the secondary event fault source list is empty, sending the primary event fault source that matches the to-be-processed fault event to the user.
5. The event fault source positioning method according to claim 4, further comprising, after sending the primary event fault source that matches the to-be-processed fault event to the user when the secondary event fault source list is empty:
after receiving a fault treatment instruction from the user, retrieving the corresponding treatment scheme information according to the primary event fault source that matches the to-be-processed fault event, and executing the treatment script contained in the treatment scheme information.
6. The event fault source positioning method according to claim 1, wherein determining the similarity between the to-be-processed fault event and each historical event according to the event description corresponding to the to-be-processed fault event comprises:
determining an event description vector of the to-be-processed fault event according to the event description corresponding to the to-be-processed fault event;
and determining the similarity between the to-be-processed fault event and each historical event according to the event description vector of the to-be-processed fault event.
7. The event fault source positioning method according to claim 6, wherein determining the event description vector of the to-be-processed fault event according to the event description corresponding to the to-be-processed fault event comprises:
determining the event description vector of the to-be-processed fault event from the event description corresponding to the to-be-processed fault event by using a word2vec model.
8. An event fault source positioning device, comprising:
a matching module, configured to: acquire a to-be-processed fault event arising during banking business handling; and determine the primary event fault source that matches the to-be-processed fault event according to the inspection mode associated with each primary event fault source in a primary event fault source list, wherein the inspection mode associated with a primary event fault source is used to verify whether that primary event fault source has occurred;
a treatment module, configured to: after receiving a fault treatment instruction from a user, retrieve the corresponding treatment scheme information according to the secondary event fault source that matches the to-be-processed fault event, wherein the treatment scheme information comprises a treatment script, and execute the treatment script; when no primary event fault source matches the to-be-processed fault event, determine the event description corresponding to the to-be-processed fault event; determine the similarity between the to-be-processed fault event and each historical event according to that event description; determine the treatment scheme information corresponding to the historical event with the highest similarity; and execute the treatment script contained in that treatment scheme information.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7.
CN202311024638.4A 2023-08-15 2023-08-15 Method and device for locating the root cause of event failure Pending CN119515520A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311024638.4A CN119515520A (en) 2023-08-15 2023-08-15 Method and device for locating the root cause of event failure

Publications (1)

Publication Number Publication Date
CN119515520A true CN119515520A (en) 2025-02-25

Family

ID=94647261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311024638.4A Pending CN119515520A (en) 2023-08-15 2023-08-15 Method and device for locating the root cause of event failure

Country Status (1)

Country Link
CN (1) CN119515520A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103001811A (en) * 2012-12-31 2013-03-27 北京启明星辰信息技术股份有限公司 Method and device for fault locating
CN106603264A (en) * 2015-10-20 2017-04-26 阿里巴巴集团控股有限公司 Method and equipment for positioning fault root
CN110704231A (en) * 2019-09-30 2020-01-17 深圳前海微众银行股份有限公司 A fault handling method and device
CN111930547A (en) * 2020-07-31 2020-11-13 中国工商银行股份有限公司 Fault positioning method and device and storage medium
WO2021190357A1 (en) * 2020-03-27 2021-09-30 华为技术有限公司 Fault detection method and device
CN113887606A (en) * 2021-09-28 2022-01-04 上海工业自动化仪表研究院有限公司 A Fault Diagnosis Method for Electronic Equipment Control System Based on Fault Tree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination