Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings. The exemplary embodiments of the present invention and their descriptions herein are for the purpose of explaining the present invention, but are not to be construed as limiting the invention.
Fig. 1 is a flow chart of an event fault source positioning method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step 101, obtaining a fault event to be processed in a banking business handling process.
Step 102, determining the primary event fault source matched with the to-be-processed fault event according to the checking mode associated with each primary event fault source in the primary event fault source list.
It should be noted that the checking mode associated with a primary event fault source is used to verify whether the corresponding primary event fault source has occurred.
For example, the primary event fault source list is searched according to the to-be-processed fault event, and the primary event fault sources q1, q2, ..., qn are polled and checked for a match one by one. Suppose q1 is "date cut failure". The checking mode associated with q1 is an "sql statement" that queries the database. If the query returns a current date of T-1 when it should be T, the conclusion is that the date cut failed, q1 passes the check, and the primary event fault source is matched.
If instead the date returned by the database is T, the q1 check fails, and it can be inferred that the to-be-processed fault event was not caused by q1. The match fails, and the above steps are repeated for each qi (i = 1 to n) until a match is found or the traversal is finished.
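The polling described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the table `system_date`, its column `current_date_value`, the fixed expected date and the names `FaultSource`, `date_cut_failed` and `match_primary` are all assumptions introduced for the example, and an in-memory SQLite database stands in for the production database.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FaultSource:
    name: str
    # The associated checking mode: returns True when the fault source occurred.
    check: Callable[[sqlite3.Connection], bool]


def date_cut_failed(conn: sqlite3.Connection, expected_date: str = "2024-01-02") -> bool:
    """SQL-based check: the fault is confirmed when the stored date is not T."""
    # Hypothetical table and column names, used only for this sketch.
    row = conn.execute("SELECT current_date_value FROM system_date").fetchone()
    return row is not None and row[0] != expected_date


def match_primary(sources: List[FaultSource], conn: sqlite3.Connection) -> Optional[FaultSource]:
    """Poll q1..qn and return the first source whose check passes, else None."""
    for source in sources:
        if source.check(conn):
            return source
    return None


# Demo: the database still holds T-1, so the "date cut failure" check passes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE system_date (current_date_value TEXT)")
conn.execute("INSERT INTO system_date VALUES ('2024-01-01')")
q1 = FaultSource("date cut failure", date_cut_failed)
matched = match_primary([q1], conn)
```

When the stored date equals T, `match_primary` returns `None`, which corresponds to the case where the fault event was not caused by q1.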
Step 103, determining a secondary event fault source list subordinate to the primary event fault source according to the primary event fault source matched with the to-be-processed fault event.
Step 104, when the secondary event fault source list contains a secondary event fault source matched with the to-be-processed fault event, sending the secondary event fault source matched with the to-be-processed fault event to a user.
For example, the secondary event fault source list is polled and checked for a match. After q1 passes its check, the secondary event fault sources q1-1, q1-2, ..., q1-n subordinate to q1 are matched in turn. The secondary list may be empty, meaning that q1 is both the apparent cause and the deepest root cause. If all secondary event fault sources are traversed and none matches, the apparent cause is known but the specific deepest cause is not, and manual intervention is required. For example, if q1 is "date cut failure" and q1-1 is "batch not completed", it is necessary to check whether q1-1 holds. The "checking script" associated with q1-1 queries the database for the batch execution status; if a batch that has not been executed exists, the q1-1 check passes.
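The three possible outcomes of the secondary traversal (empty list, a match, or no match) can be sketched as below. The function name `locate_root`, the outcome labels and the in-memory `pending_batches` list are hypothetical and only illustrate the decision logic; a real checking script would query the database instead.

```python
from typing import Callable, Dict, List, Optional, Tuple

Check = Callable[[], bool]  # returns True when the fault source is confirmed


def locate_root(
    primary: str,
    secondary_lists: Dict[str, List[Tuple[str, Check]]],
) -> Tuple[str, Optional[str]]:
    """Return (outcome, fault source name) for an already matched primary source."""
    candidates = secondary_lists.get(primary, [])
    if not candidates:
        # Empty secondary list: the primary source is also the deepest cause.
        return ("primary_is_root", primary)
    for name, check in candidates:
        if check():
            return ("secondary_matched", name)
    # The apparent cause is known, the deepest cause is not.
    return ("manual_intervention", None)


# Hypothetical data: one batch job has not been executed yet.
pending_batches = ["EOD-17"]
secondary_lists = {
    "date cut failure": [("batch not completed", lambda: len(pending_batches) > 0)],
}
result = locate_root("date cut failure", secondary_lists)
# result == ("secondary_matched", "batch not completed")
```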
For another example, q1 is "communication area error, cardtype does not match" and q1-1 is "downstream communication area transformed". A more complex checking script is needed to check whether q1-1 matches, including the following steps: 1, pulling the service that reported the error from the log; 2, pulling its downstream service from the production calling rule base; 3, searching the soft requirements to confirm whether the downstream service has undergone a transformation; 4, if a matching transformation requirement is found, the q1-1 check passes. If no transformation requirement is found among the soft requirements, the q1-1 check fails. Here, a soft requirement is considered a match only when its semantic similarity reaches 80%.
The matched q1-1 check result is recorded and returned to the front end together with the fault source.
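The four-step checking script can be sketched as follows. The log format, the shape of the calling rule base and of the soft requirement records, and the function names are assumptions made for this example. The source does not say how semantic similarity is computed, so `difflib.SequenceMatcher` is used here purely as a character-level stand-in; only the 80% threshold comes from the text.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

SIMILARITY_THRESHOLD = 0.8  # the 80% threshold stated in the text


def similarity(a: str, b: str) -> float:
    """Stand-in for semantic similarity (an assumption, not the source's model)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def downstream_transformed(
    log_lines: List[str],
    call_rules: Dict[str, str],
    soft_requirements: List[Tuple[str, str]],
    query: str,
) -> bool:
    # Step 1: pull the service that reported the error from the log.
    failing = next((ln.split()[1] for ln in log_lines if ln.startswith("ERROR")), None)
    if failing is None:
        return False
    # Step 2: pull its downstream service from the calling rule base.
    downstream = call_rules.get(failing)
    if downstream is None:
        return False
    # Step 3: collect the soft requirements recorded for the downstream service.
    candidates = [text for service, text in soft_requirements if service == downstream]
    # Step 4: the check passes if any requirement is similar enough to the query.
    return any(similarity(text, query) >= SIMILARITY_THRESHOLD for text in candidates)


# Hypothetical sample data.
log_lines = ["INFO svcA ok", "ERROR svcA cardtype mismatch in communication area"]
call_rules = {"svcA": "svcB"}
soft_requirements = [
    ("svcB", "rework of the downstream communication area"),
    ("svcC", "new report"),
]
passed = downstream_transformed(
    log_lines, call_rules, soft_requirements, "rework of downstream communication area"
)
```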
Step 105, after receiving the fault treatment instruction of the user, retrieving corresponding treatment scheme information according to the secondary event fault source matched with the to-be-treated fault event.
The treatment scheme information comprises a treatment script, and the treatment script is executed.
Step 106, determining an event description corresponding to the to-be-processed fault event when no primary event fault source matched with the to-be-processed fault event exists.
Step 107, determining the similarity between the to-be-processed fault event and each historical event according to the event description corresponding to the to-be-processed fault event.
Step 108, determining the treatment scheme information corresponding to the historical event with the highest similarity, and executing the treatment script contained in that treatment scheme information.
According to the scheme, after the primary event fault source matched with the to-be-processed fault event is determined according to the checking mode associated with each primary event fault source in the primary event fault source list, matching continues against the secondary event fault sources. The event fault source is thereby located quickly, the positioning accuracy is improved, and a treatment scheme is provided, which improves treatment efficiency.
In the embodiment of the invention, the event description comprises an event name, an event ID, an event level, an occurrence time, a solution time, an event phenomenon, a related application module, an influence range, a primary event fault source, a secondary event fault source, a primary checking script, a secondary treatment scheme, a secondary treatment script, a primary treatment scheme, a primary treatment script and the like.
The solution time, the event fault sources, the checking scripts, the treatment schemes and the treatment scripts are left blank when the event is first recorded, and are supplemented by the user after confirmation.
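As an illustration of the event description record, the sketch below models the listed fields as a Python dataclass in which the fields that are blank at pre-recording default to `None`. The class name, field names, field types and sample values are assumptions for this example.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class EventDescription:
    # Fields known when the event is first recorded.
    event_name: str
    event_id: str
    event_level: str
    occurrence_time: str
    event_phenomenon: str
    application_modules: List[str]
    influence_range: str
    # Fields left blank at pre-recording and supplemented by the user later.
    solution_time: Optional[str] = None
    primary_source: Optional[str] = None
    secondary_source: Optional[str] = None
    primary_check_script: Optional[str] = None
    primary_treatment_scheme: Optional[str] = None
    primary_treatment_script: Optional[str] = None
    secondary_treatment_scheme: Optional[str] = None
    secondary_treatment_script: Optional[str] = None

    def blank_fields(self) -> List[str]:
        """Names of the fields that still need to be supplemented."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Hypothetical pre-recorded event.
event = EventDescription(
    event_name="Date cut failure at day end",
    event_id="EVT-001",
    event_level="P2",
    occurrence_time="2024-01-02T00:05",
    event_phenomenon="transactions rejected with wrong accounting date",
    application_modules=["core-ledger"],
    influence_range="all branches",
)
```

Calling `event.blank_fields()` lists what the user still has to fill in after confirmation.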
In one possible implementation, the checking mode confirms whether the event fault source holds through a defined sql statement, a log search or environment screening, and returns a Y/N indication of whether the event fault source is successfully matched.
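A minimal sketch of checks that return a Y/N indication is given below for the log search and environment screening cases. The function names, the regular expression and the rule that an environment check confirms the fault when the observed value differs from the expected one are assumptions for illustration.

```python
import re
from typing import Iterable, Mapping


def to_flag(passed: bool) -> str:
    """Convert a boolean check result into the Y/N indication."""
    return "Y" if passed else "N"


def check_log(lines: Iterable[str], pattern: str) -> str:
    """Log search: Y when any log line matches the pattern."""
    return to_flag(any(re.search(pattern, line) for line in lines))


def check_environment(env: Mapping[str, str], key: str, expected: str) -> str:
    """Environment screening: Y when the observed value deviates from the expected one."""
    return to_flag(env.get(key) != expected)


# Hypothetical inputs.
log_flag = check_log(["12:00 INFO start", "12:01 ERROR batch EOD-17 not finished"], r"ERROR .*batch")
env_flag = check_environment({"replicas": "3"}, "replicas", "3")
```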
The treatment script contained in the treatment scheme information eliminates the event fault source by executing database, container, network and application changes through the shell. When both the primary event fault source and a secondary event fault source are matched, the secondary treatment scheme and the secondary treatment script are output preferentially; when the primary event fault source is matched but no secondary event fault source is matched, the primary treatment scheme and the primary treatment script are output.
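The output priority described above can be sketched as a small selection function. The `Treatment` tuple, the `catalog` mapping, the sample scheme and script strings, and the fallback to the primary treatment when a matched secondary source has no catalog entry are assumptions for this example.

```python
from typing import Dict, NamedTuple, Optional


class Treatment(NamedTuple):
    scheme: str  # human-readable treatment scheme
    script: str  # treatment script to execute


def select_treatment(
    primary: Optional[str],
    secondary: Optional[str],
    catalog: Dict[str, Treatment],
) -> Optional[Treatment]:
    """Prefer the secondary treatment; fall back to the primary one."""
    if primary is None:
        return None
    if secondary is not None and secondary in catalog:
        return catalog[secondary]
    return catalog.get(primary)


# Hypothetical catalog entries.
catalog = {
    "date cut failure": Treatment("re-run the date cut", "run_date_cut.sh"),
    "batch not completed": Treatment("finish the pending batch, then cut the date", "run_batch.sh"),
}
chosen = select_treatment("date cut failure", "batch not completed", catalog)
```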
After a fault event to be processed in a banking business handling process is obtained, the embodiment of the invention generates a primary event fault source list and displays it on the front end.
In the embodiment of the invention, the event analysis system is divided into a foreground management system and a background retrieval system. The foreground management system receives the standardized event description from the user, performs conventional text word segmentation, intention recognition and synonym replacement, and transmits the optimized event description to the background retrieval system to generate the event fault source.
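The foreground preprocessing can be illustrated with a very small sketch covering word segmentation and synonym replacement only; intention recognition is left out. The synonym table, the regular-expression tokenizer and the function name are assumptions, and a production system handling Chinese text would need a proper word segmenter.

```python
import re
from typing import Dict, List

# Hypothetical synonym table mapping variants to a canonical term.
SYNONYMS: Dict[str, str] = {"crash": "failure", "failed": "failure", "db": "database"}


def normalize_description(text: str, synonyms: Dict[str, str] = SYNONYMS) -> List[str]:
    """Split the description into lowercase tokens and replace known synonyms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [synonyms.get(token, token) for token in tokens]


tokens = normalize_description("DB date-cut FAILED")
# tokens == ["database", "date", "cut", "failure"]
```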
The background retrieval system mainly comprises the following components S01 to S06.
S01, pulling an event library to form an event rule base.
The event library comprises event names, event IDs, event levels, occurrence times, solution times, event phenomena, related application modules, influence ranges, primary event sources, secondary event sources, primary inspection scripts, secondary inspection scripts, primary treatment schemes, primary treatment scripts, secondary treatment schemes and secondary treatment scripts. Events are collected and supplemented on this basis to form the event rule base, which comprises the events, the number of occurrences of each primary event fault source and the number of occurrences of each secondary event fault source.
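Forming the event rule base from the event library can be sketched as a simple aggregation. The dictionary keys `primary_source` and `secondary_source` and the function name are assumptions, and only the occurrence counting is shown.

```python
from collections import Counter
from typing import Dict, List, Optional


def build_event_rule_base(events: List[Dict[str, Optional[str]]]) -> dict:
    """Aggregate events into occurrence counts per primary and secondary source."""
    primary_counts: Counter = Counter()
    secondary_counts: Counter = Counter()
    for event in events:
        primary = event.get("primary_source")
        secondary = event.get("secondary_source")
        if primary:
            primary_counts[primary] += 1
            if secondary:
                # Secondary sources are counted under their primary source.
                secondary_counts[(primary, secondary)] += 1
    return {
        "events": list(events),
        "primary_counts": primary_counts,
        "secondary_counts": secondary_counts,
    }


# Hypothetical event library.
rule_base = build_event_rule_base([
    {"event_id": "E1", "primary_source": "date cut failure", "secondary_source": "batch not completed"},
    {"event_id": "E2", "primary_source": "date cut failure", "secondary_source": None},
    {"event_id": "E3", "primary_source": "network jitter", "secondary_source": None},
])
```

The primary counts produced here are the kind of statistic that the frequency-based ordering of fault sources relies on.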
S02, pulling the business data to form a business rule base.
The enterprise organizes its business architecture and subdivides it into a product model, a flow model and an entity model, wherein the product model describes service conditions and processing rules, the flow model comprises standardized processing logic, and the entity model comprises rules for how the enterprise stores data. Together these describe the business rules of the software product. For example, to open a card at a branch, the user must provide elements such as an identity card, a mobile phone number and an address; after the teller verifies the materials, the information is entered, the card is opened, and the card is handed to the user.
By warehousing the related product models, flow models and entity models, the normal and abnormal business processes and the business rule requirements of transactions can be provided.
S03, pulling the change record to form a change rule base.
Modifications that the production environment makes to the system environment or the business rules are generally not reflected in the soft requirements or the business rules, and therefore need to be recorded and analyzed independently. System environment changes include, for example, disaster recovery, database migration, network changes, changes to the number of container copies, and CPU or memory expansion or contraction.
The change rule base generally includes the date, module, background, execution content and business impact.
S04, pulling the software requirement records to form a soft requirement rule base.
The software requirement records capture the increment of each version change. Version changes are recorded in time order, and analyzing the sequence of transformations can provide first-hand information about the occurrence of an event. A software requirement transformation record includes a general overview (describing the background of the transformation), logic processing (the implementation flow of the transformation), system menus and portals (the transaction columns involved), and technical and business checkpoints (the verification points involved).
Causal reasoning is performed on the software requirement records based on SCM, and the result is recorded in the soft requirement rule base.
S05, pulling the production full-link log library to form a module calling rule base.
The calling logic between the services running in production is warehoused. The module calling rule base typically includes information such as a timestamp, a globally unique tracking id, a module id, a parent module id and the request/return messages.
From the tracking id and the timestamp, a call chain of the form A service -> B service -> C service can be formed. This assists in locating the calling relationship between module services and, finally, in locating the deepest event fault source.
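Reconstructing a call chain from the module calling rule base can be sketched as follows. The record keys mirror the fields listed above, but their exact names, the sample records and the assumption of a single linear chain per tracking id are introduced for this example.

```python
from typing import Dict, List


def build_call_chain(records: List[Dict], tracking_id: str) -> List[str]:
    """Order the spans of one request by time and follow the parent links."""
    spans = sorted(
        (r for r in records if r["tracking_id"] == tracking_id),
        key=lambda r: r["timestamp"],
    )
    chain: List[str] = []
    for span in spans:
        # Start with the earliest span, then keep spans called by the current tail.
        if not chain or span["parent_module_id"] == chain[-1]:
            chain.append(span["module_id"])
    return chain


# Hypothetical records, deliberately out of order and mixed with another request.
records = [
    {"timestamp": 3, "tracking_id": "t-1", "module_id": "C", "parent_module_id": "B"},
    {"timestamp": 1, "tracking_id": "t-1", "module_id": "A", "parent_module_id": None},
    {"timestamp": 2, "tracking_id": "t-2", "module_id": "X", "parent_module_id": None},
    {"timestamp": 2, "tracking_id": "t-1", "module_id": "B", "parent_module_id": "A"},
]
chain = build_call_chain(records, "t-1")
# " -> ".join(chain) == "A -> B -> C"
```

The last element of the chain is the deepest service reached by the request, which is where the search for the deepest event fault source would start.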
S06, pulling the code library and the table structure to form a code rule base.
To map the code library to its semantics, the code and its comments need to be analyzed line by line to form the code rule base.
In step 102, the primary event fault source matched with the to-be-processed fault event is determined according to the checking mode associated with each primary event fault source in the primary event fault source list. The step flow is shown in Fig. 2 and specifically includes the following steps:
Step 201, sorting the plurality of primary event fault sources according to the number of occurrences of each primary event fault source in the primary event fault source list. For example, with occurrence counts of 100, 20, ..., 1, the event fault sources are sorted from high to low, and each qi is checked for a match in that order.
Step 202, sequentially matching the to-be-processed fault event with each first-level event fault source according to the sequencing result until the matching is successful, and obtaining the first-level event fault source matched with the to-be-processed fault event.
In the embodiment of the invention, the current primary event fault source is checked using its associated checking mode, and when the check passes, the current primary event fault source is determined to be successfully matched with the fault event to be processed.
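Steps 201 and 202 can be sketched as a sort followed by a sequential check that stops at the first success. The function name, the sample counts and the `make_check` helper, which records the order in which checks run, are assumptions for this example.

```python
from typing import Callable, Dict, List, Optional


def match_by_frequency(
    occurrences: Dict[str, int],
    checks: Dict[str, Callable[[], bool]],
) -> Optional[str]:
    """Check fault sources from most to least frequent; stop at the first match."""
    ordered = sorted(occurrences, key=occurrences.get, reverse=True)
    for name in ordered:
        if checks[name]():
            return name
    return None


checked: List[str] = []  # records which checks actually ran, in order


def make_check(name: str, result: bool) -> Callable[[], bool]:
    def check() -> bool:
        checked.append(name)
        return result
    return check


# Hypothetical occurrence counts and check results.
occurrences = {"network jitter": 20, "date cut failure": 100, "disk full": 1}
checks = {
    "date cut failure": make_check("date cut failure", False),
    "network jitter": make_check("network jitter", True),
    "disk full": make_check("disk full", True),
}
matched_source = match_by_frequency(occurrences, checks)
# matched_source == "network jitter"; "disk full" is never checked
```

Ordering by historical frequency means the most likely causes are verified first, so fewer checks run on average before a match is found.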
In the embodiment of the invention, after the secondary event fault source list subordinate to the primary event fault source is determined according to the primary event fault source matched with the fault event to be processed, the step flow is shown in Fig. 3 and specifically includes the following steps:
Step 301, when the secondary event fault source list is empty, sending the primary event fault source matched with the to-be-processed fault event to the user.
Step 302, after receiving a fault governance instruction of a user, retrieving corresponding governance scheme information according to a primary event fault source matched with a to-be-processed fault event, and executing a governance script contained in the governance scheme information.
In the embodiment of the invention, the system automatically saves an event snapshot, which comprises the event description and the recommended event fault sources. Some hours or days later, once the operation and maintenance personnel have confirmed that the event was indeed caused by the recommended event fault source, they return to the system and search for the event snapshot by date and keywords.
In one possible implementation, the user clicks "generate event governance scheme" in the snapshot, and the system retrieves the governance scheme and governance script according to the event fault source in the snapshot and outputs them to the foreground for the operation and maintenance personnel's reference.
In the embodiment of the invention, the specific treatment scheme is described in Chinese and includes, but is not limited to, executing SQL statements, rolling back the container version, expanding container CPU and memory, increasing the number of copies, and shutting down a program.
The user clicks "script editing" to query and edit the treatment script. The script is the historical script of a historical event; after the user associates the application database environment, and the container registration environment and the network environment are automatically corrected, clicking "submit" applies the script to the production environment.
In step 107, the similarity between the fault event to be processed and each historical event is determined according to the event description corresponding to the fault event to be processed. The step flow is shown in Fig. 4 and specifically includes the following steps:
Step 401, determining an event description vector of the fault event to be processed according to the event description corresponding to the fault event to be processed by using a word2vec model.
Step 402, determining the similarity between the fault event to be processed and each historical event according to the event description vector of the fault event to be processed.
For a newly input event a, the event description is extracted, an event description vector is obtained by using word2vec, and the similarity between a and each historical event b is calculated by using the following similarity formula r<a,b>.
For example:
r<a,b>=cov(a,b)
In the embodiment of the invention, the smaller the r value is, the higher the similarity is, and the best-matching historical event can be found by comparing the r values.
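Steps 401 and 402 can be illustrated with the sketch below, which rests on two loudly stated assumptions. First, a tiny hand-written embedding table (`EMBEDDINGS`) stands in for a trained word2vec model, and a description vector is taken as the average of its token vectors. Second, the score r used here is the cosine distance, swapped in for illustration because it follows the stated convention that a smaller r means a higher similarity; it is not the formula given in the text.

```python
import math
from typing import Dict, List

# Hypothetical 3-dimensional vectors standing in for a trained word2vec model.
EMBEDDINGS: Dict[str, List[float]] = {
    "date": [1.0, 0.0, 0.0],
    "cut": [0.9, 0.1, 0.0],
    "failure": [0.5, 0.5, 0.0],
    "network": [0.0, 0.0, 1.0],
    "timeout": [0.0, 0.2, 0.9],
}


def description_vector(tokens: List[str]) -> List[float]:
    """Average the vectors of the known tokens (zero vector if none is known)."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Assumed score r: 0 for identical directions, larger for less similar vectors."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(a, b)) / norm


def best_match(new_tokens: List[str], history: Dict[str, List[str]]) -> str:
    """Return the id of the historical event with the smallest r."""
    new_vector = description_vector(new_tokens)
    return min(
        history,
        key=lambda event_id: cosine_distance(new_vector, description_vector(history[event_id])),
    )


# Hypothetical historical events, already tokenized.
history = {"E1": ["date", "cut", "failure"], "E2": ["network", "timeout"]}
best = best_match(["date", "failure"], history)
```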
According to the scheme, when an event description is available but no event fault source is matched, word2vec is used to match against the historical events, the best-matching historical event is found, and its secondary treatment scheme is used to display the event treatment scheme and the treatment script.
In the embodiment of the invention, the user can supplement the event fault source, the treatment scheme and the treatment script, and this information is written back to the event rule base for persistent storage.
The event snapshot is divided into a "case open" state and a "case closed" state according to the completeness of its information. When the corresponding treatment scheme information is retrieved according to the secondary event fault source matched with the fault event to be processed, the snapshot is initialized to the "case open" state, and it is updated to the "case closed" state after the event information is completed.
An event in the "case open" state continuously sends alarm mails to the user, so that the user can track and process it in time and bring it to the "case closed" state as soon as possible.
For an event in the "case closed" state, the event complexity can be calculated according to the case closing time, the related modules, the event level and manual scoring, and events of higher complexity are used for learning and production improvement.
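The two snapshot states can be sketched as a property derived from the completeness of the record. The state names "open" and "closed", the choice of which fields count as required, and the class and function names are assumptions for this example; the complexity calculation is not shown because the text does not specify its formula.

```python
from dataclasses import dataclass
from typing import List, Optional

# Fields assumed to be required before a snapshot counts as complete.
REQUIRED_FIELDS = ("event_source", "treatment_scheme", "treatment_script")


@dataclass
class EventSnapshot:
    description: str
    recommended_source: str
    event_source: Optional[str] = None
    treatment_scheme: Optional[str] = None
    treatment_script: Optional[str] = None

    @property
    def state(self) -> str:
        """'closed' once every required field is filled in, otherwise 'open'."""
        complete = all(getattr(self, name) for name in REQUIRED_FIELDS)
        return "closed" if complete else "open"


def snapshots_needing_alert(snapshots: List[EventSnapshot]) -> List[EventSnapshot]:
    """Open snapshots are the ones that keep triggering alarm mails."""
    return [s for s in snapshots if s.state == "open"]


# Hypothetical snapshot that starts open and is closed after completion.
snapshot = EventSnapshot("wrong accounting date", "date cut failure")
initial_state = snapshot.state
snapshot.event_source = "batch not completed"
snapshot.treatment_scheme = "finish the pending batch, then cut the date"
snapshot.treatment_script = "run_batch.sh"
final_state = snapshot.state
```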
The embodiment of the invention also provides an event fault source positioning device, as described in the following embodiment. The device is shown in Fig. 5 and comprises:
The matching module 501 is configured to obtain a fault event to be processed in a banking transaction process; determining a first-level event fault source matched with a to-be-processed fault event according to an inspection mode associated with each first-level event fault source in a first-level event fault source list, wherein the inspection mode associated with the first-level event fault source is used for verifying whether the corresponding first-level event fault source occurs or not;
The governance module 502 is configured to: after receiving a fault governance instruction of a user, retrieve corresponding governance scheme information according to a secondary event fault source matched with the to-be-processed fault event, where the governance scheme information includes a governance script, and execute the governance script; when no primary event fault source matched with the to-be-processed fault event exists, determine an event description corresponding to the to-be-processed fault event; determine the similarity between the to-be-processed fault event and each historical event according to the event description corresponding to the to-be-processed fault event; and determine the governance scheme information corresponding to the historical event with the highest similarity, and execute the governance script included in that governance scheme information.
In the embodiment of the present invention, the matching module 501 is specifically configured to:
Sequencing a plurality of primary event fault sources according to the occurrence times of each primary event fault source in the primary event fault source list;
and sequentially matching the to-be-processed fault event with each first-level event fault source according to the sequencing result until the matching is successful, and obtaining the first-level event fault source matched with the to-be-processed fault event.
In the embodiment of the present invention, the matching module 501 is specifically configured to:
and when the current primary event fault source is checked by adopting a checking mode related to the current primary event fault source and the checking is passed, determining that the current primary event fault source is successfully matched with the fault event to be processed.
In the embodiment of the present invention, the matching module 501 is further configured to:
After a secondary event fault root list subordinate to the primary event fault root is determined according to the primary event fault root matched with the to-be-processed fault event, when the secondary event fault root list is empty, the primary event fault root matched with the to-be-processed fault event is sent to a user;
After receiving a fault governance instruction of a user, retrieving corresponding governance scheme information according to a primary event fault source matched with a to-be-processed fault event, and executing a governance script contained in the governance scheme information.
In the embodiment of the present invention, the governance module 502 is specifically configured to:
Determining an event description vector of the fault event to be processed according to the event description corresponding to the fault event to be processed by adopting a word2vec model;
And determining the similarity between the fault event to be processed and each historical event according to the event description vector of the fault event to be processed.
Because the principle by which the device solves the problem is similar to that of the event fault source positioning method, the implementation of the device may refer to the implementation of the method, and repeated details are omitted.
The embodiment of the invention also provides computer equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the event fault source positioning method when executing the computer program.
The embodiment of the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the event fault source positioning method when being executed by a processor.
The embodiment of the invention also provides a computer program product, which comprises a computer program, wherein the computer program is executed by a processor to realize the event fault source positioning method.
In the embodiment of the invention, a fault event to be processed in a banking business handling process is acquired. A primary event fault source matched with the to-be-processed fault event is determined according to the checking mode associated with each primary event fault source in a primary event fault source list, where the checking mode associated with a primary event fault source is used to verify whether that primary event fault source has occurred. A secondary event fault source list subordinate to the matched primary event fault source is then determined, and when the list contains a secondary event fault source matched with the to-be-processed fault event, that secondary event fault source is sent to a user. After a fault treatment instruction of the user is received, the corresponding treatment scheme information, which comprises a treatment script, is retrieved according to the matched secondary event fault source, and the treatment script is executed. When no primary event fault source matched with the to-be-processed fault event exists, an event description corresponding to the to-be-processed fault event is determined, the similarity between the to-be-processed fault event and each historical event is determined according to the event description, the treatment scheme information corresponding to the historical event with the highest similarity is determined, and the treatment script contained in that treatment scheme information is executed.
Compared with the prior art, after the primary event fault source matched with the to-be-processed fault event is determined according to the checking mode associated with each primary event fault source in the primary event fault source list, matching continues against the secondary event fault sources. The event fault source is thereby located quickly, the positioning accuracy is improved, and a treatment scheme is provided, which improves treatment efficiency.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing description of the embodiments has been provided for the purpose of illustrating the general principles of the invention, and is not meant to limit the scope of the invention or to limit the invention to the particular embodiments; any modifications, equivalents, improvements, etc. that fall within the spirit and principles of the invention are intended to be included within the scope of the invention.