Cotroneo et al., 2019 - Google Patents
FailViz: A tool for visualizing fault injection experiments in distributed systemsCotroneo et al., 2019
View PDF- Document ID
- 14983101762278598852
- Author
- Cotroneo D
- De Simone L
- Liguori P
- Natella R
- Bidokhti N
- Publication year
- Publication venue
- 2019 15th European Dependable Computing Conference (EDCC)
External Links
Snippet
The analysis of fault injection experiments can be a cumbersome task. These experiments can generate large volumes of data (eg, message traces), which a human analyst needs to inspect to understand the behavior of the system under failure. This paper introduces the …
- 238000002347 injection 0 title abstract description 26
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3495—Performance evaluation by tracing or monitoring for systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3676—Test management for coverage analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3668—Software testing
- G06F11/3672—Test management
- G06F11/3688—Test management for test execution, e.g. scheduling of test suites
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
- G06F11/3414—Workload generation, e.g. scripts, playback
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3636—Software debugging by tracing the execution of the program
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
- G06F11/3612—Software analysis for verifying properties of programs by runtime analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
- G06F11/0709—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0766—Error or fault reporting or storing
- G06F11/0775—Content or structure details of the error report, e.g. specific table structure, specific error fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3632—Software debugging of specific synchronisation aspects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
- G06F11/323—Visualisation of programs or trace data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/86—Event-based monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/875—Monitoring of systems including the internet
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Chen et al. | Pinpoint: Problem determination in large, dynamic internet services | |
| Xu et al. | POD-Diagnosis: Error diagnosis of sporadic operations on cloud applications | |
| Syer et al. | Leveraging performance counters and execution logs to diagnose memory-related performance issues | |
| Cotroneo et al. | Fault injection analytics: A novel approach to discover failure modes in cloud-computing systems | |
| Dean et al. | Perfcompass: Online performance anomaly fault localization and inference in infrastructure-as-a-service clouds | |
| Jiang | Automated analysis of load testing results | |
| Sun et al. | Non-intrusive anomaly detection with streaming performance metrics and logs for DevOps in public clouds: a case study in AWS | |
| US10360140B2 (en) | Production sampling for determining code coverage | |
| US20130086203A1 (en) | Multi-level monitoring framework for cloud based service | |
| Martino et al. | Logdiver: A tool for measuring resilience of extreme-scale systems and applications | |
| Cotroneo et al. | Enhancing failure propagation analysis in cloud computing systems | |
| CN116405412B (en) | Method and system for verifying cluster effectiveness of simulation server based on chaotic engineering faults | |
| Sharma et al. | Hansel: Diagnosing faults in openStack | |
| Cotroneo et al. | Sabrine: State-based robustness testing of operating systems | |
| Weng et al. | Argus: Debugging performance issues in modern desktop applications with annotated causal tracing | |
| Xu et al. | Detecting cloud provisioning errors using an annotated process model | |
| Ghanbari et al. | Stage-aware anomaly detection through tracking log points | |
| Cotroneo et al. | FailViz: A tool for visualizing fault injection experiments in distributed systems | |
| Chen et al. | Chronos: Finding timeout bugs in practical distributed systems by deep-priority fuzzing with transient delay | |
| Xiao et al. | Experience report: fault triggers in linux operating system: from evolution perspective | |
| Goel et al. | Gretel: Lightweight fault localization for openstack | |
| Šor et al. | Memory leak detection in Plumbr | |
| Japke et al. | The Early Microbenchmark Catches the Bug--Studying Performance Issues Using Micro-and Application Benchmarks | |
| Cinque et al. | A logging approach for effective dependability evaluation of complex systems | |
| Machado et al. | Minha: Large-scale distributed systems testing made practical |