Lou et al., 2019 - Google Patents
Comprehensive and efficient runtime checking in system software through watchdogs
- Document ID
 - 10483387570985565403
 - Author
 - Lou C
 - Huang P
 - Smith S
 - Publication year
 - 2019
 - Publication venue
 - Proceedings of the Workshop on Hot Topics in Operating Systems
 
Snippet
Systems software today is composed of numerous modules and exhibits complex failure modes. Existing failure detectors focus on catching simple, complete failures and treat programs uniformly at the process level. In this paper, we argue that modern software needs …
 
Classifications
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
 - G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
 - G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
 - G06F11/0721—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment within a central processing unit [CPU]
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/36—Preventing errors by testing or debugging software
 - G06F11/362—Software debugging
 - G06F11/3632—Software debugging of specific synchronisation aspects
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/36—Preventing errors by testing or debugging software
 - G06F11/362—Software debugging
 - G06F11/3636—Software debugging by tracing the execution of the program
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
 - G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
 - G06F11/0706—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment
 - G06F11/073—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation the processing taking place on a specific hardware platform or in a specific software environment in a memory management context, e.g. virtual memory or cache management
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/30—Monitoring
 - G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
 - G06F11/3466—Performance evaluation by tracing or monitoring
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/36—Preventing errors by testing or debugging software
 - G06F11/3668—Software testing
 - G06F11/3672—Test management
 - G06F11/3688—Test management for test execution, e.g. scheduling of test suites
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
 - G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
 - G06F11/0766—Error or fault reporting or storing
 - G06F11/0775—Content or structure details of the error report, e.g. specific table structure, specific error fields
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/36—Preventing errors by testing or debugging software
 - G06F11/3668—Software testing
 - G06F11/3672—Test management
 - G06F11/3676—Test management for coverage analysis
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/36—Preventing errors by testing or debugging software
 - G06F11/3604—Software analysis for verifying properties of programs
 - G06F11/3612—Software analysis for verifying properties of programs by runtime analysis
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
 - G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
 - G06F11/0793—Remedial or corrective actions
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
 - G06F11/16—Error detection or correction of the data by redundancy in hardware
 - G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
 - G06F11/14—Error detection or correction of the data by redundancy in operation
 - G06F11/1402—Saving, restoring, recovering or retrying
 - G06F11/1405—Saving, restoring, recovering or retrying at machine instruction level
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/30—Monitoring
 - G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
 - G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/36—Preventing errors by testing or debugging software
 - G06F11/3664—Environments for testing or debugging software
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F9/00—Arrangements for programme control, e.g. control unit
 - G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
 - G06F9/46—Multiprogramming arrangements
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F11/00—Error detection; Error correction; Monitoring
 - G06F11/22—Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
 - G06F2201/86—Event-based monitoring
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
 
- G—PHYSICS
 - G06—COMPUTING; CALCULATING; COUNTING
 - G06F—ELECTRICAL DIGITAL DATA PROCESSING
 - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
 
 
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Lou et al. | Understanding, detecting and localizing partial failures in large system software | |
| Lam et al. | Root causing flaky tests in a large-scale industrial setting | |
| Xu et al. | Early detection of configuration errors to reduce failure damage | |
| Lu et al. | Crashtuner: Detecting crash-recovery bugs in cloud systems via meta-info analysis | |
| US8726225B2 (en) | Testing of a software system using instrumentation at a logging module | |
| Musuvathi et al. | Finding and Reproducing Heisenbugs in Concurrent Programs | |
| Liu et al. | FCatch: Automatically detecting time-of-fault bugs in cloud systems | |
| Viennot et al. | Transparent mutable replay for multicore debugging and patch validation | |
| Gu et al. | Acto: Automatic end-to-end testing for operation correctness of cloud system management | |
| Li et al. | Dfix: automatically fixing timing bugs in distributed systems | |
| Chen et al. | Push-Button reliability testing for Cloud-Backed applications with rainmaker | |
| Lou et al. | Demystifying and checking silent semantic violations in large distributed systems | |
| Huang et al. | Understanding issue correlations: a case study of the hadoop system | |
| Lou et al. | Comprehensive and efficient runtime checking in system software through watchdogs | |
| Wu et al. | Efficient exposure of partial failure bugs in distributed systems with inferred abstract states | |
| Cotroneo et al. | Assessment and improvement of hang detection in the Linux operating system | |
| He et al. | HangFix: automatically fixing software hang bugs for production cloud systems | |
| Montrucchio et al. | Software-implemented fault injection in operating system kernel mutex data structure | |
| Levy et al. | Using unreliable virtual hardware to inject errors in extreme-scale systems | |
| Qiu et al. | Understanding and detecting server-side request races in web applications | |
| Van Der Kouwe et al. | On the soundness of silence: Investigating silent failures using fault injection experiments | |
| Zheng et al. | Towards concurrency race debugging: An integrated approach for constraint solving and dynamic slicing | |
| Arya et al. | Semi-automated debugging via binary search through a process lifetime | |
| Lou et al. | A promise is not a promise—demystifying and checking silent semantic violations in large distributed systems | |
| Martens et al. | CrossCheck: A holistic approach for tolerating crash-faults and arbitrary failures |