
WO2001080008A2 - Methods and apparatus for persistent volatile computer memory and related applications thereof - Google Patents

Methods and apparatus for persistent volatile computer memory and related applications thereof


Publication number
WO2001080008A2
Authority
WO
WIPO (PCT)
Prior art keywords
persistent
computer
memory
operating system
volatile memory
Prior art date
Application number
PCT/US2001/012138
Other languages
French (fr)
Other versions
WO2001080008A3 (en)
Inventor
Thomas Olson
Original Assignee
Stratus Technologies Bermuda Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/550,108 (US6842823B1)
Priority claimed from US09/664,483 (US6802022B1)
Priority claimed from US09/790,750 (US6901481B2)
Application filed by Stratus Technologies Bermuda Ltd.
Priority to EP01925016A (EP1277115A2)
Priority to AU2001251617A (AU2001251617A1)
Publication of WO2001080008A2
Publication of WO2001080008A3


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1441 Resetting or repowering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1415 Saving, restoring, recovering or retrying at system level
    • G06F11/1435 Saving, restoring, recovering or retrying at system level using file system or storage system metadata
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1471 Saving, restoring, recovering or retrying involving logging of persistent data for recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/468 Specific access rights for resources, e.g. using capability register

Definitions

  • the present invention relates to preserving the contents of computer memory.
  • the present invention relates to methods and apparatus for preserving the contents of a volatile memory during a system failure and applications thereof.
  • Prior art computer systems typically include a volatile memory for the storage and manipulation of information by an operating system and various software applications, and a non-volatile memory for mass storage of data and computer programs.
  • When software applications behave in unexpected ways, they can cause the operating system to fail in catastrophic ways, referred to colloquially as a "system crash."
  • During a system crash, there is no guarantee that the information stored in volatile memory can be salvaged.
  • In addition, the computer system may be interrupted while writing information to non-volatile mass storage, damaging or corrupting the contents of the non-volatile memory.
  • the user remedies the system crash by resetting the system.
  • the operating system typically loses the ability to reference the information contained in the volatile memory or actually initializes the volatile memory, changing or destroying its contents.
  • a reboot operation typically destroys the information the computer would need to verify or repair the contents of the non-volatile memory.
  • a second set of prior art solutions to this problem has focused on hardware modifications to preserve the contents of volatile memory during a crash.
  • Some prior art systems are arranged such that every read or write request to an operating system is simultaneously routed to a non-volatile memory. Such a system guarantees a record of memory contents that can be reconstructed during a boot cycle, but suffers from slowness during normal operation, because each transaction is conducted twice, and slowness during a boot cycle, because the operating system must locate the non-volatile record of transactions and reload them.
  • Other prior art systems attempt the same techniques and suffer from the same problems, but reduce the magnitude of the delays by greater selectivity in the transactions actually recorded, or recording transactions in a way that is more amenable to reconstruction.
  • Non-volatile memories such as electrically erasable programmable read-only memories (EEPROMs), Flash ROM, or battery-backed random-access memory.
  • Flash ROM often requires a charge pump to achieve the higher voltages needed to write to the memory, and suffers a shorter life than normal volatile RAM because of this process.
  • Battery-backed RAMs rely on batteries that are subject to catastrophic failure or charge depletion.
  • RAID array where RAID is an acronym for "redundant array of inexpensive disks”
  • the computer processes each write transaction to non-volatile storage in parallel, writing it to each device in the array. If the computer fails, then the non-volatile storage device with the most accurate set of contents available can be used as a master, copying all its contents to the other devices in the array (RAID level 1).
  • Another solution (RAID level 5) stores only one copy of the transaction information across multiple mass storage devices, but also stores parity information concerning the transaction data.
  • a computer whose information is stored in a volatile memory resistant to loss or corruption resulting from system or application crashes would avoid the problems associated with the loss and recreation of data.
  • a computer that used this persistent volatile memory to store write transactions directed to non-volatile storage would similarly avoid wholesale duplication.
  • the elimination of time-consuming data reconstruction would help make possible a fault-tolerant computer that offered continuous availability.
  • the present invention provides those benefits.
  • the present invention relates to methods and apparatus for providing a persistent volatile computer memory.
  • One object of the invention is to provide at least one region of volatile memory whose contents are resistant to loss or corruption from system or application crashes and the ensuing reboot cycle.
  • Another object of the invention is to store the contents of write transactions to non-volatile storage in a region of volatile memory whose contents are resistant to loss or corruption from system or application crashes and the ensuing reboot cycle, where they are used to repair and complete the contents of the non-volatile storage devices.
  • Still another object of the invention is to ensure the availability of transaction information following the failure of a computer system without having to perform the same transaction.
  • one feature of the invention is the partitioning of computer memory into two types of regions: a non-persistent region directly accessible to the operating system and a persistent region that is not directly accessible to the operating system.
  • Another feature of the invention is an intermediary program, such as a device driver, that serves as an intermediary between the operating system and the persistent memory region, processing requests from the operating system directed to the persistent region of memory.
  • the contents of the persistent memory region are resistant to initialization or modification during a boot cycle.
  • the intermediary program processes write requests atomically, preventing computer applications from subsequently loading the results of incomplete or partial transactions from the persistent memory region.
  • one feature of the invention is a modified BIOS, wherein the modification effectively partitions the memory into two types of regions: a non-persistent region directly accessible to the operating system and a persistent region that is not directly accessible to the operating system.
  • Another feature of the invention is an intermediary program, such as a device driver, that processes read requests and write requests from the operating system to this persistent memory region.
  • Yet another feature of the invention is that the contents of the persistent memory region are resistant to initialization or modification during a boot cycle.
  • one feature of the invention is the partitioning of computer memory into two types of regions: a non-persistent region directly accessible to the operating system and a persistent region that is not directly accessible to the operating system.
  • Another feature of the invention is a program that maps the persistent region onto physical memory locations, enabling the operating system to indirectly access the persistent region.
  • one feature of the invention is the presence of non-volatile storage and persistent volatile memory, where the persistent volatile memory is used to store write transactions posted to non-volatile storage.
  • Another feature of the invention is an intermediary program, such as a device driver, that sits between the operating system and non-volatile storage: it processes write requests from the operating system directed to non-volatile storage, stores their contents in persistent volatile memory, and then completes the write to non-volatile storage.
  • the contents of the persistent memory region are resistant to initialization or modification during a boot cycle.
  • the intermediary program processes write requests atomically, preventing computer applications from subsequently loading the results of incomplete or partial transactions from the persistent memory region.
  • one feature of the invention is a computer program that receives write transactions directed to non-volatile storage by the operating system, stores the contents of the write transaction in persistent volatile memory, and then completes the write to non-volatile storage.
  • Another feature of the invention is the marking of transactions in persistent volatile memory as "complete" or "in progress" for use during the reboot and recovery process.
  • the invention is a method providing improved recovery from system failures.
  • the method receives a write transaction from the operating system, stores the contents of the write transaction in persistent volatile memory, and then stores the contents of the write transaction in non-volatile storage.
  • Another feature of the invention is the marking of transactions in persistent volatile memory as "complete" or "in progress" for use during the reboot and recovery process.
  • Yet another feature of the invention is the selection of those write transactions in persistent volatile memory marked "in progress", copying the contents of the uncompleted write transactions from the persistent volatile memory to the non-volatile storage, and then marking the uncompleted write transactions as completed after the successful completion of the copy to non-volatile storage.
  • the invention features a method for storing transactional information in a computer.
  • the method comprises the steps of: (a) receiving transactional information; (b) storing the particular transactional information in a persistent volatile memory on the computer; and (c) retrieving the transactional information after a computer failure by accessing the transactional information stored in the persistent volatile memory on the computer.
  • the method may also comprise flushing the persistent volatile memory to a persistent mass storage device.
  • the flushing occurs when the transactional information stored in the persistent volatile memory exceeds some predetermined threshold.
  • the flushing occurs when a predefined amount of time has elapsed since the storage of the transactional information in the persistent volatile memory.
  • the flushing occurs when a program, such as the operating system of the computer, is not busy. Alternatively, the flushing occurs when a file is closed or when the computer is shut down.
  • the invention features a method for providing persistent mass storage of transactional information.
  • the method comprises the steps of: (a) receiving transactional information; (b) determining whether the transactional information meets predetermined criteria; and (c) storing the transactional information that meets the predetermined criteria in a persistent cache.
  • the transactional information comprises unbuffered writes to disk, which are writes requested by an application and in which notification to the application of the completion of the write is necessary.
  • an unbuffered write can include copying a file from one directory to another directory, backing up and/or updating a file, and initializing a file.
  • the invention features a persistent volatile memory and an intermediary program.
  • the intermediary program receives transactional information and stores the transactional information in the persistent volatile memory.
  • the contents of the persistent volatile memory remain unaltered through a system failure.
  • the intermediary program is a filter driver module that identifies particular transactional information to store in the persistent volatile memory.
  • the invention includes a flushing thread to flush the contents of the persistent volatile memory to a persistent non-volatile memory.
  • FIG. 1 is a block diagram of an embodiment of a computer known to the prior art.
  • FIG. 2 is a block diagram of an embodiment of a computer constructed in accordance with the present invention.
  • FIG. 3 is a functional block diagram of the embodiment of the present invention depicted in FIG. 2.
  • FIG. 4 is a flowchart depicting the state of a look-aside buffer while processing a write request atomically in accord with the present invention.
  • FIG. 5 is a flowchart depicting the recovery of completed transaction information during a boot cycle in accord with the present invention.
  • FIG. 6 is a flowchart depicting the operation of an embodiment of the present invention.
  • FIG. 7 is an embodiment of an interface presented to the user to solicit information for the configuration of the present invention.
  • FIG. 8 is a flowchart describing the operation of an embodiment of the invention directed to the contents of non-volatile storage.
  • FIG. 9 is a flowchart depicting an embodiment of the operation of the computer of FIG. 1 to store unbuffered writes in persistent memory.
  • FIG. 10 is a flowchart depicting an embodiment of the steps of flushing persistent memory as shown in FIG. 9.
  • a computer 10 known to the prior art typically includes a microprocessor 12, a memory 14 for storing programs and/or data, an input/output (I/O) controller 16, and a system bus 18 allowing communication among these components.
  • the memory 14 in such a computer typically includes random-access memory (RAM) 20, read-only memory (ROM) 22, and non-volatile random-access memory (NVRAM) 24.
  • the RAM 20 typically contains an operating system 26 and one or more device drivers 28 that permit access to various peripherals by the operating system 26.
  • the ROM 22 typically contains a basic input-output system (BIOS) 30 that handles the boot process of the computer 10.
  • One or more input devices such as an alphanumeric keyboard or a mouse, and one or more output devices, such as a display or a printer, are also typically included in the computer 10.
  • the computer 10 will also include a network connection.
  • the computer 10 typically has a mass storage device 32 such as a magnetic disk or magneto-optical drive.
  • some computers 10 have redundant arrays of inexpensive disks (RAID arrays) used as failure-tolerant mass storage 32.
  • applicant's invention provides a persistent volatile memory in a computer while avoiding the failings of the prior art. This is achieved by partitioning the volatile computer memory into two regions: a non-persistent memory region that is directly accessible to the operating system and typically is initialized or modified during a boot cycle, and a persistent memory region whose contents are not initialized or modified during a boot cycle.
  • the operating system can indirectly access this persistent memory region through an intermediary program such as a device driver.
  • the intermediary program invokes operating-system level functionality to enable the operating system to access this persistent memory region only after the boot cycle is completed. This invention is particularly useful in a system-critical fault-tolerant computer that offers continuous availability.
  • One embodiment of the present invention is shown in FIGS. 2 & 3.
  • the system includes a computer 10' with RAM 20 partitioned into two different regions.
  • the first memory region 40 is directly accessible to the operating system and is typically initialized or modified during a boot cycle.
  • a device driver 28' handles read requests 50 and write requests 52 from the operating system 26 directed to the second memory region 42.
  • a modified BIOS 30' prevents the operating system 26 from directly accessing the contents of the second memory region 42.
  • the second memory region 42 is not directly accessible to the operating system 26 and therefore is not modified or initialized during a boot cycle.
  • configuration information 44 regarding the location and size of the second memory region 42 is stored in an entry in NVRAM 24.
  • a computer 10 typically invokes a BIOS 30 that typically provides low-level access to peripherals; identifies RAM 20 available to the processor 12; initializes this RAM 20, typically destroying its contents; and then installs the operating system 26 into RAM 20, giving the operating system access to the entire RAM 20 to move information into and out of memory as necessary. If the computer 10 is started after having been powered down, all of its memory will have been initialized.
  • a computer 10' constructed in accordance with the present invention invokes a modified BIOS 30'.
  • the modified BIOS 30' retrieves configuration information 44 from NVRAM 24.
  • This configuration information 44 includes the start address and the size of persistent memory. If both these values are zero or non-existent, then the modified BIOS 30' knows that the invention is either not installed or disabled. If the size is non-zero, but the start address is zero, then the start address is recalculated by subtracting the size of the persistent memory from the total memory size and storing this result as the new start address.
  • the modified BIOS 30' then divides the RAM 20 into two memory regions: a first non-persistent memory region 40 ranging from address 0 up to, but not including, the persistent memory start address, and a second persistent memory region 42 consisting of all memory from the persistent memory start address up to, but not including, the sum of the persistent memory start address and the persistent memory size.
  • the system then initializes the first memory region 40.
  • the modified BIOS 30' still provides low-level access to peripherals, but installs the operating system 26 into the first memory region 40 of RAM 20, and prevents the operating system 26 from directly accessing the second memory region 42 during the boot cycle and normal computer operation.
  • the operating system 26 is, in effect, unaware of the second memory region 42.
  • the operating system 26 typically initializes or installs its own programs into the first memory region 40, often modifying the contents of the first memory region 40, but does not modify the contents of the second memory region 42 of which it is unaware. This renders the contents of the second memory region 42 persistent through a boot cycle.
  • the operating system 26 will load device drivers 28 to permit access to various peripheral devices.
  • the operating system 26 loads a device driver 28' that is aware of the second memory region 42 and is able to access its contents.
  • the device driver 28' is aware of the second memory region 42 because it is also aware of and accesses the configuration information 44 stored in NVRAM 24.
  • the device driver 28' serves as an intermediary between the operating system 26 and the second memory region 42.
  • the device driver 28' takes a read request 50 from the operating system 26 and returns information from the appropriate location in the second memory region 42.
  • the device driver 28' takes a write request 52 from the operating system 26 and stores information at the appropriate location in the second memory region 42.
  • an intermediary program installs and configures the invention, and then invokes operating-system level functionality to enable the operating system to access this second set of memory regions only after the boot cycle is completed.
  • the operating system 26 is the Windows 2000 operating system.
  • the second memory region 42 accessible through the device driver 28' appears to the operating system 26 as a RAM disk, though the contents of a normal RAM disk do not survive a boot cycle, in contrast to the present invention.
  • a Windows 2000 read request 50 or write request 52 includes an offset value (in bytes) and a length value (in bytes).
  • the device driver 28' computes the appropriate location in the second memory region 42 by adding the offset value in the request to the start address of the persistent memory region.
  • the second memory region 42 includes 1 MB of configuration information, so the appropriate location is actually the sum of the offset value, the start address of the persistent memory, and 1 MB.
  • For a read request 50, the device driver 28' copies a number of bytes equal in size to the length value from the computed location in the second memory region 42 to the user's buffer.
  • For a write request 52, the device driver 28' copies a number of bytes equal in size to the length value passed by the operating system 26 from the user's buffer to the computed location in the second memory region 42.
  • This interaction permits the operating system 26 to indirectly access the second memory region 42 without threatening the integrity of the contents of the second memory region 42 during a boot cycle.
  • Where Windows 2000 is the operating system 26, the device driver 28' invokes the functionality of the operating system 26 to map the computed location onto the virtual address space of the operating system 26 for the copy operation.
  • Other operating system 26 functionality completes the copy operation and unmaps the computed location from the virtual address space of the operating system 26.
  • a look-aside buffer in the second persistent memory region 42 and uses it for the atomic update and storage of transaction information; only when the write request 52 has been buffered and completed is it transferred out of the look-aside buffer.
  • a look-aside buffer includes a set of bits that describe its state.
  • FIG. 4 shows how the state of the look-aside buffer changes to reflect various stages in the processing of a write request 52.
  • the buffer state is 0 (Step 10).
  • the driver 28' stores the computed location and the length (in bytes) of the request in the look-aside buffer and the state of the buffer becomes 1 (Step 12).
  • the actual contents of the write request are copied into the buffer (Step 14).
  • the buffer state becomes 2 (Step 16). If the copy fails because of, for example, a system crash, the buffer state remains at 1. Once the buffer state is set to 2, the contents of the write request are copied out of the look-aside buffer to their computed location in the second persistent memory region 42 (Step 18).
  • The effect of the value of the state of the look-aside buffer on the subsequent boot process is depicted in FIG. 5.
  • the device driver 28' locates all the look-aside buffers in the second persistent memory region 42. If there are no more look-aside buffers to check (Step 22), the system boot process continues (Step 24). If there are more look-aside buffers (Step 22), the device driver proceeds to examine the state of each look-aside buffer, one at a time, in the second persistent memory region 42 (Step 26).
  • If the state of the buffer presently under examination is 0, the device driver 28' knows that there is no information stored in the look-aside buffer and the device driver 28' checks the next look-aside buffer (Step 22). If the buffer state is 1, the device driver 28' knows that the information in the look-aside buffer is the result of an incomplete transaction and should not be moved into the second persistent memory region 42 for recovery by a computer application. The device driver 28' sets the state of this buffer to 0 (Step 30) and checks the next look-aside buffer (Step 22). If the buffer under examination is in state 2, then the device driver 28' knows that the contents of the look-aside buffer are the result of a completed transaction that did not get copied into the second persistent memory region 42.
  • the device driver 28' copies the contents of the look-aside buffer to the computed location in the second persistent memory region 42 (Step 28).
  • the buffer state is set to 0 (Step 30) and the device driver 28' checks the next look-aside buffer (Step 22).
  • the device driver 28' will have checked the state of all the look-aside buffers, and the system boot will continue (Step 24).
  • the programs are a modified BIOS and a device driver.
  • the programs divide the memory into two portions, making a first region that is directly accessible to the operating system (Step 42) and a second region that is not directly accessible to the operating system (Step 44).
  • a device driver 28' or similar intermediary program provides indirect access to the second memory region 42 to the operating system. In step 46, the device driver 28' waits for a read request 50 or a write request 52 from the operating system 26.
  • the device driver 28' decides (Step 48) whether a read request has been received, and if one has, then the intermediary program reads (Step 50) from the appropriate location in the second memory region 42 and returns the result to the operating system 26. Similarly, if the device driver 28' decides (Step 52) that a write request 52 has been received, then the device driver stores (Step 54) information at the appropriate location in the second memory region 42. If neither type of request has been received, then the device driver returns to Step 46 and continues to wait for requests. Typically, read and write requests from the operating system to the first memory region operate as they would have before the installation of the present invention.
  • the computer programs comprising the present invention are on a magnetic disk drive and selecting an icon invokes the installation of the programs, initializing the present invention.
  • the programs implement the invention on the computer, and the programs either contain or access the data needed to implement all of the functionality of the invention on the computer.
  • this prompt takes the form of a dialog box, though other forms of prompting are feasible, typically including asking the user a series of questions or requiring the user to manually edit a configuration file.
  • the dialog box typically permits the user to choose the size of the persistent memory and choose a designation through which the operating system accesses it.
  • a dialog box includes a slider element 60, a pop-up menu element 62, and a text-entry element 64.
  • the slider element 60 permits the user to select the size of the persistent memory. For example, the slider element 60 permits the user to select a persistent memory from 10 to 100 MB in size in 10 MB increments. Of course, particular numerical values may vary between implementations.
  • the pop-up menu element 62 permits the user to assign a drive letter to the device driver 28' to permit the operating system to address it.
  • the text-entry element 64 permits the user to designate a file name for the backing store file.
  • a pop-up menu element 66 permits the user to designate the transaction size for the atomic update feature described above.
  • the user selects the OK button 68 which initiates the embodiment of the present invention in the computer.
  • the invention is installed and typically becomes effective after the next boot cycle.
  • the present invention is implemented in RAM, but it may also be implemented in a storage-based virtual memory system or some other memory apparatus.
  • redundant mass storage elements may be provided as RAID level 1 disk arrays, RAID level 5 disk arrays, or other mass storage elements configured to provide RAID level 1 or RAID level 5 type functionality.
  • Some current methods require copying, after a system failure, of one disk in a RAID level 1 disk array in its entirety to the other disks in the array. Other methods use parity information to reconstruct the data (RAID level 5). Resynchronization of the RAID device after system failure monopolizes system resources as the resynchronization occurs.
  • the persistent volatile memory of the present invention enables the creation of a list of write transactions resistant to system failures. After a failure, recovery need only involve the recreation of data using the transactions in the list stored in persistent memory.
  • In Step 100, the computer follows the process described above to create two regions of volatile memory: a non-persistent memory region that is directly accessible to the operating system and typically is initialized or modified during a boot cycle, and a persistent memory whose contents are not initialized or modified during a boot cycle.
  • a disk write list is created in persistent memory (Step 102).
  • the amount of persistent memory allocated for the disk write list can vary with system configuration, but typically is at least one page (4 KB) in size.
  • the disk write list is used to store a list of pending write transactions posted to mass storage.
  • During normal operation (Step 104), the system typically runs an operating system and various programs. Some of these programs are application programs, which provide the operator of the system with various functionalities utilizing the system, while other programs provide the system itself with additional functionality. Referring to FIG. 1, an example of the latter is a device driver 28, which permits the operating system 26 to access various peripherals.
  • the present invention uses a specialized device driver 28 that permits the operating system 26 to access mass storage 32, which is, in some embodiments, a RAID array.
  • the CPU will typically issue a write request to mass storage via the specialized device driver.
  • the device driver records this write transaction in the disk write list in persistent memory and marks the transaction "in progress" (Step 106).
  • the entry in the disk write list includes the start address for the write, the length of the data to be written, the mass storage volume identifier for the write being performed, and the status of the transaction as "in progress." Then the device driver attempts to complete the posting of the contents of the write transaction to a start address in mass storage.
  • Under normal circumstances there is no system failure during the transaction (Step 108), the write is successfully completed, and the device driver deletes the now-completed entry in the disk write list (Step 110). In other embodiments, upon successful completion of a mass storage write the entry in the write list is marked "complete." Normal system operation continues (Step 104).
  • Step 108 the system will need to be rebooted (Step 112).
  • the operating system through the device driver checks for entries in the disk write list in persistent memory (Step 114). If the disk write list has no entries in it, the system proceeds to normal system operation (Step 104).
  • the operating system processes each entry in the disk write list. For each entry, the operating system assumes that the system failure (Step 108) occurred during the write transaction, and proceeds to complete the posting to the storage volume identified in the entry.
  • the storage volume is a RAID level 1 device
  • the system reads data from the start address equal in length to the length in the disk write list from the primary disk and writes it to the secondary disks in the array.
  • the storage volume is a RAID level 5 device
  • the system reads the data from all of the data volumes at the specified location in the specified length and reconstructs the corresponding parity data.
  • the entry in the disk write list corresponding to the completed transaction is deleted (Step 116).
  • the process continues for each entry in the disk write list (Step 114).
  • normal system operation resumes (Step 104).
  • the present invention may also be employed to improve the performance of unbuffered writes (i.e., writes requested by a software application in which notification to the application of the completion of the write is necessary) to the persistent mass storage 32.
  • unbuffered writes include, without limitation, copying a file from one directory to another directory, backing up and/or updating a file, initializing a file (e.g., writing zeros to the file), and the like.
  • the application can be a database management system (DBMS) that verifies all updates to files before that file can be made available again to the application 18. It should be noted that a request to access the file for a read or write transaction can be from a different query from the same application or can be from a different application altogether.
  • an application program executing on a computer typically writes transactional information to disk and the operating system of the computer transmits a confirmation message to the application when the write completes. If the application program does not receive a confirmation message after a predetermined time, the application often performs the previous write to disk again and subsequently waits for another confirmation message. This process is generally wasteful and decreases the performance of the application program.
  • the application generates a write to the persistent mass storage 32 (e.g., disk).
  • the operating system 26 instead uses the device driver 28 to write the information to the persistent volatile memory 42.
  • the operating system 26 transmits the confirmation to the application.
  • the application receives the confirmation soon after the generation of the write, as a write to memory (e.g., persistent volatile memory 42) is faster than a write to the persistent mass storage 32. Therefore, in one embodiment the application receives the confirmation from the operating system 26 before the transactional information is stored in the persistent mass storage 32.
  • the filter driver 28 may be a passive filter or an active filter.
  • a passive filter is a filter that monitors the information that the filter driver 28 stores in the persistent volatile memory 42.
  • the computer 10 configures the passive filter driver module 28 to monitor all unbuffered writes requested by a particular application, such as a DBMS. This may be used to help determine performance decreases associated with multiple unbuffered writes by a particular application.
  • the filter driver 28 receives an instruction to store in the persistent volatile memory 42 and performs some modification on the instruction before storing the instruction.
  • the active filter driver 28 may receive a certain type of transactional information, such as an unbuffered write to initialize a file by writing zeros to the file.
  • the active filter driver 28 may alter the unbuffered write to write ones to the file if the operating system 26 determines the writing of ones to be necessary for initialization.
  • the filter driver 28 is created by a file systems filter driver kit (FDDK), developed by Open Systems Resources, Incorporated of Amherst, New Hampshire.
  • a logical flow chart depicts the operation of the computer 10 on unbuffered writes.
  • the application generates (step 205) an unbuffered write and the filter driver 28 detects (step 210) the unbuffered write.
  • the filter driver 28 updates (step 212) the table described above with information associated with the unbuffered write. For example, the filter driver 28 updates the status field to a reserved state to denote that the detected unbuffered write is about to be copied to the persistent volatile memory 42.
  • the filter driver 28 stores (step 215) the transactional information in the persistent volatile memory 42.
  • all writes unbuffered writes and buffered writes
  • the filter driver 28 stores the transactional information in volatile memory 40, makes a copy of the transactional information stored in the volatile memory 40, and then transfers the copy into the persistent volatile memory 42.
  • the operating system 26 additionally starts a timer to enable future recordation of the time elapsed from the transferring of the transactional information to the persistent volatile memory 42.
  • the operating system 26 stores the time read from a predetermined register located in the computer 10.
  • the filter driver 28 then updates (step 220) the table to denote that the transfer of the transactional information to the persistent volatile memory 42 is complete.
  • the filter driver 28 updates the status field associated with the particular transactional information to an in-use state.
  • the filter driver 28 can determine that the transmittal of particular session information did not complete prior to the computer failure (i.e., the status field associated with the transactional information will not be set to an in-use state). If the filter driver 28 determines that the transactional information was not transmitted to the persistent volatile memory 42, then the filter driver 28 repeats step 215 to store the transactional information in the persistent volatile memory.
  • the operating system 26 then notifies (step 223) the application that the transactional information has been stored in the persistent volatile memory 42. In one embodiment, the notification occurs as a confirmation message to the application.
  • the operating system 26 determines (step 225) whether the operating system 26 should flush, or transfer, the persistent volatile memory 42 to the persistent mass storage 32.
  • the filter driver 28 includes a thread responsible for flushing the persistent volatile memory 42 to the persistent mass storage 32. As described further below, the thread can flush the persistent volatile memory 42 when a particular event occurs, such as when the operating system 26 transmits a message to the filter driver 28 instructing the filter driver 28 to flush the persistent volatile memory 42. In another embodiment, the thread may poll the persistent volatile memory 42 to determine whether the data stored in the persistent volatile memory should be flushed.
  • FIG. 10 illustrates an embodiment of the steps performed by the operating system
  • the operating system 26 determines (step 230) whether the data stored in the persistent volatile memory 42 exceeds or is about to exceed some predetermined threshold (e.g., the allotted size of the persistent volatile memory 42). If so, the operating system 26 flushes (step 233) the persistent volatile memory 42 to the persistent mass storage 32. If not, the operating system 26 does not flush (step 234) the persistent volatile memory 42.
  • the operating system 26 can also flush (step 233) the persistent volatile memory 42 if the operating system 26 determines (step 235) that a predefined amount of time since the transferring of the transactional information has elapsed.
  • the operating system 26 uses the timer described above to make this determination. In another embodiment, the operating system 26 records the current time stored in the register described above. Using this recorded time and the previously recorded time, the operating system 26 determines the amount of time that has elapsed since the transferring of the transactional information to the persistent volatile memory 42. If the elapsed time is greater than a predefined amount of time, the operating system 26 flushes (step 233) the persistent volatile memory 42.
  • the operating system 26 determines (step 240) to flush the persistent volatile memory 42 when the operating system 26 is not servicing the application (i.e., the operating system is not busy). For example, the operating system 26 flushes the persistent volatile memory 42 when the application is idle.
  • the operating system 26 determines (step 245) whether the application requests to write the transactional information to a file that has previously been closed. In one embodiment, the application transmits a message to the operating system 26 when the application closes. If so, the operating system 26 flushes (step 233) the persistent volatile memory 42. The operating system 26 additionally flushes (step 233) the persistent volatile memory 42 when the computer 10 is in the process of being (step 250) shut down.
  • the filter driver 28 additionally updates (step 255) the table to denote that the transactional information has been stored in the persistent mass storage 32.
  • By storing the transactional information in a persistent volatile memory 42, the information is accessible to the computer 10 at any instant in time. Therefore, a retrieval of such information does not significantly hamper the performance of the computer 10. Additionally, the persistent volatile memory 42 (and the persistent mass storage 32) enable the session information to be accessible to the computer after a computer failure, which would ordinarily erase the information from a volatile memory.
  • the transaction is a catalog merchandise order phoned in by a customer and entered into the computer 10 by a customer representative.
  • the customer enters the order into an application associated with the catalog and the application generates (step 205) the writes to a database file that should occur for the order.
  • the order transaction involves checking an inventory database file, confirming that the item is available, placing the order and confirming that the order has been placed. Considering these steps as a single transaction, then all of the steps are to be completed before the transaction is successful and the inventory database file is actually changed to reflect the new order.
  • the application associated with the catalog checks the inventory database file and confirms that the item is available. If the item is available, the application places the order. In some prior art computer systems, the application stores the transactional information in RAM 20 so that the application (e.g., the DBMS) can commit the information to persistent mass storage 32 at a later time. If the computer 10 fails (e.g., crashes) at this point in time, the order would have been placed, but the information previously stored in RAM 20 is erased. Consequently, the transactional information has not been reflected in the inventory database file and, therefore, to retrieve this information to update the inventory database file, the application frequently has to repeat the transaction.
  • [0072] Unlike the above scenario, the invention uses the filter driver 28 to determine (step 210) that the transaction involves an unbuffered write to the inventory database file.
  • the filter driver 28 then stores (step 215) the transactional information to the persistent volatile memory 42 so that the transactional information can survive a crash of the computer 10. Thus, if the failure of the computer 10 occurs, the transactional information has already been stored in the persistent volatile memory 42.
  • the operating system 26 notifies (step 223) the application of the completion of the write and consequently determines (step 225) whether to flush the persistent volatile memory 42, as described above.
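
The unbuffered-write path described in steps 205 through 255 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `FilterDriverSketch`, the dictionaries standing in for persistent volatile memory 42 and persistent mass storage 32, the byte-count flush threshold, and the "flushed" status value are all hypothetical (the text names only the "reserved" and "in-use" states).

```python
from typing import Callable, Dict

# "reserved" and "in-use" come from the text; "flushed" is an assumed name.
RESERVED, IN_USE, FLUSHED = "reserved", "in-use", "flushed"


class FilterDriverSketch:
    """Toy model of the filter driver's handling of unbuffered writes."""

    def __init__(self, flush_threshold: int) -> None:
        self.flush_threshold = flush_threshold       # bytes; hypothetical parameter
        self.persistent_memory: Dict[int, bytes] = {}  # stands in for memory 42
        self.mass_storage: Dict[int, bytes] = {}       # stands in for storage 32
        self.status_table: Dict[int, str] = {}
        self._next_id = 0

    def unbuffered_write(self, data: bytes, confirm: Callable[[int], None]) -> int:
        txn = self._next_id
        self._next_id += 1
        self.status_table[txn] = RESERVED            # step 212: about to copy
        self.persistent_memory[txn] = data           # step 215: store in memory 42
        self.status_table[txn] = IN_USE              # step 220: transfer complete
        confirm(txn)                                 # step 223: notify application
        if self._buffered_bytes() >= self.flush_threshold:  # steps 225/230
            self.flush()                             # step 233
        return txn

    def _buffered_bytes(self) -> int:
        return sum(len(d) for d in self.persistent_memory.values())

    def flush(self) -> None:
        """Move every buffered transaction to mass storage and update the table."""
        for txn, data in list(self.persistent_memory.items()):
            self.mass_storage[txn] = data
            del self.persistent_memory[txn]
            self.status_table[txn] = FLUSHED         # step 255
```

In this sketch the confirmation callback runs before anything reaches mass storage, which mirrors the point that the application is notified as soon as the data is held in persistent volatile memory.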

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)
  • Storage Device Security (AREA)

Abstract

Methods and apparatus for persistent volatile computer memory and applications thereof. In one embodiment, the memory of a computer is partitioned into two regions, one directly accessible to the operating system and one accessible to the operating system only through an intermediary program such as a device driver. In another embodiment, the partitioning of computer memory is achieved through modifications to the computer's BIOS, preventing the operating system from directly addressing a region of volatile computer memory and protecting the contents of the region from modification during a boot cycle. Applications include the preservation of contents of write transactions and the facilitation of disk mirroring.

Description

METHODS AND APPARATUS FOR PERSISTENT VOLATILE COMPUTER MEMORY AND RELATED APPLICATIONS THEREOF
FIELD OF THE INVENTION
[0001] The present invention relates to preserving the contents of computer memory. In particular, the present invention relates to methods and apparatus for preserving the contents of a volatile memory during a system failure and applications thereof.
BACKGROUND OF THE INVENTION
[0002] Prior art computer systems typically include a volatile memory for the storage and manipulation of information by an operating system and various software applications, and a non-volatile memory for mass storage of data and computer programs. When software applications behave in unexpected ways, they can cause the operating system to fail in catastrophic ways, referred to colloquially as a "system crash." When a system crashes, there is no guarantee that the information stored in volatile memory can be salvaged. Moreover, there is a significant chance that the computer system will be interrupted while writing information to non-volatile mass storage, damaging or corrupting the contents of the non-volatile memory.
[0003] Typically, the user remedies the system crash by resetting the system. In the resulting boot cycle the operating system typically loses the ability to reference the information contained in the volatile memory or actually initializes the volatile memory, changing or destroying its contents. Similarly, a reboot operation typically destroys the information the computer would need to verify or repair the contents of the non-volatile memory.
[0004] Prior art solutions addressing the loss of the contents of volatile memory have taken various approaches. One approach requires a user to manually direct applications to save the contents of volatile memory to a non-volatile memory when significant amounts of information have been processed in volatile memory. An incremental improvement over this approach takes the form of modifications to the software applications themselves, whereupon they save the contents of volatile memory to a non-volatile memory when certain criteria are met. For example, the word processing program Microsoft Word™ from Microsoft Corporation, Redmond, WA has an option that automatically saves the contents of documents upon the elapse of a time period selected by the user.
[0005] These prior art systems have several failings. First, a failure in the operating system may prevent the functioning of any application-level safeguards. Second, safeguards that rely on regular human intervention are subject to human failings, such as when humans forget to invoke them. Third, safeguards that attempt to substitute application-administered criteria for human judgment and invocation fail in that they cannot guarantee that critical information would be saved when a human user would have chosen to save it.
[0006] A second set of prior art solutions to this problem has focused on hardware modifications to preserve the contents of volatile memory during a crash. Some prior art systems are arranged such that every read or write request to an operating system is simultaneously routed to a non-volatile memory. Such a system guarantees a record of memory contents that can be reconstructed during a boot cycle, but suffers from slowness during normal operation, because each transaction is conducted twice, and slowness during a boot cycle, because the operating system must locate the non-volatile record of transactions and reload them. Other prior art systems attempt the same techniques and suffer from the same problems, but reduce the magnitude of the delays by greater selectivity in the transactions actually recorded, or recording transactions in a way that is more amenable to reconstruction. Other prior art systems relying on hardware modification use non-volatile memories, such as electrically erasable programmable read-only memories (EEPROMs), Flash ROM, or battery-backed random-access memory. These systems have several drawbacks, including higher prices than normal volatile memories and the requirement of additional hardware. For example, Flash ROM often requires a charge pump to achieve the higher voltages needed to write to the memory, and suffers a shorter life than normal volatile RAM because of this process. Battery-backed RAMs rely on batteries that are subject to catastrophic failure or charge depletion.
[0007] Prior art solutions addressing the integrity of the contents of non-volatile memory have taken several forms. One solution involves equipping the computer with an array of inexpensive mass storage devices (a RAID array, where RAID is an acronym for "redundant array of inexpensive disks"). The computer processes each write transaction to non-volatile storage in parallel, writing it to each device in the array. If the computer fails, then the non-volatile storage device with the most accurate set of contents available can be used as a master, copying all its contents to the other devices in the array (RAID level 1). Another solution only stores one copy of the transaction information across multiple mass storage devices, but also stores parity information concerning the transaction data (RAID level 5).
[0008] A computer whose information is stored in a volatile memory resistant to loss or corruption resulting from system or application crashes would avoid the problems associated with the loss and recreation of data. A computer that used this persistent volatile memory to store write transactions directed to non-volatile storage would similarly avoid wholesale duplication. The elimination of time-consuming data reconstruction would help make possible a fault-tolerant computer that offered continuous availability. The present invention provides those benefits.
SUMMARY OF THE INVENTION
[0009] The present invention relates to methods and apparatus for providing a persistent volatile computer memory. One object of the invention is to provide at least one region of volatile memory whose contents are resistant to loss or corruption from system or application crashes and the ensuing reboot cycle. Another object of the invention is to store the contents of write transactions to non-volatile storage in a region of volatile memory whose contents are resistant to loss or corruption from system or application crashes and the ensuing reboot cycle, where they are used to repair and complete the contents of the non-volatile storage devices. Still another object of the invention is to ensure the availability of transaction information following the failure of a computer system without having to perform the same transaction.
[0010] In one aspect, one feature of the invention is the partitioning of computer memory into two types of regions: a non-persistent region directly accessible to the operating system and a persistent region that is not directly accessible to the operating system. Another feature of the invention is an intermediary program, such as a device driver, that serves as an intermediary between the operating system and the persistent memory region, processing requests from the operating system directed to the persistent region of memory. Yet another feature of the invention is that the contents of the persistent memory region are resistant to initialization or modification during a boot cycle. Another feature of the invention is that the intermediary program processes write requests atomically, preventing the results of incomplete or partial transactions from subsequent loading from the persistent memory region by computer applications.
[0011] In another aspect, one feature of the invention is a modified BIOS, wherein the modification effectively partitions the memory into two types of regions: a non-persistent region directly accessible to the operating system and a persistent region that is not directly accessible to the operating system. Another feature of the invention is an intermediary program, such as a device driver, that processes read requests and write requests from the operating system to this persistent memory region. Yet another feature of the invention is that the contents of the persistent memory region are resistant to initialization or modification during a boot cycle.
[0012] In yet another aspect, one feature of the invention is the partitioning of computer memory into two types of regions: a non-persistent region directly accessible to the operating system and a persistent region that is not directly accessible to the operating system. Another feature of the invention is a program that maps the persistent region onto physical memory locations, enabling the operating system to indirectly access the persistent region.
[0013] In another aspect, one feature of the invention is the presence of non-volatile storage and persistent volatile memory, where the persistent volatile memory is used to store write transactions posted to non-volatile storage. Another feature of the invention is an intermediary program, such as a device driver, that serves as an intermediary between the operating system and non-volatile storage that processes write requests from the operating system directed to non-volatile storage, stores their contents in persistent volatile memory, and then completes the write to non-volatile storage. Yet another feature of the invention is that the contents of the persistent memory region are resistant to initialization or modification during a boot cycle. Another feature of the invention is that the intermediary program processes write requests atomically, preventing the results of incomplete or partial transactions from subsequent loading from the persistent memory region by computer applications.
[0014] In still another aspect, one feature of the invention is a computer program that receives write transactions directed to non-volatile storage by the operating system, stores the contents of the write transaction in persistent volatile memory, and then completes the write to non-volatile storage. Another feature of the invention is the marking of transactions in persistent volatile memory as "complete" or "in progress" for use during the reboot and recovery process.
[0015] In yet another aspect, the invention is a method providing improved recovery from system failures. One feature is that the method receives a write transaction from the operating system, stores the contents of the write transaction in persistent volatile memory, and then stores the contents of the write transaction in non-volatile storage. Another feature of the invention is the marking of transactions in persistent volatile memory as "complete" or "in progress" for use during the reboot and recovery process. Yet another feature of the invention is the selection of those write transactions in persistent volatile memory marked "in progress", copying the contents of the uncompleted write transactions from the persistent volatile memory to the non-volatile storage, and then marking the uncompleted write transactions as completed after the successful completion of the copy to non-volatile storage.
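
The recovery method of paragraph [0015] can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the claimed implementation: the names `LogEntry`, `PersistentWriteLog`, `Disk`, `logged_write`, and `recover` are hypothetical, a Python list stands in for the persistent volatile memory, a byte array stands in for non-volatile storage, and the system failure is simulated by raising an exception before the disk write.

```python
from dataclasses import dataclass, field
from typing import List

IN_PROGRESS = "in progress"
COMPLETE = "complete"


@dataclass
class LogEntry:
    start: int                  # start address of the write
    data: bytes                 # contents of the write transaction
    status: str = IN_PROGRESS


@dataclass
class PersistentWriteLog:
    """Stands in for persistent volatile memory; survives the simulated crash."""
    entries: List[LogEntry] = field(default_factory=list)


class Disk:
    """Stands in for non-volatile storage."""

    def __init__(self, size: int) -> None:
        self.blocks = bytearray(size)

    def write(self, start: int, data: bytes) -> None:
        self.blocks[start:start + len(data)] = data


def logged_write(log: PersistentWriteLog, disk: Disk, start: int, data: bytes,
                 crash_before_disk: bool = False) -> None:
    entry = LogEntry(start, data)
    log.entries.append(entry)        # 1. store contents, marked "in progress"
    if crash_before_disk:
        raise RuntimeError("simulated system failure")
    disk.write(start, data)          # 2. store contents in non-volatile storage
    entry.status = COMPLETE          # 3. mark the transaction "complete"


def recover(log: PersistentWriteLog, disk: Disk) -> int:
    """Replay every "in progress" entry after a reboot; return how many were replayed."""
    replayed = 0
    for entry in log.entries:
        if entry.status == IN_PROGRESS:
            disk.write(entry.start, entry.data)
            entry.status = COMPLETE
            replayed += 1
    return replayed
```

Because only entries still marked "in progress" are replayed, recovery touches just the interrupted writes instead of resynchronizing the whole volume.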
[0016] Additionally, the invention features a method for storing transactional information in a computer. The method comprises the steps of: (a) receiving transactional information; (b) storing the particular transactional information in a persistent volatile memory on the computer; and (c) retrieving the transactional information after a computer failure by accessing the transactional information stored in the persistent volatile memory on the computer.
[0017] In another aspect, the method may also comprise flushing the persistent volatile memory to a persistent mass storage device. In one embodiment, the flushing occurs when the transactional information stored in the persistent volatile memory exceeds some predetermined threshold. In another embodiment, the flushing occurs when a predefined amount of time has elapsed since the storage of the transactional information in the persistent volatile memory. In yet another embodiment, the flushing occurs when a program, such as the operating system of the computer, is not busy. Alternatively, the flushing occurs when a file is closed or when the computer is shut down.
[0018] In still another aspect, the invention features a method for providing persistent mass storage of transactional information. The method comprises the steps of: (a) receiving transactional information; (b) determining whether the transactional information meets a predetermined criteria; and (c) storing the transactional information that meets the predetermined criteria in a persistent cache. In one embodiment, the transactional information comprises unbuffered writes to disk, which are writes requested by an application and in which notification to the application of the completion of the write is necessary. For example, an unbuffered write can include copying a file from one directory to another directory, backing up and/or updating a file, and initializing a file.
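
The flushing conditions listed in paragraph [0017] can be combined into a single decision function, sketched below. The patent describes these conditions as alternative embodiments, so evaluating all of them together is an assumption made for illustration; the names `FlushState` and `should_flush`, the 90% fill threshold, and the 5-second age limit are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FlushState:
    bytes_buffered: int         # data currently held in persistent volatile memory
    capacity: int               # allotted size of the persistent volatile memory
    seconds_since_store: float  # time elapsed since the information was stored
    system_busy: bool           # whether the operating system is servicing the application
    file_closed: bool           # whether the target file has been closed
    shutting_down: bool         # whether the computer is being shut down


def should_flush(state: FlushState,
                 fill_threshold: float = 0.9,
                 max_age_seconds: float = 5.0) -> bool:
    """Return True when any of the described flush conditions holds."""
    if state.bytes_buffered == 0:
        return False  # assumption: nothing buffered means nothing to flush
    return (
        state.bytes_buffered >= fill_threshold * state.capacity  # threshold reached
        or state.seconds_since_store > max_age_seconds           # predefined time elapsed
        or not state.system_busy                                 # program is not busy
        or state.file_closed                                     # file was closed
        or state.shutting_down                                   # computer shutting down
    )
```

A flushing thread could evaluate such a function each time it polls the persistent volatile memory.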
[0019] In yet another aspect, the invention features a persistent volatile memory and an intermediary program. The intermediary program receives transactional information and stores the transactional information in the persistent volatile memory. The contents of the persistent volatile memory remain unaltered through a system failure. In one embodiment, the intermediary program is a filter driver module that identifies particular transactional information to store in the persistent volatile memory. In another embodiment, the invention includes a flushing thread to flush the contents of the persistent volatile memory to a persistent non-volatile memory.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] These and other advantages of the invention may be more clearly understood with reference to the specification and the drawings, in which:
[0021] FIG. 1 is a block diagram of an embodiment of a computer known to the prior art.
[0022] FIG. 2 is a block diagram of an embodiment of a computer constructed in accordance with the present invention.
[0023] FIG. 3 is a functional block diagram of the embodiment of the present invention depicted in FIG. 2.
[0024] FIG. 4 is a flowchart depicting the state of a look-ahead buffer while processing a write request atomically in accord with the present invention.
[0025] FIG. 5 is a flowchart depicting the recovery of completed transaction information during a boot cycle in accord with the present invention.
[0026] FIG. 6 is a flowchart depicting the operation of an embodiment of the present invention.
[0027] FIG. 7 is an embodiment of an interface presented to the user to solicit information for the configuration of the present invention;
[0028] FIG. 8 is a flowchart describing the operation of an embodiment of the invention directed to the contents of non-volatile storage;
[0029] FIG. 9 is a flowchart depicting an embodiment of the operation of the computer of FIG. 1 to store unbuffered writes in persistent memory; and
[0030] FIG. 10 is a flowchart depicting an embodiment of the steps of flushing persistent memory as shown in FIG. 9.
[0031] In the drawings, like reference characters generally refer to the same parts throughout the different views. Also, the drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0032] Referring to FIG. 1, a computer 10 known to the prior art typically includes a microprocessor 12, a memory 14 for storing programs and/or data, an input/output (I/O) controller 16, and a system bus 18 allowing communication among these components. The memory 14 in such a computer typically includes random-access memory (RAM) 20, read-only memory (ROM) 22, and non-volatile random-access memory (NVRAM) 24. The RAM 20 typically contains an operating system 26 and one or more device drivers 28 that permit access to various peripherals by the operating system 26. The ROM 22 typically contains a basic input-output system (BIOS) 30 that handles the boot process of the computer 10. One or more input devices, such as an alphanumeric keyboard or a mouse, and one or more output devices, such as a display or a printer, are also typically included in the computer 10. In some embodiments the computer 10 will also include a network connection. The computer 10 typically has a mass storage device 32 such as a magnetic disk or magneto-optical drive. In particular, some computers 10 have redundant arrays of inexpensive disks (RAID arrays) used as failure-tolerant mass storage 32.
[0033] In brief overview, applicant's invention provides a persistent volatile memory in a computer while avoiding the failings of the prior art. This is achieved by partitioning the volatile computer memory into two regions: a non-persistent memory region that is directly accessible to the operating system and typically is initialized or modified during a boot cycle, and a persistent memory region whose contents are not initialized or modified during a boot cycle. In one embodiment, the operating system can indirectly access this persistent memory region through an intermediary program such as a device driver. In another embodiment, the intermediary program invokes operating-system level functionality to enable the operating system to access this persistent memory region only after the boot cycle is completed. This invention is particularly useful in a system-critical fault-tolerant computer that offers continuous availability. Of course, it is to be understood that the invention may include multiple persistent and non-persistent memory regions. For simplicity of explanation and depiction, the following discussion assumes two memory regions, one persistent and one non-persistent. Further, the invention would work equivalently if two independent memory units were used instead of two regions of one memory. Thus, when memory regions are discussed, equivalent descriptions apply for two independent memory units.
[0034] One embodiment of the present invention is shown in FIGS. 2 & 3. In the embodiment shown, the system includes a computer 10' with RAM 20 partitioned into two different regions. The first memory region 40 is directly accessible to the operating system and is typically initialized or modified during a boot cycle. A device driver 28' handles read requests 50 and write requests 52 from the operating system 26 directed to the second memory region 42. One skilled in the art would recognize that a device driver can be replaced by other intermediary programs that provide the same functionality. A modified BIOS 30' prevents the operating system 26 from directly accessing the contents of the second memory region 42. The second memory region 42 is not directly accessible to the operating system 26 and therefore is not modified or initialized during a boot cycle. In one embodiment of the present invention, configuration information 44 regarding the location and size of the second memory region 42 is stored in an entry in NVRAM 24.
[0035] Referring again to FIG. 1, during a normal boot operation a computer 10 typically invokes a BIOS 30 that typically provides low-level access to peripherals; identifies RAM 20 available to the processor 12; initializes this RAM 20, typically destroying its contents; and then installs the operating system 26 into RAM 20, giving the operating system access to the entire RAM 20 to move information into and out of memory as necessary. If the computer 10 is started after having been powered down, all of its memory will have been initialized.
[0036] In contrast, referring to FIG. 2, during a normal boot operation a computer 10' constructed in accordance with the present invention invokes a modified BIOS 30'. The modified BIOS 30' retrieves configuration information 44 from NVRAM 24. This configuration information 44 includes the start address and the size of persistent memory. If both these values are zero or non-existent, then the modified BIOS 30' knows that the invention is either not installed or disabled. If the size is non-zero, but the start address is zero, then the start address is recalculated by subtracting the size of the persistent memory from the total memory size and storing this result as the new start address.
[0037] The modified BIOS 30' then divides the RAM 20 into two memory regions: a first non-persistent memory region 40 ranging from address 0 up to, but not including, the persistent memory start address, and a second persistent memory region 42 consisting of all memory from the persistent memory start address up to, but not including, the sum of the persistent memory start address and the persistent memory size. The system then initializes the first memory region 40. The modified BIOS 30' still provides low-level access to peripherals, but installs the operating system 26 into the first memory region 40 of RAM 20 and prevents the operating system 26 from directly accessing the second memory region 42 during the boot cycle and normal computer operation. The operating system 26 is, in effect, unaware of the second memory region 42. The operating system 26 typically initializes or installs its own programs into the first memory region 40, often modifying the contents of the first memory region 40, but does not modify the contents of the second memory region 42 of which it is unaware. This renders the contents of the second memory region 42 persistent through a boot cycle.
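The address arithmetic of paragraphs [0036] and [0037] can be illustrated with a short sketch. It is not the BIOS code of any embodiment; the names `partition_ram` and `Regions` are hypothetical, and the error raised for values that do not fit is an assumption, since the text does not say how such values are handled.

```python
from typing import NamedTuple, Optional


class Regions(NamedTuple):
    non_persistent: range  # addresses the operating system may use
    persistent: range      # addresses preserved across a boot cycle


def partition_ram(total_size: int, start: int, size: int) -> Optional[Regions]:
    """Split RAM into a non-persistent and a persistent address range.

    Returns None when both configuration values are zero, which the
    text treats as "not installed or disabled".
    """
    if start == 0 and size == 0:
        return None
    if start == 0:
        # Size given without a start address: recompute the start so the
        # persistent region occupies the top of RAM.
        start = total_size - size
    if size <= 0 or start < 0 or start + size > total_size:
        # Assumption: the source does not describe this error path.
        raise ValueError("persistent region does not fit in RAM")
    # Region 40 is [0, start); region 42 is [start, start + size).
    return Regions(range(0, start), range(start, start + size))
```

For a 256 MB machine configured with a 64 MB persistent region and a zero start address, the sketch places the persistent region at the top 64 MB and leaves the lower 192 MB to the operating system.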
[0038] Typically the operating system 26 will load device drivers 28 to permit access to various peripheral devices. Referring again to FIG. 3, in one embodiment of the present invention the operating system 26 loads a device driver 28' that is aware of the second memory region 42 and is able to access its contents. The device driver 28' is aware of the second memory region 42 because it is also aware of and accesses the configuration information 44 stored in NVRAM 24. After loading the configuration information 44, the device driver 28' serves as an intermediary between the operating system 26 and the second memory region 42. The device driver 28' takes a read request 50 from the operating system 26 and returns information from the appropriate location in the second memory region 42. Similarly, the device driver 28' takes a write request 52 from the operating system 26 and stores information at the appropriate location in the second memory region 42. In another embodiment, an intermediary program installs and configures the invention, and then invokes operating-system level functionality to enable the operating system to access this second set of memory regions only after the boot cycle is completed.
[0039] For example, in one embodiment of the invention the operating system 26 is the Windows 2000 operating system. Under Windows 2000, the second memory region 42 accessible through the device driver 28' appears to the operating system 26 as a RAM disk, though the contents of a normal RAM disk do not survive a boot cycle, in contrast to the present invention. A Windows 2000 read request 50 or write request 52 includes an offset value (in bytes) and a length value (in bytes). The device driver 28' computes the appropriate location in the second memory region 42 by adding the offset value in the request to the start address of the persistent memory region. In one embodiment, the second memory region 42 includes 1 MB of configuration information, so the appropriate location is actually the sum of the offset value, the start address of the persistent memory, and 1 MB. For a read request 50, the device driver 28' copies a number of bytes equal in size to the length value from the computed location in the second memory region 42 to the user's buffer. For a write request 52, the device driver 28' copies a number of bytes equal in size to the length value passed by the operating system 26 from the user's buffer to the computed location in the second memory region 42. This interaction permits the operating system 26 to indirectly access the second memory region 42 without threatening the integrity of the contents of the second memory region 42 during a boot cycle. In another embodiment where Windows 2000 is the operating system 26, the device driver 28' invokes the functionality of the operating system 26 to map the computed location onto the virtual address space of the operating system 26 for the copy operation. Other operating system 26 functionality completes the copy operation and unmaps the computed location from the virtual address space of the operating system 26.
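The location computation in paragraph [0039] can be sketched as follows. This is a user-space illustration only, not a Windows 2000 driver: a `bytearray` stands in for RAM 20, the class name `PersistentRegionDriver` is hypothetical, and the bounds check is an assumption that the text does not describe.

```python
MB = 1 << 20
CONFIG_AREA = 1 * MB  # embodiment that keeps 1 MB of configuration information


class PersistentRegionDriver:
    """Toy intermediary between a caller and the persistent region."""

    def __init__(self, ram: bytearray, start: int, size: int):
        self.ram = ram      # stand-in for RAM 20
        self.start = start  # persistent memory start address
        self.size = size    # persistent memory size

    def _locate(self, offset: int, length: int) -> int:
        # Location = start address + 1 MB configuration area + request offset.
        location = self.start + CONFIG_AREA + offset
        # Assumption: requests outside the region are rejected.
        if offset < 0 or length < 0 or location + length > self.start + self.size:
            raise ValueError("request falls outside the persistent region")
        return location

    def read(self, offset: int, length: int) -> bytes:
        # Copy `length` bytes from the computed location to the caller.
        loc = self._locate(offset, length)
        return bytes(self.ram[loc:loc + length])

    def write(self, offset: int, data: bytes) -> None:
        # Copy the caller's bytes to the computed location.
        loc = self._locate(offset, len(data))
        self.ram[loc:loc + len(data)] = data
```

With a 4 MB `bytearray` whose upper 2 MB is persistent, `write(16, b"abc")` lands at byte address 2 MB + 1 MB + 16, and the lower 2 MB is never touched.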
[0040] It is possible for the operating system 26 to crash while a write request 52 to persistent memory is being executed. In that case, an incomplete version of the request would be stored in the second persistent memory region 42. This can cause problems during subsequent operation, as a computer application may attempt to restore its state based on this incomplete information, potentially crashing the application and necessitating time-consuming reconstruction of the information lost during the crash.
[0041] To prevent this problem, the present invention creates a look-aside buffer in the second persistent memory region 42 and uses it for the atomic update and storage of transaction information; only when the write request 52 has been buffered and completed is it transferred out of the look-aside buffer. In greater detail, a look-aside buffer includes a set of bits that describe its state. FIG. 4 shows how the state of the look-aside buffer changes to reflect various stages in the processing of a write request 52.
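The state handling that this part of the description attributes to FIG. 4 (write path) and FIG. 5 (recovery at reboot) can be sketched as a small state machine. The sketch is illustrative only: the class and function names are hypothetical, the `crash_at` parameter exists solely to simulate a failure at a given stage, and ordinary Python objects stand in for memory that would have to survive a reboot.

```python
EMPTY, HEADER_WRITTEN, DATA_BUFFERED = 0, 1, 2  # buffer states 0, 1 and 2


class LookAsideBuffer:
    def __init__(self):
        self.state = EMPTY
        self.location = 0
        self.length = 0
        self.data = b""


def atomic_write(buf, region, location, data, crash_at=None):
    """Write `data` into `region` through the look-aside buffer.

    `crash_at` (hypothetical) stops the function early to mimic a crash.
    """
    buf.location, buf.length = location, len(data)
    buf.state = HEADER_WRITTEN                 # location and length stored
    if crash_at == "copy_in":
        return                                 # crash: state stays 1
    buf.data = bytes(data)                     # copy request into buffer
    buf.state = DATA_BUFFERED                  # copy succeeded: state 2
    if crash_at == "copy_out":
        return                                 # crash before final copy
    region[location:location + len(data)] = buf.data
    buf.state = EMPTY                          # transaction finished


def recover(buffers, region):
    """Reboot-time pass over every look-aside buffer."""
    for buf in buffers:
        if buf.state == DATA_BUFFERED:
            # Completed transaction that never reached its destination.
            region[buf.location:buf.location + buf.length] = buf.data
        # State 1 holds an incomplete transaction and is discarded.
        buf.state = EMPTY
```

The design point is that the destination is only ever overwritten from a fully buffered copy, so a crash leaves either the old contents or a replayable complete request, never a partial write.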
[0042] When no information is in the buffer, for example at the creation and initialization of the buffer, the buffer state is 0 (Step 10). When a write request 52 is received by the device driver 28', the driver 28' stores the computed location and the length (in bytes) of the request in the look-aside buffer and the state of the buffer becomes 1 (Step 12). At this point, the actual contents of the write request are copied into the buffer (Step 14). If the copy is successfully completed, the buffer state becomes 2 (Step 16). If the copy fails because of, for example, a system crash, the buffer state remains at 1. Once the buffer state is set to 2, the contents of the write request are copied out of the look-aside buffer to their computed location in the second persistent memory region 42 (Step 18). When this is successfully completed, the buffer state returns to 0 (Step 10).
[0043] The effect of the value of the state of the look-aside buffer on the subsequent boot process is depicted in FIG. 5. At system reboot (Step 20), the device driver 28' locates all the look-aside buffers in the second persistent memory region 42. If there are no more look-aside buffers to check (Step 22), the system boot process continues (Step 24). If there are more look-aside buffers (Step 22), the device driver proceeds to examine the state of each look-aside buffer, one at a time, in the second persistent memory region 42 (Step 26). If the state of the buffer presently under examination is 0, the device driver 28' knows that there is no information stored in the look-aside buffer and the device driver 28' checks the next look-aside buffer (Step 22). If the buffer state is 1, the device driver 28' knows that the information in the look-aside buffer is the result of an incomplete transaction and should not be moved into the second persistent memory region 42 for recovery by a computer application. The device driver 28' sets the state of this buffer to 0 (Step 30) and checks the next look-aside buffer (Step 22). If the buffer under examination is in state 2, then the device driver 28' knows that the contents of the look-aside buffer are the result of a completed transaction that did not get copied into the second persistent memory region 42. The device driver 28' copies the contents of the look-aside buffer to the computed location in the second persistent memory region 42 (Step 28). When the copy is completed, the buffer state is set to 0 (Step 30) and the device driver 28' checks the next look-aside buffer (Step 22). Eventually the device driver 28' will have checked the state of all the look-aside buffers, and the system boot will continue (Step 24).
[0044] Referring to FIG. 6, during a boot cycle the computer loads the programs implementing the invention into memory at Step 40. In one embodiment, the programs are a modified BIOS and a device driver. The programs divide the memory into two portions: a first region that is directly accessible to the operating system (Step 42) and a second region that is not directly accessible to the operating system (Step 44). This is accomplished through modifications to the BIOS. The inaccessibility to the operating system renders the contents of the second region resistant to initialization or modification during a boot cycle. Again, one skilled in the art will recognize that the present invention permits multiple persistent and non-persistent memory regions, but for the sake of simplicity of discussion and depiction, the present discussion assumes one persistent memory region and one non-persistent memory region.
[0045] Once the memory partitioning has been achieved, a device driver 28' or similar intermediary program provides indirect access to the second memory region 42 to the operating system. In Step 46, the device driver 28' waits for a read request 50 or a write request 52 from the operating system 26. The device driver 28' decides (Step 48) whether a read request has been received, and if one has, then the intermediary program reads (Step 50) from the appropriate location in the second memory region 42 and returns the result to the operating system 26. Similarly, if the device driver 28' decides (Step 52) that a write request 52 has been received, then the device driver stores (Step 54) information at the appropriate location in the second memory region 42. If neither type of request has been received, then the device driver returns to Step 46 and continues to wait for requests. Typically, read and write requests from the operating system to the first memory region operate as they would have before the installation of the present invention.
[0046] In the preferred embodiment, the computer programs comprising the present invention are on a magnetic disk drive and selecting an icon invokes the installation of the programs, initializing the present invention. In general, the programs implement the invention on the computer, and the programs either contain or access the data needed to implement all of the functionality of the invention on the computer.
[0047] Referring to FIG. 7, the user is prompted through a user interface to configure the invention prior to its installation. In one embodiment, this prompt takes the form of a dialog box, though other forms of prompting are feasible, typically including asking the user a series of questions or requiring the user to manually edit a configuration file. The dialog box typically permits the user to choose the size of the persistent memory and choose a designation through which the operating system accesses it. This configuration information provided by the user is stored in non-volatile storage, such as NVRAM 24 or as a file entry in a mass storage device 32. In a preferred embodiment, a dialog box includes a slider element 60, a pop-up menu element 62, and a text-entry element 64. The slider element 60 permits the user to select the size of the persistent memory. For example, the slider element 60 permits the user to select a persistent memory from 10 to 100 MB in size in 10 MB increments. Of course, particular numerical values may vary between implementations. The pop-up menu element 62 permits the user to assign a drive letter to the device driver 28' to permit the operating system to address it. The text-entry element 64 permits the user to designate a file name for the backing store file. A pop-up menu element 66 permits the user to designate the transaction size for the atomic update feature described above. When the user has entered the configuration information, the user selects the OK button 68 which initiates the embodiment of the present invention in the computer. The invention is installed and typically becomes effective after the next boot cycle. In one embodiment the present invention is implemented in RAM, but it may also be implemented in a storage-based virtual memory system or some other memory apparatus.
[0048] Referring back to FIG. 1, the present invention may be deployed to improve the operation of redundant mass storage 32 elements in the computer 10. For example, redundant mass storage elements may be provided as RAID level 1 disk arrays, RAID level 5 disk arrays, or other mass storage elements configured to provide RAID level 1 or RAID level 5 type functionality. Some current methods require that, after a system failure, one disk in a RAID level 1 disk array be copied in its entirety to the other disks in the array. Other methods use parity information to reconstruct the data (RAID level 5). Resynchronization of the RAID device after system failure monopolizes system resources for as long as the resynchronization runs. In contrast, the persistent volatile memory of the present invention enables the creation of a list of write transactions resistant to system failures. After a failure, recovery need only involve the recreation of data using the transactions in the list stored in persistent memory.
[0049] Referring to FIG. 8, at system boot (Step 100) the computer follows the process described above to create two regions of volatile memory: a non-persistent memory region that is directly accessible to the operating system and typically is initialized or modified during a boot cycle, and a persistent memory whose contents are not initialized or modified during a boot cycle. Next, a disk write list is created in persistent memory (Step 102). The amount of persistent memory allocated for the disk write list can vary with system configuration, but typically is at least one page (4 KB) in size. The disk write list is used to store a list of pending write transactions posted to mass storage.
[0050] In normal system operation, (Step 104) the system typically runs an operating system and various programs. Some of these programs are application programs, which provide the operator of the system with various functionalities utilizing the system, while other programs provide the system itself with additional functionality. Referring to FIG. 1, an example of the latter is a device driver 28, which permits the operating system 26 to access various peripherals.
[0051] The present invention uses a specialized device driver 28 that permits the operating system 26 to access mass storage 32, which is, in some embodiments, a RAID array. Referring to FIG. 8, during normal system operation the CPU will typically issue a write request to mass storage via the specialized device driver. The device driver records this write transaction in the disk write list in persistent memory and marks the transaction "in progress" (Step 106). The entry in the disk write list includes the start address for the write, the length of the data to be written, the mass storage volume identifier for the write being performed, and the status of the transaction as "in progress." Then the device driver attempts to complete the posting of the contents of the write transaction to a start address in mass storage. Under normal circumstances there is no system failure during the transaction (Step 108), the write is successfully completed, and the device driver deletes the now-completed entry in the disk write list (Step 110). In other embodiments, upon successful completion of a mass storage write, the entry in the write list is marked "complete." Normal system operation continues (Step 104).
[0052] In the event of system failure (Step 108), the system will need to be rebooted (Step 112). Upon reboot, the operating system through the device driver checks for entries in the disk write list in persistent memory (Step 114). If the disk write list has no entries in it, the system proceeds to normal system operation (Step 104).
[0053] If the disk write list has entries in it, then the operating system processes each entry in the disk write list. For each entry, the operating system assumes that the system failure (Step 108) occurred during the write transaction, and proceeds to complete the posting to the storage volume identified in the entry. In embodiments where the storage volume is a RAID level 1 device, the system reads data from the start address equal in length to the length in the disk write list from the primary disk and writes it to the secondary disks in the array. In embodiments where the storage volume is a RAID level 5 device, the system reads the data from all of the data volumes at the specified location in the specified length and reconstructs the corresponding parity data.
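The disk write list bookkeeping of FIG. 8 can be sketched for the RAID level 1 case as follows. The sketch makes several assumptions: a Python list stands in for the disk write list held in persistent memory, `MirroredVolume` is a toy two-disk mirror, and the `fail_after_primary` flag is a hypothetical hook that simulates a system failure between the two disk writes.

```python
from dataclasses import dataclass


@dataclass
class WriteEntry:
    start: int                  # start address of the write
    length: int                 # length of the data to be written
    volume: str                 # mass storage volume identifier
    status: str = "in progress"


class MirroredVolume:
    """Toy RAID level 1 volume with one primary and one secondary disk."""

    def __init__(self, size: int):
        self.primary = bytearray(size)
        self.secondary = bytearray(size)


def mirrored_write(write_list, volumes, volume, start, data,
                   fail_after_primary=False):
    entry = WriteEntry(start, len(data), volume)
    write_list.append(entry)                      # Step 106: record entry
    vol = volumes[volume]
    vol.primary[start:start + len(data)] = data
    if fail_after_primary:
        return                                    # Step 108: simulated failure
    vol.secondary[start:start + len(data)] = data
    write_list.remove(entry)                      # Step 110: delete entry


def replay(write_list, volumes):
    """Reboot-time pass (Step 114): finish every recorded transaction."""
    for entry in list(write_list):
        vol = volumes[entry.volume]
        end = entry.start + entry.length
        # RAID level 1: copy only the listed range from primary to secondary.
        vol.secondary[entry.start:end] = vol.primary[entry.start:end]
        write_list.remove(entry)                  # Step 116: delete entry
```

Only the byte ranges named in surviving entries are resynchronized, which is the contrast the text draws with copying an entire disk after a failure.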
[0054] When the outstanding transaction is completed, the entry in the disk write list corresponding to the completed transaction is deleted (Step 116). The process continues for each entry in the disk write list (Step 114). When all the "in progress" transactions in the disk write list have been completed, normal system operation resumes (Step 104).
[0055] The present invention may also be employed to improve the performance of unbuffered writes (i.e., writes requested by a software application in which notification to the application of the completion of the write is necessary) to the persistent mass storage 32. Examples of an unbuffered write include, without limitation, copying a file from one directory to another directory, backing up and/or updating a file, initializing a file (e.g., writing zeros to the file), and the like. Although described above and below with transactional information, any information can be used within the scope of the invention.
[0056] For example, the application can be a database management system (DBMS) that verifies all updates to a file before that file can be made available again to the application 18. It should be noted that a request to access the file for a read or write transaction can come from a different query in the same application or from a different application altogether.
[0057] In general and during a normal unbuffered write of transactional information, an application program executing on a computer typically writes transactional information to disk and the operating system of the computer transmits a confirmation message to the application when the write completes. If the application program does not receive a confirmation message after a predetermined time, the application often performs the previous write to disk again and subsequently waits for another confirmation message. This process is generally wasteful and decreases the performance of the application program.
[0058] In the invention described above and below, the application generates a write to the persistent mass storage 32 (e.g., disk). The operating system 26 instead uses the device driver 28 to write the information to the persistent volatile memory 42. Following the completion of this write to the persistent volatile memory 42, the operating system 26 transmits the confirmation to the application. The application receives the confirmation soon after the generation of the write, as a write to memory (e.g., persistent volatile memory 42) is faster than a write to the persistent mass storage 32. Therefore, in one embodiment the application receives the confirmation from the operating system 26 before the transactional information is stored in the persistent mass storage 32.
[0059] The filter driver 28 may be a passive filter or an active filter. A passive filter is a filter that monitors the information that the filter driver 28 stores in the persistent volatile memory 42. For example, the computer 10 configures the passive filter driver module 28 to monitor all unbuffered writes requested by a particular application, such as a DBMS. This may be used to help determine performance decreases associated with multiple unbuffered writes by a particular application.
[0060] As an active filter, the filter driver 28 receives an instruction to store in the persistent volatile memory 42 and performs some modification on the instruction before storing the instruction. For example, the active filter driver 28 may receive a certain type of transactional information, such as an unbuffered write to initialize a file by writing zeros to the file. The active filter driver 28 may alter the unbuffered write to write ones to the file if the operating system 26 determines the writing of ones to be necessary for initialization. In a further embodiment, the filter driver 28 is created by a file systems filter driver kit (FDDK), developed by Open Systems Resources, Incorporated of Amherst, New Hampshire.
[0061] Referring to FIG. 9, a logical flow chart depicts the operation of the computer 10 on unbuffered writes. The application generates (step 205) an unbuffered write and the filter driver 28 detects (step 210) the unbuffered write. After detecting the unbuffered write, the filter driver 28 updates (step 212) the table described above with information associated with the unbuffered write. For example, the filter driver 28 updates the status field to a reserved state to denote that the detected unbuffered write is about to be copied to the persistent volatile memory 42.
[0062] After updating the table, the filter driver 28 stores (step 215) the transactional information in the persistent volatile memory 42. In another embodiment, all writes (unbuffered writes and buffered writes) are stored in the persistent volatile memory 42. In yet another embodiment, the filter driver 28 stores the transactional information in volatile memory 40, makes a copy of the transactional information stored in the volatile memory 40, and then transfers the copy into the persistent volatile memory 42.
[0063] In one embodiment and as further described below, the operating system 26 additionally starts a timer to enable future recordation of the time elapsed from the transferring of the transactional information to the persistent volatile memory 42. In another embodiment, the operating system 26 stores the time read from a predetermined register located in the computer 10.
[0064] The filter driver 28 then updates (step 220) the table to denote that the transfer of the transactional information to the persistent volatile memory 42 is complete. In particular and in one embodiment, the filter driver 28 updates the status field associated with the particular transactional information to an in-use state. Thus, if a failure of the computer 10 occurs prior to the completion of a transmittal of transactional information to the persistent volatile memory 42, the filter driver 28 can determine that the transmittal of particular session information did not complete prior to the computer failure (i.e., the status field associated with the transactional information will not be set to an in-use state). If the filter driver 28 determines that the transactional information was not transmitted to the persistent volatile memory 42, then the filter driver 28 repeats step 215 to store the transactional information in the persistent volatile memory.
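The status-field handling of steps 212, 215 and 220 can be sketched as below. The layout of the table is not given in this section, so the dictionary keyed by a write identifier is an assumption, as are the class name `FilterDriverSketch` and the `fail_during_copy` flag used to simulate a failure; only the state names "reserved" and "in-use" come from the text.

```python
RESERVED, IN_USE = "reserved", "in-use"


class FilterDriverSketch:
    def __init__(self):
        self.table = {}       # write id -> status field (assumed layout)
        self.persistent = {}  # stand-in for persistent volatile memory 42

    def store(self, write_id, payload, fail_during_copy=False):
        """Return True once the payload is safely stored (ready to confirm)."""
        self.table[write_id] = RESERVED             # step 212
        if fail_during_copy:
            return False                            # simulated failure
        self.persistent[write_id] = bytes(payload)  # step 215
        self.table[write_id] = IN_USE               # step 220
        return True

    def incomplete_writes(self):
        # Entries still "reserved" never finished their transfer and are
        # candidates for repeating step 215.
        return [w for w, s in self.table.items() if s == RESERVED]
```

A caller would send the confirmation of step 223 only after `store` returns True, which is why the application can be notified before anything reaches mass storage.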
[0065] The operating system 26 then notifies (step 223) the application that the transactional information has been stored in the persistent volatile memory 42. In one embodiment, the notification occurs as a confirmation message to the application. The operating system 26 then determines (step 225) whether the operating system 26 should flush, or transfer, the persistent volatile memory 42 to the persistent mass storage 32. In one embodiment, the filter driver 28 includes a thread responsible for flushing the persistent volatile memory 42 to the persistent mass storage 32. As described further below, the thread can flush the persistent volatile memory 42 when a particular event occurs, such as when the operating system 26 transmits a message to the filter driver 28 instructing the filter driver 28 to flush the persistent volatile memory 42. In another embodiment, the thread may poll the persistent volatile memory 42 to determine whether the data stored in the persistent volatile memory should be flushed.
[0066] FIG. 10 illustrates an embodiment of the steps performed by the operating system 26 to determine (step 225) whether the operating system 26 should flush the persistent volatile memory 42 to the persistent mass storage 32. In one embodiment, the operating system 26 determines (step 230) whether the data stored in the persistent volatile memory 42 exceeds or is about to exceed some predetermined threshold (e.g., the allotted size of the persistent volatile memory 42). If so, the operating system 26 flushes (step 233) the persistent volatile memory 42 to the persistent mass storage 32. If not, the operating system 26 does not flush (step 234) the persistent volatile memory 42. The operating system 26 can also flush (step 233) the persistent volatile memory 42 if the operating system 26 determines (step 235) that a predefined amount of time since the transfer of the transactional information has elapsed. In some embodiments, the operating system 26 uses the timer described above to make this determination. In another embodiment, the operating system 26 records the current time stored in the register described above. Using this recorded time and the previously recorded time, the operating system 26 determines the amount of time that has elapsed since the transfer of the transactional information to the persistent volatile memory 42. If the elapsed time is greater than a predefined amount of time, the operating system 26 flushes (step 233) the persistent volatile memory 42.
[0067] In another embodiment, the operating system 26 determines (step 240) to flush the persistent volatile memory 42 when the operating system 26 is not servicing the application (i.e., the operating system is not busy). For example, the operating system 26 flushes the persistent volatile memory 42 when the application is idle.
[0068] Alternatively, the operating system 26 determines (step 245) whether the application requests to write the transactional information to a file that has previously been closed. In one embodiment, the application transmits a message to the operating system 26 when the application closes. If so, the operating system 26 flushes (step 233) the persistent volatile memory 42. The operating system 26 additionally flushes (step 233) the persistent volatile memory 42 when the computer 10 is in the process of being (step 250) shut down. In one embodiment, the filter driver 28 additionally updates (step 255) the table to denote that the transactional information has been stored in the persistent mass storage 32.
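The flush conditions that the description associates with FIG. 10 can be gathered into a single predicate. This is a sketch under stated assumptions: the field names are hypothetical, "exceeds or is about to exceed" is approximated by a greater-than-or-equal comparison, and the `file_closed` flag is one reading of step 245, which the text describes only briefly.

```python
from dataclasses import dataclass


@dataclass
class FlushInputs:
    bytes_stored: int              # data held in persistent volatile memory
    threshold_bytes: int           # predetermined threshold
    seconds_since_transfer: float  # elapsed time since the transfer
    max_age_seconds: float         # predefined amount of time
    application_idle: bool = False
    file_closed: bool = False
    shutting_down: bool = False


def should_flush(s: FlushInputs) -> bool:
    """Return True when any of the described flush conditions holds."""
    return (
        s.bytes_stored >= s.threshold_bytes                 # step 230
        or s.seconds_since_transfer > s.max_age_seconds     # step 235
        or s.application_idle                               # step 240
        or s.file_closed                                    # step 245
        or s.shutting_down                                  # step 250
    )
```

A flushing thread of the kind mentioned in the text could evaluate such a predicate either when it is signalled by the operating system or on each polling pass.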
[0069] By storing the transactional information in a persistent volatile memory 42, the information is accessible to the computer 10 at any instant in time. Therefore, a retrieval of such information does not significantly hamper the performance of the computer 10. Additionally, the persistent volatile memory 42 (and the persistent mass storage 32) enable the session information to be accessible to the computer after a computer failure, which would ordinarily erase the information from a volatile memory.
[0070] As an example and referring again to FIG. 9, assume the transaction is a catalog merchandise order phoned in by a customer and entered into the computer 10 by a customer representative. The representative enters the order into an application associated with the catalog and the application generates (step 205) the writes to a database file that should occur for the order. In particular, the order transaction involves checking an inventory database file, confirming that the item is available, placing the order and confirming that the order has been placed. If these steps are considered a single transaction, then all of the steps are to be completed before the transaction is successful and the inventory database file is actually changed to reflect the new order.
[0071] In greater detail, the application associated with the catalog checks the inventory database file and confirms that the item is available. If the item is available, the application places the order. In some prior art computer systems, the application stores the transactional information in RAM 20 so that the application (e.g., the DBMS) can commit the information to persistent mass storage 32 at a later time. If the computer 10 fails (e.g., crashes) at this point in time, the order would have been placed, but the information previously stored in RAM 20 is erased. Consequently, the transactional information has not been reflected in the inventory database file and, therefore, to retrieve this information to update the inventory database file, the application frequently has to repeat the transaction.
[0072] Unlike the above scenario, the invention uses the filter driver 28 to determine (step 210) that the transaction involves an unbuffered write to the inventory database file. The filter driver 28 then stores (step 215) the transactional information to the persistent volatile memory 42 so that the transactional information can survive a crash of the computer 10. Thus, if the failure of the computer 10 occurs, the transactional information has already been stored in the persistent volatile memory 42. The operating system 26 notifies (step 223) the application of the completion of the write and then determines (step 225) whether to flush the persistent volatile memory 42, as described above.
[0073] Many alterations and modifications may be made by those having ordinary skill in the art without departing from the spirit and scope of the invention. Therefore, it must be expressly understood that the illustrated embodiment has been shown only for the purposes of example and should not be taken as limiting the invention, which is defined by the following claims. The following claims are thus to be read as not only literally including what is set forth by the claims but also to include all equivalent elements for performing substantially the same function in substantially the same way to obtain substantially the same result, even though not identical in other respects to what is shown and described in the above illustration.

Claims

What is claimed is:
1. A computer with an operating system and persistent memory, comprising: a memory, comprising: a non-persistent memory region, directly accessible by the operating system; and a persistent memory region; and an intermediary program in communication with the operating system and the persistent memory region, wherein the intermediary program enables the operating system to address a persistent memory region.
2. The computer of claim 1 wherein a non-persistent memory region and a persistent memory region are different physical memories.
3. The computer of claim 1 wherein the intermediary program is a device driver.
4. The computer of claim 1 additionally comprising: a basic input/output system (BIOS), which prevents direct access to the persistent memory region by the operating system.
5. The computer of claim 1 wherein the persistent memory region is allocated to redundant CPU memory locations.
6. The computer of claim 1 additionally comprising: a non-volatile memory, wherein the non-volatile memory contains information concerning the configuration of the persistent memory region.
7. The computer of claim 1 additionally comprising: a non-volatile memory, wherein the non-volatile memory contains information concerning the configuration of the non-persistent memory region.
8. The computer of claim 1 additionally comprising: a file containing system settings, wherein the file containing system settings contains information concerning the configuration of the persistent memory region.
9. The computer of claim 1 additionally comprising: a file containing system settings, wherein the file containing system settings contains information concerning the configuration of the non-persistent memory region.
10. The computer of claim 1 wherein the persistent memory region comprises: a look-aside buffer, comprising: a set of state bits, and a buffer region for the storage of data, wherein the look-aside buffer is used for the atomic storage and update of write requests.
11. A storage medium with an encoded program which when loaded into a computer having an operating system and a memory partitioned into a non-persistent memory region and a persistent memory region, provides the computer with a persistent memory, said program comprising the steps of: (a) reading from the persistent memory region in response to requests coming from the operating system; and (b) writing to the persistent memory region in response to requests coming from the operating system.
12. The encoded program of claim 11 wherein the persistent memory region of the computer comprises a look-aside buffer, itself comprising a set of state bits and a buffer region for the storage of data, and step (b) further comprises the steps of: (b-a) setting the state bits to a first value before writing the contents of a request to the buffer region; (b-b) writing the contents of a request to the buffer region; (b-c) setting the state bits to a second value after successful completion of the writing of the contents of a request to the buffer region; (b-d) copying the contents of the buffer region to the appropriate location in the persistent memory; and (b-e) setting the state bits to a third value after successfully copying the contents of the buffer region to the appropriate location in the persistent memory.
13. In a computer system comprising an operating system, an intermediary program, and a memory, a method for providing persistent memory, the method comprising the steps of: (a) partitioning the memory into a non-persistent memory region and a persistent memory region; (b) providing an intermediary program in communication with the persistent memory region such that the persistent memory region is accessible to the operating system solely through the device driver.
14. The method of claim 13 wherein the contents of the persistent memory region retain their integrity during a boot cycle.
15. The method of claim 13 wherein the persistent memory region comprises a look-aside buffer, which the device driver uses for the atomic update and storage of write requests.
16. The method of claim 13 wherein step (a) comprises the steps of: (a-a) reading a stored address defining the start address of the persistent memory region; (a-b) reading a stored value defining the size of the persistent memory region; and (a-c) creating a persistent memory region at the start address equal in size to the stored value defining the size of the persistent memory region.
17. The method of claim 16 wherein the program reads the stored addresses and stored values defining the size of the persistent memory region from non-volatile memory.
18. An operating system memory environment comprising: a first memory mode region accessible to users and to the operating system; a second memory mode region accessible only to the operating system and not to users; and a third memory mode region not accessible by users and not directly accessible by the operating system.
19. The operating system memory environment of claim 18 wherein the operating system is a Microsoft Windows operating system.
20. A computer with a memory-mapped operating system and persistent memory, comprising: a volatile memory, comprising: a non-persistent memory region, directly accessible by the operating system; and a persistent memory region, whose locations are not mapped by the operating system.
21. The computer of claim 20 further comprising: a device driver in communication with the operating system and the persistent memory region, wherein the operating system addresses the persistent memory region via the device driver.
22. The computer of claim 21 wherein the persistent memory region comprises a look-aside buffer, which the device driver uses for the atomic update and storage of write requests.
23. The computer of claim 20 wherein the locations of the persistent memory region are mapped by the operating system or a user application once the boot cycle is complete.
24. A computer, comprising: non-volatile storage; and a persistent volatile memory, storing a list of write transactions posted to said persistent non-volatile memory.
25. The computer of claim 24 wherein said non-volatile storage is a redundant array of inexpensive disks (RAID array).
26. The computer of claim 24 additionally comprising: a non-persistent volatile memory, comprising: a device driver, which is adapted to receive write transactions directed to said non-volatile storage and record said write transactions in said persistent volatile memory.
27. The computer of claim 24 additionally comprising: an operating system repairing the contents of the persistent non-volatile memory in the event of system failure using the contents of the persistent volatile memory, wherein the operating system accomplishes repairs by completing said list of write transactions stored in said persistent volatile memory upon reboot.
28. A storage medium with an encoded program which when loaded into a computer having an operating system, a persistent volatile memory, and non-volatile storage, provides the computer with improved recovery from system failures, said program comprising the steps of: (a) receiving write transactions directed to the non-volatile storage from the operating system; (b) storing the write transaction as an entry in persistent volatile memory; and (c) writing the write transaction to the non-volatile storage.
29. The encoded program of claim 28 additionally comprising the step: (d) marking the write transaction stored in persistent volatile memory "completed" upon the successful completion of step (c).
30. In a computer system comprising an operating system, a device driver, a persistent volatile memory, and non-volatile storage, a method for providing improved recovery from system failures, the method comprising the steps of: (a) receiving a write transaction from the operating system; (b) storing the write transaction as an entry in the persistent volatile memory; (c) storing the write transaction in non-volatile storage.
31. The method of claim 30 additionally comprising the step: (d) marking the write transaction stored in persistent volatile memory "completed" following the successful completion of step (c).
32. In a computer system comprising an operating system, a device driver, a persistent volatile memory containing the stored contents of write transactions, and non-volatile storage, a method for providing improved recovery from system failures, the method comprising the steps of: (a) selecting those write transactions in persistent volatile memory marked uncompleted; (b) reconstructing information related to the uncompleted write transactions from the persistent volatile memory and completing the transaction; and (c) marking the uncompleted write transactions as completed after the successful completion of step (b).
33. A method for storing transactional information in a computer comprising: (a) receiving transactional information; (b) storing the particular transactional information in a persistent volatile memory on the computer; and (c) retrieving the transactional information after a computer failure by accessing the transactional information stored in the persistent volatile memory on the computer.
34. The method of claim 33 further comprising identifying particular transactional information.
35. The method of claim 33 further comprising flushing the persistent volatile memory to a persistent mass storage device.
36. The method of claim 35 further comprising determining when the transactional information stored in the persistent volatile memory exceeds some predetermined threshold.
37. The method of claim 35 further comprising determining when a predefined amount of time has elapsed since the storage of the transactional information in the persistent volatile memory.
38. The method of claim 35 further comprising determining when a program is not busy.
39. The method of claim 38 wherein the program is an operating system of the computer.
40. The method of claim 35 further comprising determining when a file is closed.
41. The method of claim 35 further comprising determining when the computer is being shut down.
42. The method of claim 33 further comprising making a copy of the transactional information.
43. The method of claim 42 further comprising transferring the copy of the transactional information to the persistent volatile memory.
44. In a computer comprising predetermined criteria, a persistent volatile memory comprising a persistent cache, and a persistent non-volatile memory, a method for providing persistent mass storage of transactional information, comprising the steps: (a) receiving transactional information; (b) determining whether the transactional information meets the predetermined criteria; and (c) storing the transactional information meeting the predetermined criteria in the persistent cache.
45. The method of claim 44 further comprising the step of: (d) updating the contents of the persistent non-volatile memory to mirror the contents of the persistent cache.
46. The method of claim 45 wherein the transactional information meeting the predetermined criteria comprises unbuffered writes to disk.
47. The method of claim 44 wherein the transactional information meeting the predetermined criteria comprises one of a file copy, a file backup, a file update, and a file initialization.
48. A computer for committing transactional information, the computer comprising: (a) a persistent volatile memory; (b) an intermediary program in communication with the persistent volatile memory, the intermediary program receiving transactional information and storing the transactional information in the persistent volatile memory, wherein the contents of the persistent volatile memory remain unaltered through a system failure.
49. The computer of claim 48 further comprising a filter driver module to identify particular transactional information.
50. The computer of claim 49 further comprising storing the particular transactional information in the persistent volatile memory.
51. The computer of claim 48 wherein the persistent volatile memory further comprises a persistent cache.
52. The computer of claim 51 further comprising a persistent non-volatile memory, wherein the computer flushes the contents of the persistent cache to the persistent non-volatile memory.
53. The computer of claim 51 further comprising a flushing thread to flush the contents of the persistent cache.
54. The computer of claim 48 wherein the transactional information comprises unbuffered writes to disk.
55. The computer of claim 48 wherein the transactional information comprises one of a file copy, a file backup, a file update, and a file initialization.
56. The computer of claim 48 further comprising a data structure to describe the state of the transactional information stored in the persistent volatile memory.
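The look-aside buffer update recited in claim 12, steps (b-a) through (b-e), can be sketched in Python as a three-state protocol. This is an illustrative sketch only: the names `State`, `LookAsideBuffer`, `write`, and `recover` are hypothetical, the `crash_after` parameter merely simulates an interruption, and the recovery policy (replay a complete buffer, discard a partial one) is an assumption about how the state bits could be used after a reboot; the claims themselves recite only the write sequence.

```python
from enum import Enum


class State(Enum):
    WRITING_BUFFER = 1   # "first value": buffer contents may be partial
    BUFFER_VALID = 2     # "second value": buffer complete, copy not finished
    IDLE = 3             # "third value": copy to its final location finished


class LookAsideBuffer:
    def __init__(self, region_size):
        self.region = bytearray(region_size)  # stand-in for the persistent region
        self.state = State.IDLE               # stand-in for the state bits
        self.pending = None                   # buffer region: (offset, data)

    def write(self, offset, data, crash_after=None):
        """Apply one write request; crash_after simulates a failure point."""
        if offset < 0 or offset + len(data) > len(self.region):
            raise ValueError("write falls outside the persistent region")
        self.state = State.WRITING_BUFFER             # step (b-a)
        self.pending = (offset, bytes(data))          # step (b-b)
        if crash_after == "buffer-write":
            return
        self.state = State.BUFFER_VALID               # step (b-c)
        if crash_after == "buffer-valid":
            return
        self._copy_pending()                          # step (b-d)
        self.state = State.IDLE                       # step (b-e)

    def _copy_pending(self):
        offset, data = self.pending
        self.region[offset:offset + len(data)] = data

    def recover(self):
        """Assumed reboot policy: finish a valid buffer, drop a partial one."""
        if self.state is State.BUFFER_VALID:
            self._copy_pending()
            outcome = "replayed"
        elif self.state is State.WRITING_BUFFER:
            outcome = "discarded"
        else:
            outcome = "clean"
        self.state = State.IDLE
        self.pending = None
        return outcome
```

Under these assumptions the final location holds either the old contents or the complete new contents after recovery, which is one way to read the "atomic storage and update of write requests" in claim 10.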
PCT/US2001/012138 2000-04-14 2001-04-13 Methods and apparatus for persistent volatile computer memory and related applications thereof WO2001080008A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP01925016A EP1277115A2 (en) 2000-04-14 2001-04-13 Methods and apparatus for persistent volatile computer memory and related applications thereof
AU2001251617A AU2001251617A1 (en) 2000-04-14 2001-04-13 Methods and apparatus for persistent volatile computer memory and related applications thereof

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US09/550,108 US6842823B1 (en) 2000-04-14 2000-04-14 Methods and apparatus for persistent volatile computer memory
US09/550,108 2000-04-14
US09/664,483 2000-09-18
US09/664,483 US6802022B1 (en) 2000-04-14 2000-09-18 Maintenance of consistent, redundant mass storage images
US09/790,750 US6901481B2 (en) 2000-04-14 2001-02-22 Method and apparatus for storing transactional information in persistent memory
US09/790,750 2001-02-22

Publications (2)

Publication Number Publication Date
WO2001080008A2 (en) 2001-10-25
WO2001080008A3 (en) 2002-06-06

Family

ID=27415580

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/012138 WO2001080008A2 (en) 2000-04-14 2001-04-13 Methods and apparatus for persistent volatile computer memory and related applications thereof

Country Status (3)

Country Link
EP (1) EP1277115A2 (en)
AU (1) AU2001251617A1 (en)
WO (1) WO2001080008A2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9767015B1 (en) * 2013-11-01 2017-09-19 Amazon Technologies, Inc. Enhanced operating system integrity using non-volatile system memory
US10229011B2 (en) 2013-09-25 2019-03-12 Amazon Technologies, Inc. Log-structured distributed storage using a single log sequence number space

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8930330B1 (en) 2011-06-27 2015-01-06 Amazon Technologies, Inc. Validation of log formats
US9195542B2 (en) 2013-04-29 2015-11-24 Amazon Technologies, Inc. Selectively persisting application program data from system memory to non-volatile data storage
US10387399B1 (en) 2013-11-01 2019-08-20 Amazon Technologies, Inc. Efficient database journaling using non-volatile system memory
US9740606B1 (en) 2013-11-01 2017-08-22 Amazon Technologies, Inc. Reliable distributed messaging using non-volatile system memory
US10089220B1 (en) 2013-11-01 2018-10-02 Amazon Technologies, Inc. Saving state information resulting from non-idempotent operations in non-volatile system memory
US9760480B1 (en) 2013-11-01 2017-09-12 Amazon Technologies, Inc. Enhanced logging using non-volatile system memory
US10303663B1 (en) 2014-06-12 2019-05-28 Amazon Technologies, Inc. Remote durable logging for journaling file systems

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69126104T2 (en) * 1990-11-30 1997-08-28 Casio Computer Co Ltd Wrist watch
GB2281644A (en) * 1993-09-02 1995-03-08 Ibm Fault tolerant transaction-oriented data processing.
WO1995012848A1 (en) * 1993-11-03 1995-05-11 Eo, Inc. Recovery boot process
US5694583A (en) * 1994-09-27 1997-12-02 International Business Machines Corporation BIOS emulation parameter preservation across computer bootstrapping
US5794252A (en) * 1995-01-24 1998-08-11 Tandem Computers, Inc. Remote duplicate database facility featuring safe master audit trail (safeMAT) checkpointing
US5799305A (en) * 1995-11-02 1998-08-25 Informix Software, Inc. Method of commitment in a distributed database transaction

Also Published As

Publication number Publication date
EP1277115A2 (en) 2003-01-22
AU2001251617A1 (en) 2001-10-30
WO2001080008A3 (en) 2002-06-06

Similar Documents

Publication Publication Date Title
US6802022B1 (en) Maintenance of consistent, redundant mass storage images
EP0566966B1 (en) Method and system for incremental backup copying of data
US6460054B1 (en) System and method for data storage archive bit update after snapshot backup
US6618794B1 (en) System for generating a point-in-time copy of data in a data storage system
US6341341B1 (en) System and method for disk control with snapshot feature including read-write snapshot half
US7685180B2 (en) System and article of manufacture for transparent file restore
US8074035B1 (en) System and method for using multivolume snapshots for online data backup
US6061770A (en) System and method for real-time data backup using snapshot copying with selective compaction of backup data
US7480819B1 (en) Method for boot recovery
US7627727B1 (en) Incremental backup of a data volume
US8732121B1 (en) Method and system for backup to a hidden backup storage
US8051044B1 (en) Method and system for continuous data protection
US5379398A (en) Method and system for concurrent access during backup copying of data
JP3641183B2 (en) Method and system for providing instantaneous backup in a RAID data storage system
US7185048B2 (en) Backup processing method
US6842823B1 (en) Methods and apparatus for persistent volatile computer memory
JP6064608B2 (en) Storage device, backup program, and backup method
US20050204186A1 (en) System and method to implement a rollback mechanism for a data storage unit
US6901481B2 (en) Method and apparatus for storing transactional information in persistent memory
US6944789B2 (en) Method and apparatus for data backup and recovery
US7293146B1 (en) Method and apparatus for restoring a corrupted data volume
US6880053B2 (en) Instant refresh of a data volume copy
CN115793985A (en) A safe storage method, device, equipment and storage medium
US7165160B2 (en) Computing system with memory mirroring and snapshot reliability
EP1277115A2 (en) Methods and apparatus for persistent volatile computer memory and related applications thereof

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WWE Wipo information: entry into national phase

Ref document number: 2001925016

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2001925016

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Ref document number: 2001925016

Country of ref document: EP
