WO2018154697A1 - Storage system and recovery control method - Google Patents
- Publication number
- WO2018154697A1 (PCT/JP2017/007015)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- determination
- strip
- copyback
- read
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
Definitions
- the present invention generally relates to storage system recovery control.
- A storage system having a RAID (Redundant Array of Independent (or Inexpensive) Disks) group composed of a plurality of disks and provided with a spare disk is known.
- the data in the failed disk in the RAID group is restored to the spare disk (rebuild process), and the data is copied from the spare disk to the disk after the failed disk is replaced (copy back process).
- a storage device having a large storage capacity is adopted as a storage device (for example, a disk) constituting a RAID group.
- As a result, the I/O processing performance (the performance of processing in accordance with I/O requests) may be lowered.
- the storage system includes a plurality of storage devices that provide one or more logical RAID groups, and a processor unit that is one or more processors connected to the plurality of storage devices.
- Each RAID group is composed of two or more stripes.
- Each stripe is composed of two or more strips.
- Each of the plurality of storage devices has a plurality of strips and one or more spare areas.
- For each RAID group, at least one of the two or more storage devices that provide the two or more strips constituting any one stripe is different from at least one of the two or more storage devices that provide the two or more strips constituting any other stripe.
- the processor unit executes I / O processing according to the I / O request.
- the processor unit executes a rebuild process for restoring data corresponding to a plurality of strips of a problem storage device among a plurality of storage devices to a plurality of spare areas of two or more storage devices, respectively.
- After the rebuild process is completed and the problem storage device has been replaced, the processor unit executes a copyback process including a data copy process for copying data from the spare areas to those strips of the post-replacement storage device to which data has not yet been restored.
- The copyback process includes at least one of the following (W) and (R): (W) when a write request whose write destination is a strip in the post-replacement storage device is received during the copyback process, the data according to the write request is written to the write-destination strip in the write process according to that write request; (R) when a read request whose read source is such a strip is received during the copyback process, the data according to the read request is read from the corresponding spare area, returned, and also written to that strip. A strip to which data has been restored in this way is excluded from the data copy process.
- the storage device can be restored while reducing the decrease in I / O processing performance.
- FIG. 1 shows an example of the configuration of a computer system according to Example 1.
- FIG. 2 shows an example of a logical configuration and a physical configuration of a pool.
- FIG. 3 shows an example of programs and tables stored in a local memory.
- FIG. 4 shows an example of a pool state table.
- FIG. 5 shows an example of a copyback setting management table.
- FIG. 6 shows an example of a progress management table.
- FIG. 7 shows an example of a head address management table.
- FIG. 8 shows the flow of recovery processing.
- FIG. 9 shows an example of the details of a rebuild process.
- FIG. 10 shows an example of the details of a copyback process.
- FIG. 11 shows the detailed flow of a copyback process.
- FIG. 12 shows the detailed flow of a data copy process.
- FIG. 13 shows an example of write processing during copyback processing.
- FIG. 14 shows an example of the transition of a progress management table.
- FIG. 15 shows an example of read processing during copyback processing according to Example 2.
- the “interface unit” includes one or more interfaces.
- the one or more interfaces may be one or more interface devices of the same kind (for example, one or more NICs (Network Interface Cards)) or two or more interface devices of different kinds (for example, an NIC and an HBA (Host Bus Adapter)).
- the “storage unit” includes one or more memories.
- the at least one memory for the storage unit may be a volatile memory.
- the storage unit is mainly used during processing by the processor unit.
- the “processor unit” includes one or more processors.
- the at least one processor is typically a microprocessor such as a CPU (Central Processing Unit).
- Each of the one or more processors may be a single core or a multi-core.
- the processor may include a hardware circuit that performs part or all of the processing.
- In the following description, a process may be described with a “program” as the subject of the sentence.
- Since a program is executed by the processor unit (for example, a CPU (Central Processing Unit)) while using the storage unit (for example, a memory) and/or an interface device (for example, a communication port) as appropriate, the subject of a process may also be said to be the processor unit (or an apparatus or system having the processor unit).
- the processor unit may include a hardware circuit that performs part or all of the processing.
- the program may be installed in a computer-like device from a program source.
- the program source may be, for example, a program distribution server or a computer-readable (for example, non-transitory) recording medium.
- two or more programs may be realized as one program, or one program may be realized as two or more programs.
- the “host system” may be one or more physical host computers (for example, a cluster of host computers), or may include at least one virtual host computer (for example, a VM (Virtual Machine)).
- the host system is simply referred to as “host”.
- the “storage system” may be one or more storage devices.
- the “storage device” may be any device having a function of storing data in the storage device.
- the storage device may be a computer (for example, a general-purpose computer) such as a file server.
- at least one physical storage device may execute a virtual computer (for example, VM (Virtual Machine)), or may execute SDx (Software-Defined anything).
- As SDx, for example, SDS (Software Defined Storage) (an example of a virtual storage device) or SDDC (Software-Defined Datacenter) can be adopted.
- at least one storage device (computer) may have a hypervisor.
- the hypervisor may generate a server VM (Virtual Machine) that operates as a server and a storage VM that operates as a storage.
- the server VM may operate as a host that issues an I / O request
- the storage VM may operate as a storage controller that performs I / O to a drive in response to an I / O request from the server VM.
- When elements of the same kind are described without distinction, a common part of their reference signs is used; when they are described distinctly, their element IDs (or full reference signs) may be used.
- FIG. 1 shows an example of the configuration of a computer system according to the embodiment.
- the computer system has a host 101 and a storage system 102.
- the host 101 and the storage system 102 are connected to each other via a communication network 152.
- the host 101 transmits an I / O (Input / Output) request to the storage system 102.
- the I / O request includes I / O destination information indicating the location of the I / O destination.
- the I/O destination information includes, for example, the LUN (Logical Unit Number) of the LU (Logical Unit) of the I/O destination and the LBA (Logical Block Address) of the area in the LU.
- An LU is a logical volume (logical storage device) provided by the storage system 102. Based on the I/O destination information, the logical area of the I/O destination is identified, and the drive 124 underlying that logical area is identified.
- the storage system 102 includes a storage controller 103 and a drive box 121.
- the drive box 121 includes a plurality (or one) of pools 183.
- Each pool 183 includes a plurality of drives 124.
- the drive 124 is an example of a storage device (typically a non-volatile storage device), and is, for example, an HDD (Hard Disk Drive) or an SSD (Solid State Drive).
- the storage controller 103 includes a host I / F 111, a cache memory (CM) 112, a CPU (Central Processing Unit) 113, a drive I / F 114, and a local memory (LM) 115.
- the host I / F 111 and the drive I / F 114 are examples of the interface unit.
- the cache memory 112 and the local memory 115 are examples of a storage unit.
- the CPU 113 is an example of a processor unit.
- the host I / F 111 is an example of an interface device of the storage controller 103, and communicates with the host 101.
- the cache memory 112 temporarily stores data (write data) written from the host 101 to the drive 124 and data (read data) read from the drive 124.
- the CPU 113 executes various processes by executing a program stored in the local memory 115.
- the CPU 113 is connected to the host I / F 111, the cache memory 112, the drive I / F 114, and the local memory 115.
- the CPU 113 transmits various commands to the drives 124 of the drive box 121 via the drive I/F 114.
- the drive I / F 114 is an example of an interface device of the storage controller 103, and communicates with each drive 124.
- the local memory 115 stores various programs and various information.
- the storage controller 103 processes the I / O request received from the host 101. Specifically, for example, the storage controller 103 identifies the drive 124 that is the data I / O destination based on the I / O destination of the I / O request, and executes I / O for the identified drive 124. At this time, the storage controller 103 caches the I / O target data in the cache memory 112.
- FIG. 2 shows an example of the logical configuration and physical configuration of the pool 183.
- the pool 183 has a plurality of RAID groups (logical RAID groups) 223.
- a logical volume is provided based on the RAID group 223.
- Each RAID group 223 has a plurality of drives 224 (logical drives).
- Each RAID group 223 is a RAID group whose RAID level is RAID 6 and whose RAID configuration is 2D + 2P. That is, in each RAID group 223, each stripe is composed of four strips (unit storage areas) provided by four drives 224. In each stripe, two data elements (D) are stored in two of the strips, and two parities (P) based on the two data elements are stored in the other two strips.
- the plurality of RAID types (RAID level and RAID configuration) respectively corresponding to the plurality of RAID groups 223 may be the same.
- each pool 183 has a plurality of drive groups 123.
- the number of drive groups 123 is the same as the number of RAID groups 223, but may be different.
- the number of drives 124 constituting each drive group 123 is the same as the number of drives 224 constituting the RAID group 223, but may be different.
- each drive group 123 has four drives 124 (physical drives). Each drive group 123 does not constitute a RAID group by itself.
- the pool 183 logically configures the plurality of RAID groups 223 described above.
- each RAID group 223 is composed of two or more stripes.
- Each stripe is composed of two or more strips.
- Each drive 124 has a plurality of strips.
- At least one of the two or more drives 124 that respectively provide the two or more strips constituting the first stripe is different from at least one of the two or more drives 124 that respectively provide the two or more strips constituting the second stripe.
- the first stripe is any stripe
- the second stripe is any stripe other than the first stripe.
- A plurality of strips constituting the same stripe are distributed among different drive groups, and the drive positions (physical addresses in the drives) respectively corresponding to those strips differ from one another.
- the same number is given to each of the four strips constituting the same stripe.
- four strips 0-0 constituting the same stripe are distributed in a plurality of drive groups 123.
- each drive 124 has one or more spare areas (S) in addition to a plurality of strips.
- each drive 124 has one spare area.
- the “spare area” is a spare storage area.
- any strip is a component of the RAID group 223, but the spare area is not a component of the RAID group 223.
- data (data element or parity) in the strip in the failed drive is restored to the spare area, as will be described later.
- the size of each spare area is not less than the strip size.
- a distributed RAID configuration is adopted, and a spare area is provided in each drive 124 instead of providing a spare drive.
- the RAID level and the RAID configuration may be different.
- In this embodiment, the RAID level of each pool 183 is RAID 6 and the RAID configuration is 2D + 2P.
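To make the distributed arrangement concrete, the following Python fragment is a minimal sketch; the `Drive` class, `build_layout`, and the rotation rule are illustrative assumptions, not the mapping used by the patent. It only shows the two properties stated above: every drive holds strips plus its own spare area, and the set of drives serving one stripe is not identical to the set serving another.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class Drive:
    """One drive of the pool: numbered strips plus one spare area (hypothetical model)."""
    drive_no: int
    strips: Dict[Tuple[int, int], bytes] = field(default_factory=dict)  # (stripe, strip index) -> data/parity
    spare: Dict[int, bytes] = field(default_factory=dict)               # spare-area slot -> restored data

def build_layout(num_drives: int, num_stripes: int, width: int = 4) -> Dict[Tuple[int, int], int]:
    """Map each stripe's `width` strips (2D + 2P -> width 4) onto `width` distinct drives.

    Rotating the starting drive per stripe gives the property described in the text:
    two different stripes are not served by exactly the same set of drives.
    """
    assert num_drives > width
    return {(stripe, idx): (stripe + idx) % num_drives
            for stripe in range(num_stripes) for idx in range(width)}

drives = [Drive(n) for n in range(16)]
layout = build_layout(num_drives=len(drives), num_stripes=8)
for key, drive_no in layout.items():
    drives[drive_no].strips[key] = b""           # placeholder data element / parity
# Every drive keeps its own (still empty) spare area; there is no dedicated spare drive.
print(sorted(layout[(0, i)] for i in range(4)))  # [0, 1, 2, 3]
print(sorted(layout[(1, i)] for i in range(4)))  # [1, 2, 3, 4] -> overlapping but different drive set
```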
- FIG. 3 shows an example of programs and tables stored in the local memory 115.
- the local memory 115 stores various programs. Examples of the program include a host I / O processing program 301, a rebuild processing program 302, a copyback processing program 303, and a parity generation program 304.
- the host I / O processing program 301 processes an I / O request from the host 101.
- the rebuild process program 302 executes a rebuild process.
- the copy back processing program 303 executes copy back processing.
- the parity generation program 304 generates the parity stored in the stripe.
- the local memory 115 stores various tables. Examples of the table include a pool status table 305, a copyback setting management table 306, a progress management table 307, and a head address management table 308.
- FIG. 4 shows an example of the pool status table 305 and the status of the pool corresponding to the table 305.
- the pool state table 305 is information indicating the redundancy and state of the pool 183.
- the pool state table 305 manages entries that hold information such as the pool number 401, the redundancy 402, and the state 403 for each pool 183.
- Pool number 401 is a pool number.
- the redundancy 402 indicates the redundancy of the pool 183.
- The redundancy registered as the redundancy 402 is the lowest redundancy in the pool 183. For example, some stripes may have a redundancy of “1” while other stripes have a redundancy of “0”; in this case, the minimum value “0” is registered as the redundancy 402.
- the state 403 indicates a state related to processing to the pool 183.
- An example of processing for the pool 183 is copy back processing.
- Values of the state 403 include “in progress”, which means that copyback processing is in progress and the data copy processing described later is being performed; “stopped”, which means that copyback processing is in progress but the data copy processing is stopped; and “none”, which means that copyback processing is not in progress.
- In the example of FIG. 4, the redundancy and state of each of the pools 0 to 2 are as follows. In pool 0, two drives have failed and the redundancy has dropped from 2 to 0; the data copy processing in the copyback processing is being performed for pool 0. In pool 1, one drive has failed and the redundancy has dropped from 2 to 1; copyback processing is in progress for pool 1, but its data copy processing is stopped. In pool 2, no drive has failed.
- FIG. 5 shows an example of the copyback setting management table 306.
- the copyback setting management table 306 is a table for managing thresholds for determining whether or not to execute data copy processing.
- the copyback setting management table 306 manages entries that hold information such as a pool number 501, an I/O waiting time 502, a write rate 503, a CPU usage rate 504, and a determination waiting time 505 for each pool 183.
- Pool number 501 is the number of pool 183.
- the I/O waiting time 502 indicates a threshold for the time during which no I/O request (hereinafter referred to as host I/O) has arrived from the host 101 for the pool 183.
- the write ratio 503 indicates a threshold for the ratio of writes in the host I/O for the pool 183.
- the CPU usage rate 504 indicates a threshold for the usage rate of the CPU 113 with respect to processing for the pool 183.
- the determination waiting time 505 indicates the waiting time from the start or stop of the data copy processing until the next determination.
- the information 502 to 505 is prepared for each pool 183, but may be common to all the pools 183.
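As a minimal sketch of how such per-pool thresholds might be held in memory, the field names below follow items 501 to 505 of FIG. 5, but the class itself and the default values are assumptions for illustration only:

```python
from dataclasses import dataclass

@dataclass
class CopybackSettings:
    """Per-pool thresholds of the copyback setting management table (FIG. 5)."""
    pool_no: int                        # pool number 501
    io_wait_time_sec: float = 60.0      # I/O waiting time 502: no host I/O for this long -> copy
    write_ratio_pct: float = 30.0       # write rate 503: copy only while the write ratio stays below this
    cpu_usage_pct: float = 50.0         # CPU usage rate 504: copy only while the CPU load is below this
    judge_wait_time_sec: float = 10.0   # determination waiting time 505: re-judge after this long

# One entry per pool; as noted above, a single entry shared by all pools is also possible.
settings = {pool_no: CopybackSettings(pool_no) for pool_no in range(3)}
print(settings[0].write_ratio_pct)   # 30.0
```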
- FIG. 6 shows an example of the progress management table 307.
- the progress management table 307 is a table for managing the progress of the copy back process for the replaced drive (the drive replaced with the failed drive 124).
- the progress management table 307 manages entries that hold information such as an address 601, a completion flag 602, and a data position 603 for each strip included in the drive after replacement.
- An address 601 is a strip address (number).
- the completion flag 602 is a flag indicating whether or not the data restoration for the strip is completed. As the value of the completion flag 602, “1” is set when it is completed, and “0” is set when it is not completed.
- a data position 603 indicates a position (a combination of the number of the drive 124 and the address of the spare area in the drive 124) where the data (data to be copied back) to be restored exists in the strip.
- FIG. 7 shows an example of the head address management table 308.
- the head address management table 308 is a table for managing a position where the copy back process is not completed.
- the head address management table 308 stores the head address of the addresses 601 corresponding to the completion flag 602 “0” in the progress management table 307.
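A hypothetical in-memory model of the two tables might look as follows; the class names and methods are assumptions, and only the fields (address 601, completion flag 602, data position 603, and the head address) come from FIGS. 6 and 7:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

@dataclass
class ProgressEntry:
    """One row of the progress management table (FIG. 6) for a strip of the replaced drive."""
    done: bool = False                           # completion flag 602
    data_pos: Optional[Tuple[int, int]] = None   # data position 603: (drive number, spare-area address)

@dataclass
class ProgressTable:
    """Progress management table 307 plus the head address of table 308."""
    entries: Dict[int, ProgressEntry] = field(default_factory=dict)  # strip address 601 -> entry
    head_addr: Optional[int] = None              # first address whose completion flag is still 0

    def next_incomplete(self) -> Optional[int]:
        pending = [a for a, e in sorted(self.entries.items()) if not e.done]
        return pending[0] if pending else None

    def mark_done(self, addr: int) -> None:
        self.entries[addr].done = True
        self.head_addr = self.next_incomplete()

# Demo: strip 1 was already restored (for example by a host write), so the head address skips it.
t = ProgressTable(entries={0: ProgressEntry(data_pos=(1, 0)),
                           1: ProgressEntry(done=True),
                           2: ProgressEntry(data_pos=(3, 0))})
t.head_addr = t.next_incomplete()
print(t.head_addr)   # 0
t.mark_done(0)
print(t.head_addr)   # 2
```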
- the processing performed in this embodiment will be described by taking one pool 183 as an example.
- the pool 183 is referred to as a “target pool 183”.
- the meaning of each term is as follows.
- “Problem drive” is a drive in which a problem has occurred.
- “Problem” means a failure or a high possibility of occurrence of a failure.
- the problem drive is a failed drive.
- a “failed drive” is a drive in which a failure has occurred.
- a “failure strip” is a strip in a failed drive.
- As the problem drive, a failure candidate drive (a drive with a high possibility of failure) can also be adopted.
- a strip in a failure candidate drive may be referred to as a “failure candidate strip”.
- “Restoration” is a term that may be used to mean writing to a spare area in a drive or writing to a strip in a drive after replacement.
- Drive after replacement is a drive that has been replaced with a problem drive.
- “Recovery processing” is a term that means processing including rebuild processing and copy back processing.
- “Recovery” is a term that includes rebuild and copyback.
- “Rebuild process” is a process including a rebuild. “Rebuild” is to restore data corresponding to all fault strips to a plurality of spare areas, that is, collection copy.
- the “data corresponding to the failure strip” is typically data in the failure strip, but the data after updating the data in the failure strip may also correspond.
- When the problem drive is a failure candidate drive, the rebuild is to restore data in all failure candidate strips to a plurality of spare areas, that is, dynamic sparing.
- “Copy back processing” is processing including copy back.
- “Copy back” is a process corresponding to a data copy process to be described later, and is to copy data from a spare area to a strip in a drive after replacement. In this embodiment, data may be restored to the strip in the drive after replacement by copy back, or data may be restored according to host I / O instead of copy back.
- Data XX is data (data element or parity) in the strip XX (XX is a number).
- FIG. 8 shows the flow of recovery processing.
- the recovery process is started when, for example, the CPU 113 (for example, the rebuild process program 302) detects that a failure (failure) has occurred in the drive 124.
- the rebuild processing program 302 starts the rebuild process for restoring the same data as the data (data elements or parity) in the strips of the failed drive 124 to the spare areas of normal drives 124 (S801).
- FIG. 9 shows an example of the details of the rebuild process.
- In the example of FIG. 9, the failed drive 00 includes failure strips 0-1, 2-2, 1-1, and 2-3.
- Based on the data 0-1 in the normal drives 05, 06, and 0B, the rebuild processing program 302 restores the data 0-1 of the failed drive 00 to a spare area in any normal drive, for example, the spare area of the normal drive 01.
- the rebuild processing program 302 adds the position of the spare area (a combination of the drive 01 number and the address of the spare area) to the progress management table 307 corresponding to the target pool 183 as the data position 603.
- the rebuild processing program 302 restores the other data 2-2, 1-1 and 2-3 in the failed drive 00 to, for example, spare areas in the normal drives 02 to 04, and The positions of the spare areas are added to the progress management table 307 as data positions 603, respectively.
- a plurality of data in the failed drive are restored to the spare areas of a plurality of normal drives, respectively.
- data write destinations are not distributed to a single drive like a spare drive, but are distributed to a plurality of drives. For this reason, it is expected that the time required for the rebuild process can be shortened.
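A minimal sketch of this distribution of restore destinations follows; the `restore` and `record_data_position` callables are hypothetical stand-ins for the actual data reconstruction and the progress-table update, and the spare-area address 0 is an assumption:

```python
from itertools import cycle

def rebuild(failed_strips, normal_drive_nos, restore, record_data_position):
    """Spread the failed drive's strips over the spare areas of several normal drives,
    recording each restore destination as the data position 603 for the later copy back."""
    targets = cycle(normal_drive_nos)                        # rotate over the restore destinations
    for strip_addr in failed_strips:
        spare_drive = next(targets)
        restore(strip_addr, spare_drive)                     # rebuild the data from the stripe's peers
        record_data_position(strip_addr, (spare_drive, 0))   # spare-area address 0 is an assumption

# Demo mirroring FIG. 9: the four failed strips land in the spare areas of drives 01 to 04.
positions = {}
rebuild(failed_strips=["0-1", "2-2", "1-1", "2-3"],
        normal_drive_nos=[1, 2, 3, 4],
        restore=lambda strip, drive: None,
        record_data_position=positions.__setitem__)
print(positions)  # {'0-1': (1, 0), '2-2': (2, 0), '1-1': (3, 0), '2-3': (4, 0)}
```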
- the failed drive 124 is replaced, for example, by maintenance personnel (S802).
- the rebuild processing program 302 determines whether the rebuild process has been completed (S803). If the rebuild process is not completed (S803: NO), the rebuild processing program 302 continues the rebuild process. On the other hand, if the rebuild process has been completed (S803: YES), the rebuild processing program 302 outputs a drive replacement completion sign (for example, turns on a predetermined LED (Light Emitting Diode) provided in the storage system 102) (S804).
- the copyback processing program 303 executes the copyback processing in response to, for example, an instruction for copyback processing (or in response to detection of the completion of drive replacement) (S805).
- FIG. 10 shows an example of details of the copyback process.
- FIG. 10 corresponds to the continuation of the rebuild process shown in FIG.
- the copyback processing program 303 restores (copies) the data 0-1, 2-2, 1-1, and 2-3 in the spare areas of the drives 01 to 04 to the strips 0-1, 2-2, 1-1, and 2-3 of the post-replacement drive 00, respectively.
- FIG. 11 shows the detailed flow of the copyback process.
- the copyback processing program 303 refers to the redundancy 402 corresponding to the target pool 183 and determines whether the redundancy 402 is “0” (S1101). If the determination result in S1101 is true (S1101: YES), the copyback processing program 303 executes data copy processing (S1105). This is to increase the reliability of data protection of the target pool 183.
- the redundancy “0” is an example of a redundancy threshold. The threshold may be greater than zero.
- If the determination result in S1101 is false (S1101: NO), the copyback processing program 303 determines whether the time during which no host I/O has arrived for the target pool 183 is equal to or longer than the I/O waiting time 502 corresponding to the target pool 183 (S1102). If the determination result in S1102 is true (S1102: YES), the copyback processing program 303 executes the data copy processing (S1105). This is because, since no host I/O is being performed, the load on the CPU 113 is relatively low, and it is considered efficient to use it for the data copy processing.
- If the determination result in S1102 is false (S1102: NO), the copyback processing program 303 determines whether the write ratio in the host I/O to the target pool 183 is less than the write ratio 503 corresponding to the target pool 183 (S1103). If the determination result in S1103 is true (S1103: YES), the copyback processing program 303 executes the data copy processing (S1105). This is because, if the write ratio is low, there is little possibility that data will be restored to strips in the post-replacement drive without data copying between drives by the later-described write processing during copyback processing.
- If the determination result in S1103 is false (S1103: NO), the copyback processing program 303 determines whether or not the CPU usage rate related to the target pool 183 is less than the CPU usage rate 504 corresponding to the target pool 183 (S1104). If the determination result in S1104 is true (S1104: YES), the copyback processing program 303 executes the data copy processing (S1105). This is because the load on the CPU 113 is relatively low, and it is considered more efficient to use it for the data copy processing.
- If the determination result in S1104 is false (S1104: NO), the copyback processing program 303 stops the data copy processing (S1106). At this time, if the state 403 corresponding to the target pool 183 is not “stopped”, the copyback processing program 303 updates the state 403 to “stopped”. If the data copy processing is already stopped (that is, if the state 403 is already “stopped”), S1106 may be skipped (the data copy processing simply remains stopped).
- the copy back processing program 303 determines whether the elapsed time from the data copy processing stop time is equal to or longer than the determination waiting time 505 corresponding to the target pool 183 (S1107).
- If the determination result in S1107 is true (S1107: YES), the copyback processing program 303 determines whether the copyback processing has been completed (S1108).
- the determination in S1108 is a determination as to whether or not the completion flags 602 corresponding to all the strips in the post-replacement drive are “1” in the progress management table 307 corresponding to the target pool 183. If the determination result in S1108 is false (S1108: NO), the process returns to S1101. On the other hand, if the determination result in S1108 is true (S1108: YES), the copyback processing is terminated.
- In this embodiment, S1101, S1102, S1103, and S1104 are in descending order of priority, but the determinations may be made in a different order. Further, the data copy processing (S1105) may be executed when at least one of the determination results of S1101 to S1104 is true, or the data copy processing (S1105) may be stopped when at least one of the determination results of S1101 to S1104 is false. For example, if the redundancy 402 of the target pool 183 is larger than a threshold (for example, “0”) (S1101: NO), the data copy processing may be stopped regardless of the results of the other determinations. Similarly, if the write ratio is equal to or higher than the write ratio 503 (S1103: NO), the data copy processing may be stopped regardless of the results of the other determinations.
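The determination flow of S1101 to S1104 can be sketched as follows. The input classes, field names, and threshold defaults are assumptions for illustration; only the OR-combination of the four determinations reflects the flow described above, in which any single true result leads to S1105 and all-false leads to S1106.

```python
from dataclasses import dataclass

@dataclass
class PoolState:
    redundancy: int                      # redundancy 402 (minimum over the pool's stripes)

@dataclass
class IoStats:
    idle_sec: float                      # time since the last host I/O for the pool
    write_ratio_pct: float               # ratio of writes in the recent host I/O
    cpu_usage_pct: float                 # CPU usage rate related to the pool

@dataclass
class Thresholds:                        # FIG. 5 items 502-504; the defaults are illustrative
    io_wait_time_sec: float = 60.0
    write_ratio_pct: float = 30.0
    cpu_usage_pct: float = 50.0

def should_run_data_copy(pool: PoolState, stats: IoStats, cfg: Thresholds) -> bool:
    """S1101-S1104 of FIG. 11: any true determination starts/continues the data copy (S1105)."""
    if pool.redundancy == 0:                         # S1101: no redundancy left
        return True
    if stats.idle_sec >= cfg.io_wait_time_sec:       # S1102: no host I/O for a while
        return True
    if stats.write_ratio_pct < cfg.write_ratio_pct:  # S1103: few writes, so little piggyback chance
        return True
    if stats.cpu_usage_pct < cfg.cpu_usage_pct:      # S1104: the CPU has headroom
        return True
    return False                                     # S1106: keep or put the data copy in the stopped state

# A busy pool that still has redundancy: the data copy stays stopped.
print(should_run_data_copy(PoolState(redundancy=1),
                           IoStats(idle_sec=2, write_ratio_pct=80, cpu_usage_pct=90),
                           Thresholds()))            # False
```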
- FIG. 12 shows the detailed flow of the data copy process.
- the copyback processing program 303 identifies the head address for which data copying has not been completed from the head address management table 308 (S1201).
- the copyback processing program 303 identifies the data position 603 corresponding to the address 601 that matches the address identified in S1201 (S1202).
- the copyback processing program 303 copies the copyback target data from the spare area according to the data position 603 specified in S1202 to the strip (the strip in the post-replacement drive) specified in S1201 (S1203).
- the copyback processing program 303 updates the completion flag 602 corresponding to the copy destination strip to “1” (completed), and updates the head address in the head address management table 308 to the first address 601 whose completion flag 602 is “0” (S1204). In S1204, when the minimum value of the plurality of redundancies corresponding to the plurality of stripes in the target pool 183 has increased, the copyback processing program 303 also updates the redundancy 402 corresponding to the target pool 183.
- the copy back processing program 303 determines whether or not the copy back processing is completed (S1205). This determination is the same as the determination in S1108 of FIG. If the determination result in S1205 is true (S1205: YES), the copyback process ends.
- the copy back processing program 303 determines whether or not the elapsed time from the start time of the data copy process is equal to or greater than the determination waiting time 505 corresponding to the target pool 183. Determination is made (S1206). If the determination result in S1206 is false (S1206: NO), the process returns to S1201 (that is, the data copy process continues). On the other hand, if the determination result in S1206 is true (S1206: YES), the process returns to S1101 (at this time, the data copy process may be temporarily ended (stopped)).
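A minimal sketch of the S1201–S1206 loop, assuming stub drive I/O callables and plain dictionaries for the completion flags and data positions (all names here are illustrative, not the patent's):

```python
import time

def data_copy_process(flags, data_pos, spare_read, strip_write, judge_wait_sec, copy_allowed):
    """Sketch of the data copy process of FIG. 12 (S1201-S1206) under assumed helpers.

    `flags` maps strip address -> completion flag 602; `data_pos` maps strip address ->
    data position 603 (drive number, spare-area address); `spare_read` and `strip_write`
    are hypothetical callables standing in for the controller's actual drive I/O.
    """
    started = time.monotonic()
    while True:
        pending = [a for a in sorted(flags) if flags[a] == 0]
        if not pending:                                    # S1205: every flag is 1 -> copy back done
            return True
        addr = pending[0]                                  # S1201: head address of incomplete copy
        data = spare_read(*data_pos[addr])                 # S1202/S1203: read from the spare area ...
        strip_write(addr, data)                            # ... and copy it to the replaced drive's strip
        flags[addr] = 1                                    # S1204: completion flag -> 1
        if time.monotonic() - started >= judge_wait_sec:   # S1206: time to re-judge (FIG. 11)
            if not copy_allowed():
                return False                               # data copy temporarily stopped
            started = time.monotonic()

# Demo: strip 1 was already restored by a host write, so it is never copied.
flags = {0: 0, 1: 1, 2: 0}
pos = {0: (1, 0), 2: (3, 0)}
print(data_copy_process(flags, pos, lambda d, a: b"data", lambda a, d: None,
                        judge_wait_sec=10.0, copy_allowed=lambda: True))   # True
```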
- FIG. 13 shows an example of write processing during copy back processing.
- When the host I/O processing program 301 receives a write request from the host 101 during the copyback processing, it writes the data according to the write request to the strip in the post-replacement drive. As a result, the data is restored to the strip in the post-replacement drive. Details are, for example, as follows.
- When the host I/O processing program 301 receives a write request whose write destination is the strip 2-2 in the post-replacement drive 00 during the copyback processing (S14-1), it reads the data 2-2 from the strips 2-2 of the normal drives 01, 06, and 0B (S14-2). Next, the host I/O processing program 301 calculates a parity from the plurality of read data 2-2 and the updated data 2-2 from the host 101 (S14-3). Next, the host I/O processing program 301 writes the updated data 2-2 to the strip 2-2 in the post-replacement drive 00 (S14-4). The host I/O processing program 301 then returns a write completion response to the host 101 (S14-5).
- The host I/O processing program 301 further updates the completion flag 602 corresponding to the strip 2-2 to “1” (S14-6). As described above, when a write request is received during the copyback processing, the data can be restored to the strip in the post-replacement drive by taking advantage of the processing of that write request.
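The piggyback restoration by a host write can be sketched as follows. The XOR parity, the callables, and the flag dictionary are illustrative assumptions (a real RAID 6 group would use its own parity codes and the controller's actual drive I/O paths):

```python
from functools import reduce

def xor_parity(blocks):
    """Illustrative parity: byte-wise XOR over equal-length blocks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

def write_during_copyback(read_peer_strips, write_strip, write_parity,
                          completion_flag, addr, new_data):
    """Sketch of FIG. 13: a host write restores the replaced drive's strip as a side effect."""
    peers = read_peer_strips(addr)             # S14-2: read the stripe's other data from normal drives
    parity = xor_parity(peers + [new_data])    # S14-3: recompute parity with the updated data
    write_strip(addr, new_data)                # S14-4: write the new data straight to the new drive
    write_parity(addr, parity)                 # update the stripe's parity strip(s)
    completion_flag[addr] = 1                  # S14-6: this strip no longer needs a data copy
    return "ok"                                # S14-5: write completion response to the host

# Demo with stubbed drive I/O: the strip is marked restored without any spare-area copy.
flags = {0x22: 0}
print(write_during_copyback(lambda a: [b"\x01\x01", b"\x02\x02"],
                            lambda a, d: None, lambda a, p: None,
                            flags, 0x22, b"\x0f\x0f"))
print(flags[0x22])   # 1
```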
- FIG. 14 shows an example of the transition of the progress management table 307 according to the progress of the copyback process and the write process.
- In FIG. 14, the progress management table 307 is schematically expressed (as a bitmap), and the addresses are assumed to be arranged as indicated by the broken arrows.
- In FIG. 14, the head address of incomplete copying indicates the ninth strip, while the completion flags 602 corresponding to the 16th and 20th strips are already “1”.
- When the data copying proceeds, the 16th strip, whose data has already been restored (its completion flag 602 is “1”), is skipped rather than becoming the head address of incomplete copying. That is, in the copyback processing, the data copy processing is skipped for a strip whose data has been restored by taking advantage of the write processing. This reduces the load of the copyback processing.
- a distributed RAID configuration is adopted. Instead of providing a spare drive, each drive 124 is provided with a spare area. A plurality of data in the failed drive are restored to spare areas of a plurality of normal drives, respectively. In other words, data write destinations are not distributed to a single drive like a spare drive, but are distributed to a plurality of drives. For this reason, the time required for the rebuild process can be shortened. As a result, the time during which the redundancy is reduced can be shortened, and the decrease in I / O processing performance can be reduced.
- If, instead, a spare drive is adopted and the restoration destination in the rebuild process is aggregated into the spare drive (that is, all data in the failed drive is restored to the spare drive), then, since the spare drive can become a member of the RAID group in place of the failed drive, copy-back-less operation can be expected, in which the copyback processing appears to be completed without copying between drives.
- In this embodiment, in the write processing during copyback processing (processing according to a write request from the host 101), the host I/O processing program 301 restores the updated data according to the write request to the strip in the post-replacement drive.
- In other words, the data is restored to the strip in the post-replacement drive by piggybacking on the write processing.
- In the copyback processing, the data copy processing is skipped for the restored strip. This reduces the load of the copyback processing, and as a result, a decrease in I/O processing performance can be reduced.
- In this embodiment, thresholds are provided for the redundancy, the write ratio, and the CPU usage rate, and execution and stopping of the data copy processing are controlled according to comparisons between the status during the copyback processing and those thresholds. For example, when the load on the CPU 113 is low (specifically, when no host I/O has been received for a certain period of time or when the CPU usage rate is low), the data copy processing is executed; when the write ratio is high, the data copy processing is stopped. By finely controlling execution and stopping of the data copy processing in the copyback processing in this way, the drive can be restored while suppressing a decrease in host I/O processing performance.
- Example 2 will be described. At that time, differences from the first embodiment will be mainly described, and description of common points with the first embodiment will be omitted or simplified.
- FIG. 15 illustrates an example of read processing during copy back processing according to the second embodiment.
- When the host I/O processing program 301 receives a read request from the host 101 during the copyback processing, it reads the data according to the read request from the spare area, returns it to the host 101, and writes the data to the strip in the post-replacement drive.
- The data written to the strip is the data to be copied back; as a result, the data is restored to the strip in the post-replacement drive. Details are, for example, as follows.
- During the copyback processing, the host I/O processing program 301 receives a read request whose read source is the strip 2-2 in the post-replacement drive 00 (S13-1).
- Since the completion flag 602 corresponding to the address 601 of the strip 2-2 is “0”, the host I/O processing program 301 reads the data 2-2 from the spare area indicated by the data position 603 corresponding to that address 601 (in the example of FIG. 15, the spare area of the normal drive 02) (S13-2).
- Next, the host I/O processing program 301 returns the read data 2-2 to the host 101 (S13-3) and writes the data 2-2 to the strip 2-2 in the post-replacement drive (S13-4).
- the host I / O processing program 301 updates the completion flag 602 corresponding to the strip 2-2 to “1” (S13-5).
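A corresponding sketch of the read-side piggyback of Example 2, with hypothetical callables standing in for the controller's drive I/O and a simple dictionary standing in for the progress management table:

```python
def read_during_copyback(progress, spare_read, strip_read, strip_write, addr):
    """Sketch of FIG. 15: a host read also restores the replaced drive's strip.

    `progress` maps strip address -> (completion flag, data position); the three
    callables are hypothetical stand-ins for the controller's drive I/O.
    """
    done, data_pos = progress[addr]
    if done:                                  # already restored: serve the read from the strip itself
        return strip_read(addr)
    data = spare_read(*data_pos)              # S13-2: read the copyback target data from the spare area
    strip_write(addr, data)                   # S13-4: write it to the strip of the replaced drive
    progress[addr] = (1, data_pos)            # S13-5: set the completion flag to 1
    return data                               # S13-3: the read data returned to the host

# Demo: the first read of an unrestored strip restores it; later reads hit the strip directly.
progress = {0x22: (0, (2, 0))}                # strip 2-2 still pending; its data sits in drive 02's spare
print(read_during_copyback(progress,
                           spare_read=lambda d, a: b"data 2-2",
                           strip_read=lambda a: b"data 2-2",
                           strip_write=lambda a, d: None,
                           addr=0x22))
print(progress[0x22][0])   # 1
```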
- a host I / O frequency threshold value may be employed instead of or in addition to the write ratio 503 threshold value.
- The copyback processing program 303 may execute, instead of or in addition to the determination in S1103, a determination (hereinafter, determination A) as to whether or not the host I/O frequency related to the target pool 183 is less than that threshold. If the result of determination A is true, the copyback processing program 303 may execute the data copy processing. This is because, when the host I/O frequency is low, there is little possibility that data will be restored to strips in the post-replacement drive by taking advantage of the host I/O processing (write processing or read processing) during the copyback processing.
- a read ratio threshold value may be used instead of or in addition to at least one of the write ratio 503 and the host I / O frequency threshold value.
- The copyback processing program 303 may execute, instead of or in addition to at least one of the determination in S1103 and determination A above, a determination (hereinafter, determination B) as to whether or not the read ratio of the target pool 183 (the ratio of read requests in the host I/O related to the target pool 183) is less than the read ratio threshold value. If the result of determination B is true, the copyback processing program 303 may execute the data copy processing. This is because, when the read ratio is low, there is little possibility that data will be restored to strips in the post-replacement drive by taking advantage of the host I/O processing (write processing or read processing) during the copyback processing.
- At least one of the determinations of S1101 to S1104, determination A, and determination B described above (for example, at least one of the determination of S1103, determination A, and determination B; the determination of S1101; and the determination of S1102 or S1104) is included in the determination as to whether or not the execution condition of the data copy processing is satisfied.
- the copyback processing program 303 may determine whether or not the execution condition of the data copy process is satisfied regularly or irregularly. If the determination result is true, the copyback processing program 303 can execute the data copy processing (may include continuation). If the determination result is false, the copyback processing program 303 can stop the data copy processing.
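Such a combined execution condition might be sketched as follows; which determinations are folded in, the config field names, and the threshold values are all assumptions rather than anything fixed by the description above:

```python
def data_copy_execution_condition(redundancy, idle_sec, write_pct, cpu_pct,
                                  host_io_per_sec, read_pct, cfg):
    """Generalized execution condition: S1101-S1104 plus determinations A and B.

    `cfg` is a hypothetical threshold dictionary; a real implementation could pick
    any subset of these determinations, as described in the text above.
    """
    return (redundancy == 0                                  # S1101
            or idle_sec >= cfg["io_wait_time_sec"]           # S1102
            or write_pct < cfg["write_ratio_pct"]            # S1103
            or host_io_per_sec < cfg["io_freq_per_sec"]      # determination A
            or read_pct < cfg["read_ratio_pct"]              # determination B
            or cpu_pct < cfg["cpu_usage_pct"])               # S1104

cfg = {"io_wait_time_sec": 60, "write_ratio_pct": 30, "io_freq_per_sec": 100,
       "read_ratio_pct": 40, "cpu_usage_pct": 50}
# A pool with redundancy left but almost no host I/O satisfies the condition via determination A.
print(data_copy_execution_condition(1, 5, 80, 90, 2, 95, cfg))   # True
```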
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention relates to a storage system in which each of a plurality of storage devices has a plurality of strips and at least one spare region. In a RAID group formed by these storage devices, at least one of the two or more storage devices relating to a first stripe differs from at least one of the two or more storage devices relating to a second stripe. The storage system performs: a rebuild process in which data for the plurality of strips of a failed storage device is restored respectively to a plurality of spare regions of two or more storage devices; and a copyback process including a data copy process in which, after the failed storage device has been replaced by another storage device, data in at least one spare region among said plurality of spare regions is copied to at least one strip among the plurality of strips of the other storage device, said spare region being associated with said strip and no data having yet been restored to said strip. During the copyback process, the storage system skips the data copy process for a strip if data has already been written to that strip in a write process or a read process.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2017/007015 WO2018154697A1 (fr) | 2017-02-24 | 2017-02-24 | Storage system and recovery control method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2017/007015 WO2018154697A1 (fr) | 2017-02-24 | 2017-02-24 | Storage system and recovery control method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018154697A1 true WO2018154697A1 (fr) | 2018-08-30 |
Family
ID=63252491
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2017/007015 WO2018154697A1 (fr) | 2017-02-24 | 2017-02-24 | Storage system and recovery control method |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2018154697A1 (fr) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111124263A (zh) * | 2018-10-31 | 2020-05-08 | EMC IP Holding Company LLC | Method, electronic device, and computer program product for managing multiple disks |
TWI709042B (zh) * | 2018-11-08 | 2020-11-01 | Silicon Motion, Inc. | Method and apparatus for performing mapping information management regarding a redundant array of independent disks, and storage system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08221217A (ja) * | 1995-02-17 | 1996-08-30 | Hitachi Ltd | Data reconstruction method for a disk array subsystem |
JP2005099995A (ja) * | 2003-09-24 | 2005-04-14 | Fujitsu Ltd | Disk sharing method and system for magnetic disk devices |
JP2016038767A (ja) * | 2014-08-08 | 2016-03-22 | Fujitsu Limited | Storage control device, storage control program, and storage control method |
-
2017
- 2017-02-24 WO PCT/JP2017/007015 patent/WO2018154697A1/fr active Application Filing
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH08221217A (ja) * | 1995-02-17 | 1996-08-30 | Hitachi Ltd | Data reconstruction method for a disk array subsystem |
JP2005099995A (ja) * | 2003-09-24 | 2005-04-14 | Fujitsu Ltd | Disk sharing method and system for magnetic disk devices |
JP2016038767A (ja) * | 2014-08-08 | 2016-03-22 | Fujitsu Limited | Storage control device, storage control program, and storage control method |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111124263A (zh) * | 2018-10-31 | 2020-05-08 | EMC IP Holding Company LLC | Method, electronic device, and computer program product for managing multiple disks |
CN111124263B (zh) * | 2018-10-31 | 2023-10-27 | EMC IP Holding Company LLC | Method, electronic device, and computer program product for managing multiple disks |
TWI709042B (zh) * | 2018-11-08 | 2020-11-01 | Silicon Motion, Inc. | Method and apparatus for performing mapping information management regarding a redundant array of independent disks, and storage system |
US11221773B2 (en) | 2018-11-08 | 2022-01-11 | Silicon Motion, Inc. | Method and apparatus for performing mapping information management regarding redundant array of independent disks |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11163472B2 (en) | Method and system for managing storage system | |
US10459814B2 (en) | Drive extent based end of life detection and proactive copying in a mapped RAID (redundant array of independent disks) data storage system | |
US10656849B2 (en) | Storage system and control method thereof | |
US9152332B2 (en) | Storage system and method for reducing energy consumption | |
US9378093B2 (en) | Controlling data storage in an array of storage devices | |
- JP6009095B2 (ja) | Storage system and storage control method | |
US9400618B2 (en) | Real page migration in a storage system comprising a plurality of flash packages | |
US20070011579A1 (en) | Storage system, management server, and method of managing application thereof | |
- WO2011108027A1 (fr) | Computer system and control method thereof | |
US20140075240A1 (en) | Storage apparatus, computer product, and storage control method | |
US8812779B2 (en) | Storage system comprising RAID group | |
- CN111104055B (zh) | Method, device and computer program product for managing a storage system | |
US10579540B2 (en) | Raid data migration through stripe swapping | |
- CN113934367A (zh) | Storage device, method of operating storage device, and storage system | |
US9760296B2 (en) | Storage device and method for controlling storage device | |
US9400723B2 (en) | Storage system and data management method | |
- WO2018154697A1 (fr) | Storage system and recovery control method | |
- WO2018142622A1 (fr) | Computer | |
US8880939B2 (en) | Storage subsystem and method for recovering data in storage subsystem | |
US20170038993A1 (en) | Obtaining additional data storage from another data storage system | |
- CN110413197B (zh) | Method, device and computer program product for managing a storage system | |
US20230214134A1 (en) | Storage device and control method therefor | |
- JP2005055963A (ja) | Volume control method, program for executing the method, and storage apparatus | |
- KR20210137922A (ko) | Data recovery system, method and device using parity space as recovery space | |
- JP2020086554A (ja) | Storage access control device, storage access control method, and storage access control program | |
Legal Events
Code | Title | Description
---|---|---
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17897750; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: DE
122 | Ep: pct application non-entry in european phase | Ref document number: 17897750; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: JP