
CN114089923B - Dual-active storage system and data processing method thereof - Google Patents

Dual-active storage system and data processing method thereof

Info

Publication number
CN114089923B
Authority
CN
China
Prior art keywords
data
storage
site
redundancy
logical volume
Prior art date
Legal status
Active
Application number
CN202111433614.5A
Other languages
Chinese (zh)
Other versions
CN114089923A (en)
Inventor
朱广帅
Current Assignee
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd filed Critical New H3C Big Data Technologies Co Ltd
Priority to CN202111433614.5A
Publication of CN114089923A
Application granted
Publication of CN114089923B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/062 Securing storage systems
    • G06F3/0622 Securing storage systems in relation to access
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract


The present application provides a dual-active storage system and a data processing method thereof. The dual-active storage system includes a first storage site and a second storage site, where the second storage site is a standby storage site of the first storage site. A first resource pool is created in the first storage site and configured with a first redundancy policy; a second resource pool is created in the second storage site and configured with a second redundancy policy. A first logical volume is created in the first resource pool and a second logical volume is created in the second resource pool; the first logical volume and the second logical volume are configured as dual-active volumes, which record placement group (PG) logs and provide a data cache. This solution reduces the bandwidth usage of the distributed dual-active storage system and improves system performance and reliability.

Description

Dual-active storage system and data processing method thereof
Technical Field
The application relates to the technical field of data storage, and in particular to a dual-active storage system and a data processing method thereof.
Background
As information technology permeates hundreds of industries and people's daily lives, storage systems play an increasingly important role in the key businesses of various industries, and enterprises have unprecedented demands for business continuity. Especially in fields such as communications, finance, healthcare, government, logistics, and e-commerce, a service interruption of a storage system can cause the loss of important data, greatly damage enterprise credibility, and cause huge economic losses. Ensuring business continuity is therefore critical to the construction of a storage system, and dual-active technology emerged in this context.
Dual-active technology, commonly known as Active-Active technology, has been widely adopted in the storage field. It provides users with a flexible and powerful data disaster recovery capability: real-time synchronous data replication between two data centers, real-time monitoring of the service running state, and failover, so that users can achieve online service switching across data centers and service load sharing.
For example, a conventional disaster recovery deployment generally includes a production center and a disaster recovery center, where the disaster recovery center is normally inactive and is started only when a disaster paralyzes the production center. Such a system faces the following challenges. When the production center encounters a disaster such as flood, fire, human error, or earthquake, the service must be manually switched over to the disaster recovery center, so the service interruption time is long and the RTO (Recovery Time Objective) is usually on the order of hours, which cannot guarantee continuous operation of the service. Moreover, the disaster recovery center sits idle all year round, so the resource utilization rate is low and the TCO (Total Cost of Ownership) increases.
Disclosure of Invention
The application aims to provide a dual-active storage system and a data processing method thereof, which reduce the bandwidth occupation of a distributed dual-active storage system and thereby improve system performance and reliability.
The first aspect of the present application provides a dual-active storage system based on a distributed storage mode, including:
a first storage site and a second storage site, where the second storage site is a standby storage site of the first storage site;
a first resource pool created in the first storage site, the first resource pool being configured with a first redundancy policy;
a second resource pool created in the second storage site, the second resource pool being configured with a second redundancy policy;
and a first logical volume created in the first resource pool and a second logical volume created in the second resource pool, where the first logical volume and the second logical volume are configured as dual-active volumes used for recording the placement group (PG) log and providing a data cache.
A second aspect of the present application provides a data processing method applied to the dual-active storage system described in the first aspect, the method including:
the first logical volume receives first data sent by a first storage gateway, together with the processing result obtained by processing the first data according to the Ceph indexing rules;
the first logical volume stores the first data to the first storage site according to the processing result of the first data and records the PG log in the processing result, and the first storage site performs redundancy protection on the first data according to the first redundancy policy;
the first logical volume sends the first data and its processing result to the second logical volume, so that the second logical volume stores the first data to the second storage site according to the processing result and records the PG log in the processing result, and the second storage site performs redundancy protection on the first data according to the second redundancy policy.
A third aspect of the present application provides a data processing method applied to the dual-active storage system described in the first aspect, the method including:
the second logical volume receives second data sent by a second storage gateway, together with the processing result obtained by processing the second data according to the Ceph indexing rules;
the second logical volume caches the second data locally and records the PG log in the processing result;
the second logical volume sends the second data and its processing result to the first logical volume, so that the first logical volume stores the second data to the first storage site according to the processing result and records the PG log in the processing result, and the first storage site performs redundancy protection on the second data according to the first redundancy policy;
and after receiving the control message sent by the first logical volume, the second logical volume stores the locally cached second data to the second storage site, and the second storage site performs redundancy protection on the second data according to the second redundancy policy.
Compared with the prior art, the dual-active storage system provided by the application has the following beneficial effects:
1. The two-copy dual-active volumes are deployed across clusters; these volumes do not store real data, they only record metadata and can provide a caching function.
2. The dual-active volumes use the strong consistency of Ceph's two-copy mechanism to guarantee the cross-site data flow, achieving real-time cross-site data synchronization and automatic synchronization after fault recovery, so that the data of the primary and standby sites remains consistent in real time while the cluster is healthy, with RPO=0.
3. Each single site stores a complete set of data copies, and the full redundancy guarantees maximum data reliability.
4. Each of the primary and standby sites is configured with an independent data redundancy policy, which can be selected flexibly.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a schematic diagram of a cluster architecture of a dual active storage system according to the prior art;
FIG. 2 is a flow chart of a data writing operation of the dual-active storage system in FIG. 1;
FIG. 3 is a schematic diagram of a cluster structure of a dual-active storage system according to the present application;
FIG. 4 is a flow chart of a data processing method provided by the present application;
FIG. 5 is a flow chart of another data processing method provided by the present application;
FIG. 6 is a flow chart of another data writing operation of the dual-active storage system in FIG. 3.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
It is noted that unless otherwise indicated, technical or scientific terms used herein should be given the ordinary meaning as understood by one of ordinary skill in the art to which this application belongs.
In addition, the terms "first" and "second" etc. are used to distinguish different objects and are not used to describe a particular order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
For ease of understanding, some technical terms in the present application will be first described.
Ceph is an open-source distributed storage system that provides a software-defined, unified storage solution, with the advantages of large-scale scalability, high performance, and no single point of failure. When an application accesses a Ceph cluster and performs a write operation, the data is stored in the form of objects in Ceph's Object Storage Devices (OSDs). Ceph Monitors are responsible for the health of the entire cluster; a Monitor node may be deployed alone on a physical host, or a Monitor and a storage node may share one physical host. In a Ceph cluster, several Monitors are jointly responsible for managing, maintaining, and publishing the state information of the cluster.
In Ceph, data is stored with the object as the basic unit, each object being 4 MB in size by default. Several objects belong to one PG (Placement Group), and several PGs map onto one OSD, where one OSD generally corresponds to one disk. Ceph uses a hierarchical cluster structure (the cluster map), which users can customize; OSDs are the leaf nodes of this hierarchy.
On a Ceph cluster, a number of resource pools (Pools) may be established. A Pool is a logical concept, and the number of PGs for a Pool must be specified when it is created.
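To make the object-to-PG-to-OSD mapping concrete, here is a minimal, illustrative Python sketch. It is not Ceph's actual implementation (Ceph hashes object names with rjenkins and places PGs via CRUSH over the cluster map); the hash and placement functions below are simplified stand-ins, and all names are ours:

```python
from hashlib import md5

def object_to_pg(object_name: str, pg_num: int) -> int:
    """Map an object name onto one of the pool's PGs by stable hashing.
    md5 is only a self-contained stand-in for Ceph's own hash."""
    return int(md5(object_name.encode()).hexdigest(), 16) % pg_num

def pg_to_osds(pg_id: int, osd_ids: list[int], num_replicas: int) -> list[int]:
    """Pick the acting set of OSDs for a PG. Ceph derives this with CRUSH;
    a round-robin choice is used here purely for illustration."""
    start = pg_id % len(osd_ids)
    return [osd_ids[(start + i) % len(osd_ids)] for i in range(num_replicas)]

# A pool with 128 PGs and 6 OSDs (one per disk), 3-copy redundancy:
pg = object_to_pg("rbdA1.0000000000000042", pg_num=128)
acting = pg_to_osds(pg, osd_ids=[0, 1, 2, 3, 4, 5], num_replicas=3)
print(pg, acting)   # e.g. 87 [3, 4, 5] -> the primary OSD is acting[0]
```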
Fig. 1 is a schematic diagram of the cluster structure of a conventional dual-active storage system. The system in fig. 1 provides storage services for two data centers in the same city by deploying the storage service cluster across sites, as follows:
1. Taking a 6-node storage cluster as an example, node1, node2, and node3 are deployed at storage site A on two racks, node4, node5, and node6 are deployed at storage site B on two racks, and an arbitration server is deployed at site C.
2. The storage gateways are deployed across sites to form a cluster, and any gateway can provide the storage cluster service.
3. The storage service cluster is deployed across sites. Taking a 4-copy deployment as an example, the storage fault domain is set to the rack and the number of copies is set to 4, so that 2 copies of the data reside at site A and 2 copies at site B.
4. A three-node storage control service (Monitor) cluster is deployed: mon1 on a server at site A, mon2 on a server at site B, and mon3 on the arbitration server at site C, to prevent a storage split-brain from disabling the data storage function.
5. If a disaster occurs at site A or site B, the cross-site cluster disaster recovery of the storage system achieves service failure recovery with RPO=0 and RTO=0, so that no stored data is lost and the storage system keeps running without the services perceiving the fault.
FIG. 2 is a flow chart of a data writing operation of the dual-active storage system in FIG. 1, specifically as follows (a simplified sketch of this flow is given after the list):
1. The host side (client) issues the data to be written and its metadata to the storage side through the gateway, which receives and processes them.
2. The gateway calculates, according to the indexing rules, the disks on which the data must land. As shown in FIG. 2, assume the target disks are node3-HDD1, node2-HDD2, node4-HDD1, and node5-HDD2; among these 4 copies, node3-HDD1 holds the primary copy and node2-HDD2, node4-HDD1, and node5-HDD2 hold the standby copies.
3. The gateway writes the data over the network to the storage service process where the primary copy node3-HDD1 resides.
4. The storage service process of node3-HDD1 writes the data to the storage service processes of node2-HDD2, node4-HDD1, and node5-HDD2 according to the data redundancy policy.
5. Each storage service process writes the data to its disk according to its own logic.
6. Only after node3-HDD1 confirms that all 4 copies have been written successfully is the write considered successful, and a write-success message is returned to the host side.
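As referenced above, the following illustrative Python sketch (our own modeling, not Ceph code; the site and disk names follow the FIG. 2 example) simulates the 4-copy write and counts how many copies of the same data must cross the inter-site link:

```python
from dataclasses import dataclass

@dataclass
class Replica:
    site: str
    disk: str

def four_copy_write(primary: Replica, standbys: list[Replica]) -> int:
    """Simulate the prior-art write: the primary fans the data out to every
    standby and acknowledges only after all writes complete. Returns how
    many copies had to traverse the inter-site link."""
    cross_site = 0
    for r in standbys:
        if r.site != primary.site:
            cross_site += 1        # this copy is shipped to the remote site
    # ... each replica then persists to disk and acks the primary ...
    return cross_site

primary = Replica("A", "node3-HDD1")
standbys = [Replica("A", "node2-HDD2"), Replica("B", "node4-HDD1"),
            Replica("B", "node5-HDD2")]
print(four_copy_write(primary, standbys))  # 2 -> the same data crosses the link twice
```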
From the above data writing flow, the existing dual-active storage system has the following disadvantages:
First, host performance is reduced: because the number of copies is 4, two copies of the same data must be transmitted to site B and written to different disks, which degrades host-side performance.
Second, the investment in link construction increases: transmitting the redundant copies doubles the required link bandwidth, wasting link resources and raising construction costs.
Third, reliability is reduced: a site failure makes two copies unavailable at once, increasing the risk that the service goes down.
Fourth, only a cross-cluster copy scheme is supported; Erasure Coding (EC) redundancy is not supported, which increases the space occupied.
Of course, the above scheme could be simplified to 2 copies, but that would reduce data reliability. For example, with the number of copies set to 2, site A and site B each hold one copy; if site A fails, site B takes over the service, so only one copy is providing the storage service and there is a single-point-of-failure risk. If the hard disk or the server holding that copy then fails, the service has no usable data storage and goes down.
In view of the foregoing, an embodiment of the present application provides a dual-active storage system and a data processing method thereof, which are described below with reference to the accompanying drawings.
Referring to fig. 3, a schematic diagram of the cluster structure of a dual-active storage system according to some embodiments of the present application is shown. The dual-active storage system is based on a distributed storage mode, for example Ceph or another distributed storage mode, which is not limited in this disclosure.
As shown in fig. 3, the dual-active storage system includes a first storage site 100 and a second storage site 200, and the second storage site 200 is a standby storage site of the first storage site 100;
A first resource pool 110 is created in the first storage site 100 and configured with a first redundancy policy, and a second resource pool 210 is created in the second storage site 200 and configured with a second redundancy policy;
For example, as shown in fig. 3, the first resource pool 110 includes storage nodes node1, node2, and node3, and the second resource pool 210 includes storage nodes node4, node5, and node6, where each storage node includes two disks, and each disk corresponds to an object storage device OSD.
Specifically, the first redundancy policy may be copy redundancy or erasure code redundancy, and likewise the second redundancy policy may be copy redundancy or erasure code redundancy. This gives three combinations: both site 100 and site 200 configured with copy redundancy; one of site 100 and site 200 configured with copy redundancy and the other with erasure code redundancy; or both configured with erasure code redundancy. A sketch of such per-site policy configuration is given below.
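A minimal sketch of the per-site policy configuration just described, using hypothetical type and field names of our own (the patent does not prescribe any data model):

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class RedundancyPolicy:
    kind: Literal["replica", "erasure"]
    copies: int = 0          # used when kind == "replica"
    k: int = 0               # data chunks, used when kind == "erasure"
    m: int = 0               # coding chunks, used when kind == "erasure"

@dataclass
class SitePool:
    site: str
    pool: str
    policy: RedundancyPolicy

# One of the three combinations above: replica at site 100, EC at site 200.
pool_a = SitePool("100", "PoolA1", RedundancyPolicy("replica", copies=3))
pool_b = SitePool("200", "PoolB1", RedundancyPolicy("erasure", k=4, m=2))
```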
A first logical volume is created in the first resource pool 110 and a second logical volume in the second resource pool 210; the first logical volume and the second logical volume are configured as dual-active volumes, which record PG logs (PGlog) and provide a data cache.
Specifically, the back-end storage of the dual-active volumes may be a distributed cache, for example a cache tier, or a distributed database, for example mondb.
For example, a logical dual-active pool PoolAB1 is created and its redundancy policy is configured as 2-copy, so that the existing CRUSH algorithm can be reused to implement the 2-copy mechanism and the strong data consistency mechanism. The configured CRUSH rule can specify that the primary copy is local, according to the location of the side issuing the data. The CRUSH algorithm is the tool used to calculate on which OSDs an object is placed.
Specifically, a local resource pool PoolA1 is created at site 100 with its redundancy policy configured as copy redundancy and the number of copies set to 3, and a local resource pool PoolB1 is created at site 200 with its redundancy policy configured as copy redundancy and the number of copies set to 2;
A volume rbdA1 is created in site 100/PoolA1 and a volume rbdB1 in site 200/PoolB1, and a dual-active relationship is created between them: 100/PoolA1/rbdA1 and 200/PoolB1/rbdB1 are selected to form the dual-active relationship in the dual-active pool PoolAB1, a dual-active relationship object 100/PoolA1/rbdA1-200/PoolB1/rbdB1 is generated in a PG object of the dual-active pool, and this relationship object serves as the key under which the PGlog of subsequent write operations is recorded.
For a write operation, the specific responsibility of the logical dual-active pool is to provide the PG mechanism for the dual-active relationship, thereby implementing the copy redundancy mechanism and guaranteeing strong data consistency. It does not store the data directly; the data to be written is directed into the designated resource space through the index relationship. Three kinds of data need to be stored (a sketch of the naming and logging involved follows this list):
First, writes to the local site resource pool. For an independent storage resource pool, the redundancy policy and redundancy level can be configured freely. The write operation is issued from the objecter, the object name is calculated from the volume name, the LBA (Logical Block Address), and the length (data length), the object is mapped to a primary OSD through the CRUSH algorithm, and the PG holding the object then performs redundancy-protected writes to disk. The objecter provides a unified interface for clients' read and write requests.
Second, the PG in the dual-active pool is responsible for sending the write operation to the peer site for processing. The PG at the peer site indexes the data to the local resource 200/PoolB1/rbdB1 or 100/PoolA1/rbdA1 for the write; the write operation is issued from the objecter, the object name is calculated from the volume name, LBA, and length, the object is mapped to a primary OSD through the CRUSH algorithm, and the PG holding the object then performs redundancy-protected writes to disk. A more efficient distributed caching technology can be used as the back-end device to improve IO performance.
Third, the dual-active pool PGlog, whose record keys are the volume name, LBA, length, and write sequence number. Since the dual-active pool is a logical pool that does not truly store data, the place where the PGlog is written can be designated as required, such as a distributed database, the storage pool holding the local resources of the dual-active member volumes, some other storage pool, a dedicated copy storage pool, a cache, or a distributed cache.
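As referenced above, the following Python sketch illustrates the two bookkeeping steps this list describes: deriving an object name from the volume name, LBA, and length, and recording a PGlog entry keyed under the dual-active relationship object. The naming scheme is a simplification of rbd's actual object naming, and the classes are hypothetical:

```python
from dataclasses import dataclass, field

OBJECT_SIZE = 4 * 1024 * 1024   # Ceph objects default to 4 MB

def object_name(volume: str, lba: int) -> str:
    """Derive the object holding this LBA; real rbd names use a
    hex-encoded object index, and the idea here is the same."""
    return f"{volume}.{lba // OBJECT_SIZE:016x}"

@dataclass
class PGLogEntry:
    volume: str
    lba: int
    length: int
    seq: int                     # write sequence number

@dataclass
class DualActivePG:
    relation: str                # e.g. "100/PoolA1/rbdA1-200/PoolB1/rbdB1"
    log: list[PGLogEntry] = field(default_factory=list)

    def record(self, volume: str, lba: int, length: int) -> PGLogEntry:
        entry = PGLogEntry(volume, lba, length, seq=len(self.log) + 1)
        self.log.append(entry)
        return entry

pg = DualActivePG("100/PoolA1/rbdA1-200/PoolB1/rbdB1")
e = pg.record("rbdA1", lba=8 * 1024 * 1024, length=4096)
print(object_name(e.volume, e.lba), e.seq)   # rbdA1.0000000000000002 1
```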
In the dual-active storage system provided by the application, when one site fails, writes continue at the surviving site and the dual-active pool records the PGlog for them. When the failed site recovers, data recovery is performed according to the PGlog: the data is read from the end holding the PGlog and synchronized to the other site.
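A minimal sketch of that PGlog-driven resynchronization, reusing the illustrative DualActivePG and object_name definitions from the previous sketch (the I/O hooks are stubbed assumptions, not storage APIs):

```python
def replay_to_recovered_site(pg: DualActivePG, synced_seq: int,
                             read_local, write_remote) -> int:
    """Replay every PGlog entry newer than the peer's last-synced sequence
    number: read the data locally and push it to the recovered site."""
    replayed = 0
    for entry in pg.log:
        if entry.seq > synced_seq:
            name = object_name(entry.volume, entry.lba)
            data = read_local(name, entry.lba, entry.length)
            write_remote(name, entry.lba, data)
            replayed += 1
    return replayed

# Example: the peer last saw seq 0; replay everything via stub I/O hooks.
store = {}
n = replay_to_recovered_site(
    pg, synced_seq=0,
    read_local=lambda name, lba, length: b"\x00" * length,
    write_remote=lambda name, lba, data: store.update({name: data}))
print(n, list(store))   # 1 ['rbdA1.0000000000000002']
```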
In one possible implementation, the dual-active storage system further comprises an arbitration site 300. Specifically, a first monitor mon1 is deployed at the first storage site 100, a second monitor mon2 at the second storage site, and an arbitration monitor mon3 at the arbitration site; the arbitration monitor provides an arbitration service for the first and second monitors to prevent split-brain.
As shown in fig. 3, a first monitor mon1 is deployed on node2, a second monitor mon2 is deployed on node5, and an arbitration monitor mon3 is deployed on an arbitration server.
In one possible implementation, the dual-active storage system further includes a first storage gateway 120, configured to receive the first data sent by the client, process the first data according to the Ceph indexing rules to obtain a processing result, and send the processing result and the first data to the first logical volume, where the processing result includes the PGlog and the target object storage device (OSD).
In one possible implementation, the dual-active storage system provided by the application further includes a second storage gateway 220, configured to receive the second data sent by the client, process the second data according to the Ceph indexing rules to obtain a processing result, and send the processing result and the second data to the second logical volume.
In one possible implementation manner, in the dual-active storage system provided by the present application, the first storage site 100 is one independent protection domain, and the second storage site 200 is another independent protection domain.
In the application, the Ceph cluster is divided into two protection domains: site 100 is one independent protection domain and site 200 is the other, so that the data redundancy of a local site is never distributed across sites. A protection domain is a logical concept introduced to improve cluster reliability: one piece of data (including its copies or fragments) exists only within one protection domain, and heartbeat detection is likewise confined to the protection domain.
Compared with the prior art, the dual-active storage system provided by the application has the following beneficial effects:
1. The two-copy dual-active volumes are deployed across clusters; these volumes do not store real data, they only record metadata and can provide a caching function.
2. The dual-active volumes use the strong consistency of Ceph's two-copy mechanism to guarantee the cross-site data flow, achieving real-time cross-site data synchronization and automatic synchronization after fault recovery, so that the data of the primary and standby sites remains consistent in real time while the cluster is healthy, with RPO=0.
3. Each single site stores a complete set of data copies, and the full redundancy guarantees maximum data reliability.
4. Each of the primary and standby sites is configured with an independent data redundancy policy, which can be selected flexibly.
The above embodiment provides a dual-active storage system; correspondingly, two data processing methods based on it are provided: one for the case where a client write operation is issued to the first storage gateway, and the other for the case where a client write operation is issued to the second storage gateway.
Specifically, after a client write operation is issued to the first storage gateway, as shown in fig. 4, the data processing method includes the following steps:
S101: the first logical volume receives first data sent by the first storage gateway, together with the processing result obtained by processing the first data according to the Ceph indexing rules;
S102: the first logical volume stores the first data to the first storage site according to the processing result of the first data and records the PG log in the processing result, and the first storage site performs redundancy protection on the first data according to the first redundancy policy;
S103: the first logical volume sends the first data and its processing result to the second logical volume, so that the second logical volume stores the first data to the second storage site according to the processing result and records the PG log in the processing result, and the second storage site performs redundancy protection on the first data according to the second redundancy policy.
As shown in fig. 3, the data processing flow is as follows (an illustrative sketch of this primary-side path is given after the list):
① The service client issues the write operation to the first storage gateway, which processes the first data to obtain a processing result;
② The first storage gateway sends the first data and its processing result to the replica primary (the first logical volume);
③ The replica primary caches the first data locally and records the PGlog according to the PG mechanism;
④ The replica primary sends the first data and the processing result to the replica standby (the second logical volume) for processing;
⑤ The replica standby caches the first data locally and records the PGlog according to the PG mechanism, implementing the dual-active data management logic;
⑥ The data is written at the second storage site with redundancy protection, and after the replica primary receives the write-completion message, it returns the write-completion message to the client.
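As referenced above, an illustrative end-to-end sketch of this primary-side path follows. The LogicalVolume class is our own modeling of steps ② through ⑥, not a patent or Ceph API:

```python
from dataclasses import dataclass, field

@dataclass
class LogicalVolume:
    name: str
    site: str
    cache: dict = field(default_factory=dict)   # local data cache
    pglog: list = field(default_factory=list)   # PG log records
    peer: "LogicalVolume | None" = None

    def write_local(self, key: tuple, data: bytes) -> None:
        self.cache[key] = data                   # cache locally
        self.pglog.append(key)                   # record PGlog per the PG mechanism
        # ... the local site then persists with its own redundancy policy ...

def primary_path_write(primary: LogicalVolume, key: tuple, data: bytes) -> str:
    """Steps ②-⑥: the gateway's result arrives at the replica primary, which
    caches + logs, forwards once across sites to the standby, and acks the
    client only after the standby confirms."""
    primary.write_local(key, data)               # steps ②-③
    primary.peer.write_local(key, data)          # steps ④-⑤ (one cross-site copy)
    return "write-complete"                      # step ⑥

rbdA1 = LogicalVolume("rbdA1", site="100")
rbdB1 = LogicalVolume("rbdB1", site="200")
rbdA1.peer, rbdB1.peer = rbdB1, rbdA1
print(primary_path_write(rbdA1, ("rbdA1", 0, 4096, 1), b"\x00" * 4096))
```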
Specifically, after a client write operation is issued to the second storage gateway, as shown in fig. 5, the data processing method includes the following steps:
S201: the second logical volume receives second data sent by the second storage gateway, together with the processing result obtained by processing the second data according to the Ceph indexing rules;
S202: the second logical volume caches the second data locally and records the PG log in the processing result;
S203: the second logical volume sends the second data and its processing result to the first logical volume, so that the first logical volume stores the second data to the first storage site according to the processing result and records the PG log in the processing result, and the first storage site performs redundancy protection on the second data according to the first redundancy policy;
S204: after receiving the control message sent by the first logical volume, the second logical volume stores the locally cached second data to the second storage site, and the second storage site performs redundancy protection on the second data according to the second redundancy policy.
As shown in fig. 6, the data processing flow is as follows (an illustrative sketch of this standby-side path is given after the list):
① The service client issues the write operation to the second storage gateway, which processes the second data to obtain a processing result;
② The second storage gateway sends the second data and its processing result to the replica standby;
③ The replica standby caches the second data locally and records the PGlog according to the PG mechanism;
④ The replica standby sends the second data and the processing result to the replica primary for processing;
⑤ The replica primary caches the second data locally and records the PGlog according to the PG mechanism, implementing the dual-active data management logic;
⑥ The data is written at the first storage site with redundancy protection, and a control message is sent to the replica standby to trigger its data write;
⑦ The data is written at the second storage site with redundancy protection; after the replica primary has received all the write-completion messages, it returns a write-completion message to the replica standby, and the replica standby returns the write-completion message to the client.
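Continuing the same illustrative model (it reuses LogicalVolume and the rbdA1/rbdB1 volumes from the previous sketch), the standby-side path differs in that the standby defers persisting its cached copy until the primary's control message arrives; this is an assumption-laden sketch, not the patented implementation:

```python
def standby_path_write(standby: LogicalVolume, key: tuple, data: bytes) -> str:
    """Steps ②-⑦: the standby caches + logs first, forwards to the primary,
    and persists its cached copy only after the primary's control message
    confirms the first site's redundant write."""
    standby.write_local(key, data)           # steps ②-③: cache + PGlog only
    standby.peer.write_local(key, data)      # steps ④-⑤: primary caches, logs, persists
    control_msg = "persist"                  # step ⑥: control message from the primary
    if control_msg == "persist":
        data_to_persist = standby.cache[key] # step ⑦: flush the cached copy to the
        assert data_to_persist == data       #          second site's resource pool
    return "write-complete"                  # acked to the client via the standby

print(standby_path_write(rbdB1, ("rbdB1", 0, 4096, 1), b"\xff" * 4096))
```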
The data processing method of the dual-active storage system achieves cross-site virtual copy protection together with independent per-site data redundancy protection, implementing a cross-site dual-active technology on distributed storage: based on Ceph's existing PG mechanism, the cross-site dual-active function is realized through a data dual-active layer PG and a data storage layer PG. According to the dual-active requirement, the CRUSH rule of the dual-active pool is set so that the primary copy resides at the expected site, improving data read/write performance; data is replicated across sites only once, and the dual-active metadata is stored extremely compactly without occupying extra network resources or storage space. The copy protection mechanism that originally operated between OSDs is abstracted up to the client layer, which retains the PGlog mechanism without storing real data, realizing a strongly consistent cross-site double-write mechanism. The PG mechanism is modified so that, during replica-to-replica synchronization, either the real data is synchronized or a disk-write control message is sent, as selected by the data identifier, achieving efficient, strong data consistency across the cross-site replicas.
Finally, it is noted that the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be other manners of division in actual implementation, and for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some communication interface, device or unit indirect coupling or communication connection, which may be in electrical, mechanical or other form.
The embodiments above are only used to illustrate the technical solution of the present application, not to limit it. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application, and they are all covered by the scope of the claims and the specification of the present application.

Claims (8)

1. A dual-active storage system based on a distributed storage mode, characterized by comprising a first storage site and a second storage site, wherein the second storage site is a standby storage site of the first storage site;
a first resource pool is created in the first storage site, and the first resource pool is configured with a first redundancy policy;
a second resource pool is created in the second storage site, and the second resource pool is configured with a second redundancy policy;
a first logical volume is created in the first resource pool and a second logical volume is created in the second resource pool, and the first logical volume and the second logical volume are configured as dual-active volumes used for recording a placement group (PG) log and providing a data cache;
the dual-active storage system further comprises:
a first storage gateway, configured to receive first data sent by a client, process the first data according to Ceph indexing rules to obtain a processing result, and send the processing result and the first data to the first logical volume, wherein the processing result comprises a PG log and the target object storage device (OSD);
and a second storage gateway, configured to receive second data sent by the client, process the second data according to Ceph indexing rules to obtain a processing result, and send the processing result and the second data to the second logical volume.
2. The dual-active storage system of claim 1, wherein the back-end storage of the dual-active volumes is a distributed cache or a distributed database.
3. The dual-active storage system of claim 1, wherein the first redundancy policy is copy redundancy or erasure code redundancy.
4. The dual-active storage system of claim 1, wherein the second redundancy policy is copy redundancy or erasure code redundancy.
5. The dual-active storage system of claim 1, further comprising an arbitration site;
wherein a first monitor is deployed at the first storage site, a second monitor is deployed at the second storage site, and an arbitration monitor is deployed at the arbitration site to provide an arbitration service for the first monitor and the second monitor.
6. The dual-active storage system of claim 1, wherein the first storage site is one independent protection domain and the second storage site is another independent protection domain.
7. A data processing method applied to the dual-active storage system of any one of claims 1 to 6, the method comprising:
the first logical volume receiving first data sent by the first storage gateway, together with the processing result obtained by processing the first data according to the Ceph indexing rules;
the first logical volume storing the first data to the first storage site according to the processing result of the first data and recording the PG log in the processing result, the first storage site performing redundancy protection on the first data according to the first redundancy policy;
and the first logical volume sending the first data and its processing result to the second logical volume, so that the second logical volume stores the first data to the second storage site according to the processing result and records the PG log in the processing result, the second storage site performing redundancy protection on the first data according to the second redundancy policy.
8. A data processing method applied to the dual-active storage system of any one of claims 1 to 6, the method comprising:
the second logical volume receiving second data sent by the second storage gateway, together with the processing result obtained by processing the second data according to the Ceph indexing rules;
the second logical volume caching the second data locally and recording the PG log in the processing result;
the second logical volume sending the second data and its processing result to the first logical volume, so that the first logical volume stores the second data to the first storage site according to the processing result and records the PG log in the processing result, the first storage site performing redundancy protection on the second data according to the first redundancy policy;
and, after receiving the control message sent by the first logical volume, the second logical volume storing the locally cached second data to the second storage site, the second storage site performing redundancy protection on the second data according to the second redundancy policy.
CN202111433614.5A 2021-11-29 2021-11-29 Dual-active storage system and data processing method thereof Active CN114089923B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111433614.5A CN114089923B (en) 2021-11-29 2021-11-29 Dual-active storage system and data processing method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111433614.5A CN114089923B (en) 2021-11-29 2021-11-29 Dual-active storage system and data processing method thereof

Publications (2)

Publication Number Publication Date
CN114089923A (en) 2022-02-25
CN114089923B (en) 2025-08-19

Family

ID=80305576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111433614.5A Active CN114089923B (en) 2021-11-29 2021-11-29 Dual-active storage system and data processing method thereof

Country Status (1)

Country Link
CN (1) CN114089923B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114610235A (en) * 2022-02-28 2022-06-10 新华三大数据技术有限公司 Distributed storage cluster, storage engine, two-copy storage method and device
CN117453146B (en) * 2023-12-22 2024-04-05 芯能量集成电路(上海)有限公司 Data reading method, system, eFlash controller and storage medium
CN118473942B (en) * 2024-07-08 2024-09-17 西安电子科技大学 Version cutover method for agile VMware virtualization resource pool
CN119902719B (en) * 2025-03-27 2025-06-17 苏州元脑智能科技有限公司 Data processing method and device and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766003A (en) * 2017-10-31 2018-03-06 郑州云海信息技术有限公司 One kind storage dual-active method, apparatus, system and computer-readable recording medium
CN207704423U (en) * 2017-09-19 2018-08-07 国网重庆市电力公司 Resource pool integration builds system in dispatching of power netwoks
CN109828868A (en) * 2019-01-04 2019-05-31 新华三技术有限公司成都分公司 Date storage method, device, management equipment and dual-active data-storage system
CN110134338A (en) * 2019-05-21 2019-08-16 深信服科技股份有限公司 A kind of distributed memory system and its data redundancy protection method and relevant device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7444541B2 (en) * 2006-06-30 2008-10-28 Seagate Technology Llc Failover and failback of write cache data in dual active controllers
US9274713B2 (en) * 2014-04-03 2016-03-01 Avago Technologies General Ip (Singapore) Pte. Ltd. Device driver, method and computer-readable medium for dynamically configuring a storage controller based on RAID type, data alignment with a characteristic of storage elements and queue depth in a cache
CN104331254A (en) * 2014-11-05 2015-02-04 浪潮电子信息产业股份有限公司 Storage double-active system design method based on double-active logical volume
US11294601B2 (en) * 2018-07-10 2022-04-05 Here Data Technology Method of distributed data redundancy storage using consistent hashing
CN110989923A (en) * 2019-10-30 2020-04-10 烽火通信科技股份有限公司 Deployment method and device of distributed storage system
CN111046108A (en) * 2019-12-20 2020-04-21 辽宁振兴银行股份有限公司 Ceph-based cross-data center Oracle high-availability implementation method
CN111225302B (en) * 2020-02-18 2021-11-02 中国科学院空天信息创新研究院 Satellite receiving station monitoring system based on virtualization technology
CN111628893B (en) * 2020-05-27 2022-07-12 北京星辰天合科技股份有限公司 Fault handling method and device for distributed storage system, and electronic equipment
CN112000426B (en) * 2020-07-24 2022-08-30 新华三大数据技术有限公司 Data processing method and device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN207704423U (en) * 2017-09-19 2018-08-07 国网重庆市电力公司 Resource pool integration builds system in dispatching of power netwoks
CN107766003A (en) * 2017-10-31 2018-03-06 郑州云海信息技术有限公司 One kind storage dual-active method, apparatus, system and computer-readable recording medium
CN109828868A (en) * 2019-01-04 2019-05-31 新华三技术有限公司成都分公司 Date storage method, device, management equipment and dual-active data-storage system
CN110134338A (en) * 2019-05-21 2019-08-16 深信服科技股份有限公司 A kind of distributed memory system and its data redundancy protection method and relevant device

Also Published As

Publication number Publication date
CN114089923A (en) 2022-02-25

Similar Documents

Publication Publication Date Title
CN114089923B (en) Dual-active storage system and data processing method thereof
CN101667181B (en) Method, device and system for data disaster tolerance
US7120824B2 (en) Method, apparatus and program storage device for maintaining data consistency and cache coherency during communications failures between nodes in a remote mirror pair
US7185228B2 (en) Method for controlling information processing system and information processing system
JP3187730B2 (en) Method and apparatus for creating snapshot copy of data in RAID storage subsystem
TWI514249B (en) Method for remote asynchronous replication of volumes and apparatus therefor
US7516356B2 (en) Method for transmitting input/output requests from a first controller to a second controller
US8401999B2 (en) Data mirroring method
US6915448B2 (en) Storage disk failover and replacement system
CN101539873B (en) Data recovery method, data node and distributed file system
US20040153481A1 (en) Method and system for effective utilization of data storage capacity
US20100030754A1 (en) Data Backup Method
US20080005614A1 (en) Failover and failback of write cache data in dual active controllers
US8930663B2 (en) Handling enclosure unavailability in a storage system
US7761431B2 (en) Consolidating session information for a cluster of sessions in a coupled session environment
US6178521B1 (en) Method and apparatus for disaster tolerant computer system using cascaded storage controllers
JP2013544386A (en) System and method for managing integrity in a distributed database
CN103942112A (en) Magnetic disk fault-tolerance method, device and system
CN106776123B (en) Disaster-tolerant real-time data copying method and system and backup client
US7376859B2 (en) Method, system, and article of manufacture for data replication
US7260739B2 (en) Method, apparatus and program storage device for allowing continuous availability of data during volume set failures in a mirrored environment
WO2025195152A1 (en) Data backup system, method and apparatus, and device, storage medium and program product
US7412577B2 (en) Shared data mirroring apparatus, method, and system
US7080197B2 (en) System and method of cache management for storage controllers
Dell

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant