JP2003044310A

JP2003044310A - Clustering system and method for restoring data when fault occurs in clustering system

Info

Publication number: JP2003044310A
Application number: JP2001208998A
Authority: JP
Inventors: Tsunehiro Kajita; 恒宏梶田; Akihiro Ogura; 明宏小倉; Mitsuhiro Nishida; 光宏西田; Atsuya Takeuchi; 篤也竹内; Nobuyasu Tanaka; 伸宜田中; Hiroshi Ito; 浩伊藤
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2001-07-10
Filing date: 2001-07-10
Publication date: 2003-02-14
Anticipated expiration: 2021-07-10
Also published as: JP3640349B2

Abstract

PROBLEM TO BE SOLVED: To provide a clustering system with improved processing efficiency by eliminating the transfer of failover management data among the server machines during normal operation. SOLUTION: Each of the server machines 11 to 14 is provided with a management data storage part (60 to 63) that stores the management data of the processing executed in that server machine, and a battery (70 to 73) that, when a fault occurs in the server machine, supplies power to at least the management data storage part and to an interface part (50 to 53) that electrically and mechanically connects the server machine to a communication line 110. When a fault occurs in a first server machine among the server machines, the interface part of the first server machine, powered by the battery, transfers the management data read from the management data storage part to a second server machine, and the second server machine continues service to the client based on the received management data.

Description

Detailed Description of the Invention

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a clustering system in which a plurality of server machines are connected to a changeover switch and operate cooperatively, so that service to clients can continue even if one server machine fails.

[0002]

2. Description of the Related Art

Clustering systems are known in which a plurality of server machines are connected to a changeover switch by communication lines and operate cooperatively, so that service to clients can continue even if one server machine fails. In such a clustering system, the plurality of server machines appear to a client as if they were a single server. Clustering functionality is supported by operating systems such as UNIX (registered trademark) and Windows NT (registered trademark).

[0003] A clustering system can also continue to provide service without stopping system functions when, for example, server machines are added to accommodate a growing number of clients, or during maintenance. It is therefore regarded as a kind of fault-tolerant system.

[0004] A fault-tolerant system generally provides redundancy, for example by duplicating its various components. Normally, apart from tasks such as data backup, only one component of each pair is used while the other is kept idle. When a failure occurs in the active component, the system immediately switches to the other component, so that data is recovered automatically and processing can continue.

[0005] In a typical clustering system, two or more server machines are connected to a changeover switch by communication lines, and individual server machines perform the services requested by multiple clients. In between servicing requests, each server machine transfers the failover (data recovery) management data it needs, such as journal data, to a spare server machine assigned to that server machine.

[0006] FIG. 5 is a block diagram showing the configuration of a conventional clustering system.

[0007] The clustering system 100 of FIG. 5 comprises: a plurality of server machines (first server 101, second server 102, third server 103, ..., nth server 104); a management data changeover switch 105 that switches the destinations of service requests from clients, of the various instructions and requests exchanged between server machines, and of failover management data such as journal data recording the history of processing results in each server machine; a main data changeover switch 106 that switches the destinations of main data, which is generally large in volume (for example, audio data and image data processed by each server machine); a main data storage unit 107 that stores the main data in multiplexed (redundant) form; a power supply unit 108 that supplies power obtained from a commercial power source to each of the above units; a communication line 110, for example an Ethernet bus, that connects each server machine to the management data changeover switch 105; a high-speed communication line 120, for example an optical fiber bus, that provides a high-capacity, high-speed connection between each server machine and the main data changeover switch 106; and power supply lines 130 that carry power from the power supply unit 108 to each unit.

[0008] The data exchanged between each server machine and the management data changeover switch 105 includes not only management data for failover after a system failure (journal information and the like) but also load-balancing data. The data volume is therefore considerable, though not as large as that of the main data comprising image and audio data, and the larger the system, the higher the communication speed required. An Ethernet bus connection is generally used between each server machine and the management data changeover switch 105.

[0009] By contrast, the data exchanged between each server machine and the main data changeover switch 106 is, for example, audio data and image data. Because this data is overwhelmingly larger in volume than the management data, the connection uses an optical fiber bus or the like, which is faster than the ordinary Ethernet bus connection used for the communication line 110.

[0010] Each server machine includes a management data interface unit (150 to 153) that converts or adjusts the server's internal management data into a form suitable for Ethernet bus communication and, conversely, converts or adjusts Ethernet bus communication data into a form suitable for use as the server's internal management data, and a management data storage unit (160 to 163) that stores management data such as journal data indicating the processing content requested of, or the processing results obtained by, that server machine. The management data interface units 150 to 153 and the management data storage units 160 to 163 are connected by the internal buses 140 to 143 of the respective server machines.

[0011] Among the power supply lines 130, the line supplying power to the first server 101 is denoted 130a, the line supplying the second server 102 is denoted 130b, the line supplying the third server 103 is denoted 130c, and the line supplying the nth server 104 is denoted 130d.

[0012] Next, the operation of backing up (storing in duplicate) the management data during normal operation of the clustering system 100 shown in FIG. 5 is described.

[0013] FIG. 6 is a flowchart of the duplicated storage of management data in the conventional clustering system 100.

[0014] The following assumes, as an example, that the first server 101 performs processing in response to a request from a client.

[0015] First, the first server 101 checks whether a processing request from a client has occurred (S1). If there is no request (S1: NO), the check in step S1 is repeated, so that requests from clients are monitored continuously. If there is a request (S1: YES), the first server 101 checks whether the request is to save management data (for example, journal data) (S2). If it is not (S2: NO), normal processing is performed according to the request (S5). If it is (S2: YES), the first server 101 performs interrupt processing to save the management data (S3).
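The decision flow of steps S1, S2, S3 and S5 can be sketched in Python as follows. This is only an illustrative model and not code from the patent: the `Request` class, the `"save_management_data"` label and the `dispatch` function are hypothetical names. In the flowchart, step S1 repeats indefinitely; here the loop ends when the queue is empty so that the sketch terminates.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    kind: str      # hypothetical label: "save_management_data" or anything else
    payload: str


def dispatch(queue):
    """Process every queued request and return a log of the steps taken."""
    log = []
    while queue:                                     # S1: is there a request?
        request = queue.popleft()
        if request.kind == "save_management_data":   # S2: is it a save request?
            log.append(("S3", request.payload))      # S3: interrupt processing
        else:
            log.append(("S5", request.payload))      # S5: normal processing
    return log
```

Each log entry records which branch of the flowchart a request took, which makes the S2 decision easy to inspect.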

[0016] Viewed in terms of the processing in the CPU (not shown) of the first server 101, the interrupt processing of step S3 for saving the management data proceeds, for example, as follows. When the process (job) currently running on the first server 101 finishes, the start of the next job is temporarily suspended. Management data describing, for example, the address at which the main data produced by the job is stored in the main data storage unit 107 is then generated and stored in the management data storage unit 160, which is provided in part of the main memory or the like.

[0017] In the conventional interrupt processing, the management data is stored in duplicate. The management data, such as journal data, saved in the management data storage unit 160 is therefore sent (transfer T1) from the management data interface unit 150 to the management data interface unit 151 in the second server 102.

[0018] Accordingly, in the process of transferring to another server shown in FIG. 6(b), the first server 101 first sends a management data transfer request to the second server 102 (S6). It is then checked whether the destination second server 102 is busy (S7). If the second server 102 is busy (S7: YES), the management data cannot be transferred, so the first server 101 waits until the second server 102 is no longer busy. Once the second server 102 is no longer busy (S7: NO), the management data stored in the management data storage unit 160 of the first server 101 is transferred to the second server 102 (S8). The next process (job) is restarted only after it has been confirmed that the management data has been saved in the management data storage unit 161 in the second server 102.
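The blocking nature of steps S6 to S8 can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: `BackupServer`, `transfer_management_data` and the `max_polls` guard are hypothetical and are not described in the patent, which specifies neither a polling mechanism nor a timeout.

```python
class BackupServer:
    """Hypothetical stand-in for the second server 102."""

    def __init__(self, busy_polls=0):
        self.busy_polls = busy_polls   # how many polls will report "busy"
        self.storage = []              # models management data storage unit 161

    def is_busy(self):
        if self.busy_polls > 0:
            self.busy_polls -= 1
            return True
        return False

    def store(self, record):
        self.storage.append(record)
        return True                    # acknowledgement that the save completed


def transfer_management_data(record, backup, max_polls=1000):
    """Return the number of polls spent waiting before the transfer succeeded."""
    waited = 0
    # S6: calling this function stands for sending the transfer request.
    while backup.is_busy():            # S7: is the destination busy?
        waited += 1
        if waited >= max_polls:        # guard added for the sketch only
            raise TimeoutError("backup server stayed busy")
    acknowledged = backup.store(record)   # S8: transfer the management data
    if not acknowledged:
        raise RuntimeError("backup did not confirm the save")
    return waited                      # only now may the next job start
```

The returned wait count makes the cost visible: every poll spent in S7 is time during which the first server cannot start its next job.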

[0019] Next, it is checked whether the interrupt processing of step S3 has finished (S4). If it has (S4: YES), the process returns to step S1; if not (S4: NO), the process returns to step S3 and the interrupt processing continues.

[0020] During normal operation, a request to save management data is generated, for example, each time a job on the first server 101 finishes. The updated management data is saved in the management data storage unit 160 and, to keep a duplicate copy, is also transferred to the second server 102. In other words, while the first server 101 is processing, the above interrupt processing occurs frequently, and the transfer T1 of management data from the first server 101 to the second server 102 is executed each time.

[0021] The following describes the case in which, while normal processing is being performed as described above, a failure occurs, for example, in the power supply line 130a to the first server 101. In practice, a failure in a server machine can occur at many different points; the power supply line is chosen here for convenience because it is the easiest case to understand and to explain.

[0022] FIG. 7 is a flowchart of the operation of the conventional clustering system 100 when a failure is detected.

[0023] First, for example, the second server 102, which has been receiving management data from the first server 101 in the clustering system 100, checks whether a failure has occurred in the first server 101 while the first server 101 is executing processing (S11).

[0024] The failure is confirmed in step S11 as follows. If, for example, a failure occurs in the power supply line 130a (S11: YES), power is no longer supplied to the first server 101, so the management data that the first server 101 has been transferring frequently (T1) suddenly stops arriving at the second server 102 without any normal termination procedure. The second server 102 can therefore determine whether a failure has occurred in the first server 101 by checking whether the incoming data has suddenly stopped. If no failure has occurred (S11: NO), the second server 102 repeats step S11, so that it monitors continuously for failures.
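The detection rule of step S11, in which the standby server infers a failure from the sudden absence of incoming management data, can be sketched as a silence timeout. This is an assumption-laden illustration: the patent does not state a threshold or a timing mechanism, so `silence_limit` and both function names are hypothetical, and the "no normal termination procedure" condition is not modelled.

```python
def primary_has_failed(last_received_at, now, silence_limit):
    """Return True when no management data has arrived for longer than silence_limit."""
    return (now - last_received_at) > silence_limit


def first_detection_time(arrival_times, check_times, silence_limit):
    """Return the first check time at which the primary is judged failed, else None."""
    for now in check_times:                       # repeated S11 checks
        received = [t for t in arrival_times if t <= now]
        last = max(received) if received else 0.0
        if primary_has_failed(last, now, silence_limit):
            return now                            # S11: YES
    return None                                   # S11: NO at every check
```

A design consequence worth noting is that this scheme only works because transfers T1 are frequent; the same traffic that enables detection is the overhead the invention aims to remove.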

[0025] Having detected the failure in the first server 101, the second server 102 causes the management data changeover switch 105 to switch the source from which management data is output, for example from the first server 101 to the second server 102, and to switch the receiving side to which management data is sent, for example from the second server 102 to the third server 103 (S12). That is, the second server 102, which under the initial settings was merely a spare server machine storing backup management data, becomes on failure the main processing server machine that is the source of management data. The connection of the management data changeover switch 105 to the second server 102 is therefore switched to the management data output side, and a new connection is set up so that the third server 103 can be connected as the new receiving side (spare server machine) for management data.

[0026] The second server 102, having detected the failure in the first server 101, also switches the main data changeover switch 106 so that main data is exchanged between the main data storage unit 107 and the client via the second server 102 (S12).

[0027] When a failure occurs, the second server 102 checks the contents of the management data it has stored (S13) and starts processing from the job following the most recently saved contents (the last saved contents), thereby continuing on the second server 102 the processing that the first server 101 had partly completed (S14).
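Steps S13 and S14 amount to finding the last journaled job and resuming from the one after it. A minimal Python sketch follows, assuming for illustration that jobs are identified by unique names and that the journal is an ordered list of completed job names; the function name and data layout are hypothetical and are not specified in the patent.

```python
def jobs_to_resume(all_jobs, journal):
    """Return the jobs the standby server still has to run.

    all_jobs: ordered list of unique job identifiers.
    journal:  identifiers recorded as completed, in the order they were saved.
    """
    if not journal:
        return list(all_jobs)               # nothing was saved: start from the top
    last_saved = journal[-1]                # S13: most recently saved entry
    next_index = all_jobs.index(last_saved) + 1
    return list(all_jobs[next_index:])      # S14: continue from the next job
```

For example, with jobs `["a", "b", "c", "d"]` and a journal of `["a", "b"]`, the standby server resumes with `["c", "d"]`.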

[0028] With the above configuration and method, the conventional clustering system can have another server machine continue the processing requested by a client even when one server machine fails, so that the data for the requested processing is recovered automatically.

[0029]

PROBLEMS TO BE SOLVED BY THE INVENTION

However, as described above, the transfer T1 of management data from the first server 101 to the second server 102 is performed frequently, for example each time a job finishes, and this degrades the processing efficiency of the CPU in the first server 101. Moreover, because the communication line 110, an Ethernet bus or the like, is slower than the internal bus 140 and other buses within the first server 101, the processing efficiency of the first server 101 as a whole is also reduced.

[0030] In addition, the communication line 110 is used not only for failover management data such as the journal data described above but also, for example, for load balancing. Frequent transfers T1 of the management data of the first server 101 to the second server 102 over the communication line 110 therefore also reduce the processing efficiency of the clustering system 100 as a whole.

[0031] The present invention has been made to solve the conventional problems described above. Its object is to provide a clustering system with improved processing efficiency by eliminating, during normal operation, the server-to-server transfer of failover management data between the server machines.

[0032]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, the clustering system of the present invention according to claim 1 is a clustering system in which a plurality of server machines are connected to a changeover switch by communication lines and operate cooperatively, so that service to clients can continue even if one server machine fails. Each server machine comprises a management data storage unit that stores management data for the processing performed in that server machine, and a battery that, when a failure occurs in that server machine, supplies power to at least the management data storage unit and an interface unit that electrically and mechanically connects the server machine to the communication line. When a failure occurs in a first server machine among the server machines, the interface unit, powered by the battery in the first server machine, transfers the management data read from the management data storage unit to a second server machine, and the second server machine continues service to the client based on the received management data.

[0033] The invention of claim 2 is the clustering system according to claim 1, wherein the interface unit of each server machine comprises a failure operation control circuit that detects a failure occurring during normal operation, switches the power source to the battery, and then controls the transfer of the management data to the second server machine.

[0034] The invention of claim 3 is the clustering system according to claim 1, wherein the interface unit of each server machine comprises a low-frequency clock generation circuit that generates a clock signal with a frequency lower than that of the clock signal used during normal operation, and a clock switching circuit that, when a failure occurs in the first server machine, switches the clock signal used in the interface unit from the relatively high-frequency clock signal used during normal operation to the low-frequency clock signal.

[0035] The invention of claim 4 is the clustering system according to any one of claims 1 to 3, wherein the interface unit of each server machine takes the form of a board or card that can be connected through a standardized slot, and integrally incorporates the battery and the management data storage unit.

[0036] The data recovery method for use when a failure occurs in a clustering system, according to claim 5 of the present invention, applies to a clustering system in which a plurality of server machines are connected to a changeover switch by communication lines and operate cooperatively, so that service to clients can continue even if one server machine fails. In this method, a first server machine that controls the service requested by a client stores management data within itself during normal operation. When the first server machine detects that a failure has occurred in it, it switches the power supplied to at least the storage unit that stores the management data and the interface unit that connects to the communication line over to a built-in battery, and transfers the management data from the interface unit to a second server machine. The second server machine then uses the received management data to continue the service requested by the client.

[0037] The invention of claim 6 is the data recovery method according to claim 5, wherein, after switching the power source to the built-in battery, the first server machine switches the clock signal to a low-frequency clock signal and transfers the management data to the second server machine at a lower speed than during normal operation.
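The failure-time sequence of claims 5 and 6 (store locally during normal operation; on failure, switch to the battery, drop to the low-frequency clock, then transfer) can be modelled in a few lines of Python. This is a behavioural sketch only: the `InterfaceUnit` class, its field values and the `send` callback are hypothetical, and the real switching is performed by hardware circuits rather than software.

```python
from dataclasses import dataclass, field


@dataclass
class InterfaceUnit:
    """Hypothetical model of the interface unit with built-in storage and battery."""
    power_source: str = "external"   # "external" or "battery"
    clock: str = "high"              # "high" (normal) or "low" (failure)
    storage: list = field(default_factory=list)

    def save(self, record):
        """Normal operation: keep management data locally, with no transfer."""
        self.storage.append(record)

    def on_failure(self, send):
        """Run the failure-time sequence and return the steps in order."""
        steps = []
        self.power_source = "battery"    # claim 5: switch to the built-in battery
        steps.append("power->battery")
        self.clock = "low"               # claim 6: switch to the low-frequency clock
        steps.append("clock->low")
        for record in self.storage:      # transfer to the second server machine
            send(record)
        steps.append("transferred")
        return steps
```

Note that `save` never calls `send`: in this model, as in the claims, nothing crosses the communication line until a failure has been detected.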

[0038]

BEST MODE FOR CARRYING OUT THE INVENTION

The present invention is described below on the basis of the illustrated embodiments.

[0039] FIG. 1 is a block diagram showing the configuration of a clustering system according to the first embodiment of the present invention.

[0040] In FIG. 1, parts having the same functions as those of the conventional clustering system shown in FIG. 5 are given the same reference numerals, and duplicate description is omitted.

[0041] The clustering system 1 of FIG. 1 resembles the conventional system in having a plurality of server machines, but the server machines, namely the first server 11, second server 12, third server 13, ..., nth server 14, differ from the conventional ones in internal configuration. The remaining components, namely the management data changeover switch 105, main data changeover switch 106, main data storage unit 107, power supply unit 108, communication line 110, high-speed communication line 120, and power supply lines 130, are the same as in the conventional configuration shown in FIG. 5.

[0042] In each server machine of this embodiment, the management data interface unit (50 to 53) inside the server has, as in the conventional system, the function of converting or adjusting management data into a form suitable for Ethernet bus communication and, conversely, of converting or adjusting Ethernet bus communication data into a form suitable for use as the server's internal management data. In addition, it contains a management data storage unit (60 to 63) that stores management data such as journal data indicating the processing content requested of, or the processing results obtained by, that server machine, and a battery (70 to 73) that supplies power when the power supply to the management data storage unit fails.

[0043] The batteries 70 to 73 have, in addition to the capacity needed for memory backup of the management data storage units 60 to 63, the capacity needed to transfer the contents of the management data storage units 60 to 63 to another server machine.

[0044] The conventional management data storage units 160 to 163 generally use part of the main memory that serves as the main work area of each server machine. As shown in FIG. 5, they are therefore separate from the management data interface units 150 to 153 and are connected to them by the internal bus 140 of each server machine. In this embodiment, by contrast, the management data storage units 60 to 63 are located inside the management data interface units 50 to 53. Each management data storage unit 60 to 63 uses, for example, four 64-Mbit (1 Mword × 16 bit × 4 bank) SDRAM devices, which are volatile memory, for a total of 32 Mbytes.
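The stated capacity can be checked with a short calculation: 1 Mword × 16 bit × 4 banks gives 64 Mbit per device, and four devices give 256 Mbit, or 32 Mbytes. The Python sketch below reproduces this arithmetic, assuming the binary convention in which 1 M = 2^20; the function name is hypothetical.

```python
MEGA = 2 ** 20  # assumption: "M" is taken as 2**20, the usual convention for memory devices


def sdram_capacity_bytes(words_per_bank, bits_per_word, banks, chips):
    """Return the total capacity in bytes of a set of identical SDRAM devices."""
    bits_per_chip = words_per_bank * bits_per_word * banks
    return chips * bits_per_chip // 8


# One device: 1 Mword x 16 bit x 4 banks = 64 Mbit
bits_per_chip = 1 * MEGA * 16 * 4
# Four devices: 4 x 64 Mbit = 256 Mbit = 32 Mbytes
total_bytes = sdram_capacity_bytes(1 * MEGA, 16, 4, 4)
```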

[0045] FIG. 2 is a diagram showing the internal configuration of the first server 11 in FIG. 1.

[0046] Only the internal configuration of the first server 11 is described below. The second server 12 to the nth server 14 in FIG. 1 have the same internal configuration as the first server 11, so their description is omitted.

[0047] In addition to the management data interface unit 50, the first server 11 of FIG. 2 includes a server main control unit 150 that executes the processing on main data requested by the client, and a main data interface unit 160 for storing main data in, and reading it from, the main data storage unit 107 of FIG. 1. The input/output section between this server and the client and the other components of the server have little bearing on the main features of the present invention and differ little from the conventional configuration, so their description is omitted. Power is supplied to each of the above units from the power supply unit 108 via the power supply line 130.

[0048] In addition to the management data storage unit 60 and the battery 70 described above, the management data interface unit 50 includes: a communication control circuit 64 that controls communication with the management data changeover switch 105 over the communication line 110; a failure operation control circuit 65 that detects a failure occurring in the first server 11 by exchanging signals with the server main control unit 150 and, when a failure is detected, controls the transfer of the data saved in the management data storage unit 60 to another server; a clock switching circuit 66 that, under the control of the failure operation control circuit 65, switches the clock signal used between normal operation and failure; a communication line connection unit 67, consisting of a connector or the like, for connecting the communication line 110 to the first server 11; a low-frequency clock generation circuit 68 that supplies a relatively low-frequency clock signal for use during a failure; a server internal connection unit 74, consisting of a connector or the like, for connecting the management data interface unit 50 to the internal bus 140; a power switching circuit 75 that, under the control of the failure operation control circuit 65, switches between the external power supplied from the power supply unit 108 during normal operation and the internal power supplied from the battery 70 during a failure; and a high-frequency clock generation circuit 78 that supplies a relatively high-frequency clock signal for normal operation.

【００４９】The failure-time battery operating range 56, to which power is supplied from the battery 70 at the time of failure, includes the management data storage unit 60, the communication control circuit 64, the failure operation control circuit 65, the clock switching circuit 66, the communication line connection unit 67, and the low-frequency clock generation circuit 68.

【００５０】The management data interface unit 50 described above is generally designed to support a standardized bus format such as Ethernet (registered trademark). It therefore takes the form of a board or card incorporating an Ethernet (registered trademark) interface circuit, and electrical and mechanical connection is made by inserting the board or card into a predetermined slot that accepts it. Accordingly, the battery 70 and the management data storage unit 60 are arranged on the board-type or card-type interface, that is, they are built into and integrated with the interface.

【００５１】The operating frequency given by the high-frequency clock generation circuit 78, which supplies the clock signal for normal operation, and that given by the low-frequency clock generation circuit 68, which supplies the clock signal for failure, are assumed to differ by a factor of about 10. That is, switching from the high-frequency clock generation circuit 78 to the low-frequency clock generation circuit 68 reduces the operating frequency to 1/10. In terms of communication speed, for example, the speed drops from 1 Gbps in full-duplex communication during normal operation to 100 Mbps.

【００５２】With the communication method used in normal operation, power consumption of about 10 W is required, so the battery 70 would need a capacity of about 3000 mAh. In terms of common dry-cell batteries (nickel-metal hydride batteries), this corresponds to three D-size (Japanese 単1) cells, the largest size commercially available. Mounting batteries of this size on a board or card is impractical. With the low-frequency method of this embodiment, however, power consumption can be held to about 600 mW, so the required capacity falls to about 200 mAh. In terms of dry-cell batteries (nickel-metal hydride batteries), this corresponds to one N-size (Japanese 単5) cell, the smallest size commercially available among dry cells (excluding button cells).
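
The capacity figures above can be checked with a rough calculation. The following Python sketch is illustrative only and is not part of the described embodiment: the 3.3 V supply voltage and the one-hour backup time are assumptions that do not appear in this description, chosen because they give values close to the stated 3000 mAh and 200 mAh.

```python
def required_capacity_mah(power_w: float, supply_voltage_v: float, runtime_h: float) -> float:
    """Capacity (mAh) a battery needs to deliver power_w for runtime_h at supply_voltage_v."""
    current_a = power_w / supply_voltage_v   # I = P / V
    return current_a * runtime_h * 1000.0    # ampere-hours -> milliampere-hours


# Assumed values, not taken from the description.
SUPPLY_VOLTAGE_V = 3.3
BACKUP_TIME_H = 1.0

normal_mah = required_capacity_mah(10.0, SUPPLY_VOLTAGE_V, BACKUP_TIME_H)  # 10 W, normal clock
low_mah = required_capacity_mah(0.6, SUPPLY_VOLTAGE_V, BACKUP_TIME_H)      # 600 mW, low clock

print(f"normal clock: {normal_mah:.0f} mAh, low clock: {low_mah:.0f} mAh")
```

Under these assumed values the sketch gives roughly 3030 mAh and 182 mAh, which is in line with the approximate figures quoted above; a different voltage or backup time would scale both results proportionally.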

【００５３】Because such a small battery can be used in this embodiment, the interface can, for example, take the form of a card such as a PCI adapter card with the battery built into it.

【００５４】With the board-type or card-type interface described above, converting a conventional clustering system to the system of this embodiment requires only replacing the interface card. There is no need to purchase a new clustering system, and the conventional clustering system can be converted to the clustering system of this embodiment at little installation cost and with few modification man-hours.

【００５５】Next, the operation of saving management data during normal operation of the clustering system 1 shown in FIGS. 1 and 2 is described. In this embodiment, during normal operation the management data is only saved in the battery-backed management data storage unit. The duplicated saving of management data, in which management data is also saved on another server machine during normal operation as in the conventional system, is not performed.

【００５６】FIG. 3 is a flowchart of the management data saving operation in the clustering system 1 of this embodiment.

【００５７】Here, as an example, it is assumed that the first server 11 executes processing in response to a request from a client.

【００５８】First, the first server 11 checks whether a processing request from a client has been issued (S21). If there is no processing request (S21: NO), the check in step S21 is repeated, so that requests from clients are monitored constantly.

【００５９】If there is a processing request (S21: YES), the first server 11 checks whether the request is to save management data (for example, journal data) (S22). If it is not (S22: NO), normal processing is performed according to the request (S25). If it is (S22: YES), the first server 11 performs interrupt processing for saving the management data (S23).

【００６０】In the interrupt processing of step S23, for example, when the processing (job) currently being executed by the server main control unit 150 of the first server 11 finishes, the start of the next job is temporarily suspended, management data such as the address position of the main data stored as the job result in the main data storage unit 107 is generated, and this management data is stored in the management data storage unit 60 provided in the management data interface unit 50. In this embodiment, normal operation thus involves only processing inside the server machine that processes the main data; no transfer to another server machine for duplicating the management data is performed. The next processing (job) can therefore restart sooner.

【００６１】Next, it is checked whether the interrupt processing of step S23 has finished (S24). If it has (S24: YES), the flow returns to step S21; if not (S24: NO), the flow returns to step S23 and the interrupt processing continues.
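
The flow of steps S21 to S25 can be modeled in a few lines of Python. This is a minimal sketch for illustration, not the implementation of the embodiment: the class names `Request` and `FirstServer`, the request kind string, and the payload fields are all hypothetical, and the in-memory list merely stands in for the management data storage unit 60.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    kind: str      # "save_management_data" or any other request type (hypothetical names)
    payload: dict


@dataclass
class FirstServer:
    # Stands in for the management data storage unit 60 (local, battery-backed).
    management_store: list = field(default_factory=list)
    # Stands in for the results of ordinary request processing.
    processed: list = field(default_factory=list)

    def step(self, request: Optional[Request]) -> str:
        """Run one pass of the FIG. 3 loop and report which branch was taken."""
        if request is None:                          # S21: NO, keep monitoring
            return "idle"
        if request.kind != "save_management_data":   # S22: NO
            self.processed.append(request.payload)   # S25: normal processing
            return "normal"
        # S22: YES, then S23: save locally only; nothing is sent to another server.
        self.management_store.append(request.payload)
        return "saved"                               # S24: YES, back to S21


server = FirstServer()
results = [
    server.step(None),
    server.step(Request("read", {"key": "a"})),
    server.step(Request("save_management_data", {"job_id": 1, "address": 0x1000})),
]
print(results)
```

The point the sketch makes is that the save branch touches only local state, which is why the hold time before the next job is short.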

【００６２】During normal operation, for example, a management data save request is issued each time one job on the first server 11 finishes, and the updated management data is saved in the management data storage unit 60. Because this processing uses only the internal bus 140 and the like, which are relatively fast compared with the communication line 110 and the like, the hold time between jobs, that is, the time spent on interrupt processing, is shorter than in the conventional system.

【００６３】The following describes the case where, while normal processing is being performed as described above, a failure occurs in, for example, the power supply line 130a to the first server 11.

【００６４】FIG. 4 is a flowchart of the operation when a failure is detected in the clustering system 1 of this embodiment.

【００６５】First, in the first server 11 of the clustering system 1, for example, the failure operation control circuit 65 constantly monitors the server main control unit 150 and the like, and checks whether a failure has occurred in the first server 11 while it is executing processing (S31).

【００６６】The occurrence of a failure is confirmed in step S31 as follows. If, for example, a failure occurs in the power supply line 130a (S31: YES), power is no longer supplied to the first server 11, so the failure operation control circuit 65 detects an abnormal drop in the voltage level. The failure operation control circuit 65 can thus determine whether a failure has occurred in the first server 11 by checking for a sudden voltage-level abnormality that does not occur in normal operation. If no failure has occurred (S31: NO), the failure operation control circuit 65 repeats step S31, so that the occurrence of a failure is monitored constantly.

【００６７】On detecting a failure in the first server 11, the failure operation control circuit 65 controls the power supply switching circuit 75 to switch from the power supplied by the power supply unit 108 during normal operation to the power supplied by the battery 70 at the time of failure (S32). To avoid malfunctions and the like when a failure occurs, the failure operation control circuit 65 also electrically disconnects the server internal connection unit 74, or internal circuits beyond it such as the server main control unit 150, so that signals from inside the server cannot be received (S33).

【００６８】Further, the failure operation control circuit 65 controls the clock switching circuit 66 to switch from the high-frequency clock generation circuit 78 used in normal operation to the low-frequency clock generation circuit 68 used at the time of failure.
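
The sequence of steps S31 to S33 followed by the clock switch can be pictured as a small state change. The Python sketch below is a software analogy of what the description presents as circuitry; the class `FailureController`, its method name, and the 3.0 V undervoltage threshold are hypothetical and are not taken from this description.

```python
from dataclasses import dataclass
from enum import Enum


class PowerSource(Enum):
    EXTERNAL = "power supply unit 108"
    BATTERY = "battery 70"


class ClockSource(Enum):
    HIGH = "high-frequency clock generation circuit 78"
    LOW = "low-frequency clock generation circuit 68"


@dataclass
class FailureController:
    """Software stand-in for the failure operation control circuit 65."""
    undervoltage_threshold_v: float = 3.0   # assumed threshold, not from the description
    power: PowerSource = PowerSource.EXTERNAL
    clock: ClockSource = ClockSource.HIGH
    internal_bus_connected: bool = True

    def on_voltage_sample(self, voltage_v: float) -> bool:
        """Return True if a failure was detected and the failure-time state was entered."""
        if voltage_v >= self.undervoltage_threshold_v:   # S31: NO, keep monitoring
            return False
        self.power = PowerSource.BATTERY                 # S32: switch to battery power
        self.internal_bus_connected = False              # S33: isolate the server internals
        self.clock = ClockSource.LOW                     # switch to the low-frequency clock
        return True


ctrl = FailureController()
detected = [ctrl.on_voltage_sample(v) for v in (3.3, 3.3, 0.4)]
print(detected, ctrl.power.name, ctrl.clock.name, ctrl.internal_bus_connected)
```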

【００６９】As a result of the above control, at least the units within the failure-time battery operating range 56 in the management data interface unit 50 receive power from the battery 70 when a failure occurs and operate on a clock signal of lower frequency than in normal operation. The battery 70 therefore no longer needs to supply the high power required for high-speed operation, so the battery capacity can be reduced.

【００７０】A smaller battery capacity allows the external dimensions of the battery mounted on the interface card to be reduced, which makes it easier to design an interface card that must fit within prescribed dimensions.

【００７１】Then, operating on the low-frequency clock signal, the failure operation control circuit 65 controls the communication control circuit 64 to read the management data stored in the management data storage unit 60 and perform transfer T2 to the second server 12 via the communication line 110 and the management data changeover switch 105 (S35).

【００７２】At that time, the communication control circuit 64 establishes synchronization with the second server 12 at the communication speed based on the low-frequency clock signal, has the stored management data burst-transferred out of the management data storage unit 60 in single-bank mode, and then transfers it to the second server 12.

【００７３】Switching to the communication speed based on the low-frequency clock signal increases the transfer time to about eight times that of normal operation; for example, transferring 32 Mbytes takes about 3 seconds. However, failover processing in a typical clustering system takes at least about one minute, so the lower communication speed in this embodiment accounts for only a small part of the overall failover delay and is not expected to be a problem.
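
The "about 3 seconds" figure can be sanity-checked from the raw line rate. The Python sketch below is a back-of-the-envelope estimate only: it assumes 1 Mbyte = 2^20 bytes and ignores protocol overhead, memory read time, and synchronization, none of which are quantified in this description, and it does not attempt to derive the "about eight times" ratio stated above.

```python
def transfer_time_s(size_bytes: int, link_bps: float) -> float:
    """Time to move size_bytes over a link of link_bps, ignoring all overhead."""
    return size_bytes * 8 / link_bps


SIZE_BYTES = 32 * 1024 * 1024      # 32 Mbytes of management data (assuming 2**20 bytes per Mbyte)
LOW_SPEED_BPS = 100e6              # 100 Mbps at the low-frequency clock
FAILOVER_S = 60.0                  # "at least about one minute" for typical failover

t_low = transfer_time_s(SIZE_BYTES, LOW_SPEED_BPS)
share = t_low / FAILOVER_S         # fraction of the failover time spent on the transfer

print(f"transfer: {t_low:.2f} s, share of failover: {share:.1%}")
```

On these assumptions the transfer takes a little under 3 seconds, a few percent of a one-minute failover, which is consistent with the argument that the slower link is not the bottleneck.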

【００７４】On receiving the management data sent by transfer T2, the second server 12 causes the management data changeover switch 105 to switch the output source of management data at the time of failure from the first server 11 to the second server 12, and the destination of management data at the time of failure from the second server 12 to the third server 13 (S36). That is, the second server 12, which in the initial setting was the standby server machine that would receive the management data upon a failure and continue the processing requested by the client, now becomes the main processing server machine that stores management data internally. The connection of the management data changeover switch 105 to the second server 12 is therefore switched to be the management data output side at the time of failure, and the connection to the third server 13 is set so that the third server 13 (the new standby server machine) is connected as the receiver of management data should a failure occur in the second server 12.

【００７５】The second server 12 that has received the management data also switches the main data changeover switch 106 so that main data exchanged between the main data storage unit 107 and the client passes through the second server 12 (S36).
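
The role change performed in step S36 can be sketched as a small function that promotes the standby server and picks a new standby. This Python sketch is illustrative only: the function name, the server labels, and in particular the rule of choosing the next non-failed server in list order are assumptions. The description itself states only that the third server 13 becomes the new standby in this example.

```python
from typing import List, Optional, Set, Tuple


def reassign_roles(
    servers: List[str], failed: Set[str], active: str, standby: str
) -> Tuple[str, Optional[str], Set[str]]:
    """Return (new_active, new_standby, failed_set) after the active server fails."""
    now_failed = set(failed) | {active}
    new_active = standby                      # the standby takes over the service
    start = servers.index(new_active)
    # Assumed policy: walk the list in order and take the first healthy server.
    for offset in range(1, len(servers)):
        candidate = servers[(start + offset) % len(servers)]
        if candidate not in now_failed:
            return new_active, candidate, now_failed
    return new_active, None, now_failed       # no healthy server left to stand by


servers = ["server11", "server12", "server13", "server14"]
first = reassign_roles(servers, set(), "server11", "server12")
second = reassign_roles(servers, first[2], first[0], first[1])
print(first[:2], second[:2])
```

Applied twice, the sketch moves the active role from the first server to the second and then to the third, with the standby role advancing one server ahead each time.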

【００７６】When the second server 12 receives the management data of the first server 11 because a failure has occurred in the first server 11, it checks the content of the received management data (S37) and starts processing from the job following the last received content (the final received content). The second server 12 can thereby continue the processing that the first server 11 had carried out partway (S38).
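
Steps S37 and S38 amount to finding the last job recorded in the received management data and resuming from the one after it. The Python sketch below shows that idea under stated assumptions: the record layout with a `job_id` field and the notion of an ordered job queue are hypothetical, since this description does not specify the format of the management data.

```python
from typing import Dict, List


def jobs_to_resume(received_records: List[Dict], job_queue: List[str]) -> List[str]:
    """Return the jobs the takeover server should run, in order."""
    if not received_records:
        return list(job_queue)                      # nothing was recorded: start from the top
    last_done = received_records[-1]["job_id"]      # S37: inspect the final received content
    return job_queue[job_queue.index(last_done) + 1:]   # S38: continue with the next job


records = [{"job_id": "j1"}, {"job_id": "j2"}]
queue = ["j1", "j2", "j3", "j4"]
remaining = jobs_to_resume(records, queue)
print(remaining)
```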

【００７７】By using the configuration and method described above, the clustering system 1 of this embodiment allows processing requested by a client to be continued by another server machine even when one server machine fails, so the data for the requested processing can be recovered automatically.

【００７８】Thus, in the first server 11 of this embodiment, the transfer T2 of management data to the second server 12 is not performed during normal operation, but only when a failure occurs in the first server 11. During normal operation, therefore, the first server 11 incurs no processing delay from transferring management data to another server, and CPU processing efficiency suffers less.

【００７９】In addition, in the first server 11 of this embodiment, the management data is backed up by the battery and stored in the management data storage unit 60 inside the first server 11. This also mitigates the loss of server processing efficiency that arises because the communication line 110, such as an Ethernet bus, is slower than the internal bus 140 and the like in the first server 11.

【００８０】Furthermore, because failover management data such as the journal data described above is no longer sent or received over the communication line 110 except when a failure occurs, the processing efficiency of the client system can be improved when, for example, the communication line 110 is used for load balancing and the like.

【００８１】Although the embodiment above describes transferring the management data to the second server 12 when a failure occurs in the first server 11, the present invention is not limited to this. For example, it can also be applied where a failure occurs in any one of n servers and the management data is transferred to any server designated from among the remaining servers.

【００８２】In the embodiment above, the management data interface unit uses standardized Ethernet (registered trademark) and takes the form of an interface card, which removes the need to change the entire clustering system and keeps installation cost and modification man-hours small. The present invention is not limited to this, however. It can also be implemented with other standardized bus formats or interface card formats, and the design of the interface card can likewise be made easier.

【００８３】In the embodiment above, each server machine is connected to the management data changeover switch 105 by an Ethernet bus and to the main data changeover switch 106 by an optical fiber bus. The system may instead be configured, for example, to omit the Ethernet bus connections and make all connections by optical fiber bus.

【００８４】The method of this embodiment, in which management data is transferred with reduced power consumption by lowering the communication speed, can be applied not only to the ordinary Ethernet (registered trademark) connection described above but also to other network connection methods, for example Gigabit Ethernet (registered trademark).

【００８５】In the embodiment above, the clock signal for transferring the management data is switched to a low-speed (low-frequency) one when a failure occurs. However, if the battery capacity has margin, or if the original clock frequency is already low so that little power is needed to transfer the management data, the transfer may be performed without switching the clock signal.

【００８６】Although the embodiment above describes a failure occurring in the power supply line, the present invention is not limited to this and can be applied to all kinds of failures occurring in each server, for example failures of signal lines, internal processing circuits, and the like.

【００８７】The failure-time battery operating range in the embodiment above may also be configured to include other circuits not shown.

【００８８】[0088]

[Effects of the Invention] As described above, in the clustering system of the present invention and its method for recovering data when a failure occurs, the transfer of management data from the first server machine, which is processing main data, to the standby second server machine is not performed during normal operation but only when a failure occurs in the first server machine. No delay therefore arises from management data transfer processing in the first server machine during normal operation, and the loss of processing efficiency is alleviated.

【００８９】In addition, in the clustering system of the present invention and its method for recovering data when a failure occurs, the management data is backed up by a battery and stored only inside the server machine. This also alleviates the loss of internal processing efficiency of the server machine caused by the difference in processing time between a transfer communication line such as an Ethernet bus and the internal bus.

【００９０】Furthermore, because failover management data such as journal data is no longer sent or received over the communication line of the clustering system of the present invention except when a failure occurs, the communication line is available more often for load balancing and the like, and the processing efficiency of the clustering system can be improved.

【００９１】When the management data interface unit takes a standardized card format, converting a conventional clustering system to the clustering system of the present invention does not require changing the entire system, and installation cost and modification man-hours can be kept small. Because the battery can be made small, the interface card is also easier to design.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a clustering system according to a first embodiment of the present invention.

FIG. 2 is a diagram showing the internal configuration of the first server in FIG. 1.

FIG. 3 is a flowchart of the management data saving operation in the clustering system of this embodiment.

FIG. 4 is a flowchart of the operation when a failure is detected in the clustering system of this embodiment.

FIG. 5 is a block diagram showing the configuration of a conventional clustering system.

FIGS. 6(a) and 6(b) are flowcharts of the duplicated saving of management data in a conventional clustering system.

FIG. 7 is a flowchart of the operation when a failure is detected in a conventional clustering system.

[Explanation of symbols]

1, 100: clustering system
11, 101: first server
12, 102: second server
13, 103: third server
14, 104: nth server
50 to 53, 150 to 153: management data interface unit
56: failure-time battery operating range
60 to 63, 160 to 163: management data storage unit
64: communication control circuit
65: failure operation control circuit
66: clock switching circuit
67: communication line connection unit
68: low-frequency clock generation circuit
70 to 73: battery
74: server internal connection unit
75: power supply switching circuit
78: high-frequency clock generation circuit
105: management data changeover switch
106: main data changeover switch
107: main data storage unit
108: power supply unit
110: communication line
120: high-speed communication line
130: power supply line
140 to 143: internal bus
150: server main control unit
160: main data interface unit

Continuation of front page
(72) Inventor: Tsunehiro Kajita, c/o IBM Japan, Ltd., Yamato site, 1623-14 Shimotsuruma, Yamato-shi, Kanagawa
(72) Inventor: Akihiro Ogura, c/o IBM Japan, Ltd., Yamato site, 1623-14 Shimotsuruma, Yamato-shi, Kanagawa
(72) Inventor: Mitsuhiro Nishida, c/o IBM Japan, Ltd., Yamato site, 1623-14 Shimotsuruma, Yamato-shi, Kanagawa
(72) Inventor: Atsuya Takeuchi, c/o IBM Japan, Ltd., Yamato site, 1623-14 Shimotsuruma, Yamato-shi, Kanagawa
(72) Inventor: Nobuyasu Tanaka, c/o IBM Japan, Ltd., Yamato site, 1623-14 Shimotsuruma, Yamato-shi, Kanagawa
(72) Inventor: Hiroshi Ito, c/o IBM Japan, Ltd., Yamato site, 1623-14 Shimotsuruma, Yamato-shi, Kanagawa
F-terms (reference): 5B011 DA02 DB21 EA01 EB03 FF01 FF04 GG16 JA02 JA05 JB02; 5B034 AA04 CC01 DD05; 5B045 BB15 GG01 JJ44; 5B089 GA02 JA35 JB17 KA12 KB04 KC28 MD09 ME09

Claims

[Claims]

1. A clustering system in which a plurality of server machines are connected to a changeover switch by communication lines and operate cooperatively so that service to clients can continue even if one server machine fails, wherein each server machine comprises: a management data storage unit that stores management data of the processing executed in that server machine; and a battery that, when a failure occurs in that server machine, supplies power to at least the management data storage unit and an interface unit that electrically and mechanically connects the server machine to the communication line; and wherein, when a failure occurs in a first server machine among the server machines, the interface unit in the first server machine, powered by the battery, transfers the management data read from the management data storage unit to a second server machine, and service to the client is continued on the basis of the management data received by the second server machine.

2. The clustering system according to claim 1, wherein the interface unit of each server machine is provided with a failure operation control circuit that detects a failure occurring during normal operation, switches the power source to the battery, and then controls the transfer of the management data to the second server machine.
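The sequence recited in claims 1 and 2 (detect the failure, switch to battery power, transfer the management data, continue service on the second server machine) can be illustrated with a minimal sketch. This is only a software model written for illustration: the claims describe hardware, and the names `ServerMachine`, `PowerSource`, `handle_request`, `on_failure`, and `receive_management_data`, as well as the session keys, are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass, field
from enum import Enum


class PowerSource(Enum):
    MAINS = "mains"
    BATTERY = "battery"


@dataclass
class ServerMachine:
    """Hypothetical model of one server machine in the cluster."""
    name: str
    # Stands in for the management data storage unit.
    management_data: dict = field(default_factory=dict)
    power_source: PowerSource = PowerSource.MAINS
    failed: bool = False
    log: list = field(default_factory=list)

    def handle_request(self, session_id: str, state: str) -> None:
        # Normal operation: record management data for the processing in progress.
        self.management_data[session_id] = state

    def on_failure(self, peer: "ServerMachine") -> None:
        # Failure operation control: switch to the battery first, then transfer.
        self.failed = True
        self.log.append("failure detected")
        self.power_source = PowerSource.BATTERY
        self.log.append("power switched to battery")
        peer.receive_management_data(dict(self.management_data))
        self.log.append(f"management data transferred to {peer.name}")

    def receive_management_data(self, data: dict) -> None:
        # The second server machine takes over using the data it received.
        self.management_data.update(data)
        self.log.append(f"resumed {len(data)} session(s)")


first = ServerMachine("server-1")
second = ServerMachine("server-2")
first.handle_request("client-A", "step 3 of 5")
first.on_failure(second)
print(first.log)
print(second.management_data)
```

The point of the sketch is the ordering: the transfer happens only after the power source has been switched, because the failed machine's normal power supply can no longer be relied on.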

3. The clustering system as described, wherein the interface unit of each server machine is provided with: a low-frequency clock generation circuit that generates a clock signal having a frequency lower than that of the clock signal used during normal operation; and a clock switching circuit that, when a failure occurs in the first server machine, switches the clock signal used in the interface unit from the relatively high-frequency clock signal used during normal operation to the low-frequency clock signal.

4. The clustering system according to any one of claims 1 to 3, wherein the interface unit of each server machine is a board-type or card-type unit that can be connected to a predetermined standardized slot, and is integrated with the battery and the management data storage unit built in.

5. A method for recovering data when a failure occurs in a clustering system in which a plurality of server machines are connected to a changeover switch by a communication line and operate cooperatively so that service to clients can continue even if one server machine fails, wherein: a first server machine that controls a service requested by a client stores management data inside the first server machine during normal operation; upon detecting that a failure has occurred in the first server machine, the first server machine switches the power supplied to at least a storage unit that stores the management data and an interface unit that connects the inside of the first server machine to the communication line, to power supplied from a built-in battery; the management data is transferred from the interface unit to a second server machine; and the second server machine continues the service requested by the client using the received management data.

6. The method for recovering data when a failure occurs in the clustering system according to claim 5, wherein the first server machine, after switching the power source to the built-in battery, switches the clock signal to a low-frequency clock signal, so that the management data held at the time of the failure is transferred to the second server machine at a lower speed than during normal operation.
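The effect of the clock switch in claims 3 and 6 on transfer speed can be shown with a small worked example. It is a sketch under stated assumptions: the claims give no frequencies, bus widths, or data sizes, so the numbers below are purely illustrative, the class `InterfaceClock` and the function `transfer_seconds` are hypothetical, and the timing model (one bus word moved per clock cycle) is an idealization.

```python
from dataclasses import dataclass


@dataclass
class InterfaceClock:
    """Hypothetical clock selector for the interface unit."""
    normal_hz: float
    low_hz: float
    on_battery: bool = False

    def switch_to_battery(self) -> None:
        # After the power source changes to the battery, the low-frequency clock is used.
        self.on_battery = True

    @property
    def active_hz(self) -> float:
        return self.low_hz if self.on_battery else self.normal_hz


def transfer_seconds(num_bytes: int, bytes_per_cycle: int, clock_hz: float) -> float:
    """Idealized transfer time, assuming one bus word moves per clock cycle."""
    cycles = -(-num_bytes // bytes_per_cycle)  # ceiling division
    return cycles / clock_hz


# Illustrative values only; none of these figures come from the patent.
clock = InterfaceClock(normal_hz=100e6, low_hz=10e6)
data_size = 64 * 1024  # bytes of management data
normal_time = transfer_seconds(data_size, 4, clock.active_hz)
clock.switch_to_battery()
low_time = transfer_seconds(data_size, 4, clock.active_hz)
print(f"normal: {normal_time * 1e3:.3f} ms, on battery: {low_time * 1e3:.3f} ms")
```

Under this model the transfer time scales inversely with the clock frequency, so a tenfold lower clock makes the transfer ten times slower. The claims themselves do not state why the slower clock is used; reduced power draw while running from the battery is one plausible motivation, since CMOS dynamic power grows roughly in proportion to clock frequency.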