JP2023001510A

JP2023001510A - Data management device and data management method

Info

Publication number: JP2023001510A
Application number: JP2021102283A
Authority: JP
Inventors: Akira Takagi; Motonobu Saito; Yoshinori Mochizuki; Naoyuki Takeda
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2021-06-21
Filing date: 2021-06-21
Publication date: 2023-01-06
Anticipated expiration: 2041-06-21
Also published as: JP7636975B2

Abstract

To provide a data management device and a data management method that store the data of each database in an appropriate database according to the characteristics of the data processing. SOLUTION: In a data management system, a data management device and each database device are communicably connected by a communication network. The data management device 10 comprises: a data processing simulation program which analyzes processing target data of a predetermined item acquired from a database to calculate data of the predetermined item to be generated in a predetermined future period, and estimates characteristics of data processing to be executed on a database containing the calculated data; a data arrangement proposal creation program which, on the basis of the estimated characteristics of the data processing, identifies, among databases having a plurality of different data processing performances, the database in which the data of the predetermined item is to be stored; and a data arrangement execution program which causes the identified database to store data including the processing target data. SELECTED DRAWING: Figure 2

Description

The present invention relates to a data management device and a data management method.

With recent advances in database technology, various database models such as key-value and graph models have come to be used in addition to the conventional relational model, depending on the characteristics of the data to be processed. Multi-model databases that can handle these database models in an integrated manner have therefore been developed.

However, it is difficult to migrate a system that operates on a group of existing databases to a new multi-model database. As a technique for handling a multi-model database, Patent Document 1, for example, describes an information processing system that includes a storage device storing a row-oriented database and a column-oriented database converted from the row-oriented database, and a control device controlling the storage device. In that system, the control device divides the records contained in the row-oriented database into a plurality of groups and converts each group into a column group conforming to the format of the column-oriented database.

Patent Document 1: Japanese Patent Application Laid-Open No. 2021-026728

However, a database is not merely a place to store data; it is also the target of various kinds of data processing such as data reference and data aggregation. If the data structure differs, the characteristics of that data processing also differ. In particular, with the spread of IoT (Internet of Things) systems, there are more opportunities to aggregate and analyze large-scale data, and the amount of data to be processed is increasing. The technique described in Patent Document 1 does not provide a mechanism that manages a multi-model database with sufficient consideration of these points.

The present invention has been made in view of this situation, and its object is to provide a data management device and a data management method that allow the data in each database to be managed in an appropriate database according to the characteristics of the data processing.

One aspect of the present invention for solving the above problem is a data management device having a processor and a memory and comprising: a data processing simulation unit that analyzes processing target data of a predetermined item acquired from a database to calculate data of the predetermined item that will be generated in a predetermined future period, and estimates characteristics of data processing executed on a database containing the calculated data; a data arrangement proposal creation unit that, based on the estimated characteristics of the data processing, identifies, from among databases having a plurality of different data processing performances, the database in which the data of the predetermined item is to be stored; and a data arrangement execution unit that stores data including the processing target data in the identified database.

Another aspect of the present invention for solving the above problem is a data management method in which an information processing device executes: data processing simulation processing that analyzes processing target data of a predetermined item acquired from a database to calculate data of the predetermined item that will be generated in a predetermined future period, and estimates characteristics of data processing executed on a database containing the calculated data; data arrangement proposal creation processing that, based on the estimated characteristics of the data processing, identifies, from among databases having a plurality of different data processing performances, the database in which the data of the predetermined item is to be stored; and data arrangement execution processing that stores data including the processing target data in the identified database.

According to the present invention, the data in each database can be stored in an appropriate database according to the characteristics of the data processing.
Problems, configurations, and effects other than those described above will become apparent from the following description of the embodiments.

FIG. 1 is a diagram showing an example of the configuration of the data management system according to the present embodiment.
FIG. 2 is a diagram showing an example of the functions (programs) of the data management device.
FIG. 3 is a diagram showing an example of the data name and location correspondence table.
FIG. 4 is a diagram showing an example of the hardware of each information processing device, namely the data management device and the database devices.
FIG. 5 is a flow diagram illustrating an example of the data collection processing.
FIG. 6 is a flow diagram illustrating the details of the query execution processing.
FIG. 7 is a flow diagram illustrating the processing target data analysis processing.
FIG. 8 is a flow diagram illustrating an example of the data rearrangement processing.
FIG. 9 is a diagram showing an example of the rearrangement proposal screen.
FIG. 10 is a flow diagram illustrating the details of the arrangement plan creation processing.

FIG. 1 is a diagram showing an example of the configuration of a data management system 1 according to the present embodiment. The data management system 1 includes a data management device 10 and a plurality of database devices 20.

Each database device 20 stores a database. The database devices 20 store databases of different types. When the database types differ, the data formats of the data inquiries (queries) issued to those databases and of the query responses to them also differ.

In the present embodiment, it is assumed that there are a database device 20a that stores a relational database, a database device 20b that stores a time-series database, and a database device 20c that stores a column-oriented database.

The data management device 10 issues data processing inquiries (queries) to each of the database devices 20, which store databases of different types, and receives the query responses to those inquiries. Based on the received query responses, the data management device 10 also performs processing that proposes rearranging the database of each database device 20 into another database.

The data management device 10 and the database devices 20 are communicably connected by a communication network 30 such as a wired LAN (Local Area Network), a wireless LAN, the Internet, or a dedicated line.

FIG. 2 is a diagram showing an example of the functions (programs) of the data management device 10.

First, the data management device 10 accepts, from a user, a designation of data processing for the databases with different data processing performances stored in the database devices 20 (hereinafter, the database group). For this data processing, the data management device 10 handles the database group in a unified manner (in the present embodiment, in such a way that the group is treated as a single relational database).

Specifically, the data management device 10 stores the following programs: a data processing content input program 201, a data processing content management program 202, a query transmission program 2035, a query result format conversion program 208, and a query result output program 209.

The data processing content input program 201 accepts, from the user, input of the data to be processed on the database group (processing target data).

The data processing content management program 202 stores information on the data processing performed on the data of each item in the database group, including the processing target data. This information is, for example, the type of data processing (data aggregation, data reference, data update, and so on), the range of data subject to the data processing, or the execution cycle of the data processing.

The query transmission program 2035 includes a query generation program 203, a query analysis program 204, a data management program 205, and a query format conversion program 207.

The query generation program 203 generates a command (query) in a predetermined format, to be issued to each database device 20, concerning the processing target data stored by the data processing content management program 202. A query is, for example, a command that requests each database device 20 to return the data of a data item (for example, the data of a given row, the data of a given column, or given time-series data).

The query analysis program 204 analyzes the query generated by the query generation program 203 and identifies the location of the data that the query processes (for example, the network location of each database device 20). These data locations are stored in the data name and location correspondence table 206 held by the table management program 205.

(Data name and location correspondence table)
FIG. 3 is a diagram showing an example of the data name and location correspondence table 206. The table has the following data items: a data name 2061, which holds information identifying the data item (on the database) for each piece of processing target data (for example, the name of the data item); a location 2062, which holds information identifying the location of the database that holds that item's data (for example, the network location of the database device 20); and a type 2063, which holds the type of that database. The type 2063 is set to, for example, a row-oriented relational database (RDB), a column-oriented DB, a row-oriented DB, or a time-series DB.
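As a minimal sketch of how such a table might be represented and consulted, the following Python fragment models each record with the three items described above. The item names, addresses, and type labels are hypothetical values chosen for illustration and are not taken from FIG. 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationRecord:
    data_name: str  # data name 2061: identifies the data item
    location: str   # location 2062: where the database holding the item resides
    db_type: str    # type 2063: kind of database (for example RDB, time-series DB)


# Hypothetical table contents; all names and addresses are assumptions.
TABLE_206 = [
    LocationRecord("customer_master", "192.0.2.10:5432", "RDB"),
    LocationRecord("sensor_temperature", "192.0.2.11:5432", "time-series DB"),
    LocationRecord("sales_history", "192.0.2.12:5432", "column-oriented DB"),
]


def find_location(data_name: str) -> LocationRecord:
    """Return the record for a data item, or raise KeyError if none is registered."""
    for record in TABLE_206:
        if record.data_name == data_name:
            return record
    raise KeyError(f"no location registered for data item {data_name!r}")
```

A lookup such as `find_location("sensor_temperature")` then yields both the location and the database type needed to route and convert a query.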

Next, as shown in FIG. 2, the query format conversion program 207 generates, based on the data locations and other information identified by the query analysis program 204, a converted query for each database device 20 (database) that is to process the query, and transmits each generated query to the corresponding database device 20.

Next, the query result format conversion program 208 converts the response to each query (query response) received from each database device 20 into the predetermined format (the same format as that used by the query generation program 203).

The query result output program 209 outputs the content of the query response converted by the query result format conversion program 208 to a file or displays it on a screen.

Next, the data management device 10 analyzes trends in the data of each database device 20 contained in the query responses (the processing target data of each data item). Based on that analysis, it proposes to the user which database device 20 should store the data of each item, including the processing target data, and executes the proposal; in other words, it proposes and executes a data rearrangement.

Specifically, the data management device 10 stores the following programs: a data processing target analysis program 210, a database candidate management program 211, a data processing simulation program 212, a data rearrangement proposal creation program 213, a data arrangement proposal output program 214, and a data arrangement execution program 215.

The data processing target analysis program 210 analyzes the data processing indicated by the data processing content management program 202, based on the processing target data acquired from each database device 20. For example, it calculates, for each database device 20 holding processing target data, the range of data updated, the data update frequency, the time required for data updates, or the execution cycle of data updates.

The database candidate management program 211 stores a template for each type of database in the database group (hereinafter, a database candidate).

Based on the analysis results of the data processing target analysis program 210, the data processing simulation program 212 calculates (predicts) the data (dummy data) of each data item of each database device 20 that will be generated in a predetermined future period specified by the user. The data processing simulation program 212 then estimates the characteristics of the data processing executed on a database containing the calculated data.

Specifically, the data processing simulation program 212 creates, from the database candidates, trial databases that store the dummy data, and estimates the characteristics of the data processing by executing the data processing on the trial databases it created.

The estimated characteristics of the data processing are, for example, the time required for the data processing, the execution frequency of the data processing, or the update frequency of the data subject to the data processing, in the predetermined future period.

Based on the execution results of the data processing simulation program 212, the data rearrangement proposal creation program 213 creates information (a database arrangement plan) indicating which of the database devices 20 (databases) should store the data of the data items associated with the processing target data.

In the present embodiment, the data rearrangement proposal creation program 213 creates the database arrangement plan according to the magnitude of the parameter values that represent the data update processing performance, the data compression performance, and the data aggregation performance for the data of the trial databases, based on the data processing characteristics estimated by the data processing simulation program 212.

The data arrangement proposal output program 214 displays the arrangement plans created by the data rearrangement proposal creation program 213 on a screen.

The data arrangement execution program 215 executes the arrangement plan that the user selects from those displayed by the data arrangement proposal output program 214, and stores the data of the data items associated with the processing target data in the database indicated by that plan.

The data management device 10 further stores a communication program 216. The communication program 216 controls data transmission and reception with each database device 20.

FIG. 4 is a diagram showing an example of the hardware of each information processing device, namely the data management device 10 and the database devices 20. Each information processing device includes: a processor 101 such as a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a GPU (Graphics Processing Unit); a memory 102 such as a ROM (Read Only Memory) or a RAM (Random Access Memory); a storage device 103 such as a hard disk drive (HDD), a flash memory, or an SSD (Solid State Drive); an input device 104 such as a keyboard, a mouse, a card reader, or a touch panel; an output device 105 such as a liquid crystal display (LCD), an audio output device (speaker), or a printing device; and a communication device 106 such as a network interface card (NIC), a wireless communication module, a USB (Universal Serial Bus) module, or a serial communication module. Each information processing device may be a computer with the physical hardware described above, or a unit of computing obtained by logically partitioning a physical computer (a virtual server). Each information processing device may also be a task (a process or a container) executed on a cluster of one or more computers.

The functions of each information processing device are realized by the hardware of that device, or by the processor 101 of that device reading and executing a program stored in the memory 102 or the storage device 103. These programs can be distributed, for example, by being recorded on a recording medium.

Next, the processing performed in the data management system 1 will be described.
The data management system 1 executes data collection processing S1, which acquires data from the database group of the database devices 20; processing target data analysis processing S301, which analyzes the acquired data; and rearrangement processing S2, which determines, based on the analysis results, to which of the database devices 20 each database in the database group should be rearranged.
First, the data collection processing S1 will be described.

<Data collection processing>
FIG. 5 is a flow diagram illustrating an example of the data collection processing S1. The data collection processing S1 is started, for example, when a predetermined input is received from the user, at a predetermined timing (for example, at a predetermined time or at predetermined time intervals), or when a database of a database device 20 is updated.

The data processing content input program 201 of the data management device 10 accepts, from the user, input of the data processing to be performed on the database group (S101).

That is, the data processing content input program 201 accepts input of the type of data processing. For example, it accepts a designation of data reference or data aggregation.

The data processing content input program 201 also accepts input of the method and content of the data processing. For example, it accepts input of the database on which the data processing is performed, the range within that database on which the data processing is performed, the cycle at which the data actually subject to processing appears within that range (for example, the data aggregation cycle), the execution cycle of the data processing, or the function used for the data processing (for example, a function that calculates an average or a total).

The data processing content management program 202 stores the information entered through the data processing content input program 201 (S102).

The query generation program 203 generates a query that implements the data processing entered in S101 (S103). This query is a command in a predetermined format (hereinafter, the standard format), based on, for example, SQL (Structured Query Language).
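The following Python sketch illustrates one way a standard-format SQL query could be assembled from the data processing content entered in S101. The `DataProcessing` fields, the `ts` timestamp column, and the mapping from function names to SQL aggregates are all assumptions made for illustration; the source does not specify the query layout.

```python
from dataclasses import dataclass

# Aggregate functions this sketch accepts; the mapping to SQL is an assumption.
SQL_FUNCTIONS = {"average": "AVG", "total": "SUM"}


@dataclass
class DataProcessing:
    item: str      # data item to aggregate (hypothetical name)
    source: str    # logical table name in the unified relational view
    function: str  # "average" or "total"
    start: str     # inclusive lower bound of the data range
    end: str       # exclusive upper bound of the data range


def generate_standard_query(spec: DataProcessing) -> str:
    """Build a standard-format (SQL) query for an aggregation request.

    Values are interpolated directly to keep the sketch short; a real
    implementation should validate identifiers and use bound parameters.
    """
    sql_function = SQL_FUNCTIONS[spec.function]
    return (
        f"SELECT {sql_function}({spec.item}) FROM {spec.source} "
        f"WHERE ts >= '{spec.start}' AND ts < '{spec.end}'"
    )
```

Keeping the request in a small structure like this makes it straightforward to store the same content (as in S102) and regenerate the query on each execution cycle.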

Based on the query generated in S103, the data management device 10 performs query execution processing, which executes the data processing on each database device 20 and displays the content of the query responses received from the database devices 20 on a screen or outputs it to a file (S104). When execution conditions such as an execution cycle were entered in S101, the data management device 10 executes the query execution processing S104 under those execution conditions, as stored in S102. The query execution processing S104 is described in detail next.
This completes the data collection processing S1.

<Query execution processing>
FIG. 6 is a flow diagram illustrating the details of the query execution processing S104. The query analysis program 204 first analyzes the query generated and transmitted in S103 and identifies the processing target data (specifically, the data item names) that the query indicates (S201).

The query analysis program 204 identifies the location (database) of each piece of processing target data identified in S201 (S202). For example, it takes the data item name of each piece of processing target data identified in S201 and looks it up in the records of the data name and location correspondence table 206 to identify the database device 20 (database) that stores the data of that item.

According to the type of each database identified in S202 (for example, the type 2063 in the data name and location correspondence table 206), the query format conversion program 207 converts the query transmitted in S103 into a query in the format of each database. The query format conversion program 207 then transmits each converted query to the corresponding database device 20 (S203).

Each database device 20 that receives a query performs the data processing the query indicates and transmits the result (a query response containing the processing target data) to the data management device 10.

The query result format conversion program 208 converts the data format of the query responses received from the database devices 20 into the standard format (S204).

The query result output program 209 displays the content of the query responses produced by the conversion in S204 on a screen or outputs it to a predetermined file (S205). This completes the query execution processing S104.
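The flow S201 to S205 can be sketched in Python as below. This is only an illustration under stated assumptions: the item-to-database mapping, the native request and response shapes for each database type, and the stubbed `send` function are all hypothetical, since the source does not define the per-database formats.

```python
# Hypothetical item-to-database mapping (stands in for table 206).
LOCATIONS = {
    "temperature": ("db-b.example", "time-series DB"),
    "customer_name": ("db-a.example", "RDB"),
}

# Hypothetical converters from the standard form to a native request per type.
TO_NATIVE = {
    "RDB": lambda item: {"sql": f"SELECT {item} FROM unified_view"},
    "time-series DB": lambda item: {"series": item, "op": "read"},
}


def send(host: str, native_query: dict) -> dict:
    """Stub for the network round trip; returns a canned native-format response."""
    if "sql" in native_query:
        return {"rows": [["Alice"], ["Bob"]]}
    return {"points": [[0, 20.5], [1, 21.0]]}


def to_standard(db_type: str, response: dict) -> list:
    """S204: normalise each native response into a plain list of values."""
    if db_type == "RDB":
        return [row[0] for row in response["rows"]]
    return [value for _, value in response["points"]]


def execute(items: list) -> dict:
    results = {}
    for item in items:                                  # S201: items named by the query
        host, db_type = LOCATIONS[item]                 # S202: locate each item
        native = TO_NATIVE[db_type](item)               # S203: convert per database type
        response = send(host, native)                   # S203: transmit to that database
        results[item] = to_standard(db_type, response)  # S204: back to the standard format
    return results                                      # S205 would display or save this
```

Dispatching on the database type through a lookup table keeps the conversion logic for each type separate, so supporting another type would only require one more converter pair.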

Next, the processing target data analysis processing S301 will be described.
<Processing target data analysis processing>
FIG. 7 is a flow diagram illustrating the processing target data analysis processing S301. This processing is executed repeatedly: when a predetermined input is received from the user, when the data management device 10 receives processing target data from a database device 20, or at a predetermined timing (for example, at a predetermined time or at predetermined time intervals).

That is, the data processing target analysis program 210 analyzes the processing target data that the data management device 10 has accumulated so far.

For example, the data processing target analysis program 210 calculates the update frequency of the data items or databases associated with the processing target data. It may also calculate a distribution function representing the distribution of the values of the processing target data, thereby obtaining the distribution of the values of the data being updated. It may further analyze the trend of time-series changes in the values of the processing target data and calculate a prediction function representing that trend.
This completes the processing target data analysis processing S301.
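A minimal Python sketch of the three analyses mentioned above follows. It assumes, purely for illustration, that the distribution is summarised by a mean and standard deviation and that the prediction function is a least-squares straight line; the source does not state which distribution or prediction model is used.

```python
from statistics import mean, pstdev


def update_frequency(timestamps):
    """Average number of updates per unit time over the observed span."""
    span = max(timestamps) - min(timestamps)
    return (len(timestamps) - 1) / span


def value_distribution(values):
    """Summarise the value distribution as (mean, population standard deviation)."""
    return mean(values), pstdev(values)


def linear_trend(timestamps, values):
    """Fit a least-squares line and return it as a prediction function f(t)."""
    t_bar, v_bar = mean(timestamps), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in timestamps)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(timestamps, values))
    slope = sxy / sxx
    intercept = v_bar - slope * t_bar
    return lambda t: slope * t + intercept
```

For accumulated samples such as timestamps `[0, 1, 2, 3, 4]` with values `[1, 3, 5, 7, 9]`, these functions give an update frequency of one per time unit and a prediction function that extrapolates the same linear trend.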

<Data rearrangement processing>
FIG. 8 is a flow diagram illustrating an example of the data rearrangement processing S2. The data rearrangement processing S2 is executed, for example, when a predetermined input is received from the user, or at a predetermined timing (for example, at a predetermined time or at predetermined time intervals).

First, the data processing simulation program 212 sets a future period that serves as the basis for deciding on data rearrangement (hereinafter, the user-set period; for example, three years from now onward). The user-set period may be set by user input, or the data processing simulation program 212 may set it automatically. The user-set period may be a single continuous period or a plurality of time slots that repeat periodically.

Then, based on the analysis result of the processing target data analysis process S301, the data processing simulation program 212 estimates the data content of the database of each database device 20 in the user-set period, that is, it creates dummy data (S401). For example, based on the update frequency, prediction function, distribution function, and the like calculated in S301, the data processing simulation program 212 calculates each data value, or the variation of the data values, that should be set in the database of each database device 20 in the user-set period.
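A minimal sketch of the dummy data creation in S401 is shown below. It assumes that a prediction function, a noise level taken from the value distribution, and an update frequency are already available from S301; the helper name `make_dummy_data`, its parameters, and the choice of Gaussian noise are hypothetical and are not specified by the embodiment.

```python
import random

def make_dummy_data(predict, noise_stdev, update_frequency, period_start, period_end, seed=0):
    """Generate (timestamp, value) rows for a future period (hypothetical helper)."""
    rng = random.Random(seed)           # seeded so that repeated runs are reproducible
    interval = 1.0 / update_frequency   # expected time between updates
    rows = []
    t = period_start
    while t < period_end:
        # Trend from the prediction function plus noise standing in for the value distribution.
        rows.append((t, predict(t) + rng.gauss(0.0, noise_stdev)))
        t += interval
    return rows
```

For example, `make_dummy_data(lambda t: 2 * t, 0.0, 0.5, 0.0, 10.0)` produces one row every two time units across the period, with values that follow the trend exactly because the noise level is zero.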

The data processing simulation program 212 acquires the database candidates by calling the database candidate management program 211, and creates an empty database (trial database) for each acquired database candidate. The data processing simulation program 212 then loads the data estimated in S401 (the dummy data) into each created trial database (S402).

The data processing simulation program 212 executes the data processing set in S101 on each trial database populated in S402 (S403). In doing so, the data processing simulation program 212 measures the processing time required for each executed data processing. In addition, for the data processing executed on each trial database, the data processing simulation program 212 stores the execution frequency of the data processing (treated as the execution cycle in this embodiment) and the update frequency of the data targeted by the data processing.
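Steps S402 and S403 can be illustrated with a small sketch that loads dummy rows into an empty trial database, runs one query, and measures the elapsed time. The sketch uses an in-memory SQLite database from the Python standard library as a stand-in for a single database candidate; the function name `run_trial`, the table name `readings`, and its two columns are assumptions made for the example.

```python
import sqlite3
import time

def run_trial(rows, query):
    """Load dummy rows into an empty trial database, run one query, and time it."""
    conn = sqlite3.connect(":memory:")            # empty trial database (S402)
    conn.execute("CREATE TABLE readings (ts REAL, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    start = time.perf_counter()
    result = conn.execute(query).fetchall()       # the data processing under test (S403)
    elapsed = time.perf_counter() - start         # processing time in seconds
    conn.close()
    return result, elapsed
```

In the embodiment the same data processing would be run against one trial database per candidate (for example a time-series DB, a column-oriented DB, and an RDB), so that the measured times can be compared across candidates.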

The data relocation proposal creation program 213 executes a placement plan creation process that creates a placement plan based on the processing time, execution frequency, and update frequency obtained in S403 (S404). Details of the placement plan creation process S404 are described later.

The data placement proposal output program 214 displays the placement plan created in the placement plan creation process S404 on a relocation proposal screen, which also accepts input from the user as to whether or not to execute the displayed placement plan (S405).

(Relocation proposal screen)
FIG. 9 is a diagram showing an example of the relocation proposal screen 300. The relocation proposal screen 300 includes a data name display field 301 that lists the data items related to the processing target data, a relocation destination display field 302 that shows the database (database device) in which the data of each data item is to be placed, and an execution confirmation field 303 for specifying whether or not to execute the relocation to that destination.

Then, as shown in FIG. 8, the data placement execution program 215 executes the placement plan created in S404 and displayed in S405 for the processing target data for which the user has entered an instruction to execute the plan (S406). Specifically, the data placement execution program 215 transmits the data of the data items related to each piece of processing target data from the current database device 20 to the destination database device 20, converting the data format in the process. This completes the data relocation process S2.
Here, the details of the placement plan creation process S404 are described.

FIG. 10 is a flowchart illustrating the details of the placement plan creation process S404.
The data relocation proposal creation program 213 determines whether or not the processing time for the processing target data measured in S403 is greater than a predetermined threshold (S501). If the processing time is greater than the predetermined threshold (S501: YES), the data relocation proposal creation program 213 executes the process of S502; if it is not (S501: NO), the data relocation proposal creation program 213 executes the process of S503.

In S502, the data relocation proposal creation program 213 displays predetermined warning information on the screen, and the placement plan creation process S404 ends.

In S503, the data relocation proposal creation program 213 determines whether or not the update frequency of the processing target data (a parameter value representing data update processing performance) is higher than a predetermined threshold. If the update frequency is higher than the predetermined threshold (S503: YES), the data relocation proposal creation program 213 executes the process of S507; if it is not (S503: NO), the data relocation proposal creation program 213 executes the process of S504.

In S504, the data relocation proposal creation program 213 determines, for the processing target data, whether or not the ratio of the data processing time on the time-series DB (that is, for time-series data) calculated in S403 to the execution cycle calculated in S403 (a parameter value representing data compression performance) is higher than a predetermined threshold.

If the ratio of the data processing time to the execution cycle is not higher than the predetermined threshold, that is, if the data compression performance is high (S504: NO), the data relocation proposal creation program 213 generates information proposing that the processing target data be placed in the database device 20 having a time-series DB (S505), and the placement plan creation process S404 ends.

If the ratio of the data processing time to the execution cycle is higher than the predetermined threshold (S504: YES), the data relocation proposal creation program 213 generates information proposing that the processing target data be placed in the database device 20 having a column-oriented DB (S506), and the placement plan creation process S404 ends.

Note that, in S506, the data relocation proposal creation program 213 may determine, for the processing target data, whether or not the ratio of the data processing time on the column-oriented DB calculated in S403 to the execution cycle calculated in S403 (a parameter value representing data aggregation performance) is lower than a predetermined threshold, and may generate the information proposing placement in the database device 20 having a column-oriented DB only when that ratio is lower than the predetermined threshold (that is, when the data aggregation performance is high).

Next, in S507, the data relocation proposal creation program 213 determines, for the processing target data, whether or not the ratio of the row-wise data processing time on the RDB calculated in S403 to the execution cycle calculated in S403 (a parameter representing the data update function) is higher than a predetermined threshold.

If the ratio of the data processing time to the execution cycle is higher than the predetermined threshold (S507: YES), the data relocation proposal creation program 213 generates information proposing that the processing target data be placed in the database device 20 having a row-oriented DB (that is, the database device 20 having an RDB) but that its data processing be performed using a column-oriented DB (S509), and the placement plan creation process S404 ends.

If the ratio of the data processing time to the execution cycle is equal to or less than the predetermined threshold (S507: NO), the data relocation proposal creation program 213 generates information proposing that the processing target data be placed in the database device 20 having a row-oriented DB (that is, the database device 20 having an RDB) (S508), and the placement plan creation process S404 ends.
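The branching of FIG. 10 (S501 to S509) can be summarized as the following sketch. The class names, field names, return strings, and the idea of passing all thresholds in one object are hypothetical; the text only states that each comparison uses "a predetermined threshold" and does not give concrete values.

```python
from dataclasses import dataclass

@dataclass
class Measurements:
    processing_time: float    # processing time measured in S403
    update_frequency: float   # update frequency of the processing target data
    execution_cycle: float    # execution cycle of the data processing
    tsdb_time: float          # processing time on the time-series DB
    rdb_time: float           # row-wise processing time on the RDB

@dataclass
class Thresholds:
    processing_time: float
    update_frequency: float
    tsdb_ratio: float
    rdb_ratio: float

def propose_placement(m, th):
    """Return a placement proposal following steps S501 to S509."""
    if m.processing_time > th.processing_time:              # S501: YES
        return "warning"                                    # S502
    if m.update_frequency > th.update_frequency:            # S503: YES
        if m.rdb_time / m.execution_cycle > th.rdb_ratio:   # S507: YES
            return "store in row-oriented DB, process with column-oriented DB"  # S509
        return "row-oriented DB"                            # S508
    if m.tsdb_time / m.execution_cycle > th.tsdb_ratio:     # S504: YES
        return "column-oriented DB"                         # S506
    return "time-series DB"                                 # S505
```

The optional extra check on data aggregation performance described for the column-oriented branch is omitted here to keep the sketch aligned with the main flow.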

As described above, the data management device 10 of the present embodiment analyzes the processing target data of a predetermined data item acquired from a database to calculate the data of that data item that will be generated in a future period, estimates the characteristics of the data processing executed in a database containing the calculated data, identifies, based on the estimated characteristics, the database among the group of databases in which the data of that data item is to be stored, and stores data including the processing target data in the identified database.

That is, the data management device 10 of the present embodiment estimates the characteristics of data processing by predicting future data processing for the processing target data and creating dummy data, and identifies the database in which the data should be placed based on the estimation result. As a result, the data in each database can be managed by a database suited to the characteristics of its data processing.

In particular, the data management device 10 of the present embodiment creates trial databases storing the dummy data and estimates the characteristics of data processing by executing the data processing on the created trial databases. Because the databases are actually created, the characteristics of data processing can be estimated accurately.

In addition, based on the data processing time, the execution frequency of data processing, and the data update frequency in the future period, the data management device 10 of the present embodiment calculates parameter values representing the data update processing performance, data compression performance, and data aggregation performance for the dummy data, and identifies the database accordingly. As a result, an appropriate database can be identified according to the data processing performance characteristics of each database.

For example, the data management device 10 of the present embodiment estimates the data update frequency in the future period and, when the parameter value representing the data update frequency exceeds a predetermined threshold, identifies a row-oriented database (such as an RDB). This makes it possible to exploit the high update performance of row-oriented databases.

Furthermore, in this case, when the parameter value representing the data update frequency exceeds its predetermined threshold and the parameter value representing the time required for row-wise data processing in the future period also exceeds its predetermined threshold, the data management device 10 of the present embodiment specifies a row-oriented database as the storage destination and a column-oriented database for the data processing. This makes it possible to exploit the advantages of both the row-oriented database, which has high update performance, and the column-oriented database, which has high column-wise data processing capability.

In addition, the data management device 10 of the present embodiment identifies a time-series database when the parameter value representing the time required for data processing of time-series data in the future period does not exceed a predetermined threshold. This makes it possible to exploit the strong data compression performance of time-series databases.

In addition, the data management device 10 of the present embodiment identifies a column-oriented database when the parameter value representing the time required for column-wise data processing in the future period does not exceed a predetermined threshold. This makes it possible to exploit the strong column-wise data aggregation performance of column-oriented databases (for example, RDB).

In addition, by displaying information about the database identified as the relocation destination on the screen, the data management device 10 of the present embodiment allows the user to confirm whether or not to perform the relocation.

In addition, the data management device 10 of the present embodiment converts a query in the standard format to generate a query suited to each database in the group, transmits the generated queries to the databases, receives response data from them, and converts each piece of received response data into the standard format. The data management device 10 then acquires each piece of converted response data as processing target data.

As a result, even when databases supporting different formats coexist, data in a unified standard format can be obtained easily regardless of those formats, and the destination database can be identified quickly.
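The round trip of converting a standard-format query into backend-specific queries and converting the responses back can be sketched as below. Everything in this example is hypothetical: the standard format is reduced to a table and column pair rather than full SQL, the two backend kinds (`"rdb"` and `"key_value"`) and their request and response shapes are invented for illustration, and no real database client is involved.

```python
def to_backend_query(standard_query, backend):
    """Translate a standard-format request into a backend-specific one (illustrative only)."""
    table, column = standard_query["table"], standard_query["column"]
    if backend == "rdb":
        return f"SELECT {column} FROM {table}"
    if backend == "key_value":
        # Assumed key layout "table:column:id" for the hypothetical key-value store.
        return {"op": "scan", "prefix": f"{table}:{column}:"}
    raise ValueError(f"unsupported backend: {backend}")

def to_standard_rows(response, backend, column):
    """Convert a backend response into a uniform list of {column: value} rows."""
    if backend == "rdb":
        return [{column: row[0]} for row in response]               # list of 1-tuples
    if backend == "key_value":
        return [{column: response[key]} for key in sorted(response)]  # dict of key -> value
    raise ValueError(f"unsupported backend: {backend}")
```

Keeping the two conversions as separate functions mirrors the split between query conversion and response conversion in the text, so a new database type only needs one additional branch in each function.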

The present invention is not limited to the above embodiment and can be implemented using arbitrary constituent elements without departing from its gist. The embodiment and modifications described above are merely examples, and the present invention is not limited to their content as long as the features of the invention are not impaired. Other aspects conceivable within the scope of the technical idea of the present invention are also included in the scope of the present invention.

For example, some of the functions of each device in the present embodiment may be provided in another device, and functions provided in separate devices may be provided in the same device.

The types of databases managed by the data management device 10 are not limited to those described in the present embodiment; other types of databases may be used.

The present embodiment assumes that one type of database is stored in each database device 20. However, a plurality of types of databases may be stored in one database device 20, with the data management device 10 accessing each database of that database device 20.

Although the data management device 10 calculates the data update processing performance, data compression performance, and data aggregation performance based on the data processing time, the data update time, and the data update cycle, these performances may be calculated using other parameters.

The data management device 10 may also calculate performance other than the data update processing performance, data compression performance, and data aggregation performance.

Although the data management device 10 adopts the SQL format as the standard format, other formats may be used.

1 data management system, 10 data management device, 20 database device, 212 data processing simulation program, 213 data relocation proposal creation program, 215 data placement execution program

Claims

1. A data management device having a processor and a memory, the data management device comprising:
a data processing simulation unit that analyzes processing target data of a predetermined item acquired from a database to calculate data of the predetermined item that will be generated in a predetermined future period, and estimates characteristics of data processing executed in a database containing the calculated data;
a data placement proposal creation unit that identifies, from among a plurality of databases having different data processing performances, a database in which the data of the predetermined item is to be stored, based on the estimated characteristics of data processing; and
a data placement execution unit that stores data including the processing target data in the identified database.

2. The data management device according to claim 1, wherein the data processing simulation unit creates a trial database storing the calculated data and estimates the characteristics of the data processing by executing data processing on the created trial database.

3. The data management device according to claim 1, wherein
the data processing simulation unit estimates, as the characteristics of the data processing, at least one of the time required for data processing, the execution frequency of data processing, and the data update frequency in the predetermined future period, and
the data placement proposal creation unit calculates, based on the estimated characteristics of data processing, a parameter value representing at least one of data update processing performance, data compression performance, and data aggregation performance for the calculated data, and identifies the database in which the data of the predetermined item is to be stored according to the calculated parameter value.

4. The data management device according to claim 3, wherein
the data processing simulation unit estimates, as a characteristic of the data processing, the data update frequency in the predetermined future period, and
the data placement proposal creation unit determines whether or not the parameter value representing the estimated data update frequency exceeds a predetermined threshold and, if the parameter value exceeds the threshold, identifies a row-oriented database as the database in which the data of the predetermined item is to be stored.

5. The data management device according to claim 4, wherein
the data processing simulation unit estimates, as characteristics of the data processing, the data update frequency and the time required for row-wise data processing in the predetermined future period, and
the data placement proposal creation unit determines whether or not the parameter value representing the estimated data update frequency exceeds a predetermined threshold, determines, if that parameter value exceeds the threshold, whether or not the parameter value representing the time required for row-wise data processing exceeds a predetermined threshold, and, if that parameter value also exceeds its threshold, identifies a row-oriented database as the database in which the data of the predetermined item is to be stored and identifies a column-oriented database as the database used to perform data processing on the data of the predetermined item.

6. The data management device according to claim 3, wherein
the data processing simulation unit estimates, as a characteristic of the data processing, the time required for data processing of time-series data in the predetermined future period, and
the data placement proposal creation unit determines whether or not the parameter value representing the estimated time required for data processing exceeds a predetermined threshold and, if the parameter value does not exceed the threshold, identifies a time-series database as the database in which the data of the predetermined item is to be stored.

7. The data management device according to claim 3, wherein
the data processing simulation unit estimates, as a characteristic of the data processing, the time required for column-wise data processing in the predetermined future period, and
the data placement proposal creation unit determines whether or not the parameter value representing the estimated time required for data processing exceeds a predetermined threshold and, if the parameter value does not exceed the threshold, identifies a column-oriented database as the database in which the data of the predetermined item is to be stored.

8. The data management device according to claim 1, further comprising a data placement proposal output unit that displays information about the identified database on a screen.

9. The data management device according to claim 1, further comprising:
a query transmission unit that creates a query in a predetermined format, performs a predetermined conversion on the created query to generate a query corresponding to each of a plurality of databases having different data processing performances, and transmits each generated query to the corresponding database, thereby receiving response data for each transmitted query from the plurality of databases; and
a query result format conversion unit that converts each piece of received response data into the predetermined format,
wherein the data processing simulation unit acquires each piece of converted response data as the processing target data of the predetermined item.

10. A data management method in which an information processing device executes:
a data processing simulation process of analyzing processing target data of a predetermined item acquired from a database to calculate data of the predetermined item that will be generated in a predetermined future period, and estimating characteristics of data processing executed in a database containing the calculated data;
a data placement proposal creation process of identifying, from among a plurality of databases having different data processing performances, a database in which the data of the predetermined item is to be stored, based on the estimated characteristics of data processing; and
a data placement execution process of storing data including the processing target data in the identified database.