US20050160087A1 - Data extractor and method of data extraction - Google Patents
Data extractor and method of data extraction
- Publication number
- US20050160087A1 US11/019,127
- Authority
- US
- United States
- Prior art keywords
- data
- extraction
- page
- reading
- database
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2358—Change logging, detection, and notification
Definitions
- the present invention relates to a data extractor and a method of data extraction in which a data extraction process of reading a plurality of data stored in a database successively is performed with a short process exclusion time for the database.
- each terminal may read and write desired data stored in the database.
- exclusive control prevents the plurality of terminals from accessing the same data simultaneously.
- another terminal is kept in a standby state by not allowing access to the data till the access by the first terminal is complete, thereby preventing attempts at simultaneous updating of the data.
- creating a state of not allowing another terminal access to predetermined data is called acquisition of process exclusion of the data.
- the contents of data at a predetermined time are read and stored.
- the time required for extraction increases with the amount of data. Therefore, if access to the data subjected to extraction is allowed during the data extraction, a part of that data may be updated.
- the value of each data is then the value at the time the data is read, not the value at the time the data extraction began. As a result, the data as of the start of the data extraction may not be obtainable.
- if the data extracted are correlated, there is a risk of mismatching of correlation between the data.
- the process exclusion of all the data subjected to extraction is acquired from the beginning till the end of the data extraction, and a value of each data at the starting point of data extraction is read.
- the process exclusion time for the data extraction is longer.
- a response expected for updating of the database is a few hundreds of milliseconds.
- the process exclusion time necessary for the data extraction is far longer than this value, and greatly degrades the update response of the database.
- a method of data extraction includes reading successively a plurality of data stored in a database; acquiring update contents of the data as update history, if there is an update of the data in the database during a period from a start of the reading to an end of the reading; and overwriting contents of the plurality of the data read with the contents at a time of the start of the reading, based on the update history acquired.
- a data extractor that successively reads a plurality of data stored in a database.
- the data extractor includes an update history acquiring unit that acquires update contents of the data as update history, if there is an update of the data in the database during a period from a start of reading of the data to an end of reading of the data; and an overwriting unit that overwrites contents of the plurality of the data read with the contents at a time of the start of the reading of the data, based on the update history acquired.
- FIG. 1 illustrates a concept of a method of data extraction according to an embodiment
- FIG. 2 is a block diagram of a data extractor
- FIG. 3 illustrates an extraction of page A 1 from a database
- FIG. 4 illustrates an extraction of page B 1 from the database
- FIG. 5 illustrates an extraction of page C 2 from the database
- FIG. 6 is a flowchart of a process procedure of data extraction in a data extractor
- FIG. 7 is a flowchart of a process procedure of restoring executed by a data restoring section.
- FIG. 8 is a flowchart of a process procedure of conversion to a data format executed by a format converter.
- FIG. 1 illustrates a concept of a method of data extraction according to an embodiment.
- a database 2 is connected to a network 3 via a data extractor 1 .
- a database 5 is connected to the network 3 via a data extractor 4 .
- a terminal 6 is connected to the network 3 .
- the database 2 stores data constellations 21 and 22 .
- the data management in the database 2 is performed using an input-output unit called a page.
- the data constellation 21 includes pages A 1 , B 1 , and C 1
- the data constellation 22 includes pages H 1 , I 1 , and J 1 .
- the contents of data are stored in each page using a unit called a record.
- data is input to and output from the database from the outside in units of records.
- the terminal 6 accesses the data extractor 1 via the network 3 . Based on the access from the terminal 6 , an updating processor 11 in the data extractor 1 acquires a process exclusion of desired data through an exclusive controller 13 , thereby reading and writing data. Because this data updating is performed in units of records, when the updating processor 11 accesses a certain record, the exclusive controller 13 acquires a process exclusion of a page that stores this record.
- the extraction processor 12 acquires, through the exclusive controller 13, a process exclusion of the data to be extracted, and the data extraction is started.
- the extraction processor 12 acquires process exclusion one after another for the pages A 1 , B 1 , and C 1 , which are included in the data constellation 21 .
- the extraction processor 12 first acquires the process exclusion only for the page A 1 , terminates process exclusion of the page A 1 after reading records on the page A 1 , and then acquires a process exclusion for the page B 1 .
- the data extractor 1 monitors the updating operation performed by the updating processor 11 , and stores the changes as updated log data, when the database is updated.
- the data extractor 1 uses the updated log data to revert the data extracted to a value at a starting point of the extraction, and can thus acquire a value of the data constellation 21 at the starting point of the extraction.
- FIG. 2 is a block diagram of a data extractor.
- the data extractor 1 includes an input-output processor 10 , an extraction controller 14 , a log-data acquisition section 15 , an extraction-data storage 16 , a log-data storage 17 , a data restoring section 18 , and a format converter 19 .
- the extraction processor 12 includes a buffer memory for extraction 12 a.
- the input-output processor 10 receives an access to the database 2 via the network 3 . Upon receiving an access requesting updating of data stored in the database 2 , the input-output processor 10 outputs the access received to the updating processor 11 . Moreover, upon receiving an access requesting the extraction of data stored in the database 2 , the input-output processor 10 outputs the access received to the extraction controller 14 .
- the updating processor 11 acquires process exclusion for the page that stores the data to be updated, and updates the data.
- when the input-output processor 10 receives an access requesting the data extraction, the extraction controller 14 outputs a command instructing the start of acquisition of the updated log data to the log-data acquisition section 15, and a command instructing the start of data extraction to the extraction processor 12.
- the extraction processor 12 receives the command from the extraction controller 14 , and starts the extraction of the data from the database 2 . At this time, the extraction processor 12 performs data extraction by acquiring the process exclusion one after another for the pages in the data constellation that is extracted. Further, the extraction processor 12 extracts the page as a page image, and stores the page in the extraction-data storage 16 .
- the log-data acquisition section 15 receives the command from the extraction controller 14 , and starts monitoring the updating processor 11 . During the monitoring, if the updating processor 11 updates the database 2 , the log-data acquisition section 15 stores the contents of updating by the updating processor 11 as updated log data in the log-data storage 17 . Updated data and the contents of updating are recorded in the updated log data.
- the data restoring section 18 restores the data based on the page image stored in the extraction-data storage 16 and the updated log data stored in the log-data storage 17 .
- Restoration of data is a process of reverting contents of the page image to the contents at the starting point of the extraction using the contents of the updated log data, when the contents changed after the start of extraction are included in the page image extracted.
- the data restoring section 18 outputs the page image restored to the format converter 19 .
- the format converter 19 changes the data included in the page image received from the data restoring section 18 to a desired format according to the requirement, and outputs the changed data to the input-output processor 10 .
- the input and the output within the database 2 are in units of pages. However, it is desirable that the handling of the data included in the page extracted be performed in a generalized format. Therefore, the data is converted by the format converter 19 , before outputting to the network 3 via the input-output processor 10 .
- the buffer memory for extraction 12 a that is connected to the extraction processor 12 , functions as a temporary storage during extracting the page image from the database 2 .
- the extraction processor 12 acquires the process exclusion of the page to be extracted, and at a point of time when the page image read is stored in the buffer memory for extraction 12 a , the extraction processor 12 judges that the reading of the page image is complete, and then terminates the process exclusion for that page.
- the process exclusion is terminated at the point of time when the page image read is stored in the buffer memory for extraction 12 a , and the page image stored in the buffer memory for extraction 12 a is stored in the extraction-data storage 16 after terminating the process exclusion. Therefore, the time required for the process exclusion for reading each page is determined by capacity and speed of reading and writing of the buffer memory for extraction 12 a.
- the updating processor 11 and the extraction processor 12 can be realized as independent processors.
- the processor operates throughout the data extraction. Therefore, if the data extraction and updating are realized by the same processor, the data extraction consumes the processing capacity of the processor, and reduces the processing capacity that can be used for the updating, thereby reducing the processing speed of updating.
- realizing the updating processor 11 and the extraction processor 12 as independent processors can secure the processing capacity used for updating, and avoids a decrease in the processing speed during updating.
- the data extraction performed by the data extractor 1 is described further with reference to FIGS. 3 to 5 .
- when the data extractor 1 reads the pages A 1, B 1, and C 1 from the database 2, the log-data acquisition section 15 first starts monitoring the updating processor 11, and then the extraction processor 12 acquires the process exclusion for the page A 1 in the database 2 (see FIG. 3). Therefore, the updating processor 11 cannot access the page A 1.
- because the extraction processor 12 has not acquired the process exclusion for the pages B 1 and C 1, the updating processor 11 can freely access the pages B 1 and C 1.
- the page A 1 stores records a 10 , a 20 , and a 30 .
- the page B 1 stores records b 10 , b 20 , and b 30
- the page C 1 stores records c 10 , c 20 , and c 30 .
- the extraction processor 12 terminates process exclusion for the page A 1 in the database 2 .
- the extraction processor 12 acquires the process exclusion for the page B 1 (see FIG. 4 ). Therefore, the updating processor 11 cannot access the page B 1 . On the other hand, because the extraction processor 12 has not acquired the process exclusion for the pages A 1 and C 1 , the updating processor 11 can access the pages A 1 and C 1 freely.
- upon acquiring the process exclusion for the page B 1, the extraction processor 12 reads the page B 1 and stores it in the extraction-data storage 16. While the extraction processor 12 extracts the page B 1, the updating processor 11 can update another page. In this case, the updating processor 11 rewrites the record a 30 of the page A 1 to a record a 31, thus changing the page A 1 to a page A 2, and rewrites the record c 20 of the page C 1 to a record c 21, thus changing the page C 1 to a page C 2.
- the log-data acquisition section 15 creates updated log data, and stores the updated log data in the log-data storage 17 .
- information indicating that the record c 20 has been rewritten as the record c 21 is stored as the updated log data.
- information for specifying the record updated is added to the updated log data, as per requirement.
- the log-data acquisition section 15 acquires log-data related to the page C 2 and stores it in the log-data storage 17, but does not acquire log-data related to the page A 2. The updated log data for the update from the page A 1 to the page A 2 is not required, because the extraction processor 12 has already completed the extraction of the page A 1.
- upon reading the page B 1 and storing it in the extraction-data storage 16, the extraction processor 12 cancels the process exclusion for the page B 1 in the database 2.
- the extraction processor 12 acquires the process exclusion for the page C 2 (see FIG. 5 ). Therefore, the updating processor 11 cannot access the page C 2 .
- the updating processor 11 can access the pages A 2 and B 1 freely.
- the page C 2 that is read by the extraction processor 12 has been updated from the page C 1 by the updating processor 11 .
- the extraction processor 12 reads the updated page C 2 as it is, and stores in the extraction-data storage 16 . Therefore, the page C 2 that is stored in the extraction-data storage 16 includes a record c 21 , because the updating processor 11 updated the record.
- the data restoring section 18 restores the page that is stored in the extraction-data storage 16 .
- the log-data storage 17 stores information indicating that the record c 20 has been updated to the record c 21 . Therefore, the data restoring section 18 detects a page with the record c 21 from the pages in the extraction-data storage 16 . In other words, the data restoring section 18 detects the page C 2 and creates the page C 1 by changing the record c 21 to the record c 20 .
- the data restoring section 18 can obtain the pages A 1 , B 1 , and C 1 at a point of time when the data extraction started.
- the extraction processor 12 acquires process exclusion for overall data to be extracted (step S 101 ). Further, the extraction controller 14 transmits a command to the log-data acquisition section 15 instructing to start acquisition of the updated log data, and starts acquiring the updated log data (step S 102 ). Then, the extraction processor 12 terminates the process exclusion for the overall data to be extracted (step S 103 ).
- the acquisition of the updated log data starts after the extraction processor 12 acquires the process exclusion for the overall data to be extracted. This is because, if an update of the data and the start of acquisition of the updated log data occurred simultaneously, the update taking place while the acquisition is starting would not be reflected in the restoration, and the data contents would be mismatched.
- the extraction processor 12 designates a first page of the data to be extracted as a page subjected to extraction (step S 104 ).
- the database 2 determines whether the page subjected to extraction exists in the database buffer memory or not (step S 105 ). If the database 2 has not stored the page subjected to extraction in the database buffer memory, i.e. if an image of the page subjected to extraction does not exist in the database buffer memory (No at step S 105 ), the database 2 reads the page subjected to extraction into the database buffer memory (step S 106 ).
- the extraction processor 12 acquires the process exclusion for the page subjected to extraction (step S 107 ). Then, the extraction processor 12 reads the page subjected to extraction, and stores it in the buffer memory for extraction 12 a (step S 108 ).
- after step S 108, the extraction controller 14 ends the acquisition of the updated log data (step S 109), and then the extraction processor 12 terminates the process exclusion for the page subjected to extraction (step S 110). Moreover, the extraction processor 12 stores the page subjected to extraction, which is held in the buffer memory for extraction 12 a, into the extraction-data storage 16 as appropriate (step S 111).
- the extraction processor 12 determines whether all pages of data to be extracted have been read (step S 112 ). If there is a page that has not been read yet (No at step S 112 ), the extraction processor 12 designates the next page as the page subjected to extraction (step S 113 ), and the process returns to step S 105 . Thus, all pages in the data to be extracted are read one by one, and when all the pages have been read (Yes at step S 112 ), the data extraction ends.
- first, preparation for searching the updated log data stored in the log-data storage is performed (step S 201).
- This search preparation is an operation of rearranging the updated log data according to the corresponding page.
- because the log-data acquisition section 15 monitors the operation of the updating processor 11 and acquires the updated log data at any time, the updated log data is normally acquired as a time series.
- the restoring is in units of pages. Therefore, by rearranging in advance the updated log data according to the corresponding page, the search of the updated log data corresponding to each page can be performed at a high speed.
- the data restoring section 18 designates a first page from among the pages extracted as a page subjected to restoration (step S 202). Then, the data restoring section 18 reads the page subjected to restoration from the extraction-data storage 16 (step S 203). Further, the data restoring section 18 searches the log-data storage 17 for the updated log data corresponding to the page subjected to restoration (step S 204). The data restoring section 18 restores the page that was read, using the updated log data found by the search (step S 205).
- the data restoring section 18 determines whether all pages of the data to be extracted are restored (step S 206 ). If any page is not restored (No at step S 206 ), the data restoring section 18 designates the next page as the page subjected to restoration (step S 207 ), and the process returns to step S 203 . Thus, all the pages extracted are restored one by one by searching the corresponding log-data for each page, and when all the pages are restored (Yes at step S 206 ), the restoring ends.
- the format converter 19 receives data that is restored by the data restoring section 18 (step S 301 ), and designates a first page from among the pages restored as a page subjected to conversion (step S 302 ).
- the format converter 19 designates a first record from among records in the page subjected to conversion as a record subjected to conversion (step S 303 ).
- the format converter 19 converts the record subjected to conversion to a desired file format (step S 304 ), and outputs the record converted (step S 305 ).
- the format converter 19 determines whether all records in the page subjected to conversion are converted (step S 306 ). If any record is not converted yet (No at step S 306 ), the format converter 19 designates the next record as the record subjected to conversion (step S 307 ), and the process returns to step S 304 .
- the format converter 19 determines if all the pages have been converted (step S 308 ). If any page is yet to be converted (No at step S 308 ), the format converter 19 designates the next page as the page subjected to conversion (step S 309 ), and the process returns to step S 303 . Thus, when all the pages restored have undergone format conversion one by one (Yes at step S 308 ), the conversion of data format ends.
- the process exclusion is acquired only for the page that is being extracted, and the other pages can be accessed freely.
- the data constellation 21 can be extracted without degrading the update response.
- the updated contents are stored as the updated log data, and the contents of the data extracted using the updated log data are restored to values at the time of start of data extraction. Therefore, mismatching of the contents of data can be prevented, and a value of each data at the time of start of data extraction can be obtained.
- after starting acquisition of the updated log data at the start of data extraction, the data extractor 1 ends the acquisition for each page whose extraction is complete, so that updated log data is acquired only for pages that have not yet been extracted, thereby reducing the volume of the updated log data.
- because the database buffer used for input-output of the database and the buffer memory for extraction 12 a are provided independently, the memory capacity for updating is secured even if the extraction uses a large amount of memory, and a decrease in the update process speed is avoided.
- Realizing the updating processor 11 and the extraction processor 12 as independent processors can secure the process capacity used for the updating, and avoids the decrease in the process speed during updating.
- the updated log data acquired is used only for restoring the page image
- the updated log data may also be used for recovery of the database. In this case, it is not necessary to acquire log-data solely for restoration, which reduces the CPU cost of creating log-data. When the same log-data is used for both restoration and recovery, it is necessary to continue acquisition of the updated log data even for a page for which the extraction is completed.
- a data extractor suitable particularly for the method of data extraction is described.
- functions described in the embodiments may be realized by software, as a computer program for data extraction that can be run on any computer terminal.
- data extraction can be performed at any point of time.
- data extraction can be performed without using excessive storage area of update history, and with a simple structure.
- the update response time is secured even if the data extraction is being performed at the same time.
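The page-by-page extraction and restoration walked through with FIGS. 3 to 7 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the embodiment: the class `Extractor`, the `LogEntry` layout (page, record key, value before, value after), and the record keys such as `a3` are hypothetical, and the interleaving of updating and extraction is simulated sequentially rather than with separate processors.

```python
import threading
from dataclasses import dataclass


@dataclass
class LogEntry:
    """One item of updated log data (hypothetical layout)."""
    page: str
    key: str
    before: str
    after: str


class Extractor:
    def __init__(self, pages):
        self.pages = pages            # page name -> {record key: value}
        self.locks = {name: threading.Lock() for name in pages}
        self.pending = set()          # pages not yet extracted
        self.log = []                 # updated log data, oldest first
        self.extracted = {}           # page name -> copied page image

    def start(self, names):
        """Briefly lock all pages while log acquisition starts (S101 to S103)."""
        ordered = sorted(names)
        for name in ordered:
            self.locks[name].acquire()
        try:
            self.pending = set(names)
            self.log = []
            self.extracted = {}
        finally:
            for name in reversed(ordered):
                self.locks[name].release()

    def update(self, page, key, value):
        """Update one record under the page lock; log it only if still needed."""
        with self.locks[page]:
            before = self.pages[page][key]
            self.pages[page][key] = value
            if page in self.pending:
                self.log.append(LogEntry(page, key, before, value))

    def extract_page(self, page):
        """Hold the lock only while one page is copied (S107 to S110)."""
        with self.locks[page]:
            self.extracted[page] = dict(self.pages[page])
            self.pending.discard(page)  # end log acquisition for this page

    def restore(self):
        """Undo logged updates, newest first, to rebuild the start-time pages."""
        for entry in reversed(self.log):
            self.extracted[entry.page][entry.key] = entry.before
        return self.extracted


pages = {
    "A": {"a1": "a10", "a2": "a20", "a3": "a30"},
    "B": {"b1": "b10", "b2": "b20", "b3": "b30"},
    "C": {"c1": "c10", "c2": "c20", "c3": "c30"},
}
ex = Extractor(pages)
ex.start(["A", "B", "C"])
ex.extract_page("A")
ex.update("A", "a3", "a31")   # page A already extracted: not logged
ex.update("C", "c2", "c21")   # page C still pending: logged
ex.extract_page("B")
ex.extract_page("C")          # reads the updated record c21 as it is
snapshot = ex.restore()       # record c21 is reverted to c20
```

In this sketch the database itself keeps the updated values (a31 and c21), while `snapshot` holds the values at the start of the extraction, and only one log entry is retained because the update to page A arrived after that page had been extracted.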
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A plurality of data stored in a database is read successively. Update contents of the data are acquired as update history, if there is an update of the data in the database during a period from a start of the data extraction to an end of the data extraction. Contents of the plurality of the data extracted are overwritten with the contents at a time of the start of the data extraction, based on the update history acquired.
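The overwriting step of the abstract can be illustrated with a short Python sketch. It assumes, hypothetically, that each item of update history records the value before and after the update, and that the history is kept in time order; the names `Update` and `overwrite_with_start_values` are illustrative only. Undoing the history from newest to oldest means that, when one item was updated several times, the value before the earliest update is the one that remains.

```python
from typing import NamedTuple


class Update(NamedTuple):
    key: str      # which data item changed
    before: str   # value before the update
    after: str    # value after the update


def overwrite_with_start_values(read_values, history):
    """Return a copy of read_values rolled back to the start of the reading.

    history holds every update made between the start and the end of the
    reading, oldest first. It is undone newest first.
    """
    restored = dict(read_values)
    for upd in reversed(history):
        if upd.key in restored:
            restored[upd.key] = upd.before
    return restored


# Item "x" was updated twice during the reading and was read after both updates.
read_values = {"x": "3", "y": "7"}
history = [Update("x", "1", "2"), Update("x", "2", "3")]
restored = overwrite_with_start_values(read_values, history)
```

The same call also gives the start-time value when an item was read between two of its updates, because the last undo applied is always the one for the earliest update.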
Description
- 1) Field of the Invention
- The present invention relates to a data extractor and a method of data extraction in which a data extraction process of reading a plurality of data stored in a database successively is performed with a short process exclusion time for the database.
- 2) Description of the Related Art
- Databases, in which data is consolidated and managed in a form defined in advance for the purposes of sharing, integrated management, and high independence of data, have long been in use. Normally, the database is connected to a plurality of terminals via a network etc., and the data used by each terminal is uniformly managed. Therefore, to share the data between the plurality of terminals, each terminal may read and write desired data stored in the database.
- Thus, when the database is shared by the plurality of terminals, it is necessary to perform an exclusive control to avoid double updating of the data in the database. In the data management, exclusive control prevents the plurality of terminals from accessing the same data simultaneously. In other words, when a certain terminal is accessing data stored in the database, another terminal is kept in a standby state by not allowing access to the data till the access by the first terminal is complete, thereby preventing attempts at simultaneous updating of the data. Creating a state of not allowing another terminal access to predetermined data is called acquisition of process exclusion of the data.
- Conventionally, for reading a large amount of data from the database for the purpose of backup etc., it is necessary to acquire process exclusion of the entire data to be read. In such a case, an operation of reading the large amount of data from the database is called data extraction.
- In backup data extraction, the contents of data at a predetermined time are read and stored. However, the time required for extraction increases with the amount of data. Therefore, if access to the data subjected to extraction is allowed during the data extraction, a part of that data may be updated. The value of each data is then the value at the time the data is read, not the value at the time the data extraction began, so the data as of the start of the data extraction may not be obtainable. Moreover, if the data extracted are correlated, there is a risk of mismatching of correlation between the data.
- Therefore, conventionally, to prevent mismatching of the data contents, the process exclusion of all the data subjected to extraction is acquired from the beginning till the end of the data extraction, and a value of each data at the starting point of data extraction is read.
- However, in a conventional method of data extraction, the data subjected to extraction cannot be updated from the start of data extraction till the completion of data extraction.
- Particularly, when the amount of data to be extracted is large, the process exclusion time for the data extraction is longer. Normally, a response expected for updating of the database is a few hundreds of milliseconds. However, the process exclusion time necessary for the data extraction is far longer than this value, and greatly degrades the update response of the database.
- Conversely, to secure the update response of the database, the data extraction must be performed while the database is not being updated, which restricts when the data extraction can start. Moreover, for a database that must accept updates at any time, the data extraction cannot be performed unless the updating is stopped.
- It is an object of the present invention to at least solve the problems in the conventional technology.
- A method of data extraction according to an aspect of the present invention includes reading successively a plurality of data stored in a database; acquiring update contents of the data as update history, if there is an update of the data in the database during a period from a start of the reading to an end of the reading; and overwriting contents of the plurality of the data read with the contents at a time of the start of the reading, based on the update history acquired.
- A data extractor according to another aspect of the present invention successively reads a plurality of data stored in a database. The data extractor includes an update history acquiring unit that acquires update contents of the data as update history, if there is an update of the data in the database during a period from a start of reading of the data to an end of reading of the data; and an overwriting unit that overwrites contents of the plurality of the data read with the contents at a time of the start of the reading of the data, based on the update history acquired.
- The other objects, features, and advantages of the present invention are specifically set forth in or will become apparent from the following detailed description of the invention when read in conjunction with the accompanying drawings.
- FIG. 1 illustrates a concept of a method of data extraction according to an embodiment;
- FIG. 2 is a block diagram of a data extractor;
- FIG. 3 illustrates an extraction of page A1 from a database;
- FIG. 4 illustrates an extraction of page B1 from the database;
- FIG. 5 illustrates an extraction of page C2 from the database;
- FIG. 6 is a flowchart of a process procedure of data extraction in a data extractor;
- FIG. 7 is a flowchart of a process procedure of restoring executed by a data restoring section; and
- FIG. 8 is a flowchart of a process procedure of conversion to a data format executed by a format converter.
- Exemplary embodiments of a data extractor and a method of data extraction according to the present invention are described in detail below with reference to the accompanying drawings.
- FIG. 1 illustrates a concept of a method of data extraction according to an embodiment. In FIG. 1, a database 2 is connected to a network 3 via a data extractor 1. A database 5 is connected to the network 3 via a data extractor 4. A terminal 6 is connected to the network 3.
- The database 2 stores data constellations 21 and 22. Input and output of the database 2 are performed in input-output units called pages. The data constellation 21 includes pages A1, B1, and C1, and the data constellation 22 includes pages H1, I1, and J1.
- On the other hand, in the data constellations
- If the terminal 6 needs to update the data included in the data constellation 21 stored in the database 2, the terminal 6 accesses the data extractor 1 via the network 3. Based on the access from the terminal 6, an updating processor 11 in the data extractor 1 acquires a process exclusion of the desired data through an exclusive controller 13, and then reads and writes the data. Because this data updating is performed in units of records, when the updating processor 11 accesses a certain record, the exclusive controller 13 acquires a process exclusion of the page that stores this record.
- On the other hand, to extract the data constellation 21 stored in the database 2 and store it in the database 5, an extraction processor 12 acquires a process exclusion of the data to be extracted through the exclusive controller 13, and the data extraction is started. In this case, the extraction processor 12 acquires the process exclusion one page at a time for the pages A1, B1, and C1, which are included in the data constellation 21.
- In other words, the extraction processor 12 first acquires the process exclusion only for the page A1, terminates the process exclusion of the page A1 after reading the records on the page A1, and then acquires a process exclusion for the page B1.
- Thus, acquiring the process exclusion only for the page that is being read, and allowing access to the other pages, enables the data constellation 21 to be updated during the data extraction.
- To prevent mismatching in the extracted data due to updating performed during the data extraction, the data extractor 1 monitors the updating operation performed by the updating processor 11 and, when the database is updated, stores the changes as updated log data. The data extractor 1 uses the updated log data to revert the extracted data to its value at the starting point of the extraction, and can thus acquire the value of the data constellation 21 at the starting point of the extraction.
- Following is a description of a concrete structure of the data extractor 1. FIG. 2 is a block diagram of the data extractor. In addition to the updating processor 11, the extraction processor 12, and the exclusive controller 13 shown in FIG. 1, the data extractor 1 includes an input-output processor 10, an extraction controller 14, a log-data acquisition section 15, an extraction-data storage 16, a log-data storage 17, a data restoring section 18, and a format converter 19. The extraction processor 12 includes a buffer memory for extraction 12a.
- The input-output processor 10 receives an access to the database 2 via the network 3. Upon receiving an access requesting updating of data stored in the database 2, the input-output processor 10 outputs the received access to the updating processor 11. Moreover, upon receiving an access requesting the extraction of data stored in the database 2, the input-output processor 10 outputs the received access to the extraction controller 14.
- When the input-output processor 10 receives an access requesting updating of data, the updating processor 11 acquires the process exclusion for the page that stores the data to be updated, and updates the data.
- When the input-output processor 10 receives an access requesting the data extraction, the extraction controller 14 outputs a command instructing the start of acquisition of the updated log data to the log-data acquisition section 15, and a command instructing the start of data extraction to the extraction processor 12.
- The extraction processor 12 receives the command from the extraction controller 14, and starts the extraction of the data from the database 2. At this time, the extraction processor 12 performs data extraction by acquiring the process exclusion one page at a time for the pages in the data constellation being extracted. Further, the extraction processor 12 extracts each page as a page image, and stores the page image in the extraction-data storage 16.
- The log-data acquisition section 15 receives the command from the extraction controller 14, and starts monitoring the updating processor 11. During the monitoring, if the updating processor 11 updates the database 2, the log-data acquisition section 15 stores the contents of the update as updated log data in the log-data storage 17. The updated data and the contents of the update are recorded in the updated log data.
- The data restoring section 18 restores the data based on the page image stored in the extraction-data storage 16 and the updated log data stored in the log-data storage 17. Restoration is a process of reverting the contents of an extracted page image to the contents at the starting point of the extraction using the updated log data, when the page image includes contents that were changed after the start of extraction. The data restoring section 18 outputs the restored page image to the format converter 19.
- The format converter 19 converts the data included in the page image received from the data restoring section 18 to a desired format as required, and outputs the converted data to the input-output processor 10. Input and output within the database 2 are in units of pages. However, it is desirable that the data included in the extracted page be handled in a generalized format. Therefore, the data is converted by the format converter 19 before being output to the network 3 via the input-output processor 10.
- Following is a description of the buffer memory for extraction 12a. The buffer memory for extraction 12a, which is connected to the extraction processor 12, functions as temporary storage while the page image is extracted from the database 2. In other words, while reading the page image from the database 2, the extraction processor 12 holds the process exclusion of the page being extracted; at the point of time when the page image read is stored in the buffer memory for extraction 12a, the extraction processor 12 judges that the reading of the page image is complete, and then terminates the process exclusion for that page.
- Thus, the process exclusion is terminated at the point of time when the page image read is stored in the buffer memory for extraction 12a, and the page image stored in the buffer memory for extraction 12a is stored in the extraction-data storage 16 after the process exclusion is terminated. Therefore, the time required for the process exclusion for reading each page is determined by the capacity and the read-write speed of the buffer memory for extraction 12a.
- Therefore, by providing a buffer memory for extraction that can be read and written at a high speed and that has sufficient capacity, the time for the process exclusion of each page is reduced.
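- The short hold time of the process exclusion described above can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the embodiment itself: a Python dict of lists stands in for the database 2 and its pages, `threading.Lock` stands in for the process exclusion, and the names `ExclusiveController`, `extract_page`, and `update_record` are hypothetical.

```python
import threading


class ExclusiveController:
    """One lock per page (hypothetical stand-in for the exclusive controller 13)."""

    def __init__(self, page_names):
        self._locks = {name: threading.Lock() for name in page_names}

    def lock_for(self, page_name):
        return self._locks[page_name]


def extract_page(database, controller, page_name, storage):
    """Copy one page out of `database`, holding its lock only for the in-memory copy."""
    with controller.lock_for(page_name):
        # The exclusion is held only while the page image is copied into
        # the extraction buffer (a plain list in this sketch).
        buffer = list(database[page_name])
    # The lock is already released here; the slower store step runs without it.
    storage[page_name] = buffer


def update_record(database, controller, page_name, index, new_value):
    """Update a single record under the lock of the page that stores it."""
    with controller.lock_for(page_name):
        database[page_name][index] = new_value


database = {"A1": ["a10", "a20", "a30"], "B1": ["b10", "b20", "b30"]}
controller = ExclusiveController(database)
storage = {}

# While page A1 is locked (as it would be during its extraction),
# a record on page B1 can still be updated.
with controller.lock_for("A1"):
    update_record(database, controller, "B1", 1, "b21")

extract_page(database, controller, "A1", storage)
```

- In this sketch the write to `storage` happens after the `with` block, which mirrors the idea that storing into the extraction-data storage 16 takes place after the exclusion is terminated.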
- For high speed processing by the buffer memory for extraction 12a, it is desirable to provide a database buffer memory in the database. Providing the database buffer memory in the database, and storing the data necessary for the extraction and updating in the database buffer memory in advance, helps to further reduce the time for the process exclusion necessary at the time of updating and extraction of the data.
- Similarly, the updating processor 11 and the extraction processor 12 can be realized as independent processors. In the data extraction process, a large amount of data is read continuously, so the processor operates throughout the data extraction. Therefore, if the data extraction and the updating are realized by the same processor, the data extraction consumes the processing capacity of the processor and reduces the processing capacity that can be used for the updating, thereby reducing the processing speed of updating. Hence, realizing the updating processor 11 and the extraction processor 12 as independent processors secures the processing capacity used for updating, and avoids a decrease in the processing speed during updating.
- Next, the data extraction performed by the data extractor 1 is described further with reference to FIGS. 3 to 5. When the data extractor 1 reads the pages A1, B1, and C1 from the database 2, first, the log-data acquisition section 15 starts monitoring the updating processor 11, and then the extraction processor 12 acquires the process exclusion for the page A1 in the database 2 (see FIG. 3). Therefore, the updating processor 11 cannot access the page A1. On the other hand, because the extraction processor 12 has not acquired the process exclusion for the pages B1 and C1, the updating processor 11 can freely access the pages B1 and C1.
- As shown in FIG. 3, the page A1 stores records a10, a20, and a30. The page B1 stores records b10, b20, and b30, and the page C1 stores records c10, c20, and c30. Upon reading the page A1 and storing it in the extraction-data storage 16, the extraction processor 12 terminates the process exclusion for the page A1 in the database 2.
- After the extraction of the page A1 is complete, the extraction processor 12 acquires the process exclusion for the page B1 (see FIG. 4). Therefore, the updating processor 11 cannot access the page B1. On the other hand, because the extraction processor 12 has not acquired the process exclusion for the pages A1 and C1, the updating processor 11 can access the pages A1 and C1 freely.
- As shown in FIG. 4, upon acquiring the process exclusion for the page B1, the extraction processor 12 reads the page B1, and stores it in the extraction-data storage 16. While the extraction processor 12 extracts the page B1, the updating processor 11 can update another page. In this case, the updating processor 11 rewrites the record a30 of the page A1 to a record a31, thus changing the page A1 to a page A2, and rewrites the record c20 of the page C1 to a record c21, thus changing the page C1 to a page C2.
- When the updating processor 11 has updated the records, the log-data acquisition section 15 creates updated log data, and stores the updated log data in the log-data storage 17. In FIG. 4, information indicating that the record c20 has been rewritten as the record c21 is stored as the updated log data. Moreover, information for specifying the updated record is added to the updated log data as required.
- In FIG. 4, the log-data acquisition section 15 acquires log-data related to the page C2, and stores it in the log-data storage 17, but does not acquire log-data related to the page A2. This is because the extraction processor 12 has already completed the extraction of the page A1, so the updated log data for the update from the page A1 to the page A2 is not required.
- Upon reading the page B1 and storing it in the extraction-data storage 16, the extraction processor 12 cancels the process exclusion for the page B1 in the database 2.
- After the extraction of the page B1 is complete, the extraction processor 12 acquires the process exclusion for the page C2 (see FIG. 5). Therefore, the updating processor 11 cannot access the page C2. On the other hand, because the extraction processor 12 has not acquired the process exclusion for the pages A2 and B1, the updating processor 11 can access the pages A2 and B1 freely. In this case, the page C2 that is read by the extraction processor 12 has been updated from the page C1 by the updating processor 11. However, the extraction processor 12 reads the updated page C2 as it is, and stores it in the extraction-data storage 16. Therefore, the page C2 that is stored in the extraction-data storage 16 includes the record c21, because the updating processor 11 updated the record.
- Based on the updated log data stored in the log-data storage 17, the data restoring section 18 restores the page that is stored in the extraction-data storage 16. In FIG. 5, the log-data storage 17 stores information indicating that the record c20 has been updated to the record c21. Therefore, the data restoring section 18 detects a page with the record c21 from the pages in the extraction-data storage 16. In other words, the data restoring section 18 detects the page C2 and creates the page C1 by changing the record c21 to the record c20.
- Thus, by restoring the page corresponding to the updated log data stored in the log-data storage 17, the data restoring section 18 can obtain the pages A1, B1, and C1 as of the point of time when the data extraction started.
- Next, an operation of data extraction in the data extractor is described in detail with reference to FIG. 6. To start with, the extraction processor 12 acquires the process exclusion for the overall data to be extracted (step S101). Further, the extraction controller 14 transmits a command to the log-data acquisition section 15 instructing it to start acquisition of the updated log data, and acquisition of the updated log data starts (step S102). Then, the extraction processor 12 terminates the process exclusion for the overall data to be extracted (step S103).
- In this case, the acquisition of the updated log data starts after the extraction processor 12 acquires the process exclusion for the overall data to be extracted. This is because, if an update of the data and the start of acquisition of the updated log data occurred simultaneously, the update taking place while the acquisition is starting would not be reflected in the restoration, and the data contents would be mismatched.
- Because the operation of transmitting the command instructing the start of acquisition of the updated log data to the log-data acquisition section 15 can be performed in a very short time, the process exclusion for the overall data to be extracted lasts only a very short time, and does not affect the process of data updating.
- Further, the extraction processor 12 designates the first page of the data to be extracted as the page subjected to extraction (step S104). The database 2 determines whether the page subjected to extraction exists in the database buffer memory (step S105). If the database 2 has not stored the page subjected to extraction in the database buffer memory, i.e., if an image of the page subjected to extraction does not exist in the database buffer memory (No at step S105), the database 2 reads the page subjected to extraction into the database buffer memory (step S106).
- If the image of the page subjected to extraction exists in the database buffer memory (Yes at step S105), or after the page subjected to extraction has been read into the database buffer memory (step S106), the extraction processor 12 acquires the process exclusion for the page subjected to extraction (step S107). Then, the extraction processor 12 reads the page subjected to extraction, and stores it in the buffer memory for extraction 12a (step S108).
- After step S108, the extraction controller 14 ends the acquisition of the updated log data for the page subjected to extraction (step S109), and then the extraction processor 12 terminates the process exclusion for the page subjected to extraction (step S110). Moreover, the extraction processor 12 stores the page subjected to extraction, which is held in the buffer memory for extraction 12a, into the extraction-data storage 16 as appropriate (step S111).
- Further, the extraction processor 12 determines whether all pages of the data to be extracted have been read (step S112). If there is a page that has not been read yet (No at step S112), the extraction processor 12 designates the next page as the page subjected to extraction (step S113), and the process returns to step S105. Thus, all pages in the data to be extracted are read one by one, and when all the pages have been read (Yes at step S112), the data extraction ends.
- Next, an operation of restoring executed by the data restoring section 18 is described in detail with reference to FIG. 7. When the data restoring section 18 restores the data, preparation for searching the updated log data stored in the log-data storage 17 is performed (step S201). This search preparation is an operation of rearranging the updated log data according to the corresponding page. Because the log-data acquisition section 15 monitors the operation of the updating processor 11 and acquires the updated log data at any time, the updated log data is normally acquired as time-series data. On the other hand, the restoring is performed in units of pages. Therefore, by rearranging the updated log data in advance according to the corresponding page, the search for the updated log data corresponding to each page can be performed at a high speed.
- Upon completion of the search preparation (step S201), the data restoring section 18 designates the first page from among the extracted pages as the page subjected to restoration (step S202). Then, the data restoring section 18 reads the page subjected to restoration from the extraction-data storage 16 (step S203). Further, the data restoring section 18 searches the log-data storage 17 for the updated log data corresponding to the page subjected to restoration (step S204). The data restoring section 18 then restores the page subjected to restoration using the updated log data found (step S205).
- Further, the data restoring section 18 determines whether all pages of the data to be extracted have been restored (step S206). If any page has not been restored (No at step S206), the data restoring section 18 designates the next page as the page subjected to restoration (step S207), and the process returns to step S203. Thus, all the extracted pages are restored one by one by searching for the corresponding log-data for each page, and when all the pages have been restored (Yes at step S206), the restoring ends.
- Next, an operation of conversion of a data format by the format converter 19 is described in detail with reference to FIG. 8. To start with, the format converter 19 receives the data restored by the data restoring section 18 (step S301), and designates the first page from among the restored pages as the page subjected to conversion (step S302).
- Further, the format converter 19 designates the first record from among the records in the page subjected to conversion as the record subjected to conversion (step S303). The format converter 19 converts the record subjected to conversion to a desired file format (step S304), and outputs the converted record (step S305).
- The format converter 19 determines whether all records in the page subjected to conversion have been converted (step S306). If any record has not been converted yet (No at step S306), the format converter 19 designates the next record as the record subjected to conversion (step S307), and the process returns to step S304.
- On the other hand, if all the records in the page subjected to conversion have been converted (Yes at step S306), the format converter 19 determines whether all the pages have been converted (step S308). If any page is yet to be converted (No at step S308), the format converter 19 designates the next page as the page subjected to conversion (step S309), and the process returns to step S303. Thus, when all the restored pages have undergone format conversion one by one (Yes at step S308), the conversion of data format ends.
- As described above, in the data extractor 1 according to the embodiments, for extracting the data constellation 21 from the database 2, the process exclusion is acquired only for the page that is being extracted, and the other pages can be accessed freely. Thus, the data constellation 21 can be extracted without degrading the update response time.
- Moreover, if the database is updated during the data extraction, the updated contents are stored as the updated log data, and the contents of the extracted data are restored, using the updated log data, to the values at the time of the start of data extraction. Therefore, mismatching of the contents of data can be prevented, and a value of each data at the time of start of data extraction can be obtained.
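- The example of FIGS. 3 to 5 and the per-page restoration of FIG. 7 can be replayed in a few lines. The following is a minimal sketch under stated assumptions: pages are Python lists, a record is identified by its position in the page (the text only says that information for specifying the record is added as required), and the function names `extract_with_log` and `restore` are hypothetical.

```python
from collections import defaultdict


def extract_with_log(database, page_order, updates_during):
    """Read pages one by one; `updates_during[p]` lists updates applied while page p is read.

    Each update is (page, index, new_value). Updates to pages that were
    already extracted are applied but not logged, as in the FIG. 4 discussion.
    """
    extracted, log, done = {}, [], set()
    for page in page_order:
        for target, index, new_value in updates_during.get(page, []):
            # The page being read is under process exclusion, so concurrent
            # updates may only touch other pages.
            assert target != page
            old_value = database[target][index]
            database[target][index] = new_value
            if target not in done:
                log.append((target, index, old_value, new_value))
        extracted[page] = list(database[page])  # page image as read
        done.add(page)
    return extracted, log


def restore(extracted, log):
    """Revert extracted page images to their contents at the start of extraction."""
    by_page = defaultdict(list)
    for target, index, old_value, new_value in log:
        # Search preparation: regroup the time-ordered log by page.
        by_page[target].append((index, old_value))
    restored = {}
    for page, image in extracted.items():
        image = list(image)
        # Undo in reverse time order so repeated updates unwind correctly.
        for index, old_value in reversed(by_page[page]):
            image[index] = old_value
        restored[page] = image
    return restored


database = {
    "A": ["a10", "a20", "a30"],
    "B": ["b10", "b20", "b30"],
    "C": ["c10", "c20", "c30"],
}
snapshot = {name: list(records) for name, records in database.items()}

# While page B is being read, a30 -> a31 and c20 -> c21 are applied.
updates = {"B": [("A", 2, "a31"), ("C", 1, "c21")]}
extracted, log = extract_with_log(database, ["A", "B", "C"], updates)
restored = restore(extracted, log)
```

- In this run only the update of c20 to c21 is logged, the extracted image of page C contains c21, and `restored` equals `snapshot`, while the live database keeps both updates.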
- Further, after starting acquisition of the updated log data at the start of data extraction, the data extractor 1 ends the acquisition of the updated log data for each page whose extraction is complete, and continues it only for pages that have not yet been extracted, thereby reducing the volume of the updated log data.
- Because the database buffer used for input-output of the database and the buffer memory for extraction 12a are provided independently, even if a large amount of memory capacity is used by the extraction, the memory capacity for updating is secured, and a decrease in the update process speed is avoided.
- Realizing the updating processor 11 and the extraction processor 12 as independent processors secures the processing capacity used for the updating, and avoids a decrease in the processing speed during updating.
- In the embodiments mentioned so far, although the updated log data acquired is used only for restoring the page image, the updated log data may also be used for recovery of the database. In this case, it is not necessary to acquire log-data solely for restoration, which reduces the CPU cost of creating log-data. When the same log-data is used for both restoration and recovery, it is necessary to continue acquisition of the updated log data even for a page for which the extraction is completed.
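- The difference in log volume between the two modes can be illustrated with a small count. This is a hedged sketch only: the event trace is made up for illustration, and the function name `logged_updates` and its flag are hypothetical.

```python
def logged_updates(events, keep_logging_after_extraction):
    """Count how many updates end up in the log.

    `events` is a time-ordered list of ("read", page) and ("update", page).
    When `keep_logging_after_extraction` is False, updates to pages that
    have already been read are not logged.
    """
    done, count = set(), 0
    for kind, page in events:
        if kind == "read":
            done.add(page)
        elif keep_logging_after_extraction or page not in done:
            count += 1
    return count


# Made-up trace: pages A, B, C are read in order while updates arrive.
events = [
    ("read", "A"),
    ("update", "A"),  # A already extracted
    ("update", "C"),  # C not yet extracted
    ("read", "B"),
    ("update", "A"),
    ("read", "C"),
    ("update", "B"),
]
restoration_only = logged_updates(events, keep_logging_after_extraction=False)
shared_with_recovery = logged_updates(events, keep_logging_after_extraction=True)
```

- For this trace the restoration-only log holds one entry and the log shared with recovery holds four, which is the trade-off between log volume and reusing a single log for both purposes.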
- Moreover, in the embodiments mentioned so far, a data extractor suitable particularly for the method of data extraction is described. However, functions described in the embodiments may be realized by software, as a computer program for data extraction that can be run on any computer terminal.
- Thus, according to the data extractor and the method of data extraction of the present invention, data extraction can be performed at any point of time.
- Furthermore, data extraction can be performed without degrading the update response time.
- Moreover, data extraction can be performed without using excessive storage area of update history, and with a simple structure.
- Furthermore, the update response time is secured even if the data extraction is being performed at the same time.
- Although the invention has been described with respect to a specific embodiment for a complete and clear disclosure, the appended claims are not to be thus limited but are to be construed as embodying all modifications and alternative constructions that may occur to one skilled in the art which fairly fall within the basic teaching herein set forth.
Claims (7)
1. A method of data extraction comprising:
reading successively a plurality of data stored in a database;
acquiring update contents of the data as update history, if there is an update of the data in the database during a period from a start of the reading to an end of the reading; and
overwriting contents of the plurality of the data read with the contents at a time of the start of the reading, based on the update history acquired.
2. The method according to claim 1, further comprising:
providing exclusive control including inhibiting updating of the data that is being read, and allowing updating of the data already read and the data yet to be read, from among the plurality of data subjected to the reading.
3. The method according to claim 2, wherein
the acquiring includes ending the update history acquiring of that data for which the reading is complete, from among the plurality of data subjected to the reading.
4. A data extractor that successively reads a plurality of data stored in a database, comprising:
an update history acquiring unit that acquires update contents of the data as update history, if there is an update of the data in the database during a period from a start of reading of the data to an end of reading of the data; and
an overwriting unit that overwrites contents of the plurality of the data read with the contents at a time of the start of the reading of the data, based on the update history acquired.
5. The data extractor according to claim 4, further comprising:
an exclusive control unit that provides exclusive control to inhibit updating of the data that is being read, and to allow updating of the data already read and the data yet to be read, from among the plurality of data subjected to the reading.
6. The data extractor according to claim 5, wherein
the update history acquiring unit ends acquisition of the update history of that data that has been read, from among the plurality of data subjected to the reading.
7. The data extractor according to claim 4, wherein
an updating processor that updates the plurality of data stored in the database, and an extraction processor that performs the reading of the data, are provided as independent processors.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/019,127 US20050160087A1 (en) | 2002-08-29 | 2004-12-22 | Data extractor and method of data extraction |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2002/008759 WO2004023308A1 (en) | 2002-08-29 | 2002-08-29 | Data extracting method and data extracting device |
US11/019,127 US20050160087A1 (en) | 2002-08-29 | 2004-12-22 | Data extractor and method of data extraction |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2002/008759 Continuation WO2004023308A1 (en) | 2002-08-29 | 2002-08-29 | Data extracting method and data extracting device |
Publications (1)
Publication Number | Publication Date |
---|---|
US20050160087A1 (en) | 2005-07-21 |
Family
ID=34748459
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/019,127 Abandoned US20050160087A1 (en) | 2002-08-29 | 2004-12-22 | Data extractor and method of data extraction |
Country Status (1)
Country | Link |
---|---|
US (1) | US20050160087A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070150447A1 (en) * | 2005-12-23 | 2007-06-28 | Anish Shah | Techniques for generic data extraction |
US20110238630A1 (en) * | 2010-03-26 | 2011-09-29 | Fujitsu Limited | Database management apparatus and recording medium with database management program recorded thereon |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5852715A (en) * | 1996-03-19 | 1998-12-22 | Emc Corporation | System for currently updating database by one host and reading the database by different host for the purpose of implementing decision support functions |
US7058663B2 (en) * | 2001-03-13 | 2006-06-06 | Koninklijke Philips Electronics, N.V. | Automatic data update |
US7107294B2 (en) * | 2003-03-14 | 2006-09-12 | International Business Machines Corporation | Method and apparatus for interrupting updates to a database to provide read-only access |
US7158991B2 (en) * | 2003-09-30 | 2007-01-02 | Veritas Operating Corporation | System and method for maintaining temporal data in data storage |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070150447A1 (en) * | 2005-12-23 | 2007-06-28 | Anish Shah | Techniques for generic data extraction |
US7860903B2 (en) | 2005-12-23 | 2010-12-28 | Teradata Us, Inc. | Techniques for generic data extraction |
US20110238630A1 (en) * | 2010-03-26 | 2011-09-29 | Fujitsu Limited | Database management apparatus and recording medium with database management program recorded thereon |
US8315987B2 (en) | 2010-03-26 | 2012-11-20 | Fujitsu Limited | Database management apparatus and recording medium with database management program recorded thereon |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: FUJITSU LIMITED, JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NISHIGAKI, MASAKI;REEL/FRAME:016429/0674 Effective date: 20050210 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |