WO1999015992A1 - Data processing system

Info

Publication number: WO1999015992A1
Application number: PCT/GB1998/002845
Authority: WO
Inventors: Jonathan Stephen Fowler
Original assignee: British Telecommunications Public Limited Company
Priority date: 1997-09-24
Filing date: 1998-09-21
Publication date: 1999-04-01
Also published as: GB2345560A; HUP0004515A2; AU9173898A; PL339436A1; GB9720395D0; GB0005479D0

Abstract

A data processing application processes data files and generates storage operation instructions for data files identified by identification information independent of the storage location of the data files. A plurality of storage locations are provided for storing data files, and a storage map stores information on the storage locations of stored data files together with identification information for the stored data files. The identification information generated by the data processing application is used to look up storage location information. A storage interface receives the storage location information and the storage operation instructions indicating a storage operation to be carried out, and carries out the storage operation for the data file in the storage location indicated by the storage location information.

Description

DATA PROCESSING SYSTEM

The present invention generally relates to a data processing method and apparatus wherein data is stored in a plurality of storage locations. There are many instances in data processing where it is desirable to distribute data amongst a plurality of storage locations. An example of such an arrangement is in a network of processors each having local storage and access to storage at other nodes in the network.

In order to utilise the flexibility of such a network, it is advantageous to be able to position the data at any storage location and be able to move the data as desired. However, where applications running on the processors utilise such data, they must be able to locate the data that is to be processed.

Thus, in accordance with one aspect the present invention provides data processing apparatus comprising: data processing application means for processing data files and for generating storage operation instructions for data files identified by identification information independent of the storage location of the data files; storage means comprising a plurality of storage locations for storing data files; storage mapping means for storing information on the storage locations of stored data files in said storage means and identification information for the stored data files, for receiving the identification information for a data file from said data processing application means, and for generating location information for the data file; and storage interface means for receiving location information for the data file from said storage mapping means and storage operation instructions from said data processing apparatus indicating a storage operation to be carried out on the data file, and for carrying out the storage operation for the data file in a said storage location indicated by said location information.

In accordance with a second aspect the present invention provides a data processing method for use with data storage means comprising a plurality of storage locations, the method comprising the steps of: a first generating step of generating data file identification information independent of the storage location of the data file; a second generating step of generating a storage operation instruction for a data file identified by said identification information; a third generating step of generating storage location information for the data file using stored information relating the storage location of stored data files to the identification information for the stored data files and the data file identification information; and performing the storage operation for the data file in a said storage location indicated by said storage location information generated for the data file. Thus the present invention provides a data processing apparatus which allows for data processing applications to be able to process data without having a knowledge of the physical location of the data. The application merely needs to identify the data that is to be processed and the storage location of the data can be identified by mapping or looking up the storage location for the identified data. This allows for the movement of data in the storage locations without having to modify the data processing application. Whenever data is moved from one location to another it is simply necessary to update the stored information or the table for each identified data file.

The storage operation instructions can comprise a data file reading instruction, a data file writing instruction or a data file deleting instruction. When a data file is to be read, there will be stored information on the location of the stored data and corresponding identification information and thus when a data file reading instruction is issued by the data processing application the identification of the data file can be used to generate the storage location information to enable the data file to be read from the storage location. Similarly, if a data file deletion instruction is issued the location of the data file can be identified from the identification information using a look-up table or database and the file can be deleted. When data is to be written to the storage locations there is no need to identify any stored data but instead a storage location is chosen and the data file is stored in the location. When the data file is successfully written in the location the table or database is updated to provide an entry for the storage location of the data file and of the identification information.
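The read, write and delete operations described above can be illustrated with a minimal sketch. It is not taken from the specification: the class names `StorageMap` and `StorageInterface`, the use of directories as storage locations, and the choice of the first location for every write are all assumptions made for illustration.

```python
import os
import tempfile


class StorageMap:
    """Hypothetical look-up table: file identifier -> storage location (a directory)."""

    def __init__(self):
        self._locations = {}

    def lookup(self, file_id):
        return self._locations[file_id]

    def record(self, file_id, location):
        self._locations[file_id] = location

    def forget(self, file_id):
        del self._locations[file_id]


class StorageInterface:
    """Carries out read, write and delete instructions on behalf of an application."""

    def __init__(self, storage_map, locations):
        self.storage_map = storage_map
        self.locations = list(locations)

    def write(self, file_id, data):
        # No existing entry is needed for a write: choose a location first
        # (here simply the first one), then record it once the write succeeds.
        location = self.locations[0]
        with open(os.path.join(location, file_id), "wb") as handle:
            handle.write(data)
        self.storage_map.record(file_id, location)

    def read(self, file_id):
        location = self.storage_map.lookup(file_id)
        with open(os.path.join(location, file_id), "rb") as handle:
            return handle.read()

    def delete(self, file_id):
        location = self.storage_map.lookup(file_id)
        os.remove(os.path.join(location, file_id))
        self.storage_map.forget(file_id)


def demo():
    """Write, read and delete one file using only its identifier."""
    with tempfile.TemporaryDirectory() as dir_a, tempfile.TemporaryDirectory() as dir_b:
        interface = StorageInterface(StorageMap(), [dir_a, dir_b])
        interface.write("file-1", b"call records")
        data = interface.read("file-1")
        interface.delete("file-1")
        return data, os.listdir(dir_a)
```

In this sketch the calling code passes only the identifier `"file-1"`; the directory in which the file is held is known only to the map and the interface.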

Thus, by isolating the data processing application from the physical location of the data to be read, stored, or deleted, the data processing applications are not affected in any way by the rearrangement of data files in the storage locations or the restructuring of the storage locations.

The present invention is applicable to any form of storage means which provides a plurality of separate storage locations. In accordance with one embodiment the storage means comprises a storage medium and the storage locations comprise designated areas of the storage medium. An example of such an arrangement is a disc drive in a computer which is partitioned into directories by the operating system, e.g. UNIX. This method of controlling the storage of data is particularly useful for UNIX version 9.04 wherein directories are limited to a storage space of 2 gigabytes. Thus, if a conventional method of storage is used by a data processing apparatus to always store data in a particular directory, a problem would arise when the directory fills up. The data processing application would need to know where it can store data and this will depend upon the configuration of the disc which a user may wish to change. In this embodiment of the present invention the data processing application has no knowledge of where the data is stored and this function is delegated to a separate procedure which includes a database containing physical locations corresponding to the data file identities.

In another embodiment of the present invention the storage means comprises a plurality of interconnected storage mediums each storage location comprising at least an area of a said storage medium. A particular example of such an arrangement is in a network of computers wherein a data processing application running on a particular computer can have access to the disc drive storage on any of the networked computers. Thus the physical location at which the data file is stored can be a directory of any hard disc in the network. If the data processing application is required to determine where to store the data file, this would require a knowledge of the network. However, in this embodiment of the present invention the data processing application has no need to know anything about the structure of the storage that is to be used for the data and instead the simple identification is used for identifying the data file that is required. The table or database is kept identifying the storage location for each data file identified by a unique identification code. Whenever data files are moved in the storage locations the table or database can be updated accordingly. Also, if the structure of the storage location changes e.g. a computer is removed from the network or a hard disc is changed or upgraded, the table or database can be updated accordingly. The database can be held on a single one of any of the computers in the network and it can be held in a computer separate to a storage location to provide a high speed link over the network to allow for the identification of the storage locations rapidly by any processing application running on a computer in the network. The table or database could also be stored on one of the computers having one of the storage locations or it could be provided in multiple copies on all of the computers. The advantage of having multiple copies is that each of the computers has immediate access to the database or table. 
The disadvantage however is that each of the tables or databases must be updated simultaneously whenever there is a change in the location of data in the storage locations.
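As an illustration of updating the table when a data file is moved, the following sketch relocates a file between two directories standing in for two network nodes. The function names and the use of a plain dictionary as the table are assumptions; a replicated table would need the same update applied to every copy, which is not shown.

```python
import os
import shutil
import tempfile


def move_file(storage_map, file_id, new_location):
    """Move a stored file and update its entry in the map (a plain dict here)."""
    old_location = storage_map[file_id]
    shutil.move(os.path.join(old_location, file_id),
                os.path.join(new_location, file_id))
    # Only the map changes; callers keep using the same identifier.
    storage_map[file_id] = new_location


def read_file(storage_map, file_id):
    """Application-side read that knows the identifier but not the location."""
    with open(os.path.join(storage_map[file_id], file_id), "rb") as handle:
        return handle.read()


def demo():
    """Read a file before and after it is moved, using the same identifier."""
    with tempfile.TemporaryDirectory() as node_a, tempfile.TemporaryDirectory() as node_b:
        storage_map = {"file-7": node_a}
        with open(os.path.join(node_a, "file-7"), "wb") as handle:
            handle.write(b"payload")
        before = read_file(storage_map, "file-7")
        move_file(storage_map, "file-7", node_b)
        after = read_file(storage_map, "file-7")
        return before, after, storage_map["file-7"] == node_b
```

The call to `read_file` is identical before and after the move, which is the property the description relies on.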

The storage of data files in the storage locations can be controlled such that before attempting to store a data file in a storage location, a storage location with sufficient space for storing the data file is identified and the data file is stored in the identified location. The table or database will then be updated in order to show the location of the data file.

Entries in the database for any of the storage locations can be set to deny access to the storage locations i.e. access can be denied in order to enable for example maintenance on a computer in a computer network. In one embodiment entries for any of the storage means can be set to read only to prevent further data files being written to the storage locations. This can take place automatically by monitoring the amount of storage space left in the storage location and setting database entry for the storage location to read only when the amount of free storage space reaches a minimum. This prevents the storage locations from filling up completely but allows the reading of the data. This is particularly useful where the data processing carried out on the data files comprises a sequence of different data processes. Each data process causes the reading, processing and writing of data files from storage locations. Thus, if any initial data process operates too quickly such that the intermediate processed data builds up in the storage locations, database entries for certain of the storage locations which become full of the intermediate processed data can be set to read only to thereby allow efficient access to the storage locations by subsequent processing applications. In this way subsequent data processing applications can have accelerated access to data to thereby enhance the throughput to try to smooth out the passage of data through the sequential processing steps.
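The automatic switch to read only can be sketched as follows. The status letters A, N and S are those given later in the description of Figure 12; the `LocationEntry` type, the function names and the byte threshold are assumptions for illustration only.

```python
from dataclasses import dataclass

AVAILABLE = "A"   # reads and writes allowed
READ_ONLY = "N"   # reads allowed, no further writes
SHUT_DOWN = "S"   # no access, e.g. during maintenance


@dataclass
class LocationEntry:
    name: str
    free_bytes: int
    status: str = AVAILABLE


def refresh_status(entry, minimum_free_bytes):
    """Set an available location to read only once free space reaches the minimum."""
    if entry.status == AVAILABLE and entry.free_bytes <= minimum_free_bytes:
        entry.status = READ_ONLY
    return entry.status


def can_write(entry):
    return entry.status == AVAILABLE


def can_read(entry):
    # Read-only locations still serve reads; shut-down locations serve nothing.
    return entry.status in (AVAILABLE, READ_ONLY)
```

A monitoring process could call `refresh_status` periodically for each location, so that a nearly full location stops accepting intermediate files while later processing stages continue to read from it.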

Another advantage of providing a database or table outside the data processing applications, so that the data processing applications simply need to identify the file they require for processing, is that the data files can be stored in the storage locations in different formats e.g. compressed in different formats in different locations. Information regarding the storage format can then be stored in the table or database together with the location so that when data is read from the storage locations it can be reconverted from the storage format into the format required for the data processing application. In this way the most efficient form of storage can be chosen for each storage location.
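A minimal sketch of per-location storage formats follows. The format names, the in-memory dictionaries standing in for the storage locations, and the use of zlib as the compressed format are assumptions; the specification does not name a compression scheme.

```python
import zlib

# Hypothetical encoders and decoders, keyed by a format name held in the table.
ENCODERS = {"raw": lambda data: data, "zlib": zlib.compress}
DECODERS = {"raw": lambda data: data, "zlib": zlib.decompress}


class FormatAwareStore:
    def __init__(self, location_formats):
        # location name -> storage format used at that location
        self.location_formats = dict(location_formats)
        self.table = {}    # file identifier -> (location, format)
        self.blobs = {}    # (location, file identifier) -> stored bytes

    def write(self, file_id, data, location):
        fmt = self.location_formats[location]
        self.blobs[(location, file_id)] = ENCODERS[fmt](data)
        # The format is recorded alongside the location for later reads.
        self.table[file_id] = (location, fmt)

    def read(self, file_id):
        location, fmt = self.table[file_id]
        # Reconvert from the storage format into the application's format.
        return DECODERS[fmt](self.blobs[(location, file_id)])
```

The application receives the same bytes from `read` whichever location, and therefore whichever format, was used for storage.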

In one embodiment the data processing application simply issues instructions for files which include a storage operation instruction and identification of the data file and the file is returned to it by external procedures. In another embodiment of the present invention the data processing application interrogates an external database in order to obtain the location of the data file and once this is returned it can interface with the storage locations to read, write or delete the data file as appropriate. In either case the coding for the data processing application is independent of the location of the data files. Thus there is no modification required to the programming when the storage locations are restructured or data files are moved.

Embodiments of the present invention will now be described with reference to the accompanying drawings, in which:

Figure 1 is a generalised schematic drawing of the first embodiment of the present invention;

Figures 2a, 2b and 2c are schematic drawings of alternative embodiments of the present invention;

Figure 3 is a schematic drawing of a network of computers;

Figure 4 is a flow diagram illustrating a storage operation in accordance with one embodiment of the present invention;

Figure 5 is a flow diagram illustrating a storage operation in accordance with another embodiment of the present invention;

Figure 6 is a schematic drawing of a multi-communication network system;

Figure 7 is a schematic drawing of the data processing apparatus for use in the multi-communication network system of Figure 6;

Figure 8 is a more detailed schematic drawing of a particular aspect of the data processing apparatus of Figure 7;

Figures 9a, 9b and 9c are a flow diagram illustrating the steps carried out in collecting and processing data related to communication instances in accordance with an embodiment of the present invention;

Figure 10a is a flow diagram illustrating the steps in the billing process in accordance with an embodiment of the present invention;

Figure 10b is a flow diagram illustrating steps in the billing process in accordance with another embodiment of the present invention;

Figure 11 is a table illustrating the streamed file database of the master processor; and

Figure 12 is a table used for storing the results of the billing process operations.

Figure 1 illustrates a first embodiment of the present invention wherein a data processing application 10 issues to a storage interface 13 a read, write or delete request in respect of a data file for which an identification code or identification information is generated. The identification information is passed to a storage mapping module 11 which includes information in the form of, for example, a database or look-up table to allow for the conversion of the identification information into information on the physical location of a storage location from amongst a plurality of storage locations 12, so that the storage operation instructions can be carried out. If for example the data processing application 10 generates a read request for a data file, the storage interface will receive information on the location of the data file to be read from the storage mapping module 11 and can address the storage locations 12 to read the data file from the appropriate storage location and pass this on to the data processing application 10. If the data processing application has generated a processed data file for writing to a storage location, the storage interface receives the data together with the write instruction from the data processing application 10 and it can then determine where it is appropriate to store the data in the storage locations 12, for example by determining where there is sufficient space. Before, or once, the data is written to the storage location, the storage interface can pass the information on the location of the data file to the storage mapping module 11, which has already received information identifying the data file from the data processing application 10, to allow information to be stored for future retrieval of the data file.

In the embodiment of Figure 1, the apparatus can comprise a single computer in which the storage locations comprise a single storage medium segmented into separate areas. For example, the storage medium 12 can comprise a disc drive and each storage location can comprise a directory configured by the computer operating system e.g. UNIX or DOS. Alternatively, the storage locations may comprise separate storage media within the same computer e.g. the storage locations may comprise separate disc drives or storage locations may comprise different storage media.

Figure 2a illustrates an alternative embodiment wherein the data processing application 10, the storage mapping module 11, the storage interface 13, and a single storage location 14 are provided within a computer 30 which is connected over a network to other storage locations at other nodes in the network 50, 60 and 70. The other nodes in the network 50, 60 and 70 may comprise other computers like computer 30. In such a case, each may contain a storage mapping module, each of which is identical and requires simultaneous updating. Alternatively, only one of the computers may hold the storage mapping module 11 and the other computers would require access to it.

Figure 2b illustrates another embodiment of the present invention wherein the storage mapping module 11 is provided with the first storage location in a computer 40 in a network.

Figure 2c illustrates yet another embodiment wherein the storage mapping module 11 is provided in a totally separate node 41 of the network. In a network of computers, the node 41 can comprise a high speed server accessible by all of the other computers running data processing applications to enable the location of data files to be determined and updated rapidly.

Figure 3 illustrates a network of computers 100 connected over a network e.g. a LAN such as an ethernet 200 to which the present invention can be applied.

In the arrangement illustrated in Figure 3, any of the computers 100 can operate one or more data processing applications and the storage mapping module can reside in one or any of the computers 100.

Referring now to Figure 4, a method of operation of the present invention will now be described. In step S1 the data processing application generates the read or write request and in step S2 it is determined by the storage interface whether this is a read or write request. If the request generated is a read request, the storage location for the identified data is looked up in the storage mapping module and the data location is returned to the storage interface to allow the reading of the data in step S4. If in step S2 it is determined that the request is a write request, the storage interface identifies storage locations with sufficient space for the data in step S5. One of the storage locations identified with sufficient space is selected in step S6 and the storage interface then writes the data to the storage location in step S7. The look-up table in the storage mapping module is then updated to provide an entry for the data file to indicate the storage location of the data file.

Figure 5 illustrates an alternative embodiment of the present invention wherein step S5 is replaced with step S5A, in which not only is a storage location with sufficient space for storing the data identified, but it is also determined whether the storage location is marked as available. Thus in the method of Figure 5, storage locations can be marked as read only to prevent the further storage of data therein.
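The flow of Figures 4 and 5 can be sketched as a single dispatch function. The step labels in the comments follow the description above; the dictionary layouts, the function name and the rule of selecting the candidate with the most free space in step S6 are assumptions, since the description does not say how one candidate is selected.

```python
def handle_request(request, storage_map, locations, stored):
    """Dispatch a read or write request in the manner of Figures 4 and 5.

    request:   dict with "op" ("read" or "write"), "file_id" and, for writes, "data"
    locations: dict name -> {"free": int, "available": bool}
    stored:    dict (location, file_id) -> bytes, standing in for the discs
    """
    file_id = request["file_id"]
    if request["op"] == "read":                      # S2: read or write?
        location = storage_map[file_id]              # look up the location
        return stored[(location, file_id)]           # S4: read the data

    data = request["data"]
    candidates = [                                   # S5A: enough space and available
        name for name, info in locations.items()
        if info["available"] and info["free"] >= len(data)
    ]
    if not candidates:
        raise OSError("no available storage location has sufficient space")
    # S6: select one candidate; choosing the most free space is an assumption.
    location = max(candidates, key=lambda name: locations[name]["free"])
    stored[(location, file_id)] = data               # S7: write the data
    locations[location]["free"] -= len(data)
    storage_map[file_id] = location                  # update the look-up table
    return location
```

Setting `"available"` to `False` for a location corresponds to marking it read only: existing files can still be read through the look-up, but the location is never selected for a write.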

The present invention is widely applicable to the processing of data in many systems. An embodiment will now be described in which the technique is used for processing data concerning communications instances in order to generate the billing information for communications instances between two communications networks.

Where communications instances, for instance, telephone calls or data transfers, occur within a single network, it is known to log and process data related to those communications instances. Commonly, in a Public Switched Telephone Network (PSTN) data will be collected concerning call duration and processed with respect to at least time of day and call type, so that the network operator can generate an item on a bill destined for the subscriber who initiated the call.

In the past, PSTNs have been run primarily by government organisations as part of the national infrastructure. Privatisation of the PSTNs and the relaxation of regulatory monopolies in the UK means that there are more network operators available to the subscriber and these network operators must, for practical reasons, provide inter-network connection. This means that a network operator must take into account not only communications instances arising in their own network, or in a limited number of connected networks of independent but similar administrations, but also communications instances arising in a theoretically very large number of computing networks of different types and providing a wide variety of services to subscribers.

It is therefore necessary to collect and process data in connection with communication instances arising outside the operator's network but terminating in or simply crossing the operator's network.

A system has been designed by the present applicants to collect and process data relating to calls incoming to a major telecommunications network, the PSTN operated by British Telecommunications plc in the United Kingdom (hereinafter the British PSTN), which can produce and output sufficient detail to allow the associated network administration to generate account information which not only can be allocated to outside network administrations appropriately, but also supports itemised billing information. This system is the subject of International Patent Publications Nos. WO94/23529 and WO94/23530. Such a system is shown in outline in Figure 6. Networks 1 and 2 comprise networks outside the administration of the administration's network 3. Network 3 in this example comprises the British PSTN. In the PSTN telephone calls made from telephones are received by the Digital Local Exchanges (DLEs) and can be received directly by Digital Main Switching Units (DMSUs). The calls can be routed by the DLEs to the DMSUs in order to pass calls over long distances. The DMSUs can comprise points of interconnection between the first and third networks and the second and third networks. For every call that passes between network 3 and network 1 or network 2 a call detail record is produced by the exchange (DMSU and DLE for Indirect Access (IA) calls) at the point of interconnection. Periodically a Network Mediation Processor (NMP) 1 polls the exchange (DMSU or DLE) over an X25 network in order to download the call detail records. Call detail records contain routing information identifying the point of origination of the call and the destination, and information which can be used for billing purposes e.g. duration of the call and time of day. Call detail records are received in files from each of the exchanges (DMSUs or DLEs) and polled by the streamer 2. 
The streamer 2 uses information contained in a routing reference model to deduce the origination and/or destination network for the call. The streamer then divides each of the call record files for the DMSUs into files specific to each network. These call record files are then passed to the company system for charging and pricing. The results of the charging and pricing operation can be summarised into reporting tables which can be accessed by the client system 4. Also, itemised call records can be stored and any errors identified can be stored for subsequent handling.
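The division of call record files into network-specific files can be pictured with the sketch below. The route identifiers, the record layout and the separate list of unrouted records are assumptions; the actual routing reference model is described in the earlier publications and is not reproduced here.

```python
from collections import defaultdict

# Hypothetical routing reference model: route identifier -> network operator.
ROUTING_REFERENCE = {"R100": "network-1", "R200": "network-2"}


def stream_by_network(call_records, routing_reference):
    """Split one exchange's call records into per-network lists.

    Records whose route is not in the reference model are returned separately
    so that they can be handled as errors rather than silently dropped.
    """
    streamed = defaultdict(list)
    unrouted = []
    for record in call_records:
        network = routing_reference.get(record["route"])
        if network is None:
            unrouted.append(record)
        else:
            streamed[network].append(record)
    return dict(streamed), unrouted
```

Each list in the returned dictionary corresponds to one streamed file destined for charging and pricing against a single network operator.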

The differences between this embodiment of the present invention and the earlier system of WO94/23529 and WO94/23530 reside in the operations carried out by the company's system, the way in which the output of the streamer 2 is used, and the way in which data is stored in the company system and the streamer. Generation of the streamed call record files is thus as disclosed in WO94/23529 and WO94/23530, the disclosure of which is hereby incorporated by reference, and thus the operation of the streamer 2 will not be described in detail.

Figure 7 illustrates the hardware configuration of the call processing system. The streamer system 2 comprises two NMP pollers 101 and 102 which poll the NMPs over FTAM (File Transfer Access Method) links. Two NMP pollers 101 and 102 are provided in order to provide security by redundancy. The NMP pollers 101 and 102 comprise Hewlett Packard I70 workstations and are connected over an FDDI network 99 to a streamer processor 103 which comprises a Hewlett Packard T500/12 server. A streamer database server 104 which comprises a Hewlett Packard I70 server is also provided in the streamer system 2 and is connected to the streamer processor 103 over the FDDI network 99. Within the company's system there is provided a company server 106 which comprises a Hewlett Packard T500/8 server which is connected to 38 slave workstations acting as a batch array processor 108. The batch array processor 108 comprises 38 Hewlett Packard 735 workstations. The company server 106 is provided with optical disc storage 107 which although illustrated as comprising two optical discs actually comprises a multiple disc jukebox. Also within the company system 3 there is provided a data analyser 105 which comprises a Hewlett Packard H70 server for carrying out processing on errors and warnings which occur in the call records. An interface 110 is also provided connected to the FDDI network 99 to interface between a TCIP link to a client system 109 comprising a Hewlett Packard T500/12 server.

The call processing system is also provided with a system manager 111 comprising a Hewlett Packard 735 for performing general system management functions and an archiver 112 comprising a Hewlett Packard 867 workstation for archiving processed records. Also a remote back-up 113 is provided and comprises a Hewlett Packard HP847 workstation with optical storage 114.

It can be seen that the call processing system comprises a plurality of networked processors each of which carries out various sequential processes on batches of call records i.e. data files. In the previous system of WO94/23529 and WO94/23530 the data files were accessed by the call processing applications by reference to their physical location in the storage locations i.e. with reference to the file name and directory in which the file is stored. However, this has not utilized the flexibility of the network which allows for the storage of data in any of the disc drives available in the servers. Also, it has been hindered by the lack of flexibility since any changes in the data storage structure require changes in the call processing application software. By utilizing a database which can be present on any or all of the workstations and which contains information identifying the storage location of the data file, a highly flexible data processing system is provided.

Figure 8 illustrates in more detail how the streamed files from the streamer system 2 are processed. A streamed file storage system 2a is provided to store the streamed call record files generated by the streamer system 2. A streamed file database 2b is also provided to identify the files which are ready for billing. The company system 3 will read a streamed file from the streamed file storage system 2a by identifying a streamed file awaiting billing from the streamed file database 2b. Once the streamed file has been processed the streamed file database is marked accordingly and the streamed file can be deleted from the streamed file storage system 2a. In this way there is a 'de-synchronisation' between the operation of the streamer system 2 and the company system 3, unlike the earlier system wherein the streamed files were sequentially output to the master processor of the company system 3 and sequentially processed. In this embodiment, since the streamed files are stored, they can be processed in any order by selection from the streamed file database 2b. The company system 3 comprises a master processor 5 which contains a streamed file database 5a of the form illustrated in Figure 11 which will be described in more detail hereinafter. The master processor 5 operates a procedure 5b to read charged and priced itemised call records from a cluster 8 of slave processors and to store the itemised call records in an optical storage device 6. A merge process 5c operates to read summary information from the slaves of the cluster 8 in order to merge the summary information to form summary tables. A process 5d for reading errors and warnings from the slaves of the cluster 8 is also provided in the master processor 5. The errors and warnings can be passed to a call record analyser 105 for analysis of the errors and warnings.

In addition to the master processor 5, the company system includes a cluster 8 of slave processors, in the current embodiment 38 slaves operate in parallel. Each of these slaves runs a separate charging and pricing process and obtains the call records directly from the streamed file storage system 2a of the streamer system 2.

The client system 109 comprises a processing system connected to the company system 3 to enable the display and analysis of the summary information. In Figure 8 the passage of data is indicated by the thick lines whilst passage of control data is indicated by the thin lines. As can be seen the streamed file database 5a is central to avoiding the need for the streamed files output from the streamer system 2 to pass into the master processor 5. By providing the streamed file database 5a, which is constructed by reading entries from the streamed file database 2b of the streamer 2, the slaves of the cluster are able to identify streamed files available for processing and can then directly obtain streamed files from the streamed file storage system 2a of the streamer 2. Although the output of the cluster 8 is passed into the master processor 5, this is of substantially smaller volume than the unprocessed streamed files from the streamer 2 and thus a potential bottleneck for the data flow is avoided.

The provision of the cluster 8 of slave processors provides a highly resilient and stable system. As call volumes rise the capacity of the system can be increased simply by adding additional slave processors. Further, if any of the slave processors fail, this will not significantly affect the processing throughput of the company system 3.

By 'de-synchronising' the operation of the streamer 2 and the company system 3, if there are any problems with processing a file which is streamed out of the streamer 2, the charging and pricing process is not in any way held up, because the slave processors can select any of the streamed files in the streamed file storage system 2a and the problem file can therefore be circumvented. Since the streamed files can be of varying size, the period required for the charging and pricing process can vary greatly. Thus by utilizing the plurality of slave processors in the cluster 8, the call records can be efficiently charged and priced in parallel.

Within the company system the call records are priced by the slaves of the cluster 8 according to complex pricing and charging reference tables. When the processing is complete, the data can be passed from the slaves into the master processor 5 and entries in the streamed file database 5a, which comprises Oracle summary tables, can be incremented or changed. Reference tables provide exchange set-up data, routing reference data, accounting agreements, pricing and charging data, and various classes of exceptions. Pricing and charging reference tables are derived from BT's National Charging Database (NCDB) and inter-operator wholesale pricing agreements. These are used by each of the slaves of the cluster.

The slaves of the cluster 8 individually bid for processing tasks by identifying the streamed files awaiting processing using the stream file database 5a. Although in this embodiment 38 slaves are shown which are capable of handling 75 million itemised call records a day, this number can be any number dependent upon the call processing demand.

Before the method of processing the call records is described in detail, the structure of the streamed file Oracle database 5a will now be described with reference to Figure 11. The streamed file database 5a contains information copied from the streamed file database 2b of the streamer system 2. Information in the two databases is mirrored. When a file such as File1 is stored in the streamed file storage area 2a of the streamer system 2 and is ready for processing, an entry is made in the streamed file database 2b of the streamer system 2 and this is copied over to the streamed file database 5a of the company system 3. Initially the entry gives the file name and sets the main status to A, indicating that the file is waiting for call processing. All other entries remain blank at this time. After call processing the main status can be changed either to M, indicating that the file is awaiting merge processing, or to F, indicating that the call processing has failed. As can be seen in Figure 11, Fileδ failed call processing and thus there are no further entries. File2 however was successfully call processed and is awaiting merge processing. Since the result of the call processing is a summary file for merge processing, an itemised call records file and an errors and warnings file, the database contains entries indicating the file system number for these files. For File2 the status of the itemised call records is also set to A and the status of errors and warnings is set to A, to indicate that these files are ready for itemised call records storage and errors and warnings analysis.

Figure 12 illustrates an example of a table identifying the location of the stored merge files, itemised call records files and errors and warnings files. The first column is the file system number and is used as a reference to Figure 11. Type indicates the type of file which is stored for that file system number. The status indicates whether the storage location is available (A), read-only (N) or shut down (S). The location indicates the physical location assigned for the storage of the files. This can either be simply the directory reserved for storage of the files on the local processor, or in a network arrangement this can comprise an identification of the machine or node and the directory on that machine. Columns also indicate the free space currently available for that area and the total space available for that area.

Once merge processing has taken place to form summary tables, the main status can either turn to P, indicating completion of merge processing, or to W, indicating that the merge processing has failed. Further, the main status can also be set to K if call processing has failed for a known reason, or Z if merge processing has failed for a known reason. The status of the itemised call records storage can be set to P if storage is completed successfully, F if storage has failed, or K if storage has failed for a known reason. The errors and warnings status can be set to P for successful processing of errors and warnings, F for failure to process errors and warnings, K when there is a known failure, or N when there are no errors and warnings output from the slaves following call processing, i.e. no error processing is required.

Although in Figures 11 and 12 both a file system number and a file name are used so that files are allocated locations by the file system number, the file name and file system number can be integrated as a single identification number.
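The relationship between the two tables can be illustrated with a minimal Python sketch. It is not taken from the figures: the class names, field names, example values and the path format are all assumptions made for illustration, and only the columns named in the text above are modelled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StorageArea:
    """One row of a Figure 12 style table (field names are assumptions)."""
    file_type: str    # kind of file held in this area, e.g. "merge"
    status: str       # "A" available, "N" read-only, "S" shut down
    location: str     # a directory, or a node plus a directory
    free_space: int   # units are not specified in the text
    total_space: int


@dataclass
class StreamedFileRow:
    """One row of a Figure 11 style table (field names are assumptions)."""
    file_name: str
    main_status: str
    merge_fs: Optional[int] = None    # file system number of the merge file
    icr_fs: Optional[int] = None      # ... of the itemised call records file
    errors_fs: Optional[int] = None   # ... of the errors and warnings file


def locate(areas: Dict[int, StorageArea], fs_number: int, file_name: str) -> str:
    """Resolve a file system number and a file name to a physical path."""
    return f"{areas[fs_number].location}/{file_name}"


# Hypothetical example data, not the contents of Figures 11 and 12.
areas = {
    1: StorageArea("merge", "A", "node3:/data/merge", 500, 1000),
    2: StorageArea("icr", "A", "node4:/data/icr", 200, 1000),
}
row = StreamedFileRow("File2", "M", merge_fs=1, icr_fs=2)
merge_path = locate(areas, row.merge_fs, row.file_name)
```

Because the row stores only a file system number, moving a storage area means changing one `location` value in the second table; the rows that refer to it are untouched.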

The operation of call record collection and processing in the system will now be described with reference to Figures 9a, 9b and 9c. In step S10 rows of the streamed file database 2b with the status R are copied into the streamed file database 5a in the master processor 5. In step S11 the status of each row which has been copied is set to A in the streamed file databases 2b and 5a of the streamer system 2 and the master processor 5 respectively. In step S12 the slaves of the cluster 8 then process the streamed files, as will be described hereinafter in further detail. Following the processing of a streamed file, in step S13 the master processor receives a merge file, an errors and warnings file, and an itemised call records file from a slave for the processed file. The status of the row in the streamed file database 5a for the processed file is then set to M or F in step S14, dependent upon whether the call processing has been successful or not.

In step S15 the status of the row is then checked, and if the status is set to F, in step S16 an operator can manually intervene to try to correct the reason for the call processing failure. If in step S17 it is determined that correction is possible, the correction is made and in step S20 the status of the row is reset to A and the process returns to step S12, where a slave can select a streamed file for processing. If in step S17 it is determined that correction is not possible, in step S18 the status of the row is set to K and in step S19 an error report for the file is raised. The process can then return to step S12 for the processing of another file. If in step S15 it is determined that call processing has been successful, i.e. the status of the row is M, the locations of the merge file, the errors and warnings file, and the itemised call records file are entered in the row details of the streamed file database 5a, the status of the errors and warnings file is set to A or N depending upon whether there are errors and warnings present, and the status of the itemised call records file is set to A.

Two processes are then carried out in parallel and are indicated by the suffixes A and B to the step numbers hereinafter. The two parallel processes are the merging of the summary information and the storage of the itemised call records.

Considering first the storage of the itemised call records, in step S22A the storage location of an itemised call records file with a status A is identified from the entry in the streamed file database 5a. In step S23A the file is then read and in step S24A the master processor 5 attempts to store the file into the optical disc drive 6. In step S25A it is determined whether the storage attempt has been successful, and if not, in step S27A the status for the itemised call records file is set to F and in step S28A an operator can manually intervene to try to correct the reason for the storage failure. In step S29A it is determined whether the correction attempt has been successful, and if so, in step S32A the itemised call record status for the row is set to A and the process returns to step S22A to try to store the corrected file. If in step S29A it is determined that correction is not possible, in step S30A the itemised call record status for the row is set to K and in step S31A an error report is raised for the file. The process can then return to step S12 for the processing of another file.

Referring now to the merge processing operation, in step S22B the location of the merge file is identified from the database entry and in step S23B the merge file is read. In step S24B the summary information is merged to form a summary table. The summary tables can hold both daily summaries and monthly summaries and provide the basis from which end users can produce billing reports. Information held on these summary tables includes the number of calls made, the duration of these calls and various items of settlement information. Call details are grouped by items such as operator, date of calls, the accounting period they were processed in, the wholesale call class, the type of service, direction, point of interconnection, route, tariff period, tariff options and time period.

In step S25B it is determined whether the merge processing has been successful. If not, in step S26B the main status of the row is set to W and in step S27B an operator can manually intervene to try to correct the reason for the merge failure. In step S28B it is determined whether the correction attempt has been successful, and if so, in step S31B the main status of the row is re-set to M and the process returns to step S22B. If in step S28B it is determined that the correction attempt has not been successful, in step S29B the main status of the row is set to Z and in step S30B an error report is raised for the file. The process can then return to step S12 to process another file.

If in step S25B it is determined that the merge processing has been successful, in step S32B the main status of the row is changed to P and the merge file is deleted.
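The main status changes described for Figures 9a to 9c can be summarised as a small transition table. The following Python sketch is one reading of the text above, not a definitive implementation: the status letters come from the description, while the `TRANSITIONS` table and the `set_main_status` helper are hypothetical names introduced here.

```python
# Allowed main-status changes, as read from the steps described above.
TRANSITIONS = {
    "R": {"A"},        # S10/S11: row copied to database 5a, awaiting call processing
    "A": {"M", "F"},   # S14: call processing succeeded or failed
    "F": {"A", "K"},   # S20: corrected and retried / S18: failed for a known reason
    "M": {"P", "W"},   # S32B: merge complete / S26B: merge failed
    "W": {"M", "Z"},   # S31B: corrected and retried / S29B: failed for a known reason
    "P": set(),        # terminal: merge processing complete
    "K": set(),        # terminal: call processing failed, reason known
    "Z": set(),        # terminal: merge processing failed, reason known
}


def set_main_status(current: str, new: str) -> str:
    """Return the new status, rejecting changes the description does not mention."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal main status change {current} -> {new}")
    return new


# A file that fails its first merge attempt, is corrected, and then completes.
status = "R"
for step in ("A", "M", "W", "M", "P"):
    status = set_main_status(status, step)
```

Keeping the rules in one table means that a change to the retry policy only touches `TRANSITIONS`.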

The process then splits into two parallel processes. In step S43B entries in the streamed file database 2b of the streamer system 2 corresponding to those in the streamed file database 5a of the master processor 5 are set to status P. In step S44B the streamer then deletes the files in the streamed file storage system 2a which are marked P in the streamed file database 2b. In step S45B the status of the deleted files in the streamed file database 2b is then changed to D, and the entries so marked in the streamed file database 2b are deleted by an archiving process in step S46B.

In step S33B, when the row main status is P and the errors and warnings status is A in the streamed file database 5a, the errors and warnings file is located from the row details and read. In step S34B the errors and warnings file is then loaded into the call record error analyser 105 for analysis. In step S35B it is determined whether the loading has been successful, and if not, in step S37B the errors and warnings status is set to F. In step S38B an operator can manually intervene to try to correct the reason for the error file loading failure. In step S39B it is determined whether correction is possible, and if so, the errors and warnings status is set to A in step S40B and the process returns to step S33B. If correction is not possible, in step S41B the errors and warnings status is set to K and in step S42B an error report for the file is raised. The process then returns to step S12 for the processing of another file.

If in step S35B it is determined that the loading of the errors and warnings file has been successful, in step S36B the errors and warnings status is set to P and the errors and warnings file is deleted. The process can then return to step S12 for the processing of another file.

Although in the flow diagrams described hereinabove with reference to Figures 9a to 9c the merge processing, itemised call record storage, and errors and warnings loading have been described as a series of sequential operations, these are preferably carried out in parallel on the three output files from the slaves of the cluster 8 resulting from the call processing. These operations can be carried out in parallel with the call processing operations of the slaves of the cluster 8. Processing can be carried out by individual modules, which therefore allows for a significant degree of parallel processing. To remove entries from the streamed file database 5a, an archiving process will delete rows having a main status of P, K, or Z, an itemised call records file status of P or K, and an errors and warnings file status of P, K, or N.
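The archiving rule in the last sentence can be written as a single predicate. The Python sketch below takes the three conditions literally as a conjunction; the function name and the tuple layout of the example rows are assumptions made for illustration.

```python
def is_archivable(main: str, icr: str, errors: str) -> bool:
    """True when a row of database 5a may be removed by the archiving process."""
    return (main in {"P", "K", "Z"}
            and icr in {"P", "K"}
            and errors in {"P", "K", "N"})


# Hypothetical rows: (file name, main status, ICR status, errors and warnings status)
rows = [
    ("File1", "P", "P", "N"),   # fully processed, no errors and warnings output
    ("File2", "M", "A", "A"),   # still awaiting merge processing
    ("File3", "Z", "K", "P"),   # merge failed for a known reason
]
remaining = [r for r in rows if not is_archivable(r[1], r[2], r[3])]
```

Only rows in which all three statuses have reached a final value are removed, so a file with outstanding work on any of its three outputs stays in the database.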

A first method of call processing the streamed files will now be described with reference to Figure 10a. The method of Figure 10a is carried out by a call processing program and a control program. In step S120 the control program of the slave polls the streamed file database 5a of the master processor 5 to identify a streamed file requiring processing. In step S121, when a file is identified, the file is flagged in the streamed file database. In step S122 the streamed file is then copied from the streamed file storage system 2a in the streamer system 2 into the slave. The streamed file storage system 2a, although shown in Figure 8 as being within the streamer system 2, can in fact comprise any storage locations throughout the network.

In step S123 the slave control program starts the call processing program and in step S124 the slave control program waits until the call processing program completes call processing. In step S125, once the call processing has been completed, the call processing program is shut down by the slave control program, and in step S126 the slave control program copies the summary information, the errors and warnings, and the itemised call records produced by the call processing program to the master processor 5. In step S127 the slave control program then changes the main status for the file in the streamed file database to M or F and sets the itemised call record status to A and the errors and warnings status to A or N. Also, the locations of the stored merge file, itemised call records file and errors and warnings file are identified by the appropriate file system number in the row. In step S128 the slave control program then deletes the local copy of the streamed file and the call process files and the process returns to step S120. This process is repeated and is carried out in parallel on each of the slaves of the cluster 8.
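One pass of the slave control program of Figure 10a might be sketched in Python as follows. This is a simplified, single-process illustration under stated assumptions: the database is an in-memory dictionary, the flag of step S121 is modelled as a hypothetical `claimed_by` field, and `fetch` and `call_process` are placeholder callables standing in for the file copy and the call processing program. A real shared database would need an atomic update when flagging a row.

```python
from typing import Callable, Dict, Optional, Tuple

Row = Dict[str, str]
Outputs = Tuple[str, str, str]  # (summary, errors and warnings, itemised call records)


def claim_next(db: Dict[str, Row], slave_id: str) -> Optional[str]:
    """S120-S121: find a file with main status A and flag it for this slave."""
    for name, row in db.items():
        if row["main"] == "A" and "claimed_by" not in row:
            row["claimed_by"] = slave_id  # assumed flagging mechanism
            return name
    return None


def run_once(db: Dict[str, Row], master_store: Dict[str, Outputs], slave_id: str,
             fetch: Callable[[str], str],
             call_process: Callable[[str], Outputs]) -> Optional[str]:
    """Process at most one streamed file; return its name, or None if idle."""
    name = claim_next(db, slave_id)
    if name is None:
        return None
    local_copy = fetch(name)                      # S122: copy file into the slave
    row = db[name]
    try:
        outputs = call_process(local_copy)        # S123-S125: run call processing
    except Exception:
        row["main"] = "F"                         # S127: call processing failed
    else:
        master_store[name] = outputs              # S126: copy outputs to the master
        row["main"] = "M"                         # S127: awaiting merge processing
        row["icr"] = "A"
        row["errors"] = "A" if outputs[1] else "N"
    del local_copy                                # S128: drop the local copy
    return name
```

Each slave would call `run_once` in a loop, which corresponds to the return to step S120.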

Figure 10b illustrates an alternative method of carrying out the call processing. This method differs from the method of Figure 10a in that the call processing program runs continuously and there are a plurality of call processing programs running on a single processor in a multitasking environment. This improves the data throughput and reduces the processing time.

Referring to Figure 10b, in step S120 the slave control program polls the streamed file database 5a to identify a streamed file requiring processing. In step S121, when a file is identified, it is flagged in the streamed file database. In step S122 the streamed file is copied from the streamed file storage system 2a of the streamer system 2 into the slave. In step S123 the slave control program passes the file to one of the call processing programs which is requesting data. The call processing program can then carry out call processing, and in step S124, if there are any further call processing programs requesting a file, the process returns to step S120, whereby the control program will get another file for processing. In step S125, if any call processing programs have finished processing a file, in step S126 the slave control program copies the summary information, the errors and warnings, and the itemised call records produced by the call processing program to the master processor 5. In step S127 the slave control program then changes the main status of the file in the streamed file database to M or F, enters the status for the itemised call records as A, and enters an errors and warnings status as A or N. In step S128 the slave control program then deletes the local copy of the streamed file and the call process files, and in step S129 the call processing program then requests more data and the process returns to step S120.
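The arrangement of Figure 10b, in which several continuously running call processing programs request files from one control program, resembles a worker pool fed from a shared queue. The Python sketch below illustrates that pattern only. It uses threads within one process as a stand-in for separate programs in a multitasking environment, and `run_call_processors` and `call_process` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Dict, Iterable


def run_call_processors(files: Iterable[str], n_programs: int,
                        call_process: Callable[[str], str]) -> Dict[str, str]:
    """Feed files to n_programs long-running workers and collect their outputs."""
    work = queue.Queue()
    results: Dict[str, str] = {}
    lock = threading.Lock()

    def call_processing_program() -> None:
        while True:
            name = work.get()          # request data from the control program
            if name is None:           # sentinel: no more files to process
                return
            output = call_process(name)
            with lock:                 # protect the shared results dictionary
                results[name] = output

    programs = [threading.Thread(target=call_processing_program)
                for _ in range(n_programs)]
    for p in programs:
        p.start()
    for name in files:                 # the control program hands out files
        work.put(name)
    for _ in programs:                 # one sentinel per worker
        work.put(None)
    for p in programs:
        p.join()
    return results
```

Because the workers stay alive between files, the start-up and shut-down of the call processing program in steps S123 and S125 of Figure 10a is not repeated for every file.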

It can thus be seen that in this data processing system the information on the storage location of the data is stored in a table or database, thus allowing the data storage structure to be changed or the data to be moved without affecting the data processing operations, i.e. there is no need to change the code of the data processing applications. The table of Figure 12 indicates the location of the data, the status of the storage location and the free space available in the storage location. In this way the storage of data can be controlled simply using this table. If different locations can store data in different formats, this table can also include a column indicating the formats of the data stored in the location. In this way data can be held in different formats, such as normal and compressed, without the need for the data processing applications to take such formats into consideration. This greatly simplifies the programming required for the data processing applications.

The interpretation or use of the data in the table of Figure 12 is under the control of specific routines which can interface with the data processing applications. These routines can simply receive storage instructions from the data processing applications and return the location information, or they can act as an interface to receive data from and pass data to the data processing applications.
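A routine of the second kind, which receives and passes data on behalf of an application, could look like the following Python sketch. It is a minimal illustration under several assumptions: storage is simulated with an in-memory dictionary, free space is counted in bytes, an area is chosen for writing only when its status is A, and the class and method names are hypothetical. Reads from a shut-down area and overwriting an existing file are not modelled.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Area:
    file_type: str     # kind of file this area holds
    status: str        # "A" available, "N" read-only, "S" shut down
    free_space: int    # assumed here to be in bytes


class StorageInterface:
    """Lets an application store and fetch files by identifier only."""

    def __init__(self, areas: Dict[int, Area]) -> None:
        self.areas = areas
        self.where: Dict[str, int] = {}                 # file id -> file system number
        self.blobs: Dict[Tuple[int, str], bytes] = {}   # stand-in for real storage

    def write(self, file_id: str, file_type: str, data: bytes) -> int:
        """Pick the first available area of the right type with enough room."""
        for number, area in sorted(self.areas.items()):
            if (area.file_type == file_type and area.status == "A"
                    and area.free_space >= len(data)):
                self.blobs[(number, file_id)] = data
                area.free_space -= len(data)
                self.where[file_id] = number            # record the chosen location
                return number
        raise OSError(f"no available {file_type} area with room for {file_id}")

    def read(self, file_id: str) -> bytes:
        return self.blobs[(self.where[file_id], file_id)]

    def delete(self, file_id: str) -> None:
        number = self.where.pop(file_id)
        data = self.blobs.pop((number, file_id))
        self.areas[number].free_space += len(data)
```

The application never sees a directory or node name, so setting an area's status to S or N redirects new writes without any change to the application code.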

In Figure 12 it can be seen that different types of data can be stored in different locations, since the grouping of data of different types into different areas can enhance the processing capability of the system. Also, the grouping of different types of data into corresponding groups of memory locations allows for the data to be more easily located manually.

It can be seen from Figure 12 that the storage of data can be controlled using the status indication. For example, a storage location, e.g. a disc drive, can be shut down so that the system is forced to choose another storage location for the storage of data. This also applies when the status is set to read only.

Although the present invention has been described with reference to embodiments, the present invention is not limited to such embodiments and modifications which fall within the spirit and scope of the present invention will be apparent to a skilled person in the art.

Claims

1. Data processing apparatus comprising: data processing application means for processing data files and for generating storage operation instructions for data files identified by identification information independent of the storage location of the data files; storage means comprising a plurality of storage locations for storing data files; storage mapping means for storing information on the storage locations of stored data files in said storage means and identification information for the stored data files, for receiving the identification information for a data file from said data processing application means, and for generating location information for the data file; and storage interface means for receiving location information for the data file from said storage mapping means and storage operation instructions from said data processing apparatus indicating a storage operation to be carried out on the data file, and for carrying out the storage operation for the data file in a said storage location indicated by said location information.

2. Data processing apparatus according to claim 1 wherein each of said storage operation instructions comprise one of a data file reading instruction, a data file writing instruction, and a data file deleting instruction, and said storage interface means is adapted to read, write or delete data files in said storage means.

3. Data processing apparatus according to claim 1 or claim 2 wherein said storage means comprises a storage medium and said storage locations comprise designated areas of said storage medium.

4. Data processing apparatus according to claim 1 or claim 2 wherein said storage means comprises a plurality of interconnected storage mediums, each storage location comprising at least an area of a said storage medium.

5. Data processing apparatus according to any preceding claim further comprising a plurality of said data processing application means operable in parallel.

6. Data processing apparatus according to any one of claims 1 to 4 comprising a plurality of networked computers, a number of the networked computers each having a said storage location such that said storage means is spread over said number of networked computers, said data processing application means residing in a said networked computer.

7. Data processing apparatus according to claim 6 comprising a plurality of said data processing application means each residing in a said networked computer.

8. Data processing apparatus according to claim 6 or claim 7 wherein said storage mapping means resides in a said networked computer without a said storage location.

9. Data processing apparatus according to any preceding claim wherein if said data processing application means generates a data file write instruction for a data file, said storage interface means is adapted to identify a said storage location with space for the data file, write the data file to the identified storage location, and return information on the identified storage location to said storage mapping means, said storage mapping means being adapted to receive the information on the identified storage location and to store it with the corresponding identification information for the data file.

10. Data processing apparatus according to any preceding claim wherein any of said storage mapping means is adapted to store access control information to control access to said storage locations.

11. Data processing apparatus according to claim 10 wherein said storage mapping means is adapted to be able to only allow reading of data files therefrom.

12. Data processing apparatus according to claim 11 wherein said data processing application means is adapted to generate a successful processing indication when each of the read data files is successfully processed, and said storage interface means is responsive to the successful processing indication to cause said storage location to delete the pre-processed data file.

13. Data processing apparatus according to claim 11 or claim 12 including means for monitoring the free capacity of the storage locations and for automatically configuring said storage mapping means to allow only reading of data files from any of said storage locations when the monitored free capacity falls below a threshold.

14. Data processing apparatus according to claim 12 comprising a plurality of data processing application means for performing sequential processing of data files, each data processing application means being adapted to input a data file read by said storage interface means, process the input data file, and output the processed data file to said storage interface means for storage in sequence, and means for automatically configuring said storage mapping means to allow only reading of data files from any of said storage locations when the proportion of data files for an intermediate process accumulates to a threshold level.

15. Data processing apparatus according to any preceding claim including means for selecting a format for storing each of said data files, said interface means being adapted to convert each data file accordingly when writing data files and to convert each data file when reading data files, and said storage mapping means is adapted to contain information on the format in which data files are stored in said storage means.

16. Data processing apparatus according to any preceding claim wherein said storage interface means comprises a part of said data processing application means.

17. A data processing method for use with data storage means comprising a plurality of storage locations, the method comprising the steps of: a first generating step of generating data file identification information independent of the storage location of the data file; a second generating step of generating a storage operation instruction for a data file identified by said identification information; a third generating step of generating storage location information for the data file using stored information relating the storage location of stored data files to the identification information for the stored data files and the data file identification information; and performing the storage operation for the data file in a said storage location indicated by said storage location information generated for the data file.

18. A data processing method according to claim 17 wherein said storage operation instructions comprise a data file reading instruction, a data file writing instruction or a data file deleting instruction, and the storage operation performed is a data file reading operation, a data file storage operation, or a data file deletion operation.

19. A data processing method according to claim 17 or claim 18 wherein if the generated storage operation instruction is a write instruction, the third generating step includes the step of identifying a said storage location with space for the data file and generating the storage location information in accordance with the result of the identification, and the storage operation performed is the writing of the data file to the identified location.

20. A data processing method according to any one of claims 17 to 19 including the step of selectively allowing only reading of data files from any of said storage locations.

21. A data processing method according to claim 20 wherein said storage operation is a data file reading operation including the steps of processing a read data file, and if processing is successfully completed deleting the read data file from said storage location.

22. A data processing method according to claim 20 or claim 21 including the steps of monitoring the free capacity of said storage location, and automatically allowing only reading of data files from any of said storage locations when the monitored free capacity falls below a threshold.

23. A data processing method according to claim 21 wherein each data file undergoes sequential processing by a different processing operation, each data file is read from a said storage location, processed, and stored in a said storage location a plurality of times, and only reading of data files is allowed from any of said storage locations when the proportion of data files for an intermediate processing step accumulates to a threshold level.

24. A data processing method according to any one of claims 17 to 23 including the steps of selecting a format for storing each of said data files, converting data files accordingly when writing data files and reconverting each data file when reading data files, and storing information on the format in which data files are stored in said storage locations.

25. Data processing apparatus substantially as hereinbefore described with reference to and as illustrated in any of the accompanying drawings.

26. A data processing method substantially as hereinbefore described with reference to and as illustrated in any of the accompanying drawings.