
WO2004059526A2 - Information management system - Google Patents

Publication number
WO2004059526A2
Authority
WO
WIPO (PCT)
Prior art keywords
information
datasets
subcategory
dataset
database
Prior art date
Application number
PCT/EP2003/014897
Other languages
French (fr)
Other versions
WO2004059526A3 (en)
Inventor
Richard Wiedemann
Ilonka Ringling
Original Assignee
Richard Wiedemann
Ilonka Ringling
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Richard Wiedemann, Ilonka Ringling
Priority to AU2003298246A (AU2003298246A1)
Publication of WO2004059526A2
Publication of WO2004059526A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2425 Iterative querying; Query formulation based on the results of a preceding query

Definitions

  • This application relates to an information management system, a method for the retrieval of information from an information management system and a dataset for use in an information management system.
  • the Internet offers other sources of information about products and databases.
  • a number of services or portals have been developed to aid the retrieval of data from the Internet. Examples of well-known ones include Yahoo, Google, Alta-Vista and Ask Jeeves.
  • These information services generally allow a user of the Internet to enter search terms in a query box and retrieve a list of websites that match the input search terms. For example, the user might wish to locate all Chinese restaurants in Munich. The user would then input in the query box the terms "Chinese+restaurants+muenchen" and would receive a list of entries in, for example, Yahoo.
  • the list of entries includes a reference to the URL of the information dataset on the Internet that contains the information for which the user may be looking.
  • The Yahoo portal is an example of an edited information retrieval system, and thus some of the problems associated with information gathering on the Internet are avoided, as the entries are to a certain extent categorised and edited by editorial staff members. For example, the editors will recognise the different spellings of the city of Munich, Germany, and also the fact that there are towns in the United States which go by the same name and which may not be of interest.
  • Another example of an information retrieval system is the Google system.
  • This system adopts a different approach than Yahoo and is only partially edited. Rather the Google system crawls the Internet attempting to catalogue all of the pages. Thus the user of the Google system will, after inputting the query, receive a more comprehensive list of entries. These will, however, include a large number of entries which are of no relevance and which can only be eliminated from the search by a brief preview of their contents by the user himself.
  • This type of information retrieval system is also dependent on the information content of the information datasets placed on the net. The information content is thus dependent on the information dataset's author who has little or no idea in advance about how or why the information dataset will be accessed.
  • PCT application No. WO 01/50346 relates to a system and method for graphically displaying the results of a search on the terminal of a user.
  • A graphical method for refining a search is illustrated.
  • The user of the system first enters a keyword; a search of the Internet is carried out for this keyword and the results are stored in a database.
  • the user is then offered subcategories that he or she can select in order to refine the search until a limited set of choices can be displayed in a further window.
  • the user can then select one of the members of this set of choices in order to obtain the information for which he or she is looking.
  • US Patent No. US-B-6 208 987 (Nihei) assigned to NEC Corporation is an example of a hierarchically structured indexed information retrieval system.
  • the system described provides a series of hierarchically structured indices to allow the user to narrow down his or her search until a limited set of information datasets are obtained.
  • Each of the documents is categorised into one of a number of subcategories to allow retrieval through the system.
  • the user is offered no opportunity to expand his or her original search using synonyms of the original search term and thus make the search more comprehensive.
  • the lack of geographically relevant information means that the user may be supplied with too many documents and again this consumes memory space and processing power as well as wasting the user's time.
  • The inability to limit the language choices leads to documents being presented to the user in a language that he or she may not be able to read. Although it might be possible to present documents in more than one language, this would increase the amount of memory space required to present the results and increase the access time to the document.
  • a further hierarchically organised information retrieval system is described in the US-Patent No. US-B-6 094 652 (Faisal) assigned to the Oracle Corp.
  • the information retrieval system provides a knowledge base that has a plurality of nodes of terminology arranged hierarchically which reflect the associations among the terminology. These associations include synonyms and/or categories and subcategories of information.
  • a query submitted by the user is processed using the knowledge base in order to identify hierarchical query feedback terminology.
  • the user is presented with the query feedback that he or she may utilise in order to reformulate the query.
  • This patent fails to describe, however, the types of information that are supplied after processing of the query.
  • EP-A-1 160 686 (Riva, Frederico) describes an information retrieval system used for providing links to websites in response to a search term.
  • the system provides for a thesaurus database in which synonyms of the search term as well as the category to which the search term belongs is provided.
  • the information retrieval system described in this patent application suffers from the limitation that it is only able to identify the content of the documents on the website from the information provided by the author of the information datasets.
  • the thesaurus database is furthermore not hierarchically oriented. Thus categories are not divided into subcategories and the user using the thesaurus database cannot refine the search to retrieve fewer documents. Thus this system consumes both memory and processing resources to obtain documents that may be of limited or no value.
  • The information retrieval system includes a knowledge base that has a hierarchical set of relationships among various terms. Users of the information retrieval system are able to search or browse a collection of documents relating to items offered for sale by selecting desired values of attributes associated with the documents.
  • The information about the products is not available in a structured database and thus there are limitations with respect to combining several search terms. Furthermore, there is no indication that the system can be used in a multi-language environment.
  • the managed database - also known as a curated database - has a plurality of information datasets. One or more categories are assigned to each one of the plurality of the information datasets.
  • The method of retrieval has the following steps: (a) a first step of inputting a search term;
  • the database is a managed database with a defined set of categories and subcategories
  • information in the information dataset has been checked either by the author of an information dataset or by an editor.
  • a classifier may have classified the information dataset using the subcategories and subcategory-dependent criteria.
  • the selection of the subcategory-dependent criteria ensures that the desired information is retrieved, which saves processing time and storage space.
  • The user can select one or more of the subcategory-dependent criteria.
  • the order of the members of the list of subcategory-dependent criteria is dependent on the frequency of selection, so that most-commonly selected criteria are placed near the top of the list.
  • the use of the relationship database and the subcategory database allows the user of the information management system to more closely define the search criteria. Examples of the information datasets include advertisers promoting their goods and services.
  • A further step of retrieving from a relationship database a relationship dataset comprising a set of relationship terms of the search term and including the search term is used.
  • The relationship dataset enables the retrieval of the subcategory dataset comprising a set of subcategory values.
  • the relationship dataset can include synonyms, antonyms, and heteronyms of the search term, thus allowing a more comprehensive search.
  • A final step presents to the user at least part of the content of a selected one of the one or more information datasets. The user can therefore select from the summary only those information datasets in which he or she is interested. Data transfer between the database and the user is thereby reduced.
  • The user can select the language in which the information management system operates and the information datasets are presented. Furthermore, the user can choose only those information datasets related to addresses of the advertisers from which he or she can expect a response in a desired language.
  • the relationship database includes explanations as to the relationship between the search term and the relationship term.
  • the user can select only those relationships that directly relate to the information datasets in which he or she is interested.
  • the user can be provided with the geographical coordinates of a business to allow the selection only of those businesses within a certain area. This is particularly advantageous when the information management system is used with a mobile device as only those information datasets are transmitted to the user that are in a nearby locality.
  • the user can use the geographical coordinates to carry out localised marketing or other activities.
  • a further embodiment of the invention includes the possibility of selecting at least one of the information datasets for temporary storage in a further database. This allows the user to retrieve the information datasets at a later stage without the need for repeating the search and / or to ensure that the user is able to access the same information datasets. Alternatively, only the summary of the information datasets can be saved in order to reduce memory space and / or data traffic.
  • the order of the one or more information datasets in the summary is calculated using a frequency table. This has the advantage that the information datasets are distributed evenly in different summaries to ensure that certain information datasets do not continually lead the summary.
  • The invention also provides for an information management system comprising a managed database having a plurality of information datasets to which is assigned one or more subcategories, a relationship database having a plurality of relationship datasets, wherein the relationship datasets comprise a list of relationship terms which are synonyms of each other, and a subcategory database having a plurality of subcategory datasets, wherein the subcategory datasets comprise a first term and a list of second terms which are subcategories of the first term.
  • The information management system of the invention includes selection means and presentation means to select and present the information datasets retrieved.
  • the information management system further comprises a plurality of managed databases. This reduces the load on the individual databases.
  • Each one of the databases could include only those information datasets related to a particular country or group of countries. Since the majority of users will only search information datasets relating to one country, data traffic on a network connection can be reduced by providing the database locally.
  • the invention also provides for a dataset for storing in a memory of the information management system.
  • the dataset includes an identification number, a designation and a geographical identification.
  • the designation is a title which identifies the entry in the dataset and could be, for example, the name of a business or a list of the activities in which the business is involved.
  • the geographical identification is, for example, the address and/or GPS coordinates.
  • the dataset includes a language identifier to identify the language in which the business can communicate.
  • The dataset should also include contact information and can also be provided with a counter to indicate how often it has been included in a summary presented to the user and its position therein.
  • the invention provides for a computer program product that can be loaded into the computer for performing the method of the invention.
  • The computer program product can be on a disk, a CD-ROM, on an Internet site or in another form.
  • Fig.1 is an outline of a computer system and network used in the invention.
  • Fig.2 is an example of a first search screen used on starting the information management system.
  • Fig. 3 is an example of an industry appropriate form used to further select criteria for the search.
  • Fig. 4 is an example of part of the content of an information dataset displayed on the presentation device.
  • Fig. 5 is an example of a dataset stored in a database for use in the invention.
  • Fig. 6 is a flow diagram illustrating the steps of the invention.
  • Fig. 7 is an array to optimise access to information datasets.
  • Fig. 8 is a two dimensional representation of the industry sector ranking database.
  • Fig. 9 is a summary in two dimensions of the industry sector ranking database.
  • Fig. 1 shows an example of a computer system 100 used for implementing the information retrieval system of the invention.
  • the computer system 100 comprises a server computer 110 connected through a network 130 to a client computer 140.
  • the network 130 may be, for example, the Internet, a local area network, a wide area network or a wireless network.
  • An information retrieval system may be running as a retrieval module 120 on the server computer 110.
  • the server computer 110 further includes an interface module 125 for interfacing to the network 130.
  • Such interface modules 125 include but are not limited to the Microsoft Internet Server, an Apache Server or a Topaz server.
  • the client computer 140 will also be running an interface module 150 for accessing the network.
  • Such interface modules 150 include but are not limited to a telephone or DSL modem, an Ethernet connection, a wireless connection and interactive television and will include a browser such as Microsoft Explorer or Netscape Navigator.
  • A database 160 is connected to the server computer 110.
  • the database 160 contains one or more datasets 170, a relationship database 180, a subcategory database 190 and a reserved section 195.
  • the database 160 can be implemented in any suitable hardware that is known to the skilled person and includes but is not limited to solid-state memory, tapes or magnetic and optical media.
  • the one or more datasets 170 are, in one embodiment of the invention, ordered items of data stored in a structured database such as an Oracle database, an IBM DB2 database, a Lotus Domino database or a Microsoft SQL database.
  • the data stored in the one or more information datasets is managed or curated. That is to say that the data stored is edited and placed in appropriate fields of the database to allow accurate and reliable retrieval of the data.
  • the data relates to advertisers of products or services.
  • the management and curation of the one or more information datasets 170 is carried out either by an administrator, a data curator or the advertiser itself.
  • The curation of the information datasets 170 includes a classification of the information dataset using subcategories stored in the subcategory database 190.
  • a display device such as a visual display unit and an input device such as a keyboard and/or a mouse are attached to the client computer 140.
  • the client computer 140 could also be a mobile device such as a PDA or a mobile phone in which case the visual display unit and the input device are incorporated in the same unit.
  • The user of the client computer 140 who wishes to use the information management system of the invention starts the information management system with the aid of the input device.
  • One example of the initiation procedure is the selection of an icon displayed on the visual display unit.
  • a command could be input at a command line or a special key pressed on the data input device.
  • Fig. 2 is only a representative example of a display layout 200 that can be called; other display layouts fall within the scope of the invention.
  • The display layout 200 comprises a number of fields. In a language field 210, the language of the information management system is selected.
  • In a location field 220, the country in which entries are to be searched is selected in step 610.
  • A plurality of server computers 110 can be employed. Each one of the server computers 110 is dedicated to holding information datasets relating to one country or one group of countries, and entry of the value into the location field 220 generates a message from the client computer 140 to select the appropriate one of the server computers 110. Since each of the server computers 110 is substantially identical, only one of the server computers 110 is shown in Fig. 2 for clarity.
  • a first search term is input in step 615.
  • the search term "Archi" is input, but this is not limiting of the invention.
  • the search term could include a wild card at the beginning and/or end of the search term.
  • The client computer 140 uses the value of the first search term to interrogate in step 620 the database 160 on the server computer 110 to produce a search dataset that includes a list of search terms matching the inputted first search term.
  • the search dataset is returned to the client computer 140 to produce in a first display field 240 a list of search terms matching the inputted first search term. In the example shown only part of a beginning of a word has been input and, as a result, all of the first search terms that include this word part are displayed in step 625 in the first display field 240.
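By way of illustration only, the matching of a word part such as "Archi" against stored search terms might be sketched as follows. The function name, the list of known terms and the use of `*` as the wild-card character are assumptions made for this sketch and are not taken from the application.

```python
from fnmatch import fnmatchcase


def match_search_terms(user_input, known_terms):
    """Return the known search terms matching the typed word part.

    Hypothetical helper: input without an explicit '*' wild card is
    treated as a word part that may occur anywhere in a term.
    """
    pattern = user_input.lower()
    if "*" not in pattern:
        pattern = f"*{pattern}*"
    # Case-insensitive comparison by lower-casing both sides
    return sorted(t for t in known_terms if fnmatchcase(t.lower(), pattern))


# Illustrative terms only; a real system would query the database 160
terms = [
    "Architect",
    "Architectural model maker",
    "Landscape architect",
    "Archive services",
    "Lawyer",
]
print(match_search_terms("Archi", terms))
```

With a trailing wild card ("Archi*") only the terms beginning with the word part are returned, which corresponds to the wild card at the end of the search term mentioned above.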
  • the user of the information management system can then select in step 630 the most appropriate one of the first search terms in the first display field 240 using the input device.
  • The client computer 140 sends a message including the selected one of the first search terms to the server computer 110, and it is passed to the relationship database 180 in order to generate a relationship dataset that includes a list of relationship terms that are related to the first search term.
  • The relationship terms include, for example, synonyms, antonyms and heteronyms of the first search term.
  • the relationship dataset is displayed on the display unit in step 632 and the user can select those members of the relationship dataset that are relevant to his or her first search term in step 634.
  • A message is sent to the server computer 110 to generate a subcategory dataset containing subcategories for the members of the relationship dataset selected.
  • the members of the subcategories dataset allow the user to further refine the search for the selected ones of the relationship terms.
  • the subcategory dataset is returned to the client computer 140 to produce in step 635 in a second display field 250 a list of subcategories for the selected one of the search terms.
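The two look-ups described in steps 632 to 635 can be pictured with a minimal sketch. The dictionaries below stand in for the relationship database 180 and the subcategory database 190; their contents, the relationship labels and the function names are purely hypothetical.

```python
# Hypothetical stand-in for the relationship database 180:
# search term -> {relationship term: explanation of the relationship}
RELATIONSHIPS = {
    "architect": {
        "architect": "search term",
        "building designer": "synonym",
        "planner": "synonym",
    },
}

# Hypothetical stand-in for the subcategory database 190:
# first term -> list of second terms (its subcategories)
SUBCATEGORIES = {
    "architect": ["residential buildings", "industrial buildings"],
    "building designer": ["residential buildings", "interior design"],
    "planner": ["urban planning"],
}


def relationship_dataset(search_term):
    """Relationship terms for a search term, always including the term itself."""
    return RELATIONSHIPS.get(search_term, {search_term: "search term"})


def subcategory_dataset(selected_terms):
    """Subcategories of the selected relationship terms, without duplicates."""
    result = []
    for term in selected_terms:
        for subcategory in SUBCATEGORIES.get(term, []):
            if subcategory not in result:
                result.append(subcategory)
    return result


related = relationship_dataset("architect")
chosen = ["architect", "building designer"]  # user selection in step 634
print(subcategory_dataset(chosen))
```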
  • The subcategories are generated by examining typical information datasets already present on the Internet and relating to the relevant industries. This can be substantially automated using a crawler engine to examine the contents of the information datasets and list the frequency of the technical terms which are used.
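The frequency listing could, for instance, look like the following sketch. It assumes the crawler has already delivered the page texts as strings; the stop-word list, the minimum count and the function names are illustrative assumptions and not part of the application.

```python
import re
from collections import Counter

# Illustrative stop words; a real list would be language-specific
STOP_WORDS = frozenset({"and", "our", "the", "for", "with"})


def term_frequencies(documents):
    """Count how often each term occurs across the crawled texts."""
    counts = Counter()
    for text in documents:
        # Sequences of letters only (digits and underscores excluded)
        words = re.findall(r"[^\W\d_]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts


def candidate_subcategories(counts, min_count=2):
    """Terms frequent enough to be proposed to an editor as subcategories."""
    return sorted(term for term, n in counts.items() if n >= min_count)


pages = [
    "Architects design passive houses and office buildings.",
    "Our architects plan passive houses.",
]
print(candidate_subcategories(term_frequencies(pages)))
```

The resulting candidates would still need to be reviewed by an editor, in keeping with the managed nature of the database.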
  • A message is sent from the client computer 140 to the server computer 110 to retrieve an industry-appropriate input form from the database 160.
  • An example of an industry-appropriate input form 300 is shown in Fig.3. This is displayed on the visual display device in step 645.
  • The industry-appropriate input form 300 includes a number of fields that can be completed in step 650. Each of these fields represents one or more criteria that the user wishes to see fulfilled. For example, in the general field 310, the user can input the name of a company or part of an address. The criteria can be industry-specific. So, for example, the form 300 of Fig. 3, which is appropriate to an architectural practice, includes, but is not limited to, the types of houses which an architectural practice might typically design. The form 300 also includes details of services offered by the practice, such as "building supervisor". The criteria are stored with the relevant subcategory in the subcategory database 190.
  • The search start button 320 is selected and a message is sent from the client computer 140 to the server computer 110.
  • The server computer 110 uses the criteria in order to select those information datasets from the database 160 which match the criteria to form a response dataset.
  • The response dataset is transferred from the server computer 110 to the client computer 140, where it is displayed in step 660 as a hit list 400 similar to the one shown in Fig. 4.
  • The order of the items in the hit list 400 is determined firstly by evaluating the relevance of each item to the criteria entered by the user. For example, an information dataset 170 that fulfils all of the criteria is indicated as fulfilling 100% of the criteria and takes first place. An information dataset 170 that does not fulfil all of the criteria but is nonetheless of some interest to the user is listed after the datasets that fulfil 100% of the criteria. In this case, the number of criteria appearing in the information dataset 170 is expressed as a percentage of the total number of criteria selected by the user. In Fig. 4 this percentage is rounded to the nearest 10%, but it could be rounded to other values.
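The percentage described here can be computed as in the following sketch. The function name and the choice to round exact halves upwards are assumptions; the application only states that the value is rounded to the nearest 10%.

```python
import math


def criteria_match_percent(selected_criteria, dataset_criteria, step=10):
    """Share of the user's criteria found in a dataset, rounded to `step` percent."""
    selected = set(selected_criteria)
    if not selected:
        return 0
    matched = len(selected & set(dataset_criteria))
    exact = 100 * matched / len(selected)
    # Round to the nearest multiple of `step`; halves are rounded up (assumption)
    return step * math.floor(exact / step + 0.5)


user_criteria = ["passive house", "timber frame", "building supervisor"]
practice = ["passive house", "building supervisor", "renovation"]
print(criteria_match_percent(user_criteria, practice))  # 2 of 3 criteria -> 70
```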
  • the order of display in the hit list 400 is determined either by the number of times that one information dataset 170 has headed the hit list 400 or by the number of times which the information dataset 170 has appeared in the hit list 400 as will be explained below.
  • The user can select a member of the hit list 400 as required in order to obtain further details about that member. Selection of a member of the hit list 400 initiates a further message from the client computer 140 to the server computer 110 in order to extract the required information in step 670 from the database 160.
  • the ability to send an e-mail or a fax message to the member of the hit list 400 is an example of the further information that is required.
  • The user selects, by means of one or more of the selection boxes 420, those members of the hit list 400 to whom a fax or an e-mail is to be sent.
  • the user selects either the fax button or the e-mail button 410.
  • A message is sent to the server computer 110 to retrieve the appropriate data.
  • An e-mail client program such as Outlook or Lotus Notes or a fax program such as WinFax can be initiated at the client computer 140.
  • The appropriate data retrieved from the server computer 110 is supplied to the e-mail client program or fax program and the user can input the required communication.
  • the user can save the hit list 400 for later reference.
  • the hit list 400 can be either saved in a memory device on the client computer 140 or it can be saved for access from another client computer 140 in the reserved section 195 of the database 160. The user does this by selecting the members of the hit list 400 to be saved using the selection boxes 420 and then selecting the archiving button 430.
  • Each of the information datasets stored in the database has a structure that comprises at least an identification number 510, a designation 520, geographical co-ordinates 530, contact information 540, a counter 550, industry-specific information 560 and a language identifier 570.
  • the identification number 510 is a single unique identification number that refers to the information dataset and is used for accessing the information dataset.
  • the designation 520 is, for example, the title of the shop or business or the legal name of the company.
  • the geographical co-ordinates 530 locate the shop or business and can be used when the user wishes to locate an entry within an exact geographical location. For example, the user may be using a mobile phone and wish to locate all restaurants within a kilometre. In such a case, it is necessary for the database to be able to reference geographical co-ordinates.
  • a number of different co-ordinate systems are known. Examples include the GPS co-ordinate system or, in the UK and some other countries, grid references.
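As an illustration of the "within a kilometre" example, the sketch below filters businesses by great-circle distance. It assumes latitude and longitude in decimal degrees and uses the haversine formula, which is one common choice and is not prescribed by the application; the names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(user_position, businesses, radius_km=1.0):
    """Designations of the businesses no further than radius_km from the user."""
    lat, lon = user_position
    return [
        name
        for name, b_lat, b_lon in businesses
        if distance_km(lat, lon, b_lat, b_lon) <= radius_km
    ]


user = (48.1372, 11.5756)  # hypothetical position of a mobile user
restaurants = [
    ("Restaurant A", 48.1380, 11.5800),  # a few hundred metres away
    ("Restaurant B", 48.2000, 11.5756),  # several kilometres to the north
]
print(within_radius(user, restaurants))
```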
  • Contact information 540 includes but is not limited to the address, telephone number, fax number, e-mail address and web page. Either the contact information 540 or the geographical co-ordinates 530 can be used to correlate the user request in the location field 220 with the information datasets in the hit list 400.
  • The counter 550 indicates the number of times that the information dataset is accessed. This is useful when constructing the hit list 400, as the order of the entries in the hit list 400 can be correlated with the number of times that the information dataset has headed the hit list 400. For example, if more than one information dataset fulfils all of the criteria entered by the user in step, the order of display can be adjusted so that the information dataset with the smaller number of past hits or first-place positions in the hit list 400 is placed higher in the order of the hit list 400, increasing its "prominence" and its chance of being selected. Details of the algorithm used to determine the number of accesses are given below.
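One possible reading of this ordering rule is sketched below: hits are sorted by the percentage of criteria fulfilled, and ties are broken in favour of the dataset that has headed the hit list least often. The tuple layout and the function name are assumptions made for the sketch.

```python
def order_hit_list(hits):
    """Order hits by relevance, then by fewest past first-place positions.

    Each hit is a tuple (identification_number, percent_fulfilled,
    times_headed_hit_list); this layout is hypothetical.
    """
    return sorted(hits, key=lambda hit: (-hit[1], hit[2]))


hits = [
    (101, 100, 5),  # fulfils all criteria, headed the list 5 times before
    (102, 100, 1),  # fulfils all criteria, headed the list once before
    (103, 80, 0),   # fulfils fewer criteria
]
print(order_hit_list(hits))
```

Here dataset 102 is placed ahead of dataset 101 because both fulfil all of the criteria but 102 has been in first place less often.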
  • Each of the information datasets 170 can, of course, contain industry-specific information 560 which may be ordered in sub-fields or dependent tables. Returning to Fig. 3, it will be understood that any information datasets 170 detailing the services of architects will indicate their types of speciality and services available.
  • the information dataset 170 will also contain the language identifier 570 that indicates the language or the languages in which communication can be carried out.
  • the language identifier 570 is used in conjunction with the language field of Fig. 3 that is selected by the user in step.
  • the information datasets 170 may contain further information as and when required.
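The structure of an information dataset described with reference to Fig. 5 might be modelled as in the sketch below. The field names and types are assumptions chosen to mirror reference numerals 510 to 570; the application does not prescribe a concrete schema.

```python
from dataclasses import dataclass, field


@dataclass
class InformationDataset:
    identification_number: int        # 510: unique key used for access
    designation: str                  # 520: e.g. name of the business
    coordinates: tuple                # 530: e.g. (latitude, longitude)
    contact: dict                     # 540: address, telephone, e-mail, ...
    counter: dict = field(default_factory=dict)            # 550: position statistics
    industry_information: dict = field(default_factory=dict)  # 560
    languages: list = field(default_factory=list)          # 570: language identifiers


def communicates_in(dataset, language):
    """True if the business can communicate in the requested language."""
    return language in dataset.languages


example = InformationDataset(
    identification_number=1,
    designation="Example Architects",  # hypothetical entry
    coordinates=(48.1372, 11.5756),
    contact={"telephone": "000"},
    industry_information={"house types": ["passive house"]},
    languages=["de", "en"],
)
print(communicates_in(example, "en"))
```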
  • Since each of the information datasets 170 in this embodiment relates to products for sale or services offered, it is possible for authorised personnel to access the database 160 to update any information datasets 170 for which they are authorised. This is preferably done using password-protected access, in which an authorised person desiring access to the data inputs a password at the client computer 140 that is transmitted for authorisation by the server computer 110.
  • the subcategory database 190 will also contain pointers between the list of subcategories and the identification numbers 510 of the information datasets 170. Thus once the user has selected the relevant subcategories using the subcategory field 250, the list of information datasets 170 relevant to the subcategory can be obtained.
  • the algorithm used to optimise the access to the information datasets 170 will now be described.
  • This information is stored in the counter 550.
  • the counter 550 is divided into a table similar to that shown in Fig. 7.
  • This table shows the number of times that a particular information dataset occurs in a position in a hit list 400 in a time frame.
  • The horizontal axis gives the position whilst the vertical axis gives the time frame.
  • the information dataset 170 in the past week occurred twice at the top of the hit list 400 (i.e. position 1), once in position 2, 152 times in positions 101-150, 566 times in positions 151-200, 111 times in positions 500-999 and 135 times in positions 1000-1999.
  • The lower positions are grouped together since there is little relevant information to be gained from deciding whether an information dataset was in position 500 or position 750. It is likely in both cases that the user of the information management system 100 has not taken note of the information dataset 170. On the other hand, information about positions 1 and 2 is highly relevant and can be used to decide the location of the information dataset in future hit lists 400. By grouping the information together, valuable memory space can be saved.
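The grouping of positions can be sketched as follows. The bucket boundaries are hypothetical, since Fig. 7 is not reproduced here; they are chosen only so that positions 1 and 2 are counted individually and lower positions are grouped into ranges.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical lower bounds of the position groups
BUCKET_STARTS = [1, 2, 3, 11, 51, 101, 151, 201, 500, 1000]


def bucket_label(position):
    """Label of the position group that a hit-list position falls into."""
    if position < 1:
        raise ValueError("positions start at 1")
    index = bisect_right(BUCKET_STARTS, position) - 1
    start = BUCKET_STARTS[index]
    if index + 1 == len(BUCKET_STARTS):
        return f"{start}+"
    end = BUCKET_STARTS[index + 1] - 1
    return str(start) if start == end else f"{start}-{end}"


def record_positions(positions):
    """Count appearances per position group for one time frame."""
    return Counter(bucket_label(p) for p in positions)


week = record_positions([1, 1, 2, 120, 500, 750])
print(week)
```

Positions 500 and 750 fall into the same group, so only one counter is needed for both.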
  • the running totals for access in particular time frames are adjusted on a regular basis, e.g. daily.
  • the information in counter 550 (table of Fig. 7) can be also supplied to the advertiser to allow it to assess the success of its presence in the information management system.
  • an industry sector ranking database is provided in the database 160.
  • The industry sector ranking database comprises a three-dimensional array. Two dimensions of this array are shown in Fig. 8.
  • The number of the search query is on the x-axis.
  • The y-axis contains details of the date and time of the query as well as the type of query ("Protocol"), i.e. whether it comes from a mobile terminal or is a follow-up query. Up to 1000 subcategories per industry sector can typically be stored and the subcategories input can be recorded. On the z-axis, which is not shown, the type of industry sector (architect, lawyer, restaurant etc.) is recorded.
  • Fig. 9 shows a summary table that has the same y and z axes as Fig. 8. However, the queries are combined together for various days (D1, D2, ...), weeks (W1, W2, ...), months (M1, M2, ...) and years (Y1, Y2, ...).
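The combination of queries into days, weeks, months and years might be implemented along the lines of the sketch below. It assumes each stored query carries a timestamp and uses ISO week numbering; both the data layout and the week convention are assumptions, not details taken from the application.

```python
from collections import Counter
from datetime import datetime


def summarise_queries(query_times):
    """Count queries per day, ISO week, month and year."""
    summary = {"day": Counter(), "week": Counter(), "month": Counter(), "year": Counter()}
    for ts in query_times:
        iso_year, iso_week, _ = ts.isocalendar()
        summary["day"][ts.strftime("%Y-%m-%d")] += 1
        summary["week"][f"{iso_year}-W{iso_week:02d}"] += 1
        summary["month"][ts.strftime("%Y-%m")] += 1
        summary["year"][ts.strftime("%Y")] += 1
    return summary


# Hypothetical query timestamps for one industry sector
queries = [
    datetime(2003, 12, 22, 10, 0),
    datetime(2003, 12, 22, 15, 30),
    datetime(2003, 12, 29, 9, 0),
]
totals = summarise_queries(queries)
print(totals["day"], totals["month"])
```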
  • the storage of the queries allows marketing analysis of the queries to be carried out, e.g. by location, industry sector, search criteria and combinations thereof. Moreover, the analysis of development trends, e.g. fashion or music, can be reported.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to an information management system (100) having a managed database (160) comprising a plurality of information datasets (170) to which are assigned one or more categories. A method of retrieving one or more of the plurality of information datasets (170) is disclosed which comprises: (a) a first step of inputting a search term (615); (b) a second step of using the search term to retrieve from a subcategories database (190) a subcategory dataset comprising a set of subcategory values (634); (c) a third step of presenting said subcategory dataset (300) to the user (635); (d) a fourth step of selecting at least one of said subcategory dataset (640); (e) a fifth step of presenting subcategory-dependent criteria to the user (645); (f) a sixth step of selecting at least one subcategory-dependent criteria (645); (g) a seventh step of accessing one or more of the information datasets (170) using said subcategory dataset and said selected subcategory-dependent criteria (50); (h) an eighth step of presenting a summary (400) to the user of the one or more information datasets (170) accessed in the seventh step (660).

Description

INFORMATION MANAGEMENT SYSTEM
Field of the Invention
This application relates to an information management system, a method for the retrieval of information from an information management system and a dataset for use in an information management system.
Prior Art
Information management systems are known in the art. For example, in most countries of the world, the "Yellow Pages" telephone directory has supplied information about products and services that are available in a particular locality together with their contact information such as address and telephone number. With the introduction of the Internet, these services have been computerised. For example, the German Yellow Pages ("Gelbe Seiten") directory is available under the URL (Uniform Resource Locator) www.teleauskunft.de or www.gelbeseiten.de. The database allows limited search of features and functions offered by the businesses listed therein.
The Internet offers other sources of information about products and databases. A number of services or portals have been developed to aid the retrieval of data from the Internet. Examples of well-known ones include Yahoo, Google, Alta-Vista and Ask Jeeves. These information services generally offer services under which a user of the Internet can enter search terms in a query box and retrieve a list of websites that match the input search terms. For example, the user might wish to locate all Chinese restaurants in Munich. The user would then input in the query box the terms "Chinese+restaurants+muenchen" and would receive a list of entries in, for example, Yahoo. The list of entries includes a reference to the URL of the information dataset on the Internet that contains the information for which the user may be looking.
This example shows one of the limitations of the information management system. The term "muenchen" is the German name of the city and can be alternatively spelt "München". A foreign visitor might not necessarily know this and indeed might use either the English name Munich or type the letter u with umlaut as u without umlaut. Similarly the term "Restaurant" is used in the German language. However, synonyms such as "Gaststätte" or "Wirtshaus" are also used. A study carried out at the University of Applied Sciences in Cologne, Germany, shows, however, that only a relatively low number of the returned results are of use to the person seeking the information. The study is published by Dresel et al. "Evolution deutscher Web-Suchwerkzeuge", in nfd, vol 52, pp 381-392 (2001). It reports the conclusion that ballast in the return lists swamps the relevant information. The study suggests that on average only 18.6% of the results are of some use. This not only wastes the user's time but also consumes memory space in the computer and processing power.
The Yahoo Portal is an example of an edited information retrieval system and thus some of the problems associated with information gathering on the Internet are avoided as the entries are to a certain extent categorised and edited by editorial staff members. For example, the editors will recognise the different spelling of the city of Munich, Germany, and also the fact that there are towns in the United States which go by the same name and which may not be of interest.
Another example of an information retrieval system is the Google system. This system adopts a different approach than Yahoo and is only partially edited. Rather the Google system crawls the Internet attempting to catalogue all of the pages. Thus the user of the Google system will, after inputting the query, receive a more comprehensive list of entries. These will, however, include a large number of entries which are of no relevance and which can only be eliminated from the search by a brief preview of their contents by the user himself. This type of information retrieval system is also dependent on the information content of the information datasets placed on the net. The information content is thus dependent on the information dataset's author who has little or no idea in advance about how or why the information dataset will be accessed.
Information retrieval systems are also known in the patent literature. PCT application No. WO 01/50346 (Ko) relates to a system and method for graphically displaying the results of a search on the terminal of a user. In this system a graphical method for refining a search is illustrated. The user of the system first enters a keyword and a search of the Internet is carried out for this keyword and stored in a database. The user is then offered subcategories that he or she can select in order to refine the search until a limited set of choices can be displayed in a further window. The user can then select one of the members of this set of choices in order to obtain the information for which he or she is looking. Since the invention described in WO 01/50346 uses the information stored throughout the Internet, it fails to address the problem that the authors of the documents themselves determine the information content of the information datasets on the Internet. There is no attempt to categorise the documents and there is no help offered in the way of a thesaurus or lexicon to enable the user to make a more comprehensive search.
US Patent No. US-B-6 208 987 (Nihei) assigned to NEC Corporation is an example of a hierarchically structured indexed information retrieval system. The system described provides a series of hierarchically structured indices to allow the user to narrow down his or her search until a limited set of information datasets is obtained. Using this system, each of the documents is categorised into one of a number of subcategories to allow retrieval through the system.
In both the Nihei and the Hoshino patents, the user is offered no opportunity to expand his or her original search using synonyms of the original search term and thus make the search more comprehensive. There is no opportunity for the user to limit his or her request to information that is geographically based or which is supplied only in a particular language. The lack of geographically relevant information means that the user may be supplied with too many documents and again this consumes memory space and processing power as well as wasting the user's time. The inability to limit the language choices leads to documents being presented in a language that he or she may not be able to read. It might be possible to present documents in more than one language, but this would increase the amount of memory space required to present the results and increase the access time to the document.
A further hierarchically organised information retrieval system is described in the US-Patent No. US-B-6 094 652 (Faisal) assigned to the Oracle Corp. The information retrieval system provides a knowledge base that has a plurality of nodes of terminology arranged hierarchically which reflect the associations among the terminology. These associations include synonyms and/or categories and subcategories of information. A query submitted by the user is processed using the knowledge base in order to identify hierarchical query feedback terminology. The user is presented with the query feedback that he or she may utilise in order to reformulate the query. This patent fails to describe, however, the types of information that are supplied after processing of the query.
European Patent Application No. EP-A-1 160 686 (Riva, Frederico) describes an information retrieval system used for providing links to websites in response to a search term. The system provides for a thesaurus database in which synonyms of the search term as well as the category to which the search term belongs are provided. The information retrieval system described in this patent application suffers from the limitation that it is only able to identify the content of the documents on the website from the information provided by the author of the information datasets. The thesaurus database is furthermore not hierarchically oriented. Thus categories are not divided into subcategories and the user using the thesaurus database cannot refine the search to retrieve fewer documents. Thus this system consumes both memory and processing resources to obtain documents that may be of limited or no value.
Finally, information retrieval systems for supplying information on products held by a shop are known from US Patent Application Publication No. US-A-2002/0051020 (Ferrari et al.). The information retrieval system includes a knowledge base that has a hierarchical set of relationships among various terms. Users of the information retrieval system are able to search or browse a collection of documents relating to items offered for sale by selecting desired values of attributes associated with the documents. As the invention described only relates to products available in one shop, there is no need to provide geographical information. The information about the products is not available in a structured database and thus there are limitations with respect to combining several search terms. Furthermore, there is no indication that the system can be used in a multi-language environment.
Summary of the Invention
There therefore remains a need for a comprehensive information retrieval system.
There is also a need to provide an information retrieval system which takes into account different search criteria relevant to different industries.
There is furthermore a need for an information retrieval system that provides a multi-lingual function and geographically based information.
These and other objects of the invention are solved by a method of retrieving one or more information datasets from an information management system having a managed database. The managed database - also known as a curated database - has a plurality of information datasets. One or more categories are assigned to each one of the plurality of the information datasets. The method of retrieval has the following steps:
(a) a first step of inputting a search term;
(b) a second step of using the search term to retrieve from a subcategories database a subcategory dataset comprising a set of subcategory values;
(c) a third step of presenting said subcategory dataset to the user;
(d) a fourth step of selecting at least one of said subcategory dataset;
(e) a fifth step of presenting subcategory-dependent criteria to the user;
(f) a sixth step of selecting at least one subcategory-dependent criteria;
(g) a seventh step of accessing one or more of the information datasets using said subcategory dataset and said selected subcategory-dependent criteria; and
(h) an eighth step of presenting a summary to the user of the one or more information datasets accessed in the seventh step.
As the database is a managed database with a defined set of categories and subcategories, information in the information dataset has been checked either by the author of an information dataset or by an editor. In addition, a classifier may have classified the information dataset using the subcategories and subcategory-dependent criteria. The selection of the subcategory-dependent criteria ensures that the desired information is retrieved, which saves processing time and storage space. The user can select one or more of the subcategory-dependent criteria. The order of the members of the list of subcategory-dependent criteria is dependent on the frequency of selection, so that the most commonly selected criteria are placed near the top of the list. The use of the relationship database and the subcategory database allows the user of the information management system to more closely define the search criteria. Examples of the information datasets include advertisers promoting their goods and services.
In an advantageous embodiment a further step of retrieving from a relationship database a relationship dataset comprising a set of relationship terms of the search term and including the search term is used. The relationship dataset enables the retrieval of the subcategory dataset comprising a set of subcategory values. The relationship dataset can include synonyms, antonyms, and heteronyms of the search term, thus allowing a more comprehensive search.
In one embodiment of the method, a final step of presenting to the user at least part of the content of a selected one of the one or more information datasets is provided. The user can therefore select from the summary of the information datasets only those information datasets in which he or she is interested. Data transfer between the database and the user is thereby reduced. In one advantageous embodiment of the invention, the user can select the language in which the information management system operates and the information datasets are presented. Furthermore, the user can choose only those information datasets related to addresses of the advertisers in which he or she can expect a response in a desired language.
This is an advantage in doing business in a global economy since the user does not tie up resources in attempting to do business with an advertiser with which he or she cannot communicate.
In a further refinement, the relationship database includes explanations as to the relationship between the search term and the relationship term. With the aid of the explanation (e.g. synonym), the user can select only those relationships that directly relate to the information datasets in which he or she is interested. Furthermore, the user can be provided with the geographical coordinates of a business to allow the selection only of those businesses within a certain area. This is particularly advantageous when the information management system is used with a mobile device as only those information datasets are transmitted to the user that are in a nearby locality. Alternatively, the user can use the geographical coordinates to carry out localised marketing or other activities.
A further embodiment of the invention includes the possibility of selecting at least one of the information datasets for temporary storage in a further database. This allows the user to retrieve the information datasets at a later stage without the need for repeating the search and / or to ensure that the user is able to access the same information datasets. Alternatively, only the summary of the information datasets can be saved in order to reduce memory space and / or data traffic.
In one further embodiment of the invention, the order of the one or more information datasets in the summary is calculated using a frequency table. This has the advantage that the information datasets are distributed evenly in different summaries to ensure that certain information datasets do not continually lead the summary.
The invention also provides for an information management system comprising a managed database having a plurality of information datasets to which are assigned one or more subcategories, a relationship database having a plurality of relationship datasets, wherein the relationship datasets comprise a list of relationship terms which are synonyms of each other, and a subcategory database having a plurality of subcategory datasets, wherein the subcategory datasets comprise a first term and a list of second terms which are subcategories of the first term.
The information management system of the invention includes selection means and presentation means to select and present the information datasets retrieved.
In one advantageous embodiment of the invention, the information management system further comprises a plurality of managed databases. This reduces the load on the individual databases. For example, each one of the databases could include only those information datasets related to a particular country or group of countries. Since the majority of users will only search information datasets relating to one country, data traffic on a network connection can be reduced by providing the database locally.
Finally, the invention also provides for a dataset for storing in a memory of the information management system. The dataset includes an identification number, a designation and a geographical identification. The designation is a title which identifies the entry in the dataset and could be, for example, the name of a business or a list of the activities in which the business is involved. The geographical identification is, for example, the address and/or GPS coordinates. Advantageously, the dataset includes a language identifier to identify the language in which the business can communicate.
The dataset should also include contact information and can also be provided with a counter to indicate how often it has been included in a summary presented to the user and its position therein.
Finally, the invention provides for a computer program product that can be loaded into the computer for performing the method of the invention. The computer program product can be provided on a disk, on a CD-ROM, on an Internet site or in another form.
Description of the Figures
Fig. 1 is an outline of a computer system and network used in the invention.
Fig. 2 is an example of a first search screen used on starting the information management system.
Fig. 3 is an example of an industry appropriate form used to further select criteria for the search.
Fig. 4 is an example of part of the content of an information dataset displayed on the presentation device.
Fig. 5 is an example of a dataset stored in a database for use in the invention.
Fig. 6 is a flow diagram illustrating the steps of the invention.
Fig. 7 is an array to optimise access to information datasets.
Fig. 8 is a two dimensional representation of the industry sector ranking database.
Fig. 9 is a summary in two dimensions of the industry sector ranking database.
Detailed Description of the Invention
Fig. 1 shows an example of a computer system 100 used for implementing the information retrieval system of the invention. The computer system 100 comprises a server computer 110 connected through a network 130 to a client computer 140. The network 130 may be, for example, the Internet, a local area network, a wide area network or a wireless network. An information retrieval system may be running as a retrieval module 120 on the server computer 110. The server computer 110 further includes an interface module 125 for interfacing to the network 130. Such interface modules 125 include but are not limited to the Microsoft Internet Server, an Apache Server or a Topaz server. The client computer 140 will also be running an interface module 150 for accessing the network. Such interface modules 150 include but are not limited to a telephone or DSL modem, an Ethernet connection, a wireless connection and interactive television and will include a browser such as Microsoft Explorer or Netscape Navigator.
A database 160 is connected to the server computer 110. The database 160 contains one or more datasets 170, a relationship database 180, a subcategory database 190 and a reserved section 195. The database 160 can be implemented in any suitable hardware that is known to the skilled person and includes but is not limited to solid-state memory, tapes or magnetic and optical media.
The one or more datasets 170 are, in one embodiment of the invention, ordered items of data stored in a structured database such as an Oracle database, an IBM DB2 database, a Lotus Domino database or a Microsoft SQL database. The data stored in the one or more information datasets is managed or curated. That is to say that the data stored is edited and placed in appropriate fields of the database to allow accurate and reliable retrieval of the data. In one example of the invention, the data relates to advertisers of products or services. The management and curation of the one or more information datasets 170 is carried out either by an administrator, a data curator or the advertiser itself. The curation of the information datasets 170 includes a classification of the information dataset using subcategories stored in the subcategory database 190.
A display device such as a visual display unit and an input device such as a keyboard and/or a mouse are attached to the client computer 140. The client computer 140 could also be a mobile device such as a PDA or a mobile phone in which case the visual display unit and the input device are incorporated in the same unit.
The user of the client computer 140 who wishes to use the information management system of the invention initiates with the aid of the input device the information management system. One example of the initiation procedure is the selection of an icon displayed on the visual display unit. Alternatively, a command could be input at a command line or a special key pressed on the data input device.
Initiation of the information management system causes a display such as that shown in Fig. 2 to be displayed as is described in step 605 of Fig. 6. Note that Fig. 2 is only a representative example of a display layout 200 that can be called and that other display layouts fall within the scope of the invention. The layout display 200 comprises a number of fields. In a language field 210, the language of the information management system is selected.
In a location field 220, the country in which entries are to be searched is selected in step 610. In order to reduce the load on the server computer 110, a plurality of server computers 110 can be employed. Each one of the server computers 110 is dedicated to hold information datasets relating to one country or one group of countries and entry of the value into the location field 220 generates a message from the client computer 140 to select the appropriate one of the server computers 110. Since each of the server computers 110 is substantially identical, only one of the server computers 110 is shown in Fig. 1 for clarity.
In a search term input field 230, a first search term is input in step 615. In this example, the search term "Archi" is input, but this is not limiting of the invention. The search term could include a wild card at the beginning and/or end of the search term. The client computer 140 uses the value of the first search term to interrogate in step 620 the database 160 on the server computer 110 to produce a search dataset that includes a list of search terms matching the inputted first search term. The search dataset is returned to the client computer 140 to produce in a first display field 240 a list of search terms matching the inputted first search term. In the example shown only the beginning of a word has been input and, as a result, all of the first search terms that include this word part are displayed in step 625 in the first display field 240.
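The matching of a word part such as "Archi" against stored search terms can be sketched as follows. This is a minimal illustration, not the implementation of the application; the function name `match_search_terms`, the choice of `*` as the wild card character and the sample terms are assumptions.

```python
def match_search_terms(fragment, known_terms):
    """Return the known terms that match a partial search term.

    Assumption: a leading or trailing '*' acts as a wild card; a fragment
    without wild cards matches every term that contains the word part.
    """
    text = fragment.strip().lower()
    starts_open = text.startswith("*")
    ends_open = text.endswith("*")
    core = text.strip("*")
    if not core:
        return []
    matches = []
    for term in known_terms:
        candidate = term.lower()
        if starts_open and not ends_open:
            ok = candidate.endswith(core)      # "*tect": term must end with the part
        elif ends_open and not starts_open:
            ok = candidate.startswith(core)    # "Archi*": term must start with the part
        else:
            ok = core in candidate             # plain part or "*part*": anywhere
        if ok:
            matches.append(term)
    return sorted(matches)


# Hypothetical stored search terms.
terms = ["Architect", "Architecture", "Landscape architect", "Archive", "Lawyer"]
print(match_search_terms("Archi", terms))
```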
The user of the information management system can then select in step 630 the most appropriate one of the first search terms in the first display field 240 using the input device. The client computer 140 sends a message including the selected one of the first search terms to the server computer 110 and it is passed to the relationship database 180 in order to generate a relationship dataset that includes a list of relationship terms that are related to the first search term. The relationship terms include, for example, synonyms, antonyms and heteronyms to the first search term. The relationship dataset is displayed on the display unit in step 632 and the user can select those members of the relationship dataset that are relevant to his or her first search term in step 634.
Using the selected members of the relationship dataset, a message is sent to the server computer 110 to generate a subcategory dataset containing subcategories for the members of the relationship dataset selected. The members of the subcategories dataset allow the user to further refine the search for the selected ones of the relationship terms. The subcategory dataset is returned to the client computer 140 to produce in step 635 in a second display field 250 a list of subcategories for the selected one of the search terms.
The subcategories are generated by examining typical information datasets already present on the Internet and relating to the relevant industries. This can be substantially automated using a crawler engine to examine the contents of the information datasets and list the frequency of the technical terms which are used.
The user can then select in step 640 the most appropriate one of the subcategories. In one embodiment of the invention, a message is sent from the client computer 140 to the server computer 110 to retrieve an industry-appropriate input form from the database 160. An example of an industry-appropriate input form 300 is shown in Fig. 3. This is displayed on the visual display device in step 645.
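A relationship dataset with an explanation attached to each term might be represented as in the sketch below. The class `RelationshipTerm`, the dictionary `RELATIONSHIP_DB` and both functions are hypothetical names chosen for illustration; the sample synonyms are the German terms mentioned in the prior art discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipTerm:
    term: str
    relation: str  # explanation of the relationship, e.g. "synonym"


# Hypothetical contents of the relationship database, keyed by lower-case term.
RELATIONSHIP_DB = {
    "restaurant": [
        RelationshipTerm("Gaststätte", "synonym"),
        RelationshipTerm("Wirtshaus", "synonym"),
    ],
}


def relationship_dataset(search_term, db=RELATIONSHIP_DB):
    """Return the search term itself followed by its related terms."""
    related = db.get(search_term.lower(), [])
    return [RelationshipTerm(search_term, "search term"), *related]


def select_terms(dataset, wanted_relations):
    """Keep only the terms whose relationship the user has selected."""
    return [entry.term for entry in dataset if entry.relation in wanted_relations]


dataset = relationship_dataset("Restaurant")
print(select_terms(dataset, {"search term", "synonym"}))
```

Storing the relation alongside each term lets the user interface show why a term was suggested before the user decides whether to include it.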
The industry-appropriate input form 300 includes a number of fields that can be completed in step 650. Each of these fields represents one or more criteria that the user wishes to see fulfilled. For example, in the general field 310, the user can input the name of a company or part of an address. The criteria can be industry-specific. So, for example, on the form 300 of Fig. 3, which is appropriate to an architectural practice, the form 300 includes, but is not limited to, the types of houses which an architectural practice might typically design. The form 300 also includes details of services offered by the practice such as "building supervisor" etc. The criteria are stored with the relevant subcategory in the subcategory database 190.
When the user has entered the criteria, the search start button 320 is selected and a message sent from the client computer 140 to the server computer 110. The server computer 110 uses the criteria in order to select those information datasets from the database 160 which match the criteria to form a response dataset. The response dataset is transferred from the server computer 110 to the client computer 140 where it is displayed in step 660 as a hit list 400 similar to the one shown in Fig. 4.
The order of the items in the hit list 400 is determined firstly by evaluating the relevance of the item to the criteria entered by the user. So, for example, an information dataset 170 that fulfils all of the criteria would be indicated as an information dataset that fulfils 100% of the criteria and fills the first place. An information dataset 170 which does not fulfil all of the criteria but which nonetheless is of some interest to the user would be listed after the information datasets that fulfil 100% of the criteria. In this case, the number of criteria appearing in the information dataset 170 is expressed as a percentage of the total number of criteria selected by the user. In Fig. 4 this percentage is rounded to the nearest 10%, but could be rounded to other values. In the event that more than one information dataset 170 fulfils the criteria, the order of display in the hit list 400 is determined either by the number of times that one information dataset 170 has headed the hit list 400 or by the number of times which the information dataset 170 has appeared in the hit list 400 as will be explained below.
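The ordering just described can be sketched as follows. This is an illustration under assumptions, not the algorithm of the application: the dictionary keys `id`, `criteria` and `times_first`, the exclusion of datasets that match no criterion, and the final tie-break on the identification number are all hypothetical.

```python
def fulfilment_percentage(dataset_criteria, selected_criteria):
    """Share of the user's selected criteria found in the dataset, in percent."""
    selected = set(selected_criteria)
    if not selected:
        return 100.0
    return 100.0 * len(set(dataset_criteria) & selected) / len(selected)


def round_to_step(value, step=10):
    """Round to the nearest multiple of step (halves round up)."""
    return int((value + step / 2) // step) * step


def build_hit_list(datasets, selected_criteria):
    """Order datasets by rounded fulfilment, then by fewest first places."""
    scored = []
    for d in datasets:
        pct = round_to_step(fulfilment_percentage(d["criteria"], selected_criteria))
        if pct > 0:  # assumption: datasets matching nothing are left out
            scored.append((pct, d))
    # Higher percentage first; among equals, the dataset that has headed the
    # hit list less often comes first; the id is an assumed final tie-break.
    scored.sort(key=lambda item: (-item[0], item[1]["times_first"], item[1]["id"]))
    return [(d["id"], pct) for pct, d in scored]


# Hypothetical criteria and datasets for an architectural practice.
selected = {"detached house", "building supervision", "renovation"}
datasets = [
    {"id": 1, "criteria": {"detached house", "building supervision", "renovation"}, "times_first": 5},
    {"id": 2, "criteria": {"detached house", "building supervision", "renovation"}, "times_first": 2},
    {"id": 3, "criteria": {"detached house", "renovation"}, "times_first": 0},
    {"id": 4, "criteria": {"office building"}, "times_first": 0},
]
print(build_hit_list(datasets, selected))
```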
The user can select a member of the hit list 400 as required in order to obtain further details about the members of the hit list 400. Selection of a member of the hit list 400 initiates a further message from the client computer 140 to the server computer 110 in order to extract the required information in step 670 from the database 160.
The ability to send an e-mail or a fax message to the member of the hit list 400 is an example of the further information that is required. The user selects by means of one or more of selection boxes 420 those members of the hit list 400 to whom a fax or an e-mail is to be sent. The user then selects either the fax button or the e-mail button 410. A message is sent to the server computer 110 to retrieve the appropriate data. At the same time, an e-mail client program such as Outlook or Lotus Notes or a fax program such as WinFax can be initiated at the client computer 140. The appropriate data retrieved from the server computer 110 is supplied to the e-mail client program or fax program and the user can input the required communication.
In a further embodiment of the invention, the user can save the hit list 400 for later reference. The hit list 400 can be either saved in a memory device on the client computer 140 or it can be saved for access from another client computer 140 in the reserved section 195 of the database 160. The user does this by selecting the members of the hit list 400 to be saved using the selection boxes 420 and then selecting the archiving button 430.
One example of the information stored in the database 160 will now be described with reference to Fig. 5. Each of the information datasets stored in the database has a structure that comprises at least an identification number 510, a designation 520, geographical co-ordinates 530, contact information 540, a counter 550, industry-specific information 560 and a language identifier 570. The identification number 510 is a single unique identification number that refers to the information dataset and is used for accessing the information dataset. The designation 520 is, for example, the title of the shop or business or the legal name of the company.
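The dataset structure of Fig. 5 could be modelled roughly as in the sketch below. The field names, the Python types and the sample values are assumptions made for illustration; only the reference numerals in the comments come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class InformationDataset:
    identification_number: int                              # 510, unique key
    designation: str                                        # 520, e.g. business name
    coordinates: tuple                                      # 530, assumed (latitude, longitude)
    contact: dict                                           # 540, address, telephone, e-mail ...
    counter: dict = field(default_factory=dict)             # 550, hit list positions
    industry_specific: dict = field(default_factory=dict)   # 560
    languages: tuple = ()                                   # 570, language identifiers


# Hypothetical entry; none of these values appear in the application.
example = InformationDataset(
    identification_number=1,
    designation="Example Architects",
    coordinates=(48.137, 11.575),
    contact={"city": "Munich"},
    industry_specific={"services": ["building supervision"]},
    languages=("de", "en"),
)
print(example.designation, example.languages)
```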
The geographical co-ordinates 530 locate the shop or business and can be used when the user wishes to locate an entry within an exact geographical location. For example, the user may be using a mobile phone and wish to locate all restaurants within a kilometre. In such a case, it is necessary for the database to be able to reference geographical co-ordinates. A number of different co-ordinate systems are known. Examples include the GPS co-ordinate system or, in the UK and some other countries, grid references.
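A search for entries within a given distance of the user can be illustrated with a great-circle distance calculation. The haversine formula used here is a standard technique that the application does not name; the function names, the representation of positions as (latitude, longitude) pairs in degrees and the sample positions are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points using the haversine formula."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def within_radius(user_position, entries, radius_km=1.0):
    """Return the names of entries no further than radius_km from the user."""
    return [name for name, pos in entries if distance_km(user_position, pos) <= radius_km]


# Hypothetical positions in degrees.
user = (48.1370, 11.5750)
entries = [("Near", (48.1400, 11.5750)), ("Far", (48.2000, 11.5750))]
print(within_radius(user, entries))
```

Grid references or other co-ordinate systems would first need converting to a common representation before such a comparison.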
Contact information 540 includes but is not limited to the address, telephone number, fax number, e-mail address and web page. Either the contact information 540 or the geographical co-ordinates 530 can be used to correlate the user request in the location field 220 with the information datasets in the hit list 400.
Finally the counter 550 indicates the number of times that the information dataset is accessed. This is useful when constructing the hit list 400 as the order of the entries in the hit list 400 can be correlated with the number of times that the information dataset has headed the hit list 400. For example, if more than one information dataset fulfils all of the criteria entered by the user in step, the order of display can be adjusted such that the information dataset with fewer hits or first place positions in the hit list 400 in the past is placed higher in the order of the hit list 400, thus increasing its "prominence" and chance of being selected. Details of the algorithm used to determine the number of accesses are given below.
Each of the information datasets 170 can, of course, contain industry-specific information 560 which may be ordered in sub-fields or dependent tables. Returning to Fig. 3, it will be understood that any information datasets 170 detailing the services of architects will indicate their types of speciality and services available.
Finally the information dataset 170 will also contain the language identifier 570 that indicates the language or the languages in which communication can be carried out. The language identifier 570 is used in conjunction with the language field of Fig. 3 that is selected by the user in step.
Of course, the information datasets 170 may contain further information as and when required.
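The structure of Fig. 5 might be represented in software along the following lines. The sketch assumes Python data classes; the field names and types are hypothetical, and only the reference numerals in the comments are taken from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class InformationDataset:
    identification_number: int                      # 510: unique key used for access
    designation: str                                # 520: shop title or legal name
    geographical_coordinates: Tuple[float, float]   # 530: assumed (latitude, longitude)
    contact_information: Dict[str, str] = field(default_factory=dict)      # 540
    counter: Dict[str, Dict[str, int]] = field(default_factory=dict)       # 550
    industry_specific_information: Dict[str, object] = field(default_factory=dict)  # 560
    language_identifiers: List[str] = field(default_factory=list)          # 570
```

Further fields can be added to such a class as and when required, in line with the paragraph above.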
Since each of the information datasets 170 in this embodiment relates to products for sale or services offered, it is possible for authorised personnel to access the database 160 to update any information datasets 170 to which they are authorised. This is preferably done using password-protected access in which an authorised person desiring access to the data inputs a password at the client computer 140 that is transmitted for authorisation by the server computer 130.
The subcategory database 190 will also contain pointers between the list of subcategories and the identification numbers 510 of the information datasets 170. Thus once the user has selected the relevant subcategories using the subcategory field 250, the list of information datasets 170 relevant to the subcategory can be obtained.
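A minimal sketch of the pointer lookup is given below. It assumes that the pointers are held as a mapping from each subcategory name to a set of identification numbers 510. The description does not state whether several selected subcategories are combined by intersection or by union, so both are offered through the hypothetical `match_all` parameter.

```python
def datasets_for_subcategories(pointers, selected, match_all=True):
    """Return the identification numbers reachable from the selected subcategories.

    pointers  -- mapping: subcategory name -> set of identification numbers
    selected  -- subcategory names chosen by the user
    match_all -- True: dataset must belong to every selected subcategory;
                 False: membership of any selected subcategory suffices
    """
    id_sets = [pointers.get(name, set()) for name in selected]
    if not id_sets:
        return set()
    if match_all:
        return set.intersection(*id_sets)
    return set.union(*id_sets)
```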
The algorithm used to optimise the access to the information datasets 170 will now be described. The access information is stored in the counter 550, which is organised as a table similar to that shown in Fig. 7. This table shows the number of times that a particular information dataset occurs in a given position in a hit list 400 within a given time frame. The horizontal axis gives the position whilst the vertical axis gives the time frame. In the example shown, the information dataset 170 in the past week occurred twice at the top of the hit list 400 (i.e. position 1), once in position 2, 152 times in positions 101-150, 566 times in positions 151-200, 111 times in positions 500-999 and 135 times in positions 1000-1999. The lower positions are grouped together since there is little relevant information to be gained from deciding whether an information dataset was in position 500 or position 750. It is likely in both cases that the user of the information management system 100 has not taken note of the information dataset 170. On the other hand, information about positions 1 and 2 is highly relevant and can be used to decide the location of the information dataset in future hit lists 400. By grouping the information together, valuable memory space can be saved.
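The grouping of positions can be illustrated by the following sketch of a counter table. The groups 1, 2, 101-150, 151-200, 500-999 and 1000-1999 follow the example above; the remaining bucket boundaries are assumptions, since the full layout of Fig. 7 is not reproduced in the text. The names `BUCKET_UPPER_BOUNDS`, `bucket_label` and `PositionCounter` are hypothetical.

```python
import bisect
from collections import defaultdict

# Upper bound of each position bucket. Boundaries other than those named in
# the description (1, 2, 150, 200, 999, 1999) are assumed for illustration.
BUCKET_UPPER_BOUNDS = [1, 2, 3, 4, 5, 10, 20, 50, 100, 150, 200, 499, 999, 1999]


def bucket_label(position):
    """Map a hit-list position (1-based) to the label of its bucket."""
    if position < 1:
        raise ValueError("positions start at 1")
    i = bisect.bisect_left(BUCKET_UPPER_BOUNDS, position)
    if i == len(BUCKET_UPPER_BOUNDS):
        return f"{BUCKET_UPPER_BOUNDS[-1] + 1}+"
    upper = BUCKET_UPPER_BOUNDS[i]
    lower = BUCKET_UPPER_BOUNDS[i - 1] + 1 if i > 0 else 1
    return str(upper) if lower == upper else f"{lower}-{upper}"


class PositionCounter:
    """Counts, per time frame, how often a dataset appeared in each bucket."""

    def __init__(self):
        self.table = defaultdict(lambda: defaultdict(int))

    def record(self, time_frame, position):
        self.table[time_frame][bucket_label(position)] += 1
```

Because positions 500 and 750 fall into the same bucket, a single integer is stored for the whole range, which is the memory saving the description refers to.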
The running totals for access in particular time frames are adjusted on a regular basis, e.g. daily.
The information in counter 550 (table of Fig. 7) can also be supplied to the advertiser to allow it to assess the success of its presence in the information management system.
In a further embodiment of the invention, an industry sector ranking database is provided in the database 160. The industry sector ranking database comprises a three-dimensional array. Two dimensions of this array are shown in Fig. 8. The number of the search query is on the x-axis. The y-axis contains details of the date and time of the query as well as the type of query ("Protocol"), i.e. whether it comes from a mobile terminal or is a follow-up query. Up to 1000 subcategories per industry sector can typically be stored and the subcategories input can be recorded. On the z-axis (not shown), the type of industry sector (architect, lawyer, restaurant etc.) is recorded.
As the sector ranking database can become too big for storage in memory, it is regularly combined into a table shown in Fig. 9. This table has the same y and z axes as Fig. 8. However, the queries are combined together for various days (D1, D2, etc.), weeks (W1, W2, ...), months (M1, M2, ...) and years (Y1, Y2, ...).
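One possible way of combining individual queries into day, week, month and year totals is sketched below. It assumes that each stored query is reduced to a pair of timestamp and industry sector, and that weeks are counted as ISO calendar weeks; neither assumption is stated in the description, and the name `roll_up` is hypothetical.

```python
from collections import Counter
from datetime import datetime


def roll_up(queries):
    """Combine (timestamp, sector) query records into per-period totals."""
    totals = {"day": Counter(), "week": Counter(), "month": Counter(), "year": Counter()}
    for timestamp, sector in queries:
        iso_year, iso_week, _ = timestamp.isocalendar()
        totals["day"][(sector, timestamp.strftime("%Y-%m-%d"))] += 1
        totals["week"][(sector, f"{iso_year}-W{iso_week:02d}")] += 1
        totals["month"][(sector, timestamp.strftime("%Y-%m"))] += 1
        totals["year"][(sector, str(timestamp.year))] += 1
    return totals
```

Once the totals have been computed, the individual query records for a closed period can be discarded, which keeps the size of the sector ranking database bounded.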
The storage of the queries allows marketing analysis of the queries to be carried out, e.g. by location, industry sector, search criteria and combinations thereof. Moreover, the analysis of development trends, e.g. fashion or music, can be reported.

REFERENCE NUMBERS

[Table of reference numbers: provided as images in the original publication, not reproduced here]

Claims

1. In an information management system (100) having a managed database (160) comprising a plurality of information datasets (170) to which is assigned one or more categories, a method of retrieving one or more of the plurality of information datasets (170) comprising:
(a) a first step of inputting a search term (615);
(b) a second step of using the search term to retrieve from a subcategories database (190) a subcategory dataset comprising a set of subcategory values (634);
(c) a third step of presenting said subcategory database (300) to the user (635);
(d) a fourth step of selecting at least one of said subcategory dataset (640);
(e) a fifth step of presenting subcategory-dependent criteria to the user (645);
(f) a sixth step of selecting at least one subcategory-dependent criteria (650);
(g) a seventh step of accessing one or more of the information datasets (170) using said subcategory dataset and said selected subcategory-dependent criteria (650); and
(h) an eighth step of presenting a summary (400) to the user of the one or more information datasets (170) accessed in the seventh step (660).
2. The method of claim 1 further comprising a step of retrieving from a relationship database (180) a relationship dataset, the relationship dataset comprising a set of relationship terms of the search term and including the search term, and using the relationship dataset to retrieve from the subcategories database in the second step a subcategory dataset comprising a set of subcategory values.
3. The method of claim 1 or 2, further comprising, after accessing one or more of the information datasets (170), a step of selecting at least a further one of the subcategory-dependent criteria (650).
4. The method of any of the above claims, further including, after accessing one or more of the information datasets (170), a step of replacing at least one of the subcategory-dependent criteria (650) with a replacement one of the subcategory-dependent criteria (650).
5. The method of any one of the above claims further including a final step (660) of presenting to the user at least part of the content of a selected one of the one or more information datasets (170).
6. The method of any one of the above claims further comprising a step of selecting the language in which the subcategory dataset and the one or more information datasets (170) are to be presented.
7. The method of any one of the above claims wherein the second step of retrieving from the relationship database (180) a relationship dataset further comprises a step of retrieving from the relationship database (180) one or more explanations from the relationship database (180).
8. The method of one of the above claims further comprising a step of providing to the user geographical coordinates (530).
9. The method of one of the above claims further comprising a step of selecting from the summary one of the one or more information datasets (170) and displaying at least part of the information to the user.
10. The method of one of the above claims further comprising selecting at least one of the one or more information datasets (170) for storage in a reserved section (195).
11. The method of one of the above claims further comprising a step of saving the summary (400) for later reference.
12. The method of one of the above claims wherein the order of presentation of the one or more information datasets (170) in the summary (400) is calculated using a frequency table (550).
13. The method of claim 12 wherein the frequency table (550) records the number of times for which the information dataset (170) is accessed.
14. The method of any one of the above claims wherein the order of presentation of the one or more information datasets (170) in the summary (400) is dependent on the number of selected sub-category-dependent criteria (650).
15. The method of any one of the above claims wherein the order of presentation of the sub-category-dependent criteria (645) is a function of a frequency of selection of the sub-category-dependent criteria (650).
16. The method of any one of the above claims wherein prior to accessing the one or more of the information datasets (170), a summary of the selected sub-category-dependent criteria (650) is presented.
17. The method of any one of the above claims wherein the inputted search terms and selected sub-category-dependent criteria are stored in a database (160).
18. The method of one of the above claims, further comprising a step of generating a communication to one or more members of the hit list (400).
19. The method of one of the above claims, further comprising a step of generating a telephone connection between the user and one of the members of the hit list (400).
20. The method of claim 12 or 13, further comprising a step of generating marketing-related information using the frequency table (550).
21. An information management system (100) comprising:
- a managed database (160) having a plurality of information datasets (170) assigned one or more subcategories; and
- a subcategory database (190) having a plurality of subcategory datasets, wherein the subcategory datasets comprise a first term and a list of second terms which are sub-categories of the first term, and the plurality of information datasets (170) include one or more subcategories.
22. The information management system (100) according to claim 21 further comprising a relationship database (180) having a plurality of relationship datasets, wherein the relationship datasets comprise a list of relationship terms which are related to each other.
23. The information management system (100) of one of claims 21 or 22 further comprising presentation means for presenting one or more of a plurality of search screens to the user.
24. The information management system (100) of one of claims 21 to 23 further comprising presentation means for presenting one or more members of one of the subcategory datasets to the user.
25. The information management system (100) of one of claims 21 to 24 further comprising selection means for selecting one of the members of the one or more members of one of the subcategory datasets.
26. The information management system (100) of one of claims 21 to 25 further comprising a language selector for selecting a language.
27. The information management system (100) of claim 26 wherein the language selector accesses a language subcategory of the information dataset (170).
28. The information management system (100) of one of claims 21 to 27 wherein the relationship database (180) further contains explanations attached to at least one of the relationship datasets.
29. The information management system (100) of one of claims 21 to 28 wherein each of the plurality of information datasets (170) includes a document ID (510), a designation (520), a geographical identification (530) and a category.
30. The information management system (100) of claim 29 wherein the category includes an indication of the product offered, a service supplied and/or an industry category.
31. The information management system (100) of one of claims 21 to 30 wherein each of the plurality of information datasets (170) further includes a language identifier.
32. The information management system (100) of one of claims 21 to 31 wherein each of the plurality of the information datasets (170) further includes contact information.
33. The information management system (100) of one of claims 21 to 32 wherein the system (100) further comprises a plurality of managed databases (130).
34. The information management system (100) of one of claims 21 to 33 wherein the system (100) further comprises a reserved section (195) in the database (130) for storing the results of a search.
35. The information management system (100) of one of claims 21 to 34 further comprising an industry sector ranking database for recording the frequency of use of subcategories.
36. A dataset (500) for storing in a memory of the information management system (100) of one of claims 21 to 35 comprising:
- an identification number (510)
- a designation (520)
- a geographical identification (530).
37. The dataset of claim 36 further comprising a language identifier.
38. The dataset of one of claims 36 or 37 further comprising contact information (540).
39. The dataset of one of claims 36 to 38 further comprising a counter (550).
40. The dataset of one of claims 36 to 39 wherein the geographical identification (530) further comprises geographical coordinates information usable with a location identifier of the user.
41. Computer program product stored on a computer usable medium comprising software code portions for performing the method of one of claims 1 to 20 when said product is run on a computer.
PCT/EP2003/014897 2002-12-30 2003-12-24 Information management system WO2004059526A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2003298246A AU2003298246A1 (en) 2002-12-30 2003-12-24 Information management system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP02029064.9 2002-12-30
EP02029064 2002-12-30

Publications (2)

Publication Number Publication Date
WO2004059526A2 true WO2004059526A2 (en) 2004-07-15
WO2004059526A3 WO2004059526A3 (en) 2004-09-23

Family

ID=32668750

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2003/014897 WO2004059526A2 (en) 2002-12-30 2003-12-24 Information management system

Country Status (2)

Country Link
AU (1) AU2003298246A1 (en)
WO (1) WO2004059526A2 (en)

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5768581A (en) * 1996-05-07 1998-06-16 Cochran; Nancy Pauline Apparatus and method for selecting records from a computer database by repeatedly displaying search terms from multiple list identifiers before either a list identifier or a search term is selected
WO2001022251A2 (en) * 1999-09-24 2001-03-29 Wordmap Limited Apparatus for and method of searching
US6760720B1 (en) * 2000-02-25 2004-07-06 Pedestrian Concepts, Inc. Search-on-the-fly/sort-on-the-fly search engine for searching databases
AU2000233946A1 (en) * 2000-03-03 2001-09-17 Robert Fish Improved parameter-value databases
US20030110055A1 (en) * 2000-04-10 2003-06-12 Chau Bang Thinh Electronic catalogue
AU2001273111A1 (en) * 2000-06-30 2002-01-14 Anthony Romito Method and apparatus for a GIS based search engine utilizing real time advertising
NO314059B1 (en) * 2000-11-10 2003-01-20 Imp Technology As Procedure for structuring and searching information

Also Published As

Publication number Publication date
WO2004059526A3 (en) 2004-09-23
AU2003298246A1 (en) 2004-07-22

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP
