WO2000067161A2 - Method and apparatus for categorizing and retrieving network pages and sites - Google Patents
- Publication number
- WO2000067161A2 (PCT/US2000/012376)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- categories
- page
- pages
- search
- subject matter
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/954—Navigation, e.g. using categorised browsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2216/00—Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
- G06F2216/01—Automatic library building
Definitions
- The present invention relates generally to methods and apparatus for categorizing and searching for information on a network and, more specifically, to categorizing and searching Web pages on the Internet.
- The Internet contains over one billion Web pages. It has been estimated that two million Web pages are added to the Internet each day (The Industry Standard, February 28, 2000). This vast amount of information is a tremendous resource for the public to use. However, there is no effective way for a user to obtain relevant information. Although 85 percent of users use search engines to find information on the Internet, "a mind-boggling 92 percent of searches fail to find relevant information or to arrange the results in a meaningful order." (The Industry Standard, April 17, 2000, referring to a Forrester Research review of Web sites.) There are two fundamental problems. First, there is no standardized international categorization system or catalog of the information contained on the Internet. A group of librarians and others have been working on a cataloging system for the Internet for the last few years.
- This work is referred to as the Dublin Core Metadata Element Set.
- This system suffers from a number of problems: it requires a high degree of cataloging knowledge, and it is time-consuming and very expensive.
- As a result, the system is unworkable.
- Directories or indices are human-compiled databases of Web sites or pages. Most directories use editors to review and categorize Web sites. Some use contributions by their visitors. A user searches a directory by reviewing lists of categories and subcategories, or by typing in keywords. The result is a list of documents that the user can access by links. Directories are helpful to familiarize a user with the scope of a subject, but are not very useful in finding specific information. Also, directories can be slow, and the results may be haphazard. Another major problem is that directories review and categorize only a small percentage of pages and sites. Examples of directories commonly used are Yahoo! and LookSmart.
- Search engines are huge databases that automatically index large portions of the Internet and continually update that index.
- Search engines typically include a Web crawler or spider (also called a worm, robot, or bot) that automatically crawls through the Internet on hyperlinks indexing Web pages, a database which is the index compiled by the crawler, and a search tool which the user can use to search the database.
- The databases of the existing search engines differ in how they are created.
- Some Web crawlers index each word in a document, some index only keywords, including META tags, and some index other parts of a Web page, such as title, headings, etc.
- Most search engines require a search to be conducted by typing in keywords.
- A search query may be formulated using Boolean logic, where keywords are used with various terms, or in natural language, where keywords are used in the form of a question.
- Although natural language searches may be easier for a user to formulate, both types of formulations rely on keywords.
- Search engines use mathematical algorithms to weigh or rank the results, with the most relevant items listed first. These rankings may be based on the number of times a keyword is used on a page or the location of the keyword on the page. Some search engines also allow the user to organize or group the results by category, date, or other variable, such as the folders used by Northern Light, U.S. Patent No. 5,924,090 to Krellenstein.
- Another search engine known as the Clever Project, by IBM, analyzes hyperlinks between pages, in addition to text and citations, in order to develop algorithms that are intended to increase the relevancy of search results. This method is a marginal improvement over other search engines, but has its own set of problems.
- Search engines do not index the entire Internet. Most have indexed about one-third of the available or publicly indexable Web pages (i.e., excluding Web pages with authorization requirements). Examples of search engines are: Inktomi (the largest, with about 500 million Web pages indexed as of April 11, 2000); FAST (with about 340 million Web pages indexed); AltaVista, Northern Light, and Excite. A greater portion of the Internet can be searched using a meta-search. This technology allows the user to search several search engines at the same time and presents all the results in a single list, but exacerbates the problems inherent in existing search engines.
- Because they contain such huge databases, existing search engines often produce search results too voluminous for the user to review. Also, the search results typically contain a vast amount of irrelevant or unrelated items. As stated above, it has been found that 92 percent of searches did not yield relevant information or did not organize the results in a usable fashion (The Industry Standard, April 17, 2000). Another problem is that search engines are more likely to index pages with more links, pages with commercial information, and pages in the United States, rather than lesser known, educational, or non-United States pages.
- The method and apparatus for categorizing and retrieving network pages and sites of the present invention are adapted to overcome the above-noted shortcomings and to fulfill the stated needs.
- The first embodiment of the invention is a method and apparatus for categorizing a network page.
- The method comprises the steps of providing a list of categories and assigning a page to one or more of a plurality of the categories.
- The apparatus includes means for providing a list of categories and means for assigning a page to one or more of a plurality of categories.
- The second embodiment of the invention is a method and apparatus for categorizing pages on a network. The method comprises the steps of determining whether a page is involved in transacting business or providing information, determining whether a page has information relating to one or more of a plurality of subject matter categories, and determining the type of files associated with a page.
- The apparatus includes means for determining whether a page is involved in transacting business or providing information, means for determining whether a page has information relating to one or more of a plurality of subject matter categories, and means for determining the type of files associated with a page.
- The third embodiment of the invention is a method and apparatus for searching for and locating information on a network. The method comprises the steps of providing the opportunity to limit the search to categories for pages involved in transacting business, pages involved in providing information, and pages involved in both transacting business and providing information; providing an opportunity to limit the search to one or more of a plurality of subject matter categories; providing an opportunity to limit the search to one or more of a plurality of file-type categories; and providing an opportunity to limit the search by keyword.
- The apparatus comprises means for providing an opportunity to limit the search to one or more of a plurality of categories, means for providing an opportunity to limit the search by keyword, means for identifying pages categorized into all the categories to which the search was limited, means for determining which of the identified pages contain the keyword to which the search was limited, and means for reporting to a user all said identified pages and keyword-containing pages. It is an object of the invention to provide a method and apparatus for categorizing a page on a network, during or after the time that the page is created, according to whether the page is involved in transacting business or providing information.
- Figure 1 is a representation of the preferred graphical user interface showing the three tiers and the categories within those tiers.
- Figure 2 is a chart of the Government, Medical, News, and History categories of the second tier showing examples of topics contained within those categories.
- Figure 3 is a chart of the Education & Social Sciences, Science & Technology, Sports & Adventure, and Arts & Humanities categories of the second tier showing examples of topics contained within those categories.
- Figure 4 is a chart of the Finance & Business, Reference, Explicit, and Other categories of the second tier showing examples of topics contained within those categories.
- Figure 5 is a Venn diagram showing the intersection of the domains corresponding to the categories of Commerce and Information.
- Figure 6 is a Venn diagram showing the intersection of the domains corresponding to the categories of Information and Medical.
- Figure 7 is a Venn diagram showing the intersection of the domains corresponding to the categories of Information, Medical, and History.
- Figure 8 is a Venn diagram showing the intersection of the domains corresponding to the categories of Information, Medical, History, and Visual.
- Figure 9 is a diagram showing an example of the relationship between the subcategory created by selecting a combination of the categories and the keyword search.
- The invention includes methods and apparatus for categorizing a page as it is being created or as it exists on a network, and for searching a network.
- Networks include the Internet and private corporate networks, such as intranets and local area networks. Pages on the Internet are identifiable by unique addresses and include both Web sites and Web pages. As shown in Figure 1, the invention utilizes a graphical user interface (GUI), WorldNet 10.
- First tier 12 is a division into one or both of two major categories: pages that are involved in transacting business and pages that are involved in providing information.
- The first category 18 is designated "Commerce."
- The second category 20 is designated "Information."
- Web pages involved in transacting business include e-commerce pages, which provide users with the ability to conduct online purchases, sales, leases, or other financial transactions, pages that may be involved in transacting business, but do not enable the user to conduct the transaction on-line, and other pages that contain commercial information.
- Web pages involved in providing information include pages that contain articles, journals, publications, or other non-commercial materials.
- Second tier 14 is a division into one or more categories based on the subject matter the Web page contains. Many different categories can be used and many different terms may be used to identify a given category.
- The preferred embodiment of the invention includes twelve categories encompassing like subjects that have been carefully selected to allow users to locate and access information in an efficient manner: Government 22, Medical 24, Education & Social Science 26, News 28, Sports & Adventure 30, History 32, Science & Technology 34, Arts & Humanities 36, Finance & Business 38, Reference 40, Explicit 42, and Other 44.
- Each of these categories includes many topics. Figures 2, 3, and 4 list examples of the topics included in each category. For example, category 22, Government, includes the following topics: federal/state/local government, law, military, nations, politics, and taxes.
- Category 42, Explicit, includes pornography and sexually-explicit material.
- Category 44, Other, is for subjects that do not fit into any of the other categories of second tier 14.
- Third tier 16 is a division into one or more categories according to the type of files associated with a Web page.
- There are several different types of files, including text, graphics, audio, video, multimedia, and files for communications between persons.
- Most search engines can recognize the type of files associated with a Web page by scanning the files and identifying the file extensions (for example, .gif, .au, .wav).
- The preferred embodiment of the invention includes the following five file-type categories: Visual 46, Audio 48, Multimedia 50, Text-only 52, and Communication 54.
- Category 46, Visual, includes files containing pictures, charts, graphs, and diagrams.
- Category 48, Audio, includes files containing sound, such as music, voice, and sound effects.
- Category 50, Multimedia, includes files containing video, film clips, and virtual reality.
- Category 52, Text-only, includes files that do not contain any visual, audio, or multimedia material.
- Category 54, Communication, includes files containing e-mail, telnet links, ICQ, and other messaging systems.
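The three tiers and their categories can be written down as a small lookup table. The following is only an illustrative sketch: the category names and reference numerals come from the description above, while the `Tier` enum, the `CATEGORIES` mapping, and the `tier_of` helper are hypothetical names introduced for this example.

```python
from enum import Enum


class Tier(Enum):
    """The three tiers, keyed by their reference numerals in Figure 1."""
    FIRST = 12   # Commerce / Information
    SECOND = 14  # subject matter
    THIRD = 16   # file type


# Category name -> reference numeral, grouped by tier, as listed in the text.
CATEGORIES = {
    Tier.FIRST: {"Commerce": 18, "Information": 20},
    Tier.SECOND: {
        "Government": 22, "Medical": 24, "Education & Social Science": 26,
        "News": 28, "Sports & Adventure": 30, "History": 32,
        "Science & Technology": 34, "Arts & Humanities": 36,
        "Finance & Business": 38, "Reference": 40, "Explicit": 42,
        "Other": 44,
    },
    Tier.THIRD: {
        "Visual": 46, "Audio": 48, "Multimedia": 50, "Text-only": 52,
        "Communication": 54,
    },
}


def tier_of(category: str) -> Tier:
    """Return the tier a category name belongs to (hypothetical helper)."""
    for tier, members in CATEGORIES.items():
        if category in members:
            return tier
    raise KeyError(category)
```

Keeping the numerals as values makes it easy to cross-check a category against the figures.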
- The first embodiment of the invention is a method and apparatus for categorizing a page on a network, as the page is being created or during editing at a later time.
- The method includes the steps of providing the creator with a list of categories and allowing the creator to assign the page to one or more of the categories.
- The preferred categories are the categories of the three tiers 12, 14, and 16, as shown in Figure 1.
- The list of categories includes a different indicium to indicate each category.
- The indicium is preferably a universal symbol or icon that is not associated with any one language.
- The indicia preferably used are shown in Figure 1.
- The creator of a Web page may assign the Web page to any number or combination of the categories of three tiers 12, 14, and 16, depending on which categories best characterize the Web page.
- The steps of assigning a page to categories may be performed in several different ways known to those skilled in the art.
- The creator may also decide not to assign the page to any of the categories of a particular tier.
- The outcome of the categorization method is that a page is designated to be "in" or "within" the categories that best characterize the page.
- First tier 12 includes two categories: Commerce 18 and Information 20, as shown in Figure 1.
- The creator may assign the page to either one of the two categories of Commerce 18 or Information 20. If the page is involved in both transacting business and providing information, the creator may assign it to both.
- Second tier 14 includes twelve subject matter categories: Government 22, Medical 24, Education & Social Science 26, News 28, Sports & Adventure 30, History 32, Science & Technology 34, Arts & Humanities 36, Finance & Business 38, Reference 40, Explicit 42, and Other 44, as shown in Figure 1.
- Third tier 16 includes five file-type categories: Visual 46, Audio 48, Multimedia 50, Text-only 52, and Communication 54, as shown in Figure 1.
- The creator may assign the page to one or more of the five file-type categories.
- The creator may mark or tag the page as belonging in or within the assigned categories by associating, with the page, the corresponding indicium for each assigned category.
- The creator may communicate the categories to which the page is assigned to one or more search engines for the purpose of allowing such search engines to locate the page, by its assigned categories, in conducting a search.
- The creator may change the categories during editing at a later point in time as frequently as desired.
- A risk with any system whereby the creators of pages are permitted to categorize their own pages is that the creator will assign more categories to the page than are justified in order to increase the number of visitors to the page.
- The invention addresses this problem by including a method for verifying the accuracy of categorization of a network page.
- The method includes the step of scanning Web pages categorized into one or more categories, which step can be performed by a Web crawler. Pages assigned to a larger number of categories are scanned more frequently.
- The crawler will determine whether the page was categorized automatically, for example, by a Web crawler. If the Web page was not categorized automatically, the Web crawler further determines whether the page was properly assigned to each such category.
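The verification pass described above can be sketched as a simple scheduling rule. This is a minimal illustration under stated assumptions, not the patented mechanism: the text only says that pages with more categories are scanned more frequently, so the inverse-proportional interval, the 30-day base, and the names `CategorizedPage`, `scan_interval_days`, and `pages_to_verify` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CategorizedPage:
    url: str
    categories: frozenset = frozenset()
    auto_categorized: bool = False  # True if a crawler assigned the categories


def scan_interval_days(page: CategorizedPage, base_days: float = 30.0) -> float:
    """Assumed rule: the more categories a page claims, the shorter the interval."""
    return base_days / max(1, len(page.categories))


def pages_to_verify(pages):
    """Return creator-categorized pages, most frequently scanned first.

    Automatically categorized pages are skipped, since the text only calls
    for checking assignments that were not made automatically.
    """
    manual = [p for p in pages if not p.auto_categorized]
    return sorted(manual, key=scan_interval_days)
```

How the crawler then decides whether each assignment was proper is left open by the text; it could, for instance, rerun the automatic categorization of the second embodiment and compare results.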
- The apparatus for categorizing a page includes means or mechanisms for providing a list of categories with corresponding indicia, and means for assigning the page to one or more of a plurality of categories.
- The preferred categories are the categories of the three tiers 12, 14, and 16, as shown in Figure 1.
- The second embodiment of the invention is a method and apparatus for categorizing pages on a network.
- This method may be performed by a Web crawler.
- The method includes the steps of determining whether a page is involved in transacting business or providing information; assigning a business-transacting page to one category, an information-providing page to a second category, and a page that is involved in both transacting business and providing information to both the first and second categories; determining whether a page has information relating to one or more subject matter categories; assigning a page to one or more subject matter categories; determining the types of files associated with a page; and assigning a page to one or more file-type categories.
- The method further includes the step of assigning a page that has been assigned to any two or more categories, to a subcategory that consists of only pages assigned to the identical two or more categories.
- The outcome of the method is that a page is determined to be "in" or "within" the categories that best characterize the page.
- The step of determining whether a page is involved in transacting business may be performed by determining whether the page includes encryption software. If the page includes encryption software, it will be determined to be involved in transacting business. Additionally, or alternatively, the step may be performed by determining whether the page has the capability of permitting a user to conduct a financial transaction through the page. If so, the page will be determined to be involved in transacting business (i.e., a business-transacting page). A page involved in providing information will be determined to be an information-providing page.
- The step of assigning business-transacting pages to one category (preferably designated Commerce 18), pages involved in providing information to a second category (preferably designated Information 20), and pages that are involved in both transacting business and providing information to both categories is preferably performed by assigning business-transacting pages to a first list (containing only business-transacting pages), assigning pages involved in providing information to a second list (containing only information-providing pages), and assigning pages that are involved in both transacting business and providing information to both the first and second lists.
- The lists are preferably databases.
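The first-tier determination and list assignment might look like the following sketch. It assumes the three signals (encryption present, transaction capability, informational content) have already been detected and arrive as booleans; how a crawler detects them is not specified here. The function names and the in-memory sets standing in for the databases are hypothetical.

```python
from collections import defaultdict


def first_tier_categories(uses_encryption: bool,
                          supports_transactions: bool,
                          provides_information: bool) -> set:
    """Map the detected signals to first-tier categories."""
    categories = set()
    # Either signal is enough to treat the page as business-transacting.
    if uses_encryption or supports_transactions:
        categories.add("Commerce")
    if provides_information:
        categories.add("Information")
    return categories


def assign_to_lists(lists, url: str, categories: set) -> None:
    """Add the page to one list per category (sets stand in for databases)."""
    for category in categories:
        lists[category].add(url)


lists = defaultdict(set)
assign_to_lists(lists, "http://shop.example.com/",
                first_tier_categories(True, True, False))
assign_to_lists(lists, "http://catalog.example.com/",
                first_tier_categories(False, True, True))
```

A page with both signals lands on both lists, which matches the "both the first and second lists" wording above.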
- The step of determining whether a page has information relating to one or more subject matter categories is preferably performed by parsing the text of the page.
- Various currently available technologies that parse text may perform this function satisfactorily.
- The step of assigning a page to one or more subject matter categories is preferably performed by assigning a page that has information related to particular subject matter categories to a separate list for each such subject matter category, where each list contains only pages having information related to that subject matter category.
- The categories are preferably the twelve categories of second tier 14.
- The lists are preferably databases.
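The text does not name a particular parsing technology, so the sketch below substitutes the simplest possible stand-in: matching page words against a per-category vocabulary. The Government terms echo the topics listed earlier; the Medical and History vocabularies, the `TOPIC_TERMS` table, and the `subject_categories` function are assumptions made purely for illustration.

```python
import re

# Hypothetical vocabularies; only the Government terms come from the text.
TOPIC_TERMS = {
    "Government": {"government", "law", "military", "nations", "politics", "taxes"},
    "Medical": {"medical", "medicine", "disease"},
    "History": {"history", "historical", "century"},
}


def subject_categories(page_text: str) -> set:
    """Return every subject category whose vocabulary appears in the text."""
    words = set(re.findall(r"[a-z]+", page_text.lower()))
    matched = {cat for cat, terms in TOPIC_TERMS.items() if words & terms}
    # Pages that fit no listed subject fall into the catch-all category.
    return matched or {"Other"}
```

A production system would need far richer vocabularies or a trained classifier; the point here is only the shape of the step: text in, set of second-tier categories out.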
- The step of determining the type of files associated with a page may be performed by identifying files containing text, graphics, audio, video, multimedia, and communications between persons. This step can be satisfactorily accomplished by search engines that scan Web pages and recognize file extensions such as .au (audio), .wav (sound), .gif (image), .jpeg (image), .jpg (image), .avi (video), .mpeg (movies), and .mpg (movies).
- The step of assigning a page to one or more categories based on file type is preferably performed by assigning a page that is associated with particular file types to a separate list for each such file type where each list contains only pages associated with a single file type.
- The categories are preferably the five file-type categories of third tier 16.
- The lists are preferably databases.
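Recognizing file types by extension can be sketched as a lookup over the resources a page references. The extensions are the ones named above; mapping video extensions to Multimedia follows the category definitions given earlier. The names `EXTENSION_CATEGORY` and `file_type_categories` are hypothetical, and the Communication category is left out because the text does not tie it to file extensions.

```python
import os
from urllib.parse import urlparse

# Extensions named in the text, mapped to third-tier categories.
EXTENSION_CATEGORY = {
    ".au": "Audio", ".wav": "Audio",
    ".gif": "Visual", ".jpeg": "Visual", ".jpg": "Visual",
    ".avi": "Multimedia", ".mpeg": "Multimedia", ".mpg": "Multimedia",
}


def file_type_categories(resource_urls) -> set:
    """Return the file-type categories for the files a page references."""
    found = set()
    for url in resource_urls:
        # Use only the path so query strings do not hide the extension.
        extension = os.path.splitext(urlparse(url).path)[1].lower()
        category = EXTENSION_CATEGORY.get(extension)
        if category:
            found.add(category)
    # No visual, audio, or multimedia material means the page is Text-only.
    return found or {"Text-only"}
```

For example, a page referencing `photo.gif` and `clip.mpg` would be assigned to both Visual and Multimedia.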
- The step of assigning a page to a subcategory is performed after the page has been assigned to all possible categories from three tiers 12, 14, and 16.
- The Web crawler assigns a page that has been assigned to two or more categories to a subcategory consisting of only pages assigned to the identical categories. For example, a page that has been categorized into the categories of Information, History, Medical, and Visual would be assigned to a subcategory containing only pages also assigned to the identical categories of Information 20, History 32, Medical 24, and Visual 46.
- A separate list is created for each of the possible combinations of any two or more categories of three tiers 12, 14, and 16. Each list is preferably a separate database. Examples of software that can be used for creating and managing databases are Oracle 8i version 2 with the File System option and Informix Dynamic Server.
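One way to picture the per-combination lists is an index keyed by sets of categories. The sketch below follows one reading of the passage, in which a page is entered under every combination of two or more of its categories, so that looking up a combination returns the Venn-diagram intersection. That reading, the in-memory dictionary standing in for separate databases, and the name `build_subcategory_index` are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def build_subcategory_index(assignments):
    """Map each combination of two or more categories to the pages in it.

    `assignments` maps a page URL to the set of categories assigned to it.
    """
    index = defaultdict(set)
    for url, categories in assignments.items():
        ordered = sorted(categories)
        for size in range(2, len(ordered) + 1):
            for combo in combinations(ordered, size):
                index[frozenset(combo)].add(url)
    return index
```

The number of combinations grows exponentially with the number of categories on a page, so a real system would more likely compute intersections at query time than materialize every list.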
- The apparatus for categorizing pages on a network includes means or mechanisms for determining whether a page is involved in transacting business or providing information; means for assigning business-transacting pages to one category, information-providing pages to a second category, and pages involved in both transacting business and providing information to both the first and second categories; means for determining whether a page has information related to one or more subject matter categories; means for assigning a page to one or more subject matter categories; means for determining the types of files associated with a page; and means for assigning the page to one or more file-type categories.
- The apparatus may also include means for indicating to a search engine that the page has been categorized automatically.
- The third embodiment of the invention is a method and apparatus for searching for and locating information on a network.
- The method allows the user to search pages on a network that have already been categorized into three tiers of categories 12, 14, and 16.
- The categorization may have been done by the creator of a page at the time the page was created or during editing at a later time, or by a Web crawler automatically at some time after the page was created.
- The method also includes a categorization step, preferably performed by a search engine, before the search is begun in order to categorize any new pages that have not yet been categorized.
- The categorizing step comprises assigning the page to one or more categories, including a category for pages involved in transacting business and a category for pages involved in providing information, assigning the page to one or more subject matter categories, and assigning the page to one or more file-type categories.
- This categorizing step may be accomplished using a Web crawler, by the method and apparatus of the second embodiment.
- The method provides the user with the opportunity to limit the search by selecting categories from three tiers 12, 14, and 16 and by utilizing a keyword search.
- The user may select one or more categories from each of three tiers 12, 14, and 16, from one or two of the tiers, or from none of the tiers, and may or may not use the keyword search function.
- For convenience, as is well known in the art, when an icon is selected, its appearance changes such that it is emphasized (for example, highlighted).
- The user may select, from first tier 12, the category of Commerce 18, the category of Information 20, or both categories 18 and 20.
- The categories may be conveniently represented on the user's screen by an icon or a symbol, for example, as is preferred: "$" for Commerce 18 and "i" for Information 20. If the user selects "$," the search will be restricted to only those Web pages that are categorized as Commerce 18. This will include all pages in the Commerce category 18 as well as the subcategory that is both Commerce 18 and Information 20. Pages only in the Information category 20, and not also in Commerce 18, will automatically be excluded. If the user selects "i," the search will be restricted to only those Web pages that are categorized as Information 20. This will include all pages in Information category 20 as well as the subcategory that is both Information 20 and Commerce 18.
- Pages only in the Commerce category 18, and not also in Information 20, will automatically be excluded. If the user selects both "$" and "i," as shown in Figure 5, the search will be restricted to only those Web pages that are categorized as both Commerce 18 and Information 20. Only subcategory 56 of Commerce and Information will be searched. Pages only in Commerce 18 and pages only in Information 20 will be excluded. If none of the categories of first tier 12 are selected, the search will include Web pages of both categories and the subcategory and will not be narrowed based on whether the page is involved in transacting business or providing information.
- Each of the twelve categories of second tier 14 may be conveniently represented on the user's screen by a different icon or symbol, for example, as is preferred: a flag for Government, a caduceus for Medical, a mortarboard for Education & Social Science, a satellite dish for News, a bicycle for Sports & Adventure, a pyramid for History, a microscope for Science & Technology, an artist's palette for Arts & Humanities, a briefcase for Finance & Business, a book for Reference, an "X" for Explicit (pornographic or sexually-explicit material), and a "?" for Other.
- The user may also view a list of topics included in each category by clicking on the category.
- The twelve subject matter categories and their corresponding topics are shown in Figures 2, 3, and 4. If none of the categories are selected, the search will include Web pages of all twelve categories and will not be narrowed based on the subject matter contained in the page.
- The user may select one or more categories from third tier 16: Visual 46, Audio 48, Multimedia 50, Text-only 52, and Communication 54.
- Each of the five categories may be conveniently represented on the user's screen by an icon or symbol, for example, as is preferred: an eye for Visual, an ear for Audio, a lightning bolt for Multimedia, a text page for Text-only, and a mouth for Communication.
- If none of the categories of third tier 16 are selected, the results from the search will include Web pages that are associated with file-types of text, visual, audio, multimedia, and communications and will not be narrowed based on the types of files contained on the page.
- Combining categories restricts the search results to only the relevant categories and subcategories. The greater the number of categories chosen, the more refined the search and the greater the number of pages that are excluded from the search. When the user selects several categories, the user does not get results from each of those categories, but only from the subcategory that is created from the combination of the selected categories. Combining categories acts as a filtering process, eliminating irrelevant material from the search and from subsequent results. This method allows the user to exclude unwanted material, such as pornography, which is contained in Explicit category 42.
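The filtering behavior described above amounts to a subset test: a page is kept only if it carries every selected category. The following sketch illustrates that with the first-tier example; the `category_search` function and the sample page names are hypothetical.

```python
def category_search(pages, selected):
    """Return the pages assigned to every selected category.

    `pages` maps a page name to its set of assigned categories. An empty
    selection narrows nothing, so every page is returned.
    """
    selected = set(selected)
    return {name for name, cats in pages.items() if selected <= set(cats)}


pages = {
    "shop": {"Commerce"},
    "journal": {"Information"},
    "catalog": {"Commerce", "Information"},
}
```

Selecting only Commerce returns `shop` and `catalog`; selecting both Commerce and Information returns only `catalog`, the analogue of subcategory 56; selecting nothing returns all three pages.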
- The user may next enter a keyword 58, which can be a single word or multiple words.
- The keyword search can be formulated by using either Boolean logic terms or natural language.
- After making the selections, the user initiates the search.
- The symbols for the categories selected and the keyword preferably remain visible on the user's screen during the search.
- A page may have been categorized using the same categories as are available to the user to limit the search, or the site may have been categorized using different categories.
- The determination of whether a page is categorized is preferably performed by determining whether the page is contained or referred to on a list of categorized pages.
- The list may be a database or an index created automatically by a Web crawler, which contains the addresses of Web pages.
- If the network being searched contains at least one page categorized into one or more of the categories which were provided to the user to limit the search, then after a user initiates a category-limited search, an identification is made of all pages that have been assigned all of the categories to which the search was limited.
- An example of how a search works is shown in Figures 6 through 9. As shown in Figure 6, if the user selects category 20 Information from first tier 12 and category 24 Medical from second tier 14, the search and subsequent search results will be limited to subcategory 60 that is created by the combination of Information 20 and Medical 24 categories, as shown by gray area. The search results will not include pages from Information category 20 or Medical category 24 that are not contained within smaller subcategory 60.
- Figure 7 shows a search in which the user selected Information 20 from first tier 12 and History 32 and Medical 24 from second tier 14.
- the search and subsequent search results would be limited to subcategory 62 created by the combination of Information 20, Medical 24, and History 32 categories, as shown by the gray area.
- the search results will not include pages from Information 20, Medical 24, or History 32 categories that are not contained within smaller subcategory 62.
- Figure 8 shows a search in which the user selected Information 20 from first tier 12, Medical 24 and History 32 from second tier 14, and Visual 46 from third tier 16.
- The search and subsequent search results would be limited to subcategory 64 created by the combination of Information 20, Medical 24, History 32, and Visual 46 categories, as shown by the gray area.
- The search results will not include pages from Information 20, Medical 24, History 32, or Visual 46 categories that are not contained within the smaller subcategory 64.
- Figure 9 shows a search in which the user selected Information 20 from first tier 12, Medical 24 and History 32 from second tier 14, Visual 46 from third tier 16, and the keyword 58 "Pasteur." In that case, the search and subsequent search results would be limited to the pages in the subcategory created by the combination of Information 20, Medical 24, History 32, and Visual 46 categories that contain the keyword 58 "Pasteur." The search results will not include pages from Information 20, Medical 24, History 32, or Visual 46 categories that are not contained in that subcategory.
- All sites identified by the search are reported as search results to the user, by network address, such as a Web page's "uniform resource locator" (URL), so that the user can access any identified page. Other information, such as the first line, may also be reported.
- For each reported page, the results will show all of the symbols corresponding to all of the categories to which that page has been assigned. The results will also indicate whether the categorization step was performed automatically (for example, by a Web crawler).
- The apparatus for searching for and locating information on a network includes means or mechanisms for providing an opportunity to limit the search to one or more categories from three tiers 12, 14, and 16; means for providing an opportunity to limit the search by keyword; means for identifying all pages that are categorized into the categories to which the search was limited and that contain the keyword; and means for reporting the results to a user.
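The category-limited search walked through for Figures 6 through 9 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Page` record, the `search` and `report` functions, the sample URLs, and the page texts are all hypothetical, and the keyword test is a plain case-insensitive substring match rather than the Boolean or natural-language formulation mentioned in the description. The only behavior taken from the text is that a page is returned when it has been assigned all of the selected categories and, if a keyword is given, contains that keyword.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Page:
    """Hypothetical record for one categorized page."""
    url: str
    first_line: str
    text: str
    categories: FrozenSet[str]
    auto_categorized: bool = False  # True if a crawler assigned the categories


def search(pages: Iterable[Page],
           selected_categories: Iterable[str],
           keyword: Optional[str] = None) -> List[Page]:
    """Return pages assigned ALL selected categories (and containing the keyword)."""
    selected = frozenset(selected_categories)
    hits = []
    for page in pages:
        # A page qualifies only if every selected category was assigned to it.
        if not selected <= page.categories:
            continue
        # Assumption: simple case-insensitive substring match for the keyword.
        if keyword is not None:
            haystack = (page.first_line + " " + page.text).lower()
            if keyword.lower() not in haystack:
                continue
        hits.append(page)
    return hits


def report(hits: Iterable[Page]) -> List[dict]:
    """Report each hit by address, first line, categories, and categorization mode."""
    return [
        {
            "url": p.url,
            "first_line": p.first_line,
            "categories": sorted(p.categories),
            "auto_categorized": p.auto_categorized,
        }
        for p in hits
    ]


# Made-up sample data; category names follow the Figure 6-9 example.
PAGES = [
    Page("http://example.org/a", "Clinic hours", "Opening hours of the clinic.",
         frozenset({"Information", "Medical"})),
    Page("http://example.org/b", "History of surgery", "A timeline of surgery.",
         frozenset({"Information", "Medical", "History"})),
    Page("http://example.org/c", "Vaccination engravings", "Engravings of Edward Jenner.",
         frozenset({"Information", "Medical", "History", "Visual"})),
    Page("http://example.org/d", "Pasteur in pictures",
         "Photographs of Louis Pasteur's laboratory.",
         frozenset({"Information", "Medical", "History", "Visual"}),
         auto_categorized=True),
    Page("http://example.org/e", "Pasteur biography", "Louis Pasteur, chemist.",
         frozenset({"Information", "History"})),
]

if __name__ == "__main__":
    fig6 = search(PAGES, ["Information", "Medical"])
    fig7 = search(PAGES, ["Information", "Medical", "History"])
    fig8 = search(PAGES, ["Information", "Medical", "History", "Visual"])
    fig9 = search(PAGES, ["Information", "Medical", "History", "Visual"], "Pasteur")
    for name, hits in [("Fig. 6", fig6), ("Fig. 7", fig7), ("Fig. 8", fig8), ("Fig. 9", fig9)]:
        print(name, [p.url for p in hits])
    print(report(fig9))
```

Each additional category can only shrink the result set, which mirrors the progressively smaller gray areas in the figures. A page such as the hypothetical `http://example.org/e`, which mentions Pasteur but was never assigned Medical, is excluded from every search shown.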
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU49891/00A AU4989100A (en) | 1999-05-04 | 2000-05-03 | Method and apparatus for categorizing and retrieving network pages and sites |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US60/132,694 (US13269499P) | 1999-05-04 | 1999-05-04 | |
US09/565,695 (US56569500A) | 2000-05-03 | 2000-05-03 | |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2000067161A2 (en) | 2000-11-09 |
WO2000067161A3 (en) | 2002-06-06 |
Family
ID=26830640
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2000/012376 WO2000067161A2 (en) | 1999-05-04 | 2000-05-03 | Method and apparatus for categorizing and retrieving network pages and sites |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU4989100A (en) |
WO (1) | WO2000067161A2 (en) |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5706507A (en) * | 1995-07-05 | 1998-01-06 | International Business Machines Corporation | System and method for controlling access to data located on a content server |
US5924090A (en) * | 1997-05-01 | 1999-07-13 | Northern Light Technology Llc | Method and apparatus for searching a database of records |
- 2000-05-03: AU application AU49891/00A (AU4989100A), not active, abandoned
- 2000-05-03: WO application PCT/US2000/012376 (WO2000067161A2), active, application filing
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7168034B2 (en) * | 1999-03-31 | 2007-01-23 | Microsoft Corporation | Method for promoting contextual information to display pages containing hyperlinks |
WO2002054292A3 (en) * | 2000-12-29 | 2003-11-06 | Treetop Ventures Llc | A cooperative, interactive, heuristic system for the creation and ongoing modification of categorization systems |
EP1388091A4 (en) * | 2001-02-28 | 2006-01-18 | Microsoft Corp | Category name service |
US7213069B2 (en) | 2001-02-28 | 2007-05-01 | Microsoft Corporation | Category name service able to override the category name based on requestor privilege information |
GB2386440A (en) * | 2002-03-12 | 2003-09-17 | Univ Hertfordshire | Searching and navigating an information source |
EP2258365A1 (en) | 2003-03-28 | 2010-12-08 | Novartis Vaccines and Diagnostics, Inc. | Use of organic compounds for immunopotentiation |
EP2583678A2 (en) | 2004-06-24 | 2013-04-24 | Novartis Vaccines and Diagnostics, Inc. | Small molecule immunopotentiators and assays for their detection |
EP2277595A2 (en) | 2004-06-24 | 2011-01-26 | Novartis Vaccines and Diagnostics, Inc. | Compounds for immunopotentiation |
EP2612679A1 (en) | 2004-07-29 | 2013-07-10 | Novartis Vaccines and Diagnostics, Inc. | Immunogenic compositions for gram positive bacteria such as streptococcus agalactiae |
EP2360175A2 (en) | 2005-11-22 | 2011-08-24 | Novartis Vaccines and Diagnostics, Inc. | Norovirus and Sapovirus virus-like particles (VLPs) |
EP2357184A1 (en) | 2006-03-23 | 2011-08-17 | Novartis AG | Imidazoquinoxaline compounds as immunomodulators |
WO2008030529A3 (en) * | 2006-09-06 | 2008-05-22 | Nexplore Corp | System and method for providing focused search term results |
US8549436B1 (en) | 2007-06-04 | 2013-10-01 | RedZ, Inc. | Visual web search interface |
WO2009034473A2 (en) | 2007-09-12 | 2009-03-19 | Novartis Ag | Gas57 mutant antigens and gas57 antibodies |
WO2009081274A2 (en) | 2007-12-21 | 2009-07-02 | Novartis Ag | Mutant forms of streptolysin o |
EP2537857A2 (en) | 2007-12-21 | 2012-12-26 | Novartis AG | Mutant forms of streptolysin O |
WO2010079464A1 (en) | 2009-01-12 | 2010-07-15 | Novartis Ag | Cna_b domain antigens in vaccines against gram positive bacteria |
WO2011149564A1 (en) | 2010-05-28 | 2011-12-01 | Tetris Online, Inc. | Interactive hybrid asynchronous computer game infrastructure |
Also Published As
Publication number | Publication date |
---|---|
WO2000067161A3 (en) | 2002-06-06 |
AU4989100A (en) | 2000-11-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AK | Designated states | Kind code of ref document: A2. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A2. Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
| | REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
| | AK | Designated states | Kind code of ref document: A3. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A3. Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
| | 122 | EP: PCT application non-entry in European phase | |
| | NENP | Non-entry into the national phase | Ref country code: JP |