
WO1998048361A1 - Knowledge brokers using signed feature constraints - Google Patents


Info

Publication number
WO1998048361A1
WO1998048361A1 PCT/IB1998/000758 IB9800758W WO9848361A1 WO 1998048361 A1 WO1998048361 A1 WO 1998048361A1 IB 9800758 W IB9800758 W IB 9800758W WO 9848361 A1 WO9848361 A1 WO 9848361A1
Authority
WO
WIPO (PCT)
Prior art keywords
document
feature constraint
feature
entity
data processing
Prior art date
Application number
PCT/IB1998/000758
Other languages
English (en)
Inventor
Jean-Marc Andreoli
Uwe Borghoff
Original Assignee
Xerox Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xerox Corporation
Priority to JP54536198A (JP2001521664A)
Priority to EP98917545A (EP0985184A1)
Publication of WO1998048361A1
Priority to US09/421,846 (US7020670B1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/02 Digital computers in general; Data processing equipment in general manually operated with input through keyboard and computation using a built-in program, e.g. pocket calculators
    • G06F15/0225 User interface arrangements, e.g. keyboard, display; Interfaces to other computer systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3322 Query formulation using system suggestions

Definitions

  • the present invention relates to data processing, and more particularly relates to the transfer between computing devices, and the retrieval by such devices, of information or knowledge using signed feature constraints.
  • DMA Document Management Alliance
  • Knowledge brokers are software agents, which can process knowledge search requests.
  • Knowledge is taken here to be any piece of electronic information intended to be publicly accessible. Different, possibly distributed, information sources are assumed to be available, from a simple file in a user's directory to a database local to a site, up to a wide area information service (WAIS) on the internet, for example.
  • WAIS wide area information service
  • When receiving a request, a broker may have sufficient knowledge to process it, or may need to retrieve more knowledge. For that purpose, it releases sub-requests, aimed at other brokers. Thus, knowledge retrieval is achieved by the collaboration of all the brokers, which are alternately service providers processing requests and clients of these services generating sub-requests.
  • a request can be expressed by a pair (x, P) where x is a logical variable and P a logical formula involving x, meaning "Retrieve knowledge objects x such that the property expressed by formula P holds".
  • an answer to such a request can be expressed in the same formalism, i.e. a pair (x, Q) meaning "There exists a knowledge object x satisfying the property expressed by formula Q".
  • a broker with scope (x, R) means "I am not capable of retrieving knowledge objects x which do not satisfy the property expressed by formula R".
  • the scope of a broker may vary, because it gets specialised or, on the contrary, expands its capacities, either externally or due to the knowledge retrieval process itself.
  • logic provides a common language where requests, answers and scopes can be expressed. Brokers then perform logical operations on these three components.
  • FCs feature constraints
  • satisfiability problem in this case is also known as “feature constraint solving”.
  • feature constraints are built from atomic constraints, which are either sorts or features.
  • a sort is a unary relation, expressing a property of a single entity.
  • P : person expresses that an entity P is of sort person.
  • a feature is a binary relation expressing a property linking two entities.
  • P employer→E expresses that entity P has an employer, which is an entity E.
  • Most feature systems also allow built-in relations such as equality and disequality.
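The atomic constraints described above (sorts, features and built-in relations such as equality) can be pictured as small immutable records. The following is a minimal sketch only; the class names `Sort`, `Feature` and `Equal` and the helper `entities` are illustrative assumptions and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sort:
    """Unary relation: `entity` is of sort `sort`, e.g. P : person."""
    entity: str
    sort: str


@dataclass(frozen=True)
class Feature:
    """Binary relation: `source` has feature `name` with value `target`."""
    source: str
    name: str
    target: str


@dataclass(frozen=True)
class Equal:
    """Built-in relation stating that two entities are equal."""
    left: str
    right: str


def entities(constraints):
    """Collect the names of all entities mentioned in a list of atomic constraints."""
    found = set()
    for c in constraints:
        if isinstance(c, Sort):
            found.add(c.entity)
        elif isinstance(c, Feature):
            found.update((c.source, c.target))
        elif isinstance(c, Equal):
            found.update((c.left, c.right))
    return found


# "P is a person whose employer is E" (hypothetical encoding of the two examples above)
example = [Sort("P", "person"), Feature("P", "employer", "E")]
```

A feature constraint is then simply a collection of such records, which keeps the representation independent of any particular solver.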
  • the present invention provides a method carried out in a data processing device including a processor, memory, and a user interface, the data processing device being coupled in a network to one or more other data processing devices, at least one of the data processing devices including means for storing a repository of electronic documents, comprising: (a) receiving at least one user input designating a feature constraint, said feature constraint comprising at least a positive component and a negative component, each of the positive component and the negative component including one or more relations, the or each relation defining a document related entity and a property of the entity, (b) solving the feature constraint to determine from the positive and negative components one or more document references, the or each document reference corresponding to a document within said repository satisfying said feature constraint.
  • the invention further provides a method carried out in a data processing device including a processor, memory, and a user interface, comprising: (i) receiving a first user input designating a graphical object corresponding to a stored feature constraint, said feature constraint comprising at least a positive component and a negative component, each of the positive component and the negative component including one or more relations, the or each relation defining a document related entity and a property of the entity, (j) receiving a second user input indicating that the feature constraint is to be sent to another data processing device, (k) encoding the feature constraint in a data packet, and (l) transmitting the data packet.
  • the method may include retrieving knowledge from a repository stored in a data processing device at each of a plurality of locations, and optionally combining each piece of knowledge so obtained to generate a new document.
  • the invention further provides a data processing device when suitably programmed for carrying out the methods as set forth above, or according to any of the appended claims, the device comprising a processor, a memory, and a user interface.
  • the invention further provides a data processing device comprising: a processor, a memory coupled to the processor, and a user interface coupled to the processor and to the memory and adapted to be operable by a user to generate user inputs, the data processing device being coupled in a network to one or more other data processing devices, at least one of the data processing devices including means for storing a repository of electronic documents, the data processing device further comprising means for receiving at least one user input designating a feature constraint, said feature constraint comprising at least a positive component and a negative component, each of the positive component and the negative component including one or more relations, the or each relation defining a document related entity and a property of the entity, means for solving the feature constraint to determine from the positive and negative components one or more document references, the or each document reference corresponding to a document within said repository satisfying said feature constraint.
  • the invention further provides a system for accessing or distributing electronic documents, according to claim 19 of the appended claims.
  • the invention further provides a portable device for accessing or distributing electronic documents, according to claim 20 of the appended claims.
  • the invention further provides an apparatus for scanning, copying and/or printing documents, according to claim 21 of the appended claims.
  • the invention employs a subset of feature constraints — “signed feature constraints” (SFC) — and a method for solving SFCs.
  • SFCs can be used in knowledge retrieval engines to specify, in a common language, (i) knowledge search requests, (ii) the answers to these requests and (iii) the state of the knowledge retrieving agents (referred to herein as knowledge brokers).
  • Figure 1 illustrates schematically a data processing network that may be used to implement an embodiment of the invention
  • Figure 2 illustrates schematically the scope defined by a signed feature constraint
  • Figure 3 is a view of the user interface of a fixed computing device at one instant during the entry by a user of a query
  • Figure 4 shows a schematic flow chart of the steps in entering elements of a query using the interface of Fig. 3;
  • Figure 5 illustrates a paper form suitable for use by a user in an alternative embodiment of the invention, for entering a query
  • Figure 6 shows a schematic flow chart of the steps in entering elements of a query using the paper form of Fig. 5;
  • Figure 7 is a schematic flow chart of the steps in using a feature constraint to retrieve document references and display or print corresponding documents;
  • Figure 8 illustrates a portion of a list of hits obtained during the process of Fig. 7;
  • Figure 9 shows selected hits from the list of Fig. 8 after transformation into HTML format
  • Figure 10 illustrates a more detailed presentation of a single selected hit.
  • the present invention may be implemented using conventional computer network technology, either using a local area network (LAN) or, more suitably, a wide area network (WAN).
  • the invention has been implemented using conventional web browser software (e.g. Netscape) providing cross-platform communication and document transfer over the internet.
  • This is schematically illustrated in Fig. 1.
  • each machine 22, 24, 26 forming part of the network 21 may be a PC running Windows™, a Mac running MacOS, or a minicomputer running UNIX, which are well known in the art.
  • the PC hardware configuration is discussed in detail in The Art of Electronics, 2nd Edn, Ch. 10, P. Horowitz and W. Hill, Cambridge University Press, 1989.
  • see EP-A-691,619 (hereafter "EP'619").
  • Exemplary network configurations are discussed in detail in, for example, EP-A-772,857 and EP-A- (corresponding to US application S.N. 08/668,704).
  • a document stored on machine 26 may be retrieved and sent from machine 26 over the internet, via any number of intermediate machines 24, to machine 22.
  • the document may be retrieved using as a unique identifier its World Wide Web URL, as discussed in EP'619.
  • also connected to the network 21 is any number of printers or multifunction devices (capable of scanning/printing/faxing, etc.) (not shown), as discussed in EP'619.
  • Multifunction devices are discussed in more detail in EP-A-741,487.
  • Each machine coupled to the network may be equipped with appropriate hardware and software, which is known in the art, for communication with portable computing devices, such as personal digital assistants (PDAs), handheld PCs, or pocket or wristwatch computers.
  • PDAs personal digital assistants
  • the requesting machine may generate a request in response to receiving a data packet from a user of a portable computing device, as discussed in further detail in international patent application WO-A- , based on British patent application 9708175.6
  • the present invention makes use of a powerful operation, referred to as "scope-splitting", which relies on the use of negation.
  • a broker may wish to split its scope, specified by a pair (x, P), according to a criterion expressed by a formula F, thus creating two brokers with scope P ∧ F and P ∧ ¬F.
  • a broker in charge of bibliographic information may wish to split its scope into two new scopes: “books written after 1950", which can be represented by the BFC x
  • a signed feature constraint is composed of a positive part and a list of negative parts, both of them being basic feature constraints.
  • … "Xerox" specifies a Xerox employee who is not American and is not married to another Xerox employee. This is represented graphically in Fig. 2.
  • the round boxes denote the entities (logical variables), the sort relations (unary) are represented by dashed arrows labelled by the name of the sort in a square box, the feature relations (binary) are represented by plain arrows labelled by the name of the feature in a square box.
  • the built-in predicates (not present in the example) are represented by rhombuses.
  • the positive part of the SFC is contained in the top box and marks the distinguished entity of the scope (P in the example) by a double round box.
  • the negative parts of the SFC are contained in the lower boxes in grey.
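The structure of a signed feature constraint (one positive part plus a list of negative parts) makes the scope-splitting operation described earlier easy to picture: splitting a scope P by a criterion F yields P ∧ F by extending the positive part, and P ∧ ¬F by appending F as a further negative part. The sketch below is an illustration under stated assumptions: atomic constraints are written as plain strings, the names `SFC` and `split_scope` are hypothetical, and the treatment of local variables in F is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SFC:
    """A signed feature constraint: one positive part and a tuple of negative parts.

    Each part is a tuple of atomic constraints, written here as strings.
    """
    positive: tuple
    negatives: tuple = ()


def split_scope(scope, criterion):
    """Split `scope` by the basic feature constraint `criterion` (a tuple of atoms).

    Returns (scope AND criterion, scope AND NOT criterion).
    """
    criterion = tuple(criterion)
    inside = SFC(scope.positive + criterion, scope.negatives)
    outside = SFC(scope.positive, scope.negatives + (criterion,))
    return inside, outside


# Hypothetical bibliographic scope split on "written after 1950".
books = SFC(positive=("x : book",))
after_1950, not_after_1950 = split_scope(books, ("x year→ y", "y > 1950"))
```

Both results are again signed feature constraints, so the two new brokers can describe their scopes in the same language as the original broker.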
  • the main interest of SFC comes from the following property:
  • the embodiment disclosed here makes use of a slight variant of the basic axiom system used in Aït-Kaci H. et al. (1994), A Feature-Based Constraint-System for Logic Programming with Entailment, Theoretical Computer Science 122, pp. 263-283, although it will be appreciated by persons skilled in the art that the principles of the method apply to other sets of axioms as well.
  • Sorts are disjoint: this means that no entity can be of two distinct sorts. For example, a book is not a person: we cannot have an entity x with x:book and x:person. Other systems consider hierarchies of sorts where some entities can have multiple sorts as long as they have a common denominator in the hierarchy.
  • Constraint satisfaction over BFCs is defined by a set of conditional rewrite rules over BFCs (section B.1 of the Appendix) which have the following properties
  • a BFC is satisfiable if and only if its normal form is not reduced to the contradiction.
  • One implication can be proved by showing that rewrite steps preserve satisfiability.
  • the reverse implication can be proved by displaying a model that satisfies BFCs whose normal form is not reduced to the contradiction.
  • the rewrite rules describe the steps of the constraint satisfaction algorithm. This algorithm always terminates because the system of rewrite rules is convergent. It is to be noted that the definition of the rules relies on satisfiability tests of built-in constraints, which has been assumed decidable. This means that the algorithm is modular and can accommodate any kind of built-in constraints as long as a proper built-in constraint satisfaction algorithm is provided.
  • rewrite rules for constraint satisfaction algorithm can be implemented in a naive way in some symbolic language like Lisp or Prolog, or can be optimised, taking into account the properties of the specific built-in constraints which are used.
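To make the idea of a naive constraint satisfaction algorithm concrete, here is a small sketch of a satisfiability check for basic feature constraints. It is not the rule set of section B.1 of the Appendix, which is not reproduced in this text. It assumes only two axioms: sorts are disjoint (stated above) and features are functional, i.e. an entity has at most one value per feature (an assumption made for this example). Built-in constraints are left out, and the union-find bookkeeping is a choice made here, not the patent's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sort:
    entity: str
    sort: str


@dataclass(frozen=True)
class Feature:
    source: str
    name: str
    target: str


def satisfiable(constraints):
    """Return True unless the constraints reduce to a contradiction."""
    parent = {}

    def find(v):
        # Union-find lookup with path halving.
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    features = [c for c in constraints if isinstance(c, Feature)]
    changed = True
    while changed:  # repeat until no more entities can be identified
        changed = False
        seen = {}
        for c in features:
            key = (find(c.source), c.name)
            target = find(c.target)
            other = find(seen.setdefault(key, target))
            if other != target:
                # Same entity, same feature: the two values must be equal.
                parent[target] = other
                changed = True

    sort_of = {}
    for c in constraints:
        if isinstance(c, Sort):
            root = find(c.entity)
            if sort_of.setdefault(root, c.sort) != c.sort:
                return False  # one entity with two distinct sorts
    return True
```

For instance, `satisfiable([Sort("x", "book"), Sort("x", "person")])` returns `False`, matching the book/person example given for disjoint sorts. Each merge reduces the number of distinct entities, which is why the loop terminates.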
  • Figure 3 is a view of the user interface of a fixed computing device at one instant during the entry by a user of a query.
  • a main query entry box 30 is displayed, in a form well known in the art.
  • the query box 30 includes boxes 31, 32 which the user can select by mouse inputs, and can use to type in, or complete, elements (e.g. "books/articles") of the query.
  • Buttons 33 may be used to select document-related entities, such as "title", and a constraint applying to them, such as "contains not".
  • buttons 34, 36 allow the user to restart, add to, edit and build up a query.
  • Each element of the query is gradually added to the current specification of the query, which is displayed in its current state in box 37.
  • button 38 is pressed to launch the search.
  • Figure 4 shows a schematic flow chart of the steps in entering elements (e.g. date after 90) of a query using the interface of Fig. 3.
  • the knowledge broker main query window is displayed (step s41).
  • the elements of the query are then received in turn as they are keyed in by the user (s42).
  • as each element is received, the "current specification" is updated to include it and is displayed (s43).
  • each query element is converted (s44) to its corresponding logical relation(s) - see section 2 above.
  • the feature constraint is then compiled (s45) from the set of logical relations.
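The conversion and compilation steps of Fig. 4 (s44, s45) can be sketched as follows. This is an assumption-laden illustration: query elements are taken to be (attribute, operator, value) triples, a negated operator is recognised by a trailing " not" (as in the "contains not" button mentioned above), and the function name `compile_query` and the dictionary layout are hypothetical.

```python
def compile_query(elements):
    """Compile query elements into positive and negative components.

    Each element is an (attribute, operator, value) triple. Elements whose
    operator ends in " not" become a negative part of their own; all other
    elements are collected into the single positive part.
    """
    positive = []
    negatives = []
    for attribute, operator, value in elements:
        if operator.endswith(" not"):
            # Strip the negation marker and store the relation as a negative part.
            negatives.append([(attribute, operator[: -len(" not")], value)])
        else:
            positive.append((attribute, operator, value))
    return {"positive": positive, "negatives": negatives}


# Hypothetical query: date after 90, title does not contain "logic".
constraint = compile_query([
    ("date", "after", 90),
    ("title", "contains not", "logic"),
])
```

Keeping each negated element in its own negative part mirrors the "positive part plus list of negative parts" shape of a signed feature constraint.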
  • Figure 5 illustrates a paper form 50 suitable for use for entering a query by a user in an alternative embodiment of the invention.
  • This embodiment is suitable for the user of a multifunction device, or a user having a scanner coupled to a computer.
  • the form 50 used has several sections 51, 52, 53, 54, enabling the user to enter information about the type of document, author's name, date, and topic; however, it will be appreciated that any number of sections may be used, for entering any kind of information that a user may expect to have about a document.
  • next to each option indicated by human readable text is a box which, when checked by a user, enables the choice to be determined by machine reading, as is known in the art.
  • Certain boxes may be used to enter handwritten information. Alternatively, the query may be entered on a sheet entirely in typewritten or handwritten form, with the content of the query being determined by OCR and, where necessary, handwriting recognition.
  • Figure 6 shows a schematic flow chart of the steps in entering elements of a query using the paper form of Fig. 5.
  • the sheet is scanned and a bitmapped image data file corresponding to the content of the sheet is stored (step s60). Then, (s61) an analysis is made of the image data at the locations corresponding to the boxes 55-58, either as to whether the box was checked, or to extract the information written in the box. Then, for each section 51-54, the specified query element is determined (s62), where necessary by applying handwriting recognition and OCR (s63). Each query element is then converted to the corresponding logical relation(s) - see section 2 above. The feature constraint is then compiled (s45) from the set of logical relations.
  • Figure 7 is a schematic flow chart of the steps in using a feature constraint to retrieve document references and display or print corresponding documents. This may be performed by a conventional computer device, or by a multifunction device or printer equipped with a user interface.
  • an FC is received from a user, for example in a data packet from a user of a portable device, as illustrated in Fig. 8, or by input directly into the machine by a user operating a keyboard and mouse, or touch screen, as is well known in the art.
  • the FC is solved as described in section 3 above, and the resulting request in the appropriate form passed to the search engine (s92).
  • the search request is used to search all available repositories for documents satisfying the FC (s93); and if necessary, the request may be broken down into subrequests as discussed in more detail in Andreoli et al (1996), The Constraint-Based Knowledge Broker Model: Semantics, Implementation and Analysis, J. Symbolic Computation.
  • a list of hits — of documents satisfying the FC — is displayed (s94), as shown in Fig. 8. Then, in response to appropriate user input, operations may be performed to display individual hits with expanded detail of the document, to convert the document information to HTML format, or to download the document from the repository (s95).
  • Figure 9 shows selected hits from the list of Fig. 8 after transformation into HTML format. As can be seen, for each hit there is displayed further information, such as author name, http_url, information source and title. If desired, the user can view the document for hit 1 by mouse clicking on the http_url displayed. The document can then be printed, if needed (s96).
  • Figure 10 illustrates a more detailed presentation of a single selected hit, i.e. with a set of attributes of the document. It can be seen that against one or more of the attributes are displayed URLs providing links to further pages providing information related to those attributes.
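The retrieval step of Fig. 7 (searching a repository for documents satisfying the feature constraint and listing the hits) can be illustrated with a toy in-memory repository. Everything in this sketch is an assumption made for illustration: documents are attribute dictionaries, relations are (attribute, operator, value) triples, and the operator names and the `retrieve` function are hypothetical. A document is a hit when it satisfies every relation of the positive component and does not fully satisfy any negative part.

```python
OPERATORS = {
    "equals": lambda field, value: field == value,
    "contains": lambda field, value: str(value).lower() in str(field).lower(),
    "after": lambda field, value: field > value,
}


def holds(document, relation):
    """Check one (attribute, operator, value) relation against a document."""
    attribute, operator, value = relation
    if attribute not in document:
        return False
    return OPERATORS[operator](document[attribute], value)


def retrieve(repository, positive, negatives):
    """Return references of documents satisfying the signed constraint."""
    hits = []
    for reference, document in repository.items():
        if not all(holds(document, r) for r in positive):
            continue  # positive component not satisfied
        if any(all(holds(document, r) for r in part) for part in negatives):
            continue  # some negative part is satisfied, so the document is excluded
        hits.append(reference)
    return hits


# Hypothetical repository keyed by document reference.
repository = {
    "doc1": {"title": "Logic Programming", "date": 95},
    "doc2": {"title": "Constraint Agents", "date": 96},
    "doc3": {"title": "Databases", "date": 85},
}
hits = retrieve(
    repository,
    positive=[("date", "after", 90)],
    negatives=[[("title", "contains", "logic")]],
)
```

In a distributed setting the same check would run at each repository, with a broker combining the returned references into the list of hits.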
  • $\forall x, y:\ x{:}\tau \wedge y{:}\tau \supset x = y$ if $\tau$ is a value sort
  • $\forall x, y, z:\ x \xrightarrow{f} y \wedge x \xrightarrow{f} z \supset z = y$
  • disequality can be axiomatized by
  • Precedence constraints are axiomatized by

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for configuring knowledge broker software that processes information requests expressed in terms of feature constraints, and that decomposes the requests into sub-requests if necessary. The invention uses a subset of feature constraints called signed feature constraints (SFCs). Each such feature constraint has at least a positive component and a negative component, each of the positive and negative components comprising one or more relations, for example sort relations and feature relations. One advantage of the method is that SFCs can be used by knowledge retrieval engines to specify, in a common language, (i) knowledge search requests, (ii) the answers to these requests and (iii) the state of the knowledge retrieving agents (referred to here as knowledge brokers). The invention further relates to a method for solving SFCs.
PCT/IB1998/000758 1997-04-23 1998-04-23 Knowledge brokers using signed feature constraints WO1998048361A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP54536198A JP2001521664A (ja) 1997-04-23 1998-04-23 Knowledge brokers using signed feature constraints
EP98917545A EP0985184A1 (fr) 1997-04-23 1998-04-23 Knowledge brokers using signed feature constraints
US09/421,846 US7020670B1 (en) 1997-04-23 1999-10-20 Document constraint descriptors obtained from user signals indicating attribute-value relations

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB9708172.3A GB9708172D0 (en) 1997-04-23 1997-04-23 Knowledge brokers using signed feature constraints
GB9708172.3 1997-04-23

Related Child Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB1998/000757 Continuation WO1998048359A1 (fr) 1997-04-23 1998-04-23 Feature constraint based retrieval and distribution of document references

Publications (1)

Publication Number Publication Date
WO1998048361A1 true WO1998048361A1 (fr) 1998-10-29

Family

ID=10811192

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB1998/000758 WO1998048361A1 (fr) 1997-04-23 1998-04-23 Knowledge brokers using signed feature constraints

Country Status (4)

Country Link
EP (1) EP0985184A1 (fr)
JP (1) JP2001521664A (fr)
GB (1) GB9708172D0 (fr)
WO (1) WO1998048361A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6381598B1 (en) 1998-12-22 2002-04-30 Xerox Corporation System for providing cross-lingual information retrieval
US6434546B1 (en) 1998-12-22 2002-08-13 Xerox Corporation System and method for transferring attribute values between search queries in an information retrieval system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0691619A2 (fr) * 1994-06-27 1996-01-10 Rank Xerox Limited System for accessing and distributing electronic documents

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ANDREOLI, J.M., BORGHOFF, U.M., PARESCHI, R., SCHLICHTER, J.H.: "Constraint Agents for the Information Age", J.UCS - JOURNAL OF UNIVERSAL COMPUTER SCIENCE, vol. 1, no. 12, December 1995 (1995-12-01), http://www.iicm.edu/jucs_1_12/constraint_agents_for_the/html/paper.html, pages 762 - 789, XP002075155 *

Also Published As

Publication number Publication date
EP0985184A1 (fr) 2000-03-15
JP2001521664A (ja) 2001-11-06
GB9708172D0 (en) 1997-06-11

Similar Documents

Publication Publication Date Title
US11790009B2 (en) Knowledge operating system
US20210149980A1 (en) Systems and method for investigating relationships among entities
US6820075B2 (en) Document-centric system with auto-completion
US6928425B2 (en) System for propagating enrichment between documents
US6778979B2 (en) System for automatically generating queries
US6732090B2 (en) Meta-document management system with user definable personalities
US7133862B2 (en) System with user directed enrichment and import/export control
US7284191B2 (en) Meta-document management system with document identifiers
US7117432B1 (en) Meta-document management system with transit triggered enrichment
US7020670B1 (en) Document constraint descriptors obtained from user signals indicating attribute-value relations
Velásquez et al. DOCODE 3.0 (DOcument COpy DEtector): A system for plagiarism detection by applying an information fusion process from multiple documental data sources
WO2005099381A9 (fr) Expression and time-based data creation and creator-controlled organization
KR20120058544A (ko) Image element searching
EP0976057A1 (fr) Feature constraint based retrieval and distribution of document references
Kettler et al. A template-based markup tool for semantic web content
TW486638B (en) A system, method and article of manufacture for effectively interacting with a network user
EP0985184A1 (fr) Knowledge brokers using signed feature constraints
Zaka Theory and applications of similarity detection techniques
Völkel Personal knowledge models with semantic technologies
Müller Inducing conceptual user models
Carr et al. Towards a knowledge-aware office environment
Troyer et al. Exploiting link types during the conceptual design of websites
Casteleyn et al. Exploiting Link Types during the Web Site Design Process to Enhance Usability of Web Sites
US20210342541A1 (en) Stable identification of entity mentions
Rohmer Lessons for the future of Semantic Desktops learnt from 10 years of experience with the IDELIANCE Semantic Networks Manager.

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1998917545

Country of ref document: EP

ENP Entry into the national phase

Ref country code: JP

Ref document number: 1998 545361

Kind code of ref document: A

Format of ref document f/p: F

WWP Wipo information: published in national office

Ref document number: 1998917545

Country of ref document: EP
