WO2002073531A1 - One-step data mining with natural language specification and results - Google Patents
- Publication number
- WO2002073531A1 (application PCT/US2002/006247)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data mining
- data
- goal
- natural language
- text
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Definitions
- Information processing is generally the systematic performance of operations upon information. It includes data processing and can include operations such as data communication and office automation.
- Data processing is generally the systematic performance of operations upon data. Examples of data processing include arithmetic or logic operations upon data, merging or sorting of data, assembling or compiling of programs, or operations on text, such as editing, sorting, merging, storing, retrieving, displaying, or printing.
- a natural language is a language whose rules are based on current usage without being specifically prescribed. Examples of natural languages include English, Russian, and Chinese. In contrast, an artificial language is a language whose rules are explicitly established prior to its use. Examples of artificial languages include computer-programming languages such as C, Java, BASIC, FORTRAN, or COBOL.
- the method can also include providing (as part of the user interface) a plurality of text templates for communicating the key performance results and selecting one text template from among the plurality of text templates for communicating the key performance results, whereby the user interface does not display the same text template for every data mining operation.
- the user interface can be provided on a client system and the data model on a server, or the user interface and the client system can both be contained on a general-purpose computer.
- a third embodiment is a method in a computer system for controlling a data mining operation.
- This method includes a step of the computer system receiving problem specification input determining a data mining operation goal.
- the input data determining a data mining operation goal is the only input required by the data mining application.
- the problem specification input can be a formal definition based on a data model or can be natural language data.
- the method can also include identifying key performance results; providing a user interface having a control for communicating information; and communicating a natural language description of the key performance results using the control on the user interface.
- a fourth embodiment is a data mining application user interface.
- the user interface includes, but is not limited to, a control that receives natural language input describing the goal of a data mining operation and an interface that sends the natural language input to a text parser.
- FIG. 7 is a system resources chart illustrating an example of a configuration of data units and process units suitable for solving the problem of mapping a goal of data mining in text to data fields for automated problem specification.
- FIG. 8 illustrates screens and windows that can be presented to the user in an embodiment for mapping a goal of data mining in text to data fields for automated problem specification.
- FIG. 11 is a data flow chart that illustrates an example of a path of data in solving the problem of text display of key data mining performance results.
- the algorithm of this embodiment then recommends a set of fields, fewer than all the fields in the database, that have relatively higher probabilities to be candidates for or components of the dependent variable than other fields not recommended. The user can then narrow down the selection even further by selecting some of the suggested fields. If the dependent variable is actually some combination of the selected fields, the user can also define the true dependent variable by entering a mathematical expression.
- the boilerplate text template can be particularly adapted to each specialized market sector using the terminology ordinarily used by persons in that field.
- the performance summary text descriptions are made to seem more human and less automatic by randomizing the templates that have been customized for each market sector. While the different templates each provide essentially the same message, the body of the text can contain differently worded detail text.
- One embodiment provides one-step data mining. It asks the user to enter the goal of data mining, which can be specified in natural language such as plain English. This embodiment then employs techniques disclosed in provisional application 60/274,008, filed March 7, 2001, which is incorporated herein by reference, in order to transport the user directly to analyzing output results. This embodiment selects all the algorithm parameters and runs the entire data mining operation automatically.
- the data exploration window (255) can further include a basic information text box (265) containing fundamental information in the data set that is the subject of data mining.
- the data exploration window (255) can further include additional text boxes (267, 270) containing further information about the data set to be analyzed.
- the data exploration window (255) can further include an inputs text box (272) listing domain-space source variates relevant to the problem to be analyzed.
- the data exploration window (255) can further include a related field index textbox (275).
- the data exploration window (255) can further include an outputs textbox (276) listing the fields that are the range-space of potential candidate variables relevant to the problem to be analyzed.
- the identify-keywords process (414) matches words parsed from the text by the parse-text process (412) with known keywords. This matching can be performed by any algorithm suitable for this purpose. Those of ordinary skill in the art will recognize in general the equivalence of such algorithms for keyword identification as are now known or may later be developed.
- the next step concerns calculation of maximum a posteriori ("MAP") probabilities.
- control passes next to a calculate-MAP-probability process (420).
- the calculate-MAP-probability process (420) is a processing function to compare the results of the lexical analysis process (416) with names and descriptions of fields in the database to be analyzed by the data mining software package.
- control passes next to an identify-target-field-candidates process (425).
- the identify-target-field-candidates process (425) is a processing function to identify likely dependent variable fields in the database to be analyzed by the data mining software package. Based on the results of the comparison performed in the calculate-MAP-probability process (420), the identify-target-field-candidates process (425) can select those fields most likely to represent dependent variables for the end user's problem definition.
- the communicate-best-fields process (440) is a processing function to communicate to the end user the identification and ranking of fields likely to be relevant to the dependent variable that the application software identifies based on the enter-goal-in-natural-language process (410). This communicate-best-fields process (440) thus enables the end user to complete the selection and definition of the dependent variable for the data mining problem.
- control passes next to an incorporate-user-refinements process (445).
- the rank-input-fields process (447) is a processing function to rank input features based on their level of contribution to the projected data mining performance. One way to assess this contribution is to measure how accurately the input field predicts the selected output variable. Many feature-ranking algorithms can be used. The interchangeability and general equivalence of these algorithms will be appreciated by those of ordinary skill in the art. The selection of a particular feature-ranking algorithm for a particular embodiment of the data mining application software package might depend on various factors and is within the abilities of those of ordinary skill in the art. In one embodiment of the data mining application software package, the end user can be given the option to select input features. In a second embodiment of the data mining application software package, the selection of input features can be performed entirely by the data mining application software package. Any algorithm suitable for sorting can be employed for this rank-input-fields process (435).
- FIG. 5 depicts a system resources chart illustrating a configuration of data units and process units suitable for use in a software application to solve the problem.
- the control passes first to an enter-goal-in-natural-language process (410), which is a processing function to receive as input natural language description data (310) describing the goal of the data mining operation in natural language.
- input natural language description data (310) can be input manually at run time, but this is a detail of the implementation that can vary in other embodiments. Other forms of natural language description data in other media can be used without altering the basic characteristics of the invention.
- the lexical analysis process (416) can use other link-analysis techniques.
- the lexical analysis process (416) produces analyzed text data (340), here depicted as being stored in internal storage, although the medium for such analyzed text data (340) is a detail of implementation that can be varied in different embodiments of the invention.
- This analyzed text data (340) and a field description database (355) are next used in a calculate-MAP-probability process (420).
- control passes to an identify-target-field-candidates process (425).
- the identify-target-field-candidates process (425) is a processing function that uses the field descriptions database (355) and the results of the calculate-MAP-probability process (420) to select target fields data.
- the software application next evaluates the condition "Is MAP probability high enough" in a decision operation (430). If that condition evaluates to false, then in one embodiment control is passed to an alternative-problem-specification process (455).
- FIG. 6 depicts a program network chart illustrating the path of program activations and the interactions with related data for translating a goal of data mining expressed in natural language text into the specification of input and output variables automatically, prior to the commencement of data mining.
- Natural language description data (610) describing the data mining problem to be solved passes to a parse-text process (412).
- the parse-text process (412), upon completion, activates an identify-keywords process (414), which interacts with keywords data (620).
- the identify-keywords process (414) upon completion activates a lexical analysis process (416).
- Problem description data (710) is a natural language description of the goal to be analyzed by the data mining software application.
- the medium of problem description data (710) can be manual input.
- Manual input is data, the medium being of any type where the information is entered manually at the time of processing, for example, on-line keyboard, switch settings, push buttons, light pen, bar-code wand.
- the natural language description of the problem description data (710) could have been provided previously and stored in some other medium.
- Problem description data (710) communicates with the problem definition processor (750).
- Temporary workspace data (730) is storage used for working results, such as text that has been parsed and lists of fields likely to be part of the problem definition data (720). Temporary workspace data is here depicted as internal storage. Internal storage is data stored in, for example, RAM or a cache. Temporary workspace data (730) interacts with the problem definition processor (750).
- Field descriptions data (740) is a database or other suitable data structure containing field names and descriptive information regarding the database that can be analyzed by the data mining software application. Field descriptions data (740) is here depicted as direct access storage. Other media and storage forms are possible and equivalent for purposes of this invention to direct access storage, including but not limited to sequential storage and internal storage.
- the data set comprises three depicted tables: basic patient information, thrombosis-test results, and medical history.
- the field named "thrombosis" is the actual target variable.
- the goal of the data mining operation is to identify other input fields relevant in diagnosing thrombosis.
- the "birthday" field in particular shows an example of a field having no predictive power.
- the example is illustrated by a display window (800) containing elements used in one embodiment implementing the process.
- the display window (800) in this example includes conventional elements such as a title bar (805), a drop-down task menu (810), and control elements (815).
- the title bar (805) can contain any appropriate title such as, for example, "Figure No. 3: Help with selecting input be a ranking according to field importance."
- the drop-down task menu (810) can contain conventional elements such as a file menu, an edit menu, a window menu, and a help menu.
- the control elements (815) can include conventional controls such as a button to minimize the window (800), a button to maximize the window (800), a button to restore the window (800), and a button to close the window (800).
- the window (800) in this example depicts data from three tables: a basic information table (820), a thrombosis test table (825), and a ranking for historical data table (830).
- additional display control elements such as, for example, a slider control bar (835), can be included to permit subsections of the window (800) to be scrolled to display all data.
- fields from each table are explained in text boxes below that table.
- the fields from the basic information table (820) are enumerated in the basic information text box (840), which identifies the fields as sex, birthday, description, first date, admission, and diagnosis respectively.
- the fields charted in the thrombosis test table (825) are enumerated in the thrombosis test list box (845).
- the fields charted in the ranking for historical data table (830) are enumerated in the ranking for historical data list box (850).
- List boxes (840, 845, 850) can typically include a slider control bar to scroll through the items listed.
- FIG. 9 depicts a data flow chart illustrating a path of data and the processing steps in an embodiment of a method for displaying key performance results of a data mining operation in natural language, such as plain English, so that a novice user can understand the results without having to consult an expert for interpretation.
- the operation takes as input performance results of data (910) and type of operation data (920).
- Presentation templates data (990) is used by a template-selection process (980).
- the template-selection process (980) and the hierarchical-prioritization process (930) both can also take as input information from vertical market area data (970).
- the hierarchical-prioritization process (930) generates as its output a set of vital results data (940).
- the set of vital results data (940) passes as input to a performance-summary-generation process (950).
- the template-selection process (980), using vertical market area data (970) as input, generates output template data (990).
- the template data (990) is also provided as input to the performance summary generation process (950).
- the performance summary generation process (950) generates as output performance summary data (960), which can then be communicated to the user by any convenient means such as, for example, display on a cathode ray tube, output to a printer, or other output methods.
- control passes first to a select-vital-information process (1010).
- the select-vital-information process (1010) identifies key performance data about which information can be communicated to the user.
- Control passes next to a select-template process (1020).
- the select-template process (1020) identifies a predefined template appropriate for displaying the key information identified in the select-vital-information process (1010).
- control passes next to a generate-summary process (1030).
- a performance window (1280) displays first.
- the performance window (1280) can include conventional elements (1295) such as a title bar, control elements, and drop-down task menus.
- the title bar in the conventional control elements (1295) of the performance summary window (1280) contains the title "performance summary figure."
- the performance summary window (1280) can also include a text box (1285), which displays key data mining performance results using a text template in a natural language such as, for example, English.
- the performance summary window (1280) can also include a detailed analysis button (1290).
- the performance detail window (1205) can include detailed charts (1210, 1220, 1230, 1240, 1250, 1260), which show in more detail and in graphic form the information summarized in the summary window (1280). Additional controls (1270) can be included in the detail window (1205), the additional controls (1270) providing access to additional information.
- Computer readable media includes any recording medium in which computer code may be fixed, including but not limited to CDs, DVDs, semiconductor RAM, ROM, or flash memory, paper tape, punch cards, and any optical, magnetic, or semiconductor recording medium or the like.
- Examples of computer readable media include recordable-type media such as a floppy disc, a hard disk drive, a RAM, CD-ROMs, DVD-ROMs, an online internet web site, tape storage, and compact flash storage, and transmission-type media such as digital and analog communications links, and any other volatile or non-volatile mass storage system readable by the computer.
- the computer readable medium includes cooperating or interconnected computer readable media, which exist exclusively on a single computer system or are distributed among multiple interconnected computer systems that may be local or remote. Those skilled in the art will also recognize many other configurations of these and similar components which can also comprise a computer system, which are considered equivalent and are intended to be encompassed within the scope of the claims herein.
- the data mining application software is easy to use with most functionality behind the scenes. Such embodiments can include preprocessing, intelligent performance optimization, information visualization, or an intuitive graphical user interface (GUI) that provides guidance for novice users. Such embodiments can provide a near turnkey solution to expand the use of data mining technologies.
- the data mining application software provides technically powerful data mining capabilities. Such embodiments can include a robust algorithm set or scalability for large data sets. Some embodiments can provide digital signal processing and image processing algorithms for temporally and spatially sampled data, such as macroeconomic data or images. Some embodiments can provide fusion of several complementary algorithms. Some embodiments can provide advanced or simple visualization tools to enhance the interpretation of data mining results in order to derive actionable insights.
- the data mining application software is flexible and customizable. Some embodiments can provide seamless insertion of the user's algorithms through file-based I/O and dynamic script generation that maps the user's requests into actions. Some embodiments can provide web-based data mining through an intranet and/or the Internet.
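The goal-mapping steps described above (parse-text process (412), identify-keywords process (414), calculate-MAP-probability process (420), identify-target-field-candidates process (425), and decision operation (430)) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the source: the scoring formula is not disclosed, so a simple keyword-overlap count normalized across fields stands in for the MAP probability, and the field descriptions, stop-word list, function names, and the 0.4 threshold are all hypothetical.

```python
import re

# Hypothetical field descriptions, loosely modeled on the thrombosis example.
FIELD_DESCRIPTIONS = {
    "thrombosis": "degree of thrombosis diagnosed for the patient",
    "birthday": "date of birth of the patient",
    "admission": "whether the patient was admitted to hospital",
}

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "of", "a", "an", "to", "for", "in", "is", "was",
              "which", "and", "whether"}


def parse_text(goal):
    """Stand-in for the parse-text process (412): lowercase word tokens."""
    return re.findall(r"[a-z]+", goal.lower())


def identify_keywords(tokens):
    """Stand-in for the identify-keywords process (414): drop stop words."""
    return {token for token in tokens if token not in STOP_WORDS}


def score_fields(keywords, fields):
    """Assumed substitute for process (420): overlap counts normalized to sum to 1."""
    raw = {}
    for name, description in fields.items():
        field_words = identify_keywords(parse_text(name + " " + description))
        raw[name] = len(keywords & field_words)
    total = sum(raw.values())
    if total == 0:
        return {name: 0.0 for name in fields}
    return {name: count / total for name, count in raw.items()}


def identify_target_candidates(goal, fields, threshold=0.4):
    """Rank candidate fields (425); return [] if no score is high enough (430)."""
    scores = score_fields(identify_keywords(parse_text(goal)), fields)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked or ranked[0][1] < threshold:
        # Corresponds to falling back to an alternative problem specification.
        return []
    return [(name, score) for name, score in ranked if score > 0]
```

Calling `identify_target_candidates("Find which patients are likely to develop thrombosis", FIELD_DESCRIPTIONS)` ranks the hypothetical `thrombosis` field first. An empty list signals that the caller should ask the user to specify the problem another way.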
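The rank-input-fields process described above scores each input field by how well it predicts the selected output variable. The source leaves the choice of feature-ranking algorithm open, so the following Python sketch assumes one simple option, the absolute Pearson correlation with the target. The function names and sample data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_input_fields(inputs, target):
    """Sort input fields by |correlation| with the target, strongest first."""
    scores = {name: abs(pearson(values, target)) for name, values in inputs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: one informative field and one with no predictive power.
inputs = {
    "marker_a": [1, 2, 3, 4],
    "birth_year": [1970, 1950, 1950, 1970],
}
target = [0, 0, 1, 1]
ranking = rank_input_fields(inputs, target)
```

Here `marker_a` ranks above `birth_year`, mirroring the source's remark that a field such as "birthday" can have no predictive power. Correlation only captures linear relationships, so a real embodiment might prefer a different ranking measure.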
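The performance-summary path described above (select-vital-information process (1010), select-template process (1020), generate-summary process (1030)) picks one of several sector-specific templates at random so that the same wording is not shown for every operation. The Python sketch below shows one way this could look. The template wording, sector names, result keys, and priority order are hypothetical and are not taken from the source.

```python
import random

# Hypothetical templates per vertical market sector; each conveys the same message.
TEMPLATES = {
    "medical": [
        "The model correctly diagnosed {accuracy:.0%} of patients using {n_fields} fields.",
        "Using {n_fields} fields, {accuracy:.0%} of patient diagnoses were predicted correctly.",
    ],
    "general": [
        "The model was correct for {accuracy:.0%} of records using {n_fields} fields.",
    ],
}

# Hypothetical priority list of the results considered vital.
PRIORITY = ["accuracy", "n_fields"]


def select_vital_information(results):
    """Stand-in for process (1010): keep only the prioritized key results."""
    return {key: results[key] for key in PRIORITY if key in results}


def select_template(sector, rng):
    """Stand-in for process (1020): random choice among the sector's templates."""
    candidates = TEMPLATES.get(sector, TEMPLATES["general"])
    return rng.choice(candidates)


def generate_summary(results, sector, rng=None):
    """Stand-in for process (1030): fill the chosen template with the vital results."""
    if rng is None:
        rng = random.Random()
    vital = select_vital_information(results)
    return select_template(sector, rng).format(**vital)


results = {"accuracy": 0.87, "n_fields": 5, "runtime_s": 12.3}
summary = generate_summary(results, "medical", random.Random(0))
```

Passing a seeded `random.Random` makes the choice reproducible for testing, while the default gives varied wording across runs. An unknown sector falls back to the hypothetical `general` templates.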
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US27400801P | 2001-03-07 | 2001-03-07 | |
US60/274,008 | 2001-03-07 | | |
US09/945,530 | 2001-08-03 | | |
US09/945,530 US20020169735A1 (en) | 2001-03-07 | 2001-08-03 | Automatic mapping from data to preprocessing algorithms |
US09/942,435 | 2001-11-16 | | |
US09/992,435 US20020138492A1 (en) | 2001-03-07 | 2001-11-16 | Data mining application with improved data mining algorithm selection |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2002073531A1 (fr) | 2002-09-19 |
Family
ID=27402619
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2002/006247 WO2002073531A1 (en) | 2001-03-07 | 2002-03-01 | One-step data mining with natural language specification and results |
PCT/US2002/006248 WO2002073530A1 (en) | 2001-03-07 | 2002-03-01 | Data mining apparatus and method with user interface based ground-truth tool and user algorithms |
PCT/US2002/006519 WO2002073532A1 (en) | 2001-03-07 | 2002-03-04 | Hierarchical characterization of fields from multiple tables with one-to-many relations for comprehensive data mining |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2002/006248 WO2002073530A1 (en) | 2001-03-07 | 2002-03-01 | Data mining apparatus and method with user interface based ground-truth tool and user algorithms |
PCT/US2002/006519 WO2002073532A1 (en) | 2001-03-07 | 2002-03-04 | Hierarchical characterization of fields from multiple tables with one-to-many relations for comprehensive data mining |
Country Status (1)
Country | Link |
---|---|
WO (3) | WO2002073531A1 (fr) |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9244894B1 (en) | 2013-09-16 | 2016-01-26 | Arria Data2Text Limited | Method and apparatus for interactive reports |
US9323743B2 (en) | 2012-08-30 | 2016-04-26 | Arria Data2Text Limited | Method and apparatus for situational analysis text generation |
US9336193B2 (en) | 2012-08-30 | 2016-05-10 | Arria Data2Text Limited | Method and apparatus for updating a previously generated text |
US9355093B2 (en) | 2012-08-30 | 2016-05-31 | Arria Data2Text Limited | Method and apparatus for referring expression generation |
US9396181B1 (en) | 2013-09-16 | 2016-07-19 | Arria Data2Text Limited | Method, apparatus, and computer program product for user-directed reporting |
US9405448B2 (en) | 2012-08-30 | 2016-08-02 | Arria Data2Text Limited | Method and apparatus for annotating a graphical output |
US9600471B2 (en) | 2012-11-02 | 2017-03-21 | Arria Data2Text Limited | Method and apparatus for aggregating with information generalization |
US9640045B2 (en) | 2012-08-30 | 2017-05-02 | Arria Data2Text Limited | Method and apparatus for alert validation |
US9904676B2 (en) | 2012-11-16 | 2018-02-27 | Arria Data2Text Limited | Method and apparatus for expressing time in an output text |
US9946711B2 (en) | 2013-08-29 | 2018-04-17 | Arria Data2Text Limited | Text generation from correlated alerts |
US9990360B2 (en) | 2012-12-27 | 2018-06-05 | Arria Data2Text Limited | Method and apparatus for motion description |
US10115202B2 (en) | 2012-12-27 | 2018-10-30 | Arria Data2Text Limited | Method and apparatus for motion detection |
US10445432B1 (en) | 2016-08-31 | 2019-10-15 | Arria Data2Text Limited | Method and apparatus for lightweight multilingual natural language realizer |
US10467347B1 (en) | 2016-10-31 | 2019-11-05 | Arria Data2Text Limited | Method and apparatus for natural language document orchestrator |
US10565308B2 (en) | 2012-08-30 | 2020-02-18 | Arria Data2Text Limited | Method and apparatus for configurable microplanning |
US10664558B2 (en) | 2014-04-18 | 2020-05-26 | Arria Data2Text Limited | Method and apparatus for document planning |
US10776561B2 (en) | 2013-01-15 | 2020-09-15 | Arria Data2Text Limited | Method and apparatus for generating a linguistic representation of raw input data |
US11176214B2 (en) | 2012-11-16 | 2021-11-16 | Arria Data2Text Limited | Method and apparatus for spatial descriptions in an output text |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103150354A (en) * | 2013-01-30 | 2013-06-12 | 王少夫 | A data mining algorithm based on rough sets |
CN105117430B (en) * | 2015-08-06 | 2018-07-31 | 中山大学 | A repetitive-task process discovery method based on equivalence classes |
US10692601B2 (en) * | 2016-08-25 | 2020-06-23 | Hitachi, Ltd. | Controlling devices based on hierarchical data |
US10776408B2 (en) | 2017-01-11 | 2020-09-15 | International Business Machines Corporation | Natural language search using facets |
US10572826B2 (en) | 2017-04-18 | 2020-02-25 | International Business Machines Corporation | Scalable ground truth disambiguation |
US12197846B2 (en) | 2019-11-19 | 2025-01-14 | International Business Machines Corporation | Mathematical function defined natural language annotation |
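CN113342800A (en) * | 2020-02-18 | 2021-09-03 | 中国电信股份有限公司 | Data relationship processing method and apparatus, and computer-readable storage medium |
CN111640031B (en) * | 2020-05-29 | 2023-07-14 | 泰康保险集团股份有限公司 | Cross-system claims data processing method, apparatus, and related device |
CN113821552B (en) * | 2020-06-18 | 2023-11-17 | 南京南瑞继保电气有限公司 | Mapping method for exporting power real-time database model data to a relational database |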
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5933818A (en) * | 1997-06-02 | 1999-08-03 | Electronic Data Systems Corporation | Autonomous knowledge discovery system and method |
US5966126A (en) * | 1996-12-23 | 1999-10-12 | Szabo; Andrew J. | Graphic user interface for database system |
US5991751A (en) * | 1997-06-02 | 1999-11-23 | Smartpatents, Inc. | System, method, and computer program product for patent-centric and group-oriented data processing |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5257365A (en) * | 1990-03-16 | 1993-10-26 | Powers Frederick A | Database system with multi-dimensional summary search tree nodes for reducing the necessity to access records |
US5544355A (en) * | 1993-06-14 | 1996-08-06 | Hewlett-Packard Company | Method and apparatus for query optimization in a relational database system having foreign functions |
US6034697A (en) * | 1997-01-13 | 2000-03-07 | Silicon Graphics, Inc. | Interpolation between relational tables for purposes of animating a data visualization |
US5861891A (en) * | 1997-01-13 | 1999-01-19 | Silicon Graphics, Inc. | Method, system, and computer program for visually approximating scattered data |
US5960435A (en) * | 1997-03-11 | 1999-09-28 | Silicon Graphics, Inc. | Method, system, and computer program product for computing histogram aggregations |
US5930803A (en) * | 1997-04-30 | 1999-07-27 | Silicon Graphics, Inc. | Method, system, and computer program product for visualizing an evidence classifier |
2002
- 2002-03-01: WO PCT/US2002/006247, published as WO2002073531A1 (fr), status: not active (Application Discontinuation)
- 2002-03-01: WO PCT/US2002/006248, published as WO2002073530A1 (fr), status: not active (Application Discontinuation)
- 2002-03-04: WO PCT/US2002/006519, published as WO2002073532A1 (fr), status: not active (Application Discontinuation)
Cited By (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10839580B2 (en) | 2012-08-30 | 2020-11-17 | Arria Data2Text Limited | Method and apparatus for annotating a graphical output |
US10467333B2 (en) | 2012-08-30 | 2019-11-05 | Arria Data2Text Limited | Method and apparatus for updating a previously generated text |
US9336193B2 (en) | 2012-08-30 | 2016-05-10 | Arria Data2Text Limited | Method and apparatus for updating a previously generated text |
US9355093B2 (en) | 2012-08-30 | 2016-05-31 | Arria Data2Text Limited | Method and apparatus for referring expression generation |
US10565308B2 (en) | 2012-08-30 | 2020-02-18 | Arria Data2Text Limited | Method and apparatus for configurable microplanning |
US9405448B2 (en) | 2012-08-30 | 2016-08-02 | Arria Data2Text Limited | Method and apparatus for annotating a graphical output |
US9323743B2 (en) | 2012-08-30 | 2016-04-26 | Arria Data2Text Limited | Method and apparatus for situational analysis text generation |
US10504338B2 (en) | 2012-08-30 | 2019-12-10 | Arria Data2Text Limited | Method and apparatus for alert validation |
US10963628B2 (en) | 2012-08-30 | 2021-03-30 | Arria Data2Text Limited | Method and apparatus for updating a previously generated text |
US10282878B2 (en) | 2012-08-30 | 2019-05-07 | Arria Data2Text Limited | Method and apparatus for annotating a graphical output |
US9640045B2 (en) | 2012-08-30 | 2017-05-02 | Arria Data2Text Limited | Method and apparatus for alert validation |
US10026274B2 (en) | 2012-08-30 | 2018-07-17 | Arria Data2Text Limited | Method and apparatus for alert validation |
US10769380B2 (en) | 2012-08-30 | 2020-09-08 | Arria Data2Text Limited | Method and apparatus for situational analysis text generation |
US10216728B2 (en) | 2012-11-02 | 2019-02-26 | Arria Data2Text Limited | Method and apparatus for aggregating with information generalization |
US9600471B2 (en) | 2012-11-02 | 2017-03-21 | Arria Data2Text Limited | Method and apparatus for aggregating with information generalization |
US10853584B2 (en) | 2012-11-16 | 2020-12-01 | Arria Data2Text Limited | Method and apparatus for expressing time in an output text |
US10311145B2 (en) | 2012-11-16 | 2019-06-04 | Arria Data2Text Limited | Method and apparatus for expressing time in an output text |
US11176214B2 (en) | 2012-11-16 | 2021-11-16 | Arria Data2Text Limited | Method and apparatus for spatial descriptions in an output text |
US9904676B2 (en) | 2012-11-16 | 2018-02-27 | Arria Data2Text Limited | Method and apparatus for expressing time in an output text |
US11580308B2 (en) | 2012-11-16 | 2023-02-14 | Arria Data2Text Limited | Method and apparatus for expressing time in an output text |
US10860810B2 (en) | 2012-12-27 | 2020-12-08 | Arria Data2Text Limited | Method and apparatus for motion description |
US10115202B2 (en) | 2012-12-27 | 2018-10-30 | Arria Data2Text Limited | Method and apparatus for motion detection |
US9990360B2 (en) | 2012-12-27 | 2018-06-05 | Arria Data2Text Limited | Method and apparatus for motion description |
US10803599B2 (en) | 2012-12-27 | 2020-10-13 | Arria Data2Text Limited | Method and apparatus for motion detection |
US10776561B2 (en) | 2013-01-15 | 2020-09-15 | Arria Data2Text Limited | Method and apparatus for generating a linguistic representation of raw input data |
US9946711B2 (en) | 2013-08-29 | 2018-04-17 | Arria Data2Text Limited | Text generation from correlated alerts |
US10671815B2 (en) | 2013-08-29 | 2020-06-02 | Arria Data2Text Limited | Text generation from correlated alerts |
US11144709B2 (en) | 2013-09-16 | 2021-10-12 | Arria Data2Text Limited | Method and apparatus for interactive reports |
US10860812B2 (en) | 2013-09-16 | 2020-12-08 | Arria Data2Text Limited | Method, apparatus, and computer program product for user-directed reporting |
US10282422B2 (en) | 2013-09-16 | 2019-05-07 | Arria Data2Text Limited | Method, apparatus, and computer program product for user-directed reporting |
US10255252B2 (en) | 2013-09-16 | 2019-04-09 | Arria Data2Text Limited | Method and apparatus for interactive reports |
US9244894B1 (en) | 2013-09-16 | 2016-01-26 | Arria Data2Text Limited | Method and apparatus for interactive reports |
US9396181B1 (en) | 2013-09-16 | 2016-07-19 | Arria Data2Text Limited | Method, apparatus, and computer program product for user-directed reporting |
US10664558B2 (en) | 2014-04-18 | 2020-05-26 | Arria Data2Text Limited | Method and apparatus for document planning |
US10445432B1 (en) | 2016-08-31 | 2019-10-15 | Arria Data2Text Limited | Method and apparatus for lightweight multilingual natural language realizer |
US10853586B2 (en) | 2016-08-31 | 2020-12-01 | Arria Data2Text Limited | Method and apparatus for lightweight multilingual natural language realizer |
US10467347B1 (en) | 2016-10-31 | 2019-11-05 | Arria Data2Text Limited | Method and apparatus for natural language document orchestrator |
US10963650B2 (en) | 2016-10-31 | 2021-03-30 | Arria Data2Text Limited | Method and apparatus for natural language document orchestrator |
US11727222B2 (en) | 2016-10-31 | 2023-08-15 | Arria Data2Text Limited | Method and apparatus for natural language document orchestrator |
Also Published As
Publication number | Publication date |
---|---|
WO2002073532A1 (fr) | 2002-09-19 |
WO2002073530A1 (fr) | 2002-09-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20030115192A1 (en) | One-step data mining with natural language specification and results | |
WO2002073531A1 (fr) | One-step data mining with natural language specification and results | |
Korinek | Generative AI for economic research: Use cases and implications for economists | |
US11531673B2 (en) | Ambiguity resolution in digital paper-based interaction | |
Kohler et al. | Data analysis using Stata | |
Liu et al. | An adaptive user interface based on personalized learning | |
EP2124173A2 (fr) | System and method for semi-automatic creation and maintenance of query expansion rules | |
US20230376857A1 (en) | Artificial intelligence system with intuitive interactive interfaces for guided labeling of training data for machine learning models | |
AU2020380139A1 (en) | Data preparation using semantic roles | |
JP2011501258A (ja) | Information extraction apparatus and method | |
US8442992B2 (en) | Mixed mode (mechanical process and english text) query building support for improving the process of building queries correctly | |
KR20040102071A (ko) | Integrated development tool for building natural language recognition applications | |
US10977155B1 (en) | System for providing autonomous discovery of field or navigation constraints | |
US12210839B1 (en) | Multilevel data analysis | |
WO2020161505A1 (fr) | Improved method and system for text-based searching | |
JP2021523509A (ja) | Expert report editor | |
US20230244218A1 (en) | Data Extraction in Industrial Automation Systems | |
US20240111944A1 (en) | System and Method for Annotation-Based Document Management | |
Gillies et al. | Theme and topic: How qualitative research and topic modeling can be brought together | |
US20070129937A1 (en) | Apparatus and method for deterministically constructing a text question for application to a data source | |
JP2001216311A (ja) | Event analysis device, and program device storing an event analysis program | |
Heilmann et al. | Analyzing the effects of entrenched grammatical constructions on translation | |
Miedema | Towards successful interaction between humans and databases | |
Grammel | User interfaces supporting information visualization novices in visualization construction | |
Ham | The Design of an Interactive Topic Modeling Application for Media Content |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A1. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG UZ VN YU ZA ZM ZW |
AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101) | |
REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
122 | EP: PCT application non-entry in European phase | |
NENP | Non-entry into the national phase | Ref country code: JP |
WWW | WIPO information: withdrawn in national office | Country of ref document: JP |