WO1999045483A9 - Method and system for generating semantic visual templates for image and video retrieval - Google Patents
Method and system for generating semantic visual templates for image and video retrieval
- Publication number
- WO1999045483A9 (PCT/US1999/004776)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- query
- visual
- concept
- subset
- user
- Prior art date
Links
- 230000000007 visual effect Effects 0.000 title claims abstract description 92
- 238000000034 method Methods 0.000 title claims description 36
- 230000003993 interaction Effects 0.000 claims abstract description 6
- 230000002123 temporal effect Effects 0.000 claims description 11
- 230000002452 interceptive effect Effects 0.000 claims description 6
- 238000002372 labelling Methods 0.000 claims description 5
- 238000005352 clarification Methods 0.000 claims 2
- 238000007689 inspection Methods 0.000 claims 2
- 230000000875 corresponding effect Effects 0.000 description 5
- 238000009826 distribution Methods 0.000 description 4
- 239000003607 modifier Substances 0.000 description 3
- 238000013459 approach Methods 0.000 description 2
- 239000011449 brick Substances 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 230000003068 static effect Effects 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 239000013598 vector Substances 0.000 description 2
- 239000003086 colorant Substances 0.000 description 1
- 239000002131 composite material Substances 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 238000007418 data mining Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 238000013139 quantization Methods 0.000 description 1
- 238000011524 similarity measure Methods 0.000 description 1
- 230000009897 systematic effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5838—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5854—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using shape and object relationship
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5862—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/71—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
- G06F16/786—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using motion, e.g. object motion or camera motion
Definitions
- the invention relates to database still image, video and audio retrieval and, more particularly, to techniques which facilitate access to database items.
- the database can be indexed using a collection of visual templates.
- the visual templates represent semantic concepts or categories, e.g. skiing, sunset and the like.
- SVT: semantic visual templates
- Semantic visual templates can be established by an interactive process between a user and a system.
- the user can provide the system with an initial sketch or example image, as a seed to the system to automatically generate other representations of the same concept.
- the user then can pick those views for inclusion that are plausible for representing the concept.
- once a candidate template is formed, the database can be searched with it, for the user to provide relevancy feedback on the returned results.
- with SVTs, the user can interact with the system at the concept level. In forming new concepts, pre-existing SVTs can be used.
- Fig. 1 is a schematic of an interactive technique for generating a library or collection of semantic visual templates in accordance with a preferred embodiment of the invention.
- Fig. 2 is a diagram which illustrates a concept having necessary and sufficient conditions.
- Fig. 3 is a diagram which illustrates query generation.
- Fig. 4 is a schematic of an interactive system in accordance with a preferred further embodiment of the invention, including audio processing.
- Fig. 5 shows a set of icons exemplifying the concept "high jump".
- Fig. 6 shows a set of icons exemplifying the concept "sunset".
- Fig. 7 shows a semantic visual template for the concept "slalom".
- VideoQ
- each object can be characterized by salient attributes such as color, texture, size, shape and motion, for example.
- a video object database consists of all the objects extracted from the scene and their attributes.
- Visual Templates: A visual template represents an idea in the form of a sketch or an animated sketch. As a single visual template may be a poor representative of a class of interest, a library of visual templates can be assembled, containing representative templates for different semantic classes. For example, when searching for video clips of the class Sunset, one could select one or more visual templates corresponding to the class and use similarity-based querying to find video clips of sunsets.
- An important advantage of using a visual template library lies in linkage of a low-level visual feature representation to high-level semantic concepts. For example, if a user enters a query in a constrained natural language form as described in the above-referenced patent applications, visual templates can be used to transform the natural language query into automated queries specified by visual attributes and constraints. When visual content in the repository or database is not indexed textually, customary textual search methods cannot be applied directly.
- a semantic visual template is the set of visual templates associated with a particular semantic concept. This notion of an SVT has certain key properties, as follows: semantic visual templates are general in nature. For a given concept, there should be a set of visual templates that cover that concept well. Examples of successful SVTs are Sunset, High Jump, and Down-hill Skiing.
- a semantic visual template for a concept should be small but cover a large percentage of relevant images and videos in the collection, for high precision-recall performance.
- a semantic visual template can be understood further as a set of icons or example scenes/objects that represent the semantic with which the template is associated. From a semantic visual template, feature vectors can be extracted for querying. The icons are animated sketches.
- the features associated with each object and their spatial and temporal relationships are important. Histograms, texture and structural information are examples of global features that can be part of such a template. The choice between an icon-based realization versus a feature vector set formed out of global characteristics depends upon the semantic to be represented.
- each template contains multiple icons, example scenes/objects to represent a concept.
- the elements of the set can overlap in their coverage. Desirably, coverage is maximized with a minimal template set.
- Each icon for a concept e.g. down-hill ski, sunset, beach crowd, is a visual representation consisting of graphic objects resembling the actual objects in a scene.
- Each object is associated with a set of visual attributes, e.g. color, shape, texture, motion. The relevancy of each attribute and each object to the concept is also specified.
- the sun object may be optional, as there may be sunset videos in which the sun is not visible.
- for "high jump", the motion attribute of the foreground object is mandatory, the texture attribute of the background is non-mandatory, and both are more relevant than other attributes.
- Fig. 5 shows several potential icons for "high jump", and Fig. 6 for "sunset".
- the optimal set of icons should be chosen based on relevancy feedback and maximal coverage in terms of recall as described below in further detail.
- the positive coverage sets for different visual templates may overlap. Therefore, it is an objective to find a small set of visual templates with large, minimally overlapping positive coverage.
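As a sketch of this objective, assuming each candidate template's positive coverage is already known from relevancy feedback, a greedy set-cover heuristic (one plausible realization, not necessarily the patented procedure) picks templates in order of incremental coverage:

```python
# Greedy sketch: pick a small template set with large, minimally
# overlapping positive coverage. `coverage` maps a candidate icon to
# the set of relevant shot IDs it retrieves (assumed known from
# relevancy feedback); names and IDs are illustrative.

def select_templates(coverage, max_templates):
    coverage = dict(coverage)                 # do not mutate the caller's dict
    selected, covered = [], set()
    for _ in range(max_templates):
        best = max(coverage, key=lambda t: len(coverage[t] - covered),
                   default=None)
        if best is None or not coverage[best] - covered:
            break                             # no icon adds new coverage
        selected.append(best)
        covered |= coverage.pop(best)
    return selected

icons = {"sun_low": {1, 2, 3, 8}, "sun_mid": {2, 3, 4}, "no_sun": {5, 6, 7}}
print(select_templates(icons, max_templates=2))  # ['sun_low', 'no_sun']
```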
- Users may provide initial conditions for effective visual templates. For example, a user may use a yellow circle (foreground) and a light-red rectangle (background) as an initial template for retrieving sunset scenes. Also, users may indicate weights and relevancy of different objects, attributes, and necessary conditions pertaining to the context by answering an interactive questionnaire. The questionnaire is sensitive to the current query that the user has sketched out on a sketchpad, for example.
- Given the initial visual template and the relevancy of all visual attributes in the template, the search system will return a set of the most similar images/videos to the user. Given the returned results, the user can provide a subjective evaluation of them. The precision of the results and the positive coverage, i.e. recall, can then be computed.
- the system can determine an optimal strategy for altering the initial visual query and generate modified queries based on: 1.
- Such features are embodied in a technique as conceptually exemplified by Fig. 1, with specific illustration of a query for the concept "high jump".
- the query includes three objects, namely two stationary rectangular background fields and an object which moves to the lower right.
- four qualities are specified with associated weights, e.g. color, texture, shape and size, represented in Fig. 1 by vertical bars.
- a new query can be formed by stepping at least one of the qualities, at which point user interaction can be invoked to decide whether the result is plausible for inclusion as an icon in the template.
- this template can be used for a database search. The results of the search can be evaluated for recall and precision. If acceptable, the template can be stored as a semantic visual template for "high jump".
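A minimal sketch of the stepping step, with hypothetical attribute names and step sizes; in the actual system the user would accept or reject each variant as an icon:

```python
# Sketch of forming new candidate queries by stepping one attribute
# of the initial icon at a time. Attribute names, step sizes, and
# the number of steps are illustrative assumptions.

def step_queries(icon, steps, n_steps=2):
    """Yield variants of `icon` with one attribute stepped up or down."""
    for attr, delta in steps.items():
        for k in range(1, n_steps + 1):
            for sign in (+1, -1):
                variant = dict(icon)
                variant[attr] = icon[attr] + sign * k * delta
                yield variant

initial = {"hue": 0.08, "size": 0.20, "speed": 1.0}
for candidate in step_queries(initial, steps={"hue": 0.05, "speed": 0.5}):
    print(candidate)   # the user marks each variant plausible or not
```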
- the fundamental video data unit may be termed a video shot, comprising multiple segmented video objects.
- the lifetime of any particular video object may be equal to or less than the duration of the video shot.
- a similarity measure D between a member of the SVT set and a video shot can be defined as

  D = ω_f · Σ_i d_f(O_i, O'_i) + ω_s · d_s    (Equation 1)

  where
- O_i are the objects specified in the template,
- O'_i are the matched objects for the corresponding O_i,
- d_f is the feature distance between its arguments,
- d_s is the dissimilarity between the spatial-temporal structure in the template and that among the matched objects in the video shot, and
- ω_f and ω_s are the normalized weights for the feature distance and the structure dissimilarity.
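Equation 1 reduces to a weighted sum; a toy computation follows, with illustrative feature vectors and weights (the patent delegates d_f and d_s to the underlying matcher, e.g. VideoQ):

```python
# Toy computation of Equation 1, D = w_f * sum_i d_f(O_i, O_i') + w_s * d_s.
# The feature vectors, weights, and the structure term d_s are
# illustrative; the patent leaves d_f and d_s to the matcher.

import math

def shot_distance(template_objs, matched_objs, d_s, w_f=0.7, w_s=0.3):
    assert abs(w_f + w_s - 1.0) < 1e-9           # normalized weights
    feature_term = sum(math.dist(o, m)           # d_f: Euclidean distance
                       for o, m in zip(template_objs, matched_objs))
    return w_f * feature_term + w_s * d_s

# Two template objects (say, sun and sky) vs. their matches in a shot.
print(shot_distance([[0.9, 0.4], [0.2, 0.8]],
                    [[0.8, 0.5], [0.3, 0.7]], d_s=0.1))
```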
- the query procedure is to generate a candidate list for each object in the query.
- the distance D is the minimum over all possible sets of matched objects that satisfy the spatial-temporal restrictions. For example, if the semantic template has three objects and two candidate objects are kept for each single object query, there will be at most eight potential candidate sets of objects considered in computing the minimal distance in Equation 1. Given N objects in the query and K candidates kept per object, this appears to require searching over all K^N candidate sets.
- Each video object O_i is used to query the entire object database, resulting in a list of matched objects which can be kept short by using a threshold. Only objects included in this list are then considered as candidate objects matching O_i.
- the candidate objects on the list are then joined, resulting in the final set of matched objects on which the spatial-temporal structure relationships will be verified.
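A sketch of this candidate-list-and-join procedure under toy assumptions (scalar "objects", an absolute-difference distance, and a hypothetical structure check):

```python
# Sketch of the candidate-list query procedure: query the object
# database per template object, keep a short thresholded list, join
# the lists, and take the minimum total distance over joined sets
# that pass the spatial-temporal check.

from itertools import product

def candidates(template_obj, object_db, dist, threshold=0.5, keep=2):
    scored = sorted(((dist(template_obj, o), o) for o in object_db),
                    key=lambda pair: pair[0])
    return [(d, o) for d, o in scored if d <= threshold][:keep]

def best_match(template_objs, object_db, dist, structure_ok):
    lists = [candidates(t, object_db, dist) for t in template_objs]
    best = float("inf")
    # With N template objects and K kept candidates each: at most K**N sets.
    for combo in product(*lists):
        objs = [o for _, o in combo]
        if structure_ok(objs):               # verify spatial-temporal structure
            best = min(best, sum(d for d, _ in combo))
    return best

db = [0.1, 0.4, 0.9, 1.5]
print(best_match([0.2, 1.0], db, dist=lambda a, b: abs(a - b),
                 structure_ok=lambda objs: objs[0] < objs[1]))  # ~0.2
```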
- Template generation: Two-way interaction between a user and the system is used for generating the templates. Given the initial scenario and using relevancy feedback, the technique converges on a small set of icons that gives maximum recall.
- a user furnishes an initial query as a sketch of the concept for which a template is to be generated, consisting of objects with spatial and temporal constraints. The user can also specify whether the object is mandatory. Each object has features to which the user assigns relevancy weights.
- the initial query can be regarded as a point in a high-dimensional feature space into which all videos in the database can be mapped.
- a step size can be determined with the help of the weight that the user has specified along with the initial query, which weight can be regarded as a measure of the degree of relevancy attributed by the user to the feature of the object. Accordingly, a low weight results in coarse quantization and vice versa, e.g. δ = δ(ω), where
- δ is the jump distance corresponding to a feature, and
- ω is the weight associated with the feature.
- the feature space is quantized into hyper-rectangles. For example, for color the cuboids can be generated using the metric for the LUV space along with δ(ω).
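A small sketch of the weight-to-step mapping; the inverse-linear form below is an assumption, since the text fixes only the direction of the relation (low weight, coarse quantization):

```python
# Sketch of mapping a relevancy weight to a jump distance delta(w):
# a low weight gives a large step (coarse quantization) and a high
# weight a small one.

def jump_distance(weight, feature_range=1.0, floor=1e-3):
    """delta(w) for a weight w in (0, 1]; never returns zero."""
    assert 0.0 < weight <= 1.0
    return max(feature_range * (1.0 - weight), floor)

def quantize(value, weight):
    """Snap a feature value onto the grid induced by delta(w)."""
    delta = jump_distance(weight)
    return round(value / delta) * delta

print(jump_distance(0.2), jump_distance(0.9))   # coarse vs. fine step
print(quantize(0.43, weight=0.2))               # snaps to a coarse cell
```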
- an additional join can be included in step 2 with respect to the candidate lists for each object.
- a “sunset” may be described at the object level as well as the global level.
- a global-level description may take the form of a color or texture histogram.
- An object-level description may be a collection of two objects such as the sky and the sun. These objects may be further quantified using feature level descriptors.
- a concept, e.g. Sunset, can be characterized by necessary (N) and sufficient (S) conditions, as illustrated in Fig. 2.
- Additional templates may be generated manually, i.e. as the user inputs additional queries. The task is undertaken for each concept. Necessary conditions can be imposed on a concept, thereby automatically generating additional templates, given an initial query template.
- the user interacts with the system through a "concept questionnaire", to specify necessary conditions for the semantic searched for. These conditions may also be global, e.g. the global color distribution, the relative spatial and temporal interrelationships etc.
- the system moves in the feature space to generate additional templates, with the user's original one as a starting point.
- This generation is also modified by the relevancy feedback given to the system by the user.
- new rules can be determined pertaining to the necessary conditions. These can be used further to modify the template generation procedure.
- the rules are generated by looking at the correlation between the conditions deemed necessary for a concept with the videos that have been marked as relevant by the user.
- a query, for a "crowd of people", in VideoQ is in the form of a sketch.
- the user has specified a visual query with an object, giving weights for color and size, but is unable to specify a more detailed description in the form of either texture (of the crowd) or the relative spatial and temporal movements that characterize the concept of a crowd of people.
- the system identifies the video clips relevant to the concept that the user is interested in. Now, since the system knows that texture and the spatial and temporal arrangements are necessary to the concept, it seeks to determine consistent patterns amongst the features deemed necessary, amongst the relevant videos. These patterns are then returned to the user, who is asked if they are consistent with the concept that he is searching for. If the user accepts these patterns as consistent with the concept, then they will be used to generate new query templates, as illustrated by Fig. 3. Including this new rule has two-fold impact on query template generation, namely it improves the speed of the search and increases the precision of the returned results.
- the query defines a feature space where the search is executed.
- the feature space is defined by the attributes and relevancy weights of the visual template.
- the attributes define the axes of the feature space, and the relevancy weights stretch/compress the associated axes.
- each video shot can be represented as a point in this space.
- the visual template covers a portion of this space. Since the visual template can differ in feature and in character (global against object level), the spaces that are defined by the templates differ and are non-overlapping.
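A sketch of such a weighted feature space, where relevancy weights stretch or compress the axes; attribute names and values are illustrative:

```python
# Sketch of the weighted feature space: attributes define the axes,
# and relevancy weights stretch or compress them, so the same two
# shots are nearer or farther depending on what the user cares about.

import math

def weighted_distance(a, b, weights):
    return math.sqrt(sum(w * (x - y) ** 2
                         for w, x, y in zip(weights, a, b)))

shot_a = [0.9, 0.1, 0.5]        # e.g. (color, texture, size)
shot_b = [0.7, 0.8, 0.5]
print(weighted_distance(shot_a, shot_b, [1.0, 0.1, 1.0]))  # texture compressed
print(weighted_distance(shot_a, shot_b, [1.0, 2.0, 1.0]))  # texture stretched
```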
- A selection of only a few features may be insufficient to determine a concept, but the concept may be adequately represented by a suitable selection of features with differing weights, for example. Thus, a concept can be mapped into a feature space.
- a concept is not limited to a single feature space nor to a single cluster.
- sunsets cannot be totally characterized by a single color or a single shape.
- it is important to determine not only the global static features and weights relating to a concept, but also those features and weights that can vary.
- the search for concepts starts by specifying certain global constants. Through a context questionnaire, the number of objects in the search and the global features that are necessary to each object are determined. These represent constraints in the search process that do not vary.
- a user gives an initial query specifying features and setting weights.
- a set intersection is taken with the set of necessary conditions defined by the user. The necessary conditions are left unchanged. Changes are made to the template based on changes to those features deemed sufficient. If the sets do not intersect, rules are derived that characterize the concept based on the necessary conditions and relevancy feedback.
- the threshold determines the number of non-overlapping coverings possible. The number of coverings determines the size and number of jumps possible along that particular feature.
- the algorithm performs a breadth first search and is guided by three criteria:
- the greedy algorithm moves in the direction of increasing recall: compute all possible initial jumps; convert each jump into the corresponding visual template; execute the query and collate all the results; show the results to the user for relevancy feedback, and choose those results that maximize incremental recall as possible points of subsequent query.
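A skeleton of this greedy, recall-driven loop; all_jumps, run_query, and user_marks_relevant are hypothetical placeholders for the system's jump generator, query engine, and relevancy feedback:

```python
# Skeleton of the greedy, recall-driven template generation loop.

def grow_templates(initial, all_jumps, run_query, user_marks_relevant,
                   total_relevant, max_rounds=5):
    templates = [initial]
    covered = set()                               # relevant shots found so far
    frontier = initial
    for _ in range(max_rounds):
        best_gain, best_jump, best_hits = 0, None, set()
        for jump in all_jumps(frontier):          # compute all possible jumps
            results = run_query(jump)             # execute the query
            relevant = user_marks_relevant(results)  # relevancy feedback
            gain = len(relevant - covered)        # incremental recall
            if gain > best_gain:
                best_gain, best_jump, best_hits = gain, jump, relevant
        if best_jump is None:                     # no jump adds recall
            break
        templates.append(best_jump)
        covered |= best_hits
        frontier = best_jump                      # next point of query
    return templates, len(covered) / max(total_relevant, 1)
```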
- Keywords accompanying the data can either be generated manually or are obtained by association, i.e. keywords are extracted from the accompanying text (in the case of an image) or the captions that accompany videos.
- VideoQ provides a "language" for inputting the query, in terms of a sketch. There is a simple correspondence between what exists in VideoQ and its natural language counterpart, as illustrated by Table 1.
- a constrained language set can be used, with a set of allowable words.
- a sentence is parsed into classes such as nouns, verbs, adjectives, and adverbs to generate a motion model of the video sequence.
- nouns correspond to the objects
- A noun (i.e. scenario/object) database may initially include a hundred scenes or so, and be extensible through user interaction. Each object may have a shape description that is modified by various modifiers such as adjectives (color, texture), verbs (walked), and adverbs (slowly). This can then be inserted into the VideoQ palette, where it may be subject to further refinement.
- When the parser encounters a word that is absent from its modifier database (i.e. the databases corresponding respectively to verbs, adverbs, prepositions, and adjectives), it looks up a thesaurus to determine whether synonyms of that word are present in its database, and uses them instead. If that fails, it returns a message to indicate an invalid string.
- When the parser encounters a word that it cannot classify, the user must either modify the text or, if the word is a noun (like "Bill"), indicate to the system the class (in this case a noun) and additionally indicate that the word refers to a human being. If the user indicates a noun that is absent from the system databases, the user is prompted to draw that object in the sketch pad so that the system can learn about the object. In the database, attributes such as motion, color, texture and shape can be generated at the object level, so that one level of matching can occur at that level.
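A sketch of the parser's fallback chain, with toy stand-ins for the modifier databases and the thesaurus:

```python
# Fallback chain for an unknown modifier: try the word, then its
# thesaurus synonyms, else report an invalid string. MODIFIERS and
# THESAURUS are illustrative stand-ins for the verb/adverb/
# preposition/adjective databases.

MODIFIERS = {"red": ("adjective", "color"), "slowly": ("adverb", "speed")}
THESAURUS = {"crimson": ["red", "scarlet"], "leisurely": ["slowly"]}

def resolve_modifier(word):
    if word in MODIFIERS:
        return word, MODIFIERS[word]
    for synonym in THESAURUS.get(word, []):
        if synonym in MODIFIERS:
            return synonym, MODIFIERS[synonym]   # use the synonym instead
    raise ValueError(f"invalid string: no modifier or synonym for {word!r}")

print(resolve_modifier("crimson"))   # falls back to 'red'
```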
- the audio stream that accompanies a video can be used, as illustrated by Fig. 4. Indeed, if the audio is closely correlated to the video, it may be the single most important source of the semantic content of the video.
- a set of keywords can be generated, 10-20 per video sequence, for example. Then the search at the keyword level can be joined to the search at the model level. Those videos which match at the keyword (semantic) level as well as at the motion-model level can then be ranked highest.
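A sketch of this join, assuming per-shot scores from each level; the particular score combination is an illustrative choice, not the patent's:

```python
# Join keyword-level and model-level searches: shots matching at
# both levels rank above single-level matches.

def fuse(keyword_hits, model_hits):
    ranked = []
    for shot in keyword_hits.keys() | model_hits.keys():
        k = keyword_hits.get(shot, 0.0)
        m = model_hits.get(shot, 0.0)
        both = k > 0 and m > 0           # matched at both levels
        ranked.append((both, k + m, shot))
    return [shot for *_, shot in sorted(ranked, reverse=True)]

print(fuse({"shot3": 0.9, "shot7": 0.4}, {"shot3": 0.6, "shot1": 0.8}))
# shot3 first: it matches at both the keyword and the motion-model level
```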
- Semantic visual templates for retrieving video shots of slalom skiers.
- the system asks and the user answers questions regarding context.
- the semantic visual template is labeled "slalom".
- the query is specified as object-based, including two objects.
- the large blank background represents the ski slope and the smaller foreground object the skier with its characteristic zigzag motion trail.
- the system generates a set of test icons from which the user selects plausible feature variations in the skier's color and motion trajectory.
- the four selected colors and the three selected motion trails are joined to form 12 possible skiers.
- the list of skiers is joined with the single background, resulting in the 12 icons of Fig. 7, where groups of three adjacent icons are understood as having the same color.
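A sketch of the joins behind Fig. 7, with assumed color and trail labels:

```python
# Four plausible skier colors x three motion trails give 12 skier
# variants, each joined with the single ski-slope background into a
# candidate icon. Labels are illustrative.

from itertools import product

colors = ["red", "blue", "black", "yellow"]          # user-selected variations
trails = ["zigzag_tight", "zigzag_wide", "zigzag_long"]
background = {"object": "slope", "color": "white"}

icons = [{"background": background,
          "skier": {"color": c, "motion": t}}
         for c, t in product(colors, trails)]
print(len(icons))  # 12 candidate icons for the "slalom" template
```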
- the user chooses a candidate set to query the system.
- the system retrieves the 20 closest video shots.
- the user provides relevancy feedback to guide the system to a small set of exemplars for slalom skiers.
- the database contains nine high jumpers in 2589 video shots.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Library & Information Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2000534957A JP2002506255A (en) | 1998-03-04 | 1999-03-04 | Method and system for generating semantic visual templates for image and video verification |
CA002322448A CA2322448A1 (en) | 1998-03-04 | 1999-03-04 | Method and system for generating semantic visual templates for image and video retrieval |
EP99911110A EP1066572A1 (en) | 1998-03-04 | 1999-03-04 | Method and system for generating semantic visual templates for image and video retrieval |
KR1020007009804A KR20010041607A (en) | 1998-03-04 | 1999-03-04 | Method and system for generating semantic visual templates for image and video retrieval |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US7678198P | 1998-03-04 | 1998-03-04 | |
US60/076,781 | 1998-03-04 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO1999045483A1 WO1999045483A1 (en) | 1999-09-10 |
WO1999045483A9 true WO1999045483A9 (en) | 2000-10-12 |
Family
ID=22134152
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1999/004776 WO1999045483A1 (en) | 1998-03-04 | 1999-03-04 | Method and system for generating semantic visual templates for image and video retrieval |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP1066572A1 (en) |
JP (1) | JP2002506255A (en) |
KR (1) | KR20010041607A (en) |
CA (1) | CA2322448A1 (en) |
WO (1) | WO1999045483A1 (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU6503800A (en) | 1999-07-30 | 2001-02-19 | Pixlogic Llc | Perceptual similarity image retrieval |
US6563959B1 (en) | 1999-07-30 | 2003-05-13 | Pixlogic Llc | Perceptual similarity image retrieval method |
WO2002023891A2 (en) * | 2000-09-13 | 2002-03-21 | Koninklijke Philips Electronics N.V. | Method for highlighting important information in a video program using visual cues |
US8085995B2 (en) | 2006-12-01 | 2011-12-27 | Google Inc. | Identifying images using face recognition |
US8190604B2 (en) | 2008-04-03 | 2012-05-29 | Microsoft Corporation | User intention modeling for interactive image retrieval |
GB2466245A (en) * | 2008-12-15 | 2010-06-23 | Univ Sheffield | Crime Scene Mark Identification System |
US9317533B2 (en) | 2010-11-02 | 2016-04-19 | Microsoft Technology Licensing, Inc. | Adaptive image retrieval database |
US8463045B2 (en) | 2010-11-10 | 2013-06-11 | Microsoft Corporation | Hierarchical sparse representation for image retrieval |
US9147125B2 (en) | 2013-05-03 | 2015-09-29 | Microsoft Technology Licensing, Llc | Hand-drawn sketch recognition |
KR101912794B1 (en) | 2013-11-27 | 2018-10-29 | 한화테크윈 주식회사 | Video Search System and Video Search method |
CN106126581B (en) * | 2016-06-20 | 2019-07-05 | 复旦大学 | Cartographical sketching image search method based on deep learning |
CN116992294B (en) * | 2023-09-26 | 2023-12-19 | 成都国恒空间技术工程股份有限公司 | Satellite measurement and control training evaluation method, device, equipment and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3177746B2 (en) * | 1991-03-20 | 2001-06-18 | 株式会社日立製作所 | Data processing system and method |
JP2903904B2 (en) * | 1992-10-09 | 1999-06-14 | 松下電器産業株式会社 | Image retrieval device |
US5493677A (en) * | 1994-06-08 | 1996-02-20 | Systems Research & Applications Corporation | Generation, archiving, and retrieval of digital images with evoked suggestion-set captions and natural language interface |
1999
- 1999-03-04 CA CA002322448A patent/CA2322448A1/en not_active Abandoned
- 1999-03-04 EP EP99911110A patent/EP1066572A1/en not_active Withdrawn
- 1999-03-04 KR KR1020007009804A patent/KR20010041607A/en not_active Withdrawn
- 1999-03-04 JP JP2000534957A patent/JP2002506255A/en active Pending
- 1999-03-04 WO PCT/US1999/004776 patent/WO1999045483A1/en not_active Application Discontinuation
Also Published As
Publication number | Publication date |
---|---|
WO1999045483A1 (en) | 1999-09-10 |
CA2322448A1 (en) | 1999-09-10 |
KR20010041607A (en) | 2001-05-25 |
EP1066572A1 (en) | 2001-01-10 |
JP2002506255A (en) | 2002-02-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Srihari et al. | Intelligent indexing and semantic retrieval of multimodal documents | |
Cheng et al. | Semantic visual templates: linking visual features to semantics | |
Chen et al. | A novel video summarization based on mining the story-structure and semantic relations among concept entities | |
Meghini et al. | A model of multimedia information retrieval | |
US7502780B2 (en) | Information storage and retrieval | |
Feng et al. | Automatic caption generation for news images | |
EP1565846B1 (en) | Information storage and retrieval | |
US7610306B2 (en) | Multi-modal fusion in content-based retrieval | |
Hsu et al. | Reranking methods for visual search | |
US8140550B2 (en) | System and method for bounded analysis of multimedia using multiple correlations | |
WO1999045483A9 (en) | Method and system for generating semantic visual templates for image and video retrieval | |
US20040139105A1 (en) | Information storage and retrieval | |
Ciocca et al. | Quicklook2: An integrated multimedia system | |
Chang et al. | Multimedia search and retrieval | |
Pradhan et al. | A query model to synthesize answer intervals from indexed video units | |
Paz-Trillo et al. | An information retrieval application using ontologies | |
Doulaverakis et al. | Ontology-based access to multimedia cultural heritage collections-The REACH project | |
Budikova et al. | Search-based image annotation: Extracting semantics from similar images | |
Charhad et al. | Semantic video content indexing and retrieval using conceptual graphs | |
Kutics et al. | Use of adaptive still image descriptors for annotation of video frames | |
Henrich et al. | Combining Multimedia Retrieval and Text Retrieval to Search Structured Documents in Digital Libraries. | |
Tanaka et al. | Organization and retrieval of video data | |
Chen et al. | Generating semantic visual templates for video databases | |
Velthausz | Cost-effective network-based multimedia information retrieval | |
Faudemay et al. | Intelligent delivery of personalised video programmes from a video database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): CA JP KR US |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
ENP | Entry into the national phase |
Ref document number: 2322448 Country of ref document: CA Ref country code: CA Ref document number: 2322448 Kind code of ref document: A Format of ref document f/p: F |
|
ENP | Entry into the national phase |
Ref country code: JP Ref document number: 2000 534957 Kind code of ref document: A Format of ref document f/p: F |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1020007009804 Country of ref document: KR |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1999911110 Country of ref document: EP |
|
AK | Designated states |
Kind code of ref document: C2 Designated state(s): CA JP KR US |
|
AL | Designated countries for regional patents |
Kind code of ref document: C2 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
|
COP | Corrected version of pamphlet |
Free format text: PAGES 1/3-3/3, DRAWINGS, REPLACED BY NEW PAGES 1/4-4/4; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE |
|
WWP | Wipo information: published in national office |
Ref document number: 1999911110 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 09623277 Country of ref document: US |
|
WWP | Wipo information: published in national office |
Ref document number: 1020007009804 Country of ref document: KR |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1999911110 Country of ref document: EP |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 1020007009804 Country of ref document: KR |