WO2002013067A2 - System for online rule-based video classification - Google Patents
System for online rule-based video classification
- Publication number
- WO2002013067A2 WO2002013067A2 PCT/US2001/024719 US0124719W WO0213067A2 WO 2002013067 A2 WO2002013067 A2 WO 2002013067A2 US 0124719 W US0124719 W US 0124719W WO 0213067 A2 WO0213067 A2 WO 0213067A2
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- video footage
- classification
- features
- color
- Prior art date
Links
- 238000012549 training Methods 0.000 claims abstract description 21
- 230000001939 inductive effect Effects 0.000 claims abstract description 6
- 238000000034 method Methods 0.000 claims description 29
- 238000003066 decision tree Methods 0.000 claims description 21
- 238000000605 extraction Methods 0.000 claims description 6
- 238000013459 approach Methods 0.000 claims description 3
- 230000000007 visual effect Effects 0.000 claims description 3
- 238000012805 post-processing Methods 0.000 claims 1
- 238000013138 pruning Methods 0.000 description 10
- 239000013598 vector Substances 0.000 description 9
- 230000008569 process Effects 0.000 description 6
- 238000001914 filtration Methods 0.000 description 5
- 238000012360 testing method Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 3
- 238000004364 calculation method Methods 0.000 description 2
- 230000002452 interceptive effect Effects 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000001154 acute effect Effects 0.000 description 1
- 230000003466 anticipated Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000005352 clarification Methods 0.000 description 1
- 238000013145 classification model Methods 0.000 description 1
- 238000007418 data mining Methods 0.000 description 1
- 230000007123 defense Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 238000007781 pre-processing Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
- G06N5/025—Extracting rules from data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
- G06F16/785—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using colour or luminescence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7847—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
- G06F16/786—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using motion, e.g. object motion or camera motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
Definitions
- the present invention relates to a video classification method and apparatus and more particularly, to a means for categorizing video streams into a fixed set of classes.
- This invention discloses a video classification system that relies on supervised learning (rule- based decision-tree classifier) to categorize video streams into a fixed set of classes.
- Video classification is important in many different applications, including defense applications where automatic scene understanding is important and useful in areas ranging from Unmanned Aerial Vehicle (UAV) video surveillance monitoring to special missions.
- UAV Unmanned Aerial Vehicle
- This invention can also be used for innovative new services offered by satellite-based media systems, such as broadband-based access and novel interactive TV applications.
- the invention includes an apparatus, such as a computer, which is configured to process video programming or footage.
- The processing sequence includes the assignment of semantic designations to a predetermined number of events. For instance, the processor might begin by identifying a number of general characteristics unique to a basketball game. Once having made this threshold determination as to the general nature of a sequence of footage, the processor further processes the footage by breaking it down into a plurality of footage segments. These segments might correspond to "cuts" or breaks in the footage, as might occur when video editors switch from one camera to another, or where the camera pans. Utilizing low-level features extracted from the footage segments, and applying the features to one or more if-then-else rules, each segment can be classified into at least one semantic category.
- The processor might extract a segment and, based on color, motion, and other factors, determine that the segment shows team A scoring. The processor would then assign at least one semantic classification to that segment of footage. In many situations, more than one classification might be appropriate. For instance, three possible classifications appropriate to a single segment of footage are: team A scores, the current score, and a three-point score tally.
- the invention may be utilized to classify any type of video; therefore the invention could be used to extract information related to news programming, weather forecasting, financial news, or other programming having characteristics that are distinguishable.
- The invention also allows a user, or operator, to specify the general characteristics of a footage sequence.
- the apparatus allows an external source to input the general nature of footage.
- The low-level features, or primitives, that are extracted and used to semantically classify data include color information such as a color histogram, dominant color, and regional color. Additionally, a gradient edge operator may be utilized to detect edges along different directions such as the horizontal, vertical, diagonal, and cross-diagonal axes. Edge densities along different directions and having different lengths may be calculated as edge features.
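The color primitives described above can be sketched in a few lines. The following is an illustrative sketch only, not the patent's implementation: the pixel format, bin count, and function names are assumptions.

```python
# Sketch of two of the colour primitives named above: a coarse colour
# histogram and the dominant colour of a frame.  Frame format ((r, g, b)
# tuples in 0-255), bin count, and function names are assumptions.

def color_histogram(pixels, bins=4):
    """Quantize (r, g, b) pixels into bins**3 buckets, normalised."""
    step = 256 // bins
    hist = [0] * (bins ** 3)
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    total = float(len(pixels))
    return [h / total for h in hist]

def dominant_color_bin(pixels, bins=4):
    """Index of the most populated histogram bucket."""
    hist = color_histogram(pixels, bins)
    return max(range(len(hist)), key=hist.__getitem__)
```

Regional color would follow the same pattern, applied per image region rather than to the whole frame.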
- FIG. 1 shows a forward prediction codec for MPEG-1 video
- FIGs. 2a and 2b show one technique for motion direction extraction
- FIG. 3 depicts various key frames of basketball scenes
- FIG. 4 shows a flow chart for key-frame classification and clustering
- FIG. 5 shows a knowledge based video classification system
- FIG. 6 shows a rule tree for video classification
- FIG. 7 shows a decision tree and rule tree for basketball video classification
- Table 1 represents the results of basketball video classification from rules-based system
- FIG. 8 depicts a flow chart for online video filtering based on a user's profile.
- the present invention provides a method and an apparatus useful for rule-based video classification, thereby enhancing data categorization.
- the present invention may be tailored to a variety of other applications.
- the following description, taken in conjunction with the referenced drawings, is presented to enable one of ordinary skill in the art to make and use the invention and to incorporate it in the context of particular applications.
- Various modifications, as well as a variety of uses in different applications, will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to a wide range of embodiments.
- the present invention is not intended to be limited to the embodiments presented, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
- This specification discloses a video classification apparatus that relies on supervised learning (rule-based decision-tree classifier) to categorize video streams into a fixed set of classes.
- the system of the present invention utilizes the decision-tree method described herein to select a set of if-then-else rules for video classification which can be directly applied to a set of matching functions for low level features such as, but not limited to, color, motion, and texture that are identifiable in video data.
- an unsupervised learning method can be utilized to cluster the low level features of video data in order to simplify the representation of the feature for further analysis.
- Motion features, for example, provide effective cues for video analysis, as they are an integral part of a motion sequence. In addition, motion is typically already calculated in most video codecs, and motion compensation information is present in the compressed data stream.
- In MPEG-1 video, for example, as shown in FIG. 1, there are three types of frames: the I frame 100, the B frame 102, and the P frame 104.
- The direction and magnitude of video sequence motion flows 106 of only the P frames 104 were evaluated. P frames 104 provide only forward prediction, which permits calculation of the direction of the "flow" of video data.
- As shown in FIGs. 2a and 2b, each motion vector in the video data, falling into one of the regions 1 through 12 200 of FIG. 2a, can be clustered and classified.
- vectors in regions 1 and 8 202 are termed as RIGHT; vectors in regions 2 and 3 204 as UP; vectors in regions 4 and 5 206 as LEFT; and vectors in regions 6 and 7 208 as DOWN.
- the amount of motion along each direction can be evaluated by counting the total numbers of vectors along that direction in each class for the whole video sequence. This results in a four-dimensional motion direction vector.
- the motion magnitude of the whole frame can be calculated according to the following equation:
- n and m are the total number of motion vectors with respect to the X direction and Y direction, respectively, in a frame of video data.
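The direction-quantization and magnitude steps above can be sketched as follows. This is a hedged illustration: the mapping of the eight angular regions to RIGHT/UP/LEFT/DOWN follows the region assignments stated earlier, but since the magnitude equation itself is not reproduced in this record, the mean vector length used here is an assumption.

```python
import math

# Sketch of the motion features described above: each P-frame motion
# vector is quantised into RIGHT/UP/LEFT/DOWN (regions 1-8 of FIG. 2a
# correspond to 45-degree sectors), the counts form the four-dimensional
# motion direction vector, and per-frame magnitude is taken here as the
# mean vector length -- an assumption, as the equation is not shown.

def direction(vx, vy):
    """Classify one motion vector by its angle (y axis pointing up)."""
    angle = math.degrees(math.atan2(vy, vx)) % 360
    if angle < 45 or angle >= 315:
        return "RIGHT"
    if angle < 135:
        return "UP"
    if angle < 225:
        return "LEFT"
    return "DOWN"

def motion_features(vectors):
    """Return (direction counts, mean magnitude) for one frame."""
    counts = {"RIGHT": 0, "UP": 0, "LEFT": 0, "DOWN": 0}
    for vx, vy in vectors:
        counts[direction(vx, vy)] += 1
    magnitude = sum(math.hypot(vx, vy) for vx, vy in vectors) / len(vectors)
    return counts, magnitude
```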
- color and edge information may also be utilized in video classification according to the present invention.
- video data can be divided into key image categories. An example of such categorization is provided in FIG. 3.
- video from a basketball game is divided into key frames, specifically left court 300, right court 302, middle court 304, and close-up images 306.
- Color information including, but not necessarily limited to, color histogram, dominant color and regional color may be extracted from the video data to develop the categorization.
- a gradient edge operator may be utilized to detect edges along different directions including, but not necessarily limited to, horizontal, vertical, diagonal, or cross-diagonal. Edge densities along different directions and with different lengths may then be calculated as edge features.
- the edge information can be analyzed along one or more directions by distribution of visible edges and by clustering into horizontal, vertical and/or other category edge type.
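The edge-density idea above can be illustrated with a minimal gradient operator. The specific operator, threshold, and restriction to horizontal/vertical directions here are assumptions for illustration; the patent also contemplates diagonal and cross-diagonal directions.

```python
# Illustrative sketch of the edge features described above: a simple
# first-difference gradient detects edges in a grayscale frame, and the
# edge density along each direction is the fraction of positions whose
# gradient exceeds a threshold.  Operator and threshold are assumptions.

def edge_densities(img, thresh=50):
    """img: 2-D list of grayscale values.  Returns (horiz, vert) densities.

    A horizontal edge shows up as a strong vertical gradient, and a
    vertical edge as a strong horizontal gradient."""
    h, w = len(img), len(img[0])
    horiz = vert = 0
    for y in range(h - 1):
        for x in range(w - 1):
            if abs(img[y + 1][x] - img[y][x]) > thresh:   # vertical gradient
                horiz += 1
            if abs(img[y][x + 1] - img[y][x]) > thresh:   # horizontal gradient
                vert += 1
    n = (h - 1) * (w - 1)
    return horiz / n, vert / n
```

For a basketball court frame, for instance, the density of long horizontal edges helps distinguish wide court views from close-ups.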
- A flow chart for key-frame clustering, as might be applied to basketball game video data, is shown in FIG. 4. According to this flow chart, data in the form of a frame or image 400 is input into an apparatus. The data is then exposed to elements configured to extract edge features 402 and color features 404. Thereafter the data is sent to a clustering or classification step 406; the clusters are then assigned to their respective categories 408. It is anticipated that either additional or fewer categories may be appropriate depending on the nature of the data.
- the present invention utilizes decision-tree learning in order to develop an effective video classification system.
- Decision-tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree. Learned decision trees are constructed by first evaluating which attribute should be tested at the root of the tree. Next, the most useful attribute for classifying examples must be identified, along with a quantitative measurement of its value to the classification. The ultimate purpose is to develop a quantitative measurement of how well a given attribute separates the training examples according to their target classification.
- these attributes can be in the form of the low level video data features discussed in the preceding paragraphs.
- A broad variety of features are used as indexing keys to query video/image databases, but no single feature is the best selection for all queries under different semantic environments.
- the effectiveness of a particular feature for identifying proper classification must be properly assessed.
- each potential attribute is statistically evaluated to determine how well it alone classifies the training data.
- the best attribute for achieving classification is then selected and used as a test at the root node of the decision tree. Branches from the root node are then created to the possible values of the chosen attribute, or descendant nodes, and training data are sorted to the appropriate descendant nodes. The entire process is then repeated such that the possible attributes are statistically evaluated for relevance at each of the descendant nodes.
- Evaluation of attributes is based on a statistical value known as information gain.
- the information gain for a given attribute is the expected reduction in entropy resulting from classification of a data set according to the attribute.
- The entropy of a given collection of data S, wherein S can be classified into two target concepts, is defined according to Equation (2):
- Entropy(S) = -p1 log2(p1) - p2 log2(p2), (2) wherein p1 is the proportion of S belonging to a first category and p2 is the proportion of S belonging to a second category of target concepts.
- Entropy is equal to 0 (minimum) when all the examples in a set of data belong to the same class, and entropy is equal to 1 (maximum) when each class is equally distributed in the given set of data.
- In the information gain expression of Equation (4), Gain(S, A) = Entropy(S) - Σ over v in Values(A) of (|Sv|/|S|) Entropy(Sv), the first term is just the entropy of the original collection S, and the second term is the expected value of the entropy after S is partitioned using attribute A.
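Entropy and information gain, as defined above, translate directly into code. A minimal sketch, assuming examples are stored as (attribute dictionary, class label) pairs; that data layout is an assumption, not the patent's:

```python
import math

# Entropy of Equation (2), generalised to any number of classes, and the
# information gain of Equation (4): the entropy of S minus the weighted
# entropy of the partitions S_v induced by attribute A.

def entropy(labels):
    out, n = 0.0, len(labels)
    for c in set(labels):
        p = labels.count(c) / n
        out -= p * math.log2(p)            # p * log2(p) -> 0 as p -> 0
    return out

def gain(examples, attr):
    """examples: list of (attribute_dict, label) pairs."""
    labels = [y for _, y in examples]
    total = entropy(labels)
    for v in set(x[attr] for x, _ in examples):
        subset = [y for x, y in examples if x[attr] == v]
        total -= len(subset) / len(examples) * entropy(subset)
    return total
```

An attribute that splits the data into pure subsets has gain equal to the original entropy; an attribute that tells us nothing has gain 0.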
- the information gain value is utilized for each attribute to identify the best attribute to serve as a test at the root node of the tree.
- Descendant nodes are created based on the possible values for the selected attribute, and the process then repeats itself at each of these descendant nodes. Once an attribute is identified for a node, it is no longer evaluated for subsequent nodes in the tree. The process of evaluating attributes and generating additional descendants continues until either of two conditions is met: (1) every attribute has already been included along this path through the tree, or (2) the training examples associated with a given "leaf" node all have the same target attribute value (i.e., their entropy is zero).
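The recursive procedure just described can be sketched as a minimal ID3-style learner. The data layout (attribute dictionaries paired with labels) and helper names are assumptions for illustration, not the patent's code:

```python
import math

# Minimal ID3-style sketch of the process above: pick the best attribute
# by information gain, branch on its values, and recurse until entropy
# is zero or the attributes are exhausted (the two stop conditions).

def entropy(labels):
    out, n = 0.0, len(labels)
    for c in set(labels):
        p = labels.count(c) / n
        out -= p * math.log2(p)
    return out

def gain(examples, attr):
    labels = [y for _, y in examples]
    g = entropy(labels)
    for v in set(x[attr] for x, _ in examples):
        sub = [y for x, y in examples if x[attr] == v]
        g -= len(sub) / len(examples) * entropy(sub)
    return g

def id3(examples, attrs):
    labels = [y for _, y in examples]
    if entropy(labels) == 0.0 or not attrs:        # stop conditions (1), (2)
        return max(set(labels), key=labels.count)  # leaf: majority class
    best = max(attrs, key=lambda a: gain(examples, a))
    branches = {}
    for v in set(x[best] for x, _ in examples):
        sub = [(x, y) for x, y in examples if x[best] == v]
        branches[v] = id3(sub, [a for a in attrs if a != best])
    return (best, branches)                        # internal test node

def classify(tree, x):
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[x[attr]]
    return tree
```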
- the development of the tree depends upon the split criterion and the stop criterion for generation of nodes.
- a good tree should have few levels, as it is better to classify with as few decisions as possible.
- a good tree should have a large leaf population, as it is better to classify with as many cases as possible.
- the training data is split upon the variable that maximizes the Gain(S, A).
- The gain value is based only on the class distribution, which makes the computation easy to perform.
- The "entropy gain" measure does not take popularity into consideration. If the stop criterion is chosen with entropy equal to 0 (same class for all cases), over-fitting will result and undesirably deep trees with few cases on the leaf nodes will be created.
- The stop criterion may be set either at a minimum popularity allowed for the leaves, at a certain entropy value to be reached, or, most preferably, as a combination of these two conditions.
- the technique will over-fit the tree first and then proceed with "pruning".
- In practice, one successful method for finding high-accuracy hypotheses for classification is a technique called rule post-pruning. A variant of this pruning method is used by the well-known program C4.5. Rule post-pruning involves the following steps:
- In rule post-pruning, one rule is generated for each leaf node in the tree. Each attribute test along the path from the root to the leaf becomes a rule antecedent (precondition), and the classification at the leaf node becomes the rule consequent (postcondition). Next, each such rule is pruned by removing any antecedent, or precondition, whose removal does not worsen its estimated accuracy. The method selects whichever pruning step produces the greatest improvement in estimated rule accuracy, then considers pruning further preconditions in the same manner. No pruning step is performed if it reduces the estimated rule accuracy.
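The pruning loop just described can be sketched as follows. Note one deliberate simplification: the accuracy estimate here is plain validation-set accuracy, not the pessimistic binomial lower bound that C4.5 uses; the rule representation (a list of attribute/value preconditions plus a class) is also an assumption.

```python
# Sketch of rule post-pruning as described above.  A rule is
# (preconditions, class) with preconditions a list of (attribute, value)
# pairs.  An antecedent is dropped whenever removal does not lower the
# estimated accuracy; the estimate used here is plain validation
# accuracy, a simplification of C4.5's pessimistic estimate.

def rule_accuracy(rule, examples):
    pre, cls = rule
    matched = [(x, y) for x, y in examples
               if all(x[a] == v for a, v in pre)]
    if not matched:
        return 0.0
    return sum(1 for _, y in matched if y == cls) / len(matched)

def prune_rule(rule, examples):
    pre, cls = rule
    improved = True
    while improved and pre:
        improved = False
        base = rule_accuracy((pre, cls), examples)
        best = None
        for i in range(len(pre)):          # try dropping each antecedent
            cand = pre[:i] + pre[i + 1:]
            acc = rule_accuracy((cand, cls), examples)
            if acc >= base and (best is None or acc > best[0]):
                best = (acc, cand)         # keep the most helpful drop
        if best is not None:
            pre, improved = best[1], True
    return (pre, cls)
```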
- C4.5 calculates its pessimistic estimate by calculating the standard deviation in this estimated accuracy assuming a binomial distribution. For a given confidence level, the low-bound estimate is then taken as the measure of rule performance. For large data sets, this pessimistic estimate is very close to the observed accuracy.
- Converting decision trees to rules permits distinguishing among the different contexts in which a decision node is used and removes the distinction between attribute tests that occur near the root of the tree and those that occur near the "leaves" of the tree. Further, conversion to rules greatly improves the readability of the classification system and renders the system easier to incorporate with a knowledge base for further intelligent inference and reasoning.
- FIG. 5 An embodiment of the video classification system of the present invention is shown in FIG. 5.
- This embodiment of the present invention initiates classification with offline training 500.
- Sample video clips 502 of different categories are collected, and appropriate low-level features are identified.
- An entropy-based inductive tree-learning algorithm is utilized to establish the trained knowledge base.
- This knowledge base is represented as a decision-tree with each node in the tree being an if-then-else rule as applied to a similarity metric utilizing an appropriate low-level feature that is either user specified or generated, along with a good threshold.
- This threshold, similarly, may also be user specified or derived; generally it will be derived.
- The classifier 504 then accepts video or data 506 to be classified and utilizes the rules in the decision tree, in conjunction with visual features extracted from the input video or data 506, to classify the input video or data 506.
- the classifier 504 then provides the classification results 508.
- The rule scheme for this embodiment of the invention is shown in FIG. 6, where the rule at each level is depicted as <F, δ> 600, wherein F represents the low-level feature and δ represents the derived threshold. In most situations, as was stated above, the appropriate feature F and a good threshold δ are automatically created by the training process. Furthermore, the semantic categories for classification form the leaves of the tree. New video data is classified according to the decision tree as follows.
- The feature that is utilized at the root level is initially extracted, and the corresponding rule is applied to direct the data to the next level of the rule tree.
- At each subsequent level the same step is carried out, whereby an appropriate feature is selected and the corresponding rule applied. Note that if a feature was already calculated earlier in the tree, it need not be recalculated.
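The classification walk with feature reuse can be sketched as follows. The dictionary node layout and the feature-extractor callables are illustrative assumptions; the point of the sketch is the cache, which ensures each feature is computed at most once per clip.

```python
# Sketch of the rule-tree walk described above: at each internal node a
# <F, delta> rule compares a low-level feature F against its threshold
# delta, and a cache guarantees a feature computed near the root is
# never recomputed deeper in the tree.  Node layout and extractor
# functions are assumptions.

def classify_clip(node, video, extractors):
    cache = {}                                 # feature name -> value
    while isinstance(node, dict):              # internal <F, delta> rule
        name = node["feature"]
        if name not in cache:                  # compute each feature once
            cache[name] = extractors[name](video)
        branch = "then" if cache[name] > node["threshold"] else "else"
        node = node[branch]
    return node                                # leaf: semantic category
```

For example, with a two-level tree over hypothetical "motion" and "color" features:

```python
tree = {"feature": "motion", "threshold": 0.5,
        "then": {"feature": "color", "threshold": 0.2,
                 "then": "fastbreak", "else": "score"},
        "else": "closeup"}
extractors = {"motion": lambda v: v["m"], "color": lambda v: v["c"]}
```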
- the decision-tree training algorithm described in the preceding paragraphs was applied to motion, color, and edge features in video data of a basketball game. These features were extracted directly from MPEG-1 video.
- Nine classes for the basketball game data were identified and are outlined in FIG. 7. These classes were as follows: 1. team offense at left court 700; 2. team offense at right court 702; 3. fastbreak to left 704; 4. fastbreak to right 706; 5. dunk in left court 708; 6. dunk in right court 710; 7. score in right court 712; 8. score in left court 714; 9. close-ups of audience members or players 716.
- a set of data from one basketball game was utilized to train the learning algorithm and obtain the critical patterns of the classes for efficient rule classifications.
- A decision tree of at most three levels, containing fourteen rules, was generated.
- Classification required at most three calculations for each class. No more than six features were needed to classify all nine basketball event classes.
- the resultant decision tree as outlined in FIG. 7, was prepared according to the present invention.
- the decision tree prepared according to the present invention was applied to approximately 55 video data clips from a different set of basketball game video data.
- The results of the classification are outlined in Table 1. Use of the classification method according to the present invention, when applied to basketball game footage, provided classification accuracies of 70% to 82% for all identified classes of the basketball game video data. Results may be improved through additional training.
- the rule-based video classification system of the present invention is useful both for on-line and off-line video classifications, and thus has applications in video indexing systems, video scene understanding and data mining, on-line video filtering, intelligent video summarization, and fast video browsing.
- General video classification problems can be easily resolved by following the embodiment of the present invention illustrated in FIG. 5.
- the present invention can further be easily applied to "smart" video filtering based upon a specified user profile.
- FIG. 8 illustrates the data flow for this type of application according to the present invention.
- The classification system of the present invention may be utilized for real-time, user-preference-oriented multimedia data distribution via the Internet in "push" applications. This system can provide on-line feature extraction and semantic classification for matching data to a user profile, as well as filtering tools for real-time multimedia distribution and sharing over the Internet.
- the system can also be extended to other related applications, including interactive TV broadcast services, training services and collaboration.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Library & Information Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Image Analysis (AREA)
Abstract
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2001284740A AU2001284740A1 (en) | 2000-08-05 | 2001-08-04 | System for online rule-based video classification |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US22355500P | 2000-08-05 | 2000-08-05 | |
US60/223,555 | 2000-08-05 | ||
US70827200A | 2000-11-07 | 2000-11-07 | |
US09/708,272 | 2000-11-07 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2002013067A2 true WO2002013067A2 (fr) | 2002-02-14 |
WO2002013067A3 WO2002013067A3 (fr) | 2003-12-04 |
Family
ID=26917906
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2001/024719 WO2002013067A2 (fr) | 2000-08-05 | 2001-08-04 | Systeme pour la classification d'images video en ligne sur la base de regles |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2001284740A1 (fr) |
WO (1) | WO2002013067A2 (fr) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007118709A1 (fr) * | 2006-04-18 | 2007-10-25 | Technische Universität Berlin | procédé pour détecter une publicité dans un flux de données vidéo en évaluant des informations de descripteur |
WO2009148422A1 (fr) * | 2008-06-06 | 2009-12-10 | Thomson Licensing | Système et méthode de recherche d'images par similarité |
US7865492B2 (en) | 2005-09-28 | 2011-01-04 | Nokia Corporation | Semantic visual search engine |
CN114302224A (zh) * | 2021-12-23 | 2022-04-08 | 新华智云科技有限公司 | 一种视频智能剪辑方法、装置、设备及存储介质 |
CN114697761A (zh) * | 2022-04-07 | 2022-07-01 | 脸萌有限公司 | 一种处理方法、装置、终端设备及介质 |
-
2001
- 2001-08-04 AU AU2001284740A patent/AU2001284740A1/en not_active Abandoned
- 2001-08-04 WO PCT/US2001/024719 patent/WO2002013067A2/fr active Application Filing
Non-Patent Citations (3)
Title |
---|
GAUCH J M ET AL: "Real time video scene detection and classification" INFORMATION PROCESSING & MANAGEMENT, ELSEVIER, BARKING, GB, vol. 35, no. 3, May 1999 (1999-05), pages 381-400, XP004169416 ISSN: 0306-4573 * |
KUNIEDA T ET AL: "PACKAGE-SEGMENT MODEL FOR MOVIE RETRIEVAL SYSTEM AND ADAPTABLE APPLICATIONS" RICOH TECHNICAL REPORT, RICOH COMPANY, TOKYO, JP, no. 25, 30 November 1999 (1999-11-30), pages 33-39, XP002949997 ISSN: 0387-7795 * |
SMOLLAR S W ET AL: "CONTENT-BASED VIDEO INDEXING AND RETRIEVAL" IEEE MULTIMEDIA, IEEE COMPUTER SOCIETY, US, vol. 12, 1994, pages 62-72, XP002921947 ISSN: 1070-986X * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7865492B2 (en) | 2005-09-28 | 2011-01-04 | Nokia Corporation | Semantic visual search engine |
EP2450808A3 (fr) * | 2005-09-28 | 2012-05-30 | Nokia Corporation | Moteur de recherche visuelle sémantique |
CN102999635A (zh) * | 2005-09-28 | 2013-03-27 | 核心无线许可有限公司 | 语义可视搜索引擎 |
KR101516712B1 (ko) * | 2005-09-28 | 2015-05-04 | 코어 와이어리스 라이센싱 에스.에이.알.엘. | 의미론적 시각 검색 엔진 |
WO2007118709A1 (fr) * | 2006-04-18 | 2007-10-25 | Technische Universität Berlin | procédé pour détecter une publicité dans un flux de données vidéo en évaluant des informations de descripteur |
US7761491B2 (en) | 2006-04-18 | 2010-07-20 | Ecodisc Technology Ag | Method for detecting a commercial in a video data stream by evaluating descriptor information |
WO2009148422A1 (fr) * | 2008-06-06 | 2009-12-10 | Thomson Licensing | Système et méthode de recherche d'images par similarité |
CN114302224A (zh) * | 2021-12-23 | 2022-04-08 | 新华智云科技有限公司 | 一种视频智能剪辑方法、装置、设备及存储介质 |
CN114302224B (zh) * | 2021-12-23 | 2023-04-07 | 新华智云科技有限公司 | 一种视频智能剪辑方法、装置、设备及存储介质 |
CN114697761A (zh) * | 2022-04-07 | 2022-07-01 | 脸萌有限公司 | 一种处理方法、装置、终端设备及介质 |
CN114697761B (zh) * | 2022-04-07 | 2024-02-13 | 脸萌有限公司 | 一种处理方法、装置、终端设备及介质 |
Also Published As
Publication number | Publication date |
---|---|
AU2001284740A1 (en) | 2002-02-18 |
WO2002013067A3 (fr) | 2003-12-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113920370B (zh) | 模型训练方法、目标检测方法、装置、设备及存储介质 | |
Zhou et al. | Rule-based video classification system for basketball video indexing | |
Bhaumik et al. | Hybrid soft computing approaches to content based video retrieval: A brief review | |
CN101894125B (zh) | 一种基于内容的视频分类方法 | |
Bianco et al. | Predicting image aesthetics with deep learning | |
US20070196013A1 (en) | Automatic classification of photographs and graphics | |
Chen et al. | Semantic event detection via multimodal data mining | |
Bouguila | A model-based approach for discrete data clustering and feature weighting using MAP and stochastic complexity | |
Tung et al. | Collageparsing: Nonparametric scene parsing by adaptive overlapping windows | |
Lin et al. | Effective feature space reduction with imbalanced data for semantic concept detection | |
WO2022148108A1 (fr) | Systèmes, dispositifs et procédés d'analyse vidéo hiérarchique distribuée | |
CN111191033A (zh) | 一种基于分类效用的开集分类方法 | |
Mohan et al. | Classification of sport videos using edge-based features and autoassociative neural network models. | |
Raval et al. | A survey on event detection based video summarization for cricket | |
CN110765285A (zh) | 基于视觉特征的多媒体信息内容管控方法及系统 | |
Le Saux et al. | Feature selection for graph-based image classifiers | |
Sun et al. | Learning deep semantic attributes for user video summarization | |
CN112613474B (zh) | 一种行人重识别的方法和装置 | |
WO2002013067A2 (fr) | Systeme pour la classification d'images video en ligne sur la base de regles | |
CN110378384B (zh) | 一种结合特权信息和排序支持向量机的图像分类方法 | |
Krishna Mohan et al. | Classification of sport videos using edge-based features and autoassociative neural network models | |
Souvannavong et al. | Multi-modal classifier fusion for video shot content retrieval | |
Calarasanu et al. | From text detection to text segmentation: a unified evaluation scheme | |
Zhou et al. | Video analysis and classification for MPEG-7 applications | |
Laulkar et al. | Semantic rules-based Classification of outdoor natural scene images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ CZ DE DE DK DK DM DZ EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
122 | Ep: pct application non-entry in european phase | ||
NENP | Non-entry into the national phase |
Ref country code: JP |