
Concept-based Video Retrieval


DOCUMENT INFORMATION

Basic information

Format
Pages: 46
Size: 4.12 MB

Content

Concept-based Video Retrieval
Cees G.M. Snoek and Marcel Worring
SSIP 2008
www.mediamill.nl

Cees Snoek and Marcel Worring, with contributions by many
Intelligent Systems Lab Amsterdam, University of Amsterdam, The Netherlands

The science of labeling
* To understand anything in science, things have to have a name that is recognized and is universal
  * naming chemical elements
  * naming human genome
  * naming 'categories'
  * naming living organisms
  * naming rocks and minerals
  * naming textual information
What about naming video information?

Problem statement
[Figure: multimedia archives shown as raw bit strings, next to the concept labels Hu Jintao, Basketball, Table, Tree, US flag, Building, Aircraft, Dog, Tennis, Mountain, Explosion]

Different low-level features
Each feature yields a vector representation of the visual data.
[Figure: color histogram (count versus color); texture properties: regularity, coarseness, directionality]

Basic example: color histogram
[Figure: an image of 640 x 380 pixels, 243200 pixels in total, and its histogram of count versus color]
The histogram is a summary of the data, in this case summarizing color characteristics.

Advanced example: codebook model
* Create a codeword vocabulary
  * Codeword annotation (e.g. Sky, Water)
* Discretize image with codewords
* Represent image as codebook histogram
References: Leung and Malik, IJCV, 2001. Sivic and Zisserman, ICCV, 2003. van Gemert, PhD thesis, UvA, 2008.
[Figure: codebook histogram]

The goal: semantic video indexing
* Is the process of automatically detecting the presence of a semantic concept in a video stream (example: Airplane)

Semantic indexing
* The computer vision approach
  * Building detectors one at a time
A face detector for frontal faces; 3 years later, a face detector for non-frontal faces.
One (or more) PhD for every new concept.

So how about these?
Road, Beach, Boat, Animal, Building, Graphic, People, Car, Vegetation, Overlayed Text, Studio Setting, Outdoor
And the > 1000 others...

Generic concept detection in a nutshell
Training: labeled examples (outdoor, aircraft) -> Feature Extraction -> Supervised Learner
Testing: video -> Feature Measurement -> Classification -> "It is an aircraft", probability 0.7; "It is outdoor", probability 0.95

K nearest neighbor
[Figure: examples in a two-dimensional feature space with axes F1 and F2]

Linear classification
[Figure: examples in a two-dimensional feature space with axes F1 and F2]

Support vector machine
SVM usually is a good choice.
* Support Vector Machine
  * Learns from provided examples
  * Maximizes margin between two classes
[Figure: feature space with axes F1 and F2, showing the margin]

Supervised Learner
* Depends on many parameters
  * Select best of multiple parameter combinations
  * Using cross validation
[Figure: SVM taking a feature vector and producing a semantic concept probability; parameters shown: weight for positive class, weight for negative class]
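
The steps on the slides above (summarize the data as a histogram, train a supervised learner, pick its parameters by cross validation) can be sketched in a few lines. This is only an illustration under stated assumptions: the "images" are made-up lists of gray values, the feature is a tiny 4-bin histogram, and the learner is a k-nearest-neighbor classifier rather than the SVM the slides recommend, so the sketch runs without external libraries. The names `histogram`, `knn_predict`, `cross_val_accuracy`, `select_k`, and `toy_image` are hypothetical.

```python
from collections import Counter


def histogram(pixels, bins=4, max_value=256):
    """Summarize pixel values as relative counts per bin."""
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // max_value] += 1
    return [count / len(pixels) for count in counts]


def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train, query, k):
    """Label a query vector by majority vote among its k nearest neighbors."""
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k, folds):
    """Fraction of examples labeled correctly when each fold is held out once."""
    correct = 0
    for f in range(folds):
        test = data[f::folds]
        train = [item for i, item in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)


def select_k(data, candidates, folds=4):
    """Return the candidate k with the best cross-validated accuracy."""
    return max(candidates, key=lambda k: cross_val_accuracy(data, k, folds))


def toy_image(seed, offset):
    """Made-up image: 50 gray values in [offset, offset + 127]."""
    return [offset + (17 * seed + 5 * j) % 128 for j in range(50)]


# Dark images stand in for "indoor", bright ones for "outdoor" (an assumption).
data = [(histogram(toy_image(i, 0)), "indoor") for i in range(8)]
data += [(histogram(toy_image(i, 128)), "outdoor") for i in range(8)]

best_k = select_k(data, candidates=[1, 3, 5])
label = knn_predict(data, histogram(toy_image(99, 128)), best_k)
```

On this toy data the two classes are trivially separable, so every candidate `k` scores the same; with real features the cross-validated scores would differ and the selection would matter.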

How to improve concept detection?
[Figure: several parallel pipelines, each Feature Extraction -> Supervised Learner]

Feature fusion: multimodal
References: Snoek, ACM Multimedia 2005. Magelhaes, CIVR 2007.
[Figure: Visual Feature Extraction and Textual Feature Extraction -> Feature Fusion (vector concatenation & normalization) -> Supervised Learner]
+ Only one learning phase
+ Truly a multimedia representation
- Multimodal combination often ad hoc
- One modality may dominate
- Feature vectors become too large easily

Feature fusion: unimodal
References: van de Sande, CIVR 2008.
+ Codebook model reduces dimensionality
- Combination still ad hoc
- One feature may dominate
[Figure: Image -> Point sampling strategy (Harris-Laplace salient points, Dense sampling) -> Color feature extraction -> Codebook model -> Bag-of-features (relative frequency per codebook element); Spatial pyramid: multiple bags-of-features]

Classifier fusion: multimodal
References: Wu, ACM Multimedia 2004. Snoek, ACM Multimedia 2005.
[Figure: Visual Feature Extraction -> Supervised Learner and Textual Feature Extraction -> Supervised Learner, both feeding Classifier Fusion -> Supervised Learner]
+ Focus on modality strength
+ Fusion in semantic space
- Expensive in terms of learning effort
- Possible loss of feature space correlation
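
The two fusion schemes differ in where the combination happens, which a small sketch may make concrete. Everything here is an assumption made for illustration: the feature vectors and probabilities are made up, the function names are hypothetical, and L2 normalization is just one possible choice for the "normalization" step. Feature fusion normalizes each modality and concatenates the vectors before a single learner. Classifier fusion combines the outputs of separate learners; the slides feed those outputs into another supervised learner, whereas this sketch uses a geometric mean as a simpler stand-in (it is one of the aggregation options named on the unimodal classifier fusion slide).

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (all-zero vectors are copied)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def feature_fusion(*vectors):
    """Early fusion: normalize each modality, then concatenate into one vector."""
    fused = []
    for vector in vectors:
        fused.extend(l2_normalize(vector))
    return fused


def geometric_mean_fusion(probabilities):
    """Late fusion: combine per-classifier probabilities by their geometric mean."""
    return math.prod(probabilities) ** (1.0 / len(probabilities))


visual = [3.0, 4.0]        # hypothetical visual feature vector
textual = [0.0, 0.0, 2.0]  # hypothetical textual feature vector

# One vector for a single learner: approximately [0.6, 0.8, 0.0, 0.0, 1.0]
fused = feature_fusion(visual, textual)

# Hypothetical outputs of a visual and a textual classifier for one concept
score = geometric_mean_fusion([0.9, 0.4])  # approximately 0.6
```

Normalizing each modality before concatenation is one way to dampen the "one modality may dominate" problem, though it does not remove the ad hoc nature of the combination.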

Classifier fusion: unimodal
References: Snoek, TRECVID 2006. Wang, ACM MIR 2007.
[Figure: Global, Regional, and Keypoint Image Feature Extraction; learners: Support Vector Machine, Logistic Regression, Fisher Linear Discriminant; aggregation: Geometric Mean]
+ Aggregation functions reduce learning effort
+ Offers opportunity to use all available examples efficiently
- Linear function likely to be sub-optimal

Modeling relations
References: IBM 2003. Naphade and Huang, TMM 3(1), 2001.
* Exploitation of conceptual co-occurrence
  * Concepts do not occur in vacuum
  * In contrast, they are related (e.g. Sky and Aircraft)
* What is sports?
  * Answer: a combination of various individual sports

[...]

TRECVID interactive search task
So many choices for retrieval. Why not let user decide interactively?
[Figure: Topics -> Query -> Search Engine -> Result]
http://trecvid.nist.gov/

'Classic' Informedia system
References: Carnegie Mellon University
First multimodal video search engine

...

"... detectors, within the shot retrieval process and this appears to be the roadmap for future work in this area."
Alan Smeaton, Information Systems, 32(4):545-559, 2007

Measure concept detector influence
Hypothesis 1: Increasing the number of concept detectors in a lexicon improves video retrieval accuracy

...

... examples 2006 The Ugly exploit TV repetition

491 detectors, a closer look
The number of labeled image examples used at training time seems decisive in concept detector accuracy.
Demo time!

Concept detector: requires...

http://mp7.watson.ibm.com/marvel/

Cluster-temporal browsing
References: Oulu University
Using that results are typically similar/close in time

Físchlár
References: Dublin City University
Optimized for use by "real" users

References:...

... Mellon University
Extreme video retrieval
* Observation
  * Correct results are retrieved, but not optimally ranked
  * If user has time to scan results exhaustively, retrieval is a matter of watching, selecting, and sorting quickly
* Push the user to the max = very demanding!
* Rapid serial visual presentation
* Adjust browser to depth of results

... concept detectors from a lexicon improves video retrieval accuracy

TRECVID automatic search task
[Figure: Topics -> Query -> Search Engine -> Result]
* Automatically solve search topic
* Return 1,000 ranked shot-based results
* Evaluate using Average Precision
TRECVID 2005
* 85 hrs test set: Chinese, Arabic, English TV News
* 24 search topics

... and small data sets
Hard to compare methodologies
Since 2001 worldwide evaluation by NIST

NIST TRECVID benchmark anno 2001
Benchmark objectives
* Promote progress in video retrieval research
* Provide common dataset (shots, recognized speech, key frames)
* Use open, metrics-based evaluation
Large international...

... and concepts

Concept detection challenges
* Show generality of approach over several domains
* Show benefit of web-based image/video and annotations
* Show that concept classes work with less analysis
  * People, objects, setting
* Show benefit of using dynamic nature of video
  * Events
* Show that an ontology can help
How...

... relevant retrieved items
[Figure: Precision and Recall, inverse relationship]

TRECVID evaluation measures
* Classification procedure
  * Training: many hours of (partly) annotated video
  * Testing: many hours of unseen video
* Evaluation measure: Average Precision
  * Combines precision and recall
  * Averages precision...

... Engine Result
How to translate query topic to concept detectors?

Detector selection strategies
With B Huurnink / M de Rijke, L Hollink / G Schreiber, M Worring
[Figure: Video Query ("Find shots of an office setting") -> Detector Selection Strategies (Text Matching, Ontology Querying, Semantic Visual Querying) -> Fusion; data flow conventions]
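
Average Precision, the evaluation measure named on the TRECVID slides above, combines precision and recall by averaging the precision measured at each rank where a relevant shot appears. The sketch below shows the commonly used non-interpolated form. The shot identifiers are invented, the function names are hypothetical, and dividing by the total number of relevant shots (rather than only those retrieved) is the usual convention, assumed here rather than taken from the slides.

```python
def precision_recall(ranked, relevant):
    """Precision and recall of a result list against a set of relevant shots."""
    found = sum(1 for shot in ranked if shot in relevant)
    precision = found / len(ranked) if ranked else 0.0
    recall = found / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Mean of the precision values at the ranks where relevant shots occur."""
    hits = 0
    total = 0.0
    for rank, shot in enumerate(ranked, start=1):
        if shot in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical ranked result list and ground truth for one search topic
ranked = ["shot3", "shot7", "shot1", "shot9", "shot4"]
relevant = {"shot3", "shot1", "shot8"}

precision, recall = precision_recall(ranked, relevant)  # 2/5 and 2/3
ap = average_precision(ranked, relevant)                # (1/1 + 2/3) / 3
```

Because relevant shots ranked early contribute higher precision values, two systems that retrieve the same set of shots can still differ in Average Precision depending on how they order them.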

Posted: 24/04/2014, 13:20
