Boosting and Tree-structured Classifiers
Presenter: Nguyen Dang Binh

Classification speed matters not just for time-efficiency but also for accuracy.

Object Detection by a Cascade of Classifiers
- Pictures from Romdhani et al. ICCV 01.

Object Tracking by Fast (Re-)Detection
- From time t to t+1: a search region is placed around the previous object location and detection is run again.
- Online discriminative feature selection [Collins et al 03], Ensemble tracking [Avidan 07].

Semantic Segmentation
- Requires pixel-wise classification.

Structure of this talk
- First half: Introduction to Boosting; Bagging / Random Forests; AdaBoost; Robust real-time object detector; Boosting as a tree-structured classifier.
- Second half: Unified Boosting framework; Tree-structured classifiers; MCBoost; Speeding up Bagging/RF; Supertree; Comparison.

Things not covered
- Fast training (e.g. Pham and Cham ICCV 07).
- Randomised learning for Boosting (Rahimi et al NIPS 08).
- Variations such as Real-valued AdaBoost (Freund and Schapire 95), GentleBoost etc. by Friedman.
- Etc.

Introduction to Boosting Classifiers

Introduction to Boosting [Meir et al 03, Schapire 03]
- The underlying idea is to combine simple hypotheses to form an ensemble such that the performance of each single ensemble member is improved, i.e. "boosted".
- The strong classifier is H(x) = Σ_t α_t h_t(x), where the α_t are weights and the h_t are a set of hypotheses (weak learners).

A brief history
- PAC (Probably Approximately Correct) learning (Valiant 1984, Kearns and Valiant 1994): learners, each performing slightly better than random, can be combined to form a good hypothesis.
- Schapire (1990) first provided a polynomial-time algorithm and applied it to an OCR task, relying on neural networks as base learners.

A brief history (continued)
- AdaBoost (Adaptive Boosting) is the most common variant (Freund and Schapire 94).
- Lots of variations are formalised in a unified framework, AnyBoost (Mason et al 00).
- Grove et al (98) have shown overfitting effects on high-noise datasets.
- New types have emerged, e.g. for regression (Duffy et al 00), multi-class (Allwein et al 2000), and unsupervised learning (Ratsch et al 00).
- The work of Viola and Jones (01) is a landmark in CV.

[...]

A strong boosting classifier
- Boosting Cascade [Viola & Jones 04], Boosting chain [Xiao et al]: a very unbalanced tree.
- Speeds up unbalanced binary problems.
- Hard to design.

Bagging and Boosting as tree structures (inspired by Yin, Crinimisi CVPR 07)
- Bagging: random ferns? -> random forest -> bagged boosting classifiers?
- Boosting: a decision stump -> decision tree; boosting decision stumps -> boosted decision trees -> tree hierarchy.
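A minimal Python sketch of the cascade/early-rejection idea described above: each stage is a small boosted classifier, and a sample must pass every stage to be accepted. The `Stage` structure, thresholds, and toy weak learners are illustrative assumptions, not the Viola-Jones implementation.

```python
# Hypothetical sketch of an attentional cascade: each stage is a small boosted
# classifier; a sample must pass every stage, so the (many) easy negatives are
# rejected cheaply by the early stages.
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class Stage:
    weak_learners: List[Callable[[Sequence[float]], float]]  # each returns +1 or -1
    alphas: List[float]      # hypothesis weights from boosting
    threshold: float         # stage rejection threshold (tuned for high recall)

def cascade_classify(x: Sequence[float], stages: List[Stage]) -> bool:
    """Return True only if the sample passes every stage of the cascade."""
    for stage in stages:
        score = sum(a * h(x) for a, h in zip(stage.alphas, stage.weak_learners))
        if score < stage.threshold:
            return False     # early rejection: most negatives stop here
    return True

# Example: two hypothetical one-stump stages.
stages = [
    Stage([lambda x: 1.0 if x[0] > 0 else -1.0], [1.0], 0.0),
    Stage([lambda x: 1.0 if x[1] > 0 else -1.0], [1.0], 0.0),
]
print(cascade_classify([0.5, 0.2], stages))  # True: passes both stages
```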
Ensemble Learning - Bagging / Boosting

Bagging (Bootstrap AGGregatING)
- Bootstrap: for each set, randomly draw examples from the uniform distribution over the training set (with replacement), so examples may be duplicated or missing.
- An ensemble classifier or majority voting is built over the T bootstrap sets.
- More theory on bias/variance in [Geurts et al 06].

Bagging trees
- From the learning set LS, draw bootstrap sets LS_1, ..., LS_T and grow one tree per set, giving predictions y_1(x), ..., y_T(x).
- In classification, y(x) = the majority class in { y_1(x), ..., y_T(x) }.

Randomized Decision Forest [Breiman 01, Geurts et al 06]
- A forest is an ensemble of random decision trees.
- At each internal node, a split function f_n(v) of the feature vector v is compared with a threshold t_n; a sample is routed left or right until it reaches a leaf, which stores a class distribution P_n(c).
- Classification is defined by the feature vector, the split functions, the thresholds, and the leaf class distributions, combined over the trees.

Randomized Tree Learning
- At each node, candidate features f are chosen from a random feature pool and thresholds t are chosen in a range.
- Choose f and t to maximize the gain in information, then recurse on the left and right splits.

Random Forest - Summary
- Pros: generalization through bagging (random samples) and randomised tree learning (random features); very fast classification; inherently multi-class; simple training.
- Cons: inconsistency; difficulty for adaptation.

Boosting
- Iteratively reweights the training samples: higher weights to previously misclassified samples.
- (Figure: decision boundary after 1, 2, 3, 4, 5 and 50 rounds.)

Boosting Trees
- Trees are learned on reweighted versions LS_1, ..., LS_T of the learning set, giving predictions y_1(x), ..., y_T(x).
- In classification, y(x) = the majority class in { y_1(x), ..., y_T(x) } according to the hypothesis weights.

AdaBoost [Freund and Schapire 04]
- Input: training samples (x_i, y_i) with y_i in {-1, +1}; output: a strong classifier H(x).
- Init: uniform sample weights w_i = 1/N.
- For t = 1 to T: learn the weak hypothesis h_t that minimises the weighted error e_t; set the hypothesis weight α_t = (1/2) ln((1 - e_t)/e_t); update the sample weights w_i <- w_i exp(-α_t y_i h_t(x_i)) and normalise; break if e_t >= 1/2.
- Output: H(x) = sign(Σ_t α_t h_t(x)).

Existence of weak learners
- Definition of a baseline learner: given data weights on the set S, a baseline classifier predicts the same class for all x, and its weighted error is at most 1/2.
- Each weak learner in Boosting is demanded to do better than this baseline, so that the error of the composite hypothesis goes to zero as the boosting rounds increase [Duffy et al 00].
- XOR problems (Matlab demo) by Derek Hoiem (http://www.cs.uiuc.edu/homes/dhoiem/).

Multiple classifier systems: Mixture of Experts [Jordan, Jacobs 94]
- A gating network encourages specialization (local experts) instead of cooperation.
- Experts 1..L produce outputs y_1(x), ..., y_L(x); the gating network produces weights g_1(x), ..., g_L(x); the output is Σ_l g_l(x) y_l(x).

Ensemble learning: Boosting and Bagging
- Cooperation among the experts instead of specialization.

Robust real-time object detector: Boosting Simple Features [Viola and Jones CVPR 01]
- AdaBoost classification: the strong classifier is a weighted combination of weak classifiers.
- Weak classifiers: Haar-basis-like functions (45,396 in total).

Boosting Simple Features [Viola and Jones CVPR 01]: Integral image
- The value at (x, y) is the sum of the pixel values above and to the left of (x, y).
- The sum of the pixel values within any rectangle can then be computed from four lookups: Sum = A - B - C + D.

Boosting as a Tree-structured Classifier
- Boosting is a very shallow network: the strong classifier H built from boosted decision stumps has a flat structure (a weighted sum of single-feature tests).
- Cf. decision "ferns" have been shown to outperform "trees" [Zisserman et al 07], [Fua et al 07].

Boosting - continued
- Good generalisation by a flat structure [...]

BREAK !!

Unified Boosting Framework
AnyBoost: a unified framework [Mason et al 00]
- Most boosting algorithms have in common that they iteratively update the sample weights and select the next hypothesis based on the weighted samples.
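A small sketch of the bagging / random-forest prediction summarised above: train T trees on bootstrap resamples and classify by majority vote. It assumes NumPy arrays for X and y and uses scikit-learn decision trees; the `max_features="sqrt"` choice stands in for randomised feature selection and is an assumption, not the talk's exact setup.

```python
# Bagging: T trees on bootstrap resamples of the training set, majority vote.
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def train_bagged_trees(X, y, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    trees = []
    n = len(X)
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)              # bootstrap: draw with replacement
        tree = DecisionTreeClassifier(max_features="sqrt")  # random feature pool per split
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def predict_majority(trees, x):
    votes = [t.predict(x.reshape(1, -1))[0] for t in trees]  # y_1(x), ..., y_T(x)
    return Counter(votes).most_common(1)[0][0]               # majority class y(x)
```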
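The AdaBoost loop described above, written out as a minimal textbook sketch with decision stumps (single-feature threshold tests) as weak learners; the brute-force stump search is for clarity only and is not the tutorial's code.

```python
# Minimal AdaBoost with decision stumps. Labels are assumed to be in {-1, +1};
# X is an (N, D) NumPy array.
import numpy as np

def fit_adaboost(X, y, T=50):
    n, d = X.shape
    w = np.full(n, 1.0 / n)                 # init: uniform sample weights
    ensemble = []                           # list of (alpha, feature, threshold, sign)
    for _ in range(T):
        best = None                         # learn the stump h_t minimising weighted error
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for s in (+1, -1):
                    pred = s * np.where(X[:, j] > thr, 1, -1)
                    err = np.sum(w[pred != y])
                    if best is None or err < best[0]:
                        best = (err, j, thr, s)
        err, j, thr, s = best
        if err >= 0.5:                      # break if no weak learner beats chance
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))   # hypothesis weight
        pred = s * np.where(X[:, j] > thr, 1, -1)
        w *= np.exp(-alpha * y * pred)      # higher weights to misclassified samples
        w /= w.sum()
        ensemble.append((alpha, j, thr, s))
    return ensemble

def predict_adaboost(ensemble, X):
    score = sum(a * s * np.where(X[:, j] > thr, 1, -1) for a, j, thr, s in ensemble)
    return np.sign(score)                   # H(x) = sign(sum_t alpha_t h_t(x))
```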
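For the mixture-of-experts combination above, a toy sketch of the gating computation; the linear experts, gating parameters, and softmax gating are illustrative placeholders rather than the formulation in [Jordan, Jacobs 94].

```python
# Toy mixture of experts: output(x) = sum_l g_l(x) * y_l(x), with softmax gating
# so the gating weights g_l(x) are positive and sum to one.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def mixture_of_experts(x, expert_weights, gating_weights):
    y = np.array([w @ x for w in expert_weights])            # expert outputs y_l(x)
    g = softmax(np.array([v @ x for v in gating_weights]))   # gating weights g_l(x)
    return float(g @ y)                                      # weighted combination
```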
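Finally, a short illustration of the integral-image trick from the Viola-Jones slides: a cumulative-sum table lets any rectangle sum be read off with four lookups. The corner naming follows the slide's Sum = A - B - C + D; the zero padding row/column is an implementation convenience assumed here.

```python
# Integral image: ii[y, x] = sum of img[0:y, 0:x] (pixels above and to the left).
import numpy as np

def integral_image(img):
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of pixels in the rectangle img[top:top+height, left:left+width]."""
    A = ii[top + height, left + width]   # bottom-right corner
    B = ii[top,          left + width]   # top-right
    C = ii[top + height, left]           # bottom-left
    D = ii[top,          left]           # top-left
    return A - B - C + D

img = np.arange(16).reshape(4, 4)
assert rect_sum(integral_image(img), 1, 1, 2, 2) == img[1:3, 1:3].sum()
```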