
Advanced Machine Learning lecture: Support vector machine - Trịnh Tấn Đạt





The Advanced Machine Learning lecture on support vector machines covers: introduction, review of linear algebra, classifiers and classifier margin, linear SVMs (optimization problem), hard vs. soft margin classification, and non-linear SVMs.

Trịnh Tấn Đạt
Faculty of Information Technology, Saigon University
Email: trinhtandat@sgu.edu.vn
Website: https://sites.google.com/site/ttdat88/

Contents
• Introduction
• Review of Linear Algebra
• Classifiers & Classifier Margin
• Linear SVMs: Optimization Problem
• Hard vs. Soft Margin Classification
• Non-linear SVMs

Introduction
• Competitive with other classification methods
• Relatively easy to learn
• Kernel methods give an opportunity to extend the idea to:
  • Regression
  • Density estimation
  • Kernel PCA
  • Etc.

Advantages of SVMs - 1
• A principled approach to classification, regression and novelty detection
• Good generalization capabilities
• The hypothesis depends explicitly on the data, via the support vectors, so the model can be readily interpreted

Advantages of SVMs - 2
• Learning involves optimization of a convex function (no local minima, unlike neural nets)
• Only a few parameters need to be tuned (unlike neural nets, with their many weights, learning parameters, hidden layers, hidden units, etc.)

Prerequisites
• Vectors, matrices, dot products
• Equation of a straight line in vector notation
• Familiarity with:
  • Perceptron is useful
  • Mathematical programming will be useful
  • Vector spaces will be an added benefit
• The more comfortable you are with linear algebra, the easier this material will be

What is a Vector?
• Think of a vector as a directed line segment in N dimensions (it has “length” and “direction”)
• Basic idea: convert geometry in higher dimensions into algebra
• Once you define a “nice” basis along each dimension (x-, y-, z-axis, …), a vector becomes a 1 x N matrix: $v = [a\ b\ c]^T$
• Geometry starts to become linear algebra on vectors like v
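As a small illustration of the "length" and "direction" view of a vector, the sketch below stores a vector as a NumPy array and separates it into its Euclidean length and a unit direction vector. The component values are hypothetical and chosen only so the arithmetic is easy to follow; NumPy is an assumption here, as the slides do not prescribe a library for this part.

```python
import numpy as np

# Hypothetical example vector v = [a b c]^T with (a, b, c) = (3, 4, 12).
v = np.array([3.0, 4.0, 12.0])

# "Length": the Euclidean norm, sqrt(3^2 + 4^2 + 12^2).
length = np.linalg.norm(v)

# "Direction": the same vector rescaled to unit length.
direction = v / length

print("length:", length)
print("direction:", direction)
```

Multiplying `direction` by `length` recovers `v`, which is the sense in which a vector is fully described by its length and direction.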
Vector Addition: A + B
$v + w = (x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2)$
A + B = C (use the head-to-tail method to combine vectors)
[Figure: vectors A and B placed head-to-tail, giving C = A + B]

Scalar Product: av
$a v = a (x_1, x_2) = (a x_1, a x_2)$
• Changes only the length (“scaling”), but keeps the direction fixed
• Sneak peek: a matrix operation (Av) can change length, direction and also dimensionality!

Vectors: Magnitude (Length) and Phase (Direction)
$v = (x_1, x_2, \ldots, x_n)^T$
$\|v\| = \sqrt{\sum_{i=1}^{n} x_i^2}$ (magnitude or “2-norm”)
If $\|v\| = 1$, v is a unit vector (unit vector => pure direction)
Alternate representations:
• Polar coords: $(\|v\|, \theta)$
• Complex numbers: $\|v\| e^{j\theta}$
[Figure: vector in the x-y plane with length ||v|| and “phase” θ]

Consider Φ(a)·Φ(b), with Φ as shown below
$$\Phi(a) = \left(1,\ \sqrt{2}a_1, \ldots, \sqrt{2}a_m,\ a_1^2, \ldots, a_m^2,\ \sqrt{2}a_1a_2, \sqrt{2}a_1a_3, \ldots, \sqrt{2}a_1a_m,\ \sqrt{2}a_2a_3, \ldots, \sqrt{2}a_2a_m,\ \ldots,\ \sqrt{2}a_{m-1}a_m\right)^T$$
$$\Phi(b) = \left(1,\ \sqrt{2}b_1, \ldots, \sqrt{2}b_m,\ b_1^2, \ldots, b_m^2,\ \sqrt{2}b_1b_2, \sqrt{2}b_1b_3, \ldots, \sqrt{2}b_1b_m,\ \sqrt{2}b_2b_3, \ldots, \sqrt{2}b_2b_m,\ \ldots,\ \sqrt{2}b_{m-1}b_m\right)^T$$

Collecting terms in the dot product
• First term = $1$
• Next m terms = $\sum_{i=1}^{m} 2 a_i b_i$
• Next m terms = $\sum_{i=1}^{m} a_i^2 b_i^2$
• Rest = $\sum_{i=1}^{m} \sum_{j=i+1}^{m} 2 a_i a_j b_i b_j$
• Therefore
$$\Phi(a)\cdot\Phi(b) = 1 + 2\sum_{i=1}^{m} a_i b_i + \sum_{i=1}^{m} a_i^2 b_i^2 + \sum_{i=1}^{m} \sum_{j=i+1}^{m} 2 a_i a_j b_i b_j$$

Out of Curiosity
$$(1 + a\cdot b)^2 = (a\cdot b)^2 + 2(a\cdot b) + 1$$
$$= \left(\sum_{i=1}^{m} a_i b_i\right)^2 + 2\sum_{i=1}^{m} a_i b_i + 1$$
$$= \sum_{i=1}^{m} \sum_{j=1}^{m} a_i b_i a_j b_j + 2\sum_{i=1}^{m} a_i b_i + 1$$
$$= \sum_{i=1}^{m} (a_i b_i)^2 + 2\sum_{i=1}^{m} \sum_{j=i+1}^{m} a_i b_i a_j b_j + 2\sum_{i=1}^{m} a_i b_i + 1$$

Both are Same
• Comparing term by term, we see $\Phi(a)\cdot\Phi(b) = (1 + a\cdot b)^2$
• But computing the right side is a lot more efficient: O(m) (m additions and multiplications)
• Let us call $(1 + a\cdot b)^2 = K(a, b)$ the kernel

Φ in “Kernel Trick” Example
2-dimensional vectors $x = [x_1\ x_2]$; let $K(x_i, x_j) = (1 + x_i^T x_j)^2$.
Need to show that $K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$:
$$K(x_i, x_j) = (1 + x_i^T x_j)^2 = 1 + x_{i1}^2 x_{j1}^2 + 2 x_{i1} x_{j1} x_{i2} x_{j2} + x_{i2}^2 x_{j2}^2 + 2 x_{i1} x_{j1} + 2 x_{i2} x_{j2}$$
$$= [1\ \ x_{i1}^2\ \ \sqrt{2}x_{i1}x_{i2}\ \ x_{i2}^2\ \ \sqrt{2}x_{i1}\ \ \sqrt{2}x_{i2}]^T\, [1\ \ x_{j1}^2\ \ \sqrt{2}x_{j1}x_{j2}\ \ x_{j2}^2\ \ \sqrt{2}x_{j1}\ \ \sqrt{2}x_{j2}]$$
$$= \varphi(x_i)^T \varphi(x_j), \quad \text{where } \varphi(x) = [1\ \ x_1^2\ \ \sqrt{2}x_1x_2\ \ x_2^2\ \ \sqrt{2}x_1\ \ \sqrt{2}x_2]$$

Other Kernels
• Beyond polynomials there are other high-dimensional basis functions that can be made practical by finding the right kernel function

Examples of Kernel Functions
◼ Linear: $K(x_i, x_j) = x_i^T x_j$
◼ Polynomial of power p: $K(x_i, x_j) = (1 + x_i^T x_j)^p$
◼ Gaussian (radial-basis function network): $K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$
◼ Sigmoid: $K(x_i, x_j) = \tanh(\beta_0 x_i^T x_j + \beta_1)$

• The function we end up optimizing (maximizing over α) is
$$\sum_{k=1}^{R} \alpha_k - \frac{1}{2}\sum_{k=1}^{R} \sum_{l=1}^{R} \alpha_k \alpha_l Q_{kl}, \quad \text{where } Q_{kl} = y_k y_l K(x_k, x_l)$$
$$\text{s.t. } 0 \le \alpha_k \le C\ \ \forall k \quad \text{and} \quad \sum_{k=1}^{R} \alpha_k y_k = 0$$

Multi-class classification
• One versus all classification

Multi-class SVM

SVM Software
• Python: scikit-learn module
• LibSVM (C++)
• SVMLight (C)
• Torch (C++)
• Weka (Java)
• …

Research
• One-class SVM (unsupervised learning): outlier detection
• Weibull-calibrated SVM (W-SVM) / PI-SVM: open set recognition

Homework
• CIFAR-10 image recognition using SVM
• The CIFAR-10 dataset consists of 60000 32x32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images
• These are the classes in the dataset: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck
• Hint: https://github.com/wikiabhi/Cifar-10 https://github.com/mok232/CIFAR-10-Image-Classification

...

Width of the Margin
What we know:
• $w \cdot x^+ + b = +1$
• $w \cdot x^- + b = -1$
• $w \cdot (x^+ - x^-) = 2$
Also $x^+ = x^- + \lambda w$ and $|x^+ - x^-| = M$, so $M = \|x^+ - x^-\| = \|\lambda w\|$ ...
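Returning to the kernel identity Φ(a)·Φ(b) = (1 + a·b)² from the kernel slides, the following is a minimal numerical check rather than part of the original lecture. It builds the explicit quadratic feature map with the same term ordering as the slides and compares its dot product against the O(m) kernel evaluation. The function names `phi` and `poly_kernel`, the dimension m = 5, and the random test vectors are all illustrative assumptions.

```python
import numpy as np

def phi(x):
    """Explicit quadratic feature map:
    (1, sqrt(2)*x_i, x_i^2, sqrt(2)*x_i*x_j for i < j)."""
    m = len(x)
    cross = [np.sqrt(2) * x[i] * x[j]
             for i in range(m) for j in range(i + 1, m)]
    return np.concatenate(([1.0], np.sqrt(2) * x, x ** 2, cross))

def poly_kernel(a, b):
    """Quadratic kernel K(a, b) = (1 + a.b)^2, computed in O(m)."""
    return (1.0 + a @ b) ** 2

# Hypothetical test vectors in m = 5 dimensions.
rng = np.random.default_rng(0)
a = rng.normal(size=5)
b = rng.normal(size=5)

lhs = phi(a) @ phi(b)    # dot product in the expanded feature space
rhs = poly_kernel(a, b)  # same value without building the features

print("feature dimension:", phi(a).shape[0])
print("phi(a).phi(b) =", lhs)
print("(1 + a.b)^2   =", rhs)
```

For m = 5 the explicit map already has 1 + 5 + 5 + 10 = 21 components, and this count grows quadratically with m, while the kernel evaluation stays linear in m.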
Projection: Using Inner Products - 1
$p = a\,(a^T x)$, where $\|a\| = a^T a = 1$

Projection: Using Inner Products - 2
$p = a\,(a^T b) / (a^T a)$
Note: the “error vector” $e = b - p$ is orthogonal (perpendicular) to p, i.e. the inner product $(b - p)^T p = 0$
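The projection formula p = a (aᵀb)/(aᵀa) and the orthogonality of the error vector can be checked numerically. The sketch below is an illustration under assumed values: the vectors `a` and `b` are hypothetical and were picked so the result is easy to verify by hand.

```python
import numpy as np

# Hypothetical vectors: project b onto the line spanned by a.
a = np.array([2.0, 1.0])
b = np.array([1.0, 3.0])

# Projection of b onto a: p = a (a^T b) / (a^T a).
p = a * (a @ b) / (a @ a)

# Error vector e = b - p, which should be perpendicular to p.
e = b - p

print("p =", p)
print("e =", e)
print("e . p =", e @ p)
```

Here aᵀb = 5 and aᵀa = 5, so p equals a itself, and e = (-1, 2) has zero inner product with p.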

Posted: 15/05/2020, 22:39
