Computer Science

“This book provides a concise overview of SVMs, starting from the basics and connecting to many of their most significant extensions. Starting from an optimization perspective provides a new way of presenting the material, including many of the technical details that are hard to find in other texts. And since it includes a discussion of many practical issues important for the effective use of SVMs (e.g., feature construction), the book is valuable as a reference for researchers and practitioners alike.”
-- Professor Thorsten Joachims, Cornell University

“One thing which makes the book very unique from other books is that the authors try to shed light on SVM from the viewpoint of optimization. I believe that the comprehensive and systematic explanation on the basic concepts, fundamental principles, algorithms, and theories of SVM will help readers have a really in-depth understanding of the space. It is really a great book, which many researchers, students, and engineers in computer science and related fields will want to carefully read and routinely consult.”
-- Dr. Hang Li, Noah’s Ark Lab, Huawei Technologies Co., Ltd.

“This book comprehensively covers many topics of SVMs. In particular, it gives a nice connection between optimization theory and support vector machines. … The setting allows readers to easily learn how optimization techniques are used in a machine learning technique such as SVM.”
-- Professor Chih-Jen Lin, National Taiwan University

Support Vector Machines
Optimization Based Theory, Algorithms, and Extensions

Naiyang Deng
Yingjie Tian
Chunhua Zhang

Chapman & Hall/CRC Data Mining and Knowledge Discovery Series

SERIES EDITOR
Vipin Kumar
University of Minnesota
Department of Computer Science and Engineering
Minneapolis, Minnesota, U.S.A.

AIMS AND SCOPE
This series aims to capture new developments and applications in data mining and knowledge discovery, while summarizing the computational tools and techniques useful in data analysis. This series encourages the integration of mathematical, statistical, and computational methods and techniques through the publication of a broad range of textbooks, reference works, and handbooks. The inclusion of concrete examples and applications is highly encouraged. The scope of the series includes, but is not limited to, titles in the areas of data mining and knowledge discovery methods and applications, modeling, algorithms, theory and foundations, data and knowledge visualization, data mining systems and tools, and privacy and security issues.

PUBLISHED TITLES
ADVANCES IN MACHINE LEARNING AND DATA MINING FOR ASTRONOMY
Michael J. Way, Jeffrey D. Scargle, Kamal M. Ali, and Ashok N. Srivastava
BIOLOGICAL DATA MINING
Jake Y. Chen and Stefano Lonardi
COMPUTATIONAL METHODS OF FEATURE SELECTION
Huan Liu and Hiroshi Motoda
CONSTRAINED CLUSTERING: ADVANCES IN ALGORITHMS, THEORY, AND APPLICATIONS
Sugato Basu, Ian Davidson, and Kiri L. Wagstaff
CONTRAST DATA MINING: CONCEPTS, ALGORITHMS, AND APPLICATIONS
Guozhu Dong and James Bailey
DATA CLUSTERING IN C++: AN OBJECT-ORIENTED APPROACH
Guojun Gan
DATA MINING FOR DESIGN AND MARKETING
Yukio Ohsawa and Katsutoshi Yada
DATA MINING WITH R: LEARNING WITH CASE STUDIES
Luís Torgo
FOUNDATIONS OF PREDICTIVE ANALYTICS
James Wu and Stephen Coggeshall
GEOGRAPHIC DATA MINING AND KNOWLEDGE DISCOVERY, SECOND EDITION
Harvey J. Miller and Jiawei Han
HANDBOOK OF EDUCATIONAL DATA MINING
Cristóbal Romero, Sebastian Ventura, Mykola Pechenizkiy, and Ryan S.J.d. Baker
INFORMATION DISCOVERY ON ELECTRONIC HEALTH RECORDS
Vagelis Hristidis
INTELLIGENT TECHNOLOGIES FOR WEB APPLICATIONS
Priti Srinivas Sajja and Rajendra Akerkar
INTRODUCTION TO PRIVACY-PRESERVING DATA PUBLISHING: CONCEPTS AND TECHNIQUES
Benjamin C. M. Fung, Ke Wang, Ada Wai-Chee Fu, and Philip S. Yu
KNOWLEDGE DISCOVERY FOR COUNTERTERRORISM AND LAW ENFORCEMENT
David Skillicorn
KNOWLEDGE DISCOVERY FROM DATA STREAMS
João Gama
MACHINE LEARNING AND KNOWLEDGE DISCOVERY FOR ENGINEERING SYSTEMS HEALTH MANAGEMENT
Ashok N. Srivastava and Jiawei Han
MINING SOFTWARE SPECIFICATIONS: METHODOLOGIES AND APPLICATIONS
David Lo, Siau-Cheng Khoo, Jiawei Han, and Chao Liu
MULTIMEDIA DATA MINING: A SYSTEMATIC INTRODUCTION TO CONCEPTS AND THEORY
Zhongfei Zhang and Ruofei Zhang
MUSIC DATA MINING
Tao Li, Mitsunori Ogihara, and George Tzanetakis
NEXT GENERATION OF DATA MINING
Hillol Kargupta, Jiawei Han, Philip S. Yu, Rajeev Motwani, and Vipin Kumar
RELATIONAL DATA CLUSTERING: MODELS, ALGORITHMS, AND APPLICATIONS
Bo Long, Zhongfei Zhang, and Philip S. Yu
SERVICE-ORIENTED DISTRIBUTED KNOWLEDGE DISCOVERY
Domenico Talia and Paolo Trunfio
SPECTRAL FEATURE SELECTION FOR DATA MINING
Zheng Alan Zhao and Huan Liu
STATISTICAL DATA MINING USING SAS APPLICATIONS, SECOND EDITION
George Fernandez
SUPPORT VECTOR MACHINES: OPTIMIZATION BASED THEORY, ALGORITHMS, AND EXTENSIONS
Naiyang Deng, Yingjie Tian, and Chunhua Zhang
TEMPORAL DATA MINING
Theophano Mitsa
TEXT MINING: CLASSIFICATION, CLUSTERING, AND APPLICATIONS
Ashok N. Srivastava and Mehran Sahami
THE TOP TEN ALGORITHMS IN DATA MINING
Xindong Wu and Vipin Kumar
UNDERSTANDING COMPLEX DATASETS: DATA MINING WITH MATRIX DECOMPOSITIONS
David Skillicorn

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2013 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Version Date: 20121203

International Standard Book Number-13: 978-1-4398-5793-9 (eBook - PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com
and the CRC Press Web site at http://www.crcpress.com

Dedicated to my beloved wife Meifang
Naiyang Deng

Dedicated to my dearest father Mingran Tian
Yingjie Tian

Dedicated to my husband Xingang Xu and my son
Kaiwen Xu
Chunhua Zhang

Contents

List of Figures
List of Tables
Preface
List of Symbols

1 Optimization
  1.1 Optimization Problems in Euclidean Space
    1.1.1 An example of optimization problems
    1.1.2 Optimization problems and their solutions
    1.1.3 Geometric interpretation of optimization problems
  1.2 Convex Programming in Euclidean Space
    1.2.1 Convex sets and convex functions
      1.2.1.1 Convex sets
      1.2.1.2 Convex functions
    1.2.2 Convex programming and their properties
      1.2.2.1 Convex programming problems
      1.2.2.2 Basic properties
    1.2.3 Duality theory
      1.2.3.1 Derivation of the dual problem
      1.2.3.2 Duality theory
    1.2.4 Optimality conditions
    1.2.5 Linear programming
  1.3 Convex Programming in Hilbert Space
    1.3.1 Convex sets and Fréchet derivative
    1.3.2 Convex programming problems
    1.3.3 Duality theory
    1.3.4 Optimality conditions
  *1.4 Convex Programming with Generalized Inequality Constraints in Euclidean Space
    1.4.1 Convex programming with generalized inequality constraints
      1.4.1.1 Cones
      1.4.1.2 Generalized inequalities