Optimization and its applications in control and data sciences

Springer Optimization and Its Applications, Volume 115

Boris Goldengorin (Editor), Optimization and Its Applications in Control and Data Sciences: In Honor of Boris T. Polyak's 80th Birthday

Managing Editor: Panos M. Pardalos (University of Florida)
Editor, Combinatorial Optimization: Ding-Zhu Du (University of Texas at Dallas)
Advisory Board: J. Birge (University of Chicago), C.A. Floudas (Texas A&M University), F. Giannessi (University of Pisa), H.D. Sherali (Virginia Polytechnic and State University), T. Terlaky (Lehigh University), Y. Ye (Stanford University)

Aims and Scope

Optimization has been expanding in all directions at an astonishing rate during the last few decades. New algorithmic and theoretical techniques have been developed, the diffusion into other disciplines has proceeded at a rapid pace, and our knowledge of all aspects of the field has grown even more profound. At the same time, one of the most striking trends in optimization is the constantly increasing emphasis on the interdisciplinary nature of the field. Optimization has been a basic tool in all areas of applied mathematics, engineering, medicine, economics, and other sciences.

The series Springer Optimization and Its Applications publishes undergraduate and graduate textbooks, monographs, and state-of-the-art expository works that focus on algorithms for solving optimization problems and also study applications involving such problems. Some of the topics covered include nonlinear optimization (convex and nonconvex), network flow problems, stochastic optimization, optimal control, discrete optimization, multi-objective programming, description of software packages, approximation techniques, and heuristic approaches.

More information about this series at http://www.springer.com/series/7393

Editor: Boris Goldengorin, Department of Industrial and Systems Engineering, Ohio University, Athens, OH, USA

ISSN 1931-6828; ISSN 1931-6836 (electronic)
Springer Optimization and Its Applications
ISBN 978-3-319-42054-7; ISBN 978-3-319-42056-1 (eBook)
DOI 10.1007/978-3-319-42056-1
Library of Congress Control Number: 2016954316

(c) Springer International Publishing Switzerland 2016. This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. Printed on acid-free paper. This Springer imprint is published by Springer Nature. The registered company is Springer
International Publishing AG Switzerland.

This book is dedicated to Professor Boris T. Polyak on the occasion of his 80th birthday.

Preface

This book is a collection of papers related to the International Conference "Optimization and Its Applications in Control and Data Sciences" dedicated to Professor Boris T. Polyak on the occasion of his 80th birthday, which was held in Moscow, Russia, May 13-15, 2015.

Boris Polyak obtained his Ph.D. in mathematics from Moscow State University, USSR, in 1963 and the Dr.Sci. degree from the Moscow Institute of Control Sciences, USSR, in 1986. Between 1963 and 1971 he worked at Lomonosov Moscow State University, and in 1971 he moved to the V.A. Trapeznikov Institute of Control Sciences, Russian Academy of Sciences. Professor Polyak was the Head of the Tsypkin Laboratory, and currently he is a Chief Researcher at the Institute. Professor Polyak has held visiting positions at universities in the USA, France, Italy, Israel, Finland, and Taiwan; he is currently a professor at the Moscow Institute for Physics and Technology. His research interests in optimization and control have an emphasis on stochastic optimization and robust control. Professor Polyak is an IFAC Fellow and a recipient of the EURO-2012 Gold Medal of the European Operational Research Society. Currently, Boris Polyak's h-index is 45 with 11807 citations, including 4390 citations since 2011.

This volume contains papers reflecting developments in theory and applications rooted in Professor Polyak's fundamental contributions to constrained and unconstrained optimization, differentiable and nonsmooth functions, including stochastic optimization and approximation, and optimal and robust algorithms to solve many problems of estimation, identification, and adaptation in control theory and its applications to nonparametric statistics and ill-posed problems.

This book focuses on recent research in modern optimization and its implications in control and data analysis. Researchers, students, and engineers will benefit from the original contributions and overviews included in this book. The book is of great interest to researchers in large-scale constrained and unconstrained, convex and non-linear, continuous and discrete optimization. Since it presents open problems in optimization, game and control theories, designers of efficient algorithms and software for solving optimization problems in market and data analysis will benefit from new unified approaches in applications from managing portfolios of financial instruments to finding market equilibria. The book is also beneficial to theoreticians in operations research, applied mathematics, algorithm design, artificial intelligence, machine learning, and software engineering. Graduate students will be updated with the state-of-the-art in modern optimization, control theory, and data analysis.

Athens, OH, USA
March 2016
Boris Goldengorin

Acknowledgements

This volume collects contributions presented within the International Conference "Optimization and Its Applications in Control and Data Sciences" held in Moscow, Russia, May 13-15, 2015, or submitted by an open call for papers to the book "Optimization and Its Applications in Control Sciences and Data Analysis" announced at the same conference. I would like to express my gratitude to Professors Alexander S. Belenky (National Research University Higher School of Economics and MIT) and Panos M. Pardalos (University of Florida) for their support in organizing the publication of this book, including many efforts with invitations of top researchers
in contributing and reviewing the submitted papers. I am thankful to the reviewers for their comprehensive feedback on every submitted paper and their timely replies. They greatly improved the quality of the submitted contributions and hence of this volume. Here is the list of all reviewers:

1. Anatoly Antipin, Federal Research Center "Computer Science and Control" of Russian Academy of Sciences, Moscow, Russia
2. Saman Babaie-Kafaki, Faculty of Mathematics, Statistics, and Computer Science, Semnan University, Semnan, Iran
3. Amit Bhaya, Graduate School of Engineering (COPPE), Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, Brazil
4. Lev Bregman, Department of Mathematics, Ben Gurion University, Beer Sheva, Israel
5. Arkadii A. Chikrii, Optimization Department of Controlled Processes, Cybernetics Institute, National Academy of Sciences, Kiev, Ukraine
6. Giacomo Como, The Department of Automatic Control, Lund University, Lund, Sweden
7. Xiao Liang Dong, School of Mathematics and Statistics, Xidian University, Xi'an, People's Republic of China
8. Trevor Fenner, School of Computer Science and Information Systems, Birkbeck College, University of London, London, UK
9. Sjur Didrik Flåm, Institute of Economics, University of Bergen, Bergen, Norway
10. Sergey Frenkel, The Institute of Informatics Problems, Russian Academy of Science, Moscow, Russia
11. Piyush Grover, Mitsubishi Electric Research Laboratories, Cambridge, MA, USA
12. Jacek Gondzio, School of Mathematics, The University of Edinburgh, Edinburgh, Scotland, UK
13. Rita Giuliano, Dipartimento di Matematica, Università di Pisa, Pisa, Italy
14. Grogori Kolesnik, Department of Mathematics, California State University, Los Angeles, CA, USA
15. Pavlo S. Knopov, Department of Applied Statistics, Faculty of Cybernetics, Taras Shevchenko National University, Kiev, Ukraine
16. Arthur Krener, Mathematics Department, University of California, Davis, CA, USA
17. Bernard C. Levy, Department of Electrical and Computer Engineering, University of California, Davis, CA, USA
18. Vyacheslav I. Maksimov, Institute of Mathematics and Mechanics, Ural Branch of the Russian Academy of Sciences, Ekaterinburg, Russia
19. Yuri Merkuryev, Department of Modelling and Simulation, Riga Technical University, Riga, Latvia
20. Arkadi Nemorovski, School of Industrial and Systems Engineering, Atlanta, GA, USA
21. José Valente de Oliveira, Faculty of Science and Technology, University of Algarve, Campus de Gambelas, Faro, Portugal
22. Alex Poznyak, Dept. Control Automatico, CINVESTAV-IPN, Mexico D.F., Mexico
23. Vladimir Yu. Protasov, Faculty of Mechanics and Mathematics, Lomonosov Moscow State University, and Faculty of Computer Science of National Research University Higher School of Economics, Moscow, Russia
24. Simeon Reich, Department of Mathematics, Technion-Israel Institute of Technology, Haifa, Israel
25. Alessandro Rizzo, Computer Engineering, Politecnico di Torino, Torino, Italy
26. Carsten W. Scherer, Institute of Mathematical Methods in Engineering, University of Stuttgart, Stuttgart, Germany
27. Alexander Shapiro, School of Industrial and Systems Engineering, Atlanta, GA, USA
28. Lieven Vandenberghe, UCLA Electrical Engineering Department, Los Angeles, CA, USA
29. Yuri Yatsenko, School of Business, Houston Baptist University, Houston, TX, USA

I would like to acknowledge the superb assistance that the staff of Springer has provided (thank you, Razia Amzad). Also I would like to acknowledge help in the preparation of this book from Silembarasanh Panneerselvam. Technical assistance
with reformatting some papers and compilation of this book's many versions by Ehsan Ahmadi (Ph.D. student, Industrial and Systems Engineering Department, Ohio University, Athens, OH, USA) is greatly appreciated. Finally, I would like to thank all my colleagues from the Department of Industrial and Systems Engineering, The Russ College of Engineering and Technology, Ohio University, Athens, OH, USA, for providing me with a pleasant atmosphere to work within the C. Paul Stocker Visiting Professor position.

The Legendre Transformation in Modern Optimization
R.A. Polyak

[...] where $A \preceq B$ means that $B - A$ is nonnegative definite. Note that (187) holds for any $x, y \in \operatorname{dom} F$. In order to find upper and lower bounds for the matrix
$$G = \int_0^1 \nabla^2 F\big(x + \tau(y - x)\big)\,d\tau, \qquad (188)$$
let us consider (187) for $y := x + \tau(y - x)$. From the left inequality in (187) it follows that
$$G = \int_0^1 \nabla^2 F\big(x + \tau(y - x)\big)\,d\tau \succeq \nabla^2 F(x)\int_0^1\big(1 - \tau\|y - x\|_x\big)^2\,d\tau.$$
Therefore, for $r = \|y - x\|_x < 1$, we have
$$G \succeq \nabla^2 F(x)\int_0^1 (1 - \tau r)^2\,d\tau = \nabla^2 F(x)\Big(1 - r + \frac{r^2}{3}\Big). \qquad (189)$$
From the right inequality in (187) it follows that
$$G \preceq \nabla^2 F(x)\int_0^1 (1 - \tau r)^{-2}\,d\tau = \nabla^2 F(x)\,\frac{1}{1 - r}, \qquad (190)$$
i.e., for any $x \in \operatorname{dom} F$, the following inequalities hold:
$$\Big(1 - r + \frac{r^2}{3}\Big)\nabla^2 F(x) \preceq G \preceq \frac{1}{1 - r}\,\nabla^2 F(x). \qquad (191)$$
The first two integrations produced two very important facts.
1. For any $x \in \operatorname{dom} F$, Dikin's ellipsoid
$$E(x, r) = \{y \in \mathbb{R}^n : \|y - x\|_x^2 \le r\}$$
is contained in $\operatorname{dom} F$ for any $0 \le r < 1$.
2. For any $x \in \operatorname{dom} F$ and any $y \in E(x, r)$, from (187) it follows that
$$(1 - r)^2\,\nabla^2 F(x) \preceq \nabla^2 F(y) \preceq (1 - r)^{-2}\,\nabla^2 F(x), \qquad (192)$$
i.e. the function $F$ is almost quadratic inside the ellipsoid $E(x, r)$ for small $0 \le r < 1$.

The bounds for the gradient $\nabla F(x)$, which is a monotone operator in $\mathbb{R}^n$, we establish by integrating (182).

Third integration. From (182), for $0 \le t < f''(0)^{-1/2} = \|y - x\|_x^{-1}$ and $0 \le s \le 1$, we obtain
$$\int_0^s \frac{f''(0)}{\big(1 + t f''(0)^{1/2}\big)^2}\,dt \le \int_0^s f''(t)\,dt \le \int_0^s \frac{f''(0)}{\big(1 - t f''(0)^{1/2}\big)^2}\,dt,$$
or
$$f'(0) + \frac{s f''(0)}{1 + s f''(0)^{1/2}} \le f'(s) \le f'(0) + \frac{s f''(0)}{1 - s f''(0)^{1/2}}.$$
The obtained inequalities can be rewritten as follows:
$$f'(0) + \omega'\big(f''(0)^{1/2}s\big)\,f''(0)^{1/2} \le f'(s) \le f'(0) + \omega^{*\prime}\big(f''(0)^{1/2}s\big)\,f''(0)^{1/2}, \qquad (193)$$
where $\omega(t) = t - \ln(1 + t)$ and $\omega^*(s) = \sup_{t > -1}\{st - t + \ln(1 + t)\} = -s - \ln(1 - s)$ is the LET of $\omega(t)$. From the right inequality in (193), for $s = 1$, it follows that
$$f'(1) - f'(0) \le f''(0)^{1/2}\,\frac{f''(0)^{1/2}}{1 - f''(0)^{1/2}} = \frac{f''(0)}{1 - f''(0)^{1/2}}.$$
Recalling the formulas for $f'(0)$, $f'(1)$, $f''(0)$, and $f''(1)$, we get
$$\big\langle\nabla F(y) - \nabla F(x), y - x\big\rangle \le \frac{\|y - x\|_x^2}{1 - \|y - x\|_x} \qquad (194)$$
for any $x, y \in \operatorname{dom} F$. From the left inequality in (193), for $s = 1$, it follows that
$$f'(1) - f'(0) \ge f''(0)^{1/2}\,\frac{f''(0)^{1/2}}{1 + f''(0)^{1/2}} = \frac{f''(0)}{1 + f''(0)^{1/2}},$$
or
$$\big\langle\nabla F(y) - \nabla F(x), y - x\big\rangle \ge \frac{\|y - x\|_x^2}{1 + \|y - x\|_x}. \qquad (195)$$

Fourth integration. In order to establish bounds for $F(y) - F(x)$ it is enough to integrate the inequalities (193). Taking the integral of the right inequality in (193), we obtain
$$f(s) \le f(0) + f'(0)s + \omega^*\big(f''(0)^{1/2}s\big) = f(0) + f'(0)s - f''(0)^{1/2}s - \ln\big(1 - f''(0)^{1/2}s\big) = U(s). \qquad (196)$$
In other words, $U(s)$ is an upper bound for $f(s)$ on the interval $[0, f''(0)^{-1/2})$. Recall that $f''(0)^{1/2} = \|y - x\|_x > 0$. For $s = 1$, from (196) it follows that
$$f(1) - f(0) \le f'(0) + \omega^*\big(f''(0)^{1/2}\big) = f'(0) + \omega^*\big(\|y - x\|_x\big). \qquad (197)$$
Keeping in mind that $f(0) = F(x)$ and $f(1) = F(y)$, from (197) we get
$$F(y) - F(x) \le \big\langle\nabla F(x), y - x\big\rangle + \omega^*\big(\|y - x\|_x\big). \qquad (198)$$
Integration of the left inequality in (193) leads to the lower bound $L(s)$ for $f(s)$:
$$f(s) \ge f(0) + f'(0)s + \omega\big(f''(0)^{1/2}s\big) = f(0) + f'(0)s + f''(0)^{1/2}s - \ln\big(1 + f''(0)^{1/2}s\big) = L(s), \qquad s \ge 0. \qquad (199)$$
For $s = 1$ we have $f(1) - f(0) \ge f'(0) + \omega\big(f''(0)^{1/2}\big)$, or
$$F(y) - F(x) \ge \big\langle\nabla F(x), y - x\big\rangle + \omega\big(\|y - x\|_x\big). \qquad (200)$$
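As an added illustration (this example is not part of the original text), the "almost quadratic" bound (192) is easy to verify by hand for the standard one-dimensional self-concordant function $F(x) = -\ln x$ with $\operatorname{dom} F = (0, \infty)$:

```latex
% Added worked example: F(x) = -ln x, F''(x) = 1/x^2, local norm ||y - x||_x = |y - x|/x.
% Take y with ||y - x||_x = r < 1, i.e. y = x(1 + \theta) with |\theta| = r; then y > 0 and
\[
\nabla^2 F(y) = \frac{1}{x^2(1+\theta)^2},
\qquad
(1-r)^2\,\nabla^2 F(x)
\;\preceq\;
\nabla^2 F(y)
\;\preceq\;
\frac{1}{(1-r)^2}\,\nabla^2 F(x),
\]
% because (1-r)^2(1+\theta)^2 \le (1-r^2)^2 \le 1 and (1+\theta)^2 \ge (1-r)^2,
% which is exactly the bound (192); it also shows y stays in dom F whenever r < 1.
```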
We conclude the section by considering the existence of the minimizer
$$x^* = \arg\min\{F(x) : x \in \operatorname{dom} F\} \qquad (201)$$
for a self-concordant function $F$. It follows from (173) that the Hessian $\nabla^2 F(x)$ is positive definite for any $x \in \operatorname{dom} F$, but the existence of $x^*$ with $\nabla F(x^*) = 0$ does not follow from the strict convexity of $F$. Strict convexity does, however, guarantee the existence of the local norm $\|v\|_x = \langle\nabla^2 F(x)v, v\rangle^{1/2} > 0$ at any $x \in \operatorname{dom} F$. For $v = \nabla F(x)$ one obtains the following scaled norm of the gradient $\nabla F(x)$:
$$\lambda(x) = \big\langle[\nabla^2 F(x)]^{-1}\nabla F(x), \nabla F(x)\big\rangle^{1/2} = \|\nabla F(x)\|_x^* > 0,$$
where $\|g\|_x^* = \langle[\nabla^2 F(x)]^{-1}g, g\rangle^{1/2}$ denotes the local dual norm. The quantity $\lambda(x)$ plays an important role in SC theory; it is called the Newton decrement of $F$ at the point $x \in \operatorname{dom} F$.

Theorem 14. If $\lambda(x) < 1$ for some $x \in \operatorname{dom} F$, then the minimizer $x^*$ in (201) exists.

Proof. For $u = y - x \ne 0$ and $v = \nabla F(x)$, where $x, y \in \operatorname{dom} F$, from the Cauchy-Schwarz inequality $|\langle u, v\rangle| \le \|v\|_x^*\,\|u\|_x$ it follows that
$$|\langle\nabla F(x), y - x\rangle| \le \lambda(x)\,\|y - x\|_x. \qquad (202)$$
From (200), (202), and the formula for $\lambda(x)$ it follows that
$$F(y) - F(x) \ge -\lambda(x)\,\|y - x\|_x + \omega\big(\|y - x\|_x\big).$$
Therefore, for any $y \in \mathcal{L}(x) = \{y \in \mathbb{R}^n : F(y) \le F(x)\}$ we have
$$\omega\big(\|y - x\|_x\big) \le \lambda(x)\,\|y - x\|_x,$$
i.e.
$$\frac{1}{\|y - x\|_x}\,\omega\big(\|y - x\|_x\big) \le \lambda(x) < 1.$$
From the definition of $\omega(t)$ it follows that
$$1 - \frac{1}{\|y - x\|_x}\,\ln\big(1 + \|y - x\|_x\big) \le \lambda(x) < 1.$$
The function $1 - \tau^{-1}\ln(1 + \tau)$ is monotone increasing for $\tau > 0$; therefore, for a given $0 < \lambda(x) < 1$, the equation
$$\lambda(x) = 1 - \tau^{-1}\ln(1 + \tau)$$
has a unique root $\bar{\tau} > 0$. Thus, for any $y \in \mathcal{L}(x)$ we have $\|y - x\|_x \le \bar{\tau}$, i.e. the level set $\mathcal{L}(x)$ at $x \in \operatorname{dom} F$ is bounded, and it is closed due to the continuity of $F$. Therefore $x^*$ exists by the Weierstrass theorem. The minimizer $x^*$ is unique due to the strict convexity of $F(x)$ for $x \in \operatorname{dom} F$.

The theorem presents an interesting result: a local condition $\lambda(x) < 1$ at some $x \in \operatorname{dom} F$ guarantees the existence of $x^*$, which is a global property of $F$ on $\operatorname{dom} F$. The condition $0 < \lambda(x) < 1$ will play an important role later.

Let us briefly summarize the basic properties of the SC functions established so far.
1. The SC function $F$ is a barrier function on $\operatorname{dom} F$.
2. For any $x \in \operatorname{dom} F$ and any $0 < r < 1$, Dikin's ellipsoid $E(x, r) = \{y : \|y - x\|_x^2 \le r\}$ is contained in $\operatorname{dom} F$.
3. For any $x \in \operatorname{dom} F$ and small enough $0 < r < 1$, the function $F$ is almost quadratic inside the Dikin ellipsoid $E(x, r)$, due to the bounds (192).
4. The gradient $\nabla F$ is a strictly monotone operator on $\operatorname{dom} F$, with upper and lower monotonicity bounds given by (194) and (195).
5. For any $x \in \operatorname{dom} F$ and any direction $u = y - x$, the restriction $f(s) = F(x + s(y - x))$ is bounded by $U(s)$ and $L(s)$ (see (196) and (199)).
6. The condition $0 < \lambda(x) < 1$ at any $x \in \operatorname{dom} F$ guarantees the existence of a unique minimizer $x^*$ on $\operatorname{dom} F$ (see the sketch below).

It is quite remarkable that practically all important properties of SC functions follow from a single differential inequality (176), which is a direct consequence of the boundedness of LEINV($f$).
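To make the Newton decrement concrete, here is a minimal numerical sketch (not from the book): it evaluates $\lambda(x)$ for the log-barrier of a polytope $\{x : Ax < b\}$, a standard self-concordant function. The helper names, the data $A$, $b$, and the test point are illustrative assumptions introduced here, not the author's code.

```python
import numpy as np

def newton_decrement(grad, hess):
    # lambda(x) = <[Hess F(x)]^{-1} grad F(x), grad F(x)>^{1/2}
    v = np.linalg.solve(hess, grad)
    return float(np.sqrt(grad @ v))

# Illustrative data: log-barrier F(x) = -sum_i ln(b_i - a_i^T x) of the polytope {x : Ax < b}.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])
b = np.array([1.0, 1.0, 1.0])

def barrier_grad_hess(x):
    s = b - A @ x                          # slacks b_i - a_i^T x, positive for strictly feasible x
    grad = A.T @ (1.0 / s)
    hess = A.T @ np.diag(1.0 / s**2) @ A
    return grad, hess

x = np.array([0.2, -0.3])                  # a strictly feasible point
g, H = barrier_grad_hess(x)
print(newton_decrement(g, H))              # lambda(x) < 1 already guarantees that x* exists (Theorem 14)
```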
We conclude the section by showing that the Newton method can be very efficient for the global minimization of SC functions, in spite of the fact that $F$ is not strongly convex.

7.2 Damped Newton Method for Minimization of SC Functions

SC functions are strictly convex on $\operatorname{dom} F$. Such a property, generally speaking, does not guarantee global convergence of the Newton method. For example, $f(t) = \sqrt{1 + t^2}$ is strictly convex, but the Newton method for finding $\min_t f(t)$ diverges from any starting point $t_0 \notin (-1, 1)$. It turns out that the SC properties guarantee convergence of a special damped Newton method from any starting point. Moreover, such a method goes through three phases. In the first phase each step reduces the error bound $\Delta(x) = F(x) - F(x^*)$ by a constant, which is independent of $x \in \operatorname{dom} F$. In the second phase the error bound converges to zero with at least a superlinear rate. The superlinear rate is characterized explicitly through $\omega(\lambda)$ and its LET $\omega^*(\lambda)$, where $0 < \lambda < 1$ is the Newton decrement. In the final phase the damped Newton method practically turns into the standard Newton method, and the error bound converges to zero with quadratic rate.

The following bounds for the restriction $f(s) = F(x + su)$ at $x \in \operatorname{dom} F$ in the direction $u = y - x \in \mathbb{R}^n \setminus \{0\}$ are our main tool:
$$L(s) \le f(s) \le U(s), \qquad 0 \le s \le f''(0)^{-1/2}. \qquad (203)$$
Let $x \in \operatorname{dom} F$, $f(0) = F(x)$, and $x \ne x^*$; then there exists $y \in \operatorname{dom} F$ such that for $u = y - x \ne 0$ we have
$$\text{(a)}\ \ f'(0) = \langle \nabla F(x), u \rangle < 0, \qquad \text{(b)}\ \ f''(0) = \langle \nabla^2 F(x)u, u \rangle = \|u\|_x^2 = d^2 > 0. \qquad (204)$$
We would like to estimate the reduction of $F$ as a result of one Newton step with $x \in \operatorname{dom} F$ as a starting point. Let us consider the upper bound
$$U(s) = f(0) + f'(0)s - ds - \ln(1 - ds)$$
for $f(s)$. The function $U(s)$ is strongly convex in $s$ on $[0, d^{-1})$. Also, $U'(0) = f'(0) < 0$ and $U'(s) \to \infty$ as $s \to d^{-1}$. Therefore the equation
$$U'(s) = f'(0) - d + d(1 - ds)^{-1} = 0 \qquad (205)$$
has a unique solution $\bar{s} \in [0, d^{-1})$, which is the unconstrained minimizer of $U(s)$. From (205) we have
$$\bar{s} = \frac{\Delta}{1 + \lambda}, \qquad \text{where } \Delta = -\frac{f'(0)}{f''(0)} = -f'(0)d^{-2} \ \text{ and } \ \lambda = -f'(0)d^{-1} > 0.$$
On the other hand, the unconstrained minimizer $\bar{s}$ is the result of one step of the damped Newton method for finding $\min U(s)$, with step length $t = (1 + \lambda)^{-1}$ and $s = 0$ as a starting point. It is easy to see that
$$U\!\left(\frac{\Delta}{1 + \lambda}\right) = f(0) - \omega(\lambda).$$
From the right inequality in (203) we obtain
$$f\!\left(\frac{\Delta}{1 + \lambda}\right) \le f(0) - \omega(\lambda). \qquad (206)$$
Keeping in mind (204), for the Newton direction $u = y - x = -[\nabla^2 F(x)]^{-1}\nabla F(x)$ we obtain
$$\Delta = -\frac{f'(0)}{f''(0)} = -\frac{\langle \nabla F(x), u\rangle}{\langle \nabla^2 F(x)u, u\rangle} = 1.$$
In view of $f(0) = F(x)$, we can rewrite (206) as follows:
$$F\big(x - (1 + \lambda)^{-1}[\nabla^2 F(x)]^{-1}\nabla F(x)\big) \le F(x) - \omega(\lambda). \qquad (207)$$
In other words, finding an unconstrained minimizer of the upper bound $U(s)$ is equivalent to one step of the damped Newton method
$$x_{k+1} = x_k - \big(1 + \lambda(x_k)\big)^{-1}[\nabla^2 F(x_k)]^{-1}\nabla F(x_k) \qquad (208)$$
for the minimization of $F(x)$ on $\operatorname{dom} F$. Moreover, our considerations are independent of the starting point $x \in \operatorname{dom} F$. Therefore, for any starting point $x_0 \in \operatorname{dom} F$ and $k \ge 1$, we have
$$F(x_{k+1}) \le F(x_k) - \omega(\lambda). \qquad (209)$$
The bound (209) is universal, i.e. it is true for any $x_k \in \operatorname{dom} F$. Let us compute $\lambda = -f'(0)f''(0)^{-1/2}$ for the Newton direction $u = -[\nabla^2 F(x)]^{-1}\nabla F(x)$. We have
$$\lambda = -\frac{f'(0)}{f''(0)^{1/2}} = -\frac{\langle \nabla F(x), u\rangle}{\langle \nabla^2 F(x)u, u\rangle^{1/2}} = \big\langle [\nabla^2 F(x)]^{-1}\nabla F(x), \nabla F(x)\big\rangle^{1/2} = \lambda(x).$$
We have already seen that it is critical that $0 < \lambda(x_k) < 1$, $k \ge 0$. The function $\omega(t) = t - \ln(1 + t)$ is monotone increasing; therefore, for a small $\beta > 0$ and $\lambda(x) \ge \beta$, from (209) we obtain a reduction of $F(x)$ by the constant $\omega(\beta)$ at each damped Newton step. Therefore, the number of damped Newton steps is bounded by
$$N \le \big(\omega(\beta)\big)^{-1}\big(F(x_0) - F(x^*)\big).$$
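As a small added numerical illustration (not in the original text), take $\beta = 0.4$; the guaranteed decrease per damped Newton step and the resulting bound on the number of steps are

```latex
% Added numerical illustration of the per-step progress bound.
\[
\omega(0.4) = 0.4 - \ln 1.4 \approx 0.0635,
\qquad
N \le \frac{F(x_0) - F(x^*)}{\omega(0.4)} \approx 15.8\,\bigl(F(x_0) - F(x^*)\bigr),
\]
% i.e. while \lambda(x_k) \ge 0.4, every damped Newton step decreases F by at least 0.0635.
```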
The bound (209), however, can be substantially improved for $x \in S(x^*, r) = \{x \in \operatorname{dom} F : F(x) - F(x^*) \le r\}$ and $0 < r < 1$. Let us consider the lower bound
$$L(s) = f(0) + f'(0)s + ds - \ln(1 + ds) \le f(s), \qquad s \ge 0.$$
The function $L(s)$ is strictly convex for $s \ge 0$. If $0 < \lambda = -f'(0)d^{-1} < 1$, then from $L'(s) = f'(0) + d - d(1 + ds)^{-1} = 0$ we find
$$\bar{\bar{s}} = \frac{\Delta}{1 - \lambda} = \arg\min\{L(s) : s \ge 0\}$$
and $L(\bar{\bar{s}}) = f(0) - \omega^*(\lambda)$.

[Fig. 2 (not reproduced here): the lower and upper bounds $L(s) \le f(s) \le U(s)$ of a self-concordant function, with the minimizers $\bar{s} = \Delta/(1+\lambda)$ of $U$, $\bar{\bar{s}} = \Delta/(1-\lambda)$ of $L$, $s^*$ of $f$, and the values $\omega(\lambda)$ and $\omega(-\lambda) = \omega^*(\lambda)$.]

Along with $\bar{s}$ and $\bar{\bar{s}}$ we consider (see Fig. 2)
$$s^* = \arg\min\{f(s) : s \ge 0\}.$$
For a small $0 < r < 1$ and $x \in S(x^*, r)$ we have $f(0) - f(s^*) < 1$, hence $f(0) - f(\bar{s}) < 1$. The relative progress per step is more convenient to measure on the logarithmic scale
$$\kappa = \frac{\ln\big(f(\bar{s}) - f(s^*)\big)}{\ln\big(f(0) - f(s^*)\big)}.$$
From $\omega(\lambda) \le f(0) - f(s^*) < 1$ it follows that $0 < -\ln\big(f(0) - f(s^*)\big) \le -\ln\omega(\lambda)$. From $f(\bar{s}) \le f(0) - \omega(\lambda)$ and $f(s^*) \ge f(0) - \omega^*(\lambda)$ it follows (see Fig. 2) that
$$f(\bar{s}) - f(s^*) \le \omega^*(\lambda) - \omega(\lambda).$$
Hence $\ln\big(f(\bar{s}) - f(s^*)\big) \le \ln\big(\omega^*(\lambda) - \omega(\lambda)\big)$, and
$$\kappa \ge \kappa(\lambda) = \frac{\ln\big(\omega^*(\lambda) - \omega(\lambda)\big)}{\ln\omega(\lambda)}.$$
For $0 < \lambda \le 0.5$ this yields an explicit lower bound on $\kappa(\lambda)$; in particular, $\kappa(0.40) \approx 1.09$. Thus the sequence $\{x_k\}_{k \ge 0}$ generated by the damped Newton method (208) with $\lambda(x_k) = 0.40$ converges in value with at least the superlinear rate $1.09$; that is, for the error bound $\Delta(x_k) = F(x_k) - F(x^*) < 1$ we have
$$\Delta(x_{k+1}) \le \big(\Delta(x_k)\big)^{1.09}.$$
Since $\lim_{k\to\infty}\lambda(x_k) = 0$, from some point on method (208) practically turns into the classical Newton method
$$x_{k+1} = x_k - [\nabla^2 F(x_k)]^{-1}\nabla F(x_k), \qquad (210)$$
which converges with quadratic rate. Instead of waiting for this to happen, there is a way of switching, at some point, from (208) to (210) and guaranteeing that from this point on only the Newton method (210) is used. Using such a strategy it is possible to achieve quadratic convergence earlier. The following theorem characterizes the neighborhood of the solution in which the quadratic convergence takes place.

Theorem 15. Let $x \in \operatorname{dom} F$ and $\lambda = \lambda(x) = \langle[\nabla^2 F(x)]^{-1}\nabla F(x), \nabla F(x)\rangle^{1/2} < 1$; then the point
$$\hat{x} = x - [\nabla^2 F(x)]^{-1}\nabla F(x) \qquad (211)$$
belongs to $\operatorname{dom} F$, and the following bound holds:
$$\lambda(\hat{x}) \le \left(\frac{\lambda(x)}{1 - \lambda(x)}\right)^{2}. \qquad (212)$$

Proof. Let $p = \hat{x} - x = -[\nabla^2 F(x)]^{-1}\nabla F(x)$. Then
$$\|p\|_x = \langle \nabla^2 F(x)p, p\rangle^{1/2} = \langle[\nabla^2 F(x)]^{-1}\nabla F(x), \nabla F(x)\rangle^{1/2} = \lambda(x) = \lambda < 1,$$
therefore $\hat{x} \in \operatorname{dom} F$. First of all, note that if $A = A^T \succ 0$, $B = B^T \succ 0$ and $A \preceq B$, then $B^{-1} \preceq A^{-1}$. For $y = \hat{x}$, from the left inequality in (187) we obtain
$$\lambda(\hat{x}) = \|\nabla F(\hat{x})\|_{\hat{x}}^* \le \frac{1}{1 - \|p\|_x}\,\big\langle[\nabla^2 F(x)]^{-1}\nabla F(\hat{x}), \nabla F(\hat{x})\big\rangle^{1/2} = \frac{1}{1 - \lambda}\,\|\nabla F(\hat{x})\|_x^*.$$
We can rewrite (211) as $\nabla^2 F(x)(\hat{x} - x) + \nabla F(x) = 0$. Therefore
$$\nabla F(\hat{x}) = \nabla F(\hat{x}) - \nabla F(x) - \nabla^2 F(x)(\hat{x} - x).$$
Then, using (188) and formula (185) (see [45]), we obtain
$$\nabla F(\hat{x}) - \nabla F(x) = \int_0^1 \nabla^2 F\big(x + \tau(\hat{x} - x)\big)(\hat{x} - x)\,d\tau = G(\hat{x} - x) = Gp.$$
Hence $\nabla F(\hat{x}) = \big(G - \nabla^2 F(x)\big)p = \hat{G}p$, where $\hat{G} = \hat{G}^T = G - \nabla^2 F(x)$. From the Cauchy-Schwarz inequality it follows that
$$\big(\|\nabla F(\hat{x})\|_x^*\big)^2 = \big\langle[\nabla^2 F(x)]^{-1}\hat{G}p, \hat{G}p\big\rangle = \big\langle\hat{G}[\nabla^2 F(x)]^{-1}\hat{G}p, p\big\rangle \le \big\|\hat{G}[\nabla^2 F(x)]^{-1}\hat{G}p\big\|_x^*\,\|p\|_x. \qquad (213)$$
With $H(x) = [\nabla^2 F(x)]^{-1/2}\hat{G}[\nabla^2 F(x)]^{-1/2}$ we have
$$\big\|\hat{G}[\nabla^2 F(x)]^{-1}\hat{G}p\big\|_x^* \le \|H(x)\|\,\|\hat{G}p\|_x^* = \|H(x)\|\,\|\nabla F(\hat{x})\|_x^*.$$
From (213) and the last inequality we obtain
$$\|\nabla F(\hat{x})\|_x^* \le \|H(x)\|\,\|p\|_x = \|H(x)\|\,\lambda.$$
It follows from (191) that
$$\Big(-\lambda + \frac{\lambda^2}{3}\Big)\nabla^2 F(x) \preceq \hat{G} = G - \nabla^2 F(x) \preceq \frac{\lambda}{1 - \lambda}\,\nabla^2 F(x),$$
hence $\|H(x)\| \le \max\{\lambda - \lambda^2/3,\ \lambda/(1 - \lambda)\} = \lambda/(1 - \lambda)$. Therefore
$$\lambda(\hat{x}) \le \frac{1}{1 - \lambda}\,\|\nabla F(\hat{x})\|_x^* \le \frac{1}{1 - \lambda}\cdot\frac{\lambda}{1 - \lambda}\cdot\lambda = \left(\frac{\lambda}{1 - \lambda}\right)^2,$$
which proves (212).

We have already seen that $\lambda = \lambda(x) < 1$ is the main ingredient for the damped Newton method (208) to converge. To retain the same condition for $\lambda(\hat{x})$, it is sufficient to require $\lambda(\hat{x}) \le \big(\lambda/(1 - \lambda)\big)^2 \le \lambda$, i.e. $\lambda/(1 - \lambda)^2 \le 1$. The function $\lambda/(1 - \lambda)^2$ is positive and monotone increasing on $[0, 1)$; therefore, to find an upper bound for $\lambda$ it is enough to solve the equation $\lambda/(1 - \lambda)^2 = 1$. In other words, for any $\lambda = \lambda(x) < \bar{\lambda} = (3 - \sqrt{5})/2$ we have
$$\lambda(\hat{x}) \le \left(\frac{\lambda}{1 - \lambda}\right)^2 < \lambda(x).$$
Thus, the damped Newton method (208) follows three major stages in terms of the rate of convergence. First, it reduces the function value by a constant at each step; then it converges with superlinear rate; and, finally, in the neighborhood of the solution it converges with quadratic rate. The Newton area, where the Newton method converges with the quadratic rate, is defined as follows:
$$N(x^*, \beta) = \Big\{x : \lambda(x) = \|\nabla F(x)\|_x^* \le \beta < \bar{\lambda} = \frac{3 - \sqrt{5}}{2}\Big\}. \qquad (214)$$
To speed up the damped Newton method (208) one can use the following switching strategy: for a given $0 < \beta < \bar{\lambda} = (3 - \sqrt{5})/2$, one uses the damped Newton method (208) if $\lambda(x_k) > \beta$ and the "pure" Newton method (210) when $\lambda(x_k) \le \beta$.
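The scheme (208)/(210) with the switching rule can be summarized in a short sketch; this is an illustrative implementation written for this text under stated assumptions, not the author's code, and it reuses the hypothetical `barrier_grad_hess` helper from the earlier example.

```python
import numpy as np

def damped_newton(x, grad_hess, beta=0.38, tol=1e-10, max_iter=100):
    # Damped Newton method (208), switched to the pure Newton method (210)
    # once lambda(x_k) <= beta < (3 - sqrt(5))/2, i.e. inside the Newton area (214).
    for _ in range(max_iter):
        g, H = grad_hess(x)
        p = np.linalg.solve(H, g)          # Newton direction [Hess F]^{-1} grad F
        lam = float(np.sqrt(g @ p))        # Newton decrement lambda(x_k)
        if lam <= tol:
            break
        if lam > beta:
            x = x - p / (1.0 + lam)        # damped step (208): F decreases by at least omega(lam)
        else:
            x = x - p                      # pure Newton step (210): quadratic rate by Theorem 15
    return x

# Usage with the barrier_grad_hess sketch from the earlier example:
# x_star = damped_newton(np.array([0.2, -0.3]), barrier_grad_hess)
```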
Concluding Remarks

The LEID is a universal instrument for establishing the duality results for SUMT, NR, and LT methods. The duality results, in turn, are critical both for understanding the convergence mechanisms and for the convergence analysis. In particular, the update formula (107) and the concavity of the dual function $d$ lead to the following bound:
$$d(\lambda^{s+1}) - d(\lambda^s) \ge (kL)^{-1}\,\|\lambda^{s+1} - \lambda^s\|^2,$$
which together with $d(\lambda^{s+1}) - d(\lambda^s) \to 0$ shows that the Lagrange multipliers do not change much from some point on. It means that if the Newton method is used for the primal minimization then, from some point on, usually after very few Lagrange multiplier updates, the approximation of the primal minimizer $x^s$ is in the Newton area of the next minimizer $x^{s+1}$. Therefore it takes few, and from some point on only one, Newton step to find the next primal approximation and update the Lagrange multipliers. This phenomenon is called the "hot" start (see [46]). The neighborhood of the solution where the "hot" start occurs has been characterized in [38] and observed in [5, 10, 25, 41].

It follows from the remark above that, under the standard second-order optimality condition, each Lagrange multiplier update shrinks the distance between the current and the optimal solution by a factor which can be made as small as one wants by increasing $k > 0$. In contrast to SUMT, the NR methods require much less computational effort per digit of accuracy at the end of the process than at the beginning. Therefore NR methods are used when high accuracy is needed (see, for example, [1]). One of the most important features of NR methods is their numerical stability. It is due to the stability of the Newton area, which does not shrink to a point in the final phase. Therefore one of the most reliable NLP solvers, PENNON, is based on NR methods (see [32–34]). The NR method with truncated MBF transformation has been widely used both for testing NLP software and for solving real-life problems (see [1, 5, 10, 25, 32–34, 38, 41]). The numerical results obtained strongly support the theory, including the "hot" start phenomenon.

The NR as well as the LT are primal exterior-point methods. Their dual equivalents are interior-point methods. In particular, the LT with the MBF transform $\psi(t) = \ln(t + 1)$ leads to the interior prox with Bregman distance, which is based on the self-concordant MBF kernel $\varphi(s) = -\psi^*(s) = -\ln s + s - 1$. Application of this LT to LP calculations leads to a Dikin-type interior-point method for the dual LP. It establishes, eventually, a remarkable connection between exterior- and interior-point methods (see [37, 49]). On the other hand, the LEINV is at the heart of SC theory, one of the most beautiful chapters of modern optimization. Although the Legendre transformation was introduced more than 200 years ago, we saw that LEID and LEINV are still critical in modern optimization, both constrained and unconstrained.

References

1. Alber, M., Reemtsen, R.: Intensity modulated radiotherapy treatment planning by use of a barrier-penalty multiplier method. Optim Methods Softw 22(3), 391–411 (2007)
2. Antipin, A.S.: Methods of Nonlinear Programming Based on the Direct and Dual Augmentation of the Lagrangian. VNIISI, Moscow (1979)
3. Auslender, R., Cominetti, R., Haddou, M.: Asymptotic analysis for penalty and barrier methods in convex and linear programming. Math Oper Res 22(1), 43–62 (1997)
4. Bauschke, H., Matouskova, E., Reich, S.: Projection and proximal point methods, convergence results and counterexamples. Nonlinear Anal 56(5), 715–738 (2004)
5. Ben-Tal, A., Nemirovski, A.: Optimal design of engineering structures. Optima 47, 4–9 (1995)
6. Ben-Tal, A., Zibulevski, M.: Penalty-barrier methods for convex programming problems. SIAM J Optim 7, 347–366 (1997)
7. Bertsekas, D.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York (1982)
8. Bregman, L.: The relaxation method for finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput Math Math Phys 7, 200–217 (1967)
9. Bregman, L., Censor, Y., Reich, S.: Dykstra algorithm as the nonlinear extension of Bregman's optimization method. J Convex Anal 6(2), 319–333 (1999)
10. Breitfeld, M., Shanno, D.: Computational experience with modified log-barrier methods for nonlinear programming. Ann Oper Res 62, 439–464 (1996)
11. Byrne, C., Censor, Y.: Proximity function minimization using multiple Bregman projections with application to split feasibility and Kullback-Leibler distance minimization. Ann Oper Res 105, 77–98 (2001)
12. Carroll, C.: The created response surface technique for optimizing nonlinear-restrained systems. Oper Res 9(2), 169–184 (1961)
13. Censor, Y., Zenios, S.: The proximal minimization algorithm with d-functions. J Optim Theory Appl 73, 451–464 (1992)
14. Chen, C., Mangasarian, O.L.: Smoothing methods for convex inequalities and linear complementarity problems. Math Program 71, 51–69 (1995)
15. Chen, G., Teboulle, M.: Convergence analysis of a proximal-like minimization algorithm using Bregman functions. SIAM J Optim 3(4), 538–543 (1993)
16. Courant, R.: Variational methods for the solution of problems of equilibrium and vibrations. Bull Am Math Soc 49, 1–23 (1943)
17. Daube-Witherspoon, M., Muehllehner, G.: An iterative space reconstruction algorithm suitable for volume ECT. IEEE Trans Med Imaging 5, 61–66 (1986)
18. Dikin, I.: Iterative solutions of linear and quadratic programming problems. Sov Math Dokl 8, 674–675 (1967)
19. Eckstein, J.: Nonlinear proximal point algorithms using Bregman functions with applications to convex programming. Math Oper Res 18(1), 202–226 (1993)
20. Eggermont, P.: Multiplicative iterative algorithm for convex programming. Linear Algebra Appl 130, 25–32 (1990)
21. Ekeland, I.: Legendre duality in nonconvex optimization and calculus of variations. SIAM J Control Optim 16(6), 905–934 (1977)
22. Fiacco, A., Mc Cormick, G.: Nonlinear Programming, Sequential Unconstrained Minimization Techniques. SIAM, Philadelphia (1990)
23. Frisch, K.: The logarithmic potential method for convex programming. Memorandum of May 13, 1955, University Institute of Economics, Oslo (1955)
24. Goldshtein, E., Tretiakov, N.: Modified Lagrangian Functions. Nauka, Moscow (1989)
25. Griva, I., Polyak, R.: Primal-dual nonlinear rescaling method with dynamic scaling parameter update. Math Program Ser A 106, 237–259 (2006)
26. Griva, I., Polyak, R.: Proximal point nonlinear rescaling method for convex optimization. Numer Algebra Control Optim 1(3), 283–299 (2013)
27. Guler, O.: On the convergence of the proximal point algorithm for convex minimization. SIAM J Control Optim 29, 403–419 (1991)
28. Hestenes, M.R.: Multipliers and gradient methods. J Optim Theory Appl 4, 303–320 (1969)
29. Hiriat-Urruty, J., Martinez-Legaz, J.: New formulas for the Legendre-Fenchel transform. J Math Anal Appl 288, 544–555 (2003)
30. Ioffe, A., Tichomirov, V.: Duality of convex functions and extremum problems. Uspexi Mat Nauk 23(6)(144), 51–116 (1968)
31. Jensen, D., Polyak, R.: The convergence of a modified barrier method for convex programming. IBM J Res Dev 38(3), 307–321 (1999)
32. Kocvara, M., Stingl, M.: PENNON: a code for convex nonlinear and semidefinite programming. Optim Methods Softw 18(3), 317–333 (2003)
33. Kocvara, M., Stingl, M.: Recent progress in the NLP-SDP code PENNON. In: Workshop Optimization and Applications, Oberwalfach (2005)
34. Kocvara, M., Stingl, M.: On the solution of large-scale SDP problems by the modified barrier method using iterative solver. Math Program Ser B 109, 413–444 (2007)
35. Martinet, B.: Regularization d'inequations variationelles par approximations successive. Rev Fr Inf Rech Oper 4(R3), 154–159 (1970)
36. Martinet, B.: Determination approchee d'un point fixe d'une application pseudo-contractante. C.R. Acad Sci Paris 274(2), 163–165 (1972)
37. Matioli, L., Gonzaga, C.: A new family of penalties for augmented Lagrangian methods. Numer Linear Algebra Appl 15, 925–944 (2008)
38. Melman, A., Polyak, R.: The Newton modified barrier method for QP problems. Ann Oper Res 62, 465–519 (1996)
39. Moreau, J.: Proximite et dualite dans un espace Hilbertien. Bull Soc Math France 93, 273–299 (1965)
40. Motzkin, T.: New techniques for linear inequalities and optimization. In: Project SCOOP, Symposium on Linear Inequalities and Programming, Planning Research Division, Director of Management Analysis Service, U.S. Air Force, Washington, D.C., no. 10 (1952)
41. Nash, S., Polyak, R., Sofer, A.: A numerical comparison of barrier and modified barrier method for large scale bound-constrained optimization. In: Hager, W., Hearn, D., Pardalos, P. (eds.) Large Scale Optimization, State of the Art, pp. 319–338. Kluwer Academic, Dordrecht (1994)
42. Nesterov, Yu., Nemirovsky, A.: Interior Point Polynomial Algorithms in Convex Programming. SIAM, Philadelphia (1994)
43. Nesterov, Yu.: Introductory Lectures on Convex Optimization. Kluwer Academic, Norwell, MA (2004)
44. Polyak, B.: Iterative methods using Lagrange multipliers for solving extremal problems with constraints of the equation type. Comput Math Math Phys 10(5), 42–52 (1970)
45. Polyak, B.: Introduction to Optimization. Optimization Software, New York (1987)
46. Polyak, R.: Modified barrier functions (theory and methods). Math Program 54, 177–222 (1992)
47. Polyak, R.: Log-sigmoid multipliers method in constrained optimization. Ann Oper Res 101, 427–460 (2001)
48. Polyak, R.: Nonlinear rescaling vs smoothing technique in convex optimization. Math Program 92, 197–235 (2002)
49. Polyak, R.: Lagrangian transformation and interior ellipsoid methods in convex optimization. J Optim Theory Appl 163(3), 966–992 (2015)
50. Polyak, R., Teboulle, M.: Nonlinear rescaling and proximal-like methods in convex optimization. Math Program 76, 265–284 (1997)
51. Polyak, B., Tret'yakov, N.: The method of penalty estimates for conditional extremum problems. Comput Math Math Phys 13(1), 42–58 (1973)
52. Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher (ed.) Optimization, pp. 283–298. Academic Press, London (1969)
53. Powell, M.: Some convergence properties of the modified log barrier methods for linear programming. SIAM J Optim 50(4), 695–739 (1995)
54. Ray, A., Majumder, S.: Derivation of some new distributions in statistical mechanics using maximum entropy approach. Yugosl J Oper Res 24(1), 145–155 (2014)
55. Reich, S., Sabach, S.: Two strong convergence theorems for a proximal method in reflexive Banach spaces. Numer Funct Anal Optim 31, 22–44 (2010)
56. Rockafellar, R.T.: A dual approach to solving nonlinear programming problems by unconstrained minimization. Math Program 5, 354–373 (1973)
57. Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal points algorithms in convex programming. Math Oper Res 1, 97–116 (1976)
58. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J Control Optim 14, 877–898 (1976)
59. Teboulle, M.: Entropic proximal mappings with application to nonlinear programming. Math Oper Res 17, 670–690 (1992)
60. Tikhonov, A.N.: Solution of incorrectly formulated problems and the regularization method. Sov Math (Translated) 4, 1035–1038 (1963)
61. Tseng, P., Bertsekas, D.: On the convergence of the exponential multipliers method for convex programming. Math Program 60, 1–19 (1993)
62. Vardi, Y., Shepp, L., Kaufman, L.: A statistical model for positron emission tomography. J Am Stat Assoc 80, 8–38 (1985)

Ngày đăng: 14/05/2018, 15:01

Từ khóa liên quan

Mục lục

  • Preface

  • Acknowledgements

  • Contents

  • Contributors

  • A New Adaptive Conjugate Gradient Algorithm for Large-Scale Unconstrained Optimization

    • 1 Introduction

    • 2 The Algorithm

    • 3 Global Convergence Analysis

    • 4 Numerical Results and Comparisons

    • 5 Conclusions

    • References

    • On Methods of Terminal Control with Boundary-Value Problems: Lagrange Approach

      • 1 Introduction

      • 2 Problem Statement

      • 3 Classic Lagrangian for Original Problem

      • 4 Dual Lagrangian for Dual Problem

      • 5 Mutually Dual Problems

      • 6 Boundary-Value Dynamic Controlled System

      • 7 Saddle-Point Method for Solving Boundary Value Controlled System

      • 8 Proof of Method Convergence

      • 9 Conclusions

      • References

Tài liệu cùng người dùng

Tài liệu liên quan