Convex Functions and Their Applications: A Contemporary Approach, Second Edition

CMS Books in Mathematics
Canadian Mathematical Society / Société mathématique du Canada
Editors-in-Chief: K. Dilcher, K. Taylor
Advisory Board: M. Barlow, H. Bauschke, L. Edelstein-Keshet, N. Kamran, M. Kotchetov
More information about this series at http://www.springer.com/series/4318

Constantin P. Niculescu · Lars-Erik Persson
Convex Functions and Their Applications: A Contemporary Approach, Second Edition

Constantin P. Niculescu, Department of Mathematics, University of Craiova, Craiova, Romania, and Academy of Romanian Scientists, Bucharest, Romania
Lars-Erik Persson, UiT, The Arctic University of Norway, Campus Narvik, Norway, and Luleå University of Technology, Luleå, Sweden

ISSN 1613-5237; ISSN 2197-4152 (electronic)
ISBN 978-3-319-78336-9; ISBN 978-3-319-78337-6 (eBook)
https://doi.org/10.1007/978-3-319-78337-6
Library of Congress Control Number: 2018935865
Mathematics Subject Classification (2010): 26B25, 26D15, 46A55, 46B20, 46B40, 52A01, 52A40, 90C25
1st edition: © Springer Science+Business Media, Inc. 2006
2nd edition: © Springer International Publishing AG, part of Springer Nature 2018

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. Printed on acid-free paper. This Springer imprint is published by the registered company Springer International Publishing AG, part of Springer Nature. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Contents

Preface
List of Symbols

1 Convex Functions on Intervals
   1.1 Convex Functions at First Glance
   1.2 Young's Inequality and Its Consequences
   1.3 Log-convex Functions
   1.4 Smoothness Properties of Convex Functions
   1.5 Absolute Continuity of Convex Functions
   1.6 The Subdifferential
   1.7 The Integral Form of Jensen's Inequality
   1.8 Two More Applications of Jensen's Inequality
   1.9 Applications of Abel's Partial Summation Formula
   1.10 The Hermite–Hadamard Inequality
   1.11 Comments

2 Convex Sets in Real Linear Spaces
   2.1 Convex Sets
   2.2 The Orthogonal Projection
   2.3 Hyperplanes and Separation Theorems in Euclidean Spaces
   2.4 Ordered Linear Spaces
   2.5 Sym(N, R) as a Regularly Ordered Banach Space
   2.6 Comments

3 Convex Functions on a Normed Linear Space
   3.1 Generalities and Basic Examples
   3.2 Convex Functions and Convex Sets
   3.3 The Subdifferential
   3.4 Positively Homogeneous Convex Functions
   3.5 Inequalities Associated to Perspective Functions
   3.6 Directional Derivatives
   3.7 Differentiability of Convex Functions
   3.8 Differential Criteria of Convexity
   3.9 Jensen's Integral Inequality in the Context of Several Variables
   3.10 Extrema of Convex Functions
   3.11 The Prékopa–Leindler Inequality
   3.12 Comments

4 Convexity and Majorization
   4.1 The Hardy–Littlewood–Pólya Theory of Majorization
   4.2 The Schur–Horn Theorem
   4.3 Schur-Convexity
   4.4 Eigenvalue Inequalities
   4.5 Horn's Inequalities
   4.6 The Way of Hyperbolic Polynomials
   4.7 Vector Majorization in R^N
   4.8 Comments

5 Convexity in Spaces of Matrices
   5.1 Convex Spectral Functions
   5.2 Matrix Convexity
   5.3 The Trace Metric of Sym++(n, R)
   5.4 Geodesic Convexity in Global NPC Spaces
   5.5 Comments

6 Duality and Convex Optimization
   6.1 Legendre–Fenchel Duality
   6.2 The Correspondence of Properties under Duality
   6.3 The Convex Programming Problem
   6.4 Ky Fan Minimax Inequality
   6.5 Moreau–Yosida Approximation
   6.6 The Hopf–Lax Formula
   6.7 Comments

7 Special Topics in Majorization Theory
   7.1 Steffensen–Popoviciu Measures
   7.2 The Barycenter of a Steffensen–Popoviciu Measure
   7.3 Majorization via Choquet Order
   7.4 Choquet's Theorem
   7.5 The Hermite–Hadamard Inequality for Signed Measures
   7.6 Comments

A Generalized Convexity on Intervals
   A.1 Means
   A.2 Convexity According to a Pair of Means
   A.3 A Case Study: Convexity According to the Geometric Mean

B Background on Convex Sets
   B.1 The Hahn–Banach Extension Theorem
   B.2 Separation of Convex Sets
   B.3 The Krein–Milman Theorem

C Elementary Symmetric Functions
   C.1 Newton's Inequalities
   C.2 More Newton Inequalities
   C.3 Some Results of Bohnenblust, Marcus, and Lopes
   C.4 Symmetric Polynomial Majorization

D Second-Order Differentiability of Convex Functions
   D.1 Rademacher's Theorem
   D.2 Alexandrov's Theorem

E The Variational Approach of PDE
   E.1 The Minimum of Convex Functionals
   E.2 Preliminaries on Sobolev Spaces
   E.3 Applications to Elliptic Boundary-Value Problems
   E.4 The Galerkin Method

References
Index

Preface

Convexity is a simple and natural notion which can be traced back to Archimedes (circa 250 B.C.), in connection with his famous estimate of the value of π (by using inscribed and circumscribed regular polygons). He noticed the important fact that the perimeter of a convex figure is smaller than the perimeter of any other convex figure surrounding it. As a matter of fact, we experience convexity all the time and in many ways. The most prosaic example is our upright position, which is secured as long as the vertical projection of our center of gravity lies inside the convex envelope of our feet. Also, convexity has a great impact on our everyday life through numerous applications in industry, business, medicine, and art. So do the problems of optimum allocation of resources, estimation and signal processing, statistics, and finance, to name just a few.

The recognition of the subject of convex functions as one that deserves to be studied in its own right is generally ascribed to J. L. W. V. Jensen [230], [231]. However, he was not the first to deal with such functions. Among his predecessors we should recall here Ch. Hermite [213], O. Hölder [225], and O. Stolz [463]. During the whole twentieth century, there was intense research activity and many significant results were obtained in geometric functional analysis, mathematical economics, convex analysis, and nonlinear optimization. The classic book by G. H. Hardy, J. E. Littlewood, and G. Pólya [209] played a prominent role in the popularization of the subject of convex functions.

What motivates the constant interest for this subject?
First, its elegance and the possibility to prove deep results even with simple mathematical tools. Second, many problems raised by science, engineering, economics, informatics, etc. fall in the area of convex analysis. More and more mathematicians are seeking the hidden convexity, a way to unveil the true nature of certain intricate problems.

There are two basic properties of convex functions that make them so widely used in theoretical and applied mathematics: the maximum is attained at a boundary point, and any local minimum is a global one. Moreover, a strictly convex function admits at most one minimum. The modern viewpoint on convex functions entails a powerful and elegant interaction between analysis and geometry. In a memorable paper dedicated to the Brunn–Minkowski inequality, R. J. Gardner [176, p. 358] described this reality in beautiful phrases: [convexity] "appears like an octopus, tentacles reaching far and wide, its shape and color changing as it roams from one area to the next. It is quite clear that research opportunities abound."

Over the years a number of notable books dedicated to the theory and applications of convex functions appeared. We mention here: H. H. Bauschke and P. L. Combettes [34], J. M. Borwein and J. Vanderwerff [73], S. Boyd and L. Vandenberghe [75], J.-B. Hiriart-Urruty and C. Lemaréchal [218], L. Hörmander [227], M. A. Krasnosel'skii and Ya. B. Rutickii [259], J. E. Pečarić, F. Proschan and Y. C. Tong [388], R. R. Phelps [397], [398], A. W. Roberts and D. E. Varberg [420], R. T. Rockafellar [421], and B. Simon [450]. The references at the end of this book include many other fine books dedicated to one aspect or another of the theory.

The title of the book by L. Hörmander, Notions of Convexity, is very suggestive for the present state of the art. In fact, nowadays the study of convex functions has evolved into a larger theory about functions which are adapted to other geometries of the domain and/or obey other laws of comparison of means. Examples are log-convex functions, multiplicatively convex functions, subharmonic functions, and functions which are convex with respect to a subgroup of the linear group.

Our book aims to be a thorough introduction to contemporary convex function theory. It covers a large variety of subjects, from the one real variable case to the infinite-dimensional case, including Jensen's inequality and its ramifications, the Hardy–Littlewood–Pólya theory of majorization, the Borell–Brascamp–Lieb form of the Prékopa–Leindler inequality (as well as its connection with isoperimetric inequalities), the Legendre–Fenchel duality, Alexandrov's result on the second differentiability of convex functions, the highlights of Choquet's theory, and many more. It is certainly a book where inequalities play a central role, but in no case a book on inequalities. Many results are new, and the whole book reflects our own experiences, both in teaching and research.

The necessary background is advanced calculus, linear algebra, and some elements of real analysis. When necessary, the reader will be guided to the most pertinent sources, where the quoted results are presented as transparently as possible. This book may serve many purposes, ranging from honors options for undergraduate students to a one-semester graduate course on Convex Functions and Applications. For example, Chapter 1 and Appendix A offer a quick introduction to generalized convexity, a subject which became very popular during the last decades. The same combination works nicely as supplementary material for a seminar debating heuristics of mathematical research. Chapters 1–6 together with Appendix B could be used as a reference text for a graduate course. And the options can continue. In order to avoid any confusion relative to our notation, a symbol index was added for the convenience of the reader.

A word of caution is necessary. In this book, N = {0, 1, 2, ...} and N* = {1, 2, 3, ...}. According to a recent tendency in mathematical terminology, we call a number x positive if x ≥ 0 and strictly positive if x > 0. A function f is called increasing if x ≤ y implies f(x) ≤ f(y) and strictly increasing if x < y implies f(x) < f(y). Notice also that our book deals only with real linear spaces and all Borel measures under attention are assumed to be regular.

This Second Edition corrects a few errors and typos in the original and includes considerably more material emphasizing the rich applicability of convex analysis to concrete examples. Chapter 2, on convex sets in real linear spaces, is entirely new and, together with Appendix B (mostly dedicated to the Hahn–Banach separation theorems), assures all necessary background for a thorough presentation of the theory and applications of convex functions defined on linear normed spaces. The traditional section devoted to the existence of orthogonal projections in Hilbert spaces is supplemented here with the extension of all basic results to the case of uniformly convex spaces and followed by comments on Clarkson's inequalities. Our discussion on cones in Chapter 2 includes a special section devoted to the order properties of the space Sym(n, R) of all n × n symmetric matrices, an important example of a regularly ordered Banach space which is not a Banach lattice.

Chapter 3, devoted to convex functions on a linear normed space, appears here in a new form, including special sections on the interplay between convex functions and convex sets, the inequalities associated to perspective functions, the subdifferential calculus, and extrema of convex functions. Chapters 4–6 are entirely new. Chapter 4, devoted to the connection between convexity and majorization, includes not only the classical theory of majorization but also its connection with the Horn inequalities and the theory of hyperbolic polynomials (developed by L. Gårding). For the first time in a book, a detailed presentation of Sherman's theorem of majorization (as well as some of its further generalizations) is included. Chapter 5 deals with convexity in spaces of matrices. We focus here on three important topics: the theory of convex spectral functions (treating a special case of convexity under the presence of a symmetry group), matrix convexity (à la Löwner), and geodesic convexity in the space Sym++(n, R) of all n × n positive definite matrices endowed with the trace metric. Chapter 6, on duality and convex optimization, illustrates the power of convex analysis to provide useful tools in handling practical problems posed by science, engineering, economics, informatics, etc. Special attention is paid to the Legendre–Fenchel–Moreau duality theory and its applications. We also discuss the convex programming problem, the von Neumann and Ky Fan minimax theorems, the Moreau–Yosida regularization, and the implications of the theory of convex functions in deriving the Hopf–Lax formula for the Hamilton–Jacobi equation.

4.4 Eigenvalue Inequalities

4.4.10 Lemma. Suppose that $x = (x_1, \dots, x_N)$ and $y = (y_1, \dots, y_N)$ are two vectors in $\mathbb{R}^N$ such that $x_1 \ge \cdots \ge x_N \ge 0$ and $y_1 \ge \cdots \ge y_N \ge 0$. Then, for every doubly stochastic matrix $D \in M_N(\mathbb{R})$, we have the inequality $\langle Dx, y\rangle \le \langle x, y\rangle$.

Proof. This is an immediate consequence of the Hardy–Littlewood–Pólya rearrangement inequality 1.9.8 and of the Birkhoff Theorem 2.3.6.

Proof of Theorem 4.4.9. According to the singular value decomposition theorem, $A = USV^*$ and $B = XTY^*$, where $S$ and $T$ are diagonal matrices and $U, V, X, Y$ are orthogonal matrices. Then
\[
\operatorname{trace}(A^*B) = \operatorname{trace}(VSU^*XTY^*) = \operatorname{trace}(Y^*VSU^*XT) = \operatorname{trace}(Q^*SPT),
\]
where $P = U^*X = (p_{ij})_{i,j}$ and $Q = V^*Y = (q_{ij})_{i,j}$ are orthogonal matrices. Thus the matrices $(p_{ij}^2)_{i,j}$ and $(q_{ij}^2)_{i,j}$ are doubly stochastic and we have
\[
\operatorname{trace}(A^*B) = \sum_{i,j=1}^{N} s_i(A)s_j(B)\,q_{ij}p_{ij}
\le \frac{1}{2}\sum_{i,j=1}^{N} s_i(A)s_j(B)\,q_{ij}^2 + \frac{1}{2}\sum_{i,j=1}^{N} s_i(A)s_j(B)\,p_{ij}^2
\le \sum_{k=1}^{N} s_k^{\downarrow}(A)\,s_k^{\downarrow}(B),
\]
the last inequality being implied by Lemma 4.4.10.

4.4.11 Corollary. Any matrices $A, B \in \operatorname{Sym}(N, \mathbb{R})$ verify the inequality
\[
\operatorname{trace}(AB) \le \bigl\langle \lambda^{\downarrow}(A), \lambda^{\downarrow}(B)\bigr\rangle.
\]
Equality occurs if, and only if, there exists an orthogonal matrix $V$ such that $V^*AV = \operatorname{Diag}\lambda^{\downarrow}(A)$ and $V^*BV = \operatorname{Diag}\lambda^{\downarrow}(B)$.

Proof. Choose scalars $\alpha$ and $\beta$ such that $A + \alpha I \ge 0$ and $B + \beta I \ge 0$, and apply Theorem 4.4.9, taking into account that $\lambda^{\downarrow}(C) = s^{\downarrow}(C)$ if $C \ge 0$. A simple proof of the equality case is available in [278], Theorem 2.2.

Most of the above results extend easily to the framework of compact self-adjoint operators. A good starting point in this direction is offered by the book of B. Simon [449].

Exercises

1. Recall Ky Fan's maximum principle (4.12),
\[
\sum_{k=1}^{r} \lambda_k^{\downarrow}(A) = \max_{P} \operatorname{trace}(AP) \quad\text{for } r = 1, \dots, N,
\]
where the maximum is taken over all $r$-dimensional orthogonal projections $P$. Prove that this principle is equivalent to Schur's Lemma 4.2.2.

2. Consider a symmetric matrix $A \in \operatorname{Sym}(N, \mathbb{R})$ and its principal submatrix $B$, obtained from $A$ by deleting the last row and column of $A$. Prove that
\[
\Bigl(\lambda_1^{\downarrow}(B), \dots, \lambda_{N-1}^{\downarrow}(B), \sum_{i=1}^{N}\lambda_i^{\downarrow}(A) - \sum_{i=1}^{N-1}\lambda_i^{\downarrow}(B)\Bigr) \prec_{HLP} \bigl(\lambda_1^{\downarrow}(A), \dots, \lambda_N^{\downarrow}(A)\bigr).
\]

3. (A determinant inequality due to M. Lin) Assuming $A, B, C \in \operatorname{Sym}^{+}(N, \mathbb{R})$, prove the following matrix analogue of the Hlawka inequality:
\[
\det(A+B+C) + \det A + \det B + \det C \ge \det(A+B) + \det(B+C) + \det(C+A).
\]
(a) Show that the inequality is equivalent to its particular case where $C = I$.
(b) Prove the inequality in the case where $A$ and $B$ are diagonal matrices, that is,
\[
\prod_{i=1}^{N}(a_i + b_i + 1) + \prod_{i=1}^{N} a_i + \prod_{i=1}^{N} b_i + 1 \ge \prod_{i=1}^{N}(a_i + b_i) + \prod_{i=1}^{N}(1 + a_i) + \prod_{i=1}^{N}(1 + b_i).
\]
(c) Prove that the function $\prod_{i=1}^{N}(1 + x_i) - \prod_{i=1}^{N} x_i$ is Schur-concave on $\mathbb{R}^{N}_{+}$.
(d) Notice that $\det(A+B+I) - \det(A+B) = \prod_{i=1}^{N}(1 + \lambda_i(A+B)) - \prod_{i=1}^{N}\lambda_i(A+B)$ and conclude the proof using (in order) Ky Fan's inequality (4.14), the assertion (b), and next the assertion (a).

4. Infer from Cauchy's interlace theorem that a real symmetric matrix $A$ is strictly positive if and only if the determinants of all its leading minors are strictly positive.

5. Consider two interlaced sequences $a_1 \ge b_1 \ge a_2 \ge \cdots \ge b_{N-1} \ge a_N$. Prove that
\[
\Bigl(b_1, \dots, b_{N-1}, \sum_{i=1}^{N} a_i - \sum_{j=1}^{N-1} b_j\Bigr) \prec_{HLP} (a_1, \dots, a_{N-1}, a_N).
\]

6. (Ky Fan) Consider a matrix $A \in M_N(\mathbb{C})$ and put $\operatorname{Re} A = (A + A^*)/2$. Infer from Ky Fan's maximum principle that
\[
(\operatorname{Re}\lambda_1(A), \dots, \operatorname{Re}\lambda_N(A)) \prec_{HLP} (\lambda_1(\operatorname{Re} A), \dots, \lambda_N(\operatorname{Re} A)).
\]

4.5 Horn's Inequalities

The following problem was raised by H. Weyl [484] in 1912: Let $A$, $B$, and $C$ be Hermitian $N \times N$ matrices, denote the string of eigenvalues of $A$ by $\alpha$, where $\alpha : \alpha_1 \ge \cdots \ge \alpha_N$, and similarly write $\beta$ and $\gamma$ for the spectra of $B$ and $C$. What $\alpha$, $\beta$, and $\gamma$ can be the eigenvalues of the Hermitian matrices $A$, $B$, and $C$ when $C = A + B$? There is one obvious condition, namely that the trace of $C$ is the sum of the traces of $A$ and $B$:
\[
\sum_{k=1}^{N}\gamma_k = \sum_{k=1}^{N}\alpha_k + \sum_{k=1}^{N}\beta_k. \tag{4.17}
\]
H. Weyl was able to indicate supplementary conditions in terms of linear inequalities on the possible eigenvalues; they were presented in Section 4.4. Weyl's problem was studied extensively by A. Horn [222], who solved it for small $N$ and proposed a complete set of necessary inequalities to accompany (4.17) in the general case. Horn's inequalities have the form
\[
\sum_{k \in K}\gamma_k \le \sum_{i \in I}\alpha_i + \sum_{j \in J}\beta_j, \tag{4.18}
\]
where $I = \{i_1, \dots, i_r\}$, $J = \{j_1, \dots, j_r\}$, $K = \{k_1, \dots, k_r\}$ are subsets of $\{1, \dots, N\}$ with the same cardinality $r \in \{1, \dots, N-1\}$, belonging to a certain finite set $T_r^N$. Let us call such triplets $(I, J, K)$ admissible. When $r = 1$, the condition of admissibility is $i_1 + j_1 = k_1 + 1$. If $r > 1$, this condition is
\[
\sum_{i \in I} i + \sum_{j \in J} j = \sum_{k \in K} k + \binom{r+1}{2}
\]
and, for all $1 \le p \le r-1$ and all $(U, V, W) \in T_p^r$,
\[
\sum_{u \in U} i_u + \sum_{v \in V} j_v \le \sum_{w \in W} k_w + \binom{p+1}{2}.
\]
Notice that Horn's inequalities are defined by an inductive procedure.

4.5.1 Conjecture (Horn's Conjecture). A triplet $(\alpha, \beta, \gamma)$ of elements of $\mathbb{R}^N_{\ge}$ occurs as eigenvalues of symmetric matrices $A, B, C \in M_N(\mathbb{R})$, with $C = A + B$, if and only if the trace equality (4.17) and Horn's inequalities (4.18) hold for every $(I, J, K)$ in $T_r^N$, and every $r < N$.

Nowadays this conjecture is a theorem due to the works of A. A. Klyachko
[254] and A. Knutson and T. Tao [256]. It appeals to advanced facts from algebraic geometry and representation theory (beyond the goal of this book). A thorough introduction to the mathematical world of Horn's conjecture is offered by the paper of R. Bhatia [50]. A more technical description of the work of Klyachko, Knutson, and Tao can be found in the paper of W. Fulton [171].

Just to get a flavor of what Horn's conjecture is about, we will detail here the proof in the case of $2 \times 2$ symmetric matrices. Precisely, we will prove that for all families of real numbers $\alpha_1 \ge \alpha_2$, $\beta_1 \ge \beta_2$, $\gamma_1 \ge \gamma_2$ which verify Weyl's inequalities,
\[
\gamma_1 \le \alpha_1 + \beta_1, \quad \gamma_2 \le \alpha_2 + \beta_1, \quad \gamma_2 \le \alpha_1 + \beta_2,
\]
and the trace formula (4.17),
\[
\gamma_1 + \gamma_2 = \alpha_1 + \alpha_2 + \beta_1 + \beta_2,
\]
there exist symmetric matrices $A, B, C \in M_2(\mathbb{R})$ with $C = A + B$, $\sigma(A) = \{\alpha_1, \alpha_2\}$, $\sigma(B) = \{\beta_1, \beta_2\}$, and $\sigma(C) = \{\gamma_1, \gamma_2\}$.

Assume, for the sake of simplicity, that the spectra of $A$ and $B$ are respectively $\alpha = (4, 2)$ and $\beta = (2, -2)$. Then the conditions above may be read as
\[
\gamma_1 + \gamma_2 = 6, \quad \gamma_1 \ge \gamma_2 \tag{4.19}
\]
\[
\gamma_1 \le 6, \quad \gamma_2 \le 2. \tag{4.20}
\]
This shows that $\gamma$ has the form $\gamma = (6 - a, a)$, with $0 \le a \le 2$; clearly, $\gamma_1 \ge \gamma_2$. We shall prove that every pair $(6 - a, a)$ with $0 \le a \le 2$ can be the spectrum of a sum $A + B$. In fact, the relations (4.19) and (4.20) lead us to consider (in the plane $0\gamma_1\gamma_2$) the line segment $XY$, where $X = (6, 0)$ and $Y = (4, 2)$. Starting with the matrices
\[
A = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}
\quad\text{and}\quad
B = R_\theta \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix} R_\theta^{*},
\qquad\text{where } R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},
\]
we remark that the spectrum $(\lambda_1^{\downarrow}(C_\theta), \lambda_2^{\downarrow}(C_\theta))$ of the matrix
\[
C_\theta = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix} + R_\theta \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix} R_\theta^{*}
\]
lies on the line segment $XY$ for all $\theta \in [0, \pi/2]$. In fact, since the eigenvalues of a matrix are continuous functions of the entries of that matrix, the map $\theta \to (\lambda_1^{\downarrow}(C_\theta), \lambda_2^{\downarrow}(C_\theta))$ is continuous. The trace formula shows that the image of this map is a subset of the line $\gamma_1 + \gamma_2 = 6$. The point $X$ corresponds to $\theta = 0$, and $Y$ corresponds to $\theta = \pi/2$. Since the image should be a line segment, we conclude that each point of the segment $XY$ represents the spectrum of a matrix $C_\theta$ with $\theta \in [0, \pi/2]$.

The list of inequalities involved in the $3 \times 3$-dimensional case of Horn's conjecture is considerably larger and consists of 13 items:

• Weyl's inequalities,
\[
\gamma_1 \le \alpha_1 + \beta_1, \quad \gamma_2 \le \alpha_1 + \beta_2, \quad \gamma_2 \le \alpha_2 + \beta_1,
\]
\[
\gamma_3 \le \alpha_1 + \beta_3, \quad \gamma_3 \le \alpha_2 + \beta_2, \quad \gamma_3 \le \alpha_3 + \beta_1;
\]

• Ky Fan's inequality,
\[
\gamma_1 + \gamma_2 \le \alpha_1 + \alpha_2 + \beta_1 + \beta_2;
\]

• the Lidskii–Wielandt inequalities (taking into account the symmetric role of $A$ and $B$),
\[
\gamma_1 + \gamma_3 \le \alpha_1 + \alpha_3 + \beta_1 + \beta_2, \quad
\gamma_2 + \gamma_3 \le \alpha_2 + \alpha_3 + \beta_1 + \beta_2,
\]
\[
\gamma_1 + \gamma_3 \le \alpha_1 + \alpha_2 + \beta_1 + \beta_3, \quad
\gamma_2 + \gamma_3 \le \alpha_1 + \alpha_2 + \beta_2 + \beta_3;
\]

• Horn's inequality,
\[
\gamma_2 + \gamma_3 \le \alpha_1 + \alpha_3 + \beta_1 + \beta_3; \tag{4.21}
\]

• the trace identity,
\[
\gamma_1 + \gamma_2 + \gamma_3 = \alpha_1 + \alpha_2 + \alpha_3 + \beta_1 + \beta_2 + \beta_3.
\]

Horn's inequality (4.21) follows from (4.15), which in the case $n = 3$ may be read as
\[
(\alpha_1 + \beta_3, \alpha_2 + \beta_2, \alpha_3 + \beta_1) \prec_{HLP} (\gamma_1, \gamma_2, \gamma_3).
\]
The aforementioned set of 13 relations provides necessary and sufficient conditions for the existence of three symmetric matrices $A, B, C \in M_3(\mathbb{R})$, with $C = A + B$, and spectra equal respectively to
\[
\alpha_1 \ge \alpha_2 \ge \alpha_3; \quad \beta_1 \ge \beta_2 \ge \beta_3; \quad \gamma_1 \ge \gamma_2 \ge \gamma_3.
\]
The proof is similar to that in the case $n = 2$. For larger $n$, things become much more intricate. For example, for $n = 7$, Horn's list includes 2062 inequalities, not all of them independent.

The multiplicative companion to Horn's inequalities is a by-product of the solution of Horn's conjecture. See W. Fulton [171].

4.5.2 Theorem. Let $\alpha_1 \ge \cdots \ge \alpha_N$, $\beta_1 \ge \cdots \ge \beta_N$, $\gamma_1 \ge \cdots \ge \gamma_N$ be strings of positive real numbers. Then there exist matrices $A$ and $B$ with singular numbers $s_k(A) = \alpha_k$, $s_k(B) = \beta_k$, $s_k(AB) = \gamma_k$ if and only if
\[
\prod_{k \in K}\gamma_k \le \prod_{i \in I}\alpha_i \prod_{j \in J}\beta_j
\]
for all admissible triplets $(I, J, K)$.

4.6 The Way of Hyperbolic Polynomials

Motivated by the theory of hyperbolic partial differential equations, L. Gårding [177], [178] developed the theory of hyperbolic polynomials, which connects in an unexpected way algebra with convex analysis. In what follows, by a polynomial on a finite-dimensional real vector space
$V$, we will mean any real-valued function on $V$ that can be represented as a finite sum of finite products of linear functionals. Assuming that $\dim V = N$, and considering a vector basis $v_1, \dots, v_N$ of $V$ and its dual vector basis $v_1^*, \dots, v_N^*$ of $V^*$, it follows that every such polynomial $p(x)$ is nothing but a usual polynomial in the coefficients of $x$. If $p$ is a nonconstant polynomial on $V$ and $m$ is a strictly positive integer, then we say that $p$ is homogeneous of degree $m$ if $p(tx) = t^m p(x)$ for all $t \in \mathbb{R}$ and every $x \in V$.

4.6.1 Definition (L. Gårding [177]). A homogeneous polynomial $p$ is called hyperbolic in the direction $d \in V$ if $p(d) > 0$ and the one real-variable polynomial function $t \to p(x - td)$ has only real roots for all vectors $x$.

Some simple examples of hyperbolic polynomials are as follows. The polynomial $p(x_1, \dots, x_N) = x_1 \cdots x_N$ is hyperbolic in the direction $d = (1, \dots, 1)$; the roots of $p(x - td)$ are exactly $x_1, x_2, \dots, x_N$. This polynomial is actually hyperbolic in every direction $d \in \mathbb{R}^N_{++}$.

The polynomial $p(x_0, x_1, \dots, x_N) = x_0^2 - x_1^2 - \cdots - x_N^2$ is hyperbolic in the direction $d = (1, 0, \dots, 0)$; the roots of $p(x - td)$ are $x_0 \pm \sqrt{x_1^2 + \cdots + x_N^2}$. It is also hyperbolic in every direction $d$ for which $p(d) > 0$.

The function $\det$, when restricted to $\operatorname{Sym}(N, \mathbb{R})$, can be considered as a polynomial in the entries on and above the diagonal. The polynomial $\det$ is hyperbolic in the direction $d = I$, the identity matrix. For each $X \in \operatorname{Sym}(N, \mathbb{R})$, the roots of $\det(X - tI)$ are precisely the eigenvalues of $X$; $\det$ is also hyperbolic in every direction $d \in \operatorname{Sym}^{++}(N, \mathbb{R})$.

Now, consider the polynomial $p(x) = \det(x_1 A_1 + \cdots + x_N A_N)$, where $A_1, \dots, A_N \in \operatorname{Sym}^{++}(N, \mathbb{R})$. Then $p(x)$ is hyperbolic in the direction $d = (1, 0, \dots, 0)$ because
\[
p(x - td) = \det\Bigl(\sum_{k=1}^{N} x_k A_k - tA_1\Bigr) = \det A_1 \cdot \det\Bigl(A_1^{-1/2}\Bigl(\sum_{k=1}^{N} x_k A_k\Bigr)A_1^{-1/2} - tI\Bigr).
\]

4.6.2 Remark. The Helton–Vinnikov theorem [211] (previously a conjecture due to P. Lax) asserts that every hyperbolic polynomial $p(x_1, x_2, x_3)$ in three variables possesses a representation of the form
\[
p(x_1, x_2, x_3) = \det(x_1 A_1 + x_2 A_2 + x_3 A_3),
\]
where $A_1, A_2, A_3$ are real symmetric matrices.

A simple way to generate new hyperbolic polynomials from old ones is by differentiation.

4.6.3 Proposition (L. Gårding [178]). Suppose that $p$ is a hyperbolic polynomial in the direction $d$ and $p$ is homogeneous of degree $m > 1$. Then $Q(x) = \langle \nabla p(x), d\rangle$ is also a hyperbolic polynomial in the direction $d$.

Proof. Since $Q(x + td) = \langle \nabla p(x + td), d\rangle = \frac{d}{dt} p(x + td)$, we infer from Rolle's theorem that the polynomial $Q(x + td)$ has $m - 1$ real roots, separating the roots of $p(x + td)$. On the other hand, by differentiating the identity $p(td) = t^m p(d)$ we obtain $\langle \nabla p(td), d\rangle = m t^{m-1} p(d) > 0$ for $t > 0$, whence $Q(d) > 0$.

4.6.4 Corollary. All elementary symmetric polynomials $e_k(x_1, \dots, x_n)$ of strictly positive degree are hyperbolic in the direction $d = (1, \dots, 1)$.

Proof. Indeed, it was already noticed that $e_n(x_1, \dots, x_n) = x_1 \cdots x_n$ is hyperbolic in the direction $d$. Then $e_{n-1}(x) = \langle \nabla e_n(x), d\rangle$ is also hyperbolic by Proposition 4.6.3, and this argument can be iterated $n - 1$ steps.

By applying Proposition 4.6.3 to the polynomial $p(X) = \det X$ and the direction $d = I$, one obtains a new family of hyperbolic polynomials defined on $\operatorname{Sym}(N, \mathbb{R})$,
\[
\sigma_k(X) = e_k(\lambda_1(X), \dots, \lambda_N(X)), \quad k = 1, \dots, N,
\]
where $\lambda_1(X), \dots, \lambda_N(X)$ are the eigenvalues of $X$. This example can be generalized.

If $p(x)$ is a hyperbolic polynomial in the direction $d$, of degree $m$, then for each $x \in V$ we have
\[
p(x + td) = p(d) \prod_{k=1}^{m} (t + \lambda_k(x)),
\]
where $\lambda_1(x), \dots, \lambda_m(x)$ are the so-called $d$-eigenvalues of $x$. The terminology is motivated by the case of the polynomial $p(X) = \det X$ and the direction $d = I$. These functions are continuous (and depend on $p$ and $d$); the continuity follows from the continuous dependence of the roots of a polynomial on its coefficients. See Q. I. Rahman and G. Schmeisser [413], Theorem 1.3.1, p. 10. Notice that
\[
p(x) = p(d) \prod_{k=1}^{m} \lambda_k(x). \tag{4.22}
\]
The following
immediate result implies that the $d$-eigenvalues of $x$ are also positively homogeneous.

4.6.5 Lemma. Denote by $\lambda_k^{\downarrow}(x)$ the $k$-th largest $d$-eigenvalue of $x$. Then for all $s, t \in \mathbb{R}$ we have
\[
\lambda_k^{\downarrow}(sx + td) =
\begin{cases}
s\lambda_k^{\downarrow}(x) + t & \text{if } s \ge 0 \\
s\lambda_{m-k+1}^{\downarrow}(x) + t & \text{if } s < 0.
\end{cases}
\]

The eigenvalue map associated to the family of $d$-eigenvalues of $x$ is defined as
\[
\Lambda : V \to \mathbb{R}^m, \quad \Lambda(x) = \bigl(\lambda_1^{\downarrow}(x), \dots, \lambda_m^{\downarrow}(x)\bigr).
\]
See Exercise 2 for an important result concerning this map. As will be shown in what follows, the $d$-eigenvalues share many of the properties of the eigenvalues of symmetric matrices. The hyperbolicity cone of $p$ is the set
\[
\Gamma_{p,d} = \{x \in V : p(x - td) \ne 0 \text{ for all } t \le 0\} = \{x \in V : \lambda_k(x) > 0 \text{ for } k = 1, \dots, m\}.
\]
The hyperbolicity cone associated with the polynomial $x_1 \cdots x_n$ and the direction $d = (1, \dots, 1)$ is the open positive orthant $\mathbb{R}^N_{++}$. In the case of the polynomial $\det$ and the direction $d = I$, the hyperbolicity cone is $\operatorname{Sym}^{++}(n, \mathbb{R})$. The hyperbolicity cone plays an important role in emphasizing the convexity properties of hyperbolic polynomials.

4.6.6 Theorem (L. Gårding [177], [178]). Suppose that $p(x)$ is a hyperbolic polynomial in the direction $d$, of degree $m$. Then:
(a) The hyperbolicity cone $\Gamma_{p,d}$ is open and convex, and $d \in \Gamma_{p,d}$;
(b) The polynomial $p$ is hyperbolic in every direction $e \in \Gamma_{p,d}$, and $\Gamma_{p,d} = \Gamma_{p,e}$;
(c) $p(x + y) \ge p(x)$ whenever $y \in \Gamma_{p,d}$;
(d) $\Gamma_{p,d} = \{x \in V : \lambda_m^{\downarrow}(x) > 0\}$ and $\overline{\Gamma}_{p,d} = \{x \in V : \lambda_m^{\downarrow}(x) \ge 0\}$;
(e) $p(x)^{1/m}$ is a concave function on $\Gamma_{p,d}$, which vanishes on its boundary;
(f) The largest eigenvalue $\lambda_1^{\downarrow}(x)$ is a convex function (while $\lambda_m^{\downarrow}(x)$ is a concave function).

An important ingredient in the proof of Theorem 4.6.6 is the following lemma.

4.6.7 Lemma (J. Renegar [417]). The hyperbolicity cone is the connected component of the open set $\{x : p(x) \ne 0\}$ containing $d$.

Proof. Let $S$ denote the connected component containing $d$. Since $x$ has $0$ as an eigenvalue only if $p(x) = 0$, and since $d \in \Gamma_{p,d}$, it follows from the continuity of the eigenvalues that $S \subset \Gamma_{p,d}$.
For the other inclusion, consider $x \in \Gamma_{p,d}$ and let $L$ be the line segment with endpoints $x$ and $d$. For sufficiently large $t^* > 0$, all $y \in L$ satisfy $p(y + t^*d) > 0$. Also, since $x, d \in \Gamma_{p,d}$, we know that $x + td$ and $d + td$ belong to $\Gamma_{p,d}$ for $t \ge 0$, which implies that $p(x + td) > 0$ and $p(d + td) > 0$ for $t \ge 0$. Thus, the segments $\{x + td : 0 \le t \le t^*\}$, $\{y + t^*d : y \in L\}$, and $\{d + td : 0 \le t \le t^*\}$ form a path from $x$ to $d$ on which $p$ remains strictly positive. This shows that $x \in S$, and the proof of the inclusion $\Gamma_{p,d} \subset S$ is done.

Proof of Theorem 4.6.6. (a) According to Lemma 4.6.7, the hyperbolicity cone $\Gamma_{p,d}$ is open and contains $d$. This cone is also convex. To show this, let $v, w \in \Gamma_{p,d}$. As was noticed in the proof of Lemma 4.6.7, the line segment joining $v$ and $d$ lies inside $\Gamma_{p,d}$. Now, if $e$ is an arbitrary point in $\Gamma_{p,d}$, we have $\Gamma_{p,d} = \Gamma_{p,e}$ because of the maximality of a connected component. Therefore $v \in \Gamma_{p,e}$ and the line segment joining $v$ and $w$ lies inside $\Gamma_{p,w} = \Gamma_{p,e}$ ($= \Gamma_{p,d}$), which ends the proof.

(b) If $e \in \Gamma_{p,d}$, then $p(e) > 0$; see formula (4.22). We will show that for every $x \in V$, the polynomial $t \to p(x + te)$ has only real roots. Let $\alpha > 0$ and $s \in \mathbb{R}$ be two parameters, and consider the polynomial $t \to p(sx + te + \alpha id)$. We claim that if $s \ge 0$, then this polynomial has only roots with strictly negative imaginary part. Clearly, this is true for $s = 0$, since $t \to p(te + \alpha id)$ cannot have a root at $t = 0$ because $p(\alpha id) = (\alpha i)^m p(d) \ne 0$. If $t \ne 0$, then $p(te + \alpha id) = 0$ if, and only if, $p(e + \alpha t^{-1} id) = 0$, which implies $\alpha t^{-1} i < 0$, and thus $t = ri$ for some $r < 0$. If for some $s > 0$ the polynomial $t \to p(sx + te + \alpha id)$ had a root in the upper half-plane, then there would exist $s^*$ for which $t \to p(s^*x + te + \alpha id)$ has a real root $t^*$, that is, $p(s^*x + t^*e + \alpha id) = 0$. However, this contradicts the hyperbolicity of $p$, since $s^*x + t^*e \in V$. Thus, for all $s \ge 0$, the roots of $t \to p(sx + te + \alpha id)$ have strictly negative imaginary parts. The conclusion above holds for every $\alpha > 0$. Letting $\alpha \to 0$, by continuity of the roots we have that the polynomial $t \to p(sx + te)$ must also have only roots with nonpositive imaginary parts. However, since it is a polynomial with real coefficients (and therefore its roots always appear in complex-conjugate pairs), all the roots must actually be real. Taking now $s = 1$, we conclude that $t \to p(x + te)$ has real roots for all $x$.

The assertion (c) is a consequence of assertion (b) and Lemma 4.6.5.

(d) The proof is left to the reader as an exercise.

(e) We follow the ingenious argument of L. Hörmander [227], p. 63. According to Proposition 3.1.2, it suffices to prove that if $x \in \Gamma$ and $y \in V$, then the function $\varphi(t) = p^{1/m}(x + ty)$ is concave on $\operatorname{dom}\varphi = \{t \in \mathbb{R} : x + ty \in \Gamma\}$. We know that
\[
p(y + tx) = p(x) \prod_{k=1}^{m} (t - r_k)
\]
for suitable real numbers $r_k$, whence
\[
p(x + ty) = t^m p(y + x/t) = p(x) \prod_{k=1}^{m} (1 - t r_k);
\]
we have $1 - t r_k > 0$ since $x + ty \in \Gamma$. Consider the function $f(t) = \log p(x + ty)$. Then
\[
f'(t) = -\sum_{k=1}^{m} \frac{r_k}{1 - t r_k}, \qquad f''(t) = -\sum_{k=1}^{m} \frac{r_k^2}{(1 - t r_k)^2},
\]
and
\[
m^2 e^{-f(t)/m} \frac{d^2}{dt^2} e^{f(t)/m} = f'(t)^2 + m f''(t)
= \Bigl(\sum_{k=1}^{m} \frac{r_k}{1 - t r_k}\Bigr)^2 - m \sum_{k=1}^{m} \frac{r_k^2}{(1 - t r_k)^2} \le 0,
\]
according to the Cauchy–Bunyakovsky–Schwarz inequality. This proves that the function $t \to p^{1/m}(x + ty)$ is concave on $\Gamma_{p,d}$. The fact that $p$ vanishes on the boundary of $\Gamma_{p,d}$ follows from assertion (d).

(f) Notice that $\lambda_m^{\downarrow}(x) \ge \alpha$ if, and only if, $\lambda_m^{\downarrow}(x - \alpha d) \ge 0$, whence
\[
\{x : \lambda_m^{\downarrow}(x) \ge \alpha\} = \alpha d + \overline{\Gamma}_{p,d}.
\]
According to assertion (a), this is a convex set. Since the function $\lambda_m^{\downarrow}(x)$ is positively homogeneous, we conclude from Lemma 3.4.2 that $\lambda_m^{\downarrow}(x)$ is also a concave function. The convexity of $\lambda_1^{\downarrow}$ now follows from Lemma 4.6.5, since $\lambda_1^{\downarrow}(x) = -\lambda_m^{\downarrow}(-x)$.

The following theorem extends the results stated in Section 4.5.

4.6.8 Theorem (D. Serre [441]). Horn's inequalities remain valid for the eigenvalues of every hyperbolic polynomial.

Exercises

1. (New hyperbolic polynomials from old ones)
(a) Suppose that $p : V \to \mathbb{R}$ is a hyperbolic polynomial in the direction $d \in V$ and $W$ is a vector subspace of $V$ that contains $d$. Prove that $p|_W$ is also hyperbolic (in the same direction). Moreover, $\Gamma_{p|_W,d} = \Gamma_{p,d} \cap W$.
(b) Let $p$ and $q$ be two hyperbolic polynomials in the direction $d$. Prove that $pq$ is also hyperbolic in the direction $d$ and $\Gamma_{pq,d} = \Gamma_{p,d} \cap \Gamma_{q,d}$.
(c) Let $p$ be a hyperbolic polynomial in the direction $d$, of degree $m \ge 1$, and assume that $p(x + td) = \sum_{k=0}^{m} p_k(x) t^k$. Prove that all coefficients $p_k(x)$ are hyperbolic polynomials in the direction $d$.
[Hint: For (c), use the formula
\[
p_k(x) = \frac{1}{k!}\frac{d^k}{dt^k}\Big|_{t=0} p(x + td).]
\]

2. (H. H. Bauschke, O. Güler, A. S. Lewis and H. S. Sendov [35]) Infer from Exercise 1(c) the following new construction for hyperbolic polynomials. Suppose that $q$ is a homogeneous symmetric polynomial of degree $m$ on $\mathbb{R}^N$, hyperbolic in the direction $e = (1, 1, \dots, 1)$, with eigenvalue map $\Phi$, and let $\Lambda$ be the eigenvalue map of a hyperbolic polynomial of degree $m$ in the direction $d$. Prove that $q \circ \Lambda$ is also a hyperbolic polynomial of degree $m$ in the direction $d$, with eigenvalue map $\Phi \circ \Lambda$.

3. Let $1 \le k \le N$ and $e = (1, 1, \dots, 1) \in \mathbb{R}^N$. Prove that the polynomial
\[
q(u) = \sum_{1 \le i_1 < \cdots < i_k \le N} u_{i_1} \cdots u_{i_k}
\]
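The factorization $p(x + td) = p(d)\prod_k (t + \lambda_k(x))$ above makes the $d$-eigenvalues easy to compute numerically. The following sketch is ours, not the book's (NumPy-based, helper names chosen ad hoc): it recovers the coefficients of the one-variable polynomial $t \mapsto p(x + td)$ from sample values and reads off the $d$-eigenvalues as the negatives of its roots, using the example $p(x) = x_1 x_2 x_3$ with $d = (1,1,1)$, for which the $d$-eigenvalues of $x$ are just its coordinates.

```python
import numpy as np

def d_eigenvalues(p, x, d, m):
    """d-eigenvalues of x: from p(x + t d) = p(d) * prod_k (t + lam_k),
    the lam_k are the negatives of the roots of t -> p(x + t d)."""
    ts = np.arange(m + 1.0)              # m+1 sample points pin down a degree-m polynomial
    vals = [p(x + t * d) for t in ts]
    coeffs = np.polyfit(ts, vals, m)     # coefficients of t -> p(x + t d), leading term first
    return np.sort(-np.roots(coeffs).real)[::-1]   # lam_1 >= ... >= lam_m

p = lambda v: np.prod(v)                 # p(x) = x1 x2 x3, hyperbolic in d = (1, 1, 1)
x = np.array([3.0, 1.0, 2.0])
d = np.ones(3)

lam = d_eigenvalues(p, x, d, m=3)
print(lam)                               # approximately [3. 2. 1.]: the coordinates of x
print(np.isclose(p(x), p(d) * np.prod(lam)))   # formula (4.22) holds: True
print(lam.min() > 0)                     # True: x lies in the hyperbolicity cone (positive orthant)
```

The membership test in the last line is exactly the description $\Gamma_{p,d} = \{x : \lambda_m^{\downarrow}(x) > 0\}$ of Theorem 4.6.6 (d).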
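As a numerical companion to Corollary 4.6.4 (again a sketch of ours, not part of the book's text): for $e_2$ on $\mathbb{R}^4$ and $e = (1,1,1,1)$, hyperbolicity means that $t \mapsto e_2(x - te)$ has only real roots for every $x$. Expanding gives $\binom{4}{2}t^2 - 3e_1(x)t + e_2(x)$, whose discriminant $9e_1^2 - 24e_2 = 3\bigl(4\|x\|^2 - e_1^2\bigr) \ge 0$ by the Cauchy–Schwarz inequality, so the check below should always pass.

```python
import numpy as np
from itertools import combinations

def e2(v):
    # second elementary symmetric polynomial: sum of all pairwise products
    return sum(v[i] * v[j] for i, j in combinations(range(len(v)), 2))

rng = np.random.default_rng(0)
ok = True
for _ in range(1000):
    x = rng.normal(size=4)
    # coefficients of t -> e_2(x - t e) for n = 4: C(4,2) t^2 - 3 e_1(x) t + e_2(x)
    roots = np.roots([6.0, -3.0 * x.sum(), e2(x)])
    ok &= bool(np.allclose(roots.imag, 0.0, atol=1e-6))
print(ok)   # True: all sampled root pairs are real, as hyperbolicity requires
```

The same sampling test can be pointed at any candidate polynomial and direction; a single complex root would immediately disprove hyperbolicity.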
