On Approximate Karush-Kuhn-Tucker Sequential Optimality Conditions for Smooth Constrained Optimization


HANOI PEDAGOGICAL UNIVERSITY 2
DEPARTMENT OF MATHEMATICS

Dao Thi Thao

ON APPROXIMATE KARUSH-KUHN-TUCKER OPTIMALITY CONDITIONS FOR SMOOTH CONSTRAINED OPTIMIZATION

BACHELOR THESIS
Major: Analysis
Supervisor: Dr. Nguyen Van Tuyen

Hanoi, 2019


Thesis acknowledgment

I would like to express my gratitude to the teachers of the Department of Mathematics, Hanoi Pedagogical University 2, in particular to the teachers of the analysis group and the lecturers involved, who have imparted valuable knowledge and made it possible for me to complete the course and this thesis. In particular, I would like to express my deep respect and gratitude to Dr. Nguyen Van Tuyen, who directly guided and helped me complete this thesis. Because time, capacity and conditions are limited, the thesis cannot avoid errors, and I look forward to receiving valuable comments from teachers and friends.

Hanoi, May 2019
Student
Dao Thi Thao


Thesis assurance

I assure that the data and the results of this thesis are true and not identical to those of other topics. I also assure that all the help for this thesis has been acknowledged and that the results presented in the thesis have been identified clearly.

Hanoi, May 2019
Student
Dao Thi Thao


Contents

Preface
1. Preliminaries
   1.1 Convex sets
   1.2 Convex functions
   1.3 Cones
   1.4 Tangent cones
   1.5 Optimality conditions for smooth problems
2. Approximate Karush-Kuhn-Tucker optimality conditions
   2.1 Approximate-KKT conditions
       2.1.1 AKKT(I) is an optimality condition
       2.1.2 AKKT(I) is a strong optimality condition
   2.2 Approximate gradient projection conditions
       2.2.1 C-AGP condition
       2.2.2 L-AGP condition
       2.2.3 Remarks
Bibliography


Preface

Karush-Kuhn-Tucker (KKT) optimality conditions are among the most important results in optimization theory. However, the KKT conditions need not be fulfilled at a local minimum point unless some constraint qualification is satisfied. In other words, the usual first-order necessary optimality condition has the form "KKT or not-CQ". We note here that a local minimizer might not be a KKT point, but it can always be approximated by a sequence of "approximate-KKT" points. This leads one to study a different type of optimality conditions.

In this thesis, based on the recent work by Andreani, Haeser, and Martínez [5], we study sequential first-order optimality conditions for nonlinear programming problems. We first examine some sequential optimality conditions, such as approximate-KKT and approximate gradient projection conditions, which may be used as stopping criteria for optimization algorithms. Then we investigate the relationships between the sequential optimality conditions and several necessary optimality conditions.

The thesis is organized as follows. In Chapter 1 we recall some basic definitions and preliminaries from convex analysis, which are widely used in the sequel. In Chapter 2 we introduce approximate-KKT conditions for nonlinear programming problems; these optimality conditions must be satisfied by the minimizers of optimization problems. The relationships between the sequential optimality conditions and several necessary optimality conditions are also investigated.


Chapter 1. Preliminaries

1.1 Convex sets

The notion of a convex set is central to optimization theory. A convex set is such that, for any two of its points, the entire segment joining these points is contained in the set.

Definition 1.1. A set X ⊂ R^n is called convex if for all x1 ∈ X and x2 ∈ X it contains all points αx1 + (1 − α)x2, 0 < α < 1.

The following lemma says that convexity is preserved by the operation of intersection.

Lemma 1.2. Let I be an arbitrary index set. If the sets Xi ⊂ R^n, i ∈ I, are convex, then the set X = ∩_{i∈I} Xi is convex.

Algebraic operations also preserve convexity.

Lemma 1.3. Let X and Y be convex sets in R^n and let c and d be real numbers. Then the set Z = cX + dY is convex.

Definition 1.4. A point x is called a convex combination of points x1, ..., xm if there exist α1 ≥ 0, ..., αm ≥ 0 such that x = α1 x1 + α2 x2 + ... + αm xm and α1 + α2 + ... + αm = 1.

Definition 1.5. The convex hull of the set X (denoted by conv X) is the intersection of all convex sets containing X.

The relation between these two concepts is the subject of the next lemma.

Lemma 1.6. The set conv X is the set of all convex combinations of points of X.

Lemma 1.7. If X ⊂ R^n, then every element of conv X is a convex combination of at most n + 1 points of X.

Lemma 1.8. If X is convex, then its interior int X and its closure cl X are convex.

Lemma 1.9. Assume that the set X ⊂ R^n is convex. Then int X = ∅ if and only if X is contained in a linear manifold of dimension smaller than n.
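Lemmas 1.6 and 1.7 suggest a simple computational test: deciding whether a point belongs to conv X amounts to a linear feasibility problem in the weights of Definition 1.4. A minimal sketch in Python (NumPy/SciPy; the point set and the query points are made-up illustration data, not taken from the text):

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(points, x):
    """Check whether x is a convex combination of the rows of `points`
    (Definition 1.4 / Lemma 1.6) via a linear feasibility problem:
    find alpha >= 0 with points.T @ alpha = x and sum(alpha) = 1."""
    m = points.shape[0]
    A_eq = np.vstack([points.T, np.ones((1, m))])
    b_eq = np.append(x, 1.0)
    res = linprog(c=np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * m, method="highs")
    return bool(res.success)

# Illustration: vertices of the unit square in R^2 (hypothetical data).
square = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
print(in_convex_hull(square, np.array([0.25, 0.75])))   # True: inside the hull
print(in_convex_hull(square, np.array([1.50, 0.50])))   # False: outside the hull
```

By Lemma 1.7, whenever the test succeeds there is also a certificate using at most n + 1 of the given points, although the solver is not asked to produce that particular one.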
Consider a closed convex set V ⊂ R^n and a point x ∈ R^n. We call the point in V that is closest to x the projection of x on V, and we denote it by P_V(x). Obviously, if x ∈ V then P_V(x) = x, but the projection is always well defined, as the following result shows.

Theorem 1.10. If the set V ⊂ R^n is nonempty, convex and closed, then for every x ∈ R^n there exists exactly one point z ∈ V that is closest to x.

Lemma 1.11. Assume that V ⊂ R^n is a closed convex set and let x ∈ R^n. Then z = P_V(x) if and only if z ∈ V and ⟨v − z, x − z⟩ ≤ 0 for all v ∈ V.

Theorem 1.12. Assume that V ⊂ R^n is a closed convex set. Then for all x ∈ R^n and y ∈ R^n we have ‖P_V(x) − P_V(y)‖ ≤ ‖x − y‖.

A closed convex set and a point outside of it can be separated by a hyperplane.

Theorem 1.13. Let X ⊂ R^n be a closed convex set and let x ∉ X. Then there exist a nonzero y ∈ R^n and ε > 0 such that ⟨y, v⟩ ≤ ⟨y, x⟩ − ε for all v ∈ X.

Theorem 1.14. Let X ⊂ R^n be a convex set and let x ∉ X. Then there exists a nonzero y ∈ R^n such that ⟨y, v⟩ ≤ ⟨y, x⟩ for all v ∈ X.

Theorem 1.15. Let X1 and X2 be closed convex sets in R^n. If X1 ∩ X2 = ∅, then there exists a nonzero y ∈ R^n such that ⟨y, x1⟩ ≤ ⟨y, x2⟩ for all x1 ∈ X1 and x2 ∈ X2.

Theorem 1.16. Let X1 and X2 be closed convex sets in R^n and let X1 be bounded. If X1 ∩ X2 = ∅, then there exist a nonzero y ∈ R^n and ε > 0 such that ⟨y, x1⟩ ≤ ⟨y, x2⟩ − ε for all x1 ∈ X1 and all x2 ∈ X2.
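Theorems 1.10-1.13 can be illustrated numerically: the projection P_V(x) is easy to compute for simple sets, and the vector x − P_V(x) yields a separating hyperplane when x ∉ V. A small sketch for V the closed unit ball (an arbitrary choice made here for illustration; the checks only sample finitely many points of V):

```python
import numpy as np

def proj_ball(x, radius=1.0):
    """Projection onto the closed Euclidean ball of the given radius (Theorem 1.10)."""
    nrm = np.linalg.norm(x)
    return x if nrm <= radius else radius * x / nrm

x = np.array([2.0, -1.0, 0.5])      # a point outside the unit ball
y = np.array([0.2, 0.1, -0.3])      # a point inside the unit ball
px, py = proj_ball(x), proj_ball(y)

rng = np.random.default_rng(0)
V = np.array([proj_ball(v) for v in rng.normal(size=(200, 3))])   # sample points of V

# Lemma 1.11: <v - P(x), x - P(x)> <= 0 for every v in V.
print(max((v - px) @ (x - px) for v in V) <= 1e-12)

# Theorem 1.12: the projection operator is nonexpansive.
print(np.linalg.norm(px - py) <= np.linalg.norm(x - y))

# Theorem 1.13: sep = x - P(x) separates x from V with margin ||x - P(x)||^2.
sep = x - px
print(max(V @ sep) + np.linalg.norm(sep) ** 2 <= x @ sep + 1e-12)
```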
1.2 Convex functions

Let R̄ be the set of extended real numbers, R̄ := R ∪ {±∞}. With every function f : R^n → R̄ we can associate two sets: the domain dom f := {x | f(x) < +∞} and the epigraph epi f := {(x, λ) ∈ R^n × R | f(x) ≤ λ}.

Definition 1.17. A function f is called convex if epi f is a convex set.

Theorem 1.18. A function f is convex if and only if for all x1, x2 and all α ∈ [0, 1] we have f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2).

1.3 Cones

A particular class of convex sets, convex cones, plays a significant role in optimization theory.

Definition 1.19. A set K ⊂ R^n is called a cone if for every x ∈ K and all α > 0 one has αx ∈ K. A convex cone is a cone that is a convex set.

Lemma 1.20. Let K be a convex cone. If x1, ..., xm ∈ K and α1 > 0, ..., αm > 0, then α1 x1 + α2 x2 + ... + αm xm ∈ K.

Lemma 1.21. Assume that X is a convex set. Then the set cone(X) = {γx : x ∈ X, γ ≥ 0} is a convex cone.

Example 1.22. Assume that the set X ⊂ R^n is a closed convex cone itself, and x ∈ X. Let us calculate the cone of feasible directions for X at x:

   K_X(x) = {d ∈ R^n : d = τ(y − x), y ∈ X, τ ≥ 0}
          = {d ∈ R^n : d = h − τx, h ∈ X, τ ≥ 0}
          = X − {τx : τ ≥ 0} = X + {τx : τ ∈ R}.

In the last two equalities we used the fact that X is a cone.

Definition 1.23. Let X ⊂ R^n be a convex set. The set X_∞ = {d : X + d ⊂ X} is called the recession cone of X.

We shall show that X_∞ is a convex cone. We first note that, for each d ∈ X_∞ and every positive integer m,

   X + md ⊂ X + (m − 1)d ⊂ ... ⊂ X + d ⊂ X.

Using the convexity of X we infer that X + τd ⊂ X for all τ ≥ 0. Hence τd ∈ X_∞ for all τ ≥ 0, which means that X_∞ is a cone. The fact that X_∞ is convex can be verified directly from the definition: if d1 ∈ X_∞ and d2 ∈ X_∞, then

   x + αd1 + (1 − α)d2 = α(x + d1) + (1 − α)(x + d2) ∈ X

for all x ∈ X and all α ∈ (0, 1).

Definition 1.24. Let K be a cone in R^n. The set K° = {y ∈ R^n : ⟨y, x⟩ ≤ 0 for all x ∈ K} is called the polar cone of K.

Example 1.25. Let K1, ..., Km be cones in R^n and let K = K1 + K2 + ... + Km. Clearly, K is a cone. We shall calculate its polar cone. If z ∈ K°, then for every x1 ∈ K1, ..., xm ∈ Km we have ⟨z, x1⟩ + ... + ⟨z, xm⟩ ≤ 0. Let us choose j ∈ {1, ..., m}. Setting all xi = 0 except for i = j, we conclude that [...]

[The preview omits the remainder of Chapter 1 and the beginning of Section 2.1; the text resumes in the middle of an example concerning the AKKT(I) condition.]

We define I_small = {1} and I_big = {1, 2}. Clearly, both I_small and I_big satisfy the sufficient interior property. The condition AKKT(I_small) is satisfied: take x^k = 1/k, μ_1^k = k/2, μ_2^k = 0 for all k ∈ N. However, AKKT(I_big) does not hold. In fact, if g_2(x^k) < 0 one has that x^k < 0, so −2x^k μ_1^k + μ_2^k ≥ 0, and therefore 1 + (−2x^k)μ_1^k + μ_2^k ≥ 1 cannot tend to zero.

Remark 2.6. The observation and the example above show that AKKT(∅) is the weakest optimality condition of type AKKT(I). The consideration of the general AKKT(I) has algorithmic importance because of its potential application to interior point methods. The classical barrier methods [8] are the most typical ones to which AKKT(I), with nontrivial I, is applicable.
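Remark 2.6 treats AKKT(I) as a practical stopping test. Since the formal AKKT definition and conditions (2.2)-(2.7) fall in the part of the text not reproduced above, the sketch below only illustrates the generic residual-based check (Lagrangian gradient plus a complementarity measure); the particular measure min(−g_i, μ_i) and the toy problem are assumptions made for illustration:

```python
import numpy as np

def akkt_residuals(grad_f, jac_h, jac_g, g_val, lam, mu):
    """Residuals that must tend to zero along an approximate-KKT sequence:
    the Lagrangian gradient and a complementarity measure (min(-g_i, mu_i),
    one common choice)."""
    lagr_grad = grad_f + jac_h.T @ lam + jac_g.T @ mu
    compl = np.minimum(-g_val, mu)
    return np.linalg.norm(lagr_grad), np.linalg.norm(compl, ord=np.inf)

# Hypothetical illustration: minimize f(x) = x subject to g(x) = -x <= 0 (x* = 0).
for k in [1, 10, 100, 1000]:
    x = 1.0 / k
    r_grad, r_compl = akkt_residuals(grad_f=np.array([1.0]),
                                     jac_h=np.zeros((0, 1)),
                                     jac_g=np.array([[-1.0]]),
                                     g_val=np.array([-x]),
                                     lam=np.zeros(0),
                                     mu=np.array([1.0]))
    print(k, r_grad, r_compl)    # both residuals tend to zero as k grows
```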
2.1.1 AKKT(I) is an optimality condition

We are going to prove that, if x* is a local minimizer of (2.1) and I satisfies the sufficient interior property, then x* satisfies AKKT(I). The proof is based on the convergence properties of the internal-external penalty method given below.

Lemma 2.7. Let {ρ_k} be a positive sequence that tends to infinity and let I1 ⊂ {1, ..., p}. Let Ω ⊂ R^n be closed. Assume that, for all k ∈ N, x^k is a global solution of

   minimize f(x) + ρ_k [ Σ_{i=1}^m h_i(x)² + Σ_{i∉I1} g_i(x)_+² ] − (1/ρ_k) Σ_{i∈I1} 1/g_i(x)
   subject to g_i(x) < 0 for all i ∈ I1, x ∈ Ω.

Consider the problem

   minimize f(x) subject to h(x) = 0, g(x) ≤ 0, x ∈ Ω,   (2.12)

and assume that I1 is such that there exists a global minimizer z of (2.12) for which there is a feasible sequence {z^k} ⊂ Ω converging to z with g_i(z^k) < 0 for all i ∈ I1. Then every limit point of {x^k} is a global minimizer of f(x) subject to h(x) = 0, g(x) ≤ 0, x ∈ Ω.

Proof. Let x* be a limit point of {x^k}. Let z be a global solution of (2.12) and suppose that f(z) < f(x*). By the hypothesis on I1, there exists a feasible z' ∈ Ω with g_i(z') < 0 for all i ∈ I1 and f(z') < f(x*). By the definition of x^k, the fact that g_i(x^k) < 0 for all i ∈ I1 and the feasibility of z', we have

   f(x^k) ≤ f(x^k) + ρ_k [ Σ_{i=1}^m h_i(x^k)² + Σ_{i∉I1} g_i(x^k)_+² ] − (1/ρ_k) Σ_{i∈I1} 1/g_i(x^k)
          ≤ f(z') + ρ_k [ Σ_{i=1}^m h_i(z')² + Σ_{i∉I1} g_i(z')_+² ] − (1/ρ_k) Σ_{i∈I1} 1/g_i(z')
          = f(z') − (1/ρ_k) Σ_{i∈I1} 1/g_i(z').

Taking limits along a suitable subsequence we obtain f(x*) ≤ f(z'), which contradicts f(z') < f(x*). Therefore f(x*) ≤ f(z).

Now let us prove that x* is feasible. By the closedness of Ω and since g_i(x^k) < 0 for all i ∈ I1, we have that x* ∈ Ω and g_i(x*) ≤ 0 for all i ∈ I1. Suppose that Σ_{i=1}^m h_i(x*)² + Σ_{i∉I1} g_i(x*)_+² > 0. Then there exists ε > 0 such that Σ_{i=1}^m h_i(x^k)² + Σ_{i∉I1} g_i(x^k)_+² > ε for every k in a suitable subsequence. Thus,

   f(x^k) + ρ_k [ Σ_{i=1}^m h_i(x^k)² + Σ_{i∉I1} g_i(x^k)_+² ] − (1/ρ_k) Σ_{i∈I1} 1/g_i(x^k)
      > f(z') + ρ_k [ Σ_{i=1}^m h_i(z')² + Σ_{i∉I1} g_i(z')_+² ] − (1/ρ_k) Σ_{i∈I1} 1/g_i(z') + A_k,   (2.13)

where A_k = ρ_k ε + f(x^k) − f(z') − (1/ρ_k) Σ_{i∈I1} 1/g_i(x^k) + (1/ρ_k) Σ_{i∈I1} 1/g_i(z'). Since A_k > 0 for sufficiently large k, (2.13) holds with A_k replaced by 0 for sufficiently large k, which contradicts the definition of x^k. This proves that x* is feasible, hence a global solution of (2.12). ∎

Remark 2.8. Lemma 2.7 does not hold under a weaker hypothesis on I1. If we only assume that every feasible point can be approximated by a sequence such that g_i(z^k) < 0 for all i ∈ I1 (with z^k not necessarily feasible), the convergence of the interior-exterior penalty method may not occur.

Example 2.9. Consider the problem of minimizing x subject to x(x − 1) = 0, −x³ ≤ 0. The global minimizer is x* = 0, but the interior-exterior penalty method converges to 1 (see [12] for details).

Theorem 2.10. Let x* be a local minimizer of (2.1) and assume that I ⊂ {1, ..., p} satisfies the sufficient interior property. Then x* satisfies AKKT(I).

Proof. Let δ > 0 be such that f(x*) ≤ f(x) for all feasible x with ‖x − x*‖ ≤ δ. Consider the problem

   minimize f(x) + ‖x − x*‖²  subject to h(x) = 0, g(x) ≤ 0, x ∈ B(x*, δ).   (2.14)

Clearly, x* is the unique solution of (2.14). Let x^k be a solution of

   minimize f(x) + ‖x − x*‖² + ρ_k [ ‖h(x)‖² + Σ_{i∉I} g_i(x)_+² ] − (1/ρ_k) Σ_{i∈I} 1/g_i(x)
   subject to g_i(x) < 0 for all i ∈ I, x ∈ B(x*, δ).

By the compactness of B(x*, δ) and standard arguments of barrier methods [8], x^k is well defined for all k. By the sufficient interior property, the hypotheses of Lemma 2.7 are fulfilled; therefore the sequence {x^k} converges to x*. For k large enough one has ‖x^k − x*‖ < δ, so the gradient of the objective function of the subproblem must vanish. Thus,

   ∇f(x^k) + 2(x^k − x*) + Σ_{i=1}^m 2ρ_k h_i(x^k) ∇h_i(x^k) + Σ_{i∉I} 2ρ_k g_i(x^k)_+ ∇g_i(x^k) + Σ_{i∈I} [1/(ρ_k g_i(x^k)²)] ∇g_i(x^k) = 0,
   and g_i(x^k) < 0 for all i ∈ I, k ∈ N.   (2.15)

Let us write λ_i^k = 2ρ_k h_i(x^k) for all i = 1, ..., m, μ_i^k = 2ρ_k g_i(x^k)_+ if i ∉ I, and μ_i^k = 1/(ρ_k g_i(x^k)²) if i ∈ I. By (2.15) we have that (2.4) holds. Since x^k − x* → 0, we obtain

   lim_{k→∞} ∇f(x^k) + ∇h(x^k) λ^k + ∇g(x^k) μ^k = 0.   (2.16)

If g_i(x*) < 0 then, for k large enough, g_i(x^k) < 0; hence, if i ∉ I, we have g_i(x^k)_+ = 0 and μ_i^k = 0. If g_i(x*) < 0 and i ∈ I, we have that 1/(ρ_k g_i(x^k)²) → 0, so in (2.16) one has μ_i^k → 0. By the continuity of g_i, (2.16) remains true after redefining μ_i^k = 0 for i ∈ I with g_i(x*) < 0. ∎
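The proof of Theorem 2.10 is constructive, so the internal-external penalty scheme of Lemma 2.7 can be sketched numerically. The sketch below drops the proximal term ‖x − x*‖² and applies the scheme to a made-up problem (minimize x1 + x2 subject to x1 − x2 = 0 and −x1 ≤ 0, with the inequality handled by the inverse barrier, I = {1}); the multiplier estimates follow the ones introduced in the proof:

```python
import numpy as np
from scipy.optimize import minimize

# Made-up test problem: minimize x1 + x2 subject to h(x) = x1 - x2 = 0 and
# g(x) = -x1 <= 0, the inequality treated by the inverse barrier (I = {1}).
def subproblem(x, rho):
    f = x[0] + x[1]
    penalty = rho * (x[0] - x[1]) ** 2        # external penalty on h
    barrier = 1.0 / (rho * x[0])              # -(1/rho) * 1/g(x) with g(x) = -x1 < 0
    return f + penalty + barrier

x = np.array([0.5, 0.0])
for rho in [1e1, 1e2, 1e3, 1e4]:
    res = minimize(subproblem, x, args=(rho,), method="L-BFGS-B",
                   bounds=[(1e-10, None), (None, None)])   # keeps g(x) < 0, i.e. x1 > 0
    x = res.x
    lam = 2 * rho * (x[0] - x[1])             # multiplier estimate for h, as in the proof
    mu = 1.0 / (rho * x[0] ** 2)              # barrier-based multiplier estimate for g
    akkt = np.array([1.0, 1.0]) + lam * np.array([1.0, -1.0]) + mu * np.array([-1.0, 0.0])
    print(rho, x, np.linalg.norm(akkt))       # x^k -> (0, 0) and the AKKT residual -> 0
```

The printed residual is ‖∇f(x^k) + λ^k ∇h(x^k) + μ^k ∇g(x^k)‖, which vanishes as ρ_k grows, in line with the conclusion of Theorem 2.10.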
2.1.2 AKKT(I) is a strong optimality condition

In this section we prove that AKKT(I) implies "KKT or not-CPLD", where CPLD is the Constant Positive Linear Dependence constraint qualification.

Definition 2.11. Feasible points that satisfy MFCQ will be called MF-regular.

Definition 2.12. A feasible point x ∈ X is said to satisfy the CPLD condition if it is MF-regular or if, for any I0 ⊂ I(x) and J0 ⊂ {1, ..., m} such that the set of gradients {∇g_i(x)}_{i∈I0} ∪ {∇h_j(x)}_{j∈J0} is positive-linearly dependent, there exists a neighborhood N(x) of x such that, for any y ∈ N(x), the set {∇g_i(y)}_{i∈I0} ∪ {∇h_j(y)}_{j∈J0} is linearly dependent.

Equivalently, the feasible point x fulfills the CPLD condition if the following property holds: whenever i1, ..., iq ∈ {1, ..., m} and j1, ..., jr ∈ {1, ..., p} are such that g_{jl}(x) = 0, l = 1, ..., r, and the gradients ∇h_{i1}(x), ..., ∇h_{iq}(x), ∇g_{j1}(x), ..., ∇g_{jr}(x) are linearly dependent with nonnegative coefficients corresponding to the gradients of the inequalities, then there exists a neighborhood V of x such that ∇h_{i1}(z), ..., ∇h_{iq}(z), ∇g_{j1}(z), ..., ∇g_{jr}(z) are linearly dependent for all z ∈ V.

The CPLD condition was introduced in [15] and its status as a constraint qualification was elucidated in [4]. Every local minimizer that satisfies CPLD necessarily fulfills the KKT conditions [4]. This means that "KKT or not-CPLD" is a necessary optimality condition. This condition is fulfilled by any feasible limit point of Algencan [2]. Since CPLD is weaker than the Mangasarian-Fromovitz constraint qualification (MFCQ) [13], the optimality condition "KKT or not-CPLD" is stronger than the Fritz John conditions, which can be expressed in the form "KKT or not-MFCQ" [16].

Here we are going to prove that AKKT(I) is strictly stronger than "KKT or not-CPLD". Before proving that AKKT(I) implies "KKT or not-CPLD", let us show that the converse is not true. In fact, the following example shows that this is the case for an arbitrary constraint qualification CQ. More general constraint qualifications include Guignard's [11], Abadie's [1] and the ones surveyed in [6]. The example below implies that, in particular, "KKT or not-CPLD" does not imply AKKT.

Example 2.13 ("KKT or not-CQ" does not imply AKKT). Recall that every local minimizer x* that satisfies a constraint qualification necessarily fulfills the KKT conditions. Consider a nonlinear programming problem whose feasible set is {x ∈ R² | x1² = 0}. No feasible point satisfies any constraint qualification. To verify this, consider the objective function f1(x1, x2) = x1. Although all the feasible points are minimizers, the gradient of f1 is never a linear combination of the constraint gradient; therefore the local minimizers are not KKT points. This means that, independently of the objective function and the constraint qualification, all the feasible points satisfy "KKT or not-CQ". Now consider the objective function f2(x1, x2) = x2. Since ∇f2(x) = (0, 1)^T for all x and ∇h(x) = (2x1, 0)^T is a multiple of (1, 0)^T for all x, it turns out that ∇f2(x) + λ∇h(x) is bounded away from zero for all x and λ. Therefore ∇f2(x^k) + λ_k ∇h(x^k) cannot tend to zero. Thus no feasible point satisfies AKKT.

Theorem 2.14. AKKT(I) implies "KKT or not-CPLD".

Proof. Assume that x* satisfies AKKT(I) and CPLD. Then there exist sequences {x^k} ⊂ R^n, {λ^k} ⊂ R^m, {μ^k} ⊂ R^p_+, {ε_k} ⊂ R_+ such that x^k → x*, ε_k → 0, satisfying (2.5) and (2.6). (We do not need to use (2.7) at all in this proof.) Therefore x^k, x*, λ^k, μ^k satisfy the conditions used in Theorem 4.5 of [2] in the context of proving KKT for Algencan. Thus we can reproduce the arguments of that theorem to prove that x* satisfies KKT. ∎

2.2 Approximate gradient projection conditions

Approximate gradient projection (AGP) conditions were introduced in [14], where the authors observed that AGP is the optimality condition that fits the natural stopping criterion of Inexact Restoration methods [7].

Let γ ∈ (0, ∞]. We say that a feasible point x* of (2.1) satisfies the AGP(γ) condition introduced in [14] when there exists a sequence {x^k} that tends to x* and satisfies

   lim_{k→∞} ‖P_{Ω_k}(x^k − ∇f(x^k)) − x^k‖ = 0,   (2.17)

where Ω_k is the set of points x ∈ R^n defined by

   ∇h_i(x^k)^T (x − x^k) = 0 for all i = 1, ..., m,   (2.18)
   ∇g_j(x^k)^T (x − x^k) ≤ 0 for all j such that g_j(x^k) ≥ 0,   (2.19)
   g_j(x^k) + ∇g_j(x^k)^T (x − x^k) ≤ 0 for all j such that −γ < g_j(x^k) < 0.   (2.20)
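Since Ω_k is polyhedral, the quantity in (2.17) can be computed at a given iterate by solving a small quadratic program. A sketch (γ = ∞ by default, and the projection is delegated to SciPy's SLSQP solver; both choices are implementation assumptions, not prescriptions from the text):

```python
import numpy as np
from scipy.optimize import minimize

def agp_measure(xk, grad_f, jac_h, jac_g, g_val, gamma=np.inf):
    """||P_{Omega_k}(x^k - grad f(x^k)) - x^k|| from (2.17), with Omega_k built
    from the linearizations (2.18)-(2.20) and the projection solved as a QP."""
    target = xk - grad_f
    cons = []
    for a in jac_h:                                        # (2.18)
        cons.append({"type": "eq", "fun": lambda x, a=a: a @ (x - xk)})
    for a, gj in zip(jac_g, g_val):
        if gj >= 0:                                        # (2.19)
            cons.append({"type": "ineq", "fun": lambda x, a=a: -(a @ (x - xk))})
        elif -gamma < gj < 0:                              # (2.20)
            cons.append({"type": "ineq",
                         "fun": lambda x, a=a, gj=gj: -(gj + a @ (x - xk))})
    res = minimize(lambda x: 0.5 * np.sum((x - target) ** 2), xk,
                   jac=lambda x: x - target, constraints=cons, method="SLSQP")
    return np.linalg.norm(res.x - xk)

# Hypothetical usage: minimize x subject to -x <= 0 (x* = 0); the measure
# vanishes along interior sequences x^k -> 0.
for t in [1.0, 0.1, 0.01]:
    print(agp_measure(np.array([t]), grad_f=np.array([1.0]),
                      jac_h=np.zeros((0, 1)), jac_g=np.array([[-1.0]]),
                      g_val=np.array([-t])))
```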
Martínez and Svaiter [14] proved that AGP(γ) is an optimality condition (every local minimizer satisfies it) and that AGP(γ) is equivalent to AGP(γ') for all γ, γ' ∈ (0, ∞]. For this reason we will always write AGP instead of AGP(γ). In [14] it was also proved that AGP implies the Fritz John condition ("KKT or not-MFCQ"). The stronger result that AGP implies "KKT or not-CPLD" seems to have been proved for the first time in [10].

If {x^k} is a sequence generated by an optimization algorithm, the natural stopping criterion associated with AGP is given by (2.9) and ‖P_{Ω_k}(x^k − ∇f(x^k)) − x^k‖ ≤ ε_opt.

It is easy to prove that AGP implies AKKT [17]. Surprisingly, the reciprocal is not true, as the following example shows. Therefore, AGP is a stronger optimality condition than AKKT.

Example 2.15 (AKKT does not imply AGP). Consider the problem

   minimize f(x1, x2) subject to h(x1, x2) = 0, g(x1, x2) ≤ 0,

where f(x1, x2) = −x2, h(x1, x2) = x1 x2, g(x1, x2) = −x1. Define x* = (0, 1)^T.

Let us show first that x* does not satisfy AGP. Assume that x^k → x*. If x1^k > 0, the set Ω_k defined by (2.18)-(2.20) is the intersection of the half-space x1 ≥ 0 with the tangent line to h(x1, x2) = h(x1^k, x2^k) that passes through x^k. This line tends to be vertical when x^k approaches x*. Therefore P_{Ω_k}(x^k − ∇f(x^k)) − x^k tends to (0, 1)^T. Analogously, if x1^k < 0, the set Ω_k is the half-space x1 ≥ x1^k intersected with the tangent line to h(x1, x2) = x1^k x2^k that passes through x^k, so P_{Ω_k}(x^k − ∇f(x^k)) − x^k again tends to (0, 1)^T. Therefore, for any sequence x^k → x*, P_{Ω_k}(x^k − ∇f(x^k)) − x^k cannot tend to zero. As a consequence, x* does not satisfy AGP.

Now let us show that x* satisfies AKKT. Define x^k = (1/k, 1)^T and λ^k = μ^k = k. Then, for all k ∈ N, we have

   ∇f(x^k) + ∇h(x^k) λ^k + ∇g(x^k) μ^k = (0, −1)^T + (1, 1/k)^T k + (−1, 0)^T k = 0.

Since g(x*) = 0, we have that (2.2)-(2.4) hold, so x* satisfies AKKT.
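Both halves of Example 2.15 can be checked numerically: the AKKT residual along x^k = (1/k, 1)^T with λ^k = μ^k = k is exactly zero, while the AGP measure stays close to 1. A sketch (γ = ∞, and the projection is computed with SLSQP; these are implementation choices made only for the illustration):

```python
import numpy as np
from scipy.optimize import minimize

# Example 2.15: f = -x2, h = x1*x2, g = -x1, x* = (0, 1).
for k in [10, 100, 1000]:
    xk = np.array([1.0 / k, 1.0])
    grad_f = np.array([0.0, -1.0])
    grad_h = np.array([xk[1], xk[0]])                      # gradient of x1*x2 at x^k
    grad_g = np.array([-1.0, 0.0])

    # AKKT residual with lambda^k = mu^k = k, the sequence used in the text.
    akkt = grad_f + k * grad_h + k * grad_g

    # AGP measure: project x^k - grad_f onto Omega_k from (2.18)-(2.20).
    target = xk - grad_f
    cons = [{"type": "eq", "fun": lambda x, xk=xk, a=grad_h: a @ (x - xk)},
            {"type": "ineq", "fun": lambda x: x[0]}]       # linearization of -x1 <= 0
    res = minimize(lambda x, t=target: 0.5 * np.sum((x - t) ** 2), xk,
                   constraints=cons, method="SLSQP")
    print(k, np.linalg.norm(akkt), np.linalg.norm(res.x - xk))
    # The AKKT residual is identically zero, while the AGP measure stays near 1.
```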
A variation of the AGP condition has been used in the literature in the context of inexact restoration methods and of mathematical programming with complementarity constraints, without mention of the fact that this modification is not equivalent to the original AGP condition given in [14]. Roughly speaking, the variation consists of including some of the constraints of problem (2.1) in the definition of Ω_k and imposing that these constraints be satisfied by x^k for all k ∈ N. For several years, authors seemed to believe that all these sequential optimality conditions (including AKKT) were equivalent. We will see here that this is not the case.

2.2.1 C-AGP condition

Assume that the functions g_{q+1}, ..., g_p in (2.1) are convex and h_{r+1}, ..., h_m are affine. Therefore, the set Ω defined by g_j(x) ≤ 0 for j = q + 1, ..., p and h_i(x) = 0 for i = r + 1, ..., m is closed and convex. We say that x* satisfies the convex-AGP (C-AGP) condition if there exists a sequence {x^k} ⊂ Ω that tends to x* and satisfies

   lim_{k→∞} ‖P_{Ω_k ∩ Ω}(x^k − ∇f(x^k)) − x^k‖ = 0,   (2.21)

where Ω_k is defined by (2.18)-(2.20) for i = 1, ..., r and j = 1, ..., q. If an optimization algorithm generates iterates x^k ∈ Ω, the natural stopping criterion associated with C-AGP requires (2.9) and ‖P_{Ω_k ∩ Ω}(x^k − ∇f(x^k)) − x^k‖ ≤ ε_opt.

Let us prove that, under the condition that the constraints in Ω satisfy some constraint qualification, C-AGP is an optimality condition.

Theorem 2.16. Let x* be a local minimizer of (2.1) and assume that, for all x ∈ Ω, the constraints g_i(x) ≤ 0, i = q + 1, ..., p, and h_i(x) = 0, i = r + 1, ..., m, satisfy some constraint qualification. Then x* satisfies C-AGP.

Proof. We use the technique employed in [14] for proving that AGP is an optimality condition. Let δ > 0 be such that x* is a global minimizer of (2.1) with the additional constraint ‖x − x*‖ ≤ δ. Therefore x* is the unique global minimizer of f(x) + (1/2)‖x − x*‖² subject to the constraints of (2.1) and ‖x − x*‖ ≤ δ. Assume that ρ_k → ∞ and let x^k be a global minimizer of

   f(x) + (1/2)‖x − x*‖² + ρ_k [ Σ_{i=1}^r h_i(x)² + Σ_{i=1}^q g_i(x)_+² ]

subject to x ∈ Ω and ‖x − x*‖ ≤ δ. By the theory of convergence of external penalty methods, since x* is the unique global minimizer, it turns out that x^k → x*, so ‖x^k − x*‖ < δ for k large enough. Since Ω satisfies a constraint qualification, the KKT conditions of the subproblem must hold. Let A_k = {i ∈ {q + 1, ..., p} | g_i(x^k) = 0}. Therefore, for k large enough there exist μ_i^k ≥ 0, i ∈ A_k, and λ_i^k, i = r + 1, ..., m, such that

   ∇f(x^k) + 2ρ_k [ Σ_{i=1}^r h_i(x^k) ∇h_i(x^k) + Σ_{i=1}^q g_i(x^k)_+ ∇g_i(x^k) ] + (x^k − x*) + Σ_{i=r+1}^m λ_i^k ∇h_i(x^k) + Σ_{i∈A_k} μ_i^k ∇g_i(x^k) = 0.

Put

   A = x^k + 2ρ_k [ Σ_{i=1}^r h_i(x^k) ∇h_i(x^k) + Σ_{i=1}^q g_i(x^k)_+ ∇g_i(x^k) ] + Σ_{i=r+1}^m λ_i^k ∇h_i(x^k) + Σ_{i∈A_k} μ_i^k ∇g_i(x^k).

Thus x^k − ∇f(x^k) − A = x^k − x*. By the nonexpansiveness of projections, we deduce that

   ‖P_{Ω_k ∩ Ω}(x^k − ∇f(x^k)) − P_{Ω_k ∩ Ω}(A)‖ ≤ ‖x^k − x*‖.

But, by the definition of x^k, writing the optimality conditions of the projection minimization problem, we obtain P_{Ω_k ∩ Ω}(A) = x^k. Therefore

   ‖P_{Ω_k ∩ Ω}(x^k − ∇f(x^k)) − x^k‖ ≤ ‖x^k − x*‖ → 0.

This completes the proof. ∎
Our next result shows that C-AGP is a strong optimality condition in the sense that it implies "KKT or not-MFCQ".

Theorem 2.17. Assume that the feasible point x* satisfies C-AGP and the Mangasarian-Fromovitz constraint qualification. Then x* satisfies the KKT conditions.

Proof. In order to simplify the notation we consider here r = m; the case r < m follows straightforwardly. By the C-AGP condition, there exists a sequence {x^k} such that x^k → x* and y^k − x^k → 0, where y^k is the solution of

   minimize (1/2)‖y − x^k + ∇f(x^k)‖²   (2.22)
   subject to
   ∇h(x^k)^T (y − x^k) = 0,   (2.23)
   g_i(x^k) + ∇g_i(x^k)^T (y − x^k) ≤ 0, i = 1, ..., q,   (2.24)
   g_j(y) ≤ 0, j = q + 1, ..., p.   (2.25)

Observe first that, since x^k → x* and y^k − x^k → 0, we have

   lim_{k→∞} y^k = x*.   (2.26)

Assume that i ≤ q is such that g_i(x*) < 0. Then there exists c > 0 such that g_i(x^k) < −c < 0 for k large enough. Thus, since y^k − x^k → 0, for k large enough we have

   g_i(x^k) + ∇g_i(x^k)^T (y^k − x^k) < −c/2 < 0.   (2.27)

Assume now that j ∈ {q + 1, ..., p} is such that g_j(x*) < 0. Then, by (2.26),

   g_j(y^k) < 0   (2.28)

for k large enough. By (2.27) and (2.28), for k large enough the indices of the active constraints of (2.22)-(2.25) at y^k are contained in the set of indices of the active constraints of (2.1) at x*.

Assume now that λ^k ∈ R^m and μ_1^k, ..., μ_p^k ∈ R_+ are such that

   ∇h(x^k) λ^k + Σ_{i=1}^q μ_i^k ∇g_i(x^k) + Σ_{j=q+1}^p μ_j^k ∇g_j(y^k) = 0,   (2.29)

with μ_i^k = 0 if the corresponding constraint (2.24) is not active at y^k and μ_j^k = 0 if the corresponding constraint (2.25) is not active at y^k. Assume, moreover, that for infinitely many indices k at least one of the coefficients λ_1^k, ..., λ_m^k, μ_1^k, ..., μ_p^k is nonzero. Then, dividing (2.29) by the maximum modulus of the coefficients, we may assume without loss of generality that the maximum modulus of the coefficients in (2.29) is 1 for all k. Using compactness and taking limits for k → ∞ in (2.29) we obtain

   ∇h(x*) λ + Σ_{i=1}^q μ_i ∇g_i(x*) + Σ_{j=q+1}^p μ_j ∇g_j(x*) = 0,

where μ_1, ..., μ_p ≥ 0, at least one of the coefficients is nonzero and, for all i = 1, ..., p, μ_i = 0 if g_i(x*) < 0. This is not possible since, by hypothesis, x* satisfies MFCQ. Therefore the existence of such λ^k ∈ R^m, μ_1^k, ..., μ_p^k ∈ R_+ satisfying (2.29) is not possible. This means that, for all k large enough, y^k satisfies the Mangasarian-Fromovitz constraint qualification corresponding to problem (2.22)-(2.25).

It turns out that, for all k large enough, the KKT conditions of (2.22)-(2.25) are fulfilled. Therefore, for k large enough, there exist λ^k ∈ R^m and μ^k ∈ R^p_+ such that

   y^k − x^k + ∇f(x^k) + ∇h(x^k) λ^k + Σ_{i∈I_k} μ_i^k ∇g_i(x^k) + Σ_{j∈J_k} μ_j^k ∇g_j(y^k) = 0,   (2.30)

where I_k and J_k are the index sets of the active inequality constraints at y^k. Above we have proved that I_k ⊂ I_* and J_k ⊂ J_*, where I_* and J_* are the index sets of the active inequality constraints of (2.1) at x*.

If the sequences {λ^k} and {μ^k} are bounded, then taking convergent subsequences and passing to the limit in (2.30) we arrive at the KKT conditions at x*. If at least one of the sequences {λ^k}, {μ^k} is unbounded, the maximum element M_k of |λ_i^k|, i = 1, ..., m, and μ_j^k, j = 1, ..., p, tends to infinity along some subsequence. Dividing both members of (2.30) by M_k, we get

   (y^k − x^k + ∇f(x^k))/M_k + ∇h(x^k) (λ^k/M_k) + Σ_{i∈I_k} (μ_i^k/M_k) ∇g_i(x^k) + Σ_{j∈J_k} (μ_j^k/M_k) ∇g_j(y^k) = 0.   (2.31)

Taking limits along convergent subsequences in (2.31), we obtain ∇h(x*) λ* + ∇g(x*) μ* = 0, where μ* ≥ 0 and ‖λ*‖ + ‖μ*‖ > 0. This means that x* does not satisfy MFCQ, which contradicts the hypothesis. ∎

Example 2.18 (AGP does not imply C-AGP). Consider the problem of minimizing x2 subject to the constraints of Example 2.15. Let us show that the point x* = (0, 1)^T does not satisfy C-AGP. If {x^k} is such that x1^k ≥ 0 for all k, the same argument used in Example 2.15 may be used to show that (2.21) cannot hold. However, x* satisfies AGP. To see this, consider the sequence x^k = (−1/k, 1)^T. In this case, the projection of x^k − ∇f(x^k) on the set defined by ∇h(x^k)^T (x − x^k) = 0 and ∇g(x^k)^T (x − x^k) ≤ 0 is equal to x^k for all k ∈ N. Therefore AGP holds.

The above result encourages one to conjecture that C-AGP also implies "KKT or not-CPLD", as is the case for the original AGP condition. Surprisingly, this is not true, as the following example shows.

Example 2.19 (C-AGP does not imply "KKT or not-CPLD"). Consider problem (2.1) with n = 2, p = 2, q = 1, f(x1, x2) = x1, g1(x1, x2) = −x1² − x2, g2(x1, x2) = x1² + x2. The function g2 is obviously convex and the set of points such that g2(x1, x2) ≤ 0 clearly satisfies standard constraint qualifications. It is easy to see that the CPLD condition is fulfilled at x* = (0, 0)^T, since ∇g1(x) + ∇g2(x) = 0 for all x. On the other hand, the KKT conditions do not hold at x*. Let us show, however, that C-AGP is satisfied. We define, for all k ∈ N, x^k = x*. Then ∇f(x^k) = (1, 0)^T for all k and x^k − ∇f(x^k) = (−1, 0)^T. Now the set Ω_k is {x ∈ R² | ∇g1(x^k)^T (x − x^k) ≤ 0}, so Ω_k is the half-plane x2 ≥ 0. This implies that P_{Ω ∩ Ω_k}(x^k − ∇f(x^k)) = x^k for all k, and therefore ‖P_{Ω ∩ Ω_k}(x^k − ∇f(x^k)) − x^k‖ = 0 for all k. This means that the C-AGP condition is satisfied at x*.
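The claims of Example 2.19 are easy to confirm numerically: the KKT system at x* = (0, 0)^T has no nonnegative multipliers, the gradients ∇g1 and ∇g2 sum to zero everywhere (so CPLD holds trivially), and Ω_k ∩ Ω reduces to the origin, which makes the C-AGP measure zero. A sketch (the grid test at the end is only a crude visualization of the last fact):

```python
import numpy as np
from scipy.optimize import nnls

# Example 2.19 at x* = (0, 0): f = x1, g1 = -x1**2 - x2 (general), g2 = x1**2 + x2 (convex).
grad_f = np.array([1.0, 0.0])
grad_g = np.array([[0.0, -1.0],       # grad g1(x*)
                   [0.0,  1.0]])      # grad g2(x*)

# (a) KKT fails: min_{mu >= 0} ||grad_f + grad_g.T @ mu|| stays equal to 1.
mu, residual = nnls(grad_g.T, -grad_f)
print("best KKT residual:", residual)                      # 1.0, so x* is not a KKT point

# (b) CPLD holds everywhere: grad g1(x) + grad g2(x) = 0 for every x.
xt = np.array([0.3, -0.7])                                 # arbitrary test point
print(np.array([-2 * xt[0], -1.0]) + np.array([2 * xt[0], 1.0]))   # [0. 0.]

# (c) C-AGP with x^k = x*: Omega_k ∩ Omega = {x2 >= 0} ∩ {x1**2 + x2 <= 0} = {(0, 0)},
#     so projecting x^k - grad_f = (-1, 0) onto it gives x* back and the measure is 0.
vals = np.arange(-100, 101) / 100.0
X1, X2 = np.meshgrid(vals, vals)
mask = (X2 >= 0) & (X1 ** 2 + X2 <= 0)
print(np.count_nonzero(mask), X1[mask], X2[mask])          # exactly one grid point: the origin
```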
Example 2.20 (C-AGP does not imply AGP). The example showing that C-AGP does not imply "KKT or not-CPLD" may also be used to show that C-AGP does not imply AGP. In fact, if the point x* satisfied AGP then, since AGP implies "KKT or not-CPLD", this optimality condition would hold at x*.

2.2.2 L-AGP condition

The independence of C-AGP with respect to AGP, and the fact that C-AGP does not imply "KKT or not-CPLD", induce one to think that AGP is, essentially, the strongest sequential optimality condition that can be achieved by numerical optimization algorithms. However, a stronger AGP-like condition may be used in a very common situation: when some of the constraints that define the feasible set are linear.

We say that a feasible point x* satisfies the linear-AGP (L-AGP) condition if it satisfies the C-AGP condition in the case that g_{q+1}, ..., g_p are affine functions.

The status of L-AGP can be deduced from observations already made above. Example 2.18 may be used to show that AGP does not imply L-AGP. On the other hand, if a point x* satisfies L-AGP, the corresponding sequence {x^k} may be used to show that AGP also holds. In other words, L-AGP is strictly stronger than AGP. This supports the point of view that, if an optimization problem possesses linear constraints, it is sensible to preserve feasibility with respect to them, declaring convergence when the AGP criterion holds with some tolerance. On the other hand, using the same criterion with respect to general convex constraints does not seem to have special advantages.

It is worth mentioning that in [3] the L-AGP condition was used (under the name AGP) in connection with mathematical programming problems with equilibrium constraints. In that paper it was shown that, if an algorithm that theoretically converges to L-AGP points converges to a feasible nondegenerate point, then this point is KKT.

2.2.3 Remarks

Assume that one has a nonlinear programming problem with a set I_lin of linear constraints, a different set I_conv of convex constraints and a third set I_gen of general constraints. Let us say that C-AGP is satisfied when (2.21) holds with {r + 1, ..., m} ∪ {q + 1, ..., p} = I_conv, and that L-AGP holds when one defines {r + 1, ..., m} ∪ {q + 1, ..., p} = I_lin. Then the main results of this chapter can be visualized in Figure 2.1, in which CQ represents any constraint qualification, possibly weaker than CPLD (such as quasinormality [6], Guignard's or Abadie's).

Figure 2.1: Punctual and sequential optimality conditions.

Roughly speaking, Figure 2.1 shows that L-AGP is the strongest first-order optimality condition currently used to generate stopping criteria in well-established practical algorithms. Of course, stronger optimality conditions may exist, and new practical methods satisfying them may arise as a result of theoretical and practical research. The results of this chapter may have a practical application in the design of novel optimization algorithms; we have in mind recent sequential quadratic programming methods [9], [18] whose implementation details are not consolidated, as well as inexact restoration methods and methods based on filters.

Moreover, in the last 15 years many algorithms have appeared aiming to solve new optimization-like problems (equilibrium, multiobjective, bilevel, order-value and many others). Punctual necessary optimality conditions have been found for many of these problems but, frequently, their algorithmic consequences are not clear. We believe that the sequential optimality analysis may be useful in these cases, both from the theoretical and the practical point of view.
inexact-restoration method and numerical experiments, J Optim Theory Appl 127 (2005), pp 229-247 [8] A.V Fiacco and G.P McCormick, Nonlinear Programming: Sequential Unconstrained Minimization Techniques, Wiley, New York, 1968 [9] F.A.M Gomes, A sequential quadratic programming algorithm that combines merit function and filter ideas, Comput Appl Math 26 (2007), pp 337-379 32 [10] M.A Gomes-Ruggieero, J.M Mart´ınez, and S.A Santos, Spectral projected gradient method with inexact restoration for minimization with nonconvex constraints, SIAM J Sci Comput 31 (2009), pp 1628-1652 [11] M Guignard, Generalized Kuhn-Tucker conditions for mathematical programming in a Banach spaces, SIAM J Control (1969), pp 232-241 [12] G Haeser, Condic˜oes sequenciais de otimalidade, Tese de Doutorado, Departamento de Matem´atica Aplicada, Universidade Estadual de Campinas, Brazil, 2009 [13] O.L Mangasarian and S Fromovitz, The Fritz-John necessary optimality conditions in presence of equality constraints, J Math Anal Appl 17 (1967), pp 37-47 [14] J.M Mart´ınez and B.F Svaiter, A practical optimality condition without constraint qualifications for nonlinear programming, J Optim Theory Appl 118 (2003), pp 117-133 [15] L Qi and Z Wei, On the constant positive linear dependence condition and its application to SQP methods, SIAM J Optim 10 (2000), pp 963-981 [16] R.T Rockafellar, Lagrange multipliers and optimality, SIAM Rev 35 (1993), pp 183-238 [17] M.l Schuverdt, M´etodos de Lagrangiano Aumentado com convergˆecia usando a condic˜ao de depend´encia linear positiva constante, Tese de Doutorado, Departamento de Matem´atica Aplicada, Universidade Estadual de Campinas, 2006 [18] C Shen, W Xue, and D Pu, A filter SQP algorithm without a feasibility restoration phase, Comput Appl Math 28 (2009), pp 167-194 33 ... 1.5 Optimality conditions for smooth problem 11 Approximate Karush? ? ?Kuhn? ? ?Tucker optimality conditions 16 2.1 Approximate- KKT conditions ... Mangasarian-Fromovits constraint qualification (MFCQ) condition at a point x0 if and only if it is metrically regular at x0 1.5 Optimality conditions for smooth problem Consider the constrained optimization problem... MATHEMATICS DEPARTMENT OF MATHEMATICS Dao Thi Thao ON APPROXIMATE KARUSH- KUHN- TUCKER OPTIMALITY CONDITIONS FOR SMOOTH CONSTRAINED OPTIMIZATION BACHELOR THESIS Major: Analysis SUPERVISOR Dr NGUYEN
