SIAM J. OPTIM.
Vol. 22, No. 4, pp. 1258–1284
© 2012 Society for Industrial and Applied Mathematics

ADJOINT-BASED PREDICTOR-CORRECTOR SEQUENTIAL CONVEX PROGRAMMING FOR PARAMETRIC NONLINEAR OPTIMIZATION*

QUOC TRAN DINH†, CARLO SAVORGNAN‡, AND MORITZ DIEHL‡

Abstract. This paper proposes an algorithmic framework for solving parametric optimization problems which we call adjoint-based predictor-corrector sequential convex programming. After presenting the algorithm, we prove a contraction estimate that guarantees the tracking performance of the algorithm. Two variants of this algorithm are investigated. The first can be used to treat online parametric nonlinear programming problems when the exact Jacobian matrix is available, while the second variant is used to solve nonlinear programming problems. The local convergence of these variants is proved. An application to a large-scale benchmark problem that originates from nonlinear model predictive control of a hydro power plant is implemented to examine the performance of the algorithms.

Key words. predictor-corrector path-following, sequential convex programming, adjoint-based optimization, parametric nonlinear programming, online optimization

AMS subject classifications. 49M37, 65K05, 90C31

DOI. 10.1137/110844349

1. Introduction. In this paper, we consider a parametric nonconvex optimization problem of the form

\[
\mathrm{P}(\xi)\qquad
\begin{cases}
\displaystyle\min_{x\in\mathbb{R}^n} & f(x)\\
\ \text{s.t.} & g(x) + M\xi = 0,\\
& x\in\Omega,
\end{cases}
\]

where $f:\mathbb{R}^n\to\mathbb{R}$ is convex, $g:\mathbb{R}^n\to\mathbb{R}^m$ is nonlinear, $\Omega\subseteq\mathbb{R}^n$ is a nonempty, closed convex set, and the parameter $\xi$ belongs to a given subset $\mathcal{P}\subseteq\mathbb{R}^p$. The matrix $M\in\mathbb{R}^{m\times p}$ plays the role of embedding the parameter $\xi$ into the equality constraints in a linear way. Throughout this paper, $f$ and $g$ are assumed to be differentiable on their domain. Problem $\mathrm{P}(\xi)$ covers many (parametric) nonlinear programming problems such as standard nonlinear programs, nonlinear second order cone programs, and nonlinear semidefinite programs [32, 39, 47]. The theory of parametric optimization has been studied extensively in many research papers and monographs; see, e.g., [7, 25, 42].

This paper deals with the efficient calculation of approximate solutions to a sequence of problems of the form $\mathrm{P}(\xi)$, where the parameter $\xi$ is slowly varying.

*Received by the editors August 15, 2011; accepted for publication (in revised form) July 6, 2012; published electronically October 9, 2012. This research was supported by Research Council KUL CoE EF/05/006 Optimization in Engineering (OPTEC); IOF-SCORES4CHEM; GOA/10/009 (MaNet); GOA/10/11; the Flemish Government through projects G.0452.04, G.0499.04, G.0211.05, G.0226.06, G.0321.06, G.0302.07, G.0320.08, G.0558.08, G.0557.08, G.0588.09, and G.0377.09; ICCoS, ANMMM, and MLDM; IWT PhD grants; the Belgian Federal Science Policy Office, IUAP P6/04; EU, ERNSI; FP7-HDMPC; FP7-EMBOCON; AMINAL; Helmholtz-viCERP; COMET-ACCM; ERC HIGHWIND; and ITN-SADCO.
http://www.siam.org/journals/siopt/22-4/84434.html
†Department of Electrical Engineering (ESAT-SCD) and Optimization in Engineering Center (OPTEC), K.U. Leuven, B-3001 Leuven, Belgium, and Department of Mathematics-Mechanics-Informatics, Vietnam National University, Hanoi, Vietnam (quoc.trandinh@esat.kuleuven.be).
‡Department of Electrical Engineering (ESAT-SCD) and Optimization in Engineering Center (OPTEC), K.U. Leuven, B-3001 Leuven, Belgium (carlo.savorgnan@esat.kuleuven.be, moritz.diehl@esat.kuleuven.be).
In other words, for a sequence $\{\xi_k\}_{k\ge 0}$ such that $\|\xi_{k+1}-\xi_k\|$ is small, we want to solve the problems $\mathrm{P}(\xi_k)$ in an efficient way without requiring more accuracy than needed in the result.

In practice, sequences of problems of the form $\mathrm{P}(\xi)$ arise in the framework of real-time optimization, moving horizon estimation, and online data assimilation, as well as in nonlinear model predictive control (NMPC). A practical obstacle in these applications is the time limitation imposed on solving the underlying optimization problem for each value of the parameter. Instead of completely solving a nonlinear program at each sample time [3, 4, 5, 29], several online algorithms approximately solve the underlying nonlinear optimization problem by performing only the first iteration of exact Newton, sequential quadratic programming (SQP), Gauss–Newton, or interior point methods [17, 40, 54]. In [17, 40] the authors considered such algorithms only in the framework of SQP methods. This approach has proved to be efficient in practice and is widely used in many applications [14]. Recently, Zavala and Anitescu [54] proposed an inexact Newton-type method for solving online optimization problems based on the framework of generalized equations [7, 42].

Other related work considers practical problems which possess a general convexity structure such as second order cone and semidefinite cone constraints and nonsmooth convexity [21, 47]. In these applications, standard optimization methods may not perform satisfactorily. Many algorithms for nonlinear second order cone and nonlinear semidefinite programming have recently been proposed and have found many applications in robust optimal control, experimental design, and topology optimization; see, e.g., [2, 21, 23, 33, 47]. These approaches can be considered as generalizations of the SQP method. Although solving semidefinite programming problems is in general time consuming due to matrix operations, in some practical applications problems may possess only a few expensive constraints such as second order cone or semidefinite cone constraints. In this case, handling these constraints directly in the algorithm may be more efficient than transforming them into scalar constraints.

Contribution. The contribution of this paper is as follows:
(a) We start by proposing a generic framework called the adjoint-based predictor-corrector sequential convex programming (APCSCP) method for solving parametric optimization problems of the form $\mathrm{P}(\xi)$. The algorithm is especially suited for solving nonlinear MPC problems where the evaluations of the derivatives are time-consuming. For example, it can show advantages with respect to standard techniques when applied to problems in which the number of state variables of the dynamic system is much larger than the number of control variables.
(b) We prove the stability of the tracking error for this algorithm (Theorem 3.5).
(c) In the second part of the paper, the theory is specialized to the nonparametric case where a single optimization problem is solved. The local convergence of this variant is also obtained.
(d) Finally, we present a numerical application to large-scale nonlinear model predictive control of a hydro power plant with 259 state variables and 10 controls. The performance of our algorithms is compared with a standard real-time Gauss–Newton method and a conventional MPC approach.
APCSCP is based on three main ideas: sequential convex programming, predictor-corrector path-following, and adjoint-based optimization. We briefly explain these methods in the following.

1.1. Sequential convex programming. The sequential convex programming (SCP) method is a local nonconvex optimization technique. SCP solves a sequence of convex approximations of the original problem by convexifying only the nonconvex parts and preserving the structures that can be exploited efficiently by convex optimization techniques [9, 36, 38]. Note that this method differs from SQP methods, where quadratic programs are used as approximations of the problem. The SCP approach is useful when the problem possesses general convex structures such as conic constraints, a cost function depending on matrix variables, or convex constraints resulting from a lower-level problem in multilevel settings [2, 15, 47]. Due to the complexity of these structures, standard optimization techniques such as SQP and Gauss–Newton-type methods may not be convenient to apply. In the context of nonlinear conic programming, SCP approaches have been proposed under the names sequential semidefinite programming (SSDP) or SQP-type methods [12, 21, 23, 32, 33, 47]. It has been shown in [18] that superlinear convergence is lost if the linear semidefinite programming subproblems in the SSDP algorithm are convexified. In [35] the authors considered a nonlinear program in the framework of a composite minimization problem, where the inner function is linearized to obtain a convex subproblem, which is made strongly convex by adding a quadratic proximal term. In this paper, following the work in [21, 24, 50, 52], we apply the SCP approach to solve problem $\mathrm{P}(\xi)$: the nonconvex constraint $g(x) + M\xi = 0$ is linearized at each iteration to obtain a convex approximation, and the resulting subproblems can be solved by exploiting convex optimization techniques. We would like to point out that the term "sequential convex programming" has also been used in structural optimization; see, e.g., [22, 55]. The papers cited there are related to the methods of moving asymptotes introduced by Svanberg [49].

1.2. Predictor-corrector path-following methods. In order to illustrate the idea of the predictor-corrector path-following method [13, 54] and to distinguish it from other "predictor-corrector" concepts, e.g., the well-known predictor-corrector interior point method proposed by Mehrotra in [37], we summarize the concept of predictor-corrector path-following methods in the case $\Omega \equiv \mathbb{R}^n$ as follows. The KKT system of problem $\mathrm{P}(\xi)$ can be written as $F(z;\xi) = 0$, where $z = (x, y)$ is the primal-dual variable. The solution $z^*(\xi)$ of the KKT system for a given $\xi$ is in general a smooth map. By applying the implicit function theorem, the derivative of $z^*(\cdot)$ is expressed as

\[
\frac{\partial z^*}{\partial \xi}(\xi) = -\left[\frac{\partial F}{\partial z}(z^*(\xi);\xi)\right]^{-1}\frac{\partial F}{\partial \xi}(z^*(\xi);\xi).
\]

In the parametric optimization context, we might have solved a problem with parameter $\bar\xi$, with solution $\bar z = z^*(\bar\xi)$, and want to solve the next problem for a new parameter $\hat\xi$. The tangential predictor $\hat z$ for this new solution $z^*(\hat\xi)$ is given by

\[
\hat z = z^*(\bar\xi) + \frac{\partial z^*}{\partial \xi}(\bar\xi)(\hat\xi - \bar\xi)
       = z^*(\bar\xi) - \left[\frac{\partial F}{\partial z}(z^*(\bar\xi);\bar\xi)\right]^{-1}\frac{\partial F}{\partial \xi}(z^*(\bar\xi);\bar\xi)(\hat\xi - \bar\xi).
\]
Note the similarity with one step of a Newton method. In fact, a combination of the tangential predictor and the corrector due to a Newton method proves to be useful in the case that $\bar z$ was not the exact solution of $F(z;\bar\xi) = 0$ but only an approximation. In this case, linearization at $(\bar z, \bar\xi)$ yields the formula that one step of a predictor-corrector path-following method needs to satisfy:

\[
F(\bar z;\bar\xi) + \frac{\partial F}{\partial \xi}(\bar z;\bar\xi)(\hat\xi - \bar\xi) + \frac{\partial F}{\partial z}(\bar z;\bar\xi)(\hat z - \bar z) = 0. \tag{1.1}
\]

Written explicitly, it delivers the solution guess $\hat z$ for the next parameter $\hat\xi$ as

\[
\hat z = \bar z \underbrace{- \left[\frac{\partial F}{\partial z}(\bar z;\bar\xi)\right]^{-1}\frac{\partial F}{\partial \xi}(\bar z;\bar\xi)(\hat\xi - \bar\xi)}_{=\Delta z_{\text{predictor}}} \underbrace{- \left[\frac{\partial F}{\partial z}(\bar z;\bar\xi)\right]^{-1} F(\bar z;\bar\xi)}_{=\Delta z_{\text{corrector}}}. \tag{1.2}
\]

Note that when the parameter enters $F$ linearly, we can write

\[
\frac{\partial F}{\partial \xi}(\bar z;\bar\xi)(\hat\xi - \bar\xi) = F(\bar z;\hat\xi) - F(\bar z;\bar\xi).
\]

Thus, (1.1) reduces to

\[
F(\bar z;\hat\xi) + \frac{\partial F}{\partial z}(\bar z)(\hat z - \bar z) = 0. \tag{1.3}
\]

It follows that the predictor-corrector step can be easily obtained by just applying one standard Newton step to the new problem $\mathrm{P}(\hat\xi)$, initialized at the past solution guess $\bar z$, if we employ the parameter embedding in the problem formulation [14]. Based on the above analysis, the predictor-corrector path-following method only performs the first iteration of the exact Newton method for each new problem. In this paper, by applying the generalized equation framework [42, 43], we generalize this idea to the case where more general convex constraints are considered. When the parameter does not enter the problem linearly, we can always reformulate the problem in the form $\mathrm{P}(\xi)$ by using intermediate variables; in this case, the derivatives with respect to these intermediate variables contain the information of the predictor term. Finally, we notice that the real-time iteration scheme proposed in [17] can be considered as a variant of the above predictor-corrector method in the SQP context.
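To make the parameter-embedding mechanism of (1.3) concrete, the following Python sketch performs one predictor-corrector step, i.e., one Newton step on $F(z;\xi) = 0$ evaluated at the new parameter. The toy equality-constrained problem and all function names are our own illustration, not part of the original development.

```python
# A minimal sketch of one predictor-corrector step (1.3): a single Newton step on
# the parameter-embedded KKT system F(z; xi) = 0, initialized at the previous
# solution guess. The toy problem (min 0.5*||x||^2 s.t. x1 + x2 + xi = 0) is ours;
# its KKT system is linear in z, so one step is exact here.
import numpy as np

def F(z, xi):
    x1, x2, y = z
    return np.array([x1 + y, x2 + y, x1 + x2 + xi])

def dFdz(z, xi):
    # Jacobian of F with respect to z (constant for this toy problem)
    return np.array([[1.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0],
                     [1.0, 1.0, 0.0]])

def predictor_corrector_step(z_bar, xi_new):
    # One Newton step at the *new* parameter: the parameter embedding merges the
    # tangential predictor and the corrector into a single linear solve.
    dz = np.linalg.solve(dFdz(z_bar, xi_new), -F(z_bar, xi_new))
    return z_bar + dz

z = np.array([0.5, 0.5, -0.5])          # solution guess for xi = -1
for xi in [-1.0, -0.8, -0.6]:
    z = predictor_corrector_step(z, xi)
    print(xi, z, np.linalg.norm(F(z, xi)))
```

For a nonlinear $F$, the same single step yields the first-order tracking behavior that is analyzed rigorously in section 3.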
1.3. Adjoint-based methods. From a practical point of view, most of the time spent on solving optimization problems resulting from simulation-based methods is needed to evaluate the functions and their derivatives [6]. Moreover, in some applications, the time needed to evaluate all the derivatives of the functions exceeds the time available to compute the solution of the optimization problem. Adjoint-based methods rely on the observation that it is not necessary to use exact Jacobian matrices of the constraints. The adjoint-based Newton-type methods in [19, 28, 45] can work with an inexact Jacobian matrix and only require an exact evaluation of the Lagrange gradient, computed using adjoint derivatives, to form the approximate optimization subproblems in the algorithm. This technique still allows the algorithm to converge to the exact solutions but can save valuable time in the online performance of the algorithm.

1.4. A tutorial example. The idea of the APCSCP method is illustrated in the following simple example.

Example 1.1 (tutorial example). Let us consider a simple nonconvex parametric optimization problem:

\[
\min_{x\in\mathbb{R}^2} \big\{ -x_1 \ \big|\ x_1^2 + 2x_2 + 2 - 4\xi = 0,\ x_1^2 - x_2^2 + 1 \le 0,\ x \ge 0 \big\}, \tag{1.4}
\]

where $\xi \in \mathcal{P} := \{\xi \in \mathbb{R} : \xi \ge 1.2\}$ is a parameter. After a few calculations, we can show that $x^*_\xi = \big(2\sqrt{\xi - \sqrt{\xi}},\ 2\sqrt{\xi} - 1\big)^T$ is a stationary point of problem (1.4), which is also the unique global optimum. It is clear that problem (1.4) satisfies the strong second order sufficient condition (SSOSC) at $x^*_\xi$.

Note that the constraint $x_1^2 - x_2^2 + 1 \le 0$ can be rewritten as the second order cone constraint $\|(x_1, 1)^T\| \le x_2$ under the condition $x_2 \ge 0$. Let us define $g(x) := x_1^2 + 2x_2 + 2$, $M := -4$, and $\Omega := \{x \in \mathbb{R}^2 \mid \|(x_1, 1)^T\| \le x_2,\ x \ge 0\}$. Then problem (1.4) can be cast into the form $\mathrm{P}(\xi)$.

The aim is to approximately solve problem (1.4) at each given value $\xi_k$ of the parameter $\xi$. Instead of solving the nonlinear optimization problem at each $\xi_k$ until complete convergence, APCSCP performs only the first step of the SCP algorithm to obtain an approximate solution $x^k$ at $\xi_k$. The convex subproblem to be solved at each $\xi_k$ in the APCSCP method is

\[
\min_{x} \big\{ -x_1 \ \big|\ 2x_1^k x_1 + 2x_2 - (x_1^k)^2 + 2 - 4\xi = 0,\ \|(x_1, 1)^T\| \le x_2,\ x \ge 0 \big\}. \tag{1.5}
\]

We compare this method with two other known real-time iteration algorithms. The first is the real-time iteration with an exact SQP method, and the second is the real-time iteration with an SQP method using a projected Hessian [17, 31]. In the second algorithm, the Hessian matrix of the Lagrange function is projected onto the cone of symmetric positive semidefinite matrices to obtain a convex quadratic programming subproblem.

Figures 1.1 and 1.2 illustrate the performance of the three methods for $\xi_k = 1.2 + k\Delta\xi_k$, $k = 0, \dots, 9$, with $\Delta\xi_k = 0.25$. Figure 1.1 presents the approximate solution trajectories given by the three methods, while Figure 1.2 shows the tracking errors and the cone constraint violations of these methods. The initial point $x^0$ of all three methods is chosen at the true solution of $\mathrm{P}(\xi_0)$. We can see that the performance of the exact SQP and of the SQP using the projected Hessian is quite similar; however, the second order cone constraint $\|(x_1, 1)^T\| \le x_2$ is violated by both methods. The SCP method preserves feasibility and follows the exact solution trajectory better. Note that the subproblem is a nonconvex quadratic program in the exact SQP method, a convex QP in the projected SQP case, and a second order cone constrained program (1.5) in the SCP method.

Fig. 1.1. The trajectory of the three methods ($k = 0, \dots, 9$); $*$ is $x^*(\xi_k)$ and $\circ$ is $x^k$.

Fig. 1.2. The tracking error and the cone constraint violation of the three methods ($k = 0, \dots, 9$).
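For readers who wish to reproduce the tutorial example, the sketch below performs one SCP step (1.5) per parameter value, warm-started at the previous iterate. The use of cvxpy as the SOCP solver is our own choice and not part of the original experiments.

```python
# A sketch of the SCP tracking scheme on the tutorial example: for each xi_k one
# convex subproblem (1.5) is solved, warm-started at the previous iterate.
# cvxpy is assumed to be installed; any SOCP solver could be used instead.
import numpy as np
import cvxpy as cp

def scp_step(x_prev, xi):
    x = cp.Variable(2)
    constraints = [
        2*x_prev[0]*x[0] + 2*x[1] - x_prev[0]**2 + 2 - 4*xi == 0,  # linearized equality
        cp.norm(cp.hstack([x[0], 1.0]), 2) <= x[1],                # cone kept exactly
        x >= 0,
    ]
    cp.Problem(cp.Minimize(-x[0]), constraints).solve()
    return x.value

x_exact = lambda xi: np.array([2*np.sqrt(xi - np.sqrt(xi)), 2*np.sqrt(xi) - 1])
x = x_exact(1.2)                         # start at the true solution of P(xi_0)
for k in range(10):
    xi = 1.2 + 0.25*k
    x = scp_step(x, xi)
    print(k, x, np.linalg.norm(x - x_exact(xi)))   # tracking error as in Fig. 1.2
```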
1.5. Notation. Throughout this paper, we use the notation $\nabla f$ for the gradient of a scalar function $f$, $g'$ for the Jacobian matrix of a vector-valued function $g$, and $S^n$ (resp., $S^n_+$ and $S^n_{++}$) for the set of $n \times n$ real symmetric (resp., symmetric positive semidefinite and symmetric positive definite) matrices. The notation $\|\cdot\|$ stands for the Euclidean norm. The ball $B(x, r)$ of radius $r$ centered at $x$ is defined as $B(x, r) := \{y \in \mathbb{R}^n \mid \|y - x\| < r\}$, and $\bar B(x, r)$ is its closure.

The rest of this paper is organized as follows. Section 2 presents a generic framework for the APCSCP algorithm. Section 3 proves the local contraction estimate for APCSCP and the stability of the approximation error. Section 4 considers an adjoint-based SCP algorithm for solving nonlinear programming problems as a special case. The last section presents computational results for an application of the proposed algorithms to NMPC of a hydro power plant.

2. An APCSCP algorithm. In this section, we present a generic algorithmic framework for solving the parametric optimization problem $\mathrm{P}(\xi)$. Traditionally, at each sample $\xi_k$ of the parameter $\xi$, a nonlinear program $\mathrm{P}(\xi_k)$ is solved to get a completely converged solution $\bar z(\xi_k)$. Exploiting the real-time iteration idea [14, 17], in our algorithm below only one convex subproblem is solved to get an approximate solution $z^k$ to $\bar z(\xi_k)$ at $\xi_k$.

Suppose that $z^k := (x^k, y^k) \in \Omega \times \mathbb{R}^m$ is a given KKT point of $\mathrm{P}(\xi_k)$ (more details can be found in the next section), $A_k$ is a given $m \times n$ matrix, and $H_k \in S^n_+$. We consider the following parametric optimization subproblem:

\[
\mathrm{P}(z^k, A_k, H_k; \xi)\qquad
\begin{cases}
\displaystyle\min_{x\in\mathbb{R}^n} & f(x) + (s^k)^T(x - x^k) + \tfrac{1}{2}(x - x^k)^T H_k (x - x^k)\\
\ \text{s.t.} & A_k(x - x^k) + g(x^k) + M\xi = 0,\\
& x \in \Omega,
\end{cases}
\]

where $s^k := s(z^k, A_k) = (g'(x^k) - A_k)^T y^k$. The matrix $A_k$ is an approximation to $g'(x^k)$ at $x^k$, and $H_k$ is a regularization term or an approximation to $\nabla^2_x L(\bar z^k)$, where $L$ is the Lagrange function of $\mathrm{P}(\xi)$ defined in section 3. The vector $s^k$ can be considered as a correction term for the inconsistency between $A_k$ and $g'(x^k)$, and $y^k$ is referred to as the Lagrange multiplier. Since $f$ and $\Omega$ are convex and $H_k$ is symmetric positive semidefinite, the subproblem $\mathrm{P}(z^k, A_k, H_k; \xi)$ is convex. Here, $z^k$, $A_k$, and $H_k$ are considered as parameters.

Remark. Note that computing the term $g'(x^k)^T y^k$ of the correction vector $s^k$ does not require the whole Jacobian matrix $g'(x^k)$, which is usually time-consuming to evaluate. When implementing the algorithm, the directional derivative $\eta^k := g'(x^k)^T y^k$ can be evaluated by the reverse mode (or adjoint mode) of automatic differentiation. By using this technique, an adjoint directional derivative of the form $g'(x^k)^T y^k$ can be evaluated without computing the whole Jacobian matrix $g'(x^k)$ of the vector function $g$. More details about automatic differentiation can be found in the monograph [26] or at http://www.autodiff.org.

In the NMPC framework, the constraint function $g$ is usually obtained from a dynamic system of the form

\[
\dot\eta(t) = G(\eta(t), x, t), \quad t_0 \le t \le t_f, \quad \eta(t_0) = \eta_0(x), \tag{2.1}
\]

by applying a direct transcription, where $\eta$ is referred to as a state vector and $x$ as a parameter vector. The adjoint directional derivative $g'(x)^T y$ is nothing more than the gradient $\frac{\partial V}{\partial x}$ of the function $V(x) := g(x)^T y$. In the dynamic system context, this function $V$ is a special case of the general functional $V(x) := e(\eta(t_f)) + \int_{t_0}^{t_f} v(\eta, x, t)\,dt$. By simultaneously integrating the dynamic system and its adjoint sensitivity system $\dot\lambda = -G_\eta^T \lambda - v_\eta^T$, $\lambda(t_f) = \nabla_\eta e(\eta(t_f))$, we can evaluate the gradient of $V$ with respect to $x$ as

\[
\frac{dV}{dx} := \lambda(t_0)^T \frac{\partial \eta_0}{\partial x} + \int_{t_0}^{t_f} \big(v_x + \lambda^T G_x\big)\,dt,
\]

where $\lambda(t_0)$ is the solution of the adjoint system at $t_0$. Note that the cost of integrating the adjoint system is of the same order as that of integrating the forward dynamics and, crucially, is independent of the dimension of $x$. Adjoint differentiation of dynamic systems is performed, e.g., in the open-source software package Sundials [46]. For more details on adjoint sensitivity analysis of dynamic systems, see [10, 46].
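The remark above can be made concrete with any reverse-mode automatic differentiation tool. The sketch below uses jax.vjp, which is our illustrative choice (the experiments in section 5 use CasADi for the same purpose), to evaluate the adjoint directional derivative $g'(x)^T y$ with a single backward sweep and without ever forming the $m \times n$ Jacobian.

```python
# A sketch of evaluating the adjoint directional derivative g'(x)^T y by
# reverse-mode automatic differentiation. The toy constraint function g is ours.
import jax
import jax.numpy as jnp

def g(x):
    # toy constraint function g: R^3 -> R^2 (illustration only)
    return jnp.array([x[0] * x[1] - 1.0, x[1] ** 2 + x[2]])

x = jnp.array([1.0, 2.0, 3.0])
y = jnp.array([0.5, -1.0])

_, vjp_fun = jax.vjp(g, x)     # one forward sweep records the computation
gTy, = vjp_fun(y)              # one reverse sweep yields g'(x)^T y
print(gTy)

# Equivalent, but roughly n times more expensive, dense check:
print(jax.jacobian(g)(x).T @ y)
```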
The APCSCP algorithmic framework is described as follows.

Algorithm 1 (APCSCP).
Initialization. For a given parameter $\xi_0 \in \mathcal{P}$, solve $\mathrm{P}(\xi_0)$ approximately (offline) to get an approximate KKT point $z^0 := (x^0, y^0)$. Compute $g(x^0)$, find a matrix $A_0$ which approximates $g'(x^0)$, and choose $H_0 \in S^n_+$. Then compute the vector $s^0 := (g'(x^0) - A_0)^T y^0$. Set $k := 0$.
Iteration $k$ ($k = 0, 1, \dots$). For a given $(z^k, A_k, H_k)$, perform the three steps below:
  Step 1. Get a new parameter value $\xi_{k+1} \in \mathcal{P}$.
  Step 2. Solve the convex subproblem $\mathrm{P}(z^k, A_k, H_k; \xi_{k+1})$ to obtain a solution $x^{k+1}$ and the corresponding multiplier $y^{k+1}$.
  Step 3. Evaluate $g(x^{k+1})$ and update (or recompute) the matrices $A_{k+1}$ and $H_{k+1} \in S^n_+$. Compute the vector $s^{k+1} := g'(x^{k+1})^T y^{k+1} - A_{k+1}^T y^{k+1}$. Set $k := k + 1$ and go back to Step 1.
End.

The core step of Algorithm 1 is the solution of the convex subproblem $\mathrm{P}(z^k, A_k, H_k; \xi)$ at each iteration. In Algorithm 1 we do not specify the method used to solve $\mathrm{P}(z^k, A_k, H_k; \xi)$. In practice, to reduce the computational time, we can either implement an optimization method which exploits the structure of the problem, e.g., block structure or separable structure [22, 51, 55], or rely on one of the several efficient methods and software tools that are available for convex optimization [9, 38, 39, 48, 53]. In this paper, we are most interested in the case where one evaluation of $g$ is very expensive.

A possible simple choice of $H_k$ is $H_k = 0$ for all $k \ge 0$. The initial point $z^0$ is obtained by solving $\mathrm{P}(\xi_0)$ offline. However, as we will show later (Corollary 3.6), if we choose $z^0$ close to the set of KKT points $Z^*(\xi_0)$ of $\mathrm{P}(\xi_0)$ (not necessarily an exact solution), then the new KKT point $z^1$ of $\mathrm{P}(z^0, A_0, H_0; \xi_1)$ is still close to $Z^*(\xi_1)$, provided that $\|\xi_1 - \xi_0\|$ is sufficiently small. Hence, in practice, we only need to solve problem $\mathrm{P}(\xi_0)$ approximately to get a starting point $z^0$. In the NMPC framework, the parameter $\xi$ usually coincides with the initial state of the dynamic system at the current time of the moving horizon. If $A_k \equiv g'(x^k)$, the exact Jacobian matrix of $g$ at $x^k$, and $H_k \equiv 0$, then this algorithm collapses to the real-time SCP method considered in [52].
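For orientation, the following Python skeleton mirrors the structure of Algorithm 1. The subproblem solver, the constraint function $g$, and its adjoint oracle are user-supplied, and $A_k$ is held fixed, which is one of the cheap choices discussed in section 3.5. It is a structural sketch under those assumptions, not the implementation used in section 5.

```python
# A structural sketch of Algorithm 1 (APCSCP); all names are ours. The caller
# supplies the model function g, an adjoint oracle gT_y(x, y) = g'(x)^T y, a
# fixed approximate Jacobian A0 with regularization H0, and a routine
# solve_subproblem implementing the convex subproblem P(z^k, A_k, H_k; xi).
import numpy as np

def apcscp(solve_subproblem, g, gT_y, A0, H0, x0, y0, parameters):
    A, H, x, y = A0, H0, x0, y0
    s = gT_y(x, y) - A.T @ y           # s^0 = (g'(x^0) - A_0)^T y^0
    trajectory = []
    for xi in parameters:              # Step 1: receive the new parameter
        # Step 2: solve one convex subproblem per parameter (no inner loop)
        x, y = solve_subproblem(x, g(x), s, A, H, xi)
        # Step 3: refresh the adjoint-based correction; A and H stay fixed here
        s = gT_y(x, y) - A.T @ y
        trajectory.append((xi, x, y))
    return trajectory
```

Keeping $A_k$ fixed shifts all Jacobian cost offline; the correction $s^k$ restores consistency of the Lagrange gradient, which is precisely what the contraction analysis of section 3 exploits.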
3. Contraction estimate. In this section, we show that under certain assumptions the sequence $\{z^k\}_{k\ge 0}$ generated by Algorithm 1 remains close to the sequence $\{\bar z^k\}_{k\ge 0}$ of the true KKT points of the problems $\mathrm{P}(\xi_k)$. Without loss of generality, we assume that the objective function $f$ is linear, i.e., $f(x) = c^T x$, where $c \in \mathbb{R}^n$ is given. Indeed, since $f$ is convex, by using a slack variable $s$ we can reformulate $\mathrm{P}(\xi)$ as the nonlinear program $\min_{(x,s)} \{\, s \mid g(x) + M\xi = 0,\ x \in \Omega,\ f(x) \le s \,\}$.

3.1. KKT condition as a generalized equation. Let us first define the Lagrange function of problem $\mathrm{P}(\xi)$ as $L(x, y; \xi) := c^T x + (g(x) + M\xi)^T y$, where $y$ is the Lagrange multiplier associated with the constraint $g(x) + M\xi = 0$. Since the constraint $x \in \Omega$ is convex and implicitly represented, we consider it separately. The KKT condition for $\mathrm{P}(\xi)$ is now written as

\[
0 \in c + g'(x)^T y + N_\Omega(x), \qquad 0 = g(x) + M\xi, \tag{3.1}
\]

where $N_\Omega(x)$ is the normal cone of $\Omega$ at $x$, defined as

\[
N_\Omega(x) :=
\begin{cases}
\{ u \in \mathbb{R}^n \mid u^T(x - v) \ge 0 \ \ \forall v \in \Omega \} & \text{if } x \in \Omega,\\
\emptyset & \text{otherwise.}
\end{cases} \tag{3.2}
\]

Note that the first line of (3.1) implicitly includes the constraint $x \in \Omega$. A pair $(\bar x(\xi), \bar y(\xi))$ satisfying (3.1) is called a KKT point of $\mathrm{P}(\xi)$, and $\bar x(\xi)$ is called a stationary point of $\mathrm{P}(\xi)$ with corresponding multiplier $\bar y(\xi)$. Let us denote by $Z^*(\xi)$ and $X^*(\xi)$ the set of KKT points and the set of stationary points of $\mathrm{P}(\xi)$, respectively. In what follows, we use the letter $z$ for the pair $(x, y)$, i.e., $z := (x^T, y^T)^T$.

Throughout this paper, we require the following assumptions, which are standard in optimization.

Assumption 1. The function $g$ is twice differentiable on its domain.

Assumption 2. For a given $\xi_0 \in \mathcal{P}$, problem $\mathrm{P}(\xi_0)$ has at least one KKT point $\bar z^0$, i.e., $Z^*(\xi_0) \neq \emptyset$.

Let us define

\[
F(z) := \begin{pmatrix} c + g'(x)^T y \\ g(x) \end{pmatrix} \quad\text{and}\quad K := \Omega \times \mathbb{R}^m. \tag{3.3}
\]

Then the KKT condition (3.1) can be expressed in terms of a parametric generalized equation as

\[
0 \in F(z) + C\xi + N_K(z), \quad\text{where } C := \begin{pmatrix} 0 \\ M \end{pmatrix}. \tag{3.4}
\]

Generalized equations are an essential tool for studying many problems in nonlinear analysis, perturbation analysis, variational calculations, and optimization [8, 34, 43].

Suppose that for some $\xi_k \in \mathcal{P}$ the set of KKT points $Z^*(\xi_k)$ of $\mathrm{P}(\xi_k)$ is nonempty. For any fixed $\bar z^k \in Z^*(\xi_k)$, we define the set-valued mapping

\[
L(z; \bar z^k, \xi_k) := F(\bar z^k) + F'(\bar z^k)(z - \bar z^k) + C\xi_k + N_K(z). \tag{3.5}
\]

We also define the inverse mapping $L^{-1} : \mathbb{R}^{n+m} \rightrightarrows \mathbb{R}^{n+m}$ of $L(\cdot; \bar z^k, \xi_k)$ as

\[
L^{-1}(\delta; \bar z^k, \xi_k) := \big\{ z \in \mathbb{R}^{n+m} : \delta \in L(z; \bar z^k, \xi_k) \big\}. \tag{3.6}
\]

Now, we consider the KKT condition of the subproblem $\mathrm{P}(z^k, A_k, H_k; \xi)$. For given neighborhoods $B(\bar z^k, r_z)$ of $\bar z^k$ and $B(\xi_k, r_\xi)$ of $\xi_k$, a point $z^k \in B(\bar z^k, r_z)$, a parameter $\xi_{k+1} \in B(\xi_k, r_\xi)$, and given matrices $A_k$ and $H_k \in S^n_+$, let us consider the convex subproblem $\mathrm{P}(z^k, A_k, H_k; \xi_{k+1})$ with respect to the parameter $(z^k, A_k, H_k, \xi_{k+1})$. The KKT condition of this problem is expressed as

\[
0 \in c + s^k + H_k(x - x^k) + A_k^T y + N_\Omega(x), \qquad 0 = g(x^k) + A_k(x - x^k) + M\xi_{k+1}, \tag{3.7}
\]

where $N_\Omega(x)$ is defined by (3.2). Suppose that the Slater constraint qualification holds for the subproblem $\mathrm{P}(z^k, A_k, H_k; \xi_{k+1})$, i.e.,

\[
\mathrm{ri}(\Omega) \cap \big\{ x \in \mathbb{R}^n \mid g(x^k) + A_k(x - x^k) + M\xi_{k+1} = 0 \big\} \neq \emptyset,
\]

where $\mathrm{ri}(\Omega)$ is the relative interior of $\Omega$. Then, by convexity of $\Omega$, a point $z^{k+1} := (x^{k+1}, y^{k+1})$ is a KKT point of $\mathrm{P}(z^k, A_k, H_k; \xi_{k+1})$ if and only if $x^{k+1}$ is a solution of $\mathrm{P}(z^k, A_k, H_k; \xi_{k+1})$ associated with the multiplier $y^{k+1}$.

Since $g$ is twice differentiable by Assumption 1 and $f$ is linear, for a given $z = (x, y)$ the Hessian matrix of the Lagrange function $L$ is

\[
\nabla_x^2 L(z) = \sum_{i=1}^m y_i \nabla^2 g_i(x), \tag{3.8}
\]

where $\nabla^2 g_i(\cdot)$ is the Hessian matrix of $g_i$ ($i = 1, \dots, m$). Let us define the matrix

\[
\tilde F_k := \begin{pmatrix} H_k & A_k^T \\ A_k & 0 \end{pmatrix}, \tag{3.9}
\]

where $H_k \in S^n_+$. The KKT condition (3.7) can then be written as the parametric linear generalized equation

\[
0 \in F(z^k) + \tilde F_k(z - z^k) + C\xi_{k+1} + N_K(z), \tag{3.10}
\]

where $z^k$, $\tilde F_k$, and $\xi_{k+1}$ are considered as parameters. Note that if $A_k = g'(x^k)$ and $H_k = \nabla^2_x L(z^k)$, then (3.10) is the linearization of the nonlinear generalized equation (3.4) at $(z^k, \xi_{k+1})$ with respect to $z$.

Remark. Note that (3.10) is a generalization of (1.3), where the approximate Jacobian $\tilde F_k$ is used instead of the exact one. Therefore, (3.10) can be viewed as one iteration of the inexact predictor-corrector path-following method for solving (3.4).
3.2. The strong regularity concept. We recall the following definition of the strong regularity concept. This definition can be considered as strong regularity of the generalized equation (3.4) in the context of nonlinear optimization; see [42].

Definition 3.1. Let $\xi_k \in \mathcal{P}$ be such that the set of KKT points $Z^*(\xi_k)$ of $\mathrm{P}(\xi_k)$ is nonempty, and let $\bar z^k \in Z^*(\xi_k)$ be a given KKT point of $\mathrm{P}(\xi_k)$. Problem $\mathrm{P}(\xi_k)$ is said to be strongly regular at $\bar z^k$ if there exist neighborhoods $B(0, \bar r_\delta)$ of the origin and $B(\bar z^k, \bar r_z)$ of $\bar z^k$ such that the mapping

\[
z_k^*(\delta) := B(\bar z^k, \bar r_z) \cap L^{-1}(\delta; \bar z^k, \xi_k)
\]

is single-valued and Lipschitz continuous in $B(0, \bar r_\delta)$ with a Lipschitz constant $0 < \gamma < +\infty$, i.e.,

\[
\|z_k^*(\delta) - z_k^*(\delta')\| \le \gamma \|\delta - \delta'\| \quad \forall \delta, \delta' \in B(0, \bar r_\delta). \tag{3.11}
\]

Note that the constants $\gamma$, $\bar r_z$, and $\bar r_\delta$ in Definition 3.1 are global and do not depend on the index $k$.

From the definition of $L^{-1}$, for each $\delta$ where strong regularity holds there exists a unique $z_k^*(\delta)$ such that $\delta \in F(\bar z^k) + F'(\bar z^k)(z_k^*(\delta) - \bar z^k) + C\xi_k + N_K(z_k^*(\delta))$. Therefore,

\[
z_k^*(\delta) \in \big(F'(\bar z^k) + N_K\big)^{-1}\big[F'(\bar z^k)\bar z^k - F(\bar z^k) - C\xi_k + \delta\big] = \bar J_k\big[F'(\bar z^k)\bar z^k - F(\bar z^k) - C\xi_k + \delta\big],
\]

where $\bar J_k := (F'(\bar z^k) + N_K)^{-1}$. The strong regularity of $\mathrm{P}(\xi_k)$ at $\bar z^k$ is thus equivalent to the single-valuedness and Lipschitz continuity of $\bar J_k$ around $v^k := F'(\bar z^k)\bar z^k - F(\bar z^k) - C\xi_k$. The strong regularity concept is widely used in variational analysis, perturbation analysis, and optimization [8, 34, 41, 43]. From the optimization point of view, strong regularity implies the SSOSC if the linear independence constraint qualification (LICQ) holds [42]. If the convex set $\Omega$ is polyhedral and the LICQ holds, then strong regularity is equivalent to the SSOSC [20].

In order to interpret the strong regularity of $\mathrm{P}(\xi_k)$ at $\bar z^k \in Z^*(\xi_k)$ in terms of perturbed optimization, we consider the following optimization problem:

\[
\begin{cases}
\displaystyle\min_{x\in\mathbb{R}^n} & (c - \delta_c)^T x + \tfrac{1}{2}(x - \bar x^k)^T \nabla_x^2 L(\bar x^k, \bar y^k)(x - \bar x^k)\\
\ \text{s.t.} & g(\bar x^k) + g'(\bar x^k)(x - \bar x^k) + M\xi_k = \delta_g,\\
& x \in \Omega.
\end{cases} \tag{3.12}
\]

Here, $\delta = (\delta_c, \delta_g) \in B(0, \bar r_\delta)$ is a perturbation. Problem $\mathrm{P}(\xi_k)$ is strongly regular at $\bar z^k$ if and only if (3.12) has a unique KKT point $z_k^*(\delta)$ in $B(\bar z^k, \bar r_z)$ and $z_k^*(\cdot)$ is Lipschitz continuous in $B(0, \bar r_\delta)$ with Lipschitz constant $\gamma$.

Example 3.2. Let us recall Example 1.1 from section 1. The optimal multipliers associated with the two constraints $x_1^2 + 2x_2 + 2 - 4\xi = 0$ and $x_1^2 - x_2^2 + 1 \le 0$ are

\[
y_1^* = (2\sqrt{\xi} - 1)\big[8\sqrt{\xi^2 - \xi\sqrt{\xi}}\big]^{-1} > 0 \quad\text{and}\quad y_2^* = \big[8\sqrt{\xi^2 - \xi\sqrt{\xi}}\big]^{-1} > 0,
\]

respectively. Since the inequality constraint is active while $x \ge 0$ is inactive, we can easily compute the critical cone as $\mathcal{C}(x^*_\xi, y^*) = \{(d_1, 0)^T \in \mathbb{R}^2 \mid x^*_{\xi 1} d_1 = 0\}$. The Hessian matrix

\[
\nabla_x^2 L(x^*_\xi, y^*) = \begin{pmatrix} 2(y_1^* + y_2^*) & 0 \\ 0 & -2y_2^* \end{pmatrix}
\]

of the Lagrange function $L$ is positive definite on $\mathcal{C}(x^*_\xi, y^*)$. Hence, the second order sufficient optimality condition for (1.4) is satisfied. Moreover, $y_2^* > 0$, which says that the strict complementarity condition holds. Therefore, problem (1.4) satisfies the strong second order sufficient condition. On the other hand, it is easy to check that the LICQ holds for (1.4) at $x^*_\xi$. By applying [42, Theorem 4.1], we can conclude that (1.4) is strongly regular at $(x^*_\xi, y^*)$.
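As a numerical sanity check of the multiplier formulas above (the check is ours, not part of the paper), the following snippet verifies stationarity of the Lagrangian of (1.4) at $(x^*_\xi, y^*)$ for a sample value of $\xi$.

```python
# A quick numerical check (ours) of the multiplier formulas in Example 3.2:
# at x*_xi the Lagrange gradient of problem (1.4) should vanish.
import numpy as np

xi = 1.5
r = np.sqrt(xi)
x = np.array([2 * np.sqrt(xi - r), 2 * r - 1])    # stationary point x*_xi
y2 = 1.0 / (8 * np.sqrt(xi**2 - xi * r))          # cone-constraint multiplier
y1 = (2 * r - 1) * y2                             # equality-constraint multiplier

grad_f  = np.array([-1.0, 0.0])                   # gradient of the objective -x1
grad_g1 = np.array([2 * x[0], 2.0])               # gradient of x1^2 + 2*x2 + 2 - 4*xi
grad_g2 = np.array([2 * x[0], -2 * x[1]])         # gradient of x1^2 - x2^2 + 1
print(grad_f + y1 * grad_g1 + y2 * grad_g2)       # ~ [0, 0]: stationarity holds
```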
The following lemma shows the nonemptiness of $Z^*(\xi)$ in a neighborhood of $\xi_k$.

Lemma 3.3. Suppose that Assumption 1 is satisfied and that $Z^*(\xi_k)$ is nonempty for a given $\xi_k \in \mathcal{P}$. Suppose further that problem $\mathrm{P}(\xi_k)$ is strongly regular at $\bar z^k$ for a given $\bar z^k \in Z^*(\xi_k)$. Then there exist neighborhoods $B(\xi_k, r_\xi)$ of $\xi_k$ and $B(\bar z^k, r_z)$ of $\bar z^k$ such that $Z^*(\xi_{k+1})$ is nonempty for all $\xi_{k+1} \in B(\xi_k, r_\xi)$ and $Z^*(\xi_{k+1}) \cap B(\bar z^k, r_z)$ contains only one point $\bar z^{k+1}$. Moreover, there exists a constant $0 \le \bar\sigma < +\infty$ such that

\[
\|\bar z^{k+1} - \bar z^k\| \le \bar\sigma \|\xi_{k+1} - \xi_k\|. \tag{3.13}
\]

Proof. Since the KKT condition of $\mathrm{P}(\xi_k)$ is equivalent to the generalized equation (3.4) with $\xi = \xi_k$, by applying [42, Theorem 2.1] we conclude that there exist neighborhoods $B(\xi_k, r_\xi)$ of $\xi_k$ and $B(\bar z^k, r_z)$ of $\bar z^k$ such that $Z^*(\xi_{k+1})$ is nonempty for all $\xi_{k+1} \in B(\xi_k, r_\xi)$ and $Z^*(\xi_{k+1}) \cap B(\bar z^k, r_z)$ contains only one point $\bar z^{k+1}$. On the other hand, since $\|F(\bar z^k) + C\xi_k - F(\bar z^k) - C\xi_{k+1}\| = \|M(\xi_k - \xi_{k+1})\| \le \|M\| \|\xi_{k+1} - \xi_k\|$, by using the formula [42, (2.4)] we obtain the estimate (3.13).

By using again the strong regularity assumption, it follows from (3.22) and (3.23) that

\[
\|z - z'\| \le \gamma \|\delta - \delta'\| \le \gamma \|v - v'\| + \gamma \big\|[F'(\bar z^k) - \tilde F_k](z - z')\big\| \overset{(3.14)}{\le} \gamma \|v - v'\| + \gamma\kappa \|z - z'\|.
\]

Since $\gamma\kappa < 1$, rearranging the last inequality we get

\[
\|z - z'\| \le \frac{\gamma}{1 - \gamma\kappa}\, \|v - v'\|,
\]

which shows that $J_k$ satisfies (3.17) with the constant $\beta := \frac{\gamma}{1 - \gamma\kappa} > 0$.

Let us recall that if $z^{k+1}$ is a KKT point of the convex subproblem $\mathrm{P}(z^k, A_k, H_k; \xi_{k+1})$, then $0 \in \tilde F_k(z^{k+1} - z^k) + F(z^k) + C\xi_{k+1} + N_K(z^{k+1})$. According to Lemma 3.4, if $z^k \in B(\bar z^k, r_z)$, then problem $\mathrm{P}(z^k, A_k, H_k; \xi)$ is uniquely solvable, and we can write its KKT condition equivalently as

\[
z^{k+1} = J_k\big[\tilde F_k z^k - F(z^k) - C\xi_{k+1}\big]. \tag{3.24}
\]

Since $\bar z^{k+1}$ is the solution of (3.4) at $\xi_{k+1}$, we have $0 = F(\bar z^{k+1}) + C\xi_{k+1} + \bar u^{k+1}$, where $\bar u^{k+1} \in N_K(\bar z^{k+1})$. Moreover, since $\bar z^{k+1} = J_k(\tilde F_k \bar z^{k+1} + \bar u^{k+1})$, we can write

\[
\bar z^{k+1} = J_k\big[\tilde F_k \bar z^{k+1} - F(\bar z^{k+1}) - C\xi_{k+1}\big]. \tag{3.25}
\]

The main result of this section is stated in the following theorem.

Theorem 3.5. Suppose that Assumptions 1 and 2 are satisfied for some $\xi_0 \in \mathcal{P}$. Then, for $k \ge 0$ and $\bar z^k \in Z^*(\xi_k)$, if $\mathrm{P}(\xi_k)$ is strongly regular at $\bar z^k$, then there exist neighborhoods $B(\bar z^k, r_z)$ and $B(\xi_k, r_\xi)$ such that the following hold:
(a) The set of KKT points $Z^*(\xi_{k+1})$ of $\mathrm{P}(\xi_{k+1})$ is nonempty for any $\xi_{k+1} \in B(\xi_k, r_\xi)$.
(b) If, in addition, Assumption 3(a) is satisfied, then subproblem $\mathrm{P}(z^k, A_k, H_k; \xi_{k+1})$ is uniquely solvable in the neighborhood $B(\bar z^k, r_z)$.
(c) Moreover, if, in addition, Assumption 3(b) is satisfied, then the sequence $\{z^k\}_{k\ge 0}$ generated by Algorithm 1, where $\xi_{k+1} \in B(\xi_k, r_\xi)$, guarantees

\[
\|z^{k+1} - \bar z^{k+1}\| \le \big(\alpha + c_1 \|z^k - \bar z^k\|\big)\|z^k - \bar z^k\| + \big(c_2 + c_3 \|\xi_{k+1} - \xi_k\|\big)\|\xi_{k+1} - \xi_k\|, \tag{3.26}
\]

where $0 \le \alpha < 1$, $0 \le c_i < +\infty$ ($i = 1, 2, 3$), and $c_2 > 0$ are given constants and $\bar z^{k+1} \in Z^*(\xi_{k+1})$.

Proof. We prove the theorem by induction. For $k = 0$, $Z^*(\xi_0)$ is nonempty by Assumption 2. Now, we assume that $Z^*(\xi_k)$ is nonempty for some $k \ge 0$. We will prove that $Z^*(\xi_{k+1})$ is nonempty for $\xi_{k+1}$ in a neighborhood $B(\xi_k, r_\xi)$ of $\xi_k$. Indeed, since $Z^*(\xi_k)$ is nonempty for some $\xi_k \in \mathcal{P}$, we take an arbitrary $\bar z^k \in Z^*(\xi_k)$ such that $\mathrm{P}(\xi_k)$ is strongly regular at $\bar z^k$. Now, by applying Lemma 3.3 to problem $\mathrm{P}(\xi_k)$, we conclude that there exist neighborhoods $B(\bar z^k, r_z)$ of $\bar z^k$ and $B(\xi_k, r_\xi)$ of $\xi_k$ such that $Z^*(\xi_{k+1})$ is nonempty for any $\xi_{k+1} \in B(\xi_k, r_\xi)$.
Next, if in addition Assumption 3(a) holds, then the conclusions of Lemma 3.4 hold. By induction, we conclude that the convex subproblem $\mathrm{P}(z^k, A_k, H_k; \xi_{k+1})$ is uniquely solvable in $B(\bar z^k, r_z)$ for any $\xi_{k+1} \in B(\xi_k, r_\xi)$.

Finally, we prove inequality (3.26). From (3.24), (3.25), the mean-value theorem, and Assumption 3(b), we have

\begin{align*}
\|z^{k+1} - \bar z^{k+1}\|
&\overset{(3.24),(3.25)}{=} \big\| J_k\big[\tilde F_k z^k - F(z^k) - C\xi_{k+1}\big] - J_k\big[\tilde F_k \bar z^{k+1} - F(\bar z^{k+1}) - C\xi_{k+1}\big] \big\| \\
&\overset{(3.17)}{\le} \beta \big\| \tilde F_k (z^k - \bar z^{k+1}) - F(z^k) + F(\bar z^{k+1}) \big\| \\
&= \beta \big\| \tilde F_k (z^k - \bar z^k) - F(z^k) + F(\bar z^k) + F(\bar z^{k+1}) - F(\bar z^k) - \tilde F_k(\bar z^{k+1} - \bar z^k) \big\| \\
&\le \beta \Big\| [\tilde F_k - F'(\bar z^k)](z^k - \bar z^k) - \int_0^1 \big[F'(\bar z^k + t(z^k - \bar z^k)) - F'(\bar z^k)\big](z^k - \bar z^k)\,dt \Big\| \tag{3.27}\\
&\quad\ + \beta \Big\| [\tilde F_k - F'(\bar z^k)](\bar z^{k+1} - \bar z^k) - \int_0^1 \big[F'(\bar z^k + t(\bar z^{k+1} - \bar z^k)) - F'(\bar z^k)\big](\bar z^{k+1} - \bar z^k)\,dt \Big\| \\
&\overset{(3.14),(3.15)}{\le} \beta\Big(\kappa + \frac{\omega}{2}\|z^k - \bar z^k\|\Big)\|z^k - \bar z^k\| + \beta\Big(\kappa + \frac{\omega}{2}\|\bar z^{k+1} - \bar z^k\|\Big)\|\bar z^{k+1} - \bar z^k\|.
\end{align*}

By substituting (3.13) into (3.27) we obtain

\[
\|z^{k+1} - \bar z^{k+1}\| \le \beta\Big(\kappa + \frac{\omega}{2}\|z^k - \bar z^k\|\Big)\|z^k - \bar z^k\| + \beta\Big(\kappa\bar\sigma + \frac{\omega\bar\sigma^2}{2}\|\xi_{k+1} - \xi_k\|\Big)\|\xi_{k+1} - \xi_k\|.
\]

If we define $\alpha := \beta\kappa = \frac{\gamma\kappa}{1-\gamma\kappa} < 1$ (due to Assumption 3(a)), $c_1 := \frac{\gamma\omega}{2(1-\gamma\kappa)} \ge 0$, $c_2 := \frac{\gamma\kappa\bar\sigma}{1-\gamma\kappa} > 0$, and $c_3 := \frac{\gamma\omega\bar\sigma^2}{2(1-\gamma\kappa)} \ge 0$ as four given constants, then the last inequality is indeed (3.26).

The following corollary shows the stability of the approximate sequence $\{z^k\}_{k\ge 0}$ generated by Algorithm 1.

Corollary 3.6. Under the assumptions of Theorem 3.5, there exists a positive number $0 < r_z < \bar r_z := (1 - \alpha)c_1^{-1}$ such that if the initial point $z^0$ in Algorithm 1 is chosen such that $\|z^0 - \bar z^0\| \le r_z$, where $\bar z^0 \in Z^*(\xi_0)$, then for any $k \ge 0$ we have

\[
\|z^{k+1} - \bar z^{k+1}\| \le r_z, \tag{3.28}
\]

provided that $\|\xi_{k+1} - \xi_k\| \le r_\xi$, where $\bar z^{k+1} \in Z^*(\xi_{k+1})$ and $0 < r_\xi \le \bar r_\xi$ with

\[
\bar r_\xi :=
\begin{cases}
(2c_3)^{-1}\Big[\sqrt{c_2^2 + 4c_3 r_z(1 - \alpha - c_1 r_z)} - c_2\Big] & \text{if } c_3 > 0,\\
c_2^{-1} r_z (1 - \alpha - c_1 r_z) & \text{if } c_3 = 0.
\end{cases}
\]

Consequently, the error sequence $\{e_k\}_{k\ge 0}$, where $e_k := \|z^k - \bar z^k\|$, between the exact KKT points $\bar z^k$ and the approximate KKT points $z^k$ of $\mathrm{P}(\xi_k)$ is bounded.

Proof. Since $0 \le \alpha < 1$, we have $\bar r_z := (1 - \alpha)c_1^{-1} > 0$. Let us choose $r_z$ such that $0 < r_z < \bar r_z$. If $z^0 \in B(\bar z^0, r_z)$, i.e., $\|z^0 - \bar z^0\| \le r_z$, then it follows from (3.26) that

\[
\|z^1 - \bar z^1\| \le (\alpha + c_1 r_z) r_z + (c_2 + c_3 \|\xi_1 - \xi_0\|)\|\xi_1 - \xi_0\|.
\]

In order to ensure $\|z^1 - \bar z^1\| \le r_z$, we need $(c_2 + c_3\|\xi_1 - \xi_0\|)\|\xi_1 - \xi_0\| \le \rho := (1 - \alpha - c_1 r_z) r_z$. Since $0 < r_z < \bar r_z$, $\rho > 0$. The last condition leads to $\|\xi_1 - \xi_0\| \le (2c_3)^{-1}\big(\sqrt{c_2^2 + 4c_3\rho} - c_2\big)$ if $c_3 > 0$ and $\|\xi_1 - \xi_0\| \le c_2^{-1} r_z (1 - \alpha - c_1 r_z)$ if $c_3 = 0$. By induction, we conclude that inequality (3.28) holds for all $k \ge 0$.

The conclusion of Corollary 3.6 is illustrated in Figure 3.1: the approximate sequence $\{z^k\}_{k\ge 0}$ computed by Algorithm 1 remains close to the sequence of true KKT points $\{\bar z^k\}_{k\ge 0}$ if the starting point $z^0$ is sufficiently close to $\bar z^0$.

Fig. 3.1. The approximate sequence $\{z^k\}_{k\ge 0}$ along the trajectory $\bar z(\cdot)$ of the KKT points.

Let us assume that the constant $\omega > 0$. Then $c_3 > 0$. If we choose $r_z := \frac{1 - 2\gamma\kappa}{\gamma\omega}$, then the quantity $\bar r_\xi$ in Corollary 3.6 can be tightened to $\bar r_\xi = \frac{(1 - 2\gamma\kappa)^2}{\gamma\omega\bar\sigma}$.
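The stability mechanism behind Corollary 3.6 can also be observed by iterating the scalar error recursion (3.26) directly. The sketch below uses arbitrary illustrative constants satisfying the hypotheses; it is not derived from any concrete problem instance.

```python
# A sketch iterating the tracking-error recursion (3.26),
#   e_{k+1} <= (alpha + c1*e_k)*e_k + (c2 + c3*d)*d,   d = ||xi_{k+1} - xi_k||,
# with illustrative constants: e_k stays bounded once e_0 and d are small enough.
alpha, c1, c2, c3 = 0.5, 1.0, 1.0, 0.5
rz = 0.9 * (1 - alpha) / c1                       # radius below bar r_z of Corollary 3.6
rxi = ((c2**2 + 4*c3*rz*(1 - alpha - c1*rz))**0.5 - c2) / (2*c3)   # bar r_xi, case c3 > 0
e, d = rz, rxi                                    # worst admissible error and step
for k in range(20):
    e = (alpha + c1*e)*e + (c2 + c3*d)*d
    assert e <= rz + 1e-12                        # the invariant (3.28) holds
print(e, rz, rxi)
```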
3.4. A contraction estimate for predictor-corrector SCP using an exact Jacobian matrix. If $A_k \equiv g'(x^k)$, then the correction vector $s^k = 0$ and the convex subproblem $\mathrm{P}(z^k, A_k, H_k; \xi)$ collapses to the following one:

\[
\mathrm{P}(x^k, H_k; \xi)\qquad
\begin{cases}
\displaystyle\min_{x\in\mathbb{R}^n} & c^T x + \tfrac{1}{2}(x - x^k)^T H_k (x - x^k)\\
\ \text{s.t.} & g(x^k) + g'(x^k)(x - x^k) + M\xi = 0,\\
& x \in \Omega.
\end{cases}
\]

Note that problem $\mathrm{P}(x^k, H_k; \xi)$ does not depend on the multiplier $y^k$ if we choose $H_k$ independently of $y^k$. We refer to the variant of Algorithm 1 that uses the convex subproblem $\mathrm{P}(x^k, H_k; \xi)$ instead of $\mathrm{P}(z^k, A_k, H_k; \xi)$ as the predictor-corrector SCP algorithm (PCSCP) for solving the sequence of optimization problems $\{\mathrm{P}(\xi_k)\}_{k\ge 0}$. Instead of Assumption 3(a) of the previous section, we make the following assumption.

Assumption 4. There exists a constant $0 \le \tilde\kappa < \frac{1}{2\gamma}$ such that

\[
\|\nabla_x^2 L(\bar z^k) - H_k\| \le \tilde\kappa \quad \forall k \ge 0, \tag{3.29}
\]

where $\nabla_x^2 L(z)$ is defined by (3.8).

Assumption 4 requires that the approximation $H_k$ to the Hessian matrix $\nabla_x^2 L(\bar z^k)$ of the Lagrange function $L$ at $\bar z^k$ be sufficiently close. Note that matrix $H_k$ in the framework of the SSDP method in [12] is not necessarily positive definite.
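One simple way to produce a matrix $H_k \in S^n_+$ close to $\nabla_x^2 L(\bar z^k)$ is to project the exact Hessian onto $S^n_+$ by clipping its negative eigenvalues, the same device used for the projected-Hessian SQP comparison in subsection 1.4. A minimal sketch:

```python
# A sketch of one way to build a positive semidefinite H_k: project the exact
# Lagrangian Hessian onto S^n_+ by setting negative eigenvalues to zero.
import numpy as np

def project_psd(hessian):
    w, V = np.linalg.eigh(hessian)            # symmetric eigendecomposition
    return (V * np.maximum(w, 0.0)) @ V.T     # clip negative curvature

H = np.array([[2.0, 0.0], [0.0, -0.5]])       # e.g., the shape of Example 3.2's Hessian
print(project_psd(H))                         # yields diag(2, 0)
```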
Example 3.3. Let us continue analyzing Example 1.1. The Hessian matrix of the Lagrange function $L$ associated with the equality constraint $x_1^2 + 2x_2 + 2 - 4\xi = 0$ is

\[
\nabla_x^2 L(x^*_\xi, y_1^*) = \begin{pmatrix} 2y_1^* & 0 \\ 0 & 0 \end{pmatrix},
\]

where $y_1^*$ is the multiplier associated with the equality constraint at $x^*_\xi$. Let us choose the positive semidefinite matrix $H_k := \begin{pmatrix} h_{11} & 0 \\ 0 & 0 \end{pmatrix}$, where $h_{11} \ge 0$; then $\|\nabla_x^2 L(x^*_\xi, y_1^*) - H_k\| = |2y_1^* - h_{11}|$. Since $y_1^* \ge 0$, for an arbitrary $\tilde\kappa > 0$ we can choose $h_{11} \ge 0$ such that $|2y_1^* - h_{11}| \le \tilde\kappa$. Consequently, the condition (3.29) is satisfied. In the example (1.4) of subsection 1.4, we chose $h_{11} = 0$.

The following theorem shows the same conclusions as in Theorem 3.5 and Corollary 3.6 for the predictor-corrector SCP algorithm.

Theorem 3.7. Suppose that Assumptions 1 and 2 are satisfied for some $\xi_0 \in \mathcal{P}$. Then for $k \ge 0$ and $\bar z^k \in Z^*(\xi_k)$, if $\mathrm{P}(\xi_k)$ is strongly regular at $\bar z^k$, there exist neighborhoods $B(\bar z^k, r_z)$ and $B(\xi_k, r_\xi)$ such that the following hold:
(a) The set of KKT points $Z^*(\xi_{k+1})$ of $\mathrm{P}(\xi_{k+1})$ is nonempty for any $\xi_{k+1} \in B(\xi_k, r_\xi)$.
(b) If, in addition, Assumption 4 is satisfied, then subproblem $\mathrm{P}(x^k, H_k; \xi_{k+1})$ is uniquely solvable in the neighborhood $B(\bar z^k, r_z)$.
(c) Moreover, if, in addition, Assumption 3(b) holds, then the sequence $\{z^k\}_{k\ge 0}$ generated by the PCSCP, where $\xi_{k+1} \in B(\xi_k, r_\xi)$, guarantees the inequality

\[
\|z^{k+1} - \bar z^{k+1}\| \le \big(\tilde\alpha + \tilde c_1 \|z^k - \bar z^k\|\big)\|z^k - \bar z^k\| + \big(\tilde c_2 + \tilde c_3 \|\xi_{k+1} - \xi_k\|\big)\|\xi_{k+1} - \xi_k\|, \tag{3.30}
\]

where $0 \le \tilde\alpha < 1$, $0 \le \tilde c_i < +\infty$ ($i = 1, 2, 3$), and $\tilde c_2 > 0$ are given constants and $\bar z^{k+1} \in Z^*(\xi_{k+1})$.
(d) If the initial point $z^0$ in the PCSCP is chosen such that $\|z^0 - \bar z^0\| \le \tilde r_z$, where $\bar z^0 \in Z^*(\xi_0)$ and $0 < \tilde r_z < \bar{\tilde r}_z := \tilde c_1^{-1}(1 - \tilde\alpha)$, then

\[
\|z^{k+1} - \bar z^{k+1}\| \le \tilde r_z, \tag{3.31}
\]

provided that $\|\xi_{k+1} - \xi_k\| \le \tilde r_\xi$ with $0 < \tilde r_\xi \le \bar{\tilde r}_\xi$,

\[
\bar{\tilde r}_\xi :=
\begin{cases}
(2\tilde c_3)^{-1}\Big[\sqrt{\tilde c_2^2 + 4\tilde c_3 \tilde r_z(1 - \tilde\alpha - \tilde c_1 \tilde r_z)} - \tilde c_2\Big] & \text{if } \tilde c_3 > 0,\\
\tilde c_2^{-1} \tilde r_z (1 - \tilde\alpha - \tilde c_1 \tilde r_z) & \text{if } \tilde c_3 = 0.
\end{cases}
\]

Consequently, the error sequence $\{\|z^k - \bar z^k\|\}_{k\ge 0}$ between the exact KKT points $\bar z^k$ and the approximate KKT points $z^k$ of $\mathrm{P}(\xi_k)$ is still bounded.

Proof. Statement (a) of Theorem 3.7 follows from Theorem 3.5. We prove (b). Since $A_k \equiv g'(x^k)$, the matrix $\tilde F_k$ defined in (3.9) becomes

\[
\hat{\tilde F}_k := \begin{pmatrix} H_k & g'(x^k)^T \\ g'(x^k) & 0 \end{pmatrix}.
\]

Moreover, since $g$ is twice differentiable due to Assumption 1, $g'$ is Lipschitz continuous with a Lipschitz constant $L_g \ge 0$ in $B(\bar x^k, r_z)$. Therefore, by Assumption 4, we have

\[
\|F'(\bar z^k) - \hat{\tilde F}_k\| = \left\|\begin{pmatrix} \nabla_x^2 L(\bar z^k) - H_k & g'(\bar x^k)^T - g'(x^k)^T \\ g'(\bar x^k) - g'(x^k) & 0 \end{pmatrix}\right\| \le \sqrt{\|\nabla_x^2 L(\bar z^k) - H_k\|^2 + 2\|g'(x^k) - g'(\bar x^k)\|^2} \le \sqrt{\tilde\kappa^2 + 2L_g^2\|x^k - \bar x^k\|^2}. \tag{3.32}
\]

Since $\tilde\kappa\gamma < \tfrac{1}{2}$, we can shrink $B(\bar z^k, r_z)$ sufficiently small such that $\gamma\sqrt{\tilde\kappa^2 + 2L_g^2 r_z^2} < \tfrac{1}{2}$. If we define $\tilde\kappa_1 := \sqrt{\tilde\kappa^2 + 2L_g^2 r_z^2} \ge 0$, then the last inequality and (3.32) imply

\[
\|F'(\bar z^k) - \hat{\tilde F}_k\| \le \tilde\kappa_1, \tag{3.33}
\]

where $\tilde\kappa_1\gamma < \tfrac{1}{2}$. Similarly to the proof of Lemma 3.4, we can show that the mapping $\hat J_k := (\hat{\tilde F}_k + N_K)^{-1}$ is single-valued and Lipschitz continuous with Lipschitz constant $\tilde\beta := \gamma(1 - \gamma\tilde\kappa_1)^{-1} > 0$ in $B(\bar z^k, r_z)$. Consequently, the convex problem $\mathrm{P}(x^k, H_k; \xi_{k+1})$ is uniquely solvable in $B(\bar z^k, r_z)$ for all $\xi_{k+1} \in B(\xi_k, r_\xi)$, which proves (b). With the same argument as in the proof of Theorem 3.5, we can also prove the estimate

\[
\|z^{k+1} - \bar z^{k+1}\| \le \big(\tilde\alpha + \tilde c_1 \|z^k - \bar z^k\|\big)\|z^k - \bar z^k\| + \big(\tilde c_2 + \tilde c_3 \|\xi_{k+1} - \xi_k\|\big)\|\xi_{k+1} - \xi_k\|,
\]

where $\tilde\alpha := \gamma\tilde\kappa_1(1 - \gamma\tilde\kappa_1)^{-1} \in [0, 1)$, $\tilde c_1 := \gamma\omega(2 - 2\gamma\tilde\kappa_1)^{-1} \ge 0$, $\tilde c_2 := \gamma\tilde\kappa_1\bar\sigma(1 - \gamma\tilde\kappa_1)^{-1} > 0$, and $\tilde c_3 := \gamma\omega\bar\sigma^2(2 - 2\gamma\tilde\kappa_1)^{-1} \ge 0$. The remaining statements of Theorem 3.7 are proved similarly to the proofs of Theorem 3.5 and Corollary 3.6.

Similarly to Corollary 3.6, the constant $\bar{\tilde r}_\xi$ in statement (d) of this theorem can be simplified to $\bar{\tilde r}_\xi = \frac{(1 - 2\gamma\tilde\kappa_1)^2}{\gamma\omega\bar\sigma}$.

3.5. Choice of matrix $A_k$. In the APCSCP algorithm, an approximate matrix $A_k$ of $g'(x^k)$ and the vector $s^k = (g'(x^k) - A_k)^T y^k$ are required at each iteration such that they maintain Assumption 3. This matrix needs to be obtained in a cost-efficient way but shall also provide a sufficiently good approximation of $g'(x^k)$; these are conflicting objectives. Though not the subject of this paper, let us discuss some ways to obtain $A_k$. First, it might be the exact Jacobian matrix, the most expensive option. Second, it might be computed by a user-provided approximation algorithm, e.g., based on inaccurate differential equation solutions. Third, suppose that the initial approximation $A_0$ is known; for given $z^k$ and $A_k$, $k \ge 0$, we need to compute $A_{k+1}$ and $s^{k+1}$ in an efficient way. If $\|A_k - g'(x^{k+1})\|$ is still small, then we can even use the same matrix for the next iteration, i.e., $A_{k+1} = A_k$, due to Assumption 3 (see section 5). Otherwise, the matrix $A_{k+1}$ can be constructed in different ways, e.g., by using low-rank updates or by a low-accuracy computation. For an inexact computation, we can use either the two-sided rank-1 updates (TR1) [19, 28] or the Broyden formulas [45]. However, it is important to note that a low-rank update might destroy possible sparsity structure of the matrix $A_k$; then high-rank updates might be an option [6, 27].

In Algorithm 1 we can set $H_k = 0$ for all $k \ge 0$. Alternatively, this matrix can be updated at each iteration by using BFGS-type formulas or the projection of $\nabla_x^2 L(z^k)$ onto $S^n_+$.
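To illustrate one of the cheap update options mentioned above, the following sketch implements the classical Broyden rank-1 formula, which enforces the secant condition $A_{k+1}\Delta x = \Delta g$ at the cost of a single function evaluation; the TR1 updates of [19, 28] follow the same pattern while additionally matching adjoint information. The toy function $g$ below is ours.

```python
# A sketch of the Broyden rank-1 Jacobian update mentioned in section 3.5:
#   A_{k+1} = A_k + (dg - A_k dx) dx^T / (dx^T dx),
# which enforces the secant condition A_{k+1} dx = dg with one function
# evaluation and no Jacobian evaluation. The toy g is ours.
import numpy as np

def broyden_update(A, dx, dg):
    return A + np.outer(dg - A @ dx, dx) / (dx @ dx)

g = lambda x: np.array([x[0]**2 + x[1], np.sin(x[0]) * x[1]])
x_old, x_new = np.array([1.0, 2.0]), np.array([1.1, 1.9])
A = np.array([[2.0, 1.0], [1.0, 0.8]])            # some approximation of g'(x_old)
A = broyden_update(A, x_new - x_old, g(x_new) - g(x_old))
print(np.allclose(A @ (x_new - x_old), g(x_new) - g(x_old)))   # secant condition holds
```

As noted above, such rank-1 corrections generally produce dense matrices, so they may destroy sparsity in $A_k$.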
4. Applications in nonlinear programming. If the set of parameters $\mathcal{P}$ collapses to one point, i.e., $\mathcal{P} := \{\xi\}$, then, without loss of generality, we assume that $\xi = 0$, and problem $\mathrm{P}(\xi)$ reduces to a nonlinear programming problem of the form

\[
\mathrm{(P)}\qquad
\begin{cases}
\displaystyle\min_{x\in\mathbb{R}^n} & f(x) := c^T x\\
\ \text{s.t.} & g(x) = 0,\\
& x \in \Omega,
\end{cases}
\]

where $c$, $g$, and $\Omega$ are as in $\mathrm{P}(\xi)$. In this section we describe a local optimization algorithm for solving (P) that is a special case of the APCSCP method.

The subproblem $\mathrm{P}(z^k, A_k, H_k; \xi)$ in Algorithm 1 reduces to

\[
\mathrm{P}(z^j, A_j, H_j)\qquad
\begin{cases}
\displaystyle\min_{x\in\mathbb{R}^n} & c^T x + (s^j)^T(x - x^j) + \tfrac{1}{2}(x - x^j)^T H_j (x - x^j)\\
\ \text{s.t.} & g(x^j) + A_j(x - x^j) = 0,\\
& x \in \Omega.
\end{cases}
\]

Here, we use the index $j$ in the algorithm for the nonparametric problem (see below) to distinguish it from the index $k$ in the parametric case. In order to apply the theory of the previous sections, we only consider the full-step algorithm for solving (P), which we call full-step adjoint-based sequential convex programming (FASCP). It is described as follows.

Algorithm 2 (FASCP).
Initialization. Find an initial guess $x^0 \in \Omega$ and $y^0 \in \mathbb{R}^m$, a matrix $A_0$ approximating $g'(x^0)$, and $H_0 \in S^n_+$. Set $s^0 := (g'(x^0) - A_0)^T y^0$ and $j := 0$.
Iteration $j$ ($j = 0, 1, \dots$). For a given $(z^j, A_j, H_j)$, perform the following steps:
  Step 1. Solve the convex subproblem $\mathrm{P}(z^j, A_j, H_j)$ to obtain a solution $x_t^{j+1}$ and the corresponding multiplier $y^{j+1}$.
  Step 2. If $\|x_t^{j+1} - x^j\| \le \varepsilon$, for a given tolerance $\varepsilon > 0$, then terminate. Otherwise, compute the search direction $\Delta x^j := x_t^{j+1} - x^j$.
  Step 3. Update $x^{j+1} := x^j + \Delta x^j$. Evaluate the function value $g(x^{j+1})$, and update (or recompute) the matrices $A_{j+1}$ and $H_{j+1} \in S^n_+$ (if necessary) and the correction vector $s^{j+1}$. Increase $j$ by 1 and go back to Step 1.
End.

The following corollary shows that the FASCP algorithm generates an iterative sequence that converges linearly to a KKT point of (P).

Corollary 4.1. Let $\hat Z^*$, the set of KKT points of (P), be nonempty, and let $\hat z^* \in \hat Z^*$. Suppose that Assumption 1 holds and that problem (P) is strongly regular at $\hat z^*$ (in the sense of Definition 3.1). Suppose further that Assumption 3(a) is satisfied in $B(\hat z^*, \hat r_z)$. Then there exists a neighborhood $B(\hat z^*, r_z)$ of $\hat z^*$ such that, in this neighborhood, the convex subproblem $\mathrm{P}(z^j, A_j, H_j)$ has a unique KKT point $z^{j+1}$ for any $z^j \in B(\hat z^*, r_z)$. Moreover, if in addition Assumption 3(b) holds, then the sequence $\{z^j\}_{j\ge 0}$ generated by Algorithm 2 starting from $z^0 \in B(\hat z^*, r_z)$ satisfies

\[
\|z^{j+1} - \hat z^*\| \le \big(\hat\alpha + \hat c_1 \|z^j - \hat z^*\|\big)\|z^j - \hat z^*\| \quad \forall j \ge 0, \tag{4.1}
\]

where $0 \le \hat\alpha < 1$ and $0 \le \hat c_1 < +\infty$ are given constants. Consequently, this sequence converges linearly to $\hat z^*$, the unique KKT point of (P) in $B(\hat z^*, r_z)$.

Proof. The estimate (4.1) follows directly from Theorem 3.5 by taking $\xi_k = 0$ for all $k$. The remaining statement is a consequence of (4.1).

If $A_j = g'(x^j)$, then the local convergence of the full-step SCP algorithm considered in [50] follows similarly from Theorem 3.7.

Remark. The adjoint-based variant is a generalization of the SSDP methods in [12, 21] and of the SQP method presented in [31].
5. Numerical results. In this section we test the algorithms proposed in the previous sections on the solution of the optimization problem arising when applying model predictive control to a hydro power plant.

5.1. Dynamic model. We consider a hydro power plant composed of several subsystems connected together. The system includes six dams with turbines $D_i$ ($i = 1, \dots, 6$) located along a river and three lakes $L_1$, $L_2$, and $L_3$, as visualized in Figure 5.1. $U_1$ is a duct connecting lakes $L_1$ and $L_2$, $T_1$ and $T_2$ are ducts equipped with turbines, and $C_1$ and $C_2$ are ducts equipped with turbines and pumps. The flows through the turbines and pumps are the controlled variables. The complete model with all the parameters can be found in [44].

Fig. 5.1. Overview of the hydro power plant.

The dynamics of the lakes is given by

\[
\frac{\partial h(t)}{\partial t} = \frac{q_{\text{in}}(t) - q_{\text{out}}(t)}{S}, \tag{5.1}
\]

where $h(t)$ is the water level and $S$ is the surface area of the lakes; $q_{\text{in}}$ and $q_{\text{out}}$ are the input and output flows, respectively. The dynamics of the reaches $R_i$ ($i = 1, \dots, 6$) are described by the one-dimensional Saint-Venant partial differential equation:

\[
\begin{cases}
\dfrac{\partial s(t,y)}{\partial t} + \dfrac{\partial q(t,y)}{\partial y} = 0,\\[1.5ex]
\dfrac{1}{g}\dfrac{\partial}{\partial t}\left[\dfrac{q(t,y)}{s(t,y)}\right] + \dfrac{1}{2g}\dfrac{\partial}{\partial y}\left[\dfrac{q^2(t,y)}{s^2(t,y)}\right] + \dfrac{\partial h(t,y)}{\partial y} + I_f(t,y) - I_0(y) = 0.
\end{cases} \tag{5.2}
\]

Here, $y$ is the spatial variable along the flow direction of the river, $q$ is the river flow (or discharge), $s$ is the wetted surface, $h$ is the water level with respect to the river bed, $g$ is the gravitational acceleration, $I_f$ is the friction slope, and $I_0$ is the river bed slope. The partial differential equation (5.2) can be discretized by applying the method of lines in order to obtain a system of ordinary differential equations. Stacking all the equations together, we represent the dynamics of the system by

\[
\dot w(t) = f(w, u), \tag{5.3}
\]

where the state vector $w \in \mathbb{R}^{n_w}$ includes all the flows and the water levels and $u \in \mathbb{R}^{n_u}$ represents the input vector. The dynamic system consists of $n_w = 259$ states and $n_u = 10$ controls. The control inputs are the flows going into the turbines, the ducts, and the reaches.

5.2. NMPC formulation. We are interested in the following NMPC setting:

\[
\begin{cases}
\displaystyle\min_{w,u} & J(w(\cdot), u(\cdot))\\
\ \text{s.t.} & \dot w = f(w, u), \quad w(t) = w_0(t),\\
& u(\tau) \in U,\ w(\tau) \in W,\ \tau \in [t, t+T],\\
& w(t+T) \in R_T,
\end{cases} \tag{5.4}
\]

where the objective function $J(w(\cdot), u(\cdot))$ is given by

\[
J(w(\cdot), u(\cdot)) := \int_t^{t+T} \big[(w(\tau) - w_s)^T P (w(\tau) - w_s) + (u(\tau) - u_s)^T Q (u(\tau) - u_s)\big]\,d\tau + (w(t+T) - w_s)^T S (w(t+T) - w_s). \tag{5.5}
\]

Here $P$, $Q$, and $S$ are given symmetric positive definite weighting matrices, and $(w_s, u_s)$ is a steady state of the dynamics (5.3). The control variables are bounded by lower and upper bounds, while some state variables are also bounded and the others are unconstrained. Consequently, $W$ and $U$ are boxes in $\mathbb{R}^{n_w}$ and $\mathbb{R}^{n_u}$, respectively, but $W$ is not necessarily bounded. The terminal region $R_T$ is a control-invariant ellipsoidal set centered at $w_s$ of radius $r > 0$ and scaled by the matrix $S$, i.e.,

\[
R_T := \big\{ w \in \mathbb{R}^{n_w} \mid (w - w_s)^T S (w - w_s) \le r \big\}. \tag{5.6}
\]

To compute the matrix $S$ and the radius $r$ in (5.6), the procedure proposed in [11] can be used. In [30] it has been shown that the receding horizon control formulation (5.4) ensures the stability of the closed-loop system under mild assumptions. Therefore, the aim of this example is to track the steady state of the system and to ensure the stability of the system by satisfying the terminal constraint along the moving horizon. To have a more realistic simulation, we added disturbances to the input flow $q_{\text{in}}$ at the beginning of the reach $R_1$ and to the tributary flow $q_{\text{tributary}}$. The matrices $P$ and $Q$ have been set to

\[
P := \mathrm{diag}\left(\frac{0.01}{(w_s)_i^2 + 1} : 1 \le i \le n_w\right), \qquad Q := \mathrm{diag}\left(\frac{1}{(u_l + u_b)_i^2 + 1} : 1 \le i \le n_u\right),
\]

where $u_l$ and $u_b$ are the lower and upper bounds of the control input $u$.
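For concreteness, the weighting matrices and the terminal-set membership test of (5.6) take only a few lines; all dimensions and the matrix $S$ below are placeholders rather than the values used in the benchmark, while $r$ is the value reported in subsection 5.4.

```python
# A sketch of the weighting matrices and the terminal-region test (5.6);
# ws, us, ul, ub, and S below are placeholders, not the benchmark values.
import numpy as np

nw, nu = 6, 2                                     # placeholder dimensions
ws, us = np.ones(nw), np.zeros(nu)
ul, ub = -np.ones(nu), np.ones(nu)
P = np.diag(0.01 / (ws**2 + 1.0))                 # P as defined above
Q = np.diag(1.0 / ((ul + ub)**2 + 1.0))           # Q as defined above
S, r = np.eye(nw), 1.687836                       # placeholder S; r from section 5.4

in_RT = lambda w: (w - ws) @ S @ (w - ws) <= r    # membership in R_T, cf. (5.6)
print(in_RT(ws + 0.1))
```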
5.3. A short description of the multiple shooting method. We briefly describe the multiple shooting formulation [6], which we use to discretize the continuous time problem (5.4). The time horizon $[t, t+T]$ of $T = 4$ hours is discretized into $H_p = 16$ shooting intervals of $\Delta\tau = 15$ minutes such that $\tau_0 = t$ and $\tau_{i+1} := \tau_i + \Delta\tau$ ($i = 0, \dots, H_p - 1$). The control $u(\cdot)$ is parametrized by a piecewise constant function, $u(\tau) = u_i$ for $\tau_i \le \tau \le \tau_i + \Delta\tau$ ($i = 0, \dots, H_p - 1$). Let us introduce $H_p + 1$ shooting node variables $s_i$ ($i = 0, \dots, H_p$). Then, by integrating the dynamic system $\dot w = f(w, u)$ on each interval $[\tau_i, \tau_i + \Delta\tau]$, the continuous dynamics (5.3) is transformed into nonlinear equality constraints of the form

\[
g(x) + M\xi := \begin{bmatrix} s_0 - \xi \\ w(s_0, u_0) - s_1 \\ \vdots \\ w(s_{H_p-1}, u_{H_p-1}) - s_{H_p} \end{bmatrix} = 0. \tag{5.7}
\]

Here, the vector $x$ combines all the controls and shooting node variables $u_i$ and $s_i$ as $x = (s_0^T, u_0^T, \dots, s_{H_p-1}^T, u_{H_p-1}^T, s_{H_p}^T)^T$, $\xi$ is the initial state $w_0(t)$, which is considered as a parameter, and $w(s_i, u_i)$ is the result of the integration of the dynamics from $\tau_i$ to $\tau_i + \Delta\tau$, where we set $u(\tau) = u_i$ and $w(\tau_i) = s_i$. The objective function (5.5) is approximated by

\[
f(x) := \sum_{i=0}^{H_p - 1} \big[(s_i - w_s)^T P (s_i - w_s) + (u_i - u_s)^T Q (u_i - u_s)\big] + (s_{H_p} - w_s)^T S (s_{H_p} - w_s), \tag{5.8}
\]

while the constraints are imposed only at $\tau = \tau_i$, the beginning of each interval, as

\[
s_i \in W, \quad u_i \in U \quad (i = 0, \dots, H_p - 1), \qquad s_{H_p} \in R_T. \tag{5.9}
\]

If we define $\Omega := U^{H_p} \times (W^{H_p} \times R_T) \subset \mathbb{R}^{n_x}$, then $\Omega$ is convex. Moreover, the objective function (5.8) is convex quadratic. Therefore, the resulting optimization problem is indeed of the form $\mathrm{P}(\xi)$. Note that $\Omega$ is not a box but a curved convex set due to $R_T$. The nonlinear program to be solved at every sampling time has 4563 decision variables and 4403 equality constraints, and its function and derivative values are expensive to evaluate due to the ODE integration.
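To make the structure of (5.7) explicit, the sketch below assembles the multiple-shooting residual $g(x) + M\xi$ for a generic ODE right-hand side with a fixed-step RK4 integrator; the toy dynamics and all dimensions are placeholders, not the power plant model.

```python
# A sketch of the multiple-shooting equality constraints (5.7): the residual
# stacks s_0 - xi and the matching conditions w(s_i, u_i) - s_{i+1}. The RK4
# integrator and the toy dynamics f are placeholders for the plant model.
import numpy as np

def rk4(f, w, u, dt, substeps=10):
    h = dt / substeps
    for _ in range(substeps):
        k1 = f(w, u)
        k2 = f(w + 0.5*h*k1, u)
        k3 = f(w + 0.5*h*k2, u)
        k4 = f(w + h*k3, u)
        w = w + (h/6) * (k1 + 2*k2 + 2*k3 + k4)
    return w

def shooting_residual(f, s, u, xi, dt):
    # s: (Hp+1, nw) node states, u: (Hp, nu) controls, xi: initial-state parameter
    res = [s[0] - xi]
    for i in range(len(u)):
        res.append(rk4(f, s[i], u[i], dt) - s[i + 1])
    return np.concatenate(res)

f = lambda w, u: -w + u                     # toy linear dynamics (placeholder)
Hp, nw, dt = 4, 2, 0.25
s = np.zeros((Hp + 1, nw))
u = np.zeros((Hp, 1))
print(shooting_residual(f, s, u, np.array([0.1, -0.2]), dt))
```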
5.4. Numerical simulation. Before presenting the simulation results, we give some details on the implementation. To evaluate the performance of the methods proposed in this paper, we implemented the following algorithms:
• Full-NMPC: the nonlinear program obtained by multiple shooting is solved to convergence at every sampling time by several SCP iterations.
• PCSCP: the implementation of Algorithm 1 using the exact Jacobian matrix of $g$.
• APCSCP: the implementation of Algorithm 1 with an approximate Jacobian of $g$. The matrix $A_k$ is fixed at $A_k = g'(x^0)$ for all $k \ge 0$, where $x^0$ is computed approximately offline by performing the SCP algorithm (Algorithm 2) to solve the nonlinear program $\mathrm{P}(\xi)$ with $\xi = \xi_0 = w_0(t)$.
• RTGN: the solution of the nonlinear program is approximated by solving a quadratic program obtained by linearizing the dynamics and the terminal constraint $s_{H_p} \in R_T$. The exact Jacobian $g'(\cdot)$ of $g$ is used. This method can be regarded as a classical real-time iteration [17] based on the constrained Gauss–Newton method in [6, 13].

To compute the set $R_T$, a mixed MATLAB and C++ code has been used. The computed value of $r$ is 1.687836, while the matrix $S$ is dense, symmetric, and positive definite. The QPs and the quadratically constrained quadratic programming problems (QCQPs) arising in the implemented algorithms can be solved efficiently by means of interior point or other methods [9, 38]. In our implementation, we used the commercial solver CPLEX, which can deal with both types of problems. All the tests have been implemented in C++ and run on a 16-core workstation with 2.7 GHz Intel Xeon CPUs and 12 GB of RAM. We used CasADi, an open-source C++ package [1] which implements automatic differentiation, to calculate the derivatives of the functions; it offers an interface to CVODES from the Sundials package [46] to integrate the ordinary differential equations and compute the sensitivities. The integration has been parallelized using OpenMP.

In the full-NMPC algorithm we perform at most five SCP iterations for each time interval. We terminated the SCP algorithm when the relative infinity-norm of the search direction as well as of the feasibility gap reached the tolerance $\varepsilon = 10^{-3}$. To have a fair comparison of the different methods, the starting point $x^0$ of the PCSCP, APCSCP, and RTGN algorithms has been set to the solution of the first full-NMPC iteration. The disturbances on the flows $q_{\text{in}}$ and $q_{\text{tributary}}$ are generated randomly, varying from 0 to 30 and from 0 to 10, respectively. All the simulations are perturbed with the same disturbance scenario. We simulated the algorithms for $H_m = 30$ time intervals.

The average computational time required by the four methods is summarized in the first part of Table 5.1. Here, AvEvalTime is the average time in seconds needed to evaluate the function $g$ and its Jacobian, AvSolTime is the average time for solving the QP or QCQP subproblems, AvAdjDirTime is the average time for evaluating the adjoint direction $g'(x^k)^T y^k$ in Algorithm 1, and Total corresponds to the sum of the previous terms and some preparation time. On average, the full-NMPC algorithm needed 3.32 iterations to converge to a solution. The second part of Table 5.1 reports the minimum and maximum times corresponding to the evaluation of the function and its Jacobian, the solution of the subproblems, the calculation of the adjoint derivatives, and the total time.

Table 5.1
The average computational time of the four methods.

Methods     AvEvalTime[s]        AvSolTime[s]       AvAdjDirTime[s]    Total[s]
Full-NMPC   219.655 (82.43%)     46.804 (17.56%)    -                  266.483
PCSCP       57.724 (89.23%)      7.627 (10.76%)     -                  64.690
RTGN        58.095 (95.85%)      2.511 (4.14%)      -                  60.608
APCSCP      0.443 (4.73%)        8.512 (78.90%)     1.527 (16.31%)     9.364

Methods     [min, max]           [min, max]         [min, max]         [min, max]
Full-NMPC   [164.884, 302.288]   [13.489, 114.899]  -                  [179.664, 397.861]
PCSCP       [52.162, 70.776]     [4.427, 15.476]    -                  [59.881, 86.258]
RTGN        [52.971, 68.021]     [2.265, 2.943]     -                  [55.680, 70.333]
APCSCP      [0.402, 0.596]       [4.806, 13.110]    [1.331, 1.862]     [5.323, 14.153]

It can be seen from Table 5.1 that evaluating the function and its Jacobian matrix costs approximately 82% to 96% of the total time. On the other hand, solving a QCQP problem is approximately two to five times more expensive than solving a QP problem. The most expensive step at every iteration is the integration of the dynamics and its linearization. The average computational time of PCSCP and RTGN is similar, while the time consumed by APCSCP is approximately six times less than that of PCSCP.
The closed-loop control profiles of the simulation are illustrated in Figures 5.2 and 5.3: the first shows the flows in the turbines and the ducts of lakes $L_1$ and $L_2$, while the second plots the flows to be controlled in the reaches $R_i$ ($i = 1, \dots, 6$). We can observe that the control profiles achieved by PCSCP as well as APCSCP are close to the profiles obtained by full-NMPC, while the results from RTGN oscillate in the first intervals due to the violation of the terminal constraint. The terminal constraint in PCSCP is active in many iterations.

[Fig. 5.2. The controller profiles $q_{T_1}$, $q_{C_1}$, $q_{T_2}$, and $q_{C_2}$.]

[Fig. 5.3. The controller profiles of $q_{R_1}, \dots, q_{R_6}$.]

Figure 5.4 shows the relative feasibility and optimality gaps of the four methods, where

Relative Feasibility Gap $:= \|g(x^k) + M\xi_{k+1}\|_\infty / \max\{1.0, \|g(x^0) + M\xi_0\|_\infty\}$ and
Relative Optimality Gap $:= \|\nabla_x \tilde{L}(x^k, \lambda^k)\|_\infty / \max\{1.0, \|\nabla f(x^0)\|_\infty\}$,

with $\tilde{L}$ being the "full" Lagrange function of the optimization problem obtained from P($\xi$) by fixing $\xi$ at $\xi = \xi_{k+1}$. While the relative optimality gaps vary in the range [0.01, 0.05], the feasibility gaps in PCSCP and full-NMPC are smaller than in APCSCP and RTGN.

[Fig. 5.4. The relative feasibility and optimality gaps of PCSCP, APCSCP, RTGN, and full-NMPC.]

The relative differences between the approximate solution of full-NMPC and the approximate solutions of the three other methods,

Relative Solution Differences $:= \left\| (x^k - x^k_{\text{full-nmpc}}) \,./\, \max\{\|x^k_{\text{full-nmpc}}\|, 1\} \right\|_\infty$,

are plotted in Figure 5.5. These quantities in PCSCP and APCSCP are smaller than in RTGN; this happens because the linearization of the quadratic constraint cannot adequately capture the shape of the terminal constraint $s_{H_p} \in R_T$. The relative solution differences in APCSCP are as good as in PCSCP.

[Fig. 5.5. The relative differences between the approximate solution of full-NMPC and those of PCSCP, APCSCP, and RTGN.]
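For completeness, the three quantities plotted in Figures 5.4 and 5.5 can be computed as in the following sketch (a hypothetical NumPy helper mirroring the formulas above; interpreting $\max\{\|x^k_{\text{full-nmpc}}\|, 1\}$ as a scalar is our assumption):

```python
import numpy as np

def tracking_metrics(gk, g0, grad_L_k, grad_f_0, xk, x_full):
    """gk = g(x^k) + M xi_{k+1}, g0 = g(x^0) + M xi_0; grad_L_k is the
    gradient of the 'full' Lagrangian at (x^k, lambda^k); x_full is the
    full-NMPC solution. Returns the three relative gaps."""
    inf = np.inf
    feas = np.linalg.norm(gk, inf) / max(1.0, np.linalg.norm(g0, inf))
    opt = np.linalg.norm(grad_L_k, inf) / max(1.0, np.linalg.norm(grad_f_0, inf))
    diff = np.linalg.norm((xk - x_full) / max(np.linalg.norm(x_full), 1.0), inf)
    return feas, opt, diff
```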
6. Conclusions. We have proposed an adjoint-based predictor-corrector SCP algorithm and two variants for solving parametric optimization problems as well as nonlinear optimization problems. We have proved the stability of the tracking error for the online SCP algorithm and the local convergence of the SCP algorithm. These methods are suitable for nonconvex problems that possess convex substructures which can be handled efficiently by convex optimization techniques. The performance of the algorithms has been validated on a numerical example arising from nonlinear model predictive control. The basic assumptions used in our development are the strong regularity condition and Assumptions 3(a) and 3(b). The strong regularity concept was introduced by Robinson in [42] and has been widely used in optimization and nonlinear analysis, while Assumption 3(b) is needed in any Newton-type algorithm. As in SQP methods, these assumptions involve Lipschitz constants that may be difficult to evaluate in practice. Future research directions include the development of a complete theory for this approach and its application to new problems. For example, in robust control as well as robust optimization formulations, where the worst-case performance within robust counterparts is considered, a nonlinear programming problem with second-order cone and semidefinite constraints needs to be solved; this can be done by employing the SCP framework.

Acknowledgments. The authors would like to thank the associate editor and two anonymous referees for their insightful comments and suggestions, which helped us to improve this paper.

REFERENCES

[1] J. Andersson, B. Houska, and M. Diehl, Towards a computer algebra system with automatic differentiation for use with object-oriented modeling languages, in Proceedings of the 3rd International Workshop on Equation-Based Object-Oriented Modeling Languages and Tools, Oslo, Norway, 2010.
[2] I. Bauer, H. G. Bock, S. Körkel, and J. P. Schlöder, Numerical methods for optimum experimental design in DAE systems, J. Comput. Appl. Math., 120 (2000), pp. 1–15.
[3] L. T. Biegler, Efficient solution of dynamic optimization and NMPC problems, in Nonlinear Predictive Control, Progress in Systems Theory, F. Allgöwer and A. Zheng, eds., Birkhäuser, Basel, 2000, pp. 219–244.
[4] L. T. Biegler and J. B. Rawlings, Optimization approaches to nonlinear model predictive control, in Proceedings of the 4th International Conference on Chemical Process Control (CPC IV), W. H. Ray and Y. Arkun, eds., 1991, pp. 543–571.
[5] H. G. Bock, M. Diehl, D. B. Leineweber, and J. P. Schlöder, A direct multiple shooting method for real-time optimization of nonlinear DAE processes, in Nonlinear Predictive Control, Progress in Systems Theory, F. Allgöwer and A. Zheng, eds., Birkhäuser, Basel, 2000, pp. 246–267.
[6] H. G. Bock and K. J. Plitt, A multiple shooting algorithm for direct solution of optimal control problems, in Proceedings of the 9th IFAC World Congress, Budapest, Pergamon Press, 1984, pp. 243–247.
[7] J. F. Bonnans, Local analysis of Newton-type methods for variational inequalities and nonlinear programming, Appl. Math. Optim., 29 (1994), pp. 161–186.
[8] J. F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems, Springer, Berlin, 2000.
[9] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2004.
[10] Y. Cao, S. Li, L. Petzold, and R. Serban, Adjoint sensitivity analysis for differential-algebraic equations: The adjoint DAE system and its numerical solution, SIAM J. Sci. Comput., 24 (2003), pp. 1076–1089.
[11] H. Chen and F. Allgöwer, A quasi-infinite horizon nonlinear model predictive control scheme with guaranteed stability, Automatica, 34 (1998), pp. 1205–1218.
[12] R. Correa and H. Ramirez C., A Global Algorithm for Nonlinear Semidefinite Programming, Technical report RR-4672, INRIA-Rocquencourt, 2002.
[13] P. Deuflhard, Newton Methods for Nonlinear Problems: Affine Invariance and Adaptive Algorithms, Springer-Verlag, Berlin, 2004.
[14] M. Diehl, Real-Time Optimization for Large Scale Nonlinear Processes, http://www.ub.uni-heidelberg.de/archiv/1659 (2002).
[15] M. Diehl, H. G. Bock, and E. Kostina, An approximation technique for robust nonlinear optimization, Math. Program., 107 (2006), pp. 213–230.
[16] M. Diehl, H. G. Bock, and J. P. Schlöder, A real-time iteration scheme for nonlinear optimization in optimal feedback control, SIAM J. Control Optim., 43 (2005), pp. 1714–1736.
[17] M. Diehl, H. G. Bock, J. P. Schlöder, R. Findeisen, Z. Nagy, and F. Allgöwer, Real-time optimization and nonlinear model predictive control of processes governed by differential-algebraic equations, J. Proc. Control, 12 (2002), pp. 577–585.
[18] M. Diehl, F. Jarre, and C. Vogelbusch, Loss of superlinear convergence for an SQP-type method with conic constraints, SIAM J. Optim., 16 (2006), pp. 1201–1210.
[19] M. Diehl, A. Walther, H. G. Bock, and E. Kostina, An adjoint-based SQP algorithm with quasi-Newton Jacobian updates for inequality constrained optimization, Optim. Methods Softw., 25 (2010), pp. 531–552.
[20] A. L. Dontchev and R. T. Rockafellar, Characterizations of strong regularity for variational inequalities over polyhedral convex sets, SIAM J. Optim., 6 (1996), pp. 1087–1105.
[21] B. Fares, D. Noll, and P. Apkarian, Robust control via sequential semidefinite programming, SIAM J. Control Optim., 40 (2002), pp. 1791–1820.
[22] C. Fleury, Sequential convex programming for structural optimization problems, in Optimization of Large Structural Systems: Proceedings of the NATO/DFG Advanced Study Institute, Berchtesgaden, Germany, 1991, pp. 531–553.
[23] R. W. Freund, F. Jarre, and C. H. Vogelbusch, Nonlinear semidefinite programming: Sensitivity, convergence, and an application in passive reduced-order modeling, Math. Program. Ser. B, 109 (2007), pp. 581–611.
[24] R. W. Freund and F. Jarre, A Sensitivity Analysis and a Convergence Result for a Sequential Semidefinite Programming Method, Technical report, Bell Laboratories, Murray Hill, NJ, 2003.
[25] J. Gauvin and R. Janin, Directional behaviour of optimal solutions in nonlinear mathematical programming, Math. Oper. Res., 13 (1988), pp. 629–649.
[26] A. Griewank, Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, Frontiers in Appl. Math. 19, SIAM, Philadelphia, 2000.
[27] A. Griewank and Ph. L. Toint, Partitioned variable metric updates for large structured optimization problems, Numer. Math., 39 (1982), pp. 119–137.
[28] A. Griewank and A. Walther, On constrained optimization by adjoint based quasi-Newton methods, Optim. Methods Softw., 17 (2002), pp. 869–889.
[29] A. Helbig, O. Abel, and W. Marquardt, Model predictive control for on-line optimization of semi-batch reactors, in Proceedings of the American Control Conference, Philadelphia, 1998, pp. 1695–1699.
[30] A. Jadbabaie and J. Hauser, On the stability of receding horizon control with a general terminal cost, IEEE Trans. Automat. Control, 50 (2005), pp. 674–678.
[31] F. Jarre, On an Approximation of the Hessian of the Lagrangian, http://www.optimization-online.org/DB_HTML/2003/12/800.html (2003).
[32] C. Kanzow, C. Nagel, H. Kato, and M. Fukushima, Successive linearization methods for nonlinear semidefinite programs, Comput. Optim. Appl., 31 (2005), pp. 251–273.
[33] H. Kato and M. Fukushima, An SQP-type algorithm for nonlinear second-order cone programs, Optim. Lett., 1 (2007), pp. 129–144.
[34] D. Klatte and B. Kummer, Nonsmooth Equations in Optimization: Regularity, Calculus, Methods and Applications, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2002.
[35] A. S. Lewis and S. J. Wright, A Proximal Method for Composite Minimization, Technical report, University of Wisconsin, Madison, 2008.
[36] J. Mattingley, Y. Wang, and S. Boyd, Code generation for receding horizon control, in Proceedings of the IEEE International Symposium on Computer-Aided Control System Design, Yokohama, Japan, 2010.
[37] S. Mehrotra, On the implementation of a primal-dual interior point method, SIAM J. Optim., 2 (1992), pp. 575–601.
[38] Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2004.
[39] J. Nocedal and S. J. Wright, Numerical Optimization, Springer Ser. Oper. Res. Financ. Eng., Springer, New York, 2006.
[40] T. Ohtsuka, A continuation/GMRES method for fast computation of nonlinear receding horizon control, Automatica, 40 (2004), pp. 563–574.
[41] J. V. Outrata, Mathematical programs with equilibrium constraints: Theory and numerical methods, in Nonsmooth Mechanics of Solids, J. Haslinger and G. E. Stavroulakis, eds., CISM Courses and Lectures 485, Springer, Vienna, 2006, pp. 221–274.
[42] S. M. Robinson, Strongly regular generalized equations, Math. Oper. Res., 5 (1980), pp. 43–62.
[43] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, Springer-Verlag, New York, 1997.
[44] C. Savorgnan and M. Diehl, Control Benchmark of a Hydro Power Plant, Technical report, Optimization in Engineering Center, K.U. Leuven, 2010; also available online from http://www.ict-hd-mpc.eu/index.php?page=benchmarks.
[45] S. Schlenkrich, A. Griewank, and A. Walther, On the local convergence of adjoint Broyden methods, Math. Program., 121 (2010), pp. 221–247.
[46] R. Serban and A. C. Hindmarsh, CVODES: The sensitivity-enabled ODE solver in SUNDIALS, in Proceedings of IDETC/CIE 2005.
[47] M. Stingl, M. Kocvara, and G. Leugering, A sequential convex semidefinite programming algorithm for multiple-load free material optimization, SIAM J. Optim., 20 (2009), pp. 130–155.
[48] J. F. Sturm, Using SeDuMi 1.02: A MATLAB toolbox for optimization over symmetric cones, Optim. Methods Softw., 11–12 (1999), pp. 625–653.
[49] K. Svanberg, The method of moving asymptotes: A new method for structural optimization, Internat. J. Numer. Methods Engrg., 24 (1987), pp. 359–373.
[50] Q. Tran Dinh and M. Diehl, Local convergence of sequential convex programming for nonconvex optimization, in Recent Advances in Optimization and Its Applications in Engineering, M. Diehl, F. Glineur, E. Jarlebring, and W. Michiels, eds., Springer-Verlag, Berlin, 2010, pp. 93–102.
[51] Q. Tran Dinh, S. Gumussoy, W. Michiels, and M. Diehl, Combining convex-concave decompositions and linearization approaches for solving BMIs, with application to static output feedback, IEEE Trans. Automat. Control, 57 (2012), pp. 1377–1390.
[52] Q. Tran Dinh, C. Savorgnan, and M. Diehl, Real-time sequential convex programming for optimal control applications, in Modeling, Simulation and Optimization of Complex Processes, H. G. Bock, H. X. Phu, R. Rannacher, and J. P. Schlöder, eds., Springer-Verlag, Berlin, 2012, pp. 91–102.
[53] R. H. Tütüncü, K. C. Toh, and M. J. Todd, Solving semidefinite-quadratic-linear programs using SDPT3, Math. Program., 95 (2003), pp. 189–217.
[54] V. M. Zavala and M. Anitescu, Real-time nonlinear optimization as a generalized equation, SIAM J. Control Optim., 48 (2010), pp. 5444–5467.
[55] C. Zillober, K. Schittkowski, and K. Moritzen, Very large scale optimization by sequential convex programming, Optim. Methods Softw., 19 (2004), pp. 103–120.