Numer Algor (2011) 58:23–52
DOI 10.1007/s11075-011-9446-9

ORIGINAL PAPER

Extending the applicability of the Gauss–Newton method under average Lipschitz–type conditions

Ioannis K. Argyros · Saïd Hilout

Received: 25 October 2010 / Accepted: 17 January 2011 / Published online: February 2011
© Springer Science+Business Media, LLC 2011

Abstract We extend the applicability of the Gauss–Newton method for solving singular systems of equations under the notions of average Lipschitz–type conditions introduced recently in Li et al. (J Complex 26(3):268–295, 2010). Using our idea of recurrent functions, we provide a tighter local as well as semilocal convergence analysis for the Gauss–Newton method than in Li et al. (J Complex 26(3):268–295, 2010), who recently extended and improved earlier results (Hu et al., J Comput Appl Math 219:110–122, 2008; Li et al., Comput Math Appl 47:1057–1067, 2004; Wang, Math Comput 68(255):169–186, 1999). We also note that our results are obtained under weaker or the same hypotheses as in Li et al. (J Complex 26(3):268–295, 2010). Applications to some special cases of Kantorovich–type conditions are also provided in this study.

Keywords Gauss–Newton method · Newton's method · Majorizing sequences · Recurrent functions · Local/semilocal convergence · Kantorovich hypothesis · Average Lipschitz conditions

Mathematics Subject Classifications (2010) 65G99 · 65H10 · 65B05 · 65N30 · 47H17 · 49M15

I. K. Argyros
Department of Mathematical Sciences, Cameron University, Lawton, OK 73505, USA
e-mail: iargyros@cameron.edu

S. Hilout
Laboratoire de Mathématiques et Applications, Poitiers University, Bd. Pierre et Marie Curie, Téléport 2, B.P. 30179, 86962 Futuroscope Chasseneuil Cedex, France
e-mail: said.hilout@math.univ-poitiers.fr

1 Introduction

In this study we are concerned with the problem of approximating a locally unique solution x* of the equation

$$F(x) = 0, \qquad (1.1)$$

where F is a Fréchet–differentiable operator defined on an open, nonempty, convex subset D of R^m with values in R^l, m, l ∈ N.

The field of computational sciences has seen considerable development in mathematics, engineering sciences, and economic equilibrium theory. For example, dynamic systems are mathematically modeled by difference or differential equations, and their solutions usually represent the states of the systems. For the sake of simplicity, assume that a time–invariant system is driven by the equation ẋ = T(x), for some suitable operator T, where x is the state. Then the equilibrium states are determined by solving equation (1.1). Similar equations are used in the case of discrete systems. The unknowns of engineering equations can be functions (difference, differential, and integral equations), vectors (systems of linear or nonlinear algebraic equations), or real or complex numbers (single algebraic equations with single unknowns). Except in special cases, the most commonly used solution methods are iterative: starting from one or several initial approximations, a sequence is constructed that converges to a solution of the equation. Iteration methods are also applied for solving optimization problems; in such cases, the iteration sequences converge to an optimal solution of the problem at hand. Since all of these methods have the same recursive structure, they can be introduced and discussed in a general framework. We note that in computational sciences, the practice of numerical analysis for finding such solutions is essentially connected to variants of Newton's method.
We shall use the Gauss–Newton method (GNM)

$$x_{n+1} = x_n - F'(x_n)^+ F(x_n) \quad (n \ge 0), \quad (x_0 \in D), \qquad (1.2)$$

to generate a sequence {x_n} approximating a solution x* of the equation

$$F'(x)^+ F(x) = 0, \qquad (1.3)$$

where F'(x)^+ denotes the Moore–Penrose inverse of the matrix F'(x), x ∈ D (see Definition 2.3). If m = l and F'(x_n) is invertible, then (GNM) reduces to Newton's method (NM), given by

$$x_{n+1} = x_n - F'(x_n)^{-1} F(x_n) \quad (n \ge 0), \quad (x_0 \in D).$$

There is an extensive literature on the local as well as the semilocal convergence of (GNM) and (NM) under Lipschitz–type conditions. We refer the reader to [4, 6] and the references there for convergence results on Newton–type methods (see also [1–3, 5, 9–30, 32–52]). In particular, we recommend the paper by Xu and Li [52], where (GNM) is studied under average Lipschitz conditions.

In [31], Li, Hu and Wang provided a Kantorovich–type convergence analysis for (GNM) by inaugurating the notions of a certain type of average Lipschitz conditions; (GNM) is also studied there using the Smale point estimate theory. This way, they unified convergence criteria for (GNM). Special cases of their results extend and/or improve important known results [3, 23].

In this study, we are motivated by the elegant work in [31] and by optimization considerations. In particular, using our new concept of recurrent functions, we provide a tighter convergence analysis for (GNM) under weaker or the same hypotheses as in [31], for both the local and the semilocal case.

The study is organized as follows: Section 2 contains some preliminaries on majorizing sequences for (GNM) and well known properties of Moore–Penrose inverses. In Sections 3 and 4, we provide a semilocal and a local convergence analysis for (GNM), respectively, using the Kantorovich approach. The semilocal convergence of (GNM) using recurrent functions is given in Section 5. Finally, applications and further comparisons between the Kantorovich and the recurrent functions approach are given in Section 6.

2 Preliminaries

Let R̄ = R ∪ {+∞} and R̄+ = [0, +∞]. We assume that L and L_0 are non–decreasing functions on [0, R), where R ∈ R̄+, and that

$$L_0(t) \le L(t) \quad \text{for all } t \in [0, R). \qquad (2.1)$$

Let β > 0 and 0 ≤ λ < 1 be given parameters. Define the function g : [0, R] → R by

$$g(t) = \beta - t + \int_0^t L_0(u)\,(t - u)\,du. \qquad (2.2)$$

Moreover, define the majorizing function h_λ : [0, R] → R corresponding to a fixed pair (λ, L) by

$$h_\lambda(t) = \beta - (1 - \lambda)\,t + \int_0^t L(u)\,(t - u)\,du. \qquad (2.3)$$

We have, for t ∈ [0, R],

$$g'(t) = -1 + \int_0^t L_0(u)\,du, \qquad (2.4)$$

$$g''(t) = L_0(t), \qquad (2.5)$$

$$h_\lambda'(t) = -(1 - \lambda) + \int_0^t L(u)\,du, \qquad (2.6)$$

and

$$h_\lambda''(t) = L(t). \qquad (2.7)$$

It follows from (2.1), (2.4), and (2.6) that

$$g'(t) \le h_0'(t) \quad \text{for all } t \in [0, R]. \qquad (2.8)$$

Define

$$r_\lambda := \sup\Big\{r \in (0, R) : \int_0^r L(u)\,du \le 1 - \lambda\Big\}, \qquad (2.9)$$

and

$$b_\lambda := (1 - \lambda)\,r_\lambda - \int_0^{r_\lambda} L(u)\,(r_\lambda - u)\,du. \qquad (2.10)$$

Then, we have:

$$r_\lambda = \begin{cases} R & \text{if } \int_0^R L(u)\,du < 1 - \lambda, \\ \bar t_\lambda & \text{if } \int_0^R L(u)\,du \ge 1 - \lambda, \end{cases} \qquad (2.11)$$

where \bar t_λ ∈ [0, R] is such that ∫_0^{\bar t_λ} L(u) du = 1 − λ, and is guaranteed to exist, since ∫_0^R L(u) du ≥ 1 − λ in this case. We also get

$$\begin{cases} b_\lambda \ge \int_0^{r_\lambda} L(u)\,u\,du & \text{if } \int_0^R L(u)\,du < 1 - \lambda, \\[4pt] b_\lambda = \int_0^{r_\lambda} L(u)\,u\,du & \text{if } \int_0^R L(u)\,du \ge 1 - \lambda. \end{cases} \qquad (2.12)$$

Let us define the scalar sequence {s_{λ,n}} by

$$s_{\lambda,0} = 0, \quad s_{\lambda,n+1} = s_{\lambda,n} - \frac{h_\lambda(s_{\lambda,n})}{g'(s_{\lambda,n})} \quad (n \ge 0). \qquad (2.13)$$

Note that if equality holds in (2.1), then

$$g(t) = h_0(t), \quad t \in [0, R], \qquad (2.14)$$

and {s_{λ,n}} reduces to the sequence {t_{λ,n}} introduced in [31] and given by

$$t_{\lambda,0} = 0, \quad t_{\lambda,n+1} = t_{\lambda,n} - \frac{h_\lambda(t_{\lambda,n})}{h_0'(t_{\lambda,n})} \quad (n \ge 0). \qquad (2.15)$$

We shall show in Lemma 2.2 that, under the same hypothesis (see (2.16)), the scalar sequence {s_{λ,n}} is at least as tight as {t_{λ,n}}.
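For concrete data, both recursions are straightforward to evaluate numerically. The following sketch (ours, not part of the paper; it assumes SciPy is available, and all names are hypothetical) computes the first terms of {s_{λ,n}} and {t_{λ,n}} by quadrature from (2.3), (2.4), and (2.6).

```python
# A minimal numerical sketch of the majorizing iterations (2.13) and (2.15),
# assuming nondecreasing functions L, L0 supplied by the caller.
from scipy.integrate import quad

def h_lam(t, beta, lam, L):
    # h_lambda(t) = beta - (1 - lam) t + int_0^t L(u) (t - u) du      (2.3)
    return beta - (1.0 - lam) * t + quad(lambda u: L(u) * (t - u), 0.0, t)[0]

def g_prime(t, L0):
    # g'(t) = -1 + int_0^t L0(u) du                                   (2.4)
    return -1.0 + quad(L0, 0.0, t)[0]

def h0_prime(t, L):
    # h_0'(t) = -1 + int_0^t L(u) du  (case lam = 0 of (2.6))
    return -1.0 + quad(L, 0.0, t)[0]

def majorizing(beta, lam, L, L0, n_steps=8):
    """First n_steps terms of {s_{lam,n}} (2.13) and {t_{lam,n}} (2.15)."""
    s = t = 0.0
    ss, ts = [s], [t]
    for _ in range(n_steps):
        s = s - h_lam(s, beta, lam, L) / g_prime(s, L0)   # (2.13)
        t = t - h_lam(t, beta, lam, L) / h0_prime(t, L)   # (2.15)
        ss.append(s); ts.append(t)
    return ss, ts

# Constant L0 <= L (Kantorovich case), with lam = 0:
ss, ts = majorizing(beta=0.1, lam=0.0, L=lambda u: 2.6, L0=lambda u: 2.3)
print(ss)
print(ts)
```

Both sequences start with s_{λ,1} = t_{λ,1} = β, since g'(0) = h_0'(0) = −1; they differ from the second step on, according to how much L_0 undercuts L.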
But first, we need a crucial result on majorizing sequences for (GNM).

Lemma 2.1 Assume:

$$\beta \le b_\lambda. \qquad (2.16)$$

Then, the following hold:

(i) The function h_λ is strictly decreasing on [0, r_λ], and has exactly one zero t_λ* ∈ [0, r_λ], such that

$$\beta < t_\lambda^*. \qquad (2.17)$$

(ii) The sequence {s_{λ,n}} given by (2.13) is strictly increasing, and converges to t_λ*.

Proof (i) This part follows immediately from (2.3), (2.6), and (2.10).

(ii) We shall show this part using induction on n. It follows from (2.13) and (2.17) that 0 = s_{λ,0} < s_{λ,1} = β < t_λ*. Assume

$$s_{\lambda,k-1} < s_{\lambda,k} < t_\lambda^* \quad \text{for all } k \le n. \qquad (2.18)$$

In view of (2.5), −g' is decreasing on [0, R], and so, by (2.8), (2.18), and the definition of r_λ, we have

$$-g'(s_{\lambda,k}) > -g'(t_\lambda^*) \ge -g'(r_\lambda) \ge -h_\lambda'(r_\lambda) + \lambda \ge 0. \qquad (2.19)$$

We also have h_λ(s_{λ,k}) > 0 by (i). It then follows from (2.13) that

$$s_{\lambda,k+1} = s_{\lambda,k} - \frac{h_\lambda(s_{\lambda,k})}{g'(s_{\lambda,k})} > s_{\lambda,k}. \qquad (2.20)$$

Let us define p_λ on [0, t_λ*] by

$$p_\lambda(t) = t - \frac{h_\lambda(t)}{g'(t)}. \qquad (2.21)$$

We have

$$g'(t) < 0 \quad \text{for } t \in [0, t_\lambda^*], \qquad (2.22)$$

except if λ = 0 and t = t_0* = r_0. As in [31], we set by convention

$$\frac{h_\lambda(t_\lambda^*)}{g'(t_\lambda^*)} = \lim_{t \to t_\lambda^{*-}} \frac{h_\lambda(t)}{g'(t)} = 0 \qquad (2.23)$$

by L'Hôpital's rule. Therefore, the function p_λ is well defined and continuous on [0, t_λ*]. It then follows from part (i), (2.7), and (2.22) that

$$p_\lambda'(t) = 1 - \frac{h_\lambda'(t)\,g'(t) - h_\lambda(t)\,g''(t)}{(g'(t))^2} = \frac{\Big(\lambda + \int_0^t (L(u) - L_0(u))\,du\Big)\,(-g'(t)) + h_\lambda(t)\,L_0(t)}{(g'(t))^2} > 0 \qquad (2.24)$$

for a.e. t ∈ [0, t_λ*). Using (2.18), (2.20), and (2.24), we get

$$s_{\lambda,k} < s_{\lambda,k+1} = p_\lambda(s_{\lambda,k}) < p_\lambda(t_\lambda^*) = t_\lambda^*, \qquad (2.25)$$

which completes the induction. Hence, {s_{λ,n}} is increasing, bounded above by t_λ*, and as such it converges to its unique least upper bound s* ∈ (0, t_λ*], with h_λ(s*) = 0. Using part (i), we get s* = t_λ*. That completes the proof of Lemma 2.1.

Next, we compare the sequence {s_{λ,n}} with {t_{λ,n}}.

Lemma 2.2 Assume that condition (2.16) holds. Then, the following hold for n ≥ 0:

$$s_{\lambda,n} \le t_{\lambda,n}, \qquad (2.26)$$

and

$$s_{\lambda,n+1} - s_{\lambda,n} \le t_{\lambda,n+1} - t_{\lambda,n}. \qquad (2.27)$$

Moreover, if L_0(t) < L(t) for t ∈ [0, t_λ*], then (2.26) and (2.27) hold as strict inequalities for n ≥ 2 and n ≥ 1, respectively.

Proof It was shown in [31] that, under hypothesis (2.16), assertions (i) and (ii) of Lemma 2.1 hold with {t_{λ,n}} replacing {s_{λ,n}}. We shall show (2.26) and (2.27) using induction. It follows from (2.1), (2.13), and (2.15) that s_{λ,0} = t_{λ,0}, s_{λ,1} = t_{λ,1} = β,

$$s_{\lambda,1} = s_{\lambda,0} - \frac{h_\lambda(s_{\lambda,0})}{g'(s_{\lambda,0})} \le t_{\lambda,0} - \frac{h_\lambda(t_{\lambda,0})}{h_0'(t_{\lambda,0})} = t_{\lambda,1}, \qquad (2.28)$$

and

$$s_{\lambda,1} - s_{\lambda,0} = -\frac{h_\lambda(s_{\lambda,0})}{g'(s_{\lambda,0})} \le -\frac{h_\lambda(t_{\lambda,0})}{h_0'(t_{\lambda,0})} = t_{\lambda,1} - t_{\lambda,0}. \qquad (2.29)$$

Hence, (2.26) and (2.27) hold for n = 0 and n = 1. Let us assume that (2.26) and (2.27) hold for all k ≤ n. Then, we have in turn:

$$s_{\lambda,k+1} = s_{\lambda,k} - \frac{h_\lambda(s_{\lambda,k})}{g'(s_{\lambda,k})} \le t_{\lambda,k} - \frac{h_\lambda(t_{\lambda,k})}{h_0'(t_{\lambda,k})} = t_{\lambda,k+1}, \qquad (2.30)$$

and

$$s_{\lambda,k+1} - s_{\lambda,k} = -\frac{h_\lambda(s_{\lambda,k})}{g'(s_{\lambda,k})} \le -\frac{h_\lambda(t_{\lambda,k})}{h_0'(t_{\lambda,k})} = t_{\lambda,k+1} - t_{\lambda,k}, \qquad (2.31)$$

which completes the induction for (2.26) and (2.27). Moreover, if L_0(t) < L(t) for t ∈ [0, t_λ*], then the inequalities in (2.30) and (2.31) hold as strict inequalities. That completes the proof of Lemma 2.2.

It is convenient for us to provide some well known definitions and properties of the Moore–Penrose inverse [4, 12].

Definition 2.3 Let M be an l × m matrix. The m × l matrix M^+ is called the Moore–Penrose inverse of M if the following four axioms hold:

$$(M^+ M)^T = M^+ M, \quad (M M^+)^T = M M^+, \quad M^+ M M^+ = M^+, \quad M M^+ M = M, \qquad (2.32)$$

where M^T is the transpose of M. In the case of a full rank l × m matrix M, with rank M = m, the Moore–Penrose inverse is given by M^+ = (M^T M)^{-1} M^T. Denote by Ker M and Im M the kernel and image of M, respectively, and by Π_E the projection of R^m onto a subspace E. We then have

$$M^+ M = \Pi_{(\mathrm{Ker}\,M)^\perp} \quad \text{and} \quad M M^+ = \Pi_{\mathrm{Im}\,M}. \qquad (2.33)$$

Note that if M is surjective, or, equivalently, if M has full row rank, then

$$M M^+ = I_{R^l}. \qquad (2.34)$$
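Since the analysis below manipulates Moore–Penrose inverses throughout, it may help to see axioms (2.32) and identity (2.34) verified numerically. This is a minimal check (ours, not the paper's), using numpy.linalg.pinv.

```python
# Sanity check of the Moore-Penrose axioms (2.32) and identity (2.34).
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((2, 3))           # l x m with l = 2, m = 3; full row rank
Mp = np.linalg.pinv(M)                    # the Moore-Penrose inverse M^+

assert np.allclose(M @ Mp @ M, M)         # M M^+ M = M
assert np.allclose(Mp @ M @ Mp, Mp)       # M^+ M M^+ = M^+
assert np.allclose((M @ Mp).T, M @ Mp)    # (M M^+)^T = M M^+
assert np.allclose((Mp @ M).T, Mp @ M)    # (M^+ M)^T = M^+ M
# M is surjective (full row rank), so M M^+ = I_{R^l}     (2.34):
assert np.allclose(M @ Mp, np.eye(2))
```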
We also state a result providing a perturbation upper bound for Moore–Penrose inverses [4, 50].

Lemma 2.4 Let M and N be two l × m matrices with rank N ≥ 1. Assume that

$$1 \le \mathrm{rank}\,M \le \mathrm{rank}\,N, \qquad (2.35)$$

and

$$\|N^+\|\,\|M - N\| < 1. \qquad (2.36)$$

Then, the following hold:

$$\mathrm{rank}\,M = \mathrm{rank}\,N, \qquad (2.37)$$

and

$$\|M^+\| \le \frac{\|N^+\|}{1 - \|N^+\|\,\|M - N\|}. \qquad (2.38)$$

3 Semilocal convergence I of (GNM)

We shall provide a semilocal convergence analysis for (GNM) using majorizing sequences under the Kantorovich approach. Let U(x, r) denote the open ball in R^m with center x and radius r > 0, and let Ū(x, r) denote its closure. For the remainder of this study, we assume that F : R^m → R^l is continuously Fréchet–differentiable, that

$$\|F'(y)^+ (I - F'(x)\,F'(x)^+)\,F(x)\| \le \kappa\,\|x - y\| \quad \text{for all } x, y \in D, \qquad (3.1)$$

where κ ∈ [0, 1) and I is the identity matrix, that

$$F'(x_0) \ne 0 \quad \text{for some } x_0 \in D, \qquad (3.2)$$

and that

$$\mathrm{rank}(F'(x)) \le \mathrm{rank}(F'(x_0)) \quad \text{for all } x \in D. \qquad (3.3)$$

We need the definition of the modified L–average Lipschitz condition on U(x_0, r).

Definition 3.1 [31] Let r > 0 be such that U(x_0, r) ⊆ D. The mapping F satisfies the modified L–average Lipschitz condition on U(x_0, r) if, for any x, y ∈ U(x_0, r) with ‖x − x_0‖ + ‖y − x‖ < r,

$$\|F'(x_0)^+\|\,\|F'(y) - F'(x)\| \le \int_{\|x - x_0\|}^{\|x - x_0\| + \|y - x\|} L(u)\,du. \qquad (3.4)$$

Condition (3.4) was used in [31] as an alternative to the L–average Lipschitz condition [26]

$$\|F'(x_0)^+ (F'(y) - F'(x))\| \le \int_{\|x - x_0\|}^{\|x - x_0\| + \|y - x\|} L(u)\,du, \qquad (3.5)$$

which is a modification of Wang's condition [41, 42]. Condition (3.4) fits the case when F'(x_0) is not surjective [31].

We also introduce the following condition.

Definition 3.2 Let r > 0 be such that U(x_0, r) ⊆ D. The mapping F satisfies the modified center L_0–average Lipschitz condition on U(x_0, r) if, for any x ∈ U(x_0, r) with ‖x − x_0‖ < r,

$$\|F'(x_0)^+\|\,\|F'(x) - F'(x_0)\| \le \int_0^{\|x - x_0\|} L_0(u)\,du. \qquad (3.6)$$

If (3.4) holds, then so do (3.6) and (2.1); therefore, (3.6) is not an additional hypothesis. We also note that L/L_0 can be arbitrarily large [1–7].

We shall use the following perturbation result on Moore–Penrose inverses.

Lemma 3.3 Let

$$\tilde r_0 = \sup\Big\{r \in (0, R) : \int_0^r L_0(u)\,du \le 1\Big\}.$$

Suppose that 0 ≤ r ≤ \tilde r_0 satisfies U(x_0, r) ⊆ D, and that F satisfies (3.6) on U(x_0, r). Then, for each x ∈ U(x_0, r), rank(F'(x)) = rank(F'(x_0)), and

$$\|F'(x)^+\| \le -g'(\|x - x_0\|)^{-1}\,\|F'(x_0)^+\|. \qquad (3.7)$$

Remark 3.4 If equality holds in (2.1), then Lemma 3.3 reduces to Lemma 3.2 in [31]. Otherwise (i.e., if strict inequality holds in (2.1)), (3.7) is a more precise estimate than

$$\|F'(x)^+\| \le -h_0'(\|x - x_0\|)^{-1}\,\|F'(x_0)^+\|, \qquad (3.8)$$

given in [31], since

$$-g'(r)^{-1} < -h_0'(r)^{-1}, \quad r \in [0, R]. \qquad (3.9)$$

We also have

$$r_0 \le \tilde r_0, \qquad (3.10)$$

and

$$\tilde\lambda = \kappa\,\Big(1 - \int_0^\beta L_0(u)\,du\Big)^{-1} \le \lambda_0 = \kappa\,\Big(1 - \int_0^\beta L(u)\,du\Big)^{-1}, \qquad (3.11)$$

where

$$\beta = \|F'(x_0)^+ F(x_0)\|. \qquad (3.12)$$

We also state the following result.

Lemma 3.5 [26, 31] Let 0 ≤ c < R. Define

$$\chi(t) = \frac{1}{t}\int_0^t L(c + u)\,(t - u)\,du, \quad 0 \le t < R - c.$$

Then, the function χ is increasing on [0, R − c).
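Before stating the semilocal theorem, we note that (GNM) itself is easy to realize numerically. The sketch below (ours; not the paper's code, and the names are hypothetical) implements iteration (1.2), with the Moore–Penrose inverse supplied by numpy.linalg.pinv.

```python
# A bare-bones sketch of the Gauss-Newton method (1.2) for F: R^m -> R^l.
import numpy as np

def gauss_newton(F, J, x0, max_iter=20, tol=1e-12):
    """Iterate x_{n+1} = x_n - F'(x_n)^+ F(x_n); J(x) is the l x m Jacobian."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.pinv(J(x)) @ F(x)
        x = x - step
        if np.linalg.norm(step) <= tol:   # stop once ||x_{n+1} - x_n|| is small
            break
    return x

# Overdetermined toy problem (l = 2, m = 1): F(x) = (x - 1, 2(x - 1))^T.
F = lambda x: np.array([x[0] - 1.0, 2.0 * (x[0] - 1.0)])
J = lambda x: np.array([[1.0], [2.0]])
print(gauss_newton(F, J, [3.0]))          # converges to x* = 1
```

In this toy problem F'(x)^+ is the pseudoinverse of a 2 × 1 Jacobian, and the iteration converges to the zero x* = 1 of F'(·)^+ F(·).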
We can show the following semilocal convergence result for (GNM). Our approach differs from the corresponding result [31, Theorem 3.1, p. 273], since we use (3.7) instead of (3.8).

Theorem 3.6 Let λ ≥ λ̃. Assume:

$$\beta \le b_\lambda \quad \text{and} \quad U(x_0, t_\lambda^*) \subseteq D; \qquad (3.13)$$

F satisfies (3.4) and (3.6) on U(x_0, t_λ*). Then, the sequence {x_n} generated by (GNM) is well defined, remains in U(x_0, t_λ*) for all n ≥ 0, and converges to a zero x* of F'(·)^+ F(·) in Ū(x_0, t_λ*). Moreover, the following estimates hold:

$$\|x_{n+1} - x_n\| \le s_{\lambda,n+1} - s_{\lambda,n}, \qquad (3.14)$$

and

$$\|x_n - x^*\| \le t_\lambda^* - s_{\lambda,n}. \qquad (3.15)$$

Proof We first use mathematical induction to prove that

$$\|x_n - x_{n-1}\| \le s_{\lambda,n} - s_{\lambda,n-1} \qquad (3.16)$$

holds for each n ≥ 1. We first have ‖x_1 − x_0‖ = ‖F'(x_0)^+ F(x_0)‖ = β ≤ s_{λ,1} − s_{λ,0}, so (3.16) holds for n = 1. Assume that (3.16) holds for all n ≤ k. Then

$$\|x_k - x_{k-1}\| \le s_{\lambda,k} - s_{\lambda,k-1}. \qquad (3.17)$$

For θ ∈ [0, 1], we denote x_k^θ = x_{k-1} + θ(x_k − x_{k-1}) and s_{λ,k}^θ = s_{λ,k-1} + θ(s_{λ,k} − s_{λ,k-1}). Then, for all θ ∈ [0, 1],

$$\|x_k^\theta - x_0\| \le \|x_k^\theta - x_{k-1}\| + \sum_{i=1}^{k-1} \|x_i - x_{i-1}\| \le s_{\lambda,k}^\theta < t_\lambda^* \le r_\lambda \le r_0. \qquad (3.18)$$

In particular,

$$\|x_{k-1} - x_0\| \le s_{\lambda,k-1} \quad \text{and} \quad \|x_k - x_0\| \le s_{\lambda,k}. \qquad (3.19)$$

4 Local convergence of (GNM)

Corollary 4.6 Suppose that F satisfies (3.4) in U(x*, r_0). Let x_0 ∈ U(x*, b_κ/(2 − κ)). Then {x_n} generated by (GNM) starting at x_0 converges to a zero of F'(·)^+ F(·).

Corollary 4.7 Suppose that F satisfies (3.4) in U(x*, r_0), and that condition (3.31) holds. Let x_0 ∈ U(x*, r_0*), where r_0* is the unique zero of the function φ_0 : [0, r_0] → R defined by

$$\varphi_0(t) = b - t + \int_0^t L(u)\,(t - u)\,du.$$

Then {x_n} generated by (GNM) starting at x_0 converges to a zero of F'(·)^+ F(·).

Corollary 4.8 Suppose that F is surjective, and satisfies (3.4) in U(x*, r_0). Let x_0 ∈ U(x*, r_0*), where r_0* is given in Corollary 4.7. Then {x_n} generated by (GNM) starting at x_0 converges to a solution of F(x) = 0.

Remark 4.9 The local results obtained can also be used to solve equations of the form F(x) = 0, where F satisfies the autonomous differential equation [4]:

$$F'(x) = T(F(x)), \qquad (4.7)$$

where T : Y → X is a known continuous operator. Since F'(x*) = T(F(x*)) = T(0), we can apply our results without actually knowing the solution x* of the equation F(x) = 0. As an example, let X = Y = (−∞, +∞), D = U(0, 1), and define the function F on D by

$$F(x) = e^x - 1. \qquad (4.8)$$

Then, for x* = 0, we can set T(x) = x + 1 in (4.7).

5 Semilocal convergence II of (GNM)

We shall provide a semilocal convergence analysis for (GNM) using our new concept of recurrent functions. This idea has already produced a finer convergence analysis for iterative methods using invertible operators or outer or generalized inverses [5–7].

We need to define some parameters, sequences, and functions.

Definition 5.1 Let x_0 ∈ D, κ ∈ [0, 1), and λ ∈ [0, 1). Let β be given by (3.12). Define the iteration {v_{λ,n}}, the functions f_{λ,n}, ε_{λ,n}, μ_{λ,n} on [0, 1), and ξ_λ on I_{ξ_λ} = [0, 1]^2 × [1, 1/(1 − d_λ)]^2 (d_λ ∈ [0, 1)) by:

$$v_{\lambda,0} = 0, \quad v_{\lambda,1} = \beta, \quad v_{\lambda,n+1} = v_{\lambda,n} + \frac{\delta_{\lambda,n} + \kappa}{1 - \bar\delta_{\lambda,n}}\,(v_{\lambda,n} - v_{\lambda,n-1}) \quad (n \ge 1), \qquad (5.1)$$

$$f_{\lambda,n}(d_\lambda) = \int_0^1 \int_{w_{\lambda,n-1}(d_\lambda)}^{z_{\lambda,n-1}(d_\lambda)} L(u)\,du\,d\theta + d_\lambda \int_0^{w_{\lambda,n}(d_\lambda)} L_0(u)\,du + c_\lambda, \qquad (5.2)$$

$$\varepsilon_{\lambda,n}(d_\lambda) = \int_0^1 \int_{w_{\lambda,n}(d_\lambda)}^{z_{\lambda,n}(d_\lambda)} L(u)\,du\,d\theta - \int_0^1 \int_{w_{\lambda,n-1}(d_\lambda)}^{z_{\lambda,n-1}(d_\lambda)} L(u)\,du\,d\theta + d_\lambda \int_{w_{\lambda,n}(d_\lambda)}^{w_{\lambda,n+1}(d_\lambda)} L_0(u)\,du, \qquad (5.3)$$

$$\mu_{\lambda,n}(d_\lambda) = \int_0^1 \int_{w_{\lambda,n+1}(d_\lambda)}^{z_{\lambda,n+1}(d_\lambda)} L(u)\,du\,d\theta - 2\int_0^1 \int_{w_{\lambda,n}(d_\lambda)}^{z_{\lambda,n}(d_\lambda)} L(u)\,du\,d\theta + \int_0^1 \int_{w_{\lambda,n-1}(d_\lambda)}^{z_{\lambda,n-1}(d_\lambda)} L(u)\,du\,d\theta + d_\lambda \Big(\int_{w_{\lambda,n+1}(d_\lambda)}^{w_{\lambda,n+2}(d_\lambda)} L_0(u)\,du - \int_{w_{\lambda,n}(d_\lambda)}^{w_{\lambda,n+1}(d_\lambda)} L_0(u)\,du\Big), \qquad (5.4)$$

$$\xi_\lambda(\theta, d_\lambda, a_\lambda, e_\lambda) = \int_0^1 \int_{(a_\lambda + e_\lambda + e_\lambda d_\lambda)\beta}^{(a_\lambda + e_\lambda + e_\lambda d_\lambda + \theta e_\lambda d_\lambda^2)\beta} L(u)\,du\,d\theta - 2\int_0^1 \int_{(a_\lambda + e_\lambda)\beta}^{(a_\lambda + e_\lambda + \theta e_\lambda d_\lambda)\beta} L(u)\,du\,d\theta + \int_0^1 \int_{a_\lambda\beta}^{(a_\lambda + \theta e_\lambda)\beta} L(u)\,du\,d\theta + d_\lambda \Big(\int_{(a_\lambda + e_\lambda + e_\lambda d_\lambda)\beta}^{(a_\lambda + e_\lambda + e_\lambda d_\lambda + e_\lambda d_\lambda^2)\beta} L_0(u)\,du - \int_{(a_\lambda + e_\lambda)\beta}^{(a_\lambda + e_\lambda + e_\lambda d_\lambda)\beta} L_0(u)\,du\Big), \qquad (5.5)$$

where

$$\delta_{\lambda,n} = \int_0^1 \int_{v_{\lambda,n-1}}^{v_{\lambda,n-1} + \theta\,(v_{\lambda,n} - v_{\lambda,n-1})} L(u)\,du\,d\theta, \qquad (5.6)$$

$$\bar\delta_{\lambda,n} = \int_0^{v_{\lambda,n}} L_0(u)\,du, \qquad (5.7)$$

$$z_{\lambda,n}(d_\lambda) = \Big(\frac{1 - d_\lambda^n}{1 - d_\lambda} + \theta\,d_\lambda^n\Big)\,\beta, \qquad (5.8)$$

$$w_{\lambda,n}(d_\lambda) = \frac{1 - d_\lambda^n}{1 - d_\lambda}\,\beta, \qquad (5.9)$$

$$c_\lambda = \kappa - d_\lambda. \qquad (5.10)$$

Define the function f_{λ,∞} on [0, 1) by

$$f_{\lambda,\infty}(d_\lambda) = \lim_{n \to \infty} f_{\lambda,n}(d_\lambda). \qquad (5.11)$$
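Iteration (5.1) is computable once L, L_0, β, and κ are fixed; the double integral (5.6) can be approximated by quadrature. The following sketch (ours; SciPy assumed, names hypothetical) generates {v_{λ,n}}.

```python
# A numerical sketch of iteration (5.1) from Definition 5.1; the inner
# double integral delta_{lam,n} of (5.6) is evaluated with SciPy quadrature.
from scipy.integrate import quad

def v_sequence(beta, kappa, L, L0, n_steps=10):
    v_prev, v = 0.0, beta                  # v_{lam,0} = 0, v_{lam,1} = beta
    vs = [v_prev, v]
    for _ in range(1, n_steps):
        # (5.6): int_0^1 int_{v_{n-1}}^{v_{n-1} + th (v_n - v_{n-1})} L(u) du dth
        delta = quad(lambda th: quad(L, v_prev, v_prev + th * (v - v_prev))[0],
                     0.0, 1.0)[0]
        delta_bar = quad(L0, 0.0, v)[0]    # (5.7)
        v_prev, v = v, v + (delta + kappa) / (1.0 - delta_bar) * (v - v_prev)
        vs.append(v)
    return vs

# Constant L = 2.6, L0 = 2.3 with kappa = 0, beta = .1:
print(v_sequence(0.1, 0.0, lambda u: 2.6, lambda u: 2.3))
```

For constant L and L_0 with κ = 0, the recursion collapses to v_{n+1} = v_n + L(v_n − v_{n−1})²/(2(1 − L_0 v_n)), which is exactly the Newton–case sequence used in Section 6.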
Remark 5.2 Using (5.2) and (5.11), we get

$$f_{\lambda,\infty}(d_\lambda) = d_\lambda \int_0^{\beta/(1 - d_\lambda)} L_0(u)\,du + c_\lambda. \qquad (5.12)$$

It then follows from (5.2)–(5.10) that the following identities hold:

$$f_{\lambda,n+1}(d_\lambda) = f_{\lambda,n}(d_\lambda) + \varepsilon_{\lambda,n}(d_\lambda), \qquad (5.13)$$

$$\varepsilon_{\lambda,n+1}(d_\lambda) = \varepsilon_{\lambda,n}(d_\lambda) + \mu_{\lambda,n}(d_\lambda), \qquad (5.14)$$

and

$$\mu_{\lambda,n}(d_\lambda) = \xi_\lambda(\theta, d_\lambda, a_\lambda, e_\lambda), \quad a_\lambda = 1 + d_\lambda + \cdots + d_\lambda^{n-2}, \quad e_\lambda = d_\lambda^{n-1}. \qquad (5.15)$$

We need the following result on majorizing sequences for (GNM).

Lemma 5.3 Let the parameters β, κ, λ, the iteration {v_{λ,n}}, and the functions f_{λ,n}, ε_{λ,n}, μ_{λ,n}, and ξ_λ be as in Definition 5.1. Assume there exists α_λ ∈ (0, 1) such that:

$$\int_0^1 \int_0^{\theta\beta} L(u)\,du\,d\theta + \kappa \le \alpha_\lambda \Big(1 - \int_0^\beta L_0(u)\,du\Big), \qquad (5.16)$$

$$c_\lambda = \kappa - \alpha_\lambda < 0, \qquad (5.17)$$

$$\xi_\lambda(\theta, q_1, q_2, q_3) \ge 0 \quad \text{on } I_{\xi_\lambda}, \qquad (5.18)$$

$$\varepsilon_{\lambda,1}(\alpha_\lambda) \ge 0, \qquad (5.19)$$

and

$$f_{\lambda,\infty}(\alpha_\lambda) \le 0. \qquad (5.20)$$

Then, the iteration {v_{λ,n}} given by (5.1) is non–decreasing, bounded from above by

$$\bar v_\lambda = \frac{\beta}{1 - \alpha_\lambda}, \qquad (5.21)$$

and converges to its unique least upper bound v_λ* such that

$$v_\lambda^* \in [0, \bar v_\lambda]. \qquad (5.22)$$

Moreover, the following estimates hold for all n ≥ 0:

$$0 \le v_{\lambda,n+1} - v_{\lambda,n} \le \alpha_\lambda\,(v_{\lambda,n} - v_{\lambda,n-1}) \le \alpha_\lambda^n\,\beta, \qquad (5.23)$$

and

$$0 \le v_\lambda^* - v_{\lambda,n} \le \frac{\alpha_\lambda^n\,\beta}{1 - \alpha_\lambda}. \qquad (5.24)$$

Proof Estimate (5.23) is true if

$$\delta_{\lambda,n} + \kappa \le \alpha_\lambda\,(1 - \bar\delta_{\lambda,n}) \qquad (5.25)$$

holds for all n ≥ 1. It follows from (5.1), (5.16), and (5.17) that estimate (5.25) holds for n = 1. Then, we also have that (5.23) holds for n = 1, and

$$v_{\lambda,n} \le \frac{1 - \alpha_\lambda^n}{1 - \alpha_\lambda}\,\beta < \bar v_\lambda. \qquad (5.26)$$

Using the induction hypotheses and (5.26), estimate (5.25) is true if

$$\delta_{\lambda,k} + \alpha_\lambda\,\bar\delta_{\lambda,k} + c_\lambda \le 0 \qquad (5.27)$$

or

$$\int_0^1 \int_{w_{\lambda,k-1}(\alpha_\lambda)}^{z_{\lambda,k-1}(\alpha_\lambda)} L(u)\,du\,d\theta + \alpha_\lambda \int_0^{w_{\lambda,k}(\alpha_\lambda)} L_0(u)\,du + c_\lambda \le 0 \qquad (5.28)$$

hold for all k ≤ n. Estimate (5.28) (for d_λ = α_λ) motivates us to introduce the function f_{λ,k} given by (5.2), and to show, instead of (5.28),

$$f_{\lambda,k}(\alpha_\lambda) \le 0. \qquad (5.29)$$

We have, by (5.13)–(5.15) (for d_λ = α_λ) and (5.19), that

$$f_{\lambda,k+1}(\alpha_\lambda) \ge f_{\lambda,k}(\alpha_\lambda). \qquad (5.30)$$

In view of (5.11), (5.12), and (5.30), estimate (5.29) shall hold if (5.20) is true, since

$$f_{\lambda,k}(\alpha_\lambda) \le f_{\lambda,\infty}(\alpha_\lambda), \qquad (5.31)$$

and the induction is completed. It follows from (5.23) and (5.26) that the iteration {v_{λ,n}} is non–decreasing, bounded from above by \bar v_λ given by (5.21), and as such it converges to v_λ*. Finally, estimate (5.24) follows from (5.23) by using the standard majorization techniques [4–6]. That completes the proof of Lemma 5.3.
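In the Kantorovich case L(u) ≡ L, L_0(u) ≡ L_0, hypotheses (5.16), (5.17), (5.19), and (5.20) reduce to scalar inequalities that can be tested directly; one natural candidate for α_λ is 2L/(L + √(L² + 8L_0 L)), that is, δ/2 with δ as in Lemma 6.5 below. The sketch below is ours: the closed form used for ε_{λ,1} follows from (5.3) with constant L, L_0, and the sign condition (5.18) on ξ_λ is not tested here.

```python
# Checking hypotheses (5.16), (5.17), (5.19), (5.20) of Lemma 5.3 for
# constant L, L0, with the candidate alpha = 2L/(L + sqrt(L^2 + 8 L0 L)).
from math import sqrt

def check_lemma_5_3(beta, kappa, L, L0):
    a = 2.0 * L / (L + sqrt(L * L + 8.0 * L0 * L))   # candidate alpha_lam
    c = kappa - a                                     # c_lam, see (5.10)
    cond_16 = L * beta / 2.0 + kappa <= a * (1.0 - L0 * beta)           # (5.16)
    cond_17 = c < 0.0                                                    # (5.17)
    # constant case: eps_{lam,1}(a) = beta * (L0 a^2 + (a - 1) L / 2)    # (5.3)
    cond_19 = beta * (L0 * a * a + (a - 1.0) * L / 2.0) >= -1e-15        # (5.19)
    cond_20 = a * L0 * beta / (1.0 - a) + c <= 0.0                       # (5.20)
    return a, all([cond_16, cond_17, cond_19, cond_20])

print(check_lemma_5_3(beta=0.1, kappa=0.0, L=2.6, L0=2.3))  # (0.5205..., True)
```

This candidate α makes ε_{λ,1}(α) vanish exactly in the constant case, which is why a small tolerance is allowed in the test of (5.19).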
We can show the following semilocal convergence result for (GNM) using recurrent functions, which is the analog of Theorem 3.6.

Theorem 5.4 Let λ ≥ λ̃. Assume:

$$U(x_0, v_\lambda^*) \subseteq D; \qquad (5.32)$$

F satisfies (3.4) and (3.6) on U(x_0, v_λ*); and the hypotheses of Lemma 5.3 hold. Then, the sequence {x_n} generated by (GNM) is well defined, remains in U(x_0, v_λ*) for all n ≥ 0, and converges to a zero x* of F'(·)^+ F(·) in Ū(x_0, v_λ*). Moreover, the following estimates hold:

$$\|x_{n+1} - x_n\| \le v_{\lambda,n+1} - v_{\lambda,n}, \qquad (5.33)$$

and

$$\|x_n - x^*\| \le v_\lambda^* - v_{\lambda,n}. \qquad (5.34)$$

Proof As in Theorem 3.6, we arrive at the estimate given immediately above (3.24) (with v_{λ,k} replacing s_{λ,k}), which, in view of (5.1), leads to

$$\|x_{k+1} - x_k\| \le v_{\lambda,k+1} - v_{\lambda,k}. \qquad (5.35)$$

Estimates (3.18), (3.19), (5.35), and Lemma 5.3 imply that the sequence {x_k} is a complete sequence in R^m, and as such it converges to some x* ∈ Ū(x_0, v_λ*) (since Ū(x_0, v_λ*) is a closed set). That completes the proof of Theorem 5.4.

Remark 5.5 (a) The point \bar v_λ, given in closed form by (5.21), can replace v_λ* in hypothesis (5.32).

(b) The hypotheses of Lemma 5.3 involve only computations at the initial data. These hypotheses differ from (3.13) given in Theorem 3.6. In practice, we shall test to see which of the two sets of conditions is satisfied, if any. If both are satisfied, we shall use the more precise error bounds given in Theorem 5.4 (see also Section 4). In Section 6, we show that the conditions of Theorem 5.4 can be weaker than those of Theorem 3.6.

6 Applications

We compare the Kantorovich–type conditions introduced in Section 3 with the corresponding ones in Section 5.

6.1 Semilocal case

An operator Q : R^m → R^l is said to be Lipschitz continuous on D_0 ⊆ D with modulus L > 0 if

$$\|Q(x) - Q(y)\| \le L\,\|x - y\| \quad \text{for all } x, y \in D_0, \qquad (6.1)$$

and center–Lipschitz continuous on D_0 with modulus L_0 > 0 if

$$\|Q(x) - Q(x_0)\| \le L_0\,\|x - x_0\| \quad \text{for all } x \in D_0. \qquad (6.2)$$

Let x_0 ∈ D and r > 0 be such that U(x_0, r) ⊆ D. Clearly, if F'(x_0)^+ F' is Lipschitz continuous on U(x_0, r) with modulus L, then F satisfies the modified L–average Lipschitz condition on U(x_0, r). Similarly, if F'(x_0)^+ F' is center–Lipschitz continuous on U(x_0, r) with modulus L_0, then F satisfies the modified center L_0–average Lipschitz condition on U(x_0, r).

Using (2.2), (2.3), (2.9), and (2.10), we get, for t ≥ 0:

$$g(t) = \beta - t + \frac{L_0}{2}\,t^2, \qquad (6.3)$$

$$h_\lambda(t) = \beta - (1 - \lambda)\,t + \frac{L}{2}\,t^2, \qquad (6.4)$$

$$r_\lambda = \frac{1 - \lambda}{L}, \qquad (6.5)$$

and

$$b_\lambda = \frac{(1 - \lambda)^2}{2L}. \qquad (6.6)$$

Moreover, if

$$\beta \le b_\lambda, \qquad (6.7)$$

then

$$t_\lambda^* = \frac{1 - \lambda - \sqrt{(1 - \lambda)^2 - 2\,\beta\,L}}{L}. \qquad (6.8)$$
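For constant L the quantities (6.5)–(6.8) are explicit, and the following small sketch (ours; names hypothetical) evaluates them.

```python
# Evaluating r_lam, b_lam of (6.5)-(6.6) and the zero t_lam of h_lam, (6.8).
from math import sqrt

def quadratic_case(beta, lam, L):
    r = (1.0 - lam) / L                                  # (6.5)
    b = (1.0 - lam) ** 2 / (2.0 * L)                     # (6.6)
    t = None
    if beta <= b:                                        # condition (6.7)
        t = (1.0 - lam - sqrt((1.0 - lam) ** 2 - 2.0 * beta * L)) / L   # (6.8)
    return r, b, t

print(quadratic_case(beta=0.1, lam=0.0, L=2.6))  # t_lam exists: beta <= b_lam
```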
in U(x0 , tλ ) Moreover, the following estimates hold: xn+1 − xn ≤ sλ,n+1 − sλ,n , (6.11) and xn − x ≤ tλ − sλ,n (6.12) Proof Similarily replace {tλ,n } by {sλ,n } in the proof of Theorem 5.1 in [31] That completes the proof of Theorem 6.1 We provide now example, where κ = 0, and the hypotheses of Theorem 6.1 are satisfied, but not earlier ones [23, 26] Example 6.2 [8, 26] Let i = j = 2, and R2 be equipped with the Choose: x0 = (.2505, 0)T , –norm D = {x = (v, w)T : −1 < v < and − < w < 1} Define function F on U(x0 , σ ) ⊆ D (σ = 72) by F(x) = (v − w, (v − w)2 ), x = (v, w)T ∈ D (6.13) Then, for each x = (v, w)T ∈ D, the Fréchet–derivative of F at x, and the Moore–Penrose–pseudoinverse of F (x) are given by F (x) = −1 v−w w−v (6.14) and F (x)+ = (1 + (v − w)2 ) v−w −1 w − v (6.15) respectively Let x = (v1 , w1 )T ∈ D and y = (v2 , w2 )T ∈ D By (6.14), we have F (x) − F (y) = |(v1 − v2 ) − (w1 − w2 )| ≤ x − y That is L = L0 = Using (6.14), (6.15) and (3.1), we obtain: F (y)+ (I − F (x) F (x)+ ) F(x) ≤ x−y Hence the constant κ in hypothesis (3.1) is given by: κ = Using hypotheses of Theorem 6.1, (6.13)–(6.15), Theorem 2.4 in [23], and Theorem 3.1 in [26] are not applicable However, our Theorem 6.1 can apply to solve equation (6.13) Numer Algor (2011) 58:23–52 45 Remark 6.3 If L = L0 , Theorem 6.1 reduces to [31, Theorem 5.1] Otherwise, it constitues an improvements (see Lemma 2.2) = 1/2 Then If κ = (Newton’s method for h0 ), we have λ = 0, and sequences {tλ,n } and {vλ,n } reduce to: t1 = β, tn+1 = tn + L (tn − tn−1 )2 , (1 − L tn ) (6.16) v1 = β, vn+1 = + L (vn − vn−1 )2 , (1 − L0 ) (6.17) t0 = 0, and v0 = 0, respectively The corresponding sufficient convergence conditions are: h LHW = β L ≤ [31], (6.18) and h AH = β L ≤ , (6.19) where, L= L + L0 + L2 + L0 L (6.20) (see Lemma 5.3) Note that h LHW ≤ 1 =⇒ h AH ≤ 2 but not necessarily vice versa unless if L = L0 Moreover, since arbitrarily small, we have by (6.18), and (6.19) that h LHW −→ h AH as L0 −→ L (6.21) L0 L can be (6.22) That is our approach extends the applicability of (GNM) by at must four times Concerning the error bounds, we have already shown (see Lemma 2.2) that {vn } is a tighter majorizing sequence for {xn } than {tn } (see Example 6.4 (b)) Example 6.4 Let X = Y = R2 , be equipped with the max–norm, D = [ , − ]2 , ∈ [0, 1), and define function F on D by F(x) = (ξ13 − , ξ23 − )T , x = (ξ1 , ξ2 )T The Fréchet–derivative of operator F is given by F (x) = ξ12 ξ22 (6.23) 46 Numer Algor (2011) 58:23–52 (a) Let x0 = (1, 1)T Using hypotheses of Theorem 6.1, we get: β= (1 − ), L0 = − , and L = (2 − ) Condition (6.18) is violated, since h LHW = (1 − ) (2 − ) > for all ∈ [0, 5) √ √ Hence, there is no guarantee that (NM) converges to x = ( , )T , starting at x0 However, our condition (6.19) is true for all ∈ I = 450339002, 12 Hence, the conclusions of our Theorem 6.1 can apply to solve equation (6.23) for all ∈ I (b) Let x0 = (.9, 9)T , and = Using hypotheses of Theorem 6.1, we get: β = 1, L = 2.6, L0 = 2.3, h LHW = 26 and L = 2.39864766, h AH = 239864766 Then (6.18) and (6.19) are satisfied We have also: F (.9, 9)−1 = 4115226337 I , where, I is the identity × matrix The hypotheses of our Theorem 6.1, and the Kantorovich theorem are satisfied Then (NM) converges to x = (.8879040017, 8879040017)T , starting at x0 We also can provide the comparison table using the software Maple 13 Comparison table n (NM) xn+1 − xn (6.17) vn+1 − (6.16) tn+1 − tn 10 01193415638 0001618123748 2.946995467e-8 4.228114294e-11 ∼ ∼ ∼ ∼ ∼ ∼ ∼ 01688311688 
Example 6.4 Let X = Y = R^2 be equipped with the max–norm, let D = [ε, 2 − ε]^2, ε ∈ [0, 1), and define the function F on D by

$$F(x) = (\xi_1^3 - \varepsilon, \; \xi_2^3 - \varepsilon)^T, \quad x = (\xi_1, \xi_2)^T. \qquad (6.23)$$

The Fréchet–derivative of the operator F is given by

$$F'(x) = \begin{pmatrix} 3\,\xi_1^2 & 0 \\ 0 & 3\,\xi_2^2 \end{pmatrix}.$$

(a) Let x_0 = (1, 1)^T. Using the hypotheses of Theorem 6.1, we get:

$$\beta = \frac{1 - \varepsilon}{3}, \quad L_0 = 3 - \varepsilon, \quad \text{and} \quad L = 2\,(2 - \varepsilon).$$

Condition (6.18) is violated, since

$$h_{LHW} = \frac{2}{3}\,(1 - \varepsilon)\,(2 - \varepsilon) > \frac{1}{2} \quad \text{for all } \varepsilon \in [0, .5).$$

Hence, there is no guarantee that (NM) converges to x* = (ε^{1/3}, ε^{1/3})^T, starting at x_0. However, our condition (6.19) is true for all ε ∈ I = [.450339002, 1/2). Hence, the conclusions of our Theorem 6.1 can apply to solve equation (6.23) for all ε ∈ I.

(b) Let x_0 = (.9, .9)^T, and ε = .7. Using the hypotheses of Theorem 6.1, we get:

$$\beta = .1, \quad L = 2.6, \quad L_0 = 2.3, \quad h_{LHW} = .26,$$

and

$$\bar L = 2.39864766, \quad h_{AH} = .239864766.$$

Then (6.18) and (6.19) are satisfied. We also have F'(.9, .9)^{-1} = .4115226337 I, where I is the 2 × 2 identity matrix. The hypotheses of our Theorem 6.1 and of the Kantorovich theorem are satisfied. Then (NM) converges to x* = (.8879040017, .8879040017)^T, starting at x_0. We can also provide the following comparison table, computed with the software Maple 13.

Comparison table

n    ‖x_{n+1} − x_n‖ (NM)    v_{n+1} − v_n (6.17)    t_{n+1} − t_n (6.16)
1    .01193415638            .01688311688            .01756756757
2    .0001618123748          .0005067933842          .0005778355237
3    2.946995467e−8          4.57383463010e−7        6.265131450e−7
4    4.228114294e−11         3.725461907e−13         7.365175646e−13
5    ∼                       2.471607273e−25         1.017862116e−24
6    ∼                       1.087872853e−49         1.944019580e−48
7    ∼                       2.107538365e−98         7.091269701e−96
8    ∼                       7.909885354e−196        9.435626465e−191
9    ∼                       1.114190851e−390        1.670568212e−380
10   ∼                       2.210743650e−780        5.236621208e−760

The table shows that our error bounds v_{n+1} − v_n are tighter than t_{n+1} − t_n.

We also have the following result on error bounds for (GNM).

Lemma 6.5 [7] Assume that there exist constants L_0 ≥ 0, L ≥ 0, with L_0 ≤ L, and β ≥ 0, such that

$$q_0 = \bar L\,\beta \;\begin{cases} \le \dfrac{1}{2} & \text{if } L_0 \ne 0, \\[6pt] < \dfrac{1}{2} & \text{if } L_0 = 0, \end{cases} \qquad (6.24)$$

where \bar L is given by (6.20). Then, the sequence {v_k} (k ≥ 0) given by (6.17) is well defined, nondecreasing, bounded above by

$$v^{**} = \frac{2\,\beta}{2 - \delta},$$

and converges to its unique least upper bound v* ∈ [0, v**], where

$$1 \le \delta = \frac{4\,L}{L + \sqrt{L^2 + 8\,L_0\,L}}.$$

Example 6.12 Let X = Y = C[0, 1], the space of continuous functions defined on [0, 1], equipped with the max norm, and let D = U(0, 1). Define the function F on D by

$$F(h)(x) = h(x) - 5\,x \int_0^1 \theta\,h(\theta)^3\,d\theta. \qquad (6.29)$$

Then, we have:

$$F'(h[u])(x) = u(x) - 15\,x \int_0^1 \theta\,h(\theta)^2\,u(\theta)\,d\theta \quad \text{for all } u \in D.$$

Using (6.29), for x*(x) = 0 for all x ∈ [0, 1], we get L = 15 and L_0 = 7.5. We also have in this example that L_0 < L.
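The constants of Example 6.12 can be corroborated numerically: the difference F'(h_1) − F'(h_2) is a rank–one integral operator, and one can check that its norm in the max norm is 15 ∫_0^1 θ |h_1(θ)² − h_2(θ)²| dθ. The sketch below (ours; a crude Riemann sum) recovers L_0 = 7.5 at the solution x*(x) = 0.

```python
# A crude numerical estimate of the center-Lipschitz constant of Example 6.12.
import numpy as np

theta = np.linspace(0.0, 1.0, 20001)
dt = theta[1] - theta[0]

def deriv_diff_norm(h1, h2):
    # ||F'(h1) - F'(h2)|| = 15 * int_0^1 theta |h1(theta)^2 - h2(theta)^2| dtheta
    return 15.0 * np.sum(theta * np.abs(h1(theta) ** 2 - h2(theta) ** 2)) * dt

one = lambda t: np.ones_like(t)     # ||one|| = 1 in the max norm
zero = lambda t: np.zeros_like(t)   # the solution x* of Example 6.12
print(deriv_diff_norm(one, zero))   # approximately 7.5 = L0 * ||one - zero||
```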
7 Conclusion

Using our new concept of recurrent functions and a combination of average Lipschitz/center–Lipschitz conditions, we provided a semilocal/local convergence analysis for (GNM) to approximate a locally unique solution of a system of equations in finite dimensional spaces. Our analysis has the following advantages over the work in [31]: weaker sufficient convergence conditions and a larger convergence domain. Applications of our results to Kantorovich's analysis are also provided in this study.

References

1. Argyros, I.K.: On the Newton–Kantorovich hypothesis for solving equations. J. Comput. Appl. Math. 169, 315–332 (2004)
2. Argyros, I.K.: A unifying local–semilocal convergence analysis and applications for two–point Newton–like methods in Banach space. J. Math. Anal. Appl. 298, 374–397 (2004)
3. Argyros, I.K.: On the semilocal convergence of the Gauss–Newton method. Adv. Nonlinear Var. Inequal. 8, 93–99 (2005)
4. Argyros, I.K.: Computational Theory of Iterative Methods. In: Chui, C.K., Wuytack, L. (eds.) Studies in Computational Mathematics, vol. 15. Elsevier, New York (2007)
5. Argyros, I.K.: On a class of Newton–like methods for solving nonlinear equations. J. Comput. Appl. Math. 228, 115–122 (2009)
6. Argyros, I.K., Hilout, S.: Efficient Methods for Solving Equations and Variational Inequalities. Polimetrica Publisher, Milano (2009)
7. Argyros, I.K., Hilout, S.: Enclosing roots of polynomial equations and their applications to iterative processes. Surv. Math. Appl. 4, 119–132 (2009)
8. Argyros, I.K., Hilout, S.: On the solution of systems of equations with constant rank derivatives. Numer. Algor. (to appear). doi:10.1007/s11075-010-9426-5
9. Argyros, I.K., Hilout, S.: Improved generalized differentiability conditions for Newton–like methods. J. Complex. 26(3), 316–333 (2010)
10. Blum, L., Cucker, F., Shub, M., Smale, S.: Complexity and Real Computation. With a foreword by Richard M. Karp. Springer–Verlag, New York (1998)
11. Ben–Israel, A.: A Newton–Raphson method for the solution of systems of equations. J. Math. Anal. Appl. 15, 243–252 (1966)
12. Ben–Israel, A., Greville, T.N.E.: Generalized Inverses. Theory and Applications, 2nd edn. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, vol. 15. Springer–Verlag, New York (2003)
13. Chen, P.Y.: Approximate zeros of quadratically convergent algorithms. Math. Comput. 63, 247–270 (1994)
14. Dedieu, J.P., Kim, M.–H.: Newton's method for analytic systems of equations with constant rank derivatives. J. Complex. 18, 187–209 (2002)
15. Dedieu, J.P., Shub, M.: Newton's method for overdetermined systems of equations. Math. Comput. 69, 1099–1115 (2000)
16. Deuflhard, P.: A study of the Gauss–Newton algorithm for the solution of nonlinear least squares problems. In: Special Topics of Applied Mathematics (Proc. Sem., Ges. Math. Datenverarb., Bonn, 1979), pp. 129–150. North–Holland, Amsterdam–New York (1980)
17. Deuflhard, P., Heindl, G.: Affine invariant convergence theorems for Newton's method and extensions to related methods. SIAM J. Numer. Anal. 16, 1–10 (1979)
18. Ezquerro, J.A., Hernández, M.A.: Generalized differentiability conditions for Newton's method. IMA J. Numer. Anal. 22, 187–205 (2002)
19. Ezquerro, J.A., Hernández, M.A.: On an application of Newton's method to nonlinear operators with w–conditioned second derivative. BIT 42, 519–530 (2002)
20. Gragg, W.B., Tapia, R.A.: Optimal error bounds for the Newton–Kantorovich theorem. SIAM J. Numer. Anal. 11, 10–13 (1974)
21. Gutiérrez, J.M.: A new semilocal convergence theorem for Newton's method. J. Comput. Appl. Math. 79, 131–145 (1997)
22. Gutiérrez, J.M., Hernández, M.A.: Newton's method under weak Kantorovich conditions. IMA J. Numer. Anal. 20, 521–532 (2000)
23. Häußler, W.M.: A Kantorovich–type convergence analysis for the Gauss–Newton–method. Numer. Math. 48, 119–125 (1986)
24. He, J.S., Wang, J.H., Li, C.: Newton's method for underdetermined systems of equations under the γ–condition. Numer. Funct. Anal. Optim. 28, 663–679 (2007)
25. Hernández, M.A.: The Newton method for operators with Hölder continuous first derivative. J. Optim. Theory Appl. 109, 631–648 (2001)
26. Hu, N., Shen, W., Li, C.: Kantorovich's type theorems for systems of equations with constant rank derivatives. J. Comput. Appl. Math. 219, 110–122 (2008)
27. Kantorovich, L.V., Akilov, G.P.: Functional Analysis. Pergamon Press, Oxford (1982)
28. Li, C., Ng, K.F.: Majorizing functions and convergence of the Gauss–Newton method for convex composite optimization. SIAM J. Optim. 18(2), 613–642 (2007)
29. Li, C., Wang, J.: Newton's method on Riemannian manifolds: Smale's point estimate theory under the γ–condition. IMA J. Numer. Anal. 26(2), 228–251 (2006)
30. Li, C., Zhang, W.–H., Jin, X.–Q.: Convergence and uniqueness properties of Gauss–Newton's method. Comput. Math. Appl. 47, 1057–1067 (2004)
31. Li, C., Hu, N., Wang, J.: Convergence behavior of Gauss–Newton's method and extensions to the Smale point estimate theory. J. Complex. 26(3), 268–295 (2010)
32. Potra, F.A.: On the convergence of a class of Newton–like methods. In: Iterative Solution of Nonlinear Systems of Equations (Oberwolfach, 1982). Lecture Notes in Math., vol. 953, pp. 125–137. Springer, Berlin–New York (1982)
33. Potra, F.A.: On an iterative algorithm of order 1.839… for solving nonlinear operator equations. Numer. Funct. Anal. Optim. 7(1), 75–106 (1984/85)
34. Potra, F.A.: Sharp error bounds for a class of Newton–like methods. Libertas Mathematica 5, 71–84 (1985)
35. Shub, M., Smale, S.: Complexity of Bezout's theorem IV: probability of success, extensions. SIAM J. Numer. Anal. 33, 128–148 (1996)
36. Smale, S.: The fundamental theorem of algebra and complexity theory. Bull. Am. Math. Soc. 4, 1–36 (1981)
37. Smale, S.: Newton's method estimates from data at one point. In: The Merging of Disciplines: New Directions in Pure, Applied, and Computational Mathematics (Laramie, Wyo., 1985), pp. 185–196. Springer, New York (1986)
38. Smale, S.: Complexity theory and numerical analysis. In: Acta Numerica, pp. 523–551. Cambridge University Press, Cambridge (1997)
39. Stewart, G.W., Sun, J.G.: Matrix Perturbation Theory. Computer Science and Scientific Computing. Academic Press, Boston, MA (1990)
40. Traub, J.F., Woźniakowski, H.: Convergence and complexity of Newton iteration for operator equations. J. Assoc. Comput. Mach. 26, 250–258 (1979)
41. Wang, X.H.: Convergence of Newton's method and inverse function theorem in Banach space. Math. Comput. 68(255), 169–186 (1999)
42. Wang, X.H.: Convergence of Newton's method and uniqueness of the solution of equations in Banach space. IMA J. Numer. Anal. 20(1), 123–134 (2000)
43. Wang, X.H., Han, D.F.: On dominating sequence method in the point estimate and Smale theorem. Sci. China Ser. A 33(2), 135–144 (1990)
44. Wang, X.H., Han, D.F.: Criterion α and Newton's method under weak conditions (Chinese). Math. Numer. Sin. 19(1), 103–112 (1997); translation in Chinese J. Numer. Math. Appl. 19(2), 96–105 (1997)
45. Wang, X.H., Li, C.: The local and global behaviors of methods for solving equations (Chinese). Kexue Tongbao 46(6), 444–451 (2001)
46. Wang, X.H., Li, C.: Local and global behavior for algorithms of solving equations. Chin. Sci. Bull. 46(6), 441–448 (2001)
47. Wang, G.R., Wei, Y.M., Qiao, S.Z.: Generalized Inverses: Theory and Computations. Science Press, Beijing, New York (2004)
48. Wang, X.H., Xuan, X.H.: Random polynomial space and computational complexity theory. Sci. Sin. Ser. A 30(7), 673–684 (1987)
49. Wang, D.R., Zhao, F.G.: The theory of Smale's point estimation and its applications. In: Linear/Nonlinear Iterative Methods and Verification of Solution (Matsuyama, 1993). J. Comput. Appl. Math. 60(1–2), 253–269 (1995)
50. Wedin, P.A.: Perturbation theory for pseudo–inverses. BIT 13, 217–232 (1973)
51. Xu, X.B., Li, C.: Convergence of Newton's method for systems of equations with constant rank derivatives. J. Comput. Math. 25, 705–718 (2007)
52. Xu, X.B., Li, C.: Convergence criterion of Newton's method for singular systems with constant rank derivatives. J. Math. Anal. Appl. 345, 689–701 (2008)