Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Part 2


7 Stopping, Scaling, and Testing

In this chapter we discuss three issues that are peripheral to the basic mathematical considerations in the solution of nonlinear equations and minimization problems, but essential to the computer solution of actual problems. The first is how to adjust for problems that are badly scaled, in the sense that the dependent or independent variables are of widely differing magnitudes. The second is how to determine when to stop the iterative algorithms in finite-precision arithmetic. The third is how to debug, test, and compare nonlinear algorithms.

7.1 SCALING

An important consideration in solving many "real-world" problems is that some dependent or independent variables may vary greatly in magnitude. For example, we might have a minimization problem in which the first independent variable, x_1, is in the range [10^2, 10^3] meters and the second, x_2, is in the range [10^-7, 10^-6] seconds. These ranges are referred to as the scales of the respective variables. In this section we consider the effect of such widely disparate scales on our algorithms.

One place where scaling will affect our algorithms is in calculating terms such as ||x_+ - x_c||_2, which we used in our algorithms in Chapter 6. In the above example, any such calculation will virtually ignore the second (time) variable. However, there is an obvious remedy: rescale the independent variables; that is, change their units. For example, if we change the units of x_1 to kilometers and x_2 to microseconds, then both variables will have range [10^-1, 1] and the scaling problem in computing ||x_+ - x_c||_2 will be eliminated. Notice that this corresponds to changing the independent variable to \hat{x} = D_x x, where D_x is the diagonal scaling matrix D_x = diag(10^-3, 10^6).

This leads to an important question. Say we transform the units of our problem to \hat{x} = D_x x, or more generally, transform the variable space to \hat{x} = Tx, where T \in \mathbb{R}^{n \times n} is nonsingular, calculate our global step in the new variable space, and then transform back. Will the resultant step be the same as if we had calculated it using the same globalizing strategy in the old variable space? The surprising answer is that the Newton step is unaffected by this transformation but the steepest-descent direction is changed, so that a line-search step in the Newton direction is unaffected by a change in units, but a trust region step may be changed.

To see this, consider the minimization problem and define \hat{x} = Tx, \hat{f}(\hat{x}) = f(T^{-1}\hat{x}) = f(x). Then it is easily shown that

\nabla \hat{f}(\hat{x}) = T^{-T} \nabla f(x), \qquad \nabla^2 \hat{f}(\hat{x}) = T^{-T} \nabla^2 f(x)\, T^{-1},

so that the Newton step and steepest-descent direction in the new variable space are

\hat{s}_N = -\bigl(T^{-T} \nabla^2 f(x)\, T^{-1}\bigr)^{-1} T^{-T} \nabla f(x), \qquad \hat{s}_{SD} = -T^{-T} \nabla f(x),

or, in the old variable space,

s_N = T^{-1}\hat{s}_N = -\nabla^2 f(x)^{-1} \nabla f(x), \qquad s_{SD} = T^{-1}\hat{s}_{SD} = -(T^T T)^{-1} \nabla f(x).

These conclusions are really common sense. The Newton step goes to the lowest point of a quadratic model, which is unaffected by a change in units of x. (The Newton direction for systems of nonlinear equations is similarly unchanged by transforming the independent variable.)
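To make the invariance claim concrete, here is a minimal numerical sketch (not from the book; the quadratic test function, the diagonal matrix T, and all variable names are illustrative assumptions). It checks that the Newton step computed in the transformed variables and mapped back equals the Newton step computed directly, while the steepest-descent direction does not.

```python
import numpy as np

def grad_and_hess(x):
    # Gradient and Hessian of the assumed test function f(x) = 0.5 x^T A x - b^T x
    A = np.array([[2.0, 0.3], [0.3, 1.0]])
    b = np.array([1.0, -2.0])
    return A @ x - b, A

x = np.array([3.0, -1.0])
g, H = grad_and_hess(x)

# Steps computed directly in the original variables
newton_old = -np.linalg.solve(H, g)
sd_old = -g

# Change of variables x_hat = T x, f_hat(x_hat) = f(T^{-1} x_hat)
T = np.diag([1e-2, 1e2])                  # an assumed diagonal change of units
Tinv = np.linalg.inv(T)
g_hat = Tinv.T @ g                         # grad f_hat(x_hat) = T^{-T} grad f(x)
H_hat = Tinv.T @ H @ Tinv                  # hess f_hat(x_hat) = T^{-T} hess f(x) T^{-1}

# Steps computed in the new variables, then transformed back by T^{-1}
newton_back = Tinv @ (-np.linalg.solve(H_hat, g_hat))
sd_back = Tinv @ (-g_hat)                  # equals -(T^T T)^{-1} grad f(x), not -grad f(x)

print(np.allclose(newton_old, newton_back))   # True: the Newton step is unaffected
print(np.allclose(sd_old, sd_back))           # False: the steepest-descent direction changes
```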
However, determining which direction is "steepest" depends on what is considered a unit step in each direction. The steepest-descent direction makes the most sense if a step of one unit in variable direction x_i has about the same relative length as a step of one unit in any other variable direction x_j. For these reasons, we believe the preferred solution to scaling problems is for the user to choose the units of the variable space so that each component of x will have roughly the same magnitude. However, if this is troublesome, the equivalent effect can be achieved by a transformation in the algorithm of the variable space by a corresponding diagonal scaling matrix D_x. This is the scaling strategy on the independent variable space that is implemented in our algorithms.

All the user has to do is set D_x to correspond to the desired change in units, and then the algorithms operate as if they were working in the transformed variable space. The algorithms are still written in the original variable space, so that an expression like ||x_+ - x_c||_2 becomes ||D_x(x_+ - x_c)||_2, and the steepest-descent and hook steps become, respectively,

s_{SD} = -\lambda D_x^{-2} \nabla f(x_c), \qquad s(\mu) = -\bigl(\nabla^2 f(x_c) + \mu D_x^2\bigr)^{-1} \nabla f(x_c)

(see Exercise 3). The Newton direction is unchanged, however, as we have seen.

The positive diagonal scaling matrix D_x is specified by the user on input simply by supplying n values typx_i, i = 1, ..., n, giving "typical" magnitudes of each x_i. The algorithm then sets (D_x)_ii = (typx_i)^-1, making the magnitude of each transformed variable \hat{x}_i = (D_x)_ii x_i about 1. For instance, if the user inputs typx_1 = 10^3, typx_2 = 10^-6 in our example, then D_x will be

D_x = diag(10^-3, 10^6).        (7.1.1)

If no scaling of x_i is considered necessary, typx_i should be set to 1. Further instructions for choosing typx_i are given in the guidelines in the appendix. Naturally, our algorithms do not store the diagonal matrix D_x, but rather a vector S_x (S stands for scale), where (S_x)_i = (D_x)_ii = (typx_i)^-1.

The above scaling strategy is not always sufficient; for example, there are rare cases that need dynamic scaling, because some x_i varies by many orders of magnitude. This corresponds to using D_x exactly as in all our algorithms, but recalculating it periodically. Since there is little experience along these lines, we have not included dynamic scaling in our algorithms, although we would need only to add a module that periodically recalculates D_x at the conclusion of an iteration of Algorithm D6.1.1 or D6.1.3.
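As an illustration of this bookkeeping, the following sketch (an assumed reconstruction, not the book's code; names such as typx, Sx, and scaled_step_norm are illustrative) shows how the stored vector S_x is used to form the scaled quantities the algorithms actually compute.

```python
import numpy as np

typx = np.array([1e3, 1e-6])      # user-supplied typical magnitudes of x_1 (meters) and x_2 (seconds)
Sx = 1.0 / typx                    # stored vector: (Sx)_i = (D_x)_ii = 1 / typx_i

def scaled_step_norm(x_plus, x_c):
    # || D_x (x_plus - x_c) ||_2 : each component is measured relative to its typical size
    return np.linalg.norm(Sx * (x_plus - x_c))

def scaled_steepest_descent(grad):
    # Steepest-descent direction in the original variables, -D_x^{-2} grad f(x_c);
    # the step length lambda is left to the line search
    return -(grad / Sx**2)

x_c    = np.array([5.0e2, 3.0e-7])
x_plus = np.array([4.0e2, 9.0e-7])

print(scaled_step_norm(x_plus, x_c))      # both variables now contribute comparably
print(np.linalg.norm(x_plus - x_c))       # unscaled norm: the time variable is invisible
print(scaled_steepest_descent(np.array([2.0, -4.0])))   # with an assumed gradient value
```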
An example illustrating the importance of considering the scale of the independent variables is given below.

EXAMPLE 7.1.1 A common test problem for minimization algorithms is the Rosenbrock banana function

f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2,        (7.1.2)

which has its minimum at x_* = (1, 1)^T. Two typical starting points are x_0 = (-1.2, 1)^T and x_0 = (6.39, -0.221)^T. This problem is well scaled, but if a is not equal to 1, the scale can be made worse by substituting a\hat{x}_1 for x_1 and \hat{x}_2/a for x_2 in (7.1.2), giving

\hat{f}(\hat{x}) = 100(\hat{x}_2/a - a^2\hat{x}_1^2)^2 + (1 - a\hat{x}_1)^2.

This corresponds to the transformation \hat{x} = Tx with T = diag(1/a, a). If we run the minimization algorithms found in the appendix on \hat{f}(\hat{x}), starting from \hat{x}_0 = (-1.2/a, a)^T and \hat{x}_0 = (6.39/a, -0.221a)^T, use exact derivatives, the "hook" globalizing step, and the default tolerances, and neglect the scale by setting typx_1 = typx_2 = 1, then the numbers of iterations required for convergence for various values of a are as follows (an asterisk indicates failure to converge after 150 iterations):

a        Iterations from x_0 = (-1.2/a, a)^T        Iterations from x_0 = (6.39/a, -0.221a)^T
0.01     150+*                                      150+*
0.1      94                                         47
1        24                                         29
10       52                                         48
100      150+*                                      150+*

However, if we set typx_1 = 1/a and typx_2 = a, then D_x exactly undoes the change of units, and the algorithm behaves as it does on the original, well-scaled problem.
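Example 7.1.1 is easy to reproduce in spirit with a generic quasi-Newton minimizer; the sketch below is only an assumed stand-in (SciPy's BFGS routine rather than the book's hook-step Algorithm D6.1.1, so the iteration counts will differ from the table), but it typically shows the same qualitative degradation as a moves away from 1.

```python
import numpy as np
from scipy.optimize import minimize   # stand-in minimizer, not the book's algorithm

def rosenbrock(x):
    # f(x) = 100 (x2 - x1^2)^2 + (1 - x1)^2, minimum at (1, 1)
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

def scaled_rosenbrock(a):
    # f_hat(x_hat) = f(a * x_hat_1, x_hat_2 / a); its minimizer is (1/a, a)
    return lambda xh: rosenbrock(np.array([a * xh[0], xh[1] / a]))

for a in [0.01, 0.1, 1.0, 10.0, 100.0]:
    x0 = np.array([-1.2 / a, a])              # the first starting point of Example 7.1.1
    res = minimize(scaled_rosenbrock(a), x0, method="BFGS")
    print(f"a = {a:7.2f}   iterations = {res.nit:4d}   converged = {res.success}")
```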
Subject Index

  vector norms, 42, 66
  matrix norms, 43-44, 66
Large problems, 245-46
Maximization, 33-34, 101, 109
  necessary and sufficient conditions for, 81-83
Mean value theorem, 70, 74
Minimization (See Unconstrained minimization)
MINPACK, 187, 189, 228, 265
Model Hessian matrix, 82-83, 101-3, 114, 116, 133, 194, 198, 229
  algorithm, 315-18
Model of nonlinear function, 39
  algorithm, unconstrained minimization, 315-18
  algorithms, nonlinear equations, 342-47
  nonlinear least squares, 219-21, 229
  one-variable problems, 17-18, 28, 134-36
  systems of nonlinear equations, 75, 87, 147-48, 151, 153, 159, 169-70
  unconstrained minimization, 73, 76, 82-84, 100, 102, 130, 145, 195
Model trust region (See Trust region methods)
Modular algorithms or software (See Algorithms; Software)
Multivariate derivatives, 69-73, 77-80

N

N(x, r), 89
NAG library, 47, 265
Necessary conditions for unconstrained minimization, 33, 80-84
Negative curvature directions, 101, 145
Negative definite matrix, 58, 82-83
Neighborhood, N(x, r), 89
Newton's method (See also Finite-difference Newton's method)
  convergence, nonlinear equations, 21-24, 37, 89-93, 107
  convergence, unconstrained minimization, 27, 100
  as descent method, 114, 147-48
  for extended precision division, 36
  nonlinear least squares, 220, 229
  scaled, 156, 165
  single nonlinear equation, 16-19
  systems of nonlinear equations, 86-89
  unconstrained minimization, 34, 99-103
Newton's theorem, 18, 74
NL2SOL, 232-33
Noise in nonlinear function, 32, 96-99, 103-6, 188, 209, 279, 292
Nonlinear equations, systems of:
  algorithms (main driver), 285-89, 294-95
  Broyden's method, 169-74, 189-92
  choosing methods, 97, 167, 290-91
  choosing tolerances, 278-80, 291-92
  convergence theory, 21-25, 29-31, 89-96, 174-86, 245, 251-56
  definition
  failure to find root, 152, 161, 279-80, 292, 350
  finite-difference Newton's method, 28-32, 94-99, 240-42, 256
  global methods, 24-27, 147-52
  line search methods, 149-51
  Newton's method, 16-19, 86-89
  one equation in one unknown, 15-32
  problems with special structure, 239-58
  scaling, 152, 155-61, 165, 187, 192, 278-79, 291-92
  secant methods, 28-31, 168-93, 242-58
  singular Jacobian matrix, 89, 93, 151, 153-54
  software, 187, 189
  sparse problems, 239-46, 250-51, 254, 256-57
  stopping criteria, 27, 160-61
  trust region methods, 149
Nonlinear least squares:
  algorithms, 238
  choosing methods, 228, 233
  convergence theory, 222-28, 232, 236, 254-56
  definition
  derivatives, 219-21
  example problem
  Gauss-Newton method, 221-28, 232, 236
  global methods, 225-28, 232
  Levenberg-Marquardt method, 227-28, 232-33, 237
  line search methods, 227
  Newton's method, 220, 229
  partly linear problems, 234, 238
  rank-deficient Jacobian, 225-28
  residuals, 219-20, 222-28, 230-31, 233-34
  secant methods, 229-33, 249, 251, 254-56
  software, 228, 232-33
  statistical considerations, 219, 234-35, 237-38
  stopping criteria, 233-34, 237-38
  trust region methods, 227-28, 232
Nonsingular matrix, 44
Normal equations, 62, 221, 237
Norms:
  vector, 41-43, 66
  matrix, 43-45, 66-67
Numerical derivatives (See Finite-difference derivatives)

O

O, o notation, 11
One-variable problems, 15-37
  convergence theory, 21-25, 29-31, 34
  finite-difference Newton's method, 28-32
  global methods, 24-27, 34-35
  Newton's method, 16-19, 21-24
  one nonlinear equation in one unknown, 15-19, 21-32
  secant method, 28-31
  unconstrained minimization, 32-35
Operation counts:
  matrix factorizations, 51
  solving factored linear systems, 49
  updating matrix factorizations, 57-58
Optimally conditioned secant update, 212
Order of convergence (See Rates of convergence)
Orthogonal factorization (See QR decomposition)
Orthogonal matrix, 46, 48
Orthogonal vectors, 46
Orthonormal vectors, 46, 59, 67
Overdetermined systems of equations (See Linear least squares; Nonlinear least squares)
Overflow, 12, 13

P

Partial derivatives (See Gradient; Hessian; Jacobian)
PASCAL, 262, 265, 267, 270, 282
Permutation matrix, 48
Positive definite matrix, 49-51, 58 (See also Cholesky decomposition)
  as model Hessian, 82-83, 101-3, 114, 116, 133, 148, 151, 194, 197-201, 245
  sufficient condition for unconstrained minimum, 81-82
Positive definite secant method (See BFGS method)
Precision of real variables, 10-12, 270
Problems with special structure:
  partly linear problems, 234, 238, 246-49, 251
  secant methods, 246-58
  sparse problems, 240-46, 256-57
Program notes, 1, 39, 85, 167, 217
Projected secant updates, 190, 192-93, 212
Projection operator, 176, 183, 195, 205, 243, 250-51, 253, 255
PSB method (See Symmetric secant method)
Pseudoinverse, 65
  use in nonlinear equations algorithm, 154

Q

Q-order convergence rates, 20-21, 36 (See also Linear; Quadratic; Superlinear convergence)
QR decomposition, 49-51, 63, 68
  algorithms, 304-5, 311-14
  nonlinear equations algorithm, use in, 151-52, 187
  nonlinear least squares, use in, 221
  updating, 55-57, 187, 201, 246
Quadratic convergence, 20
  of finite-difference Newton's method, 29-31, 95-96, 106
  of Gauss-Newton method, 222-26
  of Levenberg-Marquardt method, 227-28
  of line search algorithms, 123-25
  of Newton's method, 21-24, 34, 89-93, 100, 102, 220, 229
Quadratic interpolation:
  in line search, 126-27, 150
  in trust region method, 144
Quadratic model, 28, 73, 76, 82-84, 100, 102, 130, 145, 147-49, 159, 195, 220, 229
Quasi-Newton direction (or step), 116-17, 123-25, 129-30, 139, 189
Quasi-Newton equation (See Secant equation)
Quasi-Newton method, 26, 112 (See also Secant methods)

R

Rank deficiency (See Hessian matrix; Jacobian matrix, singular or ill-conditioned; Singular value decomposition)
Rank of matrix, 64-66
Rank-one update, 170, 190, 211, 215
Rank-two update, 196, 198, 201, 203, 211-12, 215, 232
Rates of convergence, 20-21, 36 (See also Linear; Q-order; Quadratic; R-order; Superlinear convergence)
Regression (See Linear least squares; Nonlinear least squares)
Regression, robust, 235-36
Relative error, 11, 13, 160
Relative nonlinearity, 24, 91, 224
Residuals, 6, 61, 219-20, 222-28, 230-31, 233-34
R-order convergence rates, 21, 36, 92-93, 168, 170, 186
ROSEPACK, 66
Round-off errors, 11-13 (See also Finite-precision arithmetic; Machine epsilon)

S

Saddle point, 81, 101, 109, 115, 165, 349
Scaling:
  independent variables, 35, 155-60, 164-65, 187, 192, 206, 209, 227, 232, 278-79
  objective function for minimization, 35, 153, 209, 279
  objective function for nonlinear equations, 27, 152, 158-59, 161, 165, 187, 192, 291-92
Secant equation, 169, 195, 229-30, 248-58
Secant method, one-variable problems, 28-31
Secant methods, multi-variable problems (See also BFGS; Broyden's; DFP; Symmetric secant method; Optimally conditioned; Projected secant updates)
  convergence of derivative approximations, 174, 184-86, 191-92, 206-7, 214
  convergence theory, 174-86, 190, 196-98, 206, 210-11, 213, 232, 245, 251-56
  implementation, 186-189, 201, 208-10, 278, 291
  initial approximation, 172, 187, 209-10, 215, 232
  least change updates, 170-72, 189, 196, 198, 204-6, 213, 231-32, 244-56
  nonlinear least squares, 229-33
  problems with special structure, 246-56
  sparse problems, 242-46, 250-51, 254, 257
  systems of nonlinear equations, 168-93, 242-58
  unconstrained minimization, 194-215, 245-58
Sensitivity of linear equations, 51-55
Sherman-Morrison-Woodbury formula, 188
Simultaneous nonlinear equations (See Nonlinear equations)
Single-variable problems (See One-variable problems)
Singular matrix, test for, 47 (See also Hessian matrix; Jacobian matrix)
Singular value decomposition (SVD), 64-66, 68
Size of nonlinear problems
Software (See also EISPACK; Harwell; IMSL; LINPACK; MINPACK; NAG; NL2SOL; ROSEPACK; VARPRO; Yale sparse matrix package)
  evaluation and testing, 161-64
  for linear equations, 47, 51, 240
  modular, 162, 164, 264
  for nonlinear equations, 187, 189
  for nonlinear least squares, 228, 232-34
  for singular value decomposition, 66
  for unconstrained minimization, 265
Sparse matrix, 239
  closest to nonsparse matrix, 243
Sparse problems:
  finite-difference methods, 240-42, 256-57
  secant methods, 242-46, 250-51, 257
  secant methods, convergence, 245, 254
Steepest descent direction, 114, 116, 133, 139-43, 148, 209, 228
  scaled, 156-57, 209
