Excel for Scientists and Engineers, Part 8 (ppt)


EXCEL: NUMERICAL METHODS
CHAPTER 14 NONLINEAR REGRESSION USING THE SOLVER

Nonlinear Least-Squares Curve Fitting

Unlike linear regression, a fitting function that is nonlinear in its coefficients has no analytical expressions that yield the set of regression coefficients. To perform nonlinear regression, we must essentially use trial and error to find the set of coefficients that minimizes the sum of squares of differences between ycalc and yobsd.

For data such as those in Figure 14-1, we could proceed in the following manner: using reasonable guesses for k1 and k2, calculate [B] at each time data point, then calculate the sum of squares of residuals, $SS_{residuals} = \sum ([B]_{calc} - [B]_{expt})^2$. Our goal is to minimize this error-square sum. We could do this in a true "trial-and-error" fashion, attempting to guess at a better set of k1 and k2 values, then repeating the calculation process to get a new (and hopefully smaller) value for SSresiduals. Or we could attempt to be more systematic. Starting with our initial guesses for k1 and k2, we could create a two-dimensional array of starting values that bracket our guesses, as in Figure 14-2. (The initial guesses for k1 and k2 were 0.30 and 0.80, respectively, and the array of starting values is 70%, 80%, 90%, 100%, 110%, 120% and 130% of the respective initial estimates.) Then, for each set of k1 and k2 values, we calculate SSresiduals. The k1 and k2 values with the smallest error-square sum (k1 = 0.27, k2 = 0.64 in Figure 14-2) become the new initial estimates and the process is repeated, using smaller bracketing values. Years ago this procedure, called "pit-mapping," was performed on early digital computers.

Figure 14-1. A typical plot of the concentration of species B (vertical axis) against time (horizontal axis) for a system of two consecutive first-order reactions (the reaction scheme A→B→C).
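The pit-mapping procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`conc_b`, `ss_residuals`, `pit_map`), the noise-free synthetic data, the number of rounds and the halving of the bracket width on each pass are all hypothetical choices, not taken from the book. The expression for [B] is the standard result for two consecutive first-order reactions with no B present initially.

```python
import math

def conc_b(t, k1, k2, a0=1.0):
    """[B] at time t for A -> B -> C (first-order steps, no B at t = 0)."""
    if math.isclose(k1, k2):
        # Limiting form when the two rate constants coincide.
        return a0 * k1 * t * math.exp(-k1 * t)
    return a0 * k1 * (math.exp(-k1 * t) - math.exp(-k2 * t)) / (k2 - k1)

def ss_residuals(k1, k2, times, observed):
    """Sum of squared differences between calculated and observed [B]."""
    return sum((conc_b(t, k1, k2) - y) ** 2 for t, y in zip(times, observed))

def pit_map(k1, k2, times, observed, rounds=5, spread=0.30, steps=3):
    """Evaluate a (2*steps+1) x (2*steps+1) grid around the best point so far,
    move to the grid point with the smallest error-square sum, and repeat
    with narrower brackets."""
    for _ in range(rounds):
        # With spread=0.30 and steps=3 the factors are 70%, 80%, ..., 130%.
        factors = [1 + spread * i / steps for i in range(-steps, steps + 1)]
        grid = [(k1 * f1, k2 * f2) for f1 in factors for f2 in factors]
        k1, k2 = min(grid, key=lambda k: ss_residuals(k[0], k[1], times, observed))
        spread /= 2  # assumed schedule: halve the bracket width each pass
    return k1, k2

# Hypothetical noise-free data generated from k1 = 0.27, k2 = 0.64.
times = [0.5 * i for i in range(21)]
observed = [conc_b(t, 0.27, 0.64) for t in times]

ss_start = ss_residuals(0.30, 0.80, times, observed)
k1_fit, k2_fit = pit_map(0.30, 0.80, times, observed)
ss_end = ss_residuals(k1_fit, k2_fit, times, observed)
print(k1_fit, k2_fit, ss_start, ss_end)
```

Because the 100% factor is always part of the grid, the current best point is re-evaluated on every pass, so the error-square sum can never increase from one round to the next.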
In essence we are mapping out the error surface, in a sort of topographic way, searching for the minimum. A typical error surface is shown in Figure 14-3 (the logarithm of SSresiduals has been plotted to make the minimum in the surface more obvious in the chart).

Figure 14-2. The error-square sums for an array of initial estimates. The minimum SSresiduals value is in bold.

Figure 14-3. An error surface.

A more efficient process, the method of steepest descent, starts with a single set of initial estimate values (a point on the error surface), determines the direction of downward curvature of the surface, and progresses down the surface in that direction until the minimum is reached (a modern implementation of this method is called the Marquardt-Levenberg algorithm). Fortunately, Excel provides a tool, the Solver, that can be used to perform this kind of minimization and thus makes nonlinear least-squares curve fitting a simple task.

Introducing the Solver

Like Goal Seek, the Solver can vary a changing cell to make a target cell have a certain value. But unlike Goal Seek, which can vary only a single changing cell, the Solver can vary the values of a number of changing cells. The Solver is a general-purpose optimization package that can find a maximum, minimum or specified value of the target cell. The Solver code is a product of Frontline Systems Inc. (P.O. Box 4288, Incline Village, NV 89450; www.frontsys.com).

Microsoft's documentation makes no mention of the use of the Solver to perform least-squares curve fitting, but it is immediately obvious to almost any scientist that the Solver can be used to minimize the sum of squares of residuals (differences between yobsd and ycalc) and thus perform least-squares curve fitting. The Solver can be used to perform either linear or nonlinear least-squares curve fitting.
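To make the idea of walking downhill on the error surface concrete, here is a minimal Python sketch of plain steepest descent with a step-halving line search. It is not the Marquardt-Levenberg algorithm and not the code the Solver uses; the function names, the central-difference gradient, the step-halving rule and the synthetic data are all assumptions made for illustration.

```python
import math

def conc_b(t, k1, k2, a0=1.0):
    """[B] at time t for A -> B -> C (first-order steps, no B at t = 0)."""
    if math.isclose(k1, k2):
        return a0 * k1 * t * math.exp(-k1 * t)
    return a0 * k1 * (math.exp(-k1 * t) - math.exp(-k2 * t)) / (k2 - k1)

def make_objective(times, observed):
    """Return SSresiduals as a function of the coefficient list [k1, k2]."""
    def ss(params):
        k1, k2 = params
        return sum((conc_b(t, k1, k2) - y) ** 2 for t, y in zip(times, observed))
    return ss

def gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def steepest_descent(f, x0, max_iter=500):
    """Move against the gradient, halving the step until f decreases."""
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        g = gradient(f, x)
        step = 1.0
        while step > 1e-10:
            trial = [xi - step * gi for xi, gi in zip(x, g)]
            f_trial = f(trial)
            if f_trial < fx:      # accept only downhill moves
                x, fx = trial, f_trial
                break
            step /= 2             # otherwise shorten the step
        else:
            break                 # no downhill step found: stop
    return x, fx

# Hypothetical noise-free data generated from k1 = 0.27, k2 = 0.64.
times = [0.5 * i for i in range(21)]
observed = [conc_b(t, 0.27, 0.64) for t in times]
ss = make_objective(times, observed)

ss_start = ss([0.30, 0.80])
fitted, ss_end = steepest_descent(ss, [0.30, 0.80])
print(fitted, ss_start, ss_end)
```

Only downhill moves are accepted, so the error-square sum decreases monotonically; how close the result comes to the true minimum within the iteration limit depends on the shape of the surface.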
How the Solver Works

The Solver uses the Generalized Reduced Gradient (GRG2) nonlinear optimization code developed by Leon Lasdon, University of Texas at Austin, and Allan Waren, Cleveland State University*. For each of the changing cells, the Solver evaluates the partial derivative of the objective function F (the target cell) with respect to the changing cell a_i, by means of the finite-difference method. The procedure works something like this: the Solver reads the value of each changing cell a_i in turn, modifies the value by a perturbation factor (the perturbation factor is approximately $10^{-8}$), and writes the new value back to the worksheet cell. This causes the spreadsheet to recalculate, producing a new value of the objective. The Solver calculates the partial derivative ∂F/∂a_i according to equation 14-4 and then restores the changing cell to its original value and perturbs the next changing cell. The same method was used earlier in this book to calculate the first derivative of a function (see "Derivative of a Worksheet Formula Using the Finite-Difference Method" in Chapter 6).

$$\frac{\partial F}{\partial a_i} \approx \frac{\Delta F}{\Delta a_i} = \frac{F(a_i + \Delta a_i) - F(a_i)}{\Delta a_i} \qquad (14\text{-}4)$$

The Solver uses a matrix of the partial derivatives to determine the gradient of the response surface, and thus how to change the values of the changing cells in order to approach the desired solution.

* For linear and integer problems, the Solver uses the simplex method and branch-and-bound method, but these methods need not be discussed here. You can read more about the design and operation of the Solver in the following article (available online): "Design and Use of the Microsoft Excel Solver," Daniel Fylstra, Leon Lasdon, John Watson and Allan Waren, Interfaces 28, September 1998, pp. 29-55.
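The perturb, recalculate and restore cycle of equation 14-4 can be imitated outside Excel. The Python sketch below is an illustration only: the dictionary standing in for the changing cells, the function name `partial_derivatives`, the toy objective and the choice to treat the perturbation as a relative step of 1e-8 are assumptions, not details of the Solver's actual code.

```python
def partial_derivatives(objective, cells, rel_step=1e-8):
    """Forward-difference estimate of dF/da_i for every changing cell.

    `cells` maps a cell name to its current value; `objective` plays the
    role of the recalculated target cell.
    """
    base = objective(cells)                 # F(a_i) with nothing perturbed
    derivs = {}
    for name in cells:
        original = cells[name]
        # Assumed: the perturbation is relative to the cell's own magnitude.
        delta = original * rel_step if original != 0 else rel_step
        cells[name] = original + delta      # write the perturbed value back
        derivs[name] = (objective(cells) - base) / delta   # equation 14-4
        cells[name] = original              # restore before the next cell
    return derivs

# Hypothetical objective with known derivatives: F = a^2 + 3ab.
def target(c):
    return c["a"] ** 2 + 3 * c["a"] * c["b"]

cells = {"a": 2.0, "b": 5.0}
d = partial_derivatives(target, cells)
print(d)  # analytically dF/da = 2a + 3b = 19 and dF/db = 3a = 6
```

Because only the value of the objective is needed, the same loop works for any formula the "worksheet" can evaluate, which is the point made in the text about built-in and user-defined functions.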
The use of finite differences to obtain the partial derivatives means that the Excel spreadsheet performs all of the intermediate calculations leading to the evaluation of the derivatives. Thus all of Excel's built-in worksheet functions, as well as any user-defined functions, are supported. The alternative, obtaining the derivatives analytically by symbolic differentiation of the spreadsheet formulas, would have been an impossible task.

Loading the Solver Add-In

The Solver is an Excel Add-in, a software program that is loaded only when needed. You'll find the Solver in the Tools menu; if it's not there, choose Add-Ins from the Tools menu to display the Add-Ins dialog box, shown in Figure 14-4, check the box for Solver Add-In, then press OK.

Figure 14-4. The Add-Ins dialog box.

Why Use the Solver for Nonlinear Regression?

A number of commercial statistical packages provide the capability to perform nonlinear least-squares curve fitting, so why use the Solver? First, the Solver is used within the familiar Excel environment, so you don't have to learn new commands and procedures. Second, with commercial statistical packages you are generally restricted to using an equation chosen from a library of fitting functions provided within the program, whereas with the Solver you can fit data to any model (that is, any ycalc formula) you choose. Finally, the Solver is part of Excel. It's free, so why not use it?

Nonlinear Regression Using the Solver: An Example

To perform nonlinear least-squares curve fitting using the Solver, your spreadsheet model must contain a column of known y values and a column of calculated y values, so that the sum of squares of residuals can be calculated. The calculated y values must be spreadsheet formulas that depend on the curve-fitting coefficients that will be varied by the Solver.
To illustrate the use of the Solver for nonlinear least-squares curve fitting, we'll use as an example the system of two consecutive first-order reactions (the reaction scheme A→B→C) where the species B is the observed variable. Equation 14-3 gives the expression for the concentration of species B as a function of time; as we have seen, [B]t depends on two rate constants, k1 and k2. In the experimental results that follow, species B was monitored by spectrophotometry (light absorption) and the relationship between the light absorbed (the absorbance) and the concentration of B is given by Beer's Law:

$$A = \varepsilon_B \times (\text{path length of light through the sample}) \times [B]$$

where εB is the molar absorptivity (a constant dependent on the chemical species and the wavelength, and thus a third unknown quantity in this example). Therefore three curve-fitting coefficients (k1, k2 and εB) must be varied in this example. If two variable coefficients produce an error surface in three dimensions, as illustrated in Figure 14-3, then varying three coefficients requires that we work in four dimensions!

Figure 14-5 shows the spreadsheet that was used to produce the result shown in Figure 14-1. The experimental values of the dependent variable, Aobsd, are in column B, the concentration [B]t in column C, Acalc in column D and the square of the residual in column E.

Figure 14-5. The spreadsheet before optimization of coefficients by the Solver. The initial values of the three coefficients (the changing cells) and the current value of the objective (the target cell) are in bold.

The formulas in cells C10, D10 and E10 are, respectively,

=C_A*k_1*(EXP(-k_2*t)-EXP(-k_1*t))/(k_1-k_2)
=E_B*0.4*C10
=(B10-D10)^2

Range names were used in these formulas; the names assigned to cells are shown in parentheses in the cell to the right of each named cell.
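The three worksheet formulas above translate almost line for line into Python, which can help when checking a spreadsheet model by hand. This is a sketch under assumptions: the numerical values of C_A, the "true" coefficients and the time points are hypothetical (they are not the data of Figure 14-5), and the constant 0.4 is taken to be the path length in Beer's Law because of where it appears in the formula.

```python
import math

PATH_LENGTH = 0.4  # the constant in =E_B*0.4*C10, assumed to be the path length

def conc_b(t, c_a, k_1, k_2):
    """Column C: =C_A*k_1*(EXP(-k_2*t)-EXP(-k_1*t))/(k_1-k_2)"""
    return c_a * k_1 * (math.exp(-k_2 * t) - math.exp(-k_1 * t)) / (k_1 - k_2)

def a_calc(t, c_a, k_1, k_2, e_b):
    """Column D: =E_B*0.4*C10"""
    return e_b * PATH_LENGTH * conc_b(t, c_a, k_1, k_2)

def sum_sq_residuals(coeffs, times, a_obsd, c_a):
    """Target cell: the sum of column E, =(B10-D10)^2 for every data row."""
    k_1, k_2, e_b = coeffs
    return sum((obs - a_calc(t, c_a, k_1, k_2, e_b)) ** 2
               for t, obs in zip(times, a_obsd))

# Hypothetical values, for illustration only.
C_A = 0.02                          # initial concentration of A
true_coeffs = (0.25, 0.70, 3000.0)  # k_1, k_2, E_B
times = [float(i) for i in range(11)]
a_obsd = [a_calc(t, C_A, *true_coeffs) for t in times]

ss_true = sum_sq_residuals(true_coeffs, times, a_obsd, C_A)
ss_guess = sum_sq_residuals((0.30, 0.80, 2500.0), times, a_obsd, C_A)
print(ss_true, ss_guess)
```

The function `sum_sq_residuals` is the quantity a minimizer would be asked to reduce by varying the three coefficients, just as the Solver varies the three changing cells to reduce the target cell.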
The three changing cells ($E$6, $E$7 and $B$7) and the target cell ($E$26) are in bold. The initial values are guesses based on the appearance of the data in Figure 14-1. More specifically, the guesses were based on the rise time, decay time and maximum of the data, but if you experiment with the Solver you will see that much poorer guesses will almost always lead to the correct answer. (A good way to get initial values for the changing cells is to create a chart of the data, then vary the coefficients in order to get an approximate fit of the calculated curve to the experimental data points.)

When the spreadsheet model has been set up, choose Solver from the Tools menu. The Solver Parameters dialog box (Figure 14-6) will be displayed.

Figure 14-6. The Solver Parameters dialog box.

In the Set Target Cell box, type E26, or select cell E26 with the mouse. We want to minimize the sum of squares, so press the Min button. In the By Changing Cells box, enter E6:E7 and B7.

Figure 14-7. The Solver Options dialog box.

For reasons that will be explained in a subsequent section, press the Options button to display the Solver Options dialog box (Figure 14-7) and check the Use Automatic Scaling box. Press OK to exit from Solver Options and return to the Solver Parameters dialog box. Press the Solve button.

Figure 14-8. The Solver Results dialog box.

When the Solver finds a solution, the Solver Results dialog box is displayed (Figure 14-8). There are three reports that you can choose to print: Answer, Sensitivity and Limits, but none of these reports contains any information that we will use. You have the option of accepting the Solver's solution or restoring the original values. Press the Keep Solver Solution button. The spreadsheet will be displayed with the final values of the changing and target cells (Figure 14-9).

Figure 14-9. The spreadsheet after optimization of coefficients by the Solver.
The three coefficients (the changing cells) and the objective (the target cell) are in bold.

The Solver provides results that are essentially identical to those from commercial software packages. Any slight differences (usually ca. 0.001% or less) arise from the fact that, with all of these programs, the coefficients are found by a search method; the "final" values will differ depending on the convergence criteria used in each program. In fact, you would probably obtain slightly different results using the same program and the same data, if you started with different initial estimates of the coefficients.

Some Notes on Using the Solver

External References. The target cell and the changing cells must be on the active sheet. However, your model can involve external references to values in other worksheets or workbooks.

Discontinuous Functions. Discontinuous functions in your Solver model may cause problems. They can be either discontinuous mathematical functions such as TAN, which has a discontinuity at π/2, or worksheet functions that are inherently "discontinuous," such as IF, ABS, INT, ROUND, CHOOSE, LOOKUP, HLOOKUP, or VLOOKUP.

Initial Estimates. Since the Solver operates by a search routine, it will find a solution most rapidly and efficiently if the initial estimates that you provide are close to the final values. As mentioned previously, it is often useful to create a chart of the data that displays both yobsd and ycalc and then vary the parameters manually in order to find a good set of initial parameter estimates.

Global Minimum. To ensure that the Solver has found a global minimum rather than a local minimum, it's a good idea to obtain a solution using different sets of initial estimates.

"Unable to find a solution" When There Are a Large Number of Parameters. For a complicated model with a large number of adjustable coefficients, the Solver may not be able to converge to a reasonable solution.
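The advice about trying several sets of initial estimates can be illustrated with a deliberately simple one-dimensional example. Everything here is hypothetical: the function f(x) = x^4 - 3x^2 + x was chosen only because it has one local and one global minimum, and the fixed-step descent routine is a stand-in for any search method, not a model of the Solver.

```python
def f(x):
    """A toy 'error surface' with a local minimum near x = 1.13
    and a deeper (global) minimum near x = -1.30."""
    return x ** 4 - 3 * x ** 2 + x

def f_prime(x):
    return 4 * x ** 3 - 6 * x + 1

def descend(x, rate=0.01, iterations=2000):
    """Fixed-step descent: always moves downhill from the starting point."""
    for _ in range(iterations):
        x -= rate * f_prime(x)
    return x

# Run the same search from several different initial estimates.
starts = [-2.0, -0.5, 0.5, 2.0]
solutions = [descend(x0) for x0 in starts]
best = min(solutions, key=f)

for x0, x in zip(starts, solutions):
    print(f"start {x0:5.1f} -> x = {x:.4f}, f(x) = {f(x):.4f}")
print("best:", best)
```

Starts on the right-hand side of the central hump settle in the shallower minimum, while starts on the left reach the deeper one; comparing the final objective values across runs is what reveals which solution is the better candidate for the global minimum.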
In such a case, it is sometimes helpful to perform initial Solver runs with subsets of the coefficients. For example, to fit a UV-visible spectrum with five Gaussian bands, and thus 15 adjustable coefficients, you could perform initial runs varying the coefficients for two or three of the bands at a time. When a reasonable fit has been found for the subsets, perform a final Solver run varying all of the coefficients.

Some Notes on the Solver Parameters Dialog Box

There are some additional controls in the Solver Parameters dialog box:

By Changing Cells. ... individual cells or ranges in the By Changing Cells input box. You can use names instead of cell references for [...]

... simulation and Monte Carlo integration

Random Numbers in Excel

Since the Monte Carlo method involves the use of random numbers, we will begin by examining how random numbers are produced and used within Excel.

How Excel Generates Random Numbers

In Excel 2003, an improved random number generator was implemented. Earlier versions of Excel used a pseudo-random-number-generation algorithm whose performance on standard ...

... stated that more than $10^{13}$ numbers will be generated before the repetition begins. The random-number algorithm in Excel 2003 was developed by B. A. Wichmann and I. D. Hill ("Algorithm AS 183: An Efficient and Portable Pseudo-Random Number Generator," Applied Statistics, 31, 188-190, 1982; "Building a Random-Number Generator," BYTE, pp. 127-128, March 1987). This random number generator is also used in a software ...

... =RAND()*(top - bottom) + bottom
For example, if bottom = 0 and top = 5, the returned result could be, for example, 4.04608661978098. To generate a random integer between bottom and top, use
=ROUND(RAND()*(top - bottom) + bottom,0)
For example, if bottom = 0 and top = 50, the returned result could be 27.

Since all of the above formulas include the RAND function, the returned result ...
... between Excel's built-in worksheet functions, such as RAND, and the Add-In names, such as Randbetween. RANDBETWEEN(bottom,top) returns an integer random number. Bottom is the smallest integer RANDBETWEEN will return, top is the largest. For example, the expression RANDBETWEEN(0,100) returns, for example, 74. To generate a random number between bottom and top, without loading the Analysis ToolPak, use =RAND()*(top ...

Table 14-2. Data for simple logistic equation.

   x      y         x      y
  -8   0.0150       1   0.6198
  -7   0.0338       2   0.7292
  -6   0.0468       3   0.8177
  -5   0.0712       4   0.8843
  -4   0.1152       5   0.9206
  -3   0.1850       6   0.9547
  -2   0.2716       7   0.9706
  -1   0.3775       8   0.9863
   0   0.4972

3. Logistic Curve 1. The logistic function

$$y = \frac{a}{1 + e^{b + cx}} + d$$

takes into account offsets on the x-axis and the y-axis. Using the data in Table ...

... of Health and Human Services. It has been shown to pass tests developed by NIST (National Institute of Standards and Technology).

Using Random Numbers in Excel

You can use random numbers in many ways, for example: to add "noise" to a signal generated by a formula, to select items randomly from a list, or to perform a simulation by using the Monte Carlo method. These and some other uses of random numbers ...

... value. You can do this by entering the formula =RAND() in a cell, copying the cell, and then using Paste Special (Values). This will convert the contents of the cell from =RAND() to a value (e.g., 0.743487098126025). Alternatively, you can type the formula =RAND() in the formula bar, then press F9, then Enter. Instead of using the RAND worksheet function, you can use the RANDBETWEEN function, one of the Engineering ...
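The scaling formulas quoted in the fragments above carry over directly to other environments. The Python sketch below mirrors =RAND()*(top - bottom) + bottom and its ROUND variant; the function names are hypothetical, and `random.Random` is Python's own generator, not Excel's. Two caveats worth noting: rounding a scaled uniform number makes the two end values only half as likely as the interior integers, and Python's `round` resolves exact .5 ties differently from Excel's ROUND (ties are vanishingly rare here).

```python
import random

def rand_between_float(bottom, top, rng=random):
    """Python analogue of =RAND()*(top - bottom) + bottom."""
    return rng.random() * (top - bottom) + bottom

def rand_between_int(bottom, top, rng=random):
    """Python analogue of =ROUND(RAND()*(top - bottom) + bottom, 0).

    The end values bottom and top each receive only half a unit of the
    scaled interval, so they occur about half as often as other integers.
    """
    return round(rng.random() * (top - bottom) + bottom)

rng = random.Random(12345)  # fixed seed so that the run is repeatable
floats = [rand_between_float(0, 5, rng) for _ in range(1000)]
ints = [rand_between_int(0, 50, rng) for _ in range(1000)]

# rng.randint(bottom, top) is the closer counterpart of RANDBETWEEN:
# every integer from bottom to top inclusive is equally likely.
uniform_ints = [rng.randint(0, 50) for _ in range(1000)]
print(min(floats), max(floats), min(ints), max(ints))
```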
... method. To randomize manually, use =RAND() to generate a column of random numbers adjacent to (and, most conveniently, to the left of) the column of values to be randomized, as shown in Figure 15-2.

Figure 15-2. A list of names before randomizing. Only part of the list is shown. (folder 'Chapter 15 Examples', workbook 'Randomize', worksheet 'By Hand')

Then select the two columns and use the Sort command to Sort ...

... workbook 'Randomize', worksheet 'By Formula')

CHAPTER 15 RANDOM NUMBERS & MONTE CARLO METHOD

The preceding formulas can be combined into a single "megaformula"

=INDEX(Database,MATCH(SMALL(random,ROW()-1),random,0))

to produce a more compact spreadsheet, as shown in Figure 15-5.

Figure 15-5. A list of names randomized by using a single "megaformula." (folder 'Chapter 15 Examples', workbook 'Randomize', ...

... tests on series of random numbers produced by the earlier version of RAND revealed that the cycle before numbers started repeating was unacceptably short, in the vicinity of one million. In the improved random number generator used in Excel 2003, three sets of random numbers are generated. Three of these random numbers are summed, and the fractional part of the sum is used as the random number. By this ...

... transformed, ycalc is calculated directly from equation 14-7, the Solver returns the coefficients Vmax and Km, and SolvStat returns the standard deviations of Vmax and Km.
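The "sum three random numbers and keep the fractional part" scheme described above corresponds to the published Wichmann-Hill AS 183 recurrence, which is short enough to write out. The Python sketch below follows the published algorithm (three small multiplicative congruential generators with moduli 30269, 30307 and 30323); the class name and the default seeds are hypothetical, and no claim is made here that Excel's internal implementation matches it in every detail.

```python
class WichmannHill:
    """Minimal sketch of the Wichmann-Hill AS 183 generator."""

    def __init__(self, ix=1, iy=2, iz=3):
        # Three integer seeds; the defaults are arbitrary illustrative
        # values. Each seed should be nonzero modulo its generator's modulus.
        self.ix, self.iy, self.iz = ix, iy, iz

    def random(self):
        # Advance each of the three congruential generators.
        self.ix = (171 * self.ix) % 30269
        self.iy = (172 * self.iy) % 30307
        self.iz = (170 * self.iz) % 30323
        # Sum the three fractions and keep only the fractional part.
        total = self.ix / 30269 + self.iy / 30307 + self.iz / 30323
        return total % 1.0

gen = WichmannHill()
first = gen.random()
sample = [gen.random() for _ in range(1000)]

# The same seeds always reproduce the same sequence.
repeat = WichmannHill()
repeat_first = repeat.random()
print(first, sample[:3])
```

Combining three generators with different prime moduli is what gives the combined sequence a far longer cycle than any single one of them.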

Date posted: 14/08/2014, 12:20
