the same steps of determining the coefficients for each of the spline pieces by solving the system of linear equations derived from imposing constraints on how they are joined.

Method of Least Squares

A very common experimental situation is that, while the data values that we have are not exact, we do have some idea of the form of the function which is to fit the data. The function might depend on some parameters, and the curve-fitting procedure is to find the choice of parameters that "best" matches the observed values at the given points. If the function were a polynomial (with the parameters being the coefficients) and the values were exact, then this would be interpolation. But now we are considering more general functions and inaccurate data. To simplify the discussion, we'll concentrate on fitting to functions which are expressed as a linear combination of simpler functions, with the unknown parameters being the coefficients:

    f(x) = c_1 f_1(x) + c_2 f_2(x) + \cdots + c_M f_M(x).

This includes most of the functions that we'll be interested in. After studying this case, we'll consider more general functions.

A common way of measuring how well a function fits is the least-squares criterion: the error is calculated by adding up the squares of the errors at each of the observation points:

    E = \sum_{1 \le j \le N} (f(x_j) - y_j)^2.

This is a very natural measure: the squaring is done to stop cancellations among errors with different signs. Obviously, it is most desirable to find the choice of parameters that minimizes E. It turns out that this choice can be computed efficiently: this is the so-called method of least squares.

The method follows quite directly from the definition. To simplify the derivation, we'll do the case M = 2, N = 3, but the general method will follow directly. Suppose that we have three points x_1, x_2, x_3 and corresponding values y_1, y_2, y_3 which are to be fitted to a function of the form f(x) = c_1 f_1(x) + c_2 f_2(x). Our job is to find the choice of the coefficients c_1, c_2 which minimizes the least-squares error

    E = (c_1 f_1(x_1) + c_2 f_2(x_1) - y_1)^2 + (c_1 f_1(x_2) + c_2 f_2(x_2) - y_2)^2 + (c_1 f_1(x_3) + c_2 f_2(x_3) - y_3)^2.

To find the choices of c_1 and c_2 which minimize this error, we simply need to set the derivatives dE/dc_1 and dE/dc_2 to zero. For c_1 we have:

    dE/dc_1 = 2(c_1 f_1(x_1) + c_2 f_2(x_1) - y_1) f_1(x_1) + 2(c_1 f_1(x_2) + c_2 f_2(x_2) - y_2) f_1(x_2) + 2(c_1 f_1(x_3) + c_2 f_2(x_3) - y_3) f_1(x_3).

Setting the derivative equal to zero leaves an equation which the variables c_1 and c_2 must satisfy (the f_1(x_1), f_2(x_1), etc. are all "constants" with known values):

    c_1 [f_1(x_1) f_1(x_1) + f_1(x_2) f_1(x_2) + f_1(x_3) f_1(x_3)] + c_2 [f_2(x_1) f_1(x_1) + f_2(x_2) f_1(x_2) + f_2(x_3) f_1(x_3)] = y_1 f_1(x_1) + y_2 f_1(x_2) + y_3 f_1(x_3).

We get a similar equation when we set the derivative dE/dc_2 to zero.

These rather formidable-looking equations can be greatly simplified using vector notation and the "dot product" operation that we encountered briefly in Chapter 2. If we define the vectors x = (x_1, x_2, x_3) and y = (y_1, y_2, y_3), then the dot product of x and y is the real number defined by

    x \cdot y = x_1 y_1 + x_2 y_2 + x_3 y_3.

Now, if we define the vectors f_1 = (f_1(x_1), f_1(x_2), f_1(x_3)) and f_2 = (f_2(x_1), f_2(x_2), f_2(x_3)), then our equations for the coefficients c_1 and c_2 can be very simply expressed:

    c_1 f_1 \cdot f_1 + c_2 f_2 \cdot f_1 = y \cdot f_1,
    c_1 f_1 \cdot f_2 + c_2 f_2 \cdot f_2 = y \cdot f_2.

These can be solved with Gaussian elimination to find the desired coefficients.

For example, suppose that we have data points whose values are slightly perturbed from the exact values of 1 + x^2 and which are to be fit by a function of the form c_1 + c_2 x^2. In this case f_1 is a vector of all 1.0's and f_2 is the vector of squared x values, so we have to solve a 2-by-2 system of equations of the form above, with the result c_1 = 0.998 and c_2 = 1.054 (both close to 1, as expected).

The method outlined above easily generalizes to find more than two coefficients. To find the constants c_1, c_2, \ldots, c_M in

    f(x) = c_1 f_1(x) + c_2 f_2(x) + \cdots + c_M f_M(x)

which minimize the least-squares error for the point and observation vectors x = (x_1, \ldots, x_N) and y = (y_1, \ldots, y_N), first compute the function component vectors

    f_1 = (f_1(x_1), \ldots, f_1(x_N)), \ldots, f_M = (f_M(x_1), \ldots, f_M(x_N)).

Then make up an M-by-M linear system of equations with

    a_{ij} = f_i \cdot f_j,    b_j = f_j \cdot y.

The solution to this system of simultaneous equations yields the required coefficients.
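Before turning to the implementation of the general case, here is a minimal sketch (not from the text) of the two-coefficient fit just derived. The procedure name lsq2, the constant maxN, and the type data are assumptions introduced for the example; the basis functions f1 and f2 and the observation arrays x and y are taken to be declared by the enclosing program. Since the system here is only 2-by-2, it is solved directly by Cramer's rule rather than by Gaussian elimination.

    { sketch: fit c1*f1(x) + c2*f2(x) to N observations by least squares }
    { assumes type data = array[1..maxN] of real and functions f1, f2   }
    procedure lsq2(var x, y: data; N: integer; var c1, c2: real);
      var i: integer;
          a11, a12, a22, b1, b2, u, v, d: real;
      begin
      a11:=0.0; a12:=0.0; a22:=0.0; b1:=0.0; b2:=0.0;
      for i:=1 to N do
        begin
        u:=f1(x[i]); v:=f2(x[i]);
        a11:=a11+u*u;   { f1.f1 }
        a12:=a12+u*v;   { f1.f2 (= f2.f1) }
        a22:=a22+v*v;   { f2.f2 }
        b1:=b1+y[i]*u;  { y.f1 }
        b2:=b2+y[i]*v   { y.f2 }
        end;
      d:=a11*a22-a12*a12;           { determinant of the 2-by-2 system }
      c1:=(b1*a22-b2*a12)/d;
      c2:=(a11*b2-a12*b1)/d
      end;

For larger M the same dot products fill an M-by-M system, which is the general construction implemented next.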
The general construction is easily implemented by maintaining a two-dimensional array for the f vectors, considering y as the (M + 1)st vector. Then an array a can be filled with the dot products as follows:

    for i:=1 to M do
      for j:=1 to M+1 do
        begin
        t:=0.0;
        for k:=1 to N do t:=t+f[i, k]*f[j, k];
        a[i, j]:=t
        end;

and the resulting system can then be solved using the Gaussian elimination procedure from Chapter 5.

The method of least squares can be extended to handle nonlinear functions (functions whose dependence on the unknown coefficients is not linear), and it is often used for this type of application. The idea is fundamentally the same; the problem is that the derivatives may not be easy to compute. What is used is an iterative method: use some estimate for the coefficients, then use these within the method of least squares to compute the derivatives, thus producing a better estimate for the coefficients. This basic method, which is widely used today, was outlined by Gauss in the 1820s.

Exercises

1. Approximate the function lg x with a degree-4 interpolating polynomial at the points 1, 2, 3, 4, and 5. Estimate the quality of the fit by computing the sum of the squares of the errors at 1.5, 2.5, 3.5, and 4.5.

2. Solve the previous problem for the function sin x. Plot the function and the approximation, if that's possible on your computer system.

3. Solve the previous problems using a cubic spline instead of an interpolating polynomial.

4. Approximate the function lg x with a cubic spline with N knots, for N between 1 and 10. Experiment with different placements of the knots in the same range to try to obtain a better fit.

5. What would happen in least-squares data fitting if one of the functions was the function f_i(x) = 0 for some i?

6. What would happen in least-squares data fitting if all the observed values were 0?

7. What values of a, b, and c minimize the least-squares error in using the function f(x) = a log x + bx + c to approximate the observations f(1) = 0, f(4) = 13, f(8) = 8?

8. Excluding the Gaussian elimination phase, how many multiplications are involved in using the method of least squares to find M coefficients based on N observations?

9. Under what circumstances would the matrix which arises in least-squares curve fitting be singular?

10. Does the least-squares method work if two different observations are included for the same point?

7. Integration

Computing the integral is a fundamental analytic operation often performed on functions being processed on computers. One of two completely different approaches can be used, depending on the way the function is represented. If an explicit representation of the function is available, then it may be possible to do symbolic integration to compute a similar representation for the integral. At the other extreme, the function may be defined by a table, so that function values are known for only a few points. The most common situation is between these: the function to be integrated is represented in such a way that its value at any particular point can be computed. In this case, the goal is to compute a reasonable approximation to the integral of the function, without performing an excessive number of function evaluations. This computation is often called quadrature by numerical analysts.

Symbolic Integration

If full information is available about a function, then it may be worthwhile to consider using a method which involves manipulating some representation of the function rather than working with numeric values. The goal is to transform a representation of the function into a representation of the integral, in much the same way that indefinite integration is done by hand.
A simple example of this is the integration of polynomials. In Chapters 2 and 4 we examined methods for "symbolically" computing sums and products of polynomials, with programs that worked on a particular representation for the polynomials and produced the representation for the answers from the representation for the inputs. The operation of integration (and differentiation) of polynomials can also be done in this way. If a polynomial

    p(x) = p_0 + p_1 x + \cdots + p_{N-1} x^{N-1}

is represented simply by keeping the values of the coefficients in an array p, then the integral can be easily computed as follows:

    for i:=N downto 1 do p[i]:=p[i-1]/i;
    p[0]:=0.0;

This is a direct implementation of the well-known symbolic integration rule

    \int_0^x t^{i-1}\,dt = x^i / i    for i > 0.

Obviously a wider class of functions than just polynomials can be handled by adding more symbolic rules. The addition of composite rules such as integration by parts,

    \int u\,dv = uv - \int v\,du,

can greatly expand the set of functions which can be handled. (Integration by parts requires a differentiation capability. Symbolic differentiation is somewhat easier than symbolic integration, since a reasonable set of elementary rules plus the composite chain rule will suffice for most common functions.)

The large number of rules available to be applied to a particular function makes symbolic integration a difficult task. Indeed, it has only recently been shown that there is an algorithm for this task: a procedure which either returns the integral of any given function or says that the answer cannot be expressed in terms of elementary functions. A description of this algorithm in its full generality would be beyond the scope of this book. However, when the functions being processed are from a small restricted class, symbolic integration can be a powerful tool.

Of course, symbolic techniques have the fundamental limitation that there are a great many integrals (many of which occur in practice) which can't be evaluated symbolically. Next, we'll examine some techniques which have been developed to compute approximations to the values of real integrals.

Simple Quadrature Methods

Perhaps the most obvious way to approximate the value of an integral is the rectangle method: evaluating an integral is the same as computing the area under a curve, and we can estimate the area under a curve by summing the areas of small rectangles which nearly fit under the curve, as diagrammed below.

To be precise, suppose that we are to compute \int_a^b f(x)\,dx, and that the interval [a, b] over which the integral is to be computed is divided into N parts, delimited by the points x_1, x_2, \ldots, x_{N+1}. Then we have N rectangles, with the width of the ith rectangle (1 \le i \le N) given by x_{i+1} - x_i. For the height of the ith rectangle, we could use f(x_i) or f(x_{i+1}), but it would seem that the result would be more accurate if the value of f at the midpoint of the interval, (x_i + x_{i+1})/2, is used, as in the above diagram. This leads to the quadrature formula

    r = \sum_{1 \le i \le N} (x_{i+1} - x_i) f((x_i + x_{i+1})/2)

which estimates the value of the integral of f(x) over the interval from a = x_1 to b = x_{N+1}. In the common case where all the intervals are to be the same size, say x_{i+1} - x_i = w, we have x_i = a + (i - 1)w, so the midpoint of the ith interval is a + (2i - 1)w/2 and the approximation to the integral is easily computed.

    { f is the function to be integrated, declared elsewhere }
    function intrect(a, b: real; N: integer): real;
      var i: integer; w, r: real;
      begin
      r:=0.0; w:=(b-a)/N;
      for i:=1 to N do r:=r+w*f(a-w/2+i*w);
      intrect:=r
      end;

Of course, as N gets larger, the answer becomes more accurate. For example, the following table shows the estimate produced by this function for \int_1^2 dx/x (which we know to be \ln 2 = 0.6931471805599\ldots) when invoked with the call intrect(1.0, 2.0, N) for N = 10, 100, and 1000:

    N        estimate
    10       0.6928353604100
    100      0.6931440556283
    1000     0.6931471493100

When N = 1000, our answer is accurate to about seven decimal places.
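The pieces above can be assembled into a complete program. The following is a sketch (not taken from the text) which assumes the integrand 1/x and the function name intrect used above, and prints the three estimates in the table:

    program rectangles(output);
      var N: integer;

      { the function to be integrated; its integral from 1 to 2 is ln 2 }
      function f(x: real): real;
        begin f:=1.0/x end;

      { rectangle-method estimate, as given above }
      function intrect(a, b: real; N: integer): real;
        var i: integer; w, r: real;
        begin
        r:=0.0; w:=(b-a)/N;
        for i:=1 to N do r:=r+w*f(a-w/2+i*w);
        intrect:=r
        end;

      begin
      N:=10;
      while N<=1000 do
        begin
        writeln(N:5, ' ', intrect(1.0, 2.0, N):16:13);
        N:=N*10
        end
      end.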
More sophisticated quadrature methods can achieve better accuracy with much less work. It is not difficult to derive an analytic expression for the error made in the rectangle method by expanding f(x) in a Taylor series about the midpoint of each interval, integrating, then summing over all intervals. We won't go through the details of this calculation: our purpose is not to derive detailed error bounds, but rather to show error estimates for simple methods and how these estimates suggest more accurate methods. This can be appreciated even by a reader not familiar with Taylor series. It turns out that

    \int_a^b f(x)\,dx = r + w^2 e_3 + w^4 e_5 + \cdots,

where w is the interval width (b - a)/N and e_3 depends on the value of the third derivative of f at the interval midpoints, etc. (Normally, this is a good approximation because most "reasonable" functions have small higher-order derivatives, though this is not always true.) For example, if we choose to make w = .005 (which would correspond to N = 200 in the example above), this formula says the integral computed by the procedure above should be accurate to about six places.

Another way to approximate the integral is to divide the area under the curve into trapezoids, as diagrammed below. This trapezoid method leads to the quadrature formula

    t = \sum_{1 \le i \le N} (x_{i+1} - x_i) \frac{f(x_i) + f(x_{i+1})}{2}.
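For equal-width intervals this formula can be implemented in the same style as intrect above. The following is a minimal sketch under that assumption; the name inttrap is not taken from the text, and f is again assumed to be declared elsewhere:

    { trapezoid-method estimate of the integral of f from a to b }
    { using N equal-width intervals                              }
    function inttrap(a, b: real; N: integer): real;
      var i: integer; w, t: real;
      begin
      w:=(b-a)/N;
      t:=w*(f(a)+f(b))/2;                  { the endpoints appear in one trapezoid each }
      for i:=1 to N-1 do t:=t+w*f(a+i*w);  { each interior point is shared by two trapezoids }
      inttrap:=t
      end;

Since adjacent trapezoids share an endpoint, each interior point is evaluated only once even though it contributes to two terms of the sum.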