Data Analysis Using the Method of Least Squares
Extracting the Most Information from Experiments

With Figures and 123 Tables

John Wolberg
Technion-Israel Institute of Technology
Faculty of Mechanical Engineering
32000 Haifa, Israel
E-mail: jwolber@attglobal.net

Library of Congress Control Number: 20 5934
ISBN-10 3-540-25674-1 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-25674-8 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2006
Printed in Germany

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Data prepared by the Author and by SPI Publisher Services
Cover design: design & production GmbH, Heidelberg
Printed on acid-free paper SPIN 11010197 62/3141/SPI Publisher Services

For my parents, Sidney and Beatrice Wolberg ז"ל
My wife Laurie
My children and their families:
Beth, Gilad, Yoni and Maya Sassoon
David, Pazit and Sheli Wolberg
Danny, Iris, Noa, Adi and Liat Wolberg
Tamar, Ronen, Avigail and Aviv Kimchi

Preface

Measurements through quantitative experiments are one of the most fundamental tasks in all areas of
science and technology. Astronomers analyze data from asteroid sightings to predict orbits. Computer scientists develop models for recognizing spam mail. Physicists measure properties of materials at low temperatures to understand superconductivity. Materials engineers study the reaction of materials to varying load levels to develop methods for prediction of failure. Chemical engineers consider reactions as functions of temperature and pressure. The list is endless. From the very small-scale work on DNA to the huge-scale study of black holes, quantitative experiments are performed and the data must be analyzed.

Probably the most popular method of analysis of the data associated with quantitative experiments is least squares. It has been said that the method of least squares was to statistics what calculus was to mathematics. Although the method is hardly mentioned in most engineering and science undergraduate curricula, many graduate students end up using the method to analyze the data gathered as part of their research. There is not a lot of available literature on the subject. Very few books deal with least squares at the level of detail that the subject deserves. Many books on statistics include a chapter on least squares, but the treatment is usually limited to the simplest cases of linear least squares. The purpose of this book is to fill the gaps and include the type of information helpful to scientists and engineers interested in applying the method in their own special fields.

The purpose of many engineering and scientific experiments is to determine parameters based upon a mathematical model related to the phenomenon under observation. Even if the data is analyzed using least squares, the full power of the method is often overlooked. For example, the data can be weighted based upon the estimated errors associated with the data. Results from previous experiments or calculations can be combined with the least squares analysis to obtain improved estimates of the model
parameters. In addition, the results can be used for predicting values of the dependent variable or variables and the associated uncertainties of the predictions as functions of the independent variables.

The introductory chapter (Chapter 1) includes a review of the basic statistical concepts that are used throughout the book. The method of least squares is developed in Chapter 2. The treatment includes development of mathematical models using both linear and nonlinear least squares. In Chapter 3 evaluation of models is considered. This chapter includes methods for measuring the "goodness of fit" of a model and methods for comparing different models. The subject of candidate predictors is discussed in Chapter 4. Often there are a number of candidate predictors and the task of the analyst is to try to extract a model using subspaces of the full candidate predictor space. In Chapter 5 attention is turned towards designing experiments that will eventually be analyzed using least squares. The subject considered in Chapter 6 is nonlinear least squares software. Kernel regression is introduced in the final chapter (Chapter 7). Kernel regression is a nonparametric modeling technique that utilizes local least squares estimates.

Although general purpose least squares software is available, the subject of least squares is simple enough that many users of the method prefer to write their own routines. Often, the least squares analysis is a part of a larger program and it is useful to embed it within the framework of the larger program. Throughout the book very simple examples are included so that the reader can test his or her own understanding of the subject. These examples are particularly useful for testing computer routines.

The REGRESS program has been used throughout the book as the primary least squares analysis tool. REGRESS is a general purpose nonlinear least squares program and I am its author. The program can be downloaded from www.technion.ac.il/wolberg.

I would like to thank
David Aronson for the many discussions we have had over the years regarding the subject of data modeling. My first experiences with the development of general purpose nonlinear regression software were influenced by numerous conversations that I had with Marshall Rafal. Although a number of years have passed, I am still in contact with Marshall. Most of the examples included in the book were based upon software that I developed with Ronen Kimchi and Victor Leikehman and I would like to thank them for their advice and help. I would like to thank Ellad Tadmor for getting me involved in the research described in Section 7.7. Thanks to Richard Green for introducing me to the first English translation of Gauss's Theoria Motus in which Gauss developed the foundations of the method of least squares. I would also like to thank Donna Bossin for her help in editing the manuscript and teaching me some of the cryptic subtleties of WORD.

I have been teaching a graduate course on analysis and design of experiments and as a result have had many useful discussions with our students throughout the years. When I decided to write this book two years ago, I asked each student in the course to critically review a section in each chapter that had been written up to that point. Over 20 students in the spring of 2004 and over 20 students in the spring of 2005 submitted reviews that included many useful comments and ideas. A number of typos and errors were located as a result of their efforts and I really appreciated their help.

John R. Wolberg
Haifa, Israel
July, 2005

Contents

Chapter 1 INTRODUCTION
1.1 Quantitative Experiments
1.2 Dealing with Uncertainty
1.3 Statistical Distributions
    The normal distribution
    The binomial distribution
    The Poisson distribution
    The χ2 distribution
    The t distribution
    The F distribution
1.4 Parametric Models
1.5 Basic Assumptions
1.6 Systematic Errors
1.7 Nonparametric Models
1.8 Statistical Learning

Chapter 2 THE METHOD OF
LEAST SQUARES
2.1 Introduction
2.2 The Objective Function
2.3 Data Weighting
2.4 Obtaining the Least Squares Solution
2.5 Uncertainty in the Model Parameters
2.6 Uncertainty in the Model Predictions
2.7 Treatment of Prior Estimates
2.8 Applying Least Squares to Classification Problems

Chapter 3 MODEL EVALUATION
3.1 Introduction
3.2 Goodness-of-Fit
3.3 Selecting the Best Model
3.4 Variance Reduction
3.5 Linear Correlation
3.6 Outliers
3.7 Using the Model for Extrapolation
3.8 Out-of-Sample Testing
3.9 Analyzing the Residuals

Chapter 4 CANDIDATE PREDICTORS
4.1 Introduction
4.2 Using the F Distribution
4.3 Nonlinear Correlation
4.4 Rank Correlation

Chapter 5 DESIGNING QUANTITATIVE EXPERIMENTS
5.1 Introduction
5.2 The Expected Value of the Sum-of-Squares
5.3 The Method of Prediction Analysis
5.4 A Simple Example: A Straight Line Experiment
5.5 Designing for Interpolation
5.6 Design Using Computer Simulations
5.7 Designs for Some Classical Experiments
5.8 Choosing the Values of the Independent Variables
5.9 Some Comments about Accuracy

Chapter 6 SOFTWARE
6.1 Introduction
6.2 General Purpose Nonlinear Regression Programs
6.3 The NIST Statistical Reference Datasets
6.4 Nonlinear Regression Convergence Problems
6.5 Linear Regression: a Lurking Pitfall
6.6 Multi-Dimensional Models
6.7 Software Performance
6.8 The REGRESS Program

Chapter 7 KERNEL REGRESSION
7.1 Introduction
7.2 Kernel Regression Order Zero
7.3 Kernel Regression Order One
7.4 Kernel Regression Order Two
7.5 Nearest Neighbor Searching
7.6 Kernel Regression Performance Studies
7.7 A Scientific Application
7.8 Applying Kernel Regression to Classification
7.9 Group Separation: An Alternative to Classification

Appendix A: Generating Random Noise
Appendix B: Approximating the Standard Normal
Distribution
References
Index

... uncertainty in the input data. Also, the output of our analysis must include estimates of the uncertainty of the results. One of the most compelling reasons for using least squares analysis of data is...

... generated data. Once data has been obtained, regardless of its origin, the task of data analysis commences. Whether or not the method of least squares is applicable depends upon the applicability of some...
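The preface notes that data can be weighted based upon the estimated errors associated with the data. As a minimal sketch of that idea, and not the REGRESS program or the book's own code, the following Python fragment fits a straight line y = a1 + a2*x by weighted least squares. The function name `weighted_line_fit`, the variable names, and the sample numbers are assumptions made for illustration, and the parameter uncertainties it returns are meaningful only if the supplied `sigma_y` values are realistic standard deviations of the measurements.

```python
import numpy as np


def weighted_line_fit(x, y, sigma_y):
    """Fit y = a1 + a2*x by minimizing sum(((y - a1 - a2*x) / sigma_y)**2).

    Hypothetical helper for illustration; returns the parameter estimates
    and their estimated standard deviations.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Each point is weighted by the inverse of its estimated variance.
    w = 1.0 / np.asarray(sigma_y, dtype=float) ** 2

    # Design matrix: one column per parameter (constant term, slope).
    A = np.column_stack([np.ones_like(x), x])

    # Weighted normal equations: (A^T W A) a = A^T W y.
    C = A.T @ (w[:, None] * A)
    V = A.T @ (w * y)
    a = np.linalg.solve(C, V)

    # Assuming sigma_y are true standard deviations, the inverse of C
    # estimates the covariance matrix of the fitted parameters.
    cov = np.linalg.inv(C)
    return a, np.sqrt(np.diag(cov))


# Made-up measurements: the third point is much less certain than the rest.
x_data = [0.0, 1.0, 2.0, 3.0, 4.0]
y_data = [1.1, 2.9, 6.0, 7.1, 8.9]
sigma = [0.1, 0.1, 1.0, 0.1, 0.1]

params, param_sigmas = weighted_line_fit(x_data, y_data, sigma)
print("a1, a2 =", params)
print("sigma_a1, sigma_a2 =", param_sigmas)
```

With equal values of `sigma_y` the fit reduces to ordinary unweighted least squares. Giving a point a larger `sigma_y` reduces its influence on the fitted parameters.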