Introduction to Probability and Statistics for Science, Engineering, and Finance

Walter A. Rosenkrantz
Department of Mathematics and Statistics
University of Massachusetts at Amherst

Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2009 by Taylor & Francis Group, LLC
Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works
Printed in the United States of America on acid-free paper
International Standard Book Number-13: 978-1-58488-812-3 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate
system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data
Rosenkrantz, Walter A.
Introduction to probability and statistics for science, engineering, and finance / Walter A. Rosenkrantz.
p. cm.
Includes bibliographical references and index.
ISBN 978-1-58488-812-3 (alk. paper)
Probabilities. Mathematical statistics. I. Title.
QA273.R765 2008
519.5--dc22
2008013044

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

Preface

Student Audience and Prerequisites

This book is written for undergraduate students majoring in engineering, computer science, mathematics, economics, and finance who are required, or urged, to take a one- or two-semester course in probability and statistics to satisfy major or distributional requirements. The mathematical prerequisites are two semesters of single variable calculus. Although some multivariable calculus is used in Chapter 5, it can be safely omitted without destroying the continuity of the text. Indeed, the topics have been arranged so that the instructor can always adjust the mathematical sophistication to a level the students can feel comfortable with. Chapters and sections marked with an asterisk (*) are optional; they contain material of independent interest, but are not essential for a first course.

Objectives

My primary goal in writing this book is to integrate into the traditional one- or two-term statistics course some of the more interesting and widely used concepts in financial engineering. For example, the volatility of a stock is the standard deviation of its returns; value at risk (VaR) is essentially a confidence interval; a stock's β is the slope of the regression line obtained when one performs a linear regression of the stock's returns against the returns of the S&P500
index (the S&P500 index, itself, is used as a proxy for the market portfolio). The binomial distribution, it is worth noting, plays a fundamental role in the Cox-Ross-Rubinstein (CRR) model, also called the binomial lattice model, of stock price fluctuations. A passage to the limit via the central limit theorem yields the lognormal distribution for stock prices as well as the famous Black-Scholes option pricing formula.

Organization of the Book

Beginning with the first chapter on data analysis, I introduce the basic concepts a student needs in order to understand and create the tables and graphs produced by standard statistical software packages such as MINITAB, SAS, and JMP. The data sets themselves have been carefully selected to illustrate the role and scope of statistics in science, engineering, public health, and finance. The text then takes students through the traditional topics of a first course in statistics. Novel features include: (i) applications of traditional statistical concepts and methods to the analysis and interpretation of financial data; (ii) an introduction to modern portfolio theory; (iii) the mean-standard deviation (r − σ) diagram of a collection of portfolios; and (iv) computing a stock's β via simple linear regression.

For the benefit of instructors using this text, I have included the technical, even tedious, details needed to derive various theorems, including the famous Black-Scholes option pricing formula, because, in my opinion, one cannot explain this formula to students without a thorough understanding of the fundamental concepts, methods, and theorems used to derive it. These computational details, which can safely be omitted on a first reading, are contained in a section titled "Mathematical Details and Derivations," placed at the end of most chapters.

Examples

The text introduces the student to the most important concepts by using suitably chosen examples of independent interest. Applications to engineering (queueing theory, reliability theory,
acceptance sampling), computer performance analysis, public health, and finance are included as soon as the statistical concepts have been developed. Numerous examples, using both statistical software packages and scientific calculators, help to reinforce the student's mastery of the basic concepts.

Problems

The problems (there are 675 of them), which range from the routine to the challenging, help students master the basic concepts and give them a glimpse of the vast range of applications to a variety of disciplines.

Supplements

An Instructor's Solutions Manual containing carefully worked out solutions to all 675 problems is available to adopters of the textbook. All data sets, including those used in the worked out examples, are available on a CD-ROM to users of this textbook.

Contacting the Author

Despite the copy editor's and my best efforts, it is almost impossible to eliminate every error and typo. I therefore encourage all users of this text to send their comments and criticisms to me at rkrantz@math.umass.edu

Acknowledgments

The publication of a statistics textbook, containing many tables and graphs, is not possible without the cooperation of a large number of highly talented individuals, so it is a great pleasure for me to have this opportunity to thank them. First, I want to thank my editors at Chapman-Hall: Sunil Nair for initiating this project, and Theresa Delforn and Michele Dimont for guiding and prodding me to a successful conclusion. Shashi Kumar's technical advice with LaTeX is deeply appreciated, and Theresa Gandolph, of the Instructional Technology Lab at George Washington University, gave me valuable assistance with the Freehand graphics software package. Professors Alan Durfee of Mount Holyoke College and Michael Sullivan of the University of Massachusetts (Amherst) provided me with some insights and ideas on financial engineering that were very useful to me in the writing of this book. I am also grateful to the
American Association for the Advancement of Science, the American Journal of Clinical Nutrition, the American Journal of Epidemiology, the American Statistical Association, the Biometrika Trustees, Cambridge University Press, Elsevier Science, Iowa State University Press, Richard D. Irwin, McGraw-Hill, Oxford University Press, Prentice-Hall, Routledge, Chapman & Hall, the Royal Society of Chemistry, Journal of Chemical Education, and John Wiley & Sons for permission to use copyrighted material. I have made every effort to secure permission from the original copyright holders for each data set, and would be grateful to my readers for calling my attention to any omissions so they can be corrected by the publisher.

Finally, I dedicate this book to my wife, Linda, for her patient support while I was almost always busy writing it.

Contents

Data Analysis
1.1 Orientation
1.2 The Role and Scope of Statistics in Science and Engineering
1.3 Types of Data: Examples from Engineering, Public Health, and Finance
  1.3.1 Univariate Data
  1.3.2 Multivariate Data
  1.3.3 Financial Data: Stock Market Prices and Their Time Series
  1.3.4 Stock Market Returns: Definition and Examples
1.4 The Frequency Distribution of a Variable Defined on a Population
  1.4.1 Organizing the Data
  1.4.2 Graphical Displays
  1.4.3 Histograms
1.5 Quantiles of a Distribution
  1.5.1 The Median
  1.5.2 Quantiles of the Empirical Distribution Function
1.6 Measures of Location (Central Value) and Variability
  1.6.1 The Sample Mean
  1.6.2 Sample Standard Deviation: A Measure of Risk
  1.6.3 Mean-Standard Deviation Diagram of a Portfolio
  1.6.4 Linear Transformations of Data
1.7 Covariance, Correlation, and Regression: Computing a Stock's Beta
  1.7.1 Fitting a Straight Line to Bivariate Data
1.8 Mathematical Details and Derivations
1.9 Chapter Summary
1.10 Problems
1.11 Large Data Sets
1.12 To Probe Further

Probability Theory
2.1 Orientation
2.2 Sample Space, Events, Axioms of Probability Theory
  2.2.1 Probability Measures
2.3 Mathematical Models of Random Sampling
  2.3.1 Multinomial Coefficients
2.4 Conditional Probability and Bayes' Theorem
  2.4.1 Conditional Probability
  2.4.2 Bayes' Theorem
  2.4.3 Independence
2.5 The Binomial Theorem
2.6 Chapter Summary
2.7 Problems
2.8 To Probe Further

Discrete Random Variables and Their Distribution Functions
3.1 Orientation
3.2 Discrete Random Variables
  3.2.1 Functions of a Random Variable
3.3 Expected Value and Variance of a Random Variable
  3.3.1 Moments of a Random Variable
  3.3.2 Variance of a Random Variable
  3.3.3 Chebyshev's Inequality
3.4 The Hypergeometric Distribution
3.5 The Binomial Distribution
  3.5.1 A Coin Tossing Model for Stock Market Returns
3.6 The Poisson Distribution
3.7 Moment Generating Function: Discrete Random Variables
3.8 Mathematical Details and Derivations
3.9 Chapter Summary
3.10 Problems
3.11 To Probe Further

Continuous Random Variables and Their Distribution Functions
4.1 Orientation
4.2 Random Variables with Continuous Distribution Functions: Definition and Examples
4.3 Expected Value, Moments, and Variance of a Continuous Random Variable
4.4 Moment Generating Function: Continuous Random Variables
4.5 The Normal Distribution: Definition and Basic Properties
4.6 The Lognormal Distribution: A Model for the Distribution of Stock Prices
4.7 The Normal Approximation to the Binomial Distribution
  4.7.1 Distribution of the Sample Proportion p̂
4.8 Other Important Continuous Distributions
  4.8.1 The Gamma and Chi-Square Distributions
  4.8.2 The Weibull Distribution
  4.8.3 The Beta Distribution
4.9 Functions of a Random Variable
4.10 Mathematical Details and Derivations
4.11 Chapter Summary
4.12 Problems
4.13 To Probe Further

Multivariate Probability Distributions
5.1 Orientation
5.2 The Joint Distribution Function: Discrete Random Variables
  5.2.1 Independent Random Variables
5.3 The Multinomial Distribution
5.4 Mean and Variance of a Sum of Random Variables
  5.4.1 The Law of Large Numbers for Sums of Independent and Identically Distributed (iid) Random Variables
  5.4.2 The Central Limit Theorem
5.5 Why Stock Prices Have a Lognormal Distribution: An Application of the Central Limit Theorem
  5.5.1 The Binomial Lattice Model as an Approximation to a Continuous Time Model for Stock Market Prices
5.6 Modern Portfolio Theory
  5.6.1 Mean-Variance Analysis of a Portfolio
5.7 Risk Free and Risky Investing
  5.7.1 Present Value Analysis of Risk Free and Risky Returns
  5.7.2 Present Value Analysis of Deterministic and Random Cash Flows
5.8 Theory of Single and Multi-Period Binomial Options
  5.8.1 Black-Scholes Option Pricing Formula: Binomial Lattice Model
5.9 Black-Scholes Formula for Multi-Period Binomial Options
  5.9.1 Black-Scholes Pricing Formula for Stock Prices Governed by a Lognormal Distribution
5.10 The Poisson Process
  5.10.1 The Poisson Process and the Gamma Distribution
5.11 Applications of Bernoulli Random Variables to Reliability Theory
5.12 The Joint Distribution Function: Continuous Random Variables
  5.12.1 Functions of Random Vectors
  5.12.2 Conditional Distributions and Conditional Expectations: Continuous Case
  5.12.3 The Bivariate Normal Distribution
5.13 Mathematical Details and Derivations
5.14 Chapter Summary
5.15 Problems
5.16 To Probe Further

Sampling Distribution Theory
6.1 Orientation
6.2 Sampling from a Normal Distribution
6.3 The Distribution of the Sample Variance
  6.3.1 Student's t Distribution
  6.3.2 The F Distribution
6.4 Mathematical Details and Derivations
6.5 Chapter Summary
6.6 Problems
6.7 To Probe Further

Point and Interval Estimation
7.1 Orientation
7.2 Estimating Population Parameters: Methods and Examples
  7.2.1 Some Properties of Estimators: Bias, Variance, and Consistency
7.3 Confidence Intervals for the Mean and Variance
  7.3.1 Confidence Intervals for the Mean of a Normal Distribution: Variance Unknown
  7.3.2 Confidence Intervals for the Mean of an Arbitrary Distribution
  7.3.3 Confidence Intervals for the Variance of a Normal Distribution
  7.3.4 Value at Risk (VaR): An Application of Confidence Intervals to Risk Management
7.4 Point and Interval Estimation for the Difference of Two Means
  7.4.1 Paired Samples
7.5 Point and Interval Estimation for a Population Proportion
  7.5.1 Confidence Intervals for p1 − p2
7.6 Some Methods of Estimation
  7.6.1 Method of Moments

Answers to Selected Odd-Numbered Problems

Chapter 11 Multiple Linear Regression

Problem 11.3
(a) $$\begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \\ Y_4 \\ Y_5 \\ Y_6 \\ Y_7 \end{pmatrix} = \begin{pmatrix} 1 & 14 \\ 1 & 16 \\ 1 & 27 \\ 1 & 42 \\ 1 & 39 \\ 1 & 50 \\ 1 & 83 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} + \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \epsilon_4 \\ \epsilon_5 \\ \epsilon_6 \\ \epsilon_7 \end{pmatrix}$$
(b) $$\begin{pmatrix} 7 & 271 \\ 271 & 13855 \end{pmatrix} \begin{pmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{pmatrix} = \begin{pmatrix} 66 \\ 3375 \end{pmatrix}$$
(d) $$s^2 (X'X)^{-1} = \begin{pmatrix} 0.6907 & -0.0135 \\ -0.0135 & 0.0003 \end{pmatrix}$$

Problem 11.5
$$\det(X'X) = n\sum x_i^2 - \Big(\sum x_i\Big)^2 = n\sum x_i^2 - (n\bar{x})^2 = n\Big(\sum x_i^2 - n\bar{x}^2\Big) = nS_{xx}$$

Problem 11.7
Using the result that $((X'X)^{-1})' = (X'X)^{-1}$ it follows that
$$H' = (X(X'X)^{-1}X')' = (X')'((X'X)^{-1})'X' = X(X'X)^{-1}X' = H;$$
$$H^2 = HH = X(X'X)^{-1}X'X(X'X)^{-1}X' = X(X'X)^{-1}(X'X)(X'X)^{-1}X' = XI(X'X)^{-1}X' = X(X'X)^{-1}X' = H.$$
Now $H^2 = H$ implies that $0 = H - H^2 = H(I - H)$, and finally,
$$(I - H)^2 = I - 2H + H^2 = I - H + (H^2 - H) = I - H.$$

Problem 11.9
(a) $\hat{y} = -28.3654 + 0.6013x_1 + 1.0279x_2$
(b) 65.2695
(c) $\hat{y}(841, 48) = 526.7$, residual = 8.3028
(d) Analysis of Variance

Source     DF   Sum of Squares    Mean Square    F Value    Prob>F
Model       2     171328.66023    85664.33012   1304.326    0.0001
Error       7        459.73977       65.67711
C Total     9     171788.40000

Root MSE 8.10414    R-square 0.9973

$R^2 = 0.9973$ and adjusted $R^2 = 0.9966$.
(e) $SSE(r) = 1166.75898$, $DF(r) = 8$ and $SSE(f) = 459.73977$, $DF(f) = 7$. The F ratio $F = 10.7651 > F_{1,7}(0.05) = 5.59$. On the other hand $F = 10.7651 < F_{1,7}(0.01) = 12.25$. The P-value is 0.0135.

Problem 11.11
(b) y =
−26219 + 189.204551x − 0.331194x²
(d) $SSE(r) = 22508.10547$, $DF(r) = 6$ and $SSE(f) = 10213.02766$, $DF(f) = 5$. The F ratio is $F = 6.0172 < F_{1,5}(0.05) = 6.61$. The P-value is 0.0577.

Problem 11.13
Looking at the ANOVA table we see that the F ratio is 1.774 with a P-value of 0.1925. The coefficient of multiple determination is 0.2496, which is quite low. The model is not a good one.

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
Model       3          0.02685       0.00895     1.774   0.1925
Error      16          0.08072       0.00505
C Total    19          0.10757

Root MSE 0.07103    R-square 0.2496

Chapter 12 Single Factor Experiments: Analysis of Variance

Problem 12.1
$IQR = Q_3 - Q_1 = 29 - 27 = 2$ and $1.5 \times IQR = 3$. There is only one observation in the group that lies outside the interval $[Q_1 - 1.5 \times IQR,\ Q_3 + 1.5 \times IQR] = [24, 32]$, and that is 23.

Problem 12.3
(a) $Y_{31} = 28 = \mu_3 + \epsilon_{31}$, $Y_{32} = 30 = \mu_3 + \epsilon_{32}$. Consequently, the difference between the two values, $Y_{31} - Y_{32} = -2 = \epsilon_{31} - \epsilon_{32}$, is accounted for by the difference in the error terms (which are random variables).
(b) The arithmetic test grades for each group are modeled by a normal distribution with (possibly) different means. There is a positive probability that an individual taught by one of the other methods will have a test grade higher than someone taught by method 3, even though the latter has a higher sample mean.

Problem 12.5
(a) $\bar{y}_{1\cdot} = 221.67$, $\bar{y}_{2\cdot} = 202.67$, $\bar{y}_{3\cdot} = 177.00$, $\bar{y}_{4\cdot} = 249.00$, $\bar{y}_{\cdot\cdot} = 212.58$
(b) $SSTr = 8319.58$, $SSE = 3523.33$, $SST = 11842.92$
(c) Dependent Variable: Y   Burst Strength (psi)

Source            DF   Sum of Squares     Mean Square   F Value   Pr > F
Model              3     8319.5833333    2773.1944444      6.30   0.0168
Error              8     3523.3333333     440.4166667
Corrected Total   11    11842.9166667

$F = 6.30 > 4.07 = F_{3,8}(0.05)$. Reject $H_0$.

Problem 12.7

Source            DF   Sum of Squares    Mean Square   F Value   Pr > F
Model              2      76.85405000    38.42702500     74.21   0.0001
Error              9       4.66045000     0.51782778
Corrected Total   11      81.51450000

$F = 74.21 > 4.26 = F_{2,9}(0.05)$. Reject $H_0$.

Problem 12.9
(a) The factor is the type of diet,
there are three levels corresponding to the three different diets, and the response variable is the weight loss.
(b) The null hypothesis is that the mean weight loss is the same for the three diets.
(c) Dependent Variable: POUNDS

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2      155.1450286    77.5725143      6.01   0.0083
Error             22      284.1725714    12.9169351
Corrected Total   24      439.3176000

Problem 12.11
(a) $E(y_{ij}) = \mu + \alpha_i$; therefore $E\Big(\sum_{1 \le j \le J} y_{ij}\Big) = J(\mu + \alpha_i)$, so
$$E(\bar{y}_{i\cdot}) = E\Big(\frac{1}{J}\sum_{1 \le j \le J} y_{ij}\Big) = \mu + \alpha_i.$$
(b) Using the fact that $E(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}) = E(\bar{y}_{i\cdot}) - E(\bar{y}_{\cdot\cdot})$, the result follows by noting that $E(\bar{y}_{i\cdot}) = \mu + \alpha_i$ and $E(\bar{y}_{\cdot\cdot}) = \mu$, as will now be shown.
$$E\Big(\sum_i \sum_j y_{ij}\Big) = \sum_i \sum_j E(y_{ij}) = \sum_i \sum_j (\mu + \alpha_i) = IJ\mu, \quad \text{so} \quad E(\bar{y}_{\cdot\cdot}) = E\Big(\frac{1}{IJ}\sum_i \sum_j y_{ij}\Big) = \mu,$$
where we used the result that $\sum_i \alpha_i = 0$.

Problem 12.13
(a) $t_{n-I}(0.025)\sqrt{MSE/J} = t_{12}(0.025)\sqrt{12584.14583/4} = 122.2191$. The four 95% confidence intervals are: [2755.7809, 3000.2191], [3032.2809, 3276.7191], [2815.0309, 3059.4691], [2551.7809, 2796.2191].
(b) T tests (LSD) for variable: Y
NOTE: This test controls the type I comparisonwise error rate not the experimentwise error rate.
Alpha= 0.05  Confidence= 0.95  df= 12  MSE= 12584.15
Critical Value of T= 2.17881
Least Significant Difference= 172.83
Comparisons significant at the 0.05 level are indicated by '***'.

LEVEL        Lower Confidence   Difference      Upper Confidence
Comparison   Limit              Between Means   Limit
2 - 3           44.42             217.25          390.08   ***
2 - 1          103.67             276.50          449.33   ***
2 - 4          307.67             480.50          653.33   ***
3 - 2         -390.08            -217.25          -44.42   ***
3 - 1         -113.58              59.25          232.08
3 - 4           90.42             263.25          436.08   ***
1 - 2         -449.33            -276.50         -103.67   ***
1 - 3         -232.08             -59.25          113.58
1 - 4           31.17             204.00          376.83   ***
4 - 2         -653.33            -480.50         -307.67   ***
4 - 3         -436.08            -263.25          -90.42   ***
4 - 1         -376.83            -204.00          -31.17   ***

(c) Tukey's Studentized Range (HSD) Test for variable: Y
NOTE: This test controls the type I experimentwise error rate.
Alpha= 0.05  Confidence= 0.95  df= 12  MSE= 12584.15
Critical Value of Studentized Range= 4.199
Minimum Significant Difference= 235.5
Comparisons significant at the
0.05 level are indicated by '***'.

LEVEL        Simultaneous Lower   Difference      Simultaneous Upper
Comparison   Confidence Limit     Between Means   Confidence Limit
2 - 3          -18.25               217.25          452.75
2 - 1           41.00               276.50          512.00   ***
2 - 4          245.00               480.50          716.00   ***
3 - 2         -452.75              -217.25           18.25
3 - 1         -176.25                59.25          294.75
3 - 4           27.75               263.25          498.75   ***
1 - 2         -512.00              -276.50          -41.00   ***
1 - 3         -294.75               -59.25          176.25
1 - 4          -31.50               204.00          439.50
4 - 2         -716.00              -480.50         -245.00   ***
4 - 3         -498.75              -263.25          -27.75   ***
4 - 1         -439.50              -204.00           31.50

Problem 12.15
(a) In this problem $n = 24$, $I = 3$, $n - I = 21$, $J = 8$, $MSE = 10.7262$. The confidence intervals are:
$$\bar{Y}_{1\cdot} \pm t_{n-I}(\alpha/2)\sqrt{MSE/J} = 15.75 \pm 2.4085 = [13.3415, 18.1585]$$
$$\bar{Y}_{2\cdot} \pm t_{n-I}(\alpha/2)\sqrt{MSE/J} = 14.625 \pm 2.4085 = [12.2165, 17.0335]$$
$$\bar{Y}_{3\cdot} \pm t_{n-I}(\alpha/2)\sqrt{MSE/J} = 17.625 \pm 2.4085 = [15.2165, 20.0335]$$
(b) The three means arranged in increasing order are: $\bar{y}_{2\cdot} < \bar{y}_{1\cdot} < \bar{y}_{3\cdot}$, that is, $14.625 < 15.75 < 17.625$. HSD = 4.1275. There are no significant differences among the means.

Problem 12.17
(b) Tukey's Studentized Range (HSD) Test for variable: POUNDS
NOTE: This test controls the type I experimentwise error rate.
Alpha= 0.05  Confidence= 0.95  df= 22  MSE= 12.91694
Critical Value of Studentized Range= 3.553
Comparisons significant at the 0.05 level are indicated by '***'.

LEVEL        Simultaneous Lower   Difference      Simultaneous Upper
Comparison   Confidence Limit     Between Means   Confidence Limit
3 - 1          -2.248                2.035           6.318
3 - 2           1.667                6.339          11.012   ***
1 - 3          -6.318               -2.035           2.248
1 - 2          -0.145                4.304           8.754
2 - 3         -11.012               -6.339          -1.667   ***
2 - 1          -8.754               -4.304           0.145

Problem 12.19
$$E(\hat{\theta}) = \sum c_i E(\bar{Y}_{i\cdot}) = \sum c_i \mu_i = \theta$$
$$V(\hat{\theta}) = \sum c_i^2 V(\bar{Y}_{i\cdot}) = \sum c_i^2 \sigma^2/J = \sigma^2 \sum \frac{c_i^2}{J}$$

Problem 12.21
Dependent Variable: Y   Resistivities

Source            DF   Sum of Squares   Mean Square   F Value
Model              5        261.50648       52.3013   69.0108
Error             12          9.09446        0.7579
Corrected Total   17        270.60094

$Pr(F_{5,12} > 69.0108) < 0.0001$. We reject $H_0$. Components of variance: $\hat{\sigma}_A^2 = (52.3013 - 0.7579)/3 = 17.1811$, and $\hat{\sigma}^2 = 0.7579$.

Problem 12.23
(a) Dependent Variable: X
Analysis of Variance Procedure

Source            DF   Sum of Squares    Mean Square   F Value   Pr > F
Model              3      33.45833333    11.15277778      2.41   0.0969
Error             20      92.50000000     4.62500000
Corrected Total   23     125.95833333

Looking at the ANOVA table we see that the F ratio is 2.41 and the P-value is $Prob(F_{3,20} > 2.41) = 0.0969 > 0.05$. Consequently we do not reject $H_0$.
(b) The components of variance are $\hat{\sigma}_A^2 = 1.0880$ and $\hat{\sigma}^2 = 4.625$.

Problem 12.25
(a) $E(Y_{ij}) = \mu + \alpha_i$, $j = 1, \ldots, J$; therefore, $E(\bar{Y}_{i\cdot}) = \mu + \alpha_i$.
(b) Since $E(\bar{Y}_{\cdot\cdot}) = \mu$ it follows from part (a) that
$$E(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) = E(\bar{Y}_{i\cdot}) - E(\bar{Y}_{\cdot\cdot}) = \mu + \alpha_i - \mu = \alpha_i.$$

Chapter 13 Design and Analysis of Multi-Factor Experiments

Problem 13.1

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2      219.5000000   109.7500000      0.49   0.6299
Error              9     2029.5000000   225.5000000
Corrected Total   11     2249.0000000

Conclusions: Ignoring the genetic strain sharply increases the unexplained variability from SSE = 11.83 to SSE = 2029.50. Consequently, we do not reject the null hypothesis that the treatment means are equal.

Problem 13.3
(a) Grand mean is $\hat{\mu} = 102.56$. The treatment means are: $\hat{\mu}_1 = 97.67$, $\hat{\mu}_2 = 103.33$, $\hat{\mu}_3 = 106.67$. Block means are: 100.67, 100.33, 101.00, 104.00, 103.00, 106.33.
(b)

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
A                  2      248.4444444   124.2222222     48.61   0.0001
B                  5       82.4444444    16.4888889      6.45   0.0063
Error             10       25.5555556     2.5555556
Corrected Total   17      356.4444444

Problem 13.5
(a) There are three treatments corresponding to the three configurations. There are five blocks corresponding to the five workloads.
(b)

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
A                  2      12857.20000    6428.60000    217.18   0.0001
B                  4        308.40000      77.10000      2.60   0.1161
Error              8        236.80000      29.60000
Corrected Total   14      13402.40000

(c) The three estimated treatment means are 51.0, 52.0, 113.6. HSD = 9.8322 and MSE = 29.6. Consequently, the mean execution time for the no cache memory is significantly different from the first two mean execution times. The difference in execution times between the two cache memory and the one
cache memory is not significant.

Problem 13.7
(a) In this case the blocking variable is the soil type (B) and the furnace slags (A) are the treatments. Looking at the ANOVA table we see that the P-value of the F ratio is 0.2457, so the differences among the treatments do not appear to be significant.

Source            DF   Sum of Squares   Mean Square    F Value   Pr > F
SLAGS              6      731.0580952    121.8430159      1.54   0.2457
SOIL               2     5696.3400000   2848.1700000     36.07   0.0001
Error             12      947.433333      78.952778
Corrected Total   20     7374.831429

W = 0.940208, Prob < W = 0.22. The hypothesis of normality is not rejected.
Conclusions: The graph indicates that there is a treatment-block interaction.

Problem 13.9
(a)

Source              DF   SS           Mean Square   F Value   Pr > F
FUNGCIDE             1   1.29323105   1.29323105      18.41   0.0001
PESTCIDE             2   0.64873201   0.32436600       4.62   0.0154
FUNGCIDE*PESTCIDE    2   0.71247627   0.35623813       5.07   0.0106
Error               42   2.94960161   0.07022861
Corrected Total     47   5.60404094

Conclusions: Each of the main effects and interactions is significant. No improvement is discernible.

Problem 13.11
We only give the proof that $\bar{Y}_{i\cdot\cdot}$ is an unbiased estimator for $\mu_{i\cdot}$; the proof for $\bar{Y}_{\cdot j \cdot}$ is similar. Since $E(Y_{ijk}) = \mu_{ij}$ $(k = 1, \ldots, n)$ it follows that
$$E(\bar{Y}_{i\cdot\cdot}) = \frac{1}{bn}\sum_j \sum_k E(Y_{ijk}) = \frac{1}{bn}\sum_j n\mu_{ij} = \frac{1}{b}\sum_j \mu_{ij} = \mu_{i\cdot}.$$

Problem 13.13
Using the facts that $\sum_{i=1}^{a} \alpha_i = 0$ and $\mu_{\cdot j} = \mu + \beta_j$ we see that
$$\sum_{i=1}^{a} (\alpha\beta)_{ij} = \sum_{i=1}^{a} (\mu_{ij} - \mu - \alpha_i - \beta_j) = a(\mu_{\cdot j} - \mu - \beta_j) = 0.$$
The proof that $\sum_j (\alpha\beta)_{ij} = 0$ is similar and is therefore omitted.

Problem 13.15

Estimated Cell Means
                        Factor (B)                  Row means
Factor (A)      5340.00   4990.00   4815.00     5048.33
                5045.00   5345.00   4515.00     4968.33
                5675.00   5415.00   4917.50     5335.83
Column means    5353.33   5250.00   4749.17     ȳ = 5117.50

Source            DF   Type I SS       Mean Square   F Value   Pr > F
GAUGER             2     896450.000     448225.000      1.63   0.2142
BREAKER            2    2506116.667    1253058.333      4.56   0.0196
GAUGER*BREAKER     4     663833.333     165958.333      0.60   0.6629
Error             27    7415475.000     274647.222
Corrected Total   35   11481875.000

Problem 13.17
(a)
Estimated Cell Means
                       Factor (B)
Factor (A)       40       60       80      100     Row means
50             18.5     18.5     23.0     27.5     21.875
75             10.5     15.5     14.5     29.0     17.375
100            14.0     19.5     24.0     26.5     21.0
125            19.0     22.0     22.5     30.0     23.375
Column means   15.5     18.875   21.0     28.25    ȳ = 20.9065

(c) Dependent Variable: WARPING

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
TEMP               3      156.0937500    52.0312500      7.67   0.0021
COPPER             3      698.3437500   232.7812500     34.33   0.0001
TEMP*COPPER        9      113.7812500    12.6423611      1.86   0.1327
Error             16      108.5000000     6.7812500
Corrected Total   31     1076.7187500

Problem 13.19
(a) $\dfrac{L_{X1}}{8} = 0.352$, $\dfrac{L_{X8}}{8} = -0.5658$, $\dfrac{L_{X1X8}}{8} = 0.125$
(b) $SS_{X1} = \dfrac{2.817^2}{16} = 0.496$, $SS_{X8} = \dfrac{(-4.526)^2}{16} = 1.280$, $SS_{X1X8} = \dfrac{1^2}{16} = 0.0625$
(c) The sample mean of Y at one level of X8 is −0.3656; the sample mean of Y at the other level is −0.9313. Setting the nozzle position at the level of X8 with sample mean −0.9313 produces a smaller variance, since exp(−0.9313) < exp(−0.3656).

Problem 13.21
(a)
X4 = 0, X8 = 0: 14.821, 14.757, 14.415, 14.932
X4 = 1, X8 = 0: 13.880, 13.860, 13.972, 13.907
X4 = 0, X8 = 1: 14.888, 14.921, 14.843, 14.878
X4 = 1, X8 = 1: 14.037, 14.165, 14.032, 13.914
(b) Dependent Variable: Y1   Epitaxial thickness

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
X8                 1       0.08037225    0.08037225      5.05   0.0441
X4                 1       2.79558400    2.79558400    175.82   0.0001
X8*X4              1       0.00036100    0.00036100      0.02   0.8827
Error             12       0.19080650    0.01590054
Corrected Total   15       3.06712375

(c)
X4 = 0, X8 = 0: −0.4425, −0.3267, −0.3131, −0.2292
X4 = 1, X8 = 0: −0.6505, −0.4969, −0.3467, −0.1190
X4 = 0, X8 = 1: −1.1989, −0.6270, −0.4369, −0.6154
X4 = 1, X8 = 1: −1.4307, −1.4230, −0.8663, −0.8625
(d) Dependent Variable: Y2   Log of s-square

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
X8                 1       1.28034883    1.28034883     18.55   0.0010
X4                 1       0.24897605    0.24897605      3.61   0.0818
X8*X4              1       0.12122583    0.12122583      1.76   0.2097
Error             12       0.82812271    0.06901023
Corrected Total   15       2.47867342

(e) Yes. Deposition time (X4) is significant for Y1 (P-value of F ratio is < 0.0001) but not for Y2 (P-value of F ratio is 0.0818).

Problem 13.23

Contrasts    SS_effect   SS_effect/SST   SS_effect/SST (complete factorial)
l_A = 339    9576.75     0.4588          0.4463
l_B = 313    8164.083    0.3911          0.4463
l_C = 183    2790.750    0.1337          0.0735

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
A                  1      9576.750000   9576.750000    224.46   0.0001
B                  1      8164.083333   8164.083333    191.35   0.0001
C                  1      2790.750000   2790.750000     65.41   0.0001
Error              8       341.33333      42.66667
Corrected Total   11     20872.91667

These results are consistent with the full factorial model, which indicates that the factors A, B, and C are significant.

Problem 13.25

Contrasts     SS_effect   SS_effect/SST   SS_effect/SST (complete factorial)
l_A = 37.5    175.78      0.8953          0.5302
l_B = 10.3    13.26       0.0675          0.0054
l_C = 3.1     1.20        0.0061          0.0023

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
A                  1      175.7812500   175.7812500    115.55   0.0004
B                  1       13.2612500    13.2612500      8.72   0.0419
C                  1        1.2012500     1.2012500      0.79   0.4244
Error              4        6.0850000     1.5212500
Corrected Total    7      196.3287500

In the full factorial model A, A*C, and B*C are significant while B is not. In the fractional replicate, B is significant because it is aliased with AC, which is significant. C is not significant in the fractional model because it is aliased with AB, which is not significant in the full factorial. So the results of the fractional factorial model are not consistent with the results of the full factorial.

Problem 13.27
The patterns of plus and minus signs for the A, B, C, and D factors are
A: −, +, −, +, −, +, −, +, −, +, −, +, −, +, −, +
B: −, −, +, +, −, −, +, +, −, −, +, +, −, −, +, +
C: −, −, −, −, +, +, +, +, −, −, −, −, +, +, +, +
D: −, −, −, −, −, −, −, −, +, +, +, +, +, +, +, +
The pattern of ± signs for the ABCD interaction is the product of these rows. Thus, one block contains {1, ab, ac, bc, ad, bd, cd, abcd} and the other block contains {a, b, c, abc, d, abd, acd, bcd}.

Chapter 14 Statistical Quality Control

Problem 14.1
(a) $LCL_{\bar{X}} = 13.9$, $UCL_{\bar{X}} = 15.1$
(b) $LCL_R = 0$, $UCL_R = 1.88$
(c) $P(|\bar{X} - 14.5| > 0.5) = 0.0155$

Problem 14.3
Assume $\mu = \mu_0 + \delta\sigma$. Then
$$P(|\bar{X} - \mu_0| > 3\sigma/\sqrt{n}) = P(\bar{X} - \mu_0 > 3\sigma/\sqrt{n}) + P(\bar{X} - \mu_0 < -3\sigma/\sqrt{n})$$
$$= P\left(\frac{\bar{X} - \mu_0 - \delta\sigma}{\sigma/\sqrt{n}} > \frac{3\sigma/\sqrt{n} - \delta\sigma}{\sigma/\sqrt{n}}\right) + P\left(\frac{\bar{X} - \mu_0 - \delta\sigma}{\sigma/\sqrt{n}} < \frac{-3\sigma/\sqrt{n} - \delta\sigma}{\sigma/\sqrt{n}}\right)$$
$$= P(Z > 3 - \delta\sqrt{n}) + P(Z < -3 - \delta\sqrt{n})$$
$$= \Phi(-3 - \delta\sqrt{n}) + (1 - \Phi(3 - \delta\sqrt{n})).$$

Problem 14.5
(a) Forty-two rheostats are non-conforming, that is, approximately 31% failed to meet the specifications.
(b) $LCL_{\bar{X}} = 135.67$, $UCL_{\bar{X}} = 145.62$, $\bar{\bar{x}} = 140.64$; $LCL_R = 0$, $UCL_R = 18.2$, $\bar{R} = 8.6$. The $\bar{x}$ and R control charts do not show a lack of control.

Problem 14.7
(a) $LCL_p = 0$, $UCL_p = 0.0794$
(b) $P(\hat{p} > 0.0794 \mid p = 0.04) = 0.0778$. T has a geometric distribution with parameter p = 0.0778. E(T) = 12.85.

Problem 14.9
$LCL_c = 0$, $UCL_c = 14.702$. The maximum number of observed defects is 12, so the control chart does not detect a lack of control.

... special relativity theory as correct. This demonstrates that the validity of a scientific experiment strongly depends upon the theoretical framework within which the data are collected, analyzed and...

... using the χ² (chi-square) statistic, which we will study in Chapter 9. A large value of χ² provides strong evidence against the model. Fisher noted that the χ² values for Mendel's numerous experiments
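
The χ² statistic mentioned above compares observed counts with the counts a model predicts. The following Python sketch illustrates the computation under stated assumptions: the four counts are the figures commonly quoted for one of Mendel's dihybrid crosses and are not taken from this text, the 9:3:3:1 ratio is the model being tested, and the function names `chi_square_statistic` and `chi_square_sf_3df` are hypothetical helpers written for this illustration.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_sf_3df(x):
    """P(chi-square with 3 degrees of freedom > x).

    Closed form that holds only for 3 degrees of freedom:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Assumed illustrative counts for four seed classes (not from this text).
observed = [315, 108, 101, 32]
ratios = [9, 3, 3, 1]  # model: classes occur in the ratio 9:3:3:1

n = sum(observed)
expected = [n * r / sum(ratios) for r in ratios]

chi2 = chi_square_statistic(observed, expected)
# Four categories and no estimated parameters give 4 - 1 = 3 degrees of freedom.
p_value = chi_square_sf_3df(chi2)

print(f"chi-square = {chi2:.3f}, P(chi-square > observed) = {p_value:.3f}")
```

With these assumed counts the statistic is small, meaning the observed counts lie close to the model's predictions; a large value would instead count as evidence against the model.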