Linear Models in Statistics

DOCUMENT INFORMATION

Basic information

Format
Pages: 679
Size: 5.9 MB
Attachment: 133. Linear Models in Statistics.rar (5 MB)

Content

LINEAR MODELS IN STATISTICS

Second Edition

Alvin C. Rencher and G. Bruce Schaalje
Department of Statistics, Brigham Young University, Provo, Utah

Copyright © 2008 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Wiley Bicentennial Logo: Richard J. Pacifico

Library of Congress Cataloging-in-Publication Data:

Rencher, Alvin C., 1934–
  Linear models in statistics / Alvin C. Rencher, G. Bruce Schaalje. – 2nd ed.
    p. cm.
  Includes bibliographical references.
  ISBN 978-0-471-75498-5 (cloth)
  1. Linear models (Statistics) I. Schaalje, G. Bruce. II. Title.
  QA276.R425 2007
  519.5′35–dc22
  2007024268

Printed in the United States of America

CONTENTS

Preface

1 Introduction
  1.1 Simple Linear Regression Model
  1.2 Multiple Linear Regression Model
  1.3 Analysis-of-Variance Models

2 Matrix Algebra
  2.1 Matrix and Vector Notation
    2.1.1 Matrices, Vectors, and Scalars
    2.1.2 Matrix Equality
    2.1.3 Transpose
    2.1.4 Matrices of Special Form
  2.2 Operations
    2.2.1 Sum of Two Matrices or Two Vectors
    2.2.2 Product of a Scalar and a Matrix
    2.2.3 Product of Two Matrices or Two Vectors
    2.2.4 Hadamard Product of Two Matrices or Two Vectors
  2.3 Partitioned Matrices
  2.4 Rank
  2.5 Inverse
  2.6 Positive Definite Matrices
  2.7 Systems of Equations
  2.8 Generalized Inverse
    2.8.1 Definition and Properties
    2.8.2 Generalized Inverses and Systems of Equations
  2.9 Determinants
  2.10 Orthogonal Vectors and Matrices
  2.11 Trace
  2.12 Eigenvalues and Eigenvectors
    2.12.1 Definition
    2.12.2 Functions of a Matrix
    2.12.3 Products
    2.12.4 Symmetric Matrices
    2.12.5 Positive Definite and Semidefinite Matrices
  2.13 Idempotent Matrices
  2.14 Vector and Matrix Calculus
    2.14.1 Derivatives of Functions of Vectors and Matrices
    2.14.2 Derivatives Involving Inverse Matrices and Determinants
    2.14.3 Maximization or Minimization of a Function of a Vector

3 Random Vectors and Matrices
  3.1 Introduction
  3.2 Means, Variances, Covariances, and Correlations
  3.3 Mean Vectors and Covariance Matrices for Random Vectors
    3.3.1 Mean Vectors
    3.3.2 Covariance Matrix
    3.3.3 Generalized Variance
    3.3.4 Standardized Distance
  3.4 Correlation Matrices
  3.5 Mean Vectors and Covariance Matrices for Partitioned Random Vectors
  3.6 Linear Functions of Random Vectors
    3.6.1 Means
    3.6.2 Variances and Covariances

4 Multivariate Normal Distribution
  4.1 Univariate Normal Density Function
  4.2 Multivariate Normal Density Function
  4.3 Moment Generating Functions
  4.4 Properties of the Multivariate Normal Distribution
  4.5 Partial Correlation

5 Distribution of Quadratic Forms in y
  5.1 Sums of Squares
  5.2 Mean and Variance of Quadratic Forms
  5.3 Noncentral Chi-Square Distribution
  5.4 Noncentral F and t Distributions
    5.4.1 Noncentral F Distribution
    5.4.2 Noncentral t Distribution
  5.5 Distribution of Quadratic Forms
  5.6 Independence of Linear Forms and Quadratic Forms

6 Simple Linear Regression
  6.1 The Model
  6.2 Estimation of β0, β1, and σ²
  6.3 Hypothesis Test and Confidence Interval for β1
  6.4 Coefficient of Determination

7 Multiple Regression: Estimation
  7.1 Introduction
  7.2 The Model
  7.3 Estimation of β and σ²
    7.3.1 Least-Squares Estimator for β
    7.3.2 Properties of the Least-Squares Estimator β̂
    7.3.3 An Estimator for σ²
  7.4 Geometry of Least-Squares
    7.4.1 Parameter Space, Data Space, and Prediction Space
    7.4.2 Geometric Interpretation of the Multiple Linear Regression Model
  7.5 The Model in Centered Form
  7.6 Normal Model
    7.6.1 Assumptions
    7.6.2 Maximum Likelihood Estimators for β and σ²
    7.6.3 Properties of β̂ and σ̂²
  7.7 R² in Fixed-x Regression
  7.8 Generalized Least-Squares: cov(y) = σ²V
    7.8.1 Estimation of β and σ² when cov(y) = σ²V
    7.8.2 Misspecification of the Error Structure
  7.9 Model Misspecification
  7.10 Orthogonalization

8 Multiple Regression: Tests of Hypotheses and Confidence Intervals
  8.1 Test of Overall Regression
  8.2 Test on a Subset of the β Values
  8.3 F Test in Terms of R²
  8.4 The General Linear Hypothesis Tests for H0: Cβ = 0 and H0: Cβ = t
    8.4.1 The Test for H0: Cβ = 0
    8.4.2 The Test for H0: Cβ = t
  8.5 Tests on βj and a′β
    8.5.1 Testing One βj or One a′β
    8.5.2 Testing Several βj or a′iβ Values
  8.6 Confidence Intervals and Prediction Intervals
    8.6.1 Confidence Region for β
    8.6.2 Confidence Interval for βj
    8.6.3 Confidence Interval for a′β
    8.6.4 Confidence Interval for E(y)
    8.6.5 Prediction Interval for a Future Observation
    8.6.6 Confidence Interval for σ²
    8.6.7 Simultaneous Intervals
  8.7 Likelihood Ratio Tests

9 Multiple Regression: Model Validation and Diagnostics
  9.1 Residuals
  9.2 The Hat Matrix
  9.3 Outliers
  9.4 Influential Observations and Leverage

10 Multiple Regression: Random x's
  10.1 Multivariate Normal Regression Model
  10.2 Estimation and Testing in Multivariate Normal Regression
  10.3 Standardized Regression Coefficients
  10.4 R² in Multivariate Normal Regression
  10.5 Tests and Confidence Intervals for R²
  10.6 Effect of Each Variable on R²
  10.7 Prediction for Multivariate Normal or Nonnormal Data
  10.8 Sample Partial Correlations

11 Multiple Regression: Bayesian Inference
  11.1 Elements of Bayesian Statistical Inference
  11.2 A Bayesian Multiple Linear Regression Model
    11.2.1 A Bayesian Multiple Regression Model with a Conjugate Prior
    11.2.2 Marginal Posterior Density of β
    11.2.3 Marginal Posterior Densities of τ and σ²
  11.3 Inference in Bayesian Multiple Linear Regression
    11.3.1 Bayesian Point and Interval Estimates of Regression Coefficients
    11.3.2 Hypothesis Tests for Regression Coefficients in Bayesian Inference
    11.3.3 Special Cases of Inference in Bayesian Multiple Regression Models
    11.3.4 Bayesian Point and Interval Estimation of σ²
  11.4 Bayesian Inference through Markov Chain Monte Carlo Simulation
  11.5 Posterior Predictive Inference

12 Analysis-of-Variance Models
  12.1 Non-Full-Rank Models
    12.1.1 One-Way Model
    12.1.2 Two-Way Model
  12.2 Estimation
    12.2.1 Estimation of β
    12.2.2 Estimable Functions of β
  12.3 Estimators
    12.3.1 Estimators of λ′β
    12.3.2 Estimation of σ²
    12.3.3 Normal Model
  12.4 Geometry of Least-Squares in the Overparameterized Model
  12.5 Reparameterization
  12.6 Side Conditions
  12.7 Testing Hypotheses
    12.7.1 Testable Hypotheses
    12.7.2 Full-Reduced-Model Approach
    12.7.3 General Linear Hypothesis
  12.8 An Illustration of Estimation and Testing
    12.8.1 Estimable Functions
    12.8.2 Testing a Hypothesis
    12.8.3 Orthogonality of Columns of X

13 One-Way Analysis-of-Variance: Balanced Case
  13.1 The One-Way Model
  13.2 Estimable Functions
  13.3 Estimation of Parameters
    13.3.1 Solving the Normal Equations
    13.3.2 An Estimator for σ²
  13.4 Testing the Hypothesis H0: μ1 = μ2 = … = μk
    13.4.1 Full-Reduced-Model Approach
    13.4.2 General Linear Hypothesis
  13.5 Expected Mean Squares
    13.5.1 Full-Reduced-Model Approach
    13.5.2 General Linear Hypothesis
of variance Communications in Statistics—Series A, Theory and Methods 19, 4807–4832 Urquhart, N S., D L Weeks, and C R Henderson (1973) Estimation associated with linear models: A revisitation Communications in Statistics 1, 303–330 Verbeke, G and G Molenberghs (2000) Linear Mixed Models for Longitudinal Data Springer-Verlag Wald, A (1943) Tests of statistical hypotheses concerning several parameters when the number of observations is large Transactions of the American Mathematical Society 54, 426 –483 Wang, S G and S C Chow (1994) Advanced Linear Models: Theory and Applications New York: Marcel Dekker Weisberg, S (1985) Applied Linear Regression New York: Wiley Welsch, R E (1975) Confidence regions for robust regression Paper presented at Annual Meeting of the American Statistical Association, Washington, DC Winer, B J (1971) Statistical Principles in Experimental Design (2nd ed.) New York: McGraw-Hill Working, H and H Hotelling (1929) Application of the theory of error to the interpretation of trends Journal of the American Statistical Association, Suppl (Proceedings) 24, 73– 85 Yates, F (1934) The analysis of multiple classifications with unequal numbers in the different classes Journal of the American Statistical Association 29, 52– 66 Index Adjusted R 2, 162 Alias matrix, 170 Analysis of covariance, 443 –478 assumptions, 443 –444 covariates, 444 estimation, 446 –448 model, 444 –445 one-way model with one covariate, 449 –451 estimation of parameters, 449 –450 model, 449 testing hypotheses, 448, 450 –451 equality of treatment effects, 450 –452 homogeneity of slopes, 452 –456 interpretation, 456 slope, 452 one-way model with multiple covariates, 464 –472 estimation of parameters, 465 –468 model, 464 –465 testing hypotheses, 468 –469 equality of treatment effects, 468 –469 homogeneity of slope vectors, 470 –472 slope vector, 470 power, 444 testing hypotheses, 448 two-way model with one covariate, 457 –464 model, 457 testing hypotheses, 458 –464 homogeneity of 
slopes, 463 –464 main effects and interactions, 458 –462 slope, 462 unbalanced models, 473 –474 cell means model, 473 constrained model, 473 –474 Analysis of variance, 295–338 estimability of b in the empty cells model, 432, 434– 435 estimability of b in the non-full-rank model, 302–304 estimable functions l0 b, 305–308 conditions for estimability of l0 b, 305–307 estimators of l0 b, 309–313 BLUE properties of, 313 covariance of, 312 variance of, 311 estimation of s in the non-full-rank model, 313–314 model, –4, 295–301 one-way See One-way model two-way See Two-way model normal equations, 302–303 solution using generalized inverse, 302–303 normal model, 314–316 estimators of b and s2, 314–315 properties of, 316 and regression, reparameterization to full-rank model, 318–320 side conditions, 320–322, 433 SSE in the non-full-rank model, 313–314 testable hypotheses, 323–324 testable hypotheses in the empty cells model, 433 testing hypotheses, 323– 329 full and reduced model, 324–326 general linear hypothesis, 326–329 treatments or natural groupings of units, unbalanced data See Unbalanced data in ANOVA Angle between two vectors, 41–42, 136, 163, 238 Linear Models in Statistics, Second Edition, by Alvin C Rencher and G Bruce Schaalje Copyright # 2008 John Wiley & Sons, Inc 663 664 INDEX Asymptotic inference for large samples, 260 –262, 491, 515 Augmented matrix, 29 Bayes’ theorem, 278 –279 Bayesian linear model, 279 –284, 480 Bayesian linear mixed model, 497 Best linear predictor, 499 Best linear unbiased estimators (BLUE), 147, 165, 313 Best quadratic unbiased estimators, 151, 486 Beta weights, 251 BIC See Information criterion BLUE See Best linear unbiased estimators Causality, 3, 130 –131, 443 Chi-square distribution, 112– 114 central chi-square, 112 moment-generating function, 112 –113 noncentral chi-square, 112 –114 noncentrality parameter, 112, 124 Cluster correlation, 479–480, 481 –485 Coefficient of determination in multiple regression, 161– 164 in simple linear 
regression, 133 –134 Coefficient(s), regression, 2, 127 Conditional density, 73, 95 –99, 278 –284, 498 –499 Confidence interval(s) for b1 in simple linear regression, 133 in Bayesian regression, 278, 285 in linear mixed models, 491, 495 in multiple regression See Regression, multiple linear with fixed x’s, confidence interval(s) in random-x regression, 261 –262 Contrasts, 308, 341, 357 –371 Control of output, Correlation bivariate, 134 Correlation matrix (matrices) population, 77– 78 relationship to covariance matrix, 77–78 sample, 247 relationship to covariance matrix, 247– 248 Covariance matrix (matrices) for bˆ, 145 for partitioned random vector, 78 population, 75– 76 sample, 156, 246 –247 for two random vectors, 82 Data space, 153, 163, 316–317 Dependent variable, 1, 137, 295 Derivative, matrix and vector, 56 –59, 91, 109, 142, 158, 495 Determinant, 37–41 Determination, coefficient of See Coefficient of determination Diagnostics, regression, 227–238 also Hat matrix; Influential observations; Outliers; Residual(s) Diagonal matrix, DIC See Information criterion Distance Mahalanobis, 77 standardized, 77 Distribution(s) chi-square, 112–114 F, 114–116 gamma, 280 inverse gamma, 284 multivariate t, 282–283, 285 normal See Normal distribution t, 216, 283 Effect of each variable on R 2, 262–265 Eigenvalues See Matrix, eigenvalues Eigenvectors See Matrix, eigenvectors Empty cells, 432–439 Error sum of squares See SSE Error term, 1, 137 Estimated best linear unbiased predictor, 499 Estimated generalized least squares estimation, 490 Exchangeability, 277 Expected mean squares, 173– 174, 179, 182, 312–317, 362– 367, 433 Expected value of bilinear form [E(x0 Ay)], 111 of least squares estimators, 131–132 of quadratic form [E(y0 Ay)], 107 of R 2, 162 of random matrix, 75 –76 of random variable [E( y)], 70 of random vector [E( y)], 75–76 of sample covariance [E(sxy)], 112 of sample variance [E(s 2)], 108, 131, 150 of sum of random variables, 70 of sum of random vectors, 75–76 
Exponential family, 514 INDEX F-Distribution, 114 –116 central F, 114 mean of central F, 115 noncentral F, 115 noncentrality parameter, 115 variance of central F, 115 F-Tests See also Regression, multiple linear with fixed x’s, tests of hypotheses; Tests of hypotheses general linear hypothesis test, 198–203 for overall regression, 185 power, 115 subset of the b’s, 189 False discovery rate, 206 First order multivariate Taylor series, 495 Fixed effects models, 480 Gauss-Markov theorem, 146– 147, 276 See also Best linear unbiased estimators Generalized least squares, 164 –169, 285 –286, 479, 503 Generalized linear models, 513 –516 exponential family, 514 likelihood function, 512 linear predictor, 513 –514 link function, 514 model, 514 Generalized inverse, 32 –37, 302 –303, 343, 384 of symmetric matrix, 33 Generalized variance, 77, 88–89 Geometry of least squares, 151 –154, 163, 316 –317 angle between two vectors, 163 prediction space, 153 –154, 163, 316 –317 data space, 153, 163, 316 –317 parameter space, 152, 154, 316 –317 Gibbs sampling, 289, 291 Hadamard product, 16, 425 Hat matrix, 230 –231 Hessian matrix, 495 Highest density interval, 279, 285 Hyperprior distribution, 280, 287 Hypothesis tests See Tests of hypotheses Idempotent matrix for chi-square distribution, 117– 118 definition and properties, 54 –55 in linear mixed models, 487 Identity matrix, 665 Independence of contrasts, 358–362 independence and zero covariance, 93–94 of linear functions and quadratic forms, 119–120 of quadratic forms, 120–121 of random variables, 71, 94 of random vectors, 93, 94 of SSR and SSE, 187 Influential observations, 235–238 Cook’s distance, 236– 237 leverage, 236 Information criterion, 286 Iterative methods for finding estimates, 490 Invariance of F, 149, 200 of maximum likelihood estimators, 247–248 of R 2, 149 of s 2, 149 of t, 149 of yˆ, 148–149 Inverse matrix See Matrix, inverse j vector, J matrix, Kenward –Roger adjustment, 496–497 Lagrange multiplier, 60, 68, 179, 201, 
220, 223, 429
Least squares, 128, 131, 141, 143, 145–151, 302, 507
  properties of estimators, 129–133, 143, 145–147
Likelihood function, 158, 513–514
Likelihood ratio tests, 258–262
Linear estimator, 143. See also Best linear unbiased estimators
Linear mixed model, 480
  randomized blocks, 481–482
  subsampling, 482
  split plot studies, 483–484, 492–494
  one-way random effects, 484, 489
  random coefficients, 484–485
  heterogeneous variances, 485–486
Linear model, 2, 137
Linear models, generalized. See Generalized linear models
Logistic regression, 508–511
  binary y, 508
  estimation, 510
  logit transformation, 509
  model, 509–510
  polytomous model, 511
    categorical, 511
    ordinal, 511
  several x's, 510
Logit transformation, 509
Loglinear models, 511–512
  contingency table, 511
  likelihood ratio test, 512
  maximum likelihood estimators, 512
LSD test, 209
Mahalanobis distance, 77
Markov Chain Monte Carlo, 288–289, 291–292
Matrix (matrices), –68
  addition of, –10
  algebra of, 5–60
  augmented matrix, 29
  bilinear form, 16
  Cholesky decomposition, 27
  conditional inverse, 33
  conformable matrices,
  definition,
  derivatives, 56–58
  determinant, 37–41
    of partitioned matrix, 38–40
  diagonal of a matrix,
  diagonal matrix,
  diagonalizing a matrix, 52
  differentiation, 56–57
  eigenvalues, 46–53, 496
    characteristic equation, 47
    and determinant, 51–52
    of functions of a matrix, 49–50
    of positive definite matrix, 53
      square root matrix, 53
    of product, 50–53
    of symmetric matrix, 51
    and trace, 51
  eigenvectors, 46–47, 496
  equality,
  generalized inverse, 32–37, 302, 343, 384, 391–395
    of symmetric matrix, 36
  Hadamard product, 16, 425
  idempotent matrix, 54
    and eigenvalues, 54
  identity matrix,
  inverse, 21–23
    conditional inverse, 33
    generalized inverse, 32–37
    of partitioned matrix, 23–24
    of product, 22
  j vector,
  J matrix,
  multiplication of, 10
    conformal matrices, 10
  nonsingular matrix, 21
  notation,
  O (zero matrix),
  orthogonal matrix, 41–43
  partitioned matrix, 16–18
    multiplication of, 17
  positive definite matrix, 24–28
  positive semidefinite matrix, 25–28
  product, 10
    commutativity, 10
    as linear combination of columns, 17
    matrix and diagonal matrix, 16
    matrix and j, 12
    matrix and scalar, 10
    product equal to zero, 20
    rank of product, 21
  quadratic form, 16. See also Quadratic form(s)
  random matrix, 69
  rank, 19–21. See also Rank of a matrix
  spectral decomposition, 51, 360, 362, 495–496
  square root matrix, 53
  sum of,
  symmetric matrix,
    spectral decomposition, 51
  trace, 44–46
  transpose,
    of product, 13
  triangular matrix,
  vector(s). See Vector(s)
  zero matrix (O) and zero vector (0),
Matrix product. See Matrix, product
Maximum likelihood estimators
  for β and σ² in ANOVA, 315
  for β and σ² in fixed-x regression, 158–159
    properties, 159–161
  for β0, β1, and σ² in random-x regression, 245–248
    properties, 248–249
  invariance of, 249
  in loglinear models, 511
  for partial correlation, 266–268
MCMC. See Markov Chain Monte Carlo
Mean. See also Expected value
  sample mean. See Sample mean
  population mean, 70
Missing at random, 432
Misspecification of cov(y), 167–169. See also Generalized least squares
Misspecification of model, 169–174
  alias matrix, 170
  overfitting, 170–172
  underfitting, 170–172
Model diagnostics, 227–238. See also Hat matrix; Influential observations; Outliers; Residual(s)
Model, linear, 2, 137
Model validation, 227–238. See also Hat matrix; Influential observations; Outliers; Residual(s)
Moment-generating function, 90–92, 96, 99–100, 103–104, 108
Multiple linear regression, 90–92, 108, 112–114, 117–119, 122. See Regression, multiple linear with fixed x's
Multivariate delta method, 495
Multivariate normal distribution, 87–103
  conditional distribution, 95–97
  density function, 88–89
  independence and zero covariance, 93–94
  linear functions of, 89
  marginal distribution, 93
  moment-generating function of, 90–92
  partial correlation, 100–101
  properties of, 92–100
Noncentrality parameter
  for chi-square, 112
  for F, 114, 187, 192, 325
  for t, 116, 132
Nonlinear regression, 507
  confidence intervals, 507
  least squares estimators, 507
  tests of hypotheses, 507
Nonsingular matrix, 21
Normal distribution
  multivariate. See Multivariate normal distribution
  univariate, 87–88
    standard normal, 87
Normalizing constant, 278, 281, 284
O (zero matrix),
One-way model (balanced), 3, 295–298, 339–376
  contrasts, 357–371
    and eigenvectors, 360–362
    hypothesis test for, 344–351
    orthogonal contrasts, 358–371
      independence of, 363–364
    orthogonal polynomial contrasts, 363–371
    partitioning of sum of squares, 360–361
  estimable functions, 340–341
    contrasts, 341
  estimation of σ², 343–344
  expected mean squares, 351–357
    full-reduced-model method, 352–354
    general linear hypothesis method, 354–356
  normal equations, 341–344
    solution using generalized inverse, 343
    solution using side conditions, 342–343
  overparameterized model, 297
    assumptions, 297–298
    parameters not unique, 297
    reparameterization, 298
    side conditions, 298
  SSE, 314
  testing the hypothesis H0: μ1 = μ2 = … = μk, 344–351
    full and reduced model, 344–348
    general linear hypothesis, 348–351
Orthogonal matrix, 41–43
Orthogonal polynomials, 363–371
Orthogonal vectors, 40
Orthogonal x's in regression models, 149, 174–178
Orthogonality of columns of X in balanced ANOVA models, 333–335
Orthogonality of rows of A in unbalanced ANOVA models, 293–296
Orthogonalizing the x's in regression models, 174–178
  and partial regression coefficients, 175–176
Outliers, 232–235
  mean shift outlier model, 235
  PRESS (prediction sum of squares), 235
Overfitting, 170–172
p-Value
  for F-test, 188–189
  for t-test, 132
Parameter space, 152, 154, 316–317
Partial correlation(s), 100–101, 266–273
  matrix of (population) partial correlations, 100–101
  sample partial correlations, 177–178, 266–273
Partial interaction constraints, 434
Poisson distribution, 512
Poisson regression, 512–513
  likelihood function, 513
  model, 513
Polynomials, orthogonal. See Orthogonal polynomials
Positive definite matrix, 24–28
Positive semidefinite matrix, 25–28
Posterior distribution, 278–284
  conditional, 289
  marginal, 282
Posterior predictive distribution, 279, 290–292
Prediction, –3, 137, 142, 148, 156, 161
Precision, 280
Prediction of a random effect, 497–499
Prediction interval, 213–215
Prediction space, 153–154, 163, 316–317
Prediction sum of squares (PRESS), 235
PRESS (prediction sum of squares), 235
Prior distribution, 278–284
  diffuse, 281, 287
  informative, 281
    conjugate, 281, 289
  specification, 280
Projection matrix, 228
Quadratic form(s), 16, 489
  distribution of, 117–118
  expected value of, 107
  idempotent matrix, 106
  independence of, 119–121
  moment-generating function of, 108
  variance of, 108
r in simple linear regression, 133–134
R² (squared multiple correlation), 161–164, 254–257
  effect of each variable on R², 262–265
  fixed x's, 161–164
    adjusted R², 162
    angle between two vectors, 163
    properties of R and R², 162
  random x's, 254–257
    population multiple correlation, 254
      properties, 255
    sample multiple correlation, 256
      properties, 256–257
Random matrix, 69
Random model, 480
Random variable(s), 69
  correlation, 74
  covariance, 71
    and independence, 71–74
  expected value (mean), 70
  independent, 71, 94
  mean (expected value), 70
  standard deviation, 71
  variance, 70
Random vector(s), 69–74
  correlation matrix, 77–78
  covariance matrix, 75–76, 83
  linear functions of, 79–83
    mean of, 80
    variances and covariances of, 81–83
  mean vector, 75–76
  partitioned, 78–79
Random x's in regression. See Regression, random x's
Rank of a matrix, 19–21
  full rank, 19
  rank of product, 20–21
Regression coefficients (β's), 2, 138, 251
  partial regression coefficients, 138
  standardized coefficients (beta weights), 251
Regression, logistic. See Logistic regression
Regression, multiple linear with fixed x's, 2–3, 137–184
  assumptions, 138–139
  centered x's, 154–157
  coefficients. See Regression coefficients
  confidence interval(s)
    for β, 209
    for E(y), 211–212
    for one a′β, 211
    for one βj, 210–211
    for σ², 215
    for several ai′β's, 216–217
    for several βj's, 216
  design matrix, 138
  diagnostics, 227–238. See also Diagnostics, regression
  estimation of β0, β1, …, βk, 141–145
    with centered x's, 154–157
    least squares, 2, 143–144
    maximum likelihood, 158–159
    properties of estimators, 145–149
    with sample covariances, 157
  estimation of σ²
    maximum likelihood estimator, 158–159
    minimum variance unbiased estimator, 158–159
    unbiased estimator, 149–151
    best quadratic unbiased estimator, 151
  generalized least squares, 164–169
  minimum variance estimators, 158–159
  misspecification of error structure, 151–153
  misspecification of model, 169–174. See also Misspecification of model
  model, 137–140
  multiple correlation (R), 161–162
  normal equations, 141–142
  orthogonal x's, 149, 174–178
  orthogonalizing the x's, 174–178
  outliers, 232–235. See also Outliers
  partial regression, 141
  prediction. See Prediction
  prediction equation, 142
  prediction interval, 213–215
  properties of estimators, 145–149
  purposes of, 2–3
  random x's. See Regression, random x's
  residuals, 227–230. See also Residuals
  sufficient statistics, 159–160
  tests of hypotheses
    all possible a′β, 193–194
    expected mean squares, 173–174
    general linear hypothesis test H0: Cβ = 0, 198–203
      estimation under reduced model, 324–326
      full and reduced model, 324–326
    H0: Cβ = t, 203–204
    likelihood ratio tests, 217–221
      distribution of likelihood ratio, 218–219
      likelihood ratio, 218
      for H0: β = 0, 219–220
      for H0: Cβ = 0, 220–221
    linear combination a′β, 204–205
    one βj, 204–205
      F-test, 204–205
      t-test, 205
    overall regression test, 185–189
      in terms of R², 196–198
    several ai′β's, 205
    several βj's
      Bonferroni method, 206–207
      experimentwise error rate, 206
      overall α-level, 206
      Scheffé method, 207–209
    subset of the β's, 189–196
      expected mean squares, 193, 196
      full and reduced model, 190
      noncentrality parameter, 192–193
      quadratic forms, 190–193, 195
      in terms of R², 196
  weighted least squares, 168
  X matrix, 138–139
Regression, nonlinear. See Nonlinear regression
Regression, Poisson. See Poisson regression
Regression, random x's, 243–273
  multivariate normal model, 244
    confidence intervals, 258–262
    estimation of β0, β1, and σ², 245–249
      properties of estimators, 249
    standardized coefficients (beta weights), 251
    in terms of correlations, 249–254
  R², 254–257. See also R², random x's
    effect of each variable on R², 262–265
  tests of hypotheses, 258–262
    comparison with tests for fixed x's, 258
    correlations, tests for, 260–261
    Fisher's z-transformation, 261
    likelihood ratio tests, 258–260
  nonnormal data, 265–266
    estimation of β̂0 and β̂1, 266
  sample partial correlations, 266–273
    maximum likelihood estimators, 268
    other estimators, 269–271
Regression, simple linear (one x), 1, 127–136
  assumptions, 127
  coefficient of determination r², 133–134
  confidence interval for β0, 134
  confidence interval for β1, 132–133
  correlation r, 133–134
    in terms of angle between vectors, 135
  estimation of β0 and β1, 128–129
  estimation of σ², 131–132
  model, 127
  properties of estimators, 131
  test of hypothesis for β0, 119
  test of hypothesis for β1, 132–133
  test of hypothesis for r, 134
Regression sum of squares. See SSR
Regression to the mean, 498
Residual(s), 131, 227–230
  deleted residuals, 234
  externally studentized residual, 234
  hat matrix, 228, 230–232
  in linear mixed models, 501–502
  plots of, 230
  properties of, 227–230
  residual sum of squares (SSE), 131, 150–151. See SSE
  studentized residual, 233
Response variable, 1, 137, 150
Robust estimation methods, 232
Sample mean
  definition, 105–106
  independent of sample variance, 119–120
Sample space (data space), 152–153
Sample variance (s²), 107–108
  best quadratic unbiased estimator, 151
  distribution, 118
  expected value, 108, 127
  independent of sample mean, 120
Satterthwaite, 494
Scalar,
Scientific method,
Selection of variables, 2, 172
Serial correlation, 479
Shrinkage estimator, 287, 500
Significance level (α), 132
Simple linear regression. See Regression, simple linear
Singular matrix, 22
Small sample inference for mixed linear models, 491, 494–497
Span, 153
Spectral decomposition, 51, 495–496
Square root matrix, 53
SSE (error sum of squares)
  balanced ANOVA
    one-way model, 343–344
    two-way model, 385, 390–391
  independence of SSR and SSE, 187
  multiple regression, 150–156, 179
  non-full-rank model, 313–314
  simple linear regression, 131–132
  unbalanced ANOVA
    one-way model, 417
    two-way model
      constrained, 428
      unconstrained, 432
SSH (for general linear hypothesis test)
  in ANOVA, 326–329, 348–351, 401–403
  in regression, 199, 203
SSR (regression sum of squares), 133–134, 161, 164, 186–189
Standardized distance, 77
Subspace, 153, 317
Sufficient statistics, 159–160
Sum(s) of squares
  Analysis of covariance, 449–463, 468–473
  ANOVA, balanced
    one-way, 345–346, 348–351
      contrasts, 358–363, 367–371
    two-way, 388–395, 395–403
  ANOVA, unbalanced
    one-way, 417
      contrasts, 417–421
    two-way, 426, 431–432
  full-and-reduced-model test in ANOVA, 324–326
  SSE. See SSE
  SSH (for general linear hypothesis test). See SSH
  SSR (for overall regression test). See SSR
  as quadratic form, 105–107
  test of a subset of β's, 190–192
Symmetric matrix,
Systems of equations, 28–32
  consistent and inconsistent, 29
  and generalized inverse, 37–39
t-Distribution, 116–117, 123
  central t, 117
  noncentral t, 116–117, 132
    noncentrality parameter, 116–117, 132
  p-value. See p-Value
t-Tests, 123, 131–132, 134, 205
  p-value. See p-Value
Tests of hypotheses. See also Analysis of variance, testing hypotheses; One-way model (balanced), testing the hypothesis H0: μ1 = μ2 = … = μk; Two-way model (balanced), tests of hypotheses
  for β1 in simple linear regression, 131–132
  in Bayesian regression, 286
  F-tests. See F-Tests
  general linear hypothesis test, 198–204
  for individual β's or linear combinations. See Regression, multiple linear with fixed x's, tests of hypotheses
  likelihood ratio tests, 217–221
  in linear mixed models, 491, 495
  overall regression test, 185–189, 196
  for r in bivariate normal distribution, 134
  regression tests in terms of R², 196–198
  significance level (α), 132
  subset of the β's, 189–196
  t-tests. See t-Tests
Trace of a matrix, 44–46
Transpose,
Treatments, 4, 295, 339, 377
Triangular matrix,
Two-way model (balanced), 3, 299–301, 377–408
  estimable functions, 378–382
    estimates of, 382–384
    interaction terms, 380
    main effect terms, 380–381
  estimation of σ², 384–385
  expected mean squares, 403–408
    quadratic form approach, 405
    sums of squares approach, 403–405
  interaction, 301, 377
  model, 377–378
    assumptions, 378
  no-interaction model, 329–335
    estimable functions, 330–331
    testing a hypothesis, 331–333
  normal equations, 382–384
  orthogonality of columns of X, 333–335
  reparameterization, 299–300
  side conditions, 300–301, 381
  SSE, 384, 390
  tests of hypotheses
    interaction
      full-and-reduced-model test, 388–391
      generalized inverse approach, 391–395
      hypothesis, 385–388
    main effects
      full-and-reduced-model approach, 395–401
      general linear hypothesis approach, 401–403
      hypothesis, 396
Unbalanced data in ANOVA
  cell means model, 414
  one-way model, 415–421
    contrasts, 417–421
      conditions for independence, 418
      orthogonal contrasts, 418
      weighted orthogonal contrasts, 419
    estimation, 415–416
    SSE, 416
    testing H0: μ1 = μ2 = … = μk, 416
  overparameterized model, 414
  serial correlation, 479
  two-way model, 421–432
    cell means model, 421, 422
    constrained model, 428–432
      estimation, 430
      model, 429
      SSE, 431
      testing hypotheses, 431–432
    type I, II and III sums of squares, 414
    unconstrained model, 421–428
      contrasts, 424–425
      estimator of σ², 423
      Hadamard product, 425
      SSE, 423
      testing hypotheses, 425–428
  two-way model with empty cells, 432–439
    estimability of empty cell means, 435
    estimation for the partially constrained model, 434
    isolated cells, 432
    missing at random, 432
    testing the interaction, 433–434
      SSE, 433
  weighted squares of means, 414
Underfitting, 170–172
Validation of model, 227–238. See also Hat matrix; Influential observations; Outliers; Residual(s)
Variable(s)
  dependent, 1, 137
  independent, 1, 137
  predictor, 1, 137
  response, 1, 137
  selection of variables, 2, 172
Variance
  of estimators of λ′β, 311
  generalized, 77
  of least squares estimators, 130–131
  population, 70–71
  of quadratic form, 107
  sample, 95. See also Sample variance
Variance components, 480
  estimating equations, 488
  estimation, 486–489
Vector(s)
  angle between two vectors, 41–42, 136, 163, 238
  column vector,
  j vector, 8–9
  length of, 12
  linear independence and dependence, 19
  normalized vector, 42
  notation,
  orthogonal vectors, 37
  orthonormal vectors, set of, 38
  product of, 10–11
  random vector. See Random vector(s)
  row vector,
  zero vector (0),
Weighted least squares, 168
Zero matrix (O),
Zero vector (0),
