Statistical Power Analysis with Missing Data
A Structural Equation Modeling Approach

Adam Davey
Temple University

Jyoti Savla
Virginia Polytechnic Institute and State University

New York London

Visit the Family Studies Arena Web site at: www.family-studies-arena.com

Routledge
Taylor & Francis Group
270 Madison Avenue
New York, NY 10016

Routledge
Taylor & Francis Group
27 Church Road
Hove, East Sussex BN3 2FA

© 2010 by Taylor and Francis Group, LLC
Routledge is an imprint of Taylor & Francis Group, an Informa business

Printed in the United States of America on acid-free paper

International Standard Book Number: 978-0-8058-6369-7 (Hardback) 978-0-8058-6370-3 (Paperback)

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Davey, Adam.
Statistical power analysis with missing data : a structural equation modeling approach / Adam Davey, Jyoti Savla.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-8058-6369-7 (hbk. : alk. paper)
ISBN 978-0-8058-6370-3 (pbk. : alk. paper)
Social sciences -- Statistics. Social sciences -- Statistical methods. Social sciences -- Mathematical models.
I. Savla, Jyoti. II. Title.
HA29.D277 2010
519.5 -- dc22
2009026347

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the Psychology Press Web site at http://www.psypress.com

Contents

1 Introduction
    Overview and Aims
    Statistical Power
    Testing Hypotheses
    Choosing an Alternative Hypothesis
    Central and Noncentral Distributions
    Factors Important for Power
    Effect Sizes
    Determining an Effect Size
    Point Estimates and Confidence Intervals
    Reasons to Estimate Statistical Power
    Conclusions
    Further Readings

Section I Fundamentals

2 The LISREL Model
    Matrices and the LISREL Model
    Latent and Manifest Variables
    Regression Coefficient Matrices
    Variance-Covariance Matrices
    Vectors of Means and Intercepts
    Model Parameters
    Models and Matrices
    Structure of a LISREL Program
    Reading and Interpreting LISREL Output
    Evaluating Model Fit
    Measures of Population Discrepancy
    Incremental Fit Indices
    Absolute Fit Indices
    Conclusions
    Further Readings

3 Missing Data: An Overview
    Why Worry About Missing Data?
    Types of Missing Data
    Missing Completely at Random
    Missing at Random
    Missing Not at Random
    Strategies for Dealing With Missing Data
    Complete Case Methods
    List-Wise Deletion
    List-Wise Deletion With Weighting
    Available Case Methods
    Pair-Wise Deletion
    Expectation Maximization Algorithm
    Full Information Maximum Likelihood
    Imputation Methods
    Single Imputation
    Multiple Imputation
    Estimating Structural Equation Models With Incomplete Data
    Conclusions
    Further Readings

4 Estimating Statistical Power With Complete Data
    Statistical Power in Structural Equation Modeling
    Power for Testing a Single Alternative Hypothesis
    Tests of Exact, Close, and Not Close Fit
    Tests of Exact, Close, and Not Close Fit Between Two Models
    An Alternative Approach to Estimate Statistical Power
    Estimating Required Sample Size for Given Power
    Conclusions
    Further Readings

Section II Applications

5 Effects of Selection on Means, Variances, and Covariances
    Defining the Population Model
    Defining the Selection Process
    An Example of the Effects of Selection
    Selecting Data Into More
      Than Two Groups
    Conclusions
    Further Readings

6 Testing Covariances and Mean Differences With Missing Data
    Step 1: Specifying the Population Model
    Step 2: Specifying the Alternative Model
    Step 3: Generate Data Structure Implied by the Population Model
    Step 4: Decide on the Incomplete Data Model
    Step 5: Apply the Incomplete Data Model to Population Data
    Step 6: Estimate Population and Alternative Models With Missing Data
    Step 7: Using the Results to Estimate Power or Required Sample Size
    Conclusions
    Further Readings

7 Testing Group Differences in Longitudinal Change
    The Application
    The Steps
    Step 1: Selecting a Population Model
    Step 2: Selecting an Alternative Model
    Step 3: Generating Data According to the Population Model
    Step 4: Selecting a Missing Data Model
    Step 5: Applying the Missing Data Model to Population Data
    Step 6: Estimating Population and Alternative Models With Incomplete Data
    Step 7: Using the Results to Calculate Power or Required Sample Size
    Conclusions
    Further Readings

8 Effects of Following Up via Different Patterns When Data Are Randomly or Systematically Missing
    Background
    The Model
    Design
    Procedures
    Evaluating Missing Data Patterns
    Extensions to MAR Data
    Conclusions
    Further Readings

9 Using Monte Carlo Simulation Approaches to Study Statistical Power With Missing Data
    Planning and Implementing a Monte Carlo Study
    Simulating Raw Data Under a Population Model
    Generating Normally Distributed Univariate Data
    Generating Nonnormally Distributed Univariate Data
    Generating Normally Distributed Multivariate Data
    Generating Nonnormally Distributed Multivariate Data
    Evaluating Convergence Rates for a Given Model
    Step 1: Developing a Research Question
    Step 2: Creating a Valid Model
    Step 3: Selecting Experimental Conditions
    Step 4: Selecting Values of Population
      Parameters
    Step 5: Selecting an Appropriate Software Package
    Step 6: Conducting the Simulations
    Step 7: File Storage
    Step 8: Troubleshooting and Verification
    Step 9: Summarizing the Results
    Complex Missing Data Patterns
    Conclusions
    Further Readings

Section III Extensions

10 Additional Issues With Missing Data in Structural Equation Models
    Effects of Missing Data on Model Fit
    Using the NCP to Estimate Power for a Given Index
    Moderators of Loss of Statistical Power With Missing Data
    Reliability
    Auxiliary Variables
    Conclusions
    Further Readings

11 Summary and Conclusions
    Wrapping Up
    Future Directions
    Conclusions
    Further Readings

References
Appendices
Index

Preface

Statistical power analysis has revolutionized the ways in which behavioral and social scientists plan, conduct, and evaluate their research. Similar developments in the statistical analysis of incomplete (missing) data are gaining more widespread applications as software catches up with theory. However, very little attention has been devoted to the ways in which missing data affect statistical power. In fields such as psychology, sociology, human development, education, gerontology, nursing, and health sciences, the effects of missing data on statistical power are significant issues with the potential to influence how studies are designed and implemented.

Several factors make these issues (and this book) significant. First and foremost, data are expensive and difficult to collect. At the same time, data collection with some groups may be taxing. This is particularly true with today’s multidisciplinary studies where researchers often want to combine information across multiple (e.g., physiological, psychological, social, contextual) domains. If there are ways to economize and at the same time reduce expense and testing burden through application of missing data designs, then these should be identified and exploited
in advance whenever possible.

Second, missing data are a nearly inevitable aspect of social science research, and this is particularly true in longitudinal and multi-informant studies. Although one might expect that any missing data would simply reduce power, recent research suggests that not all missing data were created equal. In other words, some types of missing data may have greater implications for loss of statistical power than others. Ways to assess and anticipate the extent of loss in power with regard to the amount and type of missing data need to be more widely available, as do ways to moderate the effects of missing data on the loss of statistical power whenever possible.

Finally, some data are inherently missing. A number of “incomplete” designs have been considered for some time, including the Solomon four-group design, Latin squares design, and Schaie’s most efficient design. However, they have not typically been analyzed as missing data designs. Planning a study with missing data may actually be a cost-effective alternative to collecting complete data on all individuals. For some applications, these “missing by design” methods of data collection may be the only practical way to plan a study, such as with accelerated longitudinal designs. Knowing how best to plan a study of this type is increasingly important.

... just as in the complete data case (e.g., Hancock, 2006; Kaplan, 1995).

Statistical Power

Because the practical aspects of statistical power do not always receive a great deal of attention in many statistics