Missing data in randomised controlled trials: a practical guide

James R Carpenter & Michael G Kenward

November 21, 2007

Competing interests: None
Word count (main text): 54,158
Funder: NHS (NCCRM)
Grant no: RM04/JH17/MK
© James R Carpenter and Michael G Kenward, 2007

EXECUTIVE SUMMARY

Randomised Controlled Trials (RCTs) are well established as the preferred method for evaluating interventions. Unlike studies based on observational data, the randomisation of patients to interventions means that a direct causal link can be made between an intervention and its effect.

In order to measure the effect of a treatment we need, at least, a measure of each patient’s response at the end of the trial. More commonly, a series of responses will be measured at baseline and throughout follow-up. Inevitably, it is not possible to collect all the intended data on each individual. Unfortunately, as one might expect, simply analysing the data that were collected without any further reflection generally leads to misleading conclusions. Specifically, when the data are incomplete the causal link between intervention and response is broken.

We refer to data that we intended to collect, but for one reason or another were unable to, as missing data. Anyone with practical experience of trials knows that missing data are ubiquitous. Nevertheless, a recent survey of 383 parallel group trials showed that 69% did not report how attrition was handled.

This monograph reviews the issues raised by missing data in clinical trials, and describes and illustrates a principled approach to analyses in such settings. It is divided into three parts.

Part I gives a non-technical overview of the issues raised by missing data. We propose a systematic approach to handling missing data in clinical trials, and discuss the implications of this for design, ‘intention to treat’ and ‘per-protocol’ analyses. This leads to a critique of the current Committee for Proprietary Medicinal Products guidelines for missing data, together
with many of the ad-hoc statistical methods often used by statisticians for the analysis of trials with missing data.

We argue that analyses should be principled, that is, follow well-defined and accepted statistical arguments, using models and assumptions that are transparent, and hence open to criticism and debate. When data are missing, any attempt to draw conclusions from a statistical analysis rests on untestable assumptions concerning the relationship between the unobserved data and the reasons for them being missing (the missing value mechanism). In this way, missing data introduce ambiguity into the analysis beyond conventional sampling imprecision, and the assumptions behind any such analyses form a crucial part of the argument behind any conclusions drawn. We argue that primary analyses should rest on a central assumption about this relationship, the so-called missing at random assumption. Broadly, this is the most general assumption that allows valid analyses to be made independently of the missing value mechanism.

Part II shows how primary analyses in a range of settings can be carried out under the so-called missing at random assumption. This assumption has a central role in underpinning the most important classes of primary analysis, such as those based on likelihood. However, as its validity cannot be assessed from the data under analysis, Part III outlines practical methods for assessing the sensitivity of conclusions drawn from the analyses in Part II to the missing at random assumption. We compare and contrast the two main approaches to this in the literature, again giving examples and code.

In summary:

• From the design stage onwards, our principled approach to handling missing data should be adopted, and
• This monograph outlines how this principled approach can be practically, and directly, applied to the majority of trials with longitudinal follow-up.

ABSTRACT

Missing data in clinical trials: a practical guide

James R Carpenter and Michael G Kenward
Medical Statistics Unit, London School of Hygiene & Tropical Medicine, UK
Corresponding author

Objective: Missing data are ubiquitous in clinical trials, yet recent research suggests many statisticians and investigators appear uncertain how to handle them. The objective of this monograph is to set out a principled approach for handling missing data in clinical trials, and provide examples and code to facilitate its adoption.

Data sources: An asthma trial from GlaxoSmithKline, an asthma trial from AstraZeneca, and a dental pain trial from GlaxoSmithKline.

Methods: Part I gives a non-technical review of how missing data are typically handled in the analysis of clinical trials, and outlines the issues raised by missing data. When faced with missing data, we show no analysis can avoid making additional untestable assumptions. This leads to a proposal for a principled, systematic approach for handling missing data in clinical trials, which in turn informs a critique of current Committee for Proprietary Medicinal Products guidelines for missing data, together with many of the ad-hoc statistical methods currently employed. Part II shows how primary analyses in a range of settings can be carried out under the so-called missing at random assumption. This key assumption has a central role in underpinning the most important classes of primary analysis, such as those based on likelihood. However, its validity cannot be assessed from the data under analysis, so in Part III two main approaches are developed and illustrated for assessing the sensitivity of the primary analyses to this assumption.

Examples: Throughout, examples are used to illustrate the arguments and analyses. Code for the analyses (mostly in SAS) is given in Appendix C. The end of each example is indicated with a ‘ ’.

Results: The literature review revealed that missing data are often ignored, or poorly handled, in the analysis. Current guidelines, and frequently used ad-hoc statistical methods, are shown to be flawed. A
principled, yet practical, alternative approach is developed, which examples show leads to inferences with greater validity. SAS code is given to facilitate its direct application.

Conclusions: From the design stage onwards, a principled approach to handling missing data should be adopted. Such an approach follows well-defined and accepted statistical arguments, using models and assumptions that are transparent, and hence open to criticism and debate. This monograph outlines how this principled approach can be practically, and directly, applied to the majority of trials with longitudinal follow-up.

ACKNOWLEDGEMENTS

We would like to thank GlaxoSmithKline and AstraZeneca for permission to use their data. We gratefully acknowledge funding from the NHS (NCCRM).

James Carpenter would like to thank colleagues at the Department for Medical Biometry and Statistics, University Hospital, Freiburg, for their hospitality and encouragement during his sabbatical visit, where the first draft of this monograph was completed.

We are grateful for stimulating conversations with many colleagues, in particular Prof James H Roger, who also contributed a SAS macro, and Prof Geert Molenberghs. We thank Stephen Evans and three anonymous referees for their helpful comments on the first draft, which have led to a greatly improved manuscript. The remaining errors are ours.

We invite comments and corrections. Please email these to james.carpenter@lshtm.ac.uk.

This book has a homepage at www.missingdata.org.uk. In due course we intend to publish here a supplementary chapter on time to event outcomes.

James Carpenter & Mike Kenward
London School of Hygiene & Tropical Medicine, Spring 2007

CONTRIBUTIONS OF THE AUTHORS

The monograph was planned jointly. James Carpenter did the majority of the writing and computation. Mike Kenward reviewed draft chapters and wrote some additional material.

PRINCIPAL ABBREVIATIONS

COPD   Chronic Obstructive Pulmonary Disease
CPMP   Committee for Proprietary Medicinal Products
GEE    Generalised Estimating Equation
GLM    Generalised Linear Model
GLMM   Generalised Linear Mixed Model
ICH    International Council on Harmonisation
LOCF   Last Observation Carried Forward
MAR    Missing At Random
MCAR   Missing Completely At Random
MI     Multiple Imputation
MNAR   Missing Not At Random
PM     Pattern Mixture
PA     Population Averaged
SS     Subject Specific

Table of Contents

Title
Executive summary
Abstract
Acknowledgements
Contributions of the Authors
List of abbreviations
Table of Contents
List of Tables
List of Figures

Part I

1 Missing data: principles
  1.1 Introduction
  1.2 What we mean by missing data
  1.3 Trial validity and sensible analyses
  1.4 How much should we bother about missing data?
  1.5 Towards a systematic approach
  1.6 Missing data mechanisms
    1.6.1 Missing completely at random
    1.6.2 Is MCAR likely in practice?
    1.6.3 Missing at random
    1.6.4 Missing not at random
  1.7 Some other terms that may confuse
  1.8 Implications
    1.8.1 Design
    1.8.2 Missing data and per-protocol analyses
    1.8.3 Missing data and intention to treat (ITT) analyses
    1.8.4 Composite hypotheses
  1.9 A critique of CPMP guidelines
  1.10 Inferential approach
  1.11 Summary

2 A critique of common approaches to missing data
  2.1 Introduction
  2.2 Complete cases
  2.3 Last observation carried forward
  2.4 Missing indicator method
    2.4.1 Missing indicator method with pre-randomisation variables
    2.4.2 Other settings
    2.4.3 Summary
  2.5 Marginal and conditional mean imputation
  2.6 Conclusions

Part II

3 MAR Methods for Quantitative Data
  3.1 Introduction
  3.2 Some modelling issues
    3.2.1 Comparative power under different covariance structures
  3.3 Summary statistics
    3.3.1 Approach
    3.3.2 Further details and examples
  3.4 Estimating treatment effects when follow-up and/or baseline values are missing

Code for examples

    # Wishart prior for precision matrix R
    R[1:6, 1:6] ~ dwish(lambda[1:6, 1:6], 6)
    # create an estimate of variance/covariance matrix:
    vcov.mat[1:6, 1:6]
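
The BUGS-style fragment above places a Wishart prior on a 6 × 6 precision matrix R and then forms a variance/covariance estimate from it. As a rough illustration only, the Python sketch below (not the authors' code) draws one such precision matrix and inverts it, using the standard fact that a covariance matrix is the inverse of the corresponding precision matrix. The identity scale matrix standing in for lambda, the seed, and the helper names wishart_draw, invert and matmul are all assumptions made for this example; the source shows neither the value of lambda nor the right-hand side of the vcov.mat statement.

```python
import random


def wishart_draw(dim, df, rng):
    """Draw from a Wishart distribution with identity scale matrix.

    Uses the sum-of-outer-products construction: the sum of df terms
    z z^T, with z a vector of dim independent standard normals.
    The identity scale is an assumption; the source does not show lambda.
    """
    w = [[0.0] * dim for _ in range(dim)]
    for _ in range(df):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for i in range(dim):
            for j in range(dim):
                w[i][j] += z[i] * z[j]
    return w


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment with the identity: reducing [A | I] to [I | A^-1].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


rng = random.Random(2007)                         # arbitrary seed, for repeatability
precision = wishart_draw(dim=6, df=6, rng=rng)    # plays the role of R
vcov_mat = invert(precision)                      # covariance = inverse of precision

# Check the inversion: precision times covariance should be close to the identity.
product = matmul(precision, vcov_mat)
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(6) for j in range(6))
print(f"max |R * vcov - I| = {max_error:.2e}")
```

With an identity scale matrix the different Wishart parameterisations (scale versus inverse scale) coincide, so the sketch avoids having to take a position on which one dwish uses. With 6 degrees of freedom on a 6 × 6 matrix the draw is only just full rank, so the inverse can be numerically delicate; the final check is there to make that visible.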