Applied Bayesian Modelling - P. Congdon

DOCUMENT INFORMATION

Basic information

Pages	465
Size	3.08 MB

Content

Applied Bayesian Modelling. Peter Congdon
Copyright © 2003 John Wiley & Sons, Ltd. ISBN: 0-471-48695-7

WILEY SERIES IN PROBABILITY AND STATISTICS
Established by WALTER A. SHEWHART and SAMUEL S. WILKS
Editors: David J. Balding, Peter Bloomfield, Noel A. C. Cressie, Nicholas I. Fisher, Iain M. Johnstone, J. B. Kadane, Louise M. Ryan, David W. Scott, Adrian F. M. Smith, Jozef L. Teugels
Editors Emeriti: Vic Barnett, J. Stuart Hunter and David G. Kendall
A complete list of the titles in this series appears at the end of this volume.

Applied Bayesian Modelling
PETER CONGDON
Queen Mary, University of London, UK

Copyright © 2003 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England
Telephone (+44) 1243 779777
Email (for orders and customer service enquiries): cs-books@wiley.co.uk
Visit our Home Page on www.wileyeurope.com or www.wiley.com

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher. Requests to the Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed to permreq@wiley.co.uk, or faxed to (+44) 1243 770620.

This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Other Wiley Editorial Offices
John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA
Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA
Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany
John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia
John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809
John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging-in-Publication Data
Congdon, Peter.
Applied Bayesian modelling / Peter Congdon.
p. cm. - (Wiley series in probability and statistics)
Includes bibliographical references and index.
ISBN 0-471-48695-7 (cloth : alk. paper)
1. Bayesian statistical decision theory. 2. Mathematical statistics. I. Title. II. Series.
QA279.5 .C649 2003
519.542-dc21 2002035732

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN 0 471 48695 7

Typeset in 10/12 pt Times by Kolam Information Services, Pvt. Ltd., Pondicherry, India
Printed and bound in Great Britain by Biddles Ltd, Guildford, Surrey.
This book is printed on acid-free paper responsibly manufactured from sustainable forestry in which at least two trees are planted for each one used for paper production.
Contents

Preface

Chapter 1 The Basis for, and Advantages of, Bayesian Model Estimation via Repeated Sampling
1.1 Introduction
1.2 Gibbs sampling
1.3 Simulating random variables from standard densities
1.4 Monitoring MCMC chains and assessing convergence
1.5 Model assessment and sensitivity
1.6 Review
References

Chapter 2 Hierarchical Mixture Models
2.1 Introduction: Smoothing to the Population
2.2 General issues of model assessment: marginal likelihood and other approaches
2.2.1 Bayes model selection using marginal likelihoods
2.2.2 Obtaining marginal likelihoods in practice
2.2.3 Approximating the posterior
2.2.4 Predictive criteria for model checking and selection
2.2.5 Replicate sampling
2.3 Ensemble estimates: pooling over similar units
2.3.1 Mixtures for Poisson and binomial data
2.3.2 Smoothing methods for continuous data
2.4 Discrete mixtures and Dirichlet processes
2.4.1 Discrete parametric mixtures
2.4.2 DPP priors
2.5 General additive and histogram smoothing priors
2.5.1 Smoothness priors
2.5.2 Histogram smoothing
2.6 Review
References
Exercises

Chapter 3 Regression Models
3.1 Introduction: Bayesian regression
3.1.1 Specifying priors: constraints on parameters
3.1.2 Prior specification: adopting robust or informative priors
3.1.3 Regression models for overdispersed discrete outcomes
3.2 Choice between regression models and sets of predictors in regression
3.2.1 Predictor selection
3.2.2 Cross-validation regression model assessment
3.3 Polytomous and ordinal regression
3.3.1 Multinomial logistic choice models
3.3.2 Nested logit specification
3.3.3 Ordinal outcomes
3.3.4 Link functions
3.4 Regressions with latent mixtures
3.5 General additive models for nonlinear regression effects
3.6 Robust Regression Methods
3.6.1 Binary selection models for robustness
3.6.2 Diagnostics for discordant observations
3.7 Review
References
Exercises

Chapter 4 Analysis of Multi-Level Data
4.1 Introduction
4.2 Multi-level models: univariate continuous and discrete outcomes
4.2.1 Discrete outcomes
4.3 Modelling heteroscedasticity
4.4 Robustness in multi-level modelling
4.5 Multi-level data on multivariate indices
4.6 Small domain estimation
4.7 Review
References
Exercises

Chapter 5 Models for Time Series
5.1 Introduction
5.2 Autoregressive and moving average models under stationarity and non-stationarity
5.2.1 Specifying priors
5.2.2 Further types of time dependence
5.2.3 Formal tests of stationarity in the AR(1) model
5.2.4 Model assessment
5.3 Discrete Outcomes
5.3.1 Autoregression on transformed outcome
5.3.2 INAR models for counts
5.3.3 Continuity parameter models
5.3.4 Multiple discrete outcomes
5.4 Error correction models
5.5 Dynamic linear models and time varying coefficients
5.5.1 State space smoothing
5.6 Stochastic variances and stochastic volatility
5.6.1 ARCH and GARCH models
5.6.2 Stochastic volatility models
5.7 Modelling structural shifts
5.7.1 Binary indicators for mean and variance shifts
5.7.2 Markov mixtures
5.7.3 Switching regressions
5.8 Review
References
Exercises

Chapter 6 Analysis of Panel Data
6.1 Introduction
6.1.1 Two stage models
6.1.2 Fixed vs. random effects
6.1.3 Time dependent effects
6.2 Normal linear panel models and growth curves for metric outcomes
6.2.1 Growth Curve Variability
6.2.2 The linear mixed model
6.2.3 Variable autoregressive parameters
6.3 Longitudinal discrete data: binary, ordinal and multinomial and Poisson panel data
6.3.1 Beta-binomial mixture for panel data
6.4 Panels for forecasting
6.4.1 Demographic data by age and time period
6.5 Missing data in longitudinal studies
6.6 Review
References
Exercises

Chapter 7 Models for Spatial Outcomes and Geographical Association
7.1 Introduction
7.2 Spatial regressions for continuous data with fixed interaction schemes
7.2.1 Joint vs. conditional priors
7.3 Spatial effects for discrete outcomes: ecological analysis involving count data
7.3.1 Alternative spatial priors in disease models
7.3.2 Models recognising discontinuities
7.3.3 Binary Outcomes
7.4 Direct modelling of spatial covariation in regression and interpolation applications
7.4.1 Covariance modelling in regression
7.4.2 Spatial interpolation
7.4.3 Variogram methods
7.4.4 Conditional specification of spatial error
7.5 Spatial heterogeneity: spatial expansion, geographically weighted regression, and multivariate errors
7.5.1 Spatial expansion model
7.5.2 Geographically weighted regression
7.5.3 Varying regressions effects via multivariate priors
7.6 Clustering in relation to known centres
7.6.1 Areas vs. case events as data
7.6.2 Multiple sources
7.7 Spatio-temporal models
7.7.1 Space-time interaction effects
7.7.2 Area Level Trends
7.7.3 Predictor effects in spatio-temporal models
7.7.4 Diffusion processes
7.8 Review
References
Exercises

Chapter 8 Structural Equation and Latent Variable Models
8.1 Introduction
8.1.1 Extensions to other applications
8.1.2 Benefits of Bayesian approach
8.2 Confirmatory factor analysis with a single group
8.3 Latent trait and latent class analysis for discrete outcomes
8.3.1 Latent class models
8.4 Latent variables in panel and clustered data analysis
8.4.1 Latent trait models for continuous data
8.4.2 Latent class models through time
8.4.3 Latent trait models for time varying discrete outcomes
8.4.4 Latent trait models for clustered metric data
8.4.5 Latent trait models for mixed outcomes
8.5 Latent structure analysis for missing data
8.6 Review
References
Exercises

Chapter 9 Survival and Event History Models
9.1 Introduction
9.2 Continuous time functions for survival
9.3 Accelerated hazards
9.4 Discrete time approximations
9.4.1 Discrete time hazards regression
9.4.2 Gamma process priors
9.5 Accounting for frailty in event history and survival models
9.6 Counting process models
9.7 Review
References
Exercises

Chapter 10 Modelling and Establishing Causal Relations: Epidemiological Methods and Models
10.1 Causal processes and establishing causality
10.1.1 Specific methodological issues
10.2 Confounding between disease risk factors
10.2.1 Stratification vs. multivariate methods
10.3 Dose-response relations
10.3.1 Clustering effects and other methodological issues
10.3.2 Background mortality
10.4 Meta-analysis: establishing consistent associations
10.4.1 Priors for study variability
10.4.2 Heterogeneity in patient risk
10.4.3 Multiple treatments
10.4.4 Publication bias
10.5 Review
References
Exercises

Index

Preface

This book follows Bayesian Statistical Modelling (Wiley, 2001) in seeking to make the Bayesian approach to data analysis and modelling accessible to a wide range of researchers, students and others involved in applied statistical analysis. Bayesian statistical analysis as implemented by sampling based estimation methods has facilitated the analysis of complex multi-faceted problems which are often difficult to tackle using 'classical' likelihood based methods.

The preferred tool in this book, as in Bayesian Statistical Modelling, is the package WINBUGS; this package enables a simplified and flexible approach to modelling in which specification of the full conditional densities is not necessary and so small changes in program code can achieve a wide variation in modelling options (so, inter alia, facilitating sensitivity analysis to likelihood and prior assumptions). As Meyer and Yu in the Econometrics Journal (2000, pp. 198-215) state, "any modifications of a model including changes of priors and sampling error distributions are readily realised with only minor changes of the code." Other sophisticated Bayesian software for MCMC modelling has been developed in packages such as S-Plus, Minitab and Matlab, but is likely to require major reprogramming to reflect changes in model assumptions; so my own preference remains WINBUGS, despite its possibly slower performance and convergence than tailor-made programs.
There is greater emphasis in the current book on detailed modelling questions such as model checking and model choice, and the specification of the defining components (in terms of priors and likelihoods) of model variants. While much analytical thought has been put into how to choose between two models, say $M_1$ and $M_2$, the process underlying the specification of the components of each model is subject, especially in more complex problems, to a range of choices. Despite an intention to highlight these questions of model specification and discrimination, there remains considerable scope for the reader to assess sensitivity to alternative priors, and other model components. My intention is not to provide fully self-contained analyses with no issues still to resolve.

The reader will notice many of the usual 'specimen' data sets (the Scottish lip cancer and the ship damage data come to mind), as well as some more unfamiliar and larger data sets. Despite recent advances in computing power and speed which allow estimation via repeated sampling to become a serious option, a full MCMC analysis of a large data set, with parallel chains to ensure sample space coverage and enable convergence to be monitored, is still a time-consuming affair.

Some fairly standard divisions between topics (e.g. time series vs panel data analysis) have been followed, but there is also an interdisciplinary emphasis which means that structural equation techniques (traditionally the domain of psychometrics and educational statistics) receive a chapter, as do the techniques of epidemiology. I seek to review the main modelling questions and cover recent developments without necessarily going into the full range of questions in specifying conditional densities or MCMC sampling [...]...
programs or results via e-mail at p.congdon@qmul.ac.uk. The WINBUGS programs that support the examples in the book are made available at ftp://ftp.wiley.co.uk/pub/books/congdon

Peter Congdon

CHAPTER 1 The Basis for, and Advantages of, Bayesian Model Estimation via Repeated Sampling

... non-standard densities. If the full conditionals are non-standard but of a certain mathematical form (log-concave), then adaptive rejection sampling (Gilks and Wild, 1992) may be used within the Gibbs sampling for those parameters. In other cases, alternative schemes based on the Metropolis-Hastings algorithm may be used to sample from non-standard densities (Morgan, 2000). The program WINBUGS may be applied ...

... Poisson probability of case i could then be evaluated in terms of that parameter. This type of approach (n-fold cross-validation) may be computationally expensive except in small samples. Another option is for a large dataset to be randomly divided into a small number k of groups; then cross-validation may be applied to each partition of the data, with k - 1 groups as 'training' sample and the remaining group ...

... domain estimation for survey outcomes (Ghosh and Rao, 1994), and meta-analysis across several studies (Smith et al., 1995). Unlike classical techniques, the Bayesian method allows model comparison across non-nested alternatives, and again the recent sampling estimation developments have ...

Footnote 1: See, for instance, Example 2.8 on geriatric patient length of stay.
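The random division into k groups mentioned in the cross-validation passage above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the book: the function name `k_fold_partitions`, the seed, and the sizes n = 10 and k = 5 are all hypothetical choices, and the model fitting that would be applied to each training sample is left out.

```python
import random


def k_fold_partitions(n, k, seed=0):
    """Randomly split indices 0..n-1 into k groups and yield, for each
    group in turn, (training indices, validation indices).

    Hypothetical helper for illustration only.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k groups of near-equal size.
    groups = [indices[g::k] for g in range(k)]
    for g in range(k):
        validation = groups[g]
        # The other k - 1 groups together form the training sample.
        training = [i for h in range(k) if h != g for i in groups[h]]
        yield training, validation


folds = list(k_fold_partitions(n=10, k=5))
for training, validation in folds:
    print(sorted(validation), "held out;", len(training), "cases used for training")
```

Each case is held out exactly once, so the predictive check for any case is always based on a model estimated without it. Setting k equal to n would recover the more expensive case-by-case scheme the passage describes.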
chooses the sampling method, opting for standard Gibbs sampling if conjugacy is identified, and for adaptive rejection sampling (Gilks and Wild, 1992) for non-conjugate problems with log-concave sampling densities. For non-conjugate problems without log-concavity, Metropolis-Hastings updating is used, either slice sampling (Neal, 1997) or adaptive sampling (Gilks et al., 1998). To monitor parameters (i.e. ...

... BUGS the codings

    t[i] ~ dweib(alpha,lambda)

and

    x[i] ~ dexp(lambda)
    t[i] <- pow(x[i],1/alpha)

generate the same density.

1.3.4 Gamma, chi-square and beta densities

The gamma density is central to the modelling of variances in Bayesian analysis, and as a prior for the Poisson mean. It has the form

$$f(x) = [b^a / \Gamma(a)] \, x^{a-1} \exp(-bx), \quad x > 0$$

with mean $a/b$ and ...

... Example 1.3 below, and in a particular kind of multiple random effects model, the age-period-cohort model (Knorr-Held and Rainer, 2001). Identifiability issues also occur in discrete mixture regressions (Chapter 3) and structural equation models (Chapter 8) due to label switching during the MCMC sampling. Such instances of non-identifiability will show as essentially nonconvergent parameter series between ...

INTRODUCTION

Bayesian analysis of data in the health, social and physical sciences has been greatly facilitated in the last decade by advances in computing power and improved scope for estimation via iterative sampling methods. Yet the Bayesian perspective, which stresses the accumulation of knowledge about parameters in a synthesis of prior knowledge with the data at hand, has a longer history. Bayesian ...
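The equivalence of the two BUGS codings for the Weibull noted above can be checked numerically. The sketch below is an illustration rather than the book's code. It assumes the BUGS parameterisation in which `dweib(alpha, lambda)` has survival function exp(-lambda * t^alpha) and `dexp(lambda)` has rate lambda. The values alpha = 2, lambda = 0.5, the seed and the sample size are arbitrary choices.

```python
import math
import random


def weibull_via_exponential(alpha, lam, rng):
    """Draw t = x**(1/alpha), where x is exponential with rate lam.

    Mirrors the second BUGS coding: x ~ dexp(lambda); t <- pow(x, 1/alpha).
    """
    x = rng.expovariate(lam)  # exponential draw with rate lam
    return x ** (1.0 / alpha)


rng = random.Random(1)  # arbitrary seed for reproducibility
alpha, lam = 2.0, 0.5   # illustrative parameter values
draws = [weibull_via_exponential(alpha, lam, rng) for _ in range(20000)]

# Compare the empirical survival proportion P(t > t0) with the Weibull
# survival function exp(-lam * t0**alpha) under the assumed parameterisation.
t0 = 1.0
empirical_survival = sum(d > t0 for d in draws) / len(draws)
theoretical_survival = math.exp(-lam * t0 ** alpha)
print(round(empirical_survival, 3), round(theoretical_survival, 3))
```

The argument behind the check is one line: P(t > t0) = P(x > t0^alpha) = exp(-lambda * t0^alpha), which is the Weibull survival function under that parameterisation.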
number of other densities, and hence ways of sampling from them. The chi-square is also used as a prior for the variance, and is the same as a gamma density with $a = n/2$, $b = 0.5$. Its expectation is then $n$, usually interpreted as a degrees of freedom parameter. The density (1.6) above is sometimes known as a scaled chi-square. The chi-square may also be obtained for $n$ an integer, by taking $n$ draws $x_1$, ...

... $\theta^{(t+1)}, \theta^{(t+2)}, \ldots$, or from more widely spaced sub-samples K steps apart $\theta^{(t)}, \theta^{(t+K)}, \theta^{(t+2K)}$. Geweke (1992) developed a t-test applicable to assessing convergence in runs of sampled parameter values, both in single and multiple chain situations. Let $\bar{\theta}_a$ be the posterior mean of scalar parameter $\theta$ from the first $n_a$ iterations in a chain (after burn-in), and $\bar{\theta}_b$ be the mean from the last $n_b$ draws ...
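The comparison of early and late segment means described above can be sketched as follows. The excerpt breaks off before giving the test statistic, so this is a simplified stand-in and not Geweke's published procedure: it divides the difference in means by a standard error built from plain sample variances, which ignores autocorrelation within the chain (the published diagnostic uses spectral density estimates instead). The function name `segment_mean_z`, the segment fractions 0.1 and 0.5, and the simulated chains are all assumptions for illustration.

```python
import random
import statistics


def segment_mean_z(chain, first=0.1, last=0.5):
    """Standardised difference between the mean of the first n_a draws
    and the mean of the last n_b draws of a chain (after burn-in).

    Simplification: uses naive sample variances, so it is only
    appropriate for illustration or for well-thinned chains.
    """
    n = len(chain)
    seg_a = chain[: int(first * n)]       # first n_a iterations
    seg_b = chain[n - int(last * n):]     # last n_b iterations
    var_a = statistics.variance(seg_a) / len(seg_a)
    var_b = statistics.variance(seg_b) / len(seg_b)
    return (statistics.mean(seg_a) - statistics.mean(seg_b)) / (var_a + var_b) ** 0.5


rng = random.Random(42)  # arbitrary seed
# A chain already at its stationary distribution, and one still drifting upwards.
stationary = [rng.gauss(0.0, 1.0) for _ in range(5000)]
drifting = [rng.gauss(0.002 * t, 1.0) for t in range(5000)]

z_stationary = segment_mean_z(stationary)
z_drifting = segment_mean_z(drifting)
print("stationary chain:", round(z_stationary, 2))
print("drifting chain:", round(z_drifting, 2))
```

A value well outside the usual standard normal range, as for the drifting chain here, suggests the early and late parts of the run are not sampling the same distribution and that more burn-in is needed.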

Date posted: 31/03/2014, 16:23
