P1: IML/OTB P2: IML/OTB GRBT241-FM GRBT241-v4.cls QC: IML/OTB T1: IML February 5, 2008 Printer: RRD 11:43

THIRD EDITION

MODERN EPIDEMIOLOGY

Kenneth J. Rothman
Vice President, Epidemiology Research
RTI Health Solutions
Professor of Epidemiology and Medicine
Boston University
Boston, Massachusetts

Sander Greenland
Professor of Epidemiology and Statistics
University of California
Los Angeles, California

Timothy L. Lash
Associate Professor of Epidemiology and Medicine
Boston University
Boston, Massachusetts

Acquisitions Editor: Sonya Seigafuse
Developmental Editor: Louise Bierig
Project Manager: Kevin Johnson
Senior Manufacturing Manager: Ben Rivera
Marketing Manager: Kimberly Schonberger
Art Director: Risa Clow
Compositor: Aptara, Inc.

© 2008 by LIPPINCOTT WILLIAMS & WILKINS
530 Walnut Street
Philadelphia, PA 19106 USA
LWW.com

All rights reserved. This book is protected by copyright. No part of this book may be reproduced in any form or by any means, including photocopying, or utilized by any information storage and retrieval system without written permission from the copyright owner, except for brief quotations embodied in critical articles and reviews. Materials appearing in this book prepared by individuals as part of their official duties as U.S. government employees are not covered by the above-mentioned copyright.

Printed in the USA

Library of Congress Cataloging-in-Publication Data

Rothman, Kenneth J.
Modern epidemiology / Kenneth J. Rothman, Sander Greenland, and Timothy L. Lash. – 3rd ed.
p. ; cm.
2nd ed. edited by Kenneth J. Rothman and Sander Greenland.
Includes bibliographical references and index.
ISBN-13: 978-0-7817-5564-1
ISBN-10: 0-7817-5564-6
Epidemiology–Statistical methods. Epidemiology–Research–Methodology. I. Greenland, Sander, 1951- II. Lash, Timothy L. III. Title.
[DNLM: Epidemiology. Epidemiologic Methods. WA 105 R846m 2008]
RA652.2.M3R67 2008
614.4–dc22
2007036316
Care has been taken to confirm the accuracy of the information presented and to describe generally accepted practices. However, the authors, editors, and publisher are not responsible for errors or omissions or for any consequences from application of the information in this book and make no warranty, expressed or implied, with respect to the currency, completeness, or accuracy of the contents of the publication. Application of this information in a particular situation remains the professional responsibility of the reader.

The publishers have made every effort to trace copyright holders for borrowed material. If they have inadvertently overlooked any, they will be pleased to make the necessary arrangements at the first opportunity.

To purchase additional copies of this book, call our customer service department at (800) 638-3030 or fax orders to 1-301-223-2400. Lippincott Williams & Wilkins customer service representatives are available from 8:30 am to 6:00 pm, EST, Monday through Friday, for telephone access.

Visit Lippincott Williams & Wilkins on the Internet: http://www.lww.com

Contents

Preface and Acknowledgments
Contributors

1 Introduction
    Kenneth J. Rothman, Sander Greenland, and Timothy L. Lash

SECTION I Basic Concepts

2 Causation and Causal Inference
    Kenneth J. Rothman, Sander Greenland, Charles Poole, and Timothy L. Lash
3 Measures of Occurrence
    Sander Greenland and Kenneth J. Rothman
4 Measures of Effect and Measures of Association
    Sander Greenland, Kenneth J. Rothman, and Timothy L. Lash
5 Concepts of Interaction
    Sander Greenland, Timothy L. Lash, and Kenneth J. Rothman

SECTION II Study Design and Conduct

6 Types of Epidemiologic Studies
    Kenneth J. Rothman, Sander Greenland, and Timothy L. Lash
7 Cohort Studies
    Kenneth J. Rothman and Sander Greenland
8 Case-Control Studies
    Kenneth J. Rothman, Sander Greenland, and Timothy L. Lash
9 Validity in
Epidemiologic Studies
    Kenneth J. Rothman, Sander Greenland, and Timothy L. Lash
10 Precision and Statistics in Epidemiologic Studies
    Kenneth J. Rothman, Sander Greenland, and Timothy L. Lash
11 Design Strategies to Improve Study Accuracy
    Kenneth J. Rothman, Sander Greenland, and Timothy L. Lash
12 Causal Diagrams
    M. Maria Glymour and Sander Greenland

SECTION III Data Analysis

13 Fundamentals of Epidemiologic Data Analysis
    Sander Greenland and Kenneth J. Rothman
14 Introduction to Categorical Statistics
    Sander Greenland and Kenneth J. Rothman
15 Introduction to Stratified Analysis
    Sander Greenland and Kenneth J. Rothman
16 Applications of Stratified Analysis Methods
    Sander Greenland
17 Analysis of Polytomous Exposures and Outcomes
    Sander Greenland
18 Introduction to Bayesian Statistics
    Sander Greenland
19 Bias Analysis
    Sander Greenland and Timothy L. Lash
20 Introduction to Regression Models
    Sander Greenland
21 Introduction to Regression Modeling
    Sander Greenland

SECTION IV Special Topics

22 Surveillance
    James W. Buehler
23 Using Secondary Data
    Jørn Olsen
24 Field Methods in Epidemiology
    Patricia Hartge and Jack Cahill
25 Ecologic Studies
    Hal Morgenstern
26 Social Epidemiology
    Jay S. Kaufman
27 Infectious Disease Epidemiology
    C. Robert Horsburgh, Jr., and Barbara E. Mahon
28 Genetic and Molecular Epidemiology
    Muin J. Khoury, Robert Millikan, and Marta Gwinn
29 Nutritional Epidemiology
    Walter C. Willett
30 Environmental Epidemiology
    Irva Hertz-Picciotto
31 Methodologic Issues in Reproductive Epidemiology
    Clarice R. Weinberg and Allen J. Wilcox
32 Clinical Epidemiology
    Noel S. Weiss
33 Meta-Analysis
    Sander Greenland and Keith O’Rourke

References
Index
Preface and Acknowledgments

This third edition of Modern Epidemiology arrives more than 20 years after the first edition, which was a much smaller single-authored volume that outlined the concepts and methods of a rapidly growing discipline. The second edition, published 12 years later, was a major transition, as the book grew along with the field. It saw the addition of a second author and an expansion of topics contributed by invited experts in a range of subdisciplines. Now, with the help of a third author, this new edition encompasses a comprehensive revision of the content and the introduction of new topics that 21st century epidemiologists will find essential.

This edition retains the basic organization of the second edition, with the book divided into four parts. Part I (Basic Concepts) now comprises five chapters rather than four, with the relocation of Chapter 5, “Concepts of Interaction,” which was Chapter 18 in the second edition. The topic of interaction rightly belongs with Basic Concepts, although a reader aiming to accrue a working understanding of epidemiologic principles could defer reading it until after Part II, “Study Design and Conduct.” We have added a new chapter on causal diagrams, which we debated putting into Part I, as it does involve basic issues in the conceptualization of relations between study variables. On the other hand, this material invokes concepts that seemed more closely linked to data analysis, and assumes knowledge of study design, so we have placed it at the beginning of Part III, “Data Analysis.” Those with basic epidemiologic background could read Chapter 12 in tandem with Chapters and to get a thorough grounding in the concepts surrounding causal and non-causal relations among variables. Another important addition
is a chapter in Part III titled, “Introduction to Bayesian Statistics,” which we hope will stimulate epidemiologists to consider and apply Bayesian methods to epidemiologic settings. The former chapter on sensitivity analysis, now entitled “Bias Analysis,” has been substantially revised and expanded to include probabilistic methods that have entered epidemiology from the fields of risk and policy analysis. The rigid application of frequentist statistical interpretations to data has plagued biomedical research (and many other sciences as well). We hope that the new chapters in Part III will assist in liberating epidemiologists from the shackles of frequentist statistics, and open them to more flexible, realistic, and deeper approaches to analysis and inference.

As before, Part IV comprises additional topics that are more specialized than those considered in the first three parts of the book. Although field methods still have wide application in epidemiologic research, there has been a surge in epidemiologic research based on existing data sources, such as registries and medical claims data. Thus, we have moved the chapter on field methods from Part II into Part IV, and we have added a chapter entitled, “Using Secondary Data.” Another addition is a chapter on social epidemiology, and coverage on molecular epidemiology has been added to the chapter on genetic epidemiology. Many of these chapters may be of interest mainly to those who are focused on a particular area, such as reproductive epidemiology or infectious disease epidemiology, which have distinctive methodologic concerns, although the issues raised are well worth considering for any epidemiologist who wishes to master the field. Topics such as ecologic studies and meta-analysis retain a broad interest that cuts across subject matter subdisciplines. Screening had its own chapter in the second edition; its content has been incorporated into the revised chapter on clinical epidemiology.

The scope of epidemiology has become
too great for a single text to cover it all in depth. In this book, we hope to acquaint those who wish to understand the concepts and methods of epidemiology with the issues that are central to the discipline, and to point the way to key references for further study. Although previous editions of the book have been used as a course text in many epidemiology teaching programs, it is not written as a text for a specific course, nor does it contain exercises or review questions as many course texts do. Some readers may find it most valuable as a reference or supplementary-reading book for use alongside shorter textbooks such as Kelsey et al. (1996), Szklo and Nieto (2000), Savitz (2001), Koepsell and Weiss (2003), or Checkoway et al. (2004). Nonetheless, there are subsets of chapters that could form the textbook material for epidemiologic methods courses. For example, a course in epidemiologic theory and methods could be based on Chapters through 12, with a more abbreviated course based on Chapters through and through 11. A short course on the foundations of epidemiologic theory could be based on Chapters through and Chapter 12. Presuming a background in basic epidemiology, an introduction to epidemiologic data analysis could use Chapters 9, 10, and 12 through 19, while a more advanced course detailing causal and regression analysis could be based on Chapters through 5, 9, 10, and 12 through 21. Many of the other chapters would also fit into such suggested chapter collections, depending on the program and the curriculum.

Many topics are discussed in various sections of the text because they pertain to more than one aspect of the science. To facilitate access to all relevant sections of the book that relate to a given topic, we have indexed the text thoroughly. We thus recommend that the index be consulted by those wishing to read our complete
discussion of specific topics.

We hope that this new edition provides a resource for teachers, students, and practitioners of epidemiology. We have attempted to be as accurate as possible, but we recognize that any work of this scope will contain mistakes and omissions. We are grateful to readers of earlier editions who have brought such items to our attention. We intend to continue our past practice of posting such corrections on an internet page, as well as incorporating such corrections into subsequent printings. Please consult to find the latest information on errata.

We are also grateful to many colleagues who have reviewed sections of the current text and provided useful feedback. Although we cannot mention everyone who helped in that regard, we give special thanks to Onyebuchi Arah, Matthew Fox, Jamie Gradus, Jennifer Hill, Katherine Hoggatt, Marshal Joffe, Ari Lipsky, James Robins, Federico Soldani, Henrik Toft Sørensen, Soe Soe Thwin and Tyler VanderWeele. An earlier version of Chapter 18 appeared in the International Journal of Epidemiology (2006;35:765–778), reproduced with permission of Oxford University Press. Finally, we thank Mary Anne Armstrong, Alan Dyer, Gary Friedman, Ulrik Gerdes, Paul Sorlie, and Katsuhiko Yano for providing unpublished information used in the examples of Chapter 33.

Kenneth J. Rothman
Sander Greenland
Timothy L. Lash

Contributors

James W. Buehler
Research Professor
Department of Epidemiology
Rollins School of Public Health
Emory University
Atlanta, Georgia

Jack Cahill
Vice President
Department of Health Studies Sector
Westat, Inc.
Rockville, Maryland

Sander Greenland
Professor of Epidemiology and Statistics
University of California
Los Angeles, California

M. Maria Glymour
Robert Wood Johnson Foundation Health and Society Scholar
Department of Epidemiology
Mailman School of Public Health
Columbia University
New York, New York
Department
of Society, Human Development and Health
Harvard School of Public Health
Boston, Massachusetts

Marta Gwinn
Associate Director
Department of Epidemiology
National Office of Public Health Genomics
Centers for Disease Control and Prevention
Atlanta, Georgia

Patricia Hartge
Deputy Director
Department of Epidemiology and Biostatistics Program
Division of Cancer Epidemiology and Genetics
National Cancer Institute, National Institutes of Health
Rockville, Maryland

Irva Hertz-Picciotto
Professor
Department of Public Health
University of California, Davis
Davis, California

C. Robert Horsburgh, Jr.
Professor of Epidemiology, Biostatistics and Medicine
Department of Epidemiology
Boston University School of Public Health
Boston, Massachusetts

Jay S. Kaufman
Associate Professor
Department of Epidemiology
University of North Carolina at Chapel Hill, School of Public Health
Chapel Hill, North Carolina

Muin J. Khoury
Director
National Office of Public Health Genomics
Centers for Disease Control and Prevention
Atlanta, Georgia

Timothy L. Lash
Associate Professor of Epidemiology and Medicine
Boston University
Boston, Massachusetts

Barbara E. Mahon
Assistant Professor
Department of Epidemiology and Pediatrics
Boston University
Novartis Vaccines and Diagnostics
Boston, Massachusetts

Charles Poole
Associate Professor
Department of Epidemiology
University of North Carolina at Chapel Hill, School of Public Health
Chapel Hill, North Carolina

Robert C. Millikan
Professor
Department of Epidemiology
University of North Carolina at Chapel Hill, School of Public Health
Chapel Hill, North Carolina

Kenneth J. Rothman
Vice President, Epidemiology Research
RTI Health Solutions
Professor of Epidemiology and Medicine
Boston University
Boston, Massachusetts

Hal Morgenstern
Professor and Chair
Department of Epidemiology
University of Michigan School of Public Health
Ann Arbor, Michigan
Clarice R. Weinberg
National Institute of Environmental Health Sciences
Biostatistics Branch
Research Triangle Park, North Carolina

Jørn Olsen
Professor and Chair
Department of Epidemiology
UCLA School of Public Health
Los Angeles, California

Keith O’Rourke
Visiting Assistant Professor
Department of Statistical Science
Duke University
Durham, North Carolina
Adjunct Professor
Department of Epidemiology and Community Medicine
University of Ottawa
Ottawa, Ontario
Canada

Noel S. Weiss
Professor
Department of Epidemiology
University of Washington
Seattle, Washington

Allen J. Wilcox
Senior Investigator
Epidemiology Branch
National Institute of Environmental Health Sciences/NIH
Durham, North Carolina

Walter C. Willett
Professor and Chair
Department of Nutrition
Harvard School of Public Health
Boston, Massachusetts

P1: TNL/OTB GRBT241-INDEX P2: IML/OTB QC: IML/OTB GRBT241-v4.cls T1: IML February 5, 2008 Printer: RRD 14:38

Impact fractions, 67 See also Attributable fractions; Imprinting, 639 Incentives, 498, 498t Incidence, 47–48 model-based estimates, 439–440 Incidence density See Incidence rate Incidence measures, 33–36, 40–45 prevalence v., 46 relations among, 41–45 Incidence odds, 41 case-control studies, 113 risk-based measures and, 58 models, 394–395 Incidence proportion, 33, 40–41 See also Risk case-control studies, 113–122 causes and, 11, 11t component causes and, 12, 12t case-cohort study of, 123–124 models See Risk models Incidence rate, 33, 34–35, 40, 41, 42 absolute, 40 case-control studies, 113 incidence times v., in special populations, 37f, 39 models, 395–398 person time and, 34, 101 of population, 34 proper interpretation of, 35–36 of recurrent events, 36 Incidence rate difference See Rate difference Incidence rate ratio See Rate ratio Incidence studies See Cohort studies Incidence time, 33–34 models, 396–398 ratios, 53, 397 Incident case-control studies, 94 See Case-control studies Income, 535 Incomplete matching, 177 See also Marginal matching; Partial
matching Incremental plots (Slope plots), 311–312, 312f Incubation periods, 551 agents with, infection occurrence, 557, 557f persistent infection v., progression-related states, 551 Independence See also Association; D-Separation conditional, 190 marginal, 184, 188 statistical, 184–185 Independent censoring, 34–35, 45, 289–290 Independent competing risks, 44–45, 289–290 Independent error, 138 Independent outcome, 239 Independent variables See also Covariate; Regressor in regression, 383 statistically, 184–185 Index condition, counterfactual measures and, 55 Index level, regressors and, 407 Indicator variables, 407–408 Indirect adjustment See Standardized morbidity ratio Indirect effects, 200 direct effect v., 545, 545f sufficient cause model and, 13 Indirectly adjusted rate, 665 Individual matching, 170–171, 178 Individual risks, 40 Individual studies, statistical reanalysis of, meta-analysis, 659–668 Individual-level analysis, 512 Induction period, 15–17, 603 analysis of, 300–302 Induction, 18–19 Inductivism, 18–19 Refutationism v., 20 Inequality, 540–541 Infection, 549 See also Preinfectiousness; Subclinical infection cure, factors influencing, 559–560 death from, factors influencing, 559–560 disease v., infectious agents and, 551 factors influencing, 557–559 measures of frequency, 553–555 occurrence of, factors influencing, 557–559 occurrence of exposure, 556–557 persistent, factors influencing, 559–560 subclinical, 551 transmission, factors influencing, 561–562 transmission-related states, 551, 552 Infectious agents exposure occurrence, factors influencing, 555–556 persistent infection, 559 Infectious disease emerging, outbreak investigations, 553 epidemiology, 549–563 process, states of, 551 progression axis, 549, 550f public health, 617 surveillance and, 460 transmission models, 562–563 Infectivity factors, infection occurrence, 556 Inferential statistics, 221 See also Confidence limits; P-values data descriptors v., 216 Infertile worker effect, 622
Infertility, 624–628 Inflammatory bowel disease, genetic susceptibility, 577–578 Influence analysis, 221 meta-analysis and, 675 sensitivity, 221–222 Information accuracy, comparability of, 121 needs, epidemiologic investigation, 460 systems, surveillance, 474 weighting, 334–337 Information bias, 137–146 See also Misclassification Information-weighted averaging, 334–337 See also Inverse-variance weighting Informative cluster size, recognized loss, 630 Informed consent, 90 Initiator, in carcinogenesis, 16 In-person interviews, questionnaires and, 503 Instrumental variables, 91, 202–204 Instrument-outcome association, 203 Intensity of disease, 34 See also hazard, incidence rate Intensity scoring, 449 Intentional selection, bias from, 196–198 Intent-to-treat analysis (ITT), 203, 647 Interaction, 10, 13, 71–83 See also Biologic Interaction; Public health interaction; Statistical interaction; Synergism analysis of, 298–300 Interaction contrast (IC), 75–79, 298–299 Interaction response types, 79–80, 80t Interaction terms, 406 See also Product terms Interactive voice response (IVR), 500 Intercept See Zero level, special handling in trend evaluation Intercommunity comparisons, social class, 611–612 Intermediate variables, 131–134, 186–188, 545–546 adjusting for, problems of, 131–134, 200–202, 260, 545–546 in meta-analysis, 656 Internal reference group, ratios derived, meta-analysis, 665–666 Internal validity, 128–129 International ecologic studies, limitations, 581 Internet See also Software sites, follow-up techniques, 508 surveillance data, 478 surveillance systems, 471 Interval estimation, 157–158, 163–167 See also Bayesian intervals; Confidence intervals; Likelihood intervals Intervening variables See Intermediate variables Intervention measure, 648–649 Intervention studies, recruitment to, 494 Interview response rates, epidemiologic research, 497–498
Interviews See also Recall bias longitudinal studies, 503–504 response rates and, 509 techniques and training, 504 Inverse-probability weighting (IPW), 266, 444–445, 449 See also Doubly robust estimation; Marginal models; Model-based standardization exposure models, standardization using, 444–445 and propensity scores, 445 stabilized, 446 Inverse-variance weighting, 334–337, 670–671 See also Information-weighted averaging IP See Incidence proportion IPW See Inverse-probability weighting Item non-response See Missing values, handling ITT association See Intent-to-treat association IVR See Interactive voice response Join points, of spline, 411 Joint action, 82 See also Interaction, concepts of Joint analysis, of multiple exposure levels, 323 Joint confidence regions, 235–236, 324, 324f Joint effects, model searching, 419 Joint hypothesis, 237, 307, 322–327 Joint null hypothesis, Pearson χ² statistic, 307 Joint P-value, 235 Joint trend statistic, 326 Kaplan-Meier formula, 42–43, 290–292 censoring and, 45, 290 Kernels, 314, 318–320, 440, 441 Knots, of spline, 411 Kuhn, Thomas, philosophy of, 21 Laboratory-based surveillance, 472–473 Lagged analysis, 301 Large-sample methods, 216, 220, 225–230, 240–253, 263, 267–282, 307, 314–316, 422 and model fitting, 422–423 person-time data, 240–245 pure count data, 245–253 sparse data and, 263 trend statistics, 314–316 Latency of effects, 16, 603 preinfectiousness v., transmission-related states, 551 reactivation disease, 559 Latency periods, exposure assessment, 603 Latent effects model, 546 Latent period, 16, 603 See also Induction period Levels of analysis, ecologic studies, 512–513 Levels of inference, ecologic studies, 513 Levels of measurement, ecologic studies, 512 Life-course model, 546 Life expectancy, 34 “Life trajectory” effect, 547 Life-table analysis, 37, 290 See also Log-rank test; Product-limit estimator Likelihood function, 164, 165f, 227–228 conditional, 257, 272–273, 422, 433 partial, 422 statistical theory
and, 227–229 Likelihood inference, pure, 164–165, 227–229 Likelihood intervals, pure, 164–165, 229 Likelihood ratio, 228–231, 425–426 in Bayesian analysis, 230–231 confidence limits, 229–230 statistics, 229–230 tests, 229–230, 425–426 Linear hypothesis, 314 Linear models, 392, 395, 396, 398 See also Generalized linear models in ecologic analysis, 517, 520 in hierarchical regression, 436 for odds, 395 for rates, 396 for risks, 392 for trends, 398, 402–403 Linear predictor, 416 Linear risk models, 392, 395, 398, 402–403 Linear spline, 411 trend, 411 Link function generalized linear models, 416 secondary data, 486 Linkage analysis, 565 List-based sampling, 496 Locally linear regression, 441 LOESS procedures, 442 Log additive model See Log-linear model, Multiplicative model Logarithmic curve, 398 Logarithmic plot scales, 310–313, 313f Logic checks, 215, 509 Logistic distributions, 370 Logistic models, 394–395, 399, 414–415, 442 adjacent-category, 414–415 conditional, 433 continuation ratio, 415 cumulative-odds, 415 group-specific, 442 exact, 422 hierarchical, 435–439 marginal, 446 for matched data, 434 ordinal, 414–415 polytomous, 413–414 proportional-odds, 415 stratified, 433 Logistic transform, 246, 391 Logit-logistic distribution, 371 Logit-normal distribution, 371 Logit, 231, 246, 391 Logit models, 395, 414–415 See also Logistic models Log-likelihood function, 165 Log-linear hypothesis, 314 Log-linear models, 395, 416, 420 See also Multiplicative models for counts, 423, 444 ecologic studies and, 517 genetic factors, 638 for incidence times, 396 for odds, 394, 416 for rates, 396, 416 for risks, 393, 416 Log-log risk model, 417 Log-rank test, 295 Log-risk scale, 74 Longitudinal causal modeling, 46, 453–455 Longitudinal cohort studies, 569 See also Cohort studies Longitudinal data, 46 modeling, 451–455 Longitudinal monitoring, 489 Longitudinal sampling, 116
See also Density sampling Longitudinal studies See also Cohort studies follow-up techniques, 507–508 interviews, 503–504 Longitudinal trends, 606, 608 Log-likelihoods See Deviance statistics; Likelihood functions; Likelihood ratios Log-rank test, 295 Lorenz curve, 540, 541f, 542 Loss to follow-up, 100–101, 108–109, 289 See also Censoring Low birth-weight paradox, 633 Lower P-value, 220 Lower-tailed P-value, 152, 220 LOWESS procedures, 442 Lung cancer, 18, 103, 263, 604, 607f, 609 Lyme disease, culture technique and, 553 “M” diagram, 188f Malformations, prevalence and, 47 Mantel trend statistic, 314–316 Mantel-Haenszel methods, 271 case-cohort, 281 homogeneity assumption, 271 model fitting, 433 odds ratio, 276, 276t rate difference, 273 rate ratio, 273 risk difference, 274–275 risk ratio, 274–275, 281, 285 sparse-data, 272, 275, 281 statistic for person-time data, 277 statistic for pure-count data, 278 statistic for polytomous exposures, 307 study size, 679 survival analysis, 294–295 two-stage data, 282 Mapping, disease rates, 608–609 Marginal averages, 387 Marginal independence, 184 Marginal matching, 182 Marginal models, 442, 446, 447, 450, 451, 544–545 See also Inverse probability weighting; Marginal structural models; Model based standardization Marginal structural models (MSM), 208, 446, 454–455 Marginally unbiased, 191 Marginal outcome modeling, 446 Markov-chain Monte Carlo (MCMC), 342 Matched data See also Matched pair; Matching analysis of, 283–288 modeling, 434–435 Matched designs, 171–182 See also Matching Matched pair analysis, 284–288 case-control data, 287–288 cohort data, 284–287 odds ratio, 287–288 risk ratio, 285 test, 286 Matched randomization, 175 Matching, 171–179 See also Matched data; Matched pairs; Overmatching in case-control studies, 175–179 cost of, 178 selection bias and, 174–176, 205, 205f in cohort studies, 174–175 effect of, 171–174, 205 information accuracy, indicators of, 181 marginal, 182 partial, 177, 182 purpose of, 171–174
stratification variables, 150 Matrix adjustment methods for misclassification, 360–361 Maximum likelihood estimate (MLE), 164, 221, 228, 271–272 See also Likelihood function; Likelihood ratio homogeneity assumption, 271 of homogeneous measure, 271 model fitting and, 421, 425 overdispersion and, 422 priors, 342 score statistic and, 226 Maximum likelihood test statistic, 230 See Wald statistic Maximal models, 421 MCMC See Markov-chain Monte Carlo program McNemar test statistic, 286, 288 MCSA See Monte-Carlo sensitivity analysis of biases MDR method See Multifactor dimensionality-reduction method Measles vaccine prevention programs and, 462, 463f register, secondary data, 485–486 Measurement error, 137–138, 346, 352–353, 361 See also Misclassification in dietary and nutritional assessment, 591–592, 595, 596–597 Measures of association, 51–70, 57 See also Association Measures of effect v., 59–60, 59t, 185, 385–388 standardized measures of, 67–69 Measures of body composition, anthropometry, 594–595 Measures of occurrence, 33–35, 40, 46 Measures of effect, 51–56, 59–70 See also Causal effects; Effects; Effect-measure modification defining exposure in, 55 generalized, 66–67 measures of association v., 59–60, 59t, 185, 385–388 null state and, 54–56 regression, 385–386 relations among, 60–62 standardized, 67–69, 386–388 theoretical nature of, 54–55 Measures of frequency, infection and disease, 553–555 Median-unbiased estimates, 221, 224, 253, 255, 257 Mediating variables See Intermediate variables Medical testing, 643, 644t Medical-record abstracts, 499 Mendelian transmission, 565 Menopause, 622–623 Menstrual cycle, 623 Meta-analysis, 652–682 See also Publication bias goals of, 654–655 nature of, 652–653 protocol for, 655 quality scores, 679, 681 quantification of effects, 657–658 role and limitations, 681 statistical methods, 668–677 study identification,
656–657 summary statistics for, 670t vote counting, 680–681 Meta-regression methods, 673–677 Methods, classes of, 220 Mid-P-values, 232–233, 255–257 v. continuity corrections, 232–233 Migrant studies, nutritional exposures, 582 Migration across groups, ecologic studies, 527 Mill, John Stuart, philosophy of, 19 Minimally sufficient conditioning sets, 192 Minimal models, 420–421 Miscarriage studies, 629 Misclassification, 138–144 analysis of, 352–361, 372 bias related to, quantification of, 352–361, 488 of confounders, 144–145 dependent, 138, 143, 145, 360 of multiple variables, 360 in meta-analysis, 662–663 multiple-bias analyses and, 363 Misspecification See Model specification; Specification bias; Specification error Missing at random (MAR), 219 Missing completely at random (MCAR), 219 Missing data, 215, 219 bias, 199–200, 200f, 346 methods, 219 Missing-data indicators, bias from, 199–200, 219 Mixed effects model, 130, 529 See also Hierarchical regression in meta regression, 677 MLE See Maximum-likelihood estimate Mobility See Social mobility Model accuracy, 389–390, 419, 449–450 Model averaging, 427 Model-based estimates, 424, 439–440 Model-based standardization, 442–446 See also Inverse probability weighting; Marginal modeling; Regression standardization Model checking, 423–429, 425t Model combination, doubly robust estimation, 450–451 Model diagnostics, model checking and, 423, 427–428 Model fitting, 389–390, 421–423 background example, 390 Model parameter, 389 Model selection, 419–421 hierarchical regression and, 437–438 confounder scoring and, 449–450 Model sensitivity analysis, 346, 429 Model specification, 382, 389–390, 420–421, 449–450 See also Model selection Modifiers See Effect measure modification Modular questionnaire, multipurpose studies and, 503 Molecular epidemiology, 564–579 Monitoring of disease, 488–490, 490f Monotone trend, 308 Monte-Carlo sensitivity analysis (MCSA), 364–378 Bayesian analysis v., 378–380 combined, 376–378 intervals, 365
probabilistic bias analysis and, 364 simulation histograms, 368f Mortality rates follow-up, 109–110 monitoring disease, 488–490, 489f patterns of, 33, 33t time-series analyses, 611 Mosquito, environmental reservoir, 555 Moving To Opportunity study (MTO study), 547 Moving averages, 317–321 See also Smoothers and Smoothing categorical estimates as, 321 MRFIT See Multiple Risk Factor Intervention Trial MSM See Marginal structural model MTO study See Moving To Opportunity study Multidimensional outcome, 56 Multifactor dimensionality-reduction (MDR) method, gene-disease association, 572 Multifactorial causation and etiology, 5, 14 Multigeneration registers, secondary data, 484 Multilevel analyses, 513 and designs, ecologic studies, 528–530 Multilevel modeling, 337, 435–439, 529, 530, 542–544 See also Hierarchical regression Multi-linear regression, 674t Multiple bias analyses, 363–364, 376–378 Multiple comparisons, 234–237, 322–327 hierarchical regression and, 237 single-comparisons v., 326–327 in trend analyses, 307, 326 Multiple control groups, case-control studies, 122, 322 Multiple correlation, 424–425 Multiple diseases, analysis of See Multiple outcomes analysis Multiple group designs, ecologic studies, 514 Multiple group ecologic analysis, confounders, 518–519 Multiple group ecologic studies, within-group misclassification, 526 Multiple imputation, 219 Multiple inference procedures See Multiple comparisons Multiple logistic model, 401 See also Logistic models extensions of, 413–416 single-logistic model v., 401 Multiple outcomes analysis, 321–323, 322t, 325–326 See also Longitudinal data; Recurrent events simultaneous analysis of, 325, 325t Multiple probabilistic bias analysis, Monte-Carlo analyses and, 376, 377t, 378 Multiple regression, 384 models, 400–408 See also specific models, e.g., Logistic models trend models, 408–413 Multiple Risk Factor
Intervention Trial (MRFIT), 92 Multiple testing See Multiple comparisons Multiplicative models, 404–405 See also Exponential models; Log-linear models Multiplicative-intercept models conditional fitting, 433 cumulative studies, 431 density studies, 430–431 prevalence studies, 431 Multipurpose studies, modular questionnaire, 503 Multistage model, 82 Multivariable modeling See Multiple regression; Multivariate regression Multivariate outcome, with competing risks, 56 Multivariate regression, 388 Mycobacterium avium, infectivity and, 558 Mycobacterium leprae, culture technique and, 553 Mycobacterium tuberculosis infection occurrence, 556 surveillance, 464 Nails, biochemical measurements, 594 Naive direct-effect analysis, 201 Narrative historical approach, 548 National Death Index (NDI), tracing, epidemiologic studies, 507 National genetic testing committees, terminology by, 566, 566t National Health and Nutrition Examination Survey See NHANES survey Natural direct effect, 200–201 Natural log function, exponential function v., 416 Naturalism, consensus and, 21–22 NDI See National Death Index Nearest-neighbor windows, 321 Necessary cause, 7, Negative adjustments, treatment of, in bias analysis, 373 Neighborhood controls, 117, 121 Neisseria meningitidis, colonization of, 556 Nelson-Aalen estimator, 292 See also Exponential formula; Kaplan-Meier formula Neonatal mortality, 633–635, 634f Nested case-control studies, 114, 122–123 and missing data models, 432 Nested confidence intervals, 157f Nested indicator coding, 407 Next-step investigations, surveillance data, 478 Neyman-Pearson hypothesis testing, 150, 153–155 See also Hypothesis test; P-values; Statistical significance statistical estimation and, 157–158 NHANES survey (National Health and Nutrition Examination Survey), 569, 611 surveillance, 474 Nodes See Vertices Nominal-scale variables, 214 Nonadherence See adherence Noncausal associations, 26 Noncollapsibility, 58, 62 See also Collapsibility confounding v., 
62 Noncompliance See compliance g-estimation and, 454 Nondependent error, 138 See Independent error Nondifferential misclassification, 139, 372–373 of disease, 142–143 of exposure, 139–142 incidence-rate difference and, incidence-rate ratio and, 139t pervasiveness of misinterpretation, 143 requirements for, 355–356 with three exposure categories, 141t with two exposure categories, 140t Nonexperimental studies, 93–99 types of, 87–88 Noninformative priors, 167, 335–337, 343, 348, 379–380 Nonnested models, 426 Non-normal priors, 340, 370–372 Nonparametric bootstrapping, model checking and, 429 Nonparametric regression, 421, 440–442 See also Smoothers and Smoothing Nonrandomized clinical studies, 641, 650 Nonsignificance of a statistical test, proper interpretation of, 151–152 in meta-analysis, 657, 680 Notifiable disease reporting, 472 surveillance and, 460 Novum Organum, scientific method and, 18 Null hypergeometric distribution, 256, 257 Null hypothesis, 60, 88, 151, 156, 314 bias and, 144 P-value, 153, 315, 454 test for, 158 true, cross-level bias, 523, 525t Null state, effect measure and, 54–56 Null studies, publication bias and, meta-analysis and, 678–679 Nutrient content, measurement of, epidemiologic studies, 586–588 Nutrient, definition complexity, 580 Nutritional epidemiology, 580–597 methodological issues, 595–597 Nutritional exposures, 581–586 Observational studies, 19, 87–88, 332–333 confounding and, 129 Occam’s razor, 389 Occupation, as social variable, 537 Occupational cohort study, 495 Occupational epidemiology, 599 Occurrence measures, 33–50 Occurrence time, 33 See also Incidence time Odds models, 394–395 Odds ratio, 61, 62, 74 case-control studies, 127 pseudo-frequencies and, 113–114 Oil disease, 603 “Omic” tools, 564–566 One step study, 646 One-sided P-value, 156 One-sided tests, 156 One-tailed P-value, 152 Open populations, 38, 101 See 
also Dynamic population closed populations v., 38 steady state, 38–39 Open-ended categories, confounder stratification, 218, 304–305, 410 “Optimal” control distribution, case-control studies, 176, 177 Ordered variables, categorization of, 303–305 Ordering of classification, multiple-bias analyses and, 363 Ordinal logistic models, 414–415 Ordinal category scores, 312–313, 410 Outbreak bias, 553 Outbreak investigations, emerging infectious diseases, 553 Outcome events See also Causal effects; Effects; Outcome variables sufficient-component cause model and, timing of, exposure category and, 107–108 Outcome measures See specific measures, e.g., Incidence times; Rates Outcome models exposure models v., 449–450 standardization using, 443 Outcome scores, 447–448 See also Confounder scores Outcome transformations, 400 Outcome variables, 382–383 See also Causal effects; Effects; Outcome events; Regressand in meta-analysis, 655–656 in regression v. causation, 383 transformed, 400 Outliers, 410 Overadjustment, 181, 260 Overconclusiveness, meta-analysis and, 677 Overdispersion, 422 Overmatching, 179–181, 260 bias and, 180–181 cost efficiency, 181 statistical efficiency and, 179–180 Ovulation, 623 Pair matching, 283–289 Pairwise blocking, 174 Parallel linear trends, 404 Partial likelihood, 422 Partial-Bayes analysis See Semi-Bayes analysis Partial matching, 177, 182 Partially ecologic analysis, 512, 512f Passive surveillance, 476 active surveillance v., 472 Pathway model, 546 Patterns of exposure, 66 attributable fractions, 67 distinguishing, 67 Pearson global test statistics, 427 Pearson χ² statistic, 307 Penalized splines, 440–441 Penalized estimation, 271, 331, 341, 421, 423, 436, 437–439 See also Hierarchical regression; Shrinkage estimation maximum likelihood v., 271 Per protocol analysis, randomized controlled trials, 648 Percentile boundaries, problems with, 217–218, 264, 303–305 Perfect compatibility, in causal graphs, 191 Perinatal mortality, 635–636 Period effect, 
606, 608 Persistent infection (Chronic infection), 552 incubation period v., progression-related states, 551 infectious agents and, 559 Persistent organic pollutants, 616–617 Personal identifiers, secondary data access, ethics of, 491 Person-count cohort data, 240 Person-time, 294 See also Follow-up analysis, follow-up data, 287 classifying, 102–103 data large-sample methods, 240–245 small-sample methods, 253–255 unstratified data with, 243, 243t distribution, 49–50, 68 exposure, 106 follow-up data, disease misclassification, 358–359 immortal, 106–107 rate, 34 at risk, 34, 39 units, classification of, 218–219 weights, Mantel-Haenszel estimation, 275 Physical examinations, epidemiologic studies, 504–505 Placebo, 91 equipoise and, 89 response, 91 Plausibility, causal inference and, 28–29 PMR See Proportional mortality ratio Point estimate, 156 Point prevalence, 46 Point sources, clustering analysis, 613 Point-source outbreaks, epidemic outbreaks, 552 Poisson distribution (model), 241–243 clusters, 612 Poisson regression, 421–422 See also Exponential models for rates case-parent studies, genotype distribution and, 575 extravariation, 422 genetic factors, 638 Policy, and bias analysis, 347, 380 Policy, surveillance data, 467 Polynomial regression, 410 Polytomous exposures, and outcomes, analysis of, 303–327 Polytomous logistic models, 413–414 Pooled estimate, 271 Popper, Karl, philosophy of, 20 Population See also Case selection; Complete populations; Confounding; General-population; Impact fractions; Incidence rate; Population-based case-control studies; Source population; Superpopulation; Target population closed, 36–37 cohorts v., 38 concepts of, 383 individual rates and, 34–35 open, 38, 101 regression, 383 at risk, 34 of recurrent events, 36 steady state, 38–39 source, 114–115, 128–129, 383 target, 60, 90, 146–147, 383, 471 under surveillance, 460 types 
of, 36–39 Population-attributable fractions, 67, 295–297 Population-attributable risk percent, 67 Population-average models, 544–545 Population-averaged regression, 387 Population-based case-control studies, 114, 123 See also Nested case-control studies Population-based prevalence study, 569 Population-based surveillance systems, 469–470 Population effects, 52 Population incidence rate, 34, 35 Population time at risk, 34 Post hoc ergo propter hoc fallacy, 19 Posterior probability, 23, 166–167, 330–331, 337, 345, 364, 379–380 Postexposure events, person-time allocation, 107 Potential biases See Systematic errors Potential outcomes, 54, 385–386 See also Counterfactual outcomes binary exposure variables and, 75, 76t causal models and, 60 causation, 18 model, 59–60, 59t in regression, 385–388 in standardization, 387–388 sufficient-cause models v., 81–82, 81f Poverty, 536 Power models, 410 Power of statistical test, 153–154 Pr See Probability Precision data stratification and, 150 definition of, 148–149 statistical significance and, 162, 162f weighting, 271, 334–337, 670–671 See also Information-weighted averaging Prediction, 421 Prediction accuracy, 389–390, 436 Predictive values diagnostic tests, 643–644, 644t exposure misclassification, 138, 353–354 negative, 138, 353, 643–644, 644t positive, 138, 353, 643–644, 644t screening tests, 643–644, 644t sensitivity and specificity v., 357–358, 643–644, 644t surveillance systems, 479 Pregnancy See also Subclinical pregnancy loss complications, 632 hormone, 621 loss, 628–631 reproductive epidemiology, 620, 621 Preinfectiousness, latency v., 551 See also Infection Pretesting, nonresponse and, 499 Prevalence, 46–48 case-control studies, 97, 127 duration of disease, 47–48 etiologic research and, 46–47 model-based estimates, 439–440 multiplicative-intercept models, 431 odds, 48 pool, 46, 47 proportion, 46, 47 rate, 46 ratios, 69 studies, 88, 97 Preventable fractions, 54, 67 See also Attributable fractions Prevention 
programs, measles vaccine and, 462, 463f Preventive effects, 8, 9, 17, 54, 59, 65 See also Causal effects; Potential outcomes Primary base study, 114–115 Primary data, secondary data v., 481 Prior data, 337–342 See also Data priors; Priors Prior information, in model searching, 419–420 See also Bayesian analysis; Priors Prior limits, 330 See also Priors Prior model, in hierarchical regression, 436 Prior probability See Priors Prior parameters, Bayesian analysis and, 330 Prior standard deviation, hierarchical regression, 436–437 Prior data, 337–342 as diagnostic device, 338–339, 341–342 frequentist interpretation, 337–338 methods, extensions, 340 and reverse Bayes analysis, 338–339 Priors, 23, 166–167, 330–332, 337–343, 366–373, 378–380 See also Prior data; Probability; Probability distributions cautions, 342–343 correlated, 371–372 noninformative, 167, 335–337, 343, 348, 379–380 reference, 343 realistic, 335–336, 337, 365, 369–370, 371, 373 Privacy rights, surveillance, 462 Probabilistic bias analysis, 364–380 Probabilistic sensitivity analysis (PSA), 364–365 See also Monte-Carlo sensitivity analysis Bayesian analysis v., 378–380 Probability, 9–10 See also Likelihood; Probability distributions and densities; Risk conditional, 184–185 data, 331 frequency, 10, 331–332 in graphs, 187–190 marginal, 184 mass functions, 378 models for counts, 240–247 population, 184–185, 387 posterior, 23, 166–167, 330–331, 337, 345, 364, 379–380 prior, 23, 166–167, 330–332, 337–343, 366–373, 378–380 propensity, 10 standardized, 442–445 subjective, 10, 23–24, 165–167, 330–331 unconditional, 184 Probability distributions and densities, 184–185, 222–225, 370–372, 378–380 binomial, 223–225, 245–247, 254 graphical models for, 186–195 exact statistics and, 222–225 log F, 370 logistic, 367f, 370 logit logistic, 371 log normal, 371 logit normal, 371 normal, 367f parameter, 223 
Poisson, 241–245 population, 184–185, 387 posterior, 23, 166–167, 330–331, 337, 345, 364, 379–380 prior, 23, 166–167, 330–332, 337–343, 366–373, 378–380 trapezoidal, 367f, 371 uniform, 365, 369 Probability of causation (of a case), 64–65, 297–298 Probit model, 395 Product terms, 402–407 See also Statistical interaction biologic interactions v., 407 interpreting, 405–406 trends and, 404–405 Product-limit estimator, 290–292 Nelson-Aalen estimator v., 292 Product-limit formula, 42–43, 42f, 42t, 43, 290–292 censoring and, 45, 290 Prognostic scores, 447 See also Confounder scores; Outcome scores Progression axis, infectious disease and, 549, 550f Progression-related studies, 553–560 Promoter, in carcinogenesis, 16 Propagated outbreaks, epidemic outbreaks, 552 Propensity scores, 445, 448–450 See also Confounder scores; Doubly robust estimation; Exposure scores inverse-weighting by, 445, 448 Proportional hazards model See Cox model Proportional mortality ratio (PMR), 97–98 Proportional mortality studies, 97–99, 127 Proportional odds model, 415 Prospective ascertainment, 96 Prospective studies, retrospective studies v., 95–97 Prostate cancer, surveillance, 464, 465f PSA See Probabilistic sensitivity analysis Pseudo-denominators, in case-cohort studies, 252–253 Pseudo-frequencies, 113–114, 125 Pseudo-likelihood, 422 Pseudo-rates, 113–114 Pseudo-risks, 123–124 Puberty, 622–623 Public health bias analysis and, 347, 380 effects, 14 functions of, 462 interaction, 83 interventions, evaluation, 465–466, 466f laws, notifiable disease reporting, 472 services, planning and projections, 466–467 surveillance, history of, 460–462, 461t Public resources, secondary data access, ethics of, 491 Publication bias in meta-analysis, 656–657, 678–679 from study exclusion, 679 Pure count data See Count Data Pure direct effect, 200–201 Pure likelihood equation, 230 Pure likelihood inference, 228 Pure likelihood limits, 229 P-values, 151–153, 156–163, 364 See also Confidence intervals and 
limits; Evidence of absence of an effect; Hypothesis tests; Joint P-value; Lower-P-value; Mantel trend P-value; Mid-P-values; Nonsignificance; One-sided P-value; Significance tests; Two-sided P-value Bayesian theory and, 166 confidence intervals and, 158–159 continuity corrected, 232 data and, 216 deviance, 229–230 exact, 220 function, 158–159, 160, 160f, 161f, 239 hypothetical data, 158t interpretation, 151–152, 159–163 likelihood ratio, 229–230 linear hypothesis, 314–316 lower-tailed, 152 Mantel trend, 314–316 mid, 232–233 misinterpretations, 151–152 multiple testing, 234–237, 322–327 null hypothesis and, 153, 315, 454 one-sided, 156 from Pearson χ² statistic, 307 practice guidelines, 162–163 random error and, 156 score, 225–226 test statistics, 220–221 trend, 314–316 two-sided, 156, 221, 234 two-tailed, 152 types, 152–153 upper-tailed, 152 Wald, 220, 226–227 Quadratic model, 398 Quadratic spline, 411–412 Qualitative statistical interaction, 79–80 biologic interaction v., 299 Qualitative tally (vote counting), in meta-analysis, 680 Quality scoring, 681 Quantification of effects, 51–52 See also Effect measures in meta-analysis, 657–658 Quantiles, 217–218, 304 Quantitative criterion, variables and, 261 Quasi-experiments, 88–89, 547 Quasi-likelihood, 422 Questionnaires See also Data coding questionnaires; Data entry questionnaires; Food-frequency questionnaires; Modular questionnaire; Self-administered questionnaires; Semiquantitative food-frequency questionnaires administration method, 500–501 data and, 214 modular, multipurpose studies, 503 respondent burden, 503–504 R See Effective reproductive number (R) R², 424–425 Race/ethnicity, as social variable, 534–535 Racism, 537–538 Random allocation See Randomization Random coefficient models, 331, 423, 426, 427, 439 See also Hierarchical regression Random digit dialing (RDD), 117–118, 496 Random effects 
models, 452, 542–544, 628 See also Hierarchical regression in meta-analysis, 675–677 Random error, 24, 128, 148–167, 332–333; See also Confidence Limits; Probability distributions; P-values; Random sampling adjustment for, in bias analysis, 346, 366–369, 376, 379 conventional statistical analysis, 364 distributions, 222–225, 241–247, 421 P-values, 151–153, 156 P-value functions, 158–159 statistical precision and, 148–149 study size and, 149–150 systematic errors v., 346 Random sampling, 10, 213, 220, 222, 224, 346 See also Sampling error; Selection bias of cases, case-control studies, 115 of controls, case-control studies, 114, 122–123 for validation data, 361 Random variation, components, 148–149 See also Random error Randomization (Random allocation), 88–89 clinical studies, 641 d-separated unconditionally v., 187 experiments and, 168–169 of exposure assignment, 346 intervention studies, recruitment to, 494 Randomized experiments See Experimental studies Randomized recruitment, 604 Randomized trials, 18, 203f See also Experimental studies dietary hypothesis, 585–586 subject selection, 647–648 therapeutic interventions and, 646–649 Rare disease assumption in case-control studies, 114, 125 Rate fractions, 54, 63 See also Attributable fractions etiologic fractions v., 63–64, 297–298 probability of causation v., 64–65, 297–298 Rate, 34–36, 39–40 See also specific types, e.g., Case fatality rate; Incidence rate; Prevalence; Mortality rates standardized, 49–50 Rate difference, 52–53, 277 crude data, 243, 244, 245 Mantel-Haenszel, 273–274 standardized, 67–69, 266–267 Rate models, 395–398 in meta-analysis, 663 Rate ratio (RR), 61, 62, 116, 241 crude data, 243, 244, 245, 251, 254–255 ecologic studies and, 517, 518f Mantel-Haenszel, 273–274 relation to other ratio measures, 250 standardized, 67–69, 267–269 unconditional v. conditional, 272 Ratio measures, 53 See also Rate ratio; Relative risk; Risk ratio relations among, 250 Rationally coherent probability 
assessments, 166 See also Bayesianism RDD See Random-digit dialing Reactivation disease, latency and, 559 Reanalysis, for meta-analysis and, 659–668, 669t Recall, in interviews, 96 Recall bias, 111, 112, 138, 356–357 birth defects, 637 Recentering variables, 392 Recognized loss, 628–630 Record abstraction, 499 Record linkage, 486 in surveillance, 475, 476 Record-level adjustment, in bias analysis, 373 Recurrent events, recurrent outcomes, 36, 452–453 See also Longitudinal data Red blood cells, biochemical measurements, 594 Reference level, regressors and, 407 Reference model, in checking model fit, 425 Reference priors, 343 See also Noninformative priors Referent, for causes, Refutationism, 20–21, 24, 30 Registries, use of in epidemiologic research, 481–491 surveillance, 473 of vaccinations, 488 Regressand, 383–384 See also Outcome variables in multivariate regression, 388 Regression, 382 Bayesian, 388 binary, 383–384 causal, 385–386 frequentist, 382–383, 388–389 hierarchical, 237, 263, 435–439, 529, 630 meta-analytic, 673–677 multiple, 384 multivariate, 388 ridge, 331, 423, 436, 437, 439 See also Shrinkage estimation standardization, 386–388, 442–445 Regression analysis, 381–382, 418–456 Regression measures of effect, 385–388 Regression functions, 382–389 Regression models, 381–417 See also specific entries, e.g., Generalized linear models; Linear models; Logistic models; Log-linear models biologic interactions and, 406–407 contingency table data and, 216 in meta-analysis, 658–659, 673–677 smoothed curves and, 321 transformed, 400 Regression splines, 410–412 See also Nonparametric regression; Power models; Smoothers and Smoothing Regressor, 383–384 transformed, 398–400 Regressor-specific predictions, fitted model v., 423 Relative birth weight, 635 Relative difference, 54, 65 Relative effect measures, 52 Relative excess measures, 53–54 Relative risk 
scale, meta-analysis and, 670 Relative risk (RR), 53, 67 See also Rate ratio; Risk ratio adjustment of, in Bayesian-analysis, 336 factorization of, meta-analysis, 660–661 relations among ratio effect measures, 60–61 single two-way table, 334–335, 334t Relative tests of regression models, 425 Repeated-measures, 451 Representativeness generalizability and, 146–147 in case-control studies, 120–121 surveillance systems, 479 Reproductive endpoints, 621–622 Reproductive epidemiology, 620–641 Reproductive systems, 620 Reproductively unhealthy worker effect, 622 Resampling, 368–369, 376 model checking and, 428–429 Rescaling variables, 392–393 Research biomarkers, ELSI and, 567–568 conduct, 21 links to, surveillance, 464–465 objective, 32–33 questions, meta-analysis, 655 resources, secondary data access, 491 secondary data access, ethics of, 490–491 sponsors, suppression bias and, 678 staff, occupational cohort study, 495 Research synthesis See Bias analysis; Meta-analysis Reservoirs, infectious agents and, 555 Residual confounding, 69, 198–199, 199f, 259 bias quantification and, 198–199 Residual degrees of freedom global tests of fit, 427 meta-regression methods, 673 Residual distributions, model fitting and, 421–422 Residual effects, 435 Residual sum of squares, 427 meta-regression, 673 Register-based search, secondary data, 488 Respondent burden of questionnaires, 503–504 Response rates epidemiologic research, 497–499 interviews and, 509 Response types, causal and preventive, 59–60 and additivity, 78–79 cohorts and, equivalent interaction contrast and, 80t sufficient cause models and, 82 Restriction in study design, 169 Retention time of toxic exposures, 602–603 Retrospective ascertainment, 96, 112 See also Recall bias Retrospective cohort selection, 109 Retrospective studies, prospective studies v., 95–97, 112 Reference event, in incidence times, 33 Reverse causation, 28, 30, 622 Reverse-Bayes analysis, 338–339 Reverse-continuation-ratio model, 415 Ridge regression, 
331, 423, 436, 437, 439 See also Hierarchical regression; Shrinkage estimation Risk, 9–10 analysis, 347 assessment, 347 difference, 52–53 environmental hazards, 614 estimation, 290–293 fraction, 54, 65 history, 41, 42f models, 395 ratio, 61, 73, 116 score, 447 standardized, 49–50 Risk difference, 52–53, 260–261 crude data, 247, 249, 250 Mantel-Haenszel, 274–276 standardized, 68, 266–267 Risk-difference additivity, biologic interactions and, 78–79, 298–299, 406 Risk factors coronary heart disease, 14 lung cancer, 18 non-causal, 28 Risk fraction, 54 See also Excess fraction Risk models, 393–395, 417 Risk period, 33, 34 Risk ratio, 61, 73, 116 case-cohort data, 252–253, 280–281 crude data, 246, 247, 250 Mantel-Haenszel, 274–276 standardized, 68, 267–269 Risk scores, 447 See also Confounder scores, Outcome scores Risk-set sampling, 124, 125 RR See Rate ratio; Relative risk; Risk ratio Run-in phase, subject selection, randomized controlled trials, 647 Running mean, 317 Running-weighted regression curve, 321 See also Nonparametric regression; Smoothers and Smoothing Russell, Bertrand, philosophy of, 19 Salk vaccine field trial, Salmonella, infection occurrence, factors influencing, 556 Salmonella newport, surveillance, 463 Sample-size, considerations, 239 See also Study size model fitting and, 422–423 Sampling error, 149, 346 See also Random error; Random sampling; Selection bias fractions, case-control studies, 115, 123–125, 430–432 rates, case-control studies, 113, 430 variation, 149 SARS See Severe acute respiratory syndrome Saturated regression model, 427 Scale dependence of effect-measure modification, 72–74 of regression coefficients, 392–393 Scaling, of graphs, 310–311, 312–313, of regressors, 392–393 Scanned forms, 500 Scatterplot smoothers, 321 See also Nonparametric regression; Smoothers and Smoothing Scatterplots, 216, 321 of test statistics, 681 
Schwarz information criterion See Bayesian information criterion Scientific inference, philosophy of, 18–25 Scientific proof, impossibility of, 24–25 Score limits, 226 Score statistics, 225–226 See also Mantel-Haenszel statistics for case-cohort data, 252 for case-control data, 251 for person-time data, 242–245 for pure count data, 245–250 Score test, 225–226 Scoring methods for confounder control See Confounder scoring Scoring of categories, 312–313, 410 Screening tests, studies, 642–646 SD See Standard deviation SE See Standard error Seasonal variation, 605, 608, 609, 611 See also Cyclic patterns Second stage model, hierarchical regression and, 438 Second stage standard deviations, hierarchical regression, 436 Secondary attack rate, 560–561 Secondary data, 481–491 access, ethics of, 490–491 analysis, complete populations, 482–483 use of, examples, 484 validity and, 483 Second-stage model, 435–437 See also Hierarchical regression Secular trends, migrant studies and, nutritional exposures, 582 Segregation analysis, genetic, 565 Segregation, racial/social, 539 Selection bias, 2, 134–137, 186, 193, 193f, 194, 362–363 See also Berksonian bias; Collider bias; Publication bias; Sampling error; Source population; Target population analysis of, 362–363, 375–376, 662–663 case-control studies, matching, 205, 205f complete populations, 482–483 confounding v., 136–137 graphical representation and analysis, 192–198 and publication bias, 654–655 quantitatively v. qualitatively, 136 substudies and, 488 Selection time, study design and, 122 Self-administered questionnaires, 500 Self-reported disease, surveillance, 473 Self-selection bias, 134 Semen quality, 623–624 Semi-Bayes analysis, 337, 345, 378–380, 437, 439 See also Bayesian analysis; Hierarchical regression; Shrinkage estimation biases and, 378–380 Semilogarithmic plotting, 310, 311f Semiquantitative food-frequency questionnaires, 591 Sensitivity in bias analysis, 354–360 of diagnostic or screening test, 643–644 of 
exposure measurement method, 138 influence analysis and, 221–222 in probabilistic bias analysis, 372–375 relation of predictive values to, 357–358 surveillance systems, 479 Sensitivity analysis, 208, 355, 488 See also Bias analysis base priors and, 380 Bayesian analysis, 342, 378–380 external adjustment process and, 351, 351t meta-analysis, 661–662, 675 multiple bias analyses and, 363 unmeasured confounders, 349 Sentinel events, surveillance, 475 Sequelae prevention, clinical trials and, 90 Serial interval, state of infectiousness, 550f, 560 Serotyping, 553 Seroconversion, HIV, 557 Seventh Day Adventists studies, 582 Severe acute respiratory syndrome (SARS), infection occurrence, factors influencing, 556 Sex/gender, as social variable, 535 Sharp null hypothesis, 60 Shep’s relative difference, 65 Short-term recall, food intake, 589 Shrinkage estimation, 331, 337, 423, 436, 437, 439 See also Bayesian analysis; Hierarchical regression confounder control and, 263 Siblings records, secondary data, 484 Side effects of treatments See Adverse events Sign test, 680 Significance tests, 150, 151–153 confidence intervals v., 157–158 model searching and, 419 Simplifying search, model searching, 419 Simulation intervals, 365 Simultaneous analysis, 323–327 See also Multiple comparisons of exposure levels, 323–324 of multiple outcomes, 325–327 single-comparison v., 326–327 of trends, 326 Simultaneous misclassification, 145–146 Sine curve, cyclic patterns, 608 Single study group large-sample methods, 245–247, 247t person-time data, large-sample methods, 240–243 small-sample statistics, person time data and, 253–254 Single-regression models, 401 multiple-regression models v., 401–402 SIRs See Standardized incidence ratios Small-sample methods, 220, 239, 253–257, 276 See also Exact statistics; Sparse data person-time data, 253–255 count data, 255–257 Smoking 1, 2, 8–10, 
13–14, 24, 26–30, 52, 55–56, 64–66, 117, 118, 119, 121, 122, 251, 263, 264, 266–267, 274, 280, 424, 483, 488, 608, 616 in pregnancy, 602, 624, 632, 634, 635 Smoothed curves, 319–321, 410–412 Smoothers and Smoothing, 313–314, 317–321, 410–412, 421, 438–442 See also Moving averages; Nonparametric regression; Power models; Spline models fractional polynomial, 410 graphical, 313–314 with hierarchical regression, 438–439 kernels, 314, 318–320, 440, 441 locally linear, 441 power model, 410 regression splines, 410–412 for surveillance data, 478 variable span, 321, 442 windows, 317–321 Smoothing parameter, 441–442 Smoothing splines, 440–441 See also Nonparametric regression; Spline models SMR See Standardized morbidity ratio SNFT See Structural nested failure-time models SNM models See Structural nested mean models Snow’s natural experiment, 94, 95 See also Cholera, Snow’s study of Social capital, 541–542 Social class, 537, 548 intercommunity comparisons, 611–612 Social epidemiology, 532–548 analytic approaches, 542–548 covariate assessment, 534–542 exposure, 534–542 Social factors, 534–542 aggregate level, 538–542 individual level, 534–542 Social mobility, 539, 547 Software for Bayesian analysis, 338 for hierarchical modeling, 436 problems in distinguishing models, 402 Source population, 114–115, 128–129, 383 See also Sampling error; Selection bias; Superpopulation; Target population confounding and, 129–133 subject identification, 496 target population v., 129, 383 Sparse data See also Small-sample methods bias from, 263, 386, 422, 433 methods for, 272–278, 280–281, 283–288, 294, 296, 314, 316, 381, 386, 425, 433–435, 446–451 Spatial-data analysis, time-trend designs v., 609–610 Special-exposure cohorts, 109–110 Specification accuracy, 449–450 Specification biases, 346 Specification error, 346 Specificity, as causal criterion, 27–28 Specificity in bias analysis, 354–360 of diagnostic or screening test, 643–644 of exposure measurement method, 138 in probabilistic bias 
analysis, 372–375 relation to predictive values, 357–358 Spline models, 410–412, 440–441 See also Nonparametric regression; Power models; Smoothers and Smoothing Spontaneous abortion, 621, 628 Square-root model, 398 Stability assumption in causal diagrams, 191 Standard, choice of, 266 Standard deviation (SD) See also Variance flaws in conventional estimates, 239, 346–347 flaws in rescaling by, 393, 681 Standard distribution, 49–50, 68–69, 265–266, 386–388 See also Standardization Standard error (SE) See also specific estimates, e.g., Risk ratio of externally adjusted estimates, meta-analysis, 661–662 impact of model searching on, 419 Standardization, 49–50, 67–69, 265–269, 386–388, 442–445 See also Inverse-probability weighting; Standard distribution; Standardized measures; Standardized morbidity ratio in case-control studies, 269, 445 by exposure models, 444–445 by full-data models, 444 by outcome models, 443 of regression, 386–388 Standardized coefficients, 681 See also Standard deviation, flaws in rescaling by Standardized incidence ratio (SIR), 68 See Standardized morbidity ratio Standardized measures, 33, 386–388 See also Standardization of association, 67–69 of effect, 67–69, 386–388 of outcome, 33, 387 probability, 442–445 rate, 49, 67, 265, 442 rate difference, 68, 266 rate ratio, 68, 267, 269 regression, 386–388, 442–445 risk, 50, 265 risk difference, 68, 266 risk ratio, 68, 267 regression coefficients v., 443–444 Standardized morbidity ratio (SMR), 68–69, 241–242, 267, 269, 664–665 Standardized mortality ratio, 68 See Standardized morbidity ratio State of infectiousness, 550f, 560 assessing risk of, 561–563 progression-related infectious process and, 550f, 560 Stationary population, 38–39, 47 Statistical approximations, accuracy of, 220, 225, 227–229, 243, 245, 247, 249, 250, 324, 334, 340, 342 Statistical biases, variable selection, 263, 419 Statistical efficiency, overmatching and, 176t, 179–180 Statistical estimation, 156–167 
Statistical independence, 184–185 absence of open paths and, rules linking, 187–190 and d-separation rules, 189–190 Statistical inference, philosophies of See Bayesianism; Frequentist methods; Pure likelihood inference Statistical interaction, 72–74, 402–404 See also Effect-measure modification; Heterogeneity; Product terms and biologic interactions, 82–83, 298–300, 407 effect-measure modification v., 72 qualitative, 79–80 Statistical precision, 149 Statistical significance, 151 See also Hypothesis tests; P-values; Significance tests; inference and, 160f, 161 Statistical tests See Hypothesis tests; P-values; Significance tests Statistics, epidemiology and, 167 Steady state population, 38–39 Stein estimation, 331, 423, 439 See also Bayesian analysis; Hierarchical regression; Shrinkage estimation Stepwise model searching, 419, 420 Stereotype model, 415 Stillbirth mortality, 635–636 Stratification of data, and precision, 150 Stratified analysis, 258–302 problems with, 179 Stratified count data, polytomous exposure, 305, 306t Stratified logistic model, 433 Stratified models, 432–433 Stratified null hypothesis, P-values, 277–278 Stratified person-count data See Stratified count data, polytomous exposure Stratified person-time data, polytomous exposure, 305, 305t Stratified pure count data, P-values, 278 Stratum-specific estimates, 258, 264–265 measures, overall measures v., 62 models, 433 Strength associations, causal inference, 26–27 Strength of effects, 6f, 10–13, 14t Streptococcus pneumoniae, laboratory-based surveillance, 472–473 Stress, during pregnancy, 485 Structural equations, 60, 208, 453 Structural nested models, 453–455 See also G-estimation Structural nested failure-time (SNFT) models, 453–454 Structural nested mean (SNM) models, 453 Study See also Epidemiologic studies base, in case-control studies, 112 efficiency, 150 apportionment ratios and, 169–171 endpoints, 648–649 
  exposure, levels of, 440
  hypothesis, person-time and, 102
  populations, variations, 493–494
  primary base, secondary base v., 114–115
  size
    binomial likelihood, 247
    formulas, 149–150
    meta-analysis and, 679
    random error and, 149–150
    small, publication bias and, 679
  variables, in meta-analysis, 655–656
Subclinical infection, clinical disease v., 551
Subclinical pregnancy loss, 630–631, 631f
Subcutaneous fat, biochemical measurements, 594
Subinterval-specific incidence rates, survival proportion and, 45
Subject selection
  case-control studies, 120–121, 496–497
  community trials, 494
  and recruitment, epidemiologic studies, 493–499
  run-in phase, randomized controlled trials, 647
Subject tracing, 109, 157
Subjective Bayesianism, 165–166, 328–333. See also Bayesianism; Subjective probability
  frequentism v., 166–167, 329–333
Subjective probability, 10, 23–24, 165–167, 330–331
Subjects, classification of, 218–219
Subject-selection bias, meta-analysis, 656–657
Subject-specific models, 544. See also Group-specific models
Substudies, selection bias and, 488
Sufficient cause, 13
  disease and, 6, 6f
  mechanisms, disease proportion and, 13–15
Sufficient component cause model, 6–18
  biologic interactions under, 80–81
  epidemiology and, 8–9, 8f
  indirect effects and, 13
  potential-outcome v., 81–82, 81f
  scope of, 17–18
Sufficient conditioning sets, 192
Superpopulation, 151, 152, 383
  for regression, 383
“Superspreaders,” infection occurrence, 556
Suppression bias, 678
Surgeon General’s Smoking and Health, 29
Surrogate confounders, 130
Surveillance, 459–480
  approaches to, 472–476
  history of, 460–462, 461t
  objectives, 462–467
Surveillance data
  analysis and interpretation, 476–478
  assessing completeness, 478
  education and policy, 467
  presentation of, 478–479
Surveillance methods, combinations of, 475–476
Surveillance reports, surveillance data, 478
Surveillance systems, 459
  attributes of, 479
  cycles of, 470
  elements of, 467–472
  participation incentives, 470–471
  SARS, 480
Surveys, 473. See also
Cross-sectional studies; Prevalence studies
Survival analysis, 41, 290–294, 291t. See also Accelerated failure-time model; Cox model
Survival proportion, 40
  average incidence time v., 45
  subinterval-specific incidence rates and, 45
Survival time. See Incidence time
Survivor bias, 198
Susceptibility measures, 64–65
Susceptibility factors, infection occurrence, 556
Syndromic surveillance systems, bioterrorism-related epidemics and, 464
Synergism, 76–82. See also Biologic interaction
Synthetic meta-analysis, problem with, 654
Systematic errors (biases), 128–146, 345–380
  random errors v., 346
Tabular analysis. See Stratified analysis
Tabular checks, regression model checking and, 423–424
Tabular data, simultaneous statistics for, 323
Tabular presentation, surveillance data, 478–479
Target population, 60, 90, 146–147, 383. See also Source population; Sampling error; Selection bias
  confounding and, 60
  generalizability of trial results, 90
  matching, 172
  source population v., 129, 383
  surveillance systems, 471
TDT. See Transmission disequilibrium test
Temporal ambiguity, ecologic studies, 527
Temporality, causal inference and, 28
Test hypothesis, 152–154, 156, 165, 220. See also Null hypothesis
Test of fit, 425–427
Test statistics, 152, 219–237
Test validity, screening tests, 642, 643, 643t. See also Misclassification
Test-based method for extraction of estimates, meta-analysis, 660
Tetanus, surveillance, 464
Therapy, studies of, 646–650
Time at risk, 102
  time of exposure v., 102
Time clustering, 606–607, 607f
Time unit, incidence rates and, 36
Time windows, induction analysis, 300–302
Time-dependent exposures, 300–302, 397–398, 452. See also Longitudinal data
Time-dependent covariates, 397–398, 452
Time-related disease patterns, 606–608
Time-series analyses, 517, 610–611
Time-stratified referent periods, case-crossover studies, 605, 605f
Time-to-pregnancy studies,
625–628
Time-trend designs
  ecologic studies, 515–517
  spatial-data analysis v., 609–610
Time-varying covariates, 397–398, 452. See also Longitudinal data
Tissue analysis, biochemical measurements, 594
Tobacco. See Smoking
Total energy intake, dietary intake and, 596–597
Townsend and Carstairs deprivation indices, 538
Toxic shock syndrome, surveillance, 465, 466f
Tracing, epidemiologic studies, 109, 507
Transformed regressors, 398–400
Transformed-outcome models, 400
  generalized linear models v., 416
Transmission axis, 549, 550f
Transmission disequilibrium test (TDT), 638
Transmission, factors influencing, 561
Transmission-related states, infection and, 551, 552
Transmission-related studies, epidemiologic issues in, 560–563
Trapezoidal distribution, 371
Treatment assignment, clinical trials and, 90
Trend. See also Dose response
  analysis, 314–317
  dose-response and, 308–314, 308t
  graphing, 308–314
  modeling, product terms and, 404–405
  P-value, 314–316
  in surveillance data, 477–478, 477f
Trend models, 398–399, 408–413
  category codes in, 409
  hierarchical, 438
  trend variation models, 412–413
Trend statistics, 314–316
Triple-blind studies, 91
Tumor heterogeneity, 577
Tumor promoter, 16
Twin studies, 15, 568–569
Two-binomial model
  large-sample methods, 247–250
  Wald confidence limits, 250
Two-by-two tables, 247, 247t, 345. See also Contingency tables, data analysis and
Two-phase studies. See Two-stage studies
Two-Poisson model, 245
Two-sided confidence interval, 158, 221
Two-sided P-value, 156, 221
Two-stage studies, 127, 496, 604
  analysis of, 281–282, 432
Two-tailed P-value, 152
Type I error (alpha error), 153, 154
  DSMB and, 91
  gene-disease association, 571–572
Type II error (beta error), 153, 154, 155f
Typhoid fever, evolution of, 553, 554f
Uncertainty, 329, 331, 346–347, 364–365, 370, 376, 378, 379, 380
Uncertainty analysis, 346–347. See also Bayesian analysis; Bias analysis; Uncertainty
Unconditional analysis, conditional analysis v., 272–273
Unconditional d-separation,
187–188. See also d-Separation
Unconditionally unbiased, in causal diagrams, 191
Unconditional probability, 184. See also Probability
Uncontrolled confounders, analysis of, 348–352, 365–371
Unconfounded dependence, in causal diagrams, 193
Unconfounded direct effect, diagram, 201f
Unemployment, 537
Unexposed group, bias and, in trend evaluation, 317
Unexposed time in exposed subjects, cohort studies, 104
Unhealthy worker effect, 622. See also Healthy worker effect
Uniformity of effect measures, 61, 259. See also Homogeneity
Uniform prior distributions, 365, 369
Univariate analysis. See Descriptive statistics
Univariate exposure transforms, 398–399
Unmatched case-control studies, modeling in, 429–432
Upper P-value, 220
Upper-tailed P-value, 152, 220
U.S. Surgeon General’s Smoking and Health, 29
UV-B radiation, cancer and, 617
Vaccines and autism, secondary data, 485–486
Vacuous models, 390–391
Validation data, 361
  dietary assessment methods, 591–592
  secondary data, 487
Validity. See also Bias; Generalizability; Precision; specific validity topics, e.g., Confounding; Selection bias; Sparse-data bias
  of biomarkers, 566
  external, 128
  in epidemiologic studies, 128–147
  of estimation, 128–129
  ethical considerations v., 89
  internal, 128
  secondary data and, 483
Variables. See also specific types, e.g., Regressors
  in causal graphs, 186–187
  in regression and causation, 383
Variable selection, 261–263, 419–421
Variable-span smoothers, 321, 442. See also Nonparametric regression; Smoothers and Smoothing
Variance. See also specific estimates, e.g., Standardized estimates
  bootstrap estimation, 429, 443
  and confounder scores, 447
  estimates, 149, 225, 226, 231, 243, 246, 248, 253, 254, 256, 422, 440, 443
  model, 436
  residual, 422, 425
Variance component models. See Hierarchical regression
Vectors, 384–385
Vertical scaling of graphs, 310–311
Vertices, in causal diagrams, 187
Virulence
factors, 558
Volunteer providers, surveillance networks, 473
Vote counting, in meta-analysis, 680–681
Waiting time, 39. See also Incidence time
Wald limits, 227
  for binomial data, 246
  for case-cohort data, 252–253
  for case-control data, 250–252
  for person-time data, 240–245
  for pure count data, 245–250
Wald statistic, 220, 226–227, 279
Wald test, 227
Weighted averaging
  for Bayesian analysis, 334–337
  by information, 334–337
  for pooling across strata, 271
  for smoothing, 318–319, 441
  for meta-analysis, 670–671
  by inverse probability of selection. See Inverse-probability weighting
  by inverse variance, 271, 334–337, 670–671
  by precision, 271, 334–337, 670–671
  by quality scores, 681
  in standardization, 49–50, 67–69, 265–269, 386–388
Weighted regression, 441. See also Inverse-probability weighting
  in meta-analysis, 663, 668
Weights
  in meta-analysis, 670–671, 681
  in standardization, 49–50, 67–69, 265–267, 386–388
Weight-specific mortality, 633–635, 634f
WHO. See World Health Organization
Windows. See also Smoothers and Smoothing
  in smoothing, 317–321
Within-community time-series analysis, air pollution, 616
Within-group bias, ecologic studies, 520
Within-group misclassification, ecologic studies, 526–527, 526f
Withdrawals. See Loss to follow-up; Censoring
Woolf method for pooling, 271. See also Information-weighted averaging
World Health Organization (WHO) surveillance, 466, 474
Zero level, in trend evaluation, 316–317
Z-ratio, 220. See Wald statistic
Z-score. See Wald statistic; Z-ratio

... rare red meat with persons who ate highly cooked red meat. The same exposure, regular consumption of rare red meat, might have a preventive effect when contrasted against highly cooked red meat...
grafted onto it. Kuhn and others have argued that the consensus of the scientific community determines what is considered accepted and what is considered refuted. Kuhn’s critics characterized this