Case-Control Studies for Outbreak Investigations
Describe the basic steps of conducting a case-control study
Discuss how to select cases and controls
Discuss how to conduct basic data analysis (odds, odds ratios, and matched analysis)
Provide examples of recent outbreak investigations that have used the case-control study design
Quick Review of Case-Control Studies
Analytic studies answer “what is the relationship between exposure and disease?”
Case-control design often conducted with relatively few diseased individuals (so is efficient)
Case-control design useful when studying a rare disease or investigating an outbreak
Case Selection
Depends on how the study investigator defines a case
Case definition: “a set of standard criteria for deciding whether an individual should be classified as having the health condition of interest” (1)
Clinical criteria
Restricted to time, place, person characteristics
Simple, objective, and consistently applied
Mass screening programs
Case-patients identify other persons who have similar illness
Case Selection Example
August 2001: Illinois Department of Health notified of a cluster of cases of diarrheal illness associated with exposure to a recreational water park in central Illinois (2)
Local media and community networks used to encourage ill persons to contact the local health department
Case-patients asked if there were any other ill persons in their household or if anyone attending the water park with them was ill
Control Selection
Most difficult part of a case-control study!
We would like to be able to conclude that there is an association between exposure and disease in question
Way the controls are selected is major determinant of whether this conclusion is valid (3)
Control Selection (1)
Controls are persons who do not have the disease in question
Should be representative of population from which cases arose (source population)
If a control had developed the disease, would have been included as a case in the study
Should provide good estimate of the level of exposure one would expect in that population
Control Selection
Sources for controls:
Same health-care institutions or providers as cases
Same institution or organization as cases (e.g., schools, workplaces)
Relatives, friends, or neighbors of cases
Randomly from the source population (1)
May choose multiple methods of control selection
Source will depend on the scope of the outbreak
May choose multiple controls per case to increase likelihood of identifying significant associations (usually no more than 3 controls per case)
Control Selection Example
Persons served by the same health-care institution or providers as the cases
August 2001: cluster of Ralstonia pickettii bacteremia among neonatal intensive care unit (NICU) infants at a California hospital (4)
Controls were NICU infants who:
1. Had blood cultures taken during either cluster period (July 30-August 3 and August 19-30);
2. Had blood cultures that did not yield R. pickettii; and
3. Had been in the hospital for at least 72 hours
Attempted to recruit 2 controls per case-patient
Control Selection Example
Members of the same institution or organization
2004: outbreak of varicella in a primary school in a suburb of Beijing, China (5)
Case-control study to identify factors contributing to high rate of transmission and assess effectiveness of control measures
Controls included randomly selected students in grades K-2 of the primary school with no history of current or previous varicella
One control recruited for each case-patient
Control Selection Example
Relatives, friends, or neighbors
August 2000: increase noted in Salmonella serotype Thompson isolates from Southern California patients with onset of illness in July (6)
Preliminary interviews found many case-patients had eaten at Chain A restaurant in 5 days before illness onset
Case-control study conducted to evaluate specific food and drink exposures at Chain A restaurants
Controls were well friends or family members who shared meals with cases at Chain A during exposure period
Control Selection Example
Random sample of the source population
January-June 2004: aflatoxicosis outbreak in eastern Kenya resulted in 317 cases and 125 deaths (7)
Case-control study conducted to identify risk factors for contamination of implicated maize
Randomly selected 2 controls from each case-patient’s village
Spun a bottle in front of village elder’s home and walked to fifth house in direction indicated by the bottle (or third house in sparsely populated areas)
Random number list was used to select one household member
Control Selection Example
Multiple methods of control selection
In water park outbreak in Illinois previously mentioned, recruited 1 control per case using 3 methods (2)
Case-patients asked to identify another healthy person
Used local reverse-telephone directory based on residential address of case-patients
Canvassed local schools and community groups
Selection Bias
Bias: distortion of relationship between exposure and disease
Systematic difference in way you select your controls compared to way you select your cases that could be related to the exposure could introduce bias
Bias related to the way cases or controls are chosen for a study is ‘selection bias’
Selection Bias Example
Case-patients more likely to work on lower floors of an office building and employees on the lower floors are more likely to leave the building to go out for lunch
If control population is mostly employees from upper floors, may conclude there is a real difference between cases and controls associated with eating at a local deli
But the difference is due to where they worked in the building, which influenced how often they ate out
Selection Bias Example
Outbreak at a gym and a majority of the case-patients are females
Majority of the controls are male
Found an association between illness and an aerobics class
Outbreak was caused by the steam in the sauna in the women’s locker room
Relationship between illness and the aerobics class due to the fact that women are more likely to take an aerobics class than men
Validity is dependent on the similarity of cases and controls in all respects except for exposure
“Match” cases and controls on characteristics like age and gender
Matching factors should be important in disease development, but not the exposure under investigation
Since matching variable will not be associated with either case or control status, it cannot confound, or distort, the exposure-disease association
Analysis of data must take matching into account
Individual matching (aka matched pairs)
Matches each case with a control that has specific characteristics in common with the case
Used when each case has unique and important characteristics
Group matching (aka frequency matching, category matching)
Proportion of controls with certain characteristics is made identical to the proportion of cases with these same characteristics
Requires that all cases be selected first so investigator knows the proportions to which the controls should be matched
If 30% of cases were male, would select so that 30% of controls were male
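The group-matching idea above can be sketched in Python. This is only an illustration under stated assumptions: the function name `frequency_match`, the record layout (dictionaries with a `sex` field), and all counts are hypothetical and do not come from any of the cited investigations.

```python
import random
from collections import Counter

def frequency_match(cases, candidates, key, n_controls, seed=0):
    """Draw controls so each level of `key` has the same share as among the cases.

    Hypothetical helper: quotas are rounded, so with awkward proportions the
    total may differ slightly from n_controls.
    """
    rng = random.Random(seed)
    case_counts = Counter(person[key] for person in cases)
    controls = []
    for level, count in case_counts.items():
        quota = round(n_controls * count / len(cases))
        pool = [person for person in candidates if person[key] == level]
        # random.sample raises ValueError if the pool is smaller than the quota
        controls.extend(rng.sample(pool, quota))
    return controls

# Hypothetical data: 30% of cases are male
cases = [{"sex": "M"}] * 3 + [{"sex": "F"}] * 7
candidates = [{"id": i, "sex": "M" if i % 2 == 0 else "F"} for i in range(100)]

controls = frequency_match(cases, candidates, key="sex", n_controls=20)
male_share = sum(person["sex"] == "M" for person in controls) / len(controls)
print(len(controls), male_share)
```

All cases must be known before the quotas can be computed, which is why the function takes the full case list as input.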
Can be time efficient, cost effective, and improve statistical power
The more variables that are chosen as matching characteristics, the more difficult it is to find a suitable control to match to the case
Once a variable is used for matching, no relationship can be discerned between this variable and the disease
Don’t match on anything you think might be a risk factor!
Individual Matching Example
Outbreak of tularemia in Sweden in 2000 (8)
Selected two controls for each case
Matched for age, sex, and place of residence
Identified through computerized Swedish National Population Register (stores name, date of birth, personal identifying number, address of all citizens and residents)
Group Matching Example
Outbreak of Escherichia coli associated with petting zoo at 2004 North Carolina State Fair (9)
Recruited 3 controls for each case
Group-matched by age groups (1-5 years, 6-17 years, and 18 years and older)
Identified from list provided by fair officials of 23,972 persons who purchased tickets to the fair online, at kiosks, or in malls
Conducting the Investigation
Gather demographic information and exposure histories from cases and controls
After you have collected the data you need, you can begin the analysis and calculate measures of association
Analyzing the Data
Odds ratio is calculated to measure the association between an exposure and a disease outcome
Calculating Odds
Odds measure occurrence of an event compared to non-occurrence of same event
Variables with two levels (binary variables) used to calculate an odds ratio
Examples of binary variables: yes/no responses (disease/no disease, exposed/not exposed)
Calculating Odds
Odds of exposure among cases calculated by dividing number of exposed cases by number of unexposed cases
Odds of exposure among controls calculated by dividing number of exposed controls by number of unexposed controls
An Odd Measure – How are odds different from probability or risk?
In a bag containing 20 poker chips: 4 red and 16 blue…
Probability is the number of times something occurs divided by the total number of occurrences
Probability of getting red is 4/20 (or 1/5 or 20%)
Probability of getting blue is 16/20 (or 4/5 or 80%)
Odds are the number of times something occurs divided by the number of times something does not occur
Odds of getting red are 4/16 (or 1/4)
Odds of picking blue are 16/4 (or 4/1)
May refer to the odds of getting blue (4 to 1) as odds of 4 to 1 against getting red
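The poker-chip arithmetic can be checked with a few lines of Python. This is a minimal sketch using the chip counts from the example; the variable names are illustrative only.

```python
from fractions import Fraction

red, blue = 4, 16
total = red + blue

# Probability: occurrences divided by all possible occurrences
p_red = Fraction(red, total)     # 1/5
p_blue = Fraction(blue, total)   # 4/5

# Odds: occurrences divided by non-occurrences
odds_red = Fraction(red, blue)   # 1/4
odds_blue = Fraction(blue, red)  # 4

# The two measures are linked by odds = p / (1 - p)
print(p_red, odds_red, p_red / (1 - p_red))
```

Using `Fraction` keeps the values exact, so 4/20 prints as 1/5 rather than a rounded decimal.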
Calculating Odds
A 2x2 table shows distribution of cases and controls:

              Cases   Controls
Exposed         a        b
Unexposed       c        d
Calculating Odds Ratios
Odds ratio is odds of exposure among cases divided by odds of exposure among controls
Exposure among cases is compared to exposure among controls to assess if and how exposure levels differ between cases and controls
Calculating Odds Ratios
Odds ratio calculated by dividing odds of exposure among cases (a/c) by odds of exposure among controls (b/d)
Numerically the same as dividing the products obtained when multiplying diagonally across the 2x2 table (ad/bc)
Also known as “cross-products ratio”
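A short Python sketch shows that the two calculations agree. The cell labels follow the definitions above (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls); the function names and the counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def odds_ratio(a, b, c, d):
    """Odds of exposure among cases (a/c) divided by odds among controls (b/d)."""
    odds_cases = a / c
    odds_controls = b / d
    return odds_cases / odds_controls

def cross_product_ratio(a, b, c, d):
    """Same quantity computed from the diagonal products, ad/bc."""
    return (a * d) / (b * c)

# Hypothetical counts, not taken from any study cited in these slides
a, b, c, d = 30, 20, 10, 40

print(odds_ratio(a, b, c, d))           # (30/10) / (20/40) = 6.0
print(cross_product_ratio(a, b, c, d))  # (30*40) / (20*10) = 6.0
```

Both functions fail with a division by zero if c, b, or d is 0; handling of empty cells is left out of this sketch.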
Calculating Odds Ratios
To interpret odds ratio, compare value to 1:
If odds ratio = 1: odds of exposure is the same for cases and controls (no association between disease and exposure)
If odds ratio > 1: odds of exposure among cases is greater than among controls (a positive association between disease and exposure)
If odds ratio < 1: odds of exposure among cases is less than among controls (a negative, or protective, association between disease and exposure)
Calculating Odds Example
Outbreak of hepatitis A among patrons of a single Pennsylvania restaurant (10)
240 case-patients and 134 controls identified
Matched Analysis
If individual matching, 2x2 table set up differently
Examine pairs in table, so have cases along one side and controls along the other, and each cell in the table contains pairs

                    Control exposed   Control unexposed
Case exposed               e                 f
Case unexposed             g                 h
Matched Analysis
Cell e contains number of matched case-control pairs where both case and control were exposed
A concordant cell (as is cell h) because case and control have same exposure status
Cell f contains number of matched case-control pairs where cases were exposed but controls were not
A discordant cell (as is cell g) because case and control have different exposure status
Only discordant cells give useful data: the matched odds ratio is calculated as cell f divided by cell g
Matched Odds Ratio = f/g
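The matched calculation can be sketched in Python by tallying pairs into the four cells. The pair encoding (`(case_exposed, control_exposed)` booleans), the function name, and the counts are assumptions made for illustration, not data from any cited outbreak.

```python
from collections import Counter

def matched_odds_ratio(pairs):
    """Matched odds ratio f/g from (case_exposed, control_exposed) pairs."""
    cells = Counter(pairs)
    f = cells[(True, False)]   # case exposed, control unexposed (discordant)
    g = cells[(False, True)]   # case unexposed, control exposed (discordant)
    if g == 0:
        raise ZeroDivisionError("no pairs in cell g; matched odds ratio undefined")
    return f / g

# Hypothetical matched pairs: e = 10, f = 12, g = 4, h = 14
pairs = (
    [(True, True)] * 10
    + [(True, False)] * 12
    + [(False, True)] * 4
    + [(False, False)] * 14
)

print(matched_odds_ratio(pairs))  # 12 / 4 = 3.0
```

The 24 concordant pairs (cells e and h) do not enter the calculation at all, which is the point made above about only discordant cells being informative.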
Odds vs Risk
Odds are qualitatively different from risk (calculated in a cohort study)
Case-control studies select participants based on disease status and then measure exposure among the participants
Can only approximate risk of disease given exposure
Values needed to calculate risk are not available because entire population at risk is not included in the study
Finding and accessing all who did not get sick would be difficult or impossible
Case-control study allows us to use only a subset of controls and calculate the odds ratio as an estimate of the relative risk
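How well the odds ratio stands in for the relative risk depends on how common the disease is: the two are close when the disease is rare and drift apart when it is common. The Python sketch below illustrates this with two entirely hypothetical populations in which everyone's exposure and outcome are known; the function names and counts are assumptions, not data from the slides.

```python
def risk_ratio(a, b, c, d):
    """Relative risk: a = exposed ill, b = exposed well, c = unexposed ill, d = unexposed well."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

def odds_ratio(a, b, c, d):
    """Cross-products ratio ad/bc using the same cell labels."""
    return (a * d) / (b * c)

# Hypothetical rare disease: 2% of exposed and about 0.33% of unexposed fall ill
rare = (20, 980, 30, 8970)
# Hypothetical common disease: 60% of exposed and 10% of unexposed fall ill
common = (600, 400, 100, 900)

print(risk_ratio(*rare), odds_ratio(*rare))      # relative risk about 6.0, odds ratio about 6.1
print(risk_ratio(*common), odds_ratio(*common))  # relative risk about 6.0, odds ratio 13.5
```

In an actual case-control study only the odds ratio can be computed, because the well persons are a sample rather than the full population at risk.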
Example Case-Control Study:
November 1999: children’s hospital notified Fresno County Health Department (California) of 5 cases of E. coli O157 infections during a 2-week period (11)
All case-patients had eaten at popular fast-food restaurant chain A in 7-day period before onset of illness
Local health officials and clinicians throughout California asked to enhance surveillance for E. coli O157 infections
States bordering California asked to review medical histories of persons with recent E. coli O157 infections and arrange for subtyping of isolates
2 sequential case-control studies conducted in early December 1999
Example Case-Control Study:
First study conducted to determine the restaurant associated with the outbreak
Case defined as patient with:
An infection with the PFGE-defined outbreak strain of E. coli O157:H7, diarrheal illness with more than 3 loose stools during a 24-hour period, and/or hemolytic uremic syndrome (HUS) during the first 2 weeks of November 1999; or
Illness clinically compatible with E. coli O157:H7 infection, without laboratory confirmation but with epidemiologic connection to the outbreak
Control defined as person without a diarrheal illness or HUS during the first 2 weeks of November 1999
Example Case-Control Study:
Controls age-matched and systematically identified using computer-assisted telephone interviewing of residents in the same telephone exchange area as case-patients
Attempted 2 controls per case
Enrolled 10 cases and 19 matched controls
Only chain A showed statistically significant association with illness among cases and controls
Example Case-Control Study:
Second case-control study involving patrons of chain A restaurants conducted to determine specific menu item or ingredient associated with illness (11)
Case defined as above but restricted to those who had eaten at chain A and who could be matched with “meal companion-controls”
8 cases and 16 meal companion-controls enrolled
Consumption of a beef taco was found to be statistically associated with illness
Traceback investigation implicated an upstream supplier of beef, but farm investigation was not possible
Example Case-Control Study: Listeriosis with deli meat
July and August 2002: 22 cases of listeriosis were reported in Pennsylvania, a nearly 3-fold increase over baseline (12)
Subtyping identified cluster of cases caused by single Listeria monocytogenes strain
CDC asked health departments in northeast United States to conduct active case finding, prompt reporting of listeriosis cases and retrieval of clinical isolates for rapid PFGE testing
Conducted case-control study to identify cause of increase in cases
Example Case-Control Study: Listeriosis with deli meat
Case-patient defined as person with culture-confirmed listeriosis between July 1 and November 30, 2002, whose infection was caused by the outbreak strain
Control defined as person with culture-confirmed listeriosis between July 1 and November 30, 2002, whose infection was caused by any other non-outbreak strain of L. monocytogenes, and who lived in a state with at least 1 case-patient
Interviewed with standard questionnaire including more than 70 specific food items to gather medical and food histories during the 4 weeks preceding culture for L. monocytogenes.