Non-differential (random) error: use of an invalid outcome measure that equally misclassifies cases and controls
Differential error: use of an invalid measure that misclassifies cases in one direction and controls in another
Bias, Confounding and Fallacies in Epidemiology
M Tevfik DORAK
http://www.dorak.info/epi
BIAS
Definition Types Examples Remedies
CONFOUNDING
Definition Examples Remedies
FALLACIES
Definition
Bias is one of the three major threats to internal validity:
- Bias
- Confounding
- Random error / chance
What is Bias?
Any trend in the collection, analysis, interpretation, publication or review of data that can lead to conclusions that are systematically different from the truth (Last, 2001)
A process at any stage of inference tending to produce results that depart systematically from the true values (Fletcher et al, 1988)
Systematic error in design or conduct of a study (Szklo et al, 2000)
Errors can be differential (systematic) or non-differential (random)
The term 'bias' should be reserved for differential or systematic error
Bias is systematic error
Chance vs Bias
Chance is caused by random error; bias is caused by systematic error
Errors from chance will cancel each other out in the long run (large sample size)
Errors from bias will not cancel each other out whatever the sample size
Chance leads to imprecise results; bias leads to inaccurate results
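The contrast between chance and bias can be illustrated with a small simulation. This is only a sketch under assumed numbers: the true value (100), the noise level (SD 10) and the systematic shift (+5) are hypothetical, and `mean_of_measurements` is an illustrative helper, not part of the original material.

```python
import random
import statistics


def mean_of_measurements(n, true_value=100.0, noise_sd=10.0,
                         systematic_shift=0.0, seed=0):
    """Average n simulated readings of a quantity whose true value is known."""
    rng = random.Random(seed)
    readings = [
        true_value + systematic_shift + rng.gauss(0.0, noise_sd)
        for _ in range(n)
    ]
    return statistics.fmean(readings)


# Error of the estimated mean at increasing sample sizes (hypothetical values).
for n in (10, 1_000, 100_000):
    chance_only = mean_of_measurements(n) - 100.0
    chance_plus_bias = mean_of_measurements(n, systematic_shift=5.0) - 100.0
    print(f"n={n:>7}  chance only: {chance_only:+.2f}  "
          f"chance + bias: {chance_plus_bias:+.2f}")
```

As the sample grows, the error from chance alone should shrink towards zero, while the error with the systematic shift should settle near +5 rather than disappear.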
Types of Bias
Selection bias
Unrepresentative nature of sample
Information (misclassification) bias
Errors in measurement of exposure or disease
Confounding bias
Distortion of exposure-disease relation by some other factor
Types of bias are not mutually exclusive
(effect modification is not bias)
This classification was proposed by Miettinen OS in the 1970s. See, for example, Miettinen & Cook, 1981
Selection Bias Examples
Selective survival (Neyman's) bias
Case-control study:
Controls have less potential for exposure than cases
Outcome = brain tumour; exposure = overhead high voltage power lines
Cases chosen from province-wide cancer registry
Controls chosen from rural areas
Systematic differences between cases and controls
[Figure: Case-Control Studies: Potential Bias (Schulz & Grimes, 2002)]
Selection Bias Examples
Cohort study:
Differential loss to follow-up
Especially problematic in cohort studies
Subjects in a follow-up study of multiple sclerosis may differentially drop out due to disease severity
Differential attrition leads to selection bias
Self-selection bias:
- You want to determine the prevalence of HIV infection
- You ask for volunteers for testing
- You find no HIV
- Is it correct to conclude that there is no HIV in this location?
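One way to see why the answer may be "no" is to work out the prevalence among volunteers when infected people are less willing to come forward. This is a sketch only: the prevalence and the volunteering probabilities below are hypothetical, and `prevalence_among_volunteers` is an illustrative helper.

```python
def prevalence_among_volunteers(true_prevalence,
                                p_volunteer_infected,
                                p_volunteer_uninfected):
    """Expected proportion infected among those who volunteer for testing."""
    infected_volunteers = true_prevalence * p_volunteer_infected
    uninfected_volunteers = (1.0 - true_prevalence) * p_volunteer_uninfected
    return infected_volunteers / (infected_volunteers + uninfected_volunteers)


# Hypothetical scenario: 5% true prevalence; infected people volunteer
# far less often (1%) than uninfected people (20%).
true_prevalence = 0.05
observed = prevalence_among_volunteers(true_prevalence, 0.01, 0.20)
print(f"true prevalence: {true_prevalence:.1%}, among volunteers: {observed:.2%}")
```

Under these assumptions the prevalence among volunteers is far below the true value, so a small volunteer sample could easily contain no infected person at all.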
Healthy worker effect:
Another form of self-selection bias
“Self-screening” process: people who are unhealthy “screen” themselves out of the active worker population
Example:
- Course of recovery from low back injuries in 25-45 year olds
- Data captured on workers’ compensation records
- But prior to identifying subjects for study, self-selection has already taken place
Diagnostic or workup bias:
Also occurs before subjects are identified for study
Diagnoses (case selection) may be influenced by the physician’s knowledge of exposure
Types of Bias
Selection bias
Unrepresentative nature of sample
** Information (misclassification) bias **
Errors in measurement of exposure or disease
Confounding bias
Distortion of exposure-disease relation by some other factor
If misclassification of exposure (or disease) is unrelated to disease (or exposure), then the misclassification is non-differential
If misclassification of exposure (or disease) is related to disease (or exposure), then the misclassification is differential
Misclassification distorts the true strength of association
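The two kinds of misclassification can be compared with a worked example. This is a sketch under stated assumptions: the true 2x2 counts and the sensitivity/specificity of exposure measurement are hypothetical, and the helper names are illustrative. With a binary exposure, non-differential error typically pulls the odds ratio towards 1, whereas differential error can push it in either direction.

```python
def odds_ratio(a, b, c, d):
    """a/b = exposed cases/controls, c/d = unexposed cases/controls."""
    return (a * d) / (b * c)


def observed_exposure(exposed, unexposed, sensitivity, specificity):
    """Expected (exposed, unexposed) counts after imperfect exposure measurement."""
    seen_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    seen_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return seen_exposed, seen_unexposed


def observed_or(cases, controls, sens_cases, sens_controls, specificity=0.9):
    """Odds ratio computed from the misclassified exposure counts."""
    case_exp, case_unexp = observed_exposure(*cases, sens_cases, specificity)
    ctrl_exp, ctrl_unexp = observed_exposure(*controls, sens_controls, specificity)
    return odds_ratio(case_exp, ctrl_exp, case_unexp, ctrl_unexp)


# Hypothetical true counts as (exposed, unexposed).
true_cases = (400, 600)
true_controls = (200, 800)

or_true = odds_ratio(400, 200, 600, 800)
# Same measurement error in cases and controls (non-differential).
or_nondifferential = observed_or(true_cases, true_controls, 0.8, 0.8)
# Cases report exposure more completely than controls (differential).
or_differential = observed_or(true_cases, true_controls, 0.95, 0.7)

print(f"true OR:             {or_true:.2f}")
print(f"non-differential OR: {or_nondifferential:.2f}")
print(f"differential OR:     {or_differential:.2f}")
```

With these numbers the true odds ratio is about 2.67, the non-differential version is attenuated to about 1.94, and the differential version is inflated to about 2.79.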
Information / Measurement / Misclassification Bias
Sources of information bias:
- Subject variation
- Observer variation
- Deficiency of tools
- Technical errors in measurement
Recall bias:
- especially important in case-control studies
- when exposure history is obtained retrospectively, cases may scrutinize their past history more closely, looking for ways to explain their illness
- controls, not feeling a burden of disease, may examine their past history less closely
Those who develop a cold are more likely to identify the exposure than those who do not (differential misclassification)
- Case: Yes, I was sneezed on
- Control: No, can’t remember any sneezing
Reporting bias:
Individuals with severe disease tend to have complete records, and therefore more complete information about exposures, so a greater association is found
Individuals who are aware of being participants in a study behave differently (Hawthorne effect)
Controlling for Information Bias
- Blinding: prevents investigators and interviewers from knowing the case/control or exposed/non-exposed status of a given participant
- Multiple checks in medical records
- Gathering diagnosis data from multiple sources
Types of Bias
Selection bias
Unrepresentative nature of sample
Information (misclassification) bias
Errors in measurement of exposure or disease
** Confounding bias **
Distortion of exposure-disease relation by some other factor
[Figure: Cases of Down Syndrome by Birth Order]
[Figure: Cases of Down Syndrome by Age Groups]
[Figure: Cases of Down Syndrome by Birth Order and Maternal Age]
Confounding
• A third factor which is related to both exposure and outcome, and which accounts for some/all of the observed relationship between the two
• Confounder is not a result of the exposure
– e.g., association between child’s birth rank (exposure) and Down syndrome (outcome); mother’s age a confounder?
– e.g., association between mother’s age (exposure) and Down syndrome (outcome); birth rank a confounder?
[Diagram: Exposure, Outcome, Third variable]
To be a confounding factor, two conditions must be met:
Be associated with exposure
- without being the consequence of exposure
Be associated with outcome
- independently of exposure
[Diagram: Birth Order, Down Syndrome]
[Diagram: Birth Order, Down Syndrome, Maternal Age]
Confounding?
Birth order is correlated with maternal age but is not a risk factor in younger mothers
[Diagram: Coffee, CHD, Smoking]
Confounding?
Coffee drinking may be correlated with smoking but is not a risk factor in non-smokers
[Diagram: Alcohol, Lung Cancer]
[Diagram: Smoking, CHD, Yellow fingers]
Not related to the outcome
Confounding?
Imagine you have tried to replicate, in another sample, a positive finding of a birth order association in Down syndrome or of an association of coffee drinking with CHD. Would you be able to replicate it? If not, why?
Imagine you have included only non-smokers in a study and examined the association of alcohol with lung cancer. Would you find an association?
Imagine you have stratified your dataset by smoking status in the alcohol-lung cancer association study. Would the odds ratios differ in the two strata?
Imagine you have tried to replicate, in another sample, a positive finding of a birth order association in Down syndrome or of an association of coffee drinking with CHD. Would you be able to replicate it? If not, why?
You would not necessarily be able to replicate the original finding because it was a spurious association due to confounding.
In another sample where all mothers are below 30 yr, there would be no association with birth order.
In another sample in which there are few smokers, the coffee association with CHD would not be replicated.
Imagine you have included only non-smokers in a study and examined the association of alcohol with lung cancer. Would you find an association?
No, because the first study was confounded. The association with alcohol was actually due to smoking.
By restricting the study to non-smokers, we have found the truth. Restriction is one way of preventing confounding at the time of study design.
Imagine you have stratified your dataset by smoking status in the alcohol-lung cancer association study. Would the odds ratios differ in the two strata?
The alcohol association would yield similar odds ratios in both strata, and they would be close to unity. In confounding, the stratum-specific odds ratios should be similar to each other and different from the crude odds ratio by at least 15%. Stratification is one way of identifying confounding at the time of analysis.
If the stratum-specific odds ratios are different, then this is not confounding but effect modification.
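The stratification check described above can be sketched as follows. The counts are hypothetical, chosen so that alcohol has no effect within either smoking stratum while smokers both drink more and have more lung cancer; `odds_ratio` is an illustrative helper.

```python
def odds_ratio(a, b, c, d):
    """a/b = exposed cases/controls, c/d = unexposed cases/controls."""
    return (a * d) / (b * c)


# Hypothetical counts: (exposed cases, exposed controls,
#                       unexposed cases, unexposed controls),
# where "exposed" means alcohol drinker.
strata = {
    "smokers": (90, 60, 30, 20),
    "non-smokers": (10, 40, 30, 120),
}

# Collapsing over smoking status gives the crude 2x2 table.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

print(f"crude OR = {crude_or:.2f}")
for name, value in stratum_ors.items():
    change = abs(crude_or - value) / value
    print(f"{name}: OR = {value:.2f}, crude differs by {change:.0%}")
```

Here both stratum-specific odds ratios equal 1 while the crude odds ratio is about 2.33, which is the pattern expected under confounding: similar stratum-specific values that differ from the crude value by far more than 15%.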
Imagine you have tried to adjust your alcohol association for smoking status (in a statistical model). Would you see an association?
For confounding to occur, the confounders should be differentially represented in the comparison groups.
Randomisation is an attempt to evenly distribute potential (unknown) confounders in study groups. It does not guarantee control of confounding.
Matching is another way of achieving the same. It ensures equal representation of subjects with known confounders in study groups. It has to be coupled with matched analysis.
Restriction for potential confounders in design also prevents confounding but causes loss of statistical power (instead, stratified analysis may be tried).
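For 1:1 matched case-control pairs, one common form of matched analysis estimates the odds ratio from the discordant pairs only. The sketch below assumes that design; the pair counts are hypothetical and `matched_pair_odds_ratio` is an illustrative helper.

```python
def matched_pair_odds_ratio(pairs):
    """Matched-pair OR from a list of (case_exposed, control_exposed) booleans.

    Only discordant pairs contribute: pairs where just the case is exposed,
    divided by pairs where just the control is exposed.
    """
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    if control_only == 0:
        raise ValueError("no pairs with only the control exposed; OR undefined")
    return case_only / control_only


# Hypothetical study of 100 matched pairs.
pairs = (
    [(True, True)] * 30      # both exposed (concordant, ignored)
    + [(True, False)] * 40   # only the case exposed
    + [(False, True)] * 20   # only the control exposed
    + [(False, False)] * 10  # neither exposed (concordant, ignored)
)

print(f"matched-pair OR = {matched_pair_odds_ratio(pairs):.2f}")
```

Analysing the same data as if unmatched would use all four cells of the ordinary 2x2 table and can give a different answer, which is why matching is paired with a matched analysis.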
Randomisation, matching and restriction can be tried at the time of designing a study to reduce the risk of confounding.
At the time of analysis:
Stratification and multivariable (adjusted) analysis can achieve the same.
It is preferable to try something at the time of designing the study.
[Figure: Effect of randomisation on outcome of trials in acute pain]
If each case is matched with a same-age control, there will be no association (OR for old age = 2.6, P = 0.0001)
[Figure: No Confounding]
[Figure: Cases of Down Syndrome by Birth Order and Maternal Age]
If each case is matched with a same-age control, there will be no association. If the analysis is repeated after stratification by age, there will be no association with birth order.
BIAS
Definition Types Examples Remedies
CONFOUNDING
Definition Examples Remedies
** (Effect Modification) **
FALLACIES
Definition
Confounding or Effect Modification
[Diagram: Birth Weight, Leukaemia, Sex]
Can sex be responsible for the birth weight association in leukaemia?
- Is it correlated with birth weight?
- Is it correlated with leukaemia independently of birth weight?
- Is it on the causal pathway?
- Can it be associated with leukaemia even if birth weight is low?
- Is sex distribution uneven in comparison groups?
Does the birth weight association differ in strength according to sex?
[Diagram: Birth Weight, Leukaemia; OR = 1.5]
Effect Modification
In an association study, if the strength of the association varies over different categories of a third variable, this is called effect modification. The third variable is changing the effect of the exposure.
The effect modifier may be sex, age, an environmental exposure or a genetic effect.
Effect modification is similar to interaction in statistics. There is no adjustment for effect modification. Once it is detected, stratified analysis can be used to obtain stratum-specific odds ratios.
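Stratum-specific odds ratios, with approximate confidence intervals, can be obtained as in the sketch below. The counts by sex are hypothetical (picked so that the pooled odds ratio is close to 1.5), the interval uses the usual log-scale (Woolf) approximation, and `odds_ratio_with_ci` is an illustrative helper.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """OR with an approximate 95% CI on the log scale (Woolf's method).

    a/b = exposed cases/controls, c/d = unexposed cases/controls.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: (high-birth-weight cases, high-birth-weight controls,
#                       other cases, other controls)
by_sex = {"boys": (60, 40, 40, 60), "girls": (50, 50, 50, 50)}

for sex, cells in by_sex.items():
    estimate, lower, upper = odds_ratio_with_ci(*cells)
    print(f"{sex}: OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")

# Pooling the strata hides the difference between them.
pooled = tuple(sum(cells) for cells in zip(*by_sex.values()))
print(f"pooled OR = {odds_ratio_with_ci(*pooled)[0]:.2f}")
```

In this made-up example the association is present in boys (OR 2.25) and absent in girls (OR 1.0), so reporting the two stratum-specific values is more informative than a single pooled or adjusted figure.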
BIAS
Definition Types Examples Remedies
CONFOUNDING
Definition Examples Remedies
(Effect Modification)
** FALLACIES **
Definition
Fallacies
HISTORICAL FALLACY
ECOLOGICAL FALLACY (Cross-Level Bias)
BERKSON'S FALLACY (Selection Bias in Hospital-Based CC Studies)
HAWTHORNE EFFECT (Participant Bias)
REGRESSION TO THE MEAN
HOW TO CONTROL FOR CONFOUNDERS?
• IN STUDY DESIGN…
– RESTRICTION of subjects according to potential confounders (i.e., simply do not include subjects with the confounder in the study)
– RANDOM ALLOCATION of subjects to study groups to attempt to even out unknown confounders
– MATCHING subjects on potential confounder, thus assuring even distribution among study groups
• IN DATA ANALYSIS…
– STRATIFIED ANALYSIS using the Mantel-Haenszel method to adjust for confounders
– IMPLEMENT A MATCHED DESIGN after you have collected data (frequency or group)
– RESTRICTION is still possible at the analysis stage but it means throwing away data
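For reference, the Mantel-Haenszel summary odds ratio named in the list above can be written as follows. The notation is an assumption made for this sketch: in stratum $i$, $a_i$ and $b_i$ are exposed cases and controls, $c_i$ and $d_i$ are unexposed cases and controls, and $n_i$ is the stratum total. The worked numbers are hypothetical.

```latex
\[
\mathrm{OR}_{MH} \;=\; \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}
\]

% Hypothetical two-stratum example (smokers, non-smokers):
%   stratum 1: a=90, b=60, c=30, d=20,  n=200
%   stratum 2: a=10, b=40, c=30, d=120, n=200
\[
\mathrm{OR}_{MH}
  = \frac{\dfrac{90 \cdot 20}{200} + \dfrac{10 \cdot 120}{200}}
         {\dfrac{60 \cdot 30}{200} + \dfrac{40 \cdot 30}{200}}
  = \frac{9 + 6}{9 + 6} = 1
\]
```

In this made-up example the crude table (100, 100, 60, 140) gives an odds ratio of about 2.33, whereas the adjusted value is 1, so the crude association would be attributed to the stratifying variable.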
[Figure: Effect of blinding on outcome of trials of acupuncture for chronic back pain]
WILL ROGERS' PHENOMENON
Assume that you are tabulating survival for patients with a certain type of tumour. You separately track survival of patients whose cancer has metastasized and survival of patients whose cancer remains localized. As you would expect, average survival is longer for the patients without metastases. Now a fancier scanner becomes available, making it possible to detect metastases earlier. What happens to the survival of patients in the two groups?
The group of patients without metastases is now smaller. The patients who are removed from the group are those with small metastases that could not have been detected without the new technology. These patients tend to die sooner than the patients without detectable metastases. By taking away these patients, the average survival of the patients remaining in the "no metastases" group will improve.
What about the other group? The group of patients with metastases is now larger. The additional patients, however, are those with small metastases. These patients tend to live longer than patients with larger metastases. Thus the average survival of all patients in the "with-metastases" group will also improve.
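The stage-migration arithmetic can be checked with a toy example. The survival times (in months) are hypothetical, and the two patients moved between groups stand for those whose small metastases only the new scanner detects.

```python
from statistics import fmean

# Hypothetical survival times in months under the old scanner.
no_mets_before = [60, 58, 55, 50, 30, 28]
mets_before = [12, 10, 8, 6]

# Patients with small metastases that only the new scanner can detect.
reclassified = [30, 28]

no_mets_after = [m for m in no_mets_before if m not in reclassified]
mets_after = mets_before + reclassified

print(f"no metastases:   {fmean(no_mets_before):.1f} -> {fmean(no_mets_after):.1f}")
print(f"with metastases: {fmean(mets_before):.1f} -> {fmean(mets_after):.1f}")
print(f"all patients:    {fmean(no_mets_before + mets_before):.1f} -> "
      f"{fmean(no_mets_after + mets_after):.1f}")
```

Average survival rises in both groups although no individual patient lives any longer, and the average over all patients is unchanged.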
Cause-and-Effect Relationship