Handling interactions in Stata, especially with continuous predictors
Patrick Royston & Willi Sauerbrei
German Stata Users’ meeting, Berlin, June 2012

Interactions – general concepts
• General idea of a (two-way) interaction in multiple regression is effect modification:
• η(x1,x2) = f1(x1) + f2(x2) + f3(x1,x2)
• Often, η(x1,x2) = E(Y | x1,x2), with obvious extension to GLM, Cox regression, etc.
• Simplest case: η(x1,x2) is linear in the x’s and f3(x1,x2) is the product of the x’s:
• η(x1,x2) = β1x1 + β2x2 + β3x1x2
• Can extend to more general, non-linear functions

The simplest type of interaction
• Binary x binary
• E.g. in the MRC RE01 trial in kidney cancer
• 12 month % survival since randomisation
• Substantial treatment effect in patients with low WCC
• Little or no treatment effect in those with high WCC
• But really, WCC is a continuous variable …

                    White cell count
Treatment group     Low (≤10)      High (>10)
MPA                 34% (se 4)     24% (se 4)
Interferon          49% (se 4)     21% (se 7)

Overview
• Interactions and factor variables (Stata 11/12)
• Note: I am not an expert on factor variables! I sometimes use them
• General interactions between continuous covariates in observational studies
• Focus on continuous covariates …
• … because people don’t appear to know how to handle them!
• Special case: interactions between treatment and continuous covariates in randomized controlled trials

Interactions and factor variables

Scope
• We introduce the topic with a brief introduction to factor variables
• In this part, we consider only linear interactions:
• Binary x binary (2 x 2 table)
• Binary x continuous
• Continuous x continuous

Factor variables: brief notes
• Implemented via prefixes (unary operators) and binary interaction operators
• see help fvvarlist
• There are four factor-variable operators:

Operator   Description
i.         unary operator to specify indicators (dummies)
c.         unary operator to treat as continuous
#          binary operator to specify interactions
##         binary operator to specify factorial interactions

• Dummy variables are ‘virtual’ – not created per se
• Names of regression parameters easily found by inspecting the post-estimation result matrix e(b)

Factor variables: i. prefix
• Example from Stata manual [U] 11.4.3:

list group i.group in 1/5

[Listing output: columns group, 1b.group, 2.group, 3.group]

Example dataset
• MRC RE01 trial in advanced kidney cancer
• Of 347 patients, only … censored, the rest died
• For simplicity, as a continuous response variable, Y, we use months to death, _t
• Ignore the small amount of censoring
• There are several prognostic factors that may influence time to death
• Some are binary, some categorical, some continuous

Example: factor variable parameters

regress _t i.who

      Source |       SS       df       MS              Number of obs =     347
-------------+------------------------------           F(  2,   344) =   15.72
       Model |  7780.81413     2  3890.40707           Prob > F      =  0.0000
    Residual |  85126.5686   344  247.460955           R-squared     =  0.0837
-------------+------------------------------           Adj R-squared =  0.0784
       Total |  92907.3828   346  268.518447           Root MSE      =  15.731

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         who |
          1  |    -4.2783   2.026215    -2.11   0.035    -8.263629   -.2929707
          2  |  -12.94365   2.354542    -5.50   0.000    -17.57476   -8.312534
             |
       _cons |   19.17358   1.622518    11.82   0.000     15.98227    22.36488
------------------------------------------------------------------------------

matrix list e(b)

e(b)[1,4]
            0b.          1.          2.
           who         who         who       _cons
y1           0  -4.2782996  -12.943646   19.173577

Results: P-values for interactions

mfpigen, select(0.05): logit all10 cigs sysbp ///
    age ht wt chol (gradd1 gradd2 gradd3)

*FP transformations were selected; otherwise, linear

Graphical presentation of age x chol interaction

fracgen cigs 5, center(mean)
fracgen sysbp -2 -2, center(mean)
fracgen wt -2 3, center(mean)
mfpigen, linadj(cigs_1 sysbp_1 sysbp_2 ///
    wt_1 wt_2 ht i.jobgrade) df(1) ///
    fplot(%10 35 65 90): logit all10 age chol

Alternatively:

logit all10 c.age##c.chol cigs_1 sysbp_1 sysbp_2 ///
    wt_1 wt_2 ht i.jobgrade
sliceplot age chol, sliceat(10 35 65 90) percent

• sliceplot is a new user-written command

Graphical presentation of age x chol intn
[Figure: Logit(pr(death)) and Pr(death) plotted against age and against chol]

Check of chol x age interaction
[Figure: four panels of Logit(pr(death)) against chol]
Q1: slope 0.17 (SE 0.06)
Q2: slope 0.22 (SE 0.05)
Q3: slope 0.14 (SE 0.04)
Q4: slope -0.01 (SE 0.04)

Interactions with continuous covariates in randomized trials

MFPI method (Royston & Sauerbrei 2004)
• Continuous covariate x of interest, binary treatment variable t and other covariates z
• Independent of x and t, use MFP to select an ‘adjustment’ (confounder) model z* from z
• Find best FP2 function of x (in all patients) adjusting for z* and t
• Test FP2(x) × t interaction (2 d.f.)
• Estimate β’s in each treatment group
• Standard test for equality of β’s
• May also consider simpler FP1 or linear functions – choose e.g. by AIC

MFPI in Stata
• MFPI is implemented as a user command, mfpi
• mfpi is available on SSC
• Details are given by Royston & Sauerbrei, Stata Journal 9(2):
230-251 (2009)
• Program was updated in 2012 to support factor variables

Treatment effect function
• Have estimated two FP2 functions – one per treatment group
• Plot the difference between functions against x to show the interaction
• i.e. the treatment effect at different x
• Pointwise 95% CI shows how strongly the interaction is supported at different values of x
• i.e. variation in the treatment effect with x

Example: MRC RE01 trial in kidney cancer
• Main analysis: Interferon improves survival
• HR: 0.76 (0.62 - 0.95), P = 0.015
• Is the treatment effect similar in all patients?
• Nine possible covariates available for the investigation of treatment-covariate interactions – only one is significant (WCC)

Kaplan-Meier showing treatment effect
[Figure: Kaplan-Meier curves of proportion alive against follow-up (months) for (1) MPA and (2) Interferon]
At risk 1: 175 55 22 11
At risk 2: 172 73 36 20

The mfpi command

mfpi, select(0.05) fp2(wcc) with(trt) gendiff(d): stcox (whod1 whod2) t_dt t_mt rem mets haem

Interactions with trt (347 observations)
Flex-1 model (least flexible)

Var   Main          Interact      idf   Chi2   P        Deviance   tdf   AIC
wcc   FP2(-1 -.5)   FP2(-1 -.5)   2     6.91   0.0316   3180.194   7     3194.194

idf = interaction degrees of freedom; tdf = total model degrees of freedom

mfpi_plot wcc
[using variables created by gendiff(d)]

Treatment effect plot for wcc
[Figure: log relative hazard against white cell count]
• About 25% of patients, those with WCC > 10, seem not to benefit from interferon

Concluding remarks
• MFPIgen and MFPI should help researchers detect, model and visualize interactions with continuous covariates
• Usually, we are searching for interactions, so small P-values are required
• Other methods not considered
• STEPP – mainly graphical
• …

Thank you

Cox (1984) paper: Interaction
• Cox identifies
types of variable that might appear in interactions:
• Treatment variables
  • Can be modified or imposed
  • Treatments, e.g. chemotherapy, surgery
  • Behaviours, e.g. smoking, drinking
• Intrinsic variables
  • Cannot be modified
  • Often demographic, e.g. sex, age
• Unspecific variables
  • e.g. structural blocks, ‘random’ factors

...

Continuous x continuous interaction
• Results are best explored graphically
• Consider in more detail next

Continuous x continuous interactions

Motivation: continuous x continuous intn
... consider linear by linear interactions
• Not sensible if main effect of either variable is non-linear
• Mismodelling the main effect may introduce spurious interactions
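
The last point, that mismodelling a main effect may introduce a spurious interaction, can be illustrated with a small simulation. This is only a sketch on made-up data, not an analysis from the talk: the variable names (x1, x2, y), the sample size, the seed and the correlation of 0.7 are all assumptions chosen for illustration. The outcome is generated with a quadratic effect of x1 and no interaction at all, and the linear-by-linear product term is then fitted with and without the correct main effect.

```stata
* Illustrative simulation (hypothetical data; all names and constants are assumptions)
clear
set seed 12345
set obs 2000

* Two correlated continuous covariates (correlation about 0.7)
generate double x1 = rnormal()
generate double x2 = 0.7*x1 + sqrt(1 - 0.7^2)*rnormal()

* True model: quadratic main effect of x1, linear effect of x2, NO interaction
generate double y = x1^2 + x2 + rnormal()

* Misspecified model: both main effects linear, plus linear-by-linear product
regress y c.x1##c.x2
scalar b_mis = _b[c.x1#c.x2]

* Main effect of x1 modelled as quadratic, same product term added
regress y c.x1 c.x1#c.x1 c.x2 c.x1#c.x2
scalar b_ok = _b[c.x1#c.x2]

* Compare the two estimates of the product-term coefficient
display "product term, linear main effects:  " b_mis
display "product term, quadratic x1 effect:  " b_ok
```

Under these assumptions the product term in the first model should pick up the unmodelled curvature in x1 and sit well away from zero, while in the second model it should be close to zero. Because x1 and x2 are correlated, x1*x2 acts as a proxy for x1^2, which is the mechanism behind the warning above.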