To illustrate what has become a standard application of HLM in studies of change, we reanalyze data from the first cohort of the National Youth Survey (Elliott, Huizinga, & Menard, 1989). These data, summarized in Table 2.1, were analyzed by Raudenbush and Chan (1993). Members of the first cohort were sampled in 1976 at age 11 and interviewed annually until 1980, when they were 15. The outcome, described in detail in Raudenbush and Chan (1993), is a measure of attitudes toward deviant behavior, with higher values indicating greater tolerance of pro-deviant activities such as lying, cheating, stealing, vandalism, and drug use. We shall refer to this outcome as "tolerance." The table appears to indicate an increase in tolerance as a function of age during these early adolescent years. However, the means at each age are based on different sample sizes because of missing data. In fact, 168 persons had a full complement of five time-series measurements, whereas 45 had only four, 14 had three, 5 had two, and 7 had one. To illustrate the SEM approach to the study of change, Willett and Sayer (1994) analyzed the subset of 168 participants with complete data. Our analysis, in contrast, makes use of all available data from the 239 participants.
Simple Model
The general theory of crime of Gottfredson and Hirschi (1990) predicts a near-linear increase in antisocial behavior during these early adolescent years, and it may be that tolerance of deviant thinking is similarly linear.
Thus we might formulate the simple linear model for each person:

$$Y_{ij} = \pi_{0j} + \pi_{1j} a_{ij} + r_{ij} \qquad (2.1)$$
where $Y_{ij}$ is the tolerance score for person $j$ at occasion $i$; $a_{ij}$ is the age minus 13 of that person at that occasion, so that $\pi_{0j}$ represents the expected tolerance level for participant $j$ at age 13; $\pi_{1j}$ represents the annual rate of increase in tolerance between the ages of 11 and 15; and $r_{ij}$ is the within-person residual, assumed independently normally distributed with mean 0 and constant variance $\sigma^2$.

Table 2.1
Description of NYS Sample: Tolerance of Deviant Attitudes

Age    n      m       sd
11     237    .217    .197
12     232    .241    .212
13     230    .332    .270
14     220    .410    .290
15     219    .444    .301

Number of Observations    Frequency
5                         168
4                         45
3                         14
2                         5
1                         7
In sum, $j$ indexes persons ($j = 1, \ldots, 239$) and $i$ indexes occasions ($i = 1, \ldots, n_j$), where $n_j$ is the number of interviews for person $j$, with a maximum of 5 in these data. The change trajectory for each person $j$ consists of two parameters: $\pi_{0j}$ = status at age 13 and $\pi_{1j}$ = annual rate of increase. This pair of person-specific change parameters becomes the outcomes in a level-2 model for variation between persons. The simplest level-2 model enables us to estimate the mean trajectory and the extent of variation around the mean:
$$
\begin{aligned}
\pi_{0j} &= \beta_{00} + u_{0j} \\
\pi_{1j} &= \beta_{10} + u_{1j}
\end{aligned}
\qquad (2.2)
$$
Thus, $\beta_{00}$ is the population mean status at age 13 and $\beta_{10}$ is the population mean annual rate of increase from age 11 to 15. The person-specific random effects are $u_{0j}$, the deviation of person $j$'s status at 13 from the population mean, and $u_{1j}$, the deviation of person $j$'s rate of increase from the population mean rate. These random effects are assumed bivariate normally distributed, that is,

$$
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\left[
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{10} & \tau_{11} \end{pmatrix}
\right]
$$

so that $\tau_{00}$ is the variance in status at age 13, $\tau_{11}$ is the variance of the rates of change, and $\tau_{01} = \tau_{10}$ is the covariance between status at age 13 and rate of change.
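The two-level linear model described above can be sketched as a small data-generating simulation. This is only an illustration under stated assumptions: the fixed effects (0.327 and 0.065) and the slope variance (0.0025) are taken from the results reported in this chapter, while the values used for $\tau_{00}$, $\tau_{01}$, and $\sigma^2$ are hypothetical, and the function name `simulate_person` is ours.

```python
import math
import random


def simulate_person(beta00, beta10, tau00, tau01, tau11, sigma2, ages, rng):
    """Draw one person's scores from the two-level linear growth model."""
    # Level 2: draw (u0, u1) from a bivariate normal using a 2x2 Cholesky factor.
    l11 = math.sqrt(tau00)
    l21 = tau01 / l11
    l22 = math.sqrt(tau11 - l21 ** 2)
    z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
    u0 = l11 * z0
    u1 = l21 * z0 + l22 * z1
    pi0 = beta00 + u0  # person-specific status at age 13
    pi1 = beta10 + u1  # person-specific annual rate of increase
    # Level 1: add independent, homoscedastic within-person residuals.
    sd_r = math.sqrt(sigma2)
    return [pi0 + pi1 * (age - 13) + rng.gauss(0, sd_r) for age in ages]


# beta00, beta10, and tau11 follow the text; tau00 = 0.03, tau01 = 0.005,
# and sigma2 = 0.03 are HYPOTHETICAL values chosen only for illustration.
rng = random.Random(1)
ages = [11, 12, 13, 14, 15]
sample = [
    simulate_person(0.327, 0.065, 0.03, 0.005, 0.0025, 0.03, ages, rng)
    for _ in range(4000)
]
mean_by_age = [
    sum(person[k] for person in sample) / len(sample) for k in range(len(ages))
]
print([round(m, 3) for m in mean_by_age])
```

With a large simulated sample, the mean at each age should sit close to $\beta_{00} + \beta_{10}(\text{age} - 13)$, which is the mean trajectory the level-2 model is meant to estimate.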
Results
We first consider the results for the fixed effects (Table 2.2, under "random linear slope"). Mean tolerance at age 13 is estimated to be $\hat{\beta}_{00} = 0.327$, se = 0.013. The mean rate of increase is significantly positive, $\hat{\beta}_{10} = 0.065$, se = 0.0049, t = 13.15. In terms of the standard deviation of the outcome (Table 2.2), this is equivalent to an increase of roughly 20 to 25% of a standard deviation per year. The variance-covariance estimates give information about the degree of individual variation in status and change. For example, the variance of the rates of change is estimated at $\hat{\tau}_{11} = .0025$, equivalent to a standard deviation of about .050. This implies that a participant who is one standard deviation below the mean in rate of change ($\hat{\beta}_{10} = .065$) would have a rate of .065 - .050 = .015, quite near zero, while a participant with a rate one standard deviation above the mean would have a rate of .065 + .050 = .115, quite a rapid rate of increase (at least a third of a standard deviation per year).
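The arithmetic in this paragraph can be retraced in a few lines. The sketch below uses only the rounded estimates quoted in the text. Expressing the rates relative to the outcome standard deviations at ages 13 to 15 from Table 2.1 is our assumption about which standard deviation the comparison has in mind.

```python
import math

beta10_hat = 0.065  # mean annual rate of increase (from the text)
se_beta10 = 0.0049  # its standard error (from the text)
tau11_hat = 0.0025  # variance of the rates of change (from the text)

sd_rate = math.sqrt(tau11_hat)     # standard deviation of the rates
low_rate = beta10_hat - sd_rate    # one SD below the mean rate
high_rate = beta10_hat + sd_rate   # one SD above the mean rate
t_ratio = beta10_hat / se_beta10   # computed from ROUNDED inputs

# ASSUMPTION: outcome SDs at ages 13, 14, 15 as listed in Table 2.1.
outcome_sds = [0.270, 0.290, 0.301]
mean_rate_in_sd_units = [beta10_hat / sd for sd in outcome_sds]
high_rate_in_sd_units = [high_rate / sd for sd in outcome_sds]

print(round(sd_rate, 3), round(low_rate, 3), round(high_rate, 3))
print(round(t_ratio, 2))
print([round(x, 2) for x in mean_rate_in_sd_units])
print([round(x, 2) for x in high_rate_in_sd_units])
```

The t ratio recomputed from the rounded estimate and standard error comes out near 13.3 rather than the reported 13.15. The gap is presumably due to rounding in the published values.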
Implied Assumptions Concerning Variation and Covariation over Time
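If we combine the level-1 model (Equation 2.1) and the level-2 model (Equation 2.2), we have the combined model

$$Y_{ij} = \beta_{00} + \beta_{10} a_{ij} + u_{0j} + u_{1j} a_{ij} + r_{ij}$$

or

$$Y_{ij} = \beta_{00} + \beta_{10} a_{ij} + \varepsilon_{ij}$$

where

$$\varepsilon_{ij} = u_{0j} + u_{1j} a_{ij} + r_{ij}$$

which has a mean of zero and a variance

$$\operatorname{Var}(\varepsilon_{ij}) = \tau_{00} + 2 a_{ij} \tau_{01} + a_{ij}^2 \tau_{11} + \sigma^2 \qquad (2.7)$$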
Thus, under the linear model, the variance of an observation at a particular occasion is a quadratic function of $a_{ij}$ = age - 13 for person $j$ at time $i$. By taking the first derivative with respect to age, we also see that the variance will change as a linear function of age:

$$\frac{d\,\operatorname{Var}(\varepsilon_{ij})}{d\,a_{ij}} = 2\tau_{01} + 2 a_{ij} \tau_{11}$$

Thus, the rate of change in the variance has an intercept proportional to $\tau_{01}$ and a slope proportional to $\tau_{11}$. These are strong assumptions, and it is natural to ask whether the variances across the five time points behave in the way implied by the model.
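For readers who want the intermediate step, the variance in Equation 2.7 and its derivative follow from the usual rule for the variance of a linear combination. The sketch below assumes, as is standard for this model, that the level-1 residual $r_{ij}$ is uncorrelated with the level-2 random effects $u_{0j}$ and $u_{1j}$.

```latex
\begin{aligned}
\operatorname{Var}(\varepsilon_{ij})
  &= \operatorname{Var}(u_{0j} + a_{ij} u_{1j} + r_{ij}) \\
  &= \operatorname{Var}(u_{0j})
     + 2 a_{ij} \operatorname{Cov}(u_{0j}, u_{1j})
     + a_{ij}^2 \operatorname{Var}(u_{1j})
     + \operatorname{Var}(r_{ij}) \\
  &= \tau_{00} + 2 a_{ij} \tau_{01} + a_{ij}^2 \tau_{11} + \sigma^2, \\[4pt]
\frac{d\,\operatorname{Var}(\varepsilon_{ij})}{d\,a_{ij}}
  &= 2 \tau_{01} + 2 a_{ij} \tau_{11}.
\end{aligned}
```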
The model also has strong implications for the covariance between two outcomes $Y_{ij}$ and $Y_{i'j}$ for person $j$, that is, outcomes observed at occasion $i$ and occasion $i'$ for person $j$:

$$\operatorname{Cov}(\varepsilon_{ij}, \varepsilon_{i'j}) = \tau_{00} + (a_{ij} + a_{i'j}) \tau_{01} + a_{ij} a_{i'j} \tau_{11} \qquad (2.9)$$

Again the question is whether the covariances between pairs of time points implied by Equation 2.9 accurately capture the "true" covariances.
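To make Equations 2.7 and 2.9 concrete, the following sketch builds the full 5 x 5 variance-covariance matrix they imply for ages 11 through 15. The function name `implied_cov_matrix` is ours, and the values of $\tau_{00}$, $\tau_{01}$, and $\sigma^2$ are hypothetical. Only $\tau_{11} = .0025$ is taken from the text.

```python
def implied_cov_matrix(ages, tau00, tau01, tau11, sigma2, center=13):
    """Marginal covariance matrix implied by the random linear slope model."""
    a = [age - center for age in ages]
    n = len(a)
    # Off-diagonal and diagonal share the Equation 2.9 form ...
    v = [
        [tau00 + (a[i] + a[k]) * tau01 + a[i] * a[k] * tau11 for k in range(n)]
        for i in range(n)
    ]
    # ... and the diagonal additionally carries the level-1 variance (Equation 2.7).
    for i in range(n):
        v[i][i] += sigma2
    return v


# HYPOTHETICAL parameter values, except tau11 = 0.0025 from the text.
V = implied_cov_matrix([11, 12, 13, 14, 15], 0.03, 0.005, 0.0025, 0.03)
for row in V:
    print([round(x, 4) for x in row])
```

With a positive $\tau_{01}$, as assumed here, the diagonal of the printed matrix increases with age, which is the kind of pattern one would compare against the observed variances.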
If a study is designed to collect $T$ time points per participant, and if each person has the same variance-covariance matrix,¹ there will be $T$ variances and $T(T-1)/2$ covariances for a total of $T(T+1)/2$ variance-covariance parameters overall. In the current example, with $T = 5$, there will be 5 variances and 10 covariances for a total of 15 variance-covariance parameters. Yet our simple linear model of Equations 2.1 and 2.2 implies that these 15 parameters are effectively linear functions of four underlying parameters: $\tau_{00}$, $\tau_{01}$, $\tau_{11}$, and $\sigma^2$ (see Equations 2.7 and 2.9). It is possible that four parameters are insufficient to adequately represent the 15 variances and covariances that might be estimated. In this case, our model, which is based on randomly varying linear change functions across the population of persons, is too simple to adequately represent the variation and covariation over time. We might then elaborate the model, for example, by formulating a quadratic model, which would have three random effects per person. The level-1 model might be

$$Y_{ij} = \pi_{0j} + \pi_{1j} a_{ij} + \pi_{2j} a_{ij}^2 + r_{ij}$$
In this model, $\pi_{0j}$ remains the status of person $j$ at age 13; $\pi_{1j}$ becomes the "average velocity," that is, the average rate of increase in tolerance; and $\pi_{2j}$ becomes "acceleration." According to past research, tolerance of pro-deviant thinking, although increasing during adolescence, will reach a peak and then decline in early adulthood. The quadratic model enables us to assess whether this diminishing rate of increase has begun to occur as early as 15. If so, values of $\pi_{2j}$ will tend to be negative.
We might decide to keep the structure of the level-1 variance simple here so that the level-1 residuals are independent and homoscedastic. However, the variance-covariance structure is now elaborated at level 2:
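$$
\begin{aligned}
\pi_{0j} &= \beta_{00} + u_{0j} \\
\pi_{1j} &= \beta_{10} + u_{1j} \\
\pi_{2j} &= \beta_{20} + u_{2j}
\end{aligned}
$$

where we assume

$$
\begin{pmatrix} u_{0j} \\ u_{1j} \\ u_{2j} \end{pmatrix}
\sim N\left[
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix}
\tau_{00} & \tau_{01} & \tau_{02} \\
\tau_{10} & \tau_{11} & \tau_{12} \\
\tau_{20} & \tau_{21} & \tau_{22}
\end{pmatrix}
\right]
\qquad (2.11)
$$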
¹This is the homogeneity of dispersion assumption common in multivariate repeated measures. It provides a reasonable starting point for a multivariate analysis, although the modeling framework to be presented is not limited to the homogeneity assumption.
Note that the level-2 model has six unique parameters: three variances and three covariances. Together with the level-1 variance, then, the model uses 7 parameters to represent the 15 marginal variance-covariance parameters of the five time points. It is of interest to assess whether this model provides a significantly better fit to the data than does the linear change model, which, as we have seen, uses 4 parameters to account for the 15 marginal variances and covariances.
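The parameter counts used in this comparison follow from simple formulas, sketched below. The function names are ours. The second function assumes, as in the models considered so far, an unstructured level-2 covariance matrix for the random effects plus a single homoscedastic level-1 variance.

```python
def n_marginal_params(n_timepoints):
    """Variances plus covariances of an unrestricted T x T covariance matrix."""
    return n_timepoints * (n_timepoints + 1) // 2


def n_model_cov_params(n_random_effects):
    """Unique level-2 variances and covariances, plus one level-1 variance."""
    return n_random_effects * (n_random_effects + 1) // 2 + 1


print(n_marginal_params(5))   # unrestricted matrix for five time points
print(n_model_cov_params(2))  # random intercept and linear slope
print(n_model_cov_params(3))  # random intercept, linear, and quadratic terms
```

For five time points the unrestricted count is 15, against 4 for the linear model and 7 for the quadratic model, matching the figures in the text.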
Alternatively, it might be that an even simpler between-person model might fit the 15 variances and covariances as well as those given by Equations 2.7 and 2.9. Suppose, for example, that in the linear model the variance of the linear rates of change were null, that is, $\tau_{11} = \tau_{01} = 0$. Then the expression for the variance in Equation 2.7 would simplify to $\operatorname{Var}(\varepsilon_{ij}) = \tau_{00} + \sigma^2$ and the expression for the covariance in Equation 2.9 would simplify to $\operatorname{Cov}(\varepsilon_{ij}, \varepsilon_{i'j}) = \tau_{00}$. This model, which constrains the linear rates of change of all persons to be the same but allows the intercept to vary, would then generate the compound symmetry model commonly assumed in univariate repeated measures analysis of variance. According to this model, variances are constant across time, as are the covariances, and the 15 possible variance-covariance parameters associated with the five time points would be effectively reproduced by two parameters.
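The compound symmetry structure is easy to write out explicitly. In the sketch below, the function name `compound_symmetry` is ours and the values of $\tau_{00}$ and $\sigma^2$ are hypothetical, chosen only to display the pattern.

```python
def compound_symmetry(n_occasions, tau00, sigma2):
    """Covariance matrix with constant variance tau00 + sigma2 and
    constant covariance tau00, as implied when tau11 = tau01 = 0."""
    return [
        [tau00 + (sigma2 if i == k else 0.0) for k in range(n_occasions)]
        for i in range(n_occasions)
    ]


# HYPOTHETICAL values for illustration only.
CS = compound_symmetry(5, tau00=0.03, sigma2=0.03)
for row in CS:
    print([round(x, 2) for x in row])
```

Every diagonal entry is the same and every off-diagonal entry is the same, so two parameters determine all 15 distinct elements of the matrix.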
These possibilities are explored in Table 2.2. The fits of the alternative models (linear mean change with compound symmetry, linear mean change with varying linear functions, quadratic mean change with varying quadratic functions) are compared by comparing model deviance statistics. Models can be compared by computing the differences between deviances, which are asymptotically distributed as $\chi^2$ variates under the null hypothesis that the simpler model fits the data as well as the more complex model does. The degrees of freedom for the test is the difference between the numbers of parameters estimated in the two models. The total number of parameters is the number of variance-covariance parameters plus the number of fixed effects.² The results indicate clearly that the compound symmetry model with fixed linear slopes provides a poorer fit than does the model with randomly varying linear slopes. We reach this conclusion by computing the difference between the deviance based on the compound symmetry model and the deviance based on the model with randomly varying linear slopes, obtaining -229.00 - (-338.07) = 109.07, which is compared to the percentiles of the $\chi^2$ distribution with df = 2, the difference in the number of parameters estimated (the compound symmetry model has 4 parameters and the randomly varying linear model has 6), p < 0.001. In comparing the model with randomly varying linear functions with the model with randomly varying quadratic functions, there is marginally significant evidence that the quadratic model fits better. Here the difference in deviances is -338.07 - (-348.23) = 10.16, df = 4, p = 0.037.

²If the model is estimated via restricted maximum likelihood, the number of parameters is just the number of covariance parameters. See Bryk & Raudenbush, 1992, Chapters 3 and 10.
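The deviance comparisons above can be reproduced with a short script. To stay within the standard library, this sketch uses the closed-form upper-tail probability of the chi-square distribution that holds for even degrees of freedom, which covers both tests here (df = 2 and df = 4). A general-purpose analysis would normally call a statistics library instead. The function names are ours.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with EVEN df.

    Uses the identity exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


def deviance_test(dev_simple, dev_complex, df):
    """Deviance difference (simple minus complex) and its chi-square p value."""
    diff = dev_simple - dev_complex
    return diff, chi2_sf_even_df(diff, df)


# Deviances as quoted in the text.
diff_cs, p_cs = deviance_test(-229.00, -338.07, df=2)      # CS vs. random linear
diff_quad, p_quad = deviance_test(-338.07, -348.23, df=4)  # linear vs. quadratic

print(round(diff_cs, 2), p_cs)
print(round(diff_quad, 2), round(p_quad, 3))
```

The first p value is vanishingly small, which is why it prints as 0.000 at three decimals. The second comes out at about 0.038, in line with the reported 0.037 up to rounding.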
Note also that the standard error estimated for [...] is considerably smaller under the compound symmetry assumptions than under the other two models. Given that the compound symmetry model provides a poorer fit to the data than does either of the other two models, we must conclude that this smaller standard error is unjustified and that inferences about the fixed effects are sensitive to incorrectly assuming that compound symmetry holds. This illustrates a key point about inference for these models. The question is not only whether the data support the variance-covariance assumptions but whether inferences about fixed effects are sensitive to misspecification of the variance-covariance structure.

Note, however, that none of the three models reported in Table 2.3 is compared with a model that estimates all 15 parameters associated with the 5 time points, nor have we considered alternative covariance structures, including autocorrelated or heteroscedastic level-1 errors. We now turn to that problem.