MODEL MATRICES AND MODEL VECTOR SPACES

For the data vector y with 𝝁 = E(y), consider the GLM 𝜼 = X𝜷 with link function g and transformed mean values 𝜼 = g(𝝁). For this GLM, y, 𝝁, and 𝜼 are points in n-dimensional Euclidean space, denoted by ℝn.

1.3.1 Model Matrices Induce Model Vector Spaces

Geometrically, model matrices of GLMs naturally induce vector spaces that determine the possible 𝝁 for a model. Recall that a vector space S is such that if u and v are elements in S, then so are u + v and cu for any constant c.

For a particular n × p model matrix X, the values of X𝜷 for all possible vectors 𝜷 of model parameters generate a vector space that is a linear subspace of ℝn. For all possible 𝜷, 𝜼 = X𝜷 traces out the vector space spanned by the columns of X, that is, the set of all possible linear combinations of the columns of X. This is the column space of X, which we denote by C(X),

C(X) = {𝜼 : there is a 𝜷 such that 𝜼 = X𝜷}.

In the context of GLMs, we refer to the vector space C(X) as the model space. The 𝜼, and hence the 𝝁, that are possible for a particular GLM are determined by the columns of X.
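To make the model space concrete, here is a minimal numeric sketch (the matrix and vectors are hypothetical, not from the text): a candidate 𝜼 lies in C(X) exactly when its least-squares projection onto the columns of X reproduces it.

```python
import numpy as np

X = np.array([[1., 2.],
              [1., 4.],
              [1., 6.]])            # an n = 3, p = 2 model matrix

eta_in = X @ np.array([0.5, 2.0])   # constructed to lie in C(X)
eta_out = np.array([1., 0., 1.])    # not a linear combination of the columns of X

def in_column_space(X, eta, tol=1e-10):
    # eta is in C(X) exactly when its least-squares fit reproduces it
    beta, *_ = np.linalg.lstsq(X, eta, rcond=None)
    return np.allclose(X @ beta, eta, atol=tol)

print(in_column_space(X, eta_in))    # True
print(in_column_space(X, eta_out))   # False
```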

Two models with model matrices Xa and Xb are equivalent if C(Xa) = C(Xb). The matrices Xa and Xb could be different because of a change of units of an explanatory variable (e.g., pounds to kilograms), or a change in the way of specifying indicator variables for a qualitative predictor. On the other hand, if the model with model matrix Xa is a special case of the model with model matrix Xb, for example, with Xa obtained by deleting one or more of the columns of Xb, then the model space C(Xa) is a vector subspace of the model space C(Xb).
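A minimal sketch of this equivalence, with hypothetical data: two model matrices span the same space exactly when neither adds rank to the other, so C(Xa) = C(Xb) can be checked by comparing ranks.

```python
import numpy as np

x_lb = np.array([150., 180., 200., 140.])            # a predictor recorded in pounds
Xa = np.column_stack([np.ones(4), x_lb])             # intercept + pounds
Xb = np.column_stack([np.ones(4), 0.4536 * x_lb])    # intercept + kilograms

def same_column_space(A, B):
    r = np.linalg.matrix_rank
    return r(A) == r(B) == r(np.hstack([A, B]))      # no rank is gained by stacking

print(same_column_space(Xa, Xb))                     # True: equivalent models
```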

1.3.2 Dimension of Model Space Equals Rank of Model Matrix

Recall that the rank of a matrix X is the number of vectors in a basis for C(X), which is a set of linearly independent vectors whose linear combinations generate C(X).

Equivalently, the rank is the number of linearly independent columns (or rows) of X. The dimension of the model space C(X) of 𝜼 values, denoted by dim[C(X)], is defined to be the rank of X. In all but the final chapter of this book, we assume p ≤ n, so the model space has dimension no greater than p. We say that X has full rank when rank(X) = p.
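A minimal sketch with hypothetical matrices: dim[C(X)] can be computed as the matrix rank.

```python
import numpy as np

X_full = np.array([[1., 0.],
                   [1., 1.],
                   [1., 2.]])                                # two linearly independent columns
X_deficient = np.column_stack([X_full, X_full.sum(axis=1)])  # third column = sum of the first two

print(np.linalg.matrix_rank(X_full))        # 2 = p: full rank, dim[C(X)] = p
print(np.linalg.matrix_rank(X_deficient))   # 2 < p = 3: less than full rank
```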

When X has less than full rank, the columns of X are linearly dependent, with at least one column being a linear combination of the other columns. That is, there exist linear combinations of the columns that yield the 0 vector. There are then nonzero p × 1 vectors 𝜻 such that X𝜻 = 0. Such vectors make up the null space of the model matrix,

N(X) = {𝜻 : X𝜻 = 0}.

When X has full rank, then dim[N(X)] = 0. Then, no nonzero combinations of the columns of X yield 0, and N(X) consists solely of the p × 1 zero vector, 0 = (0, 0, …, 0)T. Generally,

dim[C(X)] + dim[N(X)] = p.
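A minimal sketch of this rank–nullity identity, using a hypothetical rank-deficient matrix and scipy.linalg.null_space for a basis of N(X):

```python
import numpy as np
from scipy.linalg import null_space

# Column 3 equals column 1 + column 2, so X has p = 3 columns but rank 2.
X = np.array([[1., 0., 1.],
              [1., 1., 2.],
              [1., 2., 3.]])

N = null_space(X)                            # orthonormal basis of N(X), shape (p, dim N(X))
rank = np.linalg.matrix_rank(X)

print(rank, N.shape[1], rank + N.shape[1])   # 2 1 3: dim[C(X)] + dim[N(X)] = p
print(np.allclose(X @ N, 0))                 # True: every basis vector zeta satisfies X zeta = 0
```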

When X has less than full rank, we will see that the model parameters 𝜷 are not well defined. There is then said to be aliasing of the parameters. In one way this can happen, called extrinsic aliasing, an anomaly of the data causes the linear dependence, such as when the values for one predictor are a linear combination of values for the other predictors (i.e., perfect collinearity). Another way, called intrinsic aliasing, arises when the linear predictor contains inherent redundancies, such as when (in addition to the usual intercept term) we use an indicator variable for each category of a qualitative predictor. The following example illustrates.
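Before that example (which concerns intrinsic aliasing), here is a minimal sketch of the extrinsic case with hypothetical data: the model formula has no built-in redundancy, but perfect collinearity in the observed values makes X rank deficient.

```python
import numpy as np

x1 = np.array([2., 5., 7., 9.])
x2 = np.array([1., 3., 4., 6.])
x3 = 2 * x1 - x2                                 # in this data set, x3 happens to equal 2*x1 - x2

X = np.column_stack([np.ones(4), x1, x2, x3])    # p = 4 columns
print(np.linalg.matrix_rank(X))                  # 3 < 4: the parameters are aliased
```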

1.3.3 Example: The One-Way Layout

Many research studies have the central goal of comparing response distributions for different groups, such as comparing life-length distributions of lung cancer patients under two treatments, comparing mean crop yields for three fertilizers, or comparing mean incomes on the first job for graduating students with various majors. For c groups of independent observations, let yij denote response observation j in group i, for i = 1, …, c and j = 1, …, ni. This data structure is called the one-way layout.

We regard the groups as c categories of a qualitative factor. For 𝜇ij = E(yij), the GLM has linear predictor

g(𝜇ij) = 𝛽0 + 𝛽i.

Let 𝜇i denote the common value of {𝜇ij, j = 1, …, ni}, for i = 1, …, c. For the identity link function and an assumption of normality for the random component, this model is the basis of the one-way ANOVA significance test of H0: 𝜇1 = ⋯ = 𝜇c, which we develop in Section 3.2. This hypothesis corresponds to the special case of the model in which 𝛽1 = ⋯ = 𝛽c.

Let y = (y11, …, y1n1, …, yc1, …, ycnc)T and 𝜷 = (𝛽0, 𝛽1, …, 𝛽c)T. Let 1ni denote the ni × 1 column vector consisting of ni entries of 1, and likewise for 0ni. For the one-way layout, the model matrix X for the linear predictor X𝜷 in the GLM expression g(𝝁) = X𝜷 that represents g(𝜇ij) = 𝛽0 + 𝛽i is

X = \begin{pmatrix}
1_{n_1} & 1_{n_1} & 0_{n_1} & \cdots & 0_{n_1} \\
1_{n_2} & 0_{n_2} & 1_{n_2} & \cdots & 0_{n_2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1_{n_c} & 0_{n_c} & 0_{n_c} & \cdots & 1_{n_c}
\end{pmatrix}.

This matrix has dimension n × p with n = n1 + ⋯ + nc and p = c + 1.

Equivalently, this parameterization corresponds to indexing the observations as yh for h = 1, …, n, defining indicator variables xhi = 1 when observation h is in group i and xhi = 0 otherwise, for i = 1, …, c, and expressing the linear predictor for the link function g applied to E(yh) = 𝜇h as

g(𝜇h) = 𝛽0 + 𝛽1xh1 + ⋯ + 𝛽cxhc.
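The model matrix above can be built directly from observation-level indicators; a minimal sketch with hypothetical group sizes:

```python
import numpy as np

n_i = [3, 2, 4]                                   # hypothetical group sizes n_1, ..., n_c
c = len(n_i)
groups = np.repeat(np.arange(c), n_i)             # group label of each observation h
n = groups.size                                   # n = n_1 + ... + n_c

indicators = (groups[:, None] == np.arange(c)).astype(float)   # x_hi for i = 1, ..., c
X = np.column_stack([np.ones(n), indicators])     # intercept column followed by the c indicators

print(X.shape)                                    # (9, 4): n x p with p = c + 1
print(np.linalg.matrix_rank(X))                   # 3 = c: less than full rank
```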

In either case, the indicator variables whose coefficients are {𝛽1, …, 𝛽c} add up to the vector 1n. That vector, which is the first column of X, has as its coefficient the intercept term 𝛽0. The columns of X are linearly dependent, because columns 2 through c + 1 add up to column 1. Here 𝛽0 is intrinsically aliased with the sum 𝛽1 + ⋯ + 𝛽c. The parameter 𝛽0 is marginal to {𝛽1, …, 𝛽c}, in the sense that the column of X that 𝛽0 multiplies (namely 1n) lies wholly in the column space spanned by the columns that {𝛽1, …, 𝛽c} multiply. So, 𝛽0 is redundant in any explanation of the structure of the linear predictor.
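The intrinsic aliasing shows up numerically as a nonzero vector in N(X); continuing the hypothetical one-way layout above, 𝜻 = (1, −1, …, −1)T satisfies X𝜻 = 0.

```python
import numpy as np

n_i = [3, 2, 4]                                   # hypothetical group sizes
c = len(n_i)
groups = np.repeat(np.arange(c), n_i)
X = np.column_stack([np.ones(groups.size),
                     (groups[:, None] == np.arange(c)).astype(float)])

zeta = np.array([1.] + [-1.] * c)                 # zeta = (1, -1, ..., -1)^T
print(np.allclose(X @ zeta, 0))                   # True: column 1 minus columns 2..c+1 is 0
```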

Because of the linear dependence of the columns of X, this matrix does not have full rank. But we can achieve full rank merely by dropping one column of X, because we need only c − 1 indicators to represent a c-category explanatory variable. The resulting model, with one less parameter, has a reduced model matrix whose column space is the same as that of X.
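A minimal sketch of that reduction, continuing the same hypothetical layout: dropping one indicator column leaves a full-rank matrix with an unchanged column space.

```python
import numpy as np

n_i = [3, 2, 4]                                   # hypothetical group sizes
c = len(n_i)
groups = np.repeat(np.arange(c), n_i)
X = np.column_stack([np.ones(groups.size),
                     (groups[:, None] == np.arange(c)).astype(float)])

X_reduced = X[:, :-1]                             # drop the last indicator (one possible choice)
r = np.linalg.matrix_rank
print(r(X_reduced) == X_reduced.shape[1])         # True: the reduced matrix has full rank c
print(r(X) == r(X_reduced) == r(np.hstack([X, X_reduced])))   # True: same column space
```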
