CHAPTER 41

Interval Estimation

We will first show how the least squares principle can be used to construct confidence regions, and then we will derive the properties of these confidence regions.

41.1. A Basic Construction Principle for Confidence Regions

The least squares objective function, whose minimum argument gave us the BLUE, naturally allows us to generate confidence intervals or higher-dimensional confidence regions. A confidence region for $\beta$ based on $y = X\beta + \varepsilon$ can be constructed as follows:

• Draw the OLS estimate $\hat\beta$ into $k$-dimensional space; it is the vector which minimizes $SSE = (y - X\hat\beta)^\top (y - X\hat\beta)$.

• For every other vector $\tilde\beta$ one can define the sum of squared errors associated with that vector as $SSE_{\tilde\beta} = (y - X\tilde\beta)^\top (y - X\tilde\beta)$. Draw the level hypersurfaces (if $k = 2$: level lines) of this function. These are ellipsoids centered on $\hat\beta$.

• Each of these ellipsoids is a confidence region for $\beta$. Different confidence regions differ by their coverage probabilities.

• If one is only interested in certain coordinates of $\beta$ and not in the others, or in some other linear transformation $R\beta$, then the corresponding confidence regions are the corresponding transformations of this ellipse. Geometrically this can best be seen if this transformation is an orthogonal projection; then the confidence ellipse of the transformed vector $R\beta$ is also a projection or “shadow” of the confidence region for the whole vector. Projections of the same confidence region have the same confidence level, independent of the direction in which this projection goes.

The confidence regions for $\beta$ with coverage probability $\pi$ will be written here as $B_{\beta;\pi}$ or, if we want to make its dependence on the observation vector $y$ explicit, $B_{\beta;\pi}(y)$.
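Why the level surfaces of the sum of squared errors are ellipsoids centered on $\hat\beta$ can be seen from a standard decomposition. The following worked equation is a sketch added for illustration; it uses only the normal equations $X^\top (y - X\hat\beta) = o$ and assumes that $X$ has full column rank.

```latex
\begin{align*}
SSE_{\tilde\beta}
  &= (y - X\tilde\beta)^\top (y - X\tilde\beta) \\
  &= \bigl((y - X\hat\beta) + X(\hat\beta - \tilde\beta)\bigr)^\top
     \bigl((y - X\hat\beta) + X(\hat\beta - \tilde\beta)\bigr) \\
  &= SSE + 2(\hat\beta - \tilde\beta)^\top X^\top (y - X\hat\beta)
     + (\hat\beta - \tilde\beta)^\top X^\top X (\hat\beta - \tilde\beta) \\
  &= SSE + (\hat\beta - \tilde\beta)^\top X^\top X (\hat\beta - \tilde\beta)
\end{align*}
```

Since $X^\top X$ is positive definite under the full-rank assumption, every set of the form $\{\tilde\beta : SSE_{\tilde\beta} = \text{const}\}$ with a constant above $SSE$ is an ellipsoid centered on $\hat\beta$.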
These confidence regions are level lines of the SSE, and mathematically, it is advantageous to define these level lines by their level relative to the minimum level, i.e., as the set of all $\tilde\beta$ for which the quotient of the attained $SSE_{\tilde\beta} = (y - X\tilde\beta)^\top (y - X\tilde\beta)$ divided by the smallest possible $SSE = (y - X\hat\beta)^\top (y - X\hat\beta)$ is smaller than or equal to a given number. In formulas,

$$
\tilde\beta \in B_{\beta;\pi}(y) \iff \frac{(y - X\tilde\beta)^\top (y - X\tilde\beta)}{(y - X\hat\beta)^\top (y - X\hat\beta)} \le c_{\pi;n-k,k} \tag{41.1.1}
$$

It will be shown below, in the discussion following (41.2.1), that $c_{\pi;n-k,k}$ only depends on $\pi$ (the confidence level), $n-k$ (the degrees of freedom in the regression), and $k$ (the dimension of the confidence region).

To get a geometric intuition of this principle, look at the case $k = 2$, in which the parameter vector $\beta$ has only two components. For each possible value $\tilde\beta$ of the parameter vector, the associated sum of squared errors is $SSE_{\tilde\beta} = (y - X\tilde\beta)^\top (y - X\tilde\beta)$. This is a quadratic function of $\tilde\beta$, whose level lines form concentric ellipses as shown in Figure 1. The center of these ellipses is the unconstrained least squares estimate. Each of the ellipses is a confidence region for $\beta$ for a different confidence level.

If one needs a confidence region not for the whole vector $\beta$ but, say, for $i$ linearly independent linear combinations $R\beta$ (here $R$ is an $i \times k$ matrix with full row rank), then the above principle applies in the following way: the vector $\tilde u$ lies in the confidence region for $R\beta$ generated by $y$ for confidence level $\pi$, notation $B_{R\beta;\pi}$, if and only if there is a $\tilde\beta$ in the confidence region (41.1.1) (with the parameters adjusted to reflect the dimensionality of $\tilde u$) which satisfies $R\tilde\beta = \tilde u$:

$$
\tilde u \in B_{R\beta;\pi}(y) \iff \text{there exists a } \tilde\beta \text{ with } \tilde u = R\tilde\beta \text{ and } \frac{(y - X\tilde\beta)^\top (y - X\tilde\beta)}{(y - X\hat\beta)^\top (y - X\hat\beta)} \le c_{\pi;n-k,i} \tag{41.1.2}
$$

Problem 416. Why does one have to change the value of $c$ when one goes over to the projections of the confidence regions?
Answer. Because the projection is a many-to-one mapping, and vectors which are not in the original ellipsoid may still end up in the projection.

Again let us illustrate this with the 2-dimensional case in which the confidence region for $\beta$ is an ellipse, as drawn in Figure 1, called $B_{\beta;\pi}(y)$. Starting with this ellipse, the above criterion defines individual confidence intervals for linear combinations $u = r^\top\beta$ by the rule: $\tilde u \in B_{r^\top\beta;\pi}(y)$ iff a $\tilde\beta \in B_{\beta}(y)$ exists with $r^\top\tilde\beta = \tilde u$. For $r^\top = [\,1\ \ 0\,]$, this interval is simply the projection of the ellipse on the horizontal axis, and for $r^\top = [\,0\ \ 1\,]$ it is the projection on the vertical axis.

The same argument applies for all vectors $r$ with $r^\top r = 1$. The inner product of two vectors is the length of the first vector times the (signed) length of the projection of the second vector on the first. If $r^\top r = 1$, therefore, $r^\top\tilde\beta$ is simply the length of the orthogonal projection of $\tilde\beta$ on the line generated by the vector $r$. Therefore the confidence interval for $r^\top\beta$ is simply the projection of the ellipse on the line generated by $r$. (This projection is sometimes called the “shadow” of the ellipse.)

The confidence region for $R\beta$ can also be defined as follows: $\tilde u$ lies in this confidence region if and only if the “best” $\hat{\hat\beta}$ which satisfies $R\hat{\hat\beta} = \tilde u$ lies in the confidence region (41.1.1), this best $\hat{\hat\beta}$ being, of course, the constrained least squares estimate subject to the constraint $R\beta = \tilde u$, whose formula is given by (29.3.13).
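The “shadow” interpretation can be checked numerically for $k = 2$. The sketch below is an illustration under stated assumptions, not part of the construction itself: the region is taken to be an ellipse of the form $\{\tilde\beta : (\tilde\beta - \hat\beta)^\top A (\tilde\beta - \hat\beta) \le q\}$, where `A` stands in for $X^\top X$ and `q` for the bound on the quadratic form, and all numbers and function names are made up. It compares a brute-force projection of the ellipse boundary on a direction `r` with the interval $r^\top\hat\beta \pm \sqrt{q\, r^\top A^{-1} r}$, which is what one obtains by minimizing the quadratic form subject to $r^\top\tilde\beta = \tilde u$.

```python
import math


def shadow_closed_form(b_hat, A, q, r):
    """Interval r'b_hat +/- sqrt(q * r' A^{-1} r) for a symmetric 2x2 matrix A."""
    a11, a12, a22 = A[0][0], A[0][1], A[1][1]
    det = a11 * a22 - a12 * a12
    # r' A^{-1} r, written out with the explicit inverse of a 2x2 matrix
    quad = (a22 * r[0] ** 2 - 2.0 * a12 * r[0] * r[1] + a11 * r[1] ** 2) / det
    center = r[0] * b_hat[0] + r[1] * b_hat[1]
    half = math.sqrt(q * quad)
    return center - half, center + half


def shadow_brute_force(b_hat, A, q, r, steps=20000):
    """Project points on the ellipse boundary onto r and record the extremes."""
    a11, a12, a22 = A[0][0], A[0][1], A[1][1]
    # Cholesky factor L of A (A = L L'): if L'd = z and z'z = q, then d'Ad = q
    l11 = math.sqrt(a11)
    l21 = a12 / l11
    l22 = math.sqrt(a22 - l21 * l21)
    lo, hi = math.inf, -math.inf
    for s in range(steps):
        t = 2.0 * math.pi * s / steps
        z1 = math.sqrt(q) * math.cos(t)
        z2 = math.sqrt(q) * math.sin(t)
        # back-substitution for L'd = z, where L' = [[l11, l21], [0, l22]]
        d2 = z2 / l22
        d1 = (z1 - l21 * d2) / l11
        u = r[0] * (b_hat[0] + d1) + r[1] * (b_hat[1] + d2)
        lo, hi = min(lo, u), max(hi, u)
    return lo, hi


# Made-up illustration values (assumptions, not data from the text)
b_hat = (1.0, -1.0)
A = [[4.0, 2.0], [2.0, 3.0]]
q = 2.0
for r in [(1.0, 0.0), (0.0, 1.0), (0.6, 0.8)]:
    print(r, shadow_closed_form(b_hat, A, q, r), shadow_brute_force(b_hat, A, q, r))
```

The three directions all satisfy $r^\top r = 1$; the first two give the projections on the horizontal and vertical axes. The two intervals printed for each direction should agree up to the resolution of the boundary grid.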
The confidence region for $R\beta$ consists therefore of all $\tilde u$ for which the constrained least squares estimate $\hat{\hat\beta} = \hat\beta - (X^\top X)^{-1} R^\top \bigl(R (X^\top X)^{-1} R^\top\bigr)^{-1} (R\hat\beta - \tilde u)$ satisfies condition (41.1.1):

$$
\tilde u \in B_{R\beta;\pi}(y) \iff \frac{(y - X\hat{\hat\beta})^\top (y - X\hat{\hat\beta})}{(y - X\hat\beta)^\top (y - X\hat\beta)} \le c_{\pi;n-k,i} \tag{41.1.3}
$$

One can also write it as

$$
\tilde u \in B_{R\beta;\pi}(y) \iff \frac{SSE_{\text{constrained}}}{SSE_{\text{unconstrained}}} \le c_{\pi;n-k,i} \tag{41.1.4}
$$

i.e., those $\tilde u$ are in the confidence region which, if imposed as a constraint on the regression, will not make the SSE too much bigger.

Figure 1. Confidence Ellipse with “Shadows”

In order to transform (41.1.3) into a mathematically more convenient form, write it as

$$
\tilde u \in B_{R\beta;\pi}(y) \iff \frac{(y - X\hat{\hat\beta})^\top (y - X\hat{\hat\beta}) - (y - X\hat\beta)^\top (y - X\hat\beta)}{(y - X\hat\beta)^\top (y - X\hat\beta)} \le c_{\pi;n-k,i} - 1
$$

and then use (29.7.2) to get

$$
\tilde u \in B_{R\beta;\pi}(y) \iff \frac{(R\hat\beta - \tilde u)^\top \bigl(R (X^\top X)^{-1} R^\top\bigr)^{-1} (R\hat\beta - \tilde u)}{(y - X\hat\beta)^\top (y - X\hat\beta)} \le c_{\pi;n-k,i} - 1 \tag{41.1.5}
$$

This formula has the great advantage that $\hat{\hat\beta}$ no longer appears in it. The condition whether $\tilde u$ belongs to the confidence region is here formulated in terms of $\hat\beta$ alone.

Problem 417. Using (18.2.12), show that (41.1.1) can be rewritten as

$$
\tilde\beta \in B_{\beta;\pi}(y) \iff \frac{(\hat\beta - \tilde\beta)^\top X^\top X (\hat\beta - \tilde\beta)}{(y - X\hat\beta)^\top (y - X\hat\beta)} \le c_{\pi;n-k,k} - 1 \tag{41.1.6}
$$

Verify that this is the same as (41.1.5) in the special case $R = I$.

Problem 418. You have run a regression with intercept, but you are not interested in the intercept per se but need a joint confidence region for all slope parameters. Using the notation of Problem 361, show that this confidence region has the form

$$
\tilde\beta \in B_{\beta;\pi}(y) \iff \frac{(\hat\beta - \tilde\beta)^\top \underline{X}^\top \underline{X} (\hat\beta - \tilde\beta)}{(\underline{y} - \underline{X}\hat\beta)^\top (\underline{y} - \underline{X}\hat\beta)} \le c_{\pi;n-k,k-1} - 1 \tag{41.1.7}
$$

where $\underline{X}$ and $\underline{y}$ are the regressors and the dependent variable with their means swept out. I.e., we are sweeping the means out of both regressors and dependent variables, and then we act as if the regression never had an intercept and use the formula for the full parameter vector (41.1.6) for these transformed data (except that the number of degrees of freedom $n-k$ still reflects the intercept as one of the explanatory variables).

Answer. Write the full parameter vector as $\begin{bmatrix} \alpha \\ \beta \end{bmatrix}$ and $R = \begin{bmatrix} o & I \end{bmatrix}$. Use (41.1.5) but instead of $\tilde u$ write $\tilde\beta$. The only tricky part is the following, which uses (30.0.37) (on the left-hand side, $X$ stands for the full regressor matrix of (41.1.5), including the column $\iota$):

$$
R (X^\top X)^{-1} R^\top =
\begin{bmatrix} o & I \end{bmatrix}
\begin{bmatrix}
1/n + \bar x^\top (\underline{X}^\top \underline{X})^{-1} \bar x & -\bar x^\top (\underline{X}^\top \underline{X})^{-1} \\
-(\underline{X}^\top \underline{X})^{-1} \bar x & (\underline{X}^\top \underline{X})^{-1}
\end{bmatrix}
\begin{bmatrix} o^\top \\ I \end{bmatrix}
= (\underline{X}^\top \underline{X})^{-1} \tag{41.1.8}
$$

The denominator is $(y - \iota\hat\alpha - X\hat\beta)^\top (y - \iota\hat\alpha - X\hat\beta)$, but since $\hat\alpha = \bar y - \bar x^\top \hat\beta$, see Problem 242, this denominator can be rewritten as $(\underline{y} - \underline{X}\hat\beta)^\top (\underline{y} - \underline{X}\hat\beta)$.

Problem 419. 3 points We are in the simple regression $y_t = \alpha + \beta x_t + \varepsilon_t$. If one draws, for every value of $x$, a 95% confidence interval for $\alpha + \beta x$, one gets a “confidence band” around the fitted line, as shown in Figure 2. Is the probability that this confidence band covers the true regression line over its whole length equal to 95%, greater than 95%, or smaller than 95%? Give a good verbal reasoning for your answer. You should make sure that your explanation is consistent with the fact that the confidence interval is random and the true regression line is fixed.

Figure 2. Confidence Band for Regression Line

41.2. Coverage Probability of the Confidence Regions

The probability that any given known value $\tilde u$ lies in the confidence region (41.1.3) depends on the unknown $\beta$. But we will show now that the “coverage probability” of the region, i.e., the probability with which the confidence region contains the unknown true value $u = R\beta$, does not depend on any unknown parameters.

To get the coverage probability, we must substitute $\tilde u = R\beta$ (where $\beta$ is the true parameter value) in (41.1.5).
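The claim that the coverage probability is free of unknown parameters can be illustrated numerically before it is derived. The following sketch is an illustration only, under simplifying assumptions that are not in the text: a single regressor without intercept ($k = 1$, $R = 1$), made-up regressor values, and the hypothetical helper names `pivot` and `simulate`. For identical standard Normal draws it evaluates the left-hand side of (41.1.5) at the true parameter, which in this special case is $(\hat\beta - \beta)^2\, x^\top x / SSE$, under two different pairs $(\beta, \sigma)$.

```python
import math
import random


def pivot(x, z, beta, sigma):
    """(beta_hat - beta)^2 * x'x / SSE for y = beta*x + sigma*z (k = 1, no intercept)."""
    y = [beta * xi + sigma * zi for xi, zi in zip(x, z)]
    sxx = sum(xi * xi for xi in x)
    beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    sse = sum((yi - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    return (beta_hat - beta) ** 2 * sxx / sse


def simulate(beta, sigma, n=12, reps=500, seed=0):
    """Values of the ratio over `reps` samples; the seed fixes the standard Normal draws."""
    rng = random.Random(seed)
    x = [float(t + 1) for t in range(n)]  # made-up regressor values 1, 2, ..., n
    values = []
    for _ in range(reps):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        values.append(pivot(x, z, beta, sigma))
    return values


# Two arbitrary parameter settings driven by the same underlying draws
a = simulate(beta=2.0, sigma=0.5)
b = simulate(beta=-7.5, sigma=10.0)
same = all(math.isclose(qa, qb, rel_tol=1e-6, abs_tol=1e-12) for qa, qb in zip(a, b))
print(same)
```

Because the ratio takes the same values (up to rounding) whatever $\beta$ and $\sigma$ are, the probability that it stays below any given bound cannot depend on them either. We now return to the substitution of $\tilde u = R\beta$ into (41.1.5).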
This substitution gives

$$
R\beta \in B_{R\beta;\pi}(y) \iff \frac{(R\hat\beta - R\beta)^\top \bigl(R (X^\top X)^{-1} R^\top\bigr)^{-1} (R\hat\beta - R\beta)}{(y - X\hat\beta)^\top (y - X\hat\beta)} \le c_{\pi;n-k,i} - 1 \tag{41.2.1}
$$

Let us look at numerator and denominator separately. Under the Normality assumption, $R\hat\beta \sim N\bigl(R\beta, \sigma^2 R (X^\top X)^{-1} R^\top\bigr)$. Therefore, by (10.4.9), the distribution of the numerator of (41.2.1) is

$$
(R\hat\beta - R\beta)^\top \bigl(R (X^\top X)^{-1} R^\top\bigr)^{-1} (R\hat\beta - R\beta) \sim \sigma^2 \chi^2_i. \tag{41.2.2}
$$

This probability distribution only depends on one unknown parameter, namely, $\sigma^2$. Regarding the denominator, remember that, by (24.4.2), $(y - X\hat\beta)^\top (y - X\hat\beta) = \varepsilon^\top M \varepsilon$, and if we apply (10.4.9) to this we can see that

$$
(y - X\hat\beta)^\top (y - X\hat\beta) \sim \sigma^2 \chi^2_{n-k} \tag{41.2.3}
$$

Furthermore, numerator and denominator are independent. To see this, look first at $\hat\beta$ and $\hat\varepsilon$. By Problem 300 they are uncorrelated, and since they are also jointly Normal, it follows that they are independent. If $\hat\beta$ and $\hat\varepsilon$ are independent, any functions of $\hat\beta$ are independent of any functions of $\hat\varepsilon$. The numerator in the test statistic (41.2.1) is a function of $\hat\beta$ and the denominator is a function of $\hat\varepsilon$; therefore [...]

[...] Furthermore, in this variation, the numerator and denominator are independent random variables. If this test statistic is much larger than 1, then the constraints are incompatible with the data and the null hypothesis must be rejected. The statistic (42.1.3) can also be written as

(42.1.4) (SSE constrained − SSE unconstrained)/number of constraints, divided by SSE unconstrained/(numb. of observations − numb. of coefficients in [...]

[...] of $\sigma^2$ when the null hypothesis is correct, and has a positive bias otherwise.

• If the distribution of $\varepsilon$ is normal, then numerator and denominator are independent. The numerator is a function of $\hat\beta$ and the denominator one of $\hat\varepsilon$, and $\hat\beta$ and $\hat\varepsilon$ are independent.

• Again under assumption of normality, numerator and denominator are distributed as $\sigma^2\chi^2$ with $i$ and $n-k$ degrees of freedom, divided by their [...]
[...] standard normal, and the expression on the right of the big slash is the square root of an independent $\chi^2_{n-k}$ divided by $n-k$. The random variable between the absolute signs has therefore a $t$-distribution, and (41.4.13) follows from (41.4.8).

In R, one obtains $t_{(n-k;\alpha/2)}$ by giving the command qt(1-alpha/2,n-p). Here qt stands for t-quantile [BCW96, p. 48]. One needs 1-alpha/2 instead of alpha/2 [...]

41.4. INTERPRETATION [...]

[...] (1) and (2) explicit. For instance [Chr87, p. 29ff] distinguishes between “testing linear parametric functions” and “testing models.” However the distinction between all 3 principles has been introduced into the linear model only after the discovery that these three principles give different but asymptotically equivalent tests in the Maximum Likelihood estimation. Compare [DM93, Chapter 3.6] about this [...]

[...] if the constraint holds and it is biased upwards if the constraint does not hold. The unconstrained $SSE_u$, divided by its degrees of freedom, on the other hand, is always an unbiased estimator of $\sigma^2$. (42.1.1)

42. THREE PRINCIPLES FOR TESTING A LINEAR CONSTRAINT

If the constraint holds, the SSE’s divided by their respective degrees of freedom should give roughly equal numbers. According to this, a [...]

[...] twice: once with the constraint $R\beta = u$, and once without the constraint. Reject the null hypothesis if the model with the constraint imposed has a much worse fit than the model without the constraint.

(3) (“Lagrange Multiplier Criterion”) This third criterion is based on the constrained estimator only. It has two variants. In its “score test” variant, [...]
[...] principle (1), and the $F$-test for several parameters by principle (2). Later, the student is surprised to find out that the $t$-test and the $F$-test in one dimension are equivalent, i.e., that the difference between $t$-test and $F$-test has nothing to do with the dimension of the parameter vector to be tested. Some textbooks make the distinction between (1) and (2) explicit. For instance [Chr87, p. 29ff] distinguishes [...]

[...] $\sigma^2$ cancels out, and the ratio has an $F$ distribution. Since both numerator and denominator have the same expected value $\sigma^2$, the value of this $F$ distribution should be in the order of magnitude of 1. If it is much larger than that, the null hypothesis is to be rejected. (Precise values in the $F$-tables.)

42.2. Examples of Tests of Linear Hypotheses [...]

[...] not reject and the $t$-test for $\beta_2 = \beta_3$ does not reject either, but the $t$-test for $\beta_1 = \beta_3$ does reject!

Problem 426. 4 points [Seb77, exercise 4b.5 on p. 109/10] In the model $y = \beta + \varepsilon$ with $\varepsilon \sim N(o, \sigma^2 I)$ and subject to the constraint $\iota^\top\beta = 0$, which we had in Problem 348, compute the test statistic for the hypothesis $\beta_1 = \beta_3$.

Answer. In this problem, the “unconstrained” model for the purposes of testing is already constrained; it is subject to the constraint $\iota^\top\beta = 0$. The “constrained” model has the additional constraint $R\beta = \begin{bmatrix} 1 & 0 & -1 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} = 0$. In Problem 348 we computed the “unconstrained” estimates $\hat\beta = y - \iota\bar y$ and $s^2 = n\bar y^2 = (y_1 + \cdots + y_n)^2/n$. You are allowed to use this without proving it again. Therefore $R\hat\beta =$ [...]