Chapter 5 - Inference in the Simple Regression Model: Interval Estimation, Hypothesis Testing, and Prediction. In this chapter, students will learn about interval estimation, hypothesis testing, and the least squares predictor.
Chapter 5 Inference in the Simple Regression Model: Interval Estimation, Hypothesis Testing, and Prediction

Assumptions of the Simple Linear Regression Model

SR1 \( y_t = \beta_1 + \beta_2 x_t + e_t \)
SR2 \( E(e_t) = 0 \Leftrightarrow E[y_t] = \beta_1 + \beta_2 x_t \)
SR3 \( \operatorname{var}(e_t) = \sigma^2 = \operatorname{var}(y_t) \)
SR4 \( \operatorname{cov}(e_i, e_j) = \operatorname{cov}(y_i, y_j) = 0 \)
SR5 \( x_t \) is not random and takes at least two different values
SR6 \( e_t \sim N(0, \sigma^2) \Leftrightarrow y_t \sim N[(\beta_1 + \beta_2 x_t), \sigma^2] \) (optional)

Slide 5.1 Undergraduate Econometrics, 2nd Edition – Chapter 5

If all of the above assumptions hold, then the least squares estimators \( b_1 \) and \( b_2 \) are normally distributed random variables. From Chapter 4.4, their means and variances are as follows:

\[ b_1 \sim N\left(\beta_1,\ \frac{\sigma^2 \sum x_t^2}{T \sum (x_t - \bar{x})^2}\right), \qquad b_2 \sim N\left(\beta_2,\ \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right) \]

From Chapter 4.5 we know that the unbiased estimator of the error variance is

\[ \hat{\sigma}^2 = \frac{\sum \hat{e}_t^2}{T - 2} \]

By replacing the unknown parameter \( \sigma^2 \) with this estimator we can estimate the variances of the least squares estimators and their covariance.

In an earlier chapter you learned how to calculate point estimates of the regression parameters \( \beta_1 \) and \( \beta_2 \) using the best linear unbiased estimation procedure. The estimates represent an inference about the regression function \( E(y) = \beta_1 + \beta_2 x \) of the population from which the sample data was drawn.

In this chapter we introduce the additional tools of statistical inference: interval estimation, prediction, interval prediction, and hypothesis testing. A prediction is a forecast of a future value of the dependent variable y. Interval estimation and interval prediction are procedures for creating ranges of values, sometimes called confidence intervals, in which the unknown parameters, or the value of y, are likely to be located. Hypothesis testing procedures are a means of comparing conjectures that we as economists might have about the regression parameters to the information about the parameters contained in a sample of data. Hypothesis tests allow us to say that
the data are compatible, or are not compatible, with a particular conjecture, or hypothesis.

The procedures for interval estimation, prediction, and hypothesis testing depend heavily on assumption SR6 of the simple linear regression model and the resulting normality of the least squares estimators. If assumption SR6 is not made, then the sample size must be sufficiently large so that the distributions of the least squares estimators are approximately normal, in which case the procedures we develop in this chapter are also approximate. In developing the procedures in this chapter we will be using the normal distribution, and distributions related to the normal, namely “Student’s” t-distribution and the chi-square distribution.

5.1 Interval Estimation

5.1.1 The Theory

A standard normal random variable that we will use to construct an interval estimator is based on the normal distribution of the least squares estimator. Consider, for example, the normal distribution of \( b_2 \), the least squares estimator of \( \beta_2 \), which we denote as

\[ b_2 \sim N\left(\beta_2,\ \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right) \]

A standardized normal random variable is obtained from \( b_2 \) by subtracting its mean and dividing by its standard deviation:

\[ Z = \frac{b_2 - \beta_2}{\sqrt{\operatorname{var}(b_2)}} \sim N(0, 1) \] (5.1.1)

That is, the standardized random variable Z is normally distributed with mean 0 and variance 1.

5.1.1a The Chi-Square Distribution

• Chi-square random variables arise when standard normal, N(0,1), random variables are squared. If \( Z_1, Z_2, \ldots, Z_m \) denote m independent N(0,1) random variables, then

\[ V = Z_1^2 + Z_2^2 + \cdots + Z_m^2 \sim \chi^2_{(m)} \] (5.1.2)

The notation \( V \sim \chi^2_{(m)} \) is read as: the random variable V has a chi-square distribution with m degrees of freedom. The degrees of freedom parameter m indicates the number of independent N(0,1) random variables that are squared and summed to form V.

• The value of m
determines the entire shape of the chi-square distribution, including its mean and variance:

\[ E[V] = E\left[\chi^2_{(m)}\right] = m, \qquad \operatorname{var}[V] = \operatorname{var}\left[\chi^2_{(m)}\right] = 2m \] (5.1.3)

In Figure 5.1, graphs of the chi-square distribution for various degrees of freedom, m, are presented.

• Since V is formed by squaring and summing m standardized normal [N(0,1)] random variables, the value of V must be nonnegative, \( v \ge 0 \).
• The distribution has a long tail, or is skewed, to the right.
• As the degrees of freedom m gets larger, the distribution becomes more symmetric and “bell-shaped.”
• As m gets large, the chi-square distribution converges to, and essentially becomes, a normal distribution.

5.1.1b The Probability Distribution of \( \hat{\sigma}^2 \)

• If SR6 holds, then the random error term \( e_t \) has a normal distribution, \( e_t \sim N(0, \sigma^2) \).
• Standardize the random variable by dividing by its standard deviation, so that \( e_t/\sigma \sim N(0, 1) \).
• The square of a standard normal random variable is a chi-square random variable with one degree of freedom, so \( (e_t/\sigma)^2 \sim \chi^2_{(1)} \).
• If all the random errors are independent, then

\[ \sum_t \left(\frac{e_t}{\sigma}\right)^2 = \left(\frac{e_1}{\sigma}\right)^2 + \left(\frac{e_2}{\sigma}\right)^2 + \cdots + \left(\frac{e_T}{\sigma}\right)^2 \sim \chi^2_{(T)} \] (5.1.4)

Since the true random errors are unobservable, we replace them by their sample counterparts, the least squares residuals \( \hat{e}_t = y_t - b_1 - b_2 x_t \), to obtain

\[ V = \frac{\sum_t \hat{e}_t^2}{\sigma^2} = \frac{(T - 2)\hat{\sigma}^2}{\sigma^2} \] (5.1.5)

• The random variable V in Equation (5.1.5) does not have a \( \chi^2_{(T)} \) distribution because the least squares residuals are not independent random variables.
• All T residuals \( \hat{e}_t = y_t - b_1 - b_2 x_t \) depend on the least squares estimators \( b_1 \) and \( b_2 \). It can be shown that only T − 2 of the least squares residuals are independent in the simple linear regression model. Consequently, when multiplied by the constant \( (T - 2)/\sigma^2 \), the random variable \( \hat{\sigma}^2 \) has a chi-square distribution with T − 2 degrees of freedom,

\[ V = \frac{(T - 2)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(T-2)} \]
(5.1.6)

• We have not established the fact that the chi-square random variable V is statistically independent of the least squares estimators \( b_1 \) and \( b_2 \), but it is. Now we turn our attention to defining a t random variable.

… against the alternative \( H_1: \beta_k > c \), the p-value is computed by finding the probability that the test statistic is greater than or equal to the computed sample value of the test statistic.

• In the food expenditure example we test \( H_0: \beta_2 = 0 \) against the alternative \( H_1: \beta_2 > 0 \). This is the relevant “test of significance” for this example, since economic theory rules out negative values of \( \beta_2 \). Following our standard testing format we have:

The null hypothesis is \( H_0: \beta_2 = 0 \). The alternative hypothesis is \( H_1: \beta_2 > 0 \). If the null hypothesis is true, then there is no economic relationship between weekly household income and weekly household food expenditure, given our economic and statistical model. If the alternative hypothesis is true, then there is a positive relationship between income and food expenditure.

The test statistic is \( t = b_2 / \operatorname{se}(b_2) \sim t_{(T-2)} \) if the null hypothesis is true.

For the level of significance \( \alpha = 0.05 \), the critical value \( t_c \) is 1.686 for a t-distribution with T − 2 = 38 degrees of freedom. Thus, we will reject the null hypothesis in favor of the alternative if \( t \ge 1.686 \).

The least squares estimate of \( \beta_2 \) is \( b_2 = 0.1283 \), with standard error \( \operatorname{se}(b_2) = 0.0305 \). Exactly as in the two-tailed test, the value of the test statistic is

\[ t = \frac{0.1283}{0.0305} = 4.20 \]

Conclusion: Since \( t = 4.20 > t_c = 1.686 \), we reject the null hypothesis and accept the alternative, that there is a positive relationship between weekly income and weekly food expenditure.

The p-value for this test is \( P(t \ge 4.20) = 0.0000775 \), which is far less than the level of significance \( \alpha = 0.05 \); thus, we also reject the null hypothesis on this basis. The p-value in the one-tailed test is exactly one-half the
p-value in the two-tailed test. By forming an inequality alternative, we have added the information that negative values of \( \beta_2 \) are impossible.

5.2.11 A Comment on Stating Null and Alternative Hypotheses

• When we fail to reject a null hypothesis, all the hypothesis test can establish is that the information in a sample of data is compatible with the null hypothesis. On the other hand, a statistical test can lead us to reject the null hypothesis, with only a small probability, \( \alpha \), of rejecting the null hypothesis when it is actually true. Thus, rejecting a null hypothesis is a stronger conclusion than failing to reject it.

• The null hypothesis is usually stated in such a way that if our theory is correct, then we will reject the null hypothesis. For example, economic theory implies that there should be a positive relationship between income and food expenditure.

• When using a hypothesis test we would like to establish that there is statistical evidence, based on a sample of data, to support this theory. With this goal we set up the null hypothesis that there is no relation between the variables, \( H_0: \beta_2 = 0 \). In the alternative hypothesis we put the conjecture that we would like to establish, \( H_1: \beta_2 > 0 \).

• Alternatively, suppose the conjecture that we would like to establish is that the marginal propensity to spend on food is greater than 0.10. To do so, we define the null hypothesis \( H_0: \beta_2 = 0.10 \) against the alternative hypothesis \( H_1: \beta_2 > 0.10 \). You may view the null hypothesis as too limited in this case, since it is feasible that \( \beta_2 < 0.10 \). The hypothesis testing procedure for testing the null hypothesis \( H_0: \beta_2 \le 0.10 \) against the alternative hypothesis \( H_1: \beta_2 > 0.10 \) is exactly the same as testing \( H_0: \beta_2 = 0.10 \) against the alternative hypothesis \( H_1: \beta_2 > 0.10 \). The test statistic, rejection region, and p-value are exactly the same. For a one-tailed test you can form
the null hypothesis in either of these ways. What counts is that the alternative hypothesis is properly specified.

• It is important to set up the null and alternative hypotheses before you carry out the regression analysis. Failing to do so can lead to errors in formulating the alternative hypothesis.

5.3 The Least Squares Predictor

• Given the model and assumptions SR1-SR6, we want to predict, for a given value of the explanatory variable \( x_0 \), the value of the dependent variable \( y_0 \), which is given by

\[ y_0 = \beta_1 + \beta_2 x_0 + e_0 \] (5.3.1)

where \( e_0 \) is a random error. This random error has mean \( E(e_0) = 0 \) and variance \( \operatorname{var}(e_0) = \sigma^2 \). We also assume that \( \operatorname{cov}(e_0, e_t) = 0 \).

• In Equation (5.3.1) we can replace the unknown parameters by their estimators, \( b_1 \) and \( b_2 \). Since \( y_0 \) is not known, the random error \( e_0 \) cannot be estimated, so we replace it by its expectation, zero. This produces the least squares predictor of \( y_0 \),

\[ \hat{y}_0 = b_1 + b_2 x_0 \] (5.3.2)

graphed in Figure 5.9. This prediction is given by the point on the least squares fitted line where \( x = x_0 \).

• How good a prediction procedure is this?
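One informal way to explore this question is by simulation. The Python sketch below is an illustration only and is not part of the original slides. It assumes hypothetical parameter values (β1 = 1, β2 = 0.5, σ = 2), a fixed design x = 1, ..., 20, and a hypothetical prediction point x0 = 25; the helper names `ols` and `simulate_forecast_errors` are invented for this example. In each replication it draws a sample, computes the least squares predictor b1 + b2·x0, draws the value y0 to be predicted, and records the difference between the two.

```python
import random
import statistics


def ols(x, y):
    """Return the least squares estimates (b1, b2) for y = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2


def simulate_forecast_errors(beta1, beta2, sigma, x, x0, reps, seed=0):
    """Simulate the error (prediction minus realized y0) over repeated samples."""
    rng = random.Random(seed)
    errors = []
    for _ in range(reps):
        # Draw a new sample of y for the fixed x values (normal errors, as in SR6).
        y = [beta1 + beta2 * xi + rng.gauss(0.0, sigma) for xi in x]
        b1, b2 = ols(x, y)
        # Draw the value to be predicted, with its own independent error e0.
        y0 = beta1 + beta2 * x0 + rng.gauss(0.0, sigma)
        errors.append((b1 + b2 * x0) - y0)
    return errors


# Hypothetical settings chosen only for illustration.
x = [float(i) for i in range(1, 21)]
errors = simulate_forecast_errors(1.0, 0.5, 2.0, x, x0=25.0, reps=5000)
mean_f = statistics.fmean(errors)
var_f = statistics.pvariance(errors)
print(f"average error: {mean_f:.3f}, variance of error: {var_f:.3f}")
```

Under these assumptions the average error should be close to zero, and its variance should be somewhat larger than σ² = 4, because the estimation error in b1 and b2 adds to the variance of e0.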
Since the least squares estimators \( b_1 \) and \( b_2 \) are random variables, so is \( \hat{y}_0 = b_1 + b_2 x_0 \). To evaluate the sampling properties of this predictor it is customary to examine the forecast error

\[ f = \hat{y}_0 - y_0 = b_1 + b_2 x_0 - (\beta_1 + \beta_2 x_0 + e_0) = (b_1 - \beta_1) + (b_2 - \beta_2) x_0 - e_0 \] (5.3.3)

Using the properties of the least squares estimators and the assumptions about \( e_0 \), the expected value of f is:

\[ E(f) = E(\hat{y}_0 - y_0) = E(b_1 - \beta_1) + E(b_2 - \beta_2) x_0 - E(e_0) = 0 + 0 - 0 = 0 \] (5.3.4)

which means that, on average, the forecast error is zero, and \( \hat{y}_0 \) is an unbiased linear predictor of \( y_0 \).

• Using Equation (5.3.3) for the forecast error, and what we know about the variances and covariances of the least squares estimators, the variance of the forecast error can be derived as follows in Equation (5.3.5):

\[
\begin{aligned}
\operatorname{var}(f) &= \operatorname{var}(\hat{y}_0 - y_0) \\
&= E[(b_1 - \beta_1)^2] + x_0^2 E[(b_2 - \beta_2)^2] + E(e_0^2) + 2 x_0 E[(b_1 - \beta_1)(b_2 - \beta_2)] - 2 E[(b_1 - \beta_1) e_0] - 2 x_0 E[(b_2 - \beta_2) e_0] \\
&= \operatorname{var}(b_1) + x_0^2 \operatorname{var}(b_2) + \operatorname{var}(e_0) + 2 x_0 \operatorname{cov}(b_1, b_2) \\
&= \frac{\sigma^2 \sum x_t^2}{T \sum (x_t - \bar{x})^2} + \frac{\sigma^2 x_0^2}{\sum (x_t - \bar{x})^2} + \sigma^2 + 2 x_0 \frac{-\bar{x} \sigma^2}{\sum (x_t - \bar{x})^2} \quad \text{(from Equation (4.2.10))} \\
&= \sigma^2 \left[ 1 + \frac{\sum x_t^2 + T x_0^2 - 2 T x_0 \bar{x}}{T \sum (x_t - \bar{x})^2} \right] \\
&= \sigma^2 \left[ 1 + \frac{\left(\sum x_t^2 - T \bar{x}^2\right) + \left(T x_0^2 - 2 T x_0 \bar{x} + T \bar{x}^2\right)}{T \sum (x_t - \bar{x})^2} \right] \\
&= \sigma^2 \left[ 1 + \frac{\sum (x_t - \bar{x})^2}{T \sum (x_t - \bar{x})^2} + \frac{T (x_0^2 - 2 x_0 \bar{x} + \bar{x}^2)}{T \sum (x_t - \bar{x})^2} \right] \\
&= \sigma^2 \left[ 1 + \frac{1}{T} + \frac{(x_0 - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right]
\end{aligned}
\] (5.3.5)

The two expectations involving \( e_0 \) in the second line are zero because \( e_0 \) is uncorrelated with the sample errors, and therefore with \( b_1 \) and \( b_2 \). Notice that the further \( x_0 \) is from the sample mean \( \bar{x} \), the more unreliable the forecast will be, in the sense that the variance of the forecast error is larger.

• If the random errors are normally distributed, or if the sample size is large, then the forecast error f is normally distributed with mean zero and variance given by Equation (5.3.5).

• The forecast error variance is estimated by replacing \( \sigma^2 \) by its estimator \( \hat{\sigma}^2 \) to give

\[ \widehat{\operatorname{var}}(f) = \hat{\sigma}^2 \left[ 1 + \frac{1}{T} + \frac{(x_0 - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right] \] (5.3.6)

• The square
root of the estimated variance is the standard error of the forecast,

\[ \operatorname{se}(f) = \sqrt{\widehat{\operatorname{var}}(f)} \] (5.3.7)

We can use the predicted value \( \hat{y}_0 \) and the standard error of the forecast to compute a confidence interval, or a prediction interval.

• Consequently we can construct a standard normal random variable as

\[ \frac{f}{\sqrt{\operatorname{var}(f)}} \sim N(0, 1) \] (5.3.8)

Then by replacing \( \operatorname{var}(f) \) in Equation (5.3.8) by \( \widehat{\operatorname{var}}(f) \), we obtain a t-statistic,

\[ t = \frac{f}{\sqrt{\widehat{\operatorname{var}}(f)}} = \frac{f}{\operatorname{se}(f)} \sim t_{(T-2)} \] (5.3.9)

• Using these results we can construct a prediction interval for \( y_0 \) just as we constructed confidence intervals for the parameters \( \beta_k \). If \( t_c \) is a critical value from the \( t_{(T-2)} \) distribution such that \( P(t \ge t_c) = \alpha/2 \), then

\[ P(-t_c \le t \le t_c) = 1 - \alpha \] (5.3.10)

• Substitute the t random variable from Equation (5.3.9) into Equation (5.3.10) to obtain

\[ P\left[ -t_c \le \frac{\hat{y}_0 - y_0}{\operatorname{se}(f)} \le t_c \right] = 1 - \alpha \]

and simplify this expression to obtain

\[ P\left[ \hat{y}_0 - t_c \operatorname{se}(f) \le y_0 \le \hat{y}_0 + t_c \operatorname{se}(f) \right] = 1 - \alpha \] (5.3.11)

• A (1−α)×100% confidence interval, or prediction interval, for \( y_0 \) is

\[ \hat{y}_0 \pm t_c \operatorname{se}(f) \] (5.3.12)

• Equation (5.3.5) implies that the farther \( x_0 \) is from the sample mean \( \bar{x} \), the larger the variance of the prediction error. In other words, our predictions for values of \( x_0 \) close to the sample mean \( \bar{x} \) are more reliable than our predictions for values of \( x_0 \) far from the sample mean \( \bar{x} \).

• The relationship between point and interval predictions for different values of \( x_0 \) is illustrated in Figure 5.10. A point prediction is always given by the fitted least squares line, \( \hat{y}_0 = b_1 + b_2 x_0 \). A prediction confidence interval takes the form of two bands around the least squares line. Since the forecast variance increases the farther \( x_0 \) is from the sample mean \( \bar{x} \), the confidence bands increase in width as \( |x_0 - \bar{x}| \) increases.

5.3.1 Prediction in the Food Expenditure Model

• In Chapter 3.3.3b we predicted the weekly expenditure on food for a household with \( x_0 \) = $750 weekly income. The point prediction is

\[ \hat{y}_0 = b_1 + b_2 x_0 = 40.7676 + 0.1283(750) = 136.98 \]

This means we predict that a household with $750 weekly income will spend $136.98 on food per week.

• Using our estimate \( \hat{\sigma}^2 = 1429.2456 \), the estimated variance of the forecast error is

\[ \widehat{\operatorname{var}}(f) = \hat{\sigma}^2 \left[ 1 + \frac{1}{T} + \frac{(x_0 - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right] = 1429.2456 \left[ 1 + \frac{1}{40} + \frac{(750 - 698)^2}{1532463} \right] = 1467.4986 \]

The standard error of the forecast is then

\[ \operatorname{se}(f) = \sqrt{\widehat{\operatorname{var}}(f)} = \sqrt{1467.4986} = 38.3079 \]

• If we select \( 1 - \alpha = 0.95 \), then \( t_c = 2.024 \) and the 95% confidence interval for \( y_0 \) is

\[ \hat{y}_0 \pm t_c \operatorname{se}(f) = 136.98 \pm 2.024(38.3079) \]

or [59.44 to 214.52]. Our prediction interval suggests that a household with $750 weekly income will spend somewhere between $59.44 and $214.52 on food. Such a wide interval means that our point prediction, $136.98, is not reliable. We might be able to improve it by measuring the effect that factors other than income might have.

Exercises: 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8, 5.10, 5.15, 5.16, 5.19
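The arithmetic of the food expenditure prediction interval in Section 5.3.1 can be checked with a few lines of code. The following Python sketch is an illustration, not part of the original slides. It takes the summary statistics exactly as quoted in the example (including the reported point prediction 136.98 and the critical value 2.024, which are copied from the text rather than computed), and the variable and function names are chosen here for readability only.

```python
import math

# Summary statistics quoted in the food expenditure example (Section 5.3.1).
sigma2_hat = 1429.2456   # estimated error variance
T = 40                   # sample size
x_bar = 698.0            # sample mean of weekly income
sxx = 1532463.0          # sum of squared deviations of x from its mean
x0 = 750.0               # income level at which we predict
y0_hat = 136.98          # point prediction as reported in the text
t_c = 2.024              # critical value for 38 degrees of freedom, from the text


def forecast_variance(sigma2_hat, T, x0, x_bar, sxx):
    """Estimated forecast error variance, as in Equation (5.3.6)."""
    return sigma2_hat * (1.0 + 1.0 / T + (x0 - x_bar) ** 2 / sxx)


var_f = forecast_variance(sigma2_hat, T, x0, x_bar, sxx)
se_f = math.sqrt(var_f)        # standard error of the forecast, Equation (5.3.7)
lower = y0_hat - t_c * se_f    # interval endpoints, Equation (5.3.12)
upper = y0_hat + t_c * se_f

print(f"var(f) = {var_f:.4f}, se(f) = {se_f:.4f}")
print(f"95% prediction interval: [{lower:.2f}, {upper:.2f}]")
```

The printed values should agree with those reported in the example: an estimated forecast variance of 1467.4986, a standard error of 38.3079, and an interval from 59.44 to 214.52. Changing `x0` to a value farther from `x_bar` shows how the interval widens as the prediction point moves away from the sample mean.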