After this chapter, students should understand: model specification and the data; estimating the parameters of the multiple regression model; sampling properties of the least squares estimator; interval estimation; hypothesis testing for a single coefficient; and measuring goodness of fit.
Chapter 7 The Multiple Regression Model

Undergraduate Econometrics, 2nd Edition, Chapter 7

7.1 Model Specification and the Data

• When we turn an economic model with more than one explanatory variable into its corresponding econometric model, we refer to it as a multiple regression model.
• Most of the results we developed for the simple regression model in Chapters 3–6 extend naturally to this general case. There are slight changes in the interpretation of the β parameters, the degrees of freedom for the t-distribution change, and we need to modify the assumption concerning the characteristics of the explanatory (x) variables.
• As an example for introducing and analyzing the multiple regression model, consider a model used to explain total revenue for a fast-food hamburger chain in the San Francisco Bay area. We begin with an outline of this model and the questions that we hope it will answer.

7.1.1 The Economic Model

• Each week the management of a Bay Area Rapid Food hamburger chain must decide how much money to spend on advertising its products, and what specials (lower prices) to introduce for that week.
• How does total revenue change as the level of advertising expenditure changes? Does an increase in advertising expenditure lead to an increase in total revenue?
If so, is the increase in total revenue sufficient to justify the increased advertising expenditure?
• Management is also interested in pricing strategy. Will reducing prices lead to an increase or a decrease in total revenue? If a reduction in price leads only to a small increase in the quantity sold, total revenue will fall (demand is price inelastic); a price reduction that leads to a large increase in quantity sold will produce an increase in total revenue (demand is price elastic). This economic information is essential for effective management.
• We initially hypothesize that total revenue, tr, is linearly related to price, p, and advertising expenditure, a. Thus the economic model is

\[ tr = \beta_1 + \beta_2 p + \beta_3 a \qquad (7.1.1) \]

where tr represents total revenue for a given week, p represents price in that week, and a is the level of advertising expenditure during that week. Both tr and a are measured in thousands of dollars.
• Let us assume that management has constructed a single weekly price series, p, measured in dollars and cents, that describes overall prices.
• The remaining items in Equation (7.1.1) are the unknown parameters β1, β2 and β3 that describe the dependence of revenue (tr) on price (p) and advertising (a).
• In the multiple regression model the intercept parameter, β1, is the value of the dependent variable when each of the independent, explanatory variables takes the value zero. In many cases this parameter has no clear economic interpretation, but it is almost always included in the regression model. It helps in the overall estimation of the model and in prediction.
• The other parameters in the model measure the change in the value of the dependent variable given a unit change in an explanatory variable, all other variables held constant. For example, in Equation (7.1.1), β2 = the change in tr ($1000) when p is increased by one unit ($1) and a is held constant, or

\[ \beta_2 = \frac{\Delta tr}{\Delta p}\ (a \text{ held constant}) = \frac{\partial tr}{\partial p} \]

• The symbol ∂ stands for "partial differentiation." It means that we calculate the change in one variable, tr, when the variable p changes, with all other factors, here a, held constant.
• The sign of β2 could be positive or negative. If an increase in price leads to an increase in revenue, then β2 > 0, and the demand for the chain's products is price inelastic. Conversely, a price elastic demand exists if an increase in price leads to a decline in revenue, in which case β2 < 0. Thus, knowledge of the sign of β2 provides information on the price elasticity of demand. The magnitude of β2 measures the amount of the change in revenue for a given price change.
• The parameter β3 describes the response of revenue to a change in the level of advertising expenditure. That is, β3 = the change in tr ($1000) when a is increased by one unit ($1000) and p is held constant, or

\[ \beta_3 = \frac{\Delta tr}{\Delta a}\ (p \text{ held constant}) = \frac{\partial tr}{\partial a} \]

• We expect the sign of β3 to be positive. That is, we expect that an increase in advertising expenditure, unless the ad is offensive, will lead to an increase in total revenue.
• The next step along the road to learning about β1, β2 and β3 is to convert the economic model into an econometric model.

7.1.2 The Econometric Model

• The economic model in Equation (7.1.1) describes the expected behavior of many individual franchises. As such we should write it as E(tr) = β1 + β2p + β3a, where E(tr) is the "expected value" of total revenue.
• Weekly data for total revenue, price and advertising will not follow an exact linear relationship. Equation (7.1.1) describes, not a line as in Chapters 3–6, but a plane.
• The plane intersects the vertical axis at β1. The parameters β2 and β3 measure the slope of the plane in the directions of the "price axis" and the "advertising axis," respectively.
• Table 7.1 shows representative weekly observations on total revenue, price and advertising expenditure for a hamburger franchise. If we plot the data we obtain Figure 7.1. These data do not fall exactly on a plane, but instead resemble a "cloud."
• To allow for a difference between observable total revenue and the expected value of total revenue we add a random error term, e = tr − E(tr). This random error represents all the factors that cause weekly total revenue to differ from its expected value. These factors might include the weather, the behavior of competitors, a new Surgeon General's report on the deadly effects of fat intake, etc.
• Denoting the t'th weekly observation by the subscript t, we have

\[ tr_t = E(tr) + e_t = \beta_1 + \beta_2 p_t + \beta_3 a_t + e_t \qquad (7.1.2) \]

• The economic model in Equation (7.1.1) describes the average, systematic relationship between the variables tr, p, and a. The expected value E(tr) is the nonrandom, systematic component, to which we add the random error e to determine tr. Thus, tr is a random variable. We do not know what the value of weekly total revenue will be until we observe it.
• The introduction of the error term, and assumptions about its probability distribution, turn the economic model into the econometric model in Equation (7.1.2). The econometric model provides a more realistic description of the relationship between the variables, as well as a framework for developing and assessing estimators of the unknown parameters.

7.1.2a The General Model

• In a general multiple regression model a dependent variable $y_t$ is related to a number of explanatory variables $x_{t2}, x_{t3}, \ldots, x_{tK}$ through a linear equation that can be written as

\[ y_t = \beta_1 + \beta_2 x_{t2} + \beta_3 x_{t3} + \cdots + \beta_K x_{tK} + e_t \qquad (7.1.3) \]

• The coefficients β1, β2, …, βK are unknown parameters. A parameter $\beta_k$ measures the effect of a change in the variable $x_{tk}$ upon the expected value of $y_t$, $E(y_t)$, all other variables held constant. The parameter β1 is the intercept term. The "variable" to which β1 is attached is $x_{t1} = 1$.

7.6 Measuring Goodness of Fit

• The coefficient of determination, $R^2$, is a measure of the proportion of variation in the dependent variable that is explained by variation in the explanatory variables. The coefficient of determination for the multiple regression model is

\[ R^2 = \frac{SSR}{SST} = \frac{\sum (\hat{y}_t - \bar{y})^2}{\sum (y_t - \bar{y})^2} = 1 - \frac{SSE}{SST} = 1 - \frac{\sum \hat{e}_t^2}{\sum (y_t - \bar{y})^2} \qquad (7.6.1) \]

where SSR is the variation in y "explained" by the model, SST is the total variation in y about its mean, and SSE is the sum of squared least squares residuals, the portion of the variation in y that is not explained by the model. The values of these sums of squares appear in the analysis of variance (or ANOVA) table reported by regression software.
• For the Bay Area Burger example the analysis of variance table includes the following information:

Table 7.4 Partial ANOVA Table
Source         DF    Sum of Squares
Explained       2    11776.18
Unexplained    49     1805.168
Total          51    13581.35

• Using these sums of squares we have

\[ R^2 = 1 - \frac{\sum \hat{e}_t^2}{\sum (y_t - \bar{y})^2} = 1 - \frac{1805.168}{13581.35} = 0.867 \]

• The interpretation of $R^2$ is that 86.7% of the variation in total revenue is explained by the variation in price and by the variation in the level of advertising expenditure. It means that, in our sample, only 13.3% of the variation in revenue is left unexplained and is due to variation in the error term or to variation in other variables that implicitly form part of the error term.
• The coefficient of determination is also viewed as a measure of the predictive ability of the model over the sample period, or as a measure of how well the estimated regression fits the data. The value of $R^2$ is equal to the squared sample correlation coefficient between the values of $\hat{y}_t$ and $y_t$.
• Since the sample correlation measures the linear association between two variables, a high $R^2$ means there is a close association between the values of $y_t$ and the values predicted by the model, $\hat{y}_t$. In this case, the model is said to "fit" the data well. If $R^2$ is low, there is not a close association between the values of $y_t$ and the values predicted by the model, $\hat{y}_t$, and the model does not fit the data well.
• One difficulty with $R^2$ is that it can be made large by adding more and more variables, even if the variables added have no economic justification. Algebraically it is a fact that as variables are added the sum of squared errors SSE goes down (it can remain unchanged but this is rare) and thus $R^2$ goes up. If the model contains T − 1 variables, then $R^2 = 1$. The manipulation of a model just to obtain a high $R^2$ is not wise.
• An alternative measure of goodness of fit, called the adjusted $R^2$ and often symbolized as $\bar{R}^2$, is usually reported by regression programs; it is computed as

\[ \bar{R}^2 = 1 - \frac{SSE/(T-K)}{SST/(T-1)} \]

For the Bay Area Burger data the value of this descriptive measure is $\bar{R}^2 = 0.8617$.
• This measure does not always go up when a variable is added, because of the degrees of freedom term T − K in the numerator. As the number of variables K increases, SSE goes down, but so does T − K. The effect on $\bar{R}^2$ depends on the amount by which SSE falls.
• While solving one problem, this corrected measure of goodness of fit unfortunately introduces another one. It loses its interpretation; $\bar{R}^2$ is no longer the percent of variation explained.
• We should concentrate on the unadjusted $R^2$ and think of it as a descriptive device for telling us about the "fit" of the model; it tells us the proportion of variation in the dependent variable explained by the explanatory variables, and the predictive ability of the model over the sample period. Return to Table 7.2. Note where the values $R^2 = 0.867$ and $\bar{R}^2 = 0.862$ appear in the EViews output.
• One final note is in order. The intercept parameter β1 is the y-intercept of the regression "plane," as shown in Figure 7.1. If, for theoretical reasons, you are certain that the regression plane passes through the origin, then β1 = 0 and it can be omitted from the model.
• If the model does not contain an intercept parameter, then the measure $R^2$ given in Equation (7.6.1) is no longer appropriate. The reason is that, without an intercept term in the model,

\[ \sum (y_t - \bar{y})^2 \neq \sum (\hat{y}_t - \bar{y})^2 + \sum \hat{e}_t^2 \]

or

\[ SST \neq SSR + SSE \]

Under these circumstances it does not make sense to talk of the proportion of total variation that is explained by the regression. Thus, when your model does not contain a constant, it is better not to report an $R^2$, even if your computer output displays one.

7.7 Appendix

7.7.1 Explanation of Simple and Partial Correlation Coefficients

In Section 6.1.2 we introduced the coefficient of correlation as a measure of the degree of linear association between two variables. For the three-variable regression model we can compute three correlation coefficients: r12 (correlation between Y and X2), r13 (correlation between Y and X3), and r23 (correlation between X2 and X3). Notice that we are letting the subscript 1 represent Y for notational convenience. These correlation coefficients are called gross or simple correlation coefficients, or correlation coefficients of zero order. They can be computed from the definition of the correlation coefficient given in Equation (6.1.11).

But now consider this question: Does, say, r12 in fact measure the "true" degree of (linear) association between Y and X2 when a third variable X3 may be associated with both of them?
This question is analogous to the following question: Suppose the true regression model is Equation (7.2.1) but we omit the variable X3 from the model and simply regress Y on X2, obtaining a slope coefficient of, say, b12. Will this coefficient be equal to the true coefficient β2 obtained if the model (7.2.1) were estimated to begin with? In general, r12 is not likely to reflect the true degree of association between Y and X2 in the presence of X3. As a matter of fact, it is likely to give a false impression of the nature of association between Y and X2, as will be shown shortly. Therefore, what we need is a correlation coefficient that is independent of the influence, if any, of X3 on X2 and Y. Such a correlation coefficient can be obtained and is known appropriately as the partial correlation coefficient. Conceptually, it is similar to the partial regression coefficient. We define

r12.3 = partial correlation coefficient between Y and X2, holding X3 constant
r13.2 = partial correlation coefficient between Y and X3, holding X2 constant
r23.1 = partial correlation coefficient between X2 and X3, holding Y constant

These partial correlations can be easily obtained from the simple, or zero-order, correlation coefficients as follows:

\[ r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} \qquad (A7.1) \]

\[ r_{13.2} = \frac{r_{13} - r_{12} r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{23}^2)}} \qquad (A7.2) \]

\[ r_{23.1} = \frac{r_{23} - r_{12} r_{13}}{\sqrt{(1 - r_{12}^2)(1 - r_{13}^2)}} \qquad (A7.3) \]

The partial correlations given in Equations (A7.1) to (A7.3) are called first-order correlation coefficients. By order we mean the number of secondary subscripts. Thus r12.34 would be the correlation coefficient of order two, r12.345 would be the correlation coefficient of order three, and so on. As noted previously, r12, r13, and so on are called simple or zero-order correlations. The interpretation of, say, r12.34 is that it gives the coefficient of correlation between Y and X2, holding X3 and X4 constant.

7.7.2 Interpretation of Simple and Partial Correlation Coefficients

In the two-variable case, the simple r had a straightforward meaning: it measured the degree of (linear) association (and not causation) between the dependent variable Y and the single explanatory variable X. But once we go beyond the two-variable case, we need to pay careful attention to the interpretation of the simple correlation coefficient. From Equation (A7.1), for example, we observe the following:

Even if r12 = 0, r12.3 will not be zero unless r13 or r23 or both are zero.

If r12 = 0 and r13 and r23 are nonzero and of the same sign, r12.3 will be negative, whereas if they are of opposite signs, it will be positive. An example will make this point clear. Let Y = crop yield, X2 = rainfall, and X3 = temperature. Assume r12 = 0, that is, no association between crop yield and rainfall. Assume further that r13 is positive and r23 is negative. Then, as Equation (A7.1) shows, r12.3 will be positive; that is, holding temperature constant, there is a positive association between yield and rainfall. This seemingly paradoxical result, however, is not surprising. Since temperature X3 affects both yield Y and rainfall X2, in order to find the net relationship between crop yield and rainfall, we need to remove the influence of the "nuisance" variable temperature. This example shows how one might be misled by the simple coefficient of correlation.

The terms r12.3 and r12 (and similar comparisons) need not have the same sign.

In the two-variable case we have seen that $r^2$ lies between 0 and 1. The same property holds true of the squared partial correlation coefficients. Using this fact, the reader should verify that one can obtain the following expression from Equation (A7.1):

\[ 0 \le r_{12}^2 + r_{13}^2 + r_{23}^2 - 2 r_{12} r_{13} r_{23} \le 1 \qquad (A7.4) \]

which gives the interrelationships among the three zero-order correlation coefficients.

Suppose that r13 = r23 = 0. Does this mean that r12 is also zero? The answer is obvious from Equation (A7.1). The fact that Y and X3 as well as X2 and X3 are uncorrelated does not mean that Y and X2 are uncorrelated.

In passing, note that the expression $r_{12.3}^2$ may be called the coefficient of partial determination and may be interpreted as the proportion of the variation in Y not explained by the variable X3 that has been explained by the inclusion of X2 into the model. Conceptually it is similar to $R^2$.

Before moving on, note the following relationships between $R^2$, simple correlation coefficients, and partial correlation coefficients:

\[ R^2 = \frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2} \qquad (A7.5) \]

\[ R^2 = r_{12}^2 + (1 - r_{12}^2)\, r_{13.2}^2 \qquad (A7.6) \]

\[ R^2 = r_{13}^2 + (1 - r_{13}^2)\, r_{12.3}^2 \qquad (A7.7) \]

In concluding this section, consider the following: It was stated previously that $R^2$ will not decrease if an additional explanatory variable is introduced into the model, which can be seen clearly from Equation (A7.6). This equation states that the proportion of the variation in Y explained by X2 and X3 jointly is the sum of two parts: the part explained by X2 alone ($= r_{12}^2$) and the part not explained by X2 ($= 1 - r_{12}^2$) times the proportion that is explained by X3 after holding the influence of X2 constant. Now $R^2 > r_{12}^2$ so long as $r_{13.2}^2 > 0$. At worst, $r_{13.2}^2$ will be zero, in which case $R^2 = r_{12}^2$.

Exercises: 7.1, 7.2, 7.9, 7.10, 7.3, 7.5, 7.8
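The goodness-of-fit formulas of Section 7.6 and the correlation formulas of the appendix can be checked numerically. The following is a minimal Python sketch, not part of the original slides. The function names are hypothetical. The sums of squares come from Table 7.4, with T = 52 inferred from the total degrees of freedom (51) and K = 3 from the three parameters β1, β2, β3. The correlation values r12 = 0, r13 = 0.5, r23 = −0.4 are illustrative assumptions chosen to match the sign pattern of the crop-yield example, not data from the text.

```python
import math


def r_squared(sse, sst):
    """Unadjusted R^2 = 1 - SSE/SST, as in Equation (7.6.1)."""
    return 1.0 - sse / sst


def adjusted_r_squared(sse, sst, t, k):
    """Adjusted R^2 = 1 - [SSE/(T-K)] / [SST/(T-1)]."""
    return 1.0 - (sse / (t - k)) / (sst / (t - 1))


def partial_corr(r_ab, r_ac, r_bc):
    """First-order partial correlation r_ab.c from zero-order correlations.

    Follows the common pattern of Equations (A7.1)-(A7.3).
    """
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))


def r2_from_correlations(r12, r13, r23):
    """R^2 of Y on X2 and X3 from simple correlations, Equation (A7.5)."""
    return (r12**2 + r13**2 - 2 * r12 * r13 * r23) / (1 - r23**2)


# Section 7.6: values from Table 7.4 (T and K inferred as described above).
SSE, SST, T, K = 1805.168, 13581.35, 52, 3
r2 = r_squared(SSE, SST)
r2_adj = adjusted_r_squared(SSE, SST, T, K)
print(round(r2, 3), round(r2_adj, 4))  # 0.867 0.8617

# Appendix: illustrative (assumed) zero-order correlations.
r12, r13, r23 = 0.0, 0.5, -0.4
r12_3 = partial_corr(r12, r13, r23)  # Y and X2, holding X3 constant (A7.1)
r13_2 = partial_corr(r13, r12, r23)  # Y and X3, holding X2 constant (A7.2)

# r12 = 0 but r13 and r23 have opposite signs, so r12.3 is positive.
print(r12_3 > 0)

# Equations (A7.5), (A7.6) and (A7.7) should give the same R^2.
r2_a5 = r2_from_correlations(r12, r13, r23)
r2_a6 = r12**2 + (1 - r12**2) * r13_2**2
r2_a7 = r13**2 + (1 - r13**2) * r12_3**2
print(math.isclose(r2_a5, r2_a6), math.isclose(r2_a5, r2_a7))
```

With these assumed correlations, the three expressions for R² agree, and since r12 = 0 the whole of R² comes from the contribution of X3 after holding X2 constant, which is the decomposition described by Equation (A7.6).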