KTEE 310 FINANCIAL ECONOMETRICS
THE SIMPLE REGRESSION MODEL
Chap – S & W
Dr TU Thuy Anh
Faculty of International Economics

OUTPUT AND LABOR USE

[Figure: output and labor use across the 24 sample observations]
[Figure: scatter diagram of output vs. labor use]

The scatter diagram shows output q plotted against labor use l for a sample of 24 observations.

Economic theory (the theory of the firm) predicts that an increase in labor use leads to an increase in output in the short run, as long as MPL > 0, other things being equal. The data are consistent with this common sense, and the relationship looks linear.

Setting up: we want to know the impact of labor use on output, so Y is output and X is labor use.

SIMPLE LINEAR REGRESSION MODEL

Suppose that a variable Y is a linear function of another variable X, with unknown parameters β1 and β2 that we wish to estimate:

Y = β1 + β2X

Suppose that we have a sample of observations with X values X1, X2, X3, X4 as shown. If the relationship were an exact one, the observations would lie on the straight line (points Q1, Q2, Q3, Q4) and we would have no trouble obtaining accurate estimates of β1 and β2.

[Figure: the line Y = β1 + β2X, the exact points Q1-Q4 on the line, and the actual observations P1-P4 off the line]

In practice, most economic relationships are not exact, and the actual values of Y (points P1, P2, P3, P4) are different from those corresponding to the straight line. To allow for such divergences, we write the model as

Y = β1 + β2X + u,

where u is a disturbance term. Each value of Y thus has a nonrandom component, β1 + β2X, and a random component, u. The first observation has been decomposed into these two components in the figure.

UNBIASEDNESS OF THE REGRESSION COEFFICIENTS

Simple regression model: Y = β1 + β2X + u

b2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² = β2 + Σ ai ui,  where ai = (Xi − X̄) / Σ(Xj − X̄)²

Hence E(b2) = β2 + E(Σ ai ui) = β2 + Σ E(ai ui). Now for each i, E(ai ui) = ai E(ui) = 0, so E(b2) = β2: b2 is an unbiased estimator of β2.

PRECISION OF THE REGRESSION COEFFICIENTS

Simple regression model: Y = β1 + β2X + u

Efficiency: [Figure: probability density functions of b2 for OLS and for another unbiased estimator]

The Gauss–Markov theorem states that, provided that the regression model assumptions are valid, the OLS estimators are BLUE: Best (minimum variance in the class of all linear unbiased estimators), Linear, Unbiased Estimators.

In this sequence we will see that we can also obtain estimates of the standard deviations of the distributions of b1 and b2. These will give some idea of their likely reliability and will provide a basis for tests of hypotheses.

Expressions (which will not be derived) for the variances of their distributions are:

σ²(b1) = σu² ΣXi² / (n Σ(Xi − X̄)²)
σ²(b2) = σu² / Σ(Xi − X̄)² = σu² / (n · MSD(X))

(MSD(X), the mean square deviation of X, is defined below.)

We will focus on the implications of the expression for the variance of b2. Looking at the numerator, we see that the variance of b2 is proportional to σu². This is as we would expect: the more noise there is in the model, the less precise our estimates will be.

[Figure: two simulated samples with the same nonstochastic component, Y = 3.0 + 0.8X (dotted line), but different noise levels; fitted regression lines shown solid]

This is illustrated by the diagrams above. The nonstochastic component of the relationship, Y = 3.0 + 0.8X, represented by the dotted line, is the same in both diagrams. However, in the right-hand diagram the random numbers have been multiplied by a larger factor, and as a consequence the regression line, the solid line, is a much poorer approximation to the nonstochastic relationship.
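These two results can be checked by simulation. The sketch below is not from the slides: it borrows the illustrative line Y = 3.0 + 0.8X from the diagrams, while the disturbance standard deviation, sample size, and X values are assumptions chosen for the illustration. It compares the simulated mean and variance of b2 with β2 and σu² / (n · MSD(X)).

```python
import numpy as np

# Assumed setup (only the line Y = 3.0 + 0.8X comes from the slides' diagrams)
rng = np.random.default_rng(0)
beta1, beta2 = 3.0, 0.8
sigma_u = 2.0                 # assumed s.d. of the disturbance term
n, reps = 20, 50_000
X = np.linspace(5, 20, n)     # fixed, nonstochastic regressor values
msd_X = np.mean((X - X.mean()) ** 2)

b2_draws = np.empty(reps)
for r in range(reps):
    u = rng.normal(0.0, sigma_u, n)       # disturbance term
    Y = beta1 + beta2 * X + u
    # OLS slope: b2 = sum (Xi - Xbar)(Yi - Ybar) / sum (Xi - Xbar)^2
    b2_draws[r] = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()

print("mean of b2 :", b2_draws.mean())              # close to beta2 (unbiasedness)
print("var of b2  :", b2_draws.var())               # close to the theoretical value
print("theoretical:", sigma_u ** 2 / (n * msd_X))   # sigma_u^2 / (n * MSD(X))
```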
Looking at the denominator of the variance of b2, the larger the sum of the squared deviations of X, the smaller the variance of b2. The size of this sum depends on two factors: the number of observations, and the size of the deviations of Xi around its sample mean. To discriminate between them, it is convenient to define the mean square deviation of X:

MSD(X) = (1/n) · Σ(Xi − X̄)²

Writing the variance as σ²(b2) = σu² / (n · MSD(X)) makes a second implication explicit: the variance is inversely proportional to the number of observations. A third implication of the expression is that the variance is inversely proportional to the mean square deviation of X.

[Figure: two simulated samples with the same nonstochastic component, Y = 3.0 + 0.8X, and the same disturbance values, but with the X values much more dispersed in the left-hand diagram]

In the diagrams above, the nonstochastic component of the relationship is the same and the same random numbers have been used for the 20 values of the disturbance term. However, MSD(X) is much smaller in the right-hand diagram because the values of X are much closer together. Hence in that diagram the position of the regression line is more sensitive to the values of the disturbance term, and as a consequence the regression line is likely to be relatively inaccurate.

We cannot calculate the variances exactly because we do not know the variance of the disturbance term. However, we can derive an estimator of σu² from the residuals. Clearly the scatter of the residuals around the regression line will reflect the unseen scatter of u about the line Yi = β1 + β2Xi, although in general the residual and the value of the disturbance term in any given observation are not equal to one another. One measure of the scatter of the residuals is their mean square deviation, MSD(e):

MSD(e) = (1/n) · Σ(ei − ē)² = (1/n) · Σ ei²   (the residuals have mean zero, so ē = 0)
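To make the estimation step concrete, here is a minimal sketch with hypothetical numbers (not the slides' data). One caveat: MSD(e) divides by n, but the unbiased estimator of σu² used by regression packages divides by n − 2 in simple regression; in the gretl output below, the S.E. of regression is indeed sqrt(4281.287 / 22) ≈ 13.95.

```python
import numpy as np

# Hypothetical sample, for illustration only
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])
n = len(X)

b2 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - (b1 + b2 * X)                      # residuals

msd_e = (e ** 2).mean()                    # MSD(e) = (1/n) sum e_i^2
s2_u = (e ** 2).sum() / (n - 2)            # unbiased estimator of sigma_u^2
se_b2 = np.sqrt(s2_u / ((X - X.mean()) ** 2).sum())
se_b1 = np.sqrt(s2_u * (X ** 2).sum() / (n * ((X - X.mean()) ** 2).sum()))
print(f"b1 = {b1:.4f} (s.e. {se_b1:.4f}), b2 = {b2:.4f} (s.e. {se_b2:.4f})")
```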
PRECISION OF THE REGRESSION COEFFICIENTS

Model 1: OLS, using observations 1899-1922 (T = 24)
Dependent variable: q

             coefficient   std. error   t-ratio   p-value
  const      -38.7267      14.5994      -2.653    0.0145    **
  l            1.40367      0.0982155   14.29     1.29e-12  ***

Mean dependent var   165.9167   S.D. dependent var   43.75318
Sum squared resid    4281.287   S.E. of regression   13.95005
R-squared            0.902764   Adjusted R-squared   0.898344
F(1, 22)             204.2536   P-value(F)           1.29e-12
Log-likelihood      -96.26199   Akaike criterion     196.5240
Schwarz criterion    198.8801   Hannan-Quinn         197.1490
rho                  0.836471   Durbin-Watson        0.763565

The standard errors of the coefficients always appear as part of the output of a regression; they are reported in the column to the right of the coefficient estimates.

SUMMING UP

The simple linear regression model:
- Verify the dependent variable, the independent variable, the parameters, and the error term.
- Interpret the estimated parameters b1 and b2, as they describe the relationship between X and Y.
- OLS provides BLUE estimators of the parameters under the Gauss–Markov assumptions.

What next: estimation of the multiple regression model.

[...]

SIMPLE REGRESSION ANALYSIS

True model: Y = β1 + β2X + u.  Fitted line: Ŷ = b1 + b2X.

With three observations (X, Y) = (1, 3), (2, 5), (3, 6), the fitted values are Ŷ1 = b1 + b2, Ŷ2 = b1 + 2b2, Ŷ3 = b1 + 3b2, and the residuals are

e1 = Y1 − Ŷ1 = 3 − b1 − b2
e2 = Y2 − Ŷ2 = 5 − b1 − 2b2
e3 = Y3 − Ŷ3 = 6 − b1 − 3b2

RSS = e1² + e2² + e3² = (3 − b1 − b2)² + (5 − b1 − 2b2)² + (6 − b1 − 3b2)²

∂RSS/∂b1 = 0  ⇒  6b1 + 12b2 − 28 = 0
∂RSS/∂b2 = 0  ⇒  12b1 + 28b2 − 62 = 0

The first-order conditions give us two equations in two unknowns. Solving them, we find that RSS is minimized when b1 = 1.67 and b2 = 1.50. We obtain general expressions for b1 and b2 using the same first-order conditions.

PRACTICE – CALCULATE b1 AND b2

Year   Output (Y)   Labor (X)
1899   100          100
1900   101          105
1901   112          110
1902   122          118
1903   124          123
1904   122          116
1905   143          125
1906   152          133
1907   151          138
1908   126          121
1909   155          140
1910   159          144
1911   153          145
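A minimal sketch for this practice exercise, applying the general formulas from the first-order conditions. Note that the table covers 1899-1911 only, while the gretl output above uses the full 1899-1922 sample, so the estimates will differ somewhat.

```python
import numpy as np

# Data from the practice table (1899-1911)
Y = np.array([100, 101, 112, 122, 124, 122, 143, 152, 151, 126, 155, 159, 153], dtype=float)
X = np.array([100, 105, 110, 118, 123, 116, 125, 133, 138, 121, 140, 144, 145], dtype=float)

# b2 = sum (Xi - Xbar)(Yi - Ybar) / sum (Xi - Xbar)^2,  b1 = Ybar - b2 * Xbar
b2 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}")
```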
INTERPRETATION OF A REGRESSION EQUATION

Model 1: OLS, using observations 1899-1922 (T = 24); dependent variable: q. The regression output is as shown above.

[...]

DERIVING LINEAR REGRESSION COEFFICIENTS

True model: Y = β1 + β2X + u.  Fitted line: Ŷ = b1 + b2X.

For each observation, Yi = b1 + b2Xi + ei. We choose b1 and b2 so as to minimize RSS, the sum of the squares of the residuals:

min RSS = min Σ ei² = min Σ (Yi − b1 − b2Xi)²

Given our choice of b1 and b2, we obtain a fitted line with fitted values Ŷi = b1 + b2Xi and residuals ei = Yi − Ŷi. The first-order conditions yield

b2 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²,   b1 = Ȳ − b2X̄

SIMPLE LINEAR REGRESSION MODEL

In practice we can see only the P points. Obviously, we can use the P points to draw a line which is an approximation to the line Y = β1 + β2X. If we write this line Ŷ = b1 + b2X, then b1 is an estimate of β1 and b2 is an estimate of β2.

GOODNESS OF FIT

R² measures the proportion of the total variance of Y explained by the regression equation. The total sum of squares decomposes as TSS = ESS + RSS:

Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σ ei²

R² = ESS / TSS = Σ(Ŷi − Ȳ)² / Σ(Yi − Ȳ)² = 1 − RSS / TSS = 1 − Σ ei² / Σ(Yi − Ȳ)²

The OLS regression coefficients are chosen in such a way as to minimize the sum of the squares of the residuals; equivalently, they maximize R².
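A short sketch verifying the decomposition numerically, reusing the hypothetical numbers from the standard-error sketch earlier: for any OLS fit with an intercept, TSS − (ESS + RSS) is zero up to floating-point rounding, and the two expressions for R² agree.

```python
import numpy as np

# Same hypothetical sample as in the standard-error sketch
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

b2 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X
e = Y - Y_hat

TSS = ((Y - Y.mean()) ** 2).sum()        # total sum of squares
ESS = ((Y_hat - Y.mean()) ** 2).sum()    # explained sum of squares
RSS = (e ** 2).sum()                     # residual sum of squares

print("TSS - (ESS + RSS) =", TSS - (ESS + RSS))   # ~ 0
print("R^2 =", ESS / TSS, "=", 1 - RSS / TSS)     # both expressions agree
```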