OLS, IV, IV–GMM and DPD Estimation in Stata

Christopher F Baum
Boston College and DIW Berlin
Durham University, 2011

Linear regression methodology

Linear regression

A key tool in multivariate statistical inference is linear regression, in which we specify the conditional mean of a response variable $y$ as a linear function of $k$ independent variables:

\[ E[y \mid x_1, x_2, \ldots, x_k] = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \tag{1} \]

The conditional mean of $y$ is a function of $x_1, x_2, \ldots, x_k$ with fixed parameters $\beta_1, \beta_2, \ldots, \beta_k$. Given values for these $\beta$s, the linear regression model predicts the average value of $y$ in the population for different values of $x_1, x_2, \ldots, x_k$.

This population regression function specifies that a set of $k$ regressors in $X$ and the stochastic disturbance $u$ are the determinants of the response variable (or regressand) $y$. The model is usually assumed to contain a constant term, so that $x_1$ is understood to equal one for each observation. We may write the linear regression model in matrix form as

\[ y = X\beta + u \tag{2} \]

where $X = \{x_1, x_2, \ldots, x_k\}$, an $N \times k$ matrix of sample values.

The key assumption in the linear regression model involves the relationship in the population between the regressors $X$ and $u$. We may rewrite Equation (2) as

\[ u = y - X\beta \tag{3} \]

We assume that

\[ E(u \mid X) = 0 \tag{4} \]

i.e., that the $u$ process has a zero conditional mean. This assumption states that the unobserved factors involved in the regression function are not related in any systematic manner to the observed factors. This approach to the regression model allows us to consider both non-stochastic and stochastic regressors in $X$ without distinction; all that matters is that they satisfy the assumption of Equation (4).
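One step is worth making explicit here: the zero conditional mean assumption of Equation (4) implies that every regressor is uncorrelated with the disturbance. The lines below are a short sketch of that argument via the law of iterated expectations. The symbol $x_j$, denoting a generic regressor with $j = 1, \ldots, k$, is notation introduced for this sketch and does not appear in the slides.

```latex
\begin{align*}
E(x_j u) &= E\big[\, E(x_j u \mid X) \,\big]
          && \text{(law of iterated expectations)} \\
         &= E\big[\, x_j \, E(u \mid X) \,\big]
          && \text{($x_j$ is known once we condition on $X$)} \\
         &= E[\, x_j \cdot 0 \,] = 0
          && \text{(Equation (4))}
\end{align*}
```

The same argument gives $E(u) = 0$, so $\mathrm{Cov}(x_j, u) = E(x_j u) - E(x_j)\,E(u) = 0$ for each of the $k$ regressors.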
Regression as a method of moments estimator

We may use the zero conditional mean assumption (Equation (4)) to define a method of moments estimator of the regression function. Method of moments estimators are defined by moment conditions that are assumed to hold on the population moments. When we replace the unobservable population moments by their sample counterparts, we derive feasible estimators of the model’s parameters.

The zero conditional mean assumption gives rise to a set of $k$ moment conditions, one for each $x$. In the population, each regressor $x$ is assumed to be unrelated to $u$, or to have zero covariance with $u$. We may then substitute calculated moments from our sample of data into the expression to derive a method of moments estimator for $\beta$:

\[ \begin{aligned} X'u &= 0 \\ X'(y - X\beta) &= 0 \end{aligned} \tag{5} \]

Substituting calculated moments from our sample into the expression and replacing the unknown coefficients $\beta$ with estimated values $b$ in Equation (5) yields the ordinary least squares (OLS) estimator

\[ \begin{aligned} X'y - X'Xb &= 0 \\ b &= (X'X)^{-1} X'y \end{aligned} \tag{6} \]

We may use $b$ to calculate the regression residuals:

\[ e = y - Xb \tag{7} \]

Given the solution for the vector $b$, the additional parameter of the regression problem, $\sigma_u^2$, the population variance of the stochastic disturbance, may be estimated as a function of the regression residuals $e_i$:

\[ s^2 = \frac{\sum_{i=1}^{N} e_i^2}{N - k} = \frac{e'e}{N - k} \tag{8} \]

where $(N - k)$ are the residual degrees of freedom of the regression problem. The positive square root of $s^2$ is often termed the standard error of regression, or standard error of estimate, or root mean square error. Stata uses the last terminology and displays $s$ as Root MSE.

A macroeconomic example

As an illustration, we present regression estimates from a simple macroeconomic model, constructed with US quarterly data from the latest edition of International Financial Statistics. The model, of the log of real investment expenditures, should not be taken seriously. Its purpose is only to illustrate the workings of regression in Stata. In the initial form of the model, we include as regressors the log of real GDP, the log of real wages, the 10-year Treasury yield and the S&P Industrials stock index.

We present the descriptive statistics with summarize, then proceed to fit a regression equation.

    . use usmacro1, clear

    . tsset
            time variable:  yq, 1959q1 to 2010q3
                    delta:  1 quarter

    . summarize lrgrossinv lrgdp lrwage tr10yr S_Pindex, sep(0)

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
      lrgrossinv |       207    7.146933    .4508421    6.31017   7.874346
           lrgdp |       207    8.794305    .4707929   7.904815    9.50028
          lrwage |       207    4.476886    .1054649    4.21887   4.619725
          tr10yr |       207    6.680628     2.58984    2.73667    14.8467
        S_Pindex |       207    37.81332    40.04274    4.25073    130.258

The regress command, like other Stata estimation commands, requires us to specify the response variable followed by a varlist of the explanatory variables.

    . regress lrgrossinv lrgdp lrwage tr10yr S_Pindex

          Source |       SS       df       MS              Number of obs =     207
    -------------+------------------------------           F(  4,   202) = 3989.87
           Model |  41.3479199     4    10.33698           Prob > F      =  0.0000
        Residual |  .523342927   202  .002590807           R-squared     =  0.9875
    -------------+------------------------------           Adj R-squared =  0.9873
           Total |  41.8712628   206  .203258557           Root MSE      =   .0509

    ------------------------------------------------------------------------------
      lrgrossinv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           lrgdp |   .6540464   .0414524    15.78   0.000     .5723115    .7357813
          lrwage |   .7017158   .1562383     4.49   0.000     .3936485    1.009783
          tr10yr |   .0131358   .0022588     5.82   0.000      .008682    .0175896
        S_Pindex |   .0020351   .0002491     8.17   0.000      .001544    .0025261
           _cons |  -1.911161    .399555    -4.78   0.000    -2.698994   -1.123327
    ------------------------------------------------------------------------------

The header of the regression output describes the overall model fit, while the table presents the point estimates, their precision, and interval estimates.

Dynamic panel data estimators

An excellent alternative to Stata’s built-in commands is David Roodman’s xtabond2, available from SSC (findit xtabond2). It is very well documented in his paper, included in your materials. The xtabond2 routine handles both the difference and system GMM estimators and provides several additional features, such as the orthogonal deviations transformation, not available in official Stata’s commands.

As the DPD estimators are instrumental variables methods, it is particularly important to evaluate the Sargan–Hansen test results when they are applied. Roodman’s xtabond2 provides C tests (as discussed in re ivreg2) for groups of instruments. In his routine, instruments can be either “GMM-style” or “IV-style”. The former are constructed per the Arellano–Bond logic, making use of multiple lags; the latter are included as is in the instrument matrix. For the system GMM estimator (the default in xtabond2), instruments may be specified as applying to the differenced equations, the level equations or both.

Another important diagnostic in DPD estimation is the AR test for autocorrelation of the residuals. By construction, the residuals of the differenced equation should possess serial correlation, but if the assumption of serial independence in the original errors is warranted, the differenced residuals should not exhibit significant AR(2) behavior. These statistics are produced in the xtabond and xtabond2 output. If
a significant AR(2) statistic is encountered, the second lags of endogenous variables will not be appropriate instruments for their current values.

A useful feature of xtabond2 is the ability to specify, for GMM-style instruments, the limits on how many lags are to be included. If T is fairly large (more than 7–8), an unrestricted set of lags will introduce a huge number of instruments, with a possible loss of efficiency. By using the lag limits options, you may specify, for instance, that only lags 2–5 are to be used in constructing the GMM instruments.

We illustrate DPD estimation using a cross-country panel from the Penn World Tables. We specify a model for kc, the consumption share of per capita GDP, depending on its own lag, cgnp (the lagged ratio of GNP to GDP) and a set of time fixed effects, which we compute with the xi command, as xtabond2 does not support factor variables. We first estimate the two-step ‘difference GMM’ form of the model with (cluster-)robust VCE, using data for 1991–2007. We could use testparm _I* after estimation to evaluate the joint significance of time effects (listing of which has been suppressed).

    . xi i.year
    i.year            _Iyear_1991-2007    (naturally coded; _Iyear_1991 omitted)

    . xtabond2 kc L.kc cgnp _I*, gmm(L.kc openc cgnp, lag(2 9)) iv(_I*) ///
    >     twostep robust noleveleq nodiffsargan
    Favoring speed over space. To switch, type or click on mata: mata set matafavor space, perm.

    Dynamic panel-data estimation, two-step difference GMM
    ------------------------------------------------------------------------------
    Group variable: iso                             Number of obs      =      1485
    Time variable : year                            Number of groups   =        99
    Number of instruments = 283                     Obs per group: min =        15
    Wald chi2(17) =     94.96                                      avg =     15.00
    Prob > chi2   =     0.000                                      max =        15
    ------------------------------------------------------------------------------
                 |              Corrected
              kc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              kc |
             L1. |   .6478636   .1041122     6.22   0.000     .4438075    .8519197
                 |
            cgnp |    .233404   .1080771     2.16   0.031     .0215768    .4452312
    ------------------------------------------------------------------------------

(continued)

    Instruments for first differences equation
      Standard
        D.(_Iyear_1992 _Iyear_1993 _Iyear_1994 _Iyear_1995 _Iyear_1996
        _Iyear_1997 _Iyear_1998 _Iyear_1999 _Iyear_2000 _Iyear_2001
        _Iyear_2002 _Iyear_2003 _Iyear_2004 _Iyear_2005 _Iyear_2006
        _Iyear_2007)
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(2/9).(L.kc openc cgnp)
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =  -2.94  Pr > z =  0.003
    Arellano-Bond test for AR(2) in first differences: z =   0.23  Pr > z =  0.815
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(266)  = 465.53  Prob > chi2 =  0.000
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(266)  =  87.81  Prob > chi2 =  1.000
      (Robust, but can be weakened by many instruments.)

Given the relatively large number of time periods available, I have specified that the GMM instruments only be constructed for lags 2–9 to keep the number of instruments manageable. I am treating openc, a measure of openness, as a GMM-style instrument. The autoregressive coefficient is 0.648, and the cgnp coefficient is positive and significant. Although not shown, the test for joint significance of the time effects has p-value 0.0270.

We could also fit this model with the ‘system GMM’ estimator, which will be able to utilize one more observation per country in the level equation, and estimate a constant term in the relationship. I am treating lagged openc as an IV-style instrument in this specification.

    . xtabond2 kc L.kc cgnp _I*, gmm(L.kc cgnp, lag(2 8)) iv(_I* L.openc) ///
    >     twostep robust nodiffsargan

    Dynamic panel-data estimation, two-step system GMM
    ------------------------------------------------------------------------------
    Group variable: iso                             Number of obs      =      1584
    Time variable : year                            Number of groups   =        99
    Number of instruments = 207                     Obs per group: min =        16
    Wald chi2(17) =   8193.54                                      avg =     16.00
    Prob > chi2   =     0.000                                      max =        16
    ------------------------------------------------------------------------------
                 |              Corrected
              kc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              kc |
             L1. |   .9452696   .0191167    49.45   0.000     .9078014    .9827377
                 |
            cgnp |    .097109   .0436338     2.23   0.026     .0115882    .1826297
           _cons |  -6.091674    3.45096    -1.77   0.078    -12.85543     .672083
    ------------------------------------------------------------------------------

(continued)

    Instruments for first differences equation
      Standard
        D.(_Iyear_1992 _Iyear_1993 _Iyear_1994 _Iyear_1995 _Iyear_1996
        _Iyear_1997 _Iyear_1998 _Iyear_1999 _Iyear_2000 _Iyear_2001
        _Iyear_2002 _Iyear_2003 _Iyear_2004 _Iyear_2005 _Iyear_2006
        _Iyear_2007 L.openc)
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(2/8).(L.kc cgnp)
    Instruments for levels equation
      Standard
        _cons
        _Iyear_1992 _Iyear_1993 _Iyear_1994 _Iyear_1995 _Iyear_1996
        _Iyear_1997 _Iyear_1998 _Iyear_1999 _Iyear_2000 _Iyear_2001
        _Iyear_2002 _Iyear_2003 _Iyear_2004 _Iyear_2005 _Iyear_2006
        _Iyear_2007 L.openc
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        DL.(L.kc cgnp)
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =  -3.29  Pr > z =  0.001
    Arellano-Bond test for AR(2) in first differences: z =   0.42  Pr > z =  0.677
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(189)  = 353.99  Prob > chi2 =  0.000
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(189)  =  88.59  Prob > chi2 =  1.000
      (Robust, but can be weakened by many instruments.)

Note that the autoregressive coefficient is much larger: 0.945 in this context. The cgnp coefficient is again positive and significant, but has a much smaller magnitude when the system GMM estimator is used.

We can also estimate the model using the forward orthogonal deviations (FOD) transformation of Arellano and Bover, as described in Roodman’s paper. The first-difference transformation applied in DPD estimators has the unfortunate feature of magnifying any gaps in the data, as one period of missing data is replaced with two missing differences. FOD transforms each observation by subtracting the average of all future observations, which will be defined (regardless of gaps) for all but the last observation in each panel. To illustrate:

    . xtabond2 kc L.kc cgnp _I*, gmm(L.kc cgnp, lag(2 8)) iv(_I* L.openc) ///
    >     twostep robust nodiffsargan orthog

    Dynamic panel-data estimation, two-step system GMM
    ------------------------------------------------------------------------------
    Group variable: iso                             Number of obs      =      1584
    Time variable : year                            Number of groups   =        99
    Number of instruments = 207                     Obs per group: min =        16
    Wald chi2(17) =   8904.24                                      avg =     16.00
    Prob > chi2   =     0.000                                      max =        16
    ------------------------------------------------------------------------------
                 |              Corrected
              kc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              kc |
             L1. |   .9550247   .0142928    66.82   0.000     .9270114     .983038
                 |
            cgnp |   .0723786   .0339312     2.13   0.033     .0058746    .1388825
           _cons |  -4.329945   2.947738    -1.47   0.142    -10.10741    1.447515
    ------------------------------------------------------------------------------

(continued)

    Instruments for orthogonal deviations equation
      Standard
        FOD.(_Iyear_1992 _Iyear_1993 _Iyear_1994 _Iyear_1995 _Iyear_1996
        _Iyear_1997 _Iyear_1998 _Iyear_1999 _Iyear_2000 _Iyear_2001
        _Iyear_2002 _Iyear_2003 _Iyear_2004 _Iyear_2005 _Iyear_2006
        _Iyear_2007 L.openc)
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        L(2/8).(L.kc cgnp)
    Instruments for levels equation
      Standard
        _cons
        _Iyear_1992 _Iyear_1993 _Iyear_1994 _Iyear_1995 _Iyear_1996
        _Iyear_1997 _Iyear_1998 _Iyear_1999 _Iyear_2000 _Iyear_2001
        _Iyear_2002 _Iyear_2003 _Iyear_2004 _Iyear_2005 _Iyear_2006
        _Iyear_2007 L.openc
      GMM-type (missing=0, separate instruments for each period unless collapsed)
        DL.(L.kc cgnp)
    ------------------------------------------------------------------------------
    Arellano-Bond test for AR(1) in first differences: z =  -3.31  Pr > z =  0.001
    Arellano-Bond test for AR(2) in first differences: z =   0.42  Pr > z =  0.674
    ------------------------------------------------------------------------------
    Sargan test of overid. restrictions: chi2(189)  = 384.95  Prob > chi2 =  0.000
      (Not robust, but not weakened by many instruments.)
    Hansen test of overid. restrictions: chi2(189)  =  83.69  Prob > chi2 =  1.000
      (Robust, but can be weakened by many instruments.)

Ex ante forecasting

Using the FOD transformation, the autoregressive coefficient is a bit larger, and the cgnp coefficient a bit smaller, although its significance is retained.

After any DPD estimation command, we may save predicted values or residuals and graph them against the actual values:

    . predict double kchat if inlist(country, "Italy", "Spain", "Greece", "Portugal")
    (option xb assumed; fitted values)
    (1619 missing values generated)

    . label var kc "Consumption / Real GDP per capita"

    . xtline kc kchat if !mi(kchat), scheme(s2mono)

[Figure: xtline graph of kc and kchat for the selected countries]

Although the DPD estimators are linear estimators, they are highly
sensitive to the particular specification of the model and its instruments: more so in my experience than any other regression-based estimation approach. There is no substitute for experimentation with the various parameters of the specification to ensure that your results are reasonably robust to variations in the instrument set and lags used.

... testing in regression

Stata contains a number of commands for the construction of hypothesis tests and confidence intervals which may be applied following an estimated regression. Some Stata commands...

illustrate by reestimating the investment model through 2007Q3, the calendar quarter preceding the most recent recession, and producing ex ante point and interval forecasts for the remaining periods. We...

Linear regression methodology

Tests of nonlinear hypotheses

The nlcom command permits us to compute nonlinear combinations of the estimated coefficients in point and interval form,
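
As a minimal sketch of what an nlcom call looks like, the lines below estimate the ratio of two slope coefficients after a regression. This is an illustration only: it assumes Stata's shipped auto dataset rather than the investment model from these slides, and the label ratio is a hypothetical name chosen here.

```stata
* Hypothetical illustration on Stata's shipped auto dataset,
* not the investment model used in these slides.
sysuse auto, clear
regress price mpg weight

* Nonlinear combination of the estimated coefficients: the ratio of
* the mpg and weight slopes, reported with a standard error and a
* confidence interval.
nlcom (ratio: _b[mpg] / _b[weight])
```

The parenthesized name: expression form only labels the result in the output table; several such expressions may be listed in one nlcom call.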