Applied Econometrics Lecture 9: Autocorrelation
Written by Nguyen Hoang Bao, May 31, 2004

"It is never possible to step twice into the same river"

1) Introduction

Autocorrelation (also called serial correlation) is a violation of the assumption that the error terms are uncorrelated, i.e. with autocorrelation E(∈_i ∈_j) ≠ 0 for i ≠ j. That is, the error in period t is not independent of the errors in previous periods. Since we do not know the population line, we do not know the actual errors (∈), but we estimate them by the residuals (e). Hence we look at the residual plot for a regression that (i) has no autocorrelation; (ii) has positive autocorrelation; and (iii) has negative autocorrelation. Positive autocorrelation is the common problem in economics.

2) Consequences of autocorrelation

Ordinary least squares (OLS) estimates in the presence of autocorrelation do not have the desirable statistical properties. With positive autocorrelation the standard errors are too low (underestimated). This inflates the t statistics (overestimated), so we may reject the null when it is in fact valid. Likewise the R2 and the related F statistic are likely to be overestimated.

3) Detecting autocorrelation

There are many ways to check for autocorrelation, such as (1) looking at the residual plot; (2) examining the correlogram; (3) using the runs test; and (4) using the Durbin–Watson statistic. This section presents the runs and Durbin–Watson tests.

3.1) Runs test

Autocorrelation can show up in the residual plot. A non-autocorrelated error should jump around its mean (zero) in a random manner. With positive autocorrelation (the case we are most likely to meet with economic data) the error is more likely to stay above or below the mean for successive observations; with negative autocorrelation it will jump above and below very frequently. We can formalize this approach in the runs test, by counting the number of runs in the data. A run is a succession of residuals of the same sign (even a single observation counts as a run). If there is positive autocorrelation there will be rather fewer runs than we should expect from a series with no autocorrelation; if there is negative autocorrelation there will be more runs.

The table for the runs test gives a confidence interval: if the observed number of runs falls outside this interval we reject the null hypothesis of no autocorrelation. If the actual number of runs is below the lower bound of the interval we reject in favor of positive autocorrelation; if it is above the upper bound we reject in favor of negative autocorrelation. We may sometimes need to calculate the interval ourselves:

    E(R) = 2 N1 N2 / n + 1

    s_R^2 = 2 N1 N2 (2 N1 N2 − n) / [n^2 (n − 1)]

where N1 is the number of positive residuals, N2 is the number of negative residuals, R is the total number of runs, and n is the number of observations (so n = N1 + N2). The confidence interval at the 5 percent level of significance is given by:

    E(R) − 1.96 s_R ≤ R ≤ E(R) + 1.96 s_R

We accept the null hypothesis of no autocorrelation if the observed number of runs falls within this interval.
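As a minimal sketch of the runs test just described (the function name and interface are illustrative, not part of the lecture), the following Python code counts the runs in a residual series and compares the count with the interval E(R) ± 1.96 s_R:

```python
import numpy as np

def runs_test(residuals, z=1.96):
    """Runs test for randomness of a residual series.

    Counts runs of same-signed residuals and compares the observed
    count with the interval E(R) +/- z*s_R derived in the text.
    """
    e = np.asarray(residuals, dtype=float)
    signs = np.sign(e)
    signs = signs[signs != 0]            # classify residuals as + or - only

    n1 = np.sum(signs > 0)               # number of positive residuals
    n2 = np.sum(signs < 0)               # number of negative residuals
    n = n1 + n2

    # a new run starts whenever the sign changes
    runs = 1 + np.sum(signs[1:] != signs[:-1])

    expected = 2.0 * n1 * n2 / n + 1.0
    s_r = np.sqrt(2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n**2 * (n - 1.0)))

    lower, upper = expected - z * s_r, expected + z * s_r
    if runs < lower:
        verdict = "reject H0: too few runs (positive autocorrelation)"
    elif runs > upper:
        verdict = "reject H0: too many runs (negative autocorrelation)"
    else:
        verdict = "do not reject H0 of no autocorrelation"
    return runs, expected, s_r, (lower, upper), verdict
```

Exact zeros are dropped before counting, which matches the convention of classifying each residual only as positive or negative.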
3.2) Durbin–Watson test

A second, and the most common, test is the Durbin–Watson (DW) test. The DW statistic is defined as:

    d = Σ (e_t − e_{t−1})^2 / Σ e_t^2

where the sum in the numerator runs over t = 2, …, n and that in the denominator over t = 1, …, n. Note that d ≈ 2(1 − ρ), with −1 ≤ ρ ≤ +1, so d will be close to 0 with extreme positive autocorrelation, close to 4 with extreme negative autocorrelation, and close to 2 if there is no autocorrelation. The null hypothesis of no autocorrelation therefore corresponds to DW = 2. Using the tabulated bounds dL and dU, the decision rule is:

    0 to dL:            reject H0 in favor of positive autocorrelation
    dL to dU:           zone of indecision
    dU to 4 − dU:       accept H0 of no autocorrelation
    4 − dU to 4 − dL:   zone of indecision
    4 − dL to 4:        reject H0 in favor of negative autocorrelation

Testing for autocorrelation with a lagged dependent variable

If the model contains a lagged dependent variable, d is biased towards 2; this bias may lead us to accept the null when autocorrelation is in fact present. In such cases we must instead use Durbin's h:

    h = (1 − d/2) √[ n / (1 − n·var(b1)) ]

where var(b1) is the square of the standard error of the coefficient on the lagged dependent variable and n is the number of observations. The test may not be used if n·var(b1) is greater than one.

The runs test and the DW test are not equivalent; they may give different answers. Also, the fact that DW may frequently fall in the zone of indecision means that some judgment is required. If DW is in the indecision zone, but fairly close to dU, and the runs test indicates no autocorrelation, then you can assume no autocorrelation.
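Both statistics are simple to compute directly from the residuals. The sketch below (function names are mine, not from the lecture) implements the DW statistic and Durbin's h exactly as defined above:

```python
import numpy as np

def durbin_watson(e):
    """Durbin-Watson d = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def durbins_h(d, n, var_b1):
    """Durbin's h for models with a lagged dependent variable.

    var_b1 is the squared standard error of the coefficient on the
    lagged dependent variable; the test is not usable if n*var_b1 >= 1.
    """
    if n * var_b1 >= 1:
        raise ValueError("h test not applicable: n*var(b1) >= 1")
    return (1.0 - d / 2.0) * np.sqrt(n / (1.0 - n * var_b1))
```

Under the null of no autocorrelation h is approximately standard normal, so |h| > 1.96 rejects the null at the 5 percent level.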
4) Why do we get autocorrelation?

We test for autocorrelation on the residuals, but these are only a good proxy for the true errors if the model is correct. The presence of autocorrelation will very often indicate misspecification, including:

- Incorrect functional form
- Omitted variable(s) [1]
- Structural instability
- Influential points
- Spurious regression

Spurious regression is a very serious problem in time series data. A rule of thumb is that R2 > d indicates a spurious regression (if R2 > d the regression is almost certainly spurious, but if R2 < d the regression may still be spurious).

A note on cross-section data: autocorrelation must be a time series problem, as we can always remove autocorrelation from cross-section data by re-ordering the data. However, if the data are sorted by one of the independent variables then the apparent presence of autocorrelation can still indicate misspecification. Re-ordering is not usually an option in time series data, and certainly not if the equation includes any lags.

5) Remedial measures

The first thing to do is to interpret autocorrelation as a symptom of misspecification and so to carry out various specification tests (i.e. for omitted variables, structural breaks, etc.). This will nearly always cure the problem. If the autocorrelation is genuine, you can remove the autocorrelated errors by one of the procedures described below.

[1] The exclusion of relevant variable(s) will bias the estimates of the coefficients of all variables included in the model (unless they happen to be orthogonal to the excluded variable). The normal t tests cannot tell us if the model is misspecified on account of omitted variables, since they are calculated on the assumption that the estimated model is the correct one.

5.1) The Cochrane–Orcutt procedure

Suppose we have the model:

    Y_t = β1 + β2 X_t + u_t

It is usually assumed that the errors follow the first-order autoregressive scheme u_t = ρ u_{t−1} + ε_t. Cochrane and Orcutt (1949) then recommend the following steps to estimate ρ:

1. Estimate the two-variable model and calculate the residuals e_t.
2. Run the regression e_t = ρ e_{t−1} + v_t.
3. Using the estimated ρ, run the generalized difference equation:
   (Y_t − ρ Y_{t−1}) = β1(1 − ρ) + β2(X_t − ρ X_{t−1}) + (u_t − ρ u_{t−1})
   that is, Y_t* = β1* + β2* X_t* + e_t*
4. Convert the estimates from step 3 back to the original model (b1 = β1*/(1 − ρ), b2 = β2*) and calculate the new residuals e_t** = Y_t − b1 − b2 X_t.
5. Estimate the regression e_t** = ρ e_{t−1}** + w_t.

This second-round estimate of ρ may not be the best estimate. We can go on to a third-round estimate and so on, stopping when successive estimates of ρ differ by a very small amount, say from 0.01 to 0.005.

5.2) The Durbin procedure

Durbin (1960) suggested an alternative method of estimating ρ. The generalized difference equation can be written as:

    Y_t = β1(1 − ρ) + β2 X_t − ρβ2 X_{t−1} + ρ Y_{t−1} + e_t

so the coefficient on Y_{t−1} provides an estimate of ρ. Once an estimate of ρ is obtained, we regress the transformed variable Y* on X* as in Y_t* = β1* + β2* X_t* + e_t*.

5.3) The Theil–Nagar procedure

Theil and Nagar (1961) estimate ρ from the d statistic (in small samples) as:

    ρ = [N^2(1 − d/2) + k^2] / (N^2 − k^2)

where N is the total number of observations, d is the DW statistic, and k is the number of coefficients including the intercept.

5.4) The Hildreth–Lu procedure

Starting from the first-order autoregressive scheme u_t = ρ u_{t−1} + ε_t, Hildreth and Lu (1960) recommend selecting values of ρ between −1 and +1 at intervals of 0.1, transforming the data by the generalized difference equation for each value, and obtaining the associated residual sum of squares (RSS). They suggest choosing the ρ which minimizes the RSS.

The differencing procedure loses one observation. To avoid this, the first observations on Y and X are transformed as Y_1(1 − ρ^2)^0.5 and X_1(1 − ρ^2)^0.5 (Prais–Winsten, 1971).

5.5) Detrending by including a time trend as one of the regressors

Consider the model with a linear time trend:

    Y_t = β1 + β2 X_t + β3 t + u_t

Its first-difference transformation is:

    ΔY_t = β2 ΔX_t + β3 + ε_t

There is an intercept term in the first-difference form; it signifies that there was a linear trend term in the original model. If β3 > 0, there is an upward trend in Y after removing the influence of the variable X. We emphasize that these techniques are only to be used if you are sure that there is no problem of misspecification.
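The iterative procedure of section 5.1 is straightforward to automate. A minimal numpy sketch is given below (the helper names and interface are mine, not from the lecture); it drops the first observation at each round rather than applying the Prais–Winsten transformation:

```python
import numpy as np

def ols(y, X):
    """OLS coefficients and residuals (X should include a constant column)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

def cochrane_orcutt(y, x, tol=0.005, max_iter=20):
    """Iterative Cochrane-Orcutt estimation of y_t = b1 + b2*x_t + u_t
    with u_t = rho*u_{t-1} + eps_t. Returns (b1, b2, rho)."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    b, e = ols(y, X)                       # step 1: OLS residuals
    rho_old = 0.0
    for _ in range(max_iter):
        # step 2: regress e_t on e_{t-1} (through the origin) to estimate rho
        rho = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
        # step 3: generalized (quasi-)difference and re-estimate
        y_star = y[1:] - rho * y[:-1]
        x_star = x[1:] - rho * x[:-1]
        Xs = np.column_stack([np.ones(n - 1), x_star])
        bs, _ = ols(y_star, Xs)
        b1, b2 = bs[0] / (1.0 - rho), bs[1]   # recover original-scale intercept
        # step 4: residuals of the original model with the new estimates
        e = y - b1 - b2 * x
        # step 5: iterate until successive rho estimates barely change
        if abs(rho - rho_old) < tol:
            break
        rho_old = rho
    return b1, b2, rho
```

To keep the first observation, the Prais–Winsten transformation of section 5.4 could be added by scaling the 1st observations of y and x by (1 − ρ^2)^0.5 instead of dropping them.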
6) An example

The regression of crop output on the price index and fertilizer input (Table 6.1) was found to be badly autocorrelated: the DW statistic was 0.96 against a critical value of dL of 1.28. We found that the autocorrelation arose from a problem of omitted variable bias, but for illustrative purposes we shall see how the autocorrelation may be removed using the Cochrane–Orcutt correction. To do this we carry out the following steps:

(1) The equation estimated by OLS gives DW = 0.958; thus ρ = 1 − d/2 = 0.521.

(2) Calculate Q_t* = Q_t − 0.521 Q_{t−1}, and similarly for P* and F*, for the observations 1962 to 1990. The results are shown in Table 6.1.

(3) Apply the Prais–Winsten transformation to get Q*_1961 = (1 − 0.521^2)^(1/2) Q_1961, and similarly for the 1961 values of P* and F*. (Although we do have 1960 values for P and F, though not for Q, and so could apply the Cochrane–Orcutt transformation to their 1961 observations, the fact that we use the Prais–Winsten transformation for one variable means that we must also use it for the others.) The resulting values are shown in Table 6.1.

Table 6.1: Application of the Cochrane–Orcutt correction to the crop production function data

Year     Q      Q*      P      P*      F      F*
1961    40.4   34.5   106.0   90.5    99.4   84.8
1962    36.4   15.4   108.1   52.9   100.8   49.0
1963    35.4   16.5   110.3   54.0   102.1   49.6
1964    37.9   19.4   110.1   52.6   102.9   49.7
1965    34.8   15.1   108.6   51.2   103.1   49.5
1966    27.9    9.8   103.8   47.2   104.2   50.5
1967    29.8   15.3   109.5   55.4   104.6   50.3
1968    34.7   19.1   102.6   45.5   105.6   51.1
1969    38.4   20.3   101.1   47.6   106.8   51.7
1970    33.6   13.6   100.9   48.2   106.7   51.1
1971    33.6   16.1   104.7   52.2   108.3   52.7
1972    32.2   14.7   107.3   52.7   108.6   52.2
1973    35.3   18.5   103.0   47.1   110.4   53.8
1974    39.4   21.1   116.4   62.8   111.2   53.7
1975    30.6   10.1   112.7   52.0   111.1   53.1
1976    30.5   14.5   108.0   49.3   110.7   52.8
1977    33.7   17.8   103.2   46.9   110.5   52.8
1978    35.8   18.2   101.0   47.2   112.0   54.4
1979    36.0   17.3   103.6   51.0   111.6   53.2
1980    37.0   18.3   109.6   55.6   113.2   55.0
1981    30.7   11.4   105.2   48.1   114.1   55.1
1982    28.0   12.0    98.7   43.9   114.8   55.4
1983    28.4   13.8    99.2   47.7   114.8   55.0
1984    27.6   12.8    94.8   43.2   114.4   54.7
1985    32.9   18.5   100.6   51.2   114.6   54.9
1986    37.1   20.0   104.5   52.1   114.5   54.8
1987    36.0   16.7    98.9   44.5   114.2   54.6
1988    36.6   17.9   101.8   50.2   115.5   55.9
1989    38.8   19.7   105.6   52.6   116.7   56.5
1990    37.1   16.9   108.7   53.6   118.0   57.2

(4) Estimate the model Q_t* = β1* + β2* P_t* + β3* F_t* + ∈_t*. The regression results are given in Table 6.2 (which also repeats those for OLS estimation). Calculate the estimate of the intercept as b1 = b1*/(1 − ρ), which equals −19.98.

Table 6.2: Regression results with the Cochrane–Orcutt procedure (t-statistics in parentheses)

                    Constant          P              F              R2     DW
OLS                 10.21 (0.41)      0.26 (1.26)    −0.03 (−0.20)  0.12   0.96
Cochrane–Orcutt     −9.57 (−2.02)     0.28 (2.63)     0.22 (1.58)   0.62   1.50

(5) Comparing the two regressions, we see that the DW statistic is now 1.5. This value falls towards the upper end of the zone of indecision, so the evidence for autocorrelation is much weaker than in the OLS regression, though it may be thought worthwhile to repeat the procedure (using a new ρ of 0.25, calculated from the new DW). Comparison of the slope coefficients from the two regressions shows price to be relatively unaffected. With the Cochrane–Orcutt procedure, the fertilizer variable produces the expected positive sign, though it remains insignificant. The unexpected insignificance of fertilizer is a further indication that we should have treated the initial autocorrelation as a sign of misspecification. In this case, the Cochrane–Orcutt procedure has suppressed the symptom of misspecification, but cannot provide the cure, which is to include the omitted variables. If you believe the model to be correctly specified and there is autocorrelation, then the Cochrane–Orcutt procedure may be used to obtain efficient estimates.
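The transformation in steps (1)–(3) is mechanical and can be checked directly. The following sketch (variable names are mine) reproduces the Q* column of Table 6.1, up to small rounding differences, together with the back-transformed intercept reported in step (4):

```python
import numpy as np

# Q series for 1961-1990, taken from Table 6.1 (P and F are handled identically)
Q = np.array([40.4, 36.4, 35.4, 37.9, 34.8, 27.9, 29.8, 34.7, 38.4, 33.6,
              33.6, 32.2, 35.3, 39.4, 30.6, 30.5, 33.7, 35.8, 36.0, 37.0,
              30.7, 28.0, 28.4, 27.6, 32.9, 37.1, 36.0, 36.6, 38.8, 37.1])

rho = 1 - 0.958 / 2                    # = 0.521, from the OLS Durbin-Watson statistic

# Cochrane-Orcutt quasi-difference for 1962-1990 ...
Q_star = Q[1:] - rho * Q[:-1]
# ... and the Prais-Winsten transformation for the first (1961) observation
Q_star = np.concatenate(([np.sqrt(1 - rho**2) * Q[0]], Q_star))

print(Q_star.round(1))                 # compare with the Q* column of Table 6.1

# After regressing Q* on P* and F*, the intercept on the original scale is
b1_star = -9.57                        # transformed-model intercept from Table 6.2
print(b1_star / (1 - rho))             # approximately -19.98
```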
References

Bao, Nguyen Hoang (1995), 'Applied Econometrics', Lecture Notes and Readings, Vietnam–Netherlands Project for the MA Program in Economics of Development.

Maddala, G.S. (1992), 'Introduction to Econometrics', Macmillan Publishing Company, New York.

Mukherjee, Chandan, Howard White and Marc Wuyts (1998), 'Econometrics and Data Analysis for Developing Countries', Routledge, London.

Workshop 9: Autocorrelation

1.1) Use the data given in the table below to regress output on the price index and fertilizer input. Draw the residual plot and count the number of runs.

Year    Output (Q)   Price index (P)   Fertilizer input (F)   Rainfall (R)
1960      n.a.           100.0             100.0                 184.2
1961      40.4           106.0              99.4                 155.3
1962      36.4           108.1             100.8                 107.3
1963      35.4           110.3             102.1                 110.1
1964      37.9           110.1             102.9                 169.3
1965      34.8           108.6             103.1                  81.9
1966      27.9           103.8             104.2                  22.0
1967      29.8           109.5             104.6                  31.5
1968      34.7           102.6             105.6                 198.2
1969      38.4           101.1             106.8                 147.5
1970      33.6           100.9             106.7                  76.5
1971      33.6           104.7             108.3                  90.2
1972      32.2           107.3             108.6                  54.3
1973      35.3           103.0             110.4                 178.3
1974      39.4           116.4             111.2                  70.6
1975      30.6           112.7             111.1                  53.0
1976      30.5           108.0             110.7                  74.4
1977      33.7           103.2             110.5                 136.4
1978      35.8           101.0             112.0                 131.1
1979      36.0           103.6             111.6                 109.2
1980      37.0           109.6             113.2                 112.8
1981      30.7           105.2             114.1                  43.0
1982      28.0            98.7             114.8                  53.5
1983      28.4            99.2             114.8                  20.6
1984      27.6            94.8             114.4                  56.4
1985      32.9           100.6             114.6                  78.3
1986      37.1           104.5             114.5                 151.8
1987      36.0            98.9             114.2                 145.2
1988      36.6           101.8             115.5                 112.3
1989      38.8           105.6             116.7                 161.0
1990      37.1           108.7             118.0                  94.5

1.2) Run the following regressions:

    Q = α0 + α1 P + α2 F

    Q = β0 + β1 P + β2 F + β3 R + β4 R(−1)

where R(−1) is rainfall lagged one year. Use an F test to test the hypothesis that the two rainfall variables may be excluded from the regression (a sketch of this test is given after question 4).

1.3) In the light of your results, comment on the apparent problem of autocorrelation in the regression of output on the price index and fertilizer input.

2) Compile time series data for the terms of trade of a country of your choice. Regress both the terms of trade and the logged terms of trade on time and graph the residuals in each case. How many runs are there in each case? Comment. Can you respecify the equation to increase the number of runs?

3) Using the data given in the data file SOCECON, regress the infant mortality rate on income per capita. Plot the residuals with the observations ordered: (a) alphabetically; and (b) by income per capita. Count the number of runs in each case. Comment on your results.

4) Using the Sri Lankan macroeconomic data set (SRINA), perform the simple regression of Ip on Ig and plot the residuals. Use both the runs and DW tests to check for autocorrelation. Add variables to the equation to improve the model specification and repeat the tests for autocorrelation. Use the Cochrane–Orcutt estimation procedure if you feel it is appropriate. Comment on your findings.
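For question 1.2, the exclusion test compares the residual sums of squares of the restricted and unrestricted regressions. A minimal sketch (function and variable names are mine, not part of the workshop) is:

```python
import numpy as np

def ols_rss(y, X):
    """Residual sum of squares from an OLS fit (X includes a constant column)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return float(e @ e)

def f_test_exclusion(y, X_unrestricted, X_restricted, q):
    """F statistic for excluding q regressors:
    F = ((RSS_r - RSS_u) / q) / (RSS_u / (n - k_u))."""
    n, k_u = X_unrestricted.shape
    rss_u = ols_rss(y, X_unrestricted)
    rss_r = ols_rss(y, X_restricted)
    return ((rss_r - rss_u) / q) / (rss_u / (n - k_u))
```

Here q = 2 for the two rainfall terms; because of the lagged rainfall variable the usable sample runs from 1961 to 1990, so if both regressions are estimated on that common sample the statistic is compared with the F(2, 25) critical value.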
