In this figure, one point is a long way away from the rest. If this point is included in the estimation sample, the fitted line will be the dotted one, which has a slight positive slope; if this observation were removed, the fitted line would instead be the solid one, whose slope is large and negative. OLS will not select the latter line when the outlier is included: since the observation lies a long way from the others, its residual (the distance from the point to the fitted line), once squared, would lead to a big increase in the RSS.

Note that outliers can be detected by plotting y against x only in the context of a bivariate regression. When there are more explanatory variables, outliers are identified most easily by plotting the residuals over time, as in figure 6.10.

A trade-off therefore potentially exists between the need to remove outlying observations, which could have an undue impact on the OLS estimates and cause residual non-normality, and the notion that each data point represents a useful piece of information. The latter concern is coupled with the fact that removing observations at will can artificially improve the fit of the model. A sensible way to proceed is to introduce dummy variables into the model only if there is both a statistical need to do so and a theoretical justification for their inclusion. This justification would normally come from the researcher's knowledge of the historical events that relate to the dependent variable and the model over the relevant sample period. Dummy variables may justifiably be used to remove observations corresponding to 'one-off' or extreme events that are considered highly unlikely to be repeated, and whose information content is deemed of no relevance for the data as a whole. Examples include real estate market crashes, economic or financial crises, and so on.

Non-normality in the data can also arise from certain types of heteroscedasticity, known as ARCH. In this case the non-normality is intrinsic to all the data, so outlier removal would not make the residuals of such a model normal.

Another important use of dummy variables is in modelling seasonality in time series data and in accounting for so-called 'calendar anomalies', such as end-of-quarter valuation effects. These are discussed in section 8.10.
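To make the residual-based screening concrete, here is a minimal sketch in Python. The data, the ±3-standard-deviation flagging rule and all variable names are illustrative assumptions rather than anything from the text, and, as stressed above, a dummy should be added only where the historical record also justifies it.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 50
x = rng.normal(size=T)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=T)
y[25] += 8.0  # a 'one-off' extreme event, e.g. a market crash

X = sm.add_constant(x)
resid = sm.OLS(y, X).fit().resid

# Flag observations whose residuals lie more than three standard deviations from zero.
flagged = np.where(np.abs(resid) > 3 * resid.std())[0]

# Add one impulse dummy per flagged observation and re-estimate.
dummies = np.zeros((T, flagged.size))
dummies[flagged, np.arange(flagged.size)] = 1.0
res = sm.OLS(y, np.column_stack([X, dummies])).fit()
print(flagged, res.params)
```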
6.10 Multicollinearity

An implicit assumption when using the OLS estimation method is that the explanatory variables are not correlated with one another. If there were no relationship between the explanatory variables, they would be said to be orthogonal to one another, and adding or removing a variable from a regression equation would not cause the values of the coefficients on the other variables to change. In any practical context, the correlation between explanatory variables will be non-zero, although this will generally be relatively benign: a small degree of association between explanatory variables will almost always occur but will not cause too much loss of precision. A problem occurs when the explanatory variables are very highly correlated with each other, however, and this problem is known as multicollinearity. It is possible to distinguish between two classes of multicollinearity: perfect multicollinearity and near-multicollinearity.

Perfect multicollinearity occurs when there is an exact relationship between two or more variables, in which case it is not possible to estimate all the coefficients in the model. Perfect multicollinearity will usually be observed only when the same explanatory variable is inadvertently used twice in a regression. For illustration, suppose that two variables were employed in a regression function such that the value of one was always twice that of the other (e.g. suppose x_3 = 2x_2). If both x_3 and x_2 were used as explanatory variables in the same regression, the model parameters could not be estimated: since the two variables are perfectly related to one another, together they contain only enough information to estimate one parameter, not two. Technically, the difficulty would occur in trying to invert the (X′X) matrix, which would not be of full rank (two of the columns would be linearly dependent on one another), meaning that the inverse of (X′X) would not exist and hence the OLS estimates β̂ = (X′X)⁻¹X′y could not be calculated.
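The rank failure is easy to reproduce. In this small sketch (simulated data, assumed purely for illustration), x3 is constructed as exactly twice x2, so X′X is not of full rank and the OLS formula breaks down.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 30
x2 = rng.normal(size=T)
x3 = 2.0 * x2  # exact linear dependence: perfect multicollinearity

X = np.column_stack([np.ones(T), x2, x3])
XtX = X.T @ X

print(np.linalg.matrix_rank(XtX))  # 2, not 3: two columns are linearly dependent
try:
    np.linalg.inv(XtX)             # the OLS estimator requires this inverse
except np.linalg.LinAlgError as err:
    print("Cannot invert X'X:", err)
```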
Near-multicollinearity is much more likely to occur in practice, and arises when there is a non-negligible, but not perfect, relationship between two or more of the explanatory variables. Note that a high correlation between the dependent variable and one of the independent variables is not multicollinearity.

Visually, we could think of the difference between near- and perfect multicollinearity as follows. Suppose that the variables x_2t and x_3t were highly correlated. If we produced a scatter plot of x_2t against x_3t, perfect multicollinearity would correspond to all the points lying exactly on a straight line, while near-multicollinearity would correspond to the points lying close to the line, and the closer they were to the line (taken altogether), the stronger the relationship between the two variables would be.

6.10.1 Measuring near-multicollinearity

Testing for multicollinearity is surprisingly difficult, and hence all that is presented here is a simple method to investigate the presence or otherwise of the most easily detected forms of near-multicollinearity. This method simply involves looking at the matrix of correlations between the individual variables. Suppose that a regression equation has three explanatory variables (plus a constant term), and that the pairwise correlations between these explanatory variables are as follows.

corr   x_2   x_3   x_4
x_2    –     0.2   0.8
x_3    0.2   –     0.3
x_4    0.8   0.3   –

Clearly, if multicollinearity were suspected, the most likely culprit would be the high correlation between x_2 and x_4. Of course, if the relationship involves three or more variables that are collinear – e.g. x_2 + x_3 ≈ x_4 – then multicollinearity would be very difficult to detect.

In our example (equation (6.6)), the correlation between EFBSg and GDPg is 0.51, suggesting a moderately strong relationship. We do not think multicollinearity is completely absent from our rent equation but, on the other hand, it probably does not represent a serious problem.

Another test is to run auxiliary regressions, in which we regress each independent variable on the remaining independent variables and examine whether the R² values are zero (which would suggest that the variables are not collinear). In equations with several independent variables this procedure is time-consuming, although in our example there is only one auxiliary regression that we can run:

  ÊFBSg_t = 1.55 + 0.62 GDPg_t    (6.48)
            (2.54)  (2.99)

R² = 0.26; adj. R² = 0.23; T = 28. We observe that GDPg is significant in the EFBSg_t equation, which is indicative of collinearity. The coefficient of determination is not high, but neither is it negligible.
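Both diagnostics take only a few lines of code. The sketch below uses simulated stand-ins for EFBSg and GDPg (the actual series are not reproduced here), prints the pairwise correlation matrix and runs each auxiliary regression. The variance inflation factor, VIF = 1/(1 − R²), is a standard summary of the auxiliary R² and is an addition of ours rather than something used in the text.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
T = 28
gdpg = rng.normal(loc=2.5, scale=1.0, size=T)
efbsg = 1.5 + 0.6 * gdpg + rng.normal(scale=1.2, size=T)  # moderately related to GDPg

X = pd.DataFrame({"EFBSg": efbsg, "GDPg": gdpg})
print(X.corr().round(2))  # pairwise correlations between the regressors

# Auxiliary regressions: each explanatory variable on all the others.
for col in X.columns:
    others = sm.add_constant(X.drop(columns=col))
    r2 = sm.OLS(X[col], others).fit().rsquared
    print(f"{col}: auxiliary R2 = {r2:.2f}, VIF = {1 / (1 - r2):.2f}")
```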
6.10.2 Problems if near-multicollinearity is present but ignored

First, R² will be high, but the individual coefficients will have high standard errors, so that the regression 'looks good' as a whole but the individual variables are not significant. (Note that multicollinearity does not affect the value of R² in a regression.) This arises, in the context of very closely related explanatory variables, as a consequence of the difficulty in observing the individual contribution of each variable to the overall fit of the regression. Second, the regression becomes very sensitive to small changes in the specification, so that adding or removing an explanatory variable leads to large changes in the coefficient values or significances of the other variables. Finally, near-multicollinearity makes confidence intervals for the parameters very wide, and significance tests might therefore give inappropriate conclusions, making it difficult to draw clear-cut inferences.

6.10.3 Solutions to the problem of multicollinearity

A number of alternative estimation techniques have been proposed that are valid in the presence of multicollinearity – for example, ridge regression, or principal components analysis (PCA). PCA is a technique that may be useful when explanatory variables are closely related, and it works as follows. If there are k explanatory variables in the regression model, PCA will transform them into k uncorrelated new variables. These components are independent linear combinations of the original data, and the components, rather than the original variables, are then used in any subsequent regression model (a short code sketch of this approach is given at the end of this section). Many researchers do not use these techniques, however, as they can be complex, their properties are less well understood than those of the OLS estimator and, above all, many econometricians would argue that multicollinearity is more a problem with the data than with the model or estimation method.

Other, more ad hoc methods for dealing with the possible existence of near-multicollinearity include the following.

● Ignore it, if the model is otherwise adequate – i.e. statistically and in terms of each coefficient being of a plausible magnitude and having an appropriate sign. Sometimes the existence of multicollinearity does not reduce the t-ratios on variables that would have been significant without it sufficiently to make them insignificant. It is worth stating that the presence of near-multicollinearity does not affect the BLUE properties of the OLS estimator – it will still be consistent, unbiased and efficient – as near-multicollinearity does not violate any of the CLRM assumptions 1 to 4. In the presence of near-multicollinearity, however, it will be hard to obtain small standard errors. This will not matter if the aim of the model-building exercise is to produce forecasts from the estimated model, since the forecasts will be unaffected so long as the relationship between the explanatory variables continues to hold over the forecast sample.
● Drop one of the collinear variables, so that the problem disappears. This may be unacceptable to the researcher, however, if there are strong a priori theoretical reasons for including both variables in the model. Moreover, if the removed variable is relevant in the data-generating process for y, an omitted variable bias would result (see section 5.9).
● Transform the highly correlated variables into a ratio, and include only the ratio, and not the individual variables, in the regression. Again, this may be unacceptable if real estate theory suggests that changes in the dependent variable should occur following changes in the individual explanatory variables, and not in a ratio of them.
● Finally, as stated above, it is often said that near-multicollinearity is more a problem with the data than with the model, in the sense that there is insufficient information in the sample to obtain estimates for all the coefficients. This is why near-multicollinearity leads coefficient estimates to have wide standard errors, which is exactly what would happen if the sample size were small. An increase in the sample size will usually lead to an increase in the accuracy of coefficient estimation and, consequently, a reduction in the coefficient standard errors, thus enabling the model to dissect better the effects of the various explanatory variables on the explained variable. A further possibility, therefore, is for the researcher to go out and collect more data – for example, by taking a longer run of data or switching to a higher frequency of sampling. Of course, it may be infeasible to increase the sample size if all available data are being utilised already. Another method of increasing the available quantity of data as a potential remedy for near-multicollinearity is to use a pooled sample. This involves data with both cross-sectional and time series dimensions, known as a panel (see Brooks, 2008, ch. 10).
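The PCA route mentioned at the start of this section can be sketched as follows. The data are simulated, scikit-learn's PCA is one convenient implementation among several, and the question of how many components to retain is left open here.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
T = 100
common = rng.normal(size=T)
x2 = common + 0.1 * rng.normal(size=T)  # two near-collinear regressors
x3 = common + 0.1 * rng.normal(size=T)
y = 1.0 + 0.5 * x2 + 0.5 * x3 + rng.normal(size=T)

# Transform the k regressors into k uncorrelated components ...
components = PCA(n_components=2).fit_transform(np.column_stack([x2, x3]))

# ... and use the components, rather than the original variables, in the regression.
res = sm.OLS(y, sm.add_constant(components)).fit()
print(res.params.round(2), round(res.rsquared, 2))
```

The gain in precision comes at a price: the components are linear combinations of the original variables, so their coefficients usually lack a direct economic interpretation.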
6.11 Adopting the wrong functional form

A further implicit assumption of the classical linear regression model is that the appropriate 'functional form' is linear. This means that the appropriate model is assumed to be linear in the parameters, and that, in the bivariate case, the relationship between y and x can be represented by a straight line. This assumption may not always be upheld, however. Whether the model should be linear can be formally tested using Ramsey's (1969) RESET test, which is a general test for misspecification of functional form. Essentially, the method works by using higher-order terms of the fitted values (e.g. ŷ_t², ŷ_t³, etc.) in an auxiliary regression. The auxiliary regression is thus one in which y_t, the dependent variable from the original regression, is regressed on powers of the fitted values together with the original explanatory variables:

  y_t = α_1 + α_2 ŷ_t² + α_3 ŷ_t³ + ⋯ + α_p ŷ_tᵖ + Σᵢ β_i x_it + v_t    (6.49)

Higher-order powers of the fitted values of y can capture a variety of non-linear relationships, since they embody higher-order powers and cross-products of the original explanatory variables – e.g.

  ŷ_t² = (β̂_1 + β̂_2 x_2t + β̂_3 x_3t + ⋯ + β̂_k x_kt)²    (6.50)

The value of R² is obtained from regression (6.49), and the test statistic, given by TR², is distributed asymptotically as a χ²(p − 1). Note that the degrees of freedom for this test are (p − 1) and not p. This arises because p is the highest-order term in the fitted values used in the auxiliary regression, and thus the test involves p − 1 terms: one for the square of the fitted value, one for the cube, and so on, up to one for the pth power. If the value of the test statistic is greater than the χ² critical value, reject the null hypothesis that the functional form is correct.
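The test is straightforward to code by hand. The sketch below (simulated data; p = 3 is an assumption) builds the auxiliary regression of equation (6.49) and compares TR² with the χ²(p − 1) critical value; statsmodels also ships a packaged RESET test, but the manual TR² form matches the text more closely.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(4)
T = 100
x = rng.normal(size=T)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(size=T)  # true relationship is non-linear

X = sm.add_constant(x)
fitted = sm.OLS(y, X).fit().fittedvalues

# Auxiliary regression: y on the original regressors plus powers of the fitted values.
p = 3
aux = sm.OLS(y, np.column_stack([X] + [fitted**j for j in range(2, p + 1)])).fit()

stat = T * aux.rsquared                  # TR^2, asymptotically chi-squared(p - 1)
crit = stats.chi2.ppf(0.95, df=p - 1)
print(round(stat, 2), round(crit, 2),
      "reject linearity" if stat > crit else "do not reject linearity")
```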
6.11.1 What if the functional form is found to be inappropriate?

One possibility would be to switch to a non-linear model, but the RESET test presents the user with no guide as to what a better specification might be. In addition, models that are non-linear in the parameters typically preclude the use of OLS and require a non-linear estimation technique. Some non-linear models can still be estimated using OLS, however, provided that they are linear in the parameters. For example, if the true model is of the form

  y_t = β_1 + β_2 x_2t + β_3 x_2t² + β_4 x_2t³ + u_t    (6.51)

– that is, a third-order polynomial in x – and the researcher assumes that the relationship between y_t and x_t is linear (i.e. x_2t² and x_2t³ are missing from the specification), this is simply a special case of omitted variables, with the usual problems (see section 5.9) and an obvious remedy.

The model may be multiplicatively non-linear, however. A second possibility, sensible in this case, is to transform the data into logarithms, which linearises many previously multiplicative models into additive ones. For example, consider again the exponential growth model

  y_t = β_1 x_t^β_2 u_t    (6.52)

Taking logs, this becomes

  ln(y_t) = ln(β_1) + β_2 ln(x_t) + ln(u_t)    (6.53)

or

  Y_t = α + β_2 X_t + v_t    (6.54)

where Y_t = ln(y_t), α = ln(β_1), X_t = ln(x_t) and v_t = ln(u_t). A simple logarithmic transformation therefore makes this model a standard linear bivariate regression equation that can be estimated using OLS.

Loosely following the treatment given in Stock and Watson (2006), the following list shows four different functional forms for models that are either linear or can be made linear by a logarithmic transformation of one or more of the variables, examining only a bivariate specification for simplicity. Care is needed when interpreting the coefficient values in each case. (The original accompanies each case with a small sketch of the relationship between y_t and x_2t on the relevant scales; the sketches are omitted here.)

(1) Linear: y_t = β_1 + β_2 x_2t + u_t; a one-unit increase in x_2t causes a β_2-unit increase in y_t.
(2) Log-linear: ln(y_t) = β_1 + β_2 x_2t + u_t; a one-unit increase in x_2t causes a 100 × β_2 per cent increase in y_t.
(3) Linear-log: y_t = β_1 + β_2 ln(x_2t) + u_t; a 1 per cent increase in x_2t causes a 0.01 × β_2-unit increase in y_t.
(4) Double log: ln(y_t) = β_1 + β_2 ln(x_2t) + u_t; a 1 per cent increase in x_2t causes a β_2 per cent increase in y_t. Note that to plot y against x_2 would be more complex, as the shape would depend on the size of β_2.

Note also that we cannot use R² or adjusted R² to determine which of these four types of model is most appropriate, since the dependent variables differ across some of the models.

Example 6.7
We follow the procedure described in equation (6.49) to test whether equation (5.39) has the correct functional form. Equation (5.39) is the restricted regression; the unrestricted (auxiliary) regression adds the square of the fitted value:

  R̂Rg_t = −14.41 + 2.68 EFBSg_t + 2.24 GDPg_t + 0.02 FITTED²

RRSS = 1,078.26; URSS = 1,001.73; T = 28; m = 1; and k = 4. The F-statistic is

  (1,078.26 − 1,001.73)/1,001.73 × (28 − 4)/1 = 1.83

The F(1,24) critical value is 4.26 at the 5 per cent significance level. The computed test statistic is lower than the critical value, and hence we do not reject the null hypothesis that the functional form is correct, so we would conclude that the linear model is appropriate.
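As a quick check on the arithmetic in example 6.7 (taking the RSS figures quoted above as given), the F-statistic and its critical value can be reproduced as follows.

```python
from scipy import stats

rrss, urss = 1078.26, 1001.73  # restricted and unrestricted RSS from the text
T, k, m = 28, 4, 1             # observations, regressors, restrictions

f_stat = (rrss - urss) / urss * (T - k) / m
crit = stats.f.ppf(0.95, dfn=m, dfd=T - k)
print(round(f_stat, 2), round(crit, 2))  # approximately 1.83 and 4.26
```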
6.12 Parameter stability tests

So far, regressions of a form such as

  y_t = β_1 + β_2 x_2t + β_3 x_3t + u_t    (6.55)

have been estimated. These regressions embody the implicit assumption that the parameters (β_1, β_2 and β_3) are constant for the entire sample, both for the data period used to estimate the model and for any subsequent period used in the construction of forecasts. This implicit assumption can be tested using parameter stability tests. The idea is, essentially, to split the data into sub-periods, estimate up to three models (one for each of the sub-parts and one for all the data) and then 'compare' the RSS of the models. There are two types of test that will be considered, namely the Chow (analysis of variance) test and the predictive failure test.

6.12.1 The Chow test
The steps involved are shown in box 6.7.

Box 6.7 Conducting a Chow test
(1) Split the data into two sub-periods. Estimate the regression over the whole period and then for the two sub-periods separately (three regressions). Obtain the RSS for each regression.
(2) The restricted regression is now the regression for the whole period, while the 'unrestricted regression' comes in two parts: one for each of the sub-samples. It is thus possible to form an F-test, based on the difference between the RSSs:

  test statistic = [RSS − (RSS_1 + RSS_2)] / (RSS_1 + RSS_2) × (T − 2k) / k    (6.56)

where RSS = residual sum of squares for the whole sample; RSS_1 = residual sum of squares for sub-sample 1; RSS_2 = residual sum of squares for sub-sample 2; T = number of observations; 2k = number of regressors in the 'unrestricted' regression (as it comes in two parts), each part including a constant; and k = number of regressors in each 'unrestricted' regression, including a constant.
The unrestricted regression is the one in which the restriction has not been imposed on the model. Since the restriction is that the coefficients are equal across the sub-samples, the restricted regression will be the single regression for the whole sample. Thus the test is one of how much bigger the residual sum of squares for the whole sample (RSS) is than the sum of the residual sums of squares for the two sub-samples (RSS_1 + RSS_2). If the coefficients do not change much between the samples, the residual sum of squares will not rise much upon imposing the restriction. The test statistic in (6.56) can therefore be considered a straightforward application of the standard F-test formula discussed in chapter 5: the restricted residual sum of squares is RSS, the unrestricted residual sum of squares is (RSS_1 + RSS_2), and the number of restrictions is equal to the number of coefficients estimated in each of the regressions, i.e. k. The number of regressors in the unrestricted regression (including the constants) is 2k, since the unrestricted regression comes in two parts, each with k regressors.
(3) Perform the test. If the value of the test statistic is greater than the critical value from the F-distribution, which is an F(k, T − 2k), reject the null hypothesis that the parameters are stable over time.
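Expressed in code, box 6.7 amounts to three OLS fits and the formula in (6.56). The sketch below uses simulated data with a deliberate break half-way through the sample; all numbers and names are illustrative assumptions.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(5)
T, k = 28, 3                   # observations; regressors per regression, incl. constant
split = T // 2
x2, x3 = rng.normal(size=T), rng.normal(size=T)
slope = np.where(np.arange(T) < split, 2.0, -1.0)  # the x2 coefficient changes mid-sample
y = 1.0 + slope * x2 + 0.5 * x3 + rng.normal(scale=0.3, size=T)

X = sm.add_constant(np.column_stack([x2, x3]))
rss = sm.OLS(y, X).fit().ssr                       # whole sample (restricted)
rss1 = sm.OLS(y[:split], X[:split]).fit().ssr      # sub-sample 1
rss2 = sm.OLS(y[split:], X[split:]).fit().ssr      # sub-sample 2

chow = (rss - (rss1 + rss2)) / (rss1 + rss2) * (T - 2 * k) / k
crit = stats.f.ppf(0.95, dfn=k, dfd=T - 2 * k)
print(round(chow, 2), round(crit, 2))  # a statistic above the critical value signals instability
```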
Note that it is also possible to use a dummy variables approach to calculating both Chow and predictive failure tests. In the case of the Chow test, the unrestricted regression would contain dummy variables for the intercept and for all the slope coefficients (see also section 8.10). For example, suppose that the regression is of the form

  y_t = β_1 + β_2 x_2t + β_3 x_3t + u_t    (6.57)

If the split of the total of T observations is made so that the sub-samples contain T_1 and T_2 observations (where T_1 + T_2 = T), the unrestricted regression would be given by

  y_t = β_1 + β_2 x_2t + β_3 x_3t + β_4 D_t + β_5 D_t x_2t + β_6 D_t x_3t + v_t    (6.58)

where D_t = 1 for t ∈ T_1 and zero otherwise. In other words, D_t takes the value one for observations in the first sub-sample and zero for observations in the second sub-sample. The Chow test viewed in this way would then be a standard F-test of the joint restriction H_0: β_4 = 0 and β_5 = 0 and β_6 = 0, with (6.58) and (6.57) being the unrestricted and restricted regressions, respectively.

Example 6.8
The application of the Chow test using equation (6.6) is restricted by the fact that we have only twenty-eight observations, so if we split the sample we are left with a mere fourteen observations in each sub-sample. These are very small samples for running regressions, but we do so in this example for the sake of illustrating an application of the Chow test. We split the sample into two sub-samples, 1979 to 1992 and 1993 to 2006, compute the F-statistic (as described in equation (6.56)) and test the null hypothesis that the parameters are stable over time. The restricted equation is (6.6), and thus the RRSS is 1,078.26.

Unrestricted equation 1 (first sub-sample):
  R̂Rg_t = −10.14 + 2.21 EFBSg_t + 1.86 GDPg_t    (6.59)
R² = 0.66; adj. R² = 0.60; URSS_1 = 600.83.

Unrestricted equation 2 (second sub-sample):
  R̂Rg_t = −23.92 + 3.36 EFBSg_t + 5.00 GDPg_t    (6.60)
R² = 0.52; adj. R² = 0.43; URSS_2 = 385.31.

[...]

... along with second-hand space resulting from lease termination, sub-letting, and so forth. If these demand and supply forces result in falling vacancy, the market becomes a 'landlords' market': landlords will push for higher rents in new leases or rent reviews. Valuers will also be taking account of these developments, and estimated rental values should rise. There ...

... assessment of how strong the relationships between rent and our chosen drivers (vacancy and output) are. We estimate cross-correlations with two lags in table 7.2 – that is, we study past effects from vacancy and output on rents. In addition, we compute correlations between rents and lead values of vacancy and output. This is to ...

Table 7.2: Cross-correlations with annual ...

... vacancy rate (defined as vacant stock over total stock, expressed as a percentage) and an economic output variable. The argument is straightforward: vacancy is considered an indicator of the demand and supply balance in the real estate market – i.e. it reflects demand and supply conditions. As business conditions strengthen and firms need to take on more space, the level of vacant stock in the market should ...

... rents. In real estate modelling, however, the availability of data plays a significant role in the form of the model. In markets in which there is an abundance of data (in terms of both the existence and the history of series describing the occupier and investment markets), the solid black arrows show that a model of rents is part of a more general model within which demand, vacancy, rent and supply are ...

... real estate theory drawn from the model left until after a statistically adequate model has been found. According to Hendry and Richard (1982), a final acceptable model should satisfy several criteria (adapted slightly here). The model should:
● be logically plausible;
● be consistent with underlying real estate theory, including satisfying any relevant parameter restrictions; ...

... backcast could be ...

Figure 6.12: Plot of y_t against observation number.

... Example ... equation:

  R̂Rg_t = −10.95 + 2.35 EFBSg_t + 1.91 GDPg_t + 1.59 D01_t − 1.57 D02_t
           (1.81)   (3.01)          (2.09)          (0.23)       (0.21)
           − 12.17 D03_t − 2.99 D04_t − 1.37 D05_t + 4.92 D06_t    (6.63)
           (1.71)          (0.42)        (0.19)        (0.69)

R² = 0.65; adj. ...