Group 10
Econometrics Report

I INTRODUCTION
   Overall about econometrics
   Why choosing OLS?
II QUESTION OF INTEREST
III ECONOMIC MODEL
   Choosing the variables
   Embedding that target in a general unrestricted model (GUM)
IV ECONOMETRICS MODEL
   Population regression function (PRF)
   Sample regression function (SRF)
V DATA COLLECTION
   Data overview
   Data description
VI ESTIMATION OF ECONOMETRIC MODEL
   Checking the correlation among variables
   Regression run
VII CHECK MULTICOLLINEARITY AND HETEROSCEDASTICITY
   Multicollinearity
   Heteroskedasticity
VIII HYPOTHESES POSTULATED
   The t test
   Confidence Intervals
   P-Value
   Testing the overall significance: The F test
IX RESULT ANALYSIS AND POLICY IMPLICATION
X CONCLUSION
XI REFERENCES

I INTRODUCTION

1. Overall about econometrics

Econometrics is the application of statistical methods to economic data and is described as the branch of economics that aims to give empirical content to economic relations. More precisely, it is the quantitative analysis of actual economic problems, based on the concurrent development of theory and observation, related by appropriate methods of inference. Economists therefore often describe econometrics as an effective tool for converting mountains of data into simple relationships. Econometrics is effective because it draws on economic theory, statistical theory and mathematical statistics to develop and evaluate its methods. In practice, econometrics helps economists to assess economic theories, develop econometric models, and analyze and forecast economic phenomena. Aware of the importance of econometrics to economic phenomena, our group decided to
carry out an econometric study, “The factors that have influence on median housing price”, which aims to analyze the statistics and to point out the differences in price level and the reasons for them. The data set has 506 observations with 12 variables in total. We chose six variables for the research: price, crime, nox, rooms, dist and proptax, of which price is the dependent variable and the other five are independent variables. The general method used in this research is OLS (ordinary least squares), and the estimation is run in Stata.

While carrying out this research, our group was lucky to be guided thoroughly by Dr Dinh Thi Thanh Binh. We are grateful for everything you have taught us! As this is the first time our group has carried out an econometric study, mistakes are unavoidable. We would be pleased to receive your feedback so that we can do better next time.

2. Why choosing OLS?
Ordinary least squares (OLS) is a type of linear least squares method for estimating the unknown parameters in a linear regression model. OLS chooses the parameters of a linear function of a set of explanatory variables by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable in the given dataset and those predicted by the linear function.

With the six selected variables, we use the OLS model because all regressors are exogenous variables and the effects of the independent variables on the dependent variable are linear. In addition, under the assumptions below, the OLS estimators are linear and unbiased and have the smallest variance among linear unbiased estimators.

When using OLS, we have some basic assumptions:

1. The regression model is linear in the parameters
2. X values are fixed in repeated sampling, which means \(X_i\) and \(u_i\) are uncorrelated
3. Zero mean value of disturbance: \(E(u_i) = 0\)
4. Homoscedasticity or equal variance of \(u_i\): \(var(u_i) = \sigma^2\)
5. No correlation between disturbances
6. The model is correctly specified
7. Number of observations must be greater than the number of parameters to be estimated
8. X values in a given sample must not all be the same
9. No perfect multicollinearity
10. Normal distribution of the disturbances

II QUESTION OF INTEREST

We have always wondered: “Why do housing prices differ so much among locations and regions?” Housing prices are affected by many different factors such as structure, neighborhood, accessibility, air pollution and so on. To seek the answer to that question, our group uses the collected data to build and run a regression model, and then analyzes the results to answer the question of interest above.

III ECONOMIC MODEL

According to the provided data, the economic model used in this report is an empirical one. Note that the fundamental model is mathematical; with an empirical
model, however, data are gathered for the variables and, using accepted statistical techniques, are used to provide estimates of the model's values.

Choosing the variables

Having described the data via the command “des” in file… in Stata, we obtain the following result:

des

  obs:     506
  vars:    12          31 Oct 1996 16:37
  size:    22,770

  variable name   storage type   display format   value label   variable label
  price           float          %9.0g                          median housing price, $
  crime           float          %9.0g                          crimes committed per capita
  nox             float          %9.0g                          nit ox concen; parts per 100m
  rooms           float          %9.0g                          avg number of rooms
  dist            float          %9.0g                          wght dist to employ centers
  radial          byte           %9.0g                          access index to rad hghwys
  proptax         float          %9.0g                          property tax per $1000
  stratio         float          %9.0g                          average student-teacher ratio
  lowstat         float          %9.0g                          perc of people 'lower status'
  lprice          float          %9.0g                          log(price)
  lnox            float          %9.0g                          log(nox)
  lproptax        float          %9.0g                          log(proptax)

Figure

The table above describes the factors that influence housing price, measured over 506 observations. After careful discussion, our group came to the conclusion to choose:

Dependent variable Y: price
Independent variables: X1 - crime, X2 - nox, X3 - rooms, X4 - dist, X5 - proptax

\[ Price = f(X_1, X_2, X_3, X_4, X_5) \]

Embedding that target in a general unrestricted model (GUM)

In its simplest acceptable representation (which will later be specified in the econometric model), the GUM is determined to be:

\[ Price = f(crime, nox, rooms, dist, proptax) \]

A brief description of each variable is given in the figure below.

                              Name      Meaning                                                          Expected sign
  Dependent variable (Y)      Price     Median housing price                                             +
  Independent variables (X)   Crime     Number of crimes committed per capita                            -
                              Nox       Nitrogen oxide concentration in the air, parts per 100 million   -
                              Rooms     The average number of rooms                                      +
                              Dist      Weighted distance to employment centers                          -
                              Proptax   Property tax per $1000                                           -

Figure

IV ECONOMETRICS MODEL

Population regression function (PRF)

PRF:
\[ Price = \beta_0 + \beta_1 \times crime + \beta_2 \times nox + \beta_3 \times rooms + \beta_4 \times dist + \beta_5 \times proptax + u_i \]

Sample regression function (SRF)

SRF:
\[ \widehat{Price} = \hat{\beta}_0 + \hat{\beta}_1 \times crime + \hat{\beta}_2 \times nox + \hat{\beta}_3 \times rooms + \hat{\beta}_4 \times dist + \hat{\beta}_5 \times proptax \]

where:
\(\beta_0\) is the intercept of the regression model
\(\beta_j\) (j = 1, ..., 5) is the slope coefficient of the j-th independent variable
\(u_i\) is the disturbance of the regression model
\(\hat{\beta}_0\) is the estimator of \(\beta_0\)
\(\hat{\beta}_j\) is the estimator of \(\beta_j\)
\(\hat{u}_i\) is the residual (the estimator of \(u_i\))

V DATA COLLECTION

Data overview

This set of data is collected from a given source, therefore it is secondary data.
The structure of the economic data: cross-sectional data.

Data description

To get the statistical indicators of the variables, the following command is used in Stata:

sum

  Variable   Obs   Mean     Std. Dev.   Min    Max
  price      506   22511    9208.85     5000   50001
  crime      506   3.6115   8.59024     0.00   88.97
  ...

...

  Total   SS = 4.28E+10   df = 505   MS = 84803032
  Adj R-squared = 0.5842
  Root MSE = 5937.9

  price     Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
  crime     -150.0703   38.11571    -3.94   0.000   -224.957    -75.18364
  nox       -1737.66    410.7763    -4.23   0.000   -2544.72    -930.5992
  rooms     7707.327    399.0772    19.31   0.000   6923.252    8491.402
  dist      -791.2588   197.9444    -4.00   0.000   -1180.164   -402.3535
  proptax   -89.95717   23.61555    -3.81   0.000   -136.3551   -43.55923
  _cons     -9060.303   3978.871    -2.28   0.02    -16877.67   -1242.937

Figure

From the table above we have the sample regression function:

\[ \widehat{Price} = -9060.303 - 150.0703 \times crime - 1737.66 \times nox + 7707.327 \times rooms - 791.2588 \times dist - 89.95717 \times proptax \]

From the result, it can be inferred that crime, nox, rooms, dist and proptax all have statistically significant effects on price at the 5% significance level (as all p-values are smaller than 0.05).
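As a quick sanity check on the regression table above, each t statistic can be recomputed as the coefficient divided by its standard error and compared with the two-sided 5% critical value of about 1.965 that this report uses for 500 degrees of freedom. The sketch below is written in Python rather than Stata purely for illustration; the names `ols_table`, `t_stat`, `t_values` and `significant` are hypothetical, and the numbers are copied from the table.

```python
# Coefficients and standard errors copied from the OLS output table above.
# All names in this snippet are illustrative and not part of the original report.
ols_table = {
    "crime":   (-150.0703, 38.11571),
    "nox":     (-1737.66, 410.7763),
    "rooms":   (7707.327, 399.0772),
    "dist":    (-791.2588, 197.9444),
    "proptax": (-89.95717, 23.61555),
    "_cons":   (-9060.303, 3978.871),
}

# Two-sided 5% critical value for 500 degrees of freedom, as used in the report.
CRITICAL_T = 1.965


def t_stat(coef, std_err):
    """Return the t statistic for the null hypothesis that the coefficient is zero."""
    return coef / std_err


# Round to two decimals, the precision shown in the regression table.
t_values = {name: round(t_stat(coef, se), 2) for name, (coef, se) in ols_table.items()}

# A coefficient is significant at 5% when |t| exceeds the critical value.
significant = {name: abs(t) > CRITICAL_T for name, t in t_values.items()}

for name, t in t_values.items():
    print(f"{name:8s} t = {t:7.2f}  significant at 5%: {significant[name]}")
```

Under these inputs the recomputed values agree with the t column of the table, which suggests the coefficient and standard-error columns were read correctly from the output.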
In particular, those effects can be specified by the regression coefficients as follows:

\(\hat{\beta}_0 = -9060.303\)

\(\hat{\beta}_1 = -150.0703\) means that if crimes committed per capita increase by one unit, the median housing price decreases by 150.0703 on average, other factors unchanged.

\(\hat{\beta}_2 = -1737.66\) means that if the nitrogen oxide concentration (parts per 100 million) increases by one unit, the median housing price decreases by 1737.66 on average, other factors unchanged.

\(\hat{\beta}_3 = 7707.327\) means that if the average number of rooms increases by one, the median housing price increases by 7707.327 on average, other factors unchanged.

\(\hat{\beta}_4 = -791.2588\) means that if the weighted distance to employment centers increases by one unit, the median housing price decreases by 791.2588 on average, other factors unchanged.

\(\hat{\beta}_5 = -89.95717\) means that if the property tax per $1000 increases by one, the median housing price decreases by 89.95717 on average, other factors unchanged.

The coefficient of determination R-squared = 0.5883: all independent variables (crime, nox, rooms, dist, proptax) jointly explain 58.83% of the variation in the dependent variable (price); other factors that are not mentioned explain the remaining 41.17% of the variation in price.

Other indicators:
- Adjusted coefficient of determination adj R-squared = 0.5842
- Total Sum of Squares TSS = 4.28E+10
- Explained Sum of Squares ESS = 2.52E+10
- Residual Sum of Squares RSS = 1.76E+10
- The degrees of freedom of the model Dfm = 5
- The degrees of freedom of the residual Dfr = 500

VII CHECK MULTICOLLINEARITY AND HETEROSCEDASTICITY

Multicollinearity

Multicollinearity is a high degree of correlation amongst the explanatory variables. It may make it difficult to separate out the effects of the individual regressors: standard errors may be overestimated and t-values depressed.

Detect multicollinearity

o Method 1: Use the cor command to
examine multicollinearity. If independent variables are strongly correlated (|r| > 0.8), multicollinearity may occur.

            price     crime     nox       rooms     dist      proptax
  price     1.0000
  crime     -0.3879   1.0000
  nox       -0.4260   0.4212    1.0000
  rooms     0.6958    -0.2188   -0.3028   1.0000
  dist      0.2493    -0.3799   -0.7702   0.2054    1.0000
  proptax   -0.4671   0.5828    0.6670    -0.2921   -0.5344   1.0000

Figure

From the table above, we can see that the correlation coefficients among the independent variables are all smaller than 0.8 in absolute value (the largest is -0.7702, between nox and dist). As a result, we conclude that serious multicollinearity does not occur in this model.

o Method 2: Use the variance inflation factor (VIF). If VIF > 10, multicollinearity occurs.

  Variable   VIF    1/VIF
  nox        3.24   0.308352
  dist       2.49   0.401709
  proptax    2.27   0.440742
  crime      1.54   0.651256
  rooms      1.13   0.888073
  Mean VIF   2.13

Figure

The table shows that all VIF values are smaller than 10; thus, multicollinearity does not occur in this model.

We can conclude from the two methods above that multicollinearity is not too worrisome a problem for this set of data.

Heteroskedasticity

Another problem that our model can suffer from is heteroskedasticity. Under heteroskedasticity the least squares estimators are still unbiased but are no longer efficient; in addition, the estimators of the variances become biased, which makes the usual tests and confidence intervals of our model unreliable.

The homoskedasticity assumption states that the variance of each error term \(u_i\) is unchanged as i moves from 1 to n. It can also be written as:

\[ Var(u_i) = Var(u_j), \quad i = 1, 2, 3, \dots, n; \; j = 1, 2, 3, \dots, n \]

When that assumption is violated, heteroskedasticity appears.

Causes

o Essence of economic phenomena: the economic phenomena are examined on subjects that differ in scale, or under periods of time that are not similar in fluctuation level
o The model's functional form is
wrongly specified, perhaps because appropriate variables are missing or the functional form is wrong
o The data cannot fully and correctly reflect the essence of the economic phenomena. For example, outlying observations appear; including or eliminating these observations has a great impact on the regression analysis
o Error tends to decrease as data collecting, storing and processing techniques are improved
o Behaviors in the past are learnt

Hypothesis:

\(H_0\): the variance is homogeneous
\(H_1\): the variance is not homogeneous

Using the command estat hettest in Stata:

  Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
  Ho: Constant variance
  Variables: fitted values of price
  chi2(1) = 26.56
  Prob > chi2 = 0.0000

We can see that Prob > chi2 = 0.0000 < 0.05 => We reject H0 and accept H1. We conclude that heteroskedasticity does occur in this model.

Correcting heteroskedasticity

We use the command:

reg price crime nox rooms dist proptax, robust

and obtain the result:

  Number of obs = 506
  F(5, 500) = 103.2
  Prob > F = 0.0000
  R-squared = 0.5883
  Root MSE = 5937.9

  price     Coef.       Robust Std. Err.   t       P>|t|   [95% Conf. Interval]
  crime     -150.0703   30.45247           -4.93   0.000   -209.9009   -90.23976
  nox       -1737.66    389.6642           -4.46   0.000   -2503.241   -972.0787
  rooms     7707.327    670.6304           11.49   0.000   6389.726    9024.928
  dist      -791.2588   175.744            -4.50   0.000   -1136.546   -445.9712
  proptax   -89.95717   26.84788           -3.35   0.001   -142.7057   -37.20862
  _cons     -9060.303   5398.964           -1.68   0.094   -19667.75   1547.148

Figure

Note that, compared with the earlier regression, none of the coefficient estimates changed, but the standard errors and hence the t values are different, which gives more reliable p-values.

VIII HYPOTHESES POSTULATED

The t test

Hypothesis: \(H_0: \beta_1 = 0\) versus \(H_1: \beta_1 \neq 0\)

\[ t_s = \frac{\hat{\beta}_1 - 0}{se(\hat{\beta}_1)} = -4.93 \]

\(c_{0.025}^{(500)} = 1.965 < |t_s|\) => Reject \(H_0\)

Conclusion: The number of crimes committed per capita has a statistically
significant effect on median housing price: the higher the number of crimes committed per capita, the lower the median housing price.

Hypothesis: \(H_0: \beta_2 = 0\) versus \(H_1: \beta_2 \neq 0\)

\[ t_s = \frac{\hat{\beta}_2 - 0}{se(\hat{\beta}_2)} = -4.46 \]

\(c_{0.025}^{(500)} = 1.965 < |t_s|\) => Reject \(H_0\)

Conclusion: The nitrogen oxide concentration (parts per 100 million) has a statistically significant effect on median housing price: the higher the nitrogen oxide concentration, the lower the median housing price.

Hypothesis: \(H_0: \beta_3 = 0\) versus \(H_1: \beta_3 \neq 0\)

\[ t_s = \frac{\hat{\beta}_3 - 0}{se(\hat{\beta}_3)} = 11.49 \]

\(c_{0.025}^{(500)} = 1.965 < |t_s|\) => Reject \(H_0\)

Conclusion: The average number of rooms has a statistically significant effect on median housing price: the higher the average number of rooms, the higher the median housing price.

Hypothesis: \(H_0: \beta_4 = 0\) versus \(H_1: \beta_4 \neq 0\)

\[ t_s = \frac{\hat{\beta}_4 - 0}{se(\hat{\beta}_4)} = -4.50 \]

\(c_{0.025}^{(500)} = 1.965 < |t_s|\) => Reject \(H_0\)

Conclusion: The weighted distance to employment centers has a statistically significant effect on median housing price: the greater the weighted distance to employment centers, the lower the median housing price.

Hypothesis: \(H_0: \beta_5 = 0\) versus \(H_1: \beta_5 \neq 0\)

\[ t_s = \frac{\hat{\beta}_5 - 0}{se(\hat{\beta}_5)} = -3.35 \]

\(c_{0.025}^{(500)} = 1.965 < |t_s|\) => Reject \(H_0\)

Conclusion: Property tax per $1000 has a statistically significant effect on median housing price: the higher the property tax per $1000, the lower the median housing price.

Confidence Intervals

Test the following hypotheses:

\(H_0: \beta_1 = 0\) versus \(H_1: \beta_1 \neq 0\)
\(H_0: \beta_2 = 0\) versus \(H_1: \beta_2 \neq 0\)
\(H_0: \beta_3 = 0\) versus \(H_1: \beta_3 \neq 0\)
\(H_0: \beta_4 = 0\) versus \(H_1: \beta_4 \neq 0\)
\(H_0: \beta_5 = 0\) versus \(H_1: \beta_5 \neq 0\)

  Variable   Coefficient   Significance level   Confidence interval
  Const      β0            5%                   (-19667.75 ; 1547.148)
  X1         β1            5%                   (-209.9009 ; -90.23976)
  X2         β2            5%                   (-2503.241 ; -972.0787)
  X3         β3            5%                   (6389.726 ; 9024.928)
  X4         β4            5%                   (-1136.546 ; -445.9712)
  X5         β5            5%                   (-142.7057 ; -37.20862)

Figure

We can see that for all slope coefficients \(\beta_1\) to \(\beta_5\), 0 does not belong to the confidence interval, so we reject the hypotheses \(H_0: \beta_1 = 0\), \(\beta_2 = 0\), \(\beta_3 = 0\), \(\beta_4 = 0\), \(\beta_5 = 0\). (The interval for the constant \(\beta_0\) does contain 0.)

Conclusion: The number of crimes committed per capita, the nitrogen oxide concentration (parts per 100 million), the average number of rooms, the weighted distance to employment centers and the property tax per $1000 all have statistically significant effects on median housing price at the 95% confidence level.

P-Value

Hypothesis testing: \(H_0: \beta_1 = 0\) versus \(H_1: \beta_1 \neq 0\)

P-value = 0.000 (to three decimals, as reported by Stata) < α = 0.05 => Reject H0

The number of crimes committed per capita has a statistically significant effect on median housing price: the higher the number of crimes committed per capita, the lower the median housing price. In particular, with the sample we have, the estimated result shows that a one-unit increase in crimes committed per capita decreases median housing price by $150.07, holding other factors fixed.

Hypothesis testing: \(H_0: \beta_2 = 0\) versus \(H_1: \beta_2 \neq 0\)

P-value = 0.000 (to three decimals) < α = 0.05 => Reject H0

The nitrogen oxide concentration (parts per 100 million) has a statistically significant effect on median housing price: the higher the nitrogen oxide concentration, the lower the median housing price. In particular, with the sample we have, the estimated result shows that a one-unit increase in nitrogen oxide concentration decreases median housing price by $1737.66, holding other factors fixed.

Hypothesis testing: \(H_0: \beta_3 = 0\) versus \(H_1: \beta_3 \neq 0\)

P-value = 0.000 (to three decimals) < α = 0.05 => Reject H0

The average number of rooms has a statistically significant effect on median housing price: the higher the average number of rooms, the higher the median housing price. In particular, with the sample we have, the estimated result shows that one more room on average increases median housing price by $7707.33, holding other factors fixed.

Hypothesis testing: \(H_0: \beta_4 = 0\) versus \(H_1: \beta_4 \neq 0\)

P-value = 0.000 (to three decimals) < α = 0.05 => Reject H0

The weighted distance to employment centers has a statistically significant effect on median housing price: the greater the weighted distance to employment centers, the lower the median housing price. In particular, with the sample we have, the
estimated result shows that a one-unit increase in the weighted distance to employment centers decreases median housing price by $791.26, holding other factors fixed.

Hypothesis testing: \(H_0: \beta_5 = 0\) versus \(H_1: \beta_5 \neq 0\)

P-value = 0.0008 < α = 0.05 => Reject H0

Property tax per $1000 has a statistically significant effect on median housing price: the higher the property tax per $1000, the lower the median housing price. In particular, with the sample we have, the estimated result shows that a one-dollar increase in property tax per $1000 decreases median housing price by $89.96, holding other factors fixed.

Testing the overall significance: The F test

This test examines whether the slope parameters \(\beta_j\) of the independent variables can all be zero at the same time. The hypothesis is as follows:

\(H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0\)
\(H_1\): at least one \(\beta_j \neq 0\)

\[ F_{qs} = \frac{R^2 (n - k)}{(1 - R^2)(k - 1)} = 142.92 > F_{0.05}(5, 500) = 2.23 \]

Here n = 506 is the number of observations and k = 6 is the number of estimated parameters. As a result, there is enough evidence to reject the null hypothesis and conclude that at least one independent variable has explanatory or predictive power on price, so we do not reduce the model by dropping these variables.

IX RESULT ANALYSIS AND POLICY IMPLICATION

From the data analysis in the previous sections, we have gained an overall view of the given data set in terms of the statistical relationship between housing prices and each of the factors proposed. As mentioned at the beginning of this report, we aim to learn how the security of the neighborhood, the air pollution, the size of the house, accessibility and the property tax are associated with housing price. In other words, we are concerned with the willingness of buyers to pay for these components. Following the analysis of the data, the regression run and the hypothesis testing, it can be concluded that the security of the neighborhood, the air pollution, the size of the house, accessibility and the property tax statistically affect
the housing prices. Therefore, tenants, investors and constructors should take all of these factors into account when making deals.

X CONCLUSION

This report was completed thanks to the dedicated contribution of each member and the knowledge from our study of Econometrics. This research has given us a good opportunity to practice what we have learned and to gain a deeper understanding of data analysis and the relevant tests. We hope that our research can suggest the relationship between housing prices and some other factors. Again, due to our limited understanding and resources, our report may contain misinterpretations. We hope that our teacher and readers can give us constructive comments on the report so that we can improve ourselves and do better in the future.

XI REFERENCES

https://www.york.ac.uk/media/economics/documents/seminars/Hendry_Feb_2011.pdf
http://pages.hmc.edu/evans/chap1.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.926.5532&rep=rep1&type=pdf
D. A. Belsley, E. Kuh, and R. Welsch, Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, New York: Wiley (1990)