Financial econometrics essay: Factors that affect housing prices among location and region


Group 10
Econometrics Report

CONTENTS

I. INTRODUCTION
  1. Overall about econometrics
  2. Why choosing OLS?
II. QUESTION OF INTEREST
III. ECONOMIC MODEL
  1. Choosing the variables
  2. Embedding that target in a general unrestricted model (GUM)
IV. ECONOMETRICS MODEL
  1. Population regression function (PRF)
  2. Sample regression function (SRF)
V. DATA COLLECTION
  1. Data overview
  2. Data description
VI. ESTIMATION OF ECONOMETRIC MODEL
  1. Checking the correlation among variables
  2. Regression run
VII. CHECK MULTICOLLINEARITY AND HETEROSKEDASTICITY
  1. Multicollinearity
  2. Heteroskedasticity
VIII. HYPOTHESES POSTULATED
  1. The t test
  2. Confidence Intervals
  3. P-Value
  4. Testing the overall significance: The F test
IX. RESULT ANALYSIS AND POLICY IMPLICATION
X. CONCLUSION
XI. REFERENCES

I. INTRODUCTION

1. Overall about econometrics

Econometrics is the application of statistical methods to economic data and is described as the branch of economics that aims to give empirical content to economic relations. More precisely, it is the quantitative analysis of actual economic problems, based on the concurrent development of theory and observation, related by appropriate methods of inference.
Economists often compare econometrics to an effective tool for converting mountains of data into simple relationships. Econometrics is effective because it draws on economic theory, statistical theory and mathematical statistics to develop and evaluate its methods. In practice, econometrics helps economists assess economic theories, build econometric models, analyze economic history and make forecasts.

Aware of the importance of econometrics to economic phenomena, our group decided to carry out an econometric study, "The factors that have influence on median housing price", which aims to analyze the data statistically and to point out differences in price levels and the reasons behind them. The data set has 506 observations with 12 variables in total. We choose 6 variables for the research: price, crime, nox, rooms, dist and proptax, in which price is the dependent variable and the other five are independent variables. The general method used in this research is OLS (ordinary least squares), and the estimation is carried out in Stata.

While carrying out this research, our group was lucky to be guided thoroughly by Dr. Dinh Thi Thanh Binh. We are grateful for everything you have taught us! This is the first time our group has carried out an econometric study, so mistakes are unavoidable. We would be glad to receive your feedback so that we can do better next time.

2. Why choosing OLS?
Ordinary least squares (OLS) is a type of linear least squares method for estimating the unknown parameters in a linear regression model. OLS chooses the parameters of a linear function of a set of explanatory variables by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable in the given dataset and those predicted by the linear function.

With the six selected variables, we use OLS because all regressors are treated as exogenous and the effects of the independent variables on the dependent variable are assumed to be linear. In addition, under the assumptions listed below, the OLS estimators are linear and unbiased, and they have the smallest variance among linear unbiased estimators.

When using OLS, we make the following basic assumptions:

1. The regression model is linear in the parameters.
2. X values are fixed in repeated sampling, which means X_i and u_i are uncorrelated.
3. Zero mean value of the disturbance: E(u_i) = 0.
4. Homoskedasticity, or equal variance of u_i: var(u_i) = \sigma^2.
5. No correlation between disturbances.
6. The model is correctly specified.
7. The number of observations must be greater than the number of parameters to be estimated.
8. X values in a given sample must not all be the same.
9. No perfect multicollinearity.
10. The disturbances are normally distributed.

II. QUESTION OF INTEREST

We have always been wondering: "Why do housing prices among locations and regions differ so much?" Housing prices are affected by many different factors such as structure, neighborhood, accessibility, air pollution and so on.
To seek the answer to that question, our group uses the collected data to build and run a regression model, and then analyzes the results to answer the question of interest above.

III. ECONOMIC MODEL

According to the provided data, the economic model used in this report is an empirical one. Note that the fundamental model is mathematical; with an empirical model, however, data are gathered for the variables and, using accepted statistical techniques, the data are used to provide estimates of the model's values.

1. Choosing the variables

Having described the data via the command "des" in file… from Stata software, we obtain the following result:

des

obs:     506
vars:    12          31 Oct 1996 16:37
size:    22,770

variable name   storage type   display format   variable label
price           float          %9.0g            median housing price, $
crime           float          %9.0g            crimes committed per capita
nox             float          %9.0g            nit ox concen; parts per 100m
rooms           float          %9.0g            avg number of rooms
dist            float          %9.0g            wght dist to 5 employ centers
radial          byte           %9.0g            access. index to rad. hghwys
proptax         float          %9.0g            property tax per $1000
stratio         float          %9.0g            average student-teacher ratio
lowstat         float          %9.0g            perc of people 'lower status'
lprice          float          %9.0g            log(price)
lnox            float          %9.0g            log(nox)
lproptax        float          %9.0g            log(proptax)

Figure 1

The table above describes the variables in the data set, which records factors that may influence housing prices over 506 observations.
After careful discussion, our group chose price as the dependent variable Y and the following independent variables:

- X1: crime
- X2: nox
- X3: rooms
- X4: dist
- X5: proptax

2. Embedding that target in a general unrestricted model (GUM)

In its simplest acceptable representation (which will later be specified in the econometric model), the GUM is determined to be:

price = f(crime, nox, rooms, dist, proptax)

A brief description of each variable is given in Figure 2.

Name      Role                        Meaning                                                 Expected sign
Price     Dependent variable (Y)      Median housing price
Crime     Independent variable (X1)   Number of crimes committed per capita                   -
Nox       Independent variable (X2)   Nitrogen oxide concentration, parts per 100 million     -
Rooms     Independent variable (X3)   Average number of rooms                                 +
Dist      Independent variable (X4)   Weighted distance to 5 employment centers               -
Proptax   Independent variable (X5)   Property tax per $1000                                  -

Figure 2

IV. ECONOMETRICS MODEL

1. Population regression function (PRF)

PRF: price_i = \beta_0 + \beta_1 crime_i + \beta_2 nox_i + \beta_3 rooms_i + \beta_4 dist_i + \beta_5 proptax_i + u_i

2. Sample regression function (SRF)

SRF: price_i = \hat{\beta}_0 + \hat{\beta}_1 crime_i + \hat{\beta}_2 nox_i + \hat{\beta}_3 rooms_i + \hat{\beta}_4 dist_i + \hat{\beta}_5 proptax_i + \hat{u}_i

where:
- \beta_0 is the intercept of the regression model
- \beta_j is the slope coefficient of the independent variable X_j
- u_i is the disturbance of the regression model
- \hat{\beta}_0 is the estimator of \beta_0
- \hat{\beta}_j is the estimator of \beta_j
- \hat{u}_i is the residual (the estimator of u_i)

V. DATA COLLECTION

1. Data overview

- This data set is collected from a given source; it is therefore secondary data.
- The structure of the economic data: cross-sectional data.

2. Data description

To get summary statistics of the variables, the following Stata command is used:

sum

Variable   Obs   Mean       Std. Dev.   Min     Max
price      506   22511.51   9208.85     5000    50001
crime      506   3.611536   8.59024     0.006   88.97
nox        506   5.549783   1.15839     3.85    8.71
rooms      506   6.284051   0.7025938   3.56    8.78
dist       506   3.795751   2.10613     1.13    12.13
proptax    506   40.82372   16.8537     18.7    71.1

Figure 3

where:
- Obs is the number of observations
- Mean is the sample mean of the variable
- Std. Dev. is the standard deviation of the variable
- Min is the minimum value of the variable
- Max is the maximum value of the variable

VI. ESTIMATION OF ECONOMETRIC MODEL

1. Checking the correlation among variables

First, the correlation between price and each of nox, crime, rooms, dist and proptax is checked by calculating the correlation coefficients among these variables. The correlation coefficient measures the strength and direction of a linear relationship between two variables. In Stata, the correlation matrix is generated with the command:

corr price crime nox rooms dist proptax

          price     crime     nox       rooms     dist      proptax
price     1.0000
crime     -0.3879   1.0000
nox       -0.426    0.4212    1.0000
rooms     0.6958    -0.2188   -0.3028   1.0000
dist      0.2493    -0.3799   -0.7702   0.2054    1.0000
proptax   -0.4671   0.5828    0.667     -0.2921   -0.5344   1.0000

Figure 4

From the matrix, it can be inferred that the correlation between price and each of the independent variables is decent enough to run the regression model.
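To make the correlation coefficient concrete, the following is a minimal sketch of how a sample Pearson correlation can be computed by hand. It is not the report's Stata procedure: the function name `pearson` and the toy `rooms`/`price` values are hypothetical and are not taken from the data set.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, NOT observations from the report's data set.
rooms = [4.0, 5.0, 6.0, 7.0, 8.0]
price = [12000.0, 15000.0, 21000.0, 26000.0, 35000.0]

r = pearson(rooms, price)
print(f"correlation(rooms, price) = {r:.4f}")
```

A value near +1 or -1 indicates a strong linear relationship, and the sign gives its direction, which is how the entries of the matrix in Figure 4 are read.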
Specifically:

- The correlation coefficient between price and crime is -0.3879 => price and crime have a moderate negative relationship.
- The correlation coefficient between price and nox is -0.426 => price and nox have a moderate negative relationship.
- The correlation coefficient between price and rooms is 0.6958 => price and rooms have a moderately strong positive relationship.
- The correlation coefficient between price and dist is 0.2493 => price and dist have a weak positive relationship.
- The correlation coefficient between price and proptax is -0.4671 => price and proptax have a moderate negative relationship.

The independent variables rooms and dist have correlation coefficients with price larger than 0, which means they move in the same direction as the dependent variable. The highest coefficient, 0.6958 (between rooms and price), indicates that rooms has the strongest linear association with price: when rooms increases, price tends to increase noticeably. On the other hand, the correlation coefficient between price and dist is 0.2493, which implies that the connection is not strong: when dist increases, price tends to increase, but not by much.

In addition, no pair of variables has a correlation coefficient larger than 0.8 in absolute value, so this model does not appear to have a serious multicollinearity problem.

2. Regression run

Having checked the correlation among the variables, the regression model is ready to run. In Stata, this is done with the command:

reg price nox crime rooms dist proptax

Source     SS         df    MS
Model      2.52E+10   5     5.04E+09
Residual   1.76E+10   500   35258403.7
Total      4.28E+10   505   84803032

Number of obs   = 506
F(5, 500)       = 142.92
Prob > F        = 0.0000
R-squared       = 0.5883
Adj R-squared   = 0.5842
Root MSE        = 5937.9

price     Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
crime     -150.0703   38.11571    -3.94   0.000   -224.957    -75.18364
nox       -1737.66    410.7763    -4.23   0.000   -2544.72    -930.5992
rooms     7707.327    399.0772    19.31   0.000   6923.252    8491.402
dist      -791.2588   197.9444    -4.00   0.000   -1180.164   -402.3535
proptax   -89.95717   23.61555    -3.81   0.000   -136.3551   -43.55923
_cons     -9060.303   3978.871    -2.28   0.02    -16877.67   -1242.937

Figure 5

From the table above we have the sample regression function:

price = -9060.303 - 150.0703*crime - 1737.66*nox + 7707.327*rooms - 791.2588*dist - 89.95717*proptax

From the result, it can be inferred that crime, nox, rooms, dist and proptax all have statistically significant effects on price at the 5% significance level (as all p-values are smaller than 0.05). In particular, those effects can be specified by the regression coefficients as follows:

- \hat{\beta}_0 = -9060.303 is the estimated intercept.
- \hat{\beta}_1 = -150.0703 means that if crimes committed per capita increases by one unit, median housing price is estimated to decrease by $150.0703, other factors held constant.
- \hat{\beta}_2 = -1737.66 means that if the nitrogen oxide concentration increases by one part per 100 million, median housing price is estimated to decrease by $1737.66, other factors held constant.
- \hat{\beta}_3 = 7707.327 means that if the average number of rooms increases by one, median housing price is estimated to increase by $7707.327, other factors held constant.
- \hat{\beta}_4 = -791.2588 means that if the weighted distance to 5 employment centers increases by one unit, median housing price is estimated to decrease by $791.2588, other factors held constant.
- \hat{\beta}_5 = -89.95717 means that if property tax per $1000 increases by one unit, median housing price is estimated to decrease by $89.95717, other factors held constant.

The coefficient of determination R-squared = 0.5883: all independent variables (crime, nox, rooms, dist, proptax) jointly explain 58.83% of the variation in the dependent variable (price); other factors that are not included explain the remaining 41.17% of the variation in price.

Other indicators:
- Adjusted coefficient of determination: adj R-squared = 0.5842
- Total Sum of Squares: TSS = 4.28E+10
- Explained Sum of Squares: ESS = 2.52E+10
- Residual Sum of Squares: RSS = 1.76E+10
- Degrees of freedom of the model: Dfm = 5
- Degrees of freedom of the residual: Dfr = 500

VII. CHECK MULTICOLLINEARITY AND HETEROSKEDASTICITY

1. Multicollinearity

Multicollinearity is a high degree of correlation among the explanatory variables, which may make it difficult to separate out the effects of the individual regressors: standard errors may be inflated and t-values depressed.

Detecting multicollinearity

Method 1: Use the corr command to examine multicollinearity. If independent variables are strongly correlated (|r| > 0.8), multicollinearity may occur.

          price     crime     nox       rooms     dist      proptax
price     1.0000
crime     -0.3879   1.0000
nox       -0.426    0.4212    1.0000
rooms     0.6958    -0.2188   -0.3028   1.0000
dist      0.2493    -0.3799   -0.7702   0.2054    1.0000
proptax   -0.4671   0.5828    0.667     -0.2921   -0.5344   1.0000

Figure 6

From the table above, we can see that the correlation coefficients among the independent variables are all smaller than 0.8 in absolute value; the largest is -0.7702, between nox and dist. As a result, we conclude that multicollinearity is not a serious problem in this model.

Method 2: Use the variance inflation factor (VIF). If VIF > 10, multicollinearity occurs.

Variable   VIF    1/VIF
nox        3.24   0.308352
dist       2.49   0.401709
proptax    2.27   0.440742
crime      1.54   0.651256
rooms      1.13   0.888073
Mean VIF   2.13

Figure 7

The table shows that every VIF value is smaller than 10; thus, multicollinearity does not occur in this model.

We can conclude from the two methods above that multicollinearity is not too worrisome a problem for this set of data.

2. Heteroskedasticity

Another problem that our model can suffer from is heteroskedasticity. Under heteroskedasticity, the least squares estimators are still unbiased but are no longer efficient; in addition, the estimators of the variances become biased, which makes the usual tests and confidence intervals of our model unreliable.

Homoskedasticity is the assumption that the variance of each error term u_i is unchanged as i moves from 1, 2 to n. It can also be written as:

Var(u_i) = Var(u_j),   i = 1, 2, 3, …, n;   j = 1, 2, 3, …, n

When that assumption is violated, heteroskedasticity appears.

Causes:
- The nature of the economic phenomena: the phenomena are examined on subjects that differ in scale, or over periods of time that differ in their level of fluctuation.
- The functional form of the model is misspecified, perhaps because appropriate variables are missing or the functional form is wrong.
- The data do not fully and correctly reflect the nature of the economic phenomena, for example when outliers appear. Including or eliminating these observations has a great impact on the regression analysis.
- Errors tend to decrease as data collecting, storing and processing techniques are improved.
- People learn from past behavior.

Hypothesis:
H_0: constant variance (homoskedasticity)
H_1: non-constant variance (heteroskedasticity)

Using the command estat hettest in Stata:

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of price
         chi2(1)      =    26.56
         Prob > chi2  =   0.0000

We can see that Prob > chi2 = 0.0000 < 0.05 => we reject H_0 and accept H_1.
We can conclude that heteroskedasticity does occur in this model.

Correcting heteroskedasticity

We use the command:

reg price crime nox rooms dist proptax, robust

and obtain the result:

Number of obs   = 506
F(5, 500)       = 103.2
R-squared       = 0.588
Root MSE        = 5937

price     Coef.       Robust Std. Err.   t       P>|t|   [95% Conf. Interval]
crime     -150.0703   30.45247           -4.93   0.000   -209.9009   -90.23976
nox       -1737.66    389.6642           -4.46   0.000   -2503.241   -972.0787
rooms     7707.327    670.6304           11.49   0.000   6389.726    9024.928
dist      -791.2588   175.744            -4.50   0.000   -1136.546   -445.9712
proptax   -89.95717   26.84788           -3.35   0.001   -142.7057   -37.20862
_cons     -9060.303   5398.964           -1.68   0.094   -19667.75   1547.148

Figure 8

Note that, compared with the earlier regression, none of the coefficient estimates changed, but the standard errors, and hence the t values, are different, which gives more reliable p-values.

VIII. HYPOTHESES POSTULATED

1. The t test

The tests below use the robust t values from Figure 8 and the two-sided 5% critical value c_{0.025}(500) = 1.965.

Hypothesis (crime): H_0: \beta_1 = 0; H_1: \beta_1 \neq 0
t = -4.93; |t| > c_{0.025}(500) = 1.965 => Reject H_0
Conclusion: The number of crimes committed per capita has a statistically significant effect on median housing price. The higher the number of crimes committed per capita, the lower the median housing price.

Hypothesis (nox): H_0: \beta_2 = 0; H_1: \beta_2 \neq 0
t = -4.46; |t| > c_{0.025}(500) = 1.965 => Reject H_0
Conclusion: The nitrogen oxide concentration (parts per 100 million) has a statistically significant effect on median housing price.
The higher the nitrogen oxide concentration, the lower the median housing price.

Hypothesis (rooms): H_0: \beta_3 = 0; H_1: \beta_3 \neq 0
t = 11.49; |t| > c_{0.025}(500) = 1.965 => Reject H_0
Conclusion: The average number of rooms has a statistically significant effect on median housing price. The higher the average number of rooms, the higher the median housing price.

Hypothesis (dist): H_0: \beta_4 = 0; H_1: \beta_4 \neq 0
t = -4.5; |t| > c_{0.025}(500) = 1.965 => Reject H_0
Conclusion: The weighted distance to 5 employment centers has a statistically significant effect on median housing price. The higher the weighted distance to 5 employment centers, the lower the median housing price.

Hypothesis (proptax): H_0: \beta_5 = 0; H_1: \beta_5 \neq 0
t = -3.35; |t| > c_{0.025}(500) = 1.965 => Reject H_0
Conclusion: Property tax per $1000 has a statistically significant effect on median housing price. The higher the property tax per $1000, the lower the median housing price.

2. Confidence Intervals

Test the following hypothesis for each coefficient: H_0: \beta_j = 0; H_1: \beta_j \neq 0

Variable   Coefficient   Significance level   Confidence interval
Const      \beta_0       5%                   (-19667.75 ; 1547.148)
X1         \beta_1       5%                   (-209.9009 ; -90.23976)
X2         \beta_2       5%                   (-2503.241 ; -972.0787)
X3         \beta_3       5%                   (6389.726 ; 9024.928)
X4         \beta_4       5%                   (-1136.546 ; -445.9712)
X5         \beta_5       5%                   (-142.7057 ; -37.20862)

Figure 9

We can see that for all slope coefficients, 0 does not belong to the confidence interval, so we reject the hypotheses H_0: \beta_1 = 0, \beta_2 = 0, \beta_3 = 0, \beta_4 = 0, \beta_5 = 0. The interval for the constant does contain 0, so the intercept is not significantly different from zero at the 5% level.

Conclusion: The number of crimes committed per capita, the nitrogen oxide concentration, the average number of rooms, the weighted distance to 5 employment centers and the property tax per $1000 all have a statistically significant effect on median housing price at the 95% confidence level.
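The t statistics and confidence intervals above can be re-derived from the coefficients and robust standard errors in Figure 8. The following is a minimal sketch, assuming the report's rounded critical value of 1.965; the names `ROBUST`, `T_CRIT` and `t_and_interval` are illustrative and not part of the report. Because the critical value is rounded, the interval bounds may differ slightly from those printed by Stata in Figure 9.

```python
# Coefficients and robust standard errors as reported in Figure 8.
ROBUST = {
    "crime": (-150.0703, 30.45247),
    "nox": (-1737.66, 389.6642),
    "rooms": (7707.327, 670.6304),
    "dist": (-791.2588, 175.744),
    "proptax": (-89.95717, 26.84788),
    "_cons": (-9060.303, 5398.964),
}

# Two-sided 5% critical value used in the report (assumed, rounded).
T_CRIT = 1.965


def t_and_interval(coef, se, t_crit=T_CRIT):
    """Return (t statistic, lower bound, upper bound) for one coefficient."""
    t_stat = coef / se
    margin = t_crit * se
    return t_stat, coef - margin, coef + margin


results = {name: t_and_interval(c, s) for name, (c, s) in ROBUST.items()}

for name, (t_stat, lower, upper) in results.items():
    # Reject H0: beta = 0 when |t| exceeds the critical value,
    # which is equivalent to 0 lying outside the interval.
    decision = "reject H0" if abs(t_stat) > T_CRIT else "fail to reject H0"
    print(f"{name:8s} t = {t_stat:6.2f}  CI = ({lower:.2f}, {upper:.2f})  {decision}")
```

The two decision rules agree by construction: a coefficient has |t| above the critical value exactly when its interval excludes 0.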
3. P-Value

Hypothesis testing (crime): H_0: \beta_1 = 0; H_1: \beta_1 \neq 0
P-value = 0.0004 < 0.05 => Reject H_0
The number of crimes committed per capita has a statistically significant effect on median housing price. The higher the number of crimes committed per capita, the lower the median housing price.
In particular, with the sample we have, the estimated result shows that one more crime committed per capita decreases median housing price by $150.07, holding other factors fixed.

Hypothesis testing (nox): H_0: \beta_2 = 0; H_1: \beta_2 \neq 0
P-value = 0.0004 < 0.05 => Reject H_0
The nitrogen oxide concentration has a statistically significant effect on median housing price. The higher the nitrogen oxide concentration, the lower the median housing price.
In particular, with the sample we have, the estimated result shows that one more unit of nitrogen oxide concentration (one part per 100 million) decreases median housing price by $1737.66, holding other factors fixed.

Hypothesis testing (rooms): H_0: \beta_3 = 0; H_1: \beta_3 \neq 0
P-value = 0.0004 < 0.05 => Reject H_0
The average number of rooms has a statistically significant effect on median housing price. The higher the average number of rooms, the higher the median housing price.
In particular, with the sample we have, the estimated result shows that one more room in the house increases median housing price by $7707.33, holding other factors fixed.

Hypothesis testing (dist): H_0: \beta_4 = 0; H_1: \beta_4 \neq 0
P-value = 0.0004 < 0.05 => Reject H_0
The weighted distance to 5 employment centers has a statistically significant effect on median housing price. The higher the weighted distance to 5 employment centers, the lower the median housing price.
In particular, with the sample we have, the estimated result shows that a one-unit increase in the weighted distance to 5 employment centers decreases median housing price by $791.26, holding other factors fixed.

Hypothesis testing (proptax): H_0: \beta_5 = 0; H_1: \beta_5 \neq 0
P-value = 0.0008 < 0.05 => Reject H_0
Property tax per $1000 has a statistically significant effect on median housing price. The higher the property tax per $1000, the lower the median housing price.
In particular, with the sample we have, the estimated result shows that a one-dollar increase in property tax per $1000 decreases median housing price by $89.96, holding other factors fixed.

4. Testing the overall significance: The F test

This test examines whether the slope parameters \beta_j of the independent variables can all be zero at the same time. The hypotheses are as follows:

H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0
H_1: at least one \beta_j \neq 0

F = 142.92 > F_{0.05}(5, 500)

As a result, there is enough evidence to reject the null hypothesis and conclude that at least one independent variable has explanatory or predictive power on price, so we do not reduce the model by dropping these variables.

IX. RESULT ANALYSIS AND POLICY IMPLICATION

From the data analysis in the previous sections, we have gained an overall view of the given data set in terms of the statistical relationship between housing prices and each of the proposed factors. As mentioned at the beginning of this report, we aim to learn how the security of the neighborhood, air pollution, the size of the house, accessibility and the property tax are associated with housing price. In other words, we are concerned with how much buyers are willing to pay for these components.

Following the data analysis, the regression run and the hypothesis testing, it can be concluded that the security of the neighborhood, air pollution, the size of the house, accessibility and the property tax all have a statistically significant effect on housing prices.
Therefore, tenants, investors and constructors should take all of these factors into account when making deals.

X. CONCLUSION

This report was completed thanks to the dedicated contribution of each member and the knowledge gained from our study of Econometrics. This research has given us a good opportunity to practice what we have learned and to gain a deeper understanding of data analysis and the relevant tests. From this application, we hope that our research can suggest the relationship between housing prices and some other factors.

Again, due to our limited understanding and resources, our report may contain misinterpretations. We hope that our teacher and readers can give us constructive comments on the report so that we can improve ourselves and do better in the future.

XI. REFERENCES

https://www.york.ac.uk/media/economics/documents/seminars/Hendry_Feb_2011.pdf
http://pages.hmc.edu/evans/chap1.pdf
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.926.5532&rep=rep1&type=pdf
D.A. Belsley, E. Kuh, and R. Welsch, Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, New York: Wiley (1990)

Posted: 22/06/2020, 21:35
