Using Stata For Principles of Econometrics
Third Edition

Lee Adkins dedicates this work to his lovely and loving wife, Kathy.
Carter Hill dedicates this work to Stan Johnson and George Judge.

Bicentennial Logo Design: Richard Pacifico

Copyright © 2008 John Wiley & Sons, Inc. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

To order books or for customer service, please call 1-800-CALL-WILEY (225-5945).

ISBN-13 978-0-470-18546-9

Printed in the United States of America
10 9 8 7 6 5 4 3
Printed and bound by Malloy, Inc.

PREFACE

This book is a supplement to Principles of Econometrics, 3rd Edition by R. Carter Hill, William E. Griffiths and Guay C. Lim (Wiley, 2007), hereinafter POE. This book is not a substitute for the textbook, nor is it a stand-alone computer manual. It is a companion to the textbook, showing how to perform the examples in the textbook using Stata Release 10. This book will be useful to students taking econometrics, as well as their instructors, and others who wish to use Stata for econometric analysis.

Stata is a very powerful program that is used in a wide variety of academic disciplines. The website is http://www.stata.com. There you will find a great deal of documentation. One great and visual resource is at UCLA:
http://www.ats.ucla.edu/STAT/stata/. We highly recommend this website.

In addition to this computer manual for Stata, there are similar manuals and support for the software packages EViews, Excel, Gretl and Shazam. In addition, all the data for POE in various formats, including Stata, are available at http://www.wiley.com/college/hill. Individual Stata data files, and errata for this manual and the textbook, can be found at http://principlesofeconometrics.com.

The chapters in this book parallel the chapters in POE. Thus if you seek help for the examples in Chapter 11 of the textbook, check Chapter 11 in this book. However, within a chapter the section numbers in POE do not necessarily correspond to the Stata manual sections. Data files and other resources for POE can be found at http://stata.com/texts/s4poe.

We welcome comments on this book and suggestions for improvement. We would like to acknowledge the help of the Stata Corporation, and in particular Bill Rising and Brian Poi for answering many of our questions.

Lee C. Adkins
Department of Economics
Oklahoma State University
Stillwater, OK 74078
lee.adkins@okstate.edu

R. Carter Hill
Economics Department
Louisiana State University
Baton Rouge, LA 70803
eohill@lsu.edu

CONTENTS

CHAPTER 1 Introducing Stata
1.1 Starting Stata
1.2 The opening display
1.3 Exiting Stata
1.4 Stata data files for Principles of Econometrics
1.4.1 A working directory
1.4.2 Data definition files
1.5 Opening Stata data files
1.5.1 The use command
1.5.2 Using the toolbar
1.5.3 Using files on the internet
1.5.4 Locating POE files on the internet
1.6 The variables window
1.7 Describing the data and obtaining summary statistics
1.8 The Stata help system
1.8.1 Using keyword search
1.8.2 Using command search
1.8.3 Opening a dialog box
1.9 Stata command syntax
1.9.1 Syntax of summarize
1.9.2 Learning syntax using the review window
1.10 Saving your work
1.10.1 Copying and pasting
1.10.2 Using a log file
1.10.3 Viewing a log file
1.10.4 Translating a log file to a text file
1.10.5 Using Stata commands for log files
1.11 Using the data browser
1.12 Using Stata graphics
1.12.1 Histograms
1.12.2 Scatter diagrams
1.13 Using Stata Do-files
1.14 Creating and managing variables
1.14.1 Creating (generating) new variables
1.14.2 Using the expression builder
1.14.3 Dropping or renaming a variable
1.14.4 Using arithmetic operators
1.14.5 Using Stata math functions
1.15 Using Stata density functions
1.15.1 Cumulative distribution functions
1.15.2 Inverse cumulative distribution functions
1.16 Using and displaying scalars
1.16.1 Example of standard normal cdf
1.16.2 Example of t-distribution tail-cdf
1.16.3 Example computing percentile of the standard normal
1.16.4 Example computing percentile of the t-distribution
1.17 A scalar dialog box
Key Terms
Chapter 1 Do-file

CHAPTER 2 Simple Linear Regression
2.1 The food expenditure data
2.1.1 Starting a new problem
2.1.2 Starting a log file
2.1.3 Opening a Stata data file
2.1.4 Browsing and listing the data
2.2 Computing summary statistics
2.3 Creating a scatter diagram
2.3.1 Enhancing the plot
2.4 Regression
2.4.1 Fitted values and residuals
2.4.2 Computing an elasticity
2.4.3 Plotting the fitted regression line
2.4.4 Estimating the variance of the error term
2.4.5 Viewing estimated variances and covariances
2.5 Using Stata to obtain predicted values
2.6 Saving the Stata data file and ending the session
Key Terms
Chapter 2 Do-file

CHAPTER 3 Interval Estimation and Hypothesis Testing
3.1 Interval estimates
3.1.1 Critical values from the t-distribution
3.1.2 Creating an interval estimate
3.2 Hypothesis tests
3.2.1 Right tail test of significance
3.2.2 Right tail test of an economic hypothesis
3.2.3 Left tail test of an economic hypothesis
3.2.4 Two tail test of an economic hypothesis
3.3 P-values
3.3.1 P-value of a right tail test
3.3.2 P-value of a left tail test
3.3.3 P-value for a two tail test
3.3.4 P-values in Stata output
Key Terms
Chapter 3 Do-file

CHAPTER 4 Prediction, Goodness-of-Fit and Modeling Issues
4.1 Least squares prediction
4.1.1 Editing the data
4.1.2 Estimate the regression and obtain post-estimation results
4.1.3 Creating the prediction interval
4.2 Measuring goodness-of-fit
4.2.1 Correlations and R2
4.3 The effects of scaling and transforming the data
4.3.1 The reciprocal functional form
4.3.2 Editing graphs
4.3.3 The linear-log model
4.4 Analyzing the residuals
4.4.1 The Jarque-Bera test
4.4.2 Chi-square distribution critical values
4.4.3 Chi-square distribution p-values
4.5 Another empirical example
4.5.1 Examining the data
4.5.2 Estimating and checking the linear relationship
4.5.3 Estimating and checking a cubic equation
4.6 Estimating a log-linear wage equation
4.6.1 The log-linear model
4.6.2 Calculating wage predictions
4.6.3 Constructing wage plots
4.6.4 Generalized R2
4.6.5 Prediction intervals in the log-linear model
Key Terms
Chapter 4 Do-file

CHAPTER 5 Multiple Linear Regression
5.1 Big Andy's Burger Barn
5.2 Prediction
5.3 Sampling precision
5.4 Confidence intervals
5.5 Hypothesis tests
5.6 Goodness-of-fit
Key Terms
Chapter 5 Do-file

CHAPTER 6 Further Inference in the Multiple Regression Model
6.1 The F-test
6.2 Testing the significance of the model
6.3 An extended model
6.4 Testing some economic hypotheses
6.4.1 Significance of advertising
6.4.2 Optimal advertising
6.5 Nonsample information
6.6 Model specification
6.6.1 Omitted variables
6.6.2 Irrelevant variables
6.6.3 Choosing the model
6.7 Poor data, collinearity and insignificance
Key Terms
Chapter 6 Do-file

CHAPTER 7 Nonlinear Relationships
7.1 Nonlinear relationships
7.1.1 Summarize data and estimate regression
7.1.2 Calculate marginal effect
7.1.3 Plotting wage-experience profile
7.2 Dummy variables
7.2.1 Creating dummy variables
7.2.2 Using tabulate
7.2.3 Estimating a dummy variable regression
7.2.4 Testing the significance of the dummy variables
7.2.5 Further calculations
7.3 Applying dummy variables
7.3.1 Interactions between qualitative factors
7.3.2 Adding regional dummies
7.3.3 Testing the equivalence of two regressions
7.3.4 Estimating separate regressions
7.4 Interactions between continuous variables
7.5 Dummy variables in log-linear models
Key Terms
Chapter 7 Do-file

CHAPTER 8 Heteroskedasticity
8.1 The nature of heteroskedasticity
8.2 Using the least squares estimator
8.3 The generalized least squares estimator
8.3.1 Transforming the model
8.3.2 Estimating the variance function
8.3.3 A heteroskedastic partition
8.4 Detecting heteroskedasticity
8.4.1 Residual plots
8.4.2 The Goldfeld-Quandt test
8.4.3 Testing the variance function
8.4.3a The White test
Key Terms
Chapter 8 Do-file

CHAPTER 9 Dynamic Models, Autocorrelation, and Forecasting
9.1 Lags in the error term: autocorrelation
9.2 Area response for sugar
9.3 Estimating an AR(1) model
9.3.1 Least squares and HAC standard errors
9.3.2 Nonlinear least squares
9.3.3 A more general model
9.4 Detecting autocorrelation
9.5 Autoregressive models
9.6 Finite distributed lags
9.7 Autoregressive distributed lag models
Appendix
Key Terms
Chapter 9 Do-file

CHAPTER 10 Random Regressors and Moment Based Estimation
10.1 Least squares with simulated data
10.2 Instrumental variables estimation with simulated data
10.2.1 IV estimation in two steps
10.2.2 IV estimation in one step
10.2.3 IV estimation with surplus instruments
10.3 The Hausman test: simulated data
10.4 Testing for weak instruments: simulated data
10.5 Testing the validity of surplus instruments: simulated data
10.6 Estimation using the Mroz data
10.6.1 Least squares regression
10.6.2 Two-stage least squares
10.6.3 Instrumental variables estimation
10.6.4 Instrumental variables estimation with surplus instruments
10.7 Testing the endogeneity of education
10.8 Testing for weak instruments
10.9 Testing the validity of surplus instruments
Key Terms
Chapter 10 Do-file

CHAPTER 11 Simultaneous Equations Models
11.1 Truffle supply and demand
11.2 Estimating the reduced form equations
11.3 2SLS estimates of truffle demand
11.4 2SLS estimates of truffle supply
11.5 Supply and demand of fish
11.6 Reduced forms for fish price and quantity
11.7 2SLS estimates of fish demand
Key Terms
Chapter 11 Do-file

CHAPTER 12 Nonstationary Time Series Data and Cointegration
12.1 Stationary and nonstationary data
12.2 Spurious regressions
12.3 Unit root tests for stationarity
12.4 Integration and cointegration
12.5 Engle-Granger test
Key Terms
Chapter 12 Do-file

CHAPTER 13 An Introduction to Macroeconometrics: VEC and VAR Models
13.1 VEC and VAR models
13.2 Estimating a VEC model
13.3 Estimating a VAR
13.4 Impulse responses and variance decompositions
Key Terms
Chapter 13 Do-file

CHAPTER 14 An Introduction to Financial Econometrics: Time-Varying Volatility and ARCH Models
14.1 ARCH model and time-varying volatility
14.2 Testing, estimating, and forecasting
14.3 Extensions
14.3.1 GARCH
14.3.2 Threshold GARCH
14.3.3 GARCH-in-mean
Key Terms
Chapter 14 Do-file

CHAPTER 15 Panel Data Models
15.1 Sets of regression equations
15.2 Seemingly unrelated regressions
15.3 The fixed effects model
15.3.1 A dummy variable model
15.3.2 The fixed effects estimator
15.3.3 The fixed effects estimator for a microeconometric panel
15.4 Random effects estimation
15.4.1 Breusch-Pagan test
15.4.2 Hausman test
Key Terms
Chapter 15 Do-file

CHAPTER 16 Qualitative and Limited Dependent Variable Models
16.1 Models with binary dependent variables
16.2 Multinomial logit
16.3 Conditional logit
16.3.1 Release 9: clogit
16.3.2 Release 10: asclogit
16.4 Ordered choice models
16.5 Models for count data
16.6 Censored data models
16.6.1 Simulated data example
16.6.2 Mroz data example
16.7 Selection bias
Key Terms
Chapter 16 Do-file

Appendix A Review of Math Essentials
A.1 Stata math and logical operators
A.2 Math functions
A.3 Extensions to generate operations
A.4 The calculator
A.5 Scientific notation
Key Terms

Appendix B Review of Probability Concepts
B.1 Stata probability functions
B.2 Binomial probability calculations
B.3 Normal probability calculations
B.4 t-distribution probability calculations
B.5 F-distribution probability calculations
B.6 Chi-square distribution probability calculations
Key Terms
Appendix B Do-file

Appendix C Review of Statistical Inference
C.1 Examining the hip data
C.1.1 Constructing a histogram
C.1.2 Obtaining summary statistics
C.1.3 Estimating the population mean
C.2 Using simulated data values
C.3 The central limit theorem
C.4 Interval estimation
C.4.1 Using simulated data
C.4.2 Using the hip data
C.5 Testing the mean of a normal population
C.5.1 Right tail test
C.5.2 Two tail test
C.6 Testing the variance of a normal population
C.7 Testing the equality of two normal population means
C.7.1 Population variances are equal
C.7.2 Population variances are unequal
C.8 Testing the equality of two normal population variances
C.9 Testing normality
C.10 Maximum likelihood estimation
Key Terms
Appendix C Do-file

Index

CHAPTER 1
Introducing Stata

CHAPTER OUTLINE
1.1 Starting Stata
1.2 The opening display
1.3 Exiting Stata
1.4 Stata data files for Principles of Econometrics
1.4.1 A working directory
1.4.2 Data definition files
1.5 Opening Stata data files
1.5.1 The use command
1.5.2 Using the toolbar
1.5.3 Using files on the internet
1.5.4 Locating POE files on the internet
1.6 The variables window
1.7 Describing the data and obtaining summary statistics
1.8 The Stata help system
1.8.1 Using keyword search
1.8.2 Using command search
1.8.3 Opening a dialog box
1.9 Stata command syntax
1.9.1 Syntax of summarize
1.9.2 Learning syntax using the review window
1.10 Saving your work
1.10.1 Copying and pasting
1.10.2 Using a log file
1.10.3 Viewing a log file
1.10.4 Translating a log file to a text file
1.10.5 Using Stata commands for log files
1.11 Using the data browser
1.12 Using Stata graphics
1.12.1 Histograms
1.12.2 Scatter diagrams
1.13 Using Stata Do-files
1.14 Creating and managing variables
1.14.1 Creating (generating) new variables
1.14.2 Using the expression builder
1.14.3 Dropping or renaming a variable
1.14.4 Using arithmetic operators
1.14.5 Using Stata math functions
1.15 Using Stata density functions
1.15.1 Cumulative distribution functions
1.15.2 Inverse cumulative distribution functions
1.16 Using and displaying scalars
1.16.1 Example of standard normal cdf
1.16.2 Example of t-distribution tail-cdf
1.16.3 Example computing percentile of the standard normal
1.16.4 Example computing percentile of the t-distribution
1.17 A scalar dialog box

1.1 STARTING STATA

Stata can be started several ways. First, there may be a shortcut on the desktop that you can double-click. For Stata/SE Release 10 it will look like the following.

[Figure: Stata/SE 10 desktop icon]

Earlier versions of Stata have a similar looking icon, but of course with a different number. Alternatively, using the Windows menu, click Start > All Programs > Stata 10.

A second way is to simply locate a Stata data file, with the *.dta extension, and double-click it.

1.2 THE OPENING DISPLAY

Once Stata is started a display will appear that contains windows titled

Command: this is where Stata commands are typed
Results: output from commands, and error messages, appear here
Review: a listing of commands recently executed
Variables: names of variables in data and labels (if created)

It should look something like the following.

[Figure: Stata/SE 10.0 opening display]

                                      t = 0.6191
                     degrees of freedom = 49

    Ha: mean != 17              Ha: mean > 17
    Pr(|T| > |t|) = 0.5387      Pr(T > t) = 0.2694

The test statistic value is 0.6191 and the two tail p-value is 0.5387. In the Stata output "!=" means "not equal to," or "≠." The details are

    quietly summarize y, detail
    scalar t2 = (ybar - 17)/se
    scalar p2 = 2*ttail(df,abs(t2))

    di "two tail test"
    two tail test

    di "tstat = " t2
    tstat = .6190558

    di "tc975 = " tc975
    tc975 = 2.0095752

    di "pval = " p2
    pval = .53874725

C.6 TESTING THE VARIANCE OF A NORMAL POPULATION

Let $Y$ be a normally distributed random variable, $Y \sim N(\mu, \sigma^2)$. Assume that we have a random sample of size $N$ from this population, $Y_1, Y_2, \ldots, Y_N$. The estimator of the population mean is $\bar{Y} = \sum Y_i / N$ and the unbiased estimator of the population variance is $\hat{\sigma}^2 = \sum (Y_i - \bar{Y})^2 / (N-1)$. To test the null hypothesis $H_0: \sigma^2 = \sigma_0^2$ we use the test statistic

$$V = \frac{(N-1)\hat{\sigma}^2}{\sigma_0^2} \sim \chi^2_{(N-1)}$$

If the null hypothesis is true then the test statistic has the indicated chi-square distribution with $(N-1)$ degrees of freedom. If the alternative hypothesis is $H_1: \sigma^2 > \sigma_0^2$ then we carry out a one-tail test. If we choose the level of significance $\alpha = .05$, then the null hypothesis is rejected if $V \ge \chi^2_{(.95,N-1)}$, where $\chi^2_{(.95,N-1)}$ is the 95th percentile of the chi-square distribution with $(N-1)$ degrees of freedom.

To illustrate, consider the null hypothesis that the
variance of the hip population data equals 4. The Stata automatic test is sdtest. It specifies the null hypothesis in terms of the standard deviation, rather than the variance. Thus the null hypothesis is $H_0: \sigma = 2$. The test command, assuming the hip data is in memory, is

    sdtest y == 2

    One-sample test of variance

    Variable |  Obs      Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
           y |   50   17.1582   .2555502    1.807013    16.64465   17.67175

        sd = sd(y)                              c = chi2 = 39.9999
    Ho: sd = 2                        degrees of freedom = 49

    Ha: sd < 2              Ha: sd != 2               Ha: sd > 2
    Pr(C < c) = 0.1832      2*Pr(C < c) = 0.3664      Pr(C > c) = 0.8168

The test statistic value is 39.9999 and the right tail p-value is 0.8168. To see the details, specify a scalar equal to the hypothesized variance

    scalar s0 = 4

Restore the summary statistics details and calculate some scalars

    quietly summarize y, detail
    scalar sighat2 = r(Var)
    scalar df = r(N)-1

The test statistic value is

    scalar v = df*sighat2/s0

Compute the critical value and p-value

    scalar chi2_95 = invchi2(df,.95)
    scalar p = 1 - chi2(df,v)

Display the results

    di "Chi square test stat " v
    Chi square test stat 39.999882

    di "95th percentile chisquare(49) " chi2_95
    95th percentile chisquare(49) 66.338649

    di "Right tail test p value " p
    Right tail test p value .81678069

C.7 TESTING THE EQUALITY OF TWO NORMAL POPULATION MEANS

Let two normal populations be denoted $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$. In order to estimate and test the difference between means, $\mu_1 - \mu_2$, we must have random samples of data from each of the two populations. We draw a sample of size $N_1$ from the first population, and a sample of size $N_2$ from the second population. Using the first sample we obtain the sample mean $\bar{Y}_1$ and sample variance $\hat{\sigma}_1^2$; from the second sample we obtain $\bar{Y}_2$ and $\hat{\sigma}_2^2$. How the null hypothesis $H_0: \mu_1 - \mu_2 = c$ is tested depends on whether the two population variances are equal or not.

C.7.1 Population variances are equal

If the population variances are equal, so that $\sigma_1^2 = \sigma_2^2 = \sigma_p^2$, then we use information in both samples to estimate the common value $\sigma_p^2$. This "pooled variance estimator" is

$$\hat{\sigma}_p^2 = \frac{(N_1-1)\hat{\sigma}_1^2 + (N_2-1)\hat{\sigma}_2^2}{N_1+N_2-2}$$

If the null hypothesis $H_0: \mu_1 - \mu_2 = c$ is true, then

$$t = \frac{(\bar{Y}_1 - \bar{Y}_2) - c}{\sqrt{\hat{\sigma}_p^2\left(\dfrac{1}{N_1}+\dfrac{1}{N_2}\right)}} \sim t_{(N_1+N_2-2)}$$

As usual we can construct a one-sided alternative, such as $H_1: \mu_1 - \mu_2 > c$, or the two-sided alternative $H_1: \mu_1 - \mu_2 \ne c$.

C.7.2 Population variances are unequal

If the population variances are not equal, then we cannot use the pooled variance estimate. Instead we use

$$t^* = \frac{(\bar{Y}_1 - \bar{Y}_2) - c}{\sqrt{\dfrac{\hat{\sigma}_1^2}{N_1}+\dfrac{\hat{\sigma}_2^2}{N_2}}}$$

The exact distribution of this test statistic is neither normal nor the usual t-distribution. The distribution of $t^*$ can be approximated by a t-distribution with degrees of freedom

$$df = \frac{\left(\hat{\sigma}_1^2/N_1 + \hat{\sigma}_2^2/N_2\right)^2}{\dfrac{\left(\hat{\sigma}_1^2/N_1\right)^2}{N_1-1} + \dfrac{\left(\hat{\sigma}_2^2/N_2\right)^2}{N_2-1}}$$

This is called Satterthwaite's formula.

To illustrate the test for populations with equal variances, draw two samples from normal populations with means 1 and 2, using drawnorm

    clear
    drawnorm x1 x2, n(50) means(1 2) seed(12345)

Calculate the summary statistics

    summarize

    Variable |  Obs       Mean   Std. Dev.         Min        Max
          x1 |   50   .9804798    .9662047   -1.021499   2.501836
          x2 |   50   1.853527    1.040021   -.1856575   4.222027

Using this information you could compute the test statistic given in POE. The automatic test uses the command ttest with an option

    ttest x1 == x2, unpaired

Unpaired means that the observations are not matched to each other in any way. The results are shown below. The difference between the two sample means is -0.87 and the t-statistic value is -4.3487 with 98 degrees of freedom. The two tail p-value is 0.0000, leading us to correctly reject the equality of the two population means.

    Two-sample t test with equal variances

    Variable |  Obs        Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
          x1 |   50    .9804798     .136642    .9662047    .7058875   1.255072
          x2 |   50    1.853527    .1470812    1.040021    1.557956   2.149098
    combined |  100    1.417003    .1090824    1.090824     1.20056   1.633447
        diff |        -.8730473    .2007583               -1.271446   -.474649

        diff = mean(x1) - mean(x2)                         t = -4.3487
    Ho: diff = 0                          degrees of freedom = 98

    Ha: diff < 0             Ha: diff != 0               Ha: diff > 0
    Pr(T < t) = 0.0000       Pr(|T| > |t|) = 0.0000      Pr(T > t) = 1.0000

To illustrate the test
when we do not assume variances are equal, generate two normal variables that have N(1,1) and N(2,4) distributions

    drawnorm x3 x4, n(50) means(1 2) sds(1 2) seed(12345)

The command ttest now has the option unequal

    ttest x3 == x4, unpaired unequal

    Two-sample t test with unequal variances

    Variable |  Obs        Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
          x3 |   50    .9804798     .136642    .9662047    .7058875   1.255072
          x4 |   50    1.707054    .2941624    2.080042    1.115913   2.298196
    combined |  100    1.343767     .165433     1.65433    1.015512   1.672022
        diff |        -.7265745    .3243494                 -1.3736  -.0795493

        diff = mean(x3) - mean(x4)                         t = -2.2401
    Ho: diff = 0           Satterthwaite's degrees of freedom = 69.2049

    Ha: diff < 0             Ha: diff != 0               Ha: diff > 0
    Pr(T < t) = 0.0141       Pr(|T| > |t|) = 0.0283      Pr(T > t) = 0.9859

The degrees of freedom are calculated to be 69.2049, and again the two tail test rejects the null hypothesis that the population means are equal.

C.8 TESTING THE EQUALITY OF TWO NORMAL POPULATION VARIANCES

Given two normal populations, denoted $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, we can test the null hypothesis $H_0: \sigma_1^2/\sigma_2^2 = 1$. If the null hypothesis is true, then the population variances are equal. The test statistic is derived from the results that $(N_1-1)\hat{\sigma}_1^2/\sigma_1^2 \sim \chi^2_{(N_1-1)}$ and $(N_2-1)\hat{\sigma}_2^2/\sigma_2^2 \sim \chi^2_{(N_2-1)}$. The ratio

$$F = \frac{\dfrac{(N_1-1)\hat{\sigma}_1^2/\sigma_1^2}{(N_1-1)}}{\dfrac{(N_2-1)\hat{\sigma}_2^2/\sigma_2^2}{(N_2-1)}}$$

If the null hypothesis $H_0: \sigma_1^2/\sigma_2^2 = 1$ is true then the test statistic is $F = \hat{\sigma}_1^2/\hat{\sigma}_2^2$, which has an F-distribution with $(N_1-1)$ numerator and $(N_2-1)$ denominator degrees of freedom. If the alternative hypothesis is $H_1: \sigma_1^2/\sigma_2^2 \ne 1$, then we carry out a two-tail test. If we choose level of significance $\alpha = .05$, then we reject the null hypothesis if $F \ge F_{(.975,N_1-1,N_2-1)}$ or if $F \le F_{(.025,N_1-1,N_2-1)}$, where $F_{(\alpha,N_1-1,N_2-1)}$ denotes the $100\alpha$-percentile of the F-distribution with the specified degrees of freedom. If the alternative is one-sided, $H_1: \sigma_1^2/\sigma_2^2 > 1$, then we reject the null hypothesis if $F \ge F_{(.95,N_1-1,N_2-1)}$.

Using the simulated variables x3 and x4, the test is carried out
using the automatic command sdtest

    sdtest x3 == x4

    Variance ratio test

    Variable |  Obs        Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
          x3 |   50    .9804798     .136642    .9662047    .7058875   1.255072
          x4 |   50    1.707054    .2941624    2.080042    1.115913   2.298196
    combined |  100    1.343767     .165433     1.65433    1.015512   1.672022

        ratio = sd(x3) / sd(x4)                            f = 0.2158
    Ho: ratio = 1                         degrees of freedom = 49, 49

    Ha: ratio < 1            Ha: ratio != 1              Ha: ratio > 1
    Pr(F < f) = 0.0000       2*Pr(F < f) = 0.0000        Pr(F > f) = 1.0000

C.9 TESTING NORMALITY

The normal distribution is symmetric, and has a bell-shape with a peakedness and tail-thickness leading to a kurtosis of 3. Thus we can test for departures from normality by checking the skewness and kurtosis from a sample of data. If skewness is not close to zero, and if kurtosis is not close to 3, then we would reject the normality of the population. In Appendix C.4.2 we developed sample measures of skewness and kurtosis.

The Jarque-Bera test statistic allows a joint test of these two characteristics,

$$JB = \frac{N}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$$

where $S$ is the sample skewness and $K$ is the sample kurtosis. If the true distribution is symmetric and has kurtosis 3, which includes the normal distribution, then the JB test statistic has a chi-square distribution with 2 degrees of freedom if the sample size is sufficiently large. If $\alpha = .05$ then the critical value of the $\chi^2_{(2)}$ distribution is 5.99. We reject the null hypothesis and conclude that the data are non-normal if $JB \ge 5.99$. If we reject the null hypothesis then we know the data have non-normal characteristics, but we do not know what distribution the population might have.

Clear memory and open hip.dta

    use hip, clear

Stata offers a number of automatic tests. The nature of the tests is beyond the scope of this book. They offer one test that is similar to, but not exactly the same as, the Jarque-Bera test. It is implemented using

    sktest y

    Skewness/Kurtosis tests for Normality
                                                   ------- joint ------
    Variable |  Pr(Skewness)   Pr(Kurtosis)   adj chi2(2)    Prob>chi2
           y |      0.965          0.290          1.17         0.5569

The Jarque-Bera test follows, using the skewness and kurtosis values
generated by summarize

    quietly summarize y, detail
    scalar nobs = r(N)
    scalar s = r(skewness)
    scalar k = r(kurtosis)
    scalar jb = (nobs/6)*(s^2 + ((k-3)^2)/4)

The critical value and p-value are:

    scalar chi2_95 = invchi2(2,.95)
    scalar pval = 1 - chi2(2,jb)

Display these results

    di "jb test statistic " jb
    jb test statistic .93252312

    di "95th percentile chi2(2) " chi2_95
    95th percentile chi2(2) 5.9914645

    di "pvalue " pval
    pvalue .62734317

Close the log file

    log close
    translate appx_c.smcl appx_c.txt

C.10 MAXIMUM LIKELIHOOD ESTIMATION

Stata offers a powerful general command for maximizing likelihood functions. Enter help ml. These commands are far beyond the scope of POE. Advanced users may wish to consider Maximum Likelihood Estimation with Stata, 3rd Edition, by Gould, Pitblado and Sribney, Stata Press, 2006, which is available on www.stata.com.

Key Terms
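As a cross-check on the Jarque-Bera results in Section C.9, the critical value and p-value can be verified by hand. This sketch relies on a standard property of the chi-square distribution with 2 degrees of freedom (its cdf has a closed form), which is an outside fact and not something stated in the text. The symbols $x_{.95}$ and $p$ are introduced here only for illustration.

$$
P\left(\chi^2_{(2)} \le x\right) = 1 - e^{-x/2}
\quad\Longrightarrow\quad
x_{.95} = -2\ln(0.05) \approx 5.9915,
\qquad
p = e^{-JB/2} = e^{-0.93252312/2} \approx 0.6273
$$

Under that assumption, both numbers agree with the values Stata reports for chi2_95 and pval. The p-value is well above .05, so the hip data give no evidence against normality.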