Graduate studies: STATA analysis materials

Pham Thi Bich Ngoc, Ph.D. (University of Kiel, Germany)
FEC/Hoa Sen University
ngoc.phamthibich@hoasen.edu.vn
University of Economics Ho Chi Minh City, 03 June 2014

• Learn and use STATA? http://www.ats.ucla.edu/stat/stata/
• "Econometric Analysis of Cross Section and Panel Data" - Jeffrey M. Wooldridge (2010)

• Panel data models combine cross-section and time-series data.
• In panel data the same cross-sectional unit (industry, firm, country) is surveyed over time, so the data are pooled over space as well as time.

1. Panel data can take explicit account of individual-specific heterogeneity ("individual" here refers to the micro-unit).
2. By combining data in two dimensions, panel data gives more variation, less collinearity and more degrees of freedom.
3. Panel data is better suited than cross-sectional data for studying the dynamics of change. It is well suited to understanding transition behaviour, for example company bankruptcy or merger.
4. Panel data is better at detecting and measuring effects that cannot be observed in either cross-section or time-series data.
5. Panel data enables the study of more complex behavioural models, for example the effects of technological change or economic cycles.
6. Panel data can minimise the effects of aggregation bias that arises from aggregating firms into broad groups.

If all the cross-sectional units have the same number of time-series observations the panel is balanced; if not, it is unbalanced.
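The balanced/unbalanced distinction can be checked directly in STATA. The lines below are a minimal sketch, assuming an internet connection so that `webuse` can fetch Stata's example copy of the Grunfeld investment data; the variable names `company` and `year` belong to that example file, not to the course dataset, and `n_obs` is a helper name introduced only for this illustration.

```stata
* Load Stata's example Grunfeld data (assumption: webuse can reach the web)
webuse grunfeld, clear

* Declare the panel: company is the cross-sectional unit, year the time index.
* The xtset output states whether the panel is balanced.
xtset company year

* Show the participation pattern of units over time
xtdescribe

* Count observations per company; in a balanced panel every unit has the same count
bysort company: generate n_obs = _N
tabulate n_obs
```

If `tabulate n_obs` shows a single value, every unit has the same number of time-series observations, which is the balanced case described above.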
\[
\begin{bmatrix}
y_{11} & y_{21} & \cdots & y_{i1} & \cdots & y_{N1} \\
y_{12} & y_{22} & \cdots & y_{i2} & \cdots & y_{N2} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
y_{1t} & y_{2t} & \cdots & y_{it} & \cdots & y_{Nt} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
y_{1T} & y_{2T} & \cdots & y_{iT} & \cdots & y_{NT}
\end{bmatrix}
\]

Rows run over the time series and columns over the cross section: a matrix of balanced panel data observations on variable y, with N cross-sectional observations and T time-series observations.

• Grunfeld and Griliches [1960]
◦ i = 10 firms: GM, CH, GE, WE, US, AF, DM, GY, UN, IBM; t = 20 years: 1935-1954
◦ I_it = Gross investment
◦ F_it = Market value
◦ C_it = Value of the stock of plant and equipment

\[
I_{it} = \alpha_i + \beta F_{it} + \gamma C_{it} + \varepsilon_{it}
\]

• y_it = Real per capita GDP
• s_i = Average saving rate (over 1960-1985)
• n_i = Average population growth rate (over 1960-1985)
• g + δ = 5%
• COM_i = 1 if communist, 0 otherwise
• OPEC_i = 1 if OPEC, 0 otherwise

\[
y_{it} = \alpha_t + \rho\, y_{i,t-1} + \beta_1 \ln(s_i) + \beta_2 \ln(n_i + g + \delta) + \gamma_1 COM_i + \gamma_2 OPEC_i + \varepsilon_{it}
\]

• LWAGE = log of wage = dependent variable in regressions
• EXP = work experience
• WKS = weeks worked
• OCC = occupation, 1 if blue collar
• IND = 1 if manufacturing industry
• SOUTH = 1 if resides in south
• SMSA = 1 if resides in a city (SMSA)
• MS = 1 if married
• FEM = 1 if female
• UNION = 1 if wage set by union contract
• ED = years of education
• BLK = 1 if individual is black

• Two basic windows
◦ Command
◦ Results
• Optional windows
◦ Variable list
◦ History of commands
• Other functions
◦ Data browser/editor
◦ Do file editor
◦ Viewer (for log, help files, etc.)

[...] sectorcode / =2007

• summarize [varlist] [, detail]
◦ Number of observations, mean, SD, range
◦ ", detail" gives more detail (median, etc.)
◦ E.g. sum lnY lnK lnL lnM
• ci [varlist]
◦ Mean, standard error of the mean, and confidence intervals
◦ Also works for dichotomous variables
◦ E.g. ci lnY lnK lnL lnM
• histogram [...] stats(rho p)]
◦ E.g. correlation between lnY, lnK, lnL and lnM?
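The descriptive commands above can be tried without the course dataset. This is a rough sketch under the assumption that `webuse` can download Stata's example Grunfeld file; `invest`, `mvalue` and `kstock` are that file's variable names and stand in for lnY, lnK, lnL and lnM only for illustration. The `ci` command is left out because its syntax differs between Stata releases.

```stata
* Load Stata's example Grunfeld data (assumption: internet access for webuse)
webuse grunfeld, clear

* Number of observations, mean, SD and range for several variables at once
summarize invest mvalue kstock

* Percentiles (including the median), skewness and kurtosis for one variable
summarize invest, detail

* Distribution of gross investment
histogram invest

* Pairwise Pearson correlations with significance levels
pwcorr invest mvalue kstock, sig

* Spearman rank correlation, reporting rho and its p-value
spearman invest mvalue, stats(rho p)
```

`pwcorr` and `spearman` answer the same question (how strongly the variables move together) but the rank-based version is less sensitive to the heavy right tail that the histogram is likely to show for firm-level data.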
• Log-file open/suspend/close
• Do-file editor
• Browse and Edit
• Break

• Open draft-student.dta
• Create .do file/.log file
• A 3-factor Cobb-Douglas.

[...] sectorcode<11

• scatter [var1] [var2]
◦ Scatterplot of the two variables
◦ Extension: twoway lfit [var1] [var2]
  twoway scatter [var1] [var2] || lfit [var1] [var2] ||,

regress depvar [indepvars] [if] [in] [weight] [, options]
regress fits a model of depvar on indepvars using linear regression.
• regress lnY lnK lnL lnM horizontal Bam Bch

Checking homoscedasticity of residuals
rvfplot, yline(0)

xtset id year
xtreg lnY lnK lnL lnM …
xtreg lnY lnK lnL lnM … i.year
xtreg lnY lnK lnL lnM … i.year i.industry
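The regress, rvfplot and xtreg steps in this deck can be rehearsed on the Grunfeld investment equation. The sketch below assumes `webuse` can fetch Stata's example Grunfeld file, whose variables `invest`, `mvalue` and `kstock` play the roles of I, F and C; it is an illustration on public data, not the course's own specification.

```stata
* Load Stata's example Grunfeld data and declare the panel structure
webuse grunfeld, clear
xtset company year

* Pooled OLS: one common intercept for all firms
regress invest mvalue kstock

* Residuals against fitted values; a fan shape suggests heteroscedasticity
rvfplot, yline(0)

* Random-effects estimator (what xtreg fits when no option is given)
xtreg invest mvalue kstock, re

* Fixed-effects (within) estimator: a separate intercept for each firm
xtreg invest mvalue kstock, fe

* Fixed effects plus year dummies to absorb shocks common to all firms
xtreg invest mvalue kstock i.year, fe
```

Because `xtreg` without an option fits the random-effects model, writing `, fe` or `, re` explicitly makes it clear which estimator a reported result comes from.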
