GRADUATE COURSE MATERIALS: DATA ANALYSIS WITH STATA 1


Page 1

Pham Thi Bich Ngoc, Ph.D (University of Kiel, Germany)

FEC/Hoa Sen University

ngoc.phamthibich@hoasen.edu.vn

UNIVERSITY OF ECONOMICS HO CHI MINH CITY, 03 June 2014

Page 2

• Learn and use STATA?

http://www.ats.ucla.edu/stat/stata/

• “Econometric Analysis of Cross Section and Panel Data” - Jeffrey M. Wooldridge (2010)

Page 3

• These are models that combine cross-section and time-series data

• In panel data the same cross-sectional unit (industry, firm, country) is surveyed over time, so we have data which is pooled over space as well as time

Page 4

1. Panel data can take explicit account of individual-specific heterogeneity (“individual” here means related to the micro-unit)

2. By combining data in two dimensions, panel data gives more data variation, less collinearity and more degrees of freedom

3. Panel data is better suited than cross-sectional data for studying the dynamics of change, for example company bankruptcy or merger

Page 5

4. Panel data is better at detecting and measuring effects that cannot be observed in either cross-section or time-series data

5. Panel data enables the study of more complex behavioural models, for example the effects of technological change or economic cycles

6. Panel data can minimise the effects of aggregation bias from aggregating firms

Page 6

If all the cross-sectional units have the same number of time-series observations, the panel is balanced; if not, it is unbalanced.

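\[
\begin{bmatrix}
y_{11} & y_{21} & \cdots & y_{i1} & \cdots & y_{N1} \\
y_{12} & y_{22} & \cdots & y_{i2} & \cdots & y_{N2} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
y_{1t} & y_{2t} & \cdots & y_{it} & \cdots & y_{Nt} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
y_{1T} & y_{2T} & \cdots & y_{iT} & \cdots & y_{NT}
\end{bmatrix}
\]

Rows: time series (t = 1, ..., T). Columns: cross section (i = 1, ..., N).

- A matrix of balanced panel data observations on variable y, with N cross-sectional observations and T time-series observations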
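
The balanced/unbalanced distinction can be checked directly in Stata. The sketch below is only an illustration: it assumes internet access for `webuse` and uses `grunfeld`, one of Stata's example datasets, rather than the course data. The variable name `T_i` is made up for this example.

```stata
* Sketch: check whether a panel is balanced (run as a do-file).
* Assumption: grunfeld is fetched with webuse; it is not the course dataset.
webuse grunfeld, clear

* Declare the panel structure: unit identifier first, then the time variable.
* xtset itself reports whether the panel is balanced.
xtset company year

* Show the participation pattern of the units over time.
xtdescribe

* Count time-series observations per unit (T_i is a hypothetical name).
bysort company: generate T_i = _N
summarize T_i

* In a balanced panel every unit has the same number of observations.
display cond(r(min) == r(max), "balanced", "unbalanced")
```

If the minimum and maximum of `T_i` differ, at least one unit is observed for fewer periods than the others. Note that equal counts alone do not rule out gaps in time; the `xtset` and `xtdescribe` output shows those.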

Page 7

• Grunfeld and Griliches [1960]

◦ i = 10 firms: GM, CH, GE, WE, US, AF, DM, GY, UN,

Page 8

• $y_{it}$ = Real per capita GDP

• $s_i$ = Average saving rate (over 1960-1985)

• $n_i$ = Average population growth rate (over 1960-1985)

• $g + \delta$ = 5%

• $COM_i$ = 1 if communist, 0 otherwise

• $OPEC_i$ = 1 if OPEC, 0 otherwise

Page 9

• LWAGE = log of wage = dependent variable in regressions

• EXP = work experience

• WKS = weeks worked

• OCC = occupation, 1 if blue collar,

• IND = 1 if manufacturing industry

• SOUTH = 1 if resides in south

• SMSA = 1 if resides in a city (SMSA)

Page 10

• Two basic windows

Page 11

• The usual: open, save, print

• Log-file open/suspend/close

• Do-file editor

• Browse and Edit

• Break
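
Most of these toolbar buttons have command equivalents, which is handy inside a do-file. A minimal sketch follows; the log file name `session1.log` is a hypothetical placeholder, and `auto` is an example dataset shipped with Stata, not the course data.

```stata
* Sketch: command equivalents of the toolbar buttons (run as a do-file).
* Assumption: "session1.log" is a made-up file name.
sysuse auto, clear                       // load an example dataset

log using "session1.log", replace text   // open a plain-text log file
log off                                  // suspend logging
log on                                   // resume logging
describe                                 // this output is written to the log
log close                                // close the log file

doedit                                   // open the Do-file Editor
browse                                   // view the data (read-only)
edit                                     // open the data for editing
```

The last three commands open windows, so they only make sense in an interactive Stata session.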

Page 12

• Open draft-student.dta

• Create a do-file and a .log file

• A 3-factor Cobb-Douglas function (simple):
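
The formula itself did not survive in this copy of the slides. As a hedged reconstruction, the textbook three-factor Cobb-Douglas form and its log-linear version are sketched below. It is an assumption that $K$, $L$ and $M$ stand for capital, labour and materials (inferred from the variable names lnK, lnL, lnM used later), and the symbols $A$, $\alpha$, $\beta$, $\gamma$, $\varepsilon_i$ are chosen here for illustration.

```latex
% Assumed textbook form; the symbols are not taken from the original slide.
\[
Y_i = A \, K_i^{\alpha} \, L_i^{\beta} \, M_i^{\gamma} \, e^{\varepsilon_i}
\]
% Taking logs gives a model that is linear in the parameters:
\[
\ln Y_i = \ln A + \alpha \ln K_i + \beta \ln L_i + \gamma \ln M_i + \varepsilon_i
\]
```

Under this form, the coefficients from a regression of lnY on lnK, lnL and lnM can be read as output elasticities.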

Page 14

• summarize [varlist] [, detail]

◦ Number of observations, mean, SD, range

E.g. sum lnY lnK lnL lnM

Page 15

• histogram varname

◦ Simple histogram of your variable

◦ E.g. histogram lnY

• histogram lnY, frac by(D7, title("Firm Sales in 2007 and the Rest") subtitle("(in VND)"))

• qnorm varname

◦ Quantile plot of your variable to check normality

◦ E.g. qnorm lnY

Page 16

• regress lnY lnK lnL lnM horizontal Bam Bch

• predict r, resid

• kdensity r, normal
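
The three commands above form a small workflow: fit the model, store the residuals, then compare their density with a normal curve. Here is a self-contained sketch of the same steps, assuming Stata's shipped `auto` dataset and the variables `price`, `weight` and `mpg` as stand-ins for the course data.

```stata
* Sketch: residual normality check on an example dataset.
* Assumption: auto and its variables replace draft-student.dta here.
sysuse auto, clear

* Fit a linear regression.
regress price weight mpg

* Store the residuals of the last regression in a new variable r.
predict r, resid

* Kernel density of the residuals with a normal density overlaid.
kdensity r, normal

* Quantile plot of the residuals against the normal distribution.
qnorm r
```

Residuals that are close to normal follow the overlaid curve in the first graph and the diagonal in the second.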

Page 17

• tabulate [varname]

◦ Counts and percentages

◦ (see also table, which is very different!)

Page 18

• tabulate [var1] [var2]

◦ “Cross-tab”

◦ Descriptive options

E.g. tab D7 sectorcode if sectorcode<11

Page 19

• scatter [var1] [var2]

◦ Scatterplot of the two variables

twoway lfit [var1] [var2]

twoway scatter [var1] [var2] || lfit [var1] [var2] ||, by(var3, total row(1))

http://www.stata.com/support/faqs/graphics/gph/graphdocs/twoway-linear-prediction-plot/index.html

E.g. graph lnY against lnK (linear fit and scatter plots)
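
To see what the overlay syntax produces without the course data, the following sketch uses Stata's shipped `auto` dataset. The variables `mpg`, `weight` and `foreign` are assumptions standing in for `[var1]`, `[var2]` and `var3`.

```stata
* Sketch: scatter plot with a fitted line, on an example dataset.
* Assumption: auto and its variables replace the course data here.
sysuse auto, clear

* Scatter plot with the linear fit overlaid in the same graph.
twoway (scatter mpg weight) (lfit mpg weight)

* Same overlay, one panel per group plus a "Total" panel, all in one row.
twoway scatter mpg weight || lfit mpg weight ||, by(foreign, total row(1))
```

The `||` separator and the parenthesised form are two ways of writing the same overlay.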

Page 20

• pwcorr [varlist] [, sig]

◦ Pairwise correlations between variables

◦ “sig” option gives p-values

• spearman [varlist] [, stats(rho p)]

E.g. what is the correlation between lnY, lnK, lnL and lnM?

Page 21

• regress depvar [indepvars] [if] [in] [weight] [, options]

regress fits a model of depvar on indepvars using linear regression

• regress lnY lnK lnL lnM horizontal Bam Bch

• Checking homoscedasticity of residuals

• rvfplot, yline(0)
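
The residual-versus-fitted plot is a visual check. As a hedged complement that is not part of the original slide, a formal test and robust standard errors can be run after the same regression. The sketch assumes Stata's shipped `auto` dataset in place of the course data.

```stata
* Sketch: checking for heteroskedasticity after regress.
* Assumption: auto and its variables replace the course data here.
sysuse auto, clear
regress price weight mpg

* Residuals against fitted values, with a reference line at zero.
rvfplot, yline(0)

* Breusch-Pagan / Cook-Weisberg test; H0: constant variance.
estat hettest

* If the variance does not look constant, one common response is to
* re-estimate with heteroskedasticity-robust standard errors.
regress price weight mpg, vce(robust)
```

A fan or funnel shape in the plot, or a small p-value in the test, points away from homoscedasticity.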

Page 22

xtset id year

xtreg lnY lnK lnL lnM …

xtreg lnY lnK lnL lnM … i.year

xtreg lnY lnK lnL lnM … i.year i.industry
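
The elided regressors above are left as they appear in the slides. For a runnable illustration of the same sequence, the sketch below assumes internet access for `webuse` and uses Stata's `grunfeld` example dataset with its variables `invest`, `mvalue` and `kstock`, not the course data.

```stata
* Sketch: panel regressions on an example dataset (run as a do-file).
* Assumption: grunfeld is fetched with webuse; it is not the course dataset.
webuse grunfeld, clear

* Declare the panel: unit identifier, then time variable.
xtset company year

* With no estimator option, xtreg fits the random-effects (re) model.
xtreg invest mvalue kstock

* Fixed-effects (within) estimator.
xtreg invest mvalue kstock, fe

* Add year dummies using factor-variable notation.
xtreg invest mvalue kstock i.year, fe

* Joint test that all year dummies are zero.
testparm i.year
```

One point to keep in mind with the slide's last specification: under `fe`, regressors that do not vary over time within a unit (for example, industry dummies for firms that never change industry) are absorbed by the unit effect and dropped.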
