WBS seminar ('17/12/23)
Introduction of Empirical Analysis using Stata: For Beginners

Lecturer: Tohru Yoshioka-Kobayashi
Project Research Associate, Department of Technology Management for Innovation, Graduate School of Engineering, the University of Tokyo
t-koba@tmi.t.u-tokyo.ac.jp

Acknowledgements: Mr Kisa Sugihara and Mr Akihiro Kawamura made a great contribution to the English translation.
This lecture material may be reused under a Creative Commons Attribution license.
Please note that some parts do not treat statistical rigor adequately.

0.Introduction

Introduction of the lecturer
• Researcher in MOT: '15 Ph.D. in Engineering from UTokyo
  • Studying organizational management in technology and design development
• Researcher in IP policy: '07 Master in Law from Osaka-U
  • Seeking policy implications in intellectual property law
• Career
  • Assistant in legal affairs in a university start-up (Signpost, Corp.)
    → Policy analyst in a private think tank (Mitsubishi Res. Inst.)
    → Hitotsubashi Univ. & Univ. of Tokyo
Goal
• We will learn the basic knowledge and skills needed to reveal (or prove) a causal relationship
• Even those who are not strong in mathematics will be able to run analyses by themselves after the seminar
  • The contents of the lecture are based on statistics, but no formulas are used
  • We focus on general-purpose analytical methods
• We use Stata®

Agenda
I Preparation for the Analysis: How to Load data
II Descriptive Statistics and Graphs
III Data Processing
IV Regression Analysis
V Reporting of Regression Results

Empirical analysis procedures
① Setting research questions
② Literature review
③ Causal model design
④ Search for statistics and other data sources
⑤ Perform simple verification
⑥ Collecting data
⑦ Cleaning data (data cleansing)
⑧ Creating a data set
⑨ Analysis
⑩ Discussion (interpretation)
(Phase labels on the slide: Carried out in the head / With data / Embodiment)

I Preparation for the Analysis: How to Load data

1) Characteristics of statistical analysis software

                  Stata                  SPSS                 R                         GRETL
Features          High                   High                 Medium (High w/add-in)    Medium (High w/add-in)
User experience   Good                   Good                 Bad                       Good
Price             High                   High                 Free                      Free
Support           Official support       Official support     A variety of information  Information online
                  + a couple of books    + books              online + books

Characteristics listed on the slide: strong in the analysis of the social sciences; strong in data processing; strong in the analysis of economics; a little strong in the analysis of the natural sciences.

2) Data to use
• SampleData_OECD.txt
  • Created from OECD, Main Science and Technology Indicators
  • Tab-separated data
• Records the following values for 2008 and 2013, and their growth in 2013 compared to 2008
  • Workforce population (thousands)
  • PCT patent applications (number of patents): the number of patent applications intended to be filed in foreign countries
  • Industry value added (US $ million)
  • Technology trade received (US $ million)
  • Technology trade payments (US $ million)
  • Technical trade balance (US $ million): amount received - payment

Data items

Variable name                Content
Country                      Country
Region_Narrow                Region name
Region_Broad                 Continent name
Laborforce_2008_thousands    Workforce population (thousands) (2008)
Laborforce_2013_thousands    (2013)
Laborforce_growthrate        Growth (2008-2013)
Laborforce_growth_dummy      Dummy variable: takes 1 if the labor force population growth rate > (threshold missing in source)
pctpatentapplication_2008    Number of international patent applications (2008)
pctpatentapplication_2013    (2013)
Pct_growthrate               Growth rate (2008-2013)
Valueadded_2008_m_usd        Industry value added (US $ million) (2008)
Valueadded_2013_m_usd        (2013)
Valueadded_growthrate        Growth rate (2008-2013)
ValueAdded_Growth_M_USD      Growth value (2008-2013)
Techreceipts_2008_m_usd      Technology trade received (US $ million) (2008)
Techreceipts_2013_m_usd      (2013)
Techreceipts_growthrate      Growth rate (2008-2013)
Techpayments_2008_m_usd      Technology trade payments (US $ million) (2008)
Techpayments_2013_m_usd      (2013)
Techpayments_growthrate      Growth rate (2008-2013)
Techbalance_2008_m_usd       Technical trade balance of payment (US $ million) (2008)
Techbalance_2013_m_usd       (2013)
Techbalance_growth_m_usd     Growth value (US $ million) (2008-2013)
Techbalance_growth_dummy     Dummy variable: takes 1 if the technology trade balance growth rate > (threshold missing in source)
Asiapacific_dummy            Dummy variable: takes 1 if the country is in Asia or the Pacific (including North America)
Europe_dummy                 Dummy variable: takes 1 if the country is in Europe
Eu_dummy                     Dummy variable: takes 1 if the country is one of the EU members

Questions to be solved
• What factors increase industry value added?
• What factors increase the technology balance of payments?
• Important limitation: we can examine only what the available data cover

V Reporting of Regression Results

1) Reporting of Regression Results

Common practice
• Examples of regression results: Keller, R. T. (2001). Cross-functional project groups in research and new product development: Diversity, communications, job stress, and outcomes. Academy of Management Journal, 44(3), 547-555.

Exercise in Stata
• Set up add-ins: outreg2, mkcorr

    * Install outreg2 (you need to do this only once)
    ssc install outreg2
    * Install mkcorr (you need to do this only once)
    ssc install mkcorr

• Export descriptive statistics
  You can export in MS Word format. Select the variables to export in keep().

    * Create a new "desc_stat.doc" file and export descriptive statistics
    outreg2 using desc_stat.doc, replace sum(log) keep(valueadded_growthrate pct_growthrate laborforce_growthrate eu_dummy)

  Results: the file (desc_stat.doc) is saved in the folder indicated in the status bar.

• Export correlation matrix

    * Export the correlation matrix to a text file
    mkcorr valueadded_growthrate pct_growthrate laborforce_growthrate eu_dummy, log(corr_matrix.txt)

• Export regression results

    * Regression analysis
    regress valueadded_growthrate laborforce_growthrate eu_dummy
    * Create a new file "regress_res.doc" and export the results to it
    outreg2 using regress_res.doc, replace ctitle(Model 1)
    * Another regression analysis
    regress valueadded_growthrate pct_growthrate laborforce_growthrate eu_dummy
    * Append the results to the same file
    outreg2 using regress_res.doc, append ctitle(Model 2)

• Export regression results: Results

2) Visualization of Regression Results

Exercise in Stata
• Plot estimated marginal effect
  • Graphs showing marginal effects with confidence intervals

    * Plot the marginal effect with confidence intervals
    graph twoway lfitci valueadded_growthrate pct_growthrate
    * Plot the marginal effect with confidence intervals and the original data
    graph twoway (lfitci valueadded_growthrate pct_growthrate) (scatter valueadded_growthrate pct_growthrate)

  [Figure: fitted values with 95% CI and scatter of ValueAdded_GrowthRate against PCT_GrowthRate]

• Plot estimated results
  • The predictions are split by whether the country is in the EU or not; the other variables are held at their average values

    * Run immediately after the regression: store the estimated results in a variable
    * Here, we use the mean value of laborforce_growthrate
    adjust laborforce_growthrate, by(eu_dummy) gen(p2_va_gr)
    * Show the estimates (blue: EU, red: non-EU)
    twoway (scatter p2_va_gr pct_growthrate if eu_dummy==1, mcolor(blue)) (scatter p2_va_gr pct_growthrate if eu_dummy==0, mcolor(red)), legend(order(1 "EU" 2 "Non-EU")) ytitle("Value Added Growth")

  [Figure: predicted Value Added Growth against PCT_GrowthRate, EU vs. Non-EU. The y-axis title can be changed in ytitle(); the legend labels can be changed in legend(order( )).]

Appendix: For further improvement

Variations of regressions for causality analysis
• Variations of estimation models corresponding to the characteristics of the dependent variable
  • Dependent variable = dummy variable
    • Example: surplus in the technology balance of payments
    • Logistic regression
      • Logit model regression
      • Probit model regression
  • Dependent variable has a cut-off point
    • Example: longitudinal performance of engineers (which suddenly decreases due to retirement, job rotation, and other life events)
    • Tobit model
  • Dependent variable = count & natural number
    • Example: number of inventions in an organization (the number of inventors who generate n inventions is 1/n^2 of all inventors (Narin & Breitzman, 1995))
    • Poisson model
    • Negative binomial model

Variations of estimation models to reveal causality
• Omitted variable bias prevention
  • Panel data analysis
    • Use time series data and exclude unobservable effects of individuals
    • Fixed effect model
    • Random effect model
  • Difference-in-differences
  • Regression discontinuity
• Estimation of quantities other than the mean value
  • Quantile regression

... HS_S_UNIV_R

I Preparation for the analysis

3) Data for the experienced
Questions
• What factors influence the English skills of high school students?

4) Load data
... contain Japanese and symbols

File format
• It is best to read the Excel file
  • It is possible for STATA (though the old version does not work)...

Exercise in Stata
• [i] Keep "tab-delimited data" checked in advance
• [ii] Click on Browse
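The menu-based loading step can also be run from the Command window. The lines below are a minimal sketch, not the lecturer's own procedure. It assumes that SampleData_OECD.txt sits in the current working directory, that its first row holds the variable names, and that your Stata version provides import delimited (Stata 13 or later).

```stata
* Assumption: SampleData_OECD.txt is in the current working directory.
* Show the working directory (use "cd" to change it if needed).
pwd

* Read the tab-separated file.
*   delimiters("\t") : values are separated by tabs
*   varnames(1)      : the first row contains the variable names
*   case(lower)      : store variable names in lowercase
*   clear            : discard any data currently in memory
import delimited using "SampleData_OECD.txt", delimiters("\t") varnames(1) case(lower) clear

* Check what was loaded
describe
summarize valueadded_growthrate pct_growthrate laborforce_growthrate eu_dummy
```

With case(lower), a column such as Valueadded_growthrate is stored as valueadded_growthrate, which matches the lowercase names used in the regression commands. On older Stata versions, insheet using "SampleData_OECD.txt", tab clear plays the same role.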