ECONOMETRICS REPORT
Class: KTEE309.2
Instructor: PhD Dinh Thi Thanh Binh

ABOUT US
Nguyễn Nam Anh - 1811150046
Vũ Tuấn Đức - 1810150006
Nguyễn Mạnh Hùng - 1811150083
Phạm Văn Trọng - 1811150138

OUR TOPIC
FACTORS THAT AFFECT STUDENTS' GPA

Econometrics Report - KTEE309.2
Ha Noi, December 2019

TABLE OF CONTENTS
I, Introduction
II, Literature review
   1, Question of interest
   2, Some background analysis into the topic
   3, Methodology
   4, Procedure and program used
III, Economic model
   1, Specifying the object for modeling
   2, Defining the target for modeling by the choice of the variables to analyze, denoted {xi}
   3, Embedding that target in a general unrestricted model (GUM)
IV, Econometric model
V, Data collection
   1, Data overview
   2, Data description
VI, Estimation of econometric model
   1, Checking the correlation among variables
   2, Regression run
VII, Diagnosing the model problem
   1, Normality
   2, Multicollinearity
   3, Heteroscedasticity
VIII, Hypothesis postulated
IX, Result analysis & Policy implication
X, Conclusion
XI, References
XII, Appendix

TABLE OF FIGURES
Exhibit 1: Difference in personal information and GPA
Exhibit 2: Difference in time-spending compared to GPA
Exhibit 3: Definition of variables in the GPA model
Exhibit 4: Statistic indicators of variables in the GPA model
Exhibit 5: Correlation matrix
Exhibit 6: Scatterplot of variables in GPA model
Exhibit 7: Regression model
Exhibit 8: Histogram plot indicating normality
Exhibit 9: Skewness/Kurtosis tests for normality
Exhibit 10: Multicollinearity test
Exhibit 11: Heteroscedasticity test
Exhibit 12: Residual-versus-fitted plot of the model
Exhibit 13: Correcting heteroscedasticity

I, INTRODUCTION
Studying well is always a personal concern for
students, no matter what level of education they are at. From our perspective, studying at university requires many interpersonal skills as well as the flexible application of different study methods to get good results. As this topic is popular and practical among students, we chose the topic "Factors that affect students' GPA" for this report. To reach this goal, our team conducted a survey and used an econometric model to analyze the situation. This report shows our working process, which begins with collecting and processing data, then applies an econometric model to analyze these factors, and ends with some recommendations for students on how to achieve a better GPA in the future.

Since the duration of the research was very limited, there are still deficiencies in this report. We therefore hope to receive the review and comments of Dr Dinh Thi Thanh Binh to develop the topic and improve this report. All in all, we believe that this report can help students' performance at school in some way and can give readers a decent view of the data set as well as of the knowledge we have gained through the course.

II, LITERATURE REVIEW

1, Question of interest
As stated above, learning technique is one of the most common concerns for students, especially undergraduates at university. It is understandable that, because of the change in learning environment and differences in social circumstances, some university students are unable to perform at their best and, as a result, obtain a low grade point average (GPA). At Hanoi University of Science and Technology or the National University of Civil Engineering, for example, about 40% of students graduate within the standard number of course years, and up to 15% are expelled because of very low academic results. Therefore, in this
research, we analyze the factors that affect students' GPA by running a regression model and testing hypotheses in order to understand these effects. Because we conducted the survey on social networks, all 152 observations (answers) are up to date, and we consider the result objective enough to rely on.

2, Some background analysis into the topic
While looking for materials for this research, we came across an article named «Determinants of academic performance for undergraduates in Can Tho University of Technology», published on 27 October 2016. The following are some of the results reported in the article, using median hypothesis testing.

Exhibit 1: Difference in personal information and GPA (Source: sj.ctu.edu.vn)
GPA is out of 4.

Criteria                              Choice (a)          Choice (b)          GPA (a)   GPA (b)   Difference (a - b)
Gender                                Male                Female              2.278     2.381     -0.103 *
Form of enrollment                    First enrollment    Second enrollment   2.119     2.275     -0.156 *
Current accommodation                 Rented house        Family home         2.239     2.217      0.022 ns
Study materials                       Efficient           Inefficient         2.177     2.194     -0.017 ns
Monitors/Class operators              Yes                 No                  2.498     2.290      0.208 *
Joining club                          Yes                 No                  2.394     2.299      0.095 *
Having part-time job                  Yes                 No                  2.324     2.320      0.004 ns
Joining extra-curricular programmes   Yes                 No                  2.329     2.313      0.016 ns

Exhibit 2: Difference in time-spending compared to GPA (Source: sj.ctu.edu.vn)
GPA is out of 4.

Criteria                    Choice (a)          Choice (b)          GPA (a)   GPA (b)   Difference (a - b)
Time for surfing the web    < 3.6 hours a day   > 3.6 hours a day   2.359     2.312      0.047 ns
Time for self-study         < 2.7 hours a day   > 2.7 hours a day   2.291     2.342     -0.051 ns
Time for revisions          Yes                 No                  2.339     2.241      0.098 **
Class-skipping              Yes                 No                  2.281     2.423     -0.142 *
Group studying              Yes                 No                  2.330     2.262      0.068 ***

Note: *, **, ***: statistically significant at the 1%, 5% and 10% levels respectively; ns: not significant at the 10% level.
As shown in the tables above, female students have a higher GPA than male students in general; class monitors/class operators and students who join clubs also have a higher GPA than others, and all of these differences are statistically significant at the 1% level. The research also suggests that time for revisions, class-skipping and group study affect academic results in a statistically significant way. Building on that research, we conduct another study to see how these factors affect students' GPA.

3, Methodology
To obtain data for this research, we conducted an online survey asking people about their GPA and other habits. We apply quantitative and qualitative methods to estimate the effect of each factor on GPA. With the help of Excel, Stata and other software, we analyze the data and present results that show the size of each factor's effect. We also follow the standard steps for analyzing a problem in econometrics, as shown in the next part.

4, Procedure and program used
a, The procedure for analysis includes:
Step 1: Question of interest
Step 2: Economic model
Step 3: Econometric model
Step 4: Data collection
Step 5: Estimation of econometric model
Step 6: Check multicollinearity and heteroscedasticity
Step 7: Hypothesis postulated
Step 8: Result analysis & Policy implication

b, Programs used for the whole research
Google Forms: to collect data and carry out the survey
Google Drive: to store all materials collected for this report, which include many folders and files
Microsoft Excel: to present data and recode some answers to match Stata. The data set is attached to this report
Stata: to analyze the data and run the regression
III, ECONOMIC MODEL
As data are provided up front, the economic model used in this report is an empirical one. Note that the fundamental model is mathematical; with an empirical model, however, data are gathered for the variables and, using accepted statistical techniques, are used to estimate the model's values. Empirical model discovery and theory evaluation are suggested to involve five key steps but, given the limited purpose and resources, this part of the report follows only three of them:
1) Specifying the object for modelling
2) Defining the target for modelling
3) Embedding that target in a general unrestricted model

1, Specifying the object for modeling
GPA = f(x)
As such, this report looks for the relationship between GPA, which is the object for modeling, and each of the related factors.

2, Defining the target for modeling by the choice of the variables to analyze, denoted {xi}
After thorough research, our group chose ten relevant factors: years of education at university, gender, time for clubs, jobs, entertainment, sleep, self-study and hanging out, number of credits, and impact of teachers.

3, Embedding that target in a general unrestricted model (GUM)
In its simplest acceptable representation (which is later specified in the econometric model), the GUM is determined to be:
GPA = f(educ, female, tclb, tjob, tentertain, tsleep, tstudy, tout, ncre, tchimp)

Exhibit 3: Definition of variables in the GPA model

Variable      Definition
gpa           GPA
educ          years of education at university
female        = 1 if female
tclb          time for clubs
tjob          time for jobs
tsleep        time for sleep
tstudy        time for self-study
tout          time for hanging out
tentertain    time for entertainment
tchimp        impact of teachers
ncre          number of credits
2, Regression run
Having checked the correlation among variables, the regression model is ready to run. In Stata, this is done with the command:
reg gpa educ female time1 time2 time3 time4 time5 time6 ncre tchimp
(time1 to time6 appear to be the six time-use variables that the rest of this report calls tclb, tjob, tentertain, tsleep, tstudy and tout.)
The result is shown in Exhibit 7.
Exhibit 7: Regression model

From the result, it can be inferred that:
➢ We have the regression function:
\[
\mathit{gpa} = 3.308301 + 0.0328137\,\mathit{educ} + 0.0436706\,\mathit{female} - 0.0505358\,\mathit{tclb} - 0.0046487\,\mathit{tjob} - 0.1089011\,\mathit{tentertain} - 0.0008117\,\mathit{tsleep} + 0.1164687\,\mathit{tstudy} + 0.0147472\,\mathit{tout} - 0.0147472\,\mathit{ncre} + 0.0878932\,\mathit{tchimp} + \hat{u}
\]
in which the regression coefficients are interpreted as follows, each holding the other variables fixed:
❖ β0 = 3.308301: When all the independent variables are zero, the expected value of GPA is 3.308301.
❖ β1 = 0.0328137: When years of education at university increases by one year, the expected value of GPA increases by 0.0328137.
❖ β2 = 0.0436706: The expected value of GPA for female students is higher than that for male students by 0.0436706.
❖ β3 = −0.0505358: When time for clubs increases by one hour, the expected value of GPA decreases by 0.0505358.
❖ β4 = −0.0046487: When time for jobs increases by one hour, the expected value of GPA decreases by 0.0046487.
❖ β5 = −0.1089011: When time for entertainment increases by one hour, the expected value of GPA decreases by 0.1089011.
❖ β6 = −0.0008117: When time for sleep increases by one hour, the expected value of GPA decreases by 0.0008117.
❖ β7 = 0.1164687: When time for self-study increases by one hour, the expected value of GPA increases by 0.1164687.
❖ β8 = 0.0147472: When time for hanging out increases by one hour, the expected value of GPA increases by 0.0147472.
❖ β9 = −0.0147472: When the number of credits increases by one credit, the expected value of GPA decreases by 0.0147472.
❖ β10 =
0.0878932: When the impact of teachers increases by one unit, the expected value of GPA increases by 0.0878932.

The coefficient of determination R-squared = 0.4132:
❖ All independent variables (educ, female, tclb, tjob, tentertain, tsleep, tstudy, tout, ncre, tchimp) jointly explain 41.32% of the variation in the dependent variable (gpa).
❖ Other factors not included in the model account for the remaining 58.68% of the variation in gpa.

Other indicators:
❖ Adjusted coefficient of determination: adj R-squared = 0.3716
❖ Total Sum of Squares: TSS = 33.2980886
❖ Explained Sum of Squares: ESS = 13.7604018 (so that R-squared = ESS/TSS ≈ 0.4132)
❖ Residual Sum of Squares: RSS = 19.53768681
❖ Degrees of freedom of the model: Dfm = 10
❖ Degrees of freedom of the residual: Dfr = 141

VII, DIAGNOSING THE PROBLEMS

1, Normality
We have the following hypotheses:
H0: ui is normally distributed
H1: ui is not normally distributed
To examine this, we can first plot a histogram of the residuals in Stata, generated with the commands:
predict resid, residuals
histogram resid, normal
The result is shown in Exhibit 8.
Exhibit 8: Histogram plot indicating normality

We can also test normality with the skewness and kurtosis test, using the command:
sktest resid
The result is shown in Exhibit 9.
Exhibit 9: Skewness/Kurtosis tests for normality

At the 5% significance level, the p-values for both skewness and kurtosis are smaller than 0.05, so we have enough evidence to reject H0. However, our sample has 152 observations, which is large enough for the OLS estimators to be approximately normally distributed even though ui is not; the model can therefore still be used for statistical analysis.

2, Multicollinearity
Multicollinearity is a high degree of correlation amongst the explanatory variables, which may make it difficult to separate out the effects of the individual
regressors; standard errors may be inflated and t-values depressed. Multicollinearity can be detected by examining the correlation matrix of the regressors and by carrying out auxiliary regressions amongst them. In Stata, the vif command is used, which stands for variance inflation factor. Exhibit 10 shows the result.
Exhibit 10: Multicollinearity test

The VIF values here are lower than 10, indicating that multicollinearity is not too worrisome a problem for this data set.

3, Heteroscedasticity
Heteroscedasticity means that the variance of the error term is not constant, in which case the least squares estimates are no longer efficient and the results of t tests and F tests may be misleading. It can be detected informally by plotting the residuals against each of the regressors, or formally by a test, most popularly White's test. It can be remedied by respecifying the model, for example by looking for missing variables. In Stata, the imtest, white command is used, where imtest stands for information matrix test. Exhibit 11 shows the result.
Exhibit 11: Heteroscedasticity test

At the 5% significance level, there is enough evidence to reject the null hypothesis of constant variance and conclude that this data set suffers from heteroscedasticity.
Another way to check for heteroscedasticity is the residual-versus-fitted plot, which can be generated with the rvfplot, yline(0) command in Stata. The result is shown in Exhibit 12.
Exhibit 12: Residual-versus-fitted plot of the model

From the graph, we can see that the variability of the residuals increases, which suggests that this data set has a heteroscedasticity problem. To fix the problem, robust standard errors are
used to relax the assumption that the errors are independent and identically distributed. In Stata, the regression is rerun with the robust option, using the command:
reg gpa educ female tclb tjob tentertain tsleep tstudy tout ncre tchimp, robust
Exhibit 13 shows the result.
Exhibit 13: Correcting heteroscedasticity

Note that, compared with the earlier regression, none of the coefficient estimates changes, but the standard errors and hence the t values are different, which gives more reliable p-values.

VIII, HYPOTHESIS POSTULATED
The question of interest concerns the multiple regression model
\[
\mathit{gpa} = \beta_0 + \beta_1\,\mathit{educ} + \beta_2\,\mathit{female} + \beta_3\,\mathit{tclb} + \beta_4\,\mathit{tjob} + \beta_5\,\mathit{tentertain} + \beta_6\,\mathit{tsleep} + \beta_7\,\mathit{tstudy} + \beta_8\,\mathit{tout} + \beta_9\,\mathit{ncre} + \beta_{10}\,\mathit{tchimp} + u \quad \text{(Full model)}
\]
Which independent variables among educ, female, tclb, tjob, tentertain, tsleep, tstudy, tout, ncre and tchimp contribute to explaining or predicting gpa, and which ones should be dropped to reduce the model?
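In practice this question is answered one coefficient at a time. As a minimal sketch in standard textbook notation (the symbols \hat{\beta}_i, se and T are ours and are not taken from the report's output), the statistic behind each p-value that Stata reports is:

```latex
\[
t_i = \frac{\hat{\beta}_i}{\mathrm{se}(\hat{\beta}_i)},
\qquad
p_i = 2\,\Pr\left(T_{n-k-1} > |t_i|\right),
\qquad
n - k - 1 = 152 - 10 - 1 = 141
\]
```

Here n = 152 is the number of observations and k = 10 the number of regressors, so the reference t distribution has 141 degrees of freedom, which matches the residual degrees of freedom (Dfr) reported in section VI. With the robust option only se changes, not the coefficient estimate.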
From this question, the following hypotheses are postulated for each independent variable xi:
H0: The independent variable xi does not contribute to explaining gpa
H1: The independent variable xi is useful in explaining gpa
which is expressed as:
\[
H_0: \beta_i = 0 \qquad H_1: \beta_i \neq 0
\]
For this, we use the p-value method: if the p-value is smaller than the significance level α = 5%, H0 is rejected.

Comparing each p-value with α, there is enough evidence to reject the null hypothesis for tclb, tentertain, tstudy and tchimp and to conclude that they have explanatory or predictive power for gpa. Meanwhile, we can reduce the model by dropping educ, female, tjob, tsleep, tout and ncre:
\[
\mathit{gpa} = \beta_0 + \beta_1\,\mathit{tclb} + \beta_2\,\mathit{tentertain} + \beta_3\,\mathit{tstudy} + \beta_4\,\mathit{tchimp} + u \quad \text{(Reduced model)}
\]

IX, RESULT ANALYSIS & POLICY IMPLICATION
From the data analysis in the preceding sections, we have gained an overall view of the data set in terms of the statistical evidence for the relationship between GPA and each of the proposed factors. As mentioned at the beginning of this report, we aim to learn how the average time spent on everyday activities, gender and school factors affect students' GPA. Following the data analysis, regression run and hypothesis testing, it can be concluded that time for self-study, time for entertainment, time for clubs and the impact of teachers affect GPA, or at least are statistically significantly associated with it. Therefore, students and teachers should take all of these factors into account in order to improve students' GPA.

X, CONCLUSION
This report was completed thanks to the dedicated contribution of each member and the knowledge from our study of Econometrics. It also provides us with a good opportunity to practice what we have learned and to
get a deeper understanding of data analysis and the relevant tests. Through this application, we hope that our work gives you a closer look at which factors really affect students' GPA, so that steps can be taken to improve it. Again, because of our limited understanding and resources, our report may contain misinterpretations. We hope that PhD Dinh Thi Thanh Binh and readers can give us constructive comments on the report so that we can improve and do better in the future.
Sincerely

XI, REFERENCES
“Những nhân tố ảnh hưởng đến kết học tập sinh viên năm nhất, năm hai trường ĐH Kĩ thuật - Công nghệ Cần Thơ” [Factors affecting the academic results of first- and second-year students at Can Tho University of Technology] by Nguyễn Thị Thu An, Nguyễn Thị Ngọc Thứ, Đinh Thị Kiều Oanh and Nguyễn Văn Thành

XII, APPENDIX
1, Roles in our team
2, Timeline for finishing team report
3, Data set
The data set is attached in the following pages.
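The Stata commands quoted in sections VI to VIII can be collected into a single do-file. The following is a minimal sketch and not the authors' actual script: the file name gpa_survey.dta is hypothetical, the variable names are those listed in Exhibit 3, and the final test command is our own addition, which checks the six dropped variables jointly rather than one at a time.

```stata
* Minimal sketch of the workflow described in this report.
* Assumption: gpa_survey.dta is a hypothetical name for the attached data set.
use gpa_survey.dta, clear

* Full model (section VI)
regress gpa educ female tclb tjob tentertain tsleep tstudy tout ncre tchimp

* Normality of the residuals (section VII, part 1)
predict resid, residuals
histogram resid, normal
sktest resid

* Multicollinearity and heteroscedasticity (section VII, parts 2 and 3)
estat vif
estat imtest, white
rvfplot, yline(0)

* Re-estimate with heteroscedasticity-robust standard errors
regress gpa educ female tclb tjob tentertain tsleep tstudy tout ncre tchimp, vce(robust)

* Joint test of the variables dropped in section VIII (our addition)
test educ female tjob tsleep tout ncre

* Reduced model
regress gpa tclb tentertain tstudy tchimp, vce(robust)
```

The diagnostics are run after the ordinary regression and before the robust one because estat imtest is meant for the non-robust fit. The option vce(robust) is the current spelling of the robust option used in the report.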