CONTENT

INTRODUCTION
PART 1: DATA DESCRIPTION
I GENERAL DATA DESCRIPTION
II DATA DESCRIPTION IN DETAILS
  Time worked per week in 1975
  Age in 1975
  Educational level in 1975
  Health status in 1975
  Gender
  Marital status in 1975
  Time of sleeping per week in 1975
PART 2: REGRESSION ANALYSIS
I THE RELATIONSHIP BETWEEN VARIABLES – STATISTICAL CORRELATION
II ESTIMATE THE REGRESSION MODEL BY OLS METHOD
  Population regression function
  Sample regression function
  Analysis of parameters in the sample regression model
III MISTAKE TESTS OF THE MODEL
  Testing multicollinearity
  Testing heteroskedasticity
  Cure for heteroskedasticity
IV HYPOTHESES TESTS
  Testing overall significance of the regression
  Testing significance of the regression coefficients
  Testing exclusion restrictions
PART 3: CONSTRUCTING FINAL REGRESSION MODEL
I ESTIMATE THE REGRESSION MODEL BY OLS METHOD
  Population regression function
  Sample regression function
  Analysis of parameters in the sample regression model
II MISTAKE TESTS OF THE MODEL
  Testing multicollinearity
  Testing heteroskedasticity
  Cure for heteroskedasticity
III HYPOTHESES TESTS
  Testing the overall significance of regression
  Testing the significance of the regression coefficients
CONCLUSION
APPENDIX
  Result of using command 'tab totwrk75'
  Result of using command 'tab slpnap75'

TABLE OF FIGURES

Figure 1: The result of using command 'des'
Figure 2: The result of using command 'des' for variables chosen
Figure 3: The result of using command 'sum'
Figure 4: The result of using command 'tab totwrk75' (full version in appendix)
Figure 5: The result of using command 'tab age75'
Figure 6: The result of using command 'tab educ75'
Figure 7: The result of using command 'tab gdhlth75'
Figure 8: The result of using command 'tab male75'
Figure 9: The result of using command 'tab marr75'
Figure 10: The result of using command 'tab slpnap75' (full version in appendix)
Figure 11: The result of using command 'corr' in STATA
Figure 12: The result of using command 'reg' in STATA (6 variables)
Figure 13: The result of using command 'vif' after using 'reg' in STATA
Figure 14: The result of using 'imtest, white' in STATA
Figure 15: The result of using command robust in STATA
Figure 16: The result of command 'test' (after using robust)
Figure 17: The result of using command 'reg' (2 variables)
Figure 18: The result of using command 'test' for variables above, after robust
Figure 19: The result of using command 'reg' after omitting variables
Figure 20: The result of using 'corr' with variables
Figure 21: The result of using command 'vif' after 'reg totwrk75 male slpnap75'
Figure 22: The result of using command 'imtest, white' for new function
Figure 23: The result of using 'reg robust'
Figure 24: The result of using command 'test male slpnap75'

ACKNOWLEDGEMENT

The success and final outcome of this assignment required a lot of support from others, and we are extremely fortunate to have had it throughout our work. We would like to express our gratitude to Mrs Dinh Thi Thanh Binh, our Econometrics lecturer, for the excellent expertise and supportive guidance she provided us throughout the process. Without such help, we might not have been able to complete this assignment. We are also grateful that we managed to complete the assignment on time, which could not have been done without the effort and co-operation of our group members. Last but not least, we would like to thank all of our friends for their kind support and willingness to spend time helping us finish the document.

Group 11

INTRODUCTION

Research has shown that various factors influence how long workers work. For instance, older workers tend to work less than younger ones. The same holds for female workers who are married and have a family to take care of. And for each
person, the influences of these factors are different. Therefore, after taking everything into consideration, we decided to study the project "The factors affecting weekly working time in 1975". Through this project, we analyze the factors that had a major impact on workers' working time in 1975, using econometric methods.

Econometrics is a social science in which the tools of economic, mathematical, and statistical theory are used to estimate economic relationships, test economic theories, and evaluate and implement government and business policy. It is based upon the development of statistical methods to forecast economic issues.

In this paper, we consider six factors that may affect workers' weekly working time: age, educational level, health status (good or poor), gender (male or female), marital status (married or single), and time of sleeping. Throughout the project, we used STATA as the tool for econometric analysis of the data set "11.DTA".

We hope that the arguments and statistics in this project will be helpful for anyone who is interested in the topic.

PART 1: DATA DESCRIPTION

I GENERAL DATA DESCRIPTION

Chosen Variables for Research

We obtained the following result by using command 'des':

Figure 1: The result of using command 'des'

The data set was created on August 18, 1999, and contains 20 variables and 239 observations. After considering the meaning of the variables in file 11.dta, our group decided to use the following variables in the regression model:

Dependent variable: totwrk75
Independent variables: age75, educ75, gdhlth75, male, marr75, slpnap75

General Description of Chosen Data

We obtained the following result by using command 'des' for the variables analyzed:

Figure 2: The result of using command 'des' for variables chosen

From the above result, we can see that age75, educ75, slpnap75 and totwrk75 are quantitative variables, while gdhlth75, male and marr75 are qualitative variables. The variables are explained in detail below:

Variable   Display format   Meaning                                     Unit
totwrk75   %9.0g            Time worked per week in 1975                Minute
age75      %9.0g            Age in 1975                                 Year
educ75     %9.0g            Years of education                          Year
gdhlth75   %9.0g            = 1 if in good health in 1975
male       %9.0g            = 1 if male
marr75     %9.0g            = 1 if married in 1975
slpnap75   %9.0g            Time of sleeping per week, including naps   Minute

Using command 'sum totwrk75 age75 educ75 gdhlth75 male marr75 slpnap75', we obtain the number of observations, mean, standard deviation, minimum and maximum of each variable:

sum totwrk75 age75 educ75 gdhlth75 male marr75 slpnap75

Variable   Obs   Mean       Std. Dev.   Min    Max
totwrk75   239   2184.205   922.632     0      4805
age75      239   39.01255   11.06683    23     65
educ75     239   13.10879   2.858844    […]    17
gdhlth75   239   .8828452   .3222796    0      1
male       239   .6025105   .4904058    0      1
marr75     239   .748954    .4345249    0      1
slpnap75   239   3369.665   502.8366    2053   6110

Figure 3: The result of using command 'sum'

II DATA DESCRIPTION IN DETAILS

To describe the variables in detail, we used command 'tab' for each variable.

Time worked per week in 1975

Figure 4: The result of using command 'tab totwrk75' (full version in appendix)

Minutes of working time per week range from 0 to 4805. The most frequent value is 0 minutes, with 10 observations, accounting for 4.18%. It is followed by 2325 minutes, with 4 observations, accounting for 1.67%.

Age in 1975

Figure 5: The result of using command 'tab age75'

The age of workers in 1975 varies from 23 to 65 years. The most frequent age is 33, with 14 observations, accounting for 5.86%. The least frequent ages are 49, 63 and 64, with only 1 observation each, accounting for 0.42% each.

Educational level in 1975

Years of education range from […] to 17. Twelve years of education is the most frequent value (98 observations, accounting for 41%), while […] year(s) of education is the least frequent (1 observation, accounting for 0.42%).

Figure 6: The result of using command 'tab educ75'

Health status in 1975

Figure 7: The result of using command 'tab gdhlth75'

- gdhlth75 = 1 (good health in 1975) has 211 observations, accounting for 88.28%.
- gdhlth75 = 0 (poor health in 1975) has 28 observations, accounting for 11.72%.

CONCLUSION

From the above analysis and results, some conclusions can be drawn. When constructing the first regression model, with dependent variable totwrk75 and independent variables age75, educ75, gdhlth75, male, marr75 and slpnap75, we ran the diagnostic tests and found that heteroskedasticity existed in the model. To test hypotheses more reliably, we re-estimated the model with the robust option in STATA, which reports heteroskedasticity-robust standard errors. After testing the overall significance of the sample regression function, we concluded that the regression function is relevant. In other words, at least one of the independent variables helps to explain the dependent variable. Of all variables chosen
as independent variables, two have a statistically significant effect on time worked per week in 1975: male (gender) and slpnap75 (time of sleeping per week, including naps, in 1975). Next, we tested whether age75, educ75, gdhlth75 and marr75 should be excluded from the model. The result showed that they should be excluded.

In the next step, we constructed the second regression function without the variables listed above. After re-testing it in the same way as the first model, we again detected heteroskedasticity, so we applied the robust option once more. The hypothesis tests raised no problems: the function is relevant, and all independent variables have a statistically significant effect on time worked per week in 1975.

Finally, we obtained the following sample regression function:

\[ \widehat{totwrk75} = 3816.693 + 678.3548 \, male - 0.6057587 \, slpnap75 \]

In this model, male and slpnap75 explain 25.46% of the variation in totwrk75.

APPENDIX

Result of using command 'tab totwrk75'

tab totwrk75

totwrk75 (minutes worked per week, '75)
Value  Freq.  Percent     Cum.
    0     10     4.18     4.18
   68      1     0.42     4.60
  113      1     0.42     5.02
  188      1     0.42     5.44
  305      1     0.42     5.86
  353      1     0.42     6.28
  363      1     0.42     6.69
  375      3     1.26     7.95
  388      1     0.42     8.37
  588      1     0.42     8.79
  650      1     0.42     9.21
  838      1     0.42     9.62
  875      1     0.42    10.04
  900      1     0.42    10.46
  940      1     0.42    10.88
  950      1     0.42    11.30
  958      1     0.42    11.72
 1013      1     0.42    12.13
 1050      1     0.42    12.55
 1075      1     0.42    12.97
 1125      3     1.26    14.23
 1138      1     0.42    14.64
 1168      1     0.42    15.06
 1175      2     0.84    15.90
 1188      1     0.42    16.32
 1208      2     0.84    17.15
 1263      1     0.42    17.57
 1350      2     0.84    18.41
 1470      1     0.42    18.83
 1497      1     0.42    19.25
 1510      1     0.42    19.67
 1538      1     0.42    20.08
 1543      1     0.42    20.50
 1553      1     0.42    20.92
 1555      1     0.42    21.34
 1563      1     0.42    21.76
 1571      1     0.42    22.18
 1578      1     0.42    22.59
 1588      1     0.42    23.01
 1673      1     0.42    23.43
 1675      1     0.42    23.85
 1733      1     0.42    24.27
 1738      1     0.42    24.69
 1751      1     0.42    25.10
 1755      1     0.42    25.52
 1775      1     0.42    25.94
 1800      1     0.42    26.36
 1806      1     0.42    26.78
 1851      1     0.42    27.20
 1853      1     0.42    27.62
 1863      1     0.42    28.03
 1880      1     0.42    28.45
 1905      1     0.42    28.87
 1913      1     0.42    29.29
 1920      1     0.42    29.71
 1971      1     0.42    30.13
 1988      1     0.42    30.54
 1995      1     0.42    30.96
 2013      1     0.42    31.38
 2026      1     0.42    31.80
 2048      1     0.42    32.22
 2049      1     0.42    32.64
 2050      2     0.84    33.47
 2100      3     1.26    34.73
 2108      1     0.42    35.15
 2113      1     0.42    35.56
 2125      2     0.84    36.40
 2138      1     0.42    36.82
 2140      1     0.42    37.24
 2143      1     0.42    37.66
 2150      2     0.84    38.49
 2163      2     0.84    39.33
 2188      1     0.42    39.75
 2195      1     0.42    40.17
 2205      1     0.42    40.59
 2206      1     0.42    41.00
 2218      1     0.42    41.42
 2238      1     0.42    41.84
 2250      4     1.67    43.51
 2263      2     0.84    44.35
 2270      1     0.42    44.77
 2276      2     0.84    45.61
 2281      1     0.42    46.03
 2288      1     0.42    46.44
 2313      1     0.42    46.86
 2325      4     1.67    48.54
 2326      1     0.42    48.95
 2348      1     0.42    49.37
 2351      1     0.42    49.79
 2353      1     0.42    50.21
 2363      1     0.42    50.63
 2366      1     0.42    51.05
 2388      2     0.84    51.88
 2393      1     0.42    52.30
 2400      1     0.42    52.72
 2418      1     0.42    53.14
 2420      1     0.42    53.56
 2425      1     0.42    53.97
 2430      1     0.42    54.39
 2433      1     0.42    54.81
 2438      1     0.42    55.23
 2443      1     0.42    55.65
 2450      1     0.42    56.07
 2463      1     0.42    56.49
 2466      1     0.42    56.90
 2475      2     0.84    57.74
 2477      1     0.42    58.16
 2480      1     0.42    58.58
 2485      1     0.42    59.00
 2488      3     1.26    60.25
 2493      1     0.42    60.67
 2501      1     0.42    61.09
 2505      1     0.42    61.51
 2506      2     0.84    62.34
 2508      1     0.42    62.76
 2513      1     0.42    63.18
 2525      1     0.42    63.60
 2526      1     0.42    64.02
 2533      1     0.42    64.44
 2535      1     0.42    64.85
 2538      1     0.42    65.27
 2560      1     0.42    65.69
 2563      1     0.42    66.11
 2568      1     0.42    66.53
 2570      1     0.42    66.95
 2571      1     0.42    67.36
 2575      2     0.84    68.20
 2587      1     0.42    68.62
 2588      2     0.84    69.46
 2592      1     0.42    69.87
 2601      1     0.42    70.29
 2606      1     0.42    70.71
 2616      1     0.42    71.13
 2620      1     0.42    71.55
 2638      1     0.42    71.97
 2666      1     0.42    72.38
 2681      1     0.42    72.80
 2686      1     0.42    73.22
 2698      1     0.42    73.64
 2700      1     0.42    74.06
 2701      1     0.42    74.48
 2711      1     0.42    74.90
 2713      1     0.42    75.31
 2725      1     0.42    75.73
 2735      1     0.42    76.15
 2740      1     0.42    76.57
 2755      1     0.42    76.99
 2758      1     0.42    77.41
 2763      1     0.42    77.82
 2775      2     0.84    78.66
 2778      1     0.42    79.08
 2788      1     0.42    79.50
 2791      1     0.42    79.92
 2808      1     0.42    80.33
 2815      1     0.42    80.75
 2820      1     0.42    81.17
 2880      2     0.84    82.01
 2896      1     0.42    82.43
 2905      1     0.42    82.85
 2931      1     0.42    83.26
 2941      1     0.42    83.68
 2946      1     0.42    84.10
 2948      1     0.42    84.52
 2965      1     0.42    84.94
 2975      1     0.42    85.36
 3003      1     0.42    85.77
 3071      1     0.42    86.19
 3075      1     0.42    86.61
 3080      1     0.42    87.03
 3105      1     0.42    87.45
 3118      1     0.42    87.87
 3135      1     0.42    88.28
 3146      1     0.42    88.70
 3148      1     0.42    89.12
 3171      1     0.42    89.54
 3180      1     0.42    89.96
 3188      1     0.42    90.38
 3194      1     0.42    90.79
 3213      1     0.42    91.21
 3225      1     0.42    91.63
 3230      1     0.42    92.05
 3242      1     0.42    92.47
 3275      1     0.42    92.89
 3276      1     0.42    93.31
 3411      1     0.42    93.72
 3421      1     0.42    94.14
 3518      1     0.42    94.56
 3533      1     0.42    94.98
 3556      1     0.42    95.40
 3588      1     0.42    95.82
 3595      1     0.42    96.23
 3598      1     0.42    96.65
 3615      1     0.42    97.07
 3825      1     0.42    97.49
 3869      1     0.42    97.91
 4011      1     0.42    98.33
 4058      1     0.42    98.74
 4065      1     0.42    99.16
 4325      1     0.42    99.58
 4805      1     0.42   100.00
Total    239   100.00

Result of using command 'tab slpnap75'

tab slpnap75

slpnap75 (mins slp wk, inc naps, '75)
Value  Freq.  Percent     Cum.
 2053      1     0.42     0.42
 2115      1     0.42     0.84
 2243      1     0.42     1.26
 2250      1     0.42     1.67
 2293      1     0.42     2.09
 2400      1     0.42     2.51
 2423      1     0.42     2.93
 2428      1     0.42     3.35
 2443      1     0.42     3.77
 2580      1     0.42     4.18
 2618      2     0.84     5.02
 2623      1     0.42     5.44
 2668      1     0.42     5.86
 2698      1     0.42     6.28
 2700      1     0.42     6.69
 2745      1     0.42     7.11
 2755      1     0.42     7.53
 2760      2     0.84     8.37
 2768      1     0.42     8.79
 2770      1     0.42     9.21
 2771      1     0.42     9.62
 2799      1     0.42    10.04
 2820      1     0.42    10.46
 2836      1     0.42    10.88
 2838      1     0.42    11.30
 2851      1     0.42    11.72
 2879      1     0.42    12.13
 2888      2     0.84    12.97
 2891      1     0.42    13.39
 2895      1     0.42    13.81
 2901      1     0.42    14.23
 2908      1     0.42    14.64
 2911      1     0.42    15.06
 2915      1     0.42    15.48
 2916      1     0.42    15.90
 2921      1     0.42    16.32
 2923      1     0.42    16.74
 2928      1     0.42    17.15
 2948      1     0.42    17.57
 2956      1     0.42    17.99
 2970      2     0.84    18.83
 2985      1     0.42    19.25
 2993      1     0.42    19.67
 3001      1     0.42    20.08
 3003      1     0.42    20.50
 3008      2     0.84    21.34
 3018      1     0.42    21.76
 3023      2     0.84    22.59
 3030      1     0.42    23.01
 3038      1     0.42    23.43
 3045      1     0.42    23.85
 3048      1     0.42    24.27
 3088      1     0.42    24.69
 3090      1     0.42    25.10
 3095      1     0.42    25.52
 3103      1     0.42    25.94
 3108      1     0.42    26.36
 3116      1     0.42    26.78
 3120      2     0.84    27.62
 3128      1     0.42    28.03
 3135      1     0.42    28.45
 3138      1     0.42    28.87
 3140      1     0.42    29.29
 3143      2     0.84    30.13
 3155      1     0.42    30.54
 3158      1     0.42    30.96
 3165      2     0.84    31.80
 3181      2     0.84    32.64
 3183      2     0.84    33.47
 3188      1     0.42    33.89
 3191      1     0.42    34.31
 3195      3     1.26    35.56
 3201      1     0.42    35.98
 3203      2     0.84    36.82
 3208      1     0.42    37.24
 3210      1     0.42    37.66
 3215      1     0.42    38.08
 3218      2     0.84    38.91
 3226      1     0.42    39.33
 3228      1     0.42    39.75
 3233      1     0.42    40.17
 3236      1     0.42    40.59
 3238      1     0.42    41.00
 3243      2     0.84    41.84
 3245      1     0.42    42.26
 3248      2     0.84    43.10
 3255      1     0.42    43.51
 3261      1     0.42    43.93
 3263      1     0.42    44.35
 3270      1     0.42    44.77
 3271      1     0.42    45.19
 3278      2     0.84    46.03
 3280      1     0.42    46.44
 3285      1     0.42    46.86
 3290      1     0.42    47.28
 3295      1     0.42    47.70
 3298      1     0.42    48.12
 3300      2     0.84    48.95
 3305      1     0.42    49.37
 3308      1     0.42    49.79
 3328      1     0.42    50.21
 3330      2     0.84    51.05
 3338      1     0.42    51.46
 3353      3     1.26    52.72
 3367      1     0.42    53.14
 3368      2     0.84    53.97
 3376      1     0.42    54.39
 3385      1     0.42    54.81
 3390      1     0.42    55.23
 3406      1     0.42    55.65
 3418      1     0.42    56.07
 3424      1     0.42    56.49
 3428      1     0.42    56.90
 3433      1     0.42    57.32
 3435      2     0.84    58.16
 3438      2     0.84    59.00
 3440      1     0.42    59.41
 3441      2     0.84    60.25
 3443      1     0.42    60.67
 3453      1     0.42    61.09
 3458      1     0.42    61.51
 3465      1     0.42    61.92
 3466      1     0.42    62.34
 3470      1     0.42    62.76
 3480      1     0.42    63.18
 3488      1     0.42    63.60
 3490      1     0.42    64.02
 3495      1     0.42    64.44
 3496      1     0.42    64.85
 3503      1     0.42    65.27
 3510      1     0.42    65.69
 3518      3     1.26    66.95
 3523      1     0.42    67.36
 3530      1     0.42    67.78
 3533      2     0.84    68.62
 3541      1     0.42    69.04
 3545      2     0.84    69.87
 3551      1     0.42    70.29
 3560      1     0.42    70.71
 3566      1     0.42    71.13
 3570      2     0.84    71.97
 3601      1     0.42    72.38
 3605      1     0.42    72.80
 3608      1     0.42    73.22
 3610      1     0.42    73.64
 3613      1     0.42    74.06
 3618      1     0.42    74.48
 3628      1     0.42    74.90
 3646      1     0.42    75.31
 3653      1     0.42    75.73
 3660      1     0.42    76.15
 3665      1     0.42    76.57
 3668      1     0.42    76.99
 3675      1     0.42    77.41
 3680      1     0.42    77.82
 3683      2     0.84    78.66
 3685      1     0.42    79.08
 3693      1     0.42    79.50
 3698      1     0.42    79.92
 3705      1     0.42    80.33
 3706      1     0.42    80.75
 3715      1     0.42    81.17
 3724      1     0.42    81.59
 3725      2     0.84    82.43
 3743      1     0.42    82.85
 3773      2     0.84    83.68
 3775      1     0.42    84.10
 3778      1     0.42    84.52
 3780      1     0.42    84.94
 3791      1     0.42    85.36
 3798      1     0.42    85.77
 3810      1     0.42    86.19
 3813      1     0.42    86.61
 3840      1     0.42    87.03
 3848      1     0.42    87.45
 3871      1     0.42    87.87
 3925      1     0.42    88.28
 3938      1     0.42    88.70
 3940      1     0.42    89.12
 3943      1     0.42    89.54
 3975      1     0.42    89.96
 3991      1     0.42    90.38
 4025      1     0.42    90.79
 4034      1     0.42    91.21
 4035      1     0.42    91.63
 4073      1     0.42    92.05
 4081      1     0.42    92.47
 4088      1     0.42    92.89
 4100      1     0.42    93.31
 4103      1     0.42    93.72
 4113      1     0.42    94.14
 4135      1     0.42    94.56
 4185      1     0.42    94.98
 4210      1     0.42    95.40
 4290      1     0.42    95.82
 4313      1     0.42    96.23
 4353      1     0.42    96.65
 4360      1     0.42    97.07
 4367      1     0.42    97.49
 4474      1     0.42    97.91
 4530      1     0.42    98.33
 4538      1     0.42    98.74
 4618      1     0.42    99.16
 5224      1     0.42    99.58
 6110      1     0.42   100.00
Total    239   100.00
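
Returning to the final model reported in the Conclusion, the sample regression function can be evaluated for given inputs to see what the coefficients imply. The sketch below is only an illustration and is not part of the original STATA analysis. It is written in Python, the function name predict_totwrk75 is hypothetical, and the coefficients and the sample mean of slpnap75 are copied from the figures reported in this paper.

```python
# Coefficients copied from the final sample regression function in the Conclusion.
# Assumption: the 678.3548 term belongs to male and the -0.6057587 term to slpnap75,
# following the variable order of 'reg totwrk75 male slpnap75'.
INTERCEPT = 3816.693
B_MALE = 678.3548
B_SLPNAP75 = -0.6057587


def predict_totwrk75(male: float, slpnap75: float) -> float:
    """Return fitted minutes worked per week.

    male     -- 1 for male, 0 for female (gender dummy)
    slpnap75 -- minutes of sleep per week, including naps
    """
    return INTERCEPT + B_MALE * male + B_SLPNAP75 * slpnap75


if __name__ == "__main__":
    mean_sleep = 3369.665  # sample mean of slpnap75 from the 'sum' output

    # Fitted working time for a woman and a man with average sleep.
    for male in (0, 1):
        print(male, round(predict_totwrk75(male, mean_sleep), 1))

    # Change in fitted working time for one extra hour (60 minutes) of sleep per week.
    extra_hour = predict_totwrk75(0, mean_sleep + 60) - predict_totwrk75(0, mean_sleep)
    print(round(extra_hour, 2))
```

Holding sleep fixed, the gap between the two fitted values equals the coefficient on male, and each additional minute of sleep lowers fitted working time by about 0.61 minutes. Evaluating the function at the sample means of male and slpnap75 should return a value very close to the sample mean of totwrk75, which is a quick consistency check on the transcribed coefficients.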