(ESSAY) The data for nine variables are collected from the World Bank for 25 countries in Region A (Asia) and 25 countries in Region B (America); the datasets are included in the Excel file
ECON1193 - Business Statistic
Semester - 2021

Title of Assignment: Assignment 3A: Team Assignment Report
Name and Student ID:
❖ Le Thien Ai - s3864119
❖ Vu Dinh Thai - s3877521
❖ Fuoc An Doanh - s3879951
Location: Saigon South - Vietnam
Class Group: SGS - Group
Lecturer: Tuan CT
Word count (excluding tables, figures, references and appendix): 3245 words

Part 1: Data Collection

The data for nine variables were collected from the World Bank for 25 countries in Region A (Asia) and 25 countries in Region B (America). The datasets are included in the Excel file.

Part 2: Descriptive Statistics

1. Measures of central tendency

Measures of Central Tendency | Region A - Asia | Region B - America
Mean | 3.825 | 1.102
Median | 3.35 | 1.33
Mode | - | -

Figure 1. The measures of central tendency of Asia and America regarding the GDP per capita growth (annual %)

According to the table above, no mode can be detected in either Asia or America, so the mode is unusable for this measurement. Moreover, outliers are identified in both regions: Asia has one lower outlier, America has two lower outliers, and neither region has any upper outlier (Appendix 1). Because the mean is strongly influenced by outliers and could therefore mislead, it is not applicable in this situation. Hence, the median is the most suitable descriptive measure to compare and analyze the GDP per capita growth rate in both regions. In this case, the median indicates that 50% of countries in a region have a GDP growth rate higher than the median value, and the remaining 50% have a GDP growth rate lower than the median.

As seen in figure 1, Asia's median (3.35%) is higher than America's (1.33%). This demonstrates that 50% of countries in Asia had a GDP growth rate in 2016 higher than 3.35%, while the others recorded less than 3.35%, and some countries' data is not yet recorded. Hence, the medians show that in 2016 countries in Asia had a higher GDP growth rate than most countries in America.

2. Measures of variation

Measures of Variation | Region A - Asia | Region B - America
Standard Deviation | 3.398 | 2.266
Sample Variance | 11.549 | 5.135
Range | 12.99 | 9.54
Interquartile Range | 3.875 | 2.21
Coefficient of Variation | 88.83% | 205.65%

Figure 2. The measures of variation of Asia and America regarding the GDP per capita growth (annual %)

In this case, the standard deviation is not applicable because it is strongly affected by the outliers. The preferred indicator must be robust to the extreme values present in both data sets in order to allow an unbiased comparison. The large gap between the means of the two regions, together with the size of the two data sets, is another factor that could distort conclusions. Thus, the Interquartile Range (IQR) is the most suitable measure to compare the variation between Asia and America. The greater the IQR, the wider the spread of the middle 50% of the data, and hence the greater the variation. In figure 2, the IQR of Asia (3.875) is greater than the IQR of America (2.21), which shows that the GDP growth rate of countries in Asia is less consistent and has a greater tendency to deviate from the central value.

3. Measure of shape

Figure 3. Box and whisker plots of GDP growth rate of Asia and America in 2016

Among the measures of shape, the box and whisker plot is the most suitable option because it displays the median, Quartiles 1 and 3, and the outliers of both regions. In comparison, the histogram does not help contrast two data sets because it depends heavily on the bin range, which makes it challenging to examine the actual values of the data.

Looking at the box plot in figure 3, it can be observed that most countries in Asia have a higher GDP growth rate (about 5.47) than countries in America. However, the GDP growth rate of countries in Asia fluctuates more than in America, from -1.5 to 11.94 at the higher end of the box plots.
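The robust measures used in this part (median, quartiles, IQR, and outliers beyond the 1.5 × IQR fences) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values are hypothetical and are not the report's data, the helper name `summarize` is made up for this sketch, and the "inclusive" quartile method is an assumption, since the report does not say which quartile definition was used in Excel.

```python
import statistics


def summarize(values):
    """Return median, quartiles, IQR and the 1.5 * IQR outliers of a sample."""
    # "inclusive" treats the data as covering the full range of the sample;
    # this choice is an assumption, not something stated in the report.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "median": q2,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "lower_outliers": sorted(v for v in values if v < lower_fence),
        "upper_outliers": sorted(v for v in values if v > upper_fence),
    }


# Hypothetical growth rates (annual %), NOT the values from the Excel file.
sample_growth = [-6.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
summary = summarize(sample_growth)
print(summary)
```

Because one low value is enough to drag the mean down while leaving the median and IQR unchanged, a summary of this kind shows why the report prefers the median and IQR when lower outliers are present.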
Another observation is that 50% of countries in Asia have a GDP growth rate higher than 3.35, while the maximum GDP growth rate in America is only 5.47. This means that many countries in Asia have fast-growing economies and therefore higher GDP growth rates. It should also be recognized that the data in the box plot for Asia are right-skewed, while those for America are left-skewed. Therefore, it can be concluded that the GDP growth rate in Asia exceeds the lower figures in the region of America.

Part 3: Multiple regression

REGION A - ASIA

a) Regression Final Output and scatter plots

After applying the backward elimination method in appendix 2, the final regression model of Asia is displayed below.

Figure 4. Final regression model of Region A: Asia

Figure 5. Scatter plot of GDP per capita (current US$) of Asia in 2016

Y: GDP per capita growth rate (annual %)
X1: GDP per capita (current US$)

Figure 5 shows the variable X1 following a downward trend, with the data often fluctuating between ... and 5, which indicates a negative relationship with Y.

b) Regression Equation

As can be observed in Figure 4, there is only one significant variable. Therefore, the regression equation is:

Ŷ = b0 + b1·X1
Ŷ = 4.882 - 0.00007 × (GDP per capita)

· Ŷ: predicted GDP per capita growth rate (annual %)
· X1: GDP per capita (current US$)

c) Regression coefficient of the significant independent variables

· b0 = 4.882 shows that Y would be estimated at 4.882% when the GDP per capita (current US$) variable is zero, which has no practical meaning.
· b1 = -0.00007 means that Y decreases by 0.00007 percentage points for every US$1 increase in X1, the GDP per capita (current US$).

d) The coefficient of determination

In figure 4, the coefficient of determination (R²) for this region is 0.176, or 17.6%. This means that 17.6% of the variation in the GDP per capita growth rate (annual %) can be explained by the variation in the GDP per capita. The remaining 82.4% of the GDP per capita growth rate variation in 2016 may be explained by other factors that are not included in this study.

REGION B - AMERICA

a) Regression Final Output and scatter plots

After applying the backward elimination method in appendix 3, the final regression model of America is displayed below.

Figure 6. Final regression output of Region B - America

Figure 7. Scatter plot of trade (% of GDP) of America in 2016

Y: GDP per capita growth rate (annual %)
X1: Trade (% of GDP)

Figure 7 shows the variable X1 following a dramatic downward trend (fluctuating from ... to -0.5), which indicates a negative relationship with Y. As can be observed from figure 5, the data predicted for the GDP per capita growth rate in Asia is more stable than in America.

b) Regression Equation

According to figure 6, the regression equation is:

Ŷ = b0 + b1·X1
Ŷ = -0.667 + 0.027 × (Trade)

· Ŷ: predicted GDP per capita growth rate (annual %)
· X1: Trade (% of GDP)

c) Regression coefficient of the significant independent variables

According to figure 6 and appendix 3, there is no significant independent variable: after applying backward elimination, Trade (% of GDP) is the only independent variable left, and its p-value (0.06) is higher than the 0.05 significance level. No regression coefficient is therefore interpreted in this case.

d) The coefficient of determination

In figure 6, the coefficient of determination (R²) for this region is 0.142, or 14.2%. This means that 14.2% of the variation in the GDP per capita growth rate (annual %) can be explained by the variation in Trade (% of GDP). The remaining 85.8% of the GDP per capita growth rate variation in 2016 may be explained by other factors not included in this study.

Part 4: Team Regression Conclusion

According to the research in part 3, the two regions, Asia and America, have a different number of significant variables. While Asia has one independent variable that affects the GDP per capita growth rate, namely the GDP per capita (current US$), America is affected by none of the variables. However, although the American region is not affected by any independent variable, we can still compare other aspects of the final regression output between the two regions to obtain the most objective perspective.

In comparison, the coefficient of determination R² is higher in Asia than in America (17.6% > 14.2%). Thus, a higher proportion of the variation in the GDP per capita growth rate in Asia can be explained by the variation in the GDP per capita (current US$) of the countries.

In Region B, the dependent variable is not affected by any independent variable (Appendix 3), so none of the variables could have a high impact on the GDP per capita growth rate in America. In contrast, in Region A the dependent variable is affected by only one independent variable, the GDP per capita (current US$), so this is the variable with the highest impact on the GDP per capita growth rate in Asia.

To summarize, this study shows that in Asia the GDP per capita (current US$) variable can be used to forecast the GDP per capita growth rate, whereas in America no independent variable can be used to predict the GDP per capita growth rate in 2016.

Part 5: Time Series

Low-Income countries (LI): Nepal (Asia) (C1), Honduras (America) (C3)
High-Income countries (HI): Singapore (Asia) (C2), United States (America) (C4)

I. Trend Models

Region A - Asia

Low-Income Country Asia

After applying the hypothesis tests for trend models to Nepal (appendix 3.1), the findings imply that the linear, quadratic and exponential trend models are all significant for this country.

Linear Trend

a) Regression Output

Figure 8. Linear trend regression output of Nepal - Low-Income country (1990-2015)

b) Formula & Coefficient explanation

Ŷ = 1.357 - 0.0043 × T

𝛽0 = 1.357 shows that the GNI of a Low-Income country, Nepal (1990-2015), is expected to be around $1367.5 when the time period, T, is 0 years. However,
this does not make sense because it lies outside our observation scope. Therefore, this is the portion of the total Gross National Income that is not explained by the time period T.

𝛽1 = -0.0043 illustrates that for every one year, on average, the total GNI of the Low-Income country, Nepal (1990-2015), is estimated to decrease by approximately $0.0043 per head. This also indicates the downward slope of its linear trend model.

Quadratic Trend Model

a) Regression Output

Figure 9. Quadratic trend regression output of Nepal - Low-Income country (1990-2015)

b) Formula & Coefficient explanation

Ŷ = 1357.5 - 0.0043 × T - 0.00001 × T²

𝛽1 = -0.0043 illustrates that when T = 0 (year), the instantaneous rate of change of the total GNI per head of the Low-Income country, Nepal (1990-2015), is -0.0043 USD per head.

...

β3 Foreign direct investment, net inflows (% of GDP) | 0.147 | > α | Do not reject H0
β4 Exports of goods and services (% of GDP) | 0.539 | > α | Do not reject H0
β5 Imports of goods and services (% of GDP) | 0.249 | > α | Do not reject H0
β6 Trade (% of GDP) | 0.270 | > α | Do not reject H0
β7 Population ages 15-64, total | 0.726 | > α | Do not reject H0

In the second regression output, the p-value of the population variable is the highest at 0.726 and greater than α, thus we eliminate this variable.

The third regression output with six variables

Variable | P-value | Comparison | Decision
β1 Life expectancy at birth, total (years) | 0.654 | > α | Do not reject H0
β2 GDP per capita (current US$) | 0.176 | > α | Do not reject H0
β3 Foreign direct investment, net inflows (% of GDP) | 0.112 | > α | Do not reject H0
β4 Exports of goods and services (% of GDP) | 0.458 | > α | Do not reject H0
β5 Imports of goods and services (% of GDP) | 0.163 | > α | Do not reject H0
β6 Trade (% of GDP) | 0.188 | > α | Do not reject H0

In the third regression output, the p-value of the life expectancy at birth variable is the highest at 0.654 and greater than α, thus we eliminate this variable.

The fourth regression output with five variables

Variable | P-value | Comparison | Decision
β1 GDP per capita (current US$) | 0.020 | < α | Reject H0
β2 Foreign direct investment, net inflows (% of GDP) | ... | > α | Do not reject H0
β3 Exports of goods and services (% of GDP) | 0.501 | > α | Do not reject H0
β4 Imports of goods and services (% of GDP) | 0.175 | > α | Do not reject H0
β5 Trade (% of GDP) | 0.203 | > α | Do not reject H0

In the fourth regression output, the p-value of the exports of goods and services variable is the highest at 0.501 and greater than α, thus we eliminate this variable.

The fifth regression output with four variables

Variable | P-value | Comparison | Decision
β1 GDP per capita (current US$) | 0.021 | < α | Reject H0
β2 Foreign direct investment, net inflows (% of GDP) | ... | > α | Do not reject H0
β3 Imports of goods and services (% of GDP) | 0.223 | > α | Do not reject H0
β4 Trade (% of GDP) | 0.262 | > α | Do not reject H0

In the fifth regression output, the p-value of the trade variable is the highest at 0.262 and greater than α, thus we eliminate this variable.

The sixth regression output with three variables

Variable | P-value | Comparison | Decision
β1 GDP per capita (current US$) | 0.039 | < α | Reject H0
β2 Foreign direct investment, net inflows (% of GDP) | ... | > α | Do not reject H0
β3 Imports of goods and services (% of GDP) | 0.452 | > α | Do not reject H0

In the sixth regression output, the p-value of the imports of goods and services variable is the highest at 0.452 and greater than α, thus we eliminate this variable.

The seventh regression output with two variables

Variable | P-value | Comparison | Decision
β1 GDP per capita (current US$) | 0.016 | < α | Reject H0
β2 Foreign direct investment, net inflows (% of GDP) | 0.190 | > α | Do not reject H0

In the seventh regression output, the p-value of the foreign direct investment, net inflows variable is the highest at 0.190 and greater than α, thus we eliminate this variable.

Final regression output with the last variable

Variable | P-value | Comparison | Decision
β1 GDP per capita (current US$) | 0.037 | < α | Reject H0

...

β1 Life expectancy at birth, total (years) | ... | > α | Do not reject H0
β2 GNI per capita, Atlas method (current US$) | 0.411 | > α | Do not reject H0
β3 GDP per capita (current US$) | 0.426 | > α | Do not reject H0
β4 Foreign direct investment, net inflows (% of GDP) | 0.105 | > α | Do not reject H0
β5 Exports of goods and services (% of GDP) | 0.099 | > α | Do not reject H0
β6 Imports of goods and services (% of GDP) | 0.099 | > α | Do not reject H0
β7 Trade (% of GDP) | 0.099 | > α | Do not reject H0
β8 Population ages 15-64, total | 0.459 | > α | Do not reject H0

In the first regression output, all of the variables are insignificant, as their p-values are higher than the level of significance (0.05), thus we do not reject H0. Using backward elimination, we first eliminate the population variable from the data set, as it has the highest p-value (0.459) and is hence the most insignificant variable.

The second regression output with seven variables

Variable | P-value | Comparison | Decision
β1 Life expectancy at birth, total (years) | 0.301 | > α | Do not reject H0
β2 GNI per capita, Atlas method (current US$) | 0.465 | > α | Do not reject H0
β3 GDP per capita (current US$) | 0.467 | > α | Do not reject H0
β4 Foreign direct investment, net inflows (% of GDP) | 0.124 | > α | Do not reject H0
β5 Exports of goods and services (% of GDP) | 0.059 | > α | Do not reject H0
β6 Imports of goods and services (% of GDP) | 0.059 | > α | Do not reject H0
β7 Trade (% of GDP) | 0.059 | > α | Do not reject H0

In the second regression output, the p-value of the GDP per capita variable is the highest at 0.467 and greater than α, thus we eliminate this variable.

The third regression output with six variables

Variable | P-value | Comparison | Decision
β1 Life expectancy at birth, total (years) | 0.404 | > α | Do not reject H0
β2 GNI per capita, Atlas method (current US$) | 0.939 | > α | Do not reject H0
β3 Foreign direct investment, net inflows (% of GDP) | 0.156 | > α | Do not reject H0
β4 Exports of goods and services (% of GDP) | 0.073 | > α | Do not reject H0
β5 Imports of goods and services (% of GDP) | 0.073 | > α | Do not reject H0
β6 Trade (% of GDP) | 0.073 | > α | Do not reject H0

In the third regression output, the p-value of the GNI per capita variable is the highest at 0.939 and greater than α, thus we eliminate this variable.

The fourth regression output with five variables

Variable | P-value | Comparison | Decision
β1 Life expectancy at birth, total (years) | 0.357 | > α | Do not reject H0
β2 Foreign direct investment, net inflows (% of GDP) | 0.139 | > α | Do not reject H0
β3 Exports of goods and services (% of GDP) | 0.065 | > α | Do not reject H0
β4 Imports of goods and services (% of GDP) | 0.065 | > α | Do not reject H0
β5 Trade (% of GDP) | 0.065 | > α | Do not reject H0

In the fourth regression output, the p-value of the life expectancy at birth variable is the highest at 0.357 and greater than α, thus we eliminate this variable.

The fifth regression output with four variables

Variable | P-value | Comparison | Decision
β1 Foreign direct investment, net inflows (% of GDP) | 0.211 | > α | Do not reject H0
β2 Exports of goods and services (% of GDP) | 0.065 | > α | Do not reject H0
β3 Imports of goods and services (% of GDP) | 0.065 | > α | Do not reject H0
β4 Trade (% of GDP) | 0.065 | > α | Do not reject H0

In the fifth regression output, the p-value of the foreign direct investment, net inflows variable is the highest at 0.211 and greater than α, thus we eliminate this variable.

The sixth regression output with three variables

Variable | P-value | Comparison | Decision
β1 Exports of goods and services (% of GDP) | 0.07936 | > α | Do not reject H0
β2 Imports of goods and services (% of GDP) | 0.07931 | > α | Do not reject H0
β3 Trade (% of GDP) | 0.07929 | > α | Do not reject H0

In the sixth regression output, the p-value of the exports of goods and services (% of GDP) variable is the highest at 0.07936 and greater than α, thus we eliminate this variable.

The seventh regression output with two variables

Variable | P-value | Comparison | Decision
β1 Imports of goods and services (% of GDP) | 0.834 | > α | Do not reject H0
β2 Trade (% of GDP) | 0.554 | > α | Do not reject H0

In the seventh regression output, the p-value of the imports of goods and services (% of GDP) variable is the highest at 0.834 and greater than α, thus we eliminate this variable.

The final regression output with the last variable

Variable | P-value | Comparison | Decision
β1 Trade (% of GDP) | 0.063 | > α | Do not reject H0
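The elimination loop followed in the Region B tables above can be sketched as below. This is a minimal illustration, not the team's actual Excel workflow: `backward_eliminate` and `replay_pvalues` are hypothetical helpers, and instead of refitting a regression the stub simply replays the p-values tabulated above (from the seven-variable output onward, with shortened variable names). The loop stops when one variable is left so that the last model can still be reported, which mirrors what the report does; that stopping rule is an assumption of this sketch.

```python
ALPHA = 0.05  # significance level used in the report


def backward_eliminate(variables, fit_pvalues, alpha=ALPHA):
    """Repeatedly drop the variable with the highest p-value above alpha.

    fit_pvalues(variables) must refit the model and return a dict
    {variable: p-value}. The loop stops when every remaining variable is
    significant or when only one variable is left.
    """
    remaining = list(variables)
    dropped = []
    while len(remaining) > 1:
        pvalues = fit_pvalues(remaining)
        worst = max(remaining, key=lambda name: pvalues[name])
        if pvalues[worst] <= alpha:
            break  # every remaining variable is significant
        remaining.remove(worst)
        dropped.append(worst)
    return remaining, dropped, fit_pvalues(remaining)


# Hypothetical stand-in for a regression refit: p-values copied from the
# Region B tables above, keyed by the number of variables in the model.
REPORTED_PVALUES = {
    7: {"Life expectancy": 0.301, "GNI per capita": 0.465,
        "GDP per capita": 0.467, "FDI": 0.124,
        "Exports": 0.059, "Imports": 0.059, "Trade": 0.059},
    6: {"Life expectancy": 0.404, "GNI per capita": 0.939, "FDI": 0.156,
        "Exports": 0.073, "Imports": 0.073, "Trade": 0.073},
    5: {"Life expectancy": 0.357, "FDI": 0.139,
        "Exports": 0.065, "Imports": 0.065, "Trade": 0.065},
    4: {"FDI": 0.211, "Exports": 0.065, "Imports": 0.065, "Trade": 0.065},
    3: {"Exports": 0.07936, "Imports": 0.07931, "Trade": 0.07929},
    2: {"Imports": 0.834, "Trade": 0.554},
    1: {"Trade": 0.063},
}


def replay_pvalues(variables):
    """Return the tabulated p-values for a model of this size."""
    return REPORTED_PVALUES[len(variables)]


start = list(REPORTED_PVALUES[7])
remaining, dropped, final_pvalues = backward_eliminate(start, replay_pvalues)
significant = [name for name in remaining if final_pvalues[name] <= ALPHA]

print("Dropped, in order:", dropped)
print("Last variable left:", remaining, final_pvalues)
print("Significant at 5%:", significant)
```

In a real analysis the stub would be replaced by an actual regression fit on the data set; the control flow of the loop stays the same.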
Finally, the p-value of the trade (% of GDP) variable in the final output is 0.063, which is higher than α, so we do not reject H0. Hence, trade (% of GDP) is an insignificant variable, and there is no evidence of a linear relationship between it and GDP per capita growth (annual %). The backward elimination method above therefore yields a final regression model for Region B - America that contains no significant variables at the 5% significance level.

APPENDIX 3: Hypothesis testing for trend models in Asia

Appendix 3.1: Nepal

a. Linear

H0: 𝛽1 = 0 (There is no linear trend in the GNI of Nepal)
H1: 𝛽1 ≠ 0 (There is a linear trend in the GNI of Nepal)

The p-value (0.00043) is much smaller than the significance level, 𝛼 (0.05). Therefore, we reject the null hypothesis H0 in favor of H1. This means that, with a 95% level of confidence, there is sufficient evidence to confirm that the linear trend is a significant trend model representing the total Gross National Income of the Low-Income country, Nepal (1990-2015).

b. Quadratic

H0: 𝛽2 = 0 (No quadratic trend in the GNI, total (GNI per head) in Low-Income country (1990-2015))
H1: 𝛽2 ≠ 0 (Quadratic trend in the GNI, total (GNI per head) in Low-Income country (1990-2015) observed)

As seen in the regression output above, the p-value of the variable T² equals 2.4135 × 10^-5, which is much smaller than the significance level, 𝛼 (0.05). Therefore, we reject H0 in favor of H1. This means that, with a 95% level of confidence, there is sufficient evidence to confirm that the quadratic trend is also a significant trend model representing the GNI, total (GNI per head) of the Low-Income country, Nepal, from 1990 to 2015.

c. Exponential

H0: 𝛽1 = 0 (No exponential trend in the Gross National Income, total in Low-Income country (1990-2015))
H1: 𝛽1 ≠ 0 (Exponential trend in the Gross National Income, total in Low-Income country (1990-2015) observed)

As seen in the regression output above, the p-value of the variable T equals 0.0141, which is smaller than the significance level, 𝛼 (0.05). Therefore, we reject H0 in favor of H1. This means that, with a 95% level of confidence, there is sufficient evidence to confirm that the exponential trend is also a significant trend model representing the Gross National Income.

Appendix 3.2: Singapore

a. Linear

H0: β1 = 0 (There is no linear trend in the total GNI in Singapore)
H1: β1 ≠ 0 (There is a linear trend in the total GNI in Singapore)

Based on the calculations, the p-value of the linear trend model approaches 0 (3.452E-05 ≈ 3.45 × 10^-5 ≈ 0) and is smaller than the significance level (α = 0.05), so we can reject H0. As a result, with a 95% level of confidence, it can be claimed that there is a linear trend in the yearly data of the High-Income country, Singapore.

b. Quadratic

Ŷ = 1876.5 - 0.029 × T - 0.00001 × T²

H0: 𝛽2 = 0 (No quadratic trend in the GNI, total (GNI per head) in High-Income country (1990-2015))
H1: 𝛽2 ≠ 0 (Quadratic trend in the GNI, total (GNI per head) in High-Income country (1990-2015) observed)

c. Exponential

H0: β1 = 0 (There is no exponential trend in the GNI in Singapore (1990-2015))
H1: β1 ≠ 0 (There is an exponential trend in the GNI in Singapore (1990-2015))

The p-value is 0.48, which is greater than the significance level (α = 0.05), so we cannot reject H0. Thus, at the 5% significance level, there is insufficient evidence of an exponential trend in the GNI of the High-Income country Singapore from 1990 to 2015. The fitted exponential model is:

- Linear format: log(Ŷ) = 1.876 - 0.0299 × T
- Non-linear format: Ŷ = 75.229 × (0.9332)^T

𝛽1 = 0.9332. Thus, (0.9332 - 1) × 100% = -6.68% is the estimated annual compound growth rate of the total GNI of Singapore (1990-2015).

APPENDIX 4: Hypothesis testing for trend models in America

Appendix 4.1: Honduras

a. Linear

H0: 𝛽1 = 0 (There is no linear trend in the GNI of Honduras)
H1: 𝛽1 ≠ 0 (There is a linear trend in the GNI of Honduras)

The p-value (0.0001) is much smaller than the significance level, 𝛼 (0.05). Therefore, we reject the null hypothesis H0 in favor of H1. This means that, with a 95% level of confidence, there is sufficient evidence to confirm that the linear trend is a significant trend model representing the total Gross National Income of the Low-Income country, Honduras (1990-2015).

b. Quadratic

H0: 𝛽2 = 0 (No quadratic trend in the GNI, total (GNI per head) in Low-Income country (1990-2015))
H1: 𝛽2 ≠ 0 (Quadratic trend in the GNI, total (GNI per head) in Low-Income country (1990-2015) observed)

c. Exponential

H0: 𝛽1 = 0 (No exponential trend in the Gross National Income, total in Low-Income country (1990-2015))
H1: 𝛽1 ≠ 0 (Exponential trend in the Gross National Income, total in Low-Income country (1990-2015) observed)

As seen in the regression output above, the p-value of the variable T equals 0.02, which is smaller than the significance level, 𝛼 (0.05). Therefore, we reject H0 in favor of H1. This means that, with a 95% level of confidence, there is sufficient evidence to confirm that the exponential trend is also a significant trend model representing the Gross National Income.

Appendix 4.2: United States

a. Linear

H0: β1 = 0 (There is no linear trend in the total GNI in the USA)
H1: β1 ≠ 0 (There is a linear trend in the total GNI in the USA)

Based on the calculations, the p-value of the linear trend model is 0.069, which is greater than the significance level (α = 0.05), so we cannot reject H0. As a result, at the 5% significance level, there is insufficient evidence of a linear trend in the yearly data of the High-Income country, the USA.

b. Quadratic

H0: 𝛽2 = 0 (No quadratic trend in the GNI, total (GNI per head) in High-Income country (1990-2015))
H1: 𝛽2 ≠ 0 (Quadratic trend in the GNI, total (GNI per head) in High-Income country (1990-2015) observed)

As seen in the regression output above, the p-value of the variable T² equals 4.674 × 10^-10, which is much smaller than the significance level, 𝛼 (0.05). Therefore, we reject H0 in favor of H1. This means that, with a 95% level of confidence, there is sufficient evidence to confirm that the quadratic trend is also a significant trend model representing the GNI, total (GNI per head) of the High-Income country, the USA, from 1990 to 2015.

c. Exponential

H0: β1 = 0 (There is no exponential trend in the GNI in the USA (1990-2015))
H1: β1 ≠ 0 (There is an exponential trend in the GNI in the USA (1990-2015))

Similarly, we can reject H0 because the p-value, 8.71 × 10^-22, is smaller than the significance level (α = 0.05). Thus, we are 95% confident that there is an exponential trend in the GNI of the High-Income country USA from 1990 to 2015.

- Linear format: log(Ŷ) = 1.876 - 0.0299 × T
- Non-linear format: Ŷ = 1.923 × (0.97)^T

𝛽1 = 0.97. Thus, (0.97 - 1) × 100% = -3% is the estimated annual compound growth rate of the total GNI of the USA (1990-2015). This shows that for every one year, on average, the total Gross National Income of the High-Income country, USA (1990-2015), is predicted to decrease by 3%.

APPENDIX 5: Prediction for GNI in 2021, 2022, 2023

As the quadratic trend model is determined to be the most reliable trend model for estimating the average GNI per head of all four countries, their total GNI per head in 2021, 2022, and 2023 is calculated from the quadratic trend model formulas as follows:

LI-Nepal: Ŷ = 1.3575 - 0.0043 × T - 0.00001 × T²
LI-Honduras: Ŷ = -0.25435 - 0.02 × T - 0.00004 × T²
HI-Singapore: Ŷ = 1.8765 - 0.029 × T - 0.00001 × T²
HI-USA: Ŷ = 0.2839 - 0.009 × T - 0.00003 × T²

Ŷ: GNI per head, the predicted value of total GNI in the country
T: the independent variable, the time period corresponding to the years 2021, 2022, 2023

Predicted average GNI per head:

Year | T | LI-Honduras | HI-Singapore | HI-America | LI-Nepal
2021 | 21 | 3.712 | 2.893 | 1.772 | 1.533
2022 | 22 | 3.646 | 2.722 | 1.699 | 1.484
2023 | 23 | 3.917 | 2.667 | 1.938 | 1.411
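The arithmetic behind the two trend formulas used in the appendices, the quadratic forecast Ŷ = b0 + b1·T + b2·T² and the annual compound growth rate (b1 - 1) × 100% of an exponential trend, can be sketched as follows. The function names are made up for this illustration, and the quadratic coefficients in the example are hypothetical rather than taken from the report; only the exponential base 0.9332 comes from Appendix 3.2.

```python
def quadratic_trend(b0, b1, b2, t):
    """Evaluate a quadratic trend model b0 + b1*T + b2*T^2 at period t."""
    return b0 + b1 * t + b2 * t ** 2


def annual_compound_growth_rate(base):
    """Growth rate in percent implied by an exponential trend Y = a * base^T."""
    return (base - 1) * 100


# Hypothetical coefficients, chosen only to show the calculation.
hypothetical_forecast = quadratic_trend(b0=2.0, b1=0.5, b2=0.01, t=21)

# Base of the exponential trend reported for Singapore in Appendix 3.2.
singapore_growth = annual_compound_growth_rate(0.9332)

print(round(hypothetical_forecast, 3))
print(round(singapore_growth, 2))
```

A base below 1 gives a negative growth rate, which is why the report reads 0.9332 as an estimated decline of 6.68% per year.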