(ESSAY) We collected over 55 countries from both regions at first; the cleaning process is straightforward: any countries with missing variables will be void


RMIT International University Vietnam
ECON1193 - Business Statistics
ASSIGNMENT

Subject code: ECON1193
Subject name: Business Statistics
Lecturer: Teck Lee Yap (Stanley)

Name | Student ID | Part contributed | Contribution
Nguyen Si Hung | S3878216 | 7, B | 100%
Le Nguyen Hung | S3822922 | | 100%
Andrew Yit | S3881108 | 1, 4 | 100%
Tran Anh Quan | S3877058 | | 100%
Ho Nguyen Phuc | S3878433 | 5, 6 | 100%
Signature

PART 1:

For this part of the assignment, our team had to collect nine variables from the World Bank dataset for the year 2004: GDP per capita growth rate (annual %), Life expectancy at birth, total (years), GNI per capita, Atlas method (current US$), GDP per capita (current US$), Foreign direct investment, net inflow (% of GDP), Exports of goods and services (% of GDP), Imports of goods and services (% of GDP), Trade (% of GDP) and Population (ages 15-64 (total) years). We initially collected data for over 55 countries from both regions. The cleaning process is straightforward: any country with a missing variable is removed. The following figures show how the data were cleaned.

Figure 1: How to download data from the World Bank
Figure 2: World Bank data of GDP (growth annual %)

Once the countries were chosen, we downloaded the file in Excel format, as shown in the figure above. The raw data in the downloaded Excel file looked messy, so we rearranged them into a layout that is much easier to read.

Figure 3: World Bank missing information

Next, we marked in red every country with a missing variable, as shown in the figure above, so that we knew which countries to delete.

Figure 4: Chosen country data

Finally, after removing all of these countries, we finalised the list of chosen countries, as shown above.

PART 2: DESCRIPTIVE STATISTICS

Measures of Central Tendency:

• European countries:
Q3 + 1.5*IQR = 11.7666257545
Max = 9.28288775 < 11.7666257545 => Max < Q3 + 1.5*IQR => Max is not an outlier in the upper values
Q1 - 1.5*IQR = -3.7394
Min = 1.105911322 > -3.7394 => Min > Q1 - 1.5*IQR => Min is not an outlier in the lower values

• African countries:
Q3 + 1.5*IQR = 7.1876624645
Max = 6.489599681 < 7.1876624645 => Max < Q3 + 1.5*IQR => Max is not an outlier in the upper values
Q1 - 1.5*IQR = -2.1719459235
Min = -6.10287512 < -2.1719459235 => Min < Q1 - 1.5*IQR => Min is an outlier in the lower values

The average (mean) GDP per capita growth rate of European countries in 2004 was much higher than that of African countries (4.218861687% > 1.958022018%). Because the African data contain an outlier, the mean is no longer the best measure of central tendency. Meanwhile, no mode could be identified for either European or African countries. Therefore, the median is the best measure for analysing the GDP per capita growth rate (annual %) of European and African countries, since it is not affected by outliers. From Table 1, European countries have a higher median than African ones, at about 3.93% compared with about 3%. From this comparison, it can be said that European countries had a higher GDP per capita growth rate (annual %) than countries in Africa.

Measures of Variation

Table 2: Measures of Variation of GDP per capita growth rates of country categories in 2004 (annual %)

The range of European countries is smaller than that of African countries (8.176976428 < 12.5924748), whereas the IQR of European countries is larger than that of African countries (3.876496621 > 2.339902097). The coefficient of variation of European countries is considerably lower than that of African countries: almost 59.5% compared with nearly 160%. Both values are fairly high, implying that the data are widely dispersed around the mean in both regions. In other words, the GDP per capita growth rates recorded in 2004 varied considerably within both the African and the European group.
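The outlier checks and the coefficient of variation above can be reproduced with a short script. The following is a minimal sketch, not the team's actual workbook: the sample values are hypothetical, the function name `describe` is ours, and the "inclusive" quartile method is an assumption about how the quartiles were computed in the spreadsheet.

```python
import statistics


def describe(values):
    """Return the 1.5*IQR fences, the flagged outliers and the CV (in %)."""
    # Assumption: quartiles follow the "inclusive" interpolation method.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    # Coefficient of variation = sample standard deviation / mean * 100.
    cv_percent = statistics.stdev(values) / statistics.mean(values) * 100
    return {
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
        "cv_percent": cv_percent,
    }


# Hypothetical growth rates (annual %), NOT the assignment's data.
sample = [-6.1, 1.2, 1.5, 2.0, 2.4, 3.0, 3.3, 3.9, 4.5]
result = describe(sample)
print(result)
```

A value is flagged only when it falls strictly outside a fence, which mirrors the comparisons of Min and Max against Q1 - 1.5*IQR and Q3 + 1.5*IQR made above.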
Figure 1: Box and Whisker graph of GDP per capita growth rates of country categories in 2004 (annual %)

According to Figure 1, the most obvious point is that African countries had the lowest GDP per capita growth rate, whereas European countries had the highest. The box plot of European countries is right-skewed, whereas the box plot of African countries is left-skewed. No European country had a negative GDP per capita growth rate, whereas the minimum GDP per capita growth rate among African countries was -6.10287512 (Table 3). All five summary values of European countries (min, Q1, median, Q3 and max) are higher than those of African countries.

PART 3: MULTIPLE REGRESSION (2004)

In this part, we use backward elimination to analyse the regression of Region A (Europe) and Region B (Africa). The final regression model after applying backward elimination includes only the variable(s) that are significant at the 5% level.

Regression Output and Scatter Plots

• Region A: Europe

Figure X: Final regression model of Europe

[Line fit plot: GDP per capita growth rate (annual %) against Life expectancy at birth, total (years), showing actual values, predicted values and a linear trendline]

Figure Y: The scatter plot of GDP per capita growth rate (annual %) and Life expectancy at birth, total (years) of European countries

As shown in Figure Y:
• The Life expectancy at birth, total (years) values of the European countries in the 2004 dataset were all higher than 60 years. The points were quite close to one another, showing that the variation in life expectancy at birth among European nations was not very large.
• The trendline had a decreasing slope, indicating a negative relationship between GDP per capita growth rate and life expectancy at birth.

Figure Z: The scatter plot of GDP per capita growth rate (annual %) and Population (ages 15-64 (total) years) of European countries

As shown in Figure Z:
• Most European countries recorded a Population (ages 15-64 (total) years) below 60 million, whereas two of them were outliers with more than 100 million.
• The trendline had an increasing slope, indicating a positive relationship between GDP per capita growth rate and Population.

From both Figures Y and Z, we can see that no European country recorded a negative GDP per capita growth rate, as the points are all greater than 0. The points were also quite far from each other; in particular, there were two outliers with almost 11% - 12% GDP per capita growth.

• Region B: Africa

Figure 1: Final regression model of Africa

[Line fit plot: GDP per capita growth rate (annual %) against Foreign direct investment, net inflow (% of GDP), showing actual values, predicted values and a linear trendline]

Figure 2: The scatter plot of GDP per capita growth rate (annual %) and Foreign direct investment, net inflow (% of GDP) of African countries

As shown in Figure 2:
• The majority of African countries had positive Foreign direct investment, net inflow (% of GDP), while one had a negative value. The points were also spread out along the axis, which means that there are some gaps in Foreign direct investment, net inflow among African nations.
• As with Foreign direct investment, net inflow (% of GDP), the GDP per capita growth rates recorded by African countries were positive, except for one country with a negative result.

Region A: Europe
High-income country: Netherlands
Regression
model: Linear regression trend (LIN)
A. Regression output:
B. Formula and coefficient explanation:
Formula: Y^ = 17168 + 1486.59*T
Coefficient explanation:
b0 = 17168 is the estimated GDP when T = 0. This has no meaningful interpretation because the observed range does not include T = 0.
b1 = 1486.59 is the estimated increase in GDP for each one-period increase in T.

Quadratic regression trend (QUA):
A. Regression output:
B. Formula and coefficient explanation:
Formula: Y^ = 17328.75 - 1452.26*T + 1.27*(T^2)
Coefficient explanation:
b1 = -1452.26 is the estimated annual change in the GDP of the Netherlands when T = 0. Since T = 0 is not in the observed range, it cannot be interpreted.
b2: 2*b2 = 1.27*2 = 2.54 is the estimated amount by which the annual change in the GDP of the Netherlands increases each year.

Exponential regression trend (EXP):
A. Regression output:
B. Formula and coefficient explanation:
Formula:
Linear format: log(Y^) = 4.305 + 0.0179*T
Non-linear format: Y^ = 20183*1.039^T
Annual growth rate: (1.039 - 1)*100 = 3.9%
The GDP of the Netherlands is estimated to increase by 3.9% each year.

Lower-middle-income country: Moldova

Linear regression trend (LIN)
A. Regression output:
B. Formula and coefficient explanation:
Formula: Y^ = -246.4711 + 158.653*T
Coefficient explanation:
b0 = -246.4711 is the estimated GDP when T = 0. This has no meaningful interpretation because the observed range does not include T = 0, so it is not related to the trend.
b1 = 158.653 is the estimated increase in GDP for each one-period increase in T.

Quadratic regression trend (QUA)
A. Regression output:
B. Formula and coefficient explanation:
Formula: Y^ = 479.038 - 30.609*T + 8.6*(T^2)
Coefficient explanation:
b1 = -30.609 is the estimated annual change in the GDP of Moldova when T = 0. Since T = 0 is not in the observed range, it cannot be interpreted.
b2: 2*b2 = 8.6*2 = 17.2 is the estimated amount by which the annual change in the GDP of Moldova increases each year.

Exponential regression trend (EXP)
A. Regression output:
B. Formula and coefficient explanation:
Formula:
Linear format: log(Y^) = 2.5206 + 0.049*T
Non-linear format: Y^ = 331.5*1.119^T
Annual growth rate: (1.119 - 1)*100 = 11.9%
The GDP of Moldova is estimated to increase by 11.9% each year.

Time series
forecast: after calculating the SSE and MAD of all three trend types, we selected for each country the trend with the smallest SSE and MAD.

Congo:
• Quadratic regression trend
Congo GDP prediction:

South Africa:
• Quadratic regression trend
South Africa GDP prediction:

Moldova:
• Quadratic regression trend
Moldova GDP prediction:

Netherlands:
• Linear regression trend
Netherlands GDP prediction:

PART 6: Time series Conclusion:

Figure 1: Three low- and middle-income countries: South Africa, Moldova and Congo, Dem. Rep.

Description: As the graph shows, the three countries above have different income levels:

Low income: Congo, Dem. Rep. is a country with a low GDP, below 1000. From 2002 to 2015 there was an upward trend and, according to the prediction above, GDP will continue to increase by 8% in each of 2017, 2018 and 2019.

Lower-middle income: Moldova grew rapidly from 2004 to 2014. Based on the calculated projections, it will continue to grow at 7% per year in 2017, 2018 and 2019.

Middle income: South Africa had an upward trend in the early 1990s, then gradually decreased and reached its lowest value of the 1990-2015 period in 2002. It then gradually increased until 2012 and tended to decrease again. According to the prediction above, however, GDP will follow an upward trend from 2017 to 2019, with a growth rate of about 5%.

Figure 2: High-income country: Netherlands

High-income country (Netherlands): GDP grew rapidly from 2000 to 2007 and, according to the prediction above, will continue to grow by 3% in 2018 and 2% in 2019.

Conclusion: Regions A and B follow the same trend line. The similarity is that they all grew rapidly from 2000 onwards.

Predicted world trend: After comparison and analysis, the SSE and MAD of the quadratic regression trend of Congo are the lowest.
• World's quadratic model: Y^ = 282.79 - 24.01*T + 1.28*(T^2)

PART 7: TEAM CONCLUSION:

1 & 2. Predicted GDP Per Capita Growth Rate in 2030:
Applying the formula derived from the quadratic regression trend model of the low-income country, Congo, 2030 will have the T value of 41, which
shows a rate of -0.154, lower than in previous years. The main factors are possibly the level of urban concentration and the uptake of technology.

Recommendations:
Our dataset only includes African and European countries, but there are 195 countries in the world, which means that many places are not evaluated in this research. Expanding the sample size would make the data more reliable. Our data cover several aspects of the GDP per capita growth rate, but other studies point to additional factors:
• Tuğba & Yılmaz (IntechOpen, 2020) demonstrate the importance of the inflation rate and the unemployment rate in economic growth and how they influence GDP per capita.
• Vernon Henderson (Worldbank) suggests that the level of urban concentration contributes to the GDP growth rate, and that this is also related to the level of technology.

References:
• Dayıoğlu, Tuğba, and Yılmaz Aydın, September 2020, Relationship between Economic Growth, Unemployment, Inflation and Current Account Balance: Theory and Case of Turkey, IntechOpen, viewed 30 May 2021.
• Worldbank, How Urban Concentration Affects Economic Growth, Worldbank, viewed 30 May 2021.
• Kingsley Ighobor, August 2012, "African economy capture world attention", [Access online], viewed 29 May 2021.
• The World Bank 2021, DataBank World Development Indicators, viewed 28 May 2021.
• Worldometer 2021, COVID-19 Coronavirus Pandemic, viewed 28 May 2021.

Appendices:

Part 3: Multiple regression
In Part 3, we use the backward elimination method instead of hypothesis testing to build the final regression model. The final regression model includes only the variable(s) that are significant at the 5% level.

a) European countries

Figure A: First regression output (8 variables)

In the first regression, we used all eight independent variables in the model, which resulted in the #NUM!
error. After searching online, we discovered that the problem was due to perfect collinearity, since several independent variables (all except population, GDP per capita, GNI per capita and life expectancy at birth) were expressed as a percentage of GDP. If we combine these variables, we could reach 100 percent, that is, perfect collinearity. As a result, we eliminated the variable Exports of goods and services (% of GDP).

Figure B: Second regression output (7 variables)
Figure C: Third regression output (6 variables)
Figure D: Fourth regression output (5 variables)
Figure E: Fifth regression output (4 variables)
Figure F: Sixth regression output (3 variables)
Figure G: Final regression output (2 variables)

In the last round, we ended the backward elimination and found that Life expectancy at birth, total (years) and Population (ages 15-64 (total) years) are the two significant variables.

b) African countries

Figure H: First regression output (8 variables)

As with the European countries, after building the first regression model with all variables, we received #NUM!
results because of perfect collinearity. We therefore decided to remove the Trade (% of GDP) variable.

Figure I: Second regression output (7 variables)
Figure J: Third regression output (6 variables)
Figure K: Fourth regression output (5 variables)
Figure L: Fifth regression output (4 variables)
Figure M: Sixth regression output (3 variables)
Figure N: Seventh regression output (2 variables)
Figure O: Final regression output (1 variable)

In the last round, we ended the backward elimination and found that Foreign direct investment, net inflow (% of GDP) is the only significant variable.

Part 5: Hypothesis Testing:

Congo:

LIN
H0: b1 = 0 (No linear trend in the GDP per capita of Congo)
H1: b1 ≠ 0 (There is a linear trend in the GDP per capita of Congo)
Description: According to the calculation above, the p-value is 9.9396E-05, which is smaller than the significance level (0.05).
Conclusion: Reject H0 in favour of H1 => There is a linear trend.

QUA
H0: b2 = 0 (No quadratic trend in the GDP of Congo, 1990-2015)
H1: b2 ≠ 0 (There is a quadratic trend in the GDP of Congo, 1990-2015)
Description: According to the calculation, the p-value of Time squared is 9.3776E-06, which is smaller than the significance level (0.05).
Conclusion: Reject H0 in favour of H1 => There is a quadratic trend.

EXP
H0: b1 = 0 (No exponential trend in the GDP of Congo, 1990-2015)
H1: b1 ≠ 0 (There is an exponential trend in the GDP of Congo, 1990-2015)
=> The p-value of b1 is smaller than the significance level (0.05)
=> Reject H0 in favour of H1 => There is an exponential trend.

South Africa

LIN
H0: b1 = 0 (No linear trend in the GDP of South Africa)
H1: b1 ≠ 0 (There is a linear trend in the GDP of South Africa)
Description: According to the calculation, the p-value of b1 is 1.2065E-07, which is smaller than the significance level (0.05).
Conclusion: Reject H0 in favour of H1 => There is a linear trend.

QUA
H0: b2 = 0
(No quadratic trend in the GDP of South Africa)
H1: b2 ≠ 0 (There is a quadratic trend in the GDP of South Africa)
Description: According to the calculation above, the p-value of b2 is 0.04870965, which is smaller than the significance level (0.05).
Conclusion: Reject H0 in favour of H1 => There is a quadratic trend.

EXP
H0: b1 = 0 (No exponential trend in the GDP of South Africa)
H1: b1 ≠ 0 (There is an exponential trend in the GDP of South Africa)
Description: According to the calculation above, the p-value of b1 is 3.0429E-07, which is smaller than the significance level (0.05).
Conclusion: Reject H0 in favour of H1 => There is an exponential trend.

Netherlands

LIN
H0: b1 = 0 (No linear trend in the GDP of the Netherlands)
H1: b1 ≠ 0 (There is a linear trend in the GDP of the Netherlands)
Description: According to the calculation above, the p-value of b1 is 5.7268E-11, which is smaller than the significance level (0.05).
Conclusion: Reject H0 in favour of H1 => There is a linear trend.

QUA
H0: b2 = 0 (No quadratic trend in the GDP of the Netherlands)
H1: b2 ≠ 0 (There is a quadratic trend in the GDP of the Netherlands)
Description: The p-value of b2 is larger than the significance level (0.05).
Conclusion: Do not reject H0 => No quadratic trend.

EXP
H0: b1 = 0 (No exponential trend in the GDP of the Netherlands)
H1: b1 ≠ 0 (There is an exponential trend in the GDP of the Netherlands)
Description: The p-value of b1 is 4.9883E-12, which is smaller than the significance level (0.05)
=> Reject H0 in favour of H1 => There is an exponential trend in the GDP of the Netherlands.

Moldova

LIN
H0: b1 = 0 (No linear trend in the GDP of Moldova)
H1: b1 ≠ 0 (There is a linear trend in the GDP of Moldova)
Description: According to the calculation above, the p-value of b1 is 3.0856E-09, which is smaller than the significance level (0.05).
Conclusion: Reject H0 in favour of H1 => There is a linear trend.

QUA
H0: b2 = 0 (No quadratic trend in the GDP of Moldova)
H1: b2 ≠ 0 (There is a quadratic trend in the GDP of Moldova)
Description: The p-value of b2 is 0.0006886, which is smaller than the significance level (0.05).
Conclusion: Reject H0 in favour of H1 => There is a quadratic trend.

EXP
H0: b1 = 0 (No exponential trend in the GDP of Moldova)
H1: b1 ≠ 0 (There is an exponential trend in the GDP of Moldova)
Description: The p-value of b1 is 6.399E-10, which is smaller than the significance level (0.05).
Conclusion: Reject H0 in favour of H1 => There is an exponential trend in the GDP of Moldova.
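The trend tests in this part all follow the same pattern: fit a trend by least squares, test whether the slope differs from zero at the 5% level, and compare the fits by SSE and MAD. The sketch below illustrates that workflow under stated assumptions. It uses a hypothetical series rather than the World Bank data, the helper names `fit_line` and `sse_and_mad` are ours, and it compares the t statistic with the two-sided 5% critical value instead of reading a p-value from regression output (the two decisions are equivalent).

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, t statistic of b1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    t_stat = b1 / se_b1 if se_b1 > 0 else math.inf
    return b0, b1, t_stat


def sse_and_mad(actual, predicted):
    """Sum of squared errors and mean absolute deviation of a fit."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return sum(e * e for e in errors), sum(abs(e) for e in errors) / len(errors)


# Hypothetical series, NOT the World Bank data: 26 periods growing 5% a year.
T = list(range(1, 27))
Y = [100 * 1.05 ** t for t in T]

# Linear trend (LIN): H0: b1 = 0 against H1: b1 != 0.
lin_b0, lin_b1, lin_t = fit_line(T, Y)
T_CRIT = 2.064  # two-sided 5% critical value of Student's t, 26 - 2 = 24 df
reject_h0 = abs(lin_t) > T_CRIT

# Exponential trend (EXP): log10(Y) = b0 + b1*T, so Y = 10**b0 * (10**b1)**T.
exp_b0, exp_b1, _ = fit_line(T, [math.log10(y) for y in Y])
growth_pct = (10 ** exp_b1 - 1) * 100  # estimated annual growth rate in %

lin_sse, lin_mad = sse_and_mad(Y, [lin_b0 + lin_b1 * t for t in T])
exp_sse, exp_mad = sse_and_mad(Y, [10 ** (exp_b0 + exp_b1 * t) for t in T])
print(reject_h0, round(growth_pct, 2), exp_sse < lin_sse, exp_mad < lin_mad)
```

The growth-rate line mirrors the conversion used for the exponential trends in the report, where the base of the non-linear format minus one, times 100, gives the annual growth rate. The critical value would need to change if the series had a different number of observations.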

Date posted: 02/12/2022, 18:14