(ESSAY) Therefore, increasing internet usage should be included as one of the critical components of the new economy model for the policy's vision and purpose to be realized in the future
ASSIGNMENT COVER PAGE

Subject Code: ECON1193B
Subject Name: Business Statistics
Location of campus: Sai Gon South Campus
Title of Assignment: Individual case study – Inferential Statistics
Dataset: Internet usage – dataset
Student Name: Nguyen Phuong Thao
Student Number: s3891607
Lecturer: Ha Thanh Nguyen
Assignment Due Date: August 22nd, 2021
Date of Submission: August 22nd, 2021
Number of pages (including this one): 12 pages
Word Count: 2960

Part 1: Introduction

Nowadays, the internet plays an essential role in people's lives. Connecting billions of people across the world, it is a core part of today's information society. The worldwide penetration rate grew from almost 17% in 2005 to over 53% in 2019. However, because certain parts of the globe have reached saturation levels, worldwide growth rates have slowed in recent years (ITU Telecommunication Development Bureau 2019).

As an information distribution system, the internet can deliver education and knowledge to everyone. It also creates substantial new economic prospects as well as the potential for more environmentally friendly choices in the marketplace. Furthermore, the internet can help enterprises in developing countries join the development mainstream. It holds great promise for easing the delivery of essential services such as health and education, which are now unevenly distributed (United Nations Department of Economic and Social Affairs 2007). To illustrate, around 87 percent of individuals in developed countries are online. In the least developed countries (LDCs), on the other hand, just 19% of individuals had an internet connection in 2019. Furthermore, Europe has the highest internet usage rates, while Africa has the lowest (ITU Telecommunication Development Bureau 2019).

Besides, the United Nations has created the 2030 Agenda with 17 Sustainable Development Goals (SDGs). One of these is Goal 8, “promoting sustained, inclusive, and sustainable economic
growth, full” (United Nations Department of Economic and Social Affairs n.d.). For these reasons, to accomplish the United Nations' SDG 8, it is critical to keep track of who is using the internet. According to Chong, Liew and Suhaimi (2012), there is a significant long-run and short-run connection between gross national income and the internet usage rate. More specifically, investing in Information and Communications Technology (ICT) infrastructure, and particularly encouraging increased internet usage, helps raise gross national income per capita. Therefore, increasing internet usage should be included as one of the critical components of the New Economy Model for the policy's vision and purpose to be realized in the future.

Part 2: Descriptive Statistics and Probability

- The 33 nations are divided into three groups according to their gross national income (GNI):
+ Low-Income countries (LI): countries with a per capita GNI of less than $1,000
+ Middle-Income countries (MI): countries with a per capita GNI of between $1,000 and $12,500
+ High-Income countries (HI): countries with a per capita GNI of more than $12,500
- Within these three categories, the nations are further split into two groups: “low usage of internet” countries (L), in which no more than 40% of the population uses the internet, and “high usage of internet” countries (H), in which more than 40% of the population uses the internet.

A. Probability

                              LI     MI     HI     Total
Low usage of internet (L)      4      6      0      10
High usage of internet (H)     0     13     10      23
Total                          4     19     10      33

Table 1: Contingency table of internet usage for each country category

a. To see whether income and internet usage are statistically independent events, we must evaluate the conditional probability of low internet usage given a low-income nation, P(L | LI), where L denotes the event under
examination and LI denotes the conditioning event. Furthermore, the probability that a nation has low internet usage is P(L). The two probabilities are then compared.

$$P(L) = \frac{10}{33} \approx 0.303$$

$$P(L \mid LI) = \frac{P(L \cap LI)}{P(LI)} = \frac{4/33}{4/33} = 1$$

$$P(L \mid LI) \neq P(L)$$

The calculation shows that income and internet usage are statistically dependent events, because the probability of low internet usage given a low-income nation, P(L | LI), differs from the overall probability of low internet usage, P(L). This demonstrates that the two events affect each other. The same holds for the other country groups. As a result, the gross national income of each country is dependent on individuals' use of the internet.

b. To determine which country category has higher internet usage, we must calculate the probability of high internet usage for each category and compare the results.

$$P(H \mid LI) = \frac{0/33}{4/33} = 0$$

$$P(H \mid MI) = \frac{13/33}{19/33} = \frac{13}{19} \approx 0.684$$

$$P(H \mid HI) = \frac{10/33}{10/33} = 1$$

$$P(H \mid LI) < P(H \mid MI) < P(H \mid HI)$$

- As a result, the chance of a low-income nation having high internet usage is 0%. In contrast, the probability of high internet usage is 100% in high-income countries, compared with 68.4% in middle-income countries. In other words, the country groups with higher GNI have higher internet usage rates. Therefore, governments should raise the percentage of citizens using the internet to improve GNI and boost the economy.

B. Descriptive statistics

        Min       Lower bound    Max       Upper bound    Result
LI      4.339     > -1.78        18.618    < 18.859       No outliers
MI      8.478     > -4.502       71.391    < 95.86        No outliers
HI      67.096    > 65.996       97.099    < 101.1127     No outliers

Table 2: Measures for identifying outliers in internet usage for each country category

To get the most accurate descriptive analysis, the data set must be carefully examined for outliers. The bounds in Table 2 are consistent with the 1.5 × IQR rule (Q1 − 1.5 × IQR and Q3 + 1.5 × IQR). Because every minimum lies above its lower bound and every maximum lies below its upper bound, there are no extreme values in the three country categories.

a. Measurement of Central Tendency

Central Tendency    Low-Income    Middle-Income    High-Income
Mean                9.519         45.949           82.647
Median              7.56          49.966           81.535
Mode                N/A           N/A              N/A

Table 3: Measurements of central tendency of internet usage for each country category in 2017

The mean is the most suitable measure of central tendency here. It is calculated from all the values in the data and can be treated further mathematically. Moreover, it is simple for non-technical audiences to understand (Gholba 2012). Furthermore, although the mean is sensitive to outliers, there are no extreme values in the dataset (Table 2). Table 3 shows that low-income nations have a lower mean than middle-income countries, and middle-income countries have a lower mean than high-income ones. This suggests a link between internet usage and GNI: countries with higher income have a higher mean percentage of individuals using the internet.

b. Measurement of Variation

Variation                       Low-Income    Middle-Income    High-Income
Range                           14.279        62.913           30.003
Interquartile Range             5.16          25.09            8.78
Variance                        39.848        365.984          75.841
Standard Deviation              6.313         19.131           8.709
Coefficient of Variation (%)    66.313        41.635           10.537

Table 4: Measures of variation of internet usage for each country category in 2017

The most suitable measure of variation is the standard deviation, S. It is the most commonly used measure of variation. It shows the variation around the mean and takes all the values in the dataset into account. Furthermore, its units are the same as those of the original data (Descriptive Statistics 2021). Although S is sensitive to extreme values, there are no outliers in this dataset (Table 2). Table 4 demonstrates that the S of middle-income countries is the highest (19.131%), followed by the high-income and low-income countries at 8.709% and 6.313%, respectively. More specifically, the values of middle-income countries are more spread out from the mean than those of the other two country categories. In other words, internet usage in MI countries may lie further (higher or lower) from the mean than in the other groups. By contrast, HI countries have a smaller S relative to their mean, so they are more likely to have a high usage rate
(the mean of HI countries is 82.647%, with a low spread around the mean of 8.709%). Consequently, countries should promote internet usage to stimulate GNI and the economy.

Part 3: Confidence Intervals

a. Calculating the confidence interval for the worldwide average of individuals using the Internet (percent of population)

- The confidence level in this part is assumed to be 95%, since it is the most commonly used confidence level (Hazra 2017).

Population standard deviation    σ               unknown
Sample standard deviation        S               27.773
Sample mean                      X̄               52.654
Sample size                      n               33
Significance level               α               0.05
Confidence level                 (1 − α) × 100%  95%
Degrees of freedom               d.f.            32
t-critical value                 t               ± 2.0369

Table 5: Summary of data regarding the global average of people who use the internet

$$\text{Confidence interval} = \bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} = 52.654 \pm 2.0369 \times \frac{27.773}{\sqrt{33}}$$

$$42.806 \le \mu \le 62.502$$

We are 95% confident that the world average of internet users is between 42.806 and 62.502 percent of the population.

b. Assumption

Since the sample size is large enough (n = 33 > 30), we can apply the Central Limit Theorem (CLT) regardless of whether the population is normally distributed: the sampling distribution of the sample mean is approximately normal. No further assumptions are required.

c. Assume we know the worldwide standard deviation of Internet users

In this case the world standard deviation of internet users, that is, the population standard deviation σ, is provided, so we use the z-table instead of the t-table. In other words, the t-critical value (Part 3a) is replaced by the z-critical value, and the new formula is:

$$\bar{X} \pm Z\,\frac{\sigma}{\sqrt{n}}$$

Besides, critical z-values are smaller than critical t-values for any given degree of confidence. Confidence intervals are narrower when critical values are smaller. A broader interval, on the other hand, is a more cautious interval (McEvoy 2018). Furthermore, the confidence interval is defined by its margin of error. Therefore, when
the width of the confidence interval shrinks, the margin of error decreases too. This leads to more precise results (Simundic 2008). If σ, the world standard deviation of individuals using the internet, is known, the confidence interval will be narrower and more precise.

Part 4: Hypothesis Testing

a. Hypothesis Testing (CV approach)

In 2016, the population mean for internet users was 44.7% (percentage of population), according to a World Health Organization survey. Based on the calculations in Part 3, we are 95 percent confident that the global average of internet users is between 42.806 and 62.502 percent. The 2016 figure also lies in this interval. It is therefore unclear whether individuals' use of the internet will increase, decrease, or remain unchanged in the upcoming years. So, we should first run a two-tailed test to check whether internet usage will change or remain unchanged.

Population mean                  μ               44.7
Population standard deviation    σ               unknown
Sample mean                      X̄               52.654
Sample standard deviation        S               27.773
Sample size                      n               33
Confidence level                 (1 − α) × 100%  95%
Significance level               α               0.05
Degrees of freedom               d.f.            32

Table 6: Summary of data regarding the global average of people who use the internet

- Step 1: Check for CLT. 33 countries are included in this case, so the sample size is n = 33, which is higher than 30. Therefore, the CLT is applicable, and the sampling distribution of the mean is approximately normal.
- Step 2: Determine the null hypothesis H0 and the alternative hypothesis H1.
H0: μ = 44.7
H1: μ ≠ 44.7
- Step 3: Determine the kind of test. From the result in Step 2, it is a two-tailed test.
- Step 4: Choose which table to use. The t-table is used because the population standard deviation is unknown.
- Step 5: Determine critical values (CV). In this case, α = 0.05, the degrees of freedom are 32, and the test is two-tailed, so the t-critical value = ± 2.0369.
- Step 6: Calculate the test statistic t.

$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{52.654 - 44.7}{27.773/\sqrt{33}} = 1.645$$

- Step 7: Make the statistical decision.

[Figure: t-distribution curve showing the non-rejection region]

As the curve shows, the t value is in the non-rejection area (-2.0369 < 1.645 < 2.0369). Consequently, we do not reject the null hypothesis H0; there is insufficient evidence to support the alternative hypothesis H1.
- Step 8: Make the managerial decision. As H0 is not rejected, at the 5% significance level there is not enough evidence to conclude that the average global percentage of individuals using the Internet will differ from 44.7 percent in the future.
- Step 9: Discuss the possible error. Since H0 is not rejected, a Type II error might have been committed. In this context the error means: it is concluded that the world average of individuals using the Internet (percentage of population) is 44.7%, but in reality the world average might not be 44.7% in the upcoming years. The world average of people using the internet may change (drop or rise) in the future.

b. Consider the influence of doubling the number of nations in the dataset on hypothesis testing findings

- The sample size (n) will double if the number of nations in the dataset is doubled. More specifically, the degrees of freedom will be affected directly.
- In hypothesis testing, the standard error, which is determined by the sample size, is used to calculate the width of sampling distributions (The University of Texas n.d.). In other words, the standard error represents the distribution's dispersion. The dispersion of the distribution decreases as the sample size increases, and the mean of the distribution moves closer to the population mean (Central Limit Theorem). As a result, the standard error of a sample falls as the sample size grows (Zijing Zhu 2020). So, when the sample size increases, the t-distribution will have a narrower curve. Hence the t-critical values will be pushed closer to the mean, narrowing the non-rejection region. At the same time, the sample mean approaches the actual population mean, and the data distribution becomes less variable, resulting in a lower standard deviation S. With a lower
S and a higher n, the test statistic t' is given by the new formula:

$$t' = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

Then, t and t' will move towards each other. Besides, in this case, the test statistic is far from the critical values. Thus, it is difficult for the test statistic to fall into the rejection area even with these adjustments. As a result, it is reasonable to state that the statistical conclusion will not change.
- Furthermore, statistical power and sample size are positively related. Increasing the sample size enhances power by lowering the standard error, which raises the test statistic value (Introduction to Hypothesis Testing n.d.). The power of the test therefore increases. In other words, the sample will be more representative of the general population if the standard error is lower. The sample size has an inverse relationship with the standard error: the larger the sample size, the smaller the standard error, as the statistic approaches the real value. The standard error is an inferential statistic that indicates the standard deviation of the sample mean. It works as a measure of variation for random variables and offers a measurement of the spread. The estimate will be more exact if the dispersion is smaller (Kenton and Mansa 2020). Consequently, the result will be more accurate.
- The power of the test is 1 − β, where β is the probability of a Type II error. With higher power, we are less likely to commit a Type II error, which is failing to reject the null hypothesis when the null hypothesis is false.

$$P(\text{Reject } H_0 \mid H_0 \text{ is false}) = 1 - P(\text{Fail to reject } H_0 \mid H_0 \text{ is false}) = 1 - \beta$$

Decreasing the Type II error (β) by increasing the sample size increases the power of the test, 1 − β. In other words, the lower β is, the higher the statistical power (Zijing Zhu 2020).

For the reasons above, when the sample size n increases, the statistical decision will remain unchanged and the results of hypothesis testing will be more accurate.

Part 5: Conclusion

Key
findings from the analysis and calculation of individuals using the internet at three different income levels in 33 countries are listed below.

- In Part 1, there is an upward trend in internet use across the whole world, and there is a link between internet usage and gross national income. In countries with high or middle income, the percentage of individuals using the internet is higher than in low-income countries. GNI indicates the state of a country's economy. So, governments should encourage citizens to access more education and pay attention to developing the internet (for example, high-speed internet, 5G, and so on) to boost the economy.
- To strengthen this relationship, Part 2 concludes that GNI and internet usage are dependent events, because the conditional probability of low internet usage given a low-income country is not equal to the probability of low internet usage (P(L | LI) ≠ P(L)). In addition, the probability of high internet usage is 100% in high-income nations, compared with 68.4% and 0% in middle- and low-income nations, respectively (P(H | LI) < P(H | MI) < P(H | HI)). The descriptive statistics also show the link between GNI and internet usage. In measuring central tendency for low-, middle-, and high-income countries, the mean of individuals using the internet (% of the population) is 9.519%, 45.949%, and 82.647%, respectively. This demonstrates the substantial difference in internet usage between the three major groups of countries.
- Besides, in Part 3, we are 95% confident that the world average of individuals using the internet is between 42.806 and 62.502% of the population. The 2016 figure of 44.7% of people in the world using the internet lies in this confidence interval. The hypothesis test in Part 4 does not reject the hypothesis that the rate will remain unchanged in the upcoming years. Moreover, the sample size in this case is 33 countries; if that number increases (doubling, for instance), the results will be more accurate. For more
details, increasing the sample size n will decrease the Type II error β, so the power of the test, 1 − β, will increase.

In conclusion, the above analysis shows that the percentage of individuals who use the internet has a significant link with gross national income (GNI). The more people access the internet, the higher the income, because people who can access the internet can also access higher education. This supports sustainable economic development, which is one of the Sustainable Development Goals of the United Nations. Therefore, countries should invest in advancing the internet and encourage its use in order to develop the economy sustainably.

REFERENCES

Chong, F, Liew, VKS and Suhaimi, R 2012, The Relationship between Internet Usage and Gross National Income of an Emerging Economy, ISBN: 978-1-61804-124-1, Advances in Finance and Accounting, Malaysia, viewed 15 August 2021, <https://www.researchgate.net/publication/301549358_The_Relationship_between_Internet_Usage_and_Gross_National_Income_of_an_Emerging_Economy>.

‘Descriptive Statistics’ 2021, PowerPoint slides for ECON1193B Business Statistics 1, RMIT University, Vietnam, viewed 15 August 2021, Dashboard@RMIT.

Gholba, MJ 2012, Measures of Central Tendency, DSpace, viewed 14 August 2021, <http://dspace.vpmthane.org:8080/jspui/bitstream/123456789/2836/1/Measures%20of%20Central%20Tendency.pdf>.

Hazra, A 2017, ‘Using the confidence interval confidently’, Journal of Thoracic Disease, vol. 9, no. 10, pp. 4125, viewed 16 August 2021, <https://www.researchgate.net/publication/320742650_Using_the_confidence_interval_confidently>.

Introduction to Hypothesis Testing n.d., Sage Journals, viewed 17 August 2021, <https://www.sagepub.com/sites/default/files/upmbinaries/40007_Chapter8.pdf?fbclid=IwAR2xGVOGRFd81Db27959ErvWoS3jqQeq1Eh0iDWL5jNJ6njt4qydv_TqSU>.

ITU Telecommunication Development Bureau 2019, Measuring digital development, ISBN 978-92-61-29521-9, ITU, ITU Publications, Switzerland.

Kenton, W and Mansa, J 2020, Standard Error, Investopedia, viewed 18 August 2021, <https://www.investopedia.com/terms/s/standarderror.asp>.

McEvoy, DM 2018, A Guide to Business Statistics, Newark: John Wiley & Sons, Incorporated, Knovel Complete database.

Simundic, AM 2008, ‘Confidence interval’, Biochemia medica, vol. 18, no. 2, viewed 15 August 2021, <https://www.biochemiamedica.com/en/journal/18/2/10.11613/BM.2008.015/fullArticle>.

The University of Texas n.d., Lesson 2.2 – Hypothesis Testing, The University of Texas – UTHealth, viewed 17 August 2021,

United Nations Department of Economic and Social Affairs 2007, Indicators of Sustainable Development: Guidelines and Methodologies, 3rd edn, United Nations, viewed 14 August 2021, <https://sustainabledevelopment.un.org/content/documents/methodology_sheets.pdf>.

United Nations Department of Economic and Social Affairs n.d., The 17 goals, United Nations, viewed 14 August 2021, <https://sdgs.un.org/goals?fbclid=IwAR0hkIkb5HK7z1GGbYsDCOExhxDMoIHaN2sLQWPa9EweW04r_Mg3_e4YqgQ>.

Zijing Zhu 2020, How is Sample Size Related to Standard Error, Power, Confidence Level, and Effect Size?, towards data science, viewed 17 August 2021, <https://towardsdatascience.com/how-is-sample-sizerelated-to-standard-error-power-confidence-level-and-effect-sizec8ee8d904d9c>.
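As a cross-check of the arithmetic in Parts 2 to 4, the following is a minimal Python sketch, not the method the report itself used. It recomputes the conditional probabilities, the 95% confidence interval, and the t test statistic from the summary figures quoted in the report. The variable and function names are hypothetical. The cell counts of 6 and 0 are an assumption derived from the row and column totals of Table 1, and the critical value 2.0369 is taken from the report's t-table lookup rather than computed.

```python
import math

# Contingency counts for Table 1 (rows: internet usage, columns: income group).
# Assumption: the 6 and the zeros are derived from the table's totals.
counts = {
    "L": {"LI": 4, "MI": 6, "HI": 0},
    "H": {"LI": 0, "MI": 13, "HI": 10},
}


def marginal(usage):
    """P(usage): share of all nations in the given usage group."""
    total = sum(sum(row.values()) for row in counts.values())
    return sum(counts[usage].values()) / total


def conditional(usage, income):
    """P(usage | income): share of the income group in the given usage group."""
    group_total = sum(row[income] for row in counts.values())
    return counts[usage][income] / group_total


# Summary statistics quoted in the report's confidence-interval table.
n = 33             # sample size
x_bar = 52.654     # sample mean (% of population)
s = 27.773         # sample standard deviation
t_crit = 2.0369    # two-tailed critical value for d.f. = 32, as quoted in the report
mu_0 = 44.7        # hypothesised population mean (2016 figure)

# Standard error of the mean, S / sqrt(n).
std_error = s / math.sqrt(n)

# 95% confidence interval: x_bar +/- t_crit * S / sqrt(n).
ci_lower = x_bar - t_crit * std_error
ci_upper = x_bar + t_crit * std_error

# Test statistic for H0: mu = mu_0 against H1: mu != mu_0.
t_stat = (x_bar - mu_0) / std_error

# Two-tailed decision rule: reject H0 only if |t| exceeds the critical value.
reject_h0 = abs(t_stat) > t_crit

if __name__ == "__main__":
    print("P(L) =", round(marginal("L"), 3))
    print("P(L | LI) =", conditional("L", "LI"))
    print("P(H | MI) =", round(conditional("H", "MI"), 3))
    print("95% CI =", (round(ci_lower, 3), round(ci_upper, 3)))
    print("t statistic =", round(t_stat, 3))
    print("Reject H0:", reject_h0)
```

Keeping the critical value as a quoted constant keeps the sketch dependency-free; with a statistics library available, it could instead be computed from the degrees of freedom.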