RMIT International University Vietnam
Assignment Cover Page

Subject Code: ECON1193B
Subject Name: Business Statistic
Location & Campus (SGS or HN) where you study: SGS
Title of Assignment: Individual Case Study - Inferential Statistics
Student name: Nguyễn Trần Gia Toại
Student Number: s3929314
Teachers Name: Lien Dau Thi Mai
Assignment due date: 8/22/2022
Date of Submission: 8/22/2022
Number of pages including this one: 22
Word Count: 2928

I declare that in submitting all work for this assessment I have read, understood and agreed to the content and expectations of the Assessment Declaration.
I give RMIT University permission to use my work as an example and for showcase/exhibition display forever.

TABLE OF CONTENT:
I/ INTRODUCTION
1.1/ Description of “crude birth rate”
1.2/ Importance of observing the birth rate
1.3/ Relationship between the crude birth rate and Gross National Income (GNI)
1.4/ Other factors of birth rate description
II/ DESCRIPTIVE STATISTIC AND PROBABILITY
2.1/ Check the independent events
2.2/ Which country categories are likely to have a higher birth rate?
2.3/ Best descriptive measures
a) Measures of Central Tendency
b) Measures of Variation
c) Box-and-Whisker Plot
III/ CONFIDENCE INTERVALS
IV/ HYPOTHESIS TESTING
a) Discuss the possible error committed, its consequence and how this error can be minimized
b) Considering the number of countries is increased by 50%
V/ OVERALL CONCLUSION
VI/ EXTENSION
VI/ REFERENCE LISTS

I/ INTRODUCTION
1.1/ Description of “crude birth rate”:
- The “crude birth rate” (CBR) is the number of live births per 1,000 of the estimated midyear population (World Bank n.d.).
It only serves as a crude estimate of fertility, given that not all members of the denominator group are exposed to the possibility of becoming pregnant (Kirch 2008). In other words, the CBR does not recognize age or sex differences among the population (Rosenberg 2019). Although crude birth rates vary from country to country, the world witnessed a downward trend from 1970 to 2015 (Figure 1) (World Bank n.d.). However, the United Nations (UN) anticipates that by 2050 there will be 9.8 billion individuals on the planet, in large part because of high birth rates in many developing nations (United Nations 2017).
(Figure 1: Line graph of the world crude birth rate 1970-2015)

1.2/ Importance of observing the birth rate:
- Birth rates have a key role in determining population growth (or decline) and age distribution, both of which have significant socioeconomic implications (Cleland 2008). According to Becker, Murphy, and Tamura (1990), human capital is the main factor influencing economic growth, and there are disparities in endowments of human capital. Using a 107-country panel database spanning 1960–1985, Brander & Dowrick (1994) found that high birth rates could slow economic growth through investment effects and potentially through "capital dilution". Therefore, monitoring the birth rate is also a way of observing a country’s current situation, such as its economy and the well-being of its population. This would assist the government in implementing appropriate policies to improve the country.

1.3/ Relationship between the crude birth rate and Gross National Income (GNI):
- Gross national income (GNI) is the sum of the value added by all domestic producers, plus any product taxes (less subsidies) not included in the valuation of output, plus net primary income receipts (employee compensation and property income) from abroad (WHO 2022).
- There are many reports that have verified the association between the GNI rate and
the crude birth rate. It is claimed that low income countries tend to have a higher birth rate due to the effect of poverty on fertility. High infant mortality, lack of education (among women in particular), inequitable distribution of family income, lack of access to family planning, and other aspects of poverty can all contribute to high fertility (Birdsall 1980). According to a policy brief released by the Population Division of the UN Department of Economic and Social Affairs (DESA), women in the majority of the least developed countries (LDCs) still have five children each on average, despite the fact that birth rates have decreased across the developing world since the 1970s (United Nations 2009). Conversely, developed countries tend to have lower fertility rates because birth control is widely available, death rates are low, and children frequently become an economic burden due to housing, education, and other costs of raising them (Nargund 2009). It can be concluded that the GNI rate and the birth rate of a country have an inverse relationship.

1.4/ Other factors of birth rate description:
- Several factors cause the birth rate to increase or decline. According to Moran (2020), childbearing is affected by factors such as the age at which a woman has her first child, female educational opportunities, access to family planning, and government actions and regulations. For example, women in modern societies tend to have children later in life as a result of higher education and professional employment (Nargund 2009). This can be considered one of the reasons behind the downward trend of birth rates.

II/ DESCRIPTIVE STATISTIC AND PROBABILITY
- In this case, a crude birth rate above 17.5 per 1000 population is considered high.

2.1/ Check the independent events:

| | High birth rate countries (per 1000) | Low birth rate countries (per 1000) | Total |
|---|---|---|---|
| Low GNI countries (LI) | 16 | 0 | 16 |
| Middle GNI countries (MI) | 1 | 11 | 12 |
| High GNI countries (HI) | 0 | 4 | 4 |
| Total | 17 | 15 | 32 |

(Figure 2: Contingency table of three different categories based on income level and birth rate)

- To determine whether the crude birth rate and the GNI rate are statistically independent, the conditional probability of the two associated variables is the most practical measure. Conditional probability, represented by the symbol P(A|B), is the probability of A given that B has happened, where A and B are two events from the sample space of a random experiment (Borovcnik & Kapadia 2009). In this measurement, high birth rate countries (A) and low GNI countries (LI) are the two events being taken into account:

$P(LI) = \frac{16}{32} = \frac{1}{2}$
$P(LI \mid A) = \frac{P(LI \cap A)}{P(A)} = \frac{16/32}{17/32} = \frac{16}{17}$
$P(LI \mid A) \neq P(LI) \quad \left(\frac{16}{17} \neq \frac{1}{2}\right)$

- Dependent events are events that depend on those that happened before. For instance, if one event’s outcome is affected by one that occurred previously, it is called a dependent event (Berenson et al 2012). In this circumstance, the probability of a low GNI country, P(LI), is unequal to the probability of a low GNI country given a high birth rate, P(LI|A). Therefore, the two events are statistically dependent. It also illustrates that being a low GNI country (LI) affects the probability of being a high birth rate country (A). As a consequence, the GNI level and the birth rate among countries are statistically dependent events.

2.2/ Which country categories are likely to have a higher birth rate?
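The independence check in 2.1 and the comparison in this section both follow from the counts in the contingency table (Figure 2). The following is a minimal sketch of that arithmetic, not the method used to produce the report's figures. The cell counts are taken from the probabilities quoted in sections 2.1 and 2.2, and the function names are illustrative assumptions.

```python
from fractions import Fraction

# Country counts per income group: (high birth rate, low birth rate).
# Assumption: cells are inferred from the probabilities quoted in 2.1 and 2.2 (n = 32).
counts = {"LI": (16, 0), "MI": (1, 11), "HI": (0, 4)}

total = sum(high + low for high, low in counts.values())   # all countries
total_high = sum(high for high, _ in counts.values())      # high birth rate countries


def p_group(group):
    """P(group): share of all countries that fall in this income group."""
    high, low = counts[group]
    return Fraction(high + low, total)


def p_group_given_high(group):
    """P(group | A): share of high birth rate countries in this income group."""
    return Fraction(counts[group][0], total_high)


def p_high_given_group(group):
    """P(A | group): share of this income group with a high birth rate."""
    high, low = counts[group]
    return Fraction(high, high + low)


for group in counts:
    print(group, p_group(group), p_group_given_high(group), p_high_given_group(group))

# The events are independent only if P(LI | A) equals P(LI).
li_independent = p_group_given_high("LI") == p_group("LI")
print("LI and A independent:", li_independent)
```

Exact fractions are used instead of floats so that the equality test in the independence check is not affected by rounding. Note that P(A | group) divides by the group's own total, whereas P(group | A) divides by the number of high birth rate countries.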
- To examine which country category is most likely to have a high birth rate, the conditional probabilities of a high birth rate (A) within the three categories (LI, MI, HI) are compared:

$P(A \mid LI) = \frac{P(A \cap LI)}{P(LI)} = \frac{16/32}{16/32} = 1 = 100\%$
$P(A \mid MI) = \frac{P(A \cap MI)}{P(MI)} = \frac{1/32}{12/32} = \frac{1}{12} = 0,0833 = 8,33\%$
$P(A \mid HI) = \frac{P(A \cap HI)}{P(HI)} = \frac{0/32}{4/32} = 0 = 0\%$

=> It can easily be seen that P(A|LI) > P(A|MI) > P(A|HI)
- Therefore, countries with a low GNI rate are the most likely to have a high birth rate, with the greatest probability (100%). They are followed by the middle GNI countries with a probability of 8,33%. The high GNI group has no high birth rate countries at all (0%).

2.3/ Best descriptive measures:
a) Measures of Central Tendency

| | Low GNI countries (LI) | Middle GNI countries (MI) | High GNI countries (HI) |
|---|---|---|---|
| Mean | 30,83% | 13,16% | 12,42% |
| Median | 32,79% | 13,34% | 13,11% |
| Mode | #N/A | #N/A | #N/A |

(Figure 3: Measures of Central Tendency table of three country categories in birth rate%)

| | Minimum observation | Comparison | Q1 − 1,5 × IQR | Maximum observation | Comparison | Q3 + 1,5 × IQR |
|---|---|---|---|---|---|---|
| Low GNI group (LI) | 18,64 | > | 3,19 | 41,16 | < | 56,65 |
| Middle income group (MI) | 7,78 | > | 0,99 | 23,65 | > | 23,02 |
| High income group (HI) | 9,40 | > | 4,56 | 14,08 | < | 19,60 |

(Figure 4: Outliers testing table of three country categories)

- According to Figure 3, the mode cannot be used because none of the three categories has a mode. The choice is therefore between the median and the mean, and the presence of outliers is the principal criterion for deciding between the two. Based on Figure 4, there is one outlier in the MI category (its maximum of 23,65 exceeds the upper limit of 23,02); as a result, the median is the best suited measure due to its insensitivity to outliers.
- Based on the calculations and figures, the median birth rate is highest in low GNI countries (32,79), followed by middle GNI and high GNI countries with 13,34 and 13,11 respectively. According to these figures, the GNI rate is in an
inverse relationship with the median. Moreover, low GNI countries stand in contrast to high GNI countries, indicating that countries with a low GNI rate experience a higher birth rate, and vice versa.

b) Measures of Variation

| | Low GNI countries | Middle GNI countries | High GNI countries |
|---|---|---|---|
| Range | 22,53 | 15,88 | 4,68 |
| Interquartile range (IQR) | 13,37 | 5,51 | 3,76 |
| Standard deviation (SD) | 7,22 | 4,30 | 2,11 |
| Coefficient of Variation (CV)% | 23,41 | 32,67 | 16,97 |

(Figure 5: Measures of Variation table of three country categories in birth rate%)

- With the outlier detected above, the Range and the Standard Deviation are not recommended, as both measures are affected by outliers. The most appropriate measure must be insensitive to outliers, and the Interquartile range (IQR) is one such measure. Based on Figure 5, the IQR of low GNI countries is again the highest among the three categories (13,37). Middle GNI countries hold the second highest value with 5,51, and the lowest (3,76) belongs to the high GNI group. As claimed above, an inverse relationship is again shown between the GNI rate and the IQR.

c) Box-and-Whisker Plot
(Figure 6: Box-and-Whisker plot of three country categories in birth rate%)

| | Lower box | Comparison | Upper box | Shape |
|---|---|---|---|---|
| Box | Q2-Q1=9,55 | > | Q3-Q2=3,82 | Left-skewed |
| Whisker | Q1-Min=4,60 | > | Max-Q3=4,56 | Left-skewed |
| Median | Q2-Min=14,15 | > | Max-Q2=8,37 | Left-skewed |

(Figure 7: Box-and-Whisker plot summary for LI group)

| | Lower box | Comparison | Upper box | Shape |
|---|---|---|---|---|
| Box | Q2-Q1=4,09 | > | Q3-Q2=1,42 | Left-skewed |
| Whisker | Q1-Min=1,48 | < | Max-Q3=8,89 | Right-skewed |
| Median | Q2-Min=5,56 | < | Max-Q2=10,31 | Right-skewed |

(Figure 8: Box-and-Whisker plot summary for MI group)

| | Lower box | Comparison | Upper box | Shape |
|---|---|---|---|---|
| Box | Q2-Q1=2,91 | > | Q3-Q2=0,85 | Left-skewed |
| Whisker | Q1-Min=0,80 | > | Max-Q3=0,12 | Left-skewed |
| Median | Q2-Min=3,71 | > | Max-Q2=0,97 | Left-skewed |

(Figure 9: Box-and-Whisker plot summary for HI group)

- Derived from Figures 7, 8 and 9, all three categories'
Box-and-Whisker plots show a left-skewed box, although the whisker and median comparisons for the MI group point to the right because of its outlier; there are also many other dissimilarities among them. According to Figure 6, low GNI countries have the biggest box and also have significantly higher minimum and maximum points than the other two categories. As a consequence, low GNI countries also have a wider span from Q1 to Q3 (23,24-36,60) compared to MI (9,25-14,76) and HI (10,20-13,96). In addition, the Interquartile ranges (IQR) of the three categories (LI, MI, HI) are 13,37, 5,51 and 3,76 respectively.

III/ CONFIDENCE INTERVALS
- Since the population standard deviation is unknown, the sample standard deviation (S) is used as a substitute and the Student’s t distribution is used instead of the normal distribution. The confidence level is set at 95% to create a confidence interval for the world average of the crude birth rate:
$\alpha = 1 - 0,95 = 0,05 \Rightarrow \frac{\alpha}{2} = \frac{0,05}{2} = 0,025$
- Degrees of freedom (d.f): d.f = n-1 = 32-1 = 31
- Calculated by the T-value calculator: $t_{\alpha/2,\,n-1} = \pm 2,039$

| Statistic | Value |
|---|---|
| Population standard deviation | Unknown |
| Sample standard deviation (S) | 10,70 per 1000 |
| Sample mean (X̄) | 21,90 per 1000 |
| Sample size (n) | 32 countries |
| Confidence level (1-α) | 95 % |
| T-critical value | +/- 2,039 |

(Figure 10: Statistic summary for the crude birth rate in 2015)

$\mu = \bar{X} \pm t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}}$
$\Rightarrow \mu = 21,90 \pm t_{0,025;\,31} \frac{10,70}{\sqrt{32}} = 21,90 \pm 2,039 \times \frac{10,70}{\sqrt{32}}$
$\Rightarrow 18,043 \le \mu \le 25,756$
=> We are 95% confident that the actual world average of the crude birth rate lies between 18,043 and 25,756 births per 1000 people.

Discussion:
- According to the data collected, the sample size (n=32) is larger than the standard threshold (30); hence, no assumption about the shape of the population distribution is required for the confidence interval estimation. The Central Limit Theorem (CLT) applies, so the sampling distribution of the mean is approximately normal.

IV/ HYPOTHESIS TESTING
a) Discuss the possible error committed, its consequence and how this error can be minimized
- As reported by the
World Health Organization, 19 was the world average crude birth rate in 2015. As specified above, the confidence interval was calculated to be between 18,043 and 25,756, and with 95% confidence the rate is expected to lie within this range. It might remain the same, as 19 falls within the interval.

*Nine stages of hypothesis testing:
Stage 1: Check for Central Limit Theorem (CLT)
- Because the sample size (n=32) is larger than 30, the Central Limit Theorem applies and the sampling distribution of the world average crude birth rate will therefore be approximately normally distributed.
Stage 2: Specify the statistical hypothesis containing the null hypothesis (Ho) and the alternative hypothesis (H1)
- Ho: μ = 19
- H1: μ ≠ 19
Stage 3: Decide the level of significance (α), sample size (n) and determine whether upper, lower or two tailed test
- Since the hypotheses contain “=” and “≠” signs, a two tailed test is used.
Stage 4: Determine table:
- As shown above, the t-table is used because the population standard deviation is unknown and the sampling distribution is approximately normal.
Stage 5: Clarify the critical value:
- Level of significance α = 0,05
- Degrees of freedom d.f = n-1 = 31
- Two tailed test => t = ± 2,039
Stage 6: Compute test statistic:
- $t' = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{21,90 - 19}{10,70/\sqrt{32}} = 1,533$
Stage 7: Make the statistic decision
- Since -2,039 < t' = 1,533 < 2,039, the test statistic lies within the non-rejection region, so we do not reject the null hypothesis (Ho).
Stage 8: Conclude managerial decision
- Because Ho is not rejected, at the 5% level of significance there is no clear evidence that the world average crude birth rate is not equal to 19 births per 1000 people.
Stage 9:
- Considering Ho is not rejected, a type II error may have been committed. This indicates that though it is concluded that there is insufficient evidence proving the world average crude birth rate is unequal to 19
births per 1000 people, the true average may still differ from 19; the probability of this error is β, which is not the same as the 5% level of significance.

Downsizing the error:
- There are several ways to decrease this error, in particular increasing the power of the test. The power of the test is defined as the capability to detect a false null hypothesis and is calculated by the equation: Power = 1 - P(Type II error) (Priest 2019). There are several strategies to boost a test's power, including increasing the sample size, lowering the standard error, widening the gap between the sample statistic and the parameter under study, or raising the alpha level (PennState n.d.). For example, when the sample size increases, our estimate is more certain, our uncertainty is lower, and our precision improves (Littler 2018).

b) Considering the number of countries is increased by 50%
- If the number of countries is increased by 50% (from 32 to 48), many of the calculations change as well, such as the degrees of freedom. When the sample size grows, the variability of the sampling distribution declines because the standard error falls as the sample size rises (PennState, EC of S n.d.).
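To make the effect of a 50% larger sample concrete, the sketch below recomputes the standard error, the t critical value, the 95% confidence interval and the test statistic for n = 32 and n = 48. It is only an illustration under a strong assumption: the sample mean (21,90) and sample standard deviation (10,70) are held fixed, although real additional data would change both. The function names are hypothetical, and the critical value is obtained numerically instead of from a t-table.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] plus symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Two-tailed critical value, found by bisection on the CDF."""
    lo, hi = 0.0, 10.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def confidence_interval(mean, s, n, alpha=0.05):
    """Return (lower, upper, standard error, t critical) for the mean."""
    se = s / math.sqrt(n)
    t_crit = t_critical(alpha, n - 1)
    return mean - t_crit * se, mean + t_crit * se, se, t_crit


def t_statistic(mean, mu0, s, n):
    """Test statistic for Ho: mu = mu0."""
    return (mean - mu0) / (s / math.sqrt(n))


# Assumption: mean and S stay at the values reported for n = 32.
for n in (32, 48):
    low, high, se, t_crit = confidence_interval(21.90, 10.70, n)
    stat = t_statistic(21.90, 19, 10.70, n)
    print(f"n={n}: SE={se:.3f}, t_crit={t_crit:.3f}, "
          f"CI=[{low:.3f}, {high:.3f}], t'={stat:.3f}")
```

With the mean and standard deviation held fixed, the larger sample gives a smaller standard error and a narrower interval, while the test statistic moves further from zero. Whether Ho would then be rejected depends on the values actually observed in the larger sample.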
A larger sample size also means that the t-distribution becomes more and more similar to the normal distribution, with a thinner and taller curve (Ahmad & Halim 2017). Furthermore, when the sample size increases, the confidence interval becomes narrower, so the actual value can be estimated more precisely. In this circumstance, the larger sample size narrows the confidence interval, and thus the outcome of the hypothesis test could change. Similar to Part 4, a type II error is still possible if Ho is again not rejected; nevertheless, as claimed above, its probability would be reduced because more information is taken into account.

V/ OVERALL CONCLUSION
In a nutshell, there are three major discoveries made after investigating the case study:
- First of all, the GNI rate of a country is linked to its crude birth rate; specifically, countries with a GNI level below $4,000 are more likely to experience a higher crude birth rate, while the other two groups (MI and HI) tend to have lower rates. It is apparent that the GNI rate and the crude birth rate have an inverse relationship.
- Second of all, according to the estimation in part 2, all 16 low income countries in the sample (100%) report high birth rates. In contrast, middle income countries have only an 8,33% probability of experiencing high birth rates. Remarkably, none of the four high income countries in the dataset has a high birth rate (0%). Along with the first finding, this calculation reinforces the inverse relationship between the GNI rate and the crude birth rate.
- Third of all, with 95% confidence, the world average crude birth rate is anticipated to remain the same, as the reported rate of 19 still lies within the confidence interval [18,043; 25,756]. However, increasing the sample size by 50% would result in higher accuracy in the testing process; therefore, the final conclusion could change.

VI/ EXTENSION
- In the
final part, a study of the birth rate together with the sociodemographic and economic characteristics of a representative sample of households in Vietnam will be conducted. When designing a survey, it is necessary to determine which sampling method will be used, starting with the choice between probability and non-probability sampling. According to Showkat & Parveen (2017), probability sampling outperforms non-probability sampling in being time-efficient, cost-effective, less biased and significantly more accurate. Therefore, probability sampling will be selected for this survey. Probability sampling consists of several methods; however, only two of them seem appropriate in this circumstance: stratified and cluster sampling. As the target population is households in Vietnam, which differ in many natural ways, stratified sampling is recommended because it suits a heterogeneous population.
- The study will follow these steps:
Step 1: Divide the population into small groups
Step 2: Group them based on their shared characteristics, for example, income levels or city
Step 3: Randomly select from each group proportionally
=> After following the steps provided, we are able to start the survey and gather insights.
- During the survey, sampling errors will occur because it is not practical to obtain information from the whole population, so the outcome might vary around the true value (Adwok 2015). However, this error can be reduced by expanding the sample size, as shown above.

VI/ REFERENCE LISTS
Adwok, J 2015, ‘Probability Sampling - A Guideline for Quantitative Health Care Research’, The ANNALS of AFRICAN SURGERY (www.annalsofafricansurgery.com), vol 12, no 2, viewed 21 August 2022.
Ahmad, H & Halim, H 2017, ‘Determining Sample Size for Research Activities’, Selangor Business Review, vol 2, pp 20–34, viewed 20 August 2022.
Becker, GS, Murphy, KM & Tamura, R 1990, ‘Human Capital,
Fertility, and Economic Growth’, Journal of Political Economy, vol 98, no 5, Part 2, pp S12–S37, viewed 21 August 2022.
Berenson, M, Levine, D, Szabat, KA & Krehbiel, TC 2012, Basic Business Statistics: Concepts and Applications, Google Books, Pearson Higher Education AU, viewed 17 August 2022.
Birdsall, N 1980, ‘Population growth and poverty in the developing world’, Population Bulletin, vol 35, no 5, pp 1–48, viewed 21 August 2022.
Borovcnik, M & Kapadia, R 2009, ‘Research and Developments in Probability Education’, International Electronic Journal of Mathematics Education, vol 4, no 3, pp 111–130, viewed 16 August 2022.
Brander, JA & Dowrick, S 1994, ‘The role of fertility and population in economic growth’, Journal of Population Economics, vol 7, no 1, pp 1–25, viewed 21 August 2022.
Cleland, JG 2008, Trends in Human Fertility, in HK (Kris) Heggenhougen (ed.), ScienceDirect, Academic Press, Oxford, pp 364–371, viewed 21 August 2022.
Kirch, W 2008, Encyclopedia of public health : with 75 figures and 86 tables, Springer, Cop, Dordrecht, viewed 21 August 2022.
Li, Y 2015, The Relationship between Fertility Rate and Economic Growth in Developing Countries, viewed 21 August 2022.
Littler, S 2018, The Importance and Effect of Sample Size - Select Statistical Consultants, Select Statistical Consultants, viewed 20 August 2022.
Moran 2020, What Factors Affect the Total Fertility Rate, or TFR?, Population Education, viewed 21 August 2022.
Nargund, G 2009, ‘Declining birth rate in Developed Countries: A radical policy re-think is required’, Facts, views & vision in ObGyn, vol 1, Universa Press, no 3, pp 191–3, viewed 21 August 2022.
PennState n.d., 6.5 - Power | STAT 200, PennState: Statistics Online Courses, viewed 20 August 2022.
PennState, EC of S n.d., 4.1.3 - Impact of Sample Size | STAT 200, PennState: Statistics Online Courses, viewed 20 August 2022.
Priest, R 2019, Encyclopedia of Social Measurement | ScienceDirect, Sciencedirect.com, viewed 20 August 2022.
Rosenberg, M 2019, How Is a Population’s Growth and Decline Measured?, ThoughtCo, viewed 21 August 2022.
Showkat, N & Parveen, H 2017, Non-Probability and Probability Sampling, ResearchGate, viewed 21 August 2022.
United Nations 2017, World population to hit 9.8 billion by 2050, despite nearly universal lower fertility rates – UN, UN News, viewed 21 August 2022.
United Nations 2009, High birth rates hamper development in poorer countries, warns UN forum, UN News, viewed 21 August 2022.
WHO 2022, Indicator Metadata Registry Details, Who.int, viewed 21 August 2022.
World Bank n.d., Glossary | DataBank, databank.worldbank.org, viewed 21 August 2022.