Inferential Statistics Assessment
Course code and name: ECON1193 - Business Statistics
Assessment name: Individual Case Study Inferential Statistics
Word count: 2572
Air pollution Dataset #4

PART 1: Introduction

According to the 2019 World Air Quality Report by IQAir, air pollution is the most pressing of the environmental health risks faced by the global population. WHO (2018) lists particulate matter (PM) as a common proxy indicator for air pollution because of its exceptionally harmful health impacts. In this case study, mean annual exposure to PM2.5 is the focus of all calculations and predictions. This pollutant's effects on human health are the most severe: at 2.5 microns, the particles are small enough to enter the body through breathing, pass deep into the bloodstream and cause far-reaching health effects. The WHO notes that keeping annual mean exposure below a threshold of 10 μg/m3 can lessen the impacts of PM2.5; however, no level of exposure to this pollutant is safe for humans (IQAir 2019). Apart from harming human health, PM2.5 also damages the environment in different ways, such as acid deposition, increased ozone levels and harm to marine life and vegetation (IQAir 2021). In fact, the UNECE (n.d.)
named particulate matter as one of the short-lived climate pollutants (SLCPs), which means that it is both an air pollutant and climate-relevant. Monitoring mean annual exposure to air pollution is aligned with United Nations SDG 13 (Climate Action), as it is part of a global effort to track PM concentrations in the air and respond accordingly.

Ritchie & Roser (2019) reported that the mean annual exposure of 95% of the world population exceeds WHO limits, and this applies to both high-income and low-to-middle-income nations. Nevertheless, in most low-to-middle-income countries the majority of the population is exposed to pollution levels above 35 μg/m3 (Appendix 1). Similarly, WHO (2006) reported that the highest levels of air pollution are found in the low- and middle-income countries of Asia. As Gross National Income (GNI) provides a total measure of income (WHO n.d.), this information suggests a relationship between mean annual exposure and GNI.

Overall, the purpose of this case study is to evaluate the relationship between mean annual exposure and the GNI of countries, as well as to estimate the value and rate of mean annual exposure to air pollution, with the support of statistics.

PART 2: Probability and Descriptive Statistics

a. Probability

The 25 countries in the data set are categorized by GNI as follows:
Low-income countries (LI): GNI < $1,000 per capita
Middle-income countries (MI): GNI between $1,000 and $12,500 per capita
High-income countries (HI): GNI > $12,500 per capita

Countries are also categorized by level of exposure to air pollution: a country whose mean annual exposure to air pollution is higher than 33 micrograms per cubic meter (μg/m3) is considered to have "High exposure to air pollution" (Appendix 2).

                                High mean annual    Not high mean annual
                                exposure to air     exposure to air
                                pollution (H)       pollution (N)           Total
Low-income countries (LI)       2                   2                       4
Middle-income countries (MI)    5                   10                      15
High-income countries (HI)      0                   6                       6
Total                           7                   18                      25

Table 1: Contingency table of country category based on income and mean annual exposure to air pollution

To determine whether income and mean annual exposure to air pollution are statistically independent, the conditional probability of a high-income country (HI) given high mean annual exposure to air pollution (H) is compared with the marginal probability of a high-income country (HI):

P(HI | H) = 0/7 = 0
P(HI) = 6/25 = 0.24

Two events are statistically independent only when the conditional probability equals the marginal probability. Since P(HI | H) ≠ P(HI), high income and high mean annual exposure to air pollution are not independent; they are dependent. To conclude, income and mean annual exposure to air pollution are statistically dependent, which suggests that a country's income level is associated with its mean annual exposure to air pollution.

To determine which country category is more likely to have high mean annual exposure to air pollution, the conditional probability of high mean annual exposure given each income level (low, middle, high) is calculated:

P(H | LI) = 2/4 = 50%
P(H | MI) = 5/15 ≈ 33.3%
P(H | HI) = 0/6 = 0%

The calculations above show that no high-income country faces high mean annual exposure to air pollution. This percentage is striking, as the situation is considerably worse for low- and middle-income countries. Low-income countries have the highest chance of facing high mean annual exposure to air pollution.

b. Descriptive Statistics

Measures of central     Low-income            Middle-income           High-income
tendency                countries (LI)        countries (MI)          countries (HI)
Mean                    32.199           >    30.456             >    10.447
Median                  31.878           >    22.722             >    10.922
Mode                    None                  None                    None

Table 2: Measures of central tendency of mean annual exposure to air pollution (μg/m3)

There are no outliers in this data set, so the Median is not needed as a robust alternative to the Mean. The Mode cannot be used either, as no value in the data set repeats. Consequently, the Mean is the most suitable measure. The fact that Low-income countries have the highest
Mean (32.199 μg/m3) is notable. This indicates that these countries in general are close to the high level of exposure to air pollution, which is 33 μg/m3. Middle-income countries' Mean, although smaller than that of Low-income countries, is still concerning as it is only 2.544 μg/m3 short of the high level of exposure. High-income countries' Mean is considerably smaller than the other two and is close to WHO's annual mean exposure limit of 10 μg/m3.

Measures of variation           Low-income            Middle-income           High-income
                                countries (LI)        countries (MI)          countries (HI)
Range                           …                     …                       10.356
Interquartile Range             6.385            <    21.218             >    4.418
Standard Deviation              5.558            <    20.053             >    3.812
Sample Variance                 30.893           <    402.142            >    14.536
Coefficient of Variation (%)    17%              <    66%                >    36%

Table 3: Measures of variation of mean annual exposure to air pollution (μg/m3)

Although there are no outliers in the data set, the unequal Means and SDs seen in Tables 2 and 3 make the Coefficient of Variation (%) the most appropriate measure. The highest CV belongs to Middle-income countries (66%), which means that the countries in this category have the greatest disparity in mean annual exposure to air pollution. For High-income countries, the CV (36%) suggests a moderate spread of mean annual exposure around the Mean. Lastly, Low-income countries' mean annual exposure to air pollution is the least dispersed and tends to cluster around their average value of 32.199 μg/m3.

PART 3: Confidence intervals

a. Calculation

The confidence level chosen to calculate the confidence interval of mean annual exposure to air pollution is 95%.

Population standard deviation (σ)    unknown
Sample standard deviation (S)        17.909
Sample mean (X̄)                      25.932
Sample size (n)                      25
Confidence level (1-α)               95%

The sample size is 25, which is smaller than 30, hence the Central Limit Theorem (CLT) cannot be applied. Besides, the distribution of the underlying population is unknown. In this situation, the
sampling distribution is assumed to be normal. Since the population standard deviation (σ) is unknown, the sample standard deviation (S) is used instead, which means the Student's t-table is used.

Degrees of freedom: d.f. = n - 1 = 24
Critical value: t = ±2.063
Significance level: α = 0.05
Confidence interval: X̄ ± t × S/√n = 25.932 ± 2.063 × 17.909/√25 = 25.932 ± 7.389

With 95% confidence, the mean annual exposure to air pollution of the world in 2017 is between 18.543 μg/m3 and 33.321 μg/m3.

b. Assumption

Apart from the missing population standard deviation, the sample size of 25 is not sufficient to apply the CLT, which requires the sample size to be larger than 30. Therefore, the assumption that the sampling distribution is normally distributed has to be made in order to carry out the calculation.

c. If the world standard deviation (σ) of mean annual exposure to air pollution were available, the z-value would be used instead of the t-value, which means replacing the t-distribution with the z-distribution. This would reduce the degree of variability, as the standard normal distribution (z-distribution) has a smaller standard deviation and variance than the t-distribution (Anderson 2014). A confidence interval uses the variability of the data to gauge the precision of the estimated statistic: the lower the variability, the more precise (narrower) the interval (NEDARC n.d.).
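The interval in part 3a and the comparison between the t-value and the z-value can be checked with a short Python sketch. It uses only the summary figures reported in the text (n = 25, sample mean 25.932, S = 17.909) and the critical value t = 2.063 as given there. Because the world σ is not available, the sketch sets σ equal to S purely as an illustrative assumption, so that only the effect of the critical value is visible. The helper name `interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Summary figures reported in part 3a.
n = 25
sample_mean = 25.932
sample_sd = 17.909

# Critical t-value for 24 degrees of freedom at 95% confidence,
# copied from the text rather than computed here.
t_crit = 2.063
# Critical z-value for 95% confidence from the standard normal distribution.
z_crit = NormalDist().inv_cdf(0.975)


def interval(mean, sd, size, crit):
    """Return (lower, upper) for mean ± crit * sd / sqrt(size)."""
    margin = crit * sd / sqrt(size)
    return mean - margin, mean + margin


# t-based interval, as in part 3a.
t_low, t_high = interval(sample_mean, sample_sd, n, t_crit)

# Assumption for illustration only: sigma is taken to equal S,
# since the real population value is unknown.
z_low, z_high = interval(sample_mean, sample_sd, n, z_crit)

print(f"t interval: {t_low:.3f} to {t_high:.3f}")
print(f"z interval: {z_low:.3f} to {z_high:.3f}")
```

With the same standard deviation, the z-based interval comes out narrower than the t-based one because the critical z-value (about 1.96) is smaller than the critical t-value for 24 degrees of freedom.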
As a result of this lower variability, the confidence interval in this case will be narrower. Generally, the shape of the t-distribution is almost identical to that of the normal distribution (bell-shaped and symmetric); however, it has thicker tails (Greeni 2021) (Appendix 3). This is because the sample standard deviation (S) is used with the t-distribution: as S varies from sample to sample, it creates extra uncertainty (Greeni 2021). In contrast, the population standard deviation (σ) used with the z-distribution is fixed, which gives more stability. All of this indicates that the t-distribution carries more uncertainty than the z-distribution. Consequently, when the z-value is used, the confidence interval will be more precise, as its distribution is more stable.

PART 4: Hypothesis Testing

a. Based on the data provided by The World Bank (2017) on the world mean annual exposure to air pollution, exposure has been declining since reaching a peak of 50.8 μg/m3 in 2011 (although increases still occurred in some years). In 2016, the mean annual exposure to air pollution dropped to 45.2 μg/m3. With the confidence interval calculated in part 3a, whose upper limit of 33.321 μg/m3 lies below this value, the global mean annual exposure to air pollution is predicted to decrease in the future.

Population standard deviation (σ)    unknown
Sample standard deviation (S)        17.909
Sample mean (X̄)                      25.932
Sample size (n)                      25
Confidence level (1-α)               95%

Step 1: Check for CLT
The CLT is not applicable since the sample size is smaller than 30 (n = 25). In this case, the sampling distribution is assumed to be normal.

Step 2: Determine hypotheses

Step 3: Since the , ’s sign is ‘