DESCRIPTIVE STATISTICS
Variation measure
Table 2: Measures of Variation of total deaths in South America and Asia (Unit: number of deaths per million population)
The range is not the best metric for assessing COVID-19 death counts because both datasets contain outliers. Standard deviation (SD) and variance are also inadequate, as they are computed from the mean and are therefore influenced by outliers. Given the right-skewed distribution of the data, the coefficient of variation (CV) is similarly not optimal. Instead, the interquartile range (IQR) is the most suitable measure, as it is resistant to outliers and quantifies the variability of the middle 50% of the data around the median. Figure 3 illustrates that the IQR for South America is nearly eight times greater than that of Asia, with values of 46,058 and 5,834 per million population, respectively. This indicates that the dataset for Asia is more consistent and that COVID-19 deaths vary far more from country to country in South America than in Asia.
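The argument for preferring the IQR over the SD can be illustrated with a small sketch. The death counts below are hypothetical values chosen only to include one extreme observation; they are not taken from the datasets in this report, and the helper name `iqr` is an assumption for illustration.

```python
import statistics


def iqr(values):
    """Interquartile range (Q3 - Q1) using the inclusive quartile method."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Hypothetical death counts (not the report's data).
sample = [120, 340, 560, 800, 1500, 2300, 4100]
# The same sample with one extreme outlier appended.
with_outlier = sample + [250000]

print("IQR without / with outlier:", iqr(sample), iqr(with_outlier))
print("SD  without / with outlier:",
      round(statistics.stdev(sample), 1),
      round(statistics.stdev(with_outlier), 1))
```

In this sketch the IQR changes only modestly when the outlier is added, while the SD grows by more than an order of magnitude, which is the behaviour the paragraph above relies on.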
MULTIPLE REGRESSION
Final Regression Output
Using backward elimination, the analysis reveals that population is the sole variable meeting the 5% significance level, with a p-value well below 0.05. This leads to rejection of the null hypothesis, and we conclude that changes in the national population (in millions) significantly affect the total number of COVID-19 deaths (per million population).
Figure 2: Final Regression model of South America
Regression Equation
Total number of deaths caused by COVID-19 = b0 + b1 * Population in 2020 (in millions)
Total number of deaths caused by COVID-19 = -2727.65 + 1096.26 * Population in 2020 (in millions)
Interpretation of the regression coefficient of the significant variable
The positive coefficient of 1096.26 on Population in 2020 indicates a direct relationship between population size and COVID-19 deaths. Specifically, each additional one million people is associated with approximately 1,096 more COVID-19 deaths, assuming all other factors remain constant.
The intercept suggests a negative total death count from COVID-19 when the population is zero, which is both impossible and unreasonable, as a population can never be zero.
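The interpretation of the slope and the intercept can be checked numerically. The sketch below evaluates the South America equation with the reported coefficients; the function name `predict_deaths` and the example populations are assumptions made for illustration only.

```python
B0 = -2727.65  # intercept reported for South America
B1 = 1096.26   # slope reported for South America (per million people)


def predict_deaths(population_millions, b0=B0, b1=B1):
    """Predicted total COVID-19 deaths for a population given in millions."""
    return b0 + b1 * population_millions


# Hypothetical populations (in millions), not countries from the dataset.
for pop in (5, 20, 50):
    print(pop, round(predict_deaths(pop), 2))

# Effect of one additional million people: equals the slope.
slope_effect = predict_deaths(21) - predict_deaths(20)

# Population (in millions) below which the line predicts negative deaths.
zero_crossing = -B0 / B1
print(round(slope_effect, 2), round(zero_crossing, 2))
```

The fitted line only becomes positive above roughly 2.5 million people, which is why the intercept has no practical meaning and the equation should not be used far outside the range of populations in the sample.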
Interpretation of the coefficient of determination
The coefficient of determination, R², stands at 97%, indicating that 97% of the variation in total COVID-19 deaths is explained by population size (in millions). The remaining 3% is attributable to other factors.
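For reference, the coefficient of determination is one minus the ratio of unexplained variation to total variation. A minimal sketch follows; the observed and predicted values are hypothetical numbers, not the report's data, and `r_squared` is an illustrative helper name.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observed values and model predictions.
observed = [10, 20, 30, 40]
predicted = [12, 18, 31, 39]
print(r_squared(observed, predicted))
```

A value close to 1 means the fitted line leaves little variation unexplained; it does not by itself show that population causes the deaths.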
As in the South America analysis, backward elimination for Asia showed that, among the five independent variables, only Population (in millions) was significant at the 5% level, with a p-value below 0.05. This leads to the rejection of the null hypothesis, suggesting that changes in population size significantly affect the total number of COVID-19 deaths (per million population).
Figure 3: Final Regression model of Asia.
Total number of deaths caused by COVID-19 = b0 + b1 * Population in 2020 (in millions)
Total number of deaths caused by COVID-19 = -4535.41 + 186.69 * Population in 2020 (in millions)
3 Interpretation of the regression coefficient of the significant variable:
In 2020, the coefficient for the variable Population in Asia indicates a positive correlation, suggesting that an increase of one million people is associated with approximately 187 additional COVID-19-related deaths, assuming all other factors remain constant.
The intercept suggests an unrealistic scenario of total COVID-19 deaths being negative when the population is zero, which is impossible since a population cannot be zero.
4 Interpretation of the coefficient of determination:
The coefficient of determination, R², reveals that 97% of the variation in the total number of COVID-19 deaths can be explained by population size (in millions), while the remaining 3% is due to other factors affecting the dependent variable.
TEAM REGRESSION CONCLUSION
Regression Equation -2727.65 + 1096.26 * Population in 2020 (in millions)
Table 3: Summary of information calculated in Part 3 (Unit: in millions)
The analysis of the regression models for COVID-19 mortality in Asia and South America reveals that the population size in 2020, measured in millions, is a significant independent variable at the 95% confidence level. This conclusion was reached after excluding four other variables (hospital beds, doctors, average rainfall, and temperature) that did not contribute to explaining the variation in COVID-19 deaths. Thus, population size has a statistically significant association with COVID-19 mortality and can be used to estimate deaths in both regions.
The regression models for Asia and South America fit equally well, sharing a coefficient of determination of 97%. A higher R-squared value signifies a better fit of the model to the data, which supports the reliability of the estimates of Covid-19 fatalities in these regions.
In this case, the forecasts of COVID-19 deaths in Asia and South America are close to the real-life figures.
The slope coefficients of the regression equations for South America and Asia, 1096.26 and 186.69 respectively, indicate a positive relationship between COVID-19 deaths and population size.
In Asia, the regression analysis reveals that for every million people in the region, the number of COVID-19 death cases is projected to rise by approximately 186.69.
In South America, the regression equation indicates that for every million people in the designated countries, the number of COVID-19-related deaths is projected to rise by 1,096.26 cases.
The intercept of the regression line is lower for Asia (-4535.41) than for South America (-2727.65). The intercept, however, does not measure how deaths respond to population. That is given by the slope, which is larger for South America (1096.26 versus 186.69), so an increase in population is associated with a greater rise in deaths in South American countries than in Asian countries.
A comparison of COVID-19 deaths between South America and Asia from April 1 to July 31, 2020, reveals that South America experienced significantly higher fatalities, with a median of 11,358.50 deaths compared to 2,008 in Asia. This disparity is further illustrated in the box and whisker plot (Figure 3). Additionally, the interquartile range (IQR) indicates that the variation in deaths per country is greater in South America (46,058) than in Asia (5,834). Both regions exhibit a positive relationship between deaths and population, as shown in Figures 4 and 5, with a coefficient of determination of 97%, indicating a close fit to the data. Furthermore, the regression analysis indicates that South American countries are more affected by population increases in relation to COVID-19 mortality, with slope coefficients of 1,096.26 for South America and 186.69 for Asia.
TIME SERIES
South America
1 Use time series data to build Linear (LIN), Quadratic (QUA) and Exponential (EXP) trend model for South America:
Hypothesis testing on the daily Covid-19 death data for South America reveals two significant trend models: a linear trend and an exponential trend, as detailed in Appendix X and calculated in Appendix 2.
Figure 4: Regression output of the Linear trend for the data set of daily deaths due to Covid-19 in South America
b) Regression Formula and Coefficient of Significant Variable:
Ŷ = 4708.073 - 17.091T
Where: Ŷ is the predicted number of deaths due to Covid-19 in South America and T is the time period
The intercept β0 = 4708.073 indicates that at time T = 0, the daily number of deaths from Covid-19 in South America was 4,708.073. The slope β1 = -17.091 shows that the daily number of deaths (from April 1 to July 31) decreased by 17.091 every day, which is a downward trend.
Figure 5: Regression output of the Exponential trend for the data set of daily deaths due to Covid-19 in South America
b) Regression Formula and Coefficient of Significant Variable:
In Linear Form: log(Ŷ) = 3.6860 + 0.002T
In Non-Linear Form: Ŷ = 4852.885 x 1.005^T
Where: Ŷ is the predicted number of deaths due to Covid-19 in South America
T is the time period
The Coefficient of Significant Variable:
β0 = 3.6860 and β1 = 0.002, so the growth factor is 10^0.002 ≈ 1.005 and the daily growth rate is (1.005 - 1) x 100% = 0.5%. Under this model, the daily number of deaths due to Covid-19 in South America therefore increases by about 0.5% per day.
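The step from the log-linear coefficients to the non-linear form can be verified with a few lines of arithmetic. The sketch assumes the regression was run on base-10 logarithms, an assumption supported by the fact that 10^3.6860 reproduces the reported level of about 4852.885; the variable names are illustrative.

```python
LOG_B0 = 3.6860  # intercept of the log-linear regression (as reported)
LOG_B1 = 0.002   # slope of the log-linear regression (as reported)

# Back-transform, assuming base-10 logarithms were used.
base_level = 10 ** LOG_B0     # level at T = 0, roughly 4852.9
growth_factor = 10 ** LOG_B1  # daily multiplier, roughly 1.0046

# Daily growth rate in percent implied by the multiplier.
daily_growth_pct = (growth_factor - 1) * 100

print(round(base_level, 1), round(growth_factor, 4), round(daily_growth_pct, 2))
```

Because the reported slope is rounded to 0.002, the back-transformed multiplier (about 1.0046) differs slightly from the 1.005 used in the non-linear form; either way the implied daily change is positive.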
From April 1 to July 31, 2021, two trend models were identified for forecasting the daily number of deaths from Covid-19 in South America. The model with the lower measurement errors is identified in the accompanying table.
Table 4: Comparison of SSE and MAD of Linear Trend and Exponential Trend in the data set of South America
To predict the daily death rate from Covid-19 in South America, the Exponential Trend model is the most suitable choice, as it demonstrates the lowest Sum of Squared Errors (SSE) and Mean Absolute Deviation (MAD) compared to other trend models.
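The two error measures used for this comparison can be sketched as follows. The actual and fitted series are hypothetical values, not the South America data, and the function names `sse` and `mad` are illustrative.

```python
def sse(actual, fitted):
    """Sum of Squared Errors: sum of (actual - fitted)^2."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted))


def mad(actual, fitted):
    """Mean Absolute Deviation: average of |actual - fitted|."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


# Hypothetical observations and fitted values from two candidate models.
actual = [100, 110, 125, 130]
model_a = [98, 112, 120, 133]
model_b = [90, 115, 130, 120]

for name, fitted in (("model_a", model_a), ("model_b", model_b)):
    print(name, sse(actual, fitted), mad(actual, fitted))
```

The model with the smaller SSE and MAD on the same data set is preferred; in this sketch that is `model_a`.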
2 Predict the number of deaths due to Covid 19 in South America on September 28, September 29 and September 30.
According to the calculations outlined in section 5.2, the projected daily deaths from Covid-19 in South America for September 28, 29, and 30 have been determined using an Exponential Trend analysis, as detailed in the table below.
Date | T | Calculation | Predicted number of daily deaths (deaths per day)
September 28 | 181 | Ŷ = 4852.885 x 1.005^181 | 11969
Table 5: Predicted number of daily deaths due to Covid-19 in South America on September 28, September 29 and September 30 using Exponential Trend (Unit: deaths per day)
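The forecasts in Table 5 can be reproduced with a short sketch, reading the exponential trend as Ŷ = 4852.885 x 1.005^T (this reading returns the tabulated 11969 for T = 181). The function name `exp_trend` is an assumption for illustration.

```python
def exp_trend(t, b0=4852.885, b1=1.005):
    """Exponential trend: predicted daily deaths at time period t."""
    return b0 * b1 ** t


# Time periods follow the report: September 28 is day 181.
forecast_days = (("September 28", 181), ("September 29", 182), ("September 30", 183))

for label, t in forecast_days:
    print(label, t, round(exp_trend(t)))
```

Each additional day multiplies the forecast by 1.005, so the three values rise by about 0.5% from one day to the next.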
Asia
1 Use time series data to build Linear (LIN), Quadratic (QUA) and Exponential (EXP) trend model for Asia:
Analysis of the daily death data from Covid-19 in Asia indicates a significant quadratic trend model, as detailed in Appendix X and calculated in Appendix 3.
Figure 6: Regression output of the Quadratic trend for the data set of daily deaths due to Covid-19 in Asia
b) Regression formula and explanation of coefficient of significant variable
Ŷ = 2493.341 + 68.629T - 0.572T^2
Where: Ŷ is the predicted number of deaths due to COVID-19 in Asia and T is the time period
The Coefficient of Significant Variable:
The intercept β0 = 2493.341 indicates that when T = 0 (31 March 2021), the daily number of deaths due to Covid-19 was 2493.341.
The linear coefficient β1 = 68.629 indicates that the daily number of deaths in Asia due to Covid-19 (from April 1 to July 31, 2021) initially increased by about 68.629 (per million population) every day, an upward trend.
The quadratic coefficient β2 = -0.572 is negative, indicating that this daily increase shrinks as T grows, so the trend eventually turns downward.
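How the linear and quadratic coefficients interact can be seen by locating the turning point of the fitted curve. The sketch uses the reported coefficients (2493.341, 68.629 and -0.572); the function names are illustrative assumptions.

```python
B0, B1, B2 = 2493.341, 68.629, -0.572  # reported quadratic trend coefficients


def quad_trend(t):
    """Quadratic trend: predicted daily deaths at time period t."""
    return B0 + B1 * t + B2 * t * t


def daily_change(t):
    """Instantaneous change per day: derivative B1 + 2 * B2 * t."""
    return B1 + 2 * B2 * t


# The curve peaks where the derivative is zero.
peak_t = -B1 / (2 * B2)

print(round(peak_t, 1), round(quad_trend(peak_t), 1))
print(round(daily_change(1), 3), round(daily_change(100), 3))
```

With these coefficients the fitted curve rises until about day 60 and falls afterwards, so the upward effect of β1 dominates early in the sample and the downward effect of β2 dominates later.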
Table 6: Calculation of SSE and MAD in the data set of Asia
The quadratic trend model is the sole type identified in section 5.1 for forecasting daily Covid-19 deaths in Asia from April 1 to July 31, 2021. As such, it is the model used for predicting the number of deaths attributed to Covid-19 in the region.
2 Predict the number of deaths due to Covid 19 in Asia on September 28, September 29 and September 30.
According to the calculations presented in section 5.2, the projected daily Covid-19 death toll in Asia for September 28 (day 181), September 29 (day 182), and September 30 (day 183) has been determined using the Quadratic Trend method, as detailed in the table below.
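The quadratic forecasts can be evaluated directly from the reported coefficients (2493.341, 68.629 and -0.572). This is only a sketch under the assumption that those coefficients are complete and correctly rounded; the function name `asia_quadratic_trend` is illustrative, and the results should be compared with the values calculated in Appendix 3.

```python
def asia_quadratic_trend(t, b0=2493.341, b1=68.629, b2=-0.572):
    """Quadratic trend for Asia: predicted daily deaths at time period t."""
    return b0 + b1 * t + b2 * t ** 2


# Time periods follow the report: September 28, 29 and 30 are days 181-183.
for label, t in (("September 28", 181), ("September 29", 182), ("September 30", 183)):
    print(label, t, round(asia_quadratic_trend(t), 3))
```

Evaluated this way, the fitted curve is already below zero at T = 181, well past its turning point near day 60. Extrapolating a quadratic trend this far beyond the April to July sample should therefore be treated with caution, since a negative number of deaths is not a meaningful forecast.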
Table 7: Predicted number of daily deaths due to Covid-19 (people) in Asia on September 28, September 29 and September 30 using Quadratic Trend (Unit: deaths per day)
Figure 7: Line chart of the total daily Covid-19 deaths in South America and Asia from April 1 to July 31, 2021 (Unit: deaths per day)
From April 1 to July 31, 2021, the line chart illustrates the daily total number of Covid-19 deaths in South America and Asia. In South America, the daily death toll decreased from over 5,000 to under 2,000, with fluctuations between 2,000 and 6,000 deaths in April, narrowing to 2,000-5,000 from May to June. A significant drop occurred in the last 23 days of July, with a peak of 11,202 deaths on July 20 and a notable low of 1,208 deaths on July 25. Conversely, Asia experienced a gradual increase in deaths from April 1 to May 18, reaching a peak of 8,524 deaths on June 10 before declining to 2,288 by July 5. For the remaining days, deaths fluctuated between 2,500 and 4,700, with a rising trend in the last six days of July, ending at 4,484. Notably, July 20 also marked a high in Asia with 6,659 deaths, below the peak recorded on June 10.
The growth patterns of Covid-19 daily deaths differ significantly between South America and Asia, with South America exhibiting an exponential trend, while Asia follows a quadratic trend.
2 Recommend the best trend model to predict number of deaths due to COVID-19 in the world:
Table 8: Comparison of SSE and MAD of Asia's Quadratic Trend and South America's Exponential Trend
The analysis reveals that the Sum of Squared Errors (SSE) and Mean Absolute Deviation (MAD) for South America's exponential trend are considerably lower than those of Asia's quadratic trend. Consequently, we will use South America's exponential trend to predict global COVID-19 death tolls.
Formula for calculating the number of deaths due to COVID-19 in the world: Ŷ = 4852.885 x 1.005^T
1 The main factors that impact the number of deaths due to COVID-19:
The final model from part 3 reveals that population size (in millions) is the key factor significantly influencing the total COVID-19 deaths per million in both South America and Asia, achieving a 5% level of significance.
The relationship between population size and COVID-19 deaths is positive, indicating that a larger population is associated with more fatalities. Countries with larger populations, like Brazil and India, face greater challenges due to the strain on healthcare systems. Historical data from past pandemics shows that higher infection rates among individuals significantly contribute to the spread of COVID-19, particularly across international borders. Densely populated areas experience faster transmission due to increased human contact, while regions with low population density see lower infection risks. Furthermore, as the Coronavirus evolves and new variants emerge, the fatality rate has risen over time. Notably, from April to July, Brazil and India, the most populous countries in South America and Asia, reported the highest death tolls, with rates of 230,176 and 260,414 per million population, respectively.
The coefficient of determination in the final regression models reveals that 97% of the total COVID-19 deaths can be attributed to the population size in millions.
2 Predicted number of deaths due to COVID-19 in the world on October 31, 2021:
Part 6 highlights that South America's exponential trend serves as the most reliable model for forecasting global Covid-19 death tolls. Using the formula Ŷ = 4852.885 x 1.005^T, we can predict the number of deaths on October 31, 2021, by applying the time period T = 214.
Based on the above calculation, the predicted number of deaths due to Covid-19 in the world on October 31, 2021, is 14,110, indicating an upward trend in the number of deaths.
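The time period for a calendar date, and the resulting forecast, can be reproduced with the sketch below. It assumes April 1, 2021 is T = 1, which is consistent with the report treating September 28 as day 181; the names `time_index` and `exp_trend` are illustrative.

```python
from datetime import date

START = date(2021, 4, 1)  # assumed to be time period T = 1


def time_index(day):
    """Time period T for a calendar date, counting April 1, 2021 as 1."""
    return (day - START).days + 1


def exp_trend(t, b0=4852.885, b1=1.005):
    """Exponential trend: predicted daily deaths at time period t."""
    return b0 * b1 ** t


for day in (date(2021, 10, 31), date(2021, 12, 31)):
    t = time_index(day)
    print(day.isoformat(), t, round(exp_trend(t)))
```

The same two functions give the time period and forecast for any later date, such as December 31, by changing only the date passed to `time_index`.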
3 The number of deaths by COVID 19 by the end of year 2021:
Current projections suggest that COVID-19 fatalities may not decrease over time and could potentially rise at an accelerated pace. On October 31, the predicted number of deaths is 14,110, indicating a troubling trend towards increasing mortality. Using the same forecasting method as in section 7.2, the estimated number of COVID-19 deaths by December 31 is expected to reflect this upward trajectory.
Current social distancing practices and vaccination rates remain inadequate, leading to forecasts that daily COVID-19-related fatalities in some states may exceed the peak levels seen in early 2021. The emergence of the Delta variant, which is 2 to 3 times more infectious than previous strains, poses a greater risk, particularly for unvaccinated individuals who may experience more severe disease. Low vaccination rates, especially in Asia and Africa, exacerbate the severity of illness and the potential for outbreaks, compounded by under-resourced healthcare systems, vaccine hesitancy, and ineffective distribution. The Southeast Asian region is currently facing significant COVID-19 outbreaks and a rising mortality rate. Consequently, we anticipate an increase in COVID-19-related deaths by the end of this year.
4 Two other variables might impact the number of COVID-19 deaths in the world:
While six variables were highlighted in this study, the number of Covid-19 fatalities may also be affected by additional factors. To gain a comprehensive understanding of the global Covid-19 mortality rate, it is essential to consider two other significant variables that may influence the number of deaths.
The proportion of elderly people is a key variable, as this demographic has faced the most significant impacts from both the virus and lockdown measures. Additionally, they have experienced limited benefits from available physical and digital solutions (Martins 2020).
According to the Centers for Disease Control and Prevention, nursing home populations are at significant risk of contracting COVID-19, leading to high mortality rates (The New York Times 2021). Long Term Care Facilities (LTCF) have been severely affected by the pandemic, with estimates indicating that 55.3% of COVID-19 fatalities are among individuals over 80 years old (Amore et al 2021). Recent statistics reveal that approximately 80% of COVID-19 deaths occurred in adults aged 65 and older, particularly those with pre-existing health conditions (Nania 2021). Data shows that the majority of COVID-19 fatalities are reported in individuals aged 50 and above. While COVID-19 can affect individuals of all ages, older adults face a higher risk of severe illness due to age-related physiological changes and underlying health issues (WHO 2020). Thus, the proportion of elderly individuals in a population is directly related to COVID-19 fatalities, suggesting that countries with a higher percentage of elderly residents will experience increased mortality rates from the virus.
TIME-SERIES CONCLUSION
Line chart
Figure 7: Line chart of the total daily Covid-19 deaths in South America and Asia from April 1 to July 31, 2021 (Unit: deaths per day)
The line chart illustrates the daily total deaths due to Covid-19 in South America and Asia from April 1 to July 31, 2021. In South America, the daily death toll decreased significantly from over 5,000 to below 2,000 during this period, with fluctuations between 2,000 and 6,000 deaths in April, narrowing to 2,000 to 5,000 from May to June. A notable peak occurred on July 20 with 11,202 deaths, but the total dropped to a low of 1,208 on July 25. Conversely, Asia experienced a gradual increase in deaths from April 1 to May 18, peaking at 8,524 on June 10 before declining to 2,288 by July 5. The last 25 days showed daily deaths mostly ranging from 2,500 to 4,700, with a notable rise in the final days of July, ending at 4,484. Both regions saw high death tolls on July 20, with Asia reporting 6,659 deaths, below its June 10 peak.
The growth patterns of Covid-19 daily death rates in South America and Asia exhibit distinct trends, with South America following an exponential model and Asia adhering to a quadratic model, as determined in Part 5 of the analysis.
Recommend the best trend model to predict number of deaths due to COVID-19 in the world
Table 8: Comparison of SSE and MAD of Asia's Quadratic Trend and South America's Exponential Trend
The analysis reveals that South America's exponential trend exhibits significantly lower SSE and MAD values compared to Asia's quadratic trend. Consequently, we will use South America's exponential trend to predict global COVID-19 death tolls.
Formula for calculating the number of deaths due to COVID-19 in the world: Ŷ = 4852.885 x 1.005^T
OVERALL TEAM CONCLUSION
The main factors that impact the number of deaths due to COVID-19
The final model from part 3 reveals that population size (in millions) is a crucial factor significantly influencing the total COVID-19 deaths per million in both South America and Asia, achieving a 5% level of significance.
The positive correlation between population and COVID-19 deaths in South America and Asia suggests a direct relationship, where an increase in population is associated with a rise in total deaths. Countries with larger populations face higher lethality due to increased pressure on healthcare facilities and personnel. High population density, a consequence of large populations, facilitates the spread of COVID-19 through increased human contact, making densely populated regions more susceptible to the virus. This is evident in the cases of Brazil and India, the most populous countries in their respective regions, which recorded the highest total deaths per million population between April 1 and July 31.
The coefficient of determination in the final regression models reveals that 97% of the total COVID-19 deaths can be attributed to the population size in millions.
Predicted number of deaths due to COVID-19 in the world on October 31, 2021
Part 6 reveals that South America's exponential trend serves as the most reliable model for forecasting global Covid-19 death tolls. Using the formula Ŷ = 4852.885 x 1.005^T with T = 214, we can predict the number of deaths on October 31, 2021.
Based on the above calculation, the predicted number of deaths due to Covid-19 in the world on October 31, 2021, is 14,110, indicating an upward trend in the number of deaths.
The number of deaths by COVID 19 by the end of year 2021
Forecasts indicate that COVID-19 fatalities are unlikely to decrease over time and may actually rise at an accelerated pace. On October 31, the predicted number of deaths is 14,110, suggesting a troubling trend of increasing fatalities in the future. Using the same calculation method as outlined in section 7.2, predictions for COVID-19 deaths by December 31 indicate a continued upward trajectory.
Current social distancing practices and vaccination rates are insufficient, leading to forecasts that daily COVID-19-related fatalities in some states may exceed the peak levels seen in early 2021. The Delta variant, which is 2 to 3 times more infectious than previous strains, poses a heightened risk, especially for unvaccinated individuals who may experience more severe disease. The low vaccination rates in regions like Asia and Africa exacerbate the severity of illness and the potential for outbreaks. Contributing factors include under-resourced healthcare systems, vaccine hesitancy, and ineffective vaccine distribution. Southeast Asia, in particular, is facing significant COVID-19 outbreaks and rising mortality rates. Consequently, we anticipate an increase in COVID-19-related deaths by the end of this year.
Two other variables might impact the number of COVID-19 deaths in the world
The Covid-19 mortality rate may be affected by additional factors beyond the six variables identified in this study. To gain a deeper understanding of the global factors influencing Covid-19 fatalities, two additional variables are considered significant in impacting the death toll.
The proportion of elderly people is a crucial variable, as this demographic has faced the most significant impact from both the virus and lockdown measures, with limited access to physical and digital solutions for support (Martins 2020).
According to the Centers for Disease Control and Prevention, nursing home populations are particularly vulnerable to COVID-19, with a significant impact on older adults in Long Term Care Facilities (LTCF) globally. It is estimated that 55.3% of pandemic-related fatalities are among those aged 80 and older, and approximately 80% of COVID-19 deaths occur in individuals aged 65 and above, especially those with pre-existing health conditions. Data shows that the majority of COVID-19 deaths are reported in people aged 50 and over. While the virus can affect individuals of all ages, older adults face a higher risk of severe illness due to age-related physiological changes and underlying health issues. Consequently, a higher proportion of elderly individuals in a population correlates with increased COVID-19 mortality rates, suggesting that countries with a larger elderly demographic may experience more fatalities from the virus.
Vaccination rates play a crucial role in mitigating the impact of Covid-19, with widespread consensus that vaccination is the most effective strategy to reduce mortality. While vaccines are not 100% effective in preventing infections, they significantly decrease the severity of illness and overall case numbers in communities (Reals 2021). Projections indicate that achieving an 80% vaccination rate by January 1, 2022, could lead to a 20% reduction in cases and a 22% decrease in deaths, potentially preventing 1.5 million infections and 21,000 fatalities (Jocelyn 2021). Areas with low vaccination rates experience substantially higher hospitalization and death rates, with 39 hospitalizations per 100,000 in low-vaccination states versus 10 in highly vaccinated states, and death rates over 5.5 times higher (Hanna, Almasy & Holcombe 2021). Former COVID-19 advisor Andy Slavitt noted that 98-99% of deaths from the virus occur among unvaccinated individuals (Carla & Mike 2021), underscoring that even minimal vaccine protection can significantly lower Covid-19 mortality rates (Seyed et al 2020). Thus, countries with higher vaccination rates tend to report fewer Covid-19 deaths.
Figure 8 Covid-19 cases and deaths by age group in America based on available data on October 29, 2020. Source: Centers for Disease Control and Prevention, CDC COVID Data
REFERENCES
1 Amore, S, Puppo, E, Melara, J, Terracciano, E, Gentili, S & Liotta, G 2021, ‘Impact of COVID-19 on older adults and role of long-term care facilities during early stages of epidemic in Italy’, Scientific Reports, vol 11, no 12530.
2 Bhadra, A, Mukherjee, A & Sarka, K 2021, 'Impact of population density on Covid-19 infected and mortality rate in India', Modeling Earth Systems and Environment, October, vol 7, pp 623–629, viewed 8 September 2021.
3 Carla, KJ & Mike, S 2021, ‘Nearly all COVID deaths in US are now among unvaccinated’, AP News, 30 June, viewed 7 September 2021, <https://apnews.com/article/coronavirus-pandemic-health-941fcf43d9731c76c16e7354f5d5e187>.
4 CFI n.d, R-Squared, Corporate Finance Institute, viewed 9 September 2021.
5 Frost, J 2015, Introduction to Statistics: An Intuitive Guide, pp 215, viewed 11 May 2020.
6 Hanna, J, Almasy, S & Holcombe, M 2021, ‘If you live in a state with a low vaccination rate, you're 4 times more likely to be hospitalized and more than 5 times more likely to die’, CNN Health, 19 August, viewed 7 September 2021, <https://edition.cnn.com/2021/08/18/health/us-coronavirus-wednesday/index.html>.
7 Jocelyn, SM 2021, ‘Delta and low vaccination rates projected to cause COVID-19 resurgence in United States’, News Medical Life Science, 7 September, viewed 7 September 2021, <https://www.news-medical.net/news/20210907/Delta-and-low-vaccination-rates-projected-to-cause-COVID-19-resurgence-in-United-States.aspx>.
8 Lulbadda, K, Kobbekaduwa, D & Guruge, M 2021, 'The impact of temperature, population size and median age on COVID-19 (SARS-CoV-2) outbreak', Clinical Epidemiology and Global Health, January, vol 9, pp 231-236, viewed 8 September
9 Marcela, QD 2021, ‘With Delta variant dominance, simulator predicts surge in COVID-19 deaths in most of U.S’, Massgeneral, 16 August, viewed 7 September 2021, <https://www.massgeneral.org/news/press-release/With-delta-variant-dominance-simulator-predicts-surge-in-covid-deaths-in-most-of-us>.
10 Martins, G 2020, ‘The Effects of COVID-19 Among the Elderly Population: A Case for Closing the Digital Divide’, Front Psychiatry, vol 11, no 577427.
11 Minseo, J 2021, ‘COVID-19 in Southeast Asia: Current situation and outlook’, Medical News Today, 11 August, viewed 7 September 2021, <https://www.medicalnewstoday.com/articles/covid-19-in-southeast-asia-current-situation-and-outlook>.
12 Nania, R 2021, ‘95 Percent of Americans Killed by COVID-19 Were 50 or Older’, AARP, 1 April, viewed 7 September 2021, <https://www.aarp.org/health/conditions-treatments/info-2020/coronavirus-deaths-older-adults.html>.
13 Our World in Data 2021, Cumulative confirmed COVID-19 deaths, Our World in Data, viewed 1 September 2021, <https://ourworldindata.org/grapher/cumulative-covid-deaths-region?year=latest&time 20-01-11 latest>.
14 Rajee, G 2021, ‘Column: What to know about the delta variant’, The Daily Tarheel, 6 September, viewed 7 September 2021, <https://www.dailytarheel.com/article/2021/09/opinion-delta-variant-unc-booster-shot>.
15 Reals, T 2021, ‘Study finds low rate of COVID-19 "breakthrough" infections, fewer symptoms in vaccinated people’, CBSNews, 2 September, viewed 7 September 2021, <https://www.cbsnews.com/news/covid-breakthrough-infections-vaccine-rate-symptoms-study/>.
16 Seyed, MM, Thomas, NV, Kevin, Z, Chad, RW, Affan, S, Burton, HS, Lauren, AM, Kathleen, MN, Joanne, ML, Meagan, CF & Alison PG 2020, ‘The impact of vaccination on COVID-19 outbreaks in the United States’, Clin Infect Dis, vol 2.
17 Tableau n.d, Guide To Data Cleaning: Definition, Benefits, Components, And How To Clean Your Data, Tableau, viewed 9 September 2021.
18 The New York Times 2021, ‘Nearly One-Third of U.S Coronavirus Deaths Are Linked to Nursing Homes’, The New York Times, 1 June, viewed 7 September 2021, <https://www.nytimes.com/interactive/2020/us/coronavirus-nursing-homes.html>.
19 World Health Organization 2020, ‘WHO delivers advice and support for older people during COVID-19’, WHO, 3 April, viewed 7 September 2021, <https://www.who.int/news-room/feature-stories/detail/who-delivers-advice-and-support-for-older-people-during-covid-19>.
Appendix 1 Applying the backward elimination for the regression model
To assess the significance of the five independent variables, we perform hypothesis testing at a 5% significance level. This involves comparing the p-value of each variable against the 5% threshold to determine its statistical significance.
Null Hypothesis: H0: Bj = 0 (independent variable j does not affect the dependent variable)
Alternative Hypothesis: H1: Bj ≠ 0 (independent variable j affects the dependent variable), j = 1, 2, 3, 4, 5
If the p-value exceeds 0.05, we do not reject the null hypothesis, indicating that the independent variable is insignificant. Conversely, a p-value below 0.05 leads us to reject the null hypothesis, concluding that the independent variable is significant.
Figure 9 The first regression output
The p-value for the average rainfall variable from 1991-2020 is 0.84, the highest among the five explanatory variables, and exceeds the alpha level, leading us to eliminate this variable from consideration.
Figure 10 Part of the regression output after eliminating the value of average rainfall from 1991-2020 (in mm).
The p-value for the variable "Medical doctors per 10,000" (2017) is 0.53, the highest among the four remaining explanatory variables. Since this value exceeds the significance level (α), we eliminate this variable.
Figure 11 Part of the regression output after eliminating the value of medical doctors (per 10,000).
The p-value for average temperature from 1991-2020 (in Celsius) is 0.76, the largest among the three remaining explanatory variables and higher than α, so we eliminate this variable.
Figure 12 Part of the regression output after eliminating the value of average temperature from 1991-2020 (in Celsius)
Figure 13 The final regression output.
Of the two remaining variables, only the population in 2020 (in millions) meets the significance level, as its p-value is below 0.05. Consequently, the other variable is discarded and the null hypothesis is rejected for population. This indicates that the population in 2020 (in millions) is the sole significant independent variable.
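The elimination steps above follow a simple loop: refit the model, find the variable with the largest p-value, drop it if that p-value exceeds α, and repeat. The sketch below shows the loop only. The function `stub_fit` stands in for an actual regression refit, and its p-values are placeholders: only 0.84 (rainfall), 0.53 (doctors) and 0.76 (temperature) come from this appendix, while every other number is hypothetical.

```python
ALPHA = 0.05


def backward_elimination(variables, fit_p_values, alpha=ALPHA):
    """Drop the least significant variable until all p-values are <= alpha."""
    remaining = list(variables)
    dropped = []
    while remaining:
        p_values = fit_p_values(remaining)  # refit on the remaining variables
        worst = max(remaining, key=lambda name: p_values[name])
        if p_values[worst] <= alpha:
            break  # every remaining variable is significant
        remaining.remove(worst)
        dropped.append(worst)
    return remaining, dropped


# Placeholder p-values keyed by the number of variables still in the model.
HYPOTHETICAL_P = {
    5: {"population": 0.001, "hospital_beds": 0.40, "doctors": 0.50,
        "rainfall": 0.84, "temperature": 0.60},
    4: {"population": 0.001, "hospital_beds": 0.35, "doctors": 0.53,
        "temperature": 0.45},
    3: {"population": 0.001, "hospital_beds": 0.30, "temperature": 0.76},
    2: {"population": 0.001, "hospital_beds": 0.20},
    1: {"population": 0.001},
}


def stub_fit(remaining):
    """Stand-in for a regression refit; returns placeholder p-values."""
    return HYPOTHETICAL_P[len(remaining)]


kept, removed = backward_elimination(
    ["population", "hospital_beds", "doctors", "rainfall", "temperature"],
    stub_fit,
)
print(kept, removed)
```

In a real analysis `stub_fit` would be replaced by a function that reruns the regression on the remaining variables and returns their p-values, since p-values change each time a variable is removed.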
Figure 14 The first regression output