
Use time series data to build Linear (LIN), Quadratic (QUA) and Exponential (EXP) trend model for South America



DOCUMENT INFORMATION

Basic information

Title: Use Time Series Data To Build Linear (LIN), Quadratic (QUA) And Exponential (EXP) Trend Model For South America
Authors: Nguyen Huynh Hong Ngoc, Nguyen Hong Bao Ngoc, Nguyen Phu Quan, Luu My Quan, Truong Nguyen Van Nhi
Supervisor: Ms. Greeni Maheshwari
School: RMIT University Vietnam
Subject: Business Statistics 1
Type: Team Assignment Report
Year of publication: 2021
City: Ho Chi Minh City
Number of pages: 35
File size: 1.45 MB

Content

Course Code: ECON1193
Course Name: Business Statistics 1
Semester: Sem B 2021
Location and Campus: RMIT University Vietnam, SGS
Student Name and Student ID:
  Nguyen Huynh Hong Ngoc, s3892061
  Nguyen Hong Bao Ngoc, s3891683
  Truong Nguyen Van Nhi, s3891684
  Nguyen Phu Quan, s3878567
  Luu My Quan, s3824171
Assigned Regions: South America, Asia
Lecturer: Ms Greeni Maheshwari
Word Count: 4036 words

BUSINESS STATISTICS
Team Assignment Report 3A

CONTRIBUTION

First name   Student ID   Parts contributed   Contribution   Signature
Ngoc         S3892061     1, 3, 4, 7.1        100%
Ngoc         S3891683     1, 3, 4, 7.1        100%
Quan         S3878567     1, 5, 6, 7.2        100%
Quan         S3824171     1, 5,               100%
Nhi          S3891684     1, 2, 7.3, 7.4      100%

TABLE OF CONTENTS

PART 1: DATA COLLECTION
PART 2: DESCRIPTIVE STATISTICS
  Central Tendency Measure
  Variation Measure
PART 3: MULTIPLE REGRESSION
  Region A: South America
    Final Regression Output
    Regression Equation
    Interpretation of the regression coefficient of the significant variable
    Interpretation of the coefficient of determination
  Region B: Asia
    Final Regression Output
    Regression Equation
    Interpretation of the regression coefficient of the significant variable
    Interpretation of the coefficient of determination
PART 4: TEAM REGRESSION CONCLUSION
  Non-technical perspective conclusion
PART 5: TIME SERIES
  A) South America
    Use time series data to build Linear (LIN), Quadratic (QUA) and Exponential (EXP) trend model for South America
      Linear Trend
      Exponential Trend
    Recommended Trend Model
    Predict the number of deaths due to Covid 19 in South America on September 28, September 29 and September 30
  B) Asia
    Use time series data to build Linear (LIN), Quadratic (QUA) and Exponential (EXP) trend model for Asia
      Quadratic Trend
    Recommended Trend Model
    Predict the number of deaths due to Covid 19 in Asia on September 28, September 29 and September 30
PART 6: TIME-SERIES CONCLUSION
  Line chart
  Recommended the best trend model to predict number of deaths due to
COVID-19 in the world
PART 7: OVERALL TEAM CONCLUSION
  The main factors that impact the number of deaths due to COVID-19
  Predicted number of deaths due to COVID-19 in the world on October 31, 2021
  The number of deaths by COVID 19 by the end of year 2020
  Two other variables might impact the number of COVID-19 deaths in the world
References
Appendix

PART 1: DATA COLLECTION

The data on the total number of deaths due to COVID-19 from April 1 to July 31, 2021 were collected from several credible sources. There are six variables in our datasets: population of the country (in millions, latest available), total number of deaths (per million population) due to COVID-19 between April 1 and July 31, 2021, average temperature (in Celsius) and average rainfall (in mm) based on available data from 1991 to 2020, hospital beds (per 10,000 people, latest available) and medical doctors (per 10,000 people, latest available). The data cover 14 countries in Region A: South America and 35 countries in Region B: Asia. After a cleaning process that removed countries with missing information, our datasets include 12 South American countries and 29 Asian countries.

PART 2: DESCRIPTIVE STATISTICS

Central Tendency Measure:

          South America   Asia
Mean      36613.58        14717.74
Median    11358.50        2008.00
Mode      None            None

Table 1: Measures of Central Tendency of total number of deaths due to COVID-19 between April 1 and July 31, 2021, in South America and Asia (per million population)

As the datasets of South America and Asia do not contain Mode values, the Mode cannot be used to estimate the number of fatalities attributable to Covid-19, which leaves the Mean and the Median as candidate measures for comparison. Both datasets, however, have outliers. There are no observations below Q1 - 1.5*IQR in either dataset, but there is one observation above Q3 + 1.5*IQR in South America and two in Asia, so there are three outliers in total. As a result, the Mean
cannot be an optimal measure of central tendency, and the Median is the most appropriate measure to represent the data. As can be seen in Figure 1, the Median of South America is about 5.7 times greater than that of Asia, at 11358.5 and 2008 respectively (per million population). This shows that 50% of South American nations have more than 11358 Covid-19 deaths (from April 1 to July 31) and 50% have fewer, while in Asia 50% of countries have more than 2008 deaths and the remainder have fewer. In brief, there is a substantial difference between South America and Asia in the total number of COVID-19 fatalities: South America has a significantly greater number of COVID-19 deaths than Asia.

Figure 1: Box-and-whisker plots of total number of deaths due to COVID-19 in South America and Asia (per million population)

The box-and-whisker plots clearly show three outliers, which affect sensitive measures such as the Mean and the Range. Both distributions are right-skewed, as shown by the shorter box sections and whiskers on the left. Furthermore, the box plot for South America sits higher than Asia's, and Asia's quartiles are smaller than South America's, indicating that the number of COVID-19 fatalities in South American countries is greater than in Asian countries.

Variation measure:

                               South America    Asia
Range                          229874.00        260413.00
IQR                            46058            5834
Sample Variance                4166990956.45    2509753753.35
Standard Deviation             64552.23         50097.44
Coefficient of Variation (%)   176%             340%

Table 2: Measures of Variation of total deaths in South America and Asia (Unit: number of deaths per million population)

The Range is not the best way to describe the number of COVID-19 deaths because both datasets have outliers. Additionally, the Standard Deviation (SD) and Sample Variance (SV) are not suitable measures. They can be
heavily influenced by the outliers, since the Mean is used to calculate them. In addition, because the distribution of both datasets is right-skewed, the CV is not the optimal measure of variation. As the IQR is not affected by outliers, it is the most appropriate measure here, since it quantifies the spread of the middle 50% of the data around the Median. As can be seen in Figure 1, the IQR of South America is nearly 8 times that of Asia, at 46058 and 5834 respectively (per million population). On this basis, the Asia dataset is more consistent, and there are more Covid-19 deaths per country in South America than in Asia.

PART 3: MULTIPLE REGRESSION:

The data is collected for six variables from two regions, South America and Asia:

o Population of the country (in millions)
o Total number of deaths (per million population) due to COVID-19
o Average rainfall (in mm)
o Average temperature (in Celsius)
o Hospital beds (per 10,000 people)
o Medical doctors (per 10,000 people)

The total number of deaths is the response variable, while the other five are explanatory variables.

Region A: South America

Final Regression Output:

After backward elimination, population is the only variable that is significant at the 5% level, since its p-value is smaller than the significance level (0 < 0.05). The null hypothesis is therefore rejected, and we conclude that the national population (in millions) has an impact on the total number of deaths (per million population) by COVID-19.

Figure 2: Final Regression model of South America

Regression Equation:

$$\widehat{\text{Total number of deaths caused by COVID-19}} = b_0 + b_1 \times \text{Population in 2020 (in millions)}$$

$$\widehat{\text{Total number of deaths caused by COVID-19}} = -2727.65 + 1096.26 \times \text{Population in 2020 (in millions)}$$

Interpretation of the regression coefficient of the
significant variable:

Since the coefficient of Population in 2020 (in millions) is positive (1096.26), there is a positive relationship between the two variables. If the population increases by one unit (one million people), the total number of deaths caused by COVID-19 rises by approximately 1096, ceteris paribus. The intercept indicates a negative number of deaths when the population is zero, which has no practical meaning since a population can never equal zero.

Interpretation of the coefficient of determination:

The R², or coefficient of determination, is 97%, which means that 97% of the variation in the total number of deaths by COVID-19 can be explained by the variable Population (in millions). The remaining 3% is explained by other factors.

Region B: Asia

Final Regression Output:

As in South America, after backward elimination only Population (in millions) among the five independent variables is significant at the 5% level, owing to its small p-value (0 < 0.05). The null hypothesis is therefore rejected, and any change in the population (in millions) could affect the total deaths (per million population) by COVID-19.

Figure 3: Final Regression model of Asia

Regression Equation:

$$\widehat{\text{Total number of deaths caused by COVID-19}} = b_0 + b_1 \times \text{Population in 2020 (in millions)}$$

$$\widehat{\text{Total number of deaths caused by COVID-19}} = -4535.41 + 186.69 \times \text{Population in 2020 (in millions)}$$

Interpretation of the regression coefficient of the significant variable:

The coefficient of Population in 2020 (in millions) in Asia is positive, meaning that there is also a positive relationship between the two variables. If the population increases by one unit (one million people), deaths caused by COVID-19 increase by approximately 187, ceteris paribus. The intercept indicates a negative number of total deaths
by COVID-19 when the population is zero, which has no practical meaning since a population can never equal zero.

Interpretation of the coefficient of determination:

The coefficient of determination, or R², indicates that 97% of the variation in the total number of deaths by COVID-19 is explained by Population (in millions). Other factors that affect the dependent variable account for the remaining 3%.

PART 4: TEAM REGRESSION CONCLUSION:

                               SOUTH AMERICA                                           ASIA
Regression Equation            -2727.65 + 1096.26 * Population in 2020 (in millions)   -4535.41 + 186.69 * Population in 2020 (in millions)
Coefficient of Determination   97%                                                     97%
Population (in millions)       439363                                                  397379

Table 3: Summary of information calculated in Part 3 (Unit: in millions)

According to the two regression models discussed above, COVID-19 mortality in both the Asia and South America datasets has the same significant independent variable at the 95% confidence level: the population in 2020 (in millions). The remaining variables (hospital beds, medical doctors, average rainfall and average temperature) were eliminated because they do not explain the variation in COVID-19 deaths. The effect of population on the number of deaths due to COVID-19 is therefore supported, and population can be used to estimate the number of COVID-19 deaths in Asia and South America.

In general, population contributes the same level of explanatory power in both Asia and South America, as both regression models have the same coefficient of determination (97%). The higher the R², the better the model fits the data (CFI n.d.), which further indicates that the regression model fits the prediction (Frost 2015). In this case, the forecasts of COVID-19 deaths in Asia and South America are close to the real-life figures. Moreover, the slope coefficients for South America and Asia (1096.26 and 186.69, respectively) are both positive, which further implies a positive relationship between
death cases caused by COVID-19 and the population figure.

o In Asia, the coefficient of 186.69 indicates that the number of deaths due to COVID-19 increases by 186.69 for every additional million people among the countries in the chosen region.
o In South America, the coefficient of 1096.26 indicates that the number of deaths caused by COVID-19 increases by 1096.26 for every additional million people among the assigned countries.

On the other hand, the intercept of the regression line for Asia (-4535.41) is about 1.7 times larger in magnitude than that for South America (-2727.65); therefore, the number of COVID-19 deaths in Asian countries would be more affected if the population increases.

During the period from April 1 to July 31, the two most heavily populated countries, Brazil and India, also recorded the highest total deaths in their regions, 230176 and 260414 (per million population), respectively. The coefficient of determination in the two final regression models also indicates that 97% of the variation in the total number of deaths by COVID-19 can be explained by Population (in millions).

Predicted number of deaths due to COVID-19 in the world on October 31, 2021:

According to Part 6, South America's exponential trend is the most accurate trend model for predicting the number of deaths due to Covid-19 in the world. In this part, we use the formula

$$\hat{Y} = 4852.885 \times 1.005^{T}$$

with time period T = 214 for October 31, 2021:

$$\hat{Y} = 4852.885 \times 1.005^{214} \approx 14110 \text{ deaths}$$

Based on this calculation, the number of deaths due to Covid-19 in the world on October 31, 2021 is predicted to be about 14110, showing an upward trend in the number of deaths.

The number of deaths by COVID 19 by the end of year 2020:

Based on the results of the analysis above, COVID-19 fatalities are not expected to decline in the long run and may grow at an accelerating rate. As previously stated, the number of deaths due
to COVID-19 in the world on October 31 this year is predicted to be about 14110, which points to a future with increasing fatalities. Using the same formula as in 7.2, the predicted number of COVID-19 deaths on December 31 (T = 275) is:

$$\hat{Y} = 4852.885 \times 1.005^{275} \approx 19128 \text{ deaths}$$

If current social distancing practices and vaccination rates do not improve, the model forecasts that in some states daily COVID-19-related fatalities might surpass the peak daily deaths that occurred in early 2021 (Marcela 2021). The world has witnessed the emergence of the Delta variant, which is more infectious than the preceding variants. In unvaccinated individuals, some research suggests that the new variant may produce more severe disease than earlier versions (Rajee 2021). The vaccination rate in the world is still low, especially in Asia and Africa, which can increase the severity of sickness and the likelihood of disease outbreaks. Under-resourced healthcare systems, vaccine hesitancy and ineffective vaccine delivery all contribute to the present record COVID-19 case numbers in the area; Asia, especially the South East region, has been suffering outbreaks of Covid-19 cases and a significant rise in the mortality rate (Minseo 2021). In brief, in our team's opinion, the number of deaths due to COVID-19 will increase by the end of this year.

Two other variables might impact the number of COVID-19 deaths in the world:

The number of Covid-19 fatalities may be influenced by factors other than the six variables considered in this study. To better understand what affects the Covid-19 mortality rate in the world, we consider two other variables that may influence the number of deaths.

The first variable, which is a numerical variable, is the proportion of elderly people in the population. The elderly have not only been the most severely affected by both the virus and the lockdown
measures but have also benefited least from physical and digital mitigation measures (Martins 2020). As stated by the Centers for Disease Control and Prevention, nursing home residents are at high risk of contracting the coronavirus and dying as a result (The New York Times 2021). In addition, older people in Long Term Care Facilities (LTCF) have been hit particularly hard by the COVID-19 epidemic all around the world; an estimated 55.3% of fatalities caused by the pandemic are in the age group over 80 (Amore et al. 2021). Moreover, according to the latest statistics available from the Centers for Disease Control and Prevention, around eight out of ten Covid-19 deaths occurred among adults aged 65 and older, particularly those with pre-existing health problems (Nania 2021). As can be seen from Figure 8, the majority of Covid-19 deaths have been reported in people aged 50 years and over. Although COVID-19 may infect people of all ages, older people are at a higher risk of serious illness because of physiological changes that come with aging and certain underlying health problems that make them more susceptible (WHO 2020). Therefore, it can be concluded that the proportion of elderly people may be directly related to Covid-19 fatalities, indicating that countries with a high proportion of elderly persons will have more Covid-19 deaths.

The second variable taken into account is the vaccination rate, which is also a numerical variable. There is now a consensus that vaccination is the best way to minimize the number of deaths caused by Covid-19. Vaccines were never meant to be 100% effective at preventing infection, but rather to lower the number of illnesses in a community and, most critically, to lessen the severity of disease in those who contract Covid-19 (Reals 2021). As more individuals are vaccinated, the effect of Delta will likely be lessened, preventing roughly 1.5 million COVID-19 infections and 21,000 fatalities. Vaccinating 80% of the
country by January 1, 2022 would reduce Delta's impact on the nation, resulting in a 20% reduction in cases and a 22% reduction in deaths (Jocelyn 2021). Hospitalization and mortality rates are substantially greater in areas with low vaccination rates, even though the vast majority of Americans live in a region with high Covid-19 transmission rates. States with low vaccination rates had a hospitalization rate of 39 persons per 100,000 population, compared to 10 per 100,000 in the ten most vaccinated states, and their death rates were more than 5.5 times higher, according to the US Department of Health and Human Services (Hanna, Almasy & Holcombe 2021). According to Andy Slavitt, a former COVID-19 advisor to the Biden administration, 98-99% of Americans who die from the coronavirus are unvaccinated (Carla & Mike 2021). Even with minimal protection against infection, vaccination can have a significant influence on the COVID-19 mortality rate (Seyed et al. 2020). This also means that countries with higher vaccination rates will have fewer Covid-19 deaths.

Figure 8: Covid-19 cases and deaths by age group in America based on available data on October 29, 2020. Source: Centers for Disease Control and Prevention, CDC COVID Data Tracker

References:

Amore, S, Puppo, E, Melara, J, Terracciano, E, Gentili, S & Liotta, G 2021, 'Impact of COVID-19 on older adults and role of long-term care facilities during early stages of epidemic in Italy', Scientific Reports 11, no 12530

Bhadra, A, Mukherjee, A & Sarka, K 2021, 'Impact of population density on Covid-19 infected and mortality rate in India', Modeling Earth Systems and Environment, October, vol 7, p 623–629, viewed September 2021

Carla, KJ & Mike, S 2021, 'Nearly all COVID deaths in US are now among unvaccinated', AP News, 30 June, viewed September 2021, <https://apnews.com/article/coronavirus-pandemic-health941fcf43d9731c76c16e7354f5d5e187?
fbclid=IwAR3wMMowCuF_EAK9emaUXz9jKknmHor2PAZt6NlCeTWnpCw1T7AyTdAwH4>

CFI n.d, R-Squared, Corporate Finance Institute, viewed September 2021

Frost, J 2015, Introduction to Statistics: An Intuitive Guide, pp 215, viewed 11 May 2020

Hanna, J, Almasy, S & Holcombe, M 2021, 'If you live in a state with a low vaccination rate, you're 4 times more likely to be hospitalized and more than 5 times more likely to die', CNN Health, 19 August, viewed September 2021, <https://edition.cnn.com/2021/08/18/health/us-coronavirus-wednesday/index.html>

Jocelyn, SM 2021, 'Delta and low vaccination rates projected to cause COVID-19 resurgence in United States', News Medical Life Science, September, viewed September 2021, <https://www.news-medical.net/news/20210907/Delta-and-low-vaccination-rates-projected-to-cause-COVID-19-resurgence-in-United-States.aspx>

Lulbadda, K, Kobbekaduwa, D & Guruge, M 2021, 'The impact of temperature, population size and median age on COVID-19 (SARS-CoV-2) outbreak', Clinical Epidemiology and Global Health, January, vol 9, pp 231-236, viewed September 2021, ScienceDirect database

Marcela, QD 2021, 'With Delta variant dominance, simulator predicts surge in COVID19 deaths in most of U.S', Massgeneral, 16 August, viewed September 2021, <https://www.massgeneral.org/news/press-release/With-delta-variant-dominancesimulator-predicts-surge-in-covid-deaths-in-most-of-us>

Martins, G 2020, 'The Effects of COVID-19 Among the Elderly Population: A Case for Closing the Digital Divide', Front Psychiatry, vol.11, no 577427

Minseo, J 2021, 'COVID-19 in Southeast Asia: Current situation and outlook', Medical News Today, 11 August, viewed September 2021, <https://www.medicalnewstoday.com/articles/covid-19-in-southeast-asia-current-situationand-outlook>

Nania, R 2021, '95 Percent of Americans Killed by COVID-19 Were 50 or Older', AARP, April, viewed September 2021, <https://www.aarp.org/health/conditionstreatments/info-2020/coronavirus-deaths-older-adults.html?
fbclid=IwAR3uW3nbtc66zq4KEVNX81tBgEu0C1Wm3yptmMX4mmxrUZhl4Ks8FfqN gGs>

Our World in Data 2021, Cumulative confirmed COVID-19 deaths, Our World in Data, viewed September 2021, <https://ourworldindata.org/grapher/cumulative-coviddeaths-region?year=latest&time=2020-01-11 latest>

Rajee, G 2021, 'Column: What to know about the delta variant', The Daily Tarheel, September, viewed September 2021, <https://www.dailytarheel.com/article/2021/09/opinion-delta-variant-unc-booster-shot>

Reals, T 2021, 'Study finds low rate of COVID-19 "breakthrough" infections, fewer symptoms in vaccinated people', CBSNews, September, viewed September 2021, <https://www.cbsnews.com/news/covid-breakthrough-infections-vaccine-rate-symptomsstudy/>

Seyed, MM, Thomas, NV, Kevin, Z, Chad, RW, Affan, S, Burton, HS, Lauren, AM, Kathleen, MN, Joanne, ML, Meagan, CF & Alison PG 2020, 'The impact of vaccination on COVID-19 outbreaks in the United States', Clin Infect Dis, vol.2

Tableau n.d, Guide To Data Cleaning: Definition, Benefits, Components, And How To Clean Your Data, Tableau, viewed 2021, September

The New York Times 2021, 'Nearly One-Third of U.S Coronavirus Deaths Are Linked to Nursing Homes', The New York Times, June, viewed September 2021, <https://www.nytimes.com/interactive/2020/us/coronavirus-nursing-homes.html? fbclid=IwAR16VddvVVirylq6JQJW0uy0ALlz3vy4wZKkr3XzV39KK1uNYScZSqcChw>

World Health Organization 2020, 'WHO delivers advice and support for older people during COVID-19', WHO, April, viewed September 2021, <https://www.who.int/news-room/feature-stories/detail/who-delivers-advice-and-supportfor-older-people-during-covid-19?
fbclid=IwAR1xI_9oyUM1vIG5O3pYTRZAoRTUhLlDNbXFxiHVpY0J8JMec7MJJH_t nCs>

Appendix:

Appendix 1: Applying backward elimination for the regression model

Since the significance level is 5%, we test the null hypothesis for each of the five independent variables by comparing its p-value with the significance level.

Null Hypothesis: H0: Bj = 0 (the independent variable does not affect the dependent variable)
Alternative Hypothesis: H1: Bj ≠ 0 (the independent variable affects the dependent variable)
j = 1, 2, 3, 4, 5

If the p-value is higher than 0.05, we fail to reject the null hypothesis and conclude that the independent variable is insignificant. If the p-value is lower than 0.05, we reject the null hypothesis and conclude that the independent variable is significant.

Region A: South America

Figure 9: The first regression output

The p-value for average rainfall from 1991-2020 (in mm) is the largest among the five explanatory variables at 0.84, which is higher than α, so we eliminate this variable first.

Figure 10: Part of the regression output after eliminating average rainfall from 1991-2020 (in mm)

Next, the p-value for medical doctors in 2017 (per 10,000) is the largest among the four remaining explanatory variables at 0.53, which is higher than α, so we eliminate this variable.

Figure 11: Part of the regression output after eliminating medical doctors (per 10,000)

The p-value for average temperature from 1991-2020 (in Celsius) is the largest among the three remaining explanatory variables at 0.76, which is higher than α, so we eliminate this variable.

Figure 12: Part of the regression output after eliminating average temperature from 1991-2020 (in Celsius)

Figure 13: The final regression output

Of the last two variables, only the variable named population in 2020
(in millions) satisfies the significance level, as its p-value is lower than 0.05. Hence, the other variable is eliminated and the null hypothesis is rejected, meaning that only population in 2020 (in millions) is a significant independent variable.

Region B: Asia

Figure 14: The first regression output

The p-value for medical doctors (per 10,000) is the largest among the five explanatory variables at 0.31, which is higher than α, so we eliminate this variable first.

Figure 15: Part of the regression output after eliminating medical doctors (per 10,000)

Next, the p-value for hospital beds (per 10,000 people) is the largest among the four remaining explanatory variables at 0.31, which is higher than α, so we eliminate this variable.

Figure 16: Part of the regression output after eliminating hospital beds (per 10,000 people)

The p-value for average rainfall from 1991-2020 (in mm) is the largest among the three remaining explanatory variables at 0.24, which is higher than α, so we eliminate this variable.

Figure 17: Part of the regression output after eliminating average rainfall from 1991-2020 (in mm)

Figure 18: The final regression output

Of the last two variables, only population in 2020 (in millions) satisfies the significance level, as its p-value is lower than 0.05. Hence, the other variable is eliminated and the null hypothesis is rejected, meaning that only population in 2020 (in millions) is a significant independent variable.

Appendix 2: Hypothesis testing for significance for South America with p-value:

Linear Trend:

Assuming a significance level α = 0.05:
H0: β1 = 0 (There is no linear trend)
H1: β1 ≠ 0 (There is a linear trend)

We compare the p-value with the significance level (α = 0.05) to test the null hypothesis:
-If p-value < α => We reject the null hypothesis => There is a linear trend
-If p-value > α => We do not
reject the null hypothesis => There is no linear trend

                   Coefficients   Standard Error   t Stat       P-value   Lower 95%    Upper 95%
Intercept          4708.072754    207.15975        22.726774    ≈ 0       4297.9109    5118.235
Time Period (T)    -17.09074717   2.92311166       -5.8467651   ≈ 0       -22.878305   -11.30319

Figure 19: Regression output of the linear trend of the number of deaths in South America from April 1 to July 31, 2021

As the p-value of the linear trend ≈ 0 < α = 0.05, we reject H0. Therefore, there is a linear trend in South America's data set at the 95% level of confidence.

Quadratic Trend:

Assuming a significance level α = 0.05:
H0: β2 = 0 (There is no quadratic trend)
H1: β2 ≠ 0 (There is a quadratic trend)

We compare the p-value with the significance level (α = 0.05) to test the null hypothesis:
-If p-value < α => We reject the null hypothesis => There is a quadratic trend
-If p-value > α => We do not reject the null hypothesis => There is no quadratic trend

                         Coefficients   Standard Error   t Stat       P-value      Lower 95%    Upper 95%
Intercept                4430.22175     313.453159       14.133601    ≈ 0          3809.5532    5050.89
Time Period (T)          -3.64634382    11.7646259       -0.3099413   0.7571476    -26.941478   19.64879
Time Period (T square)   -0.10930409    0.09265783       -1.179653    0.24049118   -0.2927759   0.074168

Figure 20: Regression output of the quadratic trend of the number of deaths in South America from April 1 to July 31, 2021

As the p-value of the quadratic trend = 0.24 > α = 0.05, we do not reject H0. As a result, there is no quadratic trend in South America's data set at the 95% level of confidence.

Exponential Trend:

Assuming a significance level α = 0.05:
H0: β1 = 0 (There is no exponential trend)
H1: β1 ≠ 0 (There is an exponential trend)

We compare the p-value with the significance level (α = 0.05) to test the null hypothesis:
-If p-value < α => We reject the null hypothesis => There is an exponential trend
-If p-value > α => We do not reject the null hypothesis =>
There is no exponential trend

                   Coefficients   Standard Error   t Stat       P-value   Lower 95%    Upper 95%
Intercept          3.68579029     0.02268769       162.45772    ≈ 0       3.6408702    3.73071
Time Period (T)    -0.00240451    0.00032013       -7.5109864   ≈ 0       -0.0030384   -0.001771

Figure 21: Regression output of the exponential trend of the number of deaths in South America from April 1 to July 31, 2021

As the p-value of the exponential trend ≈ 0 < α = 0.05, we reject H0. As a result, there is an exponential trend in South America's data set at the 95% level of confidence.

Appendix 3: Hypothesis testing for significance for Asia with p-value:

Linear Trend:

Assuming a significance level α = 0.05:
H0: β1 = 0 (There is no linear trend)
H1: β1 ≠ 0 (There is a linear trend)

We compare the p-value with the significance level (α = 0.05) to test the null hypothesis:
-If p-value < α = 0.05 => We reject the null hypothesis => There is a linear trend
-If p-value > α = 0.05 => We do not reject the null hypothesis => There is no linear trend

                   Coefficients   Standard Error   t Stat        P-value   Lower 95%     Upper 95%
Intercept          3946.9832      249.1167016      15.84391241   ≈ 0       3453.749468   4440.21693
Time period (T)    -1.7089098     3.515141989      -0.4861567    0.62774   -8.66864633   5.25082668

Figure 22: Regression output of the linear trend of the number of deaths in Asia from April 1 to July 31, 2021

As the p-value of the linear trend = 0.63 > α = 0.05, we do not reject H0. As a result, there is no linear trend in Asia's data set at the 95% level of confidence.

Quadratic Trend:

Assuming a significance level α = 0.05:
H0: β2 = 0 (There is no quadratic trend)
H1: β2 ≠ 0 (There is a quadratic trend)

We compare the p-value with the significance level (α = 0.05) to test the null hypothesis:
-If p-value < α => We reject the null hypothesis => There is a quadratic trend
-If p-value > α => We do not reject the null hypothesis => There is no quadratic trend

                         Coefficients   Standard Error   t Stat        P-value   Lower 95%     Upper 95%
Intercept                2493.341       335.1052526      7.44047139    ≈ 0       1829.799175   3156.88291
Time period (T)          68.628614      12.57727931      5.45655480    ≈ 0       43.72434461   93.532883
Time period (T square)   -0.5718498     0.099058265      -5.77286293   ≈ 0       -0.76799504   -0.3757045

Figure 23: Regression output of the quadratic trend of the number of deaths in Asia from April 1 to July 31, 2021

As the p-value of the quadratic trend ≈ 0 < α = 0.05, we reject H0. As a result, there is a quadratic trend in Asia's data set at the 95% level of confidence.

Exponential Trend:

Assuming a significance level α = 0.05:
H0: β1 = 0 (There is no exponential trend)
H1: β1 ≠ 0 (There is an exponential trend)

We compare the p-value with the significance level (α = 0.05) to test the null hypothesis:
-If p-value < α => We reject the null hypothesis => There is an exponential trend
-If p-value > α => We do not reject the null hypothesis => There is no exponential trend

                   Coefficients   Standard Error   t Stat        P-value   Lower 95%     Upper 95%
Intercept          3.5467379      0.029709235      119.3816644   ≈ 0       3.48791566    3.60556009
Time period (T)    0.0001467      0.00041921       0.34991285    0.72702   -0.00068332   0.00097669

Figure 24: Regression output of the exponential trend of the number of deaths in Asia from April 1 to July 31, 2021

As the p-value of the exponential trend = 0.73 > α = 0.05, we do not reject H0. As a result, there is no exponential trend in Asia's data set at the 95% level of confidence.

Appendix 4:

Figure 25: Screenshot of the total daily number of deaths in South America and Asia from April 1 to October 31, 2021 (Unit: deaths per day)
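The three trend models named in the title (LIN, QUA, EXP) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' spreadsheet workflow: the helper names `fit_polynomial` and `fit_trends` are hypothetical, the series is synthetic (the report's daily data is not reproduced here), and the exponential model is assumed to be fitted by regressing the base-10 logarithm of the series on the time period T.

```python
import math


def fit_polynomial(t, y, degree):
    """Least-squares polynomial fit; returns [b0, b1, ...] for b0 + b1*T + ..."""
    n = degree + 1
    # Normal equations (X^T X) b = X^T y for the columns 1, T, T^2, ...
    a = [[sum(ti ** (i + j) for ti in t) for j in range(n)] for i in range(n)]
    rhs = [sum((ti ** i) * yi for ti, yi in zip(t, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution
    coeffs = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(a[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (rhs[row] - tail) / a[row][row]
    return coeffs


def fit_trends(y):
    """Fit LIN, QUA and EXP trend models with T = 1, 2, ..., len(y)."""
    t = list(range(1, len(y) + 1))
    lin = fit_polynomial(t, y, 1)
    qua = fit_polynomial(t, y, 2)
    # EXP: log10(Y) = b0 + b1*T, so Y = (10**b0) * (10**b1)**T
    log_b0, log_b1 = fit_polynomial(t, [math.log10(v) for v in y], 1)
    return {"LIN": lin, "QUA": qua, "EXP": [10 ** log_b0, 10 ** log_b1]}


# Hypothetical series that follows 5000 * 0.99**T exactly (not the report's data)
series = [5000 * 0.99 ** t for t in range(1, 31)]
models = fit_trends(series)
print(models["EXP"])
```

On a real series, the three fitted models would then be compared with the significance tests laid out in Appendices 2 and 3.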

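The two forecasts in Part 7 can be checked with a short sketch. It takes the coefficients 4852.885 and 1.005 directly from the formula stated in the report and assumes that T = 1 corresponds to April 1, 2021, which is consistent with T = 214 for October 31 and T = 275 for December 31. The function names are hypothetical.

```python
from datetime import date

LEVEL = 4852.885          # multiplier in the report's Part 7 formula
GROWTH = 1.005            # daily growth factor in the same formula
START = date(2021, 4, 1)  # assumption: T = 1 on April 1, 2021


def time_period(day: date) -> int:
    """Convert a calendar date to the time period T used in the trend model."""
    return (day - START).days + 1


def forecast_deaths(t: int) -> float:
    """Exponential trend forecast: LEVEL * GROWTH ** T."""
    return LEVEL * GROWTH ** t


for day in (date(2021, 10, 31), date(2021, 12, 31)):
    t = time_period(day)
    print(day.isoformat(), t, round(forecast_deaths(t)))
```

Rounded to whole deaths, the sketch returns the report's figures of 14110 and 19128.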
Date posted: 27/04/2022, 08:25
