VIETNAM NATIONAL UNIVERSITY OF HO CHI MINH CITY
Lê Huỳnh Tuấn Kiệt_MAMAIU18066
Nguyễn Trần Duy Tân_MAMAIU18031
I. Overall information of topic:
Our team chose to analyze global warming in Vietnam. The purpose of our project is to investigate statistically whether there is a significant change in temperature over time in Vietnam, especially in April. From our sources, we collected two distinct data sets, which contain the temperatures over the years from 1931 to 1960 and from 1991 to 2016, respectively.
Data set 1: From 1931 to 1960
Data set 2: From 1991 to 2016
From those raw data sets, we aim to analyse only the April temperatures over the years. Therefore, we took two samples from them:
Sample 1: Temperature in April from 1931 to 1960
Sample 2: Temperature in April from 1991 to 2016
II. Analysis of the temperature:
1) Historical analysis over the two samples:
Using the ToolPak of Excel and the knowledge from our Statistics course, we calculated summary statistics that describe the distributions of sample 1 and sample 2.
Overall, these numbers show that the average temperature changed only slightly between the two periods. The median temperature of the second period is slightly higher than that of the first period.
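The summary statistics discussed above can also be reproduced outside Excel. The following is a minimal sketch, assuming Python's standard `statistics` module; the function names `describe` and `sample_skewness` are hypothetical, and the four values are only the first four April temperatures of sample 2 listed later in this report, used for illustration rather than the full sample.

```python
import statistics


def sample_skewness(values):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)^3).

    A positive value suggests a right-skewed sample, a negative value a
    left-skewed one.
    """
    n = len(values)
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)


def describe(values):
    """Return the basic numbers used to compare the two samples."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # sample variance
        "min": min(values),
        "max": max(values),
        "skewness": sample_skewness(values),
    }


# Illustration only: first four April values of sample 2 (1991 to 1994).
april_subset = [26.1502, 26.6464, 25.4547, 26.7231]
print(describe(april_subset))
```

Running `describe` on each full sample gives the quantities that the comparison of means, medians, variances and skewness relies on.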
[Chart: First Sample, April temperature by year]
[Chart: Second Sample, April temperature by year]
The graphs above indicate that the April temperature fluctuated considerably from year to year, within the range of 24 to 27 degrees.
Visually, the maximum and minimum temperatures of the first sample are greater than those of the second sample. Furthermore, from these charts, we can predict that the variances of the two samples are approximately equal.
From these charts, sample 1 is not normally distributed, and neither is sample 2. Moreover, the data of sample 1 are right-skewed, while the data of sample 2 are left-skewed.
2) Hypothesis Test:
a) Hypothesis test of equality of variances:
From the previous part, we found that the two samples cannot be assumed to be normally distributed. Therefore, at this point, we do not know whether the variances of the two populations are equal. Hence, we need to check this with Levene's test, which is less sensitive to departures from normality than the F-test for equal variances.
Levene's test theory:
Definition: Levene's test is used to test if k samples have equal variances. Equal variances across samples is called homogeneity of variance. Some statistical tests, for example the analysis of variance, assume that variances are equal across groups or samples. The Levene test can be used to verify that assumption.
Given a variable Y with a sample of size N divided into k subgroups, where N_i is the sample size of the i-th subgroup, Levene's test statistic is defined as:
$$W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^{k} N_i \left(Z_{i.} - Z_{..}\right)^2}{\sum_{i=1}^{k} \sum_{j=1}^{N_i} \left(Z_{ij} - Z_{i.}\right)^2}$$
Where:
- $k$ is the number of different groups to which the sampled cases belong
- $N_i$ is the number of cases in the i-th group
- $N$ is the total number of cases in all groups
- $Y_{ij}$ is the value of the measured variable for the j-th case from the i-th group
- $Z_{ij} = |Y_{ij} - \bar{Y}_{i.}|$, where $\bar{Y}_{i.}$ is the mean of the i-th group
- $Z_{i.} = \frac{1}{N_i} \sum_{j=1}^{N_i} Z_{ij}$ is the mean of the $Z_{ij}$ for the i-th group
- $Z_{..} = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{N_i} Z_{ij}$ is the mean of all $Z_{ij}$
We can easily calculate the components of W:
It is clear from the statistical evidence that the p-value (P(F0.05,55,1 > W)) is greater than α = 0.05. Therefore, we do not reject the assumption that the variances of the two populations are equal.
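The W statistic defined in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the spreadsheet calculation used in the report: the function name `levene_w` is hypothetical, the absolute deviations are taken from each group mean as in the definition above, and the p-value is not computed because the standard library has no F distribution.

```python
from statistics import mean


def levene_w(*groups):
    """Levene's W using absolute deviations from each group mean."""
    k = len(groups)                       # number of groups
    n_total = sum(len(g) for g in groups)  # N, total number of cases

    # Z_ij = |Y_ij - mean of group i|
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_group_means = [mean(zi) for zi in z]                 # Z_i.
    z_grand_mean = sum(sum(zi) for zi in z) / n_total      # Z_..

    # Numerator sum: N_i * (Z_i. - Z_..)^2 over the groups
    between = sum(
        len(zi) * (zm - z_grand_mean) ** 2
        for zi, zm in zip(z, z_group_means)
    )
    # Denominator sum: (Z_ij - Z_i.)^2 over all cases
    within = sum(
        (zij - zm) ** 2
        for zi, zm in zip(z, z_group_means)
        for zij in zi
    )
    return (n_total - k) / (k - 1) * between / within


# Two small made-up groups, for illustration only (not the report's data).
print(levene_w([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

W is then compared with an F distribution with k - 1 and N - k degrees of freedom. If SciPy is available, `scipy.stats.levene(sample1, sample2, center='mean')` should give the same statistic together with a p-value; its default `center='median'` is the Brown-Forsythe variant, which differs from the definition used here.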
b) Hypothesis test of the change of the average temperature between the two populations:
Using the t-test, assuming equal variances:
H0: μ2 ≥ μ1
Ha: μ2 < μ1
With a level of significance of 0.02:
It is clear that the critical value 2.104 < |t-stat|; therefore, it is reasonable to reject H0. That is, at the 0.02 level of significance, the average April temperature did not increase over the years.
With a level of significance of 0.05:
It is clear that the critical value 1.6736 < |t-stat|; therefore, it is reasonable to reject H0. That is, at the 0.05 level of significance, the average April temperature still did not increase over the years.
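The decision rule used in both cases can be sketched as follows. This is a minimal illustration, assuming the pooled-variance form of the two-sample t statistic; the function names `pooled_t_statistic` and `reject_h0_left_tailed` are hypothetical, the small lists are made-up values rather than the report's samples, and the thresholds 2.104 and 1.6736 are simply the values quoted in the text.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """t statistic for mean(sample2) - mean(sample1), assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    t = (mean(sample2) - mean(sample1)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df


def reject_h0_left_tailed(t_stat, critical_value):
    """Ha: mu2 < mu1, so H0 is rejected when t falls below -critical_value."""
    return t_stat < -critical_value


# Illustration only: made-up values, not the April samples.
t_stat, df = pooled_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_stat, df)
for critical in (2.104, 1.6736):  # thresholds quoted in the text
    print(critical, reject_h0_left_tailed(t_stat, critical))
```

Because the alternative hypothesis is one-sided (μ2 < μ1), the sign of the t statistic matters: comparing only |t-stat| with the threshold is equivalent to this rule when the t statistic is negative.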
III. Regression:
1) Linear regression:
- From the summary of sample 2:
The R Square value, which measures how much of the variation in temperature is explained by the year, is too small. In other words, linear regression is not able to fit the data with high accuracy.
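For reference, the R Square of a simple linear regression can be computed directly from the data. The sketch below is an illustration under stated assumptions, not the Excel summary itself: the function name `linear_fit_r2` is hypothetical, and the four points are only the first four rows of sample 2 (1991 to 1994), so the printed value does not reproduce the report's R Square.

```python
from statistics import mean


def linear_fit_r2(xs, ys):
    """Least-squares line y = intercept + slope * x and its R Square."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # R Square = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Illustration only: first four rows of sample 2.
years = [1991, 1992, 1993, 1994]
temps = [26.1502, 26.6464, 25.4547, 26.7231]
print(linear_fit_r2(years, temps))
```

An R Square close to 0 means the straight line explains almost none of the year-to-year variation, which is the situation described above.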
2) Quadratic regression:
a) The idea:
- A quadratic regression is the process of finding the equation of the parabola that best fits a set of data. As a result, we get an equation of the form:
f(x) = ax^2 + bx + c, where a ≠ 0.
- The best way to find this equation manually is by using the least squares method. The purpose is to minimize the sum of the squares of the residuals between the measured y and the y calculated with the quadratic model.
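- The quadratic model evaluated at each data point $(x_i, y_i)$:
$$f(x_i) = a x_i^2 + b x_i + c$$
- The sum of the squares of the residuals, as a function of the coefficients:
$$g(a,b,c) = \sum_{i=1}^{n} \left(y_i - a x_i^2 - b x_i - c\right)^2$$
- Take the derivative of the function g(a,b,c) with respect to each coefficient a, b, c and set it equal to zero:
$$\frac{\partial g}{\partial a} = -2 \sum_{i=1}^{n} x_i^2 \left(y_i - a x_i^2 - b x_i - c\right) = 0$$
$$\frac{\partial g}{\partial b} = -2 \sum_{i=1}^{n} x_i \left(y_i - a x_i^2 - b x_i - c\right) = 0$$
$$\frac{\partial g}{\partial c} = -2 \sum_{i=1}^{n} \left(y_i - a x_i^2 - b x_i - c\right) = 0$$
- Solving this set of equations, we will obtain the values of a, b and c.
b) Applications: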
Now, we apply this method to estimate the future values of sample 2:
Year (x)    Temperature (y)
1991        26.1502
1992        26.6464
1993        25.4547
1994        26.7231
1996        24.1817
1998        26.7605
1999        25.6341
2000        25.3206
2001        26.7044
2003        26.8283
2004        26.0383
2005        26.0422
2006        25.8245
2007        24.7316
2009        25.2222
2010        25.9229
2011        24.6103
2012        26.2427
2013        25.8306
∑ y_i x_i^2 = 2591684871
- The normal equations:
402436699193645a + 200909162375b + 100301525c = 2591684871
200909162375a + 100301525b + 52091c = 1293904
100301525a + 52091b + 26c = 673.322
- Hence, the value of (a, b, c) = (-3.3645*10^-7, 0.01357, 0.01724)
- And the predicted temperature function is: f(x) = (-3.3645*10^-7)x^2 + 0.01357x + 0.01724. Hence, the temperature will decrease.
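The least squares procedure used in this section can be sketched in Python as a way to check such a fit. This is a minimal illustration under stated assumptions, not the calculation performed in the report: the names `solve_3x3`, `quadratic_fit` and `predict` are hypothetical, the demo data are made-up points lying exactly on a parabola, and the years are centred on their mean before the sums are built. Centring is a change from the report, which works with the raw years; it is used here because sums such as the fourth power of a year are very large and make the normal equations numerically ill-conditioned.

```python
def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(a[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (a[i][n] - known) / a[i][i]
    return solution


def quadratic_fit(xs, ys):
    """Fit y = a*u^2 + b*u + c by least squares, where u = x - mean(x)."""
    x_mean = sum(xs) / len(xs)
    us = [x - x_mean for x in xs]

    def s(p):   # sum of u^p
        return sum(u ** p for u in us)

    def sy(p):  # sum of u^p * y
        return sum((u ** p) * y for u, y in zip(us, ys))

    # Normal equations obtained by setting the three partial derivatives to zero
    matrix = [
        [s(4), s(3), s(2)],
        [s(3), s(2), s(1)],
        [s(2), s(1), float(len(xs))],
    ]
    a, b, c = solve_3x3(matrix, [sy(2), sy(1), sy(0)])
    return a, b, c, x_mean


def predict(x, a, b, c, x_mean):
    """Evaluate the fitted parabola at year x."""
    u = x - x_mean
    return a * u ** 2 + b * u + c


# Made-up points on y = 2u^2 + 3u + 1 with u = year - 1993 (illustration only).
years = [1991, 1992, 1993, 1994, 1995]
values = [2 * (x - 1993) ** 2 + 3 * (x - 1993) + 1 for x in years]
coeffs = quadratic_fit(years, values)
print(coeffs, predict(1996, *coeffs))
```

With centred years the coefficients refer to u = x - mean(x), so they are not directly comparable with coefficients obtained from raw years, but the fitted parabola and its predictions are the same.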