
Project report: Statistical analysis of the temperature


VIETNAM NATIONAL UNIVERSITY OF HO CHI MINH CITY

Lê Huỳnh Tuấn Kiệt_MAMAIU18066   Nguyễn Trần Duy Tân_MAMAIU18031


I. Overall information of topic:

Our team chose to analyze global warming in Vietnam. The purpose of our project is therefore to investigate statistically whether there is a significant change in temperature over time in Vietnam, especially in April. From our sources, we collected two distinct data sets: the temperature over the years from 1931 to 1960 and from 1991 to 2016, respectively.

Data set 1: From 1931 to 1960

Data set 2: From 1991 to 2016


From those raw data sets, we aim to analyse only the temperature in April over the years. Therefore, we decided to take two samples from them:

Sample 1: Temperature in April from 1931 to 1960

Sample 2: Temperature in April from 1991 to 2016


II. Analysis of the temperature:

1) Historical analysis over the two samples:

Using the Excel Analysis ToolPak and our knowledge from the Statistics course, we calculated some summary numbers that describe the distributions of sample 1 and sample 2.


Overall, these numbers show that the average temperature changed only slightly between the two periods. The median temperature of the second period is slightly higher than that of the first period.
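As a rough cross-check of the spreadsheet output, the same kind of summary can be sketched in Python using only the standard library. The function name `describe` and the values in `demo` are hypothetical and chosen for illustration; they are not the report's data.

```python
import statistics


def describe(sample):
    """Return basic summary statistics for a list of numbers."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "variance": statistics.variance(sample),  # sample variance (divides by n - 1)
        "stdev": statistics.stdev(sample),        # sample standard deviation
        "min": min(sample),
        "max": max(sample),
    }


# Hypothetical April temperatures, for illustration only.
demo = [25.0, 26.0, 24.0, 27.0, 25.5]
summary = describe(demo)
print(summary)
```

Running `describe` once per sample and comparing the two dictionaries gives the mean and median comparison discussed above.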

[Chart: First Sample, temperature by year]

[Chart: Second Sample]


These graphs indicate that the temperature in April fluctuated markedly over the years, within the interval from 24 to 27 degrees.

Visually, the maximum and minimum temperatures of the first sample are greater than those of the second sample. Furthermore, from these charts, we can predict that the variances of the two samples are approximately equal.


From these charts, sample 1 is not normally distributed, and neither is sample 2. Moreover, the data of sample 1 are right-skewed, while the data of sample 2 are skewed in the opposite direction.
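The direction of skew read off a chart can also be checked numerically. The sketch below computes the adjusted sample skewness (positive for a right tail, negative for a left tail). The function name and the two demonstration lists are hypothetical and are not taken from the report.

```python
import statistics


def sample_skewness(sample):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(sample)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in sample)


# Hypothetical data: one long right tail, and its mirror image.
right_tailed = [1.0, 1.0, 1.0, 2.0, 10.0]
left_tailed = [-x for x in right_tailed]

print(sample_skewness(right_tailed))  # positive value: right-skewed
print(sample_skewness(left_tailed))   # negative value: left-skewed
```

A value near zero would suggest a roughly symmetric sample, although symmetry alone does not establish normality.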

2) Hypothesis Test:

a) Hypothesis test of equality of variances:

From the previous part, we found that the two samples cannot be assumed to be normally distributed. Therefore, at this point, we did not know whether the variances of the two populations are equal. Hence, it is necessary to check this with Levene's test.

The Levene’s test theory:

Definition: Levene's test is used to test whether $k$ samples have equal variances. Equal variance across samples is called homogeneity of variance. Some statistical tests, for example the analysis of variance, assume that variances are equal across groups or samples. The Levene test can be used to verify that assumption.

Given a variable $Y$ with a sample of size $N$ divided into $k$ subgroups, where $N_i$ is the sample size of the $i$-th subgroup, the Levene's test statistic is defined as:

$$
W = \frac{(N-k)}{(k-1)} \cdot \frac{\sum_{i=1}^{k} N_i \left(Z_{i\cdot} - Z_{\cdot\cdot}\right)^2}{\sum_{i=1}^{k} \sum_{j=1}^{N_i} \left(Z_{ij} - Z_{i\cdot}\right)^2}
$$

Where:

- $k$ is the number of different groups to which the sampled cases belong

- $N_i$ is the number of cases in the $i$-th group

- $N$ is the total number of cases in all groups

- $Y_{ij}$ is the value of the measured variable for the $j$-th case from the $i$-th group

- $Z_{ij} = |Y_{ij} - \bar{Y}_{i\cdot}|$, where $\bar{Y}_{i\cdot}$ is the mean of the $i$-th group

- $Z_{i\cdot} = \frac{1}{N_i} \sum_{j=1}^{N_i} Z_{ij}$ is the mean of the $Z_{ij}$ for the $i$-th group

- $Z_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^{k} \sum_{j=1}^{N_i} Z_{ij}$ is the mean of all $Z_{ij}$
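The definition of W can be turned into a short computation. The following is a minimal sketch in standard-library Python, assuming the mean-centred form of the test described here. The function name `levene_w` and the two demonstration groups are hypothetical and are not the report's samples.

```python
import statistics


def levene_w(*groups):
    """Levene's W using absolute deviations from each group's mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Z_ij = |Y_ij - mean of group i|
    z = [[abs(y - statistics.mean(g)) for y in g] for g in groups]
    z_group_means = [statistics.mean(zi) for zi in z]        # Z_i.
    z_grand_mean = sum(sum(zi) for zi in z) / n_total        # Z_..

    # Numerator sum: N_i * (Z_i. - Z_..)^2 over groups
    between = sum(
        len(zi) * (zbar - z_grand_mean) ** 2
        for zi, zbar in zip(z, z_group_means)
    )
    # Denominator sum: (Z_ij - Z_i.)^2 over all cases
    within = sum(
        (zij - zbar) ** 2
        for zi, zbar in zip(z, z_group_means)
        for zij in zi
    )
    return (n_total - k) / (k - 1) * between / within


# Hypothetical groups with visibly different spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(levene_w(group_a, group_b))
```

The resulting W is compared against an F distribution with (k − 1, N − k) degrees of freedom. The sketch does not compute that p-value, and it divides by zero if every group has identical absolute deviations.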

We can easily calculate the components of W:

It is clear from the statistical evidence that the p-value, P(F0.05,55,1 > W), is greater than α = 0.05. Therefore, we do not reject the assumption that the variances of the two populations are equal.

b) Hypothesis test of the change of the average temperature between the two populations:


Using the t-test, assuming equal variances:

H0: μ2 ≥ μ1

Ha: μ2 < μ1

With a significance level of 0.02:

It is clear that the critical value TS = 2.104 < |t-stat|; therefore, it is reasonable to reject H0. That is, at the 0.02 level of significance, the average temperature in April did not increase over the years.

With a significance level of 0.05:

It is clear that the critical value TS = 1.6736 < |t-stat|; therefore, it is reasonable to reject H0. That is, at the 0.05 level of significance, the average temperature in April still did not increase over the years.
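The pooled-variance t statistic behind this comparison can be sketched as follows. This is an illustrative standard-library version, not the spreadsheet routine used in the report. The function names and sample values are hypothetical, and the critical value must be supplied from a t table because the standard library has no t distribution.

```python
import math
import statistics


def pooled_t_statistic(sample1, sample2):
    """Two-sample t statistic assuming equal variances; returns (t, degrees of freedom)."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Difference taken as (mean2 - mean1) to match H0: mu2 >= mu1.
    t_stat = (mean2 - mean1) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, df


def reject_h0_lower_tail(t_stat, t_critical):
    """Reject H0: mu2 >= mu1 when t falls below the negative critical value."""
    return t_stat < -t_critical


# Hypothetical samples, for illustration only.
t_stat, df = pooled_t_statistic([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_stat, df)
print(reject_h0_lower_tail(t_stat, 1.86))
```

For a one-sided alternative such as Ha: μ2 < μ1, the sign of the statistic matters as well as its magnitude, which is why the helper compares against the negative critical value.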

III. Regression:

1) Linear regression:

- From the summary of sample 2:


The R Square value, which represents the strength of the relationship between the factors, is too small. In other words, with linear regression, we are not able to fit the data with high accuracy.
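For reference, R Square for a simple linear fit can be computed directly from its definition, 1 − SS_res / SS_tot. The sketch below uses only the standard library. The function name and the demonstration points are hypothetical and are not the sample 2 values.

```python
import statistics


def linear_fit_r_squared(xs, ys):
    """Least-squares line y = intercept + slope * x, plus its R squared."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical points lying exactly on a line: R squared is 1.
print(linear_fit_r_squared([1, 2, 3, 4], [3, 5, 7, 9]))
# Hypothetical points with no linear trend: R squared is 0.
print(linear_fit_r_squared([1, 2, 3, 4], [1, -1, -1, 1]))
```

An R Square close to 0 means the fitted line explains almost none of the variation in y, which is the situation described above.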

2) Quadratic regression:

a) The idea:

- A quadratic regression is the process of finding the equation of the parabola that best fits a set of data. As a result, we get an equation of the form:

$f(x) = ax^2 + bx + c$, where $a \neq 0$.

- The best way to find this equation manually is the least squares method. The purpose is to minimize the sum of the squares of the residuals between the measured y and the y calculated with the quadratic model.

a) $f(x_i) = a x_i^2 + b x_i + c$: the quadratic model

b) $g(a,b,c) = \sum_{i=1}^{n} \left(y_i - f(x_i)\right)^2$: the function of the sum of the squares of the residuals

- Take the derivative of the function g(a,b,c) with respect to each coefficient a,b,c:
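Written out under the standard least-squares setup, with $n$ data points $(x_i, y_i)$ and $g(a,b,c) = \sum_{i=1}^{n} (y_i - a x_i^2 - b x_i - c)^2$, setting each partial derivative to zero gives the following. The symbol $n$ and the summation index are notation assumed here for the sketch.

$$
\begin{aligned}
\frac{\partial g}{\partial a} &= -2 \sum_{i=1}^{n} x_i^2 \left(y_i - a x_i^2 - b x_i - c\right) = 0 \\
\frac{\partial g}{\partial b} &= -2 \sum_{i=1}^{n} x_i \left(y_i - a x_i^2 - b x_i - c\right) = 0 \\
\frac{\partial g}{\partial c} &= -2 \sum_{i=1}^{n} \left(y_i - a x_i^2 - b x_i - c\right) = 0
\end{aligned}
$$

Rearranging each line collects the unknowns on the left-hand side:

$$
\begin{aligned}
a \sum x_i^4 + b \sum x_i^3 + c \sum x_i^2 &= \sum x_i^2 y_i \\
a \sum x_i^3 + b \sum x_i^2 + c \sum x_i &= \sum x_i y_i \\
a \sum x_i^2 + b \sum x_i + c\, n &= \sum y_i
\end{aligned}
$$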


- Solving this set of equations, we obtain the values of a, b and c.

c) Applications:

Now, we apply this method to estimate future values for sample 2:

Year (x)    Temperature (y)
1991        26.1502
1992        26.6464
1993        25.4547
1994        26.7231
1996        24.1817
1998        26.7605
1999        25.6341
2000        25.3206
2001        26.7044
2003        26.8283
2004        26.0383
2005        26.0422
2006        25.8245
2007        24.7316
2009        25.2222
2010        25.9229
2011        24.6103
2012        26.2427
2013        25.8306

$\sum y_i x_i^2 = 2591684871$


- The normal equations:

$$
\begin{aligned}
402436699193645\,a + 200909162375\,b + 100301525\,c &= 2591684871 \\
200909162375\,a + 100301525\,b + 52091\,c &= 1293904 \\
100301525\,a + 52091\,b + 26\,c &= 673.322
\end{aligned}
$$

- Hence, the values of $(a, b, c) = (-3.3645 \times 10^{-7},\ 0.01357,\ 0.01724)$

- And the predicted temperature function is: $f(x) = (-3.3645 \times 10^{-7})x^2 + 0.01357x + 0.01724$. Hence, the temperature will decrease.
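The procedure of building the sums and solving the three normal equations can be sketched in standard-library Python as below. The function names `solve_3x3` and `quadratic_fit` and the demonstration points are hypothetical. The demonstration uses small x values on an exact parabola rather than the report's year data.

```python
def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    # Augmented matrix so that row operations also update the right-hand side.
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    # Back substitution.
    solution = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][j] * solution[j] for j in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def quadratic_fit(xs, ys):
    """Return (a, b, c) of the least-squares parabola a*x^2 + b*x + c."""
    n = len(xs)
    sx = sum(xs)
    sx2 = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x ** 2 * y for x, y in zip(xs, ys))
    matrix = [[sx4, sx3, sx2], [sx3, sx2, sx], [sx2, sx, n]]
    return tuple(solve_3x3(matrix, [sx2y, sxy, sy]))


# Hypothetical points on y = 2x^2 - 3x + 1, for illustration only.
a, b, c = quadratic_fit([0, 1, 2, 3, 4], [1, 0, 3, 10, 21])
print(a, b, c)
```

With x values of the order of 2000, the sums of x^3 and x^4 become very large and the system is poorly conditioned in floating point. Shifting the years first (for example, using x − 1991) is a common precaution, at the cost of having to interpret the coefficients in the shifted variable.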


IV. References:

1/ Numerical Methods For Engineers, Seventh Edition

2/ The Impact of Levene's Test of Equality of Variances on Statistical Theory and Practice

-The End-

Date posted: 23/07/2024, 17:02
