Similarly, we will get
$b - E(b) = -\dfrac{ad - bc}{N} = c - E(c); \qquad d - E(d) = \dfrac{ad - bc}{N}$
Substituting in (2), we get
$\chi^2 = \dfrac{(ad - bc)^2}{N^2}\left[\dfrac{1}{E(a)} + \dfrac{1}{E(b)} + \dfrac{1}{E(c)} + \dfrac{1}{E(d)}\right]$
$= \dfrac{(ad - bc)^2}{N}\left[\dfrac{1}{(a+b)(a+c)} + \dfrac{1}{(a+b)(b+d)} + \dfrac{1}{(a+c)(c+d)} + \dfrac{1}{(b+d)(c+d)}\right]$
$= \dfrac{(ad - bc)^2}{N}\left[\dfrac{b+d+a+c}{(a+b)(a+c)(b+d)} + \dfrac{b+d+a+c}{(a+c)(c+d)(b+d)}\right]$
$= (ad - bc)^2\left[\dfrac{c+d+a+b}{(a+b)(a+c)(b+d)(c+d)}\right]$
$= \dfrac{N(ad - bc)^2}{(a+b)(a+c)(b+d)(c+d)}$
Example 11 From the following table regarding the colour of eyes of father and son, test if the colour of the son's eye is associated with that of the father.

                               Eye colour of son
                               Light    Not light    Total
Eye colour      Light           471         51        522
of father       Not light       148        230        378
                Total           619        281        900

Sol Null Hypothesis H0: The colour of the son's eye is not associated with that of the father, i.e., they are independent.
Under H0, we calculate the expected frequency in each cell as
= (Product of column total and row total) / (whole total)
Expected frequencies are:

                               Eye colour of son
                               Light                          Not light                      Total
Eye colour      Light          (619 × 522)/900 = 359.02       (281 × 522)/900 = 162.98       522
of father       Not light      (619 × 378)/900 = 259.98       (281 × 378)/900 = 118.02       378
                Total          619                            281                            900

$\chi^2 = \dfrac{(471 - 359.02)^2}{359.02} + \dfrac{(51 - 162.98)^2}{162.98} + \dfrac{(148 - 259.98)^2}{259.98} + \dfrac{(230 - 118.02)^2}{118.02} = 266.35$
Conclusion: At 5% level for 1 d.f., the tabulated value of χ² is 3.841.
Since the calculated value of χ² exceeds the tabulated value, H0 is rejected, i.e., the colour of the son's eye is associated with that of the father.
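The calculation in Example 11 can be checked numerically. The following is a minimal sketch, assuming the cell counts 471, 51, 148 and 230 used in the χ² sum above; the function name `chi_square_2x2` is hypothetical. It evaluates χ² both cell by cell from the expected frequencies and from the closed form N(ad − bc)²/[(a+b)(a+c)(b+d)(c+d)], so the two routes can be compared.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]], computed two ways.

    Returns (cell-by-cell value, closed-form value).
    """
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    # Sum of (observed - expected)^2 / expected over the four cells,
    # with expected = row total * column total / grand total.
    cell_by_cell = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            cell_by_cell += (observed[i][j] - expected) ** 2 / expected

    # Closed form for a 2x2 table.
    closed_form = n * (a * d - b * c) ** 2 / (
        (a + b) * (a + c) * (b + d) * (c + d)
    )
    return cell_by_cell, closed_form


# Cell counts as they appear in the chi-square sum of Example 11.
by_cells, by_formula = chi_square_2x2(471, 51, 148, 230)
print(round(by_cells, 2), round(by_formula, 2))
```

Both values should agree up to floating-point error, since the closed form is an algebraic rearrangement of the cell-by-cell sum.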
Example 12 The following table gives the number of good and bad parts produced by each of the three shifts in a factory:
Test whether or not the production of bad parts is independent of the shift on which they were produced.
Sol Null Hypothesis H0: The production of bad parts is independent of the shift on which they were produced, i.e., the two attributes, production and shifts, are independent.
Under H0,
$\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{3} \dfrac{\left[(A_iB_j) - (A_iB_j)_0\right]^2}{(A_iB_j)_0}$, where $(A_iB_j)_0 = \dfrac{(A_i)(B_j)}{N}$
Calculation of expected frequencies
Let A and B be the two attributes, namely production and shifts. A is divided into two classes A1, A2 and B is divided into three classes B1, B2, B3.
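$(A_1B_1)_0 = \dfrac{(A_1)(B_1)}{N} = \dfrac{2850 \times 1000}{2985} = 954.77$;
$(A_1B_2)_0 = \dfrac{(A_1)(B_2)}{N} = \dfrac{2850 \times 990}{2985} = 945.226$
$(A_1B_3)_0 = \dfrac{(A_1)(B_3)}{N} = \dfrac{2850 \times 995}{2985} = 950$;
$(A_2B_1)_0 = \dfrac{(A_2)(B_1)}{N} = \dfrac{135 \times 1000}{2985} = 45.226$
$(A_2B_2)_0 = \dfrac{(A_2)(B_2)}{N} = \dfrac{135 \times 990}{2985} = 44.773$;
$(A_2B_3)_0 = \dfrac{(A_2)(B_3)}{N} = \dfrac{135 \times 995}{2985} = 45$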
To calculate the value of χ²:
$\chi^2 = \sum\sum \dfrac{\left[(A_iB_j) - (A_iB_j)_0\right]^2}{(A_iB_j)_0} = 1.28126$
Conclusion: The tabulated value of χ² at 5% level of significance for (r − 1)(s − 1) = (2 − 1)(3 − 1) = 2 degrees of freedom is 5.991. Since the calculated value of χ² is less than the tabulated value, we accept H0, i.e., the production of bad parts is independent of the shift on which they were produced.
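The expected frequencies of Example 12 depend only on the marginal totals, so they can be reproduced without the individual cell counts. The sketch below assumes the totals that appear in the calculation above, (A1) = 2850, (A2) = 135, (B1) = 1000, (B2) = 990, (B3) = 995 and N = 2985; the function name `expected_frequencies` is hypothetical.

```python
def expected_frequencies(row_totals, col_totals):
    """Expected cell frequencies (A_i)(B_j)/N under independence."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row totals and column totals must add up to the same N")
    return [[r * c / n for c in col_totals] for r in row_totals]


# Marginal totals taken from the calculation in Example 12.
rows = [2850, 135]        # (A1), (A2)
cols = [1000, 990, 995]   # (B1), (B2), (B3)

table = expected_frequencies(rows, cols)
for row in table:
    print([round(value, 3) for value in row])

# Degrees of freedom for an r x s contingency table: (r - 1)(s - 1).
degrees_of_freedom = (len(rows) - 1) * (len(cols) - 1)
print(degrees_of_freedom)
```

The same function works for any r × s table, because each expected value is the product of its row and column totals divided by N.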
12.7.2 Student’s t-distribution
The t-distribution is used when the sample size is less than or equal to 30 (n ≤ 30) and the population standard deviation is unknown.
Let X_i, i = 1, 2, ..., n be a random sample of size n from a normal population with mean µ and variance σ². Then Student's t is defined by
$t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim t\,(n - 1 \text{ d.f.})$
where $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean and
$S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$
is an unbiased estimate of the population variance σ².
The t-distribution has different values for each d.f., and as the d.f. become infinitely large, the t-distribution tends to the standard normal distribution.
Example 13 The 9 items of a sample have the following values: 45, 47, 50, 52, 48, 47, 49, 53, 51. Does the mean of these values differ significantly from the assumed mean 47.5?
Sol H0: µ = 47.5, i.e., there is no significant difference between the sample and population mean.
H1: µ ≠ 47.5 (two-tailed test). Given: n = 9, µ = 47.5

X             45      47      50     52     48      47      49      53      51
X − X̄       −4.1    −2.1    0.9    2.9   −1.1    −2.1    −0.1    3.9     1.9
(X − X̄)²    16.81   4.41    0.81   8.41   1.21    4.41    0.01   15.21    3.61

$\bar{X} = \dfrac{\Sigma x}{n} = \dfrac{442}{9} = 49.11; \qquad \Sigma\left(X - \bar{X}\right)^2 = 54.89$
$s^2 = \dfrac{\Sigma\left(X - \bar{X}\right)^2}{n - 1} = \dfrac{54.89}{8} = 6.86 \quad \therefore s = 2.619$
Applying the t-test, $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} = \dfrac{49.11 - 47.5}{2.619/\sqrt{9}} = 1.844$
t0.05 = 2.31 for ν = 8
Conclusion: Since t < t0.05, the hypothesis is accepted, i.e., there is no significant difference between the sample mean and the assumed population mean.
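The one-sample statistic of Example 13 can be computed directly from the nine observations. This is a minimal sketch using only the Python standard library; the function name `one_sample_t` is hypothetical, and `statistics.stdev` is used because it divides by n − 1, matching the definition of s² above.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Student's t for H0: population mean = mu0, with its degrees of freedom."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)           # sample S.D. with divisor n - 1
    t = (mean - mu0) / (s / math.sqrt(n))  # standard error is s / sqrt(n)
    return t, n - 1


sample = [45, 47, 50, 52, 48, 47, 49, 53, 51]
t_value, dof = one_sample_t(sample, 47.5)
print(round(t_value, 3), dof)
```

The result is compared with the tabulated two-tailed value for the returned degrees of freedom, which has to be read from a t-table since the standard library provides no t-distribution.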
Example 14 A random sample of 10 boys had the following I.Q.'s: 70, 120, 110, 101, 88, 83, 95, 98, 107, 100. Do these data support the assumption of a population mean I.Q. of 100? Find a reasonable range in which most of the mean I.Q. values of samples of 10 boys lie.
Sol Null hypothesis, H0: The data are consistent with the assumption of a mean I.Q. of 100 in the population, i.e., µ = 100
Alternative hypothesis, H1: µ ≠ 100
Test Statistic: Under H0, the test statistic is:
$t = \dfrac{\bar{x} - \mu}{\sqrt{S^2/n}} \sim t(n - 1)$
where $\bar{x}$ and S² are to be computed from the sample values of I.Q.'s.
Calculation for Sample Mean and S.D.
Here n = 10, $\bar{x} = \dfrac{972}{10} = 97.2$ and $S^2 = \dfrac{1833.60}{9} = 203.73$
$|t| = \dfrac{|97.2 - 100|}{\sqrt{203.73/10}} = \dfrac{2.8}{\sqrt{20.37}} = \dfrac{2.8}{4.514} = 0.62$
Tabulated t0.05 for (10 − 1), i.e., 9 d.f., for a two-tailed test is 2.262.
Conclusion: Since calculated t is less than tabulated t0.05 for 9 d.f., H0 may be accepted at 5% level of significance and we may conclude that the data are consistent with the assumption of a mean I.Q. of 100 in the population.
The 95% confidence limits within which the mean I.Q. values of samples of 10 boys will lie are given by
$\bar{x} \pm t_{0.05}\, S/\sqrt{n} = 97.2 \pm 2.262 \times 4.514 = 97.2 \pm 10.21$, i.e., 107.41 and 86.99. Hence the required 95% confidence interval is [86.99, 107.41].
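The confidence limits of Example 14 follow from the same quantities as the test statistic. The sketch below is one way to compute them, assuming the tabulated value 2.262 (two-tailed, 9 d.f.) is supplied by hand; the function name `t_confidence_interval` is hypothetical.

```python
import math
import statistics


def t_confidence_interval(values, t_critical):
    """Limits mean ± t_critical * S / sqrt(n), with S using divisor n - 1."""
    n = len(values)
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    margin = t_critical * standard_error
    return mean - margin, mean + margin


iq_scores = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
lower, upper = t_confidence_interval(iq_scores, 2.262)  # 2.262 read from a t-table
print(round(lower, 2), round(upper, 2))  # 86.99 107.41
```

Passing a different tabulated value, for example the 1% point, gives the corresponding wider interval without any other change.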
Example 15 The mean weekly sales of soap bars in departmental stores was 146.3 bars per store. After an advertising campaign the mean weekly sales in 22 stores for a typical week increased to 153.7 and showed a standard deviation of 17.2. Was the advertising campaign successful?
Sol We are given: n = 22, $\bar{x}$ = 153.7, s = 17.2.
Null Hypothesis: The advertising campaign is not successful, i.e.,
H0: µ = 146.3
Alternative Hypothesis: H1: µ > 146.3 (Right-tail)
Test Statistic: Under the null hypothesis, the test statistic is:
$t = \dfrac{\bar{x} - \mu}{\sqrt{s^2/(n - 1)}} \sim t_{22-1} = t_{21}$
Now $t = \dfrac{153.7 - 146.3}{\sqrt{(17.2)^2/21}} = \dfrac{7.4 \times \sqrt{21}}{17.2} = 1.97$
Conclusion: Tabulated value of t for 21 d.f. at 5% level of significance for a single-tailed test is 1.72. Since the calculated value is greater than the tabulated value, it is significant. Hence we reject the null hypothesis and conclude that the advertising campaign was successful.
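When only the sample mean and standard deviation are given, as in Example 15, the statistic can be evaluated from the summary figures. This sketch assumes, as the formula above does, that the quoted standard deviation was computed with divisor n, so the standard error is s/√(n − 1); the function name `t_from_summary` is hypothetical.

```python
import math


def t_from_summary(sample_mean, mu0, sd_divisor_n, n):
    """t for H0: mean = mu0 when the sample S.D. was computed with divisor n."""
    standard_error = sd_divisor_n / math.sqrt(n - 1)
    return (sample_mean - mu0) / standard_error


# Summary figures of Example 15: 22 stores, mean 153.7, S.D. 17.2, mu0 = 146.3.
t_sales = t_from_summary(153.7, 146.3, 17.2, 22)
print(round(t_sales, 2))
```

The square root in the standard error matters: multiplying 7.4 by 21 instead of √21 would inflate the statistic by a factor of about 4.6.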
Example 16 A machinist is making engine parts with axle diameters of 0.700 inch. A random sample of 10 parts shows a mean diameter of 0.742 inch with a standard deviation of 0.040 inch. Compute the statistic you would use to test whether the work is meeting the specifications. Also state how you would proceed further.
Sol Here we are given:
µ = 0.700 inch, $\bar{x}$ = 0.742 inch, s = 0.040 inch and n = 10.
Null Hypothesis, H0: µ = 0.700, i.e., the product is conforming to specifications.
Alternative Hypothesis, H1: µ ≠ 0.700
Test Statistic: Under H0, the test statistic is:
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n - 1}} = \dfrac{\bar{x} - \mu}{\sqrt{s^2/(n - 1)}} \sim t(n - 1)$
Now, $t = \dfrac{\sqrt{9}\,(0.742 - 0.700)}{0.040} = 3.15$
Here the test statistic 't' follows Student's t-distribution with 10 − 1 = 9 d.f. We will now compare this calculated value with the tabulated value of t for 9 d.f. at a certain level of significance, say 5%. Let this tabulated value be denoted by t0.
(i) If calculated 't', viz., 3.15 > t0, we say that the value of t is significant. This implies that $\bar{x}$ differs significantly from µ, H0 is rejected at this level of significance, and we conclude that the product is not meeting the specifications.
(ii) If calculated t < t0, we say that the value of t is not significant, i.e., there is no significant difference between $\bar{x}$ and µ. In other words, the deviation ($\bar{x}$ − µ) is just due to fluctuations of sampling, and the null hypothesis H0 may be retained at 5% level of significance, i.e., we may take the product as conforming to specifications.
Example 17 A random sample of size 16 has 53 as mean. The sum of squares of the deviations from the mean is 135. Can this sample be regarded as taken from the population having 56 as mean? Obtain 95% and 99% confidence limits of the mean of the population.
Sol H0: There is no significant difference between the sample mean and the hypothetical population mean.
H0: µ = 56; H1: µ ≠ 56 (Two-tailed test)
$t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} \sim t\,(n - 1 \text{ d.f.})$
Given: $\bar{X}$ = 53, µ = 56, n = 16, $\Sigma\left(X - \bar{X}\right)^2 = 135$
$s = \sqrt{\dfrac{\Sigma\left(X - \bar{X}\right)^2}{n - 1}} = \sqrt{\dfrac{135}{15}} = 3; \qquad t = \dfrac{53 - 56}{3/\sqrt{16}} = -4$
|t| = 4, d.f. = 16 − 1 = 15.
Conclusion: For a two-tailed test with 15 d.f., t0.05 = 2.131.
Since |t| = 4 > t0.05 = 2.131, i.e., the calculated value of t is more than the table value, the hypothesis is rejected. Hence, the sample has not come from a population having 56 as mean.
95% confidence limits of the population mean
$= \bar{X} \pm \dfrac{s}{\sqrt{n}}\, t_{0.05} = 53 \pm \dfrac{3}{\sqrt{16}}\,(2.131) = 51.402;\ 54.598$
99% confidence limits of the population mean
$= \bar{X} \pm \dfrac{s}{\sqrt{n}}\, t_{0.01} = 53 \pm \dfrac{3}{\sqrt{16}}\,(2.947) = 50.790;\ 55.210$
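Example 17 starts from a sum of squared deviations rather than raw data, so the steps can be sketched as below. The function name `t_and_limits` is hypothetical, and the critical values 2.131 and 2.947 are assumptions entered by hand: they are the two-tailed 5% and 1% points of t for 15 d.f. as read from a standard t-table.

```python
import math


def t_and_limits(sample_mean, mu0, sum_sq_dev, n, t_critical):
    """t statistic and confidence limits from a sum of squared deviations."""
    s = math.sqrt(sum_sq_dev / (n - 1))   # unbiased-variance S.D.
    standard_error = s / math.sqrt(n)
    t = (sample_mean - mu0) / standard_error
    margin = t_critical * standard_error
    return t, (sample_mean - margin, sample_mean + margin)


# Figures of Example 17: n = 16, mean 53, sum of squares 135, mu0 = 56.
t_value, limits_95 = t_and_limits(53, 56, 135, 16, 2.131)  # two-tailed 5%, 15 d.f.
_, limits_99 = t_and_limits(53, 56, 135, 16, 2.947)        # two-tailed 1%, 15 d.f.
print(t_value)
print([round(x, 3) for x in limits_95])
print([round(x, 3) for x in limits_99])
```

Since H1 is two-sided, the two-tailed tabulated values are the ones that match both the test and the confidence limits; one-tailed values would give limits that are too narrow.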
(i) t-Test of Significance for Mean of a Random Sample: To test whether the mean of a sample drawn from a normal population deviates significantly from a stated value when the variance of the population is unknown.
H0: There is no significant difference between the sample mean $\bar{x}$ and the population mean µ, i.e., we use the statistic
$t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$, where $\bar{X}$ is the mean of the sample and
$s^2 = \dfrac{1}{n - 1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$, with degrees of freedom (n − 1).
At the given level of significance α and degrees of freedom (n − 1), we refer to the t-table for tα (two-tailed or one-tailed). If the calculated t value is such that |t| < tα, the null hypothesis is accepted, and for |t| > tα, H0 is rejected.
(ii) t-Test for Difference of Means of Two Samples: This test is used to test whether the two samples x1, x2, ..., x_{n1} and y1, y2, ..., y_{n2} of sizes n1, n2 have been drawn from two normal populations with means µ1 and µ2 respectively, under the assumption that the population variances are equal (σ1² = σ2² = σ²).
H0: The samples have been drawn from normal populations with the same mean, i.e.,
H0: µ1 = µ2
Let $\bar{X}$, $\bar{Y}$ be the means of the two samples.
Under this H0 the test statistic t is given by
$t = \dfrac{\bar{X} - \bar{Y}}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t\,(n_1 + n_2 - 2 \text{ d.f.})$
Also, if the two samples' standard deviations s1, s2 are given, then we have
$s^2 = \dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$
And, if n1 = n2 = n, $t = \dfrac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{s_1^2 + s_2^2}{n - 1}}}$ can be used as a test statistic.
If the pairs of values are in some way associated (correlated), we can't use the test statistic as given in Note 2. In this case, we find the differences of the associated pairs of values and apply the test for a single mean, i.e., $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$ with degrees of freedom n − 1.
The test statistic is $t = \dfrac{\bar{d}}{s/\sqrt{n}}$ or $t = \dfrac{\bar{d}}{s/\sqrt{n - 1}}$ (according as the standard deviation s of the differences is computed with divisor n − 1 or n), where $\bar{d}$ is the mean of the paired differences, i.e., $d_i = x_i - y_i$, $\bar{d} = \bar{X} - \bar{Y}$, where (x_i, y_i) are the paired data, i = 1, 2, ..., n.
Example 18 Samples of sizes 10 and 14 were taken from two normal populations with S.D. 3.5 and 5.2. The sample means were found to be 20.3 and 18.6. Test whether the means of the two populations are the same at 5% level.
Sol H0: µ1 = µ2, i.e., the means of the two populations are the same.
H1: µ1 ≠ µ2. Given $\bar{X}_1$ = 20.3, $\bar{X}_2$ = 18.6; n1 = 10, n2 = 14, s1 = 3.5, s2 = 5.2
$s^2 = \dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \dfrac{10(3.5)^2 + 14(5.2)^2}{10 + 14 - 2} = 22.775 \quad \therefore s = 4.772$
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{20.3 - 18.6}{4.772\sqrt{\dfrac{1}{10} + \dfrac{1}{14}}} = 0.8604$
The value of t at 5% level for 22 d.f. is t0.05 = 2.0739
Conclusion: Since t = 0.8604 < t0.05, the hypothesis is accepted, i.e., there is no significant difference between their means.
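The pooled two-sample statistic of Example 18 uses only the sizes, means and standard deviations of the two samples. A minimal sketch follows, assuming the pooled-variance formula s² = (n1 s1² + n2 s2²)/(n1 + n2 − 2) given above; the function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Two-sample t with pooled variance (n1*s1^2 + n2*s2^2) / (n1 + n2 - 2)."""
    dof = n1 + n2 - 2
    pooled_variance = (n1 * s1 ** 2 + n2 * s2 ** 2) / dof
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, pooled_variance, dof


# Figures of Example 18.
t_value, s_squared, dof = pooled_t(20.3, 18.6, 3.5, 5.2, 10, 14)
print(round(t_value, 2), round(s_squared, 3), dof)
```

The statistic comes out at about 0.86 on 22 d.f., well below the tabulated 2.0739, in line with the conclusion above.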
Example 19 Two samples of sodium vapour bulbs were tested for length of life and the following results were got:
Sample
Is the difference in the means significant to generalise that Type I is superior to Type II regarding length of life ?
Sol H0: µ1 = µ2, i.e., two types of bulbs have same lifetime.
H1: µ1 > µ2 i.e., type I is superior to type II.
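$s^2 = \dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \dfrac{8(36)^2 + 7(40)^2}{8 + 7 - 2} = 1659.076 \quad \therefore s = 40.7317$
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{198}{40.7317\sqrt{\dfrac{1}{8} + \dfrac{1}{7}}} = 9.39 \sim t\,(n_1 + n_2 - 2 \text{ d.f.})$
t0.05 at 13 d.f. is 1.77 (one-tailed test).
Conclusion: Since calculated t > t0.05, H0 is rejected, i.e., H1 is accepted.
∴ Type I is definitely superior to Type II.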
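In the two-sample statistic $t = \dfrac{\bar{X} - \bar{Y}}{s\sqrt{1/n_1 + 1/n_2}}$,
$\bar{X} = \dfrac{1}{n_1}\sum_{i=1}^{n_1} X_i, \qquad \bar{Y} = \dfrac{1}{n_2}\sum_{j=1}^{n_2} Y_j$, and
$s^2 = \dfrac{1}{n_1 + n_2 - 2}\left[\sum_{i}\left(X_i - \bar{X}\right)^2 + \sum_{j}\left(Y_j - \bar{Y}\right)^2\right]$
is an unbiased estimate of the population variance σ².
t follows the t-distribution with n1 + n2 − 2 degrees of freedom.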
Example 20 The following figures refer to observations in two independent samples:
Sample I     25   30   28   34   24   20   13   32   22   38
Sample II    40   34   22   20   31   40   30   23   36   17
Analyse whether the samples have been drawn from the populations of equal means.
Sol H0: The two samples have been drawn from populations of equal means, i.e., there is no significant difference between their means.
H1: µ1 ≠ µ2 (Two-tailed test). Given n1 = Sample I size = 10; n2 = Sample II size = 10.
We calculate the two sample means and the sums of squares of deviations from the means. Let X1 denote Sample I and X2 denote Sample II.
X1 − X̄1       −1.6    3.4    1.4    7.4   −2.6   −6.6   −13.6    5.4   −4.6    11.4
(X1 − X̄1)²    2.56   11.56   1.96  54.76   6.76  43.56  184.96  29.16  21.16  129.96
X2 − X̄2       10.7    4.7   −7.3   −9.3    1.7   10.7    0.7    −6.3    6.7   −12.3
(X2 − X̄2)²  114.49   22.09  53.29  86.49   2.89  114.49   0.49  39.69  44.89  151.29

$\bar{X}_1 = \dfrac{\Sigma X_1}{n_1} = \dfrac{266}{10} = 26.6; \qquad \bar{X}_2 = \dfrac{\Sigma X_2}{n_2} = \dfrac{293}{10} = 29.3$
$\Sigma\left(X_1 - \bar{X}_1\right)^2 = 486.4; \qquad \Sigma\left(X_2 - \bar{X}_2\right)^2 = 630.10$
$S^2 = \dfrac{1}{n_1 + n_2 - 2}\left[\Sigma\left(X_1 - \bar{X}_1\right)^2 + \Sigma\left(X_2 - \bar{X}_2\right)^2\right] = \dfrac{1}{10 + 10 - 2}\,[486.4 + 630.10] = 62.028 \quad \therefore S = 7.876$
Under H0 the test statistic is given by
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{26.6 - 29.3}{7.876\sqrt{\dfrac{1}{10} + \dfrac{1}{10}}} = -0.7666 \sim t\,(n_1 + n_2 - 2 \text{ d.f.})$
|t| = 0.7666
Conclusion: The tabulated value of t at 5% level of significance for 18 d.f. is 2.1. Since the calculated value |t| = 0.7666 < t0.05, H0 is accepted,
i.e., there is no significant difference between their means,
i.e., the two samples have been drawn from populations of equal means.
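The whole of Example 20 can be reproduced from raw observations. In the sketch below the two data lists are an assumption: they are recovered by adding each tabulated deviation to its sample mean (26.6 and 29.3), not copied from a printed data table. The function name `two_sample_t` is hypothetical.

```python
import math
import statistics


def two_sample_t(x, y):
    """Pooled two-sample t computed from raw observations."""
    n1, n2 = len(x), len(y)
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    ss_x = sum((v - mean_x) ** 2 for v in x)   # sum of squared deviations, sample 1
    ss_y = sum((v - mean_y) ** 2 for v in y)   # sum of squared deviations, sample 2
    dof = n1 + n2 - 2
    pooled_variance = (ss_x + ss_y) / dof
    t = (mean_x - mean_y) / math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return t, dof


# Assumed data: sample mean + tabulated deviation for each observation.
sample_1 = [25, 30, 28, 34, 24, 20, 13, 32, 22, 38]
sample_2 = [40, 34, 22, 20, 31, 40, 30, 23, 36, 17]

t_value, dof = two_sample_t(sample_1, sample_2)
print(round(t_value, 4), dof)  # -0.7666 18
```

Because the test is two-tailed, it is the absolute value of t that is compared with the tabulated 2.1 for 18 d.f.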
Applications of t-Distribution: The t-distribution has a wide number of applications in statistics, some of which are:
1. To test if the sample mean ($\bar{X}$) differs significantly from the hypothetical value µ of the population mean.
2. To test the significance of the difference between two sample means.
3. To test the significance of observed partial and multiple correlation coefficients.
4. To test the significance of an observed sample correlation coefficient and sample regression coefficient.
Also, the critical value or significant value of t at level of significance α and degrees of freedom ν for a two-tailed test is given by
$P\left[\,|t| > t_\nu(\alpha)\,\right] = \alpha$
$\Rightarrow P\left[\,|t| \le t_\nu(\alpha)\,\right] = 1 - \alpha$
The significant value of t at level of significance 'α' for a single-tailed test can be obtained from those of the two-tailed test by considering the values at level of significance '2α'.
12.7.3 Snedecor’s Variance Ratio Test or F-test
Suppose we want to test (i) whether two independent samples x_i and y_j, for i = 1, 2, ..., n1 and j = 1, 2, ..., n2, have been drawn from normal populations with the same variance σ² (say), or (ii) whether two independent estimates of the population variance are homogeneous or not.