A Textbook of Computer Based Numerical and Statistical Techniques, part 54


where $\bar{x}_1$ is the mean of a sample of size $n_1$ from a population with mean $\mu_1$ and variance $\sigma_1^2$, and $\bar{x}_2$ is the mean of an independent sample of size $n_2$ from a population with mean $\mu_2$ and variance $\sigma_2^2$.

Remarks:

1. Under the null hypothesis H0: $\mu_1 = \mu_2$, i.e., there is no significant difference between the sample means, and if $\sigma_1^2 = \sigma_2^2 = \sigma^2$, i.e., if the samples have been drawn from populations with a common standard deviation $\sigma$, then

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

2. If $\sigma_1^2 \neq \sigma_2^2$ and $\sigma_1$ and $\sigma_2$ are not known, then the test statistic is estimated from the sample values, i.e.,

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

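3. If $\sigma$ is not known, then its estimate based on the sample variances is used. If $\sigma_1 = \sigma_2$, we use

$$\sigma^2 = \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2}$$

to evaluate $\sigma$. The test statistic is then

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$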

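(E) Test of Significance for the Difference of Standard Deviations: If $S_1$ and $S_2$ are the standard deviations of two independent samples, then under the null hypothesis H0: $\sigma_1 = \sigma_2$ (the sample S.D.s do not differ significantly), the test statistic for large samples is given by

$$Z = \frac{S_1 - S_2}{\text{S.E.}(S_1 - S_2)}$$

where the standard error of the difference of the sample standard deviations is given by

$$\text{S.E.}(S_1 - S_2) = \sqrt{\frac{\sigma_1^2}{2n_1} + \frac{\sigma_2^2}{2n_2}}$$

so that

$$Z = \frac{S_1 - S_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}}$$

When $\sigma_1^2$ and $\sigma_2^2$ are not known (i.e., the population S.D.s are not known), the test statistic reduces to

$$Z = \frac{S_1 - S_2}{\sqrt{\dfrac{S_1^2}{2n_1} + \dfrac{S_2^2}{2n_2}}}$$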


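Example 25. Intelligence tests were given to two groups of boys and girls. Examine if the difference between mean scores is significant.

Sol. The values used in the computation are $\bar{x}_1 = 75$, $S_1 = 8$, $n_1 = 60$ and $\bar{x}_2 = 73$, $S_2 = 10$, $n_2 = 100$.

Null hypothesis H0: There is no significant difference between mean scores, i.e., $\bar{x}_1 = \bar{x}_2$.

Alternative hypothesis H1: $\bar{x}_1 \neq \bar{x}_2$.

Under the null hypothesis,

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{75 - 73}{\sqrt{\dfrac{8^2}{60} + \dfrac{10^2}{100}}} = 1.3912$$

Conclusion: As the calculated value of |Z| is less than 1.96, the critical value of Z at the 5% level of significance, H0 is accepted.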
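The arithmetic of this example can be checked with a short Python sketch. The function name `z_two_means` is an illustrative choice, not something defined in the text; it assumes large samples and uses the sample standard deviations in place of the unknown population values.

```python
import math

def z_two_means(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample Z statistic for the difference of two sample means,
    with sample S.D.s standing in for the population S.D.s."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Values taken from the computation in Example 25.
z = z_two_means(75, 8, 60, 73, 10, 100)
print(round(z, 4))      # compare with the value 1.3912 in the text
print(abs(z) < 1.96)    # True means H0 is not rejected at the 5% level
```

The same function applies unchanged to any of the large-sample mean comparisons in this section where only the sample standard deviations are available.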

Example 26. The means of two single large samples of 1000 and 2000 members are 67.5 inches and 68.0 inches respectively. Can the samples be regarded as drawn from the same population of standard deviation 2.5 inches? (Test at the 5% level of significance.)

Solution: Given: $n_1 = 1000$, $n_2 = 2000$, $\bar{x}_1 = 67.5$ inches, $\bar{x}_2 = 68.0$ inches.

Null hypothesis: H0: $\mu_1 = \mu_2$ and $\sigma = 2.5$ inches, i.e., the samples have been drawn from the same population of standard deviation 2.5 inches.

Alternative hypothesis: H1: $\mu_1 \neq \mu_2$ (two-tailed).

Test statistic: Under H0, the test statistic (for large samples) is

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \frac{67.5 - 68.0}{2.5\sqrt{\dfrac{1}{1000} + \dfrac{1}{2000}}} = \frac{-0.5}{2.5 \times 0.0387} \approx -5.16$$

Conclusion: Since |Z| > 3, the value is highly significant. We reject the null hypothesis and conclude that the samples are certainly not from the same population with standard deviation 2.5.
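When a common population standard deviation is given, as in this example, the computation can be sketched as follows. The helper name `z_common_sigma` is hypothetical and the sketch assumes large, independent samples.

```python
import math

def z_common_sigma(mean1, n1, mean2, n2, sigma):
    """Z statistic for two sample means drawn from populations
    sharing a known common standard deviation sigma."""
    standard_error = sigma * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / standard_error

# Values from Example 26.
z = z_common_sigma(67.5, 1000, 68.0, 2000, 2.5)
print(round(z, 2))
print(abs(z) > 3)   # True means the difference is highly significant
```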

Example 27. The average income of persons was Rs 210 with a S.D. of Rs 10 in a sample of 100 people of a city. For another sample of 150 persons, the average income was Rs 220 with a standard deviation of Rs 12. The S.D. of incomes of the people of the city was Rs 11. Test whether there is any significant difference between the average incomes of the localities.

Sol. Given that $n_1 = 100$, $n_2 = 150$, $\bar{x}_1 = 210$, $\bar{x}_2 = 220$, $S_1 = 10$, $S_2 = 12$.

Null hypothesis: The difference is not significant, i.e., there is no difference between the incomes of the localities.

H0: $\bar{x}_1 = \bar{x}_2$, H1: $\bar{x}_1 \neq \bar{x}_2$

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{210 - 220}{\sqrt{\dfrac{10^2}{100} + \dfrac{12^2}{150}}} = -7.1428 \quad \therefore |Z| = 7.1428$$

Conclusion: As the calculated value of |Z| is greater than 1.96, the critical value of Z at the 5% level of significance, H0 is rejected, i.e., there is a significant difference between the average incomes of the localities.

Example 28. In a survey of buying habits, 400 women shoppers are chosen at random in super market 'A' located in a certain section of the city. Their average weekly food expenditure is Rs 250 with a standard deviation of Rs 40. For 400 women shoppers chosen at random in super market 'B' in another section of the city, the average weekly food expenditure is Rs 220 with a standard deviation of Rs 55. Test at the 1% level of significance whether the average weekly food expenditures of the two populations of shoppers are equal.

Sol. We have: $n_1 = 400$, $n_2 = 400$, $\bar{x}_1$ = Rs 250, $\bar{x}_2$ = Rs 220, $S_1$ = Rs 40, $S_2$ = Rs 55.

Null hypothesis, H0: $\mu_1 = \mu_2$, i.e., the average weekly food expenditures of the two populations of shoppers are equal.

Alternative hypothesis, H1: $\mu_1 \neq \mu_2$ (two-tailed).

Test statistic: Since the samples are large, under H0

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Since $\sigma_1$ and $\sigma_2$ are not known, we use

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{250 - 220}{\sqrt{\dfrac{(40)^2}{400} + \dfrac{(55)^2}{400}}} = 8.82 \text{ (approx.)}$$

Conclusion: Since |Z| is much greater than 2.58, the null hypothesis ($\mu_1 = \mu_2$) is rejected at the 1% level of significance, and we conclude that the average weekly expenditures of the two populations of shoppers in markets A and B differ significantly.

Example 29. In a certain factory there are two independent processes manufacturing the same items. The average weight in a sample of 250 items produced from one process is found to be 120 ozs with a standard deviation of 12 ozs, while the corresponding figures in a sample of 400 items from the other process are 124 and 14. Obtain the standard error of the difference between the two sample means. Is this difference significant? Also find the 99% confidence limits for the difference in the average weights of items produced by the two processes.

Sol. Given: $n_1 = 250$, $\bar{x}_1 = 120$ ozs, $S_1 = 12$ ozs $= \sigma_1$; $n_2 = 400$, $\bar{x}_2 = 124$ ozs, $S_2 = 14$ ozs $= \sigma_2$.

$$\text{S.E.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} = \sqrt{\frac{144}{250} + \frac{196}{400}} = \sqrt{0.576 + 0.490} = 1.034$$

Null hypothesis, H0: $\mu_1 = \mu_2$ (i.e., the sample means do not differ significantly).

Alternative hypothesis, H1: $\mu_1 \neq \mu_2$ (two-tailed).

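Test statistic: Under H0, the test statistic is given by

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\text{S.E.}(\bar{x}_1 - \bar{x}_2)} = \frac{120 - 124}{1.034} = \frac{-4}{1.034} = -3.87, \quad |Z| = 3.87$$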

Conclusion: Since |Z| > 3, the null hypothesis is rejected and we can say that there is a significant difference between the sample means.

The 99% confidence limits for $|\mu_1 - \mu_2|$ are

$$|\bar{x}_1 - \bar{x}_2| \pm 2.58 \; \text{S.E.}(\bar{x}_1 - \bar{x}_2) = 4 \pm 2.58 \times 1.034 = 4 \pm 2.67 \text{ (approx.)}$$

i.e., 6.67 (on taking the +ve sign) and 1.33 (on taking the -ve sign).

$$\therefore \; 1.33 < |\mu_1 - \mu_2| < 6.67$$
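The standard error and the confidence limits of this example can be reproduced with the sketch below. The function name `diff_confidence_limits` and the default multiplier 2.58 (the 99% two-sided normal value used in the text) are illustrative assumptions. Working without intermediate rounding gives a standard error of about 1.03, so the limits agree with the text to roughly two decimal places.

```python
import math

def diff_confidence_limits(mean1, sd1, n1, mean2, sd2, n2, z_crit=2.58):
    """Return (standard error, lower limit, upper limit) for the absolute
    difference of two population means, large-sample normal approximation."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = abs(mean1 - mean2)
    return se, diff - z_crit * se, diff + z_crit * se

# Values from Example 29.
se, lower, upper = diff_confidence_limits(120, 12, 250, 124, 14, 400)
print(round(se, 3), round(lower, 2), round(upper, 2))
```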

Example 30. Two populations have their means equal, but the S.D. of one is twice the other. Show that in samples of size 2000 from each, drawn under simple sampling conditions, the difference of means will in all probability not exceed 0.15σ, where σ is the smaller S.D. What is the probability that the difference will exceed half this amount?

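Sol. Let the standard deviations of the two populations be $\sigma$ and $2\sigma$ respectively, and let $\mu$ be the mean of each of the two populations.

Given $n_1 = n_2 = 2000$.

If $\bar{x}_1$ and $\bar{x}_2$ are the two sample means, then (since the samples are large)

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - E(\bar{x}_1 - \bar{x}_2)}{\text{S.E.}(\bar{x}_1 - \bar{x}_2)} \sim N(0, 1)$$

Now $E(\bar{x}_1 - \bar{x}_2) = E(\bar{x}_1) - E(\bar{x}_2) = \mu - \mu = 0$.

Also

$$\text{S.E.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma^2}{n_1} + \frac{(2\sigma)^2}{n_2}} = \sigma\sqrt{\frac{1}{2000} + \frac{4}{2000}} = 0.05\sigma$$

$$\therefore \; Z = \frac{\bar{x}_1 - \bar{x}_2}{\text{S.E.}(\bar{x}_1 - \bar{x}_2)} \sim N(0, 1)$$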

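Under simple sampling conditions, we should in all probability have

$$|Z| < 3 \;\Rightarrow\; |\bar{x}_1 - \bar{x}_2| < 3 \; \text{S.E.}(\bar{x}_1 - \bar{x}_2) \;\Rightarrow\; |\bar{x}_1 - \bar{x}_2| < 0.15\sigma$$

which is the required result.

We want

$$p = P\left[|\bar{x}_1 - \bar{x}_2| > \tfrac{1}{2} \times 0.15\sigma\right] = P\left[\frac{|\bar{x}_1 - \bar{x}_2|}{0.05\sigma} > \frac{0.075\sigma}{0.05\sigma}\right] = P[|Z| > 1.5]$$

$$= 1 - P[|Z| \le 1.5] = 1 - 2P(0 \le Z \le 1.5) = 1 - 2 \times 0.4332 = 0.1336 \quad \text{Ans.}$$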
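The tail probability read from the normal table here can also be obtained numerically. The following sketch uses the identity $P(|Z| > z) = \operatorname{erfc}(z/\sqrt{2})$ for a standard normal variate; the function name and the choice $\sigma = 1$ are illustrative assumptions (σ cancels in the ratio, so any positive value gives the same result).

```python
import math

def prob_abs_z_exceeds(z):
    """P(|Z| > z) for a standard normal variate Z, with z >= 0."""
    return math.erfc(z / math.sqrt(2))

sigma = 1.0                                        # arbitrary: it cancels below
se = sigma * math.sqrt(1 / 2000 + 4 / 2000)        # S.E. of the difference, 0.05 * sigma
threshold = 0.5 * 0.15 * sigma                     # half of 0.15 * sigma
p = prob_abs_z_exceeds(threshold / se)
print(round(se, 4), round(p, 4))
```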

Example 31 Random samples drawn from two countries gave the following data relating to the heights of adult males:

(i) Is the difference between the means significant?

(ii) Is the difference between the standard deviations significant?

Sol. Given: $n_1 = 1000$, $n_2 = 1200$, $\bar{x}_1 = 67.42$, $\bar{x}_2 = 67.25$, $s_1 = 2.58$, $s_2 = 2.50$.

Since the sample sizes are large, we can take $\sigma_1 = s_1 = 2.58$ and $\sigma_2 = s_2 = 2.50$.

(i) Null hypothesis: H0: $\mu_1 = \mu_2$, i.e., the sample means do not differ significantly.

Alternative hypothesis: H1: $\mu_1 \neq \mu_2$ (two-tailed test).

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{67.42 - 67.25}{\sqrt{\dfrac{(2.58)^2}{1000} + \dfrac{(2.50)^2}{1200}}} = 1.56$$

Since |Z| < 1.96, we accept the null hypothesis at the 5% level of significance.

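(ii) We set up the null hypothesis H0: $\sigma_1 = \sigma_2$, i.e., the sample S.D.s do not differ significantly.

Alternative hypothesis: H1: $\sigma_1 \neq \sigma_2$ (two-tailed).

∴ The test statistic is

$$Z = \frac{s_1 - s_2}{\sqrt{\dfrac{\sigma_1^2}{2n_1} + \dfrac{\sigma_2^2}{2n_2}}} = \frac{s_1 - s_2}{\sqrt{\dfrac{s_1^2}{2n_1} + \dfrac{s_2^2}{2n_2}}} \quad (\sigma_1 = s_1,\ \sigma_2 = s_2 \text{ for large samples})$$

$$= \frac{2.58 - 2.50}{\sqrt{\dfrac{(2.58)^2}{2 \times 1000} + \dfrac{(2.50)^2}{2 \times 1200}}} = \frac{0.08}{\sqrt{\dfrac{6.6564}{2000} + \dfrac{6.25}{2400}}} = 1.0387$$

Since |Z| < 1.96, we accept the null hypothesis at the 5% level of significance.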
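Part (ii) uses the large-sample test for the difference of standard deviations from section (E). A minimal Python sketch, with the hypothetical helper name `z_two_sds` and the sample S.D.s substituted for the population values, is:

```python
import math

def z_two_sds(sd1, n1, sd2, n2):
    """Large-sample Z statistic for the difference of two sample S.D.s."""
    standard_error = math.sqrt(sd1 ** 2 / (2 * n1) + sd2 ** 2 / (2 * n2))
    return (sd1 - sd2) / standard_error

# Values from Example 31 (ii).
z = z_two_sds(2.58, 1000, 2.50, 1200)
print(round(z, 4))
print(abs(z) < 1.96)   # True means H0 is not rejected at the 5% level
```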

PROBLEM SET 12.1

1. 325 men out of 600 men chosen from a big city were found to be smokers. Does this information support the conclusion that the majority of men in the city are smokers?

[Ans. H0 rejected at 5% level]

2. A sample of 600 persons selected at random from a large city shows that the percentage of males in the sample is 53. It is believed that the ratio of males to the total population in the city is 0.5. Test whether the belief is confirmed by the observation.

[Ans. H0 accepted at 5% level]

3. In a city a sample of 1000 people was taken, and out of them 540 are vegetarian and the rest are non-vegetarian. Can we say that both habits of eating are equally popular in the city at (i) the 5% level of significance, (ii) the 1% level of significance?

[Ans. H0 rejected at 5% level, H0 accepted at 1% level]

4. In a hospital 475 female and 525 male babies were born in a week. Do these figures confirm the hypothesis that males and females are born in equal number?

[Ans. H0 accepted at 5% level]

5. A random sample of 500 bolts was taken from a large consignment and 65 were found to be defective. Find the percentage of defective bolts in the consignment.

[Ans. Between 8.49 and 17.51]

6. In a town A, there were 956 births of which 52.5% were males, while in towns A and B combined, this proportion in a total of 1406 births was 0.496. Is there any significant difference in the proportion of male births in the two towns? [Ans. H0 rejected]


7. 1,000 apples are taken from a large consignment and 100 are found to be bad. Estimate the percentage of bad apples in the consignment and assign the limits within which the percentage lies.

8. In a referendum submitted to the student body at a university, 850 men and 560 women voted. 500 men and 320 women voted yes. Does this indicate a significant difference of opinion between men and women on this matter at the 1% level?

[Ans. H0 accepted]

9. A manufacturing firm claims that its brand A product outsells its brand B product by 8%. It is found that 42 out of a sample of 200 persons prefer brand A and 18 out of another sample of 100 persons prefer brand B. Test whether the 8% difference is a valid

10. In a large city A, 25% of a random sample of 900 school boys had defective eye-sight. In another large city B, 15.5% of a random sample of 1,600 school boys had the same defect. Is this difference between the two proportions significant?

[Ans. Not significant]

11. A sample of 1000 students from a university was taken and their average weight was found to be 112 pounds with a S.D. of 20 pounds. Could the mean weight of students in the population be 120 pounds? [Ans. H0 rejected]

12. A sample of 400 male students is found to have a mean height of 160 cms. Can it be reasonably regarded as a sample from a large population with mean height 162.5 cms and standard deviation 4.5 cms? [Ans. H0 accepted]

13. A random sample of 200 measurements from a large population gave a mean value of 50 and a S.D. of 9. Determine the 95% confidence interval for the mean of the population.

[Ans. 48.8 and 51.2]

14. The guaranteed average life of a certain type of bulb is 1000 hours with a S.D. of 125 hours. It is decided to sample the output so as to ensure that 90% of the bulbs do not fall short of the guaranteed average by more than 2.5%. What must be the minimum

15. The heights of college students in a city are normally distributed with S.D. 6 cms. A sample of 1000 students has mean height 158 cms. Test the hypothesis that the mean height of college students in the city is 160 cms.

[Ans. H0 rejected at both the 1% and 5% levels]

16. A normal population has a mean of 0.1 and a standard deviation of 2.1. Find the probability that the mean of a sample of size 900 will be negative. [Ans. 0.0764]

17. Intelligence tests on two groups of boys and girls gave the following results. Examine if the difference is significant.

[Ans. Not a significant difference]


18. Two random samples of 1000 and 2000 farms gave an average yield of 2000 kg and 2050 kg respectively. The variance of wheat farms in the country may be taken as 100 kg. Examine whether the two samples differ significantly in yield.

[Ans. Highly significant]

19. A random sample of 200 measurements from a large population gave a mean value of 50 and a S.D. of 9. Determine the 95% confidence interval for the mean of the population.

[Ans. 49.58, 50.41]

20. The means of two large samples of 1000 and 2000 members are 168.75 cms and 170 cms respectively. Can the samples be regarded as drawn from the same population of standard deviation 6.25 cms? [Ans. Not significant]

21. A sample of heights of 6400 soldiers has a mean of 67.85 inches and a S.D. of 2.56 inches, while another sample of heights of 1600 sailors has a mean of 68.55 inches with a S.D. of 2.52 inches. Do the data indicate that the sailors are on the average taller than

22. The yield of wheat in a random sample of 1000 farms in a certain area has a S.D. of 192 kg. Another random sample of 1000 farms gives a S.D. of 224 kg. Are the standard deviations significantly different?

[Ans. Z = 4.851 and the standard deviations are significantly different]

Generally, when the size of the sample is less than 30, it is called a small sample. For small samples we use the t-test, F-test, z-test and chi-square ($\chi^2$) test for testing of hypotheses. The chi-square test can be used for small-sample as well as large-sample problems.

For small samples it is not possible to assume that the random sampling distribution of a statistic is approximately normal, or that the values given by the sample data are sufficiently close to the population values to be used in their place when calculating the standard error of the estimate.

12.7.1 Chi-Square ($\chi^2$) Test

The $\chi^2$ test is one of the simplest and most widely used tests. It is applicable to a large variety of problems in general practice under the following headings:

(i) As a test of goodness of fit.

(ii) As a test of independence of attributes.

(iii) As a test of homogeneity of independent estimates of the population variance.

(iv) As a test of the hypothetical value of the population variance $\sigma^2$.

(v) To test the homogeneity of independent estimates of the population correlation coefficient.

The quantity $\chi^2$ describes the magnitude of the discrepancy between theory and observation. If $\chi^2 = 0$, the expected and the observed frequencies completely coincide. The greater the discrepancy between the observed and expected frequencies, the greater is the value of $\chi^2$. Thus $\chi^2$ affords a measure of the correspondence between theory and observation.


If $O_i$ (i = 1, 2, ..., n) is a set of observed (experimental) frequencies and $E_i$ (i = 1, 2, ..., n) is the corresponding set of expected (theoretical or hypothetical) frequencies, then $\chi^2$ is defined as

$$\chi^2 = \sum_{i=1}^{n} \left[\frac{(O_i - E_i)^2}{E_i}\right]$$

where $\sum O_i = \sum E_i = N$ (total frequency) and degrees of freedom (d.f.) = (n - 1).
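The definition translates directly into code. In the sketch below the function name `chi_square` and the frequencies are hypothetical, chosen only so that the observed and expected totals agree (N = 100) as the definition requires.

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (O_i - E_i)^2 / E_i over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical frequencies for five classes, both summing to N = 100.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]

statistic = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1
print(statistic, degrees_of_freedom)
```

The resulting value would then be compared with the tabulated $\chi^2$ value for the stated degrees of freedom at the chosen level of significance.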

Remarks:

(i) If $\chi^2 = 0$, the observed and theoretical frequencies agree exactly.

(ii) If $\chi^2 > 0$, they do not agree exactly.

Degrees of Freedom (d.f.): The number of independent variates which make up the statistic $\chi^2$ is known as the degrees of freedom (d.f.) and is denoted by $\nu$ (the Greek letter nu).

In other words, the number of degrees of freedom is the total number of observations less the number of independent constraints imposed on the observations:

$$\nu = n - k$$

where n = the number of observations and k = the number of independent constraints in a set of data of n observations.

Thus in a set of n observations the d.f. for $\chi^2$ are generally (n - 1), one d.f. being lost because of the linear constraint $\sum_i O_i = N$ on the frequencies.

For a p × q contingency table (p columns and q rows), $\nu = (p - 1)(q - 1)$.

Also, in the case of a contingency table, the expected frequency of any class is

$$\text{Expected frequency} = \frac{(\text{total of the row in which it occurs}) \times (\text{total of the column in which it occurs})}{\text{total number of observations}}$$
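The rule for expected frequencies and the degrees of freedom of a contingency table can be sketched as follows. The function names and the 2 × 2 table are hypothetical, used only to illustrate the calculation.

```python
def expected_frequencies(table):
    """Expected cell frequencies: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

def contingency_dof(table):
    """Degrees of freedom: (number of rows - 1) * (number of columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

# Hypothetical 2 x 2 table of observed frequencies.
table = [[30, 20],
         [20, 30]]

expected = expected_frequencies(table)
dof = contingency_dof(table)
print(expected, dof)
```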

Conditions for the Validity of the $\chi^2$ Test: The $\chi^2$ test is an approximate test for large values of n. For the validity of the chi-square test of 'goodness of fit' between theory and experiment, the following conditions must be satisfied:

1. The sample observations should be independent.

2. The constraints on the cell frequencies, if any, should be linear, e.g., $\sum O_i = \sum E_i$.

3. N, the total frequency, should be large. It is difficult to say what constitutes largeness, but as an arbitrary figure, N should be at least 50, however few the cells.

4. No theoretical cell frequency should be small. Again it is difficult to say what constitutes smallness, but 5 should be regarded as the very minimum and 10 is better. If small theoretical frequencies occur (i.e., < 10), the difficulty is overcome by grouping two or more classes together before calculating (O - E). It is important to remember that the number of degrees of freedom is determined from the number of classes after regrouping.

5. The $\chi^2$ test depends only on the set of observed and expected frequencies and on the d.f. It does not make any assumptions regarding the parent population from which the observations are taken. Since $\chi^2$ does not involve any population parameters, it is termed a statistic, and the test is known as a non-parametric or distribution-free test.

Remark: The probability function of the $\chi^2$ distribution is given by

$$f(\chi^2) = c \, (\chi^2)^{(\nu/2 - 1)} \, e^{-\chi^2/2}$$

where e = 2.71828,

$\nu$ = number of degrees of freedom,

c = a constant depending only on $\nu$.
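As a numerical illustration of this probability function, the sketch below takes the constant as $c = 1 / \left(2^{\nu/2}\,\Gamma(\nu/2)\right)$. This explicit form is the standard normalising constant of the chi-square density and is supplied here as an addition; the text itself only says that c depends on ν. The function names and the midpoint-rule integration are illustrative choices.

```python
import math

def chi_square_pdf(x, nu):
    """Chi-square density c * x^(nu/2 - 1) * exp(-x/2) for x > 0."""
    c = 1.0 / (2 ** (nu / 2) * math.gamma(nu / 2))   # assumed normalising constant
    return c * x ** (nu / 2 - 1) * math.exp(-x / 2)

def midpoint_integral(f, a, b, steps=20000):
    """Approximate the integral of f over [a, b] by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# With this c the density should integrate to approximately 1.
area = midpoint_integral(lambda x: chi_square_pdf(x, 4), 0.0, 100.0)
print(round(area, 4))
```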

For large sample sizes, the sampling distribution of χ2 can be closely approximated by a continuous curve known as the chi-square distribution

If the data are given as a series of n numbers, then degrees of freedom = n - 1.

In the case of the Binomial distribution, d.f. = n - 1.

In the case of the Poisson distribution, d.f. = n - 2.

In the case of the Normal distribution, d.f. = n - 3.

(i) Chi-Square Test for Population Variance: Under the null hypothesis that the population variance is $\sigma^2 = \sigma_0^2$, the statistic

$$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{\sigma_0^2} = \frac{1}{\sigma_0^2}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right] = \frac{nS^2}{\sigma_0^2}$$

follows the chi-square distribution with (n - 1) d.f.

This test can be applied only if the population from which the sample is drawn is normal.

If the sample size n is large (n > 30), then we can use Fisher's approximation

$$Z = \sqrt{2\chi^2} - \sqrt{2\nu - 1} \sim N(0, 1)$$

where $\nu$ is the number of degrees of freedom, and apply the Normal test.

Example 1. Test the hypothesis that σ = 10, given that S = 15 for a random sample of size 50 from a normal population.

Sol. Null hypothesis, H0: σ = 10.

We are given n = 50, S = 15.

$$\chi^2 = \frac{nS^2}{\sigma^2} = \frac{50 \times 15^2}{10^2} = 112.5$$
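The variance test can be sketched in Python using the figures of Example 1. The function names are hypothetical. The normal approximation used is Fisher's $Z = \sqrt{2\chi^2} - \sqrt{2\nu - 1}$; taking $\nu = n - 1$ is an assumption of this sketch, and a text that uses n in place of ν would obtain a slightly different Z.

```python
import math

def chi_square_variance(n, sample_sd, sigma0):
    """Chi-square statistic n * S^2 / sigma0^2 for a hypothesised sigma0."""
    return n * sample_sd ** 2 / sigma0 ** 2

def fisher_z(chi2, dof):
    """Fisher's normal approximation: sqrt(2 * chi2) - sqrt(2 * dof - 1)."""
    return math.sqrt(2 * chi2) - math.sqrt(2 * dof - 1)

# Figures from Example 1: n = 50, S = 15, hypothesised sigma = 10.
chi2 = chi_square_variance(50, 15, 10)
z = fisher_z(chi2, dof=50 - 1)     # assumes nu = n - 1
print(chi2, round(z, 3))
print(abs(z) > 3)   # True suggests the hypothesis sigma = 10 is not tenable
```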
