A Textbook of Computer Based Numerical and Statistical Techniques, part 55


COMPUTER BASED NUMERICAL AND STATISTICAL TECHNIQUES
TESTING OF HYPOTHESIS

$$\ldots = \frac{50 \times 225}{100} = 112.5$$

Since $n$ is large, the test statistic is

$$Z = \sqrt{2\chi^2} - \sqrt{2n - 1} \sim N(0, 1)$$

Now, $Z = \sqrt{225} - \sqrt{99} = 15 - 9.95 = 5.05$.

Since $Z > 3$, it is significant at all levels of significance; hence $H_0$ is rejected and we conclude that $\sigma \neq 10$.

Example 2. It is believed that the precision (as measured by the variance) of an instrument is no more than 0.16. Write down the null and alternative hypotheses for testing this belief. Carry out the test at the 1% level, given 11 measurements of the same subject on the instrument:
2.5, 2.3, 2.4, 2.3, 2.5, 2.7, 2.5, 2.6, 2.6, 2.7, 2.5 [B.U. (2006), Kanpur (2007)]

Sol. Null Hypothesis, $H_0: \sigma^2 = 0.16$
Alternative Hypothesis, $H_1: \sigma^2 > 0.16$

Computation of Sample Variance

    X      X - X̄     (X - X̄)^2
    2.5    -0.01     0.0001
    2.3    -0.21     0.0441
    2.4    -0.11     0.0121
    2.3    -0.21     0.0441
    2.5    -0.01     0.0001
    2.7    +0.19     0.0361
    2.5    -0.01     0.0001
    2.6    +0.09     0.0081
    2.6    +0.09     0.0081
    2.7    +0.19     0.0361
    2.5    -0.01     0.0001

$$\bar{X} = \frac{27.6}{11} = 2.51, \qquad \sum (X - \bar{X})^2 = 0.1891$$

Under the null hypothesis $H_0: \sigma^2 = 0.16$, the test statistic is

$$\chi^2 = \frac{nS^2}{\sigma^2} = \frac{\sum (X - \bar{X})^2}{\sigma^2} = \frac{0.1891}{0.16} = 1.182$$

which follows the $\chi^2$-distribution with d.f. $(11 - 1) = 10$.

Since the calculated value of $\chi^2$ is less than the tabulated value 23.2 of $\chi^2$ for 10 d.f. at the 1% level of significance, it is not significant. Hence $H_0$ may be accepted and we conclude that the data are consistent with the hypothesis that the precision of the instrument is 0.16.

(ii) Chi-Square Test of Goodness of Fit: The $\chi^2$ test is an approximate test for large values of $n$. It enables us to ascertain how well a theoretical distribution fits an empirical distribution, i.e., a distribution obtained from sample data. If the calculated value of chi-square is less than the table value at a specified level of significance, the fit is considered to be good. Generally we take the 5% level of significance.
Similarly, if the calculated value of $\chi^2$ is greater than the table value, the fit is considered to be poor.

Example 3. The following table shows the distribution of digits in numbers chosen at random from a telephone directory:

    Digit       0     1     2    3    4     5    6     7    8    9
    Frequency   1026  1107  997  966  1075  933  1107  972  964  853

Test whether the digits may be taken to occur equally frequently in the directory.

Sol. Null Hypothesis $H_0$: The digits in the directory occur equally frequently, i.e., there is no significant difference between the observed and expected frequencies.

Under $H_0$, the expected frequency of each digit is $\frac{10{,}000}{10} = 1000$.

To find the value of $\chi^2$:

    Digit   O_i    E_i    (O_i - E_i)^2
    0       1026   1000   676
    1       1107   1000   11449
    2       997    1000   9
    3       966    1000   1156
    4       1075   1000   5625
    5       933    1000   4489
    6       1107   1000   11449
    7       972    1000   784
    8       964    1000   1296
    9       853    1000   21609

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{58542}{1000} = 58.542$$

Conclusion. The tabulated value of $\chi^2$ at the 5% level of significance for 9 d.f. is 16.919. Since the calculated value of $\chi^2$ is greater than the tabulated value, $H_0$ is rejected, i.e., there is a significant difference between the observed and theoretical frequencies: the digits in the directory do not occur equally frequently.

Example 4. The following table gives the number of aircraft accidents that occurred on the various days of the week. Find whether the accidents are uniformly distributed over the week.

    Day                Sun.  Mon.  Tues.  Wed.  Thurs.  Fri.  Sat.
    No. of accidents   14    16    8      12    11      9     14

(Given: the values of chi-square significant at 5, 6, 7 d.f. are respectively 11.07, 12.59, 14.07 at the 5% level of significance.)

Sol. Here we set up the null hypothesis that the accidents are uniformly distributed over the week. Under the null hypothesis, the expected frequencies of the accidents on each of the days would be:

    Day                Sun.  Mon.  Tues.  Wed.  Thurs.  Fri.  Sat.  Total
    No. of accidents   12    12    12     12    12      12    12    84

$$\chi^2 = \frac{(14-12)^2}{12} + \frac{(16-12)^2}{12} + \frac{(8-12)^2}{12} + \frac{(12-12)^2}{12} + \frac{(11-12)^2}{12} + \frac{(9-12)^2}{12} + \frac{(14-12)^2}{12}$$

$$= \frac{1}{12}(4 + 16 + 16 + 0 + 1 + 9 + 4) = \frac{50}{12} = 4.17$$

The number of degrees of freedom = Number of observations - Number of independent constraints = 7 - 1 = 6.

The tabulated $\chi^2_{0.05}$ for 6 d.f. = 12.59.

Since the calculated $\chi^2$ is much less than the tabulated value, it is not significant and we accept the null hypothesis. Hence we conclude that the accidents are uniformly distributed over the week.

Example 5. Records taken of the number of male and female births in 800 families having four children are as follows:

    No. of male births     0    1    2    3    4
    No. of female births   4    3    2    1    0
    No. of families        32   178  290  236  94

Test whether the data are consistent with the hypothesis that the Binomial law holds and the chance of a male birth is equal to that of a female birth, namely $p = q = 1/2$.

Sol. $H_0$: The data are consistent with the hypothesis of equal probability for male and female births, i.e., $p = q = 1/2$.

We use the Binomial distribution to calculate the theoretical frequencies, given by

$$N(r) = N \times P(X = r)$$

where $N$ is the total frequency and $N(r)$ is the number of families with $r$ male children:

$$P(X = r) = {}^{n}C_{r}\, p^{r} q^{n-r}$$

where $p$ and $q$ are the probabilities of male and female births and $n$ is the number of children.

$$N(0) = 800 \times {}^{4}C_{0} \left(\tfrac{1}{2}\right)^{4} = 800 \times 1 \times \tfrac{1}{2^4} = 50$$

$$N(1) = 800 \times {}^{4}C_{1} \left(\tfrac{1}{2}\right)^{1} \left(\tfrac{1}{2}\right)^{3} = 200; \qquad N(2) = 800 \times {}^{4}C_{2} \left(\tfrac{1}{2}\right)^{2} \left(\tfrac{1}{2}\right)^{2} = 300$$

$$N(3) = 800 \times {}^{4}C_{3} \left(\tfrac{1}{2}\right)^{3} \left(\tfrac{1}{2}\right)^{1} = 200; \qquad N(4) = 800 \times {}^{4}C_{4} \left(\tfrac{1}{2}\right)^{4} \left(\tfrac{1}{2}\right)^{0} = 50$$

Note that the observed frequencies as printed total 830 rather than 800; the calculation below uses them as given.

    r   Observed O_i   Expected E_i   (O_i - E_i)^2   (O_i - E_i)^2 / E_i
    0   32             50             324             6.48
    1   178            200            484             2.42
    2   290            300            100             0.333
    3   236            200            1296            6.48
    4   94             50             1936            38.72

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 54.433$$

Conclusion. The table value of $\chi^2$ at the 5% level of significance for 5 - 1 = 4 d.f. is 9.49.
Since the calculated value of $\chi^2$ is greater than the tabulated value, $H_0$ is rejected, i.e., the data are not consistent with the hypothesis that the Binomial law holds with the chance of a male birth equal to that of a female birth.

Since the fitting is Binomial, the degrees of freedom are $\nu = n - 1$, i.e., $\nu = 5 - 1 = 4$.

Example 6. A survey of 320 families with 5 children each revealed the following distribution:

    No. of boys       5    4    3     2    1    0
    No. of girls      0    1    2     3    4    5
    No. of families   14   56   110   88   40   12

Is this result consistent with the hypothesis that male and female births are equally probable?

Sol. Let us set up the null hypothesis that the data are consistent with the hypothesis of equal probability for male and female births. Then under the null hypothesis:

$p$ = Probability of a male birth = $\frac{1}{2}$ = $q$

$p(r)$ = Probability of $r$ male births in a family of 5

$$p(r) = \binom{5}{r} p^{r} q^{5-r} = \binom{5}{r} \left(\tfrac{1}{2}\right)^{5}$$

The frequency of $r$ male births is given by:

$$f(r) = N \cdot p(r) = 320 \times \binom{5}{r} \times \left(\tfrac{1}{2}\right)^{5} = 10 \times \binom{5}{r} \qquad (1)$$

Substituting $r = 0, 1, 2, 3, 4, 5$ successively in (1), we get the expected frequencies as follows:

    f(0) = 10 × 1 = 10,         f(1) = 10 × 5C1 = 50
    f(2) = 10 × 5C2 = 100,      f(3) = 10 × 5C3 = 100
    f(4) = 10 × 5C4 = 50,       f(5) = 10 × 5C5 = 10

Calculations for $\chi^2$

    Observed (O)   Expected (E)   (O - E)^2   (O - E)^2 / E
    14             10             16          1.6000
    56             50             36          0.7200
    110            100            100         1.0000
    88             100            144         1.4400
    40             50             100         2.0000
    12             10             4           0.4000
    Total 320      320                        7.1600

$$\therefore \chi^2 = \sum \left[ \frac{(O - E)^2}{E} \right] = 7.16$$

The tabulated $\chi^2_{0.05}$ for 6 - 1 = 5 d.f. is 11.07. Since the calculated value of $\chi^2$ is less than the tabulated value, it is not significant at the 5% level of significance and hence the null hypothesis of equal probability for male and female births may be accepted.

Example 7. Fit a Poisson distribution to the following data and test the goodness of fit:

    X:   0     1    2    3   4   5   6
    f:   275   72   30   7   5   2   1

Sol.
Mean of the given distribution is:

$$\bar{X} = \frac{\sum f_i x_i}{N} = \frac{189}{392} = 0.482$$

In order to fit a Poisson distribution to the given data, we take the mean (parameter) $m$ of the Poisson distribution equal to the mean of the given distribution, i.e., we take $m = \bar{X} = 0.482$.

The frequency of $r$ successes is given by the Poisson law as:

$$f(r) = N p(r) = 392 \times \frac{e^{-0.482} (0.482)^{r}}{r!}; \qquad r = 0, 1, 2, \ldots, 6$$

Now,

    f(0) = 392 × e^(-0.482)
         = 392 × Antilog [-0.482 log e]
         = 392 × Antilog [-0.482 × log 2.7183]      [since e = 2.7183]
         = 392 × Antilog [-0.482 × 0.4343]
         = 392 × Antilog [-0.2093]
         = 392 × Antilog [$\bar{1}.7907$]
         = 392 × 0.6176 = 242.1

    f(1) = m × f(0)     = 0.482 × 242.1       = 116.69
    f(2) = (m/2) × f(1) = 0.241 × 116.69      = 28.12
    f(3) = (m/3) × f(2) = (0.482/3) × 28.12   = 4.518
    f(4) = (m/4) × f(3) = (0.482/4) × 4.518   = 0.544
    f(5) = (m/5) × f(4) = (0.482/5) × 0.544   = 0.052
    f(6) = (m/6) × f(5) = (0.482/6) × 0.052   = 0.004

Hence the theoretical Poisson frequencies, correct to one decimal place, are as given below:

    X                    0       1       2      3     4     5     6   Total
    Expected frequency   242.1   116.7   28.1   4.5   0.5   0.1   0   392

CALCULATIONS FOR CHI-SQUARE

The last four classes (X = 3, 4, 5, 6) are pooled because their expected frequencies are small.

    Observed (O)        Expected (E)                (O - E)   (O - E)^2   (O - E)^2 / E
    275                 242.1                       32.9      1082.41     4.471
    72                  116.7                       -44.7     1998.09     17.121
    30                  28.1                        1.9       3.61        0.128
    7 + 5 + 2 + 1 = 15  4.5 + 0.5 + 0.1 + 0 = 5.1   9.9       98.01       19.217
    Total 392           392.0                                             40.937

$$\therefore \chi^2 = \sum \frac{(O - E)^2}{E} = 40.937$$

Degrees of freedom = 7 - 1 - 1 - 3 = 2 (one for the fixed total, one for the estimated parameter $m$, and three for the pooled classes).

The tabulated value of $\chi^2$ for 2 degrees of freedom at the 5% level of significance is 5.99.

Conclusion: Since the calculated value of $\chi^2$ (40.937) is much greater than 5.99, it is highly significant. Hence we say that the Poisson distribution is not a good fit to the given data.

Example 8. A die is thrown 276 times and the results of these throws are given below:

    No. appeared on the die   1    2    3    4    5    6
    Frequency                 40   32   29   59   57   59

Test whether the die is biased or not.

Sol. Null Hypothesis $H_0$: Die is unbiased.
Under this $H_0$, the expected frequency for each face is $\frac{276}{6} = 46$.

To find the value of $\chi^2$:

    Face   O_i   E_i   (O_i - E_i)^2
    1      40    46    36
    2      32    46    196
    3      29    46    289
    4      59    46    169
    5      57    46    121
    6      59    46    169

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{980}{46} = 21.30$$

Conclusion: The tabulated value of $\chi^2$ at the 5% level of significance for (6 - 1 = 5) d.f. is 11.07. Since the calculated value $\chi^2 = 21.30 > 11.07$, the tabulated value, $H_0$ is rejected, i.e., the die is biased.

Example 9. The theory predicts that the proportion of beans in the four groups $G_1, G_2, G_3, G_4$ should be in the ratio 9 : 3 : 3 : 1. In an experiment with 1600 beans the numbers in the four groups were 882, 313, 287 and 118. Does the experimental result support the theory?

Sol. $H_0$: The experimental result supports the theory, i.e., there is no significant difference between the observed and theoretical frequencies. Under $H_0$, the theoretical frequencies can be calculated as follows:

$$E(G_1) = \frac{1600 \times 9}{16} = 900; \quad E(G_2) = \frac{1600 \times 3}{16} = 300; \quad E(G_3) = \frac{1600 \times 3}{16} = 300; \quad E(G_4) = \frac{1600 \times 1}{16} = 100$$

To calculate the value of $\chi^2$:

    Group   Observed O_i   Expected E_i   (O_i - E_i)^2 / E_i
    G_1     882            900            0.36
    G_2     313            300            0.5633
    G_3     287            300            0.5633
    G_4     118            100            3.24

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 4.7266$$

Conclusion: The table value of $\chi^2$ at the 5% level of significance for 3 d.f. is 7.815. Since the calculated value of $\chi^2$ is less than the tabulated value, $H_0$ is accepted, i.e., the experimental result supports the theory.

(iii) $\chi^2$ Test as a Test of Attributes: Let us consider two attributes $A$ and $B$, with $A$ divided into $r$ classes $A_1, A_2, \ldots, A_r$ and $B$ divided into $s$ classes $B_1, B_2, \ldots, B_s$. Such a classification, in which attributes are divided into more than two classes, is known as manifold classification. The various cell frequencies can be expressed in the following table, known as an $r \times s$ manifold contingency table.
Here $(A_i)$ is the number of persons possessing the attribute $A_i$, $(B_j)$ is the number of persons possessing the attribute $B_j$, and $(A_i B_j)$ is the number of persons possessing both the attributes $A_i$ and $B_j$, for $i = 1, 2, \ldots, r$; $j = 1, 2, \ldots, s$. Also,

$$\sum_{i=1}^{r} (A_i) = \sum_{j=1}^{s} (B_j) = N,$$

the total frequency. The $r \times s$ contingency table is given below:

    B \ A    A_1         A_2         A_3         ...   A_r         Total
    B_1      (A_1 B_1)   (A_2 B_1)   (A_3 B_1)   ...   (A_r B_1)   (B_1)
    B_2      (A_1 B_2)   (A_2 B_2)   (A_3 B_2)   ...   (A_r B_2)   (B_2)
    B_3      (A_1 B_3)   (A_2 B_3)   (A_3 B_3)   ...   (A_r B_3)   (B_3)
    ...
    B_s      (A_1 B_s)   (A_2 B_s)   (A_3 B_s)   ...   (A_r B_s)   (B_s)
    Total    (A_1)       (A_2)       (A_3)       ...   (A_r)       N

The problem is to test whether the two attributes $A$ and $B$ under consideration are independent or not. Under the null hypothesis that the attributes are independent, the theoretical cell frequencies are calculated as follows.

$P(A_i)$ = Probability that a person possesses the attribute $A_i$ = $\frac{(A_i)}{N}$, $i = 1, 2, \ldots, r$

$P(B_j)$ = Probability that a person possesses the attribute $B_j$ = $\frac{(B_j)}{N}$, $j = 1, 2, \ldots, s$

$P(A_i B_j)$ = Probability that a person possesses both attributes $A_i$ and $B_j$ = $\frac{(A_i B_j)}{N}$

If $(A_i B_j)_0$ is the expected number of persons possessing both the attributes $A_i$ and $B_j$, then

$$(A_i B_j)_0 = N \cdot P(A_i B_j) = N \cdot P(A_i) P(B_j) = N \cdot \frac{(A_i)}{N} \cdot \frac{(B_j)}{N} = \frac{(A_i)(B_j)}{N}$$

(since $A$ and $B$ are independent). Therefore

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{s} \frac{\left[ (A_i B_j) - (A_i B_j)_0 \right]^2}{(A_i B_j)_0}$$

which is distributed as a $\chi^2$ variate with $(r - 1)(s - 1)$ d.f.

Some Remarkable Points:

1. For a 2 × 2 contingency table where the frequencies are $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $\chi^2$ can be calculated directly from the cell frequencies as

$$\chi^2 = \frac{(a + b + c + d)(ad - bc)^2}{(a + b)(c + d)(b + d)(a + c)}$$

2. If the contingency table is not 2 × 2, then the above formula for calculating $\chi^2$ cannot be used. In that case we use the formula for the expected frequency, $(A_i B_j)_0 = \frac{(A_i)(B_j)}{N}$, i.e., the expected frequency in each cell is

$$\frac{\text{Product of column total and row total}}{\text{whole total}}$$

3. If $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is the 2 × 2 contingency table with two attributes, $Q = \frac{ad - bc}{ad + bc}$ is called the coefficient of association. If the attributes are independent then $\frac{a}{b} = \frac{c}{d}$.

Remark: Yates's Correction: In a 2 × 2 table, if the frequency of a cell is small, we make Yates's correction to make $\chi^2$ continuous. Decrease by $\frac{1}{2}$ those cell frequencies which are greater than the expected frequencies, and increase by $\frac{1}{2}$ those which are less than expected. This does not affect the marginal totals. This correction is known as Yates's correction for continuity.

After Yates's correction,

$$\chi^2 = \frac{N \left( bc - ad - \frac{1}{2} N \right)^2}{(a + c)(b + d)(c + d)(a + b)} \quad \text{when } ad - bc < 0$$

$$\chi^2 = \frac{N \left( ad - bc - \frac{1}{2} N \right)^2}{(a + c)(b + d)(c + d)(a + b)} \quad \text{when } ad - bc > 0$$

Example 10. (2 × 2 contingency table). For the 2 × 2 table $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$, prove that the chi-square test of independence gives

$$\chi^2 = \frac{N (ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)}, \qquad N = a + b + c + d \qquad (1)$$

[Guwahati Univ. B.Sc., 2002]

Sol. Under the hypothesis of independence of attributes,

$$E(a) = \frac{(a + b)(a + c)}{N}, \quad E(b) = \frac{(a + b)(b + d)}{N}, \quad E(c) = \frac{(a + c)(c + d)}{N}, \quad E(d) = \frac{(b + d)(c + d)}{N}$$

    a       b       a + b
    c       d       c + d
    a + c   b + d   N

$$\therefore \chi^2 = \frac{[a - E(a)]^2}{E(a)} + \frac{[b - E(b)]^2}{E(b)} + \frac{[c - E(c)]^2}{E(c)} + \frac{[d - E(d)]^2}{E(d)} \qquad (2)$$

$$a - E(a) = a - \frac{(a + b)(a + c)}{N} = \frac{a(a + b + c + d) - (a^2 + ac + ab + bc)}{N} = \frac{ad - bc}{N}$$
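The identity stated in Example 10 can be spot-checked numerically. The Python sketch below is only an illustration and not part of the textbook's proof: the function names and the sample frequencies 10, 20, 30, 40 are assumptions chosen for the demonstration. It computes $\chi^2$ once from the expected frequencies $E(a), E(b), E(c), E(d)$ as in (2) and once from the closed form (1), and also reuses the same helper on the observed and expected frequencies of Example 9.

```python
def chi_square(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_2x2_expected(a, b, c, d):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] via expected frequencies, as in (2)."""
    n = a + b + c + d
    expected = [
        (a + b) * (a + c) / n,  # E(a)
        (a + b) * (b + d) / n,  # E(b)
        (a + c) * (c + d) / n,  # E(c)
        (b + d) * (c + d) / n,  # E(d)
    ]
    return chi_square([a, b, c, d], expected)


def chi_square_2x2_closed_form(a, b, c, d):
    """Chi-square for the same table via the closed form (1)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))


if __name__ == "__main__":
    # Hypothetical cell frequencies, chosen only to exercise both formulas.
    a, b, c, d = 10, 20, 30, 40
    print("via expected frequencies:", chi_square_2x2_expected(a, b, c, d))
    print("via closed form (1):     ", chi_square_2x2_closed_form(a, b, c, d))

    # Goodness of fit with the frequencies of Example 9 (beans in ratio 9:3:3:1).
    print("Example 9 statistic:     ", chi_square([882, 313, 287, 118], [900, 300, 300, 100]))
```

Both 2 × 2 routes should agree up to floating-point rounding for any table with non-zero marginal totals; a numerical agreement of this kind illustrates the identity but does not replace the algebraic proof.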

Date posted: 04/07/2014, 15:20
