Chapter Fifteen
Nonparametric Methods: Chi-Square Applications
© 2005 The McGraw-Hill Companies, Inc.

Goals

When you have completed this chapter, you will be able to:
ONE: List the characteristics of the chi-square distribution.
TWO: Conduct a test of hypothesis comparing an observed set of frequencies to an expected set of frequencies.
THREE: Conduct a hypothesis test to determine whether two classification criteria are related.

Characteristics of the Chi-Square Distribution

The major characteristics of the chi-square distribution are:
It is positively skewed.
It is non-negative.
There is a family of chi-square distributions, one for each number of degrees of freedom.

[Figure: χ² distribution curves for several degrees of freedom, including df = 10]

Goodness-of-Fit Test: Equal Expected Frequencies

Let fo and fe be the observed and expected frequencies, respectively.
H0: There is no difference between the observed and expected frequencies.
H1: There is a difference between the observed and expected frequencies.

The test statistic is:

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

The critical value is a chi-square value with (k-1) degrees of freedom, where k is the number of categories.

Example

The following information shows the number of employees absent by day of the week at a large manufacturing plant. At the .01 level of significance, is there a difference in the absence rate by day of the week?
Day of Week   Number Absent
Monday        120
Tuesday        45
Wednesday      60
Thursday       90
Friday        130
Total         445

Step 1: State the null and alternate hypotheses.
H0: There is no difference between the observed and expected frequencies.
H1: There is a difference between the observed and expected frequencies.

Step 2: Select the level of significance. It is given in the problem as .01.

Step 3: Select the test statistic. It is χ², which follows the chi-square distribution.

Step 4: Formulate the decision rule. Assume equal expected frequencies, as given in the problem:
fe = (120 + 45 + 60 + 90 + 130)/5 = 89
The degrees of freedom are (5 - 1) = 4, so the critical value of χ² is 13.28. Reject the null and accept the alternate if the computed χ² > 13.28 or p < .01.

Step 5: Compute the value of chi-square and make a decision.

Day         fo    fe    (fo - fe)²/fe
Monday      120    89   10.80
Tuesday      45    89   21.75
Wednesday    60    89    9.45
Thursday     90    89    0.01
Friday      130    89   18.89
Total       445   445   60.90

The p-value is p(χ² > 60.9) = .000000000001877, or essentially 0.

Because the computed value of chi-square, 60.90, is greater than the critical value, 13.28 (equivalently, the p-value of .000000000001877 is less than .01), H0 is rejected. We conclude that there is a difference in the number of workers absent by day of the week.

Goodness-of-Fit Test: Unequal Expected Frequencies

Example

The U.S. Bureau of the Census indicated that 63.9% of the population is married, 7.7% widowed, 6.9% divorced (and not remarried), and 21.5% single (never been married). A sample of 500 adults from the Philadelphia area showed that 310 were married, 40 widowed, 30 divorced, and 120 single. At the .02 significance level, can we conclude that the Philadelphia area is different from the U.S. as a whole?
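The arithmetic in the absence-by-day example (equal expected frequencies) can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `chi_square_equal_expected` is hypothetical, and the critical value 13.28 is copied from the example rather than computed.

```python
def chi_square_equal_expected(observed):
    """Return (expected frequency, chi-square statistic, degrees of freedom)
    for a goodness-of-fit test with equal expected frequencies."""
    k = len(observed)            # number of categories
    fe = sum(observed) / k       # the same expected frequency in every category
    chi_sq = sum((fo - fe) ** 2 / fe for fo in observed)
    return fe, chi_sq, k - 1


# Absences for Monday through Friday, taken from the example.
absences = [120, 45, 60, 90, 130]
fe, chi_sq, df = chi_square_equal_expected(absences)

# Critical value for df = 4 at the .01 level, as stated in the example.
CRITICAL_VALUE = 13.28
reject_h0 = chi_sq > CRITICAL_VALUE

print(f"fe = {fe:.0f}, chi-square = {chi_sq:.2f}, df = {df}, reject H0: {reject_h0}")
```

Run as written, the sketch reproduces fe = 89 and a chi-square value of about 60.90 with 4 degrees of freedom, so the decision to reject H0 matches the worked example.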
Step 1: H0: The distribution has not changed; marital status in the Philadelphia area follows the U.S. distribution.
H1: The distribution has changed; marital status in the Philadelphia area does not follow the U.S. distribution.
Step 2: The significance level given is .02.
Step 3: The test statistic is χ², which follows the chi-square distribution.
Step 4: H0 is rejected if χ² > 9.837 (df = 3) or if p < α = .02.

Calculate the expected frequencies and the chi-square values:

Status     fo    fe                     (fo - fe)²/fe
Married    310   (.639)(500) = 319.5    0.2825
Widowed     40   (.077)(500) =  38.5    0.0584
Divorced    30   (.069)(500) =  34.5    0.5870
Single     120   (.215)(500) = 107.5    1.4535
Total      500                 500.0    2.3814

Step 5: χ² = 2.3814 and p(χ² > 2.3814) = .497. Because 2.3814 < 9.837 and .497 > .02, the null hypothesis is not rejected. The sample does not show that the distribution of marital status in Philadelphia differs from that of the United States as a whole.

Contingency Table Analysis

Chi-square can be used to test for a relationship between two nominal-scaled variables; the null hypothesis is that one variable is independent of the other.
A contingency table is used to investigate whether two traits or characteristics are related.
Each observation is classified according to two criteria.
We use the usual hypothesis testing procedure.
The degrees of freedom are equal to (number of rows - 1)(number of columns - 1).
The expected frequency is computed as:

$$\text{Expected frequency} = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$$

Example

Is there a relationship between the location of an accident and the gender of the person involved in the accident? A sample of 150 accidents reported to the police was classified by location and gender. At the .01 level of significance, can we conclude that gender and the location of the accident are related?
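The marital-status example (unequal expected frequencies) can be reproduced numerically, this time including the p-value. This is a minimal sketch under stated assumptions, not part of the original slides: the names `chi2_sf` and `goodness_of_fit` are hypothetical, and `chi2_sf` is a standard-library stand-in, valid only for whole-number degrees of freedom, for the upper-tail probability that a statistics package would normally supply.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-square > x) for integer df >= 1.

    Builds the regularized upper incomplete gamma function Q(df/2, x/2)
    from Q(1/2, y) = erfc(sqrt(y)) or Q(1, y) = exp(-y) using the
    recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1).
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def goodness_of_fit(observed, proportions):
    """Return (expected frequencies, chi-square, df, p-value)."""
    n = sum(observed)
    expected = [p * n for p in proportions]   # fe = proportion * sample size
    chi_sq = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    df = len(observed) - 1
    return expected, chi_sq, df, chi2_sf(chi_sq, df)


# Married, widowed, divorced, single: sample counts and census proportions.
observed = [310, 40, 30, 120]
census = [0.639, 0.077, 0.069, 0.215]

expected, chi_sq, df, p_value = goodness_of_fit(observed, census)
alpha = 0.02
print(f"chi-square = {chi_sq:.4f}, df = {df}, p = {p_value:.3f}, "
      f"reject H0: {p_value < alpha}")
```

The decision here is made from the p-value alone, so no table lookup of the critical value is needed; comparing the statistic with 9.837 leads to the same conclusion.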
Step 1: H0: Gender and location are not related.
H1: Gender and location are related.
Step 2: The level of significance is set at .01.
Step 3: The test statistic is χ², which follows the chi-square distribution.
Step 4: The degrees of freedom equal (r-1)(c-1) = (2-1)(3-1) = 2. The critical χ² at 2 d.f. is 9.21. If computed χ² > 9.21, or if p < .01, reject the null and accept the alternate.
Step 5: A data table and the following contingency table are constructed.

Observed frequencies (fo)

Gender   Work   Home   Other   Total
Male       60     20      10      90
Female     20     30      10      60
Total      80     50      20     150

The expected frequency for the work-male intersection is computed as (90)(80)/150 = 48. Similarly, you can compute the expected frequencies for the other cells.

Expected frequencies (fe)

Gender   Work                 Home                 Other                Total
Male     (80)(90)/150 = 48    (50)(90)/150 = 30    (20)(90)/150 = 12       90
Female   (80)(60)/150 = 32    (50)(60)/150 = 20    (20)(60)/150 = 8        60
Total    80                   50                   20                     150

χ²: (fo - fe)²/fe

Gender   Work           Home           Other           Total χ²
Male     (60-48)²/48    (20-30)²/30    (10-12)²/12      6.667
Female   (20-32)²/32    (30-20)²/20    (10-8)²/8       10.000
Total                                                  16.667

The p(χ² > 16.667) = .00024. Since the χ² of 16.667 > 9.21 (and the p of .00024 < .01), reject the null and conclude that there is a relationship between the location of an accident and the gender of the person involved.
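The expected frequencies and the χ² statistic for the accident example follow mechanically from the row and column totals, so they are easy to verify in code. This is a minimal sketch, not part of the original slides: the function name `contingency_chi_square` is hypothetical, and the critical value 9.21 is copied from Step 4 rather than computed.

```python
def contingency_chi_square(table):
    """Return (expected frequencies, chi-square statistic, degrees of freedom)
    for a contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected frequency = (row total)(column total) / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum (fo - fe)^2 / fe over every cell of the table.
    chi_sq = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, chi_sq, df


# Rows: male, female. Columns: work, home, other.
observed = [
    [60, 20, 10],
    [20, 30, 10],
]
expected, chi_sq, df = contingency_chi_square(observed)

# Critical value for df = 2 at the .01 level, as stated in Step 4.
CRITICAL_VALUE = 9.21
reject_h0 = chi_sq > CRITICAL_VALUE

print(expected)
print(f"chi-square = {chi_sq:.3f}, df = {df}, reject H0: {reject_h0}")
```

Passing the table as a list of rows keeps the sketch independent of the table's size: the same function handles any number of rows and columns, provided every expected frequency is greater than zero.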