Chapter 15 - Chi-Square Applications
Copyright © 2004 McGraw-Hill Ryerson Limited. All rights reserved.

When you have completed this chapter, you will be able to:
Understand the nature and role of the chi-square distribution
Identify a wide variety of uses of the chi-square distribution
Conduct a test of hypothesis comparing an observed frequency distribution to an expected frequency distribution
Conduct a test of hypothesis for normality using the chi-square distribution
Conduct a hypothesis test to determine whether two attributes are independent

Characteristics of the Chi-Square Distribution
… it is positively skewed
… it is non-negative
… it is based on degrees of freedom
… when the degrees of freedom change, a new distribution is created

[Figure: chi-square distributions for df = 3, df = 5, and df = 10]

Goodness-of-Fit Test: Equal Expected Frequencies
Let fo and fe be the observed and expected frequencies, respectively.
H0: There is no difference between the observed and expected frequencies
H1: There is a difference between the observed and the expected frequencies
… the test statistic is:
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
… the critical value is a chi-square value with (k - 1) degrees of freedom, where k is the number of categories
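The test statistic and decision rule above can be sketched in a few lines of Python. This is only an illustration: the names `chi_square_statistic`, `reject_h0`, and `CRITICAL_05` are made up for this example, and the small dictionary holds just the .05 critical values quoted in this chapter rather than a full chi-square table.

```python
# Illustrative sketch; all names here are hypothetical, not from the textbook.

# .05 critical values quoted in this chapter, keyed by degrees of freedom.
CRITICAL_05 = {2: 5.991, 3: 7.815, 4: 9.488}


def chi_square_statistic(observed, expected):
    """Return the sum of (fo - fe)^2 / fe over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


def reject_h0(observed, expected):
    """Apply the decision rule with k - 1 degrees of freedom at the .05 level."""
    df = len(observed) - 1
    return chi_square_statistic(observed, expected) > CRITICAL_05[df]


# Made-up frequencies for three categories, purely to exercise the functions.
print(chi_square_statistic([10, 20, 30], [20, 20, 20]))  # 10.0
print(reject_h0([10, 20, 30], [20, 20, 20]))             # True (10.0 > 5.991)
```

Looking up the critical value by degrees of freedom mirrors reading it from a printed table; a statistics library could supply the value for any df instead.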
Goodness-of-Fit Test: Equal Expected Frequencies
The following information shows the number of employees absent by day of the week at a large manufacturing plant.

Day          Frequency
Monday         120
Tuesday         45
Wednesday       60
Thursday        90
Friday         130
Total          445

At the .05 level of significance, is there a difference in the absence rate by day of the week?

Hypothesis Test
Step 1   H0: There is no difference in absence rate by day of the week
         H1: Absence rates by day are not all equal
         Expected frequency for each day: (120+45+60+90+130)/5 = 89
Step 2   α = 0.05
Step 3   Use the chi-square test. Degrees of freedom: (5 - 1) = 4
Step 4   Reject H0 if χ² > 9.488 (see Appendix I)

… continued
Step 1   H0: The distribution has not changed
         H1: The distribution has changed
Step 2   α = 0.05
Step 3   H0 is rejected if χ² > 7.815, df = 3
Step 4   χ² = 2.3814
Because 2.3814 does not exceed the critical value of 7.815, the null hypothesis is not rejected. The sample does not show that the distribution regarding marital status in Philadelphia is different from the rest of the United States.
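For the absence-by-day example earlier in this section, the value of the test statistic is not shown on the slides reproduced here. A minimal sketch of the arithmetic, using only the frequencies and the critical value given in the example (the variable names are illustrative), might look like this:

```python
# Absence-by-day example; variable names are illustrative.
observed = {
    "Monday": 120,
    "Tuesday": 45,
    "Wednesday": 60,
    "Thursday": 90,
    "Friday": 130,
}

total = sum(observed.values())   # 445 absences in all
k = len(observed)                # 5 categories (days)
expected = total / k             # 89.0 under equal expected frequencies

# Sum of (fo - fe)^2 / fe across the five days.
chi_square = sum((fo - expected) ** 2 / expected for fo in observed.values())

critical_value = 9.488           # .05 level, df = k - 1 = 4, as stated in the example

print(round(chi_square, 2))          # 60.9
print(chi_square > critical_value)   # True
```

Under these numbers the statistic is far above 9.488, so the decision rule in Step 4 would lead to rejecting H0.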
Goodness-of-Fit Test: Normality
… the test investigates if the observed frequencies in a frequency distribution match the theoretical normal distribution
… determine the mean and standard deviation of the frequency distribution
… compute the z-value for the lower class limit and the upper class limit for each class
… determine fe for each category
… use the chi-square goodness-of-fit test to determine if fo coincides with fe

A sample of 500 donations to the Arthritis Foundation is reported in the following frequency distribution. Is it reasonable to conclude that the distribution is normally distributed with a mean of $10 and a standard deviation of $2? Use the .05 significance level.

… continued
To compute fe for the first class, first determine the z value:
\[ z = \frac{X - \mu}{\sigma} = \frac{6 - 10}{2} = -2.00 \]
Now find the probability of a z value less than -2.00:
\[ P(z < -2.00) = 0.5000 - 0.4772 = 0.0228 \]

… continued
The expected frequency is the probability of a z-value less than -2.00 times the sample size:
\[ f_e = (0.0228)(500) = 11.40 \]
The other expected frequencies are computed similarly.

… continued
Amount Spent    fo     Area     fe       (fo - fe)²/fe
…               …      …        11.40      6.49
…
$14             70     .0228    11.40    301.22
Total          500              500      336.33
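The expected frequency for the first class can be checked with Python's standard library. This is a sketch under one assumption: the class limit of $6 is inferred from z = -2.00 with a mean of $10 and a standard deviation of $2, since the full frequency table is not reproduced here. The exact normal area differs slightly from the rounded table value of .0228 used on the slides.

```python
from statistics import NormalDist

mean, sd, n = 10.0, 2.0, 500     # values stated in the example
upper_limit = 6.0                # assumed: mean + (-2.00)(sd), inferred from z = -2.00

z = (upper_limit - mean) / sd    # -2.0
area = NormalDist().cdf(z)       # exact standard normal area below z, roughly 0.02275
fe_exact = area * n              # roughly 11.38 with the unrounded area
fe_table = 0.0228 * n            # the slides' value, using the rounded table area

print(z)
print(round(area, 4))            # 0.0228
print(round(fe_exact, 2))
print(round(fe_table, 2))        # 11.4
```

The same two lines (area, then area times n) would be repeated for each class, using the difference of two cumulative areas for interior classes.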
… continued
Step 1   H0: The observations follow the normal distribution
         H1: The observations do NOT follow the normal distribution
Step 2   α = 0.05
Step 3   H0 is rejected if χ² > 7.815, df = 6 (7.815 is the tabled .05 value for 3 degrees of freedom; the .05 value for 6 degrees of freedom is 12.592, and the computed statistic exceeds either)
Step 4   χ² = 336.33
H0 is rejected. The observations do NOT follow the normal distribution.

A contingency table is used to investigate whether two traits or characteristics are related
… each observation is classified according to two criteria
… the usual hypothesis testing procedure is used
… the degrees of freedom is equal to: (number of rows - 1)(number of columns - 1)
… the expected frequency is computed as:
Expected Frequency = (row total)(column total)/grand total

Is there a relationship between the location of an accident and the gender of the person involved in the accident? A sample of 150 accidents reported to the police was classified by type and gender. At the .05 level of significance, can we conclude that gender and the location of the accident are related?

… continued
                  Location
Sex        Work    Home    Other    Total
Male        60      20      10        90
Female      20      30      10        60
Total       80      50      20       150

The expected frequency for the work-male intersection is computed as (90)(80)/150 = 48. Similarly, you can compute the expected frequencies for the other cells.

… continued
Step 1   H0: Gender and Location are NOT related
         H1: Gender and Location are related
Step 2   α = 0.05
Step 3   H0 is rejected if χ² > 5.991, df = 2 (there are (3 - 1)(2 - 1) = 2 degrees of freedom)
Step 4   Find the value of χ²:
\[ \chi^2 = \frac{(60 - 48)^2}{48} + \cdots + \frac{(10 - 8)^2}{8} = 16.667 \]
H0 is rejected. Gender and Location are related!
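The expected frequencies and the chi-square value for the accident example can be reproduced with a short Python sketch. It uses only the observed counts from the table and the formula expected frequency = (row total)(column total)/grand total; the variable names are illustrative.

```python
# Observed counts from the accident example.
# Rows: Male, Female. Columns: Work, Home, Other.
observed = [
    [60, 20, 10],
    [20, 30, 10],
]

row_totals = [sum(row) for row in observed]          # [90, 60]
col_totals = [sum(col) for col in zip(*observed)]    # [80, 50, 20]
grand_total = sum(row_totals)                        # 150

# Expected frequency = (row total)(column total) / grand total, cell by cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum of (fo - fe)^2 / fe over all six cells.
chi_square = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

# (rows - 1)(columns - 1); the product is the same if the table is transposed.
df = (len(row_totals) - 1) * (len(col_totals) - 1)

print(expected[0][0])             # 48.0, the work-male cell
print(round(chi_square, 3), df)   # 16.667 2
print(chi_square > 5.991)         # True
```

The result matches the value of 16.667 in Step 4, which exceeds the critical value of 5.991.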
Test your learning…
Click on www.mcgrawhill.ca/college/lind, the Online Learning Centre, for:
… quizzes
… extra content
… data sets
… searchable glossary
… access to Statistics Canada's E-Stat data
… and much more!

This completes Chapter 15