Chapter 8 covers sampling methods and the central limit theorem. When you have completed this chapter, you will be able to: explain under what conditions sampling is the proper way to learn something about a population, describe methods for selecting a sample, define and construct a sampling distribution of the sample mean,...
Sampling Methods & Central Limit Theorem

Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.

When you have completed this chapter, you will be able to:
1. Explain under what conditions sampling is the proper way to learn something about a population.
2. Describe methods for selecting a sample.
3. Define and construct a sampling distribution of the sample mean.
4. Explain the central limit theorem.
5. Use the central limit theorem to find probabilities of selecting possible sample means from a specified population.

We use sample information to make decisions or inferences about the population.

Two KEY steps:
1. Choice of a proper method for selecting sample data
2. Proper analysis of the sample data (more later)

KEY 1
If a proper method for selecting the sample is NOT used, the sample will not be truly representative of the total population, and wrong conclusions can be drawn!

Why Sample the Population?

Because…
…it is physically impossible to check all items in the population, and doing so would also be too time-consuming
…studying all the items in a population would NOT be cost effective
…the sample results are usually adequate
…certain tests are destructive in nature
Sampling Techniques

With replacement: each data unit in the population is allowed to appear in the sample more than once.
Without replacement: each data unit in the population is allowed to appear in the sample no more than once.

Probability sampling: each data unit in the population has a known likelihood of being included in the sample.
Non-probability sampling: does not involve random selection; inclusion of an item is based on convenience.

Methods

Simple random: each item (person) in the population has an equal chance of being included.
Systematic random: the items (people) of the population are arranged in some order. A random starting point is selected, and then every kth member of the population is selected for the sample.
Stratified random: a population is first divided into subgroups, called strata, and a sample is selected from each stratum.
Cluster: a population is first divided into primary units, and samples are then selected from the primary units.
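The selection methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered items, the function names, and the seed arguments are all hypothetical and do not come from the slides.

```python
import random

# Hypothetical population: 100 items numbered 1..100 (an assumption for illustration).
population = list(range(1, 101))


def simple_random_sample(items, n, seed=None):
    """Sampling without replacement: every item has an equal chance of inclusion."""
    rng = random.Random(seed)
    return rng.sample(items, n)


def sample_with_replacement(items, n, seed=None):
    """Sampling with replacement: an item may appear more than once."""
    rng = random.Random(seed)
    return rng.choices(items, k=n)


def systematic_sample(items, k, seed=None):
    """Pick a random starting point among the first k items, then every kth item."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return items[start::k]


def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample from each stratum (dict: name -> list of items)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly choose some primary units, then sample within the chosen units."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}


# Hypothetical strata and clusters built from the same population.
strata = {"low": population[:50], "high": population[50:]}
clusters = {f"unit{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, seed=1)
systematic = systematic_sample(population, 10, seed=1)
stratified = stratified_sample(strata, 5, seed=1)
clustered = cluster_sample(clusters, 3, 4, seed=1)
```

Because the draws are random, the selected items change whenever the seed changes; only the sample sizes and the structure of each method stay fixed.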
Terminology

"Sampling error" is the difference between a sample statistic and its corresponding population parameter.

"Sampling distribution of the sample mean" is a probability distribution consisting of all possible sample means of a given sample size selected from a population.

Example

The law firm of Hoya and Associates has five partners. At their weekly partners' meeting, each reported the number of hours they billed their clients last week:

Partner       Hours
Dunn          22
Hardy         26
Kiers         30
Malinowski    26
Tillman       22

If two partners are selected randomly, how many different samples are possible? Choosing 2 partners from 5 without regard to order gives \( {}_5C_2 = \frac{5!}{2!\,3!} = 10 \) possible samples.

Using …

Since this is random number generation, you will get different numbers each time you do this…

Using the Sampling Distribution of the Sample Mean

Data: Suppose it takes an average of 330 minutes for taxpayers to prepare, copy, and mail an income tax return form. A consumer watchdog agency selects a random sample of 40 taxpayers and finds the standard deviation of the time needed is 80 minutes.

What is the standard error of the mean?
Formula:

\[ s_{\bar{X}} = \frac{s}{\sqrt{n}} = \frac{80}{\sqrt{40}} \approx 12.6 \]

Using the Sampling Distribution of the Sample Mean

Data: average of 330 minutes; random sample of 40; standard deviation is 80 minutes.

What is the likelihood the sample mean is greater than 320 minutes?

Formula:

\[ z = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{320 - 330}{80/\sqrt{40}} = -0.79 \]

(Figure: normal curve centred at 330; a1 is the area between 320 and 330.)
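As a cross-check on the standard error and z value above, here is a minimal Python sketch using only the standard library. The variable names are assumptions for illustration; the standard normal table is replaced by `statistics.NormalDist`, so the probability may differ slightly from a table lookup, which rounds z to two decimals.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the taxpayer example.
mu = 330.0      # mean time in minutes
s = 80.0        # standard deviation in minutes
n = 40          # sample size
x_bar = 320.0   # sample mean of interest

# Standard error of the mean: s / sqrt(n)
standard_error = s / sqrt(n)

# z value for a sample mean of 320 minutes
z = (x_bar - mu) / standard_error

# P(sample mean > 320) = 1 - Phi(z), with Phi the standard normal CDF
prob_greater = 1.0 - NormalDist().cdf(z)

print(f"standard error = {standard_error:.1f}")
print(f"z = {z:.2f}")
print(f"P(sample mean > 320) = {prob_greater:.4f}")
```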
Look up 0.79 in the table: a1 = 0.2852 (the area between 320 and 330).

Required area = 0.2852 + 0.5 = 0.7852

Sampling Distribution of Proportion

The normal distribution (a continuous distribution) yields a good approximation of the binomial distribution (a discrete distribution) for large values of n.

Use when np and n(1 − p) are both greater than 5!

Mean and Variance of a Binomial Probability Distribution

Formula:

\[ \mu = np \]

\[ \sigma^2 = np(1 - p) \]

Sampling Distribution of Proportion

A multinational company claims that 55% of its employees are bilingual. To verify this claim, a statistician selected a sample of 60 employees of the company using simple random sampling and found 48% to be bilingual. Based on this information, what can we say about the company's claim?

np = 60(0.55) = 33
n(1 − p) = 60(0.45) = 27

The sample size is big enough to use the normal approximation with a mean of 0.55 and a standard deviation of \( \sqrt{(0.55)(0.45)/60} \approx 0.064 \).

Formula (with \( \hat{p} \) the sample proportion and \( \sigma_{\hat{p}} \) its standard deviation):

\[ z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{0.48 - 0.55}{0.064} = -1.09 \]

Look up 1.09 in the table: a1 = 0.3621 (the area between 0.48 and 0.55).

Required area = 0.5 − 0.3621 = 0.1379, or about 14%
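The normal approximation for the bilingual-employee example can be reproduced with a short Python sketch. The variable names are assumptions for illustration, and `statistics.NormalDist` stands in for the printed table, so small rounding differences from the slide values are to be expected.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the bilingual-employee example.
p_claimed = 0.55   # proportion claimed by the company
n = 60             # sample size
p_sample = 0.48    # proportion observed in the sample

# Rule of thumb from the slides: np and n(1 - p) should both exceed 5.
expected_successes = n * p_claimed
expected_failures = n * (1 - p_claimed)
normal_ok = expected_successes > 5 and expected_failures > 5

# Standard deviation of the sample proportion: sqrt(p(1 - p) / n)
std_error = sqrt(p_claimed * (1 - p_claimed) / n)

# z value of the observed proportion and the lower-tail area below it
z = (p_sample - p_claimed) / std_error
prob_at_or_below = NormalDist().cdf(z)

print(f"normal approximation reasonable: {normal_ok}")
print(f"standard error = {std_error:.3f}")
print(f"z = {z:.2f}")
print(f"P(sample proportion <= 0.48) = {prob_at_or_below:.4f}")
```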
Conclusion

If the company's claim is true, there is approximately a 14% chance of obtaining a sample proportion of 0.48 or lower from a sample of this size.

Sampling Distribution of Mean

Suppose the mean selling price of a litre of gasoline in Canada is $0.659. Further, assume the distribution is positively skewed, with a standard deviation of $0.08. What is the probability of selecting a sample of 35 gasoline stations and finding the sample mean within $0.03 of the population mean?

Data: mean selling price is $0.659; SD of $0.08; sample of 35 gasoline stations. Because the sample is large, the central limit theorem lets us treat the sample mean as approximately normal even though the population is skewed.

Find the z-scores for 0.659 ± 0.03, i.e. 0.629 and 0.689:

\[ z_1 = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{0.629 - 0.659}{0.08/\sqrt{35}} = -2.22 \]

\[ z_2 = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{0.689 - 0.659}{0.08/\sqrt{35}} = 2.22 \]

Find areas from the table:

z1 = −2.22: a1 = 0.4868
z2 = 2.22: a2 = 0.4868

Required area = a1 + a2 = 0.9736
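The interval probability for the gasoline example can be cross-checked in Python, and a small simulation can illustrate the central limit theorem at work. This is a sketch under assumptions: the slides only say the price distribution is positively skewed, so the gamma population used in the simulation, the seed, and the number of trials are hypothetical choices, not part of the original example.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

# Values taken from the gasoline example.
mu = 0.659      # population mean price per litre
sigma = 0.08    # population standard deviation
n = 35          # number of stations sampled
margin = 0.03   # "within $0.03 of the population mean"

# Normal approximation to the sampling distribution of the sample mean.
standard_error = sigma / sqrt(n)
z = margin / standard_error
sampling_dist = NormalDist(mu, standard_error)
prob_within = sampling_dist.cdf(mu + margin) - sampling_dist.cdf(mu - margin)

# Simulation with an ASSUMED positively skewed population: a gamma
# distribution whose shape and scale are chosen to match mu and sigma.
shape = (mu / sigma) ** 2
scale = sigma ** 2 / mu
rng = random.Random(2004)   # arbitrary seed, fixed for repeatability
trials = 5000
hits = 0
for _ in range(trials):
    sample_mean = fmean(rng.gammavariate(shape, scale) for _ in range(n))
    if abs(sample_mean - mu) <= margin:
        hits += 1
simulated_prob = hits / trials

print(f"z = ±{z:.2f}")
print(f"normal approximation: {prob_within:.4f}")
print(f"simulated (gamma population): {simulated_prob:.4f}")
```

The simulated proportion varies with the seed and the assumed population shape, so it should only be read as a rough confirmation of the normal-approximation result.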
We would expect about 97% of the sample means to be within $0.03 of the population mean.

Test your learning…

Click on www.mcgrawhill.ca/college/lind, the Online Learning Centre, for quizzes, extra content, data sets, a searchable glossary, access to Statistics Canada's E-STAT data …and much more!

This completes Chapter 8