DEFINITION 12.2 Margin of Error for the Estimate of p
$E = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$.
What Does It Mean?
The margin of error is equal to half the length of the confidence interval. It represents the precision with which a sample proportion estimates the population proportion at the specified confidence level.
In Example 12.3, the margin of error is
$E = z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n} = 1.96\sqrt{(0.2)(1-0.2)/1010} = 0.025$,
which can also be obtained by taking one-half the length of the confidence interval: $(0.225 - 0.175)/2 = 0.025$. Therefore we can be 95% confident that the error in estimating the proportion, p, of all U.S. employees who play hooky by the proportion, 0.2, of those in the sample who play hooky is at most 0.025, that is, plus or minus 2.5 percentage points.
On the one hand, given a confidence interval, we can find the margin of error by taking half the length of the confidence interval. On the other hand, given the sample proportion and the margin of error, we can determine the confidence interval: its endpoints are $\hat{p} \pm E$.
Most newspaper and magazine polls provide the sample proportion and the margin of error associated with a 95% confidence interval. For example, a survey of U.S. women conducted by Gallup for the CNBC cable network stated, “36% of those polled believe their gender will hurt them; the margin of error for the poll is plus or minus 4 percentage points.”
Translated into our terminology, $\hat{p} = 0.36$ and $E = 0.04$. Thus the confidence interval has endpoints $\hat{p} \pm E = 0.36 \pm 0.04$, or 0.32 to 0.40. As a result, we can be 95% confident that the percentage of all U.S. women who believe that their gender will hurt them is somewhere between 32% and 40%.
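The two directions described above (margin of error from a sample proportion, and interval endpoints from a reported margin of error) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the critical value is taken from the standard library's normal distribution rather than from a z-table, so it is about 1.95996 instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def margin_of_error(p_hat, n, confidence=0.95):
    """Margin of error E = z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    return z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)


def interval_from_margin(p_hat, e):
    """Confidence-interval endpoints p_hat - E and p_hat + E."""
    return p_hat - e, p_hat + e


# Example 12.3: p_hat = 0.2, n = 1010, 95% confidence
e = margin_of_error(0.2, 1010)
print(round(e, 3))                      # 0.025

# Gallup poll: p_hat = 0.36, E = 0.04
low, high = interval_from_margin(0.36, 0.04)
print(round(low, 2), round(high, 2))    # 0.32 0.4
```

Both printed results agree with the values worked out in the text.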
Determining the Required Sample Size
If the margin of error and confidence level are given, then we must determine the sample size required to meet those specifications. Solving for n in the formula for the margin of error, we get
$n = \hat{p}(1-\hat{p})\left(\frac{z_{\alpha/2}}{E}\right)^{2}$. (12.1)
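For readers who want the intermediate algebra, the rearrangement that leads from the margin-of-error formula of Definition 12.2 to Equation (12.1) can be written out as follows. It is a routine manipulation that uses only the symbols already defined: square both sides, then isolate n.

```latex
\begin{align*}
E &= z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \\
E^{2} &= z_{\alpha/2}^{2}\,\frac{\hat{p}(1-\hat{p})}{n} \\
n &= \hat{p}(1-\hat{p})\left(\frac{z_{\alpha/2}}{E}\right)^{2}
\end{align*}
```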
550 CHAPTER 12 Inferences for Population Proportions
This formula cannot be used to obtain the required sample size because the sample proportion, $\hat{p}$, is not known prior to sampling.
There are two ways around this problem. To begin, we examine the graph of $\hat{p}(1-\hat{p})$ versus $\hat{p}$ shown in Fig. 12.1. The graph reveals that the largest $\hat{p}(1-\hat{p})$ can be is 0.25, which occurs when $\hat{p} = 0.5$. The farther $\hat{p}$ is from 0.5, the smaller will be the value of $\hat{p}(1-\hat{p})$.
FIGURE 12.1 Graph of $\hat{p}(1-\hat{p})$ versus $\hat{p}$ [figure omitted: horizontal axis $\hat{p}$ from 0.0 to 1.0, vertical axis $\hat{p}(1-\hat{p})$ from 0 to 0.25, maximum at (0.5, 0.25)]
Because the largest possible value of $\hat{p}(1-\hat{p})$ is 0.25, the most conservative approach for determining sample size is to use that value in Equation (12.1). The sample size obtained then will generally be larger than necessary and the margin of error less than required. Nonetheless, this approach guarantees that the specifications will at least be met.
However, because sampling tends to be time consuming and expensive, we usually do not want to take a larger sample than necessary. If we can make an educated guess for the observed value of $\hat{p}$ (say, from a previous study or theoretical considerations), we can use that guess to obtain a more realistic sample size.
In this same vein, if we have in mind a likely range for the observed value of $\hat{p}$, then, in light of Fig. 12.1, we should take as our educated guess for $\hat{p}$ the value in the range closest to 0.5. In either case, we should be aware that, if the observed value of $\hat{p}$ is closer to 0.5 than is our educated guess, the margin of error will be larger than desired.
FORMULA 12.2 Sample Size for Estimating p
A (1−α)-level confidence interval for a population proportion that has a margin of error of at most E can be obtained by choosing
$n = 0.25\left(\frac{z_{\alpha/2}}{E}\right)^{2}$
rounded up to the nearest whole number. If you can make an educated guess, $\hat{p}_g$ (g for guess), for the observed value of $\hat{p}$, then you should instead choose
$n = \hat{p}_g(1-\hat{p}_g)\left(\frac{z_{\alpha/2}}{E}\right)^{2}$
rounded up to the nearest whole number.
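Formula 12.2 translates directly into code. The sketch below is one possible rendering, not a prescribed implementation: the function name and parameter names are hypothetical, and the critical value comes from `statistics.NormalDist` (unrounded, about 1.95996), so in rare borderline cases the result could differ by one from a computation that uses a z-table value such as 1.96.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(e, confidence=0.95, p_guess=None):
    """Sample size from Formula 12.2, rounded up to the nearest whole number.

    e          -- desired margin of error (as a proportion, e.g. 0.03)
    confidence -- confidence level 1 - alpha
    p_guess    -- educated guess for the sample proportion, or None to use
                  the conservative value 0.25 for p_hat * (1 - p_hat)
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    spread = 0.25 if p_guess is None else p_guess * (1 - p_guess)
    return ceil(spread * (z / e) ** 2)


# Hypothetical inputs: margin of error 0.03 at the 95% level
print(required_sample_size(0.03))                # 1068
print(required_sample_size(0.03, p_guess=0.2))   # 683
```

Supplying a guess away from 0.5 shrinks the required sample, which is the trade-off the formula is meant to expose.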
EXAMPLE 12.4 Sample Size for Estimating p
Playing Hooky From Work Consider again the problem of estimating the proportion of all U.S. employees who play hooky.
a. Obtain a sample size that will ensure a margin of error of at most 0.01 for a 95% confidence interval.
b. Find a 95% confidence interval for p if, for a sample of the size determined in part (a), the proportion of those who play hooky is 0.194.
c. Determine the margin of error for the estimate in part (b), and compare it to the margin of error specified in part (a).
d. Repeat parts (a)–(c) if the proportion of those sampled who play hooky can reasonably be presumed to be between 0.1 and 0.3.
e. Compare the results obtained in parts (a)–(c) with those obtained in part (d).
Solution
a. We apply the first equation in Formula 12.2. To do so, we must identify $z_{\alpha/2}$ and the margin of error, E. The confidence level is stipulated to be 0.95, so $z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96$, and the margin of error is specified at 0.01.
Thus a sample size that will ensure a margin of error of at most 0.01 for a 95% confidence interval is
$n = 0.25\left(\frac{z_{\alpha/2}}{E}\right)^{2} = 0.25\left(\frac{1.96}{0.01}\right)^{2} = 9604$.
Interpretation If we take a sample of 9604 U.S. employees, the margin of error for our estimate of the proportion of all U.S. employees who play hooky will be 0.01 or less—that is, plus or minus at most 1 percentage point.
b. We find, by applying Procedure 12.1 (page 548) with α = 0.05, n = 9604, and $\hat{p} = 0.194$, that a 95% confidence interval for p has endpoints $0.194 \pm 1.96\sqrt{(0.194)(1-0.194)/9604}$, or $0.194 \pm 0.008$, or 0.186 to 0.202.
Interpretation Based on a sample of 9604 U.S. employees, we can be 95% confident that the percentage of all U.S. employees who play hooky is somewhere between 18.6% and 20.2%.
c. The margin of error for the estimate in part (b) is 0.008. Not surprisingly, this is less than the margin of error of 0.01 specified in part (a).
d. If we can reasonably presume that the proportion of those sampled who play hooky will be between 0.1 and 0.3, we use the second equation in Formula 12.2, with $\hat{p}_g = 0.3$ (the value in the range closest to 0.5), to determine the sample size:
$n = \hat{p}_g(1-\hat{p}_g)\left(\frac{z_{\alpha/2}}{E}\right)^{2} = (0.3)(1-0.3)\left(\frac{1.96}{0.01}\right)^{2} = 8068$ (rounded up).
Applying Procedure 12.1 with α = 0.05, n = 8068, and $\hat{p} = 0.194$, we find that a 95% confidence interval for p has endpoints
$0.194 \pm 1.96\sqrt{(0.194)(1-0.194)/8068}$, or $0.194 \pm 0.009$, or 0.185 to 0.203.
Interpretation Based on a sample of 8068 U.S. employees, we can be 95% confident that the percentage of all U.S. employees who play hooky is somewhere between 18.5% and 20.3%. The margin of error for the estimate is 0.009.
e. By using the educated guess for $\hat{p}$ in part (d), we reduced the required sample size by more than 1500 (from 9604 to 8068). Moreover, only 0.1 percentage point (0.001) of precision was lost: the margin of error rose from 0.008 to 0.009. The risk of using the guess 0.3 for $\hat{p}$ is that, if the observed value of $\hat{p}$ had turned out to be larger than 0.3 (but smaller than 0.7), the achieved margin of error would have exceeded the specified 0.01.
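The risk described in part (e) can be made concrete by computing the margin of error actually achieved for several observed sample proportions under each of the two sample sizes. The following is a small illustrative sketch; the function name `achieved_margin` and the particular values of the observed proportion beyond 0.194 are assumptions chosen for illustration, and the critical value is the unrounded one from the standard library.

```python
from math import sqrt
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)  # approximately 1.96


def achieved_margin(p_hat, n, z=Z_95):
    """Margin of error actually obtained for an observed p_hat and sample size n."""
    return z * sqrt(p_hat * (1 - p_hat) / n)


target = 0.01
for n in (9604, 8068):              # sample sizes from parts (a) and (d)
    for p_hat in (0.194, 0.3, 0.4, 0.5):
        e = achieved_margin(p_hat, n)
        status = "meets" if e <= target else "exceeds"
        print(f"n={n}, p_hat={p_hat}: E={e:.4f} ({status} target {target})")
```

With n = 9604 the target is met for every observed proportion, whereas with n = 8068 it is met only when the observed proportion is no closer to 0.5 than the guess of 0.3.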
Exercise 12.33 on page 555
The One-Proportion Plus-Four z-Interval Procedure
The confidence interval for a population proportion presented in Procedure 12.1 on page 548 does not always provide reasonably good accuracy, even for relatively large samples. As a consequence, more accurate methods have been developed. One such method is called the one-proportion plus-four z-interval procedure.†
†See “Approximate Is Better than ‘Exact’ for Interval Estimation of Binomial Proportions” (The American Statistician, Vol. 52, No. 2, pp. 119–126) by A. Agresti and B. Coull, and “Simple and Effective Confidence Intervals for Proportions and Differences of Proportions Result from Adding Two Successes and Two Failures” (The American Statistician, Vol. 54, No. 4, pp. 280–288) by A. Agresti and B. Caffo.
To obtain a plus-four z-interval for a population proportion, we first add two successes and two failures to our data (hence, the term “plus four”) and then apply Procedure 12.1 to the new data. In other words, in place of $\hat{p}$ (which is x/n), we use $\tilde{p} = (x+2)/(n+4)$. Thus, for a confidence level of 1 − α, the plus-four z-interval is from
$\tilde{p} - z_{\alpha/2}\sqrt{\tilde{p}(1-\tilde{p})/(n+4)}$ to $\tilde{p} + z_{\alpha/2}\sqrt{\tilde{p}(1-\tilde{p})/(n+4)}$.
As a rule of thumb, the one-proportion plus-four z-interval procedure should be used only with confidence levels of 90% or greater and sample sizes of 10 or more.
Exercises 12.47–12.56 provide practice with the one-proportion plus-four z-interval procedure.
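The plus-four computation is short enough to sketch in code. The version below is a minimal illustration under stated assumptions: the function name is hypothetical, the rule of thumb from the text is enforced by raising an error, the critical value is the unrounded one from `statistics.NormalDist`, and the data in the usage lines (12 successes in 30 trials) are made up.

```python
from math import sqrt
from statistics import NormalDist


def plus_four_interval(x, n, confidence=0.95):
    """Plus-four z-interval for p from x successes in n trials."""
    # Rule of thumb from the text: confidence >= 90% and n >= 10
    if confidence < 0.90 or n < 10:
        raise ValueError("plus-four interval not recommended for these inputs")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    p_tilde = (x + 2) / (n + 4)                          # add 2 successes, 2 failures
    half_width = z * sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde - half_width, p_tilde + half_width


# Hypothetical data: 12 successes in 30 trials, 95% confidence
low, high = plus_four_interval(12, 30)
print(f"{low:.3f} to {high:.3f}")
```

Note that the interval is centered at (x + 2)/(n + 4) rather than at x/n, so it is pulled slightly toward 0.5 compared with the interval from Procedure 12.1.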
THE TECHNOLOGY CENTER
Most statistical technologies have programs that automatically perform the one-proportion z-interval procedure. In this subsection, we present output and step-by-step instructions for such programs.
EXAMPLE 12.5 Using Technology to Obtain a One-Proportion z-Interval
Playing Hooky From Work Of 1010 randomly selected U.S. employees asked whether they play hooky from work, 202 said they do. Use Minitab, Excel, or the TI-83/84 Plus to find a 95% confidence interval for the proportion, p, of all U.S. employees who play hooky.
Solution We applied the one-proportion z-interval programs to the data, resulting in Output 12.2. Steps for generating that output are presented in Instructions 12.1.
OUTPUT 12.2 One-proportion z-interval on the data on playing hooky from work [output panels for Minitab, Excel, and TI-83/84 Plus omitted]
As shown in Output 12.2, the required 95% confidence interval is from 0.175 to 0.225. We can be 95% confident that the percentage of all U.S. employees who play hooky is somewhere between 17.5% and 22.5%.
INSTRUCTIONS 12.1 Steps for generating Output 12.2
MINITAB
1 Choose Stat ➤ Basic Statistics ➤ 1 Proportion...
2 Select the Summarized data option button
3 Click in the Number of events text box and type 202
4 Click in the Number of trials text box and type 1010
5 Click the Options... button
6 Click in the Confidence level text box and type 95
7 Check the Use test and interval based on normal distribution check box
8 Click OK twice
EXCEL
1 Store the sample size, 1010, and the number of successes, 202, in ranges named n and x, respectively
2 Choose DDXL ➤ Confidence Intervals
3 Select Summ 1 Var Prop Interval from the Function type drop-down list box
4 Specify x in the Num Successes text box
5 Specify n in the Num Trials text box
6 Click OK
7 Click the 95% button
8 Click the Compute Interval button
TI-83/84 PLUS
1 Press STAT, arrow over to TESTS, and press ALPHA ➤ A
2 Type 202 for x and press ENTER
3 Type 1010 for n and press ENTER
4 Type .95 for C-Level and press ENTER twice
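For readers working in a general-purpose language rather than one of the three packages above, the same interval can be obtained with a few lines of Python. This is not part of Instructions 12.1; it is a hedged sketch that uses only the standard library, with a hypothetical function name, and it takes the same three inputs the packages ask for (number of successes, number of trials, confidence level).

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(x, n, confidence=0.95):
    """z-interval for p from x successes in n trials (normal approximation)."""
    p_hat = x / n                                        # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    e = z * sqrt(p_hat * (1 - p_hat) / n)                # margin of error
    return p_hat - e, p_hat + e


# Example 12.5: 202 successes in 1010 trials, 95% confidence
low, high = one_proportion_z_interval(202, 1010, 0.95)
print(f"{low:.3f} to {high:.3f}")  # 0.175 to 0.225
```

The result agrees with the interval reported for Output 12.2.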
Exercises 12.1
Understanding the Concepts and Skills
12.1 In a newspaper or magazine of your choice, find a statistical study that contains an estimated population proportion.
12.2 Why is statistical inference generally used to obtain information about a population proportion?
12.3 Is a population proportion a parameter or a statistic? What about a sample proportion? Explain your answers.
12.4 Answer the following questions about the basic notation and terminology for proportions.
a. What is a population proportion?
b. What symbol is used for a population proportion?
c. What is a sample proportion?
d. What symbol is used for a sample proportion?
e. For what is the phrase “number of successes” an abbreviation?
What symbol is used for the number of successes?
f. For what is the phrase “number of failures” an abbreviation?
g. Explain the relationships among the sample proportion, the number of successes, and the sample size.
12.5 This exercise involves the use of an unrealistically small population to provide a concrete illustration for the exact distribution of a sample proportion. A population consists of three men and two women. The first names of the men are Jose, Pete, and Carlo; the first names of the women are Gail and Frances. Suppose that the specified attribute is “female.”
a. Determine the population proportion, p.
b. The first column of the following table provides the possible samples of size 2, where each person is represented by the first letter of his or her first name; the second column gives the number of successes (the number of females obtained) for each sample; and the third column shows the sample proportion. Complete the table.
c. Construct a dotplot for the sampling distribution of the proportion for samples of size 2. Mark the position of the population proportion on the dotplot.

Sample    Number of females, x    Sample proportion, $\hat{p}$
J, G      1                       0.5
J, P      0                       0.0
J, C      0                       0.0
J, F      1                       0.5
G, P
G, C
G, F
P, C
P, F
C, F

d. Use the third column of the table to obtain the mean of the variable $\hat{p}$.
e. Compare your answers from parts (a) and (d). Why are they the same?
12.6 Repeat parts (b)–(e) of Exercise 12.5 for samples of size 1.
12.7 Repeat parts (b)–(e) of Exercise 12.5 for samples of size 3.
(There are 10 possible samples.)
12.8 Repeat parts (b)–(e) of Exercise 12.5 for samples of size 4.
(There are five possible samples.)
12.9 Repeat parts (b)–(e) of Exercise 12.5 for samples of size 5.
12.10 Prerequisite to this exercise are Exercises 12.5–12.9. What do your graphs in parts (c) of those exercises illustrate about the impact of increasing sample size on sampling error? Explain your answer.
12.11 NBA Draft Picks. From Wikipedia’s on-line document “List of First Overall NBA Draft Picks,” we found that, since 1947, 11.3% of the number-one draft picks in the National Basketball Association have been other than U.S. nationals.
a. Identify the population.
b. Identify the specified attribute.
c. Is the proportion 0.113 (11.3%) a population proportion or a sample proportion? Explain your answer.
12.12 Staying Single. According to an article in Time magazine, women are staying single longer these days, by choice. In 1963, 83% of women in the United States between the ages of 25 and 54 years were married, compared to 67% in 2007. For 2007,
a. identify the population.
b. identify the specified attribute.
c. Under what circumstances is the proportion 0.67 a population proportion? a sample proportion? Explain your answers.
12.13 Random Drug Testing. A Harris Poll asked Americans whether states should be allowed to conduct random drug tests on elected officials. Of 21,355 respondents, 79% said “yes.”
a. Determine the margin of error for a 99% confidence interval.
b. Without doing any calculations, indicate whether the margin of error is larger or smaller for a 90% confidence interval.
Explain your answer.
12.14 Genetic Binge Eating. According to an article in Science News, binge eating has been associated with a mutation of the gene for a brain protein called melanocortin 4 receptor (MC4R). In one study, F. Horber of the Hirslanden Clinic in Zurich and his colleagues genetically analyzed the blood of 469 obese people and found that 24 carried a mutated MC4R gene. Suppose that you want to estimate the proportion of all obese people who carry a mutated MC4R gene.
a. Determine the margin of error for a 90% confidence interval.
b. Without doing any calculations, indicate whether the margin of error is larger or smaller for a 95% confidence interval. Explain your answer.
12.15 In each of parts (a)–(c), we have given a likely range for the observed value of a sample proportion $\hat{p}$. Based on the given range, identify the educated guess that should be used for the observed value of $\hat{p}$ to calculate the required sample size for a prescribed confidence level and margin of error.
a. 0.2 to 0.4
b. 0.2 or less
c. 0.4 or greater
d. In each of parts (a)–(c), which observed values of the sample proportion will yield a larger margin of error than the one specified if the educated guess is used for the sample-size computation?
12.16 In each of parts (a)–(c), we have given a likely range for the observed value of a sample proportion $\hat{p}$. Based on the given range, identify the educated guess that should be used for the observed value of $\hat{p}$ to calculate the required sample size for a prescribed confidence level and margin of error.
a. 0.4 to 0.7
b. 0.7 or greater
c. 0.7 or less
d. In each of parts (a)–(c), which observed values of the sample proportion will yield a larger margin of error than the one specified if the educated guess is used for the sample-size computation?
In each of Exercises 12.17–12.22, we have given the number of successes and the sample size for a simple random sample from a population. In each case, do the following tasks.
a. Determine the sample proportion.
b. Decide whether using the one-proportion z-interval procedure is appropriate.
c. If appropriate, use the one-proportion z-interval procedure to find the confidence interval at the specified confidence level.
12.17 x=8,n=40, 95% level.
12.18 x=10,n=40, 90% level.
12.19 x=35,n=50, 99% level.
12.20 x=40,n=50, 95% level.
12.21 x=16,n=20, 90% level.
12.22 x=3,n=100, 99% level.
In Exercises 12.23–12.28, use Procedure 12.1 on page 548 to find the required confidence interval. Be sure to check the conditions for using that procedure.
12.23 Shopping Online. An issue of Time Style and Design reported on a poll conducted by Schulman Ronca & Bucuvalas Public Affairs about the shopping habits of wealthy Americans. A total of 603 interviews were conducted among a national sample of adults with household incomes of at least $150,000. Of the adults interviewed, 410 said they had purchased clothing, accessories, or books online in the past year. Find a 95% confidence interval for the proportion of all U.S. adults with household incomes of at least $150,000 who purchased clothing, accessories, or books online in the past year.
12.24 Life Support. In 2005, the Terri Schiavo case focused national attention on the issue of withdrawal of life support from terminally ill patients or those in a vegetative state. A Harris Poll of 1010 U.S. adults was conducted by telephone on April 5–10, 2005. Of those surveyed, 140 had experienced the death of at least one family member or close friend within the last 10 years who died after the removal of life support. Find a 90% confidence interval for the proportion of all U.S. adults who had experienced the death of at least one family member or close friend within the last 10 years after life support had been withdrawn.
12.25 Asthmatics and Sulfites. In the article “Explaining an Unusual Allergy,” appearing on the Everyday Health Network, Dr. A. Feldweg explained that allergy to sulfites is usually seen in patients with asthma. The typical reaction is a sudden increase in asthma symptoms after eating a food containing sulfites. Studies are performed to estimate the percentage of the nation’s 10 million asthmatics who are allergic to sulfites. In one survey, 38 of 500 randomly selected U.S. asthmatics were found to be allergic to sulfites.
a. Find a 95% confidence interval for the proportion, p, of all U.S. asthmatics who are allergic to sulfites.
b. Interpret your result from part (a).
12.26 Drinking Habits. A Reader’s Digest/Gallup Survey on the drinking habits of Americans estimated the percentage of adults across the country who drink beer, wine, or hard liquor at least occasionally. Of the 1516 adults interviewed, 985 said that they drank.
a. Determine a 95% confidence interval for the proportion, p, of all Americans who drink beer, wine, or hard liquor at least occasionally.
b. Interpret your result from part (a).
12.27 Factory Farming Funk. The U.S. Environmental Protection Agency recently reported that confined animal feeding operations (CAFOs) dump 2 trillion pounds of waste into the environment annually, contaminating the ground water in 17 states and polluting more than 35,000 miles of our nation’s rivers. In a survey of 1000 registered voters by Snell, Perry and Associates, 80% favored the creation of standards to limit such pollution and, in general, viewed CAFOs unfavorably.
a. Find a 99% confidence interval for the percentage of all registered voters who favor the creation of standards on CAFO pollution and, in general, view CAFOs unfavorably.
b. Interpret your answer in part (a).
12.28 The Nipah Virus. From fall 1998 through mid 1999, Malaysia was the site of an encephalitis outbreak caused by the Nipah virus, a paramyxovirus that appears to spread from pigs to workers on pig farms. As reported by K. Goh et al. in the paper “Clinical Features of Nipah Virus Encephalitis among Pig Farmers in Malaysia” (New England Journal of Medicine, Vol. 342, No. 17, pp. 1229–1235), neurologists from the University of Malaysia found that, among 94 patients infected with the Nipah virus, 30 died from encephalitis.
a. Find a 90% confidence interval for the percentage of Malaysians infected with the Nipah virus who will die from encephalitis.
b. Interpret your answer in part (a).
12.29 Literate Adults. Suppose that you have been hired to estimate the percentage of adults in your state who are literate. You take a random sample of 100 adults and find that 96 are literate. You then obtain a 95% confidence interval of
$0.96 \pm 1.96\sqrt{(0.96)(0.04)/100}$,
or 0.922 to 0.998. From it you conclude that you can be 95% confident that the percentage of all adults in your state who are literate is somewhere between 92.2% and 99.8%. Is anything wrong with this reasoning?
12.30 IMR in Singapore. The infant mortality rate (IMR) is the number of infant deaths per 1000 live births. Suppose that you have been commissioned to estimate the IMR in Singapore.
From a random sample of 1109 live births in Singapore, you find that 0.361% of them resulted in infant deaths. You next find a 90% confidence interval:
$0.00361 \pm 1.645\sqrt{(0.00361)(0.99639)/1109}$,
or 0.000647 to 0.00657. You then conclude, “I can be 90% confident that the IMR in Singapore is somewhere between 0.647 and 6.57.” How did you do?
12.31 Warming to Russia. An ABCNEWS Poll found that Americans now have relatively warm feelings toward Russia, a former adversary. The poll, conducted by telephone among a random sample of 1043 adults, found that 647 of those sampled consider the two countries friends. The margin of error for the poll was plus or minus 2.9 percentage points (for a 0.95 confidence level). Use this information to obtain a 95% confidence interval for the percentage of all Americans who consider the two countries friends.
12.32 Online Tax Returns. According to the Internal Revenue Service, among people entitled to tax refunds, those who file online receive their refunds twice as fast as paper filers. A study conducted by International Communications Research (ICR) of Media, Pennsylvania, found that 57% of those polled said that they are not worried about the privacy of their financial information when filing their tax returns online. The telephone survey of 1002 people had a margin of error of plus or minus 3 percentage points (for a 0.95 confidence level). Use this information to determine a 95% confidence interval for the percentage of all people who are not worried about the privacy of their financial information when filing their tax returns online.
12.33 Asthmatics and Sulfites. Refer to Exercise 12.25.
a. Determine the margin of error for the estimate of p.
b. Obtain a sample size that will ensure a margin of error of at most 0.01 for a 95% confidence interval without making a guess for the observed value of $\hat{p}$.
c. Find a 95% confidence interval for p if, for a sample of the size determined in part (b), the proportion of asthmatics sampled who are allergic to sulfites is 0.071.
d. Determine the margin of error for the estimate in part (c) and compare it to the margin of error specified in part (b).
e. Repeat parts (b)–(d) if you can reasonably presume that the proportion of asthmatics sampled who are allergic to sulfites will be at most 0.10.
f. Compare the results you obtained in parts (b)–(d) with those obtained in part (e).
12.34 Drinking Habits. Refer to Exercise 12.26.
a. Find the margin of error for the estimate of p.
b. Obtain a sample size that will ensure a margin of error of at most 0.02 for a 95% confidence interval without making a guess for the observed value of $\hat{p}$.
c. Find a 95% confidence interval for p if, for a sample of the size determined in part (b), 63% of those sampled drink alcoholic beverages.
d. Determine the margin of error for the estimate in part (c) and compare it to the margin of error specified in part (b).
e. Repeat parts (b)–(d) if you can reasonably presume that the percentage of adults sampled who drink alcoholic beverages will be at least 60%.
f. Compare the results you obtained in parts (b)–(d) with those obtained in part (e).
12.35 Factory Farming Funk. Refer to Exercise 12.27.
a. Determine the margin of error for the estimate of the percentage.
b. Obtain a sample size that will ensure a margin of error of at most 1.5 percentage points for a 99% confidence interval without making a guess for the observed value of $\hat{p}$.
c. Find a 99% confidence interval for p if, for a sample of the size determined in part (b), 82.2% of the registered voters sampled favor the creation of standards on CAFO pollution and, in general, view CAFOs unfavorably.
d. Determine the margin of error for the estimate in part (c) and compare it to the margin of error specified in part (b).
e. Repeat parts (b)–(d) if you can reasonably presume that the percentage of registered voters sampled who favor the creation of standards on CAFO pollution and, in general, view CAFOs unfavorably will be between 75% and 85%.
f. Compare the results you obtained in parts (b)–(d) with those obtained in part (e).
12.36 The Nipah Virus. Refer to Exercise 12.28.
a. Find the margin of error for the estimate of the percentage.
b. Obtain a sample size that will ensure a margin of error of at most 5 percentage points for a 90% confidence interval without making a guess for the observed value of $\hat{p}$.
c. Find a 90% confidence interval for p if, for a sample of the size determined in part (b), 28.8% of the sampled Malaysians infected with the Nipah virus die from encephalitis.
d. Determine the margin of error for the estimate in part (c) and compare it to the margin of error specified in part (b).
e. Repeat parts (b)–(d) if you can reasonably presume that the percentage of sampled Malaysians infected with the Nipah