In this chapter, we developed several inferential procedures for comparing the means of two populations. Table 10.15 summarizes the hypothesis-testing procedures;
confidence-interval procedures would have a similar table.
*All previous sections in this chapter, including the material on the Mann–Whitney test and paired Wilcoxon signed-rank test, are prerequisite to this section.
TABLE 10.15 Summary of hypothesis-testing procedures for comparing two population means. The null hypothesis for all tests is H0: μ1 = μ2.

| Type | Assumptions | Test statistic | Procedure to use |
|---|---|---|---|
| Pooled t-test | 1. Simple random samples; 2. Independent samples; 3. Normal populations or large samples; 4. Equal population standard deviations | $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{(1/n_1) + (1/n_2)}}$ † ($\text{df} = n_1 + n_2 - 2$) | 10.1 (page 441) |
| Nonpooled t-test | 1. Simple random samples; 2. Independent samples; 3. Normal populations or large samples | $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{(s_1^2/n_1) + (s_2^2/n_2)}}$ ‡ | 10.3 (page 453) |
| Mann–Whitney test | 1. Simple random samples; 2. Independent samples; 3. Same-shape populations | M = sum of the ranks for sample data from Population 1 | 10.5 (page 468) |
| Paired t-test | 1. Simple random paired sample; 2. Normal differences or large sample | $t = \dfrac{\bar{d}}{s_d/\sqrt{n}}$ ($\text{df} = n - 1$) | 10.6 (page 481) |
| Paired W-test | 1. Simple random paired sample; 2. Symmetric differences | W = sum of positive ranks | 10.8 (page 492) |

† $s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

‡ $\text{df} = \dfrac{\left[(s_1^2/n_1) + (s_2^2/n_2)\right]^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$
Each row of Table 10.15 gives the type of test, the conditions required for using the test, the test statistic, and the procedure to use. For brevity, we have written “paired W-test” instead of “paired Wilcoxon signed-rank test.” As before, we have used the following abbreviations:
• normal populations: the two distributions of the variable under consideration are normally distributed;
• same-shape populations: the two distributions of the variable under consideration have the same shape;
• normal differences: the paired-difference variable is normally distributed;
• symmetric differences: the paired-difference variable has a symmetric distribution.
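The three t statistics in Table 10.15 can be sketched in Python from summary statistics. This is a minimal illustration, not one of the chapter's procedures: the function names `pooled_t`, `nonpooled_t`, and `paired_t` are hypothetical, and each function returns only the value of the test statistic and its degrees of freedom, without a critical value or P-value.

```python
import math


def pooled_t(xbar1, s1, n1, xbar2, s2, n2):
    """Pooled t statistic and df = n1 + n2 - 2 (assumes equal population SDs)."""
    # Pooled sample standard deviation s_p (first footnote of Table 10.15).
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = (xbar1 - xbar2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def nonpooled_t(xbar1, s1, n1, xbar2, s2, n2):
    """Nonpooled t statistic and its df (second footnote of Table 10.15)."""
    a, b = s1**2 / n1, s2**2 / n2
    t = (xbar1 - xbar2) / math.sqrt(a + b)
    # The df is returned unrounded; any rounding rule is left to the caller.
    df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return t, df


def paired_t(dbar, sd, n):
    """Paired t statistic and df = n - 1, from the mean and SD of the differences."""
    return dbar / (sd / math.sqrt(n)), n - 1


# Illustrative (made-up) summary statistics with equal sample sizes and equal
# sample standard deviations, for which the two independent-samples statistics agree.
t_pooled, df_pooled = pooled_t(10.0, 2.0, 10, 8.0, 2.0, 10)
t_nonpooled, df_nonpooled = nonpooled_t(10.0, 2.0, 10, 8.0, 2.0, 10)
print(t_pooled, df_pooled)
print(t_nonpooled, df_nonpooled)
```

When the sample sizes and sample standard deviations are equal, the pooled and nonpooled statistics coincide; they differ once the sample standard deviations or sample sizes differ.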
In selecting the correct procedure, keep in mind that the best choice is the procedure expressly designed for the types of distributions under consideration, if such a procedure exists, and that the three t-tests are only approximately correct for large samples from nonnormal populations.
For instance, suppose that independent simple random samples are taken from two populations with equal standard deviations and that the two distributions (one for each population) of the variable under consideration are normally distributed. Although the pooled t-test, nonpooled t-test, and Mann–Whitney test are all applicable, the correct procedure is the pooled t-test because it is designed specifically for use with independent samples from two normally distributed populations that have equal standard deviations.
The flowchart in Fig. 10.19 (next page) provides an organized strategy for choosing the correct hypothesis-testing procedure for comparing two population means.
You should examine the sample data to settle on distribution type before choosing a procedure. We recommend using normal probability plots and either stem-and-leaf diagrams (for small or moderate-size samples) or histograms (for moderate-size or large samples); boxplots can also be quite helpful, especially for moderate-size or large samples.
502 CHAPTER 10 Inferences for Two Population Means
FIGURE 10.19 Flowchart for choosing the correct hypothesis-testing procedure for comparing two population means
[Flowchart. From Start, the first question is "Paired sample?" For independent samples, the questions are "Normal populations?", "Same shape?", "Large samples?", and "Equal std. devs.?" For a paired sample, the questions are "Normal differences?", "Symmetric differences?", and "Large sample?" The possible outcomes are: Use the pooled t-test; Use the nonpooled t-test; Use the Mann–Whitney test; Use the paired t-test; Use the paired W-test; Requires a procedure not covered here.]
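The decision strategy of Fig. 10.19 can be sketched as a small Python function. This is only a sketch under stated assumptions: the function name `choose_procedure` and its parameters are hypothetical, and the order of the questions is inferred from the surrounding text (normality first, then shape or symmetry, then sample size). In particular, sending a "Yes" answer for large independent samples on to the equal-standard-deviations question is an assumption.

```python
def choose_procedure(paired, normal, large, equal_sd=False,
                     same_shape=False, symmetric=False):
    """Suggest a test from yes/no answers to the flowchart questions.

    paired     -- is the sample a paired sample?
    normal     -- normal populations (independent) or normal differences (paired)?
    large      -- large samples (independent) or large sample (paired)?
    equal_sd   -- equal population standard deviations? (independent samples only)
    same_shape -- same-shape populations? (independent samples only)
    symmetric  -- symmetric differences? (paired sample only)
    """
    not_covered = "requires a procedure not covered here"

    if paired:
        if normal:
            return "paired t-test"
        if symmetric:
            return "paired W-test"
        return "paired t-test" if large else not_covered

    # Independent samples: the t-test variant depends on the standard deviations.
    # ASSUMPTION: the large-samples branch also asks the equal-SD question.
    t_test = "pooled t-test" if equal_sd else "nonpooled t-test"
    if normal:
        return t_test
    if same_shape:
        return "Mann-Whitney test"
    return t_test if large else not_covered


# Independent samples, normal populations, unequal standard deviations.
print(choose_procedure(paired=False, normal=True, large=False, equal_sd=False))
```

The answers passed to the function are judgments made from the sample data (normal probability plots, boxplots, sample standard deviations); the function only encodes the order in which the questions are asked.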
EXAMPLE 10.21 Choosing the Correct Hypothesis-Testing Procedure
Skinfold Thickness A study titled “Body Composition of Elite Class Distance Runners” was conducted by M. Pollock et al. to determine whether elite distance runners are thinner than other people. Their results were published in The Marathon: Physiological, Medical, Epidemiological, and Psychological Studies, P. Milvey (ed.), New York: New York Academy of Sciences, p. 366.
The researchers measured skinfold thickness (an indirect indicator of body fat) of runners and nonrunners in the same age group. The data in Table 10.16 are based on the skinfold-thickness measurements on the thighs of the people sampled.
TABLE 10.16 Skinfold thickness (mm) for independent samples of elite runners and others

| Runners | Others |
|---|---|
| 7.3  6.7  8.7 | 24.0  19.9  7.5  18.4 |
| 3.0  5.1  8.8 | 28.0  29.4  20.3  19.0 |
| 7.8  3.8  6.2 | 9.3  18.1  22.8  24.2 |
| 5.4  6.4  6.3 | 9.6  19.4  16.3  16.3 |
| 3.7  7.5  4.6 | 12.4  5.2  12.2  15.6 |
Suppose that we want to use the sample data to decide whether elite runners have smaller skinfold thickness, on average, than other people. Let μ1 denote the mean skinfold thickness of elite runners and let μ2 denote the mean skinfold thickness of others. We want to perform the hypothesis test
H0: μ1 = μ2 (mean skinfold thickness is not smaller)
Ha: μ1 < μ2 (mean skinfold thickness is smaller).
Which procedure should we use to perform the hypothesis test?
Solution We begin by drawing normal probability plots and boxplots of the data, as shown in Figs. 10.20 (below) and 10.21 (next page), respectively.
FIGURE 10.20 Normal probability plots of the sample data for (a) elite runners and (b) others
[Two panels, (a) Runners and (b) Others; horizontal axis: Thickness (mm); vertical axis: Normal score.]
Next we consult the flowchart in Fig. 10.19. The answer to the first question (paired sample?) is “No.” This “No” answer leads to the question, Are the populations normal? The normal probability plots in Fig. 10.20 are linear, so the answer to the second question is probably “Yes.”
This “Yes” answer leads to the question, Are the population standard deviations equal? The standard deviations of the two samples are 1.80 mm and 6.61 mm, respectively. These statistics and the boxplots in Fig. 10.21 both suggest that the answer to the third question is probably “No.”
FIGURE 10.21 Boxplots of the sample data for elite runners and others
[Boxplots for Runners and Others; horizontal axis: Thickness (mm).]
This “No” answer leads us to the statement, Use the nonpooled t-test. Therefore, we should use Procedure 10.3 to conduct the hypothesis test.
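The sample standard deviations quoted in the solution can be checked in Python from the data in Table 10.16, along with the value of the nonpooled t statistic. This is a sketch only: the split into 15 runner values and 20 other values is read off the flattened table and is an assumption, although it does reproduce the quoted standard deviations. The code stops at the test statistic and its degrees of freedom and does not carry out Procedure 10.3.

```python
import math
from statistics import mean, stdev

# ASSUMPTION: first three values of each table row are runners, last four are others.
runners = [7.3, 6.7, 8.7, 3.0, 5.1, 8.8, 7.8, 3.8, 6.2,
           5.4, 6.4, 6.3, 3.7, 7.5, 4.6]
others = [24.0, 19.9, 7.5, 18.4, 28.0, 29.4, 20.3, 19.0, 9.3, 18.1,
          22.8, 24.2, 9.6, 19.4, 16.3, 16.3, 12.4, 5.2, 12.2, 15.6]

n1, n2 = len(runners), len(others)
s1, s2 = stdev(runners), stdev(others)  # sample standard deviations

# Nonpooled t statistic and its degrees of freedom (Table 10.15).
a, b = s1**2 / n1, s2**2 / n2
t = (mean(runners) - mean(others)) / math.sqrt(a + b)
df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

print(f"s1 = {s1:.2f} mm, s2 = {s2:.2f} mm")
print(f"t = {t:.2f}, df = {df:.2f}")
```

Because the alternative hypothesis is μ1 < μ2, the test is left-tailed, so a large negative value of t is evidence against the null hypothesis.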
Exercises 10.7
Understanding the Concepts and Skills
10.195 We considered three hypothesis-testing procedures based on independent simple random samples to compare the means of two populations with unknown standard deviations.
a. Identify the three procedures by name.
b. List the conditions for using each procedure.
c. Identify the test statistic for each procedure.
10.196 We examined two hypothesis-testing procedures based on a simple random paired sample to compare the means of two populations.
a. Identify the two procedures by name.
b. List the conditions for using each procedure.
c. Identify the test statistic for each procedure.
10.197 Suppose that you want to perform a hypothesis test based on independent simple random samples to compare the means of two populations. Assume that the variable under consideration is normally distributed on each of the two populations and that the population standard deviations are equal.
a. Identify the procedures discussed in this chapter that could be used to carry out the hypothesis test, that is, the procedures whose assumptions are satisfied.
b. Among the procedures that you identified in part (a), which is the best one to use? Explain your answer.
10.198 Suppose that you want to perform a hypothesis test based on independent simple random samples to compare the means of two populations. Assume that the variable under consideration is normally distributed on each of the two populations and that the population standard deviations are unequal.
a. Identify the procedures discussed in this chapter that could be used to carry out the hypothesis test, that is, the procedures whose assumptions are satisfied.
b. Among the procedures that you identified in part (a), which is the best one to use? Explain your answer.
10.199 Suppose that you want to perform a hypothesis test based on independent simple random samples to compare the means of two populations. Assume that the two distributions of the variable under consideration have the same shape but are not normally distributed and that the sample sizes are both large.
a. Identify the procedures discussed in this chapter that could be used to carry out the hypothesis test, that is, the procedures whose assumptions are satisfied.
b. Among the procedures that you identified in part (a), which is the best one to use? Explain your answer.
10.200 Suppose that you want to perform a hypothesis test based on a simple random paired sample to compare the means of two populations. Assume that the paired-difference variable is normally distributed.
a. Identify the procedures discussed in this chapter that could be used to carry out the hypothesis test, that is, the procedures whose assumptions are satisfied.
b. Among the procedures that you identified in part (a), which is the best one to use? Explain your answer.
10.201 Suppose that you want to perform a hypothesis test based on a simple random paired sample to compare the means of two populations. Assume that the paired-difference variable has a nonnormal symmetric distribution and that the sample size is large.
a. Identify the procedures discussed in this chapter that could be used to carry out the hypothesis test, that is, the procedures whose assumptions are satisfied.
b. Among the procedures that you identified in part (a), which is the best one to use? Explain your answer.
In Exercises 10.202–10.207, we provide a type of sampling (independent or paired), sample size(s), and a figure showing the results of preliminary data analyses on the sample(s). For independent samples, the graphs are for the two samples; for a paired sample, the graphs are for the paired differences. The intent is to employ the sample data to perform a hypothesis test to compare the means of the two populations from which the data were obtained. In each case, use the information provided and the flowchart shown in Fig. 10.19 on page 502 to decide which procedure should be applied.
10.202 Paired; n = 75; Fig. 10.22
10.203 Independent; n1 = 25 and n2 = 20; Fig. 10.23
10.204 Independent; n1 = 17 and n2 = 17; Fig. 10.24
10.205 Independent; n1 = 40 and n2 = 45; Fig. 10.25
10.206 Independent; n1 = 20 and n2 = 15; Fig. 10.26
10.207 Paired; n = 18; Fig. 10.27
FIGURE 10.22 Results of preliminary data analyses in Exercise 10.202 [graphs not reproduced]
FIGURE 10.23 Results of preliminary data analyses in Exercise 10.203 [graphs not reproduced]
FIGURE 10.24 Results of preliminary data analyses in Exercise 10.204 [graphs not reproduced]
FIGURE 10.25 Results of preliminary data analyses in Exercise 10.205 [graphs not reproduced]
FIGURE 10.26 Results of preliminary data analyses in Exercise 10.206 [graphs not reproduced]
FIGURE 10.27 Results of preliminary data analyses in Exercise 10.207 [graphs not reproduced]