We have developed two procedures for performing a hypothesis test to compare the means of two populations: the pooled and nonpooled t-tests. Both tests require simple random samples, independent samples, and normal populations or large samples. The pooled t-test also requires equal population standard deviations.
Recall that the shape of a normal distribution is determined by its standard deviation. In other words, two normal distributions have the same shape if and only if they have equal standard deviations. Consequently, the pooled t-test applies when the two distributions (one for each population) of the variable under consideration are normal and have the same shape; the nonpooled t-test applies when the two distributions are normal, even if they don’t have the same shape.
Another procedure for performing a hypothesis test based on independent simple random samples to compare the means of two populations is the Mann–Whitney test. This nonparametric test, introduced by Wilcoxon and further developed by Mann and Whitney, is also commonly referred to as the Wilcoxon rank-sum test or the Mann–Whitney–Wilcoxon test.
The Mann–Whitney test applies when the two distributions of the variable under consideration have the same shape, but it does not require that they be normal or have any other specific shape. See Fig. 10.9.
FIGURE 10.9 Appropriate procedure for comparing two population means based on independent simple random samples
(a) Normal populations, same shape.
Use pooled t-test.
(b) Normal populations, different shapes.
Use nonpooled t-test.
(c) Nonnormal populations, same shape.
Use Mann–Whitney test.
(d) Not both normal populations, different shapes. Use nonpooled t-test for large samples; otherwise, consult a statistician.
EXAMPLE 10.9 Introducing the Mann–Whitney Test
Computer-System Training A nationwide shipping firm purchased a new computer system to track its shipments, pickups, and deliveries. Employees were expected to need about 2 hours to learn how to use the system. In fact, some employees could use the system in very little time, whereas others took considerably longer.
Someone suggested that the reason for this difference might be that only some employees had experience with this kind of computer system. To test this suggestion, independent samples of employees with and without such experience were randomly selected.
The times, in minutes, required for these employees to learn how to use the system are given in Table 10.9. At the 5% significance level, do the data provide sufficient evidence to conclude that the mean learning time for all employees without experience exceeds the mean learning time for all employees with experience?
TABLE 10.9 Times, in minutes, required to learn how to use the system
Without experience    With experience
       139                  142
       118                  109
       164                  130
       151                  107
       182                  155
       140                   88
       134                   95
                            104
Solution Let μ1 and μ2 denote the mean learning times for all employees without experience and with experience, respectively. Then the null and alternative hypotheses are, respectively,
H0: μ1 = μ2 (mean time for inexperienced employees is not greater)
Ha: μ1 > μ2 (mean time for inexperienced employees is greater).
To use the Mann–Whitney test, the learning-time distributions for employees without and with experience should have the same shape. If they do, then the distributions of the two samples in Table 10.9 should also have the same shape, roughly.
To check this condition, we constructed Fig. 10.10, a back-to-back stem-and-leaf diagram of the two samples in Table 10.9. In such a diagram, the leaves for the first sample are on the left, the stems are in the middle, and the leaves for the second sample are on the right. The stem-and-leaf diagrams in Fig. 10.10 have roughly the same shape and so do not reveal any obvious violations of the same-shape condition.†
FIGURE 10.10 Back-to-back stem-and-leaf diagram of the two learning-time samples in Table 10.9
Without experience | Stem | With experience
                   |   8  | 8
                   |   9  | 5
                   |  10  | 4 7 9
                 8 |  11  |
                   |  12  |
               9 4 |  13  | 0
                 0 |  14  | 2
                 1 |  15  | 5
                 4 |  16  |
                   |  17  |
                 2 |  18  |
To apply the Mann–Whitney test, we first rank all the data from both samples combined. (Referring to Fig. 10.10 is helpful in ranking the data.) The ranking, depicted in Table 10.10, shows, for instance, that the first employee without experience had the ninth-shortest learning time among all 15 employees in the two samples combined.
The idea behind the Mann–Whitney test is simple: If the sum of the ranks for the sample of employees without experience is too large, we conclude that the null hypothesis is false and, therefore, that the mean learning time for all employees without experience exceeds that for all employees with experience. From Table 10.10, the sum of the ranks for the sample of employees without experience, denoted M, is
M = 9 + 6 + 14 + 12 + 15 + 10 + 8 = 74.
TABLE 10.10 Results of ranking the combined data from Table 10.9
Without experience   Overall rank   With experience   Overall rank
       139                 9              142               11
       118                 6              109                5
       164                14              130                7
       151                12              107                4
       182                15              155               13
       140                10               88                1
       134                 8               95                2
                                          104                3
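The ranking and rank sum above can be reproduced with a short Python sketch. The function name `rank_sum_no_ties` is an illustrative choice, not part of the text, and the sketch assumes there are no tied observations, which holds for the data in Table 10.9.

```python
# Learning times (minutes) from Table 10.9.
without_exp = [139, 118, 164, 151, 182, 140, 134]
with_exp = [142, 109, 130, 107, 155, 88, 95, 104]


def rank_sum_no_ties(sample1, sample2):
    """Return (overall ranks of sample1, M), where M is the rank sum for sample1.

    Hypothetical helper; assumes no tied observations.
    """
    combined = sorted(sample1 + sample2)
    # Rank 1 goes to the smallest observation in the combined data.
    rank_of = {value: rank for rank, value in enumerate(combined, start=1)}
    ranks = [rank_of[value] for value in sample1]
    return ranks, sum(ranks)


ranks_without, M = rank_sum_no_ties(without_exp, with_exp)
print(ranks_without)  # [9, 6, 14, 12, 15, 10, 8]
print(M)              # 74
```

The printed ranks match the "Overall rank" column for the sample without experience in Table 10.10.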
†For ease in explaining the Mann–Whitney test, we have chosen an example in which the sample sizes are very small. However, very small sample sizes make effectively checking the same-shape condition difficult, so proceed cautiously when dealing with very small samples.
466 CHAPTER 10 Inferences for Two Population Means
To decide whether M = 74 is large enough to reject the null hypothesis, we need to first discuss some preliminary material.
Using the Mann–Whitney Table†
Table VI in Appendix A gives values of M_α for a Mann–Whitney test.‡ The size of the sample from Population 2 is given in the leftmost column of Table VI, the values of α in the next column, and the size of the sample from Population 1 along the top.
As expected, the symbol M_α denotes the M-value with area (percentage, probability) α to its right.
We can express the critical value(s) for a Mann–Whitney test at the significance level α as follows:
• For a two-tailed test, the critical values are the M-value with area α/2 to its left (or, equivalently, area 1 − α/2 to its right) and the M-value with area α/2 to its right, which are M_{1−α/2} and M_{α/2}, respectively. See Fig. 10.11(a).
• For a left-tailed test, the critical value is the M-value with area α to its left or, equivalently, area 1 − α to its right, which is M_{1−α}. See Fig. 10.11(b).
• For a right-tailed test, the critical value is the M-value with area α to its right, which is M_α. See Fig. 10.11(c).
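FIGURE 10.11 Critical value(s) for a Mann–Whitney test at the significance level α if the test is (a) two tailed, (b) left tailed, or (c) right tailed
(a) Two tailed: reject H0 in both tails, each of area α/2, bounded by the critical values M_{1−α/2} and M_{α/2}; do not reject H0 between them.
(b) Left tailed: reject H0 in the left tail of area α, bounded by the critical value M_{1−α}; do not reject H0 to its right.
(c) Right tailed: reject H0 in the right tail of area α, bounded by the critical value M_α; do not reject H0 to its left.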
Note the following:
• A critical value from Table VI is to be included as part of the rejection region.
• Although the variable M is discrete, we drew the “histograms” in Fig. 10.11 in the shape of a normal curve. This approach is acceptable because M is close to normally distributed except for very small sample sizes. We use this graphical convention throughout this section.
The distribution of the variable M is symmetric about n1(n1 + n2 + 1)/2. This characteristic implies that the M-value with area A to its left (or, equivalently, area 1 − A to its right) equals n1(n1 + n2 + 1) minus the M-value with area A to its right. In symbols,
M_{1−A} = n1(n1 + n2 + 1) − M_A.    (10.2)
Referring to Fig. 10.11, we see that by using Equation (10.2) and Table VI, we can determine the critical value for a left-tailed Mann–Whitney test and the critical values for a two-tailed Mann–Whitney test. The next example illustrates the use of Table VI to determine critical values for a Mann–Whitney test.
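The symmetry behind Equation (10.2) can be checked numerically for small samples. The following Python sketch is an illustration, not the method used to build Table VI. It assumes that, when the two population distributions are identical and there are no ties, every choice of n1 ranks out of 1, ..., n1 + n2 is equally likely for Sample 1. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def null_distribution(n1, n2):
    """Count how many rank assignments give each possible rank sum M.

    Assumes every set of n1 ranks out of 1..n1+n2 is equally likely,
    so the counts are proportional to probabilities.
    """
    all_ranks = range(1, n1 + n2 + 1)
    return Counter(sum(subset) for subset in combinations(all_ranks, n1))


def is_symmetric(n1, n2):
    """Check that M and n1(n1 + n2 + 1) - M always have the same count."""
    counts = null_distribution(n1, n2)
    twice_centre = n1 * (n1 + n2 + 1)
    return all(counts[m] == counts[twice_centre - m] for m in counts)


print(is_symmetric(5, 7))  # True
print(is_symmetric(8, 4))  # True
```

Enumeration grows quickly with the sample sizes, so this check is practical only for small n1 and n2.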
†We can use the Mann–Whitney table to estimate the P-value of a Mann–Whitney test. However, because doing so can be awkward or tedious, using statistical software is preferable. Thus, those concentrating on the P-value approach to hypothesis testing can skip to the subsection “Performing the Mann–Whitney Test.”
‡Actually, the α-levels in Table VI are only approximate, but are used in practice.
EXAMPLE 10.10 Using the Mann–Whitney Table
In each case, use Table VI to determine the critical value(s) for a Mann–Whitney test. Sketch graphs to illustrate your results.
a. n1 = 9, n2 = 6; significance level = 0.01; right tailed
b. n1 = 5, n2 = 7; significance level = 0.10; left tailed
c. n1 = 8, n2 = 4; significance level = 0.05; two tailed
Solution In solving these problems, it helps to refer to Fig. 10.11.
a. The critical value for a right-tailed test at the 1% significance level is M_{0.01}. To find the critical value, we use Table VI. First we go down the leftmost column, labeled n2, to “6.” Then, going across the row for α labeled 0.01 to the column labeled “9,” we reach 92, the required critical value. See Fig. 10.12(a).
b. The critical value for a left-tailed test at the 10% significance level is M_{1−0.10}. To find the critical value, we use Table VI and Equation (10.2). First we go down the leftmost column, labeled n2, to “7.” Then, going across the row for α labeled 0.10 to the column labeled “5,” we reach 41; thus M_{0.10} = 41. Now we apply Equation (10.2) and the result just obtained to get
M_{1−0.10} = 5(5 + 7 + 1) − M_{0.10} = 65 − 41 = 24,
which is the required critical value. See Fig. 10.12(b).
c. The critical values for a two-tailed test at the 5% significance level are M_{1−0.05/2} and M_{0.05/2}, that is, M_{1−0.025} and M_{0.025}. First we use Table VI to find M_{0.025}. We go down the leftmost column, labeled n2, to “4.” Then, going across the row for α labeled 0.025 to the column labeled “8,” we reach 64; thus M_{0.025} = 64. Now we apply Equation (10.2) and the result just obtained to get M_{1−0.025}:
M_{1−0.025} = 8(8 + 4 + 1) − M_{0.025} = 104 − 64 = 40.
See Fig. 10.12(c).
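The arithmetic in parts (b) and (c) is a direct application of Equation (10.2). As a minimal Python sketch, with a hypothetical helper name and the upper critical values 41 and 64 taken from Table VI as quoted in the solution:

```python
def lower_critical_value(n1, n2, upper_value):
    """Apply Equation (10.2): M_{1-A} = n1(n1 + n2 + 1) - M_A."""
    return n1 * (n1 + n2 + 1) - upper_value


# Part (b): n1 = 5, n2 = 7, with M_{0.10} = 41 read from Table VI.
print(lower_critical_value(5, 7, 41))  # 24

# Part (c): n1 = 8, n2 = 4, with M_{0.025} = 64 read from Table VI.
print(lower_critical_value(8, 4, 64))  # 40
```

The helper does not look anything up; the right-tail value M_A must still come from Table VI.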
Exercise 10.99 on page 474
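FIGURE 10.12 Critical value(s) for a Mann–Whitney test: (a) right tailed, α = 0.01, n1 = 9, n2 = 6; (b) left tailed, α = 0.10, n1 = 5, n2 = 7; (c) two tailed, α = 0.05, n1 = 8, n2 = 4
(a) Reject H0 in the right tail (area 0.01), bounded by the critical value 92; otherwise, do not reject H0.
(b) Reject H0 in the left tail (area 0.10), bounded by the critical value 24; otherwise, do not reject H0.
(c) Reject H0 in both tails (area 0.025 each), bounded by the critical values 40 and 64; do not reject H0 between them.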
Performing the Mann–Whitney Test
Procedure 10.5 on the following page provides a step-by-step method for performing a Mann–Whitney test. Note that we often use the phrase same-shape populations to indicate that the two distributions (one for each population) of the variable under consideration have the same shape.
Note: When there are ties in the sample data, ranks are assigned in the same way as in the Wilcoxon signed-rank test. Namely, if two or more observations are tied, each is assigned the mean of the ranks they would have had if there had been no ties.
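The tie rule in the note can be sketched in Python as follows. The function name `midranks` and the four data values are hypothetical and chosen only to show one tie; they do not come from the text.

```python
def midranks(values):
    """Rank values from smallest to largest, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end would hold ranks start+1..end+1; average them.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical data: without the tie, the two 12s would hold ranks 2 and 3.
print(midranks([10, 12, 12, 15]))  # [1.0, 2.5, 2.5, 4.0]
```

To use this with two samples, rank the combined data and then sum the ranks belonging to the sample from Population 1.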
PROCEDURE 10.5 Mann–Whitney Test
Purpose To perform a hypothesis test to compare two population means, μ1 and μ2
Assumptions
1. Simple random samples
2. Independent samples
3. Same-shape populations
Step 1 The null hypothesis is H0: μ1 = μ2, and the alternative hypothesis is
Ha: μ1 ≠ μ2 (Two tailed) or Ha: μ1 < μ2 (Left tailed) or Ha: μ1 > μ2 (Right tailed)
Step 2 Decide on the significance level, α.
Step 3 Compute the value of the test statistic
M = sum of the ranks for sample data from Population 1
and denote that value M0. To do so, construct a work table of the following form.
Sample from Population 1   Overall rank   Sample from Population 2   Overall rank
           ·                     ·                    ·                     ·
           ·                     ·                    ·                     ·
           ·                     ·                    ·                     ·
CRITICAL-VALUE APPROACH
Step 4 The critical value(s) are
M_{1−α/2} and M_{α/2} (Two tailed) or M_{1−α} (Left tailed) or M_α (Right tailed)
Use Table VI to find the critical value(s). For a left-tailed or two-tailed test, you will also need the relation M_{1−A} = n1(n1 + n2 + 1) − M_A.
(Graphs: rejection and nonrejection regions for two-tailed, left-tailed, and right-tailed tests, as in Fig. 10.11.)
Step 5 If the value of the test statistic falls in the rejection region, reject H0; otherwise, do not reject H0.
OR
P-VALUE APPROACH
Step 4 Obtain the P-value by using technology.
(Graphs: the P-value shown as the tail area determined by M0 for two-tailed, left-tailed, and right-tailed tests.)
Step 5 If P ≤ α, reject H0; otherwise, do not reject H0.
Step 6 Interpret the results of the hypothesis test.
EXAMPLE 10.11 The Mann–Whitney Test
Computer-System Training Let’s complete the hypothesis test of Example 10.9.
Independent simple random samples of employees with and without computer-system experience were obtained. The employees selected were timed to see how long it would take them to learn how to use a certain computer system.
The times, in minutes, are given in Table 10.9 on page 465. At the 5% significance level, do the data provide sufficient evidence to conclude that the mean learning time for employees without experience exceeds that for employees with experience?
Solution We apply Procedure 10.5.
Step 1 State the null and alternative hypotheses.
Let μ1 and μ2 denote the mean learning times for all employees without and with experience, respectively. Then the null and alternative hypotheses are, respectively,
H0: μ1 = μ2 (mean time for inexperienced employees is not greater)
Ha: μ1 > μ2 (mean time for inexperienced employees is greater).
Note that the hypothesis test is right tailed.
Step 2 Decide on the significance level,α.
We are to perform the test at the 5% significance level; so, α = 0.05.
Step 3 Compute the value of the test statistic
M = sum of the ranks for sample data from Population 1.