JERZY NEYMAN: A PRINCIPAL FOUNDER OF MODERN STATISTICAL THEORY
DEFINITION 10.1 Distributions of the Same Shape
We say that two or more distributions have the same shape if they are identical except possibly for the locations of their centers.†
For instance, two normal distributions with equal standard deviations have the same shape, regardless of their means. And, conversely, two normal distributions with different standard deviations have different shapes, regardless of their means. In short, two normal distributions have the same shape if and only if they have equal standard deviations.
†Observe that, in the context of our earlier discussions of distribution shape, Definition 10.1 means “same shape and spread.”
10.4 The Mann–Whitney Test∗ 493
Introducing the Mann–Whitney Test
Thus far, we have developed two procedures for performing a hypothesis test to compare the means of two populations: the pooled and nonpooled t-tests. Both tests require simple random samples, independent samples, and normal populations or large samples. The pooled t-test also requires equal population standard deviations.
As we have just seen, two normal distributions have the same shape if and only if they have equal standard deviations. Consequently, the pooled t-test applies when the two distributions (one for each population) of the variable under consideration are normal and have the same shape; the nonpooled t-test applies when the two distributions are normal, even if they don’t have the same shape.
Another procedure for performing a hypothesis test based on independent simple random samples to compare the means of two populations is the Mann–Whitney test. This nonparametric test, introduced by Wilcoxon and further developed by Mann and Whitney, is also commonly referred to as the Wilcoxon rank-sum test or the Mann–Whitney–Wilcoxon test.
The Mann–Whitney test applies when the two distributions of the variable under consideration have the same shape, but it does not require that they be normal or have any other specific shape. See Fig. 10.9.
FIGURE 10.9 Appropriate procedure for comparing two population means based on independent simple random samples
(a) Normal populations, same shape. Use pooled t-test.
(b) Normal populations, different shapes. Use nonpooled t-test.
(c) Nonnormal populations, same shape. Use Mann–Whitney test.
(d) Not both normal populations, different shapes. Use nonpooled t-test for large samples; otherwise, consult a statistician.
EXAMPLE 10.9 Introducing the Mann–Whitney Test
Computer-System Training A nationwide shipping firm purchased a new computer system to track its packages. Independent samples of employees with and without experience in this type of computer system were randomly selected and the times required to learn how to use the new system were measured. The times, in minutes, are given in Table 10.9.
At the 5% significance level, do the data provide sufficient evidence to conclude that the mean learning time for all employees without experience exceeds the mean learning time for all employees with experience?
TABLE 10.9 Times, in minutes, required to learn how to use the system

Without experience    With experience
       139                  142
       118                  109
       164                  130
       151                  107
       182                  155
       140                   88
       134                   95
                            104
Solution Let μ1 and μ2 denote the mean learning times for all employees without experience and with experience, respectively. Then the null and alternative hypotheses are, respectively,
H0: μ1 = μ2 (mean time for inexperienced employees is not greater)
Ha: μ1 > μ2 (mean time for inexperienced employees is greater).
To use the Mann–Whitney test, the learning-time distributions for employees without and with experience should have the same shape. If they do, then the distributions of the two samples in Table 10.9 should also have the same shape, roughly.
To check this condition, we constructed Fig. 10.10, a back-to-back stem-and-leaf diagram of the two samples in Table 10.9. In such a diagram, the leaves for the first sample are on the left, the stems are in the middle, and the leaves for the second sample are on the right. The stem-and-leaf diagrams in Fig. 10.10 have roughly the same shape and so do not reveal any obvious violations of the same-shape condition.†
FIGURE 10.10 Back-to-back stem-and-leaf diagram of the two learning-time samples in Table 10.9

Without experience | Stem | With experience
                   |   8  | 8
                   |   9  | 5
                   |  10  | 4 7 9
                 8 |  11  |
                   |  12  |
               9 4 |  13  | 0
                 0 |  14  | 2
                 1 |  15  | 5
                 4 |  16  |
                   |  17  |
                 2 |  18  |
To apply the Mann–Whitney test, we first rank all the data from both samples combined. (Referring to Fig. 10.10 is helpful in ranking the data.) The ranking, depicted in Table 10.10, shows, for instance, that the first employee without experience had the ninth-shortest learning time among all 15 employees in the two samples combined.
The idea behind the Mann–Whitney test is simple: If the sum of the ranks for the sample of employees without experience is too large, we conclude that the null hypothesis is false and, therefore, that the mean learning time for all employees without experience exceeds that for all employees with experience. From Table 10.10, the sum of the ranks for the sample of employees without experience, denoted M, is
9 + 6 + 14 + 12 + 15 + 10 + 8 = 74.

TABLE 10.10 Results of ranking the combined data from Table 10.9

Without experience   Overall rank   With experience   Overall rank
       139                 9              142              11
       118                 6              109               5
       164                14              130               7
       151                12              107               4
       182                15              155              13
       140                10               88               1
       134                 8               95               2
                                          104               3
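The ranking step and the rank sum M can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name overall_ranks is our own, and it assumes there are no tied observations, which holds for the data in Table 10.9.

```python
# Learning times (minutes) from Table 10.9.
without_exp = [139, 118, 164, 151, 182, 140, 134]
with_exp = [142, 109, 130, 107, 155, 88, 95, 104]


def overall_ranks(sample1, sample2):
    """Rank the combined data (1 = smallest). Assumes no tied values."""
    combined = sorted(sample1 + sample2)
    # Map each value to its 1-based position in the sorted combined data.
    rank_of = {value: position for position, value in enumerate(combined, start=1)}
    return [rank_of[v] for v in sample1], [rank_of[v] for v in sample2]


ranks_without, ranks_with = overall_ranks(without_exp, with_exp)
M = sum(ranks_without)  # rank sum for the first sample (without experience)

print(ranks_without)  # [9, 6, 14, 12, 15, 10, 8]
print(M)              # 74
```

The printed ranks agree with the "Overall rank" column for the sample without experience in Table 10.10.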
To decide whether M = 74 is large enough to reject the null hypothesis, we need to first discuss some preliminary material.
Using the Mann–Whitney Table‡
Table VI in Appendix A gives values of M_α for a Mann–Whitney test.§ The size of the sample from Population 2 is given in the leftmost column of Table VI, the values of α in the next column, and the size of the sample from Population 1 along the top.
As expected, the symbol M_α denotes the M-value with area (percentage, probability) α to its right.
We can express the critical value(s) for a Mann–Whitney test at the significance level α as follows:
• For a two-tailed test, the critical values are the M-value with area α/2 to its left (or, equivalently, area 1 − α/2 to its right) and the M-value with area α/2 to its right, which are M_{1−α/2} and M_{α/2}, respectively. See Fig. 10.11(a).
†For ease in explaining the Mann–Whitney test, we have chosen an example in which the sample sizes are very small. However, very small sample sizes make effectively checking the same-shape condition difficult, so proceed cautiously when dealing with very small samples.
‡We can use the Mann–Whitney table to estimate the P-value of a Mann–Whitney test. However, because doing so can be awkward or tedious, using statistical software is preferable. Thus, those concentrating on the P-value approach to hypothesis testing can skip to the subsection “Performing the Mann–Whitney Test” on page 496.
§Actually, the α-levels in Table VI are only approximate, but are used in practice.
• For a left-tailed test, the critical value is the M-value with area α to its left or, equivalently, area 1 − α to its right, which is M_{1−α}. See Fig. 10.11(b).
• For a right-tailed test, the critical value is the M-value with area α to its right, which is M_α. See Fig. 10.11(c).
FIGURE 10.11 Critical value(s) for a Mann–Whitney test at the significance level α if the test is (a) two tailed, (b) left tailed, or (c) right tailed
(a) Two tailed: reject H0 at or below M_{1−α/2} (area α/2) and at or above M_{α/2} (area α/2); do not reject H0 in between.
(b) Left tailed: reject H0 at or below M_{1−α} (area α); otherwise, do not reject H0.
(c) Right tailed: reject H0 at or above M_α (area α); otherwise, do not reject H0.
Note the following:
• A critical value from Table VI is to be included as part of the rejection region.
• Although the variable M is discrete, we drew the “histograms” in Fig. 10.11 in the shape of a normal curve. This approach is acceptable because M is close to normally distributed except for very small sample sizes. We use this graphical convention throughout this section.
The distribution of the variable M is symmetric about n1(n1 + n2 + 1)/2. This characteristic implies that the M-value with area A to its left (or, equivalently, area 1 − A to its right) equals n1(n1 + n2 + 1) minus the M-value with area A to its right. In symbols,

M_{1−A} = n1(n1 + n2 + 1) − M_A.    (10.2)

Referring to Fig. 10.11, we see that by using Equation (10.2) and Table VI, we can determine the critical value for a left-tailed Mann–Whitney test and the critical values for a two-tailed Mann–Whitney test. The next example illustrates the use of Table VI to determine critical values for a Mann–Whitney test.
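The symmetry that underlies Equation (10.2) can be checked by brute force for small samples. The sketch below is our own illustration, not the method used to build Table VI: it assumes no ties and uses the fact that, when the two distributions are identical, every choice of n1 ranks out of 1, 2, ..., n1 + n2 is equally likely. The function names null_distribution and area_to_right are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def null_distribution(n1, n2):
    """Exact distribution of the rank sum M when every set of n1 ranks
    drawn from 1..n1+n2 is equally likely (no ties assumed)."""
    counts = Counter(sum(c) for c in combinations(range(1, n1 + n2 + 1), n1))
    total = sum(counts.values())
    return {m: Fraction(k, total) for m, k in counts.items()}


def area_to_right(dist, m):
    """P(M >= m); the value m itself is counted."""
    return sum(p for value, p in dist.items() if value >= m)


# Sample sizes of Example 10.9: 7 without experience, 8 with experience.
n1, n2 = 7, 8
dist = null_distribution(n1, n2)

# Symmetry about n1(n1 + n2 + 1)/2: M and n1(n1 + n2 + 1) - M are equally likely.
twice_center = n1 * (n1 + n2 + 1)
is_symmetric = all(dist.get(twice_center - m, 0) == p for m, p in dist.items())
print(is_symmetric)  # True

# Exact area to the right of (and including) the observed M = 74.
print(float(area_to_right(dist, 74)))
```

Enumerating every combination is practical only for small samples (here there are 6435 of them); for larger samples, a table such as Table VI or statistical software is the realistic route.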
EXAMPLE 10.10 Using the Mann–Whitney Table
In each case, use Table VI to determine the critical value(s) for a Mann–Whitney test. Sketch graphs to illustrate your results.
a. n1 = 9, n2 = 6; significance level = 0.01; right tailed
b. n1 = 5, n2 = 7; significance level = 0.10; left tailed
c. n1 = 8, n2 = 4; significance level = 0.05; two tailed
Solution In solving these problems, it helps to refer to Fig. 10.11.
a. The critical value for a right-tailed test at the 1% significance level is M_{0.01}. To find the critical value, we use Table VI. First we go down the leftmost column, labeled n2, to “6.” Then, going across the row for α labeled 0.01 to the column labeled “9,” we reach 92, the required critical value. See Fig. 10.12(a) on the next page.
b. The critical value for a left-tailed test at the 10% significance level is M_{1−0.10}. To find the critical value, we use Table VI and Equation (10.2). First we go down the leftmost column, labeled n2, to “7.” Then, going across the row for α labeled 0.10 to the column labeled “5,” we reach 41; thus M_{0.10} = 41. Now we apply Equation (10.2) and the result just obtained to get

M_{1−0.10} = 5(5 + 7 + 1) − M_{0.10} = 65 − 41 = 24,

which is the required critical value. See Fig. 10.12(b).
c. The critical values for a two-tailed test at the 5% significance level are M_{1−0.05/2} and M_{0.05/2}, that is, M_{1−0.025} and M_{0.025}. First we use Table VI to find M_{0.025}. We go down the leftmost column, labeled n2, to “4.” Then, going across the row for α labeled 0.025 to the column labeled “8,” we reach 64; thus M_{0.025} = 64. Now we apply Equation (10.2) and the result just obtained to get M_{1−0.025}:

M_{1−0.025} = 8(8 + 4 + 1) − M_{0.025} = 104 − 64 = 40.

See Fig. 10.12(c).
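The table lookups above cannot be automated without Table VI itself, but the arithmetic of Equation (10.2) and the resulting decision rule can be sketched in Python. The function names below are hypothetical, and the upper-tail values 92, 41, and 64 are simply the Table VI entries quoted in parts (a) through (c).

```python
def lower_value(upper_value, n1, n2):
    """Equation (10.2): M_{1-A} = n1(n1 + n2 + 1) - M_A."""
    return n1 * (n1 + n2 + 1) - upper_value


def critical_values(tail, upper_value, n1, n2):
    """Return (lower, upper) critical values; None marks an unused side.

    upper_value is the Table VI entry: M_alpha for a one-tailed test,
    M_{alpha/2} for a two-tailed test.
    """
    if tail == "right":
        return None, upper_value
    if tail == "left":
        return lower_value(upper_value, n1, n2), None
    if tail == "two":
        return lower_value(upper_value, n1, n2), upper_value
    raise ValueError("tail must be 'right', 'left', or 'two'")


def rejects_h0(m, lower=None, upper=None):
    """A critical value itself belongs to the rejection region."""
    return (lower is not None and m <= lower) or (upper is not None and m >= upper)


print(critical_values("right", 92, n1=9, n2=6))  # (None, 92)
print(critical_values("left", 41, n1=5, n2=7))   # (24, None)
print(critical_values("two", 64, n1=8, n2=4))    # (40, 64)
```

For instance, in part (c) an observed rank sum of exactly 40 or exactly 64 would lead to rejection, because the critical values are included in the rejection region.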
Exercise 10.113 on page 502
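FIGURE 10.12 Critical value(s) for a Mann–Whitney test: (a) right tailed, α = 0.01, n1 = 9, n2 = 6; (b) left tailed, α = 0.10, n1 = 5, n2 = 7; (c) two tailed, α = 0.05, n1 = 8, n2 = 4
(a) Critical value 92, with area 0.01 to its right: reject H0 at or above 92; otherwise, do not reject H0.
(b) Critical value 24, with area 0.10 to its left: reject H0 at or below 24; otherwise, do not reject H0.
(c) Critical values 40 and 64, each with tail area 0.025: reject H0 at or below 40 or at or above 64; do not reject H0 in between.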
Performing the Mann–Whitney Test
Procedure 10.5 provides a step-by-step method for performing a Mann–Whitney test.
Observe that we often use the phrase same-shape populations to indicate that the two distributions (one for each population) of the variable under consideration have the same shape.
Note: When there are ties in the sample data, ranks are assigned in the same way as in the Wilcoxon signed-rank test. Namely, if two or more observations are tied, each is assigned the mean of the ranks they would have had if there had been no ties.
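The tie rule in the preceding note can be illustrated with a short Python sketch. The function name midranks and the four data values are made up for illustration; they do not come from the text.

```python
def midranks(values):
    """Rank values (1 = smallest); tied values share the mean of the
    ranks they would have received had there been no ties."""
    first_position = {}  # first 1-based position of each distinct value
    count = {}           # number of times each distinct value occurs
    for position, v in enumerate(sorted(values), start=1):
        first_position.setdefault(v, position)
        count[v] = count.get(v, 0) + 1
    # Tied values occupy positions first .. first + count - 1;
    # the mean of those positions is first + (count - 1) / 2.
    return [first_position[v] + (count[v] - 1) / 2 for v in values]


# Hypothetical data: the two 15s would have had ranks 2 and 3, so each gets 2.5.
print(midranks([12, 15, 15, 20]))  # [1.0, 2.5, 2.5, 4.0]
```

Applied to the combined data of Table 10.9, which contain no ties, this function returns the same ranks as Table 10.10 (as floating-point numbers).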
EXAMPLE 10.11 The Mann–Whitney Test
Computer-System Training Let’s complete the hypothesis test of Example 10.9.
Independent simple random samples of employees with and without computer-system experience were obtained. The employees selected were timed to see how long it would take them to learn how to use a certain computer system.
The times, in minutes, are given in Table 10.9 on page 493. At the 5% significance level, do the data provide sufficient evidence to conclude that the mean learning time for employees without experience exceeds that for employees with experience?
Solution We apply Procedure 10.5.
Step 1 State the null and alternative hypotheses.
Let μ1 and μ2 denote the mean learning times for all employees without and with experience, respectively. Then the null and alternative hypotheses are, respectively,
H0: μ1 = μ2 (mean time for inexperienced employees is not greater)
Ha: μ1 > μ2 (mean time for inexperienced employees is greater).
Note that the hypothesis test is right tailed.