Recall that standard deviation is a measure of the variation (or spread) of a data set.
Also recall that, for a variable x, the standard deviation of all possible observations for the entire population is called the population standard deviation or standard deviation of the variable x. It is denoted $\sigma_x$ or, when no confusion will arise, simply σ.
Suppose that we want to obtain information about a population standard deviation.
If the population is small, we can often determine σ exactly by first taking a census and then computing σ from the population data. However, if the population is large, which is usually the case, a census is generally not feasible, and we must use inferential methods to obtain the required information about σ.
In this section, we describe how to perform hypothesis tests and construct confidence intervals for the standard deviation of a normally distributed variable. Such inferences are based on a distribution called the chi-square distribution. Chi (pronounced “kī”) is a Greek letter whose lowercase form is χ.
The Chi-Square Distribution
A variable has a chi-square distribution if its distribution has the shape of a special type of right-skewed curve, called a chi-square (χ²) curve. Actually, there are infinitely many chi-square distributions, and we identify the chi-square distribution (and χ²-curve) in question by its number of degrees of freedom, just as we did for t-distributions. Figure 11.1 shows three χ²-curves and illustrates some basic properties of χ²-curves.
FIGURE 11.1 χ²-curves for df = 5, 10, and 19 (horizontal axis: χ², from 0 to 30)
KEY FACT 11.1 Basic Properties of χ²-Curves
Property 1: The total area under a χ²-curve equals 1.
Property 2: A χ²-curve starts at 0 on the horizontal axis and extends indefinitely to the right, approaching, but never touching, the horizontal axis as it does so.
Property 3: A χ²-curve is right skewed.
Property 4: As the number of degrees of freedom becomes larger, χ²-curves look increasingly like normal curves.
Percentages (and probabilities) for a variable having a chi-square distribution are equal to areas under its associated χ²-curve. To perform a hypothesis test or construct a confidence interval for a population standard deviation, we need to know how to find the χ²-value that corresponds to a specified area under a χ²-curve. Table VII in Appendix A provides χ²-values corresponding to several areas for various degrees of freedom.
The χ²-table (Table VII) is similar to the t-table (Table IV). The two outside columns of Table VII, labeled df, display the number of degrees of freedom. As expected, the symbol $\chi^2_\alpha$ denotes the χ²-value having area α to its right under a χ²-curve. Thus the column headed $\chi^2_{0.995}$, for example, contains χ²-values having area 0.995 to their right.
EXAMPLE 11.1 Finding the χ²-Value Having a Specified Area to Its Right
For a χ²-curve with 12 degrees of freedom, find $\chi^2_{0.025}$; that is, find the χ²-value having area 0.025 to its right, as shown in Fig. 11.2(a).
FIGURE 11.2 Finding the χ²-value having area 0.025 to its right: (a) χ²-curve with df = 12, area 0.025 to the right of $\chi^2_{0.025}$ = ?; (b) the same curve with $\chi^2_{0.025}$ = 23.337
Solution To find this χ²-value, we use Table VII. The number of degrees of freedom is 12, so we first go down the outside columns, labeled df, to “12.” Then, going across that row to the column labeled $\chi^2_{0.025}$, we reach 23.337. This number is the χ²-value having area 0.025 to its right, as shown in Fig. 11.2(b). In other words, for a χ²-curve with df = 12, $\chi^2_{0.025}$ = 23.337.
Exercise 11.5 on page 522
EXAMPLE 11.2 Finding the χ²-Value Having a Specified Area to Its Left
Determine the χ²-value having area 0.05 to its left for a χ²-curve with df = 7, as depicted in Fig. 11.3(a).
FIGURE 11.3 Finding the χ²-value having area 0.05 to its left: (a) χ²-curve with df = 7, area 0.05 to the left of χ² = ?; (b) the same curve with χ² = 2.167
Exercise 11.9 on page 522
Solution Because the total area under a χ²-curve equals 1 (Property 1 of Key Fact 11.1), the unshaded area in Fig. 11.3(a) must equal 1 − 0.05 = 0.95. Thus the required χ²-value is $\chi^2_{0.95}$. From Table VII with df = 7, $\chi^2_{0.95}$ = 2.167. So, for a χ²-curve with df = 7, the χ²-value having area 0.05 to its left is 2.167, as shown in Fig. 11.3(b).
514 CHAPTER 11 Inferences for Population Standard Deviations∗
EXAMPLE 11.3 Finding the χ²-Values for a Specified Area
For a χ²-curve with df = 20, determine the two χ²-values that divide the area under the curve into a middle 0.95 area and two outside 0.025 areas, as shown in Fig. 11.4(a).
FIGURE 11.4 Finding the two χ²-values that divide the area under the curve into a middle 0.95 area and two outside 0.025 areas: (a) χ²-curve with df = 20, both χ²-values marked “?”; (b) the same curve with χ² = 9.591 on the left and χ² = 34.170 on the right
Solution First, we find the χ²-value on the right in Fig. 11.4(a). Because the shaded area on the right is 0.025, the χ²-value on the right is $\chi^2_{0.025}$. From Table VII with df = 20, $\chi^2_{0.025}$ = 34.170.
Next, we find the χ²-value on the left in Fig. 11.4(a). Because the area to the left of that χ²-value is 0.025, the area to its right is 1 − 0.025 = 0.975. Hence the χ²-value on the left is $\chi^2_{0.975}$, which, by Table VII, equals 9.591 for df = 20.
Consequently, for a χ²-curve with df = 20, the two χ²-values that divide the area under the curve into a middle 0.95 area and two outside 0.025 areas are 9.591 and 34.170, as shown in Fig. 11.4(b).
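The table lookups in Examples 11.1 through 11.3 can also be reproduced numerically. The following is a minimal sketch, not the text's method: it assumes the standard identity that the area to the left of x under a χ²-curve with df degrees of freedom is the regularized lower incomplete gamma function P(df/2, x/2), evaluates that function with a power series, and inverts it by bisection. The function names `chi2_cdf` and `chi2_value` are hypothetical, chosen for illustration.

```python
import math

def chi2_cdf(x, df):
    """Area to the LEFT of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, y = df / 2.0, x / 2.0
    # Power series for the regularized lower incomplete gamma function P(a, y):
    # P(a, y) = y^a * e^(-y) / Gamma(a) * sum_{k>=0} y^k / (a (a+1) ... (a+k))
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= y / (a + k)
        total += term
        k += 1
    return math.exp(a * math.log(y) - y - math.lgamma(a)) * total

def chi2_value(area_right, df):
    """Chi-square value having the given area to its RIGHT (found by bisection)."""
    if not 0.0 < area_right < 1.0:
        raise ValueError("area_right must be strictly between 0 and 1")
    target = 1.0 - area_right          # corresponding area to the left
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < target:   # expand until the target is bracketed
        hi *= 2.0
    for _ in range(100):               # bisection on the increasing CDF
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

ex_11_1 = chi2_value(0.025, 12)        # Table VII gives 23.337
ex_11_2 = chi2_value(1 - 0.05, 7)      # area 0.05 to the left; Table VII gives 2.167
ex_11_3 = (chi2_value(0.975, 20), chi2_value(0.025, 20))  # Table VII: 9.591, 34.170

print(round(ex_11_1, 3), round(ex_11_2, 3), [round(v, 3) for v in ex_11_3])
```

As in Example 11.2, an area to the left is handled by converting it to the complementary area to the right before the lookup.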
Exercise 11.11 on page 522
The Logic Behind Hypothesis Tests for One Population Standard Deviation
We illustrate the logic behind hypothesis tests for one population standard deviation in the next example.
EXAMPLE 11.4 Hypothesis Tests for a Population Standard Deviation
Xenical Capsules Xenical is used to treat obesity in people with risk factors such as diabetes, high blood pressure, and high cholesterol or triglycerides. Xenical works in the intestines, where it blocks some of the fat a person eats from being absorbed.
A standard prescription of Xenical is given in 120-milligram (mg) capsules.
Although the capsule weights can vary somewhat from 120 mg and also from each other, keeping the variation small is important for various medical reasons.
Based on standards set by the United States Pharmacopeia (USP), an official public standards-setting authority for all prescription and over-the-counter medicines and other health care products manufactured or sold in the United States, we determined that a standard deviation of Xenical capsule weights of less than 2 mg is acceptable.†
†See Exercise 11.42 for an explanation of how that information could be obtained.
a. Formulate statistically the problem of deciding whether the standard deviation of Xenical capsule weights is less than 2.0 mg.
b. Explain the basic idea for carrying out the hypothesis test.
c. In the paper “HPLC Analysis of Orlistat and Its Application to Drug Quality Control Studies” (Chemical & Pharmaceutical Bulletin, Vol. 55, No. 2, pp. 251–254), E. Souri et al. studied various properties of Xenical. A sample of 10 Xenical capsules had the weights shown in Table 11.1. Discuss the use of these data to make a decision concerning the hypothesis test.
TABLE 11.1 Weights (mg) of 10 Xenical capsules
120.94  118.58  119.41  120.23  121.13
118.22  119.71  121.09  120.56  119.11
Solution
a. We want to perform the hypothesis test
H₀: σ = 2.0 mg (too much weight variation)
Hₐ: σ < 2.0 mg (not too much weight variation).
If the null hypothesis can be rejected, we can be confident that the variation in capsule weights is acceptable.†
b. Roughly speaking, the hypothesis test can be carried out in the following manner:
1. Take a random sample of Xenical capsules.
2. Find the standard deviation, s, of the weights of the capsules sampled.
3. If s is “too much smaller” than 2.0 mg, reject the null hypothesis in favor of the alternative hypothesis; otherwise, do not reject the null hypothesis.
c. The sample standard deviation of the capsule weights in Table 11.1 is
$$ s = \sqrt{\frac{\sum x_i^2 - \left(\sum x_i\right)^2/n}{n-1}} = \sqrt{\frac{143765.3242 - (1198.98)^2/10}{9}} = 1.055 \text{ mg}. $$
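The arithmetic in part (c) can be checked directly from the Table 11.1 weights. The sketch below is only an illustration: it evaluates the computing formula for s and compares it with the standard library's `statistics.stdev`, which also divides by n − 1. The variable names are hypothetical.

```python
import math
import statistics

# Weights (mg) of the 10 Xenical capsules from Table 11.1
weights = [120.94, 118.58, 119.41, 120.23, 121.13,
           118.22, 119.71, 121.09, 120.56, 119.11]

n = len(weights)
sum_x = sum(weights)                    # sum of the observations
sum_x2 = sum(x * x for x in weights)    # sum of the squared observations

# Computing formula: s = sqrt((sum of x^2 - (sum of x)^2 / n) / (n - 1))
s_shortcut = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Library value for comparison (sample standard deviation, divisor n - 1)
s_direct = statistics.stdev(weights)

print(f"sum x = {sum_x:.2f}, sum x^2 = {sum_x2:.4f}")
print(f"s (computing formula) = {s_shortcut:.3f} mg, s (statistics.stdev) = {s_direct:.3f} mg")
```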
Is this value of s “too much smaller” than 2.0 mg, suggesting that the null hypothesis be rejected? Or can the difference between s = 1.055 mg and the null hypothesis value of σ = 2.0 mg be attributed to sampling error? To answer these questions, we need to know the distribution of the variable s, that is, the distribution of all possible sample standard deviations that could be obtained by sampling 10 Xenical capsules. We examine that distribution and then return to complete the hypothesis test.
Sampling Distribution of the Sample Standard Deviation
Recall that to perform a hypothesis test with null hypothesis H₀: μ = μ₀ for the mean, μ, of a normally distributed variable, we do not use the variable x̄ as the test statistic; rather, we use the variable
$$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}. $$
Similarly, when performing a hypothesis test with null hypothesis H₀: σ = σ₀ for the standard deviation, σ, of a normally distributed variable, we do not use the variable s as the test statistic; rather, we use a modified version of that variable:
$$ \chi^2 = \frac{n-1}{\sigma_0^2}\, s^2. $$
This variable has a chi-square distribution.
†Another approach would be to let the null hypothesis be H₀: σ = 2.0 mg (not too much weight variation) and the alternative hypothesis be Hₐ: σ > 2.0 mg (too much weight variation). Then rejection of the null hypothesis would indicate that the variation in capsule weights is unacceptable.
KEY FACT 11.2 The Sampling Distribution of the Sample Standard Deviation†
Suppose that a variable of a population is normally distributed with standard deviation σ. Then, for samples of size n, the variable
$$ \chi^2 = \frac{n-1}{\sigma^2}\, s^2 $$
has the chi-square distribution with n − 1 degrees of freedom.
Applet 11.1
EXAMPLE 11.5 The Sampling Distribution of the Sample Standard Deviation
Xenical Capsules In Example 11.4, suppose that the capsule weights are normally distributed with mean 120 mg and standard deviation 2.0 mg. Then, according to Key Fact 11.2, for samples of size 10, the variable
$$ \chi^2 = \frac{n-1}{\sigma^2}\, s^2 = \frac{10-1}{(2.0)^2}\, s^2 = 2.25\, s^2 $$
has a chi-square distribution with 9 degrees of freedom. Use simulation to make that fact plausible.
Solution We first simulated 1000 samples of 10 capsule weights each, that is, 1000 samples of 10 observations each of a normally distributed variable with mean 120 and standard deviation 2.0. Then, for each of those 1000 samples, we determined the sample standard deviation, s, and obtained the value of the variable χ² displayed above. Output 11.1 shows a histogram of those 1000 values of χ², which is shaped like the superimposed χ²-curve with df = 9.
OUTPUT 11.1 Histogram of χ² for 1000 samples of 10 capsule weights with superimposed χ²-curve (horizontal axis: CHISQ, from 0 to 24)
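The simulation described in Example 11.5 can be sketched with the Python standard library. This is an illustrative reconstruction, not the software used to produce Output 11.1: the seed is arbitrary, and instead of drawing a histogram the sketch compares the simulated values with two known facts about a chi-square distribution with k degrees of freedom, namely that its mean is k and its variance is 2k (here 9 and 18).

```python
import random
import statistics

random.seed(11)  # arbitrary seed, chosen only to make the run repeatable

n, mu, sigma = 10, 120.0, 2.0   # sample size, population mean and standard deviation
num_samples = 1000

chisq_values = []
for _ in range(num_samples):
    # One simulated sample of 10 capsule weights from a normal distribution
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)                 # sample variance s^2 (divisor n - 1)
    chisq_values.append((n - 1) / sigma ** 2 * s2)   # chi-square = 2.25 * s^2

mean_chisq = statistics.mean(chisq_values)
var_chisq = statistics.variance(chisq_values)

# For df = 9 the theoretical mean is 9 and the theoretical variance is 18;
# the simulated values should be close to these, but not exactly equal.
print(f"simulated mean = {mean_chisq:.2f}, simulated variance = {var_chisq:.2f}")
```

Agreement of the simulated mean and variance with 9 and 18 makes Key Fact 11.2 plausible in the same spirit as the histogram, although it is a weaker check than comparing the full shape.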
Hypothesis Tests for a Population Standard Deviation
In light of Key Fact 11.2, for a hypothesis test with null hypothesis H₀: σ = σ₀, we can use the variable
$$ \chi^2 = \frac{n-1}{\sigma_0^2}\, s^2 $$
as the test statistic and obtain the critical value(s) from the χ²-table, Table VII. We call this hypothesis-testing procedure the one-standard-deviation χ²-test.‡
Procedure 11.1 gives a step-by-step method for performing a one-standard-deviation χ²-test by using either the critical-value approach or the P-value approach.
For the P-value approach, we could use Table VII to estimate the P-value, but to do so is awkward and tedious; thus, we recommend using statistical software.
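As an illustration of what such software does, the sketch below computes the test statistic for the Xenical data of Table 11.1 with σ₀ = 2.0 mg and then a left-tailed P-value. It rests on assumptions not spelled out in this part of the text: that the P-value for the left-tailed alternative Hₐ: σ < σ₀ is the area to the left of the observed χ² under the χ²-curve with df = n − 1, and that this area equals the regularized lower incomplete gamma function P(df/2, x/2), evaluated here by a power series. The helper name `chi2_cdf` is hypothetical.

```python
import math
import statistics

def chi2_cdf(x, df):
    """Area to the LEFT of x under the chi-square curve with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, y = df / 2.0, x / 2.0
    # Power series for the regularized lower incomplete gamma function P(a, y)
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:
        term *= y / (a + k)
        total += term
        k += 1
    return math.exp(a * math.log(y) - y - math.lgamma(a)) * total

# Weights (mg) of the 10 Xenical capsules from Table 11.1
weights = [120.94, 118.58, 119.41, 120.23, 121.13,
           118.22, 119.71, 121.09, 120.56, 119.11]
sigma0 = 2.0                       # null hypothesis value of sigma (mg)

n = len(weights)
s = statistics.stdev(weights)      # sample standard deviation, about 1.055 mg
chisq = (n - 1) / sigma0 ** 2 * s ** 2   # test statistic (n - 1) s^2 / sigma0^2

# Left-tailed P-value (assumption: alternative hypothesis is sigma < sigma0)
p_value = chi2_cdf(chisq, n - 1)

print(f"chi-square = {chisq:.3f}, df = {n - 1}, left-tailed P-value = {p_value:.4f}")
```

Table VII only brackets this area between two tabulated columns, which is why the text calls the table-based estimate awkward; a numerical routine returns the area directly.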
Unlike the z-tests and t-tests for one and two population means, the one-standard-deviation χ²-test is not robust to moderate violations of the normality assumption. In fact, it is so nonrobust that many statisticians advise against its use unless there is considerable evidence that the variable under consideration is normally distributed or very nearly so.
Consequently, before applying Procedure 11.1, construct a normal probability plot. If the plot creates any doubt about the normality of the variable under consideration, do not use Procedure 11.1.
We note that nonparametric procedures, which do not require normality, have been developed to perform inferences for a population standard deviation. If you have doubts about the normality of the variable under consideration, you can often use one of those procedures to perform a hypothesis test or find a confidence interval for a population standard deviation.
†Strictly speaking, the sampling distribution presented here is not the sampling distribution of the sample standard deviation but is the sampling distribution of a multiple of the sample variance.
‡The one-standard-deviation χ²-test is also known as the χ²-test for one population standard deviation. This test is often formulated in terms of variance instead of standard deviation.
PROCEDURE 11.1 One-Standard-Deviation χ²-Test
Purpose To perform a hypothesis test for a population standard deviation, σ
Assumptions
1. Simple random sample
2. Normal population
Step 1 The null hypothesis is H₀: σ = σ₀, and the alternative hypothesis is
Hₐ: σ ≠ σ₀ (Two tailed) or Hₐ: σ < σ₀ (Left tailed) or Hₐ: σ > σ₀ (Right tailed)
Step 2 Decide on the significance level, α.