
Elementary statistics technology update 11th edition part 2


Normal Probability Distributions

6-2 The Standard Normal


How do we design airplanes, boats, cars, and homes for safety and comfort?

Ergonomics involves the study of people fitting into their environments. Ergonomics is used in a wide variety of applications such as these: Design a doorway so that most people can walk through it without bending or hitting their head; design a car so that the dashboard is within easy reach of most drivers; design a screw bottle top so that most people have sufficient grip strength to open it; design a manhole cover so that most workers can fit through it. Good ergonomic design results in an environment that is safe, functional, efficient, and comfortable. Bad ergonomic design can result in uncomfortable, unsafe, or possibly fatal conditions. For example, the following real situations illustrate the difficulty in determining safe loads in aircraft and boats.

“We have an emergency for Air Midwest fifty-four eighty,” said pilot Katie Leslie, just before her plane crashed in Charlotte, North Carolina. The crash of the Beech plane killed all of the 21 people on board. In the subsequent investigation, the weight of the passengers was suspected as a factor that contributed to the crash. This prompted the Federal Aviation Administration to order airlines to collect weight information from randomly selected flights, so that the old assumptions about passenger weights could be updated.

Twenty passengers were killed when the Ethan Allen tour boat capsized on New York’s Lake George. Based on an assumed mean weight of 140 lb, the boat was certified to carry 50 people. A subsequent investigation showed that most of the passengers weighed more than 200 lb, and the boat should have been certified for a much smaller number of passengers.

A water taxi sank in Baltimore’s Inner Harbor. Among the 25 people on board, 5 died and 16 were injured. An investigation revealed that the safe passenger load for the water taxi was 3500 lb. Assuming a mean passenger weight of 140 lb, the boat was allowed to carry 25 passengers, but the mean of 140 lb was determined 44 years ago when people were not as heavy as they are today. (The mean weight of the 25 passengers aboard the boat that sank was found to be 168 lb.) The National Transportation and Safety Board suggested that the old estimated mean of 140 lb be updated to 174 lb, so the safe load of 3500 lb would now allow only 20 passengers instead of 25.

This chapter introduces the statistical tools that are basic to good ergonomic design. After completing this chapter, we will be able to solve problems in a wide variety of different disciplines, including ergonomics.


Review and Preview

In Chapter 2 we considered the distribution of data, and in Chapter 3 we considered some important measures of data sets, including measures of center and variation. In Chapter 4 we discussed basic principles of probability, and in Chapter 5 we presented the concept of a probability distribution. In Chapter 5 we considered only discrete probability distributions, but in this chapter we present continuous probability distributions. To illustrate the correspondence between area and probability, we begin with a uniform distribution, but most of this chapter focuses on normal distributions. Normal distributions occur often in real applications, and they play an important role in methods of inferential statistics. In this chapter we present concepts of normal distributions that will be used often in the remaining chapters of this text. Several of the statistical methods discussed in later chapters are based on concepts related to the central limit theorem, discussed in Section 6-5. Many other sections require normally distributed populations, and Section 6-7 presents methods for analyzing sample data to determine whether or not the sample appears to be from such a normally distributed population.

... on the left side and one variable x on the right side. The letters π and e represent the constant values of 3.14159... and 2.71828..., respectively. The symbols μ and σ represent fixed values for the mean and standard deviation, respectively. Once specific values are selected for μ and σ, we can graph Formula 6-1 as we would graph any equation relating x and y; the result is a continuous probability distribution with the same bell shape shown in Figure 6-1. From Formula 6-1 we see that a normal distribution is determined by the fixed values of the mean μ and standard deviation σ. And that’s all we need to know about Formula 6-1!
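The claim that the curve is fixed once the mean and standard deviation are chosen can be checked numerically. The sketch below assumes Formula 6-1 has the usual normal density form, y = e^(-(1/2)((x - μ)/σ)^2) / (σ√(2π)); the function names and the sample parameter pairs are illustrative choices, not taken from the text.

```python
import math

def normal_density(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_under_curve(mu, sigma, steps=20000):
    """Midpoint-rule estimate of the area from mu - 8*sigma to mu + 8*sigma."""
    left = mu - 8 * sigma
    width = 16 * sigma / steps
    return sum(
        normal_density(left + (i + 0.5) * width, mu, sigma) * width
        for i in range(steps)
    )

# Hypothetical parameter choices: each pair gives a different bell curve,
# but the estimated total area stays very close to 1.
for mu, sigma in [(0, 1), (100, 15)]:
    print(mu, sigma, round(area_under_curve(mu, sigma), 6))
```

Changing μ slides the curve left or right and changing σ stretches or narrows it, while the area estimate stays near 1 in every case.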


The Standard Normal Distribution

Key Concept In this section we present the standard normal distribution, which has these three properties:

1. Its graph is bell-shaped (as in Figure 6-1).

2. Its mean is equal to 0 (that is, μ = 0).

3. Its standard deviation is equal to 1 (that is, σ = 1).

In this section we develop the skill to find areas (or probabilities or relative frequencies) corresponding to various regions under the graph of the standard normal distribution. In addition, we find z scores that correspond to areas under the graph.

Uniform Distributions

The focus of this chapter is the concept of a normal probability distribution, but we begin with a uniform distribution. The uniform distribution allows us to see two very important properties:

1. The area under the graph of a probability distribution is equal to 1.

2. There is a correspondence between area and probability (or relative frequency), so some probabilities can be found by identifying the corresponding areas.

Chapter 5 considered only discrete probability distributions, but we now consider continuous probability distributions, beginning with the uniform distribution.

A continuous random variable has a uniform distribution if its values are spread evenly over the range of possibilities. The graph of a uniform distribution results in a rectangular shape.

Example 1: Home Power Supply. The Newport Power and Light Company provides electricity with voltage levels that are uniformly distributed between 123.0 volts and 125.0 volts. That is, any voltage amount between 123.0 volts and 125.0 volts is possible, and all of the possible values are equally likely. If we randomly select one of the voltage levels and represent its value by the random variable x, then x has a distribution that can be graphed as in Figure 6-2.

The Placebo Effect

It has long been believed that placebos actually help some patients. In fact, some formal studies have shown that when given a placebo (a treatment with no medicinal value), many test subjects show some improvement. Estimates of improvement rates have typically ranged between one-third and two-thirds of the patients. However, a more recent study suggests that placebos have no real effect. An article in the New England Journal of Medicine (Vol. 334, No. 21) was based on research of 114 medical studies over 50 years. The authors of the article concluded that placebos appear to have some effect only for relieving pain, but not for other physical conditions. They concluded that apart from clinical trials, the use of placebos ...


Example 2: Voltage Level. Given the uniform distribution illustrated in Figure 6-2, find the probability that a randomly selected voltage level is greater than 124.5 volts.

Figure 6-3 Using Area to Find Probability

The shaded area in Figure 6-3 represents voltage levels that are greater than 124.5 volts. Because the total area under the density curve is equal to 1, there is a correspondence between area and probability. We can find the desired probability by using areas as follows:

P(voltage greater than 124.5 volts) = area of shaded region in Figure 6-3
= 0.5 × 0.5
= 0.25

The probability of randomly selecting a voltage level greater than 124.5 volts is 0.25.
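The area calculation for the voltage example can be sketched in a few lines of Python. The function name and its clipping behavior are illustrative assumptions; the only facts used from the text are the limits of 123.0 and 125.0 volts and the rule that a rectangle's area is width times height.

```python
def uniform_probability(low, high, a, b):
    """P(a < x < b) for a uniform distribution on [low, high], as width * height."""
    height = 1 / (high - low)      # height that makes the total area equal 1
    left = max(a, low)             # clip the interval to the possible values
    right = min(b, high)
    width = max(0.0, right - left)
    return width * height

# Voltage example: uniform between 123.0 and 125.0 volts,
# probability of a level greater than 124.5 volts.
p = uniform_probability(123.0, 125.0, 124.5, 125.0)
print(p)  # 0.25
```

The same function returns 1.0 when the interval covers the whole range, which matches the requirement that the total area equal 1.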

The graph of a continuous probability distribution, such as in Figure 6-2, is called a density curve. A density curve must satisfy the following two requirements.

Requirements for a Density Curve

1. The total area under the curve must equal 1.

2. Every point on the curve must have a vertical height that is 0 or greater. (That is, the curve cannot fall below the x-axis.)

By setting the height of the rectangle in Figure 6-2 to be 0.5, we force the enclosed area to be 2 × 0.5 = 1, as required. (In general, the area of the rectangle becomes 1 when we make its height equal to the value of 1/range.) The requirement that the area must equal 1 makes solving probability problems simple, so the following statement is important:

Because the total area under the density curve is equal to 1, there is a correspondence between area and probability.


Standard Normal Distribution

The density curve of a uniform distribution is a horizontal line, so we can find the area of any rectangular region by applying this formula: Area = width × height. Because the density curve of a normal distribution has a complicated bell shape as shown in Figure 6-1, it is more difficult to find areas. However, the basic principle is the same: There is a correspondence between area and probability. In Figure 6-4 we show that for a standard normal distribution, the area under the density curve is equal to 1.

The standard normal distribution is a normal probability distribution with μ = 0 and σ = 1. The total area under its density curve is equal to 1. (See Figure 6-4.)

It is not easy to find areas in Figure 6-4, so mathematicians have calculated many different areas under the curve, and those areas are included in Table A-2 in Appendix A.

Figure 6-4 Standard Normal Distribution: Bell-Shaped Curve with μ = 0 and σ = 1

Finding Probabilities When Given z Scores

Using Table A-2 (in Appendix A and the Formulas and Tables insert card), we can find areas (or probabilities) for many different regions. Such areas can also be found using a TI-83/84 Plus calculator, or computer software such as STATDISK, Minitab, or Excel. The key features of the different methods are summarized in Table 6-1. Because calculators or computer software generally give more accurate results than Table A-2, we strongly recommend using technology. (When there are discrepancies, answers in Appendix D will generally include results based on Table A-2 as well as answers based on technology.)

If using Table A-2, it is essential to understand these points:

1. Table A-2 is designed only for the standard normal distribution, which has a mean of 0 and a standard deviation of 1.

2. Table A-2 is on two pages, with one page for negative z scores and the other page for positive z scores.

3. Each value in the body of the table is a cumulative area from the left up to a vertical boundary above a specific z score.

4. When working with a graph, avoid confusion between z scores and areas.

   z score: Distance along the horizontal scale of the standard normal distribution; refer to the leftmost column and top row of Table A-2.

   Area: Region under the curve; refer to the values in the body of Table A-2.

5. The part of the z score denoting hundredths is found across the top row of Table A-2.

Table 6-1 Methods for Finding Normal Distribution Areas

Table A-2, STATDISK, Minitab, Excel: Gives the cumulative area from the left up to a vertical line above a specific value of z.

  Table A-2: The procedure for using Table A-2 is described in the text.

  STATDISK: Select Analysis, Probability Distributions, Normal Distribution. Enter the z value, then click on Evaluate.

  Minitab: Select Calc, Probability Distributions, Normal. In the dialog box, select Cumulative Probability, Input Constant.

  Excel: Select fx, Statistical, NORMDIST. In the dialog box, enter the value and mean, the standard deviation, and “true.”

TI-83/84 Plus calculator: Gives the area between a lower and an upper value of z.

  TI-83/84 Plus: Select [2: normalcdf(], then enter the two z scores separated by a comma, as in (left z score, right z score).

CAUTION: When working with a normal distribution, avoid confusion between z scores and areas.

The following example requires that we find the probability associated with a z score less than 1.27. Begin with the z score of 1.27 by locating 1.2 in the left column; next find the value in the adjoining row of probabilities that is directly below 0.07, as shown in the following excerpt from Table A-2.

Table A-2 (continued) Cumulative Area from the LEFT

Example 3: Scientific Thermometers. The Precision Scientific Instrument Company manufactures thermometers that are supposed to give readings of 0°C at the freezing point of water. Tests on a large sample of these instruments reveal that at the freezing point of water, some thermometers give readings below 0° (denoted by negative numbers) and some give readings above 0° (denoted by positive numbers). Assume that the mean reading is 0°C and the standard deviation of the readings is 1.00°C. Also assume that the readings are normally distributed. If one thermometer is randomly selected, find the probability that, at the freezing point of water, the reading is less than 1.27°.

Figure 6-5 Finding the Area Below z = 1.27 (area = 0.8980, from Table A-2)

The probability distribution of readings is a standard normal distribution, because the readings are normally distributed with μ = 0 and σ = 1. We need to find the area in Figure 6-5 below z = 1.27. The area below z = 1.27 is equal to the probability of randomly selecting a thermometer with a reading less than 1.27°. From Table A-2 we find that this area is 0.8980.

The probability of randomly selecting a thermometer with a reading less than 1.27° (at the freezing point of water) is equal to the area of 0.8980 shown as the shaded region in Figure 6-5. Another way to interpret this result is to conclude that 89.80% of the thermometers will have readings below 1.27°.

The area (or probability) value of 0.8980 indicates that there is a probability of 0.8980 of randomly selecting a z score less than 1.27. (The following sections will consider cases in which the mean is not 0 or the standard deviation is not 1.)
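As a technology-based cross-check of the table lookup, the cumulative area to the left of z = 1.27 can be computed with Python's standard library. This is a minimal sketch that assumes Python 3.8 or later (for `statistics.NormalDist`); the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

# cdf(z) is the cumulative area from the left up to z,
# which is the quantity listed in the body of Table A-2.
area_below = standard_normal.cdf(1.27)
print(f"{area_below:.4f}")  # 0.8980
```

The `cdf` method plays the same role as the table: it always returns the cumulative area from the left.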


Example 4: Scientific Thermometers. Using the thermometers from Example 3, find the probability of randomly selecting one thermometer that reads (at the freezing point of water) above -1.23°.

We again find the desired probability by finding a corresponding area. We are looking for the area of the region that is shaded in Figure 6-6, but Table A-2 is designed to apply only to cumulative areas from the left. Referring to Table A-2 for the page with negative z scores, we find that the cumulative area from the left up to z = -1.23 is 0.1093 as shown. Because the total area under the curve is 1, we can find the shaded area by subtracting 0.1093 from 1. The result is 0.8907. Even though Table A-2 is designed only for cumulative areas from the left, we can use it to find cumulative areas from the right, as shown in Figure 6-6.

Figure 6-6 Finding the Area Above z = -1.23

Because of the correspondence between probability and area, we conclude that the probability of randomly selecting a thermometer with a reading above -1.23° at the freezing point of water is 0.8907 (which is the area to the right of z = -1.23). In other words, 89.07% of the thermometers have readings above -1.23°.
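The subtraction used to get an area from the right can be mirrored directly in code. The sketch below assumes Python 3.8 or later and uses illustrative variable names; it computes the cumulative area from the left and then subtracts it from the total area of 1.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # defaults to mean 0, standard deviation 1

# Cumulative area from the left up to z = -1.23 (what Table A-2 lists)
area_left = standard_normal.cdf(-1.23)

# Total area is 1, so the area to the right is the complement
area_right = 1 - area_left

print(f"left: {area_left:.4f}, right: {area_right:.4f}")
```

The two printed areas should agree with the table values of 0.1093 and 0.8907 to four decimal places.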

Example 4 illustrates a way that Table A-2 can be used indirectly to find a cumulative area from the right. The following example illustrates another way that we can find an area indirectly by using Table A-2.

Example 5: Scientific Thermometers. Make a random selection from the same sample of thermometers from Example 3. Find the probability that the chosen thermometer reads (at the freezing point of water) between -2.00° and 1.50°.

We are again dealing with normally distributed values having a mean of 0 and a standard deviation of 1. The probability of selecting a thermometer that reads between -2.00° and 1.50° corresponds to the shaded area in Figure 6-7. Table A-2 cannot be used to find that area directly, but we can use the table to find that z = -2.00 corresponds to the area of 0.0228, and z = 1.50 corresponds to the area of 0.9332, as shown in the figure. From Figure 6-7 we see that the shaded area is the difference between 0.9332 and 0.0228. The shaded area is therefore 0.9332 - 0.0228 = 0.9104.

Figure 6-7 Finding the Area Between Two Values: (1) the area to the left of z = -2.00 is 0.0228 (from Table A-2); (2) the total area from the left up to z = 1.50 is 0.9332 (from Table A-2); (3) area = 0.9332 - 0.0228 = 0.9104.

Using the correspondence between probability and area, we conclude that there is a probability of 0.9104 of randomly selecting one of the thermometers with a reading between -2.00° and 1.50° at the freezing point of water. Another way to interpret this result is to state that if many thermometers are selected and tested at the freezing point of water, then 0.9104 (or 91.04%) of them will read between -2.00° and 1.50°.

Example 5 can be generalized as the following rule: The area corresponding to the region between two specific z scores can be found by finding the difference between the two areas found in Table A-2. Figure 6-8 illustrates this general rule. Note that the shaded region B can be found by calculating the difference between two areas found from Table A-2: area A and B combined (found in Table A-2 as the area corresponding to z Right) and area A (found in Table A-2 as the area corresponding to z Left). Study hint: Don’t try to memorize a rule or formula for this case. Focus on understanding how Table A-2 works. If necessary, first draw a graph, shade the desired area, then think of a way to find that area given the condition that Table A-2 provides only cumulative areas from the left.

Figure 6-8 Finding the Area Between Two z Scores

Shaded area B = (areas A and B combined) - (area A)
= (area from Table A-2 using z Right) - (area from Table A-2 using z Left)
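The rule for the area between two z scores translates into a one-line difference of cumulative areas. The helper below is a hypothetical illustration (the name `area_between` is not from the text) and assumes Python 3.8 or later.

```python
from statistics import NormalDist

def area_between(z_left, z_right):
    """Area under the standard normal curve between two z scores.

    Computed as (cumulative area up to z_right) - (cumulative area up to z_left),
    the same difference of left-cumulative areas used with Table A-2.
    """
    standard_normal = NormalDist()
    return standard_normal.cdf(z_right) - standard_normal.cdf(z_left)

# Thermometer readings between -2.00 and 1.50
print(f"{area_between(-2.00, 1.50):.4f}")  # 0.9104
```

Because the function works only with cumulative areas from the left, it follows the study hint above rather than introducing a separate formula to memorize.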

Probabilities such as those in the preceding examples can also be expressed with the following notation.

Notation

P(a < z < b) denotes the probability that the z score is between a and b.

P(z > a) denotes the probability that the z score is greater than a.

P(z < a) denotes the probability that the z score is less than a.

Using this notation, we can express the result of Example 5 as P(-2.00 < z < 1.50) = 0.9104, which states in symbols that the probability of a z score falling between -2.00 and 1.50 is 0.9104. With a continuous probability distribution such as the normal distribution, the probability of getting any single exact value is 0. That is, P(z = a) = 0. For example, there is a 0 probability of randomly selecting someone and getting a person whose height is exactly 68.12345678 in. In the normal distribution, any single point on the horizontal scale is represented not by a region under the curve, but by a vertical line above the point. For P(z = a) we have a vertical line above z = a, but that vertical line by itself contains no area, so P(z = a) = 0. With any continuous random variable, the probability of any one exact value is 0, so it follows that the probability of getting a z score of at most b is equal to the probability of getting a z score less than b. It is important to correctly interpret key phrases such as at most, at least, more than, no more than, and so on.

Finding z Scores from Known Areas

So far in this section, all of the examples involving the standard normal distribution have followed the same format: Given z scores, find areas under the curve. These areas correspond to probabilities. In many cases, we have the reverse: Given the area (or probability), find the corresponding z score. In such cases, we must avoid confusion between z scores and areas. Remember, z scores are distances along the horizontal scale, whereas areas (or probabilities) are regions under the curve. (Table A-2 lists z scores in the left column and across the top row, but areas are found in the body of the table.) Also, z scores positioned in the left half of the curve are always negative. If we already know a probability and want to determine the corresponding z score, we find it as follows.

Procedure for Finding a z Score from a Known Area

1. Draw a bell-shaped curve and identify the region under the curve that corresponds to the given probability. If that region is not a cumulative region from the left, work instead with a known region that is a cumulative region from the left.

2. Using the cumulative area from the left, locate the closest probability in the body of Table A-2 and identify the corresponding z score.

When referring to Table A-2, remember that the body of the table gives cumulative areas from the left.

Example 6: Scientific Thermometers. Use the same thermometers from Example 3, with temperature readings at the freezing point of water that are normally distributed with a mean of 0°C and a standard deviation of 1.00°C. Find the temperature corresponding to P95, the 95th percentile. That is, find the temperature separating the bottom 95% from the top 5%. See Figure 6-9.

Figure 6-9 Finding the 95th Percentile

Figure 6-9 shows the z score that is the 95th percentile, with 95% of the area (or 0.95) below it. Referring to Table A-2, we search for the area of 0.95 in the body of the table and then find the corresponding z score. In Table A-2 we find the areas of 0.9495 and 0.9505, but there’s an asterisk with a special note indicating that 0.9500 corresponds to a z score of 1.645. We can now conclude that the z score in Figure 6-9 is 1.645, so the 95th percentile is the temperature reading of 1.645°C.

When tested at freezing, 95% of the readings will be less than or equal to 1.645°C, and 5% of them will be greater than or equal to 1.645°C.
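Going from a known area back to a z score is the inverse of the cumulative-area lookup. A minimal sketch, assuming Python 3.8 or later, uses `NormalDist.inv_cdf`, which takes the cumulative area from the left and returns the corresponding z score; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# The 95th percentile has a cumulative area of 0.95 to its left.
p95 = standard_normal.inv_cdf(0.95)
print(round(p95, 3))  # 1.645

# Round trip: the area to the left of that z score is 0.95 again.
print(round(standard_normal.cdf(p95), 4))  # 0.95
```

As with Table A-2, the input must be a cumulative area from the left; an area given from the right has to be converted first.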

Note that in the preceding solution, Table A-2 led to a z score of 1.645, which is midway between 1.64 and 1.65. When using Table A-2, we can usually avoid interpolation by simply selecting the closest value. Special cases are listed in the accompanying table because they are often used in a wide variety of applications. (For one of those special cases, the value of z = 2.576 gives an area slightly closer to the area of 0.9950, but z = 2.575 has the advantage of being the value midway between z = 2.57 and z = 2.58.) Except in these special cases, we can select the closest value in the table. (If a desired value is midway between two table values, select the larger value.) For z scores above 3.49, we can use 0.9999 as an approximation of the cumulative area from the left; for z scores below -3.49, we can use 0.0001 as an approximation of the cumulative area from the left.

Table A-2 Special Cases (columns: z Score, Cumulative Area from the Left)

Example 7: Scientific Thermometers. Using the same thermometers from Example 3, find the temperatures separating the bottom 2.5% and the top 2.5%.

The required z scores are shown in Figure 6-10. To find the z score located to the left, we search the body of Table A-2 for an area of 0.025. The result is z = -1.96. To find the z score located to the right, we search the body of Table A-2 for an area of 0.975. (Remember that Table A-2 always gives cumulative areas from the left.) The result is z = 1.96. The values of z = -1.96 and z = 1.96 separate the bottom 2.5% and the top 2.5%, as shown in Figure 6-10.

Figure 6-10 Finding z Scores (area = 0.025 in each tail). To find the left z score, locate the cumulative area to the left in Table A-2. To find the right z score, z = 1.96, locate 0.975 in the body of Table A-2.

When tested at freezing, 2.5% of the thermometer readings will be less than -1.96° and 2.5% of the readings will be greater than 1.96°. Another interpretation is that at the freezing point of water, 95% of all thermometer readings will fall between -1.96° and 1.96°.
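The two cutoffs for equal tails can be found with the same inverse lookup, once for each cumulative area. The helper `central_cutoffs` below is a hypothetical name introduced for illustration, and the sketch assumes Python 3.8 or later.

```python
from statistics import NormalDist

standard_normal = NormalDist()

def central_cutoffs(tail_area):
    """z scores that cut off tail_area in each tail of the standard normal curve."""
    lower = standard_normal.inv_cdf(tail_area)       # cumulative area = tail_area
    upper = standard_normal.inv_cdf(1 - tail_area)   # cumulative area = 1 - tail_area
    return lower, upper

lower, upper = central_cutoffs(0.025)
middle_area = standard_normal.cdf(upper) - standard_normal.cdf(lower)
print(round(lower, 2), round(upper, 2), round(middle_area, 4))  # -1.96 1.96 0.95
```

Note that the upper cutoff is found from the area 0.975, not 0.025, because the lookup always works with cumulative areas from the left.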


Critical Values. For a normal distribution, a critical value is a z score on the borderline separating the z scores that are likely to occur from those that are unlikely. Examples of critical values are z = -1.96 and z = 1.96, found as shown in Example 7. In Example 7, the values below z = -1.96 are not likely to occur, because they occur in only 2.5% of the readings, and the values above z = 1.96 are not likely to occur because they also occur in only 2.5% of the readings. The reference to critical values is not so important in this chapter, but will become extremely important in the following chapters. The following notation is used for critical z values found by using the standard normal distribution.

Example 8: The notation of z_0.025 is used to represent the z score with an area of 0.025 to its right. Refer to Figure 6-10 and note that the value of z = 1.96 has an area of 0.025 to its right, so z_0.025 = 1.96.

Caution: When using Table A-2 for finding a value of z_α for a particular value of α, note that α is the area to the right of z_α, but Table A-2 lists cumulative areas to the left of a given z score. To find the value of z_α by using Table A-2, resolve that conflict by using the value of 1 - α. In Example 8, the value of z_0.025 can be found by locating the area of 0.9750 in the body of the table.

The examples in this section were created so that the mean of 0 and the standard deviation of 1 coincided exactly with the properties of the standard normal distribution. In reality, it is unusual to find such convenient parameters, because typical normal distributions involve means different from 0 and standard deviations different from 1. In the next section we introduce methods for working with such normal distributions, which are much more realistic and practical.
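The conversion between an area on the right (α) and the cumulative area on the left (1 - α) is easy to get backwards, so it may help to see it written out. In this sketch the function name `critical_value` is a hypothetical choice, and Python 3.8 or later is assumed.

```python
from statistics import NormalDist

def critical_value(alpha):
    """z score with an area of alpha to its RIGHT under the standard normal curve.

    inv_cdf expects the cumulative area from the LEFT, so we pass 1 - alpha.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1 - alpha)

# Area of 0.025 to the right corresponds to a cumulative area of 0.975.
print(round(critical_value(0.025), 2))  # 1.96
```

Passing α directly to the inverse lookup would instead return the negative value -1.96, which is the z score with α to its left.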

Using Technology

When working with the standard normal distribution, a technology can be used to find z scores or areas, so the technology can be used instead of Table A-2. The following instructions describe how to find such z scores or areas.

STATDISK

Select Analysis, Probability Distributions, Normal Distribution. Either enter the z score to find corresponding areas, or enter the cumulative area from the left to find the z score. After entering a value, click on the Evaluate button. See the accompanying STATDISK display for an entry of z = 2.00.

Minitab

To find the cumulative area to the left of a z score (as in Table A-2), select Calc, Probability Distributions, Normal, Cumulative probabilities. Then enter the mean of 0 and standard deviation of 1. Click on the Input Constant button and enter the z score.

To find a z score corresponding to a known probability, select Calc, Probability Distributions, Normal. Then select Inverse cumulative probabilities and the option Input constant. For the input constant, enter the total area to the left of the given value.

Excel

To find the cumulative area to the left of a z score (as in Table A-2), click on fx, then select Statistical, NORMSDIST, and enter the z score. (In Excel 2010, select NORM.S.DIST.)

To find a z score corresponding to a known probability, select fx, Statistical, NORMSINV, and enter the total area to the left of the given value. (In Excel 2010, select NORM.S.INV.)

TI-83/84 Plus

To find the area between two z scores, press 2nd VARS and select normalcdf. Proceed to enter the two z scores separated by a comma, as in (left z score, right z score). Example 5 could be solved with the command of normalcdf(-2.00, 1.50), which yields a probability of 0.9104 (rounded) as shown in the accompanying screen.

To find a z score corresponding to a known probability, press 2nd VARS and select invNorm. Proceed to enter the total area to the left of the z score. For example, the command of invNorm(0.975) yields a z score of 1.959963986, which is rounded to 1.96, as in Example 7.

6-2 Basic Skills and Concepts

Statistical Literacy and Critical Thinking

1. Normal Distribution. When we refer to a “normal” distribution, does the word “normal” have the same meaning as in ordinary language, or does it have a special meaning in statistics? What exactly is a normal distribution?

2. Normal Distribution. A normal distribution is informally described as a probability distribution that is “bell-shaped” when graphed. Describe the “bell shape.”

3. Standard Normal Distribution. What requirements are necessary for a normal probability distribution to be a standard normal probability distribution?

4. Notation. What does the notation z_α indicate?

Continuous Uniform Distribution. In Exercises 5–8, refer to the continuous uniform distribution depicted in Figure 6-2. Assume that a voltage level between 123.0 volts and 125.0 volts is randomly selected, and find the probability that the given voltage level is selected.

5. Greater than 124.0 volts

6. Less than 123.5 volts

7. Between 123.2 volts and 124.7 volts

8. Between 124.1 volts and 124.5 volts


Standard Normal Distribution. In Exercises 9–12, find the area of the shaded region. The graph depicts the standard normal distribution with mean 0 and standard deviation 1.

[Graphs for Exercises 9–12]

Standard Normal Distribution. In Exercises 13–16, find the indicated z score. The graph depicts the standard normal distribution with mean 0 and standard deviation 1.

[Graphs for Exercises 13–16]

Standard Normal Distribution. In Exercises 17–36, assume that thermometer readings are normally distributed with a mean of 0°C and a standard deviation of 1.00°C. A thermometer is randomly selected and tested. In each case, draw a sketch, and find the probability of each reading. (The given values are in Celsius degrees.) If using technology instead of Table A-2, round answers to four decimal places.

17. Less than -1.50
18. Less than -2.75
19. Less than 1.23
20. Less than 2.34
21. Greater than 2.22
22. Greater than 2.33
23. Greater than -1.75
24. Greater than -1.96
25. Between 0.50 and 1.00
26. Between 1.00 and 3.00
27. Between -3.00 and -1.00
28. Between -1.00 and -0.50
29. Between -1.20 and 1.95
30. Between -2.87 and 1.34
31. Between -2.50 and 5.00
32. Between -4.50 and 1.00
33. Less than 3.55
34. Greater than 3.68
35. Greater than 0
36. Less than 0

Basis for the Range Rule of Thumb and the Empirical Rule. In Exercises 37–40, find the indicated area under the curve of the standard normal distribution, then convert it to a percentage and fill in the blank. The results form the basis for the range rule of thumb and the empirical rule introduced in Section 3-3.

37. About ___% of the area is between z = -1 and z = 1 (or within 1 standard deviation of the mean).

38. About ___% of the area is between z = -2 and z = 2 (or within 2 standard deviations of the mean).

39. About ___% of the area is between z = -3 and z = 3 (or within 3 standard deviations of the mean).

40. About ___% of the area is between z = -3.5 and z = 3.5 (or within 3.5 standard deviations of the mean).

Finding Critical Values. In Exercises 41–44, find the indicated value.

41. 42. 43. 44.

Finding Probability. In Exercises 45–48, assume that the readings on the thermometers are normally distributed with a mean of 0°C and a standard deviation of 1.00°C. Find the indicated probability, where z is the reading in degrees.

45. 46.

Finding Temperature Values. In Exercises 49–52, assume that thermometer readings are normally distributed with a mean of 0°C and a standard deviation of 1.00°C. A thermometer is randomly selected and tested. In each case, draw a sketch, and find the temperature reading corresponding to the given information.

49. Find P95, the 95th percentile. This is the temperature reading separating the bottom 95% from the top 5%.

50. Find P1, the 1st percentile. This is the temperature reading separating the bottom 1% from the top 99%.

51. If 2.5% of the thermometers are rejected because they have readings that are too high and another 2.5% are rejected because they have readings that are too low, find the two readings that are cutoff values separating the rejected thermometers from the others.

52. If 0.5% of the thermometers are rejected because they have readings that are too low and another 0.5% are rejected because they have readings that are too high, find the two readings that are cutoff values separating the rejected thermometers from the others.

Beyond the Basics

53 For a standard normal distribution, find the percentage of data that are

a within 2 standard deviations of the mean

b more than 1 standard deviation away from the mean

c more than 1.96 standard deviations away from the mean

d between and

e more than 3 standard deviations away from the mean

54. If a continuous uniform distribution has parameters of and then the minimum is and the maximum is

a For this distribution, find

b Find if you incorrectly assume that the distribution is normal instead of


Applications of Normal Distributions

Key Concept In this section we introduce real and important applications involving nonstandard normal distributions by extending the procedures presented in Section 6-2. We use a simple conversion (Formula 6-2) that allows us to standardize any normal distribution so that the methods of the preceding section can be used with normal distributions having a mean that is not 0 or a standard deviation that is not 1. Specifically, given some nonstandard normal distribution, we should be able to find probabilities corresponding to values of the variable x, and given some probability value, we should be able to find the corresponding value of the variable x.

To work with a nonstandard normal distribution, we simply standardize values to use the procedures from Section 6-2.

If we convert values to standard z scores using Formula 6-2, then procedures for working with all normal distributions are the same as those for the standard normal distribution.

Formula 6-2

$z = \frac{x - \mu}{\sigma}$ (round z scores to 2 decimal places)

Some calculators and computer software programs do not require the above conversion to z scores because probabilities can be found directly. However, if you use Table A-2 to find probabilities, you must first convert values to standard z scores. Regardless of the method you use, you need to clearly understand the above principle, because it is an important foundation for concepts introduced in the following chapters.

Figure 6-11 illustrates the conversion from a nonstandard to a standard normal distribution. The area in any normal distribution bounded by some score x (as in Figure 6-11(a)) is the same as the area bounded by the equivalent z score in the standard normal distribution (as in Figure 6-11(b)). This means that when working with a nonstandard normal distribution, you can use Table A-2 the same way it was used in Section 6-2, as long as you first convert the values to z scores.
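As a quick check of this principle, the following Python sketch standardizes one value and compares the two areas. It borrows the IQ distribution used in this section's exercises (mean 100, standard deviation 15). The value x = 130 and the variable names are assumptions made only for illustration, and the standard-library class `statistics.NormalDist` stands in for Table A-2.

```python
from statistics import NormalDist

mu, sigma = 100, 15      # IQ scores: mean and standard deviation from the exercises
x = 130                  # illustrative value, chosen only for this sketch

# Formula 6-2: convert x to a z score, rounded to 2 decimal places.
z = round((x - mu) / sigma, 2)

# Area to the left of x in the nonstandard normal distribution.
area_x = NormalDist(mu, sigma).cdf(x)
# Area to the left of the equivalent z score in the standard normal distribution.
area_z = NormalDist().cdf(z)

print(f"z = {z:.2f}, area from x = {area_x:.4f}, area from z = {area_z:.4f}")
```

Here z works out to 2.00 and both areas are about 0.9772, which is the Table A-2 entry for z = 2.00.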

55 Assume that z scores are normally distributed with a mean of 0 and a standard

56 In a continuous uniform distribution,

Find the mean and standard deviation for the uniform distribution represented in Figure 6-2

Figure 6-11 Converting from a Nonstandard to a Standard Normal Distribution: (a) nonstandard normal distribution with area P bounded by x; (b) standard normal distribution with the same area P bounded by $z = \frac{x - \mu}{\sigma}$

When finding areas with a nonstandard normal distribution, use this procedure:

1. Sketch a normal curve, label the mean and the specific x values, then shade the region representing the desired probability.

2. For each relevant value x that is a boundary for the shaded region, use Formula 6-2 to convert that value to the equivalent z score.

3. Refer to Table A-2 or use a calculator or computer software to find the area of the shaded region. This area is the desired probability.

The following example applies these three steps to illustrate the relationship between a typical nonstandard normal distribution and the standard normal distribution.

Example 1: Why Do Doorways Have a Height of 6 ft 8 in.?

The typical home doorway has a height of 6 ft 8 in., or 80 in. Because men tend to be taller than women, we will consider only men as we investigate the limitations of that standard doorway height. Given that heights of men are normally distributed with a mean of 69.0 in. and a standard deviation of 2.8 in., find the percentage of men who can fit through the standard doorway without bending or bumping their head. Is that percentage high enough to continue using 80 in. as the standard height? Will a doorway height of 80 in. be sufficient in future years?

Step 1: See Figure 6-12, which incorporates this information: Men have heights that are normally distributed with a mean of 69.0 in. and a standard deviation of 2.8 in. The shaded region represents the men who can fit through a doorway that has a height of 80 in.

Figure 6-12 Heights (in inches) of Men

Step 2: To use Table A-2, we first must use Formula 6-2 to convert from the nonstandard normal distribution to the standard normal distribution. The height of 80 in. is converted to a z score as follows:

$z = \frac{x - \mu}{\sigma} = \frac{80 - 69.0}{2.8} = 3.93$

Step 3: Referring to Table A-2 and using z = 3.93, we find that this z score is in the category of “3.50 and up,” so the cumulative area to the left of 80 in. is 0.9999, as shown in Figure 6-12.

If we use technology instead of Table A-2, we get the more accurate cumulative area of 0.999957 (instead of 0.9999).

The proportion of men who can fit through the standard doorway height of 80 in. is 0.9999, or 99.99%. Very few men will not be able to fit through the doorway without bending or bumping their head. This percentage is high enough to justify the use of 80 in. as the standard doorway height. However, heights of men and women have been increasing gradually but steadily over the past decades, so the time may come when the standard doorway height of 80 in. may no longer be adequate.

Evelyn Marie Adams won the New Jersey Lottery twice in four months. This happy event was reported as having a likelihood of only 1 chance in 17 trillion. But Harvard mathematicians Persi Diaconis and Frederick Mosteller note that there is 1 chance in 17 trillion that a particular person with one ticket in each of two New Jersey lotteries will win both times. However, there is about 1 chance in 30 that someone in the United States will win a lottery twice in a four-month period. Diaconis and Mosteller analyzed coincidences and conclude that “with a large enough sample, any outrageous thing is apt to happen.” More recently, according to the Detroit News, Joe and Dolly Hornick won the Pennsylvania lottery four times in 12 years for prizes of $2.5 million, $68,000, $206,217, and $71,037.

Example 2: Birth Weights

Birth weights in the United States are normally distributed with a mean of 3420 g and a standard deviation of 495 g. The Newport General Hospital requires special treatment for babies that are less than 2450 g (unusually light) or more than 4390 g (unusually heavy). What is the percentage of babies who do not require special treatment because they have birth weights between 2450 g and 4390 g? Under these conditions, do many babies require special treatment?

Figure 6-13 shows the shaded region representing birth weights between 2450 g and 4390 g. We can’t find that shaded area directly from Table A-2, but we can find it indirectly by using the same basic procedures presented in Section 6-2, as follows: (1) Find the cumulative area from the left up to 2450; (2) find the cumulative area from the left up to 4390; (3) find the difference between those two areas.

Figure 6-13 Birth Weights

Find the cumulative area up to 2450:

$z = \frac{x - \mu}{\sigma} = \frac{2450 - 3420}{495} = -1.96$

Using Table A-2, we find that z = -1.96 corresponds to an area of 0.0250, as shown in Figure 6-13.

Find the cumulative area up to 4390:

$z = \frac{x - \mu}{\sigma} = \frac{4390 - 3420}{495} = 1.96$

Using Table A-2, we find that z = 1.96 corresponds to an area of 0.9750, as shown in Figure 6-13.

Find the shaded area between 2450 and 4390:

Shaded area = 0.9750 - 0.0250 = 0.9500

Expressed as a percentage, 95.00% of the babies do not require special treatment because they have birth weights between 2450 g and 4390 g. It follows that 5.00% of the babies do require special treatment because they are unusually light or heavy. The 5.00% rate is probably not too high for typical hospitals.
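The doorway and birth-weight areas can also be checked with software. The sketch below is one way to do that in Python, assuming the standard-library class `statistics.NormalDist` as the technology. The variable names are illustrative and are not part of the text.

```python
from statistics import NormalDist

# Doorway example: heights of men in inches (mean 69.0, standard deviation 2.8).
heights = NormalDist(mu=69.0, sigma=2.8)
p_fit = heights.cdf(80)  # cumulative area to the left of x = 80

# Birth-weight example: weights in grams (mean 3420, standard deviation 495).
weights = NormalDist(mu=3420, sigma=495)
# Area between two values = difference of the two cumulative left areas.
p_between = weights.cdf(4390) - weights.cdf(2450)

print(f"P(height < 80 in.) = {p_fit:.6f}")
print(f"P(2450 g < weight < 4390 g) = {p_between:.4f}")
```

The first result should agree with the technology value of 0.999957 quoted for the doorway example, and the second should be very close to 0.9500. Any small difference from the table-based answers comes from rounding z scores to two decimal places.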

Finding Values from Known Areas

Here are helpful hints for those cases in which the area (or probability or percentage) is known and we must find the relevant value(s):

1. Don’t confuse z scores and areas. Remember, z scores are distances along the horizontal scale, but areas are regions under the normal curve. Table A-2 lists z scores in the left columns and across the top row, but areas are found in the body of the table.

2. Choose the correct side of the graph. A value separating the top 10% from the others will be located on the right side of the graph, but a value separating the bottom 10% will be located on the left side of the graph.

3. A z score must be negative whenever it is located in the left half of the normal distribution.

4. Areas (or probabilities) are positive or zero values, but they are never negative.

Graphs are extremely helpful in visualizing, understanding, and successfully working with normal probability distributions, so they should be used whenever possible.

Procedure for Finding Values Using Table A-2 and Formula 6-2

1. Sketch a normal distribution curve, enter the given probability or percentage in the appropriate region of the graph, and identify the x value(s) being sought.

2. Use Table A-2 to find the z score corresponding to the cumulative left area bounded by x. Refer to the body of Table A-2 to find the closest area, then identify the corresponding z score.

3. Using Formula 6-2, enter the values for $\mu$, $\sigma$, and the z score found in Step 2, then solve for x. Based on Formula 6-2, we can solve for x as follows:

$x = \mu + (z \cdot \sigma)$ (another form of Formula 6-2)

(If z is located to the left of the mean, be sure that it is a negative number.)

4. Refer to the sketch of the curve to verify that the solution makes sense in the context of the graph and in the context of the problem.

The following example uses the procedure just outlined.

Example 3: Designing Doorway Heights

When designing an environment, one common criterion is to use a design that accommodates 95% of the population. How high should doorways be if 95% of men will fit through without bending or bumping their head? That is, find the 95th percentile of heights of men. Heights of men are normally distributed with a mean of 69.0 in. and a standard deviation of 2.8 in.

Step 1: Figure 6-14 shows the normal distribution with the height x that we want to identify. The shaded area represents the 95% of men who can fit through the doorway that we are designing.

Figure 6-14 Finding Height

Step 2: In Table A-2 we search for an area of 0.9500 in the body of the table. (The area of 0.9500 shown in Figure 6-14 is a cumulative area from the left, and that is exactly the type of area listed in Table A-2.) The area of 0.9500 is between the Table A-2 areas of 0.9495 and 0.9505, but there is an asterisk and footnote indicating that an area of 0.9500 corresponds to z = 1.645.

Step 3: With z = 1.645, $\mu = 69.0$, and $\sigma = 2.8$, we solve for x:

$x = \mu + (z \cdot \sigma) = 69.0 + (1.645 \cdot 2.8) = 73.606$

A doorway height of 73.6 in. (rounded) would allow 95% of men to fit without bending or bumping their head. It follows that 5% of men would not fit through a doorway with a height of 73.6 in. Because so many men walk through doorways so often, this 5% rate is probably not practical.
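The procedure for finding a value from a known area can be sketched in Python as follows, using the doorway-design numbers (mean 69.0 in., standard deviation 2.8 in., cumulative left area 0.95). This is a minimal illustration rather than the book's method: `statistics.NormalDist().inv_cdf` is assumed as a replacement for the Table A-2 lookup in Step 2, and the variable names are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 69.0, 2.8    # heights of men in inches
area_left = 0.95         # cumulative area to the left of the unknown x

# Step 2, done by software instead of Table A-2:
# z score whose cumulative left area is 0.95.
z = NormalDist().inv_cdf(area_left)

# Step 3: x = mu + (z * sigma), the rearranged form of Formula 6-2.
x = mu + z * sigma

# Same value obtained directly from the nonstandard distribution.
x_direct = NormalDist(mu, sigma).inv_cdf(area_left)

print(f"z = {z:.3f}, x = {x:.1f} in.")
```

The z score rounds to 1.645 and x rounds to 73.6 in., matching the 73.6 in. doorway height discussed in the example. Computing `x_direct` shows that software can skip the explicit z-score step, although the underlying relationship is still Formula 6-2.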

Example 4: Birth Weights

The Newport General Hospital wants to redefine the minimum and maximum birth weights that require special treatment because they are unusually low or unusually high. After considering relevant factors, a committee recommends special treatment for birth weights in the lowest 3% and the highest 1%. The committee members soon realize that specific birth weights need to be identified. Help this committee by finding the birth weights that separate the lowest 3% and the highest 1%. Birth weights in the United States are normally distributed with a mean of 3420 g and a standard deviation of 495 g.

Step 1: We begin with the graph shown in Figure 6-15. We have entered the mean of 3420 g, and we have identified the x values separating the lowest 3% and the highest 1%.

Step 2: If using Table A-2, we must use cumulative areas from the left. For the leftmost value of x, the cumulative area from the left is 0.03, so search for an area of 0.03 in the body of the table to get z = -1.88 (which corresponds to the closest area of 0.0301). For the rightmost value of x, the cumulative area from the left is 0.99, so search for an area of 0.99 in the body of the table to get z = 2.33 (which corresponds to the closest area of 0.9901).

Step 3: We now solve for the two values of x by using Formula 6-2 directly or by using the following version of Formula 6-2:

Leftmost value of x: $x = \mu + (z \cdot \sigma) = 3420 + (-1.88 \cdot 495) = 2489.4$

Rightmost value of x: $x = \mu + (z \cdot \sigma) = 3420 + (2.33 \cdot 495) = 4573.35$

Step 4: Referring to Figure 6-15, we see that the leftmost value of x = 2489.4 is reasonable because it is less than the mean of 3420 g. Also, the rightmost value of 4573.35 is reasonable because it is above the mean of 3420 g. (Technology yields the values of 2489.0 g and 4571.5 g.)

The birth weight of 2489 g (rounded) separates the lowest 3% of birth weights, and 4573 g (rounded) separates the highest 1% of birth weights. The hospital now has well-defined criteria for determining whether a newborn baby should be given special treatment for a birth weight that is unusually low or high.
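The small gap between the table-based cutoffs and the technology values comes from rounding the z scores. The Python sketch below compares the two routes for the birth-weight cutoffs. It assumes the two-decimal z scores -1.88 and 2.33 for the Table A-2 route and uses `statistics.NormalDist` as the technology. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 3420, 495    # birth weights in grams

# Table A-2 route: z scores rounded to two decimals, then x = mu + (z * sigma).
x_low_table = mu + (-1.88 * sigma)
x_high_table = mu + (2.33 * sigma)

# Technology route: invert the cumulative distribution without rounding z.
births = NormalDist(mu, sigma)
x_low_exact = births.inv_cdf(0.03)    # separates the lowest 3%
x_high_exact = births.inv_cdf(0.99)   # separates the highest 1%

print(f"Table A-2:  {x_low_table:.2f} g and {x_high_table:.2f} g")
print(f"Technology: {x_low_exact:.1f} g and {x_high_exact:.1f} g")
```

The table route reproduces 2489.4 g and 4573.35 g, while the unrounded route gives values near 2489.0 g and 4571.5 g. The difference of a gram or two is unlikely to matter for the hospital's criteria, but it explains why answers from Table A-2 and from software do not always match exactly.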

When using the methods of this section with applications involving a normal distribution, it is important to first determine whether you are finding a probability (or area) from a known value of x or finding a value of x from a known probability (or area). Figure 6-16 is a flowchart summarizing the main procedures of this section.

Figure 6-16 Procedures for Applications with Normal Distributions (flowchart with two branches: find a probability from a known value of x, or find a value of x from a known probability or area; each branch then asks whether you are using technology or Table A-2)

USING TECHNOLOGY

When working with a nonstandard normal distribution, technology can be used to find areas or values of the relevant variable, so the technology can be used instead of Table A-2. The following instructions describe how to use technology for such cases.

STATDISK

Select Analysis, Probability Distributions, Normal Distribution. Either enter the z score to find corresponding areas, or enter the cumulative area from the left to find the z score. After entering a value, click on the Evaluate button.

MINITAB

• To find the cumulative area to the left of a z score (as in Table A-2), select Calc, Probability Distributions, Normal, Cumulative probabilities. Enter the mean and standard deviation, then click on the Input Constant button and enter the value.

• To find a value corresponding to a known area, select Calc, Probability Distributions, Normal, then select Inverse cumulative probabilities. Enter the mean and standard deviation. Select the option Input constant and enter the total area to the left of the given value.

EXCEL

• To find the cumulative area to the left of a value (as in Table A-2), click on fx, then select Statistical, NORMDIST. (In Excel 2010, select NORM.DIST.) In the dialog box, enter the value for x, enter the mean and standard deviation, and enter 1 in the “cumulative” space.

• To find a value corresponding to a known area, select fx, Statistical, NORMINV (or NORM.INV in Excel 2010), and proceed to make the entries in the dialog box. When entering the probability value, enter the total area to the left of the given value. See the accompanying Excel display for Example 3.

TI-83/84 PLUS

• To find the area between two values, press 2nd, VARS, 2 (for normalcdf), then proceed to enter the two values, the mean, and the standard deviation, all separated by commas, as in (left value, right value, mean, standard deviation). Hint: If there is no left value, enter the left value as -999999, and if there is no right value, enter the right value as 999999. In Example 1 we want the area to the left of x = 80 in., so use the command normalcdf(-999999, 80, 69.0, 2.8) as shown in the accompanying screen display.

• To find a value corresponding to a known area, press 2nd, VARS, then select invNorm, and proceed to enter the total area to the left of the value, the mean, and the standard deviation in the format of (total area to the left, mean, standard deviation) with the commas included.

6-3 Basic Skills and Concepts

Statistical Literacy and Critical Thinking

1. Normal Distributions What is the difference between a standard normal distribution and a nonstandard normal distribution?

2. IQ Scores The distribution of IQ scores is a nonstandard normal distribution with a mean of 100 and a standard deviation of 15, and a bell-shaped graph is drawn to represent this distribution.

a. What is the area under the curve?

b. What is the value of the median?

c. What is the value of the mode?

3. Normal Distributions The distribution of IQ scores is a nonstandard normal distribution with a mean of 100 and a standard deviation of 15. What are the values of the mean and standard deviation after all IQ scores have been standardized by converting them to z scores using $z = \frac{x - \mu}{\sigma}$?

4. Random Digits Computers are often used to randomly generate digits of telephone numbers to be called when conducting a survey. Can the methods of this section be used to find the probability that when one digit is randomly generated, it is less than 5? Why or why not? What is the probability of getting a digit less than 5?

IQ Scores In Exercises 5–8, find the area of the shaded region The graphs depict

IQ scores of adults, and those scores are normally distributed with a mean of 100 and a standard deviation of 15 (as on the Wechsler test).

IQ Scores In Exercises 9–12, find the indicated IQ score The graphs depict IQ

scores of adults, and those scores are normally distributed with a mean of 100 and a standard deviation of 15 (as on the Wechsler test).

IQ Scores In Exercises 13–20, assume that adults have IQ scores that are normally distributed with a mean of 100 and a standard deviation of 15 (as on the Wechsler test). (Hint: Draw a graph in each case.)

13. Find the probability that a randomly selected adult has an IQ that is less than 115.

14. Find the probability that a randomly selected adult has an IQ greater than 131.5 (the requirement for membership in the Mensa organization).

15. Find the probability that a randomly selected adult has an IQ between 90 and 110 (referred to as the normal range).

16. Find the probability that a randomly selected adult has an IQ between 110 and 120 (referred to as bright normal).

17. Find $P_{30}$, which is the IQ score separating the bottom 30% from the top 70%.

18. Find the first quartile $Q_1$, which is the IQ score separating the bottom 25% from the top 75%.

19. Find the third quartile $Q_3$, which is the IQ score separating the top 25% from the others.

20. Find the IQ score separating the top 37% from the others.

In Exercises 21–26, use this information (based on data from the National

Health Survey):

Men’s heights are normally distributed with mean 69.0 in and standard deviation 2.8 in.

Women’s heights are normally distributed with mean 63.6 in and standard deviation 2.5 in.

21. Doorway Height The Mark VI monorail used at Disney World and the Boeing 757-200 ER airliner have doors with a height of 72 in.

a What percentage of adult men can fit through the doors without bending?

b What percentage of adult women can fit through the doors without bending?

c Does the door design with a height of 72 in appear to be adequate? Explain

d What doorway height would allow 98% of adult men to fit without bending?

22. Doorway Height The Gulfstream 100 is an executive jet that seats six, and it has a doorway height of 51.6 in.

a What percentage of adult men can fit through the door without bending?

b What percentage of adult women can fit through the door without bending?

c. Does the door design with a height of 51.6 in. appear to be adequate? Why didn’t the engineers design a larger door?

d What doorway height would allow 60% of men to fit without bending?

23 Tall Clubs International Tall Clubs International is a social organization for tall

people It has a requirement that men must be at least 74 in tall, and women must be at least

70 in tall

a What percentage of men meet that requirement?

b What percentage of women meet that requirement?

c Are the height requirements for men and women fair? Why or why not?

24. Tall Clubs International Tall Clubs International has minimum height requirements for men and women.

a If the requirements are changed so that the tallest 4% of men are eligible, what is the new

minimum height for men?

b If the requirements are changed so that the tallest 4% of women are eligible, what is the

new minimum height for women?

25. U.S. Army Height Requirements for Women The U.S. Army requires women’s heights to be between 58 in. and 80 in.

a Find the percentage of women meeting the height requirement Are many women being

denied the opportunity to join the Army because they are too short or too tall?

b If the U.S Army changes the height requirements so that all women are eligible except the

shortest 1% and the tallest 2%, what are the new height requirements?

26. Marine Corps Height Requirement for Men The U.S. Marine Corps requires that men have heights between 64 in. and 80 in.

a Find the percentage of men who meet the height requirements Are many men denied the

opportunity to become a Marine because they do not satisfy the height requirements?

b If the height requirements are changed so that all men are eligible except the shortest 3%

and the tallest 4%, what are the new height requirements?

27. Birth Weights Birth weights in Norway are normally distributed with a mean of 3570 g and a standard deviation of 500 g.

a If the Ulleval University Hospital in Oslo requires special treatment for newborn babies

weighing less than 2700 g, what is the percentage of newborn babies requiring special treatment?


b. If the Ulleval University Hospital officials plan to require special treatment for the lightest 3% of newborn babies, what birth weight separates those requiring special treatment from those who do not?

c. Why is it not practical for the hospital to simply state that babies require special treatment if they are in the bottom 3% of birth weights?

28. Weights of Water Taxi Passengers It was noted in the Chapter Problem that when a water taxi sank in Baltimore’s Inner Harbor, an investigation revealed that the safe passenger load for the water taxi was 3500 lb. It was also noted that the mean weight of a passenger was assumed to be 140 lb. Assume a “worst case” scenario in which all of the passengers are adult men. (This could easily occur in a city that hosts conventions in which people of the same gender often travel in groups.) Based on data from the National Health and Nutrition Examination Survey, assume that weights of men are normally distributed with a mean of 172 lb and a standard deviation of 29 lb.

a. If one man is randomly selected, find the probability that he weighs less than 174 lb (the new value suggested by the National Transportation and Safety Board).

b. With a load limit of 3500 lb, how many men passengers are allowed if we assume a mean weight of 140 lb?

c. With a load limit of 3500 lb, how many men passengers are allowed if we use the new mean weight of 174 lb?

d. Why is it necessary to periodically review and revise the number of passengers that are allowed to board?

29. Body Temperatures Based on the sample results in Data Set 2 of Appendix B, assume that human body temperatures are normally distributed with a mean of 98.20°F and a standard deviation of 0.62°F.

a. Bellevue Hospital in New York City uses 100.6°F as the lowest temperature considered to be a fever. What percentage of normal and healthy persons would be considered to have a fever? Does this percentage suggest that a cutoff of 100.6°F is appropriate?

b. Physicians want to select a minimum temperature for requiring further medical tests. What should that temperature be, if we want only 5.0% of healthy people to exceed it? (Such a result is a false positive, meaning that the test result is positive, but the subject is not really sick.)

30. Aircraft Seat Width Engineers want to design seats in commercial aircraft so that they are wide enough to fit 99% of all males. (Accommodating 100% of males would require very wide seats that would be much too expensive.) Men have hip breadths that are normally distributed with a mean of 14.4 in. and a standard deviation of 1.0 in. (based on anthropometric survey data from Gordon, Clauser, et al.). Find $P_{99}$. That is, find the hip breadth for men that separates the smallest 99% from the largest 1%.

31. Lengths of Pregnancies The lengths of pregnancies are normally distributed with a mean of 268 days and a standard deviation of 15 days.

a. One classical use of the normal distribution is inspired by a letter to “Dear Abby” in which a wife claimed to have given birth 308 days after a brief visit from her husband, who was serving in the Navy. Given this information, find the probability of a pregnancy lasting 308 days or longer. What does the result suggest?

b. If we stipulate that a baby is premature if the length of pregnancy is in the lowest 4%, find the length that separates premature babies from those who are not premature. Premature babies often require special care, and this result could be helpful to hospital administrators in planning for that care.

32. Sitting Distance A common design requirement is that an item (such as an aircraft or theater seat) must fit the range of people who fall between the 5th percentile for women and the 95th percentile for men. If this requirement is adopted, what is the minimum sitting distance and what is the maximum sitting distance? For the sitting distance, use the buttock-to-knee length. Men have buttock-to-knee lengths that are normally distributed with a mean of 23.5 in. and a standard deviation of 1.1 in. Women have buttock-to-knee lengths that are normally distributed with a mean of 22.7 in. and a standard deviation of 1.0 in.

Large Data Sets In Exercises 33 and 34, refer to the data sets in Appendix B and

use computer software or a calculator.

33. Appendix B Data Set: Systolic Blood Pressure Refer to Data Set 1 in Appendix B and use the systolic blood pressure levels for males.

a Using the systolic blood pressure levels for males, find the mean and standard deviation,

and verify that the data have a distribution that is roughly normal

b Assuming that systolic blood pressure levels of males are normally distributed, find the 5th

percentile and the 95th percentile (Treat the statistics from part (a) as if they were population

parameters.) Such percentiles could be helpful when physicians try to determine whether

blood pressure levels are too low or too high

34. Appendix B Data Set: Duration of Shuttle Flights Refer to Data Set 10 in Appendix B and use the durations (hours) of the NASA shuttle flights.

a Find the mean and standard deviation, and verify that the data have a distribution that is

roughly normal

b Treat the statistics from part (a) as if they are population parameters and assume a normal

distribution to find the values of the quartiles Q1, Q2, and Q3

Beyond the Basics

35. Units of Measurement Heights of women are normally distributed.

a If heights of individual women are expressed in units of centimeters, what are the units

used for the z scores that correspond to individual heights?

b If heights of all women are converted to z scores, what are the mean, standard deviation,

and distribution of these z scores?

36. Using Continuity Correction There are many situations in which a normal distribution can be used as a good approximation to a random variable that has only discrete values. In such cases, we can use this continuity correction: Represent each whole number by the interval extending from 0.5 below the number to 0.5 above it. Assume that IQ scores are all whole numbers having a distribution that is approximately normal with a mean of 100 and a standard deviation of 15.

a Without using any correction for continuity, find the probability of randomly selecting

someone with an IQ score greater than 103

b Using the correction for continuity, find the probability of randomly selecting someone

with an IQ score greater than 103

c Compare the results from parts (a) and (b)

37. Curving Test Scores A statistics professor gives a test and finds that the scores are normally distributed with a mean of 25 and a standard deviation of 5. She plans to curve the scores.

a If she curves by adding 50 to each grade, what is the new mean? What is the new standard

deviation?

b Is it fair to curve by adding 50 to each grade? Why or why not?

c If the grades are curved according to the following scheme (instead of adding 50), find the

numerical limits for each letter grade

A: Top 10%

B: Scores above the bottom 70% and below the top 10%

C: Scores above the bottom 30% and below the top 30%

D: Scores above the bottom 10% and below the top 70%

F: Bottom 10%

d Which method of curving the grades is fairer: Adding 50 to each grade or using the scheme

given in part (c)? Explain


38. SAT and ACT Tests Scores on the SAT test are normally distributed with a mean of 1518 and a standard deviation of 325. Scores on the ACT test are normally distributed with a mean of 21.1 and a standard deviation of 4.8. Assume that the two tests use different scales to measure the same aptitude.

a. If someone gets a SAT score that is the 67th percentile, find the actual SAT score and the equivalent ACT score.

b. If someone gets a SAT score of 1900, find the equivalent ACT score.

39. Outliers For the purposes of constructing modified boxplots as described in Section 3-4, outliers were defined as data values that are above $Q_3$ by an amount greater than $1.5 \times \text{IQR}$ or below $Q_1$ by an amount greater than $1.5 \times \text{IQR}$, where IQR is the interquartile range. Using this definition of outliers, find the probability that when a value is randomly selected from a normal distribution, it is an outlier.

6-4 Sampling Distributions and Estimators

Key Concept In this section we consider the concept of a sampling distribution of a statistic. Also, we learn some important properties of sampling distributions of the mean, median, variance, standard deviation, range, and proportion. We see that some statistics (such as the mean, variance, and proportion) are unbiased estimators of population parameters, whereas other statistics (such as the median and range) are not.

The following chapters of this book introduce methods for using sample statistics to estimate values of population parameters. Those procedures are based on an understanding of how sample statistics behave, and that behavior is the focus of this section. We begin with the definition of a sampling distribution of a statistic.

The sampling distribution of a statistic (such as a sample mean or sample proportion) is the distribution of all values of the statistic when all possible samples of the same size n are taken from the same population. (The sampling distribution of a statistic is typically represented as a probability distribution in the format of a table, probability histogram, or formula.)

Sampling Distribution of the Mean

The preceding definition is general, so let’s consider the specific sampling distribution of the mean.

The sampling distribution of the mean is the distribution of sample means, with all samples having the same sample size n taken from the same population. (The sampling distribution of the mean is typically represented as a probability distribution in the format of a table, probability histogram, or formula.)

Example 1: Sampling Distribution of the Mean

Consider repeating this process: Roll a die 5 times and find the mean $\bar{x}$ of the results. (See Table 6-2 on the next page.) What do we know about the behavior of all sample means that are generated as this process continues indefinitely?

The top portion of Table 6-2 illustrates a process of rolling a die 5 times and finding the mean of the results. Table 6-2 shows results from repeating this process 10,000 times, but the true sampling distribution of the mean involves repeating the process indefinitely. Because the values of 1, 2, 3, 4, 5, 6 are all equally likely, the population has a mean of $\mu = 3.5$, and Table 6-2 shows that the 10,000 sample means have a mean of 3.49. If the process is continued indefinitely, the mean of the sample means will be 3.5. Also, Table 6-2 shows that the distribution of the sample means is approximately a normal distribution.

Based on the actual sample results shown in the top portion of Table 6-2, we can describe the sampling distribution of the mean by the histogram at the top of Table 6-2. The actual sampling distribution would be described by a histogram based on all possible samples, not only the 10,000 samples included in the histogram, but the number of trials is large enough to suggest that the true sampling distribution of means is a normal distribution.

The results of Example 1 allow us to observe these two important properties of the sampling distribution of the mean:

1. The sample means target the value of the population mean. (That is, the mean of the sample means is the population mean. The expected value of the sample mean is equal to the population mean.)

2. The distribution of sample means tends to be a normal distribution. (This will be discussed further in the following section, but the distribution tends to become closer to a normal distribution as the sample size increases.)
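The process in Example 1 can be imitated with a short simulation. The following is only a sketch: the function name, the seed, and the use of Python's standard `random` module are assumptions made for illustration, and a different seed will give slightly different numbers.

```python
import random
import statistics

def simulate_sample_means(n_rolls=5, n_samples=10_000, seed=0):
    """Return the means of n_samples simulated samples of n_rolls fair-die rolls."""
    rng = random.Random(seed)  # seeded only so that the run is repeatable
    means = []
    for _ in range(n_samples):
        rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
        means.append(sum(rolls) / n_rolls)
    return means

sample_means = simulate_sample_means()
mean_of_means = statistics.mean(sample_means)

# The mean of the simulated sample means should land near the population mean 3.5.
print(f"mean of the simulated sample means: {mean_of_means:.2f}")
```

A histogram of `sample_means` would play the role of the top panel of Table 6-2.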

Sampling Distribution of the Variance

Having discussed the sampling distribution of the mean, we now consider the sampling distribution of the variance.

The sampling distribution of the variance is the distribution of sample variances, with all samples having the same sample size n taken from the same population. (The sampling distribution of the variance is typically represented as a probability distribution in the format of a table, probability histogram, or formula.)

Caution: When working with population standard deviations or variances, be sure to evaluate them correctly. Recall from Section 3-3 that the computations for population standard deviations or variances involve division by the population size N (not the value of $n - 1$), as shown below.

Population standard deviation: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$

Population variance: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$

Because the calculations are typically performed with computer software or calculators, be careful to correctly distinguish between the standard deviation of a sample and the standard deviation of a population. Also be careful to distinguish between the variance of a sample and the variance of a population.

Do Boys or Girls Run in the Family?

The author of this book, his siblings, and his siblings' children consist of 11 males and only one female. Is this an example of a phenomenon whereby one particular gender runs in a family? This issue was studied by examining a random sample of 8770 households in the United States. The results were reported in the Chance magazine article "Does Having Boys or Girls Run in the Family?" by Joseph Rodgers and Debby Doughty. Part of their analysis involves use of the binomial probability distribution. Their conclusion is that "We found no compelling evidence that sex bias runs in the family."
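As one illustration of how software separates the two calculations, Python's standard `statistics` module provides `pvariance` and `pstdev` (division by N) alongside `variance` and `stdev` (division by n - 1). The sketch below treats the six faces of a die as the whole population; the variable names are chosen here for illustration.

```python
import statistics

population = [1, 2, 3, 4, 5, 6]  # faces of a fair die, treated as the whole population

# Population formulas divide by N.
pop_variance = statistics.pvariance(population)
pop_std_dev = statistics.pstdev(population)

# Sample formulas divide by n - 1; applied to a population they give a different value.
sample_style_variance = statistics.variance(population)

print(pop_variance, pop_std_dev, sample_style_variance)
```

The population variance here is 35/12 (about 2.9), while the sample-style formula applied to the same six values gives 3.5, so choosing the wrong function changes the result.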

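[Table 6-2: results of 10,000 repetitions of three processes. Top: roll a die 5 times and find the mean $\bar{x}$ (distribution of sample means approximately normal). Middle: roll a die 5 times and find the variance $s^2$ (distribution of sample variances skewed). Bottom: roll a die 5 times and find the proportion of odd numbers (distribution of sample proportions approximately normal).]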

Example 2: Sampling Distribution of the Variance. Consider repeating this process: Roll a die 5 times and find the variance $s^2$ of the results. What do we know about the behavior of all sample variances that are generated as this process continues indefinitely?

The middle portion of Table 6-2 illustrates a process of rolling a die 5 times and finding the variance of the results. Table 6-2 shows results from repeating this process 10,000 times, but the true sampling distribution of the variance involves repeating the process indefinitely. Because the values of 1, 2, 3, 4, 5, 6 are all equally likely, the population has a variance of $\sigma^2 = 2.9$, and Table 6-2 shows that the 10,000 sample variances have a mean of 2.88. If the process is continued indefinitely, the mean of the sample variances will be 2.9. Also, the middle portion of Table 6-2 shows that the distribution of the sample variances is a skewed distribution.

Based on the actual sample results shown in the middle portion of Table 6-2, we can describe the sampling distribution of the variance by the histogram in the middle of Table 6-2. The actual sampling distribution would be described by a histogram based on all possible samples, not only the 10,000 samples included in the histogram, but the number of trials is large enough to suggest that the true sampling distribution of variances is a distribution skewed to the right.

The results of Example 2 allow us to observe these two important properties of the sampling distribution of the variance:

1. The sample variances target the value of the population variance. (That is, the mean of the sample variances is the population variance. The expected value of the sample variance is equal to the population variance.)

2. The distribution of sample variances tends to be a distribution skewed to the right.
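The first property can be checked exactly for the die example, because 5 rolls give only 6 × 6 × 6 × 6 × 6 = 7776 equally likely ordered samples. The sketch below lists them all and averages their sample variances using exact fractions; the helper name `sample_variance` is an assumption made for this illustration.

```python
from fractions import Fraction
from itertools import product

def sample_variance(values):
    """Sample variance s^2 (division by n - 1), computed with exact fractions."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / (n - 1)

faces = range(1, 7)
all_samples = list(product(faces, repeat=5))  # every ordered sample of 5 rolls

# All samples are equally likely, so the expected value is the plain average.
mean_of_variances = sum(sample_variance(s) for s in all_samples) / len(all_samples)

# Population variance of a fair die (division by N = 6).
mu = Fraction(sum(faces), 6)
population_variance = sum((f - mu) ** 2 for f in faces) / 6

print(len(all_samples), mean_of_variances, population_variance)
```

Both quantities come out to 35/12, which rounds to the 2.9 quoted in Example 2.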

Sampling Distribution of Proportion

We now consider the sampling distribution of a proportion.

The sampling distribution of the proportion is the distribution of sample proportions, with all samples having the same sample size n taken from the same population.

We need to distinguish between a population proportion p and some sample proportion, so the following notation is commonly used.

Notation for Proportions

$\hat{p}$ = sample proportion

$p$ = population proportion

Example 3: Sampling Distribution of the Proportion. Consider repeating this process: Roll a die 5 times and find the proportion of odd numbers. What do we know about the behavior of all sample proportions that are generated as this process continues indefinitely?

The bottom portion of Table 6-2 illustrates a process of rolling a die 5 times and finding the proportion of odd numbers. Table 6-2 shows results from repeating this process 10,000 times, but the true sampling distribution of the proportion involves repeating the process indefinitely. Because the values of 1, 2, 3, 4, 5, 6 are all equally likely, the proportion of odd numbers in the population is 0.5, and Table 6-2 shows that the 10,000 sample proportions have a mean of 0.50. If the process is continued indefinitely, the mean of the sample proportions will be 0.5. Also, the bottom portion of Table 6-2 shows that the distribution of the sample proportions is approximately a normal distribution.

Based on the actual sample results shown in the bottom portion of Table 6-2, we can describe the sampling distribution of the proportion by the histogram at the bottom of Table 6-2. The actual sampling distribution would be described by a histogram based on all possible samples, not only the 10,000 samples included in the histogram, but the number of trials is large enough to suggest that the true sampling distribution of proportions is approximately a normal distribution.

The results of Example 3 allow us to observe these two important properties of the sampling distribution of the proportion:

1. The sample proportions target the value of the population proportion. (That is, the mean of the sample proportions is the population proportion. The expected value of the sample proportion is equal to the population proportion.)

2. The distribution of sample proportions tends to be a normal distribution.

The preceding three examples are based on 10,000 trials and the results are summarized in Table 6-2. Table 6-3 describes the general behavior of the sampling distribution of the mean, variance, and proportion, assuming that certain conditions are satisfied. For example, Table 6-3 shows that the sampling distribution of the mean tends to be a normal distribution, but the following section describes conditions that must be satisfied before we can assume that the distribution is normal.
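For the proportion of odd numbers in 5 rolls of a die, the sampling distribution can also be written down exactly, because the number of odd results follows a binomial distribution with n = 5 and p = 0.5. The sketch below builds that distribution and its mean with exact fractions; the variable names are chosen for illustration only.

```python
from fractions import Fraction
from math import comb

n = 5               # rolls per sample
p = Fraction(1, 2)  # population proportion of odd faces (1, 3, 5 out of six)

# P(sample proportion = k/n) = C(n, k) * p^k * (1 - p)^(n - k)
distribution = {
    Fraction(k, n): comb(n, k) * p**k * (1 - p) ** (n - k)
    for k in range(n + 1)
}

mean_of_proportions = sum(value * prob for value, prob in distribution.items())

for value, prob in distribution.items():
    print(f"p-hat = {float(value):.1f}  probability = {prob}")
print("mean of sample proportions:", mean_of_proportions)
```

The mean of the sample proportions equals the population proportion of 1/2, which is the first property listed for Example 3.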

Unbiased Estimators. The preceding three examples show that sample means, variances, and proportions tend to target the corresponding population parameters. More formally, we say that sample means, variances, and proportions are unbiased estimators. That is, their sampling distributions have a mean that is equal to the corresponding population parameter. If we want to use a sample statistic (such as a sample proportion from a survey) to estimate a population parameter (such as the population proportion), it is important that the sample statistic used as the estimator targets the population parameter instead of being a biased estimator in the sense that it systematically underestimates or overestimates the parameter. The preceding three examples and Table 6-2 involve the mean, variance, and proportion, but here is a summary that includes other statistics.

Estimators: Unbiased and Biased

Unbiased estimators. These statistics target the value of the corresponding population parameter:

Mean $\bar{x}$

Variance $s^2$

Proportion $\hat{p}$

Biased estimators. These statistics do not target the value of the corresponding population parameter:

Median

Range

Standard deviation s (Important Note: The sample standard deviations do not target the population standard deviation $\sigma$, but the bias is relatively small in large samples, so s is often used to estimate $\sigma$ even though s is a biased estimator of $\sigma$.)

[Table 6-3: General Behavior of Sampling Distributions]

The preceding three examples all involved rolling a die 5 times, so the number of different possible samples is $6 \times 6 \times 6 \times 6 \times 6 = 7776$. Because there are 7776 different possible samples, it is not practical to manually list all of them. The next example involves a smaller number of different possible samples, so we can list them and we can then describe the sampling distribution of the range in the format of a table for the probability distribution.

Example 4: Sampling Distribution of the Range. Three randomly selected households are surveyed as a pilot project for a larger survey to be conducted later. The numbers of people in the households are 2, 3, and 10 (based on Data Set 22 in Appendix B). Consider the values of 2, 3, and 10 to be a population. Assume that samples of size $n = 2$ are randomly selected with replacement from the population of 2, 3, and 10.

a. List all of the different possible samples, then find the range in each sample.

b. Describe the sampling distribution of the ranges in the format of a table summarizing the probability distribution.

c. Describe the sampling distribution of the ranges in the format of a probability histogram.

d. Based on the results, do the sample ranges target the population range, which is $10 - 2 = 8$?

e. What do these results indicate about the sample range as an estimator of the population range?

a. Table 6-4 lists the nine different possible samples of size $n = 2$ selected with replacement from the population of 2, 3, and 10, along with the range of each sample.

b. The nine samples in Table 6-4 are all equally likely, so each sample has a probability of 1/9. The last two columns of Table 6-4 list the values of the range along with the corresponding probabilities, so the last two columns constitute a table summarizing the probability distribution, which can be condensed as shown in Table 6-5. Table 6-5 therefore describes the sampling distribution of the sample ranges.

c. Figure 6-17 is the probability histogram based on Table 6-5.

d. The mean of the nine sample ranges is 3.6, but the range of the population is 8. Consequently, the sample ranges do not target the population range.

e. Because the mean of the sample ranges (3.6) does not equal the population range (8), the sample range is a biased estimator of the population range. We can also see that the range is a biased estimator by simply examining Table 6-5 and noting that most of the time, the sample range is well below the population range of 8.

In this example, we conclude that the sample range is a biased estimator of the population range. This implies that, in general, the sample range should not be used to estimate the value of the population range.

Table 6-4 Sampling Distribution of the Range

Sample    Range    Probability
2, 2      0        1/9
2, 3      1        1/9
2, 10     8        1/9
3, 2      1        1/9
3, 3      0        1/9
3, 10     7        1/9
10, 2     8        1/9
10, 3     7        1/9
10, 10    0        1/9

[Figure 6-17: Probability Histogram: Sampling Distribution of the Sample Ranges]
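The steps of Example 4 are small enough to carry out by listing every sample. The sketch below is one way to do that with Python's standard library: it builds the nine ordered samples, condenses the ranges into a probability distribution, and compares the mean of the sample ranges with the population range. The variable names are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 3, 10]
samples = list(product(population, repeat=2))  # 9 ordered samples, with replacement

ranges = [max(s) - min(s) for s in samples]

# Each sample has probability 1/9, so a range value's probability is its count over 9.
range_distribution = {
    value: Fraction(count, len(samples))
    for value, count in sorted(Counter(ranges).items())
}

mean_of_ranges = sum(value * prob for value, prob in range_distribution.items())
population_range = max(population) - min(population)

print(range_distribution)
print(float(mean_of_ranges), population_range)
```

The mean of the sample ranges is 32/9, which rounds to the 3.6 given in part (d), well below the population range of 8.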

Example 5: Sampling Distribution of the Proportion. In a study of gender selection methods, an analyst considers the process of generating 2 births. When 2 births are randomly selected, the sample space is bb, bg, gb, gg. Those 4 outcomes are equally likely, so the probability of 0 girls is 0.25, the probability of 1 girl is 0.5, and the probability of 2 girls is 0.25. Describe the sampling distribution of the proportion of girls from 2 births as a probability distribution table and also describe it as a probability histogram.

See the accompanying display. The top table summarizes the probability distribution for the number of girls in 2 births. That top table can be used to construct the probability distribution for the proportion of girls in 2 births as shown. The top table can also be used to construct the probability histogram as shown.

Number of Girls from 2 Births

x    P(x)
0    0.25
1    0.50
2    0.25

Sampling distribution of the proportion of girls from 2 births

Proportion of girls from 2 births    Probability
0                                     0.25
0.5                                   0.50
1                                     0.25

[Probability histogram: proportion of girls from 2 births (0, 0.5, 1) on the horizontal axis, probability on the vertical axis]

Example 5 shows that a sampling distribution can be described with a table or a graph. Sampling distributions can also be described with a formula (as in Exercise 21), or may be described in some other way, such as this: "The sampling distribution of the sample mean is a normal distribution with $\mu = 100$ and $\sigma = 15$."

Why sample with replacement? All of the examples in this section involved sampling with replacement. Sampling without replacement would have the very practical advantage of avoiding wasteful duplication whenever the same item is selected more than once. However, we are particularly interested in sampling with replacement for these two reasons:

1. When selecting a relatively small sample from a large population, it makes no significant difference whether we sample with replacement or without replacement.

2. Sampling with replacement results in independent events that are unaffected by previous outcomes, and independent events are easier to analyze and result in simpler calculations and formulas.

For the above reasons, we focus on the behavior of samples that are randomly selected with replacement. Many of the statistical procedures discussed in the following chapters are based on the assumption that sampling is conducted with replacement.
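The two reasons for sampling with replacement can be seen in a small calculation. The sketch below uses a hypothetical population in which half of the items are "marked" and computes the probability that two selected items are both marked, with and without replacement. The function name and the population sizes are assumptions chosen for illustration.

```python
from fractions import Fraction

def prob_both_marked(population_size, marked, with_replacement):
    """Probability that two randomly selected items are both 'marked'."""
    first = Fraction(marked, population_size)
    if with_replacement:
        second = first  # independent: the first draw does not change the second
    else:
        second = Fraction(marked - 1, population_size - 1)  # depends on the first draw
    return first * second

for size in (10, 1_000, 100_000):
    marked = size // 2  # hypothetical population in which half the items are marked
    with_r = prob_both_marked(size, marked, with_replacement=True)
    without_r = prob_both_marked(size, marked, with_replacement=False)
    print(size, float(with_r), float(without_r), float(with_r - without_r))
```

With replacement the answer is simply the product of two identical probabilities, and as the population grows the gap between the two sampling methods shrinks toward zero.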

The key point of this section is to introduce the concept of a sampling distribution of a statistic. Consider the goal of trying to find the mean body temperature of all adults. Because that population is so large, it is not practical to measure the temperature of every adult. Instead, we obtain a sample of body temperatures and use it to estimate the population mean. Data Set 2 in Appendix B includes a sample of 106 such body temperatures. The mean for that sample is $\bar{x} = 98.20$°F. Conclusions that we make about the population mean temperature of all adults require that we understand the behavior of the sampling distribution of all such sample means. Even though it is not practical to obtain every possible sample and we are stuck with just one sample, we can form some very meaningful conclusions about the population of all body temperatures. A major goal of the following sections and chapters is to learn how we can effectively use a sample to form conclusions about a population. In Section 6-5 we consider more details about the sampling distribution of sample means, and in Section 6-6 we consider more details about the sampling distribution of sample proportions.

CAUTION

Many methods of statistics require a simple random sample. Some samples, such as voluntary response samples or convenience samples, could easily lead to very wrong results.

Basic Skills and Concepts

Statistical Literacy and Critical Thinking

1. Sampling Distribution. In your own words, describe a sampling distribution.

2. Sampling Distribution. Data Set 24 in Appendix B includes a sample of FICO credit rating scores from randomly selected consumers. If we investigate this sample by constructing a histogram and finding the sample mean and standard deviation, are we investigating the sampling distribution of the mean? Why or why not?

3. Unbiased Estimator. What does it mean when we say that the sample mean is an unbiased estimator, or that the sample mean "targets" the population mean?

4. Sampling with Replacement. Give two reasons why statistical methods tend to be based on the assumption that sampling is conducted with replacement, instead of without replacement.

5. Good Sample? You want to estimate the proportion of all U.S. college students who have the profound wisdom to take a statistics course. You obtain a simple random sample of students at New York University. Is the resulting sample proportion a good estimator of the population proportion? Why or why not?

6. Unbiased Estimators. Which of the following statistics are unbiased estimators of population parameters?

a. Sample mean used to estimate a population mean

b. Sample median used to estimate a population median

c. Sample proportion used to estimate a population proportion

d. Sample variance used to estimate a population variance

e. Sample standard deviation used to estimate a population standard deviation

f. Sample range used to estimate a population range

7. Sampling Distribution of the Mean. Samples of size n are randomly selected from the population of the last digits of telephone numbers. If the sample mean is found for each sample, what is the distribution of the sample means?

8. Sampling Distribution of the Proportion. Samples of size n are randomly selected from the population of the last digits of telephone numbers, and the proportion of even numbers is found for each sample. What is the distribution of the sample proportions?

In Exercises 9–12, refer to the population and list of samples in Example 4.

9. Sampling Distribution of the Median. In Example 4, we assumed that samples of size $n = 2$ are randomly selected with replacement from the population consisting of 2, 3, and 10, where the values are the numbers of people in households. Table 6-4 lists the nine different possible samples.

a. Find the median of each of the nine samples, then summarize the sampling distribution of the medians in the format of a table representing the probability distribution. (Hint: Use a format similar to Table 6-5.)

b. Compare the population median to the mean of the sample medians.

c. Do the sample medians target the value of the population median? In general, do sample medians make good estimators of population medians? Why or why not?

10. Sampling Distribution of the Standard Deviation. Repeat Exercise 9 using standard deviations instead of medians.

11. Sampling Distribution of the Variance. Repeat Exercise 9 using variances instead of medians.

12. Sampling Distribution of the Mean. Repeat Exercise 9 using means instead of medians.

13. Assassinated Presidents: Sampling Distribution of the Mean. The ages (years) of the four U.S. presidents when they were assassinated in office are 56 (Lincoln), 49 (Garfield), 58 (McKinley), and 46 (Kennedy).

a. Assuming that 2 of the ages are randomly selected with replacement, list the 16 different possible samples.

b. Find the mean of each of the 16 samples, then summarize the sampling distribution of the means in the format of a table representing the probability distribution. (Use a format similar to Table 6-5.)

c. Compare the population mean to the mean of the sample means.

d. Do the sample means target the value of the population mean? In general, do sample means make good estimators of population means? Why or why not?

14. Sampling Distribution of the Median. Repeat Exercise 13 using medians instead of means.

15. Sampling Distribution of the Range. Repeat Exercise 13 using ranges instead of means.

16. Sampling Distribution of the Variance. Repeat Exercise 13 using variances instead of means.

17. Sampling Distribution of Proportion. Example 4 referred to three randomly selected households in which the numbers of people are 2, 3, and 10. As in Example 4, consider the values of 2, 3, and 10 to be a population and assume that samples of size $n = 2$ are randomly selected with replacement. Construct a probability distribution table that describes the sampling distribution of the proportion of odd numbers when samples of size $n = 2$ are randomly selected. Does the mean of the sample proportions equal the proportion of odd numbers in the population? Do the sample proportions target the value of the population proportion? Does the sample proportion make a good estimator of the population proportion?

18. Births: Sampling Distribution of Proportion. When 3 births are randomly selected, the sample space is bbb, bbg, bgb, bgg, gbb, gbg, ggb, and ggg. Assume that those 8 outcomes are equally likely. Describe the sampling distribution of the proportion of girls from 3 births as a probability distribution table. Does the mean of the sample proportions equal the proportion of girls in 3 births? (Hint: See Example 5.)

19. Genetics: Sampling Distribution of Proportion. A genetics experiment involves a population of fruit flies consisting of 1 male named Mike and 3 females named Anna, Barbara, and Chris. Assume that two fruit flies are randomly selected with replacement.

a. After listing the 16 different possible samples, find the proportion of females in each sample, then use a table to describe the sampling distribution of the proportions of females.

b. Find the mean of the sampling distribution.

c. Is the mean of the sampling distribution (from part (b)) equal to the population proportion of females? Does the mean of the sampling distribution of proportions always equal the population proportion?

20. Quality Control: Sampling Distribution of Proportion. After constructing a new manufacturing machine, 5 prototype integrated circuit chips are produced and it is found that 2 are defective (D) and 3 are acceptable (A). Assume that two of the chips are randomly selected with replacement from this population.

a. After identifying the 25 different possible samples, find the proportion of defects in each of them, then use a table to describe the sampling distribution of the proportions of defects.

b. Find the mean of the sampling distribution.

c. Is the mean of the sampling distribution (from part (b)) equal to the population proportion of defects? Does the mean of the sampling distribution of proportions always equal the population proportion?

Beyond the Basics

21. Using a Formula to Describe a Sampling Distribution. Example 5 includes a table and graph to describe the sampling distribution of the proportions of girls from 2 births. Consider the formula shown below, and evaluate that formula using sample proportions x of 0, 0.5, and 1. Based on the results, does the formula describe the sampling distribution? Why or why not?

$P(x) = \dfrac{1}{2(2 - 2x)!\,(2x)!}$ where $x = 0, 0.5, 1$

22. Mean Absolute Deviation. Is the mean absolute deviation of a sample a good statistic for estimating the mean absolute deviation of the population? Why or why not? (Hint: See Example 4.)

6-5 The Central Limit Theorem

Key Concept. In this section we introduce and apply the central limit theorem. The central limit theorem tells us that for a population with any distribution, the distribution of the sample means approaches a normal distribution as the sample size increases. In other words, if the sample size is large enough, the distribution of sample means can be approximated by a normal distribution, even if the original population is not normally distributed. In addition, if the original population has mean $\mu$ and standard deviation $\sigma$, the mean of the sample means will also be $\mu$, but the standard deviation of the sample means will be $\sigma / \sqrt{n}$, where n is the sample size.
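The claim about the standard deviation of the sample means can be explored with a simulation before it is developed formally. The following is a rough sketch that reuses the die-rolling population from Section 6-4; the function name, the seed, the number of simulated samples, and the chosen sample sizes are all assumptions made for illustration.

```python
import math
import random
import statistics

def std_dev_of_sample_means(n, n_samples=20_000, seed=1):
    """Estimate the standard deviation of sample means for samples of n die rolls."""
    rng = random.Random(seed)  # seeded only for repeatability
    means = []
    for _ in range(n_samples):
        rolls = [rng.randint(1, 6) for _ in range(n)]
        means.append(sum(rolls) / n)
    return statistics.pstdev(means)

sigma = statistics.pstdev([1, 2, 3, 4, 5, 6])  # population standard deviation of one roll

results = {}
for n in (2, 5, 30):
    simulated = std_dev_of_sample_means(n)
    predicted = sigma / math.sqrt(n)
    results[n] = (simulated, predicted)
    print(f"n = {n:2d}  simulated = {simulated:.3f}  sigma/sqrt(n) = {predicted:.3f}")
```

For each sample size the simulated value should fall close to $\sigma / \sqrt{n}$, and both shrink as n increases.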

Date posted: 25/11/2016, 13:22
