DECISION MAKING
Normal, Binomial, Poisson, and Exponential Distributions
5
Several specific distributions commonly occur in
a variety of business situations:
Normal distribution: a continuous distribution
characterized by a symmetric bell-shaped curve
Binomial distribution: a discrete distribution that is relevant when we sample from a population with
only two types of members or when we perform a series of independent, identical experiments with
only two possible outcomes
Poisson distribution: a discrete distribution that
describes the number of events in a given period of time
Exponential distribution: a continuous distribution
that describes the times between events
© 2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
The Normal Distribution
The single most important distribution in
statistics is the normal distribution
It is a continuous distribution and is the basis of the familiar symmetric bell-shaped curve
Any particular normal distribution is specified by its mean and standard deviation
Changing the mean shifts the normal curve to the right or left; changing the standard deviation makes the curve more or less spread out
The normal distribution is a two-parameter family, where
the two parameters are the mean and standard deviation.
Continuous Distributions and
Density Functions (slide 1 of 2)
For continuous distributions, instead of a list
of possible values, there is a continuum of
possible values, such as all values between
0 and 100 or all values greater than 0
Instead of assigning probabilities to each
individual value in the continuum, the total
probability of 1 is spread over this continuum
The key to this spreading is called a density
function, which acts like a histogram
The higher the value of the density function, the
more likely this region of the continuum is.
Continuous Distributions and
Density Functions (slide 2 of 2)
A density function, usually denoted by f(x), specifies the probability distribution of a continuous random variable X
The higher f(x) is, the more likely values near x are
The total area between the graph of f(x) and the horizontal
axis, which represents the total probability, is equal to 1
f(x) is nonnegative for all possible values of X.
Probabilities are found from a density function as areas under the curve.
The Normal Density
The normal distribution is a continuous distribution
with possible values ranging over the entire number
line—from “minus infinity” to “plus infinity.”
Only a relatively small range has much chance of occurring
The normal density function is actually quite complex, in
spite of its “nice” bell-shaped appearance.
The formula for the normal density function, where μ
and σ are the mean and standard deviation, is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}, \quad -\infty < x < \infty$$
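As a minimal sketch, the normal density (one over σ√(2π), times e raised to minus (x − μ)² / (2σ²)) can be evaluated directly and cross-checked against Python's standard library. The function name `normal_density` and the parameter values are illustrative assumptions, not part of the slides.

```python
import math
from statistics import NormalDist

def normal_density(x: float, mu: float, sigma: float) -> float:
    """Normal density f(x) for mean mu and standard deviation sigma > 0."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Illustrative (assumed) parameters: mean 100, standard deviation 15.
mu, sigma = 100.0, 15.0
by_hand = normal_density(110.0, mu, sigma)
by_library = NormalDist(mu, sigma).pdf(110.0)
print(by_hand, by_library)  # the two values should agree
```

Note that f(x) is a density, not a probability; probabilities come from areas under this curve.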
Standardizing: Z-Values
The standard normal distribution has mean 0 and standard deviation 1, so it is denoted by
N(0,1)
It is also referred to as the Z distribution.
To standardize a variable, subtract its mean
and then divide the difference by the standard deviation:
$$Z = \frac{X - \mu}{\sigma}$$
A Z-value is the number of standard deviations to the right or left of the mean.
If Z is positive, the original value is to the right of the
mean.
If Z is negative, the original value is to the left of the mean.
Example 5.1:
Standardizing.xlsx
Objective: To use Excel® to standardize annual returns of
various mutual funds.
Solution: The data set includes the annual returns of 30 mutual funds.
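The slides do this in Excel; the same standardization can be sketched in Python. The returns below are made-up placeholder values (the example's actual data are not reproduced here), and the sample standard deviation is assumed.

```python
from statistics import mean, stdev

# Hypothetical annual returns; NOT the data from Standardizing.xlsx.
annual_returns = [0.12, -0.03, 0.08, 0.15, 0.01, 0.07]

avg = mean(annual_returns)
sd = stdev(annual_returns)  # sample standard deviation (an assumption)

# Z-value: number of standard deviations each return lies from the mean.
z_values = [(r - avg) / sd for r in annual_returns]

for r, z in zip(annual_returns, z_values):
    side = "right of" if z > 0 else "left of"
    print(f"return {r:+.2f} -> Z = {z:+.3f} ({side} the mean)")
```

By construction, the standardized values have mean 0 and standard deviation 1.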
Normal Tables and Z-Values
A common use for Z-values and the standard normal
distribution is in calculating probabilities and percentiles
by the traditional method.
This method is based on a table of the standard normal
distribution found in many statistics textbooks. An example of such a table is given below.
The body of the table contains probabilities.
The left and top margins contain possible values.
Normal Calculations in Excel
Two types of calculations are typically made with
normal distributions: finding probabilities and
finding percentiles
The functions used for normal probability calculations
are NORMDIST and NORMSDIST
The main difference between these is that the one with the
“S” (for standardized) applies only to N(0, 1) calculations, whereas NORMDIST applies to any normal distribution.
Percentile calculations that take a probability and return
a value are often called inverse calculations.
The Excel functions for these are named NORMINV and
NORMSINV
Again, the “S” in the second of these indicates that it
applies to the standard normal distribution.
Example 5.2:
Normal Calculations.xlsx (slide 1 of 2)
Objective: To calculate probabilities and
percentiles for standard normal and general
normal distributions in Excel.
Solution: For “less than” probabilities, use
NORMDIST or NORMSDIST directly.
For “greater than” probabilities, subtract the
NORMDIST or NORMSDIST function from 1.
For “between” probabilities, subtract the two
NORMDIST or NORMSDIST functions.
For percentile calculations, use the NORMINV or
NORMSINV function with the specified probability
as the first argument.
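For readers working outside Excel, the four kinds of calculation above can be sketched with Python's `statistics.NormalDist`, where `cdf` plays the role of NORMDIST/NORMSDIST and `inv_cdf` the role of NORMINV/NORMSINV. The N(75, 8) distribution and the cutoff values are illustrative assumptions.

```python
from statistics import NormalDist

std = NormalDist()        # standard normal N(0, 1)
gen = NormalDist(75, 8)   # hypothetical general normal: mean 75, sd 8

# "Less than": use the cumulative distribution function directly.
p_less = std.cdf(-2.0)                    # P(Z < -2)
# "Greater than": subtract the cumulative probability from 1.
p_greater = 1 - std.cdf(1.0)              # P(Z > 1)
# "Between": subtract two cumulative probabilities.
p_between = std.cdf(1.7) - std.cdf(0.4)   # P(0.4 < Z < 1.7)
p_between_gen = gen.cdf(80) - gen.cdf(70) # P(70 < X < 80)

# Percentiles: inverse calculation from a probability to a value.
pct_std = std.inv_cdf(0.95)               # 95th percentile of N(0, 1)
pct_gen = gen.inv_cdf(0.10)               # 10th percentile of N(75, 8)

print(p_less, p_greater, p_between, p_between_gen, pct_std, pct_gen)
```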
Example 5.2:
Normal Calculations.xlsx (slide 2 of 2)
Empirical Rules Revisited
Three empirical rules apply to many data sets:
About 68% of the data fall within one
standard deviation of the mean.
About 95% fall within two standard
deviations of the mean.
Almost all fall within three standard
deviations of the mean.
For these rules to hold with real data, the distribution of the data must be at least approximately symmetric and bell-shaped.
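For an exactly normal distribution, the three percentages can be checked with a short calculation. This is a sketch using the standard normal; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal

# Probability of falling within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"within {k} standard deviation(s): {prob:.4f}")
```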
Weighted Sums of Normal
Random Variables
One very attractive property of the normal
distribution is that if you create a weighted sum of normally distributed random variables, the weighted sum is also normally distributed.
This is true even if the random variables are not
independent, provided they are jointly normally distributed.
If $X_1$ through $X_n$ are n independent and normally
distributed random variables with common mean $\mu$ and
common standard deviation $\sigma$, then the sum $X_1 + \dots + X_n$
is normally distributed with mean $n\mu$, variance $n\sigma^2$,
and standard deviation $\sqrt{n}\,\sigma$.
If $a_1$ through $a_n$ are any constants, and the independent variables $X_i$ have means $\mu_i$ and standard deviations $\sigma_i$, then the weighted
sum $a_1 X_1 + \dots + a_n X_n$ is normally distributed with mean
$a_1\mu_1 + \dots + a_n\mu_n$ and variance $a_1^2\sigma_1^2 + \dots + a_n^2\sigma_n^2$.
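A minimal sketch of these formulas for independent normal variables follows. The helper name `weighted_sum` and the numbers are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def weighted_sum(weights, means, sds):
    """Distribution of sum(a_i * X_i) for independent normal X_i."""
    mean = sum(a * mu for a, mu in zip(weights, means))
    variance = sum((a * sd) ** 2 for a, sd in zip(weights, sds))
    return NormalDist(mean, math.sqrt(variance))

# Plain sum of n = 4 variables with common mean 10 and sd 3:
# mean n*mu = 40, variance n*sigma^2 = 36, sd sqrt(n)*sigma = 6.
total = weighted_sum([1, 1, 1, 1], [10] * 4, [3] * 4)
print(total.mean, total.variance, total.stdev)

# A general weighted sum with hypothetical weights, means, and sds.
mix = weighted_sum([0.5, 2.0], [100, 20], [15, 4])
print(mix.mean, mix.stdev)
```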
Example 5.3:
Personnel Decisions.xlsx
Objective: To determine test scores that can be used to accept or
reject job applicants at ZTel.
Solution: Scores of all applicants are approximately normally
distributed with mean 525 and standard deviation 55.
Calculate the percentage of applicants who are automatic accepts or rejects, given the current standards of 600 for automatic accept and
425 for automatic reject.
Find new cutoff values that reject 10% and accept 15% of applicants.
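The slides solve this in Excel; a rough Python equivalent, using the mean, standard deviation, and cutoffs stated above, might look like this. The variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=525, sigma=55)

# Current standards: accept at 600 or above, reject at 425 or below.
pct_accept = 1 - scores.cdf(600)   # "greater than" probability
pct_reject = scores.cdf(425)       # "less than" probability

# New cutoffs: reject the bottom 10%, accept the top 15%.
new_reject_cutoff = scores.inv_cdf(0.10)       # 10th percentile
new_accept_cutoff = scores.inv_cdf(1 - 0.15)   # 85th percentile

print(f"automatic accepts: {pct_accept:.1%}, automatic rejects: {pct_reject:.1%}")
print(f"new reject cutoff: {new_reject_cutoff:.1f}, new accept cutoff: {new_accept_cutoff:.1f}")
```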
Example 5.4:
Paper Machine Settings.xlsx
Objective: To determine the machine settings that result in paper of
acceptable quality at PaperStock Company.
Solution: A given roll of paper must be rejected if its actual fiber
content is less than 19.8 pounds or greater than 20.3 pounds.
The standard deviation of fiber content is 0.10 pound when the process is
“good,” but increases to 0.15 pound when the machine goes “bad.”
Calculate the probability that a given roll is rejected, for a setting of μ
= 20, when the machine is “good” and when it is “bad.”
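A sketch of this calculation in Python, assuming fiber content is normally distributed and that the 0.10 and 0.15 figures are standard deviations. The function name `reject_probability` is illustrative.

```python
from statistics import NormalDist

LOWER, UPPER = 19.8, 20.3  # acceptable fiber content range, in pounds

def reject_probability(mu: float, sigma: float) -> float:
    """P(content < LOWER) + P(content > UPPER) for a normal fiber content."""
    content = NormalDist(mu, sigma)
    return content.cdf(LOWER) + (1 - content.cdf(UPPER))

p_good = reject_probability(mu=20.0, sigma=0.10)
p_bad = reject_probability(mu=20.0, sigma=0.15)
print(f"good: {p_good:.4f}, bad: {p_bad:.4f}")
```

The limits are not symmetric around 20, so the two tail probabilities differ and must be computed separately.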
Example 5.5:
Tax on Stock Returns.xlsx
Objective: To determine the after-tax profit Howard Davis can be
90% certain of earning.
Solution: Howard is in the 33% tax bracket, so his after-tax profit
is 67% of his before-tax profit. He invests $10,000 in a certain
stock, whose annual return is normally distributed with mean 5% and standard deviation 14%.
Calculate the dollar amount such that Howard’s after-tax profit is 90% certain to be less than this amount; that is, calculate the
90th percentile of his after-tax profit.
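Because after-tax profit is a constant multiple of the normally distributed return, it is itself normal, so the percentile can be sketched as follows. The variable names are illustrative.

```python
from statistics import NormalDist

investment = 10_000
keep_fraction = 0.67            # 67% of before-tax profit is kept
mean_return, sd_return = 0.05, 0.14

# After-tax profit = keep_fraction * investment * return, a scaled normal.
scale = keep_fraction * investment
after_tax = NormalDist(scale * mean_return, scale * sd_return)

p90 = after_tax.inv_cdf(0.90)   # 90th percentile of after-tax profit
print(f"mean {after_tax.mean:.0f}, sd {after_tax.stdev:.0f}, 90th percentile {p90:.0f}")
```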
Example 5.6:
Objective: To construct and analyze a spreadsheet
model for microwave oven demand over the next 12
years using Excel’s NORMINV function, and to show
how models using the normal distribution can lead to nonsensical outcomes unless they are modified
appropriately.
Solution: Using historical data, the company assumes
that demand in year 1 is normally distributed with
mean 5000 and standard deviation 1500
It also assumes that demand in each subsequent year
is normally distributed with mean equal to the actual
demand from the previous year and standard
deviation 1500.
Example 5.6:
Using this model may lead to nonsensical
results, such as negative demands, as shown below:
Example 5.6:
One way to modify the model is to let the standard deviation and
mean move together. That is, if the mean is low, then the standard deviation will also be low.
To be even safer, it is possible to truncate the demand distribution
at some nonnegative value such as 250, as shown below.
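The following is a rough simulation sketch of both versions of the model, not the spreadsheet itself. It draws normal values with `random.gauss` rather than Excel's NORMINV, assumes the standard deviation is 30% of the mean (the year-1 ratio 1500/5000), and treats truncation as replacing any value below 250 with 250. The function names are hypothetical.

```python
import random

def simulate_basic(rng, years=12, mean1=5000.0, sd=1500.0):
    """Each year's mean is last year's demand; sd stays fixed at 1500."""
    demands, mean = [], mean1
    for _ in range(years):
        demand = rng.gauss(mean, sd)  # can go negative if the mean drifts low
        demands.append(demand)
        mean = demand
    return demands

def simulate_modified(rng, years=12, mean1=5000.0, sd_ratio=0.3, floor=250.0):
    """sd moves with the mean (assumed ratio), and demand is floored at 250."""
    demands, mean = [], mean1
    for _ in range(years):
        demand = max(floor, rng.gauss(mean, sd_ratio * mean))
        demands.append(demand)
        mean = demand
    return demands

rng = random.Random(42)
runs = 10_000
negative_paths = sum(min(simulate_basic(rng)) < 0 for _ in range(runs))
modified_min = min(min(simulate_modified(rng)) for _ in range(runs))
print(f"basic model: {negative_paths} of {runs} paths contain a negative demand")
print(f"modified model: smallest demand seen = {modified_min:.0f}")
```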
The Binomial Distribution
The binomial distribution can occur in two situations:
When sampling from a population with only two types of
members (males and females, for example)
When performing a sequence of identical experiments, each
of which has only two possible outcomes
Consider a situation where there are n independent,
identical trials, where the probability of a success on
each trial is p and the probability of a failure is 1 – p.
Define X to be the random number of successes in the n
trials.
Then X has a binomial distribution with parameters n and p.
In Excel, calculate binomial probabilities with the
BINOMDIST function.
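Outside Excel, the same probabilities can be sketched from the binomial formula P(X = k) = C(n, k) p^k (1 − p)^(n − k). The function names `binom_pmf` and `binom_cdf` are illustrative and loosely mirror the non-cumulative and cumulative modes of BINOMDIST.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k): exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k): at most k successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Illustrative values: n = 10 trials with success probability p = 0.5.
exactly_5 = binom_pmf(5, 10, 0.5)
at_most_5 = binom_cdf(5, 10, 0.5)
print(exactly_5, at_most_5)
```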
Example 5.7:
Binomial Calculations.xlsx
Objective: To use Excel’s BINOMDIST and CRITBINOM functions
for calculating binomial probabilities and percentiles in the
context of flashlight batteries.
Solution: Let X be the number of successes in 100 trials of
flashlight batteries, where a success means that the battery is still functioning after eight hours.
Find the probabilities of various events, using the BINOMDIST
function, as shown in the spreadsheet below.
Find the 95th percentile of the distribution of X, using the
CRITBINOM function.
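A percentile in the sense of CRITBINOM, the smallest k whose cumulative probability is at least a target, can be sketched as below. The slides do not state the success probability, so p = 0.6 is a placeholder assumption, and `binom_percentile` is a hypothetical helper name.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def binom_percentile(n: int, p: float, alpha: float) -> int:
    """Smallest k such that P(X <= k) >= alpha."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p) ** (n - k)
        if cumulative >= alpha:
            return k
    return n  # guards against floating-point shortfall when alpha is near 1

n, p = 100, 0.6  # p is an assumed value, not taken from the example
k95 = binom_percentile(n, p, 0.95)
print(k95, binom_cdf(k95, n, p))
```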
Mean and Standard Deviation of the
Binomial Distribution
It can be shown that the mean and standard
deviation of a binomial distribution with parameters n and p are given by the following equations:
$$\mu = np, \qquad \sigma = \sqrt{np(1-p)}$$
The empirical rules discussed in Chapter 2 also apply,
at least approximately, to the binomial distribution
There is about a 95% chance that the actual number of successes will be within two standard deviations of the mean.
There is almost no chance that the number of successes will be more than three standard deviations from the
mean.
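As a quick check of the two-standard-deviation rule, the sketch below computes the mean np and standard deviation √(np(1 − p)) and then sums the exact binomial probabilities inside that range. The values n = 100 and p = 0.6 are illustrative assumptions.

```python
from math import comb, sqrt, ceil, floor

n, p = 100, 0.6  # illustrative parameters

mean = n * p
sd = sqrt(n * p * (1 - p))

# Integer counts lying within two standard deviations of the mean.
low, high = ceil(mean - 2 * sd), floor(mean + 2 * sd)
within_two_sd = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(low, high + 1)
)
print(f"mean {mean:.1f}, sd {sd:.3f}, P({low} <= X <= {high}) = {within_two_sd:.4f}")
```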
The Binomial Distribution in the
Context of Sampling
If sampling is done without replacement, each
member of the population can be sampled only once.
That is, once a person is sampled, his or her name is struck from the list and cannot be sampled again.
If sampling is done with replacement, then it is
possible, although maybe not likely, to select a given member of the population more than once.
Most real-world sampling is performed without
replacement.
The binomial model applies only to sampling with
replacement.
However, if no more than 10% of the population is
sampled, the binomial model can be used safely even if
sampling is performed without replacement.
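To see why a small sampling fraction makes the binomial model safe, one can compare it with the exact distribution for sampling without replacement (the hypergeometric distribution, which the slides do not name). The population size, composition, and sample size below are made-up values with a 5% sampling fraction.

```python
from math import comb

def binom_pmf(k, n, p):
    """Sampling WITH replacement: binomial probability of k successes."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeom_pmf(k, pop_size, pop_successes, n):
    """Sampling WITHOUT replacement: exact probability of k successes."""
    return (
        comb(pop_successes, k)
        * comb(pop_size - pop_successes, n - k)
        / comb(pop_size, n)
    )

# Hypothetical population: 1000 members, 300 of one type; sample 50 (5%).
pop_size, pop_successes, n = 1000, 300, 50
p = pop_successes / pop_size

max_diff = max(
    abs(binom_pmf(k, n, p) - hypergeom_pmf(k, pop_size, pop_successes, n))
    for k in range(n + 1)
)
print(f"largest difference between the two models: {max_diff:.4f}")
```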
The Normal Approximation
to the Binomial
If you graph the binomial probabilities, you will see an interesting phenomenon: the graph begins to look
symmetric and bell-shaped when n is fairly large and p
is not too close to 0 or 1
The normal distribution provides a very good approximation
to the binomial under these conditions.
One practical consequence of the normal approximation to the binomial is that the empirical rules apply very well to binomial distributions.
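The quality of the approximation can be checked numerically. This sketch compares an exact binomial probability with the normal distribution that has the same mean and standard deviation; n = 100 and p = 0.5 are assumed values, and the half-unit continuity correction is a common refinement that the slides do not mention.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # illustrative: n fairly large, p not close to 0 or 1
k = 55

# Exact binomial probability P(X <= 55).
exact = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Normal approximation with matching mean and standard deviation.
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(k + 0.5)  # continuity correction (added assumption)

print(f"exact {exact:.4f}, normal approximation {approx:.4f}")
```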
Example 5.8:
Beating the Market.xlsx
Objective: To determine the probability of a mutual fund outperforming a
standard market index at least 37 out of 52 weeks.
Solution: The number of weeks where a given fund outperforms the market
index is binomially distributed with n = 52 and p = 0.5. The probability of outperforming in at least 37 weeks is
quite small (0.00159).
Now let Y be the number of the 400 best mutual funds that beat the market
at least 37 of 52 weeks. Y is also binomially distributed, with parameters n
= 400 and p = 0.00159. The resulting probability that at least one of these funds does so is nearly 0.5.
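Both steps can be sketched in Python, reading the second probability as P(Y ≥ 1), the chance that at least one of the 400 funds beats the market in at least 37 weeks. The funds are assumed to behave independently, and the variable names are illustrative.

```python
from math import comb

def binom_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Step 1: one fund beats the market in at least 37 of 52 weeks, with p = 0.5.
p_one_fund = binom_at_least(37, 52, 0.5)

# Step 2: at least one of 400 such funds does so.
p_at_least_one = 1 - (1 - p_one_fund) ** 400

print(f"single fund: {p_one_fund:.5f}, at least one of 400: {p_at_least_one:.3f}")
```

An event that is rare for any single fund becomes fairly likely when 400 funds each have a chance.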
Example 5.9:
Supermarket Spending.xlsx
Objective: To use the normal and binomial distributions to calculate the
typical number of customers who spend at least $100 per day and the
probability that at least 30% of all 500 daily customers spend at least $100.
Solution: Historical data indicate that the amount spent per customer is
normally distributed with mean $85 and standard deviation $30.
If 500 customers shop in a given day, calculate the mean and standard
deviation of the number who spend at least $100.
Then calculate the probability that at least 30% of the 500 customers spend
at least $100. This is the probability that a binomially distributed random
variable, with n = 500 and p = 0.309, is at least 150.
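The two-stage calculation (normal for one customer, binomial for the count of customers) might be sketched as follows. The variable names are illustrative, and the unrounded p is used rather than 0.309.

```python
from math import comb, sqrt
from statistics import NormalDist

# Stage 1 (normal): probability that one customer spends at least $100.
spending = NormalDist(mu=85, sigma=30)
p = 1 - spending.cdf(100)

# Stage 2 (binomial): number of the 500 customers who spend at least $100.
n = 500
mean_count = n * p
sd_count = sqrt(n * p * (1 - p))

# P(at least 150 of 500), i.e. at least 30% of customers.
p_at_least_150 = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(150, n + 1)
)

print(f"p = {p:.4f}, mean = {mean_count:.1f}, sd = {sd_count:.2f}")
print(f"P(at least 150) = {p_at_least_150:.3f}")
```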
Example 5.10:
Airline Overbooking.xlsx (slide 1 of 2)
Objective: To assess the benefits and
drawbacks of airline overbooking.
Solution: Assume that the no-show rate is 10%;
that is, each ticketed passenger shows up with probability 0.90.
For a flight with 200 seats, calculate the
probability that more than 205 passengers show up; that more than 200 passengers show up;
that at least 195 seats are filled; and that at
least 190 seats are filled.
Use the BINOMDIST function and a data table to
determine the probabilities.
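A sketch of the same calculation in Python follows, with a loop standing in for the data table. The slides do not say how many tickets are sold, so the ticket counts below are hypothetical, as is the function name `overbooking_probs`; passengers are assumed to show up independently.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p); equals 1 when k >= n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(min(k, n) + 1))

def overbooking_probs(tickets_sold: int, show_prob: float = 0.90) -> dict:
    """Probabilities for the number of passengers who show up."""
    def more_than(k):
        return 1 - binom_cdf(k, tickets_sold, show_prob)

    def at_least(k):
        return 1 - binom_cdf(k - 1, tickets_sold, show_prob)

    return {
        "more than 205 show up": more_than(205),
        "more than 200 show up": more_than(200),
        "at least 195 seats filled": at_least(195),
        "at least 190 seats filled": at_least(190),
    }

# Hypothetical numbers of tickets sold for a 200-seat flight.
for tickets_sold in (206, 209, 212, 215):
    probs = overbooking_probs(tickets_sold)
    row = ", ".join(f"{name}: {value:.3f}" for name, value in probs.items())
    print(f"{tickets_sold} tickets -> {row}")
```

Selling more tickets raises the chance of filling seats but also the chance that more than 200 passengers show up, which is the trade-off the example examines.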