Part 1 of the book Statistics for Managers Using Microsoft Excel covers the following topics: introduction and data collection; presenting data in tables and charts; numerical descriptive measures; basic probability; some important discrete probability distributions; and the normal distribution and other continuous distributions.
Business Data Analysis
SCH-MGMT 650

STATISTICS FOR MANAGERS USING Microsoft Excel

David M. Levine
David F. Stephan
Timothy C. Krehbiel
Mark L. Berenson

Custom Edition for UMASS-Amherst
Professor Robert Nakosteen

Taken from: Statistics for Managers: Using Microsoft Excel, Fifth Edition, by David M. Levine, David F. Stephan, Timothy C. Krehbiel, and Mark L. Berenson

Cover photo taken by Lauren Labrecque.

Copyright 2008, 2005, 2002, 1999, 1997 by Pearson Education, Inc. Published by Prentice Hall, Upper Saddle River, New Jersey 07458.

All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

This special edition published in cooperation with Pearson Custom Publishing.

The information, illustrations, and/or software contained in this book, and regarding the above-mentioned programs, are provided "As Is," without warranty of any kind, express or implied, including without limitation any warranty concerning the accuracy, adequacy, or completeness of such information. Neither the publisher, the authors, nor the copyright holders shall be responsible for any claims attributable to errors, omissions, or other inaccuracies contained in this book. Nor shall they be liable for direct, indirect, special, incidental, or consequential damages arising out of the use of such information or material.

All trademarks, service marks, registered trademarks, and registered service marks are the property of their respective owners and are used herein for identification purposes only.

Printed in the United States of America.

ISBN 0-536-04080-X
2008600006 KA

Please visit our web site at www.pearsoncustom.com

PEARSON CUSTOM PUBLISHING
501 Boylston Street, Suite 900, Boston, MA 02116
A Pearson Education Company

To our wives, Marilyn L., Mary N., Patti K., and Rhoda B., and to our
children Sharyn, Mark, Ed, Rudy, Rhonda, Kathy, and Lori

ABOUT THE AUTHORS

The textbook authors meet to discuss statistics at Shea Stadium for a Mets v. Phillies game. Shown left to right: Mark Berenson, David Stephan, David Levine, Tim Krehbiel.

David M. Levine is Professor Emeritus of Statistics and Computer Information Systems at Bernard M. Baruch College (City University of New York). He received B.B.A. and M.B.A. degrees in Statistics from City College of New York and a Ph.D. degree from New York University in Industrial Engineering and Operations Research. He is nationally recognized as a leading innovator in statistics education and is the co-author of 14 books, including such best-selling statistics textbooks as Statistics for Managers Using Microsoft Excel, Basic Business Statistics: Concepts and Applications, Business Statistics: A First Course, and Applied Statistics for Engineers and Scientists Using Microsoft Excel and Minitab. He also recently wrote Even You Can Learn Statistics and Statistics for Six Sigma Green Belts, published by Financial Times-Prentice-Hall. He is coauthor of Six Sigma for Green Belts and Champions and Design for Six Sigma for Green Belts and Champions, also published by Financial Times-Prentice-Hall, and Quality Management, Third Ed., McGraw-Hill-Irwin (2005). He is also the author of Video Review of Statistics and Video Review of Probability, both published by Video Aided Instruction. He has published articles in various journals, including Psychometrika, The American Statistician, Communications in Statistics, Multivariate Behavioral Research, Journal of Systems Management, Quality Progress, and The American Anthropologist, and has given numerous talks at Decision Sciences, American Statistical Association, and Making Statistics More Effective in Schools of Business conferences. While at Baruch College, Dr. Levine received several awards for outstanding teaching and curriculum development.

David F. Stephan is an instructional designer and lecturer who
pioneered the teaching of spreadsheet applications to business school students in the 1980s. He has over 20 years' experience teaching at Baruch College, where he developed the first personal computing lab to support statistics and information systems studies and was twice nominated for his excellence in teaching. He is also proud to have been the lead designer and assistant project director of a U.S. Department of Education FIPSE project that brought interactive, multimedia learning to Baruch College. Today, David focuses on developing materials that help users make better use of the information analysis tools on their computer desktops and is a co-author, with David M. Levine, of Even You Can Learn Statistics.

Timothy C. Krehbiel is Professor of Decision Sciences and Management Information Systems at the Richard T. Farmer School of Business at Miami University in Oxford, Ohio. He teaches undergraduate and graduate courses in business statistics. In 1996 he received the prestigious Instructional Innovation Award from the Decision Sciences Institute. In 2000 he received the Richard T. Farmer School of Business Administration Effective Educator Award. He also received a Teaching Excellence Award from the MBA class of 2000. Krehbiel's research interests span many areas of business and applied statistics. His work appears in numerous journals, including Quality Management Journal, Ecological Economics, International Journal of Production Research, Journal of Marketing Management, Communications in Statistics, Decision Sciences Journal of Innovative Education, Journal of Education for Business, Marketing Education Review, and Teaching Statistics. He is a coauthor of three statistics textbooks published by Prentice Hall: Business Statistics: A First Course, Basic Business Statistics, and Statistics for Managers Using Microsoft Excel. Krehbiel is also a co-author of the book Sustainability Perspectives in Business and Resources. Krehbiel graduated summa cum laude
with a B.A. in history from McPherson College in 1983, and earned an M.S. (1987) and Ph.D. (1990) in statistics from the University of Wyoming.

Mark L. Berenson is Professor of Management and Information Systems at Montclair State University (Montclair, New Jersey) and also Professor Emeritus of Statistics and Computer Information Systems at Bernard M. Baruch College (City University of New York). He currently teaches graduate and undergraduate courses in statistics and in operations management in the School of Business and an undergraduate course in international justice and human rights that he co-developed in the College of Humanities and Social Sciences. Berenson received a B.A. in economic statistics and an M.B.A. in business statistics from City College of New York and a Ph.D. in business from the City University of New York. Berenson's research has been published in Decision Sciences Journal of Innovative Education, Review of Business Research, The American Statistician, Communications in Statistics, Psychometrika, Educational and Psychological Measurement, Journal of Management Sciences and Applied Cybernetics, Research Quarterly, Stats Magazine, The New York Statistician, Journal of Health Administration Education, Journal of Behavioral Medicine, and Journal of Surgical Oncology. His invited articles have appeared in The Encyclopedia of Measurement & Statistics and in Encyclopedia of Statistical Sciences. He is co-author of 11 statistics texts published by Prentice Hall, including Statistics for Managers Using Microsoft Excel, Basic Business Statistics: Concepts and Applications, and Business Statistics: A First Course. Over the years, Berenson has received several awards for teaching and for innovative contributions to statistics education. In 2005 he was the first recipient of The Catherine A. Becker Service for Educational Excellence Award at Montclair State University.

BRIEF CONTENTS

Preface
1 INTRODUCTION AND DATA COLLECTION
2 PRESENTING DATA IN TABLES AND CHARTS
3 NUMERICAL DESCRIPTIVE MEASURES
4 BASIC PROBABILITY
5 SOME IMPORTANT DISCRETE PROBABILITY DISTRIBUTIONS
6 THE NORMAL DISTRIBUTION AND OTHER CONTINUOUS DISTRIBUTIONS
7 SAMPLING AND SAMPLING DISTRIBUTIONS
8 CONFIDENCE INTERVAL ESTIMATION
9 FUNDAMENTALS OF HYPOTHESIS TESTING: ONE-SAMPLE TESTS
10 SIMPLE LINEAR REGRESSION
11 INTRODUCTION TO MULTIPLE REGRESSION
Appendices A-F
Self-Test Solutions and Answers to Selected Even-Numbered Problems
Index

CD-ROM TOPICS

4.5 COUNTING RULES
5.6 USING THE POISSON DISTRIBUTION TO APPROXIMATE THE BINOMIAL DISTRIBUTION
6.6 THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION
7.6 SAMPLING FROM FINITE POPULATIONS
8.7 ESTIMATION AND SAMPLE SIZE DETERMINATION FOR FINITE POPULATIONS
9.7 THE POWER OF A TEST

CHAPTER SIX The Normal Distribution and Other Continuous Distributions

The interquartile range of 7.0 is approximately 1.41 standard deviations. (In a normal distribution, the interquartile range is 1.33 standard deviations.)
The range of 35.6 is equal to 7.19 standard deviations. (In a normal distribution, the range is approximately six standard deviations.)
74.2% of the returns are within ±1 standard deviation of the mean. (In a normal distribution, 68.26% of the values lie between the mean ± 1 standard deviation.)
83.3% of the returns are within ±1.28 standard deviations of the mean. (In a normal distribution, 80% of the values lie between the mean ± 1.28 standard deviations.)
Based on these statements and the criteria given on pages 234-235, you can conclude that the three-year returns are right-skewed and are not normally distributed.

Constructing the Normal Probability Plot

A normal probability plot is a graphical approach for evaluating whether data are normally distributed. One common approach is called the quantile-quantile plot. In this method, you transform each ordered value to a Z value and then plot the data values versus the Z values. For example, if you have a sample of n = 19, the Z value for the smallest value corresponds to a cumulative area of

$$\frac{1}{n+1} = \frac{1}{19+1} = \frac{1}{20} = 0.05$$

The Z value for a cumulative area of 0.05 (from Table E.2) is -1.65. Table 6.7 illustrates the entire set of Z values for a sample of n = 19.

TABLE 6.7 Ordered Values and Corresponding Z Values for a Sample of n = 19

Ordered Value   Z Value     Ordered Value   Z Value
      1          -1.65           11           0.13
      2          -1.28           12           0.25
      3          -1.04           13           0.39
      4          -0.84           14           0.52
      5          -0.67           15           0.67
      6          -0.52           16           0.84
      7          -0.39           17           1.04
      8          -0.25           18           1.28
      9          -0.13           19           1.65
     10           0.00

The Z values are plotted on the X axis, and the corresponding values of the variable are plotted on the Y axis. Figure 6.21 illustrates the typical shape of normal probability plots for a left-skewed distribution (Panel A), a normal distribution (Panel B), and a right-skewed distribution (Panel C). If the data are left-skewed, the curve will rise more rapidly at first and then level off. If the data are normally distributed, the points will plot along an approximately straight line. If the data are right-skewed, the curve will rise more slowly at first and then rise at a faster rate for higher values of the variable being plotted.

FIGURE 6.21 Normal probability plots for a left-skewed distribution (Panel A), a normal distribution (Panel B), and a right-skewed distribution (Panel C)

Statistics for Managers Using Microsoft Excel, Fifth Edition, by David M. Levine, Mark L. Berenson, and Timothy C. Krehbiel. Published by Prentice Hall. Copyright 2008 by Pearson Education, Inc.
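The quantile-quantile construction described above can be sketched in Python. This is only an illustration under stated assumptions, not the Excel procedure the text refers to in Section E6.2: the function name `qq_z_values` is hypothetical, and the inverse of the cumulative standardized normal distribution comes from the standard-library class `statistics.NormalDist` rather than from Table E.2, so a value can differ from a table lookup in the second decimal place (about -1.645 here versus the tabled -1.65).

```python
from statistics import NormalDist


def qq_z_values(n):
    """Return the Z value for each ordered position i = 1..n.

    Position i is assigned the cumulative area i / (n + 1); the inverse
    cumulative standardized normal distribution converts that area to Z.
    (Hypothetical helper name, used only for this illustration.)
    """
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]


# Z values for a sample of n = 19, as in Table 6.7
z = qq_z_values(19)
for position, value in enumerate(z, start=1):
    print(position, round(value, 2))
```

To draw the plot itself, pair each Z value with the corresponding ordered data value and plot Z on the X axis against the data on the Y axis; an approximately straight line suggests approximate normality.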
Figure 6.22 shows a Microsoft Excel quantile-quantile normal probability plot for the three-year returns.

FIGURE 6.22 Microsoft Excel normal probability plot for three-year returns. See Section E6.2 to create this.

Figure 6.22 shows that the three-year returns rise more slowly at first and subsequently rise more rapidly. This occurs because the data are right-skewed. Thus, the data are not normally distributed.

PROBLEMS FOR SECTION 6.3

Learning the Basics

6.14 Show that for a sample of n = 39, the smallest and largest Z values are -1.96 and +1.96, and the middle (that is, 20th) Z value is 0.00.

6.15 For a sample of n = 6, list the six Z values.

Applying the Concepts

6.16 The data in the file chicken.xls contain the total fat, in grams per serving, for a sample of 20 chicken sandwiches from fast-food chains. The data are as follows:

16 20 20 24 19 30 23 30 25 19 29 29 30 30 40 56

Source: Extracted from "Fast Food: Adding Health to the Menu," Consumer Reports, September 2004, pp. 28-31.

Decide whether the data appear to be approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

6.17 A problem with a telephone line that prevents a customer from receiving or making calls is disconcerting to both the customer and the telephone company. The following data (stored in the file phone.xls) represent two samples of 20 problems reported to two different offices of a telephone company. The time to clear these problems from the customers' lines is recorded, in minutes.

Central Office I Time to Clear Problems (Minutes)
1.48 1.75 0.78 2.85 0.52 1.60 4.15 3.97 1.48 3.10
1.02 0.53 0.93 1.60 0.80 1.05 6.32 3.93 5.45 0.97

Central Office II Time to Clear Problems (Minutes)
7.55 3.75 0.10 1.10 0.60 0.52 3.30 2.10 0.58 4.02
3.75 0.65 1.92 0.60 1.53 4.23 0.08 1.48 1.65 0.72

For each of the two central office locations, decide whether the data appear to be
approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

6.18 Many manufacturing processes use the term work-in-process (often abbreviated as WIP). In a book manufacturing plant, the WIP represents the time it takes for sheets from a press to be folded, gathered, sewn, tipped on end sheets, and bound. The following data (stored in the file wip.xls) represent samples of 20 books at each of two production plants and the processing time (operationally defined as the time, in days, from when the books came off the press to when they were packed in cartons) for these jobs:

Plant A
5.62 5.29 16.25 10.92 11.46 21.62 8.45 8.58 5.41 11.42
11.62 7.29 7.50 7.96 4.42 10.50 7.58 9.29 7.54 18.92

Plant B
9.54 11.46 16.62 12.62 25.75 15.41 14.29 13.13 13.71 10.04
5.75 12.46 9.17 13.21 6.00 2.33 14.25 5.37 6.25 9.71

For each of the two plants, decide whether the data appear to be approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

6.19 The data in the file spending.xls represent the per capita spending, in thousands of dollars, for each state in 2004. Decide whether the data appear to be approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

6.20 One operation of a mill is to cut pieces of steel into parts that will later be used as the frame for front seats in an automotive plant. The steel is cut with a diamond saw and requires the resulting parts to be within 0.005 inch of the length specified by the automobile company. The data come from a sample of 100 steel parts and are stored in the file steel.xls. The measurement reported is the difference, in inches, between the actual length of the steel part, as measured by a laser measurement device, and the specified length of the steel part. Decide whether the data appear to be approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

6.21 The data in the file savings.xls are the yields for a money market account, a one-year certificate of deposit (CD), and a five-year CD for 40 banks in south Florida as of December 20, 2005 (extracted from Bankrate.com, December 20, 2005). For each of the three types of investments, decide whether the data appear to be approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

6.22 The following data, stored in the file utility.xls, represent the electricity costs, in dollars, during July 2006 for a random sample of 50 two-bedroom apartments in a large city:

96 171 202 178 147 102 153 197 127 157
185 82 90 116 172 111 148 213 130 165
141 149 206 175 123 128 144 168 109 167
95 163 150 154 130 143 187 166 139 149
108 119 183 151 114 135 191 137 129 158

Decide whether the data appear to be approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

6.4 THE UNIFORM DISTRIBUTION

In the uniform distribution, a value has the same probability of occurrence anywhere in the range between the smallest value, a, and the largest value, b. Because of its shape, the uniform distribution is sometimes called the rectangular distribution (see Panel B of Figure 6.1 on page 218). Equation (6.5) defines the continuous probability density function for the uniform distribution.

UNIFORM DISTRIBUTION

$$f(X) = \frac{1}{b-a} \text{ if } a \le X \le b \text{ and } 0 \text{ elsewhere}$$ (6.5)

where
a = the minimum value of X
b = the maximum value of X

Equation (6.6) defines the mean of the uniform distribution.

MEAN OF THE UNIFORM DISTRIBUTION

$$\mu = \frac{a+b}{2}$$ (6.6)

Equation (6.7) defines the variance and standard deviation of the uniform distribution.

VARIANCE AND STANDARD DEVIATION OF THE UNIFORM DISTRIBUTION

$$\sigma^2 = \frac{(b-a)^2}{12}$$ (6.7a)

$$\sigma = \sqrt{\frac{(b-a)^2}{12}}$$ (6.7b)

One of the most common uses of the uniform distribution is in the selection of random numbers. When you use simple random sampling (see Section 7.1), you assume that each value comes from a uniform distribution that has a minimum value of 0 and a maximum value of 1.

Figure 6.23 illustrates the uniform distribution with a = 0 and b = 1. The total area inside the rectangle is equal to the base (1.0) times the height (1.0). Thus, the resulting area of 1.0 satisfies the requirement that the area under any probability density function equals 1.0.

FIGURE 6.23 Probability density function for a uniform distribution with a = 0 and b = 1

In such a distribution, what is the probability of getting a random number between 0.10 and 0.30?
The area between 0.10 and 0.30, depicted in Figure 6.24, is equal to the base (which is 0.30 - 0.10 = 0.20) times the height (1.0). Therefore,

$$P(0.10 < X < 0.30) = (\text{Base})(\text{Height}) = (0.20)(1.0) = 0.20$$

FIGURE 6.24 Finding P(0.10 < X < 0.30) for a uniform distribution with a = 0 and b = 1

From Equations (6.6) and (6.7), the mean and standard deviation of the uniform distribution for a = 0 and b = 1 are computed as follows:

$$\mu = \frac{a+b}{2} = \frac{0+1}{2} = 0.5$$

and

$$\sigma^2 = \frac{(b-a)^2}{12} = \frac{(1-0)^2}{12} = \frac{1}{12} = 0.0833$$

$$\sigma = \sqrt{0.0833} = 0.2887$$

Thus, the mean is 0.5 and the standard deviation is 0.2887.

PROBLEMS FOR SECTION 6.4

Learning the Basics

6.23 Suppose you sample one value from a uniform distribution with a = and b = 10. What is the probability that the value will be
a. between and 7?
b. between and 3?
c. What is the mean?
d. What is the standard deviation?

Applying the Concepts

6.24 The time between arrivals of customers at a bank during the noon-to-1 p.m. hour has a uniform distribution between to 120 seconds. What is the probability that the time between the arrival of two customers will be
a. less than 20 seconds?
b. between 10 and 30 seconds?
c. more than 35 seconds?
d. What are the mean and standard deviation of the time between arrivals?

6.25 A study of the time spent shopping in a supermarket for a market basket of 20 specific items showed an approximately uniform distribution between 20 minutes and 40 minutes. What is the probability that the shopping time will be
a. between 25 and 30 minutes?
b. less than 35 minutes?
c. What are the mean and standard deviation of the shopping time?
6.26 The time to failure for a continuous-operation monitoring device of air quality has a uniform distribution over a 24-hour day.
a. If a failure occurs on a day when daylight is between 5:55 a.m. and 7:38 p.m., what is the probability that the failure will occur during daylight hours?
b. If the device is in secondary mode from 10 p.m. to a.m., what is the probability that if a failure occurs, it will happen during secondary mode?
c. If the device has a self-checking computer chip that determines whether the device is operational every hour on the hour, what is the probability that a failure will be detected within 10 minutes of its occurrence?
d. If the device has a self-checking computer chip that determines whether the device is operational every hour on the hour, what is the probability that it will take at least 40 minutes to detect that a failure has occurred?

6.27 The scheduled commuting time on the Long Island Rail Road from Glen Cove to New York City is 65 minutes. Suppose that the actual commuting time is uniformly distributed between 64 and 74 minutes. What is the probability that the commuting time will be
a. less than 70 minutes?
b. between 65 and 70 minutes?
c. greater than 65 minutes?
d. What are the mean and standard deviation of the commuting time?
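The uniform-distribution calculations of Section 6.4 (the base-times-height probability and Equations (6.6) and (6.7)) can be checked with a short Python sketch. The function names `uniform_prob`, `uniform_mean`, and `uniform_std` are hypothetical and chosen only for this illustration; clipping the interval to [a, b] is an added convenience for intervals that extend past the range, not something the text spells out.

```python
import math


def uniform_prob(a, b, lo, hi):
    """P(lo < X < hi) for X uniform on [a, b]: base times height."""
    lo, hi = max(lo, a), min(hi, b)  # keep only the part inside [a, b]
    if hi <= lo:
        return 0.0
    height = 1.0 / (b - a)  # Equation (6.5)
    return (hi - lo) * height


def uniform_mean(a, b):
    """Mean of the uniform distribution, Equation (6.6)."""
    return (a + b) / 2


def uniform_std(a, b):
    """Standard deviation of the uniform distribution, Equation (6.7b)."""
    return math.sqrt((b - a) ** 2 / 12)


# The worked example from the text: a = 0, b = 1
print(round(uniform_prob(0, 1, 0.10, 0.30), 4))  # 0.2
print(uniform_mean(0, 1))                        # 0.5
print(round(uniform_std(0, 1), 4))               # 0.2887
```

The same three functions apply to the section problems once a and b are read from the problem statement, for example a = 20 and b = 40 minutes in Problem 6.25.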
6.5 THE EXPONENTIAL DISTRIBUTION

The exponential distribution is a continuous distribution that is right-skewed and ranges from zero to positive infinity (see Panel C of Figure 6.1 on page 218). The exponential distribution is widely used in waiting-line (or queuing) theory to model the length of time between arrivals in processes such as customers at a bank's ATM, patients entering a hospital emergency room, and hits on a Web site.

The exponential distribution is defined by a single parameter, its mean, $\lambda$, the mean number of arrivals per unit of time. The value $1/\lambda$ is equal to the mean time between arrivals. For example, if the mean number of arrivals in a minute is $\lambda = 4$, then the mean time between arrivals is $1/\lambda = 0.25$ minutes, or 15 seconds. Equation (6.8) defines the probability that the length of time before the next arrival is less than X.

EXPONENTIAL DISTRIBUTION

$$P(\text{Arrival time} < X) = 1 - e^{-\lambda X}$$ (6.8)

where
e = the mathematical constant approximated by 2.71828
$\lambda$ = the mean number of arrivals per unit
X = any value of the continuous variable where $0 < X < \infty$

To illustrate the exponential distribution, suppose that customers arrive at a bank's ATM at a rate of 20 per hour. If a customer has just arrived, what is the probability that the next customer will arrive within 6 minutes (that is, 0.1 hour)?
For this example, $\lambda = 20$ and X = 0.1. Using Equation (6.8),

$$P(\text{Arrival time} < 0.1) = 1 - e^{-20(0.1)} = 1 - e^{-2} = 1 - 0.1353 = 0.8647$$

Thus, the probability that a customer will arrive within 6 minutes is 0.8647, or 86.47%. You can also use Microsoft Excel to compute this probability (see Figure 6.25).

FIGURE 6.25 Microsoft Excel worksheet for finding exponential probabilities (mean = $\lambda$). See Section E6.3 to create this.

EXAMPLE: COMPUTING EXPONENTIAL PROBABILITIES

In the ATM example, what is the probability that the next customer will arrive within 3 minutes (that is, 0.05 hour)?

SOLUTION For this example, $\lambda = 20$ and X = 0.05. Using Equation (6.8),

$$P(\text{Arrival time} < 0.05) = 1 - e^{-20(0.05)} = 1 - e^{-1} = 1 - 0.3679 = 0.6321$$

Thus, the probability that a customer will arrive within 3 minutes is 0.6321, or 63.21%.

PROBLEMS FOR SECTION 6.5

Learning the Basics

6.28 Given an exponential distribution with $\lambda = 10$, what is the probability that the arrival time is
a. less than X = 0.1?
b. greater than X = 0.1?
c. between X = 0.1 and X = 0.2?
d. less than X = 0.1 or greater than X = 0.2?

6.29 Given an exponential distribution with $\lambda = 30$, what is the probability that the arrival time is
a. less than X = 0.1?
b. greater than X = 0.1?
c. between X = 0.1 and X = 0.2?
d. less than X = 0.1 or greater than X = 0.2?

6.30 Given an exponential distribution with $\lambda = 20$, what is the probability that the arrival time is
a. less than X = 4?
b. greater than X = 0.4?
c. between X = 0.4 and X = 0.5?
d. less than X = 0.4 or greater than X = 0.5?

Applying the Concepts

6.31 Autos arrive at a toll plaza located at the entrance to a bridge at the rate of 50 per minute during the 5:00-6:00 p.m. hour. If an auto has just arrived,
a. what is the probability that the next auto will arrive within 3 seconds (0.05 minute)?
b. what is the probability that the next auto will arrive within 1 second (0.0167 minute)?
c. What are your answers to (a) and (b) if the rate of arrival of autos is 60 per minute?
d. What are your answers to (a) and (b) if the rate of arrival of autos is 30 per minute?

6.32 Customers arrive at the drive-up window of a fast-food restaurant at a rate of per minute during the lunch hour.
a. What is the probability that the next customer will arrive within 1 minute?
b. What is the probability that the next customer will arrive within minutes?
c. During the dinner time period, the arrival rate is per minute. What are your answers to (a) and (b) for this period?

6.33 Telephone calls arrive at the information desk of a large computer software company at a rate of 15 per hour.
a. What is the probability that the next call will arrive within 3 minutes (0.05 hour)?
b. What is the probability that the next call will arrive within 15 minutes (0.25 hour)?
c. Suppose the company has just introduced an updated version of one of its software programs, and telephone calls are now arriving at a rate of 25 per hour. Given this information, redo (a) and (b).

6.34 An on-the-job injury occurs once every 10 days on average at an automobile plant. What is the probability that the next on-the-job injury will occur within
a. 10 days?
b. days?
c. 1 day?
6.35 The time between unplanned shutdowns of a power plant has an exponential distribution with a mean of 20 days. Find the probability that the time between two unplanned shutdowns is
a. less than 14 days.
b. more than 21 days.
c. less than days.

6.36 Golfers arrive at the starter's booth of a public golf course at a rate of per hour during the Monday-to-Friday midweek period. If a golfer has just arrived,
a. what is the probability that the next golfer will arrive within 15 minutes (0.25 hour)?
b. what is the probability that the next golfer will arrive within 3 minutes (0.05 hour)?
c. The actual arrival rate on Fridays is 15 per hour. What are your answers to (a) and (b) for Fridays?

6.37 TrafficWeb.org claims that it can deliver 10,000 hits to a Web site in the next 60 days for only $21.95 (www.trafficweb.org, April 26, 2004). If this amount of Web site traffic is experienced, then the time between hits has a mean of 8.64 minutes (or a rate of 0.116 per minute). Assume that your Web site does get 10,000 hits in the next 60 days and that the time between hits has an exponential distribution. What is the probability that the time between two hits is
a. less than minutes?
b. less than 10 minutes?
c. more than 15 minutes?
d. Do you think it is reasonable to assume that the time between hits has an exponential distribution?
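Equation (6.8) from Section 6.5 is simple enough to evaluate directly, and a short Python sketch reproduces the two ATM results worked in the text. The function name `exponential_cdf` and the handling of nonpositive inputs are assumptions made for this illustration; the text itself uses a Microsoft Excel worksheet (Figure 6.25) for the same calculation.

```python
import math


def exponential_cdf(lam, x):
    """P(arrival time < x) = 1 - e^(-lam * x), Equation (6.8).

    lam is the mean number of arrivals per unit of time and x is a
    length of time in that same unit. (Hypothetical helper name.)
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if x <= 0:
        return 0.0  # no probability below zero waiting time
    return 1.0 - math.exp(-lam * x)


# ATM example: 20 arrivals per hour
print(round(exponential_cdf(20, 0.1), 4))   # 0.8647 (within 6 minutes)
print(round(exponential_cdf(20, 0.05), 4))  # 0.6321 (within 3 minutes)
```

When a problem states the mean time between arrivals instead of the rate (as in Problem 6.35, with a mean of 20 days), the rate to pass in is the reciprocal, 1/20 per day. Probabilities of the form "more than x" follow as 1 minus the value returned.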
6.6 (CD-ROM Topic) THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION

In many circumstances, the normal distribution can be used to approximate the binomial distribution. For further discussion, see section 6.6.pdf on the Student CD-ROM that accompanies this text.

SUMMARY

In this chapter, you used the normal distribution in the Using Statistics scenario to study the time to download a Web page. In addition, you studied the uniform distribution, the exponential distribution, and the normal probability plot. In Chapter 7, the normal distribution is used in developing the subject of statistical inference.

KEY EQUATIONS

Normal Probability Density Function
$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(1/2)[(X-\mu)/\sigma]^2}$$ (6.1)

Transformation Formula
$$Z = \frac{X-\mu}{\sigma}$$ (6.2)

Standardized Normal Probability Density Function
$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-(1/2)Z^2}$$ (6.3)

Finding an X Value Associated with Known Probability
$$X = \mu + Z\sigma$$ (6.4)

Uniform Distribution
$$f(X) = \frac{1}{b-a}$$ (6.5)

Mean of the Uniform Distribution
$$\mu = \frac{a+b}{2}$$ (6.6)

Variance and Standard Deviation of the Uniform Distribution
$$\sigma^2 = \frac{(b-a)^2}{12}$$ (6.7a)
$$\sigma = \sqrt{\frac{(b-a)^2}{12}}$$ (6.7b)

Exponential Distribution
$$P(\text{Arrival time} < X) = 1 - e^{-\lambda X}$$ (6.8)

KEY TERMS

continuous probability density function
cumulative standardized normal distribution
exponential distribution
normal distribution
normal probability density function
normal probability plot
quantile-quantile plot
rectangular distribution
standardized normal random variable
transformation formula
uniform distribution

CHAPTER REVIEW PROBLEMS

Checking Your Understanding

6.38 Why is it that only one normal distribution table such as Table E.2 is needed to find any probability under the normal curve?

6.39 How do you find the area between two values under the normal curve?
6.40 How do you find the X value that corresponds to a given percentile of the normal distribution?

6.41 What are some of the distinguishing properties of a normal distribution?

6.42 How does the shape of the normal distribution differ from those of the uniform and exponential distributions?

6.43 How can you use the normal probability plot to evaluate whether a set of data is normally distributed?

6.44 Under what circumstances can you use the exponential distribution?

Applying the Concepts

6.45 An industrial sewing machine uses ball bearings that are targeted to have a diameter of 0.75 inch. The lower and upper specification limits under which the ball bearings can operate are 0.74 inch and 0.76 inch, respectively. Past experience has indicated that the actual diameter of the ball bearings is approximately normally distributed, with a mean of 0.753 inch and a standard deviation of 0.004 inch. What is the probability that a ball bearing is
a. between the target and the actual mean?
b. between the lower specification limit and the target?
c. above the upper specification limit?
d. below the lower specification limit?
e. 93% of the diameters are greater than what value?

6.46 The fill amount of soft drink bottles is normally distributed, with a mean of 2.0 liters and a standard deviation of 0.05 liter. If bottles contain less than 95% of the listed net content (1.90 liters, in this case), the manufacturer may be subject to penalty by the state office of consumer affairs. Bottles that have a net content above 2.10 liters may cause excess spillage upon opening. What proportion of the bottles will contain
a. between 1.90 and 2.0 liters?
b. between 1.90 and 2.10 liters?
c. below 1.90 liters or above 2.10 liters?
d. 99% of the bottles contain at least how much soft drink?
e. 99% of the bottles contain an amount that is between which two values (symmetrically distributed) around the mean?
6.47 In an effort to reduce the number of bottles that contain less than 1.90 liters, the bottler in Problem 6.46 sets the filling machine so that the mean is 2.02 liters. Under these circumstances, what are your answers in (a) through (e)?

6.48 An orange juice producer buys all his oranges from a large orange grove. The amount of juice squeezed from each of these oranges is approximately normally distributed, with a mean of 4.70 ounces and a standard deviation of 0.40 ounce.
a. What is the probability that a randomly selected orange will contain between 4.70 and 5.00 ounces?
b. What is the probability that a randomly selected orange will contain between 5.00 and 5.50 ounces?
c. 77% of the oranges will contain at least how many ounces of juice?
d. 80% of the oranges contain between what two values (in ounces), symmetrically distributed around the population mean?

6.49 Data concerning 58 of the best-selling domestic beers in the United States are located in the file domesticbeer.xls. The values for three variables are included: percentage alcohol, number of calories per 12 ounces, and number of carbohydrates (in grams) per 12 ounces. For each of the three variables, decide whether the data appear to be approximately normally distributed. Support your decision through the use of appropriate statistics and graphs.
Source: Extracted from www.Beer100.com, March 31, 2006.

6.50 The evening manager of a restaurant was very concerned about the length of time some customers were waiting in line to be seated. She also had some concern about the seating times, that is, the length of time between when a customer is seated and the time he or she leaves the restaurant. Over the course of one week, 100 customers (no more than per party) were randomly selected, and their waiting and seating times (in minutes) were recorded in the file wait.xls.
a. Think about your favorite restaurant. Do you think waiting times more closely resemble a uniform, exponential, or normal
distribution?
b. Again, think about your favorite restaurant. Do you think seating times more closely resemble a uniform, exponential, or normal distribution?
c. Construct a histogram and a normal probability plot of the waiting times. Do you think these waiting times more closely resemble a uniform, exponential, or normal distribution?
d. Construct a histogram and a normal probability plot of the seating times. Do you think these seating times more closely resemble a uniform, exponential, or normal distribution?

6.51 At the end of the first quarter of 2006, all the major stock market indexes had posted strong gains in the past 12 months. Mass Mutual Financial Group credited the increases to solid growth in corporate profits ("Market Commentary: Economic Growth Characterizes Q1 2006," www.massmutual.com, May 1, 2006). The mean one-year return for stocks in the S&P 500, a group of 500 very large companies, was approximately 12%. The mean one-year return for companies in the Russell 2000, a group of 2000 small companies, was approximately 26%. Historically, the one-year returns are approximately normal, the standard deviation in the S&P 500 is approximately 20%, and the standard deviation in the Russell 2000 is approximately 35%.
a. What is the probability that a stock in the S&P 500 gained 25% or more in the last year? Gained 50% or more?
b. What is the probability that a stock in the S&P 500 lost money in the last year? Lost 25% or more? Lost 50% or more?
c. Repeat (a) and (b) for a stock in the Russell 2000.
d. Write a short summary on your findings. Be sure to include a discussion of the risks associated with a large standard deviation.

6.52 The New York Times reported (L. J. Flynn, "Tax Surfing," The New York Times, March 25, 2002, p. C10) that the mean time to download the home page for the Internal Revenue Service, www.irs.gov, is 0.8 second. Suppose that the download time is normally distributed with a standard deviation of 0.2 seconds. What is the probability that a download time is
a. less than second?
b. between 0.5 and 1.5 seconds?
c. above 0.5 second?
d. 99% of the download times are above how many seconds?
e. 95% of the download times are between what two values, symmetrically distributed around the mean?

6.53 The same article mentioned in Problem 6.52 also reported that the mean download time for the H&R Block Web site, www.hrblock.com, is 2.5 seconds. Suppose that the download time is normally distributed with a standard deviation of 0.5 second. What is the probability that a download time is
a. less than second?
b. between 0.5 and 1.5 seconds?
c. above 0.5 second?
d. 99% of the download times are above how many seconds?
e. Compare the results for the IRS site computed in Problem 6.52 to those of the H&R Block site.

6.54 (Class Project) According to Burton G. Malkiel, the daily changes in the closing price of stock follow a random walk, that is, these daily events are independent of each other and move upward or downward in a random manner, and can be approximated by a normal distribution. To test this theory, use either a newspaper or the Internet to select one company traded on the NYSE, one company traded on the American Stock Exchange, and one company traded over the counter (that is, on the NASDAQ national market) and then do the following:
Record the daily closing stock price of each of these companies for six consecutive weeks (so that you have 30 values per company).
Record the daily changes in the closing stock price of each of these companies for six consecutive weeks (so that you have 30 values per company).
For each of your six data sets, decide whether the data are approximately normally distributed by
a. examining the stem-and-leaf display, histogram or polygon, and box-and-whisker plot.
b. comparing data characteristics to theoretical properties.
c. constructing a normal probability plot.
d. Discuss the results of (a) through (c). What can you say about your three stocks with respect to daily closing prices and daily changes in closing prices? Which, if any, of the data sets are approximately normally distributed?
Note: The random-walk theory pertains to the daily changes in the closing stock price, not the daily closing stock price.

Team Projects

The data file Mutual Funds.xls contains information regarding nine variables from a sample of 838 mutual funds. The variables are:
Category: Type of stocks comprising the mutual fund (small cap, mid cap, large cap)
Objective: Objective of stocks comprising the mutual fund (growth or value)
Assets: In millions of dollars
Fees: Sales charges (no or yes)
Expense ratio: Ratio of expenses to net assets, in percentage
Risk: Risk-of-loss factor of the mutual fund (low, average, high)
2005 return: Twelve-month return in 2005
Three-year return: Annualized return, 2003-2005
Five-year return: Annualized return, 2001-2005

6.55 For the expense ratio in percentage, 2005 return, and five-year return, decide whether the data are approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

Student Survey Data Base

6.56 Problem 1.27 on page 15 describes a survey of 50 undergraduate students (see the file undergradsurvey.xls). For these data, for each numerical variable, decide whether the data are approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

6.57 Problem 1.27 on page 15 describes a survey of 50 undergraduate students (see the file undergradsurvey.xls).
a. Select a sample of 50 undergraduate students and conduct a similar survey for those students.
b. For the data collected in (a), repeat (a) and (b) of Problem 6.56.
c. Compare the results of (b) to those of Problem 6.56.

6.58 Problem 1.28 on page 15 describes a survey of 50 MBA students
(see the file gradsurvey.xls). For these data, for each numerical variable, decide whether the data are approximately normally distributed by
a. comparing data characteristics to theoretical properties.
b. constructing a normal probability plot.

6.59 Problem 1.28 on page 15 describes a survey of 50 MBA students (see the file gradsurvey.xls).
a. Select a sample of 50 undergraduate students and conduct a similar survey for those students.
b. For the data collected in (a), repeat (a) and (b) of Problem 6.58.
c. Compare the results of (b) to those of Problem 6.58.

Managing the Springville Herald

The production department of the newspaper has embarked on a quality improvement effort. Its first project relates to the blackness of the newspaper print. Each day, a determination needs to be made concerning how black the newspaper is printed. Blackness is measured on a standard scale in which the target value is 1.0. Data collected over the past year indicate that the blackness is normally distributed, with a mean of 1.005 and a standard deviation of 0.10. Each day, one spot on the first newspaper printed is chosen, and the blackness of the spot is measured. The blackness of the newspaper is considered acceptable if the blackness of the spot is between 0.95 and 1.05.

EXERCISES

SH6.1 Assuming that the distribution has not changed from what it was in the past year, what is the probability that the blackness of the spot is
a. less than 1.0?
b. between 0.95 and 1.0?
c. between 1.0 and 1.05?
d. less than 0.95 or greater than 1.05?

SH6.2 The objective of the production team is to reduce the probability that the blackness is below 0.95 or above 1.05. Should the team focus on process improvement that lowers the mean to the target value of 1.0 or on process improvement that reduces the standard deviation to 0.075?
Explain.

Web Case

Apply your knowledge about the normal distribution in this Web Case, which extends the Using Statistics scenario from this chapter.

To satisfy concerns of potential advertisers, the management of OurCampus! has undertaken a research project to learn the amount of time it takes users to download a complex video features page. The marketing department has collected data and has made some claims based on the assertion that the data follow a normal distribution. These data and conclusions can be found in a report located on the internal Web page www.prenhall.com/Springville/Our_DownloadResearch.htm (or in the file with the same name in the Student CD-ROM Web Case folder).

Read this marketing report and then answer the following:
Can the collected data be approximated by the normal distribution?
Review and evaluate the conclusions made by the OurCampus! marketing department. Which conclusions are correct? Which ones are incorrect?
If OurCampus! could improve the mean time by five minutes, how would the probabilities change?
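The last question above, like most problems in this chapter review, comes down to evaluating areas under a normal curve with the transformation formula, Z = (X - mean) / standard deviation, and, in reverse, with X = mean + Z × standard deviation. The following is a minimal Python sketch of those calculations, offered only as an illustration and not as the method used in this text, which relies on Table E.2, Excel, and PHStat2. It assumes Python 3.8 or later for the standard library class statistics.NormalDist. The function names, and the mean of 50 and standard deviation of 10 used in the demonstration, are hypothetical and are not taken from any problem or from the Web Case report.

```python
from statistics import NormalDist

# Standardized normal distribution (mean 0, standard deviation 1).
STANDARD_NORMAL = NormalDist()


def z_score(x, mean, sd):
    """Transformation formula: Z = (X - mean) / sd."""
    return (x - mean) / sd


def prob_below(x, mean, sd):
    """P(X < x): cumulative standardized normal area to the left of Z."""
    return STANDARD_NORMAL.cdf(z_score(x, mean, sd))


def prob_above(x, mean, sd):
    """P(X > x): complement of the cumulative area."""
    return 1.0 - prob_below(x, mean, sd)


def prob_between(lower, upper, mean, sd):
    """P(lower < X < upper): difference of two cumulative areas."""
    return prob_below(upper, mean, sd) - prob_below(lower, mean, sd)


def x_for_cumulative(p, mean, sd):
    """X value with cumulative area p below it: X = mean + Z * sd."""
    z = STANDARD_NORMAL.inv_cdf(p)
    return mean + z * sd


if __name__ == "__main__":
    # Hypothetical values chosen for illustration only.
    mean, sd = 50.0, 10.0
    print("P(X < 42)      =", round(prob_below(42.0, mean, sd), 4))
    print("P(40 < X < 60) =", round(prob_between(40.0, 60.0, mean, sd), 4))
    print("P(X > 60)      =", round(prob_above(60.0, mean, sd), 4))
    print("90th percentile =", round(x_for_cumulative(0.90, mean, sd), 2))
```

Calling the same functions again with a different mean or standard deviation shows how a shift in either parameter changes each probability.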
REFERENCES

Gunter, B., "Q-Q Plots," Quality Progress (February 1994), 81-86.
Levine, D. M., P. Ramsey, and R. Smidt, Applied Statistics for Engineers and Scientists Using Microsoft Excel and Minitab (Upper Saddle River, NJ: Prentice Hall, 2001).
Microsoft Excel 2007 (Redmond, WA: Microsoft Corp., 2007).

Excel Companion to Chapter 6

E6.1 COMPUTING NORMAL PROBABILITIES

You compute normal probabilities either by using the PHStat2 Normal procedure or by making entries in the Normal.xls workbook.

Using PHStat2 Normal

Select PHStat → Probability & Prob. Distributions → Normal. In the Normal Probability Distribution dialog box (shown below), enter values for the mean and standard deviation. Click one or more of the input options and enter values. Enter a title as the Title and click OK.

For Example 6.1, click Probability for: X > and enter in its box. For Example 6.2, click Probability for range and enter as the from value and as the to value. For Example 6.5, click Probability for: X