Essentials of statistics 5e global edition triola 1

Essentials of Statistics, Fifth Edition, Global Edition
Mario F. Triola

For these Global Editions, the editorial team at Pearson has collaborated with educators across the world to address a wide range of subjects and requirements, equipping students with the best possible learning tools. This Global Edition preserves the cutting-edge approach and pedagogy of the original, but also features alterations, customization and adaptation from the North American version.

This is a special edition of an established title widely used by colleges and universities throughout the world. Pearson published this exclusive edition for the benefit of students outside the United States and Canada. If you purchased this book within the United States or Canada you should be aware that it has been imported without the approval of the Publisher or Author.

Pearson Global Edition

Symbol Table

A̅  complement of event A
H0  null hypothesis
H1  alternative hypothesis
α  alpha; probability of a type I error or the area of the critical region
β  beta; probability of a type II error
r  sample linear correlation coefficient
ρ  rho; population linear correlation coefficient
r²  coefficient of determination
r_s  Spearman’s rank correlation coefficient
b1  point estimate of the slope of the regression line
b0  point estimate of the y-intercept of the regression line
ŷ  predicted value of y
d  difference between two matched values
d̄  mean of the differences d found from matched sample data
s_d  standard deviation of the differences d found from matched sample data
s_e  standard error of estimate
μ_x̄  mean of the population of all possible sample means x̄
σ_x̄  standard deviation of the population of all possible sample means x̄
E  margin of error of the estimate of a population parameter, or expected value
Q1, Q2, Q3  quartiles
D1, D2, …, D9  deciles
P1, P2, …, P99  percentiles
x  data value
f  frequency with which a value occurs
Σ  capital sigma; summation
Σx  sum of the values
Σx²  sum of the squares of the values
(Σx)²  square of the sum of all values
Σxy  sum of the products of each x value multiplied by the corresponding y value
n  number of values in a sample
n!  n factorial
N  number of values in a finite population; also used as the size of all samples combined
k  number of samples or populations or categories
x̄  mean of the values in a sample
μ  mu; mean of all values in a population
s  standard deviation of a set of sample values
σ  lowercase sigma; standard deviation of all values in a population
s²  variance of a set of sample values
σ²  variance of all values in a population
z  standard score
z_α/2  critical value of z
t  t distribution
t_α/2  critical value of t
df  number of degrees of freedom
F  F distribution
χ²  chi-square distribution
χ²_R  right-tailed critical value of chi-square
χ²_L  left-tailed critical value of chi-square
p  probability of an event or the population proportion
q  probability or proportion equal to 1 − p
p̂  sample proportion
q̂  sample proportion equal to 1 − p̂
p̄  proportion obtained by pooling two samples
q̄  proportion or probability equal to 1 − p̄
P(A)  probability of event A
P(A | B)  probability of event A, assuming event B has occurred
nPr  number of permutations of n items selected r at a time
nCr  number of combinations of n items selected r at a time

Boston, Columbus, Indianapolis, New York, San Francisco, Upper Saddle River, Amsterdam, Cape Town, Dubai, London, Madrid, Milan, Munich, Paris, Montréal, Toronto, Delhi, Mexico City, São Paulo, Sydney, Hong Kong, Seoul, Singapore, Taipei, Tokyo

Editor in Chief: Deirdre Lynch
Executive Editor: Christopher Cummings
Senior Content Editors: Rachel Reeve and Chere Bemelmans
Assistant Editor: Sonia Ashraf
Senior Managing Editor: Karen Wernholm
Production Project Managers: Tracy Patruno and Mary Sanger
Associate Director of Design: Andrea Nix
Art Director and Cover Designer: Beth Paquin
Digital Assets Manager: Marianne Groth
Media Producer: Vicki Dreyfus
Software Developers: Mary Durnwald and Bob Carroll
Senior Marketing Manager: Erin Lane
Marketing Assistant: Kathleen DeChavez
Senior Author Support/Technology Specialist: Joe Vetere
Image Manager: Rachel Youdelman
Procurement Specialist: Debbie Rossi
Production Coordination, Composition, Illustrations: Cenveo Publisher Services
Text Design: Leslie Haimes
Cover Image: tadamichi/Shutterstock
Head of Learning Asset Acquisition, Global Edition: Laura Dent
Associate Acquisitions Editor, Global Edition: Murchana Borthakur
Senior Manufacturing Controller, Global Edition: Trudy Kimber
Project Editor, Global Edition: Aaditya Bugga

Credits appear on pages 673–674, which constitute a continuation of the copyright page.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and
Pearson was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Pearson Education Limited
Edinburgh Gate
Harlow
Essex CM20 2JE
England
and Associated Companies throughout the world

Visit us on the World Wide Web at: www.pearsonglobaleditions.com

© Pearson Education Limited, 2015

The right of Mario F. Triola to be identified as the author of this work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.

Authorized adaptation from the United States edition, entitled Essentials of Statistics, 5th edition, ISBN 978-0-321-92459-9, by Mario F. Triola, published by Pearson Education © 2015.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without either the prior written permission of the publisher or a license permitting restricted copying in the United Kingdom issued by the Copyright Licensing Agency Ltd, Saffron House, 6–10 Kirby Street, London EC1N 8TS.

All trademarks used herein are the property of their respective owners. The use of any trademark in this text does not vest in the author or publisher any trademark ownership rights in such trademarks, nor does the use of such trademarks imply any affiliation with or endorsement of this book by such owners.

ISBN 10: 1-292-05876-5
ISBN 13: 978-1-292-05876-4

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

Typeset in 11 AGaramondPro-Regular by Cenveo® Publisher Services.
Printed and bound by Courier Kendallville in the United States of America.

To Ginny
Marc, Dushana, and Marisa
Scott, Anna, Siena, and Kaia

About the Author

Mario F.
Triola is a Professor Emeritus of Mathematics at Dutchess Community College, where he has taught statistics for over 30 years. Marty is the author of Elementary Statistics, 12th edition, Elementary Statistics Using Excel, 5th edition, Elementary Statistics Using the TI-83/84 Plus Calculator, 4th edition, and he is a co-author of Biostatistics for the Biological and Health Sciences, Statistical Reasoning for Everyday Life, 4th edition, Business Statistics, and Introduction to Technical Mathematics, 5th edition. Elementary Statistics is currently available as an International Edition, and it has been translated into several foreign languages. Marty designed the original STATDISK statistical software, and he has written several manuals and workbooks for technology supporting statistics education. He has been a speaker at many conferences and colleges. Marty’s consulting work includes the design of casino slot machines and fishing rods, and he has worked with attorneys in determining probabilities in paternity lawsuits, analyzing data in medical malpractice lawsuits, identifying salary inequities based on gender, and analyzing disputed election results. He has also used statistical methods in analyzing medical school surveys, and analyzing survey results for the New York City Transit Authority. Marty has testified as an expert witness in New York State Supreme Court. The Text and Academic Authors Association has awarded Marty a “Texty” for Excellence for his work on Elementary Statistics.

Contents

1 Introduction to Statistics
1-1 Review and Preview
1-2 Statistical and Critical Thinking
1-3 Types of Data
1-4 Collecting Sample Data

2 Summarizing and Graphing Data
2-1 Review and Preview
2-2 Frequency Distributions
2-3 Histograms
2-4 Graphs That Enlighten and Graphs That Deceive

3 Statistics for Describing, Exploring, and Comparing Data
3-1 Review and Preview
3-2 Measures of Center
3-3 Measures of Variation
3-4 Measures of Relative Standing and Boxplots

4 Probability
4-1 Review and Preview
4-2 Basic Concepts of Probability
4-3 Addition Rule
4-4 Multiplication Rule: Basics
4-5 Multiplication Rule: Complements and Conditional Probability
4-6 Counting
4-7 Probabilities through Simulations (on companion Web site)
4-8 Bayes’ Theorem (on companion Web site)

5 Discrete Probability Distributions
5-1 Review and Preview
5-2 Probability Distributions
5-3 Binomial Probability Distributions
5-4 Parameters for Binomial Distributions

6 Normal Probability Distributions
6-1 Review and Preview
6-2 The Standard Normal Distribution
6-3 Applications of Normal Distributions
6-4 Sampling Distributions and Estimators
6-5 The Central Limit Theorem
6-6 Assessing Normality
6-7 Normal as Approximation to Binomial

7 Estimates and Sample Sizes
7-1 Review and Preview
7-2 Estimating a Population Proportion
7-3 Estimating a Population Mean
7-4 Estimating a Population Standard Deviation or Variance

Chapter 2 Summarizing and Graphing Data

but a frequency polygon uses line segments instead of bars. We construct a frequency polygon from a frequency distribution as shown in Example 10.

Example 10 Frequency Polygon: IQ Scores of Low Lead Group

See Figure 2-14 for the frequency polygon corresponding to the IQ scores of the low lead group summarized in the frequency distribution of Table 2-2 on page 45. The heights of the points correspond to the class frequencies, and the line segments are extended to the right and left so that the graph begins and ends on the horizontal axis. Just as it is easy to construct a histogram from a frequency distribution table, it is also easy to construct a frequency polygon from a frequency distribution table.

Figure 2-14 Frequency Polygon: IQ Scores of Low Lead Group

A
variation of the basic frequency polygon is the relative frequency polygon, which uses relative frequencies (proportions or percentages) for the vertical scale. When one is trying to compare two data sets, it is often very helpful to graph two relative frequency polygons on the same axes.

Example 11 Relative Frequency Polygon: IQ Scores of Lead Groups

See Figure 2-15, which shows the relative frequency polygons for the IQ scores of the low lead group and the high lead group as listed in Table 2-1 given with the Chapter Problem at the beginning of this chapter. Figure 2-15 shows that the high lead group generally has lower (farther left) IQ scores than the low lead group. It appears that the greater exposure to lead tends to be associated with lower IQ scores. Figure 2-15 enables us to understand data in a way that is not possible with visual examination of the lists of data in Table 2-1.

Figure 2-15 Relative Frequency Polygons: IQ Scores

Ogive

Another type of statistical graph is an ogive (pronounced “oh-jive”), which depicts cumulative frequencies. Ogives are useful for determining the number of values below some particular value, as illustrated in Example 12. An ogive uses class boundaries along the horizontal scale and uses cumulative frequencies along the vertical scale.

Example 12 Ogive: IQ Scores of Low Lead Group

Figure 2-16 shows an ogive corresponding to the cumulative frequency distribution table (Table 2-5) on page 48. From Figure 2-16, we see that for the low lead group, 35 of the IQ scores are less than 89.5.

Figure 2-16 Ogive: IQ Scores of Low Lead Group

Graphs That Deceive

Some graphs deceive because they contain errors, and some deceive because they are technically correct but misleading. It is important to develop the ability to recognize deceptive graphs. Here we present two of the ways in which graphs are commonly used to deceive.

Example 13 Nonzero Axis

Figure 2-17 and Figure 2-18 are based on the same data from Data Set 14 in Appendix B. By using a vertical scale starting at 30 mi/gal instead of at 0 mi/gal, Figure 2-17 exaggerates the differences and creates the false impression that the Honda Civic gets mileage that is substantially better than the mileage ratings found for the Chevrolet Aveo and the Toyota Camry. Figure 2-18 shows that the differences among the three mileage ratings are actually small.

Figure 2-17 Highway Fuel Consumption with Vertical Scale Not Starting at Zero

Figure 2-18 Highway Fuel Consumption with Vertical Scale Starting at Zero

Pictographs

Drawings of objects, called pictographs, are often misleading. Data that are one-dimensional in nature (such as budget amounts) are often depicted with two-dimensional objects (such as dollar bills) or three-dimensional objects (such as stacks of coins, homes, or barrels). By using pictographs, artists can create false impressions that grossly distort differences by using these simple principles of basic geometry: (1) When you double each side of a square, the area doesn’t merely double; it increases by a factor of four. (2) When you double each side of a cube, the volume doesn’t merely double; it increases by a factor of eight. See Figure 2-19 in the following example, and note that the larger airliner is twice as long, twice as tall, and twice as deep as the first airliner, so the volume of the larger airliner is eight times that of the smaller airliner.

Example 14 Pictograph of Airline Passengers

In 1984, U.S. airlines carried 345 million passengers, and in 2010 they carried 706 million passengers, so the number of passengers approximately doubled from 1984 to 2010. The pictograph in Figure 2-19 illustrates these data with images of airliners that are objects of volume. Readers can have a variety of perceptions. Some might think that
the numbers of passengers are the same in both images, because the same numbers of seats are included. Others might look at the different sizes of the airliners and see objects of volume, the larger aircraft being roughly eight times the size of the smaller one. Even though Figure 2-19 includes attractive images, it does a very poor job of accurately and unambiguously depicting the data. In contrast, Figure 2-20 is a simple bar graph that does a good job of depicting the data accurately.

Figure 2-19 Passengers Carried by U.S. Airlines (panels: Passengers in 1984; Passengers in 2010)

Figure 2-20 Passengers Carried by U.S. Airlines

Examples 13 and 14 illustrate the following principles related to misleading graphs:
• Nonzero axis: Always examine a graph to see whether an axis begins at some point other than zero so that differences are exaggerated.
• Pictographs: When examining data depicted with a pictograph, determine whether the graph is misleading because objects of area or volume are used to depict amounts that are actually one-dimensional. (Histograms and bar charts represent one-dimensional data with two-dimensional bars, but they use bars with the same width so that the graph is not misleading.)

Conclusion

In this section we saw that graphs are excellent tools for describing, exploring, and comparing data.

Describing data: In a histogram, for example, consider the distribution, center, variation, and outliers (values that are very far away from almost all of the other data values). What is the approximate value of the center of the distribution, and what is the approximate range of values? Consider the overall shape of the distribution. Are the values evenly distributed? Is the distribution skewed (lopsided) to the right or left? Does the distribution peak in the middle? Is there a large gap, suggesting that the data might come from different populations?
Identify any extreme values and any other notable characteristics.

Exploring data: Look for features of the graph that reveal some useful and/or interesting characteristics of the data set. For example, the scatterplot included with Example shows that there appears to be a relationship between the waist circumferences and arm circumferences of males.

Comparing data: Construct similar graphs to compare data sets. For example, Figure 2-15 shows a relative frequency polygon for the IQ scores of a group with low lead exposure and another relative frequency polygon for a group with high lead exposure, and both polygons are shown on the same set of axes. Figure 2-15 makes the comparison relatively easy.

In addition to the graphs we have discussed in this section, there are many other useful graphs, some of which have not yet been created. The world desperately needs more people who can create original graphs that enlighten us about the nature of data. For some really helpful information about graphs, see The Visual Display of Quantitative Information, second edition, by Edward Tufte (Graphics Press, PO Box 430, Cheshire, CT 06410). Here are just a few of the important principles that Tufte suggests:
• For small data sets of 20 values or fewer, use a table instead of a graph.
• A graph of data should make us focus on the true nature of the data, not on other elements, such as eye-catching but distracting design features.
• Do not distort data; construct a graph to reveal the true nature of the data.
• Almost all of the ink in a graph should be used for the data, not for other design elements.

Using Technology

Here we list the graphs that can be generated by various technologies. (Detailed instructions can range from quite simple to extremely complex, so see the individual manuals that are supplements to this book.)
STATDISK: Histograms, scatterplots, and pie charts
Minitab: Histograms, frequency polygons, dotplots, stemplots, bar graphs, multiple bar graphs, Pareto charts, pie charts, scatterplots, and time-series graphs
Excel: Histograms and scatterplots
TI-83/84 Plus: Histograms and scatterplots
StatCrunch: Histograms, scatterplots, pie charts, bar charts, stemplots, and dotplots

2-4 Basic Skills and Concepts

Statistical Literacy and Critical Thinking

1. Bar Chart and Pareto Chart. A bar chart and a Pareto chart both use bars to show frequencies of categories of categorical data. What characteristic distinguishes a Pareto chart from a bar chart, and how does that characteristic help us in understanding the data?

2. Scatterplot. What is a scatterplot? What type of data is required for a scatterplot? What characteristic of the data can be better understood by looking at a scatterplot?

3. SAT Scores. Listed below are SAT scores from a sample of students (based on data from www.talk.collegeconfidential.com). Why is it that a graph of these data will not be very effective in helping us understand the data?
2400 2200 2150 2040 2230 1890 2100 2090

4. SAT Scores. Given that the data in Exercise 3 were obtained from students who made a decision to submit their SAT scores to a Web site, what type of sample is given in that exercise? If we had a much larger sample of that type, would a graph help us understand some characteristics of the population?

Scatterplots. In Exercises 5–8, use the given paired data from Appendix B to construct a scatterplot.

5. President’s Heights. Refer to Data Set 12 in Appendix B, and use the heights of U.S. presidents and the heights of their main opponents in the election campaign. Does there appear to be a correlation?
6. Brain Volume and IQ. Refer to Data Set in Appendix B, and use the brain volumes (cm³) and IQ scores. A simple hypothesis is that people with larger brains are more intelligent and thus have higher IQ scores. Does the scatterplot support that hypothesis?

7. Bear Chest Size and Weight. Refer to Data Set in Appendix B, and use the measured chest sizes and weights of bears. Does there appear to be a correlation between those two variables?

8. Coke Volume and Weight. Refer to Data Set 19 in Appendix B, and use the volumes and weights of regular Coke. Does there appear to be a correlation between volume and weight? What else is notable about the arrangement of the points, and how can it be explained?

Time-Series Graphs. In Exercises 9 and 10, construct the time-series graph.

9. Harry Potter. Listed below are the gross amounts (in millions of dollars) earned from box office receipts for the movie Harry Potter and the Half-Blood Prince. The movie opened on a Wednesday, and the amounts are listed in order for the first 14 days of the movie’s release. Suggest an explanation for the fact that the three highest amounts are the first, third, and fourth values listed.
58 22 27 29 21 10 10 8 7 9 11 9 4 4

10. Home Runs. Listed below are the numbers of home runs in major league baseball for each year beginning with 1990 (listed in order by row). Is there a trend?
3317 3383 3038 4030 3306 4081 4962 4640 5064 5528
5693 5458 5059 5207 5451 5017 5386 4957 4878 4655

Dotplots. In Exercises 11 and 12, construct the dotplot.

11. Coke Volumes. Refer to Data Set 19 in Appendix B, and use the volumes of regular Coke. Does the configuration of the points appear to suggest that the volumes are from a population with a normal distribution? Why or why not? Are there any outliers?
12. Car Pollution. Refer to Data Set 14 in Appendix B, and use the greenhouse gas (GHG) emissions from the sample of cars. Does the configuration of the points appear to suggest that the amounts are from a population with a normal distribution? Why or why not?

Stemplots. In Exercises 13 and 14, construct the stemplot.

13. Car Crash Tests. Refer to Data Set 13 in Appendix B and use the 21 pelvis (PLVS) deceleration measurements from the car crash tests. Is there strong evidence suggesting that the data are not from a population having a normal distribution?

14. Car Braking Distances. Refer to Data Set 14 in Appendix B and use the 21 braking distances (ft). Are there any outliers? Is there strong evidence suggesting that the data are not from a population having a normal distribution?

Pareto Charts. In Exercises 15 and 16, construct the Pareto chart.

15. Awful Sounds. In a survey, 1004 adults were asked to identify the most frustrating sound that they hear in a day. In response, 279 chose jackhammers, 388 chose car alarms, 128 chose barking dogs, and 209 chose crying babies (based on data from Kelton Research).

16. School Day. Here are weekly instruction times for school children in different countries: 23.8 hours (Japan), 26.9 hours (China), 22.2 hours (U.S.), 24.6 hours (U.K.), 24.8 hours (France). What do these results suggest about education in the United States?

Pie Charts. In Exercises 17 and 18, construct the pie chart.

17. Awful Sounds. Use the data from Exercise 15.

18. School Day. Use the data from Exercise 16. Does it make sense to use a pie chart for the given data?

Frequency Polygon.
In Exercises 19 and 20, construct the frequency polygon.

19. Earthquake Magnitudes. Use the frequency distribution from Exercise 23 in Section 2-2 to construct a frequency polygon. Applying a loose interpretation of the requirements for a normal distribution, do the magnitudes appear to be normally distributed? Why or why not?

20. Earthquake Depths. Use the frequency distribution from Exercise 24 in Section 2-2 to construct a frequency polygon. Applying a strict interpretation of the requirements for a normal distribution, do the depths appear to be normally distributed? Why or why not?

Deceptive Graphs. In Exercises 21–24, identify the characteristic that causes the graph to be deceptive.

21. Election Results. The accompanying graph depicts the numbers of votes (in millions) in the 2008 U.S. presidential election.

22. Subway Fare. In 1986, the New York City subway fare cost $1, and in 2003 the cost was raised to $2, so the price doubled. In the accompanying graph, the $2 bill is twice as long and twice as tall as the $1 bill.
(Graph labels: 1986 Subway Fare; Current Subway Fare)

23. Oil Consumption. China currently consumes 7.6 million barrels of oil per day, compared to the United States oil consumption of 20.7 million barrels of oil per day. In the accompanying illustration, the larger barrel is about three times as wide and three times as tall as the smaller barrel.
(Illustration labels: China; United States)

24. Braking Distance. Data Set 14 in Appendix B lists braking distances (ft) of different cars, and the braking distances of three of those cars are shown in the accompanying illustration.

2-4 Beyond the Basics

25. Back-to-Back Stemplots. Exercise 19 in Section 2-3 used back-to-back relative frequency histograms for the ages of actresses and actors that are listed in Data Set 11 of Appendix B. Use the same method to construct back-to-back stemplots of the ages of actresses and actors, and then use the results to compare the two data sets.
26. Expanded and Condensed Stemplots.

a. A stemplot can be expanded by subdividing rows into those with leaves having digits of 0 through 4 and those with leaves having digits 5 through 9. Using the body temperatures from 12 am on Day listed in Data Set of Appendix B, the first three rows of an expanded stemplot have stems of 96 (for leaves between 5 and 9 inclusive), 97 (for leaves between 0 and 4 inclusive), and 97 (for leaves between 5 and 9 inclusive). Construct the complete expanded stemplot for the body temperatures from 12 am on Day listed in Data Set of Appendix B.

b. A stemplot can be condensed by combining adjacent rows. Using the LDL cholesterol measurements from males in Data Set of Appendix B, we obtain the first two rows of the condensed stemplot as shown below. Note that we insert an asterisk to separate digits in the leaves associated with the numbers in each stem. Every row in the condensed plot must include exactly one asterisk so that the shape of the condensed stemplot is not distorted. Complete the condensed stemplot. What is an advantage of using a condensed stemplot instead of one that is not condensed?
6-7  79*778
8-9  45678*049

Chapter 2 Review

This chapter presented methods for organizing, summarizing, and graphing data sets. When one is investigating a data set, the characteristics of center, variation, distribution, outliers, and changing pattern over time are generally very important, and this chapter includes a variety of tools for investigating the distribution of the data. After completing this chapter, you should be able to do the following:
• Construct a frequency distribution or relative frequency distribution to summarize data (Section 2-2).
• Construct a histogram or relative frequency histogram to show the distribution of data (Section 2-3).
• Examine a histogram or normal quantile plot to determine whether sample data appear to be from a population having a normal distribution (Section 2-3).
• Construct graphs of data using a scatterplot (for paired data), frequency polygon, dotplot, stemplot, bar graph, multiple bar graph, Pareto chart, pie chart, or time-series graph (Section 2-4).
• Critically analyze a graph to determine whether it objectively depicts data or is somehow misleading or incorrect (Section 2-4).

Chapter Quick Quiz

1. When one is constructing a table representing the frequency distribution of weights (lb) of discarded textile items from Data Set 23 in Appendix B, the first two classes of a frequency distribution are 0.00–0.99 and 1.00–1.99. What is the class width?

2. Using the same first two classes from Exercise 1, identify the class boundaries of the first class.

3. The first class described in Exercise 1 has a frequency of 51. If you know only the class limits given in Exercise 1 and the frequency of 51, can you identify the original 51 data values?
4. The marks (out of a maximum of 100) obtained by 22 university students in an examination are 91, 64, 96, 66, 62, 74, 69, 73, 71, 68, 66, 76, 80, 68, 65, 72, 65, 86, 77, 62, 83, and 96. On a stemplot representing these data, identify the row representing marks between 60 and 70.

5. In the California Daily lottery, four digits between 0 and 9 inclusive are randomly selected each day. We normally expect that each of the ten different digits will occur about 1/10 of the time, and an analysis of last year’s results shows that this did happen. Because the results are what we normally expect, is it correct to say that the distribution of selected digits is a normal distribution?

6. In an investigation of the travel costs of college students, which of the following does not belong: center; variation; distribution; bar graph; outliers; changing patterns over time?

7. The scatterplot of the heights and weights of 1367 newborn babies in a county shows a strong pattern. What does this suggest?

8. To compare the systolic blood pressure of senior citizens from two different countries, data regarding the systolic blood pressure of 20,456 senior citizens from each of the two countries are obtained. Which of the following graphs would be best for such a comparison: scatterplot; bar graph; pie chart; frequency polygon?

9. What characteristic of a data set can be better understood by constructing a histogram?

10. A histogram is to be constructed from the brain sizes listed in Data Set of Appendix B. Without actually constructing that histogram, simply identify two key features of the histogram that would suggest that the data have a normal distribution.

Review Exercises

1. Frequency Distribution of Brain Volumes. Construct a frequency distribution of the 20 brain volumes (cm³) listed below. (These volumes are from Data Set of Appendix B.)
Use the classes 900–999, 1000–1099, and so on.
1005 963 1035 1027 1281 1272 1051 1079 1034 1070
1173 1079 1067 1104 1347 1439 1029 1100 1204 1160

2. Histogram of Brain Volumes. Construct the histogram that corresponds to the frequency distribution from Exercise 1. Applying a very strict interpretation of the requirements for a normal distribution, does the histogram suggest that the data are from a population having a normal distribution? Why or why not?

3. Dotplot of California Lottery. In the California Daily lottery, four digits are randomly selected each day. Listed below are the digits that were selected in one recent week. Construct a dotplot. Does the dotplot suggest that the lottery is fair?
5 3 8 9 2 9 1 1 3 0 9 7 3 8 7 4 7 4 8 5 6 8 0 0 4 7 5 3

4. Stemplot of IQ Scores. Listed below are the first eight IQ scores from Data Set in Appendix B. Construct a stemplot of these eight values. Is this data set large enough to reveal the true nature of the distribution of IQ scores for the population from which the sample is obtained?
96 89 87 87 101 103 103 96

5. CO Emissions. Listed below are the amounts (million metric tons) of carbon monoxide emissions in the United States for each year of a recent ten-year period. The data are listed in order. Construct the graph that is most appropriate for these data. What type of graph is best? What does the graph suggest?
5638 5708 5893 5807 5881 5939 6024 6032 5946 6022

6. CO and NO Emissions. Exercise 5 lists the amounts of carbon monoxide emissions, and listed below are the amounts (million metric tons) of nitrous oxide emissions in the United States for the same ten-year period as in Exercise 5. What graph is best for exploring the relationship between carbon monoxide emissions and nitrous oxide emissions? Construct that graph. Does the graph suggest that there is a relationship between carbon monoxide emissions and nitrous oxide emissions?
351 349 345 339 335 335 362 371 376 384

Sports Equipment. According to USA Today, the largest categories of sports equipment sales are as follows: fishing ($2.0 billion); firearms and hunting ($3.1 billion); camping ($1.7 billion); golf ($2.5 billion). Construct the graph that best depicts these different categories and their relative amounts. What type of graph is best?

Cumulative Review Exercises

In Exercises 1–5, refer to the table in the margin, which summarizes results from 641 people who responded to a USA Today survey. Participants responded to this question: “Who do you most like to get compliments from at work?”

Response                  Frequency
Co-workers                260
Boss                      241
Strangers                 82
People who report to me   58

Graph. Which of the following graphs would be best for visually illustrating the data in the table: histogram; dotplot; scatterplot; Pareto chart; stemplot?

Level of Measurement. Is the level of measurement of the 641 individual responses nominal, ordinal, interval, or ratio? Why?

Sampling. The results in the table were obtained by posting the question on a Web site, and readers of USA Today could respond to the question if they chose to. What is this type of sampling called? Is this type of sample likely to be representative of the population of all workers? Why or why not?

Misleading Graph. How is the accompanying graph misleading? How could it be modified so that it would not be misleading? (The graph is on the top of the next page.)

Statistic or Parameter? Among the 641 people who responded to the USA Today survey, what is the percentage of respondents who chose the category of boss? When considered in the context of the population of all workers, is that percentage a statistic or a parameter?
Explain.

Grooming Time. Listed below are times (minutes) spent on hygiene and grooming in the morning by randomly selected subjects (based on data from a Svenska Cellulosa Aktiebolaget survey). Construct a table representing the frequency distribution. Use the classes 0–9, 10–19, and so on.

0 5 12 15 15 20 22 24 25 25 25 27 27 28 30 30 35 35 40 45

Histogram of Grooming Times. Use the frequency distribution from the preceding exercise to construct a histogram. Based on the result, do the data appear to be from a population with a normal distribution? Explain.

Stemplot of Grooming Times. Use the data from the Grooming Time exercise to construct a stemplot.

Technology Project

It was noted in this section that the days of charming and primitive hand-drawn graphs are well behind us, and technology now provides us with powerful tools for generating a wide variety of different graphs. The data sets in Appendix B are available as files that can be opened by statistical software packages, such as STATDISK, Minitab, Excel, SPSS, and SAS. Use a statistical software package to open the male and female body measurements from Data Set in Appendix B. Use the statistical software with the methods of this chapter to describe, explore, and compare the blood platelet measurements of males and females. Does there appear to be a gender difference in blood platelet counts? When analyzing blood platelet counts of patients, should physicians take the gender of patients into account? Support your conclusions with printouts of suitable graphs. (Later chapters will present more formal methods for making such comparisons.)

From Data to Decision

Flight Planning. Data Set 15 in Appendix B includes data about American Airline flights from New York (JFK airport) to Los Angeles (LAX airport). The data are from the Bureau of Transportation.

Critical Thinking. Use the methods from this chapter to address the following questions.

Is there a relationship between taxi-out times at JFK and taxi-in times at LAX?
Explain.

Is there a relationship between departure delay times at JFK and arrival delay times at LAX? Explain.

Arrival delay times are important because they can affect the plans of passengers. Explore the arrival delay times by using the methods of this chapter and comment on the results. Is there very small variation among the arrival delay times? Are there any outliers? What is the nature of the distribution of arrival delay times? Based on the results, are arrival delay times very predictable?

Cooperative Group Activities

In-class activity. Using a package of purchased chocolate chip cookies, each student should be given two or three cookies. Proceed to count the number of chocolate chips in each cookie. Not all of the chocolate chips are visible, so “destructive testing” must be used through a process involving consumption. Record the numbers of chocolate chips for each cookie and combine all results. Construct a frequency distribution, histogram, dotplot, and stemplot of the results. Given that the cookies were made through a process of mass production, we might expect that the numbers of chips per cookie would not vary much. Is that indicated by the results? Explain. (See “Chocolate Chip Cookies as a Teaching Aid” by Herbert K. H. Lee, American Statistician, Vol. 61, No. 4.)
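The frequency distribution step of this activity can be sketched in Python. Because the cookie counts come from the class, the sketch below runs instead on the grooming times listed in the Cumulative Review Exercises, with that exercise's classes 0–9, 10–19, and so on. The function name `frequency_distribution` and its parameters are illustrative assumptions, not part of the text, and the sketch assumes integer data with no value below the first lower class limit.

```python
from collections import Counter


def frequency_distribution(values, lower, width):
    """Return (class label, frequency) pairs for equal-width integer classes.

    Assumes integer values that are all >= lower (hypothetical helper).
    """
    # Index of the class each value falls in: 0 for the first class, 1 for the next, ...
    counts = Counter((v - lower) // width for v in values)
    table = []
    for k in range(max(counts) + 1):
        low = lower + k * width
        high = low + width - 1
        # Classes with no values are kept with a frequency of 0.
        table.append((f"{low}-{high}", counts.get(k, 0)))
    return table


# Grooming times (minutes) from the Cumulative Review Exercises.
grooming_times = [0, 5, 12, 15, 15, 20, 22, 24, 25, 25,
                  25, 27, 27, 28, 30, 30, 35, 35, 40, 45]

table = frequency_distribution(grooming_times, lower=0, width=10)
for label, freq in table:
    # The row of asterisks gives a rough text histogram.
    print(f"{label:>7} {freq:>3} {'*' * freq}")
```

The same call with a different `lower` and `width` would serve for chip counts once the class has recorded them; the choice of classes remains a judgment call.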
In-class activity. In class, each student should record two pulse rates by counting the number of her or his heartbeats in one minute. The first pulse rate should be measured while seated, and the second pulse rate should be measured while standing. Using the pulse rates measured while seated, construct a frequency distribution and histogram for the pulse rates of males, and then construct another frequency distribution and histogram for the pulse rates of females. Using the pulse rates measured while standing, construct a frequency distribution and histogram for the pulse rates of males, and then construct another frequency distribution and histogram for the pulse rates of females. Compare the results. Do males and females appear to have different pulse rates? Do pulse rates measured while seated appear to be different from pulse rates measured while standing? Use an appropriate graph to determine whether there is a relationship between sitting pulse rate and standing pulse rate.

In-class activity. Given below are recent measurements from the Old Faithful geyser in Yellowstone National Park. The time intervals between eruptions are matched with the corresponding times of duration of the geyser; thus the interval of 76 is paired with the duration of 4.53 min, the interval of 84 is paired with the duration of 3.83 min, and so on. Use the methods of this chapter to summarize and explore each of the two sets of data separately, and then investigate whether there is some relationship between them. Describe the methods used and the conclusions reached.

Intervals (min) between eruptions:
76 84 76 103 92 47 98 54 80 91 69 86 83 75 93 89 96 65 94 85 94 60 94 86 93 88 61 96 52 98

Durations (min) of eruptions:
4.53 3.83 3.83 4.23 4.70 1.83 4.00 2.00 3.57 4.25 2.75 4.47 3.35 3.27 4.30 4.25 4.05 2.12 4.63 4.18 4.05 2.13 4.60 4.53 3.70 4.17 1.87 4.68 1.83 4.10

Out-of-class activity. Search newspapers and magazines to find an example of a graph that is misleading. (See Examples 13 and 14 in
Section 2-4.) Describe how the graph is misleading. Redraw the graph so that it depicts the information correctly.

Out-of-class activity. Obtain a copy of The Visual Display of Quantitative Information, second edition, by Edward Tufte (Graphics Press, PO Box 430, Cheshire, CT 06410). Find the graph describing Napoleon’s march to Moscow and back, and explain why Tufte says that “it may well be the best graphic ever drawn.”

Out-of-class activity. Obtain a copy of The Visual Display of Quantitative Information, second edition, by Edward Tufte (Graphics Press, PO Box 430, Cheshire, CT 06410). Find the graph that appeared in American Education, and explain why Tufte says that “this may well be the worst graphic ever to find its way into print.” Construct a graph that is effective in depicting the same data.

Statistics for Describing, Exploring, and Comparing Data

Chapter Problem

How many chips are in a chocolate chip cookie?

This edition of Elementary Statistics and previous editions have included data obtained from M&M plain candies. This Chapter Problem continues the legacy of using snack foods for statistical purposes, and the choice here is chocolate chip cookies, as suggested by the article “Chocolate Chip Cookies as a Teaching Aid,” by Herbert Lee (The American Statistician, Vol. 61, No. 4). Table 3-1 lists the numbers of chocolate chips counted in different brands. The counts were obtained by the author, who found that the counting process was not as simple as it might seem. What do you do with loose chocolate chips that were found in each package?
Some chocolate chips were stuck together, so they had to be counted with great care. Care also had to be taken to not count nut particles as chocolate chips. There were some small fragments that were not counted after the author made an arbitrary decision about the minimum size required to be counted as an official chocolate chip. The counts in Table 3-1 do not include weights of the chocolate chips, and the Hannaford brand had many that were substantially larger than any of the chocolate chips in the other brands. Also, there is an issue with the sampling method. The author used all of the cookies in one package from each of the different brands. A better sampling method would involve randomly selecting cookies from different packages obtained throughout the country, and this would have required extensive travel by the author, a prospect with some appeal. In developing this Chapter Problem, the author learned much about chocolate chip cookies, including the observation that a huge pile of crushed chocolate chip cookies has absolutely none of the appeal of a single cookie untouched by human hands.

Table 3-1 Numbers of Chocolate Chips in Different Brands of Cookies

Chips Ahoy (regular):
22 22 26 24 23 27 25 20 24 26 25 25 19 24 20 22 24 25 25 20
23 30 26 20 25 28 19 26 26 23 25 23 23 23 22 26 27 23 28 24

Chips Ahoy (chewy):
21 20 16 17 16 17 20 22 14 20 19 17 20 21 21 18 20 20 21 19 22 20 20 19 16 19 16 15 24 23 14 24

Chips Ahoy (reduced fat):
13 24 18 16 21 20 14 20 18 12 24 23 28 18 18 19 22 21 22 16 13 20 20 23 24 20 17 20 19 21 27 16 24 19 23 25 14 18 15 19

Keebler:
29 31 25 32 27 31 30 29 31 26 32 33 32 30 33 29 30 28 32 35 37 31 24 30 30 34 29 27 24 38 37 32 26 30

Hannaford:
13 15 16 21 15 14 14 15 13 13 16 11 14 12 13 12 14 12 16 17 14 16 14 15

3-1 Review and Preview
3-2 Measures of Center
3-3 Measures of Variation
3-4 Measures of Relative Standing and Boxplots

Figure 3-1 Dotplot of Numbers of Chocolate Chips in Cookies

Figure 3-1 is a dotplot (described in Section 2-4) that includes all of the cookies from Table 3-1. Figure 3-1 shows some obvious differences. Instead of relying solely on subjective interpretations of a graph like Figure 3-1, this chapter introduces measures that are essential to any study of statistics. The mean, median, standard deviation, and variance are among the most important statistics presented in this chapter, and those statistics will be used in our description, exploration, and comparison of the counts in Table 3-1.

3-1 Review and Preview

Chapter 1 discussed methods of collecting sample data. Chapter 2 presented frequency distributions and a variety of different graphs that help us summarize and visualize data. In Chapter 2 we noted that when describing, exploring, and comparing data sets, these characteristics are usually extremely important: (1) center; (2) variation; (3) distribution; (4) outliers; and (5) changing characteristics of data over time. In this chapter we introduce important statistics, including the mean, median, and standard deviation. Upon completing this chapter, you should be able to find the mean, median, standard deviation, and variance from a data set, and you should be able to clearly understand and interpret such values. It is especially important to understand values of standard deviation by using tools such as the range rule of thumb described in Section 3-3.

Critical Thinking and Interpretation: Beyond Formulas

This chapter includes several formulas used to compute basic statistics. Because many of these statistics can be easily calculated by using technology, it is not so important for us to memorize formulas and manually perform complex calculations. Instead, we should focus on understanding and interpreting the values we obtain from them. The methods and tools presented in Chapter 2 and in this chapter are often called methods of descriptive statistics,
because they summarize or describe relevant characteristics of data. Later in this book, we will use inferential statistics to make inferences, or generalizations, about a population.

3-2 Measures of Center

Key Concept. The focus of this section is the characteristic of center of a data set. In particular, we present measures of center, including mean and median, as tools for analyzing data. Our objective here is not only to find the value of each measure of ...
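As a concrete illustration of the two measures of center named in this section, the following Python sketch computes the mean and median of the 40 Chips Ahoy (regular) counts listed in Table 3-1. It relies only on Python's standard `statistics` module; the variable names are illustrative assumptions, and the sketch is not the text's own procedure.

```python
import statistics

# Chips Ahoy (regular) counts, copied from Table 3-1.
chips_ahoy_regular = [
    22, 22, 26, 24, 23, 27, 25, 20, 24, 26,
    25, 25, 19, 24, 20, 22, 24, 25, 25, 20,
    23, 30, 26, 20, 25, 28, 19, 26, 26, 23,
    25, 23, 23, 23, 22, 26, 27, 23, 28, 24,
]

# Mean: the sum of the values divided by the number of values.
mean_chips = statistics.mean(chips_ahoy_regular)

# Median: the middle value of the sorted data; with an even number of
# values it is the average of the two middle values.
median_chips = statistics.median(chips_ahoy_regular)

print(f"n      = {len(chips_ahoy_regular)}")
print(f"mean   = {mean_chips:.2f}")
print(f"median = {median_chips}")
```

For these counts the mean and median are nearly equal, which fits a roughly symmetric set of values with no extreme outliers.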

Title	Essentials of Statistics
Author	Mario F. Triola
School	Pearson
Subject	Statistics
Category	Global Edition
Year of publication	2014
City	Boston

Format
Pages	100
File size	7.64 MB