4-1  Chapter Four
© 2005 The McGraw-Hill Companies, Inc.

4-2  Describing Data: Displaying and Exploring Data
GOALS
When you have completed this chapter, you will be able to:
    ONE    Develop and interpret a dot plot.
    TWO    Develop and interpret a stem-and-leaf display.
    THREE  Compute and interpret quartiles, deciles, and percentiles.
    FOUR   Construct and interpret box plots.

4-3  Describing Data: Displaying and Exploring Data (Goals, continued)
    FIVE   Compute and understand the coefficient of variation and the coefficient of skewness.
    SIX    Draw and interpret a scatter diagram.
    SEVEN  Set up and interpret a contingency table.

4-4  Dot Plot
Dot plots:
    Report the details of each observation.
    Are useful for comparing two or more data sets.

4-5  Example
This example gives the percentages of men and women participating in the workforce in a recent year for the fifty states of the United States. Compare the dispersions of labor force participation by gender.

4-7  Example (continued)
[Dot plots: percentage of women participating in the labor force for the 50 states; percentage of men participating in the labor force for the 50 states]

4-8  Stem-and-leaf Displays
Stem-and-leaf display: a statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digits become the stem and the trailing digits the leaf.
Note: an advantage of the stem-and-leaf display over a frequency distribution is that we do not lose the identity of each observation.

4-9  Example
Stock prices on twelve consecutive days for a major publicly traded company:
    86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85

4-10  Example (continued)
Stem-and-leaf display of stock prices:

    stem | leaf
       6 | 9
       7 | 89
       8 | 234568
       9 | 126
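The stem-and-leaf display for the twelve stock prices can be reproduced with a few lines of code. This is a minimal sketch, assuming two-digit integer data so that the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is illustrative and does not come from the slides.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return one display line per stem for non-negative integer data.

    Assumption: the last digit is the leaf and the remaining leading
    digits are the stem (suitable for the two-digit stock prices).
    """
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    return [
        f"{stem} | {''.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(groups.items())
    ]


# The twelve stock prices from the example slide.
prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]
display = stem_and_leaf(prices)
for line in display:
    print(line)
```

Because every leaf is kept, the original observations can be read back from the display, which is the advantage over a frequency distribution noted on the slide.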
4-17  Example (continued)
The twelve stock prices ranked from highest to lowest, with the quartile locations:

    Position  Price
       12      96
       11      92
       10      91
                      Q3 (75th percentile): price at observation 9.75 = 88 + 0.75(91 - 88) = 90.25
        9      88
        8      86
        7      85
                      Q2 (50th percentile, the median): price at observation 6.50 = 84 + 0.50(85 - 84) = 84.50
        6      84
        5      83
        4      82
                      Q1 (25th percentile): price at observation 3.25 = 79 + 0.25(82 - 79) = 79.75
        3      79
        2      78
        1      69

4-18  Interquartile Range
The interquartile range is the distance between the third quartile Q3 and the first quartile Q1. This distance includes the middle 50 percent of the observations.

    Interquartile range = Q3 - Q1

4-19  Example
For a set of observations the third quartile is 24 and the first quartile is 10. What is the interquartile range?
The interquartile range is 24 - 10 = 14. Fifty percent of the observations occur between 10 and 24.

4-20  Box Plots
A box plot is a graphical display, based on quartiles, that helps to picture a set of data. Five pieces of data are needed to construct a box plot: the minimum value, the first quartile, the median, the third quartile, and the maximum value.

4-21  Example
Based on a sample of 20 deliveries, Buddy’s Pizza determined the following information. The minimum delivery time was 13 minutes and the maximum 30 minutes. The first quartile was 15 minutes, the median 18 minutes, and the third quartile 22 minutes. Develop a box plot for the delivery times.

4-22, 4-23  Example (continued)
[Box plot of delivery times on an axis from 12 to 32 minutes, marking Min, Q1, Median, Q3, and Max]

4-24  Coefficient of Variation
Relative dispersion: the coefficient of variation is the ratio of the standard deviation to the arithmetic mean, expressed as a percentage:

$$CV = \frac{s}{\bar{X}}\,(100\%)$$

4-25  Skewness
Skewness is the measurement of the lack of symmetry of the distribution. The coefficient of skewness can range from -3.00 up to 3.00 when using the following formula:

$$sk = \frac{3(\bar{X} - \text{Median})}{s}$$

A value of 0 indicates a symmetric distribution. Some software packages use a different formula, which results in a wider range for the coefficient.

4-26  Example revisited
Using the twelve stock prices, we find the mean to be 84.42, the standard deviation 7.18, and the median 84.5.

Coefficient of variation:

$$CV = \frac{s}{\bar{X}}\,(100\%) = 8.5\%$$

Coefficient of skewness:

$$sk = \frac{3(\bar{X} - \text{Median})}{s} = -0.035$$

4-27  Scatter diagram
Scatter diagram: a technique used to show the relationship between variables.
    Variables must be at least interval scaled.
    The relationship can be positive (direct) or negative (inverse).
Example: the twelve days of stock prices and the overall market index on each day are given as follows.

4-28  Example revisited

    Index (000s)  Price
        8.0         96
        7.5         92
        7.5         91
        7.3         88
        7.2         86
        7.2         85
        7.1         84
        7.1         83
        7.0         82
        6.2         79
        6.2         78
        5.1         69

[Scatter diagram: Relationship between Market Index and Stock Price; vertical axis Price, horizontal axis Index (000s)]

4-29  Contingency table
A contingency table is used to classify observations according to two identifiable characteristics. Contingency tables are used when one or both variables are nominally scaled. A contingency table is a cross tabulation that simultaneously summarizes two variables of interest.

4-30  Example
Weight Loss: 45 adults, all 60 pounds overweight, are randomly assigned to three weight loss programs. Twenty weeks into the program, a researcher gathers data on weight loss and divides the loss into three categories: less than 20 pounds, 20 up to 40 pounds, and 40 or more pounds. Here are the results.

4-31  Example (continued)
[Table: weight loss category (less than 20 pounds, 20 up to 40 pounds, 40 pounds or more) by weight loss plan; the cell counts are garbled in the source: "Plan Plan 2 12 Plan 12"]
Compare the weight loss under the three plans.
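The quartile arithmetic in the stock price example (observations 3.25, 6.50, and 9.75) can be checked with a short script. This is a sketch under one assumption: the surviving slides do not state the location formula, but those three positions are consistent with location = (n + 1)P/100 for n = 12, so that rule is used here. The function name `percentile` is illustrative.

```python
def percentile(values, p):
    """Percentile by the (n + 1) * p / 100 location rule (assumed, see lead-in).

    The fractional part of the location is used to interpolate between
    the two neighbouring ranked observations.
    """
    data = sorted(values)
    n = len(data)
    location = (n + 1) * p / 100
    whole = int(location)          # rank of the lower neighbour (1-based)
    fraction = location - whole    # share of the gap to the upper neighbour
    if whole < 1:
        return float(data[0])
    if whole >= n:
        return float(data[-1])
    lower, upper = data[whole - 1], data[whole]
    return lower + fraction * (upper - lower)


# The twelve stock prices from the example slide.
prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]
q1 = percentile(prices, 25)
q2 = percentile(prices, 50)
q3 = percentile(prices, 75)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

Other packages use different interpolation rules, so their quartiles for the same data may differ slightly from the slide values.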
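The summary figures quoted for the twelve stock prices (mean 84.42, standard deviation 7.18, median 84.5, CV 8.5%, skewness -0.035) can be recomputed as follows. This is a sketch with two assumptions: the standard deviation is the sample version (n - 1 denominator), and the coefficient of skewness is Pearson's 3(mean - median)/s. Both choices reproduce the slide values, and the second matches the stated range of -3 to 3.

```python
import statistics

# The twelve stock prices from the example slide.
prices = [86, 79, 92, 84, 69, 88, 91, 83, 96, 78, 82, 85]

mean = statistics.mean(prices)
s = statistics.stdev(prices)        # sample standard deviation (n - 1)
median = statistics.median(prices)

cv = s / mean * 100                 # coefficient of variation, in percent
sk = 3 * (mean - median) / s        # Pearson's coefficient of skewness (assumed form)

print(f"mean={mean:.2f} s={s:.2f} median={median}")
print(f"CV={cv:.1f}% sk={sk:.3f}")
```

The skewness is computed from the unrounded mean. Substituting the rounded 84.42 instead gives about -0.033, so small differences from the slide value are rounding effects.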