Demystified Series

Advanced Statistics Demystified
Algebra Demystified
Anatomy Demystified
Astronomy Demystified
Biology Demystified
Business Statistics Demystified
Calculus Demystified
Chemistry Demystified
College Algebra Demystified
Earth Science Demystified
Everyday Math Demystified
Geometry Demystified
Physics Demystified
Physiology Demystified
Pre-Algebra Demystified
Project Management Demystified
Statistics Demystified
Trigonometry Demystified

Statistics Demystified

STAN GIBILISCO

McGRAW-HILL
New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto

Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved. Manufactured in the United States of America. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher.

0-07-147104-9

The material in this eBook also appears in the print version of this title: 0-07-143118-7.

All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps.

McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. For more information, please contact George Hoare, Special Sales, at george_hoare@mcgraw-hill.com or (212) 904-4069.

TERMS OF USE

This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms.

THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.

DOI: 10.1036/0071431187

To Tim, Tony, and Samuel
from Uncle Stan

CONTENTS

Preface
Acknowledgments
PART 1: STATISTICAL CONCEPTS

CHAPTER 1  Background Math
    Sets
    Relations and Functions
    Numbers
    One-Variable Equations
    Simple Graphs
    Tweaks, Trends, and Correlation
    Quiz

CHAPTER 2  Learning the Jargon
    Experiments and Variables
    Populations and Samples
    Distributions
    More Definitions
    Quiz

CHAPTER 3  Basics of Probability
    The Probability Fallacy
    Key Definitions
    Properties of Outcomes
    Permutations and Combinations
    The Density Function
    Two Common Distributions
    Quiz

CHAPTER 4  Descriptive Measures
    Percentiles
    Quartiles and Deciles
    Intervals by Element Quantity
    Fixed Intervals
    Other Specifications
    Quiz

Test: Part One

PART 2: STATISTICS IN ACTION

CHAPTER 5  Sampling and Estimation
    Source Data and Sampling Frames
    Random Sampling
    Estimation
    Confidence Intervals
    Quiz

CHAPTER 6  Hypotheses, Prediction, and Regression
    Assumptions and Testing
    What’s the Forecast?
    Regression
    Quiz

CHAPTER 7  Correlation, Causation, Order, and Chaos
    Correlation Principles
    Causes and Effects
    Chaos, Bounds, and Randomness
    Quiz

CHAPTER 8  Some Practical Problems
    Frequency Distributions
    Variance and Standard Deviation

Final Exam

(c) primary source data
(d) secondary source data
(e) none of the above

92. As the size of an experimental sample set increases:
(a) the size of the population increases
(b) the size of the population decreases
(c) the estimate of the mean can be done with less and less accuracy
(d) the estimate of the mean can be done with more and more accuracy
(e) the standard deviation approaches

93. Which of the following statements (a), (b), (c), or (d) is true?
(a) The values in a bar graph do not necessarily have to add up to 100%.
(b) Some functions are relations.
(c) Zero correlation is indicated by widely scattered points on a graph.
(d) A histogram is a specialized bar graph.
(e) All of the above statements (a), (b), (c), and (d) are true.

94. Suppose it’s autumn in Minnesota, and you predict that it will be an average winter temperature-wise, based on historical data. This is a null hypothesis. Your uncle Jim thinks it will be a colder winter than average. Your sister Susan thinks it will be either warmer or colder than average, but not average. Uncle Jim’s prediction is an example of
(a) a one-sided alternative hypothesis
(b) a two-sided alternative hypothesis
(c) a positive hypothesis
(d) a negative hypothesis
(e) an off-center hypothesis

95. Suppose it’s autumn in Minnesota, and you predict that it will be an average winter temperature-wise, based on historical data. This is a null hypothesis. Your uncle Jim thinks it will be a colder winter than average. Your sister Susan thinks it will be either warmer or colder than average, but not average. Susan’s prediction is an example of
(a) a one-sided alternative hypothesis
(b) a two-sided alternative hypothesis
(c) a positive hypothesis
(d) a negative hypothesis
(e) an off-center hypothesis

96. Consider the following process for limiting the length of a number to three decimal places:
35.78790178
35.7879017
35.787901
35.78790
35.7879
35.787
The steps in this process are examples of
(a) rounding
(b) variance
(c) normalization
(d) truncation
(e) standard deviation

97. As the number of events in an experiment increases, the average value of the outcome approaches the theoretical mean. This is a statement of
(a) the law of least squares
(b) the Central Limit Theorem
(c) the law of large numbers
(d) the Regression Theorem
(e) the butterfly effect

98. In the plot of Fig. Exam-13, the correlation between phenomenon X and phenomenon Y appears to be
(a) positive
(b) negative
(c) zero
(d) linear
(e) undefined

99. With respect to the plot shown by Fig. Exam-13, which of the following scenarios (a), (b), (c), or (d) is plausible?
(a) Changes in the frequency, intensity, or amount of X cause changes in the frequency, intensity, or amount of Y.
(b) Changes in the frequency, intensity, or amount of Y cause changes in the frequency, intensity, or amount of X.
(c) Changes in the frequency, intensity, or amount of some third factor, Z, cause changes in the frequencies, intensities, and amounts of both X and Y.
(d) There is no cause–effect relationship between X and Y whatsoever.
(e) Any of the above scenarios (a), (b), (c), or (d) is plausible.

Fig. Exam-13. Illustration for Final Exam Questions 98 and 99.

100. Two outcomes are independent if and only if
(a) they both lie along the least-squares line in a scatter plot
(b) they are perfectly correlated
(c) the occurrence of one outcome affects the probability that the other will occur
(d) the occurrence of one outcome does not affect the probability that the other will occur
(e) Wait!
The premise is implausible. Two outcomes in an experiment can never be independent.

Answers to Quiz, Test, and Exam Questions

CHAPTER 1 b d b b b a d c c 10 a c a a a b 10 d d c a b c 10 b
CHAPTER c b a d
CHAPTER a b c d
CHAPTER b b b b c a d b d 10 b

TEST: PART ONE
 1. c    2. b    3. a    4. a    5. e
 6. a    7. d    8. c    9. e   10. e
11. d   12. c   13. a   14. c   15. a
16. e   17. b   18. c   19. e   20. a
21. c   22. d   23. a   24. e   25. d
26. c   27. b   28. b   29. d   30. b
31. d   32. d   33. c   34. a   35. c
36. d   37. e   38. b   39. a   40. d
41. a   42. d   43. a   44. e   45. d
46. a   47. e   48. b   49. d   50. a
51. a   52. e   53. b   54. b   55. d
56. d   57. a   58. c   59. d   60. e

CHAPTER b d a c a b a d a 10 a b d a b a 10 d c d c c a 10 a d c c a b 10 b
CHAPTER c c b b
CHAPTER a a b b
CHAPTER b d b a

TEST: PART TWO
 1. c    2. a    3. d    4. e    5. a
 6. b    7. b    8. e    9. a   10. a
11. e   12. b   13. e   14. a   15. a
16. c   17. c   18. a   19. d   20. b
21. e   22. c   23. a   24. a   25. c
26. c   27. e   28. d   29. b   30. a
31. c   32. b   33. c   34. c   35. d
36. b   37. b   38. d   39. a   40. d
41. d   42. c   43. c   44. d   45. b
46. b   47. b   48. c   49. d   50. d
51. e   52. b   53. e   54. b   55. a
56. d   57. b   58. d   59. e   60. b

FINAL EXAM
 1. b    2. a    3. c    4. a    5. b
 6. a    7. a    8. b    9. c   10. a
11. c   12. e   13. a   14. e   15. d
16. c   17. c   18. b   19. a   20. a
21. e   22. e   23. c   24. c   25. b
26. a   27. b   28. d   29. b   30. c
31. c   32. d   33. c   34. d   35. b
36. e   37. b   38. b   39. b   40. c
41. b   42. a   43. d   44. a   45. d
46. b   47. d   48. d   49. e   50. c
51. e   52. d   53. b   54. b   55. e
56. e   57. c   58. e   59. a   60. d
61. d   62. b   63. c   64. c   65. c
66. e   67. a   68. a   69. d   70. a
71. c   72. d   73. c   74. c   75. c
76. b   77. e   78. b   79. c   80. c
81. b   82. b   83. c   84. d   85. a
86. e   87. e   88. a   89. b   90. e
91. c   92. d   93. e   94. a   95. b
96. d   97. c   98. a   99. e  100. d

ABOUT THE AUTHOR

Stan Gibilisco is one of McGraw-Hill’s most prolific and popular authors. His clear, reader-friendly writing style makes his electronics books accessible to a wide audience, and his background in mathematics and research makes him an ideal editor for professional handbooks. He is the author of the TAB Encyclopedia of Electronics for Technicians and Hobbyists, Teach Yourself Electricity and Electronics, and The Illustrated Dictionary of Electronics. Booklist named his McGraw-Hill Encyclopedia of Personal Computing a “Best Reference” of 1996.