STATISTICS WITH COMMON SENSE

David Kault

Greenwood Press
Westport, Connecticut • London

Library of Congress Cataloging-in-Publication Data
Kault, David.
Statistics with common sense / David Kault.
p. cm.
Includes bibliographical references and index.
ISBN 0-313-32209-0 (alk. paper)
1. Statistics. I. Title.
QA276.12.K38 2003
519.5—dc21 2002075322

British Library Cataloguing in Publication Data is available.

Copyright © 2003 by David Kault

All rights reserved. No portion of this book may be reproduced, by any process or technique, without the express written consent of the publisher.

Library of Congress Catalog Card Number: 2002075322
ISBN: 0-313-32209-0

First published in 2003

Greenwood Press, 88 Post Road West, Westport, CT 06881
An imprint of Greenwood Publishing Group, Inc.
www.greenwood.com

Printed in the United States of America

The paper used in this book complies with the Permanent Paper Standard issued by the National Information Standards Organization (Z39.48-1984).

Contents

Preface vii
Acknowledgments ix
Glossaries xi
Statistical Computer Program xiii
1. Statistics: The Science of Dealing with Variability and Uncertainty
2. Descriptive Statistics
3. Basic Probability and Fisher's Exact Test 33
4. Discrete Random Variables and Some Statistical Tests Based on Them 63
5. Continuous Random Variables and Some Statistical Tests Based on Them 97
6. General Issues in Hypothesis Testing 141
7. Causality: Interventions and Observational Studies 167
8. Categorical Measurements on Two or More Groups 175
9. Statistics on More Than Two Groups 197
10. Miscellaneous Topics 227
Appendix: Table of the Standard Normal Distribution 239
Answers 241
Annotated Bibliography 253
Index 255

Preface

Statistics is primarily a way of making decisions in the face of variability and uncertainty. Often some new treatment is first tried on a few individuals and there seems to be some improvement. We want to decide whether we should believe the improvement is "for real" or just the result of chance variation. The treatment may be some
actual medical treatment, or it may be the application of a new fertilizer to a crop or an assessment of the effect of particular social circumstances on social outcomes. In many professional areas people want to answer the same basic question: "Does this make a real difference?" In the modern world this question is answered by statistics. Statistics is therefore part of the training course for people in a wide range of professions. Sadly, though, statistics remains a bit of a mystery to most students and even to some of their statistics teachers. Formulas and rules are learned that lead to an answer to the question, "Does this make a genuine difference?" in various situations. However, when people actually come to apply statistics in real life they are generally uneasy. They may be uneasy not only because they have forgotten which formula to apply in which situation or which button to press on the computer, but also because the formula or the computer is using criteria that they never properly understood to make important decisions that sometimes don't accord with common sense. People in this situation are right to be uneasy. Statistics applied correctly but without full understanding can lead to the most inappropriate, even bizarre decisions. Common sense without any assistance from statistical analysis will often lead to more sensible decisions. Nevertheless, statistics has conquered the world of modern decision making. Few people notice that many statisticians don't believe in statistics as it is currently practiced. Statistics can of course be used wisely, but this depends on the user properly understanding the meaning of the answers from the formula or the computer and understanding how to combine these answers with common sense. This book is primarily aimed at people who learned statistics at some stage, never properly understood it, and now need to use it wisely in everyday professional life. However, the book should be equally suitable as an introductory
text for students learning statistics for the first time. There is a large number of introductory statistics texts. This text stands out in three ways:

• It emphasizes understanding, not formulas.
• It emphasizes the incorporation of common sense into decision making.
• It gives the full mathematical derivation of some statistical tests to enhance understanding.

The last point requires an immediate qualification to prevent the large number of people with mathematics phobia from shutting the book for good at this point. No mathematical background beyond grade 10 is assumed, and the mathematics often consists of simply explaining one logical idea. Because formulas that can't be fully understood by someone with grade 10 mathematics are omitted, there is less mathematics than in most statistics texts. The aim is to show the limited connection between wise decision making and statistics as it is conventionally practiced, and to show how this situation can be rectified by combining statistics with common sense.

Acknowledgments

I thank my former statistics teacher, John Hunter, for his teaching, his inspiration, and his suggestions for this book. I also thank my son Sam for his proofreading. I am grateful for the support given to me by James Cook University of North Queensland in writing this book and the accompanying statistical computer program.

Glossaries

MATHEMATICAL SYMBOLS

<	less than
≤	less than or equal to
>	greater than
≥	greater than or equal to
≠	does not equal
≈	approximately equals
~	is
∪	or (meaning one or the other or both)
∩	and
|	given
n!	n factorial, meaning n × (n−1) × (n−2) × … × 3 × 2 × 1; for example, 4! = 4 × 3 × 2 × 1 = 24
ⁿCₖ	number of ways that from n objects k objects can be chosen (from n Choose k); for example, (see Chapter 3)

COMMON ENGLISH EXPRESSIONS USED IN THE TEXT TO IMPROVE READABILITY

The expressions here on the left-hand side are not normally intended to be used in an absolutely precise way. However, in certain contexts in this book they are used in
place of precise quantitative expressions to improve readability. The precise meanings that I attach to these expressions are given on the right.

"hardly ever"	with probability ≤ 0.05
"nearly always"	with probability ≥ 0.95
"quite often" or "commonly"	with probability > 0.05

Statistical Computer Program

Many people frequently come across questionable decisions made on the basis of statistical evidence. This book will help them to make their own informed judgment about evidence based on statistical analyses. Only some people will need to undertake statistical analyses themselves. On the other hand, it is just a small step from understanding statistical evidence to being able to undertake statistical analyses in many situations. It is a small step because in most cases the actual calculations are performed by a computer. The only additional skill to be learned in order for readers to perform statistical analyses for themselves is to learn which button on the computer to press. Doing helps learning, so this book includes questions, some of which are intended to be answered with the assistance of a statistical computer program. There are many statistical computer programs or "packages" available. Almost all would be capable of the calculations covered in this book. However, none are ideal. Many are unnecessarily complex for use in straightforward situations. The complexity, profusion of options, and graphical output may serve to confuse and distract users interested only in straightforward situations. Many contain errors in that they use easy-to-program, approximate methods when exact methods are more appropriate. Some contain other errors. Few programs are available free of charge, even though most of the intellectual effort underpinning such programs is ultimately a product of publicly funded universities in which academics have worked for the public good. In response to these issues, I have written a statistical program to accompany this book. I have called the program "pds" for
Public Domain Statistics. It is designed to run on the Windows operating system (version 95 or later), and occupies about Mb. It is available for distribution free of charge with the proviso that it is not to be used against the interests of humanity and the environment.

228 Statistics with Common Sense

we were to conduct a census of all of them, our resources might limit us to a postal survey that might then be filled out in haste by the mayors' aides. If, however, we were to concentrate our resources on a sample, we might be able to get much more thoughtful responses from all the mayors in the sample using face-to-face interviews. However, if our sample consists of an appreciable proportion of the entire population, our statistical analysis has to be modified. The modification reflects the fact that our sample not only gives us a probabilistic idea of what we would expect from the rest of the population, but it also gives us precise information about an appreciable proportion of the whole population. Mathematical theory we will not cover shows that the appropriate modification is quite straightforward. Our ideas on how uncertain our estimate of the population mean is (the standard error of the mean) have to be reduced by multiplying the standard error of the mean by √(1 − f), where f is the proportion of the population in the sample. This adjustment is known as the finite population correction. The statistical analysis then proceeds as usual. If we are using a computer program in our analysis, the program may display the standard error of the mean so that we can factor in a finite population correction manually, if appropriate. For example, say we sampled a random sample of ten of the mayors of Australia's twenty largest cities and found the following results on some numerical environmental awareness scale: 5, 6, 9, 4, 8, 10, 7, 3, 1, 2. We will assume here that it is reasonable to analyze these figures as though they came from a normal distribution and so use the methods described
in Chapter 5. The mean score is 5.5. The sample standard deviation is 3.027. The standard error of the mean without using the finite population correction is 3.027/√10 = 0.957, but with the finite population correction it is 0.957 × √(1 − 10/20) = 0.677. If we wanted a 95 percent confidence interval for the true mean of the scores of the mayors on this environmental awareness scale, it would be 5.5 ± 0.677 × 2.262, where ±2.262 is the range of values from the t₉ distribution that contains 95 percent of values. In other words, the 95 percent confidence interval would be (4.0, 7.0). If we had not used the finite population correction we would have (3.3, 7.7). The latter calculation would ignore the fact that not only do we have an impression of what all the mayors are like from interviewing ten of them, but we also have certain knowledge about what half of them are like.

In general, the (100 − α) percent confidence interval for the mean as a result of a survey from a finite population is given by

    x̄ ± T × (s/√n) × √(1 − f)

where x̄ is the sample mean, s is the sample standard deviation, n is the number surveyed, f is the proportion of the finite population surveyed, and T is the figure from the t with n−1 degrees of freedom distribution such that (100 − α) percent of the values from this distribution are in the range −T to +T (it is assumed here that the values that occur in this finite population are values that are chosen from a normal distribution).

Sometimes there is a philosophical difficulty here. If we want to know about the actual mayors of large Australian cities, we would use the finite population correction factor as in the preceding paragraph. However, if we were thinking of these mayors as a representative sample of all the mayors who could ever exist given the same social circumstances as exist in Australia, we would regard our population as infinite and not use the finite population correction factor. In the same way, if we have a statistics class with male and female students, we may find that the average mark of the females on the examination is a few percent better
than the average mark of the males. We could then ask the question, "Is this due to chance?" From one point of view, this is a meaningless question. Our sample is the population. We know precisely the mark of everyone in the class. Knowing these marks, there is no chance that these results for the population, the class, could be anything other than the results that we have in front of us. We can say, dogmatically, that in this class we are absolutely sure that, on these marks, women are better on average than men. From another point of view, we can think of this class as just a sample of all the billions of men and women who could potentially enroll in a class such as ours. Assuming the class is not large and that there is considerable individual variation in marks and noting the small difference in the average mark, after a statistical analysis we could conclude, "There is no (convincing) evidence that women are better than men."

DETERMINING THE NUMBER OF SAMPLE VALUES REQUIRED

The amount of data required depends on how accurate we want our estimate to be and how variable the individual values are. We will consider two cases: estimating a proportion and estimating the mean of values that are normally distributed.

Estimating a Proportion

Say we wanted to find out the proportion of people intending to vote for a particular political party. This involves the binomial random variable (Chapter 4). When the sample is of considerable size (e.g., twenty or more), the binomial distribution "looks" like a normal distribution, as it is the result of adding a considerable number of chances (this is the central limit theorem; see Chapter 5). If the true proportion in the population is θ, then the expected (anticipated average) number of successes in a sample of size n is nθ and the standard deviation is √(nθφ) (see Chapter 4 and recall that in the notation used there φ = 1 − θ). Then, by the central limit theorem, the actual number of successes in the sample
will be approximately a value chosen from a normal distribution centered on nθ and with standard deviation √(nθφ). The proportion of successes in the sample will be 1/n-th of this, so the proportion of successes in the sample will be chosen from a normal distribution centered on θ and with standard deviation √(θφ/n). Therefore, 95 percent of the time the proportion that will be obtained will be in the range

    θ ± 1.96 × √(θφ/n)

We see that if we want a 95 percent chance of being no further than 1 percent away from the true value of θ, we should take n so that

    1.96 × √(θφ/n) = 0.01

Rearranging this equation gives

    n = θφ × (1.96/0.01)²

Now φ = 1 − θ, and high school algebra shows that the biggest value that θ(1 − θ) can take is ¼ (this value occurs when θ = ½). Approximating 1.96 by 2 and taking the largest possible value of θφ of ¼ shows that if we take n = ¼ × (2/0.01)² = 10,000 we will have at least a 95 percent chance of obtaining a proportion that is within 1 percent of the true proportion. Public-opinion surveys are often done using n = 500, not 10,000. Calculations using this theory show that these surveys can have about a 1 in 20 chance of being inaccurate by more than about 4 percent.

The Number of Values Required for Estimation of the Mean of a Continuous Variable to within a Given Accuracy

The formula for the amount of data required in this situation is worked out using similar reasoning. This time, however, we have to have some preliminary estimate of the variability in order to make an estimate of n. We assume that we have an estimate s of the standard deviation. The formula, by similar reasoning to that shown earlier, turns out to be

    n = (u × s / d)²

where we are prepared to be in error by an amount of d or more with a chance of α, and u is the value from a t distribution such that the chance of being above u is α/2. If n turns out to be reasonably large (e.g., greater than 20), the standard normal distribution usually is used, as it is a good approximation to the corresponding t distribution. For example, if we had a variable for which preliminary
information indicated that the standard deviation was 10.0 and we wanted to be 99 percent sure that our estimate of the mean was within 2.0 units of the true value, we would need

    n = (2.57 × 10.0 / 2.0)² ≈ 165

The figure 2.57 is used here because the range ±2.57 from the standard normal distribution contains 99 percent of values.

Our considerations here can lead to the issues raised in dealing with confidence intervals. If, with s = 10, we are about to examine a sample of 165, then there will be a 99 percent chance that the mean of the sample will be within 2.0 units of the population mean. However, once we have obtained a particular sample mean we cannot generally say there is a 99 percent chance that the true population mean μ will lie within 2.0 units of the sample mean. Our 99 percent confidence interval will, however, be the sample mean ±2.0 units. For example, if we are dealing with the heights of women, although 99 percent of samples of 165 women may give sample means in the range 170 to 174 cm, by a very long coincidence we may have obtained a sample with a mean of 150 cm. We have prior knowledge about the likely average height of women and so it would not be correct for us to believe on the basis of our sample that there is a 99 percent chance that the average woman is between 148 and 152 cm. Ideas about where the sample mean is likely to be knowing the population mean cannot generally be inverted to ideas about where the population mean is likely to be knowing a sample mean (see the material on confidence intervals in Chapter 5 for more explanation of this issue).

TOPICS COVERED IN MORE ADVANCED STATISTICS TEXTS

The topics covered in this text give the reader sufficient knowledge to apply statistics to most straightforward situations where only one or two measurements are made on each individual. Hopefully, this text, with its emphasis on understanding the philosophy, will enable the reader to apply statistics with common sense in such situations. However, there are many
parts of the subject of statistics that have not been covered, and it seems appropriate in this last chapter to give an indication of the scope of the subject. This will be given in the form of the following list of randomly selected topics in random order:

• We have covered the common tests appropriate to certain combinations of sources of data and types of data, as described at the end of Chapter 5 and expanded at the end of this chapter. However, there are many more tests applicable in circumstances we haven't considered. For example, we haven't considered the situation where there is a measure on each individual performed after each of a number of different interventions, where that number is more than one.

• Mathematical statistics is a large subject. Among many other things it delves into questions of being precise about what we mean when we say things like "s as defined in Chapter 2 is a 'good' estimate of σ." What are the mathematical properties of a "good" or "best" method of estimating a parameter value? Are there mathematical methods for finding the "best" method of estimation?
• Throughout this text the need to incorporate common sense into statistics has been emphasized. The reader has been urged to do this by modifying their benchmark p value according to circumstances. There are also mathematical ways of incorporating common sense into statistics. The subject of Bayesian statistics is one of a number of mathematical approaches to incorporating common sense into statistics.

• Decision theory is a further attempt to refine the use of statistics. It is a method of using objective and subjective information about probabilities and explicitly taking into account the costs of errors.

• The previous section gave some information on calculating how many sample values would be needed for a certain amount of accuracy in two simple situations. There is a lot more to the topic of finding the sample numbers necessary for a statistical analysis to have some required power. A related topic is stratified sampling. If we wanted an estimate of the total amount of soil lost to erosion in Australia each year, positioning test areas at randomly chosen spots throughout Australia would not be optimal. To improve accuracy, we would be best off focusing more sampling effort on geographic regions where erosion levels were more variable.

• Being as economical as possible in terms of the number of subjects in an experiment is particularly important when the decision about the best treatment is a matter of life and death. There are special methods known as sequential analysis for continually checking the data to decide when sufficient people have undergone the experimental treatment for a decision about it to be made.

• Survival analysis is a related area. In medicine, the final endpoint for many studies is death. However, as people, even sick ones, often live a very long time, we would often have to wait a very long time before everybody in a study died and we had complete results to use in comparing the benefits of different methods of delaying death. Using incomplete
results when only some people have died is the subject of survival analysis.

• The practicalities of sampling, particularly sampling humans, is another large topic. For example, phone surveys don't represent people in households without a phone, but, less obviously, they underrepresent those in large households with only one phone per household (see the answer to question of Chapter 2).

• Most of this book has dealt with samples in which the individuals have been chosen at random. In spatial statistics, though, the equivalent of our individuals are points we choose to sample in space. Points in space are not independent: They are related according to how physically close they are. Spatial statistics is an important component of environmental science and of geology. It is required, for example, in order to use limited information to draw maps of pollution levels, assess the population of endangered species, and assess the amount of ore in a mine.

• Just as points can't be independent in space, they can't be independent in time. Special statistical methods are required for the analysis of fluctuating data through time. This is the subject of time series. Physicists looking at sunspots, meteorologists looking at weather patterns, and economists looking at fluctuations in the capitalist economies are all interested in time series.

• In biology and medicine we are often interested in a number of factors that may all be operating simultaneously in one individual to have an effect on what is being measured. We may want to know if the effect of a new drug taken for blood pressure is affected by gender, age, preexisting blood pressure level, dietary salt intake, and coprescription of certain other drugs. Furthermore, we may want to compare how the new drug and the standard drug interact with these factors. Teasing out answers to such questions constitutes the majority of many second-level applied statistics courses for biologists. These answers come under headings
such as advanced regression and multiway ANOVA or ANOVA models. These topics, in turn, are derived from a branch of mathematical statistics known as generalized linear models.

• Often more than one or two measurements are made on an individual. Particularly in psychology, vast numbers of measurements are made on each individual in the form of responses to a questionnaire containing hundreds of questions. In biology and medicine as well, it is common to take many measurements of various aspects of each individual. Making sense of all this information is the subject of multivariate analysis. Multivariate analysis includes a number of topics. For instance, principal component analysis, factor analysis, and cluster analysis can deal with condensing the mass of information from psychology questionnaires into a more manageable form. These topics give methods of answering questions about whether people's personalities tend to fall into a limited number of types or about how much of the variation between people can be summarized by, say, three figures (for example, the three figures might give a measure of intelligence, a position on a scale of introversion-extroversion, and a position on a scale of conservatism-radicalism). There are also many other uses for these techniques. In medicine, the topic of discriminant analysis is used to find a method of combining a number of indirect measures to obtain a score that is best able to discriminate between the presence or absence of a serious disease. This can avoid the need for expensive or dangerous operative treatment to decide the issue beyond doubt. There are many other topics within the area of multivariate analysis and many other uses for this branch of statistics.

SUMMARY

• Occasionally, our sample constitutes an appreciable proportion of the population of interest. In such cases, confidence intervals for means have to be reduced in width by a factor of √(1 − f), reflecting the fact that we are certain
in our knowledge of an appreciable proportion of the population.

• We can determine the number of measurements required for a statistical test to have a certain level of accuracy.

• This text covers most straightforward statistical tests, but the scope of advanced statistics is huge.

SUMMARY OF STATISTICAL TESTS

All the statistical tests covered in this book are designed to help answer the question, "Are the differences we see 'for real' or are they just the result of chance?" The results from the statistical tests are not a direct answer to this question, but instead tell us how easy it would be for chance to explain the results. The term "differences" in this context has several shades of meaning:

1. We may be interested in the possibility of a difference of an appreciable size caused by a definite intervention or membership of a definite group. We have covered a number of such tests. Different tests apply in different situations, depending on the source of data and the type of data. These tests were shown at the end of Chapter 5, and the list is repeated here with the addition of tests covered later:

Source of data: Two related measures (e.g., measures before and after an intervention on the same individual; measures on one twin who had an intervention and the other twin who didn't)
  Dichotomous (e.g., better or worse) data: Sign test
  Numerical but not necessarily normal data: Wilcoxon signed rank test
  Numerical and normal data: Paired samples t test

Source of data: A single measure on two unrelated samples (e.g., measuring the same quantity on men and women)
  Dichotomous data: Fisher's exact test
  Numerical but not necessarily normal data: Mann-Whitney test
  Numerical and normal data: Independent samples t test

Source of data: A single measure on more than two unrelated samples
  Dichotomous data: χ² test of association (applies to nominal, not just dichotomous, outcomes)
  Numerical but not necessarily normal data: Kruskal-Wallis test
  Numerical and normal data: ANOVA

2. We may be interested in shades of difference where there is no distinct group as such but individual measures are at different points along a continuous range of values. In this case, we usually use the
word "association" and ask if variation in one measure is associated with variation in another measure. Here we use correlation and regression. Valid p values can be calculated for the measures that we obtain from correlation and regression when the association is linear and the appropriate test is used. There are several types of data:

• Both measures are numerical but not normal. Test: Spearman's ρ (rho).
• At least one of the measures is normally distributed as the dependent variable scattered about a regression line. Test: to see if the slope of the regression line (β) or Pearson's correlation coefficient (r) is zero.
• The two measures have a bivariate normal distribution. Test: (Pearson's) correlation coefficient.

3. We may be interested in the possibility of a difference between the population from which we have drawn our sample and some theoretical distribution. The relevant tests that we have covered include the following:

• the use of the binomial random variable to test an hypothesis about the value of the parameter θ or the proportion in the population, based on knowledge about a sample
• the use of the Poisson random variable to test an hypothesis about the value of the parameter λ or the average rate at which something happens
• the use of the z test to test whether a single value or the average of a number of values comes from a normal distribution with μ and σ specified
• the use of the single sample t test to test whether the average of a number of values comes from a normal distribution with just μ specified
• the use of the Kolmogoroff-Smirnoff test, mentioned in this book as a test of the data fitting a normal distribution, but with wider applicability
• the use of the chi-square goodness of fit test to test whether the proportions of the data in various categories are a reasonable match to those expected by the theoretical distribution

Most of the tests under (1) and (2) can also be used to define confidence intervals. Confidence intervals are a way of using
the measures on the sample to obtain information on the likely position of some unknown parameter that describes the population or describes the average difference made by some treatment. Again, note that a confidence interval does not directly tell us that the unknown parameter is in a certain interval with a certain probability. Instead, the confidence interval tells us that if the parameter was in this interval it could "easily" have given the underlying data. The term "easily" refers to calculations with a given value of the parameter, giving a p value for the data larger than the benchmark p value.

QUESTIONS

1. Say that the entire adult population of northern hairy-nosed wombats consists of 180 individuals and that we survey a randomly selected sample of 100 of these individuals and measure their weights. If the results are that the mean is 27 kg and the standard deviation is kg, find the 95 percent confidence interval for the mean weight of adult northern hairy-nosed wombats.

2. On a test in a statistics class the mean mark was 69.71 and the standard deviation was 11.65. However, only twenty-one of the twenty-two students now enrolled took the test, as one student was unable to take it because of ill health.
a. Assume this one student has been attending class as much as the other twenty-one students until the time of the test. Find the 95 percent confidence interval for the mean mark that would have been obtained in the first test had all twenty-two students been able to take it.
b. The absent student's health deteriorates further and she withdraws from the class. What now is the 95 percent confidence interval for the mean mark of the class?

3. Consider a survey on an issue in which the American population is thought to be approximately evenly divided.
a. How many people would we need to survey for us to have a 99 percent chance of obtaining a value that differs from the true percentage by no more than 1 percentage point?
b. After taking our survey using an appropriately chosen representative sample and with the appropriate numbers obtained from part a, we find that 40 percent of the sample is positive about the issue under consideration. Does this mean that we can be 99 percent certain that the true proportion of the whole of the American population who are positive about this issue is between 39 and 41 percent? Discuss.

4. For each of the following scenarios state the most appropriate statistical test.
a. People are classified into two groups depending on whether they were abused as children, and the recorded outcome is whether they have had a conviction for theft.
b. Groups of people in four different industries are selected at random and their blood pressures are measured, as the investigators are interested in a possible association between blood pressure and type of workplace.
c. It is thought that the position of a person's surname in the alphabet may, as a result of childhood experiences of waiting for names to be announced in alphabetical lists, lead to personality differences. Accordingly, a psychological test that gives a numerical score on an introversion-extroversion scale is administered to a group of people. About half the people in the group have surnames starting with the first four letters of the alphabet and the remainder have surnames starting with the last four letters of the alphabet. The introversion-extroversion score of all these people is assessed. Assume that inspection of the figures suggests that it is reasonable to believe that the data come from a normal distribution.
d. Consider the same scenario as in c, with names starting with letters at different ends of the alphabet and measurement of an introversion-extroversion score. What test should we use to assess the results if inspection of the figures suggests that it is not reasonable to believe that the data come from a normal distribution?
   e. A piped-music system is installed in a hospital for long-stay patients and the recorded outcome is whether patients felt better or worse on a day when they had access to the music than on a day when they didn't have access to the music.
   f. A piped-music system is installed in a gym and the recorded outcomes are the amount of weight each member of the gym could lift on a day without piped music and the amount of weight each member could lift on a day with piped music. Assume that inspection of the figures suggests that it is reasonable to believe that the data come from a normal distribution.
   g. Consider the same scenario with the gym and piped music as in f, but this time assume that inspection of the figures suggests that it is not reasonable to believe that the data come from a normal distribution.
   h. People are classified into two groups, depending on whether they grew up in a rural or urban setting, and the recorded outcome is whether they are vegetarian.
   i. Groups of women, all age twenty, who adhere to four different religions are selected at random and their time for a 100-meter sprint race is recorded, as the investigators are interested in a possible association between athletic performance and religion.
   j. We are interested in the possibility of an association between religion and occupational group.
   k. People of varying incomes are selected and their IQs are measured, as the investigators are interested in a possible association between income and IQ.

Index

A posteriori, 52
A priori, 52
Absolute cause, 169
Absolute values, 19
Alternative hypothesis, 67
Analysis of Variance, 197-205
ANOVA, 197-205, 233
Arrangements, 56
Association, 181, 235
Bar graph, 12, 13
Bayes's rule, 50, 56, 71, 156, 157
Bayesian statistics, 54, 71, 79, 152, 156, 157, 161, 162
Best estimates, 91-92
Binomial random variable, 64-79
Bivariate normality, 221
Bonferroni, 199, 203
Boxplots, 29
Cauchy random variable, 102
Causality, 57, 167-173, 211
Census, 9, 227
Central limit theorem, 101-102, 112, 122, 229
Central tendency, 14
Charts, bar and pie, 13
Chi-square tests (χ² tests), 181-195
Combinations, 45-46, 57, 64, 65
Compatibility intervals. See Confidence intervals
Conditional probability, 39, 50
Confidence intervals, 149-162, 186, 187, 209, 210, 231, 235
Conservatism, 142-144
Continuous data, 10, 97
Contrasts, 203
Contributory cause, 169
Correlation, 218-223
Cost of errors, 85, 149
Data, 9-30; bimodal, 16; bivariate, 12; categorical, 11; continuous, 10, 11, 97; dichotomous, 11; discrete, 11; multivariate, 12; ordinal, 11, 12, 80; skewed, 18, 107; unimodal, 16; univariate, 12
Data transformations, 129, 136, 202
Decision theory, 149, 232
Dependence, 101. See also Independence
Dichotomous data, 11
Dispersion, 18
Distribution-free statistics, 130
Double blind, 170
Error term, 206, 207
Errors, 146, 147
Ethics, 142
Expected value, 90
Factorial (!), 56, 58, 86
F distribution, 139 n.4, 200
F test, 125, 139 n.4, 199, 200
Finite population correction, 227-229
Fisher's exact test, 39-45, 77, 175-181
Frequentist statistics, 2, 54
Goodness of fit, 194-195, 235
Heteroscedastic, 212
Histogram, 27-28, 131
Hypothesis testing, 67-69, 108, 141-149
Ideological bias, 42
Inappropriate conclusions, 43, 95 n.1, 118, 143, 144
Independence, 20, 35, 65, 66, 78, 87, 88, 101, 111-113, 118, 198, 209, 210, 214, 215
Independent samples t test, 124-129
Interquartile range (IQR), 22
Interventions, 168
Kolmogoroff-Smirnoff test, 135
Kruskal-Wallis test, 202-205
Law of total probability, 47-49
Lilliefors test, 135
Line graph, 17, 23, 24
Lognormal distribution, 107-108
Mann-Whitney test, 82-85
Matching, 172
McNemar's test, 76, 192-194
Mean, 14, 90
Median, 15
Medical screening, 51-52
Mode, 16
Multivariate analysis, 233
Mutually exclusive, 34, 36, 37
N(μ, σ²), 100
Nonparametric statistics, 130, 136
Normal distribution, normal curve, 139
Null hypothesis, 67
Numbers required in a sample, 43, 44, 69, 95 n.1, 110, 115-116, 144-147, 229-231
Objectivity, 2, 141, 155
Observational study, 168, 171
Odds ratio, 183, 185-187
One-tail and two-tail tests, 73-76, 82, 85, 128, 180-181
Ordinal data, 11, 12, 80, 82
Ordinate diagram, 28
Outliers, 18, 29
p value, 4-6, 8, 40-45, 54, 67-72, 79, 81, 85, 89, 108-111, 115, 123, 124, 127-129, 142, 146, 148, 178-183, 190-195, 200, 204, 210, 211, 221-223
Paired samples t test, 121-124
Pairing, 76-77, 121-122, 193, 194
Percentiles, 22, 23
Permutations, 57
Pie chart, 13
Placebo, 170
Poisson random variable, 86-89
Population, 9, 89-92, 229
Post-hoc tests, 203
Power (of a test), 147, 161, 162, 223
P-P plots, 131-135
Probability, 33-61; addition rule, 34; conditional, 39, 50; definitions, 33; multiplication rule, 35
Probability density, 97, 98
Probability density function (pdf), 98
Probability weighted averaging, 90
Proportion of the time, 33, 34
Prospective studies, 171
Q-Q plots, 131-135
Quartiles, 22, 23
Random variables, 63-139; binomial, 64-66; Cauchy, 102; continuous, 139; discrete, 63-96; lognormal, 107-108; normal, 98-101; Poisson, 86-89; standard normal (Z), 102-107; uniform, 134
Range, 18
Rank, 80, 83, 108
Regression, 197, 205-218
Relative risk, 183-185
Representative sample, 10, 111
Retrospective studies, 171
s, σ, standard deviation, 20-22, 91, 92
Sample, 9, 89-92
Sampling distributions, 111-114
Sensitivity, 47, 48, 50, 147
Sequential analysis, 232
Sign test, 66-79
Significance, 5, 146
Simpson's paradox, 187-188
Single sample t test, 119-121
Skewed data, 18, 107
Spatial statistics, 233
Spearman's rho (ρ), 57, 221-223
Specificity, 47, 48, 50, 147
Standard deviation, 20-22, 91; sample standard deviation, 92
Standard error (of the mean), 116
Standard normal random variable, 103-107
Statistical significance, 5, 42, 54, 69, 109
Stem and leaf plot, 25-27
Stratified sampling, 232
Studies, experimental or observational, 168
Summary statistics, 14
Tests: χ² (chi-square) tests, 181-195; Fisher's exact test, 40-45, 175-181; F test, 125, 139 n.4, 199, 200; goodness of fit tests, 194-195, 235; independent samples t test, 124-129; Kolmogoroff-Smirnoff test, 135; Kruskal-Wallis test, 202-205; Lilliefors test, 135; Mann-Whitney test (Wilcoxon rank sum test), 82-85; McNemar's test, 76, 192-194; paired samples t test, 121-124; sign test, 66-79; single sample t test, 119-121; Wilcoxon signed rank test, 80-82; z test, 108-111
Ties, 69-70, 82, 85
Time series, 233
Transformations, 129, 136, 202
Trial: binomial, 64; clinical, 170
Uniform random variable, 134
Van der Waerden's method, 134
Variance, 20, 91
Venn diagrams, 34-37
Wilcoxon rank sum test, 85
Wilcoxon signed rank test, 80-82
z test, 108-111, 115-118

ABOUT THE AUTHOR

David Kault is a medical practitioner and an adjunct lecturer in mathematics at James Cook University, Queensland, Australia. He has taught a number of introductory statistics courses, both general and applied to such areas as environmental science and medicine.