Important Formulas

Chapter 3 Data Description

Mean for individual data: $\bar{X} = \dfrac{\sum X}{n}$
Mean for grouped data: $\bar{X} = \dfrac{\sum f \cdot X_m}{n}$
Standard deviation for a sample: $s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$ or $s = \sqrt{\dfrac{n(\sum X^2) - (\sum X)^2}{n(n - 1)}}$ (Shortcut formula)
Standard deviation for grouped data: $s = \sqrt{\dfrac{n(\sum f \cdot X_m^2) - (\sum f \cdot X_m)^2}{n(n - 1)}}$
Range rule of thumb: $s \approx \dfrac{\text{range}}{4}$

Chapter 4 Probability and Counting Rules

Addition rule 1 (mutually exclusive events): $P(A \text{ or } B) = P(A) + P(B)$
Addition rule 2 (events not mutually exclusive): $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
Multiplication rule 1 (independent events): $P(A \text{ and } B) = P(A) \cdot P(B)$
Multiplication rule 2 (dependent events): $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$
Conditional probability: $P(B \mid A) = \dfrac{P(A \text{ and } B)}{P(A)}$
Complementary events: $P(\bar{E}) = 1 - P(E)$
Fundamental counting rule: Total number of outcomes of a sequence when each event has a different number of possibilities: $k_1 \cdot k_2 \cdot k_3 \cdots k_n$
Permutation rule: Number of permutations of $n$ objects taking $r$ at a time is ${}_nP_r = \dfrac{n!}{(n - r)!}$
Combination rule: Number of combinations of $r$ objects selected from $n$ objects is ${}_nC_r = \dfrac{n!}{(n - r)!\,r!}$

Chapter 5 Discrete Probability Distributions

Mean for a probability distribution: $\mu = \sum [X \cdot P(X)]$
Variance and standard deviation for a probability distribution: $\sigma^2 = \sum [X^2 \cdot P(X)] - \mu^2$ and $\sigma = \sqrt{\sum [X^2 \cdot P(X)] - \mu^2}$
Expectation: $E(X) = \sum [X \cdot P(X)]$
Binomial probability: $P(X) = \dfrac{n!}{(n - X)!\,X!} \cdot p^X \cdot q^{n - X}$
Mean for binomial distribution: $\mu = n \cdot p$
Variance and standard deviation for the binomial distribution: $\sigma^2 = n \cdot p \cdot q$ and $\sigma = \sqrt{n \cdot p \cdot q}$
Multinomial probability: $P(X) = \dfrac{n!}{X_1!\,X_2!\,X_3! \cdots X_k!} \cdot p_1^{X_1} \cdot p_2^{X_2} \cdot p_3^{X_3} \cdots p_k^{X_k}$
Poisson probability: $P(X; \lambda) = \dfrac{e^{-\lambda} \lambda^X}{X!}$ where $X = 0, 1, 2, \ldots$
Hypergeometric probability: $P(X) = \dfrac{{}_aC_X \cdot {}_bC_{n - X}}{{}_{a+b}C_n}$

Chapter 6 The Normal Distribution

Standard score: $z = \dfrac{X - \mu}{\sigma}$ or $z = \dfrac{X - \bar{X}}{s}$
Mean of sample means: $\mu_{\bar{X}} = \mu$
Standard error of the mean: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
Central limit theorem formula: $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Chapter 7 Confidence Intervals and Sample Size

z confidence interval for means: $\bar{X} - z_{\alpha/2}\left(\dfrac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\dfrac{\sigma}{\sqrt{n}}\right)$
t confidence interval for means: $\bar{X} - t_{\alpha/2}\left(\dfrac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2}\left(\dfrac{s}{\sqrt{n}}\right)$
Sample size for means: $n = \left(\dfrac{z_{\alpha/2} \cdot \sigma}{E}\right)^2$ where $E$ is the maximum error of estimate
Confidence interval for a proportion: $\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
Sample size for a proportion: $n = \hat{p}\hat{q}\left(\dfrac{z_{\alpha/2}}{E}\right)^2$ where $\hat{p} = \dfrac{X}{n}$ and $\hat{q} = 1 - \hat{p}$
Confidence interval for variance: $\dfrac{(n - 1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \dfrac{(n - 1)s^2}{\chi^2_{\text{left}}}$
Confidence interval for standard deviation: $\sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\text{left}}}}$

Chapter 8 Hypothesis Testing

z test: $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ for any value $n$. If $n < 30$, population must be normally distributed.
t test: $t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$ (d.f. = $n - 1$)
z test for proportions: $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Chi-square test for a single variance: $\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2}$ (d.f. = $n - 1$)

Chapter 9 Testing the Difference Between Two Means, Two Proportions, and Two Variances

z test for comparing two means (independent samples): $z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Formula for the confidence interval for difference of two means (large samples): $(\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
t test for comparing two means (independent samples, variances not equal): $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$ (d.f. = the smaller of $n_1 - 1$ or $n_2 - 1$)
Formula for the confidence interval for difference of two means (small independent samples, variance unequal): $(\bar{X}_1 - \bar{X}_2) - t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ (d.f. = smaller of $n_1 - 1$ and $n_2 - 1$)
t test for comparing two means for dependent samples: $t = \dfrac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$ where $\bar{D} = \dfrac{\sum D}{n}$ and $s_D = \sqrt{\dfrac{n \sum D^2 - (\sum D)^2}{n(n - 1)}}$ (d.f. = $n - 1$)
Formula for confidence interval for the mean of the difference for dependent samples: $\bar{D} - t_{\alpha/2}\dfrac{s_D}{\sqrt{n}} < \mu_D < \bar{D} + t_{\alpha/2}\dfrac{s_D}{\sqrt{n}}$ (d.f. = $n - 1$)
z test for comparing two proportions: $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$ where $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$, $\bar{q} = 1 - \bar{p}$, $\hat{p}_1 = \dfrac{X_1}{n_1}$, $\hat{p}_2 = \dfrac{X_2}{n_2}$
Formula for the confidence interval for the difference of two proportions: $(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$
F test for comparing two variances: $F = \dfrac{s_1^2}{s_2^2}$ where $s_1^2$ is the larger variance and d.f.N. = $n_1 - 1$, d.f.D. = $n_2 - 1$

Chapter 10 Correlation and Regression

Correlation coefficient: $r = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$
t test for correlation coefficient: $t = r\sqrt{\dfrac{n - 2}{1 - r^2}}$ (d.f. = $n - 2$)
The regression line equation: $y' = a + bx$ where $a = \dfrac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$ and $b = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$
Coefficient of determination: $r^2 = \dfrac{\text{explained variation}}{\text{total variation}}$
Standard error of estimate: $s_{\text{est}} = \sqrt{\dfrac{\sum y^2 - a \sum y - b \sum xy}{n - 2}}$
Prediction interval for $y$: $y' - t_{\alpha/2}\, s_{\text{est}} \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x - \bar{X})^2}{n \sum x^2 - (\sum x)^2}} < y < y' + t_{\alpha/2}\, s_{\text{est}} \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x - \bar{X})^2}{n \sum x^2 - (\sum x)^2}}$ (d.f. = $n - 2$)
Formula for the multiple correlation coefficient: $R = \sqrt{\dfrac{r_{yx_1}^2 + r_{yx_2}^2 - 2 r_{yx_1} \cdot r_{yx_2} \cdot r_{x_1x_2}}{1 - r_{x_1x_2}^2}}$
Formula for the F test for the multiple correlation coefficient: $F = \dfrac{R^2 / k}{(1 - R^2)/(n - k - 1)}$ (d.f.N. = $n - k$ and d.f.D. = $n - k - 1$)
Formula for the adjusted $R^2$: $R^2_{\text{adj}} = 1 - \left[\dfrac{(1 - R^2)(n - 1)}{n - k - 1}\right]$

Chapter 11 Other Chi-Square Tests

Chi-square test for goodness-of-fit: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ (d.f. = no. of categories $- 1$)
Chi-square test for independence and homogeneity of proportions: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ [d.f. = (rows $- 1$)(col $- 1$)]

Chapter 12 Analysis of Variance

ANOVA test: $F = \dfrac{s_B^2}{s_W^2}$ where $\bar{X}_{GM} = \dfrac{\sum X}{N}$, d.f.N. = $k - 1$, d.f.D. = $N - k$, where $N = n_1 + n_2 + \cdots + n_k$ and $k$ = number of groups
$s_B^2 = \dfrac{\sum n_i (\bar{X}_i - \bar{X}_{GM})^2}{k - 1}$ and $s_W^2 = \dfrac{\sum (n_i - 1) s_i^2}{\sum (n_i - 1)}$
Scheffé test: $F_S = \dfrac{(\bar{X}_i - \bar{X}_j)^2}{s_W^2 (1/n_i + 1/n_j)}$ and $F' = (k - 1)(\text{C.V.})$
Tukey test: $q = \dfrac{\bar{X}_i - \bar{X}_j}{\sqrt{s_W^2 / n}}$
Formulas for two-way ANOVA:
$MS_A = \dfrac{SS_A}{a - 1}$, $MS_B = \dfrac{SS_B}{b - 1}$, $MS_{A \times B} = \dfrac{SS_{A \times B}}{(a - 1)(b - 1)}$, $MS_W = \dfrac{SS_W}{ab(n - 1)}$
$F_A = \dfrac{MS_A}{MS_W}$, $F_B = \dfrac{MS_B}{MS_W}$, $F_{A \times B} = \dfrac{MS_{A \times B}}{MS_W}$

Chapter 13 Nonparametric Statistics

z test value in the sign test: $z = \dfrac{(X + 0.5) - (n/2)}{\sqrt{n} / 2}$ where $n$ = sample size (greater than or equal to 26) and $X$ = smaller number of $+$ or $-$ signs
Wilcoxon rank sum test: $z = \dfrac{R - \mu_R}{\sigma_R}$ where $\mu_R = \dfrac{n_1(n_1 + n_2 + 1)}{2}$ and $\sigma_R = \sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$
  $R$ = sum of the ranks for the smaller sample size ($n_1$)
  $n_1$ = smaller of the sample sizes
  $n_2$ = larger of the sample sizes
  $n_1 \ge 10$ and $n_2 \ge 10$
Wilcoxon signed-rank test: $z = \dfrac{w_s - \dfrac{n(n + 1)}{4}}{\sqrt{\dfrac{n(n + 1)(2n + 1)}{24}}}$ where
  $n$ = number of pairs where the difference is not 0
  $w_s$ = smaller sum in absolute value of the signed ranks
Kruskal-Wallis test: $H = \dfrac{12}{N(N + 1)}\left(\dfrac{R_1^2}{n_1} + \dfrac{R_2^2}{n_2} + \cdots + \dfrac{R_k^2}{n_k}\right) - 3(N + 1)$ where
  $R_1$ = sum of the ranks of sample 1, $n_1$ = size of sample 1
  $R_2$ = sum of the ranks of sample 2, $n_2$ = size of sample 2
  ...
  $R_k$ = sum of the ranks of sample $k$, $n_k$ = size of sample $k$
  $N = n_1 + n_2 + \cdots + n_k$
  $k$ = number of samples
Spearman rank correlation coefficient: $r_S = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}$ where $d$ = difference in the ranks and $n$ = number of data pairs

Procedure Table
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table in Appendix C.
Step 3 Compute the test value.
Step 4 Make the decision to reject or not reject the null hypothesis.
Step 5 Summarize the results.

Procedure Table: Solving Hypothesis-Testing Problems (P-value Method)
Step 1 State the hypotheses and identify the claim.
Step 2 Compute the test value.
Step 3 Find the P-value.
Step 4 Make the decision.
Step 5 Summarize the results.

ISBN-13:
978–0–07–743861–6
ISBN-10: 0–07–743861–2

Solving Hypothesis-Testing Problems (Traditional Method)

Table E The Standard Normal Distribution

Cumulative Standard Normal Distribution

z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
−3.4  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0002
−3.3  .0005  .0005  .0005  .0004  .0004  .0004  .0004  .0004  .0004  .0003
−3.2  .0007  .0007  .0006  .0006  .0006  .0006  .0006  .0005  .0005  .0005
−3.1  .0010  .0009  .0009  .0009  .0008  .0008  .0008  .0008  .0007  .0007
−3.0  .0013  .0013  .0013  .0012  .0012  .0011  .0011  .0011  .0010  .0010
−2.9  .0019  .0018  .0018  .0017  .0016  .0016  .0015  .0015  .0014  .0014
−2.8  .0026  .0025  .0024  .0023  .0023  .0022  .0021  .0021  .0020  .0019
−2.7  .0035  .0034  .0033  .0032  .0031  .0030  .0029  .0028  .0027  .0026
−2.6  .0047  .0045  .0044  .0043  .0041  .0040  .0039  .0038  .0037  .0036
−2.5  .0062  .0060  .0059  .0057  .0055  .0054  .0052  .0051  .0049  .0048
−2.4  .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
−2.3  .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
−2.2  .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110
−2.1  .0179  .0174  .0170  .0166  .0162  .0158  .0154  .0150  .0146  .0143
−2.0  .0228  .0222  .0217  .0212  .0207  .0202  .0197  .0192  .0188  .0183
−1.9  .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
−1.8  .0359  .0351  .0344  .0336  .0329  .0322  .0314  .0307  .0301  .0294
−1.7  .0446  .0436  .0427  .0418  .0409  .0401  .0392  .0384  .0375  .0367
−1.6  .0548  .0537  .0526  .0516  .0505  .0495  .0485  .0475  .0465  .0455
−1.5  .0668  .0655  .0643  .0630  .0618  .0606  .0594  .0582  .0571  .0559
−1.4  .0808  .0793  .0778  .0764  .0749  .0735  .0721  .0708  .0694  .0681
−1.3  .0968  .0951  .0934  .0918  .0901  .0885  .0869  .0853  .0838  .0823
−1.2  .1151  .1131  .1112  .1093  .1075  .1056  .1038  .1020  .1003  .0985
−1.1  .1357  .1335  .1314  .1292  .1271  .1251  .1230  .1210  .1190  .1170
−1.0  .1587  .1562  .1539  .1515  .1492  .1469  .1446  .1423  .1401  .1379
−0.9  .1841  .1814  .1788  .1762  .1736  .1711  .1685  .1660  .1635  .1611
−0.8  .2119  .2090  .2061  .2033  .2005  .1977  .1949  .1922  .1894  .1867
−0.7  .2420  .2389  .2358  .2327  .2296  .2266  .2236  .2206  .2177  .2148
−0.6  .2743  .2709  .2676  .2643  .2611  .2578  .2546  .2514  .2483  .2451
−0.5  .3085  .3050  .3015  .2981  .2946  .2912  .2877  .2843  .2810  .2776
−0.4  .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
−0.3  .3821  .3783  .3745  .3707  .3669  .3632  .3594  .3557  .3520  .3483
−0.2  .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
−0.1  .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
−0.0  .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641

For z values less than −3.49, use 0.0001.
[Figure: normal curve with labels "Area" and "z"]

Table E (continued)

Cumulative Standard Normal Distribution

z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
0.3   .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517
0.4   .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879
0.5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
0.6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
0.7   .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
0.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
1.0   .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
1.1   .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
1.2   .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
1.3   .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
1.4   .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
1.8   .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
1.9   .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
2.0   .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981
2.9   .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986
3.0   .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990
3.1   .9990  .9991  .9991  .9991  .9992  .9992  .9992  .9992  .9993  .9993
3.2   .9993  .9993  .9994  .9994  .9994  .9994  .9994  .9995  .9995  .9995
3.3   .9995  .9995  .9995  .9996  .9996  .9996  .9996  .9996  .9996  .9997
3.4   .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9998

For z values greater than 3.49, use 0.9999.
[Figure: normal curve with labels "Area" and "z"]

Table F The t Distribution

Confidence intervals   80%     90%     95%     98%     99%
One tail, α            0.10    0.05    0.025   0.01    0.005
Two tails, α           0.20    0.10    0.05    0.02    0.01
d.f.
1                      3.078   6.314   12.706  31.821  63.657
2                      1.886   2.920   4.303   6.965   9.925
3                      1.638   2.353   3.182   4.541   5.841
4                      1.533   2.132   2.776   3.747   4.604
5                      1.476   2.015   2.571   3.365   4.032
6                      1.440   1.943   2.447   3.143   3.707
7                      1.415   1.895   2.365   2.998   3.499
8                      1.397   1.860   2.306   2.896   3.355
9                      1.383   1.833   2.262   2.821   3.250
10                     1.372   1.812   2.228   2.764   3.169
11                     1.363   1.796   2.201   2.718   3.106
12                     1.356   1.782   2.179   2.681   3.055
13                     1.350   1.771   2.160   2.650   3.012
14                     1.345   1.761   2.145   2.624   2.977
15                     1.341   1.753   2.131   2.602   2.947
16                     1.337   1.746   2.120   2.583   2.921
17                     1.333   1.740   2.110   2.567   2.898
18                     1.330   1.734   2.101   2.552   2.878
19                     1.328   1.729   2.093   2.539   2.861
20                     1.325   1.725   2.086   2.528   2.845
21                     1.323   1.721   2.080   2.518   2.831
22                     1.321   1.717   2.074   2.508   2.819
23                     1.319   1.714   2.069   2.500   2.807
24                     1.318   1.711   2.064   2.492   2.797
25                     1.316   1.708   2.060   2.485   2.787
26                     1.315   1.706   2.056   2.479   2.779
27                     1.314   1.703   2.052   2.473   2.771
28                     1.313   1.701   2.048   2.467   2.763
29                     1.311   1.699   2.045   2.462   2.756
30                     1.310   1.697   2.042   2.457   2.750
32                     1.309   1.694   2.037   2.449   2.738
34                     1.307   1.691   2.032   2.441   2.728
36                     1.306   1.688   2.028   2.434   2.719
38                     1.304   1.686   2.024   2.429   2.712
40                     1.303   1.684   2.021   2.423   2.704
45                     1.301   1.679   2.014   2.412   2.690
50                     1.299   1.676   2.009   2.403   2.678
55                     1.297   1.673   2.004   2.396   2.668
60                     1.296   1.671   2.000   2.390   2.660
65                     1.295   1.669   1.997   2.385   2.654
70                     1.294   1.667   1.994   2.381   2.648
75                     1.293   1.665   1.992   2.377   2.643
80                     1.292   1.664   1.990   2.374   2.639
90                     1.291   1.662   1.987   2.368   2.632
100                    1.290   1.660   1.984   2.364   2.626
500                    1.283   1.648   1.965   2.334   2.586
1000                   1.282   1.646   1.962   2.330   2.581
(z) ∞                  1.282a  1.645b  1.960   2.326c  2.576d

a This value has been rounded to 1.28 in the textbook.
b This value has been rounded to 1.65 in the textbook.
c This value has been rounded to 2.33 in the textbook.
d This value has been rounded to 2.58 in the textbook.

Source: Adapted from W. H. Beyer, Handbook of Tables for Probability and Statistics, 2nd ed., CRC Press, Boca Raton, Fla., 1986. Reprinted with permission.

[Figures: "One tail" curve with area α beyond t; "Two tails" curve with areas beyond −t and +t]

Table G The Chi-Square Distribution

Degrees of                                        α
freedom    0.995    0.99     0.975    0.95     0.90     0.10      0.05      0.025     0.01      0.005
1          —        —        0.001    0.004    0.016    2.706     3.841     5.024     6.635     7.879
2          0.010    0.020    0.051    0.103    0.211    4.605     5.991     7.378     9.210     10.597
3          0.072    0.115    0.216    0.352    0.584    6.251     7.815     9.348     11.345    12.838
4          0.207    0.297    0.484    0.711    1.064    7.779     9.488     11.143    13.277    14.860
5          0.412    0.554    0.831    1.145    1.610    9.236     11.071    12.833    15.086    16.750
6          0.676    0.872    1.237    1.635    2.204    10.645    12.592    14.449    16.812    18.548
7          0.989    1.239    1.690    2.167    2.833    12.017    14.067    16.013    18.475    20.278
8          1.344    1.646    2.180    2.733    3.490    13.362    15.507    17.535    20.090    21.955
9          1.735    2.088    2.700    3.325    4.168    14.684    16.919    19.023    21.666    23.589
10         2.156    2.558    3.247    3.940    4.865    15.987    18.307    20.483    23.209    25.188
11         2.603    3.053    3.816    4.575    5.578    17.275    19.675    21.920    24.725    26.757
12         3.074    3.571    4.404    5.226    6.304    18.549    21.026    23.337    26.217    28.299
13         3.565    4.107    5.009    5.892    7.042    19.812    22.362    24.736    27.688    29.819
14         4.075    4.660    5.629    6.571    7.790    21.064    23.685    26.119    29.141    31.319
15         4.601    5.229    6.262    7.261    8.547    22.307    24.996    27.488    30.578    32.801
16         5.142    5.812    6.908    7.962    9.312    23.542    26.296    28.845    32.000    34.267
17         5.697    6.408    7.564    8.672    10.085   24.769    27.587    30.191    33.409    35.718
18         6.265    7.015    8.231    9.390    10.865   25.989    28.869    31.526    34.805    37.156
19         6.844    7.633    8.907    10.117   11.651   27.204    30.144    32.852    36.191    38.582
20         7.434    8.260    9.591    10.851   12.443   28.412    31.410    34.170    37.566    39.997
21         8.034    8.897    10.283   11.591   13.240   29.615    32.671    35.479    38.932    41.401
22         8.643    9.542    10.982   12.338   14.042   30.813    33.924    36.781    40.289    42.796
23         9.262    10.196   11.689   13.091   14.848   32.007    35.172    38.076    41.638    44.181
24         9.886    10.856   12.401   13.848   15.659   33.196    36.415    39.364    42.980    45.559
25         10.520   11.524   13.120   14.611   16.473   34.382    37.652    40.646    44.314    46.928
26         11.160   12.198   13.844   15.379   17.292   35.563    38.885    41.923    45.642    48.290
27         11.808   12.879   14.573   16.151   18.114   36.741    40.113    43.194    46.963    49.645
28         12.461   13.565   15.308   16.928   18.939   37.916    41.337    44.461    48.278    50.993
29         13.121   14.257   16.047   17.708   19.768   39.087    42.557    45.722    49.588    52.336
30         13.787   14.954   16.791   18.493   20.599   40.256    43.773    46.979    50.892    53.672
40         20.707   22.164   24.433   26.509   29.051   51.805    55.758    59.342    63.691    66.766
50         27.991   29.707   32.357   34.764   37.689   63.167    67.505    71.420    76.154    79.490
60         35.534   37.485   40.482   43.188   46.459   74.397    79.082    83.298    88.379    91.952
70         43.275   45.442   48.758   51.739   55.329   85.527    90.531    95.023    100.425   104.215
80         51.172   53.540   57.153   60.391   64.278   96.578    101.879   106.629   112.329   116.321
90         59.196   61.754   65.647   69.126   73.291   107.565   113.145   118.136   124.116   128.299
100        67.328   70.065   74.222   77.929   82.358   118.498   124.342   129.561   135.807   140.169

Source: Owen, Handbook of Statistical Tables, Table A–4 "Chi-Square Distribution Table," © 1962 by Addison-Wesley Publishing Company, Inc. Copyright renewal © 1990. Reproduced by permission of Pearson Education, Inc.

[Figure: chi-square curve with labels "Area α" and "χ²"]

blu38582_ans_IS1-IS76.qxd 9/28/10 8:24 PM Page 71

Instructor's Section Answers

6. rs = 0.471; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.886; do not reject. There is no significant linear relationship.
7. rs = 0.817; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.700; reject. There is a significant relationship between the number of new releases and the gross receipts.
8. rs = 0.893; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.786; reject. There is a significant relationship between the number of hospitals and the number of nursing homes in a state.
9. rs = 0.048; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.738; do not reject. There is not enough evidence to say that a significant correlation exists between calories and the cholesterol amounts in fast-food sandwiches.
10. rs = 0.8857; H0: ρ = 0 and H1: ρ ≠
0; C.V. = ±0.886. Very close! There is not a significant relationship between the number of books published in 1980 and in 2004 in the same subject area. Since r is not significant, no relationship can be predicted 20 years from now. Even if r is significant, you should not make a prediction for 20 years from now. That would be extrapolating.
11. rs = 0.624; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.700; do not reject. There is no significant relationship between gasoline prices paid to the car rental agency and regular gasoline prices. One would wonder how the car rental agencies determine their prices.
12. rs = 0.714; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.886; do not reject. There is not sufficient evidence to conclude a significant relationship between the number of motor vehicle thefts and burglaries.
13. rs = −0.10; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.900; do not reject. There is no significant relationship between the number of cyber school students and the cost per pupil. In this case, the cost per pupil is different in each district.
14. rs = 0.542; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.643; do not reject. There is no significant relationship between the costs of the drugs.
15. H0: the number of cavities in a person occurs at random and H1: the null hypothesis is not true. There are 21 runs; the expected number of runs is between 10 and 22. Therefore, do not reject the null hypothesis; the number of cavities in a person occurs at random.
16. H0: the numbers occur at random and H1: the null hypothesis is not true. There are 14 runs. Since the expected number of runs is between […] and 20, do not reject. The numbers occur at random.
17. H0: the purchases of soft drinks occur at random and H1: the null hypothesis is not true. There are 16 runs, and the expected number of runs is between […] and 22, so do not reject the null hypothesis. Hence the purchases of soft drinks occur at random.
18. H0: the integers generated by a calculator occur at random and H1: the null hypothesis is not true. There are 13 runs, and the expected number of runs is between […] and 17, so the null hypothesis is not rejected. The integers occur at random.
19. H0: the seating occurs at random and H1: the null hypothesis is not true. There are 14 runs. Since the expected number of runs is between 10 and 23, do not reject. The seating occurs at random.
20. H0: the gender of the shoppers in line at the grocery store is random (claim) and H1: the null hypothesis is not true. There are 10 runs. Since the expected number of runs is between […] and 16, the null hypothesis should not be rejected. There is not enough evidence to reject the hypothesis that the gender of the shoppers in line is random.
21. H0: the number of absences of employees occurs at random over a 30-day period and H1: the null hypothesis is not true. There are only […] runs, and this value does not fall within the 9-to-21 range. Hence, the null hypothesis is rejected; the absences do not occur at random.
22. H0: the days customers are able to ski occur at random (claim) and H1: the null hypothesis is not true. There are […] runs. Since this number is not between […] and 20, the decision is to reject the null hypothesis. There is enough evidence to reject the claim that the days customers are able to ski occur at random.
23. Answers will vary.
24. ±0.28
25. ±0.479
26. ±0.400
27. ±0.215
28. ±0.413

Review Exercises

1. H0: median = 36 years and H1: median ≠ 36 years; z = −0.548; C.V. = ±1.96; do not reject. There is insufficient evidence to conclude that the median differs from 36.
2. H0: median = 40,000 miles (claim) and H1: median ≠ 40,000 miles; z = −0.913; C.V. = ±1.96; do not reject. There is not enough evidence to reject the claim that the median is 40,000 miles.
3. H0: there is no difference in prices and H1: there is a difference in prices; test value = 1; C.V. = 0; do not reject. There is insufficient evidence to conclude a difference in prices. Comments: Examine what affects the result of this test.
4. H0: there is no difference in the record high temperatures of the two cities and H1: there is a difference in the record high temperatures of the two cities (claim); z = −1.24; P-value = 0.2150; do not reject. There is not enough evidence to support the claim that there is a difference in the record high temperatures of the two cities.
5. H0: there is no difference in the hours worked and H1: there is a difference in the hours worked; R = 85; μR = 110; σR = 14.2009; z = −1.76; C.V. = ±1.645; reject. There is sufficient evidence to conclude a difference in the hours worked. C.V. = ±1.96; do not reject.
6. H0: the additive did not improve the gas mileage and H1: the additive did improve the gas mileage (claim); C.V. = 14; ws = 14; reject. There is enough evidence to support the claim that the additive improved the gas mileage.
7. H0: there is no difference in the amount spent and H1: there is a difference in the amount spent; ws = 1; C.V. = 2; reject. There is sufficient evidence of a difference in amount spent at the 0.05 level of significance.
8. H0: there is no difference in the breaking strengths of the ropes and H1: there is a difference in the breaking strengths of the ropes (claim); C.V. = 5.991; H = 28.02; reject. There is enough evidence to support the claim that there is a difference in the breaking strengths of the ropes.
9. H0: there is no difference in beach temperatures and H1: there is a difference in temperatures; H = 15.524; C.V. = 7.815; reject. There is sufficient evidence to conclude a difference in beach temperatures. (Without the Southern Pacific: H = 3.661; C.V. = 5.991; do not reject.)
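As a check on the Wilcoxon rank sum answer above (R = 85, μR = 110, σR = 14.2009, z = −1.76), the Chapter 13 formulas μR = n1(n1 + n2 + 1)/2 and σR = √(n1·n2(n1 + n2 + 1)/12) can be evaluated directly. The sketch below is illustrative only. The sample sizes n1 = 10 and n2 = 11 are an assumption: the answer does not state them, and they are used here because they reproduce the reported μR and σR. The function name is hypothetical.

```python
import math


def wilcoxon_rank_sum_z(r, n1, n2):
    """Return (mu_R, sigma_R, z) for the Wilcoxon rank sum test.

    r  : sum of the ranks for the smaller sample
    n1 : smaller sample size, n2 : larger sample size
    """
    mu_r = n1 * (n1 + n2 + 1) / 2
    sigma_r = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (r - mu_r) / sigma_r
    return mu_r, sigma_r, z


# Assumed sample sizes (not given in the answer key): n1 = 10, n2 = 11.
mu_r, sigma_r, z = wilcoxon_rank_sum_z(85, 10, 11)
print(round(mu_r, 4), round(sigma_r, 4), round(z, 2))
```

With these assumed sizes the three printed values agree with the answer key after rounding. Comparing |z| with the critical value is then a separate step, as in the traditional method.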
10. rs = 0.933; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.700; reject. There is a significant relationship between the rankings.
11. rs = 0.891; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.648; reject. There is a significant relationship in the average number of people who are watching the television shows for both years.
12. H0: the books are arranged at random and H1: the null hypothesis is not true. There are 12 runs. Since the expected number of runs is between 10 and 22, do not reject. The books are arranged at random.
13. H0: the grades of students who finish the exam occur at random and H1: the null hypothesis is not true. Since there are […] runs and this value does not fall in the 9-to-21 interval, the null hypothesis is rejected. The grades do not occur at random.

Chapter Quiz

1. False
2. False
3. True
4. True
5. a
6. c
7. d
8. b
9. Nonparametric
10. Nominal, ordinal
11. Sign
12. Sensitive
13. H0: median = $177,500; H1: median ≠ $177,500 (claim); C.V. = 2; test value = 3; do not reject. There is not enough evidence to say that the median is not $177,500.
14. H0: median = 1200 (claim) and H1: median ≠ 1200. There are 10 minus signs. Do not reject since 10 is greater than the critical value. There is not enough evidence to reject the claim that the median is 1200.
15. H0: there will be no change in the weight of the turkeys after the special diet and H1: the turkeys will weigh more after the special diet (claim). There is 1 plus sign; hence, the null hypothesis is rejected. There is enough evidence to support the claim that the turkeys gained weight on the special diet.
16. H0: there is no difference in the amounts of money received by the teams and H1: there is a difference in the amounts of money each team received; C.V. = ±1.96; z = −0.79; do not reject. There is not enough evidence to say that the amounts differ.
17. H0: the distributions are the same and H1: the distributions are different (claim); z = −0.14434; C.V. = ±1.65; do not reject the null hypothesis. There is not enough evidence to support the claim that the distributions are different.
18. H0: there is no difference in the GPA of the students before and after the workshop and H1: there is a difference in the GPA of the students before and after the workshop (claim); test statistic = 0; C.V. = 2; reject the null hypothesis. There is enough evidence to support the claim that there is a difference in the GPAs of the students.
19. H0: there is no difference in the amounts of sodium in the three sandwiches and H1: there is a difference in the amounts of sodium in the sandwiches; C.V. = 5.991; H = 11.795; reject. There is enough evidence to conclude that there is a difference in the amounts of sodium in the sandwiches.
20. H0: there is no difference in the reaction times of the monkeys and H1: there is a difference in the reaction times of the monkeys (claim); H = 6.9; 0.025 < P-value < 0.05 (0.032); reject the null hypothesis. There is enough evidence to support the claim that there is a difference in the reaction times of the monkeys.
21. rs = 0.683; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.600; reject. There is enough evidence to say that there is a significant relationship between the drug prices.
22. rs = 0.943; H0: ρ = 0 and H1: ρ ≠ 0; C.V. = ±0.829; reject. There is a significant relationship between the amount of money spent on Head Start and the number of students enrolled in the program.
23. H0: the births of babies occur at random according to gender and H1: the null hypothesis is not true. There are 10 runs, and since this is between […] and 19, the null hypothesis is not rejected. There is not enough evidence to reject the null hypothesis that the gender occurs at random.
24. H0: there is no difference in the rpm of the motors before and after the reconditioning and H1: there is a difference in the rpm of the motors before and after the reconditioning (claim); test statistic = 0; C.V. = 6; do not reject the null hypothesis. There is not enough evidence to support the claim that there is a difference in the rpm of the motors before and after reconditioning.
25. H0: the numbers occur at random and H1: the null hypothesis is not true. There are 16 runs, and since this is between […] and 21, the null hypothesis is not rejected. There is not enough evidence to reject the null hypothesis that the numbers occur at random.

Chapter 14

Exercises 14–1

1. Random, systematic, stratified, cluster.
2. Samples can save the researcher time and money. They are used when the population is large or infinite. They are used when the original units are to be destroyed, such as in testing the breaking strength of ropes.
3. A sample must be randomly selected.
4. Random numbers are used to ensure every element of the population has the same chance of being selected.
5. Talking to people on the street, calling people on the phone, and asking your friends are three incorrect ways of obtaining a sample.
6. Over the long run each digit, 0 through 9, will occur with the same probability.
7. Random sampling has the advantage that each unit of the population has an equal chance of being selected. One disadvantage is that the units of the population must be numbered; if the population is large, this could be somewhat time-consuming.
8. Systematic sampling has an advantage in that once the first unit is selected, each succeeding unit selected has been determined. This saves time. A disadvantage would be if the list of units was arranged in some manner so that a bias would occur, such as selecting all men when the population consists of both men and women.
9. An advantage of stratified sampling is that it ensures representation for the groups used in stratification; however, it is virtually impossible to stratify the population so that all groups are represented.
10. Clusters are easy to use since they already exist, but it is difficult to justify that the clusters actually represent the population.
11–20. Answers will vary.

Exercises 14–2

1. Flaw: biased; it's confusing.
2. Flaw: the purpose of the question is unclear. You could like him personally but not politically.
3. Flaw: the
question is too broad.
4. Flaw: none. The question is good if the respondent knows the mayor's position; otherwise his position needs to be stated.
5. Flaw: confusing words. How many hours did you study for this exam?
6. Possible order problem: ask first, "Do you use artificial sweetener regularly?"
7. Flaw: confusing words. If a plane were to crash on the border of New York and New Jersey, where should the victims be buried?
8. Flaw: none.
9. Answers will vary.
10. Answers will vary.

Exercises 14–3

1. Simulation involves setting up probability experiments that mimic the behavior of real-life events.
2. Answers will vary.
3. John Von Neumann and Stanislaw Ulam.
4. Using the computer to simulate real-life situations can save time, since the computer can generate random numbers and keep track of the outcomes very quickly and easily.
5. The steps are as follows:
   a. List all possible outcomes.
   b. Determine the probability of each outcome.
   c. Set up a correspondence between the outcomes and the random numbers.
   d. Conduct the experiment by using random numbers.
   e. Repeat the experiment and tally the outcomes.
   f. Compute any statistics and state the conclusions.
6. Random numbers can be used to ensure the outcomes occur with appropriate probability.
7. When the repetitions increase, there is a higher probability that the simulation will yield more precise answers.
8. Use a table of random numbers. Select 40 random numbers. Numbers 01 through 16 mean the person is foreign-born.
9. Use three-digit random numbers; numbers 001 through 681 mean that the mother is in the labor force.
10. Select two-digit random numbers in groups of […]. For one person, 01 through 70 means a success. For the other person, 01 through 75 means a success.
11. Select 100 two-digit random numbers. Numbers 00 to 34 mean the household has at least one set with premium cable service. Numbers 35 to 99 mean the household does not have the service.
12. Use the odd digits to represent a match and the even digits to represent a nonmatch.
13. Let an odd number represent heads and an even number represent tails. Then each person selects a digit at random.
14–24. Answers will vary.

Review Exercises

1–8. Answers will vary.
9. Use one-digit random numbers […] through […] for a strikeout and […] through […] and […] represent anything other than a strikeout.
10. Use two-digit random numbers: 01 through 15 represent an overbooked plane, and 16 through 99 and 00 represent a plane that is not overbooked.
11. In this case, a one-digit random number is selected. Numbers 1 through 6 represent the numbers on the face. Ignore 7, 8, 9, and 0 and select another number.
12. The first person selects a two-digit random number. Any two-digit random number that has a 7, 8, 9, or 0 is ignored, and another random number is selected. Player […] selects a one-digit random number; any random number that is not […] through […] is ignored, and another one is selected.
13. Let the digits […] through […] represent rock, let […] through […] represent paper, let […] through […] represent scissors, and omit […].
14–18. Answers will vary.
19. Flaw: asking a biased question. Have you ever driven through a red light?
20. Flaw: using a double negative. Do you think students who are not failing should be given tutoring if they request it?
21. Flaw: asking a double-barreled question. Do you think all automobiles should have heavy-duty bumpers?
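The simulation steps listed in the Exercises 14–3 answers (list the outcomes, assign probabilities, map outcomes to random numbers, run the experiment, repeat and tally, summarize) can be sketched in a few lines of Python. This is an illustrative sketch, not the textbook's procedure. It borrows the premium-cable mapping from the answers above, where two-digit numbers 00 to 34 mean the household has the service and 35 to 99 mean it does not. The function name, the fixed seed, and the use of a pseudorandom generator in place of a printed random-number table are assumptions.

```python
import random


def simulate_premium_cable(trials=100, seed=0):
    """Tally simulated households using two-digit random numbers 00-99."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    tally = {"has_service": 0, "no_service": 0}
    for _ in range(trials):
        number = rng.randint(0, 99)  # one two-digit random number
        if number <= 34:             # 00 to 34: has premium cable service
            tally["has_service"] += 1
        else:                        # 35 to 99: does not have the service
            tally["no_service"] += 1
    return tally


tally = simulate_premium_cable()
proportion = tally["has_service"] / sum(tally.values())
print(tally, proportion)
```

Increasing `trials` should, in the long run, bring the simulated proportion closer to the 0.35 implied by the mapping, which is the point the answers make about repetitions.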
22. Answers will vary.

Chapter Quiz
True
True
False
True
a
c
c
Larger
Biased
10. Cluster
11–14. Answers will vary.
15. Use two-digit random numbers: 01 through 45 means the player wins. Any other two-digit random number means the player loses.
16. Use two-digit random numbers: 01 through 05 means a cancellation. Any other two-digit random number means the person shows up.
17. The random numbers 01 through 10 represent the 10 cards in hearts. The random numbers 11 through 20 represent the 10 cards in diamonds. The random numbers 21 through 30 represent the 10 spades, and 31 through 40 represent the 10 clubs. Any number over 40 is ignored.
18. Use two-digit random numbers to represent the spots on the face of the dice. Ignore any two-digit random numbers with 7, 8, 9, or For cards, use two-digit random numbers between 01 and 13.
19. Use two-digit random numbers. The first digit represents the first player, and the second digit represents the second player. If both numbers are odd or even, player wins. If a digit is odd and the other digit is even, player wins.
20–24. Answers will vary.

Appendix A
A–1   362,880
A–2   5040
A–3   120
A–4
A–5
A–6
A–7   1320
A–8   1,814,400
A–9   20
A–10  7920
A–11  126
A–12  120
A–13  70
A–14  455
A–15
A–16  10
A–17  560
A–18  1980
A–19  2520
A–20  90
A–21  121; 2181; 14,641; 716.9
A–22  56; 550; 3136; 158
A–23  32; 258; 1024; 53.2
A–24  150; 4270; 22,500; 1457.5
A–25  328; 22,678; 107,584; 1161.2
A–26  829; 123,125; 687,241; 8584.8333
A–27  693; 50,511; 480,249; 2486.1
A–28  409; 40,333; 167,281; 6876.80
A–29  318; 20,150; 101,124; 3296
A–30  –20; 778; 400; 711.3334
A–31 through A–35  [Graphs of plotted points. Labeled points: (1, 6), (3, 2), (0, 5), (–1, –2), (–7, 8), (3, 6), (10, 3), (–2, 4), (6, 3), (8, 0).]
A–36 through A–40  [Graphs of lines, each drawn through two labeled points: y = 5 + 2x through (0, 5) and (–1, 3); y = –1 + x through (0, –1) and (1, 0); y = 3 + 4x through (0, 3) and (1, 7); y = 4 – 3x through (0, 4) and (1, 1); y = –2 – 2x through (0, –2) and (–2, 2).]

Appendix B–2
B–1   0.65
B–2   0.579
B–3   0.653
B–4   0.005
B–5   0.379
B–6   0.585
B–7
B–8
B–9   0.64
B–10  0.467
B–11  0.857
B–12  0.33

Index

A
Addition rules, 199–204
Adjusted R², 579–580
Alpha, 406
Alternate approach to standard normal distribution, 765–768
Alternative hypotheses, 401
Algebra review, 753–757
Analysis of variance (ANOVA), 631–662
  assumptions, 631–650
  between-group variance, 631
  degrees of freedom, 632, 649
  F-test, 633
  hypotheses, 631, 648–649
  one-way, 631–637
  summary table, 633, 651
  two-way, 647–655
  within-group variance, 631
Assumptions for the use of chi-square test, 448, 594, 613
Assumptions for valid predictions in regression, 556
Averages, 105–116
  properties and uses, 116

B
Bar graph, 69–70
Bayes’ theorem, 761–764
Bell curve, 301
Beta, 406, 459
Between-group variance, 631
Biased sample, 721
Bimodal, 60, 111
Binomial distribution, 271–276
  characteristics, 271
  mean for, 274
  normal approximation, 340–346
  notation, 271
  standard deviation, 274
  variance, 274
Binomial experiment, 271
Binomial probability formula, 271
Boundaries,
Boundaries, class, 39
Boxplot, 162

C
Categorical frequency distribution, 38–39
Census,
Central limit theorem, 331–338
Chebyshev’s theorem, 134–136
Chi-square
  assumptions, 448, 594, 613
  contingency table, 606–607
  degrees of freedom, 386
  distribution, 386–388
  goodness-of-fit test, 593–598
  independence test, 606–611
  use in H-test, 694
  variance test, 447–453
  Yates correction for, 613, 617
Class, 37
  boundaries, 39
  limits, 39
  midpoint, 40
  width, 39–40
Classical probability, 186–191
Cluster sample, 12, 728
Coefficient of determination, 569
Coefficient of nondetermination, 569
Coefficient of variation, 132–133
Combination, 229–232
Combination rule, 230
Complementary events, 189–190
Complement of an event, 189
Compound event, 186
Conditional probability, 213, 216–218
Confidence interval, 358
  hypothesis testing, 457–459
  mean, 358–373
  means, difference of, 478, 486, 499
  median, 672
  proportion, 377–379
  proportions, differences, 508–509
  variances and standard deviations, 385–390
Confidence level, 358
Confounding variable, 15
Consistent estimator, 357
Contingency coefficient, 617
Contingency table, 606–607
Continuous variable, 6–7, 253, 300
Control group, 14
Convenience sample, 12–13
Correction factor for continuity, 342
Correlation, 534, 538–547
Correlation coefficient, 539
  multiple, 578
  Pearson’s product moment, 539
  population, 543
  Spearman’s rank, 700–702
Critical region, 406
Critical value, 406
Cumulative frequency, 54
Cumulative frequency distribution, 42–43
Cumulative frequency graph, 54–56
Cumulative relative frequency, 57–58

D
Data,
Data array, 109
Data set,
Data value (datum),
Deciles, 151
Degrees of freedom, 370
Dependent events, 213
Dependent samples, 492
Dependent variable, 14, 535
Descriptive statistics,
Difference between two means, 473–479, 484–487, 492–499
  assumptions for the test to determine, 473, 486, 493
  proportions, 504–509
Discrete probability distribution, 254
Discrete variable, 6, 253
Disordinal interaction, 653
Distribution-free statistics (nonparametric), 672
Distributions
  bell-shaped, 59, 301
  bimodal, 60, 111
  binomial, 270–276
  chi-square, 386–388
  F, 513
  frequency, 37
  hypergeometric, 286–289
  multinomial, 283–284
  negatively skewed, 60, 117, 301
  normal, 302–311
  Poisson, 284–286
  positively skewed, 60, 117, 301
  probability, 253–258
  sampling, 331–333
  standard normal, 304
  symmetrical, 59, 117, 301
Double sampling, 729

E
Empirical probability, 191–193
Empirical rule, 136
Equally likely events, 186
Estimation, 356
Estimator, properties of a good, 357
Event, simple, 185
Events
  complementary, 189–190
  compound, 189
  dependent, 213
  equally likely, 186
  independent, 211
  mutually exclusive, 199–200
Expectation, 264–266
Expected frequency, 593
Expected value, 264
Experimental study, 14
Explained variation, 566
Explanatory variable, 14
Exploratory data analysis (EDA), 162–165
Extrapolation, 556

F
Factorial notation, 227
Factors, 647
F-distribution, characteristics of, 513
Finite population correction factor, 337
Five-number summary, 162
Frequency, 37
Frequency distribution, 37
  categorical, 38–39
  grouped, 39–42
  reasons for, 45
  rules for constructing, 41–42
  ungrouped, 43
Frequency polygon, 53–54
F-test, 513–519, 631
  comparing three or more means, 633–636
  comparing two variances, 513–519
  notes for the use of, 516
Fundamental counting rule, 224–227

G
Gallup poll, 720
Gaussian distribution, 301
Geometric mean, 122
Goodness-of-fit test, 593–598
Grand mean, 632
Grouped frequency distribution, 39–42

H
Harmonic mean, 121
Hawthorne effect, 15
Hinges, 165
Histogram, 51–53
Homogeneity of proportions, 611–614
Homoscedasticity assumption, 568
Hypergeometric distribution, 286–288
Hypothesis, 4, 401
Hypothesis testing, 4, 400–404
  alternative, 401
  common phrases, 402
  critical region, 406
  critical value, 406
  definitions, 401
  level of significance, 406
  noncritical region, 406
  null, 401
  one-tailed test, 406
  P-value method, 418–421
  research, 402
  statistical, 401
  statistical test, 404
  test value, 404
  traditional method, steps in, 411
  two-tailed test, 402, 408
  types of errors, 404–405

I
Independence test (chi-square), 606–611
Independent events, 211
Independent samples,
Independent variables, 14, 535, 647
Inferential statistics, 484
Influential observation or point, 557
Interaction effect, 648
Intercept (y), 552–555
Interquartile range (IQR), 151, 162
Interval estimate, 358
Interval level of measurement,

K
Kruskal-Wallis test, 693–696

L
Law of large numbers, 193–194
Left-tailed test, 402, 406
Level of significance, 406
Levels of measurement, 7–8
  interval,
  nominal,
  ordinal, 7–8
  ratio,
Limits, class, 39
Line of best fit, 551–552
Lower class boundary, 39
Lower class limit, 39
Lurking variable, 547

M
Main effects, 649
Marginal change, 555
Margin of error, 359
Mean, 106–108
  binomial variable, 274
  definition, 106
  population, 106
  probability distribution, 259–261
  sample, 106
Mean deviation, 141
Mean square, 633
Measurement, levels of, 7–8
Measurement scales, 7–8
Measures of average, uses of, 116
Measures of dispersion, 123–132
Measures of position, 142–151
Measures of variation, 123–134
Measures of variation and standard deviation, uses of, 132
Median, 109–111
  confidence interval for, 672
  defined, 109
  for grouped data, 122
Midquartile, 155
Midrange, 115
Misleading graphs, 18, 76–80
Modal class, 112
Mode, 111–114
Modified box plot, 165, 168
Monte Carlo method, 739–744
Multimodal, 111
Multinomial distribution, 283–284
Multiple correlation coefficient, 578
Multiple regression, 535, 575–580
Multiple relationships, 535, 575–580
Multiplication rules
  probability, 211–216
Multistage sampling, 729
Mutually exclusive events, 199–200

N
Negatively skewed distribution, 117, 301
Negative linear relationship, 535, 539
Nielsen television ratings, 720
Nominal level of measurement,
Noncritical region, 406
Nonparametric statistics, 672–710
  advantages, 673
  disadvantages, 673
Nonrejection region, 406
Nonresistant statistic, 165
Normal approximation to binomial distribution, 340–346
Normal distribution, 302–311
  applications of, 316–321
  approximation to the binomial distribution, 340–346
  areas under, 305–307
  formula for, 304
  probability distribution as a, 307–309
  properties of, 303
  standard, 304
Normal quantile plot, 324, 328–330
Normally distributed variables, 300–302
Notation for the binomial distribution, 271
Null hypothesis, 401

O
Observational study, 13–14
Observed frequency, 593
Odds, 199
Ogive, 54–56
One-tailed test, 406
  left, 406
  right, 406
One-way analysis of variance, 631–637
Open-ended distribution, 41
Ordinal interaction, 653
Ordinal level of measurement, 7–8
Outcome, 183
Outcome variable, 14
Outliers, 60, 113, 151–153, 322

P
Paired-sample sign test, 677–679
Parameter, 106
Parametric tests, 672
Pareto chart, 70–71
Pearson coefficient of skewness, 141, 322–324
Pearson product moment correlation coefficient, 539
Percentiles, 143–149
Permutation, 227–229
Permutation rule, 228
Pie graph, 73–76
Point estimate, 357
Poisson distribution, 284–286
Pooled estimate of variance, 487
Population, 4, 721
Positively skewed distribution, 117, 301
Positive linear relationship, 535, 539
Power of a test, 459–460
Practical significance, 421
Prediction interval, 572–573
Probability, 4, 182
  addition rules, 199–204
  at least, 218–219
  binomial, 270–276
  classical, 186–191
  complementary rules, 190
  conditional, 213, 216–218
  counting rules, 237–239
  distribution, 253–258
  empirical, 191–193
  experiment, 183
  multiplication rules, 211–216
  subjective, 194
Properties of the distribution of sample means, 331
Proportion, 377, 437
P-value, 418
  for F test, 518
  method for hypothesis testing, 418–421
  for t test, 430–432
  for χ² test, 451–453

Q
Quadratic mean, 122
Qualitative variables,
Quantitative variables,
Quantile plot, 324, 328–330
Quartiles, 149–151
Quasi-experimental study, 14
Questionnaire design, 736–738

R
Random numbers, 11, 722–725
Random samples, 10, 721–725
Random sampling, 10–11, 721–725
Random variable, 3, 253
Range, 41, 124–125
Range rule of thumb, 133
Rank correlation, Spearman’s, 700–702
Ranking, 673–674
Ratio level of measurement,
Raw data, 37
Regression, 534, 551–558
  assumptions for valid prediction, 556
  multiple, 535, 575–580
Regression line, 551
  equation, 552–556
  intercept, 552–554
  line of best fit, 551–552
  prediction, 535
  slope, 552–553
Rejection region, 406
Relationships, 4–5, 535
Relative frequency graphs, 56–58
Relatively efficient estimator, 357
Requirements for a probability distribution, 257
Research hypothesis, 402
Research report, 759
Residual, 567–568
Residual plot, 568–569
Resistant statistic, 165
Right-tailed test, 402–406
Robust, 357
Run, 703
Runs test, 702–706

S
Sample, 4, 721
  biased, 721
  cluster, 12, 728
  convenience, 12–13
  random, 10, 721–725
  size for estimating means, 363–365
  size for estimating proportions, 379–381
  stratified, 12, 726–728
  systematic, 11–12, 725–726
  unbiased, 721
Sample space, 183
Sampling, 10–13, 721–730
  distribution of sample means, 331–333
  double, 729
  error, 331
  multistage, 729
  random, 10–11, 721–725
  sequence, 729
Scatter plot, 535–538
Scheffé test, 642, 643
Sequence sampling, 729
Short-cut formula for variance and standard deviation, 129
Significance, level of, 406
Sign test, 675–677
  test value for, 675
Simple event, 185
Simple relationship, 535
Simulation technique, 739
Single sample sign test, 675–677
Skewness, 59–60, 301–302
Slope, 552–553
Spearman rank correlation coefficient, 700–702
Standard deviation, 125–132
  binomial distribution, 274
  definition, 127
  formula, 127
  population, 127
  sample, 128
  uses of, 132
Standard error of difference between means, 474
Standard error of difference between proportions, 505
Standard error of the estimate, 570–572
Standard error of the mean, 333
Standard normal distribution, 304
Standard score, 142–143
Statistic, 106
Statistical hypothesis, 401
Statistical test, 406
Statistics,
  descriptive,
  inferential,
  misuses of, 16–19
Stem and leaf plot, 80–83
Stratified sample, 12, 726–728
Student’s t distribution, 370
Subjective probability, 194
Sum of squares, 633
Surveys, 9–10, 736–738
  mail, 9–10
  personal interviews, 10
  telephone,
Symmetrical distribution, 59, 117, 301
Systematic
sampling, 11–12, 725–726

T
t-distribution, characteristics of, 370
Test of normality, 322–324, 328–330, 598–600
Test value, 404
Time series graph, 71–73
Total variation, 566
Treatment groups, 14, 648
Tree diagram, 185, 215, 225–226
t-test, 427
  coefficient for correlation, 543–545
  for difference of means, 484–487, 492–500
  for mean, 427–433
Tukey test, 644–645
Two-tailed test, 402, 408
Two-way analysis of variance, 647–655
Type I error, 405–406, 459–460
Type II error, 405–406, 459–460

U
Unbiased estimate of population variance, 128
Unbiased estimator, 357
Unbiased sample, 721
Unexplained variation, 566
Ungrouped frequency distribution, 43–44
Uniform distribution, 60, 310
Unimodal, 60, 111
Upper class boundary, 39
Upper class limit, 39

V
Variable, 3, 253, 535
  confounding, 15
  continuous, 6–7, 253, 300
  dependent, 14, 535
  discrete, 6, 253
  explanatory, 14
  independent, 14, 535
  qualitative,
  quantitative,
  random, 3, 253
Variance, 125–132
  binomial distribution, 274
  definition of, 127
  formula, 127
  population, 127
  probability distribution, 262–264
  sample, 128
  short-cut formula, 129
  unbiased estimate, 128
  uses of, 132
Variances
  equal, 513–514
  unequal, 513–514
Venn diagram, 190–191, 203, 218

W
Weighted estimate of p, 505
Weighted mean, 115
Wilcoxon rank sum test, 683–686
Wilcoxon signed-rank test, 688–692
Within-group variance, 631

Y
Yates correction for continuity, 613, 617
y-intercept, 552–555

Z
z-score, 142–143
z-test, 413
z-test for means, 413–421, 473–479
z-test for proportions, 437–441, 504–508
z-values (score), 304

Table F  The t Distribution

         Confidence intervals   80%      90%      95%      98%      99%
         One tail, α            0.10     0.05     0.025    0.01     0.005
d.f.     Two tails, α           0.20     0.10     0.05     0.02     0.01
1                               3.078    6.314    12.706   31.821   63.657
2                               1.886    2.920    4.303    6.965    9.925
3                               1.638    2.353    3.182    4.541    5.841
4                               1.533    2.132    2.776    3.747    4.604
5                               1.476    2.015    2.571    3.365    4.032
6                               1.440    1.943    2.447    3.143    3.707
7                               1.415    1.895    2.365    2.998    3.499
8                               1.397    1.860    2.306    2.896    3.355
9                               1.383    1.833    2.262    2.821    3.250
10                              1.372    1.812    2.228    2.764    3.169
11                              1.363    1.796    2.201    2.718    3.106
12                              1.356    1.782    2.179    2.681    3.055
13                              1.350    1.771    2.160    2.650    3.012
14                              1.345    1.761    2.145    2.624    2.977
15                              1.341    1.753    2.131    2.602    2.947
16                              1.337    1.746    2.120    2.583    2.921
17                              1.333    1.740    2.110    2.567    2.898
18                              1.330    1.734    2.101    2.552    2.878
19                              1.328    1.729    2.093    2.539    2.861
20                              1.325    1.725    2.086    2.528    2.845
21                              1.323    1.721    2.080    2.518    2.831
22                              1.321    1.717    2.074    2.508    2.819
23                              1.319    1.714    2.069    2.500    2.807
24                              1.318    1.711    2.064    2.492    2.797
25                              1.316    1.708    2.060    2.485    2.787
26                              1.315    1.706    2.056    2.479    2.779
27                              1.314    1.703    2.052    2.473    2.771
28                              1.313    1.701    2.048    2.467    2.763
29                              1.311    1.699    2.045    2.462    2.756
30                              1.310    1.697    2.042    2.457    2.750
32                              1.309    1.694    2.037    2.449    2.738
34                              1.307    1.691    2.032    2.441    2.728
36                              1.306    1.688    2.028    2.434    2.719
38                              1.304    1.686    2.024    2.429    2.712
40                              1.303    1.684    2.021    2.423    2.704
45                              1.301    1.679    2.014    2.412    2.690
50                              1.299    1.676    2.009    2.403    2.678
55                              1.297    1.673    2.004    2.396    2.668
60                              1.296    1.671    2.000    2.390    2.660
65                              1.295    1.669    1.997    2.385    2.654
70                              1.294    1.667    1.994    2.381    2.648
75                              1.293    1.665    1.992    2.377    2.643
80                              1.292    1.664    1.990    2.374    2.639
90                              1.291    1.662    1.987    2.368    2.632
100                             1.290    1.660    1.984    2.364    2.626
500                             1.283    1.648    1.965    2.334    2.586
1000                            1.282    1.646    1.962    2.330    2.581
(z) ∞                           1.282a   1.645b   1.960    2.326c   2.576d

a This value has been rounded to 1.28 in the textbook.
b This value has been rounded to 1.65 in the textbook.
c This value has been rounded to 2.33 in the textbook.
d This value has been rounded to 2.58 in the textbook.
Source: Adapted from W. H. Beyer, Handbook of Tables for Probability and Statistics, 2nd ed., CRC Press, Boca Raton, Fla., 1986. Reprinted with permission.
[Figures: one-tailed and two-tailed areas under the t curve, marked at t, –t, and +t.]

Glossary of Symbols

a         y intercept of a line
α         Probability of a type I error
b         Slope of a line
β         Probability of a type II error
C         Column frequency
cf        Cumulative frequency
nCr       Number of combinations of n objects taking r objects at a time
C.V.      Critical value
CVar      Coefficient of variation
D         Difference; decile
D̄         Mean of the differences
d.f.      Degrees of freedom
d.f.N.    Degrees of freedom, numerator
d.f.D.    Degrees of freedom, denominator
E         Event; expected frequency; maximum error of estimate
Ē         Complement of an event
e         Euler’s constant ≈ 2.7183
E(X)      Expected value
f         Frequency
F         F test value; failure
F′        Critical value for the Scheffé test
FS        Scheffé test value
GM        Geometric mean
H         Kruskal-Wallis test value
H0        Null hypothesis
H1        Alternative hypothesis
HM        Harmonic mean
k         Number of samples
λ         Number of occurrences for the Poisson distribution
MD        Median
MR        Midrange
MSB       Mean square between groups
MSW       Mean square within groups (error)
n         Sample size
N         Population size
n(E)      Number of ways E can occur
n(S)      Number of outcomes in the sample space
O         Observed frequency
P         Percentile; probability
p         Probability; population proportion
p̂         Sample proportion
p̄         Weighted estimate of p
P(B|A)    Conditional probability
P(E)      Probability of an event E
P(Ē)      Probability of the complement of E
nPr       Number of permutations of n objects taking r objects at a time
π         Pi ≈ 3.14
Q         Quartile
q         1 – p; test value for Tukey test
q̂         1 – p̂
q̄         1 – p̄
R         Range; rank sum
rS        Spearman rank correlation coefficient
S         Sample space; success
s         Sample standard deviation
s²        Sample variance
σ         Population standard deviation
σ²        Population variance
sD        Standard deviation of the differences
σX̄        Standard error of the mean
sest      Standard error of estimate
Σ         Summation notation
SSB       Sum of squares between groups
SSW       Sum of squares within groups
sB²       Between-group variance
sW²       Within-group variance
t         t test value
tα/2      Two-tailed t critical value
μ         Population mean
μD        Mean of the population differences
μX̄        Mean of the sample means
w         Class width; weight
ws        Smaller sum of signed ranks, Wilcoxon signed-rank test
X         Data value; number of successes for a binomial distribution
X̄         Sample mean
x         Independent variable in regression
X̄GM       Grand mean
Xm        Midpoint of a class
χ²        Chi-square
y         Dependent variable in regression
y′        Predicted y value
r         Sample correlation coefficient
R         Multiple correlation coefficient
r²        Coefficient of determination
ρ         Population correlation coefficient
z         z test value or z score
zα/2      Two-tailed critical z value
!         Factorial

Elementary Statistics: A Step by Step Approach
Allan G. Bluman, Professor Emeritus, Community College of Allegheny County

Library of Congress Cataloging-in-Publication Data
Bluman, Allan G. Elementary statistics : a step by step approach / Allan Bluman. 8th ed. p. cm. Includes bibliographical references and index...

Preface
Approach
Elementary Statistics: A Step by Step Approach was written as an aid in the beginning statistics course to students whose mathematical