Ebook Elementary statistics: A step by step approach (Eighth edition) - Part 1 presents the following content: Chapter 1 - the nature of probability and statistics, chapter 2 - frequency distributions and graphs, chapter 3 - data description, chapter 4 - probability and counting rules, chapter 5 - discrete probability distributions, chapter 6 - the normal distribution, chapter 7 - confidence intervals and sample size.
Important Formulas

Chapter 3 Data Description
Mean for individual data: $\bar{X} = \dfrac{\sum X}{n}$
Mean for grouped data: $\bar{X} = \dfrac{\sum f \cdot X_m}{n}$
Standard deviation for a sample: $s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$ or $s = \sqrt{\dfrac{n(\sum X^2) - (\sum X)^2}{n(n - 1)}}$ (shortcut formula)
Standard deviation for grouped data: $s = \sqrt{\dfrac{n(\sum f \cdot X_m^2) - (\sum f \cdot X_m)^2}{n(n - 1)}}$
Range rule of thumb: $s \approx \dfrac{\text{range}}{4}$

Chapter 4 Probability and Counting Rules
Addition rule (mutually exclusive events): $P(A \text{ or } B) = P(A) + P(B)$
Addition rule (events not mutually exclusive): $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
Multiplication rule (independent events): $P(A \text{ and } B) = P(A) \cdot P(B)$
Multiplication rule (dependent events): $P(A \text{ and } B) = P(A) \cdot P(B \mid A)$
Conditional probability: $P(B \mid A) = \dfrac{P(A \text{ and } B)}{P(A)}$
Complementary events: $P(\bar{E}) = 1 - P(E)$
Fundamental counting rule: Total number of outcomes of a sequence when each event has a different number of possibilities: $k_1 \cdot k_2 \cdot k_3 \cdots k_n$
Permutation rule: Number of permutations of n objects taking r at a time is ${}_nP_r = \dfrac{n!}{(n - r)!}$
Combination rule: Number of combinations of r objects selected from n objects is ${}_nC_r = \dfrac{n!}{(n - r)!\,r!}$

Chapter 5 Discrete Probability Distributions
Mean for a probability distribution: $\mu = \sum [X \cdot P(X)]$
Variance and standard deviation for a probability distribution: $\sigma^2 = \sum [X^2 \cdot P(X)] - \mu^2$ and $\sigma = \sqrt{\sum [X^2 \cdot P(X)] - \mu^2}$
Expectation: $E(X) = \sum [X \cdot P(X)]$
Binomial probability: $P(X) = \dfrac{n!}{(n - X)!\,X!} \cdot p^X \cdot q^{n - X}$
Mean for binomial distribution: $\mu = n \cdot p$
Variance and standard deviation for the binomial distribution: $\sigma^2 = n \cdot p \cdot q$ and $\sigma = \sqrt{n \cdot p \cdot q}$
Multinomial probability: $P(X) = \dfrac{n!}{X_1!\,X_2!\,X_3! \cdots X_k!} \cdot p_1^{X_1} \cdot p_2^{X_2} \cdot p_3^{X_3} \cdots p_k^{X_k}$
Poisson probability: $P(X; \lambda) = \dfrac{e^{-\lambda} \lambda^X}{X!}$ where $X = 0, 1, 2, \ldots$
Hypergeometric probability: $P(X) = \dfrac{{}_aC_X \cdot {}_bC_{n - X}}{{}_{a+b}C_n}$

Chapter 6 The Normal Distribution
Standard score: $z = \dfrac{X - \mu}{\sigma}$ or $z = \dfrac{X - \bar{X}}{s}$
Mean of sample means: $\mu_{\bar{X}} = \mu$
Standard error of the mean: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
Central limit theorem formula: $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Chapter 7 Confidence Intervals and Sample Size
z confidence interval for means: $\bar{X} - z_{\alpha/2}\left(\dfrac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\dfrac{\sigma}{\sqrt{n}}\right)$
t confidence interval for means: $\bar{X} - t_{\alpha/2}\left(\dfrac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2}\left(\dfrac{s}{\sqrt{n}}\right)$
Sample size for means: $n = \left(\dfrac{z_{\alpha/2} \cdot \sigma}{E}\right)^2$ where E is the maximum error of estimate
Confidence interval for a proportion: $\hat{p} - z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
Sample size for a proportion: $n = \hat{p}\hat{q}\left(\dfrac{z_{\alpha/2}}{E}\right)^2$ where $\hat{p} = \dfrac{X}{n}$ and $\hat{q} = 1 - \hat{p}$
Confidence interval for variance: $\dfrac{(n - 1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \dfrac{(n - 1)s^2}{\chi^2_{\text{left}}}$
Confidence interval for standard deviation: $\sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\text{left}}}}$

Chapter 8 Hypothesis Testing
z test: $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ for any value n. If n < 30, population must be normally distributed.
t test: $t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$ (d.f. = n − 1)
z test for proportions: $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Chi-square test for a single variance: $\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2}$ (d.f. = n − 1)

Chapter 9 Testing the Difference Between Two Means, Two Proportions, and Two Variances
z test for comparing two means (independent samples): $z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Formula for the confidence interval for difference of two means (large samples): $(\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
t test for comparing two means (independent samples, variances not equal): $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$ (d.f. = the smaller of $n_1 - 1$ or $n_2 - 1$)
Formula for the confidence interval for difference of two means (small independent samples, variance unequal): $(\bar{X}_1 - \bar{X}_2) - t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$ (d.f. = smaller of $n_1 - 1$ and $n_2 - 1$)
t test for comparing two means for dependent samples: $t = \dfrac{\bar{D} - \mu_D}{s_D / \sqrt{n}}$ where $\bar{D} = \dfrac{\sum D}{n}$ and $s_D = \sqrt{\dfrac{n\sum D^2 - (\sum D)^2}{n(n - 1)}}$ (d.f. = n − 1)
Formula for confidence interval for the mean of the difference for dependent samples: $\bar{D} - t_{\alpha/2}\dfrac{s_D}{\sqrt{n}} < \mu_D < \bar{D} + t_{\alpha/2}\dfrac{s_D}{\sqrt{n}}$ (d.f. = n − 1)
z test for comparing two proportions: $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$ where $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$, $\bar{q} = 1 - \bar{p}$, $\hat{p}_1 = \dfrac{X_1}{n_1}$, $\hat{p}_2 = \dfrac{X_2}{n_2}$
Formula for the confidence interval for the difference of two proportions: $(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$
F test for comparing two variances: $F = \dfrac{s_1^2}{s_2^2}$ where $s_1^2$ is the larger variance and d.f.N = $n_1 - 1$, d.f.D = $n_2 - 1$

Chapter 10 Correlation and Regression
Correlation coefficient: $r = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{[n(\sum x^2) - (\sum x)^2][n(\sum y^2) - (\sum y)^2]}}$
t test for correlation coefficient: $t = r\sqrt{\dfrac{n - 2}{1 - r^2}}$ (d.f. = n − 2)
The regression line equation: $y' = a + bx$ where $a = \dfrac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$ and $b = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$
Coefficient of determination: $r^2 = \dfrac{\text{explained variation}}{\text{total variation}}$
Standard error of estimate: $s_{\text{est}} = \sqrt{\dfrac{\sum y^2 - a\sum y - b\sum xy}{n - 2}}$
Prediction interval for y: $y' - t_{\alpha/2}\,s_{\text{est}}\sqrt{1 + \dfrac{1}{n} + \dfrac{n(x - \bar{X})^2}{n\sum x^2 - (\sum x)^2}} < y < y' + t_{\alpha/2}\,s_{\text{est}}\sqrt{1 + \dfrac{1}{n} + \dfrac{n(x - \bar{X})^2}{n\sum x^2 - (\sum x)^2}}$ (d.f. = n − 2)
Formula for the multiple correlation coefficient: $R = \sqrt{\dfrac{r_{yx_1}^2 + r_{yx_2}^2 - 2r_{yx_1} \cdot r_{yx_2} \cdot r_{x_1x_2}}{1 - r_{x_1x_2}^2}}$
Formula for the F test for the multiple correlation coefficient: $F = \dfrac{R^2/k}{(1 - R^2)/(n - k - 1)}$ (d.f.N = n − k and d.f.D = n − k − 1)
Formula for the adjusted $R^2$: $R^2_{\text{adj}} = 1 - \left[\dfrac{(1 - R^2)(n - 1)}{n - k - 1}\right]$

Chapter 11 Other Chi-Square Tests
Chi-square test for goodness-of-fit: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ (d.f. = no. of categories − 1)
Chi-square test for independence and homogeneity of proportions: $\chi^2 = \sum \dfrac{(O - E)^2}{E}$ [d.f. = (rows − 1)(col − 1)]

Chapter 12 Analysis of Variance
ANOVA test: $F = \dfrac{s_B^2}{s_W^2}$ where $\bar{X}_{GM} = \dfrac{\sum X}{N}$, d.f.N = k − 1, d.f.D = N − k, $N = n_1 + n_2 + \cdots + n_k$, and k = number of groups
$s_B^2 = \dfrac{\sum n_i(\bar{X}_i - \bar{X}_{GM})^2}{k - 1}$ and $s_W^2 = \dfrac{\sum (n_i - 1)s_i^2}{\sum (n_i - 1)}$
Scheffé test: $F_S = \dfrac{(\bar{X}_i - \bar{X}_j)^2}{s_W^2(1/n_i + 1/n_j)}$ and $F' = (k - 1)(\text{C.V.})$
Tukey test: $q = \dfrac{\bar{X}_i - \bar{X}_j}{\sqrt{s_W^2/n}}$
Formulas for two-way ANOVA:
$MS_A = \dfrac{SS_A}{a - 1}$, $MS_B = \dfrac{SS_B}{b - 1}$, $MS_{A \times B} = \dfrac{SS_{A \times B}}{(a - 1)(b - 1)}$, $MS_W = \dfrac{SS_W}{ab(n - 1)}$
$F_A = \dfrac{MS_A}{MS_W}$, $F_B = \dfrac{MS_B}{MS_W}$, $F_{A \times B} = \dfrac{MS_{A \times B}}{MS_W}$

Chapter 13 Nonparametric Statistics
z test value in the sign test: $z = \dfrac{(X + 0.5) - (n/2)}{\sqrt{n}/2}$ where n = sample size (greater than or equal to 26) and X = smaller number of + or − signs
Wilcoxon rank sum test: $z = \dfrac{R - \mu_R}{\sigma_R}$ where $\mu_R = \dfrac{n_1(n_1 + n_2 + 1)}{2}$, $\sigma_R = \sqrt{\dfrac{n_1 n_2(n_1 + n_2 + 1)}{12}}$, R = sum of the ranks for the smaller sample size ($n_1$), $n_1$ = smaller of the sample sizes, $n_2$ = larger of the sample sizes, $n_1 \ge 10$ and $n_2 \ge 10$
Wilcoxon signed-rank test: $z = \dfrac{w_s - \dfrac{n(n + 1)}{4}}{\sqrt{\dfrac{n(n + 1)(2n + 1)}{24}}}$ where n = number of pairs where the difference is not 0 and $w_s$ = smaller sum in absolute value of the signed ranks
Kruskal-Wallis test: $H = \dfrac{12}{N(N + 1)}\left(\dfrac{R_1^2}{n_1} + \dfrac{R_2^2}{n_2} + \cdots + \dfrac{R_k^2}{n_k}\right) - 3(N + 1)$ where $R_1$ = sum of the ranks of sample 1, $n_1$ = size of sample 1, $R_2$ = sum of the ranks of sample 2, $n_2$ = size of sample 2, ..., $R_k$ = sum of the ranks of sample k, $n_k$ = size of sample k, $N = n_1 + n_2 + \cdots + n_k$, and k = number of samples
Spearman rank correlation coefficient: $r_S = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$ where d = difference in the ranks and n = number of data pairs

Procedure Table
Solving Hypothesis-Testing Problems (Traditional Method)
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table in Appendix C.
Step 3 Compute the test value.
Step 4 Make the decision to reject or not reject the null hypothesis.
Step 5 Summarize the results.

Procedure Table
Solving Hypothesis-Testing Problems (P-value Method)
Step 1 State the hypotheses and identify the claim.
Step 2 Compute the test value.
Step 3 Find the P-value.
Step 4 Make the decision.
Step 5 Summarize the results.

ISBN-13: 978–0–07–743861–6
ISBN-10: 0–07–743861–2

Table E The Standard Normal Distribution
Cumulative Standard Normal Distribution

z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-3.4  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0003  .0002
-3.3  .0005  .0005  .0005  .0004  .0004  .0004  .0004  .0004  .0004  .0003
-3.2  .0007  .0007  .0006  .0006  .0006  .0006  .0006  .0005  .0005  .0005
-3.1  .0010  .0009  .0009  .0009  .0008  .0008  .0008  .0008  .0007  .0007
-3.0  .0013  .0013  .0013  .0012  .0012  .0011  .0011  .0011  .0010  .0010
-2.9  .0019  .0018  .0018  .0017  .0016  .0016  .0015  .0015  .0014  .0014
-2.8  .0026  .0025  .0024  .0023  .0023  .0022  .0021  .0021  .0020  .0019
-2.7  .0035  .0034  .0033  .0032  .0031  .0030  .0029  .0028  .0027  .0026
-2.6  .0047  .0045  .0044  .0043  .0041  .0040  .0039  .0038  .0037  .0036
-2.5  .0062  .0060  .0059  .0057  .0055  .0054  .0052  .0051  .0049  .0048
-2.4  .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
-2.3  .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
-2.2  .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110
-2.1  .0179  .0174  .0170  .0166  .0162  .0158  .0154  .0150  .0146  .0143
-2.0  .0228  .0222  .0217  .0212  .0207  .0202  .0197  .0192  .0188  .0183
-1.9  .0287  .0281  .0274  .0268  .0262  .0256  .0250  .0244  .0239  .0233
-1.8  .0359  .0351  .0344  .0336  .0329  .0322  .0314  .0307  .0301  .0294
-1.7  .0446  .0436  .0427  .0418  .0409  .0401  .0392  .0384  .0375  .0367
-1.6  .0548  .0537  .0526  .0516  .0505  .0495  .0485  .0475  .0465  .0455
-1.5  .0668  .0655  .0643  .0630  .0618  .0606  .0594  .0582  .0571  .0559
-1.4  .0808  .0793  .0778  .0764  .0749  .0735  .0721  .0708  .0694  .0681
-1.3  .0968  .0951  .0934  .0918  .0901  .0885  .0869  .0853  .0838  .0823
-1.2  .1151  .1131  .1112  .1093  .1075  .1056  .1038  .1020  .1003  .0985
-1.1  .1357  .1335  .1314  .1292  .1271  .1251  .1230  .1210  .1190  .1170
-1.0  .1587  .1562  .1539  .1515  .1492  .1469  .1446  .1423  .1401  .1379
-0.9  .1841  .1814  .1788  .1762  .1736  .1711  .1685  .1660  .1635  .1611
-0.8  .2119  .2090  .2061  .2033  .2005  .1977  .1949  .1922  .1894  .1867
-0.7  .2420  .2389  .2358  .2327  .2296  .2266  .2236  .2206  .2177  .2148
-0.6  .2743  .2709  .2676  .2643  .2611  .2578  .2546  .2514  .2483  .2451
-0.5  .3085  .3050  .3015  .2981  .2946  .2912  .2877  .2843  .2810  .2776
-0.4  .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
-0.3  .3821  .3783  .3745  .3707  .3669  .3632  .3594  .3557  .3520  .3483
-0.2  .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
-0.1  .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
-0.0  .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
For z values less than −3.49, use 0.0001.

Table E (continued)
Cumulative Standard Normal Distribution

z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0   .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
0.1   .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
0.2   .5793  .5832  .5871  .5910  .5948  .5987  .6026  .6064  .6103  .6141
0.3   .6179  .6217  .6255  .6293  .6331  .6368  .6406  .6443  .6480  .6517
0.4   .6554  .6591  .6628  .6664  .6700  .6736  .6772  .6808  .6844  .6879
0.5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
0.6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
0.7   .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
0.8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
1.0   .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
1.1   .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
1.2   .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
1.3   .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
1.4   .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
1.8   .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
1.9   .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
2.0   .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
2.1   .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
2.2   .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
2.3   .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
2.4   .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
2.5   .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
2.6   .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
2.7   .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
2.8   .9974  .9975  .9976  .9977  .9977  .9978  .9979  .9979  .9980  .9981
2.9   .9981  .9982  .9982  .9983  .9984  .9984  .9985  .9985  .9986  .9986
3.0   .9987  .9987  .9987  .9988  .9988  .9989  .9989  .9989  .9990  .9990
3.1   .9990  .9991  .9991  .9991  .9992  .9992  .9992  .9992  .9993  .9993
3.2   .9993  .9993  .9994  .9994  .9994  .9994  .9994  .9995  .9995  .9995
3.3   .9995  .9995  .9995  .9996  .9996  .9996  .9996  .9996  .9996  .9997
3.4   .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9997  .9998
For z values greater than 3.49, use 0.9999.

Table F The t Distribution

Confidence intervals   80%      90%      95%      98%      99%
One tail, α            0.10     0.05     0.025    0.01     0.005
Two tails, α           0.20     0.10     0.05     0.02     0.01
d.f.
1                      3.078    6.314    12.706   31.821   63.657
2                      1.886    2.920    4.303    6.965    9.925
3                      1.638    2.353    3.182    4.541    5.841
4                      1.533    2.132    2.776    3.747    4.604
5                      1.476    2.015    2.571    3.365    4.032
6                      1.440    1.943    2.447    3.143    3.707
7                      1.415    1.895    2.365    2.998    3.499
8                      1.397    1.860    2.306    2.896    3.355
9                      1.383    1.833    2.262    2.821    3.250
10                     1.372    1.812    2.228    2.764    3.169
11                     1.363    1.796    2.201    2.718    3.106
12                     1.356    1.782    2.179    2.681    3.055
13                     1.350    1.771    2.160    2.650    3.012
14                     1.345    1.761    2.145    2.624    2.977
15                     1.341    1.753    2.131    2.602    2.947
16                     1.337    1.746    2.120    2.583    2.921
17                     1.333    1.740    2.110    2.567    2.898
18                     1.330    1.734    2.101    2.552    2.878
19                     1.328    1.729    2.093    2.539    2.861
20                     1.325    1.725    2.086    2.528    2.845
21                     1.323    1.721    2.080    2.518    2.831
22                     1.321    1.717    2.074    2.508    2.819
23                     1.319    1.714    2.069    2.500    2.807
24                     1.318    1.711    2.064    2.492    2.797
25                     1.316    1.708    2.060    2.485    2.787
26                     1.315    1.706    2.056    2.479    2.779
27                     1.314    1.703    2.052    2.473    2.771
28                     1.313    1.701    2.048    2.467    2.763
29                     1.311    1.699    2.045    2.462    2.756
30                     1.310    1.697    2.042    2.457    2.750
32                     1.309    1.694    2.037    2.449    2.738
34                     1.307    1.691    2.032    2.441    2.728
36                     1.306    1.688    2.028    2.434    2.719
38                     1.304    1.686    2.024    2.429    2.712
40                     1.303    1.684    2.021    2.423    2.704
45                     1.301    1.679    2.014    2.412    2.690
50                     1.299    1.676    2.009    2.403    2.678
55                     1.297    1.673    2.004    2.396    2.668
60                     1.296    1.671    2.000    2.390    2.660
65                     1.295    1.669    1.997    2.385    2.654
70                     1.294    1.667    1.994    2.381    2.648
75                     1.293    1.665    1.992    2.377    2.643
80                     1.292    1.664    1.990    2.374    2.639
90                     1.291    1.662    1.987    2.368    2.632
100                    1.290    1.660    1.984    2.364    2.626
500                    1.283    1.648    1.965    2.334    2.586
1000                   1.282    1.646    1.962    2.330    2.581
(z) ∞                  1.282a   1.645b   1.960    2.326c   2.576d

a This value has been rounded to 1.28 in the textbook.
b This value has been rounded to 1.65 in the textbook.
c This value has been rounded to 2.33 in the textbook.
d This value has been rounded to 2.58 in the textbook.
Source: Adapted from W. H. Beyer, Handbook of Tables for Probability and Statistics, 2nd ed., CRC Press, Boca Raton, Fla., 1986. Reprinted with permission.

Table G The Chi-Square Distribution

Degrees of                                          α
freedom   0.995    0.99     0.975    0.95     0.90     0.10      0.05      0.025     0.01      0.005
1         -        -        0.001    0.004    0.016    2.706     3.841     5.024     6.635     7.879
2         0.010    0.020    0.051    0.103    0.211    4.605     5.991     7.378     9.210     10.597
3         0.072    0.115    0.216    0.352    0.584    6.251     7.815     9.348     11.345    12.838
4         0.207    0.297    0.484    0.711    1.064    7.779     9.488     11.143    13.277    14.860
5         0.412    0.554    0.831    1.145    1.610    9.236     11.071    12.833    15.086    16.750
6         0.676    0.872    1.237    1.635    2.204    10.645    12.592    14.449    16.812    18.548
7         0.989    1.239    1.690    2.167    2.833    12.017    14.067    16.013    18.475    20.278
8         1.344    1.646    2.180    2.733    3.490    13.362    15.507    17.535    20.090    21.955
9         1.735    2.088    2.700    3.325    4.168    14.684    16.919    19.023    21.666    23.589
10        2.156    2.558    3.247    3.940    4.865    15.987    18.307    20.483    23.209    25.188
11        2.603    3.053    3.816    4.575    5.578    17.275    19.675    21.920    24.725    26.757
12        3.074    3.571    4.404    5.226    6.304    18.549    21.026    23.337    26.217    28.299
13        3.565    4.107    5.009    5.892    7.042    19.812    22.362    24.736    27.688    29.819
14        4.075    4.660    5.629    6.571    7.790    21.064    23.685    26.119    29.141    31.319
15        4.601    5.229    6.262    7.261    8.547    22.307    24.996    27.488    30.578    32.801
16        5.142    5.812    6.908    7.962    9.312    23.542    26.296    28.845    32.000    34.267
17        5.697    6.408    7.564    8.672    10.085   24.769    27.587    30.191    33.409    35.718
18        6.265    7.015    8.231    9.390    10.865   25.989    28.869    31.526    34.805    37.156
19        6.844    7.633    8.907    10.117   11.651   27.204    30.144    32.852    36.191    38.582
20        7.434    8.260    9.591    10.851   12.443   28.412    31.410    34.170    37.566    39.997
21        8.034    8.897    10.283   11.591   13.240   29.615    32.671    35.479    38.932    41.401
22        8.643    9.542    10.982   12.338   14.042   30.813    33.924    36.781    40.289    42.796
23        9.262    10.196   11.689   13.091   14.848   32.007    35.172    38.076    41.638    44.181
24        9.886    10.856   12.401   13.848   15.659   33.196    36.415    39.364    42.980    45.559
25        10.520   11.524   13.120   14.611   16.473   34.382    37.652    40.646    44.314    46.928
26        11.160   12.198   13.844   15.379   17.292   35.563    38.885    41.923    45.642    48.290
27        11.808   12.879   14.573   16.151   18.114   36.741    40.113    43.194    46.963    49.645
28        12.461   13.565   15.308   16.928   18.939   37.916    41.337    44.461    48.278    50.993
29        13.121   14.257   16.047   17.708   19.768   39.087    42.557    45.722    49.588    52.336
30        13.787   14.954   16.791   18.493   20.599   40.256    43.773    46.979    50.892    53.672
40        20.707   22.164   24.433   26.509   29.051   51.805    55.758    59.342    63.691    66.766
50        27.991   29.707   32.357   34.764   37.689   63.167    67.505    71.420    76.154    79.490
60        35.534   37.485   40.482   43.188   46.459   74.397    79.082    83.298    88.379    91.952
70        43.275   45.442   48.758   51.739   55.329   85.527    90.531    95.023    100.425   104.215
80        51.172   53.540   57.153   60.391   64.278   96.578    101.879   106.629   112.329   116.321
90        59.196   61.754   65.647   69.126   73.291   107.565   113.145   118.136   124.116   128.299
100       67.328   70.065   74.222   77.929   82.358   118.498   124.342   129.561   135.807   140.169

Source: Owen, Handbook of Statistical Tables, Table A–4 "Chi-Square Distribution Table," © 1962 by Addison-Wesley Publishing Company, Inc. Copyright renewal © 1990. Reproduced by permission of Pearson Education, Inc.

Chapter 7 Confidence Intervals and Sample Size

Type 90 for the confidence level.
Check the box for Use test and interval based on normal distribution.
Click [OK] twice.
The results for the confidence interval will be displayed in the session window.

Test and CI for One Proportion
Test of p = 0.5 vs p not = 0.5
Sample   X    N     Sample p    90% CI                  Z-Value   P-Value
1        60   500   0.120000    (0.096096, 0.143904)    -16.99    0.000

TI-83 Plus or TI-84 Plus Step by Step
Finding a Confidence Interval for a Proportion
1. Press STAT and move the cursor to TESTS.
2. Press A (ALPHA, MATH) for 1-PropZInt.
3. Type in the appropriate values.
4. Move the cursor to Calculate and press ENTER.

Example TI7–3
Find the 95%
confidence interval of p when $X = 60$ and $n = 500$.
The 95% confidence interval for p is $0.09152 < p < 0.14848$. Also $\hat{p}$ is given.

Speaking of Statistics
OTHER PEOPLE'S MONEY
Here is a survey about college students' credit card usage. Suggest several ways that the study could have been more meaningful if confidence intervals had been used.

Undergrads love their plastic. That means, you guessed it, students are learning to become debtors. According to the Public Interest Research Groups, only half of all students pay off card balances in full each month, 36% sometimes and 14% never. Meanwhile, 48% have paid a late fee. Here's how undergrads stack up, according to Nellie Mae, a provider of college loans:

Undergrads with a credit card       78%
Average number of cards owned       3
Average student card debt           $1236
Students with 4 or more cards       32%
Balances of $3000 to $7000          13%
Balances over $7000                 9%

Reprinted with permission from the January 2002 Reader's Digest. Copyright © 2002 by The Reader's Digest Assn. Inc.

Excel Step by Step
Finding a Confidence Interval for a Proportion
Excel has a procedure to compute the margin of error, but it does not compute confidence intervals. However, you may determine confidence intervals for a proportion by using the MegaStat Add-in available on your CD. If you have not installed this add-in, do so, following the instructions from the Chapter 1 Excel Step by Step.

Example XL7–3
There were 500 nursing applications in a sample, including 60 from men. Find the 90% confidence interval for the true proportion of male applicants.
1. From the toolbar, select Add-Ins, MegaStat>Confidence Intervals/Sample Size. Note: You may need to open MegaStat from the MegaStat.xls file on your computer's hard drive.
2. In the dialog box, select Confidence interval - p.
3. Enter 60 in the box labeled p; p will automatically change to x.
4. Enter 500 in the box labeled n.
5. Either type in or scroll to 90% for the Confidence Level, then click [OK].
The result of the procedure is shown next.

Confidence interval - proportion
90%      confidence level
0.12     proportion
500      n
1.645    z
0.024    half-width
0.144    upper confidence limit
0.096    lower confidence limit

7–4 Confidence Intervals for Variances and Standard Deviations

Objective: Find a confidence interval for a variance and a standard deviation.

In Sections 7–1 through 7–3 confidence intervals were calculated for means and proportions. This section will explain how to find confidence intervals for variances and standard deviations. In statistics, the variance and standard deviation of a variable are as important as the mean. For example, when products that fit together (such as pipes) are manufactured, it is important to keep the variations of the diameters of the products as small as possible; otherwise, they will not fit together properly and will have to be scrapped. In the manufacture of medicines, the variance and standard deviation of the medication in the pills play an important role in making sure patients receive the proper dosage. For these reasons, confidence intervals for variances and standard deviations are necessary.

Historical Note
The $\chi^2$ distribution with 2 degrees of freedom was formulated by a mathematician named Hershel in 1869 while he was studying the accuracy of shooting arrows at a target. Many other mathematicians have since contributed to its development.

[Figure 7–9: The Chi-Square Family of Curves]

To calculate these confidence intervals, a new statistical distribution is needed. It is called the chi-square distribution. The chi-square variable is similar to the t variable in that its distribution is a family of curves based on the number of degrees of freedom. The symbol for chi-square is $\chi^2$ (Greek letter chi, pronounced "ki"). Several of the distributions are shown in Figure 7–9, along with the
corresponding degrees of freedom. The chi-square distribution is obtained from the values of $(n - 1)s^2/\sigma^2$ when random samples are selected from a normally distributed population whose variance is $\sigma^2$.

A chi-square variable cannot be negative, and the distributions are skewed to the right. At about 100 degrees of freedom, the chi-square distribution becomes somewhat symmetric. The area under each chi-square distribution is equal to 1.00, or 100%.

Table G in Appendix C gives the values for the chi-square distribution. These values are used in the denominators of the formulas for confidence intervals. Two different values are used in the formula because the distribution is not symmetric. One value is found on the left side of the table, and the other is on the right. See Figure 7–10. For example, to find the table values corresponding to the 95% confidence interval, you must first change 95% to a decimal and subtract it from 1 ($1 - 0.95 = 0.05$). Then divide the answer by 2 ($\alpha/2 = 0.05/2 = 0.025$). This is the column on the right side of the table, used to get the values for $\chi^2_{\text{right}}$. To get the value for $\chi^2_{\text{left}}$, subtract the value of $\alpha/2$ from 1 ($1 - 0.05/2 = 0.975$). Finally, find the appropriate row corresponding to the degrees of freedom $n - 1$. A similar procedure is used to find the values for a 90 or 99% confidence interval.

[Figure 7–10: Chi-Square Distribution for d.f. = n − 1, with area $1 - \alpha$ between $\chi^2_{\text{left}}$ and $\chi^2_{\text{right}}$ and area $\alpha/2$ in each tail]

Example 7–13
Find the values for $\chi^2_{\text{right}}$ and $\chi^2_{\text{left}}$ for a 90% confidence interval when $n = 25$.

Solution
To find $\chi^2_{\text{right}}$, subtract $1 - 0.90 = 0.10$ and divide by 2 to get 0.05.
To find $\chi^2_{\text{left}}$, subtract $1 - 0.05$ to get 0.95.
Hence, use the 0.95 and 0.05 columns and the row corresponding to 24 d.f. See Figure 7–11.

Figure 7–11 $\chi^2$ Table for Example 7–13

Table G The Chi-Square Distribution
Degrees of                                     α
freedom   0.995   0.99   0.975   0.95      0.90   0.10   0.05       0.025   0.01   0.005
24                               13.848                  36.415
                                 (χ² left)               (χ² right)

The answers are
$\chi^2_{\text{right}} = 36.415$
$\chi^2_{\text{left}} = 13.848$
See Figure 7–12.

[Figure 7–12: $\chi^2$ Distribution for Example 7–13, with area 0.90 between 13.848 and 36.415 and area 0.05 in each tail]

Useful estimates for $\sigma^2$ and $\sigma$ are $s^2$ and $s$, respectively.
To find confidence intervals for variances and standard deviations, you must assume that the variable is normally distributed.
The formulas for the confidence intervals are shown here.

Formula for the Confidence Interval for a Variance
$\dfrac{(n - 1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \dfrac{(n - 1)s^2}{\chi^2_{\text{left}}}$
d.f. = n − 1

Formula for the Confidence Interval for a Standard Deviation
$\sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\dfrac{(n - 1)s^2}{\chi^2_{\text{left}}}}$
d.f. = n − 1

Recall that $s^2$ is the symbol for the sample variance and $s$ is the symbol for the sample standard deviation. If the problem gives the sample standard deviation $s$, be sure to square it when you are using the formula. But if the problem gives the sample variance $s^2$, do not square it when you are using the formula, since the variance is already in square units.

Assumptions for Finding a Confidence Interval for a Variance or Standard Deviation
1. The sample is a random sample.
2. The population must be normally distributed.

Rounding Rule for a Confidence Interval for a Variance or Standard Deviation
When you are computing a confidence interval for a population variance or standard deviation by using raw data, round off to one more decimal place than the number of decimal places in the original data.
When you are computing a confidence interval for a population variance or standard deviation by using a sample variance or standard deviation, round off to the same number of decimal places as given for the sample variance or standard deviation.

Example 7–14 shows how to find a confidence interval for a variance and standard deviation.

Example 7–14
Nicotine Content
Find the 95% confidence interval for the variance and standard deviation of the nicotine content of cigarettes manufactured if a sample of 20 cigarettes has a standard deviation of 1.6 milligrams.

Solution
Since $\alpha = 0.05$, the two critical values, respectively, for the 0.025 and 0.975 levels for 19 degrees of freedom are 32.852 and 8.907. The 95% confidence interval for the variance is found by substituting in the formula.
$\dfrac{(n - 1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \dfrac{(n - 1)s^2}{\chi^2_{\text{left}}}$
$\dfrac{(20 - 1)(1.6)^2}{32.852} < \sigma^2 < \dfrac{(20 - 1)(1.6)^2}{8.907}$
$1.5 < \sigma^2 < 5.5$
Hence, you can be 95% confident that the true variance for the nicotine content is between 1.5 and 5.5.
For the standard deviation, the confidence interval is
$\sqrt{1.5} < \sigma < \sqrt{5.5}$
$1.2 < \sigma < 2.3$
Hence, you can be 95% confident that the true standard deviation for the nicotine content of all cigarettes manufactured is between 1.2 and 2.3 milligrams based on a sample of 20 cigarettes.

Example 7–15
Cost of Ski Lift Tickets
Find the 90% confidence interval for the variance and standard deviation for the price in dollars of an adult single-day ski lift ticket. The data represent a selected sample of nationwide ski resorts. Assume the variable is normally distributed.
59 54 53 52 51 39 49 46 49 48
Source: USA TODAY.

Solution
Step 1 Find the variance for the data. Use the formulas in Chapter 3 or your calculator. The variance $s^2 = 28.2$.
Step 2 Find $\chi^2_{\text{right}}$ and $\chi^2_{\text{left}}$ from Table G in Appendix C. Since $\alpha = 0.10$, the two critical values are 3.325 and 16.919, using d.f. = 9 and 0.95 and 0.05.
Step 3 Substitute in the formula and solve.
$\dfrac{(n - 1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \dfrac{(n - 1)s^2}{\chi^2_{\text{left}}}$
$\dfrac{(10 - 1)(28.2)}{16.919} < \sigma^2 < \dfrac{(10 - 1)(28.2)}{3.325}$
$15.0 < \sigma^2 < 76.3$
For the standard deviation
$\sqrt{15} < \sigma < \sqrt{76.3}$
$3.87 < \sigma < 8.73$
Hence you can be 90% confident that the standard deviation for the price of all single-day ski lift tickets of the population is between $3.87 and $8.73 based on a
sample of 10 nationwide ski resorts. (Two decimal places are used since the data are in dollars and cents.)

Note: If you are using the standard deviation instead of the variance (as in Example 7–14), be sure to square the standard deviation when substituting in the formula.

Applying the Concepts 7–4
Confidence Interval for Standard Deviation
Shown are the ages (in years) of the Presidents at the times of their deaths.
67 68 66 58 88 90 71 63 60 78 83 53 70 72 46 85 65 49 67 64 73 74 57 57 81 80 64 71 60 93 78 77 67 90 93 79 56 71 63
1. Do the data represent a population or a sample?
2. Select a random sample of 12 ages and find the variance and standard deviation.
3. Find the 95% confidence interval of the standard deviation.
4. Find the standard deviation of all the data values.
5. Does the confidence interval calculated in question 3 contain the mean? If it does not, give a reason why.
6. What assumption(s) must be considered for constructing the confidence interval in step 3?
See page 398 for the answers.

Exercises 7–4
1. What distribution must be used when computing confidence intervals for variances and standard deviations? Chi-square
2. What assumption must be made when computing confidence intervals for variances and standard deviations? The variable must be normally distributed
3. Using Table G, find the values for $\chi^2_{\text{left}}$ and $\chi^2_{\text{right}}$.
a. $\alpha = 0.05$, $n = 12$   3.816; 21.920
b. $\alpha = 0.10$, $n = 20$   10.117; 30.144
c. $\alpha = 0.05$, $n = 27$   13.844; 41.923
d. $\alpha = 0.01$, $n = 6$   0.412; 16.750
e. $\alpha = 0.10$, $n = 41$   26.509; 55.758
4. Lifetimes of Wristwatches Find the 90% confidence interval for the variance and standard deviation for the lifetimes of inexpensive wristwatches if a sample of 24 watches has a standard deviation of 4.8 months. Assume the variable is normally distributed. Do you feel that the lifetimes are relatively consistent?
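The interval formulas of this section can be checked numerically for a summary-statistics problem such as the wristwatch exercise. The following is a minimal Python sketch, not part of the textbook: the function name `variance_interval` is hypothetical, and the two chi-square values are assumed to be read by hand from Table G (d.f. = 23, columns 0.05 and 0.95) rather than computed.

```python
import math


def variance_interval(n, s, chi2_right, chi2_left):
    """Confidence interval for a variance and a standard deviation.

    n          -- sample size
    s          -- sample standard deviation (it is squared here)
    chi2_right -- Table G value from the alpha/2 column, d.f. = n - 1
    chi2_left  -- Table G value from the 1 - alpha/2 column, d.f. = n - 1
    Returns (var_low, var_high, sd_low, sd_high).
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    numerator = (n - 1) * s ** 2
    var_low = numerator / chi2_right   # larger table value gives the lower limit
    var_high = numerator / chi2_left   # smaller table value gives the upper limit
    return var_low, var_high, math.sqrt(var_low), math.sqrt(var_high)


# Wristwatch data: n = 24, s = 4.8 months, 90% confidence, d.f. = 23.
# Table G, row 23: 0.05 column -> 35.172, 0.95 column -> 13.091.
var_low, var_high, sd_low, sd_high = variance_interval(24, 4.8, 35.172, 13.091)
print(f"{var_low:.1f} < variance < {var_high:.1f}")
print(f"{sd_low:.1f} < standard deviation < {sd_high:.1f}")
```

Passing the table values in as arguments keeps the sketch independent of any statistics library; the same call with n = 20, s = 1.6, 32.852, and 8.907 corresponds to the setup of Example 7–14.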
15.1 Ͻ s2 Ͻ 40.5; 3.9 Ͻ s Ͻ 6.4 Carbohydrates in Yogurt The number of carbohydrates (in grams) per 8-ounce serving of yogurt for each of a random selection of brands is listed below Estimate the true population variance and standard deviation for the number of carbohydrates per 8-ounce serving of yogurt with 95% confidence 56.6 Ͻ s2 Ͻ 236.3; 7.5 Ͻ s Ͻ 15.4 17 42 41 20 39 41 35 15 43 25 38 33 42 23 17 25 34 Carbon Monoxide Deaths A study of generationrelated carbon monoxide deaths showed that a sample of recent years had a standard deviation of 4.1 deaths per year Find the 99% confidence interval of the variance and standard distribution Assume the variable is normally distributed 5.0 Ͻ s2 Ͻ 204.0; 2.2 Ͻ s Ͻ 14.3 Source: Based on information from Consumer Protection Safety Commission blu38582_ch07_355-398.qxd 9/8/10 1:07 PM Page 391 Section 7–4 Confidence Intervals for Variances and Standard Deviations Cost of Knee Replacement Surgery U.S insurers’ costs for knee replacement surgery range from $17,627 to $25,462 Estimate the population variance (standard deviation) in cost with 98% confidence based on a random sample of 10 persons who have had this surgery The retail costs (for uninsured persons) for the same procedure range from $40,640 to $58,702 Estimate the population variance and standard deviation in cost with 98% confidence based on a sample of 10 persons, and compare your two intervals Source: Time Almanac Age of College Students Find the 90% confidence interval for the variance and standard deviation of the ages of seniors at Oak Park College if a sample of 24 students has a standard deviation of 2.3 years Assume the variable is normally distributed 3.5 Ͻ s2 Ͻ 9.3; 1.9 Ͻ s Ͻ 3.0 10 Stock Prices A random sample of stock prices per share (in dollars) is shown Find the 90% confidence interval for the variance and standard deviation for the prices Assume the variable is normally distributed 26.69 75.37 3.81 6.94 40.25 169 199 239 239 13.88 7.50 53.81 28.25 
10.87 28.37 47.50 13.62 28.00 46.12 12.00 43.00 45.12 60.50 14.75 Source: Pittsburgh Tribune Review 259.343 Ͻ s2 Ͻ 772.724; 16.104 Ͻ s Ͻ 27.798 11 Number of Homeless Individuals A researcher wishes to find the confidence interval of the population standard deviation for the number of homeless people in a large city A sample of 25 months had a standard deviation of 462 Find the 95% confidence interval New-Car Lease Fees A new-car dealer is leasing various brand-new models for the monthly rates (in dollars) listed below Estimate the true population variance (and standard deviation) in leasing rates with 90% confidence 604 Ͻ s2 Ͻ 5837; 24.6 Ͻ s Ͻ 76.4 169 391 249 130,136 Ͻ s2 Ͻ 413,084; 361 Ͻ s Ͻ 643 12 Home Ownership Rates The percentage rates of home ownership for randomly selected states are listed below Estimate the population variance and standard deviation for the percentage rate of home ownership with 99% confidence 66.0 75.8 70.9 73.9 63.4 68.5 73.3 65.9 Source: World Almanac 6.8 Ͻ s2 Ͻ 140; 2.6 Ͻ s Ͻ 11.8 Extending the Concepts 13 Calculator Battery Lifetimes A confidence interval for a standard deviation for large samples taken from a normally distributed population can be approximated by s Ϫ za ր s s Ͻ s Ͻ s ϩ zaր2 ͙2n ͙2n Find the 95% confidence interval for the population standard deviation of calculator batteries A sample of 200 calculator batteries has a standard deviation of 18 months 16.2 Ͻ s Ͻ 19.8 Technology Step by Step TI-83 Plus or TI-84 Plus Step by Step The TI-83 Plus and TI-84 Plus not have a built-in confidence interval for the variance or standard deviation However, the downloadable program named SDINT is available on your CD and Online Learning Center Follow the instructions with your CD for downloading the program Finding a Confidence Interval for the Variance and Standard Deviation (Data) Enter the data values into L1 Press PRGM, move the cursor to the program named SDINT, and press ENTER twice Press for Data Type L1 for the list and 
press ENTER. Type the confidence level and press ENTER. Press ENTER to clear the screen.

Example TI7–4
This refers to Example 7–15 in the text. Find the 90% confidence interval for the variance and standard deviation for the data:
59 54 53 52 51 39 49 46 49 48

Finding a Confidence Interval for the Variance and Standard Deviation (Statistics)
Press PRGM, move the cursor to the program named SDINT, and press ENTER twice. Press for Stats. Type the sample standard deviation and press ENTER. Type the sample size and press ENTER. Type the confidence level and press ENTER. Press ENTER to clear the screen.

Example TI7–5
This refers to Example 7–14 in the text. Find the 95% confidence interval for the variance and standard deviation, given \(n = 20\) and \(s = 1.6\).

Summary
• An important aspect of inferential statistics is estimation. Estimations of parameters of populations are accomplished by selecting a random sample from that population and choosing and computing a statistic that is the best estimator of the parameter. A good estimator must be unbiased, consistent, and relatively efficient. The best estimate of \(\mu\) is \(\bar{X}\). (7–1)
• There are two types of estimates of a parameter: point estimates and interval estimates. A point estimate is a specific value. For example, if a researcher wishes to estimate the average length of a certain adult fish, a sample of the fish is selected and measured. The mean of this sample is computed, for example, 3.2 centimeters. From this sample mean, the researcher estimates the population mean to be 3.2 centimeters. The problem with point estimates is that the accuracy of the estimate cannot be determined. For this reason, statisticians prefer to use the interval estimate. By computing an interval about the sample value, statisticians can be 95 or 99% (or some other percentage) confident that their estimate contains the true parameter. The confidence level is determined
by the researcher. The higher the confidence level, the wider the interval of the estimate must be. For example, a 95% confidence interval of the true mean length of a certain species of fish might be \(3.17 < \mu < 3.23\), whereas the 99% confidence interval might be \(3.15 < \mu < 3.25\). (7–1)
• When the population standard deviation is known, the z value is used to compute the confidence interval. (7–1)
• Closely related to computing confidence intervals is the determination of the sample size to make an estimate of the mean. Three pieces of information are needed to determine the minimum sample size: the degree of confidence must be stated, the population standard deviation must be known or be able to be estimated, and the margin of error must be stated. (7–1)
• If the population standard deviation is unknown, the t value is used. When the sample size is less than 30, the population must be normally distributed. (7–2)
• Confidence intervals and sample sizes can also be computed for proportions by using the normal distribution. (7–3)
• Finally, confidence intervals for variances and standard deviations can be computed by using the chi-square distribution. (7–4)

Important Terms
assumptions, chi-square distribution, confidence interval, confidence level, consistent estimator, degrees of freedom, estimation, estimator, interval estimate, margin of error, point estimate, proportion, relatively efficient estimator, robust, t distribution, unbiased estimator

Important Formulas

Formula for the confidence interval of the mean when \(\sigma\) is known (when \(n \ge 30\), \(s\) can be used if \(\sigma\) is unknown):
\[ \bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \]

Formula for the sample size for means:
\[ n = \left(\frac{z_{\alpha/2}\cdot\sigma}{E}\right)^2 \]
where \(E\) is the margin of error.

Formula for the confidence interval of the mean when \(\sigma\) is unknown:
\[ \bar{X} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) \]

Formula for the confidence interval for a proportion:
\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
where \(\hat{p} = X/n\) and \(\hat{q} = 1 - \hat{p}\).

Formula for the sample size for proportions:
\[ n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2 \]

Formula for the confidence interval for a variance:
\[ \frac{(n-1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\text{left}}} \]

Formula for the confidence interval for a standard deviation:
\[ \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{left}}}} \]

Review Exercises

1. Eight chemical elements do not have isotopes (different forms of the same element having the same atomic number but different atomic weights). A random sample of 30 of the elements that have isotopes showed a mean number of 19.63 isotopes per element and a population standard deviation of 18.73. Estimate the true mean number of isotopes for all elements with isotopes with 90% confidence. (7–1)
Source: Time Almanac.
\(13.99 < \mu < 25.27\) (or \(14 < \mu < 25\)) (TI: \(14.005 < \mu < 25.255\))

2. Vacation Days. A U.S. Travel Data Center survey reported that Americans stayed an average of 7.5 nights when they went on vacation. The sample size was 1500. Find a point estimate of the population mean. Find the 95% confidence interval of the true mean. Assume the population standard deviation was 0.8. (7–1)
Source: USA TODAY.
7.5; \(7.46 < \mu < 7.54\)

3. Spending for Postage. A researcher wishes to estimate within $25 the average cost of postage a community college spends in one year. If she wishes to be 90% confident, how large a sample would be necessary if the population standard deviation is $80? (7–1) 28

4. Shopping Survey. A random sample of 49 shoppers showed that they spend an average of $23.45 per visit at the Saturday Mornings Bookstore. The standard deviation of the population is $2.80. Find a point estimate of the population mean. Find the 90% confidence interval of the true mean. (7–1) $23.45; \(\$22.79 < \mu < \$24.11\)

5. Lengths of Children's Animated Films. The lengths (in minutes) of a random selection of popular children's animated films are listed below.
Estimate the true mean length of all children's animated films with 95% confidence. (7–2) \(76.9 < \mu < 88.3\). Assume normal distribution.

93 83 76 92 77 81 78 100 78 76 75

6. Dog Bites to Postal Workers. For a certain urban area, in a sample of months, on average 28 mail carriers were bitten by dogs each month. The standard deviation of the sample was. Find the 90% confidence interval of the true mean number of mail carriers who are bitten by dogs each month. Assume the variable is normally distributed. (7–2) \(25 < \mu < 31\)

7. Presidential Travel. In a survey of 1004 individuals, 442 felt that President George W. Bush spent too much time away from Washington. Find a 95% confidence interval for the true population proportion. (7–3)
Source: USA TODAY/CNN/Gallup Poll.
\(0.409 < p < 0.471\)

8. Vacation Sites. A U.S. Travel Data Center's survey of 1500 adults found that 42% of respondents stated that they favor historical sites as vacations. Find the 95% confidence interval of the true proportion of all adults who favor visiting historical sites as vacations. (7–3)
Source: USA TODAY.
\(0.395 < p < 0.445\)

9. Emergency Room Accidents. In a study of 200 accidents that required treatment in an emergency room, 80 occurred at work. Find the 90% confidence interval of the true proportion of accidents that occurred at work. (7–3) \(0.343 < p < 0.457\)

10. A local county has a very active adult education venue. A random sample of the population showed that 189 out of 400 persons 16 years old or older participated in some type of formal adult education activities, such as basic skills training, apprenticeships, personal interest courses, and part-time college or university degree programs. Estimate the true proportion of adults participating in some kind of formal education program with 98% confidence. (7–3) \(0.414 < p < 0.531\)

11. Health Insurance Coverage for Children. A federal report stated that 88% of children under age 18 were covered by health insurance in 2000. How large a sample is needed to estimate the true proportion of
covered children with 90% confidence with a confidence interval 0.05 wide? (7–3) 460
Source: Washington Observer-Reporter.

12. Child Care Programs. A study found that 73% of prekindergarten children ages to whose mothers had a bachelor's degree or higher were enrolled in center-based early childhood care and education programs. How large a sample is needed to estimate the true proportion within 3 percentage points with 95% confidence? How large a sample is needed if you had no prior knowledge of the proportion? (7–3) 842 children; 1068 children

13. Baseball Diameters. The standard deviation of the diameter of 18 baseballs was 0.29 cm. Find the 95% confidence interval of the true standard deviation of the diameters of the baseballs. Do you think the manufacturing process should be checked for inconsistency? (7–4) \(0.218 < \sigma < 0.435\). Yes. It seems that there is a large standard deviation.

14. MPG for Lawn Mowers. A random sample of 22 lawn mowers was selected, and the motors were tested to see how many miles per gallon of gasoline each one obtained. The variance of the measurements was 2.6. Find the 95% confidence interval of the true variance. (7–4) \(1.5 < \sigma^2 < 5.3\)

15. Lifetimes of Snowmobiles. A random sample of 15 snowmobiles was selected, and the lifetime (in months) of the batteries was measured. The variance of the sample was 8.6. Find the 90% confidence interval of the true variance. (7–4) \(5.1 < \sigma^2 < 18.3\)

16. Length of Children's Animated Films. Use the data from Exercise 5 to estimate the population variance (standard deviation) in length of children's animated films with 99% confidence. (7–4) \(28.6 < \sigma^2 < 334.2\); \(5.3 < \sigma < 18.3\)

Statistics Today

Would You Change the Channel? Revisited

The estimates given in the survey are point estimates. However, since the margin of error is stated to be 3 percentage points, an interval estimate can easily be obtained. For example, if 45% of the people changed the channel, then the confidence
interval of the true percentages of people who changed channels would be \(42\% < p < 48\%\). The article fails to state whether a 90%, 95%, or some other percentage was used for the confidence interval. Using the formula given in Section 7–3, a minimum sample size of 1068 would be needed to obtain a 95% confidence interval for \(p\), as shown. Use \(\hat{p}\) and \(\hat{q}\) as 0.5, since no value is known for \(\hat{p}\).
\[ n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2 = (0.5)(0.5)\left(\frac{1.96}{0.03}\right)^2 = 1067.1, \text{ which rounds up to } 1068 \]

Data Analysis

The Data Bank is found in Appendix D, or on the World Wide Web by following links from www.mhhe.com/math/stat/bluman/

1. From the Data Bank choose a variable, find the mean, and construct the 95 and 99% confidence intervals of the population mean. Use a sample of at least 30 subjects. Find the mean of the population, and determine whether it falls within the confidence interval.

2. Repeat Exercise 1, using a different variable and a sample of 15.

3. Repeat Exercise 1, using a proportion. For example, construct a confidence interval for the proportion of individuals who did not complete high school.

4. From Data Set III in Appendix D, select a sample of 30 values and construct the 95 and 99% confidence intervals of the mean length in miles of major North American rivers. Find the mean of all the values, and determine if the confidence intervals contain the mean.

5. From Data Set VI in Appendix D, select a sample of 20 values and find the 90% confidence interval of the mean of the number of acres. Find the mean of all the values, and determine if the confidence interval contains the mean.

6. Select a random sample of 20 of the record high temperatures in the United States, found in Data Set I in Appendix D. Find the proportion of temperatures below 110°. Construct a 95% confidence interval for this proportion. Then find the true proportion of temperatures below 110°, using all the data. Is the true proportion contained in the confidence interval?
Explain.

Chapter Quiz

Determine whether each statement is true or false. If the statement is false, explain why.

1. Interval estimates are preferred over point estimates since a confidence level can be specified. True

2. For a specific confidence interval, the larger the sample size, the smaller the margin of error will be. True

3. An estimator is consistent if as the sample size decreases, the value of the estimator approaches the value of the parameter estimated. False

4. To determine the sample size needed to estimate a parameter, you must know the margin of error. True

Select the best answer.

5. When a 99% confidence interval is calculated instead of a 95% confidence interval with \(n\) being the same, the margin of error will be
a. Smaller
b. Larger
c. The same
d. It cannot be determined

6. The best point estimate of the population mean is
a. The sample mean
b. The sample median
c. The sample mode
d. The sample midrange

7. When the population standard deviation is unknown and the sample size is less than 30, what table value should be used in computing a confidence interval for a mean?
a. z
b. t
c. Chi-square
d. None of the above

Complete the following statements with the best answer.

8. A good estimator should be ______, ______, and ______. Unbiased, consistent, relatively efficient

9. The maximum difference between the point estimate of a parameter and the actual value of the parameter is called ______. Margin of error

10. The statement "The average height of an adult male is feet 10 inches" is an example of a(n) ______ estimate. Point

11. The three confidence intervals used most often are the ______%, ______%, and ______%. 90; 95; 99

12. Cost of Textbooks. An irate student complained that the cost of textbooks was too high. He randomly surveyed 36 other students and found that the mean amount of money spent for textbooks was $121.60. If the standard deviation of the population was $6.36, find the best point estimate and the 90% confidence interval of the true mean. $121.60; \(\$119.85 < \mu < \$123.35\)

13. Doctor Visit Costs. An irate patient complained that the cost of a doctor's visit was too high. She randomly surveyed 20 other patients and found that the mean amount of money they spent on each doctor's visit was $44.80. The standard deviation of the sample was $3.53. Find a point estimate of the population mean. Find the 95% confidence interval of the population mean. Assume the variable is normally distributed. $44.80; \(\$43.15 < \mu < \$46.45\)

14. Weights of Minivans. The average weight of 40 randomly selected minivans was 4150 pounds. The standard deviation was 480 pounds. Find a point estimate of the population mean. Find the 99% confidence interval of the true mean weight of the minivans. 4150; \(3954 < \mu < 4346\)

15. Ages of Insurance Representatives. In a study of 10 insurance sales representatives from a certain large city, the average age of the group was 48.6 years and the standard deviation was 4.1 years. Assume the variable is normally distributed. Find the 95% confidence interval of the population mean age of all insurance sales representatives in that city. \(45.7 < \mu < 51.5\)

16. Patients Treated in Hospital Emergency Rooms. In a
hospital, a sample of weeks was selected, and it was found that an average of 438 patients was treated in the emergency room each week. The standard deviation was 16. Find the 99% confidence interval of the true mean. Assume the variable is normally distributed. \(418 < \mu < 458\)

17. Burglaries. For a certain urban area, it was found that in a sample of months, an average of 31 burglaries occurred each month. The standard deviation was. Assume the variable is normally distributed. Find the 90% confidence interval of the true mean number of burglaries each month. \(26 < \mu < 36\)

18. Hours Spent Studying. A university dean wishes to estimate the average number of hours that freshmen study each week. The standard deviation from a previous study is 2.6 hours. How large a sample must be selected if he wants to be 99% confident of finding whether the true mean differs from the sample mean by 0.5 hour? 180

19. Money Spent on Road Repairs. A researcher wishes to estimate within $300 the true average amount of money a county spends on road repairs each year. If she wants to be 90% confident, how large a sample is necessary?
The standard deviation is known to be $900. 25

20. Political Survey. A political analyst found that 43% of 300 Republican voters feel that the federal government has too much power. Find the 95% confidence interval of the population proportion of Republican voters who feel this way. \(0.374 < p < 0.486\)

21. Emergency Room Accidents. In a study of 150 accidents that required treatment in an emergency room, 36% involved children under years of age. Find the 90% confidence interval of the true proportion of accidents that involve children under the age of. \(0.295 < p < 0.425\)

22. Television Set Ownership. A survey of 90 families showed that 40 owned at least one television set. Find the 95% confidence interval of the true proportion of families who own at least one television set. \(0.342 < p < 0.547\)

23. Skipping Lunch. A nutritionist wishes to determine, within 3%, the true proportion of adults who do not eat any lunch. If he wishes to be 95% confident that his estimate contains the population proportion, how large a sample will be necessary?
A previous study found that 15% of the 125 people surveyed said they did not eat lunch. 545

24. Novel Pages. A sample of 25 novels has a standard deviation of pages. Find the 95% confidence interval of the population standard deviation. \(< \sigma < 13\)

25. Truck Safety Check. Find the 90% confidence interval for the variance and standard deviation for the time it takes a state police inspector to check a truck for safety if a sample of 27 trucks has a standard deviation of 6.8 minutes. Assume the variable is normally distributed. \(30.9 < \sigma^2 < 78.2\); \(5.6 < \sigma < 8.8\)

26. Automobile Pollution. A sample of 20 automobiles has a pollution by-product release standard deviation of 2.3 ounces when gallon of gasoline is used. Find the 90% confidence interval of the population standard deviation. \(1.8 < \sigma < 3.2\)

Critical Thinking Challenges

A confidence interval for a median can be found by using these formulas
\[ U = \frac{n + 1}{2} + \frac{z_{\alpha/2}\sqrt{n}}{2} \quad \text{(round up)} \]
\[ L = n - U + 1 \]
to define positions in the set of ordered data values. Suppose a data set has 30 values, and you want to find the 95% confidence interval for the median. Substituting in the formulas with \(n = 30\) and \(z_{\alpha/2} = 1.96\), you get
\[ U = \frac{30 + 1}{2} + \frac{1.96\sqrt{30}}{2} = 21 \quad \text{(rounded up)} \]
\[ L = 30 - 21 + 1 = 10 \]
Arrange the data in order from smallest to largest, and then select the 10th and 21st values of the data array; hence, \(X_{10} < \text{median} < X_{21}\). Find the 90% confidence interval for the median for the given data.

84 14 31 72 26 49 252 104 31 18 72 23 55 133 16 29 225 138 85 24 391 72 158 4340 346 19 846 461 254 125 61 123 60 29 10 366 47 28 254 77 21 97 17 82

Data Projects

Business and Finance. Use 30 stocks classified as the Dow Jones industrials as the sample. Note the amount each stock has gained or lost in the last quarter. Compute the mean and standard deviation for the data set. Compute the 95% confidence interval for the mean and the 95% confidence interval for the standard deviation. Compute the percentage of stocks that had a
gain in the last quarter. Find a 95% confidence interval for the percentage of stocks with a gain.

Sports and Leisure. Use the top home run hitter from each major league baseball team as the data set. Find the mean and the standard deviation for the number of home runs hit by the top hitter on each team. Find a 95% confidence interval for the mean number of home runs hit.

Technology. Use the data collected in data project of Chapter regarding song lengths. Select a specific genre, and compute the percentage of songs in the sample that are of that genre. Create a 95% confidence interval for the true percentage. Use the entire music library, and find the population percentage of the library with that genre. Does the population percentage fall within the confidence interval?

Health and Wellness. Use your class as the sample. Have each student take her or his temperature on a healthy day. Compute the mean and standard deviation for the sample. Create a 95% confidence interval for the mean temperature. Does the confidence interval obtained support the long-held belief that the average body temperature is 98.6°F?

Politics and Economics. Select five political polls and note the margin of error, sample size, and percent favoring the candidate for each. For each poll, determine the level of confidence that must have been used to obtain the margin of error given, knowing the percent favoring the candidate and number of participants. Is there a pattern that emerges?

Your Class. Have each student compute his or her body mass index (BMI) (703 times weight in pounds, divided by the quantity height in inches squared). Find the mean and standard deviation for the data set. Compute a 95% confidence interval for the mean BMI of a student. A BMI score over 30 is considered obese. Does the confidence interval indicate that the mean for BMI could be in the obese range?
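
The Politics and Economics project asks for the confidence level implied by a poll's reported margin of error. One way to approach it, sketched here under the assumption that the pollster used the Section 7–3 margin of error \(E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}\), is to solve for \(z_{\alpha/2}\) and convert it to a two-sided confidence level. The function name and the poll figures below are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def implied_confidence_level(p_hat: float, n: int, margin: float) -> float:
    """Return the confidence level implied by a reported margin of error.

    Assumes E = z * sqrt(p_hat * q_hat / n), solves for z, and converts z
    to a two-sided confidence level with the standard normal distribution.
    """
    q_hat = 1.0 - p_hat
    standard_error = sqrt(p_hat * q_hat / n)
    z = margin / standard_error
    return 2.0 * NormalDist().cdf(z) - 1.0


# Hypothetical poll (illustrative numbers, not taken from the text):
# 45% favor the candidate, 1068 respondents, margin of error of 3 points.
level = implied_confidence_level(p_hat=0.45, n=1068, margin=0.03)
print(f"Implied confidence level: {level:.1%}")
```

Repeating the calculation for each of the five polls makes any pattern easy to spot; a value near 95% would suggest that the conventional 95% level was used.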
Answers to Applying the Concepts

Section 7–1 Making Decisions with Confidence Intervals

Answers will vary. One possible answer is to find out the average number of Kleenexes that a group of randomly selected individuals use in a 2-week period.

People usually need Kleenexes when they have a cold or when their allergies are acting up.

If we want to concentrate on the number of Kleenexes used when people have colds, we select a random sample of people with colds and have them keep a record of how many Kleenexes they use during their colds.

Answers may vary. I will use a 95% confidence interval:
\[ \bar{x} \pm 1.96\frac{s}{\sqrt{n}} = 57 \pm 1.96\frac{15}{\sqrt{85}} = 57 \pm 3.2 \]
I am 95% confident that the interval 53.8–60.2 contains the true mean number of Kleenexes used by people when they have colds. It seems reasonable to put 60 Kleenexes in the new automobile glove compartment boxes.

Answers will vary. Since I am 95% confident that the interval contains the true average, any number of Kleenexes between 54 and 60 would be reasonable. Sixty seemed to be the most reasonable answer, since it is close to standard deviations above the sample mean.

Section 7–2 Sport Drink Decision

Answers will vary. One possible answer is that this is a small sample since we are only looking at seven popular sport drinks.

The mean cost per container is $1.25, with standard deviation of $0.39. The 90% confidence interval is
\[ \bar{X} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = 1.25 \pm 1.943\frac{0.39}{\sqrt{7}} = 1.25 \pm 0.29 \]
or \(0.96 < \mu < 1.54\).

The 10-K, All Sport, Exceed, and Hydra Fuel all fall outside of the confidence interval.

None of the values appear to be outliers.

There are \(7 - 1 = 6\) degrees of freedom.

Cost per serving would impact my decision on purchasing a sport drink, since this would allow me to compare the costs on an equal scale.

Answers will vary.

Section 7–3 Contracting Influenza

(95% CI) means that these are the 95% confidence intervals constructed from the data.

The margin of
error for men reporting influenza is \((50.5 - 47.1)/2 = 1.7\%\).

The total sample size was 19,774.

The larger the sample size, the smaller the margin of error (all other things being held constant).

A 90% confidence interval would be narrower (smaller) than a 95% confidence interval, since we need to include fewer values in the interval.

The 51.5% is the middle of the confidence interval, since it is the point estimate for the confidence interval.

Section 7–4 Confidence Interval for Standard Deviation

The data represent a population, since we have the age at death for all deceased Presidents (at the time of the writing of this book).

Answers will vary. One possible sample is 56, 67, 53, 46, 63, 77, 63, 57, 71, 57, 80, 65, which results in a standard deviation of 9.9 years and a variance of 98.0.

Answers will vary. The 95% confidence interval for the standard deviation is
\[ \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{right}}}} \quad \text{to} \quad \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{left}}}} \]
In this case we have
\[ \sqrt{\frac{(12-1)9.9^2}{21.920}} = \sqrt{49.1839} = 7.0 \quad \text{to} \quad \sqrt{\frac{(12-1)9.9^2}{3.8158}} = \sqrt{282.538} = 16.8 \]
or 7.0 to 16.8 years.

The standard deviation for all the data values is 12.0 years.

Answers will vary. Yes, the confidence interval does contain the population standard deviation.

Answers will vary. We need to assume that the distribution of ages at death is normal.
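
The Section 7–4 answer above can be checked numerically. The following is a minimal sketch rather than a definitive implementation: it assumes the two chi-square critical values are read from a table and passed in by hand (the Python standard library has no chi-square quantile function), and the function name `sd_confidence_interval` is hypothetical.

```python
from math import sqrt


def sd_confidence_interval(n, s, chi2_right, chi2_left):
    """Return ((var_low, var_high), (sd_low, sd_high)) for a normal population.

    n          sample size
    s          sample standard deviation
    chi2_right right-tail chi-square critical value (the larger one)
    chi2_left  left-tail chi-square critical value (the smaller one)
    Both critical values are assumed to come from a table with n - 1
    degrees of freedom.
    """
    numerator = (n - 1) * s ** 2
    var_low = numerator / chi2_right   # dividing by the larger value gives the lower bound
    var_high = numerator / chi2_left   # dividing by the smaller value gives the upper bound
    return (var_low, var_high), (sqrt(var_low), sqrt(var_high))


# Values from the Section 7-4 answer: n = 12, s = 9.9, 95% confidence,
# critical values 21.920 and 3.8158 for 11 degrees of freedom.
(var_low, var_high), (sd_low, sd_high) = sd_confidence_interval(12, 9.9, 21.920, 3.8158)
print(f"{var_low:.1f} < variance < {var_high:.1f}")
print(f"{sd_low:.1f} < standard deviation < {sd_high:.1f}")
```

With these inputs the standard deviation bounds round to 7.0 and 16.8 years, matching the interval stated in the answer. The same function applies to any exercise in Section 7–4 once the appropriate table values are supplied.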