
Ebook Statistics for Business and Economics (9th edition): Part 2




DOCUMENT INFORMATION

Basic information

Format
Number of pages: 408
File size: 5.78 MB

Contents

Part 2 of the book Statistics for Business and Economics covers: analysis of variance, introduction to nonparametric statistics, additional topics in regression analysis, multiple-variable regression analysis, two-variable regression analysis, and other topics.

CHAPTER 10
Two Population Hypothesis Tests

CHAPTER OUTLINE
10.1 Tests of the Difference Between Two Normal Population Means: Dependent Samples
  Two Means, Matched Pairs
10.2 Tests of the Difference Between Two Normal Population Means: Independent Samples
  Two Means, Independent Samples, Known Population Variances
  Two Means, Independent Samples, Unknown Population Variances Assumed to Be Equal
  Two Means, Independent Samples, Unknown Population Variances Not Assumed to Be Equal
10.3 Tests of the Difference Between Two Population Proportions (Large Samples)
10.4 Tests of the Equality of the Variances Between Two Normally Distributed Populations
10.5 Some Comments on Hypothesis Testing

Introduction

In this chapter we develop procedures for testing the differences between two population means, proportions, and variances. This form of inference compares and complements the estimation procedures developed in Chapter 8. Our discussion in this chapter follows the development in Chapter 9, and we assume that the reader is familiar with the hypothesis-testing procedure developed in Section 9.1.

The process for comparing two populations begins with an investigator forming a hypothesis about the nature of the two populations and the difference between their means or proportions. The hypothesis is stated clearly as involving two options concerning the difference, and these two options are the only possible outcomes. Then a decision is made based on the results of a statistic computed from random samples of data from the two populations. Hypothesis tests involving variances are also becoming more important as business firms work to reduce process variability in order to ensure high quality for every unit produced. Consider the following two examples as typical problems:

1. An instructor is interested in knowing if assigning case studies increases students' test scores in her course. To answer her question, she could first assign cases in one section and not in the other. Then, by collecting data from each class, she could determine if there is strong evidence that the use of case studies increases exam scores. To provide strong evidence that the use of cases increases learning, she would begin by assuming that completing assigned cases does not increase overall examination scores. Let $\mu_1$ denote the mean final examination score in the class that used case studies, and let $\mu_2$ denote the mean final examination score in the class that did not use case studies. For this study the null hypothesis is the composite hypothesis

$H_0: \mu_1 - \mu_2 \le 0$

which states that the use of cases does not increase the average examination score. The alternative of interest is that the use of cases actually increases the average examination score, and, thus, the alternative hypothesis is as follows:

$H_1: \mu_1 - \mu_2 > 0$

In this problem the instructor would decide to assign cases only if there is strong evidence that using cases increases the mean examination score. Strong evidence results from rejecting $H_0$ and accepting $H_1$. Note that this hypothesis test could also be expressed as

$H_0: \mu_1 \le \mu_2$
$H_1: \mu_1 > \mu_2$

with the same decision process.

2. A news reporter wants to know if a tax reform appeals equally to men and women. To test this, he obtains the opinions of randomly selected men and women, and these data are used to provide an answer. The reporter might hold, as a working null hypothesis, that a new tax proposal is equally appealing to men and women.
Using $P_1$, the proportion of men favoring the proposal, minus $P_2$, the proportion of women favoring the proposal, the null hypothesis is as follows:

$H_0: P_1 = P_2$ or $H_0: P_1 - P_2 = 0$

If the reporter has no good reason to suspect that the bulk of support comes from either men or women, then the null hypothesis would be tested against the two-sided composite alternative hypothesis:

$H_1: P_1 \ne P_2$ or $H_1: P_1 - P_2 \ne 0$

In this example, rejection of $H_0$ would provide strong evidence that there is a difference between men and women in their response to the tax proposal.

Once we have specified the null and alternative hypotheses and collected sample data, a decision concerning the null hypothesis must be made. We can either reject the null hypothesis and accept the alternative hypothesis, or fail to reject the null hypothesis. When we fail to reject the null hypothesis, then either the null hypothesis is true or our test procedure was not strong enough to reject it and an error has been committed. To reject the null hypothesis, a decision rule based on sample evidence needs to be developed. We present specific decision rules for various problems in the remainder of this chapter.

10.1 TESTS OF THE DIFFERENCE BETWEEN TWO NORMAL POPULATION MEANS: DEPENDENT SAMPLES

There are a number of applications where we wish to draw conclusions about the differences between population means instead of conclusions about the absolute levels of the means. For example, we might want to compare the output of two different production processes for which neither population mean is known. Similarly, we might want to know if one marketing strategy results in higher sales than another without knowing the population mean sales for either. These questions can be handled effectively by various hypothesis-testing procedures.

As we saw in Section 8.1, several different assumptions can be made when confidence intervals are computed for the differences between two population means. These assumptions generally lead to specific methods for computing the variance of the difference between sample means, and there are parallel hypothesis tests that involve similar methods for obtaining the variance. We organize our discussion of the various hypothesis-testing procedures in parallel with the confidence interval estimates in Section 8.1. In Section 10.1 we treat situations where the two samples can be assumed to be dependent; in these cases the best design, if we have control over data collection, is to use matched pairs, as shown below. Then, in Section 10.2, we treat a variety of situations where the samples are independent.
Two Means, Matched Pairs

Here, we assume that a random sample of $n$ matched pairs of observations is obtained from populations with means $\mu_X$ and $\mu_Y$. The observations are denoted $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. When we have matched pairs and the pairs are positively correlated, the variance of the difference between the sample means, $\bar{d} = \bar{x} - \bar{y}$, will be reduced compared to using independent samples. This results because some of the characteristics of the pairs are similar, and, thus, that portion of the variability is removed from the total variability of the differences between the means. For example, when we consider measures of human behavior, differences between twins will usually be less than the differences between two randomly selected people. In general, the dimensions of two parts produced on the same specific machine will be closer than the dimensions of parts produced on two different, independently selected machines. Thus, whenever possible, we would prefer to use matched pairs of observations when comparing measurements from two populations, because the variance of the difference will be smaller. With a smaller variance, there is a greater probability that we will reject $H_0$ when the null hypothesis is not true. This principle was developed in Section 9.5 in the discussion of the power of a test. The specific decision rules for different forms of the hypothesis test are summarized in Equations 10.1, 10.2, and 10.3.

Tests of the Difference Between Population Means: Matched Pairs

Suppose that we have a random sample of $n$ matched pairs of observations from distributions with means $\mu_X$ and $\mu_Y$. Let $\bar{d}$ and $s_d$ denote the observed sample mean and standard deviation for the $n$ differences $d_i = x_i - y_i$. If the population distribution of the differences is a normal distribution, then the following tests have significance level $\alpha$:

1. To test either null hypothesis
$H_0: \mu_X - \mu_Y = 0$ or $H_0: \mu_X - \mu_Y \le 0$
against the alternative
$H_1: \mu_X - \mu_Y > 0$
the decision rule is as follows:
reject $H_0$ if $\dfrac{\bar{d}}{s_d/\sqrt{n}} > t_{n-1,\alpha}$   (10.1)

2. To test either null hypothesis
$H_0: \mu_X - \mu_Y = 0$ or $H_0: \mu_X - \mu_Y \ge 0$
against the alternative
$H_1: \mu_X - \mu_Y < 0$
the decision rule is as follows:
reject $H_0$ if $\dfrac{\bar{d}}{s_d/\sqrt{n}} < -t_{n-1,\alpha}$   (10.2)

3. To test the null hypothesis
$H_0: \mu_X - \mu_Y = 0$
against the two-sided alternative
$H_1: \mu_X - \mu_Y \ne 0$
the decision rule is as follows:
reject $H_0$ if $\dfrac{\bar{d}}{s_d/\sqrt{n}} < -t_{n-1,\alpha/2}$ or $\dfrac{\bar{d}}{s_d/\sqrt{n}} > t_{n-1,\alpha/2}$   (10.3)

Here, $t_{n-1,\alpha}$ is the number for which $P(t_{n-1} > t_{n-1,\alpha}) = \alpha$, where the random variable $t_{n-1}$ follows a Student's t distribution with $(n-1)$ degrees of freedom. For all these tests, p-values are interpreted as the probability of getting a value at least as extreme as the one obtained, given the null hypothesis.
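The decision rules in Equations 10.1–10.3 translate directly into a few lines of code. The following sketch is our illustration, not part of the text; it assumes NumPy and SciPy are available, and the function name is our own choice.

```python
import numpy as np
from scipy import stats

def matched_pairs_t_test(differences, alpha=0.05, alternative="greater"):
    """Apply the matched-pairs decision rules of Equations 10.1-10.3.

    alternative: "greater"   -> Eq. 10.1, H1: mu_X - mu_Y > 0
                 "less"      -> Eq. 10.2, H1: mu_X - mu_Y < 0
                 "two-sided" -> Eq. 10.3, H1: mu_X - mu_Y != 0
    """
    d = np.asarray(differences, dtype=float)
    n = d.size
    t_stat = d.mean() / (d.std(ddof=1) / np.sqrt(n))  # t = d-bar / (s_d / sqrt(n))

    if alternative == "greater":
        reject = t_stat > stats.t.ppf(1 - alpha, df=n - 1)
    elif alternative == "less":
        reject = t_stat < -stats.t.ppf(1 - alpha, df=n - 1)
    else:  # two-sided
        reject = abs(t_stat) > stats.t.ppf(1 - alpha / 2, df=n - 1)
    return t_stat, reject
```

Example 10.1 below carries out exactly this one-sided (upper-tail) test on real data.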
Example 10.1 Analysis of Alternative Turkey-Feeding Programs (Hypothesis Test for Differences Between Means)

Marian Anderson, production manager of Turkeys Unlimited, has been conducting a study to determine if a new feeding process produces a significant increase in the mean weight of turkeys produced in the facilities of Turkeys Unlimited LLC. In the process she obtains a random set of matched turkey chicks hatched from the same hen: one group of chicks is fed using the old feeding method, and the second group of chicks from the same hens is fed using the new method. The weights for each of the turkeys and the differences between the matched pairs are shown in Table 10.1. These data are contained in the data file Turkey Feeding. Perform the necessary analysis to determine if the new feeding process produces a significant (α = 0.025) increase in turkey weight.

Table 10.1 Finish Weight of Turkeys for Old and New Feeding Programs

Hen    Old     New     Difference
1      17.76   18.15     0.38
2      18.66   19.92     1.26
3      21.84   23.60     1.76
4      16.64   17.96     1.33
5      17.37   16.25    -1.12
6      16.75   17.50     0.74
7      18.01   20.79     2.77
8      22.00   22.89     0.89
9      17.68   20.25     2.57
10     18.23   20.95     2.72
11     20.63   22.76     2.13
12     20.03   20.64     0.61
13     15.90   14.67    -1.23
14     15.89   16.15     0.25
15     18.53   22.56     4.03
16     13.92   15.46     1.54
17     18.60   16.33    -2.26
18     20.09   21.03     0.94
19     18.04   18.51     0.47
20     19.87   22.32     2.45
21     19.00   24.53     5.53
22     18.59   21.15     2.56
23     21.02   26.36     5.35
24     15.62   18.56     2.94
25     15.41   14.02    -1.39

Solution  In this study we are attempting to determine if the new feeding process results in a significantly greater weight compared to the old feeding process. Define the weights from the new feeding process by the random variable $X$ and the weights from the old feeding process by the random variable $Y$. The null and alternative hypotheses for this study are, thus,

$H_0: \mu_X - \mu_Y \le 0$
$H_1: \mu_X - \mu_Y > 0$

The null hypothesis states that there was no increase in weight for the new process over the old; the alternative hypothesis states that there was an increase. If we reject the null hypothesis, then we can conclude that the new feeding process does result in higher turkey weights. We perform the test using the Student's t test for matched pairs at significance level α = 0.025. Figure 10.1 provides the Minitab computation of the mean difference (1.489), the standard error of the mean difference (0.385), and the Student's t statistic. The Student's t statistic for the test can be computed as

$t = \dfrac{\bar{d}}{s_d/\sqrt{n}} = \dfrac{1.489}{1.926/\sqrt{25}} = \dfrac{1.489}{0.385} = 3.86$

Figure 10.1 Hypothesis Testing for Differences Between New and Old Turkey Weights

Paired T-Test and CI: New, Old
Paired T for New - Old
              N     Mean    StDev   SE Mean
New          25   19.732    3.226     0.645
Old          25   18.244    2.057     0.411
Difference   25    1.489    1.926     0.385
95% lower bound for mean difference: 0.829
T-Test of mean difference = 0 (vs > 0): T-Value = 3.86  P-Value = 0.000

The computed value of Student's t is greater than the critical value for α = 0.025 and 24 degrees of freedom, which is 2.064 from the Student's t table (Appendix Table 8). From this analysis we see that there is strong evidence to conclude that the new feeding method increases the weight of turkeys more than the old method.

Note also that the variance of the difference between the sample means could be computed as follows (the correlation between the pairs is 0.823) using Equation 5.27:

$s_{\bar{d}}^2 = (0.411)^2 + (0.645)^2 - 2(0.823)(0.411)(0.645) = 0.146$
$s_{\bar{d}} = 0.385$

This matches the standard error of the mean difference as computed in the computer output.
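The same computation is easy to reproduce outside Minitab. The sketch below is ours (it assumes NumPy and SciPy); it applies Equation 10.1 to the Difference column of Table 10.1 and should reproduce d-bar = 1.489 and t ≈ 3.86, up to rounding in the printed table.

```python
import numpy as np
from scipy import stats

# Difference column of Table 10.1 (new weight minus old weight, one value per hen)
d = np.array([0.38, 1.26, 1.76, 1.33, -1.12, 0.74, 2.77, 0.89, 2.57, 2.72,
              2.13, 0.61, -1.23, 0.25, 4.03, 1.54, -2.26, 0.94, 0.47, 2.45,
              5.53, 2.56, 5.35, 2.94, -1.39])

n = d.size
t_stat = d.mean() / (d.std(ddof=1) / np.sqrt(n))   # about 3.86
t_crit = stats.t.ppf(1 - 0.025, df=n - 1)          # 2.064
p_value = stats.t.sf(t_stat, df=n - 1)             # about 0.0004, printed as 0.000

print(f"d-bar = {d.mean():.3f}, t = {t_stat:.2f}, reject H0: {t_stat > t_crit}")
```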
EXERCISES

Visit www.mymathlab.com/global or www.pearsonglobaleditions.com/newbold to access the data files.

Basic Exercises

10.1 You have been asked to determine if two different production processes have different mean numbers of units produced per hour. Process 1 has a mean defined as $\mu_1$ and process 2 has a mean defined as $\mu_2$. The null and alternative hypotheses are as follows:
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \ne 0$
Using a random sample of 25 paired observations, the sample means are 50 and 60 for populations 1 and 2, respectively. Can you reject the null hypothesis, using a probability of Type I error α = 0.05, in each case?
a. The sample standard deviation of the difference is 20.
b. The sample standard deviation of the difference is 30.
c. The sample standard deviation of the difference is 15.
d. The sample standard deviation of the difference is 40.

10.2 You have been asked to determine if two different production processes have different mean numbers of units produced per hour. Process 1 has a mean defined as $\mu_1$ and process 2 has a mean defined as $\mu_2$. The null and alternative hypotheses are as follows:
$H_0: \mu_1 - \mu_2 \ge 0$
$H_1: \mu_1 - \mu_2 < 0$
Using a random sample of 25 paired observations, the standard deviation of the difference between sample means is 25. Can you reject the null hypothesis, using a probability of Type I error α = 0.05, in each case?
a. The sample means are 56 and 50.
b. The sample means are 59 and 50.
c. The sample means are 56 and 48.
d. The sample means are 54 and 50.

Application Exercises

10.3 In a study comparing banks in Germany and Great Britain, a sample of 145 matched pairs of banks was formed. Each pair contained one bank from Germany and one from Great Britain, and the pairings were made in such a way that the two members were as similar as possible in regard to such factors as size and age. The ratio of total loans outstanding to total assets was calculated for each of the banks. For this ratio, the sample mean difference (German − Great Britain) was 0.0518, and the sample standard deviation of the differences was 0.3055. Test, against a two-sided alternative, the null hypothesis that the two population means are equal.

10.4 You have been asked to conduct a national study of urban home selling prices to determine if there has been an increase in selling prices over time. There has been some concern that housing prices in major urban areas have not kept up with inflation over time. Your study will use data collected from Atlanta, Chicago, Dallas, and Oakland, which is contained in the data file House Selling Price. Formulate an appropriate hypothesis test and use your statistical computer package to compute the appropriate statistics for analysis. Perform the hypothesis test and indicate your conclusion. Repeat the analysis using data from only the city of Atlanta.

10.5 An agency offers preparation courses for a graduate school admissions test to students. As part of an experiment to evaluate the merits of the course, 12 students were chosen and divided into 6 pairs in such a way that the members of any pair had similar academic records. Before taking the test, one member of each pair was assigned at random to take the preparation course, while the other member did not take a course. The achievement test scores are contained in the Student Pair data file. Assuming that the differences in scores follow a normal distribution, test, at the 5% level, the null hypothesis that the two population means are equal against the alternative that the true mean is higher for students taking the preparation course.

10.2 TESTS OF THE DIFFERENCE BETWEEN TWO NORMAL POPULATION MEANS: INDEPENDENT SAMPLES

Two Means, Independent Samples, Known Population Variances

Now we consider the case where we have independent random samples from two normally distributed populations. The first population has a mean of $\mu_X$ and a variance of $\sigma_X^2$, and we obtain a random sample of size $n_x$; the second population has a mean of $\mu_Y$ and a variance of $\sigma_Y^2$, and we obtain a random sample of size $n_y$. In Section 8.2 we showed that if the sample means are denoted by $\bar{x}$ and $\bar{y}$, then the random variable

$Z = \dfrac{(\bar{x} - \bar{y}) - (\mu_X - \mu_Y)}{\sqrt{\dfrac{\sigma_X^2}{n_x} + \dfrac{\sigma_Y^2}{n_y}}}$

has a standard normal distribution. If the two population variances are known, tests of the difference between the population means can be based on this result, using the same arguments as before. Generally, we are comfortable using known population variances if the process being studied has been stable over some time and we have obtained similar variance measurements over this time.
Because of the central limit theorem, the results presented here also hold for large sample sizes even if the populations are not normal; for large sample sizes the approximation is quite satisfactory when sample variances are used for population variances. The appropriate tests are summarized in Equations 10.4, 10.5, and 10.6.

Tests of the Difference Between Population Means: Independent Samples (Known Variances)

Suppose that we have independent random samples of $n_x$ and $n_y$ observations from normal distributions with means $\mu_X$ and $\mu_Y$ and variances $\sigma_X^2$ and $\sigma_Y^2$, respectively. If the observed sample means are $\bar{x}$ and $\bar{y}$, then the following tests have significance level $\alpha$:

1. To test either null hypothesis
$H_0: \mu_X - \mu_Y = 0$ or $H_0: \mu_X - \mu_Y \le 0$
against the alternative
$H_1: \mu_X - \mu_Y > 0$
the decision rule is as follows:
reject $H_0$ if $\dfrac{\bar{x} - \bar{y}}{\sqrt{\sigma_X^2/n_x + \sigma_Y^2/n_y}} > z_{\alpha}$   (10.4)

2. To test either null hypothesis
$H_0: \mu_X - \mu_Y = 0$ or $H_0: \mu_X - \mu_Y \ge 0$
against the alternative
$H_1: \mu_X - \mu_Y < 0$
the decision rule is as follows:
reject $H_0$ if $\dfrac{\bar{x} - \bar{y}}{\sqrt{\sigma_X^2/n_x + \sigma_Y^2/n_y}} < -z_{\alpha}$   (10.5)

3. To test the null hypothesis
$H_0: \mu_X - \mu_Y = 0$
against the two-sided alternative
$H_1: \mu_X - \mu_Y \ne 0$
the decision rule is as follows:
reject $H_0$ if $\dfrac{\bar{x} - \bar{y}}{\sqrt{\sigma_X^2/n_x + \sigma_Y^2/n_y}} < -z_{\alpha/2}$ or $\dfrac{\bar{x} - \bar{y}}{\sqrt{\sigma_X^2/n_x + \sigma_Y^2/n_y}} > z_{\alpha/2}$   (10.6)

If the sample sizes are large (n > 100), then a good approximation at significance level $\alpha$ can be made if we replace the population variances with the sample variances. In addition, the central limit theorem leads to good approximations even if the populations are not normally distributed. The p-values for all these tests are interpreted as the probability of getting a value at least as extreme as the one obtained, given the null hypothesis.
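As a quick illustration (ours, assuming NumPy is not needed and SciPy is installed; the numbers passed in are hypothetical), the function below evaluates the statistic used in Equations 10.4–10.6 and applies the upper-tail rule of Equation 10.4; the other alternatives follow the same pattern.

```python
from math import sqrt
from scipy.stats import norm

def two_mean_z(xbar, ybar, var_x, var_y, nx, ny):
    """z statistic for the difference of two means with known population variances."""
    return (xbar - ybar) / sqrt(var_x / nx + var_y / ny)

# Hypothetical illustration of Equation 10.4 (upper-tail test at alpha = 0.05):
z = two_mean_z(xbar=52.5, ybar=50.0, var_x=25, var_y=36, nx=40, ny=36)
p_value = norm.sf(z)               # upper-tail p-value
reject = z > norm.ppf(1 - 0.05)    # compare with z_alpha = 1.645
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```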
Example 10.2 Comparison of Alternative Fertilizers (Hypothesis Test for Differences Between Means)

Shirley Brown, an agricultural economist, wants to compare cow manure and turkey dung as fertilizers. Historically, farmers had used cow manure on their cornfields. Recently, a major turkey farmer offered to sell composted turkey dung at a favorable price, and the farmers decided that they would use this new fertilizer only if there was strong evidence that productivity increased over the productivity that occurred with cow manure. Shirley was asked to conduct the research and statistical analysis in order to develop a recommendation to the farmers.

Solution  To begin the study, Shirley specified a hypothesis test with

$H_0: \mu_X - \mu_Y \le 0$

versus the alternative

$H_1: \mu_X - \mu_Y > 0$

where $\mu_X$ is the population mean productivity using turkey dung and $\mu_Y$ is the population mean productivity using cow manure. $H_1$ indicates that turkey dung results in higher productivity; the farmers will not change their fertilizer unless there is strong evidence in favor of increased productivity. She decided before collecting the data that a significance level of α = 0.05 would be used for this test.

Using this design, Shirley implemented an experiment to test the hypothesis. Cow manure was applied to one set of $n_y = 25$ randomly selected fields, and the sample mean productivity was $\bar{y} = 100$. From past experience, the variance in productivity for these fields was assumed to be $\sigma_Y^2 = 400$. Turkey dung was applied to a second random sample of $n_x = 25$ fields, and the sample mean productivity was $\bar{x} = 115$. Based on published research reports, the variance for these fields was assumed to be $\sigma_X^2 = 625$. The two sets of random samples were independent. The decision rule is to reject $H_0$ in favor of $H_1$ if

$\dfrac{\bar{x} - \bar{y}}{\sqrt{\sigma_X^2/n_x + \sigma_Y^2/n_y}} > z_{\alpha}$

The computed statistics for this problem are as follows:

$n_x = 25$, $\bar{x} = 115$, $\sigma_X^2 = 625$
$n_y = 25$, $\bar{y} = 100$, $\sigma_Y^2 = 400$

$z = \dfrac{115 - 100}{\sqrt{\dfrac{625}{25} + \dfrac{400}{25}}} = 2.34$

Comparing the computed value z = 2.34 with $z_{0.05} = 1.645$, Shirley concluded that the null hypothesis is clearly rejected. In fact, the p-value for this test is 0.0096. As a result, there is overwhelming evidence that turkey dung results in higher productivity than cow manure.

Two Means, Independent Samples, Unknown Population Variances Assumed to Be Equal

In those cases where the population variances are not known and the sample sizes are under 100, we need to use the Student's t distribution. There are some theoretical problems when we use the Student's t distribution for differences between sample means, but these problems can be solved using the procedure that follows if we can assume that the population variances are equal. This assumption is realistic in many cases where we are comparing groups. In Section 10.4 we present a procedure for testing the equality of variances from two normal populations. The major difference is that this procedure uses a common pooled estimator of the equal population variance. This estimator is as follows:

$s_p^2 = \dfrac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$

The number of degrees of freedom for $s_p^2$ and for the Student's t statistic below is $n_x + n_y - 2$. The hypothesis test is performed using the Student's t statistic for the difference between two means:

$t = \dfrac{(\bar{x} - \bar{y}) - (\mu_X - \mu_Y)}{\sqrt{\dfrac{s_p^2}{n_x} + \dfrac{s_p^2}{n_y}}}$

Note that the form of the test statistic is similar to that of the Z statistic, which is used when the population variances are known. The various tests using this procedure are summarized next.

Tests of the Difference Between Population Means: Population Variances Unknown and Equal

In these tests it is assumed that we have independent random samples of $n_x$ and $n_y$ observations drawn from normally distributed populations with means $\mu_X$ and $\mu_Y$ and a common variance. The sample variances $s_x^2$ and $s_y^2$ are used to compute a pooled variance estimator:

$s_p^2 = \dfrac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$   (10.7)

We emphasize here that $s_p^2$ is the weighted average of the two sample variances, $s_x^2$ and $s_y^2$. Then, using the observed sample means $\bar{x}$ and $\bar{y}$, the following tests have significance level $\alpha$:

1. To test either null hypothesis
$H_0: \mu_X - \mu_Y = 0$ or $H_0: \mu_X - \mu_Y \le 0$
against the alternative
$H_1: \mu_X - \mu_Y > 0$
the decision rule is as follows:
reject $H_0$ if $\dfrac{\bar{x} - \bar{y}}{\sqrt{s_p^2/n_x + s_p^2/n_y}} > t_{n_x+n_y-2,\alpha}$   (10.8)

2. To test either null hypothesis
$H_0: \mu_X - \mu_Y = 0$ or $H_0: \mu_X - \mu_Y \ge 0$
against the alternative
$H_1: \mu_X - \mu_Y < 0$
the decision rule is as follows:
reject $H_0$ if $\dfrac{\bar{x} - \bar{y}}{\sqrt{s_p^2/n_x + s_p^2/n_y}} < -t_{n_x+n_y-2,\alpha}$   (10.9)

3. To test the null hypothesis
$H_0: \mu_X - \mu_Y = 0$
against the two-sided alternative
$H_1: \mu_X - \mu_Y \ne 0$
the decision rule is as follows:
reject $H_0$ if $\dfrac{\bar{x} - \bar{y}}{\sqrt{s_p^2/n_x + s_p^2/n_y}} < -t_{n_x+n_y-2,\alpha/2}$ or $\dfrac{\bar{x} - \bar{y}}{\sqrt{s_p^2/n_x + s_p^2/n_y}} > t_{n_x+n_y-2,\alpha/2}$   (10.10)
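The pooled procedure in Equations 10.7–10.10 is what most statistics libraries call a two-sample t test with equal variances. A minimal sketch follows (ours, assuming NumPy and SciPy; the data arrays are hypothetical); scipy.stats.ttest_ind with equal_var=True applies exactly this pooled-variance statistic.

```python
import numpy as np
from scipy import stats

# Hypothetical samples from two production processes
x = np.array([102.1, 98.4, 101.7, 99.9, 103.2, 100.8, 97.6, 102.5])
y = np.array([ 99.0, 97.2, 100.1, 96.8,  98.7,  99.5, 95.9,  98.2])

nx, ny = x.size, y.size
sp2 = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)  # Eq. 10.7
t_stat = (x.mean() - y.mean()) / np.sqrt(sp2 / nx + sp2 / ny)

# The same statistic via SciPy's pooled-variance two-sample t test:
t_check, p_two_sided = stats.ttest_ind(x, y, equal_var=True)
assert np.isclose(t_stat, t_check)
print(f"t = {t_stat:.3f}, two-sided p = {p_two_sided:.4f}, df = {nx + ny - 2}")
```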
APPENDIX TABLES

Table 12 Cutoff Points for the Distribution of the Durbin-Watson Test Statistic (continued): α = 0.01

For each number of explanatory variables K (five column pairs), the table gives the lower and upper bounds dL and dU for sample size n.

       K = 1        K = 2        K = 3        K = 4        K = 5
 n     dL    dU     dL    dU     dL    dU     dL    dU     dL    dU
 15   0.81  1.07   0.70  1.25   0.59  1.46   0.49  1.70   0.39  1.96
 16   0.84  1.09   0.74  1.25   0.63  1.44   0.53  1.66   0.44  1.90
 17   0.87  1.10   0.77  1.25   0.67  1.43   0.57  1.63   0.48  1.85
 18   0.90  1.12   0.80  1.26   0.71  1.42   0.61  1.60   0.52  1.80
 19   0.93  1.13   0.83  1.26   0.74  1.41   0.65  1.58   0.56  1.77
 20   0.95  1.15   0.86  1.27   0.77  1.41   0.68  1.57   0.60  1.74
 21   0.97  1.16   0.89  1.27   0.80  1.41   0.72  1.55   0.63  1.71
 22   1.00  1.17   0.91  1.28   0.83  1.40   0.75  1.54   0.66  1.69
 23   1.02  1.19   0.94  1.29   0.86  1.40   0.77  1.53   0.70  1.67
 24   1.04  1.20   0.96  1.30   0.88  1.41   0.80  1.53   0.72  1.66
 25   1.05  1.21   0.98  1.30   0.90  1.41   0.83  1.52   0.75  1.65
 26   1.07  1.22   1.00  1.31   0.93  1.41   0.85  1.52   0.78  1.64
 27   1.09  1.23   1.02  1.32   0.95  1.41   0.88  1.51   0.81  1.63
 28   1.10  1.24   1.04  1.32   0.97  1.41   0.90  1.51   0.83  1.62
 29   1.12  1.25   1.05  1.33   0.99  1.42   0.92  1.51   0.85  1.61
 30   1.13  1.26   1.07  1.34   1.01  1.42   0.94  1.51   0.88  1.61
 31   1.15  1.27   1.08  1.34   1.02  1.42   0.96  1.51   0.90  1.60
 32   1.16  1.28   1.10  1.35   1.04  1.43   0.98  1.51   0.92  1.60
 33   1.17  1.29   1.11  1.36   1.05  1.43   1.00  1.51   0.94  1.59
 34   1.18  1.30   1.13  1.36   1.07  1.43   1.01  1.51   0.95  1.59
 35   1.19  1.31   1.14  1.37   1.08  1.44   1.03  1.51   0.97  1.59
 36   1.21  1.32   1.15  1.38   1.10  1.44   1.04  1.51   0.99  1.59
 37   1.22  1.32   1.16  1.38   1.11  1.45   1.06  1.51   1.00  1.59
 38   1.23  1.33   1.18  1.39   1.12  1.45   1.07  1.52   1.02  1.58
 39   1.24  1.34   1.19  1.39   1.14  1.45   1.09  1.52   1.03  1.58
 40   1.25  1.34   1.20  1.40   1.15  1.46   1.10  1.52   1.05  1.58
 45   1.29  1.38   1.24  1.42   1.20  1.48   1.16  1.53   1.11  1.58
 50   1.32  1.40   1.28  1.45   1.24  1.49   1.20  1.54   1.16  1.59
 55   1.36  1.43   1.32  1.47   1.28  1.51   1.25  1.55   1.21  1.59
 60   1.38  1.45   1.35  1.48   1.32  1.52   1.28  1.56   1.25  1.60
 65   1.41  1.47   1.38  1.50   1.35  1.53   1.31  1.57   1.28  1.61
 70   1.43  1.49   1.40  1.52   1.37  1.55   1.34  1.58   1.31  1.61
 75   1.45  1.50   1.42  1.53   1.39  1.56   1.37  1.59   1.34  1.62
 80   1.47  1.52   1.44  1.54   1.42  1.57   1.39  1.60   1.36  1.62
 85   1.48  1.53   1.46  1.55   1.43  1.58   1.41  1.60   1.39  1.63
 90   1.50  1.54   1.47  1.56   1.45  1.59   1.43  1.61   1.41  1.64
 95   1.51  1.55   1.49  1.57   1.47  1.60   1.45  1.62   1.42  1.64
100   1.52  1.56   1.50  1.58   1.48  1.60   1.46  1.63   1.44  1.65

Computed from TSP 4.5 based on R. W. Farebrother, "A Remark on Algorithms AS106, AS153, and AS155: The Distribution of a Linear Combination of Chi-Square Random Variables," Journal of the Royal Statistical Society, Series C (Applied Statistics), 1984, 29, pp. 323–333.
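Table 12 is used by comparing a computed Durbin-Watson statistic d with the bounds dL and dU for the appropriate n and K: d < dL signals positive autocorrelation, d > dU signals no evidence of it, and values in between are inconclusive. A short sketch of this workflow follows (ours; it assumes statsmodels and NumPy are installed, and the regression data are simulated).

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(42)

# Simulated regression with one explanatory variable (K = 1, n = 30)
x = rng.normal(size=30)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=30)

model = sm.OLS(y, sm.add_constant(x)).fit()
d = durbin_watson(model.resid)

# Bounds from the table above for n = 30, K = 1: dL = 1.13, dU = 1.26
dL, dU = 1.13, 1.26
if d < dL:
    verdict = "reject H0: evidence of positive autocorrelation"
elif d > dU:
    verdict = "no evidence of positive autocorrelation"
else:
    verdict = "test inconclusive"
print(f"Durbin-Watson d = {d:.2f} -> {verdict}")
```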
Table 13 Critical Values of Studentized Range Q

The Studentized Range Upper Quantiles Q(k, df; 0.05). Columns are k = 2, ..., 20; rows are degrees of freedom df.

df    k=2 ... k=20
1   17.969 26.976 32.819 37.082 40.408 43.119 45.397 47.357 49.071 50.592 51.957 53.194 54.323 55.361 56.320 57.212 58.044 58.824 59.558
2   6.085 8.331 9.798 10.881 11.734 12.435 13.027 13.539 13.988 14.389 14.749 15.076 15.375 15.650 15.905 16.143 16.365 16.573 16.769
3   4.501 5.910 6.825 7.502 8.037 8.478 8.852 9.177 9.462 9.717 9.946 10.155 10.346 10.522 10.686 10.838 10.980 11.114 11.240
4   3.926 5.040 5.757 6.287 6.706 7.053 7.347 7.602 7.826 8.027 8.208 8.373 8.524 8.664 8.793 8.914 9.027 9.133 9.233
5   3.635 4.602 5.218 5.673 6.033 6.330 6.582 6.801 6.995 7.167 7.323 7.466 7.596 7.716 7.828 7.932 8.030 8.122 8.208
6   3.460 4.339 4.896 5.305 5.628 5.895 6.122 6.319 6.493 6.649 6.789 6.917 7.034 7.143 7.244 7.338 7.426 7.508 7.586
7   3.344 4.165 4.681 5.060 5.359 5.606 5.815 5.997 6.158 6.302 6.431 6.550 6.658 6.759 6.852 6.939 7.020 7.097 7.169
8   3.261 4.041 4.529 4.886 5.167 5.399 5.596 5.767 5.918 6.053 6.175 6.287 6.389 6.483 6.571 6.653 6.729 6.801 6.869
9   3.199 3.948 4.415 4.755 5.024 5.244 5.432 5.595 5.738 5.867 5.983 6.089 6.186 6.276 6.359 6.437 6.510 6.579 6.643
10  3.151 3.877 4.327 4.654 4.912 5.124 5.304 5.460 5.598 5.722 5.833 5.935 6.028 6.114 6.194 6.269 6.339 6.405 6.467
11  3.113 3.820 4.256 4.574 4.823 5.028 5.202 5.353 5.486 5.605 5.713 5.811 5.901 5.984 6.062 6.134 6.202 6.265 6.325
12  3.081 3.773 4.199 4.508 4.750 4.950 5.119 5.265 5.395 5.510 5.615 5.710 5.797 5.878 5.953 6.023 6.089 6.151 6.209
13  3.055 3.734 4.151 4.453 4.690 4.884 5.049 5.192 5.318 5.431 5.533 5.625 5.711 5.789 5.862 5.931 5.995 6.055 6.112
14  3.033 3.701 4.111 4.407 4.639 4.829 4.990 5.130 5.253 5.364 5.463 5.554 5.637 5.714 5.785 5.852 5.915 5.973 6.029
15  3.014 3.673 4.076 4.367 4.595 4.782 4.940 5.077 5.198 5.306 5.403 5.492 5.574 5.649 5.719 5.785 5.846 5.904 5.958
16  2.998 3.649 4.046 4.333 4.557 4.741 4.896 5.031 5.150 5.256 5.352 5.439 5.519 5.593 5.662 5.726 5.786 5.843 5.896
17  2.984 3.628 4.020 4.303 4.524 4.705 4.858 4.991 5.108 5.212 5.306 5.392 5.471 5.544 5.612 5.675 5.734 5.790 5.842
18  2.971 3.609 3.997 4.276 4.494 4.673 4.824 4.955 5.071 5.173 5.266 5.351 5.429 5.501 5.567 5.629 5.688 5.743 5.794
19  2.960 3.593 3.977 4.253 4.468 4.645 4.794 4.924 5.037 5.139 5.231 5.314 5.391 5.462 5.528 5.589 5.647 5.701 5.752
20  2.950 3.578 3.958 4.232 4.445 4.620 4.768 4.895 5.008 5.108 5.199 5.282 5.357 5.427 5.492 5.553 5.610 5.663 5.714
21  2.941 3.565 3.942 4.213 4.424 4.597 4.743 4.870 4.981 5.081 5.170 5.252 5.327 5.396 5.460 5.520 5.576 5.629 5.679
22  2.933 3.553 3.927 4.196 4.405 4.577 4.722 4.847 4.957 5.056 5.144 5.225 5.299 5.368 5.431 5.491 5.546 5.599 5.648
23  2.926 3.542 3.914 4.180 4.388 4.558 4.702 4.826 4.935 5.033 5.121 5.201 5.274 5.342 5.405 5.464 5.519 5.571 5.620
24  2.919 3.532 3.901 4.166 4.373 4.541 4.684 4.807 4.915 5.012 5.099 5.179 5.251 5.319 5.381 5.439 5.494 5.545 5.594
25  2.913 3.523 3.890 4.153 4.358 4.526 4.667 4.789 4.897 4.993 5.079 5.158 5.230 5.297 5.359 5.417 5.471 5.522 5.570
26  2.907 3.514 3.880 4.141 4.345 4.511 4.652 4.773 4.880 4.975 5.061 5.139 5.211 5.277 5.339 5.396 5.450 5.500 5.548
27  2.902 3.506 3.870 4.130 4.333 4.498 4.638 4.758 4.864 4.959 5.044 5.122 5.193 5.259 5.320 5.377 5.430 5.480 5.528
28  2.897 3.499 3.861 4.120 4.322 4.486 4.625 4.745 4.850 4.944 5.029 5.106 5.177 5.242 5.302 5.359 5.412 5.462 5.509
29  2.892 3.493 3.853 4.111 4.311 4.475 4.613 4.732 4.837 4.930 5.014 5.091 5.161 5.226 5.286 5.342 5.395 5.445 5.491
30  2.888 3.486 3.845 4.102 4.301 4.464 4.601 4.720 4.824 4.917 5.001 5.077 5.147 5.211 5.271 5.327 5.379 5.429 5.475
31  2.884 3.481 3.838 4.094 4.292 4.454 4.591 4.709 4.812 4.905 4.988 5.064 5.134 5.198 5.257 5.313 5.365 5.414 5.460
32  2.881 3.475 3.832 4.086 4.284 4.445 4.581 4.698 4.802 4.894 4.976 5.052 5.121 5.185 5.244 5.299 5.351 5.400 5.445
33  2.877 3.470 3.825 4.079 4.276 4.436 4.572 4.689 4.791 4.883 4.965 5.040 5.109 5.173 5.232 5.287 5.338 5.386 5.432
34  2.874 3.465 3.820 4.072 4.268 4.428 4.563 4.680 4.782 4.873 4.955 5.030 5.098 5.161 5.220 5.275 5.326 5.374 5.420
35  2.871 3.461 3.814 4.066 4.261 4.421 4.555 4.671 4.773 4.863 4.945 5.020 5.088 5.151 5.209 5.264 5.315 5.362 5.408
36  2.868 3.457 3.809 4.060 4.255 4.414 4.547 4.663 4.764 4.855 4.936 5.010 5.078 5.141 5.199 5.253 5.304 5.352 5.397
37  2.865 3.453 3.804 4.054 4.249 4.407 4.540 4.655 4.756 4.846 4.927 5.001 5.069 5.131 5.189 5.243 5.294 5.341 5.386
38  2.863 3.449 3.799 4.049 4.243 4.400 4.533 4.648 4.749 4.838 4.919 4.993 5.060 5.122 5.180 5.234 5.284 5.331 5.376
39  2.861 3.445 3.795 4.044 4.237 4.394 4.527 4.641 4.741 4.831 4.911 4.985 5.052 5.114 5.171 5.225 5.275 5.322 5.367
40  2.858 3.442 3.791 4.039 4.232 4.388 4.521 4.634 4.735 4.824 4.904 4.977 5.044 5.106 5.163 5.216 5.266 5.313 5.358
48  2.843 3.420 3.764 4.008 4.197 4.351 4.481 4.592 4.690 4.777 4.856 4.927 4.993 5.053 5.109 5.161 5.210 5.256 5.299
60  2.829 3.399 3.737 3.977 4.163 4.314 4.441 4.550 4.646 4.732 4.808 4.878 4.942 5.001 5.056 5.107 5.154 5.199 5.241
80  2.814 3.377 3.711 3.947 4.129 4.277 4.402 4.509 4.603 4.686 4.761 4.829 4.892 4.949 5.003 5.052 5.099 5.142 5.183
120 2.800 3.356 3.685 3.917 4.096 4.241 4.363 4.468 4.560 4.641 4.714 4.781 4.842 4.898 4.950 4.998 5.043 5.086 5.126
240 2.786 3.335 3.659 3.887 4.063 4.205 4.324 4.427 4.517 4.596 4.668 4.733 4.792 4.847 4.897 4.944 4.988 5.030 5.069
Inf 2.772 3.314 3.633 3.858 4.030 4.170 4.286 4.387 4.474 4.552 4.622 4.685 4.743 4.796 4.845 4.891 4.934 4.974 5.012

The Studentized Range Upper Quantiles Q(k, df; 0.01). Columns are k = 2, ..., 20; rows are degrees of freedom df.

df    k=2 ... k=20
1   90.024 135.041 164.258 185.575 202.210 215.769 227.166 236.966 245.542 253.151 259.979 266.165 271.812 277.003 281.803 286.263 290.426 294.328 297.997
2   14.036 19.019 22.294 24.717 26.629 28.201 29.530 30.679 31.689 32.589 33.398 34.134 34.806 35.426 36.000 36.534 37.034 37.502 37.943
3   8.260 10.619 12.170 13.324 14.241 14.998 15.641 16.199 16.691 17.130 17.526 17.887 18.217 18.522 18.805 19.068 19.315 19.546 19.765
4   6.511 8.120 9.173 9.958 10.583 11.101 11.542 11.925 12.264 12.567 12.840 13.090 13.318 13.530 13.726 13.909 14.081 14.242 14.394
5   5.702 6.976 7.804 8.421 8.913 9.321 9.669 9.971 10.239 10.479 10.696 10.894 11.076 11.244 11.400 11.545 11.682 11.811 11.932
6   5.243 6.331 7.033 7.556 7.972 8.318 8.612 8.869 9.097 9.300 9.485 9.653 9.808 9.951 10.084 10.208 10.325 10.434 10.538
7   4.949 5.919 6.542 7.005 7.373 7.678 7.939 8.166 8.367 8.548 8.711 8.860 8.997 9.124 9.242 9.353 9.456 9.553 9.645
8   4.745 5.635 6.204 6.625 6.959 7.237 7.474 7.680 7.863 8.027 8.176 8.311 8.436 8.552 8.659 8.760 8.854 8.943 9.027
9   4.596 5.428 5.957 6.347 6.657 6.915 7.134 7.325 7.494 7.646 7.784 7.910 8.025 8.132 8.232 8.325 8.412 8.495 8.573
10  4.482 5.270 5.769 6.136 6.428 6.669 6.875 7.054 7.213 7.356 7.485 7.603 7.712 7.812 7.906 7.993 8.075 8.153 8.226
11  4.392 5.146 5.621 5.970 6.247 6.476 6.671 6.841 6.992 7.127 7.250 7.362 7.464 7.560 7.648 7.731 7.809 7.883 7.952
12  4.320 5.046 5.502 5.836 6.101 6.320 6.507 6.670 6.814 6.943 7.060 7.166 7.265 7.356 7.441 7.520 7.594 7.664 7.730
13  4.260 4.964 5.404 5.726 5.981 6.192 6.372 6.528 6.666 6.791 6.903 7.006 7.100 7.188 7.269 7.345 7.417 7.484 7.548
14  4.210 4.895 5.322 5.634 5.881 6.085 6.258 6.409 6.543 6.663 6.772 6.871 6.962 7.047 7.125 7.199 7.268 7.333 7.394
15  4.167 4.836 5.252 5.556 5.796 5.994 6.162 6.309 6.438 6.555 6.660 6.756 6.845 6.927 7.003 7.074 7.141 7.204 7.264
16  4.131 4.786 5.192 5.489 5.722 5.915 6.079 6.222 6.348 6.461 6.564 6.658 6.744 6.823 6.897 6.967 7.032 7.093 7.151
17  4.099 4.742 5.140 5.430 5.659 5.847 6.007 6.147 6.270 6.380 6.480 6.572 6.656 6.733 6.806 6.873 6.937 6.997 7.053
18  4.071 4.703 5.094 5.379 5.603 5.787 5.944 6.081 6.201 6.309 6.407 6.496 6.579 6.655 6.725 6.791 6.854 6.912 6.967
19  4.046 4.669 5.054 5.334 5.553 5.735 5.889 6.022 6.141 6.246 6.342 6.430 6.510 6.585 6.654 6.719 6.780 6.837 6.891
20  4.024 4.639 5.018 5.293 5.510 5.688 5.839 5.970 6.086 6.190 6.285 6.370 6.449 6.523 6.591 6.654 6.714 6.770 6.823
21  4.004 4.612 4.986 5.257 5.470 5.646 5.794 5.924 6.038 6.140 6.233 6.317 6.395 6.467 6.534 6.596 6.655 6.710 6.762
22  3.986 4.588 4.957 5.225 5.435 5.608 5.754 5.882 5.994 6.095 6.186 6.269 6.346 6.417 6.482 6.544 6.602 6.656 6.707
23  3.970 4.566 4.931 5.195 5.403 5.573 5.718 5.844 5.955 6.054 6.144 6.226 6.301 6.371 6.436 6.497 6.553 6.607 6.658
24  3.955 4.546 4.907 5.168 5.373 5.542 5.685 5.809 5.919 6.017 6.105 6.186 6.261 6.330 6.394 6.453 6.510 6.562 6.612
25  3.942 4.527 4.885 5.144 5.347 5.513 5.655 5.778 5.886 5.983 6.070 6.150 6.224 6.292 6.355 6.414 6.469 6.522 6.571
26  3.930 4.510 4.865 5.121 5.322 5.487 5.627 5.749 5.856 5.951 6.038 6.117 6.190 6.257 6.319 6.378 6.432 6.484 6.533
27  3.918 4.495 4.847 5.101 5.300 5.463 5.602 5.722 5.828 5.923 6.008 6.087 6.158 6.225 6.287 6.344 6.399 6.450 6.498
28  3.908 4.481 4.830 5.082 5.279 5.441 5.578 5.697 5.802 5.896 5.981 6.058 6.129 6.195 6.256 6.314 6.367 6.418 6.465
29  3.898 4.467 4.814 5.064 5.260 5.420 5.556 5.674 5.778 5.871 5.955 6.032 6.103 6.168 6.228 6.285 6.338 6.388 6.435
30  3.889 4.455 4.799 5.048 5.242 5.401 5.536 5.653 5.756 5.848 5.932 6.008 6.078 6.142 6.202 6.258 6.311 6.361 6.407
31  3.881 4.443 4.786 5.032 5.225 5.383 5.517 5.633 5.736 5.827 5.910 5.985 6.055 6.119 6.178 6.234 6.286 6.335 6.381
32  3.873 4.433 4.773 5.018 5.210 5.367 5.500 5.615 5.716 5.807 5.889 5.964 6.033 6.096 6.155 6.211 6.262 6.311 6.357
33  3.865 4.423 4.761 5.005 5.195 5.351 5.483 5.598 5.698 5.789 5.870 5.944 6.013 6.076 6.134 6.189 6.240 6.289 6.334
34  3.859 4.413 4.750 4.992 5.181 5.336 5.468 5.581 5.682 5.771 5.852 5.926 5.994 6.056 6.114 6.169 6.220 6.268 6.313
35  3.852 4.404 4.739 4.980 5.169 5.323 5.453 5.566 5.666 5.755 5.835 5.908 5.976 6.038 6.096 6.150 6.200 6.248 6.293
36  3.846 4.396 4.729 4.969 5.156 5.310 5.439 5.552 5.651 5.739 5.819 5.892 5.959 6.021 6.078 6.132 6.182 6.229 6.274
37  3.840 4.388 4.720 4.959 5.145 5.298 5.427 5.538 5.637 5.725 5.804 5.876 5.943 6.004 6.061 6.115 6.165 6.212 6.256
38  3.835 4.381 4.711 4.949 5.134 5.286 5.414 5.526 5.623 5.711 5.790 5.862 5.928 5.989 6.046 6.099 6.148 6.195 6.239
39  3.830 4.374 4.703 4.940 5.124 5.275 5.403 5.513 5.611 5.698 5.776 5.848 5.914 5.974 6.031 6.084 6.133 6.179 6.223
40  3.825 4.367 4.695 4.931 5.114 5.265 5.392 5.502 5.599 5.685 5.764 5.835 5.900 5.961 6.017 6.069 6.118 6.165 6.208
48  3.793 4.324 4.644 4.874 5.052 5.198 5.322 5.428 5.522 5.606 5.681 5.750 5.814 5.872 5.926 5.977 6.024 6.069 6.111
60  3.762 4.282 4.594 4.818 4.991 5.133 5.253 5.356 5.447 5.528 5.601 5.667 5.728 5.784 5.837 5.886 5.931 5.974 6.015
80  3.732 4.241 4.545 4.763 4.931 5.069 5.185 5.284 5.372 5.451 5.521 5.585 5.644 5.698 5.749 5.796 5.840 5.881 5.920
120 3.702 4.200 4.497 4.709 4.872 5.005 5.118 5.214 5.299 5.375 5.443 5.505 5.561 5.614 5.662 5.708 5.750 5.790 5.827
240 3.672 4.160 4.450 4.655 4.814 4.943 5.052 5.145 5.227 5.300 5.366 5.426 5.480 5.530 5.577 5.621 5.661 5.699 5.735
Inf 3.643 4.120 4.403 4.603 4.757 4.882 4.987 5.078 5.157 5.227 5.290 5.348 5.400 5.448 5.493 5.535 5.574 5.611 5.645

Source: cse.niaes.affrc.go.jp/miwa/probcalc/s-range/srng_tbl.html
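Values like those in Table 13 can be cross-checked numerically. The following sketch is ours; it assumes SciPy 1.7 or later, whose scipy.stats.studentized_range distribution provides these quantiles.

```python
from scipy.stats import studentized_range

# Q(k = 3, df = 10; 0.05): Table 13 gives 3.877
q_05 = studentized_range.ppf(0.95, k=3, df=10)

# Q(k = 5, df = 20; 0.01): Table 13 gives 5.293
q_01 = studentized_range.ppf(0.99, k=5, df=20)

print(f"Q(3, 10; 0.05) = {q_05:.3f}, Q(5, 20; 0.01) = {q_01:.3f}")
```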
Table 14 Cumulative Distribution Function of the Runs Test Statistic

For a given number n of observations, the table shows the probability, for a random time series, that the number of runs will not exceed K. (Entries lost in this copy for n = 14 are left blank.)

n \ K    2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20
 6    .100  .300  .700  .900 1.000
 8    .029  .114  .371  .629  .886  .971 1.000
10    .008  .040  .167  .357  .643  .833  .960  .992 1.000
12    .002  .013  .067  .175  .392  .608  .825  .933  .987  .998 1.000
14                                                                .999 1.000
16    .000  .001  .009  .032  .100  .214  .405  .595  .786  .900  .968  .991  .999 1.000 1.000
18    .000  .000  .003  .012  .044  .109  .238  .399  .601  .762  .891  .956  .988  .997 1.000 1.000 1.000
20    .000  .000  .001  .004  .019  .051  .128  .242  .414  .586  .758  .872  .949  .981  .996  .999 1.000 1.000 1.000

Reproduced with permission from F. Swed and C. Eisenhart, "Tables for testing randomness of grouping in a sequence of alternatives," Annals of Mathematical Statistics 14 (1943).
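Table 14 applies to a series classified into two types, for example observations above and below the sample median, with n/2 of each type. The exact null distribution of the number of runs has a simple closed form, so the table can be reproduced directly. The sketch below is ours (plain Python, standard library only) and checks two of the n = 8 entries.

```python
from math import comb

def runs_cdf(m, n, max_runs):
    """P(number of runs <= max_runs) when a sequence contains m symbols of one
    type and n of the other, with all orderings equally likely."""
    total = comb(m + n, m)
    prob = 0.0
    for r in range(2, max_runs + 1):
        k = r // 2
        if r % 2 == 0:   # even number of runs: k runs of each type
            ways = 2 * comb(m - 1, k - 1) * comb(n - 1, k - 1)
        else:            # odd: one type has one more run than the other
            ways = comb(m - 1, k) * comb(n - 1, k - 1) + comb(m - 1, k - 1) * comb(n - 1, k)
        prob += ways / total
    return prob

# Cross-check against the n = 8 row of Table 14 (m = 4 above, 4 below the median):
print(round(runs_cdf(4, 4, 2), 3))  # 0.029
print(round(runs_cdf(4, 4, 3), 3))  # 0.114
```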
distribution, 375 Chi-square random variable, 605 for contingency tables, 615 Chi-square test examples of, 606–608 of variance of a normal distribution, 375–376 Classical probability, 101–102, 105 Cluster bar charts, 30, 52–53 Cluster sampling estimators for, 729–732 explanation of, 729 Cobb-Douglas production function, 519 Coefficient estimation, 553–554 Coefficient estimators derivation of, 553–554 least squares, 427–430, 439 variance, 437, 505–506 Coefficient of determination R2 adjusted, 492 explanation of, 433–437 regression models and, 492 sum of squares decomposition and, 489–490 Coefficient of multiple correlation, 492 Coefficient of multiple regression, 481–487 783 www.downloadslide.com Coefficient of standard errors, 495 Coefficient of variation (CV), 75 Collectively exhaustive events, 98 Combinations formula for determining number of, 104–105 number of, 102–103 Complement rule, 111–112, 118–119 Complements, 98–100 Component bar charts, 30 Composite hypothesis, 351, 356–359 Computer applications See also Excel for jointly distributed discrete random variables, 180 of regression coefficient, 429–430 Conditional coefficients, 486 Conditional mean, 180 Conditional probability, 113–114 Conditional probability distribution, 178 Conditional variance, 180 Confidence interval estimator, 291 Confidence intervals based on normal distribution, 292 for difference between two normal population means, dependent samples, 329–332 for difference between two normal population means, independent samples, 333–339 for difference between two population proportions, 340–341 examples of, 294, 300–302, 304–305, 308, 310–313 explanation of, 293 finite populations, 309–313 forecast, and prediction intervals, 447–448 for mean of normal distribution, population variance known, 291–296 for mean of normal distribution, population variance unknown, 297–302 for population mean, 291–302, 309–312 for population proportion, 303–305, 312–313 for population regression slope, 440–441 for population total, 309–312 for predictions, 447–448 reducing margin of error of, 295–296 for regression coefficients, 438–445, 495 784 Index sample size determination, large populations, 340–341 Student’s t distribution and, 297–302 of two means, dependent samples, 329 of two means, unknown population variances that are assumed to be equal, 335–337 of two means, unknown population variances that are not assumed to be equal, 337–339 for variance of normal distribution, 306–309 Confidence level, 292 Consistent estimators, 326 Contingency tables, 52 See also cross tables chi-square random variable for, 615 explanation of, 614–615 test of association in, 615–618 Continuous numerical variables, 26 Continuous random variables, 147–148, 197–205 covariance of, 229 (See also Jointly distributed continuous random variables) expectations for, 203–205 jointly distributed, 228–236 probability density functions and, 199–201 uniform distribution, 201 Control charts, 261–262 Control intervals, 261–262 Correlation applications of, 452 coefficient of determination R2 and, 436 coefficient of multiple correlation, 492 hypothesis test for, 452–453 of random variables, 182, 229 zero population, 453–454 Correlation analysis, 452–454 Correlation coefficient analysis, 87–88 Correlation coefficients, 84–88 defined, 84 example using, 85–88 of random variables, 182–183, 229 scatter plots and, 85 Spearman rank, 634–635 statistical independence and, 184 Counterfactual argument, 351 Covariance (Cov), 84, 181–182 computing using Excel, 87 continuous random variables, 
229 statistical independence, 184 Critical value, 353 Cross-sectional data, 35 Cross tables, 29–30 Cumulative binomial probabilities, 744–748 Cumulative distribution function, 198–199, 202 of normal distribution, 208 Cumulative line graphs, 44 Cumulative probability function, 150–151 Cyclical component, of time series, 685 D Data cross-sectional, 35 interval, 26 measurement levels, 26–27 nominal, 26 ordinal, 26 presentation errors, 51–55 qualitative, 26 quantitative, 26 ratio, 27 time-series, 35–39 Data files descriptions, 470–471, 548–550 Davies, O L., 558 Decision making sampling and, 22–23 in uncertain environment, 22–25 Decision rules, guidelines for choosing, 382–383 Degrees of freedom, 273, 440 Dependent samples, 329–332, 387–390 Dependent variables, 47 lagged, as regressors, 567–570 Descriptive statistics, 25 Differences, of random variables, 184, 230 Discrete numerical variables, 25 Discrete random variables, 147 expected value of, 152–153 expected value of functions, 155 jointly distributed, 176–188 probability distributions for, 148–151 joint probability functions of, 178 properties of, 152–157 standard deviation of, 153–155 variance of, 153–155, 194 www.downloadslide.com Distribution shape, 62–63 See also specific distributions Diversifiable risk, 456–458 Dummy variables, 522–526, 554–565 experimental design models, 558–563 public sector applications, 563–565 for regression models, 522–526, 558–565 Durbin-Watson test, 584–586 cut-off points, 777–778 E Efficient estimators, 288 Empirical rule, 76–77 Equality, 403–405 of variances between two normally distributed populations, 403–405 Errors, 51–54, 495, 577–581 data presentation, 51–55 nonsampling, 24–25 reducing margin of, 295–296 sampling, 24, 293, 349 standard error, estimate, 490 Type I, 349–351, 407 Type II, 349–351, 369–373, 407 Error sum of squares, 427–428, 432–433, 489, 652 Error variance, estimation of, 490 Estimated regression model, 424 Estimates, 285 confidence interval, 291 explanation of, 285 point, 286 standard error, 490 Estimation See also Confidence intervals of beta coefficients, 456–458 coefficient, 553–554 of error variance, 490 least squares, 469–470, 483 of model error variance, 437 of multiple regression coefficients, 481–487 of population proportion, 313 of regressions with autocorrelated errors, 586–590 Estimators, 285 biased, 287 confidence interval, 291 consistent, 326 efficient, 288 examples of, 288 explanation of, 285 least squares, 469–470 least squares coefficient, 427–430, 439 least squares derivation of, 546–547 point, 285–289 of population mean, 725 unbiased, 286–287, 289 Events, 96–100 collectively exhaustive, 98 complements, 98–100 independent, 125–126 intersection of, 96–100, 144–145 mutually exclusive, 96–97, 117 union, 97–100, 144–145 Excel, 87 See also Minitab confidence intervals using, 301–302, 331–332 covariance and correlation using, 183 jointly distributed discrete random variables, 180 regression analysis using, 429 shape of a distribution, 62 Expected value of continuous random variables, 203–205 of discrete random variables, 152–153 of functions of random variables, 155, 181, 184 of sample mean, 250 Experimental design models, 558–563 Exploratory data analysis (EDA), 46 Exponential distribution, 225–227 Exponential model transformations, 518–520 Exponential smoothing, 697–707 Extreme points, 459, 461, 464 F Failure to reject, 349–351 F distribution, 403, 771–774 Financial investment portfolios, 232–236 Financial risk, beta measure of, 456–458 Finite population correction factor, 
251, 309 Finite populations, confidence interval estimation for, 309–313 First-order autoregressive models, 708–709 First quartile, 64–65 Fisher, R A., 558 Five-number summary, 65 Forecasting from autoregressive models, 709–712 regression models and, 446–450 seasonal time series, 704–707 simple exponential smoothing and, 697–707 trends and, 686 F probability distribution hypothesis test for population slope coefficient using, 443–445 Frequency distributions, 28, 40 class width, 41 construction of, 41 cumulative, 42 inclusive and nonoverlapping classes, 41–42 interval width, 41 number of classes for, 41 for numerical data, 40–43 relative, 28, 42 F tests for simple regression coefficient, 444–445 t tests vs., 508–509 Functions, of random variables, 155–157 G Geometric mean, 63–64 Geometric mean rate of return, 63 Goodness-of-fit tests explanations of, 603 population parameters unknown, 609–613 specified probabilities, 603–608 Gosset, William Sealy, 297, 326 Graphical analysis, 458–464 Graphs for categorical variables, 28–35 data presentation errors, 51–55 to describe relationships between variables, 47–49 distribution shape, 44–46 histograms, 44 of multiple regression model, 480 for numerical variables, 40–50 ogives, 44 scatter plots, 47–49 stem-and-leaf displays, 46–47 for time-series data, 35–40 Grouped data, measures of, 81–82 Group means, 671 Index 785 www.downloadslide.com H Heteroscedasticity explanation of, 577–579 graphical techniques for detecting, 578–579 test for, 579–580 Histograms, 44 misleading, 51–53 Holt-Winters exponential smoothing forecasting model, 700–707 example of, 701–703 nonseasonal series, 701–703 seasonal series, 704–707 Hypergeometric distribution, 173–175 Hypothesis alternative, 351, 352, 356–359, 376 composite, 351, 356–359 null, 347–351, 376 one-sided composite alternative, 351 simple, 351 two-sided composite alternative, 351, 360–361 Hypothesis test decisions, 351 Hypothesis tests/testing, 346–347 assessing power of, 368–373 comments on, 406–408 concepts of, 347–351 confidence intervals, 438–445 control chart, 408 for correlation, 452–454 for difference between two normal population means, dependent samples, 387–390 for difference between two normal population means, independent samples, 391–398 for difference between two population proportions, 399–402 of equality of variances between two normally distributed populations, 403–405 flow chart for selecting, 413–414 introduction to, 352–353 for mean of a normal distribution, population variance known, 352–361, 369–371 for mean of normal distribution, population variance unknown, 362–364 for one-way analysis of variance, 651–653 of population proportion, 366–367 786 Index for population slope coefficient using F distribution, 443–445 power of, 351 for regression coefficients, 497–502, 505–509 for regression models, 438–445 for two-way analysis of variance, 666–667 for variance of a normal distribution, 375–377 for zero population correlation, 453–454 I Income distribution, 63 Independent events, 117, 125–126 Independent random samples, nonparametric tests for, 628–632 Independent samples, 333–339, 391–398 Independent variables, 47 jointly distributed, 178 Indicator variables, 522–526 See also Dummy variables Inference about population regression, 495 model interpretation and, 554 Inferential statistics, 25 Integral calculus, 242–243 Interaction, as source of variability, 670 Intercept, 419 Interquartile range (IQR), 69 Intersection of events, 96–97, 99–100, 151 Interval data, 26 Intervals acceptance, 260–262 
control, 261–262 for frequency distribution, 44 Interval scales, 26 Investment portfolios beta measure of financial risk, 456–458 portfolio analysis, 232–236 returns on, 234–236 Irregular component of time series, 685 moving averages to smooth, 689–691 J Jarque-Bera test for normality, 611–613 Joint cumulative distribution function, 228–229 Jointly distributed continuous random variables, 176–188, 228–236 See also Continuous random variables; Random variables examples of, 230–231 financial investment portfolios, 232–236 linear combinations of, 232 Jointly distributed discrete random variables, 176–190 See also Discrete random variables; Random variables computer applications, 180 conditional mean and variance, 180 correlation, 182–183 covariance, 182 examples of, 176–177, 179–180, 183 expected value of functions of, 181 independence of, 178 portfolio analysis, 185–188 Joint probability, 96, 114–115, 117, 123–125 Joint probability distribution, 177–178 Joint probability function, 177 properties of, 178 K Knowledge, 25 Kruskal-Wallis test, 658–660 Kurtosis, 611, 613 L Lagged dependent variables, 567–570 autocorrelation errors in models with, 590–591 Law of large numbers, 254 Least squares algorithm, 514–515 Least squares coefficient estimators, 427–430, 439 Least squares derivation of estimators, 546–547 Least squares derived coefficient estimators, 428–429 Least squares estimation, sample multiple regression and, 483 Least squares estimators, derivation of, 469–470 Least squares procedure, 427–428, 482–487 Least squares regression, 419–420 Least squares regression line, 419, 446 www.downloadslide.com Leverage, 459 Linear combinations, of random variables, 232 Linear functions, of random variables, 180–181, 205 Linear models, 418–420 Linear regression equation, 431–437 analysis of variance and, 433 coefficient of determination R2, 433–434 Linear regression model, 421–426 assumptions, 422–423 examples using, 425–426 outcomes, 424 population, 423 Linear regression population equation model, 423 Linear relationships, 418–419 Linear sum of random variables, 280 Line charts, 35–39 Logarithmic transformations, 517–518 Lower confidence limit, 293 Lower tail test, 620 M Mann-Whitney U statistic, 628–629 Mann-Whitney U test, 628–630 Marginal distributions, 229 Marginal probabilities, 123–125, 179–180 Marginal probability distribution, 177–178 Margin of error, 293, 299, 304 reducing, 295–296 Matched pairs, 387–388 Mathematical derivations, 546–548, 682–683 Matrix plots, 486–487 Mean, 60–64 approximate, 81–82 of Bernoulli random variable, 160 of binomial distribution, 162, 195–196 conditional, 180 of continuous random variables, 204 geometric, 63–64 of jointly distributed random variables, 196 of linear functions of a random variable, 155–157, 194–195 measures of variability from, 68–79 of normal distribution, population variance known, 315–316, 352–361, 369–371 of normal distribution, population variance unknown, 362–364 of Poisson probability distribution, 168 of sampling distribution of sample variances, 283 weighted, 80–83 Mean square regression (MSR), 505, 506 Mean squares between-groups, 651 within-groups, 651 Measurement levels, 26–27 Measures of central tendency, 59–68 geometric mean, 63–64 mean, median, mode, 60–62 shape of a distribution, 62–63 Median, 60–62, 63 Minimum variance unbiased estimator, 288 Minitab, 87 See also Excel autoregressive models, 709–712 confidence intervals using, 337, 338–339, 341 descriptive measures using, 87 Durbin-Watson test, 586 exponential model estimation, 519 
hypothesis testing, 377, 389–390, 396 lagged dependent variables, 569 matrix plots, 486–487 Monte Carlo sampling simulations, 280–283 for probability distributions, 154, 164–165 regression analysis using, 429–430 Missing values, 27, 330–331 Mode, 60–62 Model error variance, estimation of, 437 Model specification, 529–531, 552–553 Monte Carlo sampling simulations, 254–260, 280–283 Minitab, 280–283 Most efficient estimator, 287–289 Moving averages explanation of, 689–691 extraction of seasonal component through, 692–697 simple centered (2m 1)-point, 691 Multicollinearity, 574–577 corrections for, 576–577 indicators for, 576 Multiple comparisons, 654–655 Multiple regression See also Regression analysis application procedure and, 529–537 applications of, 475–476 confidence intervals and hypothesis tests for individual regression coefficients, 493–502 estimation of coefficients and, 481–487 explanatory power of multiple regression equation and, 488–492 introduction to, 474 least squares procedure, 482–487 objectives, 476 prediction and, 511–513 tests on regression coefficients, 505–509 Multiple regression equation, 488–492 Multiple regression model, 474 assumptions, 482 development of, 477–480, 531–532 dummy variables for, 522–526 explanation of, 474–480 model specification, 474–476 objectives, 476–477 population, 479 residuals analysis and, 534–537 test on all coefficients of, 497 three-dimensional graphing of, 480 transformations for nonlinear, 514–520 Multiplication rule of probabilities, 114–116 Mutually exclusive events, 96–97, 117 N Nominal data, 26 Nondiversifiable risk, 456 Nonlinear regression models logarithmic transformations, 517–518 quadratic transformations, 515–517 transformations for, 514–520 Index 787 www.downloadslide.com Nonparametic tests for independent random samples, 628–632 Kruskal-Wallis test, 658–660 Mann-Whitney U test, 628–630 normal approximation to the sign test, 623–624 for paired or matched samples, 619–626 for randomness, 636–639 sign test, 619–621, 626 Spearman rank correlation, 634–635 Wilcoxon rank sum test, 631–632 Wilcoxon signed rank test, 622–623 Nonprobabilistic sampling methods, 734 Nonsampling errors, 24–25 Nonuniform variance, 577–578 Normal approximation Mann-Whitney U test, 629 to sign test, 623–624 to Wilcoxon signed rank test, 624–626 Normal distribution, 206–217 to approximate binomial distribution, 219–224 compared with binomial distribution, 221 confidence interval estimation for variance of, 306–309 confidence interval for mean of, 291–296 cumulative distribution function of, 208 examples of, 211–214 explanation of, 206–207 probability density function for, 207 properties of, 207 standard, 209–210 test for, 611–613 tests of mean of, population variance known, 352–361 tests of the variance of, 375–377 Normality, test for, 611 Normal probability plots, 215–217 Normal random variables, range probabilities for, 209 Null hypothesis, 347–351, 376 See also Hypothesis p-value, 360–361, 376 rejection of, 406–407 sign test, 619–621 specifying, 406–407 788 Index testing regression coefficients, 497 tests/testing goodness-of-fit tests, 603–608 Number of combinations, 102 formula for determining, 102 Numerical variables, 25–26 graphs to describe, 40–50 O Odds, 126 Ogives, 44 One-sided composite alternative hypothesis, 347, 351 One-way analysis of variance, 647–656 framework for, 648 hypothesis test for, 651–653 multiple comparisons between subgroup means, 654–655 population model for, 655–656 sum of squares decomposition for, 650–651 One-way analysis of 
Ordering, 103
Ordinal data, 26
Outcomes
  basic, 95
  for bivariate events, 122
  random experiments and, 95
Outliers, 47, 62, 461
  effect of, 462–464
Overall mean, 672, 725–726
Overinvolvement ratios, 126–129

P
Paired samples, Wilcoxon signed rank test for, 622–623
Parameters, 24, 60
Pareto, Vilfredo, 32
Pareto diagrams, 32–34
Pearson’s product-moment correlation coefficient, 84–86
Percent explained variability, 435
Percentile, 64–67
Permutations, 102–104
Pie charts, 31–32
Point estimates, 286
Point estimators, properties of, 285–289
Poisson, Simeon, 167
Poisson approximation to binomial distribution, 171–172
Poisson probability distribution, 167–172
  approximation to binomial distribution, 171–172
  assumptions of, 167
  comparison to binomial distribution, 172
  cumulative, table of, 759–767
  examples of, 168–172
  explanation of, 167
  functions, mean, and variance, 168
  individual, table of, 750–758
  test for, 609–611
Pooled sample variance, 336
Population
  defined, 23
  sampling errors, 24
  sampling from, 245–249
Population covariance, 84
Population mean
  allocation overall, 724
  comparison of several, 645–647
  confidence interval estimation of difference between two, 329–339
  confidence interval for, 309–311
  estimation of, 718–719, 730
  guidelines for choosing decision rule for, 382
  tests of difference between, dependent samples, 387–390
  tests of difference between, independent samples, 391–398
Population model
  linear regression, 423
  for one-way analysis of variance, 655–656
Population multiple regression model, 479
Population proportions
  confidence interval estimation for, 303–305, 312–313
  estimation of, 313, 340–341, 721–723, 730
  guidelines for choosing decision rule for, 383
  optimal allocation, 724
  sample size for, 317–319
  tests of, 366–367, 371–373
  tests of difference between, 399–402
Population regression parameters, 495
Population regression slope
  basis for inference about, 440
  confidence interval, 440–443
  tests of, 442
Populations, examples of, 245
Population slope coefficient, hypothesis test for, 443–445
Population total
  confidence interval for, 309–311
  estimation of, stratified random sample, 720–721
Population variance, 71–72
  chi-square distribution of, 271–272
  confidence intervals and, 293–294, 335–339
  independent samples and, 333–339
  tests of difference with known, 391–393
  tests of difference with unknown, 393–396
  tests of mean of normal distribution with known, 333–334, 352–361, 369–371
  tests of mean of normal distribution with unknown, 335–339, 362–364, 396–398
  tests of normal distribution, 375–377
Portfolio analysis, 186–188, 232–236
Portfolio market value, 185–188
Power, 350–351
Power function, 370–371
Prediction
  multiple regression and, 511–513
  regression models and, 446–450
Prediction intervals, 447–448
Predictor variables, bias from excluding significant, 571–573
Price-earnings ratios, 289
Probability, 93–94
  addition rule of, 112–113
  Bayes’ theorem, 132–138
  bivariate, 122–132
  classical, 101–102
  complement rule, 111–112, 118–119
  conditional, 113–114
  examples, 105–106
  joint, 114–115, 117, 123–125
  marginal, 123–125, 179–180
  multiplication rule of, 114–116
  for normally distributed random variables, 212
  overinvolvement ratios and, 126–129
  permutations and combinations, 102–105
  random experiments and, 94–95
  of range using cumulative distribution function, 199
  relative frequency, 106
  rules, 111–122
  statistical independence and, 116–119
  subjective, 107–110
Probability density functions, 199–200, 252
  areas under, 200–201
  for chi-square distribution, 272
  for exponential distribution, 226
  for normal distribution, 207
  properties of, 199–200
  for sample means, 252
  for sample proportions, 267
  of standard normal and Student’s t distribution, 298
Probability distribution function, 149, 199
Probability distributions
  Bernoulli distribution, 159–161
  binomial distribution, 159–165
  chi-square distribution, 271–272
  for discrete random variables, 148–151
  exponential distribution, 225–227
  hypergeometric distribution, 173–175
  Poisson probability distribution, 167–172
  Student’s t distribution, 326–327
  uniform, 201
Probability functions
  binomial distribution table, 739–743
  conditional, 178
  joint probability function, 177, 178
  marginal probability function, 177
Probability plots, normal, 215–217, 535
Probability postulates
  consequences of, 108–109
  explanation of, 107–108
Probability value (p-value), 360–361
Problem definition, 25
Properties
  of cumulative probability distributions, 151
  of joint probability functions, 178
  of probability distribution functions, 150
Proportional allocation, 723
Proportion random variable, 223–224
Proportions, confidence interval estimation for, 303–305
Public sector research, 563
Public sector research and policy analysis, dummy variable regression in, 563–565
p-value, 354–359
  for chi-square test for variance, 376
  for sign test, 620

Q
Quadratic transformations, 515–517
Qualitative data, 26
Quantitative data, 26
Quartiles, 64–65
Queuing problems, 169–171
Quota sampling, 734

R
Random experiments, 94
  outcomes of, 94–100
Randomized block design, 661–662
Random samples/sampling, 23
  independent, 333–339
  nonparametric tests for independent, 628–632
  simple, 23, 245–246
Random variables, 147–148
  continuous (see Continuous random variables)
  correlation of, 229
  differences between, 184
  differences between pairs of, 230
  linear combinations of, 232
  linear functions of, 180–181, 205
  linear sums and differences of, 184
  mean and variance of linear functions of, 155–157
  proportion, 223–224
  statistical independence and, 181, 184
  sums of, 229–230
Range
  explanation of, 69
  interquartile, 69
Ratio data, 27
Ratio of mean squares, 683
Ratios
  overinvolvement, 126–129
  price-earnings, 289
Regression. See also Least squares regression; Multiple regression; Simple regression
  analysis of variance and, 432–433
  autocorrelated errors and, 582–591
  dummy variables and experimental design, 554–565
  heteroscedasticity, 577–581
  lagged values of dependent variables, 567–570
  least squares regression, 419–420
  linear regression model and, 421–426
  mean square, 490, 506
  multicollinearity, 574–577
  specification bias, 571–573
Regression coefficients
  computer computation of, 429–430
  confidence intervals for, 495–497
  hypothesis tests for, 493–495
  subsets of, tests on, 506–507
  tests on, 505–507
Regression models. See also Multiple regression model; Nonlinear regression models
  coefficient estimation, 553–554
  dummy variables, 522–526, 554–558
  interpretation and inference, 554
  linear, 418–426, 431–437
  methodology for building, 552–554
  specification, 552–553
  verification, 554
Regression sum of squares, 432, 433, 490
Reject, 351
Relative efficiency, 288
Relative frequency distribution, 28, 42
Relative frequency probability, 106
Reliability factor, 293
Repeated measurements, 329, 331–332
Residuals, analysis of, 534–537
Returns, on financial portfolios, 234–236
Risk, 233
  diversifiable, 456–458
  nondiversifiable, 456
Runs test, 636–639

S
Sample covariance, 84
Sample means
  acceptance intervals, 260–262
  central limit theorem, 254–260
  expected value of, 250
  explanation of, 249
  sampling distributions of, 249–262
  standard normal distribution for, 251–253
Sample proportions
  examples of, 267–268
  explanation of, 265
  sampling distributions of, 265–268
Sample sizes
  determining, 340–341
  determining, for stratified random sampling, 725–726
  finite populations, 319–322
  large populations, 315–319
Sample space, 95
Samples/sampling. See also Random samples/sampling
  cluster, 729–732
  defined, 23
  dependent, 329–332, 386–390
  explanation of, 22–25
  independent, 333–339, 386–390
  Monte Carlo sampling simulations, 280–283
  nonprobabilistic methods, 734
  from population, 245–249
  simple random, 23, 245–246
  stratified, 716–726
  systematic, 23
  two-phase, 732–734
Sample standard deviation, 271
Sample variances, 73
  chi-square distribution, 271–272
  explanation of, 271
  sampling distributions of, 270–275, 283
Sampling distributions
  explanation of, 246–249
  of least squares coefficient estimator, 439
  of sample means, 249–262
  of sample proportions, 265–268
  of sample variances, 270–275, 283
Sampling error, 24–25, 293, 349
Sampling without replacement, 173–174
Sampling with replacement, 174
Sarbanes-Oxley Act (SOX), 617–618
Scatter plot analysis, 459–464
Scatter plots, 47–49
  for residuals analysis, 535–537
Seasonal component
  extraction of, through moving averages, 692–697
  of time series, 686–687
Seasonal index method, 704–707
Seasonal time series, forecasting, 704–707
Second-order autoregressive models, 708
Second quartile, 64
Side-by-side bar chart, 30
Significance level, 349, 351
Sign test
  explanation of, 619
  normal approximation to, 623–626
  for paired or matched samples, 619–623
  p-value for, 620
  for single population median, 626
Simple exponential smoothing
  explanation of, 698
  forecasting through, 698–700
  Holt-Winters model and, 700–703
Simple hypothesis, 347, 351
Simple random samples/sampling, 23, 245
  sample sizes, 320–322
Simple regression. See also Regression
  beta measure of financial risk, 456–458
  correlation analysis and, 452–454
  explanatory power of linear regression equation and, 431–437
  graphical analysis and, 458–464
  least squares coefficient estimators and, 427–430
  prediction and, 446–450
  statistical inference and, 438–445
Simple regression coefficient, F test for, 444–445
Skewed distribution, 45–46
Skewness, 45, 91–92, 611, 613
Slope, 419
  differences in, 525
Spearman rank correlation, 634–635
  cutoff points, 776
Specification bias, 571–573
SSE, 427–428, 432–433
SSR, 433–435
SST, 433–435
Stacked bar charts, 30
Standard deviation, 72–73, 74
  of continuous random variables, 204
  of discrete random variable, 153–155
  sample, 271
Standard error of the estimate, 490
Standardized normal random variable, 251
Standardized residual, 461–464
Standard normal distribution, 209
  cumulative distribution function table, 738
  for sample means, 251–253
Statistical independence, 116–119, 181, 184
  covariance, 184
Statistical inference, 438–445
Statistical significance, 407
Statistical thinking, 22
Statistics, 22, 60. See also Nonparametric tests
  defined, 24
  descriptive, 25
  inferential, 25
Stem-and-leaf displays, 46–47
Stock market crash of 2008, 94
  beta coefficients limitations, 457
  cautions concerning financial models, 236
Stratified random sampling
  allocation of sample effort among strata and, 723–725
  analysis of results from, 718–720
  determining sample sizes for, 725–726
  estimation of population mean, 718–719
  estimation of population proportion, 721–723
  estimation of population total, 720–721
  examples of, 719–720
  explanation of, 716–717
Student’s t distribution, 326–327
  confidence intervals and, 297–302
  hypothesis tests, 362–364
  for two means with unknown population variances not assumed to be equal, 344
  upper critical values table, 770
Subgroup means, multiple comparisons between, 654
Subjective probability, 107–110
Sum of squares, 433, 489, 649
Sum of squares decomposition
  coefficient of determination and, 489–490
  one-way analysis of variance, 650–651
  two-way analysis of variance, 665
Sums, of random variables, 184, 229–230
Survey responses
  missing values in, 330–331
  sampling errors, 24
Symmetric distributions, 45
Systematic sampling, 23

T
Tables
  for categorical variables, 28–29
  cross tables, 29–30
  to describe relationships between variables, 47–49
  frequency distribution, 28–29
Test of association, 615–618
Tests. See Hypothesis tests/testing
Third quartile, 65
Time plots, autocorrelation and, 582–583
Time series
  autoregressive integrated moving average models, 713–714
  autoregressive models, 708–712
  components of, 685–689
  explanation of, 684–685
  exponential smoothing and, 697–707
  forecasting seasonal, 704–707
  moving averages, 689–697
Time-series component analysis, 688
Time-series data
  explanation of, 684–685
  graphs to describe, 35–40
Time-series plots, 35–39
  misleading, 53–54
Time-series regression model, 587–590
Total explained variability, 547–548
Total sum of squares, 433, 489, 682
Treatment variables, 559–560
Tree diagrams, 123–124
Trend component, of time series, 685–686
t tests, vs. F tests, 508–509
Two-phase sampling, 732–734
Two-sided composite alternative hypothesis, 347, 351, 360–361
Two-tail test, 620
Two-way analysis of variance
  examples of, 675–676
  hypothesis tests for, 666
  more than one observation per cell, 670–676
  one observation per cell, 661–667
  several observations per cell, 670–676
  sum of squares decomposition for, 665
  table format, 666–667
  tables, 666–667
Two-way analysis of variance tables, 666–667
Type I errors, 349–351, 353, 407
Type II errors, 349–351, 369–370, 407
  determining probability of, 369–371

U
Unbiased estimator, 286–287
Uncertainty, decision making under, 22–25
Uniform distribution, 201, 204
Uniform probability distribution, 198
Unions, 97–100, 151
Upper confidence limit, 293

V
Variability
  between-groups, 649
  interaction as source of, 670
  total explained, 547–548
  within-groups, 649
Variability, measures of, 68–79
Variables. See also Continuous random variables
  bias from excluding significant predictor, 571–573
  blocking, 559–560, 661
  categorical, 25, 28–34
  classification of, 25–26
  correlation analysis and, 452–454
  defined, 25
  dependent, 47
  dummy, 522–526, 554–565
  effect of dropping statistically significant, 532–534
  independent, 47
  indicator, 522–526
  lagged dependent, 567–570
  of linear functions of a random variable, 188
  measures of relationships between, 84–89
  numerical, 25–26, 40–49
  relationships between, 418–419
  tables and graphs to describe relationships between, 47–49
  treatment, 559–560
Variance, 71–74. See also Analysis of variance (ANOVA)
  of Bernoulli random variable, 160
  of binomial distribution, 162, 195–196
  conditional, 180
  of continuous random variables, 204
  of discrete random variables, 153–155, 184, 194
  for grouped data, 81–82
  of jointly distributed random variables, 196
  of linear functions of a random variable, 155–157, 194–195
  nonuniform, 577–578
  of normal distribution, confidence interval estimation for, 306–309
  of normal distribution, tests for, 375–377
  of Poisson probability distribution, 168
  sampling distributions of sample, 270–275
  between two normally distributed populations, tests of equality, 403–405
Variation, coefficient of, 75
Venn diagrams
  for addition rule, 112
  for complement of event, 98
  for intersection of events, 97, 100, 144–145
  for union of events, 96–98, 144–145
Verifications, 194–196

W
Waiting line problems, 169–171
Weighted mean, 80–83
Width, 293
Wilcoxon rank sum statistic T, 631
Wilcoxon rank sum test, 631–632
  cutoff points for statistic, 775
Wilcoxon signed rank test, 622–626
  normal approximation to, 624–626
Within-groups mean square (MSW), 682
Within-groups variability, 649

Y
y-intercept, 419

Z
Zero population correlation, 453–454
z-score, 77–78

[Back endpaper flowchart, garbled in this extraction: a decision tree for choosing the two-population test statistic, splitting on matched pairs versus independent samples, and on population variances known, unknown but assumed equal (pooled variance, df = n1 + n2 − 2), or unknown and not assumed equal.]
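The fragment preserves only scattered symbols from that chart. As a minimal reconstruction, assuming the standard matched-pairs and pooled-variance results that the Chapter 10 outline describes (the notation, including D0 for the hypothesized difference, is ours rather than a verbatim copy of the endpaper):

% Hedged reconstruction of the endpaper formulas: standard two-sample
% results, not a verbatim transcription of the chart.
% Matched pairs: test the mean of the differences d_i = x_{1i} - x_{2i}.
\[
t = \frac{\bar{d} - D_0}{s_d / \sqrt{n}}, \qquad \text{df} = n - 1
\]
% Independent samples, variances unknown but assumed equal: pool the two
% sample variances, each weighted by its degrees of freedom.
\[
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}},
\qquad \text{df} = n_1 + n_2 - 2
\]

Pooling weights each sample variance by its degrees of freedom, which is why the chart's one surviving label, "Pooled Variance DOF = n1 + n2 − 2", appears as the divisor in s_p².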
