7. Product and Process Comparisons
7.4. Comparisons based on data from more than two processes
7.4.3. Are the means equal?

7.4.3.1. 1-Way ANOVA overview
http://www.itl.nist.gov/div898/handbook/prc/section4/prc431.htm [5/1/2006 10:38:54 AM]

Overview and principles
This section gives an overview of the one-way ANOVA. First we explain the principles involved in the 1-way ANOVA.

Partition response into components
In an analysis of variance the variation in the response measurements is partitioned into components that correspond to different sources of variation. The goal in this procedure is to split the total variation in the data into a portion due to random error and portions due to changes in the values of the independent variable(s).

Variance of n measurements
The variance of n measurements is given by

$$ s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1} $$

where $\bar{y}$ is the mean of the n measurements.

Sums of squares and degrees of freedom
The numerator is called the sum of squares of deviations from the mean, and the denominator is called the degrees of freedom.

The variance, after some algebra, can be rewritten as:

$$ s^2 = \frac{\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2 / n}{n - 1} $$

The first term in the numerator is called the "raw sum of squares" and the second term is called the "correction term for the mean". Another name for the numerator is the "corrected sum of squares", and this is usually abbreviated by Total SS or SS(Total).

The SS in a 1-way ANOVA can be split into two components, called the "sum of squares of treatments" and "sum of squares of error", abbreviated as SST and SSE, respectively. The guiding principle behind ANOVA is the decomposition of the sums of squares, or

$$ SS(Total) = SST + SSE $$

Algebraically, this is expressed by

$$ \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = \sum_{i=1}^{k} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2 + \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2 $$

where k is the number of treatments and $\bar{y}_{\cdot\cdot}$ denotes the "grand" or "overall" mean. Each $n_i$ is the number of observations for treatment i. The total number of observations is N (the sum of the $n_i$).

Note on subscripting
Don't be alarmed by the double subscripting.
The total SS can be written single or double subscripted. The double subscript stems from the way the data are arranged in the data table. The table is usually a rectangular array with k columns, and each column consists of $n_i$ rows (however, the column lengths, the $n_i$, may be unequal).

Definition of "Treatment"
We introduced the concept of treatment. The definition is: A treatment is a specific combination of factor levels whose effect is to be compared with other treatments.

7.4.3.2. The 1-way ANOVA model and assumptions
http://www.itl.nist.gov/div898/handbook/prc/section4/prc432.htm

7.4.3.3. The ANOVA table and tests of hypotheses about means
http://www.itl.nist.gov/div898/handbook/prc/section4/prc433.htm

ANOVA table

Source               SS     DF     MS             F
Treatments           SST    k-1    SST / (k-1)    MST/MSE
Error                SSE    N-k    SSE / (N-k)
Total (corrected)    SS     N-1

The word "source" stands for source of variation. Some authors prefer to use "between" and "within" instead of "treatments" and "error", respectively.

ANOVA Table Example

A numerical example
The data below resulted from measuring the difference in resistance resulting from subjecting identical resistors to three different temperatures for a period of 24 hours. The sample size of each group was 5. In the language of Design of Experiments, we have an experiment in which each of three treatments was replicated 5 times.

         Level 1    Level 2    Level 3
           6.9        8.3        8.0
           5.4        6.8       10.5
           5.8        7.8        8.1
           4.6        9.2        6.9
           4.0        6.5        9.3
means      5.34       7.72       8.56

The resulting ANOVA table is

Example ANOVA table

Source               SS         DF    MS        F
Treatments            27.897     2    13.949    9.59
Error                 17.452    12     1.454
Total (corrected)     45.349    14
Correction Factor    779.041     1

Interpretation of the ANOVA table
The test statistic is the F value of 9.59.
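The F value of 9.59 can be reproduced directly from the resistor data. The following is a minimal Python sketch of the sum-of-squares decomposition, not code from the handbook; the function name `one_way_anova` and the dictionary keys are illustrative choices.

```python
# Minimal sketch: one-way ANOVA quantities from raw data (plain Python).
# The function name and dictionary keys are illustrative, not from the handbook.

def one_way_anova(groups):
    """Compute the 1-way ANOVA table entries for a list of samples."""
    k = len(groups)                                  # number of treatments
    n_total = sum(len(g) for g in groups)            # N, total observations
    grand_total = sum(sum(g) for g in groups)

    raw_ss = sum(y * y for g in groups for y in g)   # raw (uncorrected) SS
    correction = grand_total ** 2 / n_total          # correction term for the mean
    ss_total = raw_ss - correction                   # corrected total SS

    # Treatment SS from the treatment totals: sum(T_i^2 / n_i) - correction
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = ss_total - sst                             # SS(Total) = SST + SSE

    mst = sst / (k - 1)                              # mean square, treatments
    mse = sse / (n_total - k)                        # mean square, error
    return {
        "raw_ss": raw_ss,
        "correction": correction,
        "ss_total": ss_total,
        "sst": sst,
        "sse": sse,
        "df_treat": k - 1,
        "df_error": n_total - k,
        "mst": mst,
        "mse": mse,
        "f": mst / mse,
    }


# Resistor data: one list per temperature level.
levels = [
    [6.9, 5.4, 5.8, 4.6, 4.0],    # Level 1
    [8.3, 6.8, 7.8, 9.2, 6.5],    # Level 2
    [8.0, 10.5, 8.1, 6.9, 9.3],   # Level 3
]

table = one_way_anova(levels)
for name, value in table.items():
    print(f"{name:>10}: {value:.3f}")
```

Run on the data above, the sketch should agree with the example ANOVA table to the printed precision (SST about 27.897, SSE about 17.452, F about 9.59). It handles unequal group sizes as well, since each treatment total is divided by its own $n_i$.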
Using an $\alpha$ of .05, we have that $F_{.05;\,2,\,12} = 3.89$ (see the F distribution table in Chapter 1). Since the test statistic is much larger than the critical value, we reject the null hypothesis of equal population means and conclude that there is a (statistically) significant difference among the population means. The p-value for 9.59 is .00325, so the test statistic is significant at that level.

Techniques for further analysis
The populations here are resistor readings while operating under the three different temperatures. What we do not know at this point is whether the three means are all different or which of the three means is different from the other two, and by how much.

There are several techniques we might use to further analyze the differences. These are:
● constructing confidence intervals around the difference of two means,
● estimating combinations of factor levels with confidence bounds,
● multiple comparisons of combinations of factor levels tested simultaneously.

7.4.3.4. 1-Way ANOVA calculations
http://www.itl.nist.gov/div898/handbook/prc/section4/prc434.htm

The 824.390 SS is called the "raw" or "uncorrected" sum of squares.

Step 3: compute SST
STEP 3 Compute SST, the treatment sum of squares.
First we compute the total (sum) for each treatment.

$$ T_1 = (6.9) + (5.4) + \cdots + (4.0) = 26.7 $$
$$ T_2 = (8.3) + (6.8) + \cdots + (6.5) = 38.6 $$
$$ T_3 = (8.0) + (10.5) + \cdots + (9.3) = 42.8 $$

Then

$$ SST = \sum_{i=1}^{3} \frac{T_i^2}{n_i} - CF = \frac{(26.7)^2}{5} + \frac{(38.6)^2}{5} + \frac{(42.8)^2}{5} - 779.041 = 27.897 $$

where CF = 779.041 is the correction factor (the correction term for the mean).

Step 4: compute SSE
STEP 4 Compute SSE, the error sum of squares.
Here we utilize the property that the treatment sum of squares plus the error sum of squares equals the total sum of squares. Hence, SSE = SS Total - SST = 45.349 - 27.897 = 17.452.

Step 5: Compute MST, MSE, and F
STEP 5 Compute MST, MSE and their ratio, F.
MST is the mean square of treatments, MSE is the mean square of error (MSE is also frequently denoted by $\hat{\sigma}_e^2$).
MST = SST / (k-1) = 27.897 / 2 = 13.949
MSE = SSE / (N-k) = 17.452 / 12 = 1.454

where N is the total number of observations and k is the number of treatments. Finally, compute F as

F = MST / MSE = 9.59

That is it. These numbers are the quantities that are assembled in the ANOVA table that was shown previously.

7.4.3.5. Confidence intervals for the difference of treatment means
http://www.itl.nist.gov/div898/handbook/prc/section4/prc435.htm

Contrasts discussed later
Later on, the topic of estimating more general linear combinations of means (primarily contrasts) will be discussed, including how to put confidence bounds around contrasts.

7.4.3.6. Assessing the response from any factor combination
http://www.itl.nist.gov/div898/handbook/prc/section4/prc436.htm

Confidence intervals for the factor level means
It can be shown that:

$$ t = \frac{\bar{Y}_{i\cdot} - \mu_i}{\sqrt{\hat{\sigma}_e^2 / n_i}} $$

(with $\hat{\sigma}_e^2$ = MSE) has a t-distribution with (N - k) degrees of freedom for the ANOVA model under consideration, where N is the total number of observations and k is the number of factor levels or groups. The degrees of freedom are the same as were used to calculate the MSE in the ANOVA table. That is: dfe (degrees of freedom for error) = N - k. From this we can calculate $(1-\alpha)100\%$ confidence limits for each $\mu_i$. These are given by:

$$ \bar{Y}_{i\cdot} \pm t_{\alpha/2;\,N-k} \sqrt{\frac{\hat{\sigma}_e^2}{n_i}} $$

Example 1

Example for a 4-level treatment (or 4 different treatments)
The data in the accompanying table resulted from an experiment run in a completely randomized design in which each of four treatments was replicated five times.

                                               Total     Mean
Group 1       6.9    5.4    5.8    4.6    4.0    26.70    5.34
Group 2       8.3    6.8    7.8    9.2    6.5    38.60    7.72
Group 3       8.0   10.5    8.1    6.9    9.3    42.80    8.56
Group 4       5.8    3.8    6.1    5.6    6.2    27.50    5.50
All Groups                                      135.60    6.78
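The confidence limits for the factor level means can be evaluated for the four-group data with a few lines of Python. This is a sketch under stated assumptions: the names `groups`, `T_CRIT`, and `mean_confidence_limits` are illustrative, and the critical value 2.120 for a two-sided 95% interval with 16 error degrees of freedom is entered by hand from a t table rather than computed.

```python
import math

# Four-group example data (five replicates per treatment).
groups = {
    "Group 1": [6.9, 5.4, 5.8, 4.6, 4.0],
    "Group 2": [8.3, 6.8, 7.8, 9.2, 6.5],
    "Group 3": [8.0, 10.5, 8.1, 6.9, 9.3],
    "Group 4": [5.8, 3.8, 6.1, 5.6, 6.2],
}

# Assumption: t critical value for alpha/2 = .025 and N - k = 16 degrees of
# freedom, taken from a t table (not computed here).
T_CRIT = 2.120


def mean_confidence_limits(groups, t_crit):
    """Return (MSE, {group: (lower, upper)}) for the factor level means."""
    n_total = sum(len(ys) for ys in groups.values())
    k = len(groups)

    # SSE: squared deviations of each observation from its own group mean.
    sse = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        sse += sum((y - mean) ** 2 for y in ys)
    mse = sse / (n_total - k)              # error mean square, N - k d.f.

    limits = {}
    for name, ys in groups.items():
        mean = sum(ys) / len(ys)
        half_width = t_crit * math.sqrt(mse / len(ys))
        limits[name] = (mean - half_width, mean + half_width)
    return mse, limits


mse, limits = mean_confidence_limits(groups, T_CRIT)
print(f"MSE = {mse:.3f}")
for name, (lower, upper) in limits.items():
    print(f"{name}: ({lower:.3f}, {upper:.3f})")
```

Every interval uses the pooled MSE from all four groups, not the variance of the single group, which is why the degrees of freedom are N - k = 16 and not $n_i$ - 1 = 4.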
1-Way ANOVA table layout
This experiment can be illustrated by the table layout for this 1-way ANOVA experiment shown below:

Level            Sample j                  Sum     Mean     N
  i       1       2       ...     5
  1       Y_11    Y_12    ...     Y_15     Y_1.    Ȳ_1.     n_1
  2       Y_21    Y_22    ...     Y_25     Y_2.    Ȳ_2.     n_2
  3       Y_31    Y_32    ...     Y_35     Y_3.    Ȳ_3.     n_3
  4       Y_41    Y_42    ...     Y_45     Y_4.    Ȳ_4.     n_4
 All                                       Y_..    Ȳ_..     n_t

ANOVA table
The resulting ANOVA table is

Source               SS         DF    MS        F
Treatments            38.820     3    12.940    9.724
Error                 21.292    16     1.331
Total (Corrected)     60.112    19
Mean                 919.368     1
Total (Raw)          979.480    20

The estimate for the mean of group 1 is 5.34, and the sample size is $n_1 = 5$.

Computing the confidence interval
Since the confidence interval is two-sided, the entry $\alpha/2$ value for the t-table is .5(1 - .95) = .025, and the associated degrees of freedom is N - 4, or 20 - 4 = 16. From the t table in Chapter 1, we obtain $t_{.025;16} = 2.120$. Next we need the standard error of the mean for group 1:

$$ s_{\bar{Y}_{1\cdot}} = \sqrt{\frac{\hat{\sigma}_e^2}{n_1}} = \sqrt{\frac{1.331}{5}} = 0.5159 $$

Hence, we obtain confidence limits 5.34 ± 2.120 (0.5159) and the confidence interval is

$$ (4.246,\ 6.434) $$

Definition and Estimation of Contrasts

Definition of contrasts and orthogonal contrasts
Definitions
A contrast is a linear combination of 2 or more factor level means with coefficients that sum to zero.
Two contrasts are orthogonal if the sum of the products of corresponding coefficients (i.e., coefficients for the same means) adds to zero.

Formally, the definition of a contrast is expressed below, using the notation $\mu_i$ for the i-th treatment mean:

$$ C = c_1 \mu_1 + c_2 \mu_2 + \cdots + c_j \mu_j + \cdots + c_k \mu_k $$

where

$$ c_1 + c_2 + \cdots + c_j + \cdots + c_k = \sum_{j=1}^{k} c_j = 0 $$

Simple contrasts include the case of the difference between two factor means, such as $\mu_1 - \mu_2$. If one wishes to compare treatments 1 and 2 with treatment 3, one way of expressing this is by: $\mu_1 + \mu_2 - 2\mu_3$.
Note that
$\mu_1 - \mu_2$ has coefficients +1, -1
$\mu_1 + \mu_2 - 2\mu_3$ has coefficients +1, +1, -2.
These coefficients sum to zero.

An example of orthogonal contrasts
As an example of orthogonal contrasts, note the three contrasts defined by the table below, where the rows denote coefficients for the column treatment means.

         μ_1    μ_2    μ_3    μ_4
c_1      +1      0      0     -1
c_2       0     +1     -1      0
c_3      +1     -1     -1     +1

[...] means, not just for contrasts.

Confidence Interval for a Contrast

Confidence intervals for contrasts
An unbiased estimator for a contrast C is given by

$$ \hat{C} = \sum_{i=1}^{k} c_i \bar{Y}_{i\cdot} $$

The estimator of $\sigma_{\hat{C}}^2$ is

$$ s_{\hat{C}}^2 = \hat{\sigma}_e^2 \sum_{i=1}^{k} \frac{c_i^2}{n_i} $$

The estimator $\hat{C}$ is normally distributed because it is a linear combination of [...]

[...] estimate
We wish to estimate, in our previous example, the following contrast: [...] and construct a 95 percent confidence interval for C.

Computing the point estimate and standard error
The point estimate is: [...] Applying the formulas above we obtain [...] and the standard error is $s_{\hat{C}} = 0.5159$.
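The contrast definitions and the contrast standard error can be checked numerically. The sketch below is illustrative Python, not handbook code. It tests the three tabulated coefficient rows for the sum-to-zero and orthogonality properties, then estimates a contrast with the usual formulas: the estimate is the sum of $c_i$ times the group means, and the standard error is the square root of MSE times the sum of $c_i^2 / n_i$. The specific contrast of the worked example is not legible in this text, so the coefficients (0.5, 0.5, -0.5, -0.5) are an assumption made for illustration; they were picked because they give a standard error matching the quoted 0.5159. The helper names are hypothetical, and the values for the group means, MSE, and t critical value are copied from the four-group example.

```python
import math

# Coefficient rows c_1, c_2, c_3 from the orthogonal-contrast table.
table_contrasts = [
    [+1, 0, 0, -1],
    [0, +1, -1, 0],
    [+1, -1, -1, +1],
]


def is_contrast(coeffs, tol=1e-12):
    """A contrast has coefficients that sum to zero."""
    return abs(sum(coeffs)) < tol


def are_orthogonal(a, b, tol=1e-12):
    """Orthogonal if the sum of products of matching coefficients is zero."""
    return abs(sum(x * y for x, y in zip(a, b))) < tol


def contrast_interval(coeffs, means, sizes, mse, t_crit):
    """Point estimate, standard error, and confidence limits for a contrast."""
    estimate = sum(c * m for c, m in zip(coeffs, means))
    std_err = math.sqrt(mse * sum(c * c / n for c, n in zip(coeffs, sizes)))
    half_width = t_crit * std_err
    return estimate, std_err, (estimate - half_width, estimate + half_width)


# Values from the four-group example: group means, sample sizes, MSE, and
# the t critical value for 16 degrees of freedom at the 95% level.
means = [5.34, 7.72, 8.56, 5.50]
sizes = [5, 5, 5, 5]
MSE = 1.331
T_CRIT = 2.120

# ASSUMPTION: illustrative coefficients, not recovered from the source text.
assumed_coeffs = [0.5, 0.5, -0.5, -0.5]

for i, row in enumerate(table_contrasts, start=1):
    print(f"c_{i} sums to zero: {is_contrast(row)}")
for i in range(len(table_contrasts)):
    for j in range(i + 1, len(table_contrasts)):
        ok = are_orthogonal(table_contrasts[i], table_contrasts[j])
        print(f"c_{i + 1} and c_{j + 1} orthogonal: {ok}")

estimate, std_err, (lower, upper) = contrast_interval(
    assumed_coeffs, means, sizes, MSE, T_CRIT
)
print(f"estimate = {estimate:.3f}, standard error = {std_err:.4f}")
print(f"95% limits: ({lower:.3f}, {upper:.3f})")
```

The orthogonality check uses the plain sum of products, which matches the definition given in the text and is appropriate here because all four groups have the same sample size.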