1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.2. Confidence Limits for the Mean

Purpose: Interval Estimate for Mean

Confidence limits for the mean (Snedecor and Cochran, 1989) are an interval estimate for the mean. Interval estimates are often desirable because the estimate of the mean varies from sample to sample. Instead of a single estimate for the mean, a confidence interval generates a lower and an upper limit for the mean. The interval estimate gives an indication of how much uncertainty there is in our estimate of the true mean. The narrower the interval, the more precise our estimate is.

Confidence limits are expressed in terms of a confidence coefficient. Although the choice of confidence coefficient is somewhat arbitrary, in practice 90%, 95%, and 99% intervals are often used, with 95% being the most common.

As a technical note, a 95% confidence interval does not mean that there is a 95% probability that the interval contains the true mean. The interval computed from a given sample either contains the true mean or it does not. Instead, the level of confidence is associated with the method of calculating the interval. The confidence coefficient is simply the proportion of samples of a given size that may be expected to contain the true mean. That is, for a 95% confidence interval, if many samples are collected and the confidence interval computed, in the long run about 95% of these intervals would contain the true mean.

Definition: Confidence Interval

Confidence limits are defined as:

    \bar{Y} \pm t_{1-\alpha/2,\,N-1} \, \frac{s}{\sqrt{N}}

where \bar{Y} is the sample mean, s is the sample standard deviation, N is the sample size, \alpha is the desired significance level, and t_{1-\alpha/2,\,N-1} is the upper critical value of the t distribution with N - 1 degrees of freedom. Note that the confidence coefficient is 1 - \alpha.

From the formula, it is clear that the width of the interval is controlled by two factors:

1. As N increases, the interval gets narrower from the s/\sqrt{N} term. That is, one way to obtain more precise estimates for the mean is to increase the sample size.
2. The larger the sample standard deviation, the larger the confidence interval. This simply means that noisy data, i.e., data with a large standard deviation, are going to generate wider intervals than data with a smaller standard deviation.

Definition: Hypothesis Test

To test whether the population mean has a specific value, \mu_0, against the two-sided alternative that it does not have the value \mu_0, the confidence interval is converted to hypothesis-test form. The test is a one-sample t-test, and it is defined as:

    H_0: \mu = \mu_0
    H_a: \mu \ne \mu_0

    Test statistic:  T = \frac{\bar{Y} - \mu_0}{s/\sqrt{N}}

    where \bar{Y}, s, and N are defined as above.

    Significance level: \alpha. The most commonly used value for \alpha is 0.05.

    Critical region: Reject the null hypothesis that the mean is a specified value, \mu_0, if

        T < -t_{1-\alpha/2,\,N-1}   or   T > t_{1-\alpha/2,\,N-1}
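The handbook carries out these calculations with Dataplot. As a rough illustration only (Python with NumPy/SciPy is an assumption of this sketch, not part of the handbook, and the simulated sample is arbitrary), the interval and test defined above can be computed directly from the formulas:

```python
# Minimal sketch of the confidence-interval and one-sample t-test formulas above.
import numpy as np
from scipy import stats

def mean_confidence_interval(y, alpha=0.05):
    """Two-sided 100*(1-alpha)% confidence limits: ybar +/- t_{1-alpha/2, N-1} * s / sqrt(N)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    ybar = y.mean()
    s = y.std(ddof=1)                                 # sample standard deviation
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)     # upper critical value of t
    half_width = t_crit * s / np.sqrt(n)
    return ybar - half_width, ybar + half_width

def one_sample_t(y, mu0, alpha=0.05):
    """One-sample t-test of H0: mu = mu0 against Ha: mu != mu0."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t_stat = (y.mean() - mu0) / (y.std(ddof=1) / np.sqrt(n))
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
    reject = abs(t_stat) > t_crit                     # critical region |T| > t_{1-alpha/2, N-1}
    return t_stat, t_crit, reject

# Made-up data purely for illustration; any numeric sample would do.
rng = np.random.default_rng(0)
y = rng.normal(loc=9.26, scale=0.023, size=195)
print(mean_confidence_interval(y, alpha=0.05))
print(one_sample_t(y, mu0=5.0))
print(stats.ttest_1samp(y, popmean=5.0))              # library equivalent of the t-test
```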
Sample Output for Confidence Interval

Dataplot generated the following output for a confidence interval from the ZARR13.DAT data set:

    CONFIDENCE LIMITS FOR MEAN (2-SIDED)

    NUMBER OF OBSERVATIONS      =  195
    MEAN                        =  9.261460
    STANDARD DEVIATION          =  0.2278881E-01
    STANDARD DEVIATION OF MEAN  =  0.1631940E-02

    CONFIDENCE    T         T X SD(MEAN)    LOWER      UPPER
    VALUE (%)     VALUE                     LIMIT      LIMIT
      50.000      0.676     0.110279E-02    9.26036    9.26256
      75.000      1.154     0.188294E-02    9.25958    9.26334
      90.000      1.653     0.269718E-02    9.25876    9.26416
      95.000      1.972     0.321862E-02    9.25824    9.26468
      99.000      2.601     0.424534E-02    9.25721    9.26571
      99.900      3.341     0.545297E-02    9.25601    9.26691
      99.990      3.973     0.648365E-02    9.25498    9.26794
      99.999      4.536     0.740309E-02    9.25406    9.26886

Interpretation of the Sample Output

The first few lines print the sample statistics used in calculating the confidence interval. The table shows the confidence interval for several different significance levels. The first column lists the confidence level (which is 1 - \alpha expressed as a percent), the second column lists the t-value (i.e., t_{1-\alpha/2,\,N-1}), the third column lists the t-value times the standard error (the standard error is s/\sqrt{N}), the fourth column lists the lower confidence limit, and the fifth column lists the upper confidence limit. For example, for a 95% confidence interval, we go to the row identified by 95.000 in the first column and extract an interval of (9.25824, 9.26468) from the last two columns.

Output from other statistical software may look somewhat different from the above output.

Sample Output for t Test

Dataplot generated the following output for a one-sample t-test from the ZARR13.DAT data set:

    T TEST
    (1-SAMPLE)
    MU0 = 5.000000
    NULL HYPOTHESIS UNDER TEST--MEAN MU = 5.000000

    SAMPLE:
       NUMBER OF OBSERVATIONS      =  195
       MEAN                        =  9.261460
       STANDARD DEVIATION          =  0.2278881E-01
       STANDARD DEVIATION OF MEAN  =  0.1631940E-02

    TEST:
       MEAN-MU0                    =  4.261460
       T TEST STATISTIC VALUE      =  2611.284
       DEGREES OF FREEDOM          =  194.0000
       T TEST STATISTIC CDF VALUE  =  1.000000

    ALTERNATIVE-      ALTERNATIVE-          ALTERNATIVE-
    HYPOTHESIS        HYPOTHESIS            HYPOTHESIS
                      ACCEPTANCE INTERVAL   CONCLUSION
    MU <> 5.000000    (0,0.025) (0.975,1)   ACCEPT
    MU <  5.000000    (0,0.05)              REJECT
    MU >  5.000000    (0.95,1)              ACCEPT

Interpretation of Sample Output

We are testing the hypothesis that the population mean is 5. The output is divided into three sections.

1. The first section prints the sample statistics used in the computation of the t-test.
2. The second section prints the t-test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the t-test statistic. The t-test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance intervals printed in section three. For an upper one-tailed test, the alternative hypothesis acceptance interval is (1 - \alpha, 1), the alternative hypothesis acceptance interval for a lower one-tailed test is (0, \alpha), and the alternative hypothesis acceptance interval for a two-tailed test is (1 - \alpha/2, 1) or (0, \alpha/2). Note that accepting the alternative hypothesis is equivalent to rejecting the null hypothesis.
3. The third section prints the conclusions for a 95% test, since this is the most common case. Results are given in terms of the alternative hypothesis for the two-tailed test and for the one-tailed test in both directions. The alternative hypothesis acceptance interval column is stated in terms of the cdf value printed in section two. The last column specifies whether the alternative hypothesis is accepted or rejected. For a different significance level, the appropriate conclusion can be drawn from the t-test statistic cdf value printed in section two. For example, for a significance level of 0.10, the corresponding alternative hypothesis acceptance intervals are (0,0.05) and (0.95,1), (0,0.10), and (0.90,1).

Output from other statistical software may look somewhat different from the above output.
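The "cdf value" convention used in this output can be confusing. The short sketch below (Python/SciPy assumed, not Dataplot; the numbers are the ones printed above) shows how the cdf of the t statistic is compared with the alternative-hypothesis acceptance intervals instead of comparing T with a critical value:

```python
# Sketch of the Dataplot "cdf value" logic described in the interpretation above.
from scipy import stats

t_stat, dof, alpha = 2611.284, 194, 0.05          # values printed in the output above
cdf = stats.t.cdf(t_stat, df=dof)                 # ~ 1.0 for this data set

accept_two_sided = cdf < alpha / 2 or cdf > 1 - alpha / 2   # Ha: mu != mu0
accept_lower     = cdf < alpha                              # Ha: mu <  mu0
accept_upper     = cdf > 1 - alpha                          # Ha: mu >  mu0
print(cdf, accept_two_sided, accept_lower, accept_upper)    # True, False, True -> matches the table
```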
Questions

Confidence limits for the mean can be used to answer the following questions:

1. What is a reasonable estimate for the mean?
2. How much variability is there in the estimate of the mean?
3. Does a given target value fall within the confidence limits?

Related Techniques

Two-Sample t-Test

Confidence intervals for other location estimators such as the median or mid-mean tend to be mathematically difficult or intractable. For these cases, confidence intervals can be obtained using the bootstrap.

Case Study

Heat flow meter data.

Software

Confidence limits for the mean and one-sample t-tests are available in just about all general purpose statistical software programs, including Dataplot.

1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.3. Two-Sample t-Test for Equal Means

Purpose: Test if two population means are equal

The two-sample t-test (Snedecor and Cochran, 1989) is used to determine if two population means are equal. A common application is to test if a new process or treatment is superior to a current process or treatment.

There are several variations on this test:

1. The data may either be paired or not paired. By paired, we mean that there is a one-to-one correspondence between the values in the two samples. That is, if X_1, X_2, ..., X_n and Y_1, Y_2, ..., Y_n are the two samples, then X_i corresponds to Y_i. For paired samples, the difference X_i - Y_i is usually calculated. For unpaired samples, the sample sizes for the two samples may or may not be equal. The formulas for paired data are somewhat simpler than the formulas for unpaired data.
2. The variances of the two samples may be assumed to be equal or unequal. Equal variances yield somewhat simpler formulas, although with computers this is no longer a significant issue.
3. In some applications, you may want to adopt a new process or treatment only if it exceeds the current treatment by some threshold. In this case, we can state the null hypothesis in the form that the difference between the two population means is equal to some constant (the desired threshold).

Definition

The two-sample t-test for unpaired data is defined as:

    H_0: \mu_1 = \mu_2
    H_a: \mu_1 \ne \mu_2

    Test statistic:

        T = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{s_1^2/N_1 + s_2^2/N_2}}

    where N_1 and N_2 are the sample sizes, \bar{Y}_1 and \bar{Y}_2 are the sample means, and s_1^2 and s_2^2 are the sample variances.

    If equal variances are assumed, then the formula reduces to:

        T = \frac{\bar{Y}_1 - \bar{Y}_2}{s_p \sqrt{1/N_1 + 1/N_2}}

    where

        s_p^2 = \frac{(N_1 - 1)\,s_1^2 + (N_2 - 1)\,s_2^2}{N_1 + N_2 - 2}

    Significance level: \alpha.

    Critical region: Reject the null hypothesis that the two means are equal if

        T < -t_{1-\alpha/2,\,\nu}   or   T > t_{1-\alpha/2,\,\nu}

    where t_{1-\alpha/2,\,\nu} is the critical value of the t distribution with \nu degrees of freedom, and

        \nu = \frac{(s_1^2/N_1 + s_2^2/N_2)^2}{(s_1^2/N_1)^2/(N_1 - 1) + (s_2^2/N_2)^2/(N_2 - 1)}

    If equal variances are assumed, then \nu = N_1 + N_2 - 2.
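As a hedged illustration (Python/SciPy assumed, not Dataplot; the file name AUTO83B.DAT and its two-column, -999-padded layout are taken from the description in the sample output and the data listing below, so adjust the reading step to your local copy), the unpaired test in both its pooled and Welch forms might be run as follows:

```python
# Sketch of the two-sample t-test above, assuming AUTO83B.DAT is available locally
# as two whitespace-separated columns (U.S. mpg, Japanese mpg) with -999 used as
# the missing-value placeholder in the second column.
import numpy as np
from scipy import stats

data = np.loadtxt("AUTO83B.DAT")
us_mpg    = data[:, 0]
japan_mpg = data[:, 1]
japan_mpg = japan_mpg[japan_mpg != -999]     # drop the -999 placeholder entries

# Equal-variance (pooled) form and unequal-variance (Welch) form.
t_pooled, p_pooled = stats.ttest_ind(us_mpg, japan_mpg, equal_var=True)
t_welch,  p_welch  = stats.ttest_ind(us_mpg, japan_mpg, equal_var=False)
print(t_pooled, p_pooled)
print(t_welch, p_welch)
```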
Sample Output

Dataplot generated the following output for the t-test from the AUTO83B.DAT data set:

    T TEST
    (2-SAMPLE)
    NULL HYPOTHESIS UNDER TEST--POPULATION MEANS MU1 = MU2

    SAMPLE 1:
       NUMBER OF OBSERVATIONS      =  249
       MEAN                        =  20.14458
       STANDARD DEVIATION          =  6.414700
       STANDARD DEVIATION OF MEAN  =  0.4065151

    SAMPLE 2:
       NUMBER OF OBSERVATIONS      =  79
       MEAN                        =  30.48101
       STANDARD DEVIATION          =  6.107710
       STANDARD DEVIATION OF MEAN  =  0.6871710

    IF ASSUME SIGMA1 = SIGMA2:
       POOLED STANDARD DEVIATION   =  6.342600
       DIFFERENCE (DEL) IN MEANS   = -10.33643
       STANDARD DEVIATION OF DEL   =  0.8190135
       T TEST STATISTIC VALUE      = -12.62059
       DEGREES OF FREEDOM          =  326.0000
       T TEST STATISTIC CDF VALUE  =  0.000000

    IF NOT ASSUME SIGMA1 = SIGMA2:
       STANDARD DEVIATION SAMPLE 1 =  6.414700
       STANDARD DEVIATION SAMPLE 2 =  6.107710
       BARTLETT CDF VALUE          =  0.402799
       DIFFERENCE (DEL) IN MEANS   = -10.33643
       STANDARD DEVIATION OF DEL   =  0.7984100
       T TEST STATISTIC VALUE      = -12.94627
       EQUIVALENT DEG. OF FREEDOM  =  136.8750
       T TEST STATISTIC CDF VALUE  =  0.000000

    ALTERNATIVE-      ALTERNATIVE-          ALTERNATIVE-
    HYPOTHESIS        HYPOTHESIS            HYPOTHESIS
                      ACCEPTANCE INTERVAL   CONCLUSION
    MU1 <> MU2        (0,0.025) (0.975,1)   ACCEPT
    MU1 <  MU2        (0,0.05)              ACCEPT
    MU1 >  MU2        (0.95,1)              REJECT

Interpretation of Sample Output

We are testing the hypothesis that the population means are equal for the two samples. The output is divided into five sections.

1. The first section prints the sample statistics for sample one used in the computation of the t-test.
2. The second section prints the sample statistics for sample two used in the computation of the t-test.
3. The third section prints the pooled standard deviation, the difference in the means, the t-test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the t-test statistic under the assumption that the standard deviations are equal. The t-test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance intervals printed in section five. For an upper one-tailed test, the acceptance interval is (0, 1 - \alpha), the acceptance interval for a two-tailed test is (\alpha/2, 1 - \alpha/2), and the acceptance interval for a lower one-tailed test is (\alpha, 1).
4. The fourth section prints the standard deviations of the two samples, the difference in the means, the t-test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the t-test statistic under the assumption that the standard deviations are not equal. The t-test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance intervals printed in section five. For an upper one-tailed test, the alternative hypothesis acceptance interval is (1 - \alpha, 1), the alternative hypothesis acceptance interval for a lower one-tailed test is (0, \alpha), and the alternative hypothesis acceptance interval for a two-tailed test is (1 - \alpha/2, 1) or (0, \alpha/2). Note that accepting the alternative hypothesis is equivalent to rejecting the null hypothesis.
5. The fifth section prints the conclusions for a 95% test under the assumption that the standard deviations are not equal, since a 95% test is the most common case. Results are given in terms of the alternative hypothesis for the two-tailed test and for the one-tailed test in both directions. The alternative hypothesis acceptance interval column is stated in terms of the cdf value printed in section four. The last column specifies whether the alternative hypothesis is accepted or rejected. For a different significance level, the appropriate conclusion can be drawn from the t-test statistic cdf value printed in section four. For example, for a significance level of 0.10, the corresponding alternative hypothesis acceptance intervals are (0,0.05) and (0.95,1), (0,0.10), and (0.90,1).

Output from other statistical software may look somewhat different from the above output.
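The derived quantities in the output can be reproduced from the printed per-sample summary statistics alone. The sketch below (Python/SciPy assumed, not Dataplot) applies the pooled and Welch formulas from the definition to the numbers shown above and should recover the pooled standard deviation (about 6.3426), the two t statistics (about -12.62 and -12.95), and the equivalent degrees of freedom (about 136.9):

```python
# Recompute the Dataplot output quantities from the printed summary statistics.
import numpy as np
from scipy import stats

n1, m1, s1 = 249, 20.14458, 6.414700    # sample 1 (U.S. cars)
n2, m2, s2 = 79, 30.48101, 6.107710     # sample 2 (Japanese cars)

# Equal-variance (pooled) form: df = n1 + n2 - 2 = 326.
sp = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
t_pooled = (m1 - m2) / (sp * np.sqrt(1 / n1 + 1 / n2))

# Unequal-variance (Welch) form with the approximate degrees of freedom.
se = np.sqrt(s1**2 / n1 + s2**2 / n2)
t_welch = (m1 - m2) / se
df_welch = (s1**2 / n1 + s2**2 / n2) ** 2 / (
    (s1**2 / n1) ** 2 / (n1 - 1) + (s2**2 / n2) ** 2 / (n2 - 1)
)

# SciPy can also work directly from summary statistics.
print(stats.ttest_ind_from_stats(m1, s1, n1, m2, s2, n2, equal_var=True))
print(stats.ttest_ind_from_stats(m1, s1, n1, m2, s2, n2, equal_var=False))
print(sp, t_pooled, t_welch, df_welch)
```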
Questions

Two-sample t-tests can be used to answer the following questions:

1. Is process 1 equivalent to process 2?
2. Is the new process better than the current process?
3. Is the new process better than the current process by at least some pre-determined threshold amount?

Related Techniques

Confidence Limits for the Mean
Analysis of Variance

Case Study

Ceramic strength data.

Software

Two-sample t-tests are available in just about all general purpose statistical software programs, including Dataplot.

1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques
1.3.5.3. Two-Sample t-Test for Equal Means

1.3.5.3.1. Data Used for Two-Sample t-Test

Data Used for Two-Sample t-Test Example

The following is the data used for the two-sample t-test example. The first column is miles per gallon for U.S. cars and the second column is miles per gallon for Japanese cars. For the t-test example, rows with the second column equal to -999 were deleted.

    U.S.   Japan
    18     24
    15     27
    18     27
    16     25
    17     31
    15     35
    14     24
    14     19
    14     28
    15     23
    15     27
    14     20
    15     22
    14     18
    22     20
    18     31
    21     32
    21     31
    10     32
    10     24
    11     26
     9     29
    28     24
    25     24
    19     33
    16     33
    17     32
    19     28
    18     19
    14     32
    14     34
    14     26
    14     30
    12     22
    13     22
    13     33
    18     39
    22     36
    19     28
    18     27
    23     21
    26     24
    25     30
    20     34
    21     32
    13     38
    14     37
    15     30
    14     31
    17     37
    11     32
    13     47
    12     41
    13     45
    15     34
    13     33
    13     24
    14     32
    22     39
    28     35
    13     32
    14     37
    13     38
    14     34
    15     34
    12     32
    13     33
    13     32
    14     25
    13     24
    12     37
    13     31
    18     36
    16     36

[...]

1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.6. Measures of Scale

Scale, Variability, or Spread

A fundamental task in many statistical analyses is to characterize the spread, or variability, of a data set. Measures of scale are simply attempts to estimate this variability. When assessing the variability of a data set, there are two key components:

1. How spread out are the data values near the center?
2. How spread out are the tails?

Different numerical summaries will give different weight to these two elements. The choice of scale estimator is often driven by which of these components you want to emphasize. The histogram is an effective graphical technique for showing both of these components of the spread.

Definitions of Variability

For univariate data, there [...] original data units (the variance squares the units).

3. range - the range is the largest value minus the smallest value in a data set. Note that this measure is based only on the lowest and highest extreme values in the sample; the spread near the center of the data is not captured at all.

4. average absolute deviation - the average absolute deviation (AAD) is defined as

       \mathrm{AAD} = \frac{1}{N} \sum_{i=1}^{N} |Y_i - \bar{Y}|

   where \bar{Y} is the mean of the data and |Y| is the absolute value of Y. [...]
[...] standard deviation.

5. median absolute deviation - the median absolute deviation (MAD) is defined as

       \mathrm{MAD} = \mathrm{median}\,|Y_i - \tilde{Y}|

   where \tilde{Y} is the median of the data and |Y| is the absolute value of Y. This is a variation of the average absolute deviation that is even less affected by extremes in the tail, because the data in the tails have less influence on the calculation of the median than they do on the mean.

6. interquartile range - this [...]

[...] collecting more data does not provide a more accurate estimate for the mean or standard deviation. That is, the sampling distributions of the mean and standard deviation are equivalent to the sampling distribution of the original data. That means that for the Cauchy distribution the standard deviation is useless as a measure of the spread. From the histogram, it is clear that just about all the data are between [...]

[...] not give the formula here.

Software

Most general purpose statistical software programs, including Dataplot, can generate at least some of the measures of scale discussed above.

1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.7. Bartlett's Test

Purpose: [...]

[...] texts and software programs. In particular, Dataplot uses the opposite convention. An alternate definition (Dixon and Massey, 1969) is based on an approximation to the F distribution. This definition is given in the Product and Process Comparisons chapter (chapter 7).

Sample Output

Dataplot generated the following output for Bartlett's test using the GEAR.DAT data set:

    BARTLETT TEST
    (STANDARD DEFINITION)
    [...]

Related Techniques

Chi-Square Test
Analysis of Variance

Case Study

Heat flow meter data.

Software

The Bartlett test is available in many general purpose statistical software programs, including Dataplot.

1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.8. Chi-Square Test for the Standard Deviation

[...] Note that this is the opposite of some texts and software programs. In particular, Dataplot uses the opposite convention. The formula for the hypothesis test can easily be converted to form an interval estimate for the standard deviation:

    \sqrt{\frac{(N-1)\,s^2}{\chi^2_{1-\alpha/2,\,N-1}}} \le \sigma \le \sqrt{\frac{(N-1)\,s^2}{\chi^2_{\alpha/2,\,N-1}}}

Sample Output

Dataplot generated the following output for a chi-square test from the GEAR.DAT data set:

    CHI-SQUARED TEST
    SIGMA0 = 0.1000000
    NULL HYPOTHESIS UNDER TEST--STANDARD [...]

[...] assumptions for a univariate measurement process. That is, after performing an analysis of variance, the model should be validated by analyzing the residuals.

Dataplot generated the following output for the one-way analysis of variance from the GEAR.DAT data set:

    NUMBER OF OBSERVATIONS
    NUMBER OF FACTORS
    NUMBER OF LEVELS FOR FACTOR 1
    BALANCED CASE
    RESIDUAL STANDARD DEVIATION  =  0.59385783970E-02
    RESIDUAL DEGREES [...]
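The preview fragments above list several measures of scale and mention Bartlett's test and the chi-square test for a single standard deviation without showing the full formulas. As a hedged sketch only (Python with NumPy/SciPy is assumed, the sample data are made up, and the chi-square statistic uses the standard form (N-1)s^2/\sigma_0^2, which the truncated preview does not display), these quantities might be computed as follows:

```python
# Sketch of the measures of scale and variance tests referenced in the fragments above.
import numpy as np
from scipy import stats

def scale_measures(y):
    """Common univariate measures of spread."""
    y = np.asarray(y, dtype=float)
    q75, q25 = np.percentile(y, [75, 25])
    return {
        "variance": y.var(ddof=1),
        "standard deviation": y.std(ddof=1),
        "range": y.max() - y.min(),
        "average absolute deviation": np.mean(np.abs(y - y.mean())),
        "median absolute deviation": np.median(np.abs(y - np.median(y))),
        "interquartile range": q75 - q25,
    }

def chisq_test_sd(y, sigma0):
    """Chi-square test of H0: sigma = sigma0, using the standard statistic (N-1)s^2/sigma0^2."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t = (n - 1) * y.var(ddof=1) / sigma0**2
    cdf = stats.chi2.cdf(t, df=n - 1)
    return t, 2 * min(cdf, 1 - cdf)          # two-sided p-value

# Made-up groups purely for illustration; Bartlett's test checks equal variances.
rng = np.random.default_rng(1)
groups = [rng.normal(0, 1.0, 50), rng.normal(0, 1.2, 50), rng.normal(0, 0.9, 50)]
print(scale_measures(groups[0]))
print(chisq_test_sd(groups[0], sigma0=1.0))
print(stats.bartlett(*groups))
```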