Chapter 12 Hypothesis testing: Describing a single population

Introduction

The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favour of a certain belief about a parameter.

Examples
• In a criminal trial, a jury must decide whether the defendant is innocent or guilty based on the evidence presented at the court.
• Is there statistical evidence in a random sample of potential customers that supports the hypothesis that more than p% of potential customers will purchase a new product?
• Is a new drug effective in curing a certain disease? A sample of patients is randomly selected. Half of them are given the drug, and the other half a placebo. The improvement in the patients' conditions is then measured and compared.

12.1 Concepts of Hypothesis Testing

A criminal trial is an example of hypothesis testing without the statistics. In a trial a jury must decide between two hypotheses.
The null hypothesis is H0: The defendant is innocent.
The alternative hypothesis or research hypothesis is HA: The defendant is guilty.
The jury does not know which hypothesis is true. They must make a decision on the basis of the evidence presented.

In the language of statistics, convicting the defendant is called rejecting the null hypothesis in favour of the alternative hypothesis. That is, the jury is saying that there is enough evidence to conclude that the defendant is guilty (i.e., there is enough evidence to support the alternative hypothesis).

If the jury acquits, it is stating that there is not enough evidence to support the alternative hypothesis. Notice that the jury is not saying that the defendant is innocent, only that there is not enough evidence to support the alternative hypothesis. That is why we never say that we accept the null hypothesis.

There are two possible errors.
A Type I error occurs when we reject a true null hypothesis. That is, a Type I error occurs when the jury convicts an innocent person.
A Type II error occurs when we don't reject a false null hypothesis. That occurs when a guilty defendant is acquitted.

The probability of a Type I error is denoted as α (Greek letter alpha). The probability of a Type II error is β (Greek letter beta). The two probabilities are inversely related: for a given sample size, decreasing one increases the other.

In our judicial system Type I errors are regarded as more serious. We try to avoid convicting innocent people. We are more willing to acquit guilty people. We arrange to make α small by requiring the prosecution to prove its case and instructing the jury to find the defendant guilty only if there is 'evidence beyond a reasonable doubt'.

Example 12.1: Calculating β

Calculate the probability of a Type II error when the actual mean is 135. Recall that
H0: μ = 130
H1: μ ≠ 130
n = 100, σ = 15, α = 0.05

Stage 1: Rejection region (two-tailed test)

$$z > z_{\alpha/2} = z_{.025} = 1.96 \quad \text{or} \quad z < -z_{\alpha/2} = -z_{.025} = -1.96$$

$$\frac{\bar{x} - 130}{15/\sqrt{100}} > 1.96 \;\Rightarrow\; \bar{x} > 132.94$$

$$\frac{\bar{x} - 130}{15/\sqrt{100}} < -1.96 \;\Rightarrow\; \bar{x} < 127.06$$

Stage 2: Probability of a Type II error

$$\beta = P(127.06 < \bar{x} < 132.94 \mid \mu = 135)$$

$$= P\left(\frac{127.06 - 135}{15/\sqrt{100}} < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < \frac{132.94 - 135}{15/\sqrt{100}}\right)$$

$$= P(-5.29 < z < -1.37) = .0853$$

Judging the Test

The power of a test is defined as 1 – β. It represents the probability of rejecting the null hypothesis when it is false. When more than one test can be performed in a given situation, it is preferable to use the test that is correct more often. If one test has a higher power than a second test, the first test is said to be more powerful and is the preferred test.

Using Excel…

The Beta-mean workbook is a handy tool for calculating β for any test of hypothesis. For example, comparing n = 400 to n = 1,000 for our department store example… The power of the test has increased by increasing the sample size…

13.6 Testing the population proportion

• When the population consists of nominal or categorical data, the only test we can perform is about the proportion of occurrence of a certain value.
• The parameter p was used previously to calculate probabilities using the binomial distribution.
• Statistic and sampling distribution
– The statistic employed is $\hat{p} = x/n$, where x is the number of successes and n is the sample size.
– Under certain conditions [np ≥ 5 and n(1 – p) ≥ 5], $\hat{p}$ is approximately normally distributed, with μ = p and σ² = p(1 – p)/n.
• Test statistic for p

$$Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}}$$

where np ≥ 5 and n(1 − p) ≥ 5.

Example (marketing application)

For a new newspaper to be financially viable, it has to capture at least 12% of the Sydney market. In a survey conducted among 400 randomly selected prospective readers, 58 participants indicated they would subscribe to the newspaper if its cost did not exceed $20 a month. Can the publisher conclude that the proposed newspaper will be financially viable at the 10% significance level?
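The test statistic for p defined above can be sketched in a few lines of Python, using the survey figures from the example (58 successes out of n = 400, p = 0.12 under H0, α = 0.10) and an upper-tail alternative p > 0.12. This is only an illustration under those assumptions: the function name `proportion_z_test` is hypothetical, and the standard library's `statistics.NormalDist` stands in for the normal table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0, alpha):
    """Upper-tail z-test for a proportion (H0: p = p0, HA: p > p0).

    Hypothetical helper for illustration; not part of the original slides.
    """
    # The normal approximation requires n*p0 >= 5 and n*(1 - p0) >= 5.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation conditions are not met")

    p_hat = successes / n                       # sample proportion x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # test statistic

    std_normal = NormalDist()                   # standard normal, mean 0, sd 1
    z_crit = std_normal.inv_cdf(1 - alpha)      # rejection region: z > z_crit
    p_value = 1 - std_normal.cdf(z)             # P(Z > z)
    return p_hat, z, z_crit, p_value


p_hat, z, z_crit, p_value = proportion_z_test(58, 400, 0.12, 0.10)
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, critical value = {z_crit:.2f}")
print(f"p-value = {p_value:.4f}, reject H0: {p_value < 0.10}")
```

The code keeps z unrounded, so its p-value (about 0.062) can differ in the fourth decimal place from a value read from a table at z = 1.54.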
Solution

The problem objective is to describe the population of newspaper readers in Sydney. The responses to the survey are nominal. The parameter to be tested is p. The hypotheses are:
H0: p = 0.12
HA: p > 0.12 (We want to prove that the newspaper is financially viable.)

Solving by hand
– The rejection region is $z > z_{\alpha} = z_{.10} = 1.28$.
– The sample proportion is $\hat{p} = 58/400 = 0.145$.
– The value of the test statistic is

$$Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} = \frac{0.145 - 0.12}{\sqrt{0.12(1 - 0.12)/400}} = 1.54$$

– p-value = P(Z > 1.54) = 0.0618
– Since p-value = 0.0618 < 0.10, we reject H0. There is sufficient evidence to reject the null hypothesis in favour of the alternative hypothesis. At the 10% significance level we can conclude that more than 12% of Sydney readers will subscribe to the new newspaper.

Hypothesis testing about p: Using Excel

Add-Ins > Data Analysis Plus > Z-test: Proportion, or use the z-test_Proportion worksheet of the Test Statistics Excel workbook.

The Road Ahead

ICI approach: Identify, Compute, Interpret.
The most difficult part of statistics (in real life and on final exams) is to identify the correct technique.

There are several factors that identify the correct technique. The first two are:
• Type of data: numerical, nominal, ordinal
• Problem objective

Problem Objectives
• Describe a population
• Compare two populations
• Compare two or more populations
• Analyse the relationship between two variables
• Analyse the relationship among two or more variables

... 'null' hypothesis HA: the 'alternative' or 'research' hypothesis. The null hypothesis (H0) will always state that the parameter equals the value specified in the alternative hypothesis (HA) Concepts... 400 monthly accounts is drawn, for which the sample mean is $178. The manager knows that the accounts are approximately normally distributed with a standard deviation of $65. Can the manager conclude...
different approaches: the rejection region approach (typically used when computing statistics manually) and the p-value approach (which is generally used with a computer and statistical software).
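The rejection region approach also drives the Type II error calculation of Example 12.1 (H0: μ = 130, two-tailed, n = 100, σ = 15, α = 0.05, actual mean 135). The sketch below reproduces that calculation in Python as an illustration only: the function name `type_ii_error` is hypothetical, and `statistics.NormalDist` from the standard library replaces the normal table.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for a two-tailed z-test of H0: mu = mu0 when the mean is mu_true.

    Hypothetical helper for illustration; not part of the original slides.
    """
    std_err = sigma / sqrt(n)                     # sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}

    # Stage 1: rejection region expressed in terms of the sample mean.
    lower = mu0 - z_crit * std_err                # reject if x_bar < lower
    upper = mu0 + z_crit * std_err                # reject if x_bar > upper

    # Stage 2: probability that x_bar falls between the limits
    # (H0 is not rejected) when the true mean is mu_true.
    sampling_dist = NormalDist(mu_true, std_err)
    beta = sampling_dist.cdf(upper) - sampling_dist.cdf(lower)
    return lower, upper, beta


lower, upper, beta = type_ii_error(130, 135, 15, 100, 0.05)
print(f"do not reject H0 when {lower:.2f} < x_bar < {upper:.2f}")
print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")
```

Because the limits are not rounded to two decimals before the second stage, β comes out at about 0.085, very close to the table-based 0.0853. Calling the function again with a larger n shows β shrinking, which is the sense in which a larger sample gives a more powerful test.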