Hypothesis tests for two independent samples
• Compare two proportions
• Compare mean values of two populations
• Compare two variances

Problem: Compare two mean values

Let $(X_1, X_2, \dots, X_n)$ be a sample of $n$ independent observations from a variable $X$ with expectation $\mu_1$ and variance $\sigma_1^2$, and let $(Y_1, Y_2, \dots, Y_m)$ be a sample of $m$ independent observations from a variable $Y$ with expectation $\mu_2$ and variance $\sigma_2^2$.

Problem: Compare the two expectations $\mu_1$ and $\mu_2$.
Estimate and compare the two sample means $\bar{X}$ and $\bar{Y}$.
The problem can be solved by using the following theorem.

Theorem
Let $(X_1, X_2, \dots, X_n)$ and $(Y_1, Y_2, \dots, Y_m)$ be two samples of $n$ and $m$ independent observations drawn respectively from a variable $X$ with sample mean $\bar{X}$ and sample variance $S_X^2$ and from a variable $Y$ with sample mean $\bar{Y}$ and sample variance $S_Y^2$ (both variables are normally distributed). Then the (new) variable

$$t = (\bar{X} - \bar{Y}) \sqrt{\frac{n\,m\,(n+m-2)}{n+m}} \Big/ \sqrt{n S_X^2 + m S_Y^2}$$

has the Student distribution with $(n+m-2)$ degrees of freedom. Here $S_X^2$ and $S_Y^2$ are computed with divisors $n$ and $m$, and the result requires $\mu_1 = \mu_2$ and $\sigma_1^2 = \sigma_2^2$.

Hypothesis Tests
A. Two-tailed test:
   Hypothesis H: Mean(X) = Mean(Y)
   Alternative Hypothesis K: Mean(X) differs from Mean(Y)
B. Right one-tailed test:
   Hypothesis H: Mean(X) = Mean(Y)
   Alternative Hypothesis K: Mean(X) > Mean(Y)
C. Left one-tailed test:
   Hypothesis H: Mean(X) = Mean(Y)
   Alternative Hypothesis K: Mean(X) < Mean(Y)

Steps of testing
Step 1. Estimate the sample means Mean(X), Mean(Y) and the sample variances Var(X), Var(Y).
Step 2. Calculate the quantity

$$t = \big(\mathrm{Mean}(X) - \mathrm{Mean}(Y)\big) \sqrt{\frac{n\,m\,(n+m-2)}{n+m}} \Big/ \sqrt{n\,\mathrm{Var}(X) + m\,\mathrm{Var}(Y)}$$

Step 3 (Version A: Computer). Taking a variable T(n+m-2) that has the Student distribution with (n+m-2) degrees of freedom, calculate the probability
   b = P { |T(n+m-2)| >= |t| } (for the two-tailed test); or
   b = P { T(n+m-2) >= t } (for the right one-tailed test); or
   b = P { T(n+m-2) <= t } (for the left one-tailed test, where t < 0).
Step 4. Compare the probability b with a significance level alpha chosen in advance (5%, 1%, 0.5% or 0.1%):
+ If b >= alpha, accept Hypothesis H and conclude Mean(X) = Mean(Y).
+ If b < alpha, reject Hypothesis H and
confirm that Mean(X) differs from Mean(Y) (for the two-tailed test); or Mean(X) > Mean(Y) (for the right one-tailed test); or Mean(X) < Mean(Y) (for the left one-tailed test).

Version B: Using the Student distribution table
In the table of the Student distribution, find the critical value T(n+m-2, alpha/2) of the Student distribution with n+m-2 degrees of freedom (alpha is a significance level chosen in advance: 5%, 1% or 0.5%).
Decide:
- Reject Hypothesis H: Mean(X) = Mean(Y) if |t| > T(n+m-2, alpha/2)
- Accept Hypothesis H: Mean(X) = Mean(Y) if |t| <= T(n+m-2, alpha/2)

Version C: Using confidence intervals
When the number of degrees of freedom (sample size) is large, the Student distribution is close to the Normal distribution. Then we can use confidence intervals (95% confidence, i.e. a significance level of 5%) for testing:

$$\Big[\mathrm{Mean}(X) - 1.96\sqrt{\mathrm{Var}(X)/n}\;;\; \mathrm{Mean}(X) + 1.96\sqrt{\mathrm{Var}(X)/n}\Big]$$
$$\Big[\mathrm{Mean}(Y) - 1.96\sqrt{\mathrm{Var}(Y)/m}\;;\; \mathrm{Mean}(Y) + 1.96\sqrt{\mathrm{Var}(Y)/m}\Big]$$

Decide:
- Reject Hypothesis H: Mean(X) = Mean(Y) if the two intervals are disjoint
- Accept Hypothesis H: Mean(X) = Mean(Y) if the two intervals have a nonempty intersection

SPSS

Compare two proportions

If Hypothesis H is true, then treat the two samples $(X_1, X_2, \dots, X_{n_1})$ and $(Y_1, Y_2, \dots, Y_{n_2})$ as samples collected from one variable and estimate the common variance of $X$ and $Y$ by

$$\frac{m_1+m_2}{n_1+n_2}\left(1 - \frac{m_1+m_2}{n_1+n_2}\right) = \frac{m_1+m_2}{n_1+n_2}\cdot\frac{n_1+n_2-m_1-m_2}{n_1+n_2}$$

then form the statistic

$$u = \left(\frac{m_1}{n_1} - \frac{m_2}{n_2}\right) \Big/ \sqrt{\frac{m_1+m_2}{n_1+n_2}\cdot\frac{n_1+n_2-m_1-m_2}{n_1+n_2}\cdot\frac{n_1+n_2}{n_1 n_2}}$$

for testing, where $m_1$ and $m_2$ are the numbers of values 1 appearing in the two samples, of sizes $n_1$ and $n_2$ respectively.

By the Central Limit Theorem, when the sample sizes are large, the difference Mean(X) - Mean(Y) has a distribution very close to the Normal distribution. Then the testing procedure can be as follows:
Step 1. Calculate the value of the statistic

$$u = \left(\frac{m_1}{n_1} - \frac{m_2}{n_2}\right) \Big/ \sqrt{\frac{m_1+m_2}{n_1+n_2}\cdot\frac{n_1+n_2-m_1-m_2}{n_1+n_2}\cdot\frac{n_1+n_2}{n_1 n_2}}$$

Step 2. Using the Normal distribution N(0,1), find the probability b = P { |N(0,1)| > |u| }.
Step 3. Compare the probability b with a significance level alpha chosen in advance:
* If b >= alpha, accept Hypothesis H and confirm the equality of the two proportions.
* If b < alpha, reject Hypothesis H.

Version B: Using the Normal distribution table
Find the critical value u(alpha/2) of the Normal distribution N(0,1).
Decide:
- Reject Hypothesis H if |u| >= u(alpha/2)
- Accept Hypothesis
H if |u| < u(alpha/2)

Version C: Using confidence intervals
Use confidence intervals (95% confidence, i.e. a significance level of 5%) of the estimated proportions for testing:

$$\left[\frac{m_1}{n_1} - 1.96\sqrt{\frac{m_1}{n_1}\left(1 - \frac{m_1}{n_1}\right)\Big/ n_1}\;;\; \frac{m_1}{n_1} + 1.96\sqrt{\frac{m_1}{n_1}\left(1 - \frac{m_1}{n_1}\right)\Big/ n_1}\right]$$
$$\left[\frac{m_2}{n_2} - 1.96\sqrt{\frac{m_2}{n_2}\left(1 - \frac{m_2}{n_2}\right)\Big/ n_2}\;;\; \frac{m_2}{n_2} + 1.96\sqrt{\frac{m_2}{n_2}\left(1 - \frac{m_2}{n_2}\right)\Big/ n_2}\right]$$

Decide:
- Reject Hypothesis H if the two intervals are disjoint
- Accept Hypothesis H if the two intervals have a nonempty intersection

SPSS

Compare several proportions

Let X be a binary variable taking the two values 0 and 1. Collecting data from that variable under k different conditions, we have a sample containing k groups of observations, one for each condition. Let $p_1, p_2, \dots, p_k$ be the probabilities that X takes the value 1 under each of the k conditions.

Hypothesis H: $p_1 = p_2 = \dots = p_k$
Alternative Hypothesis K: there is some difference among $p_1, p_2, \dots, p_k$

Data: Form a 2×k table of 2 rows and k columns: each column for one group, the 1st row for value 1 and the 2nd row for value 0 of the variable in the observations:

$$n_1 = n_{11} + n_{12} + \dots + n_{1k}; \qquad n_0 = n_{01} + n_{02} + \dots + n_{0k}$$
$$n(j) = n_{1j} + n_{0j},\; j = 1, 2, \dots, k; \qquad n = n_0 + n_1$$
$$\chi^2 = \sum_{j=1}^{k} \sum_{i=0}^{1} \left(n_{ij} - \frac{n_i\, n(j)}{n}\right)^2 \Big/ \left(\frac{n_i\, n(j)}{n}\right)$$

LEMMA
Suppose that hypothesis H is true. Then the variable $\chi^2$ has a distribution that is approximately the Chi-square distribution with $(k-1)$ degrees of freedom, $\chi^2(k-1)$.

Note: When the number of degrees of freedom tends to infinity, the Chi-square distribution (after standardization) converges to the Normal distribution!

Version A (computer):
Step 1. Taking a variable CS(k-1) that has the Chi-square distribution with (k-1) degrees of freedom, calculate the probability b = P { CS(k-1) > $\chi^2$ }.
Step 2. Compare the probability b with the significance level alpha chosen in advance:
* If b >= alpha, accept hypothesis H and conclude that all the proportions are equal.
* If b < alpha, reject hypothesis H.
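
The Chi-square procedure for several proportions (Version A) might be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the original slides: the function names `chi2_sf` and `chi_square_proportions`, the argument layout and the example counts are all hypothetical, and the probability b is computed with a closed-form recurrence that holds for integer degrees of freedom instead of a statistics package.

```python
import math


def chi2_sf(x, df):
    """Return P{ CS(df) > x } for a Chi-square variable with integer df >= 1.

    Uses Q(df + 2) = Q(df) + (x/2)**(df/2) * exp(-x/2) / Gamma(df/2 + 1),
    started from Q(1) = erfc(sqrt(x/2)) or Q(2) = exp(-x/2).
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, nu = math.exp(-half), 2
    else:
        q, nu = math.erfc(math.sqrt(half)), 1
    while nu < df:
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q


def chi_square_proportions(ones, totals):
    """Return (chi2, b) for hypothesis H: p1 = p2 = ... = pk.

    ones[j]   = n_1j, number of observations with value 1 in group j
    totals[j] = n(j), number of observations in group j
    (hypothetical argument layout chosen for this sketch)
    """
    k = len(ones)
    n = sum(totals)
    n1 = sum(ones)          # row total for value 1
    n0 = n - n1             # row total for value 0
    if k < 2 or n1 == 0 or n0 == 0 or any(nj <= 0 for nj in totals):
        raise ValueError("need >= 2 non-empty groups and both values observed")
    chi2 = 0.0
    for n1j, nj in zip(ones, totals):
        # The two cells of column j: value 1 (row total n1), value 0 (row total n0).
        for observed, row_total in ((n1j, n1), (nj - n1j, n0)):
            expected = row_total * nj / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, chi2_sf(chi2, k - 1)


# Hypothetical data: 30 of 100 and 45 of 100 observations take the value 1.
chi2, b = chi_square_proportions([30, 45], [100, 100])
alpha = 0.05
print(f"chi2 = {chi2:.3f}, b = {b:.4f}")
print("reject H" if b < alpha else "accept H")
```

For k = 2 the statistic computed here equals u² from the two-proportion test, so both procedures lead to the same probability b.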