Chapter 17 Alternative Approaches to Inference
Copyright © 2011 Pearson Education, Inc

17.1 A Confidence Interval for the Median
An auto insurance company is thinking about compensating agents by comparing the number of claims they produce to a standard
Annual claims average near $3,200 with a median claim of $2,000
Claims are highly skewed
Use nonparametric methods that don't rely on a normal sampling distribution

Distribution of Sample of Claims (n = 42)
For this sample, the average claim is $3,632 with s = $4,254
The median claim is $2,456

Is Sample Mean Compatible with µ = $3,200?
To answer this question, construct a 95% confidence interval for µ
This interval is
\[ \$3{,}632 \pm 2.02 \times \frac{\$4{,}254}{\sqrt{42}} = [\$2{,}306 \text{ to } \$4{,}958] \]
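As a quick check on the arithmetic above, the following sketch recomputes the 95% t-interval from the sample summaries quoted on the slides (n = 42, mean $3,632, s = $4,254). The multiplier 2.02 is taken from the slide as given rather than computed, and the function name `t_interval` is only an illustrative choice.

```python
import math

def t_interval(mean, sd, n, t_mult):
    """Return (lower, upper) for mean ± t_mult * sd / sqrt(n)."""
    margin = t_mult * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Summary statistics quoted on the slides; 2.02 is the slide's t multiplier
# for 95% coverage (taken as given here, not recomputed).
lower, upper = t_interval(mean=3632, sd=4254, n=42, t_mult=2.02)
print(f"95% t-interval: ${lower:,.0f} to ${upper:,.0f}")
print("Contains 3,200:", lower <= 3200 <= upper)
```

Rounded to whole dollars, this should reproduce the slide's limits of $2,306 and $4,958.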
The national average of $3,200 lies within the 95% confidence t-interval for the mean
BUT the sample does not satisfy the sample size condition necessary to use the t-interval
The t-interval is unreliable, with unknown coverage, when the conditions are not met

Nonparametric Statistics
Avoid making assumptions about the shape of the population
Often rely on sorting the data
Suited to parameters such as the population median θ (theta)
For the claims data that are highly skewed to the right, θ < µ
If the population distribution is symmetric, then θ = µ

Nonparametric Confidence Interval
The first step in finding a confidence interval for θ is to sort the observed data in ascending order (the sorted values are known as order statistics)
Order statistics are denoted as X(1) < X(2) < … < X(n)
If the data are an SRS from a population with median θ, then we know:
The probability that a random draw from the population is less than or equal to θ is ½
The observations in the random sample are independent

4M Example 17.1: EXECUTIVE SALARIES
Motivation
Fees earned by an executive placement service are 5% of the starting annual total compensation package
How much can the firm expect to earn by placing a current client as a CEO in the telecom industry?
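Returning briefly to the confidence interval for the median: the two SRS facts listed under Nonparametric Confidence Interval imply that the number of observations at or below θ follows a Binomial(n, ½) distribution, so the coverage of an interval between two order statistics can be computed exactly. The sketch below does this with the standard library only. The function name `order_stat_coverage` and the ranks tried are illustrative assumptions, not something stated on the slides.

```python
from math import comb

def order_stat_coverage(n, i, j):
    """Coverage of [X(i), X(j)] as an interval for the median θ.

    Assumes an SRS from a continuous population, so the count Y of
    observations at or below θ is Binomial(n, 1/2). The interval
    covers θ exactly when i <= Y <= j - 1.
    """
    return sum(comb(n, y) for y in range(i, j)) / 2 ** n

n = 42  # size of the claims sample
for i, j in [(16, 27), (15, 28), (14, 27)]:
    print(f"X({i}) to X({j}): coverage {order_stat_coverage(n, i, j):.3f}")
```

Because coverage moves in discrete jumps, no pair of ranks gives exactly 0.95: for n = 42 the two symmetric choices tried here give roughly 0.912 and 0.956. The asymmetric pair gives roughly 0.946, the same level as the 94.6% quoted for the median claim interval elsewhere in these slides, although the slides as extracted do not say which ranks were used.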
4M Example 17.1: EXECUTIVE SALARIES
Method
Obtain data (n = 23 CEOs from telecom industry)
The distribution of total compensation for CEOs in the telecom industry is not normal
Construct a nonparametric prediction interval for the client's anticipated total compensation package

Mechanics
Sort the data
The interval x(3) to x(21) is $743,801 to $29,863,393 and is a 75% prediction interval

Message
The compensation package of three out of four placements in this industry is predicted to be in the range from about $750,000 to $30,000,000
The implied fee ranges from $37,500 to $1,500,000

17.4 Proportions Based on Small Samples
Wilson's Interval for a Proportion
An adjustment that moves the sampling distribution of \(\hat{p}\) closer to ½ and away from the troublesome boundaries at 0 and 1
Add four artificial cases (2 successes and 2 failures) to create an adjusted proportion \(\tilde{p}\)
Add 2 successes and 2 failures to the data and define
\[ \tilde{p} = \frac{\#\text{ of successes} + 2}{n + 4}, \qquad \tilde{n} = n + 4 \]
The z-interval is
\[ \tilde{p} \pm z_{\alpha/2} \sqrt{\frac{\tilde{p}(1 - \tilde{p})}{\tilde{n}}} \]

4M Example 17.2: DRUG TESTING
Motivation
A company is developing a drug to prolong the time before a relapse of cancer
The drug must cut the rate of relapse in half
To test this drug, the company first needs to know the current time to relapse

Method
Data are collected for 19 patients who were observed for 24 months
Doctors found a relapse in 9 of the 19 patients
While the SRS condition is satisfied, the sample size condition is not
Use Wilson's interval for a proportion

Mechanics
By adding two successes and two failures, we have
\[ \tilde{p} = (9 + 2)/(19 + 4) \approx 0.478 \]
The interval is
\[ 0.478 \pm 1.96 \sqrt{0.478(1 - 0.478)/(19 + 4)} = [0.27 \text{ to } 0.68] \]

Message
We are 95% confident that the proportion of patients with this cancer that relapse within 24 months is between 27% and 68%
In order to cut this proportion in half, the drug will have to reduce this rate to somewhere between 13% and 34%

Best Practices
Check the assumptions carefully when dealing with small samples
Consider a nonparametric alternative if you suspect nonnormal data
Use the adjustment procedure for proportions from small samples
Verify that your data are an SRS

Pitfalls
Avoid assuming that populations are normally distributed in order to use a t-interval for the mean
Do not use confidence intervals based on normality just because they are narrower than a nonparametric interval
Do not think that you can prove normality using a normal quantile plot
Do not rely on software to know which procedure to use
Do not use a confidence interval when you need a prediction interval

17.1 A Confidence Interval for the Median
... interval for θ whose coverage is exactly 0.95
The 94.6% confidence interval for the median claim is [$1,217 to $3,168]

17.3 Prediction Intervals
For a Normal Population
The 100(1 − α)% prediction interval for an independent draw from a normal population is
\[ \bar{x} \pm t_{\alpha/2,\,n-1}\, s \sqrt{1 + \frac{1}{n}} \]
where \(\bar{x}\) and s estimate µ and σ ...
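The two kinds of prediction interval in this chapter can be sketched in a few lines. In the sketch below the t multiplier is passed in rather than looked up, and the claims summaries (n = 42, mean $3,632, s = $4,254, multiplier 2.02) are reused purely to illustrate the arithmetic; the claims are highly skewed, so the normality assumption behind that interval does not hold for them. The coverage rule (j − i)/(n + 1) for an interval between order statistics is the standard result for continuous data and is an assumption here, since it is not spelled out on the slides as extracted. Both function names are illustrative.

```python
import math

def normal_prediction_interval(xbar, s, n, t_mult):
    """Return (lower, upper) for xbar ± t_mult * s * sqrt(1 + 1/n).

    t_mult stands for the t quantile t_{alpha/2, n-1}; it is supplied
    by the caller (for example from a t table) rather than computed.
    """
    margin = t_mult * s * math.sqrt(1 + 1 / n)
    return xbar - margin, xbar + margin

def nonparametric_pi_coverage(n, i, j):
    """Coverage of [x(i), x(j)] as a prediction interval for a new draw.

    Assumes continuous i.i.d. data, so the new observation is equally
    likely to fall in any of the n + 1 gaps between order statistics.
    """
    return (j - i) / (n + 1)

# Illustration only: claims summaries from Section 17.1, with 2.02 reused
# as the 95% multiplier from the t-interval slide.
lower, upper = normal_prediction_interval(xbar=3632, s=4254, n=42, t_mult=2.02)
print(f"Normal-based 95% prediction interval: ${lower:,.0f} to ${upper:,.0f}")

# Example 17.1: n = 23 CEOs, interval from x(3) to x(21).
print("Coverage of x(3) to x(21):", nonparametric_pi_coverage(23, 3, 21))
```

With the claims inputs the lower limit comes out negative, which is impossible for a claim amount and is a symptom of applying a normal-based interval to skewed data. The second call gives 18/24 = 0.75, which agrees with the 75% prediction interval reported in Example 17.1.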