Descriptive Statistics: Numerical Methods

4.1 Measures of Central Location

❧ The central data point reflects the locations of all the actual data points.
❧ How?
- With one data point, the central location is clearly at the point itself.
- With two data points, the central location should fall in the middle between them (in order to reflect the location of both of them).
- If the third data point appears in the center, the measure of central location will remain in the center, but…
- But if the third data point appears on the left-hand side of the midrange, it should "pull" the central location to the left.

As more and more data points are added, the central location moves (left and right) as required in order to reflect the effects of all the points.

The Arithmetic Mean (average)

• This is the most popular and useful measure of central location.

Mean = (Sum of the measurements) / (Number of measurements)

Sample mean (n = sample size):

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Population mean (N = population size):

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

Example: Find the mean rate of return for a portfolio equally invested in five stocks having the following annual rates of return: 11.2%, 8.07%, 5.55%, 13.7%, 21%.

Solution:

$$\bar{x} = \frac{11.2 + 8.07 + 5.55 + 13.7 + 21}{5} = \frac{59.52}{5} = 11.904\%$$

Geometric mean

• A specialized measure, used to find the average growth rate, or rate of change, of a variable over time.
• Example: The number of students attending the music class last Tuesday was 160. This Tuesday, the number is expected to increase by 15%. How many of them are likely to attend this Tuesday?

The number of students likely to attend this Tuesday = 160 × (100 + 15)% = 160 × (1 + 0.15) = 184 (students), where 160 is the number of students and 15% (or 0.15) is the growth rate (rate of change).

• Formula:
- Step 1: Express the rate of change (R) as (1 + R).
- Step 2: Calculate the geometric mean using the formula:
(i) Simple geometric mean: applied when each rate of change appears once only.

$$R_g = \sqrt[n]{(1+R_1)(1+R_2)\cdots(1+R_n)} - 1$$

Types of Questions

Close-ended Questions
• Select from a short list of defined choices.
Example: Major: business / liberal arts / science / other

Open-ended Questions
• Respondents are free to respond with any value, words, or statement.
Example: What did you like best about this course?

Demographic Questions
• Questions about the respondents' personal characteristics.
Example: Gender: Female / Male

II SAMPLING METHODS

1/ Why Sampling
- Less time-consuming than a census
- Less costly to administer than a census
- It is possible to obtain statistical results of sufficiently high precision based on samples
- Sometimes it is impossible to identify the whole population

POPULATION VS SAMPLE
- Population: all likely voters in the next election. Sample: 1000 voters selected at random for interview.
- Population: all parts produced today. Sample: a few parts selected for destructive testing.
- Population: all sales receipts of a year. Sample: every 100th receipt selected for audit.

2/ Methods of Sampling

Probability Samples:
- Simple Random
- Stratified Random
- Systematic
- Cluster

Simple Random Samples
- Every individual or item from the population has an equal chance of being selected.
- Selection may be with replacement or without replacement.
- Samples can be obtained from a table of random numbers or from computer random number generators.

Stratified Random Samples
- The population is divided into subgroups (called strata) according to some common characteristic.
- A simple random sample is selected from each subgroup.
- Samples from the subgroups are combined into one.
[Figure: population divided into strata, with the combined sample]

Systematic Samples
- Decide on the sample size: n.
- Divide the frame of N individuals into groups of k individuals: k = N/n.
- Randomly select one individual from the 1st group.
- Select every k-th individual thereafter.
Example: N = 64, n = 8, so k = 8.

Cluster Samples
• The population is divided into several "clusters," each representative of the population.
• A simple random sample of clusters is selected.
• All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique.
[Figure: population divided into 16 clusters, with randomly selected clusters forming the sample]

CONVENIENCE SAMPLING: WHAT IS IT?
- Uses an easily available (convenient) group to form a sample.
- Includes voluntary response sampling, self-selected sampling…

III SAMPLING AND NON-SAMPLING ERROR

1/ Sampling Error
- An error is expected to occur when a statement about the population is based on the observations contained in a sample taken from that population.
- The sampling error is the difference (deviation) between the true, unknown value of a population parameter (mean, standard deviation…) and its estimate, the sample statistic.
- Sampling error may be large when an unrepresentative sample is selected.
- The only way to reduce sampling error is to take a larger sample.

2/ Non-Sampling Error
An error that occurs when there are mistakes in the acquisition of the data, or when the sample observations are selected improperly. Three types:
- Selection bias
- Measurement or response bias
- Nonresponse bias

SELECTION BIAS
- Occurs when the way the sample is selected systematically excludes some part of the population of interest.
- Example: In a study of an issue concerning all residents of a city, the method of selecting individuals may exclude the homeless or those without telephones.
- Selection bias also usually occurs when only volunteers or self-selected individuals are used in a study.

MEASUREMENT OR RESPONSE BIAS
- Occurs when the method of observation tends to produce values that systematically differ from the true value in some way.
- This problem might happen because:
  - An improperly calibrated scale is used to weigh items.
  - Questions on a survey are worded in a way that tends to influence the response.
  - Responses are affected by the appearance or behavior of the interviewer, by the group or organization conducting the survey, or by the tendency of people not to be completely honest when asked about sensitive issues (sexual or illegal activities…).

NONRESPONSE BIAS
- Occurs when responses are not obtained from some individuals in the sample.
- As with selection bias, nonresponse bias can distort the results of the study.
- This problem might happen because:
  - An interviewer is unable to contact a person listed in the sample.
  - A sampled person refuses to respond for some reason.

Case study

In summer 1936, the Literary Digest magazine wanted to predict the next US president, just as it had successfully done five times before. The magazine sent out postcards to 10 million Americans and then announced that Alfred M. Landon, then governor of Kansas, would gain 57% of the popular vote and thus demolish Franklin D. Roosevelt, the incumbent president. In fact, Roosevelt won by a landslide never before seen in U.S. history. He garnered not the predicted 43% but 62.5% of the popular vote, and nearly all of the 531 electoral votes. The Digest did not survive the debacle and folded shortly thereafter. What had gone wrong?
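As a rough check on the numerical parts of these slides, the arithmetic mean, the simple geometric mean, and the systematic sampling rule (k = N/n) can be sketched in Python. This is only an illustration: the function names are invented here, the three growth rates are hypothetical values that do not come from the slides, and the sampling function assumes that n divides N exactly.

```python
import random
from math import prod


def arithmetic_mean(values):
    """Sum of the measurements divided by the number of measurements."""
    return sum(values) / len(values)


def geometric_mean_rate(rates):
    """Simple geometric mean R_g of rates of change given as decimals."""
    # Step 1: express each rate of change R as (1 + R).
    growth_factors = [1 + r for r in rates]
    # Step 2: take the n-th root of the product, then subtract 1.
    return prod(growth_factors) ** (1 / len(rates)) - 1


def systematic_sample(frame, n, rng=random):
    """Pick one item at random from the first k items, then every k-th item."""
    k = len(frame) // n        # group size k = N / n (assumes n divides N)
    start = rng.randrange(k)   # random position inside the first group
    return [frame[start + i * k] for i in range(n)]


# Portfolio example from the slides: five annual rates of return, in percent.
returns = [11.2, 8.07, 5.55, 13.7, 21]
print(round(arithmetic_mean(returns), 3))  # 11.904

# Hypothetical growth rates, assumed for illustration only.
rates = [0.10, 0.20, -0.05]
print(round(geometric_mean_rate(rates), 4))

# Systematic sample with N = 64 and n = 8, so k = 8.
frame = list(range(1, 65))
print(systematic_sample(frame, 8))
```

Passing a seeded generator, for example `systematic_sample(frame, 8, random.Random(0))`, makes the draw reproducible. Whatever the starting point, the selected items are always exactly k positions apart.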