Sampling in research

QUY NHON UNIVERSITY
FOREIGN LANGUAGES DEPARTMENT

ASSIGNMENT
SAMPLING IN RESEARCH

Lecturer: LE NHAN THANH, PhD
Student: HUYNH THI AN KHANG
Course: Master of Arts in English Language - K19

Quy Nhon, 7th January 2017

Foreword

In any research conducted, people, places, and things are studied. The opportunity to study the entire population of those people, places, and things is an endeavor that most researchers do not have the time and/or money to undertake. The idea of gathering data from an entire population is one that has been used successfully over the years and is called a census. In past years, the idea of collecting data from the entire population was used by political entities to collect opinions about potential political candidates. Census data collection is still very popular for collecting public opinion for political endeavors. For most researchers, however, collecting data from an entire population is almost impossible because of the number of people, places, or things within the population. Taking a census involves much time and money, resources that most researchers do not have. To collect data on a smaller scale, researchers gather data from a portion, or sample, of the population.

This assignment is a discussion of sampling in research. It covers the general issues of sampling: the purpose of sampling in research, stages in the selection of a sample, types of sampling, and errors of sampling and how to minimize them. For a clear flow of ideas, a few definitions of the terms used are given.

Content

1 An Introduction of Sampling

Many professions (business, government, engineering, science, social research, agriculture, etc.)
seek the broadest possible factual basis for decision-making. In the absence of data on the subject, a decision taken is just like leaping into the dark. One of the aspects of research design often overlooked by researchers doing fieldwork is the issue of sampling. If sampling is found appropriate for a piece of research, the researcher then:

(1) identifies the target population as precisely as possible, and in a way that makes sense in terms of the purpose of the study (Hansen, Hurwitz, & Madow, 1953);
(2) puts together a list of the target population from which the sample will be selected (Hansen et al., 1953);
(3) decides on a sampling technique and selects the sample (Hansen et al., 1953); and
(4) makes an inference about the population (Hansen et al., 1953).

These four steps are interwoven and cannot be considered in isolation from one another. Simple random sampling, systematic sampling, and stratified sampling fall into the category of simple sampling techniques. Complex sampling techniques are used only in the presence of large experimental data sets, when efficiency is required, and when making precise estimates about relatively small groups within large populations (Hansen et al., 1953).

2 Sampling Definitions

2.1 Definition of sample

The sample method involves taking a representative selection of the population and using the data collected as research information. ‘A sample is a proportion or subset of a larger group called a population. A good sample is a miniature version of the population of which it is a part – just like it, only smaller.’ (Fink, 2003, p. 1) A sample is “a smaller (but hopefully representative) collection of units from a population used to determine truths about that population” (Field, 2005). The sample should be “representative in the sense that each sampled unit will represent the characteristics of a known number of units in the population” (Lohr, 1999). When dealing with people, it can be defined as a set of respondents (people) selected from a larger
population for the purpose of a survey. If a researcher desires to obtain information about a population through questioning or testing, he/she has two basic options:

- Every member of the population can be questioned or tested (a census); or
- A sample can be taken; that is, only selected members of the population are questioned or tested.

Contacting, questioning, and obtaining information from a large population, such as all of the households residing in Binh Dinh province, is extremely expensive, difficult, and time consuming. A properly designed probability sample, however, provides a reliable means of inferring information about a population without examining every member or element.

2.2 Definition of sampling

Sampling is the act, process, or technique of selecting a suitable sample, or a representative part of a population, for the purpose of determining parameters or characteristics of the whole population.

2.3 Sampling terminology

A population is a group of experimental data, persons, etc. A population is built up of elementary units, which cannot be further decomposed.
A group of elementary units is called a cluster.
Target population is the set of elements, possibly larger than or different from the population sampled, to which the researcher would like to generalize study findings.
Element is the most basic unit about which survey information is collected (e.g., a person, business, household, car, or dog).
Sampling frame is a list of all the elements or subjects in the population from which the sample is drawn; for example, a telephone directory, a list of five-star hotels, or a list of students.

Figure 1: Population, sample and individual cases. Source: Saunders et al. (2007)

Figure 2: Sampling

3 The Purpose of Sampling

Sampling theory is important to understand when selecting a sampling method because it seeks to “make sampling more efficient” (Cochran, 1953, p. 5). There are six main reasons for sampling instead of doing a census:

- Economy
- Timeliness
- The large size of many populations
- Inaccessibility of some of the population
- Destructiveness of the observation
- Accuracy

3.1 The economic factor

The economic advantage of using a sample in research is obvious: taking a sample requires fewer resources than a census. For example, let us assume that you are one of the very curious students around. You have heard so much about the famous Quy Nhon University (QNU) and, now that you are there, you want to hear from the insiders. You want to know what all the students at QNU think about the quality of teaching they receive. You know that all the students are different, so they are likely to have different perceptions, and you believe you must get all these perceptions. Because you want an in-depth view of every student, you decide to conduct personal interviews with each one of them, and you want the results in only 20 days. Let us assume that, at the time of your research, QNU has only 20,000 students, and that those who are helping you are so fast at the art of interviewing that each of you can interview at least 10 students per day, in addition to your 18 credit hours of course work. You will require 100 research assistants for 20 days, and since you are paying them the minimum wage of $5.00 per hour for ten hours ($50.00 per person per day), you will require $100,000.00 just to complete the interviews; analysis will simply be impossible. You may decide to
hire additional assistants to help with the analysis at another $100,000.00, and so on, assuming you have that amount in your account. As unrealistic as this example is, it does illustrate the very high cost of a census. For the type of information desired, a small, wisely selected sample of QNU students can serve the purpose. You do not even have to hire a single assistant; you can complete the interviews and analysis on your own. Rarely does a circumstance require a census of the population, and even more rarely does one justify the expense.

3.2 The time factor

A sample may provide you with the needed information quickly. For example, suppose you are a doctor and a disease has broken out in a village within your area of jurisdiction. The disease is contagious, it kills within hours, and nobody knows what it is. You are required to conduct quick tests to help save the situation. If you try a census of those affected, they may be long dead when you arrive with your results. In such a case, just a few of those already infected could be used to provide the required information.

3.3 The very large populations

Many populations about which inferences must be made are quite large. For example, consider the population of high school seniors in the United States of America, a group numbering 4,000,000. The responsible government agency has to plan for how they will be absorbed into the different departments and even the private sector. Employers would like to have specific knowledge about the students' plans in order to make compatible plans to absorb them during the coming year. But the sheer size of the population makes it physically impossible to conduct a census. In such a case, selecting a representative sample may be the only way to get the information required from high school seniors.

3.4 The partly accessible populations

Some populations are so difficult to get access to that only a sample can be used, for example people in prison, aeroplanes that have crashed in the deep seas, presidents,
etc. The inaccessibility may be economic or time related. A particular study population, such as the population of planets, may be so costly to reach that only a sample can be used. In other cases, the events in a population may take so long to occur that only sample information can be relied on; examples are natural disasters such as a flood that occurs once every 100 years, or the flood that occurred in Noah’s days, which has never occurred again.

3.5 The destructive nature of the observation

Sometimes the very act of observing the desired characteristic of a unit of the population destroys it for the intended use. Good examples of this occur in quality control. For example, to test the quality of a fuse, that is, to determine whether it is defective, it must be destroyed. To obtain a census of the quality of a lorry load of fuses, you have to destroy all of them. This is contrary to the purpose served by quality-control testing. In this case, only a sample should be used to assess the quality of the fuses.

3.6 The accuracy and sampling

A sample may be more accurate than a census. A sloppily conducted census can provide less reliable information than a carefully obtained sample. The smaller sampling operation lends itself to the application of more rigorous controls, thus ensuring better accuracy. These rigorous controls allow the researcher to reduce non-sampling errors such as interviewer bias and mistakes, nonresponse problems, questionnaire design flaws, and data processing and analysis errors.

4 Sample Size

When selecting a sample from a population, attention also needs to be given to the size of the sample necessary to ensure a high probability of its representativeness. Obviously, the closer the sample size is to the size of the whole population, the greater the probability of it being representative. But with large populations, it is likely to be impractical to sample most of the population, and so reference should be made to statistical calculations of sample sizes
required for differing degrees of confidence in its representativeness. This is illustrated in the following table, which sets out the sample sizes required at two confidence levels for a population of 10,000 units:

Sample Sizes Required for Various Margins of Error, by Confidence Level (Population = 10,000), for simple random sampling selection

                      Margin of Error (+/-)
Confidence Level      1%       2%       3%       4%       5%
95%                   4,899    2,088    1,000    579      375
99%                   6,247    2,938    1,561    942      624

[If the population is relatively small (less than 150) then elaborate sampling procedures may not be appropriate]
Source: Gray et al. (2007, p. 113)

Deciding on a sample size for qualitative inquiry can be even more difficult than for quantitative inquiry because there are no definite rules to be followed. It will depend on what you want to know, the purpose of the inquiry, what is at stake, what will be useful, what will have credibility, and what can be done with the available time and resources. With fixed resources, which is always the case, you can choose to study one specific phenomenon in depth with a smaller sample size, or to use a bigger sample size when seeking breadth. In purposeful sampling, the sample should be judged on the basis of the purpose and rationale of each study and the sampling strategy used to achieve the study's purpose. The validity, meaningfulness, and insights generated from qualitative inquiry have more to do with the information-richness of the cases selected and the observational/analytical capabilities of the researcher than with sample size.

5 Types of Samples

Sampling methodologies are classified under two general categories: probability sampling and nonprobability sampling. In the former, the researcher knows the exact probability of selecting each member of the population; in the latter, the chance of being included in the sample is not known. A probability sample tends to be more difficult and costly to conduct. However, probability samples are the only type of samples where the results can be generalized from the sample to the
population. In addition, probability samples allow the researcher to calculate the precision of the estimates obtained from the sample and to specify the sampling error.

Nonprobability samples, in contrast, do not allow the study's findings to be generalized from the sample to the population. When discussing the results of a nonprobability sample, the researcher must limit his/her findings to the persons or elements sampled. This procedure also does not allow the researcher to calculate sampling statistics that provide information about the precision of the results. The advantage of nonprobability sampling is the ease with which it can be administered. Nonprobability samples tend to be less complicated and less time consuming than probability samples. If the researcher has no intention of generalizing beyond the sample, one of the nonprobability sampling methodologies will provide the desired information.

Figure 3: Overview of sampling techniques. Source: Saunders et al. (2007)

5.1 Probability sampling

Probability sampling provides an advantage because of the researcher’s ability to calculate specific bias and error in regard to the data collected. It is described more clearly as “every subject or unit has an equal chance of being selected” from the population (Fink, 2003, p. 10). There are four types of probability sampling that are standard across disciplines: simple random sampling, systematic random sampling, stratified random sampling, and cluster sampling.

5.1.1 Simple random sampling

Simple random sampling is often called straight random sampling. Which name is used depends not on the discipline but on the researcher or author of the various books and articles referenced. That is to say, the two terms are interchangeable and are not tied to a specific discipline within academia. Simple random sampling requires that each member of the population have an equal chance of being selected (as
is the main goal of probability sampling). Sharon Lohr explains that by using simple random sampling, the researcher “is in effect mixing up the population before grabbing n units” (p. 24). An example of simple random sampling is writing the name of each member of the population on a piece of paper and putting the pieces in a hat. Selecting the sample from the hat is random, and each member of the population has an equal chance of being selected. This procedure is not feasible for a large population, but can be completed easily if the population is very small. Researchers who choose simple random sampling must be cognizant of the numbers that they choose. Researcher bias toward preferred numbers can distort the sample selection and therefore the end results. It is best to ask other researchers to aid in the selection of the numbers to be used in the selection process. It is also important to note that, with simple random sampling, the sample selected may not include all “elements in the population that are of interest” (Fink, 2003, p. 11).

5.1.2 Systematic random sampling

Systematic random sampling is usually preferred over simple random sampling insofar as it is more convenient for the researcher. Systematic random sampling involves the “selection of sampling units in sequences separated on lists by the interval of selection” (Kish, 1965, p. 25). A systematic random sample is obtained by selecting one unit on a random basis and choosing additional elementary units at evenly spaced intervals until the desired number of units is obtained. For example, suppose there are 100 students in your class. You want a sample of 20 from these 100, and you have their names listed on a piece of paper, perhaps in alphabetical order. If you choose to use systematic random sampling, divide 100 by 20 and you will get 5. Randomly select any number between one and five. Suppose the number you have picked is 4; that will be your starting number. So, student number 4 has been selected. From there you will select every 5th name until
you reach the end of the list of one hundred names; with a start at number 4, the last name selected is number 99. You will end up with 20 selected students.

5.1.3 Stratified random sampling

A stratified sample is obtained by independently selecting a separate simple random sample from each population stratum. Stratified random sampling is “one in which the population is divided into subgroups or ‘strata,’ and a random sample is then selected from each subgroup” (Fink, 2003, p. 11). When a few characteristics are known about a population, stratified random sampling is preferable because the population may be arranged in subgroups and then a random sample may be selected from each of these subgroups (Cochran, 1953; Kish, 1965). These subgroups can be defined by characteristics including, but not limited to, gender, race, ethnicity, religion, and age group. There are two types of stratified random sampling: proportionate and disproportionate. The main difference between the two lies in the sampling fraction. Proportionate stratified sampling uses the same fraction for each subgroup, and disproportionate stratified sampling uses different fractions for each subgroup. To choose which is right for a research project, the researcher must be aware of the number of members in each subgroup.

Take, for instance, a population that can be divided into different groups based on some characteristic or variable, such as income or education. Anybody with up to ten years of education will be in group A, between 10 and 20 years in group B, and between 20 and 30 years in group C. These groups are referred to as strata. You can then randomly select from each stratum a given number of units, which may be based on proportion: if group A has 100 persons, group B has 50, and group C has 30, you may decide to take 10% of each. So, you end up with 10 from group A, 5 from group B, and 3 from group C.

A concern when using a stratified random sample is that the researcher must identify and justify the subgroups (Fink, 2003). Stratified random sampling is an attempt to control for sampling error. To control for
sampling error, researchers must not only identify and justify the subgroups but also make sure they are truly representative of the population.

5.1.4 Cluster sampling

A cluster sample is obtained by selecting clusters from the population on the basis of simple random sampling. This sampling method is used when no master list of the population exists but “cluster” lists are obtainable (Lohr, 1999). The sample comprises a census of each random cluster selected. For example, a cluster may be something like a village, a school, or a state. Suppose you decide that all the elementary schools in Binh Dinh province are clusters, and you want 20 schools selected. You can use simple or systematic random sampling to select the schools; every school selected then becomes a sampled cluster. If your interest is to interview teachers on their opinion of some new program which has been introduced, then all the teachers in a selected cluster must be interviewed. Though very economical, cluster sampling is very susceptible to sampling bias. In the above case, for instance, you are likely to get similar responses from teachers in one school because they interact with one another.

A drawback of cluster sampling lies in the precision of the statistics. While cluster sampling is convenient when a master list of the population does not exist, the researcher runs the risk of inaccurate findings. One way to increase the accuracy of results from cluster sampling is to use many clusters when implementing multistage sampling (Fink, 2003). Fink goes on to explain that “as you increase the number of clusters, you can decrease the size of the sample within each” (p. 16).

5.2 Non-probability sampling

The advantage of non-probability sampling is that it is a convenient way for researchers to assemble a sample at little or no cost, and/or for research studies that do not require representativeness of the population. Non-probability sampling is a good method to use when conducting a pilot study, when attempting to question groups who
may have sensitivities to the questions being asked and may not want to answer those questions honestly, and in situations where ethical concerns may keep the researcher from speaking to every member of a specific group (Fink, 2003). In non-probability sampling, subjective judgments play a specific role. Researchers must be careful not to generalize results based on non-probability sampling to the general population. The five types of nonprobability samples are convenience sampling, quota sampling, judgmental sampling, snowball sampling, and self-selection sampling.

5.2.1 Convenience sampling

As the name implies, convenience sampling involves choosing respondents at the convenience of the researcher. Examples of convenience samples include people-in-the-street interviews; the sampling of people to whom the researcher has easy access, such as a class of students; and studies that use people who have volunteered to be questioned as a result of an advertisement or another type of promotion. A drawback of this methodology is the lack of sampling accuracy. Because the probability of inclusion in the sample is unknown for each respondent, none of the reliability or sampling precision statistics can be calculated. Convenience samples, however, are employed by researchers because the time and cost of collecting information can be reduced.

5.2.2 Quota sampling

Quota sampling is often confused with stratified and cluster sampling, two probability sampling methodologies. All of these methodologies sample a population that has been subdivided into classes or categories. The primary difference between the methodologies is that with stratified and cluster sampling the classes are mutually exclusive and are isolated prior to sampling. Thus, the probability of being selected is known, and members of the population selected to be sampled are not arbitrarily disqualified from being included in the results. In quota sampling, the classes cannot be isolated prior to sampling, and respondents
are categorized into the classes as the survey proceeds. As each class fills or reaches its quota, additional respondents that would have fallen into these classes are rejected or excluded from the results. An example of a quota sample would be a survey in which the researcher desires to obtain a certain number of respondents from various income categories. Generally, researchers do not know the incomes of the persons they are sampling until they ask about income. Therefore, the researcher is unable to subdivide the population from which the sample is drawn into mutually exclusive income categories prior to drawing the sample. Bias can be introduced into this type of sample when the respondents who are rejected, because the class to which they belong has reached its quota, differ from those who are used.

5.2.3 Judgmental sampling

In judgmental or purposive sampling, the researcher employs his or her own “expert” judgment about whom to include in the sample frame. Prior knowledge and research skill are used in selecting the respondents or elements to be sampled. An example of this type of sample would be a study of potential users of a new recreational facility that is limited to those persons who live within two miles of the new facility. Expert judgment, based on past experience, indicates that most of the use of this type of facility comes from persons living within two miles. However, by limiting the sample to only this group, usage projections may not be reliable if the usage characteristics of the new facility vary from those previously experienced. As with all nonprobability sampling methods, the degree and direction of error introduced by the researcher cannot be measured, and statistics that measure the precision of the estimates cannot be calculated.

5.2.4 Snowball sampling

In snowball sampling, the researcher begins by identifying someone who meets the criteria for inclusion in his or her study. That person is then asked to point the researcher to others who also meet the criteria. Although this method would hardly lead to representative samples, there are
times when it may be the best method available. Snowball sampling is especially useful when researchers are trying to reach populations that are inaccessible or hard to find. For instance, if we are studying the homeless, we are not likely to be able to find good lists of homeless people within a specific geographical area. However, if we go to that area and identify one or two, we may find that they know very well who the other homeless people in their vicinity are and how we can find them.

5.2.5 Self-selection sampling

Self-selection sampling occurs when we allow each case, usually individuals, to identify their desire to take part in the research. We therefore publicize our need for cases, either by advertising through appropriate media or by asking them to take part, and then collect data from those who respond. For example, survey researchers may put a questionnaire online and subsequently invite anyone within a particular organization to take part. Scientists who conduct experiments using human subjects may advertise the need for volunteers to take part in drug trials or research on physical activity. The key component is that research subjects (or organizations) volunteer to take part in the research of their own accord. They are not approached by the researcher directly.

6 Sample Errors

As with all research methods, sampling provides some room for error on the part of the researcher. Being aware of those possible errors is essential in the selection of the sampling method used as well as in calculations on the data collected. Simply being aware of possible errors is often not enough. Arlene Fink believes that no matter how thorough and proficient the researcher is, “sampling bias or error is inevitable” (p. 25). Sampling error may be defined as “the error that results from taking one sample instead of examining the whole population” (Lohr, 1999). Lohr simply describes several types of sample errors as “undercoverage, nonresponse, and sloppiness in data collection”. Nonresponse is a
non-sampling error that arises when some members of the population who are eligible to be sampled are unwilling to participate or do not answer all questions on the survey(s) (Cochran, 1953; Fink, 2003; Lohr, 1999). Lohr indicates that “the main problem caused by nonresponse is potential bias of population estimates”. Non-sampling error “occurs because of imprecision in the definition of the target and study population and errors in survey design and measurement” (Fink, 2003). Non-sampling errors include changes due to historical circumstances, neglect of definitions and of inclusion and exclusion criteria, and bias in the instrument or survey process. Researchers should keep in mind that an increase in sample size and an increased homogeneity of the elements being sampled allow for the reduction of sampling error. However, Lohr (1999) warns that “increasing the sample size without targeting nonresponse does nothing to reduce nonresponse bias; a larger sample size merely provides more observations from the class of persons that would respond to the survey”.

Conclusion

In conclusion, it can be said that using a sample in research saves mainly money and time. If a suitable sampling strategy is used, an appropriate sample size is selected, and the necessary precautions are taken to reduce sampling and measurement errors, then a sample should yield valid and reliable information. Many sampling methods are available, and the researcher's goals determine which method is right for the study to be conducted. The main consideration in choosing a sampling method is whether or not the researcher wants to generalize the findings from the sample to the whole of the population being studied. Being aware of possible errors due to the sampling method chosen is also very important, because reporting possible errors within the results section allows the study to be
regarded as valid. Details on sampling can be obtained from the references included below and from many other books on statistics or qualitative research, which can be found in libraries.

References

Cochran, W. G. (1953). Sampling techniques. New York: John Wiley & Sons, Inc.
Field, A. (2005). Discovering statistics using SPSS. London: Sage Publications Ltd.
Fink, A. (2003). How to sample in surveys (2nd ed.). London: Sage Publications, Inc.
Gray, P. S., Williamson, J. B., Karp, D. A., & Dalphin, J. R. D. (2007). The research imagination: An introduction to qualitative and quantitative methods. Cambridge: Cambridge University Press.
Hansen, M. H., Hurwitz, W. N., & Madow, W. G. (1953). Sample survey methods and theory (Vol. I). New York: John Wiley & Sons, Inc.
Kish, L. (1965). Survey sampling. New York: John Wiley & Sons, Inc.
Lohr, S. L. (1999). Sampling: Design and analysis (2nd ed.). Boston: Brooks/Cole.
Saunders, M., Lewis, P., & Thornhill, A. (2007). Research methods for business students. Harlow, England: Pearson Education Limited.
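
The two worked examples in Sections 5.1.2 and 5.1.3 (a systematic sample of 20 from a class of 100, and a 10% proportionate stratified sample from groups of 100, 50, and 30 persons) can be sketched in Python. This is only an illustrative sketch and does not come from any of the cited sources: the function names `systematic_sample` and `proportionate_stratified_sample`, the numbering of students from 1 to 100, and the placeholder member labels are assumptions made for the example.

```python
import random


def systematic_sample(units, n, rng):
    """Systematic random sample: random start, then every k-th unit.

    Hypothetical helper for illustration; `units` is an ordered list
    (for example, an alphabetical class list).
    """
    k = len(units) // n            # selection interval, e.g. 100 // 20 = 5
    start = rng.randint(1, k)      # 1-based random start between 1 and k
    picked = [units[i - 1] for i in range(start, len(units) + 1, k)]
    return picked[:n]              # trim if the list length is not a multiple of n


def proportionate_stratified_sample(strata, fraction, rng):
    """Draw the same fraction from every stratum by simple random sampling.

    Hypothetical helper; `strata` maps a stratum name to a list of members.
    """
    return {
        name: rng.sample(members, round(len(members) * fraction))
        for name, members in strata.items()
    }


rng = random.Random(2017)          # fixed seed so the draw can be repeated

# Section 5.1.2: 20 students out of a class list numbered 1..100.
students = list(range(1, 101))
chosen = systematic_sample(students, 20, rng)
print(len(chosen), "students selected, starting at number", chosen[0])

# Section 5.1.3: 10% of groups A (100 persons), B (50) and C (30).
strata = {
    "A": [f"A{i}" for i in range(100)],
    "B": [f"B{i}" for i in range(50)],
    "C": [f"C{i}" for i in range(30)],
}
sample = proportionate_stratified_sample(strata, 0.10, rng)
print({name: len(members) for name, members in sample.items()})
```

Passing the random generator in as an argument keeps the selection repeatable. Rounding the stratum sizes is a simplification that works here because 10% of each group is a whole number; other fractions would need an explicit rule for leftovers.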

Date posted: 17/04/2017, 22:24
