CNM HANDBOOK FOR EDUCATIONAL RESEARCH

Central New Mexico Community College
Ursula Waln, Director of Student Learning Assessment, and Fred Ream, Mathematics Instructor
2014

Table of Contents

Purpose of this Handbook
IRB and Educational Research ... 1
  How the IRB Came to Be
  The Role of the IRB
  Research Involving Human Subjects
  IRB Exemptions
  IRB Criteria
  Informed Consent ... 5
Research Design
  Quantitative and Qualitative Options ... 6
  Sampling Methods
Statistics ... 10
  Data Types ... 10
  Analytic Approaches ... 10
  Figure 1: Statistical Test Decision-Making Flowchart ... 11
  Refresher of Basic Concepts ... 12
  Figure 2: Normal Distribution ... 14
  Figure 3: Illustration of Outliers ... 15
  Analytic Software Programs ... 16
  Preliminary Data Analysis Tips ... 16
Glossary ... 17
References ... 19
Appendix A: Recommendations from CNM's IRB ... 20
Appendix B: Preview of CNM IRB Application Form ... 21

PURPOSE OF THIS HANDBOOK

This handbook was developed to foster depth of intentionality and critical analysis in CNM student outcomes assessment. It builds on the information presented in the CNM Handbook for Outcomes Assessment and is based on the premises that (1) student outcomes assessment is a form of educational research, and (2) considering one's assessment efforts as research may suggest more creative, purposeful, and meaningful approaches. The following pages offer information and tips for those who wish to develop a greater degree of sophistication in their assessment approach and/or have their assessment efforts render more useful information.

IRB AND EDUCATIONAL RESEARCH

If you are interested in conducting research that is not purely operational in nature, is experimental, is for a dissertation or master's thesis, or is going to be used outside of CNM in any way, your project requires prior approval by the CNM Institutional Review Board (IRB). If you think there is any possibility at all that you will ever want to publish your findings, use the data for other publishable research, or do anything else with your findings outside of CNM,
please get IRB approval before beginning any data collection. The question of whether your particular study must have IRB approval hinges on whether or not it meets the definition of 'research involving human subjects.' However, for any study that is not purely operational and internal to CNM, the IRB, not the researcher, must make this determination, based on an application. If you have any doubt at all, submit an application. The process itself can help you organize your research plans, so the time won't be wasted, and you'll know you covered your bases.

A requirement for IRB approval need not deter faculty from pursuing educational research projects. Indeed, structured research into student learning not only has great potential for informing internal improvement efforts but can also produce insights with the potential to impact the broader profession and/or instructional community. The application form for submitting a project for IRB approval is shown in Appendix B and can be accessed at http://www.cnm.edu/depts/planning/instres/irb. If you are absolutely certain your project does not require IRB approval and you already know all you need to know about the IRB, you may want to skip ahead to the research design section, page 11.

How the IRB Came to Be

The existence of IRBs for certification of research conducted at colleges and universities is required under the National Research Act of 1974. Although certainly not the first example of a societal effort to regulate experimentation involving humans, this federal statute came about in response to medical research that violated ethical principles deemed fundamental to American values and basic human rights. Salient examples include the medical experiments conducted in Nazi concentration camps between 1939 and 1945 (which led to the establishment of the first international code of research ethics, the Nuremberg Code) and the U.S. Public Health Service's Tuskegee Syphilis Study of 1932-1972, in which 400
African American men known to have syphilis had information and treatment withheld so that researchers could observe the progression of the disease (Schneider). In 1974, Congress established the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research and tasked it with identifying ethical principles and developing guidelines for the conduct of research involving human subjects. In 1979, the Commission published the Belmont Report, which summarized three basic principles:

Respect for persons
- Individuals should be treated as autonomous agents.
- Persons with diminished autonomy are entitled to additional protections.
Beneficence
- Do no harm.
- Maximize possible benefits and minimize possible harms.
Justice
- Individuals and groups must be treated fairly and equitably in terms of bearing the burdens and receiving the benefits of research.
(NIH, p. 21)

The Belmont Report principles and guidelines were codified in Title 45 of the Code of Federal Regulations, Part 46, and oversight was given to the U.S. Department of Health, Education and Welfare (which later split into the Department of Health and Human Services and the Department of Education). In 1991, the U.S. Department of Education became one of fifteen federal departments and agencies to codify the "Common Rule" in separate regulations, including in its 34 CFR Part 97 language identical to 45 CFR 46, Subpart A (NIH). Today, seventeen federal agencies fall under the Common Rule, and oversight for all is provided by the Office for Human Research Protections, a division of the U.S. Department of Health and Human Services.

The Role of the IRB

At the institutional level, "The IRB is an administrative body established to protect the rights and welfare of human research subjects recruited to participate in research activities conducted under the auspices of the institution with which it is affiliated" (IRB Guidebook, p. 1). IRBs determine "the acceptability of proposed research in terms of
institutional commitments and regulations, applicable law, and standards of professional conduct and practice" (45 CFR 46.107). The major roles of IRBs in the oversight of research are:

- Evaluating the respect for persons, beneficence, and justice of all research activities proposed under the auspices of the institution and providing approval or disapproval accordingly
- Ensuring that the process proposed to collect informed consent meets regulatory requirements
- Overseeing the progress and protocols of ongoing research studies
(NIH, p. 84)

Research Involving Human Subjects

The CFR provides the following definitions:

…(d) Research means a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge. Activities which meet this definition constitute research for purposes of this policy, whether or not they are conducted or supported under a program which is considered research for other purposes. For example, some demonstration and service programs may include research activities.

(e) Research subject to regulation, and similar terms, are intended to encompass those research activities for which a federal department or agency has specific responsibility for regulating as a research activity (for example, Investigational New Drug requirements administered by the Food and Drug Administration). It does not include research activities which are incidentally regulated by a federal department or agency solely as part of the department's or agency's broader responsibility to regulate certain types of activities whether research or non-research in nature (for example, Wage and Hour requirements administered by the Department of Labor).

(f) Human subject means a living individual about whom an investigator (whether professional or student) conducting research obtains (1) data through intervention or interaction with the individual, or (2) identifiable private information. Intervention includes both physical
procedures by which data are gathered (for example, venipuncture) and manipulations of the subject or the subject's environment that are performed for research purposes. Interaction includes communication or interpersonal contact between investigator and subject. Private information includes information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and information which has been provided for specific purposes by an individual and which the individual can reasonably expect will not be made public (for example, a medical record). Private information must be individually identifiable (i.e., the identity of the subject is or may readily be ascertained by the investigator or associated with the information) in order for obtaining the information to constitute research involving human subjects… (45 CFR 46.102)

IRB Exemptions

The CNM IRB, with endorsement from the VPAA, Deans' Council, Chairs' Council, and Faculty Senate, documented its Recommendations for Research in 2012, stating that "Some types of research are exempt from IRB review: student projects for class, data collection (e.g., surveys) by faculty or staff, assessment data collection, student course evaluations, and research by the college that is not designed for publication" (Appendix A). Whether or not an investigation is subject to IRB approval, the CNM IRB (on its homepage) recommends following the basic rules listed in its document Recommendations for Research. In the same vein, it is advisable for anyone planning educational research at CNM to be familiar with and follow the IRB criteria and informed consent guidelines in the following sections of this handbook, whether or not the research requires IRB approval. Forming the basis for the CNM statements of exemption are the following excerpts from 45 CFR Part 46, and the corresponding educational regulations in 34 CFR Part 97 (formatting enhanced for readability):

…(b) Unless
otherwise required by department or agency heads, research activities in which the only involvement of human subjects will be in one or more of the following categories are exempt from this policy:

(1) Research conducted in established or commonly accepted educational settings, involving normal educational practices, such as (i) research on regular and special education instructional strategies, or (ii) research on the effectiveness of or the comparison among instructional techniques, curricula, or classroom management methods.

(2) Research involving the use of educational tests (cognitive, diagnostic, aptitude, achievement), survey procedures, interview procedures or observation of public behavior, unless: (i) information obtained is recorded in such a manner that human subjects can be identified, directly or through identifiers linked to the subjects; and (ii) any disclosure of the human subjects' responses outside the research could reasonably place the subjects at risk of criminal or civil liability or be damaging to the subjects' financial standing, employability, or reputation.

(3) Research involving the use of educational tests (cognitive, diagnostic, aptitude, achievement), survey procedures, interview procedures, or observation of public behavior that is not exempt under paragraph (b)(2) of this section, if: (i) the human subjects are elected or appointed public officials or candidates for public office; or (ii) federal statute(s) require(s) without exception that the confidentiality of the personally identifiable information will be maintained throughout the research and thereafter.

(4) Research involving the collection or study of existing data, documents, records, pathological specimens, or diagnostic specimens, if these sources are publicly available or if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects…

For the CNM IRB Recommendations for Research,
see Appendix A or go to https://www.cnm.edu/depts/planning/instres/irb/documents/RECOMMENDATIONS-FOR-RESEARCH.pdf.

IRB Criteria

The following CFR excerpts hit the highlights of the federal criteria for IRB approval:

- Risks to human subjects are minimized…
- Risks to human subjects are reasonable in relation to anticipated benefits, if any, to human subjects and the importance of the knowledge that may reasonably be expected to result…
- Selection of human subjects is equitable…
- Informed consent will be sought from each prospective research participant or the prospective research participant's legally authorized representative, in accordance with and to the extent required by §46.116.
- Informed consent will be appropriately documented, in accordance with and to the extent required by §46.117.
- When appropriate, the research plan makes adequate provision for monitoring the data collected to ensure the safety of subjects.
- When appropriate, there are adequate provisions to protect the privacy of subjects and to maintain the confidentiality of data.
(45 CFR 46.111)

The federal regulations can be accessed in their entirety at http://www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html or at http://www2.ed.gov/policy/fund/reg/humansub/part97.html. In addition, an Institutional Review Board Guidebook, published by the HHS, is available at http://www.hhs.gov/ohrp/archive/irb/irb_guidebook.htm. And the CNM IRB web site and links to documents can be found at http://www.cnm.edu/depts/planning/instres/irb.

Informed Consent

The principle of respect for persons requires that investigators conducting research on human subjects obtain informed consent in the form of "[a] legally-effective, voluntary agreement that is given by a prospective research participant following comprehension and consideration of all relevant information pertinent to the decision to participate in a study" (NIH, p. 117). Autonomous adults can provide their own informed consent. (The age of majority in New Mexico is eighteen.)
However, research involving children or other vulnerable populations requires informed consent not only from the subject (whenever the individual is reasonably capable of providing or denying consent), but also from the subject's legally authorized representative. In some cases (e.g., specific cultural settings), community or family consent may also be required.

Informed consent consists of voluntariness, comprehension, and disclosure:

Voluntariness: Consent must be given free of coercion, and the perceived value of any inducements must be small enough to avoid undue influence on the decision to participate. Care must be taken to ensure that subjects are not manipulated to ignore the risks of participation, either through fear of repercussions for refusal or through motivation by excessive or inappropriate rewards. Offering bonus points toward a final course grade would be an example of an inappropriate incentive. However, in some cases, minimal compensation for participants' time and effort may be appropriate.

Comprehension: To provide informed consent, subjects must first understand the information presented to them sufficiently to make a reasoned, informed decision. Subjects must therefore have the cognitive capacity to understand (or else be represented by legally authorized individuals), and the information must be presented using language and methods of delivery appropriate to the people making the decision.

Disclosure: Information that must be disclosed for participants to be able to provide informed consent includes the purpose, risks, and benefits of the research; alternatives to the research protocol; the extent to which confidentiality can be protected; compensation in case of injury; contact information for questions or concerns; and conditions of participation, including the right to refuse or withdraw participation without penalty. (Compensation may be terminated proportionately upon withdrawal, but any offer of compensation for participation prior to the time of
withdrawal must be honored. Telling prospective subjects they must complete the study or forfeit all compensation would be an example of a penalty.)

Waivers regarding the disclosure of specific information may be considered by the IRB when the research poses no more than minimal risk to the participants, withholding the information will not adversely affect the rights or welfare of the participants, and the research could not practicably be carried out otherwise. When information is withheld, it should be disclosed following the conclusion of the research unless doing so might cause harm. "No informed consent, whether oral or written, may include any exculpatory language through which the subject or the representative is made to waive or appear to waive any of the subject's legal rights, or releases or appears to release the investigator, the sponsor, the institution or its agents from liability for negligence" (45 CFR 46.116).

Finally, informed consent should be viewed as an ongoing process, not just a one-time event. Participants should be kept informed and given opportunities to ask questions as the research is carried out.

RESEARCH DESIGN

For the purposes of CNM student learning assessment, the primary consideration in designing your research should be what you want to know related to the development of your program competencies (student learning outcome statements). Using that as your starting point, you can employ 'reverse design' principles to map out the steps you will take toward gaining the insights you seek. Depending on the type of information you want to collect and what you want to do with it, your research design may or may not include the following:

- A written statement of the issue to be investigated
- Identification of the population to be studied
- A description of relevant background information
- A literature review
- A written rationale
- A hypothesis and/or null hypothesis (or just a statement of what you expect to find)
- Identification of methodology:
  - Sampling
methods
  - Instruments
  - Procedures for administration
  - Data analysis methods
- A timeline

Quantitative and Qualitative Options

Research approaches are often described as quantitative or qualitative, though the two descriptors more accurately represent ends of a continuum than distinct categories. Both extremes seek to answer questions using systematic procedures, collecting evidence, and examining findings. However, where a research project falls on the continuum is determined by its objective(s), the types of research questions posed, the methodologies employed, and the types of data produced. Expectations for researchers differ as well.

Quantitative research tends to have as an objective obtaining generalizable findings. Questions posed tend to focus on associations between variables, methodologies employed tend to involve studying specific variables, and the data produced tend to be numerical. Quantitative investigators are typically expected to adhere to scientifically validated research models and techniques and to remain objective.

Qualitative research, on the other hand, tends to have as an objective exploring the dynamics of a specific context. Questions posed tend to focus on understanding factors that influence outcomes, methodologies employed tend to be open-ended and exploratory, and the data produced tend to be textual (recorded observations, participant comments, etc.).
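This contrast also shows up in how data are analyzed: a quantitative design typically applies an inferential test to generalize from a sample, while descriptive summaries simply characterize the groups at hand. The sketch below illustrates both steps in Python using only the standard library. All scores are invented for illustration, and the inferential step is a simplified two-sample z-test (a real analysis of samples this small would normally use a t-test, e.g., scipy.stats.ttest_ind):

```python
import math
from statistics import mean, stdev, NormalDist

# Hypothetical exam scores for a control section and a section that
# received an instructional intervention (all numbers are made up).
control = [71, 74, 68, 77, 80, 73, 69, 75, 72, 76]
treatment = [78, 82, 75, 85, 79, 88, 81, 77, 84, 80]

# Descriptive statistics: summarize each group.
for name, scores in [("control", control), ("treatment", treatment)]:
    print(f"{name}: mean={mean(scores):.1f}, sd={stdev(scores):.1f}")

# Simplified inferential step: two-sample z-test on the difference in
# means, using the standard normal distribution for the p-value.
se = math.sqrt(stdev(control) ** 2 / len(control)
               + stdev(treatment) ** 2 / len(treatment))
z = (mean(treatment) - mean(control)) / se
p = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

With these made-up numbers the difference in means is large relative to its standard error, so the p-value falls well below a 0.05 significance level; the Statistics section later in the handbook explains how such a result is interpreted.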
Qualitative investigators are typically considered free to adapt their research models to fit the changing circumstances of their investigations and to render subjective interpretations.

Randomized controlled trials represent the quantitative end of the spectrum. In a nutshell, such investigations provide an intervention to a randomly selected experimental group and compare the results to those of a control group (a group that does not receive the intervention). Here are some grouping models commonly used in randomized controlled trials:

Parallel grouping: Each student is randomly assigned to an experimental or control group; the experimental group receives the intervention and the control group does not, but may receive a dummy/placebo intervention to minimize the potential for participant expectations to influence the outcomes. (In educational research, a placebo may be, for example, an instructional method that has already been shown not to produce the effect being researched.)

Cluster grouping: Pre-existing groups of students (such as course sections) are randomly selected to receive or not receive the intervention.

Crossover grouping: Each student receives or does not receive the intervention in a random sequence. A common crossover approach is to switch the treatment and non-treatment groups at the midpoint of the study. Crossover grouping minimizes the potential for group differences to influence the findings. In educational research, the effectiveness of a crossover approach depends in part upon whether the intervention being studied is influenced by timing.

Factorial grouping: Used when more than one intervention is tested simultaneously to determine whether there is a compound effect; each student is randomly assigned to a group that receives a particular combination of interventions (AA, AB, AC, and BC, where A = no intervention, B = intervention 1, and C = intervention 2).

A Source of Some Confusion

The word quantitative may sometimes seem a little slippery
because people use it differently depending on whether they are describing data, measurement methods, or research design.

Quantitative data is information derived from direct numerical measurement. Height, weight, area, pounds per square inch, counts of students, proportions of responses, and GPA are examples. Representing something as a number when it is not directly, numerically measurable (such as pain rated on a scale of 1 to 10 or level of effort on a scale of 1 to 4) does not make it quantitative data. It is a common error to refer to qualitative data that has been numerically represented as quantitative.

Readers will find disagreement in the literature about the classification of measurement methods as quantitative versus qualitative. Applying the same distinction used for classifying data, quantitative measurement methods are those that involve direct, numerical measures. However, common usage lumps together all measurement methods that produce numerical data (especially when conducted in systematic ways) as quantitative measurement methods. As a result, Likert-scale questionnaire items and rubric-based evaluations are commonly referred to as quantitative methods even though the data they produce is qualitative data.

At the research design level, quantitative essentially means the researchers intend to use inferential statistics to make generalizations. For example, let's say a foreign language instructor randomly selects half of her students to participate in an immersion program during the first half of the semester only and then has the other half of the students participate in the immersion program during the second half of the semester. The timing of the immersion program could turn out to be a significant factor. Students who develop basic vocabulary and syntactic awareness through classroom instruction during the first eight weeks may have a decoding advantage when they enter the immersion program. On the other hand, students who experience the immersion program during
the first eight weeks may develop greater awareness of cadence, pronunciation, and speech patterns and, applying these, advance more effectively through the conventional instruction during the second eight weeks. Anticipating these possibilities, the instructor may decide to include in the design a third group, comprised of students who do not participate in the immersion program at all.

A subset of randomized controlled trials, called randomized double-blind placebo-controlled crossover studies, is generally considered the 'gold standard' in quantitative scientific research (particularly pharmaceutical research). And while it need not be the goal of every educational researcher to conduct such a study, an understanding of the rationale supporting the design will help inform design decisions. Double-blind means neither the researchers in contact with the subjects nor the subjects know who is receiving the 'treatment' and who is not. This minimizes the potential for either the researchers or the subjects to influence outcomes based on their expectations. In educational research, conducting a double-blind study would require that the educational intervention be provided by someone other than the investigator. For example, let's say a researcher wants to find out whether students approached outside of class by a peer and asked to explain core course concepts perform significantly better on subsequent exams than students not questioned in this way. To make this a double-blind study, the students must have no inkling that being questioned by a peer is an intervention, and the researcher must remain ignorant regarding which students are questioned until after the data collection is complete.

At the other end of the spectrum, qualitative research usually seeks to answer a question related to the perspectives of an involved population. Qualitative research can be particularly useful in studying attitudes, values, opinions, perceptions, beliefs, behaviors, emotions, relationships, social
situations, experiences, and other aspects of individual, community, or cultural contexts. Although insights gained may have the potential for extension to similar populations or situations, qualitative research usually focuses more on delving into the factors that influence outcomes within the specific study population than on being able to generalize the findings to other populations. Qualitative research explores and interprets how people experience the complex reality of given issues and/or situations.

For example, imagine CNM researchers want to better understand why area employers within a specific sector do not hire as many qualified CNM alumni as might be expected on the basis of job openings. The objective of their investigation will be to identify factors influencing the decisions of the local employers. Their findings might ultimately reveal something about employer perceptions that could be useful, by extension, to other CNM programs or other colleges with similar programs. However, their primary goal is not finding something that can help other programs, but rather better understanding what their students' prospective employers want so that the program faculty might better address the preparation of candidates.

Research within the field of education is often driven by context-specific questions posed by faculty and therefore tends toward qualitative design. Indeed, large-scale quantitative educational studies are relatively rare, and generalizations made on their basis tend to be subject to dismissal because of the difficulties inherent in applying scientific models to studies of humans that take place in natural settings, especially when the conclusions contradict strongly held convictions about teaching and learning. (For an illustrative example, consider Project Follow Through, "the largest, most expensive educational experiment ever conducted" [Adams, 1996].)
Because investigators conducting qualitative research typically care more about the usefulness of the information gathered in supporting interpretation and insight than about the reproducibility of the study, qualitative research tends to be characterized by adjustability and/or fluidity. The design needs to facilitate collection of the desired information, not stand up to scientific scrutiny. Stated another way, validity of the findings is important in qualitative research, but reliability is not. Observation, interviews, and focus groups are methods commonly used in qualitative research. The data produced may be a mix of direct and indirect measurements and/or textual information with no associated numeric values (such as recorded dialogs, comments, field notes, responses to open-ended questions, etc.). Input using categorization, holistic ratings, Likert scales, rubrics, and/or forced-choice responses may be numerically represented to facilitate statistical analysis. Note that the use of statistics is not the exclusive domain of quantitative research; however, while quantitative research tends to employ inferential statistics, qualitative research tends to incorporate descriptive statistics. These categories are explored further in the Statistics section of this handbook.

Refresher of Basic Concepts

Median: The middle value between the highest and the lowest of the numbers when the set is arranged in order. For example, if the set is {2, 3, 3, 5, 6, 8, 9}, the median is 5. If you have an even count of numbers, the median is the sum of the two in the middle divided by 2. For example, if the set is {1, 1, 3, 5, 5, 6}, the median is calculated (3 + 5) / 2 = 4.

Mode: The most frequently occurring of the numbers. In the set {2, 3, 3, 5, 6, 8, 9}, the mode is 3 because that is the only number that occurs twice. A set can have more than one mode, as in {1, 1, 3, 5, 5, 6}, and if no number occurs more than once, there is no mode at all. The mode is considered a measure of central tendency because in a more-or-less normally
distributed population, the mode would tend to be at or near the middle, as would the mean and median.

Correlation: A procedure used to quantify the relationship between two quantitative variables. In a positive correlation, an increase in one variable corresponds to an increase in the other. In a negative correlation, an increase in one corresponds to a decrease in the other. Correlations are represented using correlation coefficients, ranging between -1 and 1, with zero indicating no correlation at all and the strength of the correlation increasing toward each end of the range. A correlation of 1 would suggest that an increase in one variable corresponds to an equivalent increase in the other, a phenomenon that is rarely seen in educational research. In interpreting correlations, it is important to keep in mind that they do not necessarily demonstrate causal relationships. Both variables could potentially be influenced by other variables. For example, ACT scores correlate positively with college completion rates, but having higher ACT scores does not make students more likely to finish college. Both are influenced by prior educational experiences, socio-economic status, parental education, cultural/family expectations, etc.

Data Plots: Graphic representations of data sets, often taking the form of scatterplots, histograms (similar to bar graphs), or box plots (a.k.a. box-and-whisker charts).

Effect Size: A calculation that quantifies the magnitude of a difference between two groups. The effect size is a measure of how far apart the true value of the parameter being tested and the one assumed in the null hypothesis are. Failing to detect the difference is a type II error (interpreting the data as not showing a difference when there really is one). Sometimes effect size is presented instead of (or in addition to) calculations of statistical significance, in which very large sample sizes tend to cause relatively minor differences to be flagged as significant. Effect size is an indicator of
whether the statistically significant information is of practical significance. Different formulas can be used to calculate effect size, but one way is to divide the difference between the two group means by the standard deviation (of either group or of the two groups averaged). Basically, this tells you how many standard deviations' difference there is between the two means. For example, an effect size of .70 tells you that the difference between the means represents approximately 7/10 of a standard deviation. Some would consider this a large effect size. However, different scales are used for interpretation, depending on the field of study. Some interpret .25 as small, .50 as medium, and .80 as large. Others consider .10 small, .30 medium, and .50 large.

Hypothesis Testing: A formal process of deciding whether to reject or not to reject a null hypothesis based on the results of a study.

Level of Significance & Confidence Level: The level of significance is referred to as the alpha level. It is used in hypothesis testing and should be chosen before the study is conducted. The level of significance quantifies the investigator's willingness to make a type I error (to interpret the data as showing a difference when one truly does not exist) by identifying what p-value will be considered the 'bar' (or cutoff) for statistical significance. After the data is collected, a p-value is calculated from the sample, and if it is smaller than the level of significance, then the null hypothesis is rejected (and the relationship deemed to be statistically significant). The confidence level is the likelihood that the corresponding confidence interval is one of the intervals that contains the parameter being estimated. The confidence level is one minus the level of significance. A confidence interval can be interpreted as a two-sided or two-tailed hypothesis test at the alpha level of significance.

Normal Curve: A frequency plot showing the distribution of a characteristic over a population. Also known as
a bell curve, the graphic representation is bell-shaped, with the majority of people clustering near the mean. A perfectly normal distribution is one in which mean, median, and mode all have the same value and the tails (lowest and highest ends) remain slightly above the horizontal axis, as shown in Figure 2.

Figure 2: Normal Distribution (image retrieved from http://www.lesn.appstate.edu/olson/normal_curve.htm)

Null Hypothesis: Specifies a value for a parameter. Usually, the choice is based on there being no relationship or effect, no difference between groups, or no difference from a previous value. Like the assumption of innocence until proven guilty, the null hypothesis places the burden upon the investigator to show, with significant evidence beyond reasonable doubt, that a difference exists.

Outliers: Data points that are significantly above or below the trend or that deviate from the pattern. The scatterplot in Figure 3 illustrates by pointing out two outliers.

Figure 3: Illustration of Outliers

Power: An estimate of the sensitivity of a test in detecting a difference. The power of a test is the chance that the test correctly rejects a false null hypothesis, i.e., the chance of not making a type II error.

p-Value: The "p" in p-value stands for probability. A p-value represents the probability of obtaining a result at least as extreme as the one observed under a true null hypothesis. (It is not the probability that the null hypothesis itself is true; rather, it describes how surprising the observed result would be if the null hypothesis were true.)
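To make the p-value idea concrete, here is a minimal sketch (not part of the handbook's procedures) showing how a two-sided p-value can be computed for a z test statistic using only the Python standard library; the function names are illustrative, not from any particular statistics package:

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_sided_p_value(z: float) -> float:
    """Probability of a result at least as extreme as z, in either tail,
    under a true null hypothesis."""
    return 2.0 * (1.0 - normal_cdf(abs(z)))

# A z statistic of about 1.96 sits at the conventional 0.05 cutoff:
print(round(two_sided_p_value(1.96), 3))  # → 0.05
```

A larger test statistic yields a smaller p-value, which is why extreme results lead to rejecting the null hypothesis at a given alpha level.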
P-values are presented with tests of statistical significance to communicate the level of reliability of the statistical conclusions. The smaller the p-value, the stronger the evidence of statistical significance. P-values are usually derived from the "Sig" (significance) column in analytical output tables and are written as "p ≤ 0.01," which corresponds to a 99% confidence level, or as "p ≤ 0.05," which corresponds to a 95% confidence level. Data with a p-value > 0.05 are usually interpreted as not rising to the level of statistical significance.

Standard Deviation (SD): A measure of how spread out (dispersed) the numbers are from the mean, symbolized as σ (lowercase sigma) for the population standard deviation and s for the sample standard deviation. The more closely grouped the numbers are around the mean, the smaller the SD will be. SD is expressed in the same units used to express the mean, which makes it relatively easy to understand just how big the spread is above and below the mean. SD is the square root of the mean of the squared differences from the mean value, i.e., the square root of the variance.

Other measures of spread: The standard error is used when the standard deviation for an experiment is estimated from its data. Although the sample SD will appear in it, the correct form depends on the experiment. The margin of error is the product of a probability factor, which sets the confidence, and the standard error as a scale factor based on the data. It is the number behind the plus or minus in a confidence interval and represents half the width of the interval. For example, with 95% confidence, the true mean μ might be thought to lie on the interval 5.25 ± 0.37.

Type I and Type II Errors: A type I error is a false positive, in which a true null hypothesis is rejected. A type II error is a false negative, in which a false null hypothesis is not rejected.

Variance: A measure of
how spread out (or dispersed) the numbers are from the mean. The obvious measure of spread would be the mean of the deviations (the differences between the data values and the mean); however, the mean is the balance point of the data, so the deviations above the mean cancel those below, and the mean deviation is therefore always zero. To avoid this problem, the deviations are squared. Variance, then, is the mean of the squared deviations. This result is the variance for the population. There is an inherent error caused by sampling, which is corrected for by dividing by one less than the number of data values to obtain the variance for the experimental sample.

Analytic Software Programs

Open-source and public-domain platforms and online calculators are readily available to support most analytic needs. Here are just a few examples:

Dataplot: National Institute of Standards and Technology
PSPP: The Free Technology Academy
Statistical Calculators: MathPortal.org
Sample Size Calculator: National Statistical Service
Sample Size Calculator: Creative Research Systems

Preliminary Data Analysis Tips

Clean up your data: Before beginning an in-depth analysis, look through the data for typos and errors in formatting. If you are using Excel, be sure to format the cells consistently throughout each field. Look for number fields formatted as text, and check decimal settings, time and date formatting, etc. If you are using PSPP, SPSS, or some other statistics software, be sure the "Variables" view has the correct data type selected. Scan the data for things like ID numbers with too many or too few digits or income figures with inconsistent formatting (ranges, dollar signs, text, etc.).
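Screening of this kind can also be automated. The following sketch illustrates the idea in Python; the field names ("student_id", "income") and the expected ID width are hypothetical examples, not CNM conventions:

```python
# Screen raw records for common data-entry problems before analysis.
# Field names and the 7-digit ID width are illustrative assumptions.
def check_record(rec: dict) -> list:
    """Return a list of formatting problems found in one record."""
    problems = []
    if len(rec["student_id"]) != 7:   # too many or too few digits
        problems.append("ID has wrong number of digits")
    if not rec["income"].isdigit():   # catches ranges, $ signs, text, etc.
        problems.append("income is not a plain number")
    return problems

records = [
    {"student_id": "1002345", "income": "42000"},    # clean
    {"student_id": "987", "income": "$38,500"},      # short ID, stray symbols
    {"student_id": "1002399", "income": "30-40k"},   # income given as a range
]

for rec in records:
    for problem in check_record(rec):
        print(rec["student_id"], "->", problem)
```

Flagging suspect records this way before analysis serves the same purpose as sorting each field and scanning it by eye, but scales to larger files.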
Consider sorting the records in ascending or descending order by each data field to make errors more apparent.

Develop some first impressions: Look for general trends and/or groupings and anything else that stands out. Often it is helpful to look at the data of interest in a scatterplot before running tests on it, to see if any linear or grouping trends are apparent, to check for outliers, etc. Having familiarized yourself with the data in this way will stimulate ideas about what to explore and will help you interpret the output when you run your analyses.

Check to see if the requirements for the model you chose are met: For example, for a nearly normal model, look for a single peak and symmetry in the data using a histogram or a normal probability plot. For a linear regression, check for linearity using a scatterplot. After fitting the regression line to the data, look for structure in the residuals that indicates problems in the original linear model.

GLOSSARY

ANOVA: Analysis of Variance, a parametric test used to compare the means of two or more independent samples and check for statistically significant differences.

Case Studies: Anecdotal reports prepared by trained observers. When numerous case studies are analyzed together, the findings are often considered empirical evidence.

Chi Square (χ2): A non-parametric test used to compare the frequencies of counts, using nominal or ordinal data, to expected frequencies. The Goodness of Fit Chi Square is used for one independent variable, the Chi Square Test of Independence for two independent variables.

Content Validity: The degree to which an instrument actually measures the content it is intended to measure.

Construct: A theoretical entity or concept such as aptitude, creativity, engagement, intelligence, interest, or motivation. Constructs cannot be directly measured, so we assess indicators of their existence. For example, IQ tests assess prior learning as an indicator of intelligence.

Construct Validity: The ability of a research approach to actually represent the construct it purports to address. Since constructs cannot be directly measured, construct validity depends upon how much people can agree upon a definition of the construct and how truly the indicators represent that definition.

Correlation: A process of quantifying a relationship between two quantitative variables in which an increase in one is associated with an increase in the other (a positive correlation) or a decrease in the other (a negative correlation).

Cross-Sectional Analyses: Studies comparing, at a single point in time, people representing different stages of development, for the purpose of drawing inferences about how people change/develop over time.

Descriptive Statistics: Tests used to quantitatively describe phenomena. (Compare nonparametric analyses. Contrast inferential statistics.)

Empirical Evidence: Usually refers to what has been systematically observed. May or may not be obtained through scientific experimentation.

Hawthorne Effect: A phenomenon in which the behavior of participants in a study is affected by their knowing they are being studied.

Inferential Statistics: Tests used to quantify the degree to which generalizations can be made to a population based on outcomes observed in a sample. (Compare parametric analyses. Contrast descriptive statistics.)

Internal Validity: The extent to which variables in a study can be controlled sufficiently to ensure that any observed change is actually attributable to the intervention.

Inter-Rater Reliability: The degree of consistency in ratings among different evaluators. For example, if five judges score a performance with nearly identical ratings, the inter-rater reliability is said to be high.

Interval Data: Numbers on a scale comprised of equal intervals but having no absolute zero.

Longitudinal Analysis: A comparison of outcomes over time within a given group, based on changes at an individual level (versus differences in group averages), e.g., comparing pre-test and post-test scores to assess learning gains.

Mean: The average (arithmetic mean), derived by dividing the sum of all scores by the total number (count) of scores. It is the balance point of the data.

Median: The midpoint, where 50% of scores are above it and 50% are below it (in position, not value) when all of the scores are ranked in ascending order.

Mode: The most frequently occurring score.

Nominal Data: Numbers that (usually) represent individuals or categories with no associated order, e.g., numbers assigned to ethnic groups or student ID numbers.

Nonparametric Analyses: Studies using nominal and/or ordinal data, or any data type that does not follow a normal distribution and/or does not have variances similar to those of the comparison population. Nonparametric tests do not support population inferences. They are used primarily in qualitative research.

Null Hypothesis: A statement to be tested through systematic investigation, asserting that no relationship or effect exists between the variables being studied, that there is no difference between groups, or that there is no difference from a previous value. In scientific research, the assumption is that no claims can be made regarding relationships between variables unless significant evidence exists to reject the null hypothesis. Even so, a cause-and-effect relationship cannot be established.

Ordinal Data: Numbers that (usually) represent observations that have an underlying order or magnitude, but the intervals cannot be assumed to be equal.

Outliers: Data points that are noticeably outside the pattern of the point distribution.

Parametric Analyses: Studies using interval or ratio data that are more-or-less normally distributed, when the variance within the two groups being compared is the same or similar. Parametric analyses can be used to support generalizations/inferences. (Compare inferential statistics.)

Qualitative Evidence: Evidence that cannot be directly measured, such as people's perceptions, valuations, opinions, effort levels, behaviors, etc. Many people mistakenly refer to qualitative evidence that has been assigned numerical representations (e.g., Likert-scale ratings) as quantitative data. The difference is that qualitative evidence is inherently non-numerical.

Quantitative Evidence: Evidence that occurs in traditionally established, measurable units, such as counts of students, grade point averages, percentages of questions correctly answered, speed of task completion, etc.

Randomized, Controlled Trials (RCTs): A model for scientific experimentation. Participants are randomly selected and divided into groups, experimental and control, for outcome comparison.

Ratio Data: Numbers on a scale comprised of equal intervals and having an absolute zero.

Reliability: Consistency in outcomes over time or under differing circumstances. For example, if a student challenges a placement exam repeatedly without doing any study in between and keeps getting essentially the same result, the test has a high level of reliability.

Standard Deviation (SD): A measure of the spread of scores about the mean. SD is the square root of the variance. There are two, one for a population and one for a sample; these are calculated slightly differently.

t-Test: Originally derived for small samples; current practice is to use this distribution when the population standard deviation is not known.

Validity: The ability of a research approach to measure what it is intended to measure. An aptitude test in which cultural expectations or socio-economic level significantly influence outcomes may have low validity.

REFERENCES

Adams, G., & Engelmann, S. (1996). Research on direct instruction. Seattle, WA: Educational Achievement Systems.

Grossen, B. (1996). The story behind Project Follow Through. Effective School Practices, Vol. 15. Retrieved from http://darkwing.uoregon.edu/~adiep/ft/grossen.htm

Institutional Review Board. (2004). CNM institutional review board [PDF document]. Central New Mexico Community College. Retrieved from http://www.cnm.edu/depts/planning/instres/irb/documents/CNM_Institutional_Review_Board.pdf

National Institutes of Health, Office of Extramural Research. Protecting human research participants [IRB training]. Retrieved from http://phrp.nihtraining.com/

Schneider, W. H. The establishment of institutional review boards in the U.S.: Background history. Indiana University-Purdue University Indianapolis, History Department. Retrieved from http://www.iupui.edu/~histwhs/G504.dir/irbhist.html

U.S. Department of Health and Human Services. (1993). Institutional review board guidebook. Retrieved from http://www.hhs.gov/ohrp/archive/irb/irb_chapter1.htm

APPENDIX A: RECOMMENDATIONS FROM CNM'S IRB

https://www.cnm.edu/depts/planning/instres/irb/documents/RECOMMENDATIONS-FOR-RESEARCH.pdf

APPENDIX B: PREVIEW OF CNM IRB APPLICATION FORM

http://www.cnm.edu/depts/planning/instres/irb