Quantitative Methods for Second Language Research. Carsten Roever and Aek Phakiti. Routledge, 2018.

QUANTITATIVE METHODS FOR SECOND LANGUAGE RESEARCH

Quantitative Methods for Second Language Research introduces approaches to and techniques for quantitative data analysis in second language research, with a primary focus on second language learning and assessment research. It takes a conceptual, problem-solving approach by emphasizing the understanding of statistical theory and its application to research problems while paying less attention to the mathematical side of statistical analysis. The text discusses a range of common statistical analysis techniques, presented and illustrated through applications of the IBM Statistical Package for Social Sciences (SPSS) program. These include tools for descriptive analysis (e.g., means and percentages) as well as inferential analysis (e.g., correlational analysis, t-tests, and analysis of variance [ANOVA]). The text provides conceptual explanations of quantitative methods through the use of examples, cases, and published studies in the field. In addition, a companion website to the book hosts slides, review exercises, and answer keys for each chapter as well as SPSS files. Practical and lucid, this book is the ideal resource for data analysis for graduate students and researchers in applied linguistics.

Carsten Roever is Associate Professor in Applied Linguistics in the School of Languages and Linguistics at the University of Melbourne, Australia.

Aek Phakiti is Associate Professor in TESOL in the Sydney School of Education and Social Work at the University of Sydney, Australia.

QUANTITATIVE METHODS FOR SECOND LANGUAGE RESEARCH
A Problem-Solving Approach
Carsten Roever and Aek Phakiti

First published 2018 by Routledge, 711 Third Avenue, New York, NY 10017, and by Routledge, Park Square, Milton Park, Abingdon, Oxon, OX14 4RN.

Routledge is an imprint of the Taylor & Francis Group, an informa business.

© 2018 Taylor & Francis

The right of Carsten Roever and Aek Phakiti to be identified as authors of this work has been asserted by them in
accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Every effort has been made to contact copyright-holders. Please advise the publisher of any errors or omissions, and these will be corrected in subsequent editions.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book has been requested

ISBN: 978-0-415-81401-0 (hbk)
ISBN: 978-0-415-81402-7 (pbk)
ISBN: 978-0-203-06765-9 (ebk)

Typeset in Bembo by Apex CoVantage, LLC

Visit the Companion Website: www.routledge.com/cw/roever

CONTENTS

List of Illustrations
Foreword
Preface
Acknowledgments

1 Quantification
2 Introduction to SPSS
3 Descriptive Statistics
4 Descriptive Statistics in SPSS
5 Correlational Analysis
6 Basics of Inferential Statistics
7 T-Tests
8 Mann-Whitney U and Wilcoxon Signed-Rank Tests
9 One-Way Analysis of Variance (ANOVA)
10 Analysis of Covariance (ANCOVA)
11 Repeated-Measures ANOVA
12 Two-Way Mixed-Design ANOVA
13 Chi-Square Test
14 Multiple Regression
15 Reliability Analysis

Epilogue
References
Key Research Terms in Quantitative Methods
Index

ILLUSTRATIONS

Figures

2.1 New SPSS spreadsheet
2.2 SPSS Variable View
2.3 Type Column
2.4 Variable Type dialog
2.5 Label Column
2.6 Creating student and score variables for the Data View
2.7 Adding variables named ‘placement’ and ‘campus’
2.8 The SPSS spreadsheet in Data View mode
2.9 Accessing Case Summaries in the SPSS menus
2.10 Summarize Cases dialog
2.11 SPSS output based on the variables set in the Summarize Cases dialog
2.12 SPSS menu to open and import data
2.13 SPSS dialog to open a data file in SPSS
2.14 Illustrated example of an Excel data file to be imported into SPSS
2.15 SPSS dialog when opening an Excel data source
2.16 The personal factor questionnaire on demographic information
2.17 SPSS spreadsheet that shows the demographic data of Phakiti et al. (2013)
2.18 The questionnaires and types of scales and descriptors in Phakiti et al. (2013)
2.19 SPSS spreadsheet that shows questionnaire items of Phakiti et al. (2013)
3.1 A pie chart based on gender
3.2 A pie chart based on a 10-point score range
3.3 A bar chart based on a 10-point score range
3.4 An example of questionnaire items using a Likert-type scale
3.5 The positively skewed distribution of length of residence
3.6 The negatively skewed distribution of speech act scores
3.7 The low skewed distribution of implicature scores
4.1 Ch4TEP.sav (Data View)
4.2 Ch4TEP.sav (Variable View)
4.3 Defining gender in the Value Labels dialog
4.4 Defining selfrate (self-rating of proficiency) in the Value Labels dialog
4.5 Defining missing values
4.6 SPSS menu for computing descriptive statistics
4.7 Frequencies dialog
4.8 Frequencies: Statistics dialog
4.9 Frequencies: Charts dialog
4.10 A histogram of the self-rating of proficiency variable with a normal curve
4.11 SPSS Descriptives options
4.12 SPSS graphical options
4.13 SPSS bar option
4.14 SPSS pie option
4.15 SPSS histogram option
4.16 The histogram for the total score variable
5.1 A scatterplot displaying the values of two variables with a perfect positive correlation of 1
5.2 A scatterplot displaying the values of two variables with a correlation coefficient of 0.90
5.3 A scatterplot displaying the values of two variables with a correlation coefficient of 0.33
5.4 A scatterplot displaying the values of two variables with a perfect negative correlation coefficient of –1
5.5 A scatterplot displaying the values of two variables with a low correlation coefficient of 0.06
5.6 SPSS output displaying the Pearson product moment correlation between two subsections of a grammar test
5.7 A view of Ch5correlation.sav
5.8 SPSS graphs menu with Scatter/Dot option
5.9 Simple scatterplot options
5.10 A scatterplot displaying the values of the listening and grammar scores
5.11 Adding the fit line in a scatterplot
5.12 A scatterplot displaying the values of the listening and grammar scores with a line of best fit added
5.13 SPSS Bivariate Correlations dialog
6.1 A normally distributed data set
7.1 Accessing the SPSS menu to perform the independent-samples t-test
7.2 SPSS dialog for the independent-samples t-test
7.3 Lee Becker’s effect size calculators
7.4 Accessing the SPSS menu to perform the paired-samples t-test
7.5 Paired-Samples T Test dialog
8.1 SPSS menu to perform the Mann-Whitney U test
8.2 SPSS dialog to perform the Mann-Whitney U test
8.3 SPSS menu to perform the Wilcoxon Signed-rank test
8.4 SPSS dialog to perform the Wilcoxon Signed-rank test
9.1 SPSS menu to launch a one-way ANOVA
9.2 Univariate dialog for a one-way ANOVA
9.3 Options for post hoc tests
9.4 Options dialog for ANOVA
9.5 SPSS menu to launch the Kruskal-Wallis test
9.6 Setup for the Kruskal-Wallis test
9.7 Variable entry for the Kruskal-Wallis test
9.8 Analysis settings for the Kruskal-Wallis test
9.9 Kruskal-Wallis test results
9.10 Model Viewer window for the Kruskal-Wallis test
9.11 Viewing pairwise comparisons
9.12 Pairwise comparisons in the Kruskal-Wallis test
10.1 Accessing the SPSS menu to launch the Compute Variable dialog
10.2 Compute Variable dialog
10.3 Checking ANCOVA assumption of independence of covariate and independent variable
10.4 Accessing the SPSS menu to select Cases for analysis
10.5 Select Cases dialog
10.6 Defining case selection conditions
10.7 Data View with cases selected out
10.8 Accessing the SPSS menu to launch ANCOVA
10.9 Univariate dialog for choosing a model to examine an interaction among factors and covariances
10.10 Univariate: Model dialog for defining the interaction term to check the homogeneity of regression slopes
10.11 Changing the analysis setup back to the original setup
10.12 Options in the Univariate dialog
11.1 A pretest, posttest, and delayed posttest design
11.2 Accessing the SPSS menu to launch a repeated-measures ANOVA

References

Roever, C. (2006). Validation of a web-based test of ESL pragmalinguistics. Language Testing, 23(2), 229–256.
Roever, C. (2012). What learners get for free: Learning of routine formulae in ESL and EFL environments. ELT Journal, 66(1), 10–21.
Rutherford, A. (2011). ANOVA and ANCOVA: A GLM approach. Oxford: John Wiley & Sons.
Scheaffer, R. L., Mendenhall, W., Ott, R. L., & Gerow, K. G. (2012). Elementary survey sampling. Boston: Brooks/Cole.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
Shintani, N., Ellis, R., & Suzuki, W. (2014). Effects of written feedback and revision on learners’ accuracy in using two English grammatical structures. Language Learning, 64(1), 103–131.
Stevens, J. P. (2012). Applied multivariate statistics for the social sciences (5th ed.).
New York: Routledge.
Tabachnick, B., & Fidell, L. (2012). Using multivariate statistics. Boston: Pearson.
Taguchi, N., & Roever, C. (2017). Second language pragmatics. Oxford: Oxford University Press.
Weir, C. J. (2003). Language testing and validation: An evidence-based approach. New York, NY: Macmillan.
Williamson, D. M., Xi, X., & Breyer, F. J. (2012). A framework for evaluation and use of automated scoring. Educational Measurement: Issues and Practice, 31(1), 2–13.
Yang, Y., Buckendahl, C. W., Juszkewicz, P. J., & Bhola, D. S. (2002). A review of strategies for validating computer-automated scoring. Applied Measurement in Education, 15(4), 391–412.

KEY RESEARCH TERMS IN QUANTITATIVE METHODS

Alternative hypothesis (H1): A statistical hypothesis that is complementary to the null hypothesis. This hypothesis is usually related to researchers’ expectation of research findings.

Analysis of covariance (ANCOVA): A type of analysis of variance (ANOVA) that attempts to minimize the effect of intervening variables on the dependent variable.

Analysis of variance (ANOVA): A parametric test that functions similarly to the independent-samples t-test, but instead of two groups, it can examine the differences among three or more groups.

Bar graph: Also called a bar chart, this is a diagram for displaying the comparative sizes of values of an independent variable.

Between-groups design: An experimental research design that places different groups of learners in different conditions and tests whether an outcome variable differs significantly among the different conditions.

Categorical data: Nominal data that are sorted into categories (e.g., men versus women; Form A versus Form B versus Form C).

Chi-square test (also written as ‘χ² test’): A type of inferential statistics for analyzing nominal data. It is used to determine whether two nominal variables are associated or related to each other.

Coefficient of determination (denoted by R²): The Pearson correlation coefficient squared. It is expressed as a percentage and
describes how much of the variance in one variable is explained by the other variable.

Cohen’s d: A statistical effect size for t-tests or group comparisons. It indicates the magnitude of group differences (small, medium or large). The value of Cohen’s d has no upper limit. A Cohen’s d above 0.8 is considered large.

Cohen’s kappa: An inter-rater or inter-coder reliability measure. Cohen’s kappa is appropriate when researchers use categorical codes, such as yes/no, present/absent and pass/fail.

Construct: A research feature of interest that cannot be observed directly (e.g., language proficiency, motivation, anxiety, and beliefs).

Contiguity problem: This problem occurs whenever cut-off points are set arbitrarily, so that adjacent data points are classified into different groups.

Convenience sampling: A sampling technique in which researchers use easily obtainable participants for their research (e.g., a group of students they are teaching). Convenience samples are often not representative of a larger population.

Correlation coefficient (r): A measure of the strength of a linear relationship between two variables. It has a value between –1 and 1.

Correlational research: A type of research design which aims to investigate to what extent two variables are related to each other.

Cronbach’s alpha (α): A standard measure of reliability for tests and questionnaires. Cronbach’s alpha tends to be higher if samples are heterogeneous, there are many items, and the items correlate strongly with one another.

Data: Information collected for research. Quantitative data are in a plural form (e.g., data are/were).

Degree of freedom (df): A statistical approach for correcting a sample size for statistical calculations. It has critical implications for statistical results when a sample size is small.

Dependent variables: Variables (e.g., test performance) that are affected by independent variables. Dependent variables are also known as outcome variables.

Dichotomous data: This term describes nominal
data that can have only two possible values (e.g., pass/fail; international student/domestic student; correct/incorrect; 0/1).

Effect size: The magnitude of a relationship between two variables or of a difference among groups. For example, the correlation coefficient and R² are measures of effect size in correlational analysis. Cohen’s d is an effect size index for a t-test.

Eta squared (η²) or partial eta squared (partial η²): The effect size in ANOVA and ANOVA-related tests. The η² effect size can never exceed the value of 1.

Experimental research: A type of research design that aims to test a hypothesis of the effect of one or more factors on another factor, under strictly controlled conditions (e.g., the effects of feedback on writing accuracy, the effect of a particular teaching approach on language learning). Also, participants must be randomly assigned to conditions. Experimental research can allow researchers to investigate causal relationships among independent and dependent variables.

External validity: The generalization of the current research findings of a quantitative study to other participants or populations, and other contexts or settings.

Frequency count: The quantification of occurrences or frequencies of a feature (e.g., number of males and females).

Hypothesis testing: A process of inferential statistics for assessing how well the data support a null hypothesis and whether the null hypothesis can be rejected with minimal error.

Independent-samples t-test: A parametric test for comparing the mean scores of two different groups of participants. This test is called ‘independent’ because no member of one group is also a member of the other.

Independent variables: Variables whose effect on other variables researchers are investigating (e.g., the effect of teaching methods on language learning success, or the effect of study-abroad on a fluency measure). Independent variables are known as grouping variables in t-tests, factors in analysis of variance,
and predictor variables in regression analysis.

Individual differences research: Research that aims to determine the effects or influences of individual factors (e.g., age, gender, native language, study-abroad experience, a particular psychological aspect) on language learning, use or performance.

Inferential statistics: Statistics used to draw conclusions about a population of interest from a sample of that population.

Internal consistency: A statistical measure of the reliability of a test or research instrument, such as a questionnaire or an elicitation task. Cronbach’s alpha is the most common example of this type of measure.

Internal validity: The trustworthiness of a quantitative study, which is related to all aspects of the research design of the study (e.g., theoretical framework, instruments and data analysis) that lead to inferences and conclusions in the study.

Inter-rater or inter-coder reliability: A measure of the reliability of rating among different raters or data coders. It concerns the amount of variation of scoring of the same learner performance by different raters.

Interval data: Data that take on continuous values that allow the difference between data values to be calculated. Interval data do not have a true zero. Interval data allow researchers to compute means and standard deviations.

Intervening variable: An independent variable that interferes with the influence of the target independent variable on the dependent variable. This variable is also known as a moderator variable or confounding variable. An example of an intervening variable is pretest differences.

Intraclass correlation coefficient: A type of correlational analysis that can be used as a measure of inter-rater reliability.

Intra-rater reliability: The consistency of the scores a rater assigns to learners’ performances, as well as that rater’s ability to consistently discriminate among good, average, and poor work.

Introspection or think-aloud protocol: A form of verbal report
of current, ongoing thoughts while participants are engaged in language or cognitive activities.

Kruskal-Wallis test: An alternative to a one-way analysis of variance (ANOVA) if the conditions for ANOVA are not met.

Kurtosis statistic: A statistic that describes how close the values in a data set are to the mean.

Language aptitude tests: Measures of individuals’ inherent talent for language learning. Language aptitude tests can be used to predict success in future language learning.

Language tests: Tools to measure and sample learners’ ability or skills to use the target language (e.g., reading, listening, speaking, writing, vocabulary, pragmatics, and grammar).

Learner corpora: A large amount of natural language data that are produced by language users in authentic communication (written and spoken language). Corpus analysis can be conducted using a computer program (i.e., corpus linguistic tools) to automatically examine quantitative features of language data from linguistic units (e.g., morphemes) to syntactic structures (e.g., single words, lexical phrases).

Leptokurtic distribution: A data distribution that is tall and narrow.

Levene’s test: A statistical test for checking the equality of variances in t-tests and analysis of variance.

Likert-type scale: A measurement scale that allows research participants to choose from a discrete set of responses (e.g., from ‘strongly disagree’ to ‘strongly agree’).

Mann-Whitney U test: A nonparametric alternative to the independent-samples t-test.

Mean: The average of all the values in the data set.

Measures of central tendency: Each of these measures is a single value that attempts to describe a data set by identifying a typical central value, i.e., mean, median, and mode.

Measures of dispersion: Descriptive statistics that indicate how much variability there is in the data set.

Median: The value that divides a data set into two groups.

Meta-analysis: A systematic review of previous empirical studies through the use of
statistical analysis for an average effect size.

Mode: The value that occurs most frequently in a data set.

Multiple regression: An extension of simple regression to examine the effect of several independent variables on an outcome variable simultaneously. Hierarchical regression is performed when researchers enter independent variables in steps (called blocks).

Negative correlation coefficient: A correlation coefficient that indicates that as one variable increases, the other decreases, and vice versa.

Nominal data: Data that consist of named categories. Nominal data can be compared only in terms of sameness or difference, rather than size or strength (e.g., gender, nationalities, first language). Nominal data allow frequency counts, including raw frequencies and percentages, as well as visual representations (e.g., pie charts).

Normal distribution: A data distribution that is bell-shaped.

Null hypothesis (H0): A statistical hypothesis that is tested against empirical data. Inferential statistics aim to reject the null hypothesis. A null hypothesis may have a word such as ‘no’ or ‘not’ (e.g., there is no relationship, there is no difference, and there is no effect). A rejection of a null hypothesis is linked to a probability value being set (e.g., when p < 0.05).

Ordinal data: Data that are put into an order. Ordinal data can be obtained when participants are rated or ranked according to their test performances or levels of some feature. Ordinal data are more informative than nominal data since they contain information about relative size or position (running at average speed, high speed, very high speed), but are less informative than interval data, which contain information about the exact size of the difference (race times of 25 seconds, 12 seconds, 10 seconds).

Paired-samples t-test: A parametric test for comparing the mean scores of two tests or measurement instruments taken by the same group of participants. This test is called ‘paired’ because pretest scores are
compared with posttest scores. This test is also called ‘a dependent t-test’ because both mean scores depend on the same group of participants.

Parameters: The characteristics of the population of interest.

Participants: People who take part in a research study. Participants are sources of data for analysis.

Pearson Product Moment correlation (Pearson’s r): A parametric statistic for correlational analysis.

Pie chart: A circular diagram for displaying the relative sizes of values of a variable.

Platykurtic distribution: A data distribution that is wide and flat.

Point-biserial correlation: A type of correlational analysis that is used when one variable is nominal and the other is interval or ordinal.

Population: A particular group of people or learners of interest.

Positive correlation coefficient: A correlation coefficient that indicates that as one variable increases, the other also increases.

Post hoc test: A statistical test that is used to inform researchers which groups differ significantly from one another. Post hoc tests are used when there are more than two comparison groups (e.g., in ANOVA).

Psycholinguistics methods: Data elicitation methods that collect individuals’ online and/or offline mental representations and processes as they use and comprehend language in second language research. Psycholinguistics methods rely on technology in software and hardware packages (e.g., E-Prime, Presentation for PC and Eye-Tracking systems) that can produce numerical data for statistical purposes.

Quantification: A numerical measurement process of the features of individual data points.

Questionnaires: Research instruments used to measure individuals’ beliefs, perceptions, cognitive processes, or self-knowledge through those individuals answering a series of questions.

Random assignment: A required condition for a true experimental research design. Research participants (identified through a sampling method) are randomly assigned into groups (e.g., experimental or
control groups).

Random sampling: A sampling technique that allows each member of the target population to have an equal chance of being chosen.

Ratio data: Data that can be expressed using an interval, continuous scale with a true zero value.

Raw data: Data that are not yet processed or analyzed to answer research questions (e.g., answers or responses provided by research participants).

Regression analysis (or simple regression): An extension of correlation analysis in which the values of an independent variable are used to predict the values of a dependent variable. Regression expresses the variance accounted for as R² (the coefficient of determination). It ranges from 0 (no variance explained) to 1 (all variance explained).

Reliability: Consistency or repeatability of observations of behaviors, performance and/or psychological attributes. It is related to the extent to which data are free of random error. The reliability of a test or research instrument is commonly expressed as a value between 0 and 1.

Repeated-measures analysis of variance (repeated-measures ANOVA): An extension of the paired-samples t-test to the case in which the mean scores are compared across three or more tests or measurement instruments.

Research design: A research plan, outline, and method to help researchers tackle a particular research problem.

Research reliability: The confidence that similar findings or conclusions are likely to be repeated in new studies (i.e., replicability).

Sample size: The number of participants who produce data for quantitative analysis. Large samples are generally preferable to small samples.

Scatterplot: A diagram that visualizes the correlation between two variables.

Skewness statistic: A statistic that describes whether more of the data are at the low end of the range or the high end of the range.

Spearman’s rho: A nonparametric (distribution-free) statistic for correlational analysis between two variables. Spearman’s rho is sometimes written with the Greek letter
rho (ρ) or written out (rho). This statistic is an alternative to the Pearson Product Moment correlation. Unlike the Pearson correlation, it does not have a coefficient of determination.

Spearman-Brown prophecy coefficient: A reliability measure in the split-half reliability test. It can be used to inform researchers whether the reliability of the test will increase if more items are added. It can also be used to examine inter-rater reliability.

Sphericity: A statistical assumption for the repeated-measures ANOVA that refers to the condition that the variances of differences between the individual measurements should be roughly equal.

Standard deviation (SD or Std Dev): A statistic that indicates how different individual values are from the mean.

Standard error of measurement (SEM): A statistical method for estimating the lower and upper bound of an individual’s score through the use of a reliability coefficient of a research instrument and the standard deviation of the mean score. The higher the reliability coefficient, the lower the value of SEM.

Standardization: A procedure in which all research participants receive the same conditions (e.g., same tasks and equal time allowance) during data collection.

Statistical Package for the Social Sciences (SPSS): A statistical program for performing quantitative data analysis.

Statistical reasoning: The process of making inferences or drawing conclusions.

Statistical significance: The index that shows how likely it is that a statistical finding is due to chance. It is known as the significance level and it is given as a decimal (e.g., p < 0.05 or p = 0.032). In inferential statistics, it is insufficient to merely report statistical significance (see ‘effect size’).

Stratified random sampling: A sampling technique in which researchers divide the target population into sub-groups or strata and then randomly choose equal numbers of participants from each sub-group to form a total sample.

Two-way mixed-design analysis of variance
(ANOVA): A combination of a repeated-measures ANOVA and a between-groups ANOVA. This test allows researchers to simultaneously examine the effect of a between-subject variable, a within-subject variable, and the interaction between these variables.

Type I error (or false positive): An error made when researchers reject the null hypothesis when it is true.

Type II error (or false negative): An error made when researchers accept the null hypothesis when it is not true.

Variability: The extent to which data differ from the mean.

Variable: A feature that can vary in degree, value, or quantity (e.g., age or first language).

Wilcoxon signed-rank test: A nonparametric test analogous to the paired-samples t-test.

Within-groups design: An experimental research design that examines whether an outcome variable changes following the application of a treatment condition. Measures taken on the same learners are compared, rather than different groups. Pretest-posttest studies typically involve a within-groups design.

Z-score: A raw score that has been transformed into a standardized score.

INDEX

Italic page references indicate illustrations and tables.

adjacent agreement method 229
alpha level, setting 89–90
alternative hypothesis 89, 89
analysis of covariance (ANCOVA): between-subjects factors/contrasts and 151, 151; case selection in SPSS program and 142, 143–53, 143–51; conditions of 139–40; conditions in SPSS, checking 140–3, 141–2; covariate and 138–40, 151; describing 139; gain scores and 136; homogeneity of regression slopes and 140, 148, 148; homoscedasticity and 150–1; intervening variables and, eliminating 135–9; overview 153; in second language research 135; in SPSS program 140–53, 141–52
analysis of variance (ANOVA): assumptions of, statistical 119–20; degrees of freedom in 86–7; describing 117–22; effect size for 121–2; F-statistic and 210; outcomes of 119–20; overview 117, 134; post hoc tests and 120–1, 126, 127; posttest and 118, 118; in second language research 117–22; in
SPSS program 122–7, 122–7; steps in, key 119
ANOVA see analysis of variance; repeated-measures ANOVA; two-way mixed design ANOVA
assessment 11
Asymp. Sig. (2-tailed) value 110, 115
bar graphs: in descriptive statistics 35, 35; in descriptive statistics in SPSS program 54–5, 55; SPSS program instructions for 54–5, 55
Becker’s effect size calculator 104
β value and coefficient 202–5, 211, 216–17
between-subjects factors/contrasts 151, 151, 174, 174, 176, 176
bimodal data 37–8
bivariate correlation 77
Bonferroni post hoc test 150, 156
case summaries, generating in SPSS program 20–2, 20–1
categorical data 7–8, 8, 39
central tendency measures 36–8
chi-square test: assumptions of 189–90; non-SPSS method for 195–8, 196–8; one-dimensional 182–5, 183; overview 198–9; in L2 research 182; in SPSS program 190–5, 191–5; two-dimensional 185–9, 185–6, 188
coding data 15, 234–8
coefficient of determination 67
Cohen’s d 88, 96–7, 101, 103–4, 111–12, 121
Cohen’s kappa 234–8, 236–8
collinearity 205, 212, 212, 217, 217
collocations 188–9, 188
Common European Reference Framework for Languages
composite 40
confounding variables 135–9
constructs
contingency table 185, 185, 188, 189–90, 196–7, 197, 234, 234
Continuity Correction 194
control group 30
convenience sampling 82
correlation: bivariate 77; coefficients 61–2, 229–30; defining 200; interval-interval 66–8; interval-nominal 68–9; interval-ordinal 68; negative 62–6, 66, 68; occurrence of 60; ordinal-ordinal 68; point-biserial 69; positive 60, 62–6, 64, 68; in simple regression 200–1, 201; Spearman 70–9, 72, 78
correlational analysis: application of, in real study 79; background information 60; conditions to be met for using 70; correlation coefficient and 61–2; inferential statistics and 60–1; interpreting correlation and 69; interval-interval relationship and 66–8; interval-nominal relationship and 68–9; interval-ordinal relationship and 68; negative correlations and 62–6, 66, 68; occurrence of correlation and 60;
ordinal-ordinal relationship and 68; overview 79–80; Pearson Product Moment and 66, 70–9, 71, 77; point-biserial correlation and 69; positive correlations and 60, 62–6, 64, 68; scatterplots and 63, 64–7; in second language research 60–1, 70; Spearman correlation and 70–9, 72, 78
covariate 138–40, 151
Cramer’s Phi 189, 197
Cramer’s V 189, 197
Cronbach’s alpha: internal consistency measures and 221–3, 222–3; inter-rater reliability and 230; intraclass correlation coefficient and 242–3; intra-rater reliability and 228; in reliability analysis 90; setting the alpha level versus 89–90; in SPSS program 223–7, 223–7; taxonomy of questionnaire and 58, 59
cross-tabulation 185, 185, 188, 188, 190, 196–7, 197, 234, 234
data: bimodal 37–8; categorical 7–8, 8, 39; checking 14–15; clearing 15; coding 15, 234–8; demographic 25; dichotomous 8, 9; entering 15; heterogeneous 38; homogeneous 38; importing from Excel 15, 22–4, 22–4; interval 3–4, 4, 39; nominal 7–8, 8, 39; ordinal 5–7, 5–6, 39–40; organizing 14–15; in quantification 2; ratio 3–4, 4; screening 15; in second language research 4; in SPSS program, preparing 14–15; transforming, in real-life context 8–11, 9–11
data file: importing from Excel 22–4, 22–4; saving and naming SPSS program 22
degrees of freedom (df) 86–7, 120
demographic data 25
dependent residuals 205
dependent t-tests 93
dependent variables 7, 119
descriptive statistics: bar graphs and 35, 35; central tendency measures and 36–8; computing 72–3, 72; data in, summarizing 39–40; diagrams and 33–5, 34–5; dispersion measures and 38–9; distribution measures and 40–3, 41–2; frequency counts and 31–3, 31–3; graphs and 33–5, 34–5; histograms in 41–2; kurtosis statistics and 40, 43; in Laufer and Rozovski-Roitblat study 155; mean and 36; median and 36–7, 36, 39; mode and 37–9; in multiple regression 209, 209; overview 28, 43; pie charts and 33–5, 34; in quantification 28; quantification at group level and 28–30, 29–30; in repeated-measures ANOVA 155, 161, 162; in L2
research 28; skewness statistics and 40–3, 41–2; in SPSS program 52–4, 53–4; standard deviation and 38–9; in two-way mixed-design ANOVA 168, 168, 174, 175, 177, 177; in Wilcoxon Signed-rank test 114, 114
descriptive statistics in SPSS program: application of, in real quantitative study 58–9, 59; background information 44; bar graphs and 54–5, 55; computing descriptive statistics and 48–54, 49–52; descriptive statistics option and 52–4, 53–4; frequency option and 48–52, 49–52; graphs and 54–8, 55–8; histograms and 57, 57–8; missing values and, assigning 47–8, 47–8; nominal variables and, assigning values to 44–7, 45–7; overview 44, 59; pie charts and 56, 56
descriptors 5–7,
diagrams: in descriptive statistics 33–5, 34–5; in descriptive statistics in SPSS program 54–6, 55–6; SPSS program instructions for 54–8, 55–8
dichotomous data 8,
Dictogloss tasks 166
discrimination of test item 68
dispersion measures 38–9
distortion of mean 37
distribution measures 40–3, 41–2
dummy coding 205
Dunn-Bonferroni test 133
d-value 97
effect size: for analysis of variance 121–2; inferential statistics and 87–8, 88; magnitude of 88; for repeated-measures ANOVA 157–8; sample size versus 87–8, 88; for t-tests 96–7, 104
equal variances assumption 96
eta squared 121–2, 126, 151, 151
exact agreement method 229
Excel spreadsheet (Microsoft), importing data from 15, 22–4, 22–4
factor variable 119–20
false negative 90
false positive 90
file drawer problem 90
frequency counts 31–3, 31–3
F-statistic 210, 215
F-value 120, 186
gain scores 136
graphs: in descriptive statistics 33–5, 34–5; in descriptive statistics in SPSS program 54–6, 55–6; SPSS program instructions for 54–8, 55–8
Greenhouse-Geisser corrections 163
grouping variable 119–20, 128
heterogeneous data 38
heteroscedasticity 205
hierarchical regression: multiple regression and 203, 204; in SPSS program 213–18, 213–18
H-index 128
histograms: in descriptive statistics 41–2; in descriptive statistics in SPSS program 57, 57–8; SPSS
program instructions for 57, 57
homogeneity of regression slopes 140, 148, 148
homogeneous data 38
homoscedasticity 150–1
Huynh-Feldt corrections 163
hypotheses 2, 25, 89, 89
importing data from Excel spreadsheet 15, 22–4, 22–4
independent-samples t-tests 93–4, 93, 96–102, 98–101, 106, 117, 121, 138
independent variables 7–8, 119, 128
inferential statistics: alternative hypothesis and 89, 89; correlational analysis and 60–1; degrees of freedom and 86–7; effect size and 87–8, 88; errors in statistical analysis and 83, 90; normal distribution and 85, 85; null hypothesis and 89, 89; overview 90–1; populations and 81–3; probability and 83–4; quantification and 81–2; samples and 81–3; sample size and 84–8, 88; in second language research 60–1, 81; statistical significance and 83–4, 89–90; see also correlational analysis
instrument reliability 245
inter-coder reliability measures 228–30
internal consistency measures: Cronbach’s alpha 221–3, 222–3; describing 219–20; split-half reliability 221; test-retest reliability 221
International English Language Testing System (IELTS) 4, 25, 94, 239
inter-rater reliability measures 228–30
interval data 3–4, 4, 39
interval-interval relationships 66–8
interval-nominal relationships 68–9
interval-ordinal relationships 68
intervening variables 135–9
intraclass correlation coefficient (ICC) 238–43, 239–43
intra-rater reliability measures 228
Kruskal-Wallis test: assumptions about, statistical 128; describing 117, 127–8; outcomes of 128; in SPSS program 128–34, 129–33
kurtosis statistics 40, 43
language-related episodes (LREs) 185–6, 185–6
language testing and assessment (LTA) research 12; see also second language (L2) research
leptokurtic distribution 43
Levene’s test 96, 100–1, 100, 125–6, 150, 175, 176, 178–9
Likert-type scale 3, 39–40, 40, 58, 108, 128, 219–20
Lower Bound corrections 163
low skewness statistic 41, 42
main effects 168
Mann-Whitney U test 106–11, 107–11
marginal totals 186, 188, 188
Mauchly’s Test of
Sphericity 161–2, 162, 175, 175
mean 36–7, 118
measurements: central tendency 36–8; dispersion 38–9; distribution 40–3, 41–2; inter-coder reliability 228–30; inter-rater reliability 228–30; intra-rater reliability 228; normal distribution 40–3, 41–2; proficiency level 12, 188–9, 188; scales 3–8, 4–8; see also internal consistency measures
median 36–7, 36, 39
medium effect 97
Minitab software 14
missing values, assigning 47–8, 47–8
mode 37–9
moderator variables 135–9
multiple regression: ANOVA result and 210–11, 211, 216, 216; assumptions of 205; collinearity and 205, 212, 212, 217, 217; describing 203–5, 204; descriptive statistics in 209, 209; hierarchical regression and 203, 204; model coefficient outputs and 211–12, 211–12, 216–17, 216–17; overview 218; sample size in 205; in second language research 200; simple 200–3, 201; in SPSS program 206–12, 206–12
multivariate analysis of variance (MANOVA) 118, 160, 161
Multivariate Tests 164
negative correlations 62–6, 66, 68
negatively skewed distribution 41, 42
negative ranks 114
nominal data 7–8, 8, 39
nominal variables, assigning value to 44–7, 45–7
nonparametric tests: determining use of 106; Mann-Whitney U test 106–11, 107–11; overview 116; in second language research 106; Wilcoxon Signed-rank test 111–16, 112–15
non-SPSS method for chi-square test 195–8, 196–8
normal distribution 66, 85, 85
normal distribution measures 40–3, 41–2
null hypothesis 89, 89, 183, 185
one-dimensional chi-square test 182–5, 183
one-way analysis of variance see analysis of variance (ANOVA)
ordinal data 5–7, 5–6, 39–40
ordinal-ordinal relationships 68
outcome variable 119–20
outliers 36–7, 36, 106
paired-samples t-tests 93–5, 95, 102–4, 102–4
pairwise comparisons 133, 133, 164, 164
parameters 81
parametric statistic 66
partial eta squared 121, 157–8
partialing out covariate 139
Pearson: correlation analysis 43; correlation coefficient 121, 182, 204; Product Moment 66, 70–9, 71, 77, 230; Pearson’s r 66–8
percentage of agreement 229
performance rating 234–8
phi coefficients 68, 184; Phi value 184, 187, 195
pie charts: in descriptive statistics 33–5, 34; in descriptive statistics in SPSS program 56, 56; SPSS program instructions for 56, 56
platykurtic distribution 43
point-biserial correlation 69
populations 81–3
positive correlations 60, 62–6, 64, 68
positively skewed distribution 41, 41
positive ranks 114
post hoc tests 119–21, 126, 127, 140, 150, 156–7, 178–9, 178
predictor variable 201, 205, 209–10, 209
pre-post studies 154–6, 154–5
probability 83–4
PSPP software 14
purposive sampling 82
p-value 84, 88–90, 93, 100, 120
qualitative data coding 234–8
quantification: categorical data in 7–8, 8; constructs in 2; data in 2; describing 2–3; descriptive statistics in 28; at group level 28–30, 29–30; hypotheses in 2; inferential statistics and 81–2; interval data in 3–4, 4; issues in 2–3; measurement scales in 3–8; nominal data in 7–8, 8; ordinal data in 5–7, 5–6; overview 1, 13; ratio data in 3–4, 4; sample study 12–13; in second language research 1, 3; topics in second language research and 11–12; transforming data in real-life context 8–11, 9–11
random assignment 82
random sampling 82
ranks statistics 114, 115
rater reliability: Cohen’s kappa and 234–8, 234–8; correlation coefficients and 229–30; describing 227–8; inter-coder reliability and 228–30; inter-rater reliability and 228–30; intraclass correlation coefficient 238–9, 239; intra-rater reliability measures and 228; percentage of agreement and 229; Spearman-Brown coefficient and 230–4, 231–3
ratio data 3–4,
regression see multiple regression; simple regression
reliability 219–20
reliability analysis: Cronbach’s alpha in 90; estimates and, factors affecting 244–5; instrument reliability versus research validity and 245; internal consistency measures and 219–27, 222–7; overview 245; rater reliability and 227–43, 228–43; reliability and 219–20; reliability coefficient and 220; in second language research 219; standard error of
measurement and 243–4
reliability coefficient 220
reliability estimates, factors affecting 244–5
repeated-measures ANOVA: assumptions of 156–7; between-subjects contrasts and 151, 151, 163; describing 154; descriptive statistics in 155, 161, 162; effect size for 157–8; Mauchly’s Test of Sphericity and 161–2, 162; overview 164–5; paired-samples t-test and 154; post hoc tests and 157; in second language research 154–5, 154–5; sphericity and 156, 161–2; in SPSS program 158–64, 158–64; statistical significance and 157; within-subjects factors/contrasts and 162, 163, 163
repeated measures t-tests 93
R software 14
R-squared 210
R-value 210
samples 81–3
sample size 84–8, 88, 205
SAS (Statistical Analysis Software) 14
scatterplots: correlational analysis and 63, 64–7; in simple regression 201, 201; SPSS program instructions for 73, 74–7
Scheffé post hoc test 119–20, 126, 127, 140, 178, 178
screening data 15
selective sampling 82
sequential regression see hierarchical regression
setting the alpha level 89–90
Sidak post hoc test 150, 156
significance level 83–4, 84
significance, statistical 2–3, 83–4, 86–7, 89–90, 140, 156–7
simple regression 200–3, 201
skewness statistics 40–3, 41–2
Spearman-Brown coefficient 230–4, 231–3
Spearman-Brown prophecy formula 221, 230
Spearman correlation 70–9, 72, 78
Spearman’s rho 67–8, 71, 182, 230
sphericity 156, 161–2, 175, 175
split-half reliability 221
spreadsheet, creating in SPSS 16–20, 16–19
SPSS program (IBM): analysis of covariance in 140–53, 141–52; analysis of variance in 122–7, 122–7; application of, in real study 24–7, 25–6; background information 14; bar graphs in 54–5, 55; case selection in 142, 143–53, 143–51; case summaries in, generating 20–2, 20, 21; chi-square test in 190–5, 191–5; Cohen’s d in 97; Cohen’s kappa in 235–8, 236–8; computing descriptive statistics in 48–54, 49–52; Cronbach’s alpha in 223–7, 223–7; data file in, saving and naming 22; data in, preparing 14–15; describing 14; descriptive statistics option in
52–4, 53–4; diagrams in 54–8, 55–8; Excel spreadsheet and, importing data from 15, 22–4, 22–4; frequency option in 48–52, 49–52, 54; graphs in 54–8, 55–8; hierarchical regression in 213–18, 213–18; importing data from Excel and 22–4, 22–4; independent-samples t-tests in 97–102, 98–101; intraclass correlation in 240–3, 240–3; Kendall’s tau in 68; Kruskal-Wallis test in 128–34, 129–33; Mann-Whitney U test in 108–11, 108–10; missing values in, assigning 47–8, 47–8; multiple regression in 206–12, 206–12; notes on, important 15–16; overview 14, 27; paired-samples t-tests in 102–4, 102–4; Pearson Product Moment in 78, 78, 79; pie charts in 56, 56; repeated-measures ANOVA in 158–64, 158–64; scatterplots in 73, 74–7; in second language research 14; Spearman-Brown coefficient in 231–4, 231–3; Spearman correlation in 76, 79, 78; Spearman’s rho and 68; spreadsheet in, creating 16–20, 16–19; standard deviation in 38; statistical significance and 86; Test of Between-Subjects Effect 163; Tests of Within-Subjects Contrast 163; two-way mixed design ANOVA in 170–80, 170–80; value labels in, assigning 44–7, 45–7; variables in, computing 136–7, 136–7; Wilcoxon Signed-rank test in 112–16, 112–15; see also descriptive statistics in SPSS program
standard deviation (SD) 38–9, 118, 243
standard error of measurement (SEM) 243–4
Statistical Package for Social Sciences program see SPSS program (IBM)
statistical significance 2–3, 83–4, 86–7, 89–90, 140, 156–7
stratified random sampling 82
Tamhane T2 post hoc test 120, 126, 140, 178, 178
Test of Between-Subjects Effects 163
Test of English as a Foreign Language (TOEFL) 4, 58, 61–2, 83
Test of English for International Communication (TOEIC)
Test of English Pragmatics (TEP) 30, 126, 140, 206, 223
test item discrimination 68
test-retest reliability 221
theories
transforming data in real-life context 8–11, 9–11
t-tests: assumptions of 96; Cohen’s d in 88, 96–7; dependent 93; effect size for 96–7, 104; equal variances assumption and 96;
independent-samples 93–4, 93, 96–102, 98–101, 106, 117, 121, 138; Levene’s test and 96; overview 104–5; paired-samples 93–5, 95, 102–4, 102–4; repeated measures 93; in second language research 92–3; steps for using 97
t-value 93, 103
two-dimensional chi-square test 185–9, 185–7
two-way analysis of variance 117
two-way mixed-design ANOVA: between-subjects factors/contrasts and 174, 174, 176, 176; descriptive statistics in 168, 168, 174, 175, 177, 177; Levene’s test and 175, 176, 178–9; Mauchly’s Test of Sphericity and 175, 175; overview 180–1; pairwise comparisons and 177, 177, 179; post hoc tests and 178–9, 178; pretest-posttest control-group design and 166, 167; results, written 180; in second language research 166–9, 167–9; in SPSS program 170–80, 170–80; univariate tests and 177, 177; within-subjects factors/contrasts and 166–7, 174–5, 174, 176, 178
type I error 90
type II error 90
univariate analysis of variance see analysis of variance (ANOVA)
U-value 107
variables: confounding 135–9; dependent 7, 119–20; excluded 217, 218; factor 119–20; grouping 119–20, 128; independent 7–8, 119–20, 128; intervening 135–9; moderator 135–9; nominal, assigning values to 44–7, 45–7; outcome 119–20; predictor 201, 205, 209–10, 209; in quantification 2; SPSS program and computing 136–7, 136–7
VassarStats website 196–7, 196, 198
Wiseheart’s calculator 104
within-subjects factors/contrasts 162, 163, 163, 166–7, 174–5, 174, 176, 178
Yates correction 187, 194, 197
Z-value 107, 110–11, 114–15, 115
