SPSS Data Analysis for Univariate, Bivariate, and Multivariate Statistics
Daniel J. Denis

This edition first published 2019. © 2019 John Wiley & Sons, Inc.

Library of Congress Cataloging-in-Publication Data
Names: Denis, Daniel J., 1974– author.
Title: SPSS data analysis for univariate, bivariate, and multivariate statistics / Daniel J. Denis.
Description: Hoboken, NJ : Wiley, 2019. | Includes bibliographical references and index.
Identifiers: LCCN 2018025509 (print) | LCCN 2018029180 (ebook) | ISBN 9781119465805 (Adobe PDF) | ISBN 9781119465782 (ePub) | ISBN 9781119465812 (hardcover)
Subjects: LCSH: Analysis of variance–Data processing. | Multivariate analysis–Data processing. | Mathematical statistics–Data processing. | SPSS (Computer file)
Classification: LCC QA279 (ebook) | LCC QA279 .D45775 2019 (print) | DDC 519.5/3–dc23
LC record available at https://lccn.loc.gov/2018025509

Set in 10/12pt Warnock by SPi Global, Pondicherry, India
Printed in the United States of America

Contents

Preface
1 Review of Essential Statistical Principles
1.1 Variables and Types of Data
1.2 Significance Tests and Hypothesis Testing
1.3 Significance Levels and Type I and Type II Errors
1.4 Sample Size and Power
1.5 Model Assumptions
2 Introduction to SPSS
2.1 How to Communicate with SPSS
2.2 Data View vs. Variable View
2.3 Missing Data in SPSS: Think Twice Before Replacing Data!
3 Exploratory Data Analysis, Basic Statistics, and Visual Displays
3.1 Frequencies and Descriptives
3.2 The Explore Function
3.3 What Should I Do with Outliers? Delete or Keep Them?
3.4 Data Transformations
4 Data Management in SPSS
4.1 Computing a New Variable
4.2 Selecting Cases
4.3 Recoding Variables into Same or Different Variables
4.4 Sort Cases
4.5 Transposing Data
5 Inferential Tests on Correlations, Counts, and Means
5.1 Computing z-Scores in SPSS
5.2 Correlation Coefficients
5.3 A Measure of Reliability: Cohen's Kappa
5.4 Binomial Tests
5.5 Chi-square Goodness-of-fit Test
5.6 One-sample t-Test for a Mean
5.7 Two-sample t-Test for Means
6 Power Analysis and Estimating Sample Size
6.1 Example Using G*Power: Estimating Required Sample Size for Detecting Population Correlation
6.2 Power for Chi-square Goodness of Fit
6.3 Power for Independent-samples t-Test
6.4 Power for Paired-samples t-Test
7 Analysis of Variance: Fixed and Random Effects
7.1 Performing the ANOVA in SPSS
7.2 The F-Test for ANOVA
7.3 Effect Size
7.4 Contrasts and Post Hoc Tests on Teacher
7.5 Alternative Post Hoc Tests and Comparisons
7.6 Random Effects ANOVA
7.7 Fixed Effects Factorial ANOVA and Interactions
7.8 What Would the Absence of an Interaction Look Like?
7.9 Simple Main Effects
7.10 Analysis of Covariance (ANCOVA)
7.11 Power for Analysis of Variance
8 Repeated Measures ANOVA
8.1 One-way Repeated Measures
8.2 Two-way Repeated Measures: One Between and One Within Factor
9 Simple and Multiple Linear Regression
9.1 Example of Simple Linear Regression
9.2 Interpreting a Simple Linear Regression: Overview of Output
9.3 Multiple Regression Analysis
9.4 Scatterplot Matrix
9.5 Running the Multiple Regression
9.6 Approaches to Model Building in Regression
9.7 Forward, Backward, and Stepwise Regression
9.8 Interactions in Multiple Regression
9.9 Residuals and Residual Plots: Evaluating Assumptions
9.10 Homoscedasticity Assumption and Patterns of Residuals
9.11 Detecting Multivariate Outliers and Influential Observations
9.12 Mediation Analysis
9.13 Power for Regression
10 Logistic Regression
10.1 Example of Logistic Regression
10.2 Multiple Logistic Regression
10.3 Power for Logistic Regression
11 Multivariate Analysis of Variance (MANOVA) and Discriminant Analysis
11.1 Example of MANOVA
11.2 Effect Sizes
11.3 Box's M Test
11.4 Discriminant Function Analysis
11.5 Equality of Covariance Matrices Assumption
11.6 MANOVA and Discriminant Analysis on Three Populations
11.7 Classification Statistics
11.8 Visualizing Results
11.9 Power Analysis for MANOVA
12 Principal Components Analysis
12.1 Example of PCA
12.2 Pearson's 1901 Data
12.3 Component Scores
12.4 Visualizing Principal Components
12.5 PCA of Correlation Matrix
13 Exploratory Factor Analysis
13.1 The Common Factor Analysis Model
13.2 The Problem with Exploratory Factor Analysis
13.3 Factor Analysis of the PCA Data
13.4 What Do We Conclude from the Factor Analysis?
13.5 Scree Plot
13.6 Rotating the Factor Solution
13.7 Is There Sufficient Correlation to Do the Factor Analysis?
13.8 Reproducing the Correlation Matrix
13.9 Cluster Analysis
13.10 How to Validate Clusters?
13.11 Hierarchical Cluster Analysis
14 Nonparametric Tests
14.1 Independent-samples: Mann–Whitney U
14.2 Multiple Independent-samples: Kruskal–Wallis Test
14.3 Repeated Measures Data: The Wilcoxon Signed-rank Test and Friedman Test
14.4 The Sign Test
15 Closing Remarks and Next Steps
References
Index

Preface

The goals of this book are to present a very concise, easy-to-use introductory primer of a host of computational tools useful for making sense out of data, whether those data come from the social, behavioral, or natural sciences, and to get you started doing data analysis fast. The emphasis of the book is data analysis and drawing conclusions from empirical observations; the emphasis is not on theory. Formulas are given where needed in many places, but the focus of the book is on concepts rather than on mathematical abstraction. We emphasize computational tools used in the discovery of empirical patterns and feature a variety of popular statistical analyses and data management tasks that you can immediately apply as needed to your own research. The book features analyses and demonstrations using SPSS. Most of the data sets analyzed are very small and convenient, so entering them into SPSS should be easy. If desired, however, one can also download them from www.datapsyc.com. Many of the data sets were also first used in a more theoretical text written by the same author (see Denis, 2016), which should be consulted for a more in-depth treatment of the topics presented in this book. Additional references for readings are also given throughout the book.

Target Audience and Level

This is a "how-to" book and will be of use to undergraduate and graduate students along with researchers and professionals who require a quick go-to source to help them perform essential statistical analyses and data management tasks. The book
only assumes minimal prior knowledge of statistics, providing you with the tools you need right now to help you understand and interpret your data analyses. A prior introductory course in statistics at the undergraduate level would be helpful, but is not required for this book. Instructors may choose to use the book either as a primary text for an undergraduate or graduate course or as a supplement to a more technical text, referring to this book primarily for the "how to's" of data analysis in SPSS. The book can also be used for self-study. It is suitable for use as a general reference in all social and natural science fields and may also be of interest to those in business who use SPSS for decision-making. References to further reading are provided where appropriate should the reader wish to follow up on these topics or expand his or her knowledge base as it pertains to theory and further applications. An early chapter reviews essential statistical and research principles usually covered in an introductory statistics course, which should be sufficient for understanding the rest of the book and interpreting analyses. Brief sample write-ups are also provided for select analyses in places to give the reader a starting point for writing up his or her own results for a thesis, dissertation, or publication. The book is meant to be an easy, user-friendly introduction to a wealth of statistical methods while simultaneously demonstrating their implementation in SPSS. Please contact me at daniel.denis@umontana.edu or email@datapsyc.com with any comments or corrections.

Glossary of Icons and Special Features

When you see this symbol, it means a brief sample write-up has been provided for the accompanying output. These brief write-ups can be used as starting points for writing up your own results for your thesis/dissertation or even publication. When you see this symbol, it means a special note, hint, or reminder has been provided or signifies extra insight into something not thoroughly discussed in the text. When you see this symbol, it means a special WARNING has been issued that, if not followed, may result in a serious error.

Acknowledgments

Thanks go out to Wiley for publishing this book, especially to Jon Gurstelle for presenting the idea to Wiley and securing the contract for the book, and to Mindy Okura-Marszycki for taking over the project after Jon left. Thank you Kathleen Pagliaro for keeping in touch about this project and the former book. Thanks also go out to everyone (far too many to mention) who has influenced me in one way or another in my views and philosophy about statistics and science, including undergraduate and graduate students whom I have had the pleasure of teaching (and learning from) in my courses taught at the University of Montana. This book is dedicated to all military veterans of the United States of America, past, present, and future, who teach us that all problems are relative.

1 Review of Essential Statistical Principles

Big Picture on Statistical Modeling and Inference

The purpose of statistical modeling is both to describe sample data and to make inferences from that sample data to the population from which the data were drawn. We compute statistics on samples (e.g. the sample mean) and use such statistics as estimators of population parameters (e.g. the population mean). When we use the sample statistic to estimate a parameter in the population, we are engaged in the process of inference, which is why such statistics are referred to as inferential statistics, as opposed to descriptive statistics, where we are typically simply describing something about a sample or population. All of this usually occurs in an experimental design (e.g. where we have a control vs. a treatment group) or a nonexperimental design (where we exercise little or no control over variables). As an example of an experimental design, suppose you wanted to learn whether a pill was effective in reducing symptoms from a headache. You could sample 100 individuals with
headaches, give them a pill, and compare their reduction in symptoms to 100 people suffering from a headache but not receiving the pill. If the group receiving the pill showed a decrease in symptomology compared with the nontreated group, it may indicate that your pill is effective. However, to estimate whether the effect observed in the sample data is generalizable and inferable to the population from which the data were drawn, a statistical test could be performed to indicate whether it is plausible that such a difference between groups could have occurred simply by chance. If it were found that the difference was unlikely due to chance, then we may indeed conclude a difference in the population from which the data were drawn. The probability of the data occurring under some assumption of (typically) equality is the infamous p-value, with the cutoff usually set at 0.05. If the probability of such data is relatively low (e.g. less than 0.05) under the null hypothesis of no difference, we reject the null and infer the statistical alternative hypothesis of a difference in population means. Much of statistical modeling follows a similar logic to that featured above – sample some data, apply a model to the data, and then estimate how well the model fits and whether there is inferential evidence to suggest an effect in the population from which the data were drawn. The actual model you fit to your data usually depends on the type of data you are working with. For instance, if you have collected sample means and wish to test differences between means, then t-test and ANOVA techniques are appropriate. On the other hand, if you have collected data in which you would like to see if there is a linear relationship between continuous variables, then correlation and regression are usually appropriate. If you have collected data on numerous dependent variables and believe these variables, taken together as a set, represent some kind of composite variable, and wish to determine mean differences on
this composite dependent variable, then a multivariate analysis of variance (MANOVA) technique may be useful. If you wish to predict group membership into two or more categories based on a set of predictors, then discriminant analysis or logistic regression would be an option. If you wished to take many variables and reduce them down to fewer dimensions, then principal components analysis or factor analysis may be your technique of choice. Finally, if you are interested in hypothesizing networks of variables and their interrelationships, then path analysis and structural equation modeling may be your model of choice (not covered in this book). There are numerous other possibilities as well, but overall, you should heed the following principle in guiding your choice of statistical analysis: The type of statistical model or method you select often depends on the types of data you have and your purpose for wanting to build a model. There usually is not one and only one method that is possible for a given set of data. The method of choice will often be dictated by the rationale of your research. You must know your variables very well, along with the goals of your research, to diligently select a statistical model.

1.1 Variables and Types of Data

Recall that variables are typically of two kinds – dependent or response variables and independent or predictor variables. The terms "dependent" and "independent" are most common in ANOVA-type models, while "response" and "predictor" are more common in regression-type models, though their usage is not uniform to any particular methodology. The classic function statement Y = f(X) tells the story – input a value for X (independent variable), and observe the effect on Y (dependent variable). In an independent-samples t-test, for instance, X is a variable with two levels, while the dependent variable is a continuous variable. In a classic one-way ANOVA, X has multiple levels. In a simple linear
regression, X is usually a continuous variable, and we use the variable to make predictions of another continuous variable, Y. Most of statistical modeling is simply observing an outcome based on something you are inputting into an estimated equation (estimated based on the sample data). Data come in many different forms. Though there are rather precise theoretical distinctions between different forms of data, for applied purposes we can summarize the discussion into the following types for now: (i) continuous and (ii) discrete. Variables measured on a continuous scale can, in theory, achieve any numerical value on the given scale. For instance, length is typically considered to be a continuous variable, since we can measure length to any specified numerical degree. That is, the distance between 6 and 10 in. on a scale contains an infinite number of measurement possibilities (e.g. 6.1852, 8.341364, etc.). The scale is continuous because it assumes an infinite number of possibilities between any two points on the scale and has no "breaks" in that continuum. On the other hand, if a scale is discrete, it means that between any two values on the scale, only a select number of possibilities can exist. As an example, the number of coins in my pocket is a discrete variable, since I cannot have 1.5 coins. I can have 1 coin, 2 coins, 3 coins, etc., but between those values there does not exist an infinite number of possibilities. Sometimes data are also categorical, which means values of the variable are mutually exclusive categories, such as A or B or C, or "boy" or "girl." Other times, data come in the form of counts, where instead of measuring something like IQ, we are only counting the number of occurrences of some behavior (e.g. the number of times I blink in a minute). Depending on the type of data you have, different statistical methods will apply. As we survey what SPSS has to offer, we identify variables as continuous, discrete, or categorical as we discuss the given method. However, do not get too
caught up with definitions here; there is always a bit of "fuzziness" in learning about the nature of the variables you have. For example, if I count the number of raindrops in a rainstorm, we would be hard pressed to call this "count data." We would instead just accept it as continuous data and treat it as such. Many times you have to compromise a bit between data types to best answer a research question. Surely, the average number of people per household does not make sense, yet census reports often give us such figures on "count" data. Always remember, however, that the software does not recognize the nature of your variables or how they are measured. You have to be certain of this information going in; know your variables very well, so that you can be sure SPSS is treating them as you had planned. Scales of measurement are also distinguished between nominal, ordinal, interval, and ratio. A nominal scale is not really measurement in the first place, since it simply assigns labels to the objects we are studying. The classic example is that of numbers on football jerseys. That one player has the number 10 and another the number 15 does not mean anything other than serving as labels to distinguish between the two players. If differences between numbers represent magnitudes, but the differences between the magnitudes are unknown or imprecise, then we have measurement at the ordinal level. For example, that a runner finished first and another second constitutes measurement at the ordinal level. Nothing is said of the time difference between the first and second runner, only that there is a "ranking" of the runners. If differences between numbers on a scale represent equal lengths, but an absolute zero point still cannot be defined, then we have measurement at the interval level. A classic example of this is temperature in degrees Fahrenheit – the difference between 10 and 20° represents the same amount of temperature distance as that between
20 and 30°; however, zero on the scale does not represent an "absence" of temperature. When we can ascribe an absolute zero point in addition to inferring the properties of the interval scale, then we have measurement at the ratio scale. The number of coins in my pocket is an example of ratio measurement, since zero on the scale represents a complete absence of coins. The number of car accidents in a year is another variable measurable on a ratio scale, since it is possible, however unlikely, that there were no accidents in a given year. The first step in choosing a statistical model is knowing what kind of data you have, whether they are continuous, discrete, or categorical, with some attention also devoted to whether the data are nominal, ordinal, interval, or ratio. Making these decisions can be a lot trickier than it sounds, and you may need to consult with someone for advice before selecting a model. Other times, it is very easy to determine what kind of data you have. But if you are not sure, check with a statistical consultant to help confirm the nature of your variables, because making an error at this initial stage of analysis can have serious consequences and jeopardize your data analyses entirely.

1.2 Significance Tests and Hypothesis Testing

In classical statistics, a hypothesis test is about the value of a parameter we are wishing to estimate with our sample data. Consider our previous example of the two-group problem regarding trying to establish whether taking a pill is effective in reducing headache symptoms. If there were no difference between the group receiving the treatment and the group not receiving the treatment, then we would expect the parameter difference to equal 0. We state this as our null hypothesis:

Null hypothesis: The mean difference in the population is equal to 0.

The alternative hypothesis is that the mean difference is not equal to 0. Now, if our sample means come out to be 50.0 for the control group and 50.0 for the treated group, then
it is obvious that we do not have evidence against the null hypothesis.

13.11 Hierarchical Cluster Analysis

Under Transform Values, we will choose not to standardize our data for this example (see Rencher and Christensen (2012) for a discussion of why you may [or may not] wish to standardize). The main output from the cluster analysis appears below:

[Agglomeration Schedule table: columns are Stage, Cluster Combined (Cluster 1, Cluster 2), Coefficients, Stage Cluster First Appears (Cluster 1, Cluster 2), and Next Stage, for stages 1 through 29; the Coefficients rise from 3.742 at stage 1 to 17.748 at stage 29.]

[Dendrogram using Single Linkage: Rescaled Distance Cluster Combine, showing the order in which the 30 observations are joined.]

The Agglomeration Schedule shows the stage at which clusters were combined; at stage 1, the two observations shown in the Cluster Combined columns were fused. The Coefficients column is a measure of the distance between the clusters as we move along in the stages. The Stage Cluster First Appears columns reveal the first time the given cluster made an appearance in the schedule (for stage 1, they read 0 and 0 because neither cluster had appeared yet). The Next Stage column reveals when the cluster will next be joined (notice "2" appears again in stage 4). The Dendrogram shows the historical progression of the linkages; the observations fused at the earliest stages of the schedule are the first to be joined in the dendrogram.

14 Nonparametric Tests

Most of the statistical models we have applied in this book have in one way or another made some distributional assumptions. For instance, in t-tests and ANOVA, we had to assume such things as normality of
population distributions and sampling distributions, and equality of population variances The central limit theorem helped us out with the assurance of normality of sampling distributions so long as our sample size was adequate In repeated measures, we saw how SPSS printed out Mauchly’s Test of Sphericity, which was used to evaluate another assumption we had to verify for data measured on the same subjects over time, the within‐subjects design discussed in earlier chapters In many research situations, however, it is either unfeasible or impossible that certain assumptions for a given statistical method are satisfied, and in some situations, we may know in advance that they definitely are not satisfied Such situations include, but are not restricted to, experiments or studies that feature very small samples For instance, in a t‐test situation with only 5–10 participants per group, it becomes virtually impossible to verify the assumption of normality, and due to the small sample size, we no longer have the central limit theorem to come to our “rescue” for assuming normality of sampling distributions Or, even if we can assume the data arise from normal populations, sample distributions may be nonetheless very skewed with heavy tails and outliers In these cases and others, carrying out so‐called parametric tests is usually not a good idea But not all is lost We can instead perform what are known as nonparametric tests on our data and still test null hypotheses of interest Such null hypotheses in the nonparametric situation will usually not be identical to null hypotheses tested in the parametric case, but they will be similar enough that the nonparametric tests can be considered “parallels” to the parametric ones For instance, for an independent‐samples t‐test, there is a nonparametric “equivalent.” This is a convenient way to think of nonparametrics Nonparametric tests are also very useful for dealing with situations in which our data is in the form of ranks Indeed, 
the calculation of many nonparametric tests first requires transforming ordinary measurements into ranks (e.g. similar to how we did for Spearman's rho). Overall, parametric tests are usually recommended over nonparametric tests when distributional assumptions are more or less feasible. Parametric tests will usually have more statistical power than their nonparametric counterparts when this is the case (Howell 2002). Also, when we perform nonparametric tests and convert data to ranks, for instance, we often lose information in our data. For example, measurements of scores 75 and 50 are reduced to first and second rank. Ranking data this way forces us to lose the measured "distance" between 75 and 50, which may be important to incorporate. Having said that, nonparametric tests are sometimes very convenient to perform, relatively easy to calculate by hand, and usually do not require extensive computing power. In this chapter, we survey a number of nonparametric tests. We discuss the essentials of each test by featuring hypothetical data, carry out the analysis in SPSS, and interpret results. It should be noted as well that many nonparametric tests have the option of computing an exact test, which essentially means computing a p-value based on the exact distribution of the statistic rather than through the asymptotic method, which assumes that, given a sufficiently large sample size, the distribution of the statistic will conform to distributional assumptions. Indeed, when computing such previously encountered tests as the binomial, the chi-square goodness-of-fit test, the Kolmogorov–Smirnov test, the phi coefficient, kappa, and others, we could have compared asymptotically derived p-values with their corresponding exact tests, though we often did not do so since SPSS usually reports asymptotically derived p-values by default. However, as a general rule, especially when you are using a very small sample size, you may wish to perform such a comparison and report the exact p-value especially
if it is much different from the default value (i.e. based on the asymptotic method) given by SPSS. In this chapter, and as a demonstration of the technique, we request the exact test when performing the Wilcoxon signed-rank test, but to save space we do not do so for other tests (sometimes SPSS will report it anyway, such as for the Mann–Whitney U). However, you should realize that with small samples especially, reporting exact tests may be requested by your thesis or dissertation committee or publication outlet. For further details on exact tests and how they are computed, see Ramsey and Schafer (2002).

14.1 Independent-samples: Mann–Whitney U

Nonparametric analogs to the independent-samples t-test in SPSS include the Mann–Whitney U test, or Wilcoxon rank-sum test (not to be confused with the Wilcoxon signed-rank test, to be discussed later, which is designed for matched samples or repeated measures). Recall that the null hypothesis in the independent-samples t-test was that population means were equal. The Mann–Whitney U goes about testing a different null hypothesis but with the same idea of comparing two groups. It simply tests the null hypothesis that both samples came from the same population in terms of ranks. The test only requires that measurements be made at least at the ordinal level. To demonstrate the test, recall the data we used for our independent-samples t-test in an earlier chapter on grades and the amount of time a student studied for the evaluation. In SPSS we select:

ANALYZE → NONPARAMETRIC TESTS → INDEPENDENT SAMPLES

Select Automatically compare distributions across groups, and move studytime under Test Fields and grade under Groups. Then under Settings, choose Customize tests, and check off Mann–Whitney U (two samples):

NPAR TESTS
  /M-W= studytime BY grade(0 1)
  /MISSING ANALYSIS.

When we run the Mann–Whitney U on the two samples, we obtain the following: We reject the null hypothesis that the distribution of studytime
is the same across categories of grade (p = 0.008).

A Mann–Whitney U test was performed to test the tenability of the null hypothesis that studytime groups were drawn from the same population. The test was statistically significant (p = 0.008), providing evidence that they were not.

14.2 Multiple Independent-samples: Kruskal–Wallis Test

When we have more than two independent samples, we would like to conduct a nonparametric counterpart to ANOVA. The Kruskal–Wallis test is one such test that is commonly used in this situation. The test is used to evaluate the probability that the independent samples arose from the same population. The test assumes the data are measured at least at the ordinal level. Recall our ANOVA data, where achievement was hypothesized to be a function of teacher. When we conducted the one-way ANOVA on these data in an earlier chapter, we rejected the null hypothesis of equal population means. For the Kruskal–Wallis, we proceed in SPSS the same way we did for the Mann–Whitney (moving ac to Test Fields and teach to Groups), but will select the K–W test instead of the M–W. To conduct the Kruskal–Wallis test in SPSS, we select:

ANALYZE → NONPARAMETRIC TESTS → INDEPENDENT SAMPLES

When we run the test, our decision is to reject the null hypothesis and conclude that the distributions of achievement are not the same across teacher.

A Kruskal–Wallis test was performed to evaluate the null hypothesis that the distribution of achievement scores is the same across levels of teach. A p-value of 0.001 was obtained, providing evidence that the distribution of achievement scores is not the same across teach groups.

14.3 Repeated Measures Data: The Wilcoxon Signed-rank Test and Friedman Test

When our data are paired, matched, or repeated, the Wilcoxon signed-rank test is a useful nonparametric alternative to the paired-samples t-test. The test incorporates the relative magnitudes of differences between conditions,
giving more weight to pairings that show large differences than to those that show small ones. The null hypothesis under test is that the samples arose from the same population. To demonstrate the test, recall our repeated-measures learning data from a previous chapter. For the purposes of demonstrating the Wilcoxon signed-rank test, we will consider only the first two trials. Our null hypothesis is that both trials were drawn from the same population. To conduct the signed-rank test, we select:

NONPARAMETRIC TESTS → LEGACY DIALOGS → TWO RELATED SAMPLES

Wilcoxon Signed Ranks Test

Ranks (trial_2 − trial_1):
  Negative Ranks (trial_2 < trial_1): N = 6, Mean Rank = 3.50, Sum of Ranks = 21.00
  Positive Ranks (trial_2 > trial_1): N = 0, Mean Rank = .00, Sum of Ranks = .00
  Ties (trial_2 = trial_1): N = 0
  Total: N = 6

Test Statistics (trial_2 − trial_1), based on positive ranks:
  Z = −2.207, Asymp. Sig. (2-tailed) = .027
  Exact Sig. (2-tailed) = .031, Exact Sig. (1-tailed) = .016, Point Probability = .016

The p-value obtained for the test is equal to 0.027. Hence, we can reject the null hypothesis and conclude that the median of the differences between trials is not equal to 0. Since the sample size is so small, obtaining an exact p-value is more theoretically appropriate, though it indicates the same decision on the null (selecting Exact and then checking off the appropriate tab yields a p-value of 0.031, two-tailed). Now, suppose we would like to analyze all three trials. We analyzed these data as a repeated measures design in a previous chapter. With three trials, we will conduct the Friedman test:

NONPARAMETRIC TESTS → LEGACY DIALOGS → K RELATED SAMPLES

NPAR TESTS
  /FRIEDMAN=trial_1 trial_2 trial_3
  /MISSING LISTWISE.

The Friedman test reports a statistically significant difference between trials, yielding a p-value of 0.002 (compare with the Exact test, try it),
The Friedman output follows:

Friedman Test
Ranks (Mean Rank): trial_1 = 3.00, trial_2 = 2.00, trial_3 = 1.00
Test Statistics: N = 6, Chi-Square = 12.000, df = 2, Asymp. Sig. = .002

As one option for a post hoc on this effect, we can run the Wilcoxon signed-rank test from earlier on each pairwise comparison (Leech et al., 2015). We see below that we have evidence to suggest that all pairs of trials differ (no correction on alpha was implemented; you may wish to apply one), as p-values range from 0.027 to 0.028 for each pair tested:

Wilcoxon Signed Ranks Tests (based on positive ranks)
  trial_2 − trial_1: Z = −2.207, Asymp. Sig. (2-tailed) = .027
  trial_3 − trial_1: Z = −2.201, Asymp. Sig. (2-tailed) = .028
  trial_3 − trial_2: Z = −2.207, Asymp. Sig. (2-tailed) = .027

A Wilcoxon signed-rank test was performed to evaluate the tenability of the null hypothesis that the two samples arose from the same population. The p-value under the null hypothesis was 0.027, providing evidence that the two samples were not drawn from the same population. The Friedman test was also used as the nonparametric alternative to a repeated-measures ANOVA on three trials. The test was statistically significant (p = 0.002), providing evidence that the samples were not drawn from the same population. Follow-up Wilcoxon signed-rank tests suggested that pairwise differences exist between all trials.

14.4 The Sign Test

The sign test can be used in situations where matched observations are obtained on pairs, or repeated observations are obtained on individuals, and we wish to compare the two groups, but in a rather crude fashion. We are not interested, or able in this case, to account for the magnitudes of the differences between the two measurements. We are interested only in whether the measurement increased or decreased, that is, only in the sign of the difference. Some hypothetical data will help demonstrate. Consider the following data on husband and wife marital satisfaction scores, measured out of 10, where 10 is “most happy” and 1 is “least happy”.
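Under the hood, the sign test is just a binomial test: drop any ties, count the positive differences, and refer that count to a Binomial(n, 0.5) distribution. A sketch with scipy's binomtest (available in scipy 1.7 and later), assuming the five-plus, five-minus sign pattern of the example; the underlying ratings themselves are not needed.

```python
# Sign test computed as a binomial test: count positive (husband - wife)
# differences and test the count against Binomial(n, 0.5).
# NOTE: the sign pattern below is the example's five-plus/five-minus
# pattern; ties (zero differences) would be dropped before counting.
from scipy import stats

signs = ["-", "+", "+", "+", "-", "+", "-", "-", "+", "-"]
n_plus = signs.count("+")
n = len(signs)

# binomtest requires scipy >= 1.7; default alternative is two-sided
result = stats.binomtest(n_plus, n=n, p=0.5)
print(f"{n_plus} of {n} positive signs, two-tailed p = {result.pvalue:.3f}")
```

With exactly half the signs positive, the two-tailed p-value is 1.000, in line with the SPSS output that follows.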
For the 10 pairs, the signs of the Husband − Wife differences were as follows:

Pair:        1  2  3  4  5  6  7  8  9  10
Sign (H−W):  −  +  +  +  −  +  −  −  +  −

If there were no differences overall on marital happiness scores between husbands and wives, what would we expect the distribution of signs (where we subtract wives' ratings from husbands') to be on average? We would expect the same number of + signs as − signs (i.e., five each). On the other hand, if there is an overall difference between marital satisfaction scores, then we would expect some disruption in this balance. For our data, notice that we have five negative signs and five positive signs, exactly what we would expect under the null hypothesis of no difference. Let us demonstrate this test in SPSS:

NONPARAMETRIC TESTS → LEGACY DIALOGS → TWO RELATED SAMPLES

Move husband and wife over to Test Pairs and check Sign under Test Type.

Sign Test
Frequencies (wife − husband): Negative Differences (wife < husband) = 5, Positive Differences (wife > husband) = 5, Ties (wife = husband) = 0, Total = 10
Test Statistics: Exact Sig. (2-tailed) = 1.000 (binomial distribution used)

We see that the two-tailed p-value for the test equals 1.000, which makes sense since we had an equal number of + and − signs. Deviations from this “ideal” situation under the null would have generated a p-value less than 1.000, and for us to reject the null, we would typically have required a p-value less than 0.05.

A sign test was performed on 10 pairs of husband and wife marital satisfaction scores. A total of five negative differences and five positive differences were found in the data, so the test returned a nonsignificant result (p = 1.000), providing no evidence that husbands and wives, overall, differ in their marital happiness.

Closing Remarks and Next Steps

This book has been about statistical analysis using SPSS. It is hoped the book has served, and will continue to serve, you well as a reference and as an introductory look at using
SPSS to address many of your common research questions. The book was purposely very light on theory and technical details so as to provide the fastest way to get started using SPSS for your thesis, dissertation, or publication. However, that does not mean you should stop here. There are scores of books and manuals written on SPSS that you should follow up with to advance your data analysis skills, as well as innumerable statistical and data analysis texts, both theoretical and applied, to consult if you are serious about learning more about statistics, data analysis, computational statistics, and the methodological issues that arise in research and in the use of statistics to address research questions. My earlier book, also with Wiley (Denis, 2016), surveys many of the topics presented in this book at a deeper theoretical level. Hays (1994) is a classic statistics text (targeted especially to psychologists) at a moderate technical level. Johnson and Wichern's classic multivariate text (Johnson and Wichern, 2007) should be consulted for a much deeper look at the technicalities behind multivariate analysis. Rencher and Christensen (2012) is also an excellent text in multivariate analysis, combining both theory and application. John Fox's text (Fox, 2016) is one of the very best books ever written on regression and associated techniques, including generalized linear models; even if somewhat challenging, it strikes the right mix of theory and application. If you have any questions about this book or need further guidance, please feel free to contact me at email@datapsyc.com or daniel.denis@umontana.edu, or simply visit www.datapsyc.com/front.html.

References

Agresti, A. (2002). Categorical Data Analysis. New York: Wiley.
Aiken, L.S. and West, S.G. (1991). Multiple Regression: Testing and Interpreting Interactions. London: Sage Publications.
Anderson, T.W. (2003). An Introduction to Multivariate Statistical Analysis. New York: Wiley.
Baron, R.M. and
Kenny, D.A. (1986). The moderator–mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology 51: 1173–1182.
Cohen, J.C. (1988). Statistical Power Analysis for the Behavioral Sciences. New York: Routledge.
Cohen, J., Cohen, P., West, S.G., and Aiken, L.S. (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. New Jersey: Lawrence Erlbaum Associates.
Denis, D. (2016). Applied Univariate, Bivariate, and Multivariate Statistics. New York: Wiley.
Draper, N.R. and Smith, H. (1995). Applied Regression Analysis. New York: Wiley.
Everitt, B. (2007). An R and S-PLUS Companion to Multivariate Analysis. New York: Springer.
Everitt, B. and Hothorn, T. (2011). An Introduction to Applied Multivariate Analysis with R. New York: Springer.
Fox, J. (2016). Applied Regression Analysis and Generalized Linear Models. New York: Sage Publications.
Hair, J., Black, B., Babin, B. et al. (2006). Multivariate Data Analysis. Upper Saddle River, NJ: Pearson Prentice Hall.
Hays, W.L. (1994). Statistics. Fort Worth, TX: Harcourt College Publishers.
Howell, D.C. (2002). Statistical Methods for Psychology. Pacific Grove, CA: Duxbury Press.
Jaccard, J. (2001). Interaction Effects in Logistic Regression. New York: Sage Publications.
Johnson, R.A. and Wichern, D.W. (2007). Applied Multivariate Statistical Analysis. Upper Saddle River, NJ: Pearson Prentice Hall.
Kirk, R.E. (1995). Experimental Design: Procedures for the Behavioral Sciences. New York: Brooks/Cole Publishing Company.
Kirk, R.E. (2008). Statistics: An Introduction. Belmont, CA: Thomson Wadsworth.
Kulas, J.T. (2008). SPSS Essentials: Managing and Analyzing Social Sciences Data. New York: Wiley.
Leech, N.L., Barrett, K.C., and Morgan, G.A. (2015). IBM SPSS for Intermediate Statistics: Use and Interpretation. New York: Routledge.
Little, R.J.A. and Rubin, D.B. (2002). Statistical Analysis with Missing Data. Hoboken, NJ: Wiley.
Meyers, L.S., Gamst, G., and Guarino, A.J. (2013). Applied
Multivariate Research: Design and Interpretation. London: Sage Publications.
Olson, C.L. (1976). On choosing a test statistic in multivariate analysis of variance. Psychological Bulletin 83: 579–586.
Petrocelli, J.V. (2003). Hierarchical multiple regression in counseling research: common problems and possible remedies. Measurement and Evaluation in Counseling and Development 36: 9–22.
Preacher, K.J. and Hayes, A.F. (2004). SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behavior Research Methods, Instruments, & Computers 36: 717–731.
Ramsey, F.L. and Schafer, D.W. (2002). The Statistical Sleuth: A Course in Methods of Data Analysis. New York: Duxbury.
Rencher, A.C. (1998). Multivariate Statistical Inference and Applications. New York: John Wiley & Sons.
Rencher, A.C. and Christensen, W.F. (2012). Methods of Multivariate Analysis. New York: Wiley.
Siegel, S. and Castellan, N.J. (1988). Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.
SPSS (2017). IBM Knowledge Center. Retrieved April 11, 2018, from https://www.ibm.com/support/knowledgecenter/en/SS3RA7_15.0.0/com.ibm.spss.modeler.help/dataaudit_displaystatistics.htm
Tabachnick, B.G. and Fidell, L.S. (2000). Using Multivariate Statistics. Boston, MA: Pearson.
Warner, R.M. (2013). Applied Statistics: From Bivariate Through Multivariate Techniques. London: Sage Publications.