First edition published 2000
Second edition published 2005
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act, 1988, this publication may be reproduced, stored or transmitted in any form, or by any means, only with the prior permission in writing of the publishers, or in the case of reprographic reproduction, in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.
Thousand Oaks, California 91320
SAGE Publications India Pvt Ltd
B 1/I 1 Mohan Cooperative Industrial Area
Typeset by C&M Digitals (P) Ltd, Chennai, India
Printed by Oriental Press, Dubai
Printed on paper from sustainable resources
1 Why is my evil lecturer forcing me to learn statistics? 1
1.2. What the hell am I doing here? I don’t belong here 1 2
1.3. Initial observation: finding something that needs explaining 1 3
1.7.4 Using a frequency distribution to go beyond the data 1 24
1.7.5 Fitting statistical models to the data 1 26
2 Everything you ever wanted to know about statistics
2.4.1 The mean: a very simple statistical model 1 35
2.4.2 Assessing the fit of the mean: sums of squares, variance and standard
2.6. Using statistical models to test research questions 1 48
4.2.2 Lies, damned lies, and … erm … graphs 1 90
4.3. The SPSS Chart Builder 1 91
4.4. Histograms: a good way to spot obvious problems 1 93
4.6.1 Simple bar charts for independent means 1 105
4.6.2 Clustered bar charts for independent means 1 107
4.6.4 Clustered bar charts for related means 1 111
4.6.5 Clustered bar charts for ‘mixed’ designs 1 113
5.4.1 Oh no, it’s that pesky frequency distribution again: checking
5.5.1 Doing the Kolmogorov–Smirnov test on SPSS 1 145
5.7.2 Dealing with non-normality and unequal variances 2 153
6 Correlation 166
6.3.1 A detour into the murky world of covariance 1 167
6.3.2 Standardization and the correlation coefficient 1 169
6.3.3 The significance of the correlation coefficient 3 171
6.3.5 A word of warning about interpretation: causality 1 173
6.5.1 General procedure for running correlations on SPSS 1 175
6.5.5 Biserial and point–biserial correlations 3 182
6.6.1 The theory behind part and partial correlation 2 186
6.6.3 Semi-partial (or part) correlations 2 190
7.2.1 Some important information about straight lines 1 199
7.2.3 Assessing the goodness of fit: sums of squares, R and R² 1 201
7.5.1 An example of a multiple regression model 2 210
7.6.1 Assessing the regression model I: diagnostics 2 214
7.6.2 Assessing the regression model II: generalization 2 220
7.7.1 Some things to think about before the analysis 2 225
8.3. What are the principles behind logistic regression? 3 265
8.3.1 Assessing the model: the log-likelihood statistic 3 267
8.3.3 Assessing the contribution of predictors: the Wald statistic 2 269
8.4.2 Incomplete information from the predictors 4 273
8.5. Binary logistic regression: an example that will make you feel eel 2 277
8.6.1 The initial model 2 282
8.9. Predicting several categories: multinomial logistic regression 3 300
8.9.1 Running multinomial logistic regression in SPSS 3 301
8.9.4 Interpreting the multinomial logistic regression output 3 306
9.2.1 A problem with error bar graphs of repeated-measures designs 1 317
9.2.2 Step 1: calculate the mean for each participant 2 320
9.2.4 Step 3: calculate the adjustment factor 2 322
9.2.5 Step 4: create adjusted values for each variable 2 323
9.4.1 Sampling distributions and the standard error 1 327
9.4.2. The dependent t-test equation explained 1 327
9.4.3. The dependent t-test and the assumption of normality 1 329
9.5.1. The independent t-test equation explained 1 334
What have I discovered about statistics? 1 345
11.3.1 Independence of the covariate and treatment effect 3 397
11.4.2 Initial considerations: testing the independence of the independent
11.4.3 The main analysis 2 401
11.5.1 What happens when the covariate is excluded? 2 404
11.7. Testing the assumption of homogeneity of regression slopes 3 413
12.2.2 An example with two independent variables 2 423
12.2.5 The residual sum of squares (SSR) 2 428
12.3.1 Entering the data and accessing the main dialog box 2 430
12.4.1 Output for the preliminary analysis 2 435
12.9. What to do when assumptions are violated in factorial ANOVA 3 454
Further reading 456
13.2.3 Assessing the severity of departures from sphericity 2 460
13.2.4 What is the effect of violating the assumption of sphericity? 3 460
13.2.5 What do you do if you violate sphericity? 2 461
13.3.4 The residual sum of squares (SSR) 2 467
13.3.7 The between-participant sum of squares 2 468
13.4.2 Defining contrasts for repeated-measures 2 471
13.4.3. Post hoc tests and additional options 3 471
13.5.1 Descriptives and other diagnostics 1 474
13.5.2 Assessing and correcting for sphericity: Mauchly’s test 2 474
13.9.4 The interaction effect (drink × imagery) 2 496
13.9.5 Contrasts for repeated-measures variables 2 498
13.11. Reporting the results from factorial repeated-measures ANOVA 2 502
13.12. What to do when assumptions are violated in repeated-measures ANOVA 3 503
Smart Alex’s tasks 504
14.5.4 The interaction between gender and looks 2 521
14.5.5 The interaction between gender and charisma 2 523
14.5.6 The interaction between attractiveness and charisma 2 524
14.5.7 The interaction between looks, charisma and gender 3 527
15.3. Comparing two independent conditions: the Wilcoxon rank-sum test and
15.3.2 Inputting data and provisional analysis 1 545
15.4. Comparing two related conditions: the Wilcoxon signed-rank test 1 552
15.4.1 Theory of the Wilcoxon signed-rank test 2 552
15.5. Differences between several independent groups: the Kruskal–Wallis test 1 559
15.5.2 Inputting data and provisional analysis 1 562
15.5.3 Doing the Kruskal–Wallis test on SPSS 1 562
15.5.4 Output from the Kruskal–Wallis test 1 564
15.5.5. Post hoc tests for the Kruskal–Wallis test 2 565
15.5.6 Testing for trends: the Jonckheere–Terpstra test 2 568
15.5.8 Writing and interpreting the results 1 571
15.6. Differences between several related groups: Friedman’s ANOVA 1 573
15.6.2 Inputting data and provisional analysis 1 575
15.6.5. Post hoc tests for Friedman’s ANOVA 2 577
15.6.7 Writing and interpreting the results 1 580
16.4.2 Some important matrices and their functions 3 590
16.4.3 Calculating MANOVA by hand: a worked example 3 591
16.4.4 Principle of the MANOVA test statistic 4 598
16.7.1 Preliminary analysis and testing assumptions 3 608
16.8. Reporting results from MANOVA 2 614
16.12.2 Univariate ANOVA or discriminant analysis? 624
What have I discovered about statistics? 2 682
18.5.5 Breaking down a significant chi-square test with standardized residuals 2 698
18.5.7 Reporting the results of chi-square 1 700
19.3.1 An example 2 730
19.4.1 Assessing the fit and comparing multilevel models 4 737
19.6.2 Ignoring the data structure: ANOVA 2 742
19.6.3 Ignoring the data structure: ANCOVA 2 746
19.6.4 Factoring in the data structure: random intercepts 3 749
19.6.5 Factoring in the data structure: random intercepts and slopes 4 752
19.6.6 Adding an interaction to the model 4 756
PREFACE
Social science students despise statistics. For one thing, most have a non-mathematical background, which makes understanding complex statistical equations very difficult. The major advantage in being taught statistics in the early 1990s (as I was) compared to the 1960s was the development of computer software to do all of the hard work. The advantage of learning statistics now rather than 15 years ago is that Windows™/MacOS™ enable us to just click on stuff rather than typing in horribly confusing commands (although, as you will see, we can still type in horribly confusing commands if we want to). One of the most commonly used of these packages is SPSS; what on earth possessed me to write a book on it?
You know that you’re a geek when you have favourite statistics textbooks; my favourites are Howell (2006), Stevens (2002) and Tabachnick and Fidell (2007). These three books are peerless as far as I am concerned and have taught me (and continue to teach me) more about statistics than you could possibly imagine. (I have an ambition to be cited in one of these books but I don’t think that will ever happen.) So, why would I try to compete with these sacred tomes? Well, I wouldn’t and I couldn’t (intellectually these people are several leagues above me). However, these wonderful and clear books use computer examples as addenda to the theory. The advent of programs like SPSS provides the unique opportunity to teach statistics at a conceptual level without getting too bogged down in equations. However, many SPSS books concentrate on ‘doing the test’ at the expense of theory. Using SPSS without any statistical knowledge at all can be a dangerous thing (unfortunately, at the moment SPSS is a rather stupid tool, and it relies heavily on the users knowing what they are doing). As such, this book is an attempt to strike a good balance between theory and practice: I want to use SPSS as a tool for teaching statistical concepts in the hope that you will gain a better understanding of both theory and practice.
Primarily, I want to answer the kinds of questions that I found myself asking while learning statistics and using SPSS as an undergraduate (things like ‘How can I understand how this statistical test works without knowing too much about the maths behind it?’, ‘What does that button do?’, ‘What the hell does this output mean?’). Like most academics I’m slightly high on the autistic spectrum, and I used to get fed up with people telling me to ‘ignore’ options or ‘ignore that bit of the output’. I would lie awake for hours in my bed every night wondering ‘Why is that bit of SPSS output there if we just ignore it?’ So that no student has to suffer the mental anguish that I did, I aim to explain what different options do, what bits of the output mean, and if we ignore something, why we ignore it. Furthermore, I want to be non-prescriptive. Too many books tell the reader what to do (‘click on this button’, ‘do this’, ‘do that’, etc.) and this can create the impression that statistics and SPSS are inflexible. SPSS has many options designed to allow you to tailor a given test to your particular needs. Therefore, although I make recommendations, within the limits imposed by the senseless destruction of rainforests, I hope to give you enough background in theory to enable you to make your own decisions about which options are appropriate for the analysis you want to do.
A second, not in any way ridiculously ambitious, aim was to make this the only statistics textbook that anyone ever needs to buy. As such, it’s a book that I hope will become your friend from first year right through to your professorship. I’ve tried, therefore, to write a book that can be read at several levels (see the next section for more guidance). There are chapters for first-year undergraduates (1, 2, 3, 4, 5, 6, 9 and 15), chapters for second-year undergraduates (5, 7, 10, 11, 12, 13 and 14) and chapters on more advanced topics that postgraduates might use (8, 16, 17, 18 and 19). All of these chapters should be accessible to everyone, and I hope to achieve this by flagging the level of each section (see the next section).
My third, final and most important aim is to make the learning process fun. I have a sticky history with maths because I used to be terrible at it:
Above is an extract of my school report at the age of 11. The ‘27’ in the report is to say that I came equal 27th with another student out of a class of 29. That’s almost bottom of the class. The 43 is my exam mark as a percentage! Oh dear. Four years later (at 15) this was my school report:
What led to this remarkable change? It was having a good teacher: my brother, Paul. In fact I owe my life as an academic to Paul’s ability to do what my maths teachers couldn’t: teach me stuff in an engaging way. To this day he still pops up in times of need to teach me things (a crash course in computer programming some Christmases ago springs to mind). Anyway, the reason he’s a great teacher is because he’s able to make things interesting and relevant to me. Sadly he seems to have got the ‘good teaching’ genes in the family (and he doesn’t even work as a bloody teacher, so they’re wasted!), but his approach inspires my lectures and books. One thing that I have learnt is that people appreciate the human touch, and so in previous editions I tried to inject a lot of my own personality and sense of humour (or lack of …). Many of the examples in this book, although inspired by some of the craziness that you find in the real world, are designed to reflect topics that play on the minds of the average student (i.e. sex, drugs, rock and roll, celebrity, people doing crazy stuff). There are also some examples that are there just because they made me laugh. So, the examples are light-hearted (some have said ‘smutty’ but I prefer ‘light-hearted’) and by the end, for better or worse, I think you will have some idea of what goes on in my head on a daily basis!
What’s new?
Seeing as some people appreciated the style of the previous editions I’ve taken this as a green light to include even more stupid examples, more smut and more bad taste. I apologise to those who think it’s crass, hate it, or think that I’m undermining the seriousness of science, but, come on, what’s not funny about a man putting an eel up his anus?
Aside from adding more smut, I was forced reluctantly to expand the academic content! Most of the expansions have resulted from someone (often several people) emailing me to ask how to do something. So, in theory, this edition should answer any question anyone has asked me over the past four years! Mind you, I said that last time and still the questions come (will I never be free?). The general changes in the book are:
• More introductory material: The first chapter in the last edition was like sticking your brain into a food blender. I rushed chaotically through the entire theory of statistics in a single chapter at the pace of a cheetah on speed. I didn’t really bother explaining any basic research methods, except when, out of the blue, I’d stick a section in some random chapter, alone and looking for friends. This time, I have written a brand-new Chapter 1, which eases you gently through the research process – why and how we do it. I also bring in some basic descriptive statistics at this point too.
• More graphs: Graphs are very important. In the previous edition information about plotting graphs was scattered about in different chapters, making it hard to find. What on earth was I thinking? I’ve now written a self-contained chapter on how to use SPSS’s Chart Builder. As such, everything you need to know about graphs (and I added a lot of material that wasn’t in the previous edition) is now in Chapter 4.
• More assumptions: All chapters now have a section towards the end about what to do when assumptions are violated (although these usually tell you that SPSS can’t do what needs to be done!).
• More data sets: You can never have too many examples, so I’ve added a lot of new data sets. There are 30 new data sets in the book at the last count (although I’m not very good at maths so it could be a few more or less).
• More stupid faces: I have added some more characters with stupid faces because I find stupid faces comforting, probably because I have one. You can find out more in the next section. Miraculously, the publishers stumped up some cash to get them designed by someone who can actually draw.
• More reporting your analysis: OK, I had these sections in the previous edition too, but then in some chapters I just seemed to forget about them for no good reason. This time every single chapter has one.
• More glossary: Writing the glossary last time nearly made me stick a vacuum cleaner into my ear to suck out my own brain. I thought I probably ought to expand it a bit. You can find my brain in the bottom of the vacuum cleaner in my house.
• New! It’s colour: The publishers went full colour. This means that (1) I had to redo all of the diagrams to take advantage of the colour format, and (2) if you lick the orange bits they taste of orange (it amuses me that someone might try this to see whether I’m telling the truth).
• New! Real-world data: Lots of people said that they wanted more ‘real data’ to play with. The trouble is that real research can be quite boring. However, just for you, I trawled the world for examples of research on really fascinating topics (in my opinion). I then stalked the authors of the research until they gave me their data. Every chapter now has a real research example.
• New! Self-test questions: Everyone loves an exam, don’t they? Well, everyone that is apart from people who breathe. Given how much everyone hates tests, I thought the best way to commit commercial suicide was to liberally scatter tests throughout each chapter. These range from simple questions to test out what you have just learned to going back to a technique that you read about several chapters before and applying it in a new context. All of these questions have answers to them on the companion website. They are there so that you can check on your progress.
• New! SPSS tips: SPSS does weird things sometimes. In each chapter, I’ve included boxes containing tips, hints and pitfalls related to SPSS.
• New! SPSS 17 compliant: SPSS 17 looks different to earlier versions but in other respects is much the same. I updated the material to reflect the latest editions of SPSS.
• New! Flash movies
• New! Additional material: Enough trees have died in the name of this book, but still it gets longer and still people want to know more. Therefore, I’ve written nearly 300 pages, yes, three hundred, of additional material for the book. So for some more technical topics and help with tasks in the book the material has been provided electronically so that (1) the planet suffers a little less, and (2) you can actually lift the book.
• New! Multilevel modelling
• Chapter 1 (Research methods)
• Chapter 3 (SPSS): The old Chapter 2 is now SPSS 17 compliant. I restructured a lot of the material, and added some sections on other forms of variables (strings and dates).
• Chapter 6 (Correlation): The old Chapter 4; I redid one of the examples, added some material on confidence intervals for r, the biserial correlation, testing differences between dependent and independent rs and how certain eminent statisticians hate each other.
• Chapter 7 (Regression): This chapter was already so long that the publishers banned me from extending it! Nevertheless I rewrote a few bits to make them clearer, but otherwise it’s the same but with nicer diagrams and the bells and whistles that have been added to every chapter.
• Chapter 8 (Logistic regression): I changed the main example from one about theory of mind (which is now an end of chapter task) to one about putting eels up your anus to cure constipation (based on a true story). Does this help you understand logistic regression? Probably not, but it really kept me entertained for days. I’ve extended the chapter to include multinomial logistic regression, which was a pain because I didn’t know how to do it.
• Chapter 9 (t-tests): I stripped a lot of the methods content to go in Chapter 1, so this chapter is more purely about the t-test now. I added some discussion on median splits, and doing t-tests from only the means and standard deviations.
• Similar to the old Chapter 9, but I added a section on assumptions that now discusses the need for the covariate and treatment effect to be independent. I also added some discussion of eta-squared and partial eta-squared (SPSS produces partial eta-squared but I ignored it completely in the last edition). Consequently I restructured much of the material in this example (and I had to create a new data set when I realized that the old one violated the assumption that I had just spent several pages telling people not to violate).
• Chapter 12 (GLM 3): This chapter is ostensibly the same as the old Chapter 10, but with nicer diagrams.
• Chapter 13 (GLM 4): This chapter is more or less the same as the old Chapter 11. I edited it down quite a bit and restructured material so there was less repetition. I added an explanation of the between-participant sum of squares also. The first example (tutors marking essays) is now an end of chapter task, and the new example is one about celebrities eating kangaroo testicles on television. It needed to be done.
• Chapter 16 (MANOVA): I rewrote a lot of the material on the interpretation of discriminant function analysis because I thought it pretty awful. It’s better now.
• Chapter 17 (Factor analysis): This chapter is very similar to the old Chapter 15. I wrote some material on interpretation of the determinant. I’m not sure why, but I did.
• Chapter 18 (Categorical data): This is similar to Chapter 16 in the previous edition. I added some material on interpreting standardized residuals.
• Chapter 19 (Multilevel linear models)
Goodbye
The first edition of this book was the result of two years (give or take a few weeks to write up my Ph.D.) of trying to write a statistics book that I would enjoy reading. The second edition was another two years of work and I was terrified that all of the changes would be the death of it. You’d think by now I’d have some faith in myself. Really, though, having spent an extremely intense six months in writing hell, I am still hugely anxious that I’ve just ruined the only useful thing that I’ve ever done with my life. I can hear the cries of lecturers around the world refusing to use the book because of cruelty to eels. This book has been part of my life now for over 10 years; it began and continues to be a labour of love. Despite this it isn’t perfect, and I still love to have feedback (good or bad) from the people who matter most: you.
Andy
(My contact details are at www.statisticshell.com.)
HOW TO USE THIS BOOK
What background knowledge do I need?
In essence, I assume you know nothing about statistics, but I do assume you have some very basic grasp of computers (I won’t be telling you how to switch them on, for example) and maths (although I have included a quick revision of some very basic concepts so I really don’t assume anything).
Do the chapters get more difficult as I go through the book?
In a sense they do (Chapter 16 on MANOVA is more difficult than Chapter 1), but in other ways they don’t (Chapter 15 on non-parametric statistics is arguably less complex than Chapter 14, and Chapter 9 on the t-test is definitely less complex than Chapter 8 on logistic regression). Why have I done this? Well, I’ve ordered the chapters to make statistical sense (to me, at least). Many books teach different tests in isolation and never really give you a grip of the similarities between them; this, I think, creates an unnecessary mystery. Most of the tests in this book are the same thing expressed in slightly different ways. So, I wanted the book to tell this story. To do this I have to do certain things such as explain regression fairly early on because it’s the foundation on which nearly everything else is built!
However, to help you through I’ve coded each section with an icon. These icons are designed to give you an idea of the difficulty of the section. It doesn’t necessarily mean you can skip the sections (but see Smart Alex in the next section), but it will let you know whether a section is at about your level, or whether it’s going to push you. I’ve based the icons on my own teaching so they may not be entirely accurate for everyone (especially as systems vary in different countries!):
1 This means ‘level 1’ and I equate this to first-year undergraduate in the UK. These are sections that everyone should be able to understand.
2 This is the next level and I equate this to second-year undergraduates in the UK. These are topics that I teach my second years and so anyone with a bit of background in statistics should be able to get to grips with them. However, some of these sections will be quite challenging even for second years. These are intermediate sections.
3 This is ‘level 3’ and represents difficult topics. I’d expect third-year (final-year) UK undergraduates and recent postgraduate students to be able to tackle these sections.
4 This is the highest level and represents very difficult topics. I would expect these sections to be very challenging to undergraduates and recent postgraduates, but postgraduates with a reasonable background in research methods shouldn’t find them too much of a problem.
Why do I keep seeing stupid faces everywhere?
Brian Haemorrhage: Brian’s job is to pop up to ask questions and look permanently confused. It’s no surprise to note, therefore, that he doesn’t look entirely different from the author. As the book progresses he becomes increasingly despondent. Read into that what you will.
Curious Cat: He also pops up and asks questions (because he’s curious). Actually the only reason he’s here is because I wanted a cat in the book … and preferably one that looks like mine. Of course the educational specialists think he needs a specific role, and so his role is to look cute and make bad cat-related jokes.
Cramming Sam: Samantha hates statistics. In fact, she thinks it’s all a boring waste of time and she just wants to pass her exam and forget that she ever had to know anything about normal distributions. So, she appears and gives you a summary of the key points that you need to know. If, like Samantha, you’re cramming for an exam, she will tell you the essential information to save you having to trawl through hundreds of pages of my drivel.
Jane Superbrain: Jane is the cleverest person in the whole universe (she makes Smart Alex look like a bit of an imbecile). The reason she is so clever is that she steals the brains of statisticians and eats them. Apparently they taste of sweaty tank tops, but nevertheless she likes them. As it happens, she is also able to absorb the contents of brains while she eats them. Having devoured some top statistics brains she knows all the really hard stuff and appears in boxes to tell you really advanced things that are a bit tangential to the main text. (Readers should note that Jane wasn’t interested in eating my brain. That tells you all that you need to know about my statistics ability.)
Labcoat Leni: Leni is a budding young scientist and he’s fascinated by real research. He says, ‘Andy, man, I like an example about using an eel as a cure for constipation as much as the next man, but all of your examples are made up. Real data aren’t like that, we need some real examples, dude!’ So off Leni went; he walked the globe, a lone data warrior in a thankless quest for real data. He turned up at universities, cornered academics, kidnapped their families and threatened to put them in a bath of crayfish unless he was given real data. The generous ones relented, but others? Well, let’s just say their families are sore. So, when you see Leni you know that you will get some real data, from a real research study to analyse. Keep it real.
Oliver Twisted: With apologies to Charles Dickens, Oliver, like his more famous fictional London urchin, is always asking, ‘Please sir, can I have some more?’ Unlike Master Twist, though, our young Master Twisted always wants more statistics information. Of course he does, who wouldn’t? Let us not be the ones to disappoint a young, dirty, slightly smelly boy who dines on gruel, so when Oliver appears you can be certain of one thing: there is additional information to be found on the companion website. (Don’t be shy; download it and bathe in the warm asp’s milk of knowledge.)
Satan’s Personal Statistics Slave: Satan is a busy boy – he has all of the lost souls to torture in hell; then there are the fires to keep fuelled, not to mention organizing enough carnage on the planet’s surface to keep Norwegian black metal bands inspired. Like many of us, this leaves little time for him to analyse data, and this makes him very sad. So, he has his own personal slave, who, also like some of us, spends all day dressed in a gimp mask and tight leather pants in front of SPSS analysing Satan’s data. Consequently, he knows a thing or two about SPSS, and when Satan’s busy spanking a goat, he pops up in a box with SPSS tips.
Smart Alex: Alex is a very important character because he appears when things get particularly difficult. He’s basically a bit of a smart alec and so whenever you see his face you know that something scary is about to be explained. When the hard stuff is over he reappears to let you know that it’s safe to continue. Now, this is not to say that all of the rest of the material in the book is easy, he just lets you know the bits of the book that you can skip if you’ve got better things to do with your life than read all 800 pages! So, if you see Smart Alex then you can skip the section entirely and still understand what’s going on. You’ll also find that Alex pops up at the end of each chapter to give you some tasks to do to see whether you’re as smart as he is.
What is on the companion website?
In this age of downloading, CD-ROMs are for losers (at least that’s what the ‘kids’ tell me) so this time around I’ve put my cornucopia of additional funk on that worldwide interweb thing. This has two benefits: (1) the book is slightly lighter than it would have been, and (2) rather than being restricted to the size of a CD-ROM, there is no limit to the amount of fascinating extra material that I can give you (although Sage have had to purchase a new server to fit it all on). To enter my world of delights, go to www.sagepub.co.uk/field3e (see the image on the next page).
How will you know when there are extra goodies on this website? Easy-peasy, Oliver Twisted appears in the book to indicate that there’s something you need (or something extra) on the website. The website contains resources for students and lecturers alike:
• Data files: You need data files to work through the examples in the book and they are all on the companion website. We did this so that you’re forced to go there and once you’re there you will never want to leave. There are data files here for a range of students, including those studying psychology, business and health sciences.
• Flash movies: Reading is a bit boring; it’s much more amusing to listen to me explaining things in my camp English accent. Therefore, so that you can all have ‘laugh at Andy’ parties, I have created flash movies for each chapter that show you how to do the SPSS examples. I’ve also done extra ones that show you useful things that would otherwise have taken me pages of drivel to explain. Some of these movies are open access, but because the publishers want to sell some books, others are available only to lecturers. The idea is that they can put them on their virtual learning environments. If they don’t, put insects under their office doors.
• Podcast: My publishers think that watching a film of me explaining what this book is all about is going to get people flocking to the bookshop. I think it will have people flocking to the medicine cabinet. Either way, if you want to see how truly uncharismatic I am, watch and cringe.
Trang 25Self-assessment multiple-choice questions
M
M : Organized by chapter, these will allow
you to test whether wasting your life reading this book has paid off so that you can
walk confidently into an examination much to the annoyance of your friends If you
fail said exam, you can employ a good lawyer and sue me
Flashcard glossary: As if a printed glossary wasn’t enough, my publishers insisted that you’d like one in electronic format too. Have fun here flipping about between terms and definitions that are covered in the textbook; it’s better than actually learning something.
Additional material: Enough trees have died in the name of this book, but still it gets longer and still people want to know more. Therefore, I’ve written nearly 300 pages, yes, three hundred, of additional material for the book. So for some more technical topics and help with tasks in the book the material has been provided electronically so that (1) the planet suffers a little less, and (2) you can actually lift the book.
Answers: Each chapter ends with a set of tasks for you to test your newly acquired expertise. The chapters are also littered with self-test questions. How will you know if you get these correct? Well, the companion website contains around 300 pages (that’s a different three hundred pages to the three hundred above) of detailed answers. Will I ever stop writing?
Cyberworms of knowledge: I have used nanotechnology to create cyberworms that crawl down your broadband connection, pop out of the USB port of your computer, then fly through space into your brain. They re-arrange your neurons so that you understand statistics. You don’t believe me? Well, you’ll never know for sure unless you visit the companion website …
Happy reading, and don’t get sidetracked by Facebook.
ACKNOWLEDGEMENTS
The first edition of this book wouldn’t have happened if it hadn’t been for Dan Wright, who not only had an unwarranted faith in a then-postgraduate to write the book, but also read and commented on draft chapters in all three editions. I’m really sad that he is leaving England to go back to the United States.
The last two editions have benefited from the following people emailing me with comments, and I really appreciate their contributions: John Alcock, Aliza Berger-Cooper, Sanne Bongers, Thomas Brügger, Woody Carter, Brittany Cornell, Peter de Heus, Edith de Leeuw, Sanne de Vrie, Jaap Dronkers, Anthony Fee, Andy Fugard, Massimo Garbuio, Ruben van Genderen, Daniel Hoppe, Tilly Houtmans, Joop Hox, Suh-Ing (Amy) Hsieh, Don Hunt, Laura Hutchins-Korte, Mike Kenfield, Ned Palmer, Jim Parkinson, Nick Perham, Thusha Rajendran, Paul Rogers, Alf Schabmann, Mischa Schirris, Mizanur Rashid Shuvra, Nick Smith, Craig Thorley, Paul Tinsley, Keith Tolfrey, Frederico Torracchi, Djuke Veldhuis, Jane Webster and Enrique Woll.
In this edition I have incorporated data sets from real research papers. All of these research papers are studies that I find fascinating and it’s an honour for me to have these researchers’ data in my book: Hakan Çetinkaya, Tomas Chamorro-Premuzic, Graham Davey, Mike Domjan, Gordon Gallup, Eric Lacourse, Sarah Marzillier, Geoffrey Miller, Peter Muris, Laura Nichols and Achim Schüetzwohl.
Jeremy Miles stopped me making a complete and utter fool of myself (in the book – sadly his powers don’t extend to everyday life) by pointing out some glaring errors; he’s also been a very nice person to know over the past few years (apart from when he’s saying that draft sections of my books are, and I quote, ‘bollocks’!). David Hitchin, Laura Murray, Gareth Williams and Lynne Slocombe made an enormous contribution to the last edition and all of their good work remains in this edition. In this edition, Zoë Nightingale’s unwavering positivity and suggestions for many of the new chapters were invaluable. My biggest thanks go to Kate Lester who not only read every single chapter, but also kept my research laboratory ticking over while my mind was on this book. I literally could not have done it without her support and constant offers to take on extra work that she did not have to do so that I could be a bit less stressed. I am very lucky to have her in my research team.
All of these people have taken time out of their busy lives to help me out. I’m not sure what that says about their mental states, but they are all responsible for a great many improvements. May they live long and their data sets be normal.
Not all contributions are as tangible as those above. With the possible exception of them not understanding why sometimes I don’t answer my phone, I could not have asked for more loving and proud parents – a fact that I often take for granted. Also, very early in my career Graham Hole made me realize that teaching research methods didn’t have to be dull. My whole approach to teaching has been to steal all of his good ideas and I’m pleased that he has had the good grace not to ask for them back! He is also a rarity in being brilliant, funny and nice. I also thank my Ph.D. students Carina Ugland, Khanya Price-Evans and Saeid Rohani for their patience for the three months that I was physically away in Rotterdam, and for the three months that I was mentally away upon my return.
I appreciate everyone who has taken time to write nice reviews of this book on the various Amazon sites around the world (or any other website for that matter!). The success of this book has been in no small part due to these people being so positive and constructive in their reviews. I continue to be amazed and bowled over by the nice things that people write and if any of you are ever in Brighton, I owe you a pint!
The people at Sage are less hardened drinkers than they used to be, but I have been very fortunate to work with Michael Carmichael and Emily Jenner. Mike, despite his failings on the football field(!), has provided me with some truly memorable nights out and he also read some of my chapters this time around which, as an editor, made a pleasant change. Both Emily and Mike took a lot of crap from me (especially when I was tired and stressed) and I’m grateful for their understanding. Emily I’m sure thinks I’m a grumpy sod, but she did a better job of managing me than she realizes. Also, Alex Lee did a fantastic job of turning the characters in my head into characters on the page. Thanks to Jill Rietema at SPSS Inc. who has been incredibly helpful over the past few years; it has been a pleasure working with her. The book (obviously) would not exist without SPSS Inc.’s kind permission to use screenshots of their software. Check out their web pages (http://www.SPSS.com) for support, contact information and training opportunities.
I wrote much of this edition while on sabbatical at the Department of Psychology at the Erasmus University, Rotterdam, The Netherlands. I’m grateful to the clinical research group (especially the white ape posse!) who so unreservedly made me part of the team. Part of me definitely stayed with you when I left – I hope it isn’t annoying you too much. Mostly, though, I thank Peter (Muris), Birgit (Mayer), Jip and Kiki who made me part of their family while in Rotterdam. They are all inspirational. I’m grateful for their kindness, hospitality, and for not getting annoyed when I was still in their kitchen having drunk all of their wine after the last tram home had gone. Mostly, I thank them for the wealth of happy memories that they gave me.
I always write listening to music. For the previous editions, I owed my sanity to: Abba, AC/DC, Arvo Pärt, Beck, The Beyond, Blondie, Busta Rhymes, Cardiacs, Cradle of Filth, DJ Shadow, Elliott Smith, Emperor, Frank Black and the Catholics, Fugazi, Genesis (Peter Gabriel era), Hefner, Iron Maiden, Jane’s Addiction, Love, Metallica, Massive Attack, Mercury Rev, Morrissey, Muse, Nevermore, Nick Cave, Nusrat Fateh Ali Khan, Peter Gabriel, Placebo, Quasi, Radiohead, Sevara Nazarkhan, Slipknot, Supergrass and The White Stripes. For this edition, I listened to the following, which I think tells you all that you need to know about my stress levels: 1349, Air, Angantyr, Audrey Horne, Cobalt, Cradle of Filth, Danzig, Dark Angel, Darkthrone, Death Angel, Deathspell Omega, Exodus, Fugazi, Genesis, High on Fire, Iron Maiden, The Mars Volta, Manowar, Mastodon, Megadeth, Meshuggah, Opeth, Porcupine Tree, Radiohead, Rush, Serj Tankian, She Said!, Slayer, Soundgarden, Taake, Tool and the Wedding Present.
Finally, all this book-writing nonsense requires many lonely hours (mainly late at night) of typing. Without some wonderful friends to drag me out of my dimly lit room from time to time I’d be even more of a gibbering cabbage than I already am. My eternal gratitude goes to Graham Davey, Benie MacDonald, Ben Dyson, Martin Watts, Paul Spreckley, Darren Hayman, Helen Liddle, Sam Cartwright-Hatton, Karina Knowles and Mark Franklin for reminding me that there is more to life than work. Also, my eternal gratitude to Gini Harrison, Sam Pehrson and Luke Anthony and especially my brothers of metal Doug Martin and Rob Mepham for letting me deafen them with my drumming on a regular basis. Finally, thanks to Leonora for her support while I was writing the last two editions of this book.
Like the previous editions, this book is dedicated to my brother Paul and my cat Fuzzy, because one of them is a constant source of intellectual inspiration and the other wakes me up in the morning by sitting on me and purring in my face until I give him cat food: mornings will be considerably more pleasant when my brother gets over his love of cat food for breakfast.
SYMBOLS USED IN THIS BOOK
Mathematical operators
Σ  This symbol (called sigma) means ‘add everything up’. So, if you see something like Σx_i it just means ‘add up all of the scores you’ve collected’.
Π  This symbol means ‘multiply everything’. So, if you see something like Πx_i it just means ‘multiply all of the scores you’ve collected’.
√x  This means ‘take the square root of x’.
Greek symbols
β  The probability of making a Type II error
β_i  Standardized regression coefficient
χ²  Chi-square test statistic
σ²  The variance in a population of data
τ  Kendall’s tau (non-parametric correlation coefficient)
ω²  Omega squared (an effect size measure). This symbol also means ‘expel the contents of your intestine immediately into your trousers’; you will understand why in due course
English symbols
b_i  The regression coefficient (unstandardized)
e_i  The error associated with the ith person
F  F-ratio (test statistic used in ANOVA)
k  The number of levels of a variable (i.e. the number of treatment conditions), or the number of predictors in a regression model
MS  The mean squared error (Mean Square). The average variability in the data
N, n, n_i  The sample size. N usually denotes the total sample size, whereas n usually denotes the size of a particular group
P  Probability (the probability value, p-value or significance of a test are usually denoted by p)
r_s  Spearman’s rank correlation coefficient
r_b, r_pb  Biserial correlation coefficient and point–biserial correlation coefficient respectively
R²  The coefficient of determination (i.e. the proportion of variance in the data explained by the model)
SS  The sum of squares, or sum of squared errors to give it its full title
SS_A  The sum of squares for variable A
SS_M  The model sum of squares (i.e. the variability explained by the model fitted to the data)
SS_R  The residual sum of squares (i.e. the variability that the model can’t explain – the error in the model)
SS_T  The total sum of squares (i.e. the total variability within the data)
t  Test statistic for Student’s t-test
T  Test statistic for Wilcoxon’s matched-pairs signed-rank test
W_s  Test statistic for Wilcoxon’s rank-sum test
X̄ or x̄  The mean of a sample of scores
z  A data point expressed in standard deviation units
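As a rough illustration of the three mathematical operators listed in this section (Σ, Π and √), here is a minimal sketch in Python. Python is not used anywhere in this book; it is chosen here purely as an assumption for illustration, and the list `scores` is made-up data rather than anything taken from the book’s examples.

```python
import math

# Hypothetical scores, invented purely for illustration
scores = [2, 3, 5]

# Sigma (Σx_i): add up all of the scores you've collected
total = sum(scores)

# Pi (Πx_i): multiply all of the scores you've collected
product = math.prod(scores)

# Square root (√x): take the square root of x, here x = 16
root = math.sqrt(16)

print(total, product, root)
```

With these made-up scores the sum is 2 + 3 + 5 and the product is 2 × 3 × 5, so you can check both by hand.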
SOME MATHS REVISION
1. Two negatives make a positive: Although in life two wrongs don’t make a right, in mathematics they do! When we multiply a negative number by another negative number, the result is a positive number. For example, −2 × −4 = 8.
2. A negative number multiplied by a positive one makes a negative number: If you multiply a positive number by a negative number then the result is another negative number. For example, 2 × −4 = −8, or −2 × 6 = −12.
3. BODMAS: This is an acronym for the order in which mathematical operations are performed. It stands for Brackets, Order, Division, Multiplication, Addition, Subtraction and this is the order in which you should carry out operations within an equation. Mostly these operations are self-explanatory (e.g. always calculate things within brackets first) except for order, which actually refers to power terms such as squares. Four squared, or 4², used to be called four raised to the order of 2, hence the reason why these terms are called ‘order’ in BODMAS (also, if we called it power, we’d end up with BPDMAS, which doesn’t roll off the tongue quite so nicely). Let’s look at an example of BODMAS: what would be the result of 1 + 3 × 5²? The answer is 76 (not 100 as some of you might have thought). There are no brackets so the first thing is to deal with the order term: 5² is 25, so the equation becomes 1 + 3 × 25. There is no division, so we can move on to multiplication: 3 × 25, which gives us 75. BODMAS tells us to deal with addition next: 1 + 75, which gives us 76 and the equation is solved. If I’d written the original equation as (1 + 3) × 5², then the answer would have been 100 because we deal with the brackets first: (1 + 3) = 4, so the equation becomes 4 × 5². We then deal with the order term, so the equation becomes 4 × 25 = 100!
4. http://www.easymaths.com is a good site for revising basic maths.
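The sign rules and the BODMAS examples in this maths revision can be checked with a short sketch. This assumes Python (not a tool this book uses), whose operator precedence happens to follow the same order for brackets, powers, multiplication and addition; the variable names are hypothetical and exist only for this illustration.

```python
# Sign rules from the revision notes
neg_times_neg = -2 * -4   # two negatives make a positive
pos_times_neg = 2 * -4    # positive times negative is negative
neg_times_pos = -2 * 6    # negative times positive is negative

# BODMAS: the order term (5 squared) is evaluated first,
# then the multiplication, then the addition
without_brackets = 1 + 3 * 5 ** 2    # 1 + 3 * 25 = 1 + 75

# Brackets are dealt with first: (1 + 3) = 4, then 4 * 25
with_brackets = (1 + 3) * 5 ** 2

print(neg_times_neg, pos_times_neg, neg_times_pos)
print(without_brackets, with_brackets)
```

The two printed lines should match the worked answers in the text: 8, −8 and −12 for the sign rules, then 76 and 100 for the two versions of the equation.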
1 Why is my evil lecturer forcing me to learn statistics?
Figure 1.1 When I grow up, please don’t let me be a statistics lecturer
1.1 What will this chapter tell me?
I was born on 21 June 1973. Like most people, I don’t remember anything about the first few years of life and like most children I did go through a phase of driving my parents mad by asking ‘Why?’ every five seconds. ‘Dad, why is the sky blue?’, ‘Dad, why doesn’t mummy have a willy?’ etc. Children are naturally curious about the world. I remember at the age of 3 being at a party of my friend Obe (this was just before he left England to return to Nigeria, much to my distress). It was a hot day, and there was an electric fan blowing cold air around the room. As I said, children are natural scientists and my little scientific brain was working through what seemed like a particularly pressing question: ‘What happens when you stick your finger into a fan?’ The answer, as it turned out, was that it hurts – a lot.1 My point is this: my curiosity to explain the world never went away, and that’s why I’m a scientist, and that’s also why your evil lecturer is forcing you to learn statistics. It’s because you have a curious mind too and you want to answer new and exciting questions. To answer these questions we need statistics. Statistics is a bit like sticking your finger into a revolving fan blade: sometimes it’s very painful, but it does give you the power to answer interesting questions. This chapter is going to attempt to explain why statistics are an important part of doing research. We will overview the whole research process, from why we conduct research in the first place, through how theories are generated, to why we need data to test these theories. If that doesn’t convince you to read on then maybe the fact that we discover whether Coca-Cola kills sperm will. Or perhaps not.
1 In the 1970s fans didn’t have helpful protective cages around them to prevent idiotic 3 year olds sticking their fingers into the blades.
1.2 What the hell am I doing here? I don’t belong here
You’re probably wondering why you have bought this book. Maybe you liked the pictures, maybe you fancied doing some weight training (it is heavy), or perhaps you need to reach something in a high place (it is thick). The chances are, though, that given the choice of spending your hard-earned cash on a statistics book or something more entertaining (a nice novel, a trip to the cinema, etc.) you’d choose the latter. So, why have you bought the book (or downloaded an illegal pdf of it from someone who has way too much time on their hands if they can scan an 800-page textbook)? It’s likely that you obtained it because you’re doing a course on statistics, or you’re doing some research, and you need to know how to analyse data. It’s possible that you didn’t realize when you started your course or research that you’d have to know this much about statistics but now find yourself inexplicably wading, neck high, through the Victorian sewer that is data analysis. The reason that you’re in the mess that you find yourself in is because you have a curious mind. You might have asked yourself questions like why people behave the way they do (psychology) or why behaviours differ across cultures (anthropology), how businesses maximize their profit (business), how did the dinosaurs die (palaeontology), does eating tomatoes protect you against cancer (medicine, biology), is it possible to build a quantum computer (physics, chemistry), is the planet hotter than it used to be and in what regions (geography, environmental studies)? Whatever it is you’re studying or researching, the reason you’re studying it is probably because you’re interested in answering questions. Scientists are curious people, and you probably are too. However, you might not have bargained on the fact that to answer interesting questions, you need two things: data and an explanation of those data.
The answer to ‘what the hell are you doing here?’ is, therefore, simple: to answer interesting questions you need data. Therefore, one of the reasons why your evil statistics lecturer is forcing you to learn about numbers is because they are a form of data and are vital to the research process. Of course there are forms of data other than numbers that can be used to test and generate theories. When numbers are involved the research involves quantitative methods, but you can also generate and test theories by analysing language (such as conversations, magazine articles, media broadcasts and so on). This involves qualitative methods and it is a topic for another book not written by me. People can get quite passionate about which of these methods is best, which is a bit silly because they are complementary, not competing, approaches and there are much more important issues in the world to get upset about. Having said that, all qualitative research is rubbish.2
2 This is a joke. I thought long and hard about whether to include it because, like many of my jokes, there are people who won’t find it remotely funny. Its inclusion is also making me fear being hunted down and forced to eat my own entrails by a horde of rabid qualitative researchers. However, it made me laugh, a lot, and despite being vegetarian I’m sure my entrails will taste lovely.
[Figure 1.2: diagram of the research process; surviving labels: ‘Data’, ‘Initial Observation’]
How do you go about answering an interesting question? The research process is broadly summarized in Figure 1.2. You begin with an observation that you want to understand, and this observation could be anecdotal (you’ve noticed that your cat watches birds when they’re on TV but not when jellyfish are on3) or could be based on some data (you’ve got several cat owners to keep diaries of their cat’s TV habits and have noticed that lots of them watch birds on TV). From your initial observation you generate explanations, or theories, of those observations, from which you can make predictions (hypotheses). Here’s where the data come into the process because to test your predictions you need data. First you collect some relevant data (and to do that you need to identify things that can be measured) and then you analyse those data. The analysis of the data may support your theory or give you cause to modify the theory. As such, the processes of data collection and analysis and generating theories are intrinsically linked: theories lead to data collection/analysis and data collection/analysis informs theories! This chapter explains this research process in more detail.
3 My cat does actually climb up and stare at the TV when it’s showing birds flying about.
1.3 Initial observation: finding something that needs explaining
The first step in Figure 1.2 was to come up with a question that needs an answer. I spend rather more time than I should watching reality TV. Every year I swear that I won’t get hooked on Big Brother, and yet every year I find myself glued to the TV screen waiting for the next contestant’s meltdown (I am a psychologist, so really this is just research – honestly). One question I am constantly perplexed by is why every year there are so many contestants with really unpleasant personalities (my money is on narcissistic personality disorder4) on the show. A lot of scientific endeavour starts this way: not by watching Big Brother, but by observing something in the world and wondering why it happens.
Having made a casual observation about the world (Big Brother contestants on the whole have profound personality defects), I need to collect some data to see whether this observation is true (and not just a biased observation). To do this, I need to define one or more variables that I would like to measure. There’s one variable in this example: the personality of the contestant. I could measure this variable by giving them one of the many well-established questionnaires that measure personality characteristics. Let’s say that I did this and I found that 75% of contestants did have narcissistic personality disorder. These data support my observation: a lot of Big Brother contestants have extreme personalities.
4 This disorder is characterized by (among other things) a grandiose sense of self-importance, arrogance, lack of empathy for others, envy of others and belief that others envy them, excessive fantasies of brilliance or beauty, the need for excessive admiration and exploitation of others.
1.4 Generating theories and testing them
The next logical thing to do is to explain these data (Figure 1.2). One explanation could be that people with narcissistic personality disorder are more likely to audition for Big Brother than those without. This is a theory. Another possibility is that the producers of Big Brother are more likely to select people who have narcissistic personality disorder to be contestants than those with less extreme personalities. This is another theory. We verified our original observation by collecting data, and we can collect more data to test our theories. We can make two predictions from these two theories. The first is that the proportion of people turning up for an audition who have narcissistic personality disorder will be higher than the general level in the population (which is about 1%). A prediction from a theory, like this one, is known as a hypothesis (see Jane Superbrain Box 1.1). We could test this hypothesis by getting a team of clinical psychologists to interview each person at the Big Brother audition and diagnose them as having narcissistic personality disorder or not. The prediction from our second theory is that if the Big Brother selection panel are more likely to choose people with narcissistic personality disorder then the rate of this disorder in the final contestants will be even higher than the rate in the group of people going for auditions. This is another hypothesis. Imagine we collected these data; they are in Table 1.1.
Table 1.1 […] had narcissistic personality disorder and whether they were selected as contestants by the producers
In total, 7662 people turned up for the audition. Our first hypothesis is that the percentage of people with narcissistic personality disorder will be higher at the audition than the general level in the population. We can see in the table that of the 7662 people at the audition, 854 were diagnosed with the disorder; this is about 11% (854/7662 × 100), which is much higher than the 1% we’d expect. Therefore, hypothesis 1 is supported by the data. The second hypothesis was that the Big Brother selection panel have a bias to choose people with narcissistic personality disorder. If we look at the 12 contestants that they selected, 9 of them had the disorder (a massive 75%). If the producers did not have a bias we would have expected only 11% of the contestants to have the disorder. The data again support our hypothesis. Therefore, my initial observation that contestants have personality disorders was verified by data, then my theory was tested using specific hypotheses that were also verified using data. Data are very important!
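The two percentages quoted above can be reproduced with a minimal sketch, again assuming Python purely for illustration (it is not a tool used in this book). The four counts are the ones given in the text; the function name `percentage` and the variable names are hypothetical.

```python
def percentage(part, whole):
    """Return part as a percentage of whole."""
    return part / whole * 100

# Counts quoted in the text
audition_total = 7662           # people who turned up for the audition
audition_with_disorder = 854    # of those, diagnosed with the disorder
contestants_total = 12          # contestants selected by the producers
contestants_with_disorder = 9   # of those, diagnosed with the disorder

# Hypothesis 1: rate at the audition versus about 1% in the population
audition_rate = percentage(audition_with_disorder, audition_total)

# Hypothesis 2: rate among selected contestants versus the audition rate
contestant_rate = percentage(contestants_with_disorder, contestants_total)

print(round(audition_rate), contestant_rate)
```

Comparing `audition_rate` with the 1% population level addresses the first hypothesis, and comparing `contestant_rate` with `audition_rate` addresses the second. This only reproduces the arithmetic; it says nothing about whether the differences would be statistically significant.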
JANE SUPERBRAIN 1.1
A good theory should allow us to make statements about the state of the world. Statements about the world are good things: they allow us to make sense of our world, and to make decisions that affect our future. One current example is global warming. Being able to make a definitive statement that global warming is happening, and that it is caused by certain practices in society, allows us to change these practices and, hopefully, avert catastrophe. However, not all statements are ones that can be tested using science. Scientific statements are ones that can be verified with reference to empirical evidence, whereas non-scientific statements are ones that cannot be empirically tested. So, statements such as ‘The Led Zeppelin reunion concert in London in 2007 was the best gig ever’,5 ‘Lindt chocolate is the best food’, and ‘This is the worst statistics book in the world’ are all non-scientific; they cannot be proved or disproved. Scientific statements can be confirmed or disconfirmed empirically. ‘Watching Curb Your Enthusiasm makes you happy’, ‘having sex increases levels of the neurotransmitter dopamine’ and ‘Velociraptors ate meat’ are all things that can be tested empirically (provided you can quantify and measure the variables concerned). Non-scientific statements can sometimes be altered to become scientific statements, so ‘The Beatles were the most influential band ever’ is non-scientific (because it is probably impossible to quantify ‘influence’ in any meaningful way) but by changing the statement to ‘The Beatles were the best-selling band ever’ it becomes testable (we can collect data about worldwide record sales and establish whether The Beatles have, in fact, sold more records than any other music artist). Karl Popper, the famous philosopher of science, believed that non-scientific statements were nonsense, and had no place in science. Good theories should, therefore, produce hypotheses that are scientific statements.
I would now be smugly sitting in my office with a contented grin on my face about how my theories and observations were well supported by the data. Perhaps I would quit while I’m ahead and retire. It’s more likely, though, that having solved one great mystery, my excited mind would turn to another. After another few hours (well, days probably) locked up at home watching Big Brother I would emerge triumphant with another profound observation, which is that these personality-disordered contestants, despite their obvious character flaws, enter the house convinced that the public will love them and that they will win.6 My hypothesis would, therefore, be that if I asked the contestants if they thought that they would win, the people with a personality disorder would say yes.
5 It was pretty awesome actually.
6 One of the things I like about Big Brother in the UK is that year upon year the winner tends to be a nice person, which does give me faith that humanity favours the nice.
Let’s imagine I tested my hypothesis by measuring their expectations of success in the show, by just asking them, ‘Do you think you will win Big Brother?’ Let’s say that 7 of 9 contestants with personality disorders said that they thought that they would win, which confirms my observation. Next, I would come up with another theory: these contestants think that they will win because they don’t realize that they have a personality disorder. My hypothesis would be that if I asked these people about whether their personalities were different from other people they would say ‘no’. As before, I would collect some more data and perhaps ask those who thought that they would win whether they thought that their personalities were different from the norm. All 7 contestants said that they thought their personalities were different from the norm. These data seem to contradict my theory. This is known as falsification, which is the act of disproving a hypothesis or theory.
It’s unlikely that we would be the only people interested in why individuals who go on Big Brother have extreme personalities and think that they will win. Imagine these researchers discovered that: (1) people with narcissistic personality disorder think that they are more interesting than others; (2) they also think that they deserve success more than others; and (3) they also think that others like them because they have ‘special’ personalities. This additional research is even worse news for my theory: if they didn’t realize that they had a personality different from the norm then you wouldn’t expect them to think that they were more interesting than others, and you certainly wouldn’t expect them to think that others will like their unusual personalities. In general, this means that my theory sucks: it cannot explain all of the data, predictions from the theory are not supported by subsequent data, and it cannot explain other research findings. At this point I would start to feel intellectually inadequate and people would find me curled up on my desk in floods of tears wailing and moaning about my failing career (no change there then).
At this point, a rival scientist, Fester Ingpant-Stain, appears on the scene with a rival theory to mine. In his new theory, he suggests that the problem is not that personality-disordered contestants don’t realize that they have a personality disorder (or at least a personality that is unusual), but that they falsely believe that this special personality is perceived positively by other people (put another way, they believe that their personality makes them likeable, not dislikeable). One hypothesis from this model is that if personality-disordered contestants are asked to evaluate what other people think of them, then they will overestimate other people’s positive perceptions. To test this hypothesis, Fester Ingpant-Stain collected yet more data. When each contestant came to the diary room they had to fill out a questionnaire evaluating all of the other contestants’ personalities, and also answer each question as if they were each of the contestants responding about them. (So, for every contestant there is a measure of what they thought of every other contestant, and also a measure of what they believed every other contestant thought of them.) He found out that the contestants with personality disorders did overestimate their housemates’ views of them; in comparison the contestants without personality disorders had relatively accurate impressions of what others thought of them. These data, irritating as it would be for me, support the rival theory that the contestants with personality disorders know they have unusual personalities but believe that these characteristics are ones that others would feel positive about. Fester Ingpant-Stain’s theory is quite good: it explains the initial observations and brings together a range of research findings. The end result of this whole process (and my career) is that we should be able to make a general statement about the state of the world. In this case we could state: ‘Big Brother contestants who have personality disorders overestimate how much other people like their personality characteristics’.
SELF-TEST Based on what you have read in this section, what qualities do you think a scientific theory should have?
1.5 Data collection 1: what to measure
We have seen already that data collection is vital for testing theories. When we collect data
we need to decide on two things: (1) what to measure, (2) how to measure it. This section
looks at the first of these issues.
To test hypotheses we need to measure variables. Variables are just things that can change (or
vary); they might vary between people (e.g. IQ, behaviour) or locations (e.g. unemployment)
or even time (e.g. mood, profit, number of cancerous cells). Most hypotheses can be expressed
in terms of two variables: a proposed cause and a proposed outcome. For example, if we take
the scientific statement ‘Coca-Cola is an effective spermicide’7 then the proposed cause is
‘Coca-Cola’ and the proposed effect is dead sperm. Both the cause and the outcome are variables: for
the cause we could vary the type of drink, and for the outcome, these drinks will kill different
amounts of sperm. The key to testing such statements is to measure these two variables.
A variable that we think is a cause is known as an independent variable (because its value
does not depend on any other variables). A variable that we think is an effect is called a
dependent variable because the value of this variable depends on the cause (independent
variable). These terms are very closely tied to experimental methods in which the cause
is actually manipulated by the experimenter (as we will see in section 1.6.2). In
cross-sectional research we don’t manipulate any variables, and we cannot make causal statements
about the relationships between variables, so it doesn’t make sense to talk of dependent and
independent variables because all variables are dependent variables in a sense. One
possibility is to abandon the terms dependent and independent variable and use the terms predictor
variable and outcome variable. In experimental work the cause, or independent variable, is
a predictor, and the effect, or dependent variable, is simply an outcome. This terminology
also suits cross-sectional work where, statistically at least, we can use one or more variables
to make predictions about the other(s) without needing to imply causality.
7 Actually, there is a long-standing urban myth that a post-coital douche with the contents of a bottle of Coke is
an effective contraceptive. Unbelievably, this hypothesis has been tested and Coke does affect sperm motility, and
different types of Coke are more or less effective – Diet Coke is best apparently (Umpierre, Hill, & Anderson,
1985). Nevertheless, a Coke douche is ineffective at preventing pregnancy.
When doing research there are some important generic terms for variables that you will encounter:
Independent variable: A variable thought to be the cause of some effect. This term is usually used in experimental research to denote a variable that the experimenter has manipulated.
Dependent variable: A variable thought to be affected by changes in an independent variable. You can think of this variable as an outcome.
Predictor variable: A variable thought to predict an outcome variable. This is basically another term for independent variable (although some people won’t like me saying that; I think life would be easier if we talked only about predictors and outcomes).
Outcome variable: A variable thought to change as a function of changes in a predictor variable. This term could be synonymous with ‘dependent variable’ for the sake of an easy life.
1.5.1.2. Levels of measurement
As we have seen in the examples so far, variables can take on many different forms and levels of sophistication. The relationship between what is being measured and the numbers that represent what is being measured is known as the level of measurement. Broadly speaking, variables can be categorical or continuous, and can have different levels of measurement.
A categorical variable is made up of categories. A categorical variable that you should be familiar with already is your species (e.g. human, domestic cat, fruit bat, etc.). You are a human or a cat or a fruit bat: you cannot be a bit of a cat and a bit of a bat, and neither a batman nor (despite many fantasies to the contrary) a catwoman (not even one in a nice PVC suit) exists. A categorical variable is one that names distinct entities. In its simplest form it names just two distinct types of things, for example male or female. This is known as a binary variable. Other examples of binary variables are being alive or dead, pregnant or not, and responding ‘yes’ or ‘no’ to a question. In all cases there are just two categories and an entity can be placed into only one of the two categories.
When two things that are equivalent in some sense are given the same name (or number), but there are more than two possibilities, the variable is said to be a nominal variable. It should be obvious that if the variable is made up of names it is pointless to do arithmetic on them (if you multiply a human by a cat, you do not get a hat). However, sometimes numbers are used to denote categories, for example the numbers worn by players in a rugby or football (soccer) team. In rugby, the numbers of shirts denote specific field positions, so the number 10 is always worn by the fly-half (e.g. England’s Jonny Wilkinson),8 and the number 1 is always the hooker (the ugly-looking player at the front of the scrum). These numbers do not tell us anything other than what position the player plays. We could equally have shirts with FH and H instead of 10 and 1. A number 10 player is not necessarily better than a number 1 (most managers would not want their fly-half stuck in the front of the scrum!). It is equally as daft to try to do arithmetic with nominal scales where the categories are denoted by numbers: the number 10 takes penalty kicks, and if the England coach found that Jonny Wilkinson (his number 10) was injured he would not get his number 4 to give number 6 a piggyback and then take the kick. The only way that nominal data can be used is to consider frequencies. For example, we could look at how frequently number 10s score tries compared to number 4s.
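To make the point about frequencies concrete, here is a minimal sketch in Python (not SPSS, and not something the text itself uses). The list of try scorers is invented purely for illustration, as are the names `try_scorers`, `labels` and `meaningless_mean`. It shows that counting how often each shirt number occurs is sensible, that swapping the numbers for letters changes nothing, and that a mean of the shirt numbers can be computed but tells us nothing.

```python
from collections import Counter

# Hypothetical data: the shirt number of the player who scored each try.
# The numbers are labels for field positions, not quantities.
try_scorers = [10, 4, 10, 1, 10, 4, 10]

# Frequencies are meaningful for nominal data: how often does each label occur?
tries_by_shirt = Counter(try_scorers)

# Relabelling the categories (e.g. FH and H instead of 10 and 1) leaves the
# frequencies untouched, which is what we expect of a nominal variable.
labels = {10: "FH", 1: "H", 4: "No4"}  # "No4" is a made-up label
tries_by_position = Counter(labels[n] for n in try_scorers)

# Python will happily average the codes, but the answer (7.0) is meaningless:
# there is no "average position" halfway between a hooker and a fly-half.
meaningless_mean = sum(try_scorers) / len(try_scorers)

print(tries_by_shirt)
print(tries_by_position)
```

The design choice here is simply to count: a frequency table is the only summary that survives relabelling the categories.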
8 Unlike, for example, NFL American football where a quarterback could wear any number from 1 to 19.
JANE SUPERBRAIN 1.2
A lot of self-report data are ordinal. Imagine if two judges at our beauty pageant were asked to rate Billie’s beauty on a 10-point scale. We might be confident that a judge who gives a rating of 10 found Billie more beautiful than one who gave a rating of 2, but can we be certain that the first judge found her five times more beautiful than the second? What about if both judges gave a rating of 8, could we be sure they found her equally beautiful? Probably not: their ratings will depend on their subjective feelings about what constitutes beauty. For these reasons, in any situation in which we ask people to rate something subjective (e.g. rate their preference for a product, their confidence about an answer, how much they have understood some medical instructions) we should probably regard these data as ordinal although many scientists do not.
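One way to see why the ‘five times more beautiful’ claim is shaky is to try it out. The sketch below is in Python and is only an illustration: the five ratings and the `recode` table are invented, and the recoding is deliberately arbitrary. If a rating scale is only ordinal, then any relabelling that keeps the scores in the same order is as good as the original. The order of the judges and the median rating survive such a relabelling, but the ratio between two ratings does not.

```python
from statistics import median

# Hypothetical ratings of the same person by five judges on a 10-point scale.
ratings = [2, 8, 8, 9, 10]

# An arbitrary, made-up recoding that preserves order but stretches the top
# of the scale. On an ordinal scale this is as legitimate as the original.
recode = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 20, 9: 30, 10: 100}
recoded = [recode[r] for r in ratings]

# The judges are ordered the same way before and after recoding.
judges = range(len(ratings))
same_order = (sorted(judges, key=lambda i: ratings[i])
              == sorted(judges, key=lambda i: recoded[i]))

# The ratio between the highest and lowest rating is not preserved.
ratio_before = ratings[4] / ratings[0]   # 10 / 2
ratio_after = recoded[4] / recoded[0]    # 100 / 2

# The median judge is the same judge either way, so the median simply
# follows the recoding.
median_matches = recode[median(ratings)] == median(recoded)

print(same_order, ratio_before, ratio_after, median_matches)
```

Summaries based on order (ranks, medians) are safe for ordinal data, whereas statements about ratios or differences depend on spacing that the scale does not guarantee.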