Friedman’s analysis of variance test: repeated-measures design



16.3.3 Friedman’s analysis of variance test: repeated-measures design

Friedman’s ANOVA is the non-parametric equivalent of the repeated-measures ANOVA, and is a generalisation of the Wilcoxon test. In other words, it is the Wilcoxon test applied to more than two groups. As in the Kruskal–Wallis, the formula for this test involves the ranks of the scores rather than the scores themselves.

The test, confusingly, is called a two-way ANOVA. This is because some people consider the participants as a factor, in a repeated-measures design, as we have mentioned previously (see Section 10.1). It is, however, what we know as a one-way ANOVA. In the following example, participants rated how alert they felt at various times of the day (on a scale of 1–5).

Morning   Lunchtime   Afternoon
1.00      2.00        3.00
2.00      4.00        5.00
1.00      2.00        2.00
2.00      1.00        3.00
1.00      3.00        3.00
2.00      5.00        5.00
1.00      2.00        3.00
2.00      2.00        2.00
1.00      3.00        3.00
2.00      1.00        3.00

Group   Symptom 1   Symptom 2   Group   Symptom 1   Symptom 2
1.00    3.00        1.00        2.00    3.00        5.00
1.00    4.00        3.00        2.00    2.00        2.00
1.00    5.00        4.00        3.00    4.00        5.00
1.00    2.00        2.00        3.00    5.00        3.00
1.00    3.00        1.00        3.00    4.00        4.00
2.00    4.00        2.00        3.00    2.00        4.00
2.00    5.00        5.00        3.00    3.00        6.00
2.00    4.00        3.00        3.00    2.00        2.00
2.00    2.00        2.00        3.00    3.00        3.00

Statistics without maths for psychology 542

SPSS: repeated-measures test for more than two conditions – Friedman’s test

Select Analyze, Nonparametric Tests, Legacy Dialogs and Related Samples:

This brings you to the following dialogue box:

CHAPTER 16 Non-parametric statistics 543

Move the test variables from the left-hand side to the right-hand side. Ensure that the Friedman box is checked. Choosing the Statistics option will enable you to obtain descriptives. Press OK. This obtains the following output:

Ranks

            Mean Rank
Morning     1.30
Lunch       2.00
Afternoon   2.70

Test Statistics (a)

N                   10
Chi-Square          12.250
df                  2
Asymp. Sig.         .002
Exact Sig.          .001
Point Probability   .000

a. Friedman Test

The χ² value is 12.25, with an associated probability of 0.001 (Exact Sig.). The differences in alertness ratings between the three times of day are unlikely to be due to sampling error.

Activity 16.6

Look at the following output from a three-group analysis. Participants were people with a chronic illness, who declined to take part in a behavioural intervention designed to help them. Measures of depression were taken at three timepoints for all participants. The researcher predicted that there would be significant differences between scores at the three different timepoints, but he did not predict the direction of the difference.

Friedman Ranks

        Mean Rank
TIME1   1.31
TIME2   1.88
TIME3   2.81

Test Statistics (a)

N            8
Chi-Square   12.250
df           2
Exact Sig.   .002

a. Friedman Test

What can you conclude from this analysis?
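As a rough cross-check on the alertness example (the Morning, Lunchtime and Afternoon ratings from the ten participants), the Friedman statistic can be recomputed by hand. The sketch below is not the SPSS procedure itself. It assumes the usual rank-sum formula with a correction for tied ranks, and the function names (rank_row, friedman) are our own illustrative choices.

```python
import math

# Alertness ratings (Morning, Lunchtime, Afternoon), one tuple per participant.
ratings = [
    (1, 2, 3), (2, 4, 5), (1, 2, 2), (2, 1, 3), (1, 3, 3),
    (2, 5, 5), (1, 2, 3), (2, 2, 2), (1, 3, 3), (2, 1, 3),
]


def rank_row(row):
    """Rank one participant's scores (1 = lowest); ties share the average rank."""
    ordered = sorted(row)
    ranks = []
    for score in row:
        first = ordered.index(score) + 1   # lowest position held by this score
        tied = ordered.count(score)        # number of scores sharing that value
        ranks.append(first + (tied - 1) / 2)
    return ranks


def friedman(data):
    """Return (mean ranks, tie-corrected chi-square, df) for a participants x conditions table."""
    n, k = len(data), len(data[0])
    ranked = [rank_row(row) for row in data]
    rank_sums = [sum(r[j] for r in ranked) for j in range(k)]
    chi2 = 12 / (n * k * (k + 1)) * sum(s ** 2 for s in rank_sums) - 3 * n * (k + 1)
    # Tie correction: each set of t tied scores within a participant adds t^3 - t.
    ties = sum(row.count(v) ** 3 - row.count(v) for row in data for v in set(row))
    chi2 /= 1 - ties / (n * k * (k ** 2 - 1))
    return [s / n for s in rank_sums], chi2, k - 1


mean_ranks, chi2, df = friedman(ratings)
# With df = 2 the chi-square upper-tail probability reduces to exp(-chi2 / 2);
# other df values need a proper chi-square distribution function.
p_asymptotic = math.exp(-chi2 / 2)

print([round(m, 2) for m in mean_ranks])   # [1.3, 2.0, 2.7]
print(round(chi2, 2), df)                  # 12.25 2
print(round(p_asymptotic, 3))              # 0.002
```

The mean ranks, the chi-square of 12.25 and the asymptotic probability of about .002 agree with the values reported for this example. The exact significance (.001) comes from the exact distribution of the statistic, which this sketch does not attempt to reproduce. If SciPy is installed, scipy.stats.friedmanchisquare applied to the three columns should give the same statistic.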


Example from the literature

Pre-operative distress factors predicting postoperative pain

Ferland and colleagues (2016) carried out a study of pain in adolescents undergoing surgery. They wanted to determine whether preoperative distress factors could predict postoperative pain. As part of this study, they assessed the change in cortisol over time. The authors say that because levels of cortisol do not have a parametric distribution, a non-parametric analysis of variance, i.e. Friedman’s ANOVA, was used to assess cortisol changes over time.

The changes were assessed at baseline, day of surgery, postoperative day 1, postoperative day 2, and at follow-up. The authors showed that differences were observed in cortisol concentrations (Friedman test=53.64, p < .0001).

The authors say that cortisol levels increased just before entering the operating room in comparison with baseline levels.

SPSS exercise

Exercise 5

Ten participants in a cognitive experiment learn low-, medium- and high-frequency words. Later they repeat as many words as they can remember in three minutes. The undergraduate student carrying out this project hypothesises that there will be a greater number of words recalled in the high-frequency condition. The scores under the three conditions are as follows:

Low     Medium   High
10.00   15.00    25.00
5.00    8.00     17.00
7.00    9.00     18.00
8.00    16.00    25.00
10.00   9.00     8.00
15.00   18.00    20.00
21.00   29.00    31.00
18.00   25.00    31.00
20.00   36.00    40.00
8.00    16.00    30.00

Perform a Friedman’s ANOVA on participants’ scores across the three conditions. Was the hypothesis supported?

Give your results, making sure you explain them in terms of the experiment.


Summary

• The tests used in this chapter are non-parametric tests, to be used when it is not possible to use parametric tests.

• Non-parametric tests transform the data as a first stage in calculating the test statistic.

• Non-parametric tests do not require normally distributed data or large samples.

• The non-parametric equivalent of Pearson’s r is Spearman’s rho.

• The non-parametric equivalents of the t-test are Mann–Whitney for independent samples, and Wilcoxon for related samples.

• The non-parametric equivalents of ANOVA are Kruskal–Wallis for independent samples, and Friedman’s ANOVA for related samples.

Discover the website at www.pearsoned.co.uk/dancey where you can test your knowledge with multiple choice questions and activities, discover more about topics using the links to relevant websites, and explore the interactive flowchart designed to help you find the right method of analysis.

Multiple choice questions

1. The Wilcoxon matched-pairs signed-ranks test (the Wilcoxon) is appropriate for:

(a) Within-participants designs (b) Between-participants designs (c) Matched-participants designs (d) Both (a) and (c) above

2. To assess the difference in scores from two conditions of a between-participants design, with ranked data, you would use:

(a) The independent t-test (b) The Wilcoxon (c) The Related t-test (d) Mann–Whitney

3. Look at the following partial printout of a Mann–Whitney U analysis from SPSS:

Ranks

        group     N    Mean Rank   Sum of Ranks
SCORE   Group 1   8    8.50        68
        Group 1   8    8.50        68
        Total     16

Test Statistics

                 SCORE
Mann–Whitney U   32.0
Wilcoxon W       68.0

The above information suggests that:

(a) There will be a statistically significant difference between conditions

(b) There will not be a statistically significant difference between conditions

(c) The results are indeterminate

(d) None of the above

4. The Wilcoxon matched-pairs signed-ranks test can be used when:

(a) There are two conditions

(b) The same participants take part in both conditions

(c) There is at least ordinal-level data

(d) All of the above

5. The Mann–Whitney U involves:

(a) The difference in the means for each condition

(b) The sum of the ranks for each condition

(c) Finding the difference in scores across conditions, then ranking these differences

(d) The difference in ranks across conditions

6. A Mann–Whitney test gives the following result:

U=9, p=0.1726 (2-tailed probability)

The researcher, however, made a prediction of the direction of the difference, and therefore needs to know the one-tailed probability. This is:

(a) 0.0863 (b) 0.863 (c) 0.1726 (d) Indeterminate

7. If, in a repeated-measures design with two conditions, you have a small number of participants, with skewed, ordinal data, the most appropriate inferential test is:

(a) Unrelated t-test (b) Related t-test (c) Mann–Whitney U test (d) Wilcoxon

8. If a Wilcoxon test shows that t=3 with an associated probability of 0.02, this means:

(a) Assuming the null hypothesis to be true, a t-value of 3 would occur 2% of the time through sampling variation

(b) We are 98% certain that our results are statistically significant

(c) Given our data, we expect to find a t-value of 3 occurring 2% of the time through chance

(d) If the null hypothesis is not true, then a t-value of 3 would occur 2% of the time through sampling variation


9. A t-value of 3 has been converted into a z-score of -3.2. This means:

(a) The calculations are incorrect

(b) There is not likely to be a statistically significant difference between conditions

(c) There is likely to be a statistically significant difference between conditions

(d) The results are indeterminate

Questions 10 to 12 relate to the following output:

Kruskal–Wallis Ranks

        GROUP   N    Mean Rank
SCORE   1.00    5    4.00
        2.00    5    7.30
        3.00    5    12.70
        Total   15

Test Statistics (a, b)

                    SCORE
Chi-Square          9.785
df                  2
Asymp. Sig.         .008
Exact Sig.          .003
Point Probability   .003

a. Kruskal–Wallis Test
b. Grouping Variable: GROUP

10. Which is the most sensible conclusion?

(a) There are no significant differences between the three groups, p > 0.05

(b) There are significant differences between the groups, p=0.003

(c) There are no significant differences between the groups, p=0.003

(d) Impossible to tell

11. Which group had the highest scores?

(a) Group 1 (b) Group 2 (c) Group 3 (d) Cannot tell

12. How many participants were in the study?

(a) 5 (b) 10 (c) 15 (d) 20


Questions 13 to 15 relate to the following output:

Friedman Ranks

           Mean Rank
BEFORE     1.50
AFTER      2.29
FOLLOWUP   2.21

Test Statistics (a)

N                   7
Chi-Square          2.960
df                  2
Asymp. Sig.         .228
Exact Sig.          .210
Point Probability   .210

a. Friedman Test

13. Which is the most sensible conclusion?

(a) There are differences between the groups, but these stand a 21% chance of being due to sampling error

(b) There are differences between the groups, and these are unlikely to be due to sampling error

(c) There are no differences between the three groups at all

(d) None of the above

14. How many participants were in the study?

(a) 7 (b) 14 (c) 21 (d) Cannot tell

15. The participants were measured:

(a) At two timepoints (b) At three timepoints (c) At four timepoints (d) Cannot tell

Questions 16 to 17 relate to the following table, taken from Holdcroft et al. (2003):

Spearman rank correlation coefficients for affective measures and pain

             Pain    P value
Tension      0.04    0.78
Autonomic    -0.09   0.55
Fear         -0.13   0.36
Punishment   -0.24   0.09


16. Which is the most appropriate statement? In general terms, the affective measures and pain show a:

(a) Weak relationship (b) Moderate relationship (c) Strong relationship (d) Perfect relationship

17. The strongest relationship is between pain and:

(a) Tension (b) Autonomic (c) Fear (d) Punishment

18. Look at the following text, taken from Daley, Sonuga-Barke and Thompson (2003). They were making comparisons between mothers of children with and without behavioural problems.

Mann–Whitney U tests were used . . . significant differences indicated that mothers of children with behavioural problems displayed less positive initial statements (z=-4.24) and relationships (z=-4.25), less warmth (z=-5.08) and fewer positive comments (z=-2.82), all p’s < .01

Of the four comparisons, which was the strongest?

(a) Initial statements (b) Relationships (c) Warmth (d) Positive comments

19. Look at the output below.

Correlations

                                                    Rating1   Rating2
Spearman’s rho   Rating1   Correlation Coefficient   1.000     .600
                           Sig. (2-tailed)           .         .000
                           N                         70        70
                 Rating2   Correlation Coefficient   .600      1.000
                           Sig. (2-tailed)           .000      .
                           N                         70        70

Which is the most appropriate statement? The relationship between the two ratings is:

(a) Strong (rho=0.7, p < 0.000) (b) Strong (rho=0.6, p < 0.001) (c) Moderate (r=0.7, p < 0.001) (d) Moderate (r=0.6, p < 0.000)

20. Look at the following table. Professor Green predicted that strength would relate positively to motivation. Unfortunately the professor meant to obtain one-tailed p-values. Professor Green wants you to interpret the results below, for a one-tailed hypothesis.

Correlations

                                                       Strength   Motivation
Spearman’s rho   Strength     Correlation Coefficient   1.000      .347
                              Sig. (2-tailed)           .          .094
                              N                         16         16
                 Motivation   Correlation Coefficient   .347       1.000
                              Sig. (2-tailed)           .094       .
                              N                         16         16

The relationship between Strength and Motivation is:

(a) Strong (rho=0.35, p=0.094) (b) Strong (rho=0.35, p=0.047) (c) Moderate (rho=0.35, p=0.094) (d) Moderate (rho=0.35, p=0.047)

References

Allen, K. L., McLean, N. J. and Byrne, S. M. (2012) ‘Evaluation of a new measure of mood intolerance, the Tolerance of Mood States Scale (TOMS): psychometric properties and associations with eating disorder symptoms’, Eating Behaviors, 13: 326–34.

Daley, D., Sonuga-Barke, E. J. S. and Thompson, M. (2003) ‘Assessing expressed emotion in mothers of preschool AD/HD children: psychometric properties of a modified speech sample’, British Journal of Clinical Psychology, 42: 53–67.

Etter, J-F. (2016) ‘Throat hit in users of the electronic cigarette: an exploratory study’, Psychology of Addictive Behaviors, 30(1): 93–100.

Ferland, C. E., Saran, N., Valois, T., Bote, S., Chorney, J. M., Stone, L. S. and Quellet, J. A. (2016) ‘Preoperative distress factors predicting postoperative pain in adolescents undergoing surgery: a preliminary study’, Journal of Pediatric Health Care, 31(1): 5–15.

Gould, D. D., Watson, S. L., Price, S. R. and Valliant, P. M. (2013) ‘The relationship between burnout and coping in adult and young offender center correctional officers: an exploratory investigation’, Psychological Services, 10(1): 37–47, 1541–59.

Holdcroft, A., Snidvongs, S., Cason, A., Dore, C. J. and Berkley, K. (2003) ‘Pain and uterine contractions during breast feeding in the immediate post-partum period increase with parity’, Pain, 104: 589–96.

Hsieh, S., Foxe, D., Leslie, F., Savage, S., Piquet, O. and Hodges, J. R. (2012) ‘Grief and joy: emotion word comprehension in the dementias’, Neuropsychology, 26(5): 624–30.

Jacobsson, L. and Lexell, J. (2016) ‘Life satisfaction after traumatic brain injury: comparison of ratings with the Life Satisfaction Questionnaire (LiSat-11) and the Satisfaction with Life Scale (SWLS)’, Health and Quality of Life Outcomes, 14: 10.

Sánchez, F. J., Bocklandt, S. and Vilain, E. (2013) ‘The relationship between help-seeking attitudes and masculine norms among monozygotic male twins discordant for sexual orientation’, Health Psychology, 32(1): 52–6.

Answers to multiple choice questions

1. d, 2. d, 3. b, 4. d, 5. b, 6. a, 7. d, 8. a, 9. c, 10. b, 11. c, 12. c, 13. a, 14. a, 15. b, 16. a, 17. d, 18. c, 19. b, 20. d

Answers to activities and SPSS exercises

Chapter 1

Activity 1.1

Wind speed – continuous

Degrees offered by a university – categorical

Level of extroversion – continuous

Makes of car – categorical

Division in which football teams play – categorical

Number of chess pieces ‘captured’ in a chess game – discrete

Weight of giant pandas – continuous

Number of paintings hanging in art galleries – discrete

Activity 1.2

The study is a quasi-experimental design. The researcher was interested in differences between believers and non-believers (skeptics) in the paranormal in terms of their perceptual biases.

The researchers have not randomly allocated the participants to the conditions of the IV (they were already either paranormal believers or skeptics). Thus this is quasi-experimental.

Activity 1.3

In the mirror drawing study you would introduce counterbalancing by dividing the participants into two groups. One group would receive the instructions emphasising accuracy the first time they completed the mirror drawing task, and then they would have instructions emphasising speed the second time. The second group of participants would complete the mirror drawing task first with instructions emphasising speed and then a second time with instructions emphasising accuracy.

Activity 1.4

To examine the causal relationship between caffeine and mathematical ability, you should have several groups that differ in terms of the amount of caffeine taken by participants. You could, for example, have four groups: one group has no caffeine, one group low levels of caffeine, one group moderate levels of caffeine and the final group high levels of caffeine.


You would then give each group the same mathematics test to complete. You could then compare the performance of each in the maths test to try to establish a causal relationship between the variables. You could also conduct the study as a within-participants design with each person taking part under all four conditions. Obviously in such a case you would need to use different but equivalent maths tests each time they completed it.

Chapter 2

SPSS exercises

Exercise 1

1. The IV in Dr Genius’s study is whether the participants were presented with adjectives or nouns.

2. The DV in this study is the number of words correctly remembered by each participant.

3. This is a between-participants design because the adjectives were presented to one group and the nouns to another group.

4. This is an experimental design because Dr Genius has randomly allocated the 20 participants to the two conditions.

5. The data for this particular study should be set up as shown below. There should be two variables. The first one should be a grouping variable and contain just a series of 1s and 2s. In our case, 1 might represent the adjective condition and 2 the noun condition.

The second variable would contain the number of words remembered by each participant.


Exercise 2

The data from the adjective/noun study should be input as follows if it were a within-participants design:

When we input the data for a within-participants design we need to set up two variables, one for each of the conditions. The first variable we have set up is for the adjectives condition and the second one for the nouns condition.

Chapter 3

Activity 3.1

The most suitable sample for the rugby vs football fans study would be one group of football fans and one group of rugby fans (although perhaps some would argue that the group of chimpanzees is just as appropriate).

Activity 3.2

The means, medians and modes are as follows:

(a) Mean=13.6, median=12, mode=12 (b) Mean=6.75, median=5, mode=5 (c) Mean=33.9, median=25.5, mode=32

Activity 3.3

The most appropriate measures of central tendency are as follows:

(a) Median (b) Mode (c) Mean (d) Median


Activity 3.4

No answers needed for this activity.

Activity 3.5

The following are the answers to the questions about the histogram:

(a) The mode is 4.

(b) The least frequent score is 8.

(c) Four people had a score of 5.

(d) Two people had a score of 2.

Activity 3.6

The following are the answers to the questions about the box plot:

(a) The median is 30.

(b) There are three extreme scores below the box plot itself.

Activity 3.7

The scattergram suggests that there is no real relationship between petrol prices and driver satisfaction. The dots in the scattergram appear to be randomly scattered between the axes.

Activity 3.8

The condition which has the greatest variation around the mean is the one with the poster of Bill Clinton. We would suggest that overall the variation around the mean for the control and Bill Clinton conditions is of similar magnitude. There appears to be quite a bit less variation around the mean in the Angela Merkel poster condition than in the other two conditions.

Activity 3.9

The only one of the examples given that is a normal distribution is (b).

SPSS exercises

Exercise 1

1. The IV in the lighting study is the presence or absence of red lighting.

2. The DV is the number of errors made by each data inputter.


3. The box plot for the difference in errors between the two conditions is presented below:

(a) The shortened whisker extending from the lower edge of the box plus the fact that the median is nearer this edge than the middle suggests that the distribution is positively skewed.

(b) The box plot shows several outliers both above and below the inner fences. The outliers are from scores 3, 4, 5, 13 and 14.

(c) The mean and standard deviation of the above set of scores can be obtained using the Explore option from the Summarize menu: the mean is 21.40 and the standard deviation 6.61.

Exercise 2

1. The IV in the drug study is whether or not the students took drugs during Dr Boering’s lectures.

2. The DV is the marks obtained in the end-of-term exam and this is a continuous variable measured with a discrete scale. It is continuous because the underlying knowledge of students of the subject tested in the exam is assumed to be continuous. It is simply measured on a discrete scale (%).


3. The histograms for the data from each condition are as follows:

(a) One could perhaps argue the case for both sets of scores being approximately normally distributed. The most frequently occurring scores are in the middle of the distributions and they tail off above and below the modes.

(b) Fortunately, the means and standard deviations for both sets of scores are presented with the histograms. Ordinarily we would have used the Explore command to generate both the histograms and the descriptive statistics. The mean and standard deviation for the drugs condition are 48.5 and 25.83 respectively. The mean and standard deviation for the no-drugs condition are 56.2 and 9.95 respectively. You should be able to see from these that taking drugs has led to a slightly lower exam score than in the no-drugs condition and has led to a much greater variability of scores. The standard deviation of the drugs condition is over 2.5 times that for the no-drugs condition.


Chapter 4

Activity 4.1

(a) Night following day: 1

(b) All politicians telling us the truth all the time: 0

(c) Your finding a cheque for a million pounds in the pages of this book: 0

(d) A wood fire being extinguished if you pour water on it: 1

(e) Authors having to extend the deadline for sending in manuscripts for books: 1

Activity 4.2

1. (a) 0.25=25%

(b) 0.99=99%

(c) 1/3=33.33%

(d) 2/10=20%

2. (a) 1/8=0.125

(b) 12/20=0.60

(c) 30%=0.30

(d) 14%=0.14

The probability of rolling an even number on a dice is 0.5.

Activity 4.3

(a) The probability of being struck by lightning while playing golf – conditional probability

(b) The probability of winning the Lottery – not conditional

(c) The probability of winning an Olympic gold medal if you do no training – conditional probability

(d) The probability of getting lung cancer if you smoke – conditional probability

(e) The probability of rolling a six on a die – not conditional

(f) The probability of finding a ten pound note in the pages of this book – not conditional

(g) The probability of manned flight to Mars within the next ten years – not conditional

(h) The probability of having coronary heart disease if you drink moderate levels of beer – conditional probability

Activity 4.4

If you have a negative z-score, it will be below the mean. With negative z-scores the majority of the population will score above you.

Activity 4.5

Your z-score for Mathematics would be 1 ((65 - 60)/5). For English it would be 0.86 ((71 - 65)/7). Therefore, your better subject in comparison with the others in your group is Mathematics.
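The arithmetic in this answer can be checked with a few lines of code. This is only an illustrative sketch: the helper name z_score is our own, and it simply applies the usual formula (score minus mean, divided by the standard deviation) to the figures quoted above.

```python
def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd


z_maths = z_score(65, 60, 5)     # Mathematics: score 65, mean 60, SD 5
z_english = z_score(71, 65, 7)   # English: score 71, mean 65, SD 7

print(z_maths, round(z_english, 2))   # 1.0 0.86
```

Because 1.0 is larger than 0.86, the Mathematics score sits further above its group mean than the English score does, which is why Mathematics is the better subject relative to the group.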
