10.7.3 Post-hoc tests/planned

This is the output obtained under the Compare main effects option, using Bonferroni:

Estimates

Measure: MEASURE_1

                                       95% Confidence Interval
AlcoholGroup   Mean     Std. Error   Lower Bound   Upper Bound
1              5.833    .777         4.123         7.543
2              6.167    .672         4.687         7.646
3              10.250   .880         8.313         12.187

The table above shows the mean score for each of the conditions, plus the 95% confidence limits.
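As a rough cross-check, the confidence limits in the Estimates output can be reproduced from each mean and its standard error as mean ± t × SE. The sketch below assumes 11 degrees of freedom (12 participants), which gives a critical t of about 2.201. That figure is not stated in the text; it is inferred because it reproduces the printed limits to within rounding.

```python
# 95% confidence limits for each condition mean: mean ± t_crit × standard error.
# ASSUMPTION: df = 11 (12 participants), so the two-tailed 5% critical t is about 2.201.
T_CRIT_DF11 = 2.201

# AlcoholGroup: (mean, standard error), copied from the Estimates output
estimates = {
    1: (5.833, 0.777),
    2: (6.167, 0.672),
    3: (10.250, 0.880),
}

def confidence_limits(mean, se, t_crit=T_CRIT_DF11):
    """Return (lower, upper) limits for one condition mean."""
    margin = t_crit * se
    return mean - margin, mean + margin

limits = {group: confidence_limits(m, se) for group, (m, se) in estimates.items()}

for group, (lower, upper) in limits.items():
    print(f"AlcoholGroup {group}: {lower:.3f} to {upper:.3f}")
```

Small discrepancies in the third decimal place are expected, because the means and standard errors used here are themselves rounded.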

CHAPTER 10 Analysis of differences between three or more conditions 317

The Pairwise Comparisons table compares each condition with every other condition, giving the mean difference between every pair, the standard error, the probability value and the 95% confidence limits around the mean difference.

The first row compares 1 (placebo) with 2 (low alcohol). The mean difference is 0.333. This is not statistically significant at any acceptable criterion value. This row also compares level 1 (placebo) with level 3 (high alcohol). The difference here is 4.417, and the associated probability level is 0.011. Have a go at interpreting the rest of the table yourself.

Pairwise Comparisons

Measure: MEASURE_1

                                                                                      95% Confidence Interval for Difference(a)
(I) AlcoholGroup   (J) AlcoholGroup   Mean Difference (I–J)   Std. Error   Sig.(a)   Lower Bound   Upper Bound
1                  2                  -.333                   .924         1.000     -2.939        2.272
                   3                  -4.417*                 1.196        .011      -7.790        -1.043
2                  1                  .333                    .924         1.000     -2.272        2.939
                   3                  -4.083*                 1.033        .007      -6.997        -1.170
3                  1                  4.417*                  1.196        .011      1.043         7.790
                   2                  4.083*                  1.033        .007      1.170         6.997

Based on estimated marginal means

* The mean difference is significant at the .05 level.

a. Adjustment for multiple comparisons: Bonferroni.
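To make the Bonferroni adjustment concrete, here is a minimal sketch. It computes each mean difference from the estimated means and then multiplies an unadjusted probability value by the number of comparisons, capping the result at 1. The unadjusted p-values are hypothetical (the output does not show them) and were chosen only so that the adjusted values resemble those in the Sig. column.

```python
from itertools import combinations

# Estimated marginal means for the three alcohol conditions (from the Estimates output)
means = {1: 5.833, 2: 6.167, 3: 10.250}

# HYPOTHETICAL unadjusted p-values, for illustration only; they are not given in the text.
unadjusted_p = {(1, 2): 0.40, (1, 3): 0.0037, (2, 3): 0.0023}

pairs = list(combinations(sorted(means), 2))
n_comparisons = len(pairs)  # three conditions give three pairwise comparisons

def bonferroni(p, m):
    """Bonferroni-adjusted p-value: multiply by the number of comparisons, cap at 1."""
    return min(1.0, p * m)

results = {}
for i, j in pairs:
    difference = means[i] - means[j]
    adjusted = bonferroni(unadjusted_p[(i, j)], n_comparisons)
    results[(i, j)] = (difference, adjusted)
    print(f"{i} vs {j}: mean difference = {difference:.3f}, adjusted p = {adjusted:.3f}")
```

The difference for conditions 1 and 2 comes out at -0.334 rather than -.333 only because the means entered here are rounded to three decimal places.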

Activity 10.4

Think about what the confidence limits are telling us. How would you explain the meaning of the confidence interval of the mean difference to a friend who did not understand the output?

The write-up is similar to the independent groups ANOVA. This time, however, we can say:

A repeated-measures ANOVA was carried out on the driving data. Assumptions of normality and homogeneity of variance were met. Using the Greenhouse–Geisser correction, results showed that there was a significant overall difference between conditions (F(2,20)=10.83, p=0.001); an overall effect size of 0.496 (partial η²) showed that 50% of the variation in error scores can be accounted for by differing levels of alcohol. Pairwise comparisons showed that the difference between the placebo and the low alcohol condition was minimal (mean difference=0.333, p=1.00) whereas the difference between the placebo and the high alcohol condition was large (mean difference=4.42, p=0.011, CI(95%) 1.04–7.79). There was also a significant difference between the low and the high alcohol conditions (mean difference=4.08, p=0.007, CI(95%) 1.17–7.00). It can therefore be concluded that the more alcohol is consumed, the greater the number of driver errors.

Statistics without maths for psychology 318

Personal reflection

Manna Alma, PhD

University Medical Center Groningen, Department of Health Sciences, Community and Occupational Medicine, The Netherlands

ARTICLE: The effectiveness of a multidisciplinary group rehabilitation program on the psychosocial functioning of elderly people who are visually impaired

Manna Alma says:

“Vision loss and its consequences on daily functioning require substantial psychosocial adjustment, a process many visually impaired persons are struggling with. The psychosocial impact of vision loss is profound, evidenced by deleterious effects on emotional adaptation, an elevated risk for depression, a high level of emotional distress, reduced mental health and a decline in life satisfaction. The psychosocial needs of those who are visually impaired should be part of their rehabilitation. Therefore, we developed a multidisciplinary group rehabilitation program, Visually Impaired Elderly Persons Participation – VIPP, which aims to promote adaptation to vision loss and to improve social functioning.

In that paper, we described the results of a pilot study on the impact of VIPP on psychosocial functioning of the visually impaired elderly. For a convincing estimation of the change in psychosocial functioning a randomized controlled trial is preferable. Since the pilot study was a first step in investigating the effectiveness of the VIPP-program, we used a single group pretest–posttest design. The results showed an increase in psychosocial functioning directly after the program. For some of the outcome measures the improvement appeared to be a temporary effect and was followed by a decline during the six months following the intervention. However, the six-months follow-up measure still indicated positive effects compared to baseline. This pilot study was a first step toward documenting the effect of VIPP on psychosocial functioning. Although the results are preliminary because of the small sample size and the research design, the results are promising.”

Example from the literature

The effectiveness of a multidisciplinary group rehabilitation program on the psychosocial functioning of elderly people who are visually impaired

Alma et al. (2013) carried out a group rehabilitation programme for visually impaired older people. They measured 29 people on psychosocial variables before an intervention. The intervention consisted of 20 weekly meetings which included practical training and education. The participants were measured at four time-points (baseline, halfway, immediately after the completion of the intervention, and at six-month follow-up). This, then, is a pre-post design, suitable for repeated-measures ANOVA. The authors state that they used Eta squared as a measure of effect size (ES).

The table of results is reproduced below. Note that the second column shows whether the overall ANOVAs are statistically significant. The five columns to the right show the F values and effect sizes for pairwise comparisons.

Comparison of the mean scores of the psychosocial outcome measures at pretest (T0), halfway (T1), posttest (T2) and at six-months follow-up (T3)

                                    ANOVA            T0-T1            T1-T2          T2-T3             T0-T2            T0-T3
Outcome measure                     F(a)      η²     F(b)      ES     F(b)    ES     F(b)       ES     F(b)      ES     F(b)      ES
Adaptation                          7.73***   0.24   15.33**   0.62   0.93    0.19   1.93       0.27   12.13**   0.57   10.41**   0.54
Helplessness                        2.80*     0.10   1.60      0.25   0.01    0.02   2.96       0.33   1.80      0.26   9.68**    0.53
Self-Efficacy                       4.90**    0.16   2.41      0.30   1.36    0.23   12.68***   0.58   7.94***   0.50   0.51      0.14
Mental Health                       1.83      0.07   0.32      0.11   3.69    0.36   1.89       0.27   4.45*     0.39   1.22      0.22
Fear of Falling (Generic)           1.53      0.06   0.09      0.06   0.63    0.17   3.59       0.35   0.96      0.20   0.87      0.18
Fear of Falling (Vision-specific)   1.95      0.07   8.27**    0.50   2.03    0.27   0.06       0.05   1.55      0.24   1.89      0.27

a. Degrees of freedom of the F-statistic were (3,75).

b. Degrees of freedom of the F-statistic were (1,25).

* p < 0.05; ** p < 0.01; *** p < 0.001.
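The link between an F value and its effect size can be checked with a little arithmetic. The sketch below uses the standard relation for partial eta squared, η² = (F × df effect) / (F × df effect + df error), with the degrees of freedom from footnotes a and b. The function name is hypothetical. The last line reflects an observation rather than anything stated in the text: the ES values for the pairwise comparisons appear consistent with the square root of that quantity.

```python
import math

def eta_squared_from_f(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Overall ANOVA F values from the table; footnote a gives df = (3, 75)
overall_f = {
    "Adaptation": 7.73,
    "Helplessness": 2.80,
    "Self-Efficacy": 4.90,
    "Mental Health": 1.83,
}
eta_sq = {name: eta_squared_from_f(f, 3, 75) for name, f in overall_f.items()}

for name, value in eta_sq.items():
    print(f"{name}: eta squared = {value:.2f}")

# Pairwise comparison T0-T1 for Adaptation; footnote b gives df = (1, 25).
# OBSERVATION (not stated in the text): the ES column matches the square root.
es_adaptation_t0_t1 = math.sqrt(eta_squared_from_f(15.33, 1, 25))
print(f"Adaptation T0-T1: ES = {es_adaptation_t0_t1:.2f}")
```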

Activity 10.5

Look again at the table above. Complete the authors’ interpretation by filling in the gaps. Check your answers in the Answers section:

The authors state: ‘The one-way repeated measures ANOVA (see Table) showed statistical significant differences for three of the five outcome measures. Large intervention effects were found for adaptation to vision loss (η²=0.24, p < .001) and ... (name of variable) (η²=0.16, p < ...) and a medium effect for ... (name of variable) (η²=... p < .046). There were medium effects for ... (name of variable) (η²=... p < .15), a generic fear of falling (η²=... p < .22), and ... (name of variable) (η²=... p < ...), although not statistically significant.’

Summary

• ANOVAs allow us to test for differences between three or more conditions.

• ANOVAs are suitable for data drawn from a normal population – they are parametric tests.

• ANOVAs allow us to assess the likelihood of having obtained an observed difference between some or all of the conditions by sampling error.

• Planned or post-hoc tests show us which conditions differ significantly from any of the other conditions.

• Partial η² is a correlation coefficient that can be used as a measure of effect in ANOVA. It lets us know, in percentage terms, how much variance in the scores of the dependent variable can be accounted for by the independent variable.

Discover the website at www.pearsoned.co.uk/dancey where you can test your knowledge with multiple choice questions and activities, discover more about topics using the links to relevant websites, and explore the interactive flowchart designed to help you find the right method of analysis.

SPSS exercises

Exercise 1

At the local university, students were randomly allocated to one of three groups for their laboratory work – a morning group, an afternoon group and an evening group. At the end of the session they were given 20 questions to determine how much they remembered from the session.

Enter the data from Table 10.7 into SPSS, analyse it by the use of ONEWAY (which is in the Compare Means menu), and obtain the results. Perform a post-hoc test. Copy down the important parts of the printout. Interpret your results in terms of the experiment. Were there differences between the groups, and, if so, in which direction?

Morning Afternoon Evening

P1 15 P11 14 P21 13

P2 10 P12 13 P22 12

P3 14 P13 15 P23 11

P4 15 P14 14 P24 11

P5 17 P15 16 P25 14

P6 13 P16 15 P26 11

P7 13 P17 15 P27 10

P8 19 P18 18 P28 9

P9 16 P19 19 P29 8

P10 16 P20 13 P30 10

Table 10.7 Data from morning, afternoon and evening laboratory groups
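If you want to cross-check your SPSS printout for Exercise 1, the F-ratio can be computed by hand from the data in Table 10.7. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and it returns the F-ratio and degrees of freedom only, not the probability value or the post-hoc tests.

```python
from statistics import mean

# Scores from Table 10.7 (questions remembered, out of 20)
groups = {
    "Morning": [15, 10, 14, 15, 17, 13, 13, 19, 16, 16],
    "Afternoon": [14, 13, 15, 14, 16, 15, 15, 18, 19, 13],
    "Evening": [13, 12, 11, 11, 14, 11, 10, 9, 8, 10],
}

def one_way_anova(samples):
    """Return (F, df_between, df_within) for a one-way independent-groups ANOVA."""
    all_scores = [x for sample in samples for x in sample]
    grand_mean = mean(all_scores)
    # Between-groups variation: how far each group mean lies from the grand mean
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-groups variation: how far each score lies from its own group mean
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_scores) - len(samples)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

f_ratio, df_between, df_within = one_way_anova(list(groups.values()))
for name, scores in groups.items():
    print(f"{name}: mean = {mean(scores):.1f}")
print(f"F({df_between},{df_within}) = {f_ratio:.2f}")
```

The group means printed first give the direction of any differences; the post-hoc test in SPSS is still needed to say which pairs of groups differ.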

Exercise 2

There is some evidence to show that smoking cannabis leads to short-term memory loss and reduced ability in simple tasks. Seven students, smokers who normally did not take cannabis, were recruited to answer difficult arithmetic questions, under four different conditions. In the placebo condition they smoked a herbal mixture, which they were told was cannabis. In condition 2 they smoked a small amount of cannabis, increasing to a large amount in condition 4. Students were required to smoke cannabis alone. To avoid practice effects, there were four different arithmetic tests, all at the same level of difficulty. To avoid the effects of order and fatigue, the order in which participants took the tests was counterbalanced. Results are shown in Table 10.8.

Enter the data into SPSS, analyse with a repeated-measures ANOVA, and write up the results in the appropriate manner.

Participant number Placebo Low dose Medium dose High dose

1 19 16 8 7

2 14 8 8 11

3 18 17 6 3

4 15 16 17 5

5 11 14 16 7

6 12 10 9 8

7 11 9 5 11

Table 10.8 Effect of cannabis smoking on fatigue
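For Exercise 2, the way a repeated-measures ANOVA partitions the variance can be sketched by hand with the data in Table 10.8. The code below is an illustrative sketch with a hypothetical function name. It assumes sphericity, so it gives the 'Sphericity Assumed' F-ratio and partial η² only; it does not apply the Greenhouse–Geisser correction or produce a probability value, for which the SPSS output is still needed.

```python
# Rows are participants 1 to 7; columns are Placebo, Low, Medium and High dose (Table 10.8)
scores = [
    [19, 16, 8, 7],
    [14, 8, 8, 11],
    [18, 17, 6, 3],
    [15, 16, 17, 5],
    [11, 14, 16, 7],
    [12, 10, 9, 8],
    [11, 9, 5, 11],
]

def repeated_measures_anova(rows):
    """Return (F, df_conditions, df_error, partial eta squared), sphericity assumed."""
    n = len(rows)       # number of participants
    k = len(rows[0])    # number of conditions
    grand_mean = sum(sum(row) for row in rows) / (n * k)
    condition_means = [sum(row[j] for row in rows) / n for j in range(k)]
    participant_means = [sum(row) / k for row in rows]

    ss_total = sum((x - grand_mean) ** 2 for row in rows for x in row)
    ss_conditions = n * sum((m - grand_mean) ** 2 for m in condition_means)
    # Individual differences are removed from the error term in a related design
    ss_participants = k * sum((m - grand_mean) ** 2 for m in participant_means)
    ss_error = ss_total - ss_conditions - ss_participants

    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_conditions / df_conditions) / (ss_error / df_error)
    partial_eta_sq = ss_conditions / (ss_conditions + ss_error)
    return f_ratio, df_conditions, df_error, partial_eta_sq

f_ratio, df_conditions, df_error, partial_eta_sq = repeated_measures_anova(scores)
print(f"F({df_conditions},{df_error}) = {f_ratio:.2f}, partial eta squared = {partial_eta_sq:.3f}")
```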

Multiple choice questions

1. Parametric one-way independent ANOVA is a generalisation of:

(a) The paired t-test
(b) The independent t-test
(c) χ²
(d) Pearson’s r

Questions 2 to 4 are based on the following information:

Alice, a third-year student, noticed that she and her friends learnt more statistics when they were in Madame MacAdamia’s class than in Professor P. Nutt’s. They could not determine whether this was due to the style of the teaching or the content of the lectures, which differed somewhat. For her third-year project, therefore, she persuaded three statistics lecturers to give the same statistics lecture, but to use their usual lecturing styles. First-year students were allotted randomly to the three different lecturers, for one hour. At the end of the lecture, they were tested on their enjoyment of the lecture (ENJOYMENT), and also on what they had learnt in the lecture (KNOWLEDGE). Alice then conducted a one-way ANOVA on the results. This is the SPSS printout for ENJOYMENT:

ANOVA

ENJOYMENT

                 Sum of Squares   df    Mean Square   F       Sig.
Between Groups   94.4308          2     47.2154       .4893   .6141
Within Groups    13798.1240       143   96.4904
Total            13892.5548       145

2. Which is the most appropriate conclusion?

(a) There are statistically significant differences between the three groups of students on ENJOYMENT
(b) There are important differences between the three groups but these are not statistically significant

3. The following is also given with the above printout:

Descriptives

ENJOYMENT

       Mean
1.00   62.9091
2.00   62.9063
3.00   61.2041

Test of Homogeneity of Variances

ENJOYMENT

Levene Statistic   df1   df2   Sig.
1.3343             2     143   .267

What can you conclude from this?

(a) The variances of the groups are significantly different from each other
(b) The variances of the groups are similar

4. Here are the results for the KNOWLEDGE questionnaire, which the students completed after their one-hour lecture:

ANOVA

KNOWLEDGE

                 Sum of Squares   df    Mean Square   F        Sig.
Between Groups   110.3100         2     55.1550       5.3557   .0057
Within Groups    1482.9689        144   10.2984
Total            1593.2789        146

Descriptives

KNOWLEDGE

                 Mean
1.00 Cashew      12.3235
2.00 P.Nutt      10.5781
3.00 MacAdamia   10.0408

Which is the most sensible conclusion?

(a) There are significant differences between the groups on KNOWLEDGE; specifically, Colin Cashew’s group retained more of the lecture than the other two groups
(b) There are significant differences between the groups on KNOWLEDGE; specifically, Madame MacAdamia’s group retained more of the lecture than Professor P. Nutt’s group
(c) There are significant differences between all of the groups on KNOWLEDGE; specifically, Professor P. Nutt’s group retained more of the lecture than the other two groups
(d) There are no significant differences between the groups on KNOWLEDGE

5. The F-ratio is a result of:

(a) Within-groups variance / between-groups variance
(b) Between-groups variance / within-groups variance
(c) Between-groups variance × within-groups variance
(d) Between-groups variance + within-groups variance

6. The relationship between the F-ratio and t-value is explained by:

(a) t³ = F (b) F² = t (c) t² = F (d) F³ = t

7. Professor P. Nutt is examining the differences between the scores of three groups of participants. If the groups show homogeneity of variance, this means that the variances for the groups:

(a) Are similar (b) Are dissimilar (c) Are exactly the same (d) Are enormously different

8. Differences between groups, which result from our experimental manipulation, are called:

(a) Individual differences
(b) Treatment effects
(c) Experiment error
(d) Within-participants effects

9. Herr Hazelnuss is thinking about whether he should use a related or unrelated design for one of his studies. As usual, there are advantages and disadvantages to both. He has four conditions. If, in a related design, he uses ten participants, how many would he need for an unrelated design?

(a) 40 (b) 20 (c) 10 (d) 100

10. Individual differences within each group of participants are called:

(a) Treatment effects
(b) Between-participants error
(c) Within-participants error
(d) Individual biases

11. Dr Colin Cashew allots each of 96 participants randomly to one of four conditions. As Colin Cashew is very conscientious, he meticulously inspects his histograms and other descriptive statistics, and finds that his data are perfectly normally distributed. In order to analyse the differences between the four conditions, the most appropriate test to use is:

(a) One-way between-groups ANOVA
(b) t-test
(d) Repeated-measures ANOVA

12. The assumption of sphericity means that:

(a) The variances of all the sample groups should be similar
(b) The variances of the population difference scores should be the same for any two conditions
(c) The variances of all the population difference scores should be similar
(d) The variances of all the sample groups should be dissimilar

13. If, in an analysis of variance, you obtain a partial η² of 0.52, then how much of the variance in scores on the dependent variable can be accounted for by the independent variable?

(a) 9%

(b) 52%

(d) 27%

14. Calculating how much of the total variance is due to error and the experimental manipulation is called:

(a) Calculating the variance (b) Partitioning the variance (c) Producing the variance (d) Summarising the variance

15. The following is output relating to a post-hoc test, after a one-way ANOVA:

Multiple Comparisons

Dependent Variable: Current Salary
Tukey HSD

                                                                                                95% Confidence Interval
(I) Employment Category   (J) Employment Category   Mean Difference (I–J)   Std. Error   Sig.   Lower Bound    Upper Bound
Clerical                  Custodial                 -$3,100.35              $2,023.76    .276   -$7,843.44     $1,642.74
                          Manager                   -$36,139.26*            $1,228.35    .000   -$39,018.15    -$33,260.37
Custodial                 Clerical                  $3,100.35               $2,023.76    .276   -$1,642.74     $7,843.44
                          Manager                   -$33,038.91*            $2,244.41    .000   -$38,299.13    -$27,778.69
Manager                   Clerical                  $36,139.26*             $1,228.35    .000   $33,260.37     $39,018.15
                          Custodial                 $33,038.91*             $2,244.41    .000   $27,778.69     $38,299.13

* The mean difference is significant at the .05 level.

Which groups differ significantly from each other?

(a) Clerical and custodial occupations only
(b) Custodial and manager occupations only
(c) Manager and clerical occupations only
(d) Manager and clerical plus manager and custodial

16. Look at the following output, which relates to a repeated-measures ANOVA with three conditions.

Assume sphericity has been violated.

Tests of Within-Subjects Effects

Measure: MEASURE_1

Source                                 Type III Sum of Squares   df      Mean Square   F       Sig.   Partial Eta Squared
FACTOR1           Sphericity Assumed   542.857                   2       271.429       7.821   .007   .566
                  Greenhouse–Geisser   542.857                   1.024   529.947       7.821   .030   .566
                  Huynh-Feldt          542.857                   1.039   522.395       7.821   .029   .566
                  Lower-bound          542.857                   1.000   542.857       7.821   .031   .566
Error (FACTOR1)   Sphericity Assumed   416.476                   12      34.706
                  Greenhouse–Geisser   416.476                   6.146   67.762
                  Huynh-Feldt          416.476                   6.235   66.796
                  Lower-bound          416.476                   6.000   69.413

Which is the most appropriate statement?

The difference between the conditions represented by:

(a) F(2,12)=7.82, p=0.007
(b) F(1,6)=7.82, p=0.030
(c) F(2,12)=7.82, p=0.030
(d) F(1,6)=7.82, p=0.031

17. Which is the most appropriate answer? The effect size is:

(a) 5.7%

(b) 57%

(d) 5%

Questions 18 to 20 relate to the output below, which shows a repeated-measures ANOVA with three levels.

Assume sphericity has been violated.

Tests of Within-Subjects Effects

Measure: MEASURE_1

Source                              Type III Sum of Squares   df      Mean Square   F       Sig.
COND           Sphericity Assumed   521.238                   2       260.619       5.624   .019
               Greenhouse–Geisser   521.238                   1.073   485.940       5.624   .051
               Huynh-Feldt          521.238                   1.118   466.251       5.624   .049
               Lower-bound          521.238                   1.000   521.238       5.624   .055
Error (COND)   Sphericity Assumed   556.095                   12      46.341
               Greenhouse–Geisser   556.095                   6.436   86.406
               Huynh-Feldt          556.095                   6.708   82.905
               Lower-bound          556.095                   6.000   92.683

Pairwise Comparisons

Measure: MEASURE_1

                                                                  95% Confidence Interval for Difference(a)
(I) COND   (J) COND   Mean Difference (I–J)   Std. Error   Sig.(a)   Lower Bound   Upper Bound
1          2          -11.857                 3.738        .058      -24.146       .431
           3          -3.429                  1.494        .184      -8.339        1.482
2          1          11.857                  3.738        .058      -.431         24.146
           3          8.429                   4.849        .339      -7.514        24.371
3          1          3.429                   1.494        .184      -1.482        8.339
           2          -8.429                  4.849        .399      -24.371       7.514

Based on estimated marginal means

a. Adjustment for multiple comparisons: Bonferroni.

18. Which is the most appropriate statement?

(a) F(2,12)=5.62, p=0.020
(b) F(1,6)=5.62, p=0.051
(c) F(2,12)=5.62, p=0.049
(d) F(1,6)=5.62, p=0.055

19. Which two conditions show the largest difference?

(a) 1 and 2
(b) 2 and 3
(c) 1 and 4
(d) They are identical

20. Assuming that the null hypothesis is true, the difference between conditions 1 and 2 has a:

(a) 5% chance of arising by sampling error (b) 6% chance of arising by sampling error (c) 19% chance of arising by sampling error (d) 20% chance of arising by sampling error

References

Alma, M. A., Groothoff, J. W., Melis-Dankers, B., Suurmeijer, T. and van der Mei, S. F. (2013) ‘The effectiveness of a multidisciplinary group rehabilitation program on the psychosocial functioning of elderly people who are visually impaired’, Journal of Visual Impairment & Blindness, 107(1): 5–16.

Howell, D. C. (2010) Statistical Methods for Psychology, 7th international edn, Stanford, CT: Wadsworth.

Huijberts, S., Buurman, B. M. and de Rooij, S. E. (2015) ‘End-of-life care after an acute hospitalization in older patients with cancer, end-stage organ failure, or frailty: a sub-analysis of a prospective cohort study’, Palliative Medicine, 30(1): 75–82.

Schlagman, S., Kliegel, M., Schulz, J. and Kvavilashvili, L. (2009) ‘Differential effects of age on involuntary and voluntary autobiographical memory’, Psychology and Aging, 24(2): 397–411.

Answers to multiple choice questions

1. b, 2. c, 3. b, 4. a, 5. b, 6. c, 7. a, 8. b, 9. a, 10. c, 11. a, 12. b, 13. b, 14. b, 15. d, 16. b, 17. b, 18. b, 19. a, 20. b
