ON THE ROLE OF VOCAL EMOTIONS FOR VERBAL MEMORY:
AN INVESTIGATION OF
NEURAL AND PSYCHOPHYSIOLOGICAL MECHANISMS
CHAN PEI LING, KAREN
(B.Soc.Sci., Psychology, NUS)
A THESIS SUBMITTED FOR
THE DEGREE OF MASTERS IN SOCIAL SCIENCES
DEPARTMENT OF PSYCHOLOGY
NATIONAL UNIVERSITY OF SINGAPORE
2012
Acknowledgments
This entire page is dedicated to all who have helped in some way or another in
different stages of my thesis preparation and production. During the course of my
academic pursuit I have formed new friendships, particularly with folks at the Brain
and Behavior Lab as well as fellow Masters students. Amongst them are Shimin, Eric,
Darshini, Nicolas and Cisy, from whom I have learnt much about the process of doing
good research. I thank Eric for his positive support and for always giving constructive
criticism and suggestions that spur me on to improve my research; Shimin for her
helpfulness and her contagious readiness to see things from a different perspective;
Darshini for her spontaneity in organizing outdoor activities that have helped me lead a
healthy, well-rounded student life; and Nicolas and Cisy for the enjoyable discussions full of
brilliant ideas, humor and technical knowledge worthy of professional researchers. I’d
also like to thank many others for their guidance and provisions. To Dr. Annett
Schirmer, who has taught me much about the research arena – thanks for your
guidance. Also to Dr. Trevor Penney, I enjoyed the interesting conversations and am
always amazed by your insights pertaining to different aspects of life. I’m grateful to
Prof Michael Chee, who summoned much expertise during my data collection
period: Soon for his help in the initial piloting phase and Weiyan for conducting the
fMRI scans. Thanks also to Christy and April for helping with the data collection. Last
but not least, I owe all to my fiancé, Calvin, who has always been around for me,
rain (when the machine broke down) or shine (when I gained new insights into my
findings). His patience and encouraging spirit have given me a huge leap forward in
this journey. A big thanks to all of you!
Summary
The present study explored the neural and psychophysiological substrates
underlying the influence of vocal emotions on verbal memory. Heart rate (Experiment
1) and fMRI data (Experiment 2) were acquired while participants performed a verbal
memory task comprising an encoding phase and a test phase. In the encoding phase,
participants were asked to memorize a series of words spoken with either a neutral or
sad prosody. During the test phase, participants were presented with written words
and indicated whether they had previously been studied. During encoding, attending
to sad prosody as compared to neutral prosody elicited greater heart rate (HR)
deceleration and greater activity in the bilateral superior temporal gyrus, superior
temporal sulcus and right transverse temporal gyrus. At test, words previously heard
with a sad prosody were remembered less accurately than words previously heard
with a neutral prosody. Moreover, the former were rated more negatively than the
latter. While the encoding effects observed here failed to predict test effects, there was
a correlation between HR acceleration and memory. Specifically, a greater HR
acceleration to words with sad as compared to neutral prosody was associated with a
reduced memory deficit for sadly as compared to neutrally spoken words. This may
be mediated by the relationship between sympathetic arousal and memory.
Implications of the current findings are discussed in relation to vocal communication, and
future directions are proposed to further elucidate the complex relationship between
prosody and verbal memory.
Keywords: vocal, emotion, prosody, verbal, memory, superior temporal gyrus,
sadness, heart rate, cardiac.
Table of Contents
Acknowledgments
Summary
Table of Contents
List of Tables
List of Figures
Chapter 1: Introduction
Prosody and emotional expression
Effects of speaker prosody on verbal memory
Heart rate studies on emotion and memory
fMRI studies on emotional processing
Individual variation in neural and heart rate responses to emotional stimuli
fMRI studies on verbal memory
Thesis objectives
Chapter 2: Experiment 1
Objectives
Methods
Results
Discussion
Chapter 3: Experiment 2
Objectives
Methods
Results
Discussion
Chapter 4: General Discussion
Heart rate and neural correlates of prosody encoding
Effects of prosody on verbal memory performance
Effects of prosody on word valence judgment
Limitations and future directions
Conclusions
Bibliography
List of Tables
Table 1. Table illustrating means and standard deviations for heart rate deceleration
and acceleration in response to words spoken with the neutral and sad prosody
(Experiment 1)
Table 2. Table illustrating means and standard deviations of mean dprime scores and
valence ratings for words spoken with the neutral and sad prosody (Experiment 1)
Table 3. Table illustrating means and standard deviations of dprime scores for words
spoken with the neutral and sad prosody (Experiment 2)
Table 4. Table illustrating peak activations for hitemo > hitneu contrast for the study
phase (Experiment 2)
Table 5. Table illustrating peak activations for hits > correct rejections contrast for the
test phase (Experiment 2)
List of Figures
Figure 1.1. Figure illustrating the sequence of stimulus presentation during a study
phase
Figure 1.2. Figure illustrating the sequence of stimulus presentation during a test
phase
Figure 2. The QRS complex
Figure 3. A time-series plot illustrating how maximum HR deceleration and
acceleration were computed for each participant
Figure 4. Figures illustrating regions that show greater activity for words spoken in
negative as compared to neutral intonation
Chapter 1: Introduction
The utility of verbal memory is ubiquitous in our daily lives. It serves the mastery
of many tasks including those linked to academic and classroom performance. Hence,
understanding the processes that enable verbal memory is of general interest and has
spurred much research in the area of cognitive psychology. The present thesis extends
this research by scrutinizing a factor that has hitherto been neglected. Specifically, it
reports two studies that explored whether and how a speaker’s vocal tone or prosody
influences listener memory for communication content. In the following, I will briefly
introduce prosody as a means of emotional expression and detail behavioral,
psychophysiological and neurological research on prosody processing and verbal
memory.
Prosody and emotional expression
Prosody is defined by the suprasegmental features of an utterance. These features
include pitch (or fundamental frequency), amplitude, rhythm and voice quality among
others. By affecting the vocal apparatus (e.g., rate of breathing, muscular tension),
emotions induce changes in these acoustic features and thus prosody. Given basic
regularities in the way emotions affect vocalizations, individuals can use prosody to
make inferences about a speaker’s emotional state (Banse & Scherer, 1996; Scherer,
1986). Research suggests that such inferences can be fast and automatic and guide
listener attention. Evidence to this effect comes from behavioral, electrophysiological
and neuroimaging research as detailed below.
Behavioral evidence for automatic prosodic processing and attention capture
comes from a range of studies (Brosch, Grandjean, Sander, & Scherer, 2008;
Schirmer, Kotz, & Friederici, 2002; Schirmer & Kotz, 2003). In one of these studies,
the authors employed a cross-modal dot-probe paradigm in which participants
indicated whether a dot appeared on the left or right side of a screen. The authors
paired dots with task-irrelevant nonverbal exclamations that sounded angry in one ear
and neutral in the other. Participants were faster at responding to a dot if it appeared
on the side of an angry as compared to a neutral exclamation (Brosch, Grandjean,
Sander, & Scherer, 2008). Thus, one may infer that emotional prosody is processed
even if it is task-irrelevant and that it modulates spatial attention. Other existing work
corroborates this inference using male and female voices as well as happy, angry, sad
and neutral prosody (Schirmer, Kotz, & Friederici, 2002; Schirmer & Kotz, 2003).
Electrophysiological studies provide further evidence that emotional prosody is
automatically processed and captures attention. Additionally, they outline a temporal
course for its influence (Schirmer, Striano, & Friederici, 2005; Schirmer, Escoffier, &
Simpson, 2007). In an auditory event-related potential (ERP) study by Schirmer and
colleagues (2005), the authors employed an oddball paradigm in which participants
were presented with an auditory sequence consisting of rare ‘deviants’ and a series of
frequent ‘standards’. The auditory sequence was played in the background while
participants watched a silent movie with subtitles. In one experimental block, the
deviant was an emotionally spoken syllable, while the standard was a neutrally
spoken syllable. In another experimental block, the deviant was a neutrally spoken
syllable, while the standard was an emotionally spoken syllable. The authors
measured the mismatch negativity (MMN), an ERP component that presumably
reflects pre-attentive change detection (Näätänen & Alho, 1995; Näätänen, 2001). It
can be visualized by subtracting the ERP elicited by standards from that elicited by
deviants. When performing such subtractions, Schirmer and colleagues found a
greater MMN in response to emotionally spoken ‘deviants’ as compared to neutral
‘deviants’. This suggests that listeners can discriminate tone of voice pre-attentively.
Moreover, given that the MMN amplitude is thought to indicate the likelihood of
attention capture (Näätänen & Alho, 1995), one may infer that emotionally spoken
material is more likely than neutral material to engage attention that is initially
directed elsewhere.
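The deviant-minus-standard subtraction described above can be sketched in a few lines of Python. This is only an illustration, not the analysis pipeline of the cited studies: the function names, the 100 to 250 ms search window and the toy voltages are all assumptions introduced here.

```python
from statistics import fmean

def average_erp(epochs):
    """Average equal-length single-trial epochs sample by sample."""
    return [fmean(samples) for samples in zip(*epochs)]

def difference_wave(deviant_epochs, standard_epochs):
    """ERP to deviants minus ERP to standards, sample by sample."""
    deviant_erp = average_erp(deviant_epochs)
    standard_erp = average_erp(standard_epochs)
    return [d - s for d, s in zip(deviant_erp, standard_erp)]

def mmn_amplitude(diff_wave, times_ms, window=(100, 250)):
    """Most negative value of the difference wave inside the window.

    The window is a hypothetical choice made for this sketch.
    """
    start, end = window
    return min(v for v, t in zip(diff_wave, times_ms) if start <= t <= end)

# Toy data (made up): two trials per condition, seven samples per epoch.
times_ms = [0, 50, 100, 150, 200, 250, 300]
standards = [[0.0, 0.5, 1.0, 1.0, 0.5, 0.0, 0.0],
             [0.0, 0.5, 1.0, 1.0, 0.5, 0.0, 0.0]]
deviants = [[0.0, 0.5, 0.0, -1.0, -0.5, 0.0, 0.0],
            [0.0, 0.5, 0.0, -2.0, -0.5, 0.0, 0.0]]

diff = difference_wave(deviants, standards)
mmn = mmn_amplitude(diff, times_ms)
```

A more negative `mmn` value for emotional than for neutral deviants would correspond to the "greater MMN" pattern reported by Schirmer and colleagues.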
Finally, support for the preferential processing of emotional relative to neutral
prosody comes from neuroimaging studies. For example, a study by Grandjean and
colleagues (2005) examined brain responses to meaningless utterances pronounced
with either emotional or neutral prosody and found that emotional prosody elicited
enhanced responses in the superior temporal sulcus (STS) relative to neutral prosody.
This was accompanied by greater activity in the right amygdala regardless of whether
sounds were task-relevant or irrelevant (Grandjean, Sander, Pourtois, Schwartz,
Seghier, Scherer, & Vuilleumier, 2005). Other neuroimaging work is in line with this
(for a review see Schirmer & Kotz, 2006).
Together, behavioral, electrophysiological and functional neuroimaging work
indicates that emotional prosody is processed automatically and guides listener
attention. Thus, one might ask whether it could benefit the encoding and memory
storage of concurrently presented verbal information. Moreover, given that prosody is
typically tied to a verbal message, the existing work raises the possibility that the
storage of such a message depends on whether prosody is emotional or neutral. Two
aspects of memory storage have been investigated in this respect. First, researchers
have looked at whether emotionally spoken material is better retained in memory than
neutrally spoken material. Second, researchers have examined the effect of emotional
prosody on the emotional connotation of verbal information maintained in long-term
memory. In the following section, I will present their findings.
Effects of speaker prosody on verbal memory
Effects of emotional prosody on verbal memory have been examined by Kitayama
(1996). The author tested the effects of emotional prosody on memory under different
memory load conditions. In his study, participants performed a memory span task
which required them to memorize either two (low load) or four (high load) two-digit
numbers for 20s. During the 20s interval, a sentence was presented as a distraction
stimulus. Participants were told to ignore this distractor so that they could perform at
their best for the memory span task. They were then given a surprise recall and
recognition test of the sentences. Prosody effects were assessed by comparing the
percent recall for sentences spoken with the emotional and neutral prosody. First, the
free recall protocols were coded with a gist criterion. A recalled item was coded
correct if it was uniquely identified with any one of the 24 sentences. Percent recall
was then computed for each condition. Findings revealed that when the task was
demanding (high load), verbal memory (recall) was better if the message was
delivered in an emotional tone of voice than if it was delivered in a neutral prosody.
In other words, when the task was demanding, emotional prosody enabled participants
to recall more sentences. However, when participants had to memorize only two
two-digit numbers (low load), memory tended to be worse when the sentences were
spoken with an emotional prosody than when they were spoken with a neutral
prosody. Kitayama also replicated his findings in a second study using
recognition as an additional dependent measure. This time, participants were not
only given a surprise free recall test but were also asked to select old from among new
sentences and to indicate their level of confidence in this selection. When the memory
load was low, results for recognition memory paralleled those for free recall
in that memory for sentences spoken with the neutral prosody was better than that for
sentences spoken with the emotional prosody. In contrast, when memory load was
high, memory for both types of sentences was comparable. These findings suggest that
emotional prosody can either improve or impair memory for verbal content, and that
the effect of emotional prosody on memory depends largely on memory load and the
retrieval method employed at test (Kitayama, 1996).
A recent study by Schirmer (2010) also explored the effect of speaker prosody on
the memory representation of words. In this study, participants performed a
cross-modal verbal memory paradigm. During encoding, participants heard words spoken
with either an emotional or neutral prosody that were presented at intervals of one
second. At recognition, words were presented visually and participants made an
old-new judgment. Recognition performance was comparable for words spoken
with emotional and neutral prosody (Schirmer, 2010). Together with the findings by
Kitayama, this suggests that any memory benefit for emotionally as compared to
neutrally spoken material may show for free recall only. Verbal recognition, in contrast,
may not benefit and, according to the results by Kitayama, may even suffer from
emotional prosody. At present it is still unclear what determines the relationship
between prosody and verbal memory. However, it is clear that this relationship is
more complex than what has been observed for the relationship between prosody and
attention.
Although encoding prosody does not seem to have a consistent effect on
subsequent word recognition, there is evidence that it reliably modulates another
aspect of long-term verbal memory (Schirmer, 2010). Specifically, Schirmer found
that encoding prosody significantly modulated listeners’ attitudes towards verbal
information (Schirmer, 2010). Correctly recognized words that were previously heard
with a sad prosody were subsequently rated as more negative compared to words
encoded with a neutral prosody. The reverse pattern was found when the author
compared happy and neutral prosody. Interestingly, these effects were independent of
the listener’s ability to consciously recollect speaker prosody, suggesting that they
reflect changes in the words’ stored affective representations rather than the conscious
retrieval of encoding prosody. Moreover, they indicate that the valence of words
stored in memory is not fixed and can be adjusted dynamically based on the emotional
context in which these words are encountered. A recent study using
electroencephalography (Schirmer et al., in preparation) replicated these results and
further outlined the time course of prosody encoding processes that underlie the
observed change in affective memory.
The present thesis aimed to extend this work by studying the role of emotion-related
autonomic changes and the involvement of neuroanatomical substrates. The
former was achieved by measuring event-related changes in heart rate. The latter was
achieved by measuring event-related changes in brain activity using fMRI. In the
following, I will review both measures and their utilization in previous studies on
emotion and memory.
Heart rate studies on emotion and memory
Heart rate can be measured as sustained heart rate, heart rate variability and
event-related heart rate, with the latter being of interest here. An event-related heart rate
(HR) response is a change induced by a stimulus lasting a few seconds (Jennings,
1981); this change is typically triggered within less than a second following stimulus
onset and may last up to several seconds thereafter. The event-related heart rate
requires the observation of individual heart beats and is generally assessed by
interpolating and averaging beat-to-beat intervals across trials.
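As a rough illustration of "interpolating and averaging beat-to-beat intervals across trials", the Python sketch below converts R-peak times into instantaneous heart rate, interpolates it onto a fixed post-stimulus time grid, expresses it as change from the value at stimulus onset, and averages across trials. The function names, the linear interpolation, the onset baseline and the toy R-peak times are assumptions made for this example; they are not taken from the thesis or from Jennings (1981).

```python
from statistics import fmean

def instantaneous_hr(r_peaks):
    """(time, bpm) pairs: one HR value per beat-to-beat interval,
    placed at the time of the later R peak (times in seconds)."""
    return [(t1, 60.0 / (t1 - t0)) for t0, t1 in zip(r_peaks, r_peaks[1:])]

def interpolate(points, t):
    """Linear interpolation of HR at time t, clamped at both ends."""
    if t <= points[0][0]:
        return points[0][1]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return points[-1][1]

def event_related_hr(r_peaks, onset, grid):
    """HR change (bpm) relative to HR at stimulus onset, sampled at each
    offset in `grid` (seconds after onset)."""
    points = instantaneous_hr(r_peaks)
    baseline = interpolate(points, onset)
    return [interpolate(points, onset + dt) - baseline for dt in grid]

def average_across_trials(trials):
    """Average the per-trial curves sample by sample."""
    return [fmean(samples) for samples in zip(*trials)]

# Toy data (made up): trial 1 shows a slowed beat after onset, trial 2 does not.
grid = [0.0, 1.25, 2.25]
trial_1 = event_related_hr([0.0, 1.0, 2.0, 3.25, 4.25, 5.25], onset=2.0, grid=grid)
trial_2 = event_related_hr([10.0, 11.0, 12.0, 13.0, 14.0, 15.0], onset=12.0, grid=grid)

mean_curve = average_across_trials([trial_1, trial_2])
max_deceleration = min(mean_curve)  # most negative change from baseline
```

On such an averaged curve, the minimum would index the initial deceleration and a later maximum the subsequent acceleration, which is the kind of summary the studies reviewed below report.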
In the past, researchers have measured HR responses to stimuli that vary with
respect to valence. Bradley and Lang (for a review see Bradley & Lang, 2000a), for
instance, presented participants with pleasant and unpleasant pictures and found that
both elicited an initial HR deceleration followed by a HR acceleration. Moreover, the
initial deceleration was greater for both pleasant and unpleasant as compared to
neutral pictures. These results were replicated with other stimuli such as
environmental sounds (Bradley & Lang, 2000b; Palomba, Angrilli & Mini, 1997,
2000) leading researchers to argue that HR deceleration reflects the emotional
intensity of a perceived stimulus. This and related work also inspired the idea that HR
deceleration is linked to stimulus intake or an orienting response, which promotes
attention to information of high survival value.
The HR acceleration that typically follows an initial deceleration has been linked
to cognitive processing effort (Lacey & Lacey, 1979; Barry, Robert, Tremayne &
Patsy, 1987). Its role in emotional processing is still equivocal. In a study by Harrison
and Turpin (2003), the authors examined whether individuals who were high on
anxiety show a bias to threat-related material. Heart rate measures were obtained
while participants performed a memory task consisting of an encoding phase and a
test phase. During encoding, participants viewed threat and non-threat words. At test,
they were presented with word stems and asked to complete these words on a
response sheet. Upon completion, each word was rated based on the level of threat
associated with it. An initial HR deceleration and a subsequent HR acceleration were observed.
The authors found a greater HR deceleration in response to threat stimuli as opposed
to non-threat stimuli for all participants. However, they also found that non-threat
stimuli induced a greater subsequent HR acceleration as compared to threat stimuli
and that they were better remembered (Harrison & Turpin, 2003). Somewhat different
results were observed by Buchanan and colleagues (2006), who presented participants
with neutral-unrelated words, school-related words, moderately arousing unpleasant
words and highly arousing taboo words. Participants were told to attend to the words
and remember as many as possible for a subsequent recall and recognition test. The
authors noted greater HR deceleration in response to unpleasant words that were
subsequently remembered as compared to those that were forgotten. In addition,
highly arousing taboo words were found to induce greater HR acceleration as
compared to moderately arousing unpleasant words (Buchanan, Etzel, Adolphs, &
Tranel, 2006).
Although both studies found a greater HR deceleration for threatening words and
taboo words, there seems to be a discrepancy with respect to HR acceleration. This
discrepancy may stem from the nature of the stimuli and calls for further investigation.
fMRI studies on emotional processing
Recent decades have seen an explosion in the number of studies that used
non-invasive techniques such as functional magnetic resonance imaging (fMRI) to
examine the neural processes that underlie psychological phenomena. fMRI is a brain
imaging technique that measures changes in blood oxygenation that appear to be
linked to neural activity (Ogawa, 1990). There are several reasons for using fMRI as a
tool for studying neural processes. One reason is that unlike X-ray Computed
Tomography (CT) or Positron Emission Tomography (PET) scans, fMRI is a
non-invasive technique. Another reason is that fMRI provides a relatively high spatial
resolution. Hence, fMRI is an appropriate technique for identifying the brain
structures that support mental processes.
The fMRI technique has been used extensively to study the brain structures that
support emotion and memory. With respect to emotions, numerous studies report
enhanced neural activity in response to emotional as compared to neutral stimuli in a
range of modalities including audition (see reviews by Vuilleumier, Armony, &
Dolan, 2003; Costafreda, Brammer, David, & Fu, 2008; Fusar-Poli et al., 2009).
These enhancements are typically seen in regions associated with sensory processing
and perceptual encoding (Dolan & Vuilleumier, 2003; Kensinger, 2004; Grandjean et
al., 2005; Schirmer & Kotz, 2006; Wildgruber et al., 2005; Sander & Scheich, 2001;
Fecteau et al., 2007). As reviewed above, in the case of prosody, researchers observed
greater activity in auditory regions or ‘voice-selective areas’ (see Belin, Zatorre,
Lafaille, Ahad, & Pike, 2000) along the superior temporal sulcus (for a review see
Schirmer & Kotz, 2006), specifically regions in the superior temporal sulcus, superior
temporal gyrus and transverse temporal gyrus (Mitchell et al., 2003; Ethofer et al.,
2006; Beaucousin et al., 2007; Wildgruber et al., 2005). Apart
from enhancing sensory and perceptual processes, emotional stimuli have been found
to activate a range of other regions. Foremost among them is the amygdala, an
almond-shaped structure in the medial temporal lobe. Emotion effects in this structure have
been observed in studies that used faces (Breiter et al., 1996; Morris, Frith, Perrett,
Rowland, Young, Calder, & Dolan, 1996; Critchley, Rotshtein, Nagai, O'Doherty,
Mathias, & Dolan, 2005; Hariri, Bookheimer, & Mazziotta, 2000; Vuilleumier,
Armony, & Dolan, 2003), images such as pictures of emotional scenes (Öhman &
Mineka, 2001; Adolphs, 2002; Vuilleumier, Armony, Clarke, Husain, Driver, &
Dolan, 2002), written words (Kensinger & Corkin, 2004; LaBar & Cabeza, 2006;
Sommer, Gläscher, Moritz, & Büchel, 2008; Mickley & Kensinger, 2008), nonverbal
exclamations (Fecteau et al., 2007; Phillips, Young, Scott, Calder, Andrew,
Giampietro, Williams, Bullmore, Brammer, & Gray, 1998; Sander & Scheich, 2001)
and to a lesser extent words or sentences spoken with emotional prosody (Sander et
al., 2005; but see Schirmer, Escoffier, Zysset, Koester, Striano, & Friederici, 2008;
Ethofer et al., 2006; Kotz et al., 2003; Mitchell et al., 2003; Morris, Scott & Dolan,
1999; Wiethoff, Wildgruber, Kreifelts, Becker, Herbert, Grodd, & Ethofer, 2008).
Based on this work, it has been proposed that the amygdala serves as a “relevance”
detector, that is, an emotion-unspecific region that is activated by any stimulus of
intrinsic relevance for the individual (Sander, Grafman & Zalla, 2003).
Meta-analyses of neuroimaging work on emotions suggest that
overall brain activation patterns depend on the specific emotions evoked by the
stimuli (Vytal & Hamann, 2010). Specifically, regions apart from the amygdala and
those supporting basic sensory and perceptual processing are activated in an
emotion-specific fashion. For instance, sadness consistently activated the middle frontal gyrus and head of the
caudate/subgenual anterior cingulate cortex.
Individual variation in neural and heart rate responses to emotional stimuli
Verbal memory tasks have been demonstrated to elicit cortical activation that
shows good intra-subject reproducibility but significant inter-individual variation in
spatial location and extent (Miller et al., 2002). Some neuroimaging studies have also
found individual variability in the extent of neural activation evoked by emotional
stimuli. For instance, in a study by Canli and colleagues (2002) where participants
were shown happy facial expressions, the authors found that subjects exhibited highly
variable responses in the amygdala, such that the average group response was not
statistically significant. However, it was subsequently found that this variability was
strongly correlated with subjects’ degree of extraversion (Canli, Sivers, Whitfield,
Gotlib, & Gabrieli, 2002). The greater the degree of extraversion, the greater the
extent of amygdala response to the happy faces. Hence, it appears that a certain
amount of variability in neural response to emotional stimuli exists and this variability
may provide vital cues in elucidating the underlying brain mechanisms. It would thus
be prudent to examine the change in cortical activation in response to emotional
stimuli within each individual and how this change may vary across individuals. Such
individual variability in neural correlates could then be related to behavioral correlates
such as memory performance. Despite a considerable amount of literature devoted to
the study of emotional memory and its neural correlates, few studies have
examined individual variability in the neural and physiological changes evoked by emotional
stimuli. One of the aims of the present study is to examine how such individual
variability in neural and physiological responses may relate to verbal memory.
fMRI studies on verbal memory
Apart from highlighting structures implicated in emotion, fMRI research has also
provided insights into the brain systems that support memory or the storage of
semantic information (Buckner, Koutstaal, Schacter, Wagner, & Rosen, 1998; Eldridge,
Knowlton, Furmanski, Bookheimer, & Engel, 2000; Konishi, Wheeler, Donaldson, &
Buckner, 2000; Chee, Goh, Lim, Graham, & Lee, 2004; Henson, Hornberger, &
Rugg, 2005). An early study by Buckner and colleagues (1998) found that word
retrieval as compared to viewing a fixation on the screen activated the extrastriate
cortex, motor cortex, dorsolateral prefrontal cortex, anterior cingulate, parietal cortex,
thalamus, anterior insular cortex and several other regions (for a complete list of
regions see Buckner et al., 1998). Subsequent studies have found a similar set of
regions. For instance, in a study by Chee and colleagues (2004), participants
performed an incidental word encoding task (living/non-living judgments) and were
subsequently tested using an old/new recognition paradigm. In this paradigm,
participants saw words from the encoding task together with new words and indicated
for each word whether it had been previously seen or whether it was new. Relative to
correctly recognized new words (correct rejections), correctly recognized old words
(hits) elicited greater neural activity in left middle frontal gyrus, left inferior frontal
gyrus, left inferior temporal gyrus, left anterior cingulate, left parietal region,
thalamus and insular cortex (Chee, Goh, Lim, Graham, & Lee, 2004).
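The list of tables refers to dprime scores for this kind of old/new recognition data. The thesis text that defines those scores is not part of this section, so the Python sketch below shows only one common way to compute d′ from hits, misses, false alarms and correct rejections. The log-linear correction (adding 0.5 to each count) and the toy counts are assumptions of this example, not the author's stated method.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction keeps both rates away from 0 and 1, where the
    z-transform is undefined (an assumption of this sketch).
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

# Toy counts (made up): old words by encoding prosody, sharing one set of new words.
d_neutral = d_prime(hits=18, misses=2, false_alarms=2, correct_rejections=18)
d_sad = d_prime(hits=14, misses=6, false_alarms=2, correct_rejections=18)
d_chance = d_prime(hits=10, misses=10, false_alarms=10, correct_rejections=10)
```

With these toy counts, `d_sad` is lower than `d_neutral`, which is the direction of the memory difference reported in the Summary, and `d_chance` is zero because hit and false-alarm rates are equal.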
Together these studies highlight a network supporting the successful recollection
of previously studied items that comprises the middle frontal gyrus, inferior frontal
gyrus, middle temporal gyrus, inferior temporal gyrus and cingulate gyrus (Chee,
Goh, Lim, Graham, & Lee, 2004; Henson, Hornberger, & Rugg, 2005). Of interest for the
present study is whether these memory effects are enhanced for words previously
heard with an emotional as compared to neutral prosody.
Thesis Objectives
As outlined above, this thesis was inspired by previous behavioral work that
identified an effect of emotional prosody on the accuracy (Kitayama, 1996) and
affective connotation (Schirmer, 2010) of verbal memory. Moreover, it sought to
further investigate these effects by assessing their autonomic and neural correlates.
These two aspects were addressed in Experiments 1 and 2, respectively. These
experiments comprised a study and a test phase. In the study phase, participants
listened to a series of neutral words spoken with neutral or sad prosody. In the test
phase, participants saw previously studied words together with new words and
indicated for each word whether it was ‘old’ or ‘new’.
Experiment 1 recorded heart rate responses to words presented in the study phase.
Based on previous work (Bradley & Lang, 2000b; Palomba, Angrilli & Mini, 1997;
Buchanan, Etzel, Adolphs, & Tranel, 2006), I predicted greater heart rate deceleration
to words spoken with sad as compared to neutral prosody. If, as previously suggested,
this HR response reflects orienting to the eliciting stimulus as a whole (Lacey &
Lacey, 1979), it should predict subsequent memory. Specifically, it should correlate
with potential differences in the recognition accuracy of visually-presented test words
previously heard with sad as compared to neutral prosody. It might also explain
condition differences in subsequent word valence rating. In line with previous work
(Schirmer, 2010), words studied with a sad prosody should be rated more negatively
than words studied with a neutral prosody and this difference might be enhanced for
individuals with a greater HR deceleration effect. However, if the relationship
between HR deceleration and stimulus processing goes beyond a simple orienting
response, as suggested by Harrison and Turpin (2003), then HR deceleration may not
predict verbal memory and valence. Instead, such effects may arise from HR
acceleration. Thus, apart from investigating HR deceleration, Experiment 1 also
aimed to elucidate potential emotion effects on HR acceleration.
Experiment 2 recorded brain activity both during the study and test phases.
Previous neuroimaging studies have implicated perceptual (superior temporal sulcus,
superior temporal gyrus and transverse temporal gyrus) and emotion-specific
(amygdala) regions in processing emotional information. Therefore, during the study
phase, I expected greater activity in the amygdala, superior temporal sulcus, superior
temporal gyrus and transverse temporal gyrus for words spoken with sad as compared
to neutral prosody. During the test phase, I expected greater activity in the middle
frontal gyrus, inferior frontal gyrus, middle temporal gyrus, inferior temporal gyrus,
cingulate gyrus and anterior cingulate for hits relative to correct rejections. Moreover,
based on previous research indicating an influence of emotion on memory (for a
review see Phelps & LeDoux, 2005; Murty et al., 2010), I hypothesized this memory
effect to be greater for negatively as compared to neutrally spoken words. Finally, I
explored whether study and test emotion effects on neural activity predict behavioral
performance.
Chapter 2: Experiment 1
Do prosody encoding effects predict differences in verbal memory performance and
subsequent word valence judgments?
Objectives
Experiment 1 explored the effect of prosody on subsequent word memory and
valence. Moreover, of interest was whether such effects relate to autonomic changes
triggered by the prosody during word encoding. Based on work by Schirmer (2010),
no significant differences in word memory were expected at the group level for words
studied with sad and neutral prosody. However, as such differences could exist at the
individual level, I intended to explore such individual differences and their
relationship to heart rate changes. A second objective was to replicate the ‘valence
shift effect’ found by Schirmer (2010) and to examine whether words successfully
encoded with a sad prosody were subsequently rated more negatively than words
successfully encoded with a neutral prosody. Furthermore, I hoped to determine
whether this shift in valence was linked to heart rate correlates. More specifically, I
predicted this valence shift effect to be greater for individuals with a greater prosody
effect (emotional – neutral) on heart rate.
Methods
Participants
Forty-seven participants (23 female) aged 21 to 27 took part in the experiment.
Participants reported normal hearing and normal or corrected to normal vision.
Informed consent was obtained prior to the start of the experiment and participants
were reimbursed S$10 per hour.
Materials
The materials for this research were taken from a previous study by Schirmer
(2010). It comprised a list of 240 neutrally valenced words. The words were selected
from among 500 words, which were rated by 30 independent raters (15 female) on
emotional valence and arousal. Raters were required to rate the emotional valence of
each word on a 5-point scale ranging from -2 (very negative) to +2 (very positive) and
its arousal ranging from 0 (non-arousing) to 4 (highly arousing). The words selected
for the experiment had a mean valence of 0.16 (SD 0.20) and a mean arousal of 0.58
(SD 0.24).
All selected words were spoken with neutral and sad prosody by a female native
speaker of English. Words were recorded and digitized at a sampling rate of 44.1
kHz. Word amplitude was normalized at the root-mean-square value using Adobe
Audition 2.0. The average duration of words produced by the speaker was 1132.4 ms
(SD 245.5) for sad prosody and 777.6 ms (SD 149) for neutral prosody.
The speaker was selected from among four speakers with drama experience who
were invited to speak 15 neutral words in anger, sadness, happiness and neutrality.
These words were presented to a group of 30 volunteers who indicated whether the
speaker was in an angry, sad, happy, neutral or other emotional state not listed.
Additionally, they rated each word on a five-point scale ranging from -2 (very
negative) to +2 (very positive) for prosody valence and from 0 (non-aroused) to 4
(highly aroused) for prosody arousal. The speaker who produced the material for the
present and previous work (Schirmer, 2010) portrayed sadness (identification
accuracy = 88%, valence = -1.45, arousal = 2.92) and neutrality (identification
accuracy = 89%, valence = 0.06, arousal = 0.79) better than the other speakers.
Procedure
Participants were tested individually. A participant visiting our lab was first asked
to read and sign the experimental consent form. Then s/he was brought into a room
and asked to sit in a comfortable chair facing a computer screen. Heart rate (HR) was
measured by two Ag/AgCl electrodes attached to the left and right forearm,
respectively. The data were recorded at 256 Hz with the ActiveTwo system from
Biosemi. The difference between the two electrodes was computed and the resulting
bipolar recording processed using Matlab (Schirmer & Escoffier, 2010). The present
study used an old-new recognition paradigm comprising two study phases each
followed by a test phase (as illustrated in Figures 1.1 and 1.2 respectively). Prior to
the task, participants attempted a short practice to familiarize themselves with the
mapping of required responses and response buttons. During the practice session,
participants were presented with 10 words spoken in either a neutral or sad prosody
and asked to memorize the words. Subsequently, participants viewed 20 words and
were told to indicate whether these words were ‘old’ or ‘new’. The experiment was
conducted using Presentation® software (Version 13.0, www.neurobs.com). A CRT
monitor of 18 inches was used for visual presentation. Sounds were presented using
Etymotic ER 4 MicroPro in-ear earphones.
Study phase
During the study phase, participants listened to a series of words spoken with
either a neutral or sad prosody. They were instructed to study the words for a
subsequent memory test. Each trial began with a fixation cross that was presented for
0.2 s in the center of the screen, followed by a spoken word simultaneously presented
with a fixation cross, the latter lasting 2.3 s. The trial ended with a blank screen
marking the onset of the intertrial interval (ITI). The ITI was jittered from 12 to 15 s
in one second steps. Each study phase consisted of 60 trials. Half of the trials
consisted of words spoken with sad prosody; the other half consisted of words spoken
with neutral prosody. The sequence of words presented was pseudorandomized such
that no more than four consecutive trials were of the same prosody. A sample trial is
shown in Figure 1.1.
Figure 1.1. Figure illustrating the sequence of stimulus presentation during a study
phase.
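The sequencing constraints described above can be sketched as follows. This is a minimal illustration, not the Presentation script used in the experiment: the function names, the rejection-sampling approach and the uniform draw of intertrial intervals are assumptions made for the example.

```python
import random


def max_run_length(seq):
    """Length of the longest run of identical consecutive items."""
    longest = run = 0
    previous = object()  # sentinel that equals no trial label
    for item in seq:
        run = run + 1 if item == previous else 1
        previous = item
        longest = max(longest, run)
    return longest


def make_study_sequence(n_trials=60, max_run=4, seed=0):
    """Return (prosody, iti_seconds) pairs for one study phase.

    Half of the trials are 'sad', half 'neutral'. The order is reshuffled
    until no more than max_run consecutive trials share a prosody
    (rejection sampling is an assumption, not the documented method).
    """
    rng = random.Random(seed)
    prosodies = ["sad"] * (n_trials // 2) + ["neutral"] * (n_trials // 2)
    while True:
        rng.shuffle(prosodies)
        if max_run_length(prosodies) <= max_run:
            break
    # ITI jittered from 12 to 15 s in 1 s steps (uniform draw assumed).
    itis = [rng.choice([12, 13, 14, 15]) for _ in prosodies]
    return list(zip(prosodies, itis))
```

Fixing the seed makes a given participant's sequence reproducible; a different seed per participant would yield different orders under the same constraint.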
Test phase
During the test phase, participants viewed 120 words on the screen, half of which
had been studied previously (old) and half of which had not (new). Each test trial
began with a fixation cross that lasted 0.2 s, followed by a word on the screen for
1 s. Next, a prompt appeared, instructing participants to indicate whether the word
was an ‘old’ or a ‘new’ word. Participants who had to press the left button for old
words and the right button for new words were prompted with the word ‘OLD’ on the
left and the word ‘NEW’ on the right of the screen. Participants
with the opposite button assignment saw the reversed prompt. The button assignments
were counterbalanced across participants.
Once participants made an old/new judgment, the prompt disappeared and a
second prompt appeared, instructing participants to rate the same word in terms of its
emotional valence on a 5-point scale ranging from -2 (very negative) to +2 (very
positive). Participants now saw this rating scale and were instructed to move a cursor
(↑) to the appropriate point on this scale and press a key to confirm their response.
The rating scale then disappeared and the screen remained blank for a period jittered
from 0.5 to 1.25 s. After the first test phase, participants took a short break before
continuing with the experiment. The test procedure is illustrated in Figure 1.2.
Figure 1.2. Figure illustrating the sequence of stimulus presentation during a test
phase.
Data analysis
Heart rate data were processed off-line. To remove slow drifts and high-frequency
noise, a digital band-pass filter was applied with a high-pass cutoff of 0.8 Hz and a
low-pass cutoff of 40 Hz. QRS complexes in the recorded signal were then
detected using a pattern matching algorithm as implemented in the Biosig toolbox
(Nygards & Sornmo, 1983). The QRS complex (Einthoven, 1901) is a name for the
combination of three of the graphical deflections seen on a typical electrocardiogram
(ECG). It corresponds to the depolarization of the right and left ventricles of the
human heart. The algorithm takes into consideration the QRS complex shape and
detects the R peak (see Figure 2). This technique has been shown to be more accurate
and sensitive than a thresholding technique solely based on amplitude (Berntson et al.,
1997). The heart rate data was then plotted on a time series and visually corrected for
potentially erroneous R-peak detection. Instantaneous HR was computed from
inter-beat intervals and re-sampled at 4 Hz using linear interpolation (Berntson et al.,
1995).
Figure 2. The QRS complex.
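The conversion from detected R peaks to an evenly sampled heart rate series can be sketched as below. This is an illustrative sketch rather than the Matlab code used for the thesis; the function names are hypothetical, and assigning each HR value to the time of the beat that closes the interval is an assumption.

```python
def instantaneous_hr(r_peak_times):
    """Return (times, bpm): one heart rate value per inter-beat interval.

    r_peak_times: R-peak latencies in seconds, in ascending order.
    Assumption: each value is placed at the time of the later beat.
    """
    times, bpm = [], []
    for earlier, later in zip(r_peak_times, r_peak_times[1:]):
        times.append(later)
        bpm.append(60.0 / (later - earlier))  # interval in s -> beats/min
    return times, bpm


def resample_linear(times, values, rate_hz=4.0):
    """Linearly interpolate (times, values) onto a regular grid at rate_hz."""
    step = 1.0 / rate_hz
    n_samples = int((times[-1] - times[0]) / step + 1e-9) + 1
    grid = [times[0] + i * step for i in range(n_samples)]
    out, j = [], 0
    for t in grid:
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        v0, v1 = values[j], values[j + 1]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return grid, out
```

For example, R peaks at 0, 1.0 and 1.5 s give intervals of 1.0 and 0.5 s, that is 60 and 120 beats/min, and the 4 Hz grid between them is filled by linear interpolation.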
Next, event-related time courses of inter-beat-intervals were computed over a 12 s
interval after stimulus onset for each condition and participant. To eliminate the
possibility of random pre-stimulus differences between conditions as a potential
confounding factor, heart rate data for each condition was normalized against a
pre-stimulus baseline. To this end, the data 1 s prior to stimulus onset was averaged and
subtracted from each data point in the 12 s epoch. HR deceleration and acceleration
were identified by selecting the HR minimum between 0 and 3 s and the HR
maximum between 1 and 9 s from stimulus onset for each condition and participant,
respectively (Figure 3).
[Figure 3 plot: "Change in heart rate across time". Y-axis: change in heart rate (beats/min); x-axis: time lapse from onset of stimulus (s), 0 to 12; traces: nstudy, estudy.]
Figure 3. A time-series plot illustrating how maximum HR deceleration and
acceleration were computed for each participant.
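The baseline correction and the extraction of deceleration and acceleration values described above can be sketched as follows. This is a minimal illustration on a 4 Hz series; the function names are hypothetical, and treating both window boundaries as inclusive is an assumption.

```python
def baseline_correct(hr, rate_hz=4.0, baseline_s=1.0):
    """Subtract the mean pre-stimulus value from the post-onset samples.

    hr: samples starting baseline_s seconds before stimulus onset.
    Returns only the samples from stimulus onset onwards.
    """
    n_base = int(round(baseline_s * rate_hz))
    baseline = sum(hr[:n_base]) / n_base
    return [value - baseline for value in hr[n_base:]]


def window(epoch, start_s, end_s, rate_hz=4.0):
    """Samples with start_s <= t <= end_s, t measured from stimulus onset."""
    first = int(round(start_s * rate_hz))
    last = int(round(end_s * rate_hz))
    return epoch[first:last + 1]


def deceleration_and_acceleration(epoch, rate_hz=4.0):
    """HR minimum between 0 and 3 s and HR maximum between 1 and 9 s."""
    decel = min(window(epoch, 0.0, 3.0, rate_hz))
    accel = max(window(epoch, 1.0, 9.0, rate_hz))
    return decel, accel
```

Applied per condition and participant, the two returned values correspond to the deceleration and acceleration scores entered into the analyses reported below.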
Results
Study phase
First, I examined the influence of emotion on heart rate during the study phase. A
paired-sample t-test was performed to compare HR deceleration for words spoken
with neutral and sad prosody (refer to Table 1). Results revealed that words spoken
with sad prosody (M = -0.867, SD = 0.932) elicited a greater HR deceleration than
words spoken with neutral prosody (M = -0.584, SD = 0.769), t(46) = 2.692, p < 0.05.
A statistical comparison of HR acceleration was non-significant (p > 0.1).
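For reference, the paired-samples t statistic used here can be computed as in the sketch below. The function name and the toy data in the usage note are illustrative only; no participant data are reproduced.

```python
import math
import statistics


def paired_t(condition_a, condition_b):
    """Return (t, df) for a paired-samples t-test on the differences a - b."""
    differences = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(differences)
    mean_diff = statistics.fmean(differences)
    sd_diff = statistics.stdev(differences)  # sample SD with n - 1
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1
```

With 47 participants the test has 46 degrees of freedom, which matches the t(46) reported above. The p value would additionally require the t distribution, which is not part of the Python standard library.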
Table 1. Table illustrating means and standard deviations for heart rate deceleration
and acceleration in response to words spoken with the neutral and sad prosody
(Experiment 1).
Vocal emotion    HR deceleration        HR acceleration
                 M         SD           M         SD
Neutral          -0.584    0.769        1.803     1.287
Sad              -0.867    0.932        1.676     1.307
Test phase
Next, I examined the influence of vocal emotion on verbal memory. To this end,
d’ scores were computed by subtracting the z-score of the probability of ‘new’ words
being falsely recognized as ‘old’ (false alarms) from the z-score of the probability
of ‘old’ words being correctly recognized as ‘old’ (hits) (Wickens, 2002). A
paired-samples t-test revealed higher d’ scores for words encoded in neutral
(M = 1.950, SD = 1.082) relative to sad (M = 1.825, SD = 0.951) prosody, t(46) =
2.489, p < 0.05 (Table 2). I also compared the mean valence rating of words encoded
in the neutral and sad conditions. A paired-samples t-test yielded significantly lower
mean ratings for correctly recognized ‘old’ words previously heard with sad (M =
0.283, SD = 0.348) as compared to neutral (M = 0.371, SD = 0.320) prosody, t(46) =
2.644, p < 0.05.
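The d’ computation described above amounts to the following. This is a minimal sketch with a hypothetical function name; the source does not state how hit or false alarm rates of exactly 0 or 1 were handled, so no correction is applied here.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """d' = z(hit rate) - z(false alarm rate).

    z is the inverse of the standard normal cumulative distribution.
    Rates of exactly 0 or 1 raise statistics.StatisticsError because
    their z-scores are infinite.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)
```

For instance, a hit rate of about 0.84 combined with a false alarm rate of about 0.16 gives a d’ close to 2, whereas equal hit and false alarm rates give a d’ of 0 (no discrimination).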
Table 2. Table illustrating means and standard deviations of d’ scores and
valence ratings for words spoken with the neutral and sad prosody (Experiment 1).
Vocal emotion    d’ scores            Valence rating
                 M        SD          M        SD
Neutral          1.950    1.082       0.371    0.320
Sad              1.825    0.951       0.283    0.348
Correlational analyses
To examine the possibility of a relationship between heart rate effects during
study and subsequent behavioral effects at test, I computed an emotion sensitivity
index (ESI) for heart rate and behavioral measures. To this end, values obtained for the
neutral condition were subtracted from values obtained for the sad condition for HR
deceleration, HR acceleration, d’ scores and mean valence ratings. The resulting
indices were then subjected to the following two-tailed Pearson correlation analyses.
First, I tested the relationship between the HR deceleration ESI and the d’ ESI. This
analysis was non-significant (p > 0.1). Next, I tested the relationship between the HR
acceleration ESI and the d’ ESI and observed a significant positive correlation (r =
0.287, p = 0.05). Correlations between cardiac responses and the valence rating were
non-significant (ps > 0.1).
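The emotion sensitivity index and the correlation between two such indices can be sketched as follows. The function names are hypothetical and the example values in the usage note are invented for illustration; they are not participant data.

```python
import math


def esi(sad_values, neutral_values):
    """Emotion sensitivity index per participant: sad minus neutral."""
    return [s - n for s, n in zip(sad_values, neutral_values)]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Calling pearson_r on the HR acceleration ESI and the d’ ESI, each a list with one value per participant, yields the kind of coefficient reported above. A positive value means that participants with a larger sad minus neutral difference on one measure tend to show a larger difference on the other.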
Discussion
The current study explored the influence of vocal emotions on heart rate during
verbal encoding and whether such influences predict subsequent verbal memory.
Previous studies have reported heart rate changes in response to emotional stimuli
(Bradley & Lang, 2000b) and linked such changes to modulations in memory
(Palomba, Angrilli & Mini, 1997). The goal of the present study was to extend this
work to the context of emotional prosody. First, words spoken with a sad prosody
were expected to elicit greater heart rate deceleration as compared to words spoken
with a neutral prosody. Second, the prosody effect (emotional - neutral) on heart rate
during study was expected to predict emotional differences in subsequent memory
and word valence at test.
As predicted, results revealed a greater HR deceleration in response to words
spoken with sad prosody as compared to words spoken with neutral prosody.
However, results also revealed better performance for words encoded in the neutral as
compared to sad prosody, which was not found previously (Schirmer, 2010). Critically,
the current study replicated the word valence effect reported by Schirmer (2010):
words correctly recognized as ‘old’ were rated as more negative when they had been
spoken with sad rather than neutral prosody.
HR deceleration failed to correlate with memory performance and changes in
perceived word valence. Although, across participants, HR acceleration was
comparable for words with sad and neutral prosody, individual variation in HR
acceleration between sadly and neutrally spoken words (HR acceleration ESI)
correlated positively with the emotional difference in memory performance (d’ ESI).
Thus, in line with work by Harrison and Turpin (2003), results seem to suggest that
HR acceleration during encoding predicts subsequent memory performance. The
valence shift effect observed was not related to HR effects. These findings are
discussed in further detail in the General Discussion.
Chapter 3: Experiment 2
Objectives
Does emotional prosody engage preferential neural processing over neutral prosody?
What are the regions associated with successful memory recognition (hits relative to
correct rejections) and does encoding prosody modulate activity in these regions?
Do prosody effects on neural activity predict verbal memory performance?
The objectives of Experiment 2 were fourfold. First, I aimed to replicate and
extend the existing literature on auditory emotional processing. This literature has
implicated the amygdala, superior temporal sulcus, superior temporal gyrus and
transverse temporal gyrus and I thus expected to find these regions more strongly
activated while participants listened to sad as compared to neutral prosody.
Second, I aimed to replicate and extend the literature on memory retrieval. This
literature holds that middle frontal gyrus, inferior frontal gyrus, middle temporal
gyrus, inferior temporal gyrus, cingulate gyrus and anterior cingulate are more
strongly activated for ‘hits’ relative to ‘correct rejections’. Thus, I expected to observe
similar results here. Third, I hoped to see activation differences between words
previously studied with sad and neutral prosody. Given differences in the recognition
and perceived valence of these words (Experiment 1), one would expect that they also
differ in their recruitment of brain structures. Moreover, such recruitment differences
may overlap with potential processing differences between sadly and neutrally spoken
words during study. Finally, I examined whether prosody effects on neural activity
during study and test predict behavioral performance.
Methods
Participants
Fifty-one participants were recruited. Ten participants’ data were lost due to
technical problems (image stripes inherent in the MR scanner). Data for forty-one
participants (20 female) ranging in age from 21 to 27 were eventually used for this
thesis. They had no history of neurological disorders and had either normal or
corrected-to-normal vision. They reported normal hearing. All participants signed
informed consent and were reimbursed S$40 for their participation.
Pre-scan memory test
Due to the expenses associated with booking the MRI scanner, I wanted to ensure
that participants in the experiment perform well enough for their data to be retained.
Hence, I invited everyone to an initial memory screen. The paradigm used for this
screen was identical to the behavioral paradigm used in the fMRI experiment, with the
exception of a different list of words used. Only participants who scored a minimum
d’ of 1.5 were allowed to proceed to the fMRI experiment.
fMRI experiment
Procedure
The fMRI experiment used an old/new paradigm similar to the one used in
Experiment 1. A short practice session was followed by two study/test phases. The
few differences between Experiment 1 and 2 are highlighted in the following. First,
because the experiment took place in an MRI scanner, participants had to lie down
rather than sit in front of a computer screen. Visual images were projected onto a
screen at the back of the MRI scanner and participants viewed the screen using a
mirror that was attached to the head coil of the magnet and the position was adjusted
such that it was directly in front of their eyes. Sounds were presented using
NordicNeuroLab headphones (NNL, Bergen, Norway). During the study phases,
participants studied only 40 words as opposed to 60 words and the test phases
consisted of only 80 words. Moreover, the interval between words was jittered from
5 to 9 s in steps of 1 s and each trial lasted 12 s. In the test phases, old-new
decisions were no longer followed by a word valence rating. The response window
was set to 5 s and the blank interval between the response and the next stimulus was
jittered from 1 to 4 s in steps of 1 s. These procedural changes were introduced in
adaptation to the fMRI sparse sampling technique and to ensure a feasible experiment
duration (~45 min). All the remaining aspects of this experiment were comparable to
Experiment 1.
Image Acquisition
All images were acquired on a 3T Siemens Tim Trio (Siemens, Erlangen,
Germany). A T1-weighted anatomical image was first acquired with a 3D MPRAGE
sequence (192 slices, TR = 2530 ms, TE1 = 1.64 ms, TE2 = 3.5 ms, TE3 = 5.36 ms,
TE4 = 7.22 ms, TI = 1200 ms, flip angle = 7°, FOV = 256 mm, voxel size = 1 mm × 1
mm × 1 mm) for co-registration purposes. Functional images for the study phase were
obtained by using a standard gradient-echo EPI sequence (TE = 30 ms, flip angle =
90°, FOV read = 192 mm × 192 mm, matrix = 64 × 64). Twenty-eight oblique slices
aligned parallel to the AC-PC line with a slice thickness 4 mm and a distance factor of
0.4 mm were acquired. A sparse-sampling EPI sequence was used during the study
phase to ensure that scanner noise would not interfere with the audibility of words
played over the headphones. Apart from the imaging parameters that were identical to
the standard EPI sequences (parameters mentioned above), the sparse-sampling EPI
sequence employed a TR of 12 s and a TA of 2 ms. The interval between the onset of
each EPI pulse and stimulus (i.e., Pulse-Stimulus Interval, PSI), was jittered from 3 to
7 s to allow maximum coverage of the hemodynamic response. Functional images for
the test phase were acquired using a similar EPI sequence with the exception of the
TR being 2 s.
Image analysis
The fMRI data was preprocessed and analyzed using the Statistical Parametric
Mapping software (SPM8, Wellcome Trust Centre for Neuroimaging, University
College London). Functional images obtained from the scanner were converted to
NIFTI formatted images for further processing. A within-subject registration of image
time series was performed. The time series of functional images were realigned using
a least-square minimization and a six-parameter (translations and rotations in the x, y
and z planes) rigid-body spatial transformation. All functional images were realigned
to the first image in the series. Next, images were co-registered by aligning the
anatomical image (MPRAGE) to the reference image (mean functional image
averaged across the series). The anatomical image was then segmented to obtain a
bias field corrected MPRAGE for overlaying purposes. Previously realigned
functional images were normalized to the MNI space, interpolated to 3 mm × 3 mm ×
3 mm voxels and then spatially smoothed with a Gaussian kernel of 8 mm FWHM.
Finally, a high-pass filter of 128 s was applied to the data to remove slow signal drifts.
Statistical analysis was performed on individual and group data by using the general
linear model.
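As a side note on the smoothing step, a Gaussian kernel specified by its full width at half maximum (FWHM) corresponds to a standard deviation of FWHM / (2 * sqrt(2 * ln 2)). The short sketch below only illustrates this standard conversion for the 8 mm kernel and 3 mm voxels mentioned above; it is not part of the SPM8 pipeline, and the function name is hypothetical.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian FWHM to its standard deviation (same units)."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


sigma_mm = fwhm_to_sigma(8.0)    # about 3.40 mm for an 8 mm FWHM kernel
sigma_voxels = sigma_mm / 3.0    # about 1.13 voxels at 3 mm isotropic
```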
Study phase
For the study phase fMRI analysis, a general linear model with four experimental
predictors was used: subsequently remembered words previously spoken with neutral
(study_hitneu) and sad (study_hitemo) prosody, and subsequently forgotten words
previously spoken with neutral (study_missneu; see footnote 1) and sad (study_missemo) prosody.
Motion parameters were included as covariates. T-contrasts of interest were generated
for each participant: Contrast #1: study_hitemo > study_hitneu, Contrast #2:
study_hitneu > study_hitemo, Contrast #3: study_emo > study_neu, Contrast #4:
study_neu > study_emo. A statistical threshold of p < 0.05 (family-wise error
corrected) and a cluster size of > 10 contiguous voxels were used in the creation of
activation maps. To elucidate regions that showed a greater neural activity in response
to ‘hit’ words spoken with emotional as compared to neutral prosody, a random
effects analysis was conducted across participants (Contrast #1: study_hitemo >
study_hitneu). To determine whether the expected prosody effect on neural activity
during encoding predicts an emotional memory effect, a regression analysis was
conducted using Contrast #1 as the predictor and d’ ESI (emotional - neutral) as the
criterion.
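To make the t-contrasts concrete, the sketch below writes Contrasts #1 and #2 as weight vectors over the design-matrix columns. It is only an illustration: the column order, the six motion columns (inferred from the six-parameter realignment), the omission of a constant term and the helper name are all assumptions, and Contrasts #3 and #4 are left out because the text does not specify which predictors were pooled into study_emo and study_neu.

```python
# Hypothetical column order for the four experimental predictors.
PREDICTORS = ["study_hitneu", "study_hitemo", "study_missneu", "study_missemo"]
N_MOTION = 6  # assumed: one covariate per realignment parameter


def contrast_vector(positive, negative):
    """Weights of +1/len(positive) and -1/len(negative) on the named
    predictors, and 0 on every other column (including motion covariates)."""
    weights = [0.0] * (len(PREDICTORS) + N_MOTION)
    for name in positive:
        weights[PREDICTORS.index(name)] += 1.0 / len(positive)
    for name in negative:
        weights[PREDICTORS.index(name)] -= 1.0 / len(negative)
    return weights


# Contrast #1: study_hitemo > study_hitneu
contrast_1 = contrast_vector(["study_hitemo"], ["study_hitneu"])
# Contrast #2: study_hitneu > study_hitemo
contrast_2 = contrast_vector(["study_hitneu"], ["study_hitemo"])
```

Because the weights sum to zero, each contrast tests a difference between conditions rather than activity relative to the implicit baseline.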
Test phase
For the test phase fMRI analysis, a general linear model with six experimental
predictors was used: remembered words previously spoken with a neutral
(test_hitneu) and sad prosody (test_hitemo), forgotten words previously spoken in a
neutral (test_missneu) and sad prosody (test_missemo) and new words correctly
recognized as ‘new’ (test_newcr) and incorrectly identified as ‘old’ (test_newfa). The
following t-contrasts were generated for each participant: Contrast #1: test_hitemo >
test_hitneu, Contrast #2: test_hitneu > test_hitemo, Contrast #3: hit > cr, Contrast #4:
cr > hit, Contrast #5: test_emo > test_neu, Contrast #6: test_neu > test_emo. A
statistical threshold of p < 0.05 (family-wise error corrected) and a cluster size of > 10
contiguous voxels were used in the creation of activation maps. To determine regions
that show a memory effect (greater neural activity in response to ‘hits’ as opposed to
‘correct rejections’), a random effects analysis was conducted across participants using
Contrast #3. To examine whether recognition memory differs as a function of the
encoding prosody, I compared activity for negative ‘hits’ and neutral ‘hits’. Finally, to
determine whether the prosody effect for neural activity at test predicts the emotional
difference in memory performance, a regression analysis was conducted using the
prosody effect for neural activity (Contrast #5: test_emo > test_neu) as the predictor
and d’ ESI (emotional - neutral) as the criterion.
Footnote 1. Misses were entered in case they explained variance but were not used for the
subsequent generation of contrasts due to an insufficient number of trials.
Results
Behavioral analyses
To examine the effect of emotion on memory performance, I subjected the d’
scores (refer to Experiment 1 results for how the d’ scores were computed) to a
paired-samples t-test with emotion as the factor of interest. Memory performance was
better for words previously spoken with a neutral prosody (M = 2.593, SD = 0.984)
than for words previously spoken with a sad prosody (M = 2.403, SD = 0.890),
t(40) = 2.961, p < 0.01 (Table 3).
Table 3. Table illustrating means and standard deviations of d’ scores for words
spoken with the neutral and sad prosody (Experiment 2).
Vocal emotion    d’ scores
                 M        SD
Neutral          2.593    0.984
Sad              2.403    0.890
fMRI analyses
Study phase
At encoding, sadly spoken words elicited greater activity than neutrally spoken
words in the bilateral superior temporal gyrus and the right transverse temporal gyrus
(Figure 4). A similar effect was found in the inferior frontal gyrus only with a lower
significance threshold of p < 0.001 (uncorrected). No regions showed greater activity
for words spoken in neutral as compared to sad prosody. To determine whether there
is a relationship between the neural correlates of prosody encoding and memory
performance, I conducted a separate whole-brain regression analysis with the contrast
obtained for the emotional difference in neural activity (Contrast #1) as the predictor
and the emotional difference in memory performance (d’ ESI) as criterion. This
regression analysis did not reveal any regions that showed a relationship between
memory performance and neural activity.
Table 4. Table illustrating peak activations for hitemo > hitneu contrast for the study
phase.
Region                                                              BA      Talairach coordinates   Peak T-statistic   Peak Z-score
Left Superior Temporal Gyrus / Superior Temporal Sulcus cluster     BA 41   -48 -15   2             11.760             7.690
Left Superior Temporal Gyrus / Superior Temporal Sulcus cluster     BA 41   -48 -24   5             10.320             7.170
Left Superior Temporal Gyrus / Superior Temporal Sulcus cluster     BA 41   -45 -32  16              9.390             6.780
Right Transverse Temporal Gyrus                                     BA 41    53 -24  10             10.520             7.240
Right Superior Temporal Gyrus / Superior Temporal Sulcus cluster    BA 22    50 -10   1              9.230             6.720
Right Superior Temporal Gyrus / Superior Temporal Sulcus cluster    BA 22    45   4 -12              4.880             4.300
Right Inferior Frontal Gyrus*                                       BA 44    54  20  19              4.770             4.220
Right Inferior Frontal Gyrus*                                       BA 44    50  17  11              4.570             4.070
Figure 4. Figure illustrating regions that show greater activity for words spoken in
negative as compared to neutral intonation. Peak activations at left superior temporal
gyrus / superior temporal sulcus cluster, right superior temporal gyrus / superior
temporal sulcus cluster and right transverse temporal gyrus.
Test phase
I first examined the regions with greater activity for the correct recognition of
‘old’ words (hits) as compared to the correct rejection of ‘new’ words. Such an effect
was observed in left superior frontal gyrus, bilateral middle frontal gyrus, left middle
temporal gyrus, left anterior cingulate, bilateral inferior frontal gyrus, bilateral
cingulate gyrus, bilateral sub-gyral and right middle occipital gyrus (Table 5). I then
examined whether neural activity in response to test words previously successfully
encoded with the emotional prosody (hitemo) was greater than those encoded with the
neutral prosody (hitneu). Results did not reveal any regions that showed greater
activity for this contrast. Finally, I examined whether the emotional difference in
memory performance (d’ ESI) is greater for individuals with a greater prosody effect
on neural activity at test by conducting a regression analysis using the contrast
generated for the emotional difference in neural activity at test as the predictor and d’
ESI as the criterion. This regression analysis yielded no significant voxels.
Table 5. Table illustrating peak activations for hits > correct rejections contrast for the
test phase.
Region                          Talairach coordinates   Peak T-statistic   Peak Z-score
Left Superior Frontal Gyrus      -6   5  53             17.870             Inf
Left Sub-Gyral                  -33 -46  39             17.010             Inf
Left Inferior Frontal Gyrus     -48   9  23             14.400             Inf
Left Cingulate Gyrus             -3 -23  27             11.640             7.650
Left Cingulate Gyrus             -3   3  26              7.050             5.650
Left Superior Frontal Gyrus     -27  54  -4              8.310             6.300
Left Middle Frontal Gyrus       -33  57   4              7.200             5.730
Left Sub-Gyral                  -30  43   4              6.080             5.080
Left Middle Temporal Gyrus      -50 -44   9              7.100             5.680
Left Anterior Cingulate          -9  37  10              6.390             5.270
Right Cingulate Gyrus             6   3  26              7.430             5.850
Right Middle Occipital Gyrus     30 -85   0             11.020             7.430
Right Sub-Gyral                  39 -62  -4              8.790             6.520
Right Inferior Frontal Gyrus     50  12  28             10.270             7.150
Right Middle Frontal Gyrus       33   4  50              9.190             6.700
Right Inferior Frontal Gyrus     36  16  -3              9.080             6.650
Right Middle Frontal Gyrus       39  32  19              7.280             5.770
BA column entries as extracted, in order (their row alignment is not recoverable): -, BA 23, BA 24, BA 10, BA 18, BA 9, BA 6, BA 47, -
Discussion
The current study examined the influence of emotional prosody on the encoding
of spoken words and determined whether neural activity in response to prosody
during encoding predicts subsequent recognition memory during test. Words spoken
with a negative prosody were expected to evoke greater activity in the amygdala,
superior temporal sulcus, superior temporal gyrus and transverse temporal gyrus.
Greater activity was evoked by negative prosody relative to neutral prosody only in
the superior temporal gyrus and transverse temporal gyrus, but not in the amygdala.
No regions showed a correlation between the emotional difference on encoding
activity and the emotional difference in memory performance. The effect of emotion
was also examined for recognition memory. Similar to the findings from previous
studies, results yielded significantly greater activity for words successfully recognized
as ‘old’ (hits) than for words successfully recognized as ‘new’ (correct rejections).
However, further analysis did not reveal significant differences in the memory effect
(hits relative to correct rejections) as a function of emotion. In addition, the emotional
difference in memory performance seemed unrelated to the emotional difference in
neural activity both during study and test. The significance and implications of these
findings will be discussed in the General Discussion.
General Discussion
The overarching objective of the current thesis was to examine the heart rate and
neural correlates underlying the influence of emotional prosody on verbal memory.
My findings have, in general, successfully replicated and extended prior work in this
realm. In the following, I will discuss in greater detail the effects of encoding prosody
on verbal memory and heart rate / neural correlates and the extent to which these
prosody encoding effects predict verbal memory. Finally, I discuss the limitations of
the current thesis and propose future directions in investigating the multifaceted
relationship between prosody and verbal memory.
Heart rate and neural correlates of prosody encoding
During encoding, I observed a greater HR deceleration in response to words
spoken with sad as compared to neutral prosody. This finding is in accord with
previous studies that reported greater HR deceleration in response to emotional as
compared to neutral stimuli (Palomba, Angrilli & Mini, 1997; Hamann, Ely, Grafton,
& Kilts, 1999; Bradley & Lang, 2000b). A decrease in heart rate is thought to
reflect an orienting response towards an incoming stimulus (Sokolov, 1963; Graham
& Clifton, 1966; Lacey & Lacey, 1969). Thus, the greater HR deceleration observed
here may reflect a greater orienting of attention to sad as compared to neutral prosody.
Interestingly, such an enhanced deceleratory effect was observed previously across
different types of emotional stimuli. For instance, in a study by Palomba and
colleagues (1997), pleasant, neutral and unpleasant picture slides were used. The
authors observed a greater HR deceleration in attending to unpleasant slides than
neutral or pleasant slides. Kreibig, Wilhelm, Gross and Roth (2007) observed a
greater decline in heart rate when participants viewed sad as compared to
neutral films. Bradley and Lang (2000b) used
acoustic stimuli that were rated pleasant and unpleasant (such as the sound of a crying
baby or bees buzzing). They observed greater HR deceleration for unpleasant sounds
relative to pleasant sounds. Finally, evidence also stems from studies that used other
acoustic stimuli such as music excerpts (Krumhansl, 1997; Etzel et al., 2006).
For example, Etzel and colleagues (2006) presented subjects with music excerpts that
conveyed happiness, sadness, and fear while heart rate data were recorded. Results
revealed a heart rate deceleration during sadness induction (Etzel et al., 2006). Taken
together, these findings seem to suggest that emotional and in particular negative
stimuli evoke greater heart rate deceleratory responses as compared to neutral stimuli,
and that this effect holds for a wide range of stimuli including pictures, sounds and
musical excerpts.
The results of the first experiment conducted in this thesis suggest that the HR
deceleratory effect extends to words spoken with sad prosody. They imply that words
spoken with a sad prosody elicit greater orienting of attention than words spoken with
a neutral prosody. The fMRI study conducted in the second experiment supports this
assertion. Compared to words spoken with a neutral prosody, those spoken with a sad
prosody elicited greater activity in the superior temporal gyrus and the transverse
temporal gyrus. That encoding emotional prosody would engage the superior
temporal gyrus and superior temporal sulcus to a greater extent than neutral prosody
is not surprising granted the array of studies that have found a similar pattern in
prosody perception (for a review see Schirmer & Kotz, 2006). For instance, in an
fMRI study that also examined the difference in activity in processing emotional and
neutral prosody, Wildgruber and colleagues (2005) found greater activity in the
superior temporal gyrus when participants attended to prosodic excerpts of multiple
emotions (sad, happy, fearful, angry and erotic) as compared to neutral excerpts. The
current experiments show that such an effect also holds when sad prosody is
examined individually.
Effects of prosody on verbal memory performance
Memory performance for words spoken with the neutral prosody was better as
compared to those spoken with the sad prosody in both experiments. This finding is in
discord with that of Schirmer (2010), where memory performance was found to be
comparable for both conditions. The longer inter-stimulus intervals (ISIs) used in the
current study, as compared to the short ISIs used in Schirmer’s study, could
account for the discrepant findings. The greater time interval available for
memorizing words in the present study likely reduced memory load and task
difficulty. That this produces a memory benefit for neutrally intoned speech is in line
with evidence from Kitayama’s study (Kitayama, 1996) which demonstrated that for a
task that involves low memory load, recognition performance for distracting
sentences spoken with an emotional prosody was worse than that for their
counterparts spoken with a neutral prosody. But when task difficulty increases with an
increase in memory load, the effect of prosody on memory diminishes. Hence, one
might equate memorizing words at shorter intervals to a task of greater difficulty and
memorizing words at longer intervals to a task that is less demanding. Thus, in the
case of recognition memory, when the task is relatively demanding or involves a high
memory load, the vocal emotion poses neither a threat nor advantage to the verbal
content. However, when the memory load is low, emotional prosody may disrupt
verbal recognition memory relative to neutral prosody. This argument is further
supported by evidence for higher mean d’ scores obtained in both experiments 1 and 2
(mean d’ ~ 1.8) as compared to the relatively lower d’ scores reported in Schirmer
(2010) (mean d’ ~ 1.5). Although such comparisons call for caution, since the
studies differed in subject pool and number of trials, the memory paradigms
employed in all three studies
were nevertheless comparable. Hence, one may speculate that the memory task in
Schirmer’s study was indeed more demanding than the one used in the current thesis.
As previously mentioned, this difference in task difficulty may stem from a
differential amount of time available for encoding words.
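As an aside on the d’ scores compared above: d’ is conventionally computed as the z-transformed hit rate minus the z-transformed false-alarm rate. The sketch below assumes that standard signal-detection formula and takes the emotional difference (ESI) to be the sad-minus-neutral difference. The function name and all rates are hypothetical illustration values, not data from either experiment.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; rates of exactly
    0 or 1 need a correction before the z-transform is defined.
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates, chosen for illustration only.
d_neutral = d_prime(0.80, 0.20)  # words encoded with neutral prosody
d_sad = d_prime(0.75, 0.20)      # words encoded with sad prosody

# Emotional difference (ESI), assumed here to be sad minus neutral.
d_esi = d_sad - d_neutral
print(round(d_neutral, 3), round(d_sad, 3), round(d_esi, 3))
```

With these made-up rates the ESI is negative, which is the direction of the behavioral result described above (better recognition after neutral than after sad prosody).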
Perhaps the enhanced memory performance for words encoded with neutral
prosody as compared to emotional prosody can be better understood by examining the
match between conditions at encoding and retrieval. Encoding-retrieval matches
have been studied extensively, and recent work
suggests that in optimizing memory performance, it is not enough to simply ensure a
good match between encoding and retrieval conditions; the relative diagnostic value
of the encoding-retrieval match is also crucial (Nairne, 2002; Goh & Lu, 2010). The
diagnostic value refers to the presence of features that help discriminate the target
from competing foils. For example, if a target word “nail” is encoded and stored in
the context of the word “finger”, the words “human body part” would make a better
encoding-retrieval match and thus a more effective retrieval cue than the word “tool”.
However, if other words such as “toe” and “hand” were included in the encoding list,
the effectiveness of the cue “human body part” may diminish. Hence, memory
enhancement by encoding-retrieval match is contingent on the number of competing
foils subsumed under the retrieval cue. In the current study, although participants clearly
retained emotional information embedded within emotionally encoded words (as
evident from the more negative ratings for words encoded with sad relative to neutral
prosody), they might have found it difficult to use this information at recognition as
several other words were also encoded in the same context (i.e. sad prosody). Hence,
the visual cue presented at recognition had little diagnostic value for words encoded
with the emotional prosody. Instead, the essentially more neutral cues (visually
presented words) at recognition provided a better match with words encoded with the
neutral prosody than those encoded with the emotional prosody, giving rise to a better
memory performance for the former.
The present research is in line with Schirmer’s work in that it failed to find a
memory advantage for words spoken emotionally. This finding is supported by
current neuroimaging results that revealed no significant contribution of the amygdala
in processing emotional prosody. Multiple studies have demonstrated that the
emotional enhancement of memory is contingent on amygdala activity (Canli, Zhao,
Brewer, Gabrieli, & Cahill, 2000) and on the interaction between hippocampal and amygdala
activity (Cahill & McGaugh, 1998). Moreover, the engagement of the amygdala in
vocal emotional processing has been demonstrated to vary based on individual traits
such as social orientation (Schirmer et al., 2008) and to be less reliable than for
emotional stimuli from other modalities (Schirmer & Kotz, 2006).
Despite differences in the recognition of words previously studied with neutral
and sad prosody, the present study found no accompanying differences in brain
activity at test. Two reasons could account for this null finding. First, it is possible
that there is an immense amount of inter-individual variation in how prosody may
affect recognition memory so that effects cancel each other out. For instance, while
some participants may show emotionally enhanced neural activity in regions
concerning memory recognition, others may display the opposite pattern. Hence,
future studies may need to explore such inter-individual variation.
Second, sad prosody, being a relatively weak emotional stimulus in terms of
arousal as compared to vocal
screams or other emotional sounds, may have caused only small changes in the
memory representation of the verbal content that can only be detected with a larger
sample size.
Apart from studying encoding and test differences as a function of prosody, I was
interested in examining a potential relation between the two. Specifically, I asked
whether prosody encoding effects predict recognition accuracy. Although results
revealed no significant correlation between the HR deceleration effect at encoding
and subsequent word recognition, individual variation in the HR acceleration effect at
encoding predicted subsequent word recognition. A greater HR acceleration to sadly
spoken words relative to neutrally spoken words was associated with a smaller
memory decrement for the former relative to the latter. This is in line with the
proposal that HR acceleration reflects cognitive effort and enhances stimulus
processing (Lacey & Lacey, 1980; Barry & Tremayne, 1987). It also
accords with research suggesting a benefit for the retention of information that
triggers an increase in bodily arousal (Jennings & Hall, 1980; Kahneman & Peavler,
1969). However, the fact that, overall, neutrally spoken words were more readily
recognized than sadly spoken words suggests that the retention of verbal information
is a complex process that is subject to a range of factors, including whether
emotional information is expressed verbally or vocally and what the demands are on
memory retention (Kitayama, 1996).
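The individual-differences analysis described in this paragraph relates two difference scores per participant: the HR acceleration ESI and the d’ ESI. A minimal sketch of such a Pearson correlation follows, assuming one value of each index per participant. The helper name and the five data points are hypothetical and are not the thesis data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-participant difference scores (sad minus neutral).
hr_acceleration_esi = [0.5, 1.2, -0.3, 0.8, 0.1]
d_prime_esi = [-0.2, 0.1, -0.5, 0.0, -0.3]

r = pearson_r(hr_acceleration_esi, d_prime_esi)
print(f"r = {r:.3f}")
```

A positive r in this setup means that participants with a larger HR acceleration to sad relative to neutral words show a less negative d’ ESI, that is, a smaller memory decrement for sadly spoken words.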
Contrary to expectations, encoding activity in the superior temporal gyrus and
superior temporal sulcus did not appear to predict subsequent verbal memory
performance. In fact, no regions showed a significant relationship between encoding
activity and d’ scores. One possible explanation for this null finding is that there is a
sizeable amount of inter-individual variation in how prosody affected verbal memory
performance. While most of the participants showed better memory performance for
words encoded with neutral in comparison to negative prosody, a subset of them
either showed no difference or performed better with emotional prosody. Hence,
memory processes may differ amongst these individuals and complicate the
relationship between neural activity and verbal memory performance.
Effects of prosody on word valence judgment
Despite differences in the role of prosody for recognition memory, the
present study successfully replicated the finding of prosodic influences on word
valence by Schirmer (2010). Words successfully encoded with the sad prosody were
subsequently rated more negatively than words encoded with the neutral prosody.
This valence shift effect thus appears robust across different intervals
between words. In other words, the tone of voice that speech is delivered in can
reliably influence the affective connotation of its verbal content. However, the word
valence shift effect observed above did not seem to be related to HR correlates.
Individuals who reacted more strongly to emotional as compared to neutral prosody in
terms of cardiac deceleration and acceleration were not necessarily the ones who
perceived sadly spoken words and neutrally spoken words differently during test.
Several reasons could account for this finding. First, the valence shift effect
observed may be due to consolidatory mechanisms that extend beyond the encoding
phase. Hence, the heart rate measures obtained during encoding may not be
appropriate for predicting subsequent valence judgment. Second, the absence of a
relationship between HR and subsequent word valence may reflect an
inappropriate measure of that relationship. The measure comprised an average of HR
changes and valence across many items and thus failed to account for item-specific
co-variation. Additionally, it assumed linearity for a relationship that may well be
more complex.
Limitations and future directions
The present study has several limitations. First, the stimuli were limited to words
spoken with sad prosody. Findings observed with this “withdrawal” emotion may not
apply to “approach” emotions such as anger. Future studies need to include
other negative emotions such as fear, anger and disgust to determine whether
implications of current findings can be extended to other types of negative emotions.
Other vocal stimuli such as vocalizations (e.g. wail or cough) could be employed to
see if a similar pattern of memory impairment may be observed. In addition, the
current study did not consider individual subject traits that may contribute to the
inter-individual differences in memory performance and thus, may have
underestimated the complexity of the relationship between prosody and verbal
memory. For instance, factors such as the extent to which one is socially orientated
can influence vocal processing (Schirmer et al., 2008) and may affect the relationship.
Biological factors such as estrogen are also known to affect vocal processing
(Schirmer, Escoffier, Li, Li, Strafford-Wilson & Li, 2008). Future studies need to factor in
these variables to better characterize the interplay between prosody and verbal
memory. Finally, as discussed previously, the interval between word presentations
seems to affect the relationship between encoding prosody and verbal memory
performance, at least for recognition memory. Future studies could attempt to include
the interval between words as a variable of interest to understand how “pauses”
inserted between verbal information can modulate the effect of prosody on verbal
memory.
Conclusions
To conclude, the current study has successfully replicated Schirmer’s finding that
the affective connotation of a word in memory can be altered by changing its spoken
prosody. It has also demonstrated that this valence shift effect is indeed a robust effect
that is relatively independent of the time interval between to-be-encoded words.
Current findings also point to a link between individual variation in heart rate
acceleration and prosody effects on verbal memory. Individual variation in heart rate
acceleration between sadly and neutrally spoken words correlated positively with the
emotional difference in memory performance (d’ ESI). Hence, it may be prudent to
consider cardiac sensitivity as an inter-individual variable in examining effects of
vocal emotional expressions on memory. Taken together, this and previous findings
(Schirmer, 2010; Kitayama, 1996) suggest that the relationship between prosody and
verbal memory may be far more complex than originally conceived. Other factors
such as intervals of to-be-encoded words, task difficulty, memory load and subject
traits may modulate the effect of encoding prosody on verbal memory performance.
Bibliography
Adolphs, R. (2002). Neural systems for recognizing emotion. Current Opinion in
Neurobiology, 12, 169-77.
Banse, R., & Scherer, K. R. (1996). Acoustic profiles in vocal emotion expression.
Journal of Personality and Social Psychology, 70, 614-636.
Barry, R. J., & Tremayne, P. (1987). Separation of components in the evoked cardiac
response under processing load. Journal of Psychophysiology, 1, 259-264.
Beaucousin, V., Lacheret, A., Turbelin, M. R., Morel, M., Mazoyer, B., & Tzourio-Mazoyer,
N. (2007). FMRI study of emotional speech comprehension. Cerebral
Cortex, 17, 339-352.
Belin, P., Zatorre, R. J., Lafaille, P., Ahad, P. & Pike, B. (2000). Voice-selective areas
in human auditory cortex. Nature, 403, 309-312.
Berntson, G. G., Cacioppo, J. T., & Quigley, K. S. (1995). The metric of cardiac
chronotropism: Biometric perspectives. Psychophysiology, 32, 162-171.
Bradley, M. M., & Lang, P. J. (2000a). Measuring emotion: Behavior, feeling and
physiology. In R. Lane & L. Nadel (Eds.), Cognitive neuroscience of emotion, pp.
242-276. New York: Oxford University Press.
Bradley, M. M., & Lang, P. J. (2000b). Affective reactions to acoustic stimuli.
Psychophysiology, 37, 204-215.
Breiter, H. C., Etcoff, N. L., Whalen, P. J., Kennedy, W. A., Rauch, S. L., Buckner, R.
L., Strauss, M. M., Hyman, S. E., & Rosen, B. R. (1996). Response and
Habituation of the Human Amygdala during Visual Processing of Facial
Expression. Neuron, 17, 875-887.
Brosch, T., Grandjean, D., Sander, D., & Scherer, K. R. (2008). Behold the voice of
wrath: Cross-modal modulation of visual attention by anger prosody. Cognition,
106, 1497-1503.
Buchanan, T. W., Etzel, J. A., Adolphs, R. A., & Tranel, D. (2006). The influence of
autonomic arousal and semantic relatedness on memory for emotional words.
International Journal of Psychophysiology, 61, 26-33.
Buckner, R. L., Koutstaal, W., Schacter, D. L., Wagner, A. D., & Rosen, B. R. (1998).
Functional-anatomic study of episodic retrieval using fMRI. I. Retrieval effort
versus retrieval success. Neuroimage, 7, 151-162.
Cahill, L., & McGaugh, J. L. (1998). Mechanisms of emotional arousal and lasting
declarative memory. Trends in Neurosciences, 21, 294-299.
Canli, T., Zhao, Z., Brewer, J., Gabrieli, J. D., & Cahill, L. (2000). Event-related
activation in the human amygdala associates with later memory for individual
emotional experience. Journal of Neuroscience, 20.
Canli, T., Sivers, H., Whitfield, S. L., Gotlib, I. H., & Gabrieli, J. D. (2002). Amygdala
response to happy faces as a function of extraversion. Science, 296, 2191.
Chee, M. W. L., Goh J. O., Lim, Y., Graham, S., & Lee, K. (2004). Recognition
memory for studied words is determined by cortical activation differences at
encoding but not during retrieval. Neuroimage, 22, 1456-1465.
Costafreda, S.G., Brammer, M.J., David, A.S., & Fu, C.H.Y. (2008). Predictors of
amygdala activation during the processing of emotional stimuli: a meta-analysis
of 385 PET and fMRI studies. Brain Research Reviews, 58, 57-70.
Critchley, H. D., Rotshtein, P., Nagai, Y., O'Doherty, J., Mathias, C. J., & Dolan, R. J.
(2005). Activity in the human brain predicting differential heart rate responses to
emotional facial expressions. Neuroimage, 24, 751-762.
Dolan R. J., & Vuilleumier, P. (2003). Amygdala automaticity in emotional
processing. Annals of the New York Academy of Sciences, 985, 348-355.
Eldridge, L. L., Knowlton, B. J., Furmanski, C. S., Bookheimer, S. Y., & Engel, S. A.
(2000). Remembering episodes: a selective role for the hippocampus during
retrieval. Nature Neuroscience, 3, 1149-1152.
Ethofer, T., Anders, S., Erb, M., Herbert, C., Wiethoff, S., Kissler, J., Grodd, W., &
Wildgruber, D. (2006). Cerebral pathways in processing of affective prosody: A
dynamic causal modeling study. Neuroimage, 30, 580-587.
Etzel, J. A., Johnsen, E. L., Dickerson, J., Tranel, D., & Adolphs, R. (2006).
Cardiovascular and respiratory responses during musical mood induction.
International Journal of Psychophysiology, 61, 57-69.
Fecteau, S., Belin, P., Joanette, Y., & Armony, J. L. (2007). Amygdala responses to
nonlinguistic emotional vocalization. Neuroimage, 36, 480-487.
Fusar-Poli, P., Placentino, A., Carletti, F., Landi, P., Allen, P., Surguladze, S.,
Benedetti, F., Abbamonte, M., Gasparotti, R., Barale, F., Perez, J., McGuire, P.,
& Politi, P. (2009). Functional atlas of emotional face processing: a voxel-based
meta-analysis of 105 functional magnetic resonance imaging studies. Journal of
Psychiatry & Neuroscience, 34, 418-432.
Graham F. K., & Clifton, R. K. (1966). Heart-rate change as a component of the
orienting response. Psychological Bulletin, 65, 305-320.
Grandjean, D., Sander, D., Pourtois, G., Schwartz, S., Seghier, M., Scherer, K.R., &
Vuilleumier, P. (2005). The voices of wrath: brain responses to angry prosody in
meaningless speech. Nature Neuroscience, 8, 145-146.
Hamann S. B., Ely, T. D., Grafton, S. T., & Kilts, C. D. (1999). Amygdala activity
related to enhanced memory for pleasant and aversive stimuli. Nature
Neuroscience, 2, 289-293.
Hariri, A. R., Bookheimer, S. Y., & Mazziotta, J. C. (2000). Modulating emotional
responses: effects of a neocortical network on the limbic system. Neuroreport, 11,
43-48.
Harrison, L. K., & Turpin, G. (2003). Implicit memory bias and trait anxiety: a
psychophysiological analysis. Biological Psychology, 62, 97-114.
Henson, R. N., Hornberger, M., & Rugg, M. D. (2005). Further dissociating the
processes involved in recognition memory: an FMRI study. Journal of Cognitive
Neuroscience, 17, 1058-1073.
Jennings, J. R., Berg, W. K., Hutcheson, J. S., Obrist, P., Porges, S. and Turpin, G.
(1981). Publication Guidelines for Heart Rate Studies in Man. Psychophysiology,
16, 226-231.
Jennings, J. R., & Hall, S. W. Jr. (1980). Recall, recognition, and rate: memory and
the heart. Psychophysiology, 17, 37-46.
Kahneman, D., & Peavler, W. S. (1969). Incentive effects and pupillary changes in
association learning. Journal of Experimental Psychology, 79, 312-318.
Kensinger E. A. (2004). Remembering emotional experiences: The contribution of
valence and arousal. Reviews in the Neurosciences, 15, 241-251.
Kensinger, E. A. & Corkin, S. (2004). Two routes to emotional memory: Distinct
neural processes for valence and arousal. Proceedings of the National Academy of
Sciences USA, 101, 3310-3315.
Kitayama, S. (1996). Remembrance of emotional speech: Improvement and
impairment of incidental verbal memory by emotional voice. Journal of
Experimental Social Psychology, 32, 289-308.
Konishi, S., Wheeler, M. E., Donaldson, D. I., & Buckner, R. L. (2000). Neural
correlates of episodic retrieval success. Neuroimage, 12, 276-286.
Kotz, S.A., Meyer, M., Alter, K., Besson, M., von Cramon, D.Y., & Friederici, A.D.
(2003). On the lateralization of emotional prosody: An event-related functional
MR investigation. Brain and Language, 68, 366-376.
Kreibig, S. D., Wilhelm, F. H., Gross, J. J. & Roth, W. T. (2007). Effects of film-induced
fear and sadness on acoustic startle. Psychophysiology, 43 (S1), S54.
Poster presented at the Society for Psychophysiological Research, 46th Annual
Meeting, Vancouver, BC, Canada.
Krumhansl, C. L. (1997). An exploratory study of musical emotions and
psychophysiology. Canadian Journal of Experimental Psychology, 51, 336-353.
LaBar, K. S., & Cabeza, R. (2006). Cognitive neuroscience of emotional memory.
Nature Reviews Neuroscience, 7, 54-64.
Lacey, B. C., & Lacey, J. I. (1980). Cognitive modulation of time dependent
bradycardia. Psychophysiology, 17, 209-221.
Mickley K. R. & Kensinger E. A. (2008). Neural processes supporting subsequent
recollection and familiarity of emotional items. Cognitive, Affective, and
Behavioral Neuroscience, 8, 143-152.
Miller, M. B., Van Horn, J. D., Wolford, G. L., Handy, T. C., Valsangkar-Smyth,
M., Inati, S., Grafton, S., & Gazzaniga, M. (2002). Extensive Individual
Differences in Brain Activations Associated with Episodic Retrieval are Reliable
Over Time. Journal of Cognitive Neuroscience, 14, 1200-1214.
Mitchell, R.L.C., Elliot, R., Barry, M., Cruttenden, A., & Woodruff, P.W.R. (2003).
The neural response to emotional prosody, as revealed by functional magnetic
resonance imaging. Neuropsychologia, 41, 1410-1421.
Morris, J. S., Frith, C. D., Perrett, D. I., Rowland, D., Young, A. W., Calder, A. J., &
Dolan, R. J. (1996). A differential neural response in the human amygdala to
fearful and happy facial expressions. Nature, 383, 812-815.
Morris, J. S., Scott, S. K., & Dolan, R. J. (1999). Saying it with feeling: neural
responses to emotional vocalizations. Neuropsychologia, 37, 1155-1163.
Murty, V. P., Ritchey, M., Adcock, R. A., & LaBar, K. S. (2010). fMRI studies of
successful emotional memory encoding: A quantitative meta-analysis.
Neuropsychologia, 48, 3459-3469.
Näätänen, R., & Alho, K. (1995). Mismatch negativity – a unique measure of sensory
processing in audition. International Journal of Neuroscience, 80, 317-337.
Näätänen, R. (2001). The perception of speech sounds by the human brain as reflected
by the mismatch negativity (MMN) and its magnetic equivalent (MMNm).
Psychophysiology, 38, 1-21.
Nygards, M. E., & Sornmo, L. (1983). Delineation of the QRS complex using the
envelope of the ECG. Medical & Biological Engineering & Computing, 21, 538-547.
Ogawa, S., Lee, T.M., Nayak, A.S., & Glynn, P. (1990). Oxygenation-sensitive
contrast in magnetic resonance image of rodent brain at high magnetic fields.
Magnetic Resonance in Medicine, 14, 68-78.
Ohman, A., & Mineka, S. (2001). Fears, phobias, and preparedness: toward an
evolved module of fear and fear learning. Psychological Review, 108, 483-522.
Palomba, D., Angrilli, A. & Mini, A. (1997). Visual evoked potentials, heart rate
responses and memory to emotional pictorial stimuli. International Journal of
Psychophysiology, 27, 55-67.
Palomba, D., Sarlo, M., Angrilli, A., Mini, A., & Stegagno, L. (2000). Cardiac
responses associated with affective processing of unpleasant film stimuli.
International Journal of Psychophysiology, 36, 45-57.
Phelps, E. A., & LeDoux, J. E. (2005) Contributions of the amygdala to emotion
processing: from animal models to human behavior. Neuron, 175-187.
Phillips, M. L., Young, A.W., Scott, S.K., Calder, A.J., Andrew, C., Giampietro, V.,
Williams, S.C., Bullmore, E.T., Brammer, M., & Gray, J.A. (1998). Neural
responses to facial and vocal expressions of fear and disgust. Proceedings
Biological Science, 265, 1809-1817.
Sander, K., & Scheich, H. (2001). Auditory perception of laughing and crying
activates human amygdala regardless of attentional state. Cognitive Brain
Research, 12, 181-198.
Sander, D., Grafman, J., & Zalla, T. (2003). The human amygdala: An evolved
system for relevance detection. Reviews in the Neurosciences, 14, 303-316.
Sander, D., Grandjean, D., Pourtois, G., Schwartz, S., Seghier, M. L., Scherer, K. R.,
& Vuilleumier, P. (2005). Emotion and attention interactions in social cognition:
brain regions involved in processing anger prosody. Neuroimage, 28, 848-58.
Scherer, K. R. (1986). Vocal affect expression: A review and a model for future
research. Psychological Bulletin, 99, 143-165.
Schirmer, A., Kotz, S.A., & Friederici, A.D. (2002). Sex differentiates the role of
emotional prosody during word processing. Cognitive Brain Research, 14, 228-233.
Schirmer, A. & Kotz, S.A. (2003). ERP evidence for a gender specific Stroop effect in
emotional speech. Journal of Cognitive Neuroscience, 15, 1135-1148.
Schirmer, A., Striano, T., & Friederici, A.D. (2005). Sex differences in the preattentive processing of vocal emotional expressions. Neuroreport, 16, 635-639.
Schirmer, A., Escoffier, N., & Simpson, E. (2007). Listen up! Processing of intensity
change differs for vocal and nonvocal sounds. Brain Research, 1176, 103-112.
Schirmer, A., & Kotz, S.A. (2006). Beyond the right hemisphere: Brain mechanisms
mediating vocal emotional processing. Trends in Cognitive Sciences, 10, 24-30.
Schirmer, A., Escoffier, N., Zysset, S., Koester, D., Striano, T., & Friederici, A. D.
(2008). When vocal processing gets emotional: On the role of social orientation in
relevance detection by the human amygdala. NeuroImage, 40, 1402-1410.
Schirmer, A., Escoffier, N., Li, Q.Y., Li, H., Strafford-Wilson, J., & Li, W. I. (2008).
What grabs his attention but not hers? Estrogen correlates with
neurophysiological measures of vocal change detection.
Psychoneuroendocrinology, 33, 718-727.
Schirmer, A. (2010). Mark My Words: Tone of Voice Changes Affective Word
Representations in Memory. PLoS ONE, 5, e9080.
Sokolov, E. N. (1963). Higher Nervous Functions: The Orienting Reflex. Annual
Review of Physiology, 25, 545-580.
Sommer, T., Gläscher, J., Moritz, S., & Büchel, C. (2008). Emotional enhancement
effect of memory: Removing the influence of cognitive factors. Learning and
Memory, 15, 569-573.
Vuilleumier, P., Armony, J. L., & Dolan, R. J. (2003). Reciprocal Links between
Emotion and Attention. Human Brain Function, pp. 419-444. San Diego:
Academic Press.
Vuilleumier, P., Armony, J. L., Clarke, K., Husain, M., Driver, J., & Dolan, R. J.
(2002). Neural response to emotional faces with and without awareness: event-related
fMRI in a parietal patient with visual extinction and spatial neglect.
Neuropsychologia, 40, 2156-2166.
Vytal, K., & Hamann, S. (2010). Neuroimaging support for discrete neural correlates
of basic emotions: a voxel-based meta-analysis. Journal of Cognitive
Neuroscience, 22, 2864-85.
Wildgruber, D., Riecker, A., Hertrich, I., Erb, M., Grodd, W., Ethofer, T., &
Ackermann, H. (2005). Identification of emotional intonation evaluated by fMRI.
NeuroImage, 24, 1233-1241.
Wiethoff, S., Wildgruber, D., Kreifelts, B., Becker, H., Herbert, C., Grodd, W., &
Ethofer, T. (2008). Cerebral processing of emotional prosody – influence of
acoustic parameters and arousal. Neuroimage, 39, 885-893.
[...]...4 Effects of speaker prosody on verbal memory Effects of emotional prosody on verbal memory have been examined by Kitayama (1996) The author tested the effects of emotional prosody on memory under different memory load conditions In his study, participants performed a memory span task which required them to memorize either two (low load) or four (high load) two-digit numbers for 20s During the 20s interval,... select old from among new sentences and to indicate their level of confidence in this selection When the memory load was low, results for recognition memory paralleled that of the free recall memory in that memory for sentences spoken with the neutral prosody was better than that for 5 sentences spoken with the emotional prosody In contrast, when memory load was high, memory for both types of sentences... that emotional prosody can either improve or impair memory for verbal content, and that the effect of emotional prosody on memory depend largely on memory load and the retrieval method employed at test (Kitayama, 1996) A recent study by Schirmer (2010) also explored the effect of speaker prosody on the memory representation of words In this study, participants performed a crossmodal verbal memory paradigm... obtained for the neutral condition were subtracted from values obtained for the sad condition for HR deceleration, HR acceleration, d’ scores and mean valence ratings The resulting indices were then subjected to the following two-tailed Pearson correlation analyses First, I tested the relationship between the HR deceleration ESI and the d’ ESI This analysis was non-significant (p > 0.1) Next, I tested the. .. 
... with a fixation cross that was presented for 0.2 s in the center of the screen, followed by a spoken word simultaneously presented with a fixation cross, the latter lasting 2.3 s. The trial ended with a blank screen marking the onset of the intertrial interval (ITI). The ITI was jittered from 12 to 15 s in one-second steps. Each study phase consisted of 60 trials. Half of the trials consisted of words spoken ...

... deceleration for threatening words and taboo words, there seems to be a discrepancy with respect to HR acceleration. This may stem from the nature of the stimuli and calls for further investigation.

fMRI studies on emotional processing
The last century has seen an explosion in the number of studies that used noninvasive techniques such as functional magnetic resonance imaging (fMRI) to examine the neural ...

... tested the relationship between the HR acceleration ESI and the d’ ESI and observed a significant positive correlation (r = 0.287, p = 0.05). Correlations between cardiac responses and the valence rating were non-significant (ps > 0.1).

Discussion
The current study explored the influence of vocal emotions on heart rate during verbal encoding and whether such influences predict subsequent verbal memory ...

... based on the emotional context in which these words are encountered. A recent study using electroencephalography (Schirmer et al., in preparation) replicated these results and further outlined the time course of prosody encoding processes that underlie the observed change in affective memory. The present thesis aimed to extend this work by studying the role of emotion-related autonomic changes and the ...
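The study-phase trial timing described in the excerpt above (0.2 s fixation, 2.3 s spoken word with fixation, then an ITI of 12 to 15 s in one-second steps, 60 trials per phase) can be illustrated with a small schedule generator. This is a sketch under stated assumptions: the source does not say how the jitter was sampled, so a uniform random draw is assumed here, and all names are hypothetical.

```python
import random

FIXATION_S = 0.2                   # initial fixation cross
WORD_S = 2.3                       # spoken word shown together with a fixation cross
ITI_CHOICES_S = (12, 13, 14, 15)   # jittered ITI in one-second steps
N_TRIALS = 60                      # trials per study phase


def build_schedule(n_trials=N_TRIALS, seed=None):
    """Return onset times (in seconds) for each trial of one study phase.

    Assumption: each ITI is drawn uniformly at random from ITI_CHOICES_S.
    """
    rng = random.Random(seed)
    schedule = []
    t = 0.0
    for index in range(n_trials):
        iti = rng.choice(ITI_CHOICES_S)
        schedule.append({
            "trial": index + 1,
            "fixation_onset": t,
            "word_onset": t + FIXATION_S,
            "iti_onset": t + FIXATION_S + WORD_S,  # blank screen begins here
            "iti": iti,
        })
        t += FIXATION_S + WORD_S + iti
    return schedule


if __name__ == "__main__":
    for trial in build_schedule(seed=1)[:3]:
        print(trial)
```

With these durations each trial lasts between 14.5 and 17.5 s, so a 60-trial phase runs for roughly 14.5 to 17.5 minutes.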
... response. Functional images for the test phase were acquired using a similar EPI sequence, with the exception of the TR being 2 s.

Image analysis
The fMRI data were preprocessed and analyzed using the Statistical Parametric Mapping software (SPM8, Wellcome Trust Centre for Neuroimaging, University College London). Functional images obtained from the scanner were converted to NIfTI-formatted images for further ...

... button for new words, were prompted with the word ‘OLD’ on the left and the word ‘NEW’ on the right of the screen. Participants with the opposite button assignment saw the reversed prompt. The button assignments were counterbalanced across participants. Once participants made an old/new judgment, the prompt disappeared and a second prompt appeared, instructing participants to rate the same word in terms of its emotional valence on a 5-point scale ...

... of verbal information maintained in long-term memory. In the following section, I will present their findings ...