INTRODUCTION
Rationales
The ultimate aim of this research is to achieve a cross-language comparison between the acoustic properties of Hanoi Vietnamese monophthongs and General American English monophthongs. The findings of the study are significant from both the linguistic and the pedagogical perspectives.
Ladefoged states firmly that “The best way of describing vowels is not in terms of the articulations involved, but in terms of their acoustic properties” (2003, p.104). A considerable part of this thesis is devoted to the researcher’s analysis of the monophthongs, or pure vowels (Wells, 1962, p.1), of the Hanoi dialect of Vietnamese. Aside from a few studies conducted overseas, whose important limitations are discussed in detail in the Review of Literature, there has so far been no attempt to study the vowel acoustics of the recognized standard variety of Vietnamese.
The literature on Vietnamese vowels has been mainly concerned with describing the sounds from the viewpoint of articulatory phonetics. The investigations conducted by Nguyễn (1998) and Đoàn (2000) are typical examples. These studies examined the behavior of the vocal organs involved in producing a particular sound. This method, while straightforward, has put forward ideas which remain an approximation to the truth. Ladefoged and Johnson (2011, p.197) comment:
Traditional articulatory descriptions are often not in accord with the actual articulatory facts. For well over a hundred years, phoneticians have been describing vowels in terms such as high versus low and front versus back. To some extent, they have been using these terms as labels to specify acoustic dimensions rather than as descriptions of actual tongue positions. Phoneticians are thinking in terms of acoustic fact, and using physiological fantasy to express the idea.
Acoustics offers sufficient tools for explaining the vowel qualities.
The production of a speech sound involves, firstly, the vibration of the vocal cords, which produces sound waves. It involves, secondly, the vocal tract, which can assume various shapes and acts as an acoustic filter on those waves. Vowel sounds are characterized acoustically by formants, which are frequency regions of high energy concentration corresponding to the pass bands of the throat and mouth cavities (Wells, 1962, p.1). Therefore, instead of studying a sound only from the outside, rather subjectively, by observing it with the eyes and trying to assemble a collection of its articulatory features, there should be a rigorous method of description in which every dimension of a sound is measured and displayed objectively on the screen of an electronic device.
The analysis, carried out appropriately, would result in an acoustic description of Vietnamese monophthongs, which would serve as a valuable source of reference for cross-language comparison.
The pronunciations of General American English and of Hanoi Vietnamese are acknowledged as the reference accents of English and Vietnamese respectively. As a result, from the pedagogical perspective, the findings of the research are of high practical value in teaching the pronunciation of one language to learners of the other.
Scope of the research and the research questions
The study first examined the quality of the pure vowels in Hanoi Vietnamese. The frequencies of the first two formants (F1, F2) of each monophthong were measured on spectrograms generated by the speech analysis program Praat.
The results obtained from the analysis were then compared with the results of a recent study of the monophthongs of General American English, conducted by Clark, M. J., Hillenbrand, J., et al.
The research is aimed at answering two questions:
1) What are the acoustic properties characterizing Hanoi Vietnamese monophthongs?
2) What are the common and distinctive features between the relative positions of the monophthongs in Vietnamese and General American English on the formant charts?
THE REVIEW OF LITERATURE
The articulatory description of Hanoi Vietnamese monophthongs
There have been considerable attempts to describe the vowel system of Hanoi Vietnamese, both impressionistically and acoustically. This part of the review is concerned firstly with the set of Hanoi Vietnamese monophthongs whose description has generated a great deal of debate among phoneticians. I shall then examine the second set, which has been described with fair consistency.
As mentioned above, the vowel inventory of Vietnamese includes some monophthongs that have been described consistently in the literature; they also have transparent orthographic representations: i /i/, u /u/, ô /o/, o /ɔ/, ê /e/, e /ε/, a /a/. However, for some other monophthongs, orthographically realized by ư, ơ, â, and ă, there are important conflicts in description. For example, Lindau (1978), as cited in Matt (2009), describes ư as high back unrounded, while Thompson (1965) insists that it is high central unrounded, and Pham (2003) likewise proposes that it is high central. Hwa-Froelich (2002), as cited in Matt (2009), suggests that ư covers both /ɯ/ and /ʊ/, denoting a high back unrounded and a lower-high back rounded vowel, respectively.
Lindau (1978) has identified ơ as being back unrounded, /ɤ/ or /ʌ/, while according to Thompson (1965), it should be represented by /ə/.
According to Matt, Alina, and Alison (2009), there are two reasons for the inconsistency in the description of ư and ơ. Firstly, the acoustic distinction between lip-rounding and the backness of the tongue is not clear. Traditional spectrogram analysis cannot convincingly differentiate the two because their acoustic effects are very similar, or even identical (Ladefoged, 2011). The second reason is the different goals behind phonetic and phonological descriptions of the vowels concerned. Phonetic descriptions, which aim to describe the vowels’ features as realized in speech, are concerned with the articulatory or acoustic features of the vowels. Phonological descriptions, on the other hand, are concerned with the vowels’ structure and function in relation to each other in a system.
Naturally, the different goals of the studies have resulted in inconsistent descriptions.
As mentioned earlier, there are two other Vietnamese vowels which have been identified with conflicting features. The vowels realized by â and ă are traditionally described as “short”, low central.
However, there has been a great deal of debate over whether these vowels are short counterparts of ơ and a respectively, which are long vowels of similar quality, or whether they are short vowels with distinct qualities. One of the ultimate goals of the current study is to provide a systematic description of the quality of the Hanoi Vietnamese pure vowel inventory; therefore, it shall not be
Thompson (1965) is among the most frequently cited references. In his rather comprehensive account of the Vietnamese language, a considerable amount of space is devoted to the vowel system of the Hanoi dialect.
According to Thompson (1965), the dialect’s vocalic system consists of two sub-systems: upper vocalics, which include six vowels and three semivowels, articulated relatively high in the mouth, and lower vocalics, which include five vowels and one semivowel, articulated relatively low. The table below gives further details.
What Thompson (1965) has illustrated can be made clearer from this table. The upper vocalics occupy three positions, which are relatively distinct from each other: front, back unrounded, and back rounded. A high vowel, an upper-mid vowel, and a semivowel occupy each of the positions. He emphasizes that there are no vowels that occur at the final position. Further descriptions of the upper vocalic vowels are provided as follows.
/i/ is proposed here as a high front or central unrounded vowel. It is lower high central before final ch, nh, as in ích (be useful) and lính (soldier). Before ê, p, m in the same syllable, it is an upper high front vowel. Examples are biết, miệng, kíp, tìm, which mean know, mouth, be urgent, and search for respectively. It is lower high front elsewhere in the same syllable.
/e/ is characterized as being upper mid front or central, unrounded. It is upper mid central before final ch, nh, and after [i] before [w, p, m, t, n] in the same syllable, where it is “slightly lower before [w]” (p.30). Examples given include ếch, bênh, hiểu, tiếp, which respectively mean frog, defend, understand, and receive. The vowel is upper-mid front elsewhere.
/u/ is described as a high back rounded vowel. Thompson (1965) emphasizes that “it tends to be upper high, but only before [m] and [p]” (p.31), as in chụp (seize suddenly), chum (earthenware jar), and that it is lower high elsewhere, as in núi (mountain), mũ (hat), tuổi (age).
/o/ is identified as being upper mid back rounded. It is higher mid before [j, w], as in tôi (I), rồi (be already accomplished), cô (aunt), lỗ (hole), and is mid, strongly centralized, after [u], as in buồn (be sad), quốc (country), tuổi (age), chuột (rat). Finally, it is upper mid elsewhere, that is, before [p, m, t, n].
/ε/ is proposed to be lower mid front unrounded. There is little variation when the sound is realized in different contexts.
/ɔ/ behaves much like /ε/, maintaining its quality across different contexts. The vowel is described as lower mid back rounded.
/a/ is characterized as a lower low front unrounded vowel.
Đoàn (2000) has proposed the largest vowel inventory of Vietnamese, with thirteen monophthongs: /i/, /e/, /ɛ/, /ɛ̆/, /ɯ/, /u/, /o/, /ɔ/, /ɔ̆/, /ɤ/, /ɤ̆/, /a/, and /ă/. The author did not attempt to describe these vowels in terms of how they are articulated, as articulatory phoneticians have often done. Instead, the qualities of all the vowels are described firstly in terms of their timbre. The timbre is then classified as high (bổng), mid-low (trầm vừa), or low (trầm). The table below illustrates how Vietnamese monophthongs are distinguished from each other in terms of their timbre, according to the author (p.191).
However, it is not clear from the explanation in what respect the vowels are high, mid-low, or low. If pitch is meant, there appears to be confusion between vowel quality and the pitch at which the vowels are produced. Acoustic studies of vowels have demonstrated that the pitch of a vowel, as perceived by listeners, is determined by the fundamental frequency (F0) of the sound wave, and has practically no effect on the vowel quality.
There are four pairs of Vietnamese vowels which, according to the study, are differentiated by duration. These are /ɛ̆/ and /ɛ/, /ɔ̆/ and /ɔ/, /a/ and /ă/, and /ɤ/ and /ɤ̆/. It is maintained that the members of each pair have the same quality and are in long-short opposition (p.195).
The acoustic description attempts
Matt et al. (2009) explored the Vietnamese monophthongs produced by a small group of native speakers from both northern and southern Vietnam. The researchers also attempted to compare the native productions with those made by American adult learners. The goals of the study are significant. The method of conducting the study, however, is problematic. In order to eliminate the anatomical differences among participants, the normalization method inspired by Watt and Fabricius (1973) was employed. This method has been severely criticized by modern phoneticians.
Johnson (2005) pointed out that “Talkers may differ from each other at the level of their articulatory habits of speech. This, in itself, would suggest that perception may not be able to depend on vocal tract normalization to “remove” talker differences by removing vocal tract differences” (p.19). Johnson et al. (1993) go further:
The presence of individual differences in speech production also complicates matters for vocal tract normalization. Though normalization research has usually focused on male/female differences in vocal tract size and shape, vocal tracts - even within genders - come in lots of different sizes and shapes. Talkers apparently adopt different (possibly arbitrarily different) articulatory strategies to produce the “same” sounds. Thus, accurate recovery of the talker’s articulatory gestures would not completely succeed in “normalizing” speech (p.20).
The second problem of the method is its scale. The study was conducted on too small a scale to provide conclusive support for the researchers’ claims in the discussion of the findings.
Native speaker participants included 3 Northern dialect speakers (1 female, 2 males) and 1 Southern dialect speaker (female). All were originally from Vietnam and had been living in an English-speaking country for 6 to 26 years. They ranged from 42 to 64, and all had experience teaching Vietnamese as a foreign language to adults.
Firstly, the number of participants selected is too small to support statistically reliable conclusions. This can be attributed to the authors’ reliance on the normalization method adopted, as mentioned before. Secondly, while the qualities of Vietnamese vowels are recognized as varying substantially from dialect to dialect, there is no indication that the subjects were screened for dialect, and very little information is provided about the dialects of the speakers. The present research represents the researcher’s attempt to address these limitations (see Chapter 3 for further details).
Srihari and Nguyen (2004) is another attempt to describe the characteristics of Vietnamese vowels by means of spectrogram analysis.
In order to decide on the set of vowels for the recording process, the authors follow the work of Thompson (1965, 1987) closely, claiming that there are eleven monophthongs in the Vietnamese vowel system (Hanoi dialect), which are /i, ɯ, u, e, γ, o, ε, ɔ, ɐ, a, ɑ/.
The vocalics systems (Thompson, 1987, as cited in Srihari and Nguyen, 2004)
Comparing this with the system that Mai, Vu, and Hoang (2008) proposed, considerable differences can be spotted. In the latter account, it is suggested that there are 13 pure vowels in the system, and, noticeably, there is no /ɑ/, characterized as a low, back, unrounded vowel, as Srihari and Nguyen (2004) maintain. In addition, these authors support the claim that /γ, o, ε/ have three counterparts differing only in duration, which are /ɤ̆/, /ɔ̆/, and /ɛ̆/. This is a part of the inconsistent description of the Vietnamese vowel inventory, as mentioned earlier. Even Thompson (1987) has departed from his previous proposal in Thompson (1965) with regard to the existence of /ɑ/. As a result, deciding on a set of eleven monophthongs has posed a threat to the validity of the findings.
The aims of the study, as stated by its author, are to provide “a preliminary quantitative description of formant values for F1 and F2 for each vowel and plot the vowel chart of Vietnamese” (p.2).
However, what makes the study even more problematic, again, is the scale of the research. The subject of the study, as described, is “a 24-year-old native male speaker of Hanoi dialect, the standard dialect of Vietnam. The speaker can speak English fluently but not well-trained in phonetics” (p.2). This problem also occurred in the previous study. There are anatomical differences among speakers of any language; therefore, selecting one subject for examination would not provide findings which are representative of the population. Given that the author set out to analyze the qualitative aspects of the vowels in question, a conclusion about the acoustics of the vowels of a language drawn from the recording of a single speaker is seriously questionable. Ladefoged (2003) pointed out that “The fact that data has been measured correctly does not show that there are no problems with the speakers. When looking at the formants of a group of people you should check whether any one speaker is different in any way from the others” (p.129).
The vowels of five speakers of Banawa, Ladefoged (2003, p.129)
The ellipse in the figure encloses four stressed [e] vowels of one speaker. As can be seen, the first formant values of his [e] are distinct from those of the other speakers. This speaker, therefore, has produced this sound in a way that is significantly different from the others. This deviation, according to Ladefoged (2003), cannot be ascribed to an anatomical factor such as a very small vocal tract, because the other vowels produced by him are similar to those of the rest of the speakers. The author’s suggestion is that “if you find a speaker who pronounces a word in a significantly different way, you should leave this part of the data out when providing diagrams of the vowel qualities of the language, noting, however, that there are speakers who deviate from the general pattern” (p.129).
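The check that Ladefoged recommends can be sketched in a few lines of Python. The sketch below is only an illustration: the F1 values, the speaker labels, and the threshold of three standard deviations are all hypothetical, and Ladefoged (2003) does not prescribe any particular statistic. Each speaker's value is compared with the mean and spread of the remaining speakers, so that a deviant speaker does not mask himself by inflating the group's spread.

```python
from statistics import mean, stdev

def flag_deviant_speakers(values, threshold=3.0):
    """Return speakers whose value lies far from that of the other speakers.

    values: dict mapping a speaker label to one formant value in Hz.
    threshold: hypothetical cut-off, in standard deviations of the others.
    At least three speakers are needed, so that a spread can be computed.
    """
    flagged = []
    for speaker, value in values.items():
        others = [v for s, v in values.items() if s != speaker]
        spread = stdev(others)
        if spread > 0 and abs(value - mean(others)) > threshold * spread:
            flagged.append(speaker)
    return flagged

# Hypothetical mean F1 (Hz) of [e] for five speakers; S5 is the odd one out.
f1_of_e = {"S1": 480, "S2": 500, "S3": 490, "S4": 510, "S5": 650}
deviant = flag_deviant_speakers(f1_of_e)
```

A speaker flagged in this way would still need to be inspected by hand, since the decision to leave data out depends on whether that speaker's other vowels pattern with the group.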
The second problem with the currently reviewed study involves the set of words containing the vowels chosen for recording.
The word list containing the vowels in question, Srihari and Nguyen (2004, p.3)
The /t-/ context is not the best choice. According to Ladefoged (2011, p.199), a stop closure will cause the vowel’s first formant (F1) to rise from a low position. As a result, the accuracy of the measured formant values might be affected. It is suggested in a number of studies (James et al., 1995; Broadbent & Ladefoged, 1957; Wells, 1962; Ladefoged, 2011) that a word list in the /h-d/ context would provide the best spectrograms, as /h/ has almost no effect on the formants of the adjacent vowel in the same syllable.
Characterizing vowel qualities with the acoustic properties
The current study is inspired by Ladefoged’s (2003) firm statement that “the best way of describing vowels is not in terms of the articulations involved, but in terms of their acoustic properties” (p.104). In this section we shall take a closer look at the acoustics of vowels.
The different sounds of language are physically characterized along four dimensions: the fundamental frequency, the amplitude, the duration, and the formant distribution of the sound wave. The four corresponding perceptual dimensions are pitch, loudness, length, and quality.
The current study has not investigated the amplitude or the fundamental frequency of vowels, being primarily concerned with the spectral distribution of the pure vowels. Vowel duration has been measured insofar as it distinguishes the pairs of vowels that have been described inconsistently in articulatory phonetics.
Articulatory phonetics describes how a vowel is articulated in terms of the behavior of the articulators, but it has no term for the difference between the quality, or timbre, of one vowel and that of another. Among the dimensions of the complex sound waves produced by the human vocal cords, we need to consider carefully the spectral distribution of the component frequencies. A speaker can pronounce a vowel on any pitch within the range of his voice without changing its identity. Ladefoged (2003) provides a prime example:
I can say the vowels in heed, hid, head, had on a low pitch, when the vocal folds are vibrating about 80 times a second, and then I can say them again with vocal folds vibrating 160 times a second. The pitch of my voice will have changed, but the vowels will still have the same quality. I can also say any vowel loudly or softly. The quality, the factor that distinguishes one vowel from another, remains the same when I shout or talk quietly (p.31).
The differences among vowels are often compared with the differences among musical instruments. The same note can be played on a guitar, a violin, or a piano, because in each case the sound has the same rate of repetition, i.e., the same fundamental frequency. What is interesting here is that the quality of the sound produced by one instrument will be different from that of any other.
This is due to differences in the amplitudes as well as the frequencies of the component waves. The quality of one vowel differs from that of another in much the same way. Irrespective of the pitch on which a vowel is produced, its quality will stay unchanged.
A popular way in which phoneticians describe the acoustics of human speech sounds is by using tube models. The current research is primarily concerned with the monophthongs (of Vietnamese), so the models can be conveniently summarized as follows.
The air in a bottle will be set vibrating when one blows across the top of it. Naturally, the note produced will depend on the size and the shape of the bottle. The larger the volume of air inside, the lower the note produced. This is because a smaller body of air vibrates more quickly than a larger one, and so has a higher resonant frequency.
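The relation between the volume of air and the note produced can be illustrated with the textbook formula for an idealized Helmholtz resonator, f = (c / 2π) · √(A / (V · L)), where A and L are the cross-sectional area and length of the neck and V is the volume of the enclosed air. The sketch below is an assumption-laden illustration rather than a measurement: the bottle dimensions are hypothetical, the speed of sound is taken as 343 m/s, and the end correction of the neck is ignored.

```python
import math

def helmholtz_frequency(volume_m3, neck_area_m2, neck_length_m,
                        speed_of_sound=343.0):
    """Resonant frequency (Hz) of an idealized Helmholtz resonator.

    No end correction is applied to the neck, so the result is approximate.
    """
    return (speed_of_sound / (2 * math.pi)) * math.sqrt(
        neck_area_m2 / (volume_m3 * neck_length_m)
    )

# Hypothetical bottle neck: 2 cm in diameter and 5 cm long.
neck_area = math.pi * 0.01 ** 2
small_body_of_air = helmholtz_frequency(0.0005, neck_area, 0.05)  # 0.5 litre
large_body_of_air = helmholtz_frequency(0.0010, neck_area, 0.05)  # 1.0 litre
```

Under these assumptions, doubling the volume of air lowers the resonant frequency by a factor of √2, which is the direction of change described above.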
When a vowel is being produced, it is the vocal tract that acts like a bottle, with its size and shape being constantly altered. Whereas the air in a bottle is set in vibration by blowing across the top, the air in the vocal tract is set in vibration by the pulses of air from the vocal folds. What makes the tract different from the bottle is its very complex shape, which changes constantly with the movements of the articulators. Conveniently, phoneticians often consider the body of air in the throat to be the first tube, and that in the mouth to be the second. The resonances of the vocal tract are called formants. They correspond to the basic frequencies of the sound and depend directly on the size and the shape of the tract, in both the front and the back parts of the cavity. They are largely responsible for the characteristic quality of the vowel. My vowel [i] in the Vietnamese word hi is characterized by formants around 380, 2200, and 3200 Hz.
Figure 1: The spectrogram of the author’s pronunciation of [i] in hi
When my vowel [i] is produced, a damped wave is generated, always with approximately these basic frequencies. It is this set of components that allows us to distinguish [i] from the other vowels.
Each vowel is associated with a different shape of the vocal tract, resulting in different component basic frequencies (the formants) being produced when the body of air inside vibrates.
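The dependence of the formants on the dimensions of the tract can be illustrated with the simplest tube model, a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1) · c / (4L). This is a deliberate simplification of the two-tube picture summarized above: it approximates a neutral, schwa-like vowel rather than [i], and the tract lengths and the speed of sound (35,000 cm/s) used below are assumed values, not measurements from this study.

```python
def uniform_tube_formants(tract_length_cm, n_formants=3,
                          speed_of_sound_cm_s=35000.0):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Implements F_n = (2n - 1) * c / (4 * L) for n = 1 .. n_formants.
    """
    return [
        (2 * n - 1) * speed_of_sound_cm_s / (4.0 * tract_length_cm)
        for n in range(1, n_formants + 1)
    ]

# Assumed tract lengths: 17.5 cm (longer tract) and 14 cm (shorter tract).
longer_tract = uniform_tube_formants(17.5)   # about 500, 1500, 2500 Hz
shorter_tract = uniform_tube_formants(14.0)  # about 625, 1875, 3125 Hz
```

Even in this crude model every resonance rises as the tube is shortened, which is why absolute formant values cannot be compared across speakers without taking vocal tract size into account.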
The traditional articulatory descriptions of vowels show a close relationship with the formant frequencies of the vowels. As acoustic studies of vowels have demonstrated, the frequency of the first formant (F1) is responsible for the vowel quality of being high or low, and that of the second formant (F2) affects the degree of frontness or backness, as described in articulatory phonetics. This can be illustrated more clearly with a formant chart of English vowels taken from Johnson (2011), as follows.
A formant chart showing the frequency of the first formant on the ordinate plotted against the second formant on the abscissa, by Johnson (2011, p.197)
As can be seen from the chart, the first formant of the vowel [a] is noticeably higher than that of [i]. It is also apparent that, in these vowels, as vowel height decreases, F1 increases. As for the second formant, its frequency is markedly higher for the front vowels than for the back vowels. Briefly, in relation to the descriptions in articulatory phonetics, the degree of frontness increases with the frequency of the second formant (F2), and the height of the vowels varies inversely with the first formant frequency (F1).
In the previous reviews of the two studies on the vowels of Vietnamese, I questioned the authors’ conclusions because of the scale on which the research was conducted, with one to four native speakers as subjects. This criticism can now be further justified. As Ladefoged (2001) has pointed out, we can describe the quality of a particular vowel produced by a particular speaker by calculating the values of the first and second formants. However, due to the anatomical differences among speakers, the precise formant frequencies might differ considerably from speaker to speaker. For instance, a speaker with a bigger head will have a larger resonating cavity, which results in comparatively lower formants, both F1 and F2. In contrast, a vowel produced by a speaker with a smaller vocal tract will have relatively higher formant frequencies.
Ladefoged (2001) concludes that “In order to represent the vowels of a language, we need to show the average values of the formants” and “the most useful representation of the vowels of a language is a plot showing the average values of formant one and formant two for each vowel as spoken by a group of speakers” (p.39).
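The averaging that Ladefoged describes amounts to taking, for each vowel, the mean F1 and the mean F2 over all speakers. A minimal sketch follows; the data layout, the speaker labels, and the formant values are hypothetical and serve only to show the computation.

```python
from statistics import mean

def average_formants(measurements):
    """Average F1 and F2 per vowel across a group of speakers.

    measurements: {speaker: {vowel: (F1, F2)}}, with values in Hz.
    Returns {vowel: (mean F1, mean F2)}.
    """
    by_vowel = {}
    for speaker_data in measurements.values():
        for vowel, (f1, f2) in speaker_data.items():
            by_vowel.setdefault(vowel, []).append((f1, f2))
    return {
        vowel: (mean(f1 for f1, _ in pairs), mean(f2 for _, f2 in pairs))
        for vowel, pairs in by_vowel.items()
    }

# Hypothetical measurements for two speakers and two vowels.
example = {
    "S1": {"i": (380, 2200), "a": (900, 1500)},
    "S2": {"i": (420, 2400), "a": (1000, 1700)},
}
group_means = average_formants(example)
```

The resulting pairs are the points that would be plotted on an F1-F2 chart for the group.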
General American English
One of the ultimate goals of the current study is to compare the distribution on the formant chart of Hanoi Vietnamese monophthongs with that of General American English monophthongs; this section is devoted to an examination of the concept of General American English (GA) and its monophthongs in the literature.
Generally, phoneticians agree on the definition. The pronunciation of American English is traditionally divided into Eastern, which includes New York City and New England; Southern, which stretches from Virginia to Texas and southwards; and General, which includes all the remaining areas. General American (GA) is comparable with RP in Britain. A speaker of GA is a person whose accent does not reveal which region of the country he comes from. To put it another way, GA is described as having no characteristics of a specific region in the United States. Just as RP is sometimes referred to as the Queen’s English or BBC English, GA is often referred to as Network English. “It is the standard model for the pronunciation of English as an L2 in parts of Asia, and parts of Latin America” (Gimson, 2008, p.84).
According to Wells (1982), there are two major systemic differences between British RP and GA. Firstly, in RP there are three diphthongs, /iə/, /eə/, /ʊə/, which cannot be found in GA. Instead, in GA there are sequences of a short vowel plus /r/, as in beard /bɪrd/ and fare /fer/. Secondly, there is no /ɒ/ in GA. In RP hot is pronounced /hɒt/, but in GA it becomes /hɑ:t/. This is true of virtually all the other cases of RP /ɒ/, such as bottle, cot, pot, spot. However, Gimson (2008) also points out that a limited subset of these words has /ɔ:/ in GA, for example across, gone, often, cough, orange, porridge.
In terms of lexical occurrence, the differences lie in words which have /ɑ:/ in RP but /æ/ in GA. Gimson (2008) also stresses that this commonly happens before a voiceless fricative, or before a nasal followed by another consonant. For example, RP past [pɑ:st] is GA [pæst].
Below are further examples of the comparison between RP and GA vowels, provided by Gomez (2012, p.12).
Change of vowel /ɒ/ to /ɑ:/ and /ɔ:/,
Regarding the diphthongs, which are not the primary concern of the current study, the differences between the two systems are varied.
The most noticeable is the shift from /əʊ/ in RP to /oʊ/ in GA, as in home, RP [həʊm] and GA [hoʊm]. As Gomez (2012) has pointed out, the shift concerns the first element of the diphthong, which becomes /o/. This shift, according to the author, is systematic. He offers several examples of this change in the table below (p.14).
Hillenbrand et al. (1995) conducted a study of the acoustic properties of GA vowels. The vowels /ɪ, i, e, ɜ, æ, a, ɔ, o, ʊ, u, ʌ, ɝ/ in /h-V-d/ syllables, produced by 45 men, 48 women, and 46 children, were recorded.
The majority of the participants (87%) were born and raised in Michigan’s Lower Peninsula, in the southeastern and southwestern parts of the state. The remainder were from other parts of the upper Midwest, including Illinois, Wisconsin, Minnesota, northern Ohio, and northern Indiana. In order to increase the homogeneity of the sample and ensure that all subjects spoke GA, the subjects were selected from a larger group by what the researchers describe as “an extensive screening procedure”. The key part of the procedure was a careful assessment of dialect, which focused on the subjects’ production of the /a/ - /ɔ/ distinction.
F1-F4 were measured from LPC spectra. Below are the average F1-F2 formant charts of the pure vowels as produced by American men and women.
The average formant frequencies of the pure vowels produced by American men (Hillenbrand et al., 1995, p.1304)
The average formant frequencies of pure vowels produced by American women (Hillenbrand et al., 1995, p.1304)
It is clear from the charts that, although the absolute formant frequencies of men and women differ significantly, owing to the anatomical differences between the two sexes, the relative positions of the monophthongs on the charts, which indicate how these vowels are articulated, are strikingly similar.
RESEARCH METHODOLOGY
The subjects
Ten female speakers of Hanoi Vietnamese were chosen by the following procedure. First, 20 females aged from 15 to 25, who claimed to have spent most of their lives since birth in Hanoi, were chosen to take part in a recording session. They were asked to read a short Vietnamese script (Appendix 2). Each recording was then played back to all the subjects except the person who had produced it. The listeners were asked to judge how typical of Hanoi Vietnamese each recording sounded, giving a score from one to ten, with ten being the most typical and one the least. The ten subjects who achieved the highest scores were selected. This procedure ensured high homogeneity among the subjects.
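The selection procedure can be sketched as follows. The sketch assumes that each candidate's score is the mean of the ratings given by the other candidates and that ties are broken by the original order of the candidates; neither detail is specified in the procedure above, and the ratings shown are hypothetical, with four candidates and two places instead of twenty and ten.

```python
from statistics import mean

def select_most_typical(ratings, n_selected):
    """Rank candidates by the mean score given by the other candidates.

    ratings: {rater: {speaker: score}}; a rater never scores herself.
    Returns (labels of the n_selected top candidates, mean score per candidate).
    """
    candidates = list(ratings)
    mean_score = {}
    for speaker in candidates:
        scores = [
            ratings[rater][speaker]
            for rater in candidates
            if rater != speaker and speaker in ratings[rater]
        ]
        mean_score[speaker] = mean(scores)
    # sorted() is stable, so tied candidates keep their original order.
    ranked = sorted(candidates, key=lambda s: mean_score[s], reverse=True)
    return ranked[:n_selected], mean_score

# Hypothetical ratings on the one-to-ten scale.
ratings = {
    "A": {"B": 8, "C": 5, "D": 9},
    "B": {"A": 7, "C": 4, "D": 9},
    "C": {"A": 6, "B": 8, "D": 10},
    "D": {"A": 8, "B": 9, "C": 6},
}
selected, scores = select_most_typical(ratings, n_selected=2)
```

With twenty candidates and n_selected=10, the same function would reproduce the selection described above under the stated assumptions.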
The stimuli
Thirteen Vietnamese monophthongs were investigated. As discussed in the review of literature, the number of monophthongs in the system is a matter of controversy among authors. Linguists are divided over whether the pairs of vowels in anh (brother) and xe (vehicle), ong (bee) and oong, ha and hay (interesting), hơ and hân should be described as vowels in long-short opposition with the same quality (for example /ɛ̆/ and /ɛ/, /ɔ̆/ and /ɔ/), as vowels of distinct quality, or whether /ɛ̆/, /ɔ̆/, /ă/, and /ɤ̆/ are allophones of their longer counterparts. The current research treated them as being distinct from each other, in terms of either quality or duration; therefore, the qualities of all thirteen vowels, the largest inventory proposed so far, were investigated.
Based on the results of the acoustic analysis of F1 and F2, the controversial matters are discussed in the section on findings and discussion. To record the subjects’ production of these vowels, /i, e, ɛ, ɛ̆, ɯ, u, o, ɔ, ɔ̆, ɤ, ɤ̆, a, ă/ were divided into two sets. The vowels of the first set, /i, e, ɛ, ɯ, u, o, ɔ, ɤ, a/, are represented by the corresponding letters of the Vietnamese alphabet: i, ê, e, ư, u, ô, o, ơ, a. The vowels of the second set, /ɛ̆/, /ɔ̆/, /ɤ̆/, and /ă/, have limited distribution, as described by linguists. Therefore, they were recorded in four words, anh, óc, ân, ay respectively.
The recording process
The subjects were required to say the given words and letters two times to the Shure PG27USB microphone, with the relevant specifications information provided by the producer as follows:
Power Requirements: USB-powered, 500 mA maximum
Sampling Rate: up to 48 kHz
The USB plug-and-play microphone was chosen instead of one with a traditional plug, as suggested by Johnson (personal communication, December 19, 2011). This connection method allowed the researcher to make digital recordings conveniently anywhere a computer could be taken. In addition, the integrated pre-amp with Microphone Gain Control allows the input signal strength to be controlled, so no separate amplifier is required. This is especially suitable for phonetic fieldwork, where it is commonly impossible to bring the speakers of a speech community to a laboratory for recording.
The microphone was connected to a personal computer with the following specifications:
Ports: 4 x USB 2.0; FireWire; VGA port; S-video port
The subjects’ productions of the sounds were recorded at a sampling rate of 11025 Hz in a quiet 20 m² room.
The data were stored on the computer. Several copies were made and kept in case of hard disk errors.
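The choice of an 11025 Hz sampling rate can be checked against the formant range of interest with the Nyquist criterion: a sampled signal can only represent frequencies below half its sampling rate. The sketch below is illustrative, and its function names are hypothetical. The 2712 Hz figure is the highest F2 value quoted in the findings for [ɛ].

```python
def nyquist_hz(sampling_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sampling_rate_hz / 2

def covers(sampling_rate_hz, highest_frequency_hz):
    """True if the sampling rate can represent the given frequency."""
    return highest_frequency_hz < nyquist_hz(sampling_rate_hz)

RECORDING_RATE_HZ = 11025      # rate used for the recordings
HIGHEST_F2_QUOTED_HZ = 2712    # upper end of the F2 range reported for [ɛ]

print(nyquist_hz(RECORDING_RATE_HZ))                    # 5512.5
print(covers(RECORDING_RATE_HZ, HIGHEST_F2_QUOTED_HZ))  # True
```

Under this criterion the recordings capture frequencies up to 5512.5 Hz, which leaves room above the F1 and F2 values discussed in this study.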
The analysis process
F1 and F2 of each vowel were measured. The mean F1 and F2 of each vowel were then plotted on a diagram with JPlotFormants v1.4, a freeware program developed by Roger Billerey (2011) at the University of California, Los Angeles, which “lets you enter and plot formant pairs (F1, F2) for an unlimited number of vowels” (Billerey, 2011). It runs on any Java-enabled platform, which in the case of the current research was Microsoft Windows. The parameters of the plot, such as the size, the colors, and the symbols, can be easily customized.
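The averaging that precedes plotting can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes each speaker’s two tokens are averaged first and that the per-speaker values are then averaged per vowel. The function names and the measurements are hypothetical placeholders, not values from Table 1.

```python
from collections import defaultdict
from statistics import mean

def speaker_averages(tokens):
    """Average the repeated tokens of each (speaker, vowel) pair.

    tokens: iterable of (speaker, vowel, f1_hz, f2_hz) tuples.
    Returns {(speaker, vowel): (mean_f1, mean_f2)}.
    """
    grouped = defaultdict(list)
    for speaker, vowel, f1, f2 in tokens:
        grouped[(speaker, vowel)].append((f1, f2))
    return {
        key: (mean(f1 for f1, _ in values), mean(f2 for _, f2 in values))
        for key, values in grouped.items()
    }

def vowel_means(tokens):
    """Mean F1 and F2 per vowel across speakers, ready to be plotted."""
    by_vowel = defaultdict(list)
    for (_, vowel), pair in speaker_averages(tokens).items():
        by_vowel[vowel].append(pair)
    return {
        vowel: (mean(f1 for f1, _ in pairs), mean(f2 for _, f2 in pairs))
        for vowel, pairs in by_vowel.items()
    }

# Hypothetical measurements: two speakers, two tokens each, two vowels.
sample_tokens = [
    ("S1", "a", 1100.0, 1800.0), ("S1", "a", 1120.0, 1840.0),
    ("S2", "a", 1150.0, 1810.0), ("S2", "a", 1130.0, 1830.0),
    ("S1", "i", 400.0, 2800.0), ("S1", "i", 420.0, 2760.0),
    ("S2", "i", 380.0, 2900.0), ("S2", "i", 400.0, 2860.0),
]
print(vowel_means(sample_tokens))
```

The resulting (F1, F2) pairs are the kind of input a formant-plotting tool expects, one pair per vowel.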
The formant values and an acoustic vowel chart of the Hanoi Vietnamese monophthongs were then compared with the values found for General American English monophthongs, by Clark et al
FINDINGS AND DISCUSSION
The acoustics of Hanoi Vietnamese monophthongs
Table 1 presents the F1 and F2 values of all the subjects for each vowel in question. The values are the averages of the two tokens. The first column on the left contains the subject labels.
Table 1: The first and second formant frequencies of all the subjects for each monophthong. The values are the averages of the two tokens.
Measuring the formants of [ɛ̆], as in anh [ɛ̆ɲ], posed potential problems. [ɲ] is a nasal consonant, so its formants would influence the formants of the vowel preceding it. The influence is explained below. The effect would be the same if the consonant were replaced by /k/, as in ách. Therefore, even if [ɛ̆] is a monophthong as described in the literature, its formant contours would be affected.
Unfortunately, these are the only two environments in which this vowel occurs. Before examining the impact of [ɲ] on [ɛ̆] in the words produced by the subjects, let us consider the extent to which [ɲ] affects the vowels in syllables to which it belongs. The figures below present spectrograms illustrating the effect of [ɲ] on [i] in inh and nhi.
Figure 2: The effect of [ɲ] on [i] in inh and nhi, as produced by the researcher.
The spectrogram of inh is on the left, and that of nhi is on the right.
It appears that there is practically no effect on the vowel in either case. The contours of both F1 and F2 remain steady throughout the duration of the vowels.
There is, however, a lighter segment on both spectrograms, at the beginning in nhi and at the end in inh. This is highlighted in the figure. Interestingly, these segments contain two dark bands in the same frequency range as F1 and F2 of the adjacent vowel [i]. This suggests that the two dark bands might be the two formants of the consonant [ɲ]. If the formants of [ɲ] have roughly the same values as those of [i], the effect of [ɲ] on [i] would not be obvious. The situation would be different if the following vowel had F1 and F2 considerably different from those of [i], as [a] does. The following figure presents the spectrogram of the researcher’s pronunciation of nha.
Figure 3: The effect of [ɲ] on [a] in nha
As the spectrogram shows, [ɲ] has markedly raised the second formant of [a] at the start of the formant contour. In contrast, the first formant of the vowel is substantially lowered at the same point. F1 then rises and F2 falls, and they reach their normal values only at the end of the formant contours. It can be concluded from the examination of the spectrograms of inh, nhi, and nha that [ɲ] has two formants with approximately the same values as those of [i]. The consonant’s formants must therefore have affected the formant frequencies of the vowel [ɛ̆] in anh [ɛ̆ɲ]. As a result, the segment that best reflects F1 and F2 of [ɛ̆] is the start of the formant contours. These are the values presented in Table 1.
The table shows that [ɛ] has F1 below 1000 Hz, ranging from 607 Hz to 755 Hz, and F2 above 2000 Hz, ranging from 2344 Hz to 2712 Hz, which is characteristic of an open-mid front vowel as described in traditional phonetics. The formant frequencies of [ɛ̆], by contrast, are strikingly similar to those of [a]. Both [a] and [ɛ̆] have their first two formants at around 1100 Hz and 1800 Hz, which is typical of an open near-front vowel. The figures below illustrate the differences between [ɛ̆] and [ɛ] and the similarities between [ɛ̆] and [a] in terms of vowel quality.
Figure 4: The difference between the vowel in e and anh, produced by a subject. The spectrogram of e is on the left, and that of anh is on the right.
Figure 5: The difference between the vowel in e and anh, produced by another subject. The spectrogram of e is on the left, and that of anh is on the right.
Figure 6: The similarities between the vowel in anh and ay, produced by a subject. The spectrogram of anh is on the left, and that of ay is on the right.
Figure 7: The similarities between the vowel in anh and ay, produced by another subject. The spectrogram of anh is on the left, and that of ay is on the right.
The findings suggest that /ɛ̆/ and /ɛ/ are two vowels with distinct qualities, and that /ɛ̆/, as investigated in the current research, and /a/ are two vowels with the same quality, distinguished from each other only by duration. It would therefore be more appropriate for the current /ɛ̆/ to be represented by the IPA symbol /ă/.
The situation with [ɤ] and [ɤ̆] is different. As the table shows, the values of the first and second formants of these vowels are very similar to each other. The mean F1 of [ɤ] is 728 Hz, while that of [ɤ̆] is 752 Hz. The second formant of [ɤ] varies from 1222 Hz to 1369 Hz, which is similar to that of [ɤ̆] (1314 Hz to 1435 Hz).
These formant values are characteristic of an open-mid central vowel as described in traditional phonetics. Both formant frequencies of [ɤ̆] are slightly higher than those of [ɤ], indicating that the former is somewhat farther to the left and lower than the latter on the formant chart (the vowel chart), yet the degree of similarity is considerable. The data, therefore, have not fully confirmed the claim of Đoàn (2000, p.195) that /ɤ/ and /ɤ̆/ are two vowels of the same quality. However, it must be acknowledged that [ɤ] and [ɤ̆] are very close in vowel quality. The findings of this study also strongly support his firm statement that “the vowel in sân [sɤ̆n] is always shorter than the vowel in sơn [sɤn]. The role of duration is fundamental.” (p.196) A closer examination of the vowel duration on the spectrogram (see below) illustrates how [ɤ] is distinguished from [ɤ̆] by the difference in duration.
The spectrogram on the left shows virtually the same F1 and F2 frequencies for [ɤ] as those of [ɤ̆], which is on the right. While the formant contours of the former vowel stay constant from start to end, F2 of [ɤ̆] rises slightly. This indicates the impact of the following consonant [n]. It is also obvious from the figure that the duration of [ɤ] is considerably longer than that of [ɤ̆].
It is very unlikely that [ɤ̆] is shorter merely as a result of the following consonant [n]. Let us examine the contrast in the minimal pair ơn and ân. The environment is the same, since both vowels are followed by [n]. It is the difference in the duration of the vowel in ơn [ɤn] and ân [ɤ̆n] that distinguishes the words from each other. The following figure shows the spectrograms of the researcher’s pronunciation of [ɤn] and [ɤ̆n].
Figure 9: Spectrograms of [ɤn], on the left, and [ɤ̆n], on the right
The formant frequencies are approximately the same at the start and the end of both words. The only apparent difference is the duration. It is clear from this figure that /ɤ/ and /ɤ̆/ are two different vowels. Their qualities are similar, and it is the duration that makes them distinct.
Let us now turn to another controversial pair of vowels. /a/ and /ă/ have been characterized as two vowels of the same quality, differing from each other fundamentally in duration. They are therefore treated as two distinct vowels, with duration making them distinct sounds. It must also be acknowledged that other authors have accounted for these vowels’ qualities differently: while /a/ has been described as an open central unrounded vowel, /ă/ has been characterized as mid, unrounded, and central, but farther to the front.
The measurements of the vowels’ first and second formant frequencies in Table 1 indicate that they are of the same quality. While [a] has a mean F1 of 1117 Hz and a mean F2 of 1826 Hz, the corresponding formants of [ă] measure 1163 Hz and 1802 Hz respectively. The differences, 46 Hz in F1 and 24 Hz in F2, are statistically insignificant, and the differences in vowel quality are trivial. The following figures illustrate the virtually identical values of the vowels’ first and second formants.
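The size of the gap between the two vowels can be made explicit with a short calculation on the mean values quoted above. The sketch below is illustrative only. The function name is hypothetical, and expressing the difference as a percentage of the [a] value is an assumption about what counts as a useful scale.

```python
# Mean (F1, F2) values in Hz as quoted in the text.
A_LONG_MEAN = (1117.0, 1826.0)   # [a]
A_SHORT_MEAN = (1163.0, 1802.0)  # [ă]

def formant_differences(reference, other):
    """Return [(difference in Hz, absolute difference as % of reference)]
    for each formant, in the order F1, F2."""
    result = []
    for ref_hz, other_hz in zip(reference, other):
        diff = other_hz - ref_hz
        result.append((diff, 100.0 * abs(diff) / ref_hz))
    return result

for name, (diff, percent) in zip(
    ("F1", "F2"), formant_differences(A_LONG_MEAN, A_SHORT_MEAN)
):
    print(f"{name}: {diff:+.0f} Hz ({percent:.1f}% of the [a] value)")
```

The calculation only describes the gap between the two means. A test of statistical significance would need the per-speaker values from Table 1, which are not reproduced in this sketch.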
The monophthongs of Hanoi Vietnamese and General American English in
Figure 16: The formant chart of Vietnamese monophthongs produced by female speakers
Figure 17: The formant chart of General American English monophthongs produced by female speakers
Before the comparison can be drawn, it must be highlighted that, because of the anatomical differences between Vietnamese females and American females, comparing the absolute values of the formant frequencies will not reflect the actual differences and similarities in vowel quality. Rather, the researcher will attempt to compare the relative positions of the vowels on the formant charts.
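One common way to put vowel spaces from different speaker groups on a comparable scale is Lobanov z-score normalization, in which each formant is expressed relative to the mean and standard deviation of that formant across a group’s vowels. This technique is not used in the present study, which compares the charts visually. The sketch is offered only to illustrate what comparing relative positions can mean numerically. The function name and the formant values are hypothetical placeholders, not measurements from this research.

```python
from statistics import mean, stdev

def lobanov(formants):
    """Z-score F1 and F2 within one speaker or one speaker group.

    formants: {vowel: (f1_hz, f2_hz)}
    Returns {vowel: (z_f1, z_f2)}.
    Assumption: the sample standard deviation is used.
    """
    f1_values = [f1 for f1, _ in formants.values()]
    f2_values = [f2 for _, f2 in formants.values()]
    f1_mean, f1_sd = mean(f1_values), stdev(f1_values)
    f2_mean, f2_sd = mean(f2_values), stdev(f2_values)
    return {
        vowel: ((f1 - f1_mean) / f1_sd, (f2 - f2_mean) / f2_sd)
        for vowel, (f1, f2) in formants.items()
    }

# Hypothetical three-vowel system, for illustration only.
example_system = {"i": (400.0, 2800.0), "a": (1100.0, 1800.0), "u": (450.0, 900.0)}
normalized = lobanov(example_system)
for vowel, (z1, z2) in normalized.items():
    print(f"{vowel}: zF1={z1:+.2f}, zF2={z2:+.2f}")
```

After normalization, the most open vowel still has the highest F1 score and the most front vowel the highest F2 score, so the relative layout of the chart is preserved while the absolute Hz scale is removed.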
The charts show that [i] of Vietnamese and [i] of General American have noticeable similarities. Each is the closest and most front vowel in its system. Despite this, the Vietnamese vowel appears to be higher than its American counterpart.
In the front region, Vietnamese has two other vowels, [e] and [ɛ], while General American has three, [ɪ], [ɛ], and [æ]. Despite the very different phonetic symbols, Vietnamese [e], as in hết, and American [ɪ], as in hit, show considerable equivalence in quality. The situation is much the same for the other pair under consideration, Vietnamese [ɛ] and American English [æ]. Both are a short distance lower on the charts. The remaining American vowel, [ɛ], deviates from the group: it is more central and also more open.
The central area of the chart also deserves careful attention. Aside from appearing to be closer, the Vietnamese [ɯ] shows some important similarity to its American counterpart [ɝ]. It must be noted, though, that the additional r quality of [ɝ] may make the two sounds less similar to the ear. [ʌ], [ɤ], and [ɤ̆], on the other hand, are strikingly close in quality.
The vowel [ɑ] of American English and [a] of Vietnamese are both very open, but the latter is considerably farther to the front.
In the back area of the chart, both Vietnamese and American English have a vowel represented as [u] and another represented as [ɔ]. They also occupy much the same positions on the chart. The vowels in each of these pairs are therefore expected to be very similar in quality.
The remaining vowels, [o] in Vietnamese and [ʊ] in American English, are quite different. The first vowel is much lower and slightly farther back.
CONCLUSION
The main findings on the acoustics of Hanoi Vietnamese monophthongs
In this study, the researcher has attempted to describe the monophthongs of Hanoi Vietnamese on the basis of acoustic measurements of the formant frequencies. Despite its shortcomings, the research has produced important results.
The measurements and comparison of F1 and F2 among the monophthongs indicate that the vowel traditionally described and represented as /ɛ̆/ in some accounts in the literature does not exist as such. The average formant frequency values demonstrate that the quality of this vowel is much the same as that of /ă/ and /a/. It has also been highlighted that the latter vowel contrasts with the former essentially in vowel duration.
This strongly supports the claim made by Đoàn (2000) that duration is the distinguishing feature of these vowels. As a result, it is suggested that /ɛ̆/ and /ă/ be equated, and that the latter symbol can arguably represent both.
The situation is subtly different for /ɤ̆/ and /ɤ/. The former vowel has slightly higher F1 and F2 than the latter, which characterizes it as more front and more open. This goes somewhat against previous studies, which maintain that these vowels are of the same quality but differ in duration. Nevertheless, it must be acknowledged that they are fundamentally similar, and that the duration contrast clearly plays an important role in distinguishing one from the other. The formant chart (vowel chart) also shows that they are typically central vowels.
Another pair of vowels, /a/ and /ă/, have also been shown to be fundamentally similar in quality. While the former is marginally more front and more close, the degree of similarity is considerable. It should also be noted again that it is the difference in duration that distinguishes them from each other.
The last vowel whose description has generated heated debate among phoneticians is /ɯ/. The measurements of F1 and F2 and the position it occupies on the formant chart suggest that it is a close central vowel.
The monophthongs of Hanoi Vietnamese and General American English in
This research also set out to compare the relative positions of the monophthongs of Vietnamese and General American English on the formant charts. The result is of keen interest and has clear implications for language education.
It is interesting that many monophthongs in each language share their position on the formant chart with a sound of the other language, despite the different phonetic symbols that represent them. The Vietnamese /e/ is surprisingly similar to the American /ɪ/, although the former appears to be more open. In the same way, the vowel /æ/ in American English, which has been shown to be substantially different from its RP counterpart, shares its important qualities with /ɛ/ of Vietnamese. /ʌ/, /u/, and /ɔ/ are three other vowels which have strikingly similar sounds in Vietnamese, namely /ɤ/, /u/, and /ɔ/ respectively. Within this group, the similarity between /ʌ/ and /ɤ/ has already been mentioned in the literature.
The limitations of the study and suggestions for further research
Despite its achievements, the current study has important limitations, some of which have been pointed out above.
The spectrograms failed to present the formant frequencies of the vowel [ɔ̆] in measurable contours. This vowel, as described in the chapter on research methods, is realized in the chosen word óc.
The failure is due to the limited distribution of the vowel, as accounted for in the literature. The consonant following [ɔ̆] shortens it significantly, resulting in very short formant contours. In addition, this consonant ([k]) has a great influence on the vowel’s formants, changing their values considerably and aggravating the situation.
As has been pointed out, it is possible that the vowel in anh is not purely [ă] but glides toward a second vowel, an [i]-like quality, before the influence of the following consonant takes effect. However, because this consonant, as analyzed above, has formant frequency values almost equal to those of [i], the attempt to establish whether [i] exists has had little success.
Ladefoged and Johnson (2011, p.212) point out that although the absolute values of the formant frequencies generally differ between men and women and between children and adults because of anatomical differences, the relative positions of the vowels on the charts, which indicate how the vowels are articulated, are similar. Nevertheless, if this thesis had also investigated the pure vowels produced by two other groups, men and children, the results would have been more insightful.
Finally, the study has not described all the dimensions of Vietnamese monophthong quality. The discussion of the findings, as well as the review of literature, suggests that there are pure vowels in Vietnamese whose qualities cannot be described solely in terms of being close or open, front or back, or in terms of the frequencies of the first two formants. Duration plays a crucial part, and this study has not investigated that dimension.
REFERENCES
Billerey, J. (2001). JPlotFormants v1.4: Formant-plotting software. Retrieved from http://www.linguistics.ucla.edu/people/grads/billerey/PlotFrog.htm
Clark, M. J., Hillenbrand, J., et al. (1995). Acoustic characteristics of American English vowels. Journal of Acoustical Society of America, 97(5).
Đoàn, T. T. (2000). Ngữ âm Tiếng Việt. Hà Nội: Nhà xuất bản Đại học Quốc gia Hà Nội.
Gimson, A. C. (2008). Pronunciation of English. Oxford: Oxford
Gomez, E. T. (2012). British and American English Pronunciation Differences. Retrieved from http://www.webpgomez.com/index.php?option=com_content&view article&id32&ItemidQ
Johnson, K. (2005). Speaker normalization in speech perception. Retrieved from http://www.phonetik.unimuenchen.de/~reichelu/kurse/perz_fort/literatur/JohnsonHSP2005.pdf
Johnson, K., Ladefoged, P., & Lindau, M. (1993). Individual differences in vowel production. Journal of Acoustical Society of America, 94, 701-714.
Ladefoged, P., & Johnson, K. (2011). A Course in Phonetics. Boston:
Ladefoged, P. (2003). Phonetic Data Analysis: An Introduction to Fieldwork and Instrumental Techniques. Oxford: Blackwell Publishing.
Ladefoged, P. (2005). Vowels and Consonants. Oxford: Blackwell Publishing.
Ladefoged, P. (1996). Elements of Acoustic Phonetics. Chicago:
Matt, W., et al. (2009). Vietnamese Vowel, the Central Focus. Retrieved from http://www.casl.umd.edu/sites/default/files/WinnTwistBlodgett_VietnameseVowelsSec22009.pdf
Mai, N. C., Vu, D. N., & Hoang, T. P. (2008). Cơ sở ngôn ngữ học và Tiếng Việt. Hà Nội: Nhà xuất bản Giáo dục Việt Nam.
Nguyen, B., & Srihari, R. (2004). A preliminary quantitative study on the characteristics of Vietnamese vowels and English vowels. Retrieved from http://www.cs.jhu.edu/~nguyen/data/phonetics_prjrpt.pdf
Pham, A. (2003). Vietnamese tone: A new analysis. Outstanding Dissertations in Linguistics. New York: Routledge.
Shure Americas. (2012). PG27USB Spec Sheet. Retrieved from http://www.shure.com/specification-sheets/us_pro_pg27usb_specsheet.pdf
Thompson, L. C. (1965). A Vietnamese Reference Grammar. Hawaii:
Thompson, L. C. (1987). A Vietnamese Reference Grammar. Hawaii:
Wells, J. C. (1982). Accents of English. Cambridge: Cambridge University Press.
Wells, J. C. (1962). A study of the formants of the pure vowels of British English (MA Thesis). Retrieved from http://www.phon.ucl.ac.uk/home/wells/formants/index.htm
Weenink, D., & Boersma, P. (2012). Praat: doing phonetics by computer. Retrieved from http://www.fon.hum.uva.nl/praat/
Consent form for participation in research
Project title: A comparative acoustic analysis of the monophthongs of Hanoi Vietnamese and General American English
I have read the information sheet, and the researcher has explained the information relating to this project to me.
The researcher has clearly explained the purpose of the project and what is required of participants, and has answered my questions to my satisfaction.
I agree to the plan set out in the Information Sheet regarding my participation in the research.
I understand that my participation in this project is entirely voluntary and that I have the right to withdraw from the research at any time I wish.
I have received a copy of this Consent Form together with the accompanying information sheet.
Date:
University of Languages and International Studies, Vietnam National University, Hanoi, Faculty of Postgraduate Studies