Cochlear Implants: Fundamentals and Application - part 6

7. Speech (Sound) Processing

Acoustic Models of Cochlear Implant Speech-Processing Strategies

quency is a product of the outputs from the filters. The optimal number of filters for channel vocoder and fixed filter strategies is discussed below.

Formant Vocoders

As discussed above, formants are concentrations of frequency energy of importance for intelligibility. The formant vocoder was an advance in communication systems as the data rate was much less than when transmitting an unprocessed speech signal. Speech was first analyzed with a voiced/unvoiced detector, and a pitch detector measured the frequency of the glottal openings. Information about the frequencies and amplitudes of the formants was obtained from a bank of band-pass filters and envelope detectors.

Formant vocoders were first developed by Lawrence (1953) [Parametric Artificial Talker (PAT)], and Fant and Martony (1962) [Orator Verbis Electris (OVE II)]. These were adequate for the transmission of vowels, but for some consonants the resonances (poles) and antiresonances (zeros) needed to be specified. This formant vocoder required less bandwidth than the channel vocoder, but was not used in communication systems because of the complexity of the circuitry. However, it has been very useful for studies on speech perception (Ainsworth 1976). The design of a formant synthesizer is illustrated in Figure 7.9.

FIGURE 7.9. Block diagram of a parallel-coupled formant synthesizer (Reprinted from Ainsworth 1976 with permission.). Labeled blocks: pulse generator (F0), noise generator, switch (SW), four amplitude modulators (A1 to A4), four formant filters (formant one to formant four, F1 to F4), and an amplifier.

The formant vocoder became the basis for the formant speech-processing strategies used with multiple-electrode stimulation discussed below, and in Chapter 8.

TABLE 7.4. Speech perception test scores (%) for the F0/F2 cochlear implant speech processor and an acoustic model (mean scores for three subjects).

Test                  Implant patient, hearing alone (HA)   Subjects with acoustic model
Male–female speaker   24                                    17
Question statement    19                                    16
Vowels                29                                    23
Final consonant       30                                    29
Initial consonant     21                                    28
AB words               2                                     0
Phonemes              12                                     7
CID sentences          7                                     4

CID, Central Institute for the Deaf; HA, hearing or electrical stimulation alone. Based on Blamey et al (1984a,b) and Clark (1987).

Acoustic Representation of Electrical Stimulation

An acoustic model to evaluate multiple-electrode speech-processing strategies was developed by Blamey et al (1984a,b). The model used a pseudo-random white noise generator with the output fed through seven separate band-pass filters with center frequencies corresponding to the electrode sites (Blamey et al 1984a,b). This model was first evaluated psychophysically for pulse rate difference limens; pitch scaling for stimuli differing in pulse rate; pitch scaling and categorization of stimuli differing in filter frequency (equivalent to electrode position); and similarity judgments of stimuli differing in pulse rate as well as filter frequency (electrode position). The results for the acoustic model were comparable to those obtained with electrical stimulation on implant patients, and were discussed in Chapter 6.

Having established that the acoustic model gave similar psychophysical results to those for multiple-channel electrical stimulation, a further study was undertaken to see if similar results could be obtained for an acoustic model of the fundamental (F0) and second formant (F2) speech processor to those for electrical stimulation with the same strategy in the Nucleus multiple-electrode system. Speech perception tests were administered in hearing alone, speech reading alone, and speech reading plus hearing conditions.
The scores for the first University of Melbourne cochlear implant patient, and the three normal-hearing subjects using the acoustic model are shown in Table 7.4 (Blamey et al 1984a,b). There was good correspondence between the speech tests (male–female speaker, question–statement, vowels, final and initial consonants, AB (Arthur Boothroyd) words and phonemes (Boothroyd 1968), and CID sentences (Davis and Silverman 1970)) for a multiple-channel cochlear implant patient using the F0/F2 speech processor and subjects using the F0/F2 model (Clark 1987).

The acoustic model and cochlear implant performances were also compared on the basis of the percentage information transferred for each speech feature on a 12-consonant test (Table 7.5). The consonants were /b, p, m, v, f, d, t, n, z, s, g, k/. The speech features were voicing, nasality, affrication, duration, and place. The results for the first implant patient and the average results for the three subjects are shown in Table 7.5.

TABLE 7.5. Information transmission (%) for the F0/F2 cochlear implant speech processor and an acoustic model.

Feature       Electrical stimulation alone   Acoustic model
Voicing       36                             39
Nasality      25                             68
Affrication   38                             31
Duration      79                             85
Place         25                             15
Overall       46                             45

Based on Clark (1987).

It can be seen that the trends are very similar for both electrical stimulation and the acoustic model except for better transmission of nasality with the acoustic model. This could have been due to the fact that nasal sounds are distinguished from other speech sounds by a strong formant at about 200 to 300 Hz, and the information from F2 would not have provided this spectral detail. It could also have been due to better representation of the antiresonances (zeros) with the acoustic model.
As the acoustic model on normal-hearing subjects proved to be good at reproducing the speech and speech feature results for the F0/F2 speech-processing strategy, a study using the acoustic model was undertaken to determine whether an F0/F1/F2 speech-processing strategy (F1, first formant) would give better results than the F0/F2 processor, and to what extent additional information due to F1 would be transmitted. An additional strategy was also evaluated in which F2 was coded as rate of stimulation. A confusion study on the 11 Australian English vowels was carried out on six subjects to determine the information transmission for the vowels grouped according to duration, F1 and F2. The results in Table 7.6 show there was a small increase in the total information transmitted for the F0/F1/F2 strategy. The F0/F1/F2 strategy was the only one that transmitted a large proportion of the F1 information. A much greater proportion of the F2 information was transmitted when coded as filter frequency rather than as pulse rate.

With consonants the acoustic model of the F0/F1/F2 speech processor led to better transmission of the voicing, nasality, affrication, duration and amplitude envelope features, but not of place of articulation, than the F0/F2 strategy (Table 7.7). The F2 (rate) strategy had poorer results than F0/F2 for place of articulation and high F2. The addition of F1 would provide the low-frequency information for identifying voicing through the VOT and a rising F1, as well as the essential cues for nasality. Further cues for duration and amplitude envelope would be provided by the greater energy in F1. Amplitude envelope information (Blamey et al 1985) improved significantly as well, and as a result so did information on manner of articulation.

The speech-processing strategies were also compared for connected discourse using the speech-tracking test. The F0/F1/F2 strategy was superior to the others, and the F0/F2 strategy superior to the F2 (rate) strategy.
TABLE 7.6. Acoustic model: comparison of speech-processing strategies: information transmission for vowels.

              F2 (rate) (%)   F0/F2 (%)   F0/F1/F2 (%)
Total         34              56          72
Duration      83              85          95
F1 grouping   12              27          81
F2 grouping   25              68          55

Based on Blamey et al (1985).

Having shown with an acoustic model of multiple-electrode stimulation that the F0/F1/F2 speech-processing strategy was better than the F0/F2 strategy, it was implemented as the Nucleus F0/F1/F2 WSP-III system, and a clinical study carried out to determine if the same would apply to cochlear implant patients. This would not only be a good test of the predictive value of the model, but more importantly further support the rationale for the multiple-electrode speech-processing strategy, which still had not been completely proven over single-channel devices at this time.

In comparing the results for the F0/F2 and F0/F1/F2 cochlear implant speech processors, it was considered important to analyze the information received by the patient and whether the information transmitted was consistent with the type of speech-processing strategy used. The percentage information transmitted for vowels and consonants was determined for a group of 13 patients with the F0/F2 processor and seven patients with the F0/F1/F2 processor. For vowels the scores were 51% (F0/F2) and 64% (F0/F1/F2). For consonants the scores were 36% (F0/F2) and 50% (F0/F1/F2). The information transmitted for duration, F1, and F2 was greater for the F0/F1/F2 strategy. From Table 7.8 it can be seen that for consonants, information transmission was also better for the F0/F1/F2 speech processor compared to the F0/F2 processor for all speech features (Clark 1987). The information transmission was calculated from a confusion study on the consonants /p, t, k, b, d, g, m, n, s, z, v, f/.
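The percentage information transmitted for a speech feature can be illustrated with a short calculation. The following is a minimal sketch, assuming the relative transmitted information measure of Miller and Nicely (1955): the confusion counts are pooled by feature category, the mutual information between stimulus and response categories is computed, and the result is expressed as a percentage of the stimulus entropy. The function name, the data layout, and the confusion counts are hypothetical and are not taken from the studies cited.

```python
import math
from collections import defaultdict


def relative_information_transmitted(confusions, feature):
    """Percentage of feature information transmitted.

    confusions: dict mapping (stimulus, response) -> count
    feature:    dict mapping each phoneme -> its feature category
    """
    # Pool the phoneme confusions into feature-category confusions.
    joint = defaultdict(float)
    for (stimulus, response), count in confusions.items():
        joint[(feature[stimulus], feature[response])] += count
    total = sum(joint.values())

    # Marginal probabilities of stimulus and response categories.
    p_in = defaultdict(float)
    p_out = defaultdict(float)
    for (x, y), count in joint.items():
        p_in[x] += count / total
        p_out[y] += count / total

    # Mutual information (bits) between stimulus and response categories.
    transmitted = sum(
        (count / total) * math.log2((count / total) / (p_in[x] * p_out[y]))
        for (x, y), count in joint.items()
        if count > 0
    )
    # Entropy (bits) of the stimulus categories.
    input_entropy = -sum(p * math.log2(p) for p in p_in.values() if p > 0)
    if input_entropy == 0:
        raise ValueError("feature has only one category among the stimuli")
    return 100.0 * transmitted / input_entropy


# Hypothetical counts for two consonants that differ only in voicing.
example_confusions = {("p", "p"): 8, ("p", "b"): 2, ("b", "b"): 7, ("b", "p"): 3}
example_voicing = {"p": "unvoiced", "b": "voiced"}
print(relative_information_transmitted(example_confusions, example_voicing))
```

A score of 100% corresponds to no confusions between feature categories, and 0% to responses that are statistically independent of the stimulus category.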
The information transmission was for the features of Miller and Nicely (1955), and an additional two features, the amplitude envelope and high F2. The amplitude envelope feature classified the consonants into four groups, as shown in Figure 7.10. These groups were easily recognized visually from the traces of the amplitude envelopes produced by the real-time speech processor. The high F2 feature refers to the output of the speech processor’s F2 frequency extraction circuit during the burst for the stops /t/ and /k/ or during the frication noise of /s/ and /z/. /f/ and /g/ did not give rise to the feature because the amplitude of the signal was too low during the period the F2 frequency was high. Thus the F2 feature was a binary grouping with /t, k, s, z/ in one group and the remainder of the consonants in the other (Blamey et al 1985).

As the results for information transmission for vowels and consonants for multiple-electrode stimulation were similar to those obtained for the acoustic model, it confirmed the predictive value of the acoustic model. The features for acoustic and electrical stimulation could be compared to the speech perception data presented below and in Chapter 12.

TABLE 7.7. Acoustic model: comparison of F2 (rate), F0/F2, and F0/F1/F2 speech-processing strategies: information transmission for consonants.

                     F2 (rate) (%)   F0/F2 (%)   F0/F1/F2 (%)
Total                37              43          49
Voicing              35              34          50
Nasality             86              84          98
Affrication          31              32          40
Duration             62              71          81
Place                19              28          28
Amplitude envelope   47              46          61
High F2              48              68          64

Speech Cues

The importance of different speech cues in perception can be examined by presenting natural or synthetic speech without these cues. This helps determine how important it is to reproduce them with electrical stimulation. The acoustic representation of electrical stimulation can also help in optimizing fixed-filter speech-processing strategies as well as formant processors.
The importance of speech wave envelope cues can be studied by using them to modulate noise, thus separating them from spectral and fine temporal information. The fine temporal information is, for example, phase and frequency modulation. The envelope cues convey information mostly about phoneme duration, voicing, and manner. Rosen (1989) transformed speech wave envelopes into “signal-correlated noise,” as described by Schroeder (1968). This was equivalent to multiplying the envelopes by white noise, resulting in a signal with an instantaneous amplitude identical to that of the original signal, but with a frequency spectrum that was white. It was found that manner distinctions were present for sampling rates down to 20 Hz. Thus the cues from the amplitude envelope, as shown in Figure 7.10, could be defined at these low frequencies. Voicing was best with unfiltered speech or when filtered with a cut at 2000 Hz. Place recognition was poor.

Similar information transmission to “signal-correlated noise” was obtained for the single-electrode cochlear implant (3M, Los Angeles) (Van Tasell et al 1987). With this system, as discussed below, the speech signal was filtered over the frequency range of 200 to 4000 Hz, and the output modulated a 16,000-Hz carrier wave. At 16,000 Hz there would be no fine time structure in neural firing, and the information would be from the amplitude variations.

Cues for consonant recognition are not only from frequency spectra (provided by multiple-electrode stimulation) but also from the fine time variations in the amplitude envelopes. These variations were studied with speech processors based on an acoustic model of electrical stimulation (Blamey et al 1985, 1987). The groups of consonants on the basis of the envelope variations were unvoiced stops or plosives, unvoiced fricatives, voiced fricatives and stops together, and nasals (Fig. 7.10). Within these groups, the distinctions of place of articulation must also be made with other coding mechanisms. The amplitude envelope cues are available for cochlear implant patients (Blamey et al 1987; Dorman et al 1990). They may be especially important for those who have poor electrode place identification, and so do not receive the spectral shape of speech. Research also suggested these cues might be used by those with hearing aids (Van Tasell et al 1987).

TABLE 7.8. Consonant speech features for the F0/F2 and F0/F1/F2 speech-processing strategies.

                      Voicing   Nasality   Affrication   Place   Amplitude envelope   High F2
F0/F2 (A0) (n = 13)   33        38         36            20      36                   36
F0/F1/F2 (n = 7)      56        49         45            35      54                   48
% increase            70        29         25            75      50                   33

A0 is the amplitude of the whole speech wave envelope. Based on Clark (1987).

Studies by Erber (1972) and Van Tasell et al (1987, 1992) have shown that an essential cue for consonant place perception is the distribution of speech energy across frequency. Acoustically this is represented by both place coding and the fine temporal coding of frequency in the frequency bands. With the present methods of electrical stimulation, as was discussed in Chapter 6, the temporal resolution is very limited. Consequently, with the cochlear implant the coding of place of stimulation becomes the primary cue. However, as was discussed in Chapter 6, the correlation between electrode place discrimination and the place speech feature recognition is not as good as expected.

Channel Numbers

The number of stimulus channels required to transmit speech information is important for understanding how to optimize multiple-electrode stimulation.
Shannon et al (1995) and Turner et al (1995) used acoustic models to study, in particular, the speech information transmitted by fixed filter speech-processing schemes, to assess the optimal number of filters to be used as well as the number of electrodes to be stimulated. The research first studied the effects of modulating high-pass and low-pass noise, divided at 1500 Hz, with the speech wave envelope. This showed almost 100% recognition of voicing and manner cues, but the two channels provided only limited speech understanding. Information transmission analysis showed that the addition of a third and fourth band improved place of articulation. Shannon et al (1995) found that with a four-channel processor normal-hearing listeners could obtain near-normal speech recognition in quiet listening conditions. This suggested to the authors that only four channels may be required for good speech recognition with a cochlear implant.

FIGURE 7.10. Schematic diagrams of the amplitude envelopes for the grouping of consonants from inspection of the outputs of speech processors using an acoustic model of electrical stimulation (Reprinted with permission from Blamey et al 1985. A comparison of three speech coding strategies using an acoustic model of cochlear implant. Journal of the Acoustical Society of America 77: 209–217.). Each envelope is shown for a [vowel] [consonant] [vowel] sequence. Groups: nasals /m, n/; voiced plosives and fricatives /b, d, g, v, z/; unvoiced fricatives /f, s/; unvoiced plosives /p, t, k/.

Furthermore, in a study in normal-hearing listeners by Dorman et al (1997), in which the amplitudes of the center frequencies of increasing numbers of filters were used to represent speech, it was found that four filters would provide greater than 90% speech perception accuracy in quiet.
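The two-band condition described above, in which low-pass and high-pass noise divided at 1500 Hz is modulated by the speech wave envelope, can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: it uses first-order filters, full-wave rectification, a 50-Hz envelope smoothing filter, and uniformly distributed noise, none of which are taken from the cited studies, and all function and parameter names are hypothetical.

```python
import math
import random


def one_pole_lowpass(x, cutoff_hz, fs):
    """First-order low-pass filter (a crude stand-in for steeper filters)."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for sample in x:
        state += a * (sample - state)
        y.append(state)
    return y


def split_bands(x, split_hz, fs):
    """Split a signal into a low band and the complementary high band."""
    low = one_pole_lowpass(x, split_hz, fs)
    high = [s - l for s, l in zip(x, low)]
    return low, high


def envelope(x, cutoff_hz, fs):
    """Envelope by full-wave rectification followed by low-pass smoothing."""
    return one_pole_lowpass([abs(s) for s in x], cutoff_hz, fs)


def two_band_noise_vocoder(speech, fs, split_hz=1500.0, env_cutoff_hz=50.0, seed=0):
    """Replace each speech band by band-limited noise carrying its envelope."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in speech]
    speech_bands = split_bands(speech, split_hz, fs)
    noise_bands = split_bands(noise, split_hz, fs)
    out = [0.0] * len(speech)
    for speech_band, noise_band in zip(speech_bands, noise_bands):
        env = envelope(speech_band, env_cutoff_hz, fs)
        # The envelope of a band modulates noise from the same band.
        for i, (e, n) in enumerate(zip(env, noise_band)):
            out[i] += e * n
    return out


# Usage with a synthetic 500-Hz tone standing in for a speech signal.
fs = 16000
tone = [math.sin(2.0 * math.pi * 500.0 * n / fs) for n in range(1600)]
processed = two_band_noise_vocoder(tone, fs)
```

The output keeps the slow amplitude variations of each band but discards the spectral detail and fine temporal structure within it, which is the separation of cues these studies relied on. Adding further bands would follow the same pattern with a bank of band-pass filters.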
The data indicate that speech understanding in quiet is in part due to a fluctuating spatially distributed pattern of neural responses to amplitude variations in the speech signal. The study did not address the importance of the fine temporal or frequency information in each channel for both naturalness and intelligibility especially in noise.

The interaction of the limited spectral channels and associated temporal envelope cues was studied for four filtered bands of speech by Shannon et al (1998). The envelope from each speech frequency band modulated a band-limited noise. It was found that significant variation in the cutoff frequencies for the bands, or an overlap in the bands that would simulate current interaction with a cochlear implant, produced only limited deterioration in speech recognition. However, it was essential for the temporal envelope cues to be those derived from the same frequency band as the noise being modulated. In a study by Fu and Shannon (1999) the temporal envelopes from 4, 8, and 16 band-pass filters were used to modulate noise bands shifted in frequency relative to the tonotopic representation of spectral envelope information. It was found that the frequency of the bandwidth and envelope cues did not interact, and were therefore independent in their effect on intelligibility for a shift equivalent to 3 mm along the basilar membrane, that is, a frequency shift of 40% to 60%.

The temporal information from the amplitude-modulated speech wave in the presence of reduced spectral information was studied by varying the low-pass cutoffs (Shannon et al 1998, 1999). No change was observed in vowel, consonant, or sentence recognition for low-pass filter cutoffs above 50 Hz. It was only when the envelope fluctuations between 20 and 50 Hz were removed that a marked reduction in phoneme discrimination occurred.
This indicated that in the previous studies of Blamey et al (1987) and Van Tasell et al (1987) on the importance of amplitude envelope patterns for consonant recognition, only a frequency resolution below 50 Hz was required. The data also indicated the upper frequency limit required to refresh the neural patterns for the recognition of vowel spectral information. For cochlear implants the data help determine the rate of stimulation required to represent the amplitude variations in speech and the update rate of information by the hardware.

Speech in Noise

A study was undertaken by Dorman et al (1998) to investigate the number of filter bands for speech perception in noise using a model of the Med El Combi-40 implementing the continuous interleaved sampler (CIS) speech-processing strategy (Hochmair and Hochmair-Desoyer 1983) on normal-hearing listeners. The current outputs of the filters with center frequencies distributed on a logarithmic scale from approximately 160 to 5200 Hz were used. The results showed that at +2 dB signal-to-noise ratio (SNR), the maximum speech recognition was achieved with 12 stimulus channels, and at −2 dB SNR the performance maximum occurred with 20 channels of stimulation. For the same strategy the maximum performance in quiet was with five channels. The results suggest the importance of having adequate stimulus channels for electrical stimulation particularly in noise. This is supported by the evidence obtained earlier for cochlear implants with the F0/F1/F2 compared to the F0/F2 strategy (Dowell et al 1987a).

Channel Selection

With electrical stimulation it is also important to determine the frequency-to-electrode mapping. In what frequency region of the cochlea should the electrodes be concentrated, and how should they be spaced?
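As one concrete example of spacing, the logarithmic distribution of center frequencies used in the Dorman et al (1998) model described above can be sketched as follows. This is a minimal sketch that assumes 160 Hz and 5200 Hz are the center frequencies of the lowest and highest channels; the source states only that the center frequencies were distributed on a logarithmic scale over approximately that range, and the function name is hypothetical.

```python
import math


def log_spaced_centers(n_channels, f_low=160.0, f_high=5200.0):
    """Center frequencies (Hz) equally spaced on a logarithmic scale.

    Assumes f_low and f_high are the centers of the first and last channels.
    """
    if n_channels < 1:
        raise ValueError("n_channels must be at least 1")
    if n_channels == 1:
        # A single channel is placed at the geometric mean of the range.
        return [math.sqrt(f_low * f_high)]
    # Constant frequency ratio between neighboring channels.
    ratio = (f_high / f_low) ** (1.0 / (n_channels - 1))
    return [f_low * ratio ** i for i in range(n_channels)]


# Channel counts mentioned for quiet, +2 dB SNR and -2 dB SNR.
for n in (5, 12, 20):
    print(n, [round(f) for f in log_spaced_centers(n)])
```

Each additional channel narrows the frequency ratio between neighboring centers, which is one way to picture why more channels give a finer spectral description in noise.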
The contributions of frequencies to speech understanding were initially investigated by Fletcher and Steinberg (1929), who found that 1500 Hz was the frequency around which low- and high-frequency contributions to speech recognition were equal. A key to the analysis of the contribution of different frequencies to speech understanding is the Speech Intelligibility Index (SII) theory that was developed by Fletcher and Steinberg (1929) and French and Steinberg (1947). It has important application to the assessment of hearing loss and the optimization of cochlear implant speech-processing strategies. It is a measure of the amount of information in the speech signal available to the listener. It is defined by the following equation:

SII = \sum_{i=1}^{n} I_i \times W_i

where n is the number of frequency bands, and I_i and W_i are the values associated with the frequency band (i) of the importance function (I) representing the relative contribution of different frequency bands to speech perception, and the audibility function (W) representing the effective proportion of the dynamic range audible within each band. SII has been used by a number of researchers to determine the speech perception of listeners with a sensorineural hearing loss (Skinner et al 1982; Dirks et al 1986; Pavlovic et al 1986).

Electrical Stimulation: Principles

Processing speech for electrical stimulation of the cochlear nerve should ideally present the information used by people with normal hearing, whose neural pathways are interconnected to process that information. An adjustment of neural connectivity occurs in young children after exposure to speech to facilitate the processing. In presenting speech to the central auditory pathways by electrical stimulation of the cochlear nerve, the normal transduction mechanisms in the intact inner ear are bypassed.
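The SII sum defined earlier in this section is simple to evaluate once the band values are known. The sketch below is illustrative only: the five importance and audibility values are invented placeholders, not values from any standard or from the studies cited, and the function name is hypothetical.

```python
def speech_intelligibility_index(importance, audibility):
    """SII as the sum over bands of importance I_i times audibility W_i."""
    if len(importance) != len(audibility):
        raise ValueError("importance and audibility need one value per band")
    return sum(i * w for i, w in zip(importance, audibility))


# Placeholder band-importance values (chosen to sum to 1) and the
# proportion of the dynamic range audible in each band (0 to 1).
band_importance = [0.10, 0.20, 0.30, 0.25, 0.15]
band_audibility = [1.0, 1.0, 0.5, 0.2, 0.0]

sii = speech_intelligibility_index(band_importance, band_audibility)
print(round(sii, 3))
```

With these placeholder values, half of the weighted speech information is available to the listener, mainly because the bands that carry the most importance are only partly audible.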
Physiological and psychophysical studies (see relevant chapters) have shown the limitations of reproducing the coding of speech frequencies and intensities through electrical stimulation. This created an electroneural bottleneck between the world of sound and the central auditory nervous system, as was discussed in more detail in Chapter 5. Solutions to this problem were to analyze the most important speech information and optimize its transmission through the bottleneck. Nevertheless, this required transmitting the information by attempting to reproduce the coding of sound. Cochlear implant speech processing had to use a multiple-electrode implant to transmit sufficient information through the bottleneck (Fig. 7.11). Speech perception has been achieved with studies using electrical stimulation as discussed below, and helped through the acoustic model studies of electrical stimulation discussed above.

FIGURE 7.11. A diagram showing how the cochlear implant acts as an electroneural bottleneck between sound and the coding mechanisms in the central auditory pathways. Labeled elements: sound; processed acoustic signals; electroneural bottleneck; auditory pathways; coding and perception.

The perception of speech incorporates both bottom-up and top-down processing of information. Bottom-up is the transmission of perceived sound and its features up the brain central pathways. Top-down is the anticipation of words and syntax/semantic influences applied by knowledge of the context and the language.

The bottom-up processing codes the complex sounds or elements of speech in the central auditory pathways. There is a complex pattern of neural activity underlying speech perception consisting of (1) time-varying changes in the number of neurons firing in spatially distributed groups at different intensities, and (2) fine temporal activity within and across groups. The fine temporal component in the pattern is supported by the study of Remez et al (1981). In this study time-varying patterns of sine waves were produced to represent the center frequency of the one to three formants in speech every 15 ms, as well as their amplitudes. In the signal there were no formant frequency transitions, and no fundamental frequency changes. With three frequencies most words were recognized, but the signal was not speech-like. In contrast, top-down processing is achieved through processes in the primary auditory cortex, association areas, and other cognitive centers.

Channel Numbers

The number of stimulus channels required to transmit speech information has been evaluated with acoustic models as referred to above, but ultimately requires validation with electrical stimulation on cochlear implant patients. The Nucleus formant processors extracted peaks of frequency energy, and there was a need to vary their position along the array. Furthermore, as distinct from fixed-filter electrical stimulation the Nucleus F0/F2, F0/F1/F2 (Clark, Tong et al 1978; Tong et al 1979, 1980; Clark and Tong 1981) and Multipeak (Dowell et al 1990) strategies presented the voicing (F0) frequency at each electrode. The F0/F2 strategy extracted the second formant frequency (F2) and coded this as place of stimulation, the fundamental (F0) as rate of stimulation, and the amplitude of F2 as the current level (A2) (Clark, Tong et al 1978). The F0/F1/F2 coded the first formant (F1) as place of stimulation as well. The Multipeak is a misnomer, as it extracted not only the F1 and F2 peaks, but also the energy in fixed filters in the bands (2000–2800 Hz, 2800–4000 Hz, and >4000 Hz), together with voicing as rate of stimulation. Holmes et al (1987) found that open-set word recognition and continuous discourse tracking results for the Nucleus F0/F1/F2 speech processor increased using up to 15 active electrodes.
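The coding principle of the F0/F2 strategy described above (F2 as place of stimulation, F0 as rate of stimulation, and the amplitude of F2 as current level) can be pictured with a small mapping function. This is a hedged sketch, not the Nucleus implementation: the band edges, the electrode indexing (0 taken as the lowest-frequency site), and the pass-through of the amplitude value are all assumptions made for illustration, and the names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical F2 band edges (Hz); N edges define N + 1 electrode sites.
EXAMPLE_BAND_EDGES_HZ = [1000.0, 1500.0, 2000.0]


def f0f2_stimulus(f0_hz, f2_hz, a2, band_edges_hz=EXAMPLE_BAND_EDGES_HZ):
    """Map one voiced frame to (electrode_index, pulse_rate_hz, level).

    electrode_index: site chosen from F2 (place coding); 0 is assumed to be
                     the lowest-frequency site in this sketch.
    pulse_rate_hz:   stimulation rate set equal to F0 (rate coding).
    level:           amplitude of F2, passed through unchanged here; a real
                     processor would map it into the patient's current range.
    """
    electrode_index = bisect_right(band_edges_hz, f2_hz)
    return electrode_index, f0_hz, a2


# A frame with F0 = 120 Hz and F2 = 1200 Hz falls in the second band.
print(f0f2_stimulus(120.0, 1200.0, 0.6))
```

In this picture the F0/F1/F2 strategy would add a second electrode selected from F1 in the same way.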
The correlation between electrode number and open-set CID word-in-sentence scores was examined statistically for a combined [...]... /F2-MSP system were 20% and 16%, and for the SMSP-DSP 43% and 39%. The open-set CNC word scores (scored as words) were 9% and 1% for the F0/F1/F2-MSP system, and 21% and 16% for SMSP-DSP. The open-set CID sentence scores (scored as key words) were 53% and 56% for the F0/F1/F2-MSP system and 80% and 88% for SMSP-DSP. The Multipeak-MSP was evaluated on one of these patients, and the results for electrical... spondee, and open-set speech recognition. There was no significant difference between the F0/F1/F2 WSP-III and Ineraid systems. The data suggest that the two systems provided different types and degrees of speech information.

Fundamental, First and Second Formant Frequencies and High-Frequency Fixed-Filter Outputs

The mean open-set CID word-in-sentence score for electrical stimulation alone increased from 16% ... frequency bands (170–570 Hz, 570–1170 Hz, 1170–1768 Hz, 1768–2680 Hz, and 2680–5744 Hz) by 15 users of the Nucleus SPEAK Spectra-22 system. Random variations in loudness were introduced into the signal to make the test more difficult and more like everyday conditions. Relative to normal-hearing subjects, speech information was significantly more reduced in the four frequency regions between 170 and 2680 Hz... sentences was 76% for SPEAK Spectra-22 and 67% for Multipeak-MSP. SPEAK performed particularly well in noise. SPEAK Spectra-22 was approved by the FDA for postlinguistically deaf adults on March 30, 1994. In another set of data presented to the FDA in January 1996, a mean open-set CID sentence score of 71% was obtained for the SPEAK strategy on 51 consecutive patients 2 weeks to 6 months after the start-up time...
Multipeak-MSP 88% and SMSP 92% (5% increase), and for place Multipeak-MSP 71% and SMSP 82% (15% increase). The improved coding of place of articulation produced a significant but not large increase on the word-in-sentence recognition scores (from 67% to 76%) in the study by Skinner et al (1994). The differences in information presented to the nervous system with the Multipeak-MSP, SPEAK Spectra-22, and CIS... zero crossing detector, and coded on each electrode as rate of stimulation. In addition a voicing decision

[Figure: a spectrogram (frequency in kHz against time in ms) with panels for the Multipeak, SPEAK, and CIS strategies (electrode number against time in ms).]

... rose from 16% (range 0–58%) at 3 months postimplantation to 40% (range 0–86%) at 12 months (Dowell et al 1986a,b). The F0/F2 WSP-II was approved by the FDA in October 1985 for use in postlinguistically deaf adults as safe and effective and able to provide speech perception with the aid of speech reading and some open-set speech understanding with electrical stimulation alone.

Fundamental, First and Second... Bionics), there was a mean open-set CID sentence score of 60% for 64 patients (Kessler et al 1995) 6 months postoperative, as discussed above. The CIS strategy used six fixed filters and stimulated at a rate of 800 pulses/s. The speech information transmitted for closed sets of vowels and consonants for SPEAK Spectra-22 (McKay and McDermott 1993) was compared to Multipeak-MSP. Vowel and consonant confusion data...
F0/F1/F2 WSP-III system, and the perception of consonant duration, nasality, and place improved. Tong et al (1990) made a comparison between the Multipeak-MSP system and a filter bank strategy that selected the four highest spectral peaks and coded these on a place basis. Electrical stimulation occurred at a constant rate of 166 Hz. This strategy was also implemented using a Motorola DSP 56001 digital... long-duration cues such as vowel formants, the TESM was modified (Vandali 2001) to place more emphasis on the rapid changes accompanying short duration signals (5 to 50 ms). A study on eight Nucleus 22 patients found that the CNC open-set word test scores (Fig. 7.16) increased significantly from 53.6% for SMSP to 61.3% for TESM, the open-set sentence scores in multispeaker noise from 64.9% for SMSP to 70.6%.
