CHAPTER 9

Speech Synthesizer

Synthesized speech was, for a long time, the Holy Grail of computing. Back in the 1980s, when a 4MHz CPU made your computer the fastest machine in the neighborhood, it just wasn't practical for software to create intelligible speech. In those days, the only sensible way to generate speech was to offload the task to dedicated hardware because the CPU simply couldn't keep up. The most widely used speech chip through the 1980s and early 1990s was the famous General Instrument SPO256A-AL2 Allophone Speech Processor. It was used in toys, external speech synthesizer peripherals for desktop computers, industrial control systems, and all sorts of other unexpected places. Then, as CPU power continued to increase rapidly, speech synthesis moved into software. Nowadays, of course, it is almost always done entirely with software in the main CPU, using only a tiny fraction of the available processing power. As a result the SPO256 became unnecessary, dropped out of production, and became a footnote in the history of technology.

This leaves Arduino developers in a quandary, because in terms of processing power the ATMega chips put us back into the realm of 1980s desktop performance again. An ATMega could possibly produce intelligible speech directly, but it would use every available CPU cycle to do it and the Arduino itself would be pretty much useless at doing anything else at the same time, which is not much good if you just want to add voice feedback to an existing project. And the demise of the SPO256 means you can't just link one up to your Arduino and offload speech generation to it.

With old stock of the SPO256 drying up, Magnevation decided to do something about it, and designed their own speech chip that works on the same principles as its predecessor but has a much smaller physical package and offers a handy serial interface rather than a clunky parallel interface. The result is the SpeakJet, an 18-pin DIP device that can do everything the old SPO256 did plus more.

In this project we'll assemble a speech synthesizer shield that combines a SpeakJet chip with a simple audio amplifier to let you add speech output to a new or existing Arduino project. The required parts are pictured in Figure 9-1, and the complete schematic is in Figure 9-2. (The schematic might be a bit difficult to see, but you can also find it on the Practical Arduino web site.)

Parts Required

Main parts:
1 Arduino Duemilanove or equivalent
1 Prototyping shield
1 SpeakJet speech synthesizer chip (www.magnevation.com)
1 18-pin DIP IC socket
3 1K resistors
2 10K resistors
2 27K resistors
2 10nF monolithic ceramic capacitors (may be marked "103")
1 100nF monolithic ceramic capacitor (may be marked "104")
1 10uF electrolytic capacitor (6.3V or greater)
1 3mm green LED
1 3mm red LED
1 3mm blue LED
1 2-pin, 0.1-inch pitch pin header (for line output option cable)

On-board audio amplifier:
1 LM386 audio amplifier IC, DIP8 package
1 8-pin DIP IC socket
2 10uF electrolytic capacitors (6.3V or greater)
1 220uF electrolytic capacitor (6.3V or greater)
1 100uF electrolytic capacitor (6.3V or greater)
1 1nF (1000pF) ceramic capacitor (may be marked "102")
1 100nF monolithic ceramic capacitor (may be marked "104")
1 10K trimpot
1 2-pin PCB-mount screw terminal
1 Audio speaker (usually 8 ohms)

Line-level output cable:
1 2-pin, 0.1-inch pitch socket
1 3.5mm stereo line socket
1 Length of single-core shielded cable

If you prefer an RCA output, simply swap out the 3.5mm socket for an RCA socket, and use a male-to-male RCA connection cable to reach your connected device.

Source code available from www.practicalarduino.com/projects/speech-synthesizer.

Figure 9-1. Parts required for SpeakJet-based speech synthesizer

Figure 9-2. Schematic of SpeakJet-based speech synthesizer

Instructions

The SpeakJet chip used in this project uses a technique called "allophone-based speech synthesis" to create the necessary sounds that we interpret as intelligible speech. It's very important to understand how it works if you want to get good results from it.

You can't simply send the SpeakJet a string of letters spelled out literally, such as "hello world," because the way we write words down and the way we say them are often quite different. We don't speak phonetically, so the original spelling of a word cannot be trivially converted to intelligible sounds. Instead we subconsciously apply dozens of conventions that alter the sound represented by any particular letter based on its context within a word or sentence, and even on factors such as the emotion being conveyed or whether a sentence is a question or a statement. Accents come into play as well just to make things really complicated.

The result is that it's not possible to take a specific letter, such as "e," and define a sound for that letter that will apply in all contexts. The letter "e" may be short, as in "set," or it can be long, as in the first e in "concrete." It can even be silent, but have an effect on the pronunciation of other letters within the word, as in the last e in "concrete." Letters can also combine to form diphthongs, sometimes known as gliding vowels, such as the "oy" in "boy," where the "o" sound slides smoothly into the "y" sound.

Disregard the spelling of words for a moment and think only about the sounds we actually make when we speak those words. The smallest meaningful unit of sound in human speech is called a "phoneme." Written text consists of a series of letters, but spoken text consists of a series of phonemes. Phonemes, in turn, are represented by allophones, which are the smallest audible units of sound. A phoneme can be represented by a variety of allophones, depending on context, accent, and other factors. Variation in allophones is what gives people different accents while still being intelligible: a speaker with one accent may use a particular allophone to represent a phoneme, and a speaker with a different accent may use a different allophone to represent that same phoneme. The result is that we can hear that the sound (allophone) is different, but our brain still maps it conceptually to the same phoneme: the word sounds odd when someone has a different accent, but we can usually still understand what the speaker means.

It's critical to understand the difference between letters and allophones if you want to be able to generate intelligible speech from an allophone-based speech synthesizer such as the SpeakJet. Stop thinking about words visually as a series of letters and start thinking about them audibly as a series of sounds or allophones. There are only 26 basic letters in the English alphabet, but there are hundreds of different allophones. You don't send letters representing words to the speech synthesizer; instead you send allophones that represent the sounds you want it to make. By stringing together a sequence of allophones you can make it say pretty much anything.

Speech Output Signal

There are a few different ways you could incorporate a speech synthesizer into a larger project, so we've provided different output options in the design. You can install all the parts, including a nice on-board audio amplifier with speaker, or stop at just the line output if you're connecting to amplified speakers or another device.

The circuit diagram follows a direct signal path from left to right, with the commands from the Arduino coming in from digital pin 3 on the left and ending at the loudspeaker output on the right. There is a dashed line drawn vertically through the circuit, so you can stop at that point if you just want a line-level output and don't want an on-board amplifier and speaker.

Conversion of commands to an analog audio signal is performed within the SpeakJet chip, which listens on pin 10 for serial communications from the Arduino. Several pins on the SpeakJet need to be connected either to low (GND) or high (VCC) levels to put it into the correct mode, as will be explained later.

A line-level output can be brought out to a 3.5mm stereo jack or RCA connector, as shown in the schematic. RCA connectors are commonly used to connect audio and video inputs and outputs together in home entertainment systems, so if you want to really give your Arduino a voice to rock the neighborhood you can use the line-level output to drive an amplifier or send the speech to the audio input of your TV.

For a self-contained speaking device the audio amplifier components in the project are capable of driving a speaker directly to a fairly respectable volume. With this output you can connect a regular speaker from a sound system to the screw terminals on the shield and get both good volume and decent audio quality. If you're building a device that has to operate stand-alone and still provide voice feedback this is probably the best route to take.

The best sound will come from a larger speaker, or a system without sensitive high-frequency response, as they won't let the high-frequency digital switching noise of the SpeakJet's underlying PWM (pulse-width modulation) carrier through.

Beginning Assembly

No matter which output method you choose you'll definitely need the SpeakJet chip and a few other supporting parts, so begin by fitting those to the shield. Because the SpeakJet itself is quite expensive we fitted an 18-pin IC socket to the shield rather than solder the chip in directly. Using a socket protects the chip from possible thermal damage during soldering and also allows you to remove it to use in other projects later if you want to.

The SpeakJet GND connection is on pin 5, and VCC is on pin 14. Link them directly to the appropriate supply connections on the prototyping shield. A 100nF bypass capacitor between every IC's VCC pin and GND is always good practice, so connect the 100nF capacitor from pin 14 to GND. In our case we took advantage of the prototyping shield's built-in 100nF capacitor mounting points.

Apart from the power supply pins, the SpeakJet has several other pins that need to be tied to either GND or VCC to force it to run in a mode that can be controlled externally by the Arduino.

Pin 11 is the active-low reset pin, so use a 10K resistor to pull it up to VCC to allow the SpeakJet to run.

Pin 12 is M1, the "Baud Configure" pin, which is also active-low. We don't want the chip to enter automatic baud-rate configuration mode, so use another 10K resistor to disable it by pulling it up to VCC as well.

Pin 13 is M0, the "Demo Mode" pin. This one is active-high, so you can tie it straight to GND by putting a jumper wire on the underside of the board to connect pin 13 diagonally across to pin 5, the GND pin on the IC.

The only other SpeakJet pin that absolutely must be connected is pin 10, the RCX (serial input) pin. That's the pin the Arduino will use to send data to the SpeakJet chip.
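The strapping rules for pins 11, 12, and 13 all follow one pattern: each control input is tied to its inactive level, so the reset and both mode functions stay off. As a rough illustration only, the rule can be written down in a few lines of host-side C++ (this is not an Arduino sketch, and the type and function names are invented for this example; only the pin numbers and polarities come from the text above).

```cpp
#include <array>
#include <cassert>

enum class Level { GND, VCC };

// One SpeakJet control input. The struct itself is illustrative only.
struct ControlPin {
    int pin;           // SpeakJet DIP pin number
    const char* name;  // label used in the assembly notes
    bool activeHigh;   // true if a high level asserts the function
};

// Pin numbers and polarities as given in the assembly notes.
constexpr std::array<ControlPin, 3> kControlPins{{
    {11, "Reset", false},
    {12, "M1 (Baud Configure)", false},
    {13, "M0 (Demo Mode)", true},
}};

// The level that keeps a control input inactive: active-low inputs are
// pulled up to VCC, active-high inputs are tied to GND.
constexpr Level inactiveLevel(const ControlPin& p) {
    return p.activeHigh ? Level::GND : Level::VCC;
}

// Looks up the strap level for one of the control pins listed above.
Level strapFor(int pin) {
    for (const ControlPin& p : kControlPins) {
        if (p.pin == pin) {
            return inactiveLevel(p);
        }
    }
    assert(false && "not a SpeakJet control pin");
    return Level::GND;
}
```

Calling `strapFor(11)` or `strapFor(12)` gives `Level::VCC` (the two 10K pull-ups), and `strapFor(13)` gives `Level::GND` (the jumper across to pin 5), matching the wiring described above.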
In this project we use a software serial communications library to connect the SpeakJet to one of the general-purpose digital I/O lines rather than tie up the hardware USART on digital pins 0/1, so use a short jumper lead and a 1K resistor to connect from Arduino digital pin 3 to SpeakJet pin 10 (see Figure 9-3).

Figure 9-3. SpeakJet chip mounted on shield with basic connections in place

The prototyping shield we used has mounting points for surface-mount parts, including the two 100nF bypass capacitors mentioned previously (C1 and C2 on the shield) and active-low (GND-to-illuminate) status LEDs. We fitted surface-mount bypass capacitors and linked the green surface-mount LED to GND, providing a handy power-on indicator.

At this point the SpeakJet is ready to receive instructions from the Arduino, but in order to hear the result you need to connect something to the output.

SpeakJet PWM "Audio" Output

The SpeakJet works by varying the duty cycle of a fixed 32kHz pulse-width-modulated (PWM) "carrier" from pin 18, which is fed into an external two-pole low-pass filter that converts the PWM digital signal into an analog voltage waveform suitable for line output or audio amplifiers. Provided the switching is fast enough, the duty cycle percentage of the PWM signal converts to the same percentage of DC voltage output from the filter, so a 0 to 5V PWM signal with a 50% duty cycle will convert to about 2.5 volts DC out of the filter.

This is a great, inexpensive way to generate analog voltages and waveforms from any microcontroller PWM output, and is used in many projects to get a DAC (digital-to-analog conversion) voltage output without special hardware.

The raw PWM from SpeakJet output pin 18 is digital and sounds somewhat noisy, but intelligible speech can still be heard. There is a 120-ohm minimum load and 25mA maximum current specified for this pin; be careful not to connect anything that may load the output pin more than this and risk damage to the IC.

Quick Test

A simple way to test that everything is working up to this point is to connect a pair of powered computer speakers directly to the SpeakJet output pin 18. We placed a header on pin 18 just for this purpose, though this quick test and the header can be skipped if you'd like to move straight on to the more useful filtered audio line output. It's only a few more parts, and the same cable can be used.

For the quick test, connect a 3.5mm stereo headphone socket to the pin header as shown in Figure 9-4. The SpeakJet has a single mono output, so we connect it to both the left and right channels on the socket.

Figure 9-4. Connection from header to powered computer speakers

At this point you could mount the shield on your Arduino, plug in some powered speakers or earbuds, and proceed to the software section to test that it works.

Fit Status Indicators

You certainly don't need status indicators for the speech synthesizer to operate, but it can be handy to have visual feedback of what the chip is doing. Pins 15, 16, and 17 operate as status outputs when the SpeakJet is in normal operation, so by connecting three LEDs with matching dropper resistors we can see exactly what it's up to. These pins can also be connected to the Arduino so that the software can monitor the SpeakJet speech operation. The status pins, as shown from left to right in Figure 9-5, are described in Table 9-1.

Table 9-1. SpeakJet status outputs

Pin   Name   Function
17    D0     Ready
16    D1     Speaking
15    D2     Buffer Half Full

We use a 1K resistor and an LED connected from each output to GND as you can see in Figure 9-5. Green indicates the SpeakJet is ready, blue indicates that it is currently speaking, and red indicates that its input buffer is more than half full.

Figure 9-5. SpeakJet status outputs connected to LEDs

Because the SpeakJet has a 64-byte input buffer and the command size is one byte per allophone, it doesn't take many words to fill up the input buffer on the SpeakJet. Of course it also takes more time for the SpeakJet to sound out the allophones than it takes for the Arduino to send them to it, so using appropriate delay times or waiting for the Buffer Half Full signal to clear before sending more allophones to it is important. Once the SpeakJet buffer is full any additional bytes sent to it are simply ignored. If you try to send a large sequence of allophones to it very fast you may find the buffer fills up quite quickly. The red LED on D2 (pin 15) can be handy to give you a quick visual indicator that you're sending data to the SpeakJet faster than it can keep up with speaking it.

At the end of the project we'll discuss use of this output to provide feedback to the Arduino so the software can automatically detect when to stop sending data and wait a little while before sending any more.

You can also see in Figure 9-5 a small 100nF power supply decoupling capacitor near the top center, mounted between the +5V rail and the GND connection used by the LEDs. Once again that's optional, but there's no harm in having additional decoupling capacitors. They can help prevent unexplained glitches and noise caused by supply fluctuations, so it's always good practice to fit them when possible.

Line-Level Output

A "line-level" signal is generally a larger signal than you would get from a device such as a microphone, but a lower level than would be used to directly drive a speaker or headphones. Line level is the connection that is typically fed into a mixer, amplifier, powered computer speakers, or other pieces of audio equipment. Without going into all the technicalities of unusual units such as decibel volts (dBV), as a general rule of thumb a line-level signal in consumer-grade audio equipment has a nominal amplitude of about 0.5V.

By creating a simple two-pole low-pass filter we can generate a clean output from the SpeakJet that can be fed straight to other audio equipment. A "low-pass" filter is a circuit that allows signals below a certain nominated frequency to pass through, but attenuates (decreases the level of) signals above that frequency. Simple filters don't have a hard cutoff frequency but instead tend to roll off gradually around the cutoff point, with signals below that frequency passing through the filter more easily than those above it.

The two-pole filter consists of a pair of 27K resistors in series with the SpeakJet's output along with a pair of 10nF capacitors, each connected between ground and the output side of one of the resistors. The resulting circuit filters the SpeakJet's digital PWM output into a smooth voltage waveform, removing the carrier and induced noise.

Fit the pairs of resistors and capacitors as shown in Figure 9-6. You can do this even with the pin 18 direct connection still in place. The output from the second 27K resistor then connects to the positive side of a 10uF electrolytic capacitor (not yet fitted in Figure 9-6) with the negative side of the electrolytic going to the signal (non-ground) pin of the Line Out connector.

Figure 9-6. Shield with two-pole output filter in place

Making a Line-Level Output Cable

Solder a short length of shielded cable to your chosen line plug or socket. If you'd like to use a 3.5mm stereo socket for powered computer speakers the Quick Test cable described previously is perfect. For RCA, the braid (shield) conductor connects to the outer shell, and the inner conductor connects to the center pin. Choose a male or female RCA connector to suit your needs.

If you use a male RCA plug you have the convenience of being able to plug your speech synthesizer shield directly into a piece of audio equipment, such as an amplifier or TV. Using a female RCA socket as shown in Figure 9-7 allows you to use a common male-to-male RCA extension cable to do the same thing. Which gender you decide to use depends on what you want to connect your speech synthesizer to and how far away it is. Keep in mind that line-level cables should be kept as short as possible, so if you connect a 10-foot length of cable to your shield and put an RCA plug on the other end, the sound quality might not be very good. Shorter is always better when it comes to maintaining audio signal quality.
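Returning to the Buffer Half Full output described under Fit Status Indicators: the reason to pause while that signal is asserted can be seen with a small host-side simulation. The following is only a sketch in plain C++, not the chapter's Arduino code. The class `SpeakJetBufferModel`, the function `sendAll`, and the ratio of ten bytes sent per allophone spoken are all assumptions made up for this example; the 64-byte depth, the "more than half full" flag, and the dropping of bytes when the buffer is full come from the text. On real hardware the flag would be read from an Arduino input wired to D2 (pin 15), for instance with `digitalRead()`, rather than from a model.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Toy model of the SpeakJet input buffer: 64 bytes deep, bytes arriving
// while it is full are ignored, and the D2 flag is set while the buffer
// is more than half full.
class SpeakJetBufferModel {
public:
    static constexpr std::size_t kCapacity = 64;

    // Returns true if the byte was accepted, false if it was ignored.
    bool receive(std::uint8_t b) {
        if (queue_.size() >= kCapacity) {
            ++dropped_;
            return false;
        }
        queue_.push_back(b);
        return true;
    }

    // Models the chip finishing one allophone, freeing one buffer slot.
    void speakOne() {
        if (!queue_.empty()) {
            spoken_.push_back(queue_.front());
            queue_.pop_front();
        }
    }

    bool bufferHalfFull() const { return queue_.size() > kCapacity / 2; }
    bool idle() const { return queue_.empty(); }
    std::size_t dropped() const { return dropped_; }
    const std::vector<std::uint8_t>& spoken() const { return spoken_; }

private:
    std::deque<std::uint8_t> queue_;
    std::vector<std::uint8_t> spoken_;
    std::size_t dropped_ = 0;
};

// Sends every byte in data and returns how many bytes were ignored.
// With throttle set, the sender waits (letting the chip speak) for as
// long as Buffer Half Full is asserted. sendsPerSpoken is a made-up
// ratio: bytes the sender can push while the chip speaks one allophone.
std::size_t sendAll(SpeakJetBufferModel& sj,
                    const std::vector<std::uint8_t>& data,
                    bool throttle,
                    unsigned sendsPerSpoken = 10) {
    assert(sendsPerSpoken > 0);
    unsigned sinceSpoken = 0;
    for (std::uint8_t b : data) {
        while (throttle && sj.bufferHalfFull()) {
            sj.speakOne();  // time passes while we wait for D2 to clear
            sinceSpoken = 0;
        }
        sj.receive(b);
        if (++sinceSpoken == sendsPerSpoken) {
            sj.speakOne();
            sinceSpoken = 0;
        }
    }
    while (!sj.idle()) {
        sj.speakOne();  // let the chip finish whatever it accepted
    }
    return sj.dropped();
}
```

Sending a few hundred bytes through `sendAll` with `throttle` set to `true` loses nothing, because the buffer is never allowed past the half-way mark. With `throttle` set to `false` the same data overruns the 64-byte buffer and some bytes are silently discarded, which on the real chip would be heard as missing sounds.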