[...] of data sets like this very quickly; for most processors, in just one cycle. Since these algorithms are very common in most DSP applications, tremendous execution savings can be obtained by exploiting these processor optimizations. There are also inherent structures in DSP algorithms that allow them to be separated and operated on in parallel. Just as in real life, if I can do more things in parallel, I can get more done in the same amount of time. As it turns out, signal processing algorithms have this characteristic as well. Therefore, we can take advantage of this by putting multiple orthogonal (nondependent) execution units in our DSPs and exploiting this parallelism when implementing these algorithms.

DSPs must also add some reality to the mix of algorithms shown above. Take the IIR filter described earlier. You may be able to tell just by looking at this algorithm that there is a feedback component, which essentially feeds previous outputs back into the calculation of the current output. Whenever you deal with feedback, there is always an inherent stability issue. IIR filters can become unstable just like other feedback systems. Careless implementation of feedback systems like the IIR filter can cause the output to oscillate instead of asymptotically decaying to zero (the preferred behavior). This problem is compounded in the digital world, where we must deal with finite word lengths, a key limitation of all digital systems. We can alleviate this by using saturation checks in software, or by using a specialized instruction to do it for us. DSPs, because of the nature of signal processing algorithms, provide specialized saturation and underflow/overflow instructions to handle these conditions efficiently.

There is more I could say about this, but you get the point. Specialization is really what DSPs are all about: these devices are specifically designed to do signal processing really well. DSPs may not be as good as other processors when dealing with algorithms that are not signal-processing centric (that's fine; I'm not any good at medicine either). Therefore, it's important to understand your application and pick the right processor.

With all of the special instructions, parallel execution units, and so on designed to optimize signal-processing algorithms, there is not much room left for other types of general-purpose optimizations. General-purpose processors contain optimization logic such as branch prediction and speculative execution, which provide performance improvements in other types of applications. But some of these optimizations don't work as well for signal processing applications. For example, branch prediction works really well when there are a lot of branches in the application, but DSP algorithms do not have a lot of branches. Much signal processing code consists of well-defined functions that execute off a single stimulus, not complicated state machines requiring a lot of branch logic.

Digital signal processing also requires optimization of the software.
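To make the saturation discussion above a bit more concrete, here is a minimal fixed-point sketch in C of a one-pole IIR section with an explicit software saturation check. The Q15 format, the coefficient parameters, and the function names are illustrative assumptions, not taken from the text; on a real DSP the clamp would normally be performed by the processor's own saturating MAC instructions or by compiler intrinsics.

```c
#include <stdint.h>

/* Q15 fixed-point: values in [-1, 1) scaled by 32768 (format chosen for illustration). */
#define Q15_MAX  32767
#define Q15_MIN (-32768)

/* Saturate a 32-bit intermediate result back into the 16-bit Q15 range.
   On a real DSP this check is typically a single saturating instruction. */
static int16_t sat_q15(int32_t x)
{
    if (x > Q15_MAX) return (int16_t)Q15_MAX;
    if (x < Q15_MIN) return (int16_t)Q15_MIN;
    return (int16_t)x;
}

/* One-pole IIR section: y[n] = b0*x[n] + a1*y[n-1], all operands in Q15.
   The feedback term y[n-1] is what makes overflow handling critical. */
static int16_t iir_one_pole(int16_t x, int16_t *y_prev, int16_t b0, int16_t a1)
{
    /* Wide accumulator, in the spirit of a DSP's extended MAC register. */
    int64_t acc = (int64_t)b0 * x + (int64_t)a1 * (*y_prev);

    /* Scale back from Q30 to Q15 (arithmetic shift assumed) and saturate. */
    int16_t y = sat_q15((int32_t)(acc >> 15));
    *y_prev = y;
    return y;
}
```

Without the sat_q15() clamp, an overflowing accumulator would wrap around, and because the wrapped value is fed back through y_prev, the error would recirculate and could drive the output into exactly the kind of oscillation described above.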
Even with the fancy hardware optimizations in a DSP, there is still some heavy-duty tool support required to make it all happen, specifically the compiler. The compiler is the tool that takes a language like C and maps the resulting object code onto this specialized microprocessor. Optimizing compilers perform the very complex and difficult task of producing code that fully "entitles" the DSP hardware platform. There is no black magic in DSPs. As a matter of fact, over the last several years the tools used to produce code for these processors have advanced to the point where you can write much of the code for a DSP in a high-level language like C or C++ and let the compiler map and optimize the code for you. Certainly, there will always be special things you can do, and certain hints you need to give the compiler to produce optimal code, but it's really no different from other processors.

The environment in which a DSP operates is important as well, not just the types of algorithms running on the DSP. Many (but not all) DSP applications are required to interact with the real world. This is a world that has a lot going on: voices, light, temperature, motion, and more. DSPs, like other embedded processors, have to react to this real world in certain ways. Systems like this are referred to as reactive systems. When a system is reactive, it needs to respond to and control the real world, not too surprisingly, in real time. Data and signals coming in from the real world must be processed in a timely way. The definition of "timely" varies from application to application, but it requires us to keep up with what is going on in the environment.

Because of this timeliness requirement, DSPs, like other processors, must be designed to respond to real-world events quickly, get data in and out quickly, and process the data quickly. We have already addressed the processing part of this. But believe it or not, the bottleneck in many real-time applications is not getting the data processed, but getting the data into and out of the processor quickly enough. DSPs are designed to support this real-world requirement. High-speed I/O ports, buffered serial ports, and other peripherals are designed into DSPs to accommodate this. DSPs are, in fact, often referred to as data pumps because of the speed at which they can process streams of data. This is another characteristic that makes DSPs unique.

DSPs are also found in many embedded applications. I'll discuss the details of embedded systems later in this chapter. However, one of the constraints of an embedded application is scarce resources. Embedded systems, by their very nature, have scarce resources; the main resources I am referring to here are processor cycles, memory, power, and I/O. It has always been this way, and it always will be. Regardless of how fast embedded processors run, how much memory can fit on chip, and so on, there will always be applications that consume all available resources and then look for more! In addition, embedded applications are very application-specific, unlike a desktop application, which is much more general-purpose.

At this point, we should understand that a DSP is like any other programmable processor, except that it is specialized to perform signal processing really efficiently.
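As an example of the kind of compiler hints mentioned above, the sketch below writes the core MAC loop of an FIR filter in plain, compiler-friendly C99. The `restrict` qualifiers tell the compiler that the coefficient and sample buffers do not alias, which is the kind of promise an optimizing DSP compiler needs before it can keep both operands streaming into the multiply-accumulate units. The function and buffer names are illustrative; vendor-specific pragmas and intrinsics (loop-count hints, saturating MAC intrinsics) would sit on top of this but vary by toolchain, so they are deliberately omitted.

```c
#include <stdint.h>
#include <stddef.h>

/* Dot product, the core of an FIR filter, written so an optimizing DSP
 * compiler has what it needs to software-pipeline the loop:
 * - `restrict` promises the two buffers do not overlap,
 * - `const` makes clear the loop only reads them,
 * - the loop body is a single multiply-accumulate per iteration.
 */
int32_t fir_dot_product(const int16_t *restrict coeffs,
                        const int16_t *restrict samples,
                        size_t taps)
{
    int32_t acc = 0;
    for (size_t i = 0; i < taps; i++) {
        acc += (int32_t)coeffs[i] * samples[i];  /* one MAC per tap */
    }
    return acc;
}
```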
So now the only question should be: why program anything at all? Can't I do all this signal processing stuff in hardware? Well, actually, you can. There is a fairly broad spectrum of DSP implementation techniques, with corresponding trade-offs in flexibility, as well as cost, power, and a few other parameters. Figure 8.1 summarizes two of the main trade-offs in the programmable versus fixed-function decision: flexibility and power.

Figure 8.1: DSP Implementation Options (application flexibility versus power consumption across ASIC, FPGA, DSP, and μP solutions)

An application-specific integrated circuit (ASIC) is a hardware-only implementation option. These devices are designed to perform a fixed function or set of functions. Being a hardware-only solution, an ASIC does not suffer from some of the programmable, von Neumann-like limitations, such as loading and storing of instructions and data. These devices run exceedingly fast in comparison to a programmable solution, but they are not as flexible. Building an ASIC is like building any other microprocessor, to some extent. It's a rather complicated design process, so you have to make sure the algorithms you are designing into the ASIC work and won't need to be changed for a while! You cannot simply recompile your application to fix a bug or change to a new wireless standard. (Actually, you could, but it will cost a lot of money and take a lot of time.) If you have a stable, well-defined function that needs to run really fast, an ASIC may be the way to go.

Field-programmable gate arrays (FPGAs) are one of the in-between choices. You can program them and reprogram them in the field, to a certain extent. These devices are not as flexible as truly programmable solutions, but they are more flexible than an ASIC. Since FPGAs are hardware, they offer performance advantages similar to those of other hardware-based solutions. An FPGA can be "tuned" to the precise algorithm, which is great for performance. FPGAs are not truly application-specific, unlike an ASIC. Think of an FPGA as a large sea of gates where you turn different gates on and off to implement your function. In the end, you get your application implemented, but there are a lot of spare gates lying around, kind of going along for the ride. These take up extra space as well as cost, so you need to do the trade-offs: are the cost, physical area, development cost, and performance all in line with what you are looking for?

DSP and μP (microprocessor): we have already discussed the difference here, so there is no need to rehash it.

Personally, I like to take the flexible route: programmability. I make a lot of mistakes when I develop signal processing systems; it's very complicated technology! Therefore, I like to know that I have the flexibility to make changes when I need to, in order to fix a bug, perform an additional optimization to increase performance or reduce power, or move to the next standard. The entire signal-processing field is growing and changing so quickly (witness the standards that are evolving and changing all the time) that I prefer the rapid and inexpensive upgrades and changes that only a programmable solution can afford.

The general answer, as always, lies somewhere in between. In fact, many signal processing solutions are partitioned across a number of different processing elements.
Certain parts of the algorithm stream, those that have a pretty good probability of changing in the near future, are mapped to a programmable DSP. Signal processing functions that will remain fairly stable for the foreseeable future are mapped into hardware gates (either an ASIC, an FPGA, or other hardware acceleration). Those parts of the signal processing system that control the input, output, user interface, and overall management of the system heartbeat may be mapped to a more general-purpose processor. Complicated signal processing systems need the right combination of processing elements to achieve true system performance/cost/power trade-offs.

Signal processing is here to stay. It's everywhere. Any time you have a signal that you want to know more about, communicate in some way, or make better or worse, you need to process it. The digital part is just the process of making it all work on a computer of some sort. If it's an embedded application, you must do this with the minimum amount of resources possible. Everything costs money (cycles, memory, power), so everything must be conserved. This is the nature of embedded computing: be application-specific, tailor to the job at hand, reduce cost as much as possible, and make things as efficient as possible. This was the way things were done in 1982 when I started in this industry, and the same techniques and processes apply today. The scale has certainly changed; computing problems that required supercomputers in those days run on embedded devices today!

This chapter will touch on these areas and more as they relate to digital signal processing. There is a lot to discuss, and I'll take a practical rather than theoretical approach to describing the challenges and processes required to do DSP well.

8.1 Overview of Embedded Systems and Real-Time Systems

Nearly all real-world DSP applications are part of an embedded real-time system. While this chapter will focus primarily on the DSP-specific portion of such a system, it would be naive to pretend that the DSP portions can be implemented without concern for the real-time nature of DSP or the embedded nature of the entire system. The next several sections will highlight some of the special design considerations that apply to embedded real-time systems. I will look first at real-time issues, then at some specific embedded issues, and finally at trends and issues that commonly apply to both real-time and embedded systems.

8.2 Real-Time Systems

A real-time system is a system that is required to react to stimuli from the environment (including the passage of physical time) within time intervals dictated by the environment. The Oxford Dictionary defines a real-time system as "any system in which the time at which output is produced is significant." This is usually because the input corresponds to some movement in the physical world, and the output has to relate to that same movement. The lag from input time to output time must be sufficiently small for acceptable timeliness. Another way of thinking of a real-time system is as any information processing activity or system that has to respond to externally generated input stimuli within a finite and specified period. Generally, real-time systems are systems that maintain a continuous, timely interaction with their environment (Figure 8.2).
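As a rough sketch of this continuous, timely interaction, the loop below shows the shape of a reactive DSP system in C: wait for the next input sample, compute the response, and push it back out before the following sample arrives. The three functions are hypothetical placeholders, not a real device API; an actual system would supply its own ADC/DAC drivers and processing routine.

```c
#include <stdint.h>

/* Hypothetical hardware/algorithm hooks; a real system supplies these. */
int16_t read_adc_sample(void);          /* blocks until the next input sample arrives */
void    write_dac_sample(int16_t out);  /* must complete before the next sample */
int16_t process_sample(int16_t in);     /* the signal processing algorithm */

/* The canonical reactive loop: the environment, via the sample clock,
   dictates the deadline. All of process_sample() plus the output write
   must fit inside one sample period, every period, or the system falls
   behind the physical world it is tracking. */
void reactive_loop(void)
{
    for (;;) {
        int16_t in  = read_adc_sample();
        int16_t out = process_sample(in);
        write_dac_sample(out);
    }
}
```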
8.2.1 Types of Real-Time Systems: Soft and Hard

The correctness of a computation depends not only on its results but also on the time at which its outputs are generated. A real-time system must satisfy its response-time constraints or suffer significant consequences. If the consequences consist of a degradation of performance, but not failure, the system is referred to as a soft real-time system. If the consequences are system failure, the system is referred to as a hard real-time system (for instance, an antilock braking system in an automobile).

Figure 8.2: A Real-Time System Reacts to Inputs from the Environment and Produces Outputs that Affect the Environment (stimuli flow in from the environment, responses flow back out, and the system maintains internal state)

8.3 Hard Real-Time and Soft Real-Time Systems

8.3.1 Introduction

A system function (hardware, software, or a combination of both) is considered hard real-time if, and only if, it has a hard deadline for the completion of an action or task. This deadline must always be met; otherwise, the task has failed. The system may have one or more hard real-time tasks as well as other non-real-time tasks. This is acceptable, as long as the system can schedule these tasks in such a way that the hard real-time tasks always meet their deadlines. Hard real-time systems are commonly also embedded systems.

8.3.2 Differences between Real-Time and Time-Shared Systems

Real-time systems differ from time-shared systems in three fundamental areas (Table 8.2), all of which come down to predictably fast response to urgent events:

• High degree of schedulability: the timing requirements of the system must be satisfied even at high degrees of resource usage.
• Worst-case latency: the system must still operate correctly under its worst-case response time to events.
• Stability under transient overload: when the system is overloaded by events and it is impossible to meet all deadlines, the deadlines of selected critical tasks must still be guaranteed.

Table 8.2: Real-Time Systems Are Fundamentally Different from Time-Shared Systems

Characteristic  | Time-Shared Systems        | Real-Time Systems
System capacity | High throughput            | Schedulability and the ability of system tasks to meet all deadlines
Responsiveness  | Fast average response time | Ensured worst-case latency, which is the worst-case response time to events
Overload        | Fairness to all            | Stability: when the system is overloaded, important tasks must meet deadlines while others may be starved

8.3.3 DSP Systems Are Hard Real-Time

Usually, DSP systems qualify as hard real-time systems. As an example, assume that an analog signal is to be processed digitally. The first question to consider is how often to sample or measure the analog signal in order to represent it accurately in the digital domain. The sample rate is the number of samples of an analog event (like sound) that are taken per second to represent the event in the digital domain. Based on a signal processing rule called the Nyquist rule, the signal must be sampled at a rate at least equal to twice the highest frequency that we wish to preserve.
For example, if the signal contains important components at 4 kilohertz (kHz), then the sampling frequency would need to be at least 8 kHz. The sampling period would then be:

T = 1/8000 = 125 microseconds = 0.000125 seconds

8.3.3.1 Based on Signal Sample, Time to Perform Actions Before Next Sample Arrives

This tells us that, for this signal being sampled at this rate, we would have 0.000125 seconds to perform all the processing necessary before the next sample arrives. Samples arrive on a continuous basis, and the system cannot fall behind in processing these samples and still produce correct results; it is hard real-time.

8.3.3.2 Hard Real-Time Systems

The collective timeliness of the hard real-time tasks is binary: either they will all always meet their deadlines (in a correctly functioning system), or they will not (the system is infeasible). In all hard real-time systems, collective timeliness is deterministic. This determinism does not imply that the actual individual task completion times, or the task execution ordering, are necessarily known in advance.

A computing system being hard real-time says nothing about the magnitudes of the deadlines. They may be microseconds or weeks. There is some confusion regarding the usage of the term "hard real-time." Some relate hard real-time to response-time magnitudes below some arbitrary threshold, such as 1 msec. This is not the case. Many such systems actually happen to be soft real-time; they would be more accurately termed "real fast" or perhaps "real predictable," but they are certainly not hard real-time.

The feasibility and costs (for example, in terms of system resources) of hard real-time computing depend on how well the relevant future behavioral characteristics of the tasks and execution environment are known a priori. These task characteristics include:

• timeliness parameters, such as arrival periods or upper bounds
• deadlines
• resource utilization profiles
• worst-case execution times
• precedence and exclusion constraints
• ready and suspension times
• relative importance, and so on

There are also pertinent characteristics relating to the execution environment:

• system loading
• service latencies
• resource interactions
• interrupt priorities and timing
• queuing disciplines
• caching
• arbitration mechanisms, and so on

Deterministic collective task timeliness in hard (and soft) real-time computing requires that the future characteristics of the relevant tasks and execution environment be deterministic, that is, known absolutely in advance. Knowledge of these characteristics must then be used to preallocate resources so that all deadlines will always be met.

Usually, the tasks' and execution environment's future characteristics must be adjusted to enable a schedule and resource allocation that meets all deadlines. Different algorithms or schedules that meet all deadlines are then evaluated with respect to other factors. In many real-time computing applications, the primary such factor is maximizing processor utilization.
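The sampling example above translates directly into a per-sample cycle budget, which is where processor utilization becomes tangible. The short sketch below does the arithmetic in C; the 8 kHz rate matches the example, while the 200 MHz clock frequency is an assumed figure purely for illustration.

```c
#include <stdio.h>

int main(void)
{
    const double sample_rate_hz = 8000.0;   /* from the 4 kHz / Nyquist example above */
    const double cpu_clock_hz   = 200.0e6;  /* assumed DSP clock, for illustration only */

    const double sample_period_s   = 1.0 / sample_rate_hz;          /* 125 microseconds */
    const double cycles_per_sample = cpu_clock_hz / sample_rate_hz; /* hard per-sample budget */

    printf("Sample period:     %.6f s (%.1f us)\n",
           sample_period_s, sample_period_s * 1e6);
    printf("Cycles per sample: %.0f\n", cycles_per_sample);

    /* Every filter tap, control decision, and I/O transfer for one sample
       must fit inside this budget, every time, or a deadline is missed. */
    return 0;
}
```

At 8 kHz and an assumed 200 MHz clock this works out to 25,000 cycles per sample; processing samples in blocks stretches the deadline but does not change the average budget per sample.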
Allocation for hard real-time computing has been performed using various techniques. Some of these techniques involve conducting an offline enumerative search for a static schedule that will deterministically always meet all deadlines. Scheduling algorithms include the use of priorities that are assigned to the various system tasks. These priorities can be assigned either offline by application programmers or online by the application or operating system software. The task priority assignments may be either static (fixed), as with rate monotonic algorithms [1], or dynamic (changeable), as with the earliest deadline first algorithm [2].

[1] Rate monotonic analysis (RMA) is a collection of quantitative methods and algorithms that allows engineers to specify, understand, analyze, and predict the timing behavior of real-time software systems, thus improving their dependability and evolvability.

[2] A strategy for CPU or disk access scheduling. With EDF, the task with the earliest deadline is always executed first.

8.3.4 Real-Time Event Characteristics: Real-Time Event Categories

Real-time events fall into one of three categories: asynchronous, synchronous, or isochronous.

Asynchronous events are entirely unpredictable. An example is a cell phone call arriving at a cellular base station. As far as the base station is concerned, the action of making a phone call cannot be predicted.

Synchronous events are predictable and occur with precise regularity. For example, the audio and video in a camcorder are captured in synchronous fashion.

Isochronous events occur with regularity within a given window of time. For example, audio data in a networked multimedia application must appear within a window of time when the corresponding video stream arrives. Isochronous is a subclass of asynchronous.

In many real-time systems, task and future execution environment characteristics are hard to predict. This makes true hard real-time scheduling infeasible. In hard real-time computing, deterministic satisfaction of the collective timeliness criterion is the driving requirement. The necessary approach to meeting that requirement is static (that is, a priori [3]) scheduling of deterministic task and execution environment characteristics. The requirement for advance knowledge about each of the system tasks and their future execution environment, needed to enable offline scheduling and resource allocation, significantly restricts the applicability of hard real-time computing.

8.4 Efficient Execution and the Execution Environment

8.4.1 Efficiency Overview

Real-time systems are time-critical, and the efficiency of their implementation matters more than in other systems. Efficiency can be categorized in terms of processor cycles, memory, or power. This constraint may drive everything from the choice of processor to the choice of programming language. One of the main benefits of using a higher-level language is that it allows the programmer to abstract away implementation details and concentrate on solving the problem. This is not always true in the embedded system world. Some higher-level language constructs can be an order of magnitude slower than assembly language. However, with the right techniques, higher-level languages can be used effectively in real-time systems.
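Before moving on to resource management, here is a small illustration of the static versus dynamic priority schemes mentioned above. The sketch checks a made-up periodic task set against the classic Liu and Layland utilization bound for rate monotonic scheduling and against the utilization bound of 1.0 for earliest deadline first. It is only a schedulability sanity check, not a scheduler; the task parameters are invented for the example, and the bounds assume independent, preemptible periodic tasks with deadlines equal to their periods.

```c
#include <stdio.h>
#include <math.h>

/* A periodic task: worst-case execution time C and period T (deadline = period). */
struct task {
    double c;  /* worst-case execution time */
    double t;  /* period */
};

int main(void)
{
    /* Illustrative task set; the numbers are made up for the example. */
    struct task set[] = { {1.0, 4.0}, {2.0, 8.0}, {1.0, 5.0} };
    const int n = (int)(sizeof set / sizeof set[0]);

    double u = 0.0;
    for (int i = 0; i < n; i++)
        u += set[i].c / set[i].t;           /* total processor utilization */

    /* Liu & Layland bound: the set is guaranteed schedulable under fixed-priority
       rate monotonic scheduling if U <= n(2^(1/n) - 1). Under earliest deadline
       first (dynamic priorities), the corresponding bound is U <= 1. */
    double rm_bound = n * (pow(2.0, 1.0 / n) - 1.0);

    printf("Utilization U = %.3f\n", u);
    printf("RM bound      = %.3f -> %s under rate monotonic\n",
           rm_bound, (u <= rm_bound) ? "guaranteed schedulable" : "not guaranteed");
    printf("EDF bound     = 1.000 -> %s under EDF\n",
           (u <= 1.0) ? "schedulable" : "overloaded");
    return 0;
}
```

Note that the rate monotonic bound is sufficient but not necessary: a task set that exceeds it may still be schedulable, but proving so requires a more detailed response-time analysis.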
8.4.2 Resource Management

A system operates in real time as long as it completes its time-critical processes with acceptable timeliness. Acceptable timeliness is defined as part of the behavioral or "nonfunctional" requirements for the system. These requirements must be objectively quantifiable and measurable (stating that the system must be "fast," for example, is not quantifiable). A system is said to be real-time if it contains some model of real-time resource management; these resources must be explicitly managed for the purpose of operating in real time. As mentioned earlier, resource management may be performed statically (offline) or dynamically (online).

[3] Relating to or derived by reasoning from self-evident propositions (formed or conceived beforehand), as opposed to a posteriori knowledge, which is derived from experience (www.wikipedia.org).

[...]

Figure 8.17: DSP Processor Solutions. Source: Courtesy of Texas Instruments

8.8.9 A General Signal Processing Solution

The solution shown in Figure 8.18 allows each device to perform the tasks it is best at, achieving a more efficient system in terms of cost, power, and performance. For example, in Figure 8.18, the system designer may put the system control software (state machines and other communication software) [...] onboard diagnostic capabilities.

[...]

8.5.4.1 Embedded Systems Are Reactive Systems

A typical embedded system responds to the environment via sensors and controls the environment using actuators (Figure 8.5). This imposes a requirement on embedded systems to achieve performance consistent with that of the environment. This is why embedded systems are referred to as reactive systems. [...] A typical embedded system is shown in Figure 8.3.

Figure 8.3: Typical Embedded System Components (processor cores, memory, analog I/O, emulation and diagnostics, software/firmware, power and cooling, application-specific gates, user interface, sensors, and actuators)

• Processor core: at the heart of the embedded system is the processor core (or cores). This can range from a simple, inexpensive 8-bit microcontroller to [...]

[...]

8.7 Overview of Embedded Systems Development Life Cycle Using DSP

As mentioned earlier, an embedded system is a specialized computer system that is integrated as part of a larger system. Many embedded systems are implemented using digital signal processors. The DSP will interface with the other embedded components to perform a specific function. The specific embedded application [...] a DSP "system" depends on the embedded application. In this chapter, we will discuss the basic steps to develop an embedded application using DSP.

Figure 8.11: Example of a DSP-based "System" for Embedded Video Applications (a DSP core with L1P and L1D caches, L2 cache/memory, an enhanced DMA controller, three video ports, McASP, Ethernet MAC, PCI, and an EMIF to SDRAM)

8.8 The Embedded System Life Cycle Using [...]

[...] requirements. The embedded designer must be able to map or partition the application appropriately using available accelerators to gain maximum application performance.

• Software is a significant part of embedded system development. Over the last several years, the amount of embedded software has grown faster than Moore's law, with the amount doubling approximately every 10 months. Embedded software is usually [...]

The trend in embedded DSP development is moving more towards programmable solutions, as shown in Figure 8.19.
There will always be a trade-off depending on the application, but the trend is moving towards software and programmable solutions.

Figure 8.19: Hardware/Software Mix in an Embedded System; the Trend Is Towards More Software (a spectrum running from 100% hardware/fixed-function, through combinations of the two, to 100% software/programmable, with the technology trend pointing towards the software end)