Model-Based Design for Embedded Systems

Nicolescu/Model-Based Design for Embedded Systems 67842_C020 Finals Page 676 2009-10-2 676 Model-Based Design for Embedded Systems
(0, ±1, ±2, ±3, ...), a is the period of the diffractive grating, and θ is in radians. In the special case of a square well, when light is diffracted by a grating with a displacement of λ/4 (a λ/2 optical path difference after reflection), all the optical power is diffracted from the even modes into the odd modes [45].
In the first simulation, the standard operation of the GLV is verified. We assume an incident plane wave of green light (λ_green = 520 nm) striking the grating, with the square-well period defined by the ribbon width, and no gap. We simulate the GLV in both cases, that is, when all the ribbons are on the same plane and when the alternating ribbons are moved downward a distance of λ/4. In this example, the light is reflected off the grating and propagated 1000 μm to an observation plane. A bounding box of 400 × 400 μm is used, with N equal to 2048. Intensity contours of the observation plane are presented in Figure 20.22a and b.
When the grating is moved into the down position, not all of the optical power is transferred into the expected odd far-field diffractive modes. This is seen in the center of Figure 20.22b, as small intensity clusters are scattered between the ±1st modes. This scattering is a near-field effect and demonstrates that in this system, light propagating 1000 μm is not in the far field. If a designer used a tool propagating with the Fraunhofer far-field approximations, these scattering effects would not be detected. For example, when running the same simulation on LightPipes [46], a CAD tool using the Fraunhofer approximation for optical propagation, only the far-field pattern of light diffracted into the 1st and 3rd modes is seen, as presented in Figure 20.22c. Comparing this result to Figure 20.22b makes it clear that the far-field approximation is not valid for this propagation distance.
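A rough way to see why 1000 μm is still in the near field is the Fresnel number, N_F = w²/(λz), which must be much smaller than 1 for the Fraunhofer approximation to hold. The sketch below is only an order-of-magnitude check, not part of the original simulation. It assumes the illuminated half-width w is half of the 400 μm bounding box, since the text does not state the beam size; the function and constant names are illustrative.

```python
def fresnel_number(half_width_um: float, wavelength_um: float, distance_um: float) -> float:
    """Fresnel number N_F = w^2 / (lambda * z); the far field needs N_F << 1."""
    return half_width_um ** 2 / (wavelength_um * distance_um)


# Values taken from the text: green light at 520 nm, observation plane at 1000 um.
WAVELENGTH_UM = 0.520
DISTANCE_UM = 1000.0
# ASSUMPTION: the illuminated half-width is half of the 400 um bounding box.
HALF_WIDTH_UM = 200.0

n_f = fresnel_number(HALF_WIDTH_UM, WAVELENGTH_UM, DISTANCE_UM)
# Distance at which N_F falls to 1, a rough marker for where the far field begins.
z_unity_um = HALF_WIDTH_UM ** 2 / WAVELENGTH_UM

print(f"Fresnel number at 1000 um: {n_f:.1f}")
print(f"Distance where N_F = 1: {z_unity_um / 1000:.1f} mm")
```

Under these assumptions the Fresnel number is about 77, far from the N_F << 1 regime, which is consistent with the near-field scattering seen in Figure 20.22b.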
Through this example we have shown that by using the angular spectrum technique, we achieve the full Rayleigh–Sommerfeld accuracy while obtaining the same computational speed as the Fraunhofer approximation. To show the advantage of the angular spectrum method, we compare the run time of the above simulation with the run time using the direct integration method. With N = 2048, the FFT simulation takes about 1.5 min. The direct integration technique takes approximately 5.5 days to finish. If N is reduced to 1024, the FFT simulation completes in approximately 25 s, whereas the direct integration simulation takes approximately 32 h. These simulations were run on a 1.7 GHz dual-processor PC running Linux, with 2 GB of main memory.
FIGURE 20.22 GLV operation (a) all ribbons up, (b) alternating ribbons down, (c) Fraunhofer approximation. Panel (a) is labeled with the 0th mode; panels (b) and (c) are labeled with the ±1st and ±3rd modes. Axes span −0.0002 to 0.0002.
FIGURE 20.23 Transient analysis of ribbon movement and intensity contours. Plot: ribbon movement vs. 1st mode power efficiency; x-axis: ribbon movement (nm), 0 to 200, with λ/4 marked; y-axis: power efficiency (au), 0.0 to 1.2.
In the next simulation, we perform a transient sweep of the ribbon movement, from 0 to 150 nm. The rest of the system setup is exactly the same as before. However, this time, we simulate the normalized power efficiency captured in the 1st diffraction mode for different ribbon depths. To simulate this, a circular detector (radius = 12.5 μm) is placed on the positive 1st mode. Figure 20.23 is a graph that shows the simulated normalized power efficiency in this first mode. As the ribbons are moved downward, more optical power is diffracted into the nonzero modes.
As the ribbons reach the λ/4 point, almost all the diffracted power is in the ±1st modes. Figure 20.23 also includes intensity contours of selected wave fronts during the transient simulation, along with the markings of the system origin and circular detector position. From these wave fronts, interesting diffractive effects can be noted. As expected, when there is little or no ribbon movement, all the light is in the 0th mode. However, with a little ribbon movement, it is interesting to note that the 0th mode is “steered” at a slight angle from the origin. As the ribbons move downward about λ/8, the energy in the ±1st modes is clearly defined. As the gratings move closer to the λ/4 point, the power is shifted from the 0th mode into the ±1st modes, until there is a complete switch. As the ribbons move past the λ/4 point, optical power shifts back into the 0th mode.
In the final simulation, we present a full system-level example as we expand the system to show a complete end-to-end link used in a configuration of a color projection system. The system is shown in Figure 20.24.
FIGURE 20.24 End-to-end GLV display link. Components: input light, color wheel, prism, GLV, screen (70 μm), lens (f = 500 μm), detector; a distance of 1000 μm is marked.
In this system, we model light passing through a color wheel, striking a prism, reflecting off the GLV device, past a screen, focused by a lens, and striking a detector [44]. In this system, when the GLV ribbons are all up, the screen blocks the light’s 0th mode and the pixel is not displayed. When the alternating ribbons are pulled down, the lens focuses the light found in the ±1st modes and converges it to the center of the system, displaying the pixel.
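The shape of the ribbon sweep described above can be compared against the textbook far-field scalar model of an ideal square-well reflective phase grating with a 50% duty cycle. In that model the round-trip phase step is 4πd/λ, the 0th mode carries cos²(2πd/λ) of the power, and the 1st mode, normalized to its peak, follows sin²(2πd/λ). The sketch below is that idealized model only, not the full scalar simulation from the text: it ignores near-field effects, the detector aperture, and the steering of the 0th mode, and its function names are illustrative.

```python
import math


def normalized_first_order(depth_nm: float, wavelength_nm: float) -> float:
    """1st mode power relative to its peak value: sin^2(2*pi*d/lambda)."""
    return math.sin(2.0 * math.pi * depth_nm / wavelength_nm) ** 2


def zeroth_order(depth_nm: float, wavelength_nm: float) -> float:
    """Fraction of the power left in the 0th mode: cos^2(2*pi*d/lambda)."""
    return math.cos(2.0 * math.pi * depth_nm / wavelength_nm) ** 2


WAVELENGTH_NM = 520.0  # green light, as in the first simulation

# Sweep the ribbon depth from 0 to 150 nm in 10 nm steps, as in the transient sweep.
sweep = [(d, normalized_first_order(d, WAVELENGTH_NM)) for d in range(0, 151, 10)]
for depth, efficiency in sweep:
    print(f"{depth:4d} nm  1st mode (normalized): {efficiency:.3f}")

# Depth in the sweep at which the idealized 1st mode power is largest.
peak_depth_nm = max(sweep, key=lambda point: point[1])[0]
print("peak at", peak_depth_nm, "nm")
```

In this model the 1st mode power peaks at 130 nm (λ/4 for 520 nm) and falls off again beyond it, matching the qualitative behavior reported for Figure 20.23.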
Using a spinning color wheel to change the wavelength of the incident light, a frame-sequential GLV projection system uses red (680 nm), green (530 nm), and blue (470 nm) light on the same grating. Since the same grating is used for all wavelengths of light, the grating movement is tuned for the middle wavelength: 130 nm (λ_green/4).
During this simulation, we use a hybrid approach for the optical modeling. For the propagation through the color wheel and the prism, we use Gaussian propagation. Since propagating through these components does not diffract the beam, this Gaussian technique is not only efficient but also valid. However, as soon as the light propagates past the prism component, we switch the optical propagation technique to our full scalar method to accurately model the diffraction off the GLV device. The remainder of the simulation is propagated with the scalar technique.
We analyze the system by looking at the amount of optical power that is received on a centered circular detector (radius 10 μm) for the different wavelengths of light, since we are using the same GLV, tuned for the green wavelength, for all wavelengths. A sweep of the distance between the focusing lens and the detector plane is simulated for 0–1500 μm, when the GLV ribbons are pulled down. The graph in Figure 20.25 shows the normalized power received on the circular detector for each wavelength along with selected intensity contours of the green wave front as the beam propagates past the lens. For clarity, the detector’s size and position are added onto the intensity contours. For distances under 600 μm, the light remains in its two positive and negative 1st modes, as the convergence of the beams has not occurred, resulting in zero power being received on the center detector.
FIGURE 20.25 Wavelength power versus distance propagated. Plot: normalized power efficiency vs. distance between lens and detector plane; x-axis: distance between lens and detector (μm), 0 to 1500; y-axis: optical efficiency (au), 0.0 to 1.2; curves: green, red, blue. Contour axes span −5e−05 to 5e−05.
As expected, each of the wavelengths focuses at a different rate, as shown by each wavelength’s specific curve in Figure 20.25. However, all wavelengths focus and achieve detected maximum power at a distance past the lens of 1000 μm, or twice the lens’ focal length. At this point, all three colors project on top of each other, creating a color pixel in the focal plane. With additional optics, this focal plane can be projected to a screen outside the projector. This simulation has shown that the grating, although tuned for the green wavelength, can be used for all three wavelengths.
Having shown the use of Chatoyant for modeling multi-domain analog systems, we now turn to the problem of co-simulation between the framework described above and a traditional HDL simulator. Co-simulation requires the solution of two problems at the interface between the simulators. First, a consistent model of time must be reached for when events occur. Second, a consistent model of signal values must be developed for signals crossing the interface. This is the subject of the next section.
20.3 HDL Co-Simulation Environment
The two levels of simulation discussed above that are supported by Chatoyant, component and analog system, have not been optimized to simulate designs that are specified in an HDL such as Verilog or VHDL. There are no components in the Chatoyant library that directly use HDL as an input language.
On the other hand, there are many available commercial and research mixed-language HDL simulators. Mixed-language refers to the ability of a simulator to compile and execute VHDL, Verilog, and SystemC (or other C/C++ variants). In an earlier work we investigated the use of CoSim with Chatoyant models [47]. In this section, we explore an interface to a commercial system. Cadence, Mentor Graphics, Synopsys, and other EDA companies provide such simulators. One common feature among the more widely used simulators, such as ModelSim and NCSIM, is the ability to execute C-based shared object files embedded in HDL design objects. These simulators provide an application programmer’s interface (API) to gain access to simulator data and control design components. ModelSim was chosen since it has a large set of C routines that allow access to simulator state as well as modifying design signals and runtime states. These functions and procedures are bundled in an extension package known as the foreign language interface (FLI) [48].
By creating a co-simulation environment between ModelSim and Chatoyant, a powerful MDSoC design and verification environment has been created. This environment is able to address the demand for a robust and efficient system architecture/design space exploration and prototyping tool that can support the design of MDSoCs. The rest of this chapter focuses on the development of the interface between Chatoyant and ModelSim and the performance of the resulting environment.
20.3.1 Architecture
The architecture of the co-simulation environment is kept simple to be as efficient and accurate as possible. There are two phases to the execution of the environment: a system generation phase and a runtime support environment. Each is a standalone process, but both are required for system simulation. Figure 20.26 illustrates this top-level structure.
20.3.1.1 System Generator
The System Generator allows the user to create the files needed by both Chatoyant and ModelSim. For Chatoyant this includes a common header and object file used in both simulators as well as components (stars) used for the Chatoyant side of the interface. The same header and object file are used for ModelSim, in addition to a shared object library file that is used for invoking the ModelSim FLI when ModelSim loads and elaborates a design.
The main input to this generator is the top-level or interface-specific VHDL file. This file contains the list of ports that represent the main conduit between the digital domain running within ModelSim and the other domains handled in Chatoyant. When this file is loaded by the System Generator, the entity portion of the VHDL is parsed and a linked list of the ports is created. Each node in this linked list contains the port’s name, its direction (in/out/bidirectional), and its width (1 bit for a signal and n bits for a bus). Using a graphical user interface, the user can select which ports to include and the mapping for the analog voltage levels to be converted into and out of the MVL9 (Multi-Value Logic 9 signal representation standard) logic representation used by ModelSim. This mapping has four fields: a high voltage, a low voltage, a cutoff for high, and a cutoff for low. The user also specifies a name for the system, used for code generation and library management.
FIGURE 20.26 Co-simulation top-level structure. Blocks: top-level VHDL file, System Generator, wrapper VHDL, FLI shared object file, definitions library, Chatoyant star, Chatoyant, ModelSim, co-simulation runtime system.
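The port record and the four-field voltage mapping described above can be sketched as a small data structure. This is a minimal illustration, not the System Generator's actual code: the class names, field names, the example port, and the 3.3 V levels are all hypothetical, and returning "X" (the MVL9 unknown value) for a voltage between the two cutoffs is an assumption, since the text does not say how that region is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoltageMap:
    """Four-field mapping between analog voltages and logic values (names hypothetical)."""
    high_v: float         # voltage driven for a logic "1"
    low_v: float          # voltage driven for a logic "0"
    cutoff_high_v: float  # at or above this, a voltage reads as "1"
    cutoff_low_v: float   # at or below this, a voltage reads as "0"

    def to_voltage(self, logic: str) -> float:
        """Convert a logic value leaving the digital domain into a voltage."""
        if logic == "1":
            return self.high_v
        if logic == "0":
            return self.low_v
        raise ValueError(f"no voltage defined for logic value {logic!r}")

    def to_logic(self, voltage: float) -> str:
        """Convert a voltage entering the digital domain into a logic value."""
        if voltage >= self.cutoff_high_v:
            return "1"
        if voltage <= self.cutoff_low_v:
            return "0"
        return "X"  # ASSUMPTION: in-between voltages map to MVL9 unknown


@dataclass(frozen=True)
class Port:
    """One node of the port list parsed from the VHDL entity."""
    name: str
    direction: str  # "in", "out", or "inout"
    width: int      # 1 for a signal, n for a bus
    levels: VoltageMap


# Hypothetical 3.3 V levels and an 8-bit bus, chosen only for illustration.
levels = VoltageMap(high_v=3.3, low_v=0.0, cutoff_high_v=2.0, cutoff_low_v=0.8)
port = Port(name="data_out", direction="out", width=8, levels=levels)

samples_v = [3.1, 0.2, 1.4, 2.0, 0.8, 3.3, 0.0, 2.7]
bus_value = "".join(port.levels.to_logic(v) for v in samples_v)
print(port.name, bus_value)
```

Keeping the drive levels and the cutoffs as separate fields lets the read thresholds differ from the nominal output voltages, which is what the four-field description implies.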
The outputs of the generator phase are the component star file for Chatoyant, the FLI source code for the ModelSim FLI, the header and source files for a common resource library for the system, a makefile for remaking the object files, a usage text file, and the object files from the first compilation, which is performed at the end of the generation. With these files in place, the user can then proceed with the execution of the linked simulators.
20.3.1.2 Runtime Environment: Application of Parallel Discrete Event Simulation
The runtime system differentiates itself from other typical co-simulation environments in that there is no central simulation management system. Chatoyant and ModelSim are treated as two standalone processes and communicate only between themselves. This reduces the overhead of another application executing along with the two simulators as well as the additional message traffic produced by such an arbiter.
This philosophy is an application of a general parallel discrete event simulation (PDES) system. Since there are two standalone processes, each is treated as if it were its own DE processing node. Without a central arbiter, the two must (1) exchange event information by converting logic values into voltages and vice versa, and (2) synchronize their respective local simulation times. To exchange the event information, the system uses technology-specific lookup tables, created by the System Generator, that provide the conversion from a logic “1” or a logic “0” to a voltage, in addition to determining what voltage level constitutes a logic “1” and “0.” The synchronization of the simulators is where the application of PDES methods enters [49]. The asynchronous DE simulation invokes both simulators to perform unique tasks on separate parts of a design in a nonsequential fashion.
This is because there is no master synchronization process as in [1]. For synchronization and scheduling there are two major approaches one can take, conservative or optimistic. We discuss our choice next.
20.3.1.3 Conservative versus Optimistic Synchronization
The conservative and optimistic approaches solve the parallel synchronization problem in two distinct ways. This problem is defined in [2] as the requirement for multiple processing elements to produce events of an equal timestamp in order to not violate the physical causality of the system. The conservative method solves this problem by constraining each processing node to remain synchronized with the others, never allowing one simulator’s time to pass that of any other simulator. This can reduce the performance of a simulation by requiring extra overhead in the form of communication and deadlock avoidance.
The optimistic approach breaks the rule of maintaining strict causality by allowing each processing element to simulate without considering time in the other processing elements. This means that the simulators can run freely without having to synchronize, with the exception of communicating explicit event information. If, however, an event is sent from one simulator to the other, and the second simulator has a local current time greater than the event’s timestamp, then the receiving simulation process must stop and roll back time to a known safe state that is before the timestamp of the incoming event. This approach requires state saving as well as rollback mechanisms. This can be costly in terms of memory usage and processing overhead for determining and recalling previous states, and thus increases the processing time of every event. Both approaches are possible since ModelSim does have check-pointing and restoring methods available [48].
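The rollback condition of the optimistic approach can be made concrete with a short sketch. It checks whether an incoming event arrives in the receiver's past and, if so, selects the latest saved state that is strictly before the event's timestamp. The function names, the sorted list of checkpoint times, and the time values are hypothetical and are not taken from ModelSim's check-pointing interface.

```python
import bisect
from typing import List, Optional


def needs_rollback(local_time: int, event_timestamp: int) -> bool:
    """An event in the receiver's past violates causality and forces a rollback."""
    return event_timestamp < local_time


def latest_safe_checkpoint(checkpoint_times: List[int], event_timestamp: int) -> Optional[int]:
    """Return the latest checkpoint strictly before the event, or None if there is none.

    checkpoint_times must be sorted in ascending order.
    """
    index = bisect.bisect_left(checkpoint_times, event_timestamp)
    return checkpoint_times[index - 1] if index > 0 else None


# Hypothetical state: checkpoints saved every 100 time units, receiver at time 350.
checkpoints = [0, 100, 200, 300]
local_time = 350

late_event = 220  # timestamp of an event arriving from the other simulator
if needs_rollback(local_time, late_event):
    restore_to = latest_safe_checkpoint(checkpoints, late_event)
    print(f"roll back from {local_time} to checkpoint {restore_to}")
```

Every checkpoint in such a list is saved state that must be held in memory or on disk, which is the cost the conservative approach avoids.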
However, the conservative PDES method was chosen as the underlying philosophy for our co-simulation solution. Two factors went into this decision. The first consideration is that the co-simulation environment is executing as two processes on one workstation, so that exchanging timing information is not as costly as in a large physically distributed simulation environment. The second is that even with a dual-processor workstation, there is not an excess of computational or memory resources that is seen in a truly distributed PDES architecture, and therefore, a rollback would be too costly. This was confirmed with a preliminary test of the fiber image guide system described below. For that system the amount of data required for a checkpoint file was on the order of 1 to 2 MB. With an average of 10 checkpoint files needed to keep the two simulators within a common time horizon, rollback time took between 500 ms and 1.5 s. On the other hand, the conservative approach gives a solution requiring significantly less memory at the expense of increased communication to ensure that both simulators are consistently synchronized. This becomes a matter of passing simple event time information between the two simulators. Thus, the only real design issue becomes the time synchronization method.
20.3.1.4 Conservative Synchronization Using UNIX IPC Mechanisms
As described in more detail below, the system was developed and tested on a Linux-based workstation. Therefore, UNIX-style IPC is used for the communication architecture. Event information is exchanged using shared memory, and synchronization is achieved by using named pipes in blocking mode. This is similar to the synchronized data transfer and blocking methodology described in [50]. With these two mechanisms, the conservative approach is implemented in the two algorithms seen in Figure 20.27.
The algorithm for the co-simulation is straightforward. Both simulators, running concurrently, reach a point in their respective execution paths where they enter the interface code in Figure 20.27. Both check to ensure that they are at the next synchronization point (next_sync), and if they are not, they exit this section of code and continue. If they are at the next synchronization point, defining the safe-point in terms of the conservative approach in PDES, then Chatoyant starts the exchange by checking for any change in its outputs to ModelSim. If there is any change in any bit of these ports, that port is marked dirty, and a change flag is set. When all the ports have been examined, Chatoyant sends ModelSim either a ModelSim_Bound event, if any port changed value, or a No_Change event. Meanwhile, ModelSim waits for this event message from Chatoyant. Once received, it will update and schedule an event for those ports with dirty flags set, if any. It then jumps to check its own output ports, checking bit by bit for a change in each port’s value. Once again, as in Chatoyant, if there is a difference, the dirty flag for that port is set, and the change flag in ModelSim is set true. Once this is done for every port, ModelSim will send a message to Chatoyant that there is either a change (Chatoyant_Bound) or No_Change. Chatoyant, waiting for this response, will receive it and take action similar to that of ModelSim in updating the inputs from ModelSim. Finally, the two will set their respective next synchronization times and handshake with one another to indicate it is safe to continue simulating. The No_Change messages are analogous to the null message passing scheme defined by Chandy and Misra [49], which has the benefit of avoiding simulation deadlock.

Chatoyant:
    If(time < next_sync)
        return at a later time;
    For each output:
        For each bit in signal:
            If(cur[i] != new[i])
                mark dirty;
                flag change;
            End If;
        End For each bit;
    End For each output;
    If(change)
        send(ModelSim_Bound);
    Else
        send(No_Change);
    End If;
    Wait(ModelSim_Response); // Blocking
    If(Response == No_Change)
        goto Synchronize;
    Else
        For each input:
            If(input.dirty)
                update local value;
                ScheduleEventToPorthole();
                clear input.dirty;
            End If;
        End For each input;
    End If;
    Synchronize:
        next_sync = now + SYNC_PULSE;
        Send(Chatoyant_Finished);
        Wait(ModelSim_Finished);
    Done with iteration;

ModelSim:
    If(time < next_sync)
        return at a later time;
    Wait(Chatoyant_Response);
    If(Response == No_Change)
        <check outputs>;
    Else
        For each input:
            If(input.dirty)
                update local value;
                ScheduleEvent();
                clear input.dirty;
            End If;
        End For each input;
    End If;
    For each output:
        For each bit in signal:
            If(cur[i] != new[i])
                mark dirty;
                flag change;
            End If;
        End For each bit;
    End For each output;
    If(change)
        send(Chatoyant_Bound);
    Else
        send(No_Change);
    End If;
    Synchronize:
        next_sync = now + SYNC_PULSE;
        Wait(Chatoyant_Finished);
        Send(ModelSim_Finished);
    Done with iteration;

FIGURE 20.27 The synchronization in both simulators.

A key point is the concept of the next synchronization time (next_sync). This value is calculated based on a global parameter in the co-simulation environment known as the SYNC_PULSE. This parameter defines the resolution of how often synchronization occurs. This value ultimately defines the speed versus accuracy tradeoff ratio between the simulators.
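The lockstep exchange of Figure 20.27 can be exercised with a small executable sketch. It is a stand-in under stated assumptions, not the chapter's implementation: two threads play the two simulator processes, `queue.Queue` objects with blocking `get()` replace the blocking named pipes, a plain dictionary replaces the shared memory segment, and ports are compared by whole value rather than bit by bit. The port names, the `SYNC_PULSE` value, and the helper function names are hypothetical; the message names follow the figure.

```python
import queue
import threading

SYNC_PULSE = 10  # hypothetical simulation-time units between synchronization points


def dirty_ports(last_sent, new_values):
    """Return the names of ports whose value differs from the last exchange."""
    return [name for name, value in new_values.items() if last_sent.get(name) != value]


def publish(shared, last_sent, new_values, channel, bound_msg):
    """Write changed outputs to the shared store, then send Bound or No_Change."""
    changed = dirty_ports(last_sent, new_values)
    for name in changed:
        shared[name] = new_values[name]
    last_sent.update(new_values)
    channel.put((bound_msg, changed) if changed else ("No_Change", []))


def consume(shared, channel, now, who, log):
    """Block for the peer's message and record any inputs it marked dirty."""
    message, ports = channel.get()
    if message != "No_Change":
        log.append((now, who, {name: shared[name] for name in ports}))


def chatoyant(to_ms, from_ms, shared, outputs, log):
    now, last_sent = 0, {}
    for new_values in outputs:
        publish(shared, last_sent, new_values, to_ms, "ModelSim_Bound")
        consume(shared, from_ms, now, "chatoyant", log)
        now += SYNC_PULSE                      # next_sync = now + SYNC_PULSE
        to_ms.put(("Chatoyant_Finished", []))
        from_ms.get()                          # wait for ModelSim_Finished


def modelsim(to_ch, from_ch, shared, outputs, log):
    now, last_sent = 0, {}
    for new_values in outputs:
        consume(shared, from_ch, now, "modelsim", log)
        publish(shared, last_sent, new_values, to_ch, "Chatoyant_Bound")
        now += SYNC_PULSE                      # next_sync = now + SYNC_PULSE
        from_ch.get()                          # wait for Chatoyant_Finished
        to_ch.put(("ModelSim_Finished", []))


def run_cosim(chatoyant_outputs, modelsim_outputs):
    """Run both sides for the same number of sync points and return the event log."""
    ch_to_ms, ms_to_ch = queue.Queue(), queue.Queue()
    shared, log = {}, []
    peer = threading.Thread(
        target=modelsim, args=(ms_to_ch, ch_to_ms, shared, modelsim_outputs, log)
    )
    peer.start()
    chatoyant(ch_to_ms, ms_to_ch, shared, chatoyant_outputs, log)
    peer.join()
    return log


# One output port per side; each list entry is the port state at one sync point.
log = run_cosim(
    chatoyant_outputs=[{"tx": 1}, {"tx": 1}, {"tx": 0}],
    modelsim_outputs=[{"rx": 0}, {"rx": 1}, {"rx": 1}],
)
for entry in log:
    print(entry)
```

Because every step blocks on the peer's message, neither side can advance past a synchronization point alone, and a No_Change message keeps the handshake moving when nothing has changed. Both output lists must have the same length in this sketch, otherwise one side would block forever.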
A higher resolution (smaller SYNC_PULSE value) means greater accuracy but slower runtime. Depending on a particular system, this could affect the quality of the simulation results.
20.3.2 Co-Simulation of Experimental Systems
To examine the effects of synchronization resolution on speed and accuracy, we simulate two example MDSoC systems. Both are large-scale systems, meaning there are many components in each domain, including multiple analog circuits, complex optics, and mixed wire and bus interconnects between the digital and analog domains.
20.3.2.1 Fiber Image Guide
The first of these systems is the fiber image guide, or FIG, system developed at the University of Pittsburgh [51]. FIG is a high-speed 64 × 64-bit optoelectronic crossbar switch built using an optical multi-chip module. FIG uses guided wave optics, analog amplification and filtering circuits, and digital control logic to create an 8 × 8, 8-bit bus crossbar switch. The switch is a multistage interconnection network (MIN) built with a shuffle-exchange architecture. The shuffle operations are performed by the wave guide, and the digital logic performs the exchange switching operation. Analog circuits amplify the digital signals and drive VCSEL arrays, which in turn transmit light through the image guide. Photodetectors are used to convert the light back into an analog signal, which is amplified and fed back into the digital domain. This system, illustrated in Figure 20.28, exercises the ability of the co-simulation environment to handle buses as well as the communications between domains without a synchronous clock. In other words, there is no clock signal traveling across the co-simulation interface, and thus the events occur in asynchronous fashion.
20.3.2.2 Smart Optical Pixel Transceiver
The smart optical pixel transceiver, or SPOT, was a development at the University of Delaware [52]. It provides a short-range free-space optical link between two custom-designed transceivers.
Each transceiver either accepts or generates a parallel bus in the digital domain. On the transmitter side, each bus is serialized into a double data rate data signal, along with a 4X clock (125 MHz clock doubled to 250 MHz in this test system). Serialization and de-serialization are handled in the digital domain. These serial data/clock streams are converted into analog signals that are amplified and used to drive VCSEL arrays, similar to FIG. Photodetectors convert the
