Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C012 Finals Page 242 24-9-2008 #5

242 Handbook of Algorithms for Physical Design Automation

12.2.5 MODULE LOCATIONS: KNOWN OR UNKNOWN?

The whole point of floorplanning is to find suitable locations for modules, and so it would seem that this information would be unknown at the start of the process. However, there are scenarios in which locations may be approximately known. For example, if a chip is based on an earlier generation of the same chip, a floorplan architect who was familiar with the design of the original chip may wish to place modules in the same approximate locations (manual or interactive floorplanning). Or, if the floorplan is the result of engineering change order modifications (i.e., incremental floorplanning), the floorplan architect may not want to radically change the locations of the modules. Alternatively, the approximate locations may be the result of a preceding step such as dualization or force-directed placement (Chapter 8), or a quick rough placement as described in the next chapter. There is a substantial body of research related to the addition of location constraints such as range and boundary constraints (discussed in Chapters 9 through 11) and symmetry constraints (discussed later in this chapter) to SA-based algorithms that address situations where there is some insight into module locations.

12.2.6 HUMAN INTERVENTION

The preceding section brings up another question. Should floorplanning be completely automated? This is the ideal scenario, but it may be unrealistic because of the number of issues involved. Therefore, it may be necessary to build tools that enable an interactive floorplanning paradigm involving interaction between the architect and the tool.

12.3 FIXED-OUTLINE FLOORPLANNING

12.3.1 AUTOMATED FLOORPLANNING WITH RECTANGULAR MODULES

Adya and Markov [3] present a fixed-outline floorplanning algorithm based on SA using sequence pairs.
(We refer to this as automated fixed-outline floorplanning to differentiate it from the incremental/interactive formulation described in the next section.) An enabling idea in this work is the use of slack-based moves. The concept of slack is illustrated in Figure 12.1. Consider the horizontal and vertical constraint graphs corresponding to a sequence pair. The longest path from source to sink in these graphs gives the width and height of the chip, respectively. The (x, y) location of each module is obtained from the constraint graph by compacting modules toward the left and the bottom. A module's (x, y) location may also be obtained by compacting modules to the right and the top. The difference in the two x (y) values of a module is its slack in the x (y) dimension. The slack is an indicator of the flexibility available to place the module without violating the sequence pair and without increasing chip size. The strategy is to move blocks with zero slack (in one or both dimensions) and place them next to blocks with large slack. The rationale is that a dimension of the chip can only be reduced by removing a zero-slack block from its critical path in the constraint graph. Blocks are placed next to blocks with large slack because there is more likely to be room there. The cost function includes a penalty for violations of the fixed-outline constraint in the horizontal and vertical directions.

FIGURE 12.1 Slacks shown in the x direction: Module B has a significant amount of slack; the width of the chip will not be impacted if a block is placed to its right. In contrast, modules D and E have zero slack and belong to a critical path. One of these will have to be moved out of the critical path to reduce the chip's width.
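The slack computation described above (longest paths in the constraint graphs, with left- and right-compacted coordinates) can be sketched as follows. The function and argument names are illustrative, not Parquet's API:

```python
from functools import lru_cache

def x_slacks(widths, hcg):
    """Horizontal slacks from a horizontal constraint graph.

    widths: {module: width}; hcg: {module: [modules that must lie to
    its right]}, a DAG derived from the sequence pair.
    Returns ({module: slack}, chip_width).
    """
    modules = list(widths)

    @lru_cache(maxsize=None)
    def to_sink(m):
        # Longest path (in module widths) from m's left edge to the
        # right edge of the chip, including m itself.
        return widths[m] + max((to_sink(n) for n in hcg.get(m, [])), default=0)

    @lru_cache(maxsize=None)
    def from_source(m):
        # Longest path from the left edge of the chip up to (not
        # including) m: this is m's left-compacted x coordinate.
        preds = [p for p in modules if m in hcg.get(p, [])]
        return max((from_source(p) + widths[p] for p in preds), default=0)

    chip_w = max(to_sink(m) for m in modules)          # chip width
    x_min = {m: from_source(m) for m in modules}       # left-compacted x
    x_max = {m: chip_w - to_sink(m) for m in modules}  # right-compacted x
    return {m: x_max[m] - x_min[m] for m in modules}, chip_w
```

With modules A (width 2) and B (width 3) constrained so B is right of A, both lie on the critical path and have zero slack; an unconstrained module C (width 1) has slack equal to the remaining chip width. The vertical slacks are computed the same way from the vertical constraint graph.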
Their experiments show that the use of slack-based moves results in substantially higher success rates relative to approaches that only penalize fixed-outline violations. The resulting floorplanning tool, called Parquet (http://vlsicad.eecs.umich.edu/BK/parquet/), has been widely used in several research projects involving fixed-outline floorplanning. The authors have released several improvements to the software. The following quotation from the website describes the major algorithmic improvements (as of July 2007):

The main difference between Parquet-2 and Parquet-3 is that Parquet-3 has an alternative floorplan representation (B∗-Trees) … Parquet-4.5 also introduces the "Best" floorplanning representation which chooses between "SeqPair" and "BTree" depending upon the input instance and optimization objectives. It has been found empirically that B∗-Trees are better at packing than Sequence Pairs, so if wirelength is not being optimized or available whitespace is lower than 10%, "Best" chooses the B∗-Tree representation. We have also found empirically that B∗-Trees are faster than Sequence Pairs on instances with 100 or more blocks, so "Best" chooses B∗-Tree over Sequence Pair in these cases as well.

Lin et al. present a fixed-outline algorithm based on evolutionary search [4]. Chen and Chang [5] present a fixed-outline algorithm based on B∗-trees that uses an alternative annealing schedule, called fast SA, that consists of three stages (high-temperature random search, pseudogreedy local search, and hill-climbing search). The goal is to arrive at a solution faster than traditional annealing schedules. The authors report better experimental results than Ref. [4] and Parquet-4.5 (both sequence pair and B∗-tree versions).
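A three-stage schedule of this kind can be sketched as below. This is a toy illustration of the stage structure only; the stage boundaries and the constant `c` are hypothetical, not Chen and Chang's published formulas:

```python
def three_stage_temperature(n, t1, stage1_end=2, stage2_end=7, c=100.0):
    """Toy three-stage cooling schedule in the spirit of fast SA.

    n: iteration number (from 1); t1: initial temperature.
    Stage 1 keeps the temperature very high (random search); stage 2
    drops it near zero (pseudogreedy local search); stage 3 reheats to
    an ordinary 1/n cooling curve (hill-climbing search).
    """
    if n < stage1_end:
        return t1             # stage 1: high-temperature random search
    if n <= stage2_end:
        return t1 / (c * n)   # stage 2: pseudogreedy local search
    return t1 / n             # stage 3: hill-climbing search
```

Note the deliberate jump back up at the start of stage 3: after the pseudogreedy phase has found a good basin, a moderate temperature lets the annealer climb out of local minima.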
12.3.2 INCREMENTAL/INTERACTIVE FLOORPLANNING WITH RECTILINEAR MODULES

The research described in this section differs from that in the previous section in two respects: (1) flexible blocks are allowed, at least in theory, to take embarrassingly rectilinear shapes and (2) approximate locations for blocks are known. This permits the algorithm to ignore interconnect-related issues because it assumes that interconnect was considered when the locations were decided. Mehta and Sherwani [6] considered a formulation where the approximate centers of flexible modules and the exact locations of fixed modules are given. A zero-whitespace formulation is assumed (i.e., the fixed outline includes exactly as much area as necessary to contain the blocks). The objective is to compute exact shapes and locations for each flexible block such that the number of sides in the corresponding rectilinear polygons is minimized, as is the displacement from the center specified for each module. The algorithm discretizes the floorplan area into a grid; grid squares are assigned to each flexible block by using variations of breadth-first traversal (Figure 12.2). However, such a traversal may disconnect available grid squares, making it impossible for blocks to remain connected. This was overcome by splitting each grid square into four smaller subsquares and traversing a subgrid in a twice-around-the-tree traversal. This guarantees that flexible blocks can be made to fit within the fixed outline without being disconnected. Experimental results confirm this and also examine trade-offs between various grid square assignment methods. This algorithm processes blocks iteratively, and consequently the quality of the shapes and locations assigned to them depends on the order in which they are processed. This approach is also computationally expensive in that its time complexity is a function of the number of grid squares created during the discretization step. Feng et al.
[7] considered an improvement that overcomes the two problems cited above, namely the sequential processing of blocks and the expensive discretization step, in an approach called interactive floorplanning.

FIGURE 12.2 (a) Floorplan grid and the location in the center square S for a flexible block, (b) allocation of 30 grid squares to the flexible block using a strict breadth-first traversal, and (c) allocation of 30 grid squares using a modified breadth-first traversal.

Once again, the outline is fixed, approximate locations for modules are known, and modules may be fixed or flexible and can have arbitrary rectilinear shapes. However, the formulation differs in that each module is constrained; i.e., a constraining rectangle is specified for each module so that the module cannot be assigned area outside the constraining rectangle (Figure 12.3). This is a very constrained formulation, making it likely that there is no solution for a given input. The idea is that if a solution does not exist, the algorithm should indicate this to the user, and the user should then adjust the constraints accordingly (hence the term "interactive floorplanning"). A zero or near-zero whitespace formulation is addressed in Ref. [7]. A max-flow algorithm is used to determine feasibility. If the input is feasible, a min-cost max-flow algorithm is used to actually assign area to each module. A postprocessing step is needed to clean up the output. Feng and Mehta [8] also consider the situation where white space is relatively plentiful.
In this case, the problem objectives are made more stringent by assigning shapes to modules so that the extents and number of sides of modules are minimized. The postprocessing step of Ref. [7] is not needed, making the solution cleaner. Instead, an iterative refinement step is used to make the modules provably more compact. Finally, Feng and Mehta provide automated mechanisms to adjust constraining rectangles based on minimizing the standard deviation of density over the floorplan area [9].

FIGURE 12.3 (a) Constraining rectangles for modules A, B, and C. Each constraining rectangle contains enough area for the corresponding module. Note that constraining rectangles may overlap. (b) Actual area allocation to modules A, B, and C. Each module is allocated area within its constraining rectangle.

In Ref. [10], Liao et al. explicitly consider the incremental floorplanning problem: given an initial floorplan with precise locations and shapes for its modules, suppose the area requirements of some of the blocks change. How can the floorplan be reconstituted so that all modules have their updated areas without disrupting their locations in the floorplan (and preferably without increasing the overall area of the floorplan)? The available whitespace must be distributed among the competing modules in the vicinity. Geometric algorithms based on plane sweep were used to simultaneously expand each module's boundary until it encounters another module or the floorplan boundary. These algorithms take polynomial time and were designed to work with arbitrary rectilinear shapes.

12.4 FLOORPLANNING AND INTERCONNECT PLANNING

This section is concerned with interconnect planning, an activity that goes hand in hand with floorplanning.
We further classify the research activity in this area based on the type of interconnect planning. Congestion-based methods primarily impact routability, while buffer-based methods primarily impact timing and performance. We also discuss bus-driven methods and close this section with a description of relatively recent research on constructing floorplans with a view to improving microarchitecture performance.

12.4.1 CONGESTION CONSIDERATIONS DURING FLOORPLANNING

Traditional floorplanning considers interconnect by including a wirelength term in the cost function used to guide SA. However, this does not provide the accuracy needed to ensure that the floorplan is routable. Routability is related to congestion: if more nets must pass through a region of the chip than there is room for, the design will be unroutable. Here we discuss strategies for congestion evaluation during floorplanning. One extreme is to use a global router to evaluate routability. Although this is accurate, it is computationally expensive because the router has to be run within the SA loop. Most congestion evaluation metrics use a grid-based approach. The idea is to divide the floorplan area into rectangular tiles and then estimate the number of nets that cross tile boundaries. An issue with the grid-based approach is selecting how coarse the grid should be: a coarse grid results in more efficient, but less accurate, computation. Another concern is how to determine precisely which tile boundaries a net will cross. Because there are typically several possible routes for a net, we do not know a priori which boundaries it will cross. One approach is to perform coarse global routing. Another approach is to compute a probabilistic map (Figure 12.4): first compute the probability (under some assumptions) that a net will cross a tile boundary; next, for each tile boundary, add up the probabilities over all the nets. Chen et al.
[11] propose two techniques for interconnect analysis during SA-based floorplanning using a slicing tree representation. The first only allows a single bend (L-shaped route), while the second allows two bends (Z-shaped route). The floorplan is subdivided into grids. Given the two endpoints of a net, the L-shaped routing scheme only permits two types of routes. Each is considered equally likely, with probability 0.5. This probability is associated with the grid boundaries crossed by the route. Each net is routed one at a time using a routing path that minimizes a cost term that contains a penalty term for congestion, a prevention term that discourages the router from using bin boundaries that are nearing saturation, and a term that rewards a route if it reuses existing routes for the same net. The overflow (excess congestion) is added to the SA cost term. Using Z-shapes (two bends) results in a greater choice of routing paths, yielding a more accurate, but computationally more expensive, estimate. Because this computation takes place within the SA loop, L-shaped routes are used at medium temperatures, while Z-shaped routes are used at low temperatures.

FIGURE 12.4 A two-pin net is to be routed from the upper left grid square to the bottom right grid square. Six possible routes are shown in (a). If each of these routes is equally likely, we are able to compute the probability that the net passes through a given grid square. For example, four out of six routes (2, 3, 4, and 5) pass through the middle square, resulting in a probability of 4/6.

Lai et al. [12] argue that grid-based approaches (as described above) to computing congestion are expensive, given that this computation is carried out repeatedly inside the floorplanning SA loop.
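The probabilistic map for one-bend (L-shaped) routes can be sketched as below. For brevity this sketch accumulates per-cell usage probabilities (as in Figure 12.4) rather than boundary crossings; all names are illustrative:

```python
from collections import defaultdict

def l_route(src, dst, horizontal_first):
    """Grid cells on a one-bend route from src=(col,row) to dst."""
    (x0, y0), (x1, y1) = src, dst
    step = lambda a, b: range(a, b, 1 if b >= a else -1)
    cells = []
    if horizontal_first:
        cells += [(x, y0) for x in step(x0, x1)]   # run across, then
        cells += [(x1, y) for y in step(y0, y1)]   # turn and run down/up
    else:
        cells += [(x0, y) for y in step(y0, y1)]   # run down/up, then
        cells += [(x, y1) for x in step(x0, x1)]   # turn and run across
    cells.append((x1, y1))
    return cells

def probability_map(nets):
    """Expected usage per grid cell: each of a net's two L-shaped
    routes is taken to be equally likely (probability 0.5 each)."""
    usage = defaultdict(float)
    for src, dst in nets:
        for first in (True, False):
            for cell in set(l_route(src, dst, first)):
                usage[cell] += 0.5
    return dict(usage)
```

For a net from (0, 0) to (2, 2), the two endpoints get probability 1.0 (every route uses them), each corner cell of the two L-shapes gets 0.5, and the center cell is never touched, since a one-bend route cannot reach it.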
They propose evaluating the congestion on the half-perimeter boundary of several regions in the floorplan. One of the novelties of this work is that region definitions are naturally tied to the twin binary tree (TBT) representation (the floorplan representation used in this work), making their definition of congestion easier to compute. Although the definition of regions is, in some sense, arbitrary, it provides statistical samples of wire density in different areas of the floorplan. The chapter also considers a mirror TBT representation that increases the number of regions considered (and therefore the number of samples). Experimental results show improved routability when congestion is considered relative to when it is not. Shen and Chu [13] observe that although the approach in Ref. [12] is efficient, it only provides a coarse evaluation of congestion. They also observe that a probability-map-based approach to evaluating congestion could differ significantly from the result of a global router. (Once a route has been chosen for a net, its associated tile boundaries are less available than the probability map suggests, while the other possible routes and their associated tile boundaries are more available than the probability map suggests.) Instead, they propose an approach based on the maximum concurrent flow problem. (This is a multicommodity flow problem in which every pair of entities can send and receive flow concurrently. The objective is to maximize throughput, subject to capacity constraints, where throughput is the actual flow between a pair of entities divided by the predefined demand for that pair. A concurrent flow is one in which the throughput is identical for all entity pairs [14].) For a given floorplan, the goal is to estimate the best maximum congestion over all possible global routing solutions. This approach uses twin binary sequences.
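The throughput definition in the parenthetical above can be stated directly in code, a definitional sketch with flows and demands supplied as plain dictionaries:

```python
def flow_throughput(flows, demands):
    """Throughput of a multicommodity flow: the minimum, over all
    entity pairs, of delivered flow divided by predefined demand.

    In a concurrent flow this ratio is identical for every pair; the
    maximum concurrent flow problem maximizes it subject to edge
    capacity constraints.
    """
    return min(flows[pair] / demands[pair] for pair in demands)
```

For example, if pair ("A", "B") receives 2 units of flow against a demand of 4 while pair ("C", "D") receives its full demand of 3, the throughput is 0.5: the least-served pair determines the value being maximized.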
Sham and Young [15] develop a floorplanner that simultaneously incorporates routability and buffer planning. They observe that congestion depends on the routes chosen for the wires, which in turn depend on the availability of buffer resources. Accordingly, their congestion model, based on probability maps, takes into account where buffers will be needed along the length of a wire and where white space is available in the floorplan to accommodate buffers. A two-phase SA algorithm is used. The cost function used in the first phase is the traditional combination of area and wirelength. The second phase incorporates a congestion metric (the average number of wires in the top ten percent most congested grids) in addition to area and wirelength. Ma et al. [16] also simultaneously consider buffer planning and routability during floorplanning. Their algorithm uses the corner block list (CBL) representation.

12.4.2 INTEGRATED BUFFER PLANNING AND FLOORPLANNING

The insertion of buffers reduces delay along the interconnect. One can only determine whether and where to insert buffers after the modules have been placed and the interconnect lengths have been somewhat established. On the other hand, buffers can only be inserted where white space is available in the floorplan. This points to a need to integrate floorplanning and buffer planning, a research area that has been addressed in the literature [16–19]. However, we omit discussing this important area here because it is covered in considerable detail in Chapter 33.

FIGURE 12.5 The shaded rectangle is a bus that must pass through modules A, B, and C. For this to work, they must be positioned so that the intersection of their y-intervals is wider than the bus's y-interval.
12.4.3 BUS-DRIVEN FLOORPLANNING

A bus is a group of wires that is required to pass through a set of specified modules. It is specified by the set of macroblocks through which it must go and a width; the width is determined by the number of wires it contains. In Ref. [20], each bus is realized by a rectangular strip. For a bus to be feasible, the macroblocks must be located such that it is indeed possible for a horizontal/vertical rectangular strip of the required width to pass through the blocks (Figure 12.5). The sequence pair representation is used. This result was extended by Law and Young [21] to allow buses with one or two bends. Chen and Chang [5] also explore bus-driven floorplanning using B∗-trees and a fast SA schedule and report better results than Ref. [20].

12.4.4 FLOORPLAN/MICROARCHITECTURE INTERACTIONS

The execution time of a program is the product of the number of (machine) instructions executed (the dynamic instruction count), the average number of clock cycles required per instruction (CPI), and the clock cycle time. Reducing instruction count and CPI has traditionally been within the purview of compiler technology and architecture, while reducing the clock cycle time (the reciprocal of the clock frequency) has been the responsibility of logic and physical design. This separation existed because any block-to-block communication on a chip took less than a cycle, an assumption that is no longer true. Consequently, there is a growing body of work that explores the interaction between microarchitecture and physical design. This interaction specifically focuses on floorplanning, because floorplanning is the first high-level physical design step that determines the locations of blocks on the chip, which are needed to determine interconnect lengths.
A key strategy used here is interconnect or wire pipelining, which introduces latches (flip-flops) on interconnects to break them into smaller segments so that signal propagation on each segment takes less than one clock cycle. However, although wire pipelining keeps clock cycle time low, it increases block-to-block latency, which can result in an increase in CPI. A metric that simultaneously captures both of these effects is throughput, the number of instructions executed per second = clock frequency/CPI. Several of the papers described below develop algorithms to obtain floorplans that optimize throughput. A key challenge here is measuring the CPI. The traditional method is to use a cycle-accurate simulator on a large number of benchmarks. However, cycle-accurate simulators are extremely time consuming, making them impossible to include in the inner loop of a floorplan optimization algorithm. They are also sufficiently time consuming that they can only be used sparingly offline (i.e., outside the floorplanning loop). All of the methods below work by plugging a term into the Parquet floorplanner's SA cost function that in one way or another approximates throughput. What distinguishes the methods is the accuracy with which throughput is approximated and the effort (in terms of the number of cycle-accurate simulations) required to generate the throughput approximation. Cong et al. [22] developed an SA algorithm, integrated with Parquet, that seeks to minimize the sum of weighted net lengths divided by the IPC (IPC = instructions executed per cycle = 1/CPI). Minimizing this quantity approximates maximizing the throughput. Net weights are based on the slacks of their pins (computed by performing a static timing analysis).
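The throughput metric can be illustrated with a small worked example; the frequencies and CPI values below are made up purely to show how wire pipelining can lose:

```python
def throughput(freq_hz, cpi):
    """Instructions per second = clock frequency / CPI."""
    return freq_hz / cpi

# Wire pipelining raises the achievable clock frequency but can also
# raise CPI (extra cycles of block-to-block latency).  These numbers
# are hypothetical: a 25% faster clock loses if CPI grows 28%.
base  = throughput(2.0e9, 1.25)   # no extra pipelining
piped = throughput(2.5e9, 1.60)   # faster clock, higher CPI
```

Here the unpipelined design executes 1.6e9 instructions/s and the pipelined one only 1.5625e9, which is why the floorplanners below optimize throughput directly rather than frequency alone.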
The algorithm evaluates alternative block implementations and uses an interconnect performance estimator. Long et al. [23] consider a version that minimizes CPI. Their contribution is to use a trajectory piecewise-linear model to estimate CPI. This estimate, based on table lookup, is much faster than a cycle-accurate simulation and has an error of about 3 percent. When this estimate is incorporated in Parquet's SA objective function, it results in a significant reduction in CPI with a modest increase in floorplan area. Casu and Macchiarulo [24] focus on the impact of loops in the logic netlist on throughput. The throughput is not explicitly included in the cost function, but is approximated as follows: each net is assigned a weight that is the inverse of the length of the shortest loop the net belongs to. The Manhattan distance between pins is divided by the maximum length admissible between clocked elements. The weighted sum over all nets is included as the throughput term in the Parquet SA objective function. Casu and Macchiarulo [25] extended their earlier work by taking into account the fact that a channel contributes to the overall throughput degradation of the system at most up to its activation time. They introduce an additional channel activation ratio, defined as the fraction of time in which a block communication channel is active. This weighting factor multiplies the throughput term so that channel communication properties are taken into account. Ekpanyapong et al. [26] profiled architectural behavior on several applications and obtained frequencies of global interconnect usage; these are used to determine the weight of each wire (the greater the frequency, the greater the weight). The weights are incorporated into a mixed integer nonlinear program (MINP) floorplanner whose goal is to minimize weighted global wirelength. The MINP formulation is relaxed to a more tractable linear programming (LP) formulation.
The final results are fed back into a cycle-accurate simulator, which shows a 40 percent improvement in CPI relative to a floorplanner that does not take architectural behavior into account. Jagannathan et al. [27] cite limitations in Ref. [22]: the cycle time for interconnect may not match the cycle time for blocks because the latter did not consider wire pipelining. Their work also differs from previous work in that it only considers systemwide critical paths and loops rather than all two-pin nets. They also argue that it is sufficient to use relative changes in the IPC (as opposed to an exact computation of IPC) to guide floorplanning. To this end, they develop an IPC sensitivity model to track changes in IPC caused by different layouts. IPC sensitivity is computed as follows: the latency of one critical path is varied while keeping the others fixed, and the degradation of IPC with each additional cycle of latency on that path is computed. Parquet is used with a weighted combination of area and 1/IPC. The approach used here is to fix a target frequency and then floorplan to optimize IPC (as opposed to simultaneously optimizing IPC and frequency). Nookala et al. [28] focus specifically on the throughput objective and identify throughput-critical wires based on the theory of multifactorial design (a statistical design technique). They argue that if each of n buses in a design can have k different latencies, the number of simulations that would normally be needed to sample the search space is O(k^n), whereas with multifactorial design theory the number of simulations needed is only O(n). The throughput-critical wires so obtained are emphasized during floorplanning by replacing the total wirelength objective in Parquet with a weighted sum of factor latencies. These are input to the floorplanner along with a target frequency. The performance of the obtained layout is validated using cycle-accurate simulations.
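The simulation-count argument in Nookala et al.'s approach can be made concrete; `runs_per_bus` is a hypothetical constant, not a figure from Ref. [28], and the point is only linear versus exponential growth:

```python
def exhaustive_runs(n_buses, k_latencies):
    """Cycle-accurate simulations needed to try every combination of
    per-bus latencies: k^n."""
    return k_latencies ** n_buses

def screening_runs(n_buses, runs_per_bus=2):
    """Simulations under a multifactorial screening design: O(n).
    (runs_per_bus is a hypothetical constant factor.)"""
    return runs_per_bus * n_buses
```

Even a modest design with 10 buses and 3 candidate latencies each would need 59,049 exhaustive simulations versus on the order of tens under a screening design, which is what makes offline characterization tractable.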
They subsequently [29] refine this work to significantly speed up the cycle-accurate simulations while preserving the quality of the solution. Wu et al. [30] propose a thermal-aware architectural floorplanning framework. By adopting an adaptive cost function and heuristic-guided perturbation in their SA algorithm, their floorplanner is able to obtain a high-performance chip layout with significant thermal gains compared to the layout obtained by a traditional performance-driven floorplanner.

12.4.5 FLOORPLAN AND POWER/GROUND COSYNTHESIS

Voltage (IR) drop in power/ground networks is an important problem in IC design. The resistance of power wires is increasing substantially. As a result, the reference supply voltage at chip components may be less than it should be. This can weaken the driving capability of logic gates, reduce circuit performance, reduce the noise margin, etc. Yim et al. [31] show that it is advantageous to consider power network and clock distribution issues at the early floorplanning stage rather than after detailed layout. How does this relate to floorplanning? Power-hungry modules draw larger currents. If these modules are placed far away from the power pads (on the boundary of the chip), the combination of larger current and greater resistance due to increased wirelength exacerbates the IR drop for those modules. Such modules should be placed nearer the power pads. Liu and Chang [32] propose a methodology to simultaneously carry out floorplanning and synthesize the power network. They use SA with the B∗-tree representation. The SA cost function is modified to include penalties for violating power integrity constraints and the power/ground mesh density cost.
In addition, the B∗-tree representation is constrained so that the most power-hungry modules (the ones that draw the most current) are on the boundary near the power pads. (In experiments, this reduction in solution space yielded a factor-of-three improvement in runtime.) The proposed methodology was successfully integrated into a commercial design flow.

12.5 FLOORPLANNING FOR SPECIALIZED ARCHITECTURES

This section considers variants of the floorplanning problem for specialized architectures such as FPGAs, three-dimensional (3D) ICs, and analog circuits.

12.5.1 FPGA FLOORPLANNING

Cheng and Wong [33] introduced FPGA floorplanning. The floorplanning problem in modern FPGAs is heterogeneous because the chip consists of different types of components arranged in columns at specified locations. These include configurable logic blocks (CLBs), multipliers, RAMs, etc. (Figure 12.6). In application-specific integrated circuit (ASIC) floorplanning, a module is simply specified by its area or by its height and width. (ASIC floorplanning may be viewed as homogeneous floorplanning because the area throughout the chip is of the same type and an ASIC module can be placed anywhere on the chip.) In FPGA floorplanning, a module is specified by the number of resources of each type that it requires. Cheng and Wong introduced the notion of a resource requirement vector (n1, n2, n3) to characterize each module, where n1, n2, and n3 are the numbers of CLBs, RAMs, and multipliers, respectively.

FIGURE 12.6 Illustration of the heterogeneous resources on an FPGA. The vector corresponding to the resources contained in the highlighted block is (48, 2, 2).
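Checking whether a candidate region satisfies a module's resource requirement vector can be sketched as follows; the grid encoding (a dictionary from cell coordinates to resource type) is an illustrative assumption:

```python
def region_resources(fpga, region):
    """Count the resources of each type inside a rectangular region.

    fpga: {(col, row): "CLB" | "RAM" | "MUL"}; region: (x0, y0, x1, y1),
    inclusive corners.  Returns a requirement-style vector
    (#CLBs, #RAMs, #multipliers).
    """
    x0, y0, x1, y1 = region
    counts = {"CLB": 0, "RAM": 0, "MUL": 0}
    for (x, y), kind in fpga.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            counts[kind] += 1
    return (counts["CLB"], counts["RAM"], counts["MUL"])

def satisfies(region_vec, requirement_vec):
    """A region satisfies a module with requirement vector (n1, n2, n3)
    if it contains at least that many resources of each type."""
    return all(have >= need for have, need in zip(region_vec, requirement_vec))
```

A region whose vector is (48, 2, 2), as in Figure 12.6, satisfies any module requiring at most that many CLBs, RAMs, and multipliers.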
They then define FPGA floorplanning as the placement of modules on the chip so that (1) each region assigned to a module satisfies its resource requirements, (2) regions for different modules do not overlap, and (3) a given cost function is optimized. The problem is solved by a two-step strategy. In the first step, the authors use an approach based on slicing trees and SA. This involves computing irreducible realization lists, which specify all of the locations at which a module can be placed so that it meets its resource requirements. An irreducible list is computed for each node in the floorplan tree in a bottom-up manner. At this stage, each node in the tree corresponds to a rectangle. Once these lists are computed, it is possible to evaluate a given floorplanning tree. This evaluation is used in the SA algorithm. The second step consists of compaction followed by postprocessing. Another two-step solution to the FPGA floorplanning problem is presented in Ref. [34]. The first step is a resource-aware floorplanning step based on Parquet. An FPGA is a bounded rectangle, making it more natural to use a fixed-outline algorithm than area minimization. In addition, the SA cost function contains a resource term that penalizes each module by the amount of mismatch between its resource requirements and the resources available at its current location. This step is expected to place modules at locations that are close to their resources. Even so, it is unlikely that every module meets its resource requirements. This is addressed by deploying a second step based on constrained floorplanning. The purpose of this step is to ensure that each module meets its resource requirements without substantially changing the location and shape obtained in the first step. The underlying algorithm is based on a min-cost max-flow network formulation. Thus, it results in a solution that takes a global view of resource demand and supply across the chip.
The constrained floorplanning techniques, originally designed for homogeneous floorplans, were modified to account for heterogeneity in FPGAs. In addition, this algorithm can incorporate trade-offs between resources. For example, each CLB in a Xilinx Virtex-II FPGA can implement 32 bits; if a module needs memory, it can acquire this resource from the RAM on the FPGA or from CLBs. The flow network in the constrained floorplanner can incorporate this modification to the formulation.

12.5.2 3D FLOORPLANNING

Three-dimensional floorplanning [35–37] has become an active area of research because of the possibility of 3D integrated circuits. We refrain from discussing this area here because it is covered elsewhere: Chapter 9 on slicing floorplans and Chapter 11 on packing representations contain detailed discussions of 3D floorplan representations, while Chapter 47 discusses the state of the art in 3D IC technologies.

12.5.3 ANALOG FLOORPLANNING

High-performance analog circuits require layouts in which groups of devices are placed symmetrically with respect to one or more axes. This is done to match layout-related parasitics in the two halves of a group of devices. A symmetry pair consists of two modules with the same dimensions. Without loss of generality, we define symmetry with respect to a vertical axis x = x_A. Let (x_L, y_L, w, h) and (x_R, y_R, w, h) denote the left and right modules, respectively. Then y_L = y_R and (x_L + w + x_R)/2 = x_A must hold for modules L and R to be symmetric. A self-symmetric module is one that is symmetric with respect to itself, i.e., x_L = x_R and y_L = y_R; such a module must be bisected by the axis of symmetry. A symmetry group may consist of several symmetry pairs and self-symmetric modules, all of which must share the same axis of symmetry (Figure 12.7). The challenge that arises during floorplanning based on SA is how to search the state space efficiently.
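The symmetry-pair conditions translate directly into a feasibility check. The sketch below assumes module coordinates (x, y, w, h) with (x, y) the lower-left corner; the tolerance eps is an assumption for floating-point comparison.

```python
def is_symmetry_pair(left, right, x_axis, eps=1e-9):
    """Check y_L = y_R and (x_L + w + x_R)/2 = x_A for a vertical axis x = x_A."""
    x_l, y_l, w, h = left
    x_r, y_r, w_r, h_r = right
    if (w, h) != (w_r, h_r):   # a symmetry pair must share dimensions
        return False
    return abs(y_l - y_r) < eps and abs((x_l + w + x_r) / 2 - x_axis) < eps

# Mirrored about x = 10: the left module spans [4, 7], the right spans [13, 16].
print(is_symmetry_pair((4, 2, 3, 5), (13, 2, 3, 5), 10.0))  # True
```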
One approach is to search the state space as before, ignoring states that do not meet the symmetry requirement. However, the majority of states are of this type, so the SA process wastes a lot of time on infeasible solutions. The research [38–44] in this area explores how to improve floorplan representations such as sequence pairs, O-trees, and B∗-trees so that they are symmetric feasible; i.e., each state visited and evaluated by SA does indeed correspond to a floorplan that satisfies symmetry. The original work mainly considers a single axis of symmetry. More recent research explicitly considers several axes of symmetry simultaneously and involves converting the sequence-pair constraint graphs into a set of linear expressions, which are then solved using linear programming. Most recently, Lin and Lin [45] propose automatically symmetric-feasible B∗-trees (ASF-B∗-trees), which can handle not only 1D but also 2D symmetry constraints. A hierarchical B∗-tree (HB∗-tree) is constructed by incorporating ASF-B∗-trees into traditional B∗-trees to handle the simultaneous placement of symmetry-group and nonsymmetry modules in analog placement.

FIGURE 12.7 Symmetry group in an analog floorplan consisting of (A, A′) and (B, B′). C is self-symmetric. D and E have no symmetry constraints.

12.6 STATISTICAL FLOORPLANNING

The basis for this line of research is that precise module dimensions may not be known during floorplanning, because floorplanning can be used very early in the design process by an architect as an estimation tool. At this stage, the modules have not yet been created and therefore their dimensions are unavailable. Suppose that, instead, the architect is able to supply an input consisting of module height and width distribution lists.
For example, a module's width may be represented as {(4, .2), (5, .5), (6, .3)}, meaning that the module has widths of 4, 5, and 6 with probabilities .2, .5, and .3, respectively. Bazargan et al. [46] approach this problem by using SA with the slicing-tree representation. The main novelty of their work is the way in which floorplan area is evaluated. Recall that in slicing floorplans, the dimensions of a larger rectangle are obtained from its two slice rectangles by adding their widths and taking the maximum of their heights (assuming the two rectangles are separated by a vertical cut). In the statistical version, this computation takes the height and width distribution lists of the two modules as input and produces the height and width distribution lists of the resulting rectangle as output. Consider, for example, two modules separated by a vertical cut:

M1: W = {(3, .2), (5, .5), (6, .3)}, H = {(5, .5), (6, .5)}
M2: W = {(2, .3), (3, .6), (4, .1)}, H = {(4, .4), (5, .6)}

The width list of the combined rectangle is the distribution of the sum of the two widths, and the height list is the distribution of the maximum of the two heights.