Handbook of Algorithms for Physical Design Automation

Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C016 Finals Page 322 23-9-2008 #13 322 Handbook of Algorithms for Physical Design Automation

annealing is performed where the range of cell movement is limited and the number of moves per cell is greatly reduced, which reduces the overall execution time. This resulted in a factor of 2–3 speedup in execution time and a 6–17 percent improvement in half-perimeter wirelength. Sun and Sechen increased the number of levels of clustering to three [15]. They achieved up to a 7.5× speedup on designs containing 25,000 placeable objects. They also saw an improvement in half-perimeter wirelength due to clustering, although some of the improvement may be due to their new cost function and wire estimation model. In this work, the cost function contained only the sum of the half-perimeter wirelengths and timing constraints. All penalty functions were removed so that there would be no need for the sophisticated negative feedback controller used to weigh the penalty terms. Instead, only moves that generated no overlap were allowed.

16.8 PARTITION-BASED METHODS

Simulated annealing placers have also incorporated partitioning techniques to achieve better quality and speed. NRG [24] and Tomus [25] convert the placement problem to a partitioning problem that is solved using simulated annealing. The placement problem is reduced by dividing the row topology into uniform grids or bins. Each standard cell or field programmable gate array lookup table (FPGA LUT) is assigned to a bin. A penalty term is introduced to maintain uniform cell density among the bins. Standard cells are exchanged by picking two cells in different bins and swapping them. Because the number of bins is much smaller than the number of possible standard cell positions, the search space is reduced, resulting in a speedup. A second, detailed placement phase follows, which removes any residual overlap and legalizes the placement of the cells.
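The bin-swap scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the NRG or Tomus implementation: the function names, the geometric cooling schedule, the quadratic form of the density penalty, and all parameter defaults are hypothetical. Each cell is assumed to sit at the center of its bin, and the cost is recomputed from scratch after every swap, whereas a practical placer would update it incrementally.

```python
import math
import random

def hpwl(nets, cell_bin, bin_xy):
    """Sum of half-perimeter wirelengths with each cell at its bin center."""
    total = 0.0
    for net in nets:
        xs = [bin_xy[cell_bin[c]][0] for c in net]
        ys = [bin_xy[cell_bin[c]][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def density_penalty(cell_bin, width, n_bins):
    """Quadratic penalty on each bin's deviation from the average load
    (an assumed form; the text only says a density penalty is used)."""
    load = [0.0] * n_bins
    for cell, b in enumerate(cell_bin):
        load[b] += width[cell]
    average = sum(width) / n_bins
    return sum((l - average) ** 2 for l in load)

def anneal_bins(nets, width, cell_bin, bin_xy, t0=10.0, alpha=0.95,
                moves_per_temp=200, t_min=1e-3, lam=1.0, seed=0):
    """Anneal a cell-to-bin assignment by swapping cells in different bins."""
    rng = random.Random(seed)
    cell_bin = list(cell_bin)
    n_bins = len(bin_xy)

    def cost():
        return (hpwl(nets, cell_bin, bin_xy)
                + lam * density_penalty(cell_bin, width, n_bins))

    current = cost()
    best, best_assignment = current, list(cell_bin)
    t = t0
    while t > t_min:
        for _ in range(moves_per_temp):
            a, b = rng.sample(range(len(cell_bin)), 2)
            if cell_bin[a] == cell_bin[b]:
                continue  # only cells in different bins are exchanged
            cell_bin[a], cell_bin[b] = cell_bin[b], cell_bin[a]
            new = cost()
            # Metropolis criterion: always accept improvements, accept
            # uphill moves with probability exp(-(new - current) / t).
            if new <= current or rng.random() < math.exp((current - new) / t):
                current = new
                if current < best:
                    best, best_assignment = current, list(cell_bin)
            else:
                cell_bin[a], cell_bin[b] = cell_bin[b], cell_bin[a]  # undo
        t *= alpha  # geometric cooling (assumed schedule)
    return best_assignment, best
```

Because a swap only exchanges the bins of two cells, the density penalty changes only when the two cells differ in width; with unit-width cells the search is driven by wirelength alone.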
Both of these algorithms worked on a flat netlist and did not scale well to larger netlists. The NRG work was enhanced to overcome these shortcomings in the Dragon 2000 placer and its derivatives, and is the subject of Chapter 15.

16.9 GENETIC PROGRAMMING

Genetic programming is a related stochastic algorithm. It attempts to mimic the known processes of evolution to solve problems [26,27]. A solution to the problem is represented by a string of symbols. Each solution has an associated cost or score. New solutions, known as offspring, are generated by combining parts of two parent solutions in what is known as a crossover operation. In addition, a new solution may be formed by mutating a string, that is, by randomly changing one or more of its symbols. Initially, a large number of solutions, known as the population, are constructed. This population is evolved by creating new offspring solutions and retaining only the fittest solutions. The final solution is the best solution found after a fixed number of generations. A genetic placement algorithm was proposed by Cohoon and Paris [26], and results were furnished for small examples. The advantage of the genetic approach is that a large number of candidate solutions are maintained, increasing the likelihood of a good final solution.

It is often stated that simulated annealing is a special case of genetic programming where the population is one. This is not true. In genetic programming, there is no stationary Boltzmann distribution to manipulate, which would allow the algorithm to converge. This lack of a convergence guarantee is the biggest disadvantage of genetic programming. Several works [28,29] were proposed to overcome this weakness. SAGA [28] attempted to rectify the problem by starting with a genetic algorithm and then slowly converting it to a simulated annealing algorithm by pruning the population. Mahfoud and Goldberg [29] sought a parallel simulated annealing algorithm.
They proposed an algorithm that manipulates a population of simulated annealing solutions rather than a single solution. It employs crossover and mutation operators like standard genetic programming but holds Boltzmann trials between children and parents to determine the fittest members. It slowly lowers the temperature to achieve convergence. Unfortunately, this work has not been applied to standard cell placement.

16.10 PARALLEL ALGORITHMS

To further reduce the execution time of the simulated annealing placement algorithm, several multiple-processor algorithms were proposed. A parallel algorithm may be characterized by the computer organization for which it is designed (multiple instruction, multiple data [MIMD] architecture or single instruction, multiple data [SIMD] architecture) and its granularity (fine or coarse). There have been four general strategies utilized in parallel simulated annealing programs: single move acceleration, parallel moves, multiple Markov chains, and speculative computation [30]. The single move acceleration strategy attempts to break up an individual move into subtasks, which are evaluated on separate processors. Such strategies require shared memory and do not scale well. In the parallel move strategy, each processor generates and evaluates moves independently of the other processors. Care must be taken so that the moves do not interact and give erroneous results. The multiple Markov chain approach uses concurrent but separate simulated annealing chains, which are periodically exchanged. Finally, the speculative computation strategy attempts to predict the future behavior of simulated annealing moves.
Kravitz and Rutenbar [6] proposed an adaptive parallel simulated annealing placement algorithm in which, in the high-temperature regime, a move is decomposed into subtasks that are distributed across different processors, and in the low-temperature regime, multiple complete moves are performed in parallel. The authors introduced the concept of a "serializable subset" of moves to prevent interaction between processors. A serializable subset is an ordered subset of moves that, if evaluated serially, would produce the same accept and reject decisions as a parallel evaluation of the moves. Unfortunately, it is prohibitively expensive to maintain a large serializable set, so the authors use the simplest such set: one accepted move, with the remaining moves rejected. Although this guarantees that no conflict arises between the processors, it is only applicable at low temperatures, where the acceptance rate is low. Casotto et al. used clustering on an 8-processor shared memory computer to achieve a sixfold improvement in speed without loss of quality [4]. Sun and Sechen achieved near-linear speedup on a network of workstations using the parallel move approach [16]. Other algorithms [5] have been proposed for hypercube multiprocessors. The lack of access to such specialized hardware has made this work less practical. Chandy et al. attempted to overcome these problems by proposing a framework for implementing parallel simulated annealing placement on a wide range of parallel architectures [30].

16.11 MACHINE LEARNING

Over time there have been many advancements in the evolution of the simulated annealing placer. These include clustering, hierarchy, annealing schedules, range limiters, and move sets. Each of these improvements was introduced on a trial-and-error basis using empirical tests. Su et al. [31] proposed statistical learning techniques to learn and discover strategies to improve and speed the execution of simulated annealing placers.
The researchers created a response model comprising seven normalized parameters for each of ten temperature regions:

$$y = B_0 + \sum_{i=1}^{r} \sum_{j=1}^{q} B_{ij}\, p_{ij} \tag{16.23}$$

where $r = 10$ and $q = 7$.

The parameters were drawn from the placement literature and included force-directed placement and quadratic placement features. The linear regression algorithm was trained using a set of examples to correlate the model with the final solution quality, measured with the half-perimeter wirelength metric. After training, the 70 coefficients were examined and those close to zero were eliminated. The trained annealing algorithm was then run on a set of new examples to determine the efficiency of the new placement algorithm. Remarkably, the trained algorithm had discovered the range window limiter algorithm automatically. The trained algorithm outperformed the base algorithm in both speed and quality. Although the regression analysis is limited by the quality of the input parameters, this technique is unique in its ability to tune the parameters of simulated annealing algorithms to their proper values. When new parameters or techniques are discovered, this methodology allows them to be incorporated easily into the simulated annealing framework.

16.12 FUTURE

Although other placement methods have supplanted simulated annealing in the computer-aided design community, these new methods do not compare favorably with simulated annealing on small designs, that is, designs under 25,000 placeable objects. The state-of-the-art placement algorithms are focused on capacity to such an extent that the placement problem was redefined to encompass two stages: global and detailed placement. In the global placement phase, cell positions are not necessarily legal and may overlap.
This phase serves only to optimize wirelength, timing, and congestion. A second phase, known as detailed placement, is performed to legalize the placement such that cell overlaps are removed and each cell is mapped to a valid position in the row. Traditional simulated annealing placers do not make this distinction. Simulated annealing placement has not scaled as well with increasing design sizes as the latest state-of-the-art placement algorithms. Simulated annealing placers do, however, dominate results at small design sizes. Applying newer multilevel clustering techniques may further improve the performance of simulated annealing placers.

REFERENCES

1. Kirkpatrick S., Gelatt C., and Vecchi M., Optimization by simulated annealing, Science, 220(4598), 671, 1983.
2. Metropolis N., Rosenbluth A., Rosenbluth M., Teller A., and Teller E., Equation of state calculations by fast computing machines, Journal of Chemical Physics, 21, 1087, 1953.
3. Mitra D., Romeo F., and Sangiovanni-Vincentelli A., Convergence and finite-time behaviour of simulated annealing, Electronics Research Laboratory, College of Engineering, University of California, Berkeley, 1985.
4. Casotto A., Romeo F., and Sangiovanni-Vincentelli A., A parallel simulated annealing algorithm for the placement of macro-cells, IEEE Transactions on CAD, 6(5), 838, 1987.
5. Banerjee P., Jones H. M., and Sargent J. S., Parallel simulated annealing algorithms for cell placement on hypercube multiprocessors, IEEE Transactions on Parallel and Distributed Systems, 1(1), 91, 1990.
6. Kravitz S. A. and Rutenbar R. A., Placement by simulated annealing on a multiprocessor, IEEE Transactions on CAD, 6(4), 534, 1987.
7. Green J. W. and Supowit K. J., Simulated annealing without rejected moves, IEEE Transactions on CAD, 5(1), 221, 1986.
8. Lam J. and Delosme J. M., Performance of a new annealing schedule, Proceedings of the 25th Design Automation Conference, Anaheim, CA, 306, 1988.
9. Rose J., Klebsch W., and Wolf J., Temperature measurement and equilibrium dynamics of simulated annealing placements, IEEE Transactions on CAD, 9(3), 253, 1990.
10. Sechen C. and Sangiovanni-Vincentelli A., The TimberWolf placement and routing package, IEEE Journal of Solid-State Circuits, 20(2), 432, 1985.
11. Sechen C. and Sangiovanni-Vincentelli A., TimberWolf3.2: A new standard cell placement and global routing package, Proceedings of the Design Automation Conference, Las Vegas, NV, 432, 1986.
12. Sechen C. and Lee K. L., An improved simulated annealing algorithm for row-based placement, Proceedings of ICCAD, Las Vegas, NV, 478, 1987.
13. Sechen C., Chip-planning, placement, and global routing of macro/custom integrated circuits using simulated annealing, Proceedings of the Design Automation Conference, Atlantic City, NJ, 73, 1988.
14. Sechen C., VLSI Placement and Global Routing Using Simulated Annealing, Kluwer Academic Publishers, Boston, MA, 1988.
15. Sun W. J. and Sechen C., Efficient and effective placement for very large circuits, Proceedings of ICCAD, Santa Clara, CA, 170, 1993.
16. Sun W. and Sechen C., A loosely coupled parallel algorithm for standard cell placement, Proceedings of ICCAD, San Jose, CA, 137, 1994.
17. Swartz W. and Sechen C., Timing driven placement for large standard cell circuits, Proceedings of Design Automation Conference, San Francisco, CA, 211, 1995.
18. Swartz W. and Sechen C., New algorithms for the placement and routing of macro cells, Proceedings of ICCAD, Santa Clara, CA, 336, 1990.
19. Siarry P., Berthiau P. G., Durbin F., and Haussy J., Enhanced simulated annealing for globally minimizing functions of many-continuous variables, ACM Transactions on Mathematical Software, 23, 209, 1997.
20. Kahng A. B. and Xu X., Accurate pseudo-constructive wirelength and congestion estimation, International Workshop on System-Level Interconnect Prediction, Monterey, CA, 61, 2003.
21. Hustin S. and Sangiovanni-Vincentelli A., TIM, a new standard cell placement program based on the simulated annealing algorithm, paper 4.2, International Workshop on Placement and Routing, Research Triangle Park, NC, May 10–13, 1988.
22. Available at http://www.spec.org/benchmarks.html.
23. Mallela S. and Grover L., Clustering based simulated annealing for standard cell placement, Proceedings of 25th Design Automation Conference, Atlantic City, NJ, 312, 1988.
24. Sarrafzadeh M. and Wang M., NRG: Global and detailed placement, Proceedings of ICCAD, San Jose, CA, 532, 1997.
25. Roy K. and Sechen C., A timing driven n-way chip and multi-chip partitioner, Proceedings of ICCAD, Santa Clara, CA, 240, 1993.
26. Cohoon J. P. and Paris W. D., Genetic placement, Proceedings of ICCAD, Santa Clara, CA, 422, 1986.
27. Kling R. M. and Banerjee P., ESP: A new standard cell placement package using simulated evolution, Proceedings of the IEEE Design Automation Conference, Miami Beach, FL, 60, 1987.
28. Esbensen H. and Mazumder P., SAGA: A unification of the genetic algorithm with simulated annealing and its application to macro-cell placements, Proceedings of the 7th International Conference on VLSI Design, Calcutta, India, 211, 1994.
29. Mahfoud S. and Goldberg D., Parallel recombinative simulated annealing: A genetic algorithm, Parallel Computing, 21(1), 1, 1995.
30. Chandy J. A., Kim S., Ramkumar B., Parkes S., and Banerjee P., An evaluation of parallel simulated annealing strategies with application to standard cell placement, IEEE Transactions on CAD, 16(4), 398, 1997.
31. Su L., Buntine W., Newton A. R., and Peters B. S., Learning as applied to stochastic optimization for standard cell placement, IEEE Transactions on CAD, 20(4), 516, 2001.
17 Analytical Methods in Placement

Ulrich Brenner and Jens Vygen

CONTENTS

17.1 Introduction
17.2 How to Minimize Netlength
17.2.1 What Is Netlength?
17.2.2 Minimizing Netlength
17.2.3 How to Minimize Linear Netlength
17.2.4 How to Minimize Quadratic Netlength
17.2.5 Examples
17.2.6 Other Objective Functions
17.3 Properties of Quadratic Placement
17.3.1 Relation to Electrical Networks and Random Walks
17.3.2 Stability
17.4 Geometric Partitioning
17.4.1 Objectives
17.4.2 Bipartitioning
17.4.3 Quadrisection
17.4.4 Grid Warping
17.4.5 Multisection
17.5 How to Use the Partitioning Information
17.5.1 Center-of-Gravity Constraints
17.5.2 Splitting Nets
17.6 Further Techniques
17.6.1 Repartitioning
17.6.2 Parallelization
17.6.3 Dealing with Macros
17.7 Conclusion
References

17.1 INTRODUCTION

The basic idea of analytical placement consists of first placing the cells optimally in terms of an appropriate netlength estimation (but without considering disjointness constraints) and then working toward disjointness. For the second step, we can distinguish two main approaches. One method consists of modifying the objective function in small steps to force cells to move away from each other. Such force-directed approaches will be described in Chapter 18. In this chapter, we consider methods that reduce overlaps by recursive partitioning of the chip area and the set of cells to be placed. This partitioning is done in such a way that no subregion of the chip area contains more cells than fit into it. Consequently, when the regions are small enough, the cells will be spread over the chip area.
FIGURE 17.1 First six steps of an analytical placer.

Such an analytical placer is illustrated in Figure 17.1. The large objects are preplaced macros. The first picture shows a placement of the movable cells with minimum squared netlength (with many overlaps). Then, in each partitioning step, the regions and the sets of cells are divided into four parts, indicated by different gray scales. We will explain the details later in this chapter.

Analytical placement is based on the ability to minimize netlength efficiently. Therefore, we first discuss this in Section 17.2. We define various measures for netlength and show how to minimize linear and quadratic netlengths. For reasons that we will discuss, most analytical placers use quadratic netlength. Important properties of placements with minimum quadratic netlength are summarized in Section 17.3.

Minimizing quadratic netlength goes back to Tutte (1963), who used it for finding straight-line embeddings of planar graphs. This technique was then applied to placement by Fisk et al. (1967), Quinn (1975), and Quinn and Breuer (1979). They tried to reduce overlaps between cells by iteratively computing repulsing forces (see Chapter 18).

Probably the first approach to combine algorithms for minimizing netlength with recursive partitioning was presented by Wipfler et al. (1982). They adapted the approach by Quinn and Breuer (1979) and used the result as a guideline for recursive bisection steps. We explain bisection and the more sophisticated approaches used today in Section 17.4. In Section 17.5, we describe how the partitioning results can be incorporated in the ensuing netlength optimization steps. Section 17.6 deals with practical aspects of analytical placement implementations.

17.2 HOW TO MINIMIZE NETLENGTH

17.2.1 WHAT IS NETLENGTH?
As discussed in Chapter 14, it is not easy to say what a good placement is. The main design objectives (timing, power consumption, and manufacturing cost) can be influenced only indirectly by placement, as later design steps such as timing optimization or routing follow. Nevertheless, there is a need for objective functions that can be evaluated fast. The most widely adopted quality measure is netlength. Netlength can be defined in various ways, but the idea is always to estimate the wirelength after routing a given placement. Timing is typically taken into account by giving critical nets a higher weight (see Chapter 21).

To allow for fast estimation (and possibly optimization) of wirelength, one considers each net individually. This assumes that each net can be wired optimally, disregarding other nets. Of course, this is not the case, but it is a reasonable approximation, at least for the majority of the nets and in particular for the most critical ones, unless there is serious routing congestion (which one should avoid anyway; cf. Chapter 22). For each net we can consider a shortest rectilinear Steiner tree connecting the pins (see Chapter 24), but we shall also consider other estimates. Formally, we define

Definition 1 Given a set $\mathcal{N}$ of disjoint nets, each of which is a set of pins, netweights $w : \mathcal{N} \to \mathbb{R}_{\ge 0}$, pin positions $(x, y) : \bigcup \mathcal{N} \to \mathbb{R}^2$, and a function $M : \{V \subseteq \mathbb{R}^2 \mid 2 \le |V| < \infty\} \to \mathbb{R}_{\ge 0}$ (a net model), the (weighted) netlength with respect to $M$ is

$$\sum_{N \in \mathcal{N}} w(N)\, M(\{(x, y)(p) : p \in N\}).$$

Typically, a pin shape consists of several rectangles, but this is largely ignored during placement, and a representative point is chosen for each pin. As pin shapes are relatively small, the error resulting from this simplification is also rather small, at least in global placement. Detailed placement (legalization; cf. Chapter 20) can improve on this by considering the actual pin shapes.

The most natural net model, which is closest to the actual wirelength to be expected after routing, is the minimum length of a rectilinear Steiner tree. However, computing a shortest rectilinear Steiner tree for a given set of points in the plane is NP-hard (Garey and Johnson, 1977). This is one reason why other net models are useful.

The following net models have been considered in placement (see also Chapters 7 and 14). Let $V \subseteq \mathbb{R}^2$ be a finite set of points in the plane.

• $\mathrm{Steiner}(V)$ is the length of a shortest rectilinear Steiner tree for $V$.
• $\mathrm{BB}(V)$ is half the perimeter of the bounding box of $V$, that is, $\max_{(x,y) \in V} x - \min_{(x,y) \in V} x + \max_{(x,y) \in V} y - \min_{(x,y) \in V} y$.
• $\mathrm{Clique}(V)$ is $\frac{1}{|V|-1}$ times the sum of rectilinear distances over all pairs of points in $V$, that is, $\frac{1}{|V|-1} \sum_{\{(x,y),(x',y')\} \subseteq V} (|x - x'| + |y - y'|)$.
• $\mathrm{Star}(V)$ is the minimum total rectilinear distance of an auxiliary point to all elements of $V$, that is, $\min_{(x',y') \in \mathbb{R}^2} \sum_{(x,y) \in V} (|x - x'| + |y - y'|)$.

The factor $\frac{1}{|V|-1}$ in the clique estimate is standard and prevents nets with many pins from dominating the netlength, but other factors (like $\frac{2}{|V|}$) have also been used (see, e.g., Alpert et al., 1999).

The bounding box and the star estimate can both be determined in linear time: the auxiliary point for a star can be found by two median searches (Blum et al., 1973). The clique estimate can be computed in $O(|V| \log |V|)$ time by scanning the points after sorting in each coordinate.
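The three easily computed net models can be sketched in a few lines of Python. This is an illustrative sketch, not code from the chapter: the function names are hypothetical, pins are plain (x, y) tuples, and the star center is found with `statistics.median`, which sorts, rather than with the linear-time median search cited above. The function `clique_sorted` uses the sort-and-scan idea behind the $O(|V| \log |V|)$ bound.

```python
from itertools import combinations
from statistics import median

def bb(points):
    """Half the perimeter of the bounding box of the points."""
    xs, ys = zip(*points)
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def clique(points):
    """Sum of rectilinear distances over all pairs, scaled by 1/(|V|-1).
    Direct O(|V|^2) enumeration, kept as a reference implementation."""
    total = sum(abs(x1 - x2) + abs(y1 - y2)
                for (x1, y1), (x2, y2) in combinations(points, 2))
    return total / (len(points) - 1)

def _pairwise_abs_sum(values):
    """Sum of |a - b| over all pairs, computed from the sorted order:
    the k-th smallest value (0-based) contributes with factor 2k - n + 1."""
    n = len(values)
    return sum((2 * k - n + 1) * v for k, v in enumerate(sorted(values)))

def clique_sorted(points):
    """Same value as clique(), in O(|V| log |V|) time via sorting."""
    xs, ys = zip(*points)
    return (_pairwise_abs_sum(xs) + _pairwise_abs_sum(ys)) / (len(points) - 1)

def star(points):
    """Total rectilinear distance to the best auxiliary point.
    A coordinate-wise median minimizes the sum of absolute deviations."""
    xs, ys = zip(*points)
    cx, cy = median(xs), median(ys)
    return sum(abs(x - cx) + abs(y - cy) for x, y in points)
```

For the four corners of the unit square, these functions give a bounding box estimate of 2, a clique estimate of 8/3, and a star estimate of 4, whereas a shortest rectilinear Steiner tree for these points has length 3.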
TABLE 17.1 Bounds on the Ratios of Different Net Models

$$
\begin{array}{l|cccc}
 & \mathrm{BB}(V) & \mathrm{Steiner}(V) & \mathrm{Clique}(V) & \mathrm{Star}(V) \\ \hline
\mathrm{BB}(V) & 1 & 1 & 1 & 1 \\
\mathrm{Steiner}(V) & \frac{\lceil \sqrt{n-2} \rceil}{2} + \frac{3}{4} & 1 & \begin{cases} \frac{9}{8} & \text{for } n = 4 \\ 1 & \text{for } n \neq 4 \end{cases} & 1 \\
\mathrm{Clique}(V) & \frac{\lceil n/2 \rceil \lfloor n/2 \rfloor}{n-1} & \frac{\lceil n/2 \rceil \lfloor n/2 \rfloor}{n-1} & 1 & 1 \\
\mathrm{Star}(V) & \lfloor n/2 \rfloor & \lfloor n/2 \rfloor & \frac{n-1}{\lceil n/2 \rceil} & 1
\end{array}
$$

The following result tells how well the other three net models approximate the length of an optimum rectilinear Steiner tree. For two-terminal nets, all the net models are identical.

Theorem 1 Let $V$ be a finite set of points in $\mathbb{R}^2$ and $n := |V| \ge 3$. Then Table 17.1 shows an upper bound on $\frac{M_1(V)}{M_2(V)}$ for net models $M_1$ (row) and $M_2$ (column) from BB, Steiner, Clique, and Star.

As an example of how to read Table 17.1, the entry in the second row and third column says that $\mathrm{Steiner}(V) \le \frac{9}{8}\,\mathrm{Clique}(V)$ for all $V$ and $\mathrm{Steiner}(V) \le \mathrm{Clique}(V)$ if $n \neq 4$. All inequalities are essentially tight for all $n$. This result is due to Brenner and Vygen (2001).

In particular, Theorem 1 yields $\mathrm{Steiner}(V) \le \mathrm{Clique}(V) \le \mathrm{Star}(V)$ for $n \neq 4$. Hence, the clique model is superior to the star model as it estimates the length of an optimum rectilinear Steiner tree more accurately. Indeed, a clique is an optimum graph with fixed topology in this respect:

Theorem 2 Let $n \in \mathbb{N}$, $n \ge 2$. Let $G$ be a connected undirected graph with $V(G) \supseteq \{1, \ldots, n\}$, and with edge weights $w : E(G) \to \mathbb{R}_{>0}$. For $x, y : \{1, \ldots, n\} \to \mathbb{R}$ let

$$M_{(G,w)}(x, y) := \min \Big\{ \sum_{e=\{u,v\} \in E(G)} w(e)\,\big(|x(u) - x(v)| + |y(u) - y(v)|\big) \;\Big|\; x, y : V(G) \setminus \{1, \ldots, n\} \to \mathbb{R} \Big\}.$$

Now define $r(G, w)$ to be the ratio of supremum over infimum of the set

$$\Big\{ M_{(G,w)}(x, y) \;\Big|\; x, y : \{1, \ldots, n\} \to \mathbb{R},\ \mathrm{Steiner}(\{(x(1), y(1)), \ldots, (x(n), y(n))\}) = 1 \Big\}.$$

Then this ratio is minimum for the complete graph on $\{1, \ldots, n\}$ with uniform weights; it equals $\frac{3}{2}$ for $n = 4$ and $\frac{\lceil n/2 \rceil \lfloor n/2 \rfloor}{n-1}$ for $n \neq 4$.

In this sense, the clique model is optimal for all $n$.
This is also a result of Brenner and Vygen (2001). A special case that is interesting in the context of net models for mincut approaches was considered before by Chaudhuri et al. (2000).

How fast a net model can be computed and how well it approximates the shortest rectilinear Steiner tree are not the only criteria for net models. Another important issue is how well we can optimize netlength with respect to a given net model, assuming that we do not care about overlaps. This is discussed in the following sections.

17.2.2 MINIMIZING NETLENGTH

A key step in analytical placement is to find a placement that minimizes netlength (with respect to a certain net model), disregarding overlaps. This step assumes that there are some fixed pins, because otherwise one can achieve netlength close to zero by placing everything at the same position.

The netlength depends on the pin positions. Each pin either belongs to a movable cell or has a fixed position. We write $\gamma(p)$ to denote the cell that $p$ belongs to, and $\gamma(p) := \square$ if $p$ is fixed. We denote by $(x_{\mathrm{offs}}(p), y_{\mathrm{offs}}(p))$ the offset of $p$ with respect to $\gamma(p)$, or the absolute position of $p$ if $p$ is fixed. A placement is a pair of coordinates $(x(c), y(c))$ for each $c \in C := \{\gamma(p) \mid p \in P\} \setminus \{\square\}$. It implies pin positions $(x(p), y(p)) = (x(\gamma(p)) + x_{\mathrm{offs}}(p),\ y(\gamma(p)) + y_{\mathrm{offs}}(p))$ for all $p \in P$, where $x(\square) := 0$ and $y(\square) := 0$.

Thus, for a given net model $M$ and given netweights $w : \mathcal{N} \to \mathbb{R}_{>0}$, minimizing netlength is the problem of finding a placement minimizing

$$\sum_{N \in \mathcal{N}} w(N)\, M(\{(x(\gamma(p)) + x_{\mathrm{offs}}(p),\ y(\gamma(p)) + y_{\mathrm{offs}}(p)) : p \in N\}).$$

Let us stress once more that we do not care about overlaps here.

Many net models are the sum of two independent parts, one depending on x-coordinates only, and the other one depending on y-coordinates only.
Examples are BB, Clique, and Star, but also common quadratic models (see Section 17.2.4). For such net models, x- and y-coordinates can be optimized separately. This results in two independent one-dimensional problems.

17.2.3 HOW TO MINIMIZE LINEAR NETLENGTH

Netlength with respect to any of the net models BB, Clique, or Star can be minimized efficiently (if we do not care about overlaps). As discussed above, the coordinates can be considered separately, and we use only x-coordinates in our exposition.

The problem of minimizing weighted bounding box netlength can be written as a linear program (LP) by introducing two variables $l_N$ and $r_N$ for the leftmost and rightmost coordinate of a pin of each net $N$ (i.e., the edges of the bounding box), and writing

$$\min \sum_{N \in \mathcal{N}} w(N)\,(r_N - l_N)$$

subject to

$$l_N \le x(\gamma(p)) + x_{\mathrm{offs}}(p) \le r_N \quad \text{for all } p \in N \in \mathcal{N}.$$

This is an LP with $2|\mathcal{N}| + |C|$ variables and $2|P| + |C|$ linear inequality constraints. Fortunately, one does not have to use generic LP solvers but can exploit the special structure of this LP. As noted first by Cabot et al. (1970), this LP is the dual of a transshipment problem (uncapacitated minimum cost flow problem), with a vertex for each variable and two arcs for each pin. More precisely, let $G$ be the digraph with vertex set $V(G) := \{l_N, r_N \mid N \in \mathcal{N}\} \cup C \cup \{\square\}$ and arc set $E(G) := \{(l_N, \gamma(p)), (\gamma(p), r_N) \mid p \in N \in \mathcal{N}\}$. The cost of an arc $(l_N, \gamma(p))$ is $x_{\mathrm{offs}}(p)$, and the cost of $(\gamma(p), r_N)$ is $-x_{\mathrm{offs}}(p)$. Then we look for a minimum cost flow carrying one unit out of $l_N$ and one unit into $r_N$ for each $N \in \mathcal{N}$. Given a minimum cost flow, it is easy to obtain an optimum dual solution (a feasible potential in the residual graph) by a shortest path computation.

The theoretically fastest known algorithm for transshipment problems, due to Orlin (1993), has a running time of $O(n \log n\,(m + n \log n))$, where $n$ is the number of vertices and $m$ is the number of arcs. In our case, we have $n = |C| + 2|\mathcal{N}|$ and $m = 2|P|$.
With the realistic assumption $|\mathcal{N}| \ge |C|$, we get a running time of $O(|\mathcal{N}| \log |\mathcal{N}|\,(|P| + |\mathcal{N}| \log |\mathcal{N}|))$. See Korte and Vygen (2008) for more details on minimum cost flows.
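To make the one-dimensional bounding box objective concrete, the following sketch handles only the special case of a single movable cell. It is not the transshipment-based method discussed above: it is a brute-force illustration, and the function names and the net encoding (weight, offsets of the movable cell's pins, positions of fixed pins) are hypothetical. With one movable cell, the weighted netlength is a convex piecewise linear function of the cell position x, with breakpoints only where a movable pin coincides with a fixed pin of the same net, so evaluating the objective at those candidate positions suffices.

```python
def bb_length_1d(x, nets):
    """Weighted one-dimensional bounding-box netlength for cell position x.

    Each net is a tuple (weight, movable_offsets, fixed_positions), where
    movable_offsets are the x-offsets of the cell's pins in that net and
    fixed_positions are the absolute x-coordinates of its fixed pins.
    """
    total = 0.0
    for weight, offsets, fixed in nets:
        coords = [x + o for o in offsets] + list(fixed)
        total += weight * (max(coords) - min(coords))
    return total

def best_position_1d(nets):
    """Return a position of the single movable cell minimizing netlength.

    The objective is convex and piecewise linear in x, and every breakpoint
    has the form q - o for a fixed pin position q and a movable pin offset o
    in the same net, so some candidate of this form is optimal.
    """
    candidates = sorted({q - o
                         for _, offsets, fixed in nets
                         for o in offsets
                         for q in fixed})
    if not candidates:
        raise ValueError("at least one net must join the cell to a fixed pin")
    return min(candidates, key=lambda x: bb_length_1d(x, nets))

# Hypothetical example: three two-pin nets connecting the cell to fixed pins
# at x = 0, x = 10, and x = 4 (the last with weight 2 and pin offset 1).
example_nets = [
    (1.0, [0.0], [0.0]),
    (1.0, [0.0], [10.0]),
    (2.0, [1.0], [4.0]),
]
```

In this example the objective is |x| + |x − 10| + 2|x − 3|, which is minimized at x = 3 with value 10. With many movable cells the candidates are no longer independent per cell, which is why the general problem is treated as an LP.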
