Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 Finals Page 422 23-9-2008 #25 422 Handbook of Algorithms for Physical Design Automation 6. J. Vygen. Algorithms for detailed placement of standard cells. In Proceedings of Design, Automation and Test in Europe Conference, pp. 321–324, 1998. 7. H. Ren, D. Z. Pan, C. J. Alpert, and P. Villarrubia. Diffusion-based placement migration. In Proceedings of Design Automation Conference, pp. 515–520, 2005. 8. C. Li, M. Xie, C. -K. Koh, J. Cong, and P. H. Madden. Routability-driven placement and white space allocation. In Proceedings of International Conference on Computer Aided Design, pp. 394–401, 2004. 9. J. A. Roy, J. F. Lu, and I. L. Markov. Seeing the forest and the trees: Steiner wirelength optimization in placement. In Proceedings of International Symposium on Physical Design, pp. 78–85, 2006. 10. C. Li, C. -K. Koh, and P. H. Madden. Floorplan management: Incremental placement for gate sizing and buffer insertion. In Pr oceedings of Asia South Pacific Design Automation Conference, pp. 349–354, 2005. 11. T. Lou, H. Ren, C. J. Alpert, and D. Z. Pan. Computational geometry based placement migration. In Pr oceedings of International Conference on Computer Aided Design, pp. 41–47, 2005. 12. D. Hill. US Patent 6,370,673: Method and system for high speed detailed placement of cells within an integrated circuit design, 2002. 13. A. R. A gnihotri, S. Ono, C. Li, M. C. Yildiz, A. Khatkhate, C. -K. Koh, and P. H. Madden. Mi xed block placement v ia fractional cut recursive bisection. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 24(5):748–761, 2005. 14. U. Brenner, A. Pauli, and J. Vygen. Almost optimum placement legalization by minimum cost flow and dynamic programming. In Proceedings of International Symposium on Physical D esign, pp. 2–9, 2004. 15. A. R. Agnihotri, M. C. Yildiz, A. Khatkhate, A. Mathur, S. Ono, and P. H. Madden. Fractional cut: Improved recursive bisection placement. In Proceedings of International Conference on Computer Aided Design, pp. 307–310, 2003. 16. T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Intr oduction to Algorithms. MIT Press, Cambridge, MA, 1990. 17. C. K. Cheng, S. Z. Yao, and T. C. Hu. The orientation of modules based on graph decomposition. IEEE Transactions on Computers, 40(6):774–780, 1991. 18. T. W. Her and D. F. Wong. On ov er-the-cell channel routing with cell orientations consideration. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 14(6):766–772, 1995. 19. W. Swartz and C. Sechen. A ne w generalized row-based global router. In Proceedings of Design Automation Conference, pp. 491–498, 1993. 20. A. E. Caldwell, A. B. Kahng, and I. L. Markov. Optimal partitioners and end-case placers for standard- cell l ayout. IEEE Transactions on Computer-Aided Design of Inte grated Circuits and Systems, 19(11): 1304–1313, 2000. 21. S. -W. Hur and J. Lillis. Mongrel: Hybrid techniques for standard cell placement. In Proceedings of International Conference on Computer A ided Design, pp. 165–170, 2000. 22. M. R. Garey and D. S. Johnson. Computers and Intractibility: A Guide to the Theory of NP-Completeness. W. H. Freeman and C o., San Francisco, CA, 1979. p. 209. 23. A. B. Kahng, P. Tuck er, and A. Zelikovsky. Optimization of linear placements f o r wirelength minimization with free sites. In Proceedings of Asia South Pacific Design Automation Conference, pp. 241–244, 1999. 24. U. Brenner and J. Vygen. Faster optimal single-row placement with fixed ordering. In Proceedings of Design, Automation and Test in Europe Conference, pp. 117–122, 2000. 25. C. C. Chang, J. Cong, and M. Xie. Optimality and scalability study of existing placement algorithms. In Pr oceedings of Asia South Pacific Design Automation Conference, pp. 621–627, 2003. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 423 24-9-2008 #2 21 Timing-Driven Placement David Z. Pan, Bill Halpin, and Haoxing Ren CONTENTS 21.1 Introduction 424 21.2 Building Blocks and Classification 424 21.2.1 Net Modeling 424 21.2.2 Timing Analysis and Metrics 425 21.2.3 Overview of Timing-Driven Placement 426 21.3 Netweighting-Based Approach 427 21.3.1 Static Netweighting 427 21.3.1.1 Slack-Based Netweighting 428 21.3.1.2 Sensitivity-Based Netweighting 429 21.3.2 Dynamic Netweighting 432 21.3.2.1 IncrementalTiming Analysis 432 21.3.2.2 IncrementalNet Weighting 433 21.3.2.3 Placement Implementation 433 21.4 Net-Constraint-Based Approach 434 21.4.1 Net-Constraint Generation 434 21.4.1.1 Generating Effective NLCs 434 21.4.1.2 Single-Shot NLC Generation 434 21.4.1.3 IncrementalNLC Generation 435 21.4.2 Net-Constraint Placement 436 21.4.2. 1 Partition-Based Net-Constraint Placement 436 21.4.2.2 Force-Directed Net-Constraint Placement 437 21.4.2.3 Net-Constraint-Based Detailed Placement 437 21.5 Path (or Timing Graph)-Based Approach 437 21.5.1 LP-Based Formulation 438 21.5.1.1 Physical Constraints 438 21.5.1.2 Electrical/Timing Constraints 438 21.5.1.3 Objective Functions 439 21.5.2 Partitionin g-Based Overlap Removal 440 21.5.3 LagrangianRelaxation Method 440 21.5.4 Simulated Annealing 441 21.5.5 Graph-Based Differential Timing 441 21.6 Additional Techniques 441 21.6.1 Hybrid Net andPath-Based Approach 442 21.6.2 Hippocrates: A Detailed Placer without Degrading Timing 442 21.6.3 Accurate Net-Modeling Issue 442 21.7 Conclusions 443 References 443 423 Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 424 24-9-2008 #3 424 Handbook of Algorithms for Physical Design Automation 21.1 INTRODUCTION The placement algorithms presented in the previous chapters mostly focus on minimizing the total wirelength (TWL). Timing-drivenplacement (TDP) is designed specifically targetingwireson timing critical paths. It shall be noted that a cell is usually connected with two or more cells. Making some targeted nets shorter during placement may sacrifice the wirelengths of other nets that are connected through common cells. While the d elay on critical paths decreases, other paths may become critical. Therefore, TDP has to be performed in a very careful and balanced manner. Timing-driven placement has been studied extensively over the last two d ecades. The drive for new methods in TDP to maximize circuit performance is from multiple facets because of the technology scaling and integration:(1)growing interconnectversus gate delayratios;(2) higher levels of on-die functional integration, which makes global interconnects even longer; (3) increasing chip operating frequencies, which make timing closure tough; and (4) increasing number of macros and standard cells for modern system-on-chip (SOC) designs. These factors create continuing challenges to better TDP. Timing-driven placement can be performed at both global and detailed placement stages (see previous chapters on placement). Historically, TDP algorithms can be roughly grouped into two classes: net-based and path-based. The net-based approach deals with nets only, with the hope that if we handle the nets on the critical paths well, the entire critical path delay may be optimized implic- itly. The two basic techniques for net-based optimization are through netweighting [1–4] and net constraints [5–10]. The path-based approach directly works on all or a subset of paths [11–14]. The majority path-based approaches formulate the problem into a mathematical programming frame- work (e.g., linear programming [LP]). There are pros and cons for both net-based and path-based approaches in terms of runtime/scalability, ease of implementation, controllability, etc. Modern TDP techniques tend to use some hybrid manner of both n et-based and path-based approaches [15]. In this chapter, we discuss fundamental algorithms as well as recent trends of TDP. Because of the large amount of works in TDP, it is not possible to exhaust all of them in this chapter. Instead, we describe the basic ideas and fundamental techniques, and point out recent researches and possible future directions. We first cover the basic building blocks for TDP. Then the next two sections discuss net-based approaches, i.e., through netweighting and net constraints. Then we survey the basic formulations and algorithms behind the path (or timing graph)-based approach. Additional techniques and issues in the context of TDP are discussed, followed by conclusions. 21.2 BUILDING BLOCKS AND CLASSIFICATION 21.2.1 N ET MODELING Given a placement, net modeling answers a f undamental question how the net is modeled for its routing topology and wirelength computation/estimation. Figure 21.1 shows different net modeling strategies for a multiple-pin net. Bounding box model Clique model Star model Steiner model FIGURE 21.1 Different net models that can be used for placement. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 425 24-9-2008 #4 Timing-Driven Placement 425 The simplest and most widely used method to compute wirelength is the half-perimeter wire- length (HPWL) of its bounding box. For a net i,letl i , r i , u i ,andb i represent the left, right, top, and bottom locations of its bounding box. Then the HPWL of net i is HPWL i = r i − l i + u i − b i (21.1) HPWL is the lower bound for wirelength estimation, and it is accurate for two- and three-pin nets, which account for the majority nets. In analytical placement engines, wirelength is often modeled as a quadratic term (or p seudolinear term as in recent literature [1 6,17]). In those engines, clique and star models are often used. In the clique model, an edge is introduced between every pin pair of the net. In the star model, an extra star point located at the geometric center is created and each pin is connected to the star point. In general, small nets (e.g., less than five pins) can use the clique model, but for large nets with a lot of pins, clique model is not friendly to the matrix solvers b ecause it creates dense matrices. Star models are preferred for large nets. For the clique model, because it is a complete graph with far more edges than necessary to connect the net, each edge is usually assigned a weight of 2/n (where n is the number of pins of the net) [18]. The bounding box, clique, and star models are three most popular net models. There are other net models, such as Steiner trees, which are more accurate for n ets with four or more pins. However, in most designs two- and three-pin nets are the majority of the entire netlist. For example, in the industry circuit suite from Ref. [19], two- and three-pin nets constitute 64 and 20 percent of the total nets, respectively [20]. With exception of very few placers [21], Steiner-tree-based models are seldom used because they are computationally expensive. There are some recent works trying to link Steiner tree with placement, e.g., in a partition-based placer [22]. More research is needed to evaluate or make Steiner-based placement mainstream. 21.2.2 TIMING ANALYSIS AND METRICS As its name implies, TDP has to be guided by so me timing metrics, which in turn need delay modeling and timing analysis. TDP algorithms can use different levels of timing models to trade off accuracy and runtime. In general, the switch level resistance-capacitance (RC) model for gates and Elmore delay model for interconnects are fairly sufficient. There are more accurate models [23], but they are not extensively used in placement. One main reason is the higher runtime. The other reason is that during placement, routing is not done yet. It is not very meaningful to use more accurate models if errors from those uncertainties are even greater. Based on the gate and interconnect delay models, static timing analysis (STA) or even path- based ∗ timing analysis can be performed. STA [24] computes circuit path d elays using the critical path method [25]. From the set of arrival times (Arr) asserted on timing starting points and required arrival times (Req) asserted on timing endpoints, STA propagates (latest) arrival time forward and (earliest) required arrival time backward, in the topological order. Then the slack at any timing point t is the difference of its required arrival time minus its arrival time. Slk(t) = Req(t) −Arr(t) (21.2) Static timing analysis can be performed incrementally if small changes in the netlist are made. For more details on delay modeling and timing analysis, the reader is referred to Chapter 3. Timing convergence metrics measure the extent to which a placement satisfies timing constraints. They also give an indication of how difficult it would be for a design engineer to ∗ In most cases, STA is sufficient for TDP. The path-based timing analysis is more accur ate, e.g., to capture false paths. But it is very time consuming, and one may do it only if necessary, e.g., on a set of critical paths. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 426 24-9-2008 #5 426 Handbook of Algorithms for Physical Design Automation manually fix timing problems. The most commonly used timing closure metric is the worst negative slack (WNS) WNS = min t∈P o Slk(t) (21.3) where P o is the set of timing endpoints, i.e., primary outputs (POs) and data inputs of memory elements. To achieve timing closure, WNS should be nonnegative. For nanometer designs with growing variability, one may set the slack target to be a positive value to safe guard variations from process, voltage, or thermal issues. The WNS, however,only gives information about the worst path. It is possible that two placement solutions have similar WNS values, but one has only a single critical path while the other has thousands of critical and near critical paths. The figure of merit (FOM) is another very important timing closure metric [4]. It can be defined as follows: FOM = t∈P o ,Slk(t)<Slk t [Slk (t) − Slk t ] (21.4) where Slk t is the slack target for th e entire design. If Slk t = 0, the FOM is reduced to the total negative slack (TNS) [8]. 21.2.3 OVERVIEW OF TIMING-DRIVEN PLACEMENT The overview of TDP is shown in Figure 21.2. It has three basic components: timing analysis, core placement algorithms, and interfacesbetweenthemby translating timinganalysis/metrics into certain weights or constraints for core placement engines to drive and guide TDP. The previous section discusses the basics o f timing analysis and metrics from a given netlist. However, which netlist to start with so that we can have a meaningful timing analysis to g uide TDP is a very important yet open question. For example, shall we start from an unplaced netlist or some initial placement? For modern timing closure, many buffers also need to be inserted for high- fanout nets and long interconnects to get a reasonable timing picture (otherwise, there may be many loading/slew violations that make timing reports meaningless). On the other hand, those buffers will change the netlist structure for TDP. Shall they be kept or stripped out during TDP? There is very little literature covering this netlist preparation step. A reasonable strategy can be as follows: First, we start with some initial placement (e.g., wirelength driven), then perform some rough buffering/fanout optimization to get a reasonable timing estimation for the entire chip to guide TDP engine. Whether to keep those buffers during TDP may vary among different physical synthesis systems. Placement has been one of the most heavily studied physical design topics. Some of the most popular placement algorithms include analytical/force-directed placement, partition-based place- ment, simulated-annealing (SA) based placement, and LP-based placement. The reader is referred to Chapters 15 through 18 for detailed discussions. The most interesting aspect of the TDP is the mechanism to translate timing metricsintoactionsto drive the core placement engines. The focus of the rest of this chapter is on this aspect. Based on that, the TDP can beroughlyclassified intonet-based and path-based approaches.Thenet-based approach, Timing analysis/metrics • Gate • Interconnect • Static timing analysis • Path-based timing analysis Interface to placer • Netweights • Net constraints • Path constraints, etc. Core placement algorithms • Force-directed • Partitioning-based • Simulated annealing • LP-based, etc. FIGURE 21.2 Basic building blocks and overview of TDP. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 427 24-9-2008 #6 Timing-Driven Placement 427 as its name implies, deals with individual nets. Because timing analysis inherently deals with paths (with timing propagation), the timing information is then translated into either net constraints or netweights [1–4] to guide TDP engines. The main idea of net constraint generation (or delay bud- geting) is to distribute slack for each path to its constituent nets such that a zero-slack solution is obtained. The delay budget for each net is then translated into its wirelength constraint during placement. The main idea of netweighting is to assign higher netweights to more timing-critical nets while minimizing the total weighted wirelength objective. Netweighting gives direction for timing optimization through shortening critical nets, but it does not have exact control because the objective is the total weighted wirelength; the net constraint approach specifies that, but it may b e too much constrained in terms of global optimization. Net weighting and net constraint processes can be iter- atively refined as more accurate timing information is obtained during placement. A systematic way of explicit perturbation control is important for net-based algorithms. The path-based approach directly works on all or a subset of paths. The majority of this approach formulate the problem into mathematical programming framework (e.g., LP). It usually maintains an accurate timing view during the placement optimization [26]. However, its drawback is its poor scalability and complexity because of possible exponential number of paths to be simultaneously optimized [26]. An effective technique is to embed timing graph/constraint through auxiliary vari- ables [11]. The mathematical programming-based approach needs to deal with cell overlapping issues,e.g., throughpartitioning. Thepath-based timingcanalso beevaluated inasimulated annealing framework [13]. Both net-based and path-based approaches have pros and cons. Path-based in general has more accurate timing view and control, but it suffers from poor scalability. The net-based approaches, in particular netweighting, have low computational complexity and high flexibility. Thus, they are suitable for large application specific integrated circuits/system-on-chip (ASIC/SOC) designs with millions of placeable objects. Recent research shows that hybrid of these two basic approaches are promising [15]. 21.3 NETWEIGHTING-BASED APPROACH Classic placement algorithms optimize the TWL. They can be easily modified to be timing-driven using the netweighting technique, which assigns different weights to different nets such that the placer minimizes the total weighted wirelength (if all the weights are the same, it degenerates into the classic wirelength-driven placement). Intuitively, a proper netweighting should assign higher weights on more timing-critical nets, with the hope that the placement engine will reduce the lengths of these critical nets and thus their delays to achieve better overall timing. Netweighting-basedTDP is very simple to implementand less computational intensive. As mod- ern very large scale integration (VLSI) designs have millions placeable objects (gates/cells/macros), netweighting is attractive because of its simplicity. Almost all placement algorithms support netweighting . Quadratic placement can optimize the weighted quadratic wirelength, partition- based placement can optimize the weighted cutsize (see Chapter 8), and simulated-annealing-based placement can optimize the weighted linear wirelength, etc. Although netweighting appears to b e straightforward, it is not easy to generate a good netweight- ing. Higher netweights on a set of critical nets in general shall reduce their wirelengths and delays, but other nets may b ecome longer and more critical. In this section, we will review two basic sets of netweighting algorithms: static netweighting and dynamic netweighting. Static netweighting assigns weights once before TDP and the weights do not change during TDP. Dynamic netweighting updates weights during the TDP process. 21.3.1 STATIC NETWEIGHTING Static netweighting computes the netweights once before TDP. It can be divided into two cate- gories: empirical netweighting and sensitivity-based netweighting. Empirical netweighting methods Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 428 24-9-2008 #7 428 Handbook of Algorithms for Physical Design Automation compute weights based on certain criticality factors, such as slack, cycle time, and fanout. Critical nets are assigned higher netweights. Sensitivity-based netweighting computes weights based on the sensitivity analysis of netweights to factors such as WNS and FOM. The key difference of these two netweighting schemes is that sensitivity-based approach has some look-ahead mechanism that can estimate the impact of netweighting on key factors. Therefore, it assigns higher netweights on those nets that have bigger impacts on the overall timing closure goal. 21.3.1.1 Slack-Based Netweighting Empirical netweighting assigns netweight based on the critically of the net, which indicates how much the p lacer should reduce the wirelength on this net. The criticality computation can be com- puted based on the static timing analysis (STA). Assuming there is only one clock period, the net criticality can often be measured by slack. Nets with negative slacks are critical nets and are assigned higher netweights than those nets with positive slacks. w = W 1 , slack < 0 W 2 , slack ≥ 0 (21.5) where W 1 and W 2 are positive constants and W 1 > W 2 . Among the critical nets that have negative slacks, higher weights can also be given to those which are more negative. One can either use a continuous model [1] or a step-wise model [2] to compute weights based on the slack distribution, as shown in Figure 21.3. It shall be noted that netweights shall not continue to increase when slack is less than certain threshold. This is because slack can be very negative because of invalid timing assertions. Usually, placers do not need very high netweights to pull nets together. For some placers, one might add an exponential component into netweighting [27] to further emphasize critical nets. w = 1 − slack T α (21.6) where T is the longest path delay α is the criticality exponent If there are multiple clocks in the design, the clock cycle time can be considered d uring netweighting. Nets on paths of shorter cycle time should have higher weights than those of longer cycle time with the same slack. For example, we can use T in Equation 21.6 to represent different cycle times. Nets of long cycle clocks get a larger T than those of short cycle clocks. (a) Continuous (b) Step 0 0 SlackSlack Weight Weight FIGURE 21.3 Netweight assignment b ased on slack using continuous or step model . Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 429 24-9-2008 #8 Timing-Driven Placement 429 w = 1 − slack T clk α (21.7) where T clk is the clock cycle time for a particular net. There are other empirical factors that can be considered for netweighting, e.g., path depth and driver strength [28]. A deep path, which has many stages of logic, is more likely to have longer wirelength. The endpoints of this path could be placed far away from each other, therefore worse timing is expected than a path with less number of stages. A net with a weak driver would have longer delay than a strong driver with the same wirelength. Therefore, netweight should b e proportional to the longest path depth (which can be computed by running breadth first search twice: once from PIs and second time from POs), and inversely to slack and driver strength, i.e., w ≈ D l × R d (21.8) where D l is the longest path depth R d is the driver resistance A weaker driver has a larger effective driving resistance, thus the net it drives will have a bigger netweight. One could also consider path sharing during netweighting. Intuitively, a net on many critical paths should be assigned a higher weight because reducing the length of such net can reduce the delay on many critical paths. w ≈ slack ×GP (21.9) where GP is the number of critical paths passing this net. Suppose we assign two variables for each net p on the timing graph: F(p) the numb er of different critical paths starting from timing beginning point (PI) to net p;andB(p) the number of different critical paths from net p to timing endpoints (PO). The total number of critical paths passing through net p is then GP(p) = F(p) × B(p).This netweighting assignment only considers the sharing effect of critical paths, and each path has the same impact of the netweight. Kong proposed an accurate, all path counting algorithm PATH [3], which considers both noncritical and critical paths during path counting. It can properly scale the impact of all paths by their relative timing criticalities. To perform netweighting for unplaced designs, STA can use the wire load model, e.g., b ased on fanout, to estimate the delay (compared to placed designs, STA can use the actual wire load to compute delay). Normally, it is not accurate with wire load models. Therefore, for an unplaced design, an alternative way of generating weights is to use fanout and delay bound [29] instead of slack. Fanout is used to estimate wirelength and wire delay [30], and delay bound is the estimated allowable wire delay, i.e., any wire delay above this bound would result in negativeslack. The weight can be computed as the ratio of fanout and delay bound. w ≈ fanout net delay bound (21.10) In general, as the impact of netweight assignment is not very predictable, extensive parameter tuning may be needed to make it work on specific design styles. 21.3.1.2 Sensitivity-Based Netweighting Netweighting can help improve timing on critical paths. However, it may have negative effects on TWL. Assigning higher netweights on too many nets may result in significant degradation of Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 430 24-9-2008 #9 430 Handbook of Algorithms for Physical Design Automation wirelength, thus may introduce routing congestion and new critical paths. To apply high netweights only on those nets that will result in large gain in timing, we can reduce wirelength degradation and other side effects of netweighting. Sensitivity-based netweighting tries to predict the netweighting impact on timing and use that sensitivity to guide netweighting [4,31]. The question that sensitivity analysis tries to address is as follows: Given an initial placement from an initial netweighting scheme, if we increase the weight for a net i by certain nominal amount, how much improvementneti will get for its worst slack (WNS) and the overall FOM (or in a more familiar term, TNS when the slack threshold for FOM is 0). With detailed sensitivity analysis, larger weights could be assigned to a net whose weight change can have a larger impact to delay. In this section, we will explain how to estimate both slack sensitivity and TNS sensitivity to netweights and how to use those sensitivities to compute netweights. First, one needs to estimate the impact of netweight change to wirelength, i.e. the wirelength sensitivity to netweight. This sensitivity depends on the characteristics of a placer. It is not easy to estimate such sensitivity for mincut or simulated-annealing-based algorithms. But for quadratic placement, one can come up with an analytical model to estimate it. Based on Tsay’s analytical model [6], the wirelength sensitivity to netweight can be derived [4] as S L W ( i ) = L(i) W(i) =−L(i) · W src (i) +W sink (i) −2W (i) W src (i)W sink (i) (21.11) where L(i) is the initial wirelength of net i W(i) is the initial weight of net i W src (i) is the total initial weight on the driver/source of net i (simply the summation of all nets that intersect with the driver) W sink (i) is the total initial weight on the receiver/sink of net i Intuitively, Equation 21.11 implies that if the initial wirelength L(i) is longer, for the same amount of nominal weight change, it expects to see bigger wirelength change. Meanwhile, if the initial weight W (i) is relatively small, its expected wirelength change will be bigger. The negative sign means that increasing netweight will reduce wirelength. The next step is to estimate the wirelength impact on delay. Using the switch level RC device model and the Elmore delay model [32], the delay sensitivity to wirelength can be estimated as S T L (i) = T(i) L(i) = rcL(i) +cR d + rC l (21.12) where r and c are the unit length wire resistance and capacitance, respectively R d is the output resistance of the net driver C l is the load capacitance It implies that for a given technology (fixed r and c ), the delay of a long wire with a weak driver and large load will be more sensitive to the same amount of wirelength change. With wirelength sensitivity and delay sensitivity, one can compute the slack sensitivity to netweight as S Slk W (i) = Slk(i) W(i) =− T(i) L(i) · L(i) W(i) =−S T L (i)S L W (i) (21.13) Total negative slack is an important timing closure objective. The TNS sensitivity to n etweight is defined as follows: S TNS W ( i ) = TNS/W(i) (21.14) Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 431 24-9-2008 #10 Timing-Driven Placement 431 Note that TNS improvement comes from the delay improvement of this net, Equation 21.14 can be decomposed into S TNS W (i) = TNS T(i) · T(i) W(i) =−K(i)S Slk W (i) (21.15) where K(i) = TNS T(i) , which means how much TNS improvementit can achieve by reducing net delay T(i). It has been shown in Ref. [4] that K(i) is equal to the negative of the number of critical timing endpointswhose slacks are influencedby net i with a nominal T (i) and can be computed efficiently as shown in the following algorithm: Algorithm 1 Counting the number of influenced timing critical endpoints for each net 1. decompose nets with multiple sink pins into sets of driver-to-sink nets 2. initialize K(i) = 0 for all nets and timing points 3. sort all nets in topological order from timing end points to timing start points 4. for all P o pin t do 5. set K(t) to be 1 if t is timing critical (i.e., Slk(t) < Slk t ; otherwise set K(t) to be 0 6. for all net i in the above topologically sorted order do 7. for all sink pin j of net i do 8. K(i) = K(i) + K(j) 9. propagate K(i) of net i to its driver input pins: only the most critical input pin gets K(i) ; other pins will have K = 0 because they are not on the critical path of net i , thus cannot influence the timing end points from net i As an example, Figure 21.4 shows two paths from a timing begin point P i to timing endpoints P o1 and P o2 .Netn3.1 and n3.2 are the decomposed driver-to-sink nets from the original net n3. The pairs in the figure such as (−3, 1) have the following meaning: the first number is the slack, and the second number is the K value. Because the slacks at P o1 and P o2 are −3ns and −2ns, respectively (worse than the slack target of 0), the K values for P o1 and P o2 are both 1. We can see how the K values are pr opagated from PO to PI. Note that for gate C, the upper input pin has slack of −2ns while the lower input pin has slack of −1ns, thus the upper p in is the most timing critical pin to gate C and it will influence the slack of P o2 . The lower pin of C does not influence P o2 . The sensitivity-based netweighting scheme starts from a set of initial n etweights (e.g., uniform netweighting at the beginning),and computes a new set of netweights that would maximize the slack P i (-3, 2) n5 A n3.2 (-2, 1) (-2, 1) B n3.1 (-3, 1) (-3, 1) n1 P o1 P o2 n4 (-1, 0) C n2 D FIGURE 21.4 Counting the number of influenced timing endpoints. . Alpert /Handbook of Algorithms for Physical Design Automation AU7242_C020 Finals Page 422 23-9-2008 #25 422 Handbook of Algorithms for Physical Design Automation 6. J. Vygen. Algorithms for detailed. 443 423 Alpert /Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 424 24-9-2008 #3 424 Handbook of Algorithms for Physical Design Automation 21.1 INTRODUCTION The placement algorithms. e.g., on a set of critical paths. Alpert /Handbook of Algorithms for Physical Design Automation AU7242_C021 Finals Page 426 24-9-2008 #5 426 Handbook of Algorithms for Physical Design Automation manually