Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C044 Finals 24-9-2008

44 Power Grid Design
Haihua Su and Sani Nassif

CONTENTS
44.1 Motivation
    44.1.1 Technology Trends and Challenges
    44.1.2 Overview of the Chapter
44.2 Modeling and Analysis Methodology
    44.2.1 Package and Power Grid Modeling
    44.2.2 Decoupling Capacitance and Cell Modeling
    44.2.3 Leakage Modeling
    44.2.4 Methodology
    44.2.5 Tolerance Analysis of Power Grids
44.3 Power Grid Noise Analysis
    44.3.1 Noise Metrics
    44.3.2 Fast Analysis Techniques
        44.3.2.1 Hierarchical Partitioning Method
        44.3.2.2 Multigrid Methods
        44.3.2.3 Model Order Reduction Methods
        44.3.2.4 Random Walk Method
    44.3.3 Power Grid Analysis with Uncertain Work Loads
44.4 Power Grid Optimization
    44.4.1 Wire Sizing
    44.4.2 Decoupling Capacitance Allocation and Sizing
    44.4.3 Topology Optimization
    44.4.4 Optimal Placement of Power Supply Pads and Pins
References

44.1 MOTIVATION

44.1.1 TECHNOLOGY TRENDS AND CHALLENGES

The annual report of the International Technology Roadmap for Semiconductors (ITRS) [1] has shown the continued reduction of power supply voltage ($V_{dd}$), driven by power consumption reduction, reduced transistor channel length, and reliability of gate dielectrics. The lowest $V_{dd}$ target on this roadmap is expected to be 0.5 V in 2016 for low-operating-power applications. The trends in the parameters and characteristics of the microprocessor unit (MPU, a high-performance microprocessor) with on-chip static random access memory (SRAM) from the 2005 edition of the ITRS are summarized in Table 44.1. Table 44.1 shows that the trend for high-performance integrated circuits is toward higher operating frequency and lower power supply voltages. Power dissipation per unit area continues to increase, but tends to saturate at 0.64 W/mm$^2$ from 2008 to 2020.
The increased power consumption is driven by higher operating frequencies and the higher overall capacitances and resistances in larger chips that have more on-chip functions. However, such high power consumption has to flatten out because of single-chip package power limits, electromigration problems, and thermal impacts on reliability and performance. In addition, lowering the power supply voltage worsens switching currents and decreases noise margins. As a result, Ref. [1] recognizes power management as one of the grand challenges in the near term and leakage power management as one of the grand challenges in the long term.

TABLE 44.1 Trends in IC Technology Parameters

| Year | Gate Length (nm) | Number of Transistors (M) | Number of Power Pads | Number of Wire Levels | f (MHz) | V_dd (V) | Size (mm^2) | Current per Power Pad (mA) | Average Power Density (W/mm^2) |
|------|------|-------|-------|----|--------|-----|-----|-------|------|
| 2005 | 32 | 225 | 2,048 | 15 | 5,204 | 1.1 | 310 | 74.3 | 0.54 |
| 2006 | 28 | 283 | 2,048 | 15 | 6,783 | 1.1 | 310 | 79.8 | 0.58 |
| 2007 | 25 | 357 | 2,048 | 15 | 9,285 | 1.1 | 310 | 83.9 | 0.61 |
| 2008 | 23 | 449 | 2,048 | 16 | 10,972 | 1.0 | 310 | 96.9 | 0.64 |
| 2009 | 20 | 566 | 2,048 | 16 | 12,369 | 1.0 | 310 | 96.9 | 0.64 |
| 2012 | 14 | 1,133 | 2,048 | 16 | 20,065 | 0.9 | 310 | 107.6 | 0.64 |
| 2014 | 11 | 1,798 | 2,048 | 17 | 28,356 | 0.9 | 310 | 107.6 | 0.64 |
| 2016 | 9 | 2,854 | 2,048 | 17 | 39,683 | 0.8 | 310 | 121.1 | 0.64 |
| 2018 | 7 | 4,531 | 2,048 | 18 | 53,207 | 0.7 | 310 | 138.4 | 0.64 |
| 2020 | 6 | 7,192 | 2,048 | 18 | 73,122 | 0.7 | 310 | 138.4 | 0.64 |

The power delivery system includes the on-chip and off-chip power grids and the decoupling capacitors on the die, package, and board. The power grid (power distribution network) provides the $V_{dd}$ and ground signals throughout a chip. Compared to signal wires, power wires typically have lower impedances to reduce the current-resistance (IR) drops caused by the currents drawn by functional blocks.
All levels of decoupling capacitors are extensively used to suppress the transient noise caused by the transient currents drawn by functional blocks and by the interaction of package inductance and switching currents, also known as $L\,dI/dt$ noise or $\Delta I$ noise. The inductive components in package power grids and decoupling capacitors are the major limitation for performance at high frequency. Supply voltage variations can lead not only to problems related to spurious transitions but also to delay variations [2,3] and timing unpredictability [4]. Thus, a successful design requires careful design of all levels of the power delivery system.

In early technologies, the design of power networks was relatively easy because power wires had low resistances and transistors drew relatively low currents. Computer-aided design (CAD) techniques addressed power networks with well-designed tree topologies [5–7] that were considered sufficient to meet the performance requirements. A typical power grid network in the early technologies consisted of only thousands of nodes.

In recent deep submicron technologies (0.25 µm and below), as pointed out in Refs. [8,9], the shrinking of feature sizes and the increase in clock frequency have made the power grid noise problem more significant, and power supply noise is among the major factors that affect circuit functionality. Even if a reliable supply is provided at an input pin of a chip, it can deteriorate significantly within the chip. These problems become worse as the supply voltage level ($V_{dd}$) scales down. The solutions to these problems become even harder because of the larger size of the power distribution network: a typical power grid can easily exceed millions of nodes. Therefore, fast and accurate design, verification, and optimization techniques are necessary to address the power grid design issue efficiently.
44.1.2 OVERVIEW OF THE CHAPTER

This chapter discusses basic concepts and techniques for deep submicron power grid design and verification in four aspects: modeling, methodology, analysis, and optimization. Section 44.2 discusses widely adopted power grid analysis and verification methodologies and the modeling of every part of the power distribution system. Section 44.3 addresses four analysis techniques that handle large-scale power grid circuits with fixed and uncertain work loads. Optimization techniques, including wire sizing, decoupling capacitance optimization, topology optimization, and optimal power pad and pin placement, are covered in Section 44.4.

44.2 MODELING AND ANALYSIS METHODOLOGY

44.2.1 PACKAGE AND POWER GRID MODELING

The power grids in the entire power delivery system, from board to die, are coupled with each other, so that effects at one level can impact another. Because the composite board-to-die system is extremely large, analyzing the entire system is difficult and a simplified model has to be applied. A typical approach is to use a simplified on-chip power grid model when package-level power grid performance is analyzed. Similarly, a simplified package model is used when on-chip power grid performance is of interest, which is typically the case because it directly impacts chip timing and functionality. Macromodels that capture the major electrical properties of the other levels are generally sufficient for the accuracy desired in simulation, but ignoring these effects entirely can lead to accuracy losses. This is reinforced in Ref.
[10], which explains why a complete chip-level power grid analysis must include a package-level model that considers the effect of package inductance. An interesting comparison of a circuit in 0.25-µm technology using a flip-chip C4 package against the same circuit using wire-bond I/Os shows a difference in worst-case steady-state voltage drop of 0.37 V out of the 2.5 V power supply voltage.

A simplified package-level power bus model [10] is shown in Figure 44.1. The dominance of inductance in the package can be clearly seen. Although there is only self-inductance in this model, mutual inductance has to be considered if the power buses are close to each other.

FIGURE 44.1 Simplified package-level power bus model. (From Chen, H. H. and Neely, J. S., IEEE Trans. Component Package and Manufacturing Technology, 21, 209, 1998. With permission.)
[Figure not reproduced. Labeled elements: chip, C4, TF mesh, MLC mesh, MLC via, and pin, each with V and G (power and ground) rails.]

FIGURE 44.2 RLC π-model of wire segment. (From H. H. Chen and J. S. Neely, Interconnect and Circuit Modeling Techniques for Full Chip Power Supply Noise Analysis, IEEE Trans. Component Package and Manufacturing Technology, 21, 209, 1998. With permission.)
[Figure not reproduced. A segment of length $l_s$ and width $w_s$ is drawn as a series $R_s$ and $L_s$ with a shunt capacitance $C_s/2$ at each end.]

On-chip power grids in each metal layer can be accurately modeled using lumped RLC parameters.
Each power wire in the power grid is represented as a set of connected segments under the π-model (Figure 44.2), with each segment modeled using lumped RLC parameters (considering self-inductances only) given by

\[ R_s = \rho\, l_s / w_s, \qquad C_s = (\beta w_s + \alpha)\, l_s, \qquad L_s = \gamma\, l_s / w_s \qquad (44.1) \]

where
$l_s$ and $w_s$ are the length and the width of the segment
$\rho$, $\beta$, $\alpha$, and $\gamma$ are the sheet resistance per square, the capacitance per square, the fringing capacitance per unit length, and the self-inductance per square of the metal layer that is used for routing the power grid

The following rules are commonly applied for most on-chip power buses:

• Grid capacitances (area and fringing capacitance) are orders of magnitude smaller than the cell or decoupling capacitances and are therefore often ignored. However, some works show that including these capacitances can improve accuracy.
• Grid inductances can be ignored if they are orders of magnitude smaller than the package inductances.
• Although the inductance of the package dominates the $\Delta I$ noise, on-chip power bus inductance generally cannot be ignored for wires wider than 5 µm.

44.2.2 DECOUPLING CAPACITANCE AND CELL MODELING

The modeling of cell switching current has been an active branch of research. The difficulty of the problem lies in determining the sets of input patterns that matter most to the power grid noise. The model must capture the worst-case currents, average currents, or transient currents drawn by cells among all input patterns. A typical RC model for cells and decoupling capacitors (decaps) was presented in Refs. [10,11]. The switching activities of each functional block can be modeled by an equivalent circuit (Figure 44.3), which consists of time-varying resistors ($R_i$), loading capacitors ($C_{iL}$), and a nonideal decoupling capacitor ($C_{di}$ and $R_{di}$).
The loading capacitance for the equivalent circuit is calculated by $C_{iL} = P/(V^2 f)$, where $P$ is the estimated power for the corresponding area $i$, $V$ is the power supply voltage, and $f$ is the clock frequency. When the circuit is turned on, the time-varying resistance is set to $R_{i,\mathrm{on}}$, where $R_{i,\mathrm{on}} C_{iL}$ is the switching time constant. Similarly, when the circuit is switched off, the time-varying resistance is set to $R_{i,\mathrm{off}}$. At the beginning of every clock cycle, a subset of the switching circuits is turned on or off according to an event list. An example showing the switching events at a node in the power network is illustrated in Figure 44.4 [12,13].

FIGURE 44.3 Equivalent switching circuit. (From H. H. Chen and J. S. Neely, Interconnect and Circuit Modeling Techniques for Full Chip Power Supply Noise Analysis, IEEE Trans. Component Package and Manufacturing Technology, 21, 209, 1998. With permission.)
[Figure not reproduced. Labeled elements: $V_{dd}$, $G_{nd}$, $R_{i,\mathrm{on}}$, $R_{i,\mathrm{off}}$, two loading capacitors $C_{iL}$, and the decoupling branch $C_{di}$ with $R_{di}$.]

Although the above model is accurate, simulating the entire power grid with it would require analyzing a varying topology as circuit elements switch in and out of the network, which complicates the simulation procedure. Therefore, the model is not widely applied directly. A more convenient method is to replace the switching circuit model in Figure 44.3 with a piecewise linear current source whose waveform approximates the actual current waveform of the functional block, assuming ideal $V_{dd}$ and $G_{nd}$ levels. Because these current waveforms are input-pattern-dependent, algorithms for worst-case current waveform estimation are necessary. Recently published algorithms are briefly summarized below.

Algorithm 1. In Ref. [14], the circuit is divided into combinational logic macros.
The maximum current requirement for each macro is estimated separately, and the input excitation at which the maximum transient current occurs is identified. All input states can be enumerated using a branch-and-bound search technique. The complexity of the method is exponential, so it is hard to apply to large circuits. This work pessimistically assumes that every macro draws its maximum current simultaneously, and hence it tends to overestimate the worst-case currents.

Algorithm 2. In Ref. [15], Kriplani et al. proposed an input-pattern-independent algorithm that estimates an upper bound for the maximum envelope current (MEC) waveform. $I_{MEC}(t)$ is defined as the maximum possible current value that could be drawn from the power grid at time $t$ among all input patterns, given that each input can switch at any time. An accurate estimation of the MEC waveforms would typically require enumerating an exponential number of input patterns and is therefore not desirable. The proposed algorithm runs in linear time because it ignores signal correlations. This results in a very loose upper bound for the MEC waveforms and can therefore overestimate the supply currents. The same authors extended their work in Ref. [16] to consider signal correlations and obtained a tighter bound for the maximum instantaneous current.

FIGURE 44.4 Switching events at a node in a P/G network. (From Shah, J. C., Younis, A. A., Sapatnekar, S. S., and Hassoun, M. M., IEEE TCAS, 45, 1372, 1998. With permission.)
[Figure not reproduced. Events are plotted against a time axis marked $0, t_1, t_2, \ldots, t_{10}$.]

Algorithm 3. Bobba in Ref. [17] proposed a constraint-graph-based pattern-independent method for maximum current estimation. This method accounts for the timing information and the spatiotemporal correlations between pairs of logic gates.
In this method, the maximum current value in the $k$th time interval is obtained as the sum of the peak current values of the gates that can switch in that time interval. Therefore, it provides an improved upper bound on the maximum current waveform.

Algorithm 4. In Ref. [18], a timed automatic test pattern generation (ATPG) method and a probability-based method are presented to generate a small set of input patterns for estimating the maximum instantaneous current.

Algorithm 5. Chaudhry and Blaauw in Ref. [19] presented a current signature compression technique, which exploits the pattern of change of individual currents, time locality, and periodicity to achieve better compression and accuracy than single-cycle compression.

Algorithm 6. Chen and Ling in Ref. [10] proposed a simple model that represents the switching activities of circuits using only the average current $I_{ave}$ and the peak current $I_{peak}$. Depending on the ratio of $I_{peak}$ to $I_{ave}$, a triangular waveform is generated if $I_{peak} \ge 2I_{ave}$, and a trapezoidal waveform is generated if $I_{peak} < 2I_{ave}$.

Algorithm 7. In Ref. [20], Jiang et al. proposed a genetic algorithm (GA)-based input vector generation approach, which iteratively narrows down the patterns that cause the highest power supply noise at specific blocks. The fitness value of a pattern is simply the highest power supply noise at the target chip area. Their experimental results show, on average, 23 and 17 percent tighter lower and upper bounds for the benchmark circuits.

Algorithm 8. In Ref. [21], block currents are modeled as random variables to capture current variations. The first and second moments of the block currents, as well as the correlations between the currents, are assumed to be known because they can be obtained from simulation of the block and from static timing analysis. The optimized power grids in this work show robust performance against variations in block currents.
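The two-parameter model of Algorithm 6 can be illustrated with a short Python sketch. The text only states which shape is chosen; the way the shape is sized here is an assumption made for illustration: the waveform is symmetric, starts and ends at zero within one clock period, and its area equals $I_{ave}$ times the period. The function names `switching_waveform` and `average_current` are hypothetical and do not come from Ref. [10].

```python
def switching_waveform(i_ave, i_peak, period):
    """Piecewise linear current waveform over one clock period.

    Returns breakpoints [(time, current), ...]. The sizing rule is an
    assumption: the area under the waveform equals i_ave * period.
    """
    if not (0.0 < i_ave <= i_peak):
        raise ValueError("require 0 < i_ave <= i_peak")
    if i_peak >= 2.0 * i_ave:
        # Triangle: the base is shortened so that 0.5 * base * i_peak
        # equals i_ave * period; the current is zero for the rest.
        base = 2.0 * i_ave * period / i_peak
        return [(0.0, 0.0), (base / 2.0, i_peak), (base, 0.0), (period, 0.0)]
    # Trapezoid spanning the whole period: the flat top is sized so that
    # 0.5 * (period + top) * i_peak equals i_ave * period.
    top = period * (2.0 * i_ave / i_peak - 1.0)
    ramp = (period - top) / 2.0
    return [(0.0, 0.0), (ramp, i_peak), (ramp + top, i_peak), (period, 0.0)]


def average_current(points, period):
    """Average of a piecewise linear waveform (trapezoidal rule)."""
    area = sum((t1 - t0) * (i0 + i1) / 2.0
               for (t0, i0), (t1, i1) in zip(points, points[1:]))
    return area / period


if __name__ == "__main__":
    for i_peak in (4.0, 1.6):
        pts = switching_waveform(1.0, i_peak, period=1.0)
        print(i_peak, pts, average_current(pts, 1.0))
```

Under this sizing rule both shapes reproduce the requested average current, so the resulting breakpoints could be attached to a grid node as a piecewise linear current source.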
Three decoupling capacitor models are described in Ref. [10]: the n-well capacitor $C_{nw}$, the circuit capacitor $C_{ckt}$, and the thin-oxide capacitor $C_{ox}$. The n-well capacitor $C_{nw}$ is the reverse-biased pn junction capacitor between the n-well and the p-substrate. The time constant for $C_{nw}$ is process-dependent, but can usually be characterized as between 250 and 500 ps for contemporary technologies. The circuit capacitor $C_{ckt}$ is derived from the built-in capacitance between $V_{dd}$ and $G_{nd}$ in nonswitching circuits. The total capacitance from nonswitching circuits is estimated to be $\frac{P}{V^2 f} \cdot \frac{1 - SF}{SF}$, where $P$ is the power of the circuit, $V$ is the supply voltage, $f$ is the frequency, and $SF$ is the switching factor. The nonswitching capacitance is usually placed in parallel with the current source that models the functional block. The time constant for $C_{ckt}$ is determined by the switching speed of the device. The thin-oxide capacitor $C_{ox}$ uses the thin-oxide layer between the n-well and the polysilicon gate to provide the additional decoupling capacitance needed to alleviate the switching noise.

44.2.3 LEAKAGE MODELING

As described in Chapter 3, leakage power is emerging as a key design challenge in current and future designs because of the lowering of the power supply voltage, the reduction of the threshold voltage, and the reduction of gate oxide thickness. It is estimated that although leakage power is only about 10 percent of total chip power for current technologies, the number is expected to rise to 50 percent for future technologies [1].

There are two major components of leakage: subthreshold leakage $I_{sub}$ and gate leakage $I_{gate}$. For a given complementary metal-oxide-semiconductor (CMOS) technology, both subthreshold and gate leakage currents depend strongly on environmental parameters such as temperature and supply voltage.
Based on the Berkeley short-channel IGFET model (BSIM) [22], the subthreshold leakage can be modeled as

\[ I_{sub} = I_0 \cdot \exp\!\left(\frac{V_{gs} - V_{th}}{n V_T}\right) \cdot \left(1 - \exp\!\left(-\frac{V_{ds}}{V_T}\right)\right) \qquad (44.2) \]

where $V_T = kT/q$ is the thermal voltage and $I_0$ is defined as

\[ I_0 = \mu_0 C_{ox} \frac{W_{eff}}{L_{eff}} \cdot V_T^2\, e^{1.8} \qquad (44.3) \]

From Equation 44.2, the subthreshold leakage is an exponential function of $V_{ds}$. When the device is off, $V_{ds}$ is proportional to the supply voltage $V_{dd}$. Therefore, the dependency of $I_{sub}$ on $V_{dd}$ is also exponential:

\[ I_{sub}(V_{dd}) \sim \exp(V_{dd}) \qquad (44.4) \]

Besides directly affecting the thermal voltage $V_T$, temperature influences the subthreshold leakage via the surface potential $\phi_s$, which in turn affects $V_{th}$. Because of the short-channel effect and the drain-induced barrier lowering (DIBL) effect, the equation describing $V_{th}$ is quite complicated. It is shown in Ref. [23] that

\[ V_{th} \propto \sqrt{T} \qquad (44.5) \]

Combining the above two factors, a derivation based on Equation 44.2 shows that the effect of temperature change on subthreshold leakage is about order 1.5, i.e.,

\[ I_{sub}(T) \sim T^{1.5} \qquad (44.6) \]

The gate leakage current model used in the Berkeley BSIM4 model consists of four components: gate to body ($I_{gb}$), gate to drain ($I_{gd}$), gate to source ($I_{gs}$), and gate to channel ($I_{gc}$). The last of these is then partitioned between drain and source: $I_{gcd}$ and $I_{gcs}$. All four components are functions of temperature and supply voltage. The details can be found in Ref. [23]. For example, the first-order dependency of the gate-to-channel current on temperature can be shown as

\[ I_{gc} = K_s \cdot V_{aux} \qquad (44.7) \]

where $V_{aux}$ is defined as

\[ V_{aux} = NIGC \cdot V_T \cdot \log\!\left(1 + \exp\!\left(\frac{V_{gse} - VTH0}{NIGC \cdot V_T}\right)\right) \qquad (44.8) \]

Here $V_T$ is the thermal voltage, $V_{gse}$ is the equivalent gate voltage, and the rest are BSIM4 parameters.
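As a rough numerical illustration of Equations 44.2, 44.3, and 44.8, the following Python sketch evaluates the subthreshold current and the auxiliary voltage $V_{aux}$ for a single device. The drain term is written as $1 - \exp(-V_{ds}/V_T)$, the usual BSIM form. All default parameter values (`n`, `mu0`, `c_ox`, `w_over_l`) are illustrative assumptions rather than data from the chapter or from any process, and the threshold voltage is held fixed, so the temperature dependence of $V_{th}$ in Equation 44.5 is not modeled.

```python
import math

K_BOLTZMANN = 1.380649e-23      # J/K
Q_ELECTRON = 1.602176634e-19    # C


def thermal_voltage(temp_k):
    """Thermal voltage V_T = kT/q in volts."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON


def subthreshold_leakage(v_gs, v_ds, v_th, temp_k,
                         n=1.5, mu0=0.03, c_ox=1.5e-2, w_over_l=10.0):
    """Subthreshold current per Equations 44.2 and 44.3, in amperes.

    mu0 is in m^2/(V*s) and c_ox in F/m^2; both defaults are assumed
    placeholder values, as are n and w_over_l.
    """
    v_t = thermal_voltage(temp_k)
    i0 = mu0 * c_ox * w_over_l * v_t ** 2 * math.exp(1.8)
    gate_term = math.exp((v_gs - v_th) / (n * v_t))
    drain_term = 1.0 - math.exp(-v_ds / v_t)
    return i0 * gate_term * drain_term


def v_aux(v_gse, vth0, nigc, temp_k):
    """Auxiliary voltage of Equation 44.8."""
    v_t = thermal_voltage(temp_k)
    return nigc * v_t * math.log1p(math.exp((v_gse - vth0) / (nigc * v_t)))


if __name__ == "__main__":
    for temp in (300.0, 350.0, 400.0):
        i_off = subthreshold_leakage(0.0, 1.0, 0.3, temp)
        print(f"T = {temp:.0f} K, I_sub = {i_off:.3e} A")
```

Even with $V_{th}$ held constant, the printed off-state current rises with temperature, because $V_T$ appears both in the prefactor $I_0$ and in the exponent.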
For current CMOS technologies, subthreshold leakage is much stronger than gate leakage. Therefore, when we consider the effects of temperature and $V_{dd}$ fluctuation, subthreshold leakage is the dominant part. From Equations 44.4 and 44.6, it is clear that the same amount of $V_{dd}$ fluctuation has a stronger effect on the leakage than the same amount of temperature fluctuation.

44.2.4 METHODOLOGY

Because of the modeling complexity and the large problem sizes associated with power grid analysis, most of the methodologies for full-chip power grid verification proposed in the literature [10,11,24–26] simplify the nonlinear devices into linear elements (current sources and capacitors) attached to the power grid. The entire analysis is typically performed in two steps. First, the cells (nonlinear devices) are analyzed assuming perfect power and ground voltages. Static, switching, and leakage current models are generated using the approaches discussed in the preceding sections. Next, with these current sources attached to the power grid, DC or transient analysis of the large-scale linear power grid circuit is performed to estimate the noise or electromigration problems. In Ref. [27], one more step is added because of the nonlinear dependency of dynamic and leakage currents on $V_{dd}$. In this step, the power grid voltages computed in step two are applied to the cells to obtain updated static, switching, and leakage power. The updated power is used to reanalyze the power grid noise.

The work in Ref. [10] emphasized that an integrated package-level and chip-level power bus analysis is critical. This is in contrast with traditional technologies, where the resistive IR drop occurs mostly on the chip and the inductive $\Delta I$ noise occurs only on the package. Therefore, under a traditional methodology, the IR drop and the $\Delta I$ noise are separately analyzed and summed up.
This can become too pessimistic because the worst-case $\Delta I$ noise and the worst-case IR drop do not occur at the same time.

Realistic power grid analysis methodologies must handle cells or power grids in a hierarchical manner to manage the complexity of the problem. For example, smaller cells can be grouped into larger macros, and a global-level power grid analysis can be performed by applying the current models of such macros. In addition, as indicated in Ref. [26], an important aspect to observe is the voltage distribution trend in a chip. In commercial CAD tools, a visual IR voltage drop plot is often generated to identify hot spots. Hot spots identified at the global level need to be investigated in detail at the next level of hierarchy.

Similarly, power bus models can also be treated hierarchically [10]. For hot spot areas roughly identified at the global level, finer grids can be generated to model the detailed power bus structure. Ref. [10] points out that the detailed power bus of each fine grid should always be connected to the adjacent global power bus model to preserve accuracy across the hierarchy.

44.2.5 TOLERANCE ANALYSIS OF POWER GRIDS

To understand the tolerance analysis of power grids, we must examine two important factors:

1. The manner in which the electrical model of the power grid is derived from the physical implementation, a process commonly referred to as circuit extraction
2. The sources of variability in a power grid model, and the impact such variability will have on the various components of the grid

Circuit extraction starts with the physical implementation of the power grid. This consists of the layout geometry of the power grid shapes, which defines the power grid wires in the x and y directions, along with the semiconductor process manufacturing information, which defines the thickness of the various conducting and insulating layers and thus defines the power grid wires in the z direction.
With the geometry defined, the circuit extraction process applies models for the resistance, capacitance, and inductance as a function of geometry to calculate the values of the equivalent circuit components for the various geometries defining the power grid.

For example, the resistance of a rectangular wire segment with width $W$ and length $L$ can be estimated using the simple formula

\[ R = \rho \frac{L}{W T} \qquad (44.9) \]

where
$\rho$ is the resistivity of the metal layer in question
$T$ is the thickness of the layer

Similar first-order equations exist for capacitance (e.g., Ref. [28]) and, to a lesser extent, for inductance. A deeper exploration of circuit extraction is, however, beyond the scope of this discussion. The important point to note is that well-established procedures exist to map the layout geometry of the power grid to equivalent circuit components.

With the above understanding in place, let us consider the sources of variability that would impact the performance of a power grid. Such sources include

1. Variations in the electrical material properties, for example, material resistivity, insulator dielectric constant, etc. Let us denote these by category A.
2. Variations in the horizontal geometry of the power grid wires, which naturally occur in the semiconductor manufacturing process and arise primarily from the lithography and etch processes. We denote these by category B.
3. Variations in the vertical geometry of the power grid wires, which arise primarily from the chemical-mechanical polishing (CMP) process. We denote these by category C.
4. Variations in the loading of the power grid.
These are caused by two possible sources: (1) lack of complete knowledge of the operational characteristics of the integrated circuit connected to the power grid (e.g., not knowing how active a certain part of the circuit is likely to be), and (2) the impact of manufacturing variations on the power dissipated by the circuit (e.g., the impact of MOSFET channel length fluctuations on the leakage current of the circuit). We denote these by category D.

Note that categories A, B, and C are the traditional sources of variation one might consider when performing a tolerance analysis, while category D has more to do with our lack of knowledge of the workload. We discuss category D later in Section 44.3.3.

When performing such a tolerance analysis, it is important to understand the relative impact of each source of variability and to ensure that no one source is over- or under-analyzed. For resistors, all three categories (A, B, and C) are important, and one needs to make a careful study of the tolerances expected for each dimension, especially for those shapes that are at the lower resolution limits of the manufacturing process (e.g., vias). For capacitors, on the other hand, the distances between grid wires of different polarities are typically large enough that the small variations caused by lithography, etch, or CMP are not as important for determining the intrinsic capacitance of the power grid wires themselves. The dielectric constant, however, can play a part. The capacitance between the power grid and signal wires, which are typically interspersed between power grid wires, will vary, but such capacitance plays only a small part in the performance of the power grid compared to the decoupling capacitance presented by inactive circuits. For inductors, the total loop inductance is primarily a function of the loop geometry and how it interacts with other loops as well as with the conducting ground plane.
Because the variations in geometry caused by categories B and C are small, they have minimal impact on inductance.

Therefore, in summary, the primary source of variations in a power grid is the variation in the resistive part of the power grid model. The capacitive part varies significantly, but its impact is relatively small, while the inductive part does not vary significantly.
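Since the resistive part dominates, a simple corner calculation on Equation 44.9 gives a feel for how categories A, B, and C combine. The Python sketch below is a minimal illustration under stated assumptions: each tolerance is a symmetric relative bound, the three sources are treated as independent, the wire length is taken as exact, and the numerical values in the usage section are invented for the example. The function names are hypothetical.

```python
def wire_resistance(rho, length, width, thickness):
    """Equation 44.9: R = rho * L / (W * T), all quantities in SI units."""
    return rho * length / (width * thickness)


def resistance_corners(rho, length, width, thickness,
                       tol_rho, tol_width, tol_thickness):
    """Return (r_min, r_nominal, r_max) for relative tolerances.

    tol_rho models category A (material), tol_width category B
    (lithography and etch), tol_thickness category C (CMP).
    """
    nominal = wire_resistance(rho, length, width, thickness)
    # Highest resistance: most resistive metal, narrowest and thinnest wire.
    r_max = wire_resistance(rho * (1.0 + tol_rho), length,
                            width * (1.0 - tol_width),
                            thickness * (1.0 - tol_thickness))
    # Lowest resistance: the opposite corner.
    r_min = wire_resistance(rho * (1.0 - tol_rho), length,
                            width * (1.0 + tol_width),
                            thickness * (1.0 + tol_thickness))
    return r_min, nominal, r_max


if __name__ == "__main__":
    # Assumed example: 100 um long, 1 um wide, 0.5 um thick wire with
    # resistivity 2e-8 ohm*m and tolerances of 5%, 10%, and 10%.
    r_min, r_nom, r_max = resistance_corners(
        2e-8, 100e-6, 1e-6, 0.5e-6, 0.05, 0.10, 0.10)
    print(f"R_min = {r_min:.3f}, R_nom = {r_nom:.3f}, R_max = {r_max:.3f} ohm")
```

Because width and thickness appear in the denominator, the spread is asymmetric: with these assumed tolerances the high-resistance corner deviates further from nominal than the low-resistance corner does.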