VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY
UNIVERSITY OF INFORMATION TECHNOLOGY
FACULTY OF COMPUTER ENGINEERING
NGUYEN LE NHAT HAO
VU THI HONG NHUNG
GRADUATE THESIS
PLACE AND ROUTE IMPLEMENTATION OF THE 32-BIT
ARM CORTEX-M0
ENGINEER OF COMPUTER ENGINEERING
HO CHI MINH CITY, 2021
NGUYEN LE NHAT HAO - 17520447
VU THI HONG NHUNG - 17520863
INFORMATION OF THE GRADUATE THESES ASSESSMENT COUNCIL
The graduate theses assessment council, established under Decision No. 462/QD-DHCNTT on 23rd July 2021 of the Rector of the University of Information Technology.
We would like to express our sincere gratitude to the lecturers at the University of Information Technology for giving us the chance to study here. We would also like to thank SNST & Finger vina company for giving us the opportunity to do this honors thesis.
We are extremely thankful to our faculty guide, Mr. Nguyen Minh Son, for his valuable guidance and support in completing this project.
We also gratefully acknowledge Mr. Nguyen Duy Manh Thi and Mr. Ngo Thanh Sang from SNST & Finger vina company for their technical support, which allowed us to finish our project.
Finally, we want to show our regards to all our colleagues who directly or indirectly helped us to complete this thesis report.
Ho Chi Minh City, July 2021
Students
Nguyen Le Nhat Hao Vu Thi Hong Nhung
TABLE OF CONTENTS
Chapter 1 INTRODUCTION TO CHIP DESIGN FLOW
1.1 Introduction
1.2 SoC design flow
Chapter 2 BACK-END DESIGN FLOW USING SYNOPSYS TOOL
2.1 Introduction of IC Compiler II tool
2.2 Place and route design flow using IC Compiler II tool
2.3 Libraries preparation and design inputs
2.4 Floorplan
2.4.1 Overview of Floorplan stage
2.4.2 Basic terminologies before Floorplan
2.4.3 Floorplan initialization
2.4.4 Macros placement
2.4.5 Physical cells
2.5 Powerplan
2.5.1 Overview of powerplan stage
2.5.2 Powerplan structure
2.5.3 Checks after powerplan
2.6 Placement
2.6.1 Overview of placement stage
2.6.2 Timing concepts
2.6.3 Power concepts
2.6.4 Placement steps
2.6.5 Analyze the design after placement
2.7 Clock tree synthesis
2.7.1 Overview of Clock tree synthesis
2.7.2 Clock Tree Synthesis steps
2.7.3 Analyze the design after CTS
2.8 Route
2.8.1 Overview of route
2.8.2 Route steps
2.8.3 Analyze the design after Route
2.8.4 [title illegible in source]
Chapter 3 ARM CORTEX-M0 OVERVIEW & SPECIFICATION
3.1 Overview
3.1.1 Cortex-M0 processor
3.1.2 DesignStart’s Cortex-M0 processor
3.1.3 DesignStart’s System design
3.2 ASIC post-synthesis design structure
3.3 ASIC physical design specification
Chapter 4 ARM CORTEX-M0 IMPLEMENTATION RESULT
4.1 ARM Cortex M0 inputs
4.2 Read design input step
4.3 Floorplan step
4.4 Powerplan steps
4.4.1 Power domain creation
4.4.2 Powerplan specification
4.4.3 Power rings creation
4.4.4 Power straps creation
4.4.5 Power rails creation
4.4.6 Check results after powerplan
4.5 Placement step
4.5.1 Logical hierarchy
4.5.2 Placement results
4.6 Clock tree synthesis step
4.6.1 Clock [title partly illegible in source]
4.6.2 Clock tree synthesis results
4.7 Route step
4.7.1 Check DRC, LVS using ICC2 tool
4.7.2 Report utilization at route
4.7.3 Analyzing timing quality & cell count at route
4.7.4 Analyzing cell type & power usage report at route
4.8 STA timing Engineering change order (ECO)
4.8.1 ECO working flow
4.8.2 ECO results
Chapter 5 CONCLUSION & FUTURE WORK
5.1 Conclusion
5.2 Future work
LIST OF FIGURES
Figure 2.1 GUI interface of IC Compiler II
Figure 2.2 IC Compiler II place and route flow
Figure 2.3 ICC2 input setup overview
Figure 2.4 Layer definition in technology file
Figure 2.5 Unit tile definition example
Figure 2.6 NDM reference library files
Figure 2.7 Example content in sdc file
Figure 2.8 Example content in upf file
Figure 2.9 Example content in def file
Figure 2.10 Example content in scandef file
Figure 2.11 Example content in v file
Figure 2.12 [title illegible in source]
Figure 2.13 Site row in design
Figure 2.14 Types of core shape
Figure 2.15 Die and core boundary in design
Figure 2.16 Macros placement example
Figure 2.17 Tap cell placement example
Figure 2.18 Boundary cell placement example
Figure 2.19 Core rings and Macro rings
Figure 2.20 Power meshes structure
Figure 2.21 Power rails structure
Figure 2.22 Timing paths in [...]
Figure 2.23 Types of timing path
Figure 2.24 Elements of timing check
Figure 2.25 Scenarios creation
Figure 2.26 Before and after CTS example
Figure 2.27 NDR rules example
Figure 2.34 Global routing explanation
Figure 2.35 Track assignment example
Figure 3.1 Functional block diagram of Cortex-M0
Figure 3.2 Simplified block diagram of Cortex-M0
Figure 3.3 Functional block diagram of DesignStart’s Cortex-M0 Processor
Figure 3.4 Simplified block diagram of DesignStart’s Cortex-M0 Processor
Figure 3.5 Example system top level view
Figure 3.6 Design view after ASIC synthesis
Figure 4.1 Design input all versions
Figure 4.2 Design netlist content
Figure 4.3 Design SDC content
Figure 4.4 Read design flow
Figure 4.6 Operating scenario setup script
Figure 4.8 Script for placing ports
Figure 4.11 Power domain creation script
Figure 4.13 Script to [...]
Figure 4.14 Power ring explanation
Figure 4.15 M9 horizontal/M8 vertical ring
Figure 4.16 Straps creation flow
Figure 4.18 Power strap explanation
Figure 4.19 M7 horizontal/M2 vertical straps
Figure 4.20 Standard cells rail creation flow
Figure 4.21 Standard cell rails script
Figure 4.22 Standard cell rail explanation
Figure 4.23 M1 rails [...]
Figure 4.24 Checks result at powerplan
Figure 4.25 Design logical hierarchy
Figure 4.26 CMSDK_mcu_system cell placement
Figure 4.27 Fpga_apb_subsystem cell placement
Figure 4.28 Remaining sub modules cell placement
Figure 4.29 Congestion map at placement
Figure 4.30 Cell density map at placement
Figure 4.31 Pin density map at placement
Figure 4.32 Utilization at placement
Figure 4.33 Report timing quality at placement
Figure 4.34 Report cells usage at placement
Figure 4.35 Report power usage at placement
Figure 4.37 Latency report - corner ff_0pO5v_125C
Figure 4.38 Latency report - corner ss_p95v_125C
Figure 4.39 Latency path report to clock endpoint
Figure 4.40 Latency path after cell sizing
Figure 4.41 Final clock latency - corner ss_p95v125C
Figure 4.42 Utilization at CTS
Figure 4.43 Report timing quality at CTS
Figure 4.44 Report cells usage at CTS
Figure 4.46 Check LVS report at route
Figure 4.47 Check DRC report at route
Figure 4.48 Report utilization at route
Figure 4.49 Report timing quality at route
Figure 4.52 [title illegible in source]
Figure 4.53 Timing report before ECO 1
Figure 4.54 Max trans report before ECO 1
Figure 4.55 Max cap report before ECO 1
Figure 4.56 Script to run ECO 1
Figure 4.57 Timing report before ECO 2
Figure 4.58 Max trans report before ECO 2
Figure 4.59 Max cap report before ECO 2
Figure 4.60 Script to run ECO 2
Figure 4.61 Final timing after ECO 2
Figure 4.62 Final max tran/cap result after ECO 2
Figure 4.63 Design final power consumption
LIST OF TABLES
Table 3.1 Items in DesignStart’s example system
Table 3.2 Design specification at Place and route
Table 4.1 Powerplan specification
Table 4.2 Cell count by type between design input and placement
Table 4.3 Cell count by type between Placement and CTS
Table 4.4 Cell count by type between CTS and Route
Table 4.5 Cell count difference from Route to ECO2
Table 4.6 ECO summary result
LIST OF ABBREVIATIONS
CTS Clock Tree Synthesis
DRC Design Rule Check
ECO Engineering Change Order
LVS Layout Versus Schematic
PNR Place and Route
PG Power Ground
QoR Quality of Result
SoC System on Chip
STA Static Timing Analysis
THESIS SUMMARY
ARM Cortex-M is a family of 32-bit RISC ARM processors. These cores are optimized for low-cost, energy-saving microcontrollers. In particular, the Cortex-M0 core is optimized for small silicon die size and is used in the lowest-priced chips.
Many earlier topics focused on researching, designing, simulating, and implementing the ARM Cortex-M0 on FPGA. For this topic, the team decided to implement the ASIC physical design based on the netlist that was completely designed and simulated on FPGA in a previous thesis.
The main goal of this thesis is to implement Place & Route of the ARM Cortex-M0 from gate-level netlist to GDSII file. We implement the PnR flow in the following steps: Floorplan, Powerplan, Placement, Clock Tree Synthesis, and Routing using an automatic PnR tool, followed by timing Engineering Change Order (ECO) using a timing sign-off tool.
Four teams worked on the project: Synthesis, Design for Testability (DFT), Static Timing Analysis (STA), and Place and Route (PnR). In the first step, we (the PnR team) built a working environment to run the design. In the second step, we received the synthesized netlist and the DFT netlist, tested the design, and gave feedback to the Synthesis, STA, and DFT teams about timing and scan chain issues. Finally, we ran the final PnR flow and cooperated with the STA team, using the timing sign-off tool to verify and fix timing violations in the design.
In this report, we mainly discuss:
• An overview of the chip design flow: logic design and physical design (Front-End and Back-End).
• The basic steps in the Back-End section used to implement the design: floorplan, powerplan, placement, clock tree synthesis, routing.
• The implementation result of the ARM Cortex-M0 netlist using an automatic PnR tool.
Chapter 1 INTRODUCTION TO CHIP DESIGN FLOW
1.1 Introduction
A system on chip (SoC) is the integration of an entire system into a single chip. SoCs have become one of the most important branches of the semiconductor industry in recent years, allowing designs with integration levels of up to millions of logic gates. The SoC design process consists of two design phases: the Front-End design phase and the Back-End design phase.
The Front-End design phase handles the logical construction of the design, such as coding, simulation, setting constraints, and timing analysis.
The Back-End design phase converts the connections between logical cells from the Front-End design phase into connections between physical cells and actual nets.
1.2 SoC design flow
The full SoC design flow, with a description of each step, is shown in Figure 1.1.
FRONT-END (stages shown: Specification, Define architecture)
1. Understanding of design purposes.
2. Record misunderstanding points to [...]
6. RTL coding for modules.
7. Identify test case scenarios and specify test cases in programming languages (testbench).
8. RTL simulation and faults analysis.
9. Define constraints.
10. Synthesize and extract netlist.
11. Pre-layout timing analysis.
12. Pre-layout functional simulation.
BACK-END (stages shown: Place and route, Tape out)
13. Place and route.
14. Post-layout timing analysis.
15. Post-layout physical verification.
16. Make sure there are no errors left.
17. Write data.
18. Export data for fabrication.
Figure 1.1 SoC design flow
Chapter 2 BACK-END DESIGN FLOW USING SYNOPSYS TOOL
In the SoC design flow, the Back-End phase can use many tools to implement place and route for a design. IC Compiler II is one of those tools.
2.1 Introduction of IC Compiler II tool
Figure 2.1 GUI interface of IC Compiler II
IC Compiler II is a complete netlist-to-GDSII implementation system that includes early design exploration and prototyping, detailed design planning, block implementation, chip assembly, and sign-off driven design closure.
IC Compiler II includes innovative technologies for flat and hierarchical design planning, early design exploration, congestion-aware placement and optimization, clock tree synthesis, advanced node routing convergence, manufacturing compliance, and signoff closure.
Figure 2.1 shows the GUI display of Synopsys IC Compiler II.
2.2 Place and route design flow using IC Compiler II tool
2.2.2 Place and route design flow
Figure 2.2 illustrates the IC Compiler II place and route flow; each stage is described in Section 2.2.3.
Figure 2.2 IC Compiler II place and route flow
2.2.3 Description
Design inputs: set up the libraries and prepare the design data.
Design initialization: perform design planning and power planning. Create a floorplan to determine the size of the design, create the boundary and core area, create site rows for the placement of standard cells, set up I/O pads, and create a powerplan.
Placement and optimization: uses the place_opt command. This iterative process uses enhanced placement and synthesis technologies to generate legalized placement for leaf cells and an optimized design.
Clock tree synthesis and optimization: uses the clock_opt command. CTS is the process of connecting the clock to the clock pins of all sequential cells using inverters and buffers, in order to balance the skew and minimize the insertion delay. All the clock pins are driven by a single clock source. Clock balancing is important for meeting all the design constraints.
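To make the two CTS targets concrete, the following minimal Python sketch computes the insertion delay and the skew from a set of clock arrival times at sink pins. The sink names and latency values are hypothetical and are not taken from the Cortex-M0 design.

```python
# Hypothetical clock latencies (ns) from the clock source to each sink pin.
latencies_ns = {"reg_a/CK": 0.42, "reg_b/CK": 0.47, "reg_c/CK": 0.40}

def insertion_delay(latencies):
    """Longest source-to-sink clock latency."""
    return max(latencies.values())

def global_skew(latencies):
    """Difference between the latest and the earliest clock arrival."""
    return max(latencies.values()) - min(latencies.values())

print(f"insertion delay = {insertion_delay(latencies_ns):.2f} ns")
print(f"global skew     = {global_skew(latencies_ns):.2f} ns")
```

A CTS engine tries to reduce both numbers at once; adding buffers to slow the earliest sink lowers the skew without changing the insertion delay.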
Routing and optimization: uses the route_auto and route_opt commands. This stage follows clock tree synthesis and optimization. It determines the exact paths for the interconnections between standard cells, macros, and I/O pins, and creates the electrical connections in the layout using metals and vias, as defined by the logical connections present in the netlist. The tool performs global routing, track assignment, detail routing, topological optimization, and engineering change order (ECO) routing.
Chip finishing and design for manufacturing: the IC Compiler II tool provides chip finishing, design for manufacturing, and design for yield capabilities that can be applied throughout the various stages of the design flow to address process design issues encountered during chip manufacturing.
2.3 Libraries preparation and design inputs
2.3.1 Overview of Input setup for ICC2
Figure 2.3 shows an overview of the complete ICC2 input setup; descriptions are given in the following subsections.
The technology file provides details of metal layer technology parameters such as:
• Number and name of each metal layer/via
• Color and patterns for display
• Design rules (width, spacing, area, pitch, etc.)
• Via contact definitions (lower/upper metal layers, metal enclosure, etc.)
• Default via array rules
• Min/max density rules
Figure 2.4 shows a sample layer definition in a technology file.
Figure 2.4 Layer definition in technology file
❖ Parasitic Technology File (TLU+)
The TLU+ file specifies the RC model to be used for each metal layer and via. The PnR tool calculates the interconnect R and C values using the net geometry and the TLU+ look-up tables.
The TLU+ file is a binary table format. Its main points are:
• It provides the R and C parasitics of metal per unit length.
• These parasitics are used for calculating net delay.
• If TLU+ files are not given, the parasitics are extracted from the ITF (Interconnect Technology Format) file.
• Loading TLU+ requires three files: the Max TLU+ file, the Min TLU+ file, and the Map file.
• The Map file maps the layer and via names between the ITF file and the technology (.tf) file.
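As an illustration of how per-unit-length parasitics turn into a net delay, the sketch below applies the Elmore approximation to a uniform wire driving a lumped load. This is a simplified textbook model, not the extraction or delay calculation algorithm of the PnR tool, and the resistance, capacitance, and length values are hypothetical.

```python
def elmore_wire_delay(r_per_um, c_per_um, length_um, c_load=0.0):
    """Elmore delay (seconds) of a uniform RC wire driving a lumped load.

    r_per_um: wire resistance per micron (ohm/um)
    c_per_um: wire capacitance per micron (F/um)
    c_load:   lumped load capacitance at the far end (F)
    """
    r_wire = r_per_um * length_um
    c_wire = c_per_um * length_um
    # The distributed wire contributes R*C/2; the load sees the full wire resistance.
    return 0.5 * r_wire * c_wire + r_wire * c_load

# Hypothetical values: 0.5 ohm/um, 0.2 fF/um, 100 um wire, 2 fF load.
delay = elmore_wire_delay(0.5, 0.2e-15, 100.0, c_load=2e-15)
print(f"net delay = {delay * 1e12:.3f} ps")
```

Because the wire term grows with the square of the length, long nets are the ones where accurate R and C tables matter most.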
❖ Design library
> Logic library (.db, .lib)
The logic library provides the following information for standard cells and hard macros or IP (RAMs, ROMs, datapath, etc.):
• Logic functionality of standard cells (and, or, register, etc.)
• Power consumption (dynamic, leakage)
• P/G pin information for UPF support
> Physical library
Physical reference libraries contain the physical information of standard, macro, and pad cells that is necessary for placement and routing.
Macro, I/O, and standard cell FRAM views provide:
• Cell size/shape
• Pin locations and layers
• Routing blockages (over-the-cell)
The physical library also defines placement unit tile details such as:
• Height of the placement rows
• Minimum width resolution
• Preferred routing direction
• Pitch of routing tracks
Figure 2.5 gives an example of unit tile definition.
Figure 2.5 Unit tile definition example
> NDM library
NDM (New Data Model) is a Synopsys library format that stores cell information from place and route all the way to signoff. It contains information about design cells, standard cells, macro cells, and so on. It also contains physical descriptions, such as metal, diffusion, and polygon geometries, as well as logical information (functionality and timing characteristics) for every cell in the library.
NDM libraries are created by merging logical and physical models from the following sources:
• Logic libraries (.lib or .db files)
• Physical libraries (LEF and GDS files)
Synopsys Design Constraints (SDC), also called standard design constraints, contain the timing-related constraints that steer the design toward its specification. Timing constraints are saved in a common format that is supported by most tools, with an .sdc file extension.
Timing constraints are required to communicate the timing intent of the design to the tool. The .sdc file must be the same as the one used for synthesizing the netlist.
The constraints include the following:
• Clock definition
• Generated clock info
• Input/output delay
• Max/min delay
• Timing exceptions such as false paths and multicycle paths
• Clock uncertainty, transition, latency, etc.
Figure 2.7 shows some example content in an sdc file.
[Figure 2.7 content, partly illegible in source: an SDC excerpt with create_clock definitions for OSCCLK0 to OSCCLK3 (period 10, waveform {0 5}), input and output virtual clocks, and -max/-min input delays relative to input_virtual_clock on ports such as CB_nPOR, CB_nRST, and EXP.]
Figure 2.7 Example content in sdc file
❖ Power Intent
The Unified Power Format (.upf) is an IEEE standard used to define the power intent and related aspects of a multi-voltage design.
UPF contains supply set definitions, power domain definitions, power switch definitions, retention cell definitions, level shifter cell definitions, and other low-power-related definitions.
Figure 2.8 shows example content in a upf file.
create_power_domain PD -include_scope
create_supply_port VDD -direction in
create_supply_port VSS -direction in
create_supply_net VSS -domain PD
create_supply_net VDD -domain PD
connect_supply_net VSS -ports VSS
connect_supply_net VDD -ports VDD
set_domain_supply_net PD -primary_power_net VDD -primary_ground_net VSS
Figure 2.8 Example content in upf file
❖ Design Exchange Format (DEF) & SCANDEF
> Design Exchange Format
The Design Exchange Format (DEF) file is an ASCII representation of the physical information of the design. DEF contains property definitions, die area, row definitions, physical cell definitions, standard cell definitions, special nets, regular nets, ports, blockages, module constraints, etc.
Because the DEF file contains the physical information of the design, it can be dumped at any stage of PnR, such as Floorplan, Placement, CTS, Routing, or even after the ECO stages.
Figure 2.9 illustrates example content in a def file.
Figure 2.9 Example content in def file
> SCANDEF
A scan chain is a group of registers connected serially.
SCANDEF contains the connectivity information of the scan flip-flops. It is given at the import design stage for scan chain reordering, and it is also an input of the scan tracing stage.
Figure 2.10 shows example content in a scandef file.
Figure 2.10 Example content in scandef file
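The idea behind scan chain reordering can be sketched as follows: since the order of flip-flops inside a chain does not affect functionality, the chain can be re-stitched so that neighbouring flops in the chain are also neighbours in the layout. The greedy nearest-neighbour heuristic below is only an illustrative assumption, not the algorithm used by IC Compiler II, and the flop names and coordinates are hypothetical.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def reorder_scan_chain(flops, start):
    """Greedy nearest-neighbour ordering of scan flops, beginning at `start`."""
    remaining = dict(flops)
    order, here = [], start
    while remaining:
        name = min(remaining, key=lambda n: manhattan(remaining[n], here))
        here = remaining.pop(name)   # move to the chosen flop
        order.append(name)
    return order

# Hypothetical placed scan flops: name -> (x, y) in microns; scan-in port at (0, 0).
flops = {"ff_a": (90, 10), "ff_b": (10, 10), "ff_c": (50, 12)}
print(reorder_scan_chain(flops, start=(0, 0)))
```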
❖ Gate-level Netlist
This is also known as the synthesized netlist. It contains all the gate-level information and the connections between the gates.
It can be flat or hierarchical:
• A flat netlist contains only one module with all the information.
• A hierarchical netlist contains a number of modules, and these modules are instantiated by one top module.
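The relationship between the two forms can be sketched in a few lines of Python: flattening walks the hierarchy and keeps only leaf cells, naming each one by its hierarchical path. The module names, instance names, and cell names below are hypothetical, and the dictionary is a stand-in for a parsed netlist rather than a real Verilog reader.

```python
# Hypothetical hierarchical netlist: module -> list of (instance, reference).
# A reference that is not itself a module is treated as a leaf library cell.
modules = {
    "top": [("u_core", "core"), ("u_buf", "BUFX2")],
    "core": [("u_and", "AND2X1"), ("u_reg", "DFFX1")],
}

def flatten(module, modules, prefix=""):
    """Return (hierarchical instance name, leaf cell) pairs under a module."""
    leaves = []
    for inst, ref in modules[module]:
        path = f"{prefix}{inst}"
        if ref in modules:            # sub-module: descend into it
            leaves.extend(flatten(ref, modules, prefix=path + "/"))
        else:                         # leaf cell: keep it
            leaves.append((path, ref))
    return leaves

for name, cell in flatten("top", modules):
    print(name, cell)
```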
The common netlist file formats are .v (or .vg) and .ddc:
• .v: contains the net connectivity information between cells and macros, and the gate-level descriptions of the cells.
• .ddc: contains the net connectivity information as well as the scan chain information and the gate-level descriptions of the cells.
Figure 2.11 shows example content in a v file.
Figure 2.11 Example content in v file
2.4 Floorplan
2.4.1 Overview of Floorplan stage
Floorplanning is the most important stage in Physical Design. It directly affects congestion and routing issues, IR drop, timing, and other aspects of a design.
In floorplanning, we define the size and shape of the chip or block and place the I/O pins/pads, macros, and blockages in the core or chip area so that routing space between them can be found effectively. At the floorplanning stage, we also reserve space for the placement of standard cells.
The quality of the chip is determined by its floorplan. A well-organized floorplan results in more efficient utilization of the core area, which helps standard cell placement avoid issues related to congestion, timing, signal integrity, etc.
A bad floorplan affects the area, power, and reliability of the chip, requires more effort for closure, and can increase the overall IC cost. It can create all kinds of issues in the design, such as congestion, timing, noise, and routing issues.
Floorplan inputs:
• Synthesis netlist (.v)
• Design constraints (.sdc)
• Macro placement file (optional)
• Floorplanning control parameters
• Physical and logical libraries
Floorplan outputs:
• Die/block area
• Floorplan design database
• I/O pads placed
• Macros placed
• Standard cell placement areas
2.4.2 Basic terminologies before Floorplan
❖ Macro
A macro is an Intellectual Property (IP) block in a design that is owned by a company. Macros are reusable logic blocks used in a design without the need to build them from scratch. The two types of macro are soft macros and hard macros.
A soft macro is not specific to any technology node. Because of this, soft macros are unpredictable in terms of timing, area, and power. However, soft macros are more flexible in terms of reconfigurability and can be modified at the RTL level.
A hard macro is what is called a block in physical design. It is designed for a specific technology node to meet timing, area, and power.
❖ Physical cells
These cells do not have any logical functionality in the design. Some of the standard physical cells are tap cells, tie cells, endcap cells, decap cells, filler cells, and spare cells.
❖ Pin
A pin is an I/O terminal that is present in blocks, hard macros, or standard cells of a design.
Example: for a 2-input AND gate, CELL_AND_1/a and CELL_AND_1/b are the input pins and CELL_AND_1/z is the output pin.
❖ Port
A port is an I/O (Input/Output) terminal that is present in blocks or hard macros of a design. From the top level, ports are pins of hard macros or blocks. From the block level, pins that talk to the top level are called ports.
The direction of a port can be input, output, or in-out.
❖ Placement blockage
A placement blockage is an area defined by the designer in which the PnR tool must avoid placing or overlapping standard cells. If a block, standard cell, macro, or I/O pad is moved, placement blockages do not move along with it.
Hard placement blockage means that the tool must not place or overlap any standard cell (including buffers and inverters) in the blockage area.
Soft placement blockage means that the tool can place only buffers and inverters in the blockage area, but no other standard cells of the design.
Partial blockage means that the designer sets the percentage of the area that is blocked and the tool should honor it. For example, a blocked percentage of 60% means the tool can use 40% of that area to place standard cells.
Figure 2.12 illustrates examples of blockages.
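The partial blockage arithmetic can be written out as a short Python sketch. The function name and the blockage dimensions are hypothetical; only the 60%/40% relation comes from the text above.

```python
def usable_area(blockage_area_um2, blocked_percentage):
    """Area still available to standard cells inside a partial blockage."""
    if not 0 <= blocked_percentage <= 100:
        raise ValueError("blocked_percentage must be between 0 and 100")
    return blockage_area_um2 * (100 - blocked_percentage) / 100

# Hypothetical 50 um x 20 um partial blockage with a 60% blocked percentage.
print(usable_area(50 * 20, 60))
```

With these values, 40% of the 1000 um^2 region, that is 400 um^2, remains available for standard cells.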
❖ Routing blockage
A routing blockage is an area defined by the designer in which the PnR tool must not use routing resources on one or more metal layers. Routing blockages can be created and removed at any point in the design, based on the requirements.
It is possible to create routing blockages over a block or an instance using its cell type or instance name, without specifying the area coordinates.
Rows are multiples of sites.
Figure 2.13 shows the site rows in a design.
Figure 2.13 Site row in design
❖ Track
Tracks form the grid for metal routing; metal shapes run vertically or horizontally along the tracks. Tracks are created by the create_track command, by the initialize_floorplan command, or by loading the def file. All tracks must be inside the die area.
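As a rough sketch of what a routing track grid looks like numerically, the following Python function lists track coordinates for one layer from an offset and a pitch, keeping every track inside the die. The function and its values are hypothetical and use integer database units (nanometres here) to avoid rounding issues; this is not the implementation of create_track.

```python
def track_positions(die_low, die_high, pitch, offset):
    """Track coordinates between die_low and die_high (inclusive), in database units."""
    if pitch <= 0:
        raise ValueError("pitch must be positive")
    return list(range(die_low + offset, die_high + 1, pitch))

# Hypothetical values in nanometres: 1000 nm wide die, 200 nm pitch, 100 nm offset.
print(track_positions(0, 1000, pitch=200, offset=100))
```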
2.4.3 Floorplan initialization
Floorplan initialization defines the standard cell placement site array within the core area. Various core shapes are possible: rectangular, L-shape, U-shape, and T-shape.
Figure 2.14 shows the types of core shape that can be used in the design.
Figure 2.14 Types of core shape
Die area, Core area:
The die area is the area of the block or chip. All design objects must be inside the die boundary.
The core area is the area in which standard cells are placed. It contains all site rows. Macros can also be placed in the core area.
Figure 2.15 Die and core boundary in design
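A first estimate of the core size usually starts from the total standard cell area and a target utilization, after which the core is divided into site rows. The Python sketch below shows this arithmetic under stated assumptions: utilization is taken as cell area divided by core area, the aspect ratio is height over width, and all numbers (cell area, utilization, unit tile size) are hypothetical rather than the values used for the Cortex-M0 design.

```python
import math

def core_dimensions(cell_area_um2, utilization, aspect_ratio=1.0):
    """Core (width, height) in um for a target utilization; aspect = height / width."""
    core_area = cell_area_um2 / utilization
    width = math.sqrt(core_area / aspect_ratio)
    return width, width * aspect_ratio

def site_rows(core_width, core_height, site_width, site_height):
    """Number of rows, and sites per row, that fit inside the core."""
    return int(core_height // site_height), int(core_width // site_width)

# Hypothetical: 5000 um^2 of cells, 50% utilization, 0.25 um x 2.0 um unit tile.
w, h = core_dimensions(5000.0, 0.5)
rows, sites = site_rows(w, h, site_width=0.25, site_height=2.0)
print(f"core = {w:.1f} x {h:.1f} um, {rows} rows of {sites} sites")
```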
2.4.4 Macros placement
Macros are placed manually, following the requirements below:
• Macros should be placed at the periphery of the block.
• Interacting macros should be placed near each other, which is also known as [...]
• There should be a minimum gap between the macros.
• Channels should be left for routing pins and for buffer insertion, which helps timing optimization when buffers need to be added.
• Check that all macros are inside their power domain fence.
Figure 2.16 illustrates an example of macro placement.
Latch-up condition: latch-up basically means a short circuit condition between power and ground. Due to this short circuit condition, a low impedance path is created. Tap cells are used to limit the resistance between the power and ground connections and the wells of the substrate.
Figure 2.17 shows an example of where tap cells are placed in the design.
Figure 2.17 Tap cell placement example
Boundary cap cells are technology dependent.
Figure 2.18 illustrates where boundary cells are placed in the design.
present in core area
Power planning is also called pre-routing because Power Network Synthesis (PNS) is done before actual signal routing and clock routing.
❖ Input & output of powerplan
Input files needed for power planning:
• Database of floorplan
• Power parameters: min, max, width, spacing, etc.
Output of power planning:
• Database of powerplan, containing the power structure
• Reports
• Clean check_pg_drc, check_pg_connectivity, and check_pg_missing_vias results
The power network contains the following:
• Power ring: carries VDD and VSS around the core
• Power strap: carries VDD and VSS from the rings across the core
• Power rails: connect VDD and VSS to the standard cells
2.5.2 Powerplan structure
❖ Power rings
Core rings: the rings around the core that provide power from the pad ring to the core structures.
Macro rings: the rings around one or more macros, connected to the core rings to provide power for the macros.
Figure 2.19 shows an example of core rings and macro rings.
Figure 2.20 illustrates the power mesh structure of the design.
Figure 2.21 Power rails structure
2.5.3 Checks after powerplan
There are three commands to run after power planning:
• check_pg_missing_vias: checks for missing vias between overlapping regions of different metal layers.
• check_pg_connectivity: checks the physical connectivity of the power and ground network. The command generates information about floating wires, vias, pads, macro pins, and standard cells.
• check_pg_drc: checks and reports violations of technology design rules and illegal overlaps of objects (shapes, vias, and pins) of power and ground nets. Checking is performed either on the entire design area or in the area specified by the coordinates option.
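The notion of a floating shape in a connectivity check can be illustrated with a small graph traversal: any power shape that cannot be reached from a supply source through vias or overlaps is floating. The Python sketch below is only a conceptual model under that assumption, not the implementation of check_pg_connectivity, and the shape names are hypothetical.

```python
from collections import deque

def floating_shapes(shapes, connections, sources):
    """Return the shapes that cannot be reached from any power source.

    shapes:      iterable of shape names
    connections: pairs of shapes joined by a via or an overlap
    sources:     shapes tied directly to a supply (for example, pad pins)
    """
    neighbours = {s: set() for s in shapes}
    for a, b in connections:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, queue = set(sources), deque(sources)
    while queue:                      # breadth-first walk of the PG network
        for nxt in neighbours[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(set(shapes) - seen)

# Hypothetical VDD network: strap_2 has no via to the ring, so it floats.
shapes = ["pad", "ring", "strap_1", "strap_2", "rail_1"]
connections = [("pad", "ring"), ("ring", "strap_1"), ("strap_1", "rail_1")]
print(floating_shapes(shapes, connections, sources=["pad"]))
```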
2.6 Placement
2.6.1 Overview of placement stage
❖ Introduction to placement
Once the floorplan is complete and all macros have been placed inside the core boundary, the standard cells still sit outside the core area. Placing these standard cells is the task of the placement stage.
Placement is the process of determining the locations of the standard cells present in the netlist by placing them inside the core area.
The cells exist logically in the netlist; the tool uses their physical information to place each one at a suitable location.
Placement is one of the most challenging and important phases in PnR, because good placement leads to good routing.
The library (.ndm) contains a number of cells of the same kind. The tool looks at the logic in the netlist and picks a cell that satisfies the input constraints while balancing the trade-offs of the design.
❖ Input & output of placement
Input files needed for placement:
• Database of powerplan
• Logical and physical library (.ndm)
• Design constraints
• Technology file
Output files of placement:
• Reports (timing, congestion, cell density, pin density, etc.)
• Database of placement
❖ Congestion
> Definition
If the routing resources required in an area exceed the number of available routing tracks, the area becomes congested. High congestion causes detours and leads to worse results. Congestion can also make the design unroutable, meaning that routing will not converge while the congestion remains.
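The definition above can be illustrated with a small sketch. It assumes a hypothetical routing grid in which each region has a number of required tracks and a number of available tracks; the grid, the numbers, and the function names are made up for illustration and are not taken from the tool.

```python
from typing import Dict, Tuple

# Hypothetical routing grid: (column, row) -> (required tracks, available tracks).
# All numbers are illustrative assumptions.
GRID: Dict[Tuple[int, int], Tuple[int, int]] = {
    (0, 0): (8, 10),
    (1, 0): (12, 10),
    (0, 1): (10, 10),
    (1, 1): (15, 10),
}


def overflow(required: int, available: int) -> int:
    """Tracks demanded beyond what the region can supply (0 if it fits)."""
    return max(0, required - available)


def congested_cells(grid: Dict[Tuple[int, int], Tuple[int, int]]) -> Dict[Tuple[int, int], int]:
    """Return {region: overflow} for every region whose demand exceeds its capacity."""
    return {
        cell: overflow(required, available)
        for cell, (required, available) in grid.items()
        if required > available
    }


hotspots = congested_cells(GRID)
total_overflow = sum(hotspots.values())
print(hotspots)        # {(1, 0): 2, (1, 1): 5}
print(total_overflow)  # 7
```

A region with zero overflow is routable on its own; regions with positive overflow are the hotspots where the router has to detour.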
> Types of congestion
There are basically two types of congestion:
• Placement congestion
• Routing congestion
Both types of congestion need to be avoided in the design.
> Reasons for congestion
Congestion can have several causes:
• Bad floorplan
• High standard cell density in a particular area
• High pin density in a particular area
• Missing or too-small blockages near macros
> Fixes for congestion
Congestion issues in the design can be fixed in the following ways:
• Use blockages in the design; partial blockages help in a more optimized way
• Re-arrange modules and macros
• Use tool app options to control congestion
❖ Placement objectives
The goals of placement include:
• Timing, power, and area optimization
• Routable design (minimal congestion)
• No or minimal cell density, pin density, and congestion hotspots
• Minimal timing DRCs
2.6.2 Timing concepts
❖ Introduction
> Static timing analysis
Static timing analysis (STA) is a method of validating the timing performance of a design by checking all possible paths for timing violations.
STA breaks a design down into timing paths, calculates the signal propagation delay along each path, and checks for violations of timing constraints inside the design and at the input/output interface.
STA can be performed by the following tools:
PnR tools: IC Compiler II, Innovus, etc.
Sign-off STA tools: PrimeTime, Tempus, etc.
> How STA works
When performing timing analysis, STA first breaks the design down into timing paths. Each timing path consists of the following elements:
Startpoint: the start of a timing path, where data is launched by a clock edge or where the data must be available at a specific time. Every startpoint must be either an input port or a register clock pin.
Combinational logic network: elements that have no memory or internal state. Combinational logic can contain AND, OR, XOR, and inverter elements, but cannot contain flip-flops, latches, registers, or RAM.
Endpoint: the end of a timing path, where data is captured by a clock edge or where the data must be available at a specific time. Every endpoint must be either a register data input pin or an output port.
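The decomposition into startpoints, combinational logic, and endpoints can be sketched on a toy graph. The netlist, pin names, and delays below are hypothetical, and the exhaustive path enumeration is only meant to show the idea on a small example; it is not a description of how IC Compiler II or PrimeTime computes delays internally.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical timing graph: node -> list of (fanout node, arc delay in ps).
# "in1" is an input port and "ff1/CK" a register clock pin (startpoints);
# "ff2/D" is a register data pin and "out1" an output port (endpoints).
EDGES: Dict[str, List[Tuple[str, int]]] = {
    "in1": [("and1", 100)],
    "ff1/CK": [("and1", 300), ("inv1", 300)],
    "and1": [("ff2/D", 200)],
    "inv1": [("out1", 150), ("ff2/D", 250)],
}
STARTPOINTS = ["in1", "ff1/CK"]
ENDPOINTS = {"ff2/D", "out1"}


def enumerate_paths(
    node: str, delay: int = 0, trail: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], int]]:
    """Yield (path, total delay) for every path from node to an endpoint."""
    trail = trail + (node,)
    if node in ENDPOINTS:
        yield trail, delay
        return
    for fanout, arc_delay in EDGES.get(node, []):
        yield from enumerate_paths(fanout, delay + arc_delay, trail)


paths = [path for start in STARTPOINTS for path in enumerate_paths(start)]

# Keep the largest arrival time seen at each endpoint (the setup-critical one).
worst_arrival: Dict[str, int] = {}
for trail, delay in paths:
    endpoint = trail[-1]
    worst_arrival[endpoint] = max(worst_arrival.get(endpoint, 0), delay)

for trail, delay in paths:
    print(" -> ".join(trail), f"{delay} ps")
print(worst_arrival)  # {'ff2/D': 550, 'out1': 450}
```

Each printed line is one timing path: it starts at an input port or a register clock pin, passes through combinational cells only, and ends at a register data pin or an output port.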
❖ Timing path
Figure 2.22 shows the basic timing paths in an ASIC design.
In this example, each logic cloud represents a combinational logic network. Each path starts at a data launch point, passes through some combinational logic, and ends at a data capture point.
Figure 2.23 Types of timing path
Figure 2.23 shows the types of path used in timing checks. They are described below:
Clock path: a path from a clock input port or cell pin, through one or more buffers or inverters, to the clock pin of a sequential element; used for setup and hold checks.
Clock gating path: a path from an input port to a clock-gating element; used for clock-gating setup and hold checks. The gating element can be a gating cell or a combinational cell.
Asynchronous path: a path from an input port to an asynchronous set or clear pin of a sequential element; used for recovery and removal checks.
False path: a path that is never sensitized due to the logic configuration, expected data sequence, or operating mode.
Multicycle path: a path that is designed to take more than one clock cycle from launch to capture.
Minimum or maximum delay path: a path that must meet a delay constraint that you explicitly specify as a time value.
❖ Setup time, hold time
Clock period: $T_{period} = 1 / \text{Frequency}$
Flip-flop cell delay: $T_{clk \to q}$ is the delay from Clk to Q of the flip-flop when the clock is active
Combinational logic delay: $T_{q \to d}$ is the delay of the combinational logic
Data path delay $= T_{clk \to q} + T_{q \to d}$
Launch clock latency: the delay from the clock source to the clock pin of the startpoint
Capture clock latency: the delay from the clock source to the clock pin of the endpoint
Clock skew: $T_{skew} = \text{Capture clock latency} - \text{Launch clock latency}$
Setup time ($T_{setup}$) is the minimum amount of time before the clock’s active edge that the data must be stable
Hold time ($T_{hold}$) is the minimum amount of time after the clock’s active edge during which the data must be stable
Setup slack $= T_{period} - (T_{clk \to q} + T_{q \to d} + T_{setup} - T_{skew})$
Hold slack $= T_{clk \to q} + T_{q \to d} - T_{hold} - T_{skew}$
Slack >= 0 means the timing path meets the timing requirement
Slack < 0 means the timing path violates the timing requirement
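As a worked example of the slack definitions above, the sketch below evaluates both formulas for one path. All delay values, the 500 MHz frequency, and the function names are assumptions chosen for illustration; they are not measurements from the Cortex-M0 design.

```python
def clock_skew(capture_latency: int, launch_latency: int) -> int:
    """Clock skew = capture clock latency - launch clock latency (ps)."""
    return capture_latency - launch_latency


def setup_slack(t_period: int, t_clk_q: int, t_comb: int, t_setup: int, t_skew: int) -> int:
    """Setup slack = Tperiod - (Tclk->q + Tcomb + Tsetup - Tskew), all in ps."""
    return t_period - (t_clk_q + t_comb + t_setup - t_skew)


def hold_slack(t_clk_q: int, t_comb: int, t_hold: int, t_skew: int) -> int:
    """Hold slack = Tclk->q + Tcomb - Thold - Tskew, all in ps."""
    return t_clk_q + t_comb - t_hold - t_skew


# Assumed example values (picoseconds).
frequency_hz = 500e6
t_period = round(1e12 / frequency_hz)  # 2000 ps
t_skew = clock_skew(capture_latency=350, launch_latency=300)  # 50 ps

slow_path = setup_slack(t_period, t_clk_q=150, t_comb=1600, t_setup=100, t_skew=t_skew)
too_slow_path = setup_slack(t_period, t_clk_q=150, t_comb=1900, t_setup=100, t_skew=t_skew)
fast_path = hold_slack(t_clk_q=150, t_comb=40, t_hold=60, t_skew=t_skew)

print(slow_path)      # 200  -> setup met
print(too_slow_path)  # -100 -> setup violated
print(fast_path)      # 80   -> hold met
```

With these numbers a positive skew of 50 ps relaxes the setup check by 50 ps and tightens the hold check by the same amount, which is why skew appears with opposite effect in the two formulas.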
❖ Operating modes
A chip usually works in the following main modes:
• Test modes: the modes used to test the chip after manufacturing, to determine whether it has any fault caused by manufacturing.
If the chip has a fault, its behavior will not match the customer's specification; it is then used for fault debugging or discarded.
If the chip passes the test modes, it has no detected fault and works correctly, so it is delivered to the customer.
Testing has several different modes. Each mode has its own SDC file, released by the DFT designer who inserted the DFT circuit.
• Functional modes: these modes operate all the functions that the customer requires.
Functional modes have their own SDC files, released by the logic designer who designed the functional logic circuit.
❖ Timing corners
> PVT variations
A chip is designed to work over a range of temperatures and voltages, and it has to work under different environmental conditions, electrical setups, and user environments.
In addition, the process by which the chip is manufactured has variations, and the design has to cover all of them so that the chip works after manufacturing.
Combining process, voltage, and temperature gives the PVT variations.
In STA, each PVT variation refers to a different library lookup table file. For example, for the operating condition with process sspg, voltage 0.9 V, and temperature 125 °C, timing is calculated using the library sspg_0p9v_125C.
> Parasitic corners & timing corners
Parasitic corner: modeling the RC variation of a net gives the extraction corners Rmin, Cmin, RCmin, Rmax, Cmax, and RCmax. Each parasitic corner refers to its own TLU+ file.
Timing corner: combining a PVT corner and an extraction corner gives a timing corner.
[Figure: Modes list, Corners list, Scenarios list]
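The way PVT and parasitic corners multiply into timing corners can be sketched as follows. The second PVT name, the mode names, and the naming scheme are hypothetical. Pairing every mode with every corner to form a scenario is an assumption for illustration; in practice only a selected subset of these combinations is usually analyzed.

```python
from itertools import product

# PVT corners: the first name comes from the text, the second is a made-up example.
PVT_CORNERS = ["sspg_0p9v_125C", "ffpg_1p1v_m40C"]
# Extraction (parasitic) corners listed in the text.
PARASITIC_CORNERS = ["Rmin", "Cmin", "RCmin", "Rmax", "Cmax", "RCmax"]
# Hypothetical mode names, one functional and one test mode.
MODES = ["func", "test"]

# A timing corner is one PVT corner combined with one parasitic corner.
timing_corners = [f"{pvt}_{rc}" for pvt, rc in product(PVT_CORNERS, PARASITIC_CORNERS)]

# Assumption: a scenario pairs one operating mode with one timing corner.
scenarios = [f"{mode}_{corner}" for mode, corner in product(MODES, timing_corners)]

print(len(timing_corners))  # 12
print(len(scenarios))       # 24
print(scenarios[0])         # func_sspg_0p9v_125C_Rmin
```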
α -> Alpha factor; its value is normally 0.5
F -> Frequency of the design
C -> Load capacitance
VDD -> Power supply voltage
> Short circuit power
There is a moment during a transition when both the PMOS and the NMOS are at the threshold level. At this moment, the PMOS and the NMOS both conduct, and a direct path is created from VDD to VSS. This short circuit between VDD and VSS results in a power loss called short circuit power. Since power is calculated from the current and the supply voltage, it is expressed as follows:
Short circuit power = VDD * Isc
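A small numeric sketch of the two power components follows. The short circuit expression is the one given above. The switching power expression α · F · C · VDD² is an assumption: it is the standard formula that the symbol list (α, F, C, VDD) appears to describe, but it does not survive in this part of the text. All numeric values and function names are illustrative.

```python
def switching_power(alpha: float, freq_hz: float, cap_farad: float, vdd: float) -> float:
    """Assumed standard switching power: alpha * F * C * VDD^2 (watts)."""
    return alpha * freq_hz * cap_farad * vdd ** 2


def short_circuit_power(vdd: float, i_sc: float) -> float:
    """Short circuit power = VDD * Isc (watts), as given in the text."""
    return vdd * i_sc


# Illustrative values: alpha = 0.5, 100 MHz, 1 pF load, 1.0 V supply, 10 uA Isc.
p_switch = switching_power(alpha=0.5, freq_hz=100e6, cap_farad=1e-12, vdd=1.0)
p_short = short_circuit_power(vdd=1.0, i_sc=10e-6)

print(f"{p_switch * 1e6:.1f} uW")  # 50.0 uW
print(f"{p_short * 1e6:.1f} uW")   # 10.0 uW
```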