
Digital design C to RTL


Principles of Digital Design

C to RTL
• Control/data flow graphs
• Finite-state machine with data
• IP design
• Component selection
• Connection selection
• Operator and variable mapping
• Scheduling and pipelining

Copyright © 2010-2011 by Daniel D. Gajski, EECS31/CSE31, University of California, Irvine

Topic preview
• Boolean algebra
• Logic gates and flip-flops
• Finite-state machine
• Binary system and data representation
• Sequential design techniques
• Combinational components
• Generalized finite-state machines
• Logic design techniques
• Storage components
• Register-transfer design (8)
• Processor components

Register-transfer-level design
• Each standard or custom IP component consists of one or more datapaths and control units
• To synthesize such IP we use the models of a CDFG and an FSMD
• We demonstrate IP synthesis (RTL design), including component and connectivity selection, expression mapping, scheduling, and pipelining

Motivation: Ones-counter
Problem: generate the controller and the control words for a given FSMD and datapath.
• Control unit: inputs Start and the status signal Data = 0; outputs Done and the control signals (20-bit control words)
• Datapath variables: Data, Mask, Temp, Ocount
• FSMD: the ones-counter states s0 to s7 (Data = Input; Ocount = 0; Mask = 1; Temp = Data AND Mask; Ocount = Ocount + Temp; Data = Data >> 1; Done = 1, Output = Ocount)
[Figure: datapath with Input, selector (S), 8 × m register file (WA, WE, RAA, REA, RAB, REB), ALU (M, S1, S0), shifter (S2, S1, S0, IL, IR), and Output]

Ones Counter from C Code
• Programming language semantics: sequential execution; coding style to minimize coding
• HW design: parallel execution; communication through signals

Function-based C code:

    int OnesCounter(int Data) {
        int Ocount = 0;
        int Temp, Mask = 1;
        while (Data > 0) {
            Temp = Data & Mask;
            Ocount = Ocount + Temp;
            Data >>= 1;
        }
        return Ocount;
    }

RTL-based C code:

    while (1) {
        while (Start == 0);
        Done = 0;
        Data = Input;
        Ocount = 0;
        Mask = 1;
        while (Data > 0) {
            Temp = Data & Mask;
            Ocount = Ocount + Temp;
            Data >>= 1;
        }
        Output = Ocount;
        Done = 1;
    }

CDFG for Ones Counter
Control/data flow graph:
• Resembles programming language
• Loops, ifs, basic blocks (BBs)
• Explicit dependencies
• Control dependences between BBs
• Data dependences inside BBs
• Missing dependencies between BBs
[Figure: CDFG of the RTL-based C code, with basic blocks over Start, Input, Data, Mask, Ocount, Done and the operations &, +, >>1, >0]

CDFG to FSMD for Ones Counter
Super-state FSMD (one state per basic block):
    S0: wait while Start = 0
    S2: Data = Input; Ocount = 0; Mask = 1; Done = 0
    S5: Temp = Data AND Mask; Ocount = Ocount + Temp; Data = Data >> 1; repeat while Data ≠ 0
    S7: Output = Ocount; Done = 1

Cycle-accurate FSMD (one state per clock cycle):
    S0: wait while Start = 0; go to S1 when Start = 1
    S1: Done = 0; Data = Input
    S2: Ocount = 0
    S3: Mask = 1
    S4: Temp = Data AND Mask
    S5: Ocount = Ocount + Temp
    S6: Data = Data >> 1; go to S4 if Data ≠ 0, go to S7 if Data = 0
    S7: Done = 1; Output = Ocount

FSMD for Ones Counter
• FSMD is more detailed than CDFG
• States may represent clock cycles
• Conditionals and statements are executed concurrently
• All statements in each state are executed concurrently
• Control signal and variable assignments are executed concurrently
• FSMD includes scheduling
• FSMD doesn't specify binding or connectivity

FSMD Definition
We defined an FSM as a quintuple $\langle S, I, O, f, h \rangle$, where $S$ is a set of states, $I$ and $O$ are the sets of input and output symbols, and
    $f: S \times I \rightarrow S$ and $h: S \rightarrow O$
More precisely,
    $I = A_1 \times A_2 \times \dots \times A_k$
    $S = Q_1 \times Q_2 \times \dots \times Q_m$
    $O = Y_1 \times Y_2 \times \dots \times Y_n$
where $A_i$ is an input signal, $Q_i$ is a state register output, and $Y_i$ is an output signal.
To define an FSMD, we define a set of variables $V = V_1 \times V_2 \times \dots \times V_q$, which defines the state of the datapath by defining the values of all variables in each state with the set of expressions $Expr(V)$:
    $Expr(V) = Const \cup V \cup \{ e_i \,\#\, e_j \mid e_i, e_j \in Expr(V),\ \# \text{ is an operation} \}$
Notes: a status signal is a signal in $I$; control signals are signals in $O$; datapath inputs and outputs are variables in $V$.

RTL Design Model
[Figure: high-level block diagram. The control unit receives control inputs and status signals and produces control outputs and control signals; the datapath receives datapath inputs and control signals and produces datapath outputs and status signals]
[Figure: register-transfer-level block diagram. Control unit: state register, next-state logic, output logic (in a second version: program counter, next-address logic, program memory). Datapath: selector, register, RF, Mem, ALU, */÷ unit, Out Reg, connected by Bus 1, Bus 2, Bus 3]

C-to-RTL design
• RTL generation requires definition of
    - controller
    - datapath
• RTL generation of a controller requires choice of
    - state register (program counter)
    - output logic (program memory)
    - next-state logic (next-address generator)
• RTL generation of a datapath RTL component and connectivity...

Merging variables with common sources and destination
Partial FSMD:
    si: x = a + b
    sj: y = c + d
[Figure: datapath without register sharing (separate registers a, b, c, d, x, y) and datapath with register sharing (registers a,c and b,d and x,y around one adder)]

FU sharing (Operator merging)
• Group non-concurrent operations
• Each group shares one functional unit
• Sharing reduces the number of functional units
• Grouping also reduces connectivity
• Clustering algorithms are used for grouping

FU-sharing motivation
Partial FSMD:
    si: x = a + b
    sj: y = c - d
[Figure: non-shared design with one adder and one subtractor; shared design with two selectors feeding a single +/- unit]

Operator-merging for SRA
Square-root approximation (SRA) FSMD:
    s0: a = In1; b = In2; wait while Start = 0, go to s1 when Start = 1
    s1: t1 = |a|; t2 = |b|
    s2: x = max(t1, t2); y = min(t1, t2)
    s3: t3 = x >> 3; t4 = y >> 1
    s4: t5 = x - t3
    s5: t6 = t4 + t5
    s6: t7 = max(t6, x)
    s7: Done = 1; Out = t7
[Figure: compatibility graph and partitioned compatibility graph over the operations |a|, |b|, min, max, +, -]
[Figure: datapath after variable and operator merging, with registers R1, R2, R3, selectors, functional units [abs/max] and [abs/min/+/-], and shifters >>1 and >>3]

Bus sharing (connection merging)
• Group connections that are not used concurrently
• Each group forms a bus
• Connection merging reduces the number of wires
• Clustering algorithms work well

Connection merging in SRA datapath
Bus assignment:
• Bus1 = [ A, C, D, E, H ]
• Bus2 = [ B, F, G ]
• Bus3 = [ I, K, M ]
• Bus4 = [ J, L, N ]
[Figure: datapath after variable and operator merging, with connections labeled A to N]
[Table: connectivity usage table, connections A to N versus states s0 to s7]
[Figure: compatibility graph for input buses and compatibility graph for output buses]
[Figure: datapath after variable, operator and connectivity merging, with R1, R2, R3, Bus 1 to Bus 4, units labeled [abs/max/+/-] and [abs/min], and shifters >>1 and >>3]

Register merging into register files
• Group registers with non-overlapping accesses
• Each group is assigned to one register file
• Register grouping reduces the number of ports, and therefore the number of buses
• Use some clustering algorithms

Register merging
Register assignment:
• R1 = [ a, t1, x, t7 ]
• R2 = [ b, t2, y, t3, t5, t6 ]
• R3 = [ t4 ]
FSMD: square-root approximation, states s0 to s7 as listed under "Operator-merging for SRA".
[Figure: compatibility graph of R1, R2, R3]
[Table: register access table, R1 to R3 versus states s0 to s7]
[Figure: datapath after register merging, with In1, In2, R1, R2, R3, four buses, [abs/max], [abs/min/+/-], >>3, >>1, and Out]

Chaining and multi-cycling
• Chaining allows serial execution of two or more operations in each state
• Chaining reduces the number of states and increases performance
• Multi-cycling allows one operation to be executed over two or more clock cycles
• Multi-cycling reduces the size of functional units
• Multi-cycling is used on noncritical paths to improve resource utilization

SRA datapath with chained units
Square-root approximation FSMD:
    s0: a = In1; b = In2; wait while Start = 0, go to s1 when Start = 1
    s1: t1 = |a|; t2 = |b|
    s2: x = max(t1, t2); t3 = max(t1, t2) >> 3; t4 = min(t1, t2) >> 1
    s3: t5 = x - t3
    s4: t6 = t4 + t5
    s5: t7 = max(t6, x)
    s6: Done = 1; Out = t7
Register assignment: R1 = [ a, t1, x, t7 ], R2 = [ b, t2, y, t3, t5, t6 ], R3 = [ t4 ]
[Figure: datapath schematic with R1, R2, R3, four buses, [abs/max] and [abs/min/+/-] chained to >>3 and >>1, and Out]

SRA datapath with multi-cycle units
Square-root approximation FSMD: states s0 to s6 as listed under "SRA datapath with chained units".
Register assignment: R1 = [ a, t1, x, t7 ], R2 = [ b, t2, y, t3, t5, t6 ], R3 = [ t4 ]
[Figure: datapath schematic with R1, R2, R3, four buses, [abs/max], [abs/+/-], >>1, >>3, and Out]

Pipelining
• Pipelining improves performance at a very small additional cost
• Pipelining divides the design into stages and uses all stages concurrently for different data (assembly-line principle)
• The pipelining principle works on several levels:
    (a) Unit pipelining
    (b) Control pipelining
    (c) Datapath pipelining

SRA datapath with single AU
Square-root approximation FSMD for a single AU:
    s0: a = In1; b = In2; wait while Start = 0, go to s1 when Start = 1
    s1: t1 = |a|
    s2: t2 = |b|
    s3: x = max(t1, t2); t3 = max(t1, t2) >> 3
    s4: t4 = min(t1, t2) >> 1
    s5: t5 = x - t3
    s6: t6 = t4 + t5
    s7: t7 = max(t6, x)
    s8: Done = 1; Out = t7
[Figure: datapath schematic with R1, R2, R3, buses, a single AU, shifters >>3 and >>1, and Out]
[Figure: timing diagram with rows Read R1, Read R2, Read R3, AU stage 1, AU stage 2, shifters, Write R1, Write R2, Write R3, Outport over states s0 to s12]

Pipelined FSMD implementation
Example FSMD:
    s0: go to s1 if a > b, go to s2 if a ≤ b
    s1: x = c + d
    s2: y = x - 1
[Figure: standard FSMD implementation. Control unit: state register, next-state logic, output logic. Datapath: selector, RF, ALU, Out Reg, Bus 1, Bus 2. Signals: control inputs, control outputs, control signals, status signals, datapath inputs, datapath outputs]
[Figure: timing diagram with rows Read SReg, Write CReg, Read CReg, Read RF, Write ALUIn, Read ALUIn, ALU stages, shifters, Write RF, Write Status, Read Status, Write SR]

Summary
We introduced RTL design:
• FSMD model
• RTL specification with
    - FSMD
    - CDFG
• Procedure for synthesis from RTL specification
• Scheduling of basic blocks
• Design optimization through
    - Register sharing
    - Functional unit sharing
    - Bus sharing
    - Unit chaining
    - Multi-clocking
• Design pipelining
    - Unit pipelining
    - Control pipelining
    - Datapath pipelining
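The cycle-accurate ones-counter FSMD (states S0 to S7) can be exercised in software to check both the result and the cycle count. The following is a minimal sketch, not the deck's own code: the names `OnesFsmd`, `fsmd_reset`, `fsmd_step`, and `ones_count_fsmd` are hypothetical, the reset values are an assumption, and the S6 branch is assumed to test the value of Data after the shift.

```c
#include <stdint.h>

/* One state per clock cycle, named after the cycle-accurate FSMD. */
typedef enum { S0, S1, S2, S3, S4, S5, S6, S7 } State;

typedef struct {
    State state;
    uint32_t Data, Mask, Temp, Ocount;  /* datapath variables */
    uint32_t Output;                    /* datapath output */
    int Done;                           /* control output */
    unsigned cycles;                    /* clock cycles simulated so far */
} OnesFsmd;

/* Hypothetical reset: start in S0 with every variable cleared. */
void fsmd_reset(OnesFsmd *m) {
    m->state = S0;
    m->Data = m->Mask = m->Temp = m->Ocount = m->Output = 0;
    m->Done = 0;
    m->cycles = 0;
}

/* Advance the FSMD by one clock cycle. */
void fsmd_step(OnesFsmd *m, int Start, uint32_t Input) {
    switch (m->state) {
    case S0: if (Start) m->state = S1; break;               /* wait for Start = 1 */
    case S1: m->Done = 0; m->Data = Input; m->state = S2; break;
    case S2: m->Ocount = 0; m->state = S3; break;
    case S3: m->Mask = 1; m->state = S4; break;
    case S4: m->Temp = m->Data & m->Mask; m->state = S5; break;
    case S5: m->Ocount = m->Ocount + m->Temp; m->state = S6; break;
    case S6:
        m->Data >>= 1;
        /* Assumption: the Data = 0 status reflects the shifted value. */
        m->state = (m->Data != 0) ? S4 : S7;
        break;
    case S7: m->Done = 1; m->Output = m->Ocount; m->state = S0; break;
    }
    m->cycles++;
}

/* Run one computation with Start held at 1; optionally report the cycles used. */
uint32_t ones_count_fsmd(uint32_t Input, unsigned *cycles) {
    OnesFsmd m;
    fsmd_reset(&m);
    do {
        fsmd_step(&m, 1, Input);
    } while (!m.Done);
    if (cycles) *cycles = m.cycles;
    return m.Output;
}
```

Under these assumptions, an input of 11 (binary 1011) passes through S0 to S3 once, the S4 to S6 loop four times, and S7 once, so the simulation reports a count of 3 after 17 cycles. Merging S4 to S6 into one super-state, as in the super-state FSMD, would shorten the loop to one state per bit at the cost of a longer combinational path.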
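The square-root approximation (SRA) FSMD is easier to follow when its states are read as straight-line code. The sketch below writes one statement per FSMD assignment, keeping the variable names t1 to t7, x, and y; the function name `sra` and the helpers are hypothetical, and the inputs are assumed to be small enough that negation does not overflow.

```c
#include <stdint.h>

/* Small helpers standing in for the abs, max, and min functional units. */
int32_t sra_abs(int32_t v) { return v < 0 ? -v : v; }
int32_t sra_max(int32_t p, int32_t q) { return p > q ? p : q; }
int32_t sra_min(int32_t p, int32_t q) { return p < q ? p : q; }

/* Straight-line version of the SRA FSMD, states s1 to s7. */
int32_t sra(int32_t a, int32_t b) {
    int32_t t1 = sra_abs(a);        /* s1 */
    int32_t t2 = sra_abs(b);        /* s1 */
    int32_t x  = sra_max(t1, t2);   /* s2 */
    int32_t y  = sra_min(t1, t2);   /* s2 */
    int32_t t3 = x >> 3;            /* s3: x / 8 */
    int32_t t4 = y >> 1;            /* s3: y / 2 */
    int32_t t5 = x - t3;            /* s4: about 0.875 * x */
    int32_t t6 = t4 + t5;           /* s5: about 0.875 * x + 0.5 * y */
    int32_t t7 = sra_max(t6, x);    /* s6 */
    return t7;                      /* s7: Out = t7 */
}
```

For a = 3 and b = 4 the states give x = 4, y = 3, t3 = 0, t4 = 1, t5 = 4, t6 = 5, and t7 = 5. The expression max(0.875x + 0.5y, x) is commonly presented as a shift-and-add approximation of the magnitude sqrt(a² + b²), which is why only abs, min, max, add, subtract, and two fixed shifters appear in the datapath.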
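FU sharing groups operations that never execute in the same state. A simple way to see the idea is a first-fit grouping over the SRA schedule, sketched below. This is an illustration only, not the clustering algorithm the slides refer to: it looks at concurrency alone and ignores connectivity, so its grouping differs from the [abs/max] and [abs/min/+/-] units shown in the deck. The names `Op`, `sra_ops`, and `share_units` are hypothetical, and the shifters are left out because the deck keeps them as separate units.

```c
/* An operation and the FSMD state in which it executes. */
typedef struct {
    const char *name;
    int state;
} Op;

/* Arithmetic operations of the SRA FSMD, taken from states s1 to s6. */
const Op sra_ops[7] = {
    {"abs(a)", 1}, {"abs(b)", 1},
    {"max",    2}, {"min",    2},
    {"-",      4}, {"+",      5},
    {"max",    6}
};

/*
 * First-fit sharing: place each operation on the first functional unit
 * that holds no operation scheduled in the same state.
 * unit_of[i] receives the unit index of ops[i]; the return value is the
 * number of functional units used.
 */
int share_units(const Op *ops, int n, int *unit_of) {
    int units = 0;
    for (int i = 0; i < n; i++) {
        int u;
        for (u = 0; u < units; u++) {
            int conflict = 0;
            for (int j = 0; j < i; j++) {
                if (unit_of[j] == u && ops[j].state == ops[i].state)
                    conflict = 1;   /* both would run in the same state */
            }
            if (!conflict)
                break;              /* unit u is free in this state */
        }
        if (u == units)
            units++;                /* no free unit: allocate a new one */
        unit_of[i] = u;
    }
    return units;
}
```

On this schedule the sketch needs two units, because at most two operations share a state (s1 and s2). A clustering algorithm would additionally weigh which operations share sources and destinations, which is how grouping can also reduce connectivity.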
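The register assignment R1 = [ a, t1, x, t7 ], R2 = [ b, t2, y, t3, t5, t6 ], R3 = [ t4 ] can be checked against variable lifetimes. In the sketch below, the lifetimes are read off the eight-state SRA FSMD by hand (state in which a variable is written, last state in which it is read), which is an assumption of this example rather than a table from the deck; `Var`, `sra_vars`, `lifetimes_overlap`, and `count_conflicts` are hypothetical names.

```c
/* A variable, the state that writes it, the last state that reads it,
 * and the register it is assigned to. */
typedef struct {
    const char *name;
    int write;
    int last_read;
    int reg;
} Var;

/* Lifetimes derived by hand from the SRA FSMD (s0 to s7). */
const Var sra_vars[11] = {
    {"a",  0, 1, 1}, {"t1", 1, 2, 1}, {"x",  2, 6, 1}, {"t7", 6, 7, 1},
    {"b",  0, 1, 2}, {"t2", 1, 2, 2}, {"y",  2, 3, 2}, {"t3", 3, 4, 2},
    {"t5", 4, 5, 2}, {"t6", 5, 6, 2},
    {"t4", 3, 5, 3}
};

/*
 * A variable is live after its write state up to and including its last
 * read state. Two lifetimes overlap when those intervals intersect; a
 * register may be read and rewritten in the same state without conflict.
 */
int lifetimes_overlap(const Var *p, const Var *q) {
    int lo = p->write > q->write ? p->write : q->write;
    int hi = p->last_read < q->last_read ? p->last_read : q->last_read;
    return lo < hi;
}

/* Count pairs of variables that share a register while both are live. */
int count_conflicts(const Var *v, int n) {
    int conflicts = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (v[i].reg == v[j].reg && lifetimes_overlap(&v[i], &v[j]))
                conflicts++;
    return conflicts;
}
```

With these lifetimes the assignment from the slides produces no conflicts. Moving t4 into R2 would collide with t3 and t5, which is consistent with t4 receiving its own register R3.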

Date posted: 27/01/2016, 09:26


Table of contents

    Ones Counter from C Code

    CDFG for Ones Counter

    CDFG to FSMD for Ones Counter

    FSMD for Ones Counter

    Square Root Approximation: C to CDFG

    Square Root Approximation: Scheduling

    Square Root Approximation: CDFG to FSMD

    Square Root Approximation: FSMD Design

    Resource usage in SRA

    Resource usage in SRA
