CLUSTERING TECHNIQUES FOR COARSE-GRAINED, ANTIFUSE-BASED FPGAS

by Chang Woo Kang

A Dissertation Presented to the FACULTY OF THE GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ELECTRICAL ENGINEERING)

May 2006

Copyright 2006 Chang Woo Kang

UMI Number: 3237159

UMI Microform 3237159. Copyright 2007 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company, 300 North Zeeb Road, P.O. Box 1346, Ann Arbor, MI 48106-1346

DEDICATION

To my parents and family, my fiancée Eunju Lee, and my best friend Sung-Hoon Kang, thanks for their unconditional support and love.

ACKNOWLEDGEMENTS

I would like to offer my humble acknowledgment to Professor Massoud Pedram, who supervised and guided me through this achievement. From him, I received not only knowledge on my research but also emotional support whenever I encountered frustration. I also thank Professor Jeff Draper and Professor Roger Zimmermann for being on my thesis committee. I would like to extend my deep gratitude to Professor Jeff Draper, who supported me at the Information Sciences Institute for three years. His support has been an enormous encouragement during study at USC.

I would like to thank all of the SPORT group members who have given freely of their time, hearts, and resources to support this research. A partial list includes: Chanseok Hwang, Kihwan Choi, Ali Iranli, Yazdan Aghahiri, Peng Rong, Yu Hou, Afshin Abdollahi, Wonbok Lee, Maryam Soltan, Morteza Maleki, Hanif Fatemi, Soroush Abbaspour, Hwisung Jung, and Behnam Amelifard.

Finally, I would like to express my deep affection to Yunjung Choi, Ihn Kim, Joongseok Moon, Kisup Chong, and Kihoon Jeong.
Chang Woo Kang
USC, May 2006

TABLE OF CONTENTS

Dedication
Acknowledgements
List of Tables
List of Figures
Abstract

CHAPTER 1 Introduction
  1.1 Motivation
  1.2 Dissertation Outline

CHAPTER 2 Coarse-grained FPGAs and Previous Work
  2.1 Overview of Coarse-grained FPGA Architecture
  2.2 FPGA CAD Flow
    2.2.1 Technology mapping
    2.2.2 Clustering Techniques
    2.2.3 Placement and Routing
  2.3 Summary

CHAPTER 3 Tool Flow and Cell Library Generation
  3.1 Introduction
  3.2 Tool Flow
  3.3 Cell Library Generation
  3.4 Cost Assignment
  3.5 Summary

CHAPTER 4 Area-driven Clustering Algorithm with Considerations of Interconnect Connectivity and Circuit Speed
  4.1 Introduction
  4.2 Lower-bound Calculation
    4.2.1 Problem Statement and Dynamic Programming Approach
    4.2.2 Set containment relations
    4.2.3 Minimum number of pASIC3 logic cells with given base gates
    4.2.4 Type distribution table
    4.2.5 Problem formulation and solution
  4.3 Area-driven Clustering Technique
    4.3.1 Interconnect-aware Clustering
    4.3.2 Timing Slack-driven Clustering
  4.4 Experiment Results
  4.5 Summary

CHAPTER 5 Timing-driven Clustering
  5.1 Introduction
  5.2 Problem statement
  5.3 Multi-dimensional labeling algorithm
  5.4 Signal Path Aware Slack-time relaxation
  5.5 Merging algorithm
  5.6 Experiment Results
  5.7 Summary

CHAPTER 6 Low-power Clustering with Minimum Logic Replication
  6.1 Introduction
  6.2 Design Flow and Problem Description
  6.3 Low Power Clustering
    6.3.1 Cluster generation and power-delay curves
    6.3.2 Correct accounting of logic replication
  6.4 Cluster selection
  6.5 Implementation and Experimental Results
  6.6 Summary

CHAPTER 7 Conclusion and Future Work
  7.1 Dissertation Summary
  7.2 Future Work

Bibliography

LIST OF TABLES

Table 3.1: Cell distribution after cell personalization from base gates
Table 3.2: Cell distribution after identifying common primitive cells among base gates
Table 3.3: Filtered primitive cells
Table 4.1: The type distribution table for primitive cell to base-gate mapping
Table 4.2: Results of lower-bound calculation
Table 4.3: Results of different clustering objectives with the minimum area solution
Table 5.1: Results of timing-driven clustering
Table 5.2: Results of slack-time relaxation
Table 6.1: Low-power clustering results: Area and delay
Table 6.2: Low-power clustering results: Power and CPU time

LIST OF FIGURES

Figure 1.1: Virtex II CLB Element.
Figure 1.2: Coarse-grained, antifuse-based FPGA: (a) pASIC3 logic cell, (b) FPGA architecture, and (c) antifuse switch
Figure 2.1: Coarse-grained, SRAM-based FPGA [1]
Figure 2.2: Coarse-grained, antifuse-based FPGA
Figure 2.3: FPGA CAD flow.
Figure 2.4: Input reduction by adding a BLE.
Figure 2.5: BLE criticality assignment.
Figure 2.6: Clustering: (a) before packing node B into cluster C and (b) after packing node B into cluster C.
Figure 3.1: Proposed CAD tool flow for pASIC3 family FPGA
Figure 3.2: Functions in Packer-pASIC3
Figure 3.3: pASIC3 base gates derived from the configurable logic cell.
Figure 3.4: Venn’s diagram for the set of logic cells that can be personalized from the base gates
Figure 4.1: Interconnect switch architecture for two different FPGAs
Figure 4.2: One dimensional coin change problem.
Figure 4.3: Examples of local neighborhood connectivity factor computation
Figure 4.4: Clustering nodes.
Figure 4.5: Packing un-clustered nodes by using linear assignment: (a) partially clustered network; (b) bipartite graph for linear assignment
Figure 4.6: Selecting the best node for clustering: (a) greedy selection and (b) intelligent selection.
Figure 4.7: Selecting the best node for delay improvement
Figure 5.1: Multi-dimensional labeling algorithm
Figure 5.2: Clustering example
Figure 5.3: Slack-time relaxation with awareness of signal path
Figure 6.1: An example of redundant logic replication in clustering: (a) clusters and the corresponding area-delay points, (b) non-inferior clusters, (c) circuit after logic replication (i.e., n1, n2, and n3 are duplicated), and (d) a desired clustering solution.
Figure 6.2: PD curve generation for a node with a cluster
Figure 6.3: Example of logic replication prediction.
Figure 6.4: Prediction of logic replication.
Figure 6.5: Logic replication cases: (a) child node is replicated, and (b) root node is replicated.
Figure 6.6: Logic replication for cluster selection.

[...]

... techniques for coarse-grained, antifuse-based FPGAs. The clustering problem for coarse-grained, antifuse-based FPGAs is quite different from typical clustering problems that we’ve known for SRAM-based FPGAs. Coarse-grained, antifuse-based FPGA architecture demands highly intelligent CAD algorithms, because the architecture provides tremendous flexibility with the least hardware overhead. The hardware overhead for ...

... in [6].

2.2.2 CLUSTERING TECHNIQUES

Once the technology mapping is accomplished, then the mapped netlist is provided to a clustering algorithm. The clustering algorithm packs multiple basic logic elements into a logic cluster. Many clustering techniques, for SRAM-based FPGAs, have been based on constructive clustering techniques. In the following section, the major achievements on clustering techniques are ...
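To make the idea of constructive clustering in the excerpt above concrete, here is a minimal sketch in Python. It is a generic greedy packer, not the algorithm of this dissertation or of any cited work: the function name `greedy_cluster`, the netlist representation (a dictionary from net names to the elements on each net), and the attraction measure (number of shared nets) are all assumptions made for illustration.

```python
from collections import defaultdict

def greedy_cluster(nets, capacity):
    """Pack elements into clusters holding at most `capacity` elements.

    nets: dict mapping a net name to the set of element names on that net
          (a hypothetical netlist format chosen for this sketch).
    Returns a list of clusters, each a list of element names.
    """
    # Index: element -> set of nets it touches.
    elem_nets = defaultdict(set)
    for net, elems in nets.items():
        for e in elems:
            elem_nets[e].add(net)

    unclustered = set(elem_nets)
    clusters = []
    while unclustered:
        # Seed with the unclustered element on the most nets (ties: by name).
        seed = max(sorted(unclustered), key=lambda e: len(elem_nets[e]))
        cluster = [seed]
        cluster_nets = set(elem_nets[seed])
        unclustered.remove(seed)
        while len(cluster) < capacity and unclustered:
            # Attraction = number of nets shared with the growing cluster.
            best = max(sorted(unclustered),
                       key=lambda e: len(elem_nets[e] & cluster_nets))
            if not elem_nets[best] & cluster_nets:
                break  # nothing connected remains; open a new cluster
            cluster.append(best)
            cluster_nets |= elem_nets[best]
            unclustered.remove(best)
        clusters.append(cluster)
    return clusters

example_nets = {"n1": {"a", "b"}, "n2": {"b", "c"}, "n3": {"d", "e"}}
example_clusters = greedy_cluster(example_nets, capacity=3)
```

In this toy netlist the connected elements a, b, and c end up together and d and e form a second cluster. A production packer would also have to respect input-pin limits and timing criticality, which this sketch ignores.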
... during delay optimization. For antifuse-based FPGAs, Boolean matching techniques have been used for technology mapping, and research results on technology mapping for antifuse logic cells have been reported [30]. Boolean matching is therefore a key enabler for antifuse-based FPGA mapping. Lai et al. in [42] proposed a Boolean matching algorithm and introduced matching filters for speedup. A more comprehensive ...

ABSTRACT

Coarse-grained, antifuse-based FPGAs have emerged as a compelling technology to minimize the performance gaps between FPGAs and ASICs in area, speed, and power dissipation. As the FPGA architectures prefer large, programmable logic blocks, efficient clustering algorithms are vital to make use of the benefits from those advanced architectures. Circuit clustering is an important technique for coarse-grained ...

... generation, area-driven clustering, timing-driven clustering, and low-power clustering. In CHAPTER 2, an overview of coarse-grained FPGAs is provided and brief review of the flow of FPGA CAD tools is presented. In CHAPTER 3, we present the procedure for generating library cells from the target pASIC3 family FPGA architecture. The library cells are used during technology mapping, before clustering, to cover ...

... The clustering, therefore, refers to the task of grouping logic gates in the circuit netlist and assigning each group to a configurable logic block in the FPGA array. Since poor clustering may result in significant impact on the final design in terms of area, delay and power, clustering must be done carefully before placement and routing. In this dissertation, our research focuses on clustering techniques ...
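The Boolean matching mentioned in the technology-mapping excerpt above can be illustrated with a brute-force sketch in Python. This is not the algorithm of Lai et al. or of any cited work, and it uses no matching filters: it simply tries every way of tying the inputs of a configurable cell to constants or to the target function's variables and compares truth tables. The 2:1 multiplexer cell `mux2`, the function `match`, and the variable naming `x0`, `x1`, ... are hypothetical choices for illustration, and inverted inputs are deliberately not allowed.

```python
from itertools import product

def mux2(s, a, b):
    """Hypothetical configurable cell: 2:1 multiplexer (a when s=0, b when s=1)."""
    return b if s else a

def match(cell, cell_arity, target, target_arity):
    """Return a binding of cell inputs that realizes `target`, or None.

    Each cell input is tied to constant 0, constant 1, or one of the
    target's variables "x0", "x1", ... (no inversions). All bindings are
    tried exhaustively, so this is only practical for very small cells.
    """
    sources = [0, 1] + [f"x{i}" for i in range(target_arity)]
    for binding in product(sources, repeat=cell_arity):
        def cell_output(xs):
            # Resolve each bound source to a concrete 0/1 value.
            env = {f"x{i}": v for i, v in enumerate(xs)}
            args = [env[src] if isinstance(src, str) else src for src in binding]
            return cell(*args)
        # The binding matches if the truth tables agree on every input vector.
        if all(cell_output(xs) == target(*xs)
               for xs in product((0, 1), repeat=target_arity)):
            return binding
    return None

and_binding = match(mux2, 3, lambda x, y: x & y, 2)
xor_binding = match(mux2, 3, lambda x, y: x ^ y, 2)
```

Under these assumptions a two-input AND is found as select = x0, a = 0, b = x1, while XOR has no match because it would need an inverted input. Practical matchers avoid the exponential search by pruning candidates early, which is the role the excerpt attributes to matching filters.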
... important technique for coarse-grained FPGAs. First, clustering can reduce the complexity of large circuit designs by a significant factor. Second, clustering can improve the quality of the results of other operations such as placement and routing. In this dissertation, clustering techniques for area, delay, and power dissipation are proposed. First, an area-driven clustering algorithm is presented to minimize ...

... range of Boolean functions. Antifuse-based FPGAs are, therefore, smaller in size when compared to SRAM-based FPGAs with the same number of equivalent gate capacity. SRAM-based interconnect contains transistor switches, while the antifuse-based interconnect can be considered as a standard metal interconnect found in ASIC chips. There are three primary classes of FPGA architectures: coarse-grained, medium-grained ...

... we present an interconnect-aware clustering algorithm and a timing-driven clustering algorithm. In CHAPTER 5, we present a timing-driven clustering algorithm, which minimizes the number of pASIC3 logic cells on the longest input-output path. Logic replication is minimized by slack-time relaxation. A low-power clustering algorithm is presented in CHAPTER 6. In the low power clustering algorithm, we minimize ...

... blocks. In the following sections, we briefly review technology mapping techniques, clustering techniques, placement, and routing algorithms.

2.2.1 TECHNOLOGY MAPPING

In a standard cell design procedure for the application specific integrated circuits (ASIC), technology mapping maps the optimized circuits with a target library. However, FPGAs have clusters with basic logic elements; and those basic logic ...
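The timing objective named in the CHAPTER 5 excerpt above, the number of logic cells on the longest input-output path, can be evaluated for a given clustering with a short Python sketch. This only measures a clustering; it is not the multi-dimensional labeling algorithm of the dissertation. The function name `cells_on_longest_path`, the fan-in dictionary, and the rule of charging one cell whenever a path enters a different cluster are assumptions made for illustration.

```python
from functools import lru_cache

def cells_on_longest_path(fanins, cluster_of):
    """Largest number of logic cells traversed on any input-to-output path.

    fanins:     dict mapping a gate to the list of nodes driving it; the
                graph must be acyclic (hypothetical netlist format).
    cluster_of: dict mapping each gate to its cluster id; primary inputs
                do not appear in this dictionary.
    """
    @lru_cache(maxsize=None)
    def level(node):
        if node not in cluster_of:   # primary input: no cell traversed yet
            return 0
        best = 1                     # the gate's own cell always counts
        for f in fanins.get(node, []):
            # Staying inside the same cluster is free; entering a
            # different cluster costs one more logic cell.
            step = 0 if cluster_of.get(f) == cluster_of[node] else 1
            best = max(best, level(f) + step)
        return best

    return max((level(g) for g in cluster_of), default=0)

chain = {"g1": ["a", "b"], "g2": ["g1", "c"], "g3": ["g2"]}
packed = cells_on_longest_path(chain, {"g1": 0, "g2": 0, "g3": 1})
unpacked = cells_on_longest_path(chain, {"g1": 0, "g2": 1, "g3": 2})
```

For this three-gate chain, packing g1 and g2 into one cell shortens the longest path from three cells to two. The recursion depth grows with circuit depth, so a large netlist would call for an iterative traversal in topological order.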