Applications of Field-Programmable Gate Arrays in Scientific Research
Sadrozinski • Wu
www.crcpress.com, an informa business
6000 Broken Sound Parkway, NW, Suite 300, Boca Raton, FL 33487
270 Madison Avenue, New York, NY 10016
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN, UK
ISBN: 978-1-4398-4133-4
Electrical Engineering

Focusing on resource awareness in field-programmable gate array (FPGA) design, Applications of Field-Programmable Gate Arrays in Scientific Research covers the principles of FPGAs and their functionality. It explores a host of applications, ranging from small one-chip laboratory systems to large-scale applications in “big science.”

The book first describes various FPGA resources, including logic elements, RAM, multipliers, microprocessors, and content-addressable memory. It then presents principles and methods for controlling resources, such as process sequencing, location constraints, and intellectual property cores. The remainder of the book illustrates examples of applications in high-energy physics, space, and radiobiology. Throughout the text, the authors remind designers to pay attention to resources at the planning, design, and implementation stages of an FPGA application in order to reduce the use of limited silicon resources and thereby reduce system cost.
Features
• Explores the use of these integrated circuits in an array of areas
• Emphasizes sound design practices that save silicon resources and reduce power consumption
• Contains many hands-on examples drawn from diverse fields, such as high-energy physics and radiobiology
• Offers VHDL code, detailed schematics of selected projects, photographs, and more on a supporting Website

Supplying practical know-how on an array of FPGA application examples, this book provides an accessible overview of the use of FPGAs in data acquisition, signal processing, and transmission. It shows how FPGAs are employed in laboratory applications and how they are flexible, low-cost alternatives to commercial data acquisition systems.

Hartmut F.-W. Sadrozinski, University of California, Santa Cruz, USA
Jinyuan Wu, Fermi National Accelerator Laboratory, Batavia, Illinois, USA

A Taylor & Francis Book. CRC Press is an imprint of the Taylor & Francis Group, an informa business. Boca Raton, London, New York.

Taylor & Francis, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742
© 2011 by Taylor and Francis Group, LLC. Taylor & Francis is an Informa business.
No claim to original U.S. Government works. Printed in the United States of America on acid-free paper.
International Standard Book Number-13: 978-1-4398-4134-1 (Ebook-PDF)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained.
If any copyright material has not been acknowledged, please write and let us know so we may rectify in any future reprint.

Field-programmable gate array (FPGA)
By: Lê Văn Tâm

A field-programmable gate array (FPGA) is an integrated circuit built from an array of logic elements that the user can program. (The word "field" refers to the ability to reprogram the device "in the field" by the user, independent of the complex production line of a semiconductor plant.) An FPGA is made up of the following parts:
• Programmable logic blocks
• A programmable interconnect network
• Input/output blocks (I/O pads)
• Other prebuilt elements such as DSP slices, RAM, ROM, and microprocessor cores

An FPGA can also be regarded as a kind of application-specific integrated circuit (ASIC). Compared with a full-custom ASIC or an ASIC designed from a logic library, an FPGA does not reach the same level of optimization and is limited in its ability to carry out especially complex tasks. Its advantages are that it can be reconfigured while in use and that the design process is simpler, which lowers cost and shortens the time needed to put a product into use.

Compared with other programmable devices built on arrays of logic elements, such as the PLA, PAL, and CPLD, the FPGA has these advantages: reprogramming is simpler; programming is more flexible; and, most importantly, the FPGA architecture can hold a far larger number of logic gates than earlier programmable devices.

FPGA design, or programming, is done mainly in hardware description languages (HDLs) such as VHDL, Verilog, and AHDL. The major FPGA vendors, such as Xilinx and Altera, usually provide software packages and supporting equipment for the design process, and some third-party vendors, such as Synopsys and Synplify, supply software of the same kind. These packages can carry out every step of the standard IC design flow, taking HDL design code (also called RTL code) as input.

History

The FPGA was designed by Ross Freeman, the founder of Xilinx, in 1984. The FPGA architecture allows a relatively large number of semiconductor elements to be integrated on one chip compared with the earlier CPLD architecture. An FPGA can hold from 100,000 up to several billion logic gates, whereas a CPLD holds from 10,000 to 100,000 logic gates; for PAL and PLA devices the figure is lower still, from a few thousand up to 10,000.

A CPLD is built from a fixed number of SPLD blocks (simple programmable logic devices, a general term for PALs and PLAs). An SPLD is usually a programmable AND/OR logic array of fixed size that contains a limited number of clocked registers. This structure limits the ability to implement complex functions, and in general the performance of the device depends more on its specific structure than on the requirements of the problem.

The FPGA architecture is an array of logic blocks, each much smaller than an SPLD block. This lets an FPGA hold many more logic elements and make full use of the programmability of both the logic elements and the interconnect. To achieve this, the FPGA architecture is considerably more complex than that of a CPLD.

Another difference from the CPLD is that modern FPGAs integrate many optimized arithmetic and logic units and support high-speed RAM and ROM as well as multiply-accumulate (MAC) units, known as DSP slices, for digital signal processing (DSP) applications.

Besides reconfiguring the whole chip, some modern FPGAs support partial reconfiguration, that is, reconfiguring one part of the device while the other parts continue to operate normally.

Applications

FPGA applications include digital signal processing (DSP), aerospace and defense systems, ASIC prototyping, visual control systems, image analysis and recognition, speech recognition, cryptography, and computer hardware modeling. Because the design process is highly flexible, FPGAs can handle a class of complex problems that previously could be solved only in software. Thanks to their large gate density, FPGAs are also applied to problems that demand heavy computation and are used in real-time systems.

Architecture

[Figure: overall structure of an FPGA]

Logic blocks

[Figure: logic block of an FPGA]

Xilinx FPGA documentation also uses the term slice. A slice is built from a group of logic blocks, and the number of slices ranges from a few thousand to a few tens of thousands depending on the FPGA type. Looking at the overall structure of the LUT array, each LUT supports, in addition to its own inputs, extra inputs from the preceding and following logic blocks, which raises the total number of LUT inputs. This structure is intended to speed up arithmetic and logic units.

Interconnect

[Figure: switch block of an FPGA]

The FPGA interconnect network consists of routing lines that run horizontally and vertically. Depending on the FPGA type, these lines are divided into groups; the Xilinx XC4000, for example, has short, long, and very long connections. The lines are joined through programmable switch blocks, each of which contains a number of programmable switch points that make complex connection patterns possible.

Prebuilt elements

Besides logic blocks, different FPGA types integrate different additional elements. For system-on-chip (SoC) designs, for example, the Xilinx Virtex-4 and Virtex-5 families contain PowerPC processor cores, and the Atmel FPSLIC integrates an AVR core. For DSP applications, an FPGA integrates DSP slices, high-speed multiply-add units that compute A*B+C; the Xilinx Virtex families, for example, contain from a few dozen to several hundred DSP slices with 18-bit A, B, and C.

Hindawi Publishing Corporation
EURASIP Journal on Embedded Systems
Volume 2006, Article ID 51312, Pages 1–2
DOI 10.1155/ES/2006/51312

Editorial
Field-Programmable Gate Arrays in Embedded Systems

Miriam Leeser,1 Scott Hauck,2 and Russell Tessier3
1 Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA
2 Department of Electrical Engineering, University of Washington, Seattle, WA 98195-2500, USA
3 Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01003, USA

Received 13 July 2006; Accepted 13 July 2006

Copyright © 2006 Miriam Leeser et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Welcome to the special issue on field programmable gate arrays (FPGAs). FPGAs are becoming an increasingly important part of embedded systems, as the collection of papers in this issue illustrates.
“An overview of reconfigurable hardware in embedded systems” provides a comprehensive overview of the state-of-the-art use of reconfigurable hardware in embedded systems. A detailed discussion of the use of FPGAs for application areas such as encryption, software-defined radio, and robotics is provided. Additionally, a concise assessment of design issues and current design tools is included. A sizable collection of citations provides a handy reference for newcomers to the field.

The remaining papers address applications and tools for embedded systems design. The applications presented here are typical of the spectrum of FPGA applications. They fall into the categories of multimedia processing, including video, image, and speech processing, as well as communications applications.

The implementation of an MPEG-4 image encoder using a scalable number of Altera NIOS soft processors is presented in “Scalable MPEG-4 encoder of FPGA multiprocessor SOC.” An image is partitioned so that each processor receives a horizontal slice of the image. The authors’ own on-chip interconnection network is used to connect the soft processors. The authors demonstrate a significant application speedup as additional soft processors are added to the FPGA platform.

In “A real-time wavelet domain video denoising implementation in FPGA,” the authors present a two-FPGA solution for performing video denoising via a 3D (two spatial and one temporal dimension) wavelet filter. By careful consideration of the algorithm, data movement, and pipelining, a complete and complex image processing pipeline is produced.

In “A dynamic reconfigurable hardware/software architecture for object tracking in video streams,” the authors present a feature tracker that has been implemented on an FPGA. The authors focus on choosing an algorithm that is well matched to reconfigurable hardware, hardware/software partitioning, and efficient use of memory structures.
Their implementation, which runs faster than a software-only solution, has applications for mobile autonomous platforms.

The paper “Speech silicon: an FPGA architecture for real-time hidden Markov model-based speech recognition” details the implementation of an FPGA SoC that can perform real-time speech recognition of medium-sized speech vocabularies. This pipelined approach maximizes the throughput by minimizing the amount of required control circuitry. The FPGA implementation of each part of the pipeline is carefully documented to demonstrate the benefits of FPGA specialization. FPGA floorplanning plays an important role in achieving real-time performance.

A common application for FPGAs is image processing algorithms. In “A visual environment for real-time image processing in hardware (VERTIPH),” the ...

EURASIP Journal on Applied Signal Processing 2003:6, 543–554
© 2003 Hindawi Publishing Corporation

Rapid Prototyping of Field Programmable Gate Array-Based Discrete Cosine Transform Approximations

Trevor W. Fox
Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive N.W., Calgary, Alberta, Canada T2N 1N4
Email: fox@enel.ucalgary.ca

Laurence E. Turner
Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive N.W., Calgary, Alberta, Canada T2N 1N4
Email: turner@enel.ucalgary.ca

Received 28 February 2002 and in revised form 15 October 2002

A method for the rapid design of field programmable gate array (FPGA)-based discrete cosine transform (DCT) approximations is presented that can be used to control the coding gain, mean square error (MSE), quantization noise, hardware cost, and power consumption by optimizing the coefficient values and datapath wordlengths. Previous DCT design methods can only control the quality of the DCT approximation and estimates of the hardware cost by optimizing the coefficient values.
It is shown that it is possible to rapidly prototype FPGA-based DCT approximations with near optimal coding gains that satisfy the MSE, hardware cost, quantization noise, and power consumption specifications.

Keywords and phrases: DCT, low-power, FPGA, binDCT.

1. INTRODUCTION

The discrete cosine transform (DCT) has found wide application in audio, image, and video compression and has been incorporated in the popular JPEG, MPEG, and H.26x standards [1]. The phenomenal growth in the demand for products that use these compression standards has increased the need to develop a rapid prototyping method for hardware-based DCT approximations. Rapid prototyping design methods reduce the time necessary to demonstrate that a complex design is feasible and worth pursuing.

The number of logic resources and the speed of field programmable gate arrays (FPGAs) have increased dramatically while the cost has diminished considerably. Designs can be quickly and economically prototyped using FPGAs.

A methodology that can be used to rapidly prototype DCT implementations with control over the hardware cost, the quantization noise at each subband output, the power consumption, and the quality of the DCT approximation would be useful. For example, a DCT implementation that requires few FPGA resources frees additional space for other signal processing functions, which can permit the use of a smaller, less expensive FPGA. Also, near exact DCT approximations can be obtained such that the hardware cost and power consumption requirements are satisfied.

A rapid prototyping methodology for the design of FPGA-based DCT approximations that can be used to control the quality of the DCT approximation, the hardware cost, the quantization noise at each subband output, and the power consumption has not been previously introduced in the literature.
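The trade-off between coefficient wordlength and approximation quality described above can be illustrated with a small experiment. The following is a minimal sketch under stated assumptions, not the method of the paper: it builds an exact 8-point orthonormal DCT-II matrix, rounds every coefficient to a chosen number of fractional bits, and reports the mean squared difference between the exact and rounded coefficients. The function names (`dct_matrix`, `quantize`, `coefficient_mse`) and the simple entry-wise error measure are hypothetical choices made for this illustration; the paper's own MSE and coding-gain definitions may differ.

```python
import math


def dct_matrix(n=8):
    """Return the n x n orthonormal DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        # Row 0 uses a different scale factor so that the matrix is orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n)]
        )
    return rows


def quantize(matrix, frac_bits):
    """Round each coefficient to a fixed-point value with frac_bits fractional bits."""
    step = 2 ** frac_bits
    return [[round(c * step) / step for c in row] for row in matrix]


def coefficient_mse(exact, approx):
    """Mean squared difference between corresponding matrix entries (assumed metric)."""
    n = len(exact)
    total = sum(
        (e - a) ** 2
        for row_e, row_a in zip(exact, approx)
        for e, a in zip(row_e, row_a)
    )
    return total / (n * n)


exact = dct_matrix()
# Hypothetical wordlengths chosen only to show the trend.
errors = {bits: coefficient_mse(exact, quantize(exact, bits)) for bits in (4, 8, 12)}
for bits, err in errors.items():
    print(f"{bits:2d} fractional bits -> coefficient MSE {err:.3e}")
```

Fewer fractional bits generally mean narrower constant multipliers and therefore less hardware, at the price of a larger coefficient error; a design method of the kind the paper describes has to search this trade-off rather than fix it by hand.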
A method for the design of fixed point DCT approximations has recently been introduced in [2], but it does not specifically target FPGAs or application-specific integrated circuits (ASICs). The method discussed in [2] can be used to control the quality of the DCT approximation and the estimate of the hardware cost (the total number of adders and subtractors required to implement all of the constant coefficient multipliers) by optimizing the coefficient values. Unfortunately, the method presented in [2] only estimates the hardware cost, ignores the power consumption and quantization noise, and ignores the datapath wordlengths (the number of bits used to represent a signal). In contrast, the method proposed in this paper ...

RAPID PROTOTYPING OF EMBEDDED SYSTEMS USING FIELD PROGRAMMABLE GATE ARRAYS
Summa Cum Laude Thesis
Bhavya Daya
Bachelor of Science in Electrical Engineering
Bachelor of Science in Computer Engineering
Spring 2009

© 2009 Bhavya Daya

To:
God for granting me patience
My mom, dad and brother for their unwavering support

ACKNOWLEDGEMENTS

I would like to thank my supervisor, Professor Herman Lam, for his assistance throughout the honors research, Professor Eric Schwartz for obtaining the Xilinx development board for the project, and Professor Ann Gordon-Ross and Professor Prabhat Mishra for being members of my supervisory committee. I would also like to thank Mr. Steve Permann, student advisor, for his guidance and support throughout my undergraduate studies at the University of Florida.

Table of Contents
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
CHAPTER 1 INTRODUCTION
What is an Embedded System?
Design Considerations when Developing an Embedded System
Importance of Rapid Prototyping of Embedded Systems using FPGAs
Scope of The Project
Outline of Chapters
CHAPTER 2 EMBEDDED SYSTEMS DESIGN
Embedded Systems Design Flow
Three Generations of Embedded System Design
Trends affecting Embedded System Design
Overview of Embedded System Hardware and Software
CHAPTER 3 EMBEDDED SYSTEM HARDWARE
Peripherals
Processor
Microcontroller-Based
ASIC-Based
DSP Processor-Based
FPGA-Based
Memory
CHAPTER 4 EMBEDDED SYSTEM SOFTWARE
Intellectual Property
Stages of Software Development
Embedded Operating System
Xilinx and Altera Software Tools
CHAPTER 5 RAPID PROTOTYPING OF EMBEDDED SYSTEMS
Rapid System Prototyping
Prototyping of Embedded Hardware and Software Systems
CHAPTER 6 BOARD-LEVEL RAPID PROTOTYPING OF EMBEDDED SYSTEMS
Board Level Prototyping Methodology
Prototyping Platforms using FPGAs
Altera DE2 Development and Education Board
Xilinx FX12 PowerPC and Microblaze Embedded Development Kit
CHAPTER 7 EMBEDDED SYSTEM DEVELOPMENT
Embedded System Design
Altera DE2 Board
USB and Embedded Operating System
Choosing an Embedded Operating System
uCLinux Operating System
Porting uCLinux to Nios II Processor and Cyclone II FPGA
Means of Implementing Photo Frame Application Using uCLinux
Design of Application Software
Porting Application to Nios II processor
SD Card and Nios Embedded Processor
Research of IP Cores
Nios II Hardware Design
SD Card Interface
VGA Interface
SRAM Controller
JPEG Decoder
Application Software Design
Xilinx PowerPC and MicroBlaze Development Kit FX12 Edition
Choosing an Embedded Processor
MicroBlaze Processor
PowerPC Processor
Research IP Cores Available
Compact Flash Interface
VGA Interface
Embedded Processor Hardware Design
Embedded Processor Software Design
Embedded System Implementation
Altera DE2 Board
USB and Embedded Operating System
SD Card and Nios ...

ELEC-2005 Electronics in High Energy Physics
Winter Term: Introduction to Electronics in HEP
Field Programmable Gate Arrays, Part 1
Stefan Haas, stefan.haas@cern.ch
CERN Technical Training 2005
Stefan Haas, 1 Feb. 2005, ELEC-2005

Outline
Part 1
• Programmable Logic
• CPLD
• FPGA
  – Architecture
  – Examples
  – Features
  – Vendors and Devices
coffee break
Part 2
• VHDL
  – Introduction
  – Examples
• Design Flow
  – Entry Methods
  – Simulation
  – Synthesis
  – Place & Route
• IP Cores
• CERN Tools & Support

Programmable Logic
• Programmable digital integrated circuit
• Standard off-the-shelf parts
• Desired functionality is implemented by configuring on-chip logic blocks and interconnections
• Advantages (compared to an ASIC):
  – Low development costs
  – Short development cycle
  – Device can (usually) be reprogrammed
• Types of programmable logic:
  – Complex PLDs (CPLD)
  – Field Programmable Gate Arrays (FPGA)

CPLD Architecture and Examples

PLD - Sum of Products
Programmable AND array followed by fixed fan-in OR gates
[Figure: inputs A, B, C feed an AND plane through programmable switches or fuses; the outputs are sum-of-products functions f1 = A•B•C + A•B•C and f2 = A•B + A•B•C, where the complement bars on individual literals are not recoverable]

PLD - Macrocell
Can implement combinational or sequential logic
[Figure: inputs A, B, C feed an AND plane; a flip-flop (D, Q, Clock) and a MUX with Select and Enable drive output f1]

CPLD Structure
Integration of several PLD blocks with a programmable interconnect on a single chip
[Figure: PLD blocks and I/O blocks connected through interconnection matrices]

CPLD Example - Altera MAX7000
EPM7000 Series Block Diagram

CPLD Example - Altera MAX7000
EPM7000 Series Device Macrocell

[...]

... needs to be configured at power-on
• Flash Erasable Programmable ROM (Flash)
  – each switch is a floating-gate transistor that can be turned off by injecting charge onto its gate
  – FPGA itself holds the program
  – reprogrammable, even in-circuit
• Fusible Links (“Antifuse”)
  – forms a low resistance path when electrically programmed
  – one-time programmable in special programming machine
  – radiation ...

FPGA Architecture

FPGA - Generic Structure
FPGA building blocks:
• Programmable logic blocks: implement combinatorial and sequential logic
• Programmable interconnect: wires to connect inputs and outputs to logic blocks
• Programmable I/O blocks: special logic blocks at the periphery of device for external connections
[Figure: logic blocks, interconnection switches, and I/O blocks]

...

Switch Matrix Operation
• 6 pass transistors per switch matrix interconnect point
• Pass transistors act as programmable switches
• Pass transistor gates are driven by configuration memory cells
[Figure: switch matrix among logic elements (LE), before programming and after programming]

Special Features
• Clock management
  – PLL, DLL
  – Eliminate clock skew between external clock input and on-chip ...

...

[Figure: a function Z of inputs A, B, C, D shown as a truth table, a gate implementation, and a LUT implementation]

LUT Implementation
• Example: 3-input LUT
• Based on multiplexers (pass transistors)
• LUT entries stored in configuration memory cells
[Figure: eight configuration memory cells (0/1) selected by inputs X1, X2, X3 to produce output F]

Programmable Interconnect
• Interconnect hierarchy (not shown)
  – Fast ...
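The LUT implementation slide above (a 3-input LUT based on multiplexers, with its entries stored in configuration memory cells) can be modeled in a few lines of software. The following is a minimal sketch under stated assumptions, not a description of any vendor's circuit: the configuration memory is a list of eight bits, and the inputs X1, X2, X3 steer a tree of 2:1 multiplexers that picks one stored bit. The names `make_lut`, `mux2`, and `lut_read`, and the choice of X1 as the most significant address bit, are assumptions made for this illustration.

```python
from itertools import product


def make_lut(func, n_inputs=3):
    """Build the configuration memory: one stored bit per input combination.

    Entry i holds func(x1, ..., xn) where x1 is the most significant bit of i.
    """
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]


def mux2(sel, a, b):
    """2:1 multiplexer: pass a when sel is 0, b when sel is 1."""
    return b if sel else a


def lut_read(config, inputs):
    """Evaluate the LUT by collapsing the stored bits through a multiplexer tree."""
    level = list(config)
    # The last input selects within adjacent pairs; each stage halves the candidates.
    for sel in reversed(inputs):
        level = [mux2(sel, level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def majority(x1, x2, x3):
    """Example function to load into the LUT: 1 when at least two inputs are 1."""
    return (x1 & x2) | (x1 & x3) | (x2 & x3)


config = make_lut(majority)
print("configuration bits:", config)
for bits in product((0, 1), repeat=3):
    print(bits, "->", lut_read(config, bits))
```

Loading a different function only changes the contents of `config`; the multiplexer tree stays the same, which is the sense in which a LUT is programmed by writing its configuration memory cells rather than by rewiring gates.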