Final project course name hardware software codesign creating a processor system


HCMC UNIVERSITY OF TECHNOLOGY AND EDUCATION

FACULTY OF HIGH QUALITY TRAINING

FINAL PROJECT

Course name: HARDWARE/SOFTWARE CODESIGN

CREATING A PROCESSOR SYSTEM

Lecturer name: List of members:

Ho Chi Minh City, 06/2022


LECTURER COMMENT

1 Introduction

2 Creating a Processor System

3 Creating a Processor System Lab

To complete the Hardware/Software Codesign course, we would like to express our heartfelt gratitude to Assoc. Prof. Phan Van Ca, who enthusiastically guided us and equipped us with the necessary knowledge this semester. Furthermore, he directly guided us and created all the conditions to help us throughout the Hardware/Software Codesign project.

Because of the project's short implementation time and our limited knowledge of the topic, some limitations and errors have not been completely overcome. We look forward to receiving your advice and suggestions.

Student


PART 2 CREATING A PROCESSOR SYSTEM
2.1 Embedded System Design in Zynq using IP Integrator
2.1.1 Embedded Design Architecture in Zynq
2.1.2 The PS and the PL
2.1.3 Vivado
2.2 Creating IP-XACT Hardware Accelerator
2.2.1 Port-Level Interfaces
2.2.2 Interface Modes
2.2.3 Native AXI Slave Lite Interface
2.2.4 Controllable Register Maps in AXI4 Lite
2.2.5 Native AXI4 Master
2.2.6 Burst Accesses Inferred for AXI4 Master
2.2.7 Byte-Enable Accesses on AXI4 Master
2.2.8 AXI4 Port Bundling
2.2.9 AXI4 Stream Interface: Ease of Use
2.2.10 Generate the hardware accelerator
2.2.11 Generated impl Directory
2.3 Integrating the Hardware Accelerator in AXI System

PART 3 CREATING A PROCESSOR SYSTEM LAB
3.1 Create a New Project
3.6 Generate IP-XACT Adapter
3.7 Create a Vivado Project
3.8 Export to SDK and create Application Project
3.9 Verify the Design in Hardware

REFERENCE

FIGURE LIST

Figure 1: The design under consideration
Figure 2: The header file
Figure 3: Initial part of the generated output in the Console view
Figure 4: Generated interface signals
Figure 5: Selecting the AXI4LiteS adapter and naming bundle
Figure 6: Applying bundle to assign y output to AXI4Lite adapter
Figure 7: Export RTL Dialog
Figure 8: IP-XACT adapter generated
Figure 9: Adapter’s drivers directory
Figure 10: Block design made for Pynq
Figure 11: Setting path to IP Repositories
Figure 12: Generated design after IRQ_F2P interface enabled
Figure 13: Generated address map

PART 1 INTRODUCTION

1.1 Introduction

This project presents the process of using Vivado and IP Integrator to create a complete Zynq ARM Cortex-A9 based processor system targeting the ZyBoard Zynq development board. You will use the Block Design feature of IP Integrator to configure the Zynq PS and add IP to create the hardware system, and SDK to create an application to verify the design functionality. It will also guide you through the process of profiling an application and analyzing the output.

1.2 Purpose and requirements

a) Purpose

This lab introduces a design flow that generates an IP-XACT adapter from a design using Vivado HLS and then uses the generated IP-XACT adapter in a processor system built with IP Integrator in Vivado.

b) Requirements

After completing this lab, you will be able to:

- Understand the steps and directives involved in creating an IP-XACT adapter from a synthesized design in Vivado HLS

- Create a processor system using IP Integrator in Vivado

- Integrate the generated IP-XACT adapter into the created processor system

1.3 Layout

The report is divided into 4 parts:

Part 1 Introduction

Part 2 Creating a Processor System

Part 3 Creating a Processor System Lab

Part 4 Conclusion

PART 2 CREATING A PROCESSOR SYSTEM

2.1 Embedded System Design in Zynq using IP Integrator

2.1.1 Embedded Design Architecture in Zynq

Embedded design in Zynq is based on:

– Processor and peripherals

• Dual ARM® Cortex™-A9 processors of Zynq-7000 AP SoC

• AXI interconnect

• AXI component peripherals

• Reset, clocking, debug ports

– Software platform for processing system

• Interrupt service routines (optional)

2.1.2 The PS and the PL

The Zynq-7000 AP SoC architecture consists of two major sections, the processing system (PS) and the programmable logic (PL). The PL:

• Uses the same 7 series programmable logic

– Artix™-based devices: Z-7010, Z-7015, and Z-7020 (high-range I/O banks only)

– Single-core versions: Z-7007S, Z-7012S, and Z-7014S

– Kintex™-based devices: Z-7030, Z-7035, Z-7045, and Z-7100 (mix of high-range and high-performance I/O banks)

2.1.3 Vivado

What are Vivado, IP Integrator and SDK?

– Vivado is the tool suite for Xilinx FPGA design and includes capability for embedded system design

• IP Integrator is part of Vivado and allows block-level design of the hardware part of an embedded system

• Integrated into Vivado

• Vivado includes all the tools, IP, and documentation that are required for designing systems with the Zynq-7000 AP SoC hard core and/or Xilinx MicroBlaze soft core processor

• Vivado + IPI replaces ISE/EDK

– SDK is an Eclipse-based software design environment

• Enables the integration of hardware and software components

• Links from Vivado

Vivado is the overall project manager and is used for developing non-embedded hardware and instantiating embedded systems

– Vivado/IP Integrator flow is recommended for developing Zynq embedded systems

(Figure: Embedded System Design using Vivado)

2.2 Creating IP-XACT Hardware Accelerator

2.2.1 Port-Level Interfaces

The AXI4 interfaces supported by Vivado HLS include:

– The AXI4-Stream (axis)

• Specify on input arguments or output arguments only, not on input/output arguments

– The AXI4 master (m_axi)

• Specify on arrays and pointers (and references in C++) only. You can group multiple arguments into the same AXI4 master interface using the bundle option

– The AXI4-Lite (s_axilite)

• Specify on any type of argument except arrays. You can group multiple arguments into the same AXI4-Lite interface using the bundle option.
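As a rough illustration of the three interface types listed above, the following C sketch attaches Vivado HLS INTERFACE pragmas to the arguments of one function. The function name `scale_to_stream`, its arguments, the length `LEN`, and the bundle name `ctrl` are hypothetical and are not part of this project. A standard C compiler ignores the pragmas, so the function can still be exercised in software.

```c
#include <stdint.h>

#define LEN 8  /* hypothetical number of words, used only in this sketch */

/* Hypothetical top-level function:
 *  - mem    : pointer argument, mapped to an AXI4 master port
 *  - out    : output-only array, mapped to an AXI4-Stream port
 *  - factor : scalar argument, mapped to an AXI4-Lite register */
void scale_to_stream(const int32_t *mem, int32_t out[LEN], int32_t factor)
{
#pragma HLS INTERFACE m_axi     port=mem    depth=8
#pragma HLS INTERFACE axis      port=out
#pragma HLS INTERFACE s_axilite port=factor bundle=ctrl
#pragma HLS INTERFACE s_axilite port=return bundle=ctrl
    int i;
    for (i = 0; i < LEN; i++) {
        out[i] = mem[i] * factor;   /* each output is written once, in order */
    }
}
```

The argument kinds follow the rules above: the stream port is output only, the master port sits on a pointer, and the AXI4-Lite port sits on a scalar.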

2.2.2 Interface Modes

Native AXI Interfaces

– AXI4 Slave Lite, AXI4 Master, AXI Stream supported by INTERFACE directive

• Provided in RTL after Synthesis

• Supported by C/RTL Co-simulation

• Supported for Verilog and VHDL

BRAM Memory Interface

– Identical IO protocol to ap_memory

– Bundled differently in IP Integrator

• Provides easier integration to memories with BRAM interface

2.2.3 Native AXI Slave Lite Interface

Interface Mode: s_axilite

– Supported with INTERFACE directive

– Multiple ports may be grouped into the same Slave Lite interface

• All ports which use the same bundle name are grouped

Grouped Ports

– Default mode is ap_none for input ports

– Default mode is ap_vld for output ports

– Default mode is ap_ctrl_hs for function (return port)

– Default mode can be changed with additional INTERFACE directives

2.2.4 Controllable Register Maps in AXI4 Lite

Assigning offset to array (RAM) interfaces

– Specified value is offset to base of array

– Array’s address space is always contiguous and linear

C Driver Files include offset information

– In generated driver file xhls_sig_gen_bram2axis.h
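To make the idea of a register map more concrete, the sketch below models an AXI4-Lite register file in plain C and accesses it the way a bare-metal application might. The control bits at offset 0x00 (ap_start, ap_done, ap_idle in bits 0 to 2) follow the usual ap_ctrl_hs layout. The argument offsets 0x10 and 0x18, the names `REG_X` and `REG_Y`, the `regs` array that stands in for the hardware, and the toy `model_hardware` function are all assumptions made for illustration. Real offsets should be taken from the generated driver header.

```c
#include <stdint.h>

/* Block-level control register of an ap_ctrl_hs design (offset 0x00). */
#define REG_CTRL      0x00u
#define CTRL_AP_START 0x1u
#define CTRL_AP_DONE  0x2u
#define CTRL_AP_IDLE  0x4u

/* Hypothetical argument offsets; read the real ones from the driver header. */
#define REG_X 0x10u
#define REG_Y 0x18u

/* Simulated register file standing in for the memory-mapped adapter. */
static uint32_t regs[16];

static void reg_write(uint32_t offset, uint32_t value)
{
    regs[offset / 4u] = value;      /* registers are 32 bits wide */
}

static uint32_t reg_read(uint32_t offset)
{
    return regs[offset / 4u];
}

/* Toy hardware model: when ap_start is set, copy x to y and flag done. */
static void model_hardware(void)
{
    if (reg_read(REG_CTRL) & CTRL_AP_START) {
        reg_write(REG_Y, reg_read(REG_X));
        reg_write(REG_CTRL, CTRL_AP_DONE | CTRL_AP_IDLE);
    }
}

/* Driver-style sequence: write input, start the core, poll done, read y. */
uint32_t run_once(uint32_t x)
{
    reg_write(REG_X, x);
    reg_write(REG_CTRL, CTRL_AP_START);
    do {
        model_hardware();   /* on real hardware this loop would only poll */
    } while (!(reg_read(REG_CTRL) & CTRL_AP_DONE));
    return reg_read(REG_Y);
}
```

On a real board, `reg_write` and `reg_read` would be replaced by accesses relative to the adapter's base address, and the toy model would disappear.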


2.2.5 Native AXI4 Master

Interface Mode: m_axi

– Supported with INTERFACE directive

Options

– Multiple ports may be grouped into the same AXI4 Master interface

• All ports which use the same bundle name are grouped

– Depth option is required for C/RTL co-simulation

• Required for pointers, not arrays

• Set to the number of values read/written

– Option to support offset or base address
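The options above can be sketched on a small pointer-based function. The function name `add_const`, the transfer size `N_WORDS`, and the bundle names `gmem` and `ctrl` are hypothetical. The `depth` value is simply set to the number of values read and written through the pointer, as described above. This is a minimal sketch rather than the design used in this project.

```c
#include <stdint.h>
#include <string.h>

#define N_WORDS 16  /* hypothetical transfer size for this sketch */

/* Hypothetical accelerator: add a constant to N_WORDS values held in
 * external memory, reached through an AXI4 master port. */
void add_const(int32_t *mem, int32_t k)
{
#pragma HLS INTERFACE m_axi     port=mem    depth=16 bundle=gmem
#pragma HLS INTERFACE s_axilite port=k      bundle=ctrl
#pragma HLS INTERFACE s_axilite port=return bundle=ctrl
    int32_t buf[N_WORDS];   /* local copy of the data */
    int i;

    memcpy(buf, mem, N_WORDS * sizeof(int32_t));   /* copy in through the master port */
    for (i = 0; i < N_WORDS; i++) {
        buf[i] += k;
    }
    memcpy(mem, buf, N_WORDS * sizeof(int32_t));   /* copy the results back out */
}
```

Copying into a local buffer with memcpy() is a common way to move a whole block over the master port before working on it.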

2.2.6 Burst Accesses Inferred for AXI4 Master

There are two types of accesses on an AXI Master: Single Access and Burst Access

– Burst accesses are more efficient

– Burst access has until now required the use of memcpy()

Burst Accesses are now inferred

– From operations in a for-loop and from sequential operations in the code

– However: there are some limitations

• Single for-loops only, no nested loops

2.2.7 Byte-Enable Accesses on AXI4 Master

Byte-Enable Accesses Support on AXI4 Master Interfaces

– Single bytes are now written and read

– Improved AXI4 Master performance

Improved Performance

– Example: code that uses 8-bit data

• Previously, accessing this data required reading/writing full 32-bit words

• This implied a read-modify-write behavior, which impacted performance

– Similar performance improvement when accessing struct members

• Also often implied read-modify-write behavior

– Improved Port Bundling

• Variables of different sizes can be grouped into the same AXI4 Master port

2.2.8 AXI4 Port Bundling

AXI4 Master and Lite Port Bundling

– The bundle option groups arguments into the same AXI4 port

– For example, three arguments can be grouped into an AXI4 port named “ctrl”

Arguments can be Bundled into AXI4 Master and AXI4 Lite ports

– If no bundle name is used, a default name is used for all arguments

• All go into a single AXI4 Master or AXI4 Lite

• Default name applied if no -bundle option is used

– Group different sized variables into an AXI4 Master port
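The grouping of three arguments into one port named "ctrl" might look like the sketch below. The function name `mac3` and its arguments are hypothetical. Only the use of the same bundle name on every pragma reflects the rule described above.

```c
#include <stdint.h>

/* Hypothetical function: a, b, c and the return value all use the bundle
 * name "ctrl", so they share a single AXI4-Lite port. */
int32_t mac3(int32_t a, int32_t b, int32_t c)
{
#pragma HLS INTERFACE s_axilite port=a      bundle=ctrl
#pragma HLS INTERFACE s_axilite port=b      bundle=ctrl
#pragma HLS INTERFACE s_axilite port=c      bundle=ctrl
#pragma HLS INTERFACE s_axilite port=return bundle=ctrl
    return a * b + c;   /* multiply-accumulate on the three register values */
}
```

Giving one of the arguments a different bundle name would place it on a separate AXI4-Lite port instead.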

2.2.9 AXI4 Stream Interface: Ease of Use

Native Support for AXI4 Stream Interfaces

– Native = An AXI4 Stream can be specified with set_directive_interface

• No longer required to set the interface then add a resource

• This AXI4 Stream interface is part of the HDL after synthesis

• This AXI4 Stream interface is simulated by RTL co-simulation
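A minimal sketch of a function with native AXI4-Stream ports is shown below, written with the pragma form of the interface directive. The function name `running_sum` and the frame length `FRAME` are hypothetical. The main point is that each stream element is read or written exactly once and in order, which suits a streaming port.

```c
#include <stdint.h>

#define FRAME 4  /* hypothetical number of samples handled per call */

/* Hypothetical streaming function: in is an input-only stream and out is an
 * output-only stream; no argument is used as both input and output. */
void running_sum(int32_t in[FRAME], int32_t out[FRAME])
{
#pragma HLS INTERFACE axis port=in
#pragma HLS INTERFACE axis port=out
    int32_t acc = 0;
    int i;
    for (i = 0; i < FRAME; i++) {
        acc += in[i];    /* read each input sample once, in order   */
        out[i] = acc;    /* write each output sample once, in order */
    }
}
```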

2.2.10 Generate the hardware accelerator

Select Solution > Export RTL

Select IP Catalog, System Generator for Vivado or design checkpoint (dcp)

Click on Configuration… if you want to change the version number or other information

– Default is v1_00_a

Click on OK

– The directory (ip) will be generated under the impl folder under the current project directory and current solution

– RTL code will be generated, both for Verilog and VHDL languages in their respective folders

2.2.11 Generated impl Directory

2.3 Integrating the Hardware Accelerator in AXI System

Create a new Vivado project, or open an existing project

Invoke IP Integrator

Construct (modify) the hardware portion of the embedded design by adding the IP-XACT

Create (Update) top level HDL wrapper

Synthesize any non-embedded components and implement in Vivado

Export the hardware description, and launch XSDK

Create a new software board support package and application projects in the XSDK

Compile the software with the GNU cross-compiler in XSDK

Download the programmable logic’s completed bitstream using Xilinx Tools > Program FPGA in XSDK

Use XSDK to download and execute the program (the ELF file).

PART 3 CREATING A PROCESSOR SYSTEM LAB

3.1 Create a New Project

Create a new project in Vivado HLS targeting the xc7z020clg400-1 device.

1 Select Start > Xilinx Design Tools > Vivado HLS 2017.4

A Getting Started GUI will appear.

2 In the Getting Started section, click on Create New Project The New Vivado HLS Project wizard opens.

3 Click Browse… button of the Location field, browse to {labs}\lab4, and then click OK.

4 For Project Name, type fir.prj and click Next.

5 In the Add/Remove Files for the source files, type fir as the function name (the provided source file contains the function, to be synthesized, called fir).

6 Click the Add Files… button, select fir.c and fir_coef.dat files from the {sources}\lab4 folder, and then click Open.

7 Click Next.

8 In the Add/Remove Files for the testbench, click the Add Files… button, select fir_test.c file from the {sources}\lab4 folder and click Open.

9 Click Next.

10 In the Solution Configuration page, leave the Solution Name field as solution1 and make sure the clock period is 8. Leave the Uncertainty field blank.

11 Click on the Part’s Browse button and using the Parts Specify option, select xc7z020clg400-1.

12 Click Finish.

You will see the created project in the Explorer view. Expand the various sub-folders to see the entries under each sub-folder.

13 Double-click on the fir.c under the source folder to open its content in the information pane.

Figure 1: The design under consideration

The FIR filter expects x as a sample input and a pointer to the computed sample output y. Both are of data type data_t. The coefficients are loaded into array c of type coef_t from the file fir_coef.dat located in the current directory. The sequential algorithm is applied and the accumulated value (sample out) is computed in variable acc of type acc_t.
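The behaviour described above can be sketched in plain C as follows. This is not the lab's fir.c: the function name `fir_sketch`, the four-tap length, the coefficient values, and the use of a 64-bit integer in place of the arbitrary-precision accumulator type are all assumptions made so the example compiles with a standard C compiler.

```c
#include <stdint.h>

#define TAPS 4                 /* hypothetical; the lab design uses 59 taps */

typedef int16_t data_t;        /* 16-bit samples                            */
typedef int16_t coef_t;        /* 16-bit coefficients                       */
typedef int64_t acc_t;         /* stand-in for the wider accumulator type   */

/* Hypothetical coefficients; the lab loads its own from fir_coef.dat. */
static const coef_t c[TAPS] = { 1, 2, 3, 4 };

/* Each call consumes one input sample x and produces one output sample *y. */
void fir_sketch(data_t *y, data_t x)
{
    static data_t shift_reg[TAPS];   /* previous samples, newest at index 0 */
    acc_t acc = 0;
    int i;

    for (i = TAPS - 1; i > 0; i--) {
        shift_reg[i] = shift_reg[i - 1];             /* age the samples     */
        acc += (acc_t)shift_reg[i] * (acc_t)c[i];    /* multiply-accumulate */
    }
    shift_reg[0] = x;
    acc += (acc_t)x * (acc_t)c[0];

    *y = (data_t)acc;   /* no output scaling in this sketch */
}
```

Feeding an impulse (one nonzero sample followed by zeros) returns the coefficients one by one, which is a convenient sanity check for a filter of this form.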

14 Double-click on the fir.h in the outline tab to open its content in the information pane.

Figure 2: The header file

The header file includes ap_cint.h so that user-defined data widths (of arbitrary precision) can be used. It also defines the number of taps (N), the number of samples to be generated (in the testbench), and the data types coef_t, data_t, and acc_t. The coef_t and data_t types are short (16 bits). Each 16-bit by 16-bit product needs 32 bits, and since the algorithm iterates (multiply and accumulate) over 59 taps, there is a possible bit growth of 6 bits, hence acc_t is defined as int38. Since acc_t is wider than the sample and coefficient widths, they have to be cast before being used (as in lines 16, 18, and 21 of fir.c).
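The accumulator width quoted above can be checked with a few lines of C. The helper names `ceil_log2` and `acc_width` are hypothetical. The calculation assumes the usual rule that a product of two operands needs the sum of their widths, and that summing n such products can add up to ceil(log2(n)) further bits.

```c
/* Smallest number of bits b with 2^b >= n, that is ceil(log2(n)) for n >= 1. */
static unsigned ceil_log2(unsigned n)
{
    unsigned bits = 0;
    while ((1u << bits) < n) {
        bits++;
    }
    return bits;
}

/* Accumulator width for summing `taps` products of data_bits x coef_bits. */
unsigned acc_width(unsigned data_bits, unsigned coef_bits, unsigned taps)
{
    return data_bits + coef_bits + ceil_log2(taps);
}
```

With 16-bit samples, 16-bit coefficients and 59 taps this gives 16 + 16 + 6 = 38 bits, which matches the int38 type.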

15 Double-click on the fir_test.c under the testbench folder to open its content in the information pane.

Notice that the testbench opens fir_impulse.dat in write mode, and sends an impulse (the first sample being 0x8000).

3.2 Run C Simulation

Run C simulation to observe the expected output.

1 Select Project > Run C Simulation or click the corresponding toolbar button, and click OK in the C Simulation Dialog window.

The testbench will be compiled using the apcc compiler and the csim.exe file will be generated. The csim.exe file will then be executed and the output will be displayed in the console view.

Figure 3: Initial part of the generated output in the Console view

3.3 Synthesize the Design

Synthesize the design with the defaults. View the synthesis results and answer the question listed in the detailed section of this step.

1 Select Solution > Run C Synthesis > Active Solution to start the synthesis process.

2 When synthesis is completed, several report files will become accessible and the Synthesis Results will be displayed in the information pane.

3 The Synthesis Report shows the performance and resource estimates as well as estimated latency in the design.

4 Using the scroll bar on the right, scroll down into the report and answer the following questions.

Estimated clock period: 8 ns

Worst case latency: 175

Number of DSP48E used: 0

Number of BRAMs used: 3

Number of FFs used: 168

Number of LUTs used: 157

5 The report also shows the top-level interface signals generated by the tools.

Figure 4: Generated interface signals

You can see that the design expects the x input as a 16-bit scalar and outputs y via a pointer to 16-bit data. It also has an ap_vld signal to indicate when the result is valid.

Add the PIPELINE directive to the loop and re-synthesize the design. View the synthesis results.

1 Make sure that the fir.c is open in the information view.

2 Select the Directive tab, and apply the PIPELINE directive to the loop.

3 Select Solution > Run C Synthesis > Active Solution to start the synthesis process.

4 When synthesis is completed, the Synthesis Results will be displayed in the information pane.

5 Note that the latency has been reduced to 63 clock cycles. The DSP48 and BRAM consumption remains the same; however, LUT and FF consumption has slightly increased.
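A common first-order model of why pipelining shortens a loop is sketched below. The function names and the example numbers are illustrative only. They are not taken from the synthesis report, and the tool's real figures also include cycles outside the loop.

```c
/* Cycles for a loop whose iterations run strictly one after another. */
unsigned sequential_latency(unsigned trip_count, unsigned iter_latency)
{
    return trip_count * iter_latency;
}

/* Cycles for a pipelined loop: the first iteration takes iter_latency
 * cycles, and each later iteration starts ii cycles after the previous one. */
unsigned pipelined_latency(unsigned trip_count, unsigned iter_latency,
                           unsigned ii)
{
    if (trip_count == 0) {
        return 0;
    }
    return iter_latency + (trip_count - 1) * ii;
}
```

For example, with 10 iterations of 3 cycles each, the model gives 30 cycles when run sequentially and 12 cycles with an initiation interval of 1.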

3.4 Run RTL/C CoSimulation

Run the RTL/C Co-simulation, selecting Verilog. Verify that the simulation passes.

1 Select Solution > Run C/RTL Co-simulation or click the corresponding toolbar button to open the dialog box so the desired simulations can be run.

2 Select the Verilog option and click OK.

Date posted: 14/04/2024, 21:33