Case Study: Batcher’s Bitonic Sorter

Part of the document Reconfigurable computing the theory and practice of FPGA based computation~tqw~ darksiderg (Pages 388 - 392)

Part III: Mapping Designs to Reconfigurable Platforms 275

16.3.1 Case Study: Batcher’s Bitonic Sorter

This section presents the layout specification of a high-speed parallel sorter that would have been difficult to lay out using explicit Cartesian coordinates. We show how to build complex structures incrementally by composing the layout of subcomponents using simple operators. The use of hierarchy achieves complex layout structures that would have been difficult or tedious to produce otherwise and impossible to produce in a compositional manner.

The objective is to build a parallel sorter from a parallel merger, as shown in Figure 16.11. A parallel merger takes two sublists of numbers, each of which is already sorted, and produces a completely sorted list of numbers as its output. All inputs and outputs are transferred in parallel rather than serially. Furthermore, for performance reasons the sorter should have the same floorplan as shown in the figure.

This parallel sorter uses a two-sorter as its building block, which is shown fully placed in Figure 16.12. This circuit has left-to-right dataflow. Although the >=> combinator is also a serial composition combinator, it does not have any layout semantics because it is used to compose wiring circuits (which are not subject to layout directives).

The two-sorter in Figure 16.12 has been carefully designed to have a rectangular footprint because we will want to tile many of these circuits together vertically and horizontally to produce a compact and high-performance sorter network.
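Figure 16.12 gives the two-sorter as fork2 >-> fsT comparator >-> condSwap clk. Its behavior can be sketched in plain Haskell as follows. This is only a functional model, not Lava: the clock and all layout semantics are dropped, >-> is modeled as left-to-right function composition, and the bodies of fork2, fsT, comparator, and condSwap are assumptions chosen to match the description of a two-sorter rather than the library's actual definitions.

```haskell
-- Behavioral sketch only: plain Haskell functions, no clock, no layout.
infixl 1 >->

-- Left-to-right serial composition (layout semantics are not modeled).
(>->) :: (a -> b) -> (b -> c) -> a -> c
f >-> g = g . f

-- Duplicate a value so it can feed two consumers.
fork2 :: a -> (a, a)
fork2 x = (x, x)

-- Apply a function to the first component of a pair only.
fsT :: (a -> b) -> (a, c) -> (b, c)
fsT f (x, y) = (f x, y)

-- Assumed behavior: True when the pair is out of order.
comparator :: Ord a => (a, a) -> Bool
comparator (a, b) = a > b

-- Assumed behavior: exchange the pair when the select signal is True.
condSwap :: (Bool, (a, a)) -> (a, a)
condSwap (sel, (a, b)) = if sel then (b, a) else (a, b)

-- Smaller value first, larger value second.
twoSorter :: Ord a => (a, a) -> (a, a)
twoSorter = fork2 >-> fsT comparator >-> condSwap
```

In this model, twoSorter (7, 3) evaluates to (3, 7), and a pair that is already in order passes through unchanged.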

Another important combinator we will use in our sorter design is the two combinator, which makes two copies of a circuit r, one of which works on the bottom half of the input and the other on the top half of the input, as illustrated in Figure 16.13. Furthermore, the second copy of r should be placed vertically on top of the first copy. The two combinator can be defined as

two r = halve >-> par [r,r] >-> unhalve

which says halve the input, use two copies of r in parallel (stacked vertically) on the halved input, and then take the result and unhalve it.
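To make the data movement concrete, here is a minimal list-based sketch of the same definition in plain Haskell. It is a behavioral model under stated assumptions, not the Lava implementation: a bus is a list, halve and unhalve are assumed to split and rejoin the list, par is assumed to apply the i-th circuit to the i-th sub-bus, and the vertical stacking that par implies in Lava is not represented at all.

```haskell
-- Behavioral list model only: no layout, no hardware.
infixl 1 >->

-- Left-to-right serial composition (layout semantics are not modeled).
(>->) :: (a -> b) -> (b -> c) -> a -> c
f >-> g = g . f

-- Assumed behavior: split a bus into its bottom and top halves.
halve :: [a] -> [[a]]
halve xs = [take h xs, drop h xs]
  where h = length xs `div` 2

-- Assumed behavior: inverse of halve, concatenating the sub-buses.
unhalve :: [[a]] -> [a]
unhalve = concat

-- Assumed behavior: apply the i-th circuit to the i-th sub-bus.
par :: [a -> b] -> [a] -> [b]
par = zipWith ($)

-- The definition from the text, unchanged.
two :: ([a] -> [b]) -> [a] -> [b]
two r = halve >-> par [r, r] >-> unhalve
```

For example, two reverse [1,2,3,4,5,6,7,8] evaluates to [4,3,2,1,8,7,6,5]: each half is reversed independently and the halves keep their positions.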

FIGURE 16.11 The recursive structure of a sorter.

twoSorter clk = fork2 >-> fsT comparator >-> condSwap clk

FIGURE 16.12 Two-sorter layout and behavior specification.

FIGURE 16.13 The two combinator.

Interleave (ilv) is another combining form that uses two copies of the same circuit. This combinator has the property that the bottom circuit processes the inputs at even positions and the top circuit processes the inputs at odd positions.

It can be defined as

ilv r = unriffle >-> two r >-> riffle

An instance of ilv r for an 8-input bus is shown in Figure 16.14. The related evens combinator chops the input list into pairs and then applies a copy of the same circuit to each pair.
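The wiring performed by ilv and evens can be sketched with lists in plain Haskell. This is a behavioral model only, with no layout: the bodies of unriffle, riffle, and chop are assumptions that follow the description above (even positions go to the bottom copy, odd positions to the top copy), and two is written directly rather than through halve and unhalve.

```haskell
-- Behavioral list model only: no layout, no hardware.
infixl 1 >->

-- Left-to-right serial composition (layout semantics are not modeled).
(>->) :: (a -> b) -> (b -> c) -> a -> c
f >-> g = g . f

-- Same behavior as halve >-> par [r,r] >-> unhalve, written directly.
two :: ([a] -> [b]) -> [a] -> [b]
two r xs = r bottom ++ r top
  where (bottom, top) = splitAt (length xs `div` 2) xs

-- Assumed behavior: even positions first, then odd positions.
unriffle :: [a] -> [a]
unriffle xs = [x | (i, x) <- ixs, even i] ++ [x | (i, x) <- ixs, odd i]
  where ixs = zip [0 :: Int ..] xs

-- Assumed behavior: interleave the two halves (inverse of unriffle
-- for even-length lists).
riffle :: [a] -> [a]
riffle xs = concat (zipWith (\b t -> [b, t]) bottom top)
  where (bottom, top) = splitAt (length xs `div` 2) xs

-- The definition from the text, unchanged.
ilv :: ([a] -> [b]) -> [a] -> [b]
ilv r = unriffle >-> two r >-> riffle

-- Hypothetical helper: cut a list into chunks of n elements (n > 0).
chop :: Int -> [a] -> [[a]]
chop _ [] = []
chop n xs = take n xs : chop n (drop n xs)

-- Apply a copy of r to each adjacent pair of inputs.
evens :: ([a] -> [b]) -> [a] -> [b]
evens r = chop 2 >-> map r >-> concat
```

With these definitions, unriffle [0,1,2,3,4,5,6,7] gives [0,2,4,6,1,3,5,7], and evens reverse [1,2,3,4] gives [2,1,4,3].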

Given these ingredients, we can give a recursive description of a parallel merger butterfly circuit:

bfly r 1 = r

bfly r n = ilv (bfly r (n-1)) >-> evens r

FIGURE 16.14 The ilv combinator.

FIGURE 16.15 A bitonic merger.

A bitonic merger of degree 3 is shown in Figure 16.15, which not only describes how to compose the behavior of elements to form a merger circuit, but also specifies the layout of the merger circuit using algebraic layout specifications.

This circuit is a bitonic merger that can merge its inputs as long as the two halves of the input are sorted in opposite orders (one increasing, the other decreasing), as shown in the figure.
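The bfly recursion given earlier can be exercised on a bitonic input with a small list model in plain Haskell. This is a sketch under assumptions, not the Lava code: layout is ignored, the wiring helpers are modeled as list permutations, and twoSorter here is a hypothetical list version of the two-sorter that orders a two-element list.

```haskell
-- Behavioral list model only: no layout, no hardware.
infixl 1 >->

-- Left-to-right serial composition (layout semantics are not modeled).
(>->) :: (a -> b) -> (b -> c) -> a -> c
f >-> g = g . f

-- Two copies of r, one per half of the bus.
two :: ([a] -> [a]) -> [a] -> [a]
two r xs = r bottom ++ r top
  where (bottom, top) = splitAt (length xs `div` 2) xs

-- Even positions first, then odd positions.
unriffle :: [a] -> [a]
unriffle xs = [x | (i, x) <- ixs, even i] ++ [x | (i, x) <- ixs, odd i]
  where ixs = zip [0 :: Int ..] xs

-- Interleave the two halves.
riffle :: [a] -> [a]
riffle xs = concat (zipWith (\b t -> [b, t]) bottom top)
  where (bottom, top) = splitAt (length xs `div` 2) xs

ilv :: ([a] -> [a]) -> [a] -> [a]
ilv r = unriffle >-> two r >-> riffle

-- Apply r to each adjacent pair of inputs.
evens :: ([a] -> [a]) -> [a] -> [a]
evens r (x : y : rest) = r [x, y] ++ evens r rest
evens _ xs = xs

-- Hypothetical list version of the two-sorter.
twoSorter :: Ord a => [a] -> [a]
twoSorter [a, b] = [min a b, max a b]
twoSorter xs = xs

-- The butterfly from the text: degree n handles 2^n inputs.
bfly :: ([a] -> [a]) -> Int -> [a] -> [a]
bfly r 1 = r
bfly r n = ilv (bfly r (n - 1)) >-> evens r
```

For the bitonic input [1,3,5,7,8,6,4,2] (increasing first half, decreasing second half), bfly twoSorter 3 produces [1,2,3,4,5,6,7,8]. The model makes no promise for inputs that are not bitonic.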

FIGURE 16.16 Sorter recursion and layout for 8 inputs.

Now that we have our merger, we can recursively unfold the pictorial specification of the sorter layout to produce the design and layout in Figure 16.16 (for 8 inputs). This layout can be specified using the following combinators:

sortB cmp 1 = cmp
sortB cmp n
  = two (sortB cmp (n-1)) >->
    pair >-> snD reverse >-> unpair >->
    butterfly cmp n

In the figure the description uses two subsorters to produce a bitonic input for a merger (shown on the right).
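The sorter recursion can be checked end to end with a list model in plain Haskell. As before, this is a behavioral sketch under assumptions rather than the Lava implementation: layout is ignored, pair is assumed to split the bus into a pair of halves, snD to act on the second component, unpair to rejoin the halves, and butterfly is taken to be the bfly recursion defined earlier. The twoSorter used as cmp is a hypothetical list version of the two-sorter.

```haskell
-- Behavioral list model only: no layout, no hardware.
infixl 1 >->

(>->) :: (a -> b) -> (b -> c) -> a -> c
f >-> g = g . f

-- Assumed behavior: split the bus into (bottom half, top half).
pair :: [a] -> ([a], [a])
pair xs = splitAt (length xs `div` 2) xs

-- Assumed behavior: rejoin the two halves.
unpair :: ([a], [a]) -> [a]
unpair (bottom, top) = bottom ++ top

-- Apply a function to the second component of a pair only.
snD :: (b -> c) -> (a, b) -> (a, c)
snD f (x, y) = (x, f y)

two :: ([a] -> [a]) -> [a] -> [a]
two r xs = r bottom ++ r top
  where (bottom, top) = pair xs

unriffle :: [a] -> [a]
unriffle xs = [x | (i, x) <- ixs, even i] ++ [x | (i, x) <- ixs, odd i]
  where ixs = zip [0 :: Int ..] xs

riffle :: [a] -> [a]
riffle xs = concat (zipWith (\b t -> [b, t]) bottom top)
  where (bottom, top) = pair xs

ilv :: ([a] -> [a]) -> [a] -> [a]
ilv r = unriffle >-> two r >-> riffle

evens :: ([a] -> [a]) -> [a] -> [a]
evens r (x : y : rest) = r [x, y] ++ evens r rest
evens _ xs = xs

-- Assumed to be the bfly recursion from earlier in the text.
butterfly :: ([a] -> [a]) -> Int -> [a] -> [a]
butterfly r 1 = r
butterfly r n = ilv (butterfly r (n - 1)) >-> evens r

-- Hypothetical list version of the two-sorter.
twoSorter :: Ord a => [a] -> [a]
twoSorter [a, b] = [min a b, max a b]
twoSorter xs = xs

-- The sorter from the text: degree n sorts 2^n inputs.
sortB :: ([a] -> [a]) -> Int -> [a] -> [a]
sortB cmp 1 = cmp
sortB cmp n =
  two (sortB cmp (n - 1)) >->
  pair >-> snD reverse >-> unpair >->
  butterfly cmp n
```

For instance, sortB twoSorter 3 [5,1,8,3,2,7,6,4] evaluates to [1,2,3,4,5,6,7,8]. Reversing the top half after the two subsorters is what turns two increasing runs into the bitonic sequence that the merger expects.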

The 8-input description can be evaluated to produce an EDIF or VHDL netlist containing RLOC specifications for every gate. The FPGA layout of a degree-5 sorter (32 inputs) with 16-bit numbers is shown in Figure 16.17 on a Xilinx Virtex-II device. Figure 16.18 shows the result for the same netlist with the layout information removed. The netlist with the layout information leads to an implementation that is approximately 50 percent faster, and for a 64-input sorter the speed improvement is 75 percent.

The case study just outlined shows how a complicated and recursive layout can be described in a feasible manner using algebraic layout combinators rather than explicit Cartesian coordinates.

16.4 LAYOUT VERIFICATION FOR PARAMETERIZED DESIGNS

A common problem with parameterized layout descriptions (especially those based on Cartesian coordinates) is that designer errors can produce bad layouts that cannot be realized on the target FPGAs; for example, the layout specification may try to map too many logic gates into the same location. Such errors
