OPTIMIZATION OF THE SORTING NETWORK ARCHITECTURE FOR HARDWARE IMPLEMENTATION OF ROF

Nivedita S Nellur, Sheelavathi T Somaraddi
nivedita_nellur@yahoo.co.in, sheela_smart@yahoo.co.in
B V B College of Engineering and Technology, Hubli-31

Abstract

In this paper, we present an optimized architecture for the implementation of rank order filters (ROF) and compare it with various existing techniques. The architecture is based on a sorting network algorithm optimized for each rank, and it uses significantly fewer comparators than existing architectures based on bubble sort, quick sort, the bit-sum comparator architecture and Booth's algorithm [7], [8], [13], [14]. The reduction in comparators is obtained by sorting the columns of the window only once and then merging the elements according to the range of ranks that the column-sorted elements can occupy. The design is further optimized using parallelism and pipelining: the sorted columns of one window are stored and reused when processing the successive overlapping windows [13], [14]. The pipelined architecture processes one pixel per clock cycle, so a 256 x 256 image takes about 0.65 ms with a 100 MHz clock, which makes it suitable for real-time applications.

Introduction

Modern image processing algorithms require high computational capability, especially when high-resolution images have to be processed under real-time requirements. As image sizes and bit depths grow, software has become less useful in the image processing realm [1], [2]. In addition, a common problem is dealing with the large amount of data captured by satellites and ground-based detection systems.

Nonlinear filters have found extensive application in digital image processing for smoothing noisy images [3]. Rank order filters belong to a class of nonlinear moving-window filters that perform better in situations where linear filters fail [a]. The output of the rth rank order filter is given by $[X_1, X_2, \ldots, X_N; r] = X_{(r)}$, i.e. the rth order statistic of the samples $X_1, X_2, \ldots, X_N$ in the filter window. Special cases of this filter class are the median (r = k+1, where N = 2k+1) and the morphological operations (r = 1 and r = N). Rank order filters introduce a bias towards small values if r < k+1 and towards large values if r > k+1. The strength of the bias depends on the value of r and on the input distribution. ROF adapt better to different noise distributions and effectively suppress impulse noise and high-frequency noise without destroying edge information. Morphological filters are useful in many areas of image processing such as skeletonization, edge detection, restoration and texture analysis.

Sorting algorithms suitable for software implementations are tree sort, shell sort and quick sort [4]. Several hardware designs have used the threshold decomposition technique and bit-sliced algorithms, which are better suited to pixel values of relatively small resolution. Network-based architectures for hardware implementations of ROF have also been suggested [7].

BVB-IEEE ‘06 SAE-INDIA BVB Collegiate Club

The proposed architecture requires fewer comparators than existing sorting-based architectures of ROF for hardware implementation [7, 8, 9]. In the proposed algorithm the elements of the window are sorted column-wise. The sorted column elements are then merged into a new set for further sorting. The merging depends on the desired rank of the ROF and on the range of ranks that the column-sorted elements can occupy. A further reduction in the number of comparators is obtained by using parallelism and pipelining.

The rest of the paper is organized as follows. Section 2 explains the proposed sorting algorithm. The Implementation section covers the hardware implementation, the Results section reports the results for various ranks of the ROF and noise distributions, and the Conclusion is followed by the list of references.

Proposed sorting algorithm

Existing sorting algorithms
used for ROF are based on arranging the elements in ascending or descending order and then selecting the output according to the rank. In the proposed algorithm the sorting is modified and optimized depending on the rank of the required output. Sorting networks are special cases of general sorting algorithms in which the sequence of comparisons is fixed in advance and does not depend on the data. Our proposed algorithm builds a sorting network that is range dependent, which reduces the number of elements contending for the second stage of sorting. The reduction in the number of comparators relative to conventional methods also depends on the desired rank of the output. Practical sorting networks have a complexity of $O(n \log^2 n)$ comparators, whereas we have achieved the lower bound of $O(n \log n)$ comparators [4], [10], [11].

We explain the proposed algorithm for a 3x3 filtering window:

1) The elements of each window are read and sorted column-wise. The sorted elements are stored in registers to allow pipelining when the next consecutive window is processed. The sorted elements occupy the rank ranges rmax-min, as denoted in fig. 3.
2) The elements occupying equal rank ranges are merged and sorted, and the range of ranks of the sorted elements is reduced further.
3) The inputs and the number of comparators in the last stage of sorting depend on the desired rank of the output.

Manthan

Implementation

Implementation of the filter involves two main stages: i) windowing and ii) sorting and rank-based output selection.

Windowing

As shown in fig. 1, the image is read serially through the serial-in block. The window is then extracted from the image using two FIFOs, each of size equal to the width of the image. The first window is obtained after reading the first two rows plus three pixels of the third row. For a 256-pixel-wide image, no window appears during the first 514 clock cycles (2 x 256 + 2); this is the initial latency. From then on, a window is formed on every clock cycle. As only three
elements, corresponding to one column of the moving window, are sorted at a time, they are stored in registers in sorted order. These registers reduce the number of comparisons for the next windows, because six elements (two columns) of the next window have already been sorted once.

Sorting

The sorting network is implemented for a 3x3 window and involves three stages. Stage I consists of a single three-element sorter (S11), shown in fig. 3. The output elements of stage I that occupy equal rank ranges are merged and form the inputs to the stage II sorters. Stage II consists of sorters whose structure depends on the desired rank. The max-min rank range is further reduced after stage II sorting.

Fig. 1: Windowing operation
Fig. 2: Reading pixels of an image to form a window
Fig. 3: Stage I sorter S11 with sorted elements (rank range) stored in registers
Fig. 4.1: The general blocks for the sorting stages
Fig. 4.2: Network for Order

The general diagram showing the sorting stages for all ranks using a 3x3 window is given in fig. 4.1. Fig. 4.2 gives the stages of the median filter sorting, that is, rank 5. The final stage (Stage III) consists of sorters whose number and size depend on the desired rank. The number of comparators required for the different ranks is tabulated in Table 1.

Results

Table 1: Comparison of number of comparators required
Table 2: Device utilization summary for the design implementation using 2s600efg676-7

Table 3: Performance comparison of ROF

Noise                                    Metric   Order One
Gaussian (mean = 0, variance = 0.001)    MSE      911.16
                                         MAE      21.51
Salt & pepper (density = 0.02)           MSE      1913.50
                                         MAE      23.45
Speckle (variance = 0.001)               MSE      543.12
                                         MAE      12.87

Orders Five and Nine (values in source order): 53.81, 687.98, 5.12, 20.45, 40.6, 2519.00, 3.38, 60.0, 5.16, 25.80, 559.67, 16.63

Figure 5: Performance of ROF
Fig. 5: Graph of comparison of comparators required versus ranks for different ROF architectures [13], [14]
Fig. 6: Graph of comparison of slices required versus ranks for different ROF architectures [13], [14]

The proposed rank-based sorting architecture for ROF was implemented for an image of size 256x256 and a window of size 3x3. The ROF architecture was simulated and synthesized using Modelsim and Xilinx ISE 7.1 WebPACK. Table 1 compares the number of comparators required in the sorting network to find the element of rank r in a 3x3 window, for the proposed algorithm and the sorting algorithms of [4], [5]. The design was implemented on the 2s600efg676-7 device, and the device utilization summary is shown in Table 2. The proposed architecture is compared with the bit-sum sorting and Booth's algorithm architectures. A bar graph of the number of comparators is plotted in fig. 5, and fig. 6 gives the comparison with respect to the slices required by these architectures. The performance of the ROF was studied for different types and densities of noise, and the results are tabulated in Table 3. The performance was also studied for different ranks and densities of salt and pepper noise and is plotted in the figure captioned Performance of ROF.

Conclusion

An optimized architecture based on a sorting network algorithm for the implementation of ROF is presented. The algorithm is optimized for each rank and uses significantly fewer comparators than existing architectures based on bubble sort and quick sort. The design is further optimized using parallelism and pipelining: the sorted columns of one window are stored and reused when processing the successive overlapping windows. From Table 3 it can be observed that the ROF gives optimum performance for impulse noise and acceptable results for Gaussian and speckle noise when rank of the filter is
five. From the figure it is observed that when the noise distribution is not symmetric, i.e., when there are unequal positive and negative impulses, the use of suitable lower or higher ranks provides more robust performance than median filtering. The pipelined architecture processes one pixel per clock cycle, so a 256 x 256 image (65,536 pixels) takes about 0.65 ms with a 100 MHz clock, which makes it suitable for real-time applications.

References

[1] A. K. Jain, Fundamentals of Digital Image Processing, PHI, 7th Indian Edition, July 2001.
[2] Anthony Edward Nelson, "Implementation of Image Processing Algorithms on FPGA Hardware", Graduate School of Vanderbilt University.
[3] Jaakko Astola and Pauli Kuosmanen, Fundamentals of Nonlinear Digital Filtering, CRC Press, 2002.
[4] B. I. Justusson, "Median Filtering: Statistical Properties", in Topics in Applied Physics, Two-Dimensional Digital Signal Processing II, T. S. Huang, Ed., Berlin: Springer, 1981.
[5] Bruce A. Draper, J. Ross Beveridge, A. P. Willem Bohm, Charles Ross and Monica Chawathe, "Accelerated Image Processing on FPGAs", IEEE Trans. Image Proc., vol. 12, no. 12, Dec. 2003.
[6] C. Chakrabarti, "Sorting Network Based Architectures for Median Filters", IEEE Trans. on Circuits and Systems, pp. 723-72, Nov. 1993.
[7] Donald E. Knuth, "Sorting and Searching", The Art of Computer Programming, Vol. 3, Addison-Wesley Publishing Company.
[8] Francisco Cardells-Tormo and Pep-Lluis Molinet, "Area-Efficient 2-D Shift-Variant Convolvers for FPGA-Based Digital Image Processing", IEEE Trans. on Circuits and Systems, vol. 53, no. 2, Feb. 2006.
[9] K. Oflazer, "Design and implementation of a single chip 1-D median filter", IEEE Trans. on Acoust., Speech, and Signal Processing, vol. ASSP-30, 1983, pp. 1164-1168.
[10] Khalid F D Alotaibi, A High Level Hardware Description Environment for FPGA-Based Image Processing Applications, The Queen's University of Belfast, May 1999.
[11] L. Lucke and K. Parhi, "Parallel Processing Architectures for Rank Order and Stack Filters", IEEE Trans. on Circuits and Systems, pp. 723-72, Nov. 1993.
[12] Meena S M and Linganagouda Kulkarni, "An efficient Architecture for Hardware implementation of weighted Median Filter", Proceedings of the international conference on cognition and recognition, pp. 147-154, Allied Publishers Pvt.
[13] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Education Asia Pte Ltd, 5th Indian Reprint, 2000.
[14] Rama Archana, Meena S M and Linganagouda Kulkarni, "An efficient Architecture for Rank Order Filter (ROF)", ADCOM 2005.
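
Appendix: illustrative software sketch

The column-sort-and-merge scheme of the Proposed sorting algorithm section, and the latency and timing figures quoted in the Windowing section and the Conclusion, can be modelled in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' hardware design. The names sort3, median9_from_sorted_columns, median_filter_3x3, initial_latency_cycles and frame_time_ms are hypothetical. Only the rank 5 (median) case is shown, using the well-known identity that the median of nine values equals the median of (largest column minimum, median of column medians, smallest column maximum). Border pixels are simply skipped.

```python
from collections import deque


def sort3(a, b, c):
    """Three-element sorter built from three compare-exchange steps."""
    if a > b:
        a, b = b, a
    if b > c:
        b, c = c, b
    if a > b:
        a, b = b, a
    return a, b, c


def median9_from_sorted_columns(cols):
    """Median of a 3x3 window given its three sorted columns (min, mid, max).

    Assumption: this uses the classic median-of-nine identity, which may
    differ in detail from the stage II and final-stage sorters of the paper.
    """
    lows, mids, highs = zip(*cols)          # merge elements of equal rank range
    max_of_lows = sort3(*lows)[2]
    mid_of_mids = sort3(*mids)[1]
    min_of_highs = sort3(*highs)[0]
    return sort3(max_of_lows, mid_of_mids, min_of_highs)[1]


def median_filter_3x3(image):
    """Median-filter the interior of a 2-D list of pixel values.

    Within one row pass, each column triple is sorted once and then reused
    by the three overlapping windows that contain it.
    """
    height, width = len(image), len(image[0])
    out = []
    for r in range(1, height - 1):
        cols = deque(maxlen=3)              # registers holding sorted columns
        row_out = []
        for c in range(width):
            cols.append(sort3(image[r - 1][c], image[r][c], image[r + 1][c]))
            if len(cols) == 3:
                row_out.append(median9_from_sorted_columns(cols))
        out.append(row_out)
    return out


def initial_latency_cycles(width, window=3):
    """Clock cycles before the first full window is available."""
    return (window - 1) * width + (window - 1)


def frame_time_ms(width, height, clock_hz):
    """Time to stream one frame at one pixel per clock cycle, in ms."""
    return width * height / clock_hz * 1e3
```

For a 256-pixel-wide image, initial_latency_cycles(256) gives the 514-cycle latency quoted in the Windowing section, and frame_time_ms(256, 256, 100e6) gives roughly 0.655 ms, in line with the 0.65 ms figure in the paper.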