UAV Based Distributed Automatic Target Detection Algorithm under Realistic Simulated Environmental Effects

Shanshan Gong

A Thesis submitted to the College of Engineering and Mineral Resources at West Virginia University in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering

Natalia A. Schmid, D.Sc., Chair
Matthew C. Valenti, Ph.D.
Xin Li, Ph.D.

Lane Department of Computer Science and Electrical Engineering
Morgantown, West Virginia
2007

Keywords: Automatic Target Detection, Data Fusion, Swarmed UAV
© Copyright 2007 by Shanshan Gong. All Rights Reserved.

ABSTRACT

UAV Based Distributed Automatic Target Detection Algorithm under Realistic Simulated Environmental Effects

Shanshan Gong

Over the past several years, the military has grown increasingly reliant upon unmanned aerial vehicles (UAVs) for surveillance missions. There is an increasing trend towards fielding swarms of UAVs operating as large-scale sensor networks in the air [1]. Such systems tend to be used primarily to acquire sensory data with the goal of automatically detecting, identifying, and tracking objects of interest. These trends have been paralleled by advances in distributed detection [2], image/signal processing, and data fusion techniques [3]. Furthermore, swarmed UAV systems must operate under severe constraints imposed by environmental conditions and sensor limitations. In this work, we investigate the effects of environmental conditions on target detection performance in a UAV network. We assume that each UAV is equipped with an optical camera, and use a realistic computer simulation to generate synthetic images. The automatic target detector is a cascade of classifiers based on Haar-like features. The detector's performance is evaluated using simulated images that closely mimic data acquired in a UAV network under realistic camera and environmental conditions. To improve automatic target detection (ATD) performance in a swarmed UAV system, we propose and design several fusion techniques at both the image and score levels, and analyze both the case of a single observation and the case of multiple observations of the same target.

Acknowledgements

First, I would like to thank Dr. Natalia Schmid for being such a patient and understanding thesis advisor. Her foresight, intuition, and care were instrumental in shaping this work. I have learned so much from her since I joined the Statistical Signal Processing Lab at West Virginia University. I also would like to thank my graduate committee members, Dr. Xin Li and Dr. Matthew Valenti, for their expert advice and support of my study and thesis. I must thank Xiaohan for her seemingly infinite supply of ideas and support for this work. I also thank Jinyu, Nathan, and Francesco for their support and discussions, which helped me so much in my study and research. Lastly, I thank my parents and my boyfriend Lei for always supporting my choices. If I may, I would also like to take this moment to thank the many great teachers, mentors, and friends that I have had the pleasure to interact with over the past two years.

Contents

Acknowledgements

1 Introduction
  1.1 Background and Motivation
  1.2 Challenges
  1.3 Literature Review
    1.3.1 Swarmed UAVs
    1.3.2 Automatic Target Detection
    1.3.3 Data Fusion
  1.4 Organization

2 Single-frame Automatic Target Detection
  2.1 Haar-like Features
  2.2 AdaBoost Learning
  2.3 Classifier Cascade
  2.4 Performance Evaluation

3 Multi-frame Automatic Target Detection
  3.1 Image-level Data Fusion for Improved ATD
    3.1.1 Super-resolution for Improved ATD
    3.1.2 Image Mosaicking for Improved ATD
  3.2 Score-level Data Fusion for Improved Detection

4 Numerical Results
  4.1 Database Description
    4.1.1 Simulated Optical Data Set
    4.1.2 Simulated Environmental and Camera Distortions
    4.1.3 Data for Testing the Effect of Occlusion
  4.2 Results: Single-frame Detector
    4.2.1 Learning Results of Single-frame Detector
    4.2.2 Influence of Environmental and Camera Effects on Detection Performance
    4.2.3 Influence of Occlusion on Detection Performance
  4.3 Results: Multiple-frame Detector
    4.3.1 Detection Performance: Super-Resolution for Improved ATD
    4.3.2 Detection Performance: Image Mosaicking for Improved ATD
    4.3.3 Detection Performance: Score-level Data Fusion

5 Conclusion and Future Work

Bibliography
List of Tables

2.1 The AdaBoost Algorithm for Classifier Learning [4]
4.1 Training Parameters of Single-frame Detector
4.2 Summary of the High-Resolution and Low-Resolution Detectors Used in Our Experiments

List of Figures

2.1 Extended integral feature set [5]. The sum of the pixels lying within the white rectangles is subtracted from the sum of the pixels in the black rectangles.
2.2 Stage Classifier
2.3 Cascade of Classifiers
3.1 Basic Premise for Super-Resolution
3.2 Super-Resolution Observation Model
3.3 A Block Diagram of the Interpolation-based Approach
4.1 Model for Capturing Images
4.2 The GUI of the ATR Training Tool
4.3 Example of Target Images Used for Training
4.4 Example of Non-target Images Used for Training
4.5 Example of Testing Images
4.6 Examples of Distorted Testing Images
4.7 Examples of Targets Occluded by a Single Block
4.8 Examples of Targets Occluded by Multiple Blocks
4.9 Examples of Targets Occluded by Trees
4.10 Single-frame Detector Features Selected by AdaBoost (First Four Stages)
4.11 (a)-(e) Detection Performance as Functions of Various Environmental and Camera Effects
4.12 (a)-(e) Performance of Occluded Target Detection
4.13 Example of a Super-resolved Natural Image
4.14 Image-level Data Fusion Results
4.15 Example of Image Mosaicking
4.16 Example Frames for Score-level Data Fusion
4.17 Score-level Data Fusion Results

Chapter 1

Introduction

1.1 Background and Motivation

Automatic target recognition (ATR) involves two main tasks: target detection and target recognition [6]. The purpose of automatic target detection (ATD) is to find regions of interest (ROIs) where a target may be located. By locating ROIs, we can filter out a large amount of background clutter from the terrain scene, making object recognition feasible for large data sets. The ROIs are then passed to a recognition algorithm that identifies targets [6]. Automatic target detection is one of the most critical steps in the ATR problem, since the results of all postprocessing depend on this step.
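As a concrete illustration of this detection step, the sketch below shows how a trained cascade of boosted classifiers (the detector family adopted in Chapter 2, available in the OpenCV library [39]) scans a frame and returns candidate ROIs. It is a minimal sketch only: the cascade file and frame names are hypothetical placeholders, and the parameter values are untuned.

```python
# Minimal sketch: scanning one frame for candidate target ROIs with a
# trained cascade classifier (OpenCV [39]). "target_cascade.xml" and
# "uav_frame.png" are hypothetical placeholder names.
import cv2

cascade = cv2.CascadeClassifier("target_cascade.xml")
frame = cv2.imread("uav_frame.png", cv2.IMREAD_GRAYSCALE)

# Slide the detector over the image at multiple scales; each detection
# is a candidate ROI (x, y, width, height) to pass to a recognizer.
rois = cascade.detectMultiScale(frame, scaleFactor=1.1, minNeighbors=3)
for (x, y, w, h) in rois:
    print(f"candidate target at ({x}, {y}), size {w}x{h}")
```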
ATD/R is performed for the purpose of surveillance, during rescue missions, or during military missions. Sensors positioned on the ground or installed on airplanes, helicopters, ground vehicles, etc., acquire sensory data. The data then have to be processed using automatic detection and recognition algorithms. One of the most secure means of acquiring sensory data (especially during military missions) involves remotely operated vehicles. Remotely operated vehicles can be broadly divided into two categories: unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs). In this thesis, we focus on a distributed network of airborne UAVs used to detect and recognize ground targets.

UAVs traditionally acquire sensory data and send the data to a central location such as a base station, where potential targets are identified using image analysis algorithms [7][8][9]. However, this centralized model for ATD/R has a number of drawbacks, such as limited scalability and network delays in communication with the central location [10]. In developing the next generation of UAVs, one idea is to utilize reactive agents and the associated swarming behavior as part of the command and control system for a group of UAVs functioning cooperatively and independently of ground control. Previous works [10][11][12] demonstrate that this technique provides a suitable mechanism for assimilating the capabilities of individual UAVs into a group of coordinating UAVs that perform ATD/R in a distributed manner. In a swarmed system, multiple mobile entities are directed to converge on a single point of interest, disperse, and regroup again. To achieve distributed ATD/R using swarming, each UAV individually searches for potential targets within an area of interest using its image sensor. As soon as an object sensed by a UAV appears to be a possible target, other UAVs cooperate with it by swarming towards the potential target to collectively perform ATD/R in a data fusion manner and confirm the object as a target.

In contrast to a centralized model for ATD/R, a distributed ATD/R model has the potential to provide minimal user intervention, a high level of robustness, and largely autonomous operation. The distributed ATD/R model can: (1) scale up efficiently in the number of UAVs deployed and in the amount of data collected and processed for ATD/R; (2) improve the efficiency of the system, because most computation is performed locally within each UAV, which in turn reduces the time required to upload image data from the different UAVs to the central location; and (3) reduce the communication between the central location and individual UAVs, so that the system is less susceptible to data loss and failures due to wireless communication problems between an individual UAV and the central location [10]. In this thesis, we explore the possibility of distributed optical camera-based ATD in a swarmed UAV system.
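The collective confirmation step just described can be pictured as score-level fusion: each UAV reports a detection score for the same candidate object, and the swarm combines the scores before declaring a target. Below is a minimal sketch, assuming scores normalized to [0, 1] and using the simple sum (average) rule of [50]; the threshold is an illustrative value, not one from this thesis.

```python
# Minimal sketch of score-level fusion for collective target confirmation.
# Assumes each UAV's detector emits a score in [0, 1]; the sum (average)
# rule [50] is one simple combiner. The threshold 0.5 is illustrative.
from typing import List

def confirm_target(scores: List[float], threshold: float = 0.5) -> bool:
    """Fuse per-UAV detection scores and declare a target if the
    average score clears the threshold."""
    fused = sum(scores) / len(scores)
    return fused >= threshold

# Three UAVs observe the same candidate from different viewpoints.
print(confirm_target([0.9, 0.7, 0.4]))  # True: average 0.67 >= 0.5
```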
[Figure 4.9: Examples of Targets Occluded by Trees]
[Figure 4.10: Single-frame Detector Features Selected by AdaBoost (First Four Stages)]
[Figure 4.11: (a)-(e) Detection performance as functions of various environmental and camera effects. Panels: (a) Illumination, (b) Contrast, (c) Gaussian Noise, (d) Defocused Blur, (e) Motion Blur; each plots detection rate versus false positives for distortion levels 0-4.]
[Figure 4.12: (a)-(e) Performance of occluded target detection. Panels: (a) Single Block (p = 0.2, 0.4, 0.6, 0.8), (b) Multiple Blocks (w = 10, 15, 20, 25), (c)-(e) Tree (number = 1, 2, 3); each plots detection rate versus false positives against the no-occlusion case.]
[Figure 4.13: Example of a super-resolved natural image. (a) LR natural images with large displacements (256 × 256); circle points indicate corresponding control points. (b) LR natural images after first-step registration (235 × 126). (c) Super-resolved HR estimate using four LR images (470 × 252).]
[Figure 4.14: Image-level data fusion results. Detection rate versus false positives for the LR-trained detector on the LR testing database, the HR-trained detector on the HR testing database, and the LR-trained detector on super-resolved testing databases with K = 2, 4, 8.]
[Figure 4.15: Example of Image Mosaicking]
[Figure 4.16: Example Frames for Score-level Data Fusion]
[Figure 4.17: Score-level data fusion results. Detection rate versus false positives for score-level fusion compared with the single-frame detector.]

Chapter 5

Conclusion and Future Work

In this work we advanced the state of the art in automatic target detection in the following ways. We performed a comprehensive survey of currently available ATD algorithms for optical imagery. The adopted detector is a modified version of the Viola-Jones face detector, a cascade of classifiers based on Haar-like features. To improve detection performance in a swarmed UAV network, we proposed and analyzed several data fusion algorithms at different levels.

First, following the Viola-Jones face detector, we trained our automatic target detector to operate on a single image. In our system, we focus only on optical images captured by cameras mounted on board the UAVs. Such images suffer from sensor limitations and from environmental and camera effects. To test the robustness of the detector with respect to non-ideal imagery, we synthesized datasets distorted by individual weather and camera effects. The effects include Gaussian noise, lighting, contrast, motion blur, defocus blur, and occlusion. The degradation of the detector's performance due to these effects was then evaluated.
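For illustration, distortions of the kind listed above can be synthesized roughly as in the following sketch; the parameter values (noise level, kernel sizes, occlusion block position) are placeholders, not the settings used to generate the thesis database.

```python
# Rough sketch of synthesizing the individual distortions studied here.
# All parameter values and file names are illustrative placeholders.
import cv2
import numpy as np

img = cv2.imread("clean_frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Illumination / contrast: linear pixel transform g = alpha * f + beta.
brighter = np.clip(img + 40, 0, 255)              # illumination shift
low_contrast = np.clip(0.5 * (img - 128) + 128, 0, 255)

# Additive Gaussian noise.
noisy = np.clip(img + np.random.normal(0, 15, img.shape), 0, 255)

# Defocus blur: approximated here with a Gaussian kernel.
defocused = cv2.GaussianBlur(img, (9, 9), 0)

# Horizontal motion blur: convolution with a normalized line kernel.
k = np.zeros((9, 9), np.float32)
k[4, :] = 1.0 / 9.0
motion_blurred = cv2.filter2D(img, -1, k)

# Occlusion: overwrite a block of pixels (placeholder position and size).
occluded = img.copy()
occluded[60:100, 80:140] = 0
```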
In the second part of the work, we proposed several data fusion methods for improved detection performance. In the first scenario, a super-resolution technique was employed for low-resolution image data. In the second scenario, an image mosaicking technique was applied to improve detector performance when images contain partially occluded objects. In the third scenario, a score-level data fusion technique was applied to encoded data.
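As a rough picture of the first scenario, the sketch below follows the interpolation-based idea of Section 3.1.1 in its crudest form: already-registered low-resolution frames are interpolated onto a common high-resolution grid and fused by averaging. Registration and restoration are omitted, and the frame names and factor K = 2 are illustrative assumptions, not the thesis's actual pipeline.

```python
# Minimal sketch of interpolation-based super-resolution: project
# registered LR frames onto an HR grid, then fuse by averaging.
# File names and K are placeholders.
import cv2
import numpy as np

lr_frames = [cv2.imread(f, cv2.IMREAD_GRAYSCALE).astype(np.float32)
             for f in ("lr0.png", "lr1.png", "lr2.png", "lr3.png")]

K = 2  # upsampling factor
h, w = lr_frames[0].shape
# Interpolate each registered LR frame onto the HR grid, then average.
hr_stack = [cv2.resize(f, (K * w, K * h), interpolation=cv2.INTER_CUBIC)
            for f in lr_frames]
hr_estimate = np.mean(hr_stack, axis=0).astype(np.uint8)
cv2.imwrite("hr_estimate.png", hr_estimate)
```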
There are several natural extensions of this work. First, to improve the detection performance of the single-frame detector, a multi-view detector could be developed by combining several view-specific classifiers; this should improve detection performance significantly. Second, the set of Haar-like features could be extended to better fit the main features of the targets. Third, decision-level data fusion techniques, such as Random Forests, could be explored for improved automatic target detection. Finally, the detector performance could be evaluated using more realistic data and setups; we are currently in the process of building a database that will allow a more realistic setup and a more comprehensive analysis of the designed detector.

Bibliography

[1] D. Hart and P. Craig-Hart, "Reducing swarming theory to practice for UAV control," Proc. IEEE Aerospace Conf., pp. 3050-3063, Mar. 2004.
[2] R. Blum, S. Kassam, and H. Poor, "Distributed detection with multiple sensors II: Advanced topics," Proc. IEEE, vol. 85, no. 1, pp. 64-79, Jan. 1997.
[3] X. Chen, S. Gong, N. Schmid, and M. Valenti, "UAV based distributed ATR under realistic simulated environmental effects," SPIE Defense and Security Symp., Apr. 2007.
[4] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," Proc. IEEE CVPR, pp. 1-9, 2001.
[5] R. Lienhart and J. Maydt, "An extended set of Haar-like features for rapid object detection," Proc. IEEE ICIP, vol. 1, pp. 900-903, Sep. 2002.
[6] J. Dufour and V. Martin, "Active/passive cooperative image segmentation for automatic target recognition," Proc. SPIE, vol. 2294, pp. 552-560, 1994.
[7] M. Cohen, An Introduction to Automatic Target Recognition, EW Design Engineers' Handbook, 1990.
[8] P. Gaudiano and E. Bonabeau, "Control of UAV swarms: What the bugs can teach us," Proc. 2nd AIAA Unmanned Unlimited, pp. 582-589, 2003.
[9] J. Sauter, R. Matthews, H. Parunak, and S. Brueckner, "Performance of digital pheromones for swarming vehicle control," Proc. AAMAS, pp. 903-910, 2005.
[10] P. Dasgupta, "Distributed automatic target recognition using multi-agent UAV swarms," Proc. AAMAS, pp. 479-481, 2006.
[11] E. Bonabeau, M. Dorigo, and G. Theraulaz, Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press, 1999.
[12] S. Edwards, "Swarming on the battlefield: Past, present and future," RAND National Security Research Div. Report, 2000.
[13] T. Leung, M. Burl, and P. Perona, "Finding faces in cluttered scenes using random labeled graph matching," Proc. Fifth IEEE ICCV, pp. 637-644, 1995.
[14] Y. Dai and Y. Nakano, "Face-texture model based on SGLD and its application in face detection in a color scene," Pattern Recognition, vol. 29, no. 6, pp. 1007-1017, 1996.
[15] M. Betke and N. Makris, "Fast object recognition in noisy images using simulated annealing," Proc. Fifth International Conference on Computer Vision, pp. 523-543, 1995.
[16] A. Yuille, P. Hallinan, and D. Cohen, "Feature extraction from faces using deformable templates," International Journal of Computer Vision, vol. 8, no. 2, pp. 99-111, 1992.
[17] S. Mallat, "A theory for multiresolution signal decomposition: The wavelet representation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693, 1989.
[18] M. Turk and A. Pentland, "Eigenfaces for recognition," MIT Technical Report, 1991.
[19] V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, 1995.
[20] H. Rowley, S. Baluja, and T. Kanade, "Neural network-based face detection," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 1, pp. 23-38, 1998.
[21] Y. Freund and R. Schapire, "Experiments with a new boosting algorithm," Machine Learning: Proc. Thirteenth International Conference, pp. 148-156, 1996.
[22] M. Oren, C. Papageorgiou, P. Sinha, E. Osuna, and T. Poggio, "Pedestrian detection using wavelet templates," Proc. Computer Vision and Pattern Recognition, pp. 193-199, 1997.
[23] A. Mohan, C. Papageorgiou, and T. Poggio, "Example-based object detection in images by components," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, no. 4, pp. 349-361, 2001.
[24] R. Vaillant, C. Monrocq, and Y. LeCun, "Original approach for the localisation of objects in images," IEE Proc. Vision, Image and Signal Processing, vol. 141, 1994.
[25] B. Moghaddam and A. Pentland, "Probabilistic visual learning for object detection," MIT Technical Report, no. 326, 1995.
[26] H. Schneiderman and T. Kanade, "A statistical method for 3D object detection applied to faces and cars," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1746-1751, 2000.
[27] R. Schapire and Y. Singer, "Improving boosting algorithms using confidence-rated predictions," Machine Learning, vol. 37, no. 3, pp. 297-336, 1999.
[28] R. Schapire, Y. Freund, P. Bartlett, and W. Lee, "Boosting the margin: A new explanation for the effectiveness of voting methods," Annals of Statistics, vol. 26, pp. 1651-1686, 1998.
[29] F. Fleuret and D. Geman, "Coarse-to-fine face detection," Int. Journal of Computer Vision, vol. 41, no. 1, pp. 85-107, 2001.
[30] A. Garg, S. Agarwal, and T. Huang, "Fusion of global and local information for object detection," Proc. International Conference on Pattern Recognition, vol. 3, pp. 723-726, 2002.
[31] S. Li, Z. Zhang, L. Zhu, and H. Zhang, "Real-time multi-view face detection," Proc. Fourth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 67-81, 2002.
[32] M. Jones and P. Viola, "Fast multi-view face detection," Mitsubishi Electric Research Laboratories Technical Report, 2003.
[33] E. Waltz and J. Llinas, Multisensor Data Fusion, Artech House, 1990.
[34] C. Chong, S. Mori, K. Chang, and W. Barker, "Architectures and algorithms for track association and fusion," IEEE Aerospace and Electronic Systems, vol. 15, no. 1, pp. 5-13, Jan. 2000.
[35] J. Aggarwal, Multisensor Fusion for Computer Vision, Springer-Verlag, 1993.
[36] H. Li, B. Manjunath, and S. Mitra, "Multi-sensor image fusion using the wavelet transform," Proc. ICIP, vol. 1, pp. 51-55, Nov. 1994.
[37] S. Nadimi and B. Bhanu, "Adaptive fusion for diurnal moving object detection," Proc. ICPR, vol. 3, pp. 696-699, Aug. 2004.
[38] M. Liggins and M. Nebrich, "Adaptive multi-image decision fusion," Proc. SPIE Signal Processing, Sensor Fusion, and Target Recognition IX, vol. 4052, pp. 218-228, Aug. 2000.
[39] Open Computer Vision Library, http://sourceforge.net/projects/opencvlibrary/.
[40] R. Lienhart, A. Kuranov, and V. Pisarevsky, "Empirical analysis of detection cascades of boosted classifiers for rapid object detection," MRL Technical Report, 2002.
[41] S. Park, M. Park, and M. Kang, "Super-resolution image reconstruction: A technical overview," IEEE Signal Processing Magazine, vol. 20, pp. 21-36, May 2003.
[42] S. Borman and R. Stevenson, "Spatial resolution enhancement of low-resolution image sequences: A comprehensive review with directions for future research," Technical Report, University of Notre Dame, 1998.
[43] L. Fonseca, G. Hewer, C. Kenney, and B. Manjunath, "Registration and fusion of multispectral images using a new control point assessment method derived from optical flow ideas," Proc. SPIE, vol. 3717, pp. 104-111, Apr. 1999.
[44] M. Alam, J. Bognar, R. Hardie, and B. Yasuda, "Infrared image registration and high-resolution reconstruction using multiple translationally shifted aliased video frames," IEEE Trans. Instrumentation and Measurement, vol. 49, pp. 915-923, Oct. 2000.
[45] C. Barber, D. Dobkin, and H. Huhdanpaa, "The quickhull algorithm for convex hulls," ACM Transactions on Mathematical Software, vol. 22, no. 4, pp. 469-483, Dec. 1996.
[46] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I, John Wiley and Sons, New York, 2001.
[47] F. Sadjadi, "Application of genetic algorithm for automatic recognition of partially occluded objects," Proc. SPIE Automatic Object Recognition IV, vol. 2234, pp. 428-434, Apr. 1994.
[48] S. Rees and B. Jones, "Operator for object recognition and scene analysis by estimation of a set occupancy with noisy and incomplete data sets," Proc. SPIE Intelligent Robots and Computer Vision XI, vol. 1825, pp. 289-297, Nov. 1992.
[49] M. Trajkovic and M. Hedley, "Fast corner detection," Image and Vision Computing, vol. 16, pp. 75-87, 1998.
[50] J. Kittler, M. Hatef, R. P. W. Duin, and J. Matas, "On combining classifiers," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, pp. 226-239, Mar. 1998.
[51] L. Xu, A. Krzyzak, and C. Suen, "Methods of combining multiple classifiers and their applications to handwriting recognition," IEEE Trans. Systems, Man, and Cybernetics, vol. 22, no. 3, pp. 418-435, 1992.
[52] J. Kittler, J. Matas, and K. Jonsson, "Combining evidence in personal identity verification systems," Pattern Recognition Letters, pp. 845-852, 1997.
[53] A. Savakis and H. Trussell, "Blur identification by residual spectral matching," IEEE Transactions on Image Processing, pp. 141-151, 1993.
[54] B. Bhanu, D. Dudgeon, E. Zelnio, A. Rosenfeld, D. Casasent, and I. Reed, "Introduction to the special issue on automatic target detection and recognition," IEEE Trans. Image Processing, vol. 6, no. 1, pp. 1-6, 1997.
[55] A. Lanterman, J. O'Sullivan, and M. Miller, "Kullback-Leibler distances for quantifying clutter and models," Optical Engineering, vol. 38, no. 12, pp. 2134-2146, 1999.