Distinctness Analysis on Natural Landmark Descriptors

This involves calculating the probability of similarity between two landmarks selected from different images. Each landmark is extracted and converted into a feature descriptor, i.e. a p-dimensional vector, which is subject to several sources of randomness. Firstly, there is random noise from the sensors. Secondly, the descriptor is itself a simplified representation of the landmark. Lastly, the two images being compared may view the landmark from different perspectives, which causes geometric distortion. Each landmark can therefore be considered a single sample of the observed object.

Making an inference from two landmarks in two different images is, in principle, a standard significance test. However, the comparison is made between only two single samples. For this reason the ANOVA (Analysis of Variance) test cannot be used, because it requires a large sample size. For multidimensional vector comparison, a test based on the χ²_v (Chi-squared) distribution is appropriate. The Chi-squared distribution is the combined distribution over all dimensions, each of which is assumed to be normally distributed. It includes an additional parameter v describing the degrees of freedom. Details can be found in [13]. In multidimensional space, the χ²_v variable is defined by

    \chi_v^2 = N\,(\bar{x} - \bar{y})^t\,\Sigma^{-1}\,(\bar{x} - \bar{y})    (7)

where \bar{x} and \bar{y} are the means of the measurements of X and Y respectively, Σ is the covariance matrix of the noise, and N is a function related to the sample sizes of the two measurements. Since our sample size is one, N = 1, \bar{x} = x and \bar{y} = y, and Equation 7 simplifies to

    \chi_v^2 = (x - y)^t\,\Sigma^{-1}\,(x - y)    (8)

If the noise in each dimension is independent of the others, the inverse covariance is a diagonal matrix, and Equation 8 can be further simplified to

    \chi_v^2 = \sum_{i=1}^{p} (x_i - y_i)^2 / \sigma_i^2    (9)

where p is the number of dimensions of x. Since x contains p independent dimensions, the degrees of freedom v equal p, not (p − 1) as usually defined for the categorical statistic. Also σ_i = √2·σ, where σ is the standard deviation of a single random variable on each dimension; the factor √2 arises because each term in Equation 9 is the difference of two such variables. With χ²_v and v obtained, the probability of similarity is defined to be the integrated probability of the Chi-squared distribution at the obtained χ²_v value, which can be found in statistical tables.
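To make the computation concrete, the following is a minimal sketch of Equations 8–9 and the table lookup (in Python rather than the authors' Matlab implementation). The common per-dimension standard deviation σ, and the use of the upper-tail (survival) probability as the similarity score so that close descriptors score near 1, are our assumptions for illustration.

    import numpy as np
    from scipy.stats import chi2

    def similarity_probability(x, y, sigma):
        """Chi-squared similarity between two p-dimensional descriptors.

        Implements Equation 9 with a common sigma_i^2 = 2*sigma^2 in every
        dimension, then integrates the Chi-squared density with v = p
        degrees of freedom above the observed value.
        """
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        p = x.size                                           # degrees of freedom v = p, not p - 1
        chi2_v = np.sum((x - y) ** 2) / (2.0 * sigma ** 2)   # Equation 9
        return chi2.sf(chi2_v, df=p)                         # integrated (upper-tail) probability

    # Two noisy observations of the same hypothetical 64-dimensional descriptor
    rng = np.random.default_rng(0)
    d = rng.normal(size=64)
    x = d + rng.normal(scale=0.1, size=64)
    y = d + rng.normal(scale=0.1, size=64)
    print(similarity_probability(x, y, 0.1))                 # large; near 0 for unrelated descriptors

With a common σ in every dimension, the covariance matrix Σ never needs to be formed explicitly: Equation 9 reduces to a scaled squared Euclidean distance between the two descriptors.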
4 Experimental

In this section, experiments were conducted on a series of sub-sea images (courtesy of ACFR, University of Sydney, Australia). The configuration was such that the camera was always looking downwards at the sea floor, which minimised the geometric distortion caused by differing viewpoints.

4.1 Initial Test of the Algorithm

For this experiment, the algorithm was written in Matlab V6.5 running on a PC with a P4 2.4 GHz processor and 512 MB of RAM. To demonstrate how the distinctness analysis algorithm works, a typical analysis is now explained in detail. In the following example, we trained the distinctness parameters µ_t and C_t over 100 images from the series. The texture analysis described in [3] generated invariant landmarks on the two images shown in Figure 2, which contain partially overlapping regions. The distinctness analysis described in Section 3 was then applied to select the smaller set of landmarks considered distinctive, shown in Figure 3. The innovation factor λ was chosen to be 0.9, weighting the past significantly more than the present. The threshold for distinctness in Equation 1 was chosen to be 0.2, a value that kept the number of chosen landmarks relatively small. In Figure 4, the two highest-scoring matches of landmarks that exceeded a threshold probability of 0.8 are shown with linked lines.

Fig. 2. Two particular images from the sub-sea series. The different sizes of boxes are landmarks generated using the texture analysis described in [3].

The first selection of landmarks, based on DOG techniques, generated many landmarks scattered all over the two images. More landmarks usually mean more confidence for matching, but the computational time for making comparisons also increases. In addition, since non-distinctive objects were not excluded, many of the matches could have been generated by similar objects located at different places.

Fig. 3. The same two images as Figure 2 after applying the distinctness selection process described in Section 3; the number of landmarks is reduced.

Figure 3 shows the selection of landmarks that the algorithm chose as globally distinctive. The number of landmarks was significantly reduced while useful matches between the two images were retained. Since these landmarks should not appear frequently in the environment, the possibility that similar objects appear in different locations is minimised.

The run-time of the algorithm depended on the complexity of the images. On average, generating landmarks with descriptors took ∼6 seconds per image, while the selection of distinctive landmarks required only ∼0.05 seconds per image; the extra time required to select distinctive landmarks was therefore comparatively small. Calculating the probability between any two landmarks took ∼0.001 seconds. On average, the sub-sea images generated 150 landmarks each, giving roughly 150 × 150 potential comparisons between two images and a maximum comparison time of ∼0.001 × 150 × 150 = 22.5 seconds. After applying the distinctness selection process, however, the number of landmarks was reduced to ∼10 per image, and the comparison time dropped to ∼0.1 seconds. The algorithm is currently being re-implemented in C, which should improve its speed significantly.

4.2 Global Distinctness Test

The performance of the algorithm was then tested with different images from across the environment. The test should reveal whether the algorithm can select objects that are truly distinctive from a human's perspective, a task that is in some ways subjective. A group of images is displayed in Figure 5 together with the landmarks selected by the algorithm, so the reader can judge the performance of the algorithm by noting what has been picked out. As can be seen, the distinctive landmarks are usually the complicated textural corals, which tend to be sparsely distributed. In some of these images there is a single distinctive object, in which case the algorithm has concentrated the landmarks in that region. In images that contain no obvious distinctive objects, however, the algorithm has chosen fewer distinctive landmarks, scattered over the whole image.

Fig. 4. After comparing the distinctive landmarks, the two highest matches with probability over 0.8 are joined by lines for illustration.
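The comparison stage described above amounts to an exhaustive pairwise test. A minimal sketch, building on the similarity_probability function sketched after Section 3 (the 0.8 threshold is the one used for Figure 4):

    def match_landmarks(descs_a, descs_b, sigma, p_thresh=0.8):
        """Compare every landmark descriptor in one image against every
        descriptor in the other and keep the confident matches."""
        matches = []
        for i, x in enumerate(descs_a):
            for j, y in enumerate(descs_b):
                prob = similarity_probability(x, y, sigma)
                if prob > p_thresh:
                    matches.append((i, j, prob))
        # Highest-probability matches first (the linked lines of Figure 4)
        return sorted(matches, key=lambda m: m[2], reverse=True)

Reducing each image from ∼150 landmarks to ∼10 via the distinctness selection shrinks this double loop from ∼22,500 comparisons to ∼100, which accounts for the drop from ∼22.5 seconds to ∼0.1 seconds reported above.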
4.3 Stability Test

A final test was conducted to check the stability of the chosen landmarks. By stability, we mean that the same landmark should be picked out invariant to changes in shift, rotation, scale and illumination. A selection of image pairs was made such that the pairs contained relatively large changes in the previously mentioned conditions while still containing overlapping regions. After the algorithm was applied to each image to pick out distinctive landmarks, an inspection was made within the overlapping region to count the number of distinctive landmarks that appeared within a few pixels of corresponding locations in the two images. By comparing this number with the number of landmarks that did not correspond in both images, a measure of stability was obtained. For example, in Figure 3 there were four distinctive landmarks appearing in corresponding locations of both images, while there were three that did not correspond.

Fig. 5. Sample images from the sub-sea series (courtesy of ACFR, University of Sydney, Australia).

In Figure 6, 20 pairs of images were analysed in the way indicated above. On average, 47% of the landmarks selected as distinctive in one image appeared correspondingly in both images. This was deemed a relatively high hit rate for tracking good distinctive landmarks through image sequences, and it shows promise for enabling map building in a SLAM context.

Fig. 6. An analysis of finding stable landmarks over 20 pairs of images.
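The stability measure used above is a simple correspondence count. The following minimal sketch assumes the landmark positions have already been transformed into a common image frame, and that the few-pixel tolerance is a parameter (here a hypothetical 5 pixels):

    import numpy as np

    def stability_ratio(pts_a, pts_b, tol=5.0):
        """Fraction of distinctive landmarks in one image that reappear
        within `tol` pixels of a landmark in the other image."""
        pts_a = np.asarray(pts_a, dtype=float)
        pts_b = np.asarray(pts_b, dtype=float)
        if len(pts_a) == 0 or len(pts_b) == 0:
            return 0.0
        corresponding = 0
        for p in pts_a:
            # Stable if some landmark in the other image lies within tol pixels
            if np.min(np.linalg.norm(pts_b - p, axis=1)) <= tol:
                corresponding += 1
        return corresponding / len(pts_a)

For the Figure 3 example (four corresponding landmarks against three that do not correspond), this ratio is 4/7 ≈ 57%, of the same order as the 47% average reported over the 20 pairs.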
5 Conclusion and Future Work

The work reported here has shown that it is possible to differentiate image data in such a way that distinctive features can be defined and tracked as they progress through a sequence of images in an unexplored environment.

The paper presented an extended algorithm for selecting distinctive landmarks among numerous candidates, one that can be adapted to and combined with existing invariant landmark generation techniques such as SIFT or texture analysis. In our experiments, the algorithm was demonstrated to discriminate a small enough set of landmarks to be useful in techniques such as SLAM.

We are currently working to incorporate this landmark selection algorithm with inertial sensor information to form a functioning SLAM system and deploy it in a submersible vehicle.

Acknowledgment

This work is financially supported by the Australian Cooperative Research Centre for Intelligent Manufacturing Systems & Technologies (CRC IMST) and by the Australian Research Council Centre of Excellence for Autonomous Systems (ARC CAS).

References

1. Csorba M (1997) Simultaneous Localisation and Mapping. PhD thesis, Robotics Research Group, Department of Engineering Science, University of Oxford
2. Williams SB (2001) Efficient Solutions to Autonomous Mapping and Navigation Problems. PhD thesis, ACFR, Department of Mechanical and Mechatronic Engineering, The University of Sydney
3. Kiang K, Willgoss RA, Blair A (2004) "Distinctive feature analysis of natural landmarks as a front end for SLAM applications", 2nd International Conference on Autonomous Robots and Agents, New Zealand, 206–211
4. Lowe DG (2004) "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 60, 2:91–110
5. Mikolajczyk K, Schmid C (2002) "An affine invariant interest point detector", 8th European Conference on Computer Vision, Czech Republic, 128–142
6. Lindeberg T (1994) "Scale-space theory: A basic tool for analysing structures at different scales", J. of Applied Statistics, 21, 2:224–270
7. Harris C, Stephens M (1988) "A combined corner and edge detector", 4th Alvey Vision Conference, Manchester, 147–151
8. Carneiro G, Jepson AD (2002) "Phase-based local features", 7th European Conference on Computer Vision, Copenhagen, 1:282–296
9. Tuytelaars T, Van Gool L (2000) "Wide baseline stereo matching based on local, affinely invariant regions", 11th British Machine Vision Conference, 412–425
10. Schmid C, Mohr R (1997) "Local grayvalue invariants for image retrieval", Pattern Analysis and Machine Intelligence, 19, 5:530–534
11. Freeman W, Adelson E (1991) "The design and use of steerable filters", Pattern Analysis and Machine Intelligence, 13, 9:891–906
12. Mikolajczyk K, Schmid C (2003) "Local grayvalue invariants for image retrieval", Pattern Analysis and Machine Intelligence, 19, 5:530–534
13. Manly B (2005) Multivariate Statistical Methods: A Primer, 3rd edition, Chapman & Hall/CRC

Bimodal Active Stereo Vision

Andrew Dankers 1,2, Nick Barnes 1,2, and Alex Zelinsky 3

1 National ICT Australia 4, Locked Bag 8001, Canberra ACT Australia 2601
2 Australian National University, Acton ACT Australia 2601
  {andrew.dankers,nick.barnes}@nicta.com.au
3 CSIRO ICT Centre, Canberra ACT Australia 0200
  alex.zelinsky@csiro.au
4 National ICT Australia is funded by the Australian Department of Communications, Information Technology and the Arts and the Australian Research Council through Backing Australia's Ability and the ICT Centre of Excellence Program.

Summary. We present a biologically inspired active vision system that incorporates two modes of perception. A peripheral mode provides a broad and coarse perception of where mass is in the scene in the vicinity of the current fixation point, and of how that mass is moving. It involves fusion of actively acquired depth data into a 3D occupancy grid. A foveal mode then ensures coordinated stereo fixation upon mass/objects in the scene, and enables extraction of the mass/object using a maximum a-posteriori probability zero disparity filter. Foveal processing is limited to the vicinity of the camera optical centres. Results for each mode, and for both modes operating in parallel, are presented. The regime operates at approximately 15 Hz on a 3 GHz single-processor PC.

Keywords: Active Stereo Vision, Road Scene, Fovea, Periphery

1 Introduction

The National ICT Australia (NICTA) Autonomous Systems and Sensing Technologies (ASSeT) Smart Car project focusses on driver assistance systems for increased road safety. One aspect of the project involves monitoring the driver and the road scene to ensure a correlation between where the driver is looking and events occurring in the road scene [11]. The detection of objects in the road scene such as signs [19] and pedestrians [14], and the location of the road itself [2], form part of the set of observable events that the system aims to ensure the driver is aware of, or to warn the driver about in the case that they have not noticeably observed such events. In this paper, we concentrate on the use of active computer vision as a scene-sensing input to the driver assistance architecture. Scene awareness is useful for tracking objects, classifying them, determining their absolute position, or fitting models to them.

1.1 Research Platform

Fig. 1. Research platform. Left: Smart Car, with CeDAR mounted behind the windscreen (centre). Right: CeDAR, laboratory apparatus.
The Smart Car (Fig. 1, left), a 1999 Toyota Landcruiser, is equipped with the appropriate sensors, actuators and processing hardware to provide an environment in which desired driver assistance competencies can be developed [9]. Positioned centrally inside the front windscreen is an active stereo vision mechanism. CeDAR, the Cable-Drive Active-Vision Robot [22], incorporates a common tilt axis and two pan axes, each exhibiting a range of motion of 90°. The angles of all three axes are monitored by encoders that give an effective angular resolution of 0.01°. An additional CeDAR unit (Fig. 1, right), identical to the unit in the Smart Car, is used for initial visual experiments. Although it is stationary and cannot replicate road conditions, it is convenient for algorithm development such as that presented in this paper.

2 Active Vision for Scene Awareness

A vision system able to adjust its visual parameters to aid task-oriented behaviour, an approach labeled active [1] or animate [4] vision, can be advantageous for scene analysis in realistic environments [3]. Foveal systems must be able to align their foveas with the region of interest in the scene. Varying the camera pair geometry means that foveal attention can be maintained upon a subject. It also increases the volume of the scene that may be depth-mapped. Disparity map construction using a small disparity search range that is scanned over the scene by varying the camera geometry is less computationally expensive than a large static disparity search. A configuration where fixed cameras use pixel shifting of the entire images to simulate horopter reconfiguration is more processor-intensive than sending commands to a motion axis. Such virtual shifting also reduces the useful width of the image by the number of pixels of shift.

3 Bimodal Active Vision

We propose a biologically inspired vision system that incorporates two modes of perception. A peripheral mode first provides a broad and coarse perception [...]
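The remainder of this section is not reproduced here, but the Summary describes the peripheral mode as fusing actively acquired depth data into a 3D occupancy grid. As a rough illustration of that idea only (not the authors' implementation), a minimal log-odds occupancy update might look like the following; the grid dimensions, cell size and update constants are all assumed for illustration:

    import numpy as np

    class OccupancyGrid3D:
        """Coarse 3D occupancy grid fed by actively acquired depth points.

        Each depth point, already transformed into metric coordinates
        relative to the grid origin, raises the occupancy log-odds of the
        cell it falls in; all cells decay towards "unknown" (zero) so that
        the perception of moving mass stays current.
        """
        def __init__(self, shape=(64, 64, 32), cell_size=0.2):
            self.log_odds = np.zeros(shape)     # 0 = unknown
            self.shape = np.array(shape)
            self.cell_size = cell_size          # metres per cell (assumed)

        def integrate(self, points, hit=0.9, decay=0.98):
            self.log_odds *= decay              # forget stale evidence
            idx = np.floor(points / self.cell_size).astype(int)
            inside = np.all((idx >= 0) & (idx < self.shape), axis=1)
            for i, j, k in idx[inside]:
                self.log_odds[i, j, k] += hit   # evidence of mass in this cell

        def occupied(self, threshold=2.0):
            return self.log_odds > threshold    # boolean estimate of where mass is

A full implementation would also lower the log-odds of cells along each camera ray to carve out free space; this sketch accumulates positive evidence only.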