Acknowledgement

I am deeply indebted to my supervisor, Dr. Huang Zhiyong, for his precious guidance, continuous support, and encouragement throughout my thesis. I also want to thank Dr. Tong San Koh of NTU for discussions, and Dr. Wee Kheng Leow and Dr. Alan Cheng Holun of NUS for their detailed comments and suggestions.

Table of Contents

Acknowledgement
Table of Contents
Summary
List of Tables
List of Figures
Chapter 1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis Organization
Chapter 2 Literature Review
  2.1 Image Registration in Theory
    2.1.1 Applications
    2.1.2 Standard image registration stages
  2.2 Registration Methods
    2.2.1 Area-based methods
    2.2.2 Feature-based methods
    2.2.3 Recent registration methods
Chapter 3 Image Registration
  3.1 Algorithm Overview
  3.2 Feature points detection
    3.2.1 Feature point position extraction
    3.2.2 Feature point orientation estimation
  3.3 Feature points matching
    3.3.1 Define a feature descriptor
    3.3.2 Local structure matching
    3.3.3 Global structure matching
    3.3.4 Eliminating the low-quality matching pairs
    3.3.5 Performance analysis
  3.4 Transformation model estimation
Chapter 4 Experimental Results
  4.1 Results of local structure matching
  4.2 Results of global structure matching
  4.3 Registration results on various images
Chapter 5 Conclusions and Further Works
Bibliography

Summary

In this thesis, we propose a novel feature-based image registration method using both the local and global structures of the feature points. To address various imaging conditions, we improve the local structure matching method. Compared to conventional feature-based image registration methods, our method is robust because it guarantees that reliable feature points are selected and used in the registration process. We have successfully applied our method to images taken under different conditions.

List of Tables

Table 1: Comparison of two local structure matching methods.
Table 2: The registration results on 8 pairs of images in Figures 13-20.

List of Figures

Figure 1: System diagram of the feature points matching method.
Figure 2: Feature point i represented by a feature vector fi = (xi, yi, φi).
Figure 3: The local spatial relation between two feature points fi and fj.
Figure 4: Spurious or dropped feature points in the neighborhood will result in an invalid local structure for matching.
Figure 5: The local structure matching on images with geometry transformations.
Figure 6: The local structure matching on images with large temporal difference.
Figure 7: The local structure matching on images with image distortions (highly JPEG compressed).
Figure 8: The local structure matching on images from different sensors.
Figure 9: The matching pairs detected from the global structure matching in cue 1.
Figure 10: The matching pairs detected from the global structure matching in cue 2.
Figure 11: The matching pairs obtained from the intersection of results in cue 1 and cue 2.
Figure 12: The final matching pair set after cross-validation.
Figure 13: Registration of high resolution images.
Figure 14: Registration of urban images from different sensors.
Figure 15: Registration of two Amazon region images from radar, JERS-1, with two-year difference.
Figure 16: Registration of Landsat images with four-year difference and associated rotation.
Figure 17: Registration of Amazon region images with deforestation.
Figure 18: Registration of images with high temporal changes.
Figure 19: Registration of images with compression distortions.
Figure 20: Registration of retina images with associated rotation and translation.

Chapter 1 Introduction

1.1 Motivation

Image registration is the process of matching two or more images of the same scene taken at different times, from different viewpoints, or by different sensors. It geometrically aligns the input image and the reference image. Image registration is widely used in many applications, such as image mosaicking, aerial image analysis, medical imaging, stereo vision, automated cartography, motion analysis, and the recovery of the 3D characteristics of a scene [1]. In general, most large systems that evaluate images require image registration as an intermediate step [2].

In this thesis we propose and implement a feature-based image registration algorithm. The images under consideration are roughly of the same scale (but not necessarily the same size). Here we adapt Jiang and Yau's fingerprint minutiae matching algorithm [3]. In [3], Jiang and Yau first establish a feature descriptor that fulfills four important conditions: 1) invariance (the descriptions of the corresponding features from the reference and sensed image have to be the same), 2) uniqueness (two different features should have different descriptions), 3) stability (the description of a feature which is slightly deformed in an unknown manner should be close to the description of the original feature), and 4) independence (if the feature description is a vector, its elements should be functionally independent) [2]. They then propose a simple and efficient fingerprint minutiae matching algorithm based on the 'local' and 'global' structures of fingerprint minutiae. (Note that the so-called 'global structure' in [3] is still a local structure because it is local to the position of a feature; it would be better called an 'absolute feature'. In contrast, a better name for 'local structure' is 'relative feature'. In this thesis, we keep the names 'local structure' and 'global structure' for consistency with [3].) However, this algorithm is only suitable for fingerprint images under rotation and translation transformations. We improve the local and global structure matching methods in [3] so that we can obtain a set of applicable corresponding feature points for general images taken under various imaging conditions, such as images taken at different times, from highly different viewpoints, or by different sensors. The proposed feature matching method can also be applied to images with compression distortion, object movement, or high deformations.

1.2 Contributions

Based on the fingerprint minutiae matching algorithm presented in [3], we propose and implement a feature-based registration algorithm. Our major contributions are in the part of feature matching. We improve the local structure matching method in [3] for image registration. Therefore we can handle the cases where the image has significant scene changes such as object movement, growth, or deformations. In these cases, the local structure matching method in [3] is not effective.
We provide a more reliable local structure matching so that the two best-matched local structure pairs are correctly computed under various imaging conditions, such as images taken at different times, by different sensors, and from highly different viewpoints. The improved matching method can also be applied to images with compression distortion, object movement, or high deformations. We implement the method in a software system and conduct various experiments with applicable results.

1.3 Thesis Organization

The rest of this thesis is organized as follows. In Chapter 2 we give a short review of related work. In Chapter 3, we present our image registration algorithm, of which the reliable feature points matching algorithm is our major concern. In Chapter 4, a series of experiments is performed to evaluate the performance of our registration algorithm. Finally, our work is summarized in Chapter 5.

Chapter 2 Literature Review

2.1 Image Registration in Theory

2.1.1 Applications

Image registration is widely used in remote sensing, medical imaging, computer vision, etc. In general, according to the manner of image acquisition, the applications of image registration can be divided into four main groups [1].

Different viewpoints (multi-view analysis). Images of the same scene are acquired from different viewpoints. The aim is to gain a larger 2D view or a 3D representation of the scanned scene. Examples of applications include remote sensing (mosaicking of images of the surveyed area) and computer vision (shape recovery, i.e., shape from stereo).

Different times (multi-temporal analysis). Images of the same scene are acquired at different times, often at regular time intervals, and possibly under different conditions. The aim is to find and evaluate changes in the scene between the consecutive image acquisitions. Examples of applications include remote sensing (monitoring of global land usage, landscape planning), computer vision (automatic change detection for security monitoring), and medical imaging (monitoring of healing therapy, monitoring of tumor evolution).

Different sensors (multi-modal analysis). Images of the same scene are acquired by different sensors. The aim is to integrate the information obtained from different source streams to gain a more complex and detailed scene representation. Examples of applications include remote sensing (fusion of information from sensors with different characteristics, such as panchromatic images offering better spatial resolution, color/multi-spectral images with better spectral resolution, or radar images independent of cloud cover and solar illumination) and medical imaging (combination of sensors showing anatomical structure, like MRI or CT, with sensors showing functional and metabolic activities, like PET, SPECT or MRS). The results can be applied, for instance, in radiotherapy and nuclear medicine.

Scene to model registration. An image of a scene and a model are registered. The model can be a computer representation of the scene, for instance a map or another scene with similar content. The aim is to localize the acquired image in the scene/model and to compare them. Examples of applications include remote sensing (registration of aerial or satellite data onto maps) and medical imaging (comparison of the patient's image with digital anatomical atlases, specimen classification).
2.1.2 Standard image registration stages

Due to the diversity of image registration applications and the various types of image variation stated above, it is impossible to design a universal method applicable to all registration tasks. However, the standard image registration technique usually consists of three stages, as follows.

Feature detection. Features are salient structures or distinctive objects in the image. These features can be represented by their point representatives, such as centers of gravity, line intersections, and corners. In this stage, features are manually or, preferably, automatically detected. Usually the physical interpretability of the feature is required. The major problem in this stage is to decide what kind of feature is applicable to the given task. The detected feature sets in the sensed image and the reference image should have enough common elements, and the detection method should not be sensitive to the assumed image variations.

Feature matching. The detected features in the sensed and reference images are matched in this stage. Various feature descriptors and similarity measures are employed for this purpose. The two major categories for feature matching are area-based and feature-based methods. Area-based methods, sometimes called correlation-like methods, usually adopt a window to determine a matched location using the correlation technique. Area-based methods deal with the images without attempting to detect salient objects. They are preferable when the images do not have enough prominent details and distinctive objects. Feature-based methods, on the other hand, extract common features such as curvature, moments, areas, or line segments to perform accurate registration. They are typically applied when the local structural information is more important than the information carried by the image intensities. They are applicable to images of completely different nature (like an aerial photograph and a map) and can handle complex image distortions.

In the feature matching stage, problems caused by incorrect feature detection or by image degradations can arise. Physically corresponding features can be missed due to different imaging conditions or different spectral sensitivities of the sensors. The choice of the feature description and similarity measure has to consider these factors. There are several conditions that a good feature descriptor should fulfill [2]. The most important ones are invariance (the feature descriptor should be invariant to the assumed image degradations), uniqueness (two different features should have different descriptions), stability (the description of a feature should be sufficiently stable to tolerate slight unexpected feature variations and noise), and independence (the elements of a vector feature descriptor should be functionally independent). The matching algorithm in the space of invariants should be robust and efficient.

Transformation model estimation and image resampling. In the last stage, the type and parameters of the mapping function are estimated from the feature correspondences obtained in the previous stage. Applying the spatial mapping and interpolation, the sensed image is resampled onto the reference image. Image values at non-integer coordinates are computed by an appropriate interpolation technique. There are two major problems that need to be considered in this stage. Firstly, the type of the mapping function should be chosen correctly.
In case no a priori information is available, the model should be flexible enough to handle all possible image transformations. Secondly, there are differences between the two images which we would like to detect. Therefore, the decision about which types of image variations are variations of interest must be made in this stage.

2.2 Registration Methods

The current automated registration techniques can be classified into two broad categories: area-based and feature-based.

2.2.1 Area-based methods

Area-based methods, sometimes called correlation-like methods, merge the feature detection step with the feature matching step. Instead of attempting to detect salient objects, windows of predefined size (or even entire images) are used for the correspondence estimation. The area-based methods usually adopt a small window of points to determine a matched location using the correlation technique [4]. Window correspondence is based on the similarity measure between two given windows in the sensed image and the reference image. The most commonly used measure of similarity is normalized cross-correlation. Other useful similarity measures are the correlation coefficient and sequential-similarity detection [1].

In normalized cross-correlation, the measure of similarity is computed for window pairs from the sensed and reference images and its maximum is searched for. The window pairs for which the maximum is achieved are set as the corresponding ones. Although cross-correlation based registration can exactly align only mutually translated images, it can also be successfully applied when slight rotation and scaling are present.

Another useful property of correlation is given by the Correlation theorem. The Correlation theorem states that the Fourier transform of the correlation of two images is the product of the Fourier transform of one image and the complex conjugate of the Fourier transform of the other. This theorem gives an alternative way to compute the correlation between images. The Fourier transform is simply another way to represent the image function: instead of representing the image in the spatial domain, as we normally do, the Fourier transform represents the same information in the frequency domain. It can be computed efficiently for images using the Fast Fourier Transform (FFT). Hence, an important reason why the correlation metric is chosen in many registration problems is that the Correlation theorem enables it to be computed efficiently with existing, well-tested programs using the FFT (and occasionally in hardware using specialized optics). The use of the FFT becomes most beneficial when the image and template to be tested are large.
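To illustrate how the Correlation theorem is used in practice, the following is a minimal NumPy sketch of cross-correlation computed through the FFT. It is only a sketch: the normalization used in normalized cross-correlation is omitted, and the function and variable names are ours, not taken from any particular registration package.

```python
import numpy as np

def fft_correlation(image, template):
    """Correlation theorem: F{corr} = F{image} * conj(F{template}).
    Both inputs are 2D float arrays; the template is zero-padded to the image size."""
    F_img = np.fft.fft2(image)
    F_tpl = np.fft.fft2(template, s=image.shape)
    return np.fft.ifft2(F_img * np.conj(F_tpl)).real

# The peak of the correlation surface indicates the translation that best
# aligns the template with the image:
# corr = fft_correlation(image, template)
# dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
```

The benefit is purely computational: the same correlation surface could be obtained by sliding the template over the image, but the FFT route is far cheaper for large images.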
The area-based methods are preferable when the images do not have enough prominent details and the distinctive information is provided by graylevels/colors rather than by local shapes and structure [5]. The limitations of the area-based methods are:

(1) The rectangular window, which is most often used, suits the registration of images which locally differ only by a translation. If the images are deformed by more complex transformations, this type of window is not able to cover the same parts of the scene in the reference and sensed images (the rectangle can be transformed to some other shape). Several authors proposed to use a circular window for mutually rotated images. However, the comparability of such simple-shaped windows is also violated if more complicated geometric deformations (similarity, perspective transforms, etc.) are present between the images.

(2) Another disadvantage of the area-based methods concerns the 'remarkableness' of the window content. There is a high probability that a window containing a smooth area without any prominent details will be matched incorrectly with other smooth areas in the reference image due to its non-saliency. The features for registration should preferably be detected in distinctive parts of the image. Windows, whose selection is often not based on an evaluation of their content, may not have this property.

(3) Classical area-based methods like cross-correlation (CC) exploit image intensities directly for matching, without any structural analysis. Consequently, they are sensitive to intensity changes, introduced for instance by noise, varying illumination, and/or the use of different sensor types.

(4) Typically the cross-correlation between the image and the template is computed for each allowable transformation of the template. The transformation whose cross-correlation is the largest specifies how the template can be optimally registered to the image. This is the standard approach when the allowable transformations include a small range of translations, rotations, and scale changes; the template is translated, rotated, and scaled for each possible translation, rotation, and scale of interest. As the number of transformations grows, however, the computational costs quickly become unmanageable. So the correlation methods are generally limited to registration problems in which the images are misaligned only by a small rigid or affine transformation.

2.2.2 Feature-based methods

There are two tasks that generally need to be handled in feature-based techniques: feature extraction and feature matching. For feature extraction, the aim is to detect two sets of features in the reference and sensed images represented by feature points (the points themselves, end points or centers of line features, centers of gravity of regions, etc.). A variety of image segmentation techniques have been used for the extraction of edge and boundary features, such as the Canny operator, the Laplacian of Gaussian (LoG) operator, the thresholding technique in [6], the classification method in [7], the region growing in [8], and the wavelet transformations in [9]. In feature matching, the aim is to find the pair-wise correspondence between the two feature sets using their spatial relations or various feature descriptors. Feature correspondence is performed based on the characteristics of the features detected. Existing feature-matching algorithms include binary correlation, distance transform, Chamfer matching, structural matching, chain-code correlation, and distance of invariant moments [8]. In most existing feature-based techniques, the crucial point is to have a discriminative and robust feature descriptor that is invariant to the assumed image variations.

Feature-based methods are typically applied when the local information is more significant than the information carried by the image intensities. In contrast to the area-based methods, the feature-based methods do not work directly with the image intensities. The features represent information on a higher level.
This property makes feature-based methods suitable for handling complex image distortions (such as images with illumination changes) and applicable to images of completely different nature (such as in multi-sensor analysis). However, the limitation of the feature-based methods is that the features may be hard to detect or unstable over time, for example in medical images that lack distinctive objects.

2.2.3 Recent registration methods

Among all the recent works, we focus on two classes of methods that appear most appropriate for the general-purpose registration problem.

1) Keypoint Indexing Methods

Keypoint methods have received growing attention recently because of their tolerance to low image overlap and image scale changes. Keypoint indexing methods begin with keypoint detection and localization, followed by the extraction of an invariant descriptor from the intensities around the keypoint. Finally, the extracted invariant descriptor is used by indexing methods to match keypoints between images. Existing extraction algorithms are based on approaches such as the Laplacian-of-Gaussian operator [10], Harris corners [11], information theory [12], and intensity region stability measures [13]. They are usually invariant to 2D similarity or affine transformations of the image, as well as linear changes in intensity. For example, in [10], distinctive invariant features are extracted from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination.

2) ICP

ICP is based on point features, where the "points" may be raw measurements such as (x, y, z) values from range images, intensity points in three-dimensional medical images [14], and edge elements, corners and interest points [15] that locally summarize the geometric structure of the images. Starting from an initial estimate, the ICP algorithm iteratively (a) maps features from the sensed image to the reference image, (b) finds the closest reference image point for each mapping, and (c) re-estimates the transformation based on these temporary correspondences.

The Dual-Bootstrap ICP (DB-ICP) algorithm [16] uses the ICP algorithm. DB-ICP begins with an initial transformation estimate and initial matching regions in the two images obtained by keypoint matching. The algorithm iterates among the following three steps: (1) refining the current transformation in the current "bootstrap" region by symmetric matching, (2) applying model selection to determine if a more sophisticated model may be used, and (3) expanding the region, growing inversely proportional to the uncertainty of the mapping on the region boundary. The framework of this algorithm has been described elsewhere for other image registration tasks, such as for aerial images under different lighting conditions. The advantages of the Dual-Bootstrap ICP algorithm include:

(1) In comparison to current image registration algorithms, it handles lower image overlap, image changes, and poor image quality, all of which reduce the number of common landmarks between images. Moreover, by effectively exploiting the vascular structure during the dual-bootstrap procedure, it avoids the need for expensive global search techniques.
(2) In comparison with current indexing-based initialization methods and minimal-subset random sampling methods, Dual-Bootstrap ICP has the major advantage of requiring fewer initial correspondences. This is because it starts from an initial low-order transformation that must only be accurate in small initial regions.

(3) Instead of matching globally, which could require simultaneous consideration of multiple matches, Dual-Bootstrap ICP uses region and model bootstrapping to resolve matching ambiguities.

However, one common problem with DB-ICP [17] is that ICP has a narrow domain of convergence, and therefore must be initialized relatively accurately.

Chapter 3 Image Registration

In this chapter we present a new image registration algorithm based on the local and global structures of the feature points. We apply both the local and global structure matching methods in [3] to image registration. Moreover, we improve the flexibility of the local structure matching method to handle various image variations, and increase the accuracy of the global structure matching method in the correspondence estimation. The major techniques of feature point matching are summarized in section 3.3. To make the algorithm more flexible, we propose a new local structure matching method in section 3.3.2 to handle the cases where the image has significant scene changes or distortions. The proposed matching method provides more reliable local structure matching, so that the two best-matched local structure pairs are correctly computed under various imaging conditions. Furthermore, to improve the accuracy of the feature points matching, we employ consistency checking and cross-validation in our feature points matching method: we first perform global structure matching in two cues to eliminate the false matchings in section 3.3.3, and then employ cross-validation to eliminate the low-quality matchings in section 3.3.4.

Chapter 3 is organized as follows: after an overview of our algorithm in section 3.1, we discuss in section 3.2 how to extract the positions of a set of feature points and how to estimate their orientations. In section 3.3, we find correct matching pairs between two partially overlapping images. Based on the matching pairs found in section 3.3, we derive the correct transformation between the two target images in section 3.4.

3.1 Algorithm Overview

The standard point mapping registration method usually consists of three stages: feature points detection, feature points matching, and transformation model estimation.

In the feature detection stage, the positions of a set of feature points are extracted by the OpenCV function GoodFeaturesToTrack [18], which computes the 'goodness' of a feature point using the eigenvalues of a matrix formed from the intensities of the pixels in a neighborhood of the feature point. Then the orientations of those feature points are estimated by a least mean square estimation method. After the orientation field of an input image is estimated, we calculate the reliability level of the orientation data. For each feature point, if the reliability level of its orientation is below a certain threshold, the feature point is eliminated from the feature point set, so that we only keep the feature points with reliable orientation estimates. The details of feature point position extraction and orientation estimation will be discussed in sections 3.2.1 and 3.2.2, respectively.
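The following is a minimal sketch of the position extraction step using OpenCV's Python binding of GoodFeaturesToTrack; the parameter values (maximum corner count, quality level, minimum distance) are illustrative assumptions rather than the settings used in the thesis.

```python
import cv2
import numpy as np

def detect_feature_points(gray, max_corners=200, quality=0.01, min_distance=10):
    """Feature point positions via the Shi-Tomasi ('goodness' by minimal
    eigenvalue) detector; expects a single-channel 8-bit or float image."""
    corners = cv2.goodFeaturesToTrack(gray, max_corners, quality, min_distance)
    if corners is None:
        return np.empty((0, 2), dtype=np.float32)
    return corners.reshape(-1, 2)   # one (x, y) position per detected point

# Example usage (file names are placeholders):
# gray = cv2.imread("sensed.png", cv2.IMREAD_GRAYSCALE)
# points = detect_feature_points(gray)
```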
Feature points matching is our major concern in the registration, since the accuracy of the feature points matching lays the foundation for accurate registration. In the feature points matching stage, both the local and the global structure matching proposed in Jiang and Yau's fingerprint matching method [3] are applied. The fingerprint matching method in [3] attempts to automate a human expert's behavior in the process of aligning two fingerprints. When comparing two fingerprints, a human expert typically first examines the local positional relations between the minutiae (referred to as the local structure of a minutia), and then aligns the two fingerprints using the unique global position structures of the whole image. In our approach, we not only adapt the local and global structure matching methods in [3] so that they can be applied to general images, but also improve both of them so that they are more flexible and reliable. To make the local structure matching more flexible for various imaging variations, we improve the local structure matching method to handle the cases where the image has significant scene changes or distortions. In addition, consistency checking and cross-validation are both employed in our feature points matching method to guarantee the reliability of the estimated feature correspondences. We first perform consistency checking in two cues to eliminate the false matching pairs, and then employ cross-validation to eliminate the low-quality matching pairs.

Figure 1: System diagram of the feature points matching method.

The main procedure of our feature points matching algorithm is shown in Figure 1. At the beginning, we adapt the feature descriptor Fij defined in [3] to describe the spatial relations between the feature points fi and fj by their relative distance, radial angle and orientation difference. Thus for every feature point fi, a local structure LSi is formed from the spatial relations between fi and its k-nearest neighbors. The details of the feature descriptor are summarized in section 3.3.1. Then, given two feature point sets Fs = {fs1, ..., fsn} and Ft = {ft1, ..., ftm}, the local structure matching is performed to find two best-matched local structure pairs {fsp ↔ ftq} and {fsu ↔ ftv}. Since the local structure matching method proposed in [3] can only be applied to images with simple geometric transformations and slight distortions, we propose a more complex local structure matching method in section 3.3.2 to handle complex image variations. The basic idea of our proposed method is as follows: when comparing the local structures of two feature points, instead of simply matching their k-nearest neighbors in order of their relative distances as in [3], we compute the similarity of the two local structures only according to those matched neighbors. We discuss how to qualify two matched neighbors in section 3.3.2. Employing this method, we can provide applicable local structure matching for images with complex variations, such as high deformations or object movements.

Assume that we obtain two best-matched local structure pairs, say {fsp ↔ ftq} and {fsu ↔ ftv}, from the local structure matching; either one of them can serve as a reliable correspondence of the two feature point sets Fs and Ft.
All other feature points in Fs and Ft will be converted to the polar coordinate system with respect to the corresponding reference pair. Here we perform the global structure matching in two cues for a consistency check: only those correspondences found by both cues are considered valid matches; the other candidate points are excluded from further processing. As shown in Figure 1, the best-matched local structure pair {fsp ↔ ftq} is input to cue 1 to provide the correspondence for aligning the global structure of the feature points, and a matching pair set MP1 is generated from cue 1; the other best-matched local structure pair {fsu ↔ ftv} is input to cue 2 to generate a matching pair set MP2. Only those pairs generated from both cues are considered valid matching pairs. The global matching method is presented in section 3.3.3.

After we obtain a number of matching pairs from the global structure matching, we apply a validation step to eliminate the low-quality matching pairs by cross-validation. The details of eliminating the low-quality matching pairs are presented in section 3.3.4.

After obtaining a set of correct matching pairs, we can determine the transformation between the two images using QR factorization. We discuss how to derive the correct transformation in section 3.4.

3.2 Feature points detection

In this section, we describe in detail how to extract the positions of a set of feature points using eigenvalues, and how to estimate the orientations of those feature points by a least mean square estimation method.

3.2.1 Feature point position extraction

Features are salient structures in the images, such as significant regions, lines, or points. Typical features are corners, line intersections, points on curves with high curvature, high variance points, and local extrema of the wavelet transformation. In our approach, we employ the OpenCV function GoodFeaturesToTrack [18] to extract feature point positions. In OpenCV, the function GoodFeaturesToTrack is designed to find corners by computing the 'goodness' of a feature point using Tomasi's algorithm. This algorithm computes the eigenvalues of a matrix formed from the intensities of the pixels in a neighborhood of a feature point.

3.2.2 Feature point orientation estimation

A number of methods have been proposed for orientation estimation of the feature points. In our system, we apply the least mean square estimation algorithm proposed in [19][20]. The steps for calculating the orientation at pixel (i, j) are as follows:

1. Divide the input image into blocks of size $W \times W$.

2. For each pixel in the block, calculate the image gradients $G_x$ and $G_y$ using the Sobel operator, where $G_x$ and $G_y$ are the gradient magnitudes in the x and y directions, respectively. The horizontal Sobel operator is used to compute $G_x$ and the vertical Sobel operator is used to compute $G_y$.

3. Estimate the local orientation at each pixel (i, j) by finding the principal axis of variation in the image gradients:

$$V_x(i,j) = \sum_{u=i-w/2}^{i+w/2} \sum_{v=j-w/2}^{j+w/2} 2\,G_x(u,v)\,G_y(u,v), \tag{3.1}$$

$$V_y(i,j) = \sum_{u=i-w/2}^{i+w/2} \sum_{v=j-w/2}^{j+w/2} \left(G_x^2(u,v) - G_y^2(u,v)\right), \tag{3.2}$$

$$\theta(i,j) = \frac{1}{2}\tan^{-1}\!\left(\frac{V_y(i,j)}{V_x(i,j)}\right), \tag{3.3}$$

where $\theta(i,j)$ is the least square estimate of the local orientation of the block centered at pixel (i, j).

4. Smooth the orientation field in a local neighborhood using a Gaussian filter. The orientation image is first converted into a continuous vector field, which is defined as:

$$U_x(i,j) = \cos(2\theta(i,j)), \tag{3.4}$$

$$U_y(i,j) = \sin(2\theta(i,j)), \tag{3.5}$$

where $U_x$ and $U_y$ are the x and y components of the vector field, respectively. After the vector field has been computed, we smooth the orientation with a Gaussian low-pass filter of size $w' \times w'$:

$$U'_y(i,j) = \sum_{u=-w'/2}^{w'/2} \sum_{v=-w'/2}^{w'/2} G(u,v)\,U_y(i-u, j-v), \tag{3.6}$$

$$U'_x(i,j) = \sum_{u=-w'/2}^{w'/2} \sum_{v=-w'/2}^{w'/2} G(u,v)\,U_x(i-u, j-v). \tag{3.7}$$

5. The final smoothed orientation field O at pixel (i, j) is defined as:

$$O(i,j) = \frac{1}{2}\tan^{-1}\!\left(\frac{U'_y(i,j)}{U'_x(i,j)}\right). \tag{3.8}$$

In the feature point matching process, the orientation of the feature points is the most important criterion for feature measurement, so the reliability of the orientation estimation matters. To measure the reliability of the orientation data, we first calculate the area moment of inertia about the orientation axis found (the minimum inertia), and then the inertia about the perpendicular axis (the maximum inertia):

$$I_{min}(i,j) = \frac{G_y^2(i,j) + G_x^2(i,j)}{2} - \frac{\left(G_x^2(i,j) - G_y^2(i,j)\right)U_x(i,j)}{2} - G_x(i,j)\,G_y(i,j)\,U_y(i,j), \tag{3.9}$$

$$I_{max}(i,j) = G_y^2(i,j) + G_x^2(i,j) - I_{min}(i,j), \tag{3.10}$$

where $I_{min}$ and $I_{max}$ denote the minimum and maximum inertia, respectively. If the ratio of the minimum to maximum inertia is close to one, we have little orientation information. Therefore we calculate the reliability of the orientation at pixel (i, j) using the following equation:

$$Reliability(i,j) = 1 - I_{min}(i,j)\,/\,(I_{max}(i,j) + 0.001), \tag{3.11}$$

where $Reliability(i,j)$ denotes the reliability of the orientation at pixel (i, j). For each feature point, if the reliability of its orientation is below a certain threshold, it is eliminated from the feature point set before the subsequent feature points matching stage.
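To make the steps above concrete, here is a rough NumPy/OpenCV sketch of the orientation field and reliability computation following equations (3.1)-(3.11); the block size, filter size and threshold are assumed values, not the thesis' actual parameters, and a box filter is used for the block sums.

```python
import cv2
import numpy as np

def orientation_field(gray, block=16, smooth=5):
    """Least mean square orientation estimation (eqs. 3.1-3.11).
    Returns the smoothed orientation field and a per-pixel reliability map."""
    g = gray.astype(np.float64)
    Gx = cv2.Sobel(g, cv2.CV_64F, 1, 0, ksize=3)   # horizontal gradient
    Gy = cv2.Sobel(g, cv2.CV_64F, 0, 1, ksize=3)   # vertical gradient

    # Block sums of eqs. (3.1) and (3.2), then the doubled-angle estimate (3.3).
    box = (block, block)
    Vx = cv2.boxFilter(2.0 * Gx * Gy, -1, box, normalize=False)
    Vy = cv2.boxFilter(Gx ** 2 - Gy ** 2, -1, box, normalize=False)
    theta = 0.5 * np.arctan2(Vy, Vx)

    # Continuous vector field and Gaussian smoothing (eqs. 3.4-3.8).
    Ux, Uy = np.cos(2 * theta), np.sin(2 * theta)
    Ux_s = cv2.GaussianBlur(Ux, (smooth, smooth), 0)
    Uy_s = cv2.GaussianBlur(Uy, (smooth, smooth), 0)
    orientation = 0.5 * np.arctan2(Uy_s, Ux_s)

    # Reliability from minimum/maximum inertia (eqs. 3.9-3.11).
    Imin = (Gy ** 2 + Gx ** 2) / 2 - (Gx ** 2 - Gy ** 2) * Ux_s / 2 - Gx * Gy * Uy_s
    Imax = Gy ** 2 + Gx ** 2 - Imin
    reliability = 1.0 - Imin / (Imax + 0.001)
    return orientation, reliability

# Feature points whose reliability falls below a chosen threshold (0.5 is an
# assumed value) would be removed before matching:
# keep = [p for p in points if reliability[int(p[1]), int(p[0])] > 0.5]
```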
3.3 Feature points matching

The main procedure of our feature point matching algorithm is shown in Figure 1. As shown in Figure 1, there are four major steps in our matching algorithm: defining an invariant feature descriptor to describe the local positional relations between two feature points; local structure matching to obtain the best-matched local structure pairs; global structure matching to obtain a set of matching pairs; and cross-validation to eliminate the low-quality matching pairs.

3.3.1 Define a feature descriptor

Each feature point i detected before can be represented by a feature vector fi as:

$$f_i = (x_i, y_i, \varphi_i), \tag{3.12}$$

where $(x_i, y_i)$ is its coordinate and $\varphi_i$ is its orientation (see Figure 2). The feature vector fi represents the feature point's absolute structure. However, in this thesis we adopt the naming from [3], so fi is called the 'global structure' of the feature point i.

Figure 2: Feature point i represented by a feature vector fi = (xi, yi, φi).

The global characteristics of the feature point, $x_i$, $y_i$ and $\varphi_i$, are dependent on the rotation and translation of the image. However, a feature point can be described with a rotation and translation invariant feature by using some other feature points in the same image. (Without further specification, rotation and translation in this thesis stand for 2D rotation and 2D translation only.)
Here we adapt the feature descriptor Fij defined in [3] to describe the local positional relations between two feature points fi and fj by their relative distance $d_{ij}$, radial angle $\theta_{ij}$ and orientation difference $\varphi_{ij}$ (see Figure 3):

$$F_{ij} = \begin{pmatrix} d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2} \\[6pt] \theta_{ij} = d\phi\!\left(\tan^{-1}\!\left(\dfrac{y_i - y_j}{x_i - x_j}\right),\ \varphi_i\right) \\[6pt] \varphi_{ij} = d\phi(\varphi_i,\ \varphi_j) \end{pmatrix} \tag{3.13}$$

where $d\phi(t_1, t_2)$ is a function to calculate the difference between two angles $t_1$ and $t_2$, $-\pi < t_1, t_2 \le \pi$, as follows:

$$d\phi(t_1, t_2) = \begin{cases} t_1 - t_2, & \text{if } -\pi < t_1 - t_2 \le \pi \\ 2\pi + t_1 - t_2, & \text{if } t_1 - t_2 \le -\pi \\ -2\pi + t_1 - t_2, & \text{if } t_1 - t_2 > \pi \end{cases} \tag{3.14}$$

From Figure 3, we see that a feature point fi together with the feature description Fij can uniquely determine the position and orientation of another feature point fj (uniqueness). For any feature point pair, their relative distance $d_{ij}$, radial angle $\theta_{ij}$ and orientation difference $\varphi_{ij}$ are invariant to 2D rotation and translation of the image (invariance). Moreover, the elements of the vector Fij are functionally independent (independence).

Figure 3: The local spatial relation between two feature points fi and fj.
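The following minimal Python sketch illustrates equations (3.13) and (3.14); the tuple representation of a feature point (x, y, orientation) is an assumption made for the example, and atan2 is used as the quadrant-aware form of the arctangent in (3.13).

```python
import math

def d_phi(t1, t2):
    """Signed angle difference wrapped into (-pi, pi] (eq. 3.14)."""
    d = t1 - t2
    if d > math.pi:
        d -= 2 * math.pi
    elif d <= -math.pi:
        d += 2 * math.pi
    return d

def descriptor(fi, fj):
    """Relative descriptor F_ij of eq. (3.13) for feature points
    fi = (x_i, y_i, phi_i) and fj = (x_j, y_j, phi_j)."""
    xi, yi, phi_i = fi
    xj, yj, phi_j = fj
    d_ij = math.hypot(xi - xj, yi - yj)                      # relative distance
    theta_ij = d_phi(math.atan2(yi - yj, xi - xj), phi_i)    # radial angle
    phi_ij = d_phi(phi_i, phi_j)                             # orientation difference
    return (d_ij, theta_ij, phi_ij)
```

Because all three components are defined relative to the pair of points, the descriptor is unchanged by a 2D rotation or translation of the whole image.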
3.3.2 Local structure matching

Employing the feature descriptor described in section 3.3.1, for every feature point fi a local structure LSi can be formed from the spatial relations between the feature point fi and its k-nearest neighbors:

$$LS_i = \{F_{i1}^T, F_{i2}^T, \ldots, F_{ik}^T\} = \{d_{i1}, \theta_{i1}, \varphi_{i1}, \ldots, d_{ik}, \theta_{ik}, \varphi_{ik}\}, \tag{3.15}$$

where $F_{ij}$ is the feature descriptor describing the local positional relations between two feature points fi and fj defined in equation (3.13), and $F_{ij}^T$ is the transpose of $F_{ij}$. We should note that LSi is ordered ascendingly by the relative distance $d_{ik}$ between the feature point i and its neighbor k.

It is easy to see that the local structure feature vector LSi is independent of the 2D rotation and translation of the image, so it can be used directly for matching. Thus in local structure matching, given two feature sets Fs = {fs1, ..., fsn} and Ft = {ft1, ..., ftm}, where Fs and Ft consist of all feature points detected from the sensed image s and the reference image t, respectively, the aim is to find two best-matched local structure pairs {fsp ↔ ftq} and {fsu ↔ ftv} to serve as the corresponding reference pairs later in the global structure matching stage. The reason why we need two best-matched local structure pairs is to perform a consistency check, which we explain in more detail in section 3.3.3.

Here we have two ways to measure the similarity level of two local structures, depending on the complexity of the image variations. In case there are only rotation and translation transformations between the images and the image distortions are slight, we employ the direct local structure matching method proposed in [3] for its efficiency. Otherwise, we propose a more complex local structure matching method to solve the problem where the sensed image and the reference image of the same scene have only a few similar local structure pairs. We present both methods in the following.

Direct local structure matching [3]

Suppose $LS_i$ and $LS_j$ are the local structure feature vectors of the feature point i from the sensed image s and the feature point j from the reference image t, respectively. A similarity level between two feature points i and j is defined as

$$sl(i,j) = \begin{cases} \dfrac{b_l - W \cdot |LS_i - LS_j|}{b_l}, & \text{if } W \cdot |LS_i - LS_j| < b_l \\[6pt] 0, & \text{otherwise} \end{cases} \tag{3.16}$$

$$W = \{\underbrace{w, \ldots, w}_{k}\}, \quad \text{where } w = (w_d, w_\theta, w_\varphi), \tag{3.17}$$

$$W \cdot |LS_i - LS_j| = w_d|d_{i1} - d_{j1}| + w_\theta|\theta_{i1} - \theta_{j1}| + w_\varphi|\varphi_{i1} - \varphi_{j1}| + \cdots + w_d|d_{ik} - d_{jk}| + w_\theta|\theta_{ik} - \theta_{jk}| + w_\varphi|\varphi_{ik} - \varphi_{jk}|. \tag{3.18}$$

The differences between directions and angles in equation (3.16) should be calculated using equation (3.14). W is a weight vector that specifies the weight associated with each component of the feature vector. As we know, $0 \le |d_{ik} - d_{jk}| \le \max(size_s, size_t)$, $0 \le |\theta_{ik} - \theta_{jk}| \le 2\pi$, and $0 \le |\varphi_{ik} - \varphi_{jk}| \le 2\pi$; therefore the weight vector W is defined so that the ranges of $w_d|d_{ik} - d_{jk}|$, $w_\theta|\theta_{ik} - \theta_{jk}|$ and $w_\varphi|\varphi_{ik} - \varphi_{jk}|$ are normalized to [0, 1]:

$$w_d = 1/\max(size_s, size_t), \quad w_\theta = w_\varphi = 1/2\pi,$$

where $size_s$ and $size_t$ denote the sizes of the sensed image s and the reference image t, respectively. Note that we define $w_d$ in inverse proportion to the square root of the image size: the larger the image, the smaller the effect of the distance difference. The threshold $b_l$ can be 3k.

Instead of simply specifying matched or not matched, the similarity level $sl(i,j)$, $0 \le sl(i,j) \le 1$, describes the matching level of a local structure pair: $sl(i,j) = 0$ implies a total mismatch, while $sl(i,j) = 1$ implies a perfect match. Therefore the two best-matched local structure pairs {fsp ↔ ftq} and {fsu ↔ ftv} are obtained by maximizing the similarity level.
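A sketch of the direct matching rule of equations (3.16)-(3.18) is given below; it reuses the descriptor and d_phi helpers from the previous sketch, and the image-size normalization and k are passed in explicitly rather than taken from fixed settings.

```python
import math

def local_structure(point, neighbors):
    """Local structure LS_i (eq. 3.15): descriptors to the k nearest neighbors,
    ordered by increasing relative distance. Uses descriptor() defined above."""
    descs = [descriptor(point, n) for n in neighbors]
    return sorted(descs, key=lambda f: f[0])

def direct_similarity(ls_i, ls_j, max_size, k):
    """Direct similarity level sl(i, j) of eq. (3.16), pairing neighbors
    strictly in order of relative distance."""
    w_d, w_ang = 1.0 / max_size, 1.0 / (2 * math.pi)   # weights of eq. (3.17)
    b_l = 3.0 * k                                      # threshold from the text
    total = 0.0                                        # accumulates eq. (3.18)
    for (d1, t1, p1), (d2, t2, p2) in zip(ls_i, ls_j):
        total += (w_d * abs(d1 - d2)
                  + w_ang * abs(d_phi(t1, t2))
                  + w_ang * abs(d_phi(p1, p2)))
    return (b_l - total) / b_l if total < b_l else 0.0
```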
Complex local structure matching

The direct local structure matching method in [3] is efficient, since the computation time for sl(i, j) is only O(k), where k is the number of feature points in a neighborhood. However, any spurious or dropped feature points in the neighborhood may result in an invalid local structure for matching. As defined by equation (3.15), the local structure LSi of a feature point pi is ordered ascendingly by the relative distance $d_{ik}$ between the feature point pi and its neighbor pk. Thus, when computing the similarity level between two feature points pi and pj, equation (3.16) simply matches their k-nearest neighbors in order of their relative distances. If any dropped or spurious feature points in the neighborhood disturb this order, the local structure matching becomes invalid. The example in Figure 4 demonstrates this case.

Figure 4: Spurious or dropped feature points in the neighborhood will result in an invalid local structure for matching.

In Figure 4, suppose a feature point pi in the sensed image s has a neighborhood $knn_i = \{p_1, p_2, p_3\}$, and pi's corresponding point pj in the reference image t has a neighborhood $knn_j = \{p_0', p_1', p_2'\}$, of which $\{p_1 \leftrightarrow p_1'\}$ and $\{p_2 \leftrightarrow p_2'\}$ are two matching pairs. Because of image distortions or scene changes, there is no matching feature point for p3 in $knn_j$, and a spurious feature point $p_0'$, which does not match any feature point in $knn_i$, appears instead. As the local structure of a feature point is ordered by the relative distance between the feature point and its neighbors, we have $LS_i = \{F_{i1}^T, F_{i2}^T, F_{i3}^T\}$ and $LS_j = \{F_{j0}^T, F_{j1}^T, F_{j2}^T\}$. Therefore, using equation (3.16), the similarity level between the local structures of pi and pj is computed from the mismatched neighbor differences

$$sl(p_i, p_j) = |p_1 - p_0'| + |p_2 - p_1'| + |p_3 - p_2'|. \tag{3.19}$$

From equation (3.19) we can see that the similarity level of the physically corresponding pair pi and pj would be very low, since their neighbors are totally mismatched. Thus the similarity level computed by (3.16) is not reliable when this local structure situation occurs.

Experimental results in section 4.1 show that this direct local structure matching fails to compute the best-matched local structure pairs when the image variation is complex, such as for images with significant scene changes (Figure 6), with compression distortion (Figure 7), from highly different viewpoints, or from different sensors (Figure 8). This is because in these cases the sensed image and the reference image of the same scene may have only a few completely-matched local structures. Therefore we propose a more complex local structure matching method to handle these cases.

In general, we improve the local structure matching method in two ways. Firstly, when we match the neighbors of two candidate feature points, we consider not only the relative distance but also the radial angle and orientation difference. This helps us to avoid the mismatch shown in Figure 4: by adding the criteria on radial angle and orientation difference, p1 will not be matched to $p_0'$. Secondly, after we identify the matched neighbors, we drop from the similarity level computation those feature points which cannot find a match in each other's neighborhood. For the example shown in Figure 4, only $\{p_1 \leftrightarrow p_1'\}$ and $\{p_2 \leftrightarrow p_2'\}$ will be considered in our local structure matching computation; p3 and $p_0'$ will be dropped because they cannot find a match in each other's neighborhood. This condition improves the accuracy of the local structure matching: when an image has significant scene changes or serious distortion, two physically matched feature points may have only a few matched neighbors in their corresponding neighborhoods, and the unmatched neighbors would lower the similarity level of the physically matched pair considerably; therefore we remove them from the computation.

The basic idea of our proposed local structure matching is as follows: when calculating the similarity of the local structures of two candidate feature points, say p and q, instead of matching their k-nearest neighbors directly in order of their relative distances as in (3.16), for each neighbor of point p we search for its most similar matching point in the neighborhood of q according to the relative distance, radial angle and orientation difference. Furthermore, we compute the similarity of the two local structures only according to those matched neighbors.

Suppose we are checking the similarity level between the feature point p from the sensed image s and the feature point q from the reference image t. Let $Knn_p$ and $Knn_q$ denote the k-nearest neighborhoods of the feature points p and q, respectively. For every feature point, say n, in $Knn_p$, we will find its most similar point, say m, in $Knn_q$. Using equation (3.13), we let $F_{pn}$ denote the feature descriptor of point p related to its neighbor n, and $F_{qm}$ denote the feature descriptor of point q related to its neighbor m. A feature point n in the neighborhood $Knn_p$ and a feature point m in the neighborhood $Knn_q$ are qualified as a matching pair $\{n \leftrightarrow m\}$ if the following three conditions are satisfied.
Equation (3.20)searches for every member in Knnq to find a match for n, while equation (3.21) searches for every member in Knnp to find a match for m. We employ both equation (3.20)and equation (3.21) to avoid a feature point being doubly used for matching. Condition 2: W ⋅ | Fpn − Fqm |≤ Tc , where Tc = 0.75 (3.22) Condition 2 forces the values of W . Fpn − Fqm is smaller than Tc, where W is a weight vector defined in equation(3.17), and Tc is a threshold value empirically chosen as 0.75. As we know, the value of wd dik − d jk , wθ θ ik − θ jk , wϕ ϕik − ϕ jk has been normalized in range of [0, 1], so the value of W ⋅ | Fpn − Fqm | is in range of [0, 3]. If the feature points (p,q), (n,m) are physically matched, we can assume that d pn − d qm ≤ max( sizes , sizet ) / 2 38 ,and θ pn − θ qm ≤ π / 4, ϕ pn − ϕ qm ≤ π / 4. Therefore W ⋅ | Fpn − Fqm |≤ 0.75 . Condition 3: θ np − θ mq ≤ π / 4 (3.23) As we know if both the feature points pair {n ↔ m} and the feature points pair { p ↔ q} are matching pair, the relative orientation difference between θ np and θ mq should be small. Hence it is reasonable to introduce the condition 3, the relative orientation criterion, to force the relative orientation difference within certain threshold, i.e., θ np − θ mq ≤ π / 4 . Adding this criterion will speed up the search time. For example, in Figure 3, p1 and p0’ will be identified as unmatched pair just by checking condition 3. In the real implementation, the orientation constraint will be tested first. If the constraint is not satisfied, it is not necessary to test condition 1 and condition 2. Therefore the number of pairs need to be compared are significantly reduced. Then the similarity level between the feature point p in sensed image s and the feature point q in reference image t is defined as sl ( p, q) = 39 bl − nsl ( p, q ) , bl (3.24) nsl ( p, q) = n↔m ∑ W n∈knn p ,m∈knnq Fpn − Fqm bl = Tc ⋅ | {n ↔ m | n ∈ knn p , m ∈ knnq } | (3.25) (3.26) where nsl(p,q) computes the similarly level only for those matched neighbor pairs from Knnp and Knnq according to conditions 1-3, W is a weight vector defined in equation (3.17), and Tc is a threshold value defined in equation (3.22). From condition 2, we know W Fpn − Fqm < Tc if point n and point m are matched neighbors. Thus we define threshold bl as Tc multiply by number of matching neighbor pairs, to make sure that the similarly level sl(p,q) is always greater than zero. The two best-matched local structure pairs { f sp ↔ f tq } and { f su ↔ ftv } are obtained by maximizing the similarity level, that is to say, select the two local structure pairs with the top two similarity levels among all local structures from template and sensed image. For example shown in Figure 3, a feature point pi in the sensed image s has a neighborhood {p1, p2, p3}, and pi ’s corresponding point pj in the reference image t has a neighborhood { p0 ', p1 ', p2 '} , where { p1 ↔ p1 '} and { p2 ↔ p2 '} are two matching pairs. Employing the condition 1-3 we can recognize the correct matching 40 { p2 ↔ p2 '} and { p1 ↔ p1 '} from two neighborhoods, thus the similarity level between the local structure of pi and local structure of pj is given as sl ( pi , p j ) = 1.5 − (| p2 − p2 ' | + | p3 − p3 ' |) 1.5 (3.27) Therefore our proposed local structure matching solves the problem that sensed image and the reference image from the same scene may have very few completely-matched local structures. 
Performance analysis of the local structure matching

We employ two methods to calculate the similarity of the local structures. The direct matching method is efficient, since the computation time for sl(i, j) is only O(k), where k is the number of feature points in a neighborhood. However, it gives a correct similarity measure only when two local structures are completely matched, and spurious or dropped feature points in the neighborhood will result in an invalid similarity measurement. To solve this problem we propose a more complex local structure matching method, which measures the similarity of two local structures only according to the relatively-matched feature points in the neighborhoods. This method can give a correct similarity measurement for two corresponding feature points which do not have completely matched local structures. The experimental comparison of these two local structure methods (Table 1, section 4.1) supports this analysis.

However, the reliability of the local structure matching must be increased further. Two limitations lie in the local structure matching: firstly, two different feature points from the sensed image and the reference image may have similar local structures; secondly, two images of the same scene may have only a small number of well-matched local structures. An idea to improve the reliability is to add the feature points' global structures to the matching process. Although not all well-matched local structures are reliable, and although not every corresponding pair has a well-matched local structure, our experiments show that the two best-matched local structure pairs among all local structures of the template and sensed images are very reliable. Therefore the two best-matched local structure pairs obtained from the local structure matching can serve as reliable corresponding references in the global structure matching.

3.3.3 Global structure matching

The global structure (absolute structure) of a feature point described by equation (3.12) is dependent on the rotation and translation of the image. To align the two feature sets Fs and Ft from the sensed image s and the reference image t, a corresponding reference pair should be found. At this stage we have two best-matched local structure pairs, say (p, q) and (u, v), from the previous local structure matching; either one of them can serve as a reliable correspondence for the two feature sets Fs and Ft. We perform the global structure matching in two cues for a consistency check. As shown in Figure 1, one of the best-matched local structure pairs, (p, q), is sent to cue 1 as the corresponding reference to align the two feature sets, while the other best-matched local structure pair, (u, v), is sent to cue 2 for the same purpose.
In cue1, all other feature points from the sensed image $s$ and the reference image $t$ are aligned based on $(p, q)$ by converting them to a polar coordinate system with respect to the corresponding references $f_p$ and $f_q$:

$GS_i^s = \begin{pmatrix} r_{ip} \\ \theta_{ip} \\ \varphi_{ip} \end{pmatrix} = \begin{pmatrix} \sqrt{(x_i - x_p)^2 + (y_i - y_p)^2} \\ d\varphi\!\left(\tan^{-1}\dfrac{y_i - y_p}{x_i - x_p},\; \varphi_p\right) \\ d\varphi(\varphi_i, \varphi_p) \end{pmatrix}$,   (3.28)

$GS_j^t = \begin{pmatrix} r_{jq} \\ \theta_{jq} \\ \varphi_{jq} \end{pmatrix} = \begin{pmatrix} \sqrt{(x_j - x_q)^2 + (y_j - y_q)^2} \\ d\varphi\!\left(\tan^{-1}\dfrac{y_j - y_q}{x_j - x_q},\; \varphi_q\right) \\ d\varphi(\varphi_j, \varphi_q) \end{pmatrix}$,   (3.29)

where $GS_i^s$ and $GS_j^t$ represent the aligned global structures of feature point $i$ in the sensed image $s$ and feature point $j$ in the reference image $t$, based on the corresponding references $p$ and $q$ respectively. We then define the matching level $ml(i, j)$ for feature point $i$ of the sensed image $s$ and feature point $j$ of the reference image $t$ by

$ml(i, j) = \begin{cases} \dfrac{bl - W \cdot |GS_i^s - GS_j^t|}{bl}, & \text{if } |GS_i^s - GS_j^t| < B_g \\ 0, & \text{otherwise} \end{cases}$   (3.30)

where $W$ is the weight vector defined in equation (3.17), $bl$ is a threshold set to 3 to ensure that the matching level $ml(i, j)$ is always greater than zero, and $B_g$ is a 3-D bounding box in the feature space that tolerates image deformation. We empirically choose $B_g = (10, \pi/4, \pi/4)$. Note that instead of employing the local structure similarity level $sl(i, j)$ in the global matching as [3] does, we calculate the matching level $ml(i, j)$ by directly matching the aligned global structures $GS_i^s$ and $GS_j^t$. This is because the global structure (absolute structure) is defined as the spatial relation between a feature point and the best-matched local structure pairs, while the local structure (relative structure) is defined as the spatial relation between a feature point and all of its k-nearest neighbors. Any spurious or dropped feature point in the neighborhoods decreases the local structure similarity level of two physically matched feature points. Therefore, when image variations are complex, the local structure matching is reliable only for those pairs with the highest local structure similarity levels.

Thus, for an arbitrary feature point $a$ in the feature set $F_s$, we find the feature point $b$ in the feature set $F_t$ such that

$ml(a, b) = \max_j ml(a, j)$,   (3.31)

and for this feature point $b$, we search for the feature point $c$ in the feature set $F_s$ such that

$ml(c, b) = \max_i ml(i, b)$.   (3.32)

The feature point $a$ and the feature point $b$ are recognized as a matching pair if and only if the feature point $c$ found in equation (3.32) and the feature point $a$ are the same point. A matching pair set MP1 containing all such correspondences is generated as the output of cue1.

The relatively large bounding box employed in the global matching may introduce some false matching pairs that we need to eliminate. We therefore perform a consistency check to improve the reliability of the feature point matching. In cue2, we align the two feature sets with respect to the corresponding references $f_u$ and $f_v$, and then perform the same global matching as in cue1 to generate the matching pair set MP2. Only those pairs found in both cues are considered valid matching pairs. Finally, the matching pair set MP, which is the intersection of MP1 and MP2, is output as the result of the global structure matching. Experimental results showing how the consistency check eliminates false matching pairs are given in Figure 9-Figure 11, section 4.2.
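The two-cue global matching described above can be summarized by the short sketch below (Python/NumPy). The helper names align, match_one_cue and global_matching are ours, and the handling of angle wrap-around is a simplification of the thesis's dφ(·,·); the sketch is only meant to make equations (3.28)-(3.32) and the MP = MP1 ∩ MP2 consistency check concrete.

import numpy as np

def align(points, phis, ref):
    # Equations (3.28)/(3.29): polar coordinates of every feature point
    # relative to the alignment reference ref = (x, y, phi).
    x0, y0, phi0 = ref
    dx, dy = points[:, 0] - x0, points[:, 1] - y0
    r = np.hypot(dx, dy)
    theta = (np.arctan2(dy, dx) - phi0) % (2 * np.pi)
    dphi = (phis - phi0) % (2 * np.pi)
    return np.stack([r, theta, dphi], axis=1)

def match_one_cue(GS_s, GS_t, W, Bg=(10.0, np.pi / 4, np.pi / 4), bl=3.0):
    # Matching level of equation (3.30) plus the mutual-best test of (3.31)-(3.32).
    diff = np.abs(GS_s[:, None, :] - GS_t[None, :, :])
    diff[..., 1:] = np.minimum(diff[..., 1:], 2 * np.pi - diff[..., 1:])  # wrap angle differences
    inside = np.all(diff < np.asarray(Bg), axis=2)                        # bounding box Bg
    ml = np.where(inside, (bl - diff @ np.asarray(W)) / bl, 0.0)
    pairs = set()
    for a in range(ml.shape[0]):
        b = int(np.argmax(ml[a]))
        if ml[a, b] > 0 and int(np.argmax(ml[:, b])) == a:   # a and b choose each other
            pairs.add((a, b))
    return pairs

def global_matching(pts_s, phi_s, pts_t, phi_t, ref1, ref2, W):
    # ref1 = (f_p, f_q) and ref2 = (f_u, f_v): the two best-matched reference pairs.
    MP1 = match_one_cue(align(pts_s, phi_s, ref1[0]), align(pts_t, phi_t, ref1[1]), W)
    MP2 = match_one_cue(align(pts_s, phi_s, ref2[0]), align(pts_t, phi_t, ref2[1]), W)
    return MP1 & MP2                                         # MP = MP1 intersect MP2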
3.3.4 Eliminating the low-quality matching pairs

We have now obtained a number of matching pairs from the global structure matching and can derive the registration parameters from them. However, if accuracy is a major concern, the current results may not satisfy the requirement. In these cases we introduce a validation stage to eliminate the low-quality matching pairs. (An example of low-quality matching pairs is given in Figure 11, section 4.2. Among all matching pairs indicated by numbered red dots, the matching pairs 3, 4, 7 and 11 are regarded as low-quality matching pairs.) In this stage, the low-quality matching pairs in the matching pair set MP are identified by cross-validation. First, we calculate the root mean square error (RMSE) over the whole matching pair set. Then, in each step, we exclude one pair (say $P_i$) from the set of matching pairs and calculate the root mean square error again (say $RMSE_i$). If the difference between $RMSE_i$ and RMSE is greater than a given threshold, the matching pair $P_i$ is identified as a low-quality matching pair. Eliminating those pairs, we obtain the correct matching pair set MP' as the final output of the feature point matching method. Experimental results showing how cross-validation eliminates the low-quality matching pairs are given in Figure 11 and Figure 12, section 4.2.

3.3.5 Performance analysis

In Jiang and Yau's fingerprint minutia matching algorithm [3], the local and global structures together provide a solid basis for reliable and robust minutiae matching. However, that algorithm is suitable only for images under rotation and translation, and it tolerates only slight image deformation. In this thesis, we adapt the local and global structure matching methods of the fingerprint matching algorithm and modify them so that we obtain a set of applicable corresponding feature points for general images taken under various imaging conditions. However, since the invariant feature descriptor employed in our algorithm contains the relative distance, the proposed algorithm is limited to images of roughly the same scale. Removing this limitation is considered one of the further works.

3.4 Transformation model estimation

After a set of correct matching pairs is obtained, the mapping function is constructed; it should transform the sensed image so that it overlays the reference image. The correspondence of the feature points, together with the requirement that corresponding pairs be as close as possible after the sensed image is transformed, is used in the transformation model estimation. Assume the matching pair set obtained is $\{u_i \leftrightarrow v_i\}_{i = 1, 2, \ldots, N}$, where $N$ is the total number of correct matching pairs. In general, the 2D point sets $\{u_i\}$ and $\{v_i\}$ should satisfy the relation

$v_i = A u_i$   (3.33)

where $A$ is the mapping function we want to solve for. The type of the mapping function $A$ should correspond to the assumed geometric deformation of the sensed image. When the deformation is unknown, we assume $A$ is the RST (rotation, scaling, translation) transformation. The parameters of the mapping function $A$ are computed by the least-squares method using QR factorization.
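A brief sketch of this estimation step, together with the cross-validation of section 3.3.4 that relies on the same residual computation, is given below in Python/NumPy. The RST parameterization anticipates equation (3.34) in the next subsection; np.linalg.lstsq stands in for the QR-based least-squares solve, and the cross-validation threshold and the exact form of the RMSE comparison are our reading of the text rather than values taken from the thesis.

import numpy as np

def fit_rst(U, V):
    # Least-squares fit of v_i = A u_i with A as in equation (3.34):
    #   v_x = a1*x - a2*y + a3,   v_y = a2*x + a1*y + a4.
    x, y = U[:, 0], U[:, 1]
    zeros, ones = np.zeros_like(x), np.ones_like(x)
    M = np.vstack([np.column_stack([x, -y, ones, zeros]),
                   np.column_stack([y,  x, zeros, ones])])
    b = np.concatenate([V[:, 0], V[:, 1]])
    params, *_ = np.linalg.lstsq(M, b, rcond=None)
    return params                                  # (a1, a2, a3, a4)

def rmse(params, U, V):
    a1, a2, a3, a4 = params
    pred = np.column_stack([a1 * U[:, 0] - a2 * U[:, 1] + a3,
                            a2 * U[:, 0] + a1 * U[:, 1] + a4])
    return float(np.sqrt(np.mean(np.sum((pred - V) ** 2, axis=1))))

def cross_validate(U, V, threshold=1.0):
    # Section 3.3.4: flag pair i as low quality when leaving it out changes the
    # fit by more than `threshold` (threshold value chosen here for illustration).
    base = rmse(fit_rst(U, V), U, V)
    keep = []
    for i in range(len(U)):
        Ui, Vi = np.delete(U, i, 0), np.delete(V, i, 0)
        if abs(rmse(fit_rst(Ui, Vi), Ui, Vi) - base) <= threshold:
            keep.append(i)
    return U[keep], V[keep]

Equation (3.35) then recovers the transformation parameters from the fitted values, e.g. s = sqrt(a1^2 + a2^2), theta = arctan(a2/a1), (t_x, t_y) = (a3, a4).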
When $A$ is the RST transformation function,

$A = \begin{bmatrix} s\cos\theta & -s\sin\theta & t_x \\ s\sin\theta & s\cos\theta & t_y \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} a_1 & -a_2 & a_3 \\ a_2 & a_1 & a_4 \\ 0 & 0 & 1 \end{bmatrix}$   (3.34)

Therefore the scaling parameter $s$, the rotation parameter $\theta$, and the translation parameters $(t_x, t_y)$ can be derived by

$t_x = a_3, \quad t_y = a_4, \quad s = \sqrt{a_1^2 + a_2^2}, \quad \theta = \tan^{-1}(a_2 / a_1)$.   (3.35)

Chapter 4 Experimental Results

A series of experiments was performed to evaluate the performance of our registration algorithm. The majority of our testing images are from the testing data of [21], an automatic registration system for remote sensing images under development at the Division of Image Processing (National Institute for Space Research - INPE) and the Vision Lab (Electrical & Computer Engineering Department, UCSB). The images that we adapt from [21] include optical, radar, multi-sensor, high-resolution, and Landsat images. Furthermore, our algorithm has been tested on retina images taken by a Canon CR6-45NM retinal camera and on some video sequence pictures. The testing platform is a Pentium 2.20 GHz machine with 512 MB RAM.

In section 4.1, the performance of the two local structure matching methods discussed in section 3.3.2 is presented. To demonstrate the power of our local structure method for estimating the alignment reference, we ran experiments on a set of images with complex image variations. In section 4.2, we show how the reliability of the feature point matching is improved by the double global structure matching and by cross-validation. Finally, in section 4.3, to demonstrate our algorithm's flexibility for different types of images, registration results are presented for a series of images taken under various imaging conditions.

4.1 Results of local structure matching

To demonstrate the power of the local structure matching method proposed in section 3.3.2, we ran tests on the following four pairs of images (Figure 5-Figure 8) with different types of image variations. The results of the local structure matching are shown in Figure 5-Figure 8. For every pair of images, the red dots and red arrows indicate the positions and orientations of the feature points, and the two best-matched local structure pairs computed are circled and labeled in blue. The variations between the input and reference images, the image sizes, and the numbers of feature points detected in both images are listed in Table 1. From Figure 5-Figure 8, we see that the best-matched local structure pairs computed by the improved local structure matching method are applicable. The proposed local structure method applies not only to images with geometric transformations (Figure 5), but also to images with large temporal differences (Figure 6), images with compression distortions (highly JPEG compressed) (Figure 7), and images from different sensors (Figure 8).

Figure 5: The local structure matching on images with geometric transformations; (a) reference image, (b) input image. The two best-matched local structure pairs are circled and labeled in blue.

Figure 6: The local structure matching on images with large temporal difference. The two best-matched local structure pairs are circled and labeled in blue.

Figure 7: The local structure matching on images with image distortions (highly JPEG compressed); (a) template image, (b) input image. The two best-matched local structure pairs are circled and labeled in blue.

Figure 8: The local structure matching on images from different sensors.
(a) SPOT band 3; (b) TM band 4. The two best-matched local structure pairs are circled and labeled in blue.

To compare the performance of the two local structure matching methods of section 3.3.2, Table 1 presents the experimental results on the four pairs of images for both methods. In Table 1, the variation between the input and reference images is shown in the 2nd column. The image sizes and the numbers of feature points used are listed in the 3rd and 4th columns, respectively. In the 5th and 6th columns, method 1 is the direct local structure matching method of [3], while method 2 is the proposed complex local structure matching method. For each method, the time for computing the best-matched local structure pair is listed; × means the corresponding method fails to compute the best-matched local structure pair.

Testing images | Image variation type | Image size | # Feature points | Method 1 | Method 2
Figure 5 | Smooth geometric transformation | 1024x1024, 1172x1064 | 95, 86 | 2.04 s | 20.34 s
Figure 6 | Large temporal difference | 350x405, 350x405 | 114, 139 | × | 57.80 s
Figure 7 | Compression distortion | 386x306, 472x335 | 96, 81 | × | 15.25 s
Figure 8 | Differences in sensor measurement | 256x256, 256x256 | 97, 103 | × | 46.63 s

Table 1: Comparison of two local structure matching methods.

From Table 1, we see that the direct local structure matching method fails on images with complex image variations such as significant scene changes or compression distortions, while the complex local structure matching method generates applicable results in those cases. However, the direct matching method is more efficient, since its computation time is only $O(kmn)$, while the computation time of the complex matching method is $O(k^2 mn)$, where $k$ is the number of feature points in a neighborhood and $m$ and $n$ are the numbers of feature points in the input and reference images.

4.2 Results of global structure matching

In this section, we show step by step how the reliability of the feature point matching is improved by the double global structure matching and by cross-validation. The testing images are a pair of urban images taken from SPOT and TM (Figure 8). Note that the intensity differences between the input image and the reference image are large. In Figure 9-Figure 10, a matching pair detected in the current step is denoted by a pair of green dots numbered with the same digit in the input image and in the reference image. Figure 9 and Figure 10 show the matching pairs obtained from the global structure matching in cue1 and cue2, respectively, where the two alignment reference pairs for cue1 and cue2 are those shown in Figure 8. We then output only those pairs found by both cues as the result of the global structure matching (Figure 11). In Figure 11, we observe that most of the false matching pairs are eliminated by the consistency check, except the matching pairs 3, 4, 7 and 11, which are not matched correctly. These low-quality matching pairs can be identified by cross-validation. Finally we obtain the set of applicable matching pairs shown in Figure 12.

Figure 9: The matching pairs detected from the global structure matching in cue1.
Figure 10: The matching pairs detected from the global structure matching in cue2.

Figure 11: The matching pairs obtained from the intersection of the results in cue1 and cue2. Among them the matching pairs 3, 4, 7 and 11 are not matched correctly.

Figure 12: The final matching pair set after cross-validation (template and input images).

4.3 Registration results on various images

To demonstrate our algorithm's flexibility, we test it on a series of images taken under various imaging conditions and associated with different image variations (Figure 13-Figure 20). The registration results on the testing images are shown in Table 2. To check the correctness of the registration results, we also list the registration results on the same images generated by the UCSB automatic registration system [21], in which optical flow ideas [22] [23] are used to extract the features in both images [24]. The UCSB system is widely used for image registration, so we compare the registration parameters estimated by our algorithm with those obtained by the UCSB system and show that they are very close. In Table 2, [s1, tx1, ty1, θ1] are the registration parameters generated by our method, while [s2, tx2, ty2, θ2] are the results generated by the UCSB automatic registration system. In the last two columns of Table 2, RMSE is the root mean square error at the matching pairs, and #MP is the number of matching pairs detected for each pair of images. In addition, the correctness of the registration results can be verified by checking the continuity of the ridges between fields, forest, or urban areas in the registered images in Figure 13-Figure 20.

Testing images | s1 | s2 | tx1 | tx2 | ty1 | ty2 | θ1 | θ2 | RMSE | #MP
Figure 13 | 1.002 | 1.002 | 715.1 | 714.9 | -489.66 | -490.67 | -25.02 | -24.98 | 1.607 | 211
Figure 14 | 1.012 | 0.996 | 87.07 | 75.06 | 9.83 | 9.57 | -1.234 | -1.098 | 4.192 | 8
Figure 15 | 1.042 | 0.997 | 21.49 | 22.35 | -8.205 | -8.937 | -0.668 | -0.168 | 1.498 | 6
Figure 16 | 0.994 | 0.991 | 87.88 | 87.65 | -78.98 | -79.30 | 0.125 | 0.193 | 1.081 | 19
Figure 17 | 0.991 | 0.991 | 33.57 | 34.24 | -183.43 | -186.24 | 0.984 | 1.032 | 9.428 | 17
Figure 18 | 0.998 | 0.997 | 1.44 | 1.84 | -3.17 | -0.91 | -0.269 | -0.047 | 2.185 | 8
Figure 19 | 0.997 | 1.004 | 144.90 | 144.86 | 75.33 | 74.22 | -19.90 | -20.20 | 1.611 | 21
Figure 20 | 1.072 | N.A | -150.42 | N.A | 119.96 | N.A | 1.35 | N.A | 9.26 | 15

Table 2: The registration results on the 8 pairs of images in Figure 13-20. [s1, tx1, ty1, θ1] are the registration parameters generated by our method; we compare them with [s2, tx2, ty2, θ2] generated by the UCSB automatic registration system. N.A indicates that the registration fails for the corresponding image because of the bad fit of the estimated correspondences. RMSE is the root mean square error at the matching pairs. #MP is the number of matching pairs detected for each pair of images.

Table 2 gives a quantitative review of our registration algorithm. We see that the RMSE values for most of the testing images are within 5 pixels; the exceptions are Figure 17 (an Amazon region image with deforestation) and Figure 20 (retina images), whose RMSE values are the largest among all images. This is because their transformations are not the affine transformation assumed when computing the registration parameters.
However, if we check the corresponding points detected in Figure 17 (a), (b) and Figure 20 (a), (b), we see that the detected control points are correctly matched. In Table 2, we also see that the registration results from our algorithm and those from the UCSB system are close except for Figure 15 and Figure 18. The reason for the difference is that the number of corresponding points detected is too small and the transformations are not the assumed affine transformations; in these cases, the transformation parameters computed are not accurate.

Figure 13 shows an example of high-resolution image registration: (a) and (b) are two high-resolution Ikonos optical images with simulated 2D rotation and translation, where (a) is of size 1024x1024 and (b) is of size 1172x1064. The red dots on (a) and (b) indicate the positions of the matching pairs detected. Figure 13(c) shows the result after applying our registration method. In this experiment, the estimated transformation parameters are: Scale: [1.0021], Angle: [-25.0182], Tx: [714.9755], Ty: [-489.859]. The correctness of the registration result can be verified by checking the continuity of the ridges between fields in Figure 13(c).

Figure 14 shows an example of urban images from two different sensors: (a) and (b) are the input images, where (a) is taken from SPOT band 3 and (b) from TM band 4. Note that the intensity differences between (a) and (b) are quite large. The red dots on (a) and (b) indicate the positions of the matching pairs detected, from which we see that our feature point matching method estimates very accurate correspondences. Figure 14(c) shows the result after applying our registration method. In this experiment, the estimated transformation parameters are: Scale: [1.0122], Angle: [-1.2343], Tx: [87.0705], Ty: [9.8341]. We can verify the correctness of the registration result by comparing it with the result generated by the UCSB automatic registration system in Table 2.

Figure 15 and Figure 17 both work on forest images with a two-year difference. In Figure 15, the input image pair (a) and (b) are two radar images of the Amazon region, and the estimated transformation parameters are: Scale: [0.98179], Angle: [2.173], Tx: [21.4974], Ty: [-16.2047]. In Figure 17, the input images (a) and (b) are two Amazon region images from TM band 5, and the estimated transformation parameters are: Scale: [0.99183], Angle: [0.98362], Tx: [33.5682], Ty: [-173.4304]. In both cases, we can observe deforestation in the overlapped area of the registered image.

Figure 16 shows an example of Landsat images with a four-year difference and an associated rotation. The estimated registration parameters are: Scale: [0.99481], Angle: [0.12507], Tx: [87.8858], Ty: [-78.9869]. Despite the fact that these images contain significant changes in scene, our algorithm still produces reliable feature point correspondences. One can verify the accuracy of the feature point matching by checking the red dots on (a) and (b) of each figure, which indicate the positions of the matching pairs.

Figure 18 shows an example of two trailer parking images with large temporal changes. Note that the local structure between the reference image (a) and the input image (b) changes significantly; in general, it is hard to estimate correspondences between images with highly different local structures. The estimated registration parameters are: Scale: [0.99895], Angle: [-0.26995], Tx: [1.4411], Ty: [-3.1712]. In the result image (c) it is possible to see the temporal changes in the overlapping area.
To demonstrate our approach's tolerance of image distortions, we also apply our algorithm to images with serious distortions. Figure 19 shows an example of highly distorted images with a large rotation: (a) and (b) are two highly JPEG-compressed optical images; (c) is the registration of (a) and (b). The estimated registration parameters are: Scale: [0.99725], Angle: [-19.8958], Tx: [144.9035], Ty: [75.3361]. Despite the serious image distortion and the large rotation, our algorithm produces accurate registration results. The correctness of the registration result can be verified either by checking the continuity of the ridges between fields in Figure 19(c), or by comparing our results with the results generated by the UCSB automatic registration system in Table 2.

To show our approach's flexibility, Figure 20 provides an example of retina image registration, on which the UCSB automatic registration system fails because of the bad fit of the estimated correspondences. Figure 20(a) and (b) are two retina images taken by a Canon CR6-45NM retinal camera with significant eyeball rotations. The estimated correspondences are shown by the red dots on (a) and (b). Figure 20(c) is the registration result of (a) and (b). The estimated registration parameters are: Scale: [1.0274], Angle: [1.3509], Tx: [-150.4187], Ty: [119.9696]. With the help of these transformation parameters, a 3D reconstruction of the optical fundus can be performed.

Figure 13(a), (b): Two high-resolution Ikonos optical images with simulated rotation and translation; (a) is of size 1024x1024 and (b) is of size 1172x1064.

Figure 13(c): Registration of high-resolution images; the registration result of (a) and (b). The estimated registration parameters are: Scale: [1.0021], Angle: [-25.0182], Tx: [714.9755], Ty: [-489.859].

Figure 14: Registration of urban images from different sensors. (a): SPOT band 3 (08/08/95); (b): TM band 4 (06/07/94); (c): the registration result of (a) and (b). The estimated registration parameters are: Scale: [1.0122], Angle: [-1.2343], Tx: [87.0705], Ty: [9.8341].

Figure 15: Registration of two Amazon region images from radar (JERS-1) with a two-year difference. (a): Radar, JERS-1 (10/10/95); (b): Radar, JERS-1 (08/13/96); (c): the registration result of (a) and (b). The estimated registration parameters are: Scale: [0.98179], Angle: [2.173], Tx: [21.4974], Ty: [-16.2047].

Figure 16(a), (b): Two Landsat images with a four-year difference and an associated rotation. (a): agricultural image from TM band 5 (09/09/90); (b): agricultural image from TM band 5 (07/18/94).

Figure 16(c): Registration of Landsat images with a four-year difference and an associated rotation; the registration result of (a) and (b). The estimated registration parameters are: Scale: [0.99481], Angle: [0.12507], Tx: [87.8858], Ty: [-78.9869].

Figure 17(a), (b): Two Amazon region images taken with a two-year difference. (a): forest, TM band 5 (06/07/92); (b): forest, TM band 5 (07/15/94).

Figure 17(c): Registration of Amazon region images with deforestation; the registration result of (a) and (b). The deforestation between the two images can be observed from the registration result. The estimated registration parameters are: Scale: [0.99183], Angle: [0.98362], Tx: [33.5682], Ty: [-173.4304].

Figure 18: Registration of images with high temporal changes.
(a), (b): Optical images of trailer parking with high temporal changes; (c): the registration result of (a) and (b). The estimated registration parameters are: Scale: [0.99895], Angle: [-0.26995], Tx: [1.4411], Ty: [-3.1712]. In the result image it is possible to see the temporal changes in the overlapping area.

Figure 19(a), (b): Two highly JPEG-compressed optical images with compression distortion and a large rotation.

Figure 19(c): Registration of images with compression distortions; the registration result of (a) and (b). The estimated registration parameters are: Scale: [0.99725], Angle: [-19.8958], Tx: [144.9035], Ty: [75.3361].

Figure 20(a), (b): Two retina images taken by a Canon CR6-45NM retinal camera with large translations and rotations.

Figure 20(c): Registration of retina images with associated rotation and translation; the registration result of (a) and (b). The estimated registration parameters are: Scale: [1.0274], Angle: [1.3509], Tx: [-150.4187], Ty: [119.9696].

Chapter 5 Conclusions and Further Works

Image registration is an important operation in multimedia systems. We have presented a feature-based image registration method. Compared to conventional feature-based image registration methods, our method is robust because it guarantees that highly reliable feature points are selected and used in the registration process. We have successfully applied our method to images taken under different conditions.

To provide reliable local structure matching under various imaging conditions, we improve the local structure matching method to handle complex image variations. The proposed method can be applied to images with significant scene changes or with serious image distortions. However, since the invariant feature descriptor employed in our algorithm contains the relative distance, the proposed algorithm is limited to images of roughly the same scale. Removing this limitation is considered one of the further works.

The aligned global structure matching finally estimates the correspondences between the two images. The relatively large bounding box employed in the global structure matching makes the algorithm tolerant of some non-linear deformation of the image. However, it may also introduce some false matching pairs that we need to eliminate. To increase the accuracy of the feature point matching, we perform the global structure matching in two cues with different alignment references for consistency checking, and then employ cross-validation to eliminate the low-quality matching pairs.

The experimental results show that we establish applicable correspondences between images taken at different times, from highly different viewpoints, or by different sensors. We also successfully apply our method to images with scene changes such as object movements or deformations. Furthermore, the proposed algorithm can tolerate a certain amount of image distortion. However, the correspondence search results are acceptable for image mosaicking but not for image registration with pixel accuracy.

Our future work is headed in several directions. First, the domain of images considered will be generalized to multi-scale images. Second, the corresponding features and transformations will be computed iteratively to increase the accuracy. Third, the manner of displaying the qualitative results will be improved so that it is more suitable for visual inspection; methods such as checkerboard images [16] and difference images will be considered.

Bibliography

[1] L.G. Brown.
A Survey of Image Registration Techniques. ACM Computing Surveys, vol. 24, no. 4, pp. 325-376, 1992.

[2] B. Zitova and J. Flusser. Image Registration Methods: A Survey. Image and Vision Computing, vol. 21, pp. 977-1000, 2003.

[3] X.D. Jiang and W.Y. Yau. Fingerprint Minutiae Matching Based on the Local and Global Structures. 15th International Conference on Pattern Recognition (ICPR'00), vol. 2, pp. 1038-1041, 2000.

[4] Q. Zheng and R. Chellappa. A Computational Vision Approach to Image Registration. IEEE Transactions on Image Processing, vol. 2, no. 3, pp. 311-326, July 1993.

[5] J.B. Antoine Maintz. An Overview of Medical Image Registration Methods. Imaging Science Department, Imaging Center, Utrecht, 2002.

[6] A. Goshtasby, G.C. Stockman, and C.V. Page. A Region-Based Approach with Subpixel Accuracy. IEEE Transactions on Geoscience and Remote Sensing, vol. GE-24, pp. 390-399, May 1986.

[7] A.D. Ventura, A. Rampini, and R. Schettini. Image Registration by Recognition of Corresponding Structures. IEEE Transactions on Geoscience and Remote Sensing, vol. 28, pp. 305-314, May 1990.

[8] J. Ton and A.K. Jain. Registering Landsat Images by Point Matching. IEEE Transactions on Geoscience and Remote Sensing, vol. 27, pp. 642-651, Sept. 1989.

[9] J.W. Hsieh, H.Y. Mark Liao, K.C. Fan, Ming-Tat Ko, and Y.P. Hung. Image Registration Using a New Edge-Based Approach. CVGIP: Computer Vision and Image Understanding, vol. 67, no. 2, pp. 112-130, Aug. 1997.

[10] D.G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2):91-110, November 2004.

[11] K. Mikolajczyk and C. Schmid. Scale and Affine Invariant Interest Point Detectors. International Journal of Computer Vision, 60(1):63-86, 2004.

[12] T. Kadir, A. Zisserman, and M. Brady. An Affine Invariant Salient Region Detector. In Proceedings of the Eighth European Conference on Computer Vision, 2004.

[13] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust Wide-Baseline Stereo from Maximally Stable Extremal Regions. Image and Vision Computing, 22(10):761-767, Sept. 2004.

[14] J. Feldmar, J. Declerck, G. Malandain, and N. Ayache. Extension of the ICP Algorithm to Nonrigid Intensity-Based Registration of 3D Volumes. Computer Vision and Image Understanding, 66(2):193-206, May 1997.

[15] C. Schmid, R. Mohr, and C. Bauckhage. Comparing and Evaluating Interest Points. In Proceedings of the IEEE International Conference on Computer Vision, pp. 230-235, 1998.

[16] C. Stewart, C.-L. Tsai, and B. Roysam. The Dual Bootstrap Iterative Closest Point Algorithm with Application to Retinal Image Registration. Technical Report RPI-CS-TR 02-9, Department of Computer Science, 2002.

[17] G. Yang, C.V. Stewart, S. Michal, and C.-L. Tsai. Registration of Challenging Image Pairs: Initialization, Estimation, and Decision. IEEE Transactions on Pattern Analysis and Machine Intelligence, submitted, 2006.

[18] Open Source Computer Vision Library Reference Manual, pp. 10-16, 2000.

[19] P.D. Kovesi. MATLAB Functions for Computer Vision and Image Analysis. School of Computer Science & Software Engineering, The University of Western Australia. Available from: .

[20] Raymond Thai. Fingerprint Image Enhancement and Minutiae Extraction. Technical Report, The University of Western Australia, 2003.

[21] D.V. Fedorov, C. Kenney, B.S. Manjunath, and L.M.G. Fonseca. Online Registration Demo, 2001-2002.
[22] L.M.G. Fonseca, G. Hewer, C. Kenney, and B.S. Manjunath. Registration and Fusion of Multispectral Images Using a New Control Point Assessment Method Derived from Optical Flow Ideas. Proc. SPIE, vol. 3717, pp. 104-111, April 1999, Orlando, FL.

[23] L.M.G. Fonseca and C. Kenney. Control Point Assessment for Image Registration. Proc. XII Brazilian Symposium on Computer Graphics and Image Processing, Campinas, Brazil, October 1999, pp. 125-132.

[24] D. Fedorov, L.M.G. Fonseca, C. Kenney, and B.S. Manjunath. Automatic Registration and Mosaicking System for Remotely Sensed Imagery. SPIE 9th International Symposium on Remote Sensing, Crete, Greece, Sept. 2002.