Learning Features For Identifying Dolphins

This idea is reinforced by Cesar and Costa (1997), who explain how multi-scale curvature energy diagrams can be easily obtained from curvegrams (Cesar, 1996) and used as a robust feature for the morphometric characterization of neural cells. That is, the curvature energy is an interesting global feature for shape characterization: it expresses the amount of energy necessary to transform a specific shape into its lowest energy state (a circle). The curvegram, which can be precisely obtained using image processing techniques, more specifically the Fourier transform and its inverse, provides a multi-scale representation of the curvature of digital contours. The same work also shows that, when the curvature energy is normalized with respect to a standard circle of unit perimeter, this feature expresses complex shapes efficiently and is invariant to rotation, translation, and scale. Besides, it is robust to noise and to other artefacts introduced during image acquisition.

More recently, Gope et al. (2005) introduced an affine curve matching method that uses the area of mismatch between a query curve and a database curve. This area is obtained by optimally aligning the curves based on the minimum affine distance involving their distinguishing points.

From all the works surveyed, we could see that the major difficulty encountered by researchers is the comparison between several individuals photographed in different situations, including different times, weather conditions, and even different photographers. These and other issues introduce a certain subjectivity into the picture-to-picture analysis needed for a correct classification of individuals. Photo analysis is based mainly on the patterns presented by the dorsal fin of the animals.
That is, identification has been carried out mostly by finding remarkable features on the dorsal fin and visually comparing them with other pictures, following a union of affine features. The effort and time spent in this process are directly proportional to the number of pictures the researchers have collected. Moreover, marine researchers carry out this visual analysis by hand, using rulers and other relative forms of measurement. This may classify individuals with small differences as the same animal, or generate new instances for the same individual. This problem may affect population estimates and other studies, leading to imprecise results. So, a complete system that goes from picture acquisition to final identification, even with a little help from the user, is of capital importance to marine mammal researchers.

3. The proposed system architecture

Fig. 1 shows the basic architecture of the proposed system. It is basically divided into the 7 modules, or processing phases, shown. Each of these phases is briefly described next.

Fig. 1. Architecture of the proposed system (block labels recovered from the figure: Obtain, Parameter, Pre processing, Post processing, Normalize, Extract features, Classify).

Mobile Robots, Towards New Applications

3.1. Acquiring module

The acquired image is digitized and stored in the computer memory (from now on we refer to it as the original image). An image acquired on-line from a digital camera or from a video stream can serve as input to the system. Alternatively, for previously acquired images, a dialog box for opening an image file can be activated from the main window of the system, shown in Fig. 2.

Fig. 2. Main window of the proposed system.

Fig. 3. Acquired image visualized in lower resolution in the system main screen.

3.2. Visualization and delimitation

In this phase, a human operator is responsible for the adequate presentation of the region of interest in the original image.
This image is presumably stored in high resolution, with a size of at least 1500 x 1500 pixels. The image is initially presented in low resolution in the re-visualization tool, only to make control by the operator faster. All processing is done on the original image, without losing resolution or precision. As a result of this phase, a sub-image containing dolphins is manually selected as the region of interest and mapped onto the original image. This sub-image may appear blurred to the user, so it is desirable to apply an auto-contrast technique to correct this (Jain, 1989).

3.3. Preprocessing

In this module, the techniques for preprocessing the image are effectively applied. Among these techniques, the Karhunen-Loève transform (KLT) (Jain, 1989) and auto-contrast are used. The KLT is applied to decorrelate the processing variables by mapping each pixel onto a new basis, which brings the image to a new color space. The auto-contrast technique is applied to obtain a good distribution of the gray levels, or of the RGB components in the case of color images, thus adjusting the image for the coming phases.

3.4. Auto-segmentation

An unsupervised methodology based on a competitive network is applied to segment the image according to its texture patterns. This generates another image with two labels, one for the background and another for the foreground (objects). To label the image, we use the average of the neighboring values of a selected pixel as the image attributes fed to the competitive net. The mean is calculated for each of the R, G and B components, thus giving three attributes for each pixel.

3.5. Regularization

In this phase, several algorithms are applied to regularize the image provided by the previous phase. This regular, standard image is important for the next phase, mainly for improving the feature extraction process.
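The auto-contrast step of Section 3.3 and the pixel attributes of Section 3.4 can be illustrated with a minimal Python sketch. The text does not state which auto-contrast variant or neighborhood size is used, so the linear min-max stretch, the 3 x 3 window (`radius=1`) and the function names `auto_contrast` and `neighborhood_means` are assumptions made for illustration only.

```python
def auto_contrast(channel):
    """Linearly stretch one channel (list of rows of ints) to the range 0..255.

    Assumption: a plain min-max stretch; the source does not name the variant.
    """
    lo = min(min(row) for row in channel)
    hi = max(max(row) for row in channel)
    if hi == lo:
        # A flat channel cannot be stretched; return an unchanged copy.
        return [row[:] for row in channel]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in channel]


def neighborhood_means(image, row, col, radius=1):
    """Mean R, G and B over the window centred on (row, col).

    `image` is a list of rows of (R, G, B) tuples. The window is clipped at
    the image borders. The window size is an assumption.
    """
    height, width = len(image), len(image[0])
    sums = [0.0, 0.0, 0.0]
    count = 0
    for r in range(max(0, row - radius), min(height, row + radius + 1)):
        for c in range(max(0, col - radius), min(width, col + radius + 1)):
            for k in range(3):
                sums[k] += image[r][c][k]
            count += 1
    return tuple(s / count for s in sums)
```

The three means returned for a pixel would then form the attribute vector presented to the competitive network for that pixel.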
A clear case of the real need for this regularization is the presence of dolphins in different orientations from one picture to another. In the extreme case, dolphins may be pointing in opposite directions (one to the left and the other to the right of the image). For this regularization, the three end points of the dorsal fin, essential to the feature extraction process, are first chosen manually by the operator. Note that it is difficult to develop an automated procedure for fin extraction, including segmentation, detection and selection of the end points. We made some trials and ended up with the manual process, as this is not the main research goal.

In order to get the approximate alignment of the dolphin, the system presents a synthetic representation of it (that is, a 3D graphical dolphin) whose pointing direction can be controlled with keyboard and mouse. The user employs this tool to indicate the approximate actual orientation (direction and sense) of the dolphin in the original image. In practice, the user interactively indicates the Euler angles (roll, pitch, yaw) of the approximate orientation. These angles are the basis for defining the coefficients of a homogeneous transform (a 3D rotation) that is applied to the 2D image so that it approximately conforms to the desired orientation of the model. The image is then ready for feature extraction.

3.6. Morphological operators

Mathematical morphology tools are used for extracting boundaries and skeletons, and to help improve other algorithms. Mathematical morphology is useful for improving the extraction of shape features such as the fin, besides being the basis for algorithms for curvature analysis and peak detection, among others.

3.7. Feature extraction

In this module, some features representing the dorsal fin shape are extracted, such as height and width, peak location, and the number and position of holes and/or boundary cracks, among others.
These features are presented and discussed in more detail in Section 4 (Feature extraction).

3.8. Identification

The extracted features are then presented to the classifier, which reports whether or not the dolphin has been correctly identified. As a new tool, we added a classification methodology with a self-growing mechanism that can even find new instances of dolphins never seen previously.

Fig. 4. Dolphin with substantial rotation around the Y axis.

Fig. 5. Proposed sequence for pre-processing images (block labels recovered from the figure: KLT, Original Image, Auto contrast, Auto segmentation, Cut, Labeling, Fin Boundary, Extract Peak, Sequencing Points, Detecting Holes, Obtain Curves, Skeleton, Thresholding, Reflexion, Extract Parametric, Normalize).

4. Feature extraction

Water reflection makes texture features vary substantially from one picture to another. That is, the intensity is not uniform along the fin, which even changes local characteristics extracted at different positions on the fin. In contrast, the fin shape and the number and position of holes/cracks are less sensitive to noise and intensity variations. Of course, there are also restrictions. Let us consider a left-handed 3D system with its origin at the centre of projection of the camera (that is, with the positive Z axis pointing into the picture). Fig. 4 shows an animal that presents a relatively large rotation about the Y axis. In this case, even if the fin appears in the picture with sufficient size and has several holes, most of them may not be detected, so the extraction of curvature features may be strongly affected. It is therefore necessary to apply a series of processing steps that preserve and enhance these features as much as possible. The processing sequence proposed for this is shown in Fig. 5.

4.1. Preprocessing

As introduced previously, the image is initially processed with the KLT to enhance the desired features.
The result of the KLT is an image with two levels of intensity; we adopt 0 and 50. Using these values, the resulting image is further binarized (to values 0 or 1). The image is then cut (pixels are set to 0) in the region below the line defined by the two lower points of the fin entered by the user. Note that we assume the y axis of the camera is vertically aligned, so the region of interest of the fin must lie above that line, and the points below it are considered background. The image is reflected (in 3D) around the y axis, that is, only the x coordinates are changed. This is already a first step towards regularization for feature extraction.

The subsequent image labelling process is fundamental for identifying agglomerations of points. Vector quantization is used to split the image into classes. Because of the noise still present after the initial phases, several objects commonly exist in the image, of which the fin is the object of interest. A morphological filter is applied to select the biggest object (presumably the fin) for the next processing phase.

Fig. 6. Fin peak extraction from the image.

At this point, a binary image of the fin (rotation not yet corrected) is available. Border and skeleton images are then extracted using image processing algorithms. These images are used to extract the peak with the algorithm depicted in Fig. 6. First, it takes two points on the top portion of the skeleton, a few pixels apart from each other. These points define a line segment, which is extended until it intersects the fin boundary. The intersection point is taken as the peak. Besides being simple, this proved to be a very good approximation to the peak, as the upper part of the skeleton naturally points towards it.

4.2. Sequencing the fin points

In this work, the features used for identification can be understood as representations of the fin shape.
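The peak extraction step of Section 4.1 (two skeleton points defining a line that is extended to the fin boundary) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fin is given as a binary mask indexed as `mask[row][col]`, the two skeleton points are supplied by the caller, the ray is sampled in half-pixel steps, and the last sampled pixel still inside the fin is taken as the intersection with the boundary. The name `extract_peak` is hypothetical.

```python
def extract_peak(mask, lower, upper, step=0.5):
    """Extend the ray lower -> upper until it leaves the fin.

    mask         : list of rows, 1 for fin pixels and 0 for background
    lower, upper : (row, col) skeleton points, `upper` nearer the fin tip
    Returns the last (row, col) on the ray that is still inside the fin.
    """
    d_row = upper[0] - lower[0]
    d_col = upper[1] - lower[1]
    norm = (d_row ** 2 + d_col ** 2) ** 0.5
    if norm == 0:
        raise ValueError("the two skeleton points must be distinct")
    d_row, d_col = d_row / norm, d_col / norm  # unit direction of the ray

    height, width = len(mask), len(mask[0])
    peak = upper
    t = 0.0
    while True:
        r = round(upper[0] + t * d_row)
        c = round(upper[1] + t * d_col)
        outside = not (0 <= r < height and 0 <= c < width)
        if outside or mask[r][c] == 0:
            return peak  # previous sample was the last one inside the fin
        peak = (r, c)
        t += step
```

A sub-pixel step is used so that diagonal rays do not skip over thin parts of the fin; a production version would more likely intersect the ray with the extracted border image directly.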
So, before computing features, we first extract and order the pixels on the fin boundary. Border pixels in a binary image can be easily extracted using mathematical morphology. The boundary is given by the border points following a given sequence, say counter-clockwise, so we just have to join the points by finding this sequence. The algorithm proceeds from the left to the right of the image. Recall that, during the user interaction phase, the approximate initial ($p_1$), peak ($p$), and final ($p_2$) points are given manually. Based on the orientation given by these points, the image is reflected, if necessary, so that the head of the dolphin points to the right of the image. A search on the border image is then performed from $p_2$ to $p_1$ to find the boundary, that is, in increasing order of x. If a substantial change is detected in the current y value relative to the previous one, the continuity in y is broken. The search is then inverted, taking the next x values for each fixed y. When the opposite occurs, the current hole has finished and the algorithm returns to the initial search order. In this way, most boundary points are correctly found, and the result is a sequence of point coordinates (x and y) stored in an array (a 2D point structure). Results are good, as can be seen in Fig. 7, which shows the border of a fin and the sequence of points obtained and plotted.

Fig. 7. Extracting fin boundary points from border pixels of the image (borders are on the left, boundaries on the right of the figure). The method may lose some points.

4.3. Polynomial representation of the fin

For curvature feature extraction, we propose a method that differs from those proposed by Araabi (2000), Hillman (1998), and Gope (2005), cited previously. Our method generates two third-degree polynomials, one for each side of the fin. The junction of these two curves plus the bottom line forms the complete fin shape.
This method has proven to be robust, with third-degree curves approximating the curvature well. Further, we can use these polynomials to detect possible holes in the fin more precisely. Fig. 8 describes the parameterization of the curves. Note that the two curves coincide at $\lambda = 0$; this point is exactly the fin peak. The parametric equations of the curves are expressed as:

$$\text{Curve 1:}\quad \begin{cases} x_1(\lambda) = a_{x_1}\lambda^3 + b_{x_1}\lambda^2 + c_{x_1}\lambda + d_{x_1} \\ y_1(\lambda) = a_{y_1}\lambda^3 + b_{y_1}\lambda^2 + c_{y_1}\lambda + d_{y_1} \end{cases}$$

$$\text{Curve 2:}\quad \begin{cases} x_2(\lambda) = a_{x_2}\lambda^3 + b_{x_2}\lambda^2 + c_{x_2}\lambda + d_{x_2} \\ y_2(\lambda) = a_{y_2}\lambda^3 + b_{y_2}\lambda^2 + c_{y_2}\lambda + d_{y_2} \end{cases}$$

$$\text{Restrictions:}\quad \begin{cases} x_1(0) = x_2(0) \;\Rightarrow\; d_{x_1} = d_{x_2} \\ y_1(0) = y_2(0) \;\Rightarrow\; d_{y_1} = d_{y_2} \end{cases}$$

In this way, one only needs to solve the above system using the boundary found previously in order to get the parametric coefficients in x and y. A sum of squared differences (SSD) approach is used here, with the final matrix equation simply given by $\Lambda C_x = X$, where $C_x$ is the vector of coefficients in x. An equivalent approach is also adopted for the second curve.

Fig. 8. Graphical model of the parametric curves used (axes x and y; curve 1 and curve 2 meet at $\lambda = 0$ and each ends at $\lambda = 1$).

4.4 Obtaining images with regularized (standard) dimensions

The camera image can be regarded as a mapping of the world scene onto a plane, that is, a 2D representation. We must also consider that the dolphin may be rotated in the world; we use Euler angles (rotations about x, y, and z are considered). Some important simplifications can be made. The first is to ignore the x rotation since, in practice, we noted that its maximum value is less than 10 degrees. Let us consider that the dolphin body (its length) lies along the $x_g$ axis, with its head pointing in the positive direction. The width of the animal lies along $z_g$, with its center at the origin $z_g = 0$. The height of the dolphin lies along the $y_g$ axis, with the fin peak pointing towards positive values of this axis.
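The cubic fit of Section 4.3 can be sketched as an ordinary least-squares problem. The following Python sketch is an illustration, not the authors' implementation: it assumes that the parameter is spread uniformly over the ordered boundary points of one fin side (0 at the peak, 1 at the base), and it enforces the restriction at the peak by fixing the constant coefficient `d` to the peak coordinate. The helper names `solve_linear`, `fit_cubic` and `fit_fin_side` are hypothetical.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_cubic(lams, values, d):
    """Least-squares a, b, c of v(lam) = a lam^3 + b lam^2 + c lam + d, d fixed."""
    rows = [[lam ** 3, lam ** 2, lam] for lam in lams]
    rhs = [v - d for v in values]
    # Normal equations: (L^T L) [a, b, c]^T = L^T (values - d).
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(3)]
    a, b, c = solve_linear(ata, atb)
    return a, b, c, d


def fit_fin_side(points):
    """Fit x(lam) and y(lam) to one fin side; points[0] must be the peak."""
    n = len(points)
    lams = [i / (n - 1) for i in range(n)]  # assumed uniform parameterisation
    peak_x, peak_y = points[0]
    coeffs_x = fit_cubic(lams, [p[0] for p in points], peak_x)
    coeffs_y = fit_cubic(lams, [p[1] for p in points], peak_y)
    return coeffs_x, coeffs_y
```

Calling `fit_fin_side` once per side, with both point lists starting at the same peak, yields two curves that share their constant terms, which is one simple way to satisfy the coincidence of the curves at the peak.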
The second simplification is that, considering the animal at some distance, the y coordinate of a point P rotated around the y axis and projected onto the image plane, say $P_{yr}$, will be the same as if the transformation had not happened ($P_y$). This assumption is easy to understand and verify by analysing the effect on an object close to and far from an observer. If the object is close, a rotation $\theta$ around y affects the y coordinate (the closer $\theta$ is to $\pi/2$, the bigger the change in y). But as the object gets farther from the observer, the displacement in y decreases. The perspective can then be neglected and the mapping treated as a rigid body transformation. Ideally, at infinity, the change in y would be zero. As the pictures in this work are taken at distances of at least 15 and up to 40 meters (30 on average), the change in y is small in comparison with the dolphin size and can be neglected. Note that a rotation and a projection are performed here.

Fig. 9. Mapping between 3D and 2D (top view of the fin, showing the dolphin coordinates and the world coordinates).

Given a picture in the XY plane with z = 0, the goal is to obtain an approximation for the coordinates of the real points of the fin in the world, mapped back to 2D. We can then consider that the fin is parallel to the observer plane, without rotation. Fig. 9 helps to understand the model. Given the Euler angles (rotation) and assuming the simplifications suggested above, equations can be easily derived for the mapping between a generic image and one in a regularized configuration. These angles are given interactively by the user when rotating the graphical dolphin shown in a window until it conforms to the configuration seen in the image. We remark that these transformations are applied to the parametric equations of the two polynomials that represent the fin.
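The second simplification (the projected y coordinate barely changes under a rotation about the vertical axis when the animal is far away) can be checked numerically with a simple pinhole model. The sketch below is an illustration under assumptions that are not taken from the text: an ideal pinhole camera looking along the depth axis, a fin point lying 0.5 m from the rotation axis, and a sign convention in which a positive angle moves that point towards the camera. The function names are hypothetical.

```python
import math


def projected_y(x, y, theta_deg, distance, focal=1.0):
    """Pinhole projection of the y coordinate after rotating about the vertical axis.

    (x, y)    : position of the fin point in the dolphin frame, in meters
    theta_deg : rotation about the vertical (y) axis, in degrees
    distance  : distance from the camera to the rotation axis, in meters
    """
    theta = math.radians(theta_deg)
    depth = distance - x * math.sin(theta)  # the rotation only changes the depth
    return focal * y / depth


def relative_y_change(x, y, theta_deg, distance):
    """Relative change of the projected y caused by the rotation."""
    before = projected_y(x, y, 0.0, distance)
    after = projected_y(x, y, theta_deg, distance)
    return abs(after - before) / abs(before)


if __name__ == "__main__":
    for d in (2.0, 15.0, 30.0, 40.0):
        change = relative_y_change(0.5, 0.3, 60.0, d)
        print(f"distance {d:5.1f} m -> relative change in y: {change:.4f}")
```

With these illustrative numbers, a 60 degree rotation changes the projected y by roughly 28% at 2 m but only by about 1.5% at 30 m, which is consistent with neglecting the effect at the working distances reported above.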
In this way, the equations representing the ideal fin (without any transformation, aligned to the x axis) are found. After these transformations, it is necessary to normalize the equations, since images of varied sizes can be given, as explained next.

4.5. Polynomials normalization

After all pre- and post-processing, a normalization is carried out on the data. This is of fundamental importance for extracting features that are already normalized, which makes analysis easier, since images of the same animal coming from the database may be very different. As stated above, the normalization is applied to the coefficients that describe the two parametric cubic curves. This normalization process is depicted in Fig. 10.

Fig. 10. Sketch of the algorithm for normalization of the polynomials (panels (a) to (f); steps: rotate, translate 1, translate 2, change axes, change scale).

4.5.1 Shape features (descriptors)

After polynomial normalization, shape features are extracted. One of the most important features is the fin format (shape), so a notion of "format" must be given. Two features are extracted to capture this notion: the curvature radius of each of the two curves and the indexes of discrepancy to a circumference. These indexes measure the similarity between one side of the fin (the curve) and an arc of circumference. These curvature features are based on determining the center of the arc of circumference that best fits the polynomial at each side of the fin. Supposing that the circumference center is at $(x_c, y_c)$ and that n points are chosen along the curve, the center is calculated so as to minimize the cost function $J(x_c, y_c)$ defined by:

$$J(x_c, y_c) = \sum_{i=1}^{n} \left( l_i^2 - l_{i-1}^2 \right)^2, \quad \text{where} \quad l_i^2 = (x_i - x_c)^2 + (y_i - y_c)^2 .$$
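A short expansion shows why this minimization leads to a linear problem. It is a sketch based only on the definition of the squared distance from the center $(x_c, y_c)$ to the point $(x_i, y_i)$, with the points assumed to be indexed $i = 0, \dots, n$:

$$l_i^2 - l_{i-1}^2 = (x_i^2 - x_{i-1}^2) + (y_i^2 - y_{i-1}^2) - 2(x_i - x_{i-1})\,x_c - 2(y_i - y_{i-1})\,y_c .$$

The quadratic terms $x_c^2$ and $y_c^2$ cancel, so each difference is linear in $(x_c, y_c)$. Requiring every difference to vanish gives one linear equation per pair of consecutive points:

$$2(x_{i-1} - x_i)\,x_c + 2(y_{i-1} - y_i)\,y_c = x_{i-1}^2 - x_i^2 + y_{i-1}^2 - y_i^2, \qquad i = 1, \dots, n .$$

With more than two such equations the system is overdetermined, which is why a least-squares solution is sought.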
By solving this equation, the following matrix equation can be found: $AX = B$. This can be expanded to:

$$\begin{bmatrix} 2(x_0 - x_1) & 2(y_0 - y_1) \\ 2(x_1 - x_2) & 2(y_1 - y_2) \\ \vdots & \vdots \\ 2(x_{n-1} - x_n) & 2(y_{n-1} - y_n) \end{bmatrix} \begin{bmatrix} x_c \\ y_c \end{bmatrix} = \begin{bmatrix} x_0^2 - x_1^2 + y_0^2 - y_1^2 \\ x_1^2 - x_2^2 + y_1^2 - y_2^2 \\ \vdots \\ x_{n-1}^2 - x_n^2 + y_{n-1}^2 - y_n^2 \end{bmatrix}$$

The equation is solved by applying an SSD approach. After obtaining the center $(x_c, y_c)$, the Euclidean distances from it to each of the n points of the considered fin side are calculated. The radius is given by the mean value of these distances. Fig. 11 shows two fins that differ in their radius sizes. This can even be observed visually, mainly for the left-side curves, where the radius of the fin on the right is smaller; the radius is thus a relevant feature.

Fig. 11. Two distinct fins that can be identified by the difference in the radius (see text for explanation).

The discrepancy index is given by the sum of squared differences between the mean radius and the distance to each of the n points considered. Note that for a fin side shaped exactly as an arc of circumference, this index would be zero. The two fins of Fig. 11 have indexes close to zero on the left side, despite having different radii. Note also that two fins with almost the same calculated radius may have different discrepancy indexes, as for the right sides of the fins in Fig. 11.

Feature               Value(s)
Peak X                0.0958149
Peak Y                0.835369
NumHoles 1            0.4
NumHoles 2            0.2
Location of holes 1   0.129883, 0.546875, 0, 0
Location of holes 2   0.640625, 0, 0, 0
Radius 1              0.357855
Radius 2              0.436765
Index 1               0.034204
Index 2               0.010989

Table 1. Features calculated for a given image of a dolphin.

4.5. The feature extraction process for identification

After all the processing presented in the previous sections, features for identification are extracted using the two polynomials.
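The curvature descriptors of Section 4.5.1 (center of the best-fitting circumference, mean radius and discrepancy index) can be sketched in Python as follows. This is a minimal illustration, assuming that the linear system is built from consecutive points and solved through its 2 x 2 normal equations; the function names `fit_circle_center` and `radius_and_index` are hypothetical.

```python
import math


def fit_circle_center(points):
    """Least-squares center (xc, yc) from equations on consecutive points.

    Each pair (x0, y0), (x1, y1) contributes the linear equation
    2(x0 - x1) xc + 2(y0 - y1) yc = x0^2 - x1^2 + y0^2 - y1^2.
    """
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        a = 2.0 * (x0 - x1)
        b = 2.0 * (y0 - y1)
        c = x0 * x0 - x1 * x1 + y0 * y0 - y1 * y1
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or too few to fit a circle")
    # Cramer's rule on the 2 x 2 normal equations.
    xc = (s_ac * s_bb - s_ab * s_bc) / det
    yc = (s_aa * s_bc - s_ab * s_ac) / det
    return xc, yc


def radius_and_index(points, center):
    """Mean distance to the center and sum of squared deviations from it."""
    xc, yc = center
    distances = [math.hypot(x - xc, y - yc) for x, y in points]
    radius = sum(distances) / len(distances)
    index = sum((d - radius) ** 2 for d in distances)
    return radius, index
```

In use, the points would be sampled along one normalized fin-side polynomial, and the resulting radius and index would supply one Radius and one Index entry of the feature vector for that side.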
Among all measures (values) that can be used as features, we have found 16 relevant features:
a) Two coordinates (x and y) of the peak;
b) Number of holes/cracks in each curve (from observations of the population, a number between 0 and 4 is considered), normalized (that is, multiplied by 2/10);
[...]