TABLE 25.1 Cumulatives under (25.11), giving single false match probabilities for various HD criteria.

    HD Criterion    Odds of False Match
    0.26            1 in 10^13
    0.27            1 in 10^12
    0.28            1 in 10^11
    0.29            1 in 13 billion
    0.30            1 in 1.5 billion
    0.31            1 in 185 million
    0.32            1 in 26 million
    0.33            1 in 4 million
    0.34            1 in 690,000
    0.35            1 in 133,000

[...] matches, would seem to be performing at a very impressive level, because it must confuse no more than 10% of all identical twin pairs (since about 1% of all persons in the general population have an identical twin). But even with its P_1 = 0.001, how good would it be for searching large databases? Using (25.13), we see that when the search database size has reached merely N = 200 unrelated faces, the probability of at least one false match among them is already 18%. When the search database is just N = 2,000 unrelated faces, the probability of at least one false match has reached 86%. Clearly, identification is vastly more demanding than one-to-one verification, and even for moderate database sizes, merely "good" verifiers are of no use as identifiers. Observing the approximation that P_N ≈ N·P_1 for small P_1 << 1/N << 1, when searching a database of size N, an identifier needs to be roughly N times better than a verifier to achieve comparable odds against making false matches.
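As a numerical check of this scaling argument, the quoted figures follow directly from the exact relation P_N = 1 − (1 − P_1)^N, which is presumably the content of (25.13) (the equation itself is not reproduced in this excerpt). A minimal sketch in Python:

```python
# Numerical check of the database-scaling argument, assuming (25.13) is the
# standard relation P_N = 1 - (1 - P_1)^N for N independent comparisons.

def prob_any_false_match(p1: float, n: int) -> float:
    """Probability of at least one false match among n unrelated entries."""
    return 1.0 - (1.0 - p1) ** n

p1 = 0.001  # the hypothetical verifier's single false match probability
for n in (200, 2000):
    print(f"N = {n:4d}: P_N = {prob_any_false_match(p1, n):.2f}")
# N =  200: P_N = 0.18   -- the 18% quoted in the text
# N = 2000: P_N = 0.86   -- the 86% quoted in the text
```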
The algorithms for iris recognition exploit the extremely rapid attenuation of the HD distribution tail created by binomial combinatorics to accommodate very large database searches without suffering false matches. The HD threshold is adaptive, to maintain P_N < 10^−6 regardless of how large the search database size N is. As Table 25.1 illustrates, this means that if the search database contains 1 million different iris patterns, it is only necessary for the HD match criterion to adjust downwards from 0.33 to 0.27 in order still to maintain a net false match probability of 10^−6 for the entire database.

25.7 DECISION ENVIRONMENT FOR IRIS RECOGNITION

The overall "decidability" of the task of recognizing persons by their iris patterns is revealed by comparing the HD distributions for same versus different irises. The left distribution in Fig. 25.9 shows the HDs computed between 7,070 different pairs of same-eye images taken at different times, under different conditions, and usually with different cameras; the right distribution gives the same 9.1 million comparisons among different eyes shown earlier.

[FIGURE 25.9 The decision environment for iris recognition under relatively unfavorable conditions, using images acquired at different distances and by different optical platforms. Histograms of Hamming distances for 2.3 million comparisons: "same" distribution with mean = 0.110, stnd. dev. = 0.065; "different" distribution with mean = 0.458, stnd. dev. = 0.0197; d′ = 7.3.]

To the degree that one can confidently decide whether an observed sample belongs to the left or the right distribution in Fig. 25.9, iris recognition can be successfully performed. Such a dual distribution representation of the decision problem may be called the "decision environment," because it reveals the extent to which the two cases (same versus different) are separable, and thus how reliably decisions can be made, since the overlap between the two distributions determines the error rates.

Whereas Fig. 25.9 shows the decision environment under less favorable conditions (images acquired by different camera platforms), Fig. 25.10 shows the decision environment under ideal (almost artificial) conditions. Subjects' eyes were imaged in a laboratory setting, always using the same camera with fixed zoom factor, at fixed distance, and with fixed illumination. Not surprisingly, more than half of such image comparisons achieved an HD of 0.00, and the average HD was a mere 0.019.

It is clear from comparing Figs. 25.9 and 25.10 that the "authentics" distribution for iris recognition (the similarity between different images of the same eye, as shown in the left-side distributions) depends very strongly upon the image acquisition conditions. However, the measured similarity for "imposters" (the right-side distribution) is almost completely independent of imaging factors. Instead, it just reflects the combinatorics of Bernoulli trials, as bits from independent binary sources (the phase codes for different irises) are compared.

[FIGURE 25.10 The decision environment for iris recognition under very favorable conditions, always using the same camera, distance, and lighting. Histograms of Hamming distances for 482,600 comparisons: "same" distribution with mean = 0.019, stnd. dev. = 0.039; "different" distribution with mean = 0.456, stnd. dev. = 0.020; d′ = 14.1.]

For two-choice decision tasks (e.g., same versus different), such as biometric decision making, the "decidability" index d′ is one measure of how well separated the two distributions are, since recognition errors would be caused by their overlap. If their two means are μ₁ and μ₂ and their two standard deviations are σ₁ and σ₂, then d′ is defined as

    d' = \frac{|\mu_1 - \mu_2|}{\sqrt{(\sigma_1^2 + \sigma_2^2)/2}}.    (25.14)

This measure of decidability is independent of how liberal or conservative the acceptance threshold is. Rather, by measuring separation, it reflects the degree to which any improvement in (say) the false match error rate must be paid for by a worsening of the failure-to-match error rate. The performance of any biometric technology can be calibrated by its d′ score, among other metrics. The measured decidability for iris recognition is d′ = 7.3 for the nonideal (crossed-platform) conditions presented in Fig. 25.9, and d′ = 14.1 for the ideal imaging conditions presented in Fig. 25.10.
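As a quick sanity check, (25.14) can be evaluated directly from the distribution statistics annotated in the two figures; the small discrepancy in the nonideal case comes from rounding of the figure's quoted statistics:

```python
import math

def decidability(mu1: float, sigma1: float, mu2: float, sigma2: float) -> float:
    """Decidability index d' of Eq. (25.14) for two score distributions."""
    return abs(mu1 - mu2) / math.sqrt((sigma1 ** 2 + sigma2 ** 2) / 2.0)

# Statistics read off Figs. 25.9 and 25.10:
print(decidability(0.110, 0.065, 0.458, 0.0197))  # nonideal imaging: ~7.2 (reported as 7.3)
print(decidability(0.019, 0.039, 0.456, 0.020))   # ideal imaging:    ~14.1
```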
Based on the left-side distributions in Figs. 25.9 and 25.10, one could calculate a table of probabilities of failure to match, as a function of HD match criterion, just as we did earlier in Table 25.1 for false match probabilities based on the right-side distribution. However, such estimates may not be stable, because the "authentics" distributions depend strongly on the quality of imaging (e.g., motion blur, focus, and noise) and would be different for different optical platforms. As illustrated earlier by the badly defocused image of Fig. 25.3, phase bits are still set randomly with binomial statistics in poor imaging, and so the right distribution is the stable asymptotic form both for well-imaged irises (Fig. 25.10) and for poorly imaged irises (Fig. 25.9). Imaging quality determines how much the same-iris distribution evolves and migrates leftward, away from the asymptotic different-iris distribution on the right. In any case, we note that for the 7,070 same-iris comparisons shown in Fig. 25.9, the highest HD was 0.327, which is below the smallest HD of 0.329 for the 9.1 million comparisons between different irises. Thus a decision criterion slightly below 0.33 can perfectly separate the dual distributions for the empirical datasets shown. At this criterion, using the cumulatives of (25.11) as tabulated in Table 25.1, the theoretical false match probability is 1 in 4 million.

Notwithstanding this diversity among iris patterns and their apparent singularity because of so many dimensions of random variation, their utility as a basis for automatic personal identification would depend upon their relative stability over time. There is a popular belief that the iris changes systematically with one's health or personality, and even that its detailed features reveal the states of individual organs ("iridology"); but such claims have been discredited (e.g., [17, 18]) as medical fraud. In any case, the recognition principle described here is intrinsically tolerant of a large proportion of the iris information being corrupted, say up to about a third, without significantly impairing the inference of personal identity by the simple test of statistical independence.

25.8 SPEED PERFORMANCE SUMMARY

On a low-cost 300 MHz reduced instruction set (RISC) processor, the execution times for the critical steps in iris recognition, using optimized integer code, are as shown in Table 25.2.

TABLE 25.2 Execution speeds of various stages in the iris recognition process on a 300 MHz RISC processor.

    Operation                               Time
    Assess image focus                      15 msec
    Scrub specular reflections              56 msec
    Localize eye and iris                   90 msec
    Fit pupillary boundary                  12 msec
    Detect and fit both eyelids             93 msec
    Remove lashes and contact lens edges    78 msec
    Demodulation and IrisCode creation      102 msec
    XOR comparison of two IrisCodes         10 μsec

The search engine can perform about 100,000 full comparisons between different irises per second on each such 300 MHz CPU, or 1 million in about a second on a 3 GHz server, because of the efficient implementation of the matching process in terms of elementary Boolean operators acting in parallel on the computed phase bit sequences.
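The matching core is easy to sketch in a few lines. The code below is a schematic illustration only: the packed-integer representation is an assumption for this sketch, and the masking of corrupted bits reflects the fault tolerance described above (the exact masked-HD formula is defined earlier in the chapter, not in this excerpt):

```python
# Schematic sketch of the Boolean matching step. Assumptions: each IrisCode
# is a phase-bit sequence packed into a Python int, accompanied by a mask
# marking bits judged valid (not corrupted by lids, lashes, or reflections).

def hamming_distance(code_a: int, mask_a: int, code_b: int, mask_b: int) -> float:
    valid = mask_a & mask_b               # bits usable in both codes
    disagree = (code_a ^ code_b) & valid  # XOR flags disagreeing phase bits
    n_valid = bin(valid).count("1")
    return bin(disagree).count("1") / n_valid if n_valid else 1.0

# Toy 8-bit example: the codes differ in two of the five mutually valid bits.
print(hamming_distance(0b10110100, 0b11111100, 0b10011100, 0b01111111))  # 0.4
```

On real hardware the same XOR-and-count runs word-parallel over machine words with a population-count instruction, which is what makes the 10 μsec per comparison in Table 25.2 plausible.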
If a database contained many millions of enrolled persons, then the inherent parallelism of the search process should be exploited for the sake of speed, by dividing the full database into smaller chunks to be searched in parallel. The confidence levels shown in Table 25.1 indicate how the decision threshold should be adapted for each of these parallel search engines, in order to ensure that no false matches are made despite several large-scale searches being conducted independently. The mathematics of the iris recognition algorithms, particularly the binomial-class distributions (25.4)–(25.11) that they generate when comparing different irises, make it clear that databases the size of an entire country's population could be searched in parallel to make confident and rapid identification decisions using parallel banks of inexpensive CPUs, if such iris code databases existed.

25.9 APPENDIX: 2D FOCUS ASSESSMENT AT THE VIDEO FRAME RATE

The acquisition of iris images in good focus is made difficult by the optical magnification requirements, the restrictions on illumination, and the target motion, distance, and size. All these factors act to limit the possible depth of field of the optics, because they create a requirement for a lower F-number to accommodate both the shorter integration time (to reduce motion blur) and the light dilution associated with long focal length. The iris is a 1 cm target within a roughly 3 cm wide field that one would like to acquire at a range of about 30 cm to 50 cm, and with a resolution of about 5 line pairs per mm. In a fixed-focus optical system, the acquisition of iris images almost always begins in poor focus. It is therefore desirable to compute focus scores for image frames very rapidly, either to control a moving lens element, to provide audible feedback to the subject for range adjustment, or to select which of several frames in a video sequence is in best focus.

Optical defocus can be fully described as a phenomenon of the 2D Fourier domain. An image represented as a 2D function of the real plane, I(x, y), has a 2D Fourier transform F(μ, ν) defined as

    F(\mu, \nu) = \frac{1}{(2\pi)^2} \iint I(x, y) \exp(-i(\mu x + \nu y))\, dx\, dy.    (25.15)

In the image domain, defocus is normally represented as convolution of a perfectly focused image by the 2D point-spread function of the defocused optics. This point-spread function is often modeled as a Gaussian whose space constant is proportional to the degree of defocus. Thus for perfectly focused optics, this optical point-spread function shrinks almost to a delta function, and convolution with a delta function has no effect on the image. Progressively defocused optics equates to convolving with ever wider point-spread functions.

If the convolving optical point-spread function causing defocus is an isotropic Gaussian whose width represents the degree of defocus, it is clear that defocus is equivalent to multiplying the 2D Fourier transform of a perfectly focused image by the 2D Fourier transform of the "defocusing" (convolving) Gaussian. This latter quantity is itself just another 2D Gaussian within the Fourier domain, and its spread constant there (σ) is the reciprocal of that of the image-domain convolving Gaussian that represented the optical point-spread function. Thus the 2D Fourier transform D_σ(μ, ν) of an image defocused by degree 1/σ can be related to F(μ, ν), the 2D Fourier transform of the corresponding perfectly focused image, by a simple model such as

    D_\sigma(\mu, \nu) = \exp\left(-\frac{\mu^2 + \nu^2}{\sigma^2}\right) F(\mu, \nu).    (25.16)

This expression reveals that the effect of defocus is to attenuate primarily the highest frequencies in the image, and that lower frequency components are affected correspondingly less, since the exponential term approaches unity as the frequencies (μ, ν) become small. (For simplicity, this analysis has assumed isotropic optics and isotropic blur, and the optical point-spread function has been described as a Gaussian just for illustration. But the analysis can readily be generalized to non-Gaussian and to anisotropic optical point-spread functions.)
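The model of (25.16) is straightforward to simulate: defocus by degree 1/σ is just Gaussian attenuation of the image spectrum. A toy sketch follows; the mapping of FFT bins to radian frequencies is an illustrative choice, not something specified in the text:

```python
# Toy simulation of Eq. (25.16): defocus as Gaussian attenuation of the
# image spectrum. Smaller sigma means stronger defocus (degree 1/sigma);
# the discrete frequency scaling below is an illustrative assumption.
import numpy as np

def defocus(img: np.ndarray, sigma: float) -> np.ndarray:
    F = np.fft.fft2(img)
    mu = 2 * np.pi * np.fft.fftfreq(img.shape[0])[:, None]  # row frequencies
    nu = 2 * np.pi * np.fft.fftfreq(img.shape[1])[None, :]  # column frequencies
    D = np.exp(-(mu**2 + nu**2) / sigma**2) * F             # Eq. (25.16)
    return np.fft.ifft2(D).real

sharp = np.random.rand(64, 64)       # stand-in for a focused image
blurred = defocus(sharp, sigma=0.5)  # heavily defocused version
print(sharp.std(), blurred.std())    # high-frequency power is suppressed
```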
This spectral analysis of defocus suggests that an effective way to estimate the quality of focus of a broadband image is simply to measure its total power in the 2D Fourier domain at higher spatial frequencies, since these are the most attenuated by defocus. One may also perform a kind of "contrast normalization" to make such a spectrally based focus measure independent of image content, by comparing the ratio of power in higher frequency bands to that in slightly lower frequency bands.

Such spectrally based measurements are facilitated by exploiting Parseval's theorem for conserved total power in the two domains:

    \iint |I(x, y)|^2\, dx\, dy = \iint |F(\mu, \nu)|^2\, d\mu\, d\nu.    (25.17)

Thus, highpass filtering an image, or bandpass filtering it within a ring of high spatial frequency (requiring only a 2D convolution in the image domain), and integrating the power contained in the result, is equivalent to computing the actual 2D Fourier transform of the image (a more costly operation) and performing the corresponding explicit measurement in the selected frequency band. Since the computational complexity of a fast Fourier transform on n × n data is O(n² log₂ n), some 3 million floating-point operations are avoided which would otherwise be needed to compute the spectral measurements explicitly. Instead, only about 6,000 integer multiplications per image are needed by this algorithm, and no floating-point operations. Computation of focus scores is based only on simple algebraic combinations of pixel values within local closed neighborhoods, repeated across the image. Pixels are combined according to the following (8 × 8) convolution kernel:

    -1 -1 -1 -1 -1 -1 -1 -1
    -1 -1 -1 -1 -1 -1 -1 -1
    -1 -1 +3 +3 +3 +3 -1 -1
    -1 -1 +3 +3 +3 +3 -1 -1
    -1 -1 +3 +3 +3 +3 -1 -1
    -1 -1 +3 +3 +3 +3 -1 -1
    -1 -1 -1 -1 -1 -1 -1 -1
    -1 -1 -1 -1 -1 -1 -1 -1

The simple weights mean that the sum of the central (4 × 4) pixels can just be tripled, and then the outer 48 pixels subtracted from this quantity; the result is squared and accumulated as per (25.17); and then the kernel moves to the next position in the image, selecting every 4th row and 4th column. This highly efficient discrete convolution has a simple 2D Fourier analysis.

The above kernel is equivalent to the superposition of two centered square box functions, one of size (8 × 8) and amplitude −1, and the other of size (4 × 4) and amplitude +4. (For the central region in which they overlap, the two therefore sum to +3.) The 2D Fourier transform of each of these square functions is a 2D "sinc" function, whose size parameters differ by a factor of two in each of the dimensions and whose amplitudes are equal but opposite, since the two component boxes have equal but opposite volumes. Thus the overall kernel has a 2D Fourier transform K(μ, ν) which is the difference of two differently sized 2D sinc functions:

    K(\mu, \nu) = \frac{\sin(\mu)\sin(\nu)}{\pi^2 \mu \nu} - \frac{\sin(2\mu)\sin(2\nu)}{4\pi^2 \mu \nu}.    (25.18)

The square of this function of μ and ν in the 2D Fourier domain is plotted in Fig. 25.11, revealing K²(μ, ν), the convolution kernel's 2D power spectrum. Clearly, low spatial frequencies (near the center of the power spectral plot in Fig. 25.11) are ignored, reflecting the fact that the pixel weights in the convolution kernel all sum to zero, while a bandpass ring of upper frequencies is selected by this filter. The total power in that band is the spectral measurement of focus. Finally, this summated 2D spectral power is passed through a compressive nonlinearity of the form f(x) = 100 · x²/(x² + c²), where the parameter c is the half-power point corresponding to a focus score of 50%, in order to generate a normalized focus score in the range of 0 to 100 for any image. The complete execution time of this 2D focus assessment algorithm, implemented in C using pointer arithmetic and operating on a (480 × 640) image, is 15 msec on a 300 MHz RISC processor.
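A compact sketch of the whole measure follows. It mirrors the description above (tripled central sum minus the outer 48 pixels, squared and accumulated at every 4th row and column, then compressed); the array-based implementation and the default value of c are illustrative assumptions, since the production version was hand-optimized C and c must be calibrated per optical platform:

```python
# Sketch of the 2D focus measure described above. Assumes an 8-bit grayscale
# image as a 2D numpy array; the constant c (half-power point giving a score
# of 50) is a placeholder and would need calibration for a real platform.
import numpy as np

def focus_score(img: np.ndarray, c: float = 1.0e9) -> float:
    total_power = 0.0
    for i in range(0, img.shape[0] - 7, 4):      # every 4th row
        for j in range(0, img.shape[1] - 7, 4):  # every 4th column
            block = img[i:i + 8, j:j + 8].astype(np.int64)
            central = block[2:6, 2:6].sum()
            # Kernel response: +3 on the central 4x4, -1 elsewhere, which
            # equals 4*(central sum) minus the sum of all 64 pixels.
            response = 4 * central - block.sum()
            total_power += float(response) ** 2  # accumulate power, per (25.17)
    # Compressive nonlinearity f(x) = 100*x^2/(x^2 + c^2), score in [0, 100].
    return 100.0 * total_power**2 / (total_power**2 + c**2)
```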
[FIGURE 25.11 The 2D Fourier power spectrum of the convolution kernel used for rapid focus assessment, plotted over spatial frequencies μ and ν.]

REFERENCES

[1] J. Daugman. High confidence visual recognition of persons by a test of statistical independence. IEEE Trans. Pattern Anal. Mach. Intell., 15(11):1148–1161, 1993.
[2] J. Daugman. Biometric Personal Identification System Based on Iris Analysis. U.S. Patent No. 5,291,560, US Government Printing Office, Washington, DC, 1994.
[3] J. Daugman. Statistical richness of visual phase information: update on recognizing persons by their iris patterns. Int. J. Comput. Vis., 45(1):25–38, 2001.
[4] Y. Adini, Y. Moses, and S. Ullman. Face recognition: the problem of compensating for changes in illumination direction. IEEE Trans. Pattern Anal. Mach. Intell., 19(7):721–732, 1997.
[5] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman. Eigenfaces vs. Fisherfaces: recognition using class-specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell., 19(7):711–720, 1997.
[6] A. Pentland and T. Choudhury. Face recognition for smart environments. Computer, 33(2):50–55, 2000.
[7] P. J. Phillips, A. Martin, C. L. Wilson, and M. Przybocki. An introduction to evaluating biometric systems. Computer, 33(2):56–63, 2000.
[8] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss. The FERET evaluation methodology for face-recognition algorithms. IEEE Trans. Pattern Anal. Mach. Intell., 22(10):1090–1104, 2000.
[9] P. Kronfeld. Gross anatomy and embryology of the eye. In: H. Davson, editor, The Eye. Academic Press, London, 1962.
[10] M. R. Chedekel. Photophysics and photochemistry of melanin. In: Melanin: Its Role in Human Photoprotection, 11–23. Valdenmar, Overland Park, KS, 1995.
[11] J. Daugman. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. J. Opt. Soc. Am. A, 2(7):1160–1169, 1985.
[12] J. Daugman. Complete discrete 2D Gabor transforms by neural networks for image analysis and compression. IEEE Trans. Acoust. Speech Signal Process., 36(7):1169–1179, 1988.
[13] J. Daugman and C. Downing. Demodulation, predictive coding, and spatial vision. J. Opt. Soc. Am. A, 12(4):641–660, 1995.
[14] R. Viveros, K. Balasubramanian, and N. Balakrishnan. Binomial and negative binomial analogues under correlated Bernoulli trials. Am. Stat., 48(3):243–247, 1994.
[15] T. Cover and J. Thomas. Elements of Information Theory. Wiley, New York, 1991.
[16] J. Daugman and C. Downing. Epigenetic randomness, complexity, and singularity of human iris patterns. Proc. R. Soc. Lond. B, Biol. Sci., 268:1737–1740, 2001.
[17] L. Berggren. Iridology: a critical review. Acta Ophthalmol., 63(1):1–8, 1985.
[18] A. Simon, D. M. Worthen, and J. A. Mitas. An evaluation of iridology. J. Am. Med. Assoc., 242:1385–1387, 1979.

CHAPTER 26
Computed Tomography
R. M. Leahy (University of Southern California), R. Clackdoyle (CNRS et Université Jean Monnet), and Frédéric Noo (University of Utah)

26.1 INTRODUCTION

The term tomography refers to the general class of devices and procedures for producing two-dimensional (2D) cross-sectional images of a three-dimensional (3D) object. Tomographic systems make it possible to image the internal structure of objects in a noninvasive and nondestructive manner. By far the best known application is the computer assisted tomography (CAT, or simply CT) scanner for X-ray imaging of the human body.
Other medical imaging devices, including positron emission tomography (PET), single photon emission computed tomography (SPECT), and magnetic resonance imaging (MRI) systems, also make use of tomographic principles. Outside of the biomedical realm, tomography is used in diverse applications such as microscopy, nondestructive testing, radar imaging, geophysical imaging, and radio astronomy.

We will restrict our attention here to image reconstruction methods for X-ray CT, PET, and SPECT. In all three modalities, the data can be modeled as a collection of line integrals of the unknown image. Many of the methods described here can also be applied to other tomographic problems. We describe 2D image reconstruction from parallel and fan-beam projections, and 3D reconstruction from sets of 2D projections. Algorithms derived from the analytic relationships between functions and their line integrals, the so-called "direct methods," are described in Sections 26.3–26.5. In Section 26.6 we describe the class of "iterative methods" that are based on a finite-dimensional discretization of the problem. We will include key results and algorithms for a range of imaging geometries, including systems currently in development. References to the appropriate sources for a complete development are also included. Our objective is to convey the wide range of methods available for reconstruction from projections and to highlight some recent developments in what remains a highly active area of research.

[...]