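CHAPTER 2. ORTHOGONAL TRANSFORMS

2.2. TRANSFORM EFFICIENCY AND CODING PERFORMANCE

For completely decorrelated spectral coefficients, $\eta_c = 1$.

The second parameter, $\eta_E$, measures the energy compaction property of the transform. Define $J'_L$ as the expected value of the summed squared error $J_L$ of Eq. (2.20):

  $J'_L = E\{J_L\}$

This $J'_L$ has also been called the basis restriction error by Jain (1989). The compaction efficiency is then

  $\eta_E = 1 - \dfrac{J'_L}{J'_0} = \dfrac{\sum_{i=0}^{L-1} \sigma_i^2}{\sum_{i=0}^{N-1} \sigma_i^2}$

Thus $\eta_E$ is the fraction of the total energy in the first $L$ components of $\theta$, where the $\{\sigma_i^2\}$ are indexed according to decreasing value.

The unitary transformation that makes $\eta_c = 1$ and minimizes $J'_L$ is the Karhunen-Loeve transform (Karhunen, 1947; Hotelling, 1933). Our derivation for real signals and transforms follows.

Consider a unitary transformation $\Phi$ such that

  $f = \sum_{r=0}^{N-1} \theta_r \Phi_r$

The approximation $\hat{f}_L$ and approximation error $e_L$ are

  $\hat{f}_L = \sum_{r=0}^{L-1} \theta_r \Phi_r, \qquad e_L = f - \hat{f}_L = \sum_{r=L}^{N-1} \theta_r \Phi_r$

By orthonormality, it easily follows that

  $J_L = e_L^T e_L = \sum_{r=L}^{N-1} \theta_r^2$

From Eq. (2.34),

  $\theta_r = \Phi_r^T f$

Therefore,

  $E\{\theta_r^2\} = \Phi_r^T E\{f f^T\} \Phi_r = \Phi_r^T R_f \Phi_r$

so that the error measure becomes

  $J'_L = \sum_{r=L}^{N-1} \Phi_r^T R_f \Phi_r$

To obtain the optimum transform, we want to find the $\Phi_r$ that minimizes $J'_L$ for a given $L$, subject to the orthonormality constraint $\Phi_r^T \Phi_s = \delta_{r-s}$. Using Lagrangian multipliers, we minimize

  $J = \sum_{r=L}^{N-1} \left[ \Phi_r^T R_f \Phi_r - \lambda_r \left( \Phi_r^T \Phi_r - 1 \right) \right]$

Each term in the sum is of the form

  $x^T R x - \lambda \left( x^T x - 1 \right)$

Taking the gradient of this with respect to $x$ (Prob. 2.6) and setting it to zero,

  $2 R x - 2 \lambda x = 0$

or

  $R x = \lambda x$

(Footnote 4: The gradient is a vector defined as $\nabla_x = \left[ \partial/\partial x_0, \ldots, \partial/\partial x_{N-1} \right]^T$.)

Doing this for each term in Eq. (2.74) gives

  $R_f \Phi_r = \lambda_r \Phi_r$

which implies

  $R_f \Phi^T = \Phi^T \Lambda$

where

  $\Lambda = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{N-1})$

(The reason for the transpose is that we had defined $\Phi_r$ as the $r$th column of $\Phi^T$.) Hence $\Phi_r$ is an eigenvector of the signal covariance matrix $R_f$, and $\lambda_r$, the associated eigenvalue, is a root of the characteristic polynomial, $\det(\lambda I - R_f)$. Since $R_f$ is a real, symmetric matrix, all $\{\lambda_i\}$ are real, distinct, and nonnegative.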
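The eigenvector result above can be checked by hand in the smallest nontrivial case. The following is a minimal sketch, assuming a unit-variance AR(1) covariance with N = 2, for which the eigenvectors are known in closed form. The function names and the convention that the rows of `phi` hold the basis vectors (so that theta = Phi f) are illustrative assumptions, not notation taken from the text.

```python
import math


def klt_2x2(rho):
    """Closed-form KLT for the covariance R_f = [[1, rho], [rho, 1]].

    Returns (phi, eigenvalues). The rows of phi are the orthonormal
    eigenvectors of R_f (assumed layout: theta = phi @ f).
    """
    s = 1.0 / math.sqrt(2.0)
    phi = [[s, s], [s, -s]]
    eigenvalues = [1.0 + rho, 1.0 - rho]
    return phi, eigenvalues


def coefficient_covariance(phi, r_f):
    """Return R_theta = Phi R_f Phi^T for real matrices given as nested lists."""
    n = len(phi)
    tmp = [[sum(phi[i][k] * r_f[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    return [[sum(tmp[i][k] * phi[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def compaction_efficiency(variances, num_kept):
    """Fraction of total energy in the num_kept largest coefficient variances."""
    ordered = sorted(variances, reverse=True)
    return sum(ordered[:num_kept]) / sum(ordered)


rho = 0.9  # hypothetical correlation coefficient
r_f = [[1.0, rho], [rho, 1.0]]
phi, lam = klt_2x2(rho)
r_theta = coefficient_covariance(phi, r_f)
eta_e = compaction_efficiency([r_theta[0][0], r_theta[1][1]], 1)

print("R_theta =", r_theta)   # off-diagonal terms vanish up to rounding
print("eta_E(L=1) =", eta_e)  # equals (1 + rho) / 2 for rho >= 0
```

In this two-point case the diagonal of `r_theta` reproduces the eigenvalues 1 + rho and 1 - rho, and keeping only the first coefficient retains the fraction (1 + rho)/2 of the signal energy. For larger N the eigenvectors would normally be obtained numerically rather than in closed form.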
The value of the minimized $J'_L$ is then

  $J'_L = \sum_{r=L}^{N-1} \lambda_r$

The covariance matrix for the spectral coefficient vector is diagonal, as can be seen from

  $R_\theta = E\{\theta \theta^T\} = \Phi R_f \Phi^T = \Lambda$

Thus $\Phi$ is the unitary matrix that does the following:
(1) generates a diagonal $R_\theta$ and thus completely decorrelates the spectral coefficients, resulting in $\eta_c = 1$;
(2) repacks the total signal energy among the first $L$ coefficients, maximizing $\eta_E$.

It should be noted, however, that while many matrices can decorrelate the input signal, the KLT both decorrelates the input perfectly and optimizes the repacking of signal energy. Furthermore, it is unique. The difficulty with this transformation is that it is input signal specific, i.e., the matrix $\Phi^T$ consists of the eigenvectors of the input covariance matrix $R_f$. It does provide a theoretical limit against which signal-independent transforms (DFT, DCT, etc.) can be compared. In fact, it is well known that for an AR(1) signal source, Eq. (2.61), with $\rho$ large, on the order of 0.9, the DCT performance is very close to that of the KLT. A frequently quoted result for the AR(1) signal and $N$ even (Ray and Driver, 1970) is

  [equation not recovered]

where $\{\alpha_k\}$ are the positive roots of

  [equation not recovered]

This result simply underscores the difficulty in computing the KLT even when applied to the simplest, nontrivial signal model. In the next section, we describe other fixed transforms and compare them with the KLT. (See also Prob. 2.19.)

For $\rho = 0.91$ in the AR(1) model and $N = 8$, Clarke (1985) has calculated the packing and decorrelation efficiencies of the KLT and the DCT. The compaction efficiency $\eta_E$, in percent, is:

  L    KLT     DCT
  1    79.5    79.3
  2    91.1    90.9
  3    94.8    94.8
  4    96.7    96.7
  5    97.9    97.9
  6    98.7    98.7
  7    99.4    99.4
  8    100     100

These numbers speak for themselves. Also for this example, $\eta_c = 0.985$ for the DCT compared with 1.0 for the KLT.

2.2.2 Comparative Performance Measures

The efficiency measures $\eta_c$, $\eta_E$ in Section 2.2.1 provide the bases for comparing unitary transforms against each other.
We need, however, a performance measure that ranges not only over the class of transforms, but also over different coding techniques. The measure introduced here serves that purpose.

In all coding techniques, whether pulse code modulation (PCM), differential pulse code modulation (DPCM), transform coding (TC), or subband coding (SBC), the basic performance measure is the reconstruction error (or distortion) at a specified information bit rate for storage or transmission. We take the simplest coding scheme, PCM, as the basis for all comparisons and compare all others to it.

With respect to Fig. 2.3, we see that PCM can be regarded as a special case of TC wherein the transformation matrix $\Phi$ is the identity matrix $I$, in which case we have simply $\theta = f$. The reconstruction error is $\tilde{f}$ as defined by Eq. (2.38), and the mean square reconstruction error is $\sigma_\epsilon^2$. The TC performance measure compares $\sigma_\epsilon^2$ for TC to that for PCM. This measure is called the gain of transform coding over PCM and is defined (Jayant and Noll, 1984) as

  $G_{TC} = \dfrac{\sigma_\epsilon^2(\mathrm{PCM})}{\sigma_\epsilon^2(\mathrm{TC})}$

In the next chapter on subband coding, we will similarly define

  $G_{SBC} = \dfrac{\sigma_\epsilon^2(\mathrm{PCM})}{\sigma_\epsilon^2(\mathrm{SBC})}$

In Eq. (2.39) we asserted that for a unitary transform, the mean square reconstruction error equals the mean square quantization error. The proof is easy. Since

  $\tilde{f} = \Phi^T \tilde{\theta}$

then

  $\tilde{f}^T \tilde{f} = \tilde{\theta}^T \Phi \Phi^T \tilde{\theta} = \tilde{\theta}^T \tilde{\theta}$

where $\tilde{\theta}$ is the quantization error vector. The average mean square (m.s.) error (or distortion) is

  $\sigma_\epsilon^2 = \dfrac{1}{N} E\{\tilde{f}^T \tilde{f}\} = \dfrac{1}{N} E\{\tilde{\theta}^T \tilde{\theta}\} = \dfrac{1}{N} \sum_{k=0}^{N-1} \sigma_{q_k}^2$

where $\sigma_{q_k}^2$ is the variance of the quantization error in the $k$th spectral coefficient, as depicted in Fig. 2.6.

Suppose that $R_k$ bits are allocated to quantizer $Q_k$. Then we can choose the quantizer to minimize $\sigma_{q_k}^2$ for this value of $R_k$ and the given probability density function for $\theta_k$. This minimum mean square error quantizer is called the Lloyd-Max quantizer (Lloyd, 1957; Max, 1960). It minimizes separately each $\sigma_{q_k}^2$, and hence the sum $\sum_k \sigma_{q_k}^2$. The structure of Fig.
2.6 suggests that the quantizer can be thought of as an estimator, particularly so since a mean square error is being minimized. For the optimal quantizer it can be shown that the quantization error is unbiased, and that the error is orthogonal to the quantizer output, just as in the case of the optimal linear estimator (Prob. 2.9).

Figure 2.6: The coefficient quantization error.

The resulting mean square error, or distortion, depends on the spectral coefficient variance $\sigma_k^2$, the pdf, the quantizer (in this case, Lloyd-Max), and the number of bits $R_k$ allocated to the $k$th coefficient. From rate-distortion theory (Berger, 1971), the error variance can be expressed as

  $\sigma_{q_k}^2 = f(R_k)\, \sigma_k^2$

where $f(R_k)$ is the quantizer distortion function for a unity variance input. Typically,

  $f(R_k) = \gamma_k\, 2^{-2R_k}$

where $\gamma_k$ depends on the pdf for $\theta_k$ and on the specific quantizer. Jayant and Noll (1984) report values of $\gamma$ = 1.0, 2.7, 4.5, and 5.7 for uniform, Gaussian, Laplacian, and Gamma pdfs, respectively. The average mean square reconstruction error is then

  $\sigma_\epsilon^2 = \dfrac{1}{N} \sum_{k=0}^{N-1} \gamma_k\, 2^{-2R_k}\, \sigma_k^2$

Next, there is the question of bit allocation to each coefficient, constrained by $NR$, the total number of bits available to encode the coefficient vector $\theta$,

  $NR = \sum_{k=0}^{N-1} R_k$

where $R$ is the average number of bits per coefficient. To minimize Eq. (2.89) subject to the constraint of Eq. (2.90), we again resort to Lagrangian multipliers. First we assume $\gamma_k$ to be the same for each coefficient, and then solve to obtain (Prob. 2.7)

  $R_k = R + \dfrac{1}{2} \log_2 \dfrac{\sigma_k^2}{\left( \prod_{j=0}^{N-1} \sigma_j^2 \right)^{1/N}}$

This result is due to Huang and Schultheiss (1963) and Segall (1976). The number of bits is proportional to the logarithm of the coefficient variance, or to the power in that band, an intuitively expected result.

It can also be shown that the bit allocation of Eq. (2.92) results in equal quantization error for each coefficient, and thus the distortion is spread out evenly among all the coefficients (Prob.
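2.8), that is, for every $k$,

  $\sigma_{q_k}^2 = \gamma\, 2^{-2R} \left( \prod_{j=0}^{N-1} \sigma_j^2 \right)^{1/N}$

The latter also equals the average distortion, since

  $\sigma_\epsilon^2 = \dfrac{1}{N} \sum_{k=0}^{N-1} \sigma_{q_k}^2 = \sigma_{q_k}^2$

The preceding result is the pdf- and $R_k$-optimized distortion for any unitary transform. For the PCM case, $\Phi = I$, and $\sigma_\epsilon^2$ reduces to

  $\sigma_\epsilon^2(\mathrm{PCM}) = \gamma\, 2^{-2R}\, \sigma_f^2$

There is a tacit assumption here that the $\gamma$ in the PCM case of Eq. (2.95) is the same as that for TC in Eq. (2.93). This may not be the case when, for example, the transformation changes the pdf of the input signal. We will neglect this effect. Recall from Eq. (2.63) that, for a unitary transform,

  $\sigma_f^2 = \dfrac{1}{N} \sum_{k=0}^{N-1} \sigma_k^2$

The ratio of distortions in Eqs. (2.95) and (2.93) gives

  $G_{TC} = \dfrac{\frac{1}{N} \sum_{k=0}^{N-1} \sigma_k^2}{\left( \prod_{k=0}^{N-1} \sigma_k^2 \right)^{1/N}}$

The maximized $G_{TC}$ is the ratio of the arithmetic mean of the coefficient variances to the geometric mean.

Among all unitary matrices, the KLT minimizes the geometric mean of the coefficient variances. To appreciate this, recall that from Eq. (2.77) the KLT produced a diagonal $R_\theta$, so that $\sigma_k^2 = \lambda_k$ and

  $G_{KLT} = \dfrac{\frac{1}{N} \sum_{k=0}^{N-1} \lambda_k}{\left( \prod_{k=0}^{N-1} \lambda_k \right)^{1/N}}$

The limiting value of $G_{KLT}$ for $N \to \infty$ gives an upper bound on transform coding performance. The denominator in Eq. (2.99) can be expressed as

  $\left( \prod_{k=0}^{N-1} \lambda_k \right)^{1/N} = \exp\left\{ \dfrac{1}{N} \sum_{k=0}^{N-1} \ln \lambda_k \right\}$

Jayant and Noll (1984) show that

  $\lim_{N \to \infty} \dfrac{1}{N} \sum_{k=0}^{N-1} \ln \lambda_k = \dfrac{1}{2\pi} \int_{-\pi}^{\pi} \ln S_f(e^{j\omega})\, d\omega$

where $S_f$ is the power spectral density of the signal. Hence,

  $\lim_{N \to \infty} \left( \prod_{k=0}^{N-1} \lambda_k \right)^{1/N} = \exp\left\{ \dfrac{1}{2\pi} \int_{-\pi}^{\pi} \ln S_f(e^{j\omega})\, d\omega \right\}$

and the numerator in Eq. (2.99) is recognized as

  $\sigma_f^2 = \dfrac{1}{2\pi} \int_{-\pi}^{\pi} S_f(e^{j\omega})\, d\omega$

Hence,

  ${}^{\infty}G_{TC} = \dfrac{\frac{1}{2\pi} \int_{-\pi}^{\pi} S_f(e^{j\omega})\, d\omega}{\exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \ln S_f(e^{j\omega})\, d\omega \right\}}$

is the reciprocal of the spectral flatness measure introduced by Makhoul and Wolf (1972). It is a measure of the predictability of a signal. For white noise, ${}^{\infty}G_{TC} = 1$ and there is no coding gain. This measure increases with the degree of correlation and hence predictability. Accordingly, coding gain increases as the redundancy in the signal is removed by the unitary transformation.

2.3 Fixed Transforms

The KLT described in Section 2.2 is the optimal unitary transform for signal coding purposes. But the DCT is a strong competitor to the KLT for highly correlated signal sources. The important practical features of the DCT are that it is signal independent (that is, a fixed transform), and that fast computational algorithms exist for the calculation of the spectral coefficient vector.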
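Before the individual transforms are described, the bit allocation rule of Eq. (2.92) and the gain of transform coding over PCM from Section 2.2.2 can be tried on numbers. The sketch below assumes the same gamma for every coefficient and the 2^(-2R) distortion model quoted from Jayant and Noll. The variance values and all function names are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def geometric_mean(variances):
    """Geometric mean of strictly positive variances."""
    return math.exp(sum(math.log(v) for v in variances) / len(variances))


def optimal_bits(variances, avg_bits):
    """R_k = R + 0.5 * log2(sigma_k^2 / geometric mean of the variances)."""
    gm = geometric_mean(variances)
    return [avg_bits + 0.5 * math.log2(v / gm) for v in variances]


def quantization_errors(variances, bits, gamma=1.0):
    """sigma_qk^2 = gamma * 2^(-2 R_k) * sigma_k^2 for each coefficient."""
    return [gamma * 2.0 ** (-2.0 * b) * v for v, b in zip(variances, bits)]


def coding_gain(variances):
    """Arithmetic mean of the coefficient variances over their geometric mean."""
    return (sum(variances) / len(variances)) / geometric_mean(variances)


variances = [16.0, 4.0, 1.0, 0.25]  # hypothetical coefficient variances
avg_bits = 2.0                      # hypothetical average rate R
bits = optimal_bits(variances, avg_bits)
errors = quantization_errors(variances, bits)

print("bits per coefficient:", bits)
print("quantization error variances:", errors)
print("coding gain:", coding_gain(variances))
```

With these inputs the allocated bits sum to N times R and every coefficient ends up with the same quantization error variance, which is the equal-distortion property stated in the text. The sketch ignores practical constraints: the formula can return negative or fractional bit counts for small variances, and a real coder would have to round and clip them.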
In this section we define, list, and describe the salient features of the most popular fixed transforms. These are grouped into three categories: sinusoidal, polynomial, and rectangular transforms.

2.3.1 Sinusoidal Transforms

The discrete Fourier transform (DFT) and its linear derivatives, the discrete cosine transform (DCT) and the discrete sine transform (DST), are the main members of the class described here.

2.3.1.1 The Discrete Fourier Transform

The DFT is the most important orthogonal transformation in signal analysis, with vast implications in every field of signal processing. The fast Fourier transform (FFT) is a fast algorithm for the evaluation of the DFT.

The set of orthogonal (but not normalized) complex sinusoids is the family

  $\{ x_k(n) = e^{j 2\pi k n / N} \}, \quad k = 0, 1, \ldots, N-1$

with the property

  $\sum_{n=0}^{N-1} x_k(n)\, x_l^*(n) = N\, \delta_{k-l}$

Most authors define the forward and inverse DFTs as

  $X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi n k / N}$

  $x(n) = \dfrac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi n k / N}$

The corresponding matrices are

  $\Phi = \left[ e^{-j 2\pi n k / N} \right], \qquad \Phi^{-1} = \dfrac{1}{N} \Phi^*$

This definition is consistent with the interpretation that the DFT is the Z-transform of $\{x(n)\}$ evaluated at $N$ equally spaced points on the unit circle. The set of coefficients $\{X(k)\}$ constitutes the frequency spectrum of the samples. From Eqs. (2.107) and (2.108) we see that both $X(k)$ and $x(n)$ are periodic in their arguments with period $N$. Hence Eq. (2.108) is recognized as the discrete Fourier series expansion of the periodic sequence $\{x(n)\}$, and $\{X(k)\}$ are just the discrete Fourier series coefficients scaled by $N$. Conventional frequency domain interpretation permits an identification of $X(0)/N$ as the "DC" value of the signal. The fundamental $\{x_1(n) = e^{j 2\pi n / N}\}$ is a unit vector in the complex plane that rotates with the time index $n$. The first harmonic $\{x_2(n)\}$ rotates at twice the rate of the fundamental, and so on for the higher harmonics. The properties of this transform are summarized in Table 2.1. For more details, the reader can consult the wealth of literature on this subject, e.g., Papoulis (1991), Oppenheim and Schafer (1975), Haddad and Parsons (1991).
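The forward and inverse definitions above, with the factor 1/N placed on the inverse, can be written out directly. This is a minimal sketch that evaluates the sums as given, in O(N^2) operations rather than with an FFT. The test sequence and the function names `dft` and `idft` are illustrative assumptions.

```python
import cmath


def dft(x):
    """Forward DFT: X(k) = sum over n of x(n) * exp(-j*2*pi*n*k/N)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]


def idft(spectrum):
    """Inverse DFT: x(n) = (1/N) * sum over k of X(k) * exp(+j*2*pi*n*k/N)."""
    n_pts = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * n * k / n_pts)
                for k in range(n_pts)) / n_pts
            for n in range(n_pts)]


x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0]  # hypothetical real samples
X = dft(x)
x_rec = idft(X)

dc = X[0].real / len(x)                      # X(0)/N, the "DC" value
energy_time = sum(v * v for v in x)
energy_freq = sum(abs(c) ** 2 for c in X) / len(x)

print("DC value:", dc)
print("max round-trip error:", max(abs(a - b) for a, b in zip(x, x_rec)))
print("energy (time, frequency):", energy_time, energy_freq)
```

The round trip recovers the input up to rounding, X(0)/N equals the sample mean, and the two energy figures agree once the frequency-domain sum is divided by N, which is the form Parseval's relation takes under this unnormalized convention.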
The unitary DFT is simply a normalized DFT wherein the scale factor $N$ appearing in Eqs. (2.106)-(2.109) is reapportioned according to

  $X(k) = \dfrac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi n k / N}, \qquad x(n) = \dfrac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X(k)\, e^{j 2\pi n k / N}$

This makes

  $\Phi^{-1} = \Phi^*$

and the unitary transformation matrix is

  $\Phi = \dfrac{1}{\sqrt{N}} \left[ e^{-j 2\pi n k / N} \right]$

From a coding standpoint, a key property of this transformation is that the basis vectors of the unitary DFT (the columns of $\Phi^*$) are the eigenvectors of a circulant matrix. That is, with the $k$th column of $\Phi^*$ denoted by

  $\Phi_k^* = \dfrac{1}{\sqrt{N}} \left[ 1, e^{j 2\pi k / N}, \ldots, e^{j 2\pi (N-1) k / N} \right]^T$

we will show that $\Phi_k^*$ are the eigenvectors in

  $\mathcal{H} \Phi_k^* = \lambda_k \Phi_k^*$

where $\mathcal{H}$ is any circulant matrix [...]...

Figure 2.7: Transform bases in time and frequency domains for N = 8: (a) KLT ($\rho$ = 0.95); (b) DCT; (c) DLT; (d) DST; (e) WHT; and (f) MHT.

... factor and $p_r(k)$ is a polynomial of degree $r$ (Prob. 2.14), where $\binom{r}{m}$ is the binomial coefficient, $k^{(m)}$ is the forward factorial function of Eq. (2.135), and $a = (1 - A^2)/A^2$. Using Z-transforms, we can establish the orthonormality (Haddad and Parsons, 1991): [equation not recovered] For our purposes, we outline the steps in the proof. First we calculate $P_r(z)$ by induction and obtain...

This MHT algorithm can be implemented using $2N$ real multiplications, as compared with the fast DCT, which requires $(N \log_2 N - N + 2)$ multiplications. In Section 2.4.6, we compare the coding performance (compaction) of various transforms. The MHT is clearly inferior to the DCT for positively correlated signals, but superior to it for small or negative values of $\rho$.

2.3.2.2 The...
... set, $r = 0, 1, \ldots, (N-1)/2$, have significant energy in the half band $(0, \pi/2)$, while the second half, $r = (N+1)/2, \ldots, N$, span the upper half-band. These properties will be exploited in Chapter 4 in developing Binomial quadrature mirror filters, and in Chapter 5 as basis sequences for wavelets. The 8 x 8 Binomial matrix $X$ follows: [matrix not recovered]...

... point, Eq. (2.113). In summary, the DFT transformation diagonalizes any circulant matrix, and therefore completely decorrelates any signal whose covariance matrix has the circulant properties of $\mathcal{H}$.

2.3.1.2 The Discrete Cosine Transform

This transform is virtually the industry standard in image and speech transform coding because it closely approximates the KLT, especially for highly correlated signals,...

... relation, Eq. (2.138). The digital filter structure shown in Fig. 2.8 generates the entire Binomial-Hermite family. The Hermite polynomials and the Binomial-Hermite sequences are orthogonal on $[0, N]$ with respect to weighting sequences $\binom{N}{k}$ and $\binom{N}{k}^{-1}$, respectively (Prob. 2.12): [equation not recovered] This last equation is the discrete counterpart to the analog Hermite orthogonality of Eq. (2.128)...

... orthogonality of rows and columns asserted by Eq. (2.141). Finally, from Eq. (2.142), we can infer that the complementary filters $X_r(z)$ and $X_{N-r}(z)$ have magnitude responses that are mirror images about $\omega = \pi/2$. Hence, the complementary rows and columns of $X$ possess the mirror filter property (Section 3.3). From Eq. (2.139), it is clear that $X_r(e^{j\omega}) = A_r(\omega) e^{j\theta_r(\omega)}$ has magnitude and (linear) phase responses given by... shown in Fig. 2.9. (See also Prob. 2.15.)

2.3.2.3 The Discrete Legendre Polynomials

The discrete Hermite polynomials weighted by the Binomial sequence are suitable for representing signals with Gaussian-like features on a finite interval. Such sequences fall off rapidly near the end points of the interval $[0, N-1]$. The Laguerre functions provide a signal decomposition...

... at a spacing of $\Delta T = 2^{-p}$, $N = 2^p$, and then relabeling the ordinate so that there is unit spacing between samples. The Walsh functions are continuous from the right, Eqs. (2.161) and (2.162). The sampled value at a discontinuity $t_0$ is the value at $t_0^+$, just to the right of $t_0$. Therefore, the Walsh sequences are a complete set of $N$ orthogonal sequences on $[0, N-1]$ consisting of +1 and -1 values, defined...

... $X(N-k)$
  (7)  Conjugation: $x^*(n) \leftrightarrow X^*(N-k)$
  (8)  Correlation: $\rho(n) = x(n) * x^*(-n) \leftrightarrow R(k) = |X(k)|^2$
  (9)  Parseval: $\sum_{n=0}^{N-1} |x(n)|^2 = \dfrac{1}{N} \sum_{k=0}^{N-1} |X(k)|^2$
  (10) Real signals / conjugate symmetry: $X^*(N-k) = X(k)$
Table 2.1: Properties of the discrete Fourier transform.

We can write Eq. (2.113) as [equation not recovered] where [equation not recovered] which results in [equation not recovered] or [equation not recovered] The proof is straightforward. Consider a linear, time-invariant system...