R. D. DeGroat, E. M. Dowling, and D. A. Linebarger, "Subspace Tracking," 2000 CRC Press LLC.

66 Subspace Tracking

R. D. DeGroat, E. M. Dowling, D. A. Linebarger
The University of Texas at Dallas

66.1 Introduction
66.2 Background
    EVD vs. SVD • Short Memory Windows for Time Varying Estimation • Classification of Subspace Methods • Historical Overview of MEP Methods • Historical Overview of Adaptive, Non-MEP Methods
66.3 Issues Relevant to Subspace and Eigen Tracking Methods
    Bias Due to Time Varying Nature of Data Model • Controlling Roundoff Error Accumulation and Orthogonality Errors • Forward-Backward Averaging • Frequency vs. Subspace Estimation Performance • The Difficulty of Testing and Comparing Subspace Tracking Methods • Spherical Subspace (SS) Updating: A General Framework for Simplified Updating • Initialization of Subspace and Eigen Tracking Algorithms • Detection Schemes for Subspace Tracking
66.4 Summary of Subspace Tracking Methods Developed Since 1990
    Modified Eigen Problems • Gradient-Based Eigen Tracking • The URV and Rank Revealing QR (RRQR) Updates • Miscellaneous Methods
References

66.1 Introduction

Most high resolution direction-of-arrival (DOA) estimation methods rely on subspace or eigen-based information, which can be obtained from the eigenvalue decomposition (EVD) of an estimated correlation matrix or from the singular value decomposition (SVD) of the corresponding data matrix. However, the expense of directly computing these decompositions is usually prohibitive for real-time processing. Also, because the DOA angles are typically time-varying, repeated computation is necessary to track the angles. This has motivated researchers in recent years to develop low cost eigen and subspace tracking methods. Four basic strategies have been pursued to reduce computation: (1) computing only a few eigencomponents, (2) computing a subspace basis instead of individual eigencomponents, (3) approximating the eigencomponents or basis, and (4) recursively updating the eigencomponents or basis. The most efficient methods usually employ several of these strategies.

In 1990, an extensive survey of SVD tracking methods was published by Comon and Golub [7]. They classified the various algorithms according to complexity, and basically two categories emerge: O(n^2 r) and O(nr^2) methods, where n is the snapshot vector size and r is the number of extreme eigenpairs to be tracked. Typically, r < n or r << n, so the O(nr^2) methods involve significantly fewer computations than the O(n^2 r) algorithms. However, since 1990, a number of O(nr) algorithms have been developed. This article will primarily focus on recursive subspace and eigen updating methods developed since 1990, especially the O(nr^2) and O(nr) algorithms.

66.2 Background

66.2.1 EVD vs. SVD

Let X = [x_1 | x_2 | ... | x_N] be an n x N data matrix where the kth column corresponds to the kth snapshot vector, x_k in C^n. With block processing, the correlation matrix for a zero mean, stationary, ergodic vector process is typically estimated as R = (1/N) X X^H, where the true correlation matrix is Phi = E[x_k x_k^H] = E[R]. The EVD of the estimated correlation matrix is closely related to the SVD of the corresponding data matrix. The SVD of X is given by X = U S V^H, where U in C^{n x n} and V in C^{N x N} are unitary matrices and S in C^{n x N} is a diagonal matrix whose nonzero entries are positive. It is easy to see that the left singular vectors of X are the eigenvectors of X X^H = U S S^T U^H, and the right singular vectors of X are the eigenvectors of X^H X = V S^T S V^H. This is so because X X^H and X^H X are positive semidefinite Hermitian matrices (which have orthogonal eigenvectors and real, nonnegative eigenvalues). Also note that the nonzero singular values of X are the positive square roots of the nonzero eigenvalues of X X^H and X^H X.
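This relationship is easy to verify numerically. The short numpy sketch below is illustrative only (the dimensions and variable names are arbitrary, not taken from the chapter); it checks that the left singular vectors of X diagonalize X X^H and that the eigenvalues of X X^H are the squared singular values of X.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 4, 10
X = rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))

U, s, Vh = np.linalg.svd(X)                       # X = U S V^H
eigvals, eigvecs = np.linalg.eigh(X @ X.conj().T)

# Eigenvalues of X X^H equal the squared singular values of X.
print(np.allclose(np.sort(eigvals), np.sort(s**2)))
# The left singular vectors diagonalize X X^H: U^H (X X^H) U = diag(s^2).
print(np.allclose(U.conj().T @ (X @ X.conj().T) @ U, np.diag(s**2)))
```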
Mathematically, the eigen information contained in the SVD of X or the EVD of X X^H (or X^H X) is equivalent, but the dynamic range of the eigenvalues is twice that of the corresponding singular values. With finite precision arithmetic, the greater dynamic range can result in a loss of information. For example, in rank determination, suppose the smallest singular value is sqrt(epsilon), where epsilon is machine precision. The corresponding eigenvalue, epsilon, would be considered a machine precision zero, and the EVD of X X^H (or X^H X) would incorrectly indicate a rank deficiency. Because of the dynamic range issue, it is generally recommended to use the SVD of X (or a square root factor of R). However, because additive sensor noise usually dominates numerical errors, this choice may not be critical in most signal processing applications.

66.2.2 Short Memory Windows for Time Varying Estimation

Ultimately, we are interested in tracking some aspect of the eigenstructure of a time varying correlation (or data) matrix. For simplicity we will focus on time varying estimation of the correlation matrix, realizing that the EVD of R is trivially related to the SVD of X. A time varying estimator must have a short term memory in order to track changes. An example of long memory estimation is an estimator that involves a growing rectangular data window: as time goes on, the estimated quantities depend more and more on the old data, and less and less on the new data. The two most popular short memory approaches to estimating a time varying correlation matrix involve (1) a moving rectangular window and (2) an exponentially faded window.

Unfortunately, an unbiased, causal estimate of the true instantaneous correlation matrix at time k, Phi_k = E[x_k x_k^H], is not possible if averaging is used and the vector process is truly time varying. However, it is usually assumed that the process is varying slowly enough within the effective observation window that the process is approximately stationary and some averaging is desirable. In any event, at time k, a length N moving rectangular data window results in a rank two modification of the correlation matrix estimate, i.e.,

    R_k^{(rect)} = R_{k-1}^{(rect)} + \frac{1}{N}\left( x_k x_k^H - x_{k-N} x_{k-N}^H \right)        (66.1)

where x_k is the new snapshot vector and x_{k-N} is the oldest vector, which is being removed from the estimate. The corresponding data matrix is given by X_k^{(rect)} = [x_k | x_{k-1} | ... | x_{k-N+1}] and R_k^{(rect)} = \frac{1}{N} X_k^{(rect)} (X_k^{(rect)})^H. Subtracting the rank one matrix from the correlation estimate is referred to as a rank one downdate. Downdating moves all the eigenvalues down (or leaves them unchanged); updating, on the other hand, moves all eigenvalues up (or leaves them unchanged). Downdating is potentially ill-conditioned because the smallest eigenvalue can move towards zero.

An exponentially faded data window produces a rank one modification in

    R_k^{(fade)} = \alpha R_{k-1}^{(fade)} + (1 - \alpha) x_k x_k^H        (66.2)

where alpha is the fading factor with 0 <= alpha <= 1. In this case, the data matrix is growing in size, but the older data is de-emphasized with a diagonal weighting matrix, X_k^{(fade)} = [x_k | x_{k-1} | ... | x_1] \sqrt{\mathrm{diag}(1, \alpha, \alpha^2, ..., \alpha^{k-1})} and R_k^{(fade)} = (1 - \alpha) X_k^{(fade)} (X_k^{(fade)})^H. Of course, the two windows could be combined to produce an exponentially faded moving rectangular window, but this kind of hybrid short memory window has not been the subject of much study in the signal processing literature.
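In code, the two short memory recursions (66.1) and (66.2) might be implemented as follows. This is a minimal numpy sketch with illustrative names, not an implementation from the chapter.

```python
import numpy as np

def rect_window_update(R, x_new, x_old, N):
    """Moving rectangular window, Eq. (66.1): a rank-two modification of R."""
    return R + (np.outer(x_new, x_new.conj()) - np.outer(x_old, x_old.conj())) / N

def faded_update(R, x_new, alpha):
    """Exponentially faded window, Eq. (66.2): a rank-one modification of R."""
    return alpha * R + (1.0 - alpha) * np.outer(x_new, x_new.conj())
```

Note that the rectangular update must retain the N most recent snapshots so that x_{k-N} can be removed, whereas the faded update needs only the current snapshot.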
Similarly, not much attention has been paid to which short memory windowing scheme is most appropriate for a given data model. Since downdating is potentially ill-conditioned, and since two rank one modifications usually involve more computation than one, the exponentially faded window has some advantages over the moving rectangular window. The main advantage of a (short) rectangular window is in tracking sudden changes. Assuming stationarity within the effective observation window, the power in a rectangular window will be equal to the power in an exponentially faded window when

    N \approx \frac{1}{1 - \alpha}    or equivalently    \alpha \approx \frac{N - 1}{N} = 1 - \frac{1}{N}        (66.3)

Based on a Fourier analysis of linearly varying frequencies, equal frequency lags occur when [14]

    N \approx \frac{1 + \alpha}{1 - \alpha}    or equivalently    \alpha \approx \frac{N - 1}{N + 1}        (66.4)

Either one of these relationships could be used as a rule of thumb for relating the effective observation windows of the two most popular short memory windowing schemes.
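As a small worked example of these rules of thumb (helper names are illustrative), a rectangular window length N can be mapped to an equivalent fading factor:

```python
def alpha_equal_power(N):
    """Eq. (66.3): fading factor with the same effective power as a length-N rectangular window."""
    return (N - 1) / N

def alpha_equal_lag(N):
    """Eq. (66.4): fading factor with the same frequency lag as a length-N rectangular window."""
    return (N - 1) / (N + 1)

# For N = 50 these give 0.98 and about 0.961, respectively.
print(alpha_equal_power(50), alpha_equal_lag(50))
```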
66.2.3 Classification of Subspace Methods

Eigenstructure estimation can be classified as (1) block or (2) recursive. Block methods simply compute an EVD, SVD, or related decomposition based on a block of data. Recursive methods update the previously computed eigen information using new data as it arrives. We focus on recursive subspace updating methods in this article.

Most subspace tracking algorithms can also be broadly categorized as (1) modified eigen problem (MEP) methods or (2) adaptive (or non-MEP) methods. With short memory windowing, MEP methods are adaptive in the sense that they can track time varying eigen information. However, when we use the word adaptive, we mean that exact eigen information is not computed at each update; rather, an adaptive method tends to move towards an EVD (or some aspect of an EVD) at each update. For example, gradient-based, perturbation-based, and neural network-based methods are classified as adaptive because on average they move towards an EVD at each update. On the other hand, rank one, rank k, and sphericalized EVD and SVD updates are, by definition, MEP methods because exact eigen information associated with an explicit matrix is computed at each update. Both MEP and adaptive methods are supposed to track the eigen information of the instantaneous, time varying correlation matrix.

66.2.4 Historical Overview of MEP Methods

Many researchers have studied SVD and EVD tracking problems. Golub [19] introduced one of the first eigen-updating schemes, and his ideas were developed and expanded by Bunch and co-workers in [3, 4]. The basic idea is to update the EVD of a symmetric (or Hermitian) matrix when it is modified by a rank one matrix. The rank-one eigen update was simplified in [37], when Schreiber introduced a transformation that makes the core eigenproblem real. Based on an additive white noise model, Karasalo [21] and Schreiber [37] suggested that the noise subspace be "sphericalized", i.e., that the noise eigenvalues be replaced by their average value so that deflation [4] could be used to significantly reduce computation. By deflating the noise subspace and only tracking the r dominant eigenvectors, the computation is reduced from O(n^3) to O(nr^2) per update. DeGroat reduced computation further by extending this concept to the signal subspace [8]. By sphericalizing and deflating both the signal and the noise subspaces, the cost of tracking the r dimensional signal (or noise) subspace is O(nr), and no iteration is involved.

To make eigen updating more practical, DeGroat and Roberts developed stabilization schemes to control the loss of orthogonality due to the buildup of roundoff error [10]. Further work related to eigenvector stabilization is reported in [15, 28, 29, 30]. Recently, a more stable version of Bunch's algorithm was developed by Gu and Eisenstat [20]. In [46], Yu extended rank one eigen updating to rank k updating.

DeGroat showed in [8] that forcing certain subspaces of the correlation matrix to be spherical, i.e., replacing the associated eigenvalues with a fixed or average value, is an easy way to deflate the size of the updating problem and reduce computation. Basically, a spherical subspace (SS) update is a rank one EVD update of a sphericalized correlation matrix. Asymptotic convergence analysis of SS updating is found in [11, 13]. A four level SS update capable of automatic signal subspace rank and size adjustment is described in [9, 11]. The four level and the two level SS updates are the only MEP updates to date that are O(nr) and noniterative. For more details on SS updating, see Section 66.3.6, Spherical Subspace (SS) Updating: A General Framework for Simplified Updating.

In [42], Xu and Kailath present a Lanczos based subspace tracking method with an associated detection scheme to track the number of sources. A reference list for systolic implementations of SVD based subspace trackers is contained in [12].

66.2.5 Historical Overview of Adaptive, Non-MEP Methods

Owsley pioneered orthogonal iteration and stochastic-based subspace trackers in [32]. Yang and Kaveh extended Owsley's work in [44] by devising a family of constrained gradient-based algorithms. A highly parallel algorithm, denoted the inflation method, is introduced for the estimation of the noise subspace. The computational complexity of this family of gradient-based methods varies from approximately n^2 r to (7/2)nr for the adaptation equation. However, since the eigenvectors are only approximately orthogonal, an additional nr^2 flops may be needed if Gram Schmidt orthogonalization is used. It may be that a partial orthogonalization scheme (see Section 66.3.2, Controlling Roundoff Error Accumulation and Orthogonality Errors) can be combined with Yang and Kaveh's methods to improve orthogonality enough to eliminate the O(nr^2) Gram Schmidt computation. Karhunen [22] also extended Owsley's work by developing a stochastic approximation method for subspace computation. Bin Yang [43] used recursive least squares (RLS) methods with a projection approximation approach to develop the projection approximation subspace tracker (PAST), which tracks an arbitrary basis for the signal subspace, and PASTd, which uses deflation to track the individual eigencomponents. A multi-vector eigen tracker based on the conjugate gradient method is developed in [18]; previous conjugate gradient-based methods tracked a single eigenvector only. Orthogonal iteration, lossless adaptive filter, and perturbation-based subspace trackers appear in [40], [36], and [5], respectively. A family of non-EVD subspace trackers is given in [16]. An adaptive subspace method that uses a linear operator, referred to as the Propagator, is given in [26]. Approximate SVD methods that are based on a QR update step followed by a single (or partial) Jacobi sweep to move the triangular factor towards a diagonal form appear in [12, 17, 30]. These methods can be described as approximate SVD methods because they will converge to an SVD if the Jacobi sweeps are repeated.
Subspace estimation methods based on URV or rank revealing QR (RRQR) decompositions are referenced in [6]. These rank revealing decompositions can divide a set of orthonormal vectors into sets that span the signal and noise subspaces. However, a threshold (noise power) level that lies between the largest noise eigenvalue and the smallest signal eigenvalue must be known in advance. In some ways, the URV decomposition can be viewed as an approximate SVD. For example, the transposed QR (TQR) iteration [12] can be used to compute the SVD of a matrix, but if the iteration is stopped before convergence, the resulting decomposition is URV-like.

Artificial neural networks (ANN) have also been used to estimate eigen information [35]. In 1982, Oja [31] was one of the first to develop an eigenvector estimating ANN. Using a Hebbian type learning rule, this ANN adaptively extracts the first principal eigenvector. Much research has been done in this area since 1982; for an overview and a list of references, see [35].

66.3 Issues Relevant to Subspace and Eigen Tracking Methods

66.3.1 Bias Due to Time Varying Nature of Data Model

Because direction-of-arrival (DOA) angles are typically time varying, a range of spatial frequencies is usually included in the effective observation window. Most spatial frequency estimation methods yield frequency estimates that are approximately equal to the effective frequency average in the window. Consequently, the estimates lag the true instantaneous frequency. If the frequency variation is assumed to be linear within the effective observation window, this lag (or bias) can be easily estimated and compensated [14].

66.3.2 Controlling Roundoff Error Accumulation and Orthogonality Errors

Numerical algorithms are generally defined as stable if the roundoff error accumulates in a linear fashion. However, recursive updating algorithms cannot tolerate even a linear buildup of error if large (possibly unbounded) numbers of updates are to be performed, and for real time processing, periodic reinitialization is undesirable. Most subspace tracking algorithms involve the product of at least k orthogonal matrices by the time the kth update is computed. According to Parlett [33], the error propagated by a product of orthogonal matrices is bounded as

    \| U_k U_k^H - I \|_E \leq (k + 1)\, n^{1.5}\, \epsilon        (66.5)

where the n x n matrix U_k = U_{k-1} Q_k = Q_k Q_{k-1} \cdots Q_1 is a product of k matrices that are each orthogonal to working accuracy, epsilon is machine precision, and \| . \|_E denotes the Euclidean matrix norm. Clearly, if k is large enough, the roundoff error accumulation can be significant.

There are really only two sources of error in updating a symmetric or Hermitian EVD: (1) the eigenvalues and (2) the eigenvectors. Of course, the eigenvectors and eigenvalues are interrelated; errors in one tend to produce errors in the other. At each update, small errors may occur in the EVD update so that the eigenvalues become slowly perturbed and the eigenvectors become slowly nonorthonormal. The solution is to prevent significant errors from ever accumulating in either.

We do not expect the main source of error to be the eigenvalues. According to Stewart [38], the eigenvalues of a Hermitian matrix are perfectly conditioned, having condition numbers of one. Moreover, it is easy to show that when exponential weighting is used, the accumulated roundoff error is bounded by a constant, assuming no significant errors are introduced by the eigenvectors. By contrast, if exponential windowing is not used, the bound on the accumulated error builds up in a linear fashion.
Thus, the fading factor not only fades out old data, but also old roundoff errors that accumulate in the eigenvalues.

Unfortunately, the eigenvectors of a Hermitian matrix are not guaranteed to be well conditioned. An eigenvector will be ill-conditioned if its eigenvalue is closely spaced with other eigenvalues. In this case, small roundoff perturbations to the matrix may cause relatively large errors in the eigenvectors. The greatest potential for nonorthogonality, then, is between eigenvectors with adjacent (closely spaced) eigenvalues. This observation led to the development of a partial orthogonalization scheme known as pairwise Gram Schmidt (PGS) [10], which attacks the roundoff error buildup problem at the point of greatest numerical instability: nonorthogonality of adjacent eigenvectors. If the intervening rotations (orthogonal matrix products) inherent in the eigen update are random enough, the adjacent vector PGS can be viewed as a full orthogonalization spread out over time. When PGS is combined with exponential fading, the roundoff accumulation in both the eigenvectors and the eigenvalues is controlled. Although PGS was originally designed to stabilize Bunch's EVD update, it is generally applicable to any EVD, SVD, URV, QR, or orthogonal vector update.

Moonen et al. [29] suggested that the bulk of the eigenvector stabilization in the PGS scheme is due to the normalization of the eigenvectors. Simulations seem to indicate that normalization alone stabilizes the eigenvectors almost as well as the PGS scheme, but not to working precision orthogonality. Edelman and Stewart provide some insight into the normalization-only approach to maintaining orthogonality [15]. For additional analysis and variations on the basic idea of spreading orthogonalization out over time, see [30] and especially [28].

Many of the O(nr) adaptive subspace methods produce eigenvector estimates that are only approximately orthogonal, and normalization alone does not always provide enough stabilization to keep the orthogonality and other error measures small enough. We have found that PGS stabilization can noticeably improve both the subspace estimation performance and the DOA (or spatial frequency) estimation performance. For example, without PGS (but with normalization only), we found that Champagne's O(nr) perturbation-based eigen tracker (method PC) [5] sometimes gives spurious MUSIC-based frequency estimates. On the other hand, with PGS, Champagne's PC method produced improved subspace and frequency estimates; the orthogonality error was also significantly reduced. Similar performance boosts could be expected for any subspace or eigen tracking method, especially those that produce eigenvector estimates that are only approximately orthogonal, e.g., PAST and PASTd [43] or Yang and Kaveh's family of gradient based methods [44, 45]. Unfortunately, normalization only and PGS are O(nr); adding this kind of stabilization to an O(nr) subspace tracking method could double its overall computation.

Other variations on the original PGS idea involve symmetrizing the 2 x 2 transformation and making the pairwise orthogonalization cyclic [28]. The symmetric transformation assumes that the vector pairs are almost orthogonal so that higher order error terms can be ignored; if this is the case, the symmetric version can provide slightly better results at a somewhat higher computational cost.
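To make the basic adjacent-pair idea concrete, the following is a minimal sketch of one PGS pass (illustrative only; see [10] for the actual stabilized update). Each adjacent eigenvector pair is renormalized and given one Gram Schmidt step:

```python
import numpy as np

def pairwise_gram_schmidt(U):
    """One adjacent-pair PGS pass over the columns of U (n x r, nearly orthonormal)."""
    U = U.copy()
    r = U.shape[1]
    for i in range(r - 1):
        U[:, i] /= np.linalg.norm(U[:, i])            # renormalize u_i
        proj = U[:, i].conj() @ U[:, i + 1]
        U[:, i + 1] -= proj * U[:, i]                 # make u_{i+1} orthogonal to u_i
        U[:, i + 1] /= np.linalg.norm(U[:, i + 1])    # renormalize u_{i+1}
    return U
```

A cyclic variant, discussed next, would instead touch only one column pair per update and step through all pairs over successive updates.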
For methods that maintain working precision orthogonal vectors, the original PGS scheme is overkill. Instead of doing PGS orthogonalization on each adjacent vector pair, cyclic PGS orthogonalizes only one pair of vectors per update, but cycles through all possible combinations over time. Thus, cyclic PGS covers all vector pairs without relying on the randomness of intervening rotations, and it spreads the orthogonalization process out in time even more than the adjacent vector PGS method. Moreover, cyclic PGS (or cyclic normalization) involves O(n) flops per update, although there is a small overhead associated with keeping track of the vector pair cycle.

In summary, we can say that stabilization may not be needed for a small number of updates. On the other hand, if an unbounded number of updates is to be performed, some kind of stabilization is recommended. For methods that yield nearly orthogonal vectors at each update, only a small amount of orthogonalization is needed to control the error buildup; in these cases, cyclic PGS may be best. However, for methods that produce vectors that are only approximately orthogonal, a more complete orthogonalization scheme may be appropriate, e.g., a cyclic scheme with two or three vector pairs orthogonalized per update will produce better results than a single pair scheme.

66.3.3 Forward-Backward Averaging

In many subspace tracking problems, forward-backward (FB) averaging can improve subspace as well as DOA (or frequency) estimation performance. Although FB averaging is generally not appropriate for nonstationary processes, it does appear to improve spatial frequency estimation performance if the frequencies vary linearly within the effective observation window. Based on Fourier analysis of linearly varying frequencies, we infer that this is probably due to the fact that the average frequency in the window is identical for both the forward and the backward cases [14]; consequently, the frequency estimates are reinforced by FB averaging. Besides improved estimation performance, FB averaging can be exploited to reduce computation by as much as 75% [24]. FB averaging can also reduce computer memory requirements because (conjugate symmetric or anti-symmetric) symmetries in the complex eigenvectors of an FB averaged correlation matrix (or the singular vectors of an FB data matrix) can be exposed through appropriate normalization.
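For reference, the standard forward-backward averaged correlation estimate (shown here in its generic form; the computational savings described in [24] go well beyond this) replaces R with its persymmetrized version:

```python
import numpy as np

def fb_average(R):
    """Forward-backward averaging: R_fb = (R + J conj(R) J) / 2, with J the exchange matrix."""
    J = np.fliplr(np.eye(R.shape[0]))
    return 0.5 * (R + J @ R.conj() @ J)
```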
66.3.4 Frequency vs. Subspace Estimation Performance

It has recently been shown with asymptotic analysis that a better subspace estimate does not necessarily result in a better MUSIC-based frequency estimate [23]. In subspace tracking simulations, we have also observed that some methods produce better subspace estimates, but the associated MUSIC-based frequency estimates are not always better. Consequently, if DOA estimation is the ultimate goal, subspace estimation performance may not be the best criterion for evaluating subspace tracking methods.

66.3.5 The Difficulty of Testing and Comparing Subspace Tracking Methods

A significant amount of research has been done on subspace and eigen tracking algorithms in the past few years, and much progress has been made in making subspace tracking more efficient. Not surprisingly, all of the methods developed to date have different strengths and weaknesses. Unfortunately, there has not been enough time to thoroughly analyze, study, and evaluate all of the new methods. Over the years, several tests have been devised to "experimentally" compare various methods, e.g., convergence tests [44], response to sudden changes [7], and crossing frequency tracks (where the signal subspace temporarily collapses) [8]. Some methods do well on one test, but not so well on another. It is difficult to objectively compare different subspace tracking methods because optimal operating parameters are usually unknown and therefore unused, and the performance criteria may be ill-defined or contradictory.

66.3.6 Spherical Subspace (SS) Updating: A General Framework for Simplified Updating

Most eigen and subspace tracking algorithms are based directly or indirectly on tracking some aspect of the EVD of a time varying correlation matrix estimate that is recursively updated according to Eq. (66.1) or (66.2). Since Eqs. (66.1) and (66.2) involve rank one and rank two modifications of the correlation matrix, most subspace tracking algorithms explicitly or implicitly involve rank one (or two) modification of the correlation matrix. Since rank two modifications can be computed as two rank one modifications, we will focus on rank one updating.

Basically, spherical subspace (SS) updates are simplified rank one EVD updates. The simplification involves sphericalizing subsets of eigenvalues (i.e., forcing each subset to have the same eigenlevel) so that the sphericalized subspaces can be deflated. Based on an additive white noise signal model, Karasalo [21] and Schreiber [37] first suggested that the "noise" eigenvalues be replaced by their average value in order to reduce computation by deflation. Using Ljung's ODE-based method for analyzing stochastic recursive algorithms [25], it has recently been shown that, if the noise subspace is sphericalized, the dominant eigenstructure of a correlation matrix asymptotically converges to the true eigenstructure with probability one (under any noise assumption) [11]. It is important to realize that averaging the noise eigenvalues yields a spherical subspace in which the eigenvectors can be arbitrarily oriented as long as they form an orthonormal basis for the subspace. A rank-one modification affects only one component of the sphericalized subspace; thus, only one of the multiple noise eigenvalues is changed by a rank-one modification. Consequently, making the noise subspace spherical (by averaging the noise eigenvalues, or replacing them with a constant eigenlevel) deflates the eigenproblem to an (r + 1) x (r + 1) problem, which corresponds to a signal subspace of dimension r plus the single noise component whose power is changed. For details on deflation, see [4].

The analysis in [11] shows that any number of sphericalized eigenlevels can be used to track various subspace spans associated with the correlation matrix. For example, if both the noise and the signal subspaces are sphericalized (i.e., the dominant and subdominant sets of eigenvalues are replaced by their respective averages), the problem deflates to a 2 x 2 eigenproblem that can be solved in closed form, noniteratively. We will call this doubly deflated SS update SA2 (signal averaged, two eigenlevels) [8]. In [13] we derived the SA2 algorithm ODE and used a Lyapunov function to show asymptotic convergence to the true subspaces w.p. 1 under a diminishing gain assumption. In fact, the SA2 subspace trajectories can be described with Lie bracket notation and follow an isospectral flow as described by Brockett's ODE [2]. A four level SS update (called SA4) was introduced in [9] to allow for information theoretic source detection (based on the eigenvalues at the boundary of the signal and noise subspaces) and automatic subspace size adjustment. A detailed analysis of SA4 and an SA4 minimum description length (SA4-MDL) detection scheme can be found in [11, 41].
SA4 sphericalizes all the signal eigenvalues except the smallest one, and all the noise eigenvalues except the largest one, resulting in a 4 x 4 deflated eigenproblem. By tracking the eigenvalues that are on the boundary of the signal and noise subspaces, information theoretic detection schemes can be used to decide if the signal subspace dimension should be increased, decreased, or remain unchanged. Both SA2 and SA4 are O(nr) and noniterative. The deflated core problem in SS updating can involve any EVD or SVD method that is desired; it can also involve other decompositions, e.g., the URVD [34].

To illustrate the basic idea of SS updating, we will explicitly show how an update is accomplished when only the smallest (n - r) "noise" eigenvalues are sphericalized. This particular SS update is called a Signal Eigenstructure (SE) update because only the dominant r "signal" eigencomponents are tracked. This case is equivalent to that described by Schreiber [37], and an SVD version is given by Karasalo [21]. To simplify and more clearly illustrate the idea of SS updating, we drop the normalization factor, (1 - alpha), and the k subscripts from Eq. (66.2) and use the eigendecomposition R = U D U^H to expose a simpler underlying structure for a single rank-one update:

    \tilde{R} = \alpha R + x x^H                                                        (66.6)
              = \alpha U D U^H + x x^H                                                  (66.7)
              = U (\alpha D + \beta \beta^H) U^H,            \beta = U^H x              (66.8)
              = U G (\alpha D + \gamma \gamma^T) G^H U^H,    \gamma = G^H \beta         (66.9)
              = U G H (\alpha D + \zeta \zeta^T) H^T G^H U^H,  \zeta = H^T \gamma       (66.10)
              = U G H (Q \tilde{D} Q^T) H^T G^H U^H                                     (66.11)
              = \tilde{U} \tilde{D} \tilde{U}^H,             \tilde{U} = U G H Q        (66.12)

where G = \mathrm{diag}(\beta_1/|\beta_1|, ..., \beta_n/|\beta_n|) is a diagonal unitary transformation that has the effect of making the matrix inside the parentheses real [37], H is an embedded Householder transformation that deflates the core problem by zeroing out certain elements of zeta (see the SE case below), and Q \tilde{D} Q^T is the EVD of the simplified, deflated core matrix (\alpha D + \zeta \zeta^T). In general, H and Q will involve smaller matrices embedded in an n x n identity matrix.

In order to see the details of deflation more clearly, we must concentrate on finding the eigendecomposition of the completely real matrix S = (\alpha D + \gamma \gamma^T) for a specific case. Let us consider the SE update and assume that the noise eigenvalues contained in the diagonal matrix have been replaced by their average value, d^{(n)}, to produce a sphericalized noise subspace. We must then apply block Householder transformations to concentrate all of the power in the new data vector into a single component of the noise subspace. The update is thus deflated to an (r + 1) x (r + 1) embedded eigenproblem as shown below:

    S = (\alpha D + \gamma \gamma^T) = H (\alpha D + \zeta \zeta^T) H^T,   \zeta = H^T \gamma        (66.13)

      = \begin{bmatrix} I_r & 0 \\ 0 & H^{(n)}_{n-r} \end{bmatrix}
        \left( \alpha \begin{bmatrix} D^{(s)}_r & 0 \\ 0 & d^{(n)} I_{n-r} \end{bmatrix} + \zeta \zeta^T \right)
        \begin{bmatrix} I_r & 0 \\ 0 & H^{(n)}_{n-r} \end{bmatrix}^T                                 (66.14)

      = \begin{bmatrix} I_r & 0 \\ 0 & H^{(n)}_{n-r} \end{bmatrix}
        \begin{bmatrix} Q_{r+1} & 0 \\ 0 & I_{n-r-1} \end{bmatrix}
        \begin{bmatrix} \tilde{D}^{(s)}_r & 0 & 0 \\ 0 & \tilde{d}^{(n)} & 0 \\ 0 & 0 & \alpha d^{(n)} I_{n-r-1} \end{bmatrix}
        \begin{bmatrix} Q_{r+1} & 0 \\ 0 & I_{n-r-1} \end{bmatrix}^T
        \begin{bmatrix} I_r & 0 \\ 0 & H^{(n)}_{n-r} \end{bmatrix}^T                                 (66.15)-(66.16)

      = H (Q \tilde{D} Q^T) H^T                                                                      (66.17)

where

    \zeta = H^T \gamma = \begin{bmatrix} \gamma^{(s)} \\ |\gamma^{(n)}| \\ 0_{(n-r-1) \times 1} \end{bmatrix}        (66.18)

    H^{(n)}_{n-r} = I_{n-r} - 2\, \frac{v^{(n)} (v^{(n)})^T}{(v^{(n)})^T v^{(n)}}                                     (66.19)

    H = \begin{bmatrix} I_r & 0 \\ 0 & H^{(n)}_{n-r} \end{bmatrix}                                                    (66.20)

    \gamma = \begin{bmatrix} \gamma^{(s)} \\ \gamma^{(n)} \end{bmatrix},  \quad \gamma^{(s)} \in R^r, \ \gamma^{(n)} \in R^{n-r}   (66.21)

    v^{(n)} = \gamma^{(n)} + |\gamma^{(n)}| \begin{bmatrix} 1 \\ 0_{(n-r-1) \times 1} \end{bmatrix}                   (66.22)

The superscripts (s) and (n) denote the signal and noise subspaces, respectively, and the subscripts denote the sizes of the various block matrices. In the actual implementation of the SE algorithm, the Householder transformations are not explicitly computed, as we will see below. Moreover, it should be stressed that the Householder transformation does not change the span of the noise subspace, but merely "aligns" the subspace so that all of the new data vector, x, that projects into the noise subspace lies in a single component of the noise subspace.
The embedded (deflated) (r + 1) x (r + 1) eigenproblem,

    E = \alpha \begin{bmatrix} D^{(s)} & 0 \\ 0 & d^{(n)} \end{bmatrix}
        + \begin{bmatrix} \gamma^{(s)} \\ |\gamma^{(n)}| \end{bmatrix} \begin{bmatrix} \gamma^{(s)} \\ |\gamma^{(n)}| \end{bmatrix}^T
      = Q_{r+1} \tilde{D}_{r+1} Q_{r+1}^T                                                            (66.23)

can be solved using any EVD algorithm. Or, an SVD (square root) version can be computed by finding the SVD of

    F = \begin{bmatrix} \sqrt{\alpha}\, \Sigma^{(s)} & 0 & \gamma^{(s)} \\ 0 & \sqrt{\alpha}\, \sigma^{(n)} & |\gamma^{(n)}| \end{bmatrix}_{(r+1) \times (r+2)}
      = Q_{r+1} \tilde{\Sigma}_{r+1} P_{r+1}^T                                                       (66.24)

where E = F F^T, \Sigma^{(s)} = \sqrt{D^{(s)}}, \sigma^{(n)} = \sqrt{d^{(n)}}, and \tilde{\Sigma}_{r+1} = \sqrt{\tilde{D}_{r+1}}. The right singular vectors, P_{r+1}, are generally not needed or explicitly computed in most subspace tracking problems. The new signal and noise subspaces are thus given by

    \tilde{U} = [\tilde{U}^{(s)}, \tilde{U}^{(n)}]                                                   (66.25)
              = U G H Q                                                                              (66.26)
              = [\underbrace{U^{(s)} G^{(s)}}_{n \times r}, \underbrace{U^{(n)} G^{(n)} H^{(n)}}_{n \times (n-r)}]
                \begin{bmatrix} Q_{r+1} & 0 \\ 0 & I_{n-r-1} \end{bmatrix}                           (66.27)

where U^{(s)} and U^{(n)} are the old signal and noise subspaces, G represents the diagonal unitary transformation that makes the rest of the problem real, H is the block Householder transformation that rotates (or, more precisely, reflects) the spherical subspaces so that all of the noise power contained in the new data vector is concentrated into a single component of the noise subspace, and Q represents the evolution and interaction of the two subspaces induced by the new data vector. Basically, this update partitions the data space into two subspaces: the signal subspace is not sphericalized and all of its eigencomponents are explicitly tracked, whereas the noise subspace is sphericalized and not explicitly tracked (to save computation).

Using the properties of the Householder transformation, it can be shown that the single component of the noise subspace that mixes with the signal subspace via Q_{r+1} is given by

    u^{(n)} = \text{the first column of } U^{(n)} G^{(n)} H^{(n)}                                    (66.28)
            = \frac{U^{(n)} (U^{(n)})^H x}{| U^{(n)} (U^{(n)})^H x |}                                (66.29)
            = \frac{(I - U^{(s)} (U^{(s)})^H) x}{|\gamma^{(n)}|}                                     (66.30)
            = \frac{x - U^{(s)} \gamma^{(s)}}{|\gamma^{(n)}|}                                        (66.31)

where u^{(n)} is the projection of x into the noise subspace and |\gamma^{(n)}| = |x - U^{(s)} \gamma^{(s)}| is the power of x projected into the noise subspace. Once the eigenvectors of the core (r + 1) x (r + 1) problem are found, the signal subspace eigenvectors can be updated as

    \tilde{U} = [\tilde{U}^{(s)}, \tilde{u}^{(n)}]                                                   (66.32)
              = [\underbrace{U^{(s)} G^{(s)}}_{n \times r}, \underbrace{u^{(n)}}_{n \times 1}] \, Q_{r+1}        (66.33)

where updating the new noise eigenvector is not necessary (if the noise subspace is resphericalized). The complexity of the core eigenproblem is O(r^3) and updating the signal eigenvectors is O(nr^2); thus, the SE update is O(nr^2). After an update is accomplished, one of the noise eigencomponents is altered by the embedded eigenproblem. To maintain noise subspace sphericity, the noise eigenvalues must be re-averaged before the next SE update can be accomplished. On the other hand, if the noise eigenvalues are not re-averaged, the SE update eventually reverts to a full eigen update.

A whole family of related SS updates is possible by simple modification of the above described process. For example, to obtain SA2, the H transformation in Eq. (66.20) would be modified by replacing the I_r block with an r x r Householder matrix that deflates the signal subspace. This would make the core eigenproblem 2 x 2 and the Q matrix an identity with an embedded 2 x 2 orthogonal matrix.
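To summarize the SE update in algorithmic form, the following numpy sketch performs one update along the lines derived above. It is an illustration only, not the authors' implementation: it restores the (1 - alpha) normalization dropped in the derivation, assumes the smallest core eigenvalue belongs to the noise side, and assumes the projections beta_i and the noise residual are nonzero.

```python
import numpy as np

def se_update(Us, Ds, dn, x, alpha):
    """One Signal Eigenstructure (SE) update (illustrative sketch of Section 66.3.6).

    Us    : n x r matrix of current signal eigenvectors.
    Ds    : length-r array of current signal eigenvalues.
    dn    : average (sphericalized) noise eigenvalue.
    x     : new snapshot vector.
    alpha : fading factor.
    """
    n, r = Us.shape
    beta = Us.conj().T @ x                         # projection onto the signal subspace
    resid = x - Us @ beta                          # part of x lying in the noise subspace
    gn = np.linalg.norm(resid)
    un = resid / gn                                # single "active" noise direction, Eq. (66.31)
    g = beta / np.abs(beta)                        # diagonal unitary G that makes the core real
    z = np.concatenate([np.abs(beta), [gn]])       # real vector [gamma_s; |gamma_n|]

    # Deflated (r+1) x (r+1) core eigenproblem, cf. Eq. (66.23), with (1 - alpha) restored.
    E = alpha * np.diag(np.concatenate([Ds, [dn]])) + (1.0 - alpha) * np.outer(z, z)
    lam, Q = np.linalg.eigh(E)                     # eigenvalues in ascending order

    W = np.hstack([Us * g, un[:, None]]) @ Q       # [U_s G_s, u_n] Q_{r+1}, cf. Eq. (66.33)
    Us_new = W[:, 1:]                              # keep the r dominant eigenvectors
    Ds_new = lam[1:]
    # Re-sphericalize the noise level before the next update.
    dn_new = ((n - r - 1) * alpha * dn + lam[0]) / (n - r)
    return Us_new, Ds_new, dn_new
```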
66.3.7 Initialization of Subspace and Eigen Tracking Algorithms

It is impossible to give generic initialization requirements that would apply to all subspace tracking algorithms, but one feature that is common to many updating methods is a fading factor. For cold start initialization (e.g., starting from nothing) at k = 0, initial convergence can often be sped up by ramping up the fading factor, e.g.,

    \alpha_k = \left( 1 - \frac{1}{k + 1} \right) \alpha,  \qquad k = 0, 1, 2, \ldots        (66.34)

where alpha is the final steady state value of the fading factor.

66.3.8 Detection Schemes for Subspace Tracking

Several subspace tracking methods have detection schemes that were specifically designed for them. Xu and Kailath developed a strongly consistent detection scheme for their Lanczos-based method [42]. DeGroat and Dowling adapted information theoretic criteria for use with SA4 [9], and an asymptotic proof of consistency is given in [11]. Stewart proposed the URV update as a rank revealing method [39]. Bin Yang proposed that the eigenvalue estimates from PASTd be used for information theoretic-based rank estimation [43].

66.4 Summary of Subspace Tracking Methods Developed Since 1990

66.4.1 Modified Eigen Problems

An O(n^2 r) fast subspace decomposition (FSD) method based on the Lanczos algorithm, together with a strongly consistent source detection scheme, was introduced by Xu and Kailath [42].

A transposed QR (TQR) iteration-based SVD update was introduced in [12]. To reduce computation to O(nr^2), the noise subspace is sphericalized and deflated. Based on various performance tests, one or two TQR iterations per update yield results that are comparable to the fully converged SVD. Moreover, because the diagonalization process takes place on a triangular factor, the partially converged, deflated TQR-SVD update is very similar to a deflated URV update [34].

DeGroat and Roberts [10] simplified Bunch's rank one eigen update [4] and proposed a partial orthogonalization scheme, called pairwise Gram Schmidt (PGS), to stabilize the eigenvectors. Together with exponential fading to stabilize the eigenvalues, the buildup of roundoff error is essentially controlled and machine precision orthogonality is maintained. For a more complete discussion, see Section 66.3.2, Controlling Roundoff Error Accumulation and Orthogonality Errors. Recently, Gu and Eisenstat [20] presented an improved version of Bunch's rank one EVD update; the new algorithm contains a more stable way to compute the eigenvectors. DeGroat and Dowling have also developed a family of sphericalized EVD and SVD updates (see Section 66.3.6, Spherical Subspace (SS) Updating: A General Framework for Simplified Updating).

66.4.2 Gradient-Based Eigen Tracking

Jar-Ferr Yang and Hui-Ju Lin [45] proposed a generalized inflation method which extends the gradient-based work of Yang and Kaveh [44]. An O(nr^2) noise sphericalized and deflated conjugate gradient-based eigen tracking method is presented by Fu and Dowling in [18]; this method can be described as an SS update with a conjugate gradient-based eigen tracker at the core.

Bin Yang [43] introduced a projection approximation approach that uses RLS techniques to update the signal subspace. The projection approximation subspace tracker (PAST) algorithm computes an arbitrary basis for the signal subspace in 3nr + O(r^2) flops per update. The PASTd algorithm (which uses deflation to track the individual eigenvalues and vectors of the signal subspace) requires 4nr + O(n) flops per update. Both methods produce eigenvector estimates that are only approximately orthogonal.
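To illustrate the deflation idea behind PASTd, the sketch below shows the recursion in its commonly quoted form (variable names are illustrative; see [43] for the derivation and the exact algorithm). Each eigencomponent is updated in turn, and its contribution is removed from the snapshot before the next component is processed.

```python
import numpy as np

def pastd_update(W, d, x, beta=0.97):
    """One PASTd-style update (deflation-based PAST); illustrative sketch only.

    W    : n x r matrix whose columns are the current eigenvector estimates.
    d    : length-r array of associated (exponentially weighted) eigenvalue estimates.
    x    : new snapshot vector.
    beta : forgetting factor.
    """
    x_i = x.copy()
    for i in range(W.shape[1]):
        w = W[:, i]
        y = w.conj() @ x_i                                   # project deflated snapshot onto w_i
        d[i] = beta * d[i] + np.abs(y) ** 2                  # recursive power (eigenvalue) estimate
        W[:, i] = w + (x_i - w * y) * (np.conj(y) / d[i])    # RLS-like eigenvector update
        x_i = x_i - W[:, i] * y                              # deflate: remove the tracked component
    return W, d
```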
Regalia and Loubaton [36] use an adaptive lossless transfer matrix (a multivariable lattice filter) excited by the sensor output to achieve a condition of maximum "power splitting" between two groups of output bins. The update equations resemble standard gradient descent algorithms, but they do not properly follow the gradient of the error surface. Nonetheless, the convergence speed may be a strong function of the source spectral and spatial characteristics.

Recently, Marcos and Benidir [26] introduced an adaptive subspace-based method that relies on a linear operator, referred to as the Propagator, which exploits the linear independence of the source steering vectors and which allows the determination of the signal and noise subspaces without any eigendecomposition of the correlation matrix. Two gradient-based adaptive algorithms are proposed for the estimation of the Propagator, and then the basis of the signal subspace. The overall computational complexity of the adaptive Propagator subspace update is O(nr^2).

A family of three perturbation-based EVD tracking methods (denoted PA, PB, and PC) is presented by Champagne [5]. Each method uses perturbation-based approximations to track the eigencomponents. Progressively more simplifications are used to reduce the complexity from (1/2)n^3 + O(n^2) for PA, to (1/2)nr^2 + O(nr) for PB, to 5nr + O(n) for PC. Both the PB and PC methods use a sphericalized noise subspace to reduce computation; thus, PB and PC can be viewed as SS updates that use perturbation-based approximations to track the deflated core eigenproblem. The PC method achieves greater computational simplifications by assuming well-separated eigenvalues, and some special decompositions are also used to reduce the computation of the PC algorithm. Surprisingly, simulations seem to indicate that the PC method achieves good overall performance even when the eigenvalues are not well separated, and convergence rates are also very good for the PC method. However, we have noticed that occasionally spurious frequency estimates may be obtained with PC-based MUSIC. Ironically, the PC estimated subspaces tend to be closer to the true subspaces than those of other subspace tracking methods that do not exhibit occasional spurious frequency estimates. Because PC only tracks approximations of the eigencomponents, the orthogonality error is typically much greater than machine precision orthogonality. Nevertheless, partial orthogonalization schemes can be used to improve orthogonality and other measures of performance (see Section 66.3.2, Controlling Roundoff Error Accumulation and Orthogonality Errors).

Artificial neural networks (ANN) have been developed to find eigen information, e.g., see [27] and [35] as well as the references contained therein. An ANN consists of many richly interconnected, simple, and similar processing elements (called artificial neurons) operating in parallel. High computational rates (due to massive parallelism) and robustness (due to local neural connectivity) are two important features of ANNs. Most of the eigenvector estimating ANNs appear under the topic of principal component analysis (PCA); the principal eigenvectors are defined as the eigenvectors associated with the largest eigenvalues.
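As a simple example of this class, Oja's single-neuron Hebbian rule mentioned in Section 66.2.5 (and in [31]) adapts a weight vector toward the first principal eigenvector. The sketch below (for real-valued data, with an illustrative step size) is one common statement of the rule, not a specific algorithm from this chapter.

```python
import numpy as np

def oja_update(w, x, eta=0.01):
    """One step of Oja's Hebbian learning rule for the first principal eigenvector (sketch)."""
    y = w @ x                           # neuron output
    return w + eta * y * (x - y * w)    # Hebbian term with built-in normalization

# Iterating oja_update over snapshots drives w toward the dominant eigenvector
# of the snapshot correlation matrix for a sufficiently small step size eta.
```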
66.4.3 The URV and Rank Revealing QR (RRQR) Updates

The URV update [39] is based on the URV decomposition (URVD) developed by G. W. Stewart as a two sided generalization of the RRQR methods. The URVD can also be viewed as a generalization of the SVD because U and V are orthogonal matrices and R is an upper triangular matrix; clearly, the SVD is a special case of the URVD. If X = U R V^H is the URVD of X, then the R factor can be rank revealing in the sense that the Euclidean norm of the n - r rightmost columns of R is approximately equal to the Euclidean norm of the n - r smallest singular values of X. Also, the smallest singular value of the first r columns of R is approximately equal to the rth singular value of X. These two conditions effectively partition the corresponding columns of U and V into an r-dimensional dominant subspace and an (n - r)-dimensional subdominant subspace that can be used as estimates for the signal and noise subspace spans. The URV update is O(n^2) per update. An RRQR update (that is usually O(n^2) per update) is developed by Bischof and Shroff in [1]. RRQR methods that use the traditional pivoting strategy to maintain a rank revealing structure involve O(n^3) flops per update. An analysis of problems associated with RRQR methods, along with a fairly extensive reference list on RRQR methods, can be found in [6]. An O(nr) deflated URV update is presented by Rabideau in [34] (and in the references contained therein).

66.4.4 Miscellaneous Methods

Strobach [40] recently introduced a family of low rank or eigensubspace adaptive filters based on orthogonal iteration. The computational complexity ranges from O(nr^2) to O(nr). A family of Subspace methods Without EigenDEcomposition (SWEDE) has been proposed by Eriksson et al. [16]. With SWEDE, portions of the correlation matrix must be updated for each new snapshot vector at a cost of approximately 12nr flops. However, the subspace basis (which is computed from the correlation matrix partitions) need only be computed every time a DOA estimate is needed. Computing the subspace estimate is O(nr^2), so if the subspace is computed every kth update, the overall complexity is O(nr^2/k) + 12nr per update. At high SNR, SWEDE performs almost as well as eigen-based MUSIC.

Key References: As previously mentioned, Comon and Golub did a nice survey of SVD tracking methods in 1990 [7]. In 1995, Reddy et al. published a selected overview of eigensubspace estimation methods, including ANN approaches [35]. For a study of URV and RRQR methods, see [6]. Partial orthogonalization schemes are studied in [28]. Finally, a special issue of Signal Processing [41] is planned for April 1996, featuring Subspace Methods for Detection and Estimation.

TABLE 66.1  Efficient Subspace Tracking Methods Developed Since 1990

Complexity | Subspace or eigen tracking method                                  | Orthog. span
-----------|--------------------------------------------------------------------|-------------
O(n^2 r)   | Fast subspace decomposition (FSD) [42]                              | Yes(a)
O(n^2)     | URV update [39]                                                     | Yes
           | Rank revealing QR [possibly O(n^3)] [1]                             | Yes
           | Approximate SVD updates [17, 30]                                    | Yes(a)
           | Neural network based updates [35]                                   | No(a)
O(nr^2)    | Stabilized signal eigenstructure (SE) update(b) [8, 10]             | Yes(a)
           | Sphericalized transposed QR SVD update(b) [12]                      | Yes(a)
           | Sphericalized conjugate gradient SVD update(b) [18]                 | Yes(a)
           | SWEDE [16]                                                          | No
           | Gradient-based EVD updates with Gram Schmidt orthog. [44, 45]       | Yes(a)
O(nr)      | Signal averaged 2-level (SA2) update(b) [8]                         | Yes
           | Signal averaged 4-level (SA4) update(b) [9, 11]                     | Yes(a)
           | Projection approximation subspace tracking (PAST) [43]              | No
           | PAST with deflation (PASTd) [43]                                    | No(a)
           | Sphericalized perturbation based eigen update (PC method)(b) [5]    | Yes(a)
           | Sphericalized URV update(b) [34]                                    | Yes

Key: n = no. of sensors, r = rank of subspace.
(a) Tracks individual eigencomponents.  (b) Uses sphericalized subspaces.

References

[1] Bischof, C.H. and Shroff, G.M., On updating signal subspaces, IEEE Trans. Sig. Proc., 40(1), 96-105, Jan. 1992.
[2] Brockett, R.W., Dynamical systems that sort lists, diagonalize matrices and solve linear programming problems, Proc. of the 27th Conf. on Decision and Control, 799-803, 1988.
[3] Bunch, J.R. and Nielsen, C.P., Updating the singular value decomposition, Numer. Math., 31, 111-129, 1978.
[4] Bunch, J.R., Nielsen, C.P. and Sorensen, D.C., Rank-one modification of the symmetric eigenproblem, Numer. Math., 31, 31-48, 1978.
[5] Champagne, B., Adaptive eigendecomposition of data covariance matrices based on first-order perturbations, IEEE Trans. Sig. Proc., SP-42(10), 2758-2770, Oct. 1994.
[6] Chandrasekaran, S. and Ipsen, I.C.F., On rank-revealing factorisations, SIAM J. Matrix Anal. Appl., 15(2), 592-622, April 1994.
[7] Comon, P. and Golub, G.H., Tracking a few extreme singular values and vectors in signal processing, Proc. IEEE, 78(8), 1327-1343, Aug. 1990.
[8] DeGroat, R.D., Non-iterative subspace tracking, IEEE Trans. Sig. Proc., SP-40(3), 571-577, Mar. 1992.
[9] DeGroat, R.D. and Dowling, E.M., Spherical subspace tracking: analysis, convergence and detection schemes, in 26th Annual Asilomar Conf. on Signals, Systems, and Computers (invited paper), Oct. 1992, 561-565.
[10] DeGroat, R.D. and Roberts, R.A., Efficient, numerically stabilized rank-one eigenstructure updating, IEEE Trans. ASSP, ASSP-38(2), 301-316, Feb. 1990.
[11] Dowling, E.M., DeGroat, R.D., Linebarger, D.A. and Ye, H., Sphericalized SVD updating for subspace tracking, in Moonen, M. and De Moor, B., Eds., SVD and Signal Processing III: Algorithms, Applications and Architectures, Elsevier, 1995, 227-234.
[12] Dowling, E.M., Ammann, L.P. and DeGroat, R.D., A TQR-iteration based SVD for real time angle and frequency tracking, IEEE Trans. Sig. Proc., 914-925, April 1994.
[13] Dowling, E.M. and DeGroat, R.D., Adaptation dynamics of the spherical subspace tracker, IEEE Trans. Sig. Proc., 2599-2602, Oct. 1992.
[14] Dowling, E.M., DeGroat, R.D. and Linebarger, D.A., Efficient, high performance subspace based tracking problems, in Adv. Sig. Proc. Algs., Archs. and Appls. VI, SPIE 2563, 253-264, 1995.
[15] Edelman, A. and Stewart, G.W., Scaling for orthogonality, IEEE Trans. Sig. Proc., SP-41(4), 1676-1677, Apr. 1993.
[16] Eriksson, A., Stoica, P. and Soderstrom, T., On-line subspace algorithms for tracking moving sources, IEEE Trans. Sig. Proc., 42(9), 2319-2330, Sept. 1994.
[17] Ferzali, W. and Proakis, J.G., Adaptive SVD algorithm and applications, in SVD and Signal Processing II, Elsevier, 1992, 14-21.
[18] Fu, Z. and Dowling, E.M., Conjugate gradient eigenstructure tracking for adaptive spectral estimation, IEEE Trans. Sig. Proc., 43(5), 1151-1160, May 1995.
[19] Golub, G.H. and VanLoan, C.F., Some modified matrix eigenvalue problems, SIAM Review, 15, 318-334, 1973.
[20] Gu, M. and Eisenstat, S.C., A stable and efficient algorithm for the rank-one modification of the symmetric eigenproblem, SIAM J. Matrix Anal. Appl., 15(4), 1266-1276, Oct. 1994.
[21] Karasalo, I., Estimating the covariance matrix by signal subspace averaging, IEEE Trans. ASSP, ASSP-34(1), 8-12, Feb. 1986.
[22] Karhunen, J., Adaptive algorithms for estimating eigenvectors of correlation type matrices, in ICASSP-84, 14.6.1-14.6.4, 1984.
[23] Linebarger, D.A., DeGroat, R.D., Dowling, E.M., Stoica, P. and Fudge, G., Incorporating a priori information into MUSIC: algorithms and analysis, Signal Processing, 46(1), 85-104, 1995.
[24] Linebarger, D.A., DeGroat, R.D. and Dowling, E.M., Efficient direction finding methods employing forward/backward averaging, IEEE Trans. Sig. Proc., 42(8), 2136-2145, Aug. 1994.
[25] Ljung, L., Analysis of recursive stochastic algorithms, IEEE Trans. on Automatic Control, AC-22(4), 551-575, Aug. 1977.
[26] Marcos, S. and Benidir, M., An adaptive subspace algorithm for direction finding and tracking, in Adv. Sig. Proc. Algs., Archs. and Appls. VI, SPIE 2563, 230-241, 1995.
[27] Mathew, G. and Reddy, V.U., Orthogonal eigensubspace estimation using neural networks, IEEE Trans. Sig. Proc., 42, 1803-1811, July 1994.
[28] Mathias, R., Analysis of algorithms for orthogonalizing products of unitary matrices, J. Numerical Linear Algebra with Applic., 3(2), 125-145, 1996.
[29] Moonen, M., VanDooren, P. and Vanderwalle, J., A note on efficient, numerically stabilized rank-one eigenstructure updating, IEEE Trans. Sig. Proc., SP-39(8), 1913-1914, Aug. 1991.
[30] Moonen, M., VanDooren, P. and Vanderwalle, J., A singular value decomposition updating algorithm for subspace tracking, SIAM J. Matrix Anal. Appl., 13(4), 1015-1038, Oct. 1992.
[31] Oja, E., A simplified neuron model as a principal component analyzer, J. Math. Biol., 15, 267-273, 1982.
[32] Owsley, N.L., Adaptive data orthogonalization, ICASSP, 109-112, 1978.
[33] Parlett, B.N., The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, NJ, 1980.
[34] Rabideau, D.J., Subspace invariance: The RO-FST and TQR-SVD adaptive subspace tracking algorithms, IEEE Trans. Sig. Proc., SP-43, 2016-2018, Aug. 1995.
[35] Reddy, V.U., Mathew, G. and Paulraj, A., Some algorithms for eigensubspace estimation, Digital Signal Processing, 5, 97-115, 1995.
[36] Regalia, P.A. and Loubaton, P., Rational subspace estimation using adaptive lossless filters, IEEE Trans. Sig. Proc., 40, 2392-2405, Oct. 1992.
[37] Schreiber, R., Implementation of adaptive array algorithms, IEEE Trans. ASSP, ASSP-34, 1038-1045, Oct. 1986.
[38] Stewart, G.W., Introduction to Matrix Computations, Academic Press, New York, 1973.
[39] Stewart, G.W., An updating algorithm for subspace tracking, IEEE Trans. Sig. Proc., SP-40(6), 1535-1541, June 1992.
[40] Strobach, P., Fast recursive eigensubspace adaptive filters, in International Conference on Acoustics, Speech and Signal Processing, 1416-1419, 1995.
[41] Viberg, M. and Stoica, P., Eds., Signal Processing, 50(1-2), Special Issue on Subspace Methods for Detection and Estimation, April 1996.
[42] Xu, G., Zha, H., Golub, G. and Kailath, T., Fast and robust algorithms for updating signal subspaces, IEEE Trans. CAS, 41(6), 537-549, June 1994.
[43] Yang, B., Projection approximation subspace tracking, IEEE Trans. Sig. Proc., SP-43(1), 95-107, Jan. 1995.
[44] Yang, J.F. and Kaveh, M., Adaptive eigensubspace algorithms for direction or frequency estimation and tracking, IEEE Trans. ASSP, ASSP-36(2), 241-251, Feb. 1988.
[45] Yang, J.-F. and Lin, H.-J., Adaptive high-resolution algorithms for tracking nonstationary sources without the estimation of source number, IEEE Trans. Sig. Proc., 42(3), 563-571, Mar. 1994.
[46] Yu, K.B., Recursive updating the eigenvalue decomposition of a covariance matrix, IEEE Trans. Sig. Proc., SP-39(5), 1136-1145, May 1991.
