19 Convolutive Mixtures and Blind Deconvolution

Independent Component Analysis. Aapo Hyvärinen, Juha Karhunen, Erkki Oja. Copyright 2001 John Wiley & Sons, Inc. ISBNs: 0-471-40540-X (Hardback); 0-471-22131-7 (Electronic)

This chapter deals with blind deconvolution and blind separation of convolutive mixtures.

Blind deconvolution is a signal processing problem that is closely related to basic independent component analysis (ICA) and blind source separation (BSS). In communications and related areas, blind deconvolution is often called blind equalization. In blind deconvolution, we have only one observed signal (output) and one source signal (input). The observed signal consists of an unknown source signal mixed with itself at different time delays. The task is to estimate the source signal from the observed signal only, without knowing the convolving system, the time delays, or the mixing coefficients.

Blind separation of convolutive mixtures considers the combined blind deconvolution and instantaneous blind source separation problem. This estimation task appears under many different names in the literature: ICA with convolutive mixtures, multichannel blind deconvolution or identification, convolutive signal separation, and blind identification of multiple-input-multiple-output (MIMO) systems. In blind separation of convolutive mixtures, there are several source (input) signals and several observed (output) signals, just as in the instantaneous ICA problem. However, the source signals have different time delays in each observed signal due to the finite propagation speed in the medium. Each observed signal may also contain time-delayed versions of the same source due to multipath propagation, typically caused by reverberations from obstacles. Figure 23.3 in Chapter 23 shows an example of multipath propagation in mobile communications.

In the following, we first consider the simpler blind deconvolution problem, and after that separation of convolutive mixtures. Many techniques for convolutive mixtures have in fact been developed by extending methods designed originally for either the blind deconvolution or standard ICA/BSS problems. In the appendix, certain basic concepts of discrete-time filters needed in this chapter are briefly reviewed.

19.1 BLIND DECONVOLUTION

19.1.1 Problem definition

In blind deconvolution [170, 171, 174, 315], it is assumed that the observed discrete-time signal $x(t)$ is generated from an unknown source signal $s(t)$ by the convolution model

$$x(t) = \sum_{k=-\infty}^{\infty} a_k s(t-k) \qquad (19.1)$$

Thus, delayed versions of the source signal are mixed together. This situation appears in many practical applications, for example, in communications and geophysics.

In blind deconvolution, both the source signal $s(t)$ and the convolution coefficients $a_k$ are unknown. Observing $x(t)$ only, we want to estimate the source signal $s(t)$. In other words, we want to find a deconvolution filter

$$y(t) = \sum_{k=-\infty}^{\infty} h_k x(t-k) \qquad (19.2)$$

which provides a good estimate of the source signal $s(t)$ at each time instant. This is achieved by choosing the coefficients $h_k$ of the deconvolution filter suitably. In practice, the deconvolving finite impulse response (FIR) filter (see the Appendix for definition) in Eq. (19.2) is assumed to be of sufficient but finite length. Other structures are possible, but this one is the standard choice.

To estimate the deconvolving filter, certain assumptions on the source signal $s(t)$ must be made. Usually it is assumed that the source signal values $s(t)$ at different times $t$ are nongaussian, statistically independent, and identically distributed (i.i.d.). The probability distribution of the source signal $s(t)$ may be known or unknown. The indeterminacies remaining in the blind deconvolution problem are that the estimated source signal may have an arbitrary scaling (and sign) and time shift compared with the true one.
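
The model (19.1), the FIR deconvolution filter (19.2), and the scaling and time-shift indeterminacy can be illustrated with a small numerical sketch. It is not blind: the two-tap channel `a`, the 30-tap truncation, and the scale and delay used for `h_alt` are all assumptions chosen for illustration, and the inverse filter is computed from the known channel rather than estimated.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000
s = rng.uniform(-1.0, 1.0, size=T)        # i.i.d. nongaussian source s(t)

a = np.array([1.0, 0.5])                  # hypothetical channel coefficients a_0, a_1
x = np.convolve(s, a)[:T]                 # model (19.1) with a finite causal filter

# Deconvolution filter (19.2), truncated to 30 taps. This toy channel is known
# and minimum phase, so its inverse has coefficients h_k = (-0.5)^k for k >= 0.
h = (-0.5) ** np.arange(30)
y = np.convolve(x, h)[:T]

# A scaled and delayed filter recovers the source equally well in the blind
# setting, where neither the scale nor the delay can be determined.
h_alt = -2.0 * np.concatenate([np.zeros(3), h])
y_alt = np.convolve(x, h_alt)[:T]

err = np.max(np.abs(y - s))                          # y(t) against s(t)
err_alt = np.max(np.abs(y_alt[3:] + 2.0 * s[:-3]))   # y_alt(t) against -2 s(t - 3)
print(err, err_alt)
```

Both outputs reproduce the source up to truncation error; from the observed signal alone nothing distinguishes `y` from `y_alt`, which is exactly the scaling and time-shift indeterminacy.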
This indeterminacy in scaling and time shift is similar to the permutation and sign indeterminacy encountered in ICA; the two models are, in fact, intimately related, as will be explained in Section 19.1.4.

Of course, the preceding ideal model usually does not hold exactly in practice. There is often additive noise present, though we have omitted noise from the model (19.1) for simplicity. The source signal sequence may not satisfy the i.i.d. condition, and its distribution is often unknown, or we may only know that the source signal is subgaussian or supergaussian. Hence blind deconvolution is often a difficult signal processing task that can be solved only approximately in practice.

If the linear time-invariant system (19.1) is minimum phase (see the Appendix), then the blind deconvolution problem can be solved in a straightforward way. Under the above assumptions, the deconvolving filter is simply a whitening filter that temporally whitens the observed signal sequence $\{x(t)\}$ [171, 174]. However, in many applications, for example in telecommunications, the system is typically nonminimum phase [174], and this simple solution cannot be used.

We shall next discuss some popular approaches to blind deconvolution. Blind deconvolution is frequently needed in communications applications where it is convenient to use complex-valued data. Therefore we present most methods for this general case. The respective algorithms for real data are obtained as special cases. Methods for estimating the ICA model with complex-valued data are discussed later in Section 20.3.

19.1.2 Bussgang methods

Bussgang methods [39, 171, 174, 315] include some of the earliest algorithms [152, 392] proposed for blind deconvolution, but they are still widely used. In Bussgang methods, a noncausal FIR filter structure

$$y(t) = \sum_{k=-L}^{L} w_k^*(t)\, x(t-k) \qquad (19.3)$$

of length $2L+1$ is used. Here $^*$ denotes the complex conjugate.
The weights $w_k(t)$ of the FIR filter depend on the time $t$, and they are adapted using the least-mean-square (LMS) type algorithm [171]

$$w_k(t+1) = w_k(t) + \mu\, x(t-k)\, e^*(t), \quad k = -L, \ldots, L \qquad (19.4)$$

where the error signal is defined by

$$e(t) = g(y(t)) - y(t) \qquad (19.5)$$

In these equations, $\mu$ is a positive learning parameter, $y(t)$ is given by (19.3), and $g(\cdot)$ is a suitably chosen nonlinearity. It is applied separately to the real and imaginary components of $y(t)$. The algorithm is initialized by setting $w_0(0) = 1$ and $w_k(0) = 0$ for $k \neq 0$.

Assume that the filter length $2L+1$ is large enough and the learning algorithm has converged. It can be shown that the following condition then holds for the output $y(t)$ of the FIR filter (19.3):

$$E\{y(t)\, y(t-k)\} \approx E\{y(t)\, g(y(t-k))\} \qquad (19.6)$$

A stochastic process that satisfies the condition (19.6) is called a Bussgang process.

The nonlinearity $g(\cdot)$ can be chosen in several ways, leading to different Bussgang type algorithms [39, 171]. The Godard algorithm [152] is the best performing Bussgang algorithm in the sense that it is robust and has the smallest mean-square error after convergence; see [171] for details. The Godard algorithm minimizes the nonconvex cost function

$$J_p(t) = E\{[\,|y(t)|^p - \gamma_p\,]^2\} \qquad (19.7)$$

where $p$ is a positive integer and $\gamma_p$ is a positive real constant defined by the statistics of the source signal:

$$\gamma_p = \frac{E\{|s(t)|^{2p}\}}{E\{|s(t)|^p\}} \qquad (19.8)$$

The constant $\gamma_p$ is chosen in such a way that the gradient of the cost function $J_p(t)$ is zero when perfect deconvolution is attained, that is, when $y(t) = s(t)$. The error signal (19.5) in the gradient algorithm (19.4) for minimizing the cost function (19.7) with respect to the weight $w_k(t)$ has the form

$$e(t) = y(t)\,|y(t)|^{p-2}\,[\,\gamma_p - |y(t)|^p\,] \qquad (19.9)$$

In computing $e(t)$, the expectation in (19.7) has been omitted to obtain a simpler stochastic gradient type algorithm.
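
The adaptation loop defined by (19.3), (19.4), and the error signal (19.9) can be sketched as follows for the special case $p = 2$. This is a minimal illustration under stated assumptions, not a reference implementation: the binary (BPSK-like) source, the three-tap channel `a`, the half-length `L`, the step size `mu`, and the number of samples are all hypothetical choices, and $\gamma_2$ is computed from the simulated source, whereas in practice it would be known from the signal constellation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, L, mu = 20000, 5, 0.002                 # hypothetical sample count, half-length, step size
s = rng.choice([-1.0, 1.0], size=T)        # i.i.d. subgaussian source with constant modulus
a = np.array([1.0, 0.3, -0.1])             # hypothetical channel
x = np.convolve(s, a)[:T]                  # observed signal, model (19.1)

gamma = np.mean(np.abs(s) ** 4) / np.mean(np.abs(s) ** 2)   # (19.8) with p = 2

w = np.zeros(2 * L + 1, dtype=complex)     # w[i] holds w_k with k = i - L
w[L] = 1.0                                 # initialization: w_0(0) = 1, others 0
y = np.zeros(T, dtype=complex)

for t in range(L, T - L):
    window = x[t - L:t + L + 1][::-1]      # x(t - k) for k = -L, ..., L
    y[t] = np.vdot(w, window)              # (19.3): sum_k conj(w_k) x(t - k)
    e = y[t] * (gamma - np.abs(y[t]) ** 2) # (19.9) with p = 2
    w = w + mu * window * np.conj(e)       # (19.4)

def dispersion(v):
    """Sample version of the cost (19.7) for p = 2."""
    return np.mean((np.abs(v) ** 2 - gamma) ** 2)

tail = slice(T - L - 2000, T - L)          # last 2000 adapted samples
disp_before = dispersion(x[tail])
disp_after = dispersion(y[tail])
print(disp_before, disp_after)
```

The printed values compare the sample cost of the raw observation with that of the filter output over the last 2000 adapted samples; a clear drop indicates that the modulus of the output has been restored.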
The nonlinearity $g(y(t))$ corresponding to the error signal (19.9) is given by [171]

$$g(y(t)) = y(t) + y(t)\,|y(t)|^{p-2}\,[\,\gamma_p - |y(t)|^p\,] \qquad (19.10)$$

Among the family of Godard algorithms, the so-called constant modulus algorithm (CMA) is widely used. It is obtained by setting $p = 2$ in the above formulas. The cost function (19.7) is then related to the minimization of the kurtosis. The CMA, and more generally the Godard algorithms, perform appropriately for subgaussian sources only, but in communications applications the source signals are subgaussian (see footnote 1). The CMA is the most successful blind equalization (deconvolution) algorithm used in communications due to its low complexity, good performance, and robustness [315].

Footnote 1: The CMA can be applied to blind deconvolution of supergaussian sources by using a negative learning parameter $\mu$ in (19.4); see [11].

Properties of the CMA cost function and algorithm have been studied thoroughly in [224]. The constant modulus property possessed by many types of communications signals has also been exploited in developing efficient algebraic blind equalization and source separation algorithms [441]. A good general review of Bussgang type blind deconvolution methods is [39].

19.1.3 Cumulant-based methods

Another popular group of blind deconvolution methods consists of cumulant-based approaches [315, 170, 174, 171]. They apply higher-order statistics of the observed signal $x(t)$ explicitly, while in the Bussgang methods higher-order statistics enter the estimation process implicitly via the nonlinear function $g(\cdot)$. Cumulants have been defined and discussed briefly in Chapter 2.

Shalvi and Weinstein [398] have derived necessary and sufficient conditions and a set of cumulant-based criteria for blind deconvolution. In particular, they introduced a stochastic gradient algorithm for maximizing a constrained kurtosis based criterion.
We shall next describe this algorithm briefly, because it is computationally simple, converges globally, and can be applied equally well to both subgaussian and supergaussian source signals $s(t)$.

Assume that the source (input) signal $s(t)$ is complex-valued and symmetric, satisfying the condition $E\{s(t)^2\} = 0$. Assume that the length of the causal FIR deconvolution filter is $M$. The output $z(t)$ of this filter at discrete time $t$ can then be expressed compactly as the inner product

$$z(t) = \mathbf{w}^T(t)\,\mathbf{y}(t) \qquad (19.11)$$

where the $M$-dimensional filter weight vector $\mathbf{w}(t)$ and output vector $\mathbf{y}(t)$ at time $t$ are respectively defined by

$$\mathbf{y}(t) = [y(t), y(t-1), \ldots, y(t-M+1)]^T \qquad (19.12)$$

$$\mathbf{w}(t) = [w(t), w(t-1), \ldots, w(t-M+1)]^T \qquad (19.13)$$

Shalvi and Weinstein's algorithm is then given by [398, 351]

$$\mathbf{u}(t+1) = \mathbf{u}(t) + \mu\,\mathrm{sign}(\kappa_s)\,[\,|z(t)|^2 z(t)\,]\,\mathbf{y}^*(t)$$
$$\mathbf{w}(t+1) = \mathbf{u}(t+1) / \|\mathbf{u}(t+1)\| \qquad (19.14)$$

Here $\kappa_s$ is the kurtosis of $s(t)$, $\|\cdot\|$ is the usual Euclidean norm, and the unnormalized filter weight vector $\mathbf{u}(t)$ is defined quite similarly to $\mathbf{w}(t)$ in (19.13).

It is important to notice that Shalvi and Weinstein's algorithm (19.14) requires whitening of the output signal $y(t)$ to perform appropriately (assuming that $s(t)$ is white, too). For a single complex-valued signal sequence (time series) $\{y(t)\}$, the temporal whiteness condition is

$$E\{y(t)\,y^*(k)\} = \sigma_y^2\,\delta_{tk} = \begin{cases} \sigma_y^2, & t = k \\ 0, & t \neq k \end{cases} \qquad (19.15)$$

where the variance of $y(t)$ is often normalized to unity: $\sigma_y^2 = 1$. Temporal whitening can be achieved by spectral prewhitening in the Fourier domain, or by using time-domain techniques such as linear prediction [351]. Linear prediction techniques have been discussed, for example, in the books [169, 171, 419].

Shalvi and Weinstein have presented a somewhat more complicated algorithm for the case $E\{s(t)^2\} \neq 0$ in [398]. Furthermore, they showed that there exists a close relationship between their algorithm and the CMA discussed in the previous subsection; see also [351].
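
The temporal whiteness condition (19.15) and the idea of spectral prewhitening can be checked numerically. The sketch below is one crude variant, assumed here for illustration only: it divides the discrete Fourier transform of the signal by its own magnitude, which flattens the sample spectrum exactly but would normally be replaced by a smoothed spectrum estimate. For a stationary sequence, (19.15) is equivalent to the lag-$k$ autocorrelation vanishing for all $k \neq 0$, which is what `autocorr` estimates. The function names and the test signal are hypothetical.

```python
import numpy as np

def autocorr(y, max_lag):
    """Sample estimate of E{y(t) y*(t - k)} for lags k = 0, ..., max_lag."""
    T = len(y)
    return np.array([np.mean(y[k:] * np.conj(y[:T - k])) for k in range(max_lag + 1)])

def spectral_prewhiten(x):
    """Flatten the magnitude spectrum of x, keep its phase, and scale to unit variance."""
    X = np.fft.fft(x)
    mag = np.abs(X)
    mag[mag == 0.0] = 1.0                       # guard against division by zero
    return np.fft.ifft(X / mag) * np.sqrt(len(x))

rng = np.random.default_rng(0)
N = 4096
s = rng.uniform(-1.0, 1.0, size=N)
x = np.convolve(s, [1.0, 0.8])[:N]              # temporally correlated test signal
y = spectral_prewhiten(x)

r_x = autocorr(x, 10)
r_y = autocorr(y, 10)
print(np.abs(r_x[1] / r_x[0]), np.max(np.abs(r_y[1:])))
```

The first printed number is the normalized lag-1 correlation of the coloured signal; the second is the largest nonzero-lag correlation after whitening, which should be close to zero.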
Later, they derived fast converging but more involved super-exponential algorithms for blind deconvolution in [399]. Shalvi and Weinstein have reviewed their blind deconvolution methods in [170]. Closely related algorithms were proposed earlier in [114, 457].

It is interesting to note that Shalvi and Weinstein's algorithm (19.14) can be derived by maximizing the absolute value of the kurtosis of the filtered (deconvolved) signal $z(t)$ under the constraint that the output signal $y(t)$ is temporally white [398, 351]. The temporal whiteness condition leads to the normalization constraint of the weight vector $\mathbf{w}(t)$ in (19.14). The corresponding criterion for standard ICA is already familiar from Chapter 8, where gradient algorithms similar to (19.14) have been discussed. Shalvi and Weinstein's super-exponential algorithm [399] is also very similar to the cumulant-based FastICA introduced in Section 8.2.3. The connection between blind deconvolution and ICA is discussed in more detail in the next subsection.

Instead of cumulants, one can resort to higher-order spectra or polyspectra [319, 318]. They are defined as Fourier transforms of the cumulants, in the same way as the power spectrum is defined as a Fourier transform of the autocorrelation function (see Section 2.8.5). Polyspectra provide a basis for blind deconvolution, and more generally for identification of nonminimum-phase systems, because they preserve phase information of the observed signal. However, blind deconvolution methods based on higher-order spectra tend to be computationally more complex than Bussgang methods, and they converge slowly [171]. Therefore, we shall not discuss them here. The interested reader can find more information on those methods in [170, 171, 315].

19.1.4 Blind deconvolution using linear ICA

In defining the blind deconvolution problem, the values of the original signal $s(t)$ were assumed to be independent for different $t$ and nongaussian.
Therefore, the blind deconvolution problem is formally closely related to the standard ICA problem. In fact, one can define a vector

$$\tilde{\mathbf{s}}(t) = [s(t), s(t-1), \ldots, s(t-n+1)]^T \qquad (19.16)$$

by collecting the $n$ last values of the source signal, and similarly define

$$\tilde{\mathbf{x}}(t) = [x(t), x(t-1), \ldots, x(t-n+1)]^T \qquad (19.17)$$

Then $\tilde{\mathbf{x}}$ and $\tilde{\mathbf{s}}$ are $n$-dimensional vectors, and the convolution (19.1) can be expressed for a finite number of values of the summation index $k$ as

$$\tilde{\mathbf{x}} = \mathbf{A}\tilde{\mathbf{s}} \qquad (19.18)$$

where $\mathbf{A}$ is a matrix that contains the coefficients $a_k$ of the convolution filter as its rows, at different positions for each row. This is the classic matrix representation of a filter. The representation is not exact near the top and bottom rows, but for a sufficiently large $n$ it is good enough in practice.

From (19.18) we see that the blind deconvolution problem is actually (approximately) a special case of ICA. The components of $\tilde{\mathbf{s}}$ are independent, and the mixing is linear, so we get the standard linear ICA model.

In fact, the one-unit (deflationary) ICA algorithms in Chapter 8 can be directly used to perform blind deconvolution. As defined above, the inputs $\tilde{\mathbf{x}}(t)$ should then consist of sample sequences $x(t), x(t-1), \ldots, x(t-n+1)$ of the signal $x(t)$ to be deconvolved. Estimating just one "independent component", we obtain the original deconvolved signal $s(t)$. If several components are estimated, they correspond to translated versions of the original signal, so it is enough to estimate just one component.

19.2 BLIND SEPARATION OF CONVOLUTIVE MIXTURES

19.2.1 The convolutive BSS problem

In several practical applications of ICA, some kind of convolution takes place simultaneously with the linear mixing. For example, in the classic cocktail-party problem, or separation of speech signals recorded by a set of microphones, the speech signals do not arrive at the microphones at the same time. This is because sound travels through the air at a finite speed.
Moreover, the microphones usually record echoes of the speakers' voices caused by reverberations from the walls of the room or other obstacles. These two phenomena can be modeled in terms of convolutive mixtures. Here we have not considered noise and other complications that often appear in practice; see Section 24.2 and [429, 430].

Blind source separation of convolutive mixtures is basically a combination of standard instantaneous linear blind source separation and blind deconvolution. In the convolutive mixture model, each element of the mixing matrix $\mathbf{A}$ in the model $\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t)$ is a filter instead of a scalar. Written out for each mixture, the data model for convolutive mixtures is given by

$$x_i(t) = \sum_{j=1}^{n} \sum_{k} a_{ikj}\, s_j(t-k), \quad i = 1, \ldots, n \qquad (19.19)$$

This is a FIR filter model, where each FIR filter (for fixed indices $i$ and $j$) is defined by the coefficients $a_{ikj}$. Usually these coefficients are assumed to be time-independent constants, and the number of terms over which the convolution index $k$ runs is finite. Again, we observe only the mixtures $x_i(t)$, and both the independent source signals $s_i(t)$ and all the coefficients $a_{ikj}$ must be estimated.

To invert the convolutive mixtures (19.19), a set of similar FIR filters is typically used:

$$y_i(t) = \sum_{j=1}^{n} \sum_{k} w_{ikj}\, x_j(t-k), \quad i = 1, \ldots, n \qquad (19.20)$$

The output signals $y_1(t), \ldots, y_n(t)$ of the separating system are estimates of the source signals $s_1(t), \ldots, s_n(t)$ at discrete time $t$. The $w_{ikj}$ give the coefficients of the FIR filters of the separating system. The FIR filters used in separation can be either causal or noncausal depending on the method. The number of coefficients in each separating filter must sometimes be very large (hundreds or even thousands) to achieve sufficient inversion accuracy.
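
The double sum in (19.19) can be written as a short routine that applies one FIR filter per source-mixture pair; the same routine evaluates the separating system (19.20) when it is given the coefficients $w_{ikj}$ and the mixtures. This is a minimal sketch under assumptions: the function name `convolutive_mix`, the array layout `A[i, k, j]`, causal filters with $k = 0, \ldots, K-1$, zero signal values before $t = 0$, and the random test data are all illustrative choices.

```python
import numpy as np

def convolutive_mix(A, s):
    """Evaluate x_i(t) = sum_j sum_k A[i, k, j] * s_j(t - k), as in (19.19).

    A has shape (n_out, K, n_in), indexed [i, k, j]; s has shape (n_in, T).
    Signal values before t = 0 are taken to be zero.
    """
    n_out, K, n_in = A.shape
    T = s.shape[1]
    x = np.zeros((n_out, T))
    for i in range(n_out):
        for j in range(n_in):
            # FIR filter with coefficients A[i, :, j] applied to source j
            x[i] += np.convolve(s[j], A[i, :, j])[:T]
    return x

rng = np.random.default_rng(1)
n, K, T = 2, 3, 50
A = rng.normal(size=(n, K, n))      # hypothetical mixing filters a_ikj
s = rng.laplace(size=(n, T))        # nongaussian sources
x = convolutive_mix(A, s)
print(x.shape)
```

With $K = 1$ the routine reduces to the instantaneous model $\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t)$, which makes the relationship between the two models concrete.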
Instead of the feedforward FIR structure, feedback (IIR type) filters have sometimes been used for separating convolutive mixtures; an example is presented in Section 23.4. See [430] for a discussion of the relative advantages and drawbacks of these filter structures in convolutive BSS.

At this point, it is useful to discuss relationships between the convolutive BSS problem and the standard ICA problem on a general level [430]. Recall first that in standard linear ICA and BSS, the indeterminacies are the scaling and the order of the estimated independent components or sources (and their sign, which can be included in scaling). With convolutive mixtures the indeterminacies are more severe: the order of the estimated sources $y_i(t)$ is still arbitrary, but scaling is replaced by (arbitrary) filtering. In practice, many of the methods proposed for convolutive mixtures filter the estimated sources $y_i(t)$ so that they are temporally uncorrelated (white). This follows from the strong independence condition that most of the blind separation methods introduced for convolutive mixtures try to realize as well as possible. The temporal whitening effect causes some inevitable distortion if the original source signals themselves are not temporally white. Sometimes it is possible to get rid of this by using a feedback filter structure; see [430].

Denote by

$$\mathbf{y}(t) = [y_1(t), y_2(t), \ldots, y_n(t)]^T \qquad (19.21)$$

the vector of estimated source signals. They are both temporally and spatially white if

$$E\{\mathbf{y}(t)\,\mathbf{y}^H(k)\} = \delta_{tk}\,\mathbf{I} = \begin{cases} \mathbf{I}, & t = k \\ \mathbf{0}, & t \neq k \end{cases} \qquad (19.22)$$

where $^H$ denotes complex conjugate transpose (Hermitian operator). The standard spatial whitening condition $E\{\mathbf{y}(t)\,\mathbf{y}^H(t)\} = \mathbf{I}$ is obtained as a special case when $t = k$. The condition (19.22) is required to hold for all lags $t - k$ for which the separating filters (19.20) are defined. Douglas and Cichocki have introduced a simple adaptive algorithm for whitening convolutive mixtures in [120].
Lambert and Nikias have given an efficient temporal whitening method based on FIR matrix algebra and Fourier transforms in [257].

Standard ICA makes use of spatial statistics of the mixtures to learn a spatial blind separation system. In general, higher-order spatial statistics are needed for achieving this goal. However, if the source signals are temporally correlated, second-order spatiotemporal statistics are sufficient for blind separation under some conditions, as shown in [424] and discussed in Chapter 18. In contrast, blind separation of convolutive mixtures must utilize spatiotemporal statistics of the mixtures to learn a spatiotemporal separation system.

Stationarity of the sources has a decisive role in separating convolutive mixtures, too. If the sources have nonstationary variances, second-order spatiotemporal statistics are enough, as briefly discussed in [359, 456].

For convolutive mixtures, stationary sources require higher than second-order statistics, just as in basic ICA, but the following simplification is possible [430]. Spatiotemporal second-order statistics can be used to decorrelate the mixtures. This step returns the problem to that of conventional ICA, which again requires higher-order spatial statistics. Examples of such approaches can be found in [78, 108, 156]. This simplification is not very widely used, however.

Alternatively, one can resort to higher-order spatiotemporal statistics from the beginning for sources that cannot be assumed nonstationary. This approach has been adopted in many papers, and it will be discussed briefly later in this chapter.

19.2.2 Reformulation as ordinary ICA

The simplest approach to blind separation of convolutive mixtures is to reformulate the problem using the standard linear ICA model. This is possible because blind deconvolution can be formulated as a special case of ICA, as we saw in (19.18).
Define now a vector $\tilde{\mathbf{s}}$ by concatenating $M$ time-delayed versions of every source signal:

$$\tilde{\mathbf{s}}(t) = [s_1(t), s_1(t-1), \ldots, s_1(t-M+1), s_2(t), s_2(t-1), \ldots, s_2(t-M+1), \ldots, s_n(t), s_n(t-1), \ldots, s_n(t-M+1)]^T \qquad (19.23)$$

and define similarly a vector

$$\tilde{\mathbf{x}}(t) = [x_1(t), x_1(t-1), \ldots, x_1(t-M+1), x_2(t), x_2(t-1), \ldots, x_2(t-M+1), \ldots, x_n(t), x_n(t-1), \ldots, x_n(t-M+1)]^T \qquad (19.24)$$

Using these definitions, the convolutive mixing model (19.19) can be written

$$\tilde{\mathbf{x}} = \tilde{\mathbf{A}}\tilde{\mathbf{s}} \qquad (19.25)$$

where $\tilde{\mathbf{A}}$ is a matrix containing the coefficients $a_{ikj}$ of the FIR filters in a suitable order. Now one can estimate the convolutive BSS model by applying ordinary ICA methods to the standard linear ICA model (19.25).

Deflationary estimation is treated in [108, 401, 432]. These methods are based on finding maxima of the absolute value of kurtosis, thus generalizing the kurtosis-based methods of Chapter 8. Other examples of approaches in which the convolutive BSS problem has been solved using conventional ICA can be found in [156, 292].

A problem with the formulation (19.25) is that when the original data vector $\mathbf{x}$ is expanded to $\tilde{\mathbf{x}}$, its dimension grows considerably. The number $M$ of time delays that needs to be taken into account depends on the application, but it is often tens or hundreds, and the dimension of model (19.25) grows by the same factor, to $nM$. This may lead to prohibitively high dimensions. Therefore, depending on the application and the dimensions $n$ and $M$, this reformulation may or may not solve the convolutive BSS problem satisfactorily.

In blind deconvolution, this is not such a big problem because we have just one signal to begin with, and we only need to estimate one independent component, which is easier than estimating all of them. In convolutive BSS, however, we often need to estimate all the independent components, and their number is $nM$ in the model (19.25).
Thus the computations may be very burdensome, and the number of data points needed to estimate such a large number of parameters can be prohibitive in practical applications. This is especially true if we want to estimate the separating system adaptively, trying to track changes in the mixing system. Estimation should then be fast both in terms of computations and data collection time.

Regrettably, these remarks largely hold for other approaches proposed for blind separation of convolutive mixtures, too. A fundamental reason for the computational difficulties encountered with convolutive mixtures is the fact that the number of unknown parameters in the model (19.19) is so large. If the filters have length $M$, it is $M$-fold compared with the respective instantaneous ICA model. This basic problem cannot be avoided in any way.

19.2.3 Natural gradient methods

In Chapter 9, the well-known Bell-Sejnowski and natural gradient algorithms were derived from the maximum likelihood principle. This principle was shown to be quite closely related to the maximization of the output entropy, which is often called the information maximization (infomax) principle; see Chapter 9. These ICA estimation criteria and algorithms can be extended to convolutive mixtures in a straightforward way. Early results and derivations of algorithms can be found in [13, 79, 121, 268, 363, 426, 427]. An application to CDMA communication signals will be described later in Chapter 23.

Amari, Cichocki, and Douglas presented an elegant and systematic approach for deriving natural gradient type algorithms for blind separation of convolutive mixtures and related tasks. It is based on algebraic equivalences and their nice properties. Their work has been summarized in [11], where rather general natural gradient learning rules have been given for complex-valued data both in the time domain and the z-transform domain.
The derived natural gradient rules can be implemented in either batch, on-line, or block on-line forms [11]. In the batch form, one can use a noncausal FIR filter structure, while the on-line algorithms require the filters to be causal.

In the following, we present an efficient natural gradient type algorithm [10, 13], described also in [430], for blind separation of convolutive mixtures. It can be implemented on-line using a feedforward (FIR) filter structure in the time domain. The algorithm is given for complex-valued data.

The separating filters are represented as a sequence of coefficient matrices $\mathbf{W}_k(t)$ at discrete time $t$ and lag (delay) $k$. The separated output with this notation and causal FIR filters is

$$\mathbf{y}(t) = \sum_{k=0}^{L} \mathbf{W}_k(t)\,\mathbf{x}(t-k) \qquad (19.26)$$

Here $\mathbf{x}(t-k)$ is the $n$-dimensional data vector containing the values of the $n$ mixtures (19.19) at the time instant $t-k$, and $\mathbf{y}(t)$ is the output vector whose components are estimates of the source signals $s_i(t)$, $i = 1, \ldots, m$. Hence $\mathbf{y}(t)$ has $m$ components, with $m \le n$.

This matrix notation allows the derivation of a separation algorithm using the natural gradient approach. The resulting weight matrix update algorithm, which takes into account the [...] delaying the output by $L$ samples, is as follows [13, 430]:

$$\Delta\mathbf{W}_k(t) \propto \mathbf{W}_k(t) - \mathbf{g}(\mathbf{y}(t-L))\,\mathbf{v}^H(t-k), \quad k = 0, \ldots, L \qquad (19.27)$$

Quite similarly as in Chapter 9, each component of the vector $\mathbf{g}$ applies the nonlinearity $g_i(\cdot)$ to the respective component of the argument vector. The optimal nonlinearity $g_i(\cdot)$ is the negative score function $g_i = -p_i'/p_i$ of the distribution $p_i$ of the source $s_i$. In (19.27), $\mathbf{v}(t)$ [...]
[...] that is ubiquitous in ICA. The permutation and signs of the sources are usually different in each frequency interval. For reconstructing a source signal $s_i(t)$ in the time domain, we need all its frequency components. Hence we need some method for choosing which source signals in different frequency intervals belong together. To this end, various continuity criteria have been introduced by many authors; [...]

[...] the actual separation in the time domain. Only selected parts of the separation procedure are carried out in the frequency domain. Separating filters may be easier to learn in the frequency domain because components are now orthogonal and do not depend on each other like the time domain coefficients [21, 430]. Examples of methods that apply their separation criterion in the time domain but do the rest in [...]

[...] first the noisy instantaneous linear ICA model

$$\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t) + \mathbf{n}(t) \qquad (19.30)$$

which has been discussed in more detail in Chapter 15. Making the standard realistic assumption that the additive noise $\mathbf{n}(t)$ is independent of the source signals $\mathbf{s}(t)$, the spatial covariance matrix $\mathbf{C}_x(t)$ of $\mathbf{x}(t)$ at time $t$ is

$$\mathbf{C}_x(t) = \mathbf{A}\mathbf{C}_s(t)\mathbf{A}^T + \mathbf{C}_n(t) \qquad (19.31)$$

where $\mathbf{C}_s(t)$ and $\mathbf{C}_n(t)$ are respectively the covariance matrices [...]

[...] for $\mathbf{A}$, $\mathbf{C}_s(t)$, and $\mathbf{C}_n(t)$. Note that the covariance matrices $\mathbf{C}_s(t)$ and $\mathbf{C}_n(t)$ are diagonal. The diagonality of $\mathbf{C}_s(t)$ follows from the independence of the sources, and $\mathbf{C}_n(t)$ can be taken diagonal because the components of the noise vector $\mathbf{n}(t)$ are assumed to be uncorrelated. We can also look at cross-covariance matrices $\mathbf{C}_x(t, t+\tau) = E\{\mathbf{x}(t)\,\mathbf{x}(t+\tau)^T\}$ over time. This approach has been mentioned in the context of [...]

[...] averages [359, 356]

$$\mathbf{C}_x(\omega, t) = \mathbf{A}(\omega)\,\mathbf{C}_s(\omega, t)\,\mathbf{A}^H(\omega) + \mathbf{C}_n(\omega, t) \qquad (19.32)$$

where $\mathbf{C}_x$ is the averaged spatial covariance matrix. If $\mathbf{s}$ is nonstationary, one can again write multiple linearly independent equations for different time lags and solve for unknowns or find LMS estimates of them by diagonalizing a number of matrices in the frequency domain [123, 359, 356]. If the mixing system is minimum [...]