Performance comparison of adaptive filtering in time and frequency domain for unmixing acoustic sources in real reverberant environments for close-microphone applications

2013 MASTER OF ENGINEERING
Department of Electrical and Electronic Information Engineering
DANG NGUYEN CHAU (ID M125212)
TOYOHASHI UNIVERSITY OF TECHNOLOGY
Date: 2013/07/25
Supervisor: H. Uehara

Abstract

One significant problem in audio recording is microphone leakage: a microphone picks up the sound of an instrument or other source besides the one it is intended to capture. For example, when a group of musicians plays together, each microphone captures not only the signal from its own instrument but also the interference signals generated by the other instruments. The close-microphone technique, in which the microphone is placed close to the source of interest, is used so that the microphone captures as much of the sound of interest as possible and the effect of microphone leakage is reduced.

The purpose of this work is to compare the performance of two adaptive filtering techniques for unmixing acoustic sources. Both techniques identify the channel impulse response under the minimum mean squared error (MMSE) criterion. However, one performs the calculation in the frequency domain with the Wiener filter, while the other performs it in the time domain with the NLMS algorithm. In addition, the time-domain calculation exploits the “solo interval”, which commonly occurs in music performance. Because of the use of the “solo interval”, the two calculations perform differently. This work aims to examine the performance of the model using the NLMS algorithm on the realistic problem of
unmixing and separating two interfering sources in several reverberant environments with the close-microphone technique. Moreover, the performance of this model is compared with that of the model using the Wiener filter, which has been presented as a good algorithm for the blind source separation (BSS) problem of unmixing acoustic sources.

In the experiment, two loudspeakers playing anechoic recordings of musical instruments serve as the two sources, and two omnidirectional microphones serve as the sensors. In a music performance there are usually time intervals in which only one instrument is active. During such an interval, the NLMS algorithm is used to estimate the channel response from the active source to the other microphone. This channel response can be called the leakage path response. After that interval, both sources are active. The signal of interest is then estimated by subtracting the leakage signal from the microphone signal, where the leakage signal is obtained by applying the leakage path response to the other microphone signal. The experiment is carried out in several rooms with different reverberation times in order to examine the system performance in various real environments. The length of the weighting vector, which is used in this work as the approximation of the leakage path response, is also varied in order to examine its effect on the system performance.

For the performance evaluation, the output signal of the system is decomposed by orthogonal projection into three components: a version of the original source signal, an error term that depends on the interference signal, and an error term that depends on the other noise. Two parameters are used to show the performance of the algorithms: the signal-to-interference ratio (SIR) and the signal-to-distortion ratio (SDR). The SIR indicates the interference remaining in the unmixed
signal, while the SDR measures the quality of the output signal with respect to both the remaining interference and the noise. The experimental results show that, in terms of SIR, the Wiener filter performs better than the NLMS algorithm when the distance between source and microphone is short (10 cm to 25 cm). When this distance increases, the NLMS algorithm performs better than the Wiener filter. In terms of SDR, on the other hand, the NLMS algorithm always performs better than the Wiener filter. The room reverberation time also affects the algorithm performance: the room with the longer reverberation time gives better performance in both SIR and SDR. The weighting vector length affects the system performance too. However, the results of this work show that increasing the weighting vector length is not an effective way to improve the performance.

CONTENTS
1. INTRODUCTION
2. ADAPTIVE NOISE CANCELLATION
2.1 Wiener filter
2.2 Recursive Least Squares (RLS) adaptive filter
2.3 The Steepest descent method
2.4 The Least Mean Squared (LMS) adaptation method
3. PROBLEM FORMULATION
4. RELATED WORK
5. NLMS ALGORITHM APPROACH FOR UNMIXING ACOUSTIC SOURCES
6. PERFORMANCE EVALUATION
7. EXPERIMENTAL PROCEDURE
8. RESULT
8.1 Signal before and after processing
8.2 Algorithm performance
8.3 Effect of room acoustic
8.4 Effect of the length of weighting vector
9. CONCLUSION
REFERENCE

LIST OF FIGURES
Fig.1.1 Microphone leakage in Close-Microphone applications
Fig.2.1 A frequency domain Wiener filter for reducing additive noise
Fig.2.2 Wiener filter structure
Fig.2.3 Illustration of an adaptive filter
Fig.3.1 Block diagram of the blind source separation problem
Fig.4.1 Block diagram of Wiener filter
Fig.4.2 Block diagram of Wiener filter for two sources-two microphones case
Fig.5.1 Block diagram of system using NLMS algorithm
Fig.7.1 Reverberation time
Fig.7.2 Learning curve
of the NLMS algorithm
Fig.8.1 Clean signal (a), signal at the microphone (b) and output signal of the system (c)
Fig.8.2 System performance (SIR) with recording in room and a filter length of 2048
Fig.8.3 System performance (SDR) with recording in room and a filter length of 2048
Fig.8.4 System performance (SIR) with recording in room (a) and room (b) with a filter length of 2048
Fig.8.5 System performance (SDR) with recording in room (a) and room (b) with a filter length of 2048
Fig.8.6 System SDR performance and SIR performance of Wiener filter in various rooms
Fig.8.7 System performance (SIR) of model using NLMS algorithm in various rooms with a filter length of 2048
Fig.8.8 System performance (SDR) of model using NLMS algorithm in various rooms with a filter length of 2048
Fig.8.9 Algorithm performance (SIR) with various weighting vector lengths in room and room
Fig.8.10 Algorithm performance (SDR) with various weighting vector lengths in room and room

LIST OF TABLES
Table 7.1 Experiment parameters
Table 7.2 Properties of the rooms in which the recordings took place

ACKNOWLEDGEMENT

First of all, I would like to thank Prof. Uehara, my supervisor. During my time at TUT, Prof. Uehara has helped me with everything from problems at school to problems in daily life. He has given me ideas and suggestions that were truly valuable for my research, and he has helped me to have the best conditions for finishing it.

Kitayama was my tutor during my time in Japan. He helped me, a person living far from home for the first time, to become acquainted with life in Japan. He always joined the seminars with me and gave me ideas. I am truly thankful to him.

The Ad-hoc group is one of the groups in the Wireless Communication Laboratory, and it is the group I belong to. I want to thank everyone in the group for their suggestions in the seminars. With
your help, my work became much easier. To the other members of my laboratory, thank you for your friendship. All of you are very friendly, which made me feel at home in my new life. Finally, I want to thank my Vietnamese friends at TUT. You have been really good to me.

Toyohashi, June 15, 2013
Dang Nguyen Chau

1. INTRODUCTION

Modern music production often involves a number of musicians playing together in the same room, with a number of microphones set up to capture the sound of their instruments (see Fig.1.1 [1]). A common technique in this situation is to dedicate one microphone to each sound source. Ideally, each microphone picks up only the signal from its source of interest. In practice, however, the microphone picks up not only the signal from the source of interest but also the signals from the other instruments. This undesired effect is known as microphone leakage. In the close-microphone technique, the microphone is placed close to the source of interest in order to capture as much of the sound of interest as possible and to reduce the effect of microphone leakage. This technique is also used to minimize the effect of the room acoustics on the received signal.

Fig.1.1 Microphone leakage in Close-Microphone applications

To address this problem, sound engineers suggest measures such as using directional microphones and optimizing the placement of sources and microphones. However, the problem discussed here is more general: the need for source separation and interference suppression arises in various other applications as well. The
purpose of this work is to propose a model for unmixing two interfering sources recorded in various reverberant environments with the close-microphone technique. In addition, it aims to compare the performance of this model with that of the model using the Wiener filter in [2].

In realistic cases, the sources are located in enclosed spaces such as a concert hall or a studio room. The source signal arriving at a microphone is largely dominated by the room impulse response. Under such conditions, the system from the sources to the microphones is a multi-input multi-output system, defined by the set of room impulse responses from the sources to the microphones. However, inverting this system is not simple. The main reasons are given in [2]: the room impulse response is non-minimum phase, the inverse of the system is unstable, and so on. A further reason is that the room impulse response in audio processing is a lengthy filter, and inverting a matrix in which each element has on the order of ten thousand coefficients has a large computational cost. It is therefore reasonable to propose a model that solves the general problem of unmixing sources. Such a model, based on the Wiener filter, is presented in [2].

The Wiener filter is one alternative for solving the source separation problem. It provides a way to estimate the source of interest ŝ(n) from the signal x(n), which contains both the signal of interest and the interfering signal. The Wiener filter is frequently used for source separation. NLMS is also frequently used in sound processing, but mostly for noise removal or echo equalization. This work proposes a model that uses NLMS with the close-microphone setup to unmix audio sources. The system performance is compared with that of the model using the Wiener filter. Moreover, the system is examined in various real environments to verify that it unmixes the sources successfully.

This work is organized as
follows. In Section 2, the adaptive filters are presented. In Section 3, the problem formulation is presented. The related work is then discussed in Section 4. In Section 5, the proposed model using the NLMS algorithm is presented. Section 6 discusses the performance evaluation. In Sections 7 and 8, the experiment and the results are presented. Finally, the conclusion is given in Section 9.

2. ADAPTIVE NOISE CANCELLATION

In telecommunication in noisy acoustic environments, the signal of interest is often observed together with additive noise. The noisy signal can be modeled as:

$y(m) = x(m) + n(m)$ (2.1)

where x(m) and n(m) are the signal and the noise, and the variable m is the discrete-time index. The signal x(m) can be recovered by subtracting an estimate of the noise from the noisy signal. Fig.2.1 shows a two-input adaptive noise cancellation system.

Fig.2.1 A frequency domain Wiener filter for reducing additive noise

In the above system, one microphone takes as input the noisy signal $x(m) + n(m)$, and the second microphone takes only the noise $\alpha n(m + \tau)$. The attenuation factor $\alpha$ and the time delay $\tau$ provide a simple model of the effect of the propagation of the noise to different positions in space. The noise from the second microphone is processed by an adaptive filter to make it equal to the noise contained in the noisy signal of the first microphone. It is then subtracted from the noisy signal to cancel the noise. The Wiener filter is known as the optimal solution for noise cancellation. Adaptive filters such as the recursive least squares (RLS) adaptive filter, the steepest descent method and the least mean squares (LMS) adaptive method are filters whose coefficients approach the Wiener solution adaptively.

2.1 Wiener filter

Wiener theory, formulated by Norbert Wiener, forms the foundation of data-dependent least squared error filters. The Wiener filter is used in a wide range of applications such as linear prediction, signal coding, echo cancellation and channel equalization. The
coefficients of the Wiener filter are calculated by minimizing the average squared distance between the filter output and the desired signal.

8.1 Signal before and after processing

Fig.8.1 (a) shows the clean signal of the source. Fig.8.1 (b) shows the signal observed by the microphone. This signal contains the clean signal of the source, the signal from the interfering source and the noise of the real environment. As can be seen, the microphone signal differs from the clean signal. Fig.8.1 (c) shows the output signal of the system, which closely resembles the clean signal of the source. This shows that the system succeeds in removing the interfering signal from the noisy signal. However, Fig.8.1 alone is not clear evidence of the interference reduction ability. In the following sections, the two parameters SIR and SDR are used to show this ability more clearly.

8.2 Algorithm performance

The performances of the two models discussed in the previous sections are shown in Fig.8.2 and Fig.8.3. The recording was made in room. In terms of SIR (Fig.8.2), the model using the NLMS algorithm gives a lower SIR than the model using the Wiener filter when the microphones are placed close to the sources. On the other hand, when this distance increases, the performance of the model using the Wiener filter decreases faster than that of the model using the NLMS algorithm. Thus, when the microphones are placed far from the sources, the model using NLMS performs better. According to Fig.8.2, the model using the NLMS algorithm always gives an SIR higher than 30 dB at any source-microphone distance.

As shown in Fig.8.2, the SIR curves of the systems using the NLMS algorithm and the Wiener filter cross over. The reason can be explained from the structure of the system using the NLMS algorithm. By using the “solo interval”, the system using the NLMS algorithm obtains a good approximation of the interference in the signals observed at the
microphones. The system using the Wiener filter performs the calculation in Eq.4.4. This calculation relies on two approximations, which are mentioned in Section 4. Because of these approximations, the system cannot obtain a good approximation of the interference signal, which makes the performance of the system using the Wiener filter lower than that of the system using the NLMS algorithm. However, as Eq.5.6 shows, the output signal of the system using the NLMS algorithm has two components. The first is the signal of interest that we want to estimate. The second is an undesired component, which decreases the power of the signal of interest in the output. At close distances the second component is large, and it decreases rapidly as the distance increases. This is why the two SIR curves cross over.

Fig.8.2 System performance (SIR) with recording in room and a filter length of 2048

For the SDR, the result in Fig.8.3 shows that the model using the NLMS algorithm always gives higher performance than the model using the Wiener filter at any source-microphone distance. The model using the NLMS algorithm always gives an SDR higher than 20 dB. The reason the system using the NLMS algorithm performs better than the Wiener filter is the same as above.

Fig.8.3 System performance (SDR) with recording in room and a filter length of 2048

Depending on the application, the user can choose either the model using the Wiener filter or the one using the NLMS algorithm. For applications in which the microphones must be placed close to the sources, the model using the NLMS algorithm gives a higher SDR but a lower SIR than the one using the Wiener filter. For applications in which the microphones are placed far from the sources, the model using the NLMS algorithm is the better choice. The recordings in the other rooms show a trend similar to that in the
room. Fig.8.4 and Fig.8.5 show the algorithm performances in room and room

Fig.8.4 System performance (SIR) with recording in room (a) and room (b) with a filter length of 2048

Fig.8.5 System performance (SDR) with recording in room (a) and room (b) with a filter length of 2048

8.3 Effect of room acoustic

A room with a longer reverberation time is usually considered a difficult environment for the source separation problem. However, the results of the two models show that the room with the longer reverberation time gives the higher performance. In the related work [2], it is shown that the room reverberation time has a small effect on the system performance in SIR, whereas the system performance in SDR is strongly affected by the reverberation time. Fig.8.6 shows the SIR and SDR performance of the system using the Wiener filter.

Fig.8.6 System SDR performance and SIR performance of Wiener filter in various rooms

In this work, the performance of the system using the NLMS algorithm was recorded in various rooms for comparison. The results are similar to those of the related work: the reverberation time of the room has a small effect on the SIR and a strong effect on the SDR.

In the case of SIR, the room environment does not have a great effect on the system performance. The room with the longest reverberation time gives the highest SIR. However, the difference in algorithm performance between the rooms with the longest and the shortest reverberation times is not large. There is only a 3 dB difference in SIR between room and room at any source-microphone distance, although the reverberation time is 0.76 sec longer in room

Fig.8.7 System performance (SIR) of model using NLMS algorithm in various rooms with a filter length of 2048

On the
other hand, in the case of SDR, there is a big difference in system performance from room to room. The SDR results in various rooms are shown in Fig.8.8. It is easy to see that room 2, which has a 0.2 sec longer reverberation time than room 1, gives a higher SDR than room 1. However, this difference is not large. The room with the longest reverberation time gives the highest SDR, which is 7 dB higher than room

Fig.8.8 System performance (SDR) of model using NLMS algorithm in various rooms with a filter length of 2048

The results of recording in various rooms show that the room acoustic property, the reverberation time in this case, does not have a big effect on the system performance in SIR: the SIR increases by only 3 dB while the reverberation time becomes 0.76 sec longer. Thus the choice of room acoustics is not important with respect to SIR. On the other hand, the reverberation time of the room has a large effect on the system performance in SDR: a longer reverberation time gives a much higher SDR. Whether the choice of room acoustic property, here the reverberation time, matters therefore depends on whether SDR or SIR is more important for the application.

8.4 Effect of the length of weighting vector

In [14][15], it is shown that the room impulse response can be represented as a lengthy filter, which may have 10000 coefficients. Thus, the BSS algorithm requires a lengthy filter to produce the estimated signal. In this work, the model in Section 5 has a weighting vector, which is used as the approximation of the leakage impulse response. A longer weighting vector gives a better approximation of the leakage impulse response. Changing the length of the weighting vector has some effect on the system performance.

Fig.8.9 shows the system performance in SIR in room (above) and room (below) with weighting vectors of 2048, 4096 and 8192 coefficients. The results show that the length of the weighting vector has almost no effect on the system
performance. For the two recordings in room and room 3, the difference is always 1 dB across the weighting vector lengths.

Fig.8.10 shows the system performance in SDR in room (above) and room (below) with weighting vectors of 2048, 4096 and 8192 coefficients. Unlike the SIR case, the length of the weighting vector has a larger effect on the system performance. In room (above), a longer weighting vector gives a higher SDR. However, when the length increases from 4096 to 8192, the SDR does not increase as much as when the length increases from 2048 to 4096. In room 4, the weighting vector of length 4096 gives a much higher SDR than that of length 2048, but the lengths 8192 and 4096 give the same SDR.

Fig.8.9 Algorithm performance (SIR) with various weighting vector lengths in room and room

Fig.8.10 Algorithm performance (SDR) with various weighting vector lengths in room and room

The SIR and SDR results in this section show that, for the SIR, increasing the weighting vector length yields only a 1 dB increase in any room, while it raises the computational cost of the system. For the SDR, increasing the length yields a larger gain than in the SIR case, 3 dB. There is a trade-off between system performance and computational cost. However, the results show that increasing the weighting vector length is not an effective way to improve the performance: it takes a much higher computational cost for only a small increase in performance.

9. CONCLUSION

This work has compared the performance of adaptive filtering in the time and frequency domains for unmixing acoustic sources in real reverberant environments. The proposed model gives a higher SDR than the related work, while its SIR is lower when the microphone is close to the source and higher when the source-microphone distance is large. The effect
of room acoustic and weighting vector length on the performance has also been examined. The room properties, especially the reverberation time, have a small effect on the system performance in SIR but a strong effect on the system performance in SDR. The result on the effect of the weighting vector length is valuable: it shows that there is no need to increase the weighting vector length, since doing so raises the computational cost while improving the performance only slightly.

The model in this work is proposed for unmixing two audio sources in real environments. However, unmixing two audio sources is a special case of the general problem of unmixing audio sources. When the number of sources and microphones increases to three or more, the model in this work cannot unmix the sources successfully: the output signal of the system still contains a lot of interference, which is detectable by a human listener. To solve this general problem, [1] proposed a model using the Wiener filter for unmixing the audio sources. In that work, a way of estimating the direct PSD and the leakage PSD is proposed, and these PSDs are used to calculate the Wiener filter coefficients. The results of that work show that the model is suited to suppressing the interfering signal: the SIR improves by 10-15 dB. However, the SDR increases by only 5 dB. One way to develop this work is to combine the model in this work with the no-weights model in [1] to create a new model whose SDR and SIR improvements are higher than the results in [1].

REFERENCE

[1] E. K. Kokkinis, J. D. Reiss and J. Mourjopoulos, “A Wiener filter approach to microphone leakage reduction in close-microphone applications”, IEEE Trans. on Audio, Speech and Language Processing, vol. 20, no. 3, Mar. 2012.
[2] E. K. Kokkinis and J. Mourjopoulos, “Unmixing acoustic sources in real reverberant environments for close-microphone
applications”, J. Audio Eng. Soc., vol. 58, no. 11, pp. 1-10, Nov. 2010.
[3] E. J. Diethorn, “Subband noise reduction methods for speech enhancement”, in Audio Signal Processing for Next-Generation Multimedia Communication Systems, Y. Huang and J. Benesty, Eds. (Kluwer Academic, Boston, MA, 2004).
[4] S. Haykin, “Adaptive filter theory”, Prentice-Hall, 1986.
[5] B. Widrow et al., “Adaptive noise cancelling: Principles and applications”, Proc. IEEE, vol. 63, no. 12, Dec. 1975.
[6] S. V. Vaseghi, “Advanced signal processing and digital noise reduction”, Wiley, 1996.
[7] L. Bai and Q. Yin, “A modified NLMS algorithm for adaptive noise cancellation”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2010.
[8] J. I. Nagumo and A. Noda, “A learning method for system identification”, IEEE Trans. on Automatic Control, June 1967.
[9] A. E. Albert and L. S. Gardner, “Stochastic approximation and nonlinear regression”, Cambridge, MA: MIT Press, 1967.
[10] K. Brandenburg and T. Sporer, “NMR and Masking Flag: Evaluation of quality using perceptual criteria”, in Proc. AES 11th Int. Conf. on Audio Test and Measurement (Portland, OR, May 1992).
[11] K. Furuya and A. Kataoka, “Robust speech dereverberation using multichannel blind deconvolution with spectral subtraction”, IEEE Trans. on Audio, Speech and Language Processing, vol. 15, July 2007.
[12] T. Thiede et al., “PEAQ-The ITU standard for objective measurement of perceived audio quality”, J. Audio Eng. Soc., vol. 48, Jan. 2000.
[13] E. Vincent et al., “Performance measurement in blind audio source separation”, IEEE Trans. on Audio, Speech and Language Processing, vol. 14, July 2007.
[14] J. N. Mourjopoulos, “Digital equalization of room acoustic”, J. Audio Eng. Soc., vol. 42, Nov. 1994.
[15] P. D. Hatziantoniou and J. N. Mourjopoulos, “Errors in real-time room acoustics de-reverberation”, J. Audio Eng. Soc., vol. 52, Sep. 2004.
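As an illustration of the “solo interval” NLMS procedure and the SIR/SDR evaluation described in this work, the following is a minimal sketch and not the implementation used in the experiments. It assumes a noise-free synthetic mixture, a short leakage path of 8 taps, and a second microphone that picks up only its own source. The function names nlms_identify, remove_leakage and sir_sdr_db, the step size mu and all signal parameters are hypothetical choices made for this example. The SIR/SDR computation is a simplified zero-delay orthogonal projection in the spirit of [13].

```python
import numpy as np

def nlms_identify(ref, desired, num_taps, mu=0.5, eps=1e-8):
    """Estimate FIR weights w so that `ref` filtered by w approximates `desired`."""
    w = np.zeros(num_taps)
    buf = np.zeros(num_taps)                  # latest reference samples, newest first
    for n in range(len(ref)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]
        err = desired[n] - w @ buf            # a-priori estimation error
        w = w + mu * err * buf / (buf @ buf + eps)   # normalized LMS update
    return w

def remove_leakage(mic, other_mic, w):
    """Subtract the estimated leakage (other_mic filtered by w) from mic."""
    leak = np.convolve(other_mic, w)[:len(mic)]
    return mic - leak

def sir_sdr_db(est, target, interferer):
    """SIR and SDR in dB from a zero-delay projection onto the known sources."""
    basis = np.stack([target, interferer], axis=1)
    coef, *_ = np.linalg.lstsq(basis, est, rcond=None)
    proj = basis @ coef                       # part of est explained by both sources
    s_target = (est @ target) / (target @ target) * target
    e_interf = proj - s_target                # error due to the interfering source
    e_other = est - proj                      # error due to everything else
    tiny = 1e-30                              # avoids division by zero
    p_target = np.sum(s_target ** 2)
    sir = 10 * np.log10(p_target / (np.sum(e_interf ** 2) + tiny))
    sdr = 10 * np.log10(p_target / (np.sum((e_interf + e_other) ** 2) + tiny))
    return sir, sdr

# Synthetic example (assumed data): source 2 plays alone during the solo interval.
rng = np.random.default_rng(0)
n_solo, n_total, taps = 4000, 8000, 8
s2 = rng.standard_normal(n_total)
s1 = np.concatenate([np.zeros(n_solo), rng.standard_normal(n_total - n_solo)])
h_leak = 0.3 * rng.standard_normal(taps)      # hypothetical leakage path response
x2 = s2                                       # assumption: mic 2 captures only source 2
x1 = s1 + np.convolve(s2, h_leak)[:n_total]   # mic 1: source 1 plus leakage from source 2

w = nlms_identify(x2[:n_solo], x1[:n_solo], taps)   # adapt only in the solo interval
s1_hat = remove_leakage(x1, x2, w)                  # weights stay fixed afterwards

sir_before, sdr_before = sir_sdr_db(x1, s1, s2)
sir_after, sdr_after = sir_sdr_db(s1_hat, s1, s2)
print(f"SIR before: {sir_before:.1f} dB, after: {sir_after:.1f} dB")
```

In this sketch the weights are adapted only while one source is active and are then held fixed, which mirrors the two-phase procedure described in the abstract. Real recordings would need a much longer weighting vector (2048 coefficients or more in this work) and would include measurement noise, so the idealized figures printed here should not be compared with the reported results.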