EURASIP Journal on Applied Signal Processing 2004:12, 1817–1830
© 2004 Hindawi Publishing Corporation

The Cramer-Rao Bound and DMT Signal Optimisation for the Identification of a Wiener-Type Model

H. Koeppl
Christian Doppler Laboratory for Nonlinear Signal Processing, Graz University of Technology, 8010 Graz, Austria
Email: heinz.koeppl@tugraz.at

A. S. Josan
Department of Electronics and Communication Engineering, Indian Institute of Technology Guwahati, Guwahati 781039, Assam, India
Email: awlok@iitg.ernet.in

G. Paoli
System Engineering Group, Infineon Technologies, 9500 Villach, Austria
Email: gerhard.paoli@infineon.com

G. Kubin
Christian Doppler Laboratory for Nonlinear Signal Processing, Graz University of Technology, 8010 Graz, Austria
Email: gernot.kubin@tugraz.at

Received 2 September 2003; Revised 8 January 2004

In linear system identification, optimal excitation signals can be determined using the Cramer-Rao bound. This problem has not been thoroughly studied for the nonlinear case. In this work, the Cramer-Rao bound for a factorisable Volterra model is derived. The analytical result is supported with simulation examples. The bound is then used to find the optimal excitation signal out of the class of discrete multitone signals. As the model is nonlinear in the parameters, the bound depends on the model parameters themselves. On this basis, a three-step identification procedure is proposed. To illustrate the procedure, signal optimisation is explicitly performed for a third-order nonlinear model. Methods of nonlinear optimisation are applied for the parameter estimation of the model. As a baseline, the problem of optimal discrete multitone signals for linear FIR filter estimation is reviewed.

Keywords and phrases: Wiener model, Cramer-Rao bound, signal design, nonlinear system identification.

1. INTRODUCTION

In the design of optimal excitation signals for system identification, the Cramer-Rao bound plays a central role.
For a given model structure, it gives a lower bound on the variance of the unbiased model parameter estimates for a given perturbation scenario [1]. The problem of signal optimisation for the identification of linear models is considered in [2]. We focus on a nonlinear model structure proposed in [3], which is nonlinear in the parameters and can be considered a generalisation of the classical Wiener model [4, page 143]. For the classical Wiener model, the Cramer-Rao bound was derived in [5]. The goal of this work is to gain further insight into the design of optimal excitation signals for the identification of nonlinear cascade systems.

The application that drove our investigations is adaptive nonlinear filtering for ADSL data transmission systems. The block diagram in Figure 1 shows an application of the nonlinear model as a nonlinear canceler of the hybrid echo for the receive path of an ADSL transceiver system. System distortion analysis revealed that the line-driver circuit is the main source of nonlinearity. In the subsequent simulation experiments, a nonlinear Wiener-type model of this line-driver circuit is used as a reference model.

[Figure 1: Block diagram of the application of a nonlinear canceler of the hybrid echo for an ADSL transceiver system. Blocks: twisted wire pair, hybrid, receive path, nonlinear echo canceler, ADSL digital transceiver, transmit path, nonlinear line driver.]

As excitation signal, the class of discrete multitone (DMT) signals as used in ADSL data transmission is primarily considered. During the startup phase of the ADSL system, it is possible to send a predetermined DMT training sequence for the nonlinear echo canceler. Thus, the goal of the signal optimisation procedure is to find the DMT training sequence which is optimal in the sense that the most accurate model parameter estimates for the echo canceler can be obtained. Our focus is on the effects of a finite number of
tones in the input signal and of a finite number of samples for the estimation of the model parameters.

The work is organised as follows. In Section 2, the considered Wiener-type model is derived from the general Volterra model. The Cramer-Rao bound for this model is computed in Section 3, while Section 4 deals with the parameter estimation algorithm. Verification of the derived Cramer-Rao bound via numerical simulations is performed in Section 5. A discussion, new algorithms, and simulation results concerning the design of optimal excitation signals for the considered model are given in Section 6.

2. VOLTERRA MODEL AND THE WIENER-TYPE MODEL

The multivariate kernel $v_p[k_1,\ldots,k_p]$ of the homogeneous Volterra system of Figure 2 with

$$ y[n] = \sum_{k_1=0}^{M_p-1} \cdots \sum_{k_p=0}^{M_p-1} v_p[k_1,\ldots,k_p]\, u[n-k_1] \cdots u[n-k_p] \tag{1} $$

is factorisable if it can be written as a product of lower-dimensional terms

$$ v_p[k_1,\ldots,k_p] = r_p[k_1,\ldots,k_r]\, w_p[k_{r+1},\ldots,k_p] \tag{2} $$

as shown in Figure 3.

[Figure 2: Homogeneous Volterra system of order p.]

[Figure 3: Partially factorisable homogeneous Volterra system of order p.]

The kernel is fully factorisable if it can be written as

$$ v_p[k_1,\ldots,k_p] = \prod_{i=1}^{p} h_{pi}[k_i]. \tag{3} $$

The corresponding block diagram is depicted in Figure 4.

[Figure 4: Fully factorisable homogeneous Volterra system of order p.]

If all one-dimensional kernels $h_{pi}[k_i]$ are identical, that is, $h_p[k_i] = h_{pi}[k_i]$ for $i = 1,\ldots,p$ with

$$ v_p[k_1,\ldots,k_p] = \prod_{i=1}^{p} h_p[k_i], \tag{4} $$

one arrives at the cascade structure of Figure 5, which is recognised as a homogeneous Wiener system. In the case of a general Volterra system of order $N$ for which condition (4) holds for all orders $p$ with $p = 1,\ldots,N$, we obtain the considered simplified factorisable Volterra system.
This Wiener-type model and the related measurement scenario are depicted in Figure 6. If the $N$ different linear kernels $h_p[k]$ in Figure 6 differ only by a scaling factor, the classical Wiener model is obtained.

[Figure 5: Homogeneous Wiener system of order p.]

[Figure 6: The considered nonlinear Wiener-type model. The input $u[n]$ feeds $N$ parallel branches; branch $p$ consists of the linear kernel $h_p[k]$ with output $x_p[n]$, followed by the static power $(\cdot)^p$. The branch outputs are summed to $y[n]$, and the noise $\epsilon[n]$ is added to give $z[n]$.]

The measured output $z[n]$ of the considered model can be written as $z[n] = y[n] + \epsilon[n]$ with

$$ y[n] = \sum_{p=1}^{N} \left( \sum_{k=0}^{M_p-1} h_p[k]\, u[n-k] \right)^{p}, \tag{5} $$

where $u[n]$ is the input signal and $\epsilon[n]$ is assumed to be an additive zero-mean Gaussian noise process with covariance matrix $\Sigma$. Subsequently, for ease of notation and without loss of generality, $M_p = M$ for $p = 1,\ldots,N$ is assumed. For convenience, the following objects are defined. The linear kernel matrix $H \in \mathbb{R}^{M\times N}$ is defined as

$$ H \equiv \begin{bmatrix} h_1[0] & \cdots & h_N[0] \\ \vdots & & \vdots \\ h_1[M-1] & \cdots & h_N[M-1] \end{bmatrix} \tag{6} $$

and the windowed input matrix $U \in \mathbb{R}^{N_s\times M}$ is defined as

$$ U \equiv \begin{bmatrix} u[1] & u[0] & \cdots & u[-M+2] \\ \vdots & \vdots & & \vdots \\ u[N_s] & u[N_s-1] & \cdots & u[N_s-M+1] \end{bmatrix}, \tag{7} $$

where $u[n]$ for $n < 1$ is assumed to be known and $N_s$ is the considered observation sample length or estimation horizon. To be precise, building an $N_s \times M$ data matrix $U$ requires knowledge of $N_s + M - 1$ samples of the input signal $u[n]$, which would actually be the estimation horizon. Nevertheless, in the following, we stick to the convention that the estimation horizon is the number of rows of the data matrix $U$, that is, $N_s$. In addition, the power operator $P : \mathbb{R}^{n\times m} \to \mathbb{R}^{n}$ with

$$ (PX)_n = \sum_{p=1}^{m} X_{np}^{\,p} \tag{8} $$

is defined, where the notation $(\cdot)_I$, denoting one element of a nonscalar object with $I$ possibly a multi-index, was used. Making use of the above definitions, the output of the nonlinear model of Figure 6 reads

$$ z = PX + \epsilon, \qquad X = UH, \tag{9} $$

where the elements of these objects correspond to $z_n \equiv z[n]$, $\epsilon_n \equiv \epsilon[n]$, and $X_{np} \equiv x_p[n]$.
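The definitions (6) to (9) translate almost line by line into array code. The following is a minimal sketch in Python with NumPy, not the authors' implementation; the function names `windowed_input`, `wiener_output`, and `measure` are illustrative, and the optional `orders` argument is an added convenience for models that skip some orders (for example, a model with only first- and third-order branches).

```python
import numpy as np


def windowed_input(u, M, n_s):
    """Build the n_s x M data matrix U of (7).

    `u` holds u[-M+2], ..., u[n_s] in time order, i.e. n_s + M - 1 samples,
    so that row n (1-based) is [u[n], u[n-1], ..., u[n-M+1]].
    """
    u = np.asarray(u, dtype=float)
    if u.size != n_s + M - 1:
        raise ValueError("need n_s + M - 1 input samples")
    return np.array([u[i:i + M][::-1] for i in range(n_s)])


def wiener_output(U, H, orders=None):
    """Noise-free output y = P(UH) of (9).

    Column p of X = UH is raised elementwise to `orders[p]`; the default
    orders 1, ..., N reproduce the power operator P of (8).
    """
    X = U @ np.asarray(H, dtype=float)
    if orders is None:
        orders = np.arange(1, X.shape[1] + 1)
    return np.sum(X ** np.asarray(orders), axis=1)


def measure(U, H, sigma2, rng, orders=None):
    """Noisy measurement z = P(UH) + eps with white Gaussian eps (Sigma = sigma2 * I)."""
    y = wiener_output(U, H, orders)
    return y + rng.normal(scale=np.sqrt(sigma2), size=y.shape)
```

With `M = 2` and `n_s = 3`, the call `windowed_input([1, 2, 3, 4], 2, 3)` yields the rows `[2, 1]`, `[3, 2]`, `[4, 3]`, which makes the convention of (7) easy to check by hand: the first column is the current sample and each further column is delayed by one step.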
The parameter vector $\theta \equiv \mathrm{vec}(H)$ will be needed in the following, where the linear index $j$ of $\theta_j$ corresponds to the matrix indices $[k, p]$ of $H_{kp}$ with $j = (p-1)M + k$ and $k = j \bmod M$, $p = \lceil j/M \rceil$, where $\lceil\cdot\rceil$ denotes the ceiling function.

3. THE CRAMER-RAO BOUND FOR THE WIENER-TYPE MODEL

The Cramer-Rao bound is the theoretical lower bound for the variance of all unbiased estimators $\hat\theta$ for the model parameters $\theta$ and is determined by the diagonal elements of the inverse of the Fisher information matrix $F$:

$$ F_{ij} \equiv E\!\left( \frac{\partial \ln l(\theta|z)}{\partial\theta_i}\, \frac{\partial \ln l(\theta|z)}{\partial\theta_j} \right). \tag{10} $$

Here $E(\cdot)$ denotes the expectation operator with respect to the random vector $z = PX + \epsilon$ and $l(\theta|z)$ is the likelihood function for the parameter vector $\theta$ given the noisy observation vector $z$ [1]. Thus,

$$ \mathrm{cov}\!\left(\theta\theta^T\right)_{ij} \equiv E\!\left( \left(\theta_i - E(\theta_i)\right)\left(\theta_j - E(\theta_j)\right) \right) \ge \left(F^{-1}\right)_{ij}. \tag{11} $$

Under the regularity condition [6, page 26]

$$ E\!\left( \frac{\partial \ln l(\theta|z)}{\partial\theta} \right) = 0, \tag{12} $$

(10) can be written as

$$ F = E(G), \tag{13} $$

with

$$ G_{ij} \equiv -\frac{\partial^2 \ln l(\theta|z)}{\partial\theta_i\, \partial\theta_j}, \tag{14} $$

the Hessian matrix of the objective function $-\ln l(\theta|z)$ for the maximum likelihood estimation. For the additive Gaussian noise model of $\epsilon$, the likelihood function $l(H|z)$ for the parameter matrix $H$ given the observation vector $z$ reads as follows:

$$ l(H|z) = \left( (2\pi)^{N_s} |\Sigma| \right)^{-1/2} \exp\!\left( -\tfrac{1}{2} \left(z - P(UH)\right)^T \Sigma^{-1} \left(z - P(UH)\right) \right). \tag{15} $$

The entries of the Fisher information matrix (10) for the considered Wiener-type model (5) are calculated as follows. The log-likelihood function reads

$$ \ln l(H|z) = -\tfrac{1}{2} N_s \log 2\pi - \tfrac{1}{2} \log|\Sigma| - \tfrac{1}{2} \left(z - P(UH)\right)^T \Sigma^{-1} \left(z - P(UH)\right). \tag{16} $$

The derivative of the log-likelihood function with respect to the parameter matrix $H$ can be decomposed as

$$ \frac{\partial \ln l(H|z)}{\partial H_{rs}} = \frac{\partial \ln l(H|z)}{\partial\epsilon}\, \frac{\partial\epsilon}{\partial x_s}\, \frac{\partial x_s}{\partial H_{rs}}, \tag{17} $$

where the columns $x_s$ of the matrix $X = [x_1,\ldots,x_N]$ have been introduced. The first two terms of the product give

$$ \frac{\partial \ln l(H|z)}{\partial\epsilon} = -\epsilon^T \Sigma^{-1}, \tag{18} $$

$$ \frac{\partial\epsilon}{\partial x_s} \equiv \tilde X_s = s\, \mathrm{diag}\!\left( x_s^{[s-1]} \right), \tag{19} $$

where $(\cdot)^{[p]}$ denotes the elementwise $p$th power.
The last term yields

$$ \frac{\partial x_s}{\partial H_{rs}} = u_r, \tag{20} $$

with the columns $u_r$ of the matrix $U = [u_1,\ldots,u_M]$. Thus,

$$ \frac{\partial \ln l(H|z)}{\partial H_{rs}}\, \frac{\partial \ln l(H|z)}{\partial H_{qp}} = \left(\tilde X_s u_r\right)^T \Sigma^{-1} \epsilon\, \epsilon^T \Sigma^{-1} \tilde X_p u_q. \tag{21} $$

Applying the expectation operator to the above expression, and using $E(\epsilon\epsilon^T) = \Sigma$, gives the desired result for the Fisher information matrix, which reads

$$ F_{[rs],[qp]} = \left(\tilde X_s u_r\right)^T \Sigma^{-1} \tilde X_p u_q. \tag{22} $$

The resulting matrix $F \in \mathbb{R}^{NM\times NM}$ can be thought of as consisting of submatrices $\tilde F_{sp} \in \mathbb{R}^{M\times M}$:

$$ F = \begin{bmatrix} \tilde F_{11} & \cdots & \tilde F_{1N} \\ \vdots & \ddots & \vdots \\ \tilde F_{N1} & \cdots & \tilde F_{NN} \end{bmatrix}, \tag{23} $$

with

$$ \tilde F_{sp} = U^T \tilde X_s \Sigma^{-1} \tilde X_p U. \tag{24} $$

For the special case of a linear FIR filter, that is, $N = 1$, the Fisher information matrix reads, using (19),

$$ F = \tilde F_{11} = U^T \Sigma^{-1} U, \tag{25} $$

which, for $\Sigma = \sigma^2 I$, gives the familiar result [1, page 86]

$$ F^{-1} = \sigma^2 \left(U^T U\right)^{-1} \tag{26} $$

for the Cramer-Rao bound for linear FIR filters.

4. PARAMETER ESTIMATION

For parameter estimation, the likelihood function $l(\theta|z)$ is maximised with respect to $\theta$ using methods of nonlinear optimisation. The optimisation problem is given as

$$ \hat\theta = \arg\min_{\theta} J(\theta), \qquad J(\theta) \equiv -\ln l(\theta|z), \tag{27} $$

and $\hat\theta \equiv \mathrm{vec}(\hat H)$. For the FIR Wiener-type model of (5), the gradient $g \equiv \partial_\theta J(\theta)$ as well as the Hessian $G \equiv \partial_{\theta\theta^T} J(\theta)$ of (14) can be computed explicitly. Following the matrix notation for the model parameters, the gradient can be written in matrix form. Define the gradient matrix $\partial_H$ as composed of the gradient vectors for each order of nonlinearity

$$ \partial_H \equiv \left[ \partial_{h_1},\ldots,\partial_{h_N} \right], \tag{28} $$

where $H \equiv [h_1,\ldots,h_N]$ and $\partial_\theta = \mathrm{vec}(\partial_H)$. Applied to the objective function $J(\theta)$, the elements are found to be

$$ \partial_{h_s} J(H) = -U^T \tilde X_s \Sigma^{-1} \epsilon. \tag{29} $$

In correspondence to the matrix structure of the Fisher information matrix in (24), the "off-diagonal" submatrices of the Hessian matrix are

$$ G_{sp} \equiv \partial_{h_s h_p^T} J(H) = U^T \tilde X_s \Sigma^{-1} \tilde X_p U \quad \text{for } s \ne p. \tag{30} $$

The diagonal submatrices given in component notation read

$$ G_{[rs][qs]} \equiv \partial_{H_{rs} H_{qs}} J(H) = u_r^T \tilde X_s \Sigma^{-1} \tilde X_s u_q + s(s-1)\, \epsilon^T \Sigma^{-1}\, \mathrm{diag}\!\left( x_s^{[s-2]} \right) \mathrm{diag}(u_r)\, u_q. \tag{31} $$

Applying (13) to (30) and (31) and acknowledging the fact that $\epsilon$ is a zero-mean process, so that the term linear in $\epsilon$ in (31) vanishes in expectation, the Fisher information matrix (24) is recovered. Since (29), (30), and (31) provide first- and second-order derivatives, it is possible to apply a Newton-like optimisation algorithm [7] for the minimisation of (27). This algorithm uses the quadratic approximation of $J(\theta)$ around some estimate $\theta^{(k)}$ obtained after $k$ iterations

$$ J\!\left(\theta^{(k)} + \delta\right) \approx J\!\left(\theta^{(k)}\right) + \delta^T g^{(k)} + \tfrac{1}{2}\, \delta^T G^{(k)} \delta, \tag{32} $$

with $\delta = \theta - \theta^{(k)}$. For each iteration $k$, the quadratic approximation is minimised with respect to $\delta$, where $g^{(k)}$ and $G^{(k)}$ denote the gradient and Hessian evaluated at $\theta^{(k)}$, respectively. For this task, the Matlab routine fminunc.m [8] is applied. This procedure requires good initialisation to converge to the global minimum of the objective function $J(\theta)$, which is in general multimodal. In this case, the maximum likelihood estimator (27) yields an unbiased estimate. Furthermore, the maximum likelihood estimator is a minimum variance estimator [1], thus the variance of this estimator coincides with the Cramer-Rao bound.

5. VERIFICATION OF THE THEORETICAL RESULT

The above result (24) for the Fisher information matrix of the Wiener-type model is verified by simulation examples. For this purpose, a Wiener-type system is defined and will serve as a reference system for the subsequent simulations.

Table 1: Model coefficients of the third-order Wiener-type reference model of the line-driver circuit.

Tap       k = 0    k = 1    k = 2     k = 3    k = 4     k = 5
h_1[k]    4.2299   1.3909   −1.0805   0.7283   −0.3481   0.0931
h_3[k]    0.0511   0.1537   −0.2463   0.1418   −0.0314   0.0009

[Figure 7: Absolute value of the linear transfer function $H_1(e^{j\omega})$ of the Wiener-type reference model of Table 1. Axes: normalised frequency (×π) versus magnitude (dB).]

The verification is done by comparing the theoretical parameter
variance obtained from the Fisher information matrix (24) with the parameter variance obtained by repeated estimation of the model parameters with the algorithm described in Section 4. As this estimator is a minimum variance estimator, the two variances are expected to match. This coincidence is checked for DMT input signals as well as for white Gaussian noise (WGN) input signals over different signal-to-noise ratio (SNR) levels.

5.1. The reference model

For the simulation, a specific reference configuration of the Wiener-type model is chosen. This reference configuration is a simple discrete-time model of an ADSL G.Lite line-driver circuit [9]. To present reproducible results, the simplest model of the circuit was chosen as the reference model and explicit values of the model coefficients are given. It is a third-order model encompassing 12 coefficients $\theta_j$. Through the differential design of the circuit, the effects of nonlinearities of even orders are negligible compared to the effects of the nonlinearities of odd orders. Thus, the model consists only of a dominating linear part with $M_1 = 6$ and of a small part of third order with $M_3 = 6$. The explicit values of the model coefficients are given in Table 1. They were found originally by identifying the line-driver circuit using a broadband DMT input signal and the estimation algorithm of Section 4. The model equation for this case reads

$$ z[n] = \sum_{k=0}^{5} h_1[k]\, u[n-k] + \left( \sum_{k=0}^{5} h_3[k]\, u[n-k] \right)^{3} + \epsilon[n]. \tag{33} $$

[Figure 8: Absolute value of the cubic transfer function $H_3(e^{j\omega})$ of the Wiener-type reference model of Table 1. Axes: normalised frequency (×π) versus magnitude (dB).]

Written in the compact notation of Section 3, this gives

$$ z = P\!\left(U H_r\right) + \epsilon, \tag{34} $$

with the reference coefficient matrix $H_r \in \mathbb{R}^{6\times 2}$. Frequency responses for the linear part $H_1(e^{j\omega}) = \mathcal{F}(h_1[k])$ and for the cubic part $H_3(e^{j\omega}) = \mathcal{F}(h_3[k])$ of the reference model are depicted in Figures 7 and 8, respectively.
The linear response shows the typical lowpass characteristic of a power amplifier, while the third-order response reflects the common observation that the nonlinear distortion gets higher for higher frequencies. In Figure 9, the power spectrum of the output signal of the Wiener-type reference model of Table 1 is shown for a typical downstream ADSL DMT signal as input. The magnitude of the intermodulation products indicates that the nonlinear distortion introduced by the third-order term is 60 dB below the carrier signal. Thus, we are dealing with an extremely weak nonlinear system.

[Figure 9: Power spectrum of the output of the Wiener-type reference model of Table 1 for the line-driver circuit: DMT input signal with $N_c = 95$ carriers; the perturbation is additive WGN with $\sigma^2 = 1\times10^{-5}$. Axes: frequency (×π) versus normalised power (dB).]

Subsequently, the Fisher information matrix of (24) and its inverse are computed for this reference model. In correspondence to the partitioning (24) of the Fisher information matrix for $\Sigma = \sigma^2 I$,

$$ F = \sigma^{-2} \begin{bmatrix} U^T U & U^T \tilde X_3 U \\ U^T \tilde X_3 U & U^T \tilde X_3 \tilde X_3 U \end{bmatrix}, \tag{35} $$

the positive-definite covariance matrix can be decomposed into four submatrices:

$$ \mathrm{cov}\!\left(\theta\theta^T\right) = \begin{bmatrix} \mathrm{cov}\!\left(h_1 h_1^T\right) & \mathrm{cov}\!\left(h_1 h_2^T\right) \\ \mathrm{cov}\!\left(h_2 h_1^T\right) & \mathrm{cov}\!\left(h_2 h_2^T\right) \end{bmatrix}. \tag{36} $$

[Figure 10: Cramer-Rao lower bound on the parameter covariance matrix $\mathrm{cov}(\theta\theta^T)_{ij}$ with $M = 6$, first- and third-order nonlinearity, and $N_s = 1000$; the perturbation is WGN with $\sigma^2 = 1\times10^{-5}$ and $u[n]$ is a WGN input signal with power $\sigma_u^2 = 0.64$. Axes: row index $i$ and column index $j$ versus magnitude (scale ×10^{-7}).]

In Figure 10, the parameter covariance matrix $\mathrm{cov}(\theta\theta^T)_{ij}$ for the Wiener-type reference model is shown for the case $N_s = 1000$ and $\sigma^2 = 1\times10^{-5}$ for a WGN input signal with variance $\sigma_u^2 = 0.64$. The figure reveals that there is a high covariance between the linear parameters and the third-order parameters.
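The bound for the reference model can be reproduced numerically from (24) and Table 1. The sketch below, in Python with NumPy, is illustrative only: it assumes white noise with $\Sigma = \sigma^2 I$, draws one WGN input realisation with a seed chosen here, and uses function names (`data_matrix`, `fisher_wiener`, `crb_reference`) that are not from the paper. It builds the Jacobian block by block as $\tilde X_s U$, so that $J^T J/\sigma^2$ has exactly the submatrices of (24).

```python
import numpy as np

# Table 1: linear and third-order kernels of the reference model.
H1 = np.array([4.2299, 1.3909, -1.0805, 0.7283, -0.3481, 0.0931])
H3 = np.array([0.0511, 0.1537, -0.2463, 0.1418, -0.0314, 0.0009])


def data_matrix(u, M):
    """Rows [u[n], u[n-1], ..., u[n-M+1]]; the first M-1 samples act as history."""
    n_s = u.size - M + 1
    return np.array([u[i:i + M][::-1] for i in range(n_s)])


def fisher_wiener(U, kernels, orders, sigma2):
    """Fisher matrix with blocks U^T X~_s X~_p U / sigma2, cf. (24), Sigma = sigma2 * I."""
    blocks = []
    for h, s in zip(kernels, orders):
        x = U @ h                                    # branch signal x_s
        blocks.append((s * x ** (s - 1))[:, None] * U)  # X~_s U with X~_s = s diag(x_s^[s-1])
    J = np.hstack(blocks)
    return J.T @ J / sigma2


def crb_reference(n_s=1000, sigma2=1e-5, sigma_u2=0.64, seed=0):
    """Fisher matrix and its inverse for one WGN input realisation (seed is arbitrary)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(scale=np.sqrt(sigma_u2), size=n_s + H1.size - 1)
    U = data_matrix(u, H1.size)
    F = fisher_wiener(U, [H1, H3], [1, 3], sigma2)
    return F, np.linalg.inv(F)
```

The diagonal of the returned inverse gives the per-parameter bounds, and its off-diagonal 6×6 blocks correspond to the cross-covariance submatrices of (36). The exact numbers depend on the input realisation, so they should be expected to resemble Figure 10 only qualitatively.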
That corresponds to the known fact that even in the case of a white input signal, the homogeneous first- and third-order responses of a multilinear operator, such as a Volterra model, are correlated [10].

5.2. Parameter estimation and variance comparison

In the following, the derivation of Section 3 is verified using different excitation signals and different perturbation scenarios. These investigations of the Wiener-type reference model of Table 1 are done with an estimation horizon of $N_s = 50$. The variance estimates of the estimators are obtained by repeating the identification procedure of Section 4 for $N_r = 100$ i.i.d. realisations of the perturbation process $\epsilon[n]$. Following the asymptotic results of the normality of the maximum likelihood estimator [11, page 52], the parameter estimates pass the Lilliefors test for normality [12]. Thus, the 95% confidence intervals of a normal distribution are indicated in the following figures. To keep these figures simple, the Cramer-Rao bound $\mathrm{diag}(F^{-1})$ and the variance estimates $\mathrm{var}(\theta)$ of only one model parameter per order of nonlinearity $p$ are shown versus different SNR.

5.2.1. WGN input signal

The input signal $u[n]$ to the reference model is taken to be WGN, $u[n] \sim \mathcal{N}(0, \sigma_u^2)$ with $\sigma_u^2 = 0.64$, while the additive perturbation of the output $y[n]$ is $\epsilon[n] \sim \mathcal{N}(0, \sigma^2)$. The Cramer-Rao bound, the variance estimates of the estimators, and their corresponding confidence regions versus different SNR levels are given in Figure 11. Good agreement between simulation and theory can be observed.

[Figure 11: Linear dependence of the Cramer-Rao bound (dashed) on the SNR and variance of the estimators (solid) over different SNR with 95% confidence intervals shown as vertical bars, plotted for one kernel value for each order $p$; the two upper curves correspond to parameter $H_{12} = h_3[0]$; the two lower curves correspond to parameter $H_{11} = h_1[0]$; the input signal is WGN. Axes: SNR (dB) versus var(θ) (dB).]

5.2.2. DMT input signal

As a second scenario, the input signal $u[n]$ is taken to be a DMT signal:

$$ u[n] = \sum_{k=0}^{N_c-1} a_k \cos\!\left( (k_s + k)\,\omega_0 n + \varphi_k \right), \tag{37} $$

where $\omega_0$ is the normalised grid frequency of the DMT signal. For further use, we define the vector of amplitudes $a \equiv [a_0,\ldots,a_{N_c-1}]^T$, the corresponding vector of powers of the individual tones $p$, and the vector of normalised frequencies $\omega \equiv \omega_0 \cdot [k_s, k_s+1,\ldots,k_s+N_c-1]^T$. The phase set $\varphi \equiv [\varphi_0,\ldots,\varphi_{N_c-1}]^T$ for this simulation is initialised with random numbers drawn from the uniform distribution $U[0, 2\pi]$. The identification of the reference model is performed using $N_c = 12$ tones and is done for different SNR levels. The Cramer-Rao bound, the variance estimates of the estimators, and their corresponding confidence regions versus different SNR levels are given in Figure 12. Once again, good agreement between simulation and theory can be observed.

[Figure 12: Linear dependence of the Cramer-Rao bound (dashed) on the SNR and variance of the estimators (solid) over different SNR with 95% confidence intervals shown as vertical bars, plotted for one kernel value for each order $p$; the two upper curves correspond to parameter $H_{12} = h_3[0]$; the two lower curves correspond to parameter $H_{11} = h_1[0]$; the input signal is a DMT signal with $N_c = 12$. Axes: SNR (dB) versus var(θ) (dB).]

6. DESIGN OF OPTIMAL EXCITATION SIGNALS

Given a model structure with unknown parameters, the accuracy of the parameter estimates of the model depends on the identification procedure used and on the excitation signal used. If the estimator is a minimum variance estimator, then its parameter variance achieves the lower bound, that is, the Cramer-Rao bound.
Thus, to decrease the variance of the minimum variance estimator of Section 4 even further, one can only optimise the excitation signal in such a way that the corresponding Cramer-Rao bound is decreased. To have an optimality measure, a scalar objective function $\Psi : \mathbb{R}^{MN\times MN} \to \mathbb{R}$ of $F^{-1}$ has to be found. In the theory of experiment design [13], different types of this objective function $\Psi(\cdot)$ are considered. The most popular criterion of optimality is $\Psi(F^{-1}) = |F^{-1}| = |F|^{-1}$, where $|\cdot|$ denotes the determinant of a matrix.

6.1. Signal design for linear FIR filters

In this section, the well-known problem of optimising the amplitude distribution of a DMT signal subject to a total power constraint so as to achieve minimal variance estimates of the parameters of a linear FIR filter is reviewed. For a WGN perturbation, the Fisher information matrix for the linear FIR filter case is given by (26). As mentioned earlier, one way to minimise the Cramer-Rao bound is to maximise the determinant of $F$. We apply the inequality $\log x \le \alpha x - 1 - \log\alpha$, which holds for every $\alpha > 0$ because the right-hand side is the tangent to the concave logarithm at $x = 1/\alpha$, to the $M$ eigenvalues $\lambda_k$ of the positive-semidefinite matrix $F$:

$$ \sum_{k=1}^{M} \log\lambda_k \le \alpha \sum_{k=1}^{M} \lambda_k - M(1 + \log\alpha). \tag{38} $$

Inequality (38) is equivalent to

$$ \log|F| \le \alpha\, \mathrm{Tr}(F) - M(1 + \log\alpha), \tag{39} $$

with $\mathrm{Tr}(\cdot)$ denoting the trace of a matrix. The quantity $\log|F|$ reaches its upper bound at $\lambda_k = \lambda = 1/\alpha$ for $k = 1,\ldots,M$. The consequences of this relation for signal optimisation are outlined in the following example. Consider the case where $N_s$ is the period of the DMT signal (37). The diagonal elements of $F$ are all equal and correspond to the constrained total power of the DMT signal, that is, $\mathrm{Tr}(F) = \sigma^{-2} M N_s \sum_k p_k$. Thus, for a given power of the DMT signal, the right-hand side of (39) is fixed and gives the upper bound for $\log|F|$. It reaches its upper bound if the eigenvalues are all equal to $\lambda = 1/\alpha$ with $\alpha = \sigma^2 / (N_s \sum_k p_k)$. Furthermore, if we assume that $M$ is even and $M = N_s$, with (7) and (26), the matrix $F$ turns out to be a circulant.
Thus, the similarity transformation which diagonalises $F$ is the discrete Fourier transform (DFT) $T \in \mathbb{C}^{M\times M}$ and the eigenvalues of $F$ are the diagonal elements of $S = TFT^{-1}$ [14, page 379]. If the frequency spacing of the DMT signal (37) is chosen to be $\omega_0 = 2\pi/M$ and $k_s = 0$, the eigenvalues of $F$ correspond to the discrete power spectrum of the DMT signal. The matrix $F$ is nonsingular for $k = 0,\ldots,M/2$, which corresponds to $N_c = M/2 + 1$ tones of the DMT signal. The tones at $k = 0$ and $k = M/2$ contribute one spectral component to the discrete power spectrum each, while all other tones contribute two spectral components each. Thus, the eigenvalues of $F$ are all equal and $\log|F|$ reaches its upper bound if the $(M/2+1)$-element amplitude vector of the DMT signal has the form $a = [a/2, a,\ldots,a, a/2]^T$. This is in accordance with the engineering intuition that for a finite number of tones and a predetermined power of the DMT signal, the most accurate parameter estimation is possible if the power is equally distributed over all spectral components. Note that the above example is constructed in such a way that the frequency grid of the DMT signal spans the full bandwidth, that is, $\omega = 2\pi/M \cdot [0, 1,\ldots,M/2]^T$.

In general, the circularity of $F$ is preserved if $N_s = mN_p$ and $M = N_p$, where $N_p$ is the period of the DMT signal and $m \in \mathbb{N}$. In such situations, every $m$th spectral component of the DMT signal (37) with $\omega_0 = 2\pi/M$ and $N_c = M/2 + 1$ is nonzero and corresponds to an eigenvalue of the matrix $F$. From the above considerations, it is clear that for a frequency spacing $\omega_0 = 2\pi/M$ and $N_c < M/2 + 1$, at least one eigenvalue of $F$ is exactly zero. Thus, the corresponding estimation problem is an ill-posed one. As soon as the constraints $N_s/M \in \mathbb{N}$ and $\omega_0 = 2\pi/M$ do not hold, the one-to-one correspondence between an eigenvalue of $F$ and a nonzero spectral component of the DMT signal is lost. Thus, in the general case, one tone of the DMT signal impacts more than one eigenvalue of $F$.
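The full-bandwidth example with $N_s = M$ can be checked numerically. The following Python sketch, with illustrative function names and values, builds one period of the DMT signal (37) on the grid $\omega_0 = 2\pi/M$, $k_s = 0$, forms the periodic data matrix, and returns $F = U^T U/\sigma^2$. It assumes zero phase for the two edge tones at $\omega = 0$ and $\omega = \pi$, because the spectral magnitude of those two tones depends on their phase; the inner phases are arbitrary.

```python
import numpy as np


def dmt(n, amps, freqs, phases):
    """DMT signal (37): sum of cosines evaluated at the integer times n."""
    n = np.asarray(n)[:, None]
    return np.sum(amps * np.cos(freqs * n + phases), axis=1)


def fir_fisher_one_period(amps, phases, sigma2):
    """F = U^T U / sigma2 for N_s = M and tones k = 0, ..., M/2 on the grid 2*pi/M."""
    amps = np.asarray(amps, dtype=float)
    M = 2 * (amps.size - 1)
    k = np.arange(amps.size)
    u = dmt(np.arange(M), amps, 2 * np.pi / M * k, np.asarray(phases, dtype=float))
    # Periodic data matrix: entry (n, m) is u[(n - m) mod M].
    idx = (np.arange(M)[:, None] - np.arange(M)[None, :]) % M
    U = u[idx]
    return U.T @ U / sigma2


a, sigma2 = 0.4, 1e-5
phases = [0.0, 0.7, 2.1, 0.0]                       # edge tones at zero phase (assumption)
F_opt = fir_fisher_one_period([a / 2, a, a, a / 2], phases, sigma2)
F_equal_amps = fir_fisher_one_period([a, a, a, a], phases, sigma2)
```

For $M = 6$, `F_opt` comes out as a multiple of the identity with all eigenvalues equal to $M^2 a^2/(4\sigma^2)$, which matches $\lambda = \sigma^{-2} N_s \sum_k p_k$ from the discussion of (39). Giving all four tones the same amplitude, as in `F_equal_amps`, makes the DC and Nyquist components four times stronger in power than the others, so the eigenvalues are no longer equal.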
In this case, the amplitude distribution of the DMT signal that maximises $\log|F|$ has to be found through numerical optimisation methods.

In [15], it is shown that, for linear FIR filters, the maximisation of $\log|F|$ subject to the signal power constraint $\sum_k p_k \le 1$ leads to a semidefinite programming problem which can be solved efficiently [16]. More explicitly, the semidefinite program takes the form

$$ \max_{p}\ \log\left|F(p)\right|, \quad \text{subject to } F(p) \ge 0,\ \tilde p \ge 0, \tag{40} $$

with $\tilde p \equiv [1 - \sum_k p_k,\ p_0,\ldots,p_{N_c-1}]^T$. The key observation that allows this elegant formulation is that the Fisher information matrix for a period of a DMT signal is the weighted sum of partial Fisher information matrices corresponding to each tone of the DMT signal. The weights turn out to be the powers $p_k$ of the individual tones. Following this approach, the optimal excitation signals for a linear FIR filter are found subsequently. From (25), it is clear that the amplitude distribution of the optimal DMT signal does not depend on the model parameters. In correspondence to the linear part of the reference model of Table 1, the optimal amplitude distribution for an $M = 6$ linear FIR filter is computed.

6.1.1. DMT signal with bandwidth [0, π]

To guarantee that the matrix $F$ is nonsingular, the above considerations suggest that at least $N_c = M/2 + 1 = 4$ tones are required if tones at $\omega = 0$ and $\omega = \pi$ are included. The optimised amplitude distribution found by semidefinite programming is given in Figure 13.

[Figure 13: Optimal amplitude distribution of a DMT signal over the full bandwidth [0, π] encompassing $N_c = 4$ tones for the estimation of an $M = 6$ FIR filter. Axes: normalised frequency (×π) versus amplitude.]

This amplitude distribution corresponds to a flat signal spectrum because the spectral components for $\omega_k = 0$ and $\omega_k = \pi$ scale differently (by a factor of 2) than the other components.
Thus, for a finite number of tones, a finite sample length $N_s$ equal to the period of the signal, and full bandwidth, the spectrum of the optimal DMT signal turns out to be flat. For many applications, the number of tones of the excitation signal is not exactly $N_c = M/2 + 1$, but higher. Also for such a case with $N_c > M/2 + 1$, the optimal amplitude distribution over the full bandwidth [0, π] is found to be spectrally flat. More interesting observations can be made for a bandpass DMT signal in the next section.

6.1.2. DMT signal with bandwidth (0, π/2)

In the case of a bandpass signal, where neither the frequency $\omega = 0$ nor $\omega = \pi$ is included, each tone contributes two spectral components and thus the minimum number of tones required for the estimation of the linear FIR filter is $N_c = M/2$. The optimal amplitude distribution for an $N_c = 3$ bandpass signal found using semidefinite programming is depicted in Figure 14. Thus, for the bandpass signal with $N_c = M/2$, the optimal spectral distribution is flat over the given bandwidth (0, π/2). But, if more than $M/2$ tones are contained in the DMT signal, the optimal amplitude distribution is no longer spectrally flat. This is exemplified for the case $N_c = 12$ in Figure 15. The figure shows, in addition to the optimal amplitude distribution, the spectrally flat amplitude distribution as a reference.

[Figure 14: Optimal amplitude distribution for a bandpass DMT signal encompassing $N_c = 3$ tones for the estimation of an $M = 6$ FIR filter. Axes: normalised frequency (×π) versus amplitude.]

[Figure 15: Amplitude distribution of a bandpass DMT signal encompassing $N_c = 12$ tones for the estimation of an $M = 6$ FIR filter: optimised signal (circles) and, for reference, the spectrally flat signal (crosses). Axes: normalised frequency (×π) versus amplitude.]
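The weighted-sum structure behind (40) can be explored without a semidefinite programming solver. The Python sketch below is not the method of [15, 16]: it swaps in the classical multiplicative update from the optimal-design literature, $w_k \leftarrow w_k\, \mathrm{Tr}(F(w)^{-1} C_k)/M$, which keeps the weights on the simplex. It assumes a bandpass grid with $k_s = 1$, $\omega_0 = 2\pi/N_s$, $N_s = 56$, and $N_c = 12$ (the paper does not list the exact tone frequencies), white noise, and $N_s$ equal to one signal period. For a tone of power $p_k$ strictly inside $(0, \pi)$, the partial Fisher matrix over one period is $(N_s/\sigma^2)\, p_k C_k$ with $(C_k)_{mm'} = \cos(\omega_k (m - m'))$.

```python
import numpy as np


def tone_matrices(freqs, M):
    """C_k with entries cos(w_k (m - m')): partial Fisher matrix of one tone
    per unit tone power, up to the common factor N_s / sigma^2."""
    d = np.arange(M)[:, None] - np.arange(M)[None, :]
    return np.array([np.cos(w * d) for w in freqs])


def fisher(p, C, n_s, sigma2):
    """Weighted sum F(p) = (N_s / sigma^2) * sum_k p_k C_k."""
    return n_s / sigma2 * np.tensordot(p, C, axes=1)


def log_det(F):
    """log|F|, or -inf if F is not numerically positive definite."""
    sign, val = np.linalg.slogdet(F)
    return val if sign > 0 else -np.inf


def d_optimal_powers(C, total_power=1.0, iters=500):
    """Multiplicative update w_k <- w_k * tr(F(w)^-1 C_k) / M (not an SDP solver)."""
    n_c, M, _ = C.shape
    w = np.full(n_c, 1.0 / n_c)                       # start from the flat distribution
    for _ in range(iters):
        F_inv = np.linalg.inv(np.tensordot(w, C, axes=1))
        w = w * np.einsum('ij,kji->k', F_inv, C) / M  # tr(F^-1 C_k) for every tone k
    return total_power * w


n_s, M, sigma2 = 56, 6, 1e-5
freqs = 2 * np.pi * np.arange(1, 13) / n_s            # assumed bandpass grid in (0, pi/2)
C = tone_matrices(freqs, M)
p_flat = np.full(12, 1.0 / 12)
p_opt = d_optimal_powers(C)
gain = log_det(fisher(p_opt, C, n_s, sigma2)) - log_det(fisher(p_flat, C, n_s, sigma2))
```

The quantity `gain` is the increase of $\log|F|$ over the spectrally flat power distribution; the update is expected to be non-decreasing in $\log|F|$, so `gain` should not be negative. The scale factor $N_s/\sigma^2$ does not affect the maximiser, which is why the iteration works with the unscaled sum.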
Thus, for general bandpass DMT signals, it turns out that the optimal spectral distribution is not flat over the given bandwidth (0, π/2). In the next section, this result is verified through estimation runs using the optimal $N_c = 12$ DMT signal and the spectrally flat $N_c = 12$ DMT signal.

[Figure 16: Mean and 95% confidence region of the estimated standard deviation of the linear FIR filter parameter estimates for a bandpass input signal with $N_c = 12$ tones: spectrally flat amplitude distribution (crosses), optimised amplitude distribution (circles); the perturbation is WGN with $\sigma^2 = 1\times10^{-5}$ and the estimation horizon is $N_s = 56$. Axes: parameter number versus standard deviation (scale ×10^{-3}).]

6.1.3. Comparison of the estimation performance of bandpass DMT signals

Now that the optimal bandpass input signal for a linear FIR filter is found, the signal can be applied to the identification of a given linear FIR filter. The result is then compared with the identification result obtained by applying the bandpass signal with a flat spectral distribution for the given bandwidth (0, π/2). For this, the linear part of the Wiener-type model of Table 1 is used as the reference linear FIR filter and input-output data, that is, $\{u[n], z[n]\}$, are measured. For identification, the unbiased minimum variance estimator (UMVE) [1, page 87] for the linear FIR filter case,

$$ \hat\theta = \left(U^T U\right)^{-1} U^T z, \tag{41} $$

is applied both for the optimal bandpass sequence and for the spectrally flat bandpass sequence. The variance of the estimate $\hat\theta$ is computed by performing the estimation (41) over $N_r = 1000$ i.i.d. noise realisations of the perturbation process $\epsilon[n] \sim \mathcal{N}(0, \sigma^2)$ with $\sigma^2 = 1\times10^{-5}$ and $N_s = 56$. The estimated standard deviations of each FIR filter parameter are shown for these signals in Figure 16. In addition, the Cramer-Rao bounds for both signals and each parameter are computed.
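The comparison between the estimator (41) and the bound (26) can be sketched as a small Monte Carlo experiment in Python. This is an illustration under stated assumptions, not the paper's exact setup: the function name `ls_variance_vs_crb`, the seed, and the default number of runs are choices made here, and any full-column-rank data matrix `U` can be passed in, for example one built from a bandpass DMT signal.

```python
import numpy as np


def ls_variance_vs_crb(U, theta, sigma2, n_runs=2000, seed=0):
    """Empirical mean and variance of the estimator (41) versus the bound (26)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta, dtype=float)
    gram_inv = np.linalg.inv(U.T @ U)
    estimator = gram_inv @ U.T                      # (U^T U)^-1 U^T of (41)
    # One noise realisation per column; the noise-free output is shared.
    noise = rng.normal(scale=np.sqrt(sigma2), size=(U.shape[0], n_runs))
    Z = (U @ theta)[:, None] + noise
    est = estimator @ Z                             # one estimate per column
    crb = sigma2 * np.diag(gram_inv)                # diagonal of (26)
    return est.mean(axis=1), est.var(axis=1, ddof=1), crb
```

Because (41) is linear in the Gaussian noise, the empirical variance scatters around the bound with a relative standard deviation of roughly $\sqrt{2/N_r}$, which is the kind of spread the confidence regions in Figure 16 reflect.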
All bounds lie in the indicated 95% confidence region. To keep the figure simple, the bounds are not shown in Figure 16. The result shows clearly that the optimised DMT signal, which is not spectrally flat, outperforms the spectrally flat reference DMT signal. The relative reduction of the parameter variance averaged over all FIR filter parameters comes out to be 26.01% or 1.45 dB. The following remarks can be made.

(1) To be able to apply semidefinite programming, the estimation horizon $N_s$ has to match multiples of the period of the DMT signal. In this case, the phase distribution $\varphi$ falls out of the optimisation problem.

(2) The characteristic shape of the variance as a function of the parameter index as plotted in Figure 16 can be explained by the spectral decomposition of the matrix $F$. Due to the band limitation, the eigenvalue spread of the matrix $F$ is of the order $1\times10^{3}$. Therefore, $F^{-1}$ is governed by the smallest eigenvalue $\lambda_k$ of $F$ and can be approximated by $F^{-1} \approx \lambda_k^{-1} v_k v_k^T$, where $v_k$ is the corresponding eigenvector of $F$. Thus, the characteristic shape in Figure 16 is primarily determined by the shape of the eigenvector corresponding to the smallest eigenvalue of $F$.

6.2. DMT signal design for the Wiener-type model

As the Wiener-type model of (5) is a nonlinear-in-the-parameters model, its Fisher information matrix (24) depends on the model parameters. In contrast to the FIR filter case, for each model parameter set, an optimal excitation signal can be defined. Furthermore, the entries of the Fisher information matrix correspond to higher-order moments of the input signal. Therefore, the optimal DMT signal is not only determined by its amplitude distribution but also by its phase distribution $\varphi$. This implies that, even in the case where the estimation horizon $N_s$ is the period of the DMT signal, the entire Fisher information matrix cannot be written as a weighted sum of the partial Fisher information matrices for each tone of the DMT signal.
Because the Fisher information matrix does not decompose in this way, the formulation of the signal optimisation problem as a semidefinite program is not possible for the Wiener-type model. The optimisation problem reads

$$\max_{p, \varphi} \; \log |F(p, \varphi)|, \quad \text{subject to } F(p, \varphi) \geq 0, \; \tilde{p} \geq 0, \qquad (42)$$

where the objective function $\log |F(p, \varphi)|$ and the constraint for the positive semidefiniteness $F(p, \varphi) \geq 0$ are now nonlinear functions of the optimisation variables $p$ and $\varphi$. To the best of the authors' knowledge, no optimisation algorithm is available that combines a nonlinear objective function with a nonlinear semidefinite matrix constraint. Furthermore, for the above optimisation problem and for the rest of Section 6.2, it is assumed that the reference model coefficients of Table 1 are known, whereas in reality they are not. In Section 6.3, a practical solution to circumvent this unrealistic assumption is presented.

6.2.1. Design of optimal QAM-DMT signals

To still be able to illustrate the role of optimal signal design for the Wiener-type model, we restrict the considered signal class to a subclass of DMT signals with a finite number of members. The optimal excitation signal from this subclass can then be determined by a complete search over all members of the subclass. A realistic subclass is the class of DMT signals that are modulated according to a specific QAM (quadrature amplitude modulation) scheme. The amplitudes and phases of the tones can now vary only on the quantised levels of the QAM constellation. In the following simulation experiments, an eight-point QAM for each of the $N_c$ tones is applied.

Figure 17: Eight-point QAM signal constellation.

Figure 18: Optimal amplitude distribution of the bandpass eight-point QAM-DMT signal encompassing $N_c = 6$ tones for the estimation of the Wiener-type model of Table 1. (Axes: normalised frequency, $\times \pi$; amplitude.)
The amplitude quantisation is done in such a way that if all $N_c$ tones occupy the outer ring of the QAM constellation, the signal power is $p_k = 0.64$. The QAM constellation used is depicted schematically in Figure 17. The optimal amplitude distribution for an eight-point QAM-DMT bandpass signal with $\omega \in (0, \pi/2)$, $N_c = 6$, which maximises $\log |F(p, \varphi)|$, found through a complete search for the nonlinear reference model of Table 1, is shown in Figure 18. For the 12-parameter Wiener-type reference model, a DMT signal with at least $N_c = 6$ tones has to be applied to prevent ill-posedness of the estimation problem. From the insight gained through the simulation experiments, the following remarks can be made.

(1) Due to the experiment setup, it comes as no surprise that the amplitude distribution of the optimal excitation signal for the Wiener-type model is spectrally flat. The reason is that, roughly speaking, the Cramer-Rao bound can be seen as a noise-to-signal power ratio, and thus the bound is lowered if more signal power is applied to the corresponding system. Therefore, for the optimal signal, all of the $N_c = 6$ tones occupy the outer QAM constellation points of Figure 17.

Figure 19: One period $N_s = 28$ of two discrete-time input signals for the Wiener-type model: signal with optimal QAM constellation (circles) and suboptimal signal (crosses) with the same amplitude but different phase distribution than the optimal signal. (Axes: sample; amplitude.)

(2) In contrast to the linear FIR filter case, the phase constellation turns out to be of crucial importance even for $N_s$ being the signal period. It is observed that even input signals with the same amplitude distribution but different phase sets $\varphi$ than the optimal input signal can lead not only to very high Cramer-Rao bounds but even to biased estimates.
These biased estimates are caused by the practical problem that, for these special phase sets $\varphi$, the Hessian matrix of the estimator of Section 4 becomes nearly singular and thus the optimisation algorithm fails to converge.

Note that these observations have severe implications for the methodology of nonlinear system identification. An improper choice of the phase set of the DMT excitation signal can lead to an extremely ill-posed estimation problem.

6.2.2. Comparison of the estimation performance for QAM-DMT signals

As a consequence of the above remarks, we present an estimation performance comparison between the optimal input signal (determined by its phase and amplitude distribution) and an input signal with the same amplitude but different phase distribution, which still allows an unbiased estimation, that is, allows convergence of the optimisation algorithm. The two discrete-time signals whose estimation performance is compared are shown in Figure 19. The performance is evaluated by repeated identification of the reference Wiener-type model of Table 1 over $N_r = 500$ i.i.d. realisations of the perturbation process, which is distributed as $\mathcal{N}(0, \sigma^2)$ with $\sigma^2 = 1 \times 10^{-5}$. The resulting standard deviations of the estimates for the two excitation signals are shown in Figures 20 and 21 for the linear and cubic part of the Wiener-type model, respectively. In addition, the Cramer-Rao bounds for both signals and each model parameter are computed. All bounds lie in the indicated 95% confidence region. To keep the figures simple, they are not shown in Figures 20 and 21. The mean parameter variance and the variance gain for the two signals of Figure 19 are given in Table 2. [...]...
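The variance gain quoted for such a comparison is the ratio of the two mean variances expressed in decibels. As a quick consistency check, the sketch below (the function name is illustrative) applies $10 \log_{10}$ to the mean variances reported in the paper's two comparison tables, the one for the signals of Figure 19 and the one for the three-step procedure; the results round to the tabulated 6.18 dB and 8.85 dB.

```python
import math

def variance_gain_db(var_suboptimal, var_optimal):
    """Ratio of mean parameter variances expressed in decibels."""
    return 10.0 * math.log10(var_suboptimal / var_optimal)

# Mean variances as reported in the paper's comparison tables.
gain_fig19 = variance_gain_db(2.68e-4, 6.46e-5)       # suboptimal vs optimal signal
gain_three_step = variance_gain_db(1.58e-4, 2.06e-5)  # suboptimal vs three-step signal

print(f"gain for the signals of Figure 19: {gain_fig19:.2f} dB")
print(f"gain for the three-step procedure: {gain_three_step:.2f} dB")
```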
Result of the estimation comparison for optimal and suboptimal input signals of Figure 19 for the identification of the Wiener-type model of Table 1.
Mean variance for optimal signal: $6.46 \times 10^{-5}$
Mean variance for suboptimal signal: $2.68 \times 10^{-4}$
Mean variance gain: 6.18 dB

One can draw the important conclusion that, for a signal with optimal amplitude distribution but suboptimal phase distribution, the variances...

... respectively. The estimated variances for the two signals averaged over all parameters are given in Table 3. The following remarks can be made.

(1) One observes that the variance gain in Table 3 is larger than the gain in Table 2 obtained by the exact method of Section 6.2.2. An explanation of this counterintuitive effect is that the concatenation of two periods of one signal is just a scaling of the Cramer-Rao bound...

Result of the estimation comparison for the three-step input signal and for the suboptimal input signal of Figure 23 for the identification of the Wiener-type model of Table 1.
Mean variance, three-step: $2.06 \times 10^{-5}$
Mean variance, suboptimal: $1.58 \times 10^{-4}$
Mean variance gain: 8.85 dB

7. CONCLUSION

The Cramer-Rao bound for a Wiener-type nonlinear model has been derived. The parameter estimation algorithm maximises...

... variances of the parameter estimates can be an order of magnitude larger than for the optimal signal

$$\max_{p, \varphi} \; \log |F(p, \varphi, H)| \qquad (45)$$

(3) Perform a second estimation of the model parameters using the concatenation of the admissible DMT signal of step (1), $u_1[n]$, and the optimal DMT signal from step (2), $u_2[n]$. An illustration of this procedure is given in Figure 22. From this block diagram, it becomes clear that...
Figure 22: Block diagram of the three-step identification procedure. (Block labels: Estimator, SigOpt; signals: $u_1[n]$, $z[n]$.)

Figure 24: Mean and 95% confidence region of estimated standard deviations of the estimators for the linear part of the Wiener-type model for a bandpass QAM-DMT signal with $N_c = 6$: optimal input signal via the three-step procedure (circles) and...

... on the Fisher information matrix, which in this work is

Figure 21: Mean and 95% confidence region of estimated standard deviation of the estimates for the cubic part of the Wiener-type model for a bandpass QAM-DMT signal with $N_c = 6$: optimal input signal (circles) and suboptimal input signal (crosses); the perturbation is WGN with $\sigma^2 = 1 \times 10^{-5}$ and the estimation horizon is $N_s = 28$. (Axes: parameter number; standard deviation.)

Table...

... bound for one period by 1/2,

Figure 25: Mean and 95% confidence region of estimated standard deviations of the estimates for the cubic part of the Wiener-type model for a bandpass QAM-DMT signal with $N_c = 6$: optimal input signal via the three-step procedure (circles) and suboptimal input signal (crosses); the perturbation is WGN with $\sigma^2 = 1 \times 10^{-5}$ and the estimation...

... in form of a probability density function $p(H)$ [11, page 127], then one could optimise the criterion

$$\log E_H\left[\Psi\big(F(H)\big)\right],$$

Figure 20: Mean and 95% confidence region of estimated standard deviation of the estimators for the linear part of the Wiener-type model for a bandpass QAM-DMT signal with $N_c = 6$: optimal input signal (circles) and suboptimal input signal...
while the concatenation of two periods of two distinct signals impacts the Cramer-Rao bound in a more complicated way. Thus, even if one applies two periods of the optimal input signal of Section 6.2.2, the obtained mean variance turns out to be $3.16 \times 10^{-5}$, which is still higher than the mean variance obtained via the three-step procedure of Table 3.

(2) For weakly nonlinear analog circuits, such as the...

... (cf. Figure 19). The performance is once again evaluated by repeated identification of the reference Wiener-type model of Table 1 over $N_r = 500$ i.i.d. realisations of the perturbation process, which is distributed as $\mathcal{N}(0, \sigma^2)$ with $\sigma^2 = 1 \times 10^{-5}$, and with $N_s = 56$. The resulting standard deviations of the estimates for the two excitation signals are shown in Figures 24 and 25 for the linear and cubic parts of the Wiener-type model, ...