CHAPTER 1

1.1 Let

    r_u(k) = E[u(n) u*(n - k)]                                             (1)
    r_y(k) = E[y(n) y*(n - k)]                                             (2)

We are given that

    y(n) = u(n + a) - u(n - a)                                             (3)

Hence, substituting Eq. (3) into (2), and then using Eq. (1), we get

    r_y(k) = E[(u(n + a) - u(n - a))(u*(n + a - k) - u*(n - a - k))]
           = 2 r_u(k) - r_u(2a + k) - r_u(-2a + k)

1.2 We know that the correlation matrix R is Hermitian; that is,

    R^H = R

Given that the inverse matrix R^{-1} exists, we may write

    R^{-1} R^H = I

where I is the identity matrix. Taking the Hermitian transpose of both sides:

    R R^{-H} = I

Hence,

    R^{-H} = R^{-1}

That is, the inverse matrix R^{-1} is Hermitian.

1.3 For the case of a two-by-two matrix, we may write

    R_u = R_s + R_ν
        = [ r_11  r_12 ] + [ σ²  0  ]
          [ r_21  r_22 ]   [ 0   σ² ]
        = [ r_11 + σ²   r_12      ]
          [ r_21        r_22 + σ² ]

For R_u to be nonsingular, we require

    det(R_u) = (r_11 + σ²)(r_22 + σ²) - r_12 r_21 > 0

With r_12 = r_21 for real data, this condition reduces to

    (r_11 + σ²)(r_22 + σ²) - r_12² > 0

Since this is quadratic in σ², we may impose the following condition on σ² for nonsingularity of R_u:

    σ² > ((r_11 + r_22)/2) [ (1 - 4Δ_r/(r_11 + r_22)²)^{1/2} - 1 ]

where

    Δ_r = r_11 r_22 - r_12²

1.4 We are given

    R = [ 1  1 ]
        [ 1  1 ]

This matrix is nonnegative definite because

    a^T R a = [a_1  a_2] [ 1  1 ] [ a_1 ]
                         [ 1  1 ] [ a_2 ]
            = a_1² + 2 a_1 a_2 + a_2²
            = (a_1 + a_2)² ≥ 0

for all values of a_1 and a_2. It is not positive definite, however, since the quadratic form vanishes whenever a_1 = -a_2 (positive definiteness is stronger than nonnegative definiteness). Moreover, the matrix R is singular because

    det(R) = (1)(1) - (1)(1) = 0

Hence, it is possible for a matrix to be nonnegative definite and yet singular.

1.5 (a) Partition the (M+1)-by-(M+1) correlation matrix as

    R_{M+1} = [ r(0)  r^H ]                                                (1)
              [ r     R_M ]

Let

    R_{M+1}^{-1} = [ a  b^H ]                                              (2)
                   [ b  C   ]

where the scalar a, the vector b, and the matrix C are to be determined. Multiplying (1) by (2):

    I_{M+1} = [ r(0)  r^H ] [ a  b^H ]
              [ r     R_M ] [ b  C   ]

where I_{M+1} is the identity matrix. Therefore,

    r(0) a + r^H b = 1                                                     (3)
    r a + R_M b = 0                                                        (4)
    r b^H + R_M C = I_M                                                    (5)
    r(0) b^H + r^H C = 0^T                                                 (6)

From Eq. (4):

    b = -a R_M^{-1} r                                                      (7)

Hence, from (3) and (7):

    a = 1 / (r(0) - r^H R_M^{-1} r)                                        (8)

Correspondingly,

    b = - R_M^{-1} r / (r(0) - r^H R_M^{-1} r)                             (9)

From (5):

    C = R_M^{-1} - R_M^{-1} r b^H
      = R_M^{-1} + R_M^{-1} r r^H R_M^{-1} / (r(0) - r^H R_M^{-1} r)       (10)

As a check, the results of Eqs. (9) and (10) should satisfy Eq. (6):

    r(0) b^H + r^H C = - r(0) r^H R_M^{-1}/(r(0) - r^H R_M^{-1} r) + r^H R_M^{-1}
                       + r^H R_M^{-1} r r^H R_M^{-1}/(r(0) - r^H R_M^{-1} r)
                     = 0^T

We have thus shown that

    R_{M+1}^{-1} = [ 0  0^T      ] + a [  1          ] [ 1   -r^H R_M^{-1} ]
                   [ 0  R_M^{-1} ]     [ -R_M^{-1} r ]

where the scalar a is defined by Eq. (8).

(b) Now partition the matrix as

    R_{M+1} = [ R_M     r^{B*} ]                                           (11)
              [ r^{BT}  r(0)   ]

Let

    R_{M+1}^{-1} = [ D    e ]                                              (12)
                   [ e^H  f ]

where the matrix D, the vector e, and the scalar f are to be determined. Multiplying (11) by (12):

    I_{M+1} = [ R_M     r^{B*} ] [ D    e ]
              [ r^{BT}  r(0)   ] [ e^H  f ]

Therefore,

    R_M D + r^{B*} e^H = I                                                 (13)
    R_M e + r^{B*} f = 0                                                   (14)
    r^{BT} e + r(0) f = 1                                                  (15)
    r^{BT} D + r(0) e^H = 0^T                                              (16)

From (14):

    e = -f R_M^{-1} r^{B*}                                                 (17)

Hence, from (15) and (17):

    f = 1 / (r(0) - r^{BT} R_M^{-1} r^{B*})                                (18)

Correspondingly,

    e = - R_M^{-1} r^{B*} / (r(0) - r^{BT} R_M^{-1} r^{B*})                (19)

From (13):

    D = R_M^{-1} - R_M^{-1} r^{B*} e^H
      = R_M^{-1} + R_M^{-1} r^{B*} r^{BT} R_M^{-1} / (r(0) - r^{BT} R_M^{-1} r^{B*})   (20)

As a check, the results of Eqs. (19) and (20) must satisfy Eq. (16):

    r^{BT} D + r(0) e^H = r^{BT} R_M^{-1} + r^{BT} R_M^{-1} r^{B*} r^{BT} R_M^{-1}/(r(0) - r^{BT} R_M^{-1} r^{B*})
                          - r(0) r^{BT} R_M^{-1}/(r(0) - r^{BT} R_M^{-1} r^{B*})
                        = 0^T

We have thus shown that

    R_{M+1}^{-1} = [ R_M^{-1}  0 ] + f [ -R_M^{-1} r^{B*} ] [ -r^{BT} R_M^{-1}   1 ]
                   [ 0^T       0 ]     [  1               ]

where the scalar f is defined by Eq. (18).
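As a quick numerical sanity check of the partitioned-inverse formulas of Problem 1.5(a), the following NumPy sketch builds a small Hermitian correlation matrix, evaluates a, b and C from Eqs. (8)-(10), and compares the assembled inverse against a direct matrix inversion. The matrix size and the random construction of R_{M+1} are illustrative assumptions, not part of the problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a Hermitian positive-definite correlation matrix R_{M+1} (illustrative).
M = 4
A = rng.standard_normal((M + 1, M + 1)) + 1j * rng.standard_normal((M + 1, M + 1))
R_full = A @ A.conj().T + (M + 1) * np.eye(M + 1)

# Partition as in Eq. (1): r(0) scalar, r column vector, R_M lower-right block.
r0 = R_full[0, 0].real
r = R_full[1:, 0].reshape(-1, 1)
R_M = R_full[1:, 1:]
R_M_inv = np.linalg.inv(R_M)

# Eqs. (8)-(10): the scalar a, the vector b, and the matrix C.
a = 1.0 / (r0 - (r.conj().T @ R_M_inv @ r).real.item())
b = -a * (R_M_inv @ r)
C = R_M_inv + a * (R_M_inv @ r @ r.conj().T @ R_M_inv)

R_inv_partitioned = np.block([[np.array([[a]]), b.conj().T],
                              [b, C]])

# Compare against the direct inverse.
print(np.allclose(R_inv_partitioned, np.linalg.inv(R_full)))  # expected: True
```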
1.6 (a) We express the difference equation describing the first-order AR process u(n) as

    u(n) = v(n) + w_1 u(n - 1)

where w_1 = -a_1. Solving this equation by repeated substitution, we get

    u(n) = v(n) + w_1 v(n - 1) + w_1² u(n - 2)
         = ...
         = v(n) + w_1 v(n - 1) + w_1² v(n - 2) + ... + w_1^{n-1} v(1)      (1)

Here we have used the initial condition u(0) = 0, or equivalently u(1) = v(1). Taking the expected value of both sides of Eq. (1), and using E[v(n)] = µ for all n, we get the geometric series

    E[u(n)] = µ + w_1 µ + w_1² µ + ... + w_1^{n-1} µ
            = µ (1 - w_1^n)/(1 - w_1),    w_1 ≠ 1
            = µ n,                         w_1 = 1

This result shows that if µ ≠ 0, then E[u(n)] is a function of time n. Accordingly, the AR process u(n) is not stationary. If, however, the AR parameter satisfies the condition

    |a_1| < 1,  or equivalently  |w_1| < 1

then

    E[u(n)] → µ/(1 - w_1)   as n → ∞

Under this condition, we say that the AR process is asymptotically stationary to order one.

(b) When the white noise process v(n) has zero mean, the AR process u(n) will likewise have zero mean. Then, with

    var[v(n)] = σ_v²

we have

    var[u(n)] = E[u²(n)]                                                   (2)

Substituting Eq. (1) into (2), and recognizing that for the white noise process

    E[v(n) v(k)] = σ_v²  for n = k,   0  for n ≠ k                         (3)

we get the geometric series

    var[u(n)] = σ_v² (1 + w_1² + w_1⁴ + ... + w_1^{2n-2})
              = σ_v² (1 - w_1^{2n})/(1 - w_1²),    w_1 ≠ 1
              = σ_v² n,                             w_1 = 1

When |a_1| < 1, or |w_1| < 1, then

    var[u(n)] ≈ σ_v²/(1 - w_1²) = σ_v²/(1 - a_1²)   for large n

(c) The autocorrelation function of the AR process u(n) equals E[u(n) u(n - k)]. Substituting Eq. (1) into this formula, and using Eq. (3), we get (for k ≥ 0)

    E[u(n) u(n - k)] = σ_v² (w_1^k + w_1^{k+2} + w_1^{k+4} + ...)
                     = σ_v² w_1^k (1 - w_1^{2(n-k)})/(1 - w_1²),    w_1 ≠ 1
                     = σ_v² (n - k),                                 w_1 = 1

For |a_1| < 1, or |w_1| < 1, we may therefore express this autocorrelation function as

    r(k) = E[u(n) u(n - k)] ≈ σ_v² w_1^k /(1 - w_1²)   for large n

Case 1: 0 < a_1 < 1. In this case w_1 = -a_1 is negative, and r(k) alternates in algebraic sign as k varies.

    [Figure: r(k) versus k for k = -4, ..., +4; the samples alternate in sign, positive at even lags and negative at odd lags.]

Case 2: -1 < a_1 < 0. In this case w_1 = -a_1 is positive, and r(k) decays geometrically with |k| while remaining positive for all k.

    [Figure: r(k) versus k for k = -4, ..., +4; all samples are positive.]

1.7 (a) The second-order AR process u(n) is described by the difference equation

    u(n) = u(n - 1) - 0.5 u(n - 2) + v(n)

Hence,

    w_1 = 1,    w_2 = -0.5

and the AR parameters equal

    a_1 = -1,    a_2 = 0.5

Accordingly, we write the Yule-Walker equations as

    [ r(0)  r(1) ] [ w_1 ]   [ r(1) ]
    [ r(1)  r(0) ] [ w_2 ] = [ r(2) ]

16.4 Let the subchannel impulse responses be collected in the vector

    h = [h^{(0)}, h^{(1)}, ..., h^{(L-1)}]^T

For a noiseless channel, the received signal on the l-th subchannel is

    u_n^{(l)} = Σ_{m=0}^{M} h_m^{(l)} x_{n-m},    l = 0, 1, ..., L-1        (1)

By definition, we have

    H^{(l)}(z) = Σ_{m=0}^{M} h_m^{(l)} z^{-m}

where z^{-1} is the unit-delay operator. We may therefore rewrite Eq. (1) in the equivalent form

    u_n^{(l)} = H^{(l)}(z)[x_n]                                            (2)

where H^{(l)}(z) acts as an operator. Multiplying Eq. (2) by G^{(l)}(z) and then summing over l:

    Σ_{l=0}^{L-1} G^{(l)}(z)[u_n^{(l)}] = Σ_{l=0}^{L-1} G^{(l)}(z) H^{(l)}(z)[x_n]     (3)

According to the generalized Bezout identity:

    Σ_{l=0}^{L-1} G^{(l)}(z) H^{(l)}(z) = 1

We may therefore simplify Eq. (3) to

    Σ_{l=0}^{L-1} G^{(l)}(z)[u_n^{(l)}] = x_n

Let

    y^{(l)}(n) = G^{(l)}(z)[u_n^{(l)}],    l = 0, 1, ..., L-1

and let G^{(l)}(z) be written in the expanded form

    G^{(l)}(z) = Σ_{i=0}^{M} g_i^{(l)} z^{-i}

Hence,

    y^{(l)}(n) = Σ_{i=0}^{M} g_i^{(l)} z^{-i}[u_n^{(l)}]
               = Σ_{i=0}^{M} g_i^{(l)} u_{n-i}^{(l)}

From linear prediction theory, we recognize the sum on the right-hand side as a one-step prediction:

    û_{n+1}^{(l)} = Σ_{i=0}^{M} g_i^{(l)} u_{n-i}^{(l)}

It follows therefore that

    y^{(l)}(n) = z^{-1}[û_{n+1}^{(l)}]

and so we may rewrite

    x_n = Σ_{l=0}^{L-1} z^{-1}[û_{n+1}^{(l)}]
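The mechanism of Problem 16.4 can be illustrated numerically. The sketch below assumes two FIR subchannels with no common zeros, solves the Sylvester-type linear system implied by the generalized Bezout identity for the FIR filters G^{(0)}(z) and G^{(1)}(z), and verifies that summing the filtered subchannel outputs recovers the transmitted sequence exactly in the noiseless case. The channel coefficients, filter lengths and sequence length are arbitrary choices made only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two illustrative subchannels with no common zeros (coprimality is assumed).
h0 = np.array([1.0, 0.6, 0.2])
h1 = np.array([1.0, -0.4, 0.1])
Lh = len(h0)
Lg = Lh - 1          # number of coefficients in each FIR filter G^(l)(z)

def conv_matrix(h, n_cols):
    """Toeplitz matrix C such that C @ g == np.convolve(h, g)."""
    C = np.zeros((len(h) + n_cols - 1, n_cols))
    for j in range(n_cols):
        C[j:j + len(h), j] = h
    return C

# Solve G0(z)H0(z) + G1(z)H1(z) = 1 for the coefficients of G0 and G1.
A = np.hstack([conv_matrix(h0, Lg), conv_matrix(h1, Lg)])
delta = np.zeros(Lh + Lg - 1); delta[0] = 1.0
g = np.linalg.lstsq(A, delta, rcond=None)[0]
g0, g1 = g[:Lg], g[Lg:]

# Transmit a random symbol sequence through both (noiseless) subchannels.
x = rng.choice([-1.0, 1.0], size=200)
u0 = np.convolve(h0, x)[:len(x)]
u1 = np.convolve(h1, x)[:len(x)]

# Apply the G filters and sum over subchannels: the input is recovered exactly.
x_hat = (np.convolve(g0, u0) + np.convolve(g1, u1))[:len(x)]
print(np.allclose(x_hat, x))  # expected: True
```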
16.5 We are given

    x̂ = (1/f_Y(y)) ∫_{-∞}^{∞} x f_V(y - c_0 x) f_X(x) dx

where

    f_X(x) = 1/(2√3)   for -√3 ≤ x < √3,   0 otherwise

    f_V(v) = (1/(√(2π) σ)) exp(-v²/(2σ²))

and

    f_Y(y) = ∫_{-∞}^{∞} f_X(x) f_V(y - c_0 x) dx

Hence,

    x̂ = [ ∫_{-√3}^{√3} x exp(-(y - c_0 x)²/(2σ²)) dx ] / [ ∫_{-√3}^{√3} exp(-(y - c_0 x)²/(2σ²)) dx ]      (1)

Let

    t = (y - c_0 x)/σ,    dt = -c_0 dx/σ

Then, we may rewrite Eq. (1) as

    x̂ = [ (σ/c_0²) ∫_{(y-√3 c_0)/σ}^{(y+√3 c_0)/σ} (y - tσ) e^{-t²/2} dt ] / [ (σ/c_0) ∫_{(y-√3 c_0)/σ}^{(y+√3 c_0)/σ} e^{-t²/2} dt ]

       = y/c_0 - (σ/c_0) [ ∫_{(y-√3 c_0)/σ}^{(y+√3 c_0)/σ} t e^{-t²/2} dt ] / [ ∫_{(y-√3 c_0)/σ}^{(y+√3 c_0)/σ} e^{-t²/2} dt ]      (2)

Next we recognize the following two results, written in terms of the normalized Gaussian density Z(u) = (1/√(2π)) exp(-u²/2) and the Gaussian tail function Q(u) = ∫_u^∞ Z(t) dt:

    ∫_{(y-√3 c_0)/σ}^{(y+√3 c_0)/σ} t e^{-t²/2} dt = exp(-[(y - √3 c_0)/σ]²/2) - exp(-[(y + √3 c_0)/σ]²/2)
                                                   = √(2π) [ Z((y - √3 c_0)/σ) - Z((y + √3 c_0)/σ) ]

    ∫_{(y-√3 c_0)/σ}^{(y+√3 c_0)/σ} e^{-t²/2} dt = ∫_{(y-√3 c_0)/σ}^{∞} e^{-t²/2} dt - ∫_{(y+√3 c_0)/σ}^{∞} e^{-t²/2} dt
                                                 = √(2π) [ Q((y - √3 c_0)/σ) - Q((y + √3 c_0)/σ) ]

We may therefore rewrite Eq. (2) in the compact form

    x̂ = y/c_0 + (σ/c_0) [ Z((y + √3 c_0)/σ) - Z((y - √3 c_0)/σ) ] / [ Q((y - √3 c_0)/σ) - Q((y + √3 c_0)/σ) ]

16.6 For convergence of a Bussgang algorithm, we require

    E[y(n) y(n + k)] = E[y(n) g(y(n + k))]

For large n, setting k = 0:

    E[y²(n)] = E[y(n) g(y(n))]

For perfect equalization, y(n) equals the transmitted datum x(n), so that

    E[x²] = E[x g(x)]

With x being of zero mean and unit variance, we thus have

    E[x g(x)] = 1

16.7 We start with (for real-valued data)

    y(n) = Σ_i ŵ_i(n) u(n - i)
         = ŵ^T(n) u(n)
         = u^T(n) ŵ(n)

Let

    y = [y(n_1), y(n_2), ..., y(n_K)]^T

    U = [ u^T(n_1) ]
        [ u^T(n_2) ]
        [    ...   ]
        [ u^T(n_K) ]

Assuming that ŵ(n) has a constant value ŵ, averaged over the whole block of data, then

    y = U ŵ

The solution for ŵ is

    ŵ = U^+ y

where U^+ is the pseudoinverse of the matrix U.

16.8 (a) For the binary PSK system, a plot of the error signal e(n) versus the equalizer output y(n) has the form shown by the continuous curve in Fig. 1.

    [Fig. 1: the CMA error function (continuous curve) and its signed-error SE-CMA counterpart (dashed curve) plotted versus the equalizer output y(n) over the range -2 ≤ y(n) ≤ 2, with e(n) spanning roughly -1.5 to +1.5.]

(b) The corresponding plot for the signed-error (SE) version of the CMA is shown by the dashed line in Fig. 1.

(c) The CMA is a stochastic algorithm for minimizing the Godard criterion

    J_cm = (1/4) E[(|y(n)|² - R_2)²]

where the positive constant R_2 is the dispersion constant, which is chosen in accordance with the source statistics. For a fractionally spaced equalizer (FSE), the update algorithm is described by

    w(n+1) = w(n) + µ u(n) ψ(y(n)),    ψ(y(n)) = y(n)(γ - |y(n)|²),    γ = R_2

where µ is a small positive step size and ψ(·) is called the CMA error function. The signed-error (SE)-CMA algorithm is described by

    w(n+1) = w(n) + µ u(n) sgn(ψ(y(n)))

where sgn(·) denotes the signum function; specifically,

    sgn(x) = +1 for x > 0,    -1 for x < 0

Because the error term is reduced to its algebraic sign, the SE-CMA is computationally more efficient than the CMA.
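The following sketch contrasts the CMA and SE-CMA updates of Problem 16.8(c) on a simple real-valued baseband model: BPSK symbols through a mild FIR channel, equalized blindly with a baud-spaced equalizer. The channel, equalizer length and step sizes are assumptions made only for the illustration; a smaller output dispersion indicates a more nearly constant-modulus equalizer output.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative setup (channel, equalizer length and step sizes are assumptions).
x = rng.choice([-1.0, 1.0], size=20000)          # BPSK: |x| = 1, hence R2 = E[x^4]/E[x^2] = 1
u = np.convolve(x, [1.0, 0.35, -0.2])[:len(x)]   # noiseless received signal

R2 = 1.0
L = 11
w_cma = np.zeros(L); w_cma[L // 2] = 1.0         # centre-spike initialization
w_se = w_cma.copy()
mu_cma, mu_se = 1e-3, 1e-4

for n in range(L - 1, len(u)):
    un = u[n - L + 1:n + 1][::-1]                # regressor [u(n), ..., u(n-L+1)]

    # Standard CMA: w <- w + mu * u(n) * psi(y(n)),  psi(y) = y (R2 - y^2)
    y = w_cma @ un
    w_cma += mu_cma * un * y * (R2 - y * y)

    # Signed-error CMA: only sgn(psi(y(n))) is used, avoiding the multiply by psi
    y = w_se @ un
    w_se += mu_se * un * np.sign(y * (R2 - y * y))

# Dispersion E[(y^2 - R2)^2] at the equalizer output; smaller values indicate a
# more nearly constant modulus after adaptation.
for name, w in [("CMA", w_cma), ("SE-CMA", w_se)]:
    y_out = np.convolve(u, w)[L:len(x)]
    print(name, np.mean((y_out ** 2 - R2) ** 2))
```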
16.9 The update equation for the constant modulus algorithm (CMA) is

    ŵ(n+1) = ŵ(n) + µ u(n) e*(n)

where

    e(n) = y(n)(γ - |y(n)|²),    γ = R_2

In quadriphase-shift keying (QPSK), the equalizer output y(n) is complex-valued, as shown by

    y(n) = y_I(n) + j y_Q(n)

where y_I(n) is the in-phase component and y_Q(n) is the quadrature component. Hence,

    e_I(n) = y_I(n)(R_2 - y_I²(n) - y_Q²(n))
    e_Q(n) = y_Q(n)(R_2 - y_I²(n) - y_Q²(n))

For the signed CMA, each error component is replaced by its algebraic sign, so that

    ŵ(n+1) = ŵ(n) + µ u(n) (sgn[e_I(n)] + j sgn[e_Q(n)])*

The weights are complex-valued. Hence, following the analysis presented in Section 5.3 of the text and writing

    ŵ(n) = ŵ_I(n) + j ŵ_Q(n)
    u(n) = u_I(n) + j u_Q(n)

we may express the signed CMA in component form as

    ŵ_I(n+1) = ŵ_I(n) + µ (u_I(n) sgn[e_I(n)] + u_Q(n) sgn[e_Q(n)])        (1)
    ŵ_Q(n+1) = ŵ_Q(n) + µ (u_Q(n) sgn[e_I(n)] - u_I(n) sgn[e_Q(n)])        (2)

The standard version of the complex CMA is, correspondingly,

    ŵ_I(n+1) = ŵ_I(n) + µ (u_I(n) e_I(n) + u_Q(n) e_Q(n))                  (3)
    ŵ_Q(n+1) = ŵ_Q(n) + µ (u_Q(n) e_I(n) - u_I(n) e_Q(n))                  (4)

which follow from taking the real and imaginary parts of µ u(n) e*(n). Both versions of the CMA, the signed version of Eqs. (1) and (2) and the standard version of Eqs. (3) and (4), can now be treated in the same way as the real-valued CMA in Problem 16.8.
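A one-step numerical check of Eqs. (3) and (4) of Problem 16.9: the component-wise recursions should reproduce the complex update ŵ(n+1) = ŵ(n) + µ u(n) e*(n) exactly. The filter length, step size and the ŵ^H u filtering convention assumed below are illustrative choices, not prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(3)

L, mu, R2 = 8, 0.01, 1.0
u = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
w = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)

y = np.vdot(w, u)                      # equalizer output (w^H u convention assumed)
e = y * (R2 - np.abs(y) ** 2)          # CMA error, e = e_I + j e_Q

# Complex form of the update.
w_complex = w + mu * u * np.conj(e)

# Component form, Eqs. (3) and (4).
uI, uQ, eI, eQ = u.real, u.imag, e.real, e.imag
wI = w.real + mu * (uI * eI + uQ * eQ)
wQ = w.imag + mu * (uQ * eI - uI * eQ)

print(np.allclose(w_complex, wI + 1j * wQ))   # expected: True
```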
16.10 Dithered signed-error version of the CMA, hereafter referred to as DSE-CMA:

(a) According to quantization theory, the operator α sgn(v(n)) has an effect equivalent to that of the two-level quantizer

    Q(v(n)) = ∆/2    for v(n) ≥ 0
            = -∆/2   for v(n) < 0

Furthermore, since the samples of the dither ε(n) are i.i.d. over the interval [-1, 1], the process {αε(n)} satisfies the requirement for a valid dither process provided the constant α is chosen large enough that |e(n)| ≤ α.

    [Fig. 2: equivalent model of the DSE-CMA, in which the error e(n) plus the dither ε(n) drives a signum nonlinearity whose output scales the tap-weight update.]

The equivalent model is illustrated in Fig. 2. Hence, we may rewrite the DSE-CMA update formula as

    w(n+1) = w(n) + µ u(n)(e(n) + ε(n))                                    (1)

Also, since ε(n) is an uncorrelated random process, its first moment is

    E[ε(n) | e(n)] = E[ε(n)] = 0                                           (2)

Taking the expectation of the DSE-CMA error function v(n) = e(n) + αε(n), we find that it is a hard-limited version of e(n), as shown by

    E[v(n) | y(n)] = α       for e(n) > α
                   = e(n)    for |e(n)| ≤ α
                   = -α      for e(n) < -α

which follows from Eqs. (1) and (2).

16.11 (a) The Shalvi-Weinstein equalizer is based on the cost function

    J(n) = E[|y(n)|⁴]   subject to the constraint   E[|y(n)|²] = σ_x²

where y(n) is the equalizer output and σ_x² is the variance of the original data sequence applied to the channel input. Applying the method of Lagrange multipliers, we may define a cost function for the Shalvi-Weinstein equalizer that incorporates the constraint as follows:

    J_SW(n) = E[|y(n)|⁴] + λ (E[|y(n)|²] - σ_x²)                           (1)

where λ is the Lagrange multiplier. From Eq. (16.105) of the text, we find that the cost function for the Godard algorithm takes the following form for p = 2:

    J_G = E[(|y(n)|² - R_2)²]
        = E[|y(n)|⁴] - 2 R_2 E[|y(n)|²] + R_2²                             (2)

where R_2 is a positive real constant. Comparing Eqs. (1) and (2), we see that these two cost functions have the same mathematical form. Hence, we may infer that the two equalization algorithms share the same optimization criterion.

(b) For a more detailed discussion of the equivalence between the Godard and Shalvi-Weinstein algorithms, we may proceed as follows. First, rewrite the tap-weight vector of the equalizer in polar form (i.e., a unit-norm vector times a radial scale factor), and then optimize the Godard cost function with respect to the radial factor. The "reduced" cost function that results from this transformation is then recognized as a monotonic transformation of the corresponding Shalvi-Weinstein cost function. Since the transformation relating the two criteria is monotonic, their stationary points and local/global minima coincide to within a radial factor. By taking this approach, the equivalence between the Godard and Shalvi-Weinstein equalizers is seen to hold under general conditions (linear or nonlinear channels, i.i.d. or correlated data sequences applied to the channel input, Gaussian or non-Gaussian channel noise, etc.).¹

¹ For further details on the issues raised herein, see P.A. Regalia, "On the equivalence between the Godard and Shalvi-Weinstein schemes of blind equalization", Signal Processing, vol. 73, pp. 185-190, 1999.

16.12 For the derivation of Eq. (16.116), see the Appendix of the paper by Johnson et al.² Note, however, that the CMA cost function for binary PSK given in Eq. (63) of that paper is four times that of Eq. (16.116).

² C.R. Johnson et al., "Blind equalization using the constant modulus criterion: A review", Proc. IEEE, vol. 86, pp. 1927-1950, October 1998.

CHAPTER 17

17.1 (a) The cumulative Gaussian function (expressible in terms of the complementary error function)

    φ(x) = (1/√(2π)) ∫_{-∞}^{x} e^{-t²/2} dt

qualifies as a sigmoid function for two reasons:

1. The function is a monotonically increasing function of x, with

    φ(-∞) = 0,    φ(0) = 0.5,    φ(∞) = 1

For x = ∞, φ equals the total area under the probability density function of a Gaussian variable with zero mean and unit variance; this area is unity by definition.

2. The function φ(x) is continuously differentiable:

    dφ/dx = (1/√(2π)) e^{-x²/2}

(b) The inverse tangent function

    φ(x) = (2/π) tan^{-1}(x)

also qualifies as a sigmoid function for two reasons:

1. It is monotonically increasing, with

    φ(-∞) = -1,    φ(0) = 0,    φ(∞) = +1

2. φ(x) is continuously differentiable:

    dφ/dx = (2/π) · 1/(1 + x²)

The two functions differ from each other in the following respects:
• The cumulative Gaussian function is unipolar (nonsymmetric).
• The inverse tangent function is bipolar (antisymmetric).

17.2 The incorporation of a momentum term modifies the update rule for the synaptic weight w_kj as follows:

    ∆w_kj(n) = α ∆w_kj(n-1) - η ∂E(n)/∂w_kj                                (1)

where

    α = momentum constant
    η = learning-rate parameter
    E = cost function to be minimized
    n = iteration number

Equation (1) represents a first-order difference equation in ∆w_kj(n). Solving it for ∆w_kj(n), we get

    ∆w_kj(n) = -η Σ_{t=0}^{n} α^{n-t} ∂E(t)/∂w_kj                          (2)

For -1 < α < 0, we may rewrite Eq. (2) as

    ∆w_kj(n) = -η Σ_{t=0}^{n} (-1)^{n-t} |α|^{n-t} ∂E(t)/∂w_kj

Thus, the use of -1 < α < 0 in place of the commonly used range 0 < α < 1 merely introduces the multiplying factor (-1)^{n-t}, which (for a specified n) alternates in algebraic sign as t increases.

17.3 Consider the real-valued version of the back-propagation algorithm summarized in Table 17.2 of the text. In the backward pass, starting from the output layer, the error signal becomes progressively smaller the further away we are from the output layer. This suggests that the learning rates used to adjust the weights of the multilayer perceptron should be increased as we move away from the output layer, so as to make up for the decrease in the error signal. In so doing, the rate at which learning takes place in the different layers of the network is equalized, which is highly desirable.

17.4 We are given the time series

    u(n) = Σ_{i=1}^{3} a_i v(n - i) + Σ_{i=1}^{2} Σ_{j=1}^{2} a_ij v(n - i) v(n - j)

We may implement it with a tapped-delay line, as shown below.

    [Figure: the driving noise v(n) passes through a cascade of three unit delays z^{-1}; the delayed samples v(n-1), v(n-2), v(n-3) are weighted by a_1, a_2, a_3, the pairwise products v(n-i)v(n-j) are weighted by a_11, a_21, a_12, a_22, and all of the weighted terms are summed to form u(n).]
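A minimal sketch of the generator described in Problem 17.4, implementing the tapped-delay-line structure of the figure sample by sample. The numerical values chosen for the coefficients a_i and a_ij are illustrative assumptions; only the structure follows the equation above.

```python
import numpy as np

rng = np.random.default_rng(4)

# Linear coefficients a_i and second-order cross-product coefficients a_ij
# (values are illustrative assumptions).
a = {1: 0.5, 2: -0.3, 3: 0.1}
a2 = {(1, 1): 0.2, (1, 2): -0.1, (2, 1): 0.05, (2, 2): 0.15}

N = 1000
v = rng.standard_normal(N)      # white Gaussian driving noise
u = np.zeros(N)
for n in range(3, N):
    # Linear tapped-delay-line terms a_i * v(n - i).
    u[n] = sum(a[i] * v[n - i] for i in a)
    # Second-order cross terms a_ij * v(n - i) * v(n - j).
    u[n] += sum(a2[(i, j)] * v[n - i] * v[n - j] for (i, j) in a2)

print(u[:5])
```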
17.5 The minimum description length (MDL) criterion strives to optimize the model order. In particular, it provides the best compromise between its two components: a likelihood function and a penalty function. Realizing that model order has a direct bearing on model complexity, it may therefore be argued that the MDL criterion tries to match the complexity of a model to the underlying complexity of the input data.

The risk R of Eq. (17.63) also has two components: one determined by the training data and the other determined by the network complexity. Loosely speaking, the roles of these two components are comparable to those of the likelihood function and the penalty function in the MDL criterion, respectively. Increasing the first component of the risk R at the expense of the second component implies that the training data are highly reliable, whereas increasing the second component of R at the expense of the first component implies that the training data are of poor quality.

17.6 (a) The Laguerre-based version of the MLP has the following structure:

    [Figure: the input signal u(n) drives a cascade of Laguerre sections, L_0(z, α) followed by repeated L(z, α) blocks, producing the tap signals u_i(n, α), i = 0, 1, ..., N-1; these tap signals feed M hidden units through the weights w_{i,j}, and the hidden-unit outputs, weighted by v_1, ..., v_M, are combined to form the network output y(n), which is compared against the desired response d(n).]

(b) A back-propagation (BP) algorithm can be devised for the Laguerre-based MLP in a manner similar to the LMS algorithm formulated for the Laguerre filter (see the corresponding problem in Chapter 15). The only difference between the new BP algorithm and the conventional BP algorithm lies in the adjustment of the input-to-hidden-layer weights and in the calculation of each hidden unit's output.

For the hidden-unit outputs:

    h_j = Σ_{i=0}^{N-1} w_{i,j} u_i(n, α),    j = 0, 1, ..., M-1

and, with φ(·) denoting the sigmoidal activation function, the network output is

    y(n) = φ( Σ_{j=0}^{M-1} v_j φ(h_j) )

For the adaptation of the weights (for simplicity, consider a single output unit and a single hidden layer), let e(n) = d(n) - y(n) and define the output local gradient

    δ_o(n) = φ′( Σ_j v_j φ(h_j) ) e(n)

Output layer:

    ∆v_j(n) = µ̃ δ_o(n) φ(h_j(n))
    ∆(output bias)(n) = µ̃ δ_o(n)

Hidden layer:

    ∆w_ij(n) = µ̃ δ_o(n) v_j φ′(h_j(n)) u_i(n, α)
    ∆(hidden bias)_j(n) = µ̃ δ_o(n) v_j φ′(h_j(n))

where µ̃ is the learning-rate parameter. The only structural change relative to the conventional BP algorithm is that the regressor entries u_i(n, α) are the Laguerre tap signals rather than ordinary tapped-delay-line samples.
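A minimal sketch of the Laguerre-based MLP of Problem 17.6(a), restricted to the forward pass. It assumes the standard Laguerre sections L_0(z, α) = √(1-α²)/(1 - αz^{-1}) followed by all-pass sections (z^{-1} - α)/(1 - αz^{-1}), a tanh activation for φ(·), and small illustrative network sizes; none of these specifics are prescribed by the text.

```python
import numpy as np

def laguerre_taps(u, alpha, num_taps):
    """Return an array of shape (len(u), num_taps) holding the tap signals u_i(n, alpha)."""
    N = len(u)
    taps = np.zeros((N, num_taps))
    prev = np.zeros(num_taps)      # previous output of each section
    x_prev = np.zeros(num_taps)    # previous input of each section
    gain = np.sqrt(1.0 - alpha ** 2)
    for n in range(N):
        x = u[n]
        for i in range(num_taps):
            if i == 0:
                # Lowpass front section: u_0(n) = alpha*u_0(n-1) + sqrt(1-alpha^2)*u(n)
                out = alpha * prev[0] + gain * x
            else:
                # All-pass section: y(n) = alpha*y(n-1) + x(n-1) - alpha*x(n)
                out = alpha * prev[i] + x_prev[i] - alpha * x
            x_prev[i] = x      # remember this section's input for the next sample
            prev[i] = out
            taps[n, i] = out
            x = out            # output feeds the next section
    return taps

def forward(taps_n, W, v, b_hidden, b_out):
    """One forward pass: hidden activations phi(h_j), then y(n) = phi(sum_j v_j phi(h_j))."""
    h = taps_n @ W + b_hidden          # h_j = sum_i w_ij u_i(n, alpha) + bias
    z = np.tanh(h)                     # hidden activations phi(h_j)
    return np.tanh(z @ v + b_out)

rng = np.random.default_rng(5)
u = rng.standard_normal(200)
taps = laguerre_taps(u, alpha=0.6, num_taps=5)
W = 0.1 * rng.standard_normal((5, 4))
v = 0.1 * rng.standard_normal(4)
y = np.array([forward(taps[n], W, v, 0.0, 0.0) for n in range(len(u))])
print(y[:5])
```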