Robust Speech Recognition and Understanding


Edited by Michael Grimm and Kristian Kroschel

I-Tech Education and Publishing

Published by the I-Tech Education and Publishing, Vienna, Austria

Abstracting and non-profit use of the material is permitted with credit to the source. Statements and opinions expressed in the chapters are those of the individual contributors and not necessarily those of the editors or publisher. No responsibility is accepted for the accuracy of information contained in the published articles. Publisher and editors assume no responsibility or liability for any damage or injury to persons or property arising out of the use of any materials, instructions, methods or ideas contained inside. After this work has been published by the I-Tech Education and Publishing, authors have the right to republish it, in whole or part, in any publication of which they are an author or editor, and to make other personal use of the work.

© 2007 I-Tech Education and Publishing
www.ars-journal.com
Additional copies can be obtained from: publication@ars-journal.com

First published June 2007
Printed in Croatia

A catalogue record for this book is available from the Austrian Library.
Robust Speech Recognition and Understanding, Edited by Michael Grimm and Kristian Kroschel
p. cm.
ISBN 978-3-902613-08-0
1. Speech Recognition. 2. Speech Understanding.

Preface

Digital speech processing is a major field in current research all over the world. In particular, for automatic speech recognition (ASR), very significant achievements have been made since the first attempts at digit recognizers in the 1950s and 1960s, when spectral resonances were determined by analogue filters and logical circuits. As Prof. Furui pointed out in his review of 50 years of automatic speech recognition at the 32nd IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2007, we may now see speech recognition systems in their 3.5th generation. Although there are many excellent systems for continuous speech recognition, speech translation and information extraction, ASR systems need to be improved for spontaneous speech. Furthermore, robustness under noisy conditions is still a goal that has not been achieved entirely if distant microphones are used for speech input. The automated recognition of emotion is another aspect of voice-driven systems that has gained much importance in recent years. For natural language understanding in general, and for the correct interpretation of a speaker's recognized words, such paralinguistic information may be used to improve future speech systems.

This book on Robust Speech Recognition and Understanding brings together many different aspects of the current research on automatic speech recognition and language understanding. The first four chapters address the task of voice activity detection, which is considered an important issue for all speech recognition systems. The next chapters give several extensions to state-of-the-art HMM methods. Furthermore, a number of chapters particularly address the task of robust ASR under noisy conditions. Two chapters on the automatic recognition of a speaker's emotional state highlight the importance of natural speech understanding and interpretation in voice-driven systems. The last chapters of the book address the application of conversational systems on robots, as well as the autonomous acquisition of vocalization skills.

We want to express our thanks to all authors who have contributed to this book by the best of their scientific work. We hope you enjoy reading this book and get many helpful ideas for your own research or application of speech technology.

Editors
Michael Grimm and Kristian Kroschel
Universität Karlsruhe (TH), Germany
Contents

Preface

1. Voice Activity Detection. Fundamentals and Speech Recognition System Robustness
   J. Ramirez, J. M. Górriz and J. C. Segura

2. Novel Approaches to Speech Detection in the Processing of Continuous Audio Streams
   Janez Zibert, Bostjan Vesnicer and France Mihelic

3. New Advances in Voice Activity Detection using HOS and Optimization Strategies
   J.M. Górriz, J. Ramirez and C.G. Puntonet

4. Voice and Noise Detection with AdaBoost
   T. Takiguchi, N. Miyake, H. Matsuda and Y. Ariki

5. Evolutionary speech recognition
   Anne Spalanzani

6. Using Genetic Algorithm to Improve the Performance of Speech Recognition Based on Artificial Neural Network
   Shing-Tai Pan and Chih-Chin Lai

7. A General Approximation-Optimization Approach to Large Margin Estimation of HMMs
   Hui Jiang and Xinwei Li

8. Double Layer Architectures for Automatic Speech Recognition Using HMM
   Marta Casar and Jose A. R. Fonollosa

9. Audio Visual Speech Recognition and Segmentation Based on DBN Models
   Dongmei Jiang, Guoyun Lv, Ilse Ravyse, Xiaoyue Jiang, Yanning Zhang, Hichem Sahli and Rongchun Zhao

10. Discrete-Mixture HMMs-based Approach for Noisy Speech Recognition
    Tetsuo Kosaka, Masaharu Katoh and Masaki Kohda

11. Speech Recognition in Unknown Noisy Conditions
    Ji Ming and Baochun Hou

12. Uncertainty in Signal Estimation and Stochastic Weighted Viterbi Algorithm: A Unified Framework to Address Robustness in Speech Recognition and Speaker Verification
    N. Becerra Yoma, C. Molina, C. Garreton and F. Huenupan

13. The Research of Noise-Robust Speech Recognition Based on Frequency Warping Wavelet
    Xueying Zhang and Wenjun Meng

14. Autocorrelation-based Methods for Noise-Robust Speech Recognition
    Gholamreza Farahani, Mohammad Ahadi & Mohammad Mehdi Homayounpour

15. Bimodal Emotion Recognition using Speech and Physiological Changes
    Jonghwa Kim

16. Emotion Estimation in Speech Using a 3D Emotion Space Concept
    Michael Grimm and Kristian Kroschel

17. Linearly Interpolated Hierarchical N-gram Language Models for Speech Recognition Engines
    Imed Zitouni and Qiru Zhou

18. A Factored Language Model for Prosody Dependent Speech Recognition
    Ken Chen, Mark A. Hasegawa-Johnson and Jennifer S. Cole

19. Early Decision Making in Continuous Speech
    Odette Scharenborg, Louis ten Bosch and Lou Boves

20. Analysis and Implementation of an Automated Delimiter of "Quranic" Verses in Audio Files using Speech Recognition Techniques
    Tabbal Hassan, Al-Falou Wassim and Monla Bassem

21. An Improved GA Based Modified Dynamic Neural Network for Cantonese-Digit Speech Recognition
    S.H. Ling, F.H.F. Leung, K.F. Leung, H.K. Lam and H.H.C. Iu

22. Talking Robot and the Autonomous Acquisition of Vocalization and Singing Skill
    Hideyuki Sawada

23. Conversation System of an Everyday Robot Robovie-IV
    Noriaki Mitsunaga, Zenta Miyashita, Takahiro Miyashita, Hiroshi Ishiguro and Norihiro Hagita

24. Sound Localization of Elevation using Pinnae for Auditory Robots
    Tomoko Shimoda, Toru Nakashima, Makoto Kumon, Ryuichi Kohzawa, Ikuro Mizumoto and Zenta Iwai

25. Speech Recognition Under Noise Conditions: Compensation Methods
    Angel de la Torre, Jose C. Segura, Carmen Benitez, Javier Ramirez, Luz Garcia and Antonio J. Rubio

[...]

... systems is shown and discussed. Three different VAD methods are described and compared to standardized and recently reported strategies by assessing the speech/non-speech discrimination accuracy and the robustness of speech recognition systems.

2. Applications

VADs are employed in many areas of speech processing. Recently, VAD methods have been described in the literature ...
... communication

Voice Activity Detection. Fundamentals and Speech Recognition System Robustness

[Figure 1. Speech coding with VAD for DTX. Block diagram: incoming speech, VAD, active speech encoder, inactive speech encoder, communication channel, decoded speech.]

2.2 Speech enhancement

Speech enhancement aims at improving the performance of speech communication systems in noisy environments ... order to improve the robustness against the noise. The motivations for these approaches are found in the speech production process and the reduced signal energy of word beginnings and endings. The so-called hang-over algorithms extend and smooth the VAD decision in order to recover speech periods that are masked by the acoustic noise.

4. Robust VAD algorithms

... challenges in robust speech detection and a review of the state of the art and applications. VADs are frequently used in a number of applications including speech coding, speech enhancement and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of a well-defined rule. The chapter has summarized three robust VAD methods ... standard (ETSI, 2002), which considers different VADs for WF and FD, was used. The MBQW VAD outperforms the G.729, AMR1, AMR2 and AFE standard VADs in both clean and multi-condition training/testing experiments. When compared to recently reported VAD algorithms, it yields better results, being the one that is closest to the "ideal" hand-labeled speech recognition performance ...
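The hang-over idea mentioned in the excerpt above can be illustrated with a minimal Python sketch. It is not the chapter's algorithm: the function name `apply_hangover`, the parameter `hangover_frames`, and the rule of holding the speech decision for a fixed number of frames after each speech frame are all assumptions made for illustration, and the sketch only protects word endings, not word beginnings.

```python
from typing import List, Sequence


def apply_hangover(decisions: Sequence[int], hangover_frames: int) -> List[int]:
    """Hold the speech decision (1) for a few frames after each speech frame.

    decisions: raw per-frame VAD output, 1 = speech, 0 = non-speech.
    hangover_frames: hypothetical tuning parameter, number of frames to hold.
    """
    smoothed: List[int] = []
    remaining = 0  # hang-over frames still available
    for d in decisions:
        if d == 1:
            smoothed.append(1)
            remaining = hangover_frames  # re-arm the counter on every speech frame
        elif remaining > 0:
            smoothed.append(1)  # low-energy frame kept as speech
            remaining -= 1
        else:
            smoothed.append(0)
    return smoothed


raw = [0, 1, 1, 0, 0, 0, 0, 1, 0, 0]
print(apply_hangover(raw, 2))  # [0, 1, 1, 1, 1, 0, 0, 1, 1, 1]
```

A longer hang-over recovers more weak word endings at the cost of labeling more noise frames as speech, so the value is usually a trade-off tuned per application.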
... R.V.; Gaurav, V. (2002) VAD Techniques for Real-Time Speech Transmission on the Internet, IEEE International Conference on High-Speed Networks and Multimedia Communications, pp. 46-50

Basbug, F.; Swaminathan, K.; Nandkumar, S. (2004) Noise reduction and echo cancellation front-end for speech codecs, IEEE Trans. Speech Audio Processing, vol. 11, no. 1, pp. 1-13

Gustafsson, ... Khalid, C.; Stephan, E.; Jeffrey, A. (2000) SpeechDat-Car: A large speech database for automotive environments, Proc. II LREC Conf.

Hirsch, H.G.; Pearce, D. (2000) The AURORA experimental framework for the performance evaluation of speech recognition systems under noise conditions, ISCA ITRW ASR2000: Automatic Speech Recognition: Challenges for the Next Millennium ...

... by several different classes and represented in the training process of such systems. To avoid this, we decided to design an audio representation that would better determine the speech and perform significantly differently on all other non-speech data. One possible way to achieve this is to see speech as a sequence of basic speech units that convey some ...

... actual pause or speech frames that are correctly detected as pause or speech frames, respectively:

\[
HR0 = \frac{N_{0,0}}{N_0^{\mathrm{ref}}}, \qquad HR1 = \frac{N_{1,1}}{N_1^{\mathrm{ref}}} \tag{15}
\]

where $N_0^{\mathrm{ref}}$ and $N_1^{\mathrm{ref}}$ are the number of real non-speech and speech frames in the whole database, respectively, while $N_{0,0}$ and $N_{1,1}$ are the number of non-speech and speech frames correctly classified. Figure 8 provides the results of this analysis and compares ...
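The hit rates of equation (15) can be computed directly from frame labels. The following Python sketch is only an illustration: the function name `hit_rates` and the label convention (0 = non-speech, 1 = speech, one label per frame) are assumptions, not part of the source.

```python
from typing import Sequence, Tuple


def hit_rates(reference: Sequence[int], detected: Sequence[int]) -> Tuple[float, float]:
    """Return (HR0, HR1) as defined in equation (15).

    reference: hand-labeled frames, 0 = non-speech, 1 = speech.
    detected:  VAD output for the same frames, same convention.
    """
    if len(reference) != len(detected):
        raise ValueError("label sequences must have the same length")
    n0_ref = sum(1 for r in reference if r == 0)  # real non-speech frames
    n1_ref = sum(1 for r in reference if r == 1)  # real speech frames
    if n0_ref == 0 or n1_ref == 0:
        raise ValueError("reference must contain both speech and non-speech frames")
    n00 = sum(1 for r, d in zip(reference, detected) if r == 0 and d == 0)
    n11 = sum(1 for r, d in zip(reference, detected) if r == 1 and d == 1)
    return n00 / n0_ref, n11 / n1_ref


reference = [0, 0, 0, 1, 1, 1, 1, 0]
detected = [0, 1, 0, 1, 1, 0, 1, 0]
print(hit_rates(reference, detected))  # (0.75, 0.75)
```

Reporting the two rates separately matters because a VAD that labels every frame as speech reaches HR1 = 1 while HR0 drops to 0.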
Boll, S. F. (1979) Suppression of Acoustic Noise in Speech Using Spectral Subtraction, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 27, no. 2, April 1979

ETSI (2002) ETSI ES 201 108 Recommendation. Speech Processing, Transmission and Quality aspects (STQ); Distributed speech recognition; Front-end feature extraction ...

... vector x. Assuming that the speech signals and the noise are additive, the VAD module has to decide in favour of the two hypotheses:

\[
H_0: x = n, \qquad H_1: x = n + s
\]
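As a rough illustration of deciding between a noise-only frame and a speech-plus-noise frame, the sketch below uses a plain frame-energy threshold. This is a generic baseline, not one of the three methods the chapter describes. The names `frame_energy` and `energy_vad`, the `threshold_ratio` parameter, and the assumption that a noise-only frame is available to estimate the noise energy are all hypothetical.

```python
from typing import List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame of samples."""
    if not frame:
        raise ValueError("frame must not be empty")
    return sum(s * s for s in frame) / len(frame)


def energy_vad(frames: Sequence[Sequence[float]],
               noise_energy: float,
               threshold_ratio: float = 2.0) -> List[int]:
    """Decide per frame: 1 (speech plus noise) if the frame energy exceeds
    threshold_ratio times the noise energy estimate, else 0 (noise only)."""
    threshold = threshold_ratio * noise_energy
    return [1 if frame_energy(f) > threshold else 0 for f in frames]


frames = [
    [0.01, -0.02, 0.01, 0.0],  # assumed noise-only frame
    [0.5, -0.4, 0.6, -0.5],    # high-energy frame
    [0.02, 0.01, -0.01, 0.0],  # low-energy frame
]
noise_energy = frame_energy(frames[0])   # noise estimate from the first frame
print(energy_vad(frames, noise_energy))  # [0, 1, 0]
```

A fixed energy threshold degrades quickly at low signal-to-noise ratios, which is the kind of limitation that motivates more robust decision rules.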

Date posted: 26/06/2014, 23:20
