Proceedings of ICT.rda'06, Hanoi, May 20-21, 2006

THE INFLUENCE OF CLOSURES ON VIETNAMESE SPEECH RECOGNITION USING THE HMM/ANN METHOD

Dang Ngoc Duc, Luong Chi Mai, Nguyen Duy Tien

Abstract

In this paper we present our study of the role of closures in continuous speech recognition. The experiments were carried out on recognition systems for the ten Vietnamese digits, built with the HMM/ANN recognition method. The database consists of 442 sentences (2345 words) recorded over the telephone network from 213 speakers (135 male, 78 female). The recognition results show that, while the closure of the voiced stop /b/ does not improve recognition accuracy, the closures of the unvoiced stops /t/ and /ch/ do improve the recognition performance of the system. The best system (96.20% accuracy at the word level and 84.93% at the sentence level) is the one that includes the closures of the unvoiced stops /t/ and /ch/.

Keywords: speech recognition, hidden Markov model, neural network, hybrid HMM/ANN network.

INTRODUCTION

A closure is the sound that precedes a stop consonant. When a stop is pronounced, the articulators close and the airflow coming from the lungs is blocked completely. The stop itself is produced when the air breaks through the obstruction and bursts out as an explosion (which is why stops are also called plosives). The closure is formed while the articulators move into the position of articulation and close, before the stop is released.

On the spectrogram of a continuous utterance, a closure that follows another sound is marked by a drop in energy. When a closure stands at the beginning of an utterance, its starting point is a small pulse on the waveform; this is the noise produced as the articulators move.

When the speaker articulates softly, the closure may not be visible on either the waveform or the spectrogram. In that case the closure is taken to be the energy gap on the plot, or a 50 ms segment immediately before the stop consonant (this is the average duration of a closure).

In fast continuous speech the closure is usually very short, which makes it hard to distinguish from the preceding sound. Although a closure does not always appear clearly on the spectrogram and the waveform, it still exists as an independent phoneme in continuous utterances.

In this paper we study the effect of closures on recognition accuracy. We use the CSLU toolkit [3] to build a recognition system for the ten Vietnamese digits based on a hybrid HMM/ANN network. The closures are then introduced into the recognition system in order to evaluate their effect on recognition accuracy.

DATABASE

The database used in this paper consists of 442 sentences and 2345 words. It was extracted from two speech corpora, "22 Language v1.2" and "Multi-Language Telephone Speech v1.2", of the CSLU research center (Center for Spoken Language Understanding), Oregon Graduate Institute, USA. The sentences were recorded over the telephone network from 213 speakers (135 male, 78 female), who uttered sentences containing digits, such as telephone numbers, addresses, postal codes and ages. The longest sentence has 18 words and the shortest has [...] words.

The recordings are diverse. They differ in speaking rate (some speakers talk fast, others slowly) and in loudness (some sentences are loud, others were recorded at a very low level). Some sentences were spoken in a quiet office, while others contain considerable background noise, such as a radio or television when the speaker was at home, or car noise when the speaker was standing at a public post office. All sentences were recorded at a sampling rate of 8000 Hz.

The speech database is divided into three sets: a training set of 300 sentences, used to train the ANN and the HMM; a development set of 68 sentences, used to develop the recognition systems; and a test set of 74 sentences, used to measure the recognition accuracy of the system.

All sentences in the database were orthographically transcribed and hand-labeled at the phoneme level. At present there is no unified phonetic symbol set for Vietnamese, and no labeling guide for Vietnamese has been published. While building the database, we used the Telex convention to write the orthographic transcription of the syllables and phonemes of the digits. For labeling, we relied on the labeling guide for English [2], combined with the characteristics of Vietnamese phonology described in textbooks on Vietnamese [1]. The following table gives the orthographic and phonemic transcriptions of the ten Vietnamese digits.

Table: Orthographic and phonemic transcriptions of the ten Vietnamese digits

Digit   Orthographic transcription   Phonemic transcription
0       khoong                       /kh/ ...
1       mootj                        /m/ ...
2       hai                          /h/ ...
3       ba                           /b/ ...
4       boons                        /b/ ...
5       nawm                         /n/ ...
6       saus                         /s/ ...
7       baayr                        /b/ ...
8       tams                         /t/ ...
9       chins                        /ch/ ...

HIDDEN MARKOV MODEL (HMM)

A hidden Markov model (HMM) is characterized by the following basic components:

1) $N$, the number of states in the Markov model, $S = \{S_1, S_2, \ldots, S_N\}$.

2) $M$, the number of observation symbols, $V = \{v_1, v_2, \ldots, v_M\}$.

3) $A = \{a_{ij}\}$, the state transition probability distribution, where $a_{ij}$ is the probability that the model is in state $j$ at time $t+1$ given that it was in state $i$ at time $t$:

$$a_{ij} = P(q_{t+1} = j \mid q_t = i), \qquad a_{ij} \ge 0, \qquad i, j = 1, \ldots, N$$

4) $B = \{b_j(k)\}$, the observation symbol probability distribution in each state, where $b_j(k)$ is the probability of observing $v_k$ in state $j$ at time $t$:

$$b_j(k) = P(v_k \text{ at time } t \mid q_t = S_j), \qquad \sum_{k=1}^{M} b_j(k) = 1, \qquad b_j(k) \ge 0, \qquad j = 1, \ldots, N;\ k = 1, \ldots, M$$

5) $\pi = \{\pi_1, \ldots, \pi_N\}$, the initial state distribution, where $\pi_i$ is the probability that state $i$ is chosen at the initial time $t = 1$:

$$\pi_i = P(q_1 = S_i), \qquad \pi_i \ge 0, \qquad i = 1, \ldots, N \tag{3}$$

There are three basic problems related to speech recognition that arise for a hidden Markov model.

Problem 1: Given an observation sequence $O = \{O_1, O_2, \ldots, O_T\}$ and a hidden Markov model $\lambda = (A, B, \pi)$, we need to compute the probability $P(O \mid \lambda)$. Let $Q = q_1 q_2 \ldots q_T$ be the state sequence corresponding to the observation sequence $O$. The probability that $O$ is generated by $\lambda$ along the state sequence $Q$ is: [...]

To overcome this problem, the forward-backward algorithm is commonly used [...]

Forward-backward algorithm

We define the forward variable $\alpha_t(i)$ as the probability of the partial observation sequence up to time $t$, $O = O_1 O_2 \ldots O_t$, with the model in state $S_i$ at time $t$, generated by the model $\lambda$:

$$\alpha_t(i) = P(O_1 O_2 \ldots O_t,\ q_t = S_i \mid \lambda) \tag{5}$$

with the initial value $\alpha_1(i) = \pi_i\, b_i(O_1)$.

[...]

$y_i$ is the vector of ANN output values determined during the labeling process. In each sample vector $y_i$, exactly one output node takes the value 1 and all other nodes take the value 0. These sample vectors are used to train the neural network. Network training is a supervised learning process carried out with the error back-propagation procedure, and it is run for 30 iterations. After training we obtain one set of weights for each training iteration. Each set of weights is evaluated by recognition on the development set, and the weights of the iteration that gives the best result are selected [11].

The ANN with the best set of weights is used as the initial network for iterative training of the hybrid HMM/ANN network. First, the ANN is used to compute the observation probability in each state, $b_j(k)$. From these probabilities the HMM finds the observation sequence corresponding to the sample data. The HMM parameters are then adjusted by the forward-backward algorithm in embedded form: the hidden Markov models of the individual categories are concatenated into one large model, and the forward-backward algorithm is applied to adjust the parameters of the models according to formulas (8) and (9).

The ANN output values are also computed by the forward-backward algorithm, forming a vector whose size equals the number of ANN output nodes. These values are used to retrain the ANN with the error back-propagation procedure. Formula (8) is used to compute them [4]:

$$y_i = P(q_t = S_i \mid O, \lambda) = \frac{\alpha_t(i)\, \beta_t(i)}{P(O \mid \lambda)} \tag{8}$$
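To make the state posterior in formula (8) concrete, the following is a minimal sketch of how the forward variable of (5) and a backward variable can be combined into per-frame posteriors. It is not the CSLU toolkit implementation. It assumes a small discrete-observation HMM with a lookup table `B`, whereas in the hybrid system the observation probabilities come from the ANN outputs. The function name `forward_backward_posteriors`, the toy parameter values, and the use of the standard textbook recursions without scaling are all assumptions made for illustration.

```python
import numpy as np


def forward_backward_posteriors(pi, A, B, obs):
    """Return (gamma, likelihood) for a discrete HMM.

    pi  : (N,)   initial state distribution
    A   : (N, N) transition probabilities, A[i, j] = a_ij
    B   : (N, M) observation probabilities, B[j, k] = b_j(k)
    obs : sequence of observation symbol indices O_1 .. O_T
    gamma[t, i] is the posterior P(q_t = S_i | O, lambda).
    """
    pi = np.asarray(pi, dtype=float)
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    T, N = len(obs), len(pi)

    # Forward pass: alpha[0, i] = pi_i * b_i(O_1), then the usual induction.
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

    # Backward pass: beta at the last frame is 1 for every state.
    beta = np.zeros((T, N))
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

    # P(O | lambda) is the sum of the forward variable at the last frame.
    likelihood = alpha[T - 1].sum()

    # State posteriors: alpha * beta / P(O | lambda).
    gamma = alpha * beta / likelihood
    return gamma, likelihood


# Toy two-state, two-symbol model (hypothetical values, for illustration only).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
gamma, likelihood = forward_backward_posteriors(pi, A, B, [0, 1, 0])
print(gamma)       # one row per frame; each row sums to 1
print(likelihood)  # P(O | lambda)
```

Each row of `gamma` is a soft target vector with one entry per state, which is the kind of vector the text describes as being used to retrain the ANN. The sketch omits scaling, so for long utterances the products would underflow; a practical implementation would rescale each frame or work in the log domain.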