Translating logical queries into SQL queries in natural language query systems


Proceedings of ICT.rda'06, Hanoi, May 20-21, 2006

Nguyen Kim Anh

Abstract

For natural language query systems, natural language queries are transformed internally into logical expressions. This paper presents a technique for translating these logical queries into SQL queries, after which a relational DBMS is left to find all answers to the queries using its own specialised optimization and planning techniques.

1. INTRODUCTION

Making computers easier to use and closer to people is something that programmers and computer science researchers have pursued, are pursuing, and will continue to pursue. Spoken language is one of the most common and most natural means of human communication. For a computer to communicate with people through language, we need natural language processing (NLP) components. Because language is ambiguous and polysemous, the NLP systems built so far are all restricted to a small domain and can interpret only certain types of sentences. One area where NLP systems can be applied effectively is database querying: a database usually covers a domain small enough that Vietnamese queries about its data can be analysed by an NLP system.

In our natural language query systems, natural language queries are transformed into expressions of a query meaning representation language, the description logic CIFR. This paper addresses a technique for translating the logical queries into SQL queries, after which a DBMS finds all answers to the question using its own specialised planning and optimization techniques. The features of a powerful DBMS can thus be used to answer questions, and the system can easily be extended to very large databases.

The paper is organised as follows. It opens with some basic concepts related to determining and representing the semantics of a relational database. The next part sketches an architecture for the natural language query system, which serves as the basis for translating logical queries into SQL queries. The part after that presents the translation itself. Finally, one part gives some illustrative examples and another offers an evaluation and conclusions.

2. BASIC CONCEPTS

2.1. Entity-relationship diagrams

In practice, when designing a relational database for an enterprise, we usually use an entity-relationship diagram to represent the overall logical structure of the enterprise's database. The basic components of an entity-relationship diagram are entities, attributes and relationships.

An entity set (simply called an entity) denotes a set of objects that share common properties, and it is given a name that is a noun. Entity sets are described through a set of properties, called attributes, which reflect the characteristics of the entity set. Each attribute is given a name that is also a noun. A relationship set (simply called a relationship) denotes a set of tuples, each of which represents an association between the entities involved in the relationship. Each relationship is given a name that is a verb.

The semantics of entities, attributes and relationships is usually reflected to some extent in their names. The entity-relationship diagram of an enterprise therefore matters both to the parser and to the semantic interpreter when they work out the meaning of queries against the enterprise's database. For us, the entity-relationship diagram of an enterprise can be regarded as the semantic knowledge already available about the database under consideration. The same diagram is also mapped to the relational data model when the relational database for the enterprise is designed. For simplicity, we assume that the names of the relations and of their attributes coincide with the corresponding names in the entity-relationship diagram used for the mapping.

2.2. The description logic CIFR

Description logics are formalisms for representing and reasoning about classes of complex objects (called concepts) and the relationships between them (usually represented by binary relations, also called roles). A description logic knowledge base has two components:

• TBoxes contain a set of concept descriptions and represent the general schema that models the domain of interest.
• ABoxes are a partial instantiation of the schema, consisting of a set of assertions about individuals of the classes, or about individuals that are related to one another through the relationships between them.

CIFR is a natural extension of CIF that aims to represent n-ary relations directly, which is especially meaningful in our context: representing queries against a relational database. Suppose we have finite sets of atomic concepts denoted by $A$, atomic roles denoted by $P$, and n-ary relations denoted by $\mathbf{R}$. We use $R$ to denote arbitrary roles and $C$ to denote arbitrary concepts; $\top$ is the top concept, $\bot$ is the bottom concept, $\sqcap$ is intersection and $\sqcup$ is union. Concepts and roles are built according to the following syntax:

\[
C \;\rightarrow\; \top \mid \bot \mid A \mid C_1 \sqcap C_2 \mid C_1 \sqcup C_2 \mid \neg C \mid \forall R.C \mid \exists R.C \mid (\leq 1\, P) \mid (\leq 1\, P^{-}) \mid \forall \mathbf{R}[U]\,T_1{:}C_1, \dots, T_n{:}C_n \mid \exists \mathbf{R}[U]\,T_1{:}C_1, \dots, T_n{:}C_n
\]

The semantics of CIFR is given, as usual, through an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$. In particular, if $\mathbf{R}$ is an n-ary relation whose set of r-roles is $rol(\mathbf{R}) = \{U_1, \dots, U_n\}$, then $\mathbf{R}^{\mathcal{I}}$ is a set of labelled tuples of the form $\langle U_1{:}d_1, \dots, U_n{:}d_n \rangle$, where $d_1, \dots, d_n \in \Delta^{\mathcal{I}}$. We write $r[U]$ for the value associated with the $U$-component of the tuple $r$. The new constructs are interpreted as follows:

\[
(\mathbf{R}[U, U'])^{\mathcal{I}} = \{ (d, d') \in \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}} \mid \exists r \in \mathbf{R}^{\mathcal{I}}.\; d = r[U] \wedge d' = r[U'] \}
\]
\[
(\forall \mathbf{R}[U]\,T_1{:}C_1, \dots, T_n{:}C_n)^{\mathcal{I}} = \{ d \in \Delta^{\mathcal{I}} \mid \forall r \in \mathbf{R}^{\mathcal{I}}.\; r[U] = d \rightarrow (r[T_1] \in C_1^{\mathcal{I}}, \dots, r[T_n] \in C_n^{\mathcal{I}}) \}
\]
\[
(\exists \mathbf{R}[U]\,T_1{:}C_1, \dots, T_n{:}C_n)^{\mathcal{I}} = \{ d \in \Delta^{\mathcal{I}} \mid \exists r \in \mathbf{R}^{\mathcal{I}}.\; r[U] = d \wedge r[T_1] \in C_1^{\mathcal{I}} \wedge \dots \wedge r[T_n] \in C_n^{\mathcal{I}} \}
\]

CIFR-TBoxes are defined as a finite set of inclusion assertions $C_1 \sqsubseteq C_2$, where $C_1$ and $C_2$ are arbitrary CIFR concepts. CIFR-ABoxes are defined as a finite set of assertions $A(a)$, where $a$ is an instance of the atomic concept $A$; assertions $P(a, b)$, where $(a, b)$ is an instance of the atomic role $P$; and assertions $\mathbf{R}(U_1{:}d_1, \dots, U_n{:}d_n)$, where $\langle U_1{:}d_1, \dots, U_n{:}d_n \rangle$ is an instance of the n-ary relation $\mathbf{R}$. Concept satisfiability and logical implication in CIFR-TBoxes are defined as usual, so we do not discuss them further.

3. SYSTEM ARCHITECTURE

In this section we present a sketch of the architecture of the natural language query system, together with some analysis related to translating logical queries into SQL queries.

Figure 1: System architecture (natural language query → parser → syntax tree → semantic interpreter → logical-form query → LQL-to-SQL translator → SQL query → relational DBMS → query result → answer generator → answer).

According to the architecture in Figure 1, the natural language query is first analysed by the parser. The parser consults the lexicon to analyse the meaningful words in the query, determines their word classes, and builds the syntax tree of the query step by step using a set of grammar rules. The resulting parse tree is then processed by the semantic interpreter, which works out the meaning of the query and generates a query in logical form.

The language chosen to represent logical queries must be able to describe or define the properties and conditions extracted from the input query. We propose using a description logic as an intermediate language, so that a logical query is represented as a description logic expression. The logical query is then translated into an SQL query that can be executed by any relational DBMS software that supports SQL. The answer generator uses the results of the SQL query to produce the answer for the user.

From the presentation above, it can be seen that the semantic interpreter is able to translate input queries into concept expressions of the description logic CIFR, that is, into logical queries that can be executed. However, finding all answers to a query over large databases cannot be done efficiently in this way. In what follows we therefore focus on translating logical queries into SQL queries that can be executed by any relational DBMS software that supports SQL.

4. TRANSLATING LOGICAL QUERIES INTO SQL QUERIES

To represent the meaning of natural language queries, we use a special description logic, CIFR, introduced in Section 2.2. We translate the entity-relationship diagram into CIFR-TBoxes and the content of the relational database into CIFR-ABoxes.

In this section we show that the semantics expressed in an entity-relationship diagram can be captured in CIFR through a translation from the entity-relationship diagram into CIFR-TBoxes. The CIFR-TBoxes knowledge base derived from an entity-relationship diagram S is defined as follows. The knowledge base contains an atomic concept A for each attribute value domain or entity A, an atomic role P for each attribute P, and an (n+m)-ary relation R for each n-ary relationship R (involving n entities) that has m relationship attributes. The set of inclusion assertions of the knowledge base is determined as follows:

• For each pair of entities E, F such that E is-a F in S, we have the assertion $E \sqsubseteq F$, where E and F are the atomic concepts corresponding to the entities E and F respectively.
• For each entity E that has attributes $A_1, A_2, \dots, A_k$ with domains $D_1, D_2, \dots, D_k$ respectively, we have the assertion $E \sqsubseteq \forall A_1.D_1 \sqcap \dots \sqcap \forall A_k.D_k \sqcap (\leq 1\, A_1) \sqcap \dots$ […]

[…] the set of individuals denoted by this concept in a model. Therefore, if we have general translation rules for each description logic operator, and translations for each atomic relation, concept and role, then any description composed from the operators and the atomic relations, concepts and roles can be translated into a database query. We simply treat each concept definition as a tree in which each node corresponds to a subexpression, so the translation is carried out bottom-up.

Before presenting the translation process, we make the following assumptions:

• Each relation that represents an entity is extended with a surrogate key. Any reference to the key is interpreted as a reference to the corresponding relation. For example, in the study management database we add the attributes MaSV and MaGV as surrogate keys for the relations SV (students) and GV (lecturers).
• Each relation that represents a relationship is extended with a surrogate key consisting of the surrogate keys of the relations that represent the entities involved in the relationship. For example, the surrogate key of HuongDan (supervises) is formed […]

[…]

• Give the names of the lecturers who teach only the course Cơ sở dữ liệu (Databases) or the course Hệ quản trị CSDL (Database Management Systems):

    $GV \sqcap \forall\, Day[GV, MonHoc] \circ TenMon.(\text{Cơ sở dữ liệu} \sqcup \text{Hệ quản trị CSDL})$

  Translation result:

    SELECT MaGV FROM GV
    MINUS
    SELECT MaGV FROM Day, MonHoc
    WHERE Day.MaM = MonHoc.MaM
      AND TenMon <> 'Cơ sở dữ liệu'
      AND TenMon <> 'Hệ quản trị CSDL'

  SQL query:

    SELECT TenGV FROM GV
    WHERE MaGV IN (SELECT MaGV FROM GV
                   MINUS
                   SELECT MaGV FROM Day, MonHoc
                   WHERE Day.MaM = MonHoc.MaM
                     AND TenMon NOT IN ('Cơ sở dữ liệu', 'Hệ quản trị CSDL'))

• Give the students of lecturer A:

  The system determines that the paths between SV and GV are SV - MonHoc - GV and SV - GV. The user is asked to choose the appropriate path:

    $Hoc[SV, MonHoc] \circ Day[MonHoc, GV] \circ TenGV.A$
    $HuongDan[SV, GV] \circ TenGV.A$

  Translation result:

    SELECT MaSV FROM Hoc, Day, GV
    WHERE Hoc.MaM = Day.MaM AND Day.MaGV = GV.MaGV AND TenGV = 'A'

    SELECT MaSV FROM HuongDan, GV
    WHERE HuongDan.MaGV = GV.MaGV AND TenGV = 'A'

  SQL query:

    SELECT * FROM SV
    WHERE MaSV IN (SELECT MaSV FROM Hoc, Day, GV
                   WHERE Hoc.MaM = Day.MaM AND Day.MaGV = GV.MaGV AND TenGV = 'A')

    SELECT * FROM SV
    WHERE MaSV IN (SELECT MaSV FROM HuongDan, GV
                   WHERE HuongDan.MaGV = GV.MaGV AND TenGV = 'A')

EVALUATION AND CONCLUSION

We have built an experimental Vietnamese natural language query system for the study management database of a faculty of the University of Technology (Đại học Bách khoa). The implemented system met the requirements and goals set for a natural language query system. However, its effectiveness depends heavily on the vocabulary supplied to it. This is the greatest difficulty, and also the fundamental problem, of any natural language processing system: its knowledge of the specific database.

In our assessment, the approach introduced in this paper, translating Vietnamese natural language queries into a description logic expression, is very promising. A query in logical form is natural and very close to the natural language query. Moreover, using the reasoning capability of the description logic system, we can translate queries that are incomplete or unclear, check the consistency of the input query, and in particular apply semantic optimization techniques to complex queries. The approach is especially suitable for queries that look up information about a concept, a common kind of query in relational database systems.

Finally, we hope that the implemented system will be improved and developed further so that it fully meets the requirements of a Vietnamese natural language query system and truly allows users without training in informatics to make good use of databases.

About the author

Lecturer, Faculty of Information Technology, Hanoi University of Technology (Đại học Bách khoa Hà Nội).
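The SQL produced in the two worked examples can be tried out on a toy database. The following is a minimal sketch rather than the author's implementation: it assumes Python's standard sqlite3 module, ASCII spellings of the relation names (GV, SV, MonHoc, Day, Hoc, HuongDan), a hypothetical TenSV column, and invented sample rows. SQLite spells the set-difference operator EXCEPT, whereas the paper's queries use MINUS (the Oracle spelling), so that keyword is swapped.

```python
import sqlite3

# Hypothetical sample data; only the table and key names follow the paper.
SCHEMA_AND_DATA = """
CREATE TABLE GV (MaGV INTEGER PRIMARY KEY, TenGV TEXT);
CREATE TABLE SV (MaSV INTEGER PRIMARY KEY, TenSV TEXT);
CREATE TABLE MonHoc (MaM INTEGER PRIMARY KEY, TenMon TEXT);
CREATE TABLE Day (MaGV INTEGER, MaM INTEGER);
CREATE TABLE Hoc (MaSV INTEGER, MaM INTEGER);
CREATE TABLE HuongDan (MaSV INTEGER, MaGV INTEGER);
INSERT INTO GV VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO SV VALUES (100, 'S1'), (101, 'S2'), (102, 'S3');
INSERT INTO MonHoc VALUES (10, 'Cơ sở dữ liệu'),
                          (11, 'Hệ quản trị CSDL'),
                          (12, 'Toán rời rạc');
INSERT INTO Day VALUES (1, 10), (1, 11), (2, 10), (2, 12);
INSERT INTO Hoc VALUES (100, 10), (101, 12);
INSERT INTO HuongDan VALUES (102, 1), (101, 2);
"""

# Example 1: lecturers who teach only the two database courses.
ONLY_DB_COURSES = """
SELECT TenGV FROM GV
WHERE MaGV IN (
    SELECT MaGV FROM GV
    EXCEPT  -- SQLite spelling of the MINUS operator used in the paper
    SELECT Day.MaGV FROM Day, MonHoc
    WHERE Day.MaM = MonHoc.MaM
      AND TenMon NOT IN ('Cơ sở dữ liệu', 'Hệ quản trị CSDL')
)
ORDER BY TenGV
"""

# Example 2, path SV - MonHoc - GV: students taking a course the lecturer teaches.
STUDENTS_VIA_COURSES = """
SELECT MaSV FROM SV
WHERE MaSV IN (SELECT MaSV FROM Hoc, Day, GV
               WHERE Hoc.MaM = Day.MaM AND Day.MaGV = GV.MaGV AND TenGV = ?)
ORDER BY MaSV
"""

# Example 2, path SV - GV: students supervised by the lecturer.
STUDENTS_VIA_SUPERVISION = """
SELECT MaSV FROM SV
WHERE MaSV IN (SELECT MaSV FROM HuongDan, GV
               WHERE HuongDan.MaGV = GV.MaGV AND TenGV = ?)
ORDER BY MaSV
"""


def build_demo_db():
    """Create an in-memory database holding the toy schema and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    return conn


def run(conn, sql, params=()):
    """Execute a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = build_demo_db()
    print(run(conn, ONLY_DB_COURSES))
    print(run(conn, STUDENTS_VIA_COURSES, ("A",)))
    print(run(conn, STUDENTS_VIA_SUPERVISION, ("A",)))
```

On this sample data, lecturer C, who teaches nothing, is also returned by the first query. That matches the universal restriction in the logical form, which holds vacuously for a lecturer with no Day tuples. The two paths of the second example return different students, which is why the system asks the user to choose one.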

Date posted: 08/12/2022, 21:00
