VIETNAM NATIONAL UNIVERSITY, HANOI
FACULTY OF TECHNOLOGY

NGUYEN THI THOA

DISCOVERING ASSOCIATION RULES IN DATABASES AND DATA MINING

Major: Information Technology
Code: 1.01.10

MASTER'S THESIS

SCIENTIFIC SUPERVISOR: ASSOC. PROF. DR. DOAN VAN BAN

Hanoi - 2003

2.3.1 Formal definitions in fuzzy information systems
2.3.2 Illustrative example
2.3.3 Algorithm for discovering fuzzy indicator sets and fuzzy association rules
Chapter 3. Some algorithms for discovering association rules
3.1 The AIS algorithm
3.2 The SETM algorithm
3.3 The Apriori algorithm
3.4 The AprioriTid algorithm
3.5 The partition algorithm
3.6 The CHARM algorithm
Chapter 4. Applying data mining techniques to an insurance problem
4.1 The problem
4.2 Program implementation
4.3 Program results
4.4 Remarks on the results

LIST OF TABLES AND FIGURES

Figure 1.1: The knowledge discovery process
Table 2.1: Algorithm for discovering binary frequent indicator sets
Table 2.2: Algorithm for discovering binary association rules
Table 2.3: Table of transactions and items
Table 2.4: Binary information system
Table 2.5: Fuzzy information system
Table 2.6: Algorithm for discovering fuzzy indicator sets
Table 2.7: Algorithm for discovering fuzzy association rules
Table 3.1: The AIS algorithm
Table 3.2: Example of the AIS algorithm
Table 3.3: The SETM algorithm
Table 3.4: Example of the SETM algorithm
Table 3.5: The Apriori algorithm
Table 3.6: The apriori_gen function
Table 3.7: Example of the Apriori algorithm
Table 3.8: The AprioriTid algorithm
Table 3.9: Example of the AprioriTid algorithm
Table 3.10: Notation used in the partition algorithm
Table 3.11: The partition algorithm
Table 3.12: The gen_large_itemsets procedure
Table 3.13: [title illegible in the source]
Table 3.14: The gen_final_count procedure
Table 3.15: The CHARM algorithm
Figure 3.1: CHARM with items sorted in lexicographic order
Figure 3.2: CHARM with items sorted by increasing support
Figure 4.1: [title illegible in the source]
Figure 4.2: Main window of the "KDD on Insurance" program

SYMBOLS AND ABBREVIATIONS

Symbol or abbreviation: English term: meaning
conf: confidence: confidence of a rule
CSDL: database: database (Vietnamese abbreviation of "cơ sở dữ liệu")
minconf: minimum confidence: minimum confidence threshold
minsup: minimum support: minimum support threshold
sup: support: support
TID: Transaction Identification: transaction identifier
k-itemset: k-itemset: a set of k items
Lk: the set of frequent k-itemsets. Each member of the set has two fields: i) the itemset and ii) its support
[symbol illegible in the source]: the set of candidate k-itemsets. Each member of the set has two fields: i) the itemset and ii) its support

INTRODUCTION

The explosive growth of commercial, administrative and scientific databases has rapidly strained our capacity to analyze and mine that data, creating a need for a new generation of automated, intelligent tools and techniques for data analysis. These tools and techniques are the subject of a newly emerged field: knowledge discovery in databases.

The explosive growth of data can be viewed from two sides:

The expansion of data collection. The spread of scientific and engineering data collection, the widespread introduction of bar codes on almost all commercial products, and the computerization of business transactions (such as purchases by credit card) and administrative transactions (such as tax collection) have made it quick and easy to generate streams of data.

The expansion of storage technology. Storage devices have become faster, higher in quality and cheaper, and the development of Intranet, Internet and data warehouse technologies has created many opportunities for collecting, analyzing, processing and maintaining data.

As a result, the data held by enterprises, organizations and agencies carries more and more information and is increasingly rich and diverse. Traditional data analysis methods are no longer suited to data of this kind. They can produce reports from the data, but they cannot analyze the content of those reports to bring out the important knowledge. This creates the need for a new generation of tools and techniques that can intelligently and automatically help people analyze mountains of data and extract useful knowledge. Such techniques and tools are the subject of the prominent field of knowledge discovery in databases. Data mining is an important stage of knowledge discovery in databases, and association rule mining is an important topic within data mining.

The aim of this thesis is to study and synthesize knowledge about data mining, to examine several algorithms for mining association rules in large databases, and to apply them to a practical problem.

The thesis consists of the following main parts:

Chapter 1 gives an overview of data mining: the definition of data mining and its applications, the stages of the knowledge discovery process, and the main problems in data mining. The chapter closes with the data mining techniques in common use today.

Chapter 2 states the association rule discovery problem, then examines binary information systems and fuzzy information systems, together with the algorithm for discovering association rules on a binary information system and the algorithm for discovering association rules on a fuzzy information system.

Chapter 3 introduces several algorithms used for mining data: AIS, SETM, Apriori, AprioriTid, the partition algorithm, and CHARM.

Chapter 4 proposes an application of data mining to an insurance problem and implements an experimental program.

Finally, the conclusion summarizes the results achieved by the thesis and directions for future development.
CHAPTER 1. OVERVIEW OF DATA MINING

1.1 Data mining

1.1.1 Definition

Knowledge discovery in databases is the process of extracting knowledge from data. The term data mining is used to describe the discovery stage of knowledge discovery in databases. Data mining aims to extract hidden knowledge from data in order to support tasks such as business forecasting. It also reduces the time cost compared with earlier traditional methods (statistical tables), which are very time-consuming.

The following are some descriptive definitions that Friedman selected from the literature on data mining [6]:

- Fayyad's definition: "Knowledge discovery is the non-trivial process of identifying valid, potentially useful and understandable patterns in data."
- Ferruzza's definition: "Data mining is a set of methods used in the knowledge discovery process to distinguish previously unknown relationships and patterns within data."
- Parsaye's definition: "Data mining is a decision support process in which we search for unknown and unexpected patterns of information in databases."

1.1.2 Applications of data mining

Data mining is a young research field that emerged in the 1980s, yet it has attracted the attention of a great many researchers thanks to its practical applications. Data mining techniques can be applied to a wide and varied range of decision-making situations in business. The fields that account for a significant share of applications include:

- Marketing: applications include analysis of customer demand based on buying patterns; determination of business strategies, including advertising, store location and targeted distribution; segmentation of customers, stores or products; and design of catalogs, store layouts and advertising campaigns.
- Finance and securities: applications include analysis of customers' creditworthiness; segmentation of accounts receivable; performance analysis of financial investments such as stocks, contracts (bonds) and government bonds; evaluation of financing options; and fraud detection.
- Manufacturing: applications include optimization of resources such as equipment, manpower and materials; and optimal design of manufacturing processes, shop-floor layouts and products, for example automobiles.
- Health care: applications include analysis of the effectiveness of certain treatments; optimization of treatment time (for example, the length of hospital stays); relating patient health data to doctors' qualifications; and analysis of the effects of drugs.
- Bioinformatics: detecting repeated segments in DNA and protein sequences, and so on.
- Data analysis and decision support
- Education
- Text classification
- Web mining

1.2 The main stages of the knowledge discovery process

In this section we survey the process and analyze the stages of knowledge discovery. The main stages of the knowledge discovery process are [4,7,8,18]:

- Data selection
- Data preprocessing
- Data transformation
- Data mining
- Knowledge representation and evaluation

[Figure 1.1: The knowledge discovery process. Labels recoverable from the diagram: Data, Target Data, Preprocessed Data, Transformation, Interpretation.]

Data selection: this step filters the data to be mined from the data sources, according to certain criteria, to serve the goal of knowledge discovery. For example, from a sales database we select the data about customers, orders and invoices. More specifically, the selected data are the records containing the customer number, name, address, purchase date, quantity and type of goods.

Data preprocessing: this step cleans and enriches the data. It deals with incomplete data, noisy data, inconsistent data and data drawn from heterogeneous sources, and it reduces and discretizes the data. After this step, the data used for knowledge discovery is consistent, complete, reduced and discretized. For example, one customer may have several records because the name was misspelled or the address changed, which gives the false impression that there are several different customers. In addition, some customers deliberately mispronounce or misspell their names, or supply information tied to their having been refused some form of promotion or guarantee. Enriching the data means normalizing and smoothing it so that it is in the most convenient form for the data mining techniques applied in the next step. Data in different formats must also be converted and recomputed into a single uniform type that is convenient for analysis: for example, converting currency units, age versus date of birth, or detailed addresses versus regions.