
Master's Thesis in Computer Science: Time Series Classification Based on the SAX Transformation and the Vector Space Model


VIETNAM NATIONAL UNIVERSITY, HO CHI MINH CITY

MASTER'S THESIS

Ho Chi Minh City, August 2020


This work was completed at the University of Technology (Bach Khoa), VNU-HCM.

Supervisor: Assoc. Prof. Dr. Dương Tuấn Anh

Reviewer: Dr. Phạm Văn Chung

Reviewer: Assoc. Prof. Dr. Nguyễn Thanh Hiên

The master's thesis was defended at the University of Technology, VNU-HCM, on 24 August

The thesis examination committee consisted of:

1. Chair: Assoc. Prof. Dr. Quản Thành Thơ
2. Reviewer: Dr. Phạm Văn Chung
3. Reviewer: Assoc. Prof. Dr. Nguyễn Thanh Hiên
4. Member: Assoc. Prof. Dr. Dương Tuấn Anh
5. Secretary: Dr. Nguyễn Tiến Thịnh

Confirmation by the Chair of the thesis examination committee and the Dean of the faculty managing the specialization, after the thesis has been revised (if applicable).

COMPUTER SCIENCE AND ENGINEERING


ACKNOWLEDGEMENTS

Completing this master's thesis took not only my own effort but also the dedicated guidance of my teachers and the encouragement and support of my family and friends throughout my studies, research, and work on the thesis.

I would like to express my sincere gratitude to Assoc. Prof. Dr. Dương Tuấn Anh, who guided me wholeheartedly throughout my studies at the University of Technology, Ho Chi Minh City. He also supervised this thesis and provided the best possible conditions for me to complete it.

I thank all my teachers, who have devotedly taught and passed on valuable knowledge to me and to many generations of students.

I thank my family for encouraging me and giving me every opportunity to continue pursuing my studies and research.

I also sincerely thank the colleagues and friends who helped me and gave me feedback while I was working on this thesis.


THESIS SUMMARY

Time series data is a sequence of data points measured at successive time intervals with a fixed frequency, and it is used in many different industries and fields. Analyzing time series data through classification plays an important role because it extracts meaningful statistical properties, which makes it possible to predict data points before they occur, or to summarize current trends in the data and make better decisions in the service of everyday life.

Over the past decades, researchers have tried to improve classification mainly by improving similarity search on time series data. In this thesis, we examine an approach to the problem that classifies time series data using Symbolic Aggregate approXimation (SAX) combined with the Vector Space Model (VSM).

The method is based on converting a time series into words after reducing the dimensionality of the original series, and then using the vector space model for classification. In this way, the original time series data can be converted into a new, more compact dataset, which significantly reduces classification time while still preserving the necessary information.

The thesis applies time series classification based on the SAX transformation and the vector space model step by step. It also applies classification with other methods, namely nearest neighbor with dynamic time warping (1 Nearest Neighbor Dynamic Time Warping, 1NN-DTW) and Bag of patterns. Finally, it draws conclusions about the effectiveness of time series classification based on the SAX transformation and the vector space model compared with the 1NN-DTW and Bag of patterns algorithms.


ABSTRACT

A time series is a series of data points listed (or graphed) in time order. Most commonly, a time series is a sequence taken at successive, equally spaced points in time. Time series have been applied in many different domains, such as industry, health, weather, and finance. Time series analysis plays an important role because it comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data, thus helping people predict events before they happen, or produce statistical reports and make better decisions. In recent years, time series classification has attracted the attention of many researchers, and many algorithms have been proposed to improve the performance of similarity search on time series data. In this thesis, we investigate an approach to the problem of classifying time series data using Symbolic Aggregate approXimation (SAX) and the Vector Space Model (VSM).

SAX-VSM is based on two well-known techniques. The first is Symbolic Aggregate approXimation, which transforms real-valued time series into combined collections of SAX words after reducing the dimensionality of the time series. The second is the Vector Space Model, which classifies using the tf*idf weighting scheme. In this way, the original time series data can be converted into a new, more compact dataset, significantly reducing classification time while still preserving the necessary information.

In this thesis, we apply time series classification based on Symbolic Aggregate approXimation and the Vector Space Model step by step. We also apply classification with other methods, namely one-nearest neighbor using dynamic time warping (1NN-DTW) and Bag of patterns (BOP). Finally, we draw conclusions about the effectiveness of time series classification based on Symbolic Aggregate approXimation and the Vector Space Model in comparison with the 1NN-DTW algorithm and the Bag of patterns algorithm.


DECLARATION

I declare that, except for results drawn from other works as clearly cited in the thesis, the work presented in this thesis was carried out by me, and that no part of this thesis has been submitted for a degree at this or any other institution.

Date: 3 August 20…

Lương Phụng Tiên


1.1 TIME SERIES DATA MINING

1.1.1 The importance of time series data mining

1.1.2 Time series data

1.2 SOME CONCEPTS RELATED TO TIME SERIES DATA

SIMILARITY SEARCH ON TIME SERIES DATA

2.3 DATA NORMALIZATION (Z-SCORE NORMALIZATION)

2.4 TIME SERIES DISCRETIZATION

2.4.1 Dimensionality reduction with Piecewise Aggregate Approximation (PAA)

2.4.2 Symbolic Aggregate approXimation (SAX)

CLASSIFICATION OF ORDINARY DATA

TIME SERIES CLASSIFICATION

Time series classification with the nearest-neighbor algorithm (1 Nearest

Time series classification with the Bag of patterns (BOP) method

4.3.5 The OSU Leaf dataset

DETERMINING THE PARAMETERS CHOSEN FOR THE DATASETS

4.5 EXPERIMENTAL COMPARISON OF ACCURACY AMONG THE CLASSIFICATION METHODS

4.6 EXECUTION TIME OF THE CLASSIFICATION METHODS


LIST OF FIGURES

Figure 1.1: Red wine sales in Australia from 1980 to 1991

Figure: Computing the dynamic time warping distance

Figure: Two time series Q and C

Figure: Matrix illustrating the DTW computation for two time series

Figure: The Sakoe-Chiba band constraint

Figure: The Itakura parallelogram constraint

Figure: A time series T and its PAA approximation into segments

Figure: Lookup table of breakpoints, with values up to 10

Figure: A time series transformed by PAA and then encoded into SAX symbols. The time series is encoded as baabccbc

Figure: An example of image data classification applied in biology

Figure: The data classification process: estimating accuracy

Figure: The data classification process: classifying new data

Figure: The nearest neighbor of a test sample X

Figure: A real-valued time series converted into the SAX word GTTGACCA

Figure: A visual example of the bag-of-patterns representation of time series. Each row denotes a SAX word and each column denotes a time series dataset

Figure 3.1: Overview of the SAX-VSM algorithm. First, labeled time series are converted into bags of words using SAX. Second, tf*idf statistics are computed, resulting in a single weight vector for each training class. For classification, an unlabeled time series is converted into a term frequency vector and assigned the label of the weight vector that yields the maximum cosine similarity

Figure: The proposed system

Figure: Several frames taken from the Gun-Draw video, tracking the motion of the right hand and converting it into a motion series

Figure: Shape of a time series in the Point class (top) and of a time series in the Gun-Draw class (bottom)

Figure: Three curves representing the three function classes Cylinder, Bell, and Funnel

Figure: Four groups of curves representing the four classes in the Trace dataset

Figure: Top: a photograph of a fish. Bottom: a univariate time series generated from the shape of the fish's outline

Figure: Top: photographs of three leaf types, showing the slightly lobed shape and acute leaf tips of Acer Circinatum, the coarsely serrated leaf margins of Acer Glabrum, and the pointed-lobed leaf structure of Quercus Garryana

Figure: Chart of the accuracy of the three classification methods 1NN-DTW, Bag of patterns, and SAX-VSM on the datasets

Figure: Charts of the execution time of the three methods 1NN-DTW, Bag of patterns, and SAX-VSM: a) 1NN-DTW and Bag of patterns; b) Bag of patterns and SAX-VSM; c) 1NN-DTW and SAX-VSM


LIST OF TABLES

Table 4.1: The experimental datasets

Table 4.2: Parameters for the datasets when using the SAX-VSM method

Table 4.3: Parameters for the datasets when using the Bag of patterns method

Table 4.4: Classification accuracy of the methods on the datasets

Table 4.5: Execution time of the classification methods on the datasets


CHAPTER 1: OVERVIEW OF THE THESIS

The first part of this chapter reviews some basic concepts related to the thesis, such as time series data, its characteristics, and the problems associated with it.

The second part briefly introduces the objectives and content of the thesis. It also outlines the main topics of the research and the results achieved.

The final part of the chapter briefly introduces the main content of each chapter of the thesis.

1.1 TIME SERIES DATA MINING

1.1.1 The importance of time series data mining

Today, with the development of computer science, the data that serves human life is increasingly digitized, stored, and processed on computers or on the Internet, which makes it easy to query when needed. However, the volume of data of all kinds is growing rapidly, creating big data. This drives the need for fast and effective data mining, not only for businesses but also for individuals. Data mining is therefore becoming ever more important and has attracted much research worldwide aimed at meeting the demand for timely and complete information retrieval.

One of these kinds of data is time series data. Time series data is used every day, is highly applicable, and plays a very important role in tasks that require processing information and signals over time, in fields such as economics, finance, healthcare, education, the environment, and geography. Understanding and extracting the information hidden in time series data is of great significance and contributes decisively to the development of these fields. For example, in the face of flooding, saltwater intrusion, and prolonged drought, time series data is used to analyze the environment and to analyze and forecast the weather, in order to identify causes and remedies.

This thesis focuses on the problem of classifying time series data. Before going into the characteristics and details of the classification problem, however, some related concepts need to be clarified.

1.1.2 Time series data

A time series $X$ is a collection of data samples, each of which is a pair $(T, V)$, where $T$ is the time at which the observation is made and $V$ is the observed value; $n$ is the number of samples taken. A time series is written as:

$X = (x_1, x_2, \ldots, x_n)$
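The definition above can be illustrated with a minimal Python sketch. The names `Sample` and `to_series` and the sample values are hypothetical and chosen only for illustration; the sketch assumes that each sample is a (time, value) pair and that the series $X$ is obtained by ordering the values by time.

```python
from typing import List, Tuple

# Each sample is a pair (T, V): observation time and observed value.
Sample = Tuple[float, float]


def to_series(samples: List[Sample]) -> List[float]:
    """Order the samples by time T and return the values X = (x_1, ..., x_n)."""
    return [value for _, value in sorted(samples, key=lambda s: s[0])]


# Hypothetical samples, deliberately given out of time order.
samples = [(2.0, 11.5), (0.0, 10.0), (1.0, 10.8)]
X = to_series(samples)
n = len(X)  # number of samples taken
```

Keeping only the ordered values is sufficient when the sampling interval is fixed, as the definition of a time series in the summary assumes.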


Figure 1.1: Red wine sales in Australia from 1980 to 1991 [13]

1.2 SOME CONCEPTS RELATED TO TIME SERIES DATA

The main concepts used in this thesis:

- Training set

A set of data that has already been classified, used to build the label-prediction model.

- Classification

Given an unlabeled time series Q and M classes, each containing k time series that share certain characteristics under a predefined distance measure, the classification problem assigns Q to one of the M classes.

- Distance measure

A method for computing the distance between two time series Q and C. Two time series are considered similar when the distance between them approaches 0.

Some other related concepts:

- Clustering

Clustering partitions time series data into groups such that members of the same cluster are similar to one another, while members of different clusters are very different.

- Forecasting (prediction)

Given a time series $Q$ with $n$ data points, the problem is to forecast the values of the series at the consecutive time points from $n + 1$ to $n + k$.

- Novelty detection

This problem identifies unusual subsequences (also called abnormal, discord, or novel subsequences), that is, the subsequences that differ most from the other subsequences in the time series.

- Motif detection

This problem identifies motifs (repeated patterns), that is, the subsequences that recur most often in the time series.
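The training set, distance measure, and classification concepts defined above can be combined in a short sketch. This is an illustration under stated assumptions rather than the thesis implementation: it uses the Euclidean distance as the distance measure and the one-nearest-neighbor rule as the classifier, and the function names `euclidean` and `classify_1nn` are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def euclidean(q: Sequence[float], c: Sequence[float]) -> float:
    """Distance between two equal-length series Q and C; 0 means identical."""
    if len(q) != len(c):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def classify_1nn(
    q: Sequence[float],
    training_set: List[Tuple[Sequence[float], str]],
) -> Optional[str]:
    """Assign Q the label of the closest labeled series in the training set."""
    best_label, best_dist = None, math.inf
    for series, label in training_set:
        d = euclidean(q, series)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Hypothetical training set with two classes.
training = [([0.0, 0.0, 0.0], "flat"), ([1.0, 2.0, 3.0], "rising")]
predicted = classify_1nn([1.0, 2.0, 2.5], training)
```

Any other distance measure with the same signature could be passed in place of `euclidean`; the nearest-neighbor rule itself does not depend on the choice.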

Besides the problems listed above, time series data mining also includes other problems such as association rule mining and query by content. However, this thesis focuses only on …

SAX-VSM, in contrast, has linear complexity. Like the Bag of patterns (BOP) method, it converts all training time series into bags of words and uses the vector space model for classification. However, whereas BOP builds n bags for the time series in the training set, SAX-VSM builds a single bag of words per class, which efficiently yields N weight vectors (N being the number of classes) and a fast classification time.
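The idea of one bag of words per class, tf*idf weight vectors, and cosine-similarity classification can be sketched as follows. This is a minimal sketch under assumptions: the log-scaled tf*idf formula shown is one common variant and is not confirmed by this excerpt as the thesis's exact weighting, the SAX words are taken as already extracted, and all names (`tfidf_vectors`, `cosine`, `classify`, `class_bags`) and the word counts are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(class_bags: Dict[str, Counter]) -> Dict[str, Dict[str, float]]:
    """Build one tf*idf weight vector per class from one bag of words per class."""
    n_classes = len(class_bags)
    # Document frequency: in how many class bags each word occurs.
    df: Counter = Counter()
    for bag in class_bags.values():
        df.update(bag.keys())
    weights = {}
    for label, bag in class_bags.items():
        # Assumed variant: log-scaled term frequency times log inverse frequency.
        weights[label] = {
            word: math.log(1 + count) * math.log(n_classes / df[word])
            for word, count in bag.items()
        }
    return weights


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(words: List[str], weights: Dict[str, Dict[str, float]]) -> str:
    """Label an unlabeled series, given as SAX words, by maximum cosine similarity."""
    tf = {w: float(c) for w, c in Counter(words).items()}
    return max(weights, key=lambda label: cosine(tf, weights[label]))


# Hypothetical bags: one per class, counting SAX words over that class's series.
class_bags = {
    "A": Counter({"aab": 5, "abc": 2, "cba": 1}),
    "B": Counter({"ccb": 4, "abc": 3, "bca": 2}),
}
weights = tfidf_vectors(class_bags)
label = classify(["aab", "abc", "aab"], weights)
```

Note that a word occurring in every class, such as `abc` here, receives weight zero, so only class-specific words influence the decision. Classification compares the query against N vectors rather than against every training series.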


1.4 OBJECTIVES AND TASKS OF THE THESIS

The main objective of the thesis is to study time series classification using Symbolic Aggregate approXimation (SAX) combined with the Vector Space Model (VSM).

The work carried out in the thesis includes:

- Studying dimensionality reduction with Piecewise Aggregate Approximation (PAA) and how to compute it on time series data.

- Studying Symbolic Aggregate approXimation (SAX), which transforms a time series into a string of symbols.

- Studying the use of the Vector Space Model (VSM) for classification.

- Implementing the software and experimenting on benchmark datasets to compare the performance of the SAX-VSM classification method with that of 1NN-DTW and Bag of patterns.

- Reporting the results obtained from experiments on a variety of datasets.
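The PAA and SAX steps named in the task list can be sketched in a few lines. This is a minimal illustration, not the thesis's program: the function names `znorm`, `paa`, and `sax_word` are hypothetical, the series length is assumed to be divisible by the number of segments, and the breakpoints are computed as equiprobable quantiles of the standard normal distribution with Python's `statistics.NormalDist` instead of being read from a lookup table.

```python
from statistics import NormalDist, mean, pstdev
from typing import List, Sequence


def znorm(x: Sequence[float]) -> List[float]:
    """Z-score normalization: zero mean and unit standard deviation."""
    mu, sigma = mean(x), pstdev(x)
    if sigma == 0:
        return [0.0] * len(x)
    return [(v - mu) / sigma for v in x]


def paa(x: Sequence[float], segments: int) -> List[float]:
    """Piecewise Aggregate Approximation: the mean of each equal-sized segment."""
    if len(x) % segments != 0:
        raise ValueError("this sketch assumes len(x) is divisible by segments")
    size = len(x) // segments
    return [mean(x[i * size:(i + 1) * size]) for i in range(segments)]


def sax_word(x: Sequence[float], segments: int, alphabet: str = "abc") -> str:
    """Normalize, reduce with PAA, then map each segment mean to a symbol."""
    a = len(alphabet)
    # Breakpoints split the standard normal curve into `a` equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(i / a) for i in range(1, a)]
    symbols = []
    for v in paa(znorm(x), segments):
        index = sum(v > b for b in breakpoints)  # breakpoints lying below v
        symbols.append(alphabet[index])
    return "".join(symbols)


# Hypothetical series of length 8 reduced to a 4-symbol word.
word = sax_word([1, 1, 2, 2, 5, 5, 9, 9], segments=4, alphabet="abc")
```

The series of 8 real values is reduced to 4 symbols, which is the compaction that the later vector space model step relies on.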

1.5 RESULTS ACHIEVED

- Dimensionality reduction: experiments with different classification algorithms and many different datasets show that reducing the dimensionality of time series is genuinely effective and improves computation speed.

- The 1NN-DTW classification algorithm yields fairly high accuracy; however, on long time series it takes considerably more execution time than Bag of patterns and SAX-VSM.

- Bag of patterns and SAX-VSM both give fairly good results in the experiments; however, whereas Bag of patterns creates n bags, SAX-VSM merges the series of each class into a single bag, which gives a better execution time.
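The execution-time cost of 1NN-DTW on long series follows from the structure of the DTW computation, which can be sketched as below. This is a minimal, unconstrained version under assumptions: it uses the squared difference as the local cost and takes a square root at the end, which is one common convention, and it applies no warping window such as the Sakoe-Chiba band. The name `dtw_distance` is hypothetical.

```python
import math
from typing import Sequence


def dtw_distance(q: Sequence[float], c: Sequence[float]) -> float:
    """Dynamic time warping distance between series Q and C.

    Fills an (n+1) x (m+1) cumulative-cost matrix, so time and memory
    grow with the product of the two series lengths.
    """
    n, m = len(q), len(c)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (q[i - 1] - c[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in Q only
                cost[i][j - 1],      # advance in C only
                cost[i - 1][j - 1],  # advance in both
            )
    return math.sqrt(cost[n][m])


# The second series repeats one value; warping aligns the two exactly.
d = dtw_distance([0.0, 1.0, 2.0], [0.0, 1.0, 1.0, 2.0])
```

A 1NN classifier calls this function once per training series for every query, so the quadratic cost per pair is paid many times.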


CHAPTER 2: THEORETICAL BACKGROUND AND RELATED WORK

This chapter presents in detail the theoretical foundations applied in the thesis, such as dimensionality reduction with Piecewise Aggregate Approximation (PAA), Symbolic Aggregate approXimation (SAX), and the Vector Space Model (VSM).

Posted: 03/08/2024, 13:53
