DOCTORAL THESIS IN COMPUTER SCIENCE: DISCOVERING FUZZY ASSOCIATION RULES AND FUZZY SEQUENTIAL RULES IN TEMPORAL QUANTITATIVE DATABASES

MINISTRY OF EDUCATION AND TRAINING
VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY
GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY

TRƯƠNG ĐỨC PHƯƠNG

DISCOVERING FUZZY ASSOCIATION RULES AND FUZZY SEQUENTIAL RULES IN TEMPORAL QUANTITATIVE DATABASES

DOCTORAL THESIS IN COMPUTER SCIENCE

Major: Information Systems
Code: 48 01 04
Scientific supervisors: Assoc. Prof. Dr. Đỗ Văn Thành and Assoc. Prof. Dr. Nguyễn Đức Dũng

Hanoi, 2021

DECLARATION

I declare that this thesis is my own research work. Results written jointly with other authors were included in the thesis only with the consent of the co-authors. The results presented in the thesis are truthful and have not been published in any other work. The thesis was completed while I was a PhD candidate at the Graduate University of Science and Technology, Vietnam Academy of Science and Technology.

Thesis author
PhD candidate Trương Đức Phương

ACKNOWLEDGEMENTS

The doctoral thesis "Discovering fuzzy association rules and fuzzy sequential rules in temporal quantitative databases" was carried out under the scientific supervision of Assoc. Prof. Dr. Đỗ Văn Thành and Assoc. Prof. Dr. Nguyễn Đức Dũng.

First of all, I would like to express my deep gratitude to my supervisors, Assoc. Prof. Dr. Đỗ Văn Thành and Assoc. Prof. Dr. Nguyễn Đức Dũng. While working on the thesis, I received a great deal of scientific direction, valuable lessons, and dedicated guidance from them. They were always devoted, and they encouraged, advised, and helped me to complete the thesis.

I sincerely thank the faculty of the Graduate University of Science and Technology for providing favorable conditions throughout my research and the preparation of the thesis.

I thank the Board of Rectors and the staff and lecturers of the Faculty of Natural Sciences and Technology, Hanoi Metropolitan University, for their support throughout my studies and research.

On this occasion, I would also like to express my deep gratitude to my family and friends, who gave me firm support and the motivation to complete the thesis.

Author
PhD candidate Trương Đức Phương

TABLE OF CONTENTS

List of figures
List of tables
List of abbreviations
Introduction
Chapter 1. Overview of association rules, sequential patterns, and common sequential rules
1.1 Association rules
1.1.1 Discovering association rules in transaction databases
1.1.2 Discovering association rules in quantitative databases
1.1.3 Discovering association rules that take into account the time interval between transactions in temporal databases
1.2 Sequential patterns
1.2.1 Discovering sequential patterns in transaction sequence databases
1.2.2 Discovering sequential patterns in quantitative sequence databases
1.2.3 Discovering sequential patterns that take into account the time interval between transactions in temporal sequence databases
1.3 Common sequential rules
1.3.1 The concept of common sequential rules
1.3.2 Discovering common sequential rules
Conclusion of Chapter 1
Chapter 2. Discovering association rules with time intervals in temporal quantitative databases
2.1 Introduction
2.2 Basic concepts
2.3 An algorithm for discovering fuzzy association rules with fuzzy time intervals
2.3.1 Problem statement
2.3.2 Idea of the algorithm
2.3.3 The FTQ algorithm
2.3.4 Soundness and completeness of the algorithm
2.3.5 Complexity of the algorithm
2.3.6 Degenerate cases of fuzzy association rules with fuzzy time intervals
2.4 Experimental evaluation
2.4.1 Experimental data
2.4.2 Experimental results
Conclusion of Chapter 2
Chapter 3. Discovering sequential patterns with time intervals in temporal quantitative sequence databases
3.1 Introduction
3.2 Basic concepts
3.3 An algorithm for discovering fuzzy sequential patterns with fuzzy time intervals
3.3.1 Problem statement
3.3.2 Idea of the algorithm
3.3.3 The FSPFTIM algorithm
3.3.4 Soundness and completeness of the algorithm
3.3.5 Complexity of the algorithm
3.3.6 Degenerate cases of fuzzy sequential patterns with fuzzy time intervals
3.3.7 Illustration of the algorithm
3.4 Experimental evaluation
3.4.1 Experimental data
3.4.2 Experimental results
Conclusion of Chapter 3
Chapter 4. Discovering common sequential rules with time intervals in temporal quantitative sequence databases
4.1 Introduction
4.2 Basic concepts
4.3 An algorithm for discovering fuzzy common sequential rules with fuzzy time intervals
4.3.1 Problem statement
4.3.2 The IFERMiner algorithm
4.3.3 Soundness and completeness
4.3.4 Complexity of the IFERMiner algorithm
4.3.5 Degenerate cases of fuzzy common sequential rules with fuzzy time intervals
4.4 Experimental evaluation
4.4.1 Experimental data
4.4.2 Experimental results
Conclusion of Chapter 4
Conclusions and recommendations
List of published works
References

LIST OF FIGURES

Figure 1.1 Issues related to the research of the thesis
Figure 2.1 Membership functions of the fuzzy sets for the rate of increase/decrease of stock codes
Figure 2.2 Membership functions of the fuzzy sets for the rate of change of the VN30 index
Figure 2.3 Membership functions of the fuzzy sets for time
Figure 2.4 Relationship between the number of rules found by the FTQ algorithm and the minimum confidence min_conf, for different values of the minimum support min_sup
Figure 2.5 Relationship between the number of rules found and min_sup, for different values of min_conf
Figure 2.6 Running time with min_conf = 70%
Figure 2.7 Comparison of the number of rules between the fuzzification method (A) and the interval-partitioning method (B) for time intervals when running the FTQ algorithm
Figure 2.8 Comparison of the running time between the fuzzification method (A) and the interval-partitioning method (B) for time intervals when running the FTQ algorithm
Figure 3.1 Membership functions of the fuzzy sets in LT
Figure 3.2 Relationship between the number of fuzzy sequential patterns with fuzzy time intervals and min_sup (a), and between the running time of the algorithm and min_sup (b), for different numbers of partitions of the quantitative attributes, on the dataset S100I1000T3D341K
Figure 3.3 Relationship between the number of fuzzy sequential patterns with fuzzy time intervals and min_sup (a), and between the running time of the algorithm and min_sup (b), for different numbers of partitions of the quantitative attributes, on the dataset Online Retail_France
Figure 3.4 Relationship between the number of fuzzy sequential patterns with fuzzy time intervals and min_sup (a), and between the running time of the algorithm and min_sup (b), for different numbers of partitions of the time interval (Kt), on the dataset S100I1000T3D341K
Figure 3.5 Relationship between the number of fuzzy sequential patterns with fuzzy time intervals and min_sup (a), and between the running time of the algorithm and min_sup (b), for different numbers of partitions of the time interval (Kt), on the dataset Online Retail_France
Figure 3.6 Comparison of the number of sequential patterns between the fuzzification method (A) and the interval-partitioning method (B) for time intervals when running the FSPFTIM algorithm
Figure 3.7 Comparison of the running time between the fuzzification method (A) and the interval-partitioning method (B) for time intervals when running the FSPFTIM algorithm
Figure 4.1 Relationship between the number of valid FCSI rules and min_sup, min_conf
Figure 4.2 Relationship between the running time of the algorithm and min_sup, min_conf
Figure 4.3 Comparison of the number of rules between the fuzzification method (A) and the interval-partitioning method (B) for time intervals when running the IFERMiner algorithm
Figure 4.4 Comparison of the running time between the fuzzification method (A) and the interval-partitioning method (B) for time intervals when running the IFERMiner algorithm

LIST OF TABLES

Table 1.1 Example of a transaction database
Table 1.2 Example of a quantitative database
Table 1.3 Example of a purchase transaction database with temporal information
Table 1.4 Some studies on discovering association rules that take time intervals into account
Table 1.5 Example of a transaction sequence database
Table 1.6 Example of a quantitative sequence database
Table 1.7 Example of a temporal transaction sequence database
Table 1.8 Some studies on discovering sequential patterns that take time intervals into account
Table 1.9 Some studies on discovering common sequential rules
Table 2.1 Example of a temporal quantitative database
Table 2.2 The temporal fuzzy database DF
Table 2.3 Experimental data: ISTANBUL STOCK EXCHANGE
Table 2.4 Experimental results of the FTQ algorithm with min_sup = 7% and min_conf = 70%
Table 2.5 Meaning of the rules obtained from the VNINDEX database
Table 3.1 Example of a temporal quantitative sequence database QSD
Table 3.2 The temporal fuzzy sequence database FSD
Table 3.3 Experimental data: S100I1000T3D341K and Online Retail_France
Table 3.4 Fuzzy sequential patterns with fuzzy time intervals found in Online Retail_France
Table 4.1 Experimental data for the IFERMiner algorithm

Table 4.2 Valid FCSI rules with the highest support for min_sup = 3% and min_conf = 75%

Rule 1: $301\mathrm{Small} \overset{\mathrm{Long}}{\Longrightarrow} 1042\mathrm{Small}$
Support: 4.38%. Confidence: 76.13%.
Meaning: If a customer buys item 301 in a Small quantity, then the customer buys item 1042 in a Small quantity after a Long time, with support 4.38% and confidence 76.13%.

Rule 2: $63\mathrm{Small},\ 413\mathrm{Small} \overset{\mathrm{Long}}{\Longrightarrow} 1042\mathrm{Small}$
Support: 4.41%. Confidence: 79.99%.
Meaning: If a customer buys item 63 in a Small quantity and item 413 in a Small quantity, then the customer buys item 1042 in a Small quantity after a Long time, with support 4.41% and confidence 79.99%.

Rule 3: $616\mathrm{Small},\ 833\mathrm{Small} \overset{\mathrm{Long}}{\Longrightarrow} 1042\mathrm{Small}$
Support: 4.36%. Confidence: 94.79%.
Meaning: If a customer buys item 616 in a Small quantity and item 833 in a Small quantity, then the customer buys item 1042 in a Small quantity after a Long time, with support 4.36% and confidence 94.79%.

Rule 4: $62\mathrm{Small},\ 116\mathrm{Small} \overset{\mathrm{Long}}{\Longrightarrow} 1250\mathrm{Small}$
Support: 4.60%. Confidence: 80.00%.
Meaning: If a customer buys item 62 in a Small quantity and item 116 in a Small quantity, then the customer buys item 1250 in a Small quantity after a Long time, with support 4.60% and confidence 80.00%.

Rule 5: $62\mathrm{Small},\ 116\mathrm{Small},\ 385\mathrm{Small} \overset{\mathrm{Long}}{\Longrightarrow} 1250\mathrm{Small}$
Support: 4.40%. Confidence: 98.29%.
Meaning: If a customer buys item 62 in a Small quantity, item 116 in a Small quantity, and item 385 in a Small quantity, then the customer buys item 1250 in a Small quantity after a Long time, with support 4.40% and confidence 98.29%.

In Table 4.2 above, the numbers 301, 1042, 63, 413, 616, 833, etc. are the StockCodeIDs of the items. Specifically, StockCodeID 301 corresponds to the item named "Big doughnut fridge magnets"; StockCodeID 1042 is "Rabbit night light"; StockCodeID 63 is "Red retrospot mini cases"; StockCodeID 413 is "Tea party birthday card"; StockCodeID 616 is "10 Colour spaceboy pen"; StockCodeID 833 is "Poppy's playhouse bathroom"; StockCodeID 62 is "Assorted colour mini cases"; StockCodeID 116 is "Set/6 red spotty paper plates"; StockCodeID 385 is "Red retrospot picnic bag"; StockCodeID 1250 is "Spaceboy mini backpack".

Conclusion of Chapter 4

This chapter presented an algorithm, called IFERMiner, for discovering fuzzy common sequential rules with fuzzy time intervals in temporal quantitative databases. The quantitative attributes and the time intervals are fuzzified in the same way as in the two preceding chapters. IFERMiner is developed from the idea of the ERMiner algorithm, which discovers common sequential rules in transaction databases without temporal information. Specifically, IFERMiner is built on left and right fuzzy equivalence classes of frequent fuzzy common sequential rules with fuzzy time intervals, together with left and right merge operations on these equivalence classes. The algorithm is sound and complete, and its complexity is proved to be polynomial. Experiments with IFERMiner show that the algorithm behaves correctly and satisfies the downward-closure property. The experimental results are also compared with those of the corresponding interval-partitioning method. The common sequential rules discovered by IFERMiner are shown to be suitable and meaningful for practical use.

CONCLUSIONS AND RECOMMENDATIONS

The thesis achieved the following results:

The PhD candidate has completed the objectives of the thesis. The research focused on discovering association rules, sequential patterns, and common sequential rules that take time intervals into account, in temporal quantitative databases and temporal quantitative sequence databases respectively. Specifically, the thesis proposed and solved the following problems.

First, it proposed and solved the problem of discovering association rules that take into account the time interval between transactions in temporal quantitative databases. The rules discovered are called fuzzy association rules with fuzzy time intervals. The FTQ algorithm was proposed by the PhD candidate to discover them. The algorithm fuzzifies the quantitative attributes and then proceeds along the lines of the Apriori algorithm. Experiments show that the algorithm agrees with the theory of association rules, in particular with the downward-closure property (the Apriori property of frequent itemsets). The rules discovered are shown to be meaningful in practical applications.

Second, it proposed and solved the problem of discovering sequential patterns that take into account the time interval between transactions in temporal quantitative sequence databases. These patterns are called fuzzy sequential patterns with fuzzy time intervals, and the algorithm that discovers them is called FSPFTIM. The algorithm is sound and complete, and its computational complexity is given in the thesis. Experiments on real datasets show that the algorithm agrees with the theory regarding the downward-closure property of frequent patterns. The sequential patterns discovered are suitable and meaningful in practical applications.

Third, it proposed and solved the problem of discovering common sequential rules that take into account the time interval between transactions in temporal quantitative sequence databases. The rules discovered are called fuzzy common sequential rules with fuzzy time intervals, and the algorithm proposed to discover them is called IFERMiner. The algorithm is sound and complete, and its computational complexity is shown to be polynomial. Experiments on real datasets show that the algorithm agrees with the downward-closure property of common sequential rules, and that the rules discovered are consistent with practice and meaningful in applications.

Future research directions:

The algorithms for discovering fuzzy association rules and fuzzy sequential patterns with fuzzy time intervals, in temporal quantitative databases and temporal quantitative sequence databases respectively, were developed on the basis of the Apriori algorithm, which is regarded as only moderately efficient compared with other association rule mining algorithms. One direction for research after this thesis is therefore to develop more efficient algorithms for discovering fuzzy association rules and fuzzy sequential patterns with time intervals.

Sequential patterns and common sequential rules express relationships among transactions performed by the same object. Another priority direction is to discover kinds of sequential patterns and common sequential rules that express relationships among transactions performed by different objects, provided that the transactions appearing earlier in a pattern, or in the antecedent of a rule, occur before the transactions appearing later in the pattern, or in the consequent of the rule.

LIST OF PUBLISHED WORKS

[CT1] Trương Đức Phương, Đỗ Văn Thành, "Discovering inter-transaction sequential rules from temporal databases" (in Vietnamese), Proceedings of the 7th National Conference on Science and Technology (FAIR'7), 2014, pp. 488-495.
[CT2] Trương Đức Phương, Đỗ Văn Thành, "Discovering association rules that link time series" (in Vietnamese), 17th National Symposium on Selected Problems of Information and Communication Technology, 2014, pp. 257-262.
[CT3] Truong Duc Phuong, Do Van Thanh, Nguyen Duc Dung, "Mining fuzzy time-interval association rules from temporal quantitative databases", International Conference on Information and Convergence Technology for Smart Society, Vol. 2, No. 1, 2016, pp. 52-58.
[CT4] Truong Duc Phuong, Do Van Thanh, Nguyen Duc Dung, "An Effective Algorithm for Association Rules Mining from Temporal Quantitative Databases", Indian Journal of Science and Technology, Vol. 9(17), 2016 (Scopus*).
[CT5] Truong Duc Phuong, Do Van Thanh, Nguyen Duc Dung, "Mining Fuzzy Sequential Patterns with Fuzzy Time-Intervals in Quantitative Sequence Databases", Cybernetics and Information Technologies, Vol. 18(2), 2018, pp. 3-19 (Scopus).
[CT6] Trương Đức Phương, Đỗ Văn Thành, Nguyễn Đức Dũng, "Discovering fuzzy sequential patterns with specified time intervals from quantitative sequence databases" (in Vietnamese), 21st National Symposium on Selected Problems of Information and Communication Technology, 2018, pp. 280-287.
[CT7] Trương Đức Phương, "Building a forecasting model for the VN30 index of the Vietnamese stock market" (in Vietnamese), 21st National Symposium on Selected Problems of Information and Communication Technology, 2018, pp. 383-389.
[CT8] Thanh Do Van, Phuong Truong Duc, "Fuzzy Common Sequential Rules Mining In Quantitative Sequence Databases", Journal of Computer Science and Cybernetics, Vol. 35(3), 2019, pp. 217-232.
[CT9] Thanh Do Van, Phuong Truong Duc, "Mining Fuzzy Common Sequential Rules with Fuzzy Time-Interval in quantitative sequence databases", International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, Vol. 28(6), 2020, pp. 957 (SCIE).

* https://www.scopus.com/authid/detail.uri?authorId=57189375549
CHAPTER: DISCOVERING ASSOCIATION RULES WITH TIME DISTANCES IN QUANTITATIVE DATABASES WITH A TIME COMPONENT

In Chapter 1, the thesis identified the research gaps in discovering association rules that take into account ...
