Percentage of correct and incorrect detections at different thresholds δ

δ                           0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9
Correctly identified (%)     75   63   57   35   22   17    0    0    1
Incorrectly identified (%)    0    0    0    0    0    0   21   48   92

3.5. Chapter summary

Chapter 3 presented the inexact graph matching problem in detail, together with several related approaches and their limitations; in particular, those existing algorithms can only be applied to specific classes of graphs. The thesis then described in detail a new approach to graph matching based on a genetic algorithm. The proposed algorithm can be applied to several classes of graphs: undirected, directed, weighted, or labeled. These results were published in [4].

Building on this, the thesis proposed applying the graph matching algorithm to the detection of phishing web pages based on their DOM structure. The experimental results show that the approach based on the genetic algorithm performs better than the Simple Tree Matching (STM) algorithm. This is a promising approach for integration into phishing detection systems. These results were published in [6].
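For reference, the STM baseline mentioned above can be sketched as follows. This is the textbook dynamic-programming formulation of Simple Tree Matching, written in Python for illustration only. The `Node` class, the function names, and the normalization by average tree size are assumptions made for this sketch; the thesis may define the similarity score differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A DOM-like tree node: a tag name plus an ordered list of children."""
    tag: str
    children: List["Node"] = field(default_factory=list)


def simple_tree_matching(a: Node, b: Node) -> int:
    """Return the number of node pairs in a maximum top-down matching."""
    if a.tag != b.tag:
        return 0  # roots differ, so nothing below them can be matched
    m, n = len(a.children), len(b.children)
    # table[i][j]: best order-preserving matching between the first i
    # children of a and the first j children of b.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            table[i][j] = max(
                table[i - 1][j],
                table[i][j - 1],
                table[i - 1][j - 1]
                + simple_tree_matching(a.children[i - 1], b.children[j - 1]),
            )
    return table[m][n] + 1  # +1 for the matched pair of roots


def tree_size(t: Node) -> int:
    """Count the nodes of a tree."""
    return 1 + sum(tree_size(c) for c in t.children)


def normalized_similarity(a: Node, b: Node) -> float:
    """Matching size divided by the average tree size (an assumed normalization)."""
    return simple_tree_matching(a, b) / ((tree_size(a) + tree_size(b)) / 2)
```

Two DOM trees with the same tags and layout yield a normalized score of 1.0, and the score drops as subtrees are added, removed, or renamed. A detector would compare this score against a threshold to decide whether a suspicious page imitates a known legitimate one.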

CONCLUSION

Results of the thesis

Network intrusion and phishing detection systems analyze information in order to monitor, detect, and prevent unauthorized access to resources that would compromise the confidentiality, integrity, and availability of a system. Among the many available approaches, pattern matching is a technique widely used in network intrusion detection and prevention systems. In such systems, potential threats are detected by matching packet contents against known patterns.

With the goal of applying matching algorithms to the development of intrusion detection systems, the thesis achieved the following results:

1. Analyzed and evaluated the performance and execution time of existing pattern matching algorithms on the Snort intrusion detection system. This result was published in [1];

2. Proposed improvements to the Aho-Corasick multi-pattern matching algorithm that use row compression and an index table to make the algorithm more efficient. Practical analyses and comparisons were also carried out on Snort to verify the theory. This result was published in [3];

3. Proposed a new multi-pattern matching algorithm that builds a diagram of the patterns combined with linked lists, reducing the time needed to match multiple patterns simultaneously. The algorithm was also implemented experimentally on Snort and compared with several existing algorithms. This result was published in [5].
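As a point of reference for item 2, the following is a minimal Python sketch of the textbook Aho-Corasick algorithm, with transitions stored in dictionaries. It does not include the row compression or index table proposed in the thesis, and the names `build_automaton` and `search` are illustrative assumptions rather than part of Snort.

```python
from collections import deque
from typing import Dict, List, Tuple


def build_automaton(patterns: List[str]):
    """Build the goto, failure, and output tables for a set of patterns."""
    goto: List[Dict[str, int]] = [{}]  # goto[state][char] -> next state
    fail: List[int] = [0]              # failure link of each state
    out: List[List[str]] = [[]]        # patterns recognized in each state

    # Phase 1: insert every pattern into a trie.
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)

    # Phase 2: breadth-first traversal to compute failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] += out[fail[nxt]]  # inherit matches ending at the suffix
    return goto, fail, out


def search(text: str, automaton) -> List[Tuple[int, str]]:
    """Scan text once and return (start_index, pattern) for every match."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits
```

For example, `search("ushers", build_automaton(["he", "she", "his", "hers"]))` returns `[(1, "she"), (2, "he"), (2, "hers")]`. The dictionary-per-state layout is simple but memory-hungry for large rule sets, which is the kind of cost that state-space optimizations such as row compression aim to reduce.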

With the goal of detecting phishing web pages, the thesis achieved the following results:

4. Proposed a new algorithm, based on a genetic algorithm, for inexact graph matching. The new algorithm can be applied to undirected, directed, weighted, or labeled graphs. This result was published in [4];

5. Applied graph matching to the comparison of DOM trees in order to detect phishing web pages. This result was published in [6].
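The idea behind item 4 can be illustrated with a small Python sketch of a genetic algorithm for inexact graph matching. It is not the algorithm of the thesis: the permutation encoding, the edge-mismatch cost, order crossover, swap mutation, and all parameter values are assumptions chosen for illustration, and the sketch handles only undirected, unweighted graphs with the same number of nodes.

```python
import random
from typing import Iterable, List, Set, Tuple

Edge = Tuple[int, int]


def normalize(edges: Iterable[Edge]) -> Set[Edge]:
    """Store undirected edges as sorted pairs so (u, v) equals (v, u)."""
    return {(min(u, v), max(u, v)) for u, v in edges}


def mismatch(perm: List[int], e1: Set[Edge], e2: Set[Edge]) -> int:
    """Count edges that disagree once graph 1 is relabeled through perm."""
    mapped = normalize((perm[u], perm[v]) for u, v in e1)
    return len(mapped ^ e2)


def order_crossover(p1: List[int], p2: List[int], rng: random.Random) -> List[int]:
    """Copy a slice from p1 and fill the rest in p2's order (stays a permutation)."""
    n = len(p1)
    i, j = sorted(rng.sample(range(n), 2))
    kept = set(p1[i:j + 1])
    rest = [g for g in p2 if g not in kept]
    return rest[:i] + p1[i:j + 1] + rest[i:]


def swap_mutation(perm: List[int], rng: random.Random, rate: float) -> List[int]:
    """With probability rate, swap two positions of a copy of perm."""
    perm = perm[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(perm)), 2)
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def ga_match(n, edges1, edges2, pop_size=40, generations=200, rate=0.3, seed=0):
    """Search for a node mapping perm (graph 1 -> graph 2) with few mismatches."""
    rng = random.Random(seed)
    e1, e2 = normalize(edges1), normalize(edges2)

    def cost(p: List[int]) -> int:
        return mismatch(p, e1, e2)

    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        if cost(pop[0]) == 0:
            break  # exact match found
        elite = pop[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(swap_mutation(order_crossover(p1, p2, rng), rng, rate))
        pop = elite + children
    best = min(pop, key=cost)
    return best, cost(best)
```

A call such as `ga_match(4, [(0, 1), (1, 2), (2, 3), (3, 0)], [(2, 0), (0, 3), (3, 1), (1, 2)])` returns a permutation together with its mismatch count; a count of 0 means the two graphs are isomorphic under that mapping, while a small positive count indicates an inexact match. Supporting directed, weighted, or labeled graphs would mainly require a different cost function.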

Directions for future work

The results of the thesis are so far limited to matching the structure of a web page and its textual content. Elements such as images and audio are commonly used in web pages and are often indispensable parts of them. These components also need to be matched for two web pages to be compared more accurately. This is a limitation of the thesis and also one of its directions for further research.

LIST OF PUBLISHED WORKS RELATED TO THE THESIS

[1]. Lê Đắc Nhường, Lê Đăng Nguyên, Trịnh Thị Thùy Giang, Lê Trọng Vĩnh. Phân tích, đánh giá hiệu quả của các thuật toán so khớp chuỗi dùng trong an ninh mạng [Analysis and evaluation of the effectiveness of string matching algorithms used in network security], 14th Conference on Selected Issues in Information and Communication Technology, pp. 451-463, Cần Thơ, October 7-8, 2011. Science and Technics Publishing House, Hanoi, 2012.

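[2]. Lê Đắc Nhường, Nguyễn Gia Như, Lê Đăng Nguyên, Lê Trọng Vĩnh. Song song hóa thuật toán so khớp mẫu QuickSearch trong NIDS sử dụng mô hình chia sẻ bộ nhớ trên OpenMP và Pthreads [Parallelizing the QuickSearch pattern matching algorithm in NIDS using the shared-memory model with OpenMP and Pthreads]. Tạp chí Đại học Quốc gia Hà Nội (Journal of Vietnam National University, Hanoi), Vol. 28(4), pp. 255-263, December 2012.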

[3]. Lê Đắc Nhường, Nguyễn Gia Như, Lê Đăng Nguyên, Lê Trọng Vĩnh. Tối ưu không gian trạng thái của thuật toán AHO-CORASICK sử dụng kỹ thuật nén dòng và bảng chỉ số [Optimizing the state space of the Aho-Corasick algorithm using row compression and an index table], Chuyên san Bưu chính viễn thông các công trình nghiên cứu và ứng dụng công nghệ thông tin, No. 9(29), pp. 23-29, 2013.

[4]. Le Dang Nguyen, Dac-Nhuong Le, Tran Thi Huong, Le Trong Vinh. A New Genetic Algorithm Applied to Inexact Graph Matching. International Journal of Computer Science and Telecommunications, Volume 5, Issue 5, pp. 1-7, May 2014.

[5]. Le Dang Nguyen, Dac-Nhuong Le, Le Trong Vinh. A New Multiple-Pattern Matching Algorithm for the Network Intrusion Detection System. 4th International Conference on Security Science and Technology (ICSST 2015), January 15-16, 2015, Portsmouth, UK. Vol. 8, No. 2, pp. 94-100, 2015.

[6]. Le Dang Nguyen, Dac-Nhuong Le, Le Trong Vinh. Detecting Phishing Web Pages based on DOM-Tree Structure and Graph Matching Algorithm. The Fifth International Symposium on Information and Communication Technologies (SoICT 2014), December 4-5, 2014, Hanoi, Vietnam, pp. 280-285.

