ENSEMBLE MACHINE LEARNING AND DATA PREPROCESSING TECHNIQUES FOR IMPROVING THE CLASSIFICATION QUALITY OF NETWORK INTRUSION DETECTION SYSTEMS

MINISTRY OF EDUCATION AND TRAINING
LAC HONG UNIVERSITY

HOÀNG NGỌC THANH

ENSEMBLE MACHINE LEARNING AND DATA PREPROCESSING TECHNIQUES FOR IMPROVING THE CLASSIFICATION QUALITY OF NETWORK INTRUSION DETECTION SYSTEMS

DOCTORAL THESIS IN COMPUTER SCIENCE

Major: Computer Science
Major code: 9480101

SCIENTIFIC SUPERVISOR
Assoc. Prof. Dr. TRẦN VĂN LĂNG

Đồng Nai, 2022

DECLARATION

My name is: Hoàng Ngọc Thanh
Date of birth: 13/11/1969
Place of birth: Bình Định

I am a doctoral candidate in Computer Science, class of 2015, at Lac Hong University. I declare that the doctoral thesis "Ensemble machine learning and data preprocessing techniques for improving the classification quality of network intrusion detection systems" is my own research work, carried out under the guidance of my scientific supervisor, Assoc. Prof. Dr. Trần Văn Lăng. The algorithms, data and results presented in the thesis were obtained entirely from experiments, are truthful, and are not copied.

Doctoral candidate
Hoàng Ngọc Thanh

ACKNOWLEDGEMENTS

First of all, with my deepest gratitude, I would like to thank Assoc. Prof. Dr. Trần Văn Lăng, my scientific supervisor, who passed on to me his knowledge and his dedication to scientific research, and who guided and helped me wholeheartedly and created the best conditions for me to complete this thesis.

I sincerely thank the lecturers of the Board of Rectors, the Faculty of Information Technology and the Faculty of Postgraduate Studies of Lac Hong University for their teaching and for the favorable conditions they provided throughout my doctoral studies.

I would like to thank the Board of Rectors, the Faculty of Engineering and Computer Science, and the Center for Foreign Languages and Information Technology of Saigon International University, where I work, for their support.

I also sincerely thank my colleagues and friends, who cared for and encouraged me throughout this time.

Finally, I reserve special affection for my family and relatives, who trusted and encouraged me and gave me the strength to overcome every difficulty.

Author
Hoàng Ngọc Thanh

SUMMARY

Stream-based anomaly detection is a growing problem in network security. Many earlier studies have applied machine learning to improve the anomaly detection capability of network intrusion detection systems (NIDS). Recent studies show that NIDS still face challenges in improving accuracy, reducing the false alarm rate, and detecting new attacks.

This thesis proposes several solutions that use ensemble machine learning techniques and improved data preprocessing techniques to raise the classification quality of NIDS. They are motivated by the following facts:
(1) Many of the training datasets used for NIDS are class-imbalanced.
(2) Machine learning algorithms may use all features, including ones that are actually irrelevant to the classification target, which reduces classification quality and increases computation time.
(3) Ensemble classifiers outperform single classifiers in classification accuracy, and this advantage is especially clear in intrusion detection.

To address these problems, the thesis proposes improvements to two solutions in the data preprocessing stage:
(1) Feature selection algorithms built by improving the known FFC and BFE feature selection algorithms.
(2) Improved oversampling and undersampling techniques for the training dataset.

The preprocessed data is then used to train ensemble classifiers with both homogeneous (Bagging, Boosting, Stacking and Decorate) and heterogeneous (Voting, Stacking and RF) ensemble machine learning algorithms. Experimental results on the full training and testing sets of the UNSW-NB15 dataset show that the proposed solutions improve the classification quality of NIDS.

Alongside the results achieved, the research leaves open limitations and directions for future development:
(1) The training time of the proposed classification models is still long. Combining the right algorithms to build a hybrid, multi-label classification model that responds in real time needs further research.
(2) Processing capacity plays an important role in exploiting machine learning algorithms. Improving processing efficiency through parallel processing, as well as optimizing the parameters of machine learning techniques, remains an open problem.
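
As a rough illustration of the pipeline summarized above (rebalance the training data, then combine several classifiers), the following minimal Python sketch may help. It is not the thesis's implementation: plain random oversampling stands in for the improved oversampling technique, and simple majority voting stands in for the Voting ensemble. The function names `random_oversample` and `majority_vote`, the toy records, and the three threshold rules are all hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many rows as the largest class (assumption: plain random
    oversampling, not the thesis's improved technique)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def majority_vote(classifiers, row):
    """Return the label predicted by most classifiers for one row.
    On a tie, the label that was voted first wins."""
    votes = Counter(classify(row) for classify in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical toy records: (duration, bytes) with an imbalanced label set.
rows = [(1, 10), (2, 20), (3, 30), (9, 900)]
labels = ["normal", "normal", "normal", "attack"]
balanced_rows, balanced_labels = random_oversample(rows, labels)

# Three hypothetical base classifiers, written as simple threshold rules.
rules = [
    lambda row: "attack" if row[1] > 500 else "normal",
    lambda row: "attack" if row[0] > 5 else "normal",
    lambda row: "normal",
]
prediction = majority_vote(rules, (9, 900))
```

In this toy run the single "attack" record is duplicated until both classes have three rows, and two of the three rules vote "attack" for the record `(9, 900)`. A real system would train the base classifiers on the rebalanced data instead of using fixed rules.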
ABSTRACT

Stream-based intrusion detection is a growing problem in computer network security. Many previous studies have applied machine learning to detect attacks in Network Intrusion Detection Systems (NIDS). However, these methods still suffer from low accuracy, a high false alarm rate, and difficulty detecting new attacks.

This thesis proposes several solutions that use ensemble machine learning techniques and improved data preprocessing techniques to raise the classification quality of NIDS. They are based on the following facts:
(1) Many of the training datasets used for NIDS are class-imbalanced.
(2) Machine learning algorithms may use features that are irrelevant to the classification goal, which reduces classification quality and increases computation time.
(3) Ensemble classifiers outperform single classifiers in classification accuracy, and this advantage is particularly evident in network intrusion detection.

To address these problems, the thesis proposes improvements to two solutions in the data preprocessing stage:
(1) Feature selection algorithms built by improving the known FFC and BFE feature selection algorithms.
(2) Improved techniques for oversampling and undersampling the training dataset.

The preprocessed data is used to train ensemble classifiers with both homogeneous (Bagging, Boosting, Stacking and Decorate) and heterogeneous (Voting, Stacking and RF) ensemble machine learning algorithms. Experimental results on the full training and testing sets of the UNSW-NB15 dataset show that the proposed solutions improve the classification quality of NIDS.

Alongside the results achieved, the research leaves shortcomings and directions for future development:
(1) The training time of the proposed classification models is still long. Coordinating the right algorithms to build a hybrid, multi-label classification model with real-time response needs further research.
(2) Processing capacity plays an important role in exploiting machine learning algorithms. Improving processing efficiency through parallel processing, as well as optimizing the parameters of machine learning techniques, is still an open issue.

TABLE OF CONTENTS

CHAPTER 1 INTRODUCTION
1.1 Intrusion detection systems
1.1.1 Introduction to IDS
1.1.2 Classification of IDS
1.1.3 IDS using machine learning techniques
1.2 Urgency of the thesis topic
1.3 Research objectives
1.4 Research subjects and scope
1.4.1 Research subjects
1.4.2 Research scope
1.5 Research methodology
1.6 Scientific and practical significance
1.6.1 Scientific significance
1.6.2 Practical significance
1.7 New contributions
1.8 Structure of the thesis
CHAPTER 2 RELATED WORK
2.1 Theoretical background
2.1.1 Feature selection
2.1.2 Dataset resampling
2.1.3 Machine learning techniques
2.1.4 Datasets used for IDS
2.1.5 Metrics for evaluating IDS performance
2.2 Related work on machine learning for IDS
2.2.1 Feature selection
2.2.2 Dataset resampling
2.2.3 Machine learning models for IDS
2.2.4 Remarks
CHAPTER 3 FEATURE SELECTION SOLUTION
3.1 Proposed feature selection solution
3.1.1 Information measures
3.1.2 Backward feature elimination (BFE) algorithm
3.1.3 Forward feature construction (FFC) algorithm
3.1.4 Proposed feature selection algorithm
3.2 Implementation results
3.2.1 Feature selection for the Worms attack type
3.2.2 Feature selection for the Shellcode attack type
3.2.3 Feature selection for the Backdoor attack type
3.2.4 Feature selection for the Analysis attack type
3.2.5 Feature selection for the Recce attack type
3.2.6 Feature selection for the DoS attack type
3.2.7 Feature selection for the Fuzzers attack type
3.2.8 Feature selection for the Exploits attack type
3.2.9 Feature selection for the Generic attack type
3.3 Comparison, remarks and evaluation of the proposed feature selection solution
CHAPTER 4 DATASET RESAMPLING SOLUTION
4.1 Proposed dataset resampling solution
4.1.1 Oversampling solution
4.1.2 Undersampling solution
4.2 Implementation results
4.2.1 Oversampling the dataset
4.2.2 Undersampling the dataset
4.3 Summary of results and remarks on the dataset resampling solution
CHAPTER 5 ENSEMBLE TECHNIQUES FOR THE IDS MODEL
5.1 Proposed ensemble technique
5.2 Implementation results
5.2.1 Using ensemble techniques for the Worms attack type
5.2.2 Using ensemble techniques for the Shellcode attack type
5.2.3 Using ensemble techniques for the Backdoor attack type
5.2.4 Using ensemble techniques for the Analysis attack type
5.2.5 Using ensemble techniques for the Recce attack type
5.2.6 Using ensemble techniques for the DoS attack type
5.2.7 Using ensemble techniques for the Fuzzers attack type
5.2.8 Using ensemble techniques for the Exploits attack type
5.2.9 Using ensemble techniques for the Generic attack type
5.3 Summary of results and remarks on the ensemble techniques
5.4 Proposed hybrid classification model
CHAPTER 6 CONCLUSION AND FUTURE WORK
6.1 Assessment of the results achieved, limitations and directions for development
6.2 Assessment of the academic and practical significance of the thesis

LIST OF SYMBOLS AND ABBREVIATIONS

Abbreviation  Full form
ABC           Artificial Bee Colony
ADASYN        Adaptive Synthetic Sampling
ANN           Artificial Neural Network
AUC           Area Under the Curve
Bagging       Bootstrap Aggregation
BFE           Backward Feature Elimination
BFS           Best First Search
BN            Bayesian Network
CA            Correlation Attribute
CART          Classification and Regression Trees
CFS           Correlation-based Feature Selection
CNN           Convolutional Neural Network
CSE           Consistency Subset Evaluator
CV            Cross Validation
DoS           Denial of Service
DT            Decision Tree
FFC           Forward Feature Construction
ELM           Extreme Learning Machines
ENN           Edited Nearest Neighbors
FPR           False Positive Rate
GA            Genetic Algorithm
GAR           GRASP with Annealed Randomness
GC            Global Competence
GP            Genetic Programming
GR            Gain Ratio
ICA           Independent Component Analysis
IDS           Intrusion Detection System
IG            Information Gain
KNN           K Nearest Neighbours
KNNCF         K Nearest Neighbor Collaborative Filtering
LC            Local Competence
LDA           Linear Discriminant Analysis
LOO           Leave One Out
LR            Logistic Regression
LSTM          Long Short-Term Memory
MARS          Multivariate Adaptive Regression Splines
ML            Machine Learning
MLP           Multi Layer Perceptron
MV            Majority Voting
NB            Naïve Bayes
NCR           Neighborhood Cleaning Rule
NSGA          Non-dominated Sorting Genetic Algorithm
OAR           One Against Rest
OSELM         Online Sequential Extreme Learning Machine
PART          Partial Decision Tree
PCA           Principal Component Analysis
PSO           Particle Swarm Optimization
R2L           Remote to Local
RBF           Radial Basis Function
RF            Random Forest
RMV           Rigged Majority Voting
RNN           Recurrent Neural Network
ROC           Receiver Operating Characteristics
RT            Random Tree
SMOTE         Synthetic Minority Over-Sampling Technique
SSV           Separability Split Value
SU            Symmetrical Uncertainty
SVM           Support Vector Machine
TPR           True Positive Rate
U2R           User to Root
WLC           Weighted Local Competence
WMV           Weighted Majority Voting
WRMV          Weighted Rigged Majority Voting
WTA           Winner Takes All

LIST OF PUBLISHED WORKS RELATED TO THE THESIS

[CT1] Hoàng Ngọc Thanh, Trần Văn Lăng, Hoàng Tùng, "A machine learning approach to classifying attack types in network intrusion detection systems" (in Vietnamese), Proceedings of the 9th National Conference on Fundamental and Applied Information Technology Research (FAIR'9), 502-508, DOI: 10.15625/vap.2016.00061, 2016.
[CT2] Hoàng Ngọc Thanh, Trần Văn Lăng, "Feature reduction using information gain to enhance the performance of network intrusion detection systems" (in Vietnamese), Proceedings of the 10th National Conference on Fundamental and Applied Information Technology Research (FAIR'10), 823-831, DOI: 10.15625/vap.2017.00097, 2017.
[CT3] Hoàng Ngọc Thanh, Trần Văn Lăng, "An approach to reducing data dimensionality in building effective network intrusion detection systems" (in Vietnamese), 2nd Workshop: Selected Issues in Information Safety and Security, Ho Chi Minh City, 12/2017.
[CT4] Hoàng Ngọc Thanh, Trần Văn Lăng, "Generating firewall rules using ensemble techniques based on decision trees" (in Vietnamese), Proceedings of the 11th National Conference on Fundamental and Applied Information Technology Research (FAIR'11), 489-496, DOI: 10.15625/vap.2018.00064, 2018.
[CT5] Hoang Ngoc Thanh, Tran Van Lang, "An approach to reduce data dimension in building effective network intrusion detection systems," EAI Endorsed Transactions on Context-aware Systems and Applications, 6(18), 2019.
[CT6] Hoang Ngoc Thanh, Tran Van Lang, "Use the ensemble methods when detecting DoS attacks in network intrusion detection systems," EAI Endorsed Transactions on Context-aware Systems and Applications, 6(19), 11/2019.
[CT7] Hoang Ngoc Thanh, Tran Van Lang, "Evaluating effectiveness of ensemble classifiers when detecting Fuzzers attacks on the UNSW-NB15 dataset," Journal of Computer Science and Cybernetics, 36(2), 173-185, DOI: 10.15625/1813-9663/36/2/14786, 2020.
Enache and V V Patriciu, "Intrusions detection based on support vector machine optimized with swarm intelligence," in Applied Computational Intelligence and Informatics (SACI), 2014 IEEE 9th International Symposium on, 2014 N Jankowski and K Gra˛bczewski, "Heterogenous committees with competence analysis," in Hybrid Intelligent Systems HIS’05 Fifth International Conference on, IEEE, 2005 Y Chen and Y Zhao, "A novel ensemble of classifiers for microarray data classification," Applied soft computing, vol 8, pp 1664-1669, 2008 A Eleyan, H Özkaramanli and H Demirel, "Weighted majority voting for face recognition from low resolution video sequences," in Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control, 2009 ICSCCW 2009 Fifth International Conference on, IEEE, 2009 J Richiardi and A Drygajlo, "Reliability-based voting schemes using modalityindependent features in multi-classifier biometric authentication," Multiple Classifier Systems, Springer, pp 377-386, 2007 A Kausar, M Ishtiaq, M A Jaffar and A M Mirza, "Optimization of ensemble based decision using PSO," in Proceedings of the World Congress on Engineering, WCE, 2010 [115 ] [116 ] [117 ] [118 ] [119 ] [120 ] [121 ] [122 ] [123 ] [124 ] [125 ] [126 ] [127 ] [128 ] [129 ] [130 ] M A Tahir, J Kittler and A Bouridane, "Multilabel classification using heterogeneous ensemble of multi-label classifiers," Pattern Recognition Letters, vol 33, pp 513-523, 2012 X Zhang, P Wang, L Du and H Liu, "New method for radar HRRP recognition and rejection based on weighted majority voting combination of multiple classifiers," in Signal Processing, Communications and Computing (ICSPCC), 2011 IEEE International Conference on, IEEE, 2011 S Gu and Y Jin, "Heterogeneous classifier ensembles for EEG-based motor imaginary detection," in 2012 12th UK Workshop on Computational Intelligence (UKCI), IEEE, 2012 L I Kuncheva and J J Rodríguez, "A weighted voting framework for classifiers ensembles," 
Knowledge and Information Systems, vol 38, pp 259-275, 2014 H Toman, L Kovacs, A Jonas, L Hajdu and A Hajdu, "Generalized weighted majority voting with an application to algorithms having spatial output," in International Conference on Hybrid Artificial Intelligence Systems, Springer, 2012 F Ye, Z Zhang, K Chakrabarty and X Gu, "Board-level functional fault diagnosis using artificial neural networks, support-vector machines, and weighted-majority voting," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol 32, pp 723-736, 2013 J Cheng and L Chen, "A weighted regional voting based ensemble of multiple classifiers for face recognition," in International Symposium on Visual Computing, Springer, 2014 K Remya and J Ramya, "Using weighted majority voting classifier combination for relation classification in biomedical texts," in Control, Instrumentation, Communication and Computational Technologies (ICCICCT), 2014 International Conference on, IEEE, 2014 H F Eid, A Darwish, A E Hassanien and T.-H Kim, "Intelligent hybrid anomaly network intrusion detection system," Communication and Networking, Springer, pp 209- 218, 2011 E D Hoz, E Hoz, A Ortiz, J Ortega and A Martínez-Álvarez, "Feature selection by multi-objective optimisation: Application to network anomaly detection by hierarchical self-organising maps," Knowledge-Based Systems, vol 71, pp 322-338, 2014 S Rastegari, P Hingston and C.-P Lam, "Evolving statistical rulesets for network intrusion detection," Applied Soft Computing, vol 33, pp 348-359, 2015 R Singh, H Kumar and R Singla, "An intrusion detection system using network traffic profiling and online sequential extreme learning machine," Expert Systems with Applications, vol 42, pp 8609-8624, 2015 N K Kanakarajan and K Muniasamy, "Improving the accuracy of intrusion detection using gar-forest with feature selection," in Proceedings of the 4th International Conference on Frontiers in Intelligent Computing: Theory and Applications 
(FICTA) 2015, Springer, 2016 A E Hassanien, T.-H Kim, J Kacprzyk and A I Awad, "Bio-inspiring Cyber Security and Cloud Services: Trends and Innovations," Springer, vol 70, 2014 H H Pajouh, G Dastghaibyfard and S Hashemi, "Two-tier network anomaly detection model: a machine learning approach," Journal of Intelligent Information Systems, pp 1- 14, 2015 J Sharma, C Giri, O.-C Granmo and M Goodwin, "Multi-layer intrusion detection system with ExtraTrees feature selection, extreme learning machine ensemble, and [131 ] [132 ] [133 ] [134 ] [135 ] [136 ] [137 ] [138 ] [139 ] [140 ] [141 ] softmax aggregation," EURASIP Journal on Information Security, vol 2019, pp 1-15, 2019 S Moualla, K Khorzom and A Jafar, "Improving the Performance of Machine Learning- Based Network Intrusion Detection Systems on the UNSW-NB15 Dataset," Computational Intelligence and Neuroscience, vol 2021, pp 1-13, 2021 V V Cảnh, "Phát xâm nhập mạng sử dụng kỹ thuật học máy," Tạp chí nghiên cứu khoa học và công nghệ quân sự, pp 105-120, 05-2017 P V Huong, L D Thuan, L T H Van and D V Hung, "Intrusion Detection in IoT Systems Based on Deep Learning Using Convolutional Neural Network," in 6th NAFOSTED Conference on Information and Computer Science (NICS), Ha Noi - Viet Nam, 2019 L H Hiep, L X Hieu, H T Tuyen and D T Quy, "Studying a solution for early detection of DDoS attacks based on machine learning algorithms," vol 227, no 11, p 137 – 144, 2022 T M Tuấn, P H Hảo and T T Nam, "Hệ thống phát xâm nhập hai tầng cho mạng IoT sử dụng máy học," Tạp chí Khoa học Trường Đại học Cần Thơ, vol 58, no 2, pp 43-50, 2022 C Sergio, D S Javier, L Ibai, O Ignacio, S Javier, J S Javier and I T Ana, "Chapter - Big Data in Road Transport and Mobility Research," in Intelligent Vehicles, Butterworth- Heinemann, 2018, pp 175-205 Kotthoff, Lars, C Thornton, H H Hoos, F Hutter and K Leyton-Brown, "AutoWEKA: Automatic model selection and hyperparameter optimization in WEKA," Automated Machine Learning - Springer, 
pp 81-95, 2019 M Artur, "Review the performance of the Bernoulli Naïve Bayes Classifier in Intrusion Detection Systems using Recursive Feature Elimination with Cross-validated selection of the best number of features," in The 2020 Annual International Conference on Brain- Inspired Cognitive Architectures for Artificial Intelligence: Eleventh Annual Meeting of the BICA Society, 2021 W Lian, G Nie, B Jia, D Shi, Q Fan and Y Liang, "An Intrusion Detection Method Based on Decision Tree-Recursive Feature Elimination in Ensemble Learning," Mathematical Problems in Engineering, vol 2020, pp 1-15, 2020 B Neal, S Mittal, A Baratin, V Tantia, M Scicluna, S Lacoste-Julien and I Mitliagkas, "A modern take on the bias-variance tradeoﬀ in neural networks," ArXiv, vol abs/1810.08591, 2018 V Kumar, D Sinha, A K Das, S C Pandey and R T Goswami, "An integrated rule based intrusion detection system: analysis on UNSW-NB15 data set and the real time online dataset," Cluster Computing, vol 23, p 1397–1418, 2019 Nghiên cứu sinh Tác giả DANH MỤC CÁC KÝ HIỆU, CHỮ VIẾT TẮT DANH MỤC CÁC HÌNH VẼ, ĐỒ THỊ DANH MỤC CÁC THUẬT TOÁN CHƯƠNG GIỚI THIỆU 1.1 Hệ thống phát xâm nhập 1.1.1 Giới thiệu IDS 1.1.1.1 Kiến trúc IDS 1.1.1.2 Các chức IDS 1.1.1.3 Quy trình hoạt động IDS 1.1.2 Phân loại IDS 1.1.2.1 IDS dựa mạng 1.1.2.2 IDS dựa máy chủ 1.1.3 IDS sử dụng kỹ thuật học máy 1.2 Tính cấp thiết đề tài luận án 1.3 Mục tiêu nghiên cứu 1.4 Đối tượng phạm vi nghiên cứu 1.4.1 Đối tượng nghiên cứu 1.4.2 Phạm vi nghiên cứu 1.5 Phương pháp nghiên cứu 1.6 Ý nghĩa khoa học thực tiễn 1.6.1 Ý nghĩa khoa học 1.6.2 Ý nghĩa thực tiễn 1.7 Những điểm đóng góp mới 1.8 Kết cấu luận án - Danh mục công trình cơng bố luận án CHƯƠNG CÁC NGHIÊN CỨU LIÊN QUAN 2.1 Cơ sở lý thuyết 2.1.1.1 Phương pháp lựa chọn thuộc tính a) Chiến lược tìm kiếm b) Tiêu chuẩn lựa chọn c) Mơ hình lựa chọn 2.1.1.2 Một số thuật tốn lựa chọn thuộc tính a) Tìm kiếm tồn Input a.2 Thuật tốn AAB Input End b) Tìm kiếm theo kinh nghiệm Input End Input 2.1.2 
Lấy mẫu lại tập liệu 2.1.2.1 Các kỹ thuật tăng mẫu Input End End Populate c) ADASYN Input d) Borderline-SMOTE Input 2.1.2.2 Kỹ thuật giảm mẫu Input End c) Láng giềng gần nhất chỉnh sửa 2.1.3 Kỹ thuật học máy a) Bootstrap b) Bagging c) Boosting d) Xếp chồng e) Rừng ngẫu nhiên f) Decorate Input 2.1.3.2 Kỹ thuật học sâu 2.1.4 Tập liệu sử dụng cho IDS 2.1.5 Chỉ số đánh giá hiệu IDS 2.2 Các nghiên cứu liên quan học máy cho IDS 2.2.1.1 Các phương pháp lọc 2.2.1.2 Các phương pháp gói 2.2.1.3 Các phương pháp dựa tối ưu hóa 2.2.2 Lấy mẫu lại tập liệu 2.2.2.1 Các phương pháp mức thuật toán 2.2.2.2 Các phương pháp mức liệu 2.2.3 Các mơ hình học máy cho IDS 2.2.3.1 Phối hợp đồng nhất cho IDS 2.2.3.2 Phối hợp không đồng nhất cho IDS 2.2.3.3 Kỹ thuật học sâu cho IDS 2.2.3.4 Một số kỹ thuật khác để xây dựng IDS 2.2.3.5 Một số cơng trình nghiên cứu nước 2.2.3.6 So sánh hiệu phương pháp 2.2.4 Nhận xét CHƯƠNG GIẢI PHÁP LỰA CHỌN THUỘC TÍNH 3.1 Giải pháp lựa chọn thuộc tính đề x́t 3.1.2 Thuật tốn loại bỏ thuộc tính ngược BFE Input End End FindNoise 3.1.3 Thuật toán chọn thuộc tính thuận FFC Input Begin End End FindBest 3.1.4 Thuật tốn lựa chọn thuộc tính đề x́t Input End Input End 3.2.1 Lựa chọn thuộc tính với kiểu tấn cơng Worms 3.2.2 Lựa chọn thuộc tính với kiểu tấn cơng Shellcode 3.2.3 Lựa chọn thuộc tính với kiểu tấn cơng Backdoor 3.2.4 Lựa chọn thuộc tính với kiểu tấn cơng Analysis 3.2.5 Lựa chọn thuộc tính với kiểu tấn cơng Recce 3.2.6 Lựa chọn thuộc tính với kiểu tấn cơng DoS 3.2.7 Lựa chọn thuộc tính với kiểu tấn cơng Fuzzers 3.2.8 Lựa chọn thuộc tính với kiểu tấn cơng Exploits 3.2.9 Lựa chọn thuộc tính với kiểu tấn cơng Generic 3.3 So sánh, nhận xét đánh giá giải pháp lựa chọn thuộc tính đề xuất CHƯƠNG GIẢI PHÁP LAY MẪU LẠI TẬP DỮ LIỆU 4.1 Giải pháp lấy mẫu lại tập liệu đề xuất 4.1.1 Giải pháp tăng mẫu Input End Input End 4.1.2 Giải pháp giảm mẫu Input End Input End 4.2.1 Tăng mẫu tập liệu 4.2.1.1 Kỹ thuật tăng mẫu nguyên b) Tăng mẫu với kiểu tấn 
công Shellcode c) Tăng mẫu với kiểu tấn công Backdoor d) Tăng mẫu với kiểu tấn công Analysis e) Tăng mẫu với kiểu tấn công Recce f) Tăng mẫu với kiểu tấn công DoS g) Tăng mẫu với kiểu tấn công Fuzzers h) Tăng mẫu với kiểu tấn công Exploits i) Tăng mẫu với kiểu tấn công Generic 4.2.1.2 Kỹ thuật tăng mẫu đề xuất 4.2.1.3 So sánh, nhận xét đánh giá kỹ thuật tăng mẫu đề xuất 4.2.2 Giảm mẫu tập liệu a) Giảm mẫu với kiểu tấn công Worms b) Giảm mẫu với kiểu tấn công Shellcode c) Giảm mẫu với kiểu tấn công Backdoor d) Giảm mẫu với kiểu tấn công Analysis e) Giảm mẫu với kiểu tấn công Recce f) Giảm mẫu với kiểu tấn công DoS g) Giảm mẫu với kiểu tấn công Fuzzers h) Giảm mẫu với kiểu tấn công Exploits i) Giảm mẫu với kiểu tấn công Generic 4.2.2.2 Kỹ thuật giảm mẫu đề xuất 4.2.2.3 So sánh, nhận xét đánh giá kỹ thuật giảm mẫu đề xuất 4.3 Tổng hợp kết nhận xét giải pháp lấy mẫu lại tập liệu CHƯƠNG KỸ THUẬT PHỐI HỢP CHO MƠ HÌNH IDS 5.1 Kỹ thuật phối hợp đề x́t Input Output Begin End Input Output Begin End 5.2 Kết thực ● Với kỹ thuật phối hợp đồng nhất: ● Với kỹ thuật phối hợp không đồng nhất: 5.2.1 Sử dụng kỹ thuật phối hợp với kiểu tấn công Worms 5.2.2 Sử dụng kỹ thuật phối hợp với kiểu tấn công Shellcode 5.2.3 Sử dụng kỹ thuật phối hợp với kiểu tấn công Backdoor 5.2.4 Sử dụng kỹ thuật phối hợp với kiểu tấn công Analysis 5.2.5 Sử dụng kỹ thuật phối hợp với kiểu tấn công Recce 5.2.6 Sử dụng kỹ thuật phối hợp với kiểu tấn công DoS 5.2.7 Sử dụng kỹ thuật phối hợp với kiểu tấn công Fuzzers 5.2.8 Sử dụng kỹ thuật phối hợp với kiểu tấn công Exploits 5.2.9 Sử dụng kỹ thuật phối hợp với kiểu tấn công Generic 5.3 Tổng hợp kết nhận xét kỹ thuật phối hợp 5.4 Mơ hình phân lớp lai đề x́t CHƯƠNG KẾT LUẬN VÀ HƯỚNG PHÁT TRIỂN 6.1 Đánh giá kết đạt được, hạn chế hướng phát triển 6.2 Đánh giá ý nghĩa học thuật thực tiễn luận án DANH MỤC CÁC CƠNG TRÌNH ĐÃ CÔNG BỐ CỦA LUẬN ÁN TÀI LIỆU THAM KHẢO ... 
THANH ENSEMBLE MACHINE LEARNING TECHNIQUES AND DATA PREPROCESSING FOR IMPROVING THE CLASSIFICATION QUALITY OF NETWORK INTRUSION DETECTION SYSTEMS DOCTORAL THESIS IN COMPUTER SCIENCE Specialization: Computer... class of 2015, Lạc Hồng University. I hereby declare that the doctoral thesis "Ensemble machine learning techniques and data preprocessing for improving the classification quality of network intrusion detection systems" is the research work... in improving accuracy, reducing the false alarm rate, and detecting new attacks. The thesis proposes several solutions that use ensemble machine learning techniques and improved data preprocessing techniques in enhancing

Title	Ensemble Machine Learning Techniques and Data Preprocessing for Improving the Classification Quality of Network Intrusion Detection Systems
Author	Hoàng Ngọc Thanh
Supervisor	Assoc. Prof. Dr. Trần Văn Lăng
University	Lạc Hồng University
Specialization	Computer Science
Type	Doctoral thesis
Publication year	2022
City	Đồng Nai

Format
Pages	255
File size	1.63 MB