(Essay) Using Deep Learning and PCA Dimensionality Reduction for Anomaly Detection in IoT Systems

POSTS AND TELECOMMUNICATIONS INSTITUTE OF TECHNOLOGY
FACULTY OF TELECOMMUNICATIONS 1

UNDERGRADUATE GRADUATION THESIS

Topic: USING DEEP LEARNING AND PCA DIMENSIONALITY REDUCTION FOR ANOMALY DETECTION IN IOT SYSTEMS

Supervisor: Dr. HOÀNG TRỌNG MINH
Student: ĐINH QUANG HUY
Student ID: B17DCVT167
Class: D17CQVT07-B
Cohort: 2017-2022
Program: Full-time undergraduate

Hanoi, December 2021

Score: … (in words: …)
Date: … 2021
Supervisor: Dr. Hoàng Trọng Minh

Score: … (in words: …)
Date: … 2021

ACKNOWLEDGMENTS

This graduation thesis is the result of the training and knowledge I accumulated over my years of study at the Posts and Telecommunications Institute of Technology. For that valuable knowledge, I sincerely and deeply thank all the teachers and lecturers of the Institute, especially those of the Faculty of Telecommunications 1, who guided and taught me over the years so that I could present this thesis today.

I sincerely and deeply thank Dr. Hoàng Trọng Minh, the lecturer who directly supervised this thesis, for his enthusiastic, meticulous, and careful guidance.

I also thank my family and friends for their constant care and help, and for sharing their experience and time, throughout my studies, my research, and the work on this thesis.

Hanoi, December 20, 2021
Student
Đinh Quang Huy

TABLE OF CONTENTS

ACKNOWLEDGMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES AND FIGURES
ABBREVIATIONS
TABLE OF SYMBOLS
PREFACE
CHAPTER 1: OVERVIEW OF AI-BASED NETWORK ANOMALY DETECTION
1.1 General introduction
1.2 Intrusion detection systems (IDS)
1.2.1 Concept
1.2.2 Classification of intrusion detection systems [1]
1.2.3 Signature-based intrusion detection systems (SIDS) [4]
1.2.4 Anomaly-based intrusion detection systems (AIDS) [4]
1.2.5 Data sources used for intrusion detection systems [4]
1.2.6 Advantages and disadvantages of IDS
1.2.7 Problems of intrusion detection systems [8]
1.3 AI in intrusion detection systems [8]
1.3.1 Supervised Learning [9]
1.3.2 Unsupervised Learning [9]
1.3.3 Neural networks [17]
1.4 Intrusion detection datasets
CHAPTER 2: BUILDING AND ANALYZING A STRATEGY BASED ON DNN AND PCA
2.1 Applied strategy
2.1.1 Data preprocessing
2.1.2 Principal component analysis (PCA) [27]
2.1.3 Building the model and algorithm
CHAPTER 3: DEPLOYING THE EXPERIMENTAL MODEL AND EVALUATING THE RESULTS
3.1 Overview of model evaluation
3.1.1 Splitting the dataset
3.1.2 Confusion Matrix
3.2 Evaluation study setup
3.3 General conclusions and analysis of the results
CONCLUSION
REFERENCES
APPENDIX
Data preprocessing
1.1 Reading and writing data from conn.log.labeled to a .csv file
1.2 Labeling and merging the data
1.3 Processing the merged data
Experiments
LIST OF FIGURES

Figure 1: IDS intrusion detection system
Figure 2: Network intrusion detection system (NIDS)
Figure 3: Network node intrusion detection system (NNIDS)
Figure 4: Host intrusion detection system
Figure 5: Methodology of a signature-based IDS
Figure 6: Working model of the SIDS method
Figure 7: Three types of anomalies
Figure 8: Spam message classification
Figure 9: Graph of the relationship between area and house price
Figure 10: General classification structure
Figure 11: Example of the decision tree algorithm
Figure 12: Block diagram of the Random Forest algorithm [12]
Figure 13: Illustration of a simple linear SVM
Figure 14: Illustration of clustering
Figure 15: Using clustering to detect intrusions
Figure 16: The K-Means clustering process
Figure 17: Visualization of a neural network
Figure 1: Strategy for using PCA in network anomaly detection
Figure 2: Code from step … to step …
Figure 3: Splitting the data
Figure 4: Steps for performing PCA [29]
Figure 5: DNN structure and parameters
Figure 1: Splitting the data for the AI model
Figure 2: Training time before and after dimensionality reduction
Figure 3: Testing time before and after dimensionality reduction

LIST OF TABLES AND FIGURES

Table 1: Comparison of the SIDS and AIDS methods
Table 2: Comparison of IDS technology types based on their position in the computer system
Table 3: IoT-23 datasets (capture names, Pcap sizes, and malware types)
Table 4: Features and feature descriptions of the IoT-23 dataset
Table 1: Confusion matrix for the IDS
Table 2: Study results before and after dimensionality reduction
ABBREVIATIONS

Abbreviation    Full form
AI              Artificial Intelligence
AIDS            Anomaly-based Intrusion Detection System
ANN             Artificial Neural Network
CPU             Central Processing Unit
DDoS            Distributed Denial-of-Service
DL              Deep Learning
DNN             Deep Neural Network
DoS             Denial-of-Service
FE              Feature Extraction
FS              Feature Selection
FTP             File Transfer Protocol
GPU             Graphics Processing Unit
HIDS            Host Intrusion Detection System
HTTP            Hypertext Transfer Protocol
ICMP            Internet Control Message Protocol
IDS             Intrusion Detection System
IIDS            Intelligence for Intrusion Detection System
IMAP            Internet Message Access Protocol
IoT             Internet of Things
MAR             Missing at Random

APPENDIX

Program code for "Using Deep Learning and PCA Dimensionality Reduction for Anomaly Detection in IoT Systems". As stated and explained in the thesis, the detailed code is given here.

Data preprocessing

1.1 Reading and writing data from conn.log.labeled to a .csv file

    # Connect Google Colab to Google Drive
    from google.colab import drive
    drive.mount('/content/drive')

    # Import libraries
    import pandas as pd
    import tensorflow as tf

    # Check that a GPU is available
    device_name = tf.test.gpu_device_name()
    if device_name != '/device:GPU:0':
        raise SystemError('GPU device not found')
    print('Found GPU at: {}'.format(device_name))

    # Write a conn.log.labeled file out as a .csv file
    # (all_files is the list of input file paths, built as in section 1.2)
    import csv
    with open(all_files[6], 'r') as csv_file:
        csv_reader = csv.reader(csv_file, delimiter='\t')
        with open('/content/drive/MyDrive/Colab Notebooks/IoT23/data_input/h5.csv', 'w', newline='') as new_file:
            csv_writer = csv.writer(new_file)
            # Skip the first six header lines of the log
            for _ in range(6):
                next(csv_reader)
            for line in csv_reader:
                csv_writer.writerow(line)

1.2 Labeling and merging the data

    # Connect Google Colab to Google Drive
    from google.colab import drive
    drive.mount('/content/drive')

    # Import libraries (numpy is needed for the label arrays below)
    import os
    import numpy as np
    import pandas as pd
    import tensorflow as tf

    # Check that a GPU is available
    device_name = tf.test.gpu_device_name()
    if device_name != '/device:GPU:0':
        raise SystemError('GPU device not found')
    print('Found GPU at: {}'.format(device_name))

    # Collect the CSV files
    all_files = []
    for root, dirs, files in os.walk("/content/drive/MyDrive/Colab Notebooks/IoT23/data_input"):
        for file in files:
            if file.endswith(".csv"):
                all_files.append(os.path.join(root, file))
    all_files

    # Load the files, skipping malformed lines
    # (on_bad_lines requires pandas >= 1.3; older versions use error_bad_lines=False)
    df0 = pd.read_csv(all_files[0], on_bad_lines='skip', low_memory=False)
    df1 = pd.read_csv(all_files[1], on_bad_lines='skip', low_memory=False)
    df2 = pd.read_csv(all_files[2], on_bad_lines='skip', low_memory=False)
    df3 = pd.read_csv(all_files[3], on_bad_lines='skip', low_memory=False)
    df4 = pd.read_csv(all_files[4], on_bad_lines='skip', low_memory=False)
    df5 = pd.read_csv(all_files[5], on_bad_lines='skip', low_memory=False)
    df6 = pd.read_csv(all_files[6], on_bad_lines='skip', low_memory=False)

    # Label honeypot captures with 0 and malware captures with 1
    label_df0 = np.zeros(df0.shape[0], dtype=int)
    label_df1 = np.ones(df1.shape[0], dtype=int)
    label_df2 = np.zeros(df2.shape[0], dtype=int)
    label_df3 = np.ones(df3.shape[0], dtype=int)
    label_df4 = np.ones(df4.shape[0], dtype=int)
    label_df5 = np.ones(df5.shape[0], dtype=int)
    label_df6 = np.zeros(df6.shape[0], dtype=int)

    df0['label'] = label_df0
    df1['label'] = label_df1
    df2['label'] = label_df2
    df3['label'] = label_df3
    df4['label'] = label_df4
    df5['label'] = label_df5
    df6['label'] = label_df6

    # Merge the processed files into one large file
    df_final = pd.concat([df0, df1, df2, df3, df4, df5, df6], axis=0, ignore_index=True)
    df_final.to_csv(r'data_input/export_dataframe.csv', index=False, header=True)

1.3 Processing the merged data

    # Load the merged data
    df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/IoT23/data_input/export_dataframe.csv', on_bad_lines='skip', low_memory=False)

    # Show all columns when displaying the data
    pd.set_option('display.max_columns', None)

    # Show the data
    df

    # Find and remove meaningless rows
    array = ['#types', '#close']
    noise = df.loc[df['#fields'].isin(array)]
    noise.index
    df = df.drop(noise.index)

    # Convert the '-' character to NaN, since it carries no information
    DF = df.replace({'-': np.nan})
    DF = DF.fillna(DF.mode().iloc[0])
    DF['conn_state'] = DF['conn_state'].replace(np.nan, 0)
    DF['local_orig'] = DF['local_orig'].replace(np.nan, 0)
    DF['tunnel_parents label detailed-label'] = DF['tunnel_parents label detailed-label'].replace(np.nan, 0)

    # X holds every column except the label (copied so it can be modified safely)
    X = DF.iloc[:, :DF.shape[1]-1].copy()
    # Y holds the label
    Y = DF.iloc[:, -1]

    # Load the sklearn tools used to encode the data
    from sklearn.experimental import enable_iterative_imputer
    from sklearn.impute import IterativeImputer
    from sklearn.preprocessing import OrdinalEncoder, LabelEncoder
    from sklearn.ensemble import ExtraTreesRegressor

    encoder = OrdinalEncoder()
    imputer = IterativeImputer(ExtraTreesRegressor())
    le = LabelEncoder()

    # encode() ordinal-encodes the non-null values of a column and writes them back in place of the originals
    def encode(data):
        # Keep only the non-null values
        nonulls = np.array(data.dropna())
        # Reshape the data for encoding
        impute_reshape = nonulls.reshape(-1, 1)
        # Encode the values
        impute_ordinal = encoder.fit_transform(impute_reshape)
        # Assign the encoded values back to the non-null positions
        data.loc[data.notnull()] = np.squeeze(impute_ordinal)
        return data

    # Loop over the columns of the data and store each encoded column
    for columns in X.columns:
        print(columns)
        X[columns] = encode(X[columns])

    # Balance the data
    # Load the libraries used for balancing
    from collections import Counter
    from sklearn.datasets import make_classification
    from imblearn.over_sampling import ADASYN
    from matplotlib import pyplot
    from numpy import where

    counter = Counter(Y)
    print(counter)
    # Oversample the data
    oversample = ADASYN()
    x, y = oversample.fit_resample(X, Y)
    # Summarize the class distribution
    counter = Counter(y)
    print(counter)
    # output

Experiments

o EXPERIMENT 1: all 22 features

    # Load the sklearn tools for processing the input data, and the PCA library
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn import decomposition

    # Create the scaler object
    sc = StandardScaler()
    # Standardize the features (x is the resampled feature matrix, so it matches y)
    Xtrain_std = sc.fit_transform(x)

    # Split the data into training, validation, and test sets (see section 3.1.1)
    # No PCA is applied in this experiment, so the standardized data is split directly
    from sklearn.model_selection import train_test_split
    X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(Xtrain_std, y, test_size=0.2)
    X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.2)
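The two chained calls to `train_test_split` first hold out 20% of the samples and then take 20% of that holdout as the test set, which works out to roughly 80% training, 16% validation, and 4% test data. The arithmetic can be sketched as follows; `split_sizes` is a hypothetical helper that is not part of the thesis code, and it assumes the held-out count is rounded up, as scikit-learn does for a fractional `test_size`.

```python
import math

def split_sizes(n_samples, holdout=0.2, test_share=0.2):
    """Return (train, val, test) sizes for two chained fractional splits."""
    # First split: the held-out part is rounded up, the rest is training data
    n_holdout = math.ceil(holdout * n_samples)
    n_train = n_samples - n_holdout
    # Second split: part of the holdout becomes the test set, the rest validation
    n_test = math.ceil(test_share * n_holdout)
    n_val = n_holdout - n_test
    return n_train, n_val, n_test

print(split_sizes(1000))
```

With these fractions the test set is the smallest of the three parts, which is worth keeping in mind when reading the test-time measurements.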
    # Show the sizes of the splits
    print("X_train", len(X_train))
    print("Y_train", len(Y_train))
    print("X_test", len(X_test))
    print("Y_test", len(Y_test))
    print("X_val", len(X_val))
    print("Y_val", len(Y_val))
    # output

    # Import the Keras API from TensorFlow to build the DNN model
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Dropout
    from tensorflow.keras import regularizers
    from tensorflow.keras.optimizers import SGD

    # Build the multi-layer model
    model = Sequential([
        Dense(120, activation='relu', kernel_regularizer=regularizers.l2(0.01), input_shape=(22,)),
        Dropout(0.3),
        Dense(65, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
        Dropout(0.3),
        Dense(35, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
        Dropout(0.3),
        Dense(16, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
        Dropout(0.3),
        Dense(1, activation='sigmoid'),
    ])
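Because only the first Dense layer sees the input features, reducing the input from 22 features to 15 or 8 PCA components shrinks only that layer. The sketch below counts the trainable parameters of the layer stack used in the listing; `dense_param_count` is a hypothetical helper written for this illustration, and Dropout layers are left out because they add no parameters.

```python
def dense_param_count(n_inputs, layer_units):
    """Total weights and biases of a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in layer_units:
        total += fan_in * units + units  # weight matrix plus bias vector
        fan_in = units
    return total

LAYERS = [120, 65, 35, 16, 1]  # Dense layer widths from the listing

for n_features in (22, 15, 8):
    print(n_features, dense_param_count(n_features, LAYERS))
```

The counts differ only by the size of the first weight matrix, so any speed-up from PCA in this architecture comes from that layer and from the smaller input arrays.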
    # Set the training inputs for the model: number of epochs, learning rate, loss function (see section 2.1.3)
    epochs = ...  # number of epochs is not given here; Experiments 2 and 3 use epochs = 20
    learning_rate = 0.01
    momentum = 0.9
    sgd = SGD(learning_rate=learning_rate, momentum=momentum)
    model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])

    # Train the model and measure the training time
    import time
    from datetime import timedelta
    start_time = time.monotonic()
    hist = model.fit(X_train, Y_train, batch_size=512, epochs=epochs, validation_data=(X_val, Y_val))
    end_time = time.monotonic()
    print(timedelta(seconds=end_time - start_time))
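The optimizer above is plain SGD with a learning rate of 0.01 and a momentum of 0.9. As a rough illustration of what those two numbers do, the sketch below applies the standard momentum update (velocity = momentum * velocity - learning_rate * gradient, then weight += velocity) to a single scalar weight. The toy loss f(w) = w**2 and the helper name `sgd_momentum_step` are assumptions made for this example and do not come from the thesis.

```python
def sgd_momentum_step(w, velocity, grad, learning_rate=0.01, momentum=0.9):
    """One SGD-with-momentum update for a single scalar weight."""
    velocity = momentum * velocity - learning_rate * grad
    return w + velocity, velocity

# Toy problem: minimize f(w) = w**2, whose gradient is 2*w
w, v = 1.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, v, grad=2.0 * w)

print(abs(w) < 1e-3)
```

The velocity term carries part of the previous step forward, which is why the weight keeps moving toward the minimum even though each individual gradient step is small.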
    # Save the model
    model.save("/content/drive/MyDrive/Colab Notebooks/IoT23/model_22features")

    # Run testing and measure the test time
    import time
    from datetime import timedelta
    start_time = time.monotonic()
    # Threshold the sigmoid output at 0.5 to get class labels
    # (predict_classes() is not available in recent TensorFlow releases)
    yout = (model.predict(X_test) > 0.5).astype(int).ravel()
    end_time = time.monotonic()
    print(timedelta(seconds=end_time - start_time))

    # Evaluate the model
    def my_confusion_matrix(y_true, y_pred):
        N = np.unique(y_true).shape[0]  # number of classes
        cm = np.zeros((N, N))
        for n in range(y_true.shape[0]):
            cm[y_true[n], y_pred[n]] += 1
        return cm

    # Convert the labels to an integer array so they can be used as indices
    cnf_matrix = my_confusion_matrix(np.asarray(Y_test).astype(int), yout)
    print('Confusion matrix:')
    print(cnf_matrix)
    print('\nAccuracy:', np.diagonal(cnf_matrix).sum()/cnf_matrix.sum())

o EXPERIMENT 2: 15 features

    # Load the sklearn tools for processing the input data, and the PCA library
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn import decomposition

    # Create the scaler object
    sc = StandardScaler()
    # Standardize the features (x is the resampled feature matrix, so it matches y)
    Xtrain_std = sc.fit_transform(x)
    # Create a PCA object with k components as its parameter
    pca = decomposition.PCA(n_components=15)
    # Fit PCA and transform the data
    Xa = pca.fit_transform(Xtrain_std)

    # Split the data into training, validation, and test sets (see section 3.1.1)
    from sklearn.model_selection import train_test_split
    X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(Xa, y, test_size=0.2)
    X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.2)

    # Show the sizes of the splits
    print("X_train", len(X_train))
    print("Y_train", len(Y_train))
    print("X_test", len(X_test))
    print("Y_test", len(Y_test))
    print("X_val", len(X_val))
    print("Y_val", len(Y_val))
    # output

    # Import the Keras API from TensorFlow to build the DNN model
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Dropout
    from tensorflow.keras import regularizers
    from tensorflow.keras.optimizers import SGD

    # Build the multi-layer model; the input size equals the number of PCA components
    model = Sequential([
        Dense(120, activation='relu', kernel_regularizer=regularizers.l2(0.01), input_shape=(15,)),
        Dropout(0.3),
        Dense(65, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
        Dropout(0.3),
        Dense(35, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
        Dropout(0.3),
        Dense(16, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
        Dropout(0.3),
        Dense(1, activation='sigmoid'),
    ])

    # Set the training inputs for the model: number of epochs, learning rate, loss function (see section 2.1.3)
    epochs = 20
    learning_rate = 0.01
    momentum = 0.9
    sgd = SGD(learning_rate=learning_rate, momentum=momentum)
    model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])

    # Train the model and measure the training time
    import time
    from datetime import timedelta
    start_time = time.monotonic()
    hist = model.fit(X_train, Y_train, batch_size=512, epochs=epochs, validation_data=(X_val, Y_val))
    end_time = time.monotonic()
    print(timedelta(seconds=end_time - start_time))

    # Save the model
    # (note: this is the same path as in Experiment 1, so that saved model is overwritten)
    model.save("/content/drive/MyDrive/Colab Notebooks/IoT23/model_22features")

    # Run testing and measure the test time
    import time
    from datetime import timedelta
    start_time = time.monotonic()
    # Threshold the sigmoid output at 0.5 to get class labels
    # (predict_classes() is not available in recent TensorFlow releases)
    yout = (model.predict(X_test) > 0.5).astype(int).ravel()
    end_time = time.monotonic()
    print(timedelta(seconds=end_time - start_time))

    # Evaluate the model
    def my_confusion_matrix(y_true, y_pred):
        N = np.unique(y_true).shape[0]  # number of classes
        cm = np.zeros((N, N))
        for n in range(y_true.shape[0]):
            cm[y_true[n], y_pred[n]] += 1
        return cm

    # Convert the labels to an integer array so they can be used as indices
    cnf_matrix = my_confusion_matrix(np.asarray(Y_test).astype(int), yout)
    print('Confusion matrix:')
    print(cnf_matrix)
    print('\nAccuracy:', np.diagonal(cnf_matrix).sum()/cnf_matrix.sum())
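The evaluation code above reports only accuracy. For an IDS, precision and recall on the malware class are often read off the same confusion matrix. The sketch below shows one way to derive them, assuming the matrix is indexed [true][predicted] with class 0 as benign and class 1 as malware, as in `my_confusion_matrix`. The helper `binary_metrics` and the counts in `example_cm` are made up for illustration and are not results from the thesis.

```python
def binary_metrics(cm):
    """Accuracy, precision, recall and F1 from a 2x2 matrix indexed [true][predicted]."""
    tn, fp = cm[0]
    fn, tp = cm[1]
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1

# Made-up counts for illustration only
example_cm = [[50, 10],
              [5, 35]]
print(binary_metrics(example_cm))
```

Passing `cnf_matrix.tolist()` from the listing to such a helper would give the same accuracy as the printed value, plus the per-class view.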
37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.99 37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.C.33.44.55.54.78.655.43.22.2.4.55.2237.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.55.77.77.99.44.45.67.22.55.77.C.37.99.44.45.67.22.66 o THỬ NGHIỆM 3: tất đặc trưng # load thư viện sklearn để xử lý liệu đầu vào thư viện PCA from sklearn.preprocessing import StandardScaler from sklearn.decomposition import PCA from sklearn import decomposition # Tạo đối tượng scaleer sc = StandardScaler() # Điều chỉnh tỷ lệ cho phù hợp với tính biến đổi Xtrain_std = sc.fit_transform(X) # Tạo đối tượng pca với k thành phần làm tham số pca = decomposition.PCA(n_components= 8) # Điều chỉnh PCA chuyển đổi liệu Xa = pca.fit_transform(Xtrain_std) # Chia liệu Training, Testing Validating phần 3.1.1 from sklearn.model_selection import train_test_split X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(Xa, y, test_size= 0.2) X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.2) # Show số lượng liệu vừa cắt print("X_train",len(X_train)) print("Y_train",len(Y_train)) print("X_test",len(X_test)) print("Y_test",len(Y_test)) print("X_val",len(X_val)) 
print("Y_val", len(Y_val))
# output

# Import Keras (through TensorFlow) to build the DNN model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import regularizers
from tensorflow.keras.optimizers import SGD

# Build the model as a stack of Dense and Dropout layers
model = Sequential([
    Dense(120, activation='relu', kernel_regularizer=regularizers.l2(0.01), input_shape=(8,)),
    Dropout(0.3),
    Dense(65, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(35, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(16, activation='relu', kernel_regularizer=regularizers.l2(0.01)),
    Dropout(0.3),
    Dense(1, activation='sigmoid'),
])
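The size of the network defined above can be checked by hand: each Dense layer contributes (inputs x units) weights plus one bias per unit, and Dropout layers add no trainable parameters. The sketch below is only an illustration of that arithmetic for the 8-input, 120-65-35-16-1 architecture; the helper name dense_param_count is hypothetical, and the total should agree with what model.summary() reports.

```python
def dense_param_count(input_dim, layer_units):
    """Count trainable parameters of a stack of fully connected layers."""
    total = 0
    fan_in = input_dim
    for units in layer_units:
        total += fan_in * units + units  # weight matrix + bias vector
        fan_in = units                   # this layer feeds the next one
    return total


# 8 PCA components in, then Dense layers of 120, 65, 35, 16 and 1 units.
n_params = dense_param_count(8, [120, 65, 35, 16, 1])
print(n_params)  # 1080 + 7865 + 2310 + 576 + 17 = 11848
```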
# Initialize the model's training settings: number of epochs, learning rate,
# loss function, as described in section 2.1.3
epochs = 20
learning_rate = 0.01
momentum = 0.9
sgd = SGD(learning_rate=learning_rate, momentum=momentum)
model.compile(optimizer=sgd,
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train the model and measure the training time
import time
from datetime import timedelta

start_time = time.monotonic()
hist = model.fit(X_train, Y_train,
                 batch_size=512, epochs=epochs,
                 validation_data=(X_val, Y_val))
end_time = time.monotonic()
print(timedelta(seconds=end_time - start_time))
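To make the role of the learning_rate and momentum values more concrete, the following sketch applies the update rule documented for the Keras SGD optimizer without Nesterov momentum (velocity = momentum * velocity - learning_rate * gradient, then w = w + velocity) to a single scalar weight. The toy loss f(w) = w^2 and the helper name sgd_momentum_step are assumptions made for illustration; they are not part of the model trained here.

```python
def sgd_momentum_step(w, velocity, grad, learning_rate=0.01, momentum=0.9):
    """Apply one SGD-with-momentum update to a scalar weight."""
    velocity = momentum * velocity - learning_rate * grad
    return w + velocity, velocity


# Toy problem: minimize f(w) = w**2, whose gradient is 2 * w.
w, v = 1.0, 0.0
for step in range(3):
    w, v = sgd_momentum_step(w, v, grad=2.0 * w)
    print(step, round(w, 6), round(v, 6))
```

Each step moves further than the previous one because the velocity accumulates past gradients, which is the effect momentum = 0.9 is meant to have.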