NGO DUC MINH
A HIGH PERFORMANCE ANOMALY-BASED INTRUSION DETECTION SYSTEM FOR SDN
VIET NAM NATIONAL UNIVERSITY - HO CHI MINH CITY
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY - VNU-HCM
Scientific supervisor:
(Full name, academic rank, degree, and signature)
Examiner 1:
(Full name, academic rank, degree, and signature)
Examiner 2:
(Full name, academic rank, degree, and signature)
The master's thesis was defended at Ho Chi Minh City University of Technology, VNU-HCM, on (day, month, year)
The master's thesis evaluation committee consists of:
(Full names, academic ranks, and degrees of the members of the thesis defense committee)
VIETNAM NATIONAL UNIVERSITY - HO CHI MINH CITY / SOCIALIST REPUBLIC OF VIETNAM
MASTER'S THESIS ASSIGNMENT
Date of birth: 05/06/1994 Place of birth: Dong Nai
I. THESIS TITLE: A HIGH PERFORMANCE ANOMALY-BASED INTRUSION DETECTION SYSTEM FOR SDN
II. TASKS AND CONTENT:
• Study and implement network attack detection/prevention algorithms, including anomaly-based detection techniques that use machine learning on NetFPGA hardware.
• Build an OpenFlow switch architecture on the NetFPGA platform that can integrate a security block. Measure and evaluate the switching speed of the new system.
• Integrate the anomaly-based attack detection block into the OpenFlow switch on the NetFPGA platform.
• Modularize the system and improve its resource usage on NetFPGA hardware. Test and evaluate the results against related works.
III. DATE OF ASSIGNMENT: 13/08/2018
IV. DATE OF COMPLETION: 30/12/2019
V. SUPERVISOR: Assoc. Prof. Dr. TRAN NGOC THINH.
Ho Chi Minh City, (day, month), 20..
(Full name and signature) (Full name and signature)
Firstly, I would like to express my sincere gratitude to my advisor, Associate Professor Tran Ngoc Thinh, for his warm support and for taking the time, from the beginning of my work, to orient the research and guide me through each step of the dissertation process.
Besides my advisor, I would like to thank my teachers at the Faculty of Computer Science & Engineering, Ho Chi Minh City University of Technology, for providing me with a great deal of knowledge during my master's course. In particular, my sincere thanks go to Doctor Cuong Pham-Quoc, who guided me and greatly improved my skills in writing papers and reports.
To my Computer Engineering labmates: I would like to thank Mr. Ho Quang Chi Bao, Mr. Tran Nguyen Vo, Mrs. Tran Thi Thuy Chau, Mr. Le Tan Long, Tran Minh Anh Tuan, Dang Ngo Nhat Truong, Nguyen Gia Phuc, and the many other individuals for the sleepless nights we spent working together before deadlines, and for all the fun we have had in the last three years.
Last but not least, I am grateful to God for the good health and well-being that were necessary to complete this dissertation. I would like to say thank you so much to my family. I could not have gone far on my way without their support and encouragement.
Ho Chi Minh City, December 01st, 2019
Ngo Duc Minh
ABSTRACT
The Internet has become a global information system that can be accessed publicly by linked computers from all over the world. Along with the development of the Internet of Things (IoTs), the number of entities on the Internet can reach 20-100 billion in 2020 [1]. Thus, an alternative to the traditional network architecture that can adapt to these comprehensive demands is urgently needed. Software-Defined Networking (SDN) [2] has been investigated by many organizations because of its advantages compared to the traditional approaches. Traditional computer networks are configured and operated manually, while SDN provides centralized control, simple hardware devices, and high virtualization. SDN decouples network control from forwarding functions, so that network control becomes programmable. In SDN, network control contains controllers configured by network administrators through software interfaces.
Besides, the development of Artificial Intelligence (AI), which can train a machine to imitate intelligent human behaviors, has become a prominent topic. AI has achieved several successes in practical applications such as decision making, speech recognition, and object classification. However, real-time applications usually require heavy computational tasks; thus, processing on general-purpose processors is not efficient in terms of performance. Meanwhile, hardware accelerators such as Graphics Processing Units (GPUs) and Field-Programmable Gate Arrays (FPGAs) have been employed to improve the throughput of AI algorithms in these applications. This work aims to enhance security for SDN by pre-processing and protecting network packets at the data plane using multiple approaches.
This work introduces an efficient high-throughput and low-latency SYN flood defender architecture, carefully designed with a pipeline model. A mathematical model is also provided with the architecture for estimating SYN flood protection throughput and latency. Our experiments with NetFPGA-10G platforms show that the core can protect servers against SYN flood attacks of up to 28+ million packets per second.
In addition, the thesis proposes an architecture for SDN-based secured forwarding devices (switches) by extending our previous architecture, HPOFS, with multiple security functions, including lightweight DDoS mechanisms, signature-based IDS, and anomaly-based IDS. We implement our architecture on a heterogeneous system that includes host processors, a GPU, and FPGA boards. To the best of our knowledge, this is the first forwarding device for SDN implemented on a heterogeneous system in the literature. Our system not only offers enhanced security but also provides high-speed switching capacity based on the OpenFlow standard. The design implemented on a GTX Geforce 1080 G1 for the training phase is 14× faster than an Intel Core i7-4770 CPU (3.4 GHz, 16 GB of RAM) running Ubuntu 14.04. The switching function, along with three lightweight DDoS detection/prevention mechanisms, reaches a processing speed of 39.48 Gbps on the NetFPGA-10G board (Xilinx xc5vtx240t FPGA device). In particular, our neural network models on the NetFPGA-10G board outperform the CPU in processing performance, reaching throughputs of 4.84 Gbps. Moreover, the neural network model achieves 99.01% precision with only a 0.02% false-positive rate on the created dataset.
Master's Thesis Summary
The Internet has become a global information system that can be accessed anytime and anywhere by computers all over the world. Along with it comes the development of the Internet of Things, in which the number of connections can reach 20-100 billion in 2020 [1]. Therefore, an alternative to the traditional network architecture is essential. Software-Defined Networking (SDN) [2] is considered an alternative to traditional networks and has been studied by many organizations around the world because of its many advantages over traditional networks. Traditional computer networks are usually configured and operated manually, and administrators must go to each hardware device to configure it, whereas SDN provides centralized management. In SDN, controllers are operated by network administrators through software.
In addition, the development of Artificial Intelligence (AI), which can train computers to imitate human behavior, has become a promising topic. AI has achieved many successes in practical problems such as decision making, speech recognition, and entity classification. Besides, applications with real-time requirements usually demand complex computation, so general-purpose processors are no longer efficient in terms of performance. Meanwhile, acceleration hardware such as GPUs and FPGAs has been studied and used to improve the performance of applications that use AI algorithms. This master's thesis aims to reduce computation on the control plane by pre-processing and protecting the network earlier, at the data plane, using different approaches, especially anomaly-based attack detection.
This work introduces a high-performance architecture for defending against SYN flood attacks, carefully designed with a pipeline mechanism. A mathematical model is also proposed to measure and evaluate the resilience of the SYN flood block in terms of throughput and latency. Experimental results on the NetFPGA-10G platform show that this security block can protect servers against SYN flood attacks at rates of up to 28+ million packets per second.
In addition, the thesis proposes security for data forwarding devices in the SDN architecture by extending a previous work, named HPOFS, with security engines that include DDoS protection blocks, signature-based attack detection, and anomaly-based attack detection. The author implements the proposed architecture on heterogeneous hardware platforms, including host processors, a GPU board, and FPGA boards. This can be considered the first system that integrates different security blocks into switching hardware following the SDN architecture. The system is not only tightly secured but can also switch packets and be centrally managed through the OpenFlow protocol. Training the neural network models on a GTX Geforce 1080 G1 GPU is 14 times faster than on an Intel Core i7-4770 CPU (3.4 GHz, 16 GB of RAM) running Ubuntu 14.04. The switching function together with the DDoS protection blocks can process traffic at up to 39.48 Gbps on the NetFPGA-10G board (Xilinx xc5vtx240t FPGA device). In particular, the machine learning models on this board clearly outperform the CPU, with a processing speed of 4.84 Gbps. Moreover, the best machine learning model achieves 99.01% precision with a 0.02% false-alarm rate on the dataset collected in the Computer Engineering laboratory environment at Ho Chi Minh City University of Technology.
Statement of Originality
I certify that the results of this work are the product of my own work and that of my colleagues at the Faculty of Computer Science and Engineering (CSE), Ho Chi Minh City University of Technology (HCMUT), Vietnam National University - Ho Chi Minh City (VNU HCM).
All the assistance received in preparing this thesis and all sources have been acknowledged. Parts of this work have previously been published in the scientific papers below:
• Heterogeneous Hardware-based Network Intrusion Detection System with Multiple Approaches for SDN. In: Mobile Networks and Applications, Vol. 25, Issue 1, 1-15 (2020). ISBN/ISSN: 1572-8153 (SCIE)
• [...] Ngoc Thinh, and Cuong Pham-Quoc. High-throughput Machine Learning Approaches for Network Attacks Detection on FPGA. In: ICCASA2019, pp. 1-10. Springer (2019)
• Cuong Pham-Quoc, Duc-Minh Ngo, Tran Ngoc Thinh. HPOFS: A High Performance and Secured OpenFlow Switch Architecture for FPGA. In: Advances in Electrical and Computer Engineering, Volume 19, Issue 3, 19-28 (2019). ISBN/ISSN: 1582-7445 (SCIE)
• Efficient High-Throughput and Low-Latency SYN Flood Defender for High-Speed Networks. In: Security and Communication Networks, Volume 2018, 14 (2018). ISBN/ISSN: 1939-0122 (SCIE)
Ngo Duc Minh
CONTENTS
Acknowledgments
1.1 Problem statement
1.2 Contributions
1.3 Organization
2 Background and Related work
2.1 Software-Defined Network
2.2 Field Programmable Gate Array
2.3 NetFPGA-10G platform
2.4 TCP SYN flood attacks
2.5 Decision Tree
2.6 Artificial Neural Network
2.7 Anomaly-based network intrusion detection Dataset
2.7.1 NSL-KDD Dataset
2.7.2 Generated Dataset
2.8 Related work
2.8.1 Security on the control plane
2.8.2 Security on the data plane
2.8.3 DDoS attacks detection/prevention
2.8.4 Anomaly-based network intrusion detection
2.9 Summary
3.1 Secured SDN data plane forwarding devices
3.2 SYN flood defender standalone architecture
3.2.1 Header extraction
3.2.2 Packet classification
3.2.3 SYN flood attack detection
3.2.4 Decision generation
3.2.5 Packet modification
3.2.6 SYND Performance analysis
4 Network intrusion detection cores
4.1 Machine learning cores
4.1.1 Decision Tree
4.1.2 Artificial Neural Network
4.2 Anomaly-based Artificial Neural Networks on Heterogeneous Acceleration Hardware
4.2.1 FSOFS
4.2.2 GMC
4.2.3 FNN
4.3 SYND implementation
4.3.1 Header Extractor
4.3.2 IDG phase 1 and IDG phase 2
4.3.3 CKG phase 1 and CKG phase 2
4.3.4 Detector
4.3.5 Decision Controller
4.3.6 Packet Modifier
4.4 Combination of SYND and OpenFlow Switch
5 Evaluation
5.1 Synthesis results
5.2 Experimental setups
5.3 Experimental results
5.3.1 SYND core
5.3.2 FSOFS
5.3.3 GMC
5.3.4 FNN
5.3.5 Full system test
6 Conclusions and Future works
6.1 Conclusions
6.2 Future Works
A.1 High-throughput Machine Learning Approaches for Network Attacks Detection
A.2 High-throughput Machine Learning Approaches for Network Attacks Detection
A.3 An Efficient High-Throughput and Low-Latency SYN Flood Defender for High-Speed Networks
[...] Architecture for FPGA
LIST OF FIGURES
1.1 Number of emails sent each day from 2017-2022 [3]
1.2 OpenFlow architecture
2.1 Layered view of SDN architecture
2.2 The basic architecture of an FPGA device [4]
2.3 The NetFPGA 10G platform
2.4 TCP three-way handshaking protocol of (a) traditional system and (b) SYN flood defender system together with protected server
2.5 A simple ANN with three layers of neurons
2.6 SDN architecture with application, control plane, and data plane layers
2.7 SLICOTS architecture [5]
2.8 AVANT-GUARD proposal [6]
2.9 Stateful SDN architecture [7]
2.10 SYN Cookies behaviors in three-way handshake protocol
2.11 High-level view of proposed SDN architecture
3.1 Hardware-based forwarding device architecture with OpenFlow Function and three secure methods including Pre-scanner, F-NIDS, and F-ANIDS
3.2 Hardware architecture of SYN flood defender with five pipeline stages
4.1 Decision tree block diagram
4.2 Neuron network overview
4.3 Neuron network block diagram
4.4 A hardware-based forwarding device with one GPU Geforce GTX 1080 and two NetFPGA-10G boards
4.5 The FSOFS system on NetFPGA-10G board which is integrated in a CPU
4.6 GMC system on GPU Geforce GTX 1080 G1 board under handling of a CPU
4.7 Mean Squared Error (MSE) values of 10×N×1 models (0 ≤ N ≤ 4)
4.8 Mean Squared Error values of 10×N×1 models (4 ≤ N ≤ 10)
4.9 FNN system on NetFPGA-10G board which is integrated in a CPU
4.10 Header extraction in two phases of SYN flood defender
4.11 Index Generation process with CRC hash algorithm
4.12 Cookie Generation process using CRC hash algorithm and the construction of cookie number
4.13 The construction of SYN flood Detector module
4.14 The Decision Controller module with authentication mechanism using known clients table
5.1 The first setup to measure performance of proposed system using OSNT
5.2 The second setup with full system in real physical network
5.3 Processing throughputs of the core with 62-byte SYN packets
5.4 [...] calculated by our estimated model
5.5 The latency comparison for different numbers of packets
5.6 Performance of the Pre-scanner component with three lightweight DDoS detection/prevention techniques
5.7 Performance of F-NIDS block with various packet sizes
5.8 Performance of F-ANIDS_1 with mixed packets in different sending speed levels
5.9 Comparison in training time between GPU and CPU
5.10 Processing speed of three implemented models on NetFPGA-10G board
5.11 Comparison between CPU and FPGA in number of received entries
LIST OF TABLES
2.1 The NetFPGA 10G specification
2.2 The 6 features descriptions
2.3 The number of packets in training and testing phases
2.4 Researches on detecting network attacks using Deep learning techniques
2.5 Network anomaly-based detection system for SDN
4.1 Header modification in two cases: converting SYN to SYN/ACK packet and converting ACK to RST packet
5.1 Synthesis results for the proposed forwarding device which is implemented in the NetFPGA 10G platform
5.2 Comparison between SYN flood defender ability and other proposals in the literature
5.3 Confusion matrix of the three models
GLOSSARY
ACK: Acknowledgment flag.
ASIC: Application-Specific Integrated Circuit.
AXI: Advanced eXtensible Interface.
CPU: Central Processing Unit.
DDoS: Distributed Denial of Service.
DoS: Denial of Service.
DRDoS: Distributed Reflection Denial of Service.
DSP: Digital Signal Processor.
FIFO: First In First Out.
FTP: File Transfer Protocol.
ICAP: Internal Configuration Access Port.
IoTs: Internet of Things.
IP: Internet Protocol.
IPv4: Internet Protocol version 4.
IPv6: Internet Protocol version 6.
ISP: Internet Service Provider.
IT: Information Technology.
JTAG: Joint Test Action Group.
KB: Kilobyte.
NFS: Network File System.
NIC: Network Interface Controller.
PIEF: Port Ingress/Egress Filtering.
PR: Partial Reconfiguration.
Rx: Receiver.
SFP+: Small Form-factor Pluggable Interface +.
SSL: Secure Socket Layer.
TLS: Transport Layer Security.
TTL: Time To Live.
Tx: Transmitter.
The Internet has become a global information system that can be accessed publicly by linked computers from all over the world. In this system, thousands of small computer groups (LAN, WAN, and MAN networks) from businesses, research institutes, universities, individual users, and governments are connected with each other. The Internet is always busy with an enormous number of connections: more than 40,000 Google queries are performed every second, and there are 3.5 billion Google searches per day [1]. According to a report in 2017, about 269 billion emails are sent each day, and this number is predicted to reach more than 333 billion in 2022 [3] (Figure 1.1 shows the statistics and predicted number of emails sent each day from 2017 to 2022). With the development of the Internet of Things (IoTs), everything will be able to connect to the Internet, and the number of entities on the Internet can reach 20-100 billion in 2020. Besides, the number of Internet users is also increasing linearly. Together, these trends make Internet data grow in an uncontrollable way if we keep using the traditional network architecture. To overcome this issue and serve such a large number of users and network-based services, many works have been proposed and implemented in both academia and industry.
One of the most prominent emerging approaches is Software-Defined Networking (SDN) [8], which has been considered as an alternative approach to traditional network architecture because of its advantages compared to the traditional approaches. Computer networks are configured and operated manually in traditional networks, while SDN provides centralized control, simple hardware devices, and high virtualization. SDN decouples network control from forwarding functions so that network control becomes programmable. In SDN, network control contains controllers configured by network administrators through software interfaces. Each controller is responsible for handling a number of forwarding devices that provide forwarding functions. Those forwarding devices route network packets from sources to programmed destinations according to network policies.
Figure 1.1: Number of emails sent each day from 2017-2022 [3]
Taking the principles of SDN into design, OpenFlow switches decouple the control plane from the data plane (Figure 1.2). While forwarding devices at the data plane are responsible for routing network packets, an associated controller at the control plane handles these devices and makes high-level routing decisions. According to SDN principles, the first packet of a new flow coming to a forwarding device will be forwarded to the controller for making the corresponding routing path [...] virtualization and load-sharing [10].
[...] applying an anomaly-based approach to detect network intruders.
In addition, the development of Artificial Intelligence (AI) [19], which can train a machine to imitate intelligent human behaviors, has become a prominent topic. AI has achieved several successes in practical applications such as visual perception, decision making, speech recognition, and object classification. Likewise, Machine Learning (ML) [20] is well known as a subset of AI with the ability to update and improve itself when exposed to more data. ML is flexible and does not require human intervention to make certain changes. One of the most practical applications of ML is solving classification problems, which are similar to the problem of detecting network intruders/attackers. Many classification techniques [21], such as Linear Classifiers, Logistic Regression, Naive Bayes Classifiers, Support Vector Machines, Decision Trees, and Neural Networks, have been used to predict the category to which the data belongs. Besides, real-time applications usually require heavy computational tasks; thus, processing on general-purpose processors is not efficient in terms of performance. Meanwhile, hardware accelerators such as Graphics Processing Units (GPUs) and Field-Programmable Gate Arrays (FPGAs) have been employed to improve the throughput of ML algorithms in these applications.
Currently, GPU-based acceleration is a promising approach for improving on the performance of traditional processors in ML algorithms, due to the wide range of hardware providers and the impressive high-performance, high-throughput computing power. GPUs offer significant computation throughput thanks to the thousands of parallel processing cores they integrate. However, GPU platforms require [...] interconnect interface is needed to support high data rates. Although GPUs are extremely efficient in the training phase, they are not efficient enough in the testing phase because of the bottleneck in data transfer [22]. GPUs are deployed only on dedicated cards requiring links to the host CPU and memory over data [...] severely limits the abilities of GPUs in high performance computing [23].
Nowadays, FPGAs play an important role in the data sampling and processing industries due to their flexibility in custom hardware, high level of parallelism, and low energy consumption. In the artificial intelligence field, there is a soaring demand for energy-efficient hardware implementations and massively parallel computing capacity for training and testing. GPUs are good for training, while FPGA devices have emerged as the best possible choice for testing [23; 24]. The advantages of FPGAs include acceptable energy consumption with high performance, efficient parallel processing, support for custom architectures, high on-chip memory bandwidth, low latency, high reliability, and a relatively short time to market.
This work proposes a high performance anomaly-based intrusion detection system for SDN networks. The main contributions can be stated as follows:
1. [...] detection/prevention mechanisms, especially for SYN flooding attacks. In addition, this thesis investigates machine learning classification techniques on acceleration hardware for high-throughput anomaly-based network detection.
2. This work proposes an efficient high-throughput SYN flood defender [...] this SYN Flood detection/prevention core was published in 2018.
3. For anomaly-based machine learning techniques, we designed two classification models, a decision tree and a neural network, for detecting network attacks using the NSL-KDD dataset. We implemented the first prototype version on the NetFPGA-10G board and validated the system with the NSL-KDD dataset. The experimental results show that our design outperforms both a Geforce GTX 850M GPU and an 8th-generation Intel Core i5 CPU in processing time. One research paper [26] about this work was published in 2019.
4. We propose the architecture for High-Performance Secured OpenFlow [...] sources to destinations according to the OpenFlow protocol and examine these packets to counter different attacks. The two behaviors can be executed in both parallel and pipeline models to achieve optimized [...] Ingress/Egress filtering and Hop-Count filtering.
5. By combining all of the above proposals, we introduce a comprehensive approach [28] that integrates three intrusion detection system (IDS) engines into forwarding devices in the Software-Defined Networking architecture. The first IDS engine (named Pre-DDoS Scanner) adopts DDoS detection/prevention cores, the second IDS engine (called F-NIDS) uses Snort rules to classify attacking packets, and the last one (called F-ANIDS) is able to recognize anomalous behaviors of network packets based on a machine learning model. The proposed architecture is implemented on a heterogeneous platform including two FPGA boards and one GPU platform under the control of host processors. While the first FPGA board is used to implement the original HPOFS and F-NIDS, the trained neural network for detecting anomalous behaviors is deployed on the second FPGA board. The GPU platform is used for training the neural network.
6. Finally, a number of testing scenarios for verifying and evaluating the system are introduced in this work. The system is tested with both standard datasets and real datasets collected from our institute's networks. The resource usage and power consumption of the FPGA devices are also analyzed and discussed.
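The hop-count filtering named in contribution 4 can be illustrated with a short software sketch. This is not the hardware design of this thesis: it assumes the commonly described form of the technique, in which the hop count is inferred from the received TTL and compared with a value learned earlier for the same source address. The list of initial TTL values, the table layout, and the function names are all hypothetical.

```python
# Illustrative sketch only: hop-count filtering in software.
# Assumption: senders start from one of a few common initial TTL values.
COMMON_INITIAL_TTLS = (30, 32, 60, 64, 128, 255)


def infer_hop_count(observed_ttl: int) -> int:
    """Estimate hops travelled as (smallest common initial TTL >= observed) - observed."""
    for initial_ttl in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial_ttl:
            return initial_ttl - observed_ttl
    raise ValueError("TTL must be between 0 and 255")


def looks_spoofed(src_ip: str, observed_ttl: int, hop_table: dict, tolerance: int = 0) -> bool:
    """Flag a packet whose inferred hop count disagrees with the learned one."""
    expected_hops = hop_table.get(src_ip)
    if expected_hops is None:
        return False  # unknown source: nothing to compare against
    return abs(infer_hop_count(observed_ttl) - expected_hops) > tolerance


# Hypothetical table learned from earlier legitimate traffic.
hop_table = {"10.0.0.5": 12}
genuine = looks_spoofed("10.0.0.5", 116, hop_table)  # 128 - 116 = 12 hops
forged = looks_spoofed("10.0.0.5", 50, hop_table)    # 60 - 50 = 10 hops
```

The idea is that an attacker can forge the source address but cannot easily forge the number of hops the packet really travels, so a mismatch is a cheap hint that can be checked per packet.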
As far as I am concerned, this thesis proposes a high performance [...] acceleration hardware platforms, namely FPGA and GPU, are used to demonstrate its outstanding abilities, including a high-throughput switching system based on the OpenFlow standard and efficient real-time network attack detection when compared with relevant software-based approaches.
[...] knowledge and introduces the classification techniques that are implemented as security functions in this work, the dataset used, and related works. In Section 3, the proposed hardware architecture for forwarding devices in SDN networks is [...] anomaly-based network intrusion detection engine based on the proposed [...] conclusions and future works are discussed in Section 6.
This section first presents an overview of the SDN architecture and a discussion of its security issues. Second, it gives the background on the FPGA platform and the techniques used for our security functions in the architecture. Finally, related works in the literature are summarized.
[...] traditional networking by separating the control from the data plane. SDN architecture has been considered as an alternative approach to traditional network architecture, which has devices (e.g., routers, switches, firewalls, ...) [...] automation at the network level. The SDN architecture divides the network model into three layers: the Application layer, the Control layer, and the Infrastructure layer. Figure 2.1 illustrates the three layers of the SDN architecture. According to this architecture, SDN decouples control functions from forwarding functions, then [...] controllers running network services programmed by network applications through the Northbound interface. Each controller is responsible for handling a number of forwarding devices located in the infrastructure layer that perform forwarding functions. Those forwarding devices route network packets from source nodes to destination nodes according to the network configuration.
Figure 2.1: Layered view of SDN architecture (Application layer: network applications; Control layer: network services; Infrastructure layer: forwarding devices; linked by the Northbound and Southbound interfaces)
The centralized control model of SDN may cause many security issues, [...] previous studies have focused on building secured functions for controllers or increasing the strength of controllers [29; 30]. However, these approaches could [...] developing intelligent data planes where data is pre-processed to protect systems from attacks [17; 18; 31]. Nevertheless, with the fast increase in the number of network attacks as well as attack types, a system needs to be augmented with different protection techniques to survive attacks. When building secured functionalities for SDN, one of the critical issues is not to break the principles of SDN, such as centralized control and monitoring and the decoupling of the control and data planes.
While forwarding devices at the data plane are responsible for routing network packets, an associated controller at the control plane handles these devices and makes high-level routing decisions. According to SDN principles, the first packet of a new flow coming to a forwarding device will be forwarded to the controller for making the corresponding routing path. Due to this regulation, OpenFlow networks, as well as SDN in general, are highly sensitive to saturation attacks, such as DDoS or flooding, where an extreme number of new flows come to a switch simultaneously.
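To make the saturation problem concrete, the following toy model counts how often a switch must consult its controller. It is a sketch under simplifying assumptions, not an OpenFlow implementation: the class `ToySwitch`, the flow key layout (source IP, destination IP, source port, destination port), and the fixed "forward" action are hypothetical.

```python
# Illustrative sketch only: a flow-table miss sends the first packet of a flow to the controller.
from dataclasses import dataclass, field


@dataclass
class ToySwitch:
    flow_table: dict = field(default_factory=dict)
    controller_requests: int = 0

    def ask_controller(self, flow_key):
        """Stand-in for the controller round trip (hypothetical fixed answer)."""
        self.controller_requests += 1
        return "forward"

    def handle_packet(self, flow_key):
        action = self.flow_table.get(flow_key)
        if action is None:  # table miss: first packet of a new flow
            action = self.ask_controller(flow_key)
            self.flow_table[flow_key] = action
        return action


def count_controller_requests(flow_keys):
    switch = ToySwitch()
    for key in flow_keys:
        switch.handle_packet(key)
    return switch.controller_requests


# 1000 packets of one legitimate flow versus 1000 packets with unique spoofed headers.
legit = [("10.0.0.1", "10.0.0.9", 40000, 80)] * 1000
flood = [(f"198.51.100.{i % 250}", "10.0.0.9", 1024 + i, 80) for i in range(1000)]
legit_requests = count_controller_requests(legit)
flood_requests = count_controller_requests(flood)
```

In this model the legitimate flow costs one controller request, while the flood costs one request per packet, which is why filtering such traffic at the data plane relieves the controller.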
This work proposes to implement security functions for forwarding devices in an SDN network, and we target reconfigurable, parallel hardware [...] introduces an overview of this technology.
Field Programmable Gate Array (FPGA) is a dominant technology for [...] general-purpose processors, FPGAs have benefits in performance, while compared to Application-Specific Integrated Circuits (ASICs), FPGAs allow [...] FPGA-based high-performance computing platforms can be considered, such as Micron [33] (formerly Convey) and Maxeler [34].
Although many organizations develop FPGAs, their devices share the same basic architecture. An FPGA device is a semiconductor device that includes a matrix of configurable logic blocks (CLBs) connected through programmable [...] FPGA device, including:
• [...] tables (LUT) to implement combinational logic, registers for sequential circuits, and some additional logic elements such as multiplexers or buffers. Each LUT has multiple inputs and functions as multiple-input combinational logic. The number of inputs for each LUT depends on the architecture and generation of the FPGA device; it is about 4 to 6 inputs for modern FPGA devices.
• [...] communicating with external components or devices.
• [...] disconnect CLBs, I/O blocks, and other components.
An FPGA may contain other blocks such as memory, clock distribution, digital [...] high-speed serial transceivers.
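The way a look-up table realizes multiple-input combinational logic can be shown with a small software model. This is only a sketch of the concept: the helper names `build_lut` and `lut_eval` are hypothetical, and real devices store the table in configuration memory loaded from the bitstream.

```python
# Illustrative sketch only: a k-input LUT is a 2**k-entry truth table indexed by its inputs.
def build_lut(func, n_inputs):
    """Precompute the truth table, which plays the role of the LUT configuration bits."""
    table = []
    for index in range(2 ** n_inputs):
        bits = [(index >> pos) & 1 for pos in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table


def lut_eval(table, *bits):
    """Evaluate the logic function by a single table lookup."""
    index = sum(bit << pos for pos, bit in enumerate(bits))
    return table[index]


# A 3-input majority function and a 4-input XOR, each stored as a table.
majority = build_lut(lambda a, b, c: (a & b) | (b & c) | (a & c), 3)
xor4 = build_lut(lambda a, b, c, d: a ^ b ^ c ^ d, 4)
```

Any function of the same number of inputs costs one lookup, so changing the logic only means loading different table contents.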
Figure 2.2: The basic architecture of an FPGA device [4]
An FPGA device is configured by loading application-specific configuration data, named a bitstream, into its internal configuration memory. Partial reconfiguration (PR) is the modification of an operating FPGA's configuration memory by loading a partial configuration file. With the rapid development of technology, FPGAs now allow dynamic partial reconfiguration (DPR), which means that some parts of an FPGA device can be reconfigured at runtime while other parts are still working. This runtime reconfiguration allows systems to be updated while still operating. The DPR design flow partitions the configuration memory into static logic and reconfigurable logic [36].
FPGA is more flexible than application-specific integrated circuit (ASIC) [...] manufacturing by users, ASIC is programmed by experts from a manufacturer [...] advantage of hardware-based high-speed parallel processing but also the [...] programmed using a hardware description language (HDL) such as Verilog or very high speed integrated circuit (VHSIC) HDL (VHDL).
2.3 NetFPGA-10G platform
[...] research and teaching on parallel, high-speed data transfer networks. It allows researchers, developers, and students to build prototypes of high-speed, hardware-accelerated networking systems based on its supported platforms. Its platforms, named NetFPGA platforms, are built upon FPGA technology supported by the manufacturer. There are several NetFPGA platforms, such as NetFPGA 1G, NetFPGA CML, NetFPGA 10G, and NetFPGA SUME. In this work, [...] secured OpenFlow switch.
Figure 2.3: The NetFPGA 10G platform
The NetFPGA-10G board includes four SFP+ ports and one Xilinx Virtex-5 TX240T FPGA device. The four SFP+ ports are suitable for building high-speed network [...] resources to handle massive traffic on a network. We use a Hardware Description Language (HDL) to develop all modules in the three most important components. More details about the board are shown in Table 2.1.
2.4 TCP SYN FLOOD ATTACKS
The TCP SYN flood attack mechanism exploits the TCP three-way handshake protocol to acquire resources of target servers and to prevent legitimate clients
Unlike normal forwarding devices, a SYN flood defender acts on behalf of the protected server and replies with SYN-ACK packets using the SYN cookie technique
after receiving RST packets make the next SYN packets to access the protected system
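The cookie check performed by such a defender can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the key `SECRET`, the functions `syn_cookie` and `ack_is_valid`, and the use of HMAC-SHA256 are not taken from the thesis, and real SYN cookies additionally encode a time counter and the MSS. The sketch only shows the principle that the defender stores no per-connection state and recomputes the cookie when the final ACK arrives.

```python
import hashlib
import hmac
import struct

MOD = 2 ** 32                  # TCP sequence numbers are 32-bit
SECRET = b"demo-secret-key"    # hypothetical key held only by the defender


def syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn):
    """Derive a 32-bit cookie from the connection 4-tuple and the client ISN."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    msg += struct.pack("!I", client_isn % MOD)
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]


def ack_is_valid(src_ip, src_port, dst_ip, dst_port, seq, ack):
    """Validate the final ACK of the handshake without any stored state.

    In the ACK, seq equals client ISN + 1 and ack equals cookie + 1.
    """
    client_isn = (seq - 1) % MOD
    expected = (syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn) + 1) % MOD
    return ack % MOD == expected
```

A legitimate client echoes the cookie and passes the check; a spoofed source never sees the SYN-ACK and therefore cannot produce a valid ACK number.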
Table 2.1: The NetFPGA 10G specification
To prevent SYN flood attacks, defense systems can be placed at the source side, the victim side, or the network side [39]. The weakness of common SYN flood prevention systems deployed at the source side is the massive consumption
[38; 40; 41], aim to minimize stored information but increase the latency of network packets. In contrast, hardware-based approaches, comprising Field
Circuit (ASIC) [43] for parallel processing, have been used as efficient platforms for building SYN flood defense systems. The main advantages of hardware approaches are parallel processing and low latency, which make them suitable for protecting against high-rate SYN flood attacks. However, some limitations remain, such as low scalability, high implementation cost, and high complexity, and the designs have not yet been optimized.
A decision tree [44] is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. A decision tree is a flowchart-like structure
Figure 2.4: TCP three-way handshaking protocol of (a) traditional system and (b) SYN flood defender system together with protected server.
which classifies examples by sorting them down the tree from the root to some leaf node, with the leaf node providing the classification of the example. Each node in the tree acts as a test case for some attribute, and each edge descending from that node corresponds to one of the possible answers to the test case. This process is recursive in nature and is repeated for every sub-tree rooted at the new nodes. The decision tree is implemented in hardware in Section 4.1.
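The root-to-leaf traversal described above can be sketched in a few lines. This is a software illustration only, not the hardware design of Section 4.1; the attribute names, thresholds and labels in `EXAMPLE_TREE` are hypothetical and chosen purely to make the example concrete.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    label: str                 # classification returned at this leaf


@dataclass
class Node:
    attribute: str             # attribute tested at this node
    threshold: float           # test: example[attribute] <= threshold
    if_true: "Tree"            # edge followed when the test holds
    if_false: "Tree"           # edge followed otherwise


Tree = Union[Leaf, Node]


def classify(tree: Tree, example: Dict[str, float]) -> str:
    """Sort one example down the tree, recursing into the chosen sub-tree."""
    if isinstance(tree, Leaf):
        return tree.label
    if example[tree.attribute] <= tree.threshold:
        return classify(tree.if_true, example)
    return classify(tree.if_false, example)


# Hypothetical two-level tree, for illustration only.
EXAMPLE_TREE = Node(
    "count", 100.0,
    Leaf("normal"),
    Node("error_rate", 0.5, Leaf("normal"), Leaf("attack")),
)
```

Each call tests exactly one attribute, so the number of comparisons for an example is bounded by the depth of the tree.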
An Artificial Neural Network (ANN) [45; 46] is a computing system that plays an important role in various application domains such as computer vision, speech recognition, or medical diagnosis. An ANN imitates the human neural system by learning from, recording, and using experience from past events. For instance, an ANN computational model is composed of multiple neuron layers, connections, and directions of data propagation. The ANN is able to learn features of data with multiple levels of abstraction by finding a suitable linear or non-linear mathematical manipulation to turn inputs into outputs. Figure 2.5 illustrates an ANN example with three layers.
[Figure 2.5: ANN example with three layers; recoverable labels: Input layer, Output layer]
• Inputs are data entries (features)
• Output is a predicted result for the corresponding inputs
• Weights represent the significance of input features
another neuron in the next layer. The widely used activation function is
sometimes large, thus the activation function limits this result before passing it to the next layer. In our design on hardware platforms, we prefer to use an alternative to the activation function, as shown in
hardware resources required
neural network can then be used to perform its tasks by computing the output of the network according to new inputs and the assigned weights. This is referred to as the testing phase. The implementation of the ANN is described in Section 4.1.
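The testing phase amounts to a feed-forward pass with fixed weights, which can be sketched as follows. The sketch assumes a sigmoid activation; this is consistent with the output · (1 − output) factor in Equation 2.4 but is an assumption here, and it does not model the hardware-friendly alternative mentioned above. Bias terms are omitted for brevity, and the names `sigmoid`, `layer_forward` and `forward` are illustrative.

```python
import math
from typing import List

Matrix = List[List[float]]     # one row of input weights per neuron


def sigmoid(x: float) -> float:
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs: List[float], weights: Matrix) -> List[float]:
    """Compute the outputs of one fully connected layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def forward(inputs: List[float], layers: List[Matrix]) -> List[float]:
    """Propagate the inputs through every layer in order."""
    activations = inputs
    for weights in layers:
        activations = layer_forward(activations, weights)
    return activations
```

For a network with a two-neuron hidden layer and one output neuron, `forward(x, [hidden_weights, output_weights])` returns a single value in (0, 1) that can be thresholded into a class decision.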
error-minimizing for activation functions. Since the feed-forward pass is computed in the usual way, back-propagation depends on the output calculated from the activation function. Weights in an ANN can be treated as inputs going to a single node and being fed to the network in feed-forward steps to produce the output of the single neuron. The main idea of back-propagation is to use the output to calculate the errors of the function and adjust the weights toward the most appropriate values. To handle the back-propagation computation, two values need to be stored at each neuron:
• The output o of the j-th node in the feed-forward calculation
These two values are part of the gradient computation. The partial derivative of a function E with respect to a weight w uses the output of the neural network to calculate the impact of the related weight inputs on the whole network; it can be expressed by Equation 2.3.
∂E
We use Equation 2.4 to calculate back-propagated errors; the calculation differs between the output layer and the hidden layers. For the back-propagated error at the output layer, the target output is required in order to apply the delta rule:

$$\delta = (\mathit{target} - \mathit{output}) \cdot \mathit{output} \cdot (1 - \mathit{output}) \qquad (2.4)$$

For the back-propagated error at the other layers, instead of taking the difference between the target activation value and the actual output to calculate δ, the calculation requires the sum, over all nodes in the next layer, of each back-propagated error multiplied by the respective weight, since every node of the current layer connects to all nodes of the next layer:

$$\delta = \Big( \sum \delta_{(\mathit{next\ layer})} \cdot w \Big) \cdot \mathit{output} \cdot (1 - \mathit{output}) \qquad (2.5)$$

Once the gradient is computed in Equation 2.5, the change of weight (Δw) can be calculated in Equation 2.6 by multiplying it with the learning rate γ. The learning rate is a hyperparameter that controls how much the weights are adjusted in the network with respect to the loss gradient. The lower the learning rate, the slower the movement along the slope when updating the weights, which also means that it takes more time to converge.
Finally, the new weight is calculated by adding to the current weight of the j-th node the change derived from the gradient with respect to that weight, as in Equation 2.7.
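A single back-propagation step can be traced numerically with the sketch below. `output_delta` and `hidden_delta` follow Equations 2.4 and 2.5 directly. Equations 2.6 and 2.7 are not reproduced in this section, so the update in `updated_weight` assumes the common form Δw = γ · δ · input followed by w + Δw; that form, like all function names here, is an assumption rather than the thesis's exact formulation.

```python
from typing import List


def output_delta(target: float, output: float) -> float:
    """Back-propagated error at an output node (Equation 2.4)."""
    return (target - output) * output * (1.0 - output)


def hidden_delta(next_deltas: List[float], weights_to_next: List[float],
                 output: float) -> float:
    """Back-propagated error at a hidden node (Equation 2.5).

    next_deltas[k] is the error of node k in the next layer, and
    weights_to_next[k] is the weight from this node to that node.
    """
    weighted_sum = sum(d * w for d, w in zip(next_deltas, weights_to_next))
    return weighted_sum * output * (1.0 - output)


def updated_weight(weight: float, delta: float, input_value: float,
                   learning_rate: float) -> float:
    """Assumed update rule: w_new = w + learning_rate * delta * input."""
    return weight + learning_rate * delta * input_value
```

With a target of 1.0 and an output of 0.5, the output error is 0.125; a smaller learning rate scales the resulting weight change down proportionally, which is why convergence takes longer.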
real-time environment experiments in HCMUT computer engineering lab will
be introduced
NSL-KDD [49] is chosen as the dataset for the training and inference phases. To run with the Weka tool [50], the dataset must be converted to the .arff format (ARFF stands for Attribute-Relation File Format), an ASCII text file that describes a list of instances sharing a set of attributes. There are 41 features in the dataset; however, because of hardware resource constraints, the 6
classification accuracy. The descriptions of the 6 features are shown in Table 2.2.
Table 2.2: The 6 features descriptions
current connection in the past two seconds
current connection in the past two seconds
We have trained the system using 6 out of the 41 features of the NSL-KDD dataset, as mentioned above, to balance accuracy against model size. The generated models are also tested with the NSL-KDD dataset.
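As an illustration of the ARFF layout that Weka expects, the sketch below serializes a few records into a relation header, attribute declarations and a data section. The helper `to_arff`, the relation name and the attribute names in the usage example are hypothetical; they are not the six features actually selected in this work.

```python
from typing import Iterable, List, Sequence, Tuple


def to_arff(relation: str,
            attributes: Sequence[Tuple[str, str]],
            rows: Iterable[Sequence[object]]) -> str:
    """Serialize records as ARFF text.

    attributes holds (name, type) pairs, where type is an ARFF type such
    as NUMERIC or a nominal set written as {value1,value2}.
    """
    lines: List[str] = [f"@RELATION {relation}", ""]
    for name, kind in attributes:
        lines.append(f"@ATTRIBUTE {name} {kind}")
    lines += ["", "@DATA"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"
```

For example, `to_arff("subset", [("count", "NUMERIC"), ("label", "{normal,anomaly}")], [(2, "normal"), (511, "anomaly")])` yields a file with two attribute declarations followed by two data rows.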
generated packets collected and labelled from the Computer Engineering lab - Ho Chi Minh City University of Technology - VNU-HCM. The packets are captured from a device providing web services at several times of day. Table 2.3 shows the number of packets used for training and testing the proposed system. The normal packets are captured in a usual network state, while the attack packets are SYN flood packets generated by the hping3 tool [52].
Table 2.3: The number of packets in training and testing phases
associated data plane is under the control of attackers, because centralized management is conducted at the controller
communication between forwarding devices and controllers
simultaneously flood network packets with different values so that the switches need to encapsulate and send these packets to the corresponding controller to get the exact behaviors for these packets. With these flooding packets, the communication channel between the controller and the forwarding devices becomes congested. The combination of one central controller and the separation of the control and data planes is the core weakness of the SDN architecture.
The control-data plane interface (Southbound Interface) can be easily broken by
vulnerable point. We noticed that building security functions on the application layer is easier than on other layers, but these functions will consume massive
performance of SDN will be increased if we can detect and pre-process attack
presents different security approaches for SDN which focus on the control and data plane layers. Besides, network intrusion detection/prevention types (DDoS
Figure 2.6: SDN architecture with application, control plane, and data plane layers.
attacks and anomaly-based attacks) that are deployed on hardware platforms are discussed and compared.
Many studies propose deploying security functions in the control plane of an SDN, such as the research in [14; 56]. For instance, SLICOTS [5] builds a lightweight module on the control plane to prevent TCP SYN flooding attacks by monitoring the packet rate and installing temporary forwarding rules until the client is authenticated, in order to mitigate attack packets. Figure 2.7 shows the high-level proposed architecture of SLICOTS.
This work proposes running a special service to monitor the average occupation of their interfaces in order to detect congestion conditions. The associated controller relies on this detection to coordinate the bandwidth assignment of controlled links. Using this approach, the controller can limit the flow transmission rate from the data plane to protect the links from saturation attacks. In a starvation state, the mitigation procedure allocates an average bandwidth, while flows exceeding the mean are penalised. This approach is only simulated and evaluated with a simulation tool.
ROSEMARY [58], which protects controllers only, or PermOF [59], which protects the
Figure 2.7: SLICOTS architecture [5]
application layer only. The above proposals consume massive computing
approaches are totally different from ours because our ultimate goal is to protect the network against attacks without harming controller resources. In other words, we implement security functions at the forwarding devices of an OpenFlow network.
Although implementing SDN switches on FPGA platforms has been investigated for a while, especially for OpenFlow networks (an SDN instance), such as the work in [9; 60; 61], these switches lack security capabilities, including DDoS protection. Several works propose deploying security functions; below are some relevant studies on SYN flood attack prevention and its application on the data plane of the SDN architecture.
network administrators to build up security functions on both the control plane and the data plane by inserting more flow tables on the data plane. However, the switches in this work still need general-purpose processors to perform
software-based security behaviours.
algorithm and repeats the 3-way handshake with target servers once a user is
difference between two SEQ numbers in memory to synchronize the SEQ numbers from the two sides. Instead of sending an RST packet after authenticating the connection, as our proposed mechanism does, connection migration reproduces the SYN packet and performs the TCP three-way handshake with the server. As a result, AVANT-GUARD needs to synchronize the sequence numbers of the connection by storing the difference between the two sequence numbers, and it has to modify every packet belonging to this connection, which produces overhead over time (the memory could be overloaded).
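The per-connection state and per-packet rewriting that connection migration implies can be sketched as follows. This is an illustrative model only, not AVANT-GUARD's implementation: the class `SeqTranslator` and its method names are hypothetical, and only the sequence and acknowledgement number arithmetic is shown.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap around at 32 bits


class SeqTranslator:
    """State kept for one migrated connection.

    proxy_isn is the initial sequence number the switch used when it
    answered the client; server_isn is the one chosen by the real server
    in the second handshake.
    """

    def __init__(self, proxy_isn: int, server_isn: int) -> None:
        # The stored difference between the two SEQ numbers.
        self.offset = (server_isn - proxy_isn) % MOD

    def server_to_client_seq(self, seq: int) -> int:
        """Rewrite the SEQ of a server packet into the client's number space."""
        return (seq - self.offset) % MOD

    def client_to_server_ack(self, ack: int) -> int:
        """Rewrite the ACK of a client packet into the server's number space."""
        return (ack + self.offset) % MOD
```

One such object must live for the whole lifetime of every validated connection, and every packet of the connection passes through one of the two rewrite methods, which is the source of the memory and processing overhead noted above.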
Figure 2.8: AVANT-GUARD proposal [6]
technique [63] and probabilistic blacklisting of network traffic, while the research in [31] proposes preventing SYN flood attacks using the TCP reset technique by inserting a number of switching rules into the flow table.
The main idea in this proposal is to deploy an Authenticator and a RADIUS