Master's thesis in Computer Science: Multi-core Architecture for DoS


CHAPTER 1 INTRODUCTION

Motivation

The twenty-first century is an age of technological explosion, and Information Technology (IT) is among the fastest-moving fields. The Internet, the World Wide Web (WWW), Cloud computing, Big Data and the Internet of Things (IoT) are some of its achievements. The Internet is an important component: it connects people across the world regardless of geographical distance. Because of this advantage, the number of Internet users keeps increasing. With the growth in mobile devices, the number of Internet-connected devices is increasing even faster.

According to statistics from Internetlivestats.com [1], there are more than 3.3 billion Internet users. The growth in connected devices gives attackers more opportunities to spread malicious software, exploit user information and take over devices for attack purposes. An attacker may use compromised devices for denial of service (DoS) or distributed DoS (DDoS) attacks on the Internet.

DoS attacks are network attacks that prevent legitimate users from accessing network resources or services. Attackers perform DoS attacks by consuming network resources, server resources, or both. A DoS attack not only exhausts network resources (e.g. network bandwidth, router processing capability) and/or server resources (e.g. sockets, CPU, memory, disk/database bandwidth and input/output (I/O) bandwidth) but may also exploit vulnerabilities in a protocol or application. A DoS attack is performed from one source, while in reality attackers often perform DDoS attacks from multiple sources through a botnet. A botnet is a group of computers/devices that are taken over and controlled by the attacker through malware. The increase in inter-connected devices may lead to a rise in DDoS attacks.

The survey by Zargar et al. [2] states that DDoS attacks first appeared in the 1980s. Since 1999, many organisations have suffered DDoS attacks. In February 2000, Yahoo! services were offline for about two hours, incurring a loss in revenue. In October 2002, 9 of the 13 Domain Name System (DNS) root servers were shut down for an hour by a DDoS flooding attack. In December 2010, Mastercard.com, PayPal, Visa.com and PostFinance were attacked. In 2012, nine banking sites in the United States of America (U.S. for short) were attacked and several websites hung. DDoS attacks cause financial loss to the victim, including lost revenue and the cost of recovering the system after the attack. For 2013, the Internet Crime Complaint Center (IC3) [3] reported that Internet crime (including DoS attacks) led to a loss of 781,841,611 U.S. dollars, which is reported as 48.8% higher than the 581,441,110 U.S. dollars lost in 2012.

DDoS attacks not only cause financial loss but also waste system resources and power: network devices and servers consume more power while being exhausted by a DDoS attack. Akamai's State of the Internet Security report (Q1 2015) [4] shows that the average bandwidth consumed by DDoS attacks is 5.95 Gbps. The report also shows that 8 DDoS attacks consumed more than 100 Gbps of bandwidth. For 2014, the Arbor Networks report [5] recorded the largest DDoS attack, which consumed up to 325.05 Gbps of bandwidth (Figure 1.1). DDoS attacks are therefore increasing in both quantity and scale, which is a challenge for Internet security.

Figure 1.1: Peak DDoS attacks month-by-month

Attackers always want to hide their identity while performing attacks. A hidden identity not only keeps the attacker from being tracked but also makes the attack difficult to mitigate. The attacker uses a technique called Internet Protocol (IP) spoofing to hide the source of the attack. IP spoofing lets the attacker change the source IP address of a packet to a value different from its original address, and thereby indirectly hides the attacker's real address and identity. IP spoofing is used in most DDoS attacks, especially in reflection-based and amplification-based attacks, which consume large amounts of bandwidth and are hard to mitigate. A router implementing the network routing protocol only checks the destination IP address of a packet and leaves the source IP address unverified. This weakness has allowed IP spoofing to grow. The "Spoofer Project" [6] shows that 13.5% of the IP address space is spoofable (Figure 1.2). This percentage is the highest value since 2006, and it is on an upward trend. The largest attack in 2014, which consumed 325.05 Gbps of bandwidth, combined reflection and amplification techniques.

Figure 1.2: Spoofable IP address space - Spoofer Project
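The observation that routers forward on the destination address alone can be illustrated with a minimal sketch. The forwarding table, the port names and the example addresses below are hypothetical and are not taken from the thesis; the lookup is a plain longest-prefix match written with Python's standard `ipaddress` module.

```python
import ipaddress

# Hypothetical forwarding table: (destination prefix, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "upstream"),
    (ipaddress.ip_network("192.0.2.0/24"), "port-1"),
    (ipaddress.ip_network("198.51.100.0/24"), "port-2"),
]

def next_hop(dst: str) -> str:
    """Longest-prefix match on the destination address only."""
    addr = ipaddress.ip_address(dst)
    matches = [(net.prefixlen, hop) for net, hop in ROUTES if addr in net]
    return max(matches)[1]

def forward(packet: dict) -> str:
    # The source field is never read, so a forged value changes nothing.
    return next_hop(packet["dst"])

genuine = {"src": "203.0.113.7", "dst": "192.0.2.10"}
spoofed = {"src": "198.51.100.99", "dst": "192.0.2.10"}

# Both packets are forwarded identically although one source is forged.
print(forward(genuine), forward(spoofed))
```

Because `forward` never inspects `packet["src"]`, any source-address validation has to be added as a separate filtering step.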

Therefore, quick identification and mitigation of DDoS attacks prevents financial loss and saves system resources. This thesis focuses on mitigating DDoS attacks that use the IP spoofing technique. It shows how IP spoofing works through some specific case studies, and it proposes a novel architecture to identify and counter IP spoofing DDoS attacks.

Thesis Objective

The research aims to find a solution that quickly detects and mitigates DDoS attacks in current high-speed network systems. The solution should react flexibly to different kinds of DDoS attack in the future by applying new DDoS countering mechanisms. To meet those demands, the system must deliver high performance, support the cooperation of multiple mechanisms, and allow its filtering mechanisms to be changed or updated. The proposed architecture should overcome the weaknesses of current mechanisms and meet those requirements in order to handle future variants of DDoS attacks. Reconfigurable hardware such as the field programmable gate array (FPGA) platform is a suitable choice for developing the system. The system needs to have the following features:

• It can combine multiple DDoS countering mechanisms so that they work together.

• A DDoS countering mechanism can be changed or updated to adapt to and mitigate new variants of DDoS attack.

• Its operating throughput is at least 10 Gbps so that it can work with high-speed network systems.

Thesis Structure

The thesis is organised into six chapters, outlined below.

• Chapter 1 introduces the security challenges of the Internet and the impact of DDoS attacks. It gives an overview of the thesis and presents the motivation and objective of the research.

• Chapter 2 presents the background on networking models and several commonly used protocols. It reviews relevant research on DoS/DDoS, including DDoS classification and attack and defence mechanisms. It also presents the FPGA and some development platforms supported by manufacturers.

• Chapter 3 presents the proposed multicore architecture and its components. This chapter explains how the components communicate and shows how the architecture works in theory.

• Chapter 4 covers the implementation of a prototype system based on the proposed architecture. This chapter shows how the prototype system and the architecture are implemented on the NetFPGA 10G platform.

• Chapter 5 presents the experiments on the prototype system, including the environment setup and the experimental results. The experiments are conducted several times to make sure the prototype system works stably and the results are consistent.

• Chapter 6 concludes the research. This chapter states the achievements of the thesis and the limitations of the work. It also presents considerations for optimising the prototype in future work.

• Appendix lists the publications that were accepted and/or published while conducting the research.

Summary

This chapter has presented the challenges of Internet security and the motivation for conducting the research. It has also stated the objective of the thesis. In the next chapter, network technology and related work relevant to DDoS attacks are discussed in detail.

CHAPTER 2 BACKGROUND AND RELATED WORK

This chapter presents the background and history of the network system. Several popular protocols are discussed, along with how they are exploited to perform attacks. This chapter also shows the classification of DDoS attacks and defences with some case studies.

Background

Transmission Control Protocol/Internet Protocol Network Model

In 1969, the Advanced Research Projects Agency Network (ARPANET) [7] was founded and funded by the Advanced Research Projects Agency (ARPA), which later renamed itself the Defense Advanced Research Projects Agency (DARPA). The ARPANET was an early packet-switching network and the first network to implement the Internet protocol suite.

The TCP/IP model is a networking model for network communication systems. The IP [8] is the principal communication protocol of the Internet protocol suite for use in interconnected systems of packet-switched communication networks. The IP is responsible for delivering packets from the source host to the destination host across interconnected networks.

Because IP is not reliable, the Transmission Control Protocol (TCP) was developed on top of IP to provide reliability. TCP/IP then became the core of the ARPANET and, later, the Internet.

The TCP/IP model has four layers: the Network Access layer, the Internet layer, the Transport layer and the Application layer.

• Network Access layer is responsible for connecting a host to the local network. It includes the protocols used to describe the local network topology, such as Ethernet, token ring, frame relay and ATM, and the interfaces needed to transmit internet layer datagrams to neighbouring hosts. This layer delivers data as bits to the network medium, such as wireless, coaxial cable or optical fiber.

• Internet layer establishes a host-to-host connection. It uses IP addresses to identify hosts and routes packets to the destination based on the IP address. The Internet layer packages data into IP datagrams, which contain the source and destination address information used to forward the datagrams between hosts and across networks. The IP is implemented in this layer.

• Transport layer provides communication session management between host computers. This layer defines the level of service and the status of the connection used when transporting data. The TCP is implemented in this layer to provide reliability. The combination of TCP and IP is the core of a reliable network such as the ARPANET.

• Application layer defines application protocols and how host programs interface with transport layer services to communicate over the network. Several well-known protocols operating in this layer are Telnet, File Transfer Protocol (FTP), Domain Name System (DNS), Simple Mail Transfer Protocol (SMTP) and Hyper Text Transfer Protocol (HTTP).
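The way each layer in the list above wraps the data handed down by the layer above it can be sketched as follows. The headers are deliberately toy-sized (ports only, addresses only) and the function names are hypothetical; real TCP, IP and Ethernet headers carry many more fields.

```python
import socket
import struct

def transport_wrap(payload: bytes, src_port: int, dst_port: int) -> bytes:
    """Toy transport header: two 16-bit ports (a real TCP header holds far more)."""
    return struct.pack("!HH", src_port, dst_port) + payload

def internet_wrap(segment: bytes, src_ip: str, dst_ip: str) -> bytes:
    """Toy internet header: source and destination IPv4 addresses only."""
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + segment

def link_wrap(datagram: bytes, src_mac: bytes, dst_mac: bytes) -> bytes:
    """Toy network-access header: destination and source MAC addresses."""
    return dst_mac + src_mac + datagram

message = b"GET / HTTP/1.1\r\n\r\n"                  # application layer data
segment = transport_wrap(message, 49152, 80)         # transport layer
datagram = internet_wrap(segment, "192.0.2.1", "198.51.100.5")  # internet layer
frame = link_wrap(datagram, bytes(6), b"\xff" * 6)   # network access layer

# Each layer only prepends its own header; the payload is carried unchanged.
print(len(message), len(segment), len(datagram), len(frame))
```

The source address is written into the internet-layer header by the sender itself, which is why it can be set to an arbitrary value.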

Open System Interconnect Network Model

Before the Open Systems Interconnection (OSI) model was developed, several networking protocols already existed, including government-sponsored, vendor-developed and proprietary standards. The OSI model was developed to standardise those existing protocols and make them inter-operate.

While the OSI model was being developed, the TCP/IP model was widely adopted and became an industry standard for network communication. The OSI model then became a reference model for teaching.

The OSI reference model has seven layers [9]: the Physical, Data Link, Network, Transport, Session, Presentation and Application layers.

• Physical layer, which is the lowest layer of the OSI model, is responsible for transmitting and receiving unstructured raw bit streams over a physical medium such as wireless, fiber optic or copper cable. It defines the electrical/optical signals that represent the digital signal patterns (1s and 0s) used by computer systems.

• Data Link layer provides error-free transfer of data frames from one host to another over the physical layer. This layer establishes and terminates the logical link between two nodes, controls frame traffic and error checking, and provides media access management to determine when a node has the right to use the physical medium.


• Network layer provides host-to-host connection establishment, routes frames among networks, translates logical IP addresses into physical addresses of the data link layer, and provides frame fragmentation and reassembly based on the router's maximum transmission unit (MTU) size.

• Transport layer ensures that messages are delivered in sequence, error-free, and with no loss or duplication. It relieves the higher-layer protocols from any concern with the transfer of data between them and their peers.

• Session layer allows session establishment between processes running on different stations. It manages and supports session communication over the network. It also performs security functions on sessions.

• Presentation layer formats the data to be presented to the application layer. At the sending station, this layer supports format translation, character code translation, data conversion and compression/encryption from the format used by the application layer into a common format, and vice versa at the receiving station.

• Application layer serves as a window for users and application processes to access network services.

Table 2.1 shows the differences between the TCP/IP model and the OSI model.

Table 2.1: TCP/IP and OSI model comparison

TCP/IP Model         | OSI Model          | Common Protocol
Application Layer    | Application Layer  | FTP, HTTP, SMTP, IMAP, POP3, NFS
                     | Presentation Layer | TLS, SSL
                     | Session Layer      | NetBIOS, RPC, SCP
Transport Layer      | Transport Layer    | TCP, UDP, SCTP
Internet Layer       | Network Layer      | IPv4, IPv6, ICMP, ARP, IPSec
Network Access Layer | Data Link Layer    | Ethernet, ATM, FDDI, Frame Relay
                     | Physical Layer     | 802.11 (Wireless), Bluetooth, DSL, ISDN

Related Work

DoS/DDoS Attacks Classification

DoS/DDoS attacks aim to prevent legitimate users from accessing network resources and services. They have been classified into network-level and application-level attacks [2]. Network-level DDoS attacks operate from the data link layer to the transport layer of the OSI network model, while application-level DDoS attacks exploit vulnerabilities of the upper network layers. Network-level DDoS attacks consume network resources, while application-level DDoS attacks consume server resources.

Network-level DDoS attacks often exhaust network resources. They are further classified into four subclasses: flooding attacks, protocol exploitation flooding attacks, reflection-based flooding attacks and amplification-based flooding attacks.

• Flooding attacks: This kind of attack focuses on disrupting legitimate users' connectivity by exhausting the victim's network resources, for example through an Internet Control Message Protocol (ICMP) flood, Domain Name System (DNS) flood or User Datagram Protocol (UDP) flood. ICMP [10] is a network protocol used to check network connectivity. It must be implemented in every IP network module to provide feedback on network connectivity. A familiar instance is the ping command found in every Operating System (OS). When an abnormally large number of ICMP messages arrive at a host, the host can be overwhelmed by the requests; this is called an ICMP flood or ping flood. UDP and DNS flooding attacks operate in the same way as the ICMP flooding attack but use a different protocol.

• Protocol exploitation flooding attacks: The attacker exploits specific protocol features or implementation bugs, sending malformed packets to confuse the victim's system. The attacker often performs TCP SYN and TCP SYN/ACK floods in this kind of attack. Some examples of this kind of attack follow.

– Ping of death (PoD): The attacker exploits the IP specification [8], which supports a maximum packet size of only 65,535 bytes, to perform a PoD attack [11] by sending oversized packets. An oversized packet causes the victim system to hang. This vulnerability has been patched in modern OSs.

– Teardrop [11]: The attacker sends the victim a packet that has been split into IP fragments whose offset values in the header overlap. The victim's machine crashes while re-assembling those fragments. Recent OSs and network devices handle such attacks, so teardrop attacks no longer affect any layer of network devices. Figure 2.1 describes in more detail how the Teardrop attack works.

Figure 2.1: Teardrop attack

– TCP SYN flood: TCP is a connection-oriented protocol. A TCP session starts with a three-way handshake. First, a legitimate user sends a connection request with a synchronisation (SYN) message to the server. The server then acknowledges the SYN message by sending a SYN-ACK message back to the legitimate user. Finally, the legitimate user sends an ACK message to the server to establish the connection session. Figure 2.2(a) shows the three-way handshake step by step. The attacker exploits the three-way handshake by sending a large number of SYN requests without sending the ACK messages that would complete the handshake. The server keeps waiting for those ACK messages to complete the pending connections, which leaves it unable to process legitimate requests. The SYN flood attack can be carried out by sending packets with a spoofed address. Figure 2.2(b) describes this kind of attack.

• Reflection-based flooding attacks: Instead of attacking the victim directly, the attacker sends spoofed packets to reflectors, whose responses are then sent to the victim and flood it (e.g., Smurf attack, Fraggle attack). In a reflection-based flooding attack, the attacking packets are spoofed. The Smurf attack [11] is an example of this kind of attack. The attacker sends ICMP echo requests whose destination IP address is a broadcast address. These requests are spoofed so that their source IP address is the victim's IP address. The router that receives these packets delivers them to the clients (exploited as reflectors) that belong to the specified broadcast address. As a result, the victim is flooded with responses from the reflectors. The Smurf attack applies multiple techniques: IP spoofing, amplification and reflection. Figure 2.3 shows how the Smurf attack works. The Smurf attack can be prevented by disabling IP directed broadcast on the network routers. The Fraggle attack works in the same way, but it uses UDP instead of ICMP.

Figure 2.3 (labels): Victim; ICMP echo, src IP = 19.10.a.b, dst IP = 19.10.255.255; ICMP reply, src IP = 19.10.x.y, dst IP = 19.10.a.b

• Amplification-based flooding attacks: Attackers exploit services that respond with a large message or multiple messages to amplify the traffic towards the victim. Reflection-based and amplification-based flooding attacks work in tandem. Botnets have been employed for both of these types of attack. A botnet is a group of computers employed and controlled through malware by the attacker. Smurf is a good example of this kind of attack. When the attacker sends a spoofed ICMP echo packet whose destination address is a broadcast address, all hosts that belong to that broadcast group may reply to the victim. The number of responses is much larger than the single initial request. The factor indicating how much larger the response is than the initial request is called the amplification factor.
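The TCP SYN flood described in the list above can be illustrated with a toy model of the server's table of half-open connections. This is a sketch under simplifying assumptions: the class name and the fixed capacity are hypothetical, and the timeouts and retransmissions of a real TCP stack are left out.

```python
class ListenQueue:
    """Toy model of a server's table of half-open TCP connections."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.half_open = set()     # clients that sent SYN but not yet ACK
        self.established = set()   # clients that completed the handshake

    def on_syn(self, client) -> bool:
        """Return True if a SYN-ACK is sent, False if the SYN is dropped."""
        if len(self.half_open) >= self.capacity:
            return False
        self.half_open.add(client)
        return True

    def on_ack(self, client) -> bool:
        """Complete the handshake for a client that has a pending SYN."""
        if client in self.half_open:
            self.half_open.remove(client)
            self.established.add(client)
            return True
        return False

server = ListenQueue(capacity=4)

# A legitimate client completes all three steps and frees its slot.
alice = ("192.0.2.10", 40000)
server.on_syn(alice)
server.on_ack(alice)

# The attacker sends SYNs from spoofed addresses and never sends an ACK.
for i in range(4):
    server.on_syn((f"203.0.113.{i}", 1234))

# The table is now full, so the next legitimate SYN is dropped.
bob = ("192.0.2.11", 40001)
bob_accepted = server.on_syn(bob)
print(bob_accepted)
```

Since the SYN-ACK replies go to the spoofed addresses, the attacker never has to handle them, which is what makes spoofing attractive for this attack.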

Application-level attacks exhaust server resources or exploit vulnerabilities of application protocols and application code. The attacker often exploits stateless protocols such as DNS and the Network Time Protocol (NTP) for this kind of attack. A DDoS attack that applies the reflection technique is also called a Distributed Reflection DoS (DRDoS) attack. The attacker often performs the DDoS attack from a botnet. DDoS mitigation is even more challenging because botnets are so widely used to perform attacks.

• DNS amplification attack [2]: DNS is a network name service which resolves a domain name to an IP address and vice versa. A DNS query message is as small as 64 bytes, but its response is much bigger. The DNS response message may contain information about the queried domain, its child domains and their services. The attacker exploits this asymmetry to perform a DDoS attack. The attacker sends DNS queries to DNS servers with the recursive query option, which requires the DNS server to return all related information about the queried domain and its child domains. These packets are spoofed so that their source IP address is the victim's address. Therefore, the victim is flooded with huge responses from the DNS servers. DNS amplification DDoS attacks have been studied before [12]. The largest DNS amplification attack, against Spamhaus, had an amplification factor of up to 200x and reached 300 Gbps [12] [13]. This kind of attack can be mitigated by disabling recursive queries on DNS servers to avoid exploitation.

• NTP amplification attack: NTP is a UDP-based protocol that supports clock synchronisation. NTP servers originally supported a monlist command for monitoring purposes only. When receiving a monlist command, an NTP server returns the IP addresses of up to the last 600 clients that have contacted it. The returned packets are up to 206 times larger than the request [14]. To perform an attack, the attacker sends packets carrying a monlist command to NTP servers that support it. Those packets are forged so that their source IP address is the victim's address. As a result, the victim is flooded with huge packets returned from the NTP servers.
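The amplification factor that drives both attacks above is simple arithmetic, sketched here. The 64-byte query size and the 206x NTP factor come from the text; the 3,200-byte DNS response and the 10 Mbps attacker bandwidth are hypothetical values chosen only for illustration.

```python
def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    """How many times larger the reflected response is than the request."""
    return response_bytes / request_bytes

def victim_bandwidth_mbps(attacker_mbps: float, factor: float) -> float:
    """Traffic arriving at the victim for a given attacker sending rate."""
    return attacker_mbps * factor

# DNS: a 64-byte query (from the text) and a hypothetical 3,200-byte response.
dns_factor = amplification_factor(64, 3200)

# NTP monlist: factor of up to 206 (from the text), hypothetical 10 Mbps attacker.
ntp_traffic = victim_bandwidth_mbps(10, 206)

print(dns_factor, ntp_traffic)
```

The same calculation shows why a modest botnet can saturate a much larger link once reflectors with a high factor are available.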

DoS/DDoS Defence Mechanisms & Classification

DDoS defence mechanisms are countermeasures to mitigate DDoS attacks. Following the DDoS attack classification, DDoS defence mechanisms are classified into network-level and application-level mechanisms [2]. A network-level defence mechanism is deployed to mitigate DDoS attacks at the network layers. Based on deployment location, these mechanisms are further categorised into source-based, network-based, destination-based and hybrid mechanisms. An application-level defence mechanism is deployed to mitigate DDoS attacks that exploit application layer vulnerabilities. Each mechanism is discussed in detail below.

• Source-based mechanism: These mechanisms are deployed near the source of the attack to prevent a customer network from generating DDoS attacks. They are deployed at the access router of the source's local network or at the access routers of an autonomous system (AS) that connects to the source's edge routers. Port Ingress/Egress Filtering (PIEF) was proposed by Ferguson [15] to filter spoofed packets. The names ingress and egress depend on the deployment position. Ingress filtering is deployed to filter inbound traffic: spoofed packets are blocked when they pass through this filter. Egress filtering filters outbound traffic to ensure that spoofed or malicious packets never leave the internal network. The IPSec protocol can eliminate IP spoofing by authenticating the source address before delivering packets, but this method is not widely used because of its high overhead.

• Destination-based mechanism: Destination-based mechanisms are deployed near the victim to detect and mitigate DDoS attacks. The Management Information Base (MIB) [16] provides a method to monitor network traffic and routing statistics, and can be used at the victim side to detect DDoS attacks. Wang et al. [17] proposed a method named Hop-Count Filtering (HCF) to filter spoofed packets based on the number of hops that packets traverse before arriving at the victim. Each packet travelling on the network has its own initial Time-To-Live (TTL) value. When a packet traverses a router (hop), its TTL value is decreased by one before it is forwarded to the next hop. Packets whose TTL reaches zero are dropped, and the router sends a "TTL Exceeded in Transit" message to the source. Because the TTL is decremented by the routers along the path rather than by the sender, an attacker cannot easily forge a packet's hop count. The hop count is calculated by comparing the initial TTL to the final TTL value observed when the packet arrives at the destination. While the system is not under attack, IP addresses and their hop counts are collected into IP-to-Hop-Count (IP2HC) tables. When a DDoS attack occurs, each packet's IP address and hop count are compared with the IP2HC table. If they match, the packet is legitimate; otherwise, it is a spoofed packet and is dropped. The paper claimed that HCF can identify 90% of spoofed packets. Ritu et al. [18] combined probabilistic filtering and round-trip time in Distributed Probabilistic HCF-Round Trip Time (DPHCF-RTT). Packets are checked once by intermediate DPHCF-RTT routers (nodes) and then forwarded to the victim. The more intermediate routers implement the scheme, the higher the detection rate of malicious packets. The paper claimed a detection rate of up to 99.33%.

• Network-based mechanism: These mechanisms are mainly deployed on the routers of the ASs. They detect attack traffic and create a proper response to stop it at the intermediate network level. Detecting and filtering malicious routers [19] is one method of this kind. Watchers [2] detects misbehaving routers that have been exploited by an attacker to support a DDoS attack, for example by misrouting packets. The route-based packet filtering method [20] extends PIEF to the routers in the core of the Internet to filter malicious packets.

• Hybrid (Distributed) mechanism: No single mechanism mitigates DDoS attacks effectively. DDoS attack traffic can be detected accurately once it reaches the destination, but mitigation there is not effective. At the source of the attack, DDoS traffic is hard to detect, but it can be stopped completely there. Hybrid (distributed) mechanisms are therefore studied to make the other mechanisms cooperate and mitigate DDoS effectively. Active Internet Traffic Filtering (AITF) [21] is a hybrid mechanism that enables a receiver to deny all traffic by default and accept only the traffic that belongs to established connections. Alternatively, the receiver can accept all traffic by default and deny only the traffic identified as malicious or undesirable. This method needs the cooperation of all Internet service providers (ISPs) to receive requests from the receiver and filter traffic at its source. StopIt [22] is a hybrid mechanism that enables each receiver to stop an attack at the source. Each AS installs a server, named the StopIt server, that provides a service for filtering malicious traffic when it receives requests from hosts or from other StopIt servers. When a receiver receives malicious traffic, it sends a StopIt request, containing the traffic's source and destination information, to the StopIt server located in its own AS. This StopIt server forwards the StopIt request to the StopIt server of the AS from which the traffic comes. The StopIt server of the source AS then asks the router connected to the host generating the traffic to block that traffic. This mechanism needs the cooperation of all ASs on the Internet. Figure 2.4 shows how the StopIt mechanism works.

• Destination-based (Server side) mechanism: Most application-layer protocols are organised as a client-server model. A server is a process that implements a specific service (e.g., DNS server, Web server, Email server). A client is a process that requests a service from a server. Therefore, most of these mechanisms are deployed close to the server and observe it to monitor clients' behaviour, so that they can detect and drop, or rate-limit, malicious requests. DNS Amplification Attacks Detector (DAAD) [23] is an example of this mechanism. DAAD is a proactive mechanism to detect potential DNS amplification attacks. This mechanism collects DNS requests and replies using the IPTraf tool [24]. DAAD then stores the captured data in a MySQL database for classification. If a reply does not match any request within a time frame, it is a suspicious packet and a firewall rule is updated to filter the attacker's IP address.

• Hybrid (Distributed) mechanism: This type of mechanism employs collaboration between client and server to detect and react to attacks. For instance, Kandula et al. propose a system to protect a web cluster from application DDoS attacks. This system employs the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) [25]. This mechanism differentiates DDoS flooding bots from humans by requesting the client to solve a puzzle. If the puzzle is solved, the client is a legitimate user. Otherwise, the client is suspicious and is filtered.
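The request/reply matching rule described for DAAD above can be sketched in a few lines. This is not DAAD's implementation: the in-memory dictionary stands in for the MySQL database, the `blocked` set stands in for the firewall rule update, and the class name, the use of a query identifier as the matching key and the window length are all assumptions of this sketch.

```python
class ReplyMatcher:
    """Flag DNS replies that do not answer a recent outgoing request."""

    def __init__(self, window_seconds: float):
        self.window = window_seconds
        self.pending = {}      # (server_ip, query_id) -> time the request was sent
        self.blocked = set()   # stands in for firewall rules

    def on_request(self, server_ip: str, query_id: int, now: float) -> None:
        self.pending[(server_ip, query_id)] = now

    def on_reply(self, server_ip: str, query_id: int, now: float) -> bool:
        """Return True for a solicited reply, False for a suspicious one."""
        sent = self.pending.pop((server_ip, query_id), None)
        if sent is not None and now - sent <= self.window:
            return True
        self.blocked.add(server_ip)
        return False

matcher = ReplyMatcher(window_seconds=2.0)
matcher.on_request("198.51.100.53", 0x1A2B, now=10.0)

# A reply to a request that was actually sent, inside the time frame.
solicited = matcher.on_reply("198.51.100.53", 0x1A2B, now=10.4)

# A reply that matches no request: suspicious, and its source is blocked.
unsolicited = matcher.on_reply("203.0.113.9", 0x9999, now=10.5)

print(solicited, unsolicited, matcher.blocked)
```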

Field Programmable Gate Array

A Field Programmable Gate Array (FPGA) is a semiconductor device based around a matrix of configurable logic blocks (CLBs) connected via programmable interconnects [26]. It is an integrated circuit that can be configured by the customer or designer after manufacturing. The three major components of an FPGA are:

• Logic block: It is also known as a CLB. A CLB includes lookup tables (LUTs) to implement combinatorial logic, registers for sequential circuits, and additional logic such as multiplexers. Each LUT has multiple inputs so that it can combine multiple parameters.

• Input/Output block: These blocks are responsible for connecting and communicating with external components or devices.

• Interconnection switch: These switches can be programmed to connect or disconnect CLBs, I/O blocks and other components.
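How a LUT implements combinatorial logic, as mentioned in the logic block item above, can be modelled in software: the configuration bits are simply the truth table, and the inputs select one entry. The class below is a hypothetical illustration and does not describe any particular FPGA family.

```python
class LUT:
    """k-input lookup table: the configuration bits are the truth table."""

    def __init__(self, num_inputs: int, config_bits):
        if len(config_bits) != 2 ** num_inputs:
            raise ValueError("need 2**num_inputs configuration bits")
        self.num_inputs = num_inputs
        self.config_bits = list(config_bits)

    def evaluate(self, *inputs: int) -> int:
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        index = 0
        for bit in inputs:            # first input is the most significant bit
            index = (index << 1) | (bit & 1)
        return self.config_bits[index]

# The same structure becomes XOR or AND purely through its configuration.
xor2 = LUT(2, [0, 1, 1, 0])
and2 = LUT(2, [0, 0, 0, 1])

print(xor2.evaluate(1, 0), and2.evaluate(1, 0))
```

Loading different configuration bits into the same table changes the logic function without changing the hardware, which is the property reconfiguration relies on.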

An FPGA may also contain other blocks such as memory, clock distribution, digital signal processors (DSPs), embedded microprocessors/microcontrollers and high-speed serial transceivers.

Figure 2.5 shows the basic components of an FPGA device.

Figure 2.5: FPGA device components (Configurable Logic Block, Input/Output Cell, Interconnection Resource)

An FPGA is more flexible than an application-specific integrated circuit (ASIC). Both implement application-specific logic, but an FPGA can be programmed by users after manufacturing, whereas the function of an ASIC is fixed by the manufacturer's experts and cannot be re-programmed after manufacturing. An FPGA not only takes advantage of hardware-based high-speed parallel processing but also offers the flexibility of software-like programmability. FPGAs are designed and programmed using a hardware description language (HDL) such as Verilog or very high speed integrated circuit (VHSIC) HDL (VHDL).

An FPGA device is configured by loading application-specific configuration data, called a bitstream, into its internal configuration memory. Partial reconfiguration (PR) is the modification of the configuration memory of an operating FPGA by loading a partial configuration file. With the rapid development of technology, FPGAs now allow dynamic partial reconfiguration (DPR), which means that some parts of an FPGA device can be reconfigured at runtime while other parts keep working. This runtime reconfiguration allows systems to be updated while still operating. The DPR design flow partitions the configuration memory into static logic and reconfigurable logic [27]. During DPR, the static logic keeps functioning while the reconfigurable logic is modified by the partial configuration file. In this research, DPR is applied to change and update DDoS countering mechanisms in order to adapt to future security challenges.
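The split between static and reconfigurable logic can be pictured with a purely software analogy. Everything in it is hypothetical: `reconfigure` merely stands in for loading a partial configuration file, the defence functions are placeholders, and the rule that a packet must be accepted by every core is an assumption of this sketch rather than something stated in the thesis.

```python
class ReconfigurableSystem:
    """Software analogy of static logic plus reconfigurable regions."""

    def __init__(self, static_stage, slots):
        self.static_stage = static_stage   # never replaced at runtime
        self.slots = dict(slots)           # region name -> defence function

    def reconfigure(self, slot: str, new_core) -> None:
        """Stand-in for loading a partial configuration file into one region."""
        if slot not in self.slots:
            raise KeyError(slot)
        self.slots[slot] = new_core

    def process(self, packet: dict) -> bool:
        """Return True if the packet is accepted by every defence core."""
        packet = self.static_stage(packet)
        return all(core(packet) for core in self.slots.values())

def decode(packet):                # static stage, e.g. header decoding
    return dict(packet, decoded=True)

def ttl_check(packet):             # placeholder defence core
    return packet["ttl"] > 0

def allow_all(packet):             # placeholder core before the update
    return True

def block_source(packet):          # placeholder core loaded at runtime
    return packet["src"] != "203.0.113.9"

system = ReconfigurableSystem(decode, {"core1": ttl_check, "core2": allow_all})
pkt = {"src": "203.0.113.9", "ttl": 57}

before = system.process(pkt)
system.reconfigure("core2", block_source)   # core1 and decode are untouched
after = system.process(pkt)

print(before, after)
```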

The NetFPGA Platform

NetFPGA is an open-source hardware and software platform designed for research and teaching [28]. It allows researchers, developers and students to build prototypes of high-speed, hardware-accelerated networking systems on its supported platforms. These platforms, called NetFPGA platforms, are built upon FPGA technology supported by the manufacturer. There are several NetFPGA platforms, such as NetFPGA 1G, NetFPGA CML, NetFPGA 10G and NetFPGA SUME. This research is implemented on NetFPGA 10G.

NetFPGA 10G [29] is a NetFPGA platform based on the Virtex-5 FPGA chipset supported by Xilinx. NetFPGA 10G is an x8 generation 2 PCI-Express board with four 10 Gbps SFP+ interfaces. It is bundled with a Virtex-5 TX240T FPGA. Table 2.2 shows the detailed specification of the NetFPGA 10G board.

Block RAM/FIFO w/ECC (36 Kbits each): 324
Phase Locked Loop (PLL)/PMCD: 6
RocketIO™ GTX High-Speed Transceivers: 48

Summary

Chapter 2 presents the background knowledge on which the research is based. It discusses the TCP/IP and OSI network models, and a comparison chart shows the differences between the two. The related work section discusses existing DDoS research. DDoS attacks, defence mechanisms and their classification are also presented.

FPGA and the NetFPGA 10G platform are briefly introduced. They are the base platform for the architecture proposed in the next chapter.

Chapter 3: Proposed System Architecture

This chapter presents the proposed FPGA-based multicore architecture to integrate multiple DDoS countermeasure mechanisms. Besides the input and output ports, the system is partitioned into a static partition and a dynamic partition. The static region implements the base components of the system. The dynamic region contains multiple DDoS defence cores and the Defence Decision module. Figure 3.1 shows the proposed architecture.

Figure 3.1: The proposed FPGA-based multicore architecture (labels recovered from the figure: Packet Decoder, Defense Decision, Defense Core 1)

Static Partition

Input Arbiter

The Input Arbiter module delivers raw packets from multiple input ports to the Packet Decoder. It serialises packets that arrive in parallel, so that the Packet Decoder can process all packets one after another. An input queue stores the chain of packets before they reach the Packet Decoder. There are several strategies for delivering packets to the input queue, such as round robin and busy-port priority. Round robin is a simple way to move packets from the input ports to the input queue: it picks one packet from one port at a time, then rotates to the next port for the next pick.

The busy-port priority strategy allows the busiest port to be served first: if a port is busy, it is prioritised when picking the next packet. The busy port can be identified by analysing the previous port statistics. The statistics that can be used to determine priority are packets per second or megabits per second. Busy-port priority appears harder to implement than round robin.
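The two strategies can be illustrated with a small software model. The sketch below is written in Python rather than HDL and is not the thesis implementation; the function names are hypothetical, and queue length is assumed as the "busy" statistic in place of the packets-per-second or megabits-per-second counters mentioned above.

```python
from collections import deque


def round_robin_arbiter(port_queues):
    """Merge per-port packet queues into one sequence, one packet per port per turn."""
    queues = [deque(q) for q in port_queues]
    merged = []
    while any(queues):            # an empty deque is falsy
        for q in queues:          # visit the ports in a fixed rotation
            if q:
                merged.append(q.popleft())
    return merged


def busy_port_arbiter(port_queues):
    """Always serve the port with the most waiting packets.

    Assumption: queue length stands in for the port statistic;
    ties go to the lowest port index.
    """
    queues = [deque(q) for q in port_queues]
    merged = []
    while any(queues):
        busiest = max(range(len(queues)), key=lambda i: len(queues[i]))
        merged.append(queues[busiest].popleft())
    return merged


ports = [["a1", "a2", "a3"], ["b1"], ["c1", "c2"]]
print(round_robin_arbiter(ports))
print(busy_port_arbiter(ports))
```

Round robin needs only a rotating port index, whereas the busy-port variant has to maintain and compare a statistic per port, which is consistent with the remark that it is harder to implement.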

Output Arbiter

The Output Arbiter module is responsible for sending packets out to the network. This module receives a packet and decides the destination to which the packet is forwarded. The Output Arbiter picks packets from the Packet FIFO one by one, checks each one and delivers it to the right destination port. Depending on the implementation, this module can be considered a routing module.

Packet Decoder

This module decodes incoming packets and extracts their header and payload fields. The header is then sent to the defence cores, located in the dynamic partition, that implement the DDoS defence mechanisms. The number of header fields that need to be extracted depends on the implemented defence cores. General header fields include the source IP address, destination IP address, source port, destination port and TTL value. For defence cores that require more fields from the packet header, the feature extraction can be implemented in the Packet Decoder module. Some anomaly-based DDoS defence mechanisms need to calculate packet statistics, which can also be integrated into the Packet Decoder. The raw incoming packets are then forwarded to and stored in the Packet FIFO module while waiting for the classification results from the DDoS defence mechanisms.

Packet FIFO

In high-speed network processing systems, while one incoming packet is being processed, others keep arriving continuously or simultaneously at different ports. Moreover, processing a packet takes time, so the packet needs to be stored somewhere while it waits for decisions from the processing cores. There are two approaches to processing packets. In the first approach, the packet is decoded into separate fields (header and payload). Those fields are then sent to the defence cores for classification. If the packet is malicious, its separated fields are discarded. If the packet is legitimate, its separated fields are encapsulated again and sent out to the network. This approach places a heavy load on the system because it needs to manage and store multiple pieces of the packet and spend more resources to pack them. Moreover, when a packet is encapsulated, the packet checksum needs to be calculated, and this process also takes time. Figure 3.2 shows how the first approach works.

Figure 3.2: The first approach for processing packet (stages labelled in the figure: unpack Transport, Network and Data Link layer headers; pack Transport, Network and Data Link layer headers)

In the second approach, which is chosen in this work, the Packet FIFO stores the incoming packets. The size of this FIFO depends on the system performance (throughput) as well as the network speed. The higher the throughput of the system, the smaller the FIFO that is required. Therefore, the size of this FIFO varies from one system implementation to another, and the FIFO can be implemented in on-chip or off-chip memory depending on its size. Although the module is fixed at design time, it cannot be considered as a module of the static partition (as depicted in Figure 3.1) because this module can be implemented in on-chip memory.

Packet Processing

The Packet Processing module receives decisions from the Defense Decision module located in the dynamic partition. A decision is either bypass, which allows the corresponding packet to pass to the output ports, or drop, which signals that the corresponding packet belongs to a DDoS attack. Based on these decisions, the Packet Processing module sends the corresponding packet from the Packet FIFO module to the destination output port if the packet is legitimate. Otherwise, the packet is deleted from the Packet FIFO.

Dispatch Interface

As stated above, one of the primary goals of the proposed architecture is the runtime reconfiguration of the DDoS defence processing cores. In other words, Defense Cores implementing DDoS defence mechanisms can be updated or changed at runtime to adapt to DDoS attacks quickly. While one Defense Core is being updated or replaced, the others keep working without any interference. To support this goal, the Dispatch Interface is responsible for communication between the host processor and the DDoS protection system. When Defense Cores are soft general purpose processors (GPPs), the host processor can configure the cores by sending new instruction caches through this interface. Otherwise, if the cores are dedicated hardware processing cores, a dynamic partial reconfiguration bitstream is sent from the host processor to modify them.

Updating Controller

The main purpose of this module is to control the process of updating or changing Defense Cores. When receiving an update request from the host, this module disables the defence core before allowing its configuration code to be updated. However, the Updating Controller module varies from implementation to implementation. If the system executes DDoS defence mechanisms on soft GPPs, the Updating Controller just needs to select the right instruction cache and replace it with a new one. If dedicated hardware processing cores are used to execute the DDoS defence mechanisms, the Updating Controller should instead include modules that support the dynamic partial reconfiguration technique. These supporting modules depend on the FPGA family. Therefore, we describe only the Updating Controller module itself in the proposed architecture.

Dynamic Partition

Defence Core

Defence Cores are mainly responsible for scanning incoming packets to classify them as legitimate packets or DDoS attack packets. Different DDoS filtering techniques require different input data, such as header fields or payload fields. These fields are supplied by the Packet Decoder module. Defence Cores can be implemented as soft GPPs (such as MicroBlaze [30] or NIOS [31] processors), as dedicated custom hardware processing cores, or as both. In all cases, Defense Cores can be updated or changed at both runtime and design time. When Defense Cores are soft GPPs, the update process just requires the instruction caches of the GPPs to be changed. If Defense Cores are specialised hardware cores, the dynamic partial reconfiguration technique for FPGA devices is needed. Compared to the GPP approach, dedicated hardware Defence Cores achieve better performance due to parallelism.

Defence Decision

DDoS attacks can be carried out using different methods, and one defence mechanism can detect only one type of DDoS attack. Therefore, in a DDoS protection system with multiple filtering mechanisms, if a packet is classified as an attack packet by one DDoS filtering technique, there is no need to wait for the decisions of the other techniques. The main duty of the Defense Decision module is to monitor the classification results from the Defense Cores to determine whether a packet is legitimate or belongs to a DDoS attack. If any one Defense Core recognises a packet as an attack packet, Defense Decision immediately sends a drop signal to the Packet Processing module in the static partition. Defense Decision then resets all Defense Cores so that they start processing the next packet, since there is no need to wait until all Defense Cores finish scanning. In contrast, if all Defense Cores vote for a packet as legitimate, Defense Decision sends a bypass signal to the Packet Processing module.
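The decision rule described above (drop as soon as any core flags the packet, bypass only when every core accepts it) can be sketched in a few lines of Python. This is a software model under stated assumptions, not the hardware module: the names `Verdict` and `defense_decision` are hypothetical, and the cores, which run in parallel in hardware, are modelled here as a sequence of boolean results.

```python
from enum import Enum


class Verdict(Enum):
    BYPASS = "bypass"   # forward the packet to the output ports
    DROP = "drop"       # the packet belongs to a DDoS attack


def defense_decision(core_results):
    """Combine per-core results into one verdict.

    core_results: iterable of booleans, True meaning that a core
    classified the packet as an attack packet.
    """
    for flagged in core_results:
        if flagged:
            # The first positive result ends the evaluation; the
            # remaining cores are not consulted for this packet.
            return Verdict.DROP
    return Verdict.BYPASS


print(defense_decision([False, False]))        # every core accepts the packet
print(defense_decision([False, True, False]))  # one core flags the packet
```

Returning on the first positive result mirrors the early reset of the Defense Cores: a single drop vote is sufficient, so waiting for the slower cores would only add latency.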

Summary

Chapter 3 presents the proposed multicore architecture, which separates the dynamic partition from the static partition to support dynamic partial reconfiguration. The design combines hardware-based high-performance processing with the programmability of reconfigurable hardware such as FPGA to achieve the objective. The architecture is feasible: it is not trivial to implement, but it is not overly complicated either. The next chapter presents an instance of the proposed architecture on a particular FPGA platform, a NetFPGA 10G board, which serves as the prototype system for the experiments.

Chapter 4: System Implementation

This chapter presents the first prototype version implementing the proposed FPGA-based multicore architecture, which integrates two well-known DDoS defence techniques: Hop-Count Filtering (HCF) and Port Ingress/Egress Filtering (PIEF). The NetFPGA-10G board containing a Xilinx XC5VTX240T device is used as the experimental platform. The board provides four SFP+ ports (NIC Rx and NIC Tx), which are suitable for building a high-speed network processing system. Figure 4.1 shows the NetFPGA 10G development platform. The two Defence Cores implementing the HCF and PIEF techniques are developed as dedicated hardware processing cores. Therefore, we apply the dynamic partial reconfiguration technique to update and modify Defense Cores, as stated above. In this chapter, we mainly highlight the two DDoS filtering techniques, the Updating Controller (DPR) module used to control the dynamic partial reconfiguration process, and the Dispatch Interface (ICAP) that receives the dynamic partial reconfiguration bitstream from the host processor. These modules are implemented using Verilog-HDL as described above.

Figure 4.2 depicts our prototype system based on the proposed architecture.

Figure 4.2: The FPGA-based multicore DDoS protection system based on the proposed architecture (labels recovered from the figure: Packet Decoder, Defense Decision, Port I/E Filtering, NIC Rx, NIC Tx, Partial bitstream, Output Queue)

Static Partition

Input Arbiter

The Input Arbiter module delivers raw packets from multiple input ports to the Packet Decoder. It serialises packets that arrive in parallel, so that the Packet Decoder can process all packets one after another. An input queue stores the chain of packets before they reach the Packet Decoder. In this prototype system, the round robin strategy is implemented in the Input Arbiter.

Output Arbiter

The Output Arbiter module is responsible for sending packets out to the network. This module receives a packet and decides the destination to which the packet is forwarded. The Output Arbiter picks packets from the Packet FIFO one by one, checks each one and delivers it to the right destination port. In this prototype system, the Output Arbiter module supports manual configuration to route packets between ports.

Packet Decoder modules

The Packet Decoder module receives an incoming packet from the NIC Rx inputs of the SFP+ ports. It decodes and extracts the header of the incoming packet for layer 2 to layer 4 of the OSI network model. The header is then sent to the defence cores in the Dynamic Partition for classification. The raw packet is stored in the Packet FIFO while waiting for the classification results from the defence cores. The extracted header used in this prototype includes the source IP address, destination IP address, source port, destination port and TTL value. Figure 4.3 shows the operation of the Packet Decoder module.

Figure 4.3: The Packet Decoder module (stages labelled in the figure: Transport layer, Network layer and Data Link layer header extraction)
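The layer 2 to layer 4 extraction can be illustrated with a short software sketch. It is a minimal Python model, not the Verilog module: it assumes untagged Ethernet II frames carrying IPv4 with TCP or UDP, and the function name `decode_headers` and the returned dictionary layout are hypothetical.

```python
import socket
import struct


def decode_headers(frame: bytes) -> dict:
    """Extract the five fields used by the prototype from a raw frame."""
    # Data Link layer: 14-byte Ethernet II header, EtherType at bytes 12-13.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != 0x0800:
        raise ValueError("this sketch only handles IPv4 over untagged Ethernet II")

    # Network layer: the IPv4 header length is given in 32-bit words.
    ip = frame[14:]
    header_len = (ip[0] & 0x0F) * 4
    ttl = ip[8]
    protocol = ip[9]
    src_ip = socket.inet_ntoa(ip[12:16])
    dst_ip = socket.inet_ntoa(ip[16:20])

    # Transport layer: TCP (6) and UDP (17) both start with the two ports.
    if protocol not in (6, 17):
        raise ValueError("this sketch only handles TCP and UDP")
    src_port, dst_port = struct.unpack("!HH", ip[header_len:header_len + 4])

    return {"src_ip": src_ip, "dst_ip": dst_ip,
            "src_port": src_port, "dst_port": dst_port, "ttl": ttl}


# Hand-built example frame: Ethernet + IPv4 (TTL 57, UDP) + UDP header.
ethernet = bytes(6) + bytes([0x11] * 6) + b"\x08\x00"
ipv4 = (bytes([0x45, 0, 0, 28, 0, 0, 0, 0, 57, 17, 0, 0])
        + socket.inet_aton("192.0.2.1") + socket.inet_aton("198.51.100.7"))
udp = struct.pack("!HHHH", 1234, 53, 8, 0)
print(decode_headers(ethernet + ipv4 + udp))
```

Only the header is parsed; the raw frame itself is left untouched, which matches the design choice of keeping the whole packet in the Packet FIFO while the cores work on the extracted fields.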

Packet Processing modules

The Packet Processing module forwards legitimate packets to the destination NIC Tx outputs of the SFP+ ports. This module receives the classification result from Defence Decision and, depending on that result, decides whether the packet is dropped from the Packet FIFO or forwarded to the network. In the prototype system, a register is used to manually control whether the packet is dropped or sent out to the remaining ports for further investigation or statistics.

Packet FIFO

In this prototype version, the FIFO intellectual property core provided by Xilinx [32] is used to develop the Packet FIFO. The Packet FIFO is implemented in FPGA block memory (on-chip memory). The configured size of the Packet FIFO is 256 bits in width and 1024 entries in depth. With that configuration, the Packet FIFO can store a minimum of 21 packets of 1500 bytes and a maximum of 512 packets of 64 bytes.
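The two capacity figures follow from simple arithmetic on the FIFO geometry. The sketch below reproduces them in Python under one assumption that the text does not state explicitly: each packet occupies a whole number of 256-bit entries, so a packet never shares an entry with the next one. The function name is hypothetical.

```python
import math

ENTRY_BITS = 256
ENTRY_BYTES = ENTRY_BITS // 8   # 32 bytes per FIFO entry
DEPTH = 1024                    # number of entries


def packets_that_fit(packet_bytes: int) -> int:
    """Number of equally sized packets that the FIFO can hold."""
    # Assumption: a packet is padded up to a whole number of entries.
    entries_per_packet = math.ceil(packet_bytes / ENTRY_BYTES)
    return DEPTH // entries_per_packet


print(packets_that_fit(1500))   # 1500-byte packets: 47 entries each
print(packets_that_fit(64))     # 64-byte packets: 2 entries each
```

A 1500-byte packet needs 47 entries, and 1024 entries hold 21 such packets; a 64-byte packet needs 2 entries, giving 512 packets, which agrees with the figures quoted above.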

Dispatch Interface

Because dedicated hardware processing cores are used to implement the DDoS filtering mechanisms, runtime updates and changes have to be done with the dynamic partial reconfiguration technique. Therefore, the Dispatch Interface module needs to support this technique. The dynamic partial reconfiguration technique is technology-dependent. Hence, we use an interface provided by Xilinx to reconfigure the FPGA device at runtime. Xilinx supports various interfaces for configuring its FPGA devices. Among those interfaces, the ICAP primitive [27] can read and write the FPGA internal configuration memory. In this work, ICAP is used as the Dispatch Interface, allowing the partial reconfiguration bitstream to be transferred from the host processor to the internal configuration memory of the Xilinx FPGA device through a 256-bit Direct Memory Access (DMA) protocol. Figure 4.4 shows the internal architecture of our implemented Dispatch Interface based on ICAP. The AXI-Stream [33] protocol is typically used for data-centric applications, which means that it is suitable for transferring data at a high rate. AXI-Stream provides a 256-bit data bus while the ICAP primitive supports a 32-bit data interface. Therefore, we have to segment the data into 32-bit blocks to feed into ICAP_VIRTEX5, which is the ICAP primitive supported by Xilinx in the Virtex 5 chipset. A data buffer is used to sequentially transfer the partial bitstream to the ICAP port.

Figure 4.4: The Dispatch Interface - ICAP
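The width conversion between the 256-bit AXI-Stream bus and the 32-bit ICAP port can be modelled in software as follows. This is a minimal Python sketch, not the Verilog data buffer: the function name is hypothetical, and the word order (most significant 32 bits first) is an assumption, since the text does not specify the order in which the blocks are fed to ICAP.

```python
def split_256_to_32(word: int) -> list:
    """Split one 256-bit word into eight 32-bit words.

    Assumption: the most significant 32 bits are emitted first.
    """
    if not 0 <= word < (1 << 256):
        raise ValueError("expected a 256-bit unsigned value")
    return [(word >> shift) & 0xFFFFFFFF for shift in range(224, -1, -32)]


# Example: a 256-bit word whose bytes are 0x00, 0x01, ..., 0x1F.
bus_word = int.from_bytes(bytes(range(32)), "big")
for icap_word in split_256_to_32(bus_word):
    print(f"{icap_word:08X}")
```

Each bus word therefore takes eight ICAP write cycles, which is why a buffer is needed between the two interfaces.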

Updating Controller

As mentioned above, one of the main goals of this work is to update or change one processing core that implements one DDoS defence technique without interfering with the other cores and, of course, without interfering with the packet classification process. The Updating Controller module is responsible for the dynamic partial reconfiguration process. The Updating Controller stops the core that needs to be updated and sends classification information to Defense Decision on behalf of that core. When the updating process has been completed by ICAP, the new core is authorised by the Updating Controller to continue operation. From then on, it scans incoming packets and sends its classification results to Defense Decision. Figure 4.5 shows the architecture of the implemented DPR module. The registers Decoupling 0 and Decoupling 1 are used to avoid interference (pulse noise) while the defence cores are being reconfigured.

Figure 4.5: The Dynamic Partial Reconfiguration module (blocks labelled in the figure: Hop-Count Filtering, Port I/E Filtering)

Dynamic Partition

Hop-Count Filtering

Although DDoS attackers can forge any data in the header fields of a packet, they cannot falsify the number of hops that a packet traverses to reach its destination. Based on that observation, Wang et al. [17] propose HCF to filter malicious packets based on the number of routers traversed before reaching the victim. Time-to-Live (TTL) is an 8-bit field [8] in the IP packet header that was originally introduced to specify the maximum lifetime of a packet on the network. The TTL value is decremented by 1 each time the packet traverses a router. When the TTL value reaches 0, the packet is discarded. The number of hops traversed by a packet, named the hop-count, is calculated by subtracting the final TTL from the initial TTL; the result is the so-called calculated hop-count. The final TTL is the value when the packet reaches the destination. The initial TTL is set to 30, 32, 60, 64, 128 or 255 depending on the Operating System (OS) where the packet is created.

The calculated hop-count is compared to the stored hop-count collected earlier in order to classify the packet. If the calculated hop-count is identical to the stored hop-count, the packet is legitimate. Otherwise, the packet is suspicious. Figure 4.6 shows the HCF algorithm.

for each packet:
    extract the final TTL Tf;
    extract the IP address S;
    infer the initial TTL Ti;
    compute the hop-count Hc = Ti - Tf;
    index S to get the stored hop-count Hs;
    if (Hc != Hs)
        packet is spoofed;
    else
        packet is legitimate;

Figure 4.6: The algorithm of HCF

The HCF module in this work consists of three blocks (described in Figure 4.7): (i) Hop-count calculation, which calculates the hop-count value of an incoming packet, (ii) Content Addressable Memory (CAM), which stores known hop-count values of packets coming from specific sources, and (iii) Comparator, which compares stored hop-count values with calculated hop-count values. Due to the limitation of the NetFPGA 10G board hardware resources, we only build two versions of the CAM, with 128 and 256 entries. This module receives the header fields of an incoming packet from the Packet Decoder module, located in the static partition, and calculates the actual hop-count value. The module then uses the source IP address of the packet to look up the corresponding stored hop-count value in the CAM. If the stored hop-count value is identical to the calculated hop-count, the packet is legitimate. Otherwise, the packet is spoofed. When a packet comes to the system for the first time (i.e., its source IP does not exist in the CAM), the CAM returns a MISS signal and stores both the source IP address of the packet and the calculated hop-count value. This packet is considered a legitimate packet. However, the initial TTL can be forged. This means that such a first packet may be spoofed without HCF being able to recognise it, because no stored information is available yet. That is the main disadvantage and limitation of the HCF mechanism.

Figure 4.7: The Hop-Count Filtering module (labels recovered from the figure: Hop-Count Filtering, Source IP)
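The HCF steps can be sketched as a small software model. This is a Python illustration, not the hardware core, and it rests on several assumptions: the initial TTL is inferred as the smallest listed initial value that is not below the observed TTL, a dictionary stands in for the CAM, and a new source is simply not stored once the table is full (the text does not say what the hardware does in that case). The class and function names are hypothetical.

```python
INITIAL_TTLS = (30, 32, 60, 64, 128, 255)


def infer_initial_ttl(final_ttl: int) -> int:
    """Assumption: smallest common initial TTL that is >= the observed TTL."""
    if not 0 <= final_ttl <= 255:
        raise ValueError("TTL is an 8-bit field")
    return next(t for t in INITIAL_TTLS if t >= final_ttl)


class HopCountFilter:
    def __init__(self, capacity: int = 128):
        self.capacity = capacity   # 128 or 256 entries in the prototype
        self.table = {}            # stands in for the CAM: source IP -> hop-count

    def classify(self, src_ip: str, final_ttl: int) -> str:
        hop_count = infer_initial_ttl(final_ttl) - final_ttl
        stored = self.table.get(src_ip)
        if stored is None:
            # MISS: first packet from this source, learn its hop-count.
            if len(self.table) < self.capacity:   # assumed behaviour when full
                self.table[src_ip] = hop_count
            return "legitimate"
        return "legitimate" if stored == hop_count else "spoofed"


hcf = HopCountFilter(capacity=128)
print(hcf.classify("192.0.2.10", 52))    # first packet: learned, hop-count 8
print(hcf.classify("192.0.2.10", 52))    # same hop-count as stored
print(hcf.classify("192.0.2.10", 110))   # hop-count 18 differs from stored 8
```

The first call also shows the limitation discussed above: whatever arrives first from a source is trusted, so a spoofed first packet would poison the stored hop-count.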

Port Ingress/Egress Filtering

In computer networks, Ingress filtering is a technique used to guarantee that incoming packets actually come from the networks they claim to originate from. Routers that integrate the Ingress Filtering method check the source IP addresses of traversing packets. A router drops a packet if its source IP address does not belong to the range of addresses to which the router connects. Meanwhile, the Egress filtering technique monitors outbound traffic to ensure that spoofed or malicious packets are not allowed to leave internal networks. There is a special-purpose address registry [34] which defines IP address blocks that should not normally appear on the Internet. Therefore, packets using them should be blocked or deleted. Table 4.1 shows the list of special-purpose address blocks.

Table 4.1: Global and specialized address blocks

Address block         Purpose
0.0.0.0/8             "This" Network
10.0.0.0/8            Private-Use Networks
169.254.0.0/16        Link Local
172.16.0.0/12         Private-Use Networks
192.0.0.0/24          IETF Protocol Assignments
192.0.2.0/24          TEST-NET-1
192.88.99.0/24        6to4 Relay Anycast
192.168.0.0/16        Private-Use Networks
198.18.0.0/15         Network Interconnect Device Benchmark Testing
198.51.100.0/24       TEST-NET-2
203.0.113.0/24        TEST-NET-3
224.0.0.0/4           Multicast
240.0.0.0/4           Reserved for Future Use
255.255.255.255/32    Limited Broadcast

The PIEF module in this work consists of two blocks: CAM and Comparator. Figure 4.8 shows a diagram of the PIEF module. The CAM stores the special IP address blocks defined in [34], while the Comparator compares the IP addresses of incoming packets with the values in the CAM. When receiving the IP address of an incoming packet, PIEF searches for it in the CAM. If a MISS signal is returned (i.e., no record was found in the CAM), the packet is legitimate. Otherwise (i.e., a HIT signal is returned), the packet is illegitimate. PIEF forwards the MISS or HIT signal to Defense Decision for further processing.

Figure 4.8: The Port Ingress/Egress Filtering module (labels recovered from the figure: Port Ingress/Egress Filtering, Source IP)
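The HIT/MISS lookup can be illustrated with Python's standard `ipaddress` module. The sketch below is a software model of the CAM and Comparator, not the hardware core; it uses exactly the address blocks listed in Table 4.1, and the function name `pief_lookup` is hypothetical.

```python
import ipaddress

# Address blocks taken from Table 4.1.
SPECIAL_BLOCKS = [ipaddress.ip_network(block) for block in (
    "0.0.0.0/8", "10.0.0.0/8", "169.254.0.0/16", "172.16.0.0/12",
    "192.0.0.0/24", "192.0.2.0/24", "192.88.99.0/24", "192.168.0.0/16",
    "198.18.0.0/15", "198.51.100.0/24", "203.0.113.0/24",
    "224.0.0.0/4", "240.0.0.0/4", "255.255.255.255/32",
)]


def pief_lookup(src_ip: str) -> str:
    """Return "HIT" if the address lies in a special-purpose block, else "MISS"."""
    address = ipaddress.ip_address(src_ip)
    if any(address in block for block in SPECIAL_BLOCKS):
        return "HIT"    # illegitimate source address
    return "MISS"       # no record found: treated as legitimate


for ip in ("10.1.2.3", "8.8.8.8", "172.31.255.255", "172.32.0.1"):
    print(ip, pief_lookup(ip))
```

The software model scans the blocks one after another, whereas a CAM compares against all stored entries at once; the result of the lookup is the same.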

The system is finally synthesised with the Xilinx ISE 14.7 toolset without any manual optimisation. According to the synthesis report, the prototype system can work at up to 116.782 MHz. Table 4.2 presents the hardware resource usage of our system. The second column shows the synthesis frequency of each module. The third and fourth columns show the total amount of each resource type. The last column gives the percentage of resources used compared to the overall FPGA device resources.

Table 4.2: The device utilization summary of the system

The prototype system is implemented on Xilinx Virtex 5, so the partial reconfiguration technology is Xilinx-dependent. PlanAhead [35] from Xilinx is a design and analysis software product used to design FPGA devices. In this implementation, PlanAhead is used to implement dynamic partial reconfiguration and the HCF and PIEF cores in the Dynamic Partition. To implement DPR, the static partition and the dynamic partition have to be separated. Each partial defence core in the dynamic partition is implemented in a region called a blackbox. The blackboxes connect to the static region through an interface which is instantiated by PlanAhead. Figure 4.9 shows how PIEF and HCF are implemented as partial reconfiguration cores in the Xilinx Virtex 5 device. The static region and the blackboxes are fixed at design time. The partial defence cores, such as PIEF and HCF, can be flexibly changed or updated inside the blackboxes. Each blackbox reserves a specific amount of resources, so a partial defence core must be implemented efficiently within the limited resources of its blackbox. In Figure 4.9, two blackboxes accommodate PIEF and HCF.

Figure 4.9: The prototype system - FPGA device view

Summary

Chapter 4 presents the prototype system implementation, which is an instance of the proposed architecture. The prototype system has two defence cores, a dynamic partial reconfiguration module, and four 10 Gbps network interfaces with an on-chip Packet FIFO module. The prototype system can operate at a maximum frequency of 116.782 MHz. With this specification, the implemented prototype is ready for the experiments. In the next chapter, the experimental environment is set up to validate and evaluate the prototype system.

Chapter 5: Experiments

In this chapter, we present our experiments with the system implemented in the previous chapter. We also analyse the throughput and packet classification results of the system.

Experimental Setup

To validate the proposed system and evaluate system performance, we deploy a testing model as shown in Figure 5.1. In this testing model, we use three NetFPGA 10G boards. The first board is used to implement the Open Source Network Tester (OSNT) [36], while the proposed system is implemented on the second board. OSNT is a flexible tool which can generate packets of any size and measure performance at the line rate of 10 Gbps. We prepare TCP/UDP packets with various sizes from 64 to 1500 bytes. Some of those packets are real packets collected from the Internet to test HCF, and some are synthetic packets to test PIEF. A reference NIC is ported to the third board to capture packets in order to determine the detection rate, false positive rate and false negative rate.

Figure 5.1: The system validation setup (labels recovered from the figure: Legitimate packets; SFP+ Port 0 to SFP+ Port 3)

In the evaluation experiments, we configure two NetFPGA boards with OSNT as generator and monitor. The network packets are sent from the generator to the proposed system at the maximum speed of OSNT. Processed packets are then sent from the proposed system to the monitor to evaluate system performance. Figure 5.2 shows the throughput testing model.

Figure 5.2: The throughput testing model setup (labels recovered from the figure: SFP+ Port 0 to SFP+ Port 3; full-duplex connection; half-duplex connection)

Experimental Results

In this work, we perform experiments with one flow and with multiple simultaneous flows. Figure 5.3 shows the throughput of the prototype system. We conduct six test cases with various packet sizes, as shown in the figure. The vertical axis shows the throughput in Gbps, while the horizontal axis shows the packet sizes in our test cases. For each test case, the first column shows the OSNT packet generator speed, and the second, third and fourth columns show the packet processing speed with one flow, two simultaneous flows and four simultaneous flows, respectively. In the four-flow case, the OSNT generator generates traffic on all four ports at the same time; this case tests the highest packet processing capability. The throughput value is calculated by the monitoring software.

According to these experimental results, our first prototype system achieves a throughput of up to 9.843 Gbps in one-flow mode and up to 19.686 Gbps in two-flow mode. The prototype system achieves its highest packet processing throughput at up to 26.248 Gbps.

Figure 5.3: The throughput evaluation

Table 5.1 shows the packet classification results in the validation test. The results are obtained by capturing the classified packets with a reference NIC board. We perform two test cases according to the CAM sizes, i.e. the 128-entry and 256-entry test cases. In the 128-entry test case, 286,162 legitimate and 13,838 spoofed packets are sent to the system. The system recognises all the legitimate and spoofed packets exactly. In other words, a 100% detection rate with a 0% false positive rate and a 0% false negative rate is achieved in this test case. In the 256-entry test case, the proposed system also attains a 100% detection rate with a 0% false negative rate and a 0.17% false positive rate. The detection rates in the two test cases are higher than the results reported in [17] and [18], whose detection rates are 90% and 99.33%, respectively. Table 5.2 shows detailed information.

Table 5.1: The packet classification statistic

CAM version    Expected result    Classified result

Table 5.2

Author         System Name    Implementation    Detection Rate
Wang et al.    HCF            Software          90%
Ritu et al.    DPHCF-RTT      Software          99.33%
Biet et al.    This work      FPGA              100%
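The three rates can be reproduced from packet counts. The thesis does not spell out the formulas, so the sketch below assumes the usual definitions: detection rate as the share of spoofed packets that are flagged, false negative rate as the share of spoofed packets that are missed, and false positive rate as the share of legitimate packets that are flagged. The function name is hypothetical; the counts are those of the 128-entry test case.

```python
def classification_rates(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Rates in percent, using assumed (conventional) definitions.

    tp: spoofed packets flagged      fn: spoofed packets missed
    fp: legitimate packets flagged   tn: legitimate packets passed
    """
    return {
        "detection_rate": 100.0 * tp / (tp + fn),
        "false_negative_rate": 100.0 * fn / (tp + fn),
        "false_positive_rate": 100.0 * fp / (fp + tn),
    }


# 128-entry test case: 286,162 legitimate and 13,838 spoofed packets,
# all of them classified correctly.
rates_128 = classification_rates(tp=13_838, fn=0, fp=0, tn=286_162)
print(rates_128)
```

With these definitions the detection rate and the false negative rate always add up to 100%, which is consistent with the reported pairs of 100% and 0% in both test cases.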

We also compile statistics based on the packet sizes. The packets are classified into six groups: less than 64 Bytes, 64-128 Bytes, 128-256 Bytes, 256-512 Bytes, 512-1024 Bytes, and 1024-1500 Bytes. Figure 5.4 shows the detection rates, false positive rates, and false negative rates according to the packet sizes in each test case. Overall, the detection rates, false negative rates, and false positive rates remain stable as the packet size changes. In the 256-entry test case, the false positive rate varies slightly with the packet size: it is 0.33%, 0.74%, 0.19%, 0.17%, 0.13% and 0.12% for 64 Bytes, 64-128 Bytes, 128-256 Bytes, 256-512 Bytes, 512-1024 Bytes, and 1024-1500 Bytes packets, respectively.

Figure 5.4: The packet classification statistic ratio (series: detection rate, false negative rate and false positive rate for the 128-entry and 256-entry test cases)

We perform partial reconfiguration (PR) experiments with two reconfigurable logic regions for the HCF and PIEF modules. We develop a software program that communicates with the DPR module through DMA to check and send bitstream files to the ICAP Controller. The software program is also responsible for measuring the PR time. We also perform PR experiments through JTAG and DMA AXI-Lite [33] in this work. Table 5.3 shows the PR throughput of the experimental system. We plan to optimise DPR by applying pipelining in future work. In the pipelining simulation, a throughput of up to 370 MB/s may be achieved at a frequency of 100 MHz.

Table 5.3: The partial reconfiguration experiment

Method Bitstream size (KB) PR time (second) Throughput
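The throughput column relates the other two columns in a simple way, and the simulated 370 MB/s figure can be put in context against the width of the ICAP port. The sketch below is an illustration under stated assumptions: 1 MB is taken as 10^6 bytes, the 32-bit ICAP port is assumed to accept one word per clock cycle at 100 MHz, and the bitstream size and time in the usage line are made-up values, not measurements from Table 5.3. The function name is hypothetical.

```python
def pr_throughput_mb_per_s(bitstream_bytes: float, pr_time_s: float) -> float:
    """Partial reconfiguration throughput in MB/s (1 MB = 10**6 bytes, assumed)."""
    return bitstream_bytes / pr_time_s / 1e6


# Upper bound of a 32-bit port clocked at 100 MHz, assuming one word per cycle.
ICAP_WIDTH_BYTES = 4
ICAP_CLOCK_HZ = 100e6
icap_peak = ICAP_WIDTH_BYTES * ICAP_CLOCK_HZ / 1e6   # MB/s

share_of_peak = 370 / icap_peak   # simulated pipelined design versus the bound

print(icap_peak, share_of_peak)
print(pr_throughput_mb_per_s(1_000_000, 0.5))   # hypothetical 1 MB in 0.5 s
```

Under these assumptions the bound is 400 MB/s, so the simulated pipelined design would use about 92.5% of what the port can accept.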

Summary

Chapter 5 describes the experiments conducted to validate and evaluate the prototype system and the proposed architecture. The experiments show that the prototype system achieves not only a high DDoS detection rate but also high performance. This shows that the proposed architecture is feasible and works as designed. The next chapter concludes the research: it summarises the achievements and limitations, and outlines future work to improve the research.

Chapter 6: Conclusion

The Introduction chapter has presented the benefits as well as the challenges of the Internet. The Internet is among the most innovative technology fields, but its improvement brings security threats along with it. The Internet Protocol itself has vulnerabilities which can be exploited for DDoS attacks. DDoS attacks not only cause financial loss but also waste Internet infrastructure resources. DDoS attacks are on the rise, both in the number of attacks and in bandwidth consumption. Most of the attacks that consume large bandwidth employ the IP spoofing technique, which allows attackers to forge the source IP address of a packet. IP spoofing not only allows attackers to hide their identity but also makes the attack hard to mitigate. IP spoofing exploits a weakness in IP routing, which checks only the destination address of a packet to make routing decisions and leaves the source IP address unverified.

When performing an attack, the attacker often applies multiple attack mechanisms. There is much research on understanding and mitigating DDoS attacks, but the results are discrete solutions, while DDoS attacks employ multiple mechanisms. Therefore, an effective solution which incorporates multiple countering mechanisms is required. This solution must be able to adapt to many kinds of DDoS attacks and must be updatable to detect future variants of DDoS attacks. It should also be high-performance in order to work with the next generation of networks. In this chapter, the proposed architecture is discussed to identify the achievements and limitations of the study. Future work is also stated in this chapter.

Achievement

In this work, research on DDoS attacks has been conducted. The classification and mechanisms of DDoS attacks have been studied in detail, together with several countermeasure mechanisms. Based on the existing countering mechanisms, a multicore architecture has been proposed to integrate multiple discrete mechanisms and combine the strengths of each. The proposed architecture supports dynamic partial reconfiguration, which allows one core to be updated without affecting the operation of the others. The experiments show that the prototype, which is an instance of the proposed architecture, works as designed. The prototype system operates at the line rate of a 10 Gbps network and reaches up to 26.248 Gbps, with a detection rate of 100% and very low false positive and false negative rates. The achievements of the thesis are summarised below.

• Researching DDoS attack mechanisms and classification.

• The proposed multicore architecture incorporates multiple discrete countering mechanisms and combines their strengths. To the best of our knowledge, this is the first proposed FPGA-based multicore DDoS protection system in which various DDoS countermeasure techniques work concurrently.

• Each defence core can be flexibly changed or updated at runtime, without affecting other operations, by applying dynamic partial reconfiguration.

• The prototype system reaches the line rate of a 10 Gbps network and achieves up to 26.248 Gbps.

• The prototype system achieves a detection rate of up to 100%, with a false positive rate of 0% and a false negative rate of 0.17%.
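The rates quoted above can be related to raw packet counts. The following minimal Python sketch assumes the common definitions: detection rate as the share of attack packets that are flagged, false positive rate as the share of legitimate packets that are flagged, and false negative rate as the share of attack packets that are missed. The thesis may define these rates differently, and the counts below are hypothetical values chosen only to illustrate the arithmetic.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute rates from packet counts.

    tp: attack packets flagged      fn: attack packets missed
    fp: legitimate packets flagged  tn: legitimate packets passed
    """
    return {
        "detection_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (tp + fn),
    }


# Hypothetical counts, not measurements from the prototype.
rates = detection_metrics(tp=990, fp=5, tn=995, fn=10)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```

Under these assumed definitions, the detection rate and the false negative rate always sum to one.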

The thesis has contributed three publications:

1. A Novel High-Speed Architecture For Integrating Multiple DDoS Countermeasure Mechanisms Using Reconfigurable Hardware. Biet Nguyen-Hoang, Binh Tran-Thanh, Cuong Pham-Quoc, Nguyen Quoc Tuan, Tran Ngoc Thinh. In Jurnal Teknologi (SCOPUS E-ISSN: 21803722).

2. FPGA-based Multiple DDoS Countermeasure Mechanisms System Using Partial Dynamic Reconfiguration. Tran Ngoc Thinh, Cuong Pham-Quoc, Biet Nguyen-Hoang, Chau Tran-Thi, Chien Do-Minh, Quoc Nguyen-Bao, Nguyen Quoc Tuan. In REV Journal on Electronics and Communications, Vol. 5, No. 3-4, Jul-Dec 2015.

3. FPGA-based Multicore Architecture for Integrating Multiple DDoS Defense Mechanisms. Cuong Pham-Quoc, Biet Nguyen, Tran Ngoc Thinh. In International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies (HEART2016), July 25th-27th, 2016.

The full papers are shown in Appendix A.

Limitations

Although the proposed architecture and prototype system work as designed, they still have several limitations that need to be addressed.

• The maximum throughput of the prototype system is 26.248 Gbps. This is lower than the maximum bandwidth supported by the NetFPGA 10G platform, which is up to 40 Gbps.

• The prototype system implements only two well-known DDoS defence mechanisms (PIEF and HCF) because of the limited resources of the NetFPGA 10G platform.

• The throughput of partial reconfiguration is still very low compared to the ICAP bandwidth. Moreover, the current prototype system can only be reconfigured from the host system: the reconfiguration process must be issued from the host that connects directly to the prototype system through the PCI Express interface.

• The maximum operating frequency of the implemented prototype system is 116.782 MHz, as a result of the place and route process. The operating frequency could be higher if manual constraints were applied to optimise place and route.

Future Work

To address the limitations presented in the previous section, the following future work is proposed.

• Apply manual user constraints in the place and route phase to achieve a higher operating frequency.

• Optimise the code to obtain higher system throughput. A higher operating frequency may also help to achieve higher throughput.

• Integrate more defence cores to detect and mitigate other kinds of DDoS attacks.

• Increase the dynamic partial reconfiguration throughput so that the system can react quickly to changes in DDoS attack mechanisms.

• Develop a network-based reconfiguration capability, so that the reconfiguration process can be issued from any host on the network through a defined protocol. The system could then be a complete standalone system.

Applications of The Proposed Architecture

The proposed architecture can be integrated into several kinds of network devices to provide a DDoS countermeasure layer. Potential applications of the proposed architecture include:

• Secure routing and switching devices: the architecture can be integrated into switches or routers to provide a security layer that defends against DDoS attacks.

• Cloud-based secure software-based networking devices: in a cloud-based virtualized environment, the architecture can be integrated into software-based routing and switching devices, such as software-defined networking devices, to provide a security layer. For instance, the architecture can be integrated into OpenFlow switches, which are an instance of software-defined networking.

A Novel High-Speed Architecture For Integrating Multiple DDoS Countermeasure Mechanisms Using Reconfigurable Hardware

• Paper name: A Novel High-Speed Architecture For Integrating Multiple DDoS Countermeasure Mechanisms Using Reconfigurable Hardware

• Authors: Biet Nguyen-Hoang, Binh Tran-Thanh, Cuong Pham-Quoc, Nguyen Quoc Tuan, Tran Ngoc Thinh

• Conference: 2015 Recent Advancement In Informatics, Electrical And Electronics Engineering International Conference (RAIEIC2015)

• Conference day: December 10th-12th, 2015

• Publisher: Jurnal Teknologi (SCOPUS E-ISSN: 21803722)

A Novel High-Speed Architecture for Integrating Multiple DDoS Countermeasure Mechanisms Using Reconfigurable Hardware

Biet Nguyen-Hoang, Binh Tran-Thanh, Cuong Pham-Quoc, Nguyen Quoc Tuan, Tran Ngoc Thinh

Ho Chi Minh city University of Technology, Vietnam National University, Ho Chi Minh City, Vietnam

Corresponding author 7140220@hcmut.edu.vn 12073117@hcmut.edu.vn cuongpham@hcmut.edu.vn nqtuan@hcmut.edu.vn tnthinh@hcmut.edu.vn

In this paper, we propose a novel high-speed architecture to incorporate multiple standalone DDoS countering mechanisms. The architecture separates the DDoS filtering mechanisms, which are algorithms, from the packet decoder, which is the base. The architecture not only lets developers concentrate on optimizing algorithms but also integrates multiple algorithms to achieve a more efficient DDoS defense mechanism. The architecture is implemented on reconfigurable hardware, which allows algorithms to be flexibly changed or updated. We implement and evaluate the system using a NetFPGA 10G board, incorporating Port Ingress/Egress Filtering and Hop-Count Filtering to classify IP spoofing packets. The synthesis results show that the system runs at 118.907 MHz and utilizes 38.99% Registers, and 44.75% BlockRAMs/FIFOs of the NetFPGA 10G board. The system achieves a detection rate of 100% with a false negative rate of 0% and a false positive rate close to 0.16%. The experimental results show that the system achieves a packet decoding throughput of 9.869 Gbps in half-duplex mode and 19.738 Gbps in full-duplex mode.

The Internet is growing fast and reaching further. It has become an important mechanism for connecting people and devices. Because of this benefit, the number of Internet users is increasing. As of Oct 24, 2015, more than 3.2 billion users had joined the Internet [1], and the number is steadily increasing. This increase gives attackers a good chance to replicate malicious software (malware), steal users' personal information and occupy computers for distributed denial of service (DDoS) attacks.

DDoS is a network attack method that prevents legitimate users from accessing network resources or services. The attacker performs DDoS attacks by consuming the network's resources, the server's resources, or both. Attackers can perform attacks from one source (DoS) or from multiple sources (DDoS). Most DDoS attacks use the Internet Protocol (IP) address spoofing technique [2], which allows attackers to forge a packet's source IP address so that it differs from the original address. The Spoofer Project showed that 13.5% of the address space is spoofable [3]. The way routers route a packet is a vulnerability that attackers exploit to perform DDoS attacks: network routers check only a packet's destination IP address to make a routing decision, while the source IP address is passed through unchecked.
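As an illustration of this vulnerability, the following minimal Python sketch models a router that forwards by longest-prefix match on the destination address only. The forwarding table, port names and addresses are hypothetical, and the sketch is not taken from the paper.

```python
import ipaddress

# Hypothetical forwarding table: destination prefix -> output port.
FORWARDING_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): "port0",
    ipaddress.ip_network("10.1.0.0/16"): "port1",
    ipaddress.ip_network("0.0.0.0/0"): "uplink",
}


def forward(src: str, dst: str) -> str:
    """Pick the output port by longest-prefix match on the destination."""
    dst_addr = ipaddress.ip_address(dst)
    matches = [net for net in FORWARDING_TABLE if dst_addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    # `src` is never consulted, so a forged source address changes nothing.
    return FORWARDING_TABLE[best]


genuine = forward(src="192.0.2.10", dst="10.1.2.3")
spoofed = forward(src="203.0.113.99", dst="10.1.2.3")
print(genuine, spoofed)
```

Both calls select the same output port, which is why filtering on the source address has to be added as a separate mechanism.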

In this paper, we propose a novel architecture to detect and defend against DDoS attacks based on the IP spoofing technique. The architecture consists of two main components: the Base System and DDoS Filtering.

The Base System is responsible for extracting the header and storing raw packets while waiting for the classification result from the DDoS Filtering component. The DDoS Filtering component classifies packets based on the header received from the Base System and can include multiple filtering modules. We implement Port Ingress/Egress Filtering (PIEF) and Hop-Count Filtering (HCF) modules to counter DDoS attacks, based on the theory in [4] and [5].
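The division of work between the Base System and the DDoS Filtering component can be sketched in Python as follows. This is a software illustration under stated assumptions, not the FPGA implementation: the per-port prefix table is a simplified stand-in for PIEF, the hop-count check follows the general idea of HCF (infer the hop count from the TTL and compare it with a previously learned value), and all tables, addresses and the list of initial TTL values are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Header:
    in_port: int
    src: str
    ttl: int


def decode(packet: dict) -> Header:
    """Base System side: extract only the fields the filters need."""
    return Header(packet["in_port"], packet["src"], packet["ttl"])


# Hypothetical table of source prefixes expected on each input port.
ALLOWED_PREFIXES = {
    0: [ipaddress.ip_network("192.0.2.0/24")],
    1: [ipaddress.ip_network("198.51.100.0/24")],
}


def ingress_filter(h: Header) -> bool:
    """Accept when the source address fits the arrival port."""
    addr = ipaddress.ip_address(h.src)
    return any(addr in net for net in ALLOWED_PREFIXES.get(h.in_port, []))


# Assumed set of common initial TTL values.
INITIAL_TTLS = (32, 64, 128, 255)
# Hypothetical hop counts learned from earlier legitimate traffic.
LEARNED_HOPS = {"192.0.2.10": 12, "198.51.100.7": 5}


def hop_count(ttl: int) -> int:
    """Infer hops as (smallest initial TTL not below the observed TTL) - TTL."""
    initial = min(t for t in INITIAL_TTLS if t >= ttl)
    return initial - ttl


def hop_count_filter(h: Header) -> bool:
    """Accept unknown sources; otherwise require the learned hop count."""
    expected = LEARNED_HOPS.get(h.src)
    return expected is None or hop_count(h.ttl) == expected


FILTERS: List[Callable[[Header], bool]] = [ingress_filter, hop_count_filter]


def classify(packet: dict) -> str:
    """Forward the buffered packet only if every filter accepts its header."""
    header = decode(packet)
    return "forward" if all(f(header) for f in FILTERS) else "drop"


print(classify({"in_port": 0, "src": "192.0.2.10", "ttl": 52}))
print(classify({"in_port": 0, "src": "192.0.2.10", "ttl": 60}))
```

Because each filter sees only the decoded header, a new filter can be appended to `FILTERS` without touching the decoder, which mirrors the separation the architecture aims for.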

The main contributions of this paper are as follows:

• High-speed packet decoder: the filtering modules are implemented in a Field Programmable Gate Array (FPGA) device. This takes advantage of hardware-based parallel processing, which is faster than a software-based implementation.

• A novel model for DDoS countermeasure mechanisms: the proposed architecture separates the packet decoder component, which is the base, from the Packet Filtering component, which implements the algorithms. This architecture helps developers implement filtering modules independently on top of the packet decoder.

• The combination of PIEF and HCF for countering DDoS attacks: both implemented filtering modules are DDoS defense mechanisms that prevent IP spoofing attacks. This combination not only consolidates DDoS countering mechanisms but also shows that multiple filtering mechanisms can be incorporated to prevent DDoS attacks.

The rest of the paper is organized as follows. Section II describes DDoS attack and defense mechanisms and related work. Section III presents our proposed DDoS countermeasure. The implementation and experimental results are discussed in Section IV. Section V concludes the paper and introduces future work.

In this section, we present DDoS background and countermeasure methods to prevent DDoS attacks.

Attackers often employ computers, or zombies, controlled through malware to form a botnet to perform a DDoS attack. They are usually motivated by incentives such as financial/economic gain, revenge, ideological belief, intellectual challenge or cyberwarfare [2]. Attacks carried out for financial gain are dangerous and hard to mitigate.

Zargar et al. [2] classified DoS/DDoS attacks into two categories: network/transport-level and application-level flooding attacks. Network/transport-level flooding attacks are performed by exploiting vulnerabilities of layer 2 to layer 4 in the Open Systems Interconnection (OSI) network model to exhaust the victim's network resources. Application-level flooding attacks exploit application-level vulnerabilities, including protocols and application code, to exhaust the victim's server resources.

Network/transport-level attacks are further divided into four subcategories [2]: flooding attacks, protocol exploitation flooding attacks, reflection-based flooding attacks, and amplification-based flooding attacks. Flooding attacks often exhaust network resources by consuming bandwidth or overburdening network devices. In protocol exploitation attacks, the attacker sends malformed packets, such as TCP SYN flood [6] [7] and TCP SYN/ACK flood [8], to confuse the victim. In reflection- and amplification-based attacks, the attacker sends spoofed packets whose source address is the victim's IP address to reflectors/amplifiers; the responses are then sent to the victim and cause flooding (e.g., Smurf attack, Fraggle attack).
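The reflection and amplification mechanism described above can be illustrated with a short Python sketch. It is a toy model under stated assumptions: the addresses come from documentation ranges, and the request size and amplification factor are hypothetical values rather than measurements of any real protocol.

```python
from collections import Counter


def reflector_reply(request: dict, amplification: int) -> dict:
    """A reflector answers whatever source address the request carries."""
    return {
        "src": request["dst"],
        "dst": request["src"],  # the reply goes to the claimed source
        "size": request["size"] * amplification,
    }


VICTIM = "203.0.113.5"  # hypothetical victim address
REFLECTORS = ["198.51.100.1", "198.51.100.2", "198.51.100.3"]

# The attacker forges the victim's address as the source of each request.
requests = [{"src": VICTIM, "dst": r, "size": 60} for r in REFLECTORS]
replies = [reflector_reply(req, amplification=10) for req in requests]

bytes_sent = sum(req["size"] for req in requests)
bytes_received = Counter()
for reply in replies:
    bytes_received[reply["dst"]] += reply["size"]

print(bytes_sent, bytes_received[VICTIM])
```

In this model the attacker sends 180 bytes in total, while the victim receives 1800 bytes it never asked for.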

Application-level attacks exploit vulnerabilities of application protocols and application code. Attackers often exploit stateless protocols, such as DNS and NTP, for this kind of attack. DNS amplification DDoS has been studied [9] [10], and an attack of 300 Gbps has been recorded [10] [11]. NTP amplification DDoS set a new record of 400 Gbps in 2014 [12] [13].

The attacker (i.e., the Master) starts a DDoS attack by sending control messages to the bots (i.e., Agents) in a botnet. The attacker can control the botnet through Internet Relay Chat (IRC) or HTTP-based command

FPGA-based Multiple DDoS Countermeasure Mechanisms System Using Partial Dynamic Reconfiguration

• Paper name: FPGA-based Multiple DDoS Countermeasure Mechanisms System Using Partial Dynamic Reconfiguration

• Authors: Tran Ngoc Thinh, Cuong Pham-Quoc, Biet Nguyen-Hoang, Chau Tran-Thi, Chien Do-Minh, Quoc Nguyen-Bao, Nguyen Quoc Tuan

• Journal: REV Journal on Electronics and Communications, Vol. 5, No. 3-4, Jul-Dec 2015.

REV JOURNAL ON ELECTRONICS AND COMMUNICATIONS, VOL 5, NO 3-4, JUL-DEC 2015 1

FPGA-based Multiple DDoS Countermeasure Mechanisms System Using Partial Dynamic Reconfiguration

Tran Ngoc Thinh, Cuong Pham-Quoc, Biet Nguyen-Hoang, Chau Tran-Thi, Chien Do-Minh, Quoc Nguyen-Bao, Nguyen Quoc Tuan

Ho Chi Minh city University of Technology - VNU-HCM, 268 Ly Thuong Kiet Str., District 10, Ho Chi Minh City, Vietnam

Email: {tnthinh,cuongpham}@hcmut.edu.vn

Abstract: In this paper, we propose a novel FPGA-based high-speed DDoS countermeasure system that can flexibly adapt to DDoS attacks while still maintaining system performance. The system includes a packet decoder module and multiple DDoS countermeasure mechanisms. We apply the dynamic partial reconfiguration technique in this system so that the countermeasure mechanisms can be flexibly changed or updated on the fly. The proposed system architecture separates the DDoS protection modules (which implement DDoS countermeasure techniques) from the packet decoder module. With this approach, one DDoS protection module can be reconfigured without interfering with other modules. The proposed system is implemented on a NetFPGA 10G board. The synthesis results show that the system can work at up to 116.782 MHz while utilizing up to 39.9% Registers and 49.85% BlockRAM of the Xilinx Virtex-5 XC5VTX240T FPGA device on the NetFPGA 10G board. The system achieves a detection rate of 100% with a false negative rate of 0% and a false positive rate close to 0.16%. The prototype system achieves a packet decoding throughput of 9.869 Gbps in half-duplex mode and 19.738 Gbps in full-duplex mode.

Index Terms: Partial Reconfiguration; ICAP; Reconfigurable hardware; Distributed Denial of Service (DDoS);

Distributed Denial of Service (DDoS) is a network attack method that prevents legitimate users from accessing network resources or services. It consumes network resources or server resources by employing multiple zombie computers to send requests to a victim simultaneously. The victim becomes overloaded and can no longer respond to legitimate requests, which results in denial of service. Most DDoS attacks use the Internet Protocol (IP) address spoofing technique [1], which allows attackers to modify the source IP address of a packet so that its original address is hidden. The Spoofer Project has shown that 13.5% of the overall address space is spoofable [2]. A network router checks only the packet destination address to make a routing decision and leaves the source address intact. The way a router routes a packet is thus a vulnerability that attackers can exploit to perform DDoS attacks. There is much research proposing DDoS countering systems in the literature [1]. However, most of the proposed systems are implemented as software programs. Therefore, they cannot react quickly to DDoS attacks in a high-speed network environment. To overcome this limitation, we need a platform that is both programmable and high-performance.

In recent decades, the computing community has seen reconfigurable computing evolve from less complex prototyping platforms to high-density, high-performance platforms. Field Programmable Gate Arrays (FPGAs) are usually used for implementing reconfigurable computing systems. FPGAs are programmable logic devices that consist of a matrix of Configurable Logic Blocks (CLBs) connected through programmable interconnects that can be reprogrammed. An FPGA not only has the advantage of hardware-based high-speed parallel processing but also offers the flexibility of software-based programmability. In this work, we exploit these main advantages of FPGA devices to adapt quickly to various DDoS attack mechanisms and to achieve high-speed computation. The dynamic partial reconfiguration technique is applied to react quickly to changes in vulnerability exploitation.

An FPGA device is configured by loading application-specific configuration data, called a bitstream, into its internal configuration memory. Partial reconfiguration (PR) is the modification of an operating FPGA's configuration memory by loading a partial configuration file. With the rapid development of technology, FPGAs now allow dynamic partial reconfiguration (DPR), meaning that some parts of an FPGA device can be reconfigured at runtime while other parts are still working. This runtime reconfiguration allows systems to be updated while still operating. The DPR design flow partitions the configuration memory into static logic and reconfigurable logic [3]. During the DPR process, the static logic keeps functioning while the reconfigurable logic is modified by the partial configuration file.
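The split between static logic and reconfigurable logic can be pictured with a small software model. The Python sketch below is only an analogy under stated assumptions: it does not use any FPGA tool or the ICAP, the class and the two filter cores (`ttl_check`, `size_check`) are hypothetical, and it assumes that a region being reconfigured is simply skipped while the other regions keep running.

```python
from typing import Callable, Dict, Optional

Filter = Callable[[dict], bool]


class ReconfigurableRegion:
    """One reconfigurable partition holding a single filter core."""

    def __init__(self, core: Filter):
        self.core: Optional[Filter] = core

    def begin_reconfiguration(self) -> None:
        # The region is offline while its partial bitstream is loaded.
        self.core = None

    def finish_reconfiguration(self, new_core: Filter) -> None:
        self.core = new_core


def classify(regions: Dict[str, ReconfigurableRegion], header: dict) -> bool:
    """Static logic: accept a header only if every online core accepts it."""
    return all(r.core(header) for r in regions.values() if r.core is not None)


regions = {
    "ttl_check": ReconfigurableRegion(lambda h: h["ttl"] > 1),
    "size_check": ReconfigurableRegion(lambda h: h["size"] <= 1500),
}

header = {"ttl": 64, "size": 9000}
before = classify(regions, header)  # size_check rejects the header
regions["size_check"].begin_reconfiguration()
during = classify(regions, header)  # only ttl_check is consulted
regions["size_check"].finish_reconfiguration(lambda h: h["size"] <= 9000)
after = classify(regions, header)  # the updated core accepts the header
print(before, during, after)
```

Throughout the sequence the `ttl_check` region is never touched, which is the property that DPR provides in hardware.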

In this paper, we propose a novel FPGA-based high-speed DDoS countermeasure system built on a reconfigurable computing platform that takes the dynamic partial reconfiguration technique into consideration. The system consists of three main components: Base System, DDoS Filtering, and Dynamic Partial Reconfiguration (DPR). The Base System is responsible for extracting the header and storing raw packets while waiting for classification results from the DDoS Filtering component. The DDoS Filtering component classifies packets based on the header received from the Base System. It can include multiple filtering mechanisms and can be updated or changed dynamically. DPR consists of a DPR Controller and an Internal Configuration Access Port (ICAP) Controller [4], which partially reconfigure DDoS Filtering on the fly while the system is still operating. DPR uses the ICAP primitive to perform reconfiguration.

The main contributions of the paper are as follows:

• High-speed packet decoder: the packet decoder module is implemented in an FPGA device. It takes advantage of hardware-based parallel processing, which is faster than a software-based implementation. Experimental results show that the packet decoder in our proposed system reaches the line rate of 10 Gbps in a high-speed network.

• A novel system architecture for DDoS countermeasure mechanisms: the proposed architecture separates the packet decoder module from the DDoS Filtering component, which implements different DDoS filtering mechanisms. This architecture helps developers implement filtering modules independently using output information from the packet decoder. Therefore, one filtering module can be reconfigured and updated dynamically without any interference from other filtering modules.

• Online reconfiguration system: the architecture allows DDoS filtering modules to be reconfigured while operating, without affecting other modules or changing the system architecture. Therefore, system performance is maintained during reconfiguration.

The rest of the paper is organized as follows. Section II analyzes background and related work. Section III discusses our proposed DDoS countermeasure system architecture. Section IV introduces our system implementation using a NetFPGA-10G board. Experimental results are presented in Section V. Finally, Section VI concludes the paper and introduces future work.

II. BACKGROUND AND RELATED WORK

In this section, we present background on DDoS attacks and DDoS countermeasures. Several systems proposed in the literature to defend against DDoS attacks are also discussed in this section.

Attackers often employ computers, or zombies, controlled by malicious software to create a botnet to perform DDoS attacks. They are usually motivated by incentives such as financial/economic gain, revenge, ideological belief, intellectual challenge or cyberwarfare [1]. Attacks carried out for financial gain are dangerous and hard to mitigate.

Zargar et al. [1] classified DoS/DDoS attacks into two categories: network/transport-level and application-level flooding attacks. Network/transport-level flooding attacks are performed by exploiting vulnerabilities of layer 2 to layer 4 in the Open Systems Interconnection (OSI) network model to exhaust the victim's network resources. This category includes flooding attacks, protocol exploitation flooding attacks, reflection-based flooding attacks, and amplification-based flooding attacks. Flooding attacks often exhaust network resources by consuming bandwidth or overburdening network devices. In protocol exploitation attacks, an attacker sends malformed packets, such as TCP SYN flood [5] [6] and TCP SYN/ACK flood [7], to confuse a victim. In reflection- and amplification-based attacks, an attacker broadcasts spoofed packets whose source address is the IP address of a specific victim to reflectors/amplifiers. Consequently, the responses are sent back to the victim and cause flooding (e.g., Smurf attacks, Fraggle attacks).

Title	Multi-core Architecture For Denial of Service (DoS)/Distributed Denial of Service (DDoS) Countermeasure Based on Reconfigurable Hardware
Author	Biet Nguyen Hoang
Supervisors	Assoc. Prof., Dr. Thinh Tran Ngoc, Dr. Cuong Pham Quoc
University	Ho Chi Minh City University of Technology
Major	Computer Science
Type	Master Thesis
Year of publication	2016
City	Ho Chi Minh City
