GRADUATE SCHOOL OF ARTS AND SCIENCE
M.A., Boston University, 2005
Submitted in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy
2007
Copyright 2006 by Guirguis, Mina
All rights reserved
UMI Microform 3240623
Copyright 2007 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company
300 North Zeeb Road, P.O. Box 1346, Ann Arbor, MI 48106-1346
2006
First Reader
Azer Bestavros, Ph.D.
Professor of Computer Science
Boston University
Second Reader
Ibrahim Matta, Ph.D.
Associate Professor of Computer Science
Boston University
Third Reader
Dina Katabi, Ph.D.
Assistant Professor of Computer Science
Massachusetts Institute of Technology
Looking back over the past few years, I am most certain that this work would not have been of any significance, if it was not for the support and help that I have received from many individuals during my graduate career at Boston University.
I am most grateful and truly indebted to my advisors, Azer Bestavros and Ibrahim Matta, for they have taught me how to be a computer scientist. They have supported me tremendously throughout my Ph.D. program and have been much more than advisors; they have been teachers, mentors and friends. They have always maintained an open door, offering their time and help, no matter how busy their schedules seemed. The time I have spent interacting with them has benefited me extremely, not just on the academic level but also on the personal level.
I would like to thank John Byers and Mark Crovella for their valuable feedback on different pieces of my research work, for supporting me with their recommendation letters and for serving on my thesis committee. I would also like to thank Dina Katabi, for her feedback on earlier versions of this thesis and for serving on my thesis committee.
I would also like to thank Abdelsalam Heddaya and Sonia Fahmy for their efforts throughout my job application process. They have supported me with recommendation letters and have always made sure to let me know of any opportunities that would be of interest to me.
I would like to specially thank Yuting Zhang for her help with some of the experimental work done in this thesis. Additional support for my research work has been provided by the National Science Foundation and by Fortress Technologies. I would like to thank Owais Hassan and Magued Barsoum for their support during my internships at Fortress and beyond.
I am very grateful to Fady Barsoum for his constant support as early as I could remember. He has invested great efforts in following up with my applications when I was applying for graduate schools and he has always been there for me when I needed advice.
brought me to this stage of my career and brought my thesis to completion.
I would also like to thank George Atia, Hany Morcos, Karim Mattar, Maria Mitsi, Dan Buzan, and Xiaoyu Jiang for their support during my Ph.D. program. Special thanks to Raymond Sweha for his help in submitting my thesis and paperwork for graduation. Finally, I would like to thank all members of the WING, NRG and BOSS for the wonderful times that I spent at Boston University.
ADAPTATION MECHANISMS
(Order No )
MINA GUIRGUIS
Boston University, Graduate School of Arts and Science, 2007
Major Professor: Azer Bestavros, Professor of Computer Science Department
ABSTRACT
One important consideration in realizing dependable computing systems and networks is to uncover vulnerabilities in their designs to adversarial attacks. Currently, the designs of these systems employ different forms of adaptation mechanisms in order to optimize their performance by ensuring that desirable properties, such as stability, efficiency and fairness, are not compromised. This thesis discovers and studies a new type of adversarial attack that targets such adaptation mechanisms by exploiting their dynamics of operation, i.e., the characteristics of their transient behavior. We coin this new breed of adversarial attacks Reduction of Quality (RoQ) attacks. The premise of RoQ attacks is to keep an adaptive mechanism constantly in a transient state, effectively depriving the system of much of its capacity and significantly reducing its service quality.
In this thesis we develop a general control-theoretic framework that provides a unified approach to modeling and vulnerability assessment of the dynamics underlying RoQ exploits. Within this framework, we introduce and formalize the notion of an attack “Potency” that capitalizes on the attacker’s best incentive: maximizing the marginal utility of its attack traffic. Unlike traditional brute-force Denial of Service attacks that aim to take down a system at any cost, RoQ attacks aim to maximize the damage inflicted on a system through consuming an innocuous, small fraction of that system’s hijacked capacity
a series of adaptation mechanisms that are commonly used in networking protocols, end-system admission controllers and load balancers. We assess the impact of RoQ attacks using analysis, simulations, and Internet experiments. We identify key factors that expose the tradeoffs between resilience and susceptibility to RoQ attacks. These factors could be used to harden adaptation mechanisms against RoQ exploits, in addition to developing new forms of countermeasures and defense mechanisms.
Dissertation Overview
2 RoQ Attack Definition and Premise
2.1 Attack Goal and Definition
2.2 Adaptation as an Optimization Process
Model Derivations
Numerical and Simulation Results
Internet Experiments and Implementation Results
Discussion
4 Distributed RoQ Attacks for Stealing Bandwidth
4.1 Attack Definition
4.1.1 RoQ Attack Construction
4.1.2 Selecting the Targeted Links
4.1.3 A Lower Bound on Zombies
5.3.2 An Outline for a Possible Defense Mechanism
5.3.3 RTO Randomization vs Randomized Attacker
6 RoQ Attacks on End-System Admission Controllers
Attack Definition
Model Derivations
7.2.1 An Upper Bound on Attack Potency
7.2.2 Potency for Dynamic Load Balancing
7.2.3 A Lower Bound on Attack Potency
8 Conclusions, Derivative Work and Future Work
Model parameters for a PI admission controller
Model parameters for load balancing policies
1-1 Thesis organization
2-1 A general block diagram for the adaptation mechanisms considered
2-2 An example of a pricing function and the effect of a RoQ attack
2-3 Attack potency versus attack peak rate for different Ω values
3-1 Block diagram showing the feedback control system for TCP and RED
3-2 Vulnerability assessment of TCP+RED and TCP+DropTail to RoQ attacks
3-3 Tuning RoQ attack parameters to maximize potency for link bandwidth
3-4 Tuning RoQ attack parameters to maximize potency for delay jitter
3-5 Setup for Internet experiments
3-6 RoQ attack potency as the attack period T is changed
3-7 RoQ attack potency as the attack duration τ is changed
4-1 Adversarial scheme considered for distributed RoQ attacks
4-2 A more detailed view of the adversarial scheme considered
4-3 A lower bound on zombies for different probabilities and degrees
4-4 The two-link topology used in ns-2 simulation experiments
4-5 Improvement in allocated bandwidth as the level of DoS attack increases
4-6 Improvement in allocated bandwidth as τ changes for a fixed T and δ
4-7 The five-link topology used in ns-2 simulation experiments
4-8 Throughput allocated to each flow from the BC flows
4-9 Setup for Internet experiments
4-10 Throughput allocated to the connection between C4 and S90
Under-utilization due to a Shrew attack at saturation
Under-utilization due to a Shrew attack at full buffer
Normalized potency versus buffer size for different attack variants
Under-utilization due to a RoQ attack at saturation
Under-utilization due to a Shrew attack at full buffer
Assessment of RTO randomization for different RTT connections
Impact of different ranges of randomization under periodic attack
Impact of different ranges of randomization under randomized attack
6-1 Block diagram for the components of the admission control feedback loop
6-2 Linearized instances for the web-server model functions
6-3 Numerical assessment of admission controllers to RoQ attacks
6-4 Setup for Internet experiments
6-5 Experimental assessment of admission controllers to RoQ attacks
7-1 A general setup for load balancing
7-2 Vulnerability assessment for proportional load-balancing to RoQ attacks
7-3 Vulnerability assessment for weighted load-balancing to RoQ attacks
7-4 Vulnerability assessment for least-loaded load-balancing to RoQ attacks
7-5 Attack potency under different balancing policies
7-6 Simulation result for the optimal balancing policy
7-7 Vulnerability assessment for proportional load balancers
7-8 Impact of feedback delay on the attack potency
7-9 Service degradation model as a linear function of the queue size
7-10 Impact of overhead/thrashing on attack potency
7-11 Changes in the queue size observed by a monitor
7-12 Impact of Ø on the admission ratio under proportional balancing policy
7-13 Experimental assessment for the proportional load-balancer (@ = 0.003)
7-15 Experimental assessment for the least-loaded load-balancer
7-16 Experimental assessment for load balancing policies to RoQ attacks
7-17 Effect of feedback update periods
AIMD  Additive Increase Multiplicative Decrease
AQM   Active Queue Management
DDoS  Distributed Denial of Service
DoS   Denial of Service
DTW   Dynamic Time Warping
ESM   End System Multicast
FRED  Flow Random Early Drop
IP    Internet Protocol
PI    Proportional Integral
RED   Random Early Detection
REM   Random Exponential Marking
RoQ   Reduction of Quality
RTT   Round-Trip Time
TCP   Transmission Control Protocol
UDP   User Datagram Protocol
VoIP  Voice over Internet Protocol
The Internet continues to play a vital role in our daily lives with a profound impact on our economy and on our society. This role has been the fruit of significant research efforts, visionary leadership, technological advances and major deployment efforts. In order to sustain this role and to open it up for novel possibilities, through new services and applications, the Internet has to provide a more secure infrastructure than the one that exists today.
Despite the efforts led by organizations, government agencies, industry and universities towards securing our computing systems and networks, security violations, such as Denial of Service (DoS) attacks¹ and virus/worm attacks, are still very common and their impact is quite significant.² Such exploits of computing systems and networks (and consequently research in system and network security for countermeasures to such exploits) have largely targeted static properties, namely characteristics or features of a system that are fairly independent of the system’s workload.
Security breaches due to computer worms and viruses exploit known bugs or bad software practices in the implementation of services and protocols. Examples include buffer overflows and related code-injection attacks, which were estimated to be the culprit for roughly 50% of the major security flaws over the last two decades [97]. Clearly, the characteristic property of the underlying system that permits such an exploit is “static” in
¹Currently, one of the greatest fears facing the US (and many other parts of the globe) is an Internet meltdown. Recent studies [83] have shown that the US is ill-prepared for a cyber-Katrina that is a result of a large-scale coordinated attack.
²The latest CSI/FBI report [30] indicates that among the 313 respondents, 65% reported virus security breaches and 25% reported being subjected to DoS attacks. The total loss from these two classes of attacks, over the 313 respondents, was more than $18M in 2005.
language used to implement such systems. Indeed, systems developed using memory-safe languages such as CCured and Cyclone [54, 74], or those using other defensive technologies such as StackGuard and ProPolice [29, 36], are not prone to such exploits.
Another example of exploits that target static properties of a computing system or network is DoS attacks. DoS attacks target the fixed capacity of a system component. An adversary bent on limiting access to a network resource could simply marshal enough client machines to bring down an Internet service by subjecting it to sustained levels of demand that far exceed its capacity, making that service incapable of adequately responding to legitimate requests.³
While such exploits may be viewed as mere nuisances (mounted for curiosity’s sake, benign acts of free speech, or even commercial advantage [97]), their impact on critical resources and services may cripple our increasingly cyber-dependent economy. Luckily, exploits of system static properties are not easy to mount because they do require control of a fairly large base [91, 77], e.g., 100K-200K zombie clients in the case of MyDoom [94]. More importantly, by their very nature, such attacks are easily anticipated, allowing countermeasures to be taken, including the collection of information that could be used to prosecute the attack perpetrators. Indeed, the ability to anticipate a DoS attack and/or to trace back its perpetrators are powerful deterrents.
But, what if victims of an attack cannot anticipate or even detect that they are under an attack? What if the attack’s purpose is not to necessarily cripple a service, but rather to inflict significant degradation in some aspect of the service (e.g., resource utilization, system stability, or service quality) or to gain an unfair advantage over competing parties using a shared infrastructure?
In this thesis, we identify and study a new breed of exploits that target the dynamics of
³In the most aggressive forms of these attacks, highly publicized for its alleged relationship to the “open source” community, MyDoom earned its malevolent moniker by crashing SCO Group’s Web site as the email-carried W32/Novarg.A, W32/Shimg, and W32/Mydoom worms mounted a widespread and record-setting Distributed DoS (DDoS) attack in the first minutes of February 1st, 2004 [94].
its limited steady-state capacity or some other known static feature, to achieve the above adversarial goals. In particular, we show that a determined adversary could, for example, bleed a system’s capacity or significantly reduce service quality by subjecting the system to a fairly low-intensity (but well orchestrated and timed) request stream that causes the system to become very inefficient, or unstable. We give examples of such attacks, which we term Reduction of Quality (RoQ; as in “rock”) attacks, on a number of common adaptive components in modern computing and networking systems. RoQ attacks stand in sharp contrast to traditional brute-force, sustained high-rate DoS attacks [20, 21], as well as other attacks that exploit specific static protocol settings [59].
The Challenge of Capturing/Taming System Dynamics: Internet systems may exhibit elaborate dynamic behaviors due to resource management strategies in general (e.g., scheduling and caching at individual hosts) and system adaptation strategies in particular (e.g., admission control and load balancing). These dynamics are quite hard to capture analytically or even empirically. As a result, our models of computing system components tend to abstract away such dynamics and focus instead on static properties obtained through aggregations over time scales that are long enough to hide the transients of adaptation; metrics used to monitor and evaluate a system’s performance (such as utilization, availability, response times, and jitter) are typically expressed as shapeless mean values, which do not give us insight into the inefficiencies caused by transients over time scales shorter than those used in measuring such metrics.
System dynamics could be “safely ignored” if one can ensure that such dynamics will not interfere, or that they will have negligible impact on the overall performance of the system. Such assurances are warranted for closed systems with predictable, non-adversarial workloads. However, in open environments, system dynamics cannot be “safely ignored” as they could be exploited by adversaries. In this thesis we show that such exploits are not only plausible, but that their impact could be significant. Notice that while system dynamics could be shown not to interfere with, or significantly impact the fidelity of an
the same could not be said for adversarially-engineered workloads.
The relatively little attention paid by computing system designers and practitioners to system dynamics stands in sharp contrast to how other engineered systems, such as electric and mechanical systems, are evaluated. For such systems, the characterization of system dynamics is front and center to protect against oscillatory behaviors and instabilities.
An Illustrative RoQ Exploit: Current adversarial strategies for DoS are brute force [21]. An attacker may render a system useless by subjecting it to a sustained attack workload (e.g., SYN attack) that far exceeds that system’s capacity. The result is that legitimate requests experience a much degraded response from a persistently overloaded system, or are even denied access altogether. Could an attacker achieve similar outcomes without persistently overloading the system? The answer is yes. To explain why this is the case, we give a simple illustrative example for the use of admission control, which is typically embedded as part of the infrastructure (e.g., as part of an HTTP server, database server, firewall and traffic shapers, among others).
The feedback delay inherent in the design of any admission controller constitutes the “Trojan Horse” through which a RoQ exploit could be mounted. Consider an admission controller that sets the admission rate to an end-host or service as a function of the utilization of its back-end [99, 100, 17]. Now, consider a point in time when offered load is low enough for the admission controller to allow a large percentage of all requests to go through. At this point, a surge in demand in a very short period of time would push the system into overload. This, in turn, would result in the admission controller shutting off subsequent legitimate requests for a long time given the fact that under overloaded conditions, the system operates in an inefficient region (e.g., due to thrashing). Once the system “recovers” from the ill-effects of this unsuspected surge in demand, an attacker would simply repeat the process. Albeit simplistic, this attack illustrates how adaptation strategies may be exploited by adversaries to reduce the system’s fidelity.
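The exploit described above can be made concrete with a toy discrete-time simulation. The sketch below is only an illustration under stated assumptions: the integral-style controller, the one-step feedback delay, and every parameter value (`capacity`, `legit_rate`, `queue_target`, `gain`, `burst`, `period`) are hypothetical and are not taken from the models developed later in the thesis. It also omits the thrashing effect mentioned in the text.

```python
def simulate(steps, burst=0.0, period=0, capacity=10.0,
             legit_rate=7.0, queue_target=20.0, gain=0.01):
    """Toy admission controller; returns total legitimate load admitted.

    All parameters are hypothetical values chosen for illustration.
    """
    queue = 0.0          # backlog at the back-end
    admit = 1.0          # fraction of arrivals currently admitted
    admitted_legit = 0.0
    for t in range(steps):
        # The attacker injects one short burst every `period` steps.
        attack = burst if period and t % period == 0 else 0.0
        # Arrivals are admitted using the ratio computed from the previous
        # step's backlog: this lag is the feedback delay being exploited.
        admitted_legit += admit * legit_rate
        queue = max(0.0, queue + admit * (legit_rate + attack) - capacity)
        # Integral-style update: lower the admission ratio while the backlog
        # exceeds its target, raise it otherwise (clamped to [0, 1]).
        admit = min(1.0, max(0.0, admit - gain * (queue - queue_target)))
    return admitted_legit


baseline = simulate(200)                          # no attack
attacked = simulate(200, burst=100.0, period=50)  # one burst every 50 steps
print(baseline, attacked)
```

In this toy setting the mean offered load, attack included, is 7 + 100/50 = 9 requests per step, which stays below the capacity of 10, yet legitimate requests are turned away after every burst while the controller recovers.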
The contributions of this thesis can be broadly classified into two categories: conceptual contributions and technical contributions. We elaborate on these contributions below.
1.1.1 Conceptual Contributions
Identifying vulnerabilities in adaptation mechanisms
Due to the open-access nature of the Internet, the design of many of its components relies on adaptation mechanisms in order to optimize their performance in the presence of unpredictable load conditions. Traditionally, adaptation mechanisms have been considered as solutions to given problems, as indicated by the many research efforts that focus entirely on more sophisticated adaptation mechanisms.
The main contribution of this thesis is to uncover vulnerabilities in these adaptation mechanisms against a new class of adversarial attacks. Specifically, this work orchestrates the types and patterns of adversarial input that would make adaptation harmful, and shows that the impact from such exploits is significant. It is through exposing such vulnerabilities that we would be able to build secure adaptation mechanisms.
Towards that goal, first, this thesis demonstrates that “throwing” adaptation at a problem, without security in mind, can potentially introduce another problem that is more serious. Indeed, a given system may be better off operating around (but not too close to) its optimal performance, than seeking optimal performance while opening a backdoor for adversarial attacks to compromise its performance, not to mention its fidelity and even availability. Second, it is essential to step back and re-examine the different adaptation mechanisms (and other forms of resource management techniques) currently present in our infrastructure.
Introducing and formalizing the notion of an attack potency
Previous assessments of adversarial attacks have focused solely on quantifying the damage inflicted on a resource, without accounting for the cost that was invested in order to inflict such damage. In this thesis, we introduce and formalize the notion of an attack
the “cost” of the attack. It is in the attacker’s best incentive to maximize the marginal utility of its attack traffic.
When assessing the impact of adversarial attacks, accounting for the cost has two important implications. First, it changes the nature of the problem, from the absolute sense of bringing a system down at any cost, to optimizing the attack pattern and parameters, to force the system to operate in its most inefficient regions. Second, minimizing the cost in general makes triggering attack detection mechanisms an even more difficult task. Traditional detection mechanisms are typically triggered when traffic, over long time-scales, constantly exceeds particular thresholds. Maintaining a low attack cost implies that traffic may not exceed such thresholds over long time-scales. Moreover, if one were to design detection mechanisms that would be triggered on a shorter time-scale, this could potentially lead to a high false-positive alarm rate due to the normal burstiness present in legitimate workloads. Recent work presented in [55] shows how attackers are actually moving away from flooding attacks to attacks that mimic the web browsing of a larger set of clients to evade detection. Other works include the low-rate Shrew attacks [59], where low-rate attack traffic can shut off TCP traffic through targeting the timeout mechanism. Thus, our notion of attack potency captures (and quantifies) the new trends in mounting adversarial attacks, where attackers seek to keep low profiles.
We parameterize our definition of potency to capture the aggressiveness of the attacker (i.e., the level of exposure risk that the attacker is willing to take). Such formalization extends and applies to scenarios and systems beyond those considered in this thesis. This enables the categorization of families of adversarial attacks based on aggressiveness. In this thesis, however, we focus on families of attacks that are less aggressive, in comparison to traditional DoS attacks.
Highlighting the design tradeoffs between resilience and susceptibility to RoQ attacks
In this thesis, we demonstrate the existence of design tradeoffs between resilience and susceptibility to RoQ attacks. In particular, adaptive mechanisms tuned to achieve the
attacks. On the other hand, if they were tuned to minimize the damage (or the potency) from RoQ attacks then their performance may be sacrificed during normal operation. For example, tuning the design parameters for faster response would reduce the impact of RoQ attacks, since the adaptation mechanism will respond aggressively to the adversarial workload. This would also mean, however, that the adaptation mechanism would still react aggressively to non-adversarial random noise. This is quite undesirable as it compromises the overall stability of the system.
Pinpointing the tradeoff between resilience and susceptibility to RoQ attacks has two important implications. First, such tradeoffs could be used to defend against RoQ attacks. For example, we illustrate an example where it is advantageous for the parameterizations of an adaptation mechanism to be adjusted on-line based on whether or not the system is suspected to be under a RoQ attack. Second, if these tradeoffs were taken into account, system designers and network operators may save significant research efforts to fine-tune parameters for performance, since such settings could be increasing the impact of RoQ attacks.
1.1.2 Technical Contributions
Developing a framework for studying dynamic exploits
In this thesis, we present a control-theoretic framework that enables studying and assessing the impact of dynamic exploits on adaptation mechanisms. This framework captures the interplay between the efficiency-load behavior of a resource and the adaptation mechanisms of both the resource and its consumers. The goal from employing these mechanisms is to converge the system into a stable operating point that maximizes some objective function (often referred to as Lyapunov function [78]).
Within this framework, we identify RoQ attacks as those aiming to hinder convergence, through careful orchestration of attack traffic. This is achieved by disturbing the pricing function of the resource, at the right times, so the overall system will constantly operate
other service degradations) through the potency metric. In this thesis, we instantiate this framework for three different adaptation mechanisms, currently employed in our Internet infrastructure. However, this framework can be used to study the impact of dynamic exploits on a larger subset of adaptation mechanisms.
Identifying and assessing vulnerabilities in current adaptation mechanisms
This thesis develops three analytical models for assessing the impact of RoQ attacks on adaptation mechanisms that are currently deployed in our communication networks and computing systems.
In our first application, we focus on transport protocols, specifically on the Transmission Control Protocol (TCP), which is the dominant transport protocol for Internet traffic. We present a modified fluid model to assess vulnerabilities in TCP’s Additive Increase Multiplicative Decrease (AIMD) mechanism against RoQ attacks. RoQ attacks are identified as those aiming to maximize the wasted bandwidth and delay jitter for legitimate connections, per unit attack burst. Through simulations and Internet experiments, we are able to study the impact of different parameters on the attack potency.
Our second application studies vulnerabilities in admission control mechanisms employed in server settings. Through a discrete-time model, we are able to assess the impact of RoQ attacks on a Proportional Integral (PI) admission controller. RoQ attacks are identified as those aiming to maximize the number of rejections for legitimate requests due to attack traffic. We validate our analysis through Internet experiments.
Our last application assesses vulnerabilities in a host of load-balancing adaptation mechanisms employed in server settings. Using queueing theory analysis and a discrete-time model, we give upper and lower bounds on the impact of RoQ attacks. RoQ attacks are identified as those aiming to maximize the response time for legitimate requests due to attack traffic. Again, we validate our analysis via simulation, numerical solutions and Internet experiments.
Figure 1-1 visually illustrates the organization of this thesis.
In Chapter 2, we introduce RoQ attacks and give a formal definition of the attack “Potency”. A general control-theoretic framework is then presented whereby the transients of adaptation are the result of an optimization process which forces the system to converge to steady-state. We then illustrate a case-study for the impact of RoQ attacks on a set of rate-controlled connections. Chapter 2 presents core material that would be instantiated in subsequent Chapters.
In Chapters 3, 4 and 5, we focus on studying RoQ attacks on adaptation mechanisms employed in networking settings. In particular, Chapter 3 studies vulnerabilities in networking transport protocols (e.g., TCP) against RoQ attacks. A dynamic fluid model is presented, based on an instantiation from the general framework presented in Chapter 2. The analytical results obtained are validated through extensive simulations and Internet experiments. Chapter 4 exposes a distributed RoQ attack scheme that would compromise a set of links so as to provide additional bandwidth to a particular set of flows. We illustrate the feasibility of this distributed scheme for RoQ attacks via simulations and Internet experiments. In Chapter 5, we focus on performance bounds on the impact of RoQ attacks (and other low-rate attacks such as the Shrew attacks [59]). We expose variants of these attacks, focusing on worst-case analysis via a simple discrete-time model. We also study the extent to which defense mechanisms are capable of mitigating the impact of the attack.
In Chapters 6 and 7, we switch gears to studying RoQ attacks on adaptation mechanisms employed in end-systems. In Chapter 6, we instantiate from the general framework presented in Chapter 2, an analytical discrete-time model to study RoQ attacks on adaptive admission control mechanisms. We present numerical results that are validated via Internet experiments. In Chapter 7, we instantiate a queuing model to capture the effect of RoQ attacks on a host of dynamic load-balancing policies. We present analytical results that are validated via simulations and Internet experiments.
We conclude the thesis with a summary, derivative work and directions for future work in Chapter 8.
Chapter 2
RoQ Attack Definition and Premise
This chapter gives a formal definition to a RoQ attack, emphasizing its novel conception of the attacker’s goal: namely to maximize damage per unit cost. A general framework is then presented whereby the transients of adaptation are the result of an optimization process which forces the system to converge to steady-state. The premise of RoQ attacks is to hinder such convergence. We then present a case study for RoQ attacks on a set of rate-controlled connections. This Chapter presents the core material that would be instantiated in subsequent Chapters for various applications we consider.
2.1 Attack Goal and Definition
We consider a RoQ attack comprising a burst of M requests or packets sent to a system element at the rate of δ requests/packets per second over a short period of time τ, where M = δτ. This process is repeated every T units of time. We call M the magnitude of the attack, δ the amplitude of the attack, τ the duration of the attack, and T the period of the attack.
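The four attack parameters can be captured in a few lines. The following is a minimal sketch, not code from the thesis: the class name `RoQPulseTrain`, its fields, and the numeric values are illustrative assumptions. It shows how the magnitude M = δτ and the long-run average rate M/T follow from the amplitude, duration and period.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoQPulseTrain:
    """Periodic attack burst (hypothetical helper for illustration)."""
    delta: float   # amplitude: requests (or packets) per second in a burst
    tau: float     # duration of each burst, in seconds
    period: float  # T: time between the starts of consecutive bursts

    @property
    def magnitude(self) -> float:
        # M = delta * tau: requests sent in one burst.
        return self.delta * self.tau

    @property
    def average_rate(self) -> float:
        # M / T: attack rate as seen over long time scales.
        return self.magnitude / self.period

    def rate_at(self, t: float) -> float:
        # Instantaneous attack rate at time t (seconds from the first burst).
        return self.delta if (t % self.period) < self.tau else 0.0


# Example values are arbitrary: 1000 req/s for 0.5 s, once every 10 s.
attack = RoQPulseTrain(delta=1000.0, tau=0.5, period=10.0)
print(attack.magnitude, attack.average_rate)
```

With these example values the peak rate is 1000 requests per second while the average over a period is only 50, which is why a monitor that averages over long time scales may see little of the burst.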
For the above RoQ attack, we define Π, the attack potency, to be the ratio between the damage caused by that attack and the cost of mounting such an attack. Clearly, an attacker would be interested in maximizing the damage per unit cost, i.e., maximizing the attack potency.
\[ \text{Potency} = \Pi = \frac{\text{Damage}}{\text{Cost}^{1/\Omega}} \tag{2.1} \]
The Potency definition given by Equation 2.1 does not specify what constitutes “damage” and “cost”. Clearly, one may consider various instantiations of these metrics. For example, for an attacker aiming to minimize a web server’s availability, a natural metric of “damage” would be the difference between the total number of requests admitted before and after the attack (excluding the attacker’s requests). If the attacker aims to maximize the jitter in the users’ observed response time, then a natural metric of “damage” would be the difference between the standard deviation of the time it takes to process a request before and after the attack. Similarly, there could be a number of different metrics for what constitutes “cost”. Examples include the effective attack request-rate (i.e., M/T), the attack amplitude δ, the attack duration τ, etc. Throughout this thesis, we will look at various instantiations for these metrics, depending on the system under investigation.
Notice that our potency definition could be used to study the impact of general classes of attacks that may not be necessarily targeting the adaptation mechanisms, since the presence of attack traffic would naturally lead to some form of damage to legitimate traffic. Our focus in this thesis is on quantifying the damage that is the result of adaptation mechanisms being present.
The Potency definition given by Equation 2.1 uses a parameter $\Omega$ to model the aggressiveness of the attacker, by scaling the “cost”. A large $\Omega$ reflects the highest level of aggression, i.e., an attacker bent on inflicting the most damage and for whom cost is not
a concern. Mounting a DoS attack is an example of such behavior. A small $\Omega$ reflects an attacker whose goal is to maximize damage with minimal exposure. Based on the value
of $\Omega$, one can identify families of attacks according to their aggressiveness. In this thesis, however,
we focus on stealthy RoQ attacks that are parameterized with a small value of $\Omega$ (e.g., 1
or 2). Notice that when $\Omega$ is chosen to be equal to 1, “damage” is compared directly to
“cost”.
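The effect of the aggressiveness parameter can be illustrated numerically. The sketch below assumes that potency scales the cost by the exponent $1/\Omega$, which is consistent with the description above ($\Omega = 1$ compares damage directly to cost, while a large $\Omega$ makes cost nearly irrelevant). The function name and the two sample attacks are hypothetical numbers chosen for illustration only.

```python
def potency(damage: float, cost: float, omega: float = 1.0) -> float:
    """Damage per unit cost, with cost discounted by the exponent 1/omega.

    Assumed form: damage / cost**(1/omega); omega models attacker aggressiveness.
    """
    if cost <= 0 or omega <= 0:
        raise ValueError("cost and omega must be positive")
    return damage / cost ** (1.0 / omega)


# Two made-up attacks: a low-rate one and a brute-force flood.
stealthy = {"damage": 40.0, "cost": 4.0}
flood = {"damage": 100.0, "cost": 100.0}

for omega in (1, 2, 100):
    print(omega, potency(**stealthy, omega=omega), potency(**flood, omega=omega))
```

Under these assumed numbers, the low-rate attack ranks higher for small $\Omega$, whereas the flood ranks higher once $\Omega$ is large enough that cost barely matters.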
2.2 Adaptation as an Optimization Process
Our approach to studying adaptation dynamics, in the presence of RoQ attacks, relies
on modeling the complex interplay between the efficiency-load behavior of a resource and the adaptation mechanisms of both the resource and its consumers. In an abstract way, resource adaptation can be viewed as the process of measuring the offered load and setting
a price, based on a pricing function. Consumer adaptation, on the other hand, would
be the process of observing the price and adjusting the demand accordingly. A price is
simply a measure of congestion, something that would hurt the consumers.¹ For example,
the packet loss rate experienced by a rate-controlled connection is considered a price, since the connection, on observing this packet loss rate, would decrease its sending rate accordingly.
The goal of employing these adaptation mechanisms is to bring the whole system into steady state. At steady state, the price set by the resource matches the load offered by the consumers. Figure 2-1 illustrates a general block diagram for the adaptation mechanisms considered for the resource and for the consumers. Notice that some of the systems we study in this thesis only have one of these forms of adaptation mechanisms present. For example, web clients may not necessarily adapt their request rate based on feedback from the web server. Similarly, a network router may not necessarily adjust its packet dropping probability based on the rates of connections traversing that link. For the remaining part
of this Chapter, we focus on the general case in which a given system employs both forms of adaptation mechanisms, for the resource and for the consumers.
Consider a resource subjected to multiple streams (from the consumers), each of which offers a load characterized by a rate $x_r(t)$ of requests or packets per second. In a network setting, $x_r(t)$ would represent the packet rate for a particular rate-controlled connection $r$ (e.g., in packets/sec). In a web server setting, $x_r(t)$ would represent the request rate for
a particular service $r$ (e.g., in hits/sec). The value of $x_r(t)$ is adapted based on feedback received from the system (equivalently, prices). In a network setting, that pricing feed-
¹While some studies have considered prices as in real money [65, 76], throughout this thesis, we will
focus on prices as measures of congestion.
[Figure 2-1: block diagram with blocks Resource (Plant), Pricing, and Consumers, and signals Demand/Load, Price, and Observed Price]
$$\frac{d}{dt} x_r(t) = \mathcal{I}\big(x(t), p(x(t))\big) - \mathcal{D}\big(x(t), p(x(t))\big) \qquad (2.2)$$
In analyzing the convergence of such a system to steady-state rates $x_r^*$, we resort to optimal control theory to show that the evolution of the system leads to optimizing some objective function, called the system’s Lyapunov function [78]. The basic idea is to find such a Lyapunov function $U(x)$ of the system state $x$ that is positive, continuous and strictly
concave, such that $\frac{d}{dt} U(x(t)) > 0$ if $x_r(t) \neq x_r^*$ and equals zero when $x_r(t) = x_r^*$ for all $r$.
The Lyapunov function $U(x)$ is generally of the form in Equation 2.3, where the first term represents the gain in request rates and the second term represents the associated costs (prices). Thus, by optimizing $U(x)$, the system optimizes its net gain.
²While we give examples of what constitutes a price in specific settings, other pricing functions are
certainly possible.
³We omit the subscript $r$ to indicate the whole set of rates from all streams.
Figure 2-2: An example of a pricing function and the effect of a RoQ
attack
Given that the system converges to a fixed point $x_r^*$, one would be interested in the
rate of convergence, as this will determine the speed with which transients subside. An
optimized RoQ exploit would leverage such transients of adaptation to knock the system off balance
whenever it is about to stabilize. Let $\mu$ denote the rate of convergence of the system; a higher value indicates faster convergence. Notice that for a linearized system of the form
$\dot{y} = Ly$, where $L$ is a matrix and $y$ is a vector of state variables, the smallest eigenvalue
of $L$ determines the rate of convergence, $\mu$.
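The link between the eigenvalues of $L$ and the speed at which transients subside can be checked numerically. The following is a small sketch under stated assumptions: $L$ is a hypothetical stable, symmetric 2x2 matrix (so its eigenvalues are real and negative), "smallest eigenvalue" is read as smallest in magnitude, and the trajectory is integrated with a plain forward-Euler scheme. None of the numbers come from the thesis.

```python
import math


def eigenvalues_2x2_symmetric(a, b, d):
    """Closed-form eigenvalues of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius


def convergence_rate(a, b, d):
    """Slowest decay rate of y' = L y: the smallest |eigenvalue| of a stable L."""
    lam_lo, lam_hi = eigenvalues_2x2_symmetric(a, b, d)
    if lam_hi >= 0:
        raise ValueError("L is not stable (an eigenvalue is non-negative)")
    return min(abs(lam_lo), abs(lam_hi))


def simulate_norm(a, b, d, y0, t_end, dt=1e-4):
    """Forward-Euler integration of y' = L y; returns the norm of y(t_end)."""
    y1, y2 = y0
    for _ in range(int(round(t_end / dt))):
        y1, y2 = y1 + dt * (a * y1 + b * y2), y2 + dt * (b * y1 + d * y2)
    return math.hypot(y1, y2)


L = (-2.0, 0.5, -1.0)  # hypothetical entries a, b, d of a stable system
mu = convergence_rate(*L)

# Once the fast mode has died out, |y(t)| decays like exp(-mu * t),
# so the log-ratio of norms two time units apart estimates mu.
observed = math.log(simulate_norm(*L, (1.0, 1.0), 10.0)
                    / simulate_norm(*L, (1.0, 1.0), 12.0)) / 2.0
print(mu, observed)
```

The late-time decay rate measured from the simulated trajectory should agree closely with the eigenvalue-based estimate, which is why the slowest mode governs how long the system needs to settle.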
The analysis we have conducted so far could be used to provide insights into the effect
of adversarial attacks that aim to exploit the optimization process that leads the system
to converge to steady-state rates $x_r^*$. We do so next.
Assume that the system had already stabilized to its steady-state $x_r^*$ values. Since a resource is used almost to its maximum capacity, the additional attack load is likely to push the resource towards saturation, where the fed-back prices are extremely high. Figure 2-2 (a) illustrates an example of a pricing function as the load on the system varies. Since
the RoQ attack involves a sustained rate of $\delta$ for $\tau$ units of time, the system will be pushed
to a new stable point, say $(x')^*$. Let $\mu'$ refer to the new rate of convergence to the new stable point (i.e., from $x^*$ to $(x')^*$). Since the capacity of the attacked resource is effectively reduced during the attack duration $\tau$, the resource pricing function is pushed to the left,
as shown in Figure 2-2 (a). Such higher prices result in faster convergence (i.e., higher $\mu'$) and lower $(x')^*$.
As soon as the system stabilizes to $(x')^*$, an optimized RoQ exploit would cease, allowing
the system to return to its original state $x^*$. This pattern then repeats as illustrated in
Figure 2-2 (b), in effect forcing the system to spend its time oscillating between different states, due to the presence and absence of the attack traffic. Note that in general, the attack traffic effectively destroys the “contractive” mapping property of the pricing function that
is essential to ensure convergence. The intuition behind the impact of different shapes of pricing functions has been considered in a technical note [47] that accompanies this thesis.
Having defined the RoQ exploit, we now turn our attention to assessing its potency
as defined by Equation 2.1. With respect to our analytical model parameters, one may
capture the “damage” caused by the attack using the expression $\delta\left(\frac{1}{\mu} + \frac{1}{\mu'}\right)$. Intuitively,
this expression represents the wasted capacity (or other service qualities such as delay and rate jitter, as we discuss later) during instability. The shaded area in Figure 2-2 (b) is a visual representation of the damage caused by the RoQ attack. Also, one may capture
the “cost” of the attack by $\delta / \left(\frac{1}{\mu} + \frac{1}{\mu'}\right)$. Intuitively, the cost increases with
the attacker’s peak rate and decreases with a longer attack period. We emphasize that the definition of potency allows for many other instantiations of “damage” and “cost” (more meaningful ones are used in later Chapters of this thesis) and that our specific choices above are for illustrative purposes.
Accordingly, we calculate potency using Equation 2.4, where $\Omega$ reflects the relative values that an attacker attributes to “damage” versus “cost”, or equivalently the desired level of aggression.
where $\kappa$ represents the gain of the system; $w_r$ represents the aggressiveness of the connection
in increasing its sending rate when it observes no feedback (a lower round-trip time (RTT) connection would be more aggressive); the second term represents the multiplicative decrease (as in the Additive Increase Multiplicative Decrease (AIMD) transmission rules of TCP). Notice that Equation 2.5 is an instantiation of Equation 2.2.
The link function $p_l(\cdot)$ reflects the prices (or costs) fed back to the sources as the input load on the link varies. Figure 2-2 (a) shows an example of a pricing function. Given such a positive, continuous, and increasing function, one could show that the Lyapunov function [78] of the system is given by [58]:
$$U(x) = \sum_{r \in R} w_r \log x_r - \sum_{l \in L} \int_{y=0}^{\sum_{s:\, l \in s} x_s} p_l(y)\, dy \qquad (2.6)$$
where $U(x)$ represents the net gain: the first term represents the gain in sending rates,
while the second term represents the associated costs. A Lyapunov stability analysis shows
that the system converges to a stable state $x^*$ that maximizes $U(x)$, i.e., $\frac{d}{dt} U(x(t)) > 0$ if
$x_r(t) \neq x_r^*$ and equals zero when $x_r(t) = x_r^*$ for all $r$.
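The steady-state rates $x_r^*$ can be obtained by equating to zero the following partial derivatives:
$$\frac{\partial U(x)}{\partial x_r} = \frac{w_r}{x_r} - \sum_{l \in r} p_l\Big(\sum_{s:\, l \in s} x_s\Big) \qquad (2.7)$$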
Given that this system is guaranteed to converge from any starting state $x_r(0)$, one would
be interested in the rate of convergence. If the system is perturbed around its steady state,
a linearized model (in terms of new variables $y_r(t)$, such that $x_r(t) = x_r^* + \sqrt{x_r^*}\, y_r(t)$) [...]
of the system along the diagonal. The smallest eigenvalue, $\mu$, can be used to calculate the
potency, as given by Equation 2.4.⁴
Figure 2-3 shows the potency plots for a 2-link tandem network used by three rate-controlled sources as the attack peak rate varies, for $\Omega = 1$ (labels on left y-axis) and for $\Omega = 2$ (labels on right y-axis). The pricing function of the first link is $p_1(y) = 0.2/(10 - y)$, and that of the second link is $p_2(y) = 0.5/(5 - y)$. One connection crosses both links, while each of the other two crosses only one link. The additive-increase parameters $w_r$ are taken
to be (0.5, 1.5, 1.0); the increase rate is lowest for the longest, 2-link connection.
⁴$\mu'$ would be calculated similarly, but for a system linearized around $(x')^*$, and with $x^*$ as a starting
state.
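The steady state of the 2-link example can be approximated by direct simulation. The sketch below is a rough illustration under several assumptions that are not stated in the text: the rate update takes the standard primal form $\dot{x}_r = \kappa\,(w_r - x_r \sum_{l \in r} p_l)$, the gain is $\kappa = 1$, the connection with $w = 1.5$ uses the first link and the one with $w = 1.0$ uses the second, and time is discretized with forward Euler. All identifiers are hypothetical.

```python
# Assumed topology: connection 0 crosses both links; connections 1 and 2
# cross the first and second link respectively (this assignment is a guess).
ROUTES = [(0, 1), (0,), (1,)]
W = [0.5, 1.5, 1.0]  # additive-increase parameters w_r from the text
KAPPA = 1.0          # system gain (assumed value)


def link_prices(load):
    """Pricing functions p_1(y) = 0.2/(10 - y) and p_2(y) = 0.5/(5 - y)."""
    if load[0] >= 10.0 or load[1] >= 5.0:
        raise ValueError("load reached link capacity; reduce the step size")
    return [0.2 / (10.0 - load[0]), 0.5 / (5.0 - load[1])]


def path_prices(x):
    """Sum of link prices along the route of each connection."""
    load = [0.0, 0.0]
    for rate, route in zip(x, ROUTES):
        for link in route:
            load[link] += rate
    p = link_prices(load)
    return [sum(p[link] for link in route) for route in ROUTES]


def simulate(x0=(0.1, 0.1, 0.1), t_end=100.0, dt=0.01):
    """Forward-Euler integration of dx_r/dt = kappa * (w_r - x_r * price_r)."""
    x = list(x0)
    for _ in range(int(round(t_end / dt))):
        q = path_prices(x)
        x = [xr + dt * KAPPA * (wr - xr * qr) for xr, wr, qr in zip(x, W, q)]
    return x


x_star = simulate()
# At steady state each connection satisfies w_r = x_r * (sum of its link prices).
residual = [wr - xr * qr for xr, wr, qr in zip(x_star, W, path_prices(x_star))]
print([round(v, 3) for v in x_star], residual)
```

A vanishing residual indicates that the simulated rates satisfy the steady-state condition obtained by setting the partial derivatives of the Lyapunov function to zero. An attack could be explored in the same sketch by adding a periodic load to one of the links, although that extension is not shown here.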
Figure 2-3: Attack potency versus attack peak rate for different $\Omega$ values
We observe that for a given value of $\Omega$, there is an optimal attack peak rate $\delta$ that optimizes potency $\Pi$. On the one hand, a low attack peak rate, while less costly, results
in minimal damage, and thus results in low potency. On the other hand, a higher attack peak rate, while resulting in higher damage, may be so costly that it results in lower potency. This suggests that an optimized RoQ attacker can achieve higher potency (i.e., higher damage per unit cost) by forcing the system into instabilities at the right times, injecting only the right amount of attack traffic. This is true for any level of attacker’s aggressiveness, $\Omega$.
Chapters 3, 6 and 7 will consider more elaborate analytical models to gain further insights into more complicated adaptation dynamics of specific systems that we could not cast in the generic optimization process we relied upon in this Chapter.
Chapter 3
RoQ Attacks on Network Transport Protocols
End system protocols (e.g., TCP) rely on feedback mechanisms to adapt their sending rates to match their “fair share” of network resources. TCP reduces its sending rate on packet loss/marking and increases its rate on successful packet transmission. Typically, the decrease in rate, which is needed to protect against wasting network utilization, is drastic (e.g., by halving the sending rate), whereas the increase in rate, which is needed
to probe for available bandwidth, is slow (e.g., by linearly increasing the sending rate over
time). Additive-Increase-Multiplicative-Decrease (AIMD) rules¹ ensure that flows react
adequately to congestion in a “friendly” manner to one another, hence the TCP-friendly label [45]. Moreover, these protocols react even more swiftly to excessive losses by completely shutting off their sending rates for a long period of time (e.g., timing out in TCP).
Buffer management schemes play an important role in the effectiveness of transmission control mechanisms as they constitute the feedback signal (by marking or dropping packets) to which such mechanisms adapt. In DropTail, an incoming packet to a full queue
is dropped; otherwise, it is queued. DropTail does not try to achieve any performance improvements, nor does it try to stabilize the queue size. Other Active Queue Management (AQM) techniques have been developed that try to maintain the queue size at a target level and employ probabilistic dropping (e.g., RED and its many variants [39, 61, 80],
PI [49] and REM [12]). Such techniques improve fairness and allow flows to send small bursts of packets without experiencing packet drops. Stabilizing the queue at a low target guarantees efficiency while minimizing jitter and round-trip times in general.
¹Other TCP-friendly increase/decrease rules have also been proposed and evaluated [15]. All would be
susceptible, to varying degrees, to the same issues we consider in this work.
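The interaction between AIMD and a DropTail buffer can be sketched in a few lines. This is a deliberately simplified toy, not a model of real TCP: time advances in rounds, the sender emits `int(rate)` packets per round, the buffer is assumed to be fully drained between rounds, and the increase and decrease constants (add 1, halve) are the textbook defaults. The names `aimd_update` and `DropTailQueue` are hypothetical.

```python
from collections import deque


def aimd_update(rate, congested, increase=1.0, decrease=0.5):
    """One AIMD step: add `increase` on success, multiply by `decrease` on loss."""
    return rate * decrease if congested else rate + increase


class DropTailQueue:
    """FIFO buffer that drops an arrival only when the buffer is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()

    def enqueue(self, packet):
        """Return True if the packet was queued, False if it was dropped."""
        if len(self.packets) >= self.capacity:
            return False
        self.packets.append(packet)
        return True

    def dequeue(self):
        """Remove and return the oldest packet, or None if the buffer is empty."""
        return self.packets.popleft() if self.packets else None


# Toy loop: one sender probing an 8-packet buffer that empties every round.
queue, rate, history = DropTailQueue(capacity=8), 1.0, []
for _ in range(40):
    accepted = [queue.enqueue(i) for i in range(int(rate))]
    lost = not all(accepted)           # any drop counts as a congestion signal
    while queue.dequeue() is not None:  # assumed: buffer fully served per round
        pass
    rate = aimd_update(rate, lost)
    history.append(rate)
print(history)
```

The resulting rate history shows the familiar sawtooth: a slow linear climb until the buffer overflows, followed by an abrupt halving. It is this asymmetry between slow recovery and drastic back-off that a well-timed burst can exploit.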
The adaptation strategies of transmission control protocols such as TCP, while crucial for alleviating congestion, make them vulnerable to losses that are generated through other processes, namely losses that are not the result of congestion (e.g., wireless losses). The impact of such losses on TCP performance was considered in many studies; examples include [13, 46]. In these studies, however, the processes interfering with TCP’s adaptation could be considered “non-adversarial” in the sense that the losses were more or less the result of (say) a random process as opposed to a calculated attack.
In this Chapter, we assess vulnerabilities in TCP/AQM adaptation mechanisms against RoQ attacks. We present experimental results obtained numerically using a more detailed control-theoretic model in which the dynamics of queue management (e.g., RED) as well
as TCP’s AIMD adaptation are explicitly modeled. This model is an instantiation of the general model presented in Chapter 2. We then validate our results through extensive
ns simulations in which other phenomena that are not captured in our analytical model are present (e.g., TCP slow-start and timeouts). Internet experiments are also presented, which confirm the feasibility of RoQ attacks and provide further validation of the insights
we gained from analysis and simulations. We discuss related topics at the end of the Chapter.
3.1 Attack Definition
Consider an attacker bent on maximizing the bandwidth wasted as a result of a RoQ
attack.² Thus, we define $b_d$, the wasted bandwidth, to be the difference between the achievable
throughput under normal conditions and the achievable throughput under a RoQ attack, both measured as the number of packets (or bytes) going through the link for legitimate traffic. Thus, $b_d$ quantifies the absolute “damage” resulting from the attack. Let $b_a$, the attack bandwidth, denote the bandwidth consumed by the attacker over the link under attack. Clearly, $b_a$ could be construed as the cost of the attack, for instance because the higher the value of $b_a$, the more likely it is that the attacker would be identified. Substituting
in Equation 2.1, we get the following definition of attack potency:
$$\Pi = \frac{b_d}{b_a^{1/\Omega}} \qquad (3.1)$$
²Later, we consider other attack objectives, e.g., maximizing jitter.
The above definition does not account for the total traffic sent by the attacker but only
for the attack traffic that is observable at the link under attack. We have also used an
alternative measure of cost that accounts for the total traffic injected by the adversary; the
resulting potencies were indistinguishable.
3.2 Model Derivations
We extend an analytical fluid model similar to that proposed in [48, 57, 62, 86] to assess the potential damage that could be inflicted by an adversarial exploitation of the adaptation dynamics of TCP + AQM (namely, AIMD + RED).
Figure 3-1: Block diagram showing the feedback control system for TCP
and RED
The feedback control system for TCP and RED is depicted in Figure 3-1. The pricing function of a RED router [41] is given by the relationship between the congestion marking probability $p(t)$ and the average queue size $v(t)$. The latter is an Exponentially-Weighted Moving Average (EWMA) of the instantaneous backlog buffer size $b(t)$, which in turn evolves as a function of both the sending rates of legitimate TCP sources $x_i(\cdot)$ and the attack rate $y(\cdot)$. Equations 3.2 and 3.3 capture the RED pricing function, where $\sigma$ and $\varsigma$ are the RED parameters given by $P_{max}/(B_{max} - B_{min})$ and $B_{min}$, respectively. In effect, the high-level goal of RED is to stabilize the queue size at a low value so as to minimize
delay, while maximizing throughput by always maintaining the queue size at a non-zero (small) threshold.
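The RED pricing function described above can be sketched as follows. This is a minimal illustration assuming the classic (non-gentle) linear RED profile: no marking below $B_{min}$, a linear ramp with slope $P_{max}/(B_{max} - B_{min})$ between the thresholds, and certain marking at or above $B_{max}$. The discrete EWMA step and all parameter values are assumptions for illustration; the thesis uses a continuous-time form that is not reproduced here.

```python
def red_marking_probability(avg_queue, b_min, b_max, p_max):
    """Linear RED profile: 0 below b_min, slope sigma in between, 1 from b_max on."""
    if not (0 <= b_min < b_max) or not (0 < p_max <= 1):
        raise ValueError("need 0 <= b_min < b_max and 0 < p_max <= 1")
    if avg_queue < b_min:
        return 0.0
    if avg_queue >= b_max:
        return 1.0
    sigma = p_max / (b_max - b_min)  # slope of the pricing function
    return sigma * (avg_queue - b_min)


def ewma(avg_queue, backlog, weight):
    """One discrete EWMA step of the average queue toward the instantaneous backlog."""
    return (1.0 - weight) * avg_queue + weight * backlog


# Hypothetical settings: thresholds of 50 and 150 packets, P_max = 0.1.
avg = 0.0
for backlog in [0, 200, 200, 200, 0, 0]:  # a short burst fills the buffer
    avg = ewma(avg, backlog, weight=0.2)
    print(round(avg, 1), red_marking_probability(avg, 50, 150, 0.1))
```

Because the average lags behind the instantaneous backlog, the marking probability keeps reacting after a short burst has ended, which gives a sense of why the timing of bursts matters to the feedback loop.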
[...] between the sender and the receiver for connection $i$, plus the queuing delay at the bottleneck
router. Thus $r_i(t)$ equals $D_i + b(t)/C$. We denote the propagation delay from sender $i$ to the
bottleneck by $D_{s_i,b}$, which is a fraction $\alpha_i$ of the total propagation delay, i.e., $D_{s_i,b} = \alpha_i D_i$.
The backlog buffer $b(t)$, which is equal to the input rate $x_i(\cdot)$ from the $m$ connections plus the attacker’s traffic $y(\cdot)$ minus the output link rate, evolves in accordance with Equation 3.4.
$$\frac{d}{dt} b(t) = \sum_{i=1}^{m} x_i(t - D_{s_i,b}) - \big(C - y(t - D_{a,b})\big) \qquad (3.4)$$
Notice that the input rates are delayed by the propagation delays from the senders and the attacker to the bottleneck, $D_{s_i,b}$ and $D_{a,b}$. As in the generic analytical model of Chapter 2,
we notice that during the attack duration the capacity of the resource is effectively reduced
by the attack peak rate.
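The effect of a burst on the bottleneck backlog can be illustrated with a simple numerical integration. The sketch below makes several simplifying assumptions that depart from the full model: propagation delays are ignored, the legitimate sending rates are held constant (no TCP reaction), the backlog is clamped between zero and the buffer size, and forward Euler is used. Function names and all numbers are hypothetical.

```python
def attack_rate(t, delta, tau, period):
    """Square-wave attack: rate delta during the first tau seconds of each period."""
    return delta if (t % period) < tau else 0.0


def simulate_backlog(rates, capacity, delta, tau, period, t_end,
                     dt=0.001, buffer_size=float("inf")):
    """Euler-integrate db/dt = sum(x_i) - (C - y(t)), clamped to [0, buffer_size].

    Returns a list of (time, backlog) samples. Delays are ignored (assumption).
    """
    backlog, trace = 0.0, []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        y = attack_rate(t, delta, tau, period)
        backlog += dt * (sum(rates) - (capacity - y))
        backlog = min(max(backlog, 0.0), buffer_size)
        trace.append((t + dt, backlog))
    return trace


# Two flows at 400 and 500 packets/s on a 1000 packets/s link; the attacker
# sends 600 packets/s for 0.1 s once per second (all values made up).
trace = simulate_backlog([400.0, 500.0], capacity=1000.0,
                         delta=600.0, tau=0.1, period=1.0, t_end=2.0)
peak = max(b for _, b in trace)
print(peak, trace[-1])
```

While the burst is on, the link behaves as if its capacity had dropped by the attack peak rate, so a queue builds even though the legitimate load alone is below capacity; once the burst ends, the queue drains at the spare capacity.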
According to TCP’s AIMD rules, the dynamics of TCP throughput for each of the $m$
connections can be described by the following differential equations:
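$$\frac{d}{dt} x_i(t) = \frac{x_i(t - r_i(t))}{r_i^2(t)\, x_i(t)} \Big(1 - p\big(t - D_{b,s_i}(t)\big)\Big) - \frac{x_i(t)\, x_i(t - r_i(t))}{2}\, p\big(t - D_{b,s_i}(t)\big), \quad i = 1, 2, \ldots, m \qquad (3.5)$$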
The first term represents the additive increase rule, whereas the second represents the multiplicative decrease rule. Both terms are multiplied by the rate of acknowledgments for the last window of packets, $x_i(t - r_i(t))$. In the above equations, the time delay from the bottleneck to sender $i$, passing through receiver $i$, is given by $D_{b,s_i}(t) = r_i(t) - D_{s_i,b}$.
As in Chapter 2, we define the attack traffic, $y(t)$, as the square wave given by Equation 3.6, where $\delta$ is the attack amplitude, $\tau$ is the attack duration and $T$ is the attack period.
By lower bound, we mean that in reality, as we will show in simulations and Internet experiments, the impact of a RoQ attack is likely to be even worse than the model predicts. This is so because it is reasonable to assume that the attack duration will be long enough for many connections not only to back off, but also to go into timeout/slow-start, which would increase the “damage” from the attack.
3.3 Numerical and Simulation Results
We are now ready to put the fluid model just developed to work by numerically solving for the attack potency. We present results of ns simulations [35] and Internet experiments (next section) that relax the simplistic assumptions of our model to capture various effects (e.g., slow-start, timeouts, cross traffic on multiple hops) and other queue management