11 GA-Based Verification of Network Protocols Performance

M. Baldi, F. Corno, M. Rebaudengo, Matteo Sonza Reorda, and Giovanni Squillero

Telecommunications Optimization: Heuristic and Adaptive Techniques, edited by David W. Corne, Martin J. Oates and George D. Smith. Copyright © 2000 John Wiley & Sons, Ltd. ISBNs: 0-471-98855-3 (Hardback); 0-470-84163-X (Electronic).

11.1 Introduction

Computer networks are gaining increasing importance as they penetrate business and everyday life. Technological evolution results in increased computational power and transmission capacity. These phenomena open the way to the development and exploitation of new applications (e.g. video conferencing) which require demanding services from the network. As a result, computer networks are in continuous evolution, and the protocols regulating their operation must be assessed for their suitability to new technologies and their ability to support the new applications. This assessment is particularly important since new technologies and applications often differ from the ones the protocol was designed for.

The effectiveness of a network solution depends on both the specifications of the protocols and their software and hardware implementations. Given leading hardware and software technologies, the protocol specification thus plays a crucial role in the overall network performance. Some aspects are of utmost importance: the correctness of the protocol, i.e. the guarantee that it shows the intended behavior in any specific situation; its performance, i.e. the utilization of the available bandwidth it is able to achieve over the physical medium; and its robustness, i.e. its ability to work correctly under abnormal conditions. The analysis of these different aspects of communication protocols is usually carried out with the following techniques:

Formal modeling: to verify some properties of the protocol. This requires modeling the protocol behavior in a formal description language, such as LOTOS, and running a formal verification tool. Description languages are nowadays powerful enough to capture most of the protocol semantics; however, verification tools are often quite limited in the size and complexity of the descriptions they can deal with. Some important aspects of a network protocol that are difficult to verify are those related to different time-out counters, message queues, and transmission strategies based on message contents.

Simulation: to check the protocol in its entire complex behavior and explore the dependencies among the various hardware and software components of a computer network. A typical simulator allows the actions taken by a protocol to be monitored while it operates over a computer network with a specified topology and traffic load. The simulator models a number of sources that can be programmed to generate traffic according to statistical models derived from real sources. One or more sources behave according to the protocol under study; the others are considered background traffic generators. Poisson statistical processes (Hui, 1990) are often adopted as a model of background traffic.

We feel that neither approach alone gives enough information about the protocol performance. While the formal approach delivers a mathematically proven answer, it is forced to work on an over-simplified view of the protocol. Traditional simulation techniques, on the other hand, rely on a clever choice of the traffic pattern to yield useful results, and worst-case analysis is often performed by hand-designing the traffic.
We propose the adoption of a mixed simulation-based technique that relies on a Genetic Algorithm (GA) (Holland, 1975) to explore the solution space and look for an inconsistency in the verification goal. As a case study problem, we chose the performance verification of the Transmission Control Protocol (TCP) (Postel, 1981), a widely used computer network protocol. The TCP protocol was chosen because its characteristics and behavior are already well known to the computer network research community; our goal is to focus on the verification methodology and the exploitation of the GA, not on the protocol itself.

In this work, we show how to integrate a GA with a network simulator to drive the generation of critical background traffic. The GA aims at generating the worst-case traffic for the protocol under analysis, given some constraints on the traffic bandwidth. Errors and weaknesses in the network protocol can therefore be discovered via simulation of this worst-case traffic. We expect the GA to drive the network to conditions where the protocol performance (e.g. the effective transfer rate) is low, so that we can study the behavior of the protocol under extreme load conditions.

In previous work (Alba and Troya, 1996), GAs were applied to the analysis of communication protocols by checking the correctness of their Finite State Machine (FSM). Communicating protocol entities were expressed as a pair of communicating FSMs, a GA generated communication traces between the two protocol entities, and an FSM simulator executed them. The GA aimed at detecting deadlocks and useless states. This approach showed several limitations, mainly due to the over-simplifications necessary to express a real protocol as a mathematical model such as a finite state machine.
By contrast, our approach, being based on a network simulator, is able to model most of the aspects of the protocol stack and of the network on which it is employed: not just the state machine, but also counters, message queues, routing tables, different retransmission speeds, and so on. The GA generates the worst operating conditions for the protocol (by defining the load in the network) and finds the traffic configuration that minimizes its performance. During this analysis, the protocol is stressed enough to achieve high degrees of confidence and, informally, we can say that both its functionality and its performance are verified. Some preliminary experimental results show that the traffic generated by the GA is able to expose protocol shortcomings that were impossible to reveal with formal techniques, and that were not found by means of statistical simulation.

11.2 The Approach

In this work we choose to examine the TCP protocol, whose characteristics and behavior are already well known to the computer network research community, in order to better focus on the verification methodology and the exploitation of the GA, not on the protocol itself. Hence, we do not expect to discover any new information about TCP behavior and performance, but rather to find in an automated way well-known information that, with traditional approaches, requires skills and in-depth knowledge of the protocol.

We aim at studying the TCP protocol in realistic operating conditions. We set up an IP (Internet Protocol) network (Stevens, 1994) and a number of TCP probe connections (i.e. sender-receiver pairs). The network is loaded with background traffic generated by User Datagram Protocol (UDP) (Postel, 1980) sender-receiver pairs. We chose the UDP protocol for the sake of simplicity, but other kinds of background traffic sources can be modeled, too.
During the analysis process, the GA provides a pattern for the traffic generated by the background sources and a network simulation is run on the given topology. During this simulation, relevant data are gathered from the probe connections by the simulator program and provided to the GA, which uses them to estimate the ‘damage’ that the background traffic caused. This information is then used to drive the generation of the traffic patterns used in subsequent steps of the algorithm.

As with statistical methods, our analysis requires a large number of simulation runs to obtain a significant result; the choice of the simulation program is therefore critical in order to keep the computational time small. In our work, simulations were run using a publicly available simulator called insane (Internet Simulated ATM Networking Environment; Mah, 1996). Insane adopts an event-driven mechanism for the simulation, thus being reasonably fast. Moreover, it is able to deal with an arbitrary network topology described as a TCL (Welch, 1995) script, and many characteristics of network nodes and links (such as processing speed, buffering capacity, bandwidth and latency) can be controlled.

Figure 11.1 shows the architecture of the proposed verification environment, called Nepal (Network Protocol Analysis Algorithm), which integrates the GA and the simulator. The approach is quite general and can be applied to different protocols using different simulators with limited effort.

Figure 11.1 Architecture of the Nepal system.

11.3 Genetic Algorithms

The field of Genetic Algorithms (GAs) has been growing since the early 1970s, but only recently have GAs been applied to real-world applications that demonstrate their commercial effectiveness. Holland (1975) invented GAs to mimic some of the processes observed in natural evolution: evolution, natural selection, reproduction.
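The GA-simulator loop described above can be illustrated with a short driver sketch. Everything below is ours, not Nepal's actual Perl code: the background-traffic file format, the command line and the assumption that the simulator prints the delivered byte count on its last output line are hypothetical placeholders for the real (modified) insane interface.

```python
import subprocess

def run_simulation(individual, topology_script="topology.tcl"):
    """Simulate one background-traffic pattern and return the total bytes
    delivered to the TCP users (B_TCP) on the probe connections.

    Hypothetical interface: the file format, the command line and the
    output parsing below stand in for the real (modified) simulator.
    """
    with open("background.txt", "w") as f:
        for tag, delay in individual:          # one gene per line
            f.write(f"{tag} {delay}\n")
    result = subprocess.run(
        ["insane", topology_script, "background.txt"],
        capture_output=True, text=True, check=True)
    return int(result.stdout.strip().split()[-1])

def fitness(individual, simulate=run_simulation):
    """The background traffic scores higher the more it hurts the probe
    connections: fitness is inversely proportional to delivered bytes."""
    b_tcp = simulate(individual)
    return 1.0 / b_tcp if b_tcp > 0 else float("inf")
```

Passing the simulation routine as a parameter keeps the fitness definition testable with a stub in place of a real (and costly) simulation run.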
GAs have been investigated as a possible way to tackle many search and optimization problems the way nature does, i.e. through evolution. Evolution operates on chromosomes, the organic elements encoding the structure of living beings. Processes of natural selection cause the chromosomes that encode successful individuals to reproduce more frequently than those of the less fit. In the process of biological reproduction, chromosomes coming from the parents are combined to generate new individuals through crossover and mutation mechanisms.

In the GA field, these biological concepts are mapped as follows. Each solution (individual) is represented as a string (chromosome) of elements (genes); each individual is assigned a fitness value, based on the result of an evaluation function. The evaluation function measures the chromosome's performance on the problem to be solved. The way of encoding solutions to the problem on chromosomes and the definition of an evaluation function are the mechanisms that link a GA to the problem to be solved. As the technique for encoding solutions may vary from problem to problem, a certain amount of art is involved in selecting a good encoding technique.

A set of individuals constitutes a population that evolves from one generation to the next through the creation of new individuals and the deletion of some old ones. The process starts with an initial population created in some way, e.g. randomly. A GA is basically an algorithm which manipulates strings of digits: like nature, GAs solve the problem of finding good chromosomes by blindly manipulating the material in the chromosomes themselves.

11.3.1 The Genetic Algorithm in Nepal

The pseudocode of the GA adopted within Nepal is shown in Figure 11.2. When the GA starts, N_p individuals are randomly generated.
In our approach, an individual represents a background traffic pattern. The background traffic corresponding to the initial population is generated according to a Poisson process, i.e. with exponentially distributed inter-arrival times between packets. At each generation the GA creates N_o new individuals by applying a crossover operator to two parents. Parents are randomly chosen with a probability which increases linearly from the worst individual (smallest fitness) to the best one (highest fitness). A mutation operator is applied to each new individual with probability p_m. At the end of each generation, the N_p individuals with the highest fitness are selected for survival (elitism), and the worst N_o ones are deleted from the population.

The fitness function measures the probe connections' throughput, i.e. the performance of the probe TCP connections as perceived by end-users during the simulation experiment. Bytes successfully received at the TCP level but not delivered to end-users, such as duplicated packets received due to the retransmission mechanism of the protocol, are therefore not counted.

    P_0 = random_population(N_p);
    compute_fitness(P_0);
    i = 0;                       /* current generation number */
    while (i < max_generation) {
        A = P_i;
        for j = 0 to N_o {       /* new element generation */
            s'  = select_an_individual();
            s'' = select_an_individual();
            s_j = cross_over_operator(s', s'');
            if (rand() ≤ p_m)
                s_j = mutation_operator(s_j);
            A = A ∪ {s_j};
        }
        compute_fitness(A);
        i++;
        P_i = { the N_p best individuals ∈ A };
    }
    return( best_individual(P_i) )

Figure 11.2 Pseudocode of the genetic algorithm in Nepal.

The fitness function should increase with the increasing goodness of a solution, and a solution in Nepal is good when the background traffic pattern is critical; therefore, the fitness function we defined is inversely proportional to the total number of bytes perceived by the end-users.
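The generational loop of Figure 11.2 can be sketched in Python as follows. The function names are ours, and the linear rank weighting in the selection routine is one plausible reading of a selection "probability which linearly increases from the worst individual to the best one"; evaluating an individual would, in Nepal, cost one full network simulation.

```python
import random

def linear_rank_select(population, fitnesses):
    """Pick one parent with probability increasing linearly from the
    worst individual (rank 1) to the best one (rank n)."""
    order = sorted(range(len(population)), key=lambda i: fitnesses[i])
    ranks = list(range(1, len(population) + 1))     # weight = rank
    chosen = random.choices(order, weights=ranks, k=1)[0]
    return population[chosen]

def evolve(initial_population, evaluate, crossover, mutate,
           n_o=20, p_m=0.5, max_generation=500):
    """Generational loop mirroring Figure 11.2: create N_o offspring,
    then keep the N_p fittest of parents plus offspring (elitism)."""
    n_p = len(initial_population)
    population = list(initial_population)
    fitnesses = [evaluate(ind) for ind in population]
    for _ in range(max_generation):
        pool = list(zip(population, fitnesses))
        for _ in range(n_o):
            s1 = linear_rank_select(population, fitnesses)
            s2 = linear_rank_select(population, fitnesses)
            child = crossover(s1, s2)
            if random.random() <= p_m:
                child = mutate(child)
            pool.append((child, evaluate(child)))
        # Survival of the N_p best; the worst N_o are deleted.
        pool.sort(key=lambda pair: pair[1], reverse=True)
        population = [ind for ind, _ in pool[:n_p]]
        fitnesses = [fit for _, fit in pool[:n_p]]
    return max(zip(population, fitnesses), key=lambda pair: pair[1])[0]
```

Because survivors are chosen from parents and offspring together, the best individual found so far can never be lost, whatever the crossover and mutation operators do.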
Hence, the fitness is 1/B_TCP, where B_TCP is the total number of bytes delivered to the TCP users on all probe connections. This simple measure already delivers satisfactory results on simple network topologies. For different problems and different goals, additional parameters can easily be included in the fitness function.

11.3.2 Encoding

Each individual encodes the description of the traffic generated by all the background connections for the whole duration of the simulation. A connection is specified by a UDP source and destination pair, while the route followed by packets depends upon the network topology and the routing mechanism. We assume that all the packets sent on the network are of the same size, hence individuals do not need to encode this information for each packet.

Individuals are encoded as strings of N genes, where N is constant during the whole experiment. Each gene represents a single background packet and is composed of two parts: TAG, which indicates which background connection the packet will be sent on; and DELAY, which represents how long the given source will wait before sending a new packet after sending the current one.

A model commonly adopted for background traffic in statistical network analysis is a Poisson process with negative exponentially distributed inter-arrival times (Hui, 1990), and we want the delays in randomly generated individuals to be exponentially distributed. For this reason, the DELAY field represents an index into an array of values exponentially distributed from 0 to K seconds with an average value of 0.4*K, where K is called the Poisson distribution's parameter. Please note that we make no assumptions on the number of genes with a given TAG; however, N is usually large enough to make the initial random background traffic almost equally distributed among the sources.
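To make the encoding concrete, here is a sketch of how the DELAY lookup table and a random individual might be built. The chapter does not say how the table's values are generated: using quantiles of an exponential distribution truncated at K, with the rate found by bisection so that the mean comes out near 0.4*K, is our assumption, as is the table size of five entries (suggested by the DELAY indices 0-4 shown in Figure 11.3).

```python
import math
import random

def delay_table(K, size=5):
    """Build the array indexed by the DELAY field: `size` values between
    0 and K seconds whose average is close to 0.4*K.

    Construction (assumed): quantiles of an exponential distribution
    truncated at K, with the rate lam found by bisection.
    """
    lo, hi = 1e-6, 100.0 / K          # bracket for the rate lam
    for _ in range(100):
        lam = (lo + hi) / 2
        x = lam * K
        mean = K * (1.0 / x - 1.0 / math.expm1(x))  # truncated-exp mean
        if mean > 0.4 * K:
            lo = lam                  # mean too large: increase the rate
        else:
            hi = lam
    cdf_k = 1.0 - math.exp(-lam * K)  # mass of the untruncated tail
    # Evenly spaced quantiles of the truncated distribution.
    return [-math.log(1.0 - cdf_k * (i + 0.5) / size) / lam
            for i in range(size)]

def random_individual(n_genes, n_connections, table_size=5):
    """A chromosome is a string of N genes, each a (TAG, DELAY-index) pair."""
    return [(random.randrange(n_connections), random.randrange(table_size))
            for _ in range(n_genes)]
```

Because every gene draws its TAG uniformly, a large N makes the initial random traffic almost equally distributed among the background sources, as noted above.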
The crossover operators must be able to combine good properties from different individuals, and in Nepal each individual embeds two different characteristics: the background traffic as a function of time (time domain) and the background traffic as a function of the background connection (space domain). It is important to maintain the conceptual separation of these two characteristics because, for instance, an individual could represent good background traffic during a given period of the simulation experiment, or good traffic for a given background connection, or both.

It is possible that the genes with a given TAG are not enough to generate traffic for the whole experiment; in this case the connection remains idle in the final part of the experiment. On the other hand, it is also possible that the simulation ends before all the packets are sent, and the final genes are ignored.

Figure 11.3 shows, on the left-hand side, a possible individual and, on the right-hand side, the corresponding background traffic generated during the simulation experiment. The table mapping the DELAY field to exponential delays is reported in the middle of the figure. In the individual shown in Figure 11.3, packet number 9 is not sent because the simulation is over before the delay associated with gene number 6 expires. On the other hand, one gene with the third tag is missing, since the delay associated with the last packet on that connection (packet number 7) expires before the simulation's end.

Figure 11.3 Individual encoding.

Figure 11.4 Crossover operators.

11.3.3 Crossover Operators

The GA in Nepal implements four types of crossover: the widely adopted one-cut, two-cuts and uniform, plus a peculiar one called mix-source (Figure 11.4).

One-cut crossover: let one cut point c be a randomly selected point between 1 and N.
Nepal generates a new individual by copying the first c genes from the first parent and the subsequent N - c ones from the second parent.

Two-cuts crossover: let two cut points c_1 and c_2 be randomly selected between 1 and N with c_1 < c_2. Nepal generates a new individual copying the first c_1 and the last N - c_2 genes from the first parent. The middle c_2 - c_1 genes are copied from the second parent.

These two crossover operators generate a background traffic that is equal to the traffic generated by one parent for a given time, and then suddenly changes and follows the traffic generated by the other parent (strictly speaking, it is possible that the cut points do not correspond exactly to the same time instants in the two parents; however, N is large enough for this to be neglected).

Uniform crossover: Nepal generates a new individual taking each gene from the first or the second parent with the same probability.

Mix-source crossover: each TAG (i.e. each background source) is randomly associated with one of the two parents. Nepal generates a new individual by copying all the genes with a given TAG from the associated parent. The new individual can be longer or shorter than N: in the former case the exceeding genes are dropped; in the latter the individual is filled with random genes. The mix-source crossover aims at putting together good background sources from different individuals.

We can say that the mix-source crossover acts in the space (or source) domain while the one-cut and two-cuts ones act in the time domain. The uniform crossover is hybrid and acts in both domains.

11.3.4 Mutations

Nepal implements three different mutation operators: speed-up, slow-down and random.
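Under the gene representation above (a gene as a (TAG, DELAY-index) pair), the four operators can be sketched as follows. The helper names and the five-entry delay table are our assumptions, and a mix-source child would still be re-sorted in time (Section 11.3.5) before use.

```python
import random

def one_cut(p1, p2):
    """Copy the first c genes from p1, the remaining N - c from p2."""
    c = random.randint(1, len(p1) - 1)
    return p1[:c] + p2[c:]

def two_cuts(p1, p2):
    """Copy the middle c2 - c1 genes from p2, the rest from p1."""
    c1, c2 = sorted(random.sample(range(1, len(p1)), 2))
    return p1[:c1] + p2[c1:c2] + p1[c2:]

def uniform(p1, p2):
    """Take each gene from either parent with equal probability."""
    return [random.choice(pair) for pair in zip(p1, p2)]

def mix_source(p1, p2, n_connections, table_size=5):
    """Assign each TAG to one parent at random and copy every gene with
    that TAG from the assigned parent; then trim to N or pad with random
    genes so the child keeps the fixed length N."""
    n = len(p1)
    owner = {tag: random.randrange(2) for tag in range(n_connections)}
    child = ([g for g in p1 if owner[g[0]] == 0]
             + [g for g in p2 if owner[g[0]] == 1])
    child = child[:n]                       # drop exceeding genes
    while len(child) < n:                   # fill with random genes
        child.append((random.randrange(n_connections),
                      random.randrange(table_size)))
    return child
```

Note how mix-source recombines whole per-connection traffic streams (space domain), whereas one-cut and two-cuts splice the chromosome at points that roughly correspond to time instants (time domain).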
A mutation applies to N_m < N contiguous genes, where N_m is a constant. The first gene is randomly selected between 1 and N and the gene list is considered circular.

Speed-up mutation: decreases the DELAY field by one, without modifying the TAG field. DELAY values are considered circular.

Slow-down mutation: increases the DELAY field by one, without modifying the TAG field. DELAY values are considered circular.

Random mutation: substitutes each gene with a new random one.

After each mutation, the genes in the individual need to be sorted to restore the 'sorted in time' property.

11.3.5 Univocal Representation

A general guideline for GAs states that different chromosomes should represent different solutions. Otherwise, a GA would waste some of its time generating and evaluating identical solutions and, even worse, the population could prematurely converge to a state where chromosomes are different but solutions are equal, and where genetic operators have no effect.

To ensure that a given background traffic is determined by a unique chromosome, genes are sorted to guarantee that the packet associated with gene i+1 is never sent before the packet associated with gene i. We say that the genes in the individual are 'sorted in time'. Otherwise, it would be possible to exchange adjacent genes with different TAGs without modifying the represented background traffic (e.g. packet number 2 and packet number 3 in Figure 11.3).

Moreover, some crossover operators aim at combining good characteristics in the time domain. If an individual did not fulfill the 'sorted in time' property, it would be hard for such operators to preserve properties related to that domain, and they would act almost in the same way as the common uniform crossover. Our experiments show that the performance of the GA would be significantly reduced. Chromosomes therefore need to be re-sorted in time after the crossover and mutation operators.
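Continuing the sketch (names are ours, not Nepal's): a mutation touches N_m contiguous genes starting from a random position, with both the gene list and the DELAY values treated as circular, and the result is then re-sorted in time. The send time of a packet is reconstructed by accumulating, per connection, the delays of the preceding packets on that connection; this reconstruction is an assumption consistent with the DELAY semantics of Section 11.3.2.

```python
import random

def sort_in_time(genes, delay_table):
    """Restore the 'sorted in time' property: reorder genes by the
    absolute send time of their packets, so the packet of gene i+1 is
    never sent before the packet of gene i."""
    clock = {}                                   # next send time per TAG
    timed = []
    for tag, delay in genes:
        t = clock.get(tag, 0.0)
        timed.append((t, (tag, delay)))
        clock[tag] = t + delay_table[delay]      # wait DELAY, then send again
    timed.sort(key=lambda pair: pair[0])         # stable: ties keep order
    return [gene for _, gene in timed]

def mutate(individual, delay_table, kind, n_m=100,
           n_connections=2, table_size=5):
    """Apply a speed-up, slow-down or random mutation to n_m contiguous
    genes (the gene list is circular), then re-sort in time."""
    n = len(individual)
    genes = list(individual)
    start = random.randrange(n)
    for k in range(n_m):
        i = (start + k) % n
        tag, delay = genes[i]
        if kind == "speed-up":
            genes[i] = (tag, (delay - 1) % table_size)   # circular DELAY
        elif kind == "slow-down":
            genes[i] = (tag, (delay + 1) % table_size)
        else:                                            # random mutation
            genes[i] = (random.randrange(n_connections),
                        random.randrange(table_size))
    return sort_in_time(genes, delay_table)
```

Using a stable sort on send times means that genes whose packets leave at the same instant keep their relative order, so re-sorting an already sorted chromosome leaves it unchanged.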
11.4 Experimental Results

We developed a prototypical implementation of the algorithm using the Perl 5 (Wall et al., 1996) language. The implementation of the GA is less than 500 lines long and, with limited effort, can be ported to different simulators and different platforms. The source of the simulator consists of 38,370 lines of C++ code. We needed to modify insane to handle an externally generated background traffic; the modifications to the original code are limited to about 100 lines and allow the traffic source to be completely defined by the user.

11.4.1 Network Topology

Figure 11.5 shows the topology of the IP network exploited in the experiments. Three TCP connections span from the transmitters TXi to the receivers RXi through three IP routers. Each TCP connection performs long file transfers, generating a series of 1024 Kbyte messages at a maximum mean rate of 1.33 Mb/s. Thus, the three TCP sources can generate an overall load of 4 Mb/s, if this is allowed by the TCP flow control mechanism. Acknowledgments to each transmitter are carried by messages flowing in the reverse direction. These TCP connections represent the probe connections of the experiment.

Two sources (BSi) generate background UDP traffic directed to their respective destinations (BDi) over the same links traversed by the TCP connections. The timing of the background packets is controlled by the GA, as described earlier. The background traffic is generated only in the forward direction of the TCP connections, thus only data messages can be delayed and discarded due to network overload, not acknowledgments.

In this topology, the link between the two right-most routers represents a bottleneck for the network, since it is traversed by all the traffic generated by both the background sources and the TCP connections. Each link shown in Figure 11.5 has a capacity of 10 Mb/s in each direction and introduces a fixed 10 µs delay.
For example, such links can be thought of as dedicated 10 Mb/s Ethernet trunks, with the 10 µs latency accounting for the propagation delay on the wires and the switching delay introduced by an Ethernet switch.

Figure 11.5 Topology of the network.

Routers introduce a fixed 0.1 ms delay component which accounts for the processing required on each packet and adds to the queuing delay. The size of the output queues modeled by insane is given in terms of number of packets (we used 64-packet queues in our experiments). The buffering capacity of real routers is given in terms of byte volume, i.e. the number of packets that can be stored in a buffer depends upon their size. We chose to deal with fixed-size packets to work around this limitation of the simulator program.

11.4.2 Parameter Values

All the parameters used by the GA are summarized in Table 11.1, together with the values used in our experiments. The parameters K, N and T are tuned in order to model a 6 Mb/s background traffic level because, in this situation, the probe connections' traffic (4 Mb/s) plus the background traffic is nearly equal to the links' bandwidth (10 Mb/s). With a smaller load, the background traffic would be unable to interfere significantly with the probe connections, while with a larger one the whole network would be congested and the TCP throughput would be very small, even with a completely random traffic.

Table 11.1 Genetic algorithm parameter values.

    Parameter                                  Name    Value
    simulation duration [s]                    T       3
    max number of generations                  G       500
    population size                            N_p     50
    offspring size                             N_o     20
    number of genes in each individual         N       4,500
    Poisson distribution's parameter [ms]      K       6.67
    mutation's probability                     p_m     50%
    number of genes affected by a mutation     N_m     100

[...]
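As a rough consistency check on these values (our calculation, not the authors'): assuming a fixed background packet size of 1000 bytes, which the chapter does not state, the mean offered background load implied by K works out to about the targeted 6 Mb/s.

```python
K = 6.67e-3              # Poisson distribution's parameter [s]
T = 3                    # simulation duration [s]
N = 4500                 # genes per individual
mean_delay = 0.4 * K     # average per-source inter-packet wait [s]
n_sources = 2            # background connections (BS1, BS2)
packet_bits = 1000 * 8   # ASSUMED fixed packet size: 1000 bytes

rate = n_sources / mean_delay        # aggregate background packets/s
load_mbps = rate * packet_bits / 1e6
print(round(load_mbps, 1))           # about 6.0 Mb/s

# Each source sends roughly T / mean_delay packets in one run, so
# N = 4500 genes comfortably cover the whole experiment.
packets_per_source = T / mean_delay  # roughly 1124 packets per source
```

Under the same assumption, the two sources use about 2,250 of the 4,500 genes in a 3-second run, which is consistent with the earlier remark that final genes may be ignored when the simulation ends before all packets are sent.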
[...] lowest throughput of the TCP probe connections achieved until generation i, and B_i is the corresponding background traffic bandwidth. In DE_i, the effects of the traffic are normalized with respect to the varying background traffic bandwidth B_i. We experimentally examined the DE of the statistical approach with different background traffic [...]

[...] shows that in a congested network, TCP connections are easily shut out by other traffic that does not react to congestion, such as UDP traffic.

11.5 Conclusions

We have presented a new heuristic methodology, based on a genetic algorithm, for evaluating the performance of a communication protocol in a realistic environment and possibly finding its [...]