Chapter 13: Network Monitoring, Restoration

Routing Decision Node (RDN) is assumed. The first three schemes are basic standalone mechanisms, and the last scheme attempts to find the best combination of two of them, local deflection and neighbor deflection. The primary route from node I to E is depicted as a thick black line, and the link 2 → 3 is assumed to be broken during operation.

13.6.2.1 Global routing update

When the headend 2 (or tailend 3) of link 2 → 3 detects the link failure, it informs the RDN via the control plane. The RDN recomputes the routes and updates the forwarding tables of all nodes, and new bursts subsequently follow the new routes. For example, new bursts will follow the route I → 4 → 5 → 6 → E. This solution is optimal and the existing routing protocol can handle it well. However, updating the global routing tables is a slow process (seconds or even minutes) because of the long round-trip time for signal transmission and processing between the OBS nodes and the routing entity. As a result, a large number of bursts will be lost before the forwarding tables are updated.

13.6.2.2 Local deflection

This is similar to the traditional deflection routing usually seen in congestion resolution schemes. When the headend 2 detects the link failure, it automatically picks the alternative next hop in the forwarding table for every new burst whose primary route passes through the faulty link. In the example, new bursts from I to E will follow the alternative route I → 1 → 2 → 5 → 6 → E. This would be the fastest restoration scheme, since new bursts are deflected to an alternative good link as soon as the link failure is detected locally. It therefore incurs the smallest restoration loss. However, because all the affected bursts are deflected to one alternative path, this scheme would increase the congestion loss.
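The local deflection rule can be illustrated with a short sketch. It is not taken from the source: the table layout and the names ForwardingEntry and next_hop_local_deflection are hypothetical, and the table contents at node 2 (primary next hop 3, alternative next hop 5) are an assumption inferred from the example routes above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForwardingEntry:
    """One forwarding-table row (hypothetical layout, not from the source)."""
    primary: int      # next hop on the primary route
    alternative: int  # precomputed alternative next hop


def next_hop_local_deflection(table, destination, node, failed_links):
    """Return the next hop for a new burst arriving at `node`.

    The alternative next hop is used only when the link from `node` to the
    primary next hop is known locally to have failed.
    """
    entry = table[destination]
    if (node, entry.primary) in failed_links:
        return entry.alternative
    return entry.primary


# Assumed table at headend node 2: primary next hop 3, alternative next hop 5.
table_at_node_2 = {"E": ForwardingEntry(primary=3, alternative=5)}

before_failure = next_hop_local_deflection(table_at_node_2, "E", 2, set())
after_failure = next_hop_local_deflection(table_at_node_2, "E", 2, {(2, 3)})
print(before_failure, after_failure)  # prints: 3 5
```

Because the decision needs only locally known state (the set of failed outgoing links), no notification delay is involved, which is consistent with the small restoration loss attributed to this scheme.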
13.6.2.3 Neighbor deflection

In this scheme, the headend 2 also sends a separate fault notification message to all its adjacent nodes, in addition to the one sent to the RDN. This fault notification message contains the destination information for all the primary routes passing the faulty link. After receiving this message, each adjacent node picks an alternative next hop for the affected bursts that, according to their primary route, are heading to the faulty link. In the example, bursts from I to E will take the new route I → 1 → 4 → 5 → 6 → E. Compared with the local deflection scheme, neighbor deflection has the potential to distribute the rerouted traffic rather than deflect it all to one alternative path. In this way, less congestion and therefore less burst loss may occur. However, this scheme requires an extra one-hop fault notification. One possible problem is that, if the network traffic load is already very heavy, distributed deflection may have a negative impact, as it may worsen congestion all over the network.

13.6.2.4 Distributed deflection

While restoration based on neighbor deflection may balance the deflected load better, it suffers from an extra one-hop fault notification delay, which results in greater overall burst loss because of the larger restoration loss. Therefore, a more efficient algorithm combines local deflection and neighbor deflection: the affected bursts are deflected locally until the adjacent nodes receive the fault notification, and from then on they are deflected in a distributed way. We name this scheme distributed deflection.

13.6.2.5 α-distributed deflection

One interesting observation about neighbor deflection is that the capacity of the links between the headend (node 2) of the faulty link and its adjacent nodes (node 1) will not be utilized if affected bursts are deflected at the adjacent nodes.
Therefore, we define a distribution ratio, α, to determine the portion of affected bursts that will be deflected at the adjacent nodes. That is, after the adjacent nodes receive the fault notification, a fraction α of the affected bursts will be deflected distributively and the remaining fraction 1 − α will be forwarded to the headend node of the faulty link to be deflected locally. Each value of α ∈ [0, 1] gives a different variant of the distributed restoration scheme. When α = 0, it is equivalent to local deflection-based restoration. When α = 1, it becomes the distributed deflection-based restoration of Section 13.6.2.4. We use α-distributed deflection to denote the generalized distributed deflection mechanism. We also note that using α introduces only a small amount of management complexity in the adjacent nodes. We expect that there exists an optimal value of α that deflects the affected bursts in the most balanced way, such that the minimum burst loss is achieved.

13.6.2.6 Class-based QoS restoration scheme

Three restoration classes with different priority values are defined for all bursts. When a burst is generated at the edge node, it is assigned a priority value in its control packet according to its restoration QoS requirement. Bursts in the highest class use the better of the local and neighbor deflection restoration schemes during each restoration period. Local deflection is chosen during the fault notification period because of its shorter fault notification time. During the deflection period, whichever of local and neighbor deflection has the shorter alternative route (measured in number of hops) is chosen, because of its lower backup capacity requirement (and possibly lower average burst loss probability). Bursts in the middle class are restored via neighbor deflection, and bursts in the lowest class are dropped until the global forwarding table update is finished.
In this way, the lower class bursts do not compete for bandwidth with the higher class bursts on the deflected alternative routes.

Figure 13.6 depicts the simulation results on an NSF topology. The y-axis represents the overall burst loss probability under medium traffic during the three network operational phases around a link failure. On the x-axis, 1 represents the period before the link failure, 2 represents the period before the global forwarding table update, when only the proposed fast restoration schemes are in place, and 3 represents the period after the global forwarding table update. We observe that the burst loss probability is very low in both phases 1 and 3, though it is slightly higher in phase 3 because of the reduction in network capacity.

[Figure 13.6. Restoration performance. x-axis: restoration phase (1, 2, 3); y-axis: blocking probability (0.10 to 0.16); curves: global update, local deflection, neighbour deflection, distributed deflection.]

However, the loss probability could increase significantly in phase 2. Relying only on the forwarding table update would incur very high burst loss in this phase. Nonetheless, the loss probability increases only moderately when the proposed fast restoration schemes are used in this phase. Among the four restoration schemes, distributed deflection shows the best performance (almost no extra burst loss), followed by local deflection and neighbor deflection. Global routing update incurs the highest burst loss. Specifically, the improvements from using the three fast restoration schemes over the global forwarding table update are 24.3%, 20.1%, and 10.5%, respectively. The distributed deflection and class-based QoS restoration schemes are built on the performance differentiation of the above four basic restoration schemes; therefore, they can be built directly into the fault management module to provide differentiated restoration service.
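The distribution ratio introduced in Section 13.6.2.5 can be made concrete with a small sketch. It is an illustration under stated assumptions rather than the authors' implementation: the ratio is called alpha here, the function name split_affected_bursts is hypothetical, and the credit-based splitting rule is one possible way for an adjacent node to keep the long-run fraction of bursts it deflects itself close to alpha.

```python
from typing import List


def split_affected_bursts(num_bursts: int, alpha: float) -> List[str]:
    """Decide, burst by burst, where each affected burst is deflected.

    "neighbor" means the adjacent node deflects the burst itself;
    "headend" means the burst is forwarded to the headend of the faulty
    link and deflected there. The credit counter is an assumed mechanism,
    not something specified in the text.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    decisions = []
    credit = 0.0
    for _ in range(num_bursts):
        credit += alpha
        if credit >= 1.0 - 1e-9:  # small tolerance for rounding error
            decisions.append("neighbor")
            credit -= 1.0
        else:
            decisions.append("headend")
    return decisions


local_only = split_affected_bursts(4, 0.0)   # behaves like local deflection
all_neighbor = split_affected_bursts(4, 1.0)  # all bursts deflected at neighbors
half = split_affected_bursts(10, 0.5)
print(half.count("neighbor"), half.count("headend"))  # prints: 5 5
```

A deterministic counter is used instead of random sampling so that the split is reproducible; a randomized choice with probability alpha would serve the same purpose on average.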
We note that the above schemes are studied for OBS networks, but they could be directly extended to other packet-switched networks.

13.7 INTEGRATED FAULT MANAGEMENT

Except for the unreliable fault detector, the current Globus toolkit does not provide other fault-tolerant mechanisms. A number of different fault handling mechanisms exist for Grid applications to respond to system/component failures [37]:
• fail-stop: stop the application;
• ignore the failure: continue the application execution;
• fail-over: assign the application to new resources and restart;
• migration: replication and reliable group communication to continue the execution.

Fail-over and migration are the two choices for fault-tolerant computing and can be achieved either in a deterministic way, in which the backup compute resource and network route are precomputed along with the primary, or in a dynamic way, in which the backup is discovered after the failure has occurred and been detected. We note that this could be coupled with the connection protection and restoration mechanisms in the network context discussed in Section 13.6.

At the system level, performance monitoring, data analysis, fault detection, and fault recovery form a feedback loop to guarantee system performance and availability. The complexity comes from the fact that performance/reliability data can be collected on different timescales from different layers of the Grid networks (layer 1/2/3 and application). Therefore, a Grid platform needs to provide different levels of fault tolerance capability in terms of timescale and granularity. A large portion of the fault management activities are conducted within the Grid infrastructure and are transparent to the end-user applications.
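The difference between the deterministic and dynamic ways of obtaining a backup can be sketched in a few lines. All names below (FaultHandling, needs_backup, choose_backup, and the host labels) are hypothetical and chosen only for illustration; the sketch does not correspond to any Globus or Grid middleware API.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class FaultHandling(Enum):
    """The four fault handling mechanisms listed above."""
    FAIL_STOP = "stop the application"
    IGNORE = "continue the application execution"
    FAIL_OVER = "assign the application to new resources and restart"
    MIGRATION = "replicate and continue the execution"


def needs_backup(mechanism: FaultHandling) -> bool:
    """Only fail-over and migration require a backup resource."""
    return mechanism in (FaultHandling.FAIL_OVER, FaultHandling.MIGRATION)


def choose_backup(
    precomputed: Optional[str],
    discover: Callable[[], Optional[str]],
) -> Tuple[Optional[str], str]:
    """Pick a backup resource and report how it was obtained.

    Deterministic: a backup precomputed along with the primary is used.
    Dynamic: `discover` is called only after the failure is detected.
    """
    if precomputed is not None:
        return precomputed, "deterministic"
    return discover(), "dynamic"


# Hypothetical host names, for illustration only.
planned = choose_backup("host-b", lambda: "host-c")
found = choose_backup(None, lambda: "host-c")
print(planned, found)
```

The deterministic path avoids the discovery delay at failure time at the cost of reserving or at least tracking a backup in advance, which mirrors the protection versus restoration trade-off discussed for the network in Section 13.6.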
From the point of view of applications, we identify the following failure scenarios and fault handling mechanisms to explicitly integrate Grid service migration with network protection or restoration:
• Fail-over or migrate within the same host(s). The application requires dedicated or shared backup connections for the primary connections (either unicast or multicast). The application can also specify the network restoration time requirement.
• Fail-over or migrate to different host(s). This mechanism is more generic in the sense that the application requires a redundant Grid resource/service. If the primary Grid resource fails, the Grid middleware can reset the connections to the backup Grid resource/service, and jobs can be migrated directly from the failed primary resource to the backups.

Further studies are under way for the aforementioned cases in terms of dynamic connection setup and network resource allocation.

REFERENCES

[1] F. Cappello (2005) “Fault Tolerance in Grid and Grid 5000,” IFIP Workshop on Grid Computing and Dependability, Hakone, Japan, July 2005.
[2] K. Birman, “Like it or Not, Web Services are Distributed Objects!”, white paper.
[3] V. Paxson, G. Almes, J. Mahdavi, and M. Mathis (1998) “Framework for IP Performance Metrics,” RFC 2330, May 1998.
[4] R. Hughes-Jones, P. Clarke, and S. Dallison (2005) “Performance of Gigabit and 10 Gigabit Ethernet NICs with Server Quality Motherboards,” grid edition of Future Generation Computer Systems, 21, 469–488.
[5] M. Mathis and M. Allman (2001) “A Framework for Defining Empirical Bulk Transfer Capacity Metrics,” RFC 3148, July 2001.
[6] Source code and documentation of iperf is available at http://dast.nlanr.net/Projects/Iperf/.
[7] G. Almes, S. Kalidindi, and M. Zekauskas (1999) “A One-way Delay Metric for IPPM,” RFC 2679, September 1999.
[8] G. Almes, S. Kalidindi, and M. Zekauskas (1999) “A Round-trip Delay Metric for IPPM,” RFC 2681, September 1999.
[9] T. Ferrari and F. Giacomini (2004) “Network Monitoring for Grid Performance Optimization,” Computer Communications, 27, 1357–1363.
[10] S. Andreozzi, D. Antoniades, A. Ciuffoletti, A. Ghiselli, E.P. Markatos, M. Polychronakis, and P. Trimintzios (2005) “Issues about the Integration of Passive and Active Monitoring for Grid Networks,” Proceedings of the CoreGRID Integration Workshop, November 2005.
[11] Source code and documentation of cricket is available at http://people.ee.ethz.ch/~oetiker/webtools/mrtg/.
[12] Multi Router Traffic Grapher software and documentation, http://people.ee.ethz.ch/~oetiker/webtools/mrtg/.
[13] “Grid network Monitoring: Demonstration of Enhanced Monitoring Tools,” Deliverable D7.2, EU Datagrid Document: WP7-D7.2–0110-4-1, January 2002.
[14] “Final Report On Network Infrastructure And Services”, Deliverable D7.4, EU Datagrid Document Datagrid-07-D7-4-0206-2.0.doc, January 26, 2004.
[15] EGEE JRA4: “Development of Network Services, Network Performance Monitoring – e2emonit,” http://marianne.in2p3.fr/egee/network/download.shtml.
[16] Internet2 end-to-end performance initiative. http://e2epi.internet2.edu/web traceroute. Home page for web based traceroute http://www.traceroute.org/.
[17] A large collection of traceroute, looking glass, route servers and BGP links traceroute may be found at http://www.traceroute.org/.
[18] J. Boote, R. Carlson, and I. Kim. NDT source code and documentation available at http://sourceforge.net/projects/ndt.
[19] R.L. Cottrell and Connie Logg (2002) “A new high performance network and application monitoring infrastructure.” Technical Report SLAC-PUB-9202, SLAC.
[20] IEPM monitoring page, http://www.slac.stanford.edu/comp/net/iepm-bw.slac.stanford.edu/slac_wan_bw_tests.html.
[21] Source code and documentation of thrulay is available at http://people.internet2.edu/~shalunov/thrulay/.
[22] C. Dovrolis, P. Ramanathan, and D.
Moore (2001) “What do Packet Dispersion Techniques Measure?,” IEEE INFOCOM.
[23] Source code and documentation of pathload is available at http://www.cc.gatech.edu/fac/Constantinos.Dovrolis/pathload.html.
[24] Source code and documentation of pathchirp is available at http://www.spin.rice.edu/Software/pathChirp/.
[25] Source code and documentation of bbftp is available at http://doc.in2p3.fr/bbftp/.
[26] A. Hanushevsky, source code and documentation of bbcp is available at http://www.slac.stanford.edu/~abh/bbcp/.
[27] L. Cottrell (2002) “IEPM-BW a New Network/Application Throughput Performance Measurement Infrastructure,” Network Measurements Working Group GGF Toronto, February 2002, www.slac.stanford.edu/grp/scs/net/talk/ggf-feb02.html.
[28] R. Hughes-Jones, source code and documentation of UDPmon, a tool for investigating network performance, is available at www.hep.man.ac.uk/~rich/net.
[29] Endace Measurement Systems, http://www.endace.com.
[30] J. Hall, I. Pratt, and I. Leslie (2001) “Non-Intrusive Estimation of Web Server Delays,” Proceedings of IEEE LCN2001, pp. 215–224.
[31] Web100 Project team. Web100 project, http://www.web100.org.
[32] B. Tierney, NetLogger source code and documentation available at http://www-didc.lbl.gov/NetLogger/.
[33] “Telecommunications: Glossary of Telecommunication Terms”, Federal Standard 1037C, August 7, 1996.
[34] X. Zhang, D. Zagorodnov, M. Hiltunen, K. Marzullo, and R.D. Schlichting (2004) “Fault-tolerant Grid Services Using Primary-Backup: Feasibility and Performance,” Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER), San Diego, CA, USA, September 2004.
[35] N. Aghdaie and Y. Tamir (2002) “Implementation and Evaluation of Transparent Fault-tolerant Web Service with Kernel-level Support,” Proceedings of the IEEE Conference on Computer Communications and Networks, October 2002, Miami, FL, USA.
[36] H. Jin, D. Zou, H. Chen, J. Sun, and S.
Wu (2003) “Fault-tolerant Grid Architecture and Practice,” Journal of Computing Science and Technology, 18, 423–433.
[37] P. Stelling, C. DeMatteis, I. Foster, C. Kesselman, C. Lee, and G. von Laszewski (1999) “A Fault Detection Service for Wide Area Distributed Computations,” Cluster Computing, 2(2), 117–128.
[38] O. Bonaventure, C. Filsfils, and P. Francois (2005) “Achieving Sub-50 Milliseconds Recovery upon BGP Peering Link Failures,” Co-Next 2005, Toulouse, France, October 2005.
[39] J. Lang, “Link Management Protocol (LMP),” IETF RFC 4204.
[40] M. Médard and S. Lumetta (2003) “Network Reliability and Fault Tolerance”, Wiley Encyclopedia of Engineering (ed. by J.G. Proakis), John Wiley & Sons Ltd.
[41] L. Valcarenghi (2004) “On the Advantages of Integrating Service Migration and GMPLS Path Restoration for Recovering Grid Service Connectivity”, 1st International Workshop on Networks for Grid Applications (gridNets 2004), San Jose, CA, October 29, 2004.
[42] D. Papadimitriou (ed.) “Analysis of Generalized Multi-Protocol Label Switching (GMPLS) based Recovery Mechanisms (including Protection and Restoration),” Draft-ietf-ccamp-gmpls-recovery-analysis-05.
[43] Y. Xin and G.N. Rouskas (2004) “A Study of Path Protection in Large-Scale Optical Networks,” Photonic Network Communications, 7(3), 267–278.
[44] J.L. Marzo, E. Calle, C. Scoglio, and T. Anjali (2003) “Adding QoS Protection in Order to Enhance MPLS QoS Routing,” Proceedings of IEEE ICC ’03.
[45] S. Rai and B. Mukherjee (2005) “IP Resilience within an Autonomous System: Current Approaches, Challenges, and Future Directions”, IEEE Communications Magazine, 43(10), 142–149.
[46] L. Xu, H.G. Perros, and G.N. Rouskas (2001) “A Survey of Optical Packet Switching and Optical Burst Switching”, IEEE Communications Magazine, 39(1), 136–142.
[47] Y. Xin, J. Teng, G. Karmous-Edwards, G.N. Rouskas, and D.
Stevenson, “Fault Management with Fast Restoration for Optical Burst Switched Networks,” Proceedings of Broadnets 2004, San Jose, CA, USA, October 2004.
[48] Y. Xin, J. Teng, G. Karmous-Edwards, G.N. Rouskas, and D. Stevenson (2004) “A Novel Fast Restoration Mechanism for Optical Burst Switched Networks”, Proceedings of the Third Workshop on Optical Burst Switching, San Jose, CA, USA, October 2004.

Chapter 14
Grid Network Services Infrastructure
Cees de Laat, Freek Dijkstra, and Joe Mambretti

14.1 INTRODUCTION

Previous chapters have presented a number of topics related to Grid network services, architecture, protocols, and technologies, including layer-based services, such as layer 4 transport, layer 3 IP, Ethernet, wide-area optical channels, and lightpath services. This chapter examines new designs for large-scale distributed facilities that are being created to support those types of Grid network services. These facilities are being designed in response to new service requirements, the evolution of network architecture standards, recent technology innovations, a need for enhanced management techniques, and changing infrastructure economics. Currently, many communities are examining requirements and designs for next-generation networks, including research communities, standards associations, technology developers, and international networking consortia. Some of these new designs are implemented in early prototypes, including next-generation open communication exchanges. This chapter describes the general concept of such an exchange, and presents the services and functions of one that would be based on a foundation of optical services.
Grid Networks: Enabling Grids with Advanced Communication Technology Franco Travostino, Joe Mambretti, Gigi Karmous-Edwards © 2006 John Wiley & Sons, Ltd

14.2 CREATING NEXT-GENERATION NETWORK SERVICES AND INFRASTRUCTURE

If current communications infrastructure did not exist and an initiative were established to create a new infrastructure specifically to support Grid network services, a fundamentally different approach would be used. This new infrastructure would not resemble today’s legacy communications architecture and implementations. The new communications infrastructure would be easily scalable, and it would incorporate many options for providing multiple high-quality services efficiently and flexibly. It would not be a facility for providing only predetermined services, but would provide capabilities for the ad hoc creation of services. It would incorporate capabilities for self-configuration and self-organization. The design of such a communications infrastructure would be derived from many of the key concepts inherent in Internet architecture. For example, it would be highly distributed.

Even without the Grid as a driving force for the creation of communications infrastructure with these characteristics, architectural designs based on these principles are motivated by other dynamics. Major changes are required in the methods by which communications services and infrastructure are designed and provisioned. These changes are motivated by the explosive growth in new data services, by the additional communities adopting data services, and by the number and type of communication-enabled devices. Soon services will be required for 3 billion mobile general communication devices. Today, there is a growing need to provide seamless, ubiquitous access, at any time from any location and any device.
In addition to general communication devices, many more billions of special-purpose communications-enabled devices are anticipated: sensors, Radio Frequency Identification (RFID) tags, smart dust, nano-embedded materials, and others. All of these requirements are motivating a transformation of the design of core network infrastructure. The legacy architectural models for services design and provisioning cannot meet the needs of these new requirements. A fundamentally new approach is required. The guiding principles for that new approach will be based on those that informed Internet architecture. The next section presents some of these principles, which are key enablers for next-generation communication services and facilities.

14.2.1 END-TO-END PRINCIPLE

In previous chapters, the importance of the end-to-end principle in distributed infrastructure design was discussed. Its basic premise is that the core of such distributed infrastructure, including networks, should be kept as simple as possible, and the intelligence of the infrastructure should be placed at the edge. This principle is now almost universally accepted by networking standards bodies. Recently, the ITU formally endorsed the basic principles of the Internet design through its Study Group 13 recommendation for a Next-Generation Network (NGN) [1]. Balancing the need to maintain the end-to-end principle with the need to create new types of “intelligent” core facilities is a key challenge.

14.2.2 PACKET-BASED DATA UNITS

After 35 years of increasing Internet success, communications standards organizations have now almost universally endorsed an architecture founded on the communication of packet-based data. The IETF has demonstrated the utility of this approach. The IEEE has established initiatives to create layer 2 transport enhancements to better support packet-based services.
The ITU-T NGN recommendation specifies “A packet-based network able to provide telecommunication services and able to make use of multiple broadband, QoS-enabled transport capabilities.” The NGN Release 1 framework indicated that all near-term future services are to be based on IP, regardless of underlying transport.

14.2.3 ENHANCED FUNCTIONAL ABSTRACTION

Earlier chapters have stressed the importance of abstracting capabilities from infrastructure. The design of the Internet has been based on this principle. Attaining this goal is being assisted by the general recognition that legacy vertical infrastructure stacks, each supporting a separate service, must be replaced by an infrastructure that is capable of supporting multiple services. This design direction implies a fundamentally new consideration of basic architectural principles. The ITU-T has begun to formally adopt these concepts. One indication of this direction is inherent in the ITU-T NGN recommendation that “service-related functions” should be “independent from underlying transport related technologies.” This recommendation indicates that the design should enable “unfettered access for users to networks,” through open interfaces [1]. Also, the ITU-T is beginning to move toward a more flexible architectural model than the one set forth in the classic 7 layer Open Systems Interconnect (OSI) basic reference model [2]. This trend is reflected in the ITU-T recommendations for the functional models for NGN services [3]. This document notes that the standard model may not be carried forward into future designs. The number of layers may not be the same, the functions of the layers may not be the same, standard attributes may not be the same, future protocols may not be defined by X.200, e.g., IP, and understandings of adherence to standards may not be the same.

14.2.4 SELF-ORGANIZATION

Another important architectural design principle for any future network is self-organization.
The Internet has been described as a self-organizing network. This description has been used to define its architecture in general, because many of its individual protocols are designed to adapt automatically to changing network conditions without having to rely on external processes. This term has also been used to describe the behavior of individual protocols and functions, such as those that automatically adjust to changing conditions. Because the Internet was designed for an unreliable infrastructure, it can, therefore, tolerate uncertain conditions better than networks that depend on a highly reliable infrastructure.