2015 International Conference on Advanced Computing and Applications

Distributed Event Monitoring for Software Defined Networks

Quan Vuong, Ha Manh Tran, Son Thanh Le
School of Computer Science and Engineering
International University – HCMC Vietnam National University
Email: {vquan, tmha, ltson}@hcmiu.edu.vn

978-1-4673-8234-2/15 $31.00 © 2015 IEEE
DOI 10.1109/ACOMP.2015.29

Abstract: Software defined network separates the data and control planes, which facilitates network management functions and, in particular, enables programmable network control functions. Event monitoring is a fault management function involved in collecting and filtering event notification messages from network devices. This study presents an approach of distributed event monitoring for software defined network. Event monitoring usually deals with a large amount of event log data; log collecting and filtering processes thus require a high degree of automation and efficiency. This approach takes advantage of the OpenFlow and syslog protocols to collect and store log events obtained from network devices on a syslog server. It also uses the adaptive semantic filtering method to filter and present non-trivial events for system administrators to take further actions. We have evaluated this approach on a network simulation platform and provided some log collection and filtering results with analysis.

Keywords: Event Monitoring, Event Filtering, Syslog, Adaptive Semantic Filtering, Software Defined Network

I. INTRODUCTION

Software defined network (SDN) [1], [2] is an emerging computer networking paradigm that separates the control and data planes for more efficient network management. Compared with traditional networks, SDN enables programmability for network control and abstraction for network infrastructure, thus assisting system administrators in managing large networks with a high level of efficiency and scalability. Event monitoring is one of the main network management functions, required by tasks ranging from performance monitoring and fault diagnosis to high availability and intrusion detection. It involves collecting and filtering a huge number of log events. Traditional network devices, during execution, report operational data and log event data to the managing systems through network management protocols such as SNMP [3], syslog [4], NETCONF [5] and CLI. Monitoring events on SDN with automation and efficiency is thus demanding.

A common approach to deal with this demand in SDN is to upload all operational data and log event data to a centralized data store that is available for several SDN controllers. Network events raised to controllers are stored in their local data stores and the controllers synchronize with each other at some points of time. This approach allows controllers to keep up-to-date information about the network for high consistency and availability. However, the high responsiveness of controllers depends on topology and network events, i.e., the changes of network devices need to be propagated to the entire network as quickly as possible, while a controller only manages a part of the network.

We have proposed an approach of distributed event monitoring for SDN. This approach applies existing protocols and applications to collect and store log events on a syslog server. It also uses an enhanced method of adaptive semantic filtering to filter non-trivial events for system administrators to take further actions. The contribution is thus threefold:
1) Proposing an approach of distributed event monitoring for SDN focusing on collection and filtering;
2) Implementing the approach using the existing rsyslog tool and the ASF-BDT method;
3) Evaluating the approach on the ONOS simulation platform and providing result analysis.

The rest of the paper is structured as follows: the next section introduces some background of SDN focusing on architecture, planes and layers, an overview of the standard OpenFlow protocol [6] and simulation platforms. Section III presents the approach of distributed event monitoring for SDN that includes the architecture, the syslog protocol and application, and the semantic filtering method. Several experiments in Section IV report the results of collecting and filtering log events before the paper is concluded in Section V.

II. BACKGROUND

A. SDN Architecture

Software Defined Network (SDN) [1] is an emerging approach for the next generation of computer networks. Contrary to traditional networks, SDN decouples the control and data planes, which are integrated in conventional network devices. The data plane only contains the forwarding devices (white box devices), and the control plane consists of controllers (the network brain) along with a network operating system (NOS) that is installed on the controllers to regulate the forwarding devices. The two planes communicate using the OpenFlow protocol, which is the first open southbound standard for SDN. It allows controllers to collect flow statistics, event-based messages or network link failure events.

An SDN architecture has several layers such as Network Infrastructure, Southbound Interface, Network Operating System, Northbound Interface, Network Application, etc. Each layer serves certain functions; for example, the NOS interacts with the forwarding devices in Network Infrastructure to coordinate network traffic or detect network faults via the Southbound Interface, which belongs to the data plane as described in Fig. 1.

As a well-defined paradigm, SDN provides a lot of benefits for campuses and enterprises. Enterprises can apply SDN to build efficient networks that reduce operating costs, optimize computing resources and improve business continuity. However, network faults usually occur as an increasing number of forwarding devices are connected to the network. Besides, controllers are logically centralized and regularly consulted by forwarding devices, which causes much overhead on the network. SDN controllers are, therefore, possibly corrupted.

The communication between the data and control planes goes through the Southbound Interface layer, which integrates with the OpenFlow protocol and contains several forwarding devices. Likewise, the management plane interacts with the control plane via the Northbound Interface, which provides some types of APIs such as REST APIs for developers to program applications for different purposes. Although OpenFlow is the mainstream protocol for SDN, many other protocols are supported in the Southbound Interface layer, such as NETCONF, SNMP or OVSDB. An SDN controller contains a network operating system as a core responsible for tracking and distributing network environment information to the applications and synchronizing operational information among controllers.

Fig. 1: Software Defined Networks in (a) planes, (b) layers, and (c) system design architecture [7]

Generally, there are many types of NOS, which support single or distributed instances. Single-instance controller frameworks include NOX, Beacon, Ryu, Floodlight, etc. NOSs that support both single and distributed operation include OpenDaylight, Onix and ONOS. Their architecture fosters multiple controllers which can tolerate network failures. When one of the controllers is down, the policy management of the distributed NOS promotes another controller to replace the failed controller in the shortest time. Due to this high network resilience, a distributed NOS can serve large scale network systems. ONOS and OpenDaylight also support graphical user interfaces to visualize network topology through web applications. This feature allows system administrators to manage and interact with the SDN network easily. Table I compares different controllers based on their specification [12].

TABLE I. CONTROLLER FEATURE COMPARISON

Name             Floodlight    Onix          OpenDaylight     ONOS
Architecture     centralized   distributed   distributed      distributed
Northbound API   RESTful API   NVP NBAPI     REST, RESTCONF   RESTful API
Prog. Language   Java          Python, C     Java             Java

Onix is the first distributed SDN controller to implement a global network view. It originally targets network virtualization in data centers. It remains closed source and further development is not revealed in publications. Floodlight is the first open source SDN controller that attracted much attention in research and industry. It supports high availability in the manner of its proprietary sibling, Big Switch's Big Network Controller, via a hot standby system. It does not support a distributed architecture for large scale performance. OpenDaylight is another open source SDN controller backed by a large consortium of networking companies. It implements a number of vendor-driven features. Similarly to ONOS, OpenDaylight runs on a cluster of servers for high availability using a distributed data store and leader election. The OpenDaylight clustering architecture is still evolving and has become one of the most used architectures.

B. OpenFlow Protocol

As mentioned above, OpenFlow [6] is a popular protocol which is integrated in a lot of forwarding devices, such as Open vSwitch, in the SDN architecture. The controllers maintain the forwarding devices through the OpenFlow protocol. An OpenFlow device has many flow tables controlled by a pipeline process, and every flow table has three parts, as shown in Fig. 2: (i) a matching rule; (ii) actions to apply to matching packets; (iii) counters that keep the statistics of matched packets. When a packet arrives at an OpenFlow device, the lookup process searches the flow tables for a rule that matches the packet. If no rule matches, the device drops the packet or transfers it to the controller. Otherwise, the packet is processed by a set of actions which include: (i) forward the packet to the outgoing port(s); (ii) encapsulate and forward the packet to the controller; (iii) drop the packet; (iv) send the packet to the normal processing pipeline; (v) send the packet to the next flow table or to special tables, such as group or metering tables.

Fig. 2:
OpenFlow-enabled SDN devices [7]

The communication between the controllers and forwarding devices goes through the Secure Channel integrated in the OpenFlow-enabled switch, as shown in the accompanying figure. The Secure Channel allows commands and packets to be sent between controllers and switches using the OpenFlow protocol and the Flow Table inside the forwarding devices. It is necessary to categorize switches into dedicated OpenFlow switches, which do not support normal Layer 2 and Layer 3 processing, and OpenFlow-enabled switches. Dedicated OpenFlow switches only support the OpenFlow protocol for forwarding packets between controllers and switches. OpenFlow-enabled switches, in contrast, are commercial routers and switches enhanced with OpenFlow features. Typically, the Flow Table re-uses the existing hardware, such as TCAM (Ternary Content Addressable Memory).

The authors of the study [8] have presented two types of OpenFlow switches. The first generation of OpenFlow switches is referred to as Type 0, which supports the header formats shown in the accompanying figure and the four basic actions mentioned above. Furthermore, OpenFlow switches can rewrite portions of the packet header for NAT, or to obfuscate addresses on intermediate links, and can map packets to a priority class. Likewise, some Flow Tables can match on arbitrary fields in the packet header, enabling experiments with new non-IP protocols. Type 1 switches will be defined as a particular set of features emerges. The detailed requirements of an OpenFlow switch are defined by the OpenFlow Switch specification [6].

[...] receive a packet, they check the matching rules to monitor the flow of the packet. DCM proposed a first-stage Bloom filter, referred to as the admission Bloom filter, to group the monitored flows; a second-stage Bloom filter, referred to as the action Bloom filter, then decides the corresponding monitoring actions. This technique saves switch processing resources and improves the performance of network monitoring.

C. Simulation Platforms

To support distributed network
system, NOS has to meet the demanding requirements of scalability, performance and availability. ONOS (Open [...]

[...] network. It also provides the scalability function and deployment on hundreds of nodes. For creating a network, it supports scripts that establish the network. The Python programming language is used to write scripts in Mininet. These scripts are then run from the CLI, and Mininet sets up the virtual network with switches, hosts and routers that behave like real devices. These devices can also be interacted with and controlled via the CLI. In addition, Mininet provides functions that allow remote controllers to connect to the network. Mininet can also be shared across many virtual machine platforms, such as VirtualBox, VMware or Xen, by pre-installing it in the first one and then exporting it to the new ones.

Docker [16] is an open platform for developers and system administrators to build, ship, and run distributed applications. At its core, Docker provides a way to run almost any application securely isolated in a container. The isolation and security allow users to run many containers simultaneously on their host. The lightweight nature of containers, which run without the extra load of a hypervisor, means users can get more out of their hardware. Docker is similar to a virtual machine. Unlike a virtual machine, however, instead of creating a whole virtual operating system, Docker allows applications to use the same Linux kernel as the system that they run on and only requires applications to be shipped with packages not already running on the host computer. This gives a significant performance boost and reduces the size of applications, as shown in the accompanying figure.

III. DISTRIBUTED EVENT MONITORING

The accompanying figure shows an architecture design of distributed log event monitoring on SDN. This design allows hosts and switches to send log events to controllers that in turn send them to a syslog server using the syslog protocol. Filtering functions work on log event datasets to eliminate trivial events and provide [...]

Fig.: An architecture of SDN distributed log event monitoring

The controller manages several OF switches or a part of a network with several hosts connected to the switch. The connection between a controller and a switch is a link based on the OF channel as explained in the OpenFlow protocol. Controllers synchronize to share the operational data of forwarding devices, such as state and traffic. An operation deployed on each controller monitors the state of switches and hosts. The controller regularly sends messages to check whether a host is alive. It also exchanges messages to maintain the links among controllers, switches and hosts. When a switch or a host goes offline, an event is raised to the controller and saved in the store. The controller then informs the other controllers of the state change of devices for data synchronization. The controllers maintain similar stores, and the log events are pushed to the syslog server using the syslog protocol.

The syslog protocol [4] is a standard transport protocol that allows a device to send event notification messages across IP networks to event message collectors, also known as syslog servers. A syslog message carries the following information: facility, severity, hostname, timestamp and message. The facility is recognized as the source of messages. These sources can be operating systems, applications or processes. The facility is broadly categorized by the types of sources and represented by an integer, e.g., 0 is for kernel messages. The severity is also represented by a single-digit integer. It describes the importance of the event message, such as info, warn, error, debug, etc., which allows system administrators to manage the types of messages easily and filter the non-trivial event messages immediately. The hostname contains the host name or IP address configured on the host. The message contains the text of the syslog message with some additional information about the application that generated the message. The timestamp is the local time at which the event message is generated. The timestamp needs to be accurate because system administrators configure the network devices with synchronized time and can then conveniently look up the important event messages at a given time. The generation time of messages also influences the precision of error messages. When two events are generated close together in time, they may be the same error message.

The authors of the study [20] have proposed an approach of adaptive semantic filtering (ASF) for dealing with this problem. The ASF-BDT filtering application applies adaptive semantic filtering with bounded dynamic threshold [21] to correlate events in the log event datasets collected from the syslog server. This method is an enhanced version of the ASF method. It contains multiple filtering functions: simple filtering functions using specific fields and rules, temporal filtering functions using time stamps, and semantic filtering functions using the Φ coefficient and Pearson's correlation coefficient [22], as shown in the accompanying figure. The result of this application is a set of non-trivial log events that allow system administrators to take actions for system reliability and stability.

Fig.: Processes of the adaptive semantic filtering method

The Rsyslog [18] application supports simple expression-based filtering functions for filtering messages. It generally offers four types of filtering conditions: severity-based and facility-based selectors, property-based filters, expression-based filters and BSD-style blocks [19]. The expression-based filter facilitates filtering on arbitrarily complex expressions that contain boolean, arithmetic and string operations. This filter contains if-then rules and may evolve into a full configuration scripting language. A rule is represented in the format: if expr then action-part-of-selector-line, where if and then are fixed keywords that must be present, expr is a (potentially quite complex) expression such as the comparison of messages and keywords, and action-part-of-selector-line is the saving path for filtered events. We have written multiple filtering scripts based on the expression-based filter to filter the event datasets by severity.

IV. EVALUATION

A. SDN Configuration

We have configured a topology with switches and ONOS controllers on Mininet, as shown in the accompanying figure. Every switch connects to hosts except for switches in the middle. The SDN simulation offers a distributed network system running on an Inspiron 5447 with a Core i5-4210U processor at 1.70 GHz (4 CPUs), 6 GB RAM and a 1 TB HDD with Ubuntu Server 14.04 LTS.

Fig.: A configured SDN topology using ONOS

The controllers run as ONOS controller instances created by using the Docker platform. The syslog server is installed and run by Rsyslog [18]. We have also configured the Rsyslog server with a remote IP address and the default port 514 to collect log events from the ONOS controllers in the syslog message format. Note that the ONOS controllers send the event messages to the remote IP using the syslog protocol. The event messages are recorded and stored in the log files for fault analysis and detection.

We have created several activities and checked events raised on the hosts. Since the hosts connect to ports on the switches, events thus reflect changes on switches, such as host, port and link status. These events are raised and pushed to the syslog server. System administrators apply tools on events from the log files for fault analysis and detection. The Rsyslog tool provides a set of rules to filter event messages by severity level, such as info, warn, error, system and debug. The severity level helps system administrators to reduce a huge number of events. The accompanying figure describes filtering rules in the rsyslog configuration to collect events in log files.

Fig.: Filtering rules in the rsyslog configuration

B. Event Collection

To evaluate the SDN automated log system as configured above, the log system automatically collects a large number of log datasets. A log dataset is collected from several log files that contain various types of log events. The total size of these datasets is more than 1.2 GB, including log events with info, warn, error and system messages. We have created several failure scenarios for the period of days. Some scenarios include manually shutting down one or multiple switches and hosts during execution to record the non-trivial warning and failure event messages. The log dataset is separated by the severity levels into smaller datasets: info, warn and error datasets.

Fig 10 reports the size of different datasets collected over a period of days. The total log dataset size increases exponentially, especially for the last three days, in which the dataset reaches approximately 800 MB. The warn dataset size increases similarly to the total log dataset size, while the remaining datasets slowly increase per day. The total number of events reaches 5.2 million for the whole log dataset, including approximately 2.4 million warn events and 50 thousand error events, as shown in Fig 11.

Fig. 10: Size of log events for a period of days (axes: Time (day), Size (MB); series: Log data, Info event, Warn event, Error event)

Fig. 11: Number of log events for a period of days (axes: Time (day), Number of Events; series: Log data, Info event, Warn event, Error event)

Specific statistics report the details of the warn and error datasets for the same periods. While error events definitely require actions from system administrators, some warn events imply potential problems that need to be considered. The error dataset is also much smaller than the warn dataset. The warn dataset quickly increases approximately 700 MB after days, but many warn events are trivial, as shown in Fig 12. We created several failure situations on network devices; the automated log system thus emits a large number of warn events. The error dataset records error events after days with few errors, as shown in Fig 13.

Fig. 12: Log datasets for warn messages for a period of days (axes: Time (day), Size (MB))
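The separation of the log dataset into info, warn and error datasets described in this subsection can be sketched as follows. This is a minimal illustration and not the scripts used in the experiments: the line layout (timestamp, host, severity keyword, message), the severity keywords and the function names are assumptions made for the example, and the actual ONOS and Rsyslog output may differ.

```python
from collections import defaultdict

# Assumed severity keywords mapped to dataset names; real log output may
# use different spellings or positions for the severity field.
SEVERITY_KEYWORDS = {
    "INFO": "info",
    "WARN": "warn",
    "WARNING": "warn",
    "ERROR": "error",
}


def severity_of(line):
    """Return the dataset name for the first severity keyword in a line."""
    for token in line.split():
        word = token.strip("|[]:").upper()
        if word in SEVERITY_KEYWORDS:
            return SEVERITY_KEYWORDS[word]
    return "other"


def split_by_severity(lines):
    """Group log lines into smaller datasets keyed by severity."""
    datasets = defaultdict(list)
    for line in lines:
        datasets[severity_of(line)].append(line)
    return dict(datasets)


if __name__ == "__main__":
    # Hypothetical log lines, written only for this illustration.
    sample = [
        "2015-05-01T10:00:00 onos1 INFO Device connected",
        "2015-05-01T10:00:05 onos1 WARN Port down",
        "2015-05-01T10:00:06 onos2 ERROR I/O Error: Broken pipe",
    ]
    for name, events in split_by_severity(sample).items():
        print(name, len(events))
```

Taking the first severity keyword in a line keeps a message such as "I/O Error: Broken pipe" from being misclassified when its text repeats a severity word, under the assumption that the severity field precedes the message text.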
Fig. 13: Log datasets for error messages for a period of days (axes: Time (day), Size (MB))

C. Event Filtering

We have used three datasets for comparison: the whole log dataset, the warn dataset and the error dataset. Fig 14 presents the numbers of resulting events for these datasets using ASF-BDT with the Φ range of (0.5, 0.8). Except for the error dataset, the number of resulting events increases considerably as the threshold increases. The whole log dataset reduces significantly after using ASF-BDT (approximately 75%), and the info dataset contributes a large number of correlated events. The warn dataset reduces at the same rate as the whole log dataset, while the error dataset also contains several correlated events such as I/O Error: Broken pipe. We observe that the experimental dataset possesses a high degree of event correlation.

Fig. 14: Resulting events for different datasets using ASF-BDT (axes: Threshold (s), Number of Results (x10^5); series: Log data, Warn event, Error event)

V. CONCLUSION

We have proposed an approach of distributed log event monitoring that assists system administrators in managing log events for SDN. The approach is characterized by the capability of collecting and filtering a large number of log events from network devices autonomously and efficiently. Collecting events requires the configuration of rsyslog and simulation applications and the implementation of scripts that run with the existing OpenFlow and syslog protocols. Filtering events applies the ASF-BDT method for obtaining non-trivial events precisely. We have configured an SDN network topology with Mininet and ONOS controllers on the Docker platform. The network system runs with multiple testing scenarios, such as shutting down some switches and hosts during execution to record warn and error event messages. Some experiments have collected a large volume of log datasets with different types of severities. The whole dataset is approximately 1.2 GB for a period of collection days. Other experiments have correlated events in the dataset to provide non-trivial events.

Future work focuses on setting up a larger SDN network topology with more controllers, switches and hosts so that we can collect a huge amount of event log data with a high level of event diversity. We also plan to configure NETCONF, supported by the next version of ONOS, to improve monitoring on the SDN network.

ACKNOWLEDGEMENTS

This research activity is funded by Vietnam National University in Ho Chi Minh City (VNU-HCM) under the grant number C2015-28-02.

REFERENCES

[1] N. Feamster, J. Rexford, and E. Zegura. The Road to SDN. Queue – Large-Scale Implementations, 11(12):20:20–20:40, December 2013.
[2] M. Boucadair and C. Jacquenet. Software-Defined Networking: A Perspective from within a Service Provider Environment. RFC 7149, March 2014.
[3] J. Case, M. Fedor, M. Schoffstall, and J. Davin. A Simple Network Management Protocol (SNMP). RFC 1157, May 1990.
[4] R. Gerhards. The Syslog Protocol. RFC 5424, March 2009.
[5] R. Enns, M. Bjorklund, J. Schönwälder, and A. Bierman. NETCONF Configuration Protocol. RFC 6241, June 2011.
[6] The OpenFlow Switch Specification. URL: http://OpenFlowSwitch.org. Last access in May 2015.
[7] D. Kreutz, F. M. V. Ramos, P. E. Verissimo, C. E. Rothenberg, S. Azodolmolky, and S. Uhlig. Software-Defined Networking: A Comprehensive Survey. Proc. IEEE, 103(1):14–76, Jan. 2015.
[8] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner. OpenFlow: Enabling Innovation in Campus Networks. SIGCOMM Comput. Commun. Rev., 38(2):69–74, March 2008.
[9] S. R. Chowdhury, M. F. Bari, R. Ahmed, and R. Boutaba. PayLess: A Low Cost Network Monitoring Framework for Software Defined Networks. In Proc. Network Operations and Management Symposium (NOMS'14), pages 1–9. IEEE, May 2014.
[10] Ye Yu, Chen Qian, and Xin Li. Distributed and collaborative traffic monitoring in software defined networks. In Proc. 3rd Workshop on Hot Topics in Software Defined Networking, HotSDN '14, pages 85–90, New York, NY, USA, 2014. ACM.
[11] The Apache Karaf. URL: http://karaf.apache.org/. Last access in May 2015.
[12] P. Berde, M. Gerola, J. Hart, Y. Higuchi, M. Kobayashi, T. Koide, B. Lantz, B. O'Connor, P. Radoslavov, W. Snow, and G. Parulkar. ONOS: Towards an Open, Distributed SDN OS. In Proc. 3rd Workshop on Hot Topics in Software Defined Networking, HotSDN '14, pages 1–6, New York, NY, USA, 2014. ACM.
[13] Hazelcast Distributed Data Structure. URL: http://docs.hazelcast.org/docs/latest/manual/html/distributed-data-structures.html. Last access in May 2015.
[14] An Open-Source Distributed SDN Operating System. URL: http://www.slideshare.net/albertspijkers/onos-sdn-open-networking. Last access in May 2015.
[15] B. Lantz, B. Heller, and N. McKeown. A network in a laptop: Rapid prototyping for software-defined networks. In Proc. 9th ACM SIGCOMM Workshop on Hot Topics in Networks, Hotnets-IX, pages 19:1–19:6, New York, NY, USA, 2010. ACM.
[16] The Docker Platform. URL: https://docs.docker.com/. Last access in May 2015.
[17] Containers vs VMs. URL: http://www.linuxfeed.org/2015/07/presentazionea-docker/. Last access in May 2015.
[18] The Rocket Fast System for Log Processing. URL: http://www.rsyslog.com/. Last access in May 2015.
[19] The Rsyslog Filter Conditions. URL: http://www.rsyslog.com/doc/v8-stable/configuration/filters.html. Last access in May 2015.
[20] Y. Liang, Y. Zhang, H. Xiong, and R. K. Sahoo. An Adaptive Semantic Filter for Blue Gene/L Failure Log Analysis. In Proc. 21st International Parallel and Distributed Processing Symposium (IPDPS 07), pages 1–8. IEEE Computer Society, 2007.
[21] H. M. Tran, A. V. T. Tran, S. T. Le, and S. V. Nguyen. Improving Adaptive Semantic Filtering with Bounded Dynamic Threshold for Log Data Analytics. Journal of Science and Technology, Vietnamese Academy of Science and Technology, 52:122–130, 2014.
[22] H. T. Reynolds. The Analysis of Cross-Classifications. The Free Press, New York, 1977.
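As an illustration of the semantic filtering step that uses the Φ coefficient, the following sketch computes Φ between two event types from their co-occurrence in fixed time windows. It is a sketch under stated assumptions rather than the ASF-BDT implementation: the windowing scheme, the function names and the example threshold of 0.5 (the lower end of the Φ range used in the evaluation) are chosen only for this example, and how ASF-BDT actually builds its indicator vectors is not described here.

```python
import math


def occurrence_vector(timestamps, window, n_windows):
    """Mark each of n_windows fixed-size windows in which an event occurs."""
    vector = [0] * n_windows
    for ts in timestamps:
        index = int(ts // window)
        if 0 <= index < n_windows:
            vector[index] = 1
    return vector


def phi_coefficient(a, b):
    """Phi coefficient of two equal-length 0/1 indicator sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    # Cells of the 2x2 contingency table.
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    # A zero denominator means one event type never varies, so no
    # correlation can be measured; 0.0 is returned by convention here.
    if denominator == 0:
        return 0.0
    return (n11 * n00 - n10 * n01) / denominator


def correlated(a, b, threshold=0.5):
    """Treat two event types as correlated when phi reaches the threshold."""
    return phi_coefficient(a, b) >= threshold
```

Two event types whose indicator vectors are correlated above the threshold would be candidates for being reported as a single non-trivial event; the choice of window size directly affects which events appear to co-occur.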