Technology Final Report Secure/Resilient Systems and Data Dissemination/Provenance


Technology Final Report
Secure/Resilient Systems and Data Dissemination/Provenance

September 2017

Prepared for the Northrop Grumman Cyber Research Consortium as part of the IS Sector Investment Program

Prepared by Bharat Bhargava, CERIAS, Purdue University

Table of Contents
Executive Summary
1.1 Statement of Problem
1.2 Current State of Technology
1.3 Proposed Solution
1.4 Technical Activities, Progress, Findings and Accomplishments
1.5 Distinctive Attributes, Advantages and Discriminators
1.6 Tangible Assets Created by Project
1.7 Outreach Activities and Conferences
1.8 Intellectual Property Accomplishments
General Comments and Suggestions for Next Year

List of Figures
Figure 1. High-level view of proposed resiliency framework
Figure 2. Service acceptance test
Figure 3. View of space and time of MTD-based resiliency solution
Figure 4. Moving target defense application example
Figure 5. High-level resiliency framework architecture
Figure 6. System states of the framework
Figure 7. Data d1 leakage from Service X to Service Y
Figure 8. Data sensitivity probability functions
Figure 9. Encrypted search over database of active bundles (by Leon Li, NG "WAXEDPRUNE" project)
Figure 10. Experiment setup for Moving Target Defense (MTD)
Figure 11. EHR dissemination in cloud (created by Dr. Leon Li, NGC)
Figure 12. AB performance overhead with browser's crypto capabilities on/off
Figure 13. Encrypted search over encrypted database

List of Tables
Table 1. Executive Summary
Table 2. Operations supported by different crypto systems
Table 3. Moving Target Defense (MTD) measurements
Table 4. Encrypted database of active bundles, table 'EHR_DB'

Executive Summary

Title: Secure/Resilient Systems and Data Dissemination/Provenance
Author(s): Bharat Bhargava
Principal Investigator: Bharat Bhargava
Funding Amount: $200,000
Period of Performance: September 1, 2016 - August 31, 2017
Was this a continuation of an Investment Project? Yes
Key Words: data provenance, data leakage, resiliency, adaptability, security, self-healing, MTD
TRL Level:
Key Partners & Vendors:

Table 1: Executive Summary
1.1 Statement of Problem

In a cloud-based environment, the enlarged attack surface along with the constant use of zero-day exploits hampers attack mitigation, especially when attacks originate at the kernel level. In a virtualized environment, an adversary that has fully compromised a virtual machine (VM) and holds system privileges (kernel level, not the hypervisor) without being detected by traditional security mechanisms exposes the cloud processes and cloud-resident data to attacks that might compromise their integrity and privacy, jeopardizing mission-critical functions. The main shortcoming of traditional defense solutions is that they are tailored to specific threats and are therefore limited in their ability to cope with attacks originating outside their scope. There is a need to develop resilient, adaptable, reconfigurable infrastructure that can incorporate emerging defensive strategies and tools. The architectures have to provide resiliency (withstand cyberattacks, and sustain and recover critical function) and antifragility (increase in capability, resilience, or robustness as a result of mistakes, faults, attacks, or failures).

The volume of information and the real-time requirements have increased due to the advent of multiple input points such as emails, texts, voice, and tweets. They all arrive at government agencies such as the US State Department for dissemination to many stakeholders. The security of classified information (cyber data, user data, attack event data) must be ensured so that it can be identified as classified (secret) and disseminated, based on access privileges, to the right user in a specific location on a specific device. For forensics/provenance, the identity of all who have accessed, updated, or disseminated the sensitive cyber data, including the attack event data, must be determined. There is a need to build systems capable of collecting, analyzing, and reacting to dynamic cyber events across all domains, while also ensuring that cyber threats do not propagate across security domain boundaries and compromise the operation of the system. Solutions that develop a science of cyber security applicable to all systems, infrastructure, and applications are needed.

Current resilience schemes based on replication increase the number of ways an attacker can exploit or penetrate the systems. It is critical to design a vertical resiliency solution from the application layer down to the physical infrastructure, in which protection against attacks is integrated across all layers of the system (i.e., application, runtime, network) at all times, allowing the system to start secure, stay secure, and return secure+ (i.e., return with greater security than before) [13] after performing its function.

1.2 Current State of Technology

Current industry-standard cloud systems such as Amazon EC2 provide coarse-grain monitoring capabilities (e.g., CloudWatch) for various performance parameters of services deployed in the cloud. Although such monitors are useful for handling issues such as load distribution and elasticity, they do not provide information regarding potentially malicious activity in the domain. Log management and analysis tools such as Splunk [1], Graylog [2], and Kibana [3] provide capabilities to store, search, and analyze big data gathered from various types of logs on enterprise systems, enabling organizations to detect security threats through examination by system administrators.
Such tools mostly require human intelligence for detection of threats and need to be complemented with automated analysis and accurate threat detection capabilities, so that possibly malicious activity in the enterprise can be responded to quickly and resiliency can be increased through automation of response actions. In addition, Splunk is expensive.

There are well-established moving target defense (MTD) solutions designed to combat specific threats, but they are limited when exploits fall outside their boundaries. For instance, application-level redundancy and replication schemes prevent exploits that target the application code base, but fail against code injection attacks that target runtime execution (e.g., buffer and heap overflows) and the control flow of the application. Instruction set randomization [51], address space randomization [4], runtime randomization [5], and system call randomization [6] have been used to effectively combat system-level (i.e., return-oriented/code injection) attacks. System-level diversification and randomization are considered mature and are tightly integrated into some operating systems. Most of these defensive security mechanisms (i.e., instruction/memory address randomizations) are effective for their targets; however, modern sophisticated attacks require defensive approaches that are deeply integrated into the architecture, from the application level down to the infrastructure, simultaneously and at all times.

Several general approaches have been proposed for controlling access to shared data and protecting its privacy. DataSafe is a software-hardware architecture that supports data confidentiality throughout the data lifecycle [7]. It is based on additional hardware and uses a trusted hypervisor to enforce policies, track data flow, and prevent data leakage. Applications running on the host are not required to be aware of DataSafe and can operate unmodified and access data transparently. Hosts without DataSafe can only access encrypted data, but the system is unable to track data once it is disclosed to non-DataSafe hosts. The use of a special architecture limits the solution to well-known hosts that already have the required setup; it is not practical to assume that all hosts will have the required hardware and software components in a cross-domain service environment. A privacy-preserving information brokering (PPIB) system has been proposed for secure information access and sharing via an overlay network of brokers, coordinators, and a central authority (CA) [8]. The approach does not consider the heterogeneity of components, such as different security levels of clients' browsers, different user authentication schemes, and trust levels of services. The use of a trusted third party (TTP) creates a single point of trust and failure. Other solutions address secure data dissemination in untrusted environments. Pearson et al. present a case study of the EnCoRe project, which uses sticky policies to manage the privacy of shared data across different domains [9]. In the EnCoRe project, the sticky policies are enforced by a TTP and allow tracking of data dissemination, which makes the approach prone to TTP-related issues. The sticky policies are also vulnerable to attacks from malicious recipients.
1.3 Proposed Solution

We propose an approach for enterprise system and data resiliency that is capable of dynamically adapting to attack and failure conditions through performance/cost-aware process and data replication, data provenance tracking, and automated software-based monitoring and reconfiguration of cloud processes (see Figure 1). The main components of the proposed solution and the challenges involved in their implementation are described below.

Figure 1. High-level view of proposed resiliency framework

1.3.1 Software-Defined Agility & Adaptability

Adaptability to adverse situations and restoration of services are significant for high performance and security in a distributed environment. Changes in both service context and user context can affect service compositions, requiring dynamic reconfiguration. While changes in user context can result in updated priorities (such as trading accuracy for shorter response time in an emergency) as well as updated constraints (such as requiring the trust levels of all services in a composition to be higher than a particular threshold in a critical mission), changes in service context can result in failures requiring the restart of a whole service composition. Advances in virtualization have enabled rapid provisioning of resources, tools, and techniques to build agile systems that provide adaptability to changing runtime conditions. In this project, we will build upon our previous work in adaptive network computing [10] and end-to-end security in SOA [11], and on the advances in software-defined networking (SDN), to create a dynamically reconfigurable processing environment that can incorporate a variety of cyber defense tools and techniques.

Our enterprise resiliency solution is based on two main industry-standard components: the OpenStack [12] cloud management software (Nova), which provides virtual machines on demand; and the software-defined networking solution (Neutron), which provides networking as a service and runs on top of OpenStack. The solution that we developed for monitoring cloud processes and dynamic reconfiguration of service compositions, described in [10], involved a distributed set of monitors in every service domain for tracking service/domain-level performance and security parameters, and a central monitor to keep track of the health of various cloud services. Even though that solution enables dynamic reconfiguration of entire service compositions in the cloud, it requires replication, registration, and tracking of services at multiple sites, which could have performance and cost implications for the enterprise. In order to overcome these challenges, the proposed framework utilizes live monitoring of cloud resources to dynamically detect deviations from normal service behavior and integrity violations, and self-heals by reconfiguring service compositions through software-defined networking of automatically migrated service instances. A component of this software-defined agility and adaptability solution is live monitoring of services, as described below.

1.3.1.1 Live Monitoring

Cyber-resiliency is the ability of a system to continue degraded operations, self-heal, or deal with the present situation when attacked [13]. We may need to shut down less critical computations and communications and allow for weaker consistency, as long as the mission requirements are satisfied. For this we need to measure the assurance level (integrity/accuracy/trust) of the system from Quality of Service (QoS) parameters such as response time, throughput, packet loss, delays, consistency, acceptance test success, etc. To ensure the enforcement of SLAs and provide high security assurance in enterprise cloud computing, a generic monitoring framework needs to be developed.
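To make the idea of deriving an assurance level from QoS parameters concrete, the following minimal Python sketch scores a service against per-metric targets. This is not the project's monitoring framework; the metric names, targets, and weights are illustrative assumptions.

```python
# Illustrative sketch (not the project's implementation): estimating a coarse
# assurance level of a service from observed QoS parameters. The metric names,
# thresholds, and weights below are assumptions chosen for the example.

QOS_TARGETS = {
    # metric: (target value, higher_is_better, weight)
    "response_time_ms": (200.0, False, 0.4),
    "throughput_rps":   (500.0, True,  0.3),
    "packet_loss_pct":  (1.0,   False, 0.2),
    "acceptance_tests_passed_pct": (100.0, True, 0.1),
}

def assurance_level(observed: dict) -> float:
    """Return a score in [0, 1]; 1.0 means all QoS targets are met."""
    score = 0.0
    for metric, (target, higher_is_better, weight) in QOS_TARGETS.items():
        value = observed.get(metric)
        if value is None:
            continue  # a missing measurement contributes nothing
        if higher_is_better:
            ratio = min(value / target, 1.0)
        else:
            ratio = min(target / value, 1.0) if value > 0 else 1.0
        score += weight * ratio
    return score

if __name__ == "__main__":
    sample = {"response_time_ms": 250.0, "throughput_rps": 480.0,
              "packet_loss_pct": 0.5, "acceptance_tests_passed_pct": 90.0}
    print(f"assurance = {assurance_level(sample):.2f}")  # roughly 0.90
```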
The challenges involved in effective monitoring and analysis of service/domain behavior include the following:
- Identification of significant metrics, such as response time, CPU usage, memory usage, etc., for service performance and behavior evaluation.
- Development of models for identifying deviations from performance goals (e.g., keeping the total response time below a specific threshold) and security goals (e.g., keeping service trust levels above a certain threshold).
- Design and development of adaptable service configurations and live migration solutions for increased resilience and availability.

Development of effective models for detection of anomalies in a service domain relies on careful selection of the performance and security parameters to be integrated into the models. Model parameters should be easy to obtain and representative of the performance and security characteristics of various services running on different platforms. We plan to investigate and utilize the following monitoring tools, which provide integration with OpenStack, in order to gather system usage/resiliency parameters in real time [14]:
- Ceilometer [15]: Provides a framework to meter and collect infrastructure metrics such as CPU, network, and storage utilization. The tool provides alarms that fire when a metric crosses a predefined threshold, and it can be used to send alarm information to external servers.
- Monasca [16]: Provides a large framework for various aspects of monitoring, including alarms, statistics, and measurements for all OpenStack components. Tenants can define what to measure, what statistics to collect, how to trigger alarms, and the notification method.
- Heat [17]: Provides an orchestration engine to launch multiple composite cloud applications based on templates in the form of text files that can be treated like code, enabling actions such as autoscaling based on alarms received from Ceilometer.

As a further improvement for dynamic service orchestration and self-healing, we plan to investigate models based on a graceful degradation approach for service composition, which replace services that do not pass acceptance tests (as seen in Figure 2), based on user-specified or context-based policies, with ones that are more likely to pass the tests at the expense of decreased performance.

Figure 2. Service acceptance test
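A minimal sketch of the acceptance-test-driven graceful degradation described above follows. It is illustrative only: the services, the latency/status acceptance test, and the candidate ordering are assumptions, not the policies of the actual framework.

```python
# Illustrative sketch of graceful degradation (not the project's code): if a
# service's output fails its acceptance test, fall back to an alternative
# service that is more likely to pass, at reduced quality.

from typing import Callable, Iterable, Optional

def acceptance_test(response: dict) -> bool:
    # Example test: the response arrived in time and carries a usable payload.
    return response.get("status") == 200 and response.get("latency_ms", 1e9) < 500

def invoke_with_fallback(candidates: Iterable[Callable[[], dict]]) -> Optional[dict]:
    """Try candidate services in priority order; return the first response
    that passes the acceptance test, or None if all of them fail."""
    for service in candidates:
        try:
            response = service()
        except Exception:
            continue  # treat an exception as a failed acceptance test
        if acceptance_test(response):
            return response
    return None

# Usage: primary service first, then a degraded (e.g. cached) alternative.
def primary():  return {"status": 200, "latency_ms": 900, "data": "full result"}
def degraded(): return {"status": 200, "latency_ms": 120, "data": "cached result"}

print(invoke_with_fallback([primary, degraded]))  # primary fails the test -> degraded
```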
1.3.2 Moving Target Defense for Resiliency/Self-healing

The traditional defensive security strategy for distributed systems is to prevent attackers from gaining control of the system using known techniques such as firewalls, redundancy, replication, and encryption. However, given sufficient time and resources, all these methods can be defeated, especially when dealing with sophisticated attacks from advanced adversaries that leverage zero-day exploits. This highlights the need for more resilient, agile, and adaptable solutions to protect systems. MTD is a component of the NGC project Cyber Resilient System [13]. Sunil Lingayat of NGC has taken interest and connected us with other researchers at NGC working in Dayton.

Our proposed Moving Target Defense (MTD) [18, 19] attack-resilient, virtualization-based framework is a defensive strategy that aims to reduce the need to continuously fight against attacks by decreasing the gain-loss balance perception of attackers. The framework narrows the exposure window of a node to such attacks, which increases the cost of attacks on a system and lowers the likelihood of success and the perceived benefit of compromising it. The reduction in the vulnerability window of nodes is achieved mainly through three steps: (1) partitioning the runtime execution of nodes into time intervals; (2) allowing nodes to run only for a predefined lifespan (as low as a minute) on heterogeneous platforms (i.e., different OSs); and (3) proactively monitoring their runtime below the OS.

The main idea of this approach is to allow nodes to run on a given computing platform (i.e., hardware, hypervisor, and OS) for a controlled period of time chosen in such a manner that successful ongoing attacks become ineffective, as suggested in [20, 21, 22, 23]. We accomplish such control by allowing nodes to run only for a short period of time to complete n client requests on a given underlying computing platform, then vanish and appear on a different platform with different characteristics (guest OS, host OS, hypervisor, hardware, etc.). We refer to this randomization and diversification technique of vanishing a node so that it appears on another platform as reincarnation.

The proposed framework introduces resiliency and adaptability to systems. Resilience has two main components: (1) continuing operation and (2) fighting through compromise [13]. The MTD framework takes these components into consideration, since it transforms systems so that they can adapt and self-heal when ongoing attacks are detected, which guarantees operational continuity. The initial target of the framework is to prevent successful attacks by establishing short lifespans for nodes/services to reduce the probability of attackers taking over control. In case an attack occurs within the lifespan of a node, the proactive monitoring system triggers a reincarnation of the node. The attack model considers an adversary taking control of a node undetected by traditional defensive mechanisms, a valid assumption in the face of novel attacks. The adversary gains high privileges on the system and is able to alter all aspects of the applications. Traditionally, the advantage of the adversary in this case is the unbounded time and space to compromise and disrupt the reliability of the system, especially when it is replicated (i.e., colluding). The fundamental premise of the proposed framework is to eliminate the time and space advantage of the adversaries and create agility to avoid attacks that can defeat system objectives, by extending the cloud framework. We assume that the cloud management software stack (i.e., the framework) and the virtual introspection libraries are secure.
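The reincarnation cycle described above can be sketched as a simple controller loop. The sketch below is illustrative, not the project's implementation: run_node, introspect_is_dirty, stop_node, the platform pool, and the lifespans are assumed placeholder interfaces and values.

```python
# Minimal sketch of the lifespan-bounded reincarnation loop (assumed interfaces).
import random
import time

PLATFORMS = ["ubuntu-14.04/kvm", "ubuntu-12.04/kvm", "centos-7/xen"]  # assumed pool

def reincarnation_controller(run_node, introspect_is_dirty, stop_node,
                             lifespan_s=60, introspection_period_s=5, rounds=3):
    """Run a node for a bounded lifespan per platform, reincarnating it early
    if below-the-OS introspection flags a compromise."""
    platform = random.choice(PLATFORMS)
    for _ in range(rounds):
        node = run_node(platform)                 # node appears on current platform
        started = time.time()
        while time.time() - started < lifespan_s:
            time.sleep(introspection_period_s)
            if introspect_is_dirty(node):         # attack detected within lifespan
                break                             # reincarnate ahead of schedule
        stop_node(node)                           # vanish the node ...
        platform = random.choice([p for p in PLATFORMS if p != platform])
        # ... it reappears on a different, randomly chosen platform next round

# Toy usage with stubbed callbacks (short lifespan so the demo finishes quickly).
if __name__ == "__main__":
    reincarnation_controller(
        run_node=lambda p: print(f"node up on {p}") or p,
        introspect_is_dirty=lambda node: random.random() < 0.1,
        stop_node=lambda node: print(f"node on {node} vanished"),
        lifespan_s=2, introspection_period_s=1)
```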
1.3.2.1 Resiliency Framework Design

The criticality of diversity as a defensive strategy, in addition to replication/redundancy, was first proposed in [24]. Diversity and randomization allow the system defender to deceive adversaries by continuously shifting the attack surface of the system. We introduce a unified, generic MTD framework designed to simultaneously move in space (i.e., across platforms) and in time (i.e., time intervals as low as a minute). Unlike state-of-the-art singular MTD approaches [25, 26, 27, 28, 29, 30], we view the system as a multidimensional space where we apply MTD on all layers of the space (application, OS, network) in short time intervals, while remaining aware of the status of the rest of the nodes in the system.

Figure 3 illustrates how the MTD framework works. The y-axis depicts the view of the space (e.g., application, OS, network) and the x-axis the runtime (i.e., elapsed time). The figure compares traditional replicated systems without any diversification and randomization technique, state-of-the-art systems [25, 26, 27, 28, 29, 30] with diversification and randomization techniques applied to certain layers of the infrastructure (application, OS, or network), and the proposed solution, which applies MTD to all layers.

Figure 3. View of space and time of MTD-based resiliency solution

As illustrated in Figure 3.c, nodes/services that are not reincarnated in a particular time interval are marked with the result of an observation (e.g., introspection) of either Clean (C) or Dirty (D) (i.e., not compromised/compromised). To illustrate, in the third reincarnation round with n replicas, one replica is detected to be clean (its time-interval entry is marked with C) and another as dirty (its entry is marked with D). We reincarnate the node whose entry shows D ahead of the node scheduled for the next time interval. Two important factors need to be considered in the design of this framework: the lifespan of nodes or virtual machines, and the migration technique used in the reincarnation. Figure 4 shows a possible scenario in which virtual machines running on a platform become IDLE when an attack occurs and is detected. When and how to reincarnate nodes are our main research questions.

Figure 4. Moving target defense application example

Long lifespans increase the probability of success of an ongoing attack, while too-short lifespans impact the performance of the system. Novel ways to determine when to vanish a node and run its replica in a new VM need to be developed. In [23], VMs are reincarnated at fixed periods of time chosen using Round Robin or a randomized selection mechanism. We propose the implementation of a more adaptable solution, which uses Virtual Machine Introspection (VMI) to persistently monitor the communication between virtual requests and available physical resources and switch the VM when anomalous behaviors are observed.

The other crucial factor in our design is the live migration technique used for virtual machine reincarnation. Migrating operating system instances across different platforms has traditionally been used to facilitate fault management, load balancing, and low-level system maintenance [31]. Several techniques have been proposed to carry out the majority of the migration while the OSes continue to run, to achieve acceptable performance with minimal service downtime. We propose to integrate some of these techniques [31, 32, 33], in a clustered environment, into our MTD solution to guarantee adaptability and agility in our system. When virtual machines are running live services, it is important that the reincarnation occurs in such a manner that both downtime and total transfer time are minimal. The downtime refers to the time when no service is available during the transition. The total transfer time refers to the time it takes to complete the transition [31]. Our main idea is to continue running the service in the source VM until the destination VM is ready to offer the service independently. In this process, there is a period during which part of the state (the most unchangeable state information) is copied to the destination VM while the source VM is still running. At some point, the source VM is stopped to copy the rest of the information (the most changeable state information) to the destination VM, which takes control after the information is copied. No service is available from the moment the source VM is stopped until the copying process completes; this period of time defines the downtime.
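The relationship between downtime and total transfer time in the pre-copy style reincarnation described above can be illustrated with a toy simulation; all sizes, dirty rates, and bandwidths below are assumptions, not measurements from the project.

```python
# Toy simulation of the pre-copy idea (not the project's migration code): copy
# the mostly-static state while the source VM keeps running, then stop it only
# for the final small stop-and-copy phase.

def precopy_migration(state_mb=4096, dirty_mb_per_round=64,
                      bandwidth_mb_s=1000, max_rounds=10):
    total_transfer_s = 0.0
    remaining = state_mb                      # round 1: copy everything once
    for _ in range(max_rounds):
        total_transfer_s += remaining / bandwidth_mb_s
        remaining = dirty_mb_per_round        # pages dirtied while copying
        if remaining / bandwidth_mb_s < 0.1:  # small enough to stop-and-copy
            break
    downtime_s = remaining / bandwidth_mb_s   # source VM is stopped only for this
    total_transfer_s += downtime_s
    return downtime_s, total_transfer_s

downtime, total = precopy_migration()
print(f"downtime ~{downtime*1000:.0f} ms, total transfer ~{total:.1f} s")
```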
1.3.2.2 Resiliency Framework Infrastructure

Our framework will be built on top of the OpenStack cloud framework [12], a widely adopted open-source cloud management software stack. Figure 5 shows the high-level architecture of our framework.

1.3.3 Data Provenance and Leakage Detection in Untrusted Cloud

Data leakage can be prevented or mitigated as follows:
- Disseminate data gradually: first give part of the data (incomplete, less sensitive); watch how the data is used and monitor the trust level of the consuming service; if the trust level is sufficient, give the next portion of the data.
- Raise the level of data classification to prevent a repetition of the leakage.
- Use intentional leakage to create uncertainty and lower the value of the data.
- Monitor network messages: check whether they contain, e.g., a credit card number that satisfies a specific pattern and can be validated using regular expressions.
- After a leakage is detected, make the system stronger against similar attacks:
  o Separate the compromised role into two roles, e.g., suspicious_role and benign_role.
  o Send new certificates to all benign users for the benign role.
  o Create a new AB with new policies, restricting access for suspicious_role (e.g., for all doctors from the same hospital as the malicious one).
  o Increase the sensitivity level of the leaked data items, e.g., for the diagnosis.

Data Leakage Damage Assessment

After a data leakage is detected, damage is assessed based on:
- To whom the data was leaked (a service with a low trust level vs. a service with a high trust level)
- Sensitivity (classification) of the leaked data (classified vs. unclassified)
- When the leaked data was received
- Whether other sensitive data can be derived from the leaked data (e.g., a diagnosis can be derived from a leaked medical prescription)

Damage = P(Data is Sensitive) * P(Service is Malicious) * P(t)    (1)

where P(t) is the probability function for data sensitivity over time.

Figure 8. Data sensitivity probability functions

Figure 8 shows three different data sensitivity probability functions. A data-related event (e.g., a product release) occurs at time t0. The threat from the data being leaked before t0 is high. The threat from the data being leaked after t0 either goes to zero right away (e.g., a leaked picture of a new smartphone after the product has already been released), degrades linearly (e.g., a new encryption algorithm that is studied by attackers after it has been released), or remains high over time (e.g., a novel design of a product).
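A minimal Python sketch of the damage estimate in equation (1), with the three data sensitivity profiles of Figure 8, is shown below. The specific probabilities, the event time t0, and the decay horizon are assumptions chosen for illustration.

```python
# Illustrative sketch of equation (1) with the three P(t) profiles of Figure 8.
# Concrete probabilities, t0, and the decay horizon are assumptions.

def p_t_step(t, t0):
    """Sensitivity drops to zero once the event has happened (released photo)."""
    return 1.0 if t < t0 else 0.0

def p_t_linear(t, t0, horizon=100.0):
    """Sensitivity decays linearly after the event (published crypto algorithm)."""
    if t < t0:
        return 1.0
    return max(0.0, 1.0 - (t - t0) / horizon)

def p_t_constant(t, t0):
    """Sensitivity stays high over time (novel product design)."""
    return 1.0

def damage(p_sensitive, p_service_malicious, p_t):
    # Equation (1): Damage = P(data is sensitive) * P(service is malicious) * P(t)
    return p_sensitive * p_service_malicious * p_t

t0, t_leak = 10.0, 30.0
for name, profile in [("step", p_t_step), ("linear", p_t_linear), ("constant", p_t_constant)]:
    d = damage(p_sensitive=0.8, p_service_malicious=0.6, p_t=profile(t_leak, t0))
    print(f"{name:8s} P(t)={profile(t_leak, t0):.2f}  damage={d:.2f}")
```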
1.3.3.2 Data Provenance

Each time a service tries to access data from an AB, the CM is notified and provenance data is recorded at the CM. The following provenance data is stored in the log file, in encrypted form, at the CM:
- Who tried to decrypt the data
- What type (class) of data
- When
- Where the data came from (who the sender is)

Although there are overheads and challenges related to provenance [36, 39], not collecting it has huge implications for the reliability, veracity, and reproducibility of Big Data tracking and for the utilization of data and process provenance in workflow-driven analytical and other applications. In addition, forensics requires provenance data for investigating data leakage incidents. As a next step, we started investigating a blockchain-based mechanism to ensure the integrity of provenance data, i.e., the integrity of the log files.

1.3.3.3 Privacy-preserving Publish/Subscribe Cloud Data Storage and Dissemination

Figure 9. Encrypted search over database of Active Bundles (by Leon Li, NGC "WaxedPrune" project)

Figure 9 (created by Leon Li at NGC) shows a public cloud that provides a repository for storing and sharing intelligence feed data from various sources. This repository is used by analytics software to derive different data trends. The use of a public cloud requires the stored data to be encrypted. The cloud includes three main components: (a) a Collection Agent, used to gather data from multiple distributed intelligence feeds; it maps similar data across different feeds and logically orders them for storage; (b) CryptDB [44], which stores encrypted data and provides SQL query capabilities over the encrypted data, using a search engine based on SQL-aware encryption schemes; and (c) a Subscription API that provides various methods for data consumers to enable authorized access to data.

Our solution relies on CryptDB [44], which is used to support encrypted search queries over encrypted data. CryptDB is a proxy in front of a database server. It never releases the decryption key to the database; thus, the database can be hosted by an untrusted cloud provider. If the database is compromised, only ciphertext is revealed, and data leakage is limited to the data of currently logged-in users. The user's search query is stemmed, stopwords are removed, and then it is converted to an SQL query. Table 2 shows the operations supported by different crypto systems.

Table 2. Operations supported by different crypto systems

CryptDB has several weak points: (a) the OPE encryption scheme is not secure in the sense that it reveals order; (b) not all SQL queries are supported, e.g., a query that involves two arithmetic operations, a + b*c, is not supported. A possible solution is to use Fully Homomorphic Encryption (FHE).
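The query pipeline described above (stemming, stopword removal, conversion to SQL) can be sketched as follows. The stopword list, the naive stemmer, and the table/column names are assumptions; in the deployed system the resulting SQL would be issued through the CryptDB proxy, which transparently rewrites it onto the encrypted columns.

```python
# Minimal sketch of the keyword-query-to-SQL step (assumed stopwords, stemmer,
# and schema). This is not CryptDB code; it only builds the plaintext SQL that
# a proxy such as CryptDB would then execute over encrypted columns.

import re

STOPWORDS = {"the", "a", "an", "of", "for", "with", "patients", "all"}

def naive_stem(word: str) -> str:
    # crude suffix stripping, standing in for a real stemmer
    for suffix in ("ies", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def keyword_query_to_sql(query: str, table: str = "EHR_DB"):
    terms = [naive_stem(t) for t in re.findall(r"[a-z0-9]+", query.lower())
             if t not in STOPWORDS]
    if not terms:
        return f"SELECT ID FROM {table};", []
    where = " OR ".join("Diagnosis LIKE %s" for _ in terms)
    params = [f"%{t}%" for t in terms]
    return f"SELECT ID FROM {table} WHERE {where};", params

sql, params = keyword_query_to_sql("all patients with concussions")
print(sql)     # SELECT ID FROM EHR_DB WHERE Diagnosis LIKE %s;
print(params)  # ['%concussion%']
```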
1.4 Technical Activities, Progress, Findings and Accomplishments

The main technical activities for this project consisted of the implementation of an enterprise system and data resiliency solution that is capable of dynamically adapting to attack and failure conditions through performance/cost-aware process and data replication, data provenance tracking, and automated software-based monitoring and reconfiguration of cloud processes. As the system was built, experiments were conducted to measure its performance. The results were used in two publications and in two posters for the CERIAS Security Symposium at Purdue. The details of the technical activities and our main findings are described below.

1.4.1 Live Monitoring

Cyber-resiliency is the ability of a system to continue degraded operations, self-heal, or deal with the present situation when attacked [13]. For this we need to measure the assurance level (integrity/accuracy/trust) of the system from Quality of Service (QoS) parameters such as response time, throughput, packet loss, delays, consistency, etc. The solution developed for dynamic reconfiguration of service compositions described in [10] involved a distributed set of monitors in every service domain for tracking performance and security parameters, and a central monitor to keep track of the health of various cloud services. Even though that solution enables dynamic reconfiguration of entire service compositions in the cloud, it requires replication, registration, and tracking of services at multiple sites, which could have performance and cost implications for the enterprise. To overcome these challenges, the framework proposed in this work utilizes live monitoring of cloud resources to dynamically detect deviations from normal behavior and integrity violations, and self-heals by reconfiguring service compositions through software-defined networking [45] of automatically migrated service/VM instances.

As the goal of the proposed resiliency solution is to provide a generic model for the detection of possible threats and failures in a cloud-based runtime environment, limiting the anomaly detection models to supervised learning algorithms will not provide the desired applicability. Hence, unsupervised learning models such as k-means clustering [46] and one-class SVM classification [47], used to detect outliers (i.e., anomalies) in service and VM behavior, are more appropriate. Algorithm 1 shows an adaptation of the k-means algorithm to cluster service performance data under normal system operation conditions, and Algorithm 2 shows how to detect outliers by measuring the distance of the performance vector of a service at a particular point in time to all clusters formed during training. Additionally, virtual machine introspection (VMI) [48] techniques need to be utilized to check the integrity of VMs at runtime, to ensure that the application's memory structure has not been modified in an unauthorized manner. The results of the monitoring and anomaly detection processes help decide when to reincarnate VMs, as described in the next section.

Algorithm 1. Anomaly training algorithm
Algorithm 2. Anomaly detection algorithm
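Since Algorithms 1 and 2 are not reproduced in this text, the following is only a minimal sketch of the described approach: cluster performance vectors observed under normal operation with k-means, and flag a new vector as anomalous when it lies outside every learned cluster. The cluster count, distance threshold, and example metrics are assumptions.

```python
# Minimal sketch of the k-means-based anomaly training/detection idea
# (assumed parameters; not the report's Algorithms 1 and 2).
import numpy as np

def train_clusters(X, k=3, iters=50, seed=0):
    """Algorithm-1 style training: k-means over normal performance vectors."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each performance vector to its nearest cluster center
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
    # per-cluster radius: largest distance of a member vector to its center
    radii = np.array([np.linalg.norm(X[labels == j] - centers[j], axis=1).max()
                      if np.any(labels == j) else 0.0 for j in range(k)])
    return centers, radii

def is_anomalous(x, centers, radii, slack=1.5):
    """Algorithm-2 style detection: x is an outlier if it lies outside every cluster."""
    dists = np.linalg.norm(centers - x, axis=1)
    return bool(np.all(dists > slack * radii))

# Example: vectors of (response time in ms, CPU %, memory %) under normal load.
normal = np.random.default_rng(1).normal([120, 40, 55], [10, 5, 5], size=(200, 3))
centers, radii = train_clusters(normal)
print(is_anomalous(np.array([125, 42, 57]), centers, radii))  # expected: False
print(is_anomalous(np.array([900, 95, 90]), centers, radii))  # expected: True
```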
1.4.2 Moving Target Defense

The main idea of this MTD technique is to allow a node running a distributed application on a given computing platform for a controlled period of time before vanishing it. The allowed running time is chosen in such a manner that successful ongoing attacks become ineffective, and a new node with different computing platform characteristics is created and inserted in place of the vanishing node. The new node is updated by the remaining nodes after completing the replacement. The required synchronization time is determined by the application and by the amount of data that needs to be transferred to the new node, as the reincarnation process does not keep the state of the old node. This randomization and diversification technique of vanishing a node so that it appears on another platform is called node reincarnation [18, 19].

One key question is determining when to reincarnate a node. One approach is to set a fixed period of time for each node and reincarnate it after that lifespan; in this first approach, nodes to be reincarnated are selected either in Round Robin order or randomly. However, attacks can occur within the lifespan of each machine, which makes live monitoring mechanisms a crucial element. Whether an attack is going on at the beginning of the reincarnation process determines how soon the old node must be stopped to keep the system resilient. When no threats are present, both the old node and the new node can participate in the reincarnation process; the old node can continue running until the new node is ready to take its place. On the contrary, if an attack is detected, the old node should be stopped immediately and the reincarnation should occur without its participation, which from the perspective of the distributed application represents a greater downtime of the node.

Our main contribution here is the design and implementation of a prototype that speeds up the node reincarnation process using SDN, which allows configuring network devices on the fly via OpenFlow. We avoid swapping the virtual network interfaces of the nodes involved in the process, as proposed in [18], to save time in the preparation of the new virtual machine. The new virtual machine is created and automatically connected to the network. The machine then starts participating in the distributed application once routing flows are inserted into the network devices to redirect the traffic directed to the old VM to the new one.

1.4.2.1 Experiments and Results

Experiments were conducted to evaluate the operation times of the proposed MTD solution. Figure 10 shows the experiment setup. A Byzantine fault tolerant (BFT-SMaRt) distributed application was run on a set of Ubuntu (either 12.04 or 14.04, randomly selected) VMs in a private cloud, connected with an SDN network using Open vSwitch. The reincarnation is stateless, i.e., the new node (e.g., VM1') does not inherit the state of the replaced node (e.g., VM1). The set of new VMs is periodically refreshed to start clean, and the network is reconfigured using OpenFlow when a VM is reincarnated to provide continued access to the application.

Figure 10. Experiment setup for Moving Target Defense (MTD)

Table 3 presents the results: virtual machine restarting and creation times, and Open vSwitch flow injection time. Note that the important factor for system downtime here is the Open vSwitch flow injection time, as VM creation and restart take place periodically to create fresh backup copies and do not affect the downtime.

Table 3: Moving Target Defense (MTD) Measurements
  Measurement                        Time
  VM restarting time                 ~7 s
  VM creation time                   ~11 s
  Open vSwitch flow injection time   ~250 ms
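The flow-injection step can be illustrated with the sketch below, which redirects traffic addressed to the vanished VM toward its reincarnation by installing an Open vSwitch flow via ovs-ofctl. The bridge name, port number, and addresses are assumptions, and the project's actual prototype (mayflies.py) may implement this step differently.

```python
# Illustrative sketch (assumptions throughout: bridge name, ports, addresses)
# of redirecting traffic from the vanished VM to its reincarnation by injecting
# an OpenFlow rule into Open vSwitch. This is not the project's mayflies.py code.

import subprocess

BRIDGE = "br-int"                 # assumed OVS bridge
OLD_VM_IP, NEW_VM_IP = "10.0.0.11", "10.0.0.12"
NEW_VM_MAC = "fa:16:3e:00:00:12"
NEW_VM_PORT = 7                   # assumed OVS port of the new VM's interface

def redirect_to_new_vm():
    # Rewrite destination MAC/IP of packets addressed to the old VM and send
    # them out of the new VM's port; high priority so it wins over older flows.
    flow = (f"priority=100,ip,nw_dst={OLD_VM_IP},"
            f"actions=mod_dl_dst:{NEW_VM_MAC},mod_nw_dst:{NEW_VM_IP},"
            f"output:{NEW_VM_PORT}")
    subprocess.run(["ovs-ofctl", "add-flow", BRIDGE, flow], check=True)

if __name__ == "__main__":
    redirect_to_new_vm()
```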
1.4.3 Data Provenance and Leakage Detection in Untrusted Cloud

Figure 11. EHR dissemination in cloud (created by Dr. Leon Li, NGC)

Based on the architecture framework illustrated in Figure 11, we performed several experiments to evaluate the performance overhead of our approach. They are included in the IEEE paper "Privacy-preserving Data Dissemination in Untrusted Cloud" [49], written in collaboration with Donald Steiner, Leon Li, and Jason Kobes as co-authors.

Experimental setup:
- Hardware: Intel Core i7 CPU 860 @ 2.8 GHz x8, 8 GB DRAM
- OS: Linux Ubuntu 14.04.5, kernel 3.13.0-107-generic, 64-bit
- Browser: Mozilla Firefox for Ubuntu, ver. 50.1.0

We use an Active Bundle which, in addition to tamper-resistance, supports detection of the client browser's cryptographic capabilities and authentication method. A local request for the patient's contact information is sent by the doctor's service to an Active Bundle, which represents an EHR and runs on a Purdue University server (waxedprune.cs.purdue.edu:3000). We use Active Bundles with access control policies of similar complexity. The data request is issued from a service running on the same host as the Active Bundle; thus, we measure the RTT for a local data request and exclude network delays that could affect the measurements.

Figure 12. AB performance overhead with browser's crypto capabilities on/off

1.4.4 Encrypted Search over Encrypted Data

We deployed a database of Active Bundles in encrypted form. The database contains extra attributes used for indexing an AB, i.e., a short abstract of the AB and its keywords. In our case, diagnosis and age can be used as extra attributes; they are stored in encrypted form. An ID maps each record to its corresponding Active Bundle, i.e., to the Electronic Health Record of a given patient. As can be seen from Figure 13, the user's search query is converted to an SQL query, and then the SQL query is converted into encrypted form; user-defined functions are used for that. The CryptDB [44] open-source engine is able to execute a set of SQL queries operating on encrypted data.

Table 4. Encrypted Database of Active Bundles, Table 'EHR_DB'
  ID    Diagnosis            Age
  001   Concussion           35
  002   Insomnia             30
  003   Cold-Sore Herpes-1   40

Figure 13. Encrypted Search over Encrypted Database

Query example: SELECT ID FROM EHR_DB WHERE Age BETWEEN 35 AND 40;
Converted query: SELECT c1 FROM Alias1 WHERE ESRCH ( Enc(age), Enc(35, 40) );

The result is {001, 003}; it is received in plaintext form at the client's side, but it remains encrypted on the cloud provider's side and is thus protected against curious or malicious cloud administrators.

1.5 Distinctive Attributes, Advantages and Discriminators

The distinctive attributes of the cloud security/resiliency approach we proposed in this project are the following:
- Automated, self-healing cloud services: Existing cloud enterprise systems lack robust mechanisms to monitor compliance of services with security and performance policies under changing contexts, and to ensure uninterrupted operation in case of failures. The proposed work will demonstrate that it is possible to enforce security and performance requirements of cloud enterprise systems even in the presence of anomalous behavior/attacks and failures of services. The resiliency and self-healing will be accomplished through automated reconfiguration, migration, and restoration of services with software-defined networking. Our approach will be complementary to the functionality provided by log management tools such as Splunk, in that it will develop models that accurately analyze the log data gathered by such tools to immediately detect deviations from normal behavior and quickly respond to anomalous behavior, in order to provide increased automation of threat detection as well as resiliency. This work will contribute to IRADs on cyber resilient systems and enterprise resiliency.
- Agile and resilient virtualization-based systems: We will design and implement the resiliency framework on top of a cloud framework with special emphasis on time (as low as a minute) and space diversification and randomization across heterogeneous cloud platforms (i.e., OSs, hypervisors), while proactively monitoring (i.e., via virtual introspection) the nodes. We abstract the system runtime from the virtual machine (VM) instance to formally reason about its correct behavior. This abstraction allows the framework to provide MTD capabilities for all types of systems, regardless of architecture or communication model (i.e., asynchronous or synchronous), on all kinds of cloud platforms (e.g., OpenStack). The developed prototype, including modules for service monitoring, trust management, and dynamic migration/restoration, can be easily integrated into NGC cybersecurity software. The modular architecture and use of standard software in the monitoring framework allow for an easy plugin to IRAD software. The resiliency work will allow identification of NGC clients' requirements for building capabilities in prototypes for the Air Force Research Lab (AFRL), DoD, SSA, IRS, and the State Department. We plan to work closely with Daniel Goodwin of NGC on BAA proposals.
- Policy-based data access authorization, provenance and leakage detection: Our secure data dissemination model, based on Active Bundles, ensures that services can get access only to those data items for which they are authorized. In addition to policy-based access control, we support trust-based and context-based data dissemination. The trust level of services is continuously monitored and determined. In addition, we check the cryptographic capabilities of the browser used by a client that tries to access sensitive data. Data dissemination takes into consideration how secure the client's browser is, how secure the client's network is, and how secure the client authentication method is (e.g., password-based vs. hardware-based or fingerprint authentication). We implemented a data leakage detection mechanism, so that if an authorized service leaks data to an unauthorized one, the leak is detected by the Central Monitor and reported to the data owner. Data provenance tracking allows greater, fine-grain control over data access control decisions and enables limiting data leaks.
1.6 Tangible Assets Created by Project

The tangible assets created by this project include the active bundle software, a tutorial describing the use of the developed software, and demos for Secure Data Dissemination/Provenance and our Moving Target Defense (MTD) technique.

Moving Target Defense (MTD) Solution:
- Demo: https://www.dropbox.com/s/fqjh75su0p908ic/NGCRC-2017-Bhargava-DEMO2.mp4?dl=0
- Code: https://www.dropbox.com/s/frsflh0xbhewp4u/mayflies.py?dl=0

Secure Data Dissemination/Provenance:
- Demo: https://www.dropbox.com/s/4wg3vuv52j4s16v/NGCRC-2017-Bhargava-Demo1.wmv?dl=0
- Code for the Northrop Grumman 'WAXEDPRUNE' extended prototype: https://github.com/Denis-Ulybysh/absoa17

In addition, we have the following related material available:
- Tutorial explaining the use of the active bundle software: https://www.cs.purdue.edu/homes/bb/absoa/SymposiumApr2016/TechFest2016_AB_Tutorial_Release.pdf
- Code repository for the "WAXEDPRUNE" prototype: http://github.com/Denis-Ulybysh/absoa16
- Demo videos demonstrating the privacy-preserving data dissemination capabilities of the system:
  o Video 1: https://www.dropbox.com/s/30scw1srqsmyq6d/BhargavaTeam_DemoVideo_Spring16.wmv?dl=0
  o Video 2: https://www.youtube.com/watch?v=SIUupq5V6zk&feature=youtu.be

We have also created the following demo videos to demonstrate the capabilities of the adaptable service compositions and policy enforcement system for cloud enterprise agility:
- Introduction to the user interface and a composite web service for ticket reservations: https://www.youtube.com/watch?v=nucRjKZBtnM
- Composite trust algorithms for cloud-based services: https://www.youtube.com/watch?v=6uHEfoxjEgs
- Trust update mechanisms: https://www.youtube.com/watch?v=xnm0-MzGBzw
- Enforcing client's QoS and security policies: https://www.youtube.com/watch?v=ePtAM0N7jdY
- Service redirection policy: https://www.youtube.com/watch?v=e8xkCgZcQ1s
- Adaptable service composition: https://dl.dropboxusercontent.com/u/79651021/Adaptable_Service_Composition_Bhargava.wmv

1.7 Outreach Activities and Conferences

We have disseminated the results of the project through presentations at various conferences. The following publications resulted partially or wholly from the research work involved in this project:
- M. Villarreal-Vasquez, B. Bhargava, P. Angin, N. Ahmed, D. Goodwin, K. Brin and J. Kobes. "An MTD-based Self-Adaptive Resilience Approach for Cloud Systems." IEEE CLOUD 2017.
- D. Ulybyshev, B. Bhargava, M. Villarreal-Vasquez, D. Steiner, L. Li, J. Kobes, H. Halpin, R. Ranchal and A. Oqab-Alsalem. "Privacy-Preserving Data Dissemination in Untrusted Cloud." IEEE CLOUD 2017.
- M. Villarreal-Vasquez, P. Angin, N. Ahmed and B. Bhargava. "An MTD-based Self-Adaptive Resilience Approach for Cloud Systems." 17th CERIAS Security Symposium at Purdue, 2016.
- M. Azarmi. "End-to-End Security in Service-Oriented Architecture." Ph.D. Thesis, Purdue University, April 2016.
- D. Ulybyshev, B. Bhargava, L. Li, J. Kobes, D. Steiner, H. Halpin, B. An, M. Villarreal, and R. Ranchal. "Authentication of User's Device and Browser for Data Access in Untrusted Cloud." 17th CERIAS Security Symposium at Purdue, 2016.
- P. Angin, B. Bhargava, R. Ranchal. "Tamper-resistant Autonomous Agents-based Mobile-Cloud Computing." IEEE/IFIP Network Operations and Management Symposium (NOMS'16), April 2016.
- R. Ranchal, B. Bhargava, R. Fernando, H. Lei, Z. Jin. "Privacy Preserving Access Control in Service-Oriented Architecture." IEEE International Conference on Web Services (ICWS'16), June 2016.
- R. Fernando, R. Ranchal, B. An, L. ben Othmane, B. Bhargava. "Consumer Oriented Privacy Preserving Access Control for Electronic Health Records in the Cloud." IEEE International Conference on Cloud Computing (IEEE CLOUD'16), June 2016.
- N. Ahmed and B. Bhargava. "Disruption-Resilient Publish and Subscribe." 6th International Conference on Cloud Computing and Services Science (CLOSER'16), April 2016. (Best poster award)
- B. Bhargava, P. Angin, R. Ranchal, S. Lingayat. "A Distributed Monitoring and Reconfiguration Approach for Adaptive Network Computing." 6th International Workshop on Dependable Network Computing and Mobile Systems (DNCMS), in conjunction with SRDS'15. (Best paper award)

1.8 Intellectual Property Accomplishments

None.

General Comments and Suggestions for Next Year

The TechFest is a great place to learn and develop ideas; we had the opportunity to participate in 2015. Collaboration and visits with other NGCRC universities are a great step forward for the future. The coordinators at NGC are very dedicated.

References

1. Splunk. http://www.splunk.com/
2. Graylog. http://www.graylog.org/
3. Kibana. https://www.elastic.co/products/kibana
4. H. Shacham, M. Page, B. Pfaff, E.J. Goh, N. Modadugu and D. Boneh, "On the effectiveness of address-space randomization." In Proceedings of the 11th ACM Conference on Computer and Communications Security, pp. 298-307, 2004.
5. J. Xu, Z. Kalbarczyk and R.K. Iyer, "Transparent runtime randomization for security." In Proceedings of the 22nd International Symposium on Reliable Distributed Systems, pp. 260-269, 2003.
6. C. Warrender, S. Forrest and B. Pearlmutter, "Detecting intrusions using system calls: Alternative data models." In Proceedings of the 1999 IEEE Symposium on Security and Privacy, pp. 133-145, 1999.
7. Y.-Y. Chen, P.A. Jamkhedkar and R.B. Lee, "A software-hardware architecture for self-protecting data." In Proceedings of the ACM Conference on Computer and Communications Security, pp. 14-27, 2012.
8. F. Li, B. Luo, P. Liu, D. Lee and C.-H. Chu, "Enforcing secure and privacy-preserving information brokering in distributed information sharing." IEEE Transactions on Information Forensics and Security, vol. 8, no. 6, pp. 888-900, 2013.
9. S. Pearson and M.C. Mont, "Sticky policies: An approach for managing privacy across multiple parties." IEEE Computer, no. 9, pp. 60-68, 2011.
10. B. Bhargava, P. Angin, R. Ranchal and S. Lingayat, "A Distributed Monitoring and Reconfiguration Approach for Adaptive Network Computing." 6th International Workshop on Dependable Network Computing and Mobile Systems (DNCMS), Montreal, 2015.
11. M. Azarmi, "End-to-End Security in Service-Oriented Architecture." Ph.D. Thesis, Purdue University, April 2016.
12. OpenStack. https://www.openstack.org/
13. S. Norman, J. Chase, D. Goodwin, B. Freeman, V. Boyle and R. Eckman, "A Condensed Approach to the Cyber Resilient Design Space." INSIGHT, 19(2), pp. 43-46, 2016. (NGC IRAD)
14. Getting Started with OpenStack Monitoring. http://www.stratoscale.com/blog/openstack/getting-started-with-openstack-monitoring/
15. OpenStack Telemetry (Ceilometer). https://wiki.openstack.org/wiki/Telemetry
16. OpenStack Monasca. https://wiki.openstack.org/wiki/Monasca
17. OpenStack Heat. https://wiki.openstack.org/wiki/Heat
18. N. Ahmed and B. Bhargava, "Mayflies: A moving target defense framework for distributed systems." In Proceedings of the 2016 ACM Workshop on Moving Target Defense, pp. 59-64, 2016.
19. M. Villarreal-Vasquez, B. Bhargava, P. Angin, N. Ahmed, D. Goodwin, K. Brin and J. Kobes, "An MTD-based Self-Adaptive Resilience Approach for Cloud Systems." IEEE CLOUD 2017.
20. N. Ahmed and B. Bhargava, "Towards Targeted Intrusion Detection Deployments in Cloud Computing." International Journal of Next-Generation Computing, vol. 6, no. 2, July 2015.
21. N. Ahmed and B. Bhargava, "From Byzantine Fault-Tolerance to Fault-Avoidance: An Architectural Transformation to Attack and Failure Resilience." In submission to IEEE Transactions on Cloud Computing.
22. N. Ahmed, "Design, Implementation and Experiments for Moving Target Defense." Ph.D. Thesis, Purdue University, 2016.
23. N. Ahmed and B. Bhargava, "Byzantine Fault Resilient Publish and Subscribe: A Moving Target Defense Approach." In submission.
24. F.B. Schneider, "Implementing fault-tolerant services using the state machine approach: A tutorial." ACM Computing Surveys (CSUR), 22(4), pp. 299-319, 1990.
25. P.K. Manadhata and J.M. Wing, "An Attack Surface Metric." IEEE Transactions on Software Engineering, 37, pp. 371-386, 2011.
26. A. Avizienis and L. Chen, "On the Implementation of N-Version Programming for Software Fault-Tolerance During Program Execution." In Proceedings of the International Computer Software and Applications Conference, 1977.
27. J. Knight and N. Leveson, "An Experimental Evaluation of the Assumption of Independence in Multi-version Programming." IEEE Transactions on Software Engineering, vol. 12, no. 1, Jan. 1986.
28. B. Cox, D. Evans, A. Fillipi, J. Rowanhill and W. Hu, "N-variant Systems: A Secretless Framework for Security Through Diversity." Defense Technical Information Center, 2006.
29. L. Chen and A. Avizienis, "N-Version Programming: A Fault Tolerance Approach to Reliability of Software Operation." In Proceedings of the 8th International Symposium on Fault-Tolerant Computing, 1978.
30. M. Chew and D. Song, "Mitigating Buffer Overflows by Operating System Randomization." Technical Report CMU-CS-02-197, December 2002.
31. C. Clark, K. Fraser, S. Hand, J.G. Hansen, E. Jul, C. Limpach and A. Warfield, "Live migration of virtual machines." In Proceedings of the 2nd Conference on Symposium on Networked Systems Design & Implementation, pp. 273-286, 2005.
32. E. Zayas, "Attacking the process migration bottleneck." ACM SIGOPS Operating Systems Review, vol. 21, no. 5, pp. 13-24, 1987.
33. M. Kozuch and M. Satyanarayanan, "Internet suspend/resume." In Proceedings of the Fourth IEEE Workshop on Mobile Computing Systems and Applications, pp. 40-46, 2002.
34. D. Ulybyshev, B. Bhargava, L. Li, J. Kobes, D. Steiner, H. Halpin, B. An, M. Villarreal-Vasquez and R. Ranchal, "Authentication of User's Device and Browser for Data Access in Untrusted Cloud." 17th CERIAS Security Symposium at Purdue, 2016.
35. S. Ram et al., "A New Perspective on Semantics of Data Provenance." In Proceedings of the 1st International Workshop on the Role of Semantic Web in Provenance Management (SWPM), Oct. 2009.
36. Y.L. Simmhan, B. Plale and D. Gannon, "A survey of data provenance in e-science." SIGMOD Record, 34(3), pp. 31-36, 2005.
37. J. Wang, D. Crawl, S. Purawat, M. Nguyen and I. Altintas, "Big data provenance: Challenges, state of the art and opportunities." In IEEE Big Data, pp. 2509-2516, 2015.
38. Apache Spark MLlib. http://spark.apache.org/mllib/
39. Y.L. Simmhan, B. Plale and D. Gannon, "A survey of data provenance techniques." Technical Report IUB-CS-TR618. https://www.cs.indiana.edu/ftp/techreports/TR618.pdf
40. R. Ranchal, D. Ulybyshev, P. Angin and B. Bhargava, "PD3: Policy-based Distributed Data Dissemination." 16th CERIAS Security Symposium at Purdue, 2015. (Best poster award)
41. R. Ranchal, "Cross-Domain Data Dissemination and Policy Enforcement." Ph.D. Thesis, Purdue University, 2015.
42. OpenStack Swift. http://docs.openstack.org/developer/swift/
43. Apache Lucene. https://lucene.apache.org/core/
44. R.A. Popa, C. Redfield, N. Zeldovich and H. Balakrishnan, "CryptDB: Protecting confidentiality with encrypted query processing." In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, pp. 85-100, ACM, 2011.
45. K. Kirkpatrick, "Software-defined networking." Communications of the ACM, vol. 56, no. 9, pp. 16-19, Sep. 2013.
46. M.H. Marghny and A.I. Taloba, "Outlier detection using improved genetic k-means." International Journal of Computer Applications, vol. 28, no. 11, pp. 33-36, August 2011.
47. L.M. Manevitz and M. Yousef, "One-class SVMs for document classification." Journal of Machine Learning Research, vol. 2, pp. 139-154, Mar. 2002.
48. T. Garfinkel and M. Rosenblum, "A virtual machine introspection based architecture for intrusion detection." In Proceedings of the Network and Distributed Systems Security Symposium, pp. 191-206, 2003.
49. D. Ulybyshev, B. Bhargava, M. Villarreal-Vasquez, D. Steiner, L. Li, J. Kobes, H. Halpin, R. Ranchal and A. Oqab-Alsalem, "Privacy-Preserving Data Dissemination in Untrusted Cloud." IEEE CLOUD 2017.
50. NG 'WAXEDPRUNE' prototype. https://github.com/Denis-Ulybysh/absoa17
51. G.S. Kc, A.D. Keromytis and V. Prevelakis, "Countering code-injection attacks with instruction-set randomization." In Proceedings of the 10th ACM Conference on Computer and Communications Security, pp. 272-280, 2003.
