Development and research of models of organization distributed cloud computing based on the software defined infrastructure


Procedia Computer Science 103 (2017) 569-576. Available online at www.sciencedirect.com

XIIth International Symposium «Intelligent Systems», INTELS'16, 5-7 October 2016, Moscow, Russia

I. Bolodurina, D. Parfenov*

Orenburg State University, 13, prospekt Pobedy, Orenburg 460018, Russia

Abstract

Data centers are widely used for the placement of highly loaded applications and solutions used to process large amounts of data (Big Data). The study developed a model of cloud applications and services. The novelty of this model is the use of methods of intellectual analysis and prediction of dynamic characteristics in the study of multicomponent systems. In order to make the models flexible, they are built using agent-oriented programming. The many objects of the data center are modeled as agents. Each node in this case is a platform of agents, which controls other agents. Agents act as data resources for cloud applications and cloud services. A distinctive feature of the proposed model is a set of individual parameters needed to perform tasks. Its formalization is reflected in the quality-of-service compliance requirements as part of the resource characteristics.

© 2017 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the scientific committee of the XIIth International Symposium "Intelligent Systems".

Keywords: cloud computing; computing resources; software-defined networks; virtual data center; software-defined infrastructure; software-defined storage

1. Introduction

In recent years, cloud computing has become a popular approach, used to provide access to the
services and applications that support business processes [1]. The existing approaches to deploying cloud computing platforms have many advantages, such as reliability and quality of service (QoS). At the same time, there are a number of limitations, on the side of both consumers and providers of cloud services. For the consumer, cloud resources appear endless in terms of scalability. However, if we consider the economic aspect of their consumption, then their ability to scale narrows significantly. On the side of the cloud service provider, the set of services and the computing power are limited. In order to maximize the economic return of cloud services by increasing the number of users, providers have to apply policies for the flexible usage of allocated resources while minimizing operation costs. Thus, in today's virtual and physical data centers, the management of resources and cloud applications is an important issue, because it has a direct impact on operation costs.

* Corresponding author. E-mail address: parfenovdi@mail.ru
1877-0509 © 2017 The Authors. Published by Elsevier B.V. doi:10.1016/j.procs.2017.01.064

In the past few years, large IT corporations (such as Amazon, Google, IBM, Microsoft, Oracle, etc.)
have been developing a renewed approach to the management of resources and objects in data centers used for cloud applications. The main trend in this area is the optimization of data center resource consumption. In our recent research we developed approaches to optimizing the storage of cloud applications' data and to improving the efficiency of access to cloud resources [3,4]. However, they do not solve the problem of assigning cloud application instances in a cloud environment. In addition, a review of research in this field has shown that the problem of optimizing resource selection for specific types of cloud applications is insufficiently investigated [5,6,7].

With virtual infrastructure becoming the primary model of consumption today, the ability to create infrastructures on the fly is an absolute necessity for providers. In addition, overlaying these virtual infrastructures on the same physical infrastructure to satisfy diverse service level agreement (SLA) requirements at the same time is a challenging task. On the one hand, this requires intelligent decisions and placement while provisioning (i.e., virtual-to-physical mapping). On the other hand, continuous change in the workload and the physical infrastructure requires the infrastructure to mutate continuously, adapting constantly to honor the SLA guarantees. Software-defined infrastructure (SDI) has completely changed the way infrastructure is organized and managed, offering more simplicity, flexibility, and monetary benefits compared to the traditional view of cloud computing infrastructure.

The traditional problem of modern data centers is to ensure QoS. In recent years, two ways of meeting the service level agreement have given the best results in enterprise data centers: virtualization of physical servers and virtualization of the network. But these methods do not always work well. We propose an approach to solve all of these problems. Our approach is based on
the dynamic management of cloud application resources. This paradigm is supported in the software-defined infrastructure of data centers. But this approach has its own problems. The main bottleneck between the infrastructure levels is the storage systems. With the rapid growth of data centers and the unprecedented increase in storage demands, the traditional storage control techniques are considered unsuitable for dealing with this large volume of data efficiently. Existing approaches to data storage virtualization and data placement algorithms either do not consider the mapping of all resource types [5,8] or can only be used for a fixed data center network topology [6,7]. Software-defined storage (SDS) addresses this issue by abstracting the storage control operations from the storage devices and placing them inside a centralized controller in the software layer. But inadequate resource allocation, the lack of I/O performance prediction and insufficient isolation affect storage performance in the cloud storage environment. SDS is an effective approach to guaranteeing quality of service in data centers. However, the lack of intelligence, robustness and self-adjustment heavily hinders the application and promotion of SDS.

This paper focuses on the QoS resource scheduling problem in the virtual data center (VDC). To understand all of these problems, we have built a model of the VDC and a generalized model of the cloud application (Section 2). Section 3 describes queue scheduling algorithms in cloud storage. We present the results of the efficiency experiment in Section 4 and the conclusion in Section 5.

2. Modeling

To understand the operation principles of a cloud application, we need to define its place in the infrastructure of the virtual data center. The virtual data center is a dynamic object that changes in time t; its state can be formalized as a directed graph of the following form:

$$VirtDC = (Node(t), Connect(t), CloudAppl(t)), \tag{1}$$

where the vertices $Node(t) = \{Node_i(t)\}_{i=\overline{1,N}}$ are active elements of the VDC infrastructure (computing nodes, storage systems and others); the directed edges $Connect(t) = \{Connect_i(t)\}_{i=\overline{1,V}}$ are active user connections to the cloud applications; $CloudAppl(t) = \{CloudAppl_i(t)\}_{i=\overline{1,C}}$ are active instances of the cloud applications launched on the virtual resources.

The major feature of cloud applications is that users have access to them and to their services but do not know anything about their actual location. Most often, users know only the address of the aggregation node and the application name. The cloud system automatically selects the optimal virtual machine on which the request will be processed.

2.1 Generalized model of cloud application

Before we talk about resource allocation for cloud applications, it is necessary to determine their structure, basic parameters and the key characteristics of their operation that affect the efficiency of their usage. For this purpose, we have developed a generalized model of the cloud application. It is a multilayer structure formalized in the form of graphs describing the connections between individual elements. The model can be represented as three basic slices detailing the connections of specific objects of the VDC infrastructure: applications, related services and allocated resources.

The cloud application is a weighted directed acyclic graph of data dependencies:

$$CloudAppl = (G, V), \tag{2}$$

Its vertices G are tasks that get information from the sources or process it in accordance with the users' requests; its directed edges V are dependencies of the tasks on the data sources between the corresponding vertices.

Each vertex $g \in G$ is characterized by the following tuple:

$$g = (Res, NAppl, Utime, SchemeTask), \tag{3}$$

where Res are the resource requirements; NAppl is the number of application instances; Utime is the estimation of the
user's request execution time; SchemeTask are the communication schemes of data transmission between sources and computing nodes.

Each directed edge $v \in V$ connects the application with the required data source. It is characterized by the following tuple:

$$v = (u, v, Tdata, Mdata, Fdata, Vdata, Qdata), \tag{4}$$

where u and v are the linked vertices; Tdata is the type of transmitted data; Mdata is the access method to the information source (REST, JSON and others); Fdata is the physical type of the accessed object (file in the storage system, local file, distributed database, data service and so on); Vdata is the estimated traffic volume of the accessed data (in Mb); Qdata is the set of QoS (quality of service) requirements.

The originality of the model lies in the fact that, for each application, a consolidated assessment of its work with data sources is calculated. This makes it possible to predict the performance of the whole cloud system.

2.2 Model of cloud service

As mentioned earlier, the cloud service is one of the key slices in the generalized model of cloud application. A cloud service is an autonomous data source for the application, for which it acts as a consolidated data handler. Generally, a cloud service is highly specialized and designed to perform a limited set of functions. The advantage of connecting a cloud application to a service is isolated data processing, in contrast to direct access to the raw data when the cloud application does not use a service. The usage of services reduces the execution times of user requests.

Like the cloud application, a cloud service is formalized as a directed graph of data dependencies. The difference lies in the fact that, from the user's point of view, a cloud service is a closed system. A cloud service can be formalized as a tuple:

$$CloudServ = (AgrIP, NameServ, Format), \tag{5}$$

where AgrIP is the address of the aggregation computing node; NameServ is the service name; Format is the required format of output data. The aggregator of a
service selects the optimal virtual machines on which it is executed. In addition, all its applications are distributed between predefined virtual machines or physical servers. New instances are scaled dynamically, depending on the number of incoming requests from cloud applications, users or other services.

2.3 Model of cloud resource

To describe the assignment of cloud applications and services in the virtual data center infrastructure, we have also implemented a model of the cloud resource. A cloud resource is an object of the data center (DC) that describes the behavior and characteristics of an individual infrastructure element, depending on its current state and parameters. The objects of the virtual data center are disk arrays, including detached storage devices, virtual machines, software-defined storages, and databases of various kinds (SQL/NoSQL). In addition, each cloud service or application imposes requirements on the number of computing cores, the RAM and disk sizes, and the presence of special libraries on the physical or virtual nodes used to launch their execution environments.

Each cloud resource can be formalized as follows:

$$CloudRes = (TRes, Param, State, Core, Rmem, Hmem, Lib), \tag{6}$$

where TRes is the type of resource; Param is the set of parameters; State is the state of the resource; Core is the number of computing cores; Rmem is the size of RAM; Hmem is the size of disk; Lib are the library requirements.

The distinctive feature of the proposed model is the universality of the resource, which makes it possible to study it both from the user's point of view (as a closed system) and from the point of view of the software-defined infrastructure of the virtual DC (as an open system). The novelty of the model is the simultaneous description of the data placement, the associated cloud applications and the state of the virtual environment, taking the network topology into account.

2.4 Model of software-defined storage

We developed a model of the software-defined storage, which details the resource model of the virtual data center.
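To make the resource tuple (6) from Section 2.3 concrete, the following is a minimal Python sketch, not the authors' implementation. The field names mirror the tuple; the `Requirements` class, the `can_host` method, the `candidates` helper, the `"online"` state value for a generic resource and the megabyte units are assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Requirements:
    """Hypothetical demand of one cloud application or service instance."""
    cores: int
    ram_mb: int
    disk_mb: int
    libs: frozenset = frozenset()


@dataclass
class CloudRes:
    """Cloud resource tuple (6): (TRes, Param, State, Core, Rmem, Hmem, Lib)."""
    tres: str                      # TRes: type of resource, e.g. "vm" (assumed label)
    state: str                     # State: assumed to be "online" or "offline"
    core: int                      # Core: number of computing cores
    rmem: int                      # Rmem: RAM size (MB assumed)
    hmem: int                      # Hmem: disk size (MB assumed)
    lib: frozenset = frozenset()   # Lib: libraries available on the node
    param: dict = field(default_factory=dict)  # Param: free-form parameters

    def can_host(self, req: Requirements) -> bool:
        """Check cores, RAM, disk and libraries against the request."""
        return (
            self.state == "online"
            and self.core >= req.cores
            and self.rmem >= req.ram_mb
            and self.hmem >= req.disk_mb
            and req.libs <= self.lib  # every required library is present
        )


def candidates(resources, req):
    """Return the resources able to host the request, in input order."""
    return [r for r in resources if r.can_host(req)]


pool = [
    CloudRes("vm", "online", core=4, rmem=8192, hmem=100_000,
             lib=frozenset({"libssl"})),
    CloudRes("vm", "offline", core=16, rmem=65536, hmem=500_000,
             lib=frozenset({"libssl"})),
    CloudRes("vm", "online", core=2, rmem=4096, hmem=50_000),
]
req = Requirements(cores=4, ram_mb=4096, disk_mb=20_000,
                   libs=frozenset({"libssl"}))
print([pool.index(r) for r in candidates(pool, req)])
```

Only the first resource passes the check here: the second is offline and the third lacks both the cores and the library. A real placement step would rank the surviving candidates rather than just filter them.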
It is represented in the form of a directed multigraph whose vertices are the VDC elements responsible for the placement of applications' data (e.g. virtual disk arrays, databases and so on). Its arcs represent data communication with the cloud applications and services. The arc weights are used in the consolidated assessment, which consists of the characteristics of the data demand as well as the utilization of specific devices at both the physical and the virtual level:

$$Stg = (StgNode, Data), \tag{7}$$

where StgNode are the elements of the virtual data center responsible for the placement of cloud application data; Data is the data communication with cloud applications and services.

Each element of the software-defined storage has individual characteristics:

$$Stg_{ki} = (MaxV_{ki}, P_{ki}^{stg}, Vol_{ki}(t), R_{ki}(t), W_{ki}(t), s_{ki}^{stg}(t)), \tag{8}$$

where $MaxV_{ki} \in \mathbb{N}$ is the maximum storage capacity in Mb; $P_{ki}^{stg} = \{p_{kij}^{stg}\}_j$ is the set of network ports; $Vol_{ki}(t) \in \mathbb{N} \cup \{0\}$ is the available storage capacity in Mb; $R_{ki}(t)$ and $W_{ki}(t)$ are the read and write speeds; $s_{ki}^{stg}(t) \in \{\text{"online"}, \text{"offline"}\}$ is its state.

The generalized model of cloud-based applications serves as the basis for developing algorithms that plan the placement of cloud services and applications on the virtual machines and physical servers of the virtual data center infrastructure, as well as algorithms for scheduling data access by cloud services and applications; this is an original approach.

3. Algorithmic implementation

Computing and networking resources provide the bulk of the support for cloud applications. But current resource control systems are not capable of controlling heterogeneous resources that include computing and networking resources in combination with other resources. In addition, current control systems are not capable of realizing the flexibility, scalability and economic advantages that would be inherent in the integrated control of converged heterogeneous resources in
the virtual data center infrastructure. To correct this disadvantage, we have implemented an algorithm to optimize the workload and performance of cloud applications and services in the virtual data center. The algorithm makes it possible to provide QoS and is based on the ant colony method and heuristic data mining.

The initial optimization problem can be divided into four subproblems of preparing plans of access to resources:

1. Mapping cloud applications and services to virtual machines.
2. Mapping virtual machines to physical computational nodes.
3. Mapping virtual storages to data storages.
4. Mapping virtual channels to physical network channels.

We solve subproblems 1-3 with an algorithm based on the ant colony optimization approach. For subproblem 4 we use a greedy algorithm. The algorithm's scheme is:

1. Build the graph R. The graph form is chosen so that a path in the graph determines the mapping of cloud applications and services to virtual machines.
2. Build the graph G in the graph R. The graph form is chosen so that a path in the graph determines the mapping of virtual machines and virtual storages.
3. Build paths $B_i$ in the graph G. Each path is built according to the restrictions on the maximum computational node performance and the maximum data storage memory volume.
4. For each $B_i$, map virtual channels to physical ones, given that virtual machines and virtual storages are mapped according to path $B_i$.
5. Calculate the target function $F_i$ for each path $B_i$.
6. Update the pheromone values on the arcs of the graph A depending on the target function values $F_i$.
7. If the stop condition is not satisfied, return to the path-building stage.

As a result, the algorithm generates a table of pheromones, which is used for routing data flows. The values $R^k_{nd}$ of the elements of the routing table are the result of exponentiation and normalization of the columns of the pheromone table with elements $\tau_{nd}$:

$$R^k_{nd} = (\tau_{nd})^{\varepsilon}, \qquad R^k_{nd} = \frac{R^k_{nd}}{\sum_{i \in N_k} R^k_{id}}. \tag{9}$$

We describe the ant algorithm in more
detail.

Step 1. Select a destination d for the ant. Make the list of visited vertices empty: $T = \emptyset$.

Step 2. Set the current vertex to the ant's initial vertex: $k = s$.

Step 3. Add the initial vertex to the list of visited vertices: $T = T \cup \{k\}$.

Step 4. While $k \neq d$, perform the following actions in a loop:

1) select the next vertex among the neighboring, not yet visited vertices according to the probabilities calculated using the formula

$$p_{nd} = \frac{\tau_{nd} + \alpha l_n}{1 + \alpha (|N_k| - 1)}, \quad n \in N_k \setminus T, \tag{10}$$

where $\alpha \in [0; 1]$ is a value showing the level of impact of the current traffic condition on the ant's choice of the next vertex, and $l_n$ is a value characterizing the level of utilization of the arc;

2) if a selection can be made, add the time spent on the transition to the ant's total travel time, make the selected vertex the current one and add it to the list of visited vertices;

3) if the transition is not possible or the ant's lifetime exceeds the TTL, exit the algorithm.

Step 5. Send the ant back along the path traversed in the previous steps. Set $k = d$.

Step 6. While $k \neq s$, perform the following: for each sub-path from the current vertex, update a) the statistics table; b) the routing table; c) the table of pheromones.

Let us explain some details of the algorithm and the way it is invoked. At regular intervals, an ant is sent from each node to a selected destination node in order to find the optimal path. For each source node s, the destination of the ant is selected according to the statistics of the traffic that passes through s, with probability proportional to the volume of traffic previously transmitted from s to d. During its forward travel to the end point (steps 1-4), the ant records the history of visited vertices, which contains pairs: the number of the visited vertex and the transition to this vertex from the previous one. The next vertex is selected in substep 1) of step 4 in accordance with the current level of pheromones of all vertices adjacent to the current node k. The nodes to which the ant can go must not be
previously visited. If the ant has no vertices left to move to and the current vertex is not the destination node, the ant is destroyed. The best value of α for QoS lies in the range [0.2, 0.5]; the middle of this segment was taken for the experiment: $\alpha \approx 0.35$. During the forward journey, a situation may arise in which the ant moves in the wrong direction for a long time. In this case, if the journey time exceeds the TTL threshold, the ant is destroyed. After a "warm-up" of the network, the routing table is filled with the best paths. The optimal route from s to d is selected by finding the greatest value in the j-th column of the routing table for each node, starting from s.

4. Experimental

For the experiment we used a cloud system. It includes OpenFlow switches (2 HP 3500yl, Netgear GSM7200), computing nodes (32 GB RAM, cores), a server (32 GB RAM, cores) with an OpenFlow controller and a server (32 GB RAM, cores) for the monitoring function. A fat-tree topology with three levels was selected. The routers are interconnected by links with a speed of 1000 Mbit/s, and the computers are connected to a third-level router via the second-level network connections at 100 Mbit/s.

We prepared two research scenarios that differ in the number of physical hosts and the number of types of cloud applications. The first scenario used 10 hosts connected to the types of cloud applications; the second scenario used 35 hosts connected to types of cloud applications. A task flow is generated with a Poisson distribution and intensity $\lambda$. The flow consists of two types of tasks: the first, with $rb_i \in [1, 10]$, is All-to-All; the second, with $rb_i \in [1, 20]$, is One-to-All. Both types of tasks use only 20% of all hosts in the cloud. The probability of selecting each type of task is 0.5. It should be noted that if more than 10 hosts run processes of the first type (and correspondingly for the second type), QoS requirements are guaranteed to be violated. During the experiment, we measured the percentage of QoS violations depending on the intensity of
the incoming application flow. Graphs for the first and second scenarios are shown in Fig. 1,a and Fig. 1,b, respectively. Each point corresponds to the indicator averaged over 20 randomly generated task flows.

Fig. 1. Dependence of the percentage of QoS violations on flow intensity.

In both cases, violations of the bandwidth and latency requirements grow as the flow intensity increases, and in the second scenario the values are higher. The designed algorithms are effective, but a comparison with existing analogues is still required; it is planned to be carried out on an emulated network.

5. Conclusions

Our research has resulted in a generalized model of the cloud application in the virtual data center. The model has been developed considering the requirements for computational resources. These models reflect all the basic characteristics of modern cloud applications and services as well as the use of SDN. For data flow routing in SDNs we have developed reactive and proactive methods.

Acknowledgements

The research work was funded by the Russian Foundation for Basic Research, according to the research projects No. 16-37-60086 mol_а_dk and 16-07-01004.

References

1. Bein D, Bein W, Venigella S. Cloud Storage and Online Bin Packing. Proc. of the 5th Intern. Symp. on Intelligent Distributed Computing 2011; p. 63-68.
2. Nagendram S, Lakshmi JV, Rao DV et al. Efficient Resource Scheduling in Data Centers using MRIS. Indian J. of Computer Science and Engineering 2011; 2(5), p. 764-769.
3. Arzuaga E, Kaeli DR. Quantifying load imbalance on virtualized enterprise servers. Proc. of the first joint WOSP/SIPEW international conference on Performance engineering 2010; p. 235-242.
4. Mishra M, Sahoo A. On theory of VM placement: Anomalies in existing methodologies and their mitigation using a novel vector based approach. Cloud Computing (CLOUD), IEEE International Conference 2011; p. 275-282.
5. Korupolu M, Singh A, Bamba B. Coupled placement in modern Data Centers. IEEE Intern. Symp. on Parallel &
Distributed Processing 2009; p. 1-12.
6. Singh A, Korupolu M, Mohapatra D. Server-storage virtualization: integration and load balancing in Data Centers. Proc. of the 2008 ACM/IEEE Conf. on Supercomputing 2008; p. 1-12.
7. Plakunov A, Kostenko V. Data center resource mapping algorithm based on the ant colony optimization. Proc. of Science and Technology Conference (Modern Networking Technologies) (MoNeTeC) 2014; p. 1-6.
8. Darabseh A, Al-Ayyoub M, Jararweh Y, Benkhelifa E, Vouk M, Rindos A. SDStorage: A Software Defined Storage Experimental Framework. Proc. of Cloud Engineering (IC2E) 2015; p. 341-346.
9. Parfenov D, Bolodurina I. Approaches to the effective use of limited computing resources in multimedia applications in the educational institutions. WCSE 2015-IPCE; 2015.
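The next-vertex rule (10) and the routing-table normalization (9) from Section 3 can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors' code: the function names, the dictionary-based tables, the sample values and the exponent value are hypothetical. Since (10) is evaluated only over unvisited neighbors, the sketch also assumes that the resulting weights are renormalized before sampling, which `random.choices` does implicitly.

```python
import random


def transition_weights(tau, load, visited, alpha=0.35):
    """Weights from formula (10) for the unvisited neighbors of node k.

    tau[n]  - pheromone value for going to neighbor n toward destination d
    load[n] - l_n, the value characterizing utilization of the arc to n
    visited - the set T of vertices the ant has already visited
    alpha   - impact of current traffic; 0.35 is the value used in the paper
    """
    n_k = len(tau)  # |N_k|: all neighbors of k, visited or not
    denom = 1.0 + alpha * (n_k - 1)
    return {
        n: (tau[n] + alpha * load[n]) / denom
        for n in tau
        if n not in visited
    }


def choose_next(tau, load, visited, alpha=0.35, rng=random):
    """Pick the next vertex, or None if the ant has nowhere to go."""
    weights = transition_weights(tau, load, visited, alpha)
    if not weights:
        return None  # no unvisited neighbor: the ant is destroyed
    nodes = list(weights)
    # random.choices treats the weights as relative (assumed renormalization)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]


def routing_column(tau, eps=2.0):
    """Formula (9): raise pheromones to the power eps, then normalize."""
    powered = {n: value ** eps for n, value in tau.items()}
    total = sum(powered.values())
    return {n: value / total for n, value in powered.items()}


# Hypothetical node k with three neighbors a, b, c and one destination d.
tau = {"a": 0.5, "b": 0.3, "c": 0.2}
load = {"a": 0.2, "b": 0.8, "c": 1.0}

print(transition_weights(tau, load, visited=set()))
print(choose_next(tau, load, visited={"a"}))
print(routing_column(tau))
```

With exponent values above 1, the normalization in `routing_column` sharpens the preference for the neighbor with the most pheromone, which matches the paper's remark that the route is read off as the greatest value in a routing-table column.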
