Runtime adaptive adjustment remains a difficult problem for Internet-intensive software systems: it requires mapping requirements changes onto architecture units at runtime. Driven by predictive control for SaaS components to induce requirements evolution, runtime architecture change will be
258 B. Wen et al.
realized, and the validity of predictive control can be demonstrated with respect to requirements/architecture evolution [10]. However, that method has not been extended to the SOA level, where architecture evolution is driven by requirements change.
As shown in Fig. 2, we propose a runtime adaptive adjustment scheme that combines the effective predictive control method [10] with the MAPE-K control loop model [11]. The proposed adjustment system is divided into a real-time monitor, an analysis engine, a software architecture adjustment manager, a requirements evolution manager and an Aspect execution engine. First, the real-time monitor acquires QoS values of the service resources and saves them into log records. The QoS values are then passed to the analysis engine, which uses an analysis model based on wavelet transformation to predict the next QoS value from the logs. The software architecture adjustment manager seeks the best runtime design decision under the requirements constraints, according to the predicted and current QoS values. If a feasible runtime model can be found, an Aspect script is generated automatically and the execution engine runs it to complete the model transformation at runtime. Otherwise, the design decision manager identifies the specific improvement points of requirements evolution to induce these changes.
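As an illustration only, the monitor/analyze/plan/execute loop described above can be sketched as follows. Every name here (Monitor, AnalysisEngine, adjust) and the simple moving-average stand-in for the wavelet predictor are hypothetical, not the paper's implementation:

```python
# Hypothetical sketch of the MAPE-K style adjustment loop described above.
# Component names and the predictor are illustrative stand-ins.

class Monitor:
    def __init__(self):
        self.log = []                      # knowledge base: QoS history

    def observe(self, qos_value):
        self.log.append(qos_value)         # save QoS of service resources

class AnalysisEngine:
    def predict_next(self, log):
        # Placeholder predictor: the paper uses wavelet transformation here.
        return sum(log[-3:]) / min(len(log), 3)

def adjust(monitor, engine, constraint):
    """Plan/execute step: return an adaptation action based on the prediction."""
    predicted = engine.predict_next(monitor.log)
    current = monitor.log[-1]
    if predicted >= constraint and current >= constraint:
        return "no-op"                     # requirements still satisfied
    return "generate-aspect-script"        # trigger runtime model transformation

m = Monitor()
for q in [0.9, 0.85, 0.7]:
    m.observe(q)
action = adjust(m, AnalysisEngine(), constraint=0.8)
print(action)                              # falling QoS triggers adaptation
```

The key design point mirrored here is that adaptation is driven by the *predicted* next QoS value, not only the current observation.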
Fig. 2. Runtime adaptability to deal with the changes of requirements and scenario
The core points of the scheme are as follows.
(1) How can the predictive QoS value for runtime service resources be selected? Wavelet transformation is chosen for this purpose.
(2) How can a software architecture adjustment mechanism be built that acts in advance on predicted QoS changes? The runtime adaptive adjustment algorithm (Algorithm 2) is designed to address these points.
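As a rough illustration of point (1): the paper does not specify its exact wavelet model, so the one-level Haar transform and linear trend extrapolation below are assumptions, not the authors' method:

```python
# Illustrative one-level Haar wavelet smoothing for QoS prediction.
# A minimal sketch; the paper's actual wavelet model is not specified here.

def haar_decompose(signal):
    """One-level Haar transform: pairwise averages (trend) and differences (detail)."""
    avg = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    det = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    return avg, det

def predict_next_qos(history):
    """Predict the next QoS value from the denoised trend (linear extrapolation)."""
    avg, _ = haar_decompose(history[-4:])   # keep trend, drop high-frequency detail
    return avg[-1] + (avg[-1] - avg[-2])    # extrapolate the trend one step

qos_log = [0.90, 0.88, 0.84, 0.80]
print(round(predict_next_qos(qos_log), 3))  # prints 0.75
```

Dropping the detail coefficients is what gives wavelet-based predictors their noise robustness relative to plain extrapolation on raw samples.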
Runtime Exceptions Handling for Collaborative SOA Applications 259
INPUT: QoS values of the service resources of the SBS at times t and t+1; the expected output QoS value.
OUTPUT: control operation vector at time t+1.
1: Begin
2: Initialization: train the classification prediction model; train the requirements model that tags improvement points;
3: IF ClassificationPrediction(QoS values at t and t+1, expected QoS value) = requirements
4: THEN
5:   control operation vector at t+1 := TagImprovementPoint(QoS values at t and t+1, expected QoS value)
6: ELSE
7:   control operation vector at t+1 := ArchitectureEvolution[AOP style](QoS values at t and t+1, expected QoS value)
8: END IF
9: RETURN control operation vector at t+1
10: End

Algorithm 2. Runtime adaptive adjustment algorithm.
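Algorithm 2's control flow can be sketched in Python as follows. The classifier and both handlers are placeholder stand-ins (the paper trains real models whose details are not given), and the threshold rule in classify is invented for illustration:

```python
# Hedged sketch of Algorithm 2 (runtime adaptive adjustment).
# classify, tag_improvement_point and architecture_evolution are placeholders.

def classify(qos_t, qos_t1, expected):
    """Decide whether the deviation is a requirements-level or architecture-level issue.
    Placeholder rule: small deviations are treated as requirements-level."""
    return "requirements" if abs(expected - qos_t1) < 0.1 else "architecture"

def tag_improvement_point(qos_t, qos_t1, expected):
    return ("improve-requirement", expected - qos_t1)

def architecture_evolution(qos_t, qos_t1, expected):
    return ("aop-evolution", expected - qos_t1)

def runtime_adjust(qos_t, qos_t1, expected):
    """Algorithm 2: return the control operation vector for time t+1."""
    if classify(qos_t, qos_t1, expected) == "requirements":
        return tag_improvement_point(qos_t, qos_t1, expected)
    return architecture_evolution(qos_t, qos_t1, expected)

print(runtime_adjust(0.82, 0.78, 0.80))   # small gap -> requirements path
print(runtime_adjust(0.82, 0.50, 0.80))   # large gap -> AOP architecture evolution
```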
4 Experiment and Empirical Analysis
The Hainan agricultural e-commerce platform, NongBo Mall4, was chosen as the experimental carrier; it integrates Internet of Things, cloud and big data technology to build a service-oriented, Internet-based application software system. NongBo Mall faces a rich set of stakeholders and diverse personalized requirements, which lead to a variety of personalized customization demands. Meanwhile, the platform adopts the SOA development style, and a large number of service resources (including Microsoft Asmx or Java Axis services) have been developed. These characteristics meet the requirements of an empirical carrier for exploring collaborative SOA. The authors' team provides technical support for the online/offline design of NongBo Mall, and this close cooperation helps the research achievements find timely and effective application in the Hainan agricultural e-business platform. Through continuous iteration, we can obtain a comprehensive CASE tool that supports collaborative SOA.
A large number of services have been developed for the platform. For example, for the agricultural product traceability information service alone, the WCF service address is as follows:
http://218.77.186.198:8000/TracesDataService.svc
The Web service interface address is:
http://218.77.186.198:8000/TracesDataWebService.asmx
Its identification code is 2C516EF7-CBD8-4C1C-9EE0-00EB34AFBCB5.
4 http://www.963110.net.
The test data:
PID (Product ID): A003121910013001
PRID (Planting ID): XH03020130401
For example, the method for querying the production history by batch no. is:

ProductionHistory GetProductionHistoryByPID(string PID, string IDs)

Parameters:
PID: 16-digit batch no.
IDs: identification code.
Returns a ProductionHistory object.
The main data structures, such as the ProductionHistory (production record) class, are as follows:

/// <summary>Batch no.</summary>
public string PID { get; set; }
/// <summary>Planting number</summary>
public string PRID { get; set; }
/// <summary>Planting time</summary>
public DateTime PlantTM { get; set; }
/// <summary>Picking time</summary>
public DateTime PickTM { get; set; }
/// <summary>Pesticide recordset</summary>
public List<PesticideHistory> PesticideHistoryList { get; set; }
/// <summary>Fertilization recordset</summary>
public List<FertilizerHistory> FertilizerHistoryList { get; set; }
Fig. 3. Effect comparison for the e-business platform using the exception handling mechanism
In addition to platform-developed services, the system also calls a large number of external services, such as map services, weather services and other physical services, which are typical SBS applications. In the early stage, platform operation was extremely unstable; the lack of a corresponding runtime exception
handling mechanism was the main reason. Figure 3(a) shows the comparison result for two instances of the same platform running at the same time, one of which runs with part of the exception handling mechanism designed in this paper. The comparison results (Fig. 3) show that the embedded runtime exception handling mechanism obviously improves the platform's ability (including the QoS prediction rate) to deal with all kinds of uncertainty.
5 Conclusions
The main contributions of this paper are summarized as follows: (1) a self-adaptive exception handling architecture for active service resource provisioning has been built; (2) a runtime adaptive adjustment mechanism has been designed to deal with requirements and context changes. This paper gives a general solution framework, but the details still require further in-depth study.
The paper aims to explore collaborative SOA with runtime exception handling in order to enhance operational reliability. We mainly focus on the collaborative SOA runtime exception handling mechanism. The results and progress of this study will provide systematic solutions to perfect collaborative SOA.
Acknowledgements. This research has been supported by the Natural Science Foundation of China (No. 61562024, No. 61463012) and the Natural Science Foundation of Hainan Province (No. 20156236).
References
1. Liu, L.: Editorial: services computing in 2016. IEEE Trans. Serv. Comput. 9(1), 1 (2016)
2. Zhang, L.J.: Big services era: global trends of cloud computing and big data. IEEE Trans. Serv. Comput. 5(4), 467–468 (2012)
3. Bin, W.: On-demand Service Software Engineering for Cloud Computing. National Defence Industry Press, Beijing (2014)
4. Ghezzi, C.: Surviving in a world of change: towards evolvable and self-adaptive service-oriented systems. In: Keynote Speech at the 11th International Conference on Service Oriented Computing, ICSOC 2013. Springer, Heidelberg (2013)
5. Kappel, G., Maamar, Z., Motahari-Nezhad, H.R. (eds.): ICSOC 2011. LNCS, vol. 7084. Springer, Heidelberg (2011)
6. Yau, S.S., An, H.G.: Software engineering meets services and cloud computing. Computer 44(10), 46–52 (2011)
7. Allier, S., et al.: Multitier diversification in web-based software applications. IEEE Softw. 32(1), 83–90 (2015)
8. Lemos, A.L., Daniel, F., Benatallah, B.: Web service composition: a survey of techniques and tools. ACM Comput. Surv. 48(3), 1–41 (2015)
9. Chouiref, Z., Belkhir, A., Benouaret, K., Hadjali, A.: A fuzzy framework for efficient user-centric web service selection. Appl. Soft Comput. 41, 51–65 (2016)
10. Xiong, W., et al.: A self-adaptation approach based on predictive control for SaaS. Chin. J. Comput. 39(2), 364–376 (2016)
11. Kephart, J.O., Chess, D.M.: The vision of autonomic computing. Computer 36(1), 41–45 (2003)
Data-Intensive Workflow Scheduling in Cloud on Budget and Deadline Constraints
Zhang Xin, Changze Wu, and Kaigui Wu
College of Computer Science, Chongqing University, Chongqing 400044, China zx06063068@163.com, {wuchangze,kaiguiwu}@cqu.edu.cn
Abstract. With the development of cloud computing, large-scale applications expressed as scientific workflows are often executed in the cloud. Workflow scheduling is vital for achieving high efficiency and meeting the needs of users in clouds. In order to obtain greater cost reduction while maintaining quality of service by meeting deadlines, this paper proposes a novel heuristic, PWHEFT (Path-task Weight Heterogeneous Earliest Finish Time), based on Heterogeneous Earliest Finish Time (HEFT). The criticality of tasks in a workflow and data transmission between resources are considered in PWHEFT, while they are ignored in some other algorithms. The heuristic is evaluated using simulation with five different real-world workflow applications. The simulation results show that our proposed scheduling heuristic can significantly improve the planning success rate.
Keywords: Workflow · HEFT · Bi-criteria · Data-intensive workflow scheduling
1 Introduction
Large-scale business and scientific applications, which usually involve big data, multitasking and multidisciplinary sciences, require more computing power than a single machine can provide [1]. An easy and popular way is to execute these applications, which include scientific workflows, multi-tier web service workflows, and big data processing workflows, on the cloud. In order to execute these workflows in a reasonable amount of time and at acceptable cost, the workflow scheduling problem has been studied extensively over the past years.
Workflow scheduling is a process of mapping inter-dependent tasks onto the available resources such that the workflow application is able to complete its execution within the user's specified Quality of Service (QoS) constraints, such as deadline and budget [2]. Workflow scheduling in the cloud faces several challenges. Typically, the non-dedicated nature of resources imposes additional difficulties, since contention for shared resources in the cloud needs to be considered during planning. This suggests that the planner may have to query resources for their runtime information (e.g., the existing load) to make informed decisions. Planning should also be performed in a short time, because users may require a real-time response, and the runtime information on which a planning decision has been made varies over time; thus, a planning decision made using out-of-date information may not be valid. Moreover, a user may
© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2017. S. Wang and A. Zhou (Eds.): CollaborateCom 2016, LNICST 201, pp. 262–272, 2017. DOI: 10.1007/978-3-319-59288-6_24
require his/her workflow application to complete within a certain deadline and budget.
However, minimizing makespan and minimizing execution cost are two conflicting objectives. There have been a few bi-criteria DAG planning heuristics in the literature [3–5]; some of them have sophisticated designs, such as guided random search or local search, which usually incur considerably high planning costs. Moreover, most of these heuristics do not take the data transmission between resources into account, which may lead to an invalid plan in data-intensive workflow scheduling. Taking the computational cost of scheduling into consideration, an efficient algorithm tries to trade off these values (makespan and execution cost) and still obtain an approximately optimal solution. In this paper, we aim at bi-criteria scheduling of workflows in the cloud. We present a new heuristic that seeks a beneficial trade-off between execution time and execution cost under budget and deadline constraints. The proposed heuristic, PWHEFT, is based on the HEFT [6] algorithm, which strives to reduce the overall execution time of a workflow. The HEFT algorithm selects the task with the highest upward rank value at each step and assigns the selected task to the resource that minimizes its earliest finish time, using an insertion-based approach. While effective at optimizing makespan, HEFT does not consider monetary cost or the budget constraint when making scheduling decisions. Compared with HEFT, PWHEFT works out an appropriate schedule plan by considering budget, deadline, criticality of tasks and data transfer rates between processors.
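To make HEFT's two steps concrete, here is a simplified sketch on a made-up two-processor example. It uses a plain ready-time per processor rather than HEFT's insertion-based gap search, and all task/communication values are invented:

```python
# Simplified HEFT sketch: upward-rank priority + earliest-finish-time mapping.
# Toy DAG and costs; no insertion-based slot search (a simplification).
from functools import lru_cache

succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
pred = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
w = {"A": 3, "B": 4, "C": 2, "D": 3}                            # average exec times
c = {("A", "B"): 2, ("A", "C"): 1, ("B", "D"): 2, ("C", "D"): 3}  # avg comm times

@lru_cache(maxsize=None)
def rank_u(t):  # upward rank: path length from t to the exit task
    return w[t] + max((c[(t, s)] + rank_u(s) for s in succ[t]), default=0)

# Heterogeneous execution times of each task on two processors.
exec_time = {"A": [3, 4], "B": [5, 3], "C": [2, 3], "D": [4, 2]}

ready = [0, 0]                 # when each processor is next free
aft, where = {}, {}            # actual finish time and placement per task
for t in sorted(succ, key=rank_u, reverse=True):   # decreasing upward rank
    best = None
    for p in range(2):
        est = max([ready[p]] + [aft[m] + (0 if where[m] == p else c[(m, t)])
                                for m in pred[t]])
        eft = est + exec_time[t][p]
        if best is None or eft < best[0]:
            best = (eft, p)
    aft[t], where[t] = best
    ready[best[1]] = best[0]

print(aft["D"])                # makespan = AFT of the exit task; prints 12
```

Intra-processor communication is counted as zero, which is why co-locating a task with its heaviest predecessor can beat the nominally faster processor.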
The remainder of the paper is organized as follows: Sect. 2 presents related work in the area of workflow scheduling. The problem description is given in Sect. 3. The proposed heuristic, PWHEFT, is discussed with the help of an example in Sect. 4. The proposed PWHEFT algorithm is evaluated in Sect. 5, and Sect. 6 concludes the paper.
2 Related Work
Due to the NP-complete nature of the parallel task scheduling problem in general cases [7], many heuristics have been proposed in recent research [8] to deal with this problem, and most of them achieve good performance in polynomial time. In previous works, heuristic-based algorithms have been classified into a variety of categories, such as list scheduling algorithms, clustering heuristics, and duplication-based algorithms.
Among them, list scheduling algorithms are generally more practical and perform better at a lower time complexity. There are different list-based heuristic algorithms in the literature, such as Dynamic Critical Path (DCP) [9], Dynamic Level Scheduling (DLS) [10], Critical Path on Processor (CPOP) [6] and Heterogeneous Earliest Finish Time (HEFT) [6]. Among these, HEFT performs best in terms of makespan.
Only a few works in the past have considered bi-objective criteria (mainly time and cost) to schedule workflow tasks in a cloud environment. Recently, Amandeep Verma and Sakshi Kaushal [11] proposed a cost-time efficient scheduling plan, BDHEFT, an extension of the HEFT algorithm that schedules workflow tasks over the available cloud resources under budget and deadline constraints. However, this heuristic generates a schedule plan only by considering the spare deadline and spare budget, which are calculated by a simple average while selecting the suitable resource for each workflow task, without taking the criticality of tasks and the parallelism of the executing workflow into consideration. To address these gaps, this paper introduces a novel heuristic that obtains a budget- and deadline-constrained task-to-resource mapping by assigning higher weights to critical tasks and considering parallelism time efficiency.
3 Models
3.1 Workflow Application Model
A Directed Acyclic Graph (DAG), G = (T, E), is used to model the aforementioned workflow application, where T is the set of n tasks and E is the set of e edges between the tasks. Each edge (i, j) ∈ E denotes a dependency between two tasks such that the execution of t_j ∈ T cannot start before t_i ∈ T finishes its execution. A task with no parent is an entry task, and a task with no children is an exit task. If there is more than one exit (entry) task, they are connected to a zero-cost pseudo exit (entry) task with zero-cost edges. The task size (amount_i) is expressed in Millions of Instructions (MI). Data is an n × n matrix of communication data, where data_{i,j} is the amount of data required to be transmitted from task t_i to task t_j.
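The model above can be encoded directly; all task names, sizes and data volumes below are made up for illustration:

```python
# Illustrative encoding of the workflow application model: a DAG with task
# sizes in MI and an n x n communication-data matrix. Values are invented.

tasks = ["t1", "t2", "t3"]                        # T, n = 3
edges = [("t1", "t2"), ("t1", "t3")]              # E: t2 and t3 depend on t1
amount = {"t1": 2000, "t2": 1500, "t3": 3000}     # task sizes in MI

# data[i][j]: amount of data (e.g., MB) sent from t_i to t_j; zero off-edge
idx = {t: k for k, t in enumerate(tasks)}
data = [[0.0] * len(tasks) for _ in tasks]
for (u, v), mb in zip(edges, [10.0, 25.0]):
    data[idx[u]][idx[v]] = mb

entry = [t for t in tasks if all(v != t for _, v in edges)]   # no parents
exit_ = [t for t in tasks if all(u != t for u, _ in edges)]   # no children
print(entry, exit_)    # single entry t1; two exits, t2 and t3
```

Since this toy DAG has two exit tasks, the model would add a zero-cost pseudo exit task connected to both before scheduling.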
3.2 Cloud Resources Model
A cloud service provider offering m computational resources, R = (r_1, r_2, …, r_m), with different processing power and different prices, provides the information needed to make planning decisions. Each r_i is represented by r_i = (Ms_{r_i}, Ps_{r_i}), where Ms_{r_i} denotes Millions of Instructions per Second (MIPS), i.e., the processing power of resource r_i, and Ps_{r_i} denotes the price of using resource r_i for each time interval.
The data transfer rates between resources are stored in a matrix B of size m × m. The communication time between task t_i (scheduled on r_m) and task t_j (scheduled on r_k) is defined as:

Tran(i, j) = data_{i,j} / B(m, k)    (1)

Before scheduling, the average communication time is used to label the edges. The average communication time between tasks t_i and t_j is defined as:

Tran̄(i, j) = data_{i,j} / B̄    (2)

where B̄ is the average transfer rate among the resources. Since each task can be executed on different resources, the execution time ET(i, j) of a task t_i on a resource r_j is estimated by:

ET(i, j) = (amount_i / Ms_{r_j}) * (1 + α),  α ∈ [0, 1]    (3)

where α is a random number ranging from 0 to 1, and the execution cost EC(i, j) is given by:

EC(i, j) = Ps_{r_j} * ET(i, j)    (4)

Therefore, the average execution time ET̄_i of a task t_i is defined as:

ET̄_i = (1/m) Σ_{j=1}^{m} ET(i, j)    (5)

and the average execution cost EC̄_i is given by:

EC̄_i = (1/m) Σ_{j=1}^{m} EC(i, j)    (6)

Although most commercial clouds (like Amazon) transfer internal data free of charge, the data transfer time cannot be ignored when a large amount of data needs to be transferred between tasks.
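Equations (3)–(6) can be exercised on a toy resource set. Here α is fixed at 0 for reproducibility, whereas the model draws it randomly from [0, 1]; all resource figures are invented:

```python
# Sketch of Eqs. (3)-(6): execution time/cost per resource and their averages.
# alpha is fixed here for reproducibility; the model samples it from [0, 1].

resources = [(1000, 2.0), (2000, 5.0)]    # (Ms: MIPS, Ps: price per interval)
amount_i = 4000                           # task size in MI
alpha = 0.0                               # fixed; random in the model

ET = [amount_i / ms * (1 + alpha) for ms, _ in resources]     # Eq. (3)
EC = [ps * et for (_, ps), et in zip(resources, ET)]          # Eq. (4)
avg_ET = sum(ET) / len(resources)                             # Eq. (5)
avg_EC = sum(EC) / len(resources)                             # Eq. (6)

print(ET, EC, avg_ET, avg_EC)    # [4.0, 2.0] [8.0, 10.0] 3.0 9.0
```

Note how the faster resource halves the execution time but, at its higher price, still yields the higher cost; this tension is exactly what the bi-criteria heuristic trades off.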
3.3 Scheduling Model
There are three entities in our workflow scheduling model: the User, the Planner and the Cloud Service Provider (CSP). A CSP has a set of computational resources with different capabilities, including processing power and prices, and responds to queries from the planner about the availability of requested resources. The user submits a workflow application along with a budget B and a deadline D to the planner. The planner decides how to execute the workflow tasks over the available resources.
4 Time and Cost Efficient Scheduling Algorithm
4.1 The PWHEFT
The proposed heuristic, Path-task Weight Heterogeneous Earliest Finish Time (PWHEFT), is based on HEFT, a well-known DAG scheduling heuristic. It extends HEFT by considering budget and deadline constraints while scheduling tasks over the available resources. PWHEFT has two major phases: task attribute calculation and resource scheduling.
First Phase: task attribute calculation
In the first phase of PWHEFT, the attributes of each task are calculated and all tasks are sorted by priority. The priorities of all tasks are computed using upward ranking. The upward rank of a task t_i is recursively defined by:

rank_u(t_i) = ET̄_i + max_{t_j ∈ succ(t_i)} { Tran̄(i, j) + rank_u(t_j) }    (7)

where succ(t_i) is the set of immediate successor tasks of t_i. To take the criticality level of a task into account, the downward rank is also calculated:

rank_d(t_i) = max_{t_j ∈ pred(t_i)} { Tran̄(j, i) + rank_d(t_j) + ET̄_j }    (8)

where pred(t_i) is the set of immediate predecessors of t_i. The criticality level of a task t_i is then given by:

Clvl(t_i) = 2 * (rank_u(t_i) + rank_d(t_i)) / ( min_{t_i ∈ T}{rank_u(t_i) + rank_d(t_i)} + max_{t_i ∈ T}{rank_u(t_i) + rank_d(t_i)} )    (9)

To facilitate the calculation, EST(t_i, r_j) and EFT(t_i, r_j) denote the earliest start time and the earliest finish time of task t_i scheduled on resource r_j, respectively. For the entry task t_entry, the EST is:

EST(t_entry, r_j) = 0    (10)

For the other tasks in the graph, starting from the entry task, the EST and EFT values are calculated as:

EST(t_i, r_j) = max{ avail[j], max_{t_m ∈ pred(t_i)} ( AFT(t_m) + Tran(m, i) ) }    (11)

EFT(t_i, r_j) = EST(t_i, r_j) + ET(i, j)    (12)

where pred(t_i) is the set of immediate predecessor tasks of t_i, avail[j] is the time when resource r_j is ready for task execution, and AFT(t_m) is the actual finish time of task t_m. Analogously, LFT(t_i), the latest finish time of task t_i, is calculated by:

LFT(t_i) = D, when t_i = t_exit;
LFT(t_i) = min_{t_j ∈ succ(t_i)} { LFT(t_j) − ET̄_j − Tran̄(i, j) }, otherwise    (13)

where D is the given deadline. The schedule length, also called makespan, is equal to the actual finish time of the exit task t_exit:

makespan = AFT(t_exit)    (14)
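Equations (7)–(9) can be computed on a toy DAG; all task and communication values below are invented for illustration:

```python
# Sketch of Eqs. (7)-(9): upward rank, downward rank and criticality level
# on a 4-task diamond DAG with made-up average times.
from functools import lru_cache

succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
pred = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
ET = {"A": 3.0, "B": 4.0, "C": 2.0, "D": 3.0}                        # ET̄_i
Tr = {("A", "B"): 2.0, ("A", "C"): 1.0, ("B", "D"): 2.0, ("C", "D"): 3.0}

@lru_cache(maxsize=None)
def rank_u(t):                                                       # Eq. (7)
    return ET[t] + max((Tr[(t, s)] + rank_u(s) for s in succ[t]), default=0.0)

@lru_cache(maxsize=None)
def rank_d(t):                                                       # Eq. (8)
    return max((Tr[(p, t)] + rank_d(p) + ET[p] for p in pred[t]), default=0.0)

total = {t: rank_u(t) + rank_d(t) for t in succ}
lo, hi = min(total.values()), max(total.values())
Clvl = {t: 2 * total[t] / (lo + hi) for t in succ}                   # Eq. (9)
print(Clvl)
```

Tasks on the longest end-to-end path (here A, B and D) share the maximal rank_u + rank_d sum, so their criticality level exceeds 1 while off-critical-path tasks fall below it.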
Second Phase: resource scheduling
In the resource scheduling phase, candidate resources are generated and the best resource is selected. For each task, taken in order of rank_u, the set of candidate resources is constructed using six variables: Workflow Prediction Budget (WPB), Prediction Task Budget (PTB), Prediction Budget Factor (PBF), Weight Deadline Factor (WDF), Prediction Deadline Factor (PDF) and Prediction Task Deadline (PTD). For a task t_i, the values of these variables are given by (15) to (20), as follows:

WPB = B − Σ_{t_i ∈ allocatedTasks} EC(i) − Σ_{t_j ∈ unallocatedTasks} Clvl(t_j) * EC̄_j    (15)

PBF(t_i) = 0, when WPB < 0;
PBF(t_i) = Clvl(t_i) * EC̄_i / Σ_{t_j ∈ unallocatedTasks} Clvl(t_j) * EC̄_j, otherwise    (16)

PTB(t_i) = Clvl(t_i) * EC̄_i + PBF(t_i) * WPB    (17)

WDF(t_i) = 1, when t_i = t_exit;
WDF(t_i) = Clvl(t_i) * ( max_{t_k ∈ succ(t_i)} Tran̄(i, k) )^{−1} * EC̄_i, otherwise    (18)

PDF(t_i) = WDF(t_i) / Σ_{t_k ∈ unallocatedTasks} WDF(t_k)    (19)

PTD(t_i) = 0, when LFT(t_i) − min_{r_j ∈ R}{ EFT(t_i, r_j) } < 0;
PTD(t_i) = min_{r_j ∈ R}{ EFT(t_i, r_j) } + PDF(t_i) * ( LFT(t_i) − min_{r_j ∈ R}{ EFT(t_i, r_j) } ), otherwise    (20)

where B is the given budget, EC(i) is the execution cost of the allocated task t_i, and EC̄_j and ET̄_j are the average execution cost and average execution time of an unallocated task t_j. PBF and PDF act as weights that tune the impact on PTB and PTD, respectively; these prediction functions are designed to determine on which resources task t_i is predicted to finish.

Based on the deadline and budget allocated to a task t_i, a candidate set CS_i of possible resources for t_i is calculated by:

CS_i = { S(i, j) | ∃ S(i, j): EC(i, j) ≤ PTB(t_i), EFT(t_i, r_j) ≤ PTD(t_i) }    (21)

where S(i, j) represents a possible resource, from the given R, which satisfies the inequalities. Then the best resource is selected by the following rules:

– I. If CS_i ≠ ∅, the best resource is the one in this set that minimizes the expression θ * EFT(t_i, r_j) + (1 − θ) * EC(i, j) over all j ∈ CS_i, where EFT(t_i, r_j) is the earliest finish time and EC(i, j) the execution cost of task t_i on resource r_j, and θ ∈ [0, 1] is the cost-time balance factor representing the user preference between execution time and execution cost.
– II. If CS_i = ∅ and WPB ≥ 0, then the resource among all available resources that minimizes the above expression is chosen.
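Equation (21) together with selection rules I and II amounts to a filter-then-minimize step, sketched below. The resource triples and thresholds are made up, and the fallback branch assumes WPB ≥ 0 without checking it:

```python
# Sketch of Eq. (21) plus selection rules I/II: filter resources by the
# predicted task budget/deadline, then pick the theta-weighted minimum.
# All numbers are invented; rule II here assumes WPB >= 0.

def select_resource(options, ptb, ptd, theta=0.5):
    """options: list of (resource, EFT, EC) triples for one task."""
    candidates = [o for o in options if o[2] <= ptb and o[1] <= ptd]  # Eq. (21)
    pool = candidates if candidates else options       # rule II fallback
    return min(pool, key=lambda o: theta * o[1] + (1 - theta) * o[2])

options = [("r1", 10.0, 4.0), ("r2", 6.0, 9.0), ("r3", 8.0, 5.0)]
best = select_resource(options, ptb=6.0, ptd=9.0, theta=0.5)
print(best[0])    # only r3 satisfies both bounds; prints r3
```

With θ = 1 the rule degenerates to pure earliest-finish-time selection (HEFT-like), and with θ = 0 to pure cost minimization, which is how the user preference enters the heuristic.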