Deadline-Based Execution of Scientific Workflows on IaaS Clouds Using a Resource Provisioning and Scheduling Strategy

IJCSEC Front Page

Abstract
Cloud computing is the latest distributed computing paradigm and it offers tremendous opportunities to solve large-scale scientific problems. However, it presents various challenges that need to be addressed before it can be used efficiently for workflow applications. Although the workflow scheduling problem has been widely studied, there are very few initiatives tailored for cloud environments. Furthermore, existing works either fail to meet the user’s quality of service (QoS) requirements or do not incorporate basic principles of cloud computing such as the elasticity and heterogeneity of the computing resources. This paper proposes a resource provisioning and scheduling strategy for scientific workflows on Infrastructure as a Service (IaaS) clouds. We present an algorithm based on the meta-heuristic optimization technique particle swarm optimization (PSO), which aims to minimize the overall workflow execution cost while meeting deadline constraints. Our heuristic is evaluated using CloudSim and various well-known scientific workflows of different sizes. The results show that our approach performs better than the current state-of-the-art algorithms.
Keywords- Cloud computing, resource provisioning, scheduling, scientific workflow
I. Introduction
Workflows have been frequently used to model large-scale scientific problems in areas such as bioinformatics, astronomy, and physics [1]. Such scientific workflows have ever-growing data and computing requirements and therefore demand a high-performance computing environment in order to be executed in a reasonable amount of time. These workflows are commonly modeled as a set of tasks interconnected via data or computing dependencies.
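As a minimal illustration of this model, a workflow can be represented as a directed acyclic graph in which each task lists the tasks it depends on. The four-task workflow and the task names below are hypothetical and serve only as an example; the standard-library `graphlib` module (Python 3.9+) is used to obtain one valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical four-task workflow: each key maps to the set of tasks
# that must finish before it can start (its parents).
dependencies = {
    "t1": set(),
    "t2": {"t1"},
    "t3": {"t1"},
    "t4": {"t2", "t3"},
}


def execution_order(deps):
    """Return one valid order in which parents always precede children.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    i.e. if the input is not a valid workflow.
    """
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Here `t1` is always first and `t4` always last; the relative order of `t2` and `t3` is not fixed, which is exactly the freedom a scheduler can exploit by running them in parallel.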
Over the years, distributed environments have evolved from shared community platforms to utility-based models, the latest of these being cloud computing. This technology enables the delivery of IT resources over the Internet [2] and follows a pay-as-you-go model in which users are charged based on their consumption. There are various types of cloud providers [2], each of which has different product offerings. They are classified into a hierarchy of as-a-service terms: Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). This paper focuses on IaaS clouds, which offer the user a virtual pool of unlimited, heterogeneous resources that can be accessed on demand.
Moreover, they offer the flexibility of elastically acquiring or releasing resources with varying configurations to best suit the requirements of an application. Even though this empowers users and gives them more control over the resources, it also demands the development of innovative scheduling techniques so that the distributed resources are efficiently utilized. There are two main stages when planning the execution of a workflow in a cloud environment. The first is the resource provisioning phase; during this stage, the computing resources that will be used to run the tasks are selected and provisioned. In the second stage, a schedule is generated and each task is mapped onto the best-suited resource. The selection of the resources and the mapping of the tasks are done so that different user-defined quality of service (QoS) requirements are met.
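To make the two stages concrete, the sketch below evaluates one candidate plan: a set of leased VMs (provisioning) and a task-to-VM mapping (scheduling). It is an illustrative model only, not the evaluation used in this paper. The task sizes, VM speeds, prices, and the hourly billing period are assumed values, and data transfer times and VM boot times are ignored.

```python
import math
from graphlib import TopologicalSorter


def evaluate(deps, size, mapping, vm_speed, vm_price, period=3600.0):
    """Return (total_cost, makespan) for a task-to-VM mapping.

    Assumptions (illustrative only): a VM runs one task at a time, is leased
    from the start of its first task to the end of its last task, and is
    billed per started period of `period` seconds.
    """
    finish = {}       # task -> finish time
    vm_free = {}      # vm -> time at which it becomes idle
    lease_start = {}  # vm -> start time of its first task
    for task in TopologicalSorter(deps).static_order():
        vm = mapping[task]
        # A task is ready once all of its parents have finished.
        ready = max((finish[p] for p in deps.get(task, ())), default=0.0)
        start = max(ready, vm_free.get(vm, 0.0))
        lease_start.setdefault(vm, start)
        finish[task] = start + size[task] / vm_speed[vm]
        vm_free[vm] = finish[task]
    cost = sum(
        math.ceil((vm_free[vm] - lease_start[vm]) / period) * vm_price[vm]
        for vm in vm_free
    )
    return cost, max(finish.values())


# Hypothetical workflow and resources.
deps = {"t1": set(), "t2": {"t1"}, "t3": {"t1"}, "t4": {"t2", "t3"}}
size = {"t1": 1800.0, "t2": 3600.0, "t3": 3600.0, "t4": 1800.0}
vm_speed = {"vm_a": 1.0, "vm_b": 2.0}   # work units per second
vm_price = {"vm_a": 1.0, "vm_b": 2.5}   # price per billing period
mapping = {"t1": "vm_a", "t2": "vm_a", "t3": "vm_b", "t4": "vm_a"}

cost, makespan = evaluate(deps, size, mapping, vm_speed, vm_price)
print(cost, makespan)  # 4.5 7200.0
```

In this example `vm_a` is leased for two billing periods and `vm_b` for one, so the plan costs 4.5 and finishes at 7200 seconds. A deadline constraint is then simply the check `makespan <= deadline`.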
Virtual machine (VM) performance is an additional challenge presented by cloud platforms. VMs provided by current cloud infrastructures do not exhibit stable performance in terms of execution times. This work is based on the meta-heuristic optimization technique particle swarm optimization (PSO). The provisioning and scheduling of a workflow is formulated as an optimization problem that minimizes the overall execution cost while meeting the deadline, and PSO is then used to solve it. The result is a schedule that defines not only the task-to-resource mapping but also the number and type of VMs that need to be leased, the time when they need to be leased, and the time when they need to be released. Our contribution is therefore an algorithm that meets deadlines more accurately and at lower cost, and that considers heterogeneous resources that can be dynamically acquired and released and are charged on a pay-per-use basis.
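The general idea of using PSO for this kind of problem can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm proposed in this paper: each particle dimension stands for one task and its floored value selects a VM from a small hypothetical pool, task dependencies are ignored, and the deadline is handled with a simple feasibility rule (any schedule that meets the deadline beats any that does not). The parameters `w`, `c1`, `c2`, the swarm size, and all input values are assumptions.

```python
import math
import random


def make_fitness(sizes, speeds, prices, deadline, period=3600.0):
    """Build decode/fitness functions for a hypothetical VM pool."""
    def decode(position):
        # Floor each coordinate to a VM index, guarding the upper bound.
        return [min(int(x), len(speeds) - 1) for x in position]

    def fitness(position):
        busy = [0.0] * len(speeds)  # total run time on each VM
        for task, vm in enumerate(decode(position)):
            busy[vm] += sizes[task] / speeds[vm]
        makespan = max(busy)
        cost = sum(math.ceil(b / period) * p for b, p in zip(busy, prices))
        # Tuples compare lexicographically: feasible (0, cost) always
        # beats infeasible (1, makespan).
        return (0, cost) if makespan <= deadline else (1, makespan)

    return decode, fitness


def pso(fitness, dim, upper, n_particles=20, iters=100,
        w=0.5, c1=2.0, c2=2.0, seed=0):
    """Minimize `fitness` over [0, upper)^dim with a basic PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(0, upper) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and keep the particle inside the search space.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], 0.0), upper - 1e-9)
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical input: four tasks, a pool of three VMs, a 5400 s deadline.
sizes = [1800.0, 3600.0, 3600.0, 1800.0]
speeds = [1.0, 1.0, 2.0]
prices = [1.0, 1.0, 2.5]
decode, fitness = make_fitness(sizes, speeds, prices, deadline=5400.0)
best_pos, best_fit = pso(fitness, dim=len(sizes), upper=len(speeds))
mapping = decode(best_pos)
print(mapping, best_fit)
```

A VM that receives no task has zero busy time and therefore zero cost, so the number of leased VMs is decided implicitly by the mapping. A fuller treatment would also account for task dependencies, lease start and end times, and performance variation.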

References:

  1. G. Juve, A. Chervenak, E. Deelman, S. Bharathi, G. Mehta, and K. Vahi, “Characterizing and profiling scientific workflows,” Future Generation Comput. Syst., vol. 29, no. 3, pp. 682–692, 2012.
  2. P. Mell and T. Grance, “The NIST definition of cloud computing: recommendations of the National Institute of Standards and Technology,” Special Publication 800-145, NIST, Gaithersburg, 2011.
  3. R. Buyya, J. Broberg, and A. M. Goscinski, Eds., Cloud Computing: Principles and Paradigms, vol. 87, Hoboken, NJ, USA: Wiley, 2010.
  4. J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proc. 6th IEEE Int. Conf. Neural Netw., 1995, pp. 1942–1948.
  5. Y. Fukuyama and Y. Nakanishi, “A particle swarm optimization for reactive power and voltage control considering voltage stability,” in Proc. 11th IEEE Int. Conf. Intell. Syst. Appl. Power Syst., 1999, pp. 117–121.
  6. C. O. Ourique, E. C. Biscaia Jr., and J. C. Pinto, “The use of particle swarm optimization for dynamical analysis in chemical processes,” Comput. Chem. Eng., vol. 26, no. 12, pp. 1783–1793, 2002.
  7. T. Sousa, A. Silva, and A. Neves, “Particle swarm based data mining algorithms for classification tasks,” Parallel Comput., vol. 30, no. 5, pp. 767–783, 2004.
  8. M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, vol. 238, New York, NY, USA: Freeman, 1979.
  9. M. Rahman, S. Venugopal, and R. Buyya, “A dynamic critical path algorithm for scheduling scientific workflow applications on global grids,” in Proc. 3rd IEEE Int. Conf. e-Sci. Grid Comput., 2007, pp. 35–42.