TOWARDS MULTI-OBJECTIVE SCHEDULING STRATEGIES FOR GRID COMPUTING ENVIRONMENTS

Agarwal, Amit

Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/326

Title:	TOWARDS MULTI-OBJECTIVE SCHEDULING STRATEGIES FOR GRID COMPUTING ENVIRONMENTS
Authors:	Agarwal, Amit
Keywords:	MULTI-OBJECTIVE SCHEDULING;STRATEGIES FOR GRID COMPUTING;GRID COMPUTING;ORIENTED ALGORITHM
Issue Date:	2011
Abstract:	In recent years, there has been increasing interest in using networkbased resources for large scale data-intensive computation problems. The Grid computing systems are rapidly gaining importance due to their capability to deal with the continuously rising demands of computational and storage resources. A grid may consist of resources owned by a number of different organizations, within which sharing arrangements have been established. Grid computing can be defined as coordinated resource sharing and problem solving in dynamic, multi-institutional collaborations. Scheduling is an important research topic in Grid computing. The purpose of grid is to offer high quality services to members of Virtual Organizations (VOs) based on the efficient use of the available resources. This goal cannot be achieved without a good scheduling strategy applied to the local level (clusters) and global level (entire system) of Grids. Since the grid scheduling problem is NP-hard, we can afford only sub-optimal solutions to this problem. Grid scheduling scenario has additional complexities due to scattered data storages and consequent delays in data transfers, coordination required in the execution of linked tasks and arranging the required data interchange. Further challenges are related to the dynamic nature of Grid systems where the changing availability and quality of resources makes the scheduling decisions still more difficult. Optimally scheduling a single set of dependent tasks to multiple heterogeneous resources is an algorithmically hard problem. This problem becomes even harder when optimization of multiple criteria (at times conflicting) is required. Quality of Service (QoS) requirements of applications and resources add an extra layer of complexity to scheduling while matching the tasks to resources. Due to the dynamic nature of grid resources, the scheduling of scientific application tasks is still an issue and is an open problem. This thesis attempts to improve upon these frontages by contributing new single criteria and multi-criteria scheduling heuristics to satisfy performance objectives like makespan, economic cost, resource consumption, etc while meeting the QoS requirements like trustworthiness, availability, network bandwidth, etc. We have classified the research work in three parts: (1) Single Criteria Scheduling - Scheduling of application tasks meeting a single objective function like makespan or economic cost, (2) Bi-Criteria Scheduling - Scheduling of tasks meeting two objective functions simultaneously, and (3) QoS Oriented Multi Criteria Scheduling - Scheduling of tasks meeting more than one objective functions while respecting QoS requirements like bandwidth, availability and trustworthiness of resources. The research work in single criteria scheduling has been classified in two categories: (1) Time based scheduling, and (2) Cost based scheduling. In time based scheduling, we have developed a new scheduling heuristic called Scheduling with Heterogeneity using Critical Path (SHCP) algorithm for grid computing systems. The performance of workflow scheduling greatly depends on task sequence obtained using b-level (longest directed path considering average computation and communication cost from concerned task to exit task). But, in grids, the above approach does not fit due to high heterogeneity in computation costs of tasks on different resources and communication costs of data transfers among different tasks (scheduled on different resources) due to variable network bandwidth. In this research, computation and communication heterogeneity factors have been formulated based on 'Standard Deviation' which reflects the spread of execution time/ communication time of a task/ data transfer in workflow. The expected computation cost of each task is determined using computation heterogeneity factor and the expected communication cost is computed using communication heterogeneity factor rather than taking average computation and communication costs. The priority based task sequence is generated using these costs. The tasks on critical path in workflow are given priority in the task sequence. The SHCP algorithm is applied to this task sequence to generate the schedules. The simulated result analysis shows that SHCP generates comparatively shorter schedules for large workflow applications in grids. In cost based scheduling, task duplication and compaction strategy has been adopted in scheduling. This strategy adopts a mechanism to minimize the number of duplications (i.e., duplication cost overhead) to optimize the schedules. The scheduling algorithms (HED and RD) have been developed and validated for heterogeneous and homogeneous computing systems respectively. In this approach, the schedule is obtained using duplication based scheduling and then useless duplications and unproductive sub-schedules are removed from the above schedule to minimize the effective consumption of resources. The simulated result analysis shows that HED (Heterogeneous Economical Duplication) and RD (Reduced Duplication) algorithms generate comparable schedules with remarkably less duplication overheads and less processor consumption in heterogeneous and homogeneous computing environments respectively. The EDS-G (Economical Duplication Based Scheduling in Grids) inspired from HED strategy has been developed and validated in a simulated grid environment. In bi-criteria scheduling, two heuristics have been suggested i.e., compaction based bi-criteria scheduling (SODA) and two phase bi-criteria scheduling (DBSA). A compaction based scheduling (SODA) comprises two stages: (1) duplication based scheduling-optimizes the primary criterion, i.e., execution time, (2) compaction of schedules - minimizes the processor requirements and optimizes secondary criterion, i.e., economic cost without increasing the makespan obtained in primary scheduling. It exploits duplication strategy to obtain a shorter schedule in primary stage (primary scheduling) in order to minimize makespan (primary criteria). In secondary scheduling, the primary schedule is scanned to identify and remove the useless duplications and unproductive schedules, if any, without letting it to increase the makespan obtained in primary scheduling. In addition, the above modified schedule is further improved by swapping tasks among resources (from costlier to cheaper resource) in order to minimize economic cost (secondary criteria) without affecting the makespan. The experimental results reveal that the proposed approach generates schedules with low processor requirements which are fairly optimized for both economic cost and makespan for executing DAG applications in the grid environments. The Two phase bi-criteria scheduling (DBSA) is an extension of SODA approach which adopts the concept of sliding constraint in secondary scheduling phase. The primary phase of this approach is similar to SODA algorithm. In next phase, the primary schedule is first modified in order to minimize useless duplications and unproductive schedules, then it is improved by swapping tasks among resources (from costlier to cheaper resource) in order to minimize economic cost (secondary criteria) while letting the makespan to deteriorate within sliding constraint limits. This algorithm (DBSA) is developed and validated in the simulated grid environments. The simulated results show that DBSA surpasses SODA heuristic in terms of economic cost due to sliding makespan. It also generates better schedules which are fairly optimized in respect of both makespan and economic cost. Further, QoS oriented multi-criteria scheduling is discussed for (a) Trust based and (b) Availability aware scheduling. The research presented here considers the QoS constraints (trust, availability) of grid resources for scheduling application tasks using multiple criteria scheduling approach. This research work has been classified as trust based multi-criteria scheduling and availability aware QoS scheduling. In trust based multi-criteria scheduling, trust has been used as QoS requirement of workflow tasks. Trust is a major concern of resource users and owners in the grid environment with uncountable and at times unreliable nodes. An intelligent scheduling mechanism is essential which may require several different criteria to be considered simultaneously when evaluating the quality of solution or a schedule. A trust-oriented multi-objective scheduling (Trust-MOS) heuristic is designed to schedule the tasks to highly trustworthy resources in the grid environment. This approach has been divided into two phases. In primary phase, each task is scheduled on to a resource which fulfills the trust requirements and minimizes the makespan. The schedule generated in primary phase is then modified using a greedy approach to minimize the economic cost as well keeping the makespan within specified limit based on a sliding constraint. The Trust-MOS scheduling algorithm has been implemented in simulated grid environments. The simulated results reveal that Trust-MOS generates good quality schedules in terms of trust overheads, and reduces probabilities of task failure, hence is well suited for executing tasks in a secure manner. Availability aware QoS scheduling strategy considers availability as QoS of compute resource and bandwidth as QoS of underlying network in grid. A novel availability aware QoS oriented algorithm (AQuA) is developed for executing applications in grid environment. The application tasks specify the availability and bandwidth as QoS requirements to select the highly available compute resource over dedicated network. Such tasks get priority in scheduling over the qualified resources. After mapping of such tasks, the other tasks not having QoS requirements are mapped. The AQuA algorithm has been validated in simulated grid environment and results have been compared with the exiting QoS Guided Min-Min (QGMM) algorithm. The comparative result analysis shows that AQuA is able to prioritize the highly available grid resources and therefore increases the reliability to successfully execute applications without adversely affecting the makespan. In future, the research has been proposed to be extended towards the development of multi-criteria scheduling techniques for simultaneous execution of multiple workflows in grids.
URI:	http://hdl.handle.net/123456789/326
Other Identifiers:	Ph.D
Research Supervisor/ Guide:	Kumar , Padam
metadata.dc.type:	Doctoral Thesis
Appears in Collections:	DOCTORAL THESES (E & C)

Files in This Item:

File	Description	Size	Format
TOWARDS MULTI-OBJECTIVE SCHEDULING ATRATEGIES FOR GRID COMPUTING ENVIROMENTS.pdf		8.53 MB	Adobe PDF	View/Open

Show full item record