Please use this identifier to cite or link to this item: http://localhost:8081/xmlui/handle/123456789/2241
Title: ADAPTIVE CHECKPOINTING BASED FAULT TOLERANCE IN GRID ENVIRONMENT
Authors: Upadhyay, Neeraj
Keywords: FAULT TOLERANCE;HETEROGENEITY AND DYNAMISM;ADAPTIVE CHECKPOINTING;ELECTRONICS AND COMPUTER ENGINEERING
Issue Date: 2012
Abstract: Grid systems differ from traditional distributed systems in their large scale, heterogeneity and dynamism. These factors contribute to a higher number of faults: large scale lowers the Mean Time To Failure (MTTF); heterogeneity results in interaction faults (protocol incompatibilities) between disparate communicating nodes; and dynamism means that resource availability varies as resources autonomously enter and leave the grid, affecting the jobs running on them. The probability of application failure is further increased because applications running on the grid are long-running computations that take days to finish. Traditional approaches for tolerating faults in distributed systems include checkpointing and replication. Incorporating fault tolerance into scheduling algorithms is one approach to handling faults in a grid environment. Genetic Algorithms (GA) and Ant Colony Optimization (ACO) are popular meta-heuristic algorithms used for grid scheduling. This work designs heuristics for adaptive checkpointing based on fault information about resources, and incorporates these heuristics into GA and ACO. The other adaptive checkpointing techniques developed focus on online adaptation of the checkpoint interval based on Mean Time Between Failures (MTBF), last failure time and fault indexes of resources. The performance of adaptive checkpointing has been compared with that of periodic checkpointing in a simulated grid environment for a wide range of scenarios, including temporally and spatially correlated failures, real failure traces and real workload traces. Adaptive checkpointing techniques are found to give superior performance compared to periodic checkpointing.
URI: http://hdl.handle.net/123456789/2241
Other Identifiers: M.Tech
Research Supervisor/ Guide: Misra, Manoj
metadata.dc.type: M.Tech Dissertation
Appears in Collections: MASTERS' THESES (E & C)

Files in This Item:
File            Description   Size       Format
ECDG21994.pdf                 16.71 MB   Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.