Please use this identifier to cite or link to this item:
|Title:||COMPARATIVE PERFORMANCE ANALYSIS OF TWO NEW PARALLEL JOIN ALGORITHMS FOR MANAGING DATA SKEW|
|Authors:||Aravind, N. C.|
|Keywords:||ELECTRONICS AND COMPUTER ENGINEERING;MANAGING DATA SKEW;COMPUTER NETWORKS;PARALLEL PROCESSING|
|Abstract:||Parallel processing is an attractive option for relational database systems. As in any parallel environment, however, load balancing is a critical issue which affects overall performance. Load balancing for one common database operation, in particular thejoin of two. relations, can be severely hampered for conventional parallel algorithms, due to a natural phenomenon known as data skew. This is a problem which grows with the number of processors since even modest skew can cause load imbalances when the number of processors is large. In a couple of recent papers, two new parallel join algorithms designed to manage the problem of data skew are presented. These algorithms are based, respectively on the traditional sort merge and hash join algorithms, and employ techniques borrowed from mathematical optimization theory. This work focuses on the evaluation of comparative performance of these algorithms by simulation. The new algorithms are shown to be robust even in the presence of high skew in both the relations and for a large number of processors. The parallel sort merge, join algorithm is found to be extraordinarily well-behaved, providing near-linear speedups for almost all the cases studied. Contrary to the uniprocessor case, the performance of the parallel sort merge join algorithm is generally better thavh that of the parallel hash join algorithm for a large number of processors.|
|Research Supervisor/ Guide:||Sarje, A. K.|
|Appears in Collections:||MASTERS' DISSERTATIONS (E & C)|
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.