Please use this identifier to cite or link to this item: http://localhost:8081/xmlui/handle/123456789/12177
Authors: Bhushan, Anant
Issue Date: 2010
Abstract: Clustering, particularly unsupervised clustering, is central to a large number of computing applications that involve machine learning and information retrieval. Algorithmic methods of improving execution times, such as the filtering algorithm and kd-trees, have been successful but leave little scope for further improvement. The emergence of multi-core processors, together with their easy availability and low cost, has made increased computing power widely accessible. What is needed now are algorithms that can harness this computing power. This work presents a novel approach that reduces the running time of clustering algorithms by exploiting the parallel computing architectures available today. We use the MPI libraries to create parallel execution threads on multi-core processors. Our approach adds a pre-processing step and a post-processing step to the parallel implementation of clustering with the filtering algorithm. The pre-processing step finds groups of dimensions that have similar characteristics and can therefore yield better-quality clusters. Based on a similarity metric, these sub-groups of similar dimensions are grouped together for the parallel clustering operations in the subsequent steps. The sub-groups are created with an overlapping dimension between adjacent groups to facilitate the merging of cluster centers during the post-processing step. The parallel clustering step produces overlapping cluster centers for the sub-groups of dimensions. The post-processing step takes the clusters produced for the sub-groups and merges the cluster centers based on the overlapping dimensions. The feasibility of the framework is demonstrated through an implementation of multi-spectral image clustering using the filtering algorithm, for which significantly reduced running times were obtained.
The pre-processing step computed the kurtosis of the image data, which was used to calculate the similarity metric and to form the sub-groups. The overhead of executing the pre- and post-processing steps was less than one percent of the time taken to cluster the data in parallel.
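The grouping idea in the pre-processing step can be illustrated with a minimal sketch. The abstract does not say how kurtosis is turned into a similarity metric or how group sizes are chosen, so the choices here are assumptions made only for illustration: dimensions are ordered by their kurtosis, split into fixed-size groups, and each pair of adjacent groups shares exactly one dimension. The names kurtosis, group_dimensions and group_size are hypothetical and do not come from the dissertation.

```python
from statistics import fmean


def kurtosis(values):
    """Population kurtosis (fourth standardized moment) of a sequence."""
    mean = fmean(values)
    m2 = fmean((v - mean) ** 2 for v in values)
    m4 = fmean((v - mean) ** 4 for v in values)
    # A constant dimension has zero variance; report 0.0 instead of dividing by zero.
    return m4 / (m2 ** 2) if m2 > 0 else 0.0


def group_dimensions(data, group_size):
    """Order dimensions by kurtosis and split them into overlapping groups.

    data: list of rows, each row a list with one value per dimension.
    group_size: dimensions per group (assumed parameter, must be >= 2).
    Adjacent groups share exactly one dimension.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    n_dims = len(data[0])
    scores = [kurtosis([row[d] for row in data]) for d in range(n_dims)]
    # Assumption: dimensions with close kurtosis values count as "similar".
    order = sorted(range(n_dims), key=lambda d: scores[d])
    groups = []
    start = 0
    while True:
        groups.append(order[start:start + group_size])
        if start + group_size >= n_dims:
            break
        # Advance by group_size - 1 so the next group starts on the shared dimension.
        start += group_size - 1
    return groups
```

Each returned group could then be clustered by a separate parallel worker, and the shared dimension gives the post-processing step a common coordinate on which to match cluster centers from neighbouring groups.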
Other Identifiers: M.Tech
Research Supervisor/ Guide: Singh, Kuldip
Mittal, Ankush
metadata.dc.type: M.Tech Dissertation
Appears in Collections: MASTERS' DISSERTATIONS (E & C)

Files in This Item:
File            Description   Size      Format
ECDG20083.pdf                 1.93 MB   Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.