CLUSTERING MULTI-DIMENSIONAL DATA STREAM OBJECTS

Narsingh, Pardeshi Bharat

Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/8901

Full metadata record

DC Field	Value	Language
dc.contributor.author	Narsingh, Pardeshi Bharat	-
dc.date.accessioned	2014-11-18T05:42:36Z	-
dc.date.available	2014-11-18T05:42:36Z	-
dc.date.issued	2010	-
dc.identifier	M.Tech	en_US
dc.identifier.uri	http://hdl.handle.net/123456789/8901	-
dc.guide	Toshniwal, Durga	-
dc.description.abstract	Clustering is an important data mining technique. It is an unsupervised learning process that groups data objects meaningfully. Data streams are temporally ordered, fast-changing, high-dimensional and potentially infinite volumes of data. Clustering data streams is, however, a non-trivial task because of their dynamic, high-dimensional and voluminous nature. Existing clustering algorithms are not able to cluster such data streams accurately. Thus, the existing data stream clustering algorithms must be improved so that they are able to mine data stream objects as they arrive. The proposed research work aims at the development of improved data stream clustering algorithms. The major objective of this work is to improve clustering purity while taking time complexity into account. Based on the partitioning technique, an algorithm termed Partitioning-based Improved Stream (PartIS) Clustering is proposed. This algorithm merges or splits clusters dynamically depending on the arriving data stream objects. Using the hierarchical clustering methodology, an algorithm termed Hierarchical-based Improved Stream (HIS) Clustering is proposed. By projecting data objects into a high-dimensional grid structure, this algorithm performs hierarchical clustering to obtain reasonable results. Using the density-based approach, an algorithm termed Density-based Improved Stream (DenIS) Clustering is proposed. This algorithm is able to discover clusters of arbitrary shape along with proper discrimination of outliers. Finally, the DenIS Clustering algorithm is parallelized on CUDA to achieve computational speedup. The proposed work has been implemented on the Linux platform using the C language. The parallel algorithm exploiting CUDA technology is implemented using NVIDIA CUDA C on a Quadro FX 3700 graphics card. All the experiments are performed on an Intel(R) Xeon(R) E5420 CPU with 16 GB of RAM.	en_US
dc.language.iso	en	en_US
dc.subject	ELECTRONICS AND COMPUTER ENGINEERING	en_US
dc.title	CLUSTERING MULTI-DIMENSIONAL DATA STREAM OBJECTS	en_US
dc.type	M.Tech Dissertation	en_US
dc.accession.number	G20116	en_US
Appears in Collections:	MASTERS' THESES (E & C)

Files in This Item:

File	Description	Size	Format
ECD20116.pdf		4.11 MB	Adobe PDF
