Please use this identifier to cite or link to this item:
http://localhost:8081/xmlui/handle/123456789/6578
Title: | FRAMEWORK FOR PARALLEL IMPLEMENTATION OF PART-OF-SPEECH TAGGING FOR TEXT MINING USING GRID COMPUTING |
Authors: | Kumar, Naveen |
Keywords: | ELECTRONICS AND COMPUTER ENGINEERING;PART-OF-SPEECH TAGGING;TEXT MINING;GRID COMPUTING |
Issue Date: | 2011 |
Abstract: | The amount of available data is increasing rapidly, which makes it difficult for humans to distinguish relevant information. Gaining new knowledge, retrieving the meaning of text documents, and associating it with other knowledge are major challenges. There is an urgent need to develop new text mining solutions to tackle the exponential growth in text data. Problem sizes are increasing day by day due to the addition of new text documents. Grid-aware text mining is one solution for knowledge extraction from such large volumes of text. Part-of-speech (POS) tagging is an important preprocessing task in text mining. However, tagging algorithms working on a very large document collection take a very long time to produce results on conventional computers. In this thesis we present a framework for parallel implementation of part-of-speech tagging for text mining using grid computing. Globus Toolkit, a middleware for scientific and data-intensive grid applications, is used to develop this framework in a grid environment. Experimental results show that this model significantly reduces the part-of-speech tagging time for text mining. The model can be integrated into a grid-based text mining tool, helping to improve the overall performance of the text mining process. |
URI: | http://hdl.handle.net/123456789/6578 |
Other Identifiers: | M.Tech |
Research Supervisor/ Guide: | Kumar, Padam |
metadata.dc.type: | M.Tech Dissertation |
Appears in Collections: | MASTERS' THESES (E & C) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ECED G21054.pdf | | 2.59 MB | Adobe PDF | View/Open |