DESIGN OF IMPROVED AND EFFICIENT INDEXING ALGORITHM FOR TEXT RETRIEVAL

Shrivastava, Mridul

Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/2183

Full metadata record

DC Field	Value	Language
dc.contributor.author	Shrivastava, Mridul	-
dc.date.accessioned	2014-09-26T14:07:48Z	-
dc.date.available	2014-09-26T14:07:48Z	-
dc.date.issued	2012	-
dc.identifier	M.Tech	en_US
dc.identifier.uri	http://hdl.handle.net/123456789/2183	-
dc.guide	Kumar, Padam	-
dc.description.abstract	Web-scale search engines deal with a volume of data and queries that forces them to use an index partitioned across many machines. Two main methods of partitioning an index for distributed processing have been described in the literature. In document partitioning, each processor node holds the information for a subset of documents, while in term partitioning, each node holds the information for a subset of terms. The major drawback of these approaches is that the redistribution of data during the merge process makes indexing tedious. We therefore present a novel distributed indexing algorithm that uses novel data structures to speed up the merge process. Our algorithm also helps maintain proper load balancing, since no special nodes are assigned to the merging process, as was done in previous algorithms. We also present an efficient alternative to the pipelined approach and the ad-hoc non-pipelined approach. Our method combines non-pipelined disk accesses, a heuristic method to choose between pipelined and non-pipelined posting list processing, and an efficient query routing strategy. According to the experimental results, our method provides a higher throughput than the pipelined approach, a shorter latency than the non-pipelined approach, and significantly improves the overall throughput/latency ratio.	en_US
dc.language.iso	en	en_US
dc.subject	ALGORITHM	en_US
dc.subject	DATA MANAGEMENT	en_US
dc.subject	DATA INDEXING	en_US
dc.subject	ELECTRONICS AND COMPUTER ENGINEERING	en_US
dc.title	DESIGN OF IMPROVED AND EFFICIENT INDEXING ALGORITHM FOR TEXT RETRIEVAL	en_US
dc.type	M.Tech Dissertation	en_US
dc.accession.number	G21949	en_US
Appears in Collections:	MASTERS' THESES (E & C)
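
The two partitioning schemes named in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm: the sample documents, the function names, the modulo assignment of documents to nodes, and the CRC32 assignment of terms to nodes are all assumptions made for illustration.

```python
from collections import defaultdict
import zlib

# Hypothetical sample collection: doc_id -> text.
DOCS = {
    0: "distributed index merge",
    1: "index partitioning",
    2: "merge process",
    3: "distributed query index",
}

def tokenize(text):
    return text.lower().split()

def document_partition(docs, num_nodes):
    """Each node indexes a subset of documents (assumed rule: doc_id % num_nodes)."""
    nodes = [defaultdict(list) for _ in range(num_nodes)]
    for doc_id in sorted(docs):
        node = nodes[doc_id % num_nodes]
        for term in set(tokenize(docs[doc_id])):
            node[term].append(doc_id)
    return nodes

def term_node(term, num_nodes):
    """Assumed rule: a deterministic hash of the term picks its node."""
    return zlib.crc32(term.encode("utf-8")) % num_nodes

def term_partition(docs, num_nodes):
    """Each node holds the complete posting lists for a subset of terms."""
    nodes = [defaultdict(list) for _ in range(num_nodes)]
    for doc_id in sorted(docs):
        for term in set(tokenize(docs[doc_id])):
            nodes[term_node(term, num_nodes)][term].append(doc_id)
    return nodes

def lookup_document_partitioned(nodes, term):
    """Every node must be queried; the partial posting lists are merged."""
    postings = []
    for node in nodes:
        postings.extend(node.get(term, []))
    return sorted(postings)

def lookup_term_partitioned(nodes, term):
    """Only the single node that owns the term is queried."""
    return list(nodes[term_node(term, len(nodes))].get(term, []))
```

Both lookups return the same posting list for a term. The difference is in where the work lands: the document-partitioned lookup contacts every node, while the term-partitioned lookup contacts one node per query term.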

Files in This Item:

File	Description	Size	Format
ECDG21949.pdf		2.34 MB	Adobe PDF
