A TECHNIQUE FOR DETECTING PARAPHRASES USING HYBRID SIMILARITY MEASURES

Deepak, T. Sai

Please use this identifier to cite or link to this item: http://localhost:8081/jspui/handle/123456789/11974

Full metadata record

DC Field	Value	Language
dc.contributor.author	Deepak, T. Sai	-
dc.date.accessioned	2014-11-28T11:03:03Z	-
dc.date.available	2014-11-28T11:03:03Z	-
dc.date.issued	2009	-
dc.identifier	M.Tech	en_US
dc.identifier.uri	http://hdl.handle.net/123456789/11974	-
dc.guide	Toshniwal, Durga	-
dc.description.abstract	Parsing, processing, and understanding natural languages such as English have always been challenging in computational linguistics. The main reason is that natural languages have many irregularities in their grammar. There are also many variations in how words are combined to yield a meaning. One can express a situation in many different ways, using different grammatical structures and different words or word groups. Sets of words or word groups that express similar meanings are known as paraphrases. Detecting paraphrases plays a key role in many natural language processing applications, such as Question Answering, Machine Translation, and Multi-text Summarization. Though a large number of techniques have been proposed and implemented for detecting paraphrases, a complete framework that considers all aspects, such as lexical similarity and semantic similarity measures, is missing. Most of the existing techniques work independently, and little research has been done on the effect of combining them. In this thesis, we propose a technique for detecting paraphrases using hybrid similarity measures. A technique for unsupervised detection of paraphrases based on word-to-word similarities has been proposed. We have also developed a technique for supervised detection of paraphrases using semantic similarities. Finally, a hybrid technique for detecting semantic relatedness between two sentences is proposed, using both supervised and unsupervised similarity techniques. We have also explored the feasibility of using a fact-based similarity metric to detect paraphrases. We have tested all of the proposed metrics on a standard dataset, namely the Microsoft Research Paraphrase Corpus. To obtain the semantics of words, we have used WordNet, a lexical database of English, as our background knowledge. We have also used Wiktionary as the back-end database for calculating fact-based similarity. The proposed schemes outperform their existing counterparts.	en_US
dc.language.iso	en	en_US
dc.subject	ELECTRONICS AND COMPUTER ENGINEERING	en_US
dc.subject	SIMILARITY MEASURES	en_US
dc.subject	PARAPHRASES	en_US
dc.subject	HYBRID	en_US
dc.title	A TECHNIQUE FOR DETECTING PARAPHRASES USING HYBRID SIMILARITY MEASURES	en_US
dc.type	M.Tech Dissertation	en_US
dc.accession.number	G14521	en_US
Appears in Collections:	MASTERS' THESES (E & C)

Files in This Item:

File	Description	Size	Format
ECDG14521.pdf		6.06 MB	Adobe PDF
