DSpace Repository

KEYPHRASE EXTRACTION AND ENRICHMENT FOR NEWS MEDIA

dc.contributor.author Jain, Nikita
dc.date.accessioned 2019-05-22T09:34:48Z
dc.date.available 2019-05-22T09:34:48Z
dc.date.issued 2016
dc.identifier.uri http://hdl.handle.net/123456789/14439
dc.description.abstract As newswire data grows continuously at a very fast pace, there is an emerging need for techniques that present news information in a concise, instantly digestible format. The research goal of this dissertation is to develop models that can automatically extract summarized and interesting news information, with the aim of addressing the low engagement time of news audiences and several other problems in news journalism. There has been great progress in automatically extracting and generating facts, trivia and other interesting information from news media data, through work such as trivia generation, event detection, headline generation, sentiment analysis and question-answering systems. However, in spite of these approaches, news audience engagement time is still low. Also, these solutions are often based on different learning models. My goal is to develop general and scalable algorithms that can work over any language, any domain and any media format with textual content. The model (E3) in this thesis addresses these shortcomings. It provides effective and efficient keyphrases for multilingual and multi-format news data, along with a set of features to rank the keyphrases. Furthermore, a method is provided to enrich the extracted keyphrases by finding their types and information related to the input query, such as the role played by a person entity. This kind of information is very helpful when many people, organizations and locations are mentioned, as it is very difficult for a reader to keep track of all the mentioned entities. As a result, readers often lose interest in the news concept and network traffic is lost. We have specifically chosen a keyphrase-based summary because it provides a high-level overview of news data in a short span of time with little effort. We have evaluated our unsupervised system E3 on varying input queries, from general topics (e.g. Election) to specific topics (e.g. Bihar Election), to demonstrate the efficiency and effectiveness of our keyphrase extraction and keyphrase enrichment method over the existing state of the art. Our experimental results show that E3 performs significantly better than the defined baselines on seven different parameters. We also investigate the effect of using linguistic and syntactic features in keyphrase extraction through a user case study, and find that our system is fairly robust. en_US
dc.description.sponsorship Indian Institute of Technology, Roorkee en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering, IITR. en_US
dc.subject Keyphrase Extraction en_US
dc.subject Keyphrase Enrichment en_US
dc.subject Automated News Summarization en_US
dc.subject Keyphrase Ranking en_US
dc.subject Natural Language Processing en_US
dc.title KEYPHRASE EXTRACTION AND ENRICHMENT FOR NEWS MEDIA en_US
dc.type Other en_US

