CONTENT SIMILARITY BASED ANTI-PHISHING

Swagat, Konchada

dc.contributor.author	Swagat, Konchada
dc.date.accessioned	2014-12-01T05:41:30Z
dc.date.available	2014-12-01T05:41:30Z
dc.date.issued	2011
dc.identifier	M.Tech	en_US
dc.identifier.uri	http://hdl.handle.net/123456789/12408
dc.guide	Misra, Manoj
dc.description.abstract	Phishing is a critical problem that traditionally involves a deceptive email and a duplicate website. Phishers try to lure users into submitting sensitive information, which they then use for illegal purposes. Phishing is rampant on the Internet. A recent Gartner report puts the losses due to phishing at more than $3.2 billion annually [1], with banks, other financial institutions, and their users the most affected. In recent times, however, phishing has spread to new frontiers such as social networking and cyber warfare. Innovative phishing strategies that go unnoticed are developed from time to time. Amid all this, it is the user who bears the brunt of phishing attacks. Numerous anti-phishing tools have been developed using various strategies. Attempts have been made to stop phishing at the email stage using mail filters [2]. Organizations that deal heavily in financial transactions have adopted two-way authentication to reduce the risk of phishing, but such methods could not transform the web at large because of the additional cost and effort involved. Some anti-phishing browser tools use web-related heuristics such as URL usability and IP address usage, but their performance is unimpressive. We present a Content Similarity based Anti-Phishing system, a ubiquitous, robust, and scalable method for detecting phishing. Instead of depending on phishing-related heuristics, we directly address the source of phishing: the content of pages and the similarity of the pages involved. We present a sophisticated algorithm, with proofs of concept wherever required, that detects similar pages and then establishes the authority of those pages based on their URL, content, and newly introduced factors such as site population. Our system achieves a true-positive rate of more than 99%, better than the previously developed systems discussed.	en_US
dc.language.iso	en	en_US
dc.subject	ELECTRONICS AND COMPUTER ENGINEERING	en_US
dc.subject	ANTI-PHISHING	en_US
dc.subject	PHISHING	en_US
dc.subject	DUPLICATE WEBSITE	en_US
dc.title	CONTENT SIMILARITY BASED ANTI-PHISHING	en_US
dc.type	M.Tech Dissertation	en_US
dc.accession.number	G20984	en_US