Please use this identifier to cite or link to this item: http://localhost:8081/xmlui/handle/123456789/15318
Title: PARALLELISED HIDING OF SENSITIVE PATTERNS FOR PRIVACY PRESERVATION
Authors: Agrawal, Nishtha
Keywords: Frequent Itemset Mining;Sensitive Patterns Removal;Parallelised FP-Tree based Sensitive Patterns Removal (PFSR);Privacy Preserving Data Mining Approaches
Issue Date: May-2019
Publisher: I I T ROORKEE
Abstract: Frequent itemset mining is a field of data mining in which frequent itemsets are extracted from a dataset. The extracted itemsets may reveal sensitive information that is not meant to be shared with third parties. Privacy Preserving Data Mining approaches hide this sensitive information in the dataset, but they also have side effects on the dataset. Among the three types of Privacy Preserving Data Mining methods, heuristic-based approaches are better in terms of scalability and time efficiency than border-based and exact approaches. Heuristic-based approaches sanitize the dataset, i.e., remove sensitive patterns from the transactions, based on some heuristics. So far, most existing techniques for hiding sensitive patterns use candidate-based pattern generation methods to generate frequent patterns, which takes a lot of time because a large candidate itemset space is generated. In this work, we propose the FP-Tree based Sensitive Patterns Removal (FSR) approach. It uses a candidate-less pattern generation technique for hiding the sensitive patterns, which takes much less time than previous techniques. Experiments on a benchmark dataset show that the proposed approach yields sanitized data with substantially better utility and better time efficiency than the existing approaches. However, these sequential approaches cannot cope with big data. We therefore propose a second approach, Parallelised FP-Tree based Sensitive Patterns Removal (PFSR), which is a parallel implementation of the proposed FSR approach on the Spark parallel computing framework. This parallelised approach is scalable enough to handle large datasets. Experiments performed using benchmark datasets show that the proposed PFSR approach scales better than the proposed FSR approach and other existing sequential approaches.
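To make "hiding a sensitive pattern" concrete, here is a minimal Python sketch of a generic heuristic sanitization step. It is not the FSR or PFSR algorithm from the thesis, which the abstract does not specify. The function names, the `min_support` threshold, the victim-item choice, and the toy transactions are all assumptions made for illustration. The idea shown is only this: a pattern is hidden once its support (the number of transactions containing it) drops below the mining threshold, and deleting items to get there can also lower the support of non-sensitive patterns, which is the side effect the abstract refers to.

```python
from typing import FrozenSet, List, Set


def support(transactions: List[Set[str]], pattern: FrozenSet[str]) -> int:
    """Count the transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t)


def hide_pattern(
    transactions: List[Set[str]],
    pattern: FrozenSet[str],
    min_support: int,
) -> List[Set[str]]:
    """Return a sanitized copy in which `pattern` has support < min_support.

    Assumed heuristic (illustrative only): delete one item of the pattern
    from the shortest supporting transactions first, on the guess that
    shorter transactions carry fewer other patterns that could be damaged.
    """
    sanitized = [set(t) for t in transactions]  # leave the input untouched
    victim = sorted(pattern)[0]  # arbitrary but deterministic item to delete
    supporting = sorted(
        (i for i, t in enumerate(sanitized) if pattern <= t),
        key=lambda i: len(sanitized[i]),
    )
    # Number of supporting transactions that must stop supporting the pattern.
    excess = len(supporting) - (min_support - 1)
    for i in supporting[: max(excess, 0)]:
        sanitized[i].discard(victim)
    return sanitized


# Hypothetical toy dataset; {"a", "b"} is treated as the sensitive pattern.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "d"},
]
sensitive = frozenset({"a", "b"})
sanitized = hide_pattern(transactions, sensitive, min_support=2)

print("sensitive support before:", support(transactions, sensitive))
print("sensitive support after: ", support(sanitized, sensitive))
# Side effect: a non-sensitive pattern can lose support as well.
other = frozenset({"a", "c"})
print("support of {a, c} before/after:",
      support(transactions, other), support(sanitized, other))
```

On this toy data the sensitive pattern falls below the threshold, while the non-sensitive pattern {a, c} also loses one supporting transaction. That lost support is the kind of utility loss a sanitization heuristic tries to minimise.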
URI: http://localhost:8081/xmlui/handle/123456789/15318
Type: Other
Appears in Collections: MASTERS' THESES (CSE)

Files in This Item:
File: G29152.pdf
Size: 510.19 kB
Format: Adobe PDF

