DSpace Repository

PRIVACY PRESERVING FREQUENT ITEMSET MINING WITH REDUCED SENSITIVE ITEMSETS FOR BIG DATA

dc.contributor.author Makkar, Himanshu
dc.date.accessioned 2021-12-07T05:46:01Z
dc.date.available 2021-12-07T05:46:01Z
dc.date.issued 2018-05
dc.identifier.uri http://localhost:8081/xmlui/handle/123456789/15200
dc.description.abstract Frequent itemset mining is a field of data mining in which frequent itemsets are extracted from a dataset. The extracted itemsets may reveal sensitive patterns. Privacy Preserving Data Mining (PPDM) approaches hide sensitive information in the dataset, but they also reduce the dataset's utility. Heuristic-based PPDM approaches remove sensitive patterns from the transactions that contain them, guided by heuristics. They are simpler and take less computational time than border-based and exact approaches, so researchers have given considerable attention to finding better heuristics that preserve more of the data's utility. In this work, we propose two heuristic-based approaches: Removal of Closed Sensitive Itemsets with Maximum Support (MaxRCSI) and Removal of Closed Sensitive Itemsets with Minimum Support (MinRCSI). In both, the sensitive itemsets are reduced to closed sensitive itemsets, and the sanitization process is carried out over the reduced set. Experiments on real datasets and on a benchmark dataset show that the proposed approaches produce sanitized data with substantially better utility than existing approaches. These sequential approaches, however, cannot cope with massive amounts of data. We therefore propose two further approaches, Parallelized Removal of Closed Patterns with Minimum Support (MinPRCP) and Parallelized Removal of Closed Patterns with Maximum Support (MaxPRCP), which are parallel implementations of MinRCSI and MaxRCSI on the Spark parallel computing framework. These parallelized approaches scale to large datasets. Experiments on benchmark datasets show that MinPRCP and MaxPRCP scale better than MinRCSI, MaxRCSI, and other sequential approaches. en_US
dc.description.sponsorship INDIAN INSTITUTE OF TECHNOLOGY ROORKEE en_US
dc.language.iso en en_US
dc.publisher I I T ROORKEE en_US
dc.subject Privacy Preserving Data Mining en_US
dc.subject Maximum Support en_US
dc.subject Parallelized Removal en_US
dc.subject Frequent Itemset Mining en_US
dc.title PRIVACY PRESERVING FREQUENT ITEMSET MINING WITH REDUCED SENSITIVE ITEMSETS FOR BIG DATA en_US
dc.type Other en_US
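
The abstract says that sensitive itemsets are reduced to closed sensitive itemsets before sanitization, but it does not spell out the reduction. The sketch below is one possible reading, not the thesis's actual procedure: it assumes a sensitive itemset is kept only when no other sensitive itemset is a proper superset with the same support. The function names `support` and `closed_sensitive` and the toy transactions are hypothetical and chosen for illustration only.

```python
from typing import FrozenSet, Iterable, List

Itemset = FrozenSet[str]


def support(itemset: Itemset, transactions: Iterable[Itemset]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closed_sensitive(sensitive: List[Itemset],
                     transactions: List[Itemset]) -> List[Itemset]:
    """Assumed reduction: drop a sensitive itemset when another sensitive
    itemset is a proper superset of it and has exactly the same support."""
    supports = {s: support(s, transactions) for s in sensitive}
    return [
        s for s in sensitive
        if not any(s < other and supports[s] == supports[other]
                   for other in sensitive)
    ]


# Toy data, invented for this example.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "c"},
)]
sensitive = [
    frozenset({"a"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
]

reduced = closed_sensitive(sensitive, transactions)
for s in reduced:
    print(sorted(s), support(s, transactions))
```

In this toy data, {a} and {a, b} both have support 3, so {a} is dropped under the assumed rule, while {a, b} and {a, b, c} remain because their supports differ. How MaxRCSI and MinRCSI then pick transactions or items to sanitize is not described in the abstract and is not modeled here.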

