|Title:||COMPARISON OF ASSOCIATION RULE MINING ALGORITHMS|
|Authors:||Rani, T. Sireesha|
|Keywords:||ELECTRONICS AND COMPUTER ENGINEERING;MINING ALGORITHMS;VIPER;LINUX ENVIRONMENT|
|Abstract:||This work considers the problem of discovering association rules between items in a large database of sales transactions. Two new algorithms, Apriori and VIPER, are presented for solving this problem; they are fundamentally different from the known algorithms. In a horizontal representation of a market-basket database, the database is organized as a set of rows, where each row stores a transaction identifier and a bit vector of 1's and 0's representing, for each item on sale, its presence or absence, respectively, in the transaction. Large itemsets, those satisfying the minimum support count, are discovered here in order to generate strong association rules. Results are discussed in the later sections for various database sizes. In a vertical representation of a market-basket database, each item is associated with a column of values representing the transactions in which it is present. The association rule mining algorithms recently proposed for this representation show performance improvements over their classical counterparts, but are either efficient only for certain database sizes, assume particular characteristics of the database contents, or are applicable only to specific kinds of database schemas. VIPER is a new vertical mining algorithm that is general purpose, making no special requirements of the underlying database. VIPER stores data in compressed bit vectors called snakes and integrates a number of novel optimizations for efficient snake generation, intersection, counting, and storage. The performance of VIPER is analyzed for a range of synthetic database workloads. Experimental results indicate significant performance gains, especially for large databases, over previously proposed horizontal mining algorithms. The code is written in C++ under a Linux environment.|
|Research Supervisor/ Guide:||Gupta, Indra|
|Appears in Collections:||MASTERS' THESES (E & C)|
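
The horizontal and vertical representations described in the abstract can be illustrated with a minimal C++ sketch. It is not the thesis code and it does not implement VIPER's compressed snakes: it uses plain, uncompressed bit vectors. All names here (`Transaction`, `to_vertical`, `intersect`, `support`, `sample_rows`) and the sample data are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One horizontal row: a transaction identifier plus one bit per item
// (true = the item is present in the transaction).
struct Transaction {
    int tid;
    std::vector<bool> items;
};

// Vertical layout: one column per item, holding one bit per transaction.
using Column = std::vector<bool>;

// Transpose horizontal rows into per-item columns.
std::vector<Column> to_vertical(const std::vector<Transaction>& rows,
                                std::size_t num_items) {
    std::vector<Column> columns(num_items, Column(rows.size(), false));
    for (std::size_t t = 0; t < rows.size(); ++t) {
        assert(rows[t].items.size() == num_items);
        for (std::size_t i = 0; i < num_items; ++i) {
            columns[i][t] = rows[t].items[i];
        }
    }
    return columns;
}

// Bitwise AND of two columns: the transactions that contain both itemsets.
Column intersect(const Column& a, const Column& b) {
    assert(a.size() == b.size());
    Column out(a.size());
    for (std::size_t t = 0; t < a.size(); ++t) {
        out[t] = a[t] && b[t];
    }
    return out;
}

// Support count: the number of transactions whose bit is set.
std::size_t support(const Column& c) {
    std::size_t n = 0;
    for (bool bit : c) {
        if (bit) ++n;
    }
    return n;
}

// Hypothetical sample database: 4 transactions over 3 items.
std::vector<Transaction> sample_rows() {
    return {
        {1, {true, true, false}},
        {2, {true, false, true}},
        {3, {true, true, true}},
        {4, {false, true, false}},
    };
}
```

With the sample rows, items 0 and 1 co-occur in two transactions, so the itemset {0, 1} would count as large at a minimum support count of 2. In the vertical layout that count comes from a single column intersection, with no rescan of the rows.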