Sunday, January 15, 2012

[IJCISIM] Horizontal Format Data Mining with Extended Bitmaps

We published our data mining algorithm in the International Journal of Computer Information Systems and Industrial Management Applications (ISSN 2150-7988). The paper is available online in Volume 4 (2012).

Feel free to download it.
Horizontal Format Data Mining with Extended Bitmaps
Buddhika De Alwis, Supun Malinga, Kathiravelu Pradeeban, Denis Weerasiri, Shehan Perera
pp. 514-521 Full Text PDF

Abstract: Analysing data warehouses to foresee transaction patterns often needs high computational power and memory because of the huge volume of historical transaction data. With fragmented data and the current trend toward distributed systems, most of the fundamental algorithms initially proposed to find associations among itemsets in data warehouses are inefficient in either throughput or resource utilization.
Apriori is a widely studied and implemented algorithm that mines data warehouses to find associations. However, Apriori is generally not an optimized algorithm. Many variations, improvements, and alternatives have been suggested to overcome the inefficiency of Apriori, either in general or for specific kinds of data sets. In any case, even a fractional improvement in the algorithm often improves the mining considerably. Frequent itemset mining with the vertical data format has been proposed as an improvement over the basic Apriori algorithm; it mines data sets in vertical form, as opposed to the typical horizontal format data used by Apriori.
In this paper we propose an algorithm as an alternative to Apriori. It uses bitmap indices in conjunction with a horizontal format data set converted to a vertical format data structure to mine frequent itemsets, leveraging the efficiency of bitmap-based operations and the vertical data orientation.

Keywords: Data mining, Association Rule, Apriori, Vertical format mining, Bitmap Indices, Data Analysis, Data Warehousing.
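
To give a feel for the idea in the abstract, here is a minimal Python sketch of frequent itemset mining over bitmaps. It is not the algorithm from the paper: it is a generic illustration that assumes transactions arrive in horizontal format (one set of items per transaction), converts them to a vertical structure with one integer bitmap per item, and counts support with bitwise AND. The function names (to_bitmaps, frequent_itemsets, popcount) and the absolute min_support threshold are assumptions made for this example.

```python
from itertools import combinations


def popcount(bitmap):
    """Number of set bits, i.e. number of transactions containing the itemset."""
    return bin(bitmap).count("1")


def to_bitmaps(transactions):
    """Convert horizontal-format transactions to one integer bitmap per item.

    Bit i of bitmaps[item] is set when transaction i contains the item.
    """
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets in at least min_support transactions."""
    # Level 1: frequent single items and their bitmaps.
    current = {
        frozenset([item]): bm
        for item, bm in to_bitmaps(transactions).items()
        if popcount(bm) >= min_support
    }
    result = {}
    while current:
        for itemset, bm in current.items():
            result[itemset] = popcount(bm)
        # Join pairs of frequent k-itemsets into (k+1)-item candidates.
        next_level = {}
        for (a, bm_a), (b, bm_b) in combinations(current.items(), 2):
            candidate = a | b
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            # The transactions containing the union are exactly those
            # containing both parts, so one AND gives the candidate's bitmap.
            bm = bm_a & bm_b
            if popcount(bm) >= min_support:
                next_level[candidate] = bm
        current = next_level
    return result
```

Calling frequent_itemsets([{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}], 2) scans the transactions only once, while building the bitmaps. Every later support count is a bitwise AND followed by a bit count, which is where the efficiency of the bitmap approach comes from.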
