Wednesday, January 6, 2010

Llovizna

So this is my first post under the new name "Llovizna". With these few changes, I feel this blog will be more attractive.

By the way, how do you get more hits on a blog?
Some tips!
  1. Attractive and glamorous titles to the blog entry.
  2. Write a blog post on how to get more hits and share it on social networks. :D

And most importantly, give the blog itself a sexy title, even if it is technical. A good example would be 'Llovizna'. ;)

'Llovizna' means 'drizzle' in Spanish. I picked 'Llovizna' as the blog name on the 6th of January, 2010, as it gives a sexy appeal to my then mostly technically oriented drizzle of thoughts. 'Llovizna' discusses the projects that I got involved in, the coding fun that I had elsewhere, and some general thoughts on Information Technology, or even some of my random thoughts.

Llovizna is neither a pure technology blog nor an entertainment blog, though it may have both and more. It is just my personal blog. Llovizna contains a range of posts, from technology posts such as Auto Scaling With Amazon EC2 to completely random posts like how to ignore someone you love. I also use Llovizna to document my past and recent travels.





 


    Image: Llovizna (telenovela)
    Llovizna was a popular 1997 telenovela from Venezuela. Llovizna means drizzle in Spanish. 
    The blog's title image is Llovizna waterfalls, which is also from the same country.
     

    Friday, January 1, 2010

    Association Rule Mining with Extended Vertical Format Data Mining

    We, the team of Mooshabaya (De Alwis.K.D.B.C, Malinga.A.S, Pradeeban.K, Weerasiri.W.A.D.D.), chose "Association Rule Mining with Extended Vertical Format Data Mining" as the research project for our Advanced Database (CS4420) module. The project proposal can be found here. The research paper we submitted is given below.

    Abstract
    Analyzing data warehouses to foresee patterns in the transactions of businesses and scientific infrastructures often needs high computational power and large amounts of memory, because of the huge history of past transactions. With fragmented data and the current trend towards distributed systems, most of the fundamental algorithms initially proposed to find associations among the itemsets in data warehouses are inefficient in either throughput or resource utilization.

    The Apriori algorithm is one such algorithm, proposed to mine data warehouses for associations. Although Apriori is the most widely learned and implemented algorithm for data mining, it is generally not an optimized algorithm. Many variations, improvements, and alternatives have been suggested to overcome the inefficiency of Apriori, either in general or for a particular kind of data set. In either case, even a fractional improvement in the algorithm often improves the mining considerably. Vertical format data mining is one of the efficient alternatives to Apriori. In this paper we propose an algorithm as an alternative to Apriori that uses bitmap indices in conjunction with vertical format data mining. The proposed algorithm, which is expected to be more efficient than its predecessors, is benchmarked against an implementation of Apriori on a chosen set of benchmarks.
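
For readers new to the idea, here is a minimal sketch of vertical format mining with bitmaps. It is not the algorithm from our paper: it is a generic illustration. The function names (`to_vertical`, `mine_frequent_itemsets`) and the toy transactions are made up for this example. Each item gets one bitmap (a Python integer) in which bit i is set when transaction i contains the item, so the support of an itemset is the number of set bits in the AND of its items' bitmaps.

```python
def to_vertical(transactions):
    """Build the vertical layout: item -> bitmap of transaction ids.

    Bit i of an item's bitmap is set when transaction i contains the item.
    """
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bitmap):
    """Number of transactions covered by a bitmap (count of set bits)."""
    return bin(bitmap).count("1")


def mine_frequent_itemsets(transactions, min_support):
    """Return {itemset tuple: support} for every frequent itemset.

    Itemsets are grown level by level. Two k-itemsets that share their
    first k-1 items are joined, and the candidate's bitmap is simply the
    bitwise AND of the two parents, so the data is never rescanned.
    """
    bitmaps = to_vertical(transactions)
    level = {
        (item,): bm
        for item, bm in sorted(bitmaps.items())
        if support(bm) >= min_support
    }
    frequent = {}
    while level:
        frequent.update({itemset: support(bm) for itemset, bm in level.items()})
        keys = sorted(level)
        next_level = {}
        for i, a in enumerate(keys):
            for b in keys[i + 1:]:
                if a[:-1] != b[:-1]:
                    break  # keys are sorted, so the shared-prefix run has ended
                bm = level[a] & level[b]
                if support(bm) >= min_support:
                    next_level[a + (b[-1],)] = bm
        level = next_level
    return frequent


if __name__ == "__main__":
    # Hypothetical toy data, only for illustration.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, count in sorted(mine_frequent_itemsets(baskets, 2).items()):
        print(itemset, count)
```

Compared with Apriori, which counts candidates by scanning the transactions again at every level, the only per-candidate work here is one AND and one bit count.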

    Initial Paper: PDF.


    Update as of 18th Dec: We worked further on the algorithm and published a paper ("Horizontal Format Mining with Extended Bitmaps") on it. Slides explaining the algorithm can be downloaded here.