Saturday, December 31, 2011

Google Code-In

The reach of Google Code-In (GCI) is relatively lower in Sri Lanka (probably in other countries too), than its sister program Google Summer of Code. One major reason can be, the age group of 13 - 17 (pre-university students) is not much into programming, over generally into computers, like the university students of 18+ do, in third world countries like Sri Lanka. However, more importantly, the word is yet to be spread, regarding GCI, I feel.

This presentation resembles the Google Summer of Code presentation, I prepared for AbiWord/GSoC, in its style. This is based on my experience as a mentor for Haiku in Google Code-In 2011. Hence the examples used in this presentation are mostly from Haiku. I hope, this presentation will be useful to any student as an introduction to Google Code-In.

Special thanks to Krasimir Petkov, GCI mentor (Haiku) for his valuable input at several times, in shaping this presentation up.

An updated, GCI-2012 presentation is available here.

Friday, December 30, 2011

Why you give ma address to othersz?

When ticking those "Subscribe" buttons, I didn't think much. But suffering now, when it is even impossible to count the number of mail filters.

Some newsletters have obviously given my address to other third-parties too. May be with or without asking me, during the sign-up. However, thanks to the CAN-SPAM Act, all is well, and I was able to effectively get rid of these newsletters by unsubscribing from them. Good bye stupid newsletters - I am removing *many* of you today.. Thanks for messing my gmail inbox with all your marketing spam.. :)

P.S: The cat photo was taken from a bulk email, which holds no indication of the photographer or owner. Caption (y u censor me?) is added by me. Photo credits should go to the photographer.

Sunday, December 25, 2011

Open Source Evangelization and the evolution of the GSoC introductory presentation

Open Source Evangelization
We decided to use Google Summer of Code, as a mean to evangelize open source among the university students. I have been contributing to this evangelization effort this year. On this, I have prepared a presentation and presented at the [1] Institute of Engineers, Sri Lanka (IESL), [2] the Science Faculty of the University of Peradeniya, and [4] the Engineering Faculty of the University of Peradeniya. We have also scheduled a session at [5] the Science Faculty, the University of Jaffna, the 7th of January, 2012.

The evolution of the presentation can be found at the respective blog posts.
[1] GSoC-2011 [35 slides / IESL. First version - 40 minutes]
[2] GSoC 2011 and FOSS [38 slides / SET - UoP - 75 minutes]
[3] AbiWord and Google Summer of Code - 2011 [35 slides / AbiWord Specific - 75 minutes (estimated)]
[4] Google Summer of Code awareness session [42 slides / E-Fac - UoP - 75 minutes]
[5] Google Summer of Code 2012 [45 Slides / Further improved. Final / Global version - 75 minutes (estimated)]

It is interesting to find the growth of the GSoC presentation over the year. Hope my GCI presentation [28 slides / First Version - 50 minutes (estimated)] too will eventually become a presentation with a similar quality.

Saturday, December 24, 2011

The world is a paraboloid..

Finally, the year 2011 comes to an end. 2011 was a year with its own downs and ups. I still feel, 2010 and 2004 were the best of all the years. However, I was able to learn many new things and I strongly believe the impact 2011 made in me is really huge and irreversible. :) In summary, 2011 was a positive year. I am trying to analyze the year through this blog post, with a summary of the remarkable events of the year. 

For a software engineer (of course, for others too.. :)), work place plays a major role in his life. The open culture of WSO2 fits me pretty well. I always try my best to contribute to whatever I can. There is an array of webinars we did, and I have presented a few too. You can view them from "Webinars" in this blog. As a member of the WSO2 Stratos team, I have blogged more on Stratos and WSO2 Load Balancer. 

I have already done an analysis on the year 2011, when I blogged about my first year completion at WSO2, the 13th of September 2011. Read more on this - one year since...✍.  Similarly. "Open Source Evangelization and the evolution of the GSoC introductory presentation", summarizes the blog posts on the open source presentations I did on GSoC and GCI. This year, I have mentored for both GSoC (AbiWord) and GCI (Haiku). 

Sweetness of 2011
2011 had its own set of interesting days and events. Google Summer of Code Mentor Summit 2011 and  WWW2011India are remarkable.

2011 was a really good year for my blog Llovizna. 2010 was not a good year for my blog, I would say. 2011 was more of an extrapolation of the year 2009. I have blogged about most of my interesting life events through Llovizna, this year. Llovizna became a mixed blog this year - with blog posts touching random topics - not just technology.

how to ignore someone you love is one viral blog post of the year from me. An unreasonably huge delay of publishing one of my articles in a site, made me take a strong decision late this year - that I will publish all the posts through Llovizna only. 

The world is complicated
Hyperbolic Paraboloid [1]
The world is not flat. Rather, it is a paraboloid. I am good in mathematics - but alas, the world is much complicated to be represented by mathematical equations. 

Culture is NOT something our ancestors had or did at 500 BC. It is NOT something the books say. It is the principles around us today that makes us. Culture is NOT a static entity. It evolves. No individual can harm or destroy the culture. It indeed improves eventually, regardless of the popular belief. Even those who commonly acknowledge that they do not care about their culture too are bound to it. They may just avoid some aspects of it. What they avoid may just be the tip of the culture. Just like the tip of the iceberg. I was able to realize how the country and the culture have shaped us as who we are now, during my visit to the US.

Distributed Computing
I was looking into Data mining and distributed computing this year. Hope to research further on them in the upcoming years. 

The world is a paraboloid
This is one of the concluding blog posts of the year 2011, which happens to be the 100th blog post of the year. Like this blog post, the year too was a complicated mixture with lots of unrelated events packed together. Hope to have 2012 as a faster and more effective year. :) Merry Christmas and a very happy new year, everyone!


Saturday, December 17, 2011

Google Summer of Code 2012

We are having a series of GSoC awareness sessions, including the yesterday's session we had at the University of Peradeniya, and the upcoming session at the University of Jaffna on the 7th of January, 2012. These events focus on discussing GSoC and FOSS. Attached herewith is the latest version of the presentation I prepared to introduce GSoC 2012 to the students. Feel free to download and distribute, if the slow network prevents you viewing the presentation here.

As a mentor from the AbiWord community, I have come up with the slides based on our experience with the Google Summer of Code. This presentation is also influenced by my experience as a three time Google Summer of Code participant, with AbiWord (2011 as a mentor and 2009 as a student) and OMII-UK (2010 as a student). Special thanks to Martin Sevior and the AbiWord community for their valuable input at several times, in shaping this presentation up. 

Make sure to have a look at the Google Summer of Code 2012 project ideas from AbiWord.

The presentations in this blog require Shockwave Flash Plugin to display correctly. If you couldn't see it correctly, make sure you have the required plugin enabled. Feel free to drop a comment should you require further information.

Google Summer of Code awareness session

Yesterday we had an awareness session for Google Summer of Code (GSoC) at the Engineering Faculty of the University of Peradeniya. This event focussed on discussing GSoC and FOSS. It is an interesting fact that we have visited the University of Peradeniya, after exactly 11 months, for the very same event - Google Summer of Code awareness session. Our previous session was held at the science faculty, on 17th of Jan, 2011.

Attached herewith is my presentation, introducing GSoC 2012 to the students. This slides are based on my experience as a three time Google Summer of Code participant, with AbiWord (2011 as a mentor and 2009 as a student) and OMII-UK (2010 as a student).

In slow network connections, the presentation might take a bit longer to load. In that case, please feel free to download the presentation for your future reference.

Update: Pls find the latest revised version of this presentation at 

Thursday, December 15, 2011

DZone Kolamba Meetup

We had the first DZone Kolamba Meetup at 4.30 p.m - 6.30 p.m today, at WSO2. The theme was Big Data. Given below is the introductory slides to DZone. Photos of the event can be found here.

Wednesday, December 14, 2011

Configuring WSO2 Load Balancer for Auto Scaling

This post assumes that the reader is familiar at configuring the WSO2 Load Balancer without autoscaling, and has configured the system already with the load balancer. Hence this post focuses on setting up the load balancer with autoscaling. If you are a newbie to setting up WSO2 Servers proxied by WSO2 Load Balancer, please read the blog post, How to setup WSO2 Elastic Load Balancer to configure WSO2 Load Balancer without autoscaling.


The autoscaling configurations are defined from CARBON_HOME/repository/deployment/server/synapse-configs/tasks/autoscaler.xml

1) Task Definition
In WSO2 Load Balancer, the autoscaling algorithm to be used is defined as a Task. ServiceRequestsInFlightEC2Autoscaler is the default class that is used for the autoscaler task.
<task xmlns=""

2) loadbalancer.xml pointed from autoscaler.xml

This property points to the file loadbalancer.xml for further autoscaler configuration.
    <property name="configuration" value="$system:loadbalancer.xml"/>

3) Trigger Interval

The autoscaling task is triggered based on the trigger interval that is defined in the autoscaler.xml. This is given in seconds.
    <trigger interval="5"/>

Autoscale Mediators

autoscaleIn and autoscaleOut mediators are the mediators involved in autoscaling as we discussed above. As with the other synapse mediators, the autoscaling mediators should be defined in the main sequence of the synapse configuration, if you are going to use autoscaling. Load Balancer-1.0.x comes with these mediators defined at the main sequence, which can be found at $CARBON_HOME/repository/deployment/server/synapse-configs/sequences/main.xml. Hence you will need to modify main.xml, only if you are configuring the load balancer without autoscaling.

autoscaleIn mediator is defined as an in mediator. It gets the configurations from loadbalancer.xml, which is the single file that should be configured for autoscaling, once you have already got a system that is set up for load balancing.
        <autoscaleIn configuration="$system:loadbalancer.xml"/>

Similarly autoscaleOut mediator is defined as an out mediator.



loadbalancer.xml contains the service cluster configurations for the respective services to be load balanced and the load balancer itself. Here the service-awareness of the load balancer makes it possible to manage the load across multiple service clusters. The properties given in loadbalancer.xml is used to provide the required configurations and customizations for autoscaling and load balancing. These configurations can also be taken from the system properties as shown below.

1) Properties common for all the instances

1.1) ec2AccessKey

The property 'ec2AccessKey' is used to provide the EC2 Access Key of the instance.
    <property name="ec2AccessKey" value="${AWS_ACCESS_KEY}"/>

1.2) ec2PrivateKey
The certificate is defined by the properties 'ec2PrivateKey'.
    <property name="ec2PrivateKey" value="${AWS_PRIVATE_KEY}"/>

1.3) sshKey
 The ssh key pair is defined by 'sshKey'.
    <property name="sshKey" value="stratos-1.0.0-keypair"/>

1.4) instanceMgtEPR
'instanceMgtEPR' is the end point reference of the web service that is called for the management of the instances.
    <property name="instanceMgtEPR" value=""/>

1.5) disableApiTermination
The 'disableApiTermination' property is set to true by default, and is recommended to leave as it is. This prevents terminating the instances via the AWS API calls.
    <property name="disableApiTermination" value="true"/>

1.6) enableMonitoring
The 'enableMonitoring' property can be turned on, if it is preferred to monitor the instances.
    <property name="enableMonitoring" value="false"/> 

2) Configurations for the load balancer service group

These are defined under
<loadBalancer> .. </loadBalancer>

2.1) securityGroups
The service group that the load balancer belongs to is defined by the property 'securityGroups'. The security group will differ for each of the service that is load balanced as well as the load balancers. Autoscaler uses this property to identify the members of the same cluster.
        <property name="securityGroups" value="stratos-appserver-lb"/>

2.2) instanceType
'instanceType' defines the EC2 instance type of the instance - whether they are m1.small, m1.large, or m1.xlarge (extra large).
        <property name="instanceType" value="m1.large"/>

2.3) instances
The property, 'instances' defines the number of the load balancer instances. Multiple load balancers are used to prevent the single point of failure -  by providing a primary and a secondary load balancer.
        <property name="instances" value="1"/>

2.4) elasticIP
Elastic IP address for the load balancer is defined by the property, 'elasticIP'. We will be able to access the service, by accessing the elastic IP of the load balancer. The load balancer picks the value of the elastic IP from the system property ELASTIC_IP.
        <property name="elasticIP" value="${ELASTIC_IP}"/>

In a public cloud, elastic IPs are public (IPV4) internet addresses, which is a scarce resource. Hence it is recommended to use the elastic IPs only to the load balancer instances that to be exposed to the public, and all the services that are communicated private should be associated to private IP addresses for an efficient use of this resource. Amazon EC2 provides 5 IP addresses by default for each customer, which of course can be increased by sending a request to increase elastic IP address limit.

2.5) availabilityZone
This defines in which availability zone the spawned instances should be.
       <property name="availabilityZone" value="us-east-1c"/>

2.6) payload

The file that is defined by 'payload' is uploaded to the spawned instances. This is often a zip archive, that extracts itself into the spawned instances.
        <property name="payload" value="/mnt/"/> contains the necessary files such as the public and private keys, certificates, and the launch-params (file with the launch parameters) to download and start a load balancer instance in the spawned instances.

The launch-params includes the details for the newly spawned instances to function as the other instances. More information on this can be found from the EC2 documentations.

Sample Launch Parameters
Given below is a sample launch-params, that is used in StratosLive by the load balancer of the Application Server service.

We will look more into these launch-params now.

The credentials - access key ID and the secret access key are given to access the aws account.

S3 Locations
The service zip and the common modifications or patches are stored in an S3 bucket. The locations are given by a few properties in the launch-params shown above.
  • PRODUCT_MODIFICATIONS_PATH_S3 - Points to the product specific changes, files, or patches are uploaded to a specific location.
  • COMMON_MODIFICATIONS_PATH_S3 - Points to the patches and changes common to all the servers.
  • PRODUCT_PATH_S3 - Points to the location where the relevant Stratos service zips are available.
  • STARTUP_DELAY - Given in seconds. Provides some time to start the service that is downloaded on the newly spawned instance, such that it will join the service cluster and be available as a new service instance.
Apart from these, PRODUCT_NAME, SERVER_NAME, HTTP_PORT, and HTTPS_PORT for the application are also given.


3) Configurations for the application groups

These are defined under
<services> .. </services>

These too should be configured as we configured the properties for the load balancers above.

We define the default values of the properties for all the services under
<defaults> .. </defaults>

Some of these properties - such as the payload, host, and domain - will be specific to a particular service group, and should be defined separately for each of the services, under
<service> .. </service>

Properties applicable to all the instances
payload, availabilityZone, securityGroups, and instanceType are a few properties that are not specific to the application instances. We have already discussed about these properties when setting the load balancer properties above.

Properties specific to the application instances
These properties are specific to the application clusters, and are not applicable to the load balacer instances. We will discuss about these properties now.

3.1) minAppInstances
The property 'minAppInstances' shows the minimum of the application instances that should always be running in the system. By default, the minimum of all the application instances are set to 1, where we may go for a higher value for the services that are of high demand all the time, such that we will have multiple instances all the time serving the higher load.
            <property name="minAppInstances" value="1"/>

3.2) maxAppInstances
'maxAppInstances' defines the upper limit of the application instances. The respective service can scale up till it reaches the number of instances defined here.
            <property name="maxAppInstances" value="5"/>

3.3) queueLengthPerNode
The property 'queueLengthPerNode' provides the maximum length of the message queue per node.
            <property name="queueLengthPerNode" value="400"/>

3.4) roundsToAverage
The property 'roundsToAverage' indicates the number of attempts to be made before the scaling the system up or down. When it comes to scaling down, the algorithm makes sure that it doesn't terminate an instance that is just spawned. This is because the spawned instances are billed for an hour. Hence, even if we don't have much load, it makes sense to wait for a considerable amount (say 58 minutes) of time before terminating the instances.
            <property name="roundsToAverage" value="10"/>

3.5) instancesPerScaleUp
This defines how many instances should be scaled up for each time. By default, this is set to '1', such that a single instance is spawned whenever the system scales. However, this too can be changed such that multiple instances will be spawned each time the system scales up. However it may not be cost-effective to set this to a higher value.
            <property name="instancesPerScaleUp" value="1"/>

3.6) messageExpiryTime
messageExpiryTime defines how long the message can stay without getting expired.
            <property name="messageExpiryTime" value="60000"/>

Properties specific to a particular service group
Properties such as hosts and domain are unique to a particular service group, among all the service groups that are load balanced by the given load balancer. We should note that we can use a single load balancer set up with multiple service groups, such as Application Server, Enterprise Service Bus, Business Process Server, etc.

Here we also define the properties such as payload and availabilityZone, if they differ from the default values provided under
<defaults> .. </defaults>

Hence these properties should be defined under
<service>.. </service>
for each of the services.

3.7) hosts
'hosts' defines the hosts of the service that to be load balanced. These will be used as the access point or url to access the respective service.
Multiple hosts can be defined under
<hosts> .. </hosts>
Given below is a sample hosts configurations for the application server service

3.8) domain
Like the EC2 autoscaler uses the security groups to identify the service groups, 'domain' is used by the load balancer (ServiceDynamicLoadBalanceEndpoint) to correctly identify the clusters of the load balanced services.
Once you have configured the load balancer as above, with the product/service instances, you will have the system that dynamically scales.

Auto Scaling with WSO2 Load Balancer

How Auto Scaling works with WSO2 Load Balancer

The autoscaling component comprises of the synapse mediators AutoscaleInMediator and AutoscaleOutMediator and a Synapse Task ServiceRequestsInFlightEC2Autoscaler that functions as the load analyzer task. A system can scale up based on several factors, and hence autoscaling algorithms can easily be written considering the nature of the system. For example, Amazon's Auto Scaler API provides options to scale the system with the system properties such as Load (the timed average of the system load), CPUUtilization (utilization of the cpu at the given instance), or Latency (delay or latency in serving the service requests).

Autoscaler Components

  • AutoscaleIn mediator - Creates a unique token and puts that into a list for each message that is received.
  • AutoscaleOut mediator - Removes the relevant stored token from the list, for each of the response message that is sent.
  • Load Analyzer Task - ServiceRequestsInFlightEC2Autoscaler is the load analyzer task used for the service level autoscaling as the default. It periodically checks the length of the list of messages based on the configuration parameters. Here the messages that are in flight for each of the back end service is tracked by the AutoscaleIn and AutoscaleOut mediators, as we are using the messages in flight algorithm for autoscaling.

ServiceRequestsInFlightEC2Autoscaler implements the execute() of the Synapse Task interface. Here it calls sanityCheck() that does the sanity check and autoscale() that handles the autoscaling.

Sanity Check

sanityCheck() checks the sanity of the load balancers and the services that are load balanced, whether the running application nodes and the load balancer instances meet the minimum number specified in the configurations, and the load balancers are assigned elastic IPs.

nonPrimaryLBSanityCheck() runs once on the primary load balancers and runs time to time on the secondary/non-primary load balancers as the task is executed periodically. nonPrimaryLBSanityCheck() assigns the elastic IP to the instance, if that is not assigned already. Secondary load balancers checks that a primary load balancer is running periodically. This avoids the load balancer being a single point of failure in a load balanced services architecture.

computeRunningAndPendingInstances() computes the number of instances that are running and pending. ServiceRequestsInFlightEC2Autoscaler task computes the running and pending instances for the entire system using a single EC2 API call. This reduces the number of EC2 API calls, as AWS throttles the number of requests you can make in a given time. This method will be used to find whether the running instances meet the minimum number of instances specified for the application nodes and the load balancer instances through the configuration as given in loadbalancer.xml. Instances are launched, if the specified minimum number of instances is not found.


autoscale() handles the autoscaling of the entire system by analyzing the load of each of the domain. This contains the algorithm - RequestsInFlight based autoscaling. If the current average of requests is higher than that can be handled by the current nodes, the system will scale up. If the current average is less than that can be handled by the (current nodes - 1), the system will scale down.

Autoscaling component spawns new instances, and once the relevant services successfully start running in the spawned instances, they will join the respective service cluster. Load Balancer starts forwarding the service calls or the requests to the newly spawned instances, once they joined the service clusters. Similarly, when the load goes down, the autoscaling component terminates the under-utilized service instances, after serving the requests that are already routed to those instances.

StratosLive - A case study for WSO2 Load Balancer

In a cloud environment such as WSO2 StratosLive, auto-scaling becomes a crucial functionality. The system is expected to scale up and down with the dynamically changing load. Auto-scaling capabilities are sometimes provided by the Infrastructure as a Service provider themselves, such as the Autoscaling from Amazon. However, autoscaling is not necessarily a requirement that to be fulfilled by an IaaS. Say, you are providing Platform as a Service (PaaS) that is hosted over the pure native hardware, instead of an IaaS. In that case, your PaaS should be able to provide the required autoscaling and load balancing capabilities to the applications that are hosted on top of your platform. WSO2 Load Balancer is such a software load balancer, that handles the load balancing, fail over, and autoscaling functionalities.

WSO2 Load Balancer is used in production as a dynamic load balancer and autoscaler, as a complete software load balancer product. It is a stripped down version of WSO2 Enterprise Service Bus, containing only the components that are required for load balancing. WSO2 StratosLive can be considered a user scenario with WSO2 Load Balancer in production.

Multiple service groups are proxied by WSO2 Load Balancers. Some of the services have more than one instances to start with, to withstand the higher load. The system automatically scales according to the load that goes high and low. WSO2 Load Balancer is configured such that the permanent or the initial nodes are not terminated when the load goes high. The nodes that are spawned by the load balancer to handle the higher load will be terminated, when the load goes low. Hence, it becomes possible to have different services to run on a single instance, for the instances that are 'permanent', while the spawned instances will have a single carbon server instance.

scp - Copying files between two remote locations

Say, now you are going to copy a few files from a remote server to another. As usual, your remote_server_1 should be given the credentials to copy files to remote_server_2.

root@node2:~# scp -P 1984 -r /mnt/patches root@

Usually, your computer key must already have given the required permissions to access those remote locations. But since the access is not given to remote_server_1, it will prompt for the password of remote_server_2.
As a quick fix, you can copy the private key from your local computer to remote_location_1. However, further discussion on the security concerns on doing this can be found on the web.

scp -P 1984 ~/.ssh/id_rsa root@

Now if you encounter the below when trying,
root@node2:~# scp -P 1984 -r -i id_rsa /mnt/patches root@
ssh_exchange_identification: Connection closed by remote host
lost connection

Have a look into the denied and accessed hosts of remote_server_2, and make sure that the ip of remote_server_2 is allowed and not denied.

vim /etc/hosts.deny
vim /etc/hosts.allow
#sshd sshd1 sshd2 : ALL : ALLOW

Now, the scp command given above, should work as expected to copy the files from the remote_server_1 to remote_server_2.

ssh: connect to host port 22: Connection refused lost connection

This is one of the commonest errors that are thrown when trying to copy files over scp. The major reason for this is, the port being different from the default 22.
pradeeban@pradeeban:~$ scp -r /home/pradeeban/patches root@
ssh: connect to host port 22: Connection refused
lost connection
To fix this, use -P flag, and the port number. Notice the upper case. This is to maintain the consistency with the -p usage of cp command.
pradeeban@pradeeban:~$ scp -P 1984 -r /home/pradeeban/patches root@

Sunday, December 4, 2011

Before you start your localization..

Thursday, December 1, 2011

[Google Code-In 2011] Localizing Haiku

This year, I joined Haiku as a mentor for Google Code-in (GCI) 2011. This is specific to the GCI-2011 task that I have been mentoring for the localization of Haiku operating system. I will post about GCI in a more generic post for the wider audience soon.

Get used to the system
Make sure that you follow the localization guidelines specific to the project. For the Haiku localizations with the Haiku Translation Assistant (HTA), make sure to pick the correct language from the drop-down in the right hand side, under the label "Start Translating in..." If you are going to translate Haiku into Tamil, make sure to pick "Tamil". Also make sure that you have logged into the HTA before starting localization.

For example, if you are translating,

But if you are trying to translate

Join the relevant localization lists to get more information on the localization efforts for the particular project.
[Haiku i18n mail address -].

Translate only the strings. Not the notes below.
For example,
Note: A small radio device to receive short text messages
Translate only "Pager". Not the "Note:" below.

When refreshing the page, HTA sometimes tend to reset itself to en_US. Hence make sure that you are not trying to locale en_US (for example, say Tamil - ta).