Saturday, December 31, 2011

A few things that made my 2011 interesting

Google HQ, Mountain View
Finally, the year 2011 comes to an end. 2011 was a year with its own ups and downs. I still feel that 2010 and 2004 were the best of all the years. In summary, 2011 was a positive year. In this blog post, I look back at the year through a summary of its remarkable events.

A great meet-up with the mentors all over the world from different organizations, around 360 in number, at the Google HQ.

2. Google Code-In (GCI) Mentoring.
Mentoring high school students for GCI for Haiku.

Presented a few.

4. WWW2011India.
The web conference in Hyderabad. 

5. Open Source Evangelization.
Did several GSoC introduction presentations throughout the country. 

6. GSoC Mentoring for AbiWord. 
First time as a mentor. 

Chocolates of Colombo
7. One year as a software engineer at WSO2.
13th National Best Quality ICT Awards. 

9. WSO2Con 2011 in Battaramulla Waters Edge
We had days of sessions.

An interesting day, meeting friends after a long time, at BMICH.

11. Teafe at Bangalore.
Bangalore is a lovely city. It was also my first time alone in a foreign country.

12. Exploring Sri Lankan coffee shops.
We have some good ones.

13. Applying for the USA visa.
Got one for 5 years, multi-entry.

14. Hyderabad
Experiencing the Indian three-wheeler scammers.

15. The long flight from DXB -> LAX.
It was a very long flight. I am sure it will continue to be my longest non-stop flight for quite a few years.

16. Random events such as DZone Kolomba Meetup.
Attended and presented in a few.

17. The peak of receiving spam.
Thanks, CAN-SPAM Act, for helping out.

18. Llovizna
2011 is the year that I started blogging beyond the technology posts, such as posts on time.

19. Flight over the north pole.
Who expected LAX -> DXB to take this weird path?

20. Coordinating my unconference session at the mentor summit.
Community Matters!!!

21. Working with Cloud Computing
Eventually becoming an expert on the topic.

22. Wikipedia Edits.
Wikipedia in various languages.

23. Applying also for GSoC with SSI as a mentor.
We did not succeed, and I stuck with AbiWord.

24. Stephen Covey's Books
7 Habits of Highly Effective People and the 8th Habit.

25. Localizations and Translations.
I used the online tools effectively.

26. Messed up timezones in a long flight.
DXB -> LAX, same day arrival despite a long flight. LAX -> DXB, left on a Monday and arrived on Wednesday. Tuesday magically disappeared.

27. Award for the best blogger
I received the award from WSO2.

28. Importing a mobile phone directly.
And managing all the complexities that came with that bad decision.

29. Applying for Higher Studies
Erasmus Mundus Masters programs on distributed systems as well as data mining.

30. WSO2 Webinars.
I presented in the WSO2 Summer School Webinars.

Thanks for reading my list until the end. If you are really interested, you may also read the blog posts on my other years. Happy New Year, everyone.

Friday, December 30, 2011

Google Code-In

The reach of Google Code-In (GCI) is relatively low in Sri Lanka (and probably in other countries too) compared to its sister program, Google Summer of Code. One major reason may be that, in third world countries like Sri Lanka, the 13 - 17 age group (pre-university students) is not as much into programming, or even into computers in general, as university students of 18+ are. More importantly, though, I feel that the word about GCI is yet to be spread.

This presentation resembles, in its style, the Google Summer of Code presentation that I prepared for AbiWord/GSoC. It is based on my experience as a mentor for Haiku in Google Code-In 2011; hence the examples used in this presentation are mostly from Haiku. I hope this presentation will be useful to any student as an introduction to Google Code-In.

Special thanks to Krasimir Petkov, GCI mentor (Haiku), for his valuable input on several occasions in shaping this presentation.

An updated, GCI-2012 presentation is available here.

Sunday, December 25, 2011

Open Source Evangelization and the evolution of the GSoC introductory presentation

Open Source Evangelization
We decided to use Google Summer of Code as a means to evangelize open source among university students. I have been contributing to this evangelization effort this year. For this, I prepared a presentation and presented it at [1] the Institute of Engineers, Sri Lanka (IESL), [2] the Science Faculty of the University of Peradeniya, and [4] the Engineering Faculty of the University of Peradeniya. We have also scheduled a session at [5] the Science Faculty of the University of Jaffna, on the 7th of January, 2012.

The evolution of the presentation can be found at the respective blog posts.
[1] GSoC-2011 [35 slides / IESL. First version - 40 minutes]
[2] GSoC 2011 and FOSS [38 slides / SET - UoP - 75 minutes]
[3] AbiWord and Google Summer of Code - 2011 [35 slides / AbiWord Specific - 75 minutes (estimated)]
[4] Google Summer of Code awareness session [42 slides / E-Fac - UoP - 75 minutes]
[5] Google Summer of Code 2012 [45 Slides / Further improved. Final / Global version - 75 minutes (estimated)]

It is interesting to see the growth of the GSoC presentation over the year. I hope my GCI presentation [28 slides / First Version - 50 minutes (estimated)] too will eventually reach a similar quality.

Saturday, December 17, 2011

Google Summer of Code 2012

We are having a series of GSoC awareness sessions, including yesterday's session at the University of Peradeniya and the upcoming session at the University of Jaffna on the 7th of January, 2012. These events focus on discussing GSoC and FOSS. Attached herewith is the latest version of the presentation I prepared to introduce GSoC 2012 to the students. Feel free to download and distribute it if a slow network prevents you from viewing the presentation here.

As a mentor from the AbiWord community, I have come up with the slides based on our experience with the Google Summer of Code. This presentation is also influenced by my experience as a three-time Google Summer of Code participant, with AbiWord (2011 as a mentor and 2009 as a student) and OMII-UK (2010 as a student). Special thanks to Martin Sevior and the AbiWord community for their valuable input on several occasions in shaping this presentation.

Make sure to have a look at the Google Summer of Code 2012 project ideas from AbiWord.

The presentations in this blog require the Shockwave Flash Plugin to display correctly. If you cannot see them correctly, make sure you have the required plugin enabled. Feel free to drop a comment should you require further information.

Google Summer of Code awareness session

Yesterday we had an awareness session for Google Summer of Code (GSoC) at the Engineering Faculty of the University of Peradeniya. This event focussed on discussing GSoC and FOSS. It is an interesting fact that we have visited the University of Peradeniya, after exactly 11 months, for the very same event - Google Summer of Code awareness session. Our previous session was held at the science faculty, on 17th of Jan, 2011.

Attached herewith is my presentation, introducing GSoC 2012 to the students. These slides are based on my experience as a three-time Google Summer of Code participant, with AbiWord (2011 as a mentor and 2009 as a student) and OMII-UK (2010 as a student).

In slow network connections, the presentation might take a bit longer to load. In that case, please feel free to download the presentation for your future reference.

Update: Please find the latest revised version of this presentation at 

Wednesday, December 14, 2011

Configuring WSO2 Load Balancer for Auto Scaling

This post assumes that the reader is familiar with configuring the WSO2 Load Balancer without autoscaling, and has already configured the system with the load balancer. Hence this post focuses on setting up the load balancer with autoscaling. If you are new to setting up WSO2 servers proxied by WSO2 Load Balancer, please read the blog post, How to setup WSO2 Elastic Load Balancer, to configure WSO2 Load Balancer without autoscaling.


The autoscaling configurations are defined in CARBON_HOME/repository/deployment/server/synapse-configs/tasks/autoscaler.xml

1) Task Definition
In WSO2 Load Balancer, the autoscaling algorithm to be used is defined as a Task. ServiceRequestsInFlightEC2Autoscaler is the default class that is used for the autoscaler task.
<task xmlns=""

2) loadbalancer.xml pointed from autoscaler.xml

This property points to the file loadbalancer.xml for further autoscaler configuration.
    <property name="configuration" value="$system:loadbalancer.xml"/>

3) Trigger Interval

The autoscaling task is triggered based on the trigger interval that is defined in the autoscaler.xml. This is given in seconds.
    <trigger interval="5"/>

Autoscale Mediators

autoscaleIn and autoscaleOut are the mediators involved in autoscaling, as we discussed above. As with the other synapse mediators, the autoscaling mediators should be defined in the main sequence of the synapse configuration if you are going to use autoscaling. Load Balancer-1.0.x comes with these mediators defined in the main sequence, which can be found at $CARBON_HOME/repository/deployment/server/synapse-configs/sequences/main.xml. Hence you will need to modify main.xml only if you are configuring the load balancer without autoscaling.

autoscaleIn mediator is defined as an in mediator. It gets the configurations from loadbalancer.xml, which is the single file that should be configured for autoscaling, once you have already got a system that is set up for load balancing.
        <autoscaleIn configuration="$system:loadbalancer.xml"/>

Similarly autoscaleOut mediator is defined as an out mediator.



loadbalancer.xml contains the service cluster configurations for the respective services to be load balanced and for the load balancer itself. Here the service-awareness of the load balancer makes it possible to manage the load across multiple service clusters. The properties given in loadbalancer.xml are used to provide the required configurations and customizations for autoscaling and load balancing. These configurations can also be taken from the system properties, as shown below.

1) Properties common for all the instances

1.1) ec2AccessKey

The property 'ec2AccessKey' is used to provide the EC2 Access Key of the instance.
    <property name="ec2AccessKey" value="${AWS_ACCESS_KEY}"/>

1.2) ec2PrivateKey
The certificate is defined by the property 'ec2PrivateKey'.
    <property name="ec2PrivateKey" value="${AWS_PRIVATE_KEY}"/>

1.3) sshKey
The ssh key pair is defined by 'sshKey'.
    <property name="sshKey" value="stratos-1.0.0-keypair"/>

1.4) instanceMgtEPR
'instanceMgtEPR' is the end point reference of the web service that is called for the management of the instances.
    <property name="instanceMgtEPR" value=""/>

1.5) disableApiTermination
The 'disableApiTermination' property is set to true by default, and it is recommended to leave it as it is. This prevents the instances from being terminated via AWS API calls.
    <property name="disableApiTermination" value="true"/>

1.6) enableMonitoring
The 'enableMonitoring' property can be turned on, if it is preferred to monitor the instances.
    <property name="enableMonitoring" value="false"/> 

2) Configurations for the load balancer service group

These are defined under
<loadBalancer> .. </loadBalancer>

2.1) securityGroups
The service group that the load balancer belongs to is defined by the property 'securityGroups'. The security group will differ for each of the services that are load balanced, as well as for the load balancers. The autoscaler uses this property to identify the members of the same cluster.
        <property name="securityGroups" value="stratos-appserver-lb"/>

2.2) instanceType
'instanceType' defines the EC2 instance type of the instance - whether they are m1.small, m1.large, or m1.xlarge (extra large).
        <property name="instanceType" value="m1.large"/>

2.3) instances
The property 'instances' defines the number of load balancer instances. Multiple load balancers are used to avoid a single point of failure, by providing a primary and a secondary load balancer.
        <property name="instances" value="1"/>

2.4) elasticIP
Elastic IP address for the load balancer is defined by the property, 'elasticIP'. We will be able to access the service, by accessing the elastic IP of the load balancer. The load balancer picks the value of the elastic IP from the system property ELASTIC_IP.
        <property name="elasticIP" value="${ELASTIC_IP}"/>

In a public cloud, elastic IPs are public (IPv4) internet addresses, which are a scarce resource. Hence it is recommended to assign elastic IPs only to the load balancer instances that are to be exposed to the public; all the services that communicate privately should be associated with private IP addresses, for an efficient use of this resource. Amazon EC2 provides 5 elastic IP addresses by default for each customer, which of course can be increased by sending a request to increase the elastic IP address limit.

2.5) availabilityZone
This defines in which availability zone the spawned instances should be.
       <property name="availabilityZone" value="us-east-1c"/>

2.6) payload

The file that is defined by 'payload' is uploaded to the spawned instances. This is often a zip archive, that extracts itself into the spawned instances.
        <property name="payload" value="/mnt/"/>
The payload contains the necessary files, such as the public and private keys, certificates, and the launch-params (the file with the launch parameters), to download and start a load balancer instance in the spawned instances.

The launch-params file includes the details that the newly spawned instances need in order to function like the other instances. More information on this can be found in the EC2 documentation.

Sample Launch Parameters
Given below is a sample launch-params, that is used in StratosLive by the load balancer of the Application Server service.

We will look more into these launch-params now.

The credentials (the access key ID and the secret access key) are given to access the AWS account.

S3 Locations
The service zip and the common modifications or patches are stored in an S3 bucket. The locations are given by a few properties in the launch-params shown above.
  • PRODUCT_MODIFICATIONS_PATH_S3 - Points to the location where the product specific changes, files, or patches are uploaded.
  • COMMON_MODIFICATIONS_PATH_S3 - Points to the patches and changes common to all the servers.
  • PRODUCT_PATH_S3 - Points to the location where the relevant Stratos service zips are available.
  • STARTUP_DELAY - Given in seconds. Provides some time to start the service that is downloaded on the newly spawned instance, such that it will join the service cluster and be available as a new service instance.
Apart from these, PRODUCT_NAME, SERVER_NAME, HTTP_PORT, and HTTPS_PORT for the application are also given.


3) Configurations for the application groups

These are defined under
<services> .. </services>

These too should be configured as we configured the properties for the load balancers above.

We define the default values of the properties for all the services under
<defaults> .. </defaults>

Some of these properties - such as the payload, host, and domain - will be specific to a particular service group, and should be defined separately for each of the services, under
<service> .. </service>

Properties applicable to all the instances
payload, availabilityZone, securityGroups, and instanceType are a few properties that are not specific to the application instances. We have already discussed these properties when setting the load balancer properties above.

Properties specific to the application instances
These properties are specific to the application clusters, and are not applicable to the load balancer instances. We will discuss these properties now.

3.1) minAppInstances
The property 'minAppInstances' defines the minimum number of application instances that should always be running in the system. By default, the minimum for all the application instances is set to 1. We may go for a higher value for services that are in high demand all the time, so that multiple instances are always available to serve the higher load.
            <property name="minAppInstances" value="1"/>

3.2) maxAppInstances
'maxAppInstances' defines the upper limit of the application instances. The respective service can scale up till it reaches the number of instances defined here.
            <property name="maxAppInstances" value="5"/>

3.3) queueLengthPerNode
The property 'queueLengthPerNode' provides the maximum length of the message queue per node.
            <property name="queueLengthPerNode" value="400"/>

3.4) roundsToAverage
The property 'roundsToAverage' indicates the number of attempts to be made before scaling the system up or down. When it comes to scaling down, the algorithm makes sure that it doesn't terminate an instance that has just been spawned. This is because the spawned instances are billed for a full hour. Hence, even if we don't have much load, it makes sense to wait for a considerable amount of time (say, 58 minutes) before terminating the instances.
            <property name="roundsToAverage" value="10"/>
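
The two ideas above can be sketched in a few lines of Python. This is only an illustration and not the code of the WSO2 autoscaler. It assumes that 'roundsToAverage' means the load is averaged over that many task runs before a decision is made, and that an instance becomes a candidate for termination only in the last couple of minutes of its billing hour. The names LoadAverager and near_end_of_billing_hour, and the 2 minute margin, are hypothetical.

```python
from collections import deque


class LoadAverager:
    """Keeps the last `rounds_to_average` load samples (hypothetical helper)."""

    def __init__(self, rounds_to_average):
        self._samples = deque(maxlen=rounds_to_average)

    def record(self, requests_in_flight):
        # Called once per task run; the oldest sample drops out automatically.
        self._samples.append(requests_in_flight)

    def average(self):
        # No decision is possible until enough rounds have been observed.
        if len(self._samples) < self._samples.maxlen:
            return None
        return sum(self._samples) / len(self._samples)


def near_end_of_billing_hour(uptime_minutes, margin_minutes=2):
    """True only in the last `margin_minutes` of each (assumed) 60 minute billing period."""
    return uptime_minutes % 60 >= 60 - margin_minutes


averager = LoadAverager(rounds_to_average=3)
for sample in (100, 200, 300):
    averager.record(sample)
print(averager.average())            # average over the last 3 rounds
print(near_end_of_billing_hour(58))  # an instance that has been up for 58 minutes
```

Averaging over several rounds keeps a short spike from triggering a scale-up, and the billing check avoids paying for an hour that is then mostly thrown away.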

3.5) instancesPerScaleUp
This defines how many instances should be spawned each time the system scales up. By default, this is set to '1', such that a single instance is spawned whenever the system scales up. This too can be changed such that multiple instances are spawned each time, though it may not be cost-effective to set this to a higher value.
            <property name="instancesPerScaleUp" value="1"/>

3.6) messageExpiryTime
'messageExpiryTime' defines how long a message can stay before it expires.
            <property name="messageExpiryTime" value="60000"/>

Properties specific to a particular service group
Properties such as hosts and domain are unique to a particular service group, among all the service groups that are load balanced by the given load balancer. We should note that we can use a single load balancer set up with multiple service groups, such as Application Server, Enterprise Service Bus, Business Process Server, etc.

Here we also define the properties such as payload and availabilityZone, if they differ from the default values provided under
<defaults> .. </defaults>

Hence these properties should be defined under
<service>.. </service>
for each of the services.

3.7) hosts
'hosts' defines the hosts of the service that is to be load balanced. These will be used as the access point or URL to access the respective service.
Multiple hosts can be defined under
<hosts> .. </hosts>
Given below is a sample hosts configurations for the application server service

3.8) domain
Just as the EC2 autoscaler uses the security groups to identify the service groups, 'domain' is used by the load balancer (ServiceDynamicLoadBalanceEndpoint) to correctly identify the clusters of the load balanced services.
Once you have configured the load balancer as above, with the product/service instances, you will have a system that scales dynamically.

Auto Scaling with WSO2 Load Balancer

How Auto Scaling works with WSO2 Load Balancer

The autoscaling component comprises the synapse mediators AutoscaleInMediator and AutoscaleOutMediator and a Synapse Task, ServiceRequestsInFlightEC2Autoscaler, that functions as the load analyzer task. A system can scale up based on several factors, and hence autoscaling algorithms can easily be written considering the nature of the system. For example, Amazon's Auto Scaler API provides options to scale the system with the system properties such as Load (the timed average of the system load), CPUUtilization (utilization of the CPU at the given instance), or Latency (delay or latency in serving the service requests).

Autoscaler Components

  • AutoscaleIn mediator - Creates a unique token and puts that into a list for each message that is received.
  • AutoscaleOut mediator - Removes the relevant stored token from the list, for each response message that is sent.
  • Load Analyzer Task - ServiceRequestsInFlightEC2Autoscaler is the load analyzer task used for the service level autoscaling by default. It periodically checks the length of the list of messages based on the configuration parameters. Here the messages that are in flight for each of the back end services are tracked by the AutoscaleIn and AutoscaleOut mediators, as we are using the messages in flight algorithm for autoscaling.
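
The token bookkeeping done by the two mediators can be sketched as follows. This is a simplified Python illustration, not the Synapse mediator code. The class InFlightTracker and its method names are hypothetical, and a set per domain is used instead of a list only to make the removal cheap.

```python
import uuid


class InFlightTracker:
    """Tracks the requests in flight per service domain (hypothetical helper)."""

    def __init__(self):
        self._tokens = {}  # domain -> set of tokens for requests awaiting a response

    def on_request(self, domain):
        # What the in-mediator does: create a unique token and store it.
        token = uuid.uuid4().hex
        self._tokens.setdefault(domain, set()).add(token)
        return token

    def on_response(self, domain, token):
        # What the out-mediator does: remove the token of the answered request.
        self._tokens.get(domain, set()).discard(token)

    def in_flight(self, domain):
        # What the load analyzer task reads periodically.
        return len(self._tokens.get(domain, ()))


tracker = InFlightTracker()
first = tracker.on_request("appserver")   # "appserver" is an example domain name
second = tracker.on_request("appserver")
tracker.on_response("appserver", first)
print(tracker.in_flight("appserver"))     # requests still waiting for a response
```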

ServiceRequestsInFlightEC2Autoscaler implements the execute() of the Synapse Task interface. Here it calls sanityCheck() that does the sanity check and autoscale() that handles the autoscaling.

Sanity Check

sanityCheck() checks the sanity of the load balancers and the services that are load balanced: whether the running application nodes and the load balancer instances meet the minimum number specified in the configurations, and whether the load balancers are assigned elastic IPs.

nonPrimaryLBSanityCheck() runs once on the primary load balancers and runs from time to time on the secondary/non-primary load balancers, as the task is executed periodically. nonPrimaryLBSanityCheck() assigns the elastic IP to the instance, if it is not assigned already. Secondary load balancers periodically check that a primary load balancer is running. This avoids the load balancer being a single point of failure in a load balanced services architecture.

computeRunningAndPendingInstances() computes the number of instances that are running and pending. ServiceRequestsInFlightEC2Autoscaler task computes the running and pending instances for the entire system using a single EC2 API call. This reduces the number of EC2 API calls, as AWS throttles the number of requests you can make in a given time. This method will be used to find whether the running instances meet the minimum number of instances specified for the application nodes and the load balancer instances through the configuration as given in loadbalancer.xml. Instances are launched, if the specified minimum number of instances is not found.
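
The counting and the minimum check described above can be sketched like this. It is a minimal Python illustration that makes no real AWS call: the list of (security group, state) pairs stands in for the result of one describe call, and the function names and the "appserver" group are hypothetical.

```python
from collections import Counter


def count_running_and_pending(instances):
    """Group one listing of (security_group, state) pairs into per-group counts."""
    counts = {}
    for group, state in instances:
        if state in ("running", "pending"):
            counts.setdefault(group, Counter())[state] += 1
    return counts


def instances_to_launch(running, pending, minimum):
    # Pending instances count too, so that a second sanity check does not
    # launch duplicates while the first ones are still starting.
    return max(0, minimum - (running + pending))


# One listing for the whole system, standing in for a single EC2 API call.
listing = [
    ("stratos-appserver-lb", "running"),
    ("appserver", "pending"),
    ("appserver", "terminated"),
]
counts = count_running_and_pending(listing)
app = counts.get("appserver", Counter())
print(instances_to_launch(app["running"], app["pending"], minimum=2))
```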


autoscale() handles the autoscaling of the entire system by analyzing the load of each of the domains. This contains the algorithm - RequestsInFlight based autoscaling. If the current average of requests is higher than what can be handled by the current nodes, the system will scale up. If the current average is less than what can be handled by (current nodes - 1), the system will scale down.
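
The decision rule can be written down as a small Python sketch. It is not the actual ServiceRequestsInFlightEC2Autoscaler code: it assumes that the load a node can handle is given by 'queueLengthPerNode', and that the result is kept within 'minAppInstances' and 'maxAppInstances'. The function name decide_scaling is hypothetical.

```python
def decide_scaling(average_requests_in_flight, running_instances,
                   queue_length_per_node, min_instances, max_instances,
                   instances_per_scale_up=1):
    """Return the change in instance count: positive to scale up, negative to scale down."""
    capacity = running_instances * queue_length_per_node
    capacity_with_one_less = (running_instances - 1) * queue_length_per_node

    if average_requests_in_flight > capacity:
        # More load than the current nodes can handle: scale up, but not past the maximum.
        return max(0, min(instances_per_scale_up, max_instances - running_instances))

    if (average_requests_in_flight < capacity_with_one_less
            and running_instances > min_instances):
        # One node fewer could still handle the load: scale down by one.
        return -1

    return 0


# Two nodes with 400 requests each: 900 in flight is too much, 300 is too little.
print(decide_scaling(900, 2, 400, min_instances=1, max_instances=5))
print(decide_scaling(300, 2, 400, min_instances=1, max_instances=5))
```

Comparing against the capacity of (current nodes - 1) rather than the current capacity leaves a band in which nothing happens, which keeps the system from flapping between scaling up and down.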

The autoscaling component spawns new instances, and once the relevant services successfully start running in the spawned instances, they will join the respective service cluster. The load balancer starts forwarding the service calls or requests to the newly spawned instances once they have joined the service clusters. Similarly, when the load goes down, the autoscaling component terminates the under-utilized service instances, after serving the requests that are already routed to those instances.

StratosLive - A case study for WSO2 Load Balancer

In a cloud environment such as WSO2 StratosLive, auto-scaling becomes a crucial functionality. The system is expected to scale up and down with the dynamically changing load. Auto-scaling capabilities are sometimes provided by the Infrastructure as a Service provider themselves, such as the Autoscaling from Amazon. However, autoscaling is not necessarily a requirement that is to be fulfilled by an IaaS. Say you are providing a Platform as a Service (PaaS) that is hosted directly over native hardware, instead of an IaaS. In that case, your PaaS should be able to provide the required autoscaling and load balancing capabilities to the applications that are hosted on top of your platform. WSO2 Load Balancer is such a software load balancer, which handles the load balancing, fail over, and autoscaling functionalities.

WSO2 Load Balancer is used in production as a dynamic load balancer and autoscaler, as a complete software load balancer product. It is a stripped down version of WSO2 Enterprise Service Bus, containing only the components that are required for load balancing. WSO2 StratosLive can be considered a user scenario with WSO2 Load Balancer in production.

Multiple service groups are proxied by WSO2 Load Balancers. Some of the services have more than one instance to start with, to withstand the higher load. The system automatically scales as the load goes up and down. WSO2 Load Balancer is configured such that the permanent or initial nodes are not terminated when the load goes down. Only the nodes that are spawned by the load balancer to handle the higher load will be terminated when the load goes low. Hence, it becomes possible to have different services running on a single instance, for the instances that are 'permanent', while the spawned instances will have a single carbon server instance.

scp - Copying files between two remote locations

Say you are now going to copy a few files from one remote server to another. As usual, remote_server_1 should be given the credentials to copy files to remote_server_2.

root@node2:~# scp -P 1984 -r /mnt/patches root@

Usually, the key of your own computer has already been granted access to those remote locations. But since access has not been given to remote_server_1, it will prompt for the password of remote_server_2.
As a quick fix, you can copy the private key from your local computer to remote_server_1. However, there are security concerns in doing this; further discussion on them can be found on the web.

scp -P 1984 ~/.ssh/id_rsa root@

Now, if you encounter the error below when trying,
root@node2:~# scp -P 1984 -r -i id_rsa /mnt/patches root@
ssh_exchange_identification: Connection closed by remote host
lost connection

Have a look at the denied and allowed hosts of remote_server_2, and make sure that the IP of remote_server_1 is allowed and not denied.

vim /etc/hosts.deny
vim /etc/hosts.allow
#sshd sshd1 sshd2 : ALL : ALLOW

Now, the scp command given above, should work as expected to copy the files from the remote_server_1 to remote_server_2.

Later update: I found rsync to be more efficient.

nohup rsync -avz /source root@ &

ssh: connect to host port 22: Connection refused lost connection

This is one of the most common errors thrown when trying to copy files over scp. The major reason for this is the port being different from the default, 22.
pradeeban@pradeeban:~$ scp -r /home/pradeeban/patches root@
ssh: connect to host port 22: Connection refused
lost connection
To fix this, use the -P flag with the port number. Notice the upper case: the lower case -p is already used to preserve modification times and modes, consistent with the -p usage of the cp command.
pradeeban@pradeeban:~$ scp -P 1984 -r /home/pradeeban/patches root@

Thursday, December 1, 2011

[Google Code-In 2011] Localizing Haiku

This year, I joined Haiku as a mentor for Google Code-in (GCI) 2011. This post is specific to the GCI-2011 task that I have been mentoring, the localization of the Haiku operating system. I will soon write about GCI in a more generic post for the wider audience.

Get used to the system
Make sure that you follow the localization guidelines specific to the project. For the Haiku localizations with the Haiku Translation Assistant (HTA), make sure to pick the correct language from the drop-down on the right-hand side, under the label "Start Translating in..." If you are going to translate Haiku into Tamil, make sure to pick "Tamil". Also make sure that you have logged into the HTA before starting the localization.

For example, if you are translating,

But if you are trying to translate

Join the relevant localization lists to get more information on the localization efforts for the particular project.
[Haiku i18n mail address -].

Translate only the strings. Not the notes below.
For example,
Pager
Note: A small radio device to receive short text messages
Translate only "Pager". Not the "Note:" below.

When refreshing the page, HTA sometimes tends to reset itself to en_US. Hence make sure that you are still translating into your intended locale (for example, Tamil - ta) and not into en_US.