Friday, December 30, 2011

A few things that made my 2011 interesting..

Google HQ, Mountain View

Finally, the year 2011 comes to an end. 2011 was a year with its own ups and downs. I still feel that 2010 and 2004 were the best of all the years. In summary, though, 2011 was a positive year. In this blog post, I try to analyze the year with a summary of its remarkable events.

1. Google Summer of Code Mentor Summit 2011

A great meet-up with around 360 mentors from different organizations all over the world, at the Google HQ.

2. Google Code-In (GCI) Mentoring.

Mentoring high school students for GCI for Haiku.

3. Webinars at WSO2.

Presented a few.

4. WWW2011India.

The web conference in Hyderabad.

5. Open Source Evangelization.

Did several GSoC introduction presentations throughout the country.

6. GSoC Mentoring for AbiWord.

First time as a mentor.

Chocolates of Colombo

7. One year as a software engineer at WSO2.

Time flies fast.

8. Representing WSO2 at NBQSA 2011.

13th National Best Quality ICT Awards.

9. WSO2Con 2011 in Battaramulla Waters Edge

We had days of sessions.

10. BSc (Eng) convocation.

An interesting day, meeting friends after a long time, at BMICH.

11. Teafe at Bangalore.

Bangalore is a lovely city. Also the first time alone in a foreign country.

12. Exploring Sri Lankan coffee shops.

We have some good ones.

13. Applying for the USA visa.

Got one for 5 years, multi-entry.

14. Hyderabad
Experiencing the Indian three-wheeler scammers.

15. The long flight from DXB -> LAX.
It was a very long flight. I am sure it will continue to be my longest non-stop flight for quite a few years.

16. Random events such as DZone Kolomba Meetup.
Attended and presented in a few.

17. The peak of receiving spams.
Thanks CAN-SPAM Act for helping out.

18. Llovizna
2011 is the year that I started blogging beyond the technology posts, such as posts on time.

19. Flight over the north pole.
Who expected LAX -> DXB to take this weird path?

20. Coordinating my unconference session at the mentor summit.
Community Matters!!!

21. Working with Cloud Computing
Eventually becoming an expert on the topic.

22. Wikipedia Edits.
Wikipedia in various languages.

23. Applying also for GSoC with SSI as a mentor.
We did not succeed, and I stuck with AbiWord.

24. Stephen Covey's Books
7 Habits of Highly Effective People and the 8th Habit.

25. Localizations and Translations.
I used the online tools effectively.

26. Messed up timezones in a long flight.
DXB -> LAX, same day arrival despite a long flight. LAX -> DXB, left on a Monday and arrived on Wednesday. Tuesday magically disappeared.

27. Award for the best blogger
I received the award from WSO2.

28. Importing a mobile phone directly.
And managing all the complexities that came with that bad decision.

29. Applying for Higher Studies
Erasmus Mundus Masters programs on distributed systems as well as data mining.

30. WSO2 Webinars.
I presented in the WSO2 Summer School Webinars.

Thanks for reading my list until the end. If you are really interested, you may also read the blog posts on my other years as well. Happy new year everyone.

Google Code-In

The reach of Google Code-In (GCI) is relatively low in Sri Lanka (and probably in other countries too) compared to its sister program, Google Summer of Code. One major reason may be that the 13 - 17 age group (pre-university students) in third world countries like Sri Lanka is not as much into programming, or into computers in general, as the university students of 18+ are. More importantly, though, I feel that the word about GCI is yet to spread.

In its style, this presentation resembles the Google Summer of Code presentation that I prepared for AbiWord/GSoC. It is based on my experience as a mentor for Haiku in Google Code-In 2011. Hence the examples used in this presentation are mostly from Haiku. I hope this presentation will be useful to any student as an introduction to Google Code-In.

Special thanks to Krasimir Petkov, GCI mentor (Haiku), for his valuable input on several occasions in shaping this presentation up.

An updated, GCI-2012 presentation is available here.

Sunday, December 25, 2011

Open Source Evangelization and the evolution of the GSoC introductory presentation

Open Source Evangelization

We decided to use Google Summer of Code as a means to evangelize open source among the university students. I have been contributing to this evangelization effort this year. For this, I prepared a presentation and presented it at [1] the Institute of Engineers, Sri Lanka (IESL), [2] the Science Faculty of the University of Peradeniya, and [4] the Engineering Faculty of the University of Peradeniya. We have also scheduled a session at [5] the Science Faculty of the University of Jaffna, on the 7th of January, 2012.

The evolution of the presentation can be found at the respective blog posts.

[1] GSoC-2011 [35 slides / IESL. First version - 40 minutes]

[2] GSoC 2011 and FOSS [38 slides / SET - UoP - 75 minutes]

[3] AbiWord and Google Summer of Code - 2011 [35 slides / AbiWord Specific - 75 minutes (estimated)]

[4] Google Summer of Code awareness session [42 slides / E-Fac - UoP - 75 minutes]

[5] Google Summer of Code 2012 [45 Slides / Further improved. Final / Global version - 75 minutes (estimated)]

It is interesting to see the growth of the GSoC presentation over the year. I hope my GCI presentation [28 slides / First Version - 50 minutes (estimated)] will eventually reach a similar quality too.

Saturday, December 17, 2011

Google Summer of Code 2012

Google summer of code

View more presentations from Kathiravelu Pradeeban.

We are having a series of GSoC awareness sessions, including yesterday's session at the University of Peradeniya, and the upcoming session at the University of Jaffna on the 7th of January, 2012. These events focus on discussing GSoC and FOSS. Attached herewith is the latest version of the presentation I prepared to introduce GSoC 2012 to the students. Feel free to download and distribute it, if a slow network prevents you from viewing the presentation here.

As a mentor from the AbiWord community, I have come up with the slides based on our experience with the Google Summer of Code. This presentation is also influenced by my experience as a three-time Google Summer of Code participant, with AbiWord (2011 as a mentor and 2009 as a student) and OMII-UK (2010 as a student). Special thanks to Martin Sevior and the AbiWord community for their valuable input on several occasions in shaping this presentation up.

Make sure to have a look at the Google Summer of Code 2012 project ideas from AbiWord.

The presentations in this blog require Shockwave Flash Plugin to display correctly. If you couldn't see it correctly, make sure you have the required plugin enabled. Feel free to drop a comment should you require further information.

Google Summer of Code awareness session

Google summer of code 2012


Yesterday we had an awareness session for Google Summer of Code (GSoC) at the Engineering Faculty of the University of Peradeniya. This event focussed on discussing GSoC and FOSS. It is an interesting fact that we have visited the University of Peradeniya, after exactly 11 months, for the very same event - Google Summer of Code awareness session. Our previous session was held at the science faculty, on 17th of Jan, 2011.

Attached herewith is my presentation, introducing GSoC 2012 to the students. These slides are based on my experience as a three-time Google Summer of Code participant, with AbiWord (2011 as a mentor and 2009 as a student) and OMII-UK (2010 as a student).

In slow network connections, the presentation might take a bit longer to load. In that case, please feel free to download the presentation for your future reference.

Update: Please find the latest revised version of this presentation at Google Summer of Code 2012.

Wednesday, December 14, 2011

Configuring WSO2 Load Balancer for Auto Scaling

This post assumes that the reader is familiar with configuring the WSO2 Load Balancer without autoscaling, and has already configured the system with the load balancer. Hence this post focuses on setting up the load balancer with autoscaling. If you are new to setting up WSO2 Servers proxied by WSO2 Load Balancer, please read the blog post, How to setup WSO2 Elastic Load Balancer, to configure WSO2 Load Balancer without autoscaling.

autoscaler.xml

The autoscaling configurations are defined in $CARBON_HOME/repository/deployment/server/synapse-configs/tasks/autoscaler.xml

1) Task Definition

In WSO2 Load Balancer, the autoscaling algorithm to be used is defined as a Task. ServiceRequestsInFlightEC2Autoscaler is the default class that is used for the autoscaler task.

<task xmlns="http://ws.apache.org/ns/synapse"
      class="org.wso2.carbon.mediator.autoscale.ec2autoscale.ServiceRequestsInFlightEC2Autoscaler"
      name="autoscaler">

2) loadbalancer.xml pointed from autoscaler.xml

This property points to the file loadbalancer.xml for further autoscaler configuration.

    <property name="configuration" value="$system:loadbalancer.xml"/>

3) Trigger Interval

The autoscaling task is triggered based on the trigger interval that is defined in the autoscaler.xml. This is given in seconds.

    <trigger interval="5"/>

Autoscale Mediators

The autoscaleIn and autoscaleOut mediators are the mediators involved in autoscaling. As with the other Synapse mediators, the autoscaling mediators should be defined in the main sequence of the Synapse configuration, if you are going to use autoscaling. Load Balancer-1.0.x comes with these mediators already defined in the main sequence, which can be found at $CARBON_HOME/repository/deployment/server/synapse-configs/sequences/main.xml. Hence you will need to modify main.xml only if you are configuring the load balancer without autoscaling.

autoscaleIn mediator is defined as an in mediator. It gets the configurations from loadbalancer.xml, which is the single file that should be configured for autoscaling, once you have already got a system that is set up for load balancing.

        <autoscaleIn configuration="$system:loadbalancer.xml"/>

Similarly autoscaleOut mediator is defined as an out mediator.

        <autoscaleOut/>

loadbalancer.xml

loadbalancer.xml contains the service cluster configurations for the respective services to be load balanced, and for the load balancer itself. Here the service-awareness of the load balancer makes it possible to manage the load across multiple service clusters. The properties given in loadbalancer.xml are used to provide the required configurations and customizations for autoscaling and load balancing. These configurations can also be taken from the system properties, as shown below.

1) Properties common for all the instances

1.1) ec2AccessKey

The property 'ec2AccessKey' is used to provide the EC2 Access Key of the instance.

    <property name="ec2AccessKey" value="${AWS_ACCESS_KEY}"/>

1.2) ec2PrivateKey

The certificate is defined by the property 'ec2PrivateKey'.

    <property name="ec2PrivateKey" value="${AWS_PRIVATE_KEY}"/>

1.3) sshKey

The ssh key pair is defined by 'sshKey'.

    <property name="sshKey" value="stratos-1.0.0-keypair"/>

1.4) instanceMgtEPR

'instanceMgtEPR' is the end point reference of the web service that is called for the management of the instances.

    <property name="instanceMgtEPR" value="https://ec2.amazonaws.com/"/>

1.5) disableApiTermination

The 'disableApiTermination' property is set to true by default, and it is recommended to leave it as it is. This prevents the instances from being terminated via the AWS API calls.

    <property name="disableApiTermination" value="true"/>

1.6) enableMonitoring

The 'enableMonitoring' property can be turned on, if it is preferred to monitor the instances.

    <property name="enableMonitoring" value="false"/>
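The ${...} values used above, such as ${AWS_ACCESS_KEY}, are filled in from system properties. As a rough illustration of that substitution only, here is a minimal Python sketch. The helper name resolve_placeholders and the sample property values are hypothetical, and this is not how the load balancer itself implements the lookup.

```python
import re

# Hypothetical helper for illustration; not part of WSO2 Load Balancer.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve_placeholders(value, system_properties):
    """Replace each ${NAME} in value with system_properties[NAME].

    Unknown names are left untouched, so a missing property is easy to spot.
    """
    def substitute(match):
        name = match.group(1)
        return system_properties.get(name, match.group(0))

    return _PLACEHOLDER.sub(substitute, value)


# Made-up property value, standing in for a real system property.
properties = {"AWS_ACCESS_KEY": "example-access-key"}

resolved = resolve_placeholders("${AWS_ACCESS_KEY}", properties)
unresolved = resolve_placeholders("${ELASTIC_IP}", properties)
literal = resolve_placeholders("stratos-1.0.0-keypair", properties)
```

Values without a placeholder, such as the sshKey value, pass through unchanged.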

2) Configurations for the load balancer service group

These are defined under

<loadBalancer> .. </loadBalancer>

2.1) securityGroups

The service group that the load balancer belongs to is defined by the property 'securityGroups'. The security group will differ for each of the services that are load balanced, as well as for the load balancers. The autoscaler uses this property to identify the members of the same cluster.

        <property name="securityGroups" value="stratos-appserver-lb"/>

2.2) instanceType
'instanceType' defines the EC2 instance type of the instances - whether they are m1.small, m1.large, or m1.xlarge (extra large).

        <property name="instanceType" value="m1.large"/>

2.3) instances

The property, 'instances' defines the number of the load balancer instances. Multiple load balancers are used to prevent the single point of failure - by providing a primary and a secondary load balancer.

        <property name="instances" value="1"/>

2.4) elasticIP

Elastic IP address for the load balancer is defined by the property, 'elasticIP'. We will be able to access the service, by accessing the elastic IP of the load balancer. The load balancer picks the value of the elastic IP from the system property ELASTIC_IP.

        <property name="elasticIP" value="${ELASTIC_IP}"/>

In a public cloud, elastic IPs are public (IPv4) internet addresses, which are a scarce resource. Hence it is recommended to assign elastic IPs only to the load balancer instances that are to be exposed to the public. All the services that are communicated with privately should be associated with private IP addresses, for an efficient use of this resource. Amazon EC2 provides 5 elastic IP addresses by default for each customer, which of course can be increased by sending a request to raise the elastic IP address limit.

2.5) availabilityZone

This defines in which availability zone the spawned instances should be.

       <property name="availabilityZone" value="us-east-1c"/>

2.6) payload

The file that is defined by 'payload' is uploaded to the spawned instances. This is often a zip archive, that extracts itself into the spawned instances.

        <property name="payload" value="/mnt/payload.zip"/>

payload.zip contains the necessary files such as the public and private keys, certificates, and the launch-params (file with the launch parameters) to download and start a load balancer instance in the spawned instances.

The launch-params file includes the details needed for the newly spawned instances to function like the other instances. More information on this can be found in the EC2 documentation.

Sample Launch Parameters

Given below is a sample launch-params, that is used in StratosLive by the load balancer of the Application Server service.

AWS_ACCESS_KEY_ID=XXXXXXXXXXXX,AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,
AMI_ID=ami-xxxxxxxxx,ELASTIC_IP=xxx.xx.xxx.xxx,
PRODUCT_MODIFICATIONS_PATH_S3=s3://wso2-stratos-conf-1.5.2/appserver/,
COMMON_MODIFICATIONS_PATH_S3=s3://wso2-stratos-conf-1.5.2/stratos/,
PRODUCT_PATH_S3=s3://wso2-stratos-products-1.5.2,PRODUCT_NAME=wso2stratos-as-1.5.2,
SERVER_NAME=appserver.stratoslive.wso2.com,
HTTP_PORT=9763,HTTPS_PORT=9443,STARTUP_DELAY=0;60

We will look more into these launch-params now.

Credentials

The credentials - the access key ID and the secret access key - are given to access the AWS account.

S3 Locations

The service zip and the common modifications or patches are stored in an S3 bucket. The locations are given by a few properties in the launch-params shown above.

PRODUCT_MODIFICATIONS_PATH_S3 - Points to the location where the product specific changes, files, or patches are uploaded.
COMMON_MODIFICATIONS_PATH_S3 - Points to the patches and changes common to all the servers.
PRODUCT_PATH_S3 - Points to the location where the relevant Stratos service zips are available.
STARTUP_DELAY - Given in seconds. Provides some time to start the service that is downloaded on the newly spawned instance, such that it will join the service cluster and be available as a new service instance.

Apart from these, PRODUCT_NAME, SERVER_NAME, HTTP_PORT, and HTTPS_PORT for the application are also given.
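To make the format of the launch-params easier to see, here is a small Python sketch that reads such a string as comma-separated KEY=VALUE pairs. The function name parse_launch_params is hypothetical, and the sketch only assumes the format visible in the sample above; it is not the script that the spawned instances actually run.

```python
def parse_launch_params(text):
    """Parse comma-separated KEY=VALUE pairs into a dict of strings."""
    params = {}
    for pair in text.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma or an empty segment
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = pair.partition("=")
        params[key] = value
    return params


# A shortened excerpt of the sample launch-params shown above.
sample = """SERVER_NAME=appserver.stratoslive.wso2.com,
HTTP_PORT=9763,HTTPS_PORT=9443,STARTUP_DELAY=0;60"""

params = parse_launch_params(sample)
```

Note that the value of STARTUP_DELAY is kept as the raw string "0;60"; how it is interpreted is left to whatever consumes the parameters.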

3) Configurations for the application groups

These are defined under

<services> .. </services>

These too should be configured as we configured the properties for the load balancers above.

We define the default values of the properties for all the services under

<defaults> .. </defaults>

Some of these properties - such as the payload, host, and domain - will be specific to a particular service group, and should be defined separately for each of the services, under

<service> .. </service>

Properties applicable to all the instances

payload, availabilityZone, securityGroups, and instanceType are a few properties that are not specific to the application instances. We have already discussed these properties when setting the load balancer properties above.

Properties specific to the application instances

These properties are specific to the application clusters, and are not applicable to the load balancer instances. We will discuss these properties now.

3.1) minAppInstances

The property 'minAppInstances' specifies the minimum number of application instances that should always be running in the system. By default, the minimum for all the application instances is set to 1. We may go for a higher value for the services that are in high demand all the time, such that multiple instances will always be serving the higher load.

            <property name="minAppInstances" value="1"/>

3.2) maxAppInstances

'maxAppInstances' defines the upper limit of the application instances. The respective service can scale up till it reaches the number of instances defined here.

            <property name="maxAppInstances" value="5"/>

3.3) queueLengthPerNode

The property 'queueLengthPerNode' provides the maximum length of the message queue per node.

            <property name="queueLengthPerNode" value="400"/>

3.4) roundsToAverage

The property 'roundsToAverage' indicates the number of attempts to be made before scaling the system up or down. When it comes to scaling down, the algorithm makes sure that it doesn't terminate an instance that has just been spawned. This is because a spawned instance is billed for a full hour. Hence, even if we don't have much load, it makes sense to wait for a considerable amount of time (say 58 minutes) before terminating the instance.

            <property name="roundsToAverage" value="10"/>
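The two ideas above can be sketched in Python. This is only an illustration under assumptions: that, as the property name suggests, the load is averaged over the last roundsToAverage rounds, and that the "say 58 minutes" wait applies within each billed hour. The names LoadWindow and old_enough_to_terminate are hypothetical and do not come from the load balancer code.

```python
from collections import deque


class LoadWindow:
    """Keeps the most recent `rounds` samples of requests in flight."""

    def __init__(self, rounds):
        # A deque with maxlen drops the oldest sample automatically.
        self.samples = deque(maxlen=rounds)

    def add(self, requests_in_flight):
        self.samples.append(requests_in_flight)

    def is_full(self):
        # Only decide on scaling once enough rounds have been observed.
        return len(self.samples) == self.samples.maxlen

    def average(self):
        return sum(self.samples) / len(self.samples)


def old_enough_to_terminate(uptime_minutes, threshold_minutes=58):
    """True when the instance is near the end of its current billed hour.

    Assumption: billing is per started hour, so terminating earlier
    in the hour would waste time that is already paid for.
    """
    return uptime_minutes % 60 >= threshold_minutes
```

With roundsToAverage set to 10 as above, a window created as LoadWindow(10) would report is_full() only after ten rounds of the autoscaler task.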

3.5) instancesPerScaleUp

This defines how many instances should be spawned each time the system scales up. By default, this is set to '1', such that a single instance is spawned whenever the system scales up. This can be changed such that multiple instances are spawned each time. However, it may not be cost-effective to set this to a higher value.

            <property name="instancesPerScaleUp" value="1"/>

3.6) messageExpiryTime

'messageExpiryTime' defines how long a message can stay before it expires.

            <property name="messageExpiryTime" value="60000"/>

Properties specific to a particular service group

Properties such as hosts and domain are unique to a particular service group, among all the service groups that are load balanced by the given load balancer. We should note that we can use a single load balancer set up with multiple service groups, such as Application Server, Enterprise Service Bus, Business Process Server, etc.

Here we also define the properties such as payload and availabilityZone, if they differ from the default values provided under

<defaults> .. </defaults>

Hence these properties should be defined under

<service>.. </service>

for each of the services.
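The relationship between the defaults and the service-specific properties can be pictured as a simple override. The Python sketch below assumes that a value defined under a service takes precedence over the one under defaults. The function name effective_properties and the overridden maxAppInstances value of 10 are made up for illustration.

```python
def effective_properties(defaults, service):
    """Return the defaults overlaid with the service-specific values."""
    merged = dict(defaults)  # copy, so the shared defaults stay untouched
    merged.update(service)   # service-specific values win
    return merged


defaults = {
    "minAppInstances": "1",
    "maxAppInstances": "5",
    "availabilityZone": "us-east-1c",
}

# Hypothetical service entry: sets its domain and raises one limit.
appserver = {"domain": "wso2.manager.domain", "maxAppInstances": "10"}

props = effective_properties(defaults, appserver)
```

Here props keeps minAppInstances and availabilityZone from the defaults, while maxAppInstances and domain come from the service entry.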

3.7) hosts

'hosts' defines the hosts of the service that is to be load balanced. These will be used as the access point or URL to access the respective service.

Multiple hosts can be defined under

<hosts> .. </hosts>

Given below is a sample hosts configuration for the application server service.

            <hosts>
                <host>appserver.cloud-test.wso2.com</host>
                <host>as.cloud-test.wso2.com</host>
            </hosts>

3.8) domain

Just as the EC2 autoscaler uses the security groups to identify the service groups, 'domain' is used by the load balancer (ServiceDynamicLoadBalanceEndpoint) to correctly identify the clusters of the load balanced services.

            <domain>wso2.manager.domain</domain>

Once you have configured the load balancer as above, with the product/service instances, you will have the system that dynamically scales.

Auto Scaling with WSO2 Load Balancer

How Auto Scaling works with WSO2 Load Balancer

The autoscaling component comprises the Synapse mediators AutoscaleInMediator and AutoscaleOutMediator, and a Synapse Task, ServiceRequestsInFlightEC2Autoscaler, that functions as the load analyzer task. A system can scale based on several factors, and hence autoscaling algorithms can easily be written considering the nature of the system. For example, Amazon's Auto Scaler API provides options to scale the system based on properties such as Load (the timed average of the system load), CPUUtilization (utilization of the CPU at the given instance), or Latency (delay or latency in serving the service requests).

Autoscaler Components

AutoscaleIn mediator - Creates a unique token and puts that into a list for each message that is received.
AutoscaleOut mediator - Removes the relevant stored token from the list, for each response message that is sent.
Load Analyzer Task - ServiceRequestsInFlightEC2Autoscaler is the load analyzer task used by default for the service level autoscaling. It periodically checks the length of the list of messages, based on the configuration parameters. Here the messages that are in flight for each of the back end services are tracked by the AutoscaleIn and AutoscaleOut mediators, as we are using the messages in flight algorithm for autoscaling.
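The interplay of the three components above can be sketched in a few lines of Python. This is a simplified illustration, not the actual mediator code: the class name InFlightTracker is hypothetical, and a set per service is used where the description above speaks of a list.

```python
import uuid


class InFlightTracker:
    """Tracks the messages in flight per back end service (domain)."""

    def __init__(self):
        self._tokens = {}  # domain -> set of tokens for pending requests

    def message_in(self, domain):
        # Role of the AutoscaleIn mediator: store a unique token per request.
        token = uuid.uuid4().hex
        self._tokens.setdefault(domain, set()).add(token)
        return token

    def message_out(self, domain, token):
        # Role of the AutoscaleOut mediator: drop the token on the response.
        self._tokens.get(domain, set()).discard(token)

    def in_flight(self, domain):
        # What the load analyzer task reads periodically.
        return len(self._tokens.get(domain, ()))
```

A request entering the load balancer calls message_in(), the matching response calls message_out(), and the periodic task samples in_flight() for each service.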

ServiceRequestsInFlightEC2Autoscaler implements the execute() of the Synapse Task interface. Here it calls sanityCheck() that does the sanity check and autoscale() that handles the autoscaling.

Sanity Check

sanityCheck() checks the sanity of the load balancers and the services that are load balanced, whether the running application nodes and the load balancer instances meet the minimum number specified in the configurations, and the load balancers are assigned elastic IPs.

nonPrimaryLBSanityCheck() runs once on the primary load balancers, and runs from time to time on the secondary/non-primary load balancers, as the task is executed periodically. nonPrimaryLBSanityCheck() assigns the elastic IP to the instance, if that is not assigned already. The secondary load balancers periodically check that a primary load balancer is running. This avoids the load balancer being a single point of failure in a load balanced services architecture.

computeRunningAndPendingInstances() computes the number of instances that are running and pending. ServiceRequestsInFlightEC2Autoscaler task computes the running and pending instances for the entire system using a single EC2 API call. This reduces the number of EC2 API calls, as AWS throttles the number of requests you can make in a given time. This method will be used to find whether the running instances meet the minimum number of instances specified for the application nodes and the load balancer instances through the configuration as given in loadbalancer.xml. Instances are launched, if the specified minimum number of instances is not found.
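The counting step described above can be sketched as follows in Python. The input stands for the result of a single "describe instances" call, reduced to (security group, state) pairs. The function names and the group name stratos-appserver are hypothetical, and no real EC2 API is called here.

```python
from collections import Counter


def count_running_and_pending(instances):
    """Count running and pending instances per security group.

    `instances` is an iterable of (security_group, state) pairs taken
    from ONE describe call, so no further API requests are needed.
    """
    counts = Counter()
    for group, state in instances:
        if state in ("running", "pending"):
            counts[group] += 1
    return counts


def instances_to_launch(minimum, running_and_pending):
    """How many instances are missing to reach the configured minimum."""
    return max(0, minimum - running_and_pending)


# Made-up result of one describe call covering the whole system.
described = [
    ("stratos-appserver-lb", "running"),
    ("stratos-appserver", "running"),
    ("stratos-appserver", "pending"),
    ("stratos-appserver", "terminated"),
]
counts = count_running_and_pending(described)
```

Counting pending instances along with the running ones avoids launching a second instance while the first one is still starting.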

Autoscale

autoscale() handles the autoscaling of the entire system by analyzing the load of each domain. This contains the algorithm - RequestsInFlight based autoscaling. If the current average of requests is higher than what can be handled by the current nodes, the system will scale up. If the current average is less than what can be handled by (current nodes - 1), the system will scale down.
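The scale up and scale down rule above can be written down as a short Python sketch. It assumes that the load a set of nodes can handle is the number of nodes multiplied by queueLengthPerNode, and that minAppInstances and maxAppInstances bound the result. The function name scaling_decision is hypothetical; the real logic lives in ServiceRequestsInFlightEC2Autoscaler.

```python
def scaling_decision(average_requests, running, queue_length_per_node,
                     min_instances, max_instances):
    """Return 'scale_up', 'scale_down', or 'none' for one service domain."""
    # Assumption: capacity grows linearly with the number of nodes.
    capacity = running * queue_length_per_node
    capacity_with_one_less = (running - 1) * queue_length_per_node

    if average_requests > capacity and running < max_instances:
        return "scale_up"
    if average_requests < capacity_with_one_less and running > min_instances:
        return "scale_down"
    return "none"
```

For example, with queueLengthPerNode at 400 and two running nodes, an average of 900 requests in flight exceeds the capacity of 800 and leads to a scale up, while an average of 300 could be handled by a single node and leads to a scale down.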

The autoscaling component spawns new instances, and once the relevant services successfully start running in the spawned instances, they join the respective service cluster. The load balancer starts forwarding the service calls or requests to the newly spawned instances, once they have joined the service clusters. Similarly, when the load goes down, the autoscaling component terminates the under-utilized service instances, after they have served the requests that were already routed to them.

StratosLive - A case study for WSO2 Load Balancer

In a cloud environment such as WSO2 StratosLive, auto-scaling becomes a crucial functionality. The system is expected to scale up and down with the dynamically changing load. Auto-scaling capabilities are sometimes provided by the Infrastructure as a Service (IaaS) providers themselves, such as the Autoscaling from Amazon. However, autoscaling is not necessarily a requirement that has to be fulfilled by an IaaS. Say, you are providing a Platform as a Service (PaaS) that is hosted directly on the native hardware, instead of on an IaaS. In that case, your PaaS should be able to provide the required autoscaling and load balancing capabilities to the applications that are hosted on top of your platform. WSO2 Load Balancer is such a software load balancer, which handles the load balancing, failover, and autoscaling functionalities.

WSO2 Load Balancer is used in production as a dynamic load balancer and autoscaler, as a complete software load balancer product. It is a stripped down version of WSO2 Enterprise Service Bus, containing only the components that are required for load balancing. WSO2 StratosLive can be considered a user scenario with WSO2 Load Balancer in production.

Multiple service groups are proxied by WSO2 Load Balancers. Some of the services have more than one instance to start with, to withstand the higher load. The system automatically scales up and down as the load goes high and low. WSO2 Load Balancer is configured such that the permanent or initial nodes are not terminated when the load goes low. Only the nodes that are spawned by the load balancer to handle the higher load will be terminated when the load goes low. Hence, it becomes possible to have different services running on a single instance, for the instances that are 'permanent', while the spawned instances will have a single carbon server instance.

Tuesday, December 13, 2011

scp - Copying files between two remote locations

Say, now you are going to copy a few files from a remote server to another. As usual, your remote_server_1 should be given the credentials to copy files to remote_server_2.

root@node2:~# scp -P 1984 -r /mnt/patches root@116.12.92.114:/mnt/patches

Usually, the key of your local computer has already been given the required permissions to access those remote locations. But since the access is not given to remote_server_1, it will prompt for the password of remote_server_2.

As a quick fix, you can copy the private key from your local computer to remote_server_1. However, this has security implications, and further discussion on those concerns can be found on the web.

scp -P 1984 ~/.ssh/id_rsa root@116.12.92.113:~/

Now if you encounter the below when trying,

root@node2:~# scp -P 1984 -r -i id_rsa /mnt/patches root@116.12.92.114:/mnt/patches
ssh_exchange_identification: Connection closed by remote host
lost connection

Have a look into the denied and allowed hosts of remote_server_2, and make sure that the IP of remote_server_1 is allowed and not denied.

vim /etc/hosts.deny
vim /etc/hosts.allow
#sshd sshd1 sshd2 : ALL : ALLOW
sshd: 116.12.92.113

Now, the scp command given above, should work as expected to copy the files from the remote_server_1 to remote_server_2.

Later update: I found rsync to be more efficient.

nohup rsync -avz /source root@116.12.92.113:/home/destination-root &

ssh: connect to host xxx.xxx.xxx.xx port 22: Connection refused lost connection

This is one of the most common errors thrown when trying to copy files over scp. The major reason for this is that the ssh port of the remote host is different from the default 22.

pradeeban@pradeeban:~$ scp -r /home/pradeeban/patches root@116.12.92.113:/mnt/patches
ssh: connect to host 116.12.92.113 port 22: Connection refused
lost connection

To fix this, use the -P flag, followed by the port number. Notice the upper case. The lower case -p is already taken: to be consistent with the -p usage of the cp command, it preserves the modification times and modes of the files.

pradeeban@pradeeban:~$ scp -P 1984 -r /home/pradeeban/patches root@116.12.92.113:/mnt/patches
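If you run such copies from a script, the command line can be assembled programmatically. Below is a minimal Python sketch that only builds the argument list and does not execute anything. The helper name build_scp_command is hypothetical; the -P, -r, and -i options are the standard scp flags used in this post.

```python
import shlex


def build_scp_command(source, destination, port=22, recursive=False,
                      identity_file=None):
    """Build the argument list for an scp call (nothing is executed)."""
    command = ["scp"]
    if port != 22:
        # Upper case -P sets the port; lower case -p preserves times and modes.
        command += ["-P", str(port)]
    if recursive:
        command.append("-r")
    if identity_file:
        command += ["-i", identity_file]
    command += [source, destination]
    return command


command = build_scp_command(
    "/home/pradeeban/patches",
    "root@116.12.92.113:/mnt/patches",
    port=1984,
    recursive=True,
)
# Render the list as a single shell-safe command line for display.
command_line = shlex.join(command)
```

The resulting list can be passed to subprocess.run(command) as it is, which avoids quoting problems with unusual file names.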

Find further useful tips at http://www.linuxtutorialblog.com/post/ssh-and-scp-howto-tips-tricks

Saturday, December 3, 2011

Before you start your localization..

Before you start your localization from Pradeeban Kathiravelu, Ph.D.

Thursday, December 1, 2011

[Google Code-In 2011] Localizing Haiku

This year, I joined Haiku as a mentor for Google Code-in (GCI) 2011. This post is specific to the GCI-2011 task that I have been mentoring, the localization of the Haiku operating system. I will write about GCI in a more generic post for the wider audience soon.

Get used to the system

Make sure that you follow the localization guidelines specific to the project. For the Haiku localizations with the Haiku Translation Assistant (HTA), make sure to pick the correct language from the drop-down on the right-hand side, under the label "Start Translating in...". If you are going to translate Haiku into Tamil, make sure to pick "Tamil". Also make sure that you have logged into the HTA before starting the localization.

For example, if you are translating,

http://hta.polytect.org/catalogs/view/2/ta to Tamil, that's correct.

But if you are trying to translate

http://hta.polytect.org/catalogs/view/2/en to Tamil, that's wrong.

Join the relevant localization lists to get more information on the localization efforts for the particular project.
[Haiku i18n mail address - haiku-i18n@freelists.org].

Translate only the strings. Not the notes below.
For example, in

Pager
Note: A small radio device to receive short text messages

translate only "Pager". Not the "Note:" below.

When the page is refreshed, HTA sometimes tends to reset itself to en_US. Hence make sure that you are still translating into your intended locale (for example, Tamil - ta) and not into en_US.


Tuesday, November 29, 2011

10 Points Before you start your localization..

I am mentoring the localization tasks of Haiku into Tamil for Google Code-In 2011, and hence thought of providing a few suggestions for localizations. Some of these suggestions will be specific to Tamil, while sharing a few common characteristics with other languages.

1) Use the standard terminology

Make sure that you have the necessary reference and the language's latest accepted technical glossary with you. Don't invent your own words or phrases. If you don't know a word, leave it blank, rather than filling it with your guesses.

If you find a word not in the glossary, try to find the meaning from the other reliable sources. If you have found a translation for a word, make sure the translation matches the standard. If an acceptable translation for a phrase is first found, share that with the other team members, and with their approval consider using the word in the translation. Words that are found not in the glossary should be noted down and later can be included in the Glossary.

Systems such as HTA, expect the localizations to be verified by the language maintainer or the mentor, before marking the translations as verified. That is, a translated word can be marked as faulty, by the language mentors.

2) Be consistent.

For example, I notice the use of "ஜன்னல்" and "சாளரம்" interchangeably, for the same context. Please stick to one. In this case, my recommendation is to use "சாளரம்". Don't ignore the existing conventions.

3) Don't use slang or spoken/broken language

Words like "இங்க" and "ஓடுது" are slang, and are grammatically wrong in a translation. Please use formal Tamil, and not any spoken variant of Tamil. We will reject the spoken forms of phrases, which are considered wrong in the written format.

If something is considered wrong in your Tamil lessons, they are wrong in localization too. We can't get broken or grammatically wrong localizations with wrong spellings into the project. :)

4) Translate as phrases

The phrases should be translated as a whole, and not word by word.

Let's take the phrase, "Update time interval:"

It should be translated as, "மேம்படுத்தல் நேர இடைவெளி" and not "மேம்படுத்தல் நேரம் இடைவெளி". This is something that differentiates the Indic languages from English.

Don't translate word by word. Instead, translate complete phrases. Phrases like "Add graph" should be translated as a whole in Tamil. Phrases like "சேர்க்கவும் (add) வரைபடம் (graph)" or "வரைபட சேர்க்கவும்" are not grammatically complete, and any native Tamil speaker can point that out. It should be "வரைபடத்தைச் சேர்க்கவும்".

"Do you want to stop" should be translated as "நிறுத்த வேண்டுமா?" (want to stop?), instead of "நீ நிறுத்த வேண்டுமா?". Here we omit, "நீ", as that is obvious.

5) Translate for the context.

Some words may have different meanings according to the context. Be careful when localizing them. "Them" may not be "அவர்களை" when it refers to the plural of "it". It should be "அவற்றை".

"written by:" should be "எழுதியவர்:". "எழுதப்பட்டது" doesn't make sense in this context.

Think of,
"written by:Raja"
"எழுதியவர்:ராஜா" will be natural.
"எழுதப்பட்டது ராஜா" doesn't make sense.

So translate for the context. Do not translate literally.

6) Be respectful to the user

Please do not use "நீ". Use "நீங்கள்" instead. Similarly, don't use "நிறுத்து"; it should be "நிறுத்தவும்". The program should address the user in a respectful manner. By the rules of Tamil, addressing the user in the "singular" is offensive, and we should not offend the user.

7) Locales

Be specific to the correct locale. If you are translating for ta-LK, consider its conventions, and remember that they can differ from those of ta-IN. Some projects do not distinguish between the locales. They just have the language code (ta), ignoring the potential minor differences between the locales.

8) Don't translate the control strings

For example, leave strings such as

%lld ms

as they are.

Don't introduce a blank space inside the format specifier. Translations such as

% lld நொடி

and

% lld MS

are invalid.

In short, keep the %lld exactly as it is, without any blank space in it.

Also, there is no need to transliterate units such as MB, as we use them as standards. Transliterating it as எம்பி doesn't make sense.
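
A check for this rule can be sketched in a few lines of Python. The sketch assumes printf-style control strings like the %lld above; the regular expression is a simplified pattern and not a full printf grammar, and the function names are hypothetical.

```python
import re

# Simplified pattern for printf-style specifiers such as %d, %s, %lld, %5.2f.
SPECIFIER = re.compile(
    r"%[-+#0]*\d*(?:\.\d+)?(?:hh|h|ll|l|j|z|t|L)?[diouxXeEfgGcs]"
)


def specifiers(text):
    """Return the control strings found in text, in order of appearance."""
    return SPECIFIER.findall(text)


def placeholders_preserved(source, translation):
    """True if the translation keeps the same control strings in the same order."""
    return specifiers(source) == specifiers(translation)


if __name__ == "__main__":
    for candidate in ["%lld ms", "% lld நொடி", "% lld MS"]:
        print(repr(candidate), placeholders_preserved("%lld ms", candidate))
```

The check only compares the control strings. It says nothing about whether a unit such as ms was left untouched, so that still needs a human eye.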

9) Don't just "Google Translate"

For example,

"CPU Usage" should be translated as "CPU பயன்பாடு"

whereas Google Translate renders it as

CPU Usage = CPU பயன்பாட்டை

Google Translate uses a learning algorithm, and is not always correct. Moreover, it is not complete for Indic languages such as Tamil. Please translate by yourself, since we mark those Google-translated phrases as "Faulty", as most of them can be translated using better vocabulary.

10) Easy translations first

There may be a few phrases that you are not able to translate. Focus first on the phrases that you can translate easily, rather than struggling with long phrases that may take more of your time.

P.S: This post is an updated version of a post that was written a long time back.

Saturday, November 26, 2011

The birth of viral contents over the Internet

Popular Content

It takes considerable effort and lots of dedication for a scientist to become popular. But someone who creates some creative content and uploads it to the Internet might get equally famous among a wider audience.

Getting Viral

A piece of content grabs the attention of millions and becomes an Internet meme by going viral: it is shared and spread over multiple online media. The content can be a video, a blog post, an image, or even an audio clip. Some content becomes popular due to its controversial nature, and some just because of people's curiosity. Social media interaction makes popular content more popular. Once a piece of content sparks a viewer's interest, he will probably come back to it (say, if it is a video or an audio clip), and also share it over social media for the people in his network to view. This leads to exponential growth in the popularity of the content. If an influential person shares your content with his circles of friends, most probably they will view it and share it further too.

Creating controversy or inducing curiosity

If we take YouTube, the most viewed videos are not necessarily the good ones. Most of these video clips have more 'dislikes' than 'likes', as people who clicked out of curiosity get disappointed with what they just saw. When the thumbnail image of the video shows some "cute stuff", it is very hard to resist the desire to click and view the clip. A sexy title and an attractive caption are an added advantage. However, when we realize that the clip has nothing interesting in it other than a mere ad, we 'dislike' it. Still the 'view' count increases, and the video remains popular. Some companies work to make their clients' content viral by creating controversy around it, by posing as multiple users, or simply by sharing the content over multiple media using multiple accounts.

Sparking the interest

There are a few genuine attempts that become viral through fans viewing and sharing them multiple times. The most commonly cited example is the YouTube clip "Yosemitebear Mountain Giant Double Rainbow 1-8-10", where someone shouts and expresses his extreme joy while looking at a double rainbow.

The Double Rainbow

It has got 31,595,276 views, 206,997 likes, 4,549 dislikes, and 91,157 comments. It was also made into a song, which has become equally viral, with almost the same number of views and likes. Comics have been written and many parodies have been created around the "Double Rainbow". According to an article on knowyourmeme.com, a tweet from Jimmy Kimmel was the major reason behind this video clip becoming popular. However, I personally do not support any such claim without strong evidence. Who knows - many others too may have shared and enjoyed the content in parallel.

Why this Kolaveri

Why This Kolaveri Di Full Song Promo Video in HD has got 6,263,365 views, 73,595 likes, 3,058 dislikes, and 30,361 comments within two weeks of being posted. Like all the other addictions, "Kolaveri" has proven to be yet another rising addiction. Once they have watched it, people keep watching it multiple times, and then start sharing it. This leads to exponential growth in popularity. If this continues, it will very soon overtake the best-known viral video - the "Double Rainbow" shout. It is a song sung by the Tamil actor Dhanush, a son-in-law of the Tamil superstar Rajinikanth. The song is sung in Tanglish, a Chennai slang that mixes Tamil and broken English, with simple words.

The girl in the green top in this clip is Shruti Hassan, the heroine of the movie "3", to which this song belongs. She is a daughter of Kamalhaasan (an award-winning Tamil actor and long-time competitor of Rajinikanth). The other girl in this song is Aishwarya Dhanush, Dhanush's wife, who directs this movie. The debuting music director, Anirudh, a nephew of Rajinikanth, can also be seen in this video. Everyone expected this song to become popular among Tamil cinema fans due to this stardom. Nevertheless, no one, including the producers of the song and the movie, expected it to become viral globally. The fact that the song is indeed sung in English, but with a south Indian accent and a touch of Tamil, must have helped it become popular among non-Tamil speakers.

how to ignore someone you love

For content to become viral, it should reach the common people, and should not target a narrow niche. Among my blog posts, how to ignore someone you love can be called somewhat viral. It is the third most viewed post in my blog, and has the highest number (46) of Facebook likes. I myself didn't expect that post to become popular, since I wrote it without much effort, unlike the technology blog posts that I wrote with much effort. The attractive title and the interesting, common topic of discussion - "ignoring facebook invitations" - must have attracted more readers, unlike the posts that are focused on a niche.

Creating viral content is not that easy, though. No one has found a proper formula to estimate how the human brain functions. We can create some interesting content, but the audience decides its success.

Friday, November 25, 2011

A tribute to DZone..

Everyone in information technology knows that DZone is a good place to find and read quality articles and blog posts. The recent MVB (Most Valuable Blogger) program is yet another addition to the services provided by DZone, alongside interesting zones such as Cloud Zone, Architect Zone, and many more. With the success of the concept of the zones, DZone started to introduce many microzones such as HTML5 Zone, DevOps Zone, and a few others.

MVB does not merely re-post the content as it is. It formats and improves the content prior to posting it, if necessary.

Here is an example:

My original post on "Amazon Autoscaling ~ Issue uploading payload?"

As it appeared on Cloud Zone: http://cloud.dzone.com/articles/amazon-autoscaling-issue

You can see DZone has actually improved the readability of the content by proper styling and syntax highlighting.

I encourage and recommend everyone who takes pride in their technology blog to become an MVB. Nothing is more encouraging than having our thoughts reach a wider audience. Long live DZone!

Tuesday, November 22, 2011

Auto Scaling with WSO2 Load Balancer

A load balancer is a crucial component in scalable architectures. WSO2 Load Balancer not only balances the load across the application instances, but also scales the system automatically to cater to the dynamically changing load. WSO2 Load Balancer is a WSO2 Carbon based product. In this post, we will look at how autoscaling works with the Load Balancer.

WSO2 Load Balancer ensures high availability and scalability in enterprise systems. It is used in cloud environments to balance the load across the server instances. An ideal use case of the Load Balancer is WSO2 StratosLive, where the service instances are fronted by the load balancers and the system scales automatically as the services get more web service calls. With the Apache Tribes group management framework, the Apache Axis2 clustering module, the Apache Synapse mediation framework, and the autoscaling component as its major building blocks, WSO2 Load Balancer becomes a complete software load balancer that functions as an autoscaler and a dynamic load balancer.

Architecture

WSO2 Load Balancer can be configured to function as a load balancer with autoscaling on the supported infrastructure. Currently the autoscaler supports the EC2 API. Thus the Load Balancer can be configured as a dynamic load balancer with autoscaling on Amazon EC2 and other infrastructures compatible with the EC2 API. The autoscaling component uses ec2-client, a Carbon component that functions as a client for the EC2 API and carries out the infrastructure-level functionalities. Spawning/starting a new instance, terminating a running instance, managing the service groups, and mapping the elastic IPs are a few of the infrastructure-related functionalities that are handled by the autoscaling component.
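
The kind of decision an autoscaler has to make can be sketched as follows. This is only an illustration in Python under a simple assumed policy (keep enough instances so that each handles at most a fixed number of in-flight requests). It is not WSO2 Load Balancer's actual algorithm, and the function and parameter names are hypothetical. In a real deployment, the resulting action would be carried out through the EC2 API.

```python
import math


def scaling_action(running, in_flight, per_instance_capacity,
                   min_instances, max_instances):
    """Return ("spawn" | "terminate" | "none", count) under the assumed policy."""
    # Instances needed so that none exceeds the assumed per-instance capacity.
    needed = math.ceil(in_flight / per_instance_capacity)
    # Clamp to the configured bounds.
    desired = max(min_instances, min(max_instances, needed))
    if desired > running:
        return ("spawn", desired - running)
    if desired < running:
        return ("terminate", running - desired)
    return ("none", 0)


if __name__ == "__main__":
    print(scaling_action(2, 950, 100, 1, 20))
    print(scaling_action(5, 120, 100, 1, 20))
```

A production autoscaler would also have to smooth the load over time, so that a short spike does not spawn and terminate instances back to back. The sketch leaves that out.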

2) How to configure the WSO2 Load Balancer for Auto Scaling?
3) StratosLive - A case study for WSO2 Load Balancer

Resources

Blog posts

WSO2 StratosLive - An Enterprise Ready Java PaaS

Auto Scaling with Amazon EC2 - I

Auto Scaling with Amazon EC2 - II

WSO2 Load Balancer - how it works

Moving from a 'Platform' to the 'Platform-as-a-Service' ~ What is it all about?

Getting Started with WSO2 Cloud Virtual Machines for Amazon EC2

Auto scaling web services on Amazon EC2

Summer School 2011 - Platform-as-a-Service: The WSO2 Way

- Presentation Slides

- Recording

Building SaaS for SMEs on WSO2 PaaS - Wednesday, 9th November 2011

Saturday, November 19, 2011

Time/Money Duality

This post can be considered part II of one of my previous posts - LATE. The movie "In Time" was the major motivation behind this post.

Time spent and perceived.

I have heard the phrases "Don't waste money!" and "Don't waste time" more often than any other suggestion asking not to waste something. "Don't waste electricity", "Don't waste water", and other similar suggestions are indeed derivatives of "Don't waste money", or are driven by sentiments such as "Don't waste food - give it to the poor instead!" We can simply conclude that "Money" and "Time" are considered two equivalent and most valuable assets. We spend money to save time, and also spend time to save or earn some money. I see this duality as the reason behind the human routine. Everyone makes the world a better place to live ('better' is a relative term, so someone's better may be another's worse), at least by a tiny bit, through their job and otherwise, by investing their time.

Being a complex quantity, Time has its own real and imaginary parts. Each of us has 24 hours. But the effective time differs from person to person. I feel that, in terms of physics, we can't rigidly define Time as a vector or a scalar. Maybe we should research further on the nature of the multi-dimensional time!

If we consider time as a complex number, what we measure will be the time's projection on the x-axis, the real part of the complex quantity. When we are waiting for something or someone, even a few minutes feel like an hour - we can explain this using the above "Complex-time" concept.

"Busy" is a relative term. I can be busy for task-1 or person-1, but may be available for task-2 or person-2.

The time and the money spent and the duality

"Can you spend five minutes with me regarding this project?"

"Sorry, I am afraid. I have to catch the train in 5 minutes to my home."

"Oh, it is fine. I am also on the same route. Let's discuss on the train"

Now, I am not busy for the discussion, since the talk is not going to consume my time.

Currently, it is impossible for us to travel through time, or to purchase it. Whether we spend much time or less of it, we can earn time only relatively, not absolutely. In natural terms, we can't earn time, but can only spend it effectively. Time does have a monetary value. The movie In Time attempts to make Time the money, focusing on the time-money duality. It discusses the sharing of time, and the transfer of time between individuals. The rich have more time, and the poor have less.

Since Time is used as money, the rich have more of it, letting them live almost forever, whereas the poor keep running in search of time, awaiting their end every day. For them, "tomorrow is a luxury they (you) can't afford" and even idling becomes costly (of course, idling costs in the real world too, on the time scale, as in the above image).

I wish we could buy some time in the future, utilizing this time/money duality.

Friday, November 18, 2011

3 Most Annoying Status Updates in Facebook!

Expressing love for their dad!

1) Tribute over the Facebook.
"May I ask a personal favor, only some of you will do it (and I know who you are). If you know someone who has fought cancer and passed away, or someone who is still fighting, please add this to your status for 1 hour as a mark of respect and remembrance, I hope I'm right about the people who will.. Let's save the world from cancer by posting.. ♥"

Come on! You are NOT contributing anything by posting or re-posting that stupid status for 1 hour. You are just annoying the other users of Facebook. To make it worse, somehow these posts tend to become famous and spread like viruses. You can even notice the 50+ likes. And believe me, these guys have never shown love to cancer patients outside Facebook!

2) Stupid claims 97%

"All of us have a thousand wishes. To be thinner, have more money, a new phone. A cancer patient only has one wish, to kick the cancer . I know that 97% of you won't post this as your status, but my friends will be the 3% that do. In honor of someone who died, or is FIGHTING cancer, post this for at least one hour."

"NO.. I have heard about many cancer patients who had greater wishes.. for their family or even for the country.. The above status is just an insult to cancer patients.. ewwww... :("

3) I love my brothers (over the facebook)!
<- When someone starts to love his/her parents, siblings, spouse, friends, etc. *using* facebook.

People are such sweethearts (caring for everyone and loving them over facebook (errr...) by posting statuses that make no sense at all!) :P

Sunday, November 13, 2011

Are you allowed to choose your name?

Facebook, Google+, and probably many other sites do not allow accounts to be registered under a name that looks "artificial". That is, you can register an account only under a human name. The reason given is that they do not want people registering *fake* accounts, and welcome only *real* persons!

Names such as the following are not allowed, or are at least challenged.

1) Kathiravelu பிரதீபன் - You can't mix two language scripts, though this one indeed is my name, where the first name is written in English, and the last name in Tamil script.

2) Prad33ban, Pradeeban_, or Pradeeb@n - Numbers or special characters are not allowed in the names. Only the letters of a single script are allowed.

3) Pradz or PradROX - Suspicious name, most probably a fake!

4) Rock Buddy, Superman, Monkey Gurl, or Fen0023 - Doesn't look real.

5) Pearl Kitty or Brownie - Kitties and puppies aren't allowed!

6) K.K, PDN, pra, or pdn - Initials, pen names, or pseudonyms aren't allowed.

7) Dr.Vijay - Salutations aren't allowed.

8) Sucker - Sounds offensive.

9) kaThiRaVeLu PrADeeBaN - Improper capitalization.

10) Fxx or Bot - Bots are not allowed.

11) Double Rainbow, Firefly, or Stone - Natural objects and insects! No!

12) Colombo Library, Llovizna&Sons, or OpenGroupForum - Libraries, Companies, or organizations aren't allowed to create a profile. Use a 'group' or a 'page' instead.
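
A few of these rules are mechanical enough to sketch in code. The Python below is only a guess at how rules 1, 2, and 9 could be checked, assuming that the script of a character can be read from the first word of its Unicode name. It is not how Facebook or Google+ actually validate names, and the function name and messages are hypothetical.

```python
import unicodedata


def name_problems(name):
    """Return a list of rule violations found in a display name."""
    problems = []
    scripts = set()
    for ch in name:
        if ch.isspace():
            continue
        if unicodedata.category(ch)[0] in ("L", "M"):
            # First word of the Unicode character name, e.g. LATIN or TAMIL.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
        elif "digit or special character" not in problems:
            problems.append("digit or special character")
    if len(scripts) > 1:
        problems.append("mixed scripts")
    # Each word should start with a capital and continue in lower case.
    if any(word != word.capitalize() for word in name.split()):
        problems.append("improper capitalization")
    return problems


if __name__ == "__main__":
    for candidate in ["Kathiravelu பிரதீபன்", "Prad33ban", "kaThiRaVeLu PrADeeBaN"]:
        print(candidate, name_problems(candidate))
```

Rules such as "doesn't look real" or "sounds offensive" cannot be reduced to checks like these, which is probably why such names are only challenged rather than rejected outright.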

This regulation has met severe opposition from those who prefer to keep their online identity hidden or are interested in having a second life. Some of them choose to have a second life that is independent of their real, offline, or *first* life. They have a valid reason to have a different online identity, I feel. Whatever the name, even if it sounds like a bot, a library, or a kitten, there is still a human behind it. What matters the most is that no one's privacy should be violated.

However, these rules do not stop the fake profile creators anyway. They just create fake profiles under real names. People are getting deeper into fake stuff - SCIgen, which allows you to generate fake papers, is an example. It is mostly used with a good motive, of course - to identify fake conferences by getting them to accept the auto-generated fake papers.

Saturday, November 12, 2011

Copyrights - "Safely ignored"s in the Internet

I recently found an exact copy of one of my blog posts in another blog, without any credit or pointer to my blog. I tried to comment on his blog post with a link to the original post. He never approved my comment. Hence I decided to report his blog to Google for the copyright violation.

In the topmost banner of the Blogger blogs, you can see "Report Abuse". I just clicked it, and reported the post.

Google replied with,

Thanks for reaching out to us!

We have received your legal request. We receive many such complaints each
day; your message is in our queue, and we'll get to it as quickly as our
workload permits.

Due to the large volume of requests that we experience, please note that
we will only be able to provide you with a response if we determine your
request may be a valid and actionable legal complaint, and we may respond
with questions or requests for clarification. For more information on
Google's Terms of Service, please visit http://www.google.com/accounts/TOS

We appreciate your patience as we investigate your request.

Regards,
The Google Team

Within a few hours, Google took down the page that violated the copyright in accordance with the DMCA (the Digital Millennium Copyright Act of 1998), and sent me this message.

Hello,

Thanks for reaching out to us.
In accordance with the DMCA, we have completed processing your
infringement complaint and the content in question no longer appears on
the following URL(s):
http://{blog-name}.blogspot.com/2011/07/{post}*.html
Please let us know if we can assist you further.
Regards,
The Google Team

* I have removed the blog url to avoid harassing the blogger who copied the blog post.

Llovizna by Kathiravelu Pradeeban is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

One of my friends asked why I would report that person's blog post. My friend said that the person who copied my post is indeed helping me by spreading my thoughts to the followers of his blog.

No, it doesn't work that way. If someone finds his post in a web search instead of mine, I will not be able to interact further with the reader. If my viewpoint is challenged in that person's blog post, he will not be in a position to advocate for my thoughts. Most probably he himself would have forgotten the original post that he copied from, leaving the discussion in a nowhere-zone.

In print, everyone takes utmost care about copyrights. But online, it is taken for granted that anyone can violate others' copyrights. It is common to see posts copied from the web, even in newspapers, credited as "Thanks: The Internet" or "Thank you: The web". No one bothers to give the exact url or the author of the post, which is in fact a bad practice, and as offensive as any other pirated material.

I, however, support open knowledge. Some restrictive licenses prevent others from doing anything with the content other than merely reading and understanding it. Licenses should be open, just like the open source licenses. In "Llovizna" (and everywhere else, online or offline), I made sure to use only content or images that are in the public domain, or to give credit to the original license holder whenever I reused others' content. Many images with permissive licenses, or that are in the public domain, can be found on Wikipedia or Wikimedia. We need more of them.

Sharing is not just copy-pasting. It should add value to the original content, while giving it the appreciation that it deserves, and engaging with it.