Showing posts with label Localization. Show all posts

Saturday, December 3, 2011

Before you start your localization..

Thursday, December 1, 2011

[Google Code-In 2011] Localizing Haiku

This year, I joined Haiku as a mentor for Google Code-in (GCI) 2011. This post is specific to the GCI 2011 task that I have been mentoring: the localization of the Haiku operating system. I will write about GCI in a more generic post for a wider audience soon.

Get used to the system
Make sure that you follow the localization guidelines specific to the project. For the Haiku localizations with the Haiku Translation Assistant (HTA), make sure to pick the correct language from the drop-down on the right-hand side, under the label "Start Translating in...". If you are going to translate Haiku into Tamil, make sure to pick "Tamil". Also make sure that you have logged in to HTA before starting the localization.

For example, if you are translating,

But if you are trying to translate

Join the relevant localization lists to get more information on the localization efforts for the particular project.
[Haiku i18n mail address -  haiku-i18n@freelists.org].

Translate only the strings. Not the notes below.
For example,
in
Pager
Note: A small radio device to receive short text messages
Translate only "Pager". Not the "Note:" below.

When the page is refreshed, HTA sometimes tends to reset itself to en_US. Hence make sure that you are still translating into your intended locale (for example, Tamil - ta) and not into en_US.


Tuesday, November 29, 2011

10 Points Before you start your localization..

I am mentoring the localization tasks of Haiku into Tamil for Google Code-In 2011, and hence thought of providing a few suggestions for localizations. Some of these suggestions will be specific to Tamil, while sharing a few common characteristics with other languages.

1) Use the standard terminology
Make sure that you have the necessary reference and the language's latest accepted technical glossary with you. Don't invent your own words or phrases. If you don't know a word, leave it blank, rather than filling it with your guesses.

If you find a word that is not in the glossary, try to find its meaning from other reliable sources. If you have found a translation for a word, make sure the translation matches the standard. If you are the first to find an acceptable translation for a phrase, share it with the other team members, and use it in the translation only with their approval. Words that are not found in the glossary should be noted down so that they can be included in the glossary later.

Systems such as HTA expect the localizations to be verified by the language maintainer or the mentor before the translations are marked as verified. That is, the language mentors can mark a translated word as faulty.

2) Be consistent. 
For example, I notice the use of "ஜன்னல்" and "சாளரம்" interchangeably, for the same context. Please stick to one. In this case, my recommendation is to use "சாளரம்". Don't ignore the existing conventions.
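
A consistency check like this can be partly automated. The following is only a minimal sketch: the `PREFERRED` table and the sample strings are hypothetical, and it is not something HTA provides. It uses plain substring matching, so it will miss inflected forms of a word.

```python
# Hypothetical glossary: preferred term -> variants that should not be mixed in.
PREFERRED = {"சாளரம்": ["ஜன்னல்"]}


def inconsistent_terms(translations, preferred=PREFERRED):
    """Return (string, variant, preferred term) for each discouraged variant found."""
    problems = []
    for text in translations:
        for term, variants in preferred.items():
            for variant in variants:
                # Plain substring test; inflected forms are not detected.
                if variant in text:
                    problems.append((text, variant, term))
    return problems


if __name__ == "__main__":
    # Hypothetical translated strings, for illustration only.
    sample = ["புதிய சாளரம்", "புதிய ஜன்னல்"]
    for text, variant, term in inconsistent_terms(sample):
        print(f"{text!r}: uses {variant!r}, glossary prefers {term!r}")
```

Running a check like this over all the translated strings before submitting them makes it easier to catch mixed terminology early.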

3) Don't use slang or spoken/broken language
Words like "இங்க" and "ஓடுது" are slang forms and are grammatically wrong in writing. Please use formal Tamil, not any spoken variant. We will reject spoken forms of phrases that are considered wrong in written Tamil.

If something is considered wrong in your Tamil lessons, it is wrong in localization too. We can't accept broken or grammatically wrong localizations with wrong spellings into the project. :)

4) Translate as phrases
The phrases should be translated as a whole, and not word by word.

Let's take the phrase, "Update time interval:"
It should be translated as, "மேம்படுத்தல் நேர இடைவெளி" and not "மேம்படுத்தல் நேரம் இடைவெளி". This is something that differentiates the Indic languages from English.

Don't translate word by word. Instead, translate complete phrases. Phrases like "Add graph" should be translated as a whole in Tamil. Phrases like "சேர்க்கவும் (add) வரைபடம் (graph)" or "வரைபட சேர்க்கவும்" are not grammatically complete, and any native Tamil speaker can point that out. It should be "வரைபடத்தைச் சேர்க்கவும்".

"Do you want to stop" should be translated as "நிறுத்த வேண்டுமா?" (want to stop?), instead of "நீ நிறுத்த வேண்டுமா?". Here we omit, "நீ", as that is obvious.


5) Translate for the context.
Some words may have different meanings according to the context. Be careful when localizing them. "Them" may not be "அவர்களை" when it refers to the plural of "it". It should be "அவற்றை".

"written by:" should be "எழுதியவர்:". "எழுதப்பட்டது" doesn't make sense in this context.

Think of,
"written by:Raja"
"எழுதியவர்:ராஜா" will be natural.
"எழுதப்பட்டது ராஜா" doesn't make sense.

So translate for the context. Do not translate as it is.

6) Be respectful to the user
Please do not use "நீ". Use "நீங்கள்" instead. Similarly, don't use "நிறுத்து". It should be "நிறுத்தவும்". The program should refer to the user in a respectful manner. We should not offend the user by addressing them in the singular, as is the rule in Tamil.

7) Locales
Be specific to the correct locale. If you are translating for ta-LK, consider the conventions involved, and remember that these can differ from ta-IN. Some projects do not distinguish the locales. They just have the language code, ignoring the potential minor differences between the locales.
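
To illustrate how a project might choose between ta-LK, ta-IN and a plain ta translation, here is a minimal sketch of a locale fallback. The function names are my own and do not come from any particular project; real projects may resolve locales differently.

```python
def fallback_chain(locale_tag):
    """Return the lookup order for a tag, e.g. 'ta_LK' -> ['ta-LK', 'ta']."""
    parts = locale_tag.replace("_", "-").split("-")
    language = parts[0].lower()
    chain = []
    if len(parts) > 1:
        # Most specific first: language plus region.
        chain.append(f"{language}-{parts[1].upper()}")
    # Then the bare language code.
    chain.append(language)
    return chain


def pick_catalog(locale_tag, available):
    """Return the best matching available translation, or None."""
    for candidate in fallback_chain(locale_tag):
        if candidate in available:
            return candidate
    return None


if __name__ == "__main__":
    print(fallback_chain("ta_LK"))
    print(pick_catalog("ta-LK", {"ta", "en"}))
    print(pick_catalog("ta-LK", {"ta-IN"}))
```

Note that in this sketch a ta-LK user is never silently given ta-IN. When a project only ships a single "ta", that one translation serves both regions, which is why the conventions chosen for it matter.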

8) Don't translate the control strings
For example, leave strings such as
%lld ms
as they are.
Don't introduce a blank space inside the control string. Translations such as
% lld நொடி
and
% lld MS
are invalid.
Also, there is no need to transliterate units such as MB, as we use them as standards. Writing it as எம்பி doesn't make sense.
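
A check for this rule is easy to script. The sketch below is an illustration only and is not part of HTA. The regular expression covers common printf-style control strings such as %lld, %s and %5.2f, and it deliberately does not treat a space after the % as valid.

```python
import re

# Matches printf-style control strings such as %lld, %s, %d, %5.2f, and the literal %%.
PLACEHOLDER = re.compile(
    r"%(?:%|[-+#0]*\d*(?:\.\d+)?(?:hh|h|ll|l|L|z|j|t)?[diouxXeEfgGcsp])"
)


def placeholders(text):
    """Return the control strings found in text, in order of appearance."""
    return PLACEHOLDER.findall(text)


def control_strings_preserved(source, translation):
    """True if the translation keeps exactly the control strings of the source."""
    return sorted(placeholders(source)) == sorted(placeholders(translation))


if __name__ == "__main__":
    source = "%lld ms"
    for candidate in ["%lld ms", "% lld நொடி", "% lld MS"]:
        status = "ok" if control_strings_preserved(source, candidate) else "invalid"
        print(f"{candidate!r}: {status}")
```

The comparison ignores the order of the control strings, since a translated sentence may legitimately need them in a different position.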

9) Don't just "Google Translate"
For example,
"CPU Usage" should be translated as "CPU பயன்பாடு",
whereas Google Translate renders it as
CPU Usage = CPU பயன்பாட்டை.

Google Translate uses a learning algorithm and is not always correct. Moreover, its coverage of Indic languages such as Tamil is incomplete. Please translate by yourself; we mark Google-translated phrases as "Faulty", since most of them can be translated using better vocabulary.


10) Easy translations first
There may be a few phrases that you are not able to translate. Focus first on the phrases that you can translate easily, rather than struggling with long phrases that take more time.
P.S: This post is an updated version of a post that was written a long time back.


Sunday, April 4, 2010

Install *this* font before reading!

Recently I was going through the online version of the teachers' guides from the National Institute of Education Sri Lanka (NIE). The major issue I noticed in those documents was that they have been written using non-Unicode fonts (Sinhala - DL-Manel-bold and Tamil - Bamini). Converting them to Unicode is not very difficult, given the UCSC converters. However, the practical difficulty of using non-Unicode fonts in shared documents is that the other users are expected to have that particular font on their system. Copy-pasting into a plain text editor becomes impossible. Still, I have seen many web sites asking visitors to download a particular font to view the site correctly. Some sites even ask the users to use Internet Explorer or some other specific browser. I would rather request them to change their site to Unicode than ask each user to install their not-so-sexy font.

Here is a sample of the above mentioned issue.
mEka Odjlh - Written using DL-Manel-bold, which appears as garbage text here.
පෑන් ධාවකය - After converting to Unicode using the UCSC Font converter.

Similarly for Tamil
Ngid nrYj;jp - Bamini
பேனை செலுத்தி - After converting to Unicode using the UCSC Font converter.
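
The reason the non-Unicode samples look like garbage is that the stored characters are ordinary Latin letters and punctuation; only the font draws them as Sinhala or Tamil glyphs. The short sketch below (my own illustration, using the samples above and the Unicode block ranges for Tamil and Sinhala) shows which Unicode blocks each string actually uses.

```python
# Code point ranges of the relevant Unicode blocks.
BLOCKS = {"Sinhala": (0x0D80, 0x0DFF), "Tamil": (0x0B80, 0x0BFF)}


def block_of(ch):
    """Name the block a character belongs to (only the blocks needed here)."""
    cp = ord(ch)
    for name, (low, high) in BLOCKS.items():
        if low <= cp <= high:
            return name
    return "Basic Latin" if cp < 0x80 else "Other"


def blocks_used(text):
    """Return the set of blocks used by the non-space characters of text."""
    return {block_of(ch) for ch in text if not ch.isspace()}


if __name__ == "__main__":
    for sample in ["Ngid nrYj;jp", "பேனை செலுத்தி", "mEka Odjlh", "පෑන් ධාවකය"]:
        print(f"{sample!r}: {sorted(blocks_used(sample))}")
```

To a search engine, a spell checker or a plain text editor, the Bamini and DL-Manel-bold strings are just Latin text, which is why they cannot be searched, sorted or processed as Tamil or Sinhala.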

During the recent series of IT seminars in the Jaffna peninsula, I also noticed that the students who mentioned that they can type in Tamil were actually referring to their ability to type using the Bamini font. I guess we have created an awareness of Unicode among them. Unicode is essential if we really want to go beyond mere typing in local languages.

Sunday, August 30, 2009

Let's do FOSS!

Free and Open Source Software (FOSS) projects are community maintained, and most of them depend on volunteers for their existence. You can help them in many ways, regardless of your computer literacy. I have been asked several times, "How can I contribute to a free and open source software project?". So finally I thought of sharing some of my understanding regarding this. This post is just how I see things going there, and not the solution.

1. First make sure that you really have some time, interest in the particular project, willingness, and the basic domain knowledge. Though contributions often come from developers, that is not the only possibility. Search the web and find an open source community that matches your interest.

2. You should have a basic knowledge of version control systems [SVN, Git, or CVS] and project management tools like Maven, and be able to use them, as they are common in most open source projects.

3. Get some experience in building projects from source and creating and applying simple patches for the projects.

4. Search the web and find an open source community that suits your interest. You can find some interesting FOSS communities from the list of participating organizations in Google Summer of Code.

5. Communities like Abiword are small, and there are huge communities like Apache as well, which have sub-communities themselves. In a small community, you will be dealing with the entire community, whereas in a large community, you will again have to pick a suitable project, which will have a 'sub-community' of its own.

6. Read the online references and make sure that new developers are welcome in the community at the given time. Almost all FOSS communities encourage newcomers, as there is nothing like 'NO VACANCY' in FOSS communities. You are always more than welcome, as a developer, technical writer, or translator.

7. Check the possible areas where you can enter. Join the user and developer mailing lists, hang out on the community IRC, and listen to the ongoing architecture and design level discussions among the developers. You can try to break the ice with other developers via IRC.

8. Build the project from the source, check for possible additions and improvements, and check the mail archives and the bug database for bugs or future implementations to which you can contribute.

9. Now make sure that you have got some idea and understanding about the community, codebase, their practices, and how you are going to contribute.

10. Write a descriptive mail to the developer list, clearly explaining your interest in becoming a contributor to the community. Don't forget to include the design-level details of your suggestions, your time frame, and the amount of contribution you can offer. Show your strong interest by making this ice-breaker impressive to the community.

11. You can also ask about the potential contributions you can make to the community. Mostly you will get a reply, personally or through the mailing list, pointing you to the possible areas you can work on. Pick a suitable one from the suggested projects, considering your own interests, and make sure to describe and confirm your involvement in the project to the community.

12. After discussing your interest in contributing with such a project community, you can start by localizing the products into your local languages, polishing the documentation, writing reviews and blog posts, providing ideas, spreading the word, or by other possible means that are specific to the community, apart from being a developer. Open source projects are for the users. Being a user is itself a great contribution to a project. You can help the community with timely updates on bugs and requests for the feature enhancements that you expect. By being an active contributor you can earn self-satisfaction and recognition.

13. Make sure to stay in close contact with your community once you start implementing your idea (or contributing to the community in some other way). Feel free to ask the developers questions via the mailing list or even personal emails.

14. Make sure to stay active in your community, and continue contributing once you become a committer/community developer. Always try to welcome the newcomers and help them become active contributors.

Let's do FOSS.

[This post is based on the experience I got from Abiword community as a GSoC Student.]

Sunday, May 3, 2009

Computing for all

I have recently started to translate Abiword to ta-LK. It should be noted that ta-IN already exists for Abiword, so from a language point of view a Tamil localized form of Abiword can already be found. Still, ta-LK has some clear differences from ta-IN and follows different standards. If a Sinhala translation of Abiword starts in the near future, I hope we can think of making Abiword the national localized word processor of Sri Lanka.

Yesterday I attended the LAKAPPS fellowship event, which was organized to mark the success of the LAKAPPS team in the concept "Computing in Sri Lankan languages." Abiword, with its adaptability to the OLPC system, suits the position in localized computing for the rural community. I hope the LAKAPPS team will go forward in its effort towards better computing for all Sri Lankans.

Wednesday, April 15, 2009

Internationalization & Localization

Internationalization (i18n) - The process of designing a software application so that it can be adapted to various languages and regions without engineering changes.

Localization (L10n) - The process of adapting software for a specific region or language by adding locale-specific components and translating text.

Globalization (g11n) / Native Language Support (NLS) - The combination of internationalization and localization.

Locale - A set of parameters that defines the user's language, country and any special variant preferences that the user wants to see in their user interface.

Localizability (L12y) - The degree to which a software product can be localized.

Resource - A part of a program that can appear to the user or be changed or configured by the user. It is the data of the program, as opposed to its code.

Core product - The language independent portion of a software product.



Compiled from:
* Wikipedia
* Mozilla Internationalization & Localization Guidelines
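
To make the terms "resource" and "core product" concrete, here is a minimal sketch. The `RESOURCES` table, its keys and the `message` function are hypothetical and not taken from any of the projects mentioned; the Tamil strings are the ones discussed in the localization tips.

```python
# Resources: user-visible strings, kept apart from the code (hypothetical data).
RESOURCES = {
    "en": {"add_graph": "Add graph", "written_by": "Written by: {name}"},
    "ta": {"add_graph": "வரைபடத்தைச் சேர்க்கவும்", "written_by": "எழுதியவர்: {name}"},
}


def message(key, locale, **values):
    """Core product: a language-independent lookup that falls back to English."""
    catalog = RESOURCES.get(locale, {})
    template = catalog.get(key, RESOURCES["en"][key])
    return template.format(**values)


if __name__ == "__main__":
    print(message("add_graph", "ta"))
    print(message("written_by", "ta", name="ராஜா"))
    print(message("add_graph", "si"))  # no Sinhala resources yet, so English is used
```

Internationalization is what makes `message` work for any locale without code changes; localization is the act of adding another entry, such as "si", to the resources.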

Saturday, April 11, 2009

Use of Unicode in AbiWord - Initial Discussions

This is a mail thread in the Unicode mailing list, started by one of the Abiword developers during the earliest stages of Abiword.
http://unicode.org/mail-arch/unicode-ml/Archives-Old/UML014/0787.html
The thread contains a wide range of discussions on the topic.
Even if you are not an Abiword developer, this mail thread is worth a look, since it has the views of many developers about Unicode applications.