Unicode for Indian language websites
I wrote another article for the Prajasakti Telugu daily newspaper. In it, I describe the problems with using non-standard encodings on Indian language websites, and the solutions available to users and content providers for converting their content to Unicode.
Unlike the other articles I have posted, this one is tomorrow’s article today.
W3C-India: Workshop on Internationalisation
I attended the Workshop on Internationalisation conducted by the W3C-India office at Noida on the 3rd and 4th of August 2006.
The workshop was organised by the W3C Indian office. CDAC and the Department of Information and Communications Technology (MIT) are the major participants in the W3C Indian office. W3C sets standards such as HTML, XHTML, XML, DOM, CSS, MathML and SSML. The W3C Indian office represents India in W3C, collects India-specific requirements for W3C standards and contributes to W3C standards in general.
The two foreign attendees from W3C were members of the W3C internationalisation working group. Their six sessions over the two days of the workshop explained the current status of the internationalisation provisions in W3C standards and the India-specific issues they were already aware of. They hoped the workshop would surface more issues related to Indian languages, along with specific rules for resolving the issues already recognised.
Unfortunately, most of the other presentations and discussions in the workshop were off-topic and repetitive. They included introductory talks on localisation and internationalisation of software applications in general (and not the web), machine translation, optical character recognition, corpus research, linguistic issues, speech recognition, speech synthesis research (which ignored SSML, although that is more relevant in terms of standardisation!) and the internationalisation capabilities of IBM and Infosys.
The major contributions to the core issue of the workshop came from CDAC’s W3C-India language teams at Noida, Pune, Kolkata and Trivandrum. These teams have worked on issues with W3C standards, including the upcoming CSS3 standard. They pointed out issues with vertical text, justification of text, list style types, first-letter drop caps, etc., which is the kind of stuff that the workshop was meant to discuss.
However, the question of whether the standards have been comprehensively analysed was not answered. The second day saw a discussion on internationalised domain names and their implementation in India (for the .in TLD) by the National Informatics Centre (NIC).
My Participation:
During the panel discussion, I pointed out the lack of effort to propagate W3C standards in the Indian web community. The major problem facing the growth of the Indian language web today is the use of proprietary fonts and encodings to build websites. This is halting the progress of standards like Unicode. Further, Indian language websites are crippled: users cannot search the site, find its content through Google, or participate by posting comments.
Yet the W3C office, whose major participants are CDAC and MIT, has not taken steps to encourage the use of Unicode or other standards. Some government websites even use proprietary encodings. CDAC has not released its fonts under a proper license, which kills any prospect of publishers switching to Unicode using those fonts. The problem is not limited to Unicode: government sites lack a commitment to even basic standards like HTML. For example, presidentofindia.nic.in does not follow W3C standards. I asked the workshop attendees to commit deeply to following standards and to encourage the use of these standards in the rest of India.
I also met a lot of people who are likely to help or work along to achieve common goals.
Firefox 1.0.4 Indic enabled build
The Firefox Indic enabled build 1.0.3 has security problems simply because the official Firefox 1.0.3 has them. Mozilla released an update to fix the problems, and so did we. Manish, my team mate, did a build for Firefox 1.0.4.
Get the newer version if you are using the older one. Here is an old screenshot.
This time too we applied the same patches as in the previous version, only on the official Firefox 1.0.4 sources.
Update: A known problem with these builds is that if you have the latest version of GTK+ installed on your system, for example on Debian Sid, you will have problems running Firefox after installation. To fix this, simply remove the libpango* files from the installation folder.
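The workaround above can be sketched as a small shell function. The installation folder is not named in the post, so it is passed as an argument, and the path in the usage comment is purely hypothetical.

```shell
# remove_bundled_pango INSTALL_DIR
# Delete the libpango* files that ship with the Indic enabled build so that
# Firefox uses the pango libraries already installed on the system.
# Only files directly inside INSTALL_DIR are touched.
remove_bundled_pango() {
    install_dir="$1"
    if [ ! -d "$install_dir" ]; then
        echo "no such directory: $install_dir" >&2
        return 1
    fi
    for lib in "$install_dir"/libpango*; do
        # When nothing matches, the pattern is left as-is; skip it.
        [ -e "$lib" ] || [ -L "$lib" ] || continue
        echo "removing $lib"
        rm -f -- "$lib"
    done
}

# Example (hypothetical path):
#   remove_bundled_pango "$HOME/firefox"
```

It is probably wise to copy the files somewhere else first, in case the system's own pango turns out to be older than the one the build expects.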
Firefox 1.0.3 Indic enabled build
Along the lines of the previous Firefox Indic enabled build 1.0.2, Manish did a build for Firefox 1.0.3.
The Sarovar release is still not done. For now, it is on this page. Get the newer version if you are using the older one. Here is an old screenshot.
This is the list of patches applied to it:
# pango patches
mozilla-1.7.3-pango-render.patch
firefox-1.0-pango-selection.patch
firefox-1.0-pango-space-width.patch
firefox-1.0-pango-direction.patch
firefox-1.0-pango-rounding.patch
# local bugfixes
firefox-PR1-stack-direction.patch
# official upstream patches
firefox-PR1-pkgconfig.patch
mozilla-1.7.3-xptcall-s390.patch
firefox-1.0-xptcall-s390.patch
firefox-1.0-nspr-s390.patch
firefox-1.0-useragent.patch
firefox-1.0-gtk-system-colors.patch
firefox-1.0-g-application-name.patch
firefox-1.0-remote-intern-atoms.patch
firefox-1.0-execshield-nspr.patch
firefox-1.0-execshield-xpcom.patch
# mozilla installer patch
mozilla-installer.patch
mozilla-installer.patch is the crash fix that I made in the installer. All the other patches were taken from Fedora CVS.
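For anyone wanting to reproduce the build, applying such a list in order might look like the sketch below. It assumes the list above has been saved to a plain text file (`patches.list` in the usage comment is a hypothetical name), that all the patch files sit in one directory, and that every patch applies with `-p1`. The strip level is a guess, not something recorded in this post, so individual patches may need a different one.

```shell
# list_patches FILE
# Print the patch file names from FILE, one per line, skipping blank lines
# and the '#' section headings used in the list above.
list_patches() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}

# apply_patches SRC_DIR PATCH_DIR LIST_FILE
# Apply every listed patch to SRC_DIR in order, stopping at the first failure.
# The function body runs in a subshell, so the 'exit' below only ends the
# function and never the calling shell.
# ASSUMPTION: -p1 is the right strip level for every patch.
apply_patches() (
    src_dir="$1"; patch_dir="$2"; list_file="$3"
    list_patches "$list_file" | while IFS= read -r name; do
        echo "applying $name"
        patch -d "$src_dir" -p1 < "$patch_dir/$name" || exit 1
    done
)

# Example (hypothetical paths):
#   apply_patches ./mozilla ./patches patches.list
```

Keeping the section comments in the list file costs nothing, since `list_patches` filters them out, and it preserves the grouping shown above.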
Telugu Translations at Full Speed
I just talked to Kiran Chandra of the FSF India Andhra Pradesh Chapter. His team is working on Telugu translations for GNU/Linux and has finished some GNOME modules and applications. They have also made a new OTF font called Srujana! Yeeehaah! Another OTF font for Telugu. There are some problems with the font and the translations, but I am sure the issues will be ironed out very soon. The fonts and translated files are available at http://ap.gnu.org.in. There is also an XKB map that is supposedly more convenient for DTP publishers all over AP.
Kiran told me that his team can finish the translations in a matter of a few months. We will be working together from now on. Status updates and releases will happen at http://telugu.sarovar.org.
Software and Indian language support
sunthosh,
In fact, most editors in GNOME have this multilingual capability. For example, I write Indic text in Sticky Notes, the Evolution mail client, etc.
Firefox 1.0.2 Indic enabled build
There are several security problems with Firefox 1.0. Firefox 1.0.1 fixed many of them and Firefox 1.0.2 fixed the rest. Of course, these fixes are also needed in the earlier Firefox 1.0 pango build I made. So my team mate Manish and I made a new build of Firefox 1.0.2 with the pango patches. This build is much better than the previous one.
- The build is now an installer, so it is easy to install.
- This one has more pango patches, including the selection patch, and more patches from Fedora CVS.
- The earlier build had a problem: it asked for some strange libraries (libdnet) as dependencies. This one has exactly the same requirements as the official Firefox build.
- The earlier build required you to install the latest pango first, which is a pain for most users on most distributions. This build solves that: the system does not need the latest pango, because the installer comes with pango libraries of its own.
BTW, while making this build, I encountered a bug in the installer and fixed it. Yey! My first contribution (yes, I’d like to call it a contribution) to the browser born for world domination. I also had to make a small change to pango so that it works from anywhere instead of only from the standard locations.
It will take some time to put it up for download on Sarovar. Apparently, Sarovar does not take large releases (about 10 MB) in a straightforward way. So I put it for download here. Get it now! Here is the old screenshot.
Ubuntu Transliterated to Telugu
A good deal of technology is already in place to support Telugu on GNU/Linux. Most people don’t know this, don’t have an idea of how much support there is, or don’t know how to use it. Keeping this in mind, we cooked up a transliterated version of Ubuntu in Telugu. This could also be a platform for people to work on translations without worrying about setting up their machines for it.
Chaitanya and Manish (my team mates now) did most of the work. Well, today we finished working on it. Essentially, from an English phonetic word list we obtained the sounds for most English words and directly transliterated them to Telugu. For the rest of the words, like proper nouns, we used the RTS scheme for transliteration. In the end, the output is not so bad. It is mostly readable. I will be posting screenshots soon.
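The two-stage idea described above, a word-list lookup with a fallback for everything else, can be sketched roughly as follows. This is an illustration only, not the actual script used for the Ubuntu build. It assumes the phonetic step has already been boiled down to a tab-separated table of English words and their Telugu renderings (`words.tsv` in the usage comment is a hypothetical file name), and it merely flags the leftover words for a separate RTS pass instead of transliterating them.

```shell
# transliterate_words DICT
# Read English words, one per line, on standard input and print a Telugu
# rendering for each, looked up in DICT (format: english<TAB>telugu).
# Lookup ignores case. Words missing from DICT are printed unchanged with a
# trailing '?' so they can be picked up by a later pass (for example RTS).
transliterate_words() {
    awk -v dictfile="$1" '
        BEGIN {
            # Load the whole dictionary before reading any input word.
            while ((getline line < dictfile) > 0) {
                split(line, field, "\t")
                dict[tolower(field[1])] = field[2]
            }
            close(dictfile)
        }
        {
            key = tolower($0)
            if (key in dict)
                print dict[key]
            else
                print $0 "?"
        }
    '
}

# Example (hypothetical file name):
#   printf '%s\n' File Edit Hyderabad | transliterate_words words.tsv
```

Marking the misses rather than dropping them keeps the output aligned line by line with the input, which makes it easy to see how much is left for the fallback scheme.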
My first contribution to Encyclopedia Galactica
I have written an article on the Telugu Wikipedia about setting up your browser for Telugu.
It’s not just about the browser; it’s about setting up and using your computer for Telugu. I’ve written about GNU/Linux and Windows XP/2003 Server. Someone will have to fill in the sections for the other operating systems.
BTW, how can an encyclopedia have articles on *everything* without involving *everyone* on the planet?
Firefox Indic enabled build
I got firefox running with pango support. It would have been much easier if I were using Fedora Core 3. All I had to do was run:
$ MOZ_ENABLE_PANGO=1 firefox
That would have enabled pango support in firefox. But, no! I use Debian GNU/Linux and am in love with it. Then I also thought about all the people who are not using FC3 as their distribution (even FC2 people) and how they still don’t have firefox 1.0 with pango support. Then I decided to build firefox with pango support.
I downloaded the latest source code from www.mozilla.org and applied the pango patch from the FC3 source rpm (I thought that was the easiest thing to do :)). Next I made a small change so that it does not check for MOZ_ENABLE_PANGO and enables pango by default. Then I compiled Firefox with the default Firefox build settings + gtk2 + xft + pango.
It renders Indian language text (I can read Telugu and Hindi) nicely, although selection and cursor placement are still not working correctly. Christopher Blizzard has already made progress on that. Here is the screenshot:
One does not have to set any environment variables to enable pango support in this build. It is enabled by default! Download the Firefox build with pango (the source tarball from which it was built is here).
