Internationalization Work Flow with poedit

Internationalization (i18n) has come a long way since I wrote my first multilingual website way back in 2000. In that era, we just swapped out entirely different sets of templates, which, of course, made it very difficult to keep the two sets in sync.

With the Zend_Translate module of Zend Framework, it is possible to do it with just one set of templates and focus on just the text. CSS wizards can make whatever layout changes they need regarding placement and alignment if we need to support right-to-left languages such as Arabic or Hebrew, and Zend_Translate can provide the texts.

You tell your script to translate a key phrase such as “Log In” into the target language, which let’s assume is English. You provide a value for the key. For a short phrase like “Log In” the value would probably be the same as the key. But for larger blocks of text such as paragraphs, the key would be something like “Terms Para 1” and the actual value might span several lines. These keys and values live in a specially formatted file called a catalog.

I laid my hands on a tool called poedit. It’s a free, open-source tool for creating translation catalogs. Each catalog holds the translation to just one language, so when your script is translating you need only worry about the target language.
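
The key-to-value lookup described above can be pictured with a small plain-PHP sketch. This is only a stand-in: the array, the placeholder paragraph text and the catalogLookup() helper are invented for illustration and are not part of Zend_Translate or poedit. The fallback to the key itself when no entry exists is an assumption of this sketch, modeled on how gettext returns the original string when it finds no translation.

```php
<?php
// Hypothetical stand-in for a translation catalog: one array per target language.
// Real catalogs are files managed with poedit; the array only illustrates the lookup.
$catalogEn = [
    'Log In'       => 'Log In', // short phrase: the value equals the key
    'Terms Para 1' => "By using this site you agree to the terms below.\n"
                    . "Please read them carefully before continuing.", // placeholder text
];

// catalogLookup() is an illustrative helper, not a Zend_Translate method.
// It returns the catalog value, or the key itself when there is no entry.
function catalogLookup(string $key, array $catalog): string
{
    return $catalog[$key] ?? $key;
}

echo catalogLookup('Log In', $catalogEn), "\n";
echo catalogLookup('Terms Para 1', $catalogEn), "\n";
echo catalogLookup('Sign Up', $catalogEn), "\n"; // no entry, so the key comes back
```

Keeping one array (or one catalog file) per language is what lets the calling code stay the same no matter which language is being served.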

I have legacy text in legacy presentation files with a suffix of either .tpl or .phtml, so I had to go into poedit’s preferences and edit the PHP parser to accommodate them. Adding the extra suffixes to parse is intuitive enough. The nonobvious trick was to tell xgettext explicitly that these files are PHP, since it otherwise picks the language from the file extension and does not know these suffixes. That means adding

--language=PHP  

to the xgettext command. So the command becomes:

xgettext --language=PHP --force-po -o %o %C %K %F

Then the next thing to do was to go into the templates and regular PHP source code and mark up all the text so that poedit can find it.

In the regular application code we can just schlep the keys around until the time comes to output them. So something like

$this->view->title = "About Us"; 

becomes

$this->view->title = _("About Us");  

That markup allows poedit to extract it.

We translate the keys in the presentation layer, a la

<?php echo $this->translate($this->title); ?>

or we could just translate directly, a la

<?php echo $this->translate(_('Top Ten Gardening Tools')); ?>

For paragraphs, we insert a key, knowing that later we will cut the text out and insert it in the catalog.

So we will set our translator up with poedit and give him two catalogs, the one with all the keys translated to the original language, and the target language one, which will have all the same keys but nothing filled in for their values. He will refer to the original language values and use those to create the target language values.
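
To picture the two-catalog handoff, here is a rough sketch that lists which keys the translator still has to fill in. Both arrays are hypothetical stand-ins for the catalogs (the French value is made-up sample data), and untranslatedKeys() is a helper invented for this illustration; poedit shows the same information in its own interface.

```php
<?php
// Hypothetical data: $source stands in for the original-language catalog,
// $target for the target-language catalog with mostly empty values.
$source = [
    'Log In'       => 'Log In',
    'About Us'     => 'About Us',
    'Terms Para 1' => '(full English paragraph text)',
];
$target = [
    'Log In'       => 'Connexion', // already translated
    'About Us'     => '',
    'Terms Para 1' => '',
];

// Illustrative helper: returns the keys whose target value is missing or empty.
function untranslatedKeys(array $source, array $target): array
{
    $missing = [];
    foreach ($source as $key => $value) {
        if (!isset($target[$key]) || $target[$key] === '') {
            $missing[] = $key;
        }
    }
    return $missing;
}

print_r(untranslatedKeys($source, $target));
```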

There is a bit of a problem right now. We need some way to mark the source code and templates so that poedit knows which strings to extract into the catalog. The default marker is _(), which is an alias for gettext(). But if we are using Zend_Translate, we need _() only for extraction, and in fact shouldn’t be using it for translation at all. Trying to override the _() function didn’t work. You can reconfigure poedit to look for something else (a no-op function, perhaps?), but the strings do not behave consistently throughout the application.
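
For what it’s worth, the no-op idea could look something like the sketch below. The name N_() is purely an assumption for illustration; it is not defined by PHP or Zend Framework. The marker name would also have to be added to the keyword list that poedit hands to xgettext (xgettext’s --keyword option) so that it gets picked up during extraction. I am not claiming this cures the inconsistency mentioned above; it only shows a marker that extracts without translating.

```php
<?php
// N_() is a hypothetical marker name, not a PHP or Zend Framework function.
// It returns its argument unchanged, so it exists only for the extractor to find.
function N_(string $text): string
{
    return $text;
}

// The string is marked for extraction but is still the untranslated key at
// runtime; the actual translation would happen later, in the view.
$title = N_('About Us');
echo $title, "\n";
```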
