Internationalization (deprecated)

Providing software for users around the world means providing it in multiple languages. This consists of two steps:

  • Internationalization (i18n): Making the software international, i.e. preparing it to be adapted to different languages and locations.
  • Localization (L10n): Adapting the software to a particular language and/or location.

The Open-Xchange platform offers several facilities to simplify both parts. The L10n part is mostly a concern for translators. Open-Xchange facilities for that consist mainly of using a well-established format for translations: GNU Gettext Portable Object (PO) files. This allows translators to use existing dedicated translation tools or a simple UTF-8-capable text editor.

The i18n part is what software developers will be mostly dealing with and is what the rest of this document describes.

Translation

The main part of i18n is the translation of various text strings which are presented to the user. For this purpose, the Open-Xchange platform provides the RequireJS plugin 'gettext'. Individual translation files are specified as a module dependency of the form 'gettext!module_name'. The resulting module API is a function which can be called to translate a string using the specified translation files:

define('com.example/example', ['gettext!com.example/example'],
  function (gt) {
    'use strict';
    alert(gt('Hello, world!'));
  });

The create_pot task of the build system creates a file named ox.pot. Running

grunt create_pot

generates this file within the i18n/ directory. It contains an entry for every call to one of the gettext functions:

#: apps/com.example/example.js:4
msgid "Hello, world!"
msgstr ""

This file can be sent to the translators. During the translation process, one PO file is created for each language. The PO files in the i18n directory should then contain the translated entry:

#: apps/com.example/example.js:4
msgid "Hello, world!"
msgstr "Hallo, Welt!"

During the next build, the entries are copied from the central PO files into individual translation files. In our example, this would be apps/com.example/example.de_DE.js. Because of the added language ID, translation files can usually have the same name as the corresponding main module. Multiple related modules should use the same translation file to avoid the overhead of too many small translation files.
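
To illustrate how a translated PO entry can end up as a runtime lookup, the following is a minimal sketch and not the actual build step. The names parseSimplePo and makeGettext are hypothetical, the parser handles only single-line msgid/msgstr pairs, and the second entry is made up for illustration. Falling back to the original text when no translation exists is an assumption that follows the usual gettext convention.

```javascript
// Hypothetical helper: parse single-line msgid/msgstr pairs from PO text.
// Real PO files also contain multi-line strings, escapes, contexts, and plurals.
function parseSimplePo(text) {
  const dictionary = {};
  let currentId = null;
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    const idMatch = /^msgid "(.*)"$/.exec(line);
    const strMatch = /^msgstr "(.*)"$/.exec(line);
    if (idMatch) {
      currentId = idMatch[1];
    } else if (strMatch && currentId !== null) {
      dictionary[currentId] = strMatch[1];
      currentId = null;
    }
  }
  return dictionary;
}

// Hypothetical stand-in for the function returned by 'gettext!module_name'.
// Assumption: untranslated or empty entries fall back to the original text.
function makeGettext(dictionary) {
  return (msgid) => dictionary[msgid] || msgid;
}

const po = [
  '#: apps/com.example/example.js:4',
  'msgid "Hello, world!"',
  'msgstr "Hallo, Welt!"',
  '',
  '#: apps/com.example/example.js:5', // illustrative entry, not yet translated
  'msgid "Goodbye!"',
  'msgstr ""'
].join('\n');

const gt = makeGettext(parseSimplePo(po));
console.log(gt('Hello, world!')); // Hallo, Welt!
console.log(gt('Goodbye!'));      // Goodbye!
```

Under these assumptions, an entry with an empty msgstr behaves like a missing one, so untranslated strings stay readable in English.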

Most modules will require more complex translations than can be provided by a simple string lookup. To handle some of these cases, the gettext module provides traditional methods in addition to being a callable function. Other cases are handled by the build system.

Composite Phrases

In many cases, the translated texts will not be static, but contain dynamic values as parts of a sentence. The straightforward approach of translating parts of the sentence individually and then composing the final text by string concatenation is a BAD idea. Different languages have different sentence structures, and if there are multiple dynamic values, their order might need to be reversed in some languages and not in others.

The solution is to translate entire sentences and then to use the gettext function to insert dynamic values:

//#. %1$s is the given name
//#. %2$s is the family name
//#, c-format
alert(gt('Welcome, %1$s %2$s!', firstName, lastName));

Internally, the _.printf function is used to expand the placeholders, so the above is equivalent to:

//#. %1$s is the given name
//#. %2$s is the family name
//#, c-format
const pattern = gt('Welcome, %1$s %2$s!');
alert(_.printf(pattern, firstName, lastName));

and results in:

#. %1$s is the given name
#. %2$s is the family name
#, c-format
msgid "Welcome, %1$s %2$s!"
msgstr ""
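
The benefit of numbered placeholders is that a translation can reorder them freely. The following minimal sketch shows the idea. formatPositional is a hypothetical stand-in for _.printf that supports only %N$s and %N$d, and the second pattern is an illustrative translation, not one taken from a real PO file.

```javascript
// Hypothetical stand-in for _.printf: expands positional %N$s and %N$d placeholders.
function formatPositional(pattern, ...args) {
  return pattern.replace(/%(\d+)\$[sd]/g, (placeholder, position) => {
    const value = args[Number(position) - 1];
    // Leave the placeholder untouched if no argument was supplied for it.
    return value === undefined ? placeholder : String(value);
  });
}

const english = 'Welcome, %1$s %2$s!';
// Illustrative pattern for a language that puts the family name first.
const familyNameFirst = 'Welcome, %2$s %1$s!';

console.log(formatPositional(english, 'Ada', 'Lovelace'));         // Welcome, Ada Lovelace!
console.log(formatPositional(familyNameFirst, 'Ada', 'Lovelace')); // Welcome, Lovelace Ada!
```

The calling code passes the values in the same order in both cases. Only the translated pattern decides where each value appears.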

Comments

As shown in the examples above, it is possible to add comments for translators by starting a comment with "#.". Such comments must be placed immediately before the name of the variable which refers to the gettext module (in this case gt), or in the line above that variable. They can be separated by arbitrary whitespace and newlines, but not other tokens. All such gettext calls should have comments explaining every format specifier.

Examples of correct usage of comments:

alert(/*#. comment */ gt('Welcome!'));

//#. comment
alert(gt('Welcome!'));

//#. multi
//#. line
//#. comment
alert(gt('Welcome!'));

Examples of wrong usage of comments:

/*#. comment */ alert(gt('Welcome!'));

//#. comment
var x = 42;
alert(gt('Welcome!'));

Comments starting with "#," are meant for gettext tools, which in the case of "#, c-format", can ensure that the translator did not leave out or mistype any of the format specifiers.

Debugging

One of the most common i18n errors is forgetting to use a gettext function. To catch this kind of error, the UI can be started with the hash parameter "#debug-i18n=1". The browser tab must be reloaded for this setting to take effect.

In this mode, every translated string is marked with invisible Unicode characters, and any DOM text without those characters gets reported on the console.

Unfortunately, this debug mode will also report any string which does not actually require translation. Examples of such strings include user data, numbers, strings already translated by the server, etc. To avoid filling the console with such false positives, every such string must be marked by passing it through the function _.noI18n:

document.createTextNode(_.noI18n(fileName));

This results in the strings being marked as "translated" without actually changing their visible contents. When not debugging, _.noI18n simply returns its first parameter.

Strings marked with _.noI18n must not be used as plain data anymore (e.g. to look up an array element, to build file names, URL query strings, DB queries, JSON data sent to the server, etc.) because of these additional characters.

It is possible to unmark a translated string by using the _.noI18n.fix function which removes the translation marker characters from the string.

const fileName = _.noI18n.fix(gt('Unnamed (%1$d)', unusedIndex));
someFileAPI.createTextFile(fileName + '.txt');
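
Why marked strings stop working as plain data can be seen in a minimal sketch of the marking idea. The functions mark, unmark, and isMarked are hypothetical equivalents of _.noI18n, _.noI18n.fix, and the debug check. The zero-width space used as a marker is an assumption, and the platform may use different characters.

```javascript
// Assumption: a zero-width space (U+200B) serves as the invisible marker here.
const MARKER = '\u200b';

// Hypothetical equivalent of _.noI18n in debug mode: wrap the text in markers.
function mark(text) {
  return MARKER + text + MARKER;
}

// Hypothetical equivalent of _.noI18n.fix: strip all marker characters.
function unmark(text) {
  return text.split(MARKER).join('');
}

// Hypothetical check of the kind the debug mode could run on DOM text.
function isMarked(text) {
  return text.length > 1 && text.startsWith(MARKER) && text.endsWith(MARKER);
}

const fileName = mark('report');
console.log(isMarked(fileName));            // true
console.log(fileName === 'report');         // false: the markers are part of the string
console.log(unmark(fileName) === 'report'); // true
```

The marked string looks identical on screen but no longer compares equal to the original, which is why it has to be unmarked before being used as a key, file name, or request parameter.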

Advanced gettext Functions

There are several other functions which are required to cover all typical translation scenarios.

Contexts

Sometimes, the same English word or phrase has multiple meanings and must be translated differently depending on context. To be able to tell the individual translations apart, the method gettext.pgettext ("p" stands for "particular") should be used instead of calling gettext directly. It takes the context as the first parameter and the text to translate as the second parameter:

alert(gt.pgettext('description', 'Title'));
alert(gt.pgettext('salutation', 'Title'));

Results in:

msgctxt "description"
msgid "Title"
msgstr "Beschreibung"

msgctxt "salutation"
msgid "Title"
msgstr "Anrede"
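
One way to picture context-aware lookup is a dictionary whose keys combine the context and the text. The sketch below assumes the two parts are joined with '\x00', the separator used for custom translation keys. Whether the runtime lookup works exactly this way is an assumption. contextKey, dictionary, and pgettext are hypothetical names.

```javascript
// Assumption: context and text are joined with '\x00' to form the lookup key.
function contextKey(context, msgid) {
  return context + '\x00' + msgid;
}

// Hypothetical dictionary holding the two translated entries.
const dictionary = {
  [contextKey('description', 'Title')]: 'Beschreibung',
  [contextKey('salutation', 'Title')]: 'Anrede'
};

// Hypothetical stand-in for gt.pgettext; falls back to the untranslated text.
function pgettext(context, msgid) {
  return dictionary[contextKey(context, msgid)] || msgid;
}

console.log(pgettext('description', 'Title')); // Beschreibung
console.log(pgettext('salutation', 'Title'));  // Anrede
console.log(pgettext('unknown', 'Title'));     // Title
```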

Plural Forms

In the case of numbers, the rules for selecting the proper plural form can be very complex. Apart from languages with no separate plural forms, English is among the simplest languages in this respect, having only two forms: singular and plural. Other languages can have no plural form at all, or three, four, or even more forms.

The functions gettext.ngettext and gettext.npgettext (for a combination of plural forms with contexts) can select the proper plural form by using a piece of executable code embedded in the header of a PO file:

//#. %1$d is the number of mails
//#, c-format
const pattern = gt.ngettext('You have %1$d mail.', 'You have %1$d mails.', n);
alert(_.printf(pattern, n));

The function gettext.ngettext accepts three parameters: the English singular and plural forms and a "count" parameter which determines the chosen plural form. It returns the appropriate translated text without processing the placeholders. The function gettext.npgettext adds a context parameter before the others, similar to gettext.pgettext.

They would usually be used in combination with _.printf to insert the actual number into the translated string. However, all gettext functions are able to process placeholders by themselves, so the following example is equivalent:

//#. %1$d is the number of mails
//#, c-format
alert(gt.ngettext('You have %1$d mail.', 'You have %1$d mails.', n, n));

The above examples result in the following entry:

#. %1$d is the number of mails
#, c-format
msgid "You have %1$d mail."
msgid_plural "You have %1$d mails."
msgstr[0] ""
msgstr[1] ""

The number of msgstr[N] lines is determined by the number of plural forms in each language. This number is specified in the header of each PO file, together with the code to compute the index of the correct plural form from the supplied number.
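
The index computation from the PO header can be sketched as plain functions. The English rule and the three-form rule below are the ones commonly used in Plural-Forms headers for English and Russian. pluralRules and selectPlural are hypothetical names, and the sketch covers only the selection step, not the placeholder expansion.

```javascript
// Plural rules as found in "Plural-Forms" headers, rewritten as JavaScript.
// Each function returns the index N of the msgstr[N] line to use.
const pluralRules = {
  // nplurals=2; plural=(n != 1);
  en: (n) => (n !== 1 ? 1 : 0),
  // nplurals=3; rule commonly used for Russian
  ru: (n) => {
    if (n % 10 === 1 && n % 100 !== 11) return 0;
    if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) return 1;
    return 2;
  }
};

// Hypothetical stand-in for the selection step of gt.ngettext.
function selectPlural(language, forms, n) {
  return forms[pluralRules[language](n)];
}

const englishForms = ['You have %1$d mail.', 'You have %1$d mails.'];
console.log(selectPlural('en', englishForms, 1));   // You have %1$d mail.
console.log(selectPlural('en', englishForms, 101)); // You have %1$d mails.

// Indices chosen by the three-form rule for a few counts.
const russianIndices = [1, 2, 5, 11, 21, 101].map((n) => pluralRules.ru(n));
console.log(russianIndices); // [ 0, 1, 2, 2, 0, 0 ]
```

With the three-form rule, the counts 21 and 101 select the same form as 1, which is why the first form must not be treated as meaning "exactly one".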

Example with context parameter:

alert(gt.npgettext('context', 'You have %1$d mail.', 'You have %1$d mails.', n, n));

All functions can take additional format parameters following the regular parameters described above:

alert(gt.ngettext('Welcome %1$s! You have %2$d mail.', 'Welcome %1$s! You have %2$d mails.', n, userName, n));
alert(gt.npgettext('context', 'Welcome %1$s! You have %2$d mail.', 'Welcome %1$s! You have %2$d mails.', n, userName, n));

Singular, Plural, Exactly 1

As already mentioned, some languages have several plural forms depending on the count, or use their singular form for counts other than 1 (such as 21 or 101). In these languages, "singular" does not mean "a single entity"; it is just one plural form among others.

Do not write out "1" or "one" in the singular form, and do not use any other sentence specific to a single entity when using the plural functions. Otherwise, the UI may, for example, display "one email" when there are actually 101 emails.

// WRONG!
alert(gt.ngettext('You have 1 mail.', 'You have %1$d mails.', n, n));

// Correct
alert(gt.ngettext('You have %1$d mail.', 'You have %1$d mails.', n, n));

In cases where a special term is needed for "exactly 1", separate that string from the plural function:

// WRONG!
alert(gt.ngettext('Every day.', 'Every %1$d days.', n, n));

// Correct
alert((n === 1) ? gt('Every day.') : gt.ngettext('Every %1$d day.', 'Every %1$d days.', n, n));

Always use the plural functions even if it is known that the count cannot be 1.

// WRONG!
if (n > 1) alert(gt.gettext('You have %1$d mails.', n));

// Correct
if (n > 1) alert(gt.ngettext('You have %1$d mail.', 'You have %1$d mails.', n, n));

This will always pick the plural form in English, but will pick the correct form in other languages with more complex rules.

Do not use the plural functions at all for sentences using a generic plural without count:

// WRONG!
alert(gt.ngettext('Unable to upload file', 'Unable to upload files', n));

// Correct
alert((n === 1) ? gt('Unable to upload file') : gt('Unable to upload files'));

Common Patterns

Custom Translations

App Suite supports custom translations. gettext provides a function addTranslation(dictionary, key, value) that adds custom translations to a global dictionary ("*") or replaces translations in existing dictionaries. To use it, you have to create a plugin for the "core" namespace. Since release 7.8.0, there is a dedicated namespace for exactly this use case: plugins can be defined for the "i18n" namespace, which is loaded right before gettext is enabled. The core plugins are loaded after that.

require(['io.ox/core/gettext'], function (gt) {
    gt.addTranslation('*', 'Existing translation', 'My custom translation');
});

To keep it simple, you can use the internal escaping to build counterparts for ngettext, pgettext, and npgettext. A context is separated from the text by \x00. Singular and plural are separated by \x01:

// load gettext
var gt = require('io.ox/core/gettext');

// replace translation in dictionary 'io.ox/core'. Apps use context "app".
gt.addTranslation('io.ox/core', 'app\x00Address Book', 'My Contacts');

// replace translation - counterpart to ngettext
gt.addTranslation('io.ox/mail', 'Copy\x01Copies', { 0: 'Kopie', 1: 'Kopien' });

// replace translation - counterpart to npgettext
gt.addTranslation('io.ox/mail', 'test\x00One\x01Two', { 0: 'Eins', 1: 'Zwei' });

// some checks
require(['gettext!io.ox/mail'], function (gt) {

    console.log('Singular:', gt.ngettext('Copy', 'Copies', 1));
    console.log('Plural:', gt.ngettext('Copy', 'Copies', 2));

    console.log('Without context', gt.gettext('Files')); // should be 'My File Box'
    console.log('With context', gt.pgettext('test', 'Files')); // should be 'Files'

    console.log('Plural with context', gt.npgettext('test', 'One', 'Two', 2)); // should be 'Two'
});
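
Writing the separator bytes by hand is easy to get wrong, so a small helper can assemble the keys. This is only a sketch: translationKey is a hypothetical function, not part of the gettext module, and it relies solely on the \x00 and \x01 separators described above.

```javascript
// Separators as described above: '\x00' after a context,
// '\x01' between singular and plural.
const CONTEXT_SEPARATOR = '\x00';
const PLURAL_SEPARATOR = '\x01';

// Hypothetical helper: build the dictionary key for any combination of
// context, singular, and plural.
function translationKey({ context, singular, plural }) {
  let key = singular;
  if (plural !== undefined) key += PLURAL_SEPARATOR + plural;
  if (context !== undefined) key = context + CONTEXT_SEPARATOR + key;
  return key;
}

const pluralOnly = translationKey({ singular: 'Copy', plural: 'Copies' });
const contextOnly = translationKey({ context: 'app', singular: 'Address Book' });
const both = translationKey({ context: 'test', singular: 'One', plural: 'Two' });

console.log(pluralOnly === 'Copy\x01Copies');     // true
console.log(contextOnly === 'app\x00Address Book'); // true
console.log(both === 'test\x00One\x01Two');       // true
```

The resulting string could then be passed as the key argument of gt.addTranslation.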

addTranslation() always works for the current language. If you want to benefit from automatic POT file generation, use the following approach:

define('plugins/example/register', ['io.ox/core/gettext', 'gettext!custom'], function (gettext, gt) {

   'use strict';

   /* Just list all custom translations in your plugin by standalone gt() calls.
      The build system recognizes these strings and collects them in a POT file,
      so that they can be subject to translation processes. At runtime, gt('...')
      returns translations for current language.
   */
   if (false) {
       gt('Tree');
       gt('House');
       gt('Dog');
   }

   // get internal dictionary (for current language of course)
   var dictionary = gt.getDictionary();

   // add all translations to global dictionary. Done!
   gettext.addTranslation('*', dictionary);
});

Using translations from external dictionaries

Sometimes it would be nice to use a translation defined in a dictionary shipped within another UI plugin. Because all references in the dependency list of a call to the define function are picked up by the automatic msgid extraction plugin as part of the build step, it's not possible to directly add dictionary modules that are not supposed to be part of the current UI plugin to this dependency list.

There is a solution for this, which involves a little more code, but prevents the automatic extraction tool from picking up those external translations.

define('plugins/example/register', [
    'gettext!plugins/example/i18n',
    // make sure the dictionary is loaded by requirejs, but don't assign it any variable, yet!
    'gettext!io.ox/core'
], function (gt) {

   'use strict';
   // fetch dictionary from require cache, automatic extraction tool won't detect this as local i18n module
   var gtCore = require('gettext!io.ox/core');

   console.log(gt('I will end up in the pot/po files'));
   // this won't be added to the local set of i18n dictionary modules
   console.log(gtCore('This is some message defined in an external dictionary'));
});