Summary
Character set issues must be resolved before multilingual implementations can work reliably. The preferred solution is to standardize on UTF-8, the international character set that offers the greatest flexibility and is the most practical to implement.
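As a small illustration of why a single UTF-8 encoding is practical, the sketch below encodes greetings from several scripts with the same codec and confirms that each one round-trips without loss. It is written in Python purely for illustration; the sample strings and the helper name `utf8_profile` are assumptions made for this example, not part of the work summarized here.

```python
# One UTF-8 byte stream can carry text from many scripts at once.
# The sample strings below are illustrative only.
samples = {
    "English": "Welcome",
    "French": "Bienvenue à tous",
    "Russian": "Добро пожаловать",
    "Japanese": "ようこそ",
}


def utf8_profile(text):
    """Return (character count, UTF-8 byte count) for a string."""
    return len(text), len(text.encode("utf-8"))


for language, text in samples.items():
    chars, size = utf8_profile(text)
    # Encoding and decoding with the same codec must be lossless.
    assert text.encode("utf-8").decode("utf-8") == text
    print(f"{language}: {chars} characters, {size} bytes")
```

The character count and the byte count differ for non-ASCII text, which is why string handling in a multilingual system needs to be aware of the encoding rather than counting bytes.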
Language questions arise in three areas: the fixed text strings of the CMS core framework, similar fixed strings in CMS extensions, and the text stored in the database that is used to generate user content for delivery to the browser. Others have used a number of ad hoc mechanisms to deal with translation, but preference is given here to a more comprehensive framework approach, based on the GNU gettext project.
Where necessary, implementation can rely on PHP classes to supplement standard resources that are not present in every hosting environment. A small set of standard classes organizes most of the language processing, and translators are provided with a web interface.
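A minimal sketch of the gettext-style lookup that such a supplementary class might perform is shown below. It is written in Python purely for illustration, and everything in it is hypothetical: the `Catalog` class, the `core` and `gallery` domain names, and the sample translations are assumptions, not the classes described in this summary. The one convention borrowed from gettext itself is that an untranslated string falls back to its source text.

```python
class Catalog:
    """Hypothetical in-memory message catalog for one target language.

    Messages are keyed by (domain, source string) so that the core
    framework and each extension can keep separate translations.
    """

    def __init__(self, language):
        self.language = language
        self._messages = {}

    def add(self, domain, source, translation):
        """Register a translation for a source string within a domain."""
        self._messages[(domain, source)] = translation

    def translate(self, source, domain="core"):
        """Return the translation, or the source string if none exists."""
        return self._messages.get((domain, source), source)


# Illustrative data only: one core string and one extension string.
catalog = Catalog("fr")
catalog.add("core", "Save", "Enregistrer")
catalog.add("gallery", "Save", "Sauvegarder")

print(catalog.translate("Save"))                    # core domain
print(catalog.translate("Save", domain="gallery"))  # extension domain
print(catalog.translate("Delete"))                  # no entry: source text is returned
```

Keeping a domain per extension means the same source string can be translated differently in different contexts, and a missing translation degrades to readable source text instead of an error.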
It is thus feasible with...