Discussion and considerations
Here we look at the various issues around character sets and support for multiple languages, as they affect our CMS.
Character sets
Early computers offered some flexibility in this area, because global companies like IBM appreciated the need to support local languages. This was typically handled through "code pages", each of which redefined the character set to suit the requirements of a particular language. The approach was adequate while any given computer was likely to be used in only one country and one language, or at worst a group of compatible languages. As soon as multiple languages need to be handled together, it becomes a cumbersome mechanism.
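To make the code page idea concrete, the following minimal sketch decodes the same single byte under several legacy code pages. The byte value 0xE9 and the selection of code pages are arbitrary choices made for illustration, and the codec names are those used by Python's standard library rather than anything specific to our CMS.

```python
# One byte value, interpreted under several legacy code pages.
# The byte 0xE9 is an arbitrary example chosen for illustration.
raw = bytes([0xE9])

# Python standard library codec names for four single-byte code pages:
# Windows Western European, original IBM PC, ISO Greek, and KOI8 Russian.
code_pages = ["cp1252", "cp437", "iso8859_7", "koi8_r"]

# Decode the identical byte with each code page in turn.
decoded = {name: raw.decode(name) for name in code_pages}

for name, char in decoded.items():
    print(f"{name:10} 0x{raw[0]:02X} -> {char} (U+{ord(char):04X})")
```

Each code page maps the byte to a different character, so stored text is only meaningful if the code page it was written in is also known. That is manageable on a machine tied to one language, and awkward once content in several languages has to coexist.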
Another issue with early systems was that some languages use more than 256 characters, so the entire character set cannot be represented in a single 8-bit byte. This made multi-byte character sets inevitable, and a number of them have been defined.
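The single-byte limit can be illustrated with a short sketch. It assumes a three-character Japanese sample string, and uses Latin-1, Shift JIS and UTF-8 purely as examples of one single-byte and two multi-byte encodings; none of these choices comes from the CMS itself.

```python
# A three-character Japanese string, chosen only as a sample.
text = "日本語"

# A single-byte character set has just 256 slots, so it cannot
# represent these characters at all.
try:
    text.encode("latin-1")
    single_byte_ok = True
except UnicodeEncodeError:
    single_byte_ok = False

# Multi-byte character sets spend more than one byte per character.
sjis = text.encode("shift_jis")
utf8 = text.encode("utf-8")

print("fits in a single-byte set:", single_byte_ok)
print("characters:", len(text))
print("bytes in Shift JIS:", len(sjis))
print("bytes in UTF-8:", len(utf8))
```

The byte counts differ from the character count and from each other, which is why code that assumes one byte per character stops working once multi-byte character sets are involved.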
Diverse character sets have led...