Unicode and UTF-8
Unicode and UTF-8 make up a deep and broad topic. This section gives only a cursory introduction and points to resources where you can learn much more.
A brief history
In the early days of computers, there was 7-bit ASCII, but that wasn’t good enough for everyone, so someone came up with 16-bit Unicode. This was a good start, but it had its own problems. Finally, Ken Thompson and Rob Pike, of Unix and Plan 9 fame, invented UTF-8, which is backward-compatible with ASCII (any ASCII text is already valid UTF-8) and represents the same set of characters as UTF-16 and UTF-32, so anyone around the world can write Hello, World! in their own language using their own characters on just about any computer. An added benefit of UTF-8 is that it is easily converted to and from Unicode code points when needed. Unicode didn’t stop there; it evolved as well, and it outgrew its original 16 bits. Unicode and UTF-8 are not the same thing, but they are closely related: Unicode assigns every character a number (its code point), and UTF-8 is one way of encoding those numbers as bytes.
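The relationship between code points and UTF-8 bytes can be illustrated with a short sketch. Python is used here only for illustration, and the helper names `code_points` and `utf8_bytes_per_char` are hypothetical, not part of any library. The sketch shows that ASCII text produces identical bytes in ASCII and UTF-8, and that other characters take two or more bytes.

```python
# A minimal sketch, assuming Python 3. The helper names are hypothetical.

def code_points(text):
    """Return the Unicode code point (an integer) of each character."""
    return [ord(ch) for ch in text]

def utf8_bytes_per_char(text):
    """Return the UTF-8 byte sequence of each character, one entry per character."""
    return [ch.encode("utf-8") for ch in text]

# ASCII text is byte-for-byte identical in ASCII and UTF-8.
hello = "Hello, World!"
same_bytes = hello.encode("ascii") == hello.encode("utf-8")

# Non-ASCII characters need more than one byte in UTF-8.
sample = "é世"
points = code_points(sample)          # Unicode code points
encoded = utf8_bytes_per_char(sample) # UTF-8 bytes for each character

# Decoding the UTF-8 bytes recovers the original text.
round_trip = sample.encode("utf-8").decode("utf-8")

if __name__ == "__main__":
    print(same_bytes)
    print([hex(p) for p in points])
    print(encoded)
    print(round_trip == sample)
```

Here `ord` gives the code point that Unicode assigns to a character, while `str.encode("utf-8")` gives the bytes used to store or transmit it; the two are related but distinct.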
Where we are today
Unicode now replaces older character encodings,...