Encoding text
Text characters can be represented in different ways. For example, the alphabet can be encoded using Morse code into a series of dots and dashes for transmission over a telegraph line.
In a similar way, text inside a computer is stored as bits (ones and zeros). .NET Core uses a standard called Unicode to encode text internally; specifically, strings are held in memory as UTF-16. Sometimes, you will need to move text outside .NET Core for use by systems that do not use Unicode or that use a different Unicode encoding.
The following table lists some alternative text encodings commonly used by computers:
| Encoding | Description |
| --- | --- |
| ASCII | This encodes a limited range of characters using the lower seven bits of a byte |
| UTF-8 | This represents each Unicode code point as a sequence of one to four bytes |
| UTF-16 | This represents each Unicode code point as a sequence of one or two 16-bit integers |
| ANSI/ISO encodings | This provides support for a variety of code pages that are used to support a specific language or group of languages |
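To make the differences in the table concrete, the following minimal sketch encodes the same string with three of the encodings that ship in `System.Text` and compares the results. The class name `EncodingDemo` and its methods `ByteCount`, `RoundTrip`, and `Describe` are hypothetical helpers invented for this illustration. The sample text is also an assumption, chosen because it mixes plain ASCII letters with two characters outside the ASCII range.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class EncodingDemo
{
    // Number of bytes needed to store the text in the given encoding.
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetBytes(text).Length;
    }

    // Encode the text to bytes and decode it again with the same encoding,
    // which reveals any characters the encoding cannot represent.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    // One line of summary per encoding: name, byte count, and round-tripped text.
    public static string Describe(string text)
    {
        // Encoding.Unicode is the .NET name for little-endian UTF-16.
        Encoding[] encodings = { Encoding.ASCII, Encoding.UTF8, Encoding.Unicode };
        var report = new StringBuilder();
        foreach (Encoding encoding in encodings)
        {
            report.AppendLine(
                $"{encoding.EncodingName}: {ByteCount(text, encoding)} bytes, " +
                $"round trip: {RoundTrip(text, encoding)}");
        }
        return report.ToString();
    }
}
```

Calling `Console.WriteLine(EncodingDemo.Describe("Caf\u00E9 \u20AC"))` with the six-character string "Café €" should report 6 bytes for ASCII, 9 bytes for UTF-8, and 12 bytes for UTF-16. The ASCII round trip comes back as "Caf? ?" because the two non-ASCII characters are replaced with question marks, while UTF-8 and UTF-16 return the original text unchanged. Code page encodings other than the few built into .NET Core typically need to be enabled first with `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)`.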