Encoding text
Text characters can be represented in different ways. For example, the Latin alphabet can be encoded using Morse code as a series of dots and dashes for transmission over a telegraph line.
In a similar way, text inside a computer is stored as bits: ones and zeros. .NET uses the Unicode standard, specifically its UTF-16 form, to encode text internally. Sometimes you will need to move text outside .NET for use by systems that do not use Unicode or that use a different Unicode encoding. The following table shows some common text encodings:
| Encoding | Description |
|---|---|
| ASCII | Encodes a limited range of characters using the lower seven bits of a byte |
| UTF-8 | Represents each Unicode code point as a sequence of one to four bytes |
| UTF-16 | Represents each Unicode code point as a sequence of one or two 16-bit integers |
| ANSI/ISO encodings | Provides support for a variety of code pages that are used to support a specific language or group of languages |
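
To make the differences in the table concrete, here is a minimal sketch that compares how many bytes the same text needs under three of these encodings. It assumes a console app using top-level statements; the sample strings and variable names are illustrative only, and the non-ASCII characters are written as escape sequences so the result does not depend on how the source file is saved.

```csharp
using System;
using System.Text;

// "Café €5": é (U+00E9) and € (U+20AC) are outside the ASCII range.
string text = "Caf\u00E9 \u20AC5";

// A single emoji (U+1F600) that lies outside the Basic Multilingual Plane.
string emoji = "\U0001F600";

// A .NET string is a sequence of UTF-16 code units (char values),
// so the emoji needs two of them (a surrogate pair).
Console.WriteLine($"Chars in text:  {text.Length}");   // 7
Console.WriteLine($"Chars in emoji: {emoji.Length}");  // 2

// Byte counts for the same text under each encoding.
int asciiBytes = Encoding.ASCII.GetByteCount(text);    // 7: one byte per char
int utf8Bytes = Encoding.UTF8.GetByteCount(text);      // 10: é takes 2 bytes, € takes 3
int utf16Bytes = Encoding.Unicode.GetByteCount(text);  // 14: two bytes per code unit
Console.WriteLine($"ASCII: {asciiBytes}, UTF-8: {utf8Bytes}, UTF-16: {utf16Bytes}");

// The emoji is one code point: four bytes in UTF-8, two 16-bit units in UTF-16.
int emojiUtf8Bytes = Encoding.UTF8.GetByteCount(emoji);     // 4
int emojiUtf16Bytes = Encoding.Unicode.GetByteCount(emoji); // 4

// ASCII cannot represent é or €, so by default each is replaced with '?'.
string viaAscii = Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(text));
Console.WriteLine($"After an ASCII round trip: {viaAscii}"); // Caf? ?5

// UTF-8 can represent every Unicode code point, so the round trip is lossless.
string viaUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetBytes(text));
Console.WriteLine($"UTF-8 round trip preserved the text: {viaUtf8 == text}"); // True
```

Note that `Encoding.Unicode` is the .NET name for little-endian UTF-16, and that the ASCII round trip silently loses data rather than throwing an exception, which is worth remembering when choosing an encoding for another system.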
Encoding strings as byte arrays
Add a new console application project named...