Decoding bytes – how to get proper characters from some bytes
How can we work with files that aren't properly encoded? What do we do with files written in the ASCII encoding?
A download from the internet is almost always in bytes—not characters. How do we decode the characters from that stream of bytes?
Also, when we use the subprocess module, the results of an OS command are in bytes. How can we recover proper characters?
Much of this is also relevant to the material in Chapter 10, Input/Output, Physical Format and Logical Layout. We've included this recipe here because it's the inverse of the previous recipe, Encoding strings – creating ASCII and UTF-8 bytes.
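As a minimal sketch of the questions above, the following shows bytes being turned back into characters with the decode() method of a bytes object. The sample forecast text and the child command are invented for illustration and are not part of the recipe; the sketch also assumes the bytes really are UTF-8 and that Python 3.7 or later is available for the capture_output parameter.

```python
import subprocess
import sys

# Hypothetical sample: bytes as they might arrive from a download.
raw = "Chesapeake Bay: 15 kt, seas 2 ft, 18°C".encode("utf-8")

# Decoding with the encoding the bytes were written in recovers the text.
text = raw.decode("utf-8")

# ASCII cannot represent the two bytes of the degree sign. With
# errors="replace", each undecodable byte becomes U+FFFD instead of
# raising UnicodeDecodeError.
ascii_text = raw.decode("ascii", errors="replace")

# The output of an OS command is also bytes. Here the child process is
# the Python interpreter itself, so the sketch runs on any platform.
result = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    capture_output=True,
    check=True,
)
output = result.stdout.decode("utf-8").strip()
```

Choosing errors="replace" keeps the program running when the encoding is wrong, at the cost of losing the original characters. The default, errors="strict", raises an exception instead, which is often the safer choice when the data matters.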
Getting ready
Let's say we're interested in offshore marine weather forecasts. Perhaps this is because we own a large sailboat, or perhaps because good friends of ours have a large sailboat and are departing the Chesapeake Bay for the Caribbean.
Are there any...