Background briefing – handling file formats
As we've observed so far, our data comes in a wide variety of physical formats. In Chapter 1, Our Espionage Toolkit, we looked at ZIP files, which are archives that contain other files. In Chapter 2, Acquiring Intelligence Data, we looked at JSON files, which serialize many kinds of Python objects.
In this chapter, we're going to review some earlier technology and then look specifically at working with CSV files. The main focus, though, is the various kinds of image files that we might need to work with.
In all cases, Python encourages looking at a file as a kind of context. This means that we should strive to open files using the with statement so that we can be sure the file is properly closed when we're done with the processing. This doesn't always work out perfectly, so there are some exceptions.
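As a minimal sketch of the file-as-context idea, the following example writes and then reads a small text file inside with blocks. The file name briefing.txt and its contents are hypothetical, chosen only for illustration; the temporary directory keeps the example self-contained.

```python
import os
import tempfile

# Hypothetical file name and contents, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "briefing.txt")

# The file is closed automatically when the with block ends.
with open(path, "w", encoding="utf-8") as target:
    target.write("meet at noon\n")

# The same holds for reading, even if an exception is raised inside the block.
with open(path, encoding="utf-8") as source:
    text = source.read()

print(text, end="")
print(source.closed)  # the file object reports that it has been closed
```

Once control leaves each with block, the file object's closed attribute is True, so no explicit call to close() is needed.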
Working with the OS filesystem
There are many modules for working with files. We'll focus on two: glob and os.
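As a quick preview of how the two modules can work together, here is a small sketch that builds a wildcard pattern with os.path.join() and expands it with glob.glob(). The folder contents (a.csv, b.csv, notes.txt) are hypothetical and are created in a temporary directory just for this example.

```python
import glob
import os
import tempfile

# Build a scratch folder with a few empty, hypothetical files.
folder = tempfile.mkdtemp()
for name in ("a.csv", "b.csv", "notes.txt"):
    with open(os.path.join(folder, name), "w") as empty_file:
        empty_file.write("")

# glob.glob() expands the wildcard pattern into a list of matching paths.
# The order is not guaranteed, so we sort the result.
csv_paths = sorted(glob.glob(os.path.join(folder, "*.csv")))

# os.path.basename() strips the directory part of each path.
csv_names = [os.path.basename(p) for p in csv_paths]
print(csv_names)
```

Only the two names ending in .csv are matched; notes.txt is ignored by the pattern.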
glob
The glob module implements filesystem...