Bundling and compressing files
Geospatial datasets often consist of multiple files, so they are commonly distributed as ZIP or TAR archives. Both formats can also compress data, but their ability to bundle multiple files is the main reason they are used for geospatial data. The TAR format itself doesn’t include a compression algorithm; instead, TAR archives are typically compressed with gzip, which the tar program offers as an option. Python has standard modules for reading and writing both ZIP and TAR archives. These modules are called zipfile and tarfile, respectively.
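As a minimal sketch of the TAR-plus-gzip combination described above, the following snippet bundles three files into a gzip-compressed TAR archive with the tarfile module and then lists its contents. The file names (example.shp, example.shx, example.dbf) and the temporary working directory are placeholders invented for illustration; they are not part of the dataset used in this chapter.

```python
import os
import tarfile
import tempfile

# Work in a throwaway directory so the sketch does not touch real data.
workdir = tempfile.mkdtemp()

# Create three small placeholder files standing in for shapefile components.
# These names are hypothetical and only serve the illustration.
names = ["example.shp", "example.shx", "example.dbf"]
for name in names:
    with open(os.path.join(workdir, name), "wb") as f:
        f.write(b"placeholder")

# Mode "w:gz" writes a TAR archive and compresses it with gzip.
archive_path = os.path.join(workdir, "example.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    for name in names:
        # arcname stores only the file name, not the temporary directory path.
        tar.add(os.path.join(workdir, name), arcname=name)

# Mode "r:gz" reads the gzip-compressed archive back.
with tarfile.open(archive_path, "r:gz") as tar:
    members = tar.getnames()

print(members)
```

The bundling is done by TAR and the compression by gzip, which is why the mode string names both.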
The following example extracts the hancock.shp, hancock.shx, and hancock.dbf files contained in the hancock.zip file that we downloaded using urllib for use in the previous examples. This example assumes that the ZIP file is in the current directory:
import zipfile
zip = open("hancock.zip", "rb")
zipShape = zipfile.ZipFile(zip)
shpName, shxName, dbfName = zipShape.namelist()
shpFile = open(shpName, "...
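Unpacking namelist() directly into three variables depends on the order in which the files were stored in the archive. One possible variation, sketched here under stated assumptions, selects each member by its file extension instead. To keep the sketch self-contained, it first builds a small stand-in archive named sample.zip in a temporary directory; that archive, its member names, and the extracted dictionary are hypothetical and do not represent the hancock.zip data.

```python
import os
import tempfile
import zipfile

# Build a stand-in archive so the sketch runs on its own.
# sample.zip and its members are hypothetical placeholders.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "sample.zip")
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    # Deliberately stored out of the .shp/.shx/.dbf order.
    for name in ("sample.dbf", "sample.shp", "sample.shx"):
        zf.writestr(name, b"placeholder " + name.encode("ascii"))

# Extract the three shapefile components, keyed by extension
# rather than by their position in the archive.
wanted = (".shp", ".shx", ".dbf")
extracted = {}
with zipfile.ZipFile(zip_path) as zf:
    for name in zf.namelist():
        ext = os.path.splitext(name)[1].lower()
        if ext in wanted:
            # Write next to the archive, using only the base file name.
            out_path = os.path.join(workdir, os.path.basename(name))
            with open(out_path, "wb") as out:
                out.write(zf.read(name))
            extracted[ext] = out_path

print(sorted(extracted))
```

Using with blocks closes both the archive and the output files even if an error occurs partway through.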