Retrieving files from an FTP server
Retrieving files from an FTP server for processing is a very common operation for GIS programmers and can be automated with a Python script.
Getting ready
Connecting to an FTP server and downloading a file is accomplished through the ftplib module. A connection to an FTP server is created through the FTP object, which accepts a host, username, and password to create the connection. Once a connection has been opened, you can then search for and download files.
In this recipe, you will connect to the National Interagency Fire Center Incident FTP site and download a Google Earth format file for a wildfire in Alaska.
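As a rough illustration of how the FTP object takes a host and optional credentials, the sketch below wraps the connection and login in one function. The function name open_connection, the 30-second timeout, and the ftp_class parameter are assumptions made for this example; ftp_class exists only so the function can be exercised with a stand-in class instead of a live server.

```python
import ftplib


def open_connection(host, user=None, passwd=None, timeout=30,
                    ftp_class=ftplib.FTP):
    """Connect to host and log in, anonymously unless a user is given.

    Hypothetical helper: the name and the ftp_class parameter are
    illustrative and not part of ftplib.
    """
    # Passing a host to the constructor opens the connection immediately.
    ftp = ftp_class(host, timeout=timeout)
    if user is None:
        ftp.login()              # no arguments requests an anonymous login
    else:
        ftp.login(user, passwd)  # log in with a named account
    return ftp
```

Under these assumptions, open_connection('ftp.nifc.gov') would attempt an anonymous login, while open_connection(host, 'someuser', 'somepassword') would log in with an account on a server that requires one.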
How to do it…
Follow these steps to create a script that connects to an FTP server and downloads a file:
Open IDLE and create a file called c:\ArcpyBook\Appendix2\ftp.py. We'll be connecting to an FTP server at the NIFC. Visit their website at http://ftpinfo.nifc.gov/ for more information.
Import the ftplib, os, and socket modules:
import ftplib
import os
import socket
Add the following variables that define the host name, directory, and filename:
HOST = 'ftp.nifc.gov'
DIRN = '/Incident_Specific_Data/ALASKA/Fire_Perimeters/20090805_1500'
FILE = 'FirePerimeters_20090805_1500.kmz'
Add the following code block to create a connection. If the server cannot be reached, an error message is printed and the script stops. If the connection succeeds, a success message is printed:
try:
    f = ftplib.FTP(HOST)
except (socket.error, socket.gaierror):
    print('ERROR: cannot reach "%s"' % HOST)
    # Stop here: without a connection, the remaining steps cannot run
    raise SystemExit(1)
print('*** Connected to host "%s"' % HOST)
Add the following code block to anonymously log in to the server:
try:
    f.login()
except ftplib.error_perm:
    print('ERROR: cannot login anonymously')
    f.quit()
    # Stop here: the connection has been closed
    raise SystemExit(1)
print('*** Logged in as "anonymous"')
Add the following code block to change to the directory specified in our DIRN variable:
try:
    f.cwd(DIRN)
except ftplib.error_perm:
    print('ERROR: cannot CD to "%s"' % DIRN)
    f.quit()
    # Stop here: the connection has been closed
    raise SystemExit(1)
print('*** Changed to "%s" folder' % DIRN)
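Use the FTP.retrbinary() method to retrieve the KMZ file:
try:
    # The with block closes the local file before any cleanup runs
    with open(FILE, 'wb') as local_file:
        f.retrbinary('RETR %s' % FILE, local_file.write)
except ftplib.error_perm:
    print('ERROR: cannot read file "%s"' % FILE)
    # Remove the empty or partial local file
    os.unlink(FILE)
else:
    print('*** Downloaded "%s" to CWD' % FILE)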
Make sure you disconnect from the server:
f.quit()
The entire script should appear as follows:
import ftplib
import os
import socket

HOST = 'ftp.nifc.gov'
DIRN = '/Incident_Specific_Data/ALASKA/Fire_Perimeters/20090805_1500'
FILE = 'FirePerimeters_20090805_1500.kmz'

# Connect to the server; stop if it cannot be reached
try:
    f = ftplib.FTP(HOST)
except (socket.error, socket.gaierror):
    print('ERROR: cannot reach "%s"' % HOST)
    raise SystemExit(1)
print('*** Connected to host "%s"' % HOST)

# Log in anonymously
try:
    f.login()
except ftplib.error_perm:
    print('ERROR: cannot login anonymously')
    f.quit()
    raise SystemExit(1)
print('*** Logged in as "anonymous"')

# Change to the directory that holds the file
try:
    f.cwd(DIRN)
except ftplib.error_perm:
    print('ERROR: cannot CD to "%s"' % DIRN)
    f.quit()
    raise SystemExit(1)
print('*** Changed to "%s" folder' % DIRN)

# Download the file; the with block closes it before any cleanup
try:
    with open(FILE, 'wb') as local_file:
        f.retrbinary('RETR %s' % FILE, local_file.write)
except ftplib.error_perm:
    print('ERROR: cannot read file "%s"' % FILE)
    os.unlink(FILE)
else:
    print('*** Downloaded "%s" to CWD' % FILE)

# Disconnect from the server
f.quit()
Save and run the script. If everything is successful, you should see the following output:
*** Connected to host "ftp.nifc.gov"
*** Logged in as "anonymous"
*** Changed to "/Incident_Specific_Data/ALASKA/Fire_Perimeters/20090805_1500" folder
*** Downloaded "FirePerimeters_20090805_1500.kmz" to CWD
Check your c:\ArcpyBook\Appendix2 directory for the file. The script opens the local file by name only, so the download is written to the current working directory.
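If you would rather not depend on the current working directory, one option is to build the full local path yourself. The following is a minimal sketch of that idea; the function name download_to and its parameters are hypothetical, and ftp is assumed to be an FTP object that is already connected, logged in, and positioned in the remote directory.

```python
import os


def download_to(ftp, filename, local_dir):
    """Download filename from the current remote directory into local_dir.

    Hypothetical helper: local_dir is assumed to exist already.
    """
    # Join the target folder and the filename instead of relying on the CWD.
    local_path = os.path.join(local_dir, filename)
    with open(local_path, 'wb') as local_file:
        # retrbinary() passes each downloaded block to the write method.
        ftp.retrbinary('RETR %s' % filename, local_file.write)
    return local_path
```

With the variables from this recipe, a call such as download_to(f, FILE, r'c:\ArcpyBook\Appendix2') should place the KMZ file in that folder regardless of where the script was started.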
How it works…
To connect to an FTP server, you need to know its host name. You also need to know the directory and filename of the file that will be downloaded. In this script, we have hardcoded this information so that you can focus on implementing the FTP-specific functionality. Using this information, we then created a connection to the NIFC FTP server. This is done by creating an ftplib.FTP object, which accepts the host name of the server.
Anonymous logins are accepted by the nifc.gov server, so we connect to the server in this manner. Keep in mind that if a server does not accept anonymous connections, you'll need to obtain a username/password. Once logged in, the script then changes directories from the root of the FTP server to the path defined in the DIRN variable. This was accomplished with the cwd(<path>) method. The KMZ file was retrieved using the retrbinary() method. Finally, you will want to close your connection to the FTP server when you're done. This is done with the quit() method.
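One way to make sure quit() is always called, even when a step fails, is to put the work in a try block with a finally clause. The sketch below assumes an FTP object that is already connected and logged in; the function name fetch_and_quit and its parameters are invented for illustration. It catches ftplib.all_errors, the tuple of exceptions that ftplib operations can raise.

```python
import ftplib


def fetch_and_quit(ftp, remote_dir, filename, local_path):
    """Download one file over an open connection, then always disconnect.

    Hypothetical helper: returns True on success and False on failure.
    """
    try:
        ftp.cwd(remote_dir)
        with open(local_path, 'wb') as local_file:
            ftp.retrbinary('RETR %s' % filename, local_file.write)
        return True
    except ftplib.all_errors as err:
        # Covers FTP protocol errors as well as socket and file errors.
        print('ERROR: %s' % err)
        return False
    finally:
        # Runs on both the success and the failure path.
        ftp.quit()
```

This sketch leaves any partial local file in place after a failure; removing it, as the recipe does with os.unlink(), would be a reasonable addition.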
There's more…
There are a number of additional FTP-related methods that you can use to perform various actions. Generally, these can be divided into directory-level operations and file-level operations. Directory-level methods include the dir() method to produce a listing of the files in a directory, mkd() to create a new directory, pwd() to get the current working directory, and cwd() to change the current directory.
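The directory-level methods can be combined to look for files before downloading them. The following sketch uses nlst(), an ftplib method not covered above that returns the names in a directory as a Python list, whereas dir() prints a listing by default. The function name find_files and the '*.kmz' default pattern are assumptions for this example, and ftp is assumed to be connected and logged in.

```python
import fnmatch


def find_files(ftp, directory, pattern='*.kmz'):
    """Return the sorted names in directory that match pattern.

    Hypothetical helper built on cwd(), pwd(), and nlst().
    """
    ftp.cwd(directory)                  # change the current directory
    print('Listing %s' % ftp.pwd())     # confirm where we ended up
    names = ftp.nlst()                  # names in the current directory
    return sorted(fnmatch.filter(names, pattern))
```

A call such as find_files(f, DIRN) would then return only the KMZ names found in the recipe's directory, which could be passed one at a time to retrbinary().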
The ftplib module also includes various methods for working with files. You can upload and download files in binary or plain text format. The retrbinary() and storbinary() methods are used to retrieve and store binary files, respectively. Plain text files can be retrieved and stored using retrlines() and storlines().
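To show how the text and binary variants differ in use, here is a small sketch with one helper for each direction. The function names read_text_file and upload_binary are hypothetical, ftp is assumed to be connected and logged in, and uploading assumes an account with write permission, which an anonymous login normally does not have.

```python
def read_text_file(ftp, filename):
    """Return the lines of a remote text file as a list of strings."""
    lines = []
    # retrlines() calls the callback once per line, without the line ending.
    ftp.retrlines('RETR %s' % filename, lines.append)
    return lines


def upload_binary(ftp, local_path, remote_name):
    """Send a local file to the server's current directory in binary mode."""
    # storbinary() reads from a file object opened in binary mode.
    with open(local_path, 'rb') as local_file:
        ftp.storbinary('STOR %s' % remote_name, local_file)
```

Binary mode is the safer choice for formats such as KMZ, since text mode transfers data line by line.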
There are several other methods on the FTP class that you should be aware of. Deleting a file can be done with the delete() method, while renaming a file can be accomplished with rename(). You can also send commands to the FTP server through the sendcmd() method.
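As a brief illustration of these three methods together, the sketch below replaces one remote file with another and then sends a NOOP command to confirm that the server is still responding. The function name replace_remote_file is hypothetical, and the code assumes a connected, logged-in FTP object with permission to modify files on the server.

```python
import ftplib


def replace_remote_file(ftp, temp_name, final_name):
    """Swap temp_name into place as final_name and return the NOOP reply."""
    try:
        ftp.delete(final_name)          # remove the old copy, if any
    except ftplib.error_perm:
        # Assumption: the server refused because the file does not exist.
        pass
    ftp.rename(temp_name, final_name)   # rename(fromname, toname)
    # sendcmd() returns the server's response string for the command.
    return ftp.sendcmd('NOOP')
```

Treating every error_perm from delete() as "file not found" is a simplification; a stricter version would inspect the reply code before continuing.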