Reading data from a delimited text file
File handling in Python is an important topic for GIS programmers. Text files are often used as an interchange format between systems because they are simple, cross-platform, and easy to process. Comma- and tab-delimited text files are among the most commonly used formats, so we'll take an extensive look at the Python tools available for processing them. A common task for GIS programmers is to read comma-delimited text files containing x and y coordinates along with other attribute information. This information is then converted into GIS data formats, such as shapefiles or geodatabases.
Getting ready
To use Python's built-in file processing functionality, you must first open the file. Once open, data within the file is processed using functions provided by Python, and finally the file is closed. Always remember to close the file when you're done.
In this recipe, you will learn how to open, read, process, and close a comma-delimited text file.
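The open, process, close sequence can be sketched as follows. This is a minimal illustration rather than the recipe's script: it writes one row of the sample data to a temporary file (an assumption made so that the example runs on any machine) and uses Python's with statement, which closes the file automatically when the block ends.

```python
import os
import tempfile

# One row copied from the recipe's sample data.
sample_rows = "18.102,-94.353,310.7,1.3,1.1,10/02/2007,0420,T,72\n"

# Write the sample to a temporary file so the sketch does not depend on
# c:/ArcpyBook/data being present (an assumption made for illustration only).
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False)
tmp.write(sample_rows)
tmp.close()

# Open the file, process it, and let the with statement close it,
# even if an error occurs while reading.
with open(tmp.name, "r") as f:
    first_line = f.readline()

print(f.closed)            # the file object is closed once the block exits
print(first_line.strip())

# Remove the temporary file created for this sketch.
os.remove(tmp.name)
```

If you prefer explicit open() and close() calls, as used in the steps that follow, the effect is the same as long as close() is always reached.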
How to do it…
Follow these steps to create a Python script that reads a comma-delimited text file:
In your c:\ArcpyBook\data folder, you will find a file called N_America.A2007275.txt. Open this file in a text editor. It should appear as follows:
18.102,-94.353,310.7,1.3,1.1,10/02/2007,0420,T,72
19.300,-89.925,313.6,1.1,1.0,10/02/2007,0420,T,82
19.310,-89.927,309.9,1.1,1.0,10/02/2007,0420,T,68
26.888,-101.421,307.3,2.7,1.6,10/02/2007,0425,T,53
26.879,-101.425,306.4,2.7,1.6,10/02/2007,0425,T,45
36.915,-97.132,342.4,1.0,1.0,10/02/2007,0425,T,100
This file contains wildfire incident data derived from a satellite sensor from a single day in 2007. Each row contains latitude and longitude information for the fire along with additional information, including the date and time, the satellite type, confidence value, and others. In this recipe, you are going to pull out only the latitude, longitude, and confidence value.
Open IDLE and create a file called c:\ArcpyBook\Appendix2\ReadDelimitedTextFile.py.
Use the Python open() function to open the file for reading:
f = open('c:/ArcpyBook/data/N_America.A2007275.txt', 'r')
Read the content of the text file into a list:
lstFires = f.readlines()
Add a for loop to iterate over all the rows that have been read into the lstFires variable:
for fire in lstFires:
Use the split() function to split the values into a list using a comma as the delimiter. The list will be assigned to a variable called lstValues. Make sure that you indent this line of code inside the for loop you just created:
    lstValues = fire.split(",")
Using the index values that reference latitude, longitude, and confidence values, create new variables:
    latitude = float(lstValues[0])
    longitude = float(lstValues[1])
    confid = int(lstValues[8])
Print the values of each with print (the parenthesized form shown here runs under both Python 2 and Python 3):
    print("The latitude is: " + str(latitude) + " The longitude is: " + str(longitude) + " The confidence value is: " + str(confid))
Close the file:
f.close()
The entire script should appear as follows:
f = open('c:/ArcpyBook/data/N_America.A2007275.txt', 'r')
lstFires = f.readlines()
for fire in lstFires:
    lstValues = fire.split(',')
    latitude = float(lstValues[0])
    longitude = float(lstValues[1])
    confid = int(lstValues[8])
    print("The latitude is: " + str(latitude) + " The longitude is: " + str(longitude) + " The confidence value is: " + str(confid))
f.close()
Save and run the script. You should see the following output:
The latitude is: 18.102 The longitude is: -94.353 The confidence value is: 72
The latitude is: 19.3 The longitude is: -89.925 The confidence value is: 82
The latitude is: 19.31 The longitude is: -89.927 The confidence value is: 68
The latitude is: 26.888 The longitude is: -101.421 The confidence value is: 53
The latitude is: 26.879 The longitude is: -101.425 The confidence value is: 45
The latitude is: 36.915 The longitude is: -97.132 The confidence value is: 100
How it works…
Python's open() function creates a file object, which serves as a link to a file residing on your computer. You must call the open() function on a file before reading or writing data in it. The first parameter of the open() function is the path to the file you'd like to open. The second parameter is the mode, which is typically read (r), write (w), or append (a). A value of r indicates that you'd like to open the file for read-only operations, while a value of w indicates that you'd like to open the file for write operations. If a file opened in write mode already exists, its existing data is overwritten, so be careful when using this mode. Append mode (a) also opens a file for write operations, but instead of overwriting any existing data, it appends data to the end of the file. In this recipe, we have opened the N_America.A2007275.txt file in read-only mode.
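The difference between the three modes can be seen in a short sketch. The scratch file name modes_demo.txt and the temporary folder are assumptions made for illustration; they are not part of the recipe's data.

```python
import os
import tempfile

# Hypothetical scratch file created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'w' creates the file, or empties it if it already exists.
f = open(path, "w")
f.write("first\n")
f.close()

# Opening in 'w' again discards the earlier content.
f = open(path, "w")
f.write("second\n")
f.close()

# 'a' keeps the existing content and writes at the end of the file.
f = open(path, "a")
f.write("third\n")
f.close()

# 'r' opens the file for reading only.
f = open(path, "r")
contents = f.read()
f.close()

print(contents)
```

After these steps the file holds only the lines second and third, because the second open() in write mode replaced the line written by the first.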
The readlines() function then reads the entire contents of the file into a Python list, which can then be iterated. This list is stored in a variable called lstFires. Each row in the text file becomes a separate item in the list. Since this function reads the entire file into memory at once, use it with caution: very large files can consume a great deal of memory and slow your script down.
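For large files, one common alternative is to loop over the file object itself, which yields one line at a time instead of building the whole list first. The following is a hedged sketch of that approach, not the recipe's method; it uses two rows of the sample data written to a temporary file (an assumption, so that the example is self-contained).

```python
import os
import tempfile

# Two rows copied from the recipe's sample data.
sample = (
    "18.102,-94.353,310.7,1.3,1.1,10/02/2007,0420,T,72\n"
    "19.300,-89.925,313.6,1.1,1.0,10/02/2007,0420,T,82\n"
)

# Hypothetical temporary file standing in for N_America.A2007275.txt.
path = os.path.join(tempfile.mkdtemp(), "fires_sample.txt")
out = open(path, "w")
out.write(sample)
out.close()

confidences = []
f = open(path, "r")
for fire in f:                     # reads one line per iteration
    lstValues = fire.split(",")
    confidences.append(int(lstValues[8]))
f.close()

print(confidences)
```

The body of the loop is unchanged from the recipe; only the source of the lines differs.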
Inside the for loop, which iterates over each of the values in lstFires, the split() function is used to create a list from a line of text that is delimited in some way. Our file is comma-delimited, so we can use split(","). You can also split on other delimiters, such as tabs or spaces. The list created by split() is stored in a variable called lstValues, which contains each of the wildfire values. This is illustrated in the following screenshot. You'll notice that latitude is located in the first position, longitude is located in the second position, and so on. Lists are zero based, so latitude is at index 0, longitude at index 1, and the confidence value at index 8:
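As a small illustration of these positions, the sketch below splits one row copied from the sample file and prints each index next to its value. The tab-delimited line at the end is a made-up example, not data from the recipe.

```python
# One row from the sample file, including its trailing newline.
fire = "18.102,-94.353,310.7,1.3,1.1,10/02/2007,0420,T,72\n"

lstValues = fire.split(",")

# Show each zero-based index alongside the value stored there.
for index, value in enumerate(lstValues):
    print(str(index) + ": " + repr(value))

latitude = float(lstValues[0])
longitude = float(lstValues[1])
# The last item still carries the newline; int() ignores surrounding whitespace.
confid = int(lstValues[8])

# Hypothetical tab-delimited line, split on the tab character instead.
tab_line = "18.102\t-94.353\t72"
tab_values = tab_line.split("\t")
print(tab_values)
```

Note that split() always returns strings, which is why the recipe converts the values with float() and int() before using them as numbers.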
Using the index values (which reference latitude, longitude, and confidence values), we create new variables called latitude, longitude, and confid. Finally, we print each of the values. A more robust geoprocessing script might write this information into a feature class using an InsertCursor object.
There's more…
Just as is the case with reading files, there are a number of methods that you can use to write data to a file. The write() function is probably the easiest to use. It takes a single string argument and writes it to a file. The writelines() function writes the contents of a list of strings to a file; it does not add line endings, so each string must already include its own newline. Before writing data to a text file, you will need to open the file in either write or append mode.
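A minimal sketch of both functions follows. The output file name fires_out.txt, the header row, and the temporary folder are assumptions chosen for illustration; the data values are taken from the recipe's output.

```python
import os
import tempfile

# Hypothetical output file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), "fires_out.txt")

# Write mode: write() takes a single string.
f = open(path, "w")
f.write("latitude,longitude,confidence\n")

# writelines() takes a list of strings and adds no newlines of its own.
rows = ["18.102,-94.353,72\n", "19.3,-89.925,82\n"]
f.writelines(rows)
f.close()

# Append mode: add one more row without disturbing the existing content.
f = open(path, "a")
f.write("19.31,-89.927,68\n")
f.close()

# Read the file back to check what was written.
f = open(path, "r")
written = f.readlines()
f.close()

print(written)
```

Reading the file back should return four lines: the header, the two rows passed to writelines(), and the appended row.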