We will often create file objects to read data from or write data to a file. File objects can be created using the built-in open() function. The open() function is most commonly called with two arguments: the name of the file and the mode. The mode dictates how we can interact with the file object. The mode argument is optional and, if omitted, defaults to read-only text mode. The following table lists the file modes available for use:
File Mode | Description |
r | Opens the file in read-only mode (default). This does not offer forensic write protection! Always use a certified process to protect evidence from modification. |
w | Creates the file for writing, or overwrites it if it already exists. |
a | Creates the file for writing if it does not exist. If the file does exist, the file pointer is placed at the end of the file so that writes are appended. |
rb, wb, or ab | Opens the file for reading, writing, or appending in binary mode. |
r+, rb+, w+, wb+, a+, or ab+ | Opens the file for both reading and writing in either standard or binary mode. If the file does not exist, the w+ and a+ variants create it; the r+ variants require the file to exist. |
Most often, we will use read and write in standard or binary mode. Let's take a look at a few examples and some of the common functions we might use. For this section, we will create a text file called file.txt with the following content:
This is a simple test for file manipulation.
We will often find ourselves interacting with file objects.
It pays to get comfortable with these objects.
In the following example, we open an existing file, file.txt, and assign the resulting file object to a variable, in_file. Since we do not supply a file mode, it is opened in read-only mode by default. We can use the read() method to read the entire file as one continuous string. The readline() method reads a single line as a string. Alternatively, the readlines() method creates a string for each line and stores them in a list. Each of these methods takes an optional size argument: for read() and readline(), it is the maximum number of characters to read in standard mode (or bytes in binary mode), and for readlines(), it is a hint that stops reading further lines once roughly that much data has been read.
Python keeps track of where we currently are in the file. To illustrate the examples we've described, we need to use the seek() method to bring us back to the start of the file before we run our next example. The seek() method accepts a number and moves to that byte offset within the file; in standard (text) mode, the only reliable offsets are 0 and values previously returned by tell(). For example, if we called readline() immediately after read() without first seeking back to the start, it would return an empty string. This is because read() leaves the cursor at the end of the file:
>>> in_file = open('file.txt')
>>> print(in_file.read())
This is a simple test for file manipulation.
We will often find ourselves interacting with file objects.
It pays to get comfortable with these objects.
>>> in_file.seek(0)
>>> print(in_file.readline())
This is a simple test for file manipulation.
>>> in_file.seek(0)
>>> print(in_file.readlines())
['This is a simple test for file manipulation.\n', 'We will often find ourselves interacting with file objects.\n', 'It pays to get comfortable with these objects.']
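The size argument and the cursor behavior can be seen together in the short sketch below. It is an illustration only: it recreates file.txt in a temporary directory so that it can run anywhere, and the variable names are assumptions made for this example.

```python
import os
import tempfile

lines = [
    'This is a simple test for file manipulation.\n',
    'We will often find ourselves interacting with file objects.\n',
    'It pays to get comfortable with these objects.',
]

# Recreate the sample file in a temporary directory (illustration only).
path = os.path.join(tempfile.mkdtemp(), 'file.txt')
setup_file = open(path, 'w')
setup_file.writelines(lines)
setup_file.close()

in_file = open(path)
first_four = in_file.read(4)        # reads at most four characters
rest_of_line = in_file.readline()   # continues from the cursor to the end of the line
in_file.seek(0)                     # back to the start of the file
everything = in_file.read()         # the cursor is now at the end of the file
at_end = in_file.read()             # nothing left to read, so an empty string
in_file.close()

print(repr(first_four))
print(repr(rest_of_line))
print(repr(at_end))
```

Because each read continues from wherever the previous one stopped, readline() here returns only the remainder of the first line, not the whole line.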
In a similar fashion, we can create a new file, or open and overwrite an existing one, using the w file mode. We can use the write() method to write an individual string or the writelines() method to write an iterable of strings to the file. The writelines() method essentially calls the write() method for each element of the iterable; it does not add line separators, so any newline characters must already be part of the strings.
For example, this is tantamount to calling write() on each element of a list:
>>> out_file = open('output.txt', 'w')
>>> out_file.write('Hello output!')
>>> data = ['falken', '124', 'joshua']  # writelines() accepts only strings
>>> out_file.writelines(data)
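When a list mixes strings with other types, such as an integer, writelines() raises a TypeError, and it never inserts newlines on its own. One possible way to handle both points, sketched here with a temporary output path chosen purely for illustration, is to convert each element to a string and append the newline ourselves:

```python
import os
import tempfile

# Hypothetical output location in a temporary directory (illustration only).
path = os.path.join(tempfile.mkdtemp(), 'output.txt')
data = ['falken', 124, 'joshua']

out_file = open(path, 'w')
out_file.write('Hello output!\n')
# Convert every element to str and add the line separator explicitly.
out_file.writelines(str(item) + '\n' for item in data)
out_file.close()

# Read the file back to confirm what was written.
check_file = open(path)
written = check_file.read()
check_file.close()
print(written)
```

The generator expression is passed directly to writelines(), which accepts any iterable of strings, so no intermediate list is needed.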
Python usually closes a file object automatically once it is no longer referenced, but the timing is not guaranteed. Best practice therefore dictates that we use the flush() and close() methods after we finish writing data to a file. The flush() method writes any data remaining in a buffer to the file, and the close() method closes the file object, flushing any remaining data first:
>>> out_file.flush()
>>> out_file.close()
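A common alternative to calling close() by hand is the with statement, which closes the file object when the block ends, even if an exception is raised inside it. The following is a minimal sketch under the assumption of a temporary output path; the variable names are chosen for this example only:

```python
import os
import tempfile

# Hypothetical output location in a temporary directory (illustration only).
path = os.path.join(tempfile.mkdtemp(), 'output.txt')

with open(path, 'w') as out_file:
    out_file.write('Hello output!')
    inside_closed = out_file.closed   # still open inside the block

after_closed = out_file.closed        # closed (and flushed) after the block

with open(path) as in_file:
    content = in_file.read()

print(inside_closed, after_closed, content)
```

The closed attribute is a convenient way to confirm the state of a file object when debugging a script that handles many files.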