Working with files in Python
When working with files it is important to be able to move through the filesystem, determine the type of file, and open a file in the different modes offered by the operating system.
Reading and writing files in Python
Now we are going to review the methods for reading and writing files. These are the methods we can use on a file object for different operations:
- open(name_file, mode): Built-in function that opens a file in the given mode and returns a file object.
- file.write(string): Writes a string to the file.
- file.read([bufsize]): Reads up to bufsize characters from the file (bytes in binary mode). If called without a size, it reads the entire file.
- file.readline([bufsize]): Reads one line from the file.
- file.close(): Closes the file. After this call, the file object can no longer be used for reading or writing.
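As a minimal sketch of how these calls fit together, the following snippet writes a small file and reads it back in pieces. The temporary directory and the demo.txt name are assumptions made only for illustration.

```python
import os
import tempfile

# Hypothetical path inside a temporary directory, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# open() is a built-in function; write() and close() are methods of the object it returns.
f = open(path, "w")
f.write("first line\nsecond line\n")
f.close()

f = open(path, "r")
first_five = f.read(5)        # reads up to 5 characters in text mode
rest_of_line = f.readline()   # reads the remainder of the current line
remaining = f.read()          # no size given: reads everything that is left
f.close()

print(repr(first_five), repr(rest_of_line), repr(remaining))
```

Each read continues from the position where the previous one stopped, which is why the three calls return consecutive pieces of the file.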
The open() function is usually called with two parameters (the file we are going to work with and the access mode) and returns a file object.
The opening modes can be r (read), w (write), and a (append). We can combine these modes with others depending on the file type: b (binary), t (text), and + (open for both reading and writing). For example, adding + to the mode (as in r+ or w+) allows read and write operations on the same object. The following session opens a file in write mode and shows the type of the returned object:
>>> f = open("file.txt","w")
>>> type(f)
<class '_io.TextIOWrapper'>
>>> f.close()
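The effect of the different modes can be sketched as follows. The file name modes.txt and the temporary directory are hypothetical; the snippet only illustrates the w+, a, and rb modes described above.

```python
import os
import tempfile

# Hypothetical path used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# "w+" creates (or truncates) the file and allows both writing and reading.
with open(path, "w+") as f:
    f.write("hello")
    f.seek(0)              # move back to the start before reading
    content = f.read()

# "a" appends to the end of the file instead of truncating it.
with open(path, "a") as f:
    f.write(" world")

# "rb" returns raw bytes rather than decoded text.
with open(path, "rb") as f:
    raw = f.read()

print(content)
print(raw)
```

Note that after writing in w+ mode the position is at the end of the file, so a seek(0) is needed before reading the content back.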
The following properties of the file object can be accessed:
- closed: Returns True if the file has been closed. Otherwise, False.
- mode: Returns the opening mode.
- name: Returns the name of the file.
- encoding: Returns the character encoding of a text file.
In the following example, we are using these properties to get information about the file.
You can find the following code in the read_file_properties.py file.
file_descriptor = open("read_file_properties.py", "r+")
print("Content: "+file_descriptor.read())
print("Name: "+file_descriptor.name)
print("Mode: "+file_descriptor.mode)
print("Encoding: "+str(file_descriptor.encoding))
file_descriptor.close()
When reading a file, the readlines() method reads all the lines of the file and returns them as a list of strings. This method is very useful if you want to read the entire file at once:
>>> allLines = file.readlines()
The alternative is to read the file line by line, for which we can use the readline() method. We can also use the file object itself as an iterator if we want to read all the lines of a file one by one:
>>> with open("file.txt","r") as file:
...     for line in file:
...         print(line)
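The two line-by-line approaches can be compared in a short sketch. The lines.txt file is created here only for the example; each line read from a file keeps its trailing newline, so the snippet strips it before storing the text.

```python
import os
import tempfile

# Hypothetical sample file created only for this comparison.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w") as f:
    f.write("alpha\nbeta\n")

# Explicit readline() loop: an empty string signals the end of the file.
with open(path, "r") as f:
    collected = []
    line = f.readline()
    while line != "":
        collected.append(line.rstrip("\n"))  # drop the trailing newline
        line = f.readline()

# Iterating over the file object yields the same lines with less code.
with open(path, "r") as f:
    iterated = [line.rstrip("\n") for line in f]

print(collected)
print(iterated)
```

Iterating over the file object is usually the more readable choice, and it avoids loading the whole file into memory at once.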
In the following example, we are using the readlines() method to process the file and get counts of the lines and characters in this file.
You can find the following code in the count_lines_chars.py file.
try:
    countlines = countchars = 0
    file = open('count_lines_chars.py', 'r')
    lines = file.readlines()
    for line in lines:
        countlines += 1
        for char in line:
            countchars += 1
    file.close()
    print("Characters in file:", countchars)
    print("Lines in file:", countlines)
except IOError as error:
    print("I/O error occurred:", str(error))
If the file we are trying to read does not exist in the current working directory, open() raises an I/O exception and the script prints an error message similar to the following (shown here for a missing file named newfile.txt):
I/O error occurred: [Errno 2] No such file or directory: 'newfile.txt'
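The same counting logic can be sketched as a reusable function. The name count_lines_chars() and the sample file are assumptions for illustration, not part of the original script.

```python
import os
import tempfile

def count_lines_chars(path):
    """Return (lines, characters) for a text file, or None if it cannot be opened."""
    try:
        with open(path, "r") as f:
            lines = f.readlines()
    except OSError as error:
        print("I/O error occurred:", error)
        return None
    # Each element of lines still contains its newline, so it is counted too.
    return len(lines), sum(len(line) for line in lines)

# Hypothetical sample file used to exercise the function.
sample = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample, "w") as f:
    f.write("ab\ncde\n")

print(count_lines_chars(sample))
print(count_lines_chars(sample + ".missing"))
```

Returning None for a missing file is one possible design; re-raising the exception would be an equally valid choice when the caller should decide how to react.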
Writing text files is possible using the write() method, which expects a single argument: the string that will be written to the open file. You can find the following code in the write_lines.py file:
try:
    myfile = open('newfile.txt', 'wt')
    for i in range(10):
        myfile.write("line #" + str(i+1) + "\n")
    myfile.close()
except IOError as error:
    print("I/O error occurred: ", str(error.errno))
In the previous code, we can see how a new file called newfile.txt is created. The open mode wt means that the file is opened in write mode and text format.
There are multiple ways to open and create files in Python, but the safest way is by using the with keyword, in which case we are using the context manager approach. When we call the open() function directly, Python delegates to the developer the responsibility for closing the file, and this practice can cause errors since developers sometimes forget to close it.
Developers can use the with statement to handle this situation safely. The with statement automatically closes the file even if an exception is raised, so we don’t need to call the close() method.
You can find the following code in the creating_file.py file:
def main():
    with open('test.txt', 'w') as file:
        file.write("this is a test file")
if __name__ == '__main__':
    main()
The previous code uses the context manager to open a file and bind it to the file variable. We then call file.write("this is a test file"), which writes the string into the created file. The with statement then handles closing the file for us, so we don’t have to think about it.
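To see that the file really is closed when an exception occurs, consider the following sketch. The ValueError is raised on purpose to simulate a failure, and the temporary path is an assumption for the example.

```python
import os
import tempfile

# Hypothetical path used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "test.txt")

try:
    with open(path, "w") as file:
        file.write("this is a test file")
        raise ValueError("simulated failure inside the with block")
except ValueError as error:
    print("Caught:", error)

# The file object still exists, but the with statement has already closed it.
print("Closed after exception:", file.closed)
```

Because closing also flushes the buffer, the text written before the failure is still saved to disk.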
IMPORTANT NOTE
For more information about the with statement, you can check out the official documentation at https://docs.python.org/3/reference/compound_stmts.html#the-with-statement.
At this point we have reviewed the section on working with files in Python. The main advantage of using these methods is that they provide an easy way by which you can automate the process of managing files in the operating system.
In the next section, we’ll review how to manage exceptions in Python scripts. We’ll review the main exceptions we can find in Python for inclusion in our scripts.
Learn and understand exceptions management in Python
Each time your code executes in an unintended way, Python stops your program and creates a special kind of object, called an exception. Exceptions are errors that Python detects during the execution of the program. If the interpreter encounters an unusual circumstance, such as an attempt to divide a number by 0 or to access a file that does not exist, an exception is created, or thrown, to signal that there is a problem.
When an exception is not handled, the execution flow is interrupted, and the console shows the information associated with the exception so that the developer can use it to solve the problem. Exceptions can also be handled so that the program does not terminate.
Let’s look at some examples of exceptions:
>>> 4/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>> a+4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>> "4"+4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "int") to str
In the previous examples, we can see the exception traceback, which consists of a list of the calls that caused the exception. As we see in each traceback, the error was caused by executing an operation that is not permitted in Python.
IMPORTANT NOTE
Python provides effective methods that allow you to observe exceptions, identify them, and handle them efficiently. This is possible since all potential exceptions have their unambiguous names, so you can categorize them and react appropriately. We will review some tools in the Development environments for Python scripting section with some interesting techniques such as debugging.
In Python, we can use a try/except block to handle exceptions. In the following example, the program tries to divide by zero. When the error happens, the except block captures the exception and prints a message that is relevant to it:
>>> try:
...     print("10/0=",str(10/0))
... except Exception as exception:
...     print("Error =",str(exception))
...
Error = division by zero
The try keyword begins a block of code that may or may not run correctly. Python tries to perform the operations in this block; if one of them fails, an exception is raised, and Python starts to look for a matching handler.
At this point, the except keyword starts a piece of code that will be executed if anything inside the try block goes wrong. If an exception is raised inside the preceding try block, execution jumps here, so the code located after the except keyword should provide an adequate reaction to the raised exception. The following code raises an exception related to accessing an element that does not exist in the list:
>>> try:
...     list=[]
...     element=list[0]
... except Exception as exception:
...     print("Exception=",str(exception))
...
Exception= list index out of range
In the previous code the exception is produced when trying to access the first element of an empty list.
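Catching the generic Exception class works, but it is often clearer to name the specific exception we expect. The following sketch wraps the two situations shown so far in small helper functions; safe_first() and safe_divide() are hypothetical names chosen for illustration.

```python
def safe_first(items):
    """Return the first element, or None when the sequence is empty."""
    try:
        return items[0]
    except IndexError as exception:
        print("Exception =", exception)
        return None

def safe_divide(a, b):
    """Return a / b, or None when b is zero."""
    try:
        return a / b
    except ZeroDivisionError as exception:
        print("Error =", exception)
        return None

print(safe_first([]))      # handled IndexError
print(safe_first([7]))
print(safe_divide(10, 0))  # handled ZeroDivisionError
print(safe_divide(10, 4))
```

Naming the exception type means that unrelated errors, such as a TypeError caused by passing the wrong kind of argument, are not silently swallowed.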
In the following example, we combine exception management with file handling. If the file is not found in the filesystem, open() raises a FileNotFoundError. In Python 3, IOError is an alias of OSError, the parent class of FileNotFoundError, so we can capture the error with an except IOError clause in our try..except block. You can find the following code in the read_file_exception.py file:
try:
    file_handle = open("myfile.txt", "r")
except IOError as exception:
    print("Exception IOError: Unable to read from myfile ", exception)
except Exception as exception:
    print("Exception: ", exception)
else:
    print("File read successfully")
    file_handle.close()
In the preceding code, we manage an exception when opening a file in read mode. If the file does not exist, the script prints the message "Exception IOError: Unable to read from myfile [Errno 2] No such file or directory: 'myfile.txt'".
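A variation of this pattern, sketched below, catches the more specific FileNotFoundError first and adds a finally clause so that the file is closed on every path. The read_text() helper and the file names are assumptions made for the example.

```python
import os
import tempfile

def read_text(path):
    """Return the file content, or None if the file cannot be opened."""
    file_handle = None
    try:
        file_handle = open(path, "r")
    except FileNotFoundError as exception:
        # The most specific handler comes first.
        print("File not found:", exception.filename)
        return None
    except OSError as exception:
        print("Other I/O error:", exception)
        return None
    else:
        # Runs only when open() succeeded.
        return file_handle.read()
    finally:
        # Runs in every case, even after a return statement.
        if file_handle is not None:
            file_handle.close()

# Hypothetical files used to exercise both branches.
existing = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with open(existing, "w") as f:
    f.write("some content")

print(read_text(existing))
print(read_text(existing + ".missing"))
```

The order of the except clauses matters: Python uses the first handler that matches, so a general class such as OSError placed first would hide the more specific one.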
Python 3 defines more than 60 built-in exceptions (the exact number depends on the interpreter version), and all of them form a tree-shaped hierarchy. Some of the built-in exceptions are more general (they include other exceptions), while others are completely concrete. We can say that the closer to the root an exception is located, the more general (abstract) it is.
Some of the exceptions available by default are listed here (the class from which they are derived is in parentheses):
- BaseException: The class from which all exceptions inherit.
- Exception (BaseException): The base class of all built-in exceptions that do not signal interpreter exit. User-defined exceptions should also derive from it.
- ZeroDivisionError (ArithmeticError): An exception raised when the second argument of a division is 0. This is a special case of a more general exception class named ArithmeticError.
- OSError (Exception): An error reported by a system call, including input/output failures.
- EnvironmentError and IOError: Since Python 3.3, these names are kept as aliases of OSError. StandardError, their former parent class in Python 2, no longer exists in Python 3.
- ImportError (Exception): The module or the module element that you wanted to import was not found.
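These relationships can be checked directly in the interpreter. The following sketch uses only the built-in issubclass() function and the __mro__ attribute to inspect some of the classes listed above.

```python
# ZeroDivisionError is a special case of ArithmeticError.
zero_is_arithmetic = issubclass(ZeroDivisionError, ArithmeticError)

# FileNotFoundError is one of the concrete subclasses of OSError.
missing_is_oserror = issubclass(FileNotFoundError, OSError)

# IOError and EnvironmentError are aliases: they are the very same class object as OSError.
aliases = (IOError is OSError, EnvironmentError is OSError)

# KeyboardInterrupt derives from BaseException directly, not from Exception.
interrupt_is_exception = issubclass(KeyboardInterrupt, Exception)

# The method resolution order lists a class followed by all of its ancestors.
ancestry = [cls.__name__ for cls in ZeroDivisionError.__mro__]

print(zero_is_arithmetic, missing_is_oserror, aliases, interrupt_is_exception)
print(ancestry)
```

This is also why an except Exception clause does not intercept Ctrl+C: KeyboardInterrupt sits outside the Exception branch of the tree.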
All the built-in Python exceptions form a hierarchy of classes. The following script dumps all predefined exception classes in the form of a tree-like printout.
You can find the following code in the get_exceptions_tree.py file:
def printExceptionsTree(ExceptionClass, level = 0):
    if level > 1:
        print(" |" * (level - 1), end="")
    if level > 0:
        print(" +---", end="")
    print(ExceptionClass.__name__)
    for subclass in ExceptionClass.__subclasses__():
        printExceptionsTree(subclass, level+1)
printExceptionsTree(BaseException)
As a tree is a perfect example of a recursive data structure, recursion is a natural tool for traversing it. The printExceptionsTree() function takes two arguments:
- A point inside the tree from which we start traversing the tree
- A level to build a simplified drawing of the tree’s branches
This could be a partial output of the previous script:
BaseException
+---Exception
| +---TypeError
| +---StopAsyncIteration
| +---StopIteration
| +---ImportError
| | +---ModuleNotFoundError
| | +---ZipImportError
| +---OSError
| | +---ConnectionError
| | | +---BrokenPipeError
| | | +---ConnectionAbortedError
| | | +---ConnectionRefusedError
| | | +---ConnectionResetError
| | +---BlockingIOError
| | +---ChildProcessError
| | +---FileExistsError
| | +---FileNotFoundError
| | +---IsADirectoryError
| | +---NotADirectoryError
| | +---InterruptedError
| | +---PermissionError
| | +---ProcessLookupError
| | +---TimeoutError
| | +---UnsupportedOperation
| | +---herror
| | +---gaierror
| | +---timeout
| | +---Error
| | | +---SameFileError
| | +---SpecialFileError
| | +---ExecError
| | +---ReadError
In the output of the previous script, we can see that the root of Python’s exception classes is the BaseException class (this is a superclass of all the other exceptions). For each class it encounters, the printExceptionsTree() function performs the following operations:
- Print its name, taken from the __name__ property.
- Iterate through the list of subclasses delivered by the __subclasses__() method, and recursively invoke the printExceptionsTree() function, incrementing the nesting level each time.
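The same recursive traversal can also collect the class names instead of printing them, which makes it easy to count them. The collect_exception_names() function below is a hypothetical variant written for illustration; the total it reports depends on the Python version and on which modules have been imported.

```python
def collect_exception_names(exception_class):
    """Return the name of exception_class followed by the names of all its descendants."""
    names = [exception_class.__name__]
    for subclass in exception_class.__subclasses__():
        # A class with several exception parents is visited once per parent.
        names.extend(collect_exception_names(subclass))
    return names

names = collect_exception_names(BaseException)
print(len(names), "exception classes reachable from BaseException")
print("FileNotFoundError" in names)
```

Because __subclasses__() only reports classes that are currently loaded, importing modules such as socket or shutil adds entries like the herror and SameFileError lines seen in the output above.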
Now that you know the functions, classes, objects, and exceptions for working with Python, let’s move on to learning how to manage modules and packages. Also, we will review the use of some modules for managing parameters, including argparse and optparse.