How is Python code organized?
Let us talk a little bit about how Python code is organized. In this section, we will start to enter the proverbial rabbit hole and introduce more technical names and concepts.
Starting with the basics, how is Python code organized? Of course, you write your code into files. When you save a file with the extension .py, that file is said to be a Python module.
If you are on Windows or macOS, which typically hide file extensions from the user, we recommend that you change the configuration so that you can see the complete names of the files. This is not strictly a requirement, only a suggestion that may come in handy when discerning files from each other.
It would be impractical to save all the code that is required for software to work within one single file. That solution works for scripts, which are usually not longer than a few hundred lines (and often they are shorter than that).
A complete Python application can be made of hundreds of thousands of lines of code, so you will have to scatter it through different modules, which is better, but not good enough. It turns out that even like this, it would still be impractical to work with the code. So, Python gives you another structure, called a package, which allows you to group modules together.
A package is nothing more than a folder. In earlier versions of Python, a special file, __init__.py, was required to mark a directory as a package. This file does not need to contain any code, and even though its presence is not mandatory anymore, there are practical reasons why it is always a good idea to include it nonetheless.
As always, an example will make all this much clearer. We have created an example structure in our book project, and when we type in the console:
$ tree -v example
We get a tree representation of the contents of the ch1/example folder, which contains the code for the examples of this chapter. Here is what the structure of a simple application could look like:
example
├── core.py
├── run.py
└── util
    ├── __init__.py
    ├── db.py
    ├── maths.py
    └── network.py
You can see that within the root of this example, we have two modules, core.py and run.py, and one package, util. Within core.py, there may be the core logic of our application. On the other hand, within the run.py module, we can probably find the logic to start the application. Within the util package, we expect to find various utility tools and, in fact, we can guess that the modules there are named based on the types of tools they hold: db.py would hold tools to work with databases, maths.py would, of course, hold mathematical tools (maybe our application deals with financial data), and network.py would probably hold tools to send/receive data on networks.
As explained before, the __init__.py file is there just to tell Python that util is a package and not just a simple folder.
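To make the idea of a package more concrete, here is a minimal sketch that rebuilds part of the layout above in a temporary folder and imports a module from the util package. The average function and the contents of maths.py are our own assumptions for illustration; they are not the actual files from the book project.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "example"
    package_dir = root / "util"
    package_dir.mkdir(parents=True)

    # An empty __init__.py marks util as a package.
    (package_dir / "__init__.py").write_text("")

    # Hypothetical content for maths.py (an assumption, not the book's file).
    (package_dir / "maths.py").write_text(
        "def average(values):\n"
        "    return sum(values) / len(values)\n"
    )

    # Make the example root importable, as if Python were started from it.
    sys.path.insert(0, str(root))
    try:
        maths = importlib.import_module("util.maths")
        result = maths.average([2, 4, 6])
    finally:
        sys.path.remove(str(root))

print(maths.__name__)  # util.maths
print(result)  # 4.0
```

Notice how the dotted name util.maths mirrors the folder structure: the package name comes first, followed by the module name.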
Had this software been organized within modules only, it would have been harder to infer its structure. We placed a module-only example in the ch1/files_only folder; see it for yourself:
$ tree -v files_only
This shows us a completely different picture:
files_only
├── core.py
├── db.py
├── maths.py
├── network.py
└── run.py
It is a little harder to guess what each module does, right? Now, consider that this is just a simple example, so you can guess how much harder it would be to understand a real application if we could not organize the code into packages and modules.
How do we use modules and packages?
When a developer is writing an application, it is likely that they will need to apply the same piece of logic in different parts of it. For example, when writing a parser for the data that comes from a form that a user can fill in on a web page, the application will have to validate whether a certain field is holding a number or not. Regardless of how the logic for this kind of validation is written, it is likely that it will be needed for more than one field.
For example, in a poll application, where the user is asked many questions, it is likely that several of them will require a numeric answer. These might be:
- What is your age?
- How many pets do you own?
- How many children do you have?
- How many times have you been married?
It would be bad practice to copy/paste (or, said more formally, duplicate) the validation logic in every place where we expect a numeric answer. This would violate the don’t repeat yourself (DRY) principle, which states that you should never repeat the same piece of code more than once in your application. Despite the DRY principle, we feel the need here to stress the importance of this principle: you should never repeat the same piece of code more than once in your application!
There are several reasons why repeating the same piece of logic can be bad, the most important ones being:
- There could be a bug in the logic, and therefore you would have to correct it in every copy.
- You may want to amend the way you carry out the validation, and again, you would have to change it in every copy.
- You may forget to fix or amend a piece of logic because you missed it when searching for all its occurrences. This would leave wrong or inconsistent behavior in your application.
- Your code would be longer than needed for no good reason.
Python is a wonderful language and provides you with all the tools you need to apply the coding best practices. For this example, we need to be able to reuse a piece of code. To do this effectively, we need to have a construct that will hold the code for us so that we can call that construct every time we need to repeat the logic inside it. That construct exists, and it is called a function.
We are not going too deep into the specifics here, so please just remember that a function is a block of organized, reusable code that is used to perform a task. Functions can assume many forms and names, according to what kind of environment they belong to, but for now, this is not important. Details will be seen once we are able to appreciate them, later in the book. Functions are the building blocks of modularity in your application, and they are almost indispensable. Unless you are writing a super-simple script, functions will be used all the time. Functions will be explored in Chapter 4, Functions, the Building Blocks of Code.
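As a small taste of what this looks like, here is a minimal sketch of the numeric validation from the poll example, written once as a function and reused for every question. The name is_numeric and the sample answers are assumptions made up for this illustration, and the check only accepts whole numbers.

```python
def is_numeric(answer):
    """Return True if the answer text can be read as a whole number."""
    try:
        int(answer)
    except ValueError:
        return False
    return True


# Hypothetical answers collected from the poll form.
answers = {
    "What is your age?": "42",
    "How many pets do you own?": "two",
    "How many children do you have?": "3",
    "How many times have you been married?": "",
}

# The same function validates every field; no logic is duplicated.
for question, answer in answers.items():
    print(question, is_numeric(answer))
```

If we later decide to change how the validation works, for example to reject negative numbers, we only need to change it in one place.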
Python comes with a very extensive library, as mentioned a few pages ago. Now is a good time to define what a library is: a collection of functions and objects that provide functionalities to enrich the abilities of a language. For example, within Python’s math library, a plethora of functions can be found, one of which is the factorial function, which calculates the factorial of a number.
In mathematics, the factorial of a non-negative integer number, N, denoted as N!, is defined as the product of all positive integers less than or equal to N. For example, the factorial of 5 is calculated as:
5! = 1*2*3*4*5 = 120
The factorial of 0 is 0! = 1, to respect the convention for an empty product.
So, if you want to use this function in your code, all you have to do is import it and call it with the right input values. Do not worry too much if input values and the concept of calling are not clear right now; please just concentrate on the import part. We use a library by importing specific components from it, which are then used for the intended purpose. In Python, to calculate 5!, we just need the following code:
>>> from math import factorial
>>> factorial(5)
120
Whatever we type in the shell, if it has a printable representation, will be printed in the console for us (in this case, the result of the function call: 120).
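If you are curious, the mathematical definition given earlier can be checked against the library function. The following is only a sketch: product_up_to is a helper name we made up for this illustration, and it relies on math.prod, which is available from Python 3.8 onward.

```python
from math import factorial, prod


def product_up_to(n):
    """Multiply all positive integers less than or equal to n."""
    # For n = 0 the range is empty, and the empty product is 1.
    return prod(range(1, n + 1))


print(product_up_to(5))  # 120
print(factorial(5) == product_up_to(5))  # True
print(factorial(0))  # 1
```

Here we import two components from the same library in a single line, separating their names with a comma.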
Let us go back to our example, the one with core.py, run.py, util, and so on. Here, the util package is our utility library. This is our custom utility belt that holds all those reusable tools (that is, functions), which we need in our application. Some of them will deal with databases (db.py), some with the network (network.py), and some will perform mathematical calculations (maths.py) that are outside the scope of Python’s standard math library and, therefore, we must code them for ourselves.
We will see in detail how to import functions and use them in their dedicated chapter. Let us now talk about another important concept: Python’s execution model.