Python modules and packages
In this section, you will learn how Python organizes code into modules in an extensible way and how developers can create their own modules.
What is a module in Python?
A module is a collection of functions, classes, and variables that we can use for implementing an application. There is a large collection of modules available with the standard Python distribution. Modules serve two main purposes:
- Break a program with many lines of code into smaller parts.
- Extract a set of definitions that you use frequently in your programs to be reused. This prevents, for example, having to copy functions from one program to another.
A module is a file containing Python definitions and statements. The file must have a .py extension, and its name corresponds to the name of the module. We can start by defining a simple module in a .py file. We’ll define a simple message(name) function inside the my_functions.py file that will print "Hi {name}. This is my first module".
You can find the following code in the my_functions.py file inside the first_module folder:
def message(name):
    print(f"Hi {name}. This is my first module")
Within our main.py file, we can then import this file as a module and use the message(name) function. You can find the following code in the main.py file:
import my_functions

def main():
    my_functions.message("Python")

if __name__ == '__main__':
    main()
When a module is imported, Python executes its contents. A module can contain executable statements as well as definitions. These statements are usually intended to initialize the module, and they run only the first time the module name appears in an import statement.
That’s all we need in order to define a very simple Python module within our Python scripts.
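The following is a minimal sketch of that "executed only once" behavior. It assumes a throwaway module named demo_first_module (a hypothetical name, written to a temporary folder only for this demonstration) whose top-level print() statement plays the role of the initialization code.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "demo_first_module"  # hypothetical module name for this demo
MODULE_SOURCE = (
    'print("Initializing demo_first_module")\n'
    "\n"
    "def message(name):\n"
    '    return f"Hi {name}. This is my first module"\n'
)

def import_twice():
    """Import the throwaway module twice; return (captured output, module)."""
    captured = io.StringIO()
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, f"{MODULE_NAME}.py").write_text(MODULE_SOURCE)
        sys.path.insert(0, folder)       # make the folder importable
        importlib.invalidate_caches()    # notice the newly written file
        try:
            with contextlib.redirect_stdout(captured):
                module = importlib.import_module(MODULE_NAME)
                # The second import is served from sys.modules, so the
                # top-level print() does not run again.
                again = importlib.import_module(MODULE_NAME)
        finally:
            sys.path.remove(folder)
    assert module is again
    return captured.getvalue(), module

output, module = import_twice()
print(output.count("Initializing"))   # how many times the module body ran
print(module.message("Python"))
```

The module stays cached in sys.modules for the rest of the interpreter session, which is why its functions remain usable after the temporary folder is removed.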
How to import modules in Python
To use the definitions of a module in the interpreter or in another module, you must first import it with the import keyword. Once a module has been imported, its definitions can be accessed via the dot (.) operator.
We can also import one or several names from a module as follows. This allows us to access those names directly, without the module prefix and the dot (.) operator.
>>> from my_functions import message
>>> message('python')
We can also use the * wildcard to import all the public names of the module, that is, those that do not begin with an underscore (unless the module defines __all__).
>>> from my_functions import *
>>> message('python')
Any element of an imported module is accessed through its namespace, followed by a dot (.) and the name of the element. In Python, the namespace is the name given after the word import, that is, the path of the module.
It is also possible to abbreviate a namespace with an alias. To do this, add the as keyword to the import, followed by the alias we will use to refer to the imported name from then on. The same reserved word also renames an individual name imported from a module:
>>> from my_functions import message as my_message
>>> my_message('python')
Hi python. This is my first module
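As a side-by-side sketch of the import forms described above, the following example uses the standard library math module as a stand-in for my_functions, so that it runs without any extra files. The aliases m and root are arbitrary names chosen for illustration.

```python
import math                     # access names through the module: math.sqrt
import math as m                # the same module under a shorter alias
from math import sqrt           # bind a single name directly
from math import sqrt as root   # bind a single name under an alias

# All four calls reach the same function object.
values = [math.sqrt(16), m.sqrt(16), sqrt(16), root(16)]
print(values)

# An alias is only another name for the same object.
print(m is math, root is sqrt)
```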
Getting information from modules
We can get more information about the functions and other entities in a specific module using the built-in dir() function. It returns a list with all the names (variables, functions, classes, …) defined in a module. For example, if we import the my_functions module we created earlier and pass it to dir(), we will get the following result:
>>> import my_functions
>>> dir(my_functions)
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'message']
The dir() function returns an alphabetically sorted list of the names available in the module passed to it as an argument. For example, the following code first lists the built-in modules through sys.builtin_module_names and then prints the names of all the entities within the sys module:
>>> import sys
>>> sys.builtin_module_names
('_abc', '_ast', '_codecs', '_collections', '_functools', '_imp', '_io', '_locale', '_operator', '_signal', '_sre', '_stat', '_string', '_symtable', '_thread', '_tracemalloc', '_warnings', '_weakref', 'atexit', 'builtins', 'errno', 'faulthandler', 'gc', 'itertools', 'marshal', 'posix', 'pwd', 'sys', 'time', 'xxsubtype')
>>> dir(sys)
['__breakpointhook__', '__displayhook__', '__doc__', '__excepthook__', '__interactivehook__', '__loader__', '__name__', '__package__', '__spec__', '__stderr__', '__stdin__', '__stdout__', '__unraisablehook__', '_base_executable', '_clear_type_cache', '_current_frames',...]
The other modules that we can import are stored in files, which are located in the directories listed in sys.path:
>>> sys.path
['', '/usr/lib/python3.4', '/usr/lib/python3.4/plat-x86_64-linux-gnu', '/usr/lib/python3.4/lib-dynload', '/usr/local/lib/python3.4/dist-packages', '/usr/lib/python3/dist-packages']
In the previous code, we use the dir() function to get the names of all the entities in the sys module, and sys.path to list the directories where the interpreter searches for modules.
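Since dir() also returns the double-underscore attributes that Python adds to every module, it is often convenient to filter them out. The sketch below assumes a small helper named public_names (a hypothetical name, not part of the standard library) and applies it to the standard json module.

```python
import json

def public_names(module):
    """Return the names in dir(module) that do not start with an underscore."""
    return [name for name in dir(module) if not name.startswith("_")]

names = public_names(json)
print(names)
```

Because dir() already returns a sorted list, the filtered list keeps that order.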
Difference between a Python module and a Python package
In the same way that we group functions and other definitions into modules, Python packages allow you to organize and structure the different modules that make up a program in a hierarchical way. Packages also make it possible for several modules with the same name to coexist without conflict, as long as they live in different packages.
A package is simply a directory that contains modules and other packages. For a directory to be treated as a regular package, it must include a module called __init__.py. In most cases, this file will be empty; however, it can be used to run package initialization code. Among the main differences between a module and a package, we can highlight the following:
- Module: Each of the .py files that we create is called a module. The elements created in a module (functions, classes, …) can be imported to be used in another module. The name we use to import a module is the name of the file without the .py extension.
- Package: A package is a folder that contains .py files and a file called __init__.py. This file does not need to contain any instructions. Packages can, in turn, contain other sub-packages.
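To make the module/package distinction concrete, the following sketch builds a tiny package in a temporary folder and imports a module from it. The package name demo_tools and the module name greetings are hypothetical, chosen only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

def build_and_import_package():
    """Create demo_tools/ with an __init__.py and one module, then import it."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = Path(root, "demo_tools")         # hypothetical package
        package_dir.mkdir()
        (package_dir / "__init__.py").write_text("")   # empty, marks the package
        (package_dir / "greetings.py").write_text(
            "def message(name):\n"
            "    return f'Hi {name}. This is my first package'\n"
        )
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            # The dotted path mirrors the folder hierarchy: package.module
            greetings = importlib.import_module("demo_tools.greetings")
        finally:
            sys.path.remove(root)
    return greetings

greetings = build_and_import_package()
print(greetings.__name__)
print(greetings.message("Python"))
```

In a real project the folder would of course be permanent, and the import would simply read from demo_tools import greetings.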
Managing parameters in Python
Python scripts that run on the command line often accept arguments that give users options when they run a certain command. One way to handle these arguments is the argparse module, which is part of the standard library and therefore comes installed with Python.
One of its useful features is that the type of a parameter can be indicated using the type argument. For example, if we want to treat a certain parameter as an integer, we might do so as follows:
parser.add_argument("-param", dest="param", type=int)
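Note that type expects a callable such as int, not the string "int". A minimal, self-contained sketch of this conversion follows; the build_parser helper and the default value of 0 are assumptions made for the example, and the argument list is passed explicitly so the snippet does not depend on the real command line.

```python
import argparse

def build_parser():
    """Build a parser with one integer parameter (illustrative only)."""
    parser = argparse.ArgumentParser(description="Type conversion demo")
    parser.add_argument("-param", dest="param", type=int, default=0,
                        help="an integer parameter")
    return parser

parser = build_parser()

# Passing a list to parse_args() replaces sys.argv for this call.
params = parser.parse_args(["-param", "42"])
print(params.param + 1)   # arithmetic works because the value is an int

# Without the argument, the default is used.
print(parser.parse_args([]).param)
```

If the value cannot be converted (for example, -param abc), argparse prints an error message and exits instead of returning a namespace.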
Another thing that can make the code more readable is to declare a class that acts as a global object for the parameters. For example, if we want to pass several parameters to a function at the same time, we can pass this single object, which holds all the global execution parameters.
You can find the following code in the params_global_argparse.py file:
import argparse

class Parameters:
    """Global parameters"""
    def __init__(self, **kwargs):
        self.param1 = kwargs.get("param1")
        self.param2 = kwargs.get("param2")

def view_parameters(input_parameters):
    print(input_parameters.param1)
    print(input_parameters.param2)

parser = argparse.ArgumentParser(description='Testing parameters')
parser.add_argument("-p1", dest="param1", help="parameter1")
parser.add_argument("-p2", dest="param2", help="parameter2")
params = parser.parse_args()
input_parameters = Parameters(param1=params.param1, param2=params.param2)
view_parameters(input_parameters)
In the previous script, we are using the argparse module to obtain parameters and we encapsulate these parameters in an object of the Parameters class.
For more information, you can check out the official website: https://docs.python.org/3/library/argparse.html.
In the following example, we are using the argparse module to manage the parameters that we could use to perform a port scan, such as the IP address, ports, and verbosity level. You can find the following code in the params_port_scanning.py file:
import argparse

if __name__ == "__main__":
    description = """ Uses cases:
        + Basic scan:
            -target 127.0.0.1
        + Specific port:
            -target 127.0.0.1 -port 21
        + Port list:
            -target 127.0.0.1 -port 21,22
        + Only show open ports
            -target 127.0.0.1 --open True """
    parser = argparse.ArgumentParser(description='Port scanning', epilog=description,
                                     formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument("-target", metavar='TARGET', dest="target", help="target to scan", required=True)
    parser.add_argument("-ports", dest="ports",
                        help="Please, specify the target port(s) separated by comma[80,8080 by default]",
                        default="80,8080")
    parser.add_argument('-v', dest='verbosity', default=0, action="count",
                        help="verbosity level: -v, -vv, -vvv.")
    parser.add_argument("--open", dest="only_open", action="store_true",
                        help="only display open ports", default=False)
Having set the necessary parameters using the add_argument() method, we can obtain the values of these arguments by calling the parser object’s parse_args() method and storing the result in the params variable.
    params = parser.parse_args()
    print("Params:" + str(params))
    print("Target:" + params.target)
    print("Verbosity:" + str(params.verbosity))
    print("Only open:" + str(params.only_open))
    portlist = params.ports.split(',')
    for port in portlist:
        print("Port:" + port)
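Running the script above with the -h option shows the arguments it accepts and some execution use cases. Note that --open is declared with action="store_true", so it is a flag that takes no value: pass --open on its own rather than --open True, as the last use case suggests.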
$ python params_port_scanning.py -h
usage: params_port_scanning.py [-h] -target TARGET [-ports PORTS] [-v] [--open]
Port scanning
optional arguments:
-h, --help show this help message and exit
-target TARGET target to scan
-ports PORTS Please, specify the target port(s) separated by comma[80,8080 by default]
-v verbosity level: -v, -vv, -vvv.
--open only display open ports
Uses cases:
+ Basic scan:
-target 127.0.0.1
+ Specific port:
-target 127.0.0.1 -port 21
+ Port list:
-target 127.0.0.1 -port 21,22
+ Only show open ports
-target 127.0.0.1 --open True
When running the above script without any parameters, we get an error message stating the target argument is required.
$ python params_port_scanning.py
usage: params_port_scanning.py [-h] -target TARGET [-ports PORTS] [-v] [--open]
params_port_scanning.py: error: the following arguments are required: -target
When running the above script with only the target argument, the rest of the parameters take their default values: 0 for verbosity and 80 and 8080 for ports.
$ python params_port_scanning.py -target localhost
Params:Namespace(only_open=False, ports='80,8080', target='localhost', verbosity=0)
Target:localhost
Verbosity:0
Only open:False
Port:80
Port:8080
When running the above script with the target, ports, and verbosity arguments, we get new values for these parameters.
$ python params_port_scanning.py -target localhost -ports 22,23 -vv
Params:Namespace(only_open=False, ports='22,23', target='localhost', verbosity=2)
Target:localhost
Verbosity:2
Only open:False
Port:22
Port:23
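In the runs above the ports are still strings. A real scanner would probably want integers in the valid port range, so one possible refinement, sketched here under that assumption, is to let argparse do the conversion through the type argument. The parse_ports helper is hypothetical and is not part of the original script.

```python
import argparse

def parse_ports(ports):
    """Convert a string such as '22,80' into a list of valid port numbers."""
    result = []
    for item in ports.split(","):
        item = item.strip()
        if not item:
            continue                      # tolerate stray commas
        port = int(item)                  # raises ValueError if not numeric
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        result.append(port)
    return result

parser = argparse.ArgumentParser(description="Port list demo")
# argparse also applies the type callable to a string default.
parser.add_argument("-ports", dest="ports", type=parse_ports, default="80,8080",
                    help="target port(s) separated by commas")

default_ports = parser.parse_args([]).ports
custom_ports = parser.parse_args(["-ports", "22,23"]).ports
print(default_ports)
print(custom_ports)
```

When the callable raises ValueError, argparse reports the argument as invalid and exits, so the rest of the script can assume a clean list of integers.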
Managing parameters with OptionParser
Python also provides a class called OptionParser for managing command-line arguments. OptionParser is part of the optparse module, which is provided by the standard library. OptionParser allows you to do a range of very useful things with command-line arguments:
- Specify a default value if a certain argument is not provided.
- Support both argument flags (either present or not) and arguments with values.
- Support different formats of passing arguments.
Let’s use OptionParser to manage parameters in the same way we have seen before with the argparse module. In the code provided here, command-line arguments are used to pass in variables.
You can find the following code in the params_global_optparser.py file:
from optparse import OptionParser

class Parameters:
    """Global parameters"""
    def __init__(self, **kwargs):
        self.param1 = kwargs.get("param1")
        self.param2 = kwargs.get("param2")

def view_parameters(input_parameters):
    print(input_parameters.param1)
    print(input_parameters.param2)

parser = OptionParser()
parser.add_option("--p1", dest="param1", help="parameter1")
parser.add_option("--p2", dest="param2", help="parameter2")
(options, args) = parser.parse_args()
input_parameters = Parameters(param1=options.param1, param2=options.param2)
view_parameters(input_parameters)
The previous script demonstrates the use of the OptionParser class. It provides a simple interface for command-line arguments, allowing you to define certain properties for each command-line option, to specify default values, and to report a specific error when a required argument is not provided.
For more information, you can check out the official website: https://docs.python.org/3/library/optparse.html.
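The three capabilities listed earlier (defaults, flags versus valued options, and alternative argument formats) can be sketched as follows. The option names --count and --verbose and the build_parser helper are assumptions made for this example. Note that, unlike argparse, optparse names its types with strings such as "int".

```python
from optparse import OptionParser

def build_parser():
    """Build a parser with a default value, a typed option, and a flag."""
    parser = OptionParser(usage="usage: %prog [options]")
    parser.add_option("--p1", dest="param1", default="value1", help="parameter1")
    parser.add_option("-c", "--count", dest="count", type="int", default=1,
                      help="number of repetitions")
    parser.add_option("-v", "--verbose", dest="verbose", action="store_true",
                      default=False, help="enable verbose output")
    return parser

# Passing a list to parse_args() avoids reading the real sys.argv.
options, args = build_parser().parse_args(["--count=3", "-v", "extra"])
print(options.param1, options.count, options.verbose, args)

# The same option written in two other accepted formats.
spaced, _ = build_parser().parse_args(["--count", "3"])
short, _ = build_parser().parse_args(["-c3"])
print(spaced.count, short.count)
```

Anything that is not an option, such as "extra" here, is returned in the second element of the tuple as a positional argument.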
Now that you know how Python manages modules and packages, let’s move on to learning how to manage dependencies and create a virtual environment with the virtualenv utility.