
Preparing and automating a task in Python [Tutorial]

  • 15 min read
  • 10 Jan 2019


To properly automate tasks, we need a platform that runs them automatically at the proper times. A task that needs to be run manually is not really fully automated. But to leave tasks running in the background while we worry about more pressing issues, each task must be suitable to run in fire-and-forget mode. We should be able to monitor that it runs correctly, be sure that we are capturing future actions (such as receiving notifications if something interesting arises), and know whether there have been any errors while running it.

Ensuring that a piece of software runs consistently with high reliability is a major undertaking. Done properly, it requires specialized knowledge and staff, who typically go by the names of sysadmin, operations, or SRE (Site Reliability Engineering).

In this article, we will learn how to prepare and automatically run tasks. It covers how to program tasks to be executed when they should, instead of running them manually, and how to be notified if there has been an error in an automated process.

This article is an excerpt from a book written by Jaime Buelta titled Python Automation Cookbook. The Python Automation Cookbook helps you develop a clear understanding of how to automate your business processes using Python, including detecting opportunities by scraping the web, analyzing information to generate automatic spreadsheet reports with graphs, and communicating with automatically generated emails. To follow along with the examples implemented in the article, you can find the code on the book's GitHub repository.

Preparing a task


It all starts with defining exactly what task needs to be run and designing it in a way that doesn't require human intervention to run.

Some ideal characteristic points are as follows:

  1. Single, clear entry point: No confusion on what the task to run is.
  2. Clear parameters: If there are any parameters, they should be very explicit.
  3. No interactivity: Stopping the execution to request information from the user is not possible.
  4. The result should be stored: To be able to be checked at a different time than when it runs.
  5. Clear result: When we work interactively on a task, we accept more verbose results or progress reports. But, for an automated task, the final result should be as concise and to the point as possible.
  6. Errors should be logged: To analyze what went wrong.


A command-line program has a lot of those characteristics already. It has a clear way of running, with defined parameters, and the result can be stored, even if just in text format. But, it can be improved with a config file to clarify the parameters and an output file.

Getting ready


We'll start by following a structure in which the main function will serve as the entry point, and all parameters are supplied to it. The definition of the main function with all the explicit arguments covers points 1 and 2. Point 3 is not difficult to achieve. To improve points 2 and 4, we'll look at retrieving the configuration from a file and storing the result in another.

How to do it...

  1. Prepare the following task and save it as prepare_task_step1.py:

import argparse


def main(number, other_number):
    result = number * other_number
    print(f'The result is {result}')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-n1', type=int, help='A number', default=1)
    parser.add_argument('-n2', type=int, help='Another number', default=1)

    args = parser.parse_args()

    main(args.n1, args.n2)

  2. Update the file to define a config file that contains both arguments, and save it as prepare_task_step2.py:

import argparse
import configparser


def main(number, other_number):
    result = number * other_number
    print(f'The result is {result}')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-n1', type=int, help='A number', default=1)
    parser.add_argument('-n2', type=int, help='Another number', default=1)

    parser.add_argument('--config', '-c', type=argparse.FileType('r'),
                        help='config file')

    args = parser.parse_args()
    if args.config:
        config = configparser.ConfigParser()
        config.read_file(args.config)
        # Transforming values into integers
        # (config values replace any -n1/-n2 given on the command line)
        args.n1 = int(config['DEFAULT']['n1'])
        args.n2 = int(config['DEFAULT']['n2'])

    main(args.n1, args.n2)

  3. Create the config file config.ini:

[DEFAULT]
n1=5
n2=7

  4. Run the command with the config file:

$ python3 prepare_task_step2.py -c config.ini
The result is 35
$ python3 prepare_task_step2.py -c config.ini -n1 2 -n2 3
The result is 35

  5. Add a parameter to store the result in a file, and save it as prepare_task_step5.py:

import argparse
import sys
import configparser


def main(number, other_number, output):
    result = number * other_number
    print(f'The result is {result}', file=output)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-n1', type=int, help='A number', default=1)
    parser.add_argument('-n2', type=int, help='Another number', default=1)

    parser.add_argument('--config', '-c', type=argparse.FileType('r'),
                        help='config file')
    parser.add_argument('-o', dest='output', type=argparse.FileType('w'),
                        help='output file',
                        default=sys.stdout)

    args = parser.parse_args()
    if args.config:
        config = configparser.ConfigParser()
        config.read_file(args.config)
        # Transforming values into integers
        args.n1 = int(config['DEFAULT']['n1'])
        args.n2 = int(config['DEFAULT']['n2'])

    main(args.n1, args.n2, args.output)

  6. Run the result to check that it's sending the output to the defined file:

$ python3 prepare_task_step5.py -n1 3 -n2 5 -o result.txt
$ cat result.txt
The result is 15
$ python3 prepare_task_step5.py -c config.ini -o result2.txt
$ cat result2.txt
The result is 35
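
In the steps above, values from the config file replace whatever was passed with -n1 and -n2, which is why the second command in Step 4 still prints 35. If the opposite precedence is wanted, one possible variation is sketched below. The resolve_arguments helper and the inline CONFIG text are hypothetical names used only for illustration and are not part of the recipe's code:

```python
import argparse
import configparser


def resolve_arguments(argv, config_text):
    """Return (n1, n2), preferring command-line values over config values."""
    parser = argparse.ArgumentParser()
    # default=None lets us tell "flag not given" apart from an explicit value
    parser.add_argument('-n1', type=int, default=None)
    parser.add_argument('-n2', type=int, default=None)
    args = parser.parse_args(argv)

    config = configparser.ConfigParser()
    config.read_string(config_text)
    section = config['DEFAULT']

    # Use the config value (or 1 if it is missing) only when the flag is absent
    n1 = args.n1 if args.n1 is not None else section.getint('n1', fallback=1)
    n2 = args.n2 if args.n2 is not None else section.getint('n2', fallback=1)
    return n1, n2


CONFIG = "[DEFAULT]\nn1=5\nn2=7\n"  # hypothetical stand-in for config.ini
print(resolve_arguments([], CONFIG))            # both values from the config
print(resolve_arguments(['-n1', '2'], CONFIG))  # n1 from the command line
```

Which precedence is right depends on the task. For a fire-and-forget job, having a single source of truth in the config file, as the recipe does, is a reasonable choice.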

How it works...


Note that the argparse module allows us to define files as parameters, with the argparse.FileType type, and opens them automatically. This is very handy and will raise an error if the file is not valid.
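
As a small illustration of that behavior, the following sketch parses an explicit argument list instead of the real command line and writes into a temporary directory. The paths are made up for the example:

```python
import argparse
import os
import tempfile

parser = argparse.ArgumentParser()
parser.add_argument('-o', dest='output', type=argparse.FileType('w'))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'result.txt')
    # argparse opens the file for writing while it parses the arguments
    args = parser.parse_args(['-o', path])
    print('The result is 35', file=args.output)
    args.output.close()

    with open(path) as written:
        content = written.read()

    # A file inside a directory that does not exist cannot be opened,
    # so argparse reports the problem and exits instead of returning
    bad_path = os.path.join(tmp_dir, 'missing', 'result.txt')
    try:
        parser.parse_args(['-o', bad_path])
        rejected = False
    except SystemExit:
        rejected = True

print(repr(content), rejected)
```

The error surfaces before main is ever called, so a misconfigured output path fails fast rather than halfway through the task.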

The configparser module allows us to use config files with ease. As demonstrated in Step 2, the parsing of the file is as simple as follows:

config = configparser.ConfigParser()
config.read_file(file)


The config will then be accessible like a dictionary, indexed first by section and then by key. Note that the values are always stored as strings, so they must be converted into other types, such as integers, where needed.
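
A minimal sketch of that conversion follows. It reads the configuration from an inline string rather than a file, purely to keep the example self-contained:

```python
import configparser

config = configparser.ConfigParser()
config.read_string("[DEFAULT]\nn1=5\nn2=7\n")

raw_value = config['DEFAULT']['n1']          # values come back as str
as_int = int(raw_value)                      # explicit conversion, as in the recipe
via_getint = config['DEFAULT'].getint('n2')  # configparser's own int helper

print(type(raw_value).__name__, as_int * via_getint)  # prints: str 35
```

Both forms raise ValueError if the stored text is not a valid integer, which is worth knowing when the config file is edited by hand.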

Python 3 allows us to pass a file parameter to the print function, which will write to that file. Step 5 shows the usage to redirect all the printed information to a file.

Note that the default parameter is sys.stdout, which will print the value to the Terminal (standard output). This makes it so that calling the script without an -o parameter will display the information on the screen, which is helpful in debugging:

$ python3 prepare_task_step5.py -c config.ini
The result is 35
$ python3 prepare_task_step5.py -c config.ini -o result.txt
$ cat result.txt
The result is 35
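
The same mechanism can be tried without touching the filesystem. This sketch assumes an in-memory io.StringIO buffer as the "file" and a main function that falls back to standard output when no file is given:

```python
import io
import sys


def main(number, other_number, output=None):
    # Fall back to standard output when no file object is supplied
    if output is None:
        output = sys.stdout
    result = number * other_number
    print(f'The result is {result}', file=output)


buffer = io.StringIO()
main(5, 7, buffer)          # written to the buffer, nothing shown on screen
captured = buffer.getvalue()

main(5, 7)                  # no file given: shown on the Terminal
```

Any object with a write method works as the file argument, which also makes functions written this way easy to test.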

Setting up a cron job


Cron is an old-fashioned but reliable way of executing commands. It has been around since the 70s in Unix, and it's an old favorite in system administration to perform maintenance, such as freeing space, rotating logs, making backups, and other common operations.

Getting ready


We will produce a script, called cron.py:

import argparse
import sys
from datetime import datetime
import configparser


def main(number, other_number, output):
    result = number * other_number
    # sep=' ' puts a space between date and time instead of the default 'T'
    print(f'[{datetime.utcnow().isoformat(sep=" ")}] The result is {result}',
          file=output)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--config', '-c', type=argparse.FileType('r'),
                        help='config file',
                        default='/etc/automate.ini')
    parser.add_argument('-o', dest='output', type=argparse.FileType('a'),
                        help='output file',
                        default=sys.stdout)

    args = parser.parse_args()
    if args.config:
        config = configparser.ConfigParser()
        config.read_file(args.config)
        # Transforming values into integers
        args.n1 = int(config['DEFAULT']['n1'])
        args.n2 = int(config['DEFAULT']['n2'])

    main(args.n1, args.n2, args.output)


Note the following details:

  1. The config file defaults to /etc/automate.ini. Reuse config.ini from the previous recipe, either copied to that path or passed with -c.
  2. A timestamp has been added to the output. This will make it explicit when the task is run.
  3. The result is appended to the file, as indicated by the 'a' mode with which the file is opened.
  4. The ArgumentDefaultsHelpFormatter parameter automatically adds information about default values when printing the help using the -h argument.
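
To see the effect of the fourth point without running the script with -h, the help text can be generated directly. This sketch simplifies the --config argument to a plain string (no FileType) so that nothing needs to exist at /etc/automate.ini:

```python
import argparse

parser = argparse.ArgumentParser(
    prog='cron.py',
    formatter_class=argparse.ArgumentDefaultsHelpFormatter)
# Simplified: a plain string argument, so no file is opened here
parser.add_argument('--config', '-c', help='config file',
                    default='/etc/automate.ini')

help_text = parser.format_help()
print(help_text)
```

The formatter appends a "(default: ...)" note to each argument that has help text, which documents the fallback values for whoever maintains the cron job later.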


Check that the task is producing the expected result and that you can log to a known file:

$ python3 cron.py
[2018-05-15 22:22:31.436912] The result is 35
$ python3 cron.py -o /path/automate.log
$ cat /path/automate.log
[2018-05-15 22:28:08.833272] The result is 35

How to do it...

  1. Obtain the full path of the Python interpreter. This is the interpreter in your virtual environment:

$ which python
/your/path/.venv/bin/python

  2. Prepare the command for cron to execute. Use full paths and check that it runs with no problem. Execute it a couple of times:

$ /your/path/.venv/bin/python /your/path/cron.py -o /path/automate.log
$ /your/path/.venv/bin/python /your/path/cron.py -o /path/automate.log

  3. Check that the result is being added correctly to the result file:

$ cat /path/automate.log
[2018-05-15 22:28:08.833272] The result is 35
[2018-05-15 22:28:10.510743] The result is 35

  4. Edit the crontab file to run the task once every five minutes:

$ crontab -e
*/5 * * * * /your/path/.venv/bin/python /your/path/cron.py -o /path/automate.log


Note that this opens an editing Terminal with your default command-line editor.

  5. Check the crontab contents. Note that this only displays the crontab contents; it doesn't open them for editing:

$ crontab -l
*/5 * * * * /your/path/.venv/bin/python /your/path/cron.py -o /path/automate.log

  6. Wait and check the result file to see how the task is being executed:

$ tail -F /path/automate.log
[2018-05-17 21:20:00.611540] The result is 35
[2018-05-17 21:25:01.174835] The result is 35
[2018-05-17 21:30:00.886452] The result is 35

How it works...


The crontab line consists of a description of how often to run the task (the first five elements), plus the task. Each of the initial five elements represents a different unit of time. Most of them are stars, meaning any:

* * * * *
| | | | |
| | | | +---- Day of the Week   (range: 0-6, 0 standing for Sunday, 1 for Monday)
| | | +------ Month of the Year (range: 1-12)
| | +-------- Day of the Month  (range: 1-31)
| +---------- Hour              (range: 0-23)
+------------ Minute            (range: 0-59)


Therefore, our line, */5 * * * *, means every time the minute is divisible by 5, in all hours, all days, all months.

Here are some examples:

30  15 * * * means "every day at 15:30"
30   * * * * means "every hour, at 30 minutes"
0,30 * * * * means "every hour, at 0 minutes and 30 minutes"
*/30 * * * * means "every half hour"
0    0 * * 1 means "every Monday at 00:00"

Do not try to guess too much. Use a cheat sheet like crontab guru for examples and tweaks. Most of the common usages will be described there directly. You can also edit a formula and get a descriptive text on how it's going to run.
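
To make the meaning of the minute element concrete, here is a small sketch that decides whether a minute field selects a given minute. It is an illustration only, not how cron is implemented: the minute_matches function is a hypothetical helper, and it handles just the forms used in this article (a star, a step such as */5, plain numbers, and comma-separated lists), not ranges such as 10-20:

```python
def minute_matches(field, minute):
    """Return True if a cron minute field selects the given minute (0-59)."""
    for part in field.split(','):
        if part == '*':
            return True                      # star: any minute
        if part.startswith('*/'):
            step = int(part[2:])
            if minute % step == 0:           # */5: minutes divisible by 5
                return True
        elif int(part) == minute:            # plain number: exact match
            return True
    return False


print([m for m in range(60) if minute_matches('*/30', m)])  # prints: [0, 30]
print([m for m in range(60) if minute_matches('0,30', m)])  # prints: [0, 30]
print(minute_matches('*/5', 25), minute_matches('*/5', 26))
```

The first two lines print the same list, which matches the examples above: "0,30" and "*/30" both run at minutes 0 and 30 of every hour.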


The description of when to run the cron job is followed by the line that executes the task, as prepared in Step 2 of the How to do it… section.

Capturing errors and problems


An automated task's main characteristic is its fire-and-forget quality. We are not actively looking at the result, but making it run in the background. This recipe will present an automated task that will safely store unexpected behaviors in a log file that can be checked afterward.

Getting ready


As a starting point, we'll use a task that divides two numbers, as specified on the command line.

How to do it...

  1. Create the task_with_error_handling_step1.py file, as follows:

import argparse
import sys


def main(number, other_number, output):
    result = number / other_number
    print(f'The result is {result}', file=output)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-n1', type=int, help='A number', default=1)
    parser.add_argument('-n2', type=int, help='Another number', default=1)

    parser.add_argument('-o', dest='output', type=argparse.FileType('w'),
                        help='output file', default=sys.stdout)

    args = parser.parse_args()

    main(args.n1, args.n2, args.output)

  2. Execute it a couple of times to see that it divides two numbers:

$ python3 task_with_error_handling_step1.py -n1 3 -n2 2
The result is 1.5
$ python3 task_with_error_handling_step1.py -n1 25 -n2 5
The result is 5.0

  3. Check that dividing by 0 produces an error and that the error is not logged on the result file:

$ python task_with_error_handling_step1.py -n1 5 -n2 1 -o result.txt
$ cat result.txt
The result is 5.0
$ python task_with_error_handling_step1.py -n1 5 -n2 0 -o result.txt
Traceback (most recent call last):
 File "task_with_error_handling_step1.py", line 20, in <module>
 main(args.n1, args.n2, args.output)
 File "task_with_error_handling_step1.py", line 6, in main
 result = number / other_number
ZeroDivisionError: division by zero
$ cat result.txt

  4. Create the task_with_error_handling_step4.py file:

import argparse
import sys
import logging

LOG_FORMAT = '%(asctime)s %(name)s %(levelname)s %(message)s'
LOG_LEVEL = logging.DEBUG


def main(number, other_number, output):
    logging.info(f'Dividing {number} between {other_number}')
    result = number / other_number
    print(f'The result is {result}', file=output)


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-n1', type=int, help='A number', default=1)
    parser.add_argument('-n2', type=int, help='Another number', default=1)

    parser.add_argument('-o', dest='output', type=argparse.FileType('w'),
                        help='output file', default=sys.stdout)
    parser.add_argument('-l', dest='log', type=str, help='log file',
                        default=None)

    args = parser.parse_args()
    if args.log:
        logging.basicConfig(format=LOG_FORMAT, filename=args.log,
                            level=LOG_LEVEL)
    else:
        logging.basicConfig(format=LOG_FORMAT, level=LOG_LEVEL)

    try:
        main(args.n1, args.n2, args.output)
    except Exception as exc:
        logging.exception("Error running task")
        exit(1)

  5. Run it to check that it displays the proper INFO and ERROR log and that it stores it on the log file:

$ python3 task_with_error_handling_step4.py -n1 5 -n2 0
2018-05-19 14:25:28,849 root INFO Dividing 5 between 0
2018-05-19 14:25:28,849 root ERROR Error running task
Traceback (most recent call last):
  File "task_with_error_handling_step4.py", line 33, in <module>
    main(args.n1, args.n2, args.output)
  File "task_with_error_handling_step4.py", line 11, in main
    result = number / other_number
ZeroDivisionError: division by zero
$ python3 task_with_error_handling_step4.py -n1 5 -n2 0 -l error.log
$ python3 task_with_error_handling_step4.py -n1 5 -n2 0 -l error.log
$ cat error.log
2018-05-19 14:26:15,376 root INFO Dividing 5 between 0
2018-05-19 14:26:15,376 root ERROR Error running task
Traceback (most recent call last):
  File "task_with_error_handling_step4.py", line 33, in <module>
    main(args.n1, args.n2, args.output)
  File "task_with_error_handling_step4.py", line 11, in main
    result = number / other_number
ZeroDivisionError: division by zero
2018-05-19 14:26:19,960 root INFO Dividing 5 between 0
2018-05-19 14:26:19,961 root ERROR Error running task
Traceback (most recent call last):
  File "task_with_error_handling_step4.py", line 33, in <module>
    main(args.n1, args.n2, args.output)
  File "task_with_error_handling_step4.py", line 11, in main
    result = number / other_number
ZeroDivisionError: division by zero

How it works...


To properly capture any unexpected exceptions, the main function should be wrapped into a try-except block, as done in Step 4 in the How to do it… section. Compare this to how Step 1 is not wrapping the code:

    try:
        main(...)
    except Exception as exc:
        # Something went wrong
        logging.exception("Error running task")
        exit(1)


The extra step to exit with status 1 with the exit(1) call informs the operating system that something went wrong with our script.
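
The exit status can be observed from another process. The following sketch runs two tiny inline scripts with the current interpreter (they stand in for the task and are not the recipe's files) and reads the status each one reports:

```python
import subprocess
import sys

# A stand-in task that fails and exits with status 1, like the recipe does
FAILING_TASK = """
import sys
try:
    1 / 0
except Exception:
    sys.exit(1)
"""

# A stand-in task that finishes normally
PASSING_TASK = "result = 5 / 1"

failed = subprocess.run([sys.executable, '-c', FAILING_TASK])
succeeded = subprocess.run([sys.executable, '-c', PASSING_TASK])

print(failed.returncode, succeeded.returncode)  # prints: 1 0
```

A non-zero status is what schedulers, shell scripts, and monitoring tools typically check to decide whether a run failed, so setting it keeps the task usable with them.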

The logging module takes care of writing the logs. Note the basic configuration, which includes an optional file to store the logs, the format, and the level of the logs to display. Creating logs is easy: call the function logging.<logging level> (where logging level is debug, info, and so on). logging.exception() is a special case that creates an ERROR log and also includes information about the exception, such as the stack trace.
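
The difference between a regular log call and logging.exception() can be checked in isolation. This sketch differs from the recipe in two labeled ways: it uses a named logger (example_task, a made-up name) instead of the root logger, and it writes to an in-memory buffer instead of a file, so that the captured text can be inspected:

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger('example_task')  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False                    # keep the output in our buffer only

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(name)s %(levelname)s %(message)s'))
logger.addHandler(handler)

logger.info('Dividing 5 between 0')
try:
    5 / 0
except Exception:
    # Logs at ERROR level and appends the traceback of the active exception
    logger.exception('Error running task')

log_text = stream.getvalue()
print(log_text)
```

The captured text contains the INFO line, the ERROR line, and then the full traceback ending in ZeroDivisionError, which is the information needed to diagnose a failure after the fact.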

Remember to check logs to discover errors. A useful reminder is to add a note on the results file, like this:

try:
    main(args.n1, args.n2, args.output)
except Exception as exc:
    logging.exception(exc)
    print('There has been an error. Check the logs', file=args.output)


In this article, we saw how to define and design a task so that no human intervention is needed to run it. We learned how to use cron for automating a task. We further presented an automated task that will safely store unexpected behaviors in a log file that can be checked afterward.

If you found this post useful, do check out the book, Python Automation Cookbook, to develop a clear understanding of how to automate your business processes using Python. This includes detecting opportunities by scraping the web, analyzing information to generate automatic spreadsheet reports with graphs, and communicating with automatically generated emails.
