You're reading from Python Automation Cookbook: 75 Python automation recipes for web scraping; data wrangling; and Excel, report, and email processing

Product type Paperback

Published in May 2020

Publisher Packt

ISBN-13 9781800207080

Length 526 pages

Edition 2nd Edition

Languages

Python

Tools

Excel

Concepts

Programming Language

Author (1):

Jaime Buelta

Adding command-line arguments

Many tasks are best structured as a command-line interface that accepts parameters to change the way it works, for example, scraping a web page from one provided URL or another. Python's standard library includes the powerful argparse module, which creates rich command-line argument parsing with minimal effort.

Getting ready

The basic use of argparse in a script can be shown in three steps:

1. Define the arguments that your script is going to accept, generating a new parser.
2. Call the defined parser, returning an object with all of the resulting arguments.
3. Use the arguments to call the entry point of your script, which will apply the defined behavior.

Try to use the following general structure for your scripts:

IMPORTS
def main(main parameters):
    DO THINGS
if __name__ == '__main__':
    DEFINE ARGUMENT PARSER
    PARSE ARGS
    VALIDATE OR MANIPULATE ARGS, IF NEEDED
    main(arguments)

The main function makes it easy to know what the entry point for the code is. The section under the if statement is only executed if the file is called directly, but not if it's imported. We'll follow this for all the steps.

How to do it…

1. Create a script that accepts a single integer as a positional argument and prints a hash symbol that number of times. The recipe_cli_step1.py script is as follows; note that it follows the structure presented previously, and the main function just prints the repeated symbol:
```
import argparse
def main(number):
    print('#' * number)
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('number', type=int, help='A number')
    args = parser.parse_args()
    
    main(args.number)
```

2. Call the script and check how the parameter is presented. Calling the script with no arguments prints the usage line and an error, because number is required. Use the automatic argument -h to display the extended help:

$ python3 recipe_cli_step1.py
usage: recipe_cli_step1.py [-h] number
recipe_cli_step1.py: error: the following arguments are required: number
$ python3 recipe_cli_step1.py -h
usage: recipe_cli_step1.py [-h] number
positional arguments:
  number      A number
optional arguments:
  -h, --help  show this help message and exit

3. Calling the script with a valid integer works as expected, and any other value is rejected:

$ python3 recipe_cli_step1.py 4
####
$ python3 recipe_cli_step1.py not_a_number
usage: recipe_cli_step1.py [-h] number
recipe_cli_step1.py: error: argument number: invalid int value: 'not_a_number'

4. Change the script to accept an optional argument for the character to print. The default will be "#". The recipe_cli_step2.py script will look like this:

import argparse
def main(character, number):
    print(character * number)
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('number', type=int, help='A number')
    parser.add_argument('-c', type=str, help='Character to print',
                        default='#')
    args = parser.parse_args()
    main(args.c, args.number)

5. The help is updated, and using the -c flag allows us to print different characters:

$ python3 recipe_cli_step2.py -h
usage: recipe_cli_step2.py [-h] [-c C] number
positional arguments:
  number      A number
optional arguments:
  -h, --help  show this help message and exit
  -c C        Character to print
$ python3 recipe_cli_step2.py 4
####
$ python3 recipe_cli_step2.py 5 -c m
mmmmm

6. Add a flag that changes the behavior when present. The recipe_cli_step3.py script is as follows:

import argparse
def main(character, number):
    print(character * number)
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('number', type=int, help='A number')
    parser.add_argument('-c', type=str, help='Character to print',
                        default='#')
    parser.add_argument('-U', action='store_true', default=False,
                        dest='uppercase',
                        help='Uppercase the character')
    args = parser.parse_args()
    if args.uppercase:
        args.c = args.c.upper()
    main(args.c, args.number)

7. Calling it uppercases the character if the -U flag is added:

$ python3 recipe_cli_step3.py 4 -c f
ffff
$ python3 recipe_cli_step3.py 4 -c f -U
FFFF

How it works…

As described in step 1 of the How to do it section, the arguments are added to the parser through .add_argument(). Once all of the arguments are defined, calling parse_args() returns an object that contains the results (or exits if there's an error).
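Both outcomes can be observed without a shell, because parse_args() also accepts an explicit list of strings instead of reading sys.argv. The following is a minimal sketch that reuses the number argument from the recipe; the try_parse helper is a hypothetical name introduced only for illustration:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('number', type=int, help='A number')


def try_parse(argv):
    """Hypothetical helper: return the parsed namespace, or the exit code on error."""
    try:
        return parser.parse_args(argv)
    except SystemExit as exc:
        # argparse prints the usage and the error to stderr, then exits
        return exc.code


good = try_parse(['4'])
print(good.number)                   # 4, already converted to int
print(try_parse(['not_a_number']))   # 2, the exit status argparse uses for errors
```

Passing a list this way is also a convenient way to exercise a parser from tests.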

Each argument should include a help description, but arguments can behave very differently:

If an argument starts with a -, it is considered an optional parameter, like the -c argument in step 4. If not, it's a positional argument, like the number argument in step 1.
For clarity, always define a default value for optional parameters. It will be None if you don't, but this may be confusing.
Remember to always add a help parameter with a description of the parameter; help is automatically generated, as shown in step 2.
If a type is given, the value is converted to that type and validated, for example, number in step 3. By default, the type will be a string.
The actions store_true and store_false can be used to generate flags, arguments that don't require any extra parameters. Set the corresponding default value as the opposite Boolean. This is demonstrated in the -U argument in steps 6 and 7.
The name of the property in the args object will be, by default, the name of the argument (without the leading dashes, if present). You can change it with dest. For example, in step 6, the command-line argument -U is stored as uppercase.
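The rules above can be checked with a small experiment. This is only a sketch: the -o, -n, and --no-color arguments are hypothetical and do not belong to the recipe's scripts.

```python
import argparse

parser = argparse.ArgumentParser()
# Optional argument without a default: the attribute is None when omitted
parser.add_argument('-o', help='Output file (hypothetical)')
# type=int converts and validates the value; the default applies when omitted
parser.add_argument('-n', type=int, default=1, help='Repetitions (hypothetical)')
# store_false flag: True by default, False when the flag is present;
# dest sets the attribute name used in the result
parser.add_argument('--no-color', action='store_false', default=True,
                    dest='color', help='Disable colored output (hypothetical)')

defaults = parser.parse_args([])
print(defaults.o, defaults.n, defaults.color)   # None 1 True

custom = parser.parse_args(['-o', 'out.txt', '-n', '3', '--no-color'])
print(custom.o, custom.n, custom.color)         # out.txt 3 False
```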

Changing the name of an argument for internal usage is very useful when using short arguments, such as single letters. A good command-line interface will use -c, but, internally, it's probably a good idea to use a more verbose label, such as configuration_file. Remember, explicit is better than implicit!

Some arguments can work in coordination with others, as shown in step 6, where -U modifies the value given to -c. Perform all of the required operations before calling the main function, so that it receives clear and concise parameters. For example, in step 6, only two parameters are passed, but one may have been modified.
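One possible way to keep this preparation in a single place is sketched below. The prepare helper and the check for negative numbers are assumptions added for illustration; parser.error() is the standard argparse method that prints the usage with a message and exits.

```python
import argparse


def main(character, number):
    # main receives final, ready-to-use values
    return character * number


def prepare(argv):
    """Hypothetical helper: parse, validate, and combine the arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument('number', type=int, help='A number')
    parser.add_argument('-c', type=str, default='#', help='Character to print')
    parser.add_argument('-U', action='store_true', default=False,
                        dest='uppercase', help='Uppercase the character')
    args = parser.parse_args(argv)
    if args.number < 0:
        # Assumed extra rule: reject negative values with a usage error
        parser.error('number must not be negative')
    # Combine -c and -U here, so main only sees the final character
    character = args.c.upper() if args.uppercase else args.c
    return character, args.number


print(main(*prepare(['4', '-c', 'f', '-U'])))  # FFFF
```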

There's more…

You can create long arguments as well with double dashes, for example:

 parser.add_argument('-v', '--verbose', action='store_true', default=False,
                     help='Enable verbose output')

This will accept both -v and --verbose, and it will store the value under the name verbose.
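A short sketch of how the long name determines the attribute follows. The --dry-run option is a hypothetical addition, included to show that inner dashes become underscores in the attribute name:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('-v', '--verbose', action='store_true', default=False,
                    help='Enable verbose output')
# Hypothetical extra option: the attribute is named dry_run
parser.add_argument('--dry-run', action='store_true', default=False,
                    help='Do not change anything')

print(parser.parse_args(['-v']).verbose)          # True
print(parser.parse_args(['--verbose']).verbose)   # True
print(parser.parse_args(['--dry-run']).dry_run)   # True
print(parser.parse_args([]).verbose)              # False
```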

Adding long names is a good way of making the interface more intuitive and easy to remember. It's easy to remember after a couple of times that there's a verbose option, and it starts with a v.

The main inconvenience when dealing with command-line arguments may be that you end up with too many of them. This creates confusion. Try to make your arguments as independent as possible and don't make too many dependencies between them; otherwise, handling the combinations can be tricky.
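When two options simply cannot be combined, argparse can enforce that itself through add_mutually_exclusive_group(), which avoids hand-written checks for that case. A minimal sketch, assuming hypothetical -q/--quiet and -v/--verbose flags:

```python
import argparse

parser = argparse.ArgumentParser()
# Only one argument of the group may appear on the command line
group = parser.add_mutually_exclusive_group()
group.add_argument('-q', '--quiet', action='store_true', default=False,
                   help='Reduce output (hypothetical)')
group.add_argument('-v', '--verbose', action='store_true', default=False,
                   help='Enable verbose output')

args = parser.parse_args(['-v'])
print(args.verbose, args.quiet)   # True False

try:
    parser.parse_args(['-v', '-q'])
except SystemExit as exc:
    # Using both flags is reported as a usage error
    conflict_code = exc.code
print(conflict_code)              # 2
```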

In particular, try not to create more than a couple of positional arguments, as they won't have mnemonics. Positional arguments also accept default values (when declared with nargs='?'), but most of the time, that won't be the expected behavior.
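For reference, this sketch shows a positional argument with a default. It assumes the recipe's number argument; nargs='?' is what makes the positional optional, since without it the argument is still required and the default is never used:

```python
import argparse

parser = argparse.ArgumentParser()
# nargs='?' allows the positional to be omitted, so the default can apply
parser.add_argument('number', type=int, nargs='?', default=1, help='A number')

print(parser.parse_args([]).number)      # 1
print(parser.parse_args(['4']).number)   # 4
```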

For advanced details, refer to the Python documentation of argparse (https://docs.python.org/3/library/argparse.html).