Python Automation Cookbook

75 Python automation recipes for web scraping; data wrangling; and Excel, report, and email processing

Product type Paperback
Published in May 2020
Publisher Packt
ISBN-13 9781800207080
Length 526 pages
Edition 2nd Edition
Author (1):
Jaime Buelta

Installing third-party packages

One of the strongest capabilities of Python is the ability to use an impressive catalog of third-party packages that cover an amazing amount of ground in different areas, from modules specialized in performing numerical operations, machine learning, and network communications, to command-line convenience tools, database access, image processing, and much more!

Most of them are available on the official Python Package Index (https://pypi.org/), which has more than 200,000 packages ready to use. In this book, we'll install some of them. In general, it's worth spending a little time researching external tools when trying to solve a problem. It's very likely that someone else has already created a tool that solves all, or at least part, of the problem.

More important than finding and installing a package is keeping track of which packages are being used. This greatly helps with replicability, meaning the ability to start the whole environment from scratch in any situation.

Getting ready

The starting point is to find a package that will be of use in our project.

A great one is requests, a module that deals with HTTP requests and is known for its easy and intuitive interface, as well as its great documentation. Take a look at the documentation, which can be found here: https://requests.readthedocs.io/en/master/.

We'll use requests throughout this book when dealing with HTTP connections.

The next step is to choose the version to use. In this case, the latest (2.22.0, at the time of writing) will be perfect. If no version is specified, pip installs the latest version available at that moment, which can lead to inconsistencies between environments as newer versions are released.
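To make the point about unspecified versions concrete, here is a minimal sketch that flags requirement lines lacking an exact `==` pin. The function name `unpinned_requirements` is hypothetical, and the parsing is deliberately simplified: it assumes one plain requirement per line and ignores other specifiers (such as `>=`), options, and URLs that a real requirements file may contain.

```python
def unpinned_requirements(lines):
    """Return the requirement lines that lack an exact '==' pin.

    Simplified on purpose: assumes plain 'name' or 'name==version' lines.
    """
    loose = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank or comment-only lines
        if "==" not in line:
            loose.append(line)
    return loose


# 'requests' has no pin here, so it would track whatever version is latest
requirements = ["delorean==1.0.0", "requests"]
print(unpinned_requirements(requirements))  # prints ['requests']
```

A check like this could run before committing a requirements file, so that an unpinned entry is noticed before it causes environments to drift apart.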

We'll also use the great delorean module for time handling (version 1.0.0: http://delorean.readthedocs.io/en/latest/).

How to do it…

  1. Create a requirements.txt file in our main directory, which will specify all the requirements for our project. Let's start with delorean and requests:
    delorean==1.0.0
    requests==2.22.0
    
  2. Install all the requirements with the pip command:
    $ pip install -r requirements.txt
    ...
    Successfully installed babel-2.8.0 certifi-2019.11.28 chardet-3.0.4 delorean-1.0.0 humanize-0.5.1 idna-2.8 python-dateutil-2.8.1 pytz-2019.3 requests-2.22.0 six-1.14.0 tzlocal-2.0.0 urllib3-1.25.7
    

    List the installed modules with pip list:

    $ pip list
    Package         Version
    --------------- ----------
    Babel           2.8.0
    certifi         2019.11.28
    chardet         3.0.4
    Delorean        1.0.0
    humanize        2.0.0
    idna            2.8
    pip             19.2.3
    python-dateutil 2.8.1
    pytz            2019.3
    requests        2.22.0
    setuptools      41.2.0
    six             1.14.0
    tzlocal         2.0.0
    urllib3         1.25.8
    
  3. You can now import both modules while the virtual environment is active:
    $ python
    Python 3.8.1 (default, Dec 27 2019, 18:05:45)
    [Clang 11.0.0 (clang-1100.0.33.16)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import delorean
    >>> import requests
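The interactive check in step 3 can also be scripted. The following is a minimal sketch, not part of the recipe itself, that uses the standard library's `importlib.util.find_spec` to report which top-level modules cannot be found. The helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names from the recipe; the result depends on the active environment
missing = missing_modules(["delorean", "requests"])
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All modules are importable")
```

Unlike a bare `import`, this does not execute the modules, so it is a cheap way to confirm that the right virtual environment is active before running a longer script.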
    

How it works…

The requirements.txt file specifies each module and its version, and pip looks up each one on pypi.org.
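As an illustration of the name-and-version pairs that requirements.txt carries, the sketch below reads pinned lines and compares them with what is installed, using `importlib.metadata` from the standard library (Python 3.8+). This is not how pip itself works internally. The helper names `parse_pins` and `installed_version` are hypothetical, and only plain `name==version` lines are handled.

```python
from importlib import metadata  # in the standard library since Python 3.8


def parse_pins(text):
    """Map each 'name==version' line to a {name: version} entry."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments, and unpinned lines
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# Same contents as the requirements.txt created in step 1
pins = parse_pins("delorean==1.0.0\nrequests==2.22.0\n")
for name, wanted in pins.items():
    found = installed_version(name)
    status = "ok" if found == wanted else "mismatch"
    print(f"{name}: pinned {wanted}, installed {found} ({status})")
```

The printed status depends on the environment in which the script runs, which is exactly what makes it useful as a quick consistency check.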

Note that creating a new virtual environment from scratch and running the following will completely recreate your environment, which makes replicability very straightforward:

$ pip install -r requirements.txt

Step 2 of the How to do it… section also installs, automatically, the modules that delorean and requests depend on, such as urllib3.

There's more…

If any of the modules needs to change to a different version, for example because a new version is available, update its version in requirements.txt and run the install command again:

$ pip install -r requirements.txt

This is also applicable when a new module needs to be included.

At any point, the freeze command can be used to display all of the installed modules. freeze returns the modules in a format compatible with requirements.txt, making it possible to generate a file with our current environment:

$ pip freeze > requirements.txt

This will include dependencies, so expect a lot more modules in the file.
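To see how much of a frozen file comes from dependencies, the following sketch separates the modules we asked for from the ones that were pulled in. The `frozen_lines` list is an assumption: it is built from the pip list table in step 2, leaving out pip and setuptools, rather than copied from real `pip freeze` output. The helper name `dependency_lines` is hypothetical, and name matching is simplified to a case-insensitive comparison.

```python
def dependency_lines(direct_pins, frozen_lines):
    """Return frozen lines whose package is not listed directly."""
    direct = {line.split("==")[0].lower() for line in direct_pins}
    return [
        line for line in frozen_lines
        if line.split("==")[0].lower() not in direct
    ]


# What we wrote by hand in step 1
direct_pins = ["delorean==1.0.0", "requests==2.22.0"]

# Assumed freeze output, derived from the pip list table in step 2
frozen_lines = [
    "Babel==2.8.0", "certifi==2019.11.28", "chardet==3.0.4",
    "Delorean==1.0.0", "humanize==2.0.0", "idna==2.8",
    "python-dateutil==2.8.1", "pytz==2019.3", "requests==2.22.0",
    "six==1.14.0", "tzlocal==2.0.0", "urllib3==1.25.8",
]

extras = dependency_lines(direct_pins, frozen_lines)
print(f"{len(extras)} of {len(frozen_lines)} lines are dependencies")
```

Keeping the hand-written file short and generating the frozen file separately is one possible design choice: the first states intent, while the second records the exact environment.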

Finding good third-party modules is not always easy. Searching for specific functionality can work well, but some of the most useful modules come as a surprise because they do things you had not thought of. A well-known curated list is Awesome Python (https://awesome-python.com/), which covers tools for common Python use cases, such as cryptography, database access, date and time handling, and more.

In some cases, installing packages may require additional tools, such as compilers or a specific library that supports some functionality (for example, a particular database driver). If that's the case, the documentation will explain the dependencies.

See also

  • The Activating a virtual environment recipe, covered earlier in this chapter.
  • The Using a third-party tool—parse recipe, covered later in this chapter, to learn how to use one installed third-party module.