Practical Data Science Cookbook, Second Edition

You're reading from Practical Data Science Cookbook, Second Edition: Data pre-processing, analysis and visualization using R and Python

Product type Paperback
Published in Jun 2017
Publisher Packt
ISBN-13 9781787129627
Length 434 pages
Edition 2nd Edition
Authors (5): Anthony Ojeda, Prabhanjan Narayanachar Tattar, Abhijit Dasgupta, Sean P Murphy, Bhushan Purushottam Joshi

Table of Contents (12 chapters)

Preface
1. Preparing Your Data Science Environment
2. Driving Visual Analysis with Automobile Data with R
3. Creating Application-Oriented Analyses Using Tax Data and Python
4. Modeling Stock Market Data
5. Visually Exploring Employment Data
6. Driving Visual Analyses with Automobile Data
7. Working with Social Graphs
8. Recommending Movies at Scale (Python)
9. Harvesting and Geolocating Twitter Data (Python)
10. Forecasting New Zealand Overseas Visitors
11. German Credit Data Analysis

Installing and using virtualenv

virtualenv is a transformative Python tool. Once you start using it, you will never look back. virtualenv creates a local environment with its own Python distribution installed. Once this environment is activated from the shell, you can easily install packages using pip install into the new local Python.

At first, this might sound strange. Why would anyone want to do this? Not only does this help you handle the issue of package dependencies and versions in Python, but it also allows you to experiment rapidly without breaking anything important. Imagine that you build a web application that requires version 0.8 of the awesome_template library, but then your new data product needs version 1.2 of the same library. What do you do? With virtualenv, you can have both, each in its own environment.
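Two versions of a library can coexist because each environment keeps its own package directory. The following is a minimal sketch of that idea, not the recipe itself: it uses the standard-library venv module as a stand-in for virtualenv, and the environment names web_app and data_product, as well as the helper make_env, are hypothetical. The environments are created without pip so that the sketch runs offline.

```python
import tempfile
import venv
from pathlib import Path


def make_env(parent, name):
    """Create a bare environment (no pip) and return its site-packages directory."""
    env_dir = Path(parent) / name
    # Stand-in for `virtualenv <name>`; with_pip=False avoids any download.
    venv.create(env_dir, with_pip=False)
    # The package directory sits at a platform-specific depth, so search for it.
    return next(env_dir.rglob("site-packages"))


with tempfile.TemporaryDirectory() as tmp:
    web_app = make_env(tmp, "web_app")            # hypothetical environment name
    data_product = make_env(tmp, "data_product")  # hypothetical environment name
    separate = web_app != data_product
    print(web_app.relative_to(tmp))
    print(data_product.relative_to(tmp))

print("separate package directories:", separate)
```

A package installed into one of these directories is not visible from the other environment, which is why each project can pin its own version.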

As another use case, what happens if you don't have admin privileges on a particular machine? You can't use sudo pip install to install the packages required for your analysis, so what do you do? If you use virtualenv, it doesn't matter.

Virtual environments are development tools that software developers use to collaborate effectively. Environments ensure that the software runs on different computers (for example, from production to development servers) with varying dependencies. The environment also alerts other developers to the needs of the software under development. Python's virtualenv ensures that the software created is in its own holistic environment, can be tested independently, and built collaboratively.

Getting ready

Assuming you have completed the previous recipe, you are ready to go for this one.

How to do it...

Install and test the virtual environment using the following steps:

  1. Open a command-line shell and type in the following command:
pip install virtualenv

Alternatively, if the installation requires administrator privileges, you can type in the following command:

sudo pip install virtualenv
  2. Once installed, type virtualenv in the command window, and you should be greeted with the tool's usage information.
  3. Create a temporary directory and change location to this directory using the following commands:
mkdir temp
cd temp
  4. From within the directory, create the first virtual environment named venv:
virtualenv venv
  5. You should see text similar to the following:
New python executable in venv/bin/python
Installing setuptools, pip...done.
  6. The new local Python distribution is now available. To use it, we need to activate venv using the following command:
source ./venv/bin/activate
  7. The activate script is not executable and must be run using the source command. Also, note that your shell's command prompt has probably changed and is now prefixed with venv to indicate that you are working in your new virtual environment.
  8. To check this, use which to see the location of Python, as follows:
which python

You should see the following output:

/path/to/your/temp/venv/bin/python

So, when you type python once your virtual environment is activated, you will run the local Python.

  9. Next, install something by typing the following:
pip install flask

Flask is a micro-web framework written in Python; the preceding command will install a number of packages that Flask uses.

  10. Finally, we demonstrate the versioning power that virtual environment and pip offer, as follows:
pip freeze > requirements.txt
cat requirements.txt

This should produce output similar to the following (the exact versions depend on when you run the installation):

Flask==0.10.1
Jinja2==2.7.2
MarkupSafe==0.19
Werkzeug==0.9.4
itsdangerous==0.23
wsgiref==0.1.2
  11. Note that the file captures not only the name of each package, but also its exact version number. The beauty of this requirements.txt file is that, if we have a new virtual environment, we can simply issue the following command to install each of the specified versions of the listed Python packages:
pip install -r requirements.txt
  12. To deactivate your virtual environment, simply type the following at the shell prompt:
deactivate
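To make the format of requirements.txt concrete, here is a small sketch that reads the pinned versions from the pip freeze output shown in the steps above. The helper parse_pins is hypothetical and handles only plain name==version lines; it is not how pip itself reads the file, and the full requirements syntax allows much more.

```python
# Sample text copied from the pip freeze output shown in the steps above.
REQUIREMENTS = """\
Flask==0.10.1
Jinja2==2.7.2
MarkupSafe==0.19
Werkzeug==0.9.4
itsdangerous==0.23
wsgiref==0.1.2
"""


def parse_pins(text):
    """Return {package: version} for every 'name==version' line (hypothetical helper)."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, version = line.partition("==")
        if sep:  # ignore lines that are not exact pins
            pins[name] = version
    return pins


pins = parse_pins(REQUIREMENTS)
print(pins["Flask"])  # prints 0.10.1
```

Because every line pins an exact version, rebuilding the environment from this file on another machine should yield the same set of packages.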

How it works...

virtualenv creates its own virtual environment with its own installation directories that operate independently from the default system environment. This allows you to try out new libraries without polluting your system-level Python distribution. Further, if you have an application that just works and want to leave it alone, you can do so by making sure the application has its own virtualenv.
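You can observe this separation from inside Python as well. The sketch below reports which interpreter is running and whether it belongs to a virtual environment. The function name in_virtual_environment is an assumption made for illustration. The check relies on two conventions that, to the best of our knowledge, hold in practice: older virtualenv releases set sys.real_prefix, while newer releases and the standard-library venv module make sys.prefix differ from sys.base_prefix.

```python
import sys


def in_virtual_environment():
    """Best-effort check for an active virtual environment (hypothetical helper)."""
    # Older virtualenv releases expose the original installation as sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and venv keep the original in sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("interpreter:", sys.executable)
print("installation prefix:", sys.prefix)
print("inside a virtual environment:", in_virtual_environment())
```

Run with the environment activated, the interpreter path should point into the venv directory, matching what which python reports in the shell.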

There's more...

virtualenv is a fantastic tool, one that will prove invaluable to any Python programmer. However, we wish to offer a note of caution. Many Python packages link against compiled C shared objects in order to improve performance. Therefore, installing certain Python packages, such as NumPy and SciPy, into your virtual environment may require external dependencies to be compiled and installed, and these are system specific. Even when successful, these compilations can be tedious, which is one of the reasons for maintaining a virtual environment. Worse, missing dependencies will cause compilations to fail, leaving you to troubleshoot alien error messages, dated makefiles, and complex dependency chains. This can be daunting even to the most veteran data scientist.

A quick solution is to use a package manager to install complex libraries into the system environment (aptitude or Yum on Linux, Homebrew or MacPorts on OS X; on Windows, precompiled installers are generally available). These tools use precompiled forms of the third-party packages. Once you have these Python packages installed in your system environment, you can use the --system-site-packages flag when initializing a virtualenv. This flag tells the virtualenv tool to use the system site packages already installed and circumvents the need for an additional installation that will require compilation. In order to nominate packages particular to your environment that might already be in the system (for example, when you wish to use a newer version of a package), use pip install -I (short for --ignore-installed) to install dependencies into the virtualenv and ignore the global packages. This technique works best when you only install large-scale packages on your system, but use virtualenv for other types of development.
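The effect of the flag can be inspected in the environment's configuration file. The sketch below is an illustration under stated assumptions rather than the virtualenv tool itself: it uses the standard-library venv module, whose system_site_packages option plays the same role as --system-site-packages, and then reads the resulting pyvenv.cfg file. The helper read_cfg is hypothetical, and pip is left out so that the sketch runs offline.

```python
import tempfile
import venv
from pathlib import Path


def read_cfg(env_dir):
    """Parse the 'key = value' lines of an environment's pyvenv.cfg (hypothetical helper)."""
    cfg = {}
    for line in (Path(env_dir) / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            cfg[key.strip()] = value.strip()
    return cfg


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "venv"
    # system_site_packages=True mirrors the --system-site-packages flag;
    # with_pip=False keeps the sketch fast and avoids any download.
    venv.EnvBuilder(system_site_packages=True, with_pip=False).create(env_dir)
    setting = read_cfg(env_dir)["include-system-site-packages"]

print("include-system-site-packages:", setting)
```

With the option enabled, packages already installed system-wide become importable inside the environment, while anything installed with pip inside the environment still lands in the environment's own directory.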

For the rest of the book, we will assume that you are using a virtualenv and have the tools mentioned in this chapter ready to go. Therefore, we won't enforce or discuss the use of virtual environments in much detail. Just consider the virtual environment as a safety net that will allow you to perform the recipes listed in this book in isolation.
