Python Data Visualization Cookbook (Second Edition): Visualize data using Python's most popular libraries

By Igor Milovanovic, Foures, Giuseppe Vettigli
eBook, Nov 2015, 302 pages, 1st Edition

Chapter 2. Knowing Your Data

In this chapter, we'll cover the following topics:

  • Importing data from CSV
  • Importing data from Microsoft Excel files
  • Importing data from fixed-width data files
  • Importing data from tab-delimited files
  • Importing data from a JSON resource
  • Exporting data to JSON, CSV, and Excel
  • Importing and manipulating data with Pandas
  • Importing data from a database
  • Cleaning up data from outliers
  • Reading files in chunks
  • Reading streaming data sources
  • Importing image data into NumPy arrays
  • Generating controlled random datasets
  • Smoothing the noise in real-world data

Introduction


This chapter covers the basics of importing and exporting data in various formats. We first show how to import data using only the capabilities of the Python standard library; then we introduce the powerful Pandas library, which is becoming the de facto standard for data manipulation in Python. We also cover ways of cleaning data, such as normalizing values, adding missing data, live data inspection, and similar tricks to get data correctly prepared for visualization.

Importing data from CSV


In this recipe, we'll work with the most common file format that you will encounter in the wild world of data—CSV. It stands for Comma Separated Values, which almost explains all the formatting there is. (There is also a header part of the file, but those values are also comma separated.)

Python has a module called csv that supports reading and writing CSV files in various dialects. Dialects are important because there is no standard CSV, and different applications implement CSV in slightly different ways. A file's dialect is almost always recognizable by the first look into the file.
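To illustrate what a dialect means in practice, here is a minimal sketch that lets the csv module guess the dialect of some text before parsing it. The sample content, the semicolon separator, and the function name read_with_detected_dialect are assumptions made for this illustration; they are not taken from the recipe's data file.

```python
import csv
import io

# Hypothetical sample: semicolon-separated values with a header row.
SAMPLE = "day;amount\n2015-01-01;10\n2015-01-02;20\n"


def read_with_detected_dialect(text):
    """Guess the dialect from the text, then parse every row with it."""
    # Restrict the candidates to a few common separators.
    dialect = csv.Sniffer().sniff(text, delimiters=",;\t")
    reader = csv.reader(io.StringIO(text), dialect)
    header = next(reader)            # first row holds the column names
    rows = [row for row in reader]   # remaining rows are the data
    return dialect.delimiter, header, rows


delimiter, header, rows = read_with_detected_dialect(SAMPLE)
print(delimiter, header, rows)
```

Sniffing is a heuristic, so when the separator is known in advance it is usually safer to pass the dialect or delimiter explicitly.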

Getting ready

What we need for this recipe is the CSV file itself. We'll use sample CSV data that you can download from ch02-data.csv.

We assume that sample data files are in the same folder as the code reading them.

How to do it...

The following code example demonstrates how to import data from a CSV file. We will perform the following steps for this:

  1. Open the ch02-data.csv file for reading...

Importing data from Microsoft Excel files


Although Microsoft Excel supports some charting, sometimes you need more flexible and powerful visualization and need to export data from existing spreadsheets into Python for further use.

A common approach to importing data from Excel files is to export data from Excel into CSV-formatted files and use the tools described in the previous recipe to import data using Python from the CSV file. This is a fairly easy process if we have one or two files (and have Microsoft Excel or OpenOffice.org installed), but if we are automating a data pipeline for many files (as part of an ongoing data processing effort), we are not in a position to manually convert every Excel file into CSV. So, we need a way to read any Excel file.

Python has decent support for reading and writing Excel files through the project www.python-excel.org. This support is available in the form of different modules for reading and writing and is platform-independent; in other words, we don't...

Importing data from fixed-width data files


Log files from events and time series data files are common sources for data visualizations. Sometimes, we can read them using CSV dialect for tab-separated data, but sometimes they are not separated by any specific character. Instead, fields are of fixed widths and we can infer the format to match and extract data.

One way to approach this is to read a file line by line and then use string manipulation functions to split a string into separate parts. This approach seems straightforward, and if performance is not an issue, it should be tried first.

If performance is more important or the file to parse is large (hundreds of megabytes), using the Python module struct (http://docs.python.org/library/struct.html) can speed things up, as the module is implemented in C rather than in Python.
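The struct approach can be sketched roughly as follows. The field widths (9, 14, and 5 bytes), the sample record, and the function name parse_line are hypothetical and chosen only for illustration; a real file would need a format string that matches its own layout.

```python
import struct

# Hypothetical layout: 9-byte id, 14-byte name, 5-byte amount.
FIELD_FORMAT = "9s14s5s"
record = struct.Struct(FIELD_FORMAT)


def parse_line(line):
    """Split one fixed-width line (bytes) into stripped text fields."""
    raw_fields = record.unpack(line)  # the line must be exactly record.size bytes
    return [field.decode("ascii").strip() for field in raw_fields]


# Build a sample line that matches the assumed widths.
sample = b"207".ljust(9) + b"widget".ljust(14) + b"4.25".rjust(5)
fields = parse_line(sample)
print(fields)
```

Compiling the format once with struct.Struct avoids re-parsing the format string for every line, which matters when the file has many rows.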

Getting ready

As the module struct is part of the Python Standard Library, we don't need to install any additional software to implement this recipe.

How to do it...

We will...

Importing data from tab-delimited files


Another very common flat data file format is the tab-delimited file. It can come from an Excel export, but it can also be the output of some custom software we must get our input from.

The good thing is that this format can usually be read in almost the same way as CSV files, because the Python module csv supports so-called dialects that enable us to use the same principles to read variations of similar file formats, one of them being the tab-delimited format.

Getting ready

By now, you should be able to read CSV files. If not, please refer to the Importing data from CSV recipe first.

How to do it...

We will reuse the code from the Importing data from CSV recipe, where all we need to change is the dialect we are using as shown in the following code:

import csv

filename = 'ch02-data.tab'

data = []
try:
    with open(filename) as f:
        # excel_tab is the built-in dialect for tab-delimited files
        reader = csv.reader(f, dialect=csv.excel_tab)
        # The first row holds the column names
        header = next(reader)
        data = [row for row in reader]
except...
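For readers without the sample file at hand, the same idea can be tried on an in-memory string. This is only a sketch: the tab-separated text and the function name read_tab_delimited are made up for the example and do not reproduce the contents of ch02-data.tab.

```python
import csv
import io

# Hypothetical tab-separated content standing in for a file on disk.
TAB_TEXT = "day\tamount\n2015-01-01\t10\n2015-01-02\t20\n"


def read_tab_delimited(stream):
    """Read a header row and the data rows from a tab-delimited stream."""
    reader = csv.reader(stream, dialect=csv.excel_tab)
    header = next(reader)
    data = [row for row in reader]
    return header, data


header, data = read_tab_delimited(io.StringIO(TAB_TEXT))
print(header)
print(data)
```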

Importing data from a JSON resource


This recipe will show us how we can read the JSON data format. Moreover, we'll be using a remote resource in this recipe. It will add a tiny level of complexity to the recipe, but it will also make it much more useful because in real life we will encounter more remote resources than local ones.

JavaScript Object Notation (JSON) is widely used as a platform-independent format to exchange data between systems or applications.

A resource, in this context, is anything we can read, be it a file or a URL endpoint (which can be the output of a remote process/program or just a remote static file). In short, we don't care who produced a resource and how they did it; we just need it to be in a known format like JSON.

Getting ready

In order to get started with this recipe, we need the requests module installed and importable (in PYTHONPATH) in our virtual environment. We installed this module in Chapter 1, Preparing Your Working Environment.

We also need Internet...
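A minimal sketch of the two halves of this task, fetching and parsing, might look like the following. The payload and the function names are assumptions for illustration; fetch_json relies on the requests module mentioned above and is not called here, so the parsing part runs without network access.

```python
import json


def parse_json_resource(text):
    """Turn a JSON document (str) into Python lists and dicts."""
    return json.loads(text)


def fetch_json(url):
    """Fetch a URL and decode its JSON body (needs the requests package)."""
    import requests  # imported lazily so the parsing part runs without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


# Hypothetical payload standing in for the body of a remote resource.
PAYLOAD = '[{"id": 1, "type": "PushEvent"}, {"id": 2, "type": "ForkEvent"}]'

events = parse_json_resource(PAYLOAD)
types = [event["type"] for event in events]
print(types)
```

Keeping the parsing step separate from the download makes it possible to treat a local file and a URL endpoint the same way once the text is in hand.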

Exporting data to JSON, CSV, and Excel


As producers of data visualizations, we mostly use other people's data, so importing and reading data are our major activities. Still, we do need to write or export data that we have produced or processed, whether it is for our own or others' current or future use.

We will demonstrate how to use the previously mentioned Python modules to import, export, and write data to various formats such as JSON, CSV, and XLSX.

For demonstration purposes, we are using the pregenerated dataset from the Importing data from fixed-width data files recipe.
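As a rough sketch of the exporting idea, the following code writes the same small dataset as CSV and as JSON using only the standard library. The dataset and function names are hypothetical, and the Excel part is left out here because it depends on a third-party module.

```python
import csv
import io
import json

# Hypothetical dataset: a header row followed by data rows.
ROWS = [["day", "amount"], ["2015-01-01", 10], ["2015-01-02", 20]]


def to_csv_text(rows):
    """Serialize rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def to_json_text(rows):
    """Serialize rows as a JSON list of objects keyed by the header."""
    header, records = rows[0], rows[1:]
    return json.dumps([dict(zip(header, record)) for record in records])


csv_text = to_csv_text(ROWS)
json_text = to_json_text(ROWS)
print(csv_text)
print(json_text)
```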

Getting ready

For the Excel writing part, we will need to install the xlwt module (inside our virtual environment) by executing the following command:

$ pip install xlwt

How to do it...

We will present one code sample that contains all the formats that we want to demonstrate: CSV, JSON, and XLSX. The main part of the program accepts the input and calls appropriate functions to transform data. We will walk through separate sections...

Importing and manipulating data with Pandas


Until now, we have seen how to import and export data using mostly the tools provided in the Python standard library. Now, we'll see how to do some of the operations shown above in just a few lines using the Pandas library. Pandas is an open source, BSD-licensed library that simplifies data import and manipulation by providing data structures and parsing functions.

We will demonstrate how to import, manipulate and export data using Pandas.

Getting ready

To be able to use the code in this section, we need to install Pandas. This can be done again using pip as shown here:

$ pip install pandas

How to do it...

Here, we will again import the data from ch02-data.csv, add a new column to the original data, and export the result to CSV, as shown in the following code snippet:

import pandas as pd

data = pd.read_csv('ch02-data.csv')
# Add a column holding twice the value of each amount
data['amount_x_2'] = data['amount']*2
data.to_csv('ch02-data_more.csv')

How it works...

First, we import Pandas in our environment and then we use...

Importing data from a database


Very often, our work on data analysis and visualization is at the consumer end of the data pipeline. We most often use the already produced data rather than producing the data ourselves. A modern application, for example, holds different datasets inside relational databases (or other databases like MongoDB), and we use these databases to produce beautiful graphs.

This recipe will show you how to use SQL drivers from Python to access data.

We will demonstrate this recipe using a SQLite database because it requires the least effort to set up, but the interface is similar to most other SQL-based database engines (MySQL and PostgreSQL). There are, however, differences in the SQL dialect that those database engines support. This example uses simple SQL language and should be reproducible on most common SQL database engines.
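The general pattern can be sketched with Python's built-in sqlite3 module and an in-memory database. The table name, columns, and rows below are invented for this illustration and are not the dataset used in the recipe.

```python
import sqlite3


def load_population(connection):
    """Return (country, population) rows, largest population first."""
    cursor = connection.execute(
        "SELECT country, population FROM population ORDER BY population DESC"
    )
    return cursor.fetchall()


# An in-memory database avoids creating any file on disk.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE population (country TEXT, population INTEGER)")
# Hypothetical rows; placeholders (?) keep values separate from the SQL text.
connection.executemany(
    "INSERT INTO population VALUES (?, ?)",
    [("A", 100), ("B", 250)],
)

rows = load_population(connection)
connection.close()
print(rows)
```

Other database drivers that follow the Python DB-API expose a similar connect, execute, and fetch flow, although placeholder syntax and SQL dialect can differ between engines.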

Getting ready

To be able to execute this recipe, we need to install the SQLite library as shown here:

$ sudo apt-get install sqlite3

Python support...


Key benefits

  • Learn how to set up an optimal Python environment for data visualization
  • Understand how to import, clean and organize your data
  • Determine different approaches to data visualization and how to choose the most appropriate for your needs

Description

Python Data Visualization Cookbook takes the reader from installing and setting up a Python environment for data manipulation and visualization all the way to 3D animations using Python libraries. Over 60 precise and reproducible recipes guide the reader towards a better understanding of data concepts and the building blocks for subsequent, sometimes more advanced, concepts. The book starts by showing how to set up matplotlib and the related libraries that are required for most parts of the book, before moving on to some of the lesser-used diagrams and charts such as Gantt charts or Sankey diagrams. It begins with simple plots and charts and progresses to more advanced ones, so that the material is easy to follow. As readers go through the book, they will get to know 3D diagrams and animations. Maps are irreplaceable for displaying geospatial data, so the book also shows how to build them. The last chapter explains how to incorporate matplotlib into different environments, such as a writing system, LaTeX, or how to create Gantt charts using Python.

Who is this book for?

If you already know Python programming and want to understand data, data formats, data visualization, and how to use Python to visualize data, then this book is for you.

What you will learn

  • Introduce yourself to the essential tooling to set up your working environment.
  • Explore your data using the capabilities of the Python standard library and the Pandas library
  • Draw your first chart and customize it
  • Use the most popular data visualization Python libraries
  • Make 3D visualizations mainly using mplot3d
  • Create charts with images and maps
  • Understand the most appropriate charts to describe your data
  • Know the matplotlib hidden gems
  • Use plot.ly to share your visualization online

Product Details

Publication date: Nov 30, 2015
Length: 302 pages
Edition: 1st
Language: English
ISBN-13: 9781784394943

Table of Contents

10 Chapters
1. Preparing Your Working Environment
2. Knowing Your Data
3. Drawing Your First Plots and Customizing Them
4. More Plots and Customizations
5. Making 3D Visualizations
6. Plotting Charts with Images and Maps
7. Using the Right Plots to Understand Data
8. More on matplotlib Gems
9. Visualizations on the Clouds with Plot.ly
Index

Customer reviews

Rating: 4 out of 5 (6 Ratings)
5 star 66.7%
4 star 0%
3 star 0%
2 star 33.3%
1 star 0%

Reader, May 29, 2016, 5 stars (Amazon verified review)
Very clear recipes and explanations, everything I hoped it would be.

Oleg Okun, Jan 16, 2016, 5 stars (Amazon verified review)
The title of this book includes the word "cookbook" and as a cookbook the book contains plenty of practical recipes of data visualization in Python. It presents not a mere description of Python packages and commands related to visualization, but embeds these tools into real-world scenarios. Not only visualization itself but also data manipulation enabling insightful visualization are discussed in detail. Needless to say, the discussion of every topic is accompanied by ready-to-use Python code.

Amazon Customer, Dec 07, 2015, 5 stars (Amazon verified review)
The book helped me to understand how to visualize data with python and find a good solution to implement own little datamart at home for home automation project.

Amazon Customer, Dec 31, 2015, 5 stars (Amazon verified review)
I received a free copy of this book in exchange for my review. I think this is a great book. The examples for the different plotting methods and customizations all worked. The first chapter describe set-up and code samples for using data in different formats. I remember when I was first given the task to add a chart to a report to represent data and how it took me a minute to ensure I was doing things correctly. This book would helped me a great deal at that time. Many questions I had previously about plotting and correctly coding solutions for charts I haven't been asked to make yet, were answered. I have been creating reports and charts for a University Research team and this book has been a godsend. I think this book would have helped me when I was working using java for reports and charts. I just this is a great book.

Jonathan, Jul 14, 2017, 2 stars (Amazon verified review)
Please don't get me wrong, the book is quite useful and it's quite frankly more handy for me to look things up in a book than on the internet. But in essence, all the information is freely available on the internet and therefore the book is very, very expensive for a black-and-white handbook!
