Pandas Cookbook
Pandas Cookbook: Recipes for Scientific Computing, Time Series Analysis and Data Visualization using Python

eBook
$29.99 $43.99
Paperback
$54.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with eBook?

  • Instant access to your digital eBook purchase
  • Download this book in EPUB and PDF formats
  • Access this title in our online reader with advanced features
  • DRM FREE - Read whenever, wherever and however you want

Essential DataFrame Operations

In this chapter, we will cover the following topics:

  • Selecting multiple DataFrame columns
  • Selecting columns with methods
  • Ordering column names sensibly
  • Operating on the entire DataFrame
  • Chaining DataFrame methods together
  • Working with operators on a DataFrame
  • Comparing missing values
  • Transposing the direction of a DataFrame operation
  • Determining college campus diversity

Introduction

This chapter covers many fundamental operations of the DataFrame. Many of the recipes will be similar to those in Chapter 1, Pandas Foundations, which primarily covered operations on a Series.

Selecting multiple DataFrame columns

Selecting a single column is accomplished by passing the desired column name as a string to the indexing operator of a DataFrame. This was covered in the Selecting a Series recipe in Chapter 1, Pandas Foundations. It is often necessary to focus on a subset of the current working dataset, which is accomplished by selecting multiple columns.

Getting ready

In this recipe, all the actor and director columns will be selected from the movie dataset.

How to do it...

  1. Read in the movie dataset, and pass in a list of the desired columns to the...
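As a minimal sketch of the idea, the snippet below selects several columns by passing a list to the indexing operator. The small DataFrame and its column names are assumptions standing in for the movie dataset, which is not reproduced here.

```python
import pandas as pd

# Hypothetical stand-in for the movie dataset; column names are assumptions.
movie = pd.DataFrame({
    "movie_title": ["Film A", "Film B"],
    "director_name": ["Director X", "Director Y"],
    "actor_1_name": ["Actor 1", "Actor 2"],
    "actor_2_name": ["Actor 3", "Actor 4"],
    "gross": [100.0, 250.0],
})

# A single string passed to the indexing operator returns a Series.
directors = movie["director_name"]

# A list of strings returns a DataFrame containing only those columns,
# in the order given by the list.
cols = ["actor_1_name", "actor_2_name", "director_name"]
movie_actor_director = movie[cols]

print(type(directors).__name__)
print(type(movie_actor_director).__name__)
```

Storing the column names in a separate list, rather than writing them inline, tends to keep the selection readable when the list grows.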

Selecting columns with methods

Although column selection is usually done directly with the indexing operator, there are some DataFrame methods that facilitate their selection in an alternative manner. select_dtypes and filter are two useful methods to do this.

Getting ready

You need to be familiar with all pandas data types and how to access them. The Understanding data types recipe in Chapter 1, Pandas Foundations, has a table with all pandas data types.

How to do it...

  1. Read in the movie dataset, and use the title of the movie to label each row. Use the get_dtype_counts...
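The two methods named above can be sketched as follows. The data and column names are hypothetical, and the particular arguments shown are only one way to call each method.

```python
import pandas as pd

# Hypothetical stand-in for the movie dataset, labeled by movie title.
movie = pd.DataFrame({
    "movie_title": ["Film A", "Film B"],
    "director_name": ["Director X", "Director Y"],
    "actor_1_facebook_likes": [1000, 2500],
    "duration": [120.0, 95.0],
}).set_index("movie_title")

# select_dtypes picks columns by data type.
numeric = movie.select_dtypes(include=["number"])
non_numeric = movie.select_dtypes(exclude=["number"])

# filter picks columns by name: substring match, regular expression,
# or an explicit list of items.
facebook = movie.filter(like="facebook")
with_digit = movie.filter(regex=r"\d")
chosen = movie.filter(items=["duration", "director_name"])

print(list(numeric.columns))
print(list(facebook.columns))
```

Note that filter inspects only the column labels, never the values stored in the columns.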

Ordering column names sensibly

One of the first tasks to consider after initially importing a dataset as a DataFrame is to analyze the order of the columns. This basic task is often overlooked but can make a big difference in how an analysis proceeds. Computers have no preference for column order and computations are not affected either. As human beings, we naturally view and read columns left to right, which directly impacts our interpretations of the data. Haphazard column arrangement is similar to haphazard clothes arrangement in a closet. It does no good to place suits next to shirts and pants on top of shorts. It's far easier to find and interpret information when column order is given consideration.

There is no standardized set of rules that dictates how columns should be organized within a dataset. However, it is good practice to develop a set of guidelines that you...
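Mechanically, reordering columns amounts to indexing the DataFrame with a list holding the new order. The sketch below uses hypothetical data, and the guideline it applies (text columns first, numeric columns after) is only an assumption for illustration.

```python
import pandas as pd

# Hypothetical data with columns in a haphazard order.
df = pd.DataFrame({
    "gross": [100.0, 250.0],
    "movie_title": ["Film A", "Film B"],
    "duration": [120, 95],
    "director_name": ["Director X", "Director Y"],
})

# Assumed guideline: descriptive text columns first, numeric columns after.
text_cols = ["movie_title", "director_name"]
numeric_cols = ["duration", "gross"]
new_order = text_cols + numeric_cols

# Check that no column was dropped or misspelled before reordering.
assert set(new_order) == set(df.columns)

df_ordered = df[new_order]
print(list(df_ordered.columns))
```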

Operating on the entire DataFrame

In the Calling Series methods recipe in Chapter 1, Pandas Foundations, a variety of methods operated on a single column or Series of data. When these same methods are called from a DataFrame, they perform that operation for each column at once.

Getting ready

In this recipe, we explore a variety of the most common DataFrame attributes and methods with the movie dataset.

How to do it...

  1. Read in the movie dataset, and grab the basic descriptive attributes, shape, size, and ndim, along with running the len function:
>>> movie =...
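A self-contained sketch of the same attributes, using a small hypothetical DataFrame in place of the movie dataset, might look like this. It also shows two methods being applied to every column at once.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the movie dataset: 3 rows, 2 columns.
movie = pd.DataFrame({
    "duration": [120.0, np.nan, 95.0],
    "gross": [100.0, 250.0, 80.0],
})

shape = movie.shape   # (rows, columns)
size = movie.size     # total number of elements, rows * columns
ndim = movie.ndim     # number of dimensions
n_rows = len(movie)   # number of rows

# Called on a DataFrame, these methods operate on each column
# and return one value per column as a Series.
non_missing = movie.count()
minimums = movie.min()

print(shape, size, ndim, n_rows)
print(non_missing)
```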

Chaining DataFrame methods together

Whether you believe method chaining is a good practice or not, it is quite common to encounter it during data analysis with pandas. The Chaining Series methods together recipe in Chapter 1, Pandas Foundations, showcased several examples of chaining Series methods together. All the method chains in this chapter will begin from a DataFrame. One of the keys to method chaining is to know the exact object being returned during each step of the chain. In pandas, this will nearly always be a DataFrame, Series, or scalar value.

Getting ready

In this recipe, we count all the missing values in each column of the movie dataset.
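One way to sketch such a chain is shown below with hypothetical data. The comments track the object returned at each step, which is the key point made above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the movie dataset with some missing values.
movie = pd.DataFrame({
    "director_name": ["Director X", None, None],
    "duration": [120.0, np.nan, 95.0],
    "gross": [100.0, 250.0, np.nan],
})

# isnull returns a DataFrame of booleans; sum then returns a Series
# holding the number of missing values in each column.
missing_per_column = movie.isnull().sum()

# A second sum collapses the Series to a scalar total.
total_missing = movie.isnull().sum().sum()

# any returns a Series of booleans; a second any returns a single boolean.
has_missing = movie.isnull().any().any()

print(missing_per_column)
print(total_missing, has_missing)
```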

...

Working with operators on a DataFrame

A primer on operators was given in the Working with operators on a Series recipe from Chapter 1, Pandas Foundations, which will be helpful here. The Python arithmetic and comparison operators work directly on DataFrames, as they do on Series.

Getting ready

When a DataFrame operates directly with one of the arithmetic or comparison operators, each value of each column gets the operation applied to it. Typically, when an operator is used with a DataFrame, the columns are either all numeric or all object (usually strings). If the DataFrame does not contain homogeneous data, then the operation is likely to fail. Let's see an example of this failure with the college dataset, which contains...
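The behavior described above can be sketched with a small hypothetical DataFrame. The column names are assumptions, not those of the college dataset, and the object dtype is set explicitly so that the string column matches the description in the text.

```python
import pandas as pd

# Hypothetical data: one object (string) column and two numeric columns.
college = pd.DataFrame({
    "name": pd.Series(["College A", "College B"], dtype=object),
    "pct_a": [0.25, 0.5],
    "pct_b": [0.75, 0.5],
})

# With homogeneous numeric data, operators apply to every value.
numeric = college[["pct_a", "pct_b"]]
as_percent = numeric * 100
above_half = numeric > 0.5

# With mixed data, adding a number fails on the string column.
try:
    college + 1
    mixed_failed = False
except TypeError:
    mixed_failed = True

print(as_percent)
print(above_half)
print(mixed_failed)
```

Selecting the homogeneous columns first, as done for `numeric` here, is one straightforward way to avoid the failure.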

Comparing missing values

Pandas uses the NumPy NaN (np.nan) object to represent a missing value. This is an unusual object, as it is not equal to itself. Even Python's None object evaluates as True when compared to itself:

>>> import numpy as np
>>> np.nan == np.nan
False
>>> None == None
True

All other comparisons against np.nan also return False, except not equal to:

>>> np.nan > 5
False
>>> 5 > np.nan
False
>>> np.nan != 5
True

Getting ready

Series and DataFrames use the equals operator, ==, to make element-by-element comparisons that return an object of the same size. This recipe shows you how to use the equals operator, which is very different from the equals method.

As in the previous recipe...
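The difference between the operator and the method can be sketched as follows, using a small hypothetical DataFrame that contains missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({"a": [1.0, np.nan], "b": [np.nan, 4.0]})

# The == operator compares element by element and returns a DataFrame
# of the same shape. Positions holding NaN compare as False, even
# when a DataFrame is compared with itself.
elementwise = df == df

# The equals method returns a single boolean and treats NaN values
# in the same location as equal.
same = df.equals(df)

# Comparing against np.nan never finds missing values; isnull does.
eq_nan = df == np.nan
n_missing = df.isnull().sum().sum()

print(elementwise)
print(same, n_missing)
```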

Transposing the direction of a DataFrame operation

Many DataFrame methods have an axis parameter. This important parameter controls the direction in which the operation takes place. Axis parameters can only be one of two values, either 0 or 1, and are aliased respectively as the strings index and columns.

Getting ready

Nearly all DataFrame methods default the axis parameter to 0/index. This recipe shows you how to invoke the same method, but with the direction of its operation transposed. To simplify the exercise, only the columns that reference the percentage race of each school from the college dataset will be used.
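A brief sketch of the axis parameter follows. The data and column names are hypothetical stand-ins for the race percentage columns of the college dataset.

```python
import pandas as pd

# Hypothetical percentage columns for three schools.
college = pd.DataFrame(
    {"pct_a": [0.25, 0.5, 0.125], "pct_b": [0.75, 0.5, 0.875]},
    index=["College A", "College B", "College C"],
)

# Default direction (axis=0 or 'index'): one result per column.
per_column = college.count()

# Transposed direction (axis=1 or 'columns'): one result per row.
per_row = college.count(axis="columns")

# The integer and the string alias are interchangeable.
row_totals = college.sum(axis=1)
row_totals_alias = college.sum(axis="columns")

print(per_column)
print(per_row)
print(row_totals)
```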

How to do...

Determining college campus diversity

Many articles are written every year on the different aspects and impacts of diversity on college campuses. Various organizations have developed metrics attempting to measure diversity. US News is a leader in providing rankings for many different categories of colleges, with diversity being one of them.

Their top 10 diverse colleges with Diversity Index are given as follows:

>>> pd.read_csv('data/college_diversity.csv', index_col='School')

Getting ready

Our college dataset classifies race into nine different categories. When trying to quantify something without an obvious definition, such as diversity, it helps to start with something very simple. In this recipe...


Key benefits

  • Use the power of pandas 0.20 to solve the most complex scientific computing problems with ease
  • Leverage fast, robust data structures in pandas 0.20 to gain useful insights from your data
  • Practical, easy-to-implement recipes for quick solutions to common data problems using pandas 0.20

Description

This book will provide you with unique, idiomatic, and fun recipes for both fundamental and advanced data manipulation tasks with pandas 0.20. Some recipes focus on achieving a deeper understanding of basic principles, or comparing and contrasting two similar operations. Other recipes will dive deep into a particular dataset, uncovering new and unexpected insights along the way. The pandas library is massive, and it's common for frequent users to be unaware of many of its more impressive features. The official pandas documentation, while thorough, does not contain many useful examples of how to piece together multiple commands like one would do during an actual analysis. This book guides you, as if you were looking over the shoulder of an expert, through practical situations that you are highly likely to encounter. Many advanced recipes combine several different features across the pandas 0.20 library to generate results.

Who is this book for?

This book is for data scientists, analysts and Python developers who wish to explore data analysis and scientific computing in a practical, hands-on manner. The recipes included in this book are suitable for both novice and advanced users, and contain helpful tips, tricks and caveats wherever necessary. Some understanding of pandas will be helpful, but not mandatory.

What you will learn

  • Master the fundamentals of pandas 0.20 to quickly begin exploring any dataset
  • Isolate any subset of data by properly selecting and querying the data
  • Split data into independent groups before applying aggregations and transformations to each group
  • Restructure data into tidy form to make data analysis and visualization easier
  • Prepare real-world messy datasets for machine learning
  • Combine and merge data from different sources through pandas' SQL-like operations
  • Utilize pandas' unparalleled time series functionality
  • Create beautiful and insightful visualizations through pandas 0.20's direct hooks to Matplotlib and Seaborn

Product Details

Publication date : Oct 23, 2017
Length: 532 pages
Edition : 1st
Language : English
ISBN-13 : 9781784393342



Table of Contents

11 Chapters
  • Pandas Foundations
  • Essential DataFrame Operations
  • Beginning Data Analysis
  • Selecting Subsets of Data
  • Boolean Indexing
  • Index Alignment
  • Grouping for Aggregation, Filtration, and Transformation
  • Restructuring Data into a Tidy Form
  • Combining Pandas Objects
  • Time Series Analysis
  • Visualization with Matplotlib, Pandas, and Seaborn

Customer reviews

Rating: 4.3 out of 5 (32 ratings)

Rating distribution
5 star 75%
4 star 3.1%
3 star 6.3%
2 star 9.4%
1 star 6.3%

Most recent reviews

Cedric Maltais, Feb 13, 2020
Rating: 5 out of 5
Great book ! help me a lot
Amazon verified review

Jay Brown, Jan 03, 2020
Rating: 5 out of 5
Helpful!
Amazon verified review

Dana Maria, Oct 31, 2019
Rating: 5 out of 5
Very useful book, especially for the beginner! Clearly organization and lots of example for you to practice. If you want to be a data analysis, it helps a lot!
Amazon verified review

Sri, Oct 01, 2019
Rating: 5 out of 5
Well thought out book. Simple to follow layout and examples. Studying the book in depth will give full mastery on Pandas application in real world. The style is a trend setter. Fortunate to have come across this book. Regards, Sri
Amazon verified review

an, Aug 31, 2019
Rating: 2 out of 5
The cookbooks I have read are quick reads that you can refer to solve a problem at hand. This book breaks each chapter into ridiculous sub-sections 'Getting Ready' 'How to do it' 'How it works' oh and one more 'There's more...' This continues on chapter after chapter without giving much details there..why am i reading this when it provides no info except filling pages. The funny thing is the Contents page also have the same repetition all over, looks like the publisher was snoring during the review :)If all the redundancies are removed I bet the number of pages could be more than halved. I would look for a different book.
Amazon verified review

FAQs

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the book details for that title. Add either the standalone eBook or the eBook and print book bundle to your shopping cart. Your eBook will show in your cart as a product on its own. After completing checkout and payment in the normal way, you will receive your receipt on the screen containing a link to a personalised PDF download file. This link will remain active for 30 days. You can download backup copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will download and open the PDF file directly. If you don't, then save the PDF file on your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing

When you buy an eBook from Packt Publishing, completing your purchase means you accept the terms of our licence agreement. Please read the full text of the agreement. In it we have tried to balance the need for the eBook to be usable for you, the reader, with our need to protect our rights as publishers and those of our authors. In summary, the agreement says:

  • You may make copies of your eBook for your own use onto any machine
  • You may not pass copies of the eBook on to anyone else
How can I make a purchase on your website?

If you want to purchase a video course, eBook, or bundle (print + eBook), follow the steps below:

  1. Register on our website using your email address and a password.
  2. Search for the title by name or ISBN using the search option.
  3. Select the title you want to purchase.
  4. Choose the format you wish to purchase the title in; if you order the print book, you get a free eBook copy of the same title.
  5. Proceed with the checkout process (payment can be made by credit card, debit card, or PayPal).
Where can I access support around an eBook?
  • If you experience a problem with using or installing Adobe Reader, then contact Adobe directly.
  • To view the errata for the book, see www.packtpub.com/support and view the pages for the title you have.
  • To view your account details or to download a new copy of the book go to www.packtpub.com/account
  • To contact us directly if a problem is not resolved, use www.packtpub.com/contact-us
What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats, such as PDF and ePub. In the future, this may well change with trends and developments in technology, but please note that our PDFs are not in Adobe eBook Reader format, which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF eBooks.

What are the benefits of eBooks?
  • You can get the information you need immediately
  • You can easily take them with you on a laptop
  • You can download them an unlimited number of times
  • You can print them out
  • They are copy-paste enabled
  • They are searchable
  • There is no password protection
  • They are priced lower than print
  • They save resources and space
What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available in PDF and ePub formats. Every piece of content down to the page numbering is the same. Because we save the costs of printing and shipping the book to you, we are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the link in Your Download Area. We recommend saving the file to your hard drive before opening it.

For optimal viewing of our eBooks, we recommend you download and install the free Adobe Reader version 9.