Hands-On Data Analysis with Pandas: Efficiently perform data collection, wrangling, analysis, and visualization using Python

Product type Paperback

Published in Jul 2019

Publisher Packt

ISBN-13 9781789615326

Length 740 pages

Edition 1st Edition

Languages Python

Tools Pandas

Concepts Data Analysis

Author (1):

Stefanie Molin

Table of Contents (21) Chapters

Preface

1. Section 1: Getting Started with Pandas

2. Introduction to Data Analysis

3. Working with Pandas DataFrames

4. Section 2: Using Pandas for Data Analysis

5. Data Wrangling with Pandas

6. Aggregating Pandas DataFrames

7. Visualizing Data with Pandas and Matplotlib

8. Plotting with Seaborn and Customization Techniques

9. Section 3: Applications - Real-World Analyses Using Pandas

10. Financial Analysis - Bitcoin and the Stock Market

11. Rule-Based Anomaly Detection

12. Section 4: Introduction to Machine Learning with Scikit-Learn

13. Getting Started with Machine Learning in Python

14. Making Better Predictions - Optimizing Models

15. Machine Learning Anomaly Detection

16. Section 5: Additional Resources

17. The Road Ahead

18. Solutions

19. Other Books You May Enjoy

Appendix

Cleaning up the data

Let's move on to the 3-cleaning_data.ipynb notebook for our discussion of data cleaning. We will begin by importing pandas and reading in the data/nyc_temperatures.csv file, which contains the maximum daily temperature (TMAX), minimum daily temperature (TMIN), and average daily temperature (TAVG) recorded at the LaGuardia Airport station in New York City for October 2018:

>>> import pandas as pd

>>> df = pd.read_csv('data/nyc_temperatures.csv')
>>> df.head()

The data we retrieved from the API is in long format, with one row per measurement. For our analysis, we want it in wide format; we will address that in the Pivoting DataFrames section later in this chapter. For now, the first rows look like this:

  attributes  datatype  date                 station            value
0 H,,S,       TAVG      2018-10-01T00:00:00  GHCND:USW00014732  21.2
1 ,,W,2400    TMAX      2018-10-01T00:00:00  GHCND:USW00014732  25.6
2 ,,W,2400    TMIN...
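To make the long-versus-wide distinction concrete, here is a minimal sketch built only from the two complete rows shown above. It is an illustration under assumptions, not necessarily the approach taken in the Pivoting DataFrames section: the names csv_text, long_df, and wide_df are hypothetical, the data is read from an in-memory string rather than data/nyc_temperatures.csv, and the truncated third row is left out because its values are not visible.

```python
import io

import pandas as pd

# Two complete rows copied from the preview above. The attributes field is
# quoted because it contains commas of its own.
csv_text = '''attributes,datatype,date,station,value
"H,,S,",TAVG,2018-10-01T00:00:00,GHCND:USW00014732,21.2
",,W,2400",TMAX,2018-10-01T00:00:00,GHCND:USW00014732,25.6
'''

# Long format: one row per (date, datatype) measurement.
long_df = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'])

# Wide format: one row per date, one column per datatype.
wide_df = long_df.pivot(index='date', columns='datatype', values='value')

print(long_df[['date', 'datatype', 'value']])
print(wide_df)
```

In the wide result, each temperature measurement becomes its own column (here TAVG and TMAX), so a single row describes a full day. The other columns (attributes and station) are dropped in this sketch because only value is passed to pivot().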

Authors (1)

Stefanie Molin

Stefanie Molin is a data scientist and software engineer at Bloomberg LP in NYC, tackling tough problems in information security, particularly revolving around anomaly detection, building tools for gathering data, and knowledge sharing. She has extensive experience in data science, designing anomaly detection solutions, and utilizing machine learning in both R and Python in the AdTech and FinTech industries. She holds a B.S. in operations research from Columbia University's Fu Foundation School of Engineering and Applied Science, with minors in economics, and entrepreneurship and innovation. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.
