You're reading from Hands-On Data Analysis with Pandas A Python data science handbook for data collection, wrangling, analysis, and visualization

Product type Paperback

Published in Apr 2021

Publisher Packt

ISBN-13 9781800563452

Length 788 pages

Edition 2nd Edition

Languages

Python

Tools

Pandas

Concepts

Data Analysis

Author (1):

Stefanie Molin


Table of Contents (21) Chapters

Preface

1. Section 1: Getting Started with Pandas

2. Chapter 1: Introduction to Data Analysis

3. Chapter 2: Working with Pandas DataFrames

4. Section 2: Using Pandas for Data Analysis

5. Chapter 3: Data Wrangling with Pandas

6. Chapter 4: Aggregating Pandas DataFrames

7. Chapter 5: Visualizing Data with Pandas and Matplotlib

8. Chapter 6: Plotting with Seaborn and Customization Techniques

9. Section 3: Applications – Real-World Analyses Using Pandas

10. Chapter 7: Financial Analysis – Bitcoin and the Stock Market

11. Chapter 8: Rule-Based Anomaly Detection

12. Section 4: Introduction to Machine Learning with Scikit-Learn

13. Chapter 9: Getting Started with Machine Learning in Python

14. Chapter 10: Making Better Predictions – Optimizing Models

15. Chapter 11: Machine Learning Anomaly Detection

16. Section 5: Additional Resources

17. Chapter 12: The Road Ahead

18. Solutions

19. Other Books You May Enjoy

Appendix

Preprocessing data

In this section, we will be working in the preprocessing.ipynb notebook before we return to the notebooks we used for EDA. We will begin with our imports and read in the data:

>>> import numpy as np
>>> import pandas as pd
>>> planets = pd.read_csv('data/planets.csv')
>>> red_wine = pd.read_csv('data/winequality-red.csv')
>>> wine = pd.concat([
...     pd.read_csv(
...         'data/winequality-white.csv', sep=';'
...     ).assign(kind='white'), 
...     red_wine.assign(kind='red')
... ])
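
As a rough illustration of what the read-and-concatenate step above produces, the following sketch applies the same pattern to small in-memory stand-ins. The two-row CSV strings and their values are hypothetical placeholders, not rows from the actual wine files; only the delimiters (semicolon for white, comma for red) and the kind column mirror the code above.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two wine files (illustrative values only).
# As in the code above, the white data is semicolon-delimited and the red
# data is comma-delimited.
white_csv = io.StringIO("fixed acidity;quality\n7.0;6\n6.3;5\n")
red_csv = io.StringIO("fixed acidity,quality\n7.4,5\n7.8,6\n")

# assign() returns a copy with a new column, so each row is labeled
# with its source before the frames are stacked.
white = pd.read_csv(white_csv, sep=';').assign(kind='white')
red = pd.read_csv(red_csv).assign(kind='red')

# concat() stacks the rows and keeps each frame's original index labels.
wine = pd.concat([white, red])

print(wine.shape)
print(wine.kind.value_counts())
print(wine.index.tolist())
```

Because pd.concat() keeps the original row labels by default, the combined index contains repeated values. Passing ignore_index=True is one way to get a fresh index if later steps rely on unique labels.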

Machine learning models follow the garbage in, garbage out principle. We have to make sure that we train our models (have them learn) on the best possible version of the data. What this means will depend on the model we choose. For instance, models that...
