You're reading from Hands-On Data Analysis with Pandas A Python data science handbook for data collection, wrangling, analysis, and visualization

Product type Paperback

Published in Apr 2021

Publisher Packt

ISBN-13 9781800563452

Length 788 pages

Edition 2nd Edition

Languages

Python

Tools

Pandas

Concepts

Data Analysis

Author (1):

Stefanie Molin


Table of Contents (21) Chapters

Preface

1. Section 1: Getting Started with Pandas

2. Chapter 1: Introduction to Data Analysis

3. Chapter 2: Working with Pandas DataFrames

4. Section 2: Using Pandas for Data Analysis

5. Chapter 3: Data Wrangling with Pandas

6. Chapter 4: Aggregating Pandas DataFrames

7. Chapter 5: Visualizing Data with Pandas and Matplotlib

8. Chapter 6: Plotting with Seaborn and Customization Techniques

9. Section 3: Applications – Real-World Analyses Using Pandas

10. Chapter 7: Financial Analysis – Bitcoin and the Stock Market

11. Chapter 8: Rule-Based Anomaly Detection

12. Section 4: Introduction to Machine Learning with Scikit-Learn

13. Chapter 9: Getting Started with Machine Learning in Python

14. Chapter 10: Making Better Predictions – Optimizing Models

15. Chapter 11: Machine Learning Anomaly Detection

16. Section 5: Additional Resources

17. Chapter 12: The Road Ahead

18. Solutions

19. Other Books You May Enjoy

Appendix

Implementing supervised anomaly detection

The SOC has finished labeling the 2018 data, so we should revisit our EDA to confirm that our plan of counting the number of usernames with failures at a one-minute resolution does separate the data. This EDA is in the 3-EDA_labeled_data.ipynb notebook. After some data wrangling, we are able to create the following scatter plot, which shows that this strategy does indeed appear to separate the suspicious activity:

Figure 11.12 – Confirming that our features can help form a decision boundary
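To make the feature concrete, here is a minimal sketch of counting distinct usernames with failed login attempts per minute. The toy log entries and the column names (datetime, username, success) are assumptions made for illustration; the wrangling in the notebook may differ:

```python
import pandas as pd

# Hypothetical log entries; the data and column names are illustrative assumptions
logs = pd.DataFrame({
    'datetime': pd.to_datetime([
        '2018-01-01 00:00:05', '2018-01-01 00:00:20',
        '2018-01-01 00:00:40', '2018-01-01 00:01:10',
        '2018-01-01 00:01:30', '2018-01-01 00:02:15',
    ]),
    'username': ['alice', 'bob', 'alice', 'carol', 'carol', 'dave'],
    'success': [False, False, True, False, False, True],
})

# Keep only the failed attempts
failures = logs[~logs['success']]

# Count the distinct usernames with at least one failure in each minute
usernames_with_failures = (
    failures.set_index('datetime')
    .resample('1min')['username']
    .nunique()
    .rename('usernames_with_failures')
)
print(usernames_with_failures)
```

A minute in which many distinct usernames fail is the kind of observation we would expect to fall on the suspicious side of a decision boundary, while repeated failures by a single user only count once.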

In the 4-supervised_anomaly_detection.ipynb notebook, we will create some supervised models. This time we need to read in all the labeled data for 2018. Note that the code for reading in the logs is omitted since it is the same as in the previous section:

>>> with sqlite3.connect('logs/logs.db') as conn:
...     hackers_2018 = pd.read_sql(
...      ...
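Since the query itself is cut off in this excerpt, the following is only a sketch of the general pattern of reading a date range from SQLite into a dataframe with pd.read_sql(). The in-memory database, the labeled_events table, and its columns are hypothetical stand-ins, not the contents of logs/logs.db:

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database standing in for logs/logs.db
with sqlite3.connect(':memory:') as conn:
    # The table and column names here are assumptions for illustration
    conn.execute(
        'CREATE TABLE labeled_events '
        '(start_time TEXT, end_time TEXT, source_ip TEXT)'
    )
    conn.executemany(
        'INSERT INTO labeled_events VALUES (?, ?, ?)',
        [
            ('2018-01-01 00:00:00', '2018-01-01 00:03:00', '10.0.0.1'),
            ('2018-02-01 12:00:00', '2018-02-01 12:05:00', '10.0.0.2'),
            ('2019-01-05 08:00:00', '2019-01-05 08:02:00', '10.0.0.3'),
        ],
    )

    # Read only the 2018 rows and parse the timestamp columns as datetimes
    events = pd.read_sql(
        'SELECT * FROM labeled_events WHERE start_time BETWEEN ? AND ?',
        conn,
        params=('2018-01-01', '2019-01-01'),
        parse_dates=['start_time', 'end_time'],
    )

# The with block commits or rolls back, but does not close the connection
conn.close()
print(events)
```

Passing the date bounds through params keeps the values out of the query string, and parse_dates saves a separate pd.to_datetime() call after loading.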
