
IPython Interactive Computing and Visualization Cookbook: Over 100 hands-on recipes to sharpen your skills in high-performance numerical computing and data science in the Jupyter Notebook

Product type Paperback

Published in Jan 2018

Publisher Packt

ISBN-13 9781785888632

Length 548 pages

Edition 2nd Edition

Languages

Python

Tools

IPython

Concepts

Data Analysis

Author (1):

Cyrille Rossant


Table of Contents (17) Chapters

Preface

1. A Tour of Interactive Computing with Jupyter and IPython

2. Best Practices in Interactive Computing

3. Mastering the Jupyter Notebook

4. Profiling and Optimization

5. High-Performance Computing

6. Data Visualization

7. Statistical Data Analysis

8. Machine Learning

9. Numerical Optimization

10. Signal Processing

11. Image and Audio Processing

12. Deterministic Dynamical Systems

13. Stochastic Dynamical Systems

14. Graphs, Geometry, and Geographic Information Systems

15. Symbolic and Numerical Mathematics

Index

Learning from text – Naive Bayes for Natural Language Processing

In this recipe, we show how to handle text data with scikit-learn. Working with text requires careful preprocessing and feature extraction, and the resulting feature matrices are often highly sparse.
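To make the point about feature extraction and sparsity concrete, here is a minimal sketch that turns a few comments into a tf-idf matrix with scikit-learn's `TfidfVectorizer`. The comments are made up for illustration and are not taken from the dataset used in this recipe; the variable names are likewise assumptions.

```python
import scipy.sparse as sp
import sklearn.feature_extraction.text as text

# Hypothetical toy comments, invented for this illustration only.
comments = [
    "You are wrong about this",
    "Thanks, that was a helpful answer",
    "What a stupid thing to say",
    "I agree with the previous comment",
]

# Tokenize each comment and weight every term by tf-idf.
tf = text.TfidfVectorizer()
X = tf.fit_transform(comments)

# X is a SciPy sparse matrix: one row per comment, one column per term.
n_rows, n_cols = X.shape
density = X.nnz / (n_rows * n_cols)
print(f"shape: {X.shape}, stored values: {X.nnz}, density: {density:.2f}")
print("sparse:", sp.issparse(X))
```

Each comment uses only a handful of the terms in the vocabulary, so most entries of the matrix are zero. On a real corpus with thousands of distinct terms, the density is usually far lower, which is why sparse storage matters.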

We will learn to recognize whether a comment posted during a public discussion is considered insulting to one of the participants. We will use a labeled dataset from Impermium, released during a Kaggle competition (see http://www.kaggle.com/c/detecting-insults-in-social-commentary).

How to do it...

Let's import our libraries:

>>> import numpy as np
    import pandas as pd
    import sklearn
    import sklearn.model_selection as ms
    import sklearn.feature_extraction.text as text
    import sklearn.naive_bayes as nb
    import matplotlib.pyplot as plt
    %matplotlib inline

Let's open the CSV file with pandas:

>>> df = pd.read_csv('https://github.com/ipython-books/'
                     'cookbook-2nd-data/blob/master...
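As a rough, self-contained illustration of the kind of workflow these imports suggest (and not the recipe's own code), the sketch below vectorizes a few made-up comments with `TfidfVectorizer` and fits a `BernoulliNB` classifier on them. The comments, the labels, and all variable names are hypothetical; the real dataset and its column names are not assumed here.

```python
import sklearn.feature_extraction.text as text
import sklearn.naive_bayes as nb

# Hypothetical toy data, invented for illustration:
# 1 = insulting, 0 = not insulting.
train_comments = [
    "you are an idiot",
    "what a stupid idiot",
    "you are so stupid",
    "thank you for the helpful answer",
    "that is a great point",
    "thanks for the great answer",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Turn the raw text into a sparse tf-idf feature matrix.
tf = text.TfidfVectorizer()
X_train = tf.fit_transform(train_comments)

# Bernoulli Naive Bayes binarizes the features by default
# (binarize=0.0), so it only uses whether a term is present.
bnb = nb.BernoulliNB()
bnb.fit(X_train, train_labels)

# New comments must go through the same fitted vectorizer.
new_comments = ["you stupid idiot", "thanks for the great answer"]
X_new = tf.transform(new_comments)
predictions = bnb.predict(X_new)
print(predictions)
```

With a real dataset, one would typically hold out a test set (for example with `sklearn.model_selection.train_test_split`) and evaluate the classifier on it rather than on hand-picked sentences.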

