Natural Language Processing with Python Quick Start Guide

Going from a Python developer to an effective Natural Language Processing Engineer

Product type Paperback
Published in Nov 2018
Publisher Packt
ISBN-13 9781789130386
Length 182 pages
Edition 1st Edition
Author (1):
Nirant Kasliwal

Bread and butter – most common tasks

There are several well-known text cleaning ideas, and all of them have made their way into today's most popular tools, such as NLTK, Stanford CoreNLP, and spaCy. I like spaCy for two main reasons:

  • It's an industry-grade NLP library, unlike NLTK, which is mainly meant for teaching.
  • It has a good speed-to-performance trade-off. spaCy is written largely in Cython, which gives it C-like speed behind a Python API.

spaCy is actively maintained and developed, and incorporates the best methods available for most challenges.

By the end of this section, you will be able to do the following:

  • Understand tokenization and perform it yourself using spaCy
  • Understand why stop word removal and case standardization work, with spaCy examples
  • Differentiate between stemming and lemmatization, with spaCy lemmatization examples
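
As a rough preview of the first two goals, here is a minimal sketch of tokenization, case standardization, and stop word removal. It assumes spaCy is installed and uses a blank English pipeline (`spacy.blank("en")`), so no trained model needs to be downloaded. The helper names `tokenize` and `content_words` and the sample sentence are illustrative choices, not taken from the book. Lemmatization is left out because it normally needs a trained pipeline or extra lookup data, which this sketch does not assume.

```python
import spacy

# Assumption: a blank English pipeline is enough here. It provides the
# tokenizer and lexical attributes (stop words, punctuation flags) only.
nlp = spacy.blank("en")


def tokenize(text):
    """Return the surface string of every token spaCy finds in text."""
    return [token.text for token in nlp(text)]


def content_words(text):
    """Return lowercased tokens, skipping stop words, punctuation and whitespace."""
    doc = nlp(text)
    return [
        token.lower_  # case standardization: lowercase form of the token
        for token in doc
        if not (token.is_stop or token.is_punct or token.is_space)
    ]


if __name__ == "__main__":
    sample = "spaCy splits text into tokens, and it is fast."
    print(tokenize(sample))
    print(content_words(sample))
```

Note that the tokenizer separates punctuation from the neighbouring word, and that the stop word check relies on spaCy's built-in English stop word list, which may differ between spaCy versions.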
...