You're reading from Hands-On Python Natural Language Processing: Explore tools and techniques to analyze and process text with a view to building real-world NLP applications

Product type Paperback

Published in Jun 2020

Publisher Packt

ISBN-13 9781838989590

Length 316 pages

Edition 1st Edition

Authors (2):

Mayank Rasu

Aman Kedia

Table of Contents (16) Chapters

Preface

1. Section 1: Introduction

2. Understanding the Basics of NLP

3. NLP Using Python

4. Section 2: Natural Language Representation and Mathematics

5. Building Your NLP Vocabulary

6. Transforming Text into Data Structures

7. Word Embeddings and Distance Measurements for Text

8. Exploring Sentence-, Document-, and Character-Level Embeddings

9. Section 3: NLP and Learning

10. Identifying Patterns in Text Using Machine Learning

11. From Human Neurons to Artificial Neurons for Understanding Text

12. Applying Convolutions to Text

13. Capturing Temporal Relationships in Text

14. State of the Art in NLP

15. Other Books You May Enjoy

Data preprocessing

Before we delve into these models and their algorithms, we must learn how to preprocess the training data. In Chapter 3, Building Your NLP Vocabulary, we covered several preprocessing steps for text data, such as tokenization, stop word removal, lemmatization, and stemming. However, some additional preprocessing steps are crucial in ML, because the training data needs to adhere to certain rules to be of any value to the model. Poorly processed data almost always yields low-accuracy models. Data preprocessing is a vast field, and the steps you need depend on the data you are working with. For example, you may be required to handle unstructured data; perform outlier analysis, invalid data analysis, and duplicate data analysis; identify correlated features; and more. However, we will focus on some of the most widely used preprocessing...
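To make a few of the checks mentioned above concrete, here is a minimal sketch of invalid data, duplicate data, and outlier analysis on a toy labeled corpus. It is not taken from the book: the sample rows, the `normalize` and `clean_rows` helpers, and the `max_length_ratio` threshold are all hypothetical and chosen only for illustration. It uses the Python standard library alone.

```python
from statistics import median

# Hypothetical toy corpus of (text, label) pairs; not taken from the book.
raw_rows = [
    ("Great product, works well", "pos"),
    ("great product, works well ", "pos"),  # duplicate after normalization
    ("", "neg"),                             # invalid: empty text
    ("Terrible, broke in a day", None),      # invalid: missing label
    ("Not bad at all", "pos"),
    ("ok " * 200, "neg"),                    # far longer than the rest
    ("Would not buy again", "neg"),
]


def normalize(text):
    """Lowercase and collapse whitespace so near-identical rows compare equal."""
    return " ".join(text.lower().split())


def clean_rows(rows, max_length_ratio=10.0):
    """Drop invalid rows, duplicates, and extreme length outliers.

    max_length_ratio is an assumed, illustrative threshold: a text is treated
    as an outlier if it has more than this many times the median token count.
    """
    # 1. Invalid data analysis: keep rows with non-empty text and a label.
    valid = [
        (normalize(text), label)
        for text, label in rows
        if text and text.strip() and label is not None
    ]

    # 2. Duplicate data analysis: keep the first occurrence of each pair.
    seen = set()
    unique = []
    for row in valid:
        if row not in seen:
            seen.add(row)
            unique.append(row)

    # 3. Outlier analysis: compare each token count with the median.
    lengths = [len(text.split()) for text, _ in unique]
    typical = median(lengths)
    return [
        row
        for row, n_tokens in zip(unique, lengths)
        if n_tokens <= max_length_ratio * typical
    ]


cleaned = clean_rows(raw_rows)
for text, label in cleaned:
    print(label, text)
```

On real data, the right rules depend on the task: a very long document may be perfectly valid, and two identical texts with different labels may point to a labeling problem rather than a harmless duplicate.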

Authors (2)

Aman Kedia is a data enthusiast and lifelong learner. He is an avid believer in Artificial Intelligence (AI) and the algorithms supporting it. He has worked on state-of-the-art problems in Natural Language Processing (NLP), encompassing resume matching and digital assistants, among others. He has worked at Oracle and SAP, trying to solve problems leveraging advancements in AI. He has four published research papers in the domain of AI.

Mayank Rasu is the author of the book Hands-On Natural Language Processing with Python. He has more than 12 years of global experience as a data scientist and quantitative analyst in the investment banking domain. He has worked at the intersection of finance and technology and has developed and deployed AI-based applications in the finance domain, including sentiment analyzers, robotic process automation, and deep learning-based document reviewers. Mayank is also an educator and has trained and mentored working professionals on applied AI.
