Natural Language Processing: Python and NLTK

Learn to build expert NLP and machine learning projects using NLTK and other Python libraries

Product type: Course
Published in: Nov 2016
Publisher: Packt
ISBN-13: 9781787285101
Length: 702 pages
Edition: 1st Edition
Authors (5): Iti Mathur, Jacob Perkins, Deepti Chopra, Nitin Hardeniya, Nisheeth Joshi

Chapter 8. Distributed Processing and Handling Large Datasets

In this chapter, we will cover the following recipes:

  • Distributed tagging with execnet
  • Distributed chunking with execnet
  • Parallel list processing with execnet
  • Storing a frequency distribution in Redis
  • Storing a conditional frequency distribution in Redis
  • Storing an ordered dictionary in Redis
  • Distributed word scoring with Redis and execnet

Introduction

NLTK is great for in-memory, single-processor natural language processing. However, there are times when you have a lot of data to process and want to take advantage of multiple CPUs, multicore CPUs, and even multiple computers. Or, you might want to store frequencies and probabilities in a persistent, shared database so that multiple processes can access them simultaneously. For the first case, we'll be using execnet to do parallel and distributed processing with NLTK. For the second case, you'll learn how to use the Redis data structure server/database to store frequency...
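
As a rough illustration of the first case, the sketch below counts words in parallel with execnet. It is not one of the chapter's recipes: the names `REMOTE_SOURCE`, `merge_counts`, and `count_words_parallel` are hypothetical, and it assumes the third-party execnet package is installed and that one local worker process per chunk is acceptable. Each worker receives a list of words, counts them, and sends back a plain dictionary, which the parent process then merges.

```python
from collections import Counter

# Source code run inside each worker process. execnet supplies the
# `channel` object there; the name REMOTE_SOURCE is our own (hypothetical).
REMOTE_SOURCE = """
from collections import Counter
words = channel.receive()
channel.send(dict(Counter(words)))
"""


def merge_counts(partials):
    """Combine per-worker word counts (plain dicts) into one Counter."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_words_parallel(chunks):
    """Count words in each chunk in a separate local process.

    Assumes execnet is installed; it is imported here so the rest of
    the module stays usable without it.
    """
    import execnet

    gateways = [execnet.makegateway() for _ in chunks]
    try:
        channels = [gw.remote_exec(REMOTE_SOURCE) for gw in gateways]
        # Send every chunk first so the workers run concurrently.
        for channel, chunk in zip(channels, chunks):
            channel.send(list(chunk))
        partials = [channel.receive() for channel in channels]
    finally:
        for gw in gateways:
            gw.exit()
    return merge_counts(partials)
```

With execnet available, a call such as `count_words_parallel([["a", "b"], ["a"]])` would start two worker processes and merge their results. Only built-in types (lists, strings, dictionaries, integers) are sent over the channels, because execnet serializes simple data rather than arbitrary objects.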
