Getting Started with Python for the Internet of Things: Leverage the full potential of Python to prototype and build IoT projects using the Raspberry Pi

Product type Course

Published in Feb 2019

ISBN-13 9781838555795

Length 732 pages

Edition 1st Edition

Languages

Python

Tools

Raspberry Pi

Concepts

IoT Development

Authors (5):

Tim Cox

Prof. Diwakar Vaish

Sai Yamanoor

Steven Lawrence Fernandes

Srihari Yamanoor

Table of Contents (37) Chapters

Title Page

About Packt

Contributors

Preface

1. Getting Started with a Raspberry Pi 3 Computer

2. Dividing Text Data and Building Text Classifiers

3. Using Python for Automation and Productivity

4. Predicting Sentiments in Words

5. Detecting Edges and Contours in Images

6. Building Face Detector and Face Recognition Applications

7. Using Python to Drive Hardware

8. Sensing and Displaying Real-World Data

9. Building Neural Network Modules for Optical Character Recognition

10. Arithmetic Operations, Loops, and Blinky Lights

11. Conditional Statements, Functions, and Lists

12. Communication Interfaces

13. Data Types and Object-Oriented Programming in Python

14. File I/O and Python Utilities

15. Requests and Web Frameworks

16. Awesome Things You Could Develop Using Python

17. Robotics 101

18. Using GPIOs as Input

19. Making a Gardener Robot

20. Basics of Motors

21. Bluetooth-Controlled Robotic Car

22. Sensor Interface for Obstacle Avoidance

23. Making Your Own Area Scanner

24. Basic Switching

25. Recognizing Humans with Jarvis

26. Making Jarvis IoT Enabled

27. Giving Voice to Jarvis

28. Gesture Recognition

29. Machine Learning

30. Making a Robotic Arm

1. Other Books You May Enjoy

Leave a review - let other readers know what you think

Index

Pre-processing data using tokenization

Pre-processing converts the raw text into a form that the learning algorithm can accept.

Tokenization is the process of dividing text into a set of meaningful pieces. These pieces are called tokens.
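To make the idea of tokens concrete, here is a minimal sketch that uses only the Python standard library. It is not the NLTK approach used in this recipe: the sample string and the naive splitting rules are assumptions chosen purely for illustration, and real tokenizers handle abbreviations, quotes, and similar cases far more carefully.

```python
import re

# Hypothetical sample text, chosen only for illustration.
sample = "Tokenization splits text. Each piece is a token!"

# Naive sentence tokens: split on whitespace that follows '.', '!' or '?'.
sentences = re.split(r"(?<=[.!?])\s+", sample)

# Naive word tokens: split on runs of whitespace (punctuation stays attached).
words = sample.split()

print(sentences)
print(words)
```

Note that whitespace splitting leaves punctuation attached to words (for example, "text."), which is one reason dedicated tokenizers are preferred.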

How to do it...

Import the sentence tokenizer:

from nltk.tokenize import sent_tokenize

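Tokenize the text into sentences:

# Sample input (an assumption; the excerpt does not show how `text` is defined)
text = "Are you curious about tokenization? Let's see how it works!"
# sent_tokenize needs NLTK's Punkt tokenizer data to be installed
tokenize_list_sent = sent_tokenize(text)
print("\nSentence tokenizer:")
print(tokenize_list_sent)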

Tokenize the text into words:

from nltk.tokenize import word_tokenize
print("\nWord tokenizer:")
print(word_tokenize(text))

Create a WordPunct tokenizer, which splits punctuation into separate tokens:

from nltk.tokenize import WordPunctTokenizer
word_punct_tokenizer = WordPunctTokenizer()
print("\nWord punct tokenizer:")
print(word_punct_tokenizer.tokenize(text))

Each tokenizer returns a list of tokens: the sentence tokenizer divides the text into sentences, while the word and WordPunct tokenizers divide it into words and punctuation marks.
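As a rough illustration of how punctuation ends up in separate tokens, the following standard-library sketch approximates the WordPunct step. It assumes that tokenizer behaves like the regular expression `\w+|[^\w\s]+` (runs of word characters, or runs of punctuation). The helper name `word_punct_tokens` and the sample sentence are hypothetical and are not part of NLTK or the original recipe.

```python
import re

# Pattern assumed to mirror WordPunctTokenizer: runs of word characters,
# or runs of characters that are neither word characters nor whitespace.
WORD_PUNCT_PATTERN = re.compile(r"\w+|[^\w\s]+")


def word_punct_tokens(text):
    """Return word runs and punctuation runs as separate tokens."""
    return WORD_PUNCT_PATTERN.findall(text)


# Hypothetical sample sentence, chosen only for illustration.
sample = "Let's see how it works!"
tokens = word_punct_tokens(sample)
print(tokens)
```

Under this pattern the contraction "Let's" is split into three tokens, because the apostrophe is matched as its own punctuation run.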
