Introducing web scraping
Throughout the book, we repeatedly see data’s value in creating intelligent systems. None of the discussions presented so far would make any sense without it. For instance, we incorporated publicly available corpora and built-in datasets from Python libraries in various case studies. In reality, however, suitable corpora are rarely available for free, and it’s the data scientist’s responsibility to harvest them. The World Wide Web (WWW) is a goldmine where we can find or augment our datasets using web scraping, the process of collecting and parsing raw data from the web. Afterward, the data is converted into an appropriate format for the subsequent analysis.
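To make the two steps concrete, here is a minimal sketch of the parsing half of web scraping, using only Python’s standard library `html.parser` (real projects typically rely on third-party libraries such as Beautiful Soup). The sample HTML string stands in for a page already fetched over HTTP and is invented for illustration.

```python
from html.parser import HTMLParser


class TitleScraper(HTMLParser):
    """Collect the text of every <h2> heading in a page."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        # Keep only text that appears inside an <h2> element.
        if self.in_h2:
            self.titles.append(data.strip())


# Hypothetical raw HTML, as it might be returned by an HTTP request.
raw = (
    "<html><body>"
    "<h2>First article</h2><p>Body text</p>"
    "<h2>Second article</h2>"
    "</body></html>"
)

scraper = TitleScraper()
scraper.feed(raw)
print(scraper.titles)  # ['First article', 'Second article']
```

The raw markup is thus converted into a structured Python list, ready for whatever analysis follows.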
For this task to succeed, web crawlers are used to retrieve the requested content. These are also known as spiders because they crawl all over the web, just as real spiders crawl over their webs. The specific processing is performed...