You're reading from Python Machine Learning Cookbook Over 100 recipes to progress from smart data analytics to deep learning using real-world datasets

Product type Paperback

Published in Mar 2019

Publisher Packt

ISBN-13 9781789808452

Length 642 pages

Edition 2nd Edition

Languages

Python

Tools

Pandas

Concepts

Deep Learning

Authors (2):

Giuseppe Ciaburro

Prateek Joshi

Table of Contents (18) Chapters

Preface

1. The Realm of Supervised Learning

2. Constructing a Classifier

3. Predictive Modeling

4. Clustering with Unsupervised Learning

5. Visualizing Data

6. Building Recommendation Engines

7. Analyzing Text Data

8. Speech Recognition

9. Dissecting Time Series and Sequential Data

10. Analyzing Image Content

11. Biometric Face Recognition

12. Reinforcement Learning Techniques

13. Deep Neural Networks

14. Unsupervised Representation Learning

15. Automated Machine Learning and Transfer Learning

16. Unlocking Production Issues

17. Other Books You May Enjoy

Building a bag-of-words model

Text documents can contain millions of words, and machine learning algorithms cannot work on raw text: they need numerical data to analyze before they can output meaningful information. The bag-of-words approach provides that conversion. The model learns a vocabulary from all of the words in all of the documents, then represents each document as a histogram of how often each vocabulary word occurs in it. Word order is discarded; only the counts are kept.
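The idea can be illustrated with a minimal sketch that uses only the Python standard library. This is not the recipe's own code, which relies on scikit-learn; the function names `build_vocabulary` and `document_term_matrix`, the whitespace tokenization, and the two sample sentences are all assumptions made for illustration.

```python
from collections import Counter


def build_vocabulary(documents):
    """Collect every distinct lowercase token across all documents, sorted."""
    vocabulary = set()
    for doc in documents:
        # Assumption: naive whitespace tokenization, no punctuation handling.
        vocabulary.update(doc.lower().split())
    return sorted(vocabulary)


def document_term_matrix(documents, vocabulary):
    """Return one row per document holding the count of each vocabulary word."""
    rows = []
    for doc in documents:
        counts = Counter(doc.lower().split())
        # Counter returns 0 for words that do not occur in this document.
        rows.append([counts[word] for word in vocabulary])
    return rows


# Hypothetical sample documents, not taken from the book.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
]

vocab = build_vocabulary(docs)
matrix = document_term_matrix(docs, vocab)

print(vocab)
for row in matrix:
    print(row)
```

Each row of `matrix` is the histogram for one document, and each column corresponds to one vocabulary word, so the rows can be fed to any algorithm that expects fixed-length numerical vectors. A real pipeline would typically use a more careful tokenizer and a sparse matrix, since most counts are zero when the vocabulary is large.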

Getting ready

In this recipe, we will build a bag-of-words model to extract a document term matrix, using the sklearn.feature_extraction.text...

Authors (2)

Giuseppe Ciaburro

Giuseppe Ciaburro holds a PhD and two master's degrees. He works at the Built Environment Control Laboratory, Università degli Studi della Campania "Luigi Vanvitelli". He has over 25 years of programming experience, first in the field of combustion and then in acoustics and noise control. His core programming knowledge is in MATLAB, Python, and R. As an expert in AI applications to acoustics and noise control problems, Giuseppe has wide experience in research and teaching. He has several publications to his credit, including monographs, articles in scientific journals, and papers at thematic conferences. He was included in Stanford University's list of the world's top 2% scientists (2022).

Prateek Joshi

Prateek Joshi is the founder of Plutoshift and a published author of 9 books on Artificial Intelligence. He has been featured on Forbes 30 Under 30, NBC, Bloomberg, CNBC, TechCrunch, and The Business Journals. He has been an invited speaker at conferences such as TEDx, Global Big Data Conference, Machine Learning Developers Conference, and Silicon Valley Deep Learning. Apart from Artificial Intelligence, some of the topics that excite him are number theory, cryptography, and quantum computing. His greater goal is to make Artificial Intelligence accessible to everyone so that it can impact billions of people around the world.