You're reading from Building Machine Learning Systems with Python Explore machine learning and deep learning techniques for building intelligent systems using scikit-learn and TensorFlow

Product type Paperback

Published in Jul 2018

Publisher

ISBN-13 9781788623223

Length 406 pages

Edition 3rd Edition

Languages Python

Tools Scikit-learn

Concepts Deep Learning

Authors (3):

Luis Pedro Coelho

Willi Richert

Matthieu Brucher

Table of Contents (17 chapters)

Preface

1. Getting Started with Python Machine Learning

2. Classifying with Real-World Examples

3. Regression

4. Classification I – Detecting Poor Answers

5. Dimensionality Reduction

6. Clustering – Finding Related Posts

7. Recommendations

8. Artificial Neural Networks and Deep Learning

9. Classification II – Sentiment Analysis

10. Topic Modeling

11. Classification III – Music Genre Classification

12. Computer Vision

13. Reinforcement Learning

14. Bigger Data

15. Where to Learn More About Machine Learning

16. Other Books You May Enjoy

Solving our initial challenge

We will now put everything together and demonstrate our system for the following new post that we assign to the new_post variable:

new_post = '''
Disk drive problems. Hi, I have a problem with my hard disk.
After 1 year it is working only sporadically now.
I tried to format it, but now it doesn't boot any more.
Any ideas? Thanks. '''

As you learned earlier, you will first have to vectorize this post before you predict its label:

>>> new_post_vec = vectorizer.transform([new_post])
>>> new_post_label = km.predict(new_post_vec)[0]
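
To make the two calls above concrete, here is a minimal, self-contained sketch. It is not the chapter's actual setup: the toy corpus is hypothetical, and a plain `TfidfVectorizer` and a two-cluster `KMeans` are assumed as stand-ins for the `vectorizer` and `km` objects built earlier in the chapter.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus standing in for the chapter's posts.
toy_posts = [
    "My hard disk drive fails to boot after format",
    "Hard disk problems: the drive is not detected at boot",
    "How to format a hard disk drive",
    "Best recipe for chocolate cake with butter",
    "Chocolate cake recipe needs butter and sugar",
    "Sugar free cake recipe with butter",
]

# Assumption: a plain TfidfVectorizer replaces the chapter's vectorizer.
vectorizer = TfidfVectorizer(stop_words="english")
toy_vectors = vectorizer.fit_transform(toy_posts)

# Assumption: two clusters are enough for this toy corpus.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
km.fit(toy_vectors)

new_post = """
Disk drive problems. Hi, I have a problem with my hard disk.
After 1 year it is working only sporadically now.
I tried to format it, but now it doesn't boot any more.
Any ideas? Thanks. """

# transform() expects an iterable of documents, hence the one-element list.
new_post_vec = vectorizer.transform([new_post])

# predict() returns one label per input row; take the only one.
new_post_label = km.predict(new_post_vec)[0]

print("cluster of new post:", new_post_label)
print("clusters of toy posts:", km.labels_)
```

Note that the new post is passed through `transform`, not `fit_transform`, so it is mapped into the vocabulary learned from the training posts and words unseen during fitting are ignored.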

Now that the posts are clustered, we no longer need to compare new_post_vec with every post vector. Instead, we can restrict the comparison to the posts in the same cluster. Let's fetch their indices in the original dataset:

>>> similar_indices = (km.labels_ == new_post_label...
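
The line above is cut off in this excerpt, so its exact form is not reproduced here. As a sketch of the general idea only, the following shows one common way to turn a comparison against cluster labels into integer indices with NumPy. The `labels` array and the `posts` list are hypothetical stand-ins for `km.labels_` and the original posts, and `np.nonzero` is an assumed choice, not necessarily what the book uses.

```python
import numpy as np

# Hypothetical cluster labels, one per post, shaped like KMeans.labels_.
labels = np.array([2, 0, 1, 0, 2, 0])
# Hypothetical posts, in the same order as the labels.
posts = ["post A", "post B", "post C", "post D", "post E", "post F"]
# Hypothetical label predicted for the new post.
new_post_label = 0

# Comparing an array with a scalar yields a boolean mask of the same length.
same_cluster_mask = labels == new_post_label

# np.nonzero returns a tuple with one index array per dimension;
# for a 1-D mask the first entry holds the positions that are True.
similar_indices = np.nonzero(same_cluster_mask)[0]

# The indices refer back to positions in the original list of posts.
candidate_posts = [posts[i] for i in similar_indices]

print(similar_indices)
print(candidate_posts)
```

Only the posts at these positions then need to be compared with the new post, which is what makes the clustering step pay off on a large collection.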

Authors (3)

Willi Richert

Willi Richert has been a Software Development Engineer at Microsoft for the last 5 years. He has also worked as a project team leader for mobile devices (Java/Python) at Richert GbR.

Luis Pedro Coelho

Luis Pedro Coelho is a computational biologist who analyzes DNA from microbial communities to characterize their behavior. He has also worked extensively in bioimage informatics, the application of machine learning techniques to the analysis of images of biological specimens. His main focus is on the processing and integration of large-scale datasets. He has a PhD from Carnegie Mellon University and has authored several scientific publications. In 2004, he began developing in Python and has contributed to several open source libraries. He is currently a faculty member at Fudan University in Shanghai.

Matthieu Brucher

Matthieu Brucher is a computer scientist who specializes in high-performance computing and computational modeling and currently works for JPMorgan in their quantitative research branch. He is also the lead developer of Audio ToolKit, a library for real-time audio signal processing. He has a PhD in machine learning and signal processing from the University of Strasbourg, two Master of Science degrees, one in digital electronics and signal processing and another in automation, from the University of Paris XI and Supelec, as well as a Master of Music degree from Bath Spa University.
