You're reading from Mastering Machine Learning on AWS: Advanced machine learning in Python using SageMaker, Apache Spark, and TensorFlow

Product type Paperback

Published in May 2019

Publisher Packt

ISBN-13 9781789349795

Length 306 pages

Edition 1st Edition

Languages

Python

Tools

Apache Spark

Concepts

Machine Learning

Authors (2):

Maximo Gurmendez

Dr. Saket S.R. Mengle

Table of Contents (24) Chapters

Preface

1. Section 1: Machine Learning on AWS

2. Getting Started with Machine Learning for AWS

3. Section 2: Implementing Machine Learning Algorithms at Scale on AWS

4. Classifying Twitter Feeds with Naive Bayes

5. Predicting House Value with Regression Algorithms

6. Predicting User Behavior with Tree-Based Methods

7. Customer Segmentation Using Clustering Algorithms

8. Analyzing Visitor Patterns to Make Recommendations

9. Section 3: Deep Learning

10. Implementing Deep Learning Algorithms

11. Implementing Deep Learning with TensorFlow on AWS

12. Image Classification and Detection with SageMaker

13. Section 4: Integrating Ready-Made AWS Machine Learning Services

14. Working with AWS Comprehend

15. Using AWS Rekognition

16. Building Conversational Interfaces Using AWS Lex

17. Section 5: Optimizing and Deploying Models through AWS

18. Creating Clusters on AWS

19. Optimizing Models in Spark and SageMaker

20. Tuning Clusters for Machine Learning

21. Deploying Models Built in AWS

22. Other Books You May Enjoy

Appendix: Getting Started with AWS

Tuning Clusters for Machine Learning

Many data scientists and machine learning (ML) practitioners run into problems of scale when they try to run ML data pipelines over big data. This chapter focuses primarily on Elastic MapReduce (EMR), a powerful tool for running very large ML jobs. EMR can be configured in many ways, and not every setup suits every scenario, so we will outline the main EMR configurations and explain which objectives each one serves. We will also present AWS Glue as a tool for cataloging the results of our big data pipelines.

In this chapter, we will cover the following topics:

Introduction to the EMR architecture
Tuning EMR for different applications
Managing data pipelines with Glue
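To make the idea of "different configurations for different objectives" concrete, here is a minimal sketch that builds two EMR cluster requests as plain Python dictionaries. It is not taken from the book. The helper name `build_emr_cluster_request`, the cluster names, the instance type, the release label, and the node counts are all illustrative assumptions; the dictionary keys follow the request shape accepted by boto3's EMR `run_job_flow` call. Nothing is sent to AWS.

```python
import json


def build_emr_cluster_request(name, core_nodes, instance_type="m5.xlarge",
                              use_spot_core=False):
    """Return keyword arguments shaped for boto3's EMR run_job_flow call.

    Hypothetical helper: every value below is an illustrative assumption,
    not a recommendation from the chapter.
    """
    return {
        "Name": name,
        "ReleaseLabel": "emr-5.23.0",  # assumed release label
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "master",
                    "InstanceRole": "MASTER",
                    "InstanceType": instance_type,
                    "InstanceCount": 1,
                    "Market": "ON_DEMAND",
                },
                {
                    "Name": "core",
                    "InstanceRole": "CORE",
                    "InstanceType": instance_type,
                    "InstanceCount": core_nodes,
                    # Spot capacity trades reliability for lower cost.
                    "Market": "SPOT" if use_spot_core else "ON_DEMAND",
                },
            ],
            # Terminate the cluster once its steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


# Two hypothetical objectives: a large on-demand batch job and a
# smaller, cost-sensitive job that tolerates Spot interruptions.
batch_request = build_emr_cluster_request("large-ml-batch", core_nodes=20)
cheap_request = build_emr_cluster_request("cheap-experiment", core_nodes=4,
                                          use_spot_core=True)

print(json.dumps(cheap_request["Instances"]["InstanceGroups"][1], indent=2))
```

With AWS credentials and the default EMR roles in place, a request like this could be passed as `boto3.client("emr").run_job_flow(**batch_request)`. Keeping the request as data makes it easy to compare setups side by side before launching anything.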

Authors (2)

Dr. Saket S.R. Mengle

Dr. Saket S.R. Mengle holds a PhD in text mining from Illinois Institute of Technology, Chicago. He has worked in a variety of fields, including text classification, information retrieval, large-scale machine learning, and linear optimization. He currently works as senior principal data scientist at dataxu, where he is responsible for developing and maintaining the algorithms that drive dataxu's real-time advertising platform.

Maximo Gurmendez

Maximo Gurmendez holds a master's degree in computer science/AI from Northeastern University, where he attended as a Fulbright Scholar. Since 2009, he has been working with dataxu as data science engineering lead. He's also the founder of Montevideo Labs (a data science and engineering consultancy). Additionally, Maximo is a computer science professor at the University of Montevideo and is director of its data science for business program.