You're reading from Apache Spark Deep Learning Cookbook: Over 80 best practice recipes for the distributed training and deployment of neural networks using Keras and TensorFlow

Product type Paperback

Published in Jul 2018

Publisher Packt

ISBN-13 9781788474221

Length 474 pages

Edition 1st Edition

Languages

Scala

Tools

Apache Spark

Concepts

Deep Learning

Authors (2):

Ahmed Sherif

Amrith Ravindra

Table of Contents (15) Chapters

Preface

1. Setting Up Spark for Deep Learning Development FREE CHAPTER

2. Creating a Neural Network in Spark

3. Pain Points of Convolutional Neural Networks

4. Pain Points of Recurrent Neural Networks

5. Predicting Fire Department Calls with Spark ML

6. Using LSTMs in Generative Networks

7. Natural Language Processing with TF-IDF

8. Real Estate Value Prediction Using XGBoost

9. Predicting Apple Stock Market Cost with LSTM

10. Face Recognition Using Deep Convolutional Networks

11. Creating and Visualizing Word Vectors Using Word2Vec

12. Creating a Movie Recommendation Engine with Keras

13. Image Classification with TensorFlow on Spark

14. Other Books You May Enjoy

Preparing and cleansing data

This section covers the data preparation and text preprocessing steps required before the text is fed into the model as input. How we prepare the data depends on how we intend to model it, which in turn depends on how we intend to use it.
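As a rough illustration of what text preprocessing can look like, here is a minimal sketch. The specific steps (lowercasing, replacing punctuation with spaces, keeping only alphabetic tokens) and the function name `clean_text` are assumptions made for this example; the source does not state which steps the recipe actually uses.

```python
import string
from typing import List


def clean_text(raw: str) -> List[str]:
    """Turn raw text into a list of lowercase word tokens (hypothetical helper)."""
    # Map every punctuation character to a space so that "quick,brown"
    # does not fuse into a single token.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = raw.lower().translate(table).split()
    # Drop anything that is not purely alphabetic (numbers, stray symbols).
    return [t for t in tokens if t.isalpha()]


tokens = clean_text("The quick, brown fox -- it jumped!")
print(tokens)
```

Which steps are appropriate depends on the modeling goal: a model meant to reproduce punctuation or casing, for instance, would need to keep what this sketch throws away.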

Getting ready

The language model is statistical: it predicts the probability of each word given an input sequence of text. Each predicted word is then fed back into the model as input to generate the next word.
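The predict-then-feed-back loop can be sketched with a toy model. The example below uses simple bigram counts as a stand-in for the real model, purely to show the mechanics; the function names and the tiny corpus are assumptions for illustration, not the recipe's implementation.

```python
from collections import Counter, defaultdict


def fit_bigram_counts(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, word):
    """Estimate P(next word | current word) from the counts."""
    total = sum(counts[word].values())
    if total == 0:
        return {}
    return {w: c / total for w, c in counts[word].items()}


def generate(counts, seed, n_words):
    """Generate text by repeatedly feeding the predicted word back in."""
    out = [seed]
    for _ in range(n_words):
        probs = next_word_probs(counts, out[-1])
        if not probs:
            break  # no known continuation for this word
        # Greedy choice: take the most probable next word.
        out.append(max(probs, key=probs.get))
    return out


corpus = "the cat sat on the mat the cat sat".split()
counts = fit_bigram_counts(corpus)
print(next_word_probs(counts, "the"))
print(generate(counts, "the", 3))
```

A bigram model conditions on a single previous word; a model that conditions on a longer input sequence follows the same loop, only with a wider context passed in at each step.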

A key decision is how long the input sequences should be. They need to be long enough for the model to learn the context of the words it has to predict. This input length will also define the...
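To make the role of the input length concrete, the following sketch slides a fixed-size window over a token list and pairs each window with the word that follows it. The helper name `make_sequences` and the parameter `input_len` are hypothetical, and the windowing scheme is one common approach rather than something the source confirms.

```python
def make_sequences(tokens, input_len):
    """Split tokens into (input window, next word) training pairs.

    Each input window holds `input_len` consecutive tokens; the target is
    the token that immediately follows the window.
    """
    pairs = []
    for i in range(len(tokens) - input_len):
        window = tokens[i:i + input_len]
        target = tokens[i + input_len]
        pairs.append((window, target))
    return pairs


tokens = ["a", "b", "c", "d", "e"]
for window, target in make_sequences(tokens, input_len=3):
    print(window, "->", target)
```

A larger `input_len` gives the model more context per example but yields fewer training pairs from the same text.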

Authors (2)

Ahmed Sherif

Ahmed Sherif is a data scientist who has worked with data in various roles since 2005. He started off with BI solutions and transitioned to data science in 2013. In 2016, he obtained a master's in Predictive Analytics from Northwestern University, where he studied the science and application of machine learning and predictive modeling using both Python and R. Lately, he has been developing machine learning and deep learning solutions on the cloud using Azure. In 2016, he published his first book, Practical Business Intelligence. He currently works as a Technology Solution Professional in Data and AI for Microsoft.

Amrith Ravindra

Amrith Ravindra is a machine learning enthusiast who holds degrees in electrical and industrial engineering. While pursuing his master's, he dove deeper into the world of machine learning and developed a love for data science. Graduate-level courses in engineering gave him the mathematical background to launch himself into a career in machine learning. He met Ahmed Sherif at a local data science meetup in Tampa. They decided to put their brains together to write a book on their favorite machine learning algorithms. He hopes this book will help him achieve his ultimate goal of becoming a data scientist and actively contributing to machine learning.
