Production-Ready Applied Deep Learning

Learn how to construct and deploy complex models in PyTorch and TensorFlow deep learning frameworks

Product type: Paperback
Published: Aug 2022
Publisher: Packt
ISBN-13: 9781803243665
Length: 322 pages
Edition: 1st Edition
Authors (3):
Lenin Mookiah
Tomasz Palczewski
Jaejun (Brandon) Lee

Table of Contents

  • Preface
  • Part 1 – Building a Minimum Viable Product
  • Chapter 1: Effective Planning of Deep Learning-Driven Projects
  • Chapter 2: Data Preparation for Deep Learning Projects
  • Chapter 3: Developing a Powerful Deep Learning Model
  • Chapter 4: Experiment Tracking, Model Management, and Dataset Versioning
  • Part 2 – Building a Fully Featured Product
  • Chapter 5: Data Preparation in the Cloud
  • Chapter 6: Efficient Model Training
  • Chapter 7: Revealing the Secret of Deep Learning Models
  • Part 3 – Deployment and Maintenance
  • Chapter 8: Simplifying Deep Learning Model Deployment
  • Chapter 9: Scaling a Deep Learning Pipeline
  • Chapter 10: Improving Inference Efficiency
  • Chapter 11: Deep Learning on Mobile Devices
  • Chapter 12: Monitoring Deep Learning Endpoints in Production
  • Chapter 13: Reviewing the Completed Deep Learning Project
  • Index
  • Other Books You May Enjoy

Data Preparation in the Cloud

In this chapter, we will learn how to set up data preparation in the cloud using various AWS services. Because extract, transform, and load (ETL) operations are central to data preparation, we will take a closer look at setting up and scheduling ETL jobs in a cost-efficient manner. We will cover four setups: ETL on a single-node EC2 instance, ETL on an EMR cluster, and ETL jobs built with Glue and with SageMaker. This chapter will also introduce Apache Spark, one of the most widely used frameworks for ETL. By the end of this chapter, you will be able to weigh the advantages of each setup and select the right set of tools for your project.
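
To make the three ETL stages concrete before any cloud service is involved, here is a minimal sketch in plain Python. It is not taken from the book: the `id` and `label` columns, the sample data, and the function names are all hypothetical, and the "load" step only serializes to JSON Lines as a stand-in for writing to real storage.

```python
import csv
import io
import json


def extract(csv_text):
    """Extract: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: drop rows with an empty label and normalize the fields."""
    cleaned = []
    for row in rows:
        if not row["label"]:
            continue  # skip incomplete records
        cleaned.append({
            "id": int(row["id"]),                   # cast the id to an integer
            "label": row["label"].strip().lower(),  # normalize the label text
        })
    return cleaned


def load(records):
    """Load: serialize records as JSON Lines (a stand-in for a storage write)."""
    return "\n".join(json.dumps(record) for record in records)


# Hypothetical sample input: the second row has no label and is dropped.
RAW = "id,label\n1,Cat\n2,\n3, Dog \n"

output = load(transform(extract(RAW)))
print(output)
```

The same extract, transform, and load split carries over to larger setups; what changes is where each stage runs and how the data is partitioned across machines.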

In this chapter, we’re going to cover the following main topics:

  • Data processing in the cloud
  • Introduction to Apache Spark
  • Setting up a single-node EC2 instance for ETL
  • Setting up an EMR cluster for ETL
  • Creating a Glue job...