In this chapter, we present the design considerations for big data applications built with AWS services. More specifically, we explore AWS services and platforms such as Kinesis, EMR, Apache Spark, SageMaker, and Glue, which are often key components of such applications. Our focus is on best practices for using these services in big data workloads such as machine learning and streaming analytics. Finally, in the hands-on exercise, we will create EMR-Spark clusters.
In this chapter, you will learn about the following:
- Characteristics of a big data application
- Analyzing streaming data with Amazon Kinesis
- Best practices for building serverless big data applications
- Best practices for distributed machine learning and predictive analytics on AWS
- Using Amazon SageMaker for machine learning applications
- Best...