What this book covers
Chapter 1, Understanding Resilience Concepts, introduces the concept of resilience by drawing parallels with the aviation industry. It covers achieving resilient architecture using AWS infrastructure, fault-tolerant design best practices, and the shared responsibility model. You’ll learn about potential failure points and understand why maintaining resilience is an ongoing process crucial for a robust infrastructure.
Chapter 2, Implementing Resilient Compute and Auto Scaling, covers resilient compute and auto scaling solutions on AWS, focusing on failure-resistant system design, redundancy, and fault tolerance. It explores AWS Auto Scaling, cost-saving strategies, and the importance of monitoring. Key topics include multi-Availability Zone deployments, stateless architectures, and extending resilience to containers and serverless architectures.
Chapter 3, Securing and Backing Up Critical Data, covers data security and resilience strategies on AWS. It explores access control, layered backup strategies, multi-Region models for improved availability, automated recovery mechanisms, and disaster recovery best practices. You’ll learn how to design highly resilient and available systems using various AWS services.
Chapter 4, Orchestrating Graceful Degradation, explores the design principle of graceful degradation, explaining why it is critical for preventing catastrophic failures in your systems. You will also learn about different strategies to contain outages and streamline the recovery process.
Chapter 5, Exploring the AWS Shared Responsibility Model, introduces how responsibilities are shared between AWS and customers and the roles each party plays in designing and operating a resilient infrastructure.
Chapter 6, Learning AWS Well-Architected Principles for Resiliency, explores the critical pillars of the AWS Well-Architected Framework through the lens of building resilient cloud solutions. This includes the operational excellence, reliability, security, performance efficiency, and cost optimization pillars. You’ll learn how to use AWS services to reduce heavy lifting, automate deployments, improve operational procedures, and secure your applications.
Chapter 7, Architecting Fault-Tolerant Applications, discusses architectural patterns and best practices for building fault-tolerant, highly available applications on AWS. You will learn about redundancy, loose coupling, graceful degradation, and fault isolation, and why choosing the right architecture matters for getting the most out of what AWS offers.
Chapter 8, Resiliency Considerations for Serverless Applications, explains the advantages of serverless applications, strategies for building them, and how they improve resilience. You will learn about idempotency, asynchronous transactions, error handling, and testing and deployment strategies.
Chapter 9, Using Containers to Improve Resiliency, focuses on how container-based applications can improve resilience. In this chapter, you will learn how to build, deploy, and operate containers using AWS services. You will gain an understanding of immutable deployments, scaling and security for containers, and the considerations that set containers apart from traditional virtual machines.
Chapter 10, Resilient Architectures Across Regions, gives insights into running applications across multiple AWS Regions. It covers active/passive, active/active, and cell-based architectures, with a focus on high availability. You will learn about the pros and cons of each deployment model and what to consider for multi-Region deployments.
Chapter 11, Examples of Resilient Architecture, delves into architectural patterns for building resilient systems, using practical examples and real-world scenarios to show how fault-tolerant designs ensure reliability and availability.
Chapter 12, Observability, Auditing, and Continuous Improvement, focuses on designing for observability, covering essential monitoring and auditing techniques to proactively identify issues and maintain system health for resilient applications.
Chapter 13, Performing Chaos Engineering Testing, explores chaos engineering principles, introducing controlled fault injection tests to proactively identify vulnerabilities and validate resilience mechanisms, enabling more robust system development.
Chapter 14, Disaster Recovery Planning and Testing, focuses on Disaster Recovery Planning (DRP). It outlines procedures for crafting an effective plan and incorporating testing strategies to identify vulnerabilities, so that organizations can recover quickly and maintain business continuity during disruptive events.
Chapter 15, Finalize Building Resilient Architecture Using AWS Resilience Services, delves into AWS Cloud Resilience, exploring tools and capabilities for fault injection, chaos engineering, disaster recovery, and backup solutions to build highly reliable and available applications on the AWS cloud.