Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Arrow up icon
GO TO TOP
AWS Certified DevOps Engineer - Professional Certification and Beyond

You're reading from   AWS Certified DevOps Engineer - Professional Certification and Beyond Pass the DOP-C01 exam and prepare for the real world using case studies and real-life examples

Arrow left icon
Product type Paperback
Published in Nov 2021
Publisher Packt
ISBN-13 9781801074452
Length 638 pages
Edition 1st Edition
Tools
Concepts
Arrow right icon
Author (1):
Arrow left icon
Adam Book Adam Book
Author Profile Icon Adam Book
Adam Book
Arrow right icon
View More author details
Toc

Table of Contents (31) Chapters Close

Preface 1. Section 1: Establishing the Fundamentals
2. Chapter 1: Amazon Web Service Pillars FREE CHAPTER 3. Chapter 2: Fundamental AWS Services 4. Chapter 3: Identity and Access Management and Working with Secrets in AWS 5. Chapter 4: Amazon S3 Blob Storage 6. Chapter 5: Amazon DynamoDB 7. Section 2: Developing, Deploying, and Using Infrastructure as Code
8. Chapter 6: Understanding CI/CD and the SDLC 9. Chapter 7: Using CloudFormation Templates to Deploy Workloads 10. Chapter 8: Creating Workloads with CodeCommit and CodeBuild 11. Chapter 9: Deploying Workloads with CodeDeploy and CodePipeline 12. Chapter 10: Using AWS Opsworks to Manage and Deploy your Application Stack 13. Chapter 11: Using Elastic Beanstalk to Deploy your Application 14. Chapter 12: Lambda Deployments and Versioning 15. Chapter 13: Blue Green Deployments 16. Section 3: Monitoring and Logging Your Environment and Workloads
17. Chapter 14: CloudWatch and X-Ray's Role in DevOps 18. Chapter 15: CloudWatch Metrics and Amazon EventBridge 19. Chapter 16: Various Logs Generated (VPC Flow Logs, Load Balancer Logs, CloudTrail Logs) 20. Chapter 17: Advanced and Enterprise Logging Scenarios 21. Section 4: Enabling Highly Available Workloads, Fault Tolerance, and Implementing Standards and Policies
22. Chapter 18: Autoscaling and Lifecycle Hooks 23. Chapter 19: Protecting Data in Flight and at Rest 24. Chapter 20: Enforcing Standards and Compliance with System Manger's Role and AWS Config 25. Chapter 21: Using Amazon Inspector to Check your Environment 26. Chapter 22: Other Policy and Standards Services to Know 27. Section 5: Exam Tips and Tricks
28. Chapter 23: Overview of the DevOps Professional Certification Test 29. Chapter 24: Practice Exam 1 30. Other Books You May Enjoy

Reliability

There are five design principles for reliability in the cloud:

  • Automating recover from failure
  • Testing recovery procedures
  • Scaling horizontally to increase aggregate workload availability
  • Stopping guessing capacity
  • Managing changes in automation

Automating recovery from failure

When you think of automating recovery from failure, the first thing most people think of is a technology solution. However, this is not necessarily the context that is being referred to in the reliability service pillar. These points of failure really should be based on Key Performance Indicators (KPIs) set by the business.

As part of the recovery process, it's important to know both the Recovery Time Objective (RTO) and Recovery Point Objective (RPO) of the organization or workload:

  • RTO (Recovery Time Objective): The maximum acceptable delay between the service being interrupted and restored
  • RPO (Recovery Point Objective): The maximum acceptable amount of time since the last data recovery point (backup) (Amazon Web Services, 2021)

Testing recovery procedures

In your cloud environment, you should not only test your workload functions properly, but also that they can recover from single or multiple component failures if they happen on a service, Availability Zone, or regional level.

Using practices such as Infrastructure as Code, CD pipelines, and regional backups, you can quickly spin up an entirely new environment. This could include your application and infrastructure layers, which will give you the ability to test that things work the same as in the current production environment and that data is restored correctly. You can also time how long the restoration takes and work to improve it by automating the recovery time.

Taking the proactive measure of documenting each of the necessary steps in a runbook or playbook allows for knowledge sharing, as well as fewer dependencies on specific team members who built the systems and processes.

Scaling horizontally to increase workload availability

When coming from a data center environment, planning for peak capacity means finding a machine that can run all the different components of your application. Once you hit the maximum resources for that machine, you need to move to a bigger machine.

As you move from development to production or as your product or service grows in popularity, you will need to scale out your resources. There are two main methods for achieving this: scaling vertically or scaling horizontally:

Figure 1.4 – Horizontal versus vertical scaling

Figure 1.4 – Horizontal versus vertical scaling

One of the main issues with scaling vertically is that you will hit the ceiling at some point in time, moving to larger and larger instances. At some point, you will find that there is no longer a bigger instance to move up to, or that the larger instance will be too cost-prohibitive to run.

Scaling horizontally, on the other hand, allows you to gain the capacity that you need at the time in a cost-effective manner.

Moving to a cloud mindset means decoupling your application components, placing multiple groupings of the same servers behind a load balancer, or pulling from a queue and optimally scaling up and down based on the current demand.

Stop guessing capacity

If resources become overwhelmed, then they have a tendency to fail, especially on-premises, as demands spike and those resources don't have the ability to scale up or out to meet demand.

There are service limits to be aware of, though many of them are called soft limits. These can be raised with a simple request or phone call to support. There are others called hard limits. They are set a specified number for every account, and there is no raising them.

Note

Although there isn't a necessity to memorize all these limitations, it is a good idea to become familiar with them and know about some of them since they do show up in some of the test questions – not as pure questions, but as context for the scenarios.

Managing changes in automation

Although it may seem easier and sometimes quicker to make a change to the infrastructure (or application) by hand, this can lead to infrastructure drift and is not a repeatable process. A best practice is to automate all changes using Infrastructure as Code, a code versioning system, and a deployment pipeline. This way, the changes can be tracked and reviewed.

You have been reading a chapter from
AWS Certified DevOps Engineer - Professional Certification and Beyond
Published in: Nov 2021
Publisher: Packt
ISBN-13: 9781801074452
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image