We'll use CloudFormation extensively throughout this book, so it's important that you understand what it is and how it fits into the AWS ecosystem. There should easily be enough information here to get you started, but where necessary, we'll refer you to AWS' own documentation.
CloudFormation
What is CloudFormation?
The CloudFormation service allows you to provision and manage a collection of AWS resources in an automated and repeatable fashion. In AWS terminology, these collections are referred to as stacks. Note, however, that a stack can be as large or as small as you like. It might consist of a single S3 bucket, or it might contain everything needed to host your three-tier web app.
In this chapter, we'll show you how to define the resources to be included in your CloudFormation stack. We'll talk a bit more about the composition of these stacks and why and when it's preferable to divvy up resources between a number of stacks. Finally, we'll share a few of the tips and tricks we've learned over years of building countless CloudFormation stacks.
Pretty much everyone incurs at least one or two flesh wounds along their journey with CloudFormation. It is all very much worth it, though.
Why is CloudFormation important?
By now, the benefits of automation should be starting to become apparent to you. But don't fall into the trap of thinking CloudFormation will be useful only for large collections of resources. Even performing the simplest task of, say, creating an S3 bucket can get very repetitive if you need to do it in every region.
We work with a lot of customers who have very tight controls and governance around their infrastructure, and especially in the network layer (think VPCs, NACLs, and security groups). Being able to express one's cloud footprint in YAML (or JSON), store it in a source code repository, and funnel it through a high-visibility pipeline gives these customers confidence that their infrastructure changes are peer-reviewed and will work as expected in production. Discipline and commitment to IaC SDLC practices are of course a big factor in this, but CloudFormation helps bring us out of the era of following 20-page run-sheets for manual changes, navigating untracked or unexplained configuration drift, and unexpected downtime caused by fat fingers.
The layer cake
Now is a good time to start thinking about your AWS deployments in terms of layers. Your layers will sit atop one another, and you will have well-defined relationships between them.
Here's a bottom-up example of how your layer cake might look:
- VPC with CloudTrail
- Subnets, routes, and NACLs
- NAT gateways, VPN or bastion hosts, and associated security groups
- App stack 1: security groups, S3 buckets
- App stack 1: cross-zone RDS and read replica
- App stack 1: app and web server auto scaling groups and ELBs
- App stack 1: CloudFront and WAF config
In this example, you may have many occurrences of the app stack layers inside your VPC, assuming you have enough IP addresses in your subnets! This is often the case with VPCs living inside development environments. So immediately, you have the benefit of multi-tenancy capability with application isolation.
One advantage of this approach is that while you are developing your CloudFormation template, if you mess up the configuration of your app server, you don't have to wind back all the work CFN did on your behalf. You can just turf that particular layer (and the layers that depend on it) and restart from there. This is not the case if you have everything contained in a single template.
We commonly work with customers for whom ownership and management of each layer in the cake reflects the structure of the technology divisions within a company. The traditional infrastructure, network, and cyber security folk are often really interested in creating a safe place for digital teams to deploy their apps, so they like to heavily govern the foundational layers of the cake. Conway's Law, coined by Melvin Conway, starts to come into play here: organizations tend to design systems that mirror their own communication structures.
Finally, even if you are a single-person infrastructure coder working in a small team, you will benefit from this approach. For example, you'll find that it dramatically reduces your exposure to things such as AWS limits, timeouts, and circular dependencies.
CloudFormation templates
This is where we start to get our hands dirty. CloudFormation template files are the codified representation of your stack, expressed in either YAML or JSON. When you wish to create a CloudFormation stack, you push this template file to CloudFormation, through its API, web console, command line tools, or some other method (such as the SDK).
Templates can be replayed over and over again by CloudFormation, creating many instances of your stack.
YAML versus JSON
Up until recently, JSON was your only option. We'll actually encourage you to adopt YAML, and we'll be using it for all of the examples shown in this book. Some of the reasons are as follows:
- It's just nicer to look at. It's less syntax heavy, and should you choose to go down the path of generating your CloudFormation templates, pretty much every language has a YAML library of some kind.
- The size of your templates will be much smaller. This is more practical from a developer's point of view, but it also means you're less likely to run into the CloudFormation size limit on template files (50 KB).
- The string-substitution features are easier to use and interpret.
- Your EC2 UserData (the script that runs when your EC2 instance boots) will be much easier to implement and maintain.
A closer look at CloudFormation templates
CloudFormation templates consist of a number of parts, but these are the four we're going to concentrate on:
- Parameters
- Resources
- Outputs
- Mappings
Here's a short YAML example:
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  EC2KeyName:
    Type: String
    Description: EC2 Key Pair to launch with
Mappings:
  RegionMap:
    us-east-1:
      AMIID: ami-9be6f38c
    ap-southeast-2:
      AMIID: ami-28cff44b
Resources:
  ExampleEC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.nano
      UserData:
        Fn::Base64:
          Fn::Sub: |
            #!/bin/bash -ex
            /opt/aws/bin/cfn-signal '${ExampleWaitHandle}'
      ImageId:
        Fn::FindInMap: [ RegionMap, { Ref: 'AWS::Region' }, AMIID ]
      KeyName:
        Ref: EC2KeyName
  ExampleWaitHandle:
    Type: AWS::CloudFormation::WaitConditionHandle
  ExampleWaitCondition:
    Type: AWS::CloudFormation::WaitCondition
    DependsOn: ExampleEC2Instance
    Properties:
      Handle:
        Ref: ExampleWaitHandle
      Timeout: 600
Outputs:
  ExampleOutput:
    Value:
      Fn::GetAtt: [ ExampleWaitCondition, Data ]
    Description: The data signaled with the WaitCondition
Parameters
CloudFormation parameters are the input values you define when creating or updating your stack, similar to how you provide parameters to any command-line tools you might use. They allow you to customize your stack without making changes to your template. Common examples of what parameters might be used for are as follows:
- EC2 AMI ID: You may wish to redeploy your stack with a new AMI that has the latest security patches installed.
- Subnet IDs: You could have a list of subnets that an auto scaling group should deploy servers in. These subnet IDs will be different between your dev, test, and production environments.
- Endpoint targets and credentials: These include things such as API hostnames, usernames, and passwords.
You'll find that there are a number of parameter types. In brief, they are:
- String
- Number
- List<Number>
- CommaDelimitedList
In addition to these, AWS provides some AWS-specific parameter types. These can be particularly handy when you are executing your template via the CloudFormation web console. For example, a parameter type of AWS::EC2::AvailabilityZone::Name will cause the web console to display a drop-down list of valid Availability Zones for this parameter. In the ap-southeast-2 region, the list would look like this:
- ap-southeast-2a
- ap-southeast-2b
- ap-southeast-2c
The list of AWS-specific parameter types is steadily growing and is large enough that we can't list them here. We'll use many of them throughout this book, however, and they can easily be found in the AWS CloudFormation documentation.
When creating or updating a stack, you will need to provide values for all the parameters you've defined in your template. Where it makes sense, you can define default values for a parameter. For example, you might have a parameter called debug that tells your application to run in debug mode. You typically don't want this mode enabled by default, so you can set the default value for this parameter to false, disabled, or something else your application understands. Of course, this value can be overridden when creating or updating your stack.
You can and should provide a short, meaningful description for each parameter. These are displayed in the web console next to each parameter field. When used properly, they provide hints and context to whoever is trying to run your CloudFormation template.
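To make the points above concrete, here is a small sketch of a Parameters section. It is a fragment rather than a complete template, and the parameter names (Debug, AppAvailabilityZone, and AppSubnetIds) are made up for illustration:

```yaml
Parameters:
  # Plain string with a default, so callers only set it when they need to.
  Debug:
    Type: String
    Default: 'false'
    AllowedValues: [ 'true', 'false' ]
    Description: Set to true to run the application in debug mode
  # AWS-specific type: the web console renders a drop-down of valid zones.
  AppAvailabilityZone:
    Type: AWS::EC2::AvailabilityZone::Name
    Description: Availability Zone to launch the app server in
  # A list of subnet IDs, which will differ between dev, test, and production.
  AppSubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
    Description: Subnets the auto scaling group may launch servers in
```

Because Debug has a default, it is the only one of the three that can be left out when the stack is created or updated.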
At this point, we need to introduce the inbuilt Ref function. When you need to reference a parameter value, you use this function to do so:
KeyName:
  Ref: EC2KeyName
While Ref isn't the only inbuilt function you'll need to know, it's almost certainly going to be the one you'll use the most. We'll talk more about inbuilt functions later in this chapter.
Resources
Resources are your actual pieces of AWS infrastructure. These are your EC2 instances, S3 buckets, ELBs, and so on. Almost any resource type you can create by pointing and clicking in the AWS web console can also be created using CloudFormation. The full list of supported resource types is available at http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-template-resource-type-ref.html.
There are a few important things to keep in mind about CloudFormation resources:
- New or bleeding-edge AWS resources are often not immediately supported. CloudFormation support typically lags a few weeks (sometimes months) behind the release of new AWS features. This used to be quite frustrating for anyone to whom infrastructure automation is key. Fast-forward to today, and this situation is somewhat mitigated by the ability to use custom resources. These are discussed further on in this chapter.
- Resources have a default return value. You can use Ref to fetch these return values for use elsewhere in your template. For example, the AWS::EC2::VPC resource type has a default return value that is the ID of the VPC. VPC IDs look something like this: vpc-11aa111a.
- Resources often contain additional return values. These additional values are fetched using the inbuilt Fn::GetAtt function. Continuing from the previous example, the AWS::EC2::VPC resource type also returns the following:
- CidrBlock
- DefaultNetworkAcl
- DefaultSecurityGroup
- Ipv6CidrBlocks
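The difference between the default return value and the additional return values can be sketched as follows. The logical names and CIDR ranges here are arbitrary choices for illustration:

```yaml
Resources:
  ExampleVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
  ExampleSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      # Ref returns the default value of the VPC resource: its ID (vpc-...).
      VpcId:
        Ref: ExampleVPC
      CidrBlock: 10.0.1.0/24
Outputs:
  VpcDefaultSecurityGroup:
    # Fn::GetAtt fetches one of the additional return values by name.
    Value:
      Fn::GetAtt: [ ExampleVPC, DefaultSecurityGroup ]
    Description: ID of the default security group created with the VPC
```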
Outputs
Just like AWS resources, CloudFormation stacks can also have return values, called outputs. These values are entirely user defined. If you don't specify any outputs, then nothing is returned when your stack is completed.
Outputs can come in handy when you are using a CI/CD tool to create your CloudFormation stacks. For example, you might like to output the public hostname of an ELB so your CI/CD tool can turn it into a clickable link within the job output.
You'll also use them when you are linking together pieces of your layer cake. You may want to reference an S3 bucket or security group created in another stack. This is much easier to do with the new cross-stack references feature, which we'll discuss later in this chapter. You can expect to see the Ref and Fn::GetAtt functions a lot in the output section of any CloudFormation template.
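As a minimal sketch, an Outputs section that returns both kinds of value for an S3 bucket might look like this. The logical names are hypothetical:

```yaml
Resources:
  ExampleBucket:
    Type: AWS::S3::Bucket
Outputs:
  ExampleBucketName:
    # Ref on a bucket returns its name.
    Value:
      Ref: ExampleBucket
    Description: Name of the bucket
  ExampleBucketDomainName:
    # Fn::GetAtt returns an additional attribute, here the bucket's DNS name.
    Value:
      Fn::GetAtt: [ ExampleBucket, DomainName ]
    Description: DNS name of the bucket
```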
Mappings
The mappings section is used to define a set of key/value pairs. If you require any kind of AWS region portability, perhaps for DR or availability purposes or simply to get your application closer to your end user, you'll almost certainly need to specify some mappings in your template. This is particularly necessary if you are referencing anything in your template that is region specific.
The canonical example would be to specify a map of EC2 AMI IDs in your template. This is because AMIs are a region-specific resource, so a reference to a valid Amazon Machine Image (AMI) ID in one region will be invalid in another.
Mappings look like this:
Mappings:
  RegionMap:
    us-east-1:
      AMIID: ami-9be6f38c
    ap-southeast-2:
      AMIID: ami-28cff44b
Dependencies and ordering
When executing your template, CloudFormation will automatically work out which resources depend on each other and order their creation accordingly. Additionally, resource creation is parallelized as much as possible so that your stack execution finishes in the timeliest manner possible. Things occasionally come unstuck, however.
Let's take an example where an app server depends on a DB server. In order to connect to the database, the app server needs to know its IP address or hostname, so the DB server has to exist first. If your template passes that address to the app server using Ref or Fn::GetAtt, CloudFormation infers the dependency for you. If the coupling is implicit (for example, the hostname is hardcoded or looked up at runtime), CloudFormation has no way of knowing about it, so it will go ahead and create the two resources in any order it pleases (or in parallel if possible).
To fix this situation, we use the DependsOn attribute to tell CloudFormation that our app server depends on our DB server. In fact, DependsOn can actually take a list of strings if a resource happens to depend on multiple resources before it can be created. So if our app server were to also depend on, say, a Memcached server, then we use DependsOn to declare both dependencies.
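A sketch of the app server scenario, with both dependencies declared, could look like the following. The ServerAMI parameter and the three logical names are assumptions made for the example:

```yaml
Parameters:
  ServerAMI:
    Type: AWS::EC2::Image::Id
    Description: AMI used for all three example servers
Resources:
  DBServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.nano
      ImageId:
        Ref: ServerAMI
  CacheServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.nano
      ImageId:
        Ref: ServerAMI
  AppServer:
    Type: AWS::EC2::Instance
    # CloudFormation will not start creating this instance until both
    # DBServer and CacheServer have been created.
    DependsOn: [ DBServer, CacheServer ]
    Properties:
      InstanceType: t2.nano
      ImageId:
        Ref: ServerAMI
```

Note that DependsOn only waits for CloudFormation to consider the other resources created. It says nothing about whether the software running on them is ready.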
If necessary, you can take this further. Let's say that after your DB server boots, it will automatically start the database, set up a schema, and import a large amount of data. It may be necessary to wait for this process to complete before we create an app server that attempts to connect to a DB expecting a complete schema and data set. In this scenario, we want a way to signal to CloudFormation that the DB server has completed its initialization so it can go ahead and create resources that depend on it. This is where WaitCondition and WaitConditionHandle come in.
Firstly, you create an AWS::CloudFormation::WaitConditionHandle type, which you can later reference via Ref.
Next, you create an AWS::CloudFormation::WaitCondition type. In our case, we want the wait period to start as soon as the DB server is created, so we specify that this WaitCondition resource DependsOn our DB server.
After the DB server has finished importing data and is ready to accept connections, it calls the callback URL provided by the WaitConditionHandle resource to signal to CloudFormation that it can stop waiting and start executing the rest of the CloudFormation stack. The URL is supplied to the DB server via UserData, again using Ref. Typically, curl, wget or some equivalent is used to call the URL.
A WaitCondition resource can have a Timeout period too. This is a value specified in seconds. In our example, we might supply a value of 900 because we know that it should never take more than 15 minutes to boot our DB and import the data.
Here's an example of what DependsOn, WaitConditionHandle, and WaitCondition look like combined:
ExampleWaitHandle:
  Type: AWS::CloudFormation::WaitConditionHandle
ExampleWaitCondition:
  Type: AWS::CloudFormation::WaitCondition
  DependsOn: ExampleEC2Instance
  Properties:
    Handle:
      Ref: ExampleWaitHandle
    Timeout: 600
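The signaling side can be sketched as well. The fragment below assumes a DB server that calls the handle's URL with curl once its import has finished. The ServerAMI parameter, the placeholder comment, and the values in the JSON body are illustrative only:

```yaml
Parameters:
  ServerAMI:
    Type: AWS::EC2::Image::Id
Resources:
  ExampleWaitHandle:
    Type: AWS::CloudFormation::WaitConditionHandle
  DBServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.nano
      ImageId:
        Ref: ServerAMI
      UserData:
        Fn::Base64:
          # Fn::Sub replaces ${ExampleWaitHandle} with the callback URL.
          Fn::Sub: |
            #!/bin/bash -ex
            # (start the database and import the data here)
            curl -X PUT -H 'Content-Type:' \
              --data-binary '{"Status":"SUCCESS","Reason":"DB ready","UniqueId":"db-1","Data":"import complete"}' \
              '${ExampleWaitHandle}'
```

The Data value sent here is what a WaitCondition attached to this handle would return through Fn::GetAtt.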
Functions
CloudFormation provides some inbuilt functions to make composing your templates a lot easier. We've already looked at Ref and Fn::GetAtt. Let's look at some others you are likely to encounter.
Fn::Join
Use Fn::Join to concatenate a list of strings using a specified delimiter, like this, for example:
"Fn::Join": [ ".", [ 1, 2, 3, 4 ] ]
This would yield the following value:
"1.2.3.4"
Fn::Sub
Use Fn::Sub to perform string substitution. Consider this:
DSN:
  Fn::Sub:
    - mysql://${db_user}:${db_pass}@${db_host}:3306/wordpress
    - { db_user: lchan, db_pass: ch33s3, db_host: localhost }
This would yield the following value:
mysql://lchan:ch33s3@localhost:3306/wordpress
When you combine these functions with Ref and Fn::GetAtt, you can start doing some really powerful stuff, as we'll be seeing in the recipes throughout this book.
Other available inbuilt functions include:
- Fn::Base64
- Fn::FindInMap
- Fn::GetAZs
- Fn::ImportValue
- Fn::Select
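As one example of combining these, Fn::GetAZs and Fn::Select are often used together to pick an Availability Zone without hardcoding it. This is a sketch only, and the volume resource is just a convenient thing to place in a zone:

```yaml
Resources:
  ExampleVolume:
    Type: AWS::EC2::Volume
    Properties:
      Size: 1
      AvailabilityZone:
        # Fn::GetAZs with an empty string lists the zones of the region the
        # stack is created in; Fn::Select then picks the first entry.
        Fn::Select:
          - 0
          - Fn::GetAZs: ''
```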
Conditionals
It's reasonably common to provision a similar but distinct set of resources based on which environment your stack is running in. In your development environment, for example, you may not wish to create an entire fleet of database servers (HA master and read replicas), instead opting for just a single database server. You can achieve this by using conditionals:
- Fn::And
- Fn::Equals
- Fn::If
- Fn::Not
- Fn::Or
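Here is a sketch of the database scenario described above. The Environment parameter, the instance classes, and the other property values are assumptions chosen for the example rather than recommendations:

```yaml
Parameters:
  Environment:
    Type: String
    Default: development
    AllowedValues: [ development, production ]
  DBPassword:
    Type: String
    NoEcho: true
Conditions:
  # True only when the stack is run with Environment set to production.
  IsProduction:
    Fn::Equals:
      - Ref: Environment
      - production
Resources:
  PrimaryDatabase:
    Type: AWS::RDS::DBInstance
    Properties:
      Engine: mysql
      AllocatedStorage: '20'
      MasterUsername: admin
      MasterUserPassword:
        Ref: DBPassword
      # Fn::If chooses between two values based on the condition.
      DBInstanceClass:
        Fn::If: [ IsProduction, db.m5.large, db.t3.micro ]
  ReadReplica:
    Type: AWS::RDS::DBInstance
    # The whole resource is skipped unless the condition is true.
    Condition: IsProduction
    Properties:
      SourceDBInstanceIdentifier:
        Ref: PrimaryDatabase
      DBInstanceClass: db.m5.large
```

In development, this template would create a single small database. In production, it would create a larger primary and a read replica.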
Permissions and service roles
One important thing to remember about CloudFormation is that it's more or less just making API calls on your behalf. This means that CloudFormation will assume the very same permissions or role you use to execute your template. If you don't have permission to create a new hosted zone in Route 53, for example, any template you try to run that contains a new Route 53 hosted zone will fail.
On the flip side, this has created a somewhat tricky situation where anyone developing CloudFormation typically has a very elevated level of privileges, and these privileges are somewhat unnecessarily granted to CloudFormation each time a template is executed.
If my CloudFormation template contains only one resource, which is a Route 53 hosted zone, it doesn't make sense for that template to be executed with full admin privileges to my AWS account. It makes much more sense to give CloudFormation a very slim set of permissions to execute the template with, thus limiting the blast radius if a bad template were to be executed (for example, a bad copy-and-paste operation resulting in deleted resources).
Thankfully, service roles have recently been introduced, and you can now define an IAM role and tell CloudFormation to use this role when your stack is being executed, giving you a much safer space to play in.
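A service role is an ordinary IAM role that the CloudFormation service is allowed to assume. The sketch below defines one for the hosted zone scenario. The role name and, in particular, the list of Route 53 actions are assumptions, and you may well need to adjust the actions for a real template:

```yaml
Resources:
  HostedZoneServiceRole:
    Type: AWS::IAM::Role
    Properties:
      # Trust policy: only the CloudFormation service may assume this role.
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: cloudformation.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: ManageHostedZones
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              # Illustrative, slim set of permissions (an assumption).
              - Effect: Allow
                Action:
                  - route53:CreateHostedZone
                  - route53:DeleteHostedZone
                  - route53:GetHostedZone
                  - route53:ListHostedZones
                  - route53:GetChange
                Resource: '*'
Outputs:
  ServiceRoleArn:
    Value:
      Fn::GetAtt: [ HostedZoneServiceRole, Arn ]
```

You would then pass the role's ARN when creating or updating the hosted zone stack, for example with the --role-arn option of the AWS CLI.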
Custom resources
As discussed previously in this chapter, it's common for there to be a lengthy wait between the release of a new AWS feature and your ability to use that feature in CloudFormation.
Before custom resources, this led AWS developers down the path of doing over 95 percent of their automation in CloudFormation and then running some CLI commands to fill in the gaps. It was often difficult to tell exactly which resources belonged to which stack, and knowing exactly when your stack had finished execution became a guessing game.
Fast forward to today, and the emerging pattern is to use a custom resource to delegate to an AWS Lambda function. Lambda can fill in the gaps by making API calls on your behalf, and it becomes much easier to track the heritage and completion of these resources.
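From the template's point of view, a Lambda-backed custom resource looks something like the following sketch. It assumes the Lambda function already exists. The parameter name, the Custom::GapFiller type name, and the FeatureName property are all hypothetical:

```yaml
Parameters:
  GapFillerFunctionArn:
    Type: String
    Description: ARN of an existing Lambda function that implements the resource
Resources:
  ExampleCustomResource:
    # Custom resource types are named Custom::<anything you like>.
    Type: Custom::GapFiller
    Properties:
      # ServiceToken tells CloudFormation where to send create, update,
      # and delete requests for this resource.
      ServiceToken:
        Ref: GapFillerFunctionArn
      # Any other properties are passed through to the function as-is.
      FeatureName: example-feature
```

The function is responsible for reporting success or failure back to the response URL included in each request. If it never responds, the stack operation waits until it times out.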
Cross-stack references
When using the layer cake approach, it's very common to want to use outputs from one stack as inputs in another stack. For example, you may create a VPC in one stack and require its VPC ID when creating resources in another.
For a long time, one needed to provide some glue around stack creation to pass output between stacks. AWS recently introduced cross-stack references, which provide a more native way of doing this.
You can now export one or more outputs from your stack. This makes those outputs available to other stacks. Note that the name of each exported value needs to be unique within your account and region, so it's probably a good idea to include the CloudFormation stack name in the name you're exporting to achieve this.
Once a value is exported, it becomes available to be imported in another stack using the Fn::ImportValue function—very handy!
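The VPC example can be sketched as two separate templates, shown here one after the other and separated by a YAML document marker. The logical names, the export naming scheme, and the NetworkStackName parameter are assumptions:

```yaml
# Template for the network layer: exports the VPC ID.
Resources:
  ExampleVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
Outputs:
  VpcId:
    Value:
      Ref: ExampleVPC
    Export:
      # Prefixing with the stack name keeps the export name unique.
      Name:
        Fn::Sub: '${AWS::StackName}-VpcId'
---
# Template for an app layer: imports the VPC ID exported above.
Parameters:
  NetworkStackName:
    Type: String
    Description: Name of the stack that exported the VPC ID
Resources:
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: App servers
      VpcId:
        Fn::ImportValue:
          Fn::Sub: '${NetworkStackName}-VpcId'
```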
Updating resources
One of the principles of IaC is that all changes should be represented as code for review and testing. This is especially important where CloudFormation is concerned.
After creating a stack for you, the CloudFormation service is effectively hands off. If you make a change to any of the resources created by CloudFormation (in the web console, command line, or by some other method), you're effectively causing configuration drift. CloudFormation no longer knows the exact state of the resources in your stack.
The correct approach is to make these changes in your CloudFormation template and perform an update operation on your stack. This ensures that CloudFormation always knows the state of your stack and allows you to maintain confidence that your infrastructure code is a complete and accurate representation of your running environments.
Change sets
When performing a stack update, it can be unclear exactly what changes are going to be made to your stack. Depending on which resource you are changing, you may find that it will need to be deleted and recreated in order to implement your change. This, of course, is completely undesired behavior if the resource in question contains data you'd like to keep. Keep in mind that RDS databases can be a particular pain point.
To mitigate this situation, CloudFormation allows you to create and review a change set prior to executing the update. The change set shows you which operations CloudFormation intends to perform on your resources. If the change set looks good, you can choose to proceed. If you don't like what you see, you can delete the change set and choose another course of action—perhaps choosing to create and switch to an entirely new stack to avoid a service outage.
Other things to know
There are a few other things you should keep in the back of your mind as you start to build out your own CloudFormation stacks. Let's take a look.
Name collisions
Often, if you omit the name attribute from a resource, CloudFormation will generate a name for you. This can result in weird-looking resource names, but it will increase the replayability of your template. Using AWS::S3::Bucket as an example, if you specify the BucketName property but don't ensure its uniqueness, CloudFormation will fail to execute your template the second time around because the bucket will already exist. Omitting BucketName fixes this. Alternatively, you may opt to generate your own unique name each time the template is run. There's probably no right or wrong approach here, so just do what works for you.
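Both approaches are sketched below. The BucketPrefix parameter is a hypothetical way of letting whoever runs the template supply the part of the name that makes it unique:

```yaml
Parameters:
  BucketPrefix:
    Type: String
    AllowedPattern: '[a-z0-9-]+'
    Description: Lowercase prefix that makes the explicit bucket name unique
Resources:
  # No BucketName: CloudFormation generates a unique name for each stack.
  GeneratedNameBucket:
    Type: AWS::S3::Bucket
  # Explicit name: unique only as long as the prefix differs between runs.
  ExplicitNameBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName:
        Fn::Sub: '${BucketPrefix}-${AWS::AccountId}-${AWS::Region}'
```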
Rollback
When creating a CloudFormation stack, you are given the option of disabling rollback. Before you go ahead and set this to true, keep in mind that this setting persists beyond stack creation. We've ended up in precarious situations where updating an existing stack has failed (for some reason) but rollback has been disabled. This is a fun situation for no one.
Limits
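The limits most likely to concern you are as follows. AWS revises these quotas from time to time, so check the CloudFormation documentation for the current values: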
- The maximum size allowed for your CloudFormation template is 50 KB. This is quite generous, and if you hit this limit, you almost certainly need to think about breaking up your template into a series of smaller ones. If you absolutely need to exceed the 50 KB limit, then the most common approach is to first upload your template to S3 and then provide an S3 URL to CloudFormation to execute.
- The maximum number of parameters you can specify is 60. If you need more than this, then again, consider whether you need to add more layers to your cake. Otherwise, lists or mappings might get you out of trouble here.
- Outputs are also limited to 60. If you've hit this limit, it's probably time to resort to a series of smaller templates.
- Resources are limited to 200. The same rules apply here as before.
- By default, you're limited to a total of 200 CloudFormation stacks. You can have this limit increased simply by contacting AWS.
Circular dependencies
Something to keep in the back of your mind is that you may run into a circular dependency scenario, where multiple resources depend on each other for creation. A common example is where two security groups reference each other in order to allow access between themselves.
A workaround for this particular scenario is to use the AWS::EC2::SecurityGroupEgress and AWS::EC2::SecurityGroupIngress types instead of the ingress and egress rule types for AWS::EC2::SecurityGroup.
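The workaround can be sketched like this. Each security group is created without any inline rules, and the rules that tie the two together are declared as separate resources. The group names and port numbers are made up for the example:

```yaml
Parameters:
  VpcId:
    Type: AWS::EC2::VPC::Id
Resources:
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: App servers
      VpcId:
        Ref: VpcId
  DBSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: DB servers
      VpcId:
        Ref: VpcId
  # Standalone rules: both groups already exist by the time these are created,
  # so neither group has to reference the other in its own definition.
  DBFromAppIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId:
        Fn::GetAtt: [ DBSecurityGroup, GroupId ]
      SourceSecurityGroupId:
        Fn::GetAtt: [ AppSecurityGroup, GroupId ]
      IpProtocol: tcp
      FromPort: 3306
      ToPort: 3306
  AppFromDBIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId:
        Fn::GetAtt: [ AppSecurityGroup, GroupId ]
      SourceSecurityGroupId:
        Fn::GetAtt: [ DBSecurityGroup, GroupId ]
      IpProtocol: tcp
      FromPort: 8080
      ToPort: 8080
```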
DSLs and generators
DSLs and generators can be a point of hot debate among infrastructure coders. Some love them, some hate them. Some of the reasons why people love them include the following:
- They allow CloudFormation to be written in a language that is more native to them or their team.
- They allow the use of some advanced programming constructs. Iteration is a particularly well-cited example.
- Until YAML was supported by CloudFormation, using a DSL usually resulted in code that was easier to read and far less verbose.
Some of the reasons people dislike them are:
- DSLs have a history of becoming abandonware or significantly lagging behind CloudFormation, although there are a couple of well-supported DSLs out there.
- Developers are potentially required to learn a new language and navigate another new set of documentation, on top of learning CloudFormation and navigating the AWS documentation.
- Google and Stack Overflow become a little less useful because one needs to translate questions and answers.
Beyond what is written here, this topic won't come up again in this book. We can't give specific advice as to which road you should take because it's almost always a highly personal and situational choice. However, a sensible approach, especially while coming to grips with AWS and CloudFormation, would be to stick with YAML (or JSON) until you get to the point where you think a DSL or generator might be useful.
Credentials
Under no circumstances do you want to have credentials hardcoded in your templates or committed to your source code repository. Doing this doesn't just increase the chance your credentials will be stolen, it also reduces the portability of your templates. If your credentials are hardcoded and you need to change them, that obviously requires you to edit your CloudFormation template.
Instead, you should add credentials as parameters in your template. Be sure to set the NoEcho property on these parameters so that CloudFormation masks the value anywhere the parameters are displayed.
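A minimal sketch of credential parameters follows. It is a fragment, and the parameter names and the minimum length are illustrative:

```yaml
Parameters:
  DBUsername:
    Type: String
    Description: Username the application uses to connect to the database
  DBPassword:
    Type: String
    # NoEcho masks the value wherever CloudFormation displays parameters.
    NoEcho: true
    MinLength: 8
    Description: Password the application uses to connect to the database
```

The values are then supplied when the stack is created or updated, and referenced inside the template with Ref like any other parameter.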
Stack policies
If there are resources in your stack you'd like to protect from accidental deletion or modification, applying a stack policy will help you achieve this. By default, all resources are able to be deleted or modified. When you apply a stack policy, all resources are protected unless you explicitly allow them to be deleted or modified in the policy. Note that stack policies do not apply during stack creation—they only take effect when you attempt to update a stack.