In this section, we will go through an exhaustive list of best practices to be followed for AWS VPC. Most of these are recommended by AWS as well. Implementing these best practices will ensure that your resources, including your servers, data, and applications, are integrated with other AWS services and secured in AWS VPC. Remember that VPC is not a typical data center and it should not be treated as one.
Plan Your VPC before You Create It
Always start by planning and designing architecture for your VPC before you create it. A bad VPC design will have serious implications on the flexibility, scalability, availability, and security of your infrastructure. So, spend a good amount of time planning out your VPC before you actually start creating it.
Start with the objective of creating a VPC: is it for one application or for a business unit? Spec out all subnets you will need and figure out your availability and fault- tolerance requirements. Find out what all connectivity options you will need for connecting all internal and external networks. You might need to plan for a number of VPCs if you need to connect with networks in more than one region.
Choose the Highest CIDR Block
Once you create VPC with a CIDR block, you cannot change it. You will have to create another VPC and migrate your resources to a new VPC if you want to change your CIDR block. So, take a good look at your current resources and your requirements for the next few years in order to plan and design your VPC architecture. A VPC can have a CIDR block ranging from /16
to /28
, which means you can have between 65,536 and 16 IP addresses for your VPC. AWS recommends that you choose the highest CIDR block available, so always go for /16
CIDR block for your VPC. This way, you won't be short of IP addresses if you need to increase your instances exponentially.
All VPC connectivity options require you to have non-overlapping IP ranges. Consider future connectivity to all your internal and external networks. Make sure you take note of all available IP ranges for all your environments, including remote networks, data centers, offices, other AWS VPCs, and so on, before you assign CIDR ranges for your VPC. None of these should conflict and overlap with any network that you want to connect with.
Leave the Default VPC Alone
AWS provides a default VPC in every region for your AWS account. It is best to leave this VPC alone and start with a custom VPC for your requirement. The default VPC has all components associated with it; however, the security configuration of all these components, such as subnets, security groups, and network ACLs are quite open to the world. There is no private subnet either. So, it is a good idea to create your own VPC from scratch using either a VPC wizard in the AWS console or creating it manually through the AWS console or AWS CLI. You can configure all resources as per your requirement for your custom VPC.
Moreover, by default, if a subnet is not associated with a route table or an NACL, it is associated with the main route table and default NACL. These two components don't have any restrictions on inbound and outbound traffic, and you risk exposing your resources to the entire world.
You should not modify the main route table either; doing that might give other subnets routes that they shouldn't be given. Always create a custom route table and keep the main route table as it is.
Design for Region Expansion
AWS keeps on expanding its regions by adding more availability zones to them. We know that one subnet cannot span more than one availability zone, and distributing our resources across availability zones makes our application highly available and fault-tolerant. It is a good idea to reserve some IP address for future expansion while creating subnets with a subset of VPC CIDR block. By default, AWS reserves five IP address in every subnet for internal usage; make a note of this while allocating IP addresses to a subnet.
Ideally, you should design your subnets according to your architecture tiers, such as the database tier, the application tier, the business tier, and so on, based on their routing needs, such as public subnets needing a route to the internet gateway, and so on. You should also create multiple subnets in as many availability zones as possible to improve your fault-tolerance. Each availability zone should have identically sized subnets, and each of these subnets should use a routing table designed for them depending on their routing need. Distribute your address space evenly across availability zones and keep the reserved space for future expansion.
Follow the Least Privilege Principle
For every resource you provision or configure in your VPC, follow the least privilege principle. So, if a subnet has resources that do not need to access the internet, it should be a private subnet and should have routing based on this requirement. Similarly, security groups and NACLs should have rules based on this principle. They should allow access only for traffic required. Do not add a route to the internet gateway to the main route table as it is the default route table for all subnets.
Keep Most Resources in the Private Subnet
In order to keep your VPC and resources in your VPC secure, ensure that most of the resources are inside a private subnet by default. If you have instances that need to communicate with the internet, then you should add an Elastic Load Balancer (ELB) in the public subnet and add all instances behind this ELB in the private subnet.
Use NAT devices (a NAT instance or a NAT gateway) to access public networks from your private subnet. AWS recommends that you use a NAT gateway over a NAT instance as the NAT gateway is a fully managed, highly available, and redundant component.
Creating VPCs for Different Use Cases
You should ideally create one VPC each for your development, testing, and production environments. This will secure your resources from keeping them separate from each other, and it will also reduce your blast radius, that is, the impact on your environment if one of your VPCs goes down.
For most use cases such as application isolation, multi-tenant application, and business unit alignment, it is a good idea to create a separate VPC.
Favor Security Groups over NACLs
Security groups and NACLs are virtual firewalls available for configuring security rules for your instances and subnets respectively. While security groups are easier to configure and manage, NACLs are different. It is recommended that NACLs be used sparingly and not be changed often. NACLs should be the security policy for your organization as it does not work at a granular level. NACL rules are tied to the IP address and for a subnet, with the addition of every single rule, the complexity and management of these rules becomes exponentially difficult.
Security group rules are tied to instances and these rules span the entire VPC; they are stateful and dynamic in nature. They are easier to manage and should be kept simple. Moreover, security groups can pass other security groups as an object reference in order to allow access, so you can allow access to your database server security group only for the application server security group.
Access control for your VPC should be on top of your list while creating a VPC. You can configure IAM roles for your instances and assign them at any point. You can provide granular access for provisioning new resources inside a VPC and reduce the blast radius by restricting access to high-impact components such as various connectivity options, NACL configuration, subnet creation, and so on.
There will usually be more than one person managing all resources for your VPC; you should assign permissions to these people based on their role and by following the principle of least privileges. If someone does not need access to a resource, that access shouldn't be given in the first place.
Periodically, use the access advisor function available in IAM to find out whether all the permissions are being used as expected and take necessary actions based on your findings.
Create an IAM VPC admin group to manage your VPC and its resources.
Use VPC peering whenever possible. When you connect two VPCs using the VPC peering option, instances in these VPCs can communicate with each other using a private IP address. For a VPC peering connection, AWS uses its own network and you do not have to rely on an external network for the performance of your connection, and it is a lot more secure.
Using Elastic IP Instead of Public IP
Always use Elastic IP (EIP) instead of public IP for all resources that need to connect to the internet. The EIPs are associated with an AWS account instead of an instance. They can be assigned to an instance in any state, whether the instance is running or whether it is stopped. It persists without an instance so you can have high availability for your application depending on an IP address. The EIP can be reassigned and moved to Elastic Network Interface (ENI) as well. Since these IPs don't change, they can be whitelisted by target resources.
All these advantages of EIP over a public IP make it more favorable when compared with a public IP.
Always tag your resources in a VPC. The tagging strategy should be part of your planning phase. A good practice is to tag a resource immediately after it is created. Some common tags include version, owner, team, project code, cost center, and so on. Tags are supported by AWS billing and for resource-level permissions.
Monitoring is imperative to the security of any network, such as AWS VPC. Enable AWS CloudTrail and VPC flow logs to monitor all activities and traffic movement. The AWS CloudTrail will record all activities, such as provisioning, configuring, and modifying all VPC components. The VPC flow log will record all the data flowing in and out of the VPC for all the resources in VPC. Additionally, you can set up config rules for the AWS Config service for your VPC for all resources that should not have changes in their configuration.
Connect these logs and rules with AWS CloudWatch to notify you of anything that is not expected behavior and control changes within your VPC. Identify irregularities within your network, such as resources receiving unexpected traffic in your VPC, adding instances in the VPC with configuration not approved by your organization, among others.
Similarly, if you have unused resources lying in your VPC, such as security groups, EIP, gateways, and so on, remove them by automating the monitoring of these resources.
Lastly, you can use third-party solutions available on AWS marketplace for monitoring your VPC. These solutions integrate with existing AWS monitoring solutions, such as AWS CloudWatch, AWS CloudTrail, and so on, and provide information in a user-friendly way in the form of dashboards.