Introducing AWS Transit Gateway
AWS Transit Gateway is a central hub construct that interconnects multiple VPCs on AWS with each other and with on-premises networks.
Its purpose is to do the following:
- Avoid ending up with a spaghetti network topology, which is likely to happen if you start peering all your VPCs with one another.
- Share common network functions across multiple VPCs, such as internet and on-premises connectivity (either via VPN or AWS Direct Connect (DX)), VPC endpoints, and DNS endpoints.
- Keep those essential network functions separate from the rest of your AWS environment and in a central place managed by your network experts.
AWS Transit Gateway Overview
AWS Transit Gateway is a regional network construct, so if you need to operate in more than one AWS region, you will end up with (at least) one TGW in each region. If you need to establish connectivity between VPCs in different regions, you can create a cross-region peering connection between two TGWs.
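As a concrete illustration, here is a minimal boto3 sketch of how such a cross-region peering could be requested and accepted; the TGW IDs, account ID, and regions are placeholders, not real values:

```python
import boto3

# Placeholder values for illustration; substitute your own IDs and regions.
REQUESTER_REGION = "eu-west-1"
ACCEPTER_REGION = "us-east-1"

ec2 = boto3.client("ec2", region_name=REQUESTER_REGION)

# Request a cross-region peering attachment from the local TGW
# to a TGW in another region (possibly owned by another account).
response = ec2.create_transit_gateway_peering_attachment(
    TransitGatewayId="tgw-0123456789abcdef0",      # local TGW (placeholder)
    PeerTransitGatewayId="tgw-0fedcba9876543210",  # remote TGW (placeholder)
    PeerAccountId="111122223333",                  # remote TGW owner (placeholder)
    PeerRegion=ACCEPTER_REGION,
)
attachment_id = response["TransitGatewayPeeringAttachment"]["TransitGatewayAttachmentId"]

# The peering must be accepted on the remote side before it becomes available.
ec2_peer = boto3.client("ec2", region_name=ACCEPTER_REGION)
ec2_peer.accept_transit_gateway_peering_attachment(
    TransitGatewayAttachmentId=attachment_id
)
```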
TGWs are highly available by design, so you do not need more than one TGW to make your network transit hub resilient. That said, when you attach a VPC to a TGW, you must specify the subnet(s), and therefore the AZ(s), in which that attachment takes effect. So, although the TGW itself is highly available, it is a best practice to specify subnets in more than one AZ so that the VPC attachment is highly available too.
Note that resources deployed in a subnet within a given AZ can only reach a TGW if the TGW attachment includes a subnet in that same AZ. In other words, even if a subnet's route table contains a route pointing to the TGW, the TGW will not be reachable from that subnet unless the attachment covers a subnet in the same AZ. It is therefore key to tie one subnet in each AZ to the TGW attachment wherever your resources need access to the TGW.
It is usually recommended to dedicate a separate subnet to the attachment in each AZ, with a small Classless Inter-Domain Routing (CIDR) range (for example, a /28), so that you keep more IP addresses for your own resources. This also lets you apply distinct network ACLs, as well as separate route tables, to the subnets where you deploy your resources and the subnets used by the TGW attachment.
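To illustrate, the following is a minimal boto3 sketch of attaching a VPC using one small, dedicated subnet per AZ; all IDs are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # region is a placeholder

# Attach a VPC using one dedicated subnet per AZ (e.g., /28s created
# solely for the TGW attachment). VPC and subnet IDs are placeholders.
ec2.create_transit_gateway_vpc_attachment(
    TransitGatewayId="tgw-0123456789abcdef0",
    VpcId="vpc-0123456789abcdef0",
    SubnetIds=[
        "subnet-0aaaaaaaaaaaaaaaa",  # /28 in AZ a
        "subnet-0bbbbbbbbbbbbbbbb",  # /28 in AZ b
    ],
)
```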
For organizations that intend to use stateful network appliances in their AWS environment, a specific mode called appliance mode can be enabled on the VPC attachment corresponding to the VPC where the appliance is deployed.
Appliance mode has the effect of routing ingress and egress traffic through the same AZ in that VPC (so that stateful inspection keeps working), which is not guaranteed otherwise.
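Assuming the attachment already exists, a minimal boto3 sketch of enabling appliance mode could look like this; the attachment ID is a placeholder, and the same option can also be set at attachment creation time:

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # region is a placeholder

# Enable appliance mode on the attachment for the VPC hosting the
# stateful appliance. The attachment ID is a placeholder.
ec2.modify_transit_gateway_vpc_attachment(
    TransitGatewayAttachmentId="tgw-attach-0123456789abcdef0",
    Options={"ApplianceModeSupport": "enable"},
)
```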
Another important consideration for complex organizations whose AWS environment spans multiple AWS regions is that additional TGWs do not, by themselves, incur extra charges. Indeed, TGW usage is priced along two dimensions: per VPC attachment and per volume of traffic (GB) going through the TGW. So, unless you decide to attach some VPCs to more than one TGW, these costs stay the same. TGW peering does not affect the costs either, since there is no extra charge for the peering itself, and the TGW traffic costs are not counted twice but only at one of the two peered TGWs (typically the sending TGW). The only additional costs in the case of cross-region peering between two TGWs are the inter-region data transfer charges.
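To make the two pricing dimensions concrete, here is a back-of-the-envelope sketch; the rates below are deliberately made-up placeholders, not actual AWS prices, which vary by region and over time (check the AWS pricing page for your region):

```python
# Illustrative placeholder rates, NOT actual AWS prices.
HOURLY_RATE_PER_ATTACHMENT = 0.05  # placeholder $/attachment-hour
RATE_PER_GB_PROCESSED = 0.02       # placeholder $/GB processed

attachments = 4        # e.g., 3 VPC attachments + 1 VPN attachment
hours_in_month = 730   # approximate hours in a month
gb_processed = 500     # traffic processed by the TGW in the month

monthly_cost = (
    attachments * hours_in_month * HOURLY_RATE_PER_ATTACHMENT
    + gb_processed * RATE_PER_GB_PROCESSED
)
print(f"Estimated monthly TGW cost: ${monthly_cost:.2f}")
```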
Routing with AWS Transit Gateway
AWS Transit Gateway supports both dynamic and static routing. By default, the network elements attached to a TGW (VPCs, VPN or DX connections, peered TGWs) are associated with its default route table, unless otherwise specified. You are naturally free to organize routing as you please by creating additional route tables and then associating each attached network element with the route table of your choice.
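For example, creating an additional TGW route table and associating an attachment with it could look like the following boto3 sketch; all IDs are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # region is a placeholder

# Create an additional TGW route table and associate an attachment with it
# instead of the default route table. IDs are placeholders.
rt = ec2.create_transit_gateway_route_table(
    TransitGatewayId="tgw-0123456789abcdef0"
)
rt_id = rt["TransitGatewayRouteTable"]["TransitGatewayRouteTableId"]

ec2.associate_transit_gateway_route_table(
    TransitGatewayRouteTableId=rt_id,
    TransitGatewayAttachmentId="tgw-attach-0123456789abcdef0",
)
```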
The routes in those route tables can be defined statically or propagated dynamically. When you attach a network element to a TGW, you specify whether the routes coming from that element should be automatically propagated to the TGW route table associated with that element. If you disable propagation, you must define the routes to and from the TGW statically.
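The following boto3 sketch shows both options, enabling propagation for an attachment and defining a static route pointing at an attachment; the IDs and CIDR are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # region is a placeholder

TGW_RT_ID = "tgw-rtb-0123456789abcdef0"         # placeholder TGW route table
ATTACHMENT_ID = "tgw-attach-0123456789abcdef0"  # placeholder attachment

# Option 1: let the attachment propagate its routes into the TGW route table.
ec2.enable_transit_gateway_route_table_propagation(
    TransitGatewayRouteTableId=TGW_RT_ID,
    TransitGatewayAttachmentId=ATTACHMENT_ID,
)

# Option 2: define a static route pointing at the attachment instead.
ec2.create_transit_gateway_route(
    DestinationCidrBlock="10.1.0.0/16",  # placeholder CIDR
    TransitGatewayRouteTableId=TGW_RT_ID,
    TransitGatewayAttachmentId=ATTACHMENT_ID,
)
```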
Routes can be propagated automatically both from your on-premises networks connected to the TGW via VPN or DX and from your VPCs attached to the TGW. In the first case, routes are advertised in both directions using BGP between the TGW and the on-premises network equipment at the other end of the VPN or DX connection. In the case of VPCs, routes are propagated from the VPCs to the TGW but not back from the TGW to the VPCs. You therefore need to update your VPCs' route tables, creating static routes for your VPCs to communicate with the TGW.
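For example, such a static route in a VPC subnet route table could be added as follows; the route table ID, destination CIDR, and TGW ID are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # region is a placeholder

# Add a static route in a VPC subnet route table so that traffic destined
# for the rest of the network is sent to the TGW. IDs are placeholders.
ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",
    DestinationCidrBlock="10.0.0.0/8",  # e.g., your corporate supernet
    TransitGatewayId="tgw-0123456789abcdef0",
)
```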
One more thing worth mentioning on routing is that Transit Gateway cannot route between attachments whose IP address ranges overlap. Thus, when you want to attach a set of VPCs (or on-premises networks) that may have overlapping IP addresses to a TGW, you need to resolve the overlap first. Exactly how to do so goes beyond the scope of this chapter, but make sure to solve that problem before attempting to connect these networks to a TGW. Multiple solutions exist, such as network address translation (NAT), using IP version 6 (IPv6) instead of IP version 4 (IPv4) addresses, or using a third-party solution to handle the translation for you (typically via NAT).
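As a simple sanity check before attaching, you can detect overlapping CIDR ranges with Python's standard ipaddress module; the VPC names and CIDRs below are placeholders:

```python
from ipaddress import ip_network
from itertools import combinations

# Placeholder CIDRs for the VPCs you plan to attach.
vpc_cidrs = {
    "vpc-a": "10.0.0.0/16",
    "vpc-b": "10.0.128.0/17",  # overlaps with vpc-a
    "vpc-c": "172.16.0.0/16",
}

# Flag any pair of VPCs whose CIDR ranges overlap before attaching them.
for (name_a, cidr_a), (name_b, cidr_b) in combinations(vpc_cidrs.items(), 2):
    if ip_network(cidr_a).overlaps(ip_network(cidr_b)):
        print(f"Overlap: {name_a} ({cidr_a}) and {name_b} ({cidr_b})")
```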