Part of the talk will cover our journey from a single AWS account, with two unroutable VPCs on incompatible address space, to a new infrastructure split over 12 AWS accounts, each with one or more VPCs. With greater complexity (there are 72 VPN tunnels per region) comes a need for automation, so all infrastructure within these new AWS accounts is driven by code.
The transit network that links these VPCs supports dynamic, multi-region routing and is backed by a pair of Cisco cloud routers per region. The configuration for these routers is driven primarily by Terraform, plugged into a Packer build pipeline, resulting in immutable router AMIs. Getting all of this up and running required a PR to Terraform to add the support we needed to the core aws_vpn_connection resource, and it saw us bump up against Terraform’s limitations, so I think there is room to explore that as well.
If network connectivity is good, I can demo routing failover in production by powering off instances.