Using multiple Terraform AWS Providers for global infrastructure

Using multiple Terraform AWS providers to facilitate globally provisioned infrastructure

Today I Explained

Sometimes when working in the cloud you'll have requirements to ensure data redundancy or to operate services within specific geographic areas. When working with AWS, this means you'll need to provision resources within multiple AWS Regions. For applications, this can be as simple as deploying a Lambda function or a machine in a particular region. For infrastructure, this can present challenges when the infrastructure needs to span regions.

   ┌─────────── AWS ──────────┐
   ▼             ▼            ▼
us-east-1    us-east-2    us-west-2

The above text block shows a graph composed of a top-level node named 'AWS' with three child nodes named us-east-1, us-east-2 and us-west-2.

Terraform and the AWS Terraform Provider offer a solution to this: Terraform supports defining multiple providers within a module, and the AWS Terraform Provider supports specifying a region for each provider configuration. This makes it possible to ensure resources are created in specific regions:

provider "aws" {
  alias = "us-east-1"

  region  = "us-east-1"
  profile = var.aws_profile
}

provider "aws" {
  alias = "us-east-2"

  region  = "us-east-2"
  profile = var.aws_profile
}

# Create this resource within 'us-east-2'
resource "aws_s3_bucket" "example" {
  provider = aws.us-east-2

  # ...
}

This kind of module can be extremely useful for AWS Organizations which leverage AWS Accounts as logical separation between key infrastructure resources. It can simplify things like S3 multi-region replication, cross-region networking or supporting regional resources. For some of these use-cases, adding a new region becomes a routine addition to the module.

To illustrate this, consider a pattern in which the contents of a single S3 bucket are expected to be replicated into buckets created in each of the AWS Regions us-east-1 and us-east-2. As the S3 buckets are likely created through the use of multiple resources such as [aws_s3_bucket_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_policy) or [aws_s3_bucket_replication_configuration](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/s3_bucket_replication_configuration), this bucket would be available as a module.
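
As a rough sketch, the module would create the bucket alongside its supporting resources and expose the values the rest of this post relies on. The output names arn and region_name are assumptions based on how the module is consumed later:

# modules/terraform-aws-s3-myreplicationbucket/main.tf (sketch)
resource "aws_s3_bucket" "this" {
  bucket_prefix = "myreplicationbucket-"

  # ... plus aws_s3_bucket_versioning, aws_s3_bucket_policy, etc.
}

data "aws_region" "current" {}

# The bucket ARN, used as a replication destination by the caller.
output "arn" {
  value = aws_s3_bucket.this.arn
}

# The region this instance of the module was created in.
output "region_name" {
  value = data.aws_region.current.name
}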

Applying the above pattern, it is possible to pass the provider for a specific region to the module usage:

module "us-east-1" {
  source = "./modules/terraform-aws-s3-myreplicationbucket"
  providers = {
    aws = aws.us-east-1
  }

  # ...
}

module "us-east-2" {
  source = "./modules/terraform-aws-s3-myreplicationbucket"
  providers = {
    aws = aws.us-east-2
  }

  # ...
}

resource "aws_s3_bucket_replication_configuration" "replication" {
  provider = aws.source

  role   = aws_iam_role.replication.arn
  bucket = aws_s3_bucket.source.id

  rule {
    id = "MirrorToUSEast1"
    status = "Enabled"

    destination {
      bucket        = module.us-east-1.arn
      storage_class = "STANDARD"
    }
  }

  rule {
    id = "MirrorToUSEast2"
    status = "Enabled"

    destination {
      bucket        = module.us-east-2.arn
      storage_class = "STANDARD"
    }
  }
}

This kind of pattern comes up often when working with infrastructure that spans AWS Regions and doesn't fit well with a self-registration pattern, that is, a pattern in which the regional resources are responsible for registering with (and deregistering from) the global infrastructure.

One of the frustration points with this approach is that supporting a new region requires copying, pasting and updating configuration to match the newly supported region. To support us-west-1 in the above example, that means additions to the replication configuration, a new AWS provider for us-west-1, and a new module usage.

An alternative approach is, instead of modifying the module whenever a new region becomes available, to configure the resources for every region up front and only create them when that region is flagged as "enabled". This makes use of the count meta-argument to conditionally create the Terraform modules.

This means the module becomes:

module "us-east-1" {
  source = "./modules/terraform-aws-s3-myreplicationbucket"
  count  = contains(var.aws_regions, "us-east-1") ? 1 : 0

  providers = {
    aws = aws.us-east-1
  }

  # ...
}

module "us-east-2" {
  source = "./modules/terraform-aws-s3-myreplicationbucket"
  count  = contains(var.aws_regions, "us-east-2") ? 1 : 0

  providers = {
    aws = aws.us-east-2
  }

  # ...
}

locals {
  # Each regional module (a us-west-1 module block, defined in the same
  # way as the two above, is assumed here) is collected as a list, which
  # is empty when that region's count is 0.
  aggregate = {
    "us-east-1" : module.us-east-1[*],
    "us-east-2" : module.us-east-2[*],
    "us-west-1" : module.us-west-1[*],
  }

  # Keep only the module instances for regions that are enabled.
  elements = [
    for key, value in local.aggregate :
    value[0] if length(value) > 0
  ]
}

# ...

resource "aws_s3_bucket_replication_configuration" "replication" {
  role   = aws_iam_role.replication.arn
  bucket = aws_s3_bucket.source.id

  dynamic "rule" {
    for_each = local.elements

    content {
      id       = "MirrorTo${lookup(rule.value, "region_name")}"
      priority = rule.key
      status   = "Enabled"

      delete_marker_replication {
        status = "Enabled"
      }

      filter {}

      destination {
        bucket        = lookup(rule.value, "arn")
        storage_class = "STANDARD"
      }
    }
  }
}
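
The aws_regions variable itself isn't shown above; as a sketch (the type and default are assumptions based on how it is used with contains()), it might be declared as:

variable "aws_regions" {
  description = "AWS Regions this configuration should create regional resources in."
  type        = set(string)
  default     = ["us-east-1", "us-east-2"]
}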

This means that a request to add support for another AWS Region becomes the addition of a new entry to the aws_regions variable, which can be set wherever a reusable module with one of these responsibilities is consumed. A fully expanded version of the module usages, defined using a template, might look something like this:

{{ range (ds "regions") }}
#====== {{ .RegionName }}
provider "aws" {
  alias = "{{ .RegionName }}"

  region  = "{{ .RegionName }}"
  profile = var.aws_profile
}

module "{{ .RegionName }}" {
  source = "{{ include "source" }}"
  count  = contains(var.aws_regions, "{{ .RegionName }}") ? 1 : 0

  providers = {
    aws = aws.{{ .RegionName }}
  }

{{ template "variables" }}
}
{{ end }}

A note on region status

Within AWS, not all Regions are enabled by default; some must be explicitly "opted in" to. This means that a Terraform module containing a provider definition for one of these opt-in regions will fail if the AWS Account has not yet opted in to that region.

provider "aws" {
  region = "eu-south-2"
}

# Errors with:
#╷
#│ Error: configuring Terraform AWS Provider: validating provider credentials: retrieving caller identity from STS: operation error STS: GetCallerIdentity, https response error StatusCode: 403, RequestID: cce27567-2471-4108-85aa-012a6b68d908, api error InvalidClientTokenId: The security token included in the request is invalid
#│ 
#│   with provider["registry.terraform.io/hashicorp/aws"].eu-south-2,
#│   on modules.tf line 72, in provider "aws":
#│   72: provider "aws" {
#│ 
#╵

This can be addressed by assuming that if an AWS Region is passed to the module, the region has been "opted in" for the AWS Account, and falling back to us-east-1 if it hasn't. Because the corresponding module has count = 0 when the region isn't listed, the fallback provider is configured but never used to create resources:

provider "aws" {
  alias = "eu-south-2"

  region  = contains(var.aws_regions, "eu-south-2") ? "eu-south-2" : "us-east-1"
  profile = var.aws_profile
}

A note on global infrastructure

Not everything that can fit into this pattern should use this pattern. This kind of approach excels when provisioning infrastructure that is tightly coupled and spans multiple regions. When working with infrastructure that lends itself more towards geographic customizations, such as expiration policies, this pattern tends to be overwhelmed by the overhead of managing the variables.

These kinds of global infrastructure are often (but don't have to be) the kind that is "self-registered" into, such as Transit Gateways, VPC associations or S3 replication buckets.

A note on self-registration infrastructure

Within AWS there are many services that support the self-registration pattern. This is a pattern in which a "registry" resource exists within AWS, and deployments of other services "register" with that resource. A common example of this is subdomains within Route53, in which the root domain lives in a logically separated AWS Account, and each subdomain deployment is responsible for creating its NS records within the root domain.
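
As a loose sketch of that Route53 example (the zone names and the aws.root provider alias are assumptions for illustration), the subdomain deployment might register itself like this:

# Provider alias for the logically separated AWS Account holding the
# root hosted zone (assumed to be defined in the surrounding configuration).
data "aws_route53_zone" "root" {
  provider = aws.root
  name     = "example.com"
}

# The subdomain's hosted zone, owned by the deploying AWS Account.
resource "aws_route53_zone" "subdomain" {
  name = "team.example.com"
}

# The "registration": NS records in the root zone delegating the
# subdomain to this deployment's name servers.
resource "aws_route53_record" "registration" {
  provider = aws.root

  zone_id = data.aws_route53_zone.root.zone_id
  name    = aws_route53_zone.subdomain.name
  type    = "NS"
  ttl     = 300
  records = aws_route53_zone.subdomain.name_servers
}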

The key difference between self-registration and global infrastructure is the manner in which the coupling occurs between the resources. Global infrastructure resources are often interwoven with one another, such as the replication configuration of S3 or the route tables of Transit Gateways. With self-registration, the registering infrastructure is the focus, such as a Route53 subdomain, or VPC routing with Transit Gateway route tables.