Cloud agnostic terraform patterns using build artifact metadata

Using the metadata accessible within container images for cloud agnostic deployments

Today I Explained

Designing infrastructure as code to be cloud agnostic is often a matter of accepting losses: we end up encoding conditional logic, favouring a single cloud environment, or under-using a cloud provider's capabilities to avoid coupling ourselves to a single cloud. In many of these cases, a single service (such as AWS IAM) encodes hidden complexity that has no true equivalent in the other clouds.

One pattern that can assist in this kind of design is to avoid multi-cloud Terraform modules and instead rely on a common interface based on a single input. In this pattern, with a container image as the artifact, the Terraform module takes as its input the authoritative path to the container image. The metadata within this container image informs the Terraform module about the schema of possible interfaces that the application possesses, such as environment variables, reserved filepaths, or operating ports.
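As a minimal sketch of what reading that schema could look like, the Python below takes the already-parsed configuration of an image (the shape that `docker image inspect` reports under `Config`, with `Env`, `ExposedPorts`, `Volumes`, and `Labels`) and reduces it to the interface a module would consume. The `describe_interface` function and the `AppInterface` type are hypothetical names, and fetching the configuration from a registry is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class AppInterface:
    """The interface an application declares through its image metadata."""
    env_vars: list = field(default_factory=list)
    ports: list = field(default_factory=list)
    paths: list = field(default_factory=list)
    labels: dict = field(default_factory=dict)


def describe_interface(image_config: dict) -> AppInterface:
    """Reduce an image's `Config` section to the interface it declares."""
    # Env entries are "NAME=value" strings; only the names form the schema.
    env_vars = sorted(e.split("=", 1)[0] for e in image_config.get("Env") or [])
    # ExposedPorts keys look like "8080/tcp"; keep the numeric part.
    ports = sorted(
        int(key.split("/", 1)[0]) for key in image_config.get("ExposedPorts") or {}
    )
    # Volumes keys are the filepaths the image reserves.
    paths = sorted(image_config.get("Volumes") or {})
    labels = dict(image_config.get("Labels") or {})
    return AppInterface(env_vars, ports, paths, labels)


# Example configuration; the values are made up for illustration.
example = {
    "Env": ["PATH=/usr/local/bin:/usr/bin", "DATABASE_URL="],
    "ExposedPorts": {"8080/tcp": {}},
    "Volumes": {"/etc/app": {}},
    "Labels": {"org.opencontainers.image.title": "orders-api"},
}
interface = describe_interface(example)
print(interface.ports)  # [8080]
```

Everything the module needs to know about the application arrives through this one structure, which is what lets the image path be the only input.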

Instead of re-specifying the ports, secrets, and configuration file locations that are already known to the application, the image simply communicates things like “I have an HTTP port specified by this interface”. The Terraform module then becomes extremely opinionated in how it interprets this schema and any related configuration files included with the deployment. This has the module doing things like:

  • Looking up secrets from known registries based on the capabilities and needs of the application
  • Defining monitoring & logging by a strict common standard
  • Mapping ports by purpose, using common pre-defined ports (HTTP:80, SSH:22)
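
The port convention in the last bullet could be sketched as a fixed lookup table. In this hedged Python example, the image declares each purpose through a label under a hypothetical `example.interface.port.` prefix (an assumption for illustration, not a standard), and the module alone decides which external port that purpose receives.

```python
# Conventional external ports, keyed by purpose (taken from the list above).
CONVENTIONAL_PORTS = {"http": 80, "ssh": 22}

# Hypothetical label prefix; the value of each label is the container port.
PORT_LABEL_PREFIX = "example.interface.port."


def map_ports(labels: dict) -> dict:
    """Return {external_port: container_port} for each declared purpose."""
    mapping = {}
    for key, value in labels.items():
        if not key.startswith(PORT_LABEL_PREFIX):
            continue  # unrelated label, not part of the port schema
        purpose = key[len(PORT_LABEL_PREFIX):].lower()
        if purpose not in CONVENTIONAL_PORTS:
            # An opinionated module rejects purposes it has no convention for.
            raise ValueError(f"no port convention for purpose {purpose!r}")
        mapping[CONVENTIONAL_PORTS[purpose]] = int(value)
    return mapping


labels = {
    "example.interface.port.http": "8080",
    "org.opencontainers.image.title": "orders-api",
}
print(map_ports(labels))  # {80: 8080}
```

Rejecting unknown purposes, rather than guessing, is one way of keeping the module opinionated: the application states what it has, and the module owns how it is exposed.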

Parameters that might previously have been passed in, such as the identifier of a Virtual Private Cloud (VPC ID), are instead expected to follow conventions, such as being tagged in a specific way (or being made available within AWS Systems Manager Parameter Store).
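
One way to picture the convention-based lookup: the module derives a parameter name from things it already knows and reads it, rather than accepting the value as an input. The path layout `/network/<environment>/vpc-id` and the function names below are assumptions for illustration, and `read_parameter` stands in for a real Parameter Store client so that the sketch stays self-contained.

```python
from typing import Callable


def vpc_parameter_name(environment: str) -> str:
    """Build the conventional parameter name (hypothetical layout)."""
    return f"/network/{environment}/vpc-id"


def lookup_vpc_id(environment: str, read_parameter: Callable[[str], str]) -> str:
    """Resolve the VPC ID by convention instead of taking it as an input."""
    name = vpc_parameter_name(environment)
    try:
        return read_parameter(name)
    except KeyError:
        # Fail loudly: a missing parameter means the environment breaks convention.
        raise LookupError(f"expected parameter {name} is not defined") from None


# A dict stands in for the parameter store in this example.
store = {"/network/staging/vpc-id": "vpc-0abc1234"}
print(lookup_vpc_id("staging", store.__getitem__))  # vpc-0abc1234
```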

In this pattern, it becomes the responsibility of the Terraform module to best utilize the capabilities of the cloud that it is operating within. This isn’t always straightforward, as many of these capabilities will need some configuration, but by grouping by convention & common purpose, they can be simplified.

A note on conventions & breaking points

Conventions can be really effective at simplifying configuration within infrastructure as code: properties no longer have to be specified, because the module relies on the environment matching the expected convention. The challenge that comes with this is the breaking point of these conventions.

For some conventions, the point at which the convention is exposed as fragile will be irrelevant, or something that can be worked around in special scenarios by allowing a break from the convention.
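
A hedged sketch of such an escape hatch: an explicit value, when supplied, wins over the convention, and the convention remains the default. The names `resolve_setting` and `conventional_log_group`, and the log group paths, are hypothetical.

```python
from typing import Callable, Optional


def resolve_setting(override: Optional[str], by_convention: Callable[[], str]) -> str:
    """Prefer an explicit override; otherwise fall back to the convention."""
    if override is not None:
        # Special circumstance: the caller deliberately breaks from convention.
        return override
    return by_convention()


def conventional_log_group() -> str:
    # Hypothetical convention for where an application's logs are sent.
    return "/apps/orders-api"


print(resolve_setting(None, conventional_log_group))              # /apps/orders-api
print(resolve_setting("/legacy/orders", conventional_log_group))  # /legacy/orders
```

Keeping the override optional means the common case still requires no configuration, while the exception is visible at the call site.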

For other conventions, the point at which the convention is exposed as fragile can become technical debt, a burden that continues to incur effort as the organization tries to work around both the needs of the application and the conventions created in the past.