The Infrastructure Nomad: Architecting Vendor-Agnostic Kubernetes Clusters with Terraform
Learn to architect vendor-agnostic Kubernetes clusters using Terraform. Escape cloud vendor lock-in and mitigate price surges with the Infrastructure Nomad approach.
In the current cloud landscape, loyalty is becoming an expensive virtue. We have seen major hyperscalers silently adjust pricing tiers, introduce fees for previously free resources (such as public IPv4 addresses), and modify egress costs. For CTOs and infrastructure architects, the realization is stark: total reliance on a single provider’s proprietary ecosystem is no longer just a convenience—it is a financial liability.
Enter the Infrastructure Nomad. This is not just a job title, but an architectural philosophy. It is the practice of designing systems that are polite guests on a cloud provider's platform but are always packed and ready to move to a cheaper or more performant host at a moment's notice. By leveraging the standardization of Kubernetes and the provisioning power of Terraform, organizations can decouple their workloads from the underlying infrastructure.
In this guide, we will explore how to architect a truly vendor-agnostic Kubernetes environment. We will move beyond the buzzword 'multi-cloud' and focus on 'portable-cloud', which gives you the leverage to counter price surges by pointing the same Terraform definitions at a different provider.
The High Cost of the 'Easy Button'
Cloud providers have done an excellent job of building 'walled gardens.' Services like AWS EKS, Google GKE, and Azure AKS are fantastic, but they often come with hooks that dig deep into your architecture. When you rely heavily on provider-specific annotations for load balancers, proprietary IAM roles for pod identity, or cloud-specific storage classes, you are effectively welding your application to that vendor's floor.
Consider the scenario where a provider announces a 20% price hike on compute instances or changes the billing model for managed control planes. If your infrastructure code is riddled with aws_-prefixed resources and your Kubernetes manifests rely on ALBs (Application Load Balancers), migrating to DigitalOcean or Linode isn't a weekend project. It's a quarter-long re-engineering nightmare.
The goal of the Infrastructure Nomad is to treat the cloud provider as a commodity utility provider—supplying raw compute, network, and storage—while you control the logic that stitches it together.
To achieve this, we must adopt an abstraction mindset. We need to identify the intersection of services that all providers offer (Compute, Block Storage, Load Balancing) and build our Terraform modules to interface with these common denominators rather than proprietary differentiators.
Architecting with Terraform Abstractions
The core of vendor agnosticism lies in how you structure your Infrastructure as Code (IaC). A common mistake is writing a monolithic main.tf file dedicated to a specific provider. Instead, the Infrastructure Nomad uses a module-based abstraction layer.
Your root module should define the intent of the infrastructure, not the specific implementation. You can achieve this by creating a wrapper module that accepts a provider_id variable. Inside the module, logic branches to call the specific child module relevant to the chosen cloud.
Here is a simplified example of how this abstraction looks in practice:
# Root module: states the intent, not the provider-specific implementation.
variable "cloud_target" {
  description = "Target provider: aws, gcp, or do"
  type        = string
}

module "k8s_cluster" {
  source      = "./modules/cluster_abstraction"
  provider_id = var.cloud_target
  node_count  = 3
  region      = "us-east-1" # AWS-style region name; other providers need their own value
}

Inside the cluster_abstraction module, you use Terraform logic to instantiate the correct resources. Terraform cannot easily switch providers dynamically within a single run, but separating your environments by workspace (e.g., prod-aws and prod-gcp) lets you maintain two potential landing spots for your application using the same high-level definitions.
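One way the inside of cluster_abstraction could branch is sketched below. This is a minimal sketch, not a definitive layout: the ./aws, ./gcp and ./digitalocean child modules and their cluster_endpoint output are hypothetical names introduced for illustration. It assumes Terraform 0.15 or later (for count on modules and the one() function) and that provider configuration lives in the root module, since Terraform rejects count on modules that contain their own provider blocks.

```hcl
# modules/cluster_abstraction/main.tf
# Hypothetical layout: the child modules and their cluster_endpoint
# output are assumptions, not something the article defines.

variable "provider_id" {
  description = "Target provider: aws, gcp, or do"
  type        = string

  validation {
    condition     = contains(["aws", "gcp", "do"], var.provider_id)
    error_message = "The provider_id value must be one of: aws, gcp, do."
  }
}

variable "node_count" {
  type = number
}

variable "region" {
  type = string
}

# Exactly one child module is instantiated; the others get count = 0.
module "aws" {
  source     = "./aws"
  count      = var.provider_id == "aws" ? 1 : 0
  node_count = var.node_count
  region     = var.region
}

module "gcp" {
  source     = "./gcp"
  count      = var.provider_id == "gcp" ? 1 : 0
  node_count = var.node_count
  region     = var.region
}

module "digitalocean" {
  source     = "./digitalocean"
  count      = var.provider_id == "do" ? 1 : 0
  node_count = var.node_count
  region     = var.region
}

# A single, provider-independent output for callers. one() returns the
# only element of the combined list, whichever module produced it.
output "cluster_endpoint" {
  value = one(concat(
    module.aws[*].cluster_endpoint,
    module.gcp[*].cluster_endpoint,
    module.digitalocean[*].cluster_endpoint,
  ))
}
```

With this shape, a second landing spot could be prepared by running terraform workspace new prod-gcp followed by terraform apply -var="cloud_target=gcp", with a region value appropriate to that provider. Callers only ever read cluster_endpoint, so nothing downstream needs to know which child module ran.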
Furthermore, avoid using provider-specific resources for things that can be handled inside Kubernetes. For example, instead of using Terraform to provision an AWS RDS instance and linking it, consider running a database operator within Kubernetes for smaller workloads, or use a Terraform module that swaps aws_db_instance for google_sql_database_instance based on the input variable, ensuring the output connection strings remain consistent for your applications.
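The resource-swapping variant could look roughly like the sketch below. It is illustrative only: the instance names, sizes, engine versions and the db_host output name are assumptions, and a real GCP setup would also need google_sql_database and google_sql_user resources, which are omitted here. It assumes the aws and google providers are configured in the root module, including a default region for google.

```hcl
# modules/database_abstraction/main.tf
# Hypothetical module: names, sizes and versions are placeholders.

variable "provider_id" {
  description = "Target provider: aws or gcp"
  type        = string
}

variable "db_password" {
  type      = string
  sensitive = true
}

resource "aws_db_instance" "this" {
  count               = var.provider_id == "aws" ? 1 : 0
  identifier          = "app-db"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  db_name             = "app"
  username            = "app"
  password            = var.db_password
  skip_final_snapshot = true
}

resource "google_sql_database_instance" "this" {
  count            = var.provider_id == "gcp" ? 1 : 0
  name             = "app-db"
  database_version = "POSTGRES_15"

  settings {
    tier = "db-f1-micro"
  }
}

# The same output name regardless of backend, so application
# configuration does not change when the provider does.
output "db_host" {
  value = one(concat(
    aws_db_instance.this[*].address,
    google_sql_database_instance.this[*].public_ip_address,
  ))
}
```

The design choice here is that only the output contract is shared. Each backend keeps its own arguments, and the module hides the difference behind db_host.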
Standardizing the Kubernetes Layer
Once Terraform has provisioned the raw metal (or virtual metal), Kubernetes takes over. However, your K8s manifests can still be a source of lock-in if you aren't careful. To be a true Infrastructure Nomad, you must standardize the interfaces your applications use.
1. Ingress Agnosticism:
Do not rely on cloud-specific load balancer controllers (like the AWS Load Balancer Controller) directly in your application ingress rules. Instead, deploy a standard Ingress Controller like NGINX or Traefik. Terraform creates one generic Layer 4 Load Balancer to expose the NGINX controller, and all your applications talk to NGINX. This means moving clouds only requires changing one Service definition, not rewriting ingress rules for 50 microservices.
2. Storage Class Abstraction:
Each cloud names its block storage differently (gp2, pd-standard, do-block-storage). Do not hardcode these names in your PersistentVolumeClaims (PVCs). Instead, create a default StorageClass named generic-block-storage during cluster bootstrap and map it to the backend provider's specific driver. Your developers simply request generic-block-storage, and they never need to know if the disk is spinning in Virginia or Frankfurt.
3. ExternalDNS and cert-manager:
Use tools like ExternalDNS and cert-manager. These tools act as the glue between your portable cluster and the outside world. Regardless of where your cluster lives, ExternalDNS can update your Cloudflare or Route 53 records to point to the new IP addresses automatically, and cert-manager can issue fresh TLS certificates in the new cluster. Provided your DNS TTLs are short, this makes the "switch" between providers a matter of minutes, not hours.
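The first two points above can be folded into the cluster bootstrap itself. The following is a minimal sketch under stated assumptions: it uses the hashicorp/kubernetes and hashicorp/helm providers (assumed to be configured elsewhere against the new cluster), the community ingress-nginx Helm chart, and the CSI driver names each provider publishes. The lb_annotations variable and the locals map are hypothetical names introduced for illustration.

```hcl
# bootstrap/main.tf
# Hypothetical bootstrap module; provider configuration is omitted.

variable "provider_id" {
  description = "Target provider: aws, gcp, or do"
  type        = string
}

variable "lb_annotations" {
  description = "Provider-specific Service annotations, if any are needed"
  type        = map(string)
  default     = {}
}

locals {
  # The only place that knows each provider's CSI driver and disk type.
  block_storage = {
    aws = { provisioner = "ebs.csi.aws.com", parameters = { type = "gp2" } }
    gcp = { provisioner = "pd.csi.storage.gke.io", parameters = { type = "pd-standard" } }
    do  = { provisioner = "dobs.csi.digitalocean.com", parameters = {} }
  }
}

# Cluster-wide default class that PVCs reference by its generic name.
resource "kubernetes_storage_class_v1" "generic" {
  metadata {
    name = "generic-block-storage"
    annotations = {
      "storageclass.kubernetes.io/is-default-class" = "true"
    }
  }

  storage_provisioner    = local.block_storage[var.provider_id].provisioner
  parameters             = local.block_storage[var.provider_id].parameters
  reclaim_policy         = "Delete"
  volume_binding_mode    = "WaitForFirstConsumer"
  allow_volume_expansion = true
}

# One NGINX ingress controller, exposed through a single
# LoadBalancer Service that the cloud provider fulfils.
resource "helm_release" "ingress_nginx" {
  name             = "ingress-nginx"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  namespace        = "ingress-nginx"
  create_namespace = true

  values = [yamlencode({
    controller = {
      service = {
        type        = "LoadBalancer"
        annotations = var.lb_annotations
      }
    }
  })]
}
```

Any provider-specific detail is confined to the locals map and the lb_annotations input. Application manifests reference only generic-block-storage and the nginx ingress class, so they should not need to change when the cluster moves.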
Becoming an Infrastructure Nomad does not mean you are constantly migrating. It means you possess the capability to migrate. This capability is a powerful negotiation tool and a critical insurance policy against price surges and service degradation.
By architecting your Kubernetes clusters with Terraform abstractions and adhering to strict standardization within the cluster, you transform your infrastructure from a static foundation into a dynamic fleet. The initial investment in setting up these abstractions is higher than clicking 'Create Cluster' in a console, but the long-term payoff in flexibility and cost control can be substantial.
At Nohatek, we specialize in building resilient, portable cloud architectures. If you are looking to decouple your growth from your cloud provider's pricing model, let's talk about architecting your freedom.