Killing the Cluster Autoscaler: Achieving Just-in-Time EKS Scaling with Karpenter

Stop waiting for AWS Auto Scaling Groups. Learn how Karpenter revolutionizes Kubernetes scaling on EKS with just-in-time provisioning and massive cost savings.

For years, the Kubernetes Cluster Autoscaler (CA) has been the faithful workhorse of cloud-native infrastructure. It did the job: it watched for pending pods and adjusted Auto Scaling Group (ASG) sizes accordingly. But let’s be honest—it has always felt like trying to steer a cruise ship with a canoe paddle. It is reactive, often sluggish, and tethered to the rigid constraints of AWS Auto Scaling Groups.

In the world of modern DevOps, waiting five to seven minutes for a new node to join a cluster is an eternity. Whether you are running high-velocity CI/CD pipelines, training AI models, or handling bursty web traffic, that latency translates directly to lost productivity and frustrated users.

Enter Karpenter. Open-sourced by AWS, Karpenter isn't just a faster autoscaler; it represents a fundamental paradigm shift. It bypasses the traditional ASG abstraction entirely and provisions capacity through the EC2 Fleet API. In this post, we will explore why it is time to retire the Cluster Autoscaler and how Nohatek is helping clients transition to Karpenter’s just-in-time provisioning model to slash costs and improve performance.

The Bottleneck: Why the Cluster Autoscaler is Obsolete

To understand why Karpenter is revolutionary, we must first look at the limitations of the legacy Cluster Autoscaler. The CA operates on a polling cycle. It checks for unschedulable pods, calculates the need for a new node, and then updates the Desired Capacity of an AWS Auto Scaling Group.

This architecture introduces several unavoidable inefficiencies:

  • The ASG Tax: The CA doesn't launch nodes; it asks an ASG to do it. This extra hop adds significant latency.
  • Constraint Rigidity: ASGs are usually homogeneous. If you need a GPU instance for an AI workload but your ASG is configured for general-purpose compute, you are stuck building extra node groups and taint logic by hand (see the sketch just after this list).
  • Inefficient Bin Packing: The CA is often forced to scale up an entire node group even when the pending pod needs only a fraction of those resources, leading to Swiss-cheese resource utilization and wasted cloud spend.
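
To make that rigidity concrete, here is a rough sketch of the ceremony the legacy approach demands: pinning a GPU workload to a dedicated node group with a nodeSelector and a matching toleration. The node group name, image, and taint key here are illustrative, not from any particular cluster:

apiVersion: v1
kind: Pod
metadata:
  name: trainer
spec:
  nodeSelector:
    eks.amazonaws.com/nodegroup: gpu-node-group   # hypothetical dedicated node group
  tolerations:
    - key: nvidia.com/gpu                         # must match the node group's taint
      operator: Exists
      effect: NoSchedule
  containers:
    - name: trainer
      image: trainer:latest                       # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1

Multiply that by every specialized workload shape you run, and you end up maintaining a zoo of node groups.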

For CTOs and infrastructure managers, this rigidity manifests as higher AWS bills and slower application recovery times. The tool that was supposed to automate scaling has become the bottleneck.

How Karpenter Changes the Game: Groupless Autoscaling

Karpenter acts as a proactive node provisioner rather than a reactive scaler. It sits inside your cluster and observes the Kubernetes API server directly. When it sees a pod that cannot be scheduled, it bypasses Auto Scaling Groups entirely and launches exactly the right capacity through the EC2 Fleet API.

This approach allows for Groupless Autoscaling. Instead of pre-defining node groups (e.g., "General Purpose," "Memory Optimized," "GPU"), you define high-level constraints using NodePools (formerly Provisioners). Karpenter analyzes the resource requests of the pending pods and mathematically determines the exact instance type that fits the need at the lowest price point.

Karpenter doesn't just ask for 'a node.' It asks AWS for 'the cheapest spot instance that has at least 4 vCPUs and 16GB of RAM and is available in us-east-1a.'
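
In practice, those constraints come straight from the pod spec. Here is a minimal sketch of a pending pod that would drive that exact decision; the name and image are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: api-worker                 # hypothetical workload
spec:
  containers:
    - name: worker
      image: api-worker:latest     # hypothetical image
      resources:
        requests:
          cpu: "4"                 # Karpenter hunts for the cheapest instance type
          memory: 16Gi             # that fits these requests plus system overhead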

The result? Nodes are provisioned in seconds, not minutes. Furthermore, Karpenter aggressively utilizes Spot Instances by diversifying requests across multiple instance families, significantly reducing the risk of Spot interruptions impacting your workload.

Technical Deep Dive: Configuring NodePools and Consolidation

Implementing Karpenter requires a shift in mindset from "managing groups" to "managing constraints." The core configuration lies in the NodePool and the EC2NodeClass.

Here is a simplified example of how flexible a Karpenter configuration can be. This NodePool tells Karpenter it may use the compute-, general-purpose-, and memory-optimized instance families (c, m, r) from generation 3 onward (c5, m5, r5, and newer), purchased as Spot or On-Demand, based on what the pods actually need:

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        # Allow Spot with On-Demand fallback; Karpenter prefers Spot when both are listed
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        # Compute-, general-purpose-, and memory-optimized families
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        # Generation 3 or newer (c5/m5/r5 and later)
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      nodeClassRef:
        name: default                          # points at the EC2NodeClass below
  disruption:
    consolidationPolicy: WhenUnderutilized     # actively repack and remove waste
    expireAfter: 720h                          # recycle nodes after 30 days
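
The nodeClassRef above points to an EC2NodeClass, which holds the AWS-specific plumbing: the AMI family, the node IAM role, and tag-based discovery of subnets and security groups. Here is a minimal companion sketch; the cluster name, role, and discovery tags are placeholders you would replace with your own:

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2                              # EKS-optimized Amazon Linux 2 AMIs
  role: KarpenterNodeRole-my-cluster          # placeholder node IAM role
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster    # placeholder discovery tag
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster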

The Killer Feature: Consolidation

Notice the consolidationPolicy: WhenUnderutilized line? This is where the ROI becomes undeniable. The legacy Cluster Autoscaler is terrible at scaling down: it only removes a node once utilization falls below a fixed threshold and every pod on it can be safely rescheduled elsewhere, and it will never swap an oversized node for a cheaper one.

Karpenter, however, actively watches for underutilized nodes. If it sees a large node running only a few small pods, it will automatically provision a smaller, cheaper node, move the pods there, and terminate the expensive node. This continuous bin-packing optimization ensures your cluster is always right-sized, effectively defragmenting your infrastructure in real-time.
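
Because consolidation works by evicting pods and moving them, it pairs naturally with PodDisruptionBudgets, which Karpenter honors when deciding what it is allowed to disrupt. A minimal sketch, assuming a hypothetical api-worker app label:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-worker-pdb
spec:
  minAvailable: 2            # consolidation may never evict below two ready replicas
  selector:
    matchLabels:
      app: api-worker        # hypothetical app label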

The Business Impact: ROI and Operational Efficiency

Switching to Karpenter is not just a technical exercise; it is a business decision. For Nohatek clients who have migrated from the standard Cluster Autoscaler to Karpenter on EKS, we have observed three consistent outcomes:

  1. Cost Reduction: By leveraging Spot instances more effectively and utilizing consolidation to eliminate waste, compute bills often drop by 20% to 50%.
  2. Developer Velocity: When a developer pushes a new microservice or a data scientist launches a Jupyter notebook, they get compute resources almost instantly. This removes the friction of waiting for infrastructure, keeping teams in the flow state.
  3. Operational Simplicity: You no longer need to manage dozens of Auto Scaling Groups. A single Karpenter NodePool can often handle the workload of what used to be 10 separate node groups.

For companies leveraging AI/ML workloads, Karpenter is particularly potent. It can dynamically provision expensive GPU nodes only when training jobs are queued and immediately terminate them when the job finishes, preventing the dreaded scenario of idle GPU instances burning through the budget.
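
A dedicated NodePool for that pattern could restrict Karpenter to GPU instance categories and taint the nodes so only GPU workloads land on them. Here is a sketch under the same v1beta1 API as above; the values are illustrative:

apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: gpu
spec:
  template:
    spec:
      taints:
        - key: nvidia.com/gpu
          effect: NoSchedule                 # only pods tolerating GPUs land here
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["g", "p"]                 # GPU instance families
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        name: default
  disruption:
    consolidationPolicy: WhenEmpty           # tear nodes down the moment jobs finish
    consolidateAfter: 30s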

The Kubernetes ecosystem is maturing, and tools that were once standard are being replaced by smarter, more API-centric solutions. The Cluster Autoscaler served us well, but Karpenter is the future of EKS scaling. It offers the agility of serverless with the control of Kubernetes.

However, migrating a production cluster requires careful planning regarding IAM roles, VPC networking, and disruption budgets. If you are looking to optimize your EKS infrastructure, reduce your AWS spend, or accelerate your development pipelines, Nohatek is here to help. We specialize in modernizing cloud-native infrastructure to ensure you aren't just running in the cloud, but truly thriving in it.