
AWS Cloud Cost Savings by Migrating Kubernetes Workloads to Graviton

Written by Muhammad Bintang Bahy | Sep 9, 2025 10:11:32 AM

Running Kubernetes clusters on AWS is powerful and offers a variety of deployment options depending on instance choice. One significant optimization opportunity is migrating workloads from the traditional AMD64 (x86_64) architecture to AWS ARM64 Graviton instances. This post walks through our recent migration project, highlighting the practical steps and the real-world challenges we ran into.

Motivation: Better Performance at Lower Cost

AWS Graviton instances are built on custom ARM chips designed for cloud workloads. They deliver better price–performance than comparable Intel and AMD instances, with AWS citing up to 40% better price performance. For Kubernetes clusters running many workloads, this can make a major difference in both performance and budget.

In our migration, we tested Graviton across development, staging, and production environments. The results were clear:

  • Development: Saved about $300/month
  • Staging: Saved about $400/month
  • Production: Saved about $1,200/month

Overall, we achieved around 45% lower compute costs while maintaining or improving workload performance. Graviton let us run services with lower CPU usage, faster response times, and reduced build durations for some applications.

AWS Graviton Migration Process Overview

  • Setting Up Graviton Node Groups

We added new EKS managed node groups with Graviton instances (t4g.xlarge, c6g.xlarge, c7g.xlarge). For Karpenter, we created provisioners that scale ARM64 nodes. This allowed us to run x86 and Graviton workloads side by side.

  • Building Multi-Architecture Images

We updated CI/CD to build images for both amd64 and arm64. Golang services used cross-compilation to avoid emulation, while Python and .NET used ARM64 runners. This made builds faster and deployments smooth.

  • Updating Scheduling Rules

We added node selectors and tolerations so Kubernetes could place pods on Graviton nodes. This let us migrate gradually and control workloads during testing.

  • Handling Compatibility Issues

Some libraries and tools did not support ARM64. In those cases, we kept workloads on x86 or switched to manual solutions. This way, most services migrated successfully without disruption.

We’ll detail each step and our approach in the following sections.

1. Setting Up Graviton EKS Node Groups

When setting up Graviton in our clusters, we worked with two types of autoscalers: EKS Managed Node Groups and Karpenter. By using both, we had the reliability of managed scaling with EKS and the flexibility of custom scaling policies with Karpenter. This gave us full control during the migration and let us balance stability with efficiency.

Note: We used taints and tolerations to gain more control during the migration of our workloads. Since the Kubernetes scheduler manages architecture selection, using taints and tolerations is optional.

A. EKS Managed Node Groups

For clusters using managed node groups, we added a new group with Graviton instances. We selected t4g.xlarge, c6g.xlarge and c7g.xlarge because they closely matched the size and capacity of our old AMD64 nodes. 

Sample Terraform Configuration:


eks_managed_node_groups = {
  # ... existing configurations
  linux_arm = {
    min_size     = 1
    max_size     = 6
    desired_size = 4
    ami_type     = "AL2_ARM_64"
    
    instance_types = [
      "t4g.xlarge",   # Burstable performance
      "c6g.xlarge",   # Compute optimized
      "c7g.xlarge",   # Latest generation compute optimized
    ]
    
    labels = {
      workload = "graviton"
    }
    taints = [{
      key    = "graviton"
      value  = "enabled"
      effect = "NO_SCHEDULE"
    }]
  }
}
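
Once the node group is active, the new nodes and their architecture can be verified with kubectl; a quick sketch using the labels from the configuration above:

# List all nodes together with their CPU architecture label
kubectl get nodes -L kubernetes.io/arch

# Show only the nodes carrying the Graviton node group label
kubectl get nodes -l workload=graviton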

B. Karpenter Provisioners

For clusters running Karpenter, we created new provisioners that launch Graviton instances. We also applied taints and tolerations to control which workloads would run on these new nodes during the migration.

Sample Karpenter Graviton Provisioner:


apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: graviton-nodepool
spec:
  consolidation:
    enabled: true
  limits:
    resources:
      cpu: "200"
  providerRef:
    name: default
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
    - key: node.kubernetes.io/instance-type
      operator: In
      values: ["t4g.xlarge", "c6g.xlarge", "c7g.xlarge"]
    - key: topology.kubernetes.io/zone
      operator: In
      values: ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
    - key: kubernetes.io/arch
      operator: In
      values: ["arm64"]
    - key: kubernetes.io/os
      operator: In
      values: ["linux"]
  taints: 
    - effect: NoSchedule
      key: graviton
      value: enabled
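
The provisioner's providerRef points to an AWSNodeTemplate named default. For completeness, here is a minimal sketch of such a template; the discovery tag value is a placeholder, not our actual cluster name:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  amiFamily: AL2                          # Karpenter selects the arm64 AL2 AMI for Graviton instance types
  subnetSelector:
    karpenter.sh/discovery: my-cluster    # placeholder: discovery tag on the cluster subnets
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster    # placeholder: discovery tag on the node security groups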

2. Building Multi-Architecture Images

Running workloads on AWS Graviton nodes means your containers must support the arm64 architecture. To achieve this, we updated our CI/CD pipelines and Dockerfiles to build multi-architecture images that work on both amd64 and arm64. This allowed us to deploy the same image across x86 and Graviton nodes without maintaining separate builds.

A. GitHub Actions Build Pipeline Changes

We updated our CI/CD pipelines to build and push images for both architectures. Using Docker Buildx with QEMU, we could build amd64 and arm64 images in one step. This ensured every service had a single multi-arch image that runs on any node type.


jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      ...

+     - name: Set up QEMU
+       uses: docker/setup-qemu-action@v3
+
+     - name: Set up Docker Buildx
+       uses: docker/setup-buildx-action@v3

      - name: Build and Push
        run: |
          docker buildx build \
            --file ./Dockerfile \
+           --platform linux/amd64,linux/arm64 \
            --tag <image>:<tag> \
            --no-cache \
            .
      ...
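
After the pipeline runs, the pushed manifest can be checked to confirm it contains both architectures; a quick sketch, with the image name as a placeholder:

# Both linux/amd64 and linux/arm64 should appear in the manifest list
docker buildx imagetools inspect <image>:<tag>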

B. Example Golang Cross-Build Dockerfile

For Go services, we enhanced our multi-architecture build performance by leveraging Go's native cross-compilation capabilities. The following Dockerfile optimizations eliminate the need for ARM64 emulation during the build process, significantly reducing build times.


# Build on the native (build) platform and cross-compile for the target architecture
FROM --platform=$BUILDPLATFORM golang:1.21.0 AS builder

ARG BUILDPLATFORM
ARG TARGETARCH
ARG TARGETOS

WORKDIR /app
COPY go.sum go.mod ./
RUN go mod download
COPY . .
# Go's native cross-compilation targets the requested OS/architecture without emulation
RUN CGO_ENABLED=0 GOOS=${TARGETOS} GOARCH=${TARGETARCH} go build -o /bin/app ./cmd/

FROM --platform=$TARGETPLATFORM scratch
COPY --from=builder /bin/app /bin/app
ENTRYPOINT ["/bin/app"]

To run the build, use the command below.


docker buildx build \
  --file ./Dockerfile \
  --platform linux/amd64,linux/arm64 \
  --tag <image>:<tag> \
  --no-cache \
  .

Note: For languages like Python and .NET, cross-building is less straightforward (see the “Challenges” below).

3. Updating Scheduling Rules (Optional)

As mentioned above, to control rollout, we added nodeSelectors and tolerations so Kubernetes could place specific pods on Graviton nodes. This gave us flexibility to test gradually before moving everything over. While optional, this step helped reduce migration risk.


tolerations:
  - key: graviton
    operator: Exists
nodeSelector:
  kubernetes.io/arch: arm64
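
For reference, this is how the fields sit in a complete Deployment spec; the name, image, and port below are placeholders rather than one of our actual workloads:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-service                # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sample-service
  template:
    metadata:
      labels:
        app: sample-service
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64     # schedule the pods onto Graviton (arm64) nodes
      tolerations:
        - key: graviton
          operator: Exists            # tolerate the graviton taint set on the node groups
      containers:
        - name: app
          image: <multi-arch-image>   # placeholder: the multi-arch image built in step 2
          ports:
            - containerPort: 8080     # placeholder port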

Challenges Encountered During Graviton Migration

1. Library Incompatibility

Some dependencies may not (yet) support ARM64. For example, we discovered the Fiona library for Python had no ARM64 support, meaning we had to skip Graviton migration for that specific service.
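
For services like this, the opposite pin keeps the pods on x86_64 nodes until the dependency gains ARM64 support; a minimal sketch:

nodeSelector:
  kubernetes.io/arch: amd64   # keep this workload on x86_64 nodes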

2. Slow GitHub Actions Docker Builds for Python and .NET

Unlike Go, Python and .NET don't natively support cross-building. Using QEMU emulation in Buildx made these image builds 5x slower!

The solution: running builds on native ARM64 GitHub Actions runners.

  • Provisioned self-hosted ARM64 GitHub runners (GitHub-hosted ARM64 runners were still in preview and our repository is private)
  • Split the build job so AMD64 and ARM64 images are built on matching runners
  • Merged the two images into a single multi-arch manifest tag after both builds completed (see the sketch below)

Docker’s official documentation describes this multi-platform build pattern in more detail.
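
A simplified sketch of this split-build pattern; the image name, the self-hosted runner label, and the omitted registry login are assumptions rather than our exact pipeline:

jobs:
  build:
    strategy:
      matrix:
        include:
          - arch: amd64
            runner: ubuntu-latest       # GitHub-hosted x86_64 runner
          - arch: arm64
            runner: self-hosted-arm64   # assumption: label of a self-hosted ARM64 runner
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v4
      # Registry login is omitted for brevity
      - name: Build and push the per-architecture image
        run: |
          docker buildx build \
            --platform linux/${{ matrix.arch }} \
            --tag myregistry/app:${{ github.sha }}-${{ matrix.arch }} \
            --push \
            .

  merge-manifest:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Create the multi-arch manifest tag
        run: |
          docker buildx imagetools create \
            --tag myregistry/app:${{ github.sha }} \
            myregistry/app:${{ github.sha }}-amd64 \
            myregistry/app:${{ github.sha }}-arm64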

3. Instana Auto Trace Graviton Incompatibility

Our monitoring tool (Instana) did not support automatic instrumentation on Graviton. To keep visibility, we switched to manual instrumentation based on Instana’s language-specific docs.

Future Optimization: Spot Instances with Graviton

As a next step, combining Graviton node groups with Spot capacity could reduce compute costs even further.
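
In the Karpenter provisioner from step 1, this would mostly come down to widening the capacity-type requirement; a minimal sketch:

  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]   # let Karpenter launch Spot Graviton capacity, with on-demand as a fallback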

Summary

Migrating our Kubernetes workloads to AWS Graviton was not without challenges, but the results made it worth the effort. By carefully updating node groups, pipelines, and scheduling rules, we achieved around 45% lower compute costs while also improving workload efficiency. Services ran with lower CPU usage, faster response times, and in many cases shorter build times.

Although we had to handle a few compatibility issues, these were manageable and did not block the migration. Overall, Graviton proved to be a reliable, cost-effective option for running Kubernetes at scale.

That said, Graviton migration may not be ideal for every application. Some programming languages and libraries still lack full ARM64 support, which can create compatibility issues. In addition, depending on the language and runtime, certain applications may even lose performance when moved from x86 to ARM. Because of this, it’s important to benchmark and test your workloads first to confirm that they actually benefit from Graviton before fully committing.

👉 For teams looking to cut costs and improve performance, moving to Graviton is a practical and future-ready choice.