Migrating from Amazon Linux 2 to Amazon Linux 2023 on an EKS Cluster with Managed Node Groups Using eksctl


Hello, this is Masukawa from the Cloud Business Division.

Amazon Linux 2 reaches end of support on June 30, 2026.

The end-of-support date (end of support, or EOL) for Amazon Linux 2 is June 30, 2026.
https://aws.amazon.com/jp/amazon-linux-2/faqs/

However, if you use the EKS optimized AMIs for your EKS nodes, support ends somewhat earlier, on November 26, 2025.
You can keep running the AMIs after that date, but they will no longer receive security patches or bug fixes.

AWS will end support for EKS AL2-optimized and AL2-accelerated AMIs, effective November 26, 2025. While you can continue using EKS AL2 AMIs after the end-of-support (EOS) date (November 26, 2025), EKS will no longer release any new Kubernetes versions or updates to AL2 AMIs, including minor releases, patches, and bug fixes after this date.
https://docs.aws.amazon.com/eks/latest/userguide/eks-ami-deprecation-faqs.html

In addition, starting with EKS 1.33, Amazon Linux 2-based EKS optimized AMIs will no longer be published.

Additionally, Kubernetes version 1.32 is the last version for which Amazon EKS will release AL2 AMIs.
https://github.com/awslabs/amazon-eks-ami

You could keep running Amazon Linux 2 until June 30, 2026 by building a custom AMI, but going to that much effort to buy roughly half a year is rarely worth it, so in practice most clusters will need to migrate by November 26, 2025.
Besides Amazon Linux 2023, Bottlerocket is also a candidate migration target.

We recommend upgrading to Amazon Linux 2023 (AL2023) or Bottlerocket AMIs:
AL2023 enables a secure-by-default approach with preconfigured security policies, SELinux in permissive mode, IMDSv2-only mode enabled by default, optimized boot times, and improved package management for enhanced security and performance, well-suited for infrastructure requiring significant customizations like direct OS-level access or extensive node changes.
Bottlerocket enables enhanced security, faster boot times, and a smaller attack surface for improved efficiency with its purpose-built, container-optimized design, well-suited for container-native approaches with minimal node customizations.
https://docs.aws.amazon.com/eks/latest/userguide/eks-ami-deprecation-faqs.html

If you want to minimize the impact of the migration, Amazon Linux 2023 is probably the safer choice, but this is also a good opportunity to move to Bottlerocket.
Because it ships with far fewer components, Bottlerocket offers a smaller attack surface, improved security, and lower operational overhead.

https://docs.aws.amazon.com/ja_jp/eks/latest/userguide/eks-optimized-ami-bottlerocket.html
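For reference, pointing an eksctl node group at Bottlerocket is also just an amiFamily change; a minimal sketch (the node group name is illustrative):

```yaml
managedNodeGroups:
  - name: node-group-bottlerocket   # illustrative name
    amiFamily: Bottlerocket
    instanceTypes:
      - t3.medium
    maxSize: 1
    minSize: 1
    desiredCapacity: 1
    privateNetworking: true
```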

In this post, I will migrate to Amazon Linux 2023.

Creating the EKS cluster with eksctl

First, create an EKS cluster that uses the Amazon Linux 2-based EKS optimized AMI with eksctl.
I will use the following config file.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: test-cluster
  region: ap-northeast-1
  version: "1.32"

iam:
  withOIDC: true

vpc:
  clusterEndpoints:
    publicAccess: true
    privateAccess: true

cloudWatch:
  clusterLogging:
    enableTypes:
      - "audit"
      - "authenticator"
      - "controllerManager"
      - "scheduler"

managedNodeGroups:
  - name: node-group-al2
    amiFamily: AmazonLinux2
    instanceTypes:
      - t3.medium
    maxSize: 1
    minSize: 1
    desiredCapacity: 1
    privateNetworking: true
    volumeType: gp3
    volumeSize: 50

autoModeConfig:
  enabled: false

addons:
  - name: eks-pod-identity-agent
    version: latest
  - name: vpc-cni
    version: latest
    useDefaultPodIdentityAssociations: true
  - name: coredns
    version: latest
  - name: kube-proxy
    version: latest

I specify amiFamily explicitly here, but at the time of writing (v0.208.0) the default value is also AmazonLinux2.

https://eksctl.io/usage/schema/#nodeGroups-amiFamily

If you are using Amazon Linux 2 without realizing it, be especially careful not to sail past the end-of-support date unnoticed.
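To audit an existing cluster for AL2-based node groups, a quick AWS CLI loop works; a sketch, assuming the AWS CLI is configured and the cluster name matches the one used in this post:

```shell
# Print each managed node group together with its AMI type.
# AL2-based groups report AL2_x86_64, AL2_x86_64_GPU, or AL2_ARM_64.
CLUSTER=test-cluster
for ng in $(aws eks list-nodegroups --cluster-name "$CLUSTER" \
              --query 'nodegroups[]' --output text); do
  aws eks describe-nodegroup --cluster-name "$CLUSTER" \
    --nodegroup-name "$ng" \
    --query 'nodegroup.[nodegroupName,amiType]' --output text
done
```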
Run the following command to create the cluster.

% eksctl create cluster -f cluster.yaml
2025-05-31 13:13:55 [ℹ]  eksctl version 0.208.0
2025-05-31 13:13:55 [ℹ]  using region ap-northeast-1
2025-05-31 13:13:55 [!]  Amazon EKS will no longer publish EKS-optimized Amazon Linux 2 (AL2) AMIs after November 26th, 2025. Additionally, Kubernetes version 1.32 is the last version for which Amazon EKS will release AL2 AMIs. From version 1.33 onwards, Amazon EKS will continue to release AL2023 and Bottlerocket based AMIs. The default AMI family when creating clusters and nodegroups in Eksctl will be changed to AL2023 in the future.
2025-05-31 13:13:55 [ℹ]  setting availability zones to [ap-northeast-1c ap-northeast-1d ap-northeast-1a]
2025-05-31 13:13:55 [ℹ]  subnets for ap-northeast-1c - public:192.168.0.0/19 private:192.168.96.0/19
2025-05-31 13:13:55 [ℹ]  subnets for ap-northeast-1d - public:192.168.32.0/19 private:192.168.128.0/19
2025-05-31 13:13:55 [ℹ]  subnets for ap-northeast-1a - public:192.168.64.0/19 private:192.168.160.0/19
2025-05-31 13:13:55 [ℹ]  nodegroup "node-group-al2" will use "" [AmazonLinux2/1.32]
2025-05-31 13:13:55 [ℹ]  using Kubernetes version 1.32
2025-05-31 13:13:55 [ℹ]  creating EKS cluster "test-cluster" in "ap-northeast-1" region with managed nodes
2025-05-31 13:13:55 [ℹ]  1 nodegroup (node-group-al2) was included (based on the include/exclude rules)
2025-05-31 13:13:55 [ℹ]  will create a CloudFormation stack for cluster itself and 1 managed nodegroup stack(s)
2025-05-31 13:13:55 [ℹ]  if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=ap-northeast-1 --cluster=test-cluster'
2025-05-31 13:13:55 [ℹ]  Kubernetes API endpoint access will use provided values {publicAccess=true, privateAccess=true} for cluster "test-cluster" in "ap-northeast-1"
2025-05-31 13:13:55 [ℹ]  configuring CloudWatch logging for cluster "test-cluster" in "ap-northeast-1" (enabled types: audit, authenticator, controllerManager, scheduler & disabled types: api)
2025-05-31 13:13:55 [ℹ]  default addons metrics-server were not specified, will install them as EKS addons
2025-05-31 13:13:55 [ℹ]
2 sequential tasks: { create cluster control plane "test-cluster",
    2 sequential sub-tasks: {
        5 sequential sub-tasks: {
            1 task: { create addons },
            wait for control plane to become ready,
            associate IAM OIDC provider,
            no tasks,
            update VPC CNI to use IRSA if required,
        },
        create managed nodegroup "node-group-al2",
    }
}
2025-05-31 13:13:55 [ℹ]  building cluster stack "eksctl-test-cluster-cluster"
2025-05-31 13:13:55 [ℹ]  deploying stack "eksctl-test-cluster-cluster"
2025-05-31 13:14:25 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:14:55 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:15:55 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:16:56 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:17:56 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:18:57 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:19:57 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:20:57 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:21:57 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:22:57 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:23:58 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-cluster"
2025-05-31 13:23:59 [ℹ]  creating addon: eks-pod-identity-agent
2025-05-31 13:24:00 [ℹ]  successfully created addon: eks-pod-identity-agent
2025-05-31 13:24:00 [ℹ]  "addonsConfig.autoApplyPodIdentityAssociations" is set to true; will lookup recommended pod identity configuration for "vpc-cni" addon
2025-05-31 13:24:01 [ℹ]  deploying stack "eksctl-test-cluster-addon-vpc-cni-podidentityrole-aws-node"
2025-05-31 13:24:01 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-addon-vpc-cni-podidentityrole-aws-node"
2025-05-31 13:24:31 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-addon-vpc-cni-podidentityrole-aws-node"
2025-05-31 13:24:31 [ℹ]  creating addon: vpc-cni
2025-05-31 13:24:33 [ℹ]  successfully created addon: vpc-cni
2025-05-31 13:24:33 [ℹ]  creating addon: coredns
2025-05-31 13:24:34 [ℹ]  successfully created addon: coredns
2025-05-31 13:24:34 [ℹ]  creating addon: kube-proxy
2025-05-31 13:24:35 [ℹ]  successfully created addon: kube-proxy
2025-05-31 13:24:35 [ℹ]  creating addon: metrics-server
2025-05-31 13:24:35 [ℹ]  successfully created addon: metrics-server
2025-05-31 13:26:38 [ℹ]  addon "vpc-cni" active
2025-05-31 13:26:39 [ℹ]  updating IAM resources stack "eksctl-test-cluster-addon-vpc-cni-podidentityrole-aws-node" for pod identity association "a-sxfqrigqa8dp5wie8"
2025-05-31 13:26:40 [ℹ]  waiting for CloudFormation changeset "eksctl--aws-node-update-1748665599" for stack "eksctl-test-cluster-addon-vpc-cni-podidentityrole-aws-node"
2025-05-31 13:26:40 [ℹ]  nothing to update
2025-05-31 13:26:40 [ℹ]  IAM resources for aws-node (pod identity association ID: a-sxfqrigqa8dp5wie8) are already up-to-date
2025-05-31 13:26:40 [ℹ]  updating addon
2025-05-31 13:26:51 [ℹ]  addon "vpc-cni" active
2025-05-31 13:26:51 [ℹ]  building managed nodegroup stack "eksctl-test-cluster-nodegroup-node-group-al2"
2025-05-31 13:26:52 [ℹ]  deploying stack "eksctl-test-cluster-nodegroup-node-group-al2"
2025-05-31 13:26:52 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-nodegroup-node-group-al2"
2025-05-31 13:27:22 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-nodegroup-node-group-al2"
2025-05-31 13:28:06 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-nodegroup-node-group-al2"
2025-05-31 13:29:08 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-nodegroup-node-group-al2"
2025-05-31 13:29:08 [ℹ]  waiting for the control plane to become ready
2025-05-31 13:29:09 [✔]  saved kubeconfig as "/Users/masukawa.kentaro/.kube/config"
2025-05-31 13:29:09 [ℹ]  no tasks
2025-05-31 13:29:09 [✔]  all EKS cluster resources for "test-cluster" have been created
2025-05-31 13:29:09 [ℹ]  nodegroup "node-group-al2" has 1 node(s)
2025-05-31 13:29:09 [ℹ]  node "ip-192-168-160-246.ap-northeast-1.compute.internal" is ready
2025-05-31 13:29:09 [ℹ]  waiting for at least 1 node(s) to become ready in "node-group-al2"
2025-05-31 13:29:09 [ℹ]  nodegroup "node-group-al2" has 1 node(s)
2025-05-31 13:29:09 [ℹ]  node "ip-192-168-160-246.ap-northeast-1.compute.internal" is ready
2025-05-31 13:29:09 [✔]  created 1 managed nodegroup(s) in cluster "test-cluster"
2025-05-31 13:29:10 [ℹ]  kubectl command should work with "/Users/xxxxxx/.kube/config", try 'kubectl get nodes'
2025-05-31 13:29:10 [✔]  EKS cluster "test-cluster" in "ap-northeast-1" region is ready

Even during creation, eksctl warns that "Amazon EKS will no longer publish EKS-optimized Amazon Linux 2 (AL2) AMIs after November 26th, 2025."
Now that the cluster is up, let's create an Nginx Deployment.

apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app.kubernetes.io/name: nginx
    spec:
      containers:
        - image: nginx
          imagePullPolicy: Always
          name: nginx
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "0.5"

It was created without issues.

% kubectl apply -f nginx.yaml
deployment.apps/nginx created

At this point, there is a single node group.

% eksctl get nodegroup --cluster test-cluster
CLUSTER       NODEGROUP       STATUS  CREATED               MIN SIZE  MAX SIZE  DESIRED CAPACITY  INSTANCE TYPE  IMAGE ID    ASG NAME                                                 TYPE
test-cluster  node-group-al2  ACTIVE  2025-05-31T04:27:18Z  1         1         1                 t3.medium      AL2_x86_64  eks-node-group-al2-08cb924c-0837-7482-4c94-1c3a8e6c144c  managed

There is also a single node.

% kubectl get node
NAME                                                 STATUS   ROLES    AGE     VERSION
ip-192-168-160-246.ap-northeast-1.compute.internal   Ready    <none>   5m27s   v1.32.3-eks-473151a

The Pod list looks like this.

% kubectl get pod -A
NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE
default       nginx-7c64dfbfdc-7r6xg            1/1     Running   0          2m6s
kube-system   aws-node-xwvd7                    2/2     Running   0          6m49s
kube-system   coredns-68b8b66bdb-7cchc          1/1     Running   0          9m3s
kube-system   coredns-68b8b66bdb-rkrw5          1/1     Running   0          9m3s
kube-system   eks-pod-identity-agent-2xq9q      1/1     Running   0          6m49s
kube-system   kube-proxy-q7q29                  1/1     Running   0          6m49s
kube-system   metrics-server-6c8c76d545-9kcp5   1/1     Running   0          9m2s
kube-system   metrics-server-6c8c76d545-kgnld   1/1     Running   0          9m2s

We have the Nginx Pod we deployed plus the Pods belonging to the EKS add-ons.

Migrating to the Amazon Linux 2023-based EKS optimized AMI

Now let's migrate to Amazon Linux 2023.
First, add a new node group to the eksctl config file.
eksctl also provides a command called eksctl upgrade nodegroup, which you can use when moving to a newer release version within Amazon Linux 2.

https://eksctl.io/usage/nodegroup-managed/#upgrading-managed-nodegroups

You check the most recent release and run the command with the --release-version option to perform the update.

eksctl upgrade nodegroup --name=node-group-al2 --cluster=test-cluster --release-version=1.32.3-20250519
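As an aside, the latest recommended AL2 release version can also be looked up from the public SSM parameters that EKS publishes for its optimized AMIs; a sketch (the parameter path below is for Kubernetes 1.32):

```shell
# Fetch the recommended AL2 AMI release version for EKS 1.32
aws ssm get-parameter \
  --name /aws/service/eks/optimized-ami/1.32/amazon-linux-2/recommended/release_version \
  --query 'Parameter.Value' --output text
```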

However, switching from Amazon Linux 2 to Amazon Linux 2023 counts as a configuration change, so the command above apparently cannot be used for it.
Instead, I added a new node group, node-group-al2023, as shown below.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: test-cluster
  region: ap-northeast-1
  version: "1.32"

iam:
  withOIDC: true

vpc:
  clusterEndpoints:
    publicAccess: true
    privateAccess: true

cloudWatch:
  clusterLogging:
    enableTypes:
      - "audit"
      - "authenticator"
      - "controllerManager"
      - "scheduler"

managedNodeGroups:
  - name: node-group-al2
    amiFamily: AmazonLinux2
    instanceTypes:
      - t3.medium
    maxSize: 1
    minSize: 1
    desiredCapacity: 1
    privateNetworking: true
    volumeType: gp3
    volumeSize: 50
  - name: node-group-al2023
    amiFamily: AmazonLinux2023
    instanceTypes:
      - t3.medium
    maxSize: 1
    minSize: 1
    desiredCapacity: 1
    privateNetworking: true
    volumeType: gp3
    volumeSize: 50

autoModeConfig:
  enabled: false

addons:
  - name: eks-pod-identity-agent
    version: latest
  - name: vpc-cni
    version: latest
    useDefaultPodIdentityAssociations: true
  - name: coredns
    version: latest
  - name: kube-proxy
    version: latest

Create the node group using the updated config file.

% eksctl create nodegroup -f cluster.yaml
2025-05-31 13:45:50 [!]  Amazon EKS will no longer publish EKS-optimized Amazon Linux 2 (AL2) AMIs after November 26th, 2025. Additionally, Kubernetes version 1.32 is the last version for which Amazon EKS will release AL2 AMIs. From version 1.33 onwards, Amazon EKS will continue to release AL2023 and Bottlerocket based AMIs. The default AMI family when creating clusters and nodegroups in Eksctl will be changed to AL2023 in the future.
2025-05-31 13:45:53 [ℹ]  nodegroup "node-group-al2" will use "" [AmazonLinux2/1.32]
2025-05-31 13:45:53 [ℹ]  nodegroup "node-group-al2023" will use "" [AmazonLinux2023/1.32]
2025-05-31 13:45:53 [ℹ]  1 existing nodegroup(s) (node-group-al2) will be excluded
2025-05-31 13:45:53 [ℹ]  1 nodegroup (node-group-al2023) was included (based on the include/exclude rules)
2025-05-31 13:45:53 [ℹ]  will create a CloudFormation stack for each of 1 managed nodegroups in cluster "test-cluster"
2025-05-31 13:45:53 [ℹ]
2 sequential tasks: { fix cluster compatibility, 1 task: { 1 task: { create managed nodegroup "node-group-al2023" } }
}
2025-05-31 13:45:53 [ℹ]  checking cluster stack for missing resources
2025-05-31 13:45:53 [ℹ]  cluster stack has all required resources
2025-05-31 13:45:53 [ℹ]  building managed nodegroup stack "eksctl-test-cluster-nodegroup-node-group-al2023"
2025-05-31 13:45:54 [ℹ]  deploying stack "eksctl-test-cluster-nodegroup-node-group-al2023"
2025-05-31 13:45:54 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-nodegroup-node-group-al2023"
2025-05-31 13:46:24 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-nodegroup-node-group-al2023"
2025-05-31 13:47:23 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-nodegroup-node-group-al2023"
2025-05-31 13:48:00 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-nodegroup-node-group-al2023"
2025-05-31 13:49:26 [ℹ]  waiting for CloudFormation stack "eksctl-test-cluster-nodegroup-node-group-al2023"
2025-05-31 13:49:26 [ℹ]  no tasks
2025-05-31 13:49:26 [✔]  created 0 nodegroup(s) in cluster "test-cluster"
2025-05-31 13:49:26 [ℹ]  nodegroup "node-group-al2023" has 1 node(s)
2025-05-31 13:49:26 [ℹ]  node "ip-192-168-163-109.ap-northeast-1.compute.internal" is ready
2025-05-31 13:49:26 [ℹ]  waiting for at least 1 node(s) to become ready in "node-group-al2023"
2025-05-31 13:49:26 [ℹ]  nodegroup "node-group-al2023" has 1 node(s)
2025-05-31 13:49:26 [ℹ]  node "ip-192-168-163-109.ap-northeast-1.compute.internal" is ready
2025-05-31 13:49:26 [✔]  created 1 managed nodegroup(s) in cluster "test-cluster"
2025-05-31 13:49:27 [ℹ]  checking security group configuration for all nodegroups
2025-05-31 13:49:27 [ℹ]  all nodegroups have up-to-date cloudformation templates

The node group using Amazon Linux 2023 has been created.

% eksctl get nodegroup --cluster test-cluster
CLUSTER       NODEGROUP          STATUS  CREATED               MIN SIZE  MAX SIZE  DESIRED CAPACITY  INSTANCE TYPE  IMAGE ID                ASG NAME                                                    TYPE
test-cluster  node-group-al2     ACTIVE  2025-05-31T04:27:18Z  1         1         1                 t3.medium      AL2_x86_64              eks-node-group-al2-08cb924c-0837-7482-4c94-1c3a8e6c144c     managed
test-cluster  node-group-al2023  ACTIVE  2025-05-31T04:46:20Z  1         1         1                 t3.medium      AL2023_x86_64_STANDARD  eks-node-group-al2023-decb9254-bec6-db67-fef2-f05903b6fdbd  managed

A new node has been created along with it.

% kubectl get node
NAME                                                 STATUS   ROLES    AGE   VERSION
ip-192-168-160-246.ap-northeast-1.compute.internal   Ready    <none>   30m   v1.32.3-eks-473151a
ip-192-168-163-109.ap-northeast-1.compute.internal   Ready    <none>   11m   v1.32.3-eks-473151a

At this point, the Pods are still mostly running on the old node.
Only resources deployed as DaemonSets, such as the Amazon VPC CNI, kube-proxy, and the Pod Identity Agent, are also placed on the new node.
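To look at just the Pods on a single node, a field selector is handy (the node name is taken from the node list above):

```shell
# All Pods currently scheduled on the old AL2 node
kubectl get pod -A -o wide \
  --field-selector spec.nodeName=ip-192-168-160-246.ap-northeast-1.compute.internal
```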

% kubectl get pod -o wide
NAME                     READY   STATUS    RESTARTS   AGE   IP                NODE                                                 NOMINATED NODE   READINESS GATES
nginx-7c64dfbfdc-7r6xg   1/1     Running   0          26m   192.168.167.212   ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
% kubectl get pod -o wide -A
NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE   IP                NODE                                                 NOMINATED NODE   READINESS GATES
default       nginx-7c64dfbfdc-7r6xg            1/1     Running   0          26m   192.168.167.212   ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   aws-node-q2bs4                    2/2     Running   0          11m   192.168.163.109   ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   aws-node-xwvd7                    2/2     Running   0          31m   192.168.160.246   ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   coredns-68b8b66bdb-7cchc          1/1     Running   0          33m   192.168.181.65    ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   coredns-68b8b66bdb-rkrw5          1/1     Running   0          33m   192.168.170.170   ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   eks-pod-identity-agent-2xq9q      1/1     Running   0          31m   192.168.160.246   ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   eks-pod-identity-agent-m5485      1/1     Running   0          11m   192.168.163.109   ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   kube-proxy-bn7ks                  1/1     Running   0          11m   192.168.163.109   ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   kube-proxy-q7q29                  1/1     Running   0          31m   192.168.160.246   ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   metrics-server-6c8c76d545-9kcp5   1/1     Running   0          33m   192.168.171.8     ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   metrics-server-6c8c76d545-kgnld   1/1     Running   0          33m   192.168.179.34    ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>

Before deleting the old node group, let's move the Pods over to the new node.
First, cordon the old node so that no new Pods get scheduled onto it.

% kubectl cordon ip-192-168-160-246.ap-northeast-1.compute.internal
node/ip-192-168-160-246.ap-northeast-1.compute.internal cordoned

Confirm that the node now shows SchedulingDisabled.

% kubectl get node
NAME                                                 STATUS                     ROLES    AGE   VERSION
ip-192-168-160-246.ap-northeast-1.compute.internal   Ready,SchedulingDisabled   <none>   39m   v1.32.3-eks-473151a
ip-192-168-163-109.ap-northeast-1.compute.internal   Ready                      <none>   20m   v1.32.3-eks-473151a

Next, drain the old node to move the Pods off it.
In this case, metrics-server uses an emptyDir volume, so the command returned an error.

% kubectl drain ip-192-168-160-246.ap-northeast-1.compute.internal --ignore-daemonsets
node/ip-192-168-160-246.ap-northeast-1.compute.internal already cordoned
error: unable to drain node "ip-192-168-160-246.ap-northeast-1.compute.internal" due to error: cannot delete Pods with local storage (use --delete-emptydir-data to override): kube-system/metrics-server-6c8c76d545-9kcp5, kube-system/metrics-server-6c8c76d545-kgnld, continuing command...
There are pending nodes to be drained:
 ip-192-168-160-246.ap-northeast-1.compute.internal
cannot delete Pods with local storage (use --delete-emptydir-data to override): kube-system/metrics-server-6c8c76d545-9kcp5, kube-system/metrics-server-6c8c76d545-kgnld

Nothing in there would be a problem to lose, so I re-run the command with the --delete-emptydir-data option.

% kubectl drain ip-192-168-160-246.ap-northeast-1.compute.internal --ignore-daemonsets --delete-emptydir-data
node/ip-192-168-160-246.ap-northeast-1.compute.internal already cordoned
Warning: ignoring DaemonSet-managed Pods: kube-system/aws-node-xwvd7, kube-system/eks-pod-identity-agent-2xq9q, kube-system/kube-proxy-q7q29
evicting pod kube-system/metrics-server-6c8c76d545-kgnld
evicting pod kube-system/coredns-68b8b66bdb-7cchc
evicting pod kube-system/coredns-68b8b66bdb-rkrw5
evicting pod default/nginx-7c64dfbfdc-7r6xg
evicting pod kube-system/metrics-server-6c8c76d545-9kcp5
error when evicting pods/"coredns-68b8b66bdb-7cchc" -n "kube-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
error when evicting pods/"metrics-server-6c8c76d545-kgnld" -n "kube-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
pod/nginx-7c64dfbfdc-7r6xg evicted
pod/metrics-server-6c8c76d545-9kcp5 evicted
evicting pod kube-system/coredns-68b8b66bdb-7cchc
evicting pod kube-system/metrics-server-6c8c76d545-kgnld
error when evicting pods/"metrics-server-6c8c76d545-kgnld" -n "kube-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
pod/coredns-68b8b66bdb-rkrw5 evicted
evicting pod kube-system/metrics-server-6c8c76d545-kgnld
error when evicting pods/"metrics-server-6c8c76d545-kgnld" -n "kube-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
pod/coredns-68b8b66bdb-7cchc evicted
evicting pod kube-system/metrics-server-6c8c76d545-kgnld
error when evicting pods/"metrics-server-6c8c76d545-kgnld" -n "kube-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod kube-system/metrics-server-6c8c76d545-kgnld
error when evicting pods/"metrics-server-6c8c76d545-kgnld" -n "kube-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod kube-system/metrics-server-6c8c76d545-kgnld
error when evicting pods/"metrics-server-6c8c76d545-kgnld" -n "kube-system" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod kube-system/metrics-server-6c8c76d545-kgnld
pod/metrics-server-6c8c76d545-kgnld evicted
node/ip-192-168-160-246.ap-northeast-1.compute.internal drained

Several errors appear, but they are harmless: drain is simply retrying while honoring the Pod Disruption Budgets.
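The retries are driven by the PodDisruptionBudgets that the coredns and metrics-server add-ons ship with; you can inspect them before draining:

```shell
# PDBs in all namespaces; ALLOWED DISRUPTIONS shows how many Pods
# may be evicted at once, which explains the 5-second retry loop above.
kubectl get poddisruptionbudget -A
```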
After the command completes, confirm that every Pod except those deployed via DaemonSets has moved to the Amazon Linux 2023 node.

% kubectl get pod -o wide -A
NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE   IP                NODE                                                 NOMINATED NODE   READINESS GATES
default       nginx-7c64dfbfdc-cs9kd            1/1     Running   0          89s   192.168.169.4     ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   aws-node-q2bs4                    2/2     Running   0          23m   192.168.163.109   ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   aws-node-xwvd7                    2/2     Running   0          42m   192.168.160.246   ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   coredns-68b8b66bdb-dc596          1/1     Running   0          89s   192.168.191.8     ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   coredns-68b8b66bdb-jl4lp          1/1     Running   0          84s   192.168.173.6     ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   eks-pod-identity-agent-2xq9q      1/1     Running   0          42m   192.168.160.246   ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   eks-pod-identity-agent-m5485      1/1     Running   0          23m   192.168.163.109   ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   kube-proxy-bn7ks                  1/1     Running   0          23m   192.168.163.109   ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   kube-proxy-q7q29                  1/1     Running   0          42m   192.168.160.246   ip-192-168-160-246.ap-northeast-1.compute.internal   <none>           <none>
kube-system   metrics-server-6c8c76d545-ds6nl   1/1     Running   0          89s   192.168.178.91    ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>
kube-system   metrics-server-6c8c76d545-nmkch   1/1     Running   0          58s   192.168.184.155   ip-192-168-163-109.ap-northeast-1.compute.internal   <none>           <none>

The DaemonSet-managed Pods (AWS VPC CNI, kube-proxy, Pod Identity Agent, and so on) have counterparts already running on the new node, so it is safe to delete the old node as-is.
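As a final check before deleting the node group, you can confirm that nothing except DaemonSet-managed Pods remains on the old node; a sketch, assuming jq is installed:

```shell
# List any Pod on the old node that is NOT owned by a DaemonSet.
# Empty output means it is safe to delete the node group.
NODE=ip-192-168-160-246.ap-northeast-1.compute.internal
kubectl get pod -A --field-selector spec.nodeName="$NODE" -o json \
  | jq -r '.items[]
           | select((.metadata.ownerReferences // [])[0].kind != "DaemonSet")
           | "\(.metadata.namespace)/\(.metadata.name)"'
```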
Now let's delete the old node group.
Running the command without the --approve option shows which node groups would be deleted.

% eksctl delete nodegroup -f cluster.yaml --include=node-group-al2 --disable-eviction
2025-05-31 14:13:10 [ℹ]  comparing 0 nodegroups defined in the given config ("cluster.yaml") against remote state
2025-05-31 14:13:10 [ℹ]  combined include rules: node-group-al2
2025-05-31 14:13:10 [ℹ]  1 nodegroup (node-group-al2) was included (based on the include/exclude rules)
2025-05-31 14:13:10 [ℹ]  (plan) would drain 1 nodegroup(s) in cluster "test-cluster"
2025-05-31 14:13:10 [ℹ]  starting parallel draining, max in-flight of 1
2025-05-31 14:13:10 [ℹ]  (plan) would delete 1 nodegroups from cluster "test-cluster"
2025-05-31 14:13:11 [ℹ]  1 task: { 1 task: { delete nodegroup "node-group-al2" [async] } }
2025-05-31 14:13:11 [✔]  (plan) would have deleted 1 nodegroup(s) from cluster "test-cluster"
2025-05-31 14:13:11 [!]  no changes were applied, run again with '--approve' to apply the changes

Deleting node-group-al2 is exactly what we want, so run the command again with the --approve option.
This command would normally drain the nodes for you, but it was not very flexible — for example, it errored out trying to evict DaemonSet Pods — so I drained with kubectl beforehand.
That is why the --disable-eviction option is passed here.

% eksctl delete nodegroup -f cluster.yaml --include=node-group-al2 --disable-eviction --approve
2025-05-31 14:13:40 [ℹ]  comparing 0 nodegroups defined in the given config ("cluster.yaml") against remote state
2025-05-31 14:13:40 [ℹ]  combined include rules: node-group-al2
2025-05-31 14:13:40 [ℹ]  1 nodegroup (node-group-al2) was included (based on the include/exclude rules)
2025-05-31 14:13:40 [ℹ]  will drain 1 nodegroup(s) in cluster "test-cluster"
2025-05-31 14:13:40 [ℹ]  starting parallel draining, max in-flight of 1
2025-05-31 14:13:40 [✔]  drained all nodes: [ip-192-168-160-246.ap-northeast-1.compute.internal]
2025-05-31 14:13:40 [ℹ]  will delete 1 nodegroups from cluster "test-cluster"
2025-05-31 14:13:40 [ℹ]  1 task: { 1 task: { delete nodegroup "node-group-al2" [async] } }
2025-05-31 14:13:40 [ℹ]  will delete stack "eksctl-test-cluster-nodegroup-node-group-al2"
2025-05-31 14:13:40 [✔]  deleted 1 nodegroup(s) from cluster "test-cluster"

The node group using AL2_x86_64 now shows DELETING.

% eksctl get nodegroup --cluster test-cluster
CLUSTER       NODEGROUP          STATUS    CREATED               MIN SIZE  MAX SIZE  DESIRED CAPACITY  INSTANCE TYPE  IMAGE ID                ASG NAME                                                    TYPE
test-cluster  node-group-al2     DELETING  2025-05-31T04:27:18Z  1         1         1                 t3.medium      AL2_x86_64              eks-node-group-al2-08cb924c-0837-7482-4c94-1c3a8e6c144c     managed
test-cluster  node-group-al2023  ACTIVE    2025-05-31T04:46:20Z  1         1         1                 t3.medium      AL2023_x86_64_STANDARD  eks-node-group-al2023-decb9254-bec6-db67-fef2-f05903b6fdbd  managed

After a while, it was gone completely!

% eksctl get nodegroup --cluster test-cluster
CLUSTER       NODEGROUP          STATUS  CREATED               MIN SIZE  MAX SIZE  DESIRED CAPACITY  INSTANCE TYPE  IMAGE ID                ASG NAME                                                    TYPE
test-cluster  node-group-al2023  ACTIVE  2025-05-31T04:46:20Z  1         1         1                 t3.medium      AL2023_x86_64_STANDARD  eks-node-group-al2023-decb9254-bec6-db67-fef2-f05903b6fdbd  managed

Finally, remove the old node group from the config file as well; the final version looks like this.

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: test-cluster
  region: ap-northeast-1
  version: "1.32"

iam:
  withOIDC: true

vpc:
  clusterEndpoints:
    publicAccess: true
    privateAccess: true

cloudWatch:
  clusterLogging:
    enableTypes:
      - "audit"
      - "authenticator"
      - "controllerManager"
      - "scheduler"

managedNodeGroups:
  - name: node-group-al2023
    amiFamily: AmazonLinux2023
    instanceTypes:
      - t3.medium
    maxSize: 1
    minSize: 1
    desiredCapacity: 1
    privateNetworking: true
    volumeType: gp3
    volumeSize: 50

autoModeConfig:
  enabled: false

addons:
  - name: eks-pod-identity-agent
    version: latest
  - name: vpc-cni
    version: latest
    useDefaultPodIdentityAssociations: true
  - name: coredns
    version: latest
  - name: kube-proxy
    version: latest

Wrapping up

Keep in mind that the end-of-support date for Amazon Linux 2 itself differs from the end-of-support date for the Amazon Linux 2-based EKS optimized AMIs.
Here I migrated only the nodes, but it may also make sense to do this migration as part of upgrading to EKS 1.33.
With a well-prepared runbook, this migration should fit into your normal operational process.


© Classmethod, Inc. All rights reserved.