[Update] Auto Mode has been added to EKS, so I tried it out #AWSreInvent

AWS re:Invent 2024
2024.12.02
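Auto Mode is now available for Amazon Elastic Kubernetes Service (hereafter EKS)!

Until now, EKS provided a Kubernetes cluster in a fairly bare state, and you had to set up add-ons and the like before using it.

In addition, AWS managed only the control plane; the compute side, where workloads actually run, had to be managed by the user.

With Auto Mode, these tasks can be offloaded to AWS, so both initial cluster setup and day-to-day operation take less effort.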
What's new

https://aws.amazon.com/jp/about-aws/whats-new/2024/12/amazon-eks-auto-mode/
Official blog

https://aws.amazon.com/jp/blogs/aws/streamline-kubernetes-cluster-management-with-new-amazon-eks-auto-mode/?trk=d57158fd-77e3-423f-9e1e-005fd2a64d89&sc_channel=el
Checking the documentation
I picked out the notable parts of the official documentation.
Amazon EKS Auto Mode is responsible for creating, deleting, and patching EC2 Instances. You are responsible for the containers and pods deployed on the instance.

https://docs.aws.amazon.com/eks/latest/userguide/automode-learn-instances.html
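The documentation states explicitly that creating, deleting, and patching the EC2 instances that host workloads is the responsibility of EKS.

That leaves more time for developing the applications that run on Kubernetes, which is great.

With Auto Mode, EKS chooses the AMI, instance type, and instance family to use.

EKS manages the nodes, but they are really EC2 instances, and they do show up in the Management Console.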
AWS suggests running either EKS Auto Mode or self-managed Karpenter. You can install both during a migration or in an advanced configuration. If you have both installed, configure your node pools so that workloads are associated with either Karpenter or EKS Auto Mode.

https://docs.aws.amazon.com/eks/latest/userguide/automode-learn-instances.html
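Also, because Auto Mode serves the same purpose as Karpenter, the recommendation is to use one or the other rather than both.
Using Network Policy requires additional configuration.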

https://docs.aws.amazon.com/eks/latest/userguide/auto-net-pol.html
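Fargate cannot be used; nodes are chosen from a specific set of suitable EC2 instance types.

That said, AWS has taken on patching as its responsibility, and image caching may also take effect on EC2 nodes, so if anything this is even better.

GPU instances are also available.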

https://docs.aws.amazon.com/eks/latest/userguide/automode-learn-instances.html
You cannot log in to the EC2 nodes via SSH or Session Manager, but you can view node logs by installing a dedicated component.

https://docs.aws.amazon.com/eks/latest/userguide/auto-get-logs.html
Trying it out
Let's create a cluster right away.

From the documentation, it looks like eksctl can be used, but as of 12:00 JST on 2024/12/2 I could not create a cluster with it yet.
Error: loading config file "cluster.yaml": error unmarshaling JSON: while decoding JSON: json: unknown field "autoModeConfig"
https://github.com/eksctl-io/eksctl/pull/8058
So I will create the cluster from the Management Console.
[Addendum]

A few hours later, eksctl added support (v0.195.0).
https://dev.classmethod.jp/articles/create-auto-mode-eks-cluster-with-eksctl/
Granting permissions
First, create IAM roles for the cluster and for the nodes.
Create the cluster IAM role, named AmazonEKSAutoClusterRole, with the following policies attached.
AmazonEKSComputePolicy
AmazonEKSBlockStoragePolicy
AmazonEKSLoadBalancingPolicy
AmazonEKSNetworkingPolicy
AmazonEKSClusterPolicy
When creating the role in the Management Console, selecting EKS - Auto Cluster makes this easy.
Create the node IAM role, named AmazonEKSAutoNodeRole, with the following policies attached.
AmazonEKSWorkerNodeMinimalPolicy
AmazonEC2ContainerRegistryPullOnly
When creating the role in the Management Console, selecting EKS - Auto Node makes this easy.
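For readers who prefer to script the two roles rather than click through the console, the following is a minimal sketch of the role definitions as plain data. The policy names come from the lists above. The trust principals, the `sts:TagSession` action, and the `arn:aws:iam::aws:policy/` prefix are assumptions not stated in this article, and `role_definition` is a hypothetical helper, so check them against the official documentation before use.

```python
import json

# Assumption: AWS managed policies are addressed as arn:aws:iam::aws:policy/<PolicyName>.
MANAGED_POLICY_PREFIX = "arn:aws:iam::aws:policy/"

# Policy names exactly as listed in this article.
CLUSTER_ROLE_POLICIES = [
    "AmazonEKSComputePolicy",
    "AmazonEKSBlockStoragePolicy",
    "AmazonEKSLoadBalancingPolicy",
    "AmazonEKSNetworkingPolicy",
    "AmazonEKSClusterPolicy",
]
NODE_ROLE_POLICIES = [
    "AmazonEKSWorkerNodeMinimalPolicy",
    "AmazonEC2ContainerRegistryPullOnly",
]


def trust_policy(service, actions):
    """Build a trust policy letting one AWS service principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service},
                "Action": actions,
            }
        ],
    }


def role_definition(name, service, actions, policy_names):
    """Collect everything needed to create one role (hypothetical helper)."""
    return {
        "RoleName": name,
        "AssumeRolePolicyDocument": trust_policy(service, actions),
        "ManagedPolicyArns": [MANAGED_POLICY_PREFIX + p for p in policy_names],
    }


# Assumption: the trust principals and actions below are not stated in the article.
cluster_role = role_definition(
    "AmazonEKSAutoClusterRole", "eks.amazonaws.com",
    ["sts:AssumeRole", "sts:TagSession"], CLUSTER_ROLE_POLICIES)
node_role = role_definition(
    "AmazonEKSAutoNodeRole", "ec2.amazonaws.com",
    ["sts:AssumeRole"], NODE_ROLE_POLICIES)

print(json.dumps(cluster_role, indent=2))
print(json.dumps(node_role, indent=2))
```

The sketch only prints the definitions; feeding them to an SDK or the AWS CLI is left out on purpose so that nothing here depends on credentials.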
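Creating the cluster
Create the EKS cluster, specifying the IAM roles created above.
The detailed settings are as follows.
About 10 minutes later, cluster creation completed.

At this point, setup of components such as the EBS CSI driver should already be complete, yet nothing is managed as an add-on.
Obtain credentials for operating Kubernetes, then run kubectl commands.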
aws eks update-kubeconfig --name funny-disco-monster --region ap-northeast-1
In the initial state, only the node pools existed; there were no actual nodes (EC2 instances).
$ kubectl get nodepools
NAME              NODECLASS   NODES   READY   AGE
general-purpose   default     0       True    8m12s
system            default     0       True    8m12s
$ kubectl get nodes
No resources found
Creating the Ingress resource
Following the scenario below, I will create an Ingress resource and expose the application through an ALB.
https://docs.aws.amazon.com/eks/latest/userguide/auto-elb-example.html
Create the namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: game-2048
Create the Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: game-2048
  name: deployment-2048
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: app-2048
  replicas: 5
  template:
    metadata:
      labels:
        app.kubernetes.io/name: app-2048
    spec:
      containers:
        - image: public.ecr.aws/l6m2t8p7/docker-2048:latest
          imagePullPolicy: Always
          name: app-2048
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "0.5"
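This time, a c5a.xlarge instance (4 vCPUs, 8 GiB of memory) was launched.

With five Pods each requesting 0.5 vCPU and no memory requests, that seems reasonable.

The detailed selection logic can probably be understood by looking at Karpenter, but I will not go that deep here.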
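The sizing reasoning above can be checked with a few lines of arithmetic. This is only a sketch of the request total, not the actual selection logic used by EKS Auto Mode or Karpenter; `parse_cpu` and `total_cpu_request` are hypothetical helper names.

```python
from fractions import Fraction


def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity ("0.5", "500m", "2") to vCPUs."""
    if quantity.endswith("m"):
        return Fraction(int(quantity[:-1]), 1000)
    return Fraction(quantity)


def total_cpu_request(per_pod, replicas):
    """Sum the CPU requests of identical Pods."""
    return parse_cpu(per_pod) * replicas


requested = total_cpu_request("0.5", 5)   # values from the Deployment above
instance_vcpus = 4                         # c5a.xlarge, as stated in the article

print(f"requested: {float(requested)} vCPU of {instance_vcpus} vCPU")
print("fits on one node:", requested <= instance_vcpus)
```

The total is 2.5 vCPU, which exceeds a 2 vCPU instance and fits within 4 vCPUs. Note that a node's allocatable CPU is somewhat lower than its vCPU count because of system reservations, which this sketch ignores.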
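This time, it took about 50 seconds from running kubectl apply until the Pods' status became Running.

Considering that the EC2 instance is created only after the request arrives, that is remarkably fast.

As with Karpenter, the EC2 instance is launched by a direct API call made with AmazonEKSAutoClusterRole, without going through Auto Scaling. The following RunInstances API call was recorded.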
{
    "eventVersion": "1.10",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AROAW3MEE5OCLDZCHUPZW:aws-go-sdk-1733103336091287010",
        "arn": "arn:aws:sts::xxxxxxxxxxxx:assumed-role/AmazonEKSAutoClusterRole/aws-go-sdk-1733103336091287010",
        "accountId": "xxxxxxxxxxxx",
        "sessionContext": {
            "sessionIssuer": {
                "type": "Role",
                "principalId": "AROAW3MEE5OCLDZCHUPZW",
                "arn": "arn:aws:iam::xxxxxxxxxxxx:role/AmazonEKSAutoClusterRole",
                "accountId": "xxxxxxxxxxxx",
                "userName": "AmazonEKSAutoClusterRole"
            },
            "attributes": {
                "creationDate": "2024-12-02T01:35:36Z",
                "mfaAuthenticated": "false"
            }
        },
        "invokedBy": "eks.amazonaws.com"
    },
    "eventTime": "2024-12-02T01:35:38Z",
    "eventSource": "ec2.amazonaws.com",
    "eventName": "RunInstances",
    "awsRegion": "ap-northeast-1",
    "sourceIPAddress": "eks.amazonaws.com",
    "userAgent": "eks.amazonaws.com",
    "requestParameters": {
        "instancesSet": {
            "items": [
                {
                    "imageId": "ami-0e5a7734938a8bf25",
                    "minCount": 1,
                    "maxCount": 1
                }
            ]
        },
        "instanceType": "c5a.xlarge",
        "blockDeviceMapping": {},
        "availabilityZone": "ap-northeast-1c",
        "monitoring": {
            "enabled": false
        },
        "disableApiTermination": false,
        "disableApiStop": false,
        "clientToken": "fleet-ec1d9e87-232f-643e-8eba-ad222271d72e-0",
        "networkInterfaceSet": {
            "items": [
                {
                    "deviceIndex": 0,
                    "subnetId": "subnet-0e303425b994bd47e"
                }
            ]
        },
        "tagSpecificationSet": {
            "items": [
                {
                    "resourceType": "instance",
                    "tags": [
                        {
                            "key": "eks:eks-cluster-name",
                            "value": "funny-disco-monster"
                        },
                        {
                            "key": "eks:kubernetes-node-pool-name",
                            "value": "general-purpose"
                        },
                        {
                            "key": "eks:kubernetes-node-class-name",
                            "value": "default"
                        },
                        {
                            "key": "aws:ec2:managed-launch",
                            "value": "eks-auto"
                        },
                        {
                            "key": "kubernetes.io/cluster/funny-disco-monster",
                            "value": "owned"
                        },
                        {
                            "key": "aws:ec2:fleet-id",
                            "value": "fleet-ec1d9e87-232f-643e-8eba-ad222271d72e"
                        }
                    ]
                },
                {
                    "resourceType": "volume",
                    "tags": [
                        {
                            "key": "eks:kubernetes-node-class-name",
                            "value": "default"
                        },
                        {
                            "key": "aws:ec2:managed-launch",
                            "value": "eks-auto"
                        },
                        {
                            "key": "kubernetes.io/cluster/funny-disco-monster",
                            "value": "owned"
                        },
                        {
                            "key": "eks:eks-cluster-name",
                            "value": "funny-disco-monster"
                        },
                        {
                            "key": "eks:kubernetes-node-pool-name",
                            "value": "general-purpose"
                        }
                    ]
                }
            ]
        },
        "launchTemplate": {
            "launchTemplateId": "lt-01c81b26df2d10adc",
            "version": "1"
        },
        "operator": {
            "account": "xxxxxxxxxxxx"
        }
    },
    "responseElements": {
        "requestId": "8b4e560a-f5f2-4932-a94d-6f9ce35b9034",
        "reservationId": "r-00d9e59f9bf434328",
        "ownerId": "xxxxxxxxxxxx",
        "groupSet": {},
        "instancesSet": {
            "items": [
                {
                    "instanceId": "i-012fe632c846b693a",
                    "imageId": "ami-0e5a7734938a8bf25",
                    "bootMode": "uefi-preferred",
                    "currentInstanceBootMode": "uefi",
                    "instanceState": {
                        "code": 0,
                        "name": "pending"
                    },
                    "privateDnsName": "ip-10-0-11-17.ap-northeast-1.compute.internal",
                    "operator": {
                        "managed": true,
                        "principal": "eks.amazonaws.com"
                    },
                    "amiLaunchIndex": 0,
                    "productCodes": {},
                    "instanceType": "c5a.xlarge",
                    "launchTime": 1733103338000,
                    "placement": {
                        "availabilityZone": "ap-northeast-1c",
                        "tenancy": "default"
                    },
                    "monitoring": {
                        "state": "disabled"
                    },
                    "subnetId": "subnet-0e303425b994bd47e",
                    "vpcId": "vpc-0e586090838ae135d",
                    "privateIpAddress": "10.0.11.17",
                    "stateReason": {
                        "code": "pending",
                        "message": "pending"
                    },
                    "architecture": "x86_64",
                    "rootDeviceType": "ebs",
                    "rootDeviceName": "/dev/xvda",
                    "blockDeviceMapping": {},
                    "virtualizationType": "hvm",
                    "hypervisor": "xen",
                    "tagSet": {
                        "items": [
                            {
                                "key": "eks:kubernetes-node-pool-name",
                                "value": "general-purpose"
                            },
                            {
                                "key": "kubernetes.io/cluster/funny-disco-monster",
                                "value": "owned"
                            },
                            {
                                "key": "eks:eks-cluster-name",
                                "value": "funny-disco-monster"
                            },
                            {
                                "key": "aws:ec2launchtemplate:version",
                                "value": "1"
                            },
                            {
                                "key": "aws:ec2:fleet-id",
                                "value": "fleet-ec1d9e87-232f-643e-8eba-ad222271d72e"
                            },
                            {
                                "key": "eks:kubernetes-node-class-name",
                                "value": "default"
                            },
                            {
                                "key": "aws:ec2launchtemplate:id",
                                "value": "lt-01c81b26df2d10adc"
                            },
                            {
                                "key": "aws:ec2:managed-launch",
                                "value": "eks-auto"
                            }
                        ]
                    },
                    "clientToken": "fleet-ec1d9e87-232f-643e-8eba-ad222271d72e-0",
                    "groupSet": {
                        "items": [
                            {
                                "groupId": "sg-0dc454d519c121ee3",
                                "groupName": "eks-cluster-sg-funny-disco-monster-252732804"
                            }
                        ]
                    },
                    "sourceDestCheck": true,
                    "networkInterfaceSet": {
                        "items": [
                            {
                                "networkInterfaceId": "eni-0b721e21d11ef9030",
                                "subnetId": "subnet-0e303425b994bd47e",
                                "vpcId": "vpc-0e586090838ae135d",
                                "ownerId": "xxxxxxxxxxxx",
                                "operator": {
                                    "managed": true,
                                    "principal": "eks.amazonaws.com"
                                },
                                "status": "in-use",
                                "macAddress": "0a:d4:de:a9:2c:7d",
                                "privateIpAddress": "10.0.11.17",
                                "privateDnsName": "ip-10-0-11-17.ap-northeast-1.compute.internal",
                                "sourceDestCheck": true,
                                "interfaceType": "interface",
                                "groupSet": {
                                    "items": [
                                        {
                                            "groupId": "sg-0dc454d519c121ee3",
                                            "groupName": "eks-cluster-sg-funny-disco-monster-252732804"
                                        }
                                    ]
                                },
                                "attachment": {
                                    "attachmentId": "eni-attach-0202e58439b66fe04",
                                    "deviceIndex": 0,
                                    "networkCardIndex": 0,
                                    "status": "attaching",
                                    "attachTime": 1733103338000,
                                    "deleteOnTermination": true
                                },
                                "privateIpAddressesSet": {
                                    "item": [
                                        {
                                            "privateIpAddress": "10.0.11.17",
                                            "privateDnsName": "ip-10-0-11-17.ap-northeast-1.compute.internal",
                                            "primary": true
                                        }
                                    ]
                                },
                                "ipv6AddressesSet": {},
                                "tagSet": {}
                            }
                        ]
                    },
                    "iamInstanceProfile": {
                        "arn": "arn:aws:iam::xxxxxxxxxxxx:instance-profile/eks-ap-northeast-1-funny-disco-monster-6941428323739926734",
                        "id": "AIPAW3MEE5OCN57SYILZ4"
                    },
                    "ebsOptimized": false,
                    "enaSupport": true,
                    "cpuOptions": {
                        "coreCount": 2,
                        "threadsPerCore": 2
                    },
                    "capacityReservationSpecification": {
                        "capacityReservationPreference": "open"
                    },
                    "enclaveOptions": {
                        "enabled": false
                    },
                    "metadataOptions": {
                        "state": "pending",
                        "httpTokens": "required",
                        "httpPutResponseHopLimit": 1,
                        "httpEndpoint": "enabled",
                        "httpProtocolIpv4": "enabled",
                        "httpProtocolIpv6": "disabled",
                        "instanceMetadataTags": "disabled"
                    },
                    "maintenanceOptions": {
                        "autoRecovery": "default"
                    },
                    "privateDnsNameOptions": {
                        "hostnameType": "ip-name",
                        "enableResourceNameDnsARecord": false,
                        "enableResourceNameDnsAAAARecord": false
                    }
                }
            ]
        },
        "requesterId": "216214693835"
    },
    "requestID": "8b4e560a-f5f2-4932-a94d-6f9ce35b9034",
    "eventID": "6678cdfe-8b9c-4352-9aa6-eb50a9c8c697",
    "readOnly": false,
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "recipientAccountId": "xxxxxxxxxxxx",
    "eventCategory": "Management"
}
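The event above is long, so it can help to pull out just the fields that show who launched the instance and how it was tagged. The sketch below runs against a trimmed copy of that event (only the fields it reads are kept); `summarize_run_instances` is a hypothetical helper, not part of any AWS SDK.

```python
import json

# Trimmed copy of the RunInstances event above; only the fields read below are kept.
EVENT_JSON = """
{
  "userIdentity": {
    "sessionContext": {"sessionIssuer": {"userName": "AmazonEKSAutoClusterRole"}},
    "invokedBy": "eks.amazonaws.com"
  },
  "eventName": "RunInstances",
  "requestParameters": {
    "instanceType": "c5a.xlarge",
    "availabilityZone": "ap-northeast-1c",
    "tagSpecificationSet": {"items": [
      {"resourceType": "instance", "tags": [
        {"key": "eks:eks-cluster-name", "value": "funny-disco-monster"},
        {"key": "eks:kubernetes-node-pool-name", "value": "general-purpose"},
        {"key": "aws:ec2:managed-launch", "value": "eks-auto"}
      ]}
    ]}
  }
}
"""


def summarize_run_instances(event):
    """Pull out who launched the instance, what was launched, and its tags."""
    params = event["requestParameters"]
    tags = {}
    for spec in params["tagSpecificationSet"]["items"]:
        if spec["resourceType"] == "instance":
            tags = {t["key"]: t["value"] for t in spec["tags"]}
    return {
        "role": event["userIdentity"]["sessionContext"]["sessionIssuer"]["userName"],
        "invoked_by": event["userIdentity"]["invokedBy"],
        "instance_type": params["instanceType"],
        "availability_zone": params["availabilityZone"],
        "node_pool": tags.get("eks:kubernetes-node-pool-name"),
        "managed_launch": tags.get("aws:ec2:managed-launch"),
    }


summary = summarize_run_instances(json.loads(EVENT_JSON))
for key, value in summary.items():
    print(f"{key}: {value}")
```

The summary makes the point of this section visible at a glance: the call is made by eks.amazonaws.com using AmazonEKSAutoClusterRole, and the instance is tagged with the general-purpose node pool.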
The article below, which tests Karpenter, also reports 49 seconds until the Pod became Ready, so this feels like Karpenter turned into an AWS-managed service.
https://zenn.dev/johnn26/articles/clusterautoscaler-vs-karpenter
Auto scaling: Relying on Karpenter auto scaling, EKS Auto Mode monitors for unschedulable Pods and makes it possible for new nodes to be deployed to run those pods.

https://docs.aws.amazon.com/eks/latest/userguide/automode.html#_automated_components
Creating the Service
apiVersion: v1
kind: Service
metadata:
  namespace: game-2048
  name: service-2048
spec:
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
  type: NodePort
  selector:
    app.kubernetes.io/name: app-2048
Creating the IngressClass
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  namespace: game-2048
  labels:
    app.kubernetes.io/name: LoadBalancerController
  name: alb
spec:
  controller: eks.amazonaws.com/alb
Creating the Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: game-2048
  name: ingress-2048
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service-2048
                port:
                  number: 80
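The Deployment, Service, IngressClass, and Ingress above are tied together only by names and labels, so a typo in any of them silently breaks the chain. As a rough illustration, the sketch below copies just the linking fields from those manifests into plain dicts and checks that they line up; `check_links` is a hypothetical helper and the dict layout is a simplification, not the Kubernetes API schema.

```python
# Only the linking fields of each manifest above, written as plain dicts.
deployment_pod_labels = {"app.kubernetes.io/name": "app-2048"}
service = {
    "name": "service-2048",
    "selector": {"app.kubernetes.io/name": "app-2048"},
    "ports": [{"port": 80, "targetPort": 80}],
}
ingress_class = {"name": "alb", "controller": "eks.amazonaws.com/alb"}
ingress = {
    "ingressClassName": "alb",
    "backend": {"service": "service-2048", "port": 80},
}


def check_links(pod_labels, service, ingress_class, ingress):
    """Return a list of wiring problems; an empty list means everything lines up."""
    problems = []
    # A Service selects Pods whose labels include every selector key/value.
    if not all(pod_labels.get(k) == v for k, v in service["selector"].items()):
        problems.append("Service selector does not match the Pod labels")
    if ingress["ingressClassName"] != ingress_class["name"]:
        problems.append("Ingress refers to an unknown IngressClass")
    if ingress["backend"]["service"] != service["name"]:
        problems.append("Ingress backend refers to an unknown Service")
    if ingress["backend"]["port"] not in [p["port"] for p in service["ports"]]:
        problems.append("Ingress backend port is not exposed by the Service")
    return problems


problems = check_links(deployment_pod_labels, service, ingress_class, ingress)
print("OK" if not problems else "\n".join(problems))
```

With the values from this article the check reports no problems; changing `ingressClassName` to something other than `alb`, for example, would be flagged.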
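An ALB was created through the Ingress resource.
The application is now accessible via its DNS name!
Being able to use Ingress resources without setting up the AWS Load Balancer Controller is impressive.

It is also satisfying that almost everything visible is a resource the application needs, and that the resources normally created when setting up the AWS Load Balancer Controller or the EBS CSI driver are out of sight.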
$ kubectl get all -A
NAMESPACE   NAME                                  READY   STATUS    RESTARTS   AGE
game-2048   pod/deployment-2048-98ddb8c75-272sg   1/1     Running   0          58m
game-2048   pod/deployment-2048-98ddb8c75-ctpbm   1/1     Running   0          58m
game-2048   pod/deployment-2048-98ddb8c75-hb6h4   1/1     Running   0          58m
game-2048   pod/deployment-2048-98ddb8c75-qk4h8   1/1     Running   0          58m
game-2048   pod/deployment-2048-98ddb8c75-znt2w   1/1     Running   0          58m

NAMESPACE     NAME                                TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
default       service/kubernetes                  ClusterIP   172.20.0.1       <none>        443/TCP        138m
game-2048     service/service-2048                NodePort    172.20.203.242   <none>        80:30670/TCP   53m
kube-system   service/eks-extension-metrics-api   ClusterIP   172.20.170.82    <none>        443/TCP        138m

NAMESPACE   NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
game-2048   deployment.apps/deployment-2048   5/5     5            5           58m

NAMESPACE   NAME                                        DESIRED   CURRENT   READY   AGE
game-2048   replicaset.apps/deployment-2048-98ddb8c75   5         5         5       58m
The only resource I had not explicitly deployed was this one (eks-extension-metrics-api).
$ kubectl describe service eks-extension-metrics-api -n kube-system
Name:                     eks-extension-metrics-api
Namespace:                kube-system
Labels:                   <none>
Annotations:              <none>
Selector:                 <none>
Type:                     ClusterIP
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       172.20.170.82
IPs:                      172.20.170.82
Port:                     metrics-api  443/TCP
TargetPort:               10443/TCP
Endpoints:                172.0.32.0:10443
Session Affinity:         None
Internal Traffic Policy:  Cluster
Events:                   <none>
The top command could not be used, so this service appears to exist purely for internal sizing of compute resources.
$ kubectl top
Display resource (CPU/memory) usage.

The top command allows you to see the resource consumption for nodes or pods.

This command requires Metrics Server to be correctly configured and working on the server.

Available Commands:
node Display resource (CPU/memory) usage of nodes
pod Display resource (CPU/memory) usage of pods

Usage:
kubectl top [flags] [options]

Use "kubectl top <command> --help" for more information about a given command.
Use "kubectl options" for a list of global command-line options (applies to all commands).
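Other notes
One behavior to watch out for: EC2 nodes are deleted when there are no Kubernetes resources to run.

This is of course good for cost, but unless you scale so that the required number of Pods always exists, the time lag of creating EC2 instances may become a problem. (With no Kubernetes resources the EC2 node count drops to zero, so the design needs to take this into account.)
Also, the control plane still needs upgrading, so check Upgrade insights and upgrade as appropriate.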
[Addendum]

When you upgrade the control plane version, the data plane follows a few minutes later.

The work becomes simpler, but settings such as Pod Disruption Budgets (PDBs) become important.
https://dev.classmethod.jp/articles/eks-auto-mode-upgrade/
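As an illustration of the PDB point above, here is a minimal sketch that builds a PodDisruptionBudget manifest for the game-2048 Deployment used in this article. The PDB name `pdb-2048` and `minAvailable: 3` are hypothetical values chosen for the example, not settings from the original walkthrough.

```python
import json


def pod_disruption_budget(name, namespace, match_labels, min_available):
    """Build a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": match_labels},
        },
    }


# Hypothetical values: the PDB name and minAvailable=3 are illustrative only.
pdb = pod_disruption_budget(
    name="pdb-2048",
    namespace="game-2048",
    match_labels={"app.kubernetes.io/name": "app-2048"},
    min_available=3,
)

# kubectl accepts JSON manifests as well as YAML, so this output can be
# saved to a file and applied.
print(json.dumps(pdb, indent=2))
```

With five replicas, a budget like this would let node replacement during an upgrade evict at most two of the Pods at a time; the right number depends on how much capacity the application can lose.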
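Summary
With enhanced observability for EKS, Pod Identity, Upgrade insights, control plane operations through the EKS API, and more, EKS seems to have been evolving nonstop since around last year's re:Invent.

I look forward to putting an ever more convenient EKS to work and getting lots of development done!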