Migrating from EKS Managed Node Groups to Auto Mode
I tried an in-place migration from EKS managed node groups to Auto Mode.
For a migration like this, standing up a new cluster alongside the old one and cutting over Blue/Green-style generally looks safer, but since an in-place migration is also possible, I decided to give it a try!
Creating an EKS Cluster with a Managed Node Group
Using the AWS EKS Terraform module, I create a non-Auto Mode EKS cluster with a managed node group attached.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.17.0"

  name = "eks-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["ap-northeast-1a", "ap-northeast-1c", "ap-northeast-1d"]
  public_subnets  = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
  private_subnets = ["10.0.100.0/24", "10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = true

  public_subnet_tags = {
    "kubernetes.io/role/elb" = 1
  }
  private_subnet_tags = {
    "kubernetes.io/role/internal-elb" = 1
  }
}
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.35.0"

  cluster_name    = local.cluster_name
  cluster_version = "1.32"

  cluster_endpoint_public_access  = true
  cluster_endpoint_private_access = true

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  enable_cluster_creator_admin_permissions = true

  eks_managed_node_groups = {
    default = {
      name           = "default"
      instance_types = ["t3.small"]
      min_size       = 1
      max_size       = 3
      desired_size   = 1
    }
  }

  bootstrap_self_managed_addons = false

  cluster_addons = {
    coredns = {
      version = "v1.11.4-eksbuild.2"
    }
    kube-proxy = {
      version = "v1.32.0-eksbuild.2"
    }
    vpc-cni = {
      version        = "v1.19.3-eksbuild.1"
      before_compute = true
    }
    eks-pod-identity-agent = {
      version = "v1.3.5-eksbuild.2"
    }
  }
}
[Note]
This is specific to the AWS EKS Terraform module: setting bootstrap_self_managed_addons = false prevents the built-in (self-managed) networking add-ons from being created.
This attribute defaults to true when Auto Mode is off, but the default flips to false once Auto Mode is enabled.
On top of that, changing it forces a cluster replacement, so I set it explicitly to avoid an unintended replace.
If you are using EKS add-ons anyway, the built-in networking add-ons aren't needed in the first place.
Also note that bootstrap_self_managed_addons only takes effect at cluster creation time.
If you never set it explicitly and it was therefore treated as true, you can pin bootstrap_self_managed_addons = true and still enable Auto Mode without replacing the cluster.
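As a minimal sketch of that last point (assuming the same module block as elsewhere in this post), pinning the implicit default for a pre-existing cluster would look like this:

```hcl
# Sketch: a cluster originally created WITHOUT this attribute was implicitly
# bootstrap_self_managed_addons = true. Pinning that value explicitly keeps
# Terraform from planning a cluster replace when Auto Mode is turned on.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.35.0"

  # ...existing cluster arguments unchanged...

  bootstrap_self_managed_addons = true

  cluster_compute_config = {
    enabled    = true
    node_pools = ["general-purpose"]
  }
}
```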
Deploying and Inspecting Kubernetes Resources
Deploy an application with the manifest below.
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app.kubernetes.io/name: nginx
    spec:
      containers:
        - image: nginx
          imagePullPolicy: Always
          name: nginx
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "0.5"
Let's also check the state of the Kubernetes resources on the cluster at this point.
Here, aws-node (the VPC CNI driver), CoreDNS, kube-proxy, and the Pod Identity Agent are installed via EKS add-ons; after the migration to EKS Auto Mode, these components become AWS-managed.
When installed as EKS add-ons, only CoreDNS is deployed as a Deployment, while the rest run as DaemonSets.
Also, the CoreDNS Pods are reachable through a ClusterIP Service, and the ClusterIP is chosen from the service IPv4 range (the range used on the control-plane side).
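For reference, the 172.20.0.0/16 service range seen below is what EKS picked automatically for this cluster; kube-dns conventionally gets the ".0.10" address of that range. As a sketch, the AWS EKS Terraform module lets you pin the range explicitly (I believe the input is called cluster_service_ipv4_cidr; check the module docs before relying on it):

```hcl
# Sketch (assumption: variable name per the terraform-aws-modules/eks inputs).
# Pinning the service CIDR; 172.20.0.10 below is this range's ".0.10" address.
module "eks" {
  # ...existing arguments...

  cluster_service_ipv4_cidr = "172.20.0.0/16"
}
```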
Checking /etc/resolv.conf in an Nginx Pod shows the nameserver is 172.20.0.10.
# cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local ap-northeast-1.compute.internal
nameserver 172.20.0.10
options ndots:5
This IP belongs to the ClusterIP Service named kube-dns.
% kubectl get svc kube-dns -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 172.20.0.10 <none> 53/UDP,53/TCP,9153/TCP 26m
Multiple IPs are registered as Endpoints.
% kubectl describe service/kube-dns -n kube-system
Name: kube-dns
Namespace: kube-system
Labels: eks.amazonaws.com/component=kube-dns
k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=CoreDNS
Annotations: prometheus.io/port: 9153
prometheus.io/scrape: true
Selector: k8s-app=kube-dns
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 172.20.0.10
IPs: 172.20.0.10
Port: dns 53/UDP
TargetPort: 53/UDP
Endpoints: 10.0.100.25:53,10.0.100.40:53
Port: dns-tcp 53/TCP
TargetPort: 53/TCP
Endpoints: 10.0.100.25:53,10.0.100.40:53
Port: metrics 9153/TCP
TargetPort: 9153/TCP
Endpoints: 10.0.100.25:9153,10.0.100.40:9153
Session Affinity: None
Internal Traffic Policy: Cluster
Events: <none>
These are the IPs of the Pods deployed via the CoreDNS add-on.
% kubectl get pod -o wide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
aws-node-brh56 2/2 Running 0 40m 10.0.100.9 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
coredns-6d78c58c9f-hzrmg 1/1 Running 0 39m 10.0.100.40 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
coredns-6d78c58c9f-kqzd4 1/1 Running 0 39m 10.0.100.25 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
eks-pod-identity-agent-5r76h 1/1 Running 0 37m 10.0.100.9 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
kube-proxy-9n95p 1/1 Running 0 39m 10.0.100.9 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
Essentially, kube-proxy rewrites iptables rules (among other things) so that each Pod reaches CoreDNS through the ClusterIP.
Enabling Auto Mode
Now, let's enable Auto Mode.
With the AWS EKS Terraform module, adding the cluster_compute_config attribute enables Auto Mode.
When EKS Auto Mode provisions nodes, it works through the concept of node pools, where you can specify instance types, CPU architecture (ARM64/AMD64), capacity type (Spot/On-Demand), and so on.
You can create custom node pools, but for this post I'll use the built-in general-purpose pool, which covers common use cases.
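For reference, a custom node pool is defined as a Karpenter-style NodePool resource. The sketch below is based on my understanding of the Auto Mode NodePool API (the pool name is hypothetical; verify field names against the official EKS Auto Mode docs before use) and would restrict this pool to Spot ARM64 capacity:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-arm64  # hypothetical name
spec:
  template:
    spec:
      # Auto Mode ships a built-in "default" NodeClass
      nodeClassRef:
        group: eks.amazonaws.com
        kind: NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
```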
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.35.0"

  cluster_name    = local.cluster_name
  cluster_version = "1.32"

  cluster_endpoint_public_access  = true
  cluster_endpoint_private_access = true

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  enable_cluster_creator_admin_permissions = true

  eks_managed_node_groups = {
    default = {
      name           = "default"
      instance_types = ["t3.small"]
      min_size       = 1
      max_size       = 3
      desired_size   = 1
    }
  }

  bootstrap_self_managed_addons = false

  # Enable Auto Mode
  cluster_compute_config = {
    enabled    = true
    node_pools = ["general-purpose"]
  }

  cluster_addons = {
    coredns = {
      version = "v1.11.4-eksbuild.2"
    }
    kube-proxy = {
      version = "v1.32.0-eksbuild.2"
    }
    vpc-cni = {
      version        = "v1.19.3-eksbuild.1"
      before_compute = true
    }
    eks-pod-identity-agent = {
      version = "v1.3.5-eksbuild.2"
    }
  }
}
After adding the cluster_compute_config attribute, terraform plan produced the following:
Terraform will perform the following actions:
# module.eks.data.aws_eks_addon_version.this["coredns"] will be read during apply
# (depends on a resource or a module with changes pending)
<= data "aws_eks_addon_version" "this" {
+ addon_name = "coredns"
+ id = (known after apply)
+ kubernetes_version = "1.32"
+ version = (known after apply)
}
# module.eks.data.aws_eks_addon_version.this["eks-pod-identity-agent"] will be read during apply
# (depends on a resource or a module with changes pending)
<= data "aws_eks_addon_version" "this" {
+ addon_name = "eks-pod-identity-agent"
+ id = (known after apply)
+ kubernetes_version = "1.32"
+ version = (known after apply)
}
# module.eks.data.aws_eks_addon_version.this["kube-proxy"] will be read during apply
# (depends on a resource or a module with changes pending)
<= data "aws_eks_addon_version" "this" {
+ addon_name = "kube-proxy"
+ id = (known after apply)
+ kubernetes_version = "1.32"
+ version = (known after apply)
}
# module.eks.data.aws_eks_addon_version.this["vpc-cni"] will be read during apply
# (depends on a resource or a module with changes pending)
<= data "aws_eks_addon_version" "this" {
+ addon_name = "vpc-cni"
+ id = (known after apply)
+ kubernetes_version = "1.32"
+ version = (known after apply)
}
# module.eks.data.tls_certificate.this[0] will be read during apply
# (depends on a resource or a module with changes pending)
<= data "tls_certificate" "this" {
+ certificates = (known after apply)
+ id = (known after apply)
+ url = "https://oidc.eks.ap-northeast-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXX"
}
# module.eks.aws_eks_addon.before_compute["vpc-cni"] will be updated in-place
~ resource "aws_eks_addon" "before_compute" {
~ addon_version = "v1.19.2-eksbuild.1" -> (known after apply)
id = "test-cluster:vpc-cni"
tags = {}
# (11 unchanged attributes hidden)
# (1 unchanged block hidden)
}
# module.eks.aws_eks_addon.this["coredns"] will be updated in-place
~ resource "aws_eks_addon" "this" {
~ addon_version = "v1.11.4-eksbuild.2" -> (known after apply)
id = "test-cluster:coredns"
tags = {}
# (11 unchanged attributes hidden)
# (1 unchanged block hidden)
}
# module.eks.aws_eks_addon.this["eks-pod-identity-agent"] will be updated in-place
~ resource "aws_eks_addon" "this" {
~ addon_version = "v1.3.4-eksbuild.1" -> (known after apply)
id = "test-cluster:eks-pod-identity-agent"
tags = {}
# (11 unchanged attributes hidden)
# (1 unchanged block hidden)
}
# module.eks.aws_eks_addon.this["kube-proxy"] will be updated in-place
~ resource "aws_eks_addon" "this" {
~ addon_version = "v1.32.0-eksbuild.2" -> (known after apply)
id = "test-cluster:kube-proxy"
tags = {}
# (11 unchanged attributes hidden)
# (1 unchanged block hidden)
}
# module.eks.aws_eks_cluster.this[0] will be updated in-place
~ resource "aws_eks_cluster" "this" {
id = "test-cluster"
name = "test-cluster"
tags = {
"terraform-aws-modules" = "eks"
}
# (12 unchanged attributes hidden)
+ compute_config {
+ enabled = true
+ node_pools = [
+ "general-purpose",
]
+ node_role_arn = (known after apply)
}
~ kubernetes_network_config {
# (3 unchanged attributes hidden)
~ elastic_load_balancing {
~ enabled = false -> true
}
}
+ storage_config {
+ block_storage {
+ enabled = true
}
}
# (5 unchanged blocks hidden)
}
# module.eks.aws_iam_openid_connect_provider.oidc_provider[0] will be updated in-place
~ resource "aws_iam_openid_connect_provider" "oidc_provider" {
id = "arn:aws:iam::xxxxxxxxxxxx:oidc-provider/oidc.eks.ap-northeast-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXX"
tags = {
"Name" = "test-cluster-eks-irsa"
}
~ thumbprint_list = [
- "9e99a48a9960b14926bb7f3b02e22da2b0ab7280",
] -> (known after apply)
# (4 unchanged attributes hidden)
}
# module.eks.aws_iam_role.eks_auto[0] will be created
+ resource "aws_iam_role" "eks_auto" {
+ arn = (known after apply)
+ assume_role_policy = jsonencode(
{
+ Statement = [
+ {
+ Action = [
+ "sts:TagSession",
+ "sts:AssumeRole",
]
+ Effect = "Allow"
+ Principal = {
+ Service = "ec2.amazonaws.com"
}
+ Sid = "EKSAutoNodeAssumeRole"
},
]
+ Version = "2012-10-17"
}
)
+ create_date = (known after apply)
+ force_detach_policies = true
+ id = (known after apply)
+ managed_policy_arns = (known after apply)
+ max_session_duration = 3600
+ name = (known after apply)
+ name_prefix = "test-cluster-eks-auto-"
+ path = "/"
+ tags_all = (known after apply)
+ unique_id = (known after apply)
+ inline_policy (known after apply)
}
# module.eks.aws_iam_role_policy_attachment.eks_auto["AmazonEC2ContainerRegistryPullOnly"] will be created
+ resource "aws_iam_role_policy_attachment" "eks_auto" {
+ id = (known after apply)
+ policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPullOnly"
+ role = (known after apply)
}
# module.eks.aws_iam_role_policy_attachment.eks_auto["AmazonEKSWorkerNodeMinimalPolicy"] will be created
+ resource "aws_iam_role_policy_attachment" "eks_auto" {
+ id = (known after apply)
+ policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodeMinimalPolicy"
+ role = (known after apply)
}
# module.eks.aws_iam_role_policy_attachment.this["AmazonEKSBlockStoragePolicy"] will be created
+ resource "aws_iam_role_policy_attachment" "this" {
+ id = (known after apply)
+ policy_arn = "arn:aws:iam::aws:policy/AmazonEKSBlockStoragePolicy"
+ role = "test-cluster-cluster-20250503052606002600000003"
}
# module.eks.aws_iam_role_policy_attachment.this["AmazonEKSComputePolicy"] will be created
+ resource "aws_iam_role_policy_attachment" "this" {
+ id = (known after apply)
+ policy_arn = "arn:aws:iam::aws:policy/AmazonEKSComputePolicy"
+ role = "test-cluster-cluster-20250503052606002600000003"
}
# module.eks.aws_iam_role_policy_attachment.this["AmazonEKSLoadBalancingPolicy"] will be created
+ resource "aws_iam_role_policy_attachment" "this" {
+ id = (known after apply)
+ policy_arn = "arn:aws:iam::aws:policy/AmazonEKSLoadBalancingPolicy"
+ role = "test-cluster-cluster-20250503052606002600000003"
}
# module.eks.aws_iam_role_policy_attachment.this["AmazonEKSNetworkingPolicy"] will be created
+ resource "aws_iam_role_policy_attachment" "this" {
+ id = (known after apply)
+ policy_arn = "arn:aws:iam::aws:policy/AmazonEKSNetworkingPolicy"
+ role = "test-cluster-cluster-20250503052606002600000003"
}
# module.eks.aws_iam_role_policy_attachment.this["AmazonEKSVPCResourceController"] will be destroyed
# (because key ["AmazonEKSVPCResourceController"] is not in for_each map)
- resource "aws_iam_role_policy_attachment" "this" {
- id = "test-cluster-cluster-20250503052606002600000003-20250503052607818000000008" -> null
- policy_arn = "arn:aws:iam::aws:policy/AmazonEKSVPCResourceController" -> null
- role = "test-cluster-cluster-20250503052606002600000003" -> null
}
As described in the AWS official documentation, the IAM policies below end up attached to the cluster role (the plan adds the first four; AmazonEKSClusterPolicy was already attached when the cluster was created):
- AmazonEKSComputePolicy
- AmazonEKSBlockStoragePolicy
- AmazonEKSLoadBalancingPolicy
- AmazonEKSNetworkingPolicy
- AmazonEKSClusterPolicy
After Auto Mode is enabled, the Pods are all still sitting on the original node.
% kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-7c64dfbfdc-5hxn5 1/1 Running 0 52m 10.0.100.137 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
nginx-7c64dfbfdc-hhvdr 1/1 Running 0 52m 10.0.100.43 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
Also, immediately after enabling it, no Auto Mode-managed nodes exist yet.
% kubectl get node
NAME STATUS ROLES AGE VERSION
ip-10-0-100-9.ap-northeast-1.compute.internal Ready <none> 59m v1.32.3-eks-473151a
The DNS-related configuration is unchanged as well.
It seems all that has happened is that the cluster can now launch nodes through Auto Mode.
% kubectl describe service/kube-dns -n kube-system
Name: kube-dns
Namespace: kube-system
Labels: eks.amazonaws.com/component=kube-dns
k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=CoreDNS
Annotations: prometheus.io/port: 9153
prometheus.io/scrape: true
Selector: k8s-app=kube-dns
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 172.20.0.10
IPs: 172.20.0.10
Port: dns 53/UDP
TargetPort: 53/UDP
Endpoints: 10.0.100.25:53,10.0.100.40:53
Port: dns-tcp 53/TCP
TargetPort: 53/TCP
Endpoints: 10.0.100.25:53,10.0.100.40:53
Port: metrics 9153/TCP
TargetPort: 9153/TCP
Endpoints: 10.0.100.25:9153,10.0.100.40:9153
Session Affinity: None
Internal Traffic Policy: Cluster
Events: <none>
Now, let's migrate the application onto Auto Mode-managed nodes.
Specify eks.amazonaws.com/compute-type: auto in a nodeSelector and run kubectl apply.
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  name: nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app.kubernetes.io/name: nginx
    spec:
      containers:
        - image: nginx
          imagePullPolicy: Always
          name: nginx
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "0.5"
      nodeSelector:
        eks.amazonaws.com/compute-type: auto
The new Pod sits in Pending while it waits for an Auto Mode-managed node to start.
% kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-66d8bcc68c-rdj7f 0/1 Pending 0 12s
nginx-7c64dfbfdc-5hxn5 1/1 Running 0 55m
nginx-7c64dfbfdc-hhvdr 1/1 Running 0 55m
An Auto Mode-managed node has come up.
% kubectl get node
NAME STATUS ROLES AGE VERSION
i-016c762ee61d543dd Ready <none> 16s v1.32.2-eks-677bac1
ip-10-0-100-9.ap-northeast-1.compute.internal Ready <none> 62m v1.32.3-eks-473151a
The Pods have moved to the Auto Mode-managed node without issue.
% kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-66d8bcc68c-rdj7f 1/1 Running 0 71s 10.0.101.80 i-016c762ee61d543dd <none> <none>
nginx-66d8bcc68c-zjtr4 1/1 Running 0 30s 10.0.101.81 i-016c762ee61d543dd <none> <none>
Deleting the Existing Managed Node Group
With the Pods safely migrated, it's time to delete the now-unneeded managed node group.
I'd like to just get rid of it right away, but quite a few add-on-related Pods are still running on the existing node.
% kubectl get pod -o wide -A
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default nginx-66d8bcc68c-rdj7f 1/1 Running 0 15m 10.0.101.80 i-016c762ee61d543dd <none> <none>
default nginx-66d8bcc68c-zjtr4 1/1 Running 0 15m 10.0.101.81 i-016c762ee61d543dd <none> <none>
kube-system aws-node-brh56 2/2 Running 0 78m 10.0.100.9 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
kube-system coredns-6d78c58c9f-hzrmg 1/1 Running 0 76m 10.0.100.40 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
kube-system coredns-6d78c58c9f-kqzd4 1/1 Running 0 76m 10.0.100.25 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
kube-system eks-pod-identity-agent-5r76h 1/1 Running 0 75m 10.0.100.9 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
kube-system kube-proxy-9n95p 1/1 Running 0 76m 10.0.100.9 ip-10-0-100-9.ap-northeast-1.compute.internal <none> <none>
Namely aws-node (the AWS VPC CNI), CoreDNS, kube-proxy, and the Pod Identity Agent.
% kubectl get all -A
NAMESPACE NAME READY STATUS RESTARTS AGE
default pod/nginx-66d8bcc68c-rdj7f 1/1 Running 0 16m
default pod/nginx-66d8bcc68c-zjtr4 1/1 Running 0 15m
kube-system pod/aws-node-brh56 2/2 Running 0 78m
kube-system pod/coredns-6d78c58c9f-hzrmg 1/1 Running 0 76m
kube-system pod/coredns-6d78c58c9f-kqzd4 1/1 Running 0 76m
kube-system pod/eks-pod-identity-agent-5r76h 1/1 Running 0 75m
kube-system pod/kube-proxy-9n95p 1/1 Running 0 76m
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 172.20.0.1 <none> 443/TCP 83m
kube-system service/eks-extension-metrics-api ClusterIP 172.20.148.179 <none> 443/TCP 83m
kube-system service/kube-dns ClusterIP 172.20.0.10 <none> 53/UDP,53/TCP,9153/TCP 76m
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/aws-node 1 1 1 1 1 <none> 79m
kube-system daemonset.apps/eks-pod-identity-agent 1 1 1 1 1 <none> 75m
kube-system daemonset.apps/kube-proxy 1 1 1 1 1 <none> 76m
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
default deployment.apps/nginx 2/2 2 2 71m
kube-system deployment.apps/coredns 2/2 2 2 76m
NAMESPACE NAME DESIRED CURRENT READY AGE
default replicaset.apps/nginx-66d8bcc68c 2 2 2 16m
default replicaset.apps/nginx-7c64dfbfdc 0 0 0 71m
kube-system replicaset.apps/coredns-6d78c58c9f 2 2 2 76m
For now (perhaps rashly), let's try deleting the add-ons by removing the cluster_addons attribute.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.35.0"

  cluster_name    = local.cluster_name
  cluster_version = "1.32"

  cluster_endpoint_public_access  = true
  cluster_endpoint_private_access = true

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  enable_cluster_creator_admin_permissions = true

  eks_managed_node_groups = {
    default = {
      name           = "default"
      instance_types = ["t3.small"]
      min_size       = 1
      max_size       = 3
      desired_size   = 1
    }
  }

  cluster_compute_config = {
    enabled    = true
    node_pools = ["general-purpose"]
  }

  bootstrap_self_managed_addons = false
}
The diff came out as follows:
Terraform will perform the following actions:
# module.eks.aws_eks_addon.before_compute["vpc-cni"] will be destroyed
# (because key ["vpc-cni"] is not in for_each map)
- resource "aws_eks_addon" "before_compute" {
- addon_name = "vpc-cni" -> null
- addon_version = "v1.19.2-eksbuild.1" -> null
- arn = "arn:aws:eks:ap-northeast-1:xxxxxxxxxxxx:addon/test-cluster/vpc-cni/20cb4a51-f67e-c71b-a374-2fe82b1d8630" -> null
- cluster_name = "test-cluster" -> null
- created_at = "2025-05-03T05:34:53Z" -> null
- id = "test-cluster:vpc-cni" -> null
- modified_at = "2025-05-03T05:35:02Z" -> null
- preserve = true -> null
- resolve_conflicts_on_create = "NONE" -> null
- resolve_conflicts_on_update = "OVERWRITE" -> null
- tags = {} -> null
- tags_all = {} -> null
# (2 unchanged attributes hidden)
- timeouts {}
}
# module.eks.aws_eks_addon.this["coredns"] will be destroyed
# (because key ["coredns"] is not in for_each map)
- resource "aws_eks_addon" "this" {
- addon_name = "coredns" -> null
- addon_version = "v1.11.4-eksbuild.2" -> null
- arn = "arn:aws:eks:ap-northeast-1:xxxxxxxxxxxx:addon/test-cluster/coredns/d2cb4a53-115c-f5d4-ecab-72b43e54990b" -> null
- cluster_name = "test-cluster" -> null
- created_at = "2025-05-03T05:37:18Z" -> null
- id = "test-cluster:coredns" -> null
- modified_at = "2025-05-03T05:38:02Z" -> null
- preserve = true -> null
- resolve_conflicts_on_create = "NONE" -> null
- resolve_conflicts_on_update = "OVERWRITE" -> null
- tags = {} -> null
- tags_all = {} -> null
# (2 unchanged attributes hidden)
- timeouts {}
}
# module.eks.aws_eks_addon.this["eks-pod-identity-agent"] will be destroyed
# (because key ["eks-pod-identity-agent"] is not in for_each map)
- resource "aws_eks_addon" "this" {
- addon_name = "eks-pod-identity-agent" -> null
- addon_version = "v1.3.4-eksbuild.1" -> null
- arn = "arn:aws:eks:ap-northeast-1:xxxxxxxxxxxx:addon/test-cluster/eks-pod-identity-agent/94cb4a54-041d-9329-299b-f4074a46a77b" -> null
- cluster_name = "test-cluster" -> null
- created_at = "2025-05-03T05:39:22Z" -> null
- id = "test-cluster:eks-pod-identity-agent" -> null
- modified_at = "2025-05-03T05:39:58Z" -> null
- preserve = true -> null
- resolve_conflicts_on_create = "NONE" -> null
- resolve_conflicts_on_update = "OVERWRITE" -> null
- tags = {} -> null
- tags_all = {} -> null
# (2 unchanged attributes hidden)
- timeouts {}
}
# module.eks.aws_eks_addon.this["kube-proxy"] will be destroyed
# (because key ["kube-proxy"] is not in for_each map)
- resource "aws_eks_addon" "this" {
- addon_name = "kube-proxy" -> null
- addon_version = "v1.32.0-eksbuild.2" -> null
- arn = "arn:aws:eks:ap-northeast-1:xxxxxxxxxxxx:addon/test-cluster/kube-proxy/46cb4a53-1165-2e29-2150-bba6b8a475c7" -> null
- cluster_name = "test-cluster" -> null
- created_at = "2025-05-03T05:37:18Z" -> null
- id = "test-cluster:kube-proxy" -> null
- modified_at = "2025-05-03T05:38:25Z" -> null
- preserve = true -> null
- resolve_conflicts_on_create = "NONE" -> null
- resolve_conflicts_on_update = "OVERWRITE" -> null
- tags = {} -> null
- tags_all = {} -> null
# (2 unchanged attributes hidden)
- timeouts {}
}
Plan: 0 to add, 0 to change, 4 to destroy.
Deleting the add-ons did not remove the corresponding Kubernetes resources (note preserve = true in the plan output above, which tells EKS to leave the in-cluster resources in place when an add-on is deleted), so I removed them forcibly via kubectl.
kubectl delete deployment coredns -n kube-system
kubectl delete daemonset aws-node -n kube-system
kubectl delete daemonset eks-pod-identity-agent -n kube-system
kubectl delete daemonset kube-proxy -n kube-system
Now, I deleted the add-ons rather impulsively, but was that really OK?
The DaemonSet-based components (aws-node, i.e. the VPC CNI driver, kube-proxy, and the Pod Identity Agent) seem fine, but CoreDNS, which was running as a Deployment, is the one that worries me.
/etc/resolv.conf inside the Pod also still points at the same IP as before Auto Mode was enabled.
% kubectl exec nginx-66d8bcc68c-rdj7f -it -- /bin/sh
# cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local ap-northeast-1.compute.internal
nameserver 172.20.0.10
options ndots:5
Checking the CoreDNS ClusterIP Service at this point, its Endpoints are now empty.
% kubectl describe service/kube-dns -n kube-system
Name: kube-dns
Namespace: kube-system
Labels: eks.amazonaws.com/component=kube-dns
k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=CoreDNS
Annotations: prometheus.io/port: 9153
prometheus.io/scrape: true
Selector: k8s-app=kube-dns
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 172.20.0.10
IPs: 172.20.0.10
Port: dns 53/UDP
TargetPort: 53/UDP
Endpoints:
Port: dns-tcp 53/TCP
TargetPort: 53/TCP
Endpoints:
Port: metrics 9153/TCP
TargetPort: 9153/TCP
Endpoints:
Session Affinity: None
Internal Traffic Policy: Cluster
Events: <none>
And yet, after creating a Service and exec-ing into the Nginx Pod to test name resolution, names resolve without any problem.
# dig nginx.default.svc.cluster.local
; <<>> DiG 9.18.33-1~deb12u2-Debian <<>> nginx.default.svc.cluster.local
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42133
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 5686dda4de381016 (echoed)
;; QUESTION SECTION:
;nginx.default.svc.cluster.local. IN A
;; ANSWER SECTION:
nginx.default.svc.cluster.local. 5 IN A 172.20.53.13
;; Query time: 0 msec
;; SERVER: 172.20.0.10#53(172.20.0.10) (UDP)
;; WHEN: Sat May 03 07:17:37 UTC 2025
;; MSG SIZE rcvd: 119
In other words, containers on Auto Mode-managed nodes apparently don't use the CoreDNS deployed as an add-on.
With Auto Mode, CoreDNS also runs on each node as a systemd service.
If you dump the node's logs and inspect system/ps.txt, you can find the processes corresponding to kube-proxy and CoreDNS.
kube-proxy
root 1170 0.0 1.4 1738012 56396 ? Ssl 07:53 0:01 /usr/bin/kube-proxy --hostname-override i-04239f7bdcfa45781 --config=/usr/share/kube-proxy/kube-proxy-config --kubeconfig=/etc/kubernetes/kube-proxy/kubeconfig
CoreDNS
coredns 1604 0.1 1.5 1813796 59840 ? Ssl 07:53 0:10 /usr/bin/coredns -conf=/etc/coredns/Corefile
Pods started on an Auto Mode-managed node are presumably being routed neatly to the node-local CoreDNS when they resolve names.
On a cluster created with Auto Mode from the start, there isn't even a ClusterIP Service for CoreDNS.
% kubectl get all -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 172.20.0.1 <none> 443/TCP 8m45s
kube-system service/eks-extension-metrics-api ClusterIP 172.20.213.192 <none> 443/TCP 8m42s
/etc/resolv.conf and /etc/nsswitch.conf also look no different from the non-Auto Mode setup, so the routing is presumably handled by the node's iptables rules.
resolv.conf
% kubectl exec nginx-66d8bcc68c-rdj7f -it -- /bin/sh
# cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local ap-northeast-1.compute.internal
nameserver 172.20.0.10
options ndots:5
nsswitch.conf
% kubectl exec nginx-66d8bcc68c-rdj7f -it -- /bin/sh
# cat /etc/nsswitch.conf
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.
passwd: files
group: files
shadow: files
gshadow: files
hosts: files dns
networks: files
protocols: db files
services: db files
ethers: db files
rpc: db files
netgroup: nis
Having confirmed that removing the add-ons is fine, let's delete the node group for real.
The EKS cluster definition ends up like this.
With neither add-ons nor a node group to manage, it's refreshingly simple.
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.35.0"

  cluster_name    = local.cluster_name
  cluster_version = "1.32"

  cluster_endpoint_public_access  = true
  cluster_endpoint_private_access = true

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  enable_cluster_creator_admin_permissions = true

  bootstrap_self_managed_addons = false

  # Enable Auto Mode
  cluster_compute_config = {
    enabled    = true
    node_pools = ["general-purpose"]
  }
}
The diff came out as follows:
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated
with the following symbols:
- destroy
Terraform will perform the following actions:
# module.eks.module.eks_managed_node_group["default"].aws_eks_node_group.this[0] will be destroyed
# (because module.eks.module.eks_managed_node_group["default"] is not in configuration)
- resource "aws_eks_node_group" "this" {
- ami_type = "AL2023_x86_64_STANDARD" -> null
- arn = "arn:aws:eks:ap-northeast-1:xxxxxxxxxxxx:nodegroup/test-cluster/default-20250503053528850800000011/00cb4a52-3c48-71f3-7020-b72076da380d" -> null
- capacity_type = "ON_DEMAND" -> null
- cluster_name = "test-cluster" -> null
- disk_size = 0 -> null
- id = "test-cluster:default-20250503053528850800000011" -> null
- instance_types = [
- "t3.small",
] -> null
- labels = {} -> null
- node_group_name = "default-20250503053528850800000011" -> null
- node_group_name_prefix = "default-" -> null
- node_role_arn = "arn:aws:iam::xxxxxxxxxxxx:role/default-eks-node-group-20250503052606002600000002" -> null
- release_version = "1.32.3-20250501" -> null
- resources = [
- {
- autoscaling_groups = [
- {
- name = "eks-default-20250503053528850800000011-00cb4a52-3c48-71f3-7020-b72076da380d"
},
]
# (1 unchanged attribute hidden)
},
] -> null
- status = "ACTIVE" -> null
- subnet_ids = [
- "subnet-011609187505c89a3",
- "subnet-0302d2f4c14d12022",
- "subnet-09181196696186fc9",
] -> null
- tags = {
- "Name" = "default"
} -> null
- tags_all = {
- "Name" = "default"
} -> null
- version = "1.32" -> null
- launch_template {
- id = "lt-0fbede7110502107a" -> null
- name = "default-2025050305352315330000000f" -> null
- version = "1" -> null
}
- scaling_config {
- desired_size = 1 -> null
- max_size = 3 -> null
- min_size = 1 -> null
}
- timeouts {}
- update_config {
- max_unavailable = 0 -> null
- max_unavailable_percentage = 33 -> null
}
}
# module.eks.module.eks_managed_node_group["default"].aws_iam_role.this[0] will be destroyed
# (because module.eks.module.eks_managed_node_group["default"] is not in configuration)
- resource "aws_iam_role" "this" {
- arn = "arn:aws:iam::xxxxxxxxxxxx:role/default-eks-node-group-20250503052606002600000002" -> null
- assume_role_policy = jsonencode(
{
- Statement = [
- {
- Action = "sts:AssumeRole"
- Effect = "Allow"
- Principal = {
- Service = "ec2.amazonaws.com"
}
- Sid = "EKSNodeAssumeRole"
},
]
- Version = "2012-10-17"
}
) -> null
- create_date = "2025-05-03T05:26:06Z" -> null
- description = "EKS managed node group IAM role" -> null
- force_detach_policies = true -> null
- id = "default-eks-node-group-20250503052606002600000002" -> null
- managed_policy_arns = [
- "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
- "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy",
- "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy",
] -> null
- max_session_duration = 3600 -> null
- name = "default-eks-node-group-20250503052606002600000002" -> null
- name_prefix = "default-eks-node-group-" -> null
- path = "/" -> null
- tags = {} -> null
- tags_all = {} -> null
- unique_id = "AROAW3MEE5OCLYIPBF6OZ" -> null
# (1 unchanged attribute hidden)
}
# module.eks.module.eks_managed_node_group["default"].aws_iam_role_policy_attachment.this["AmazonEC2ContainerRegistryReadOnly"] will be destroyed
# (because module.eks.module.eks_managed_node_group["default"] is not in configuration)
- resource "aws_iam_role_policy_attachment" "this" {
- id = "default-eks-node-group-20250503052606002600000002-2025050305260809140000000b" -> null
- policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly" -> null
- role = "default-eks-node-group-20250503052606002600000002" -> null
}
# module.eks.module.eks_managed_node_group["default"].aws_iam_role_policy_attachment.this["AmazonEKSWorkerNodePolicy"] will be destroyed
# (because module.eks.module.eks_managed_node_group["default"] is not in configuration)
- resource "aws_iam_role_policy_attachment" "this" {
- id = "default-eks-node-group-20250503052606002600000002-2025050305260796910000000a" -> null
- policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy" -> null
- role = "default-eks-node-group-20250503052606002600000002" -> null
}
# module.eks.module.eks_managed_node_group["default"].aws_iam_role_policy_attachment.this["AmazonEKS_CNI_Policy"] will be destroyed
# (because module.eks.module.eks_managed_node_group["default"] is not in configuration)
- resource "aws_iam_role_policy_attachment" "this" {
- id = "default-eks-node-group-20250503052606002600000002-20250503052607932300000009" -> null
- policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy" -> null
- role = "default-eks-node-group-20250503052606002600000002" -> null
}
# module.eks.module.eks_managed_node_group["default"].aws_launch_template.this[0] will be destroyed
# (because module.eks.module.eks_managed_node_group["default"] is not in configuration)
- resource "aws_launch_template" "this" {
- arn = "arn:aws:ec2:ap-northeast-1:xxxxxxxxxxxx:launch-template/lt-0fbede7110502107a" -> null
- default_version = 1 -> null
- description = "Custom launch template for default EKS managed node group" -> null
- disable_api_stop = false -> null
- disable_api_termination = false -> null
- id = "lt-0fbede7110502107a" -> null
- latest_version = 1 -> null
- name = "default-2025050305352315330000000f" -> null
- name_prefix = "default-" -> null
- security_group_names = [] -> null
- tags = {} -> null
- tags_all = {} -> null
- update_default_version = true -> null
- vpc_security_group_ids = [
- "sg-0d629394303b2d94b",
] -> null
# (8 unchanged attributes hidden)
- metadata_options {
- http_endpoint = "enabled" -> null
- http_put_response_hop_limit = 2 -> null
- http_tokens = "required" -> null
# (2 unchanged attributes hidden)
}
- monitoring {
- enabled = true -> null
}
- tag_specifications {
- resource_type = "instance" -> null
- tags = {
- "Name" = "default"
} -> null
}
- tag_specifications {
- resource_type = "network-interface" -> null
- tags = {
- "Name" = "default"
} -> null
}
- tag_specifications {
- resource_type = "volume" -> null
- tags = {
- "Name" = "default"
} -> null
}
}
# module.eks.module.eks_managed_node_group["default"].module.user_data.null_resource.validate_cluster_service_cidr will be destroyed
# (because module.eks.module.eks_managed_node_group["default"].module.user_data is not in configuration)
- resource "null_resource" "validate_cluster_service_cidr" {
- id = "2710373779183779823" -> null
}
Plan: 0 to add, 0 to change, 7 to destroy.
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these
actions if you run "terraform apply" now.
The in-place migration to Auto Mode succeeded, and the existing managed node group is gone!
Summary
I migrated in place from an EKS managed node group to Auto Mode.
Doing it in the following order seems to work without issues:
- Enable Auto Mode
- Migrate the applications
  - including any add-ons not covered by Auto Mode (e.g. the EFS CSI driver)
- Delete the add-ons that Auto Mode makes unnecessary
- Delete the existing node group
Things look trickier once Ingress and PVs are involved, so I'd like to dig into those at some point.