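Container Insights, which collects ECS and EKS metrics in one go, is now generally available! It can be enabled on existing ECS clusters too!

Container Insights, a super convenient way to monitor container workloads on AWS, is finally generally available! It can now be set up on existing clusters as well, so there is no reason not to try it.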

Koji Hamada (濱田孝治)

2019.09.02

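"With this... with this, every container workload on AWS is in plain sight, all of it... *thud*"

Container Insights had been offered as a public preview for a while, and it has finally reached GA (general availability)!!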

Container monitoring for Amazon ECS, EKS, and Kubernetes is now available in Amazon CloudWatch

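Container Insights gives you task-level and container-level metrics that CloudWatch could not collect before.

On top of that, with GA it can now be added to existing ECS clusters, so you can start using Container Insights on an already-built cluster in about one minute!! Start by turning it on in an environment you have at hand and get a taste of how useful the metrics are. I guarantee you will get hooked!

This article mainly covers the following topics.

Overview of Container Insights
List of metrics available for ECS
List of metrics available for EKS
Setting up Container Insights on ECS
Setting up Container Insights on EKS
What the Container Insights screens look like
Estimating the cost of Container Insights
Summary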

Containers in plain sight... so it's finally here!!

　 ( ﾟдﾟ)　*clatter*
　 /　　 ヾ
＿_L| /￣￣￣/＿
　 ＼/　　　/

What is Container Insights?

The official documentation is here.

Using Container Insights - Amazon CloudWatch

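It is a feature for collecting and analyzing metrics and logs from ECS, Fargate, and EKS in CloudWatch. CloudWatch did provide some metrics before, but for ECS they were available only per service and at a coarse resolution that made them hard to use, so covering the blind spots meant bringing in a monitoring service such as Datadog or Mackerel.

With the arrival of Container Insights, detailed metrics are now available very easily using nothing but AWS-managed tooling. Also, as I'll show when walking through the screens later, drilling down from a misbehaving container to its logs for analysis is ridiculously easy. The best.

List of metrics available for ECS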

Amazon ECS Container Insights Metrics - Amazon CloudWatch

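This is the list of metrics collected for ECS as of September 2, 2019. The headline items are CPU and memory usage, which can now be obtained per task definition. The number of running tasks, the most important figure for judging whether each service is staying up, is also provided as RunningTaskCount.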

Metric Name	Dimensions (alternative sets separated by ";")	Description
ContainerInstanceCount	ClusterName	The number of EC2 instances running the Amazon ECS agent that are registered with a cluster.
CpuUtilized	TaskDefinitionFamily, ClusterName; ServiceName, ClusterName; ClusterName	The CPU units used by tasks in the resource that is specified by the dimension set that you're using. This metric is collected only for tasks that have a defined CPU reservation in their task definition.
CpuReserved	TaskDefinitionFamily, ClusterName; ServiceName, ClusterName; ClusterName	The CPU units reserved by tasks in the resource that is specified by the dimension set that you're using. This metric is collected only for tasks that have a defined CPU reservation in their task definition.
DeploymentCount	ServiceName, ClusterName	The number of deployments in an Amazon ECS service.
DesiredTaskCount	ServiceName, ClusterName	The desired number of tasks for an Amazon ECS service.
MemoryUtilized	TaskDefinitionFamily, ClusterName; ServiceName, ClusterName; ClusterName	The memory being used by tasks in the resource that is specified by the dimension set that you're using. This metric is collected only for tasks that have a defined memory reservation in their task definition.
MemoryReserved	TaskDefinitionFamily, ClusterName; ServiceName, ClusterName; ClusterName	The memory that is reserved by tasks in the resource that is specified by the dimension set that you're using. This metric is collected only for tasks that have a defined memory reservation in their task definition.
NetworkRxBytes	TaskDefinitionFamily, ClusterName; ServiceName, ClusterName; ClusterName	The number of bytes received by the resource that is specified by the dimensions that you're using. This metric is available only for containers in bridge network mode. It is not available for containers in awsvpc network mode or host network mode.
NetworkTxBytes	TaskDefinitionFamily, ClusterName; ServiceName, ClusterName; ClusterName	The number of bytes transmitted by the resource that is specified by the dimensions that you're using. This metric is available only for containers in bridge network mode. It is not available for containers in awsvpc network mode or host network mode.
PendingTaskCount	ServiceName, ClusterName	The number of tasks currently in the PENDING state.
RunningTaskCount	ServiceName, ClusterName	The number of tasks currently in the RUNNING state.
ServiceCount	ServiceName	The number of services in the cluster.
StorageReadBytes	TaskDefinitionFamily, ClusterName; ServiceName, ClusterName; ClusterName	The number of bytes read from storage in the resource that is specified by the dimensions that you're using.
StorageWriteBytes	TaskDefinitionFamily, ClusterName; ServiceName, ClusterName; ClusterName	The number of bytes written to storage in the resource that is specified by the dimensions that you're using.
TaskCount	ServiceName	The number of tasks running in the service.
TaskSetCount	ServiceName, ClusterName	The number of task sets in the service.
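
CpuUtilized and CpuReserved (and likewise MemoryUtilized and MemoryReserved) are reported as absolute amounts, so a utilization percentage has to be derived from the pair. Here is a minimal sketch of that arithmetic, assuming you already have the two values in hand. The function name utilization_percent and the sample numbers are made up for illustration and do not come from the AWS documentation.

```shell
#!/bin/sh
# utilization_percent USED RESERVED
# Prints USED / RESERVED as a percentage with one decimal place.
# Intended for pairs such as CpuUtilized/CpuReserved or MemoryUtilized/MemoryReserved.
utilization_percent() {
  awk -v u="$1" -v r="$2" 'BEGIN {
    if (r <= 0) { print "n/a"; exit }   # nothing reserved, so no ratio to report
    printf "%.1f\n", (u / r) * 100
  }'
}

# Hypothetical sample: 128 CPU units used out of 512 reserved.
utilization_percent 128 512   # prints 25.0
```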

List of metrics available for EKS

Amazon EKS and Kubernetes Container Insights Metrics - Amazon CloudWatch

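This is the list of metrics collected for EKS as of September 2, 2019. I think every metric you need to operate an EKS cluster, for nodes, Pods, and so on, is emitted. Skim through it and get excited: "You can get THIS much?!"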

Metric Name	Dimensions (alternative sets separated by ";")	Description
cluster_failed_node_count	ClusterName	The number of failed worker nodes in the cluster.
cluster_node_count	ClusterName	The total number of worker nodes in the cluster.
namespace_number_of_running_pods	Namespace, ClusterName; ClusterName	The number of pods running per namespace in the resource that is specified by the dimensions that you're using.
node_cpu_limit	ClusterName	The maximum number of CPU units that can be assigned to a single node in this cluster.
node_cpu_reserved_capacity	NodeName, ClusterName, InstanceId; ClusterName	The percentage of CPU units that are reserved for node components, such as kubelet, kube-proxy, and Docker.
node_cpu_usage_total	ClusterName	The number of CPU units being used on the nodes in the cluster.
node_cpu_utilization	NodeName, ClusterName, InstanceId; ClusterName	The total percentage of CPU units being used on the nodes in the cluster.
node_filesystem_utilization	NodeName, ClusterName, InstanceId; ClusterName	The total percentage of file system capacity being used on nodes in the cluster.
node_memory_limit	ClusterName	The maximum amount of memory, in bytes, that can be assigned to a single node in this cluster.
node_memory_reserved_capacity	NodeName, ClusterName, InstanceId; ClusterName	The percentage of memory currently being used on the nodes in the cluster.
node_memory_utilization	NodeName, ClusterName, InstanceId; ClusterName	The percentage of memory currently being used by the node or nodes.
node_memory_working_set	ClusterName	The amount of memory, in bytes, being used in the working set of the nodes in the cluster.
node_network_total_bytes	NodeName, ClusterName, InstanceId; ClusterName	The total number of bytes per second transmitted and received over the network per node in a cluster.
node_number_of_running_containers	NodeName, ClusterName, InstanceId; ClusterName	The number of running containers per node in a cluster.
node_number_of_running_pods	NodeName, ClusterName, InstanceId; ClusterName	The number of running pods per node in a cluster.
pod_cpu_reserved_capacity	PodName, Namespace, ClusterName; ClusterName	The CPU capacity that is reserved per pod in a cluster.
pod_cpu_utilization	PodName, Namespace, ClusterName; Namespace, ClusterName; Service, Namespace, ClusterName; ClusterName	The percentage of CPU units being used by pods.
pod_cpu_utilization_over_pod_limit	PodName, Namespace, ClusterName; Namespace, ClusterName; Service, Namespace, ClusterName; ClusterName	The percentage of CPU units being used by pods that is over the pod limit.
pod_memory_reserved_capacity	PodName, Namespace, ClusterName; ClusterName	The percentage of memory that is reserved for pods.
pod_memory_utilization	PodName, Namespace, ClusterName; Namespace, ClusterName; Service, Namespace, ClusterName; ClusterName	The percentage of memory currently being used by the pod or pods.
pod_memory_utilization_over_pod_limit	PodName, Namespace, ClusterName; Namespace, ClusterName; Service, Namespace, ClusterName; ClusterName	The percentage of memory that is being used by pods that is over the pod limit.
pod_number_of_container_restarts	PodName, Namespace, ClusterName	The total number of container restarts in a pod.
pod_network_rx_bytes	PodName, Namespace, ClusterName; Namespace, ClusterName; Service, Namespace, ClusterName; ClusterName	The number of bytes per second being received over the network by the pod.
pod_network_tx_bytes	PodName, Namespace, ClusterName; Namespace, ClusterName; Service, Namespace, ClusterName; ClusterName	The number of bytes per second being transmitted over the network by the pod.
service_number_of_running_pods	Service, Namespace, ClusterName; ClusterName	The number of pods running the service or services in the cluster.

Setting up Container Insights on ECS

Let's go through how to set up Container Insights on ECS.

Setting Up Container Insights on Amazon ECS - Amazon CloudWatch

Enabling Container Insights by default for the whole account

As of September 2, 2019, there is no way to enable Container Insights when creating an ECS cluster with CloudFormation. As an alternative, you can turn Container Insights on by default for every ECS cluster created in that account.

Using the web console

Open ECS in the web console, go to Account Settings, and turn on the CloudWatch Container Insights checkbox.

Using the CLI

Use aws ecs put-account-setting.

$ aws ecs put-account-setting --name "containerInsights" --value "enabled"
{
    "setting": {
        "name": "containerInsights",
        "value": "enabled",
        "principalArn": "arn:aws:iam::629895769338:role/cm-hamada.koji"
    }
}

Incidentally, you can check the current settings with aws ecs list-account-settings.

$ aws ecs list-account-settings
{
    "settings": [
        {
            "name": "awsvpcTrunking",
            "value": "enabled",
            "principalArn": "arn:aws:iam::629895769338:role/cm-hamada.koji"
        },
        {
            "name": "containerInsights",
            "value": "enabled",
            "principalArn": "arn:aws:iam::629895769338:role/cm-hamada.koji"
        },
        {
            "name": "containerInstanceLongArnFormat",
            "value": "enabled",
            "principalArn": "arn:aws:iam::629895769338:role/cm-hamada.koji"
        },
        {
            "name": "serviceLongArnFormat",
            "value": "enabled",
            "principalArn": "arn:aws:iam::629895769338:role/cm-hamada.koji"
        },
        {
            "name": "taskLongArnFormat",
            "value": "enabled",
            "principalArn": "arn:aws:iam::629895769338:role/cm-hamada.koji"
        }
    ]
}
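
If you only care about the Container Insights entry, the listing can be narrowed down. The following is a sketch rather than anything from the official procedure: it filters by setting name, asks for the effective value (so an account-level default is reflected as well), and extracts just the value with --query. The function name container_insights_setting and the AWS_CLI override variable are hypothetical helpers introduced here so the function can be exercised with a stub.

```shell
#!/bin/sh
# AWS_CLI can point at a stub for testing; it defaults to the real CLI.
AWS_CLI="${AWS_CLI:-aws}"

# Prints the effective value ("enabled" or "disabled") of the
# containerInsights account setting for the calling principal.
container_insights_setting() {
  "$AWS_CLI" ecs list-account-settings \
    --name containerInsights \
    --effective-settings \
    --query 'settings[0].value' \
    --output text
}

# Usage (needs AWS credentials, so it is left commented out):
#   container_insights_setting
```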

Enabling it when creating a cluster

Using the GUI

The ECS cluster creation screen now has a CloudWatch Container Insights option, so set it there.

Using the CLI

Run the following command.

aws ecs create-cluster --cluster-name myCICluster --settings "name=containerInsights,value=enabled"

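(Important) Enabling Container Insights on an existing cluster

Thanks for waiting! Back in the public preview there was no way to enable Container Insights on an existing ECS cluster, but now that it is GA, you can.

Unfortunately it cannot be done from the web console at the moment, but from the CLI a single command like the one below enables it. Easy!!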

aws ecs update-cluster-settings --cluster myCICluster --settings name=containerInsights,value=enabled
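
If you have several clusters, the same command can be looped over the output of aws ecs list-clusters. This is a rough sketch, assuming you really do want the setting on every cluster in the current region (check the pricing section first). The function name and the AWS_CLI override variable are hypothetical additions for illustration.

```shell
#!/bin/sh
# AWS_CLI can point at a stub for testing; it defaults to the real CLI.
AWS_CLI="${AWS_CLI:-aws}"

# Enables Container Insights on every ECS cluster in the current region.
enable_insights_on_all_clusters() {
  # list-clusters returns cluster ARNs; with --output text they are tab-separated,
  # so plain word splitting is enough to iterate over them.
  for cluster_arn in $("$AWS_CLI" ecs list-clusters --query 'clusterArns[]' --output text); do
    cluster_name="${cluster_arn##*/}"   # keep only the part after "cluster/"
    echo "Enabling Container Insights on ${cluster_name}"
    "$AWS_CLI" ecs update-cluster-settings \
      --cluster "$cluster_name" \
      --settings name=containerInsights,value=enabled \
      >/dev/null
  done
}

# Usage (needs AWS credentials, so it is left commented out):
#   enable_insights_on_all_clusters
```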

Setting up Container Insights on EKS

Setup on EKS is somewhat more involved than on ECS.

Setting Up Container Insights on Amazon EKS and Kubernetes - Amazon CloudWatch

Using the Quick Start

Quick Start Setup for Container Insights on Amazon EKS - Amazon CloudWatch

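It can be deployed with the single command below. It is a brute-force approach: curl downloads a manifest file that deploys fluentd as a DaemonSet, sed replaces the cluster name and region name placeholders (substitute your own values for Cluster_Name and Region), and the result is piped straight into kubectl apply.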

curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/master/k8s-yaml-templates/quickstart/cwagent-fluentd-quickstart.yaml | sed "s/{{cluster_name}}/Cluster_Name/;s/{{region_name}}/Region/" | kubectl apply -f -
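
To see what the sed stage of that one-liner actually does before pointing it at a real cluster, you can run the same substitution on a tiny stand-in template. This is only a sketch: the function name render_manifest and the two-line template are made up for illustration, while the sed expressions mirror the ones in the command above.

```shell
#!/bin/sh
# render_manifest CLUSTER REGION
# Reads a manifest template on stdin and fills in the two placeholders,
# using the same sed expressions as the quick start one-liner.
render_manifest() {
  sed "s/{{cluster_name}}/$1/;s/{{region_name}}/$2/"
}

# Hypothetical two-line template standing in for the downloaded YAML.
printf '%s\n' 'cluster: {{cluster_name}}' 'region: {{region_name}}' |
  render_manifest my-cluster ap-northeast-1
```

One way to use this is to redirect the rendered output to a file and read it through once, then pass that file to kubectl apply -f instead of piping blindly.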

Step-by-step setup

You can do it with the following two procedures. This route lets you make finer-grained configuration changes.

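What the Container Insights screens look like

So, let's look at what graphs and metrics you can actually see in Container Insights. For this walkthrough, I kept continuous load on containers in ECS using Apache Bench.

To get started, open CloudWatch in the web console and click CloudWatch at the very top. Click the Overview link and a menu appears; choose "Container Insights" from it.

Select "ECS Clusters" in the pull-down at the top left to see representative metrics for every cluster in the region at a glance.

Further down, you can check average CPU and memory across the whole cluster.

Select "ECS Services" at the top left to see metrics for each task in a service. The set of metrics shown changes too.

"Task performance" appears below, where you can view the load on each task individually. This is what I wanted!!

Task IDs are shown, so from here you can drill down to the logs of the task in question.

Choosing the application logs opens the CloudWatch Logs Insights screen with a log query already filled in. Click "Run query" from there and you can check the logs seamlessly!!

The same drill-down also works by selecting "ECS Tasks" in the drop-down. In that case the list at the bottom shows each container in the task, and from there the same steps take you to the application logs and performance logs.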
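
The performance logs behind these screens can also be peeked at from the CLI. The sketch below assumes the log group naming that Container Insights uses for ECS, /aws/ecs/containerinsights/<cluster name>/performance. The function names and the AWS_CLI override variable are hypothetical helpers, not part of any AWS tooling.

```shell
#!/bin/sh
# AWS_CLI can point at a stub for testing; it defaults to the real CLI.
AWS_CLI="${AWS_CLI:-aws}"

# Prints the log group holding Container Insights performance events
# for the given ECS cluster name.
performance_log_group() {
  printf '/aws/ecs/containerinsights/%s/performance\n' "$1"
}

# Fetches a handful of raw performance events (default: at most 5)
# from that log group, as JSON.
sample_performance_events() {
  "$AWS_CLI" logs filter-log-events \
    --log-group-name "$(performance_log_group "$1")" \
    --max-items "${2:-5}"
}

# Usage (needs AWS credentials, so it is left commented out):
#   sample_performance_events myCICluster
```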

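Estimating the cost of Container Insights

"If you can get this much, why not just turn it on for every cluster, no questions asked!" is probably how you feel by now, but the cost is a concern too.

Container Insights metrics are basically billed as custom metrics. That makes estimates tricky, but guess what! AWS provides an official page with pricing examples for monitoring ECS and EKS with Container Insights.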

Amazon CloudWatch Pricing – Amazon Web Services (AWS)

Toward the bottom of the page, under "Pricing examples", you will find these two entries!!

Example 6 - Container Insights for Amazon ECS
Example 7 - Container Insights for Amazon EKS and Kubernetes(k8s)

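It is not a tool like the Simple Monthly Calculator, but it lists the factors needed for an estimate and the order of the calculation, so by plugging in your own numbers you can estimate roughly what it will cost before actually running it.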
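
As a rough illustration of plugging in numbers, the sketch below multiplies a metric count by a per-metric price and adds a log ingestion charge. Everything here is an assumption for illustration: the function name, the idea that these two terms are the main cost drivers, and every input value are placeholders. Take the real metric counts and unit prices for your region from the pricing page.

```shell
#!/bin/sh
# estimate_monthly_cost METRICS PRICE_PER_METRIC LOG_GB PRICE_PER_GB
# Prints METRICS * PRICE_PER_METRIC + LOG_GB * PRICE_PER_GB with two decimals.
estimate_monthly_cost() {
  awk -v m="$1" -v pm="$2" -v g="$3" -v pg="$4" 'BEGIN {
    printf "%.2f\n", m * pm + g * pg
  }'
}

# Placeholder inputs only: 100 metrics at 0.30 each, plus 5 GB of logs at 0.50 per GB.
estimate_monthly_cost 100 0.30 5 0.50   # prints 32.50
```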

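"The de facto feature for metrics monitoring and log analysis of container workloads on AWS"

Frankly, the container workload metrics CloudWatch offered until now were meager. I expect many of you were disheartened by how few metrics came out of the box and ended up adopting Mackerel or Datadog. Now that Container Insights has officially been released, it may well put those concerns to rest.

Setup is easy, and with this general release it can also be applied to existing clusters, so trying it out takes very little effort, and it could significantly lower the TCO of the container workloads you have been running. Drilling down from identifying a misbehaving container to its logs is also very seamless.

I did not expect a managed service to go this far. Please give it a casual try.

That's all for today. This was Hamada (@hamako9999).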
