Querying WAF traffic logs with Athena

Wouldn't you like to get the Athena setup out of the way quickly? I wrote a Terraform configuration file for it, so please feel free to use it.
#AWS
#AWS WAF
#Amazon Athena
板倉舞
2025.05.30
This is Itakura from the Operations Support Team in the Customer Success Department.
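Introduction

When you build a new Web ACL, you will usually run it in count mode for a while after deployment to check for false positives.

If Sampled requests is enabled on the Web ACL, the Sampled requests tab on its detail page shows when requests arrived and how many there were.

In the Sampled requests table below that, setting a filter also lets you inspect the contents of requests related to a specific rule.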



Sampled requests is very handy, but it only shows the past 3 hours, so it did not fit my need to simply look at every request counted on a specific day.
So I used Athena to query the WAF traffic logs, extracted the logs with Action=COUNT, and exported them.
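Trying it out

Creating the Athena resources

The following Terraform configuration file creates the full set of AWS resources needed.

The places that need to be changed for your environment, or that need some care, are the default values of s3_waf_logs_bucket_name and waf_acl_name_in_path (both are the placeholder "hogehoge") and the two force_destroy = true settings.
waf-athena-setup.tf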
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0"
    }
  }
}

provider "aws" {
  region = "ap-northeast-1" # change as needed
}

# Data sources for looking up the AWS account ID and region
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

# Input variables
variable "s3_waf_logs_bucket_name" {
  type        = string
  description = "Name of the existing S3 bucket where AWS WAF logs are stored"
  default     = "hogehoge" # placeholder: change to your bucket name
}

variable "athena_work_group_name" {
  type        = string
  description = "Athena workgroup name"
  default     = "WAFLogAnalysisWorkgroup"
}

variable "athena_database_name" {
  type        = string
  description = "Athena database name"
  default     = "waf_logs_db"
}

variable "athena_table_name" {
  type        = string
  description = "Name of the Athena table used to query the WAF logs"
  default     = "waf_access_logs"
}

variable "waf_acl_name_in_path" {
  type        = string
  description = "Web ACL name as it appears in the S3 log path"
  default     = "hogehoge" # placeholder: change to your Web ACL name
}

# S3 bucket for Athena query results
resource "aws_s3_bucket" "athena_query_result_bucket" {
  bucket = format("athena-query-results-%s-%s", data.aws_caller_identity.current.account_id, data.aws_region.current.name)
  tags = {
    Project = "WAFLogAnalysis"
  }
  force_destroy = true # for verification only; be careful in production
}

resource "aws_s3_bucket_versioning" "athena_query_result_bucket_versioning" {
  bucket = aws_s3_bucket.athena_query_result_bucket.id
  versioning_configuration {
    status = "Suspended"
  }
}

# IAM role for Athena
resource "aws_iam_role" "athena_access_role" {
  name = format("AthenaWAFLogAccessRole-%s", var.athena_work_group_name)

  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect = "Allow",
        Principal = {
          Service = "athena.amazonaws.com"
        },
        Action = "sts:AssumeRole"
      }
    ]
  })
  tags = {
    Project = "WAFLogAnalysis"
  }
}

resource "aws_iam_role_policy" "athena_s3_glue_access_policy" {
  name = "AthenaS3GlueAccessPolicy"
  role = aws_iam_role.athena_access_role.id

  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      { # Access to the WAF log S3 bucket
        Effect = "Allow",
        Action = [
          "s3:GetBucketLocation",
          "s3:GetObject",
          "s3:ListBucket"
        ],
        Resource = [
          "arn:aws:s3:::${var.s3_waf_logs_bucket_name}",
          "arn:aws:s3:::${var.s3_waf_logs_bucket_name}/*"
        ]
      },
      { # Access to the Athena query results bucket
        Effect = "Allow",
        Action = [
          "s3:GetBucketLocation",
          "s3:PutObject",
          "s3:GetObject",
          "s3:ListBucket"
        ],
        Resource = [
          aws_s3_bucket.athena_query_result_bucket.arn,
          "${aws_s3_bucket.athena_query_result_bucket.arn}/*"
        ]
      },
      { # General read access to the Glue databases
        Effect = "Allow",
        Action = [
          "glue:GetDatabase",
          "glue:GetDatabases"
        ],
        Resource = [
          "arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:catalog",
          "arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:database/default",
          "arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:database/${var.athena_database_name}"
        ]
      },
      { # Permission to create Glue tables
        Effect = "Allow",
        Action = [
          "glue:CreateTable"
        ],
        Resource = [ # CreateTable must be allowed at the database and catalog level
          "arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:catalog",
          "arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:database/${var.athena_database_name}"
        ]
      },
      { # Permission for Glue table and partition operations
        Effect = "Allow",
        Action = [
          "glue:GetTable",
          "glue:GetTables",
          "glue:UpdateTable",
          "glue:DeleteTable",
          "glue:GetPartition",
          "glue:GetPartitions"
        ],
        Resource = [
          "arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:catalog",
          "arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:database/${var.athena_database_name}",
          "arn:aws:glue:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:table/${var.athena_database_name}/*"
        ]
      }
    ]
  })
}

# Athena workgroup
resource "aws_athena_workgroup" "athena_work_group" {
  name          = var.athena_work_group_name
  state         = "ENABLED"
  description   = "Workgroup for analyzing AWS WAF logs."
  force_destroy = true # for verification only; be careful in production
  configuration {
    result_configuration {
      output_location = "s3://${aws_s3_bucket.athena_query_result_bucket.bucket}/query-results/"
      encryption_configuration {
        encryption_option = "SSE_S3"
      }
    }
  }
  tags = {
    Project = "WAFLogAnalysis"
  }
}

# Athena database (Glue catalog database)
resource "aws_glue_catalog_database" "athena_database" {
  catalog_id  = data.aws_caller_identity.current.account_id
  name        = var.athena_database_name
  description = "Database for AWS WAF logs."
}

# Athena table for the WAF logs (Glue catalog table)
resource "aws_glue_catalog_table" "athena_waf_log_table" {
  depends_on = [
    aws_glue_catalog_database.athena_database
  ]

  catalog_id    = data.aws_caller_identity.current.account_id
  database_name = var.athena_database_name
  name          = var.athena_table_name
  description   = "Athena table for AWS WAF logs"
  table_type    = "EXTERNAL_TABLE"

  # Partition projection: Athena derives year/month/day/hour partitions
  # from the S3 path template instead of requiring them to be registered.
  parameters = {
    "classification"            = "json"
    "has_encrypted_data"        = "false"
    "projection.enabled"        = "true"
    "projection.year.type"      = "integer"
    "projection.year.range"     = "2023,2050"
    "projection.year.digits"    = "4"
    "projection.month.type"     = "integer"
    "projection.month.range"    = "1,12"
    "projection.month.digits"   = "2"
    "projection.day.type"       = "integer"
    "projection.day.range"      = "1,31"
    "projection.day.digits"     = "2"
    "projection.hour.type"      = "integer"
    "projection.hour.range"     = "0,23"
    "projection.hour.digits"    = "2"
    "storage.location.template" = "s3://${var.s3_waf_logs_bucket_name}/AWSLogs/${data.aws_caller_identity.current.account_id}/WAFLogs/${data.aws_region.current.name}/${var.waf_acl_name_in_path}/$${year}/$${month}/$${day}/$${hour}/"
  }

  storage_descriptor {
    location      = "s3://${var.s3_waf_logs_bucket_name}/AWSLogs/${data.aws_caller_identity.current.account_id}/WAFLogs/${data.aws_region.current.name}/${var.waf_acl_name_in_path}/"
    input_format  = "org.apache.hadoop.mapred.TextInputFormat"
    output_format = "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
    compressed    = false

    ser_de_info {
      serialization_library = "org.openx.data.jsonserde.JsonSerDe"
      parameters = {
        "dots.in.keys"     = "true"
        "case.insensitive" = "true"
      }
    }

    columns {
      name = "timestamp"
      type = "bigint"
    }
    columns {
      name = "formatversion"
      type = "int"
    }
    columns {
      name = "webaclid"
      type = "string"
    }
    columns {
      name = "terminatingruleid"
      type = "string"
    }
    columns {
      name = "terminatingruletype"
      type = "string"
    }
    columns {
      name = "action"
      type = "string"
    }
    columns {
      name = "terminatingrulematchdetails"
      type = "array<string>"
    }
    columns {
      name = "httpsourcename"
      type = "string"
    }
    columns {
      name = "httpsourceid"
      type = "string"
    }
    columns {
      name = "rulegrouplist"
      type = "array<struct<rulegroupid:string,terminatingrule:struct<ruleid:string,action:string,rulematchdetails:string>,nonterminatingmatchingrules:array<string>,excludedrules:string,customerconfig:string>>"
    }
    columns {
      name = "ratebasedrulelist"
      type = "array<string>"
    }
    columns {
      name = "nonterminatingmatchingrules"
      type = "array<struct<ruleid:string,action:string,rulematchdetails:array<struct<conditiontype:string,location:string,matcheddata:string,matchedfieldname:string>>>>"
    }
    columns {
      name = "requestheadersinserted"
      type = "string"
    }
    columns {
      name = "responsecodesent"
      type = "string"
    }
    columns {
      name = "httprequest"
      type = "struct<clientip:string,country:string,headers:array<struct<name:string,value:string>>,uri:string,args:string,httpversion:string,httpmethod:string,requestid:string,fragment:string,scheme:string,host:string>"
    }
    columns {
      name = "labels"
      type = "array<struct<name:string>>"
    }
  } # end of the storage_descriptor block

  partition_keys {
    name = "year"
    type = "string"
  }
  partition_keys {
    name = "month"
    type = "string"
  }
  partition_keys {
    name = "day"
    type = "string"
  }
  partition_keys {
    name = "hour"
    type = "string"
  }
}

# Outputs
output "athena_query_database_name" {
  description = "Name of the Athena database created for the WAF logs"
  value       = aws_glue_catalog_database.athena_database.name
}

output "athena_query_table_name" {
  description = "Name of the Athena table created for the WAF logs"
  value       = aws_glue_catalog_table.athena_waf_log_table.name
}

output "athena_query_result_bucket_url" {
  description = "S3 URL of the Athena query results"
  value       = "s3://${aws_s3_bucket.athena_query_result_bucket.bucket}/query-results/"
}

output "athena_workgroup_console_url" {
  description = "URL of the Athena workgroup in the AWS console"
  value       = "https://${data.aws_region.current.name}.console.aws.amazon.com/athena/home?region=${data.aws_region.current.name}#/workgroups/${aws_athena_workgroup.athena_work_group.name}"
}

output "athena_query_console_url" {
  description = "URL of the Athena query editor in the AWS console"
  value       = "https://${data.aws_region.current.name}.console.aws.amazon.com/athena/home?region=${data.aws_region.current.name}#/query-editor"
}
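
As a small sketch of how the two placeholder defaults could be supplied without editing the configuration itself: Terraform automatically loads a file named terraform.tfvars from the working directory, so the values can live there. The bucket name and Web ACL name below are hypothetical and must be replaced with your own.

```hcl
# terraform.tfvars (hypothetical example values; replace with your own)

# Existing bucket that receives the WAF logs. WAF log bucket names
# start with "aws-waf-logs-"; the rest of this name is made up.
s3_waf_logs_bucket_name = "aws-waf-logs-example-bucket"

# Web ACL name exactly as it appears in the S3 log path.
waf_acl_name_in_path = "example-web-acl"
```

With this file in place, terraform plan and terraform apply pick up the values in place of "hogehoge", and the other variables keep their defaults.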

When you run terraform apply, the Outputs are displayed; open the URL at the top (athena_query_console_url).

The Athena query editor then opens.

Change the workgroup to WAFLogAnalysisWorkgroup.

Click 「認証」 in the dialog that appears (this should correspond to the Acknowledge button in the English console).

Change the database to waf_logs_db.

Running a query

As an example, I retrieve all information for requests in the WAF logs recorded on May 29, 2025 (UTC) for which some rule took the COUNT action.
SELECT *
FROM waf_logs_db.waf_access_logs
CROSS JOIN UNNEST(nonterminatingmatchingrules) AS t (rule_info)
WHERE rule_info.action = 'COUNT'
  AND year = '2025'
  AND month = '05'
  AND day = '29';
Enter the query and click "Run" (or "Run again").

The query results are then displayed.

My wish to look at every request counted on a specific day came true.

Click "Download results CSV" to export them.
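
Because SELECT * returns nested structures that are awkward to read in a CSV, a flatter variant of the query may be easier to export. The following is only a sketch: it assumes it is appended to waf-athena-setup.tf so that it can reference the workgroup, database, and table defined there, and the resource name, query name, and choice of columns are my own assumptions. It uses aws_athena_named_query to store the query as a saved query in the workgroup.

```hcl
# Hypothetical addition to waf-athena-setup.tf: a saved query that returns
# a flatter, CSV-friendly view of the requests matched by a COUNT rule.
resource "aws_athena_named_query" "count_requests_flat" {
  name        = "waf-count-requests-flat" # hypothetical name
  description = "Requests matched by a COUNT rule on one UTC day, selected columns only"
  workgroup   = aws_athena_workgroup.athena_work_group.id
  database    = aws_glue_catalog_database.athena_database.name

  # "timestamp" is epoch milliseconds, so divide by 1000 before converting.
  # The year/month/day filters are the same example date as in the article.
  query = <<-SQL
    SELECT
      from_unixtime("timestamp" / 1000) AS request_time_utc,
      rule_info.ruleid                  AS counted_rule_id,
      httprequest.clientip              AS client_ip,
      httprequest.country               AS country,
      httprequest.httpmethod            AS http_method,
      httprequest.uri                   AS uri,
      httprequest.args                  AS args
    FROM ${aws_glue_catalog_database.athena_database.name}.${aws_glue_catalog_table.athena_waf_log_table.name}
    CROSS JOIN UNNEST(nonterminatingmatchingrules) AS t (rule_info)
    WHERE rule_info.action = 'COUNT'
      AND year = '2025'
      AND month = '05'
      AND day = '29'
    ORDER BY request_time_utc
  SQL
}
```

After terraform apply, the query should appear among the saved queries of the workgroup, where it can be opened in the editor, adjusted to another date, and run.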

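Cleaning up if needed

In the configuration file used here, the S3 bucket for Athena query results and the Athena workgroup both specify force_destroy = true, so terraform destroy force-deletes these resources even if objects remain in the bucket or saved queries exist in the workgroup.

That makes it convenient for verification work, or when you just want to quickly export the logs you are after.

If you do not want forced deletion, change the setting to force_destroy = false before running terraform apply.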
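
One way to avoid editing the file each time is to drive both settings from a single variable. This is a minimal sketch, not part of the original configuration: the variable name allow_force_destroy is hypothetical, and the two resource blocks are meant to replace the corresponding definitions in waf-athena-setup.tf, not to be added alongside them.

```hcl
# Hypothetical variable controlling forced deletion; defaults to the safe side.
variable "allow_force_destroy" {
  type        = bool
  description = "Allow terraform destroy to delete a non-empty results bucket and workgroup"
  default     = false
}

# Replaces the original bucket definition.
resource "aws_s3_bucket" "athena_query_result_bucket" {
  bucket        = format("athena-query-results-%s-%s", data.aws_caller_identity.current.account_id, data.aws_region.current.name)
  force_destroy = var.allow_force_destroy
  tags = {
    Project = "WAFLogAnalysis"
  }
}

# Replaces the original workgroup definition.
resource "aws_athena_workgroup" "athena_work_group" {
  name          = var.athena_work_group_name
  state         = "ENABLED"
  description   = "Workgroup for analyzing AWS WAF logs."
  force_destroy = var.allow_force_destroy
  configuration {
    result_configuration {
      output_location = "s3://${aws_s3_bucket.athena_query_result_bucket.bucket}/query-results/"
      encryption_configuration {
        encryption_option = "SSE_S3"
      }
    }
  }
  tags = {
    Project = "WAFLogAnalysis"
  }
}
```

For a throwaway verification environment, the variable can be set at apply time with terraform apply -var="allow_force_destroy=true"; leaving it unset keeps forced deletion off.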
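Afterword

That was an introduction to querying WAF traffic logs with Athena.

Personally, I somehow have the impression that preparing an Athena environment is a high hurdle, so I hope you will use this configuration file to get the setup done quickly and spend your time on analyzing the logs instead.
I hope this article is helpful to someone.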
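About Annotation Inc.

Annotation Inc. is a company in the Classmethod group that specializes in operations.

Its specialist teams for support, operations, development and maintenance, corporate IT, and back office work solve customers' problems by making full use of the latest IT technology, strong technical skills, and accumulated know-how.

We are recruiting members for a variety of roles.

If you are interested in the culture, systems, and ways of working that together realize "operational excellence" and "work like yourself, live like yourself", please take a look at the Annotation Inc. recruiting site.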