[アップデート]Step FunctionsでAmazon Athenaの統合がサポートされました

少々時間が経ってしまいましたが、Step FunctionsのAthena統合やってみました
2020.10.31

この記事は公開されてから1年以上経過しています。情報が古い可能性がありますので、ご注意ください。

先日、Step FunctionsでAthenaとの統合がサポートされました。

今回のアップデートにより、Lambda Functionなどを介さずAthenaにクエリを実行できるようになりました。現時点で利用可能なAPIは以下の4つとなります。

それでは早速、Step FunctionsでサポートされているAPIの一部を実行してみたいと思います。

事前準備

Athenaテーブル

Athenaで何かしらのテーブルを作成しておきます。ここでは、CloudTrail証跡に対するテーブルで、以下のような方法で作成ました。

IAMロール

ステートマシンに権限を付与します。以下ドキュメントに利用するAPI毎に必要な権限等が記載されいます。

やってみた

ステートマシン作成

今回はステートマシンから直接クエリを実行し、クエリ結果をSNSにてメール通知してみたいと思います。定義は以下となります。(AWSアカウントIDが記載された箇所はマスクしています)

{
  "StartAt": "Start an Athena query",
  "States": {
    "Start an Athena query": {
      "Type": "Task",
      "Resource": "arn:aws:states:::athena:startQueryExecution.sync",
      "Parameters": {
        "QueryString": "SELECT * FROM \"cloudtrail_logs_cm_members_XXXXXXXXXXXX\" WHERE awsregion = 'ap-northeast-1' limit 1",
        "ResultConfiguration": {
          "OutputLocation": "s3://aws-athena-query-results-XXXXXXXXXXXX-ap-northeast-1/cm-members-XXXXXXXXXXXX/"
        }
      },
      "Next": "Get query results"
    },
    "Get query results": {
      "Type": "Task",
      "Resource": "arn:aws:states:::athena:getQueryResults",
      "Parameters": {
        "QueryExecutionId.$": "$.QueryExecution.QueryExecutionId"
      },
      "Next": "Send query results"
    },
    "Send query results": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "TopicArn": "arn:aws:sns:ap-northeast-1:XXXXXXXXXXXX:test-sakamaki-topic",
        "Message": {
          "Input.$": "$.ResultSet.Rows"
        }
      },
      "End": true
    }
  }
}

定義について説明します。

Start an Athena queryステート

Athenaに対しクエリを実行するStartQueryExecutionを呼び出し、パラメータQueryStringにてSQL、ResultConfigurationでクエリの結果を保存するS3バケットをを指定しています。.syncサフィックスを追加してクエリ完了まで待機する定義となります。 Step Functionsでサポートされているパラメータについては以下に記載があります。

Get query resultsステート

GetQueryResultsを呼び出し、クエリの実行結果を取得します。Start an Athena queryステートの実行結果を取得するため、Start an Athena queryから返される実行ID($.QueryExecution.QueryExecutionId)をパラメータに指定しています。

Send query resultsステート

Get query resultsステートで取得した実行結果を、SNSトピックに送信しメール通知を行います。

ステートマシン実行

作成したステートマシンを実行しました。

$ aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:ap-northeast-1:XXXXXXXXXXXX:stateMachine:test
{
    "executionArn": "arn:aws:states:ap-northeast-1:XXXXXXXXXXXX:execution:TestCodeBuildStateMachine:6e7af264-84e7-40e3-8447-a3c94f30616f",
    "startDate": 1588747745.583
}

Step Functionsコンソールより、ステートマシンの実行詳細を確認しました。クエリを実行しているStart an Athena queryステートから、Athanaのコンソールへ遷移することができました。

しばらくすると、ステートマシンが正常終了し、想定どおりクエリの実行、結果の取得が行えました。

参考までに各ステートのレスポンスは以下となります。(長いので畳んでいます。)

Start an Athena queryステート
{
  "QueryExecution": {
    "Query": "SELECT * FROM \"cloudtrail_logs_cm_members_XXXXXXXXXXXX\" WHERE awsregion = 'ap-northeast-1' limit 1",
    "QueryExecutionContext": {},
    "QueryExecutionId": "840dec37-c5b8-4bc7-879b-4c2f399c811e",
    "ResultConfiguration": {
      "OutputLocation": "s3://aws-athena-query-results-XXXXXXXXXXXX-ap-northeast-1/cm-members-XXXXXXXXXXXX/840dec37-c5b8-4bc7-879b-4c2f399c811e.csv"
    },
    "StatementType": "DML",
    "Statistics": {
      "DataScannedInBytes": 295929103,
      "EngineExecutionTimeInMillis": 30459,
      "QueryPlanningTimeInMillis": 1099,
      "QueryQueueTimeInMillis": 193,
      "ServiceProcessingTimeInMillis": 74,
      "TotalExecutionTimeInMillis": 30726
    },
    "Status": {
      "CompletionDateTime": 1604055416734,
      "State": "SUCCEEDED",
      "SubmissionDateTime": 1604055386008
    },
    "WorkGroup": "primary"
  }
}
Get query resultsステート
{
  "ResultSet": {
    "ResultSetMetadata": {
      "ColumnInfo": [
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "eventversion",
          "Name": "eventversion",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": false,
          "CatalogName": "hive",
          "Label": "useridentity",
          "Name": "useridentity",
          "Nullable": "UNKNOWN",
          "Precision": 0,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "row"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "eventtime",
          "Name": "eventtime",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "eventsource",
          "Name": "eventsource",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "eventname",
          "Name": "eventname",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "awsregion",
          "Name": "awsregion",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "sourceipaddress",
          "Name": "sourceipaddress",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "useragent",
          "Name": "useragent",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "errorcode",
          "Name": "errorcode",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "errormessage",
          "Name": "errormessage",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "requestparameters",
          "Name": "requestparameters",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "responseelements",
          "Name": "responseelements",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "additionaleventdata",
          "Name": "additionaleventdata",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "requestid",
          "Name": "requestid",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "eventid",
          "Name": "eventid",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": false,
          "CatalogName": "hive",
          "Label": "resources",
          "Name": "resources",
          "Nullable": "UNKNOWN",
          "Precision": 0,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "array"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "eventtype",
          "Name": "eventtype",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "apiversion",
          "Name": "apiversion",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "readonly",
          "Name": "readonly",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "recipientaccountid",
          "Name": "recipientaccountid",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "serviceeventdetails",
          "Name": "serviceeventdetails",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "sharedeventid",
          "Name": "sharedeventid",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        },
        {
          "CaseSensitive": true,
          "CatalogName": "hive",
          "Label": "vpcendpointid",
          "Name": "vpcendpointid",
          "Nullable": "UNKNOWN",
          "Precision": 2147483647,
          "Scale": 0,
          "SchemaName": "",
          "TableName": "",
          "Type": "varchar"
        }
      ]
    },
    "Rows": [
      {
        "Data": [
          {
            "VarCharValue": "eventversion"
          },
          {
            "VarCharValue": "useridentity"
          },
          {
            "VarCharValue": "eventtime"
          },
          {
            "VarCharValue": "eventsource"
          },
          {
            "VarCharValue": "eventname"
          },
          {
            "VarCharValue": "awsregion"
          },
          {
            "VarCharValue": "sourceipaddress"
          },
          {
            "VarCharValue": "useragent"
          },
          {
            "VarCharValue": "errorcode"
          },
          {
            "VarCharValue": "errormessage"
          },
          {
            "VarCharValue": "requestparameters"
          },
          {
            "VarCharValue": "responseelements"
          },
          {
            "VarCharValue": "additionaleventdata"
          },
          {
            "VarCharValue": "requestid"
          },
          {
            "VarCharValue": "eventid"
          },
          {
            "VarCharValue": "resources"
          },
          {
            "VarCharValue": "eventtype"
          },
          {
            "VarCharValue": "apiversion"
          },
          {
            "VarCharValue": "readonly"
          },
          {
            "VarCharValue": "recipientaccountid"
          },
          {
            "VarCharValue": "serviceeventdetails"
          },
          {
            "VarCharValue": "sharedeventid"
          },
          {
            "VarCharValue": "vpcendpointid"
          }
        ]
      },
      {
        "Data": [
          {
            "VarCharValue": "1.05"
          },
          {
            "VarCharValue": "{type=AWSService, principalid=null, arn=null, accountid=null, invokedby=events.amazonaws.com, accesskeyid=null, username=null, sessioncontext=null}"
          },
          {
            "VarCharValue": "2020-01-04T00:32:57Z"
          },
          {
            "VarCharValue": "sts.amazonaws.com"
          },
          {
            "VarCharValue": "AssumeRole"
          },
          {
            "VarCharValue": "ap-northeast-1"
          },
          {
            "VarCharValue": "events.amazonaws.com"
          },
          {
            "VarCharValue": "events.amazonaws.com"
          },
          {},
          {},
          {
            "VarCharValue": "{\"roleArn\":\"arn:aws:iam::XXXXXXXXXXXX:role/service-role/Amazon_EventBridge_Invoke_Step_Functions_1620158064\",\"roleSessionName\":\"c37b539a9d0f3d0295377e34b36b1d43\",\"durationSeconds\":3600}"
          },
          {
            "VarCharValue": "{\"credentials\":{\"accessKeyId\":\"ASIAVWTKLCOCCQLQOIWI\",\"expiration\":\"Jan 4, 2020 1:32:57 AM\",\"sessionToken\":\"IQoJb3JpZ2luX2VjENj//////////wEaDmFwLW5vcnRoZWFzdC0xIkcwRQIhAPEMDgW5CDxE9M+KuCjZ3Q7ypGO786hDJt40bCbkHU10AiAbVWDmoi+1yMzKZKifNOnKtda4XckpFqFYRm67xgrA2iqdAghSEAIaDDM5MjEzODc4OTc2NCIMJo5/C2adUJZp+nMkKvoBv6zRGIKcxXeYkDz+oJjBrPIPEhEhlBuIkZqEjST/BHPSMWN3TMkvrD5Cwe8G4uZ2ZNlEHT3f2cgmmO6qiRjBtExZl9YjOOiDrfxY9HiGwyz9u9oo7a8CTbHyjEFJXTWRf2bIA48qZk+Hp9Cyy6dyKb8TtLkUOq2cI5z8QF038l7T2JKLRW2Rkry8DAdk5cmBtzAifrEr7DILzlY/UkKrfxMb+zC6mII/I5OuXtfOO0DCTq8mdHq9SW3kqvidBb3WJRUGXFjFqrpFMZFAPi79rmUvzzMi/IUpWA7DoXeoxRYCYkVNMvDUyS4oIaXMy7u80c0koW0gLLtipzC5ur/wBTqFAht4QduPAZn95HVrfDX3lgWUpqH46mya6vDVGPc1Jco8k9twoWv1YRn7f4ZHAlAunL+fCqZYvjxjbMENK2RELADSiUKqdro/GtRN5h3UQWrQWtAAZQpFizn51QHOxWiRfyo9RrrxTya0JXCMhqg3IT0TruvLCGHrTl/Sgqkr+WAGiOCD6LrVXnelBjKwN5G3TKu/Jk3GQ1K8yS9ue8eD7NBxHd71UqE9W6RA/EoC6t7oSTQdJHY6/rjwLyWOabOqqA6Um2rQS2hJhjfWUlGXYYKiT9N8J3OOAqanccDrUH8XTip36UgUZ8k/A946pKmyNFfyq5rIDI9czg75iyRyirx7tbNzYQ==\"},\"assumedRoleUser\":{\"assumedRoleId\":\"AROAVWTKLCOCARZIIGFOW:c37b539a9d0f3d0295377e34b36b1d43\",\"arn\":\"arn:aws:sts::XXXXXXXXXXXX:assumed-role/Amazon_EventBridge_Invoke_Step_Functions_1620158064/c37b539a9d0f3d0295377e34b36b1d43\"}}"
          },
          {},
          {
            "VarCharValue": "c0fa5608-2e89-11ea-9eee-f7825c6d6d40"
          },
          {
            "VarCharValue": "475f8982-5221-401d-816f-6211bcb38a1a"
          },
          {
            "VarCharValue": "[{arn=arn:aws:iam::XXXXXXXXXXXX:role/service-role/Amazon_EventBridge_Invoke_Step_Functions_1620158064, accountid=XXXXXXXXXXXX, type=AWS::IAM::Role}]"
          },
          {
            "VarCharValue": "AwsApiCall"
          },
          {},
          {},
          {
            "VarCharValue": "XXXXXXXXXXXX"
          },
          {},
          {
            "VarCharValue": "b016524b-1795-49e2-9e1d-b2b21217a167"
          },
          {}
        ]
      }
    ]
  },
  "UpdateCount": 0
}
Send query resultsステート
{
  "MessageId": "a81cc49f-8113-5b0b-96c2-9c8a72cf4e2a",
  "SdkHttpMetadata": {
    "AllHttpHeaders": {
      "x-amzn-RequestId": [
        "b94bf74e-ae6f-5937-995f-af5b5db38c16"
      ],
      "Content-Length": [
        "294"
      ],
      "Date": [
        "Fri, 30 Oct 2020 10:57:32 GMT"
      ],
      "X-Amzn-Trace-Id": [
        "Root=1-5f9bf159-15a313cbc231645a3cb3f917;Parent=7601dbf384890cfb;Sampled=1"
      ],
      "Content-Type": [
        "text/xml"
      ]
    },
    "HttpHeaders": {
      "Content-Length": "294",
      "Content-Type": "text/xml",
      "Date": "Fri, 30 Oct 2020 10:57:32 GMT",
      "x-amzn-RequestId": "b94bf74e-ae6f-5937-995f-af5b5db38c16",
      "X-Amzn-Trace-Id": "Root=1-5f9bf159-15a313cbc231645a3cb3f917;Parent=7601dbf384890cfb;Sampled=1"
    },
    "HttpStatusCode": 200
  },
  "SdkResponseMetadata": {
    "RequestId": "b94bf74e-ae6f-5937-995f-af5b5db38c16"
  }
}

レスポンスについてはこちらも参考になります。

さいごに

今回のアップデートにより、Lambda Functionなどを介さず直接クエリが実行できるようになりました。ワークフローの視認性も上がり、作り込みの簡素化が期待できそうです。動的並列処理のような既存機能も利用していくことでStep Functionsだけで、複雑なワークフローへの対応が期待できそうです。

参考