What to do when log data delivered from a CloudWatch Logs subscription filter to Amazon Data Firehose appears garbled

AWS Technical Support Notes

#AWS

#Amazon Data Firehose

#CloudWatch Logs

mochizuki.koji

2025.02.09

The problem

I send log data to S3 from a CloudWatch Logs subscription filter via Amazon Data Firehose (hereafter "Data Firehose"). When I checked the log data delivered by the subscription filter, it was garbled.
In CloudWatch Logs itself, however, the same log data is not garbled and can be viewed without any problem.

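I also tried changing settings under "Destination settings" on the Data Firehose stream, such as data record compression and the file extension format, but the garbling did not go away.
I want to prevent this garbling. How should I handle it?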
How to resolve it

Decompress the data delivered from the CloudWatch Logs subscription filter with gzip, and check whether the decompressed log data is still garbled.
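The decompression check can be sketched in Python as follows. This is a minimal sketch, assuming the object downloaded from S3 contains one or more gzip members, each holding a JSON document in the CloudWatch Logs subscription format (`messageType`, `logEvents`, and so on). The function names are hypothetical and exist only for illustration.

```python
import gzip
import json


def decode_subscription_payload(raw: bytes) -> list[dict]:
    """Decompress gzip data and parse the JSON documents inside it."""
    # gzip.decompress also handles several gzip members concatenated
    # back to back, which is how multiple records can share one object.
    text = gzip.decompress(raw).decode("utf-8")

    decoder = json.JSONDecoder()
    documents = []
    pos = 0
    while pos < len(text):
        # Skip whitespace or newlines between JSON documents.
        if text[pos].isspace():
            pos += 1
            continue
        document, pos = decoder.raw_decode(text, pos)
        documents.append(document)
    return documents


def extract_messages(documents: list[dict]) -> list[str]:
    """Collect log messages from DATA_MESSAGE documents only."""
    return [
        event["message"]
        for document in documents
        if document.get("messageType") == "DATA_MESSAGE"
        for event in document.get("logEvents", [])
    ]
```

If the messages returned by `extract_messages` read normally, the "garbling" was simply gzip-compressed binary being displayed as text, not a character-encoding problem.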

If the decompressed log data is not garbled, consider enabling the "Decompress source records from Amazon CloudWatch Logs" feature on the Data Firehose stream. [1]
Log data delivered to a service through a CloudWatch Logs subscription filter is compressed in gzip format. [2]
Logs sent to a service through a subscription filter are base64 encoded and compressed with the gzip format.
With "Decompress source records from Amazon CloudWatch Logs" enabled, the Data Firehose stream decompresses the gzip data itself and delivers the already decompressed log data to S3. [3]
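For streams managed through the API rather than the console, the same setting is, to the best of my understanding, expressed as entries in the destination's `ProcessingConfiguration`. The sketch below only builds that structure as a Python dictionary. The processor types and parameter names (`Decompression`, `CompressionFormat`, `CloudWatchLogProcessing`, `DataMessageExtraction`) are assumptions that should be checked against the Firehose API reference before use, and the function name is hypothetical.

```python
def build_decompression_processing_config(extract_messages: bool = True) -> dict:
    """Build a ProcessingConfiguration-style dict that turns on decompression.

    Assumption: the names below match the Firehose API; verify before use.
    """
    processors = [
        {
            # Decompress the gzip data sent by the subscription filter.
            "Type": "Decompression",
            "Parameters": [
                {"ParameterName": "CompressionFormat", "ParameterValue": "GZIP"}
            ],
        }
    ]
    if extract_messages:
        # Optionally keep only the log messages from each decompressed record.
        processors.append(
            {
                "Type": "CloudWatchLogProcessing",
                "Parameters": [
                    {"ParameterName": "DataMessageExtraction", "ParameterValue": "true"}
                ],
            }
        )
    return {"Enabled": True, "Processors": processors}
```

The resulting dictionary would be placed inside the S3 destination settings of a create or update request. It is kept separate from any API call here so that it can be inspected first.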
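Note that to enable "Decompress source records from Amazon CloudWatch Logs" on an existing Data Firehose stream, you must first enable the stream's "Transform source records with AWS Lambda" feature once, and then enable decompression and carry out the replacement steps. [4]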
Enabling decompression when Lambda processing is disabled
To enable decompression on an existing Firehose stream with Lambda processing disabled, you must first enable Lambda processing. This condition is only valid for existing streams. Following steps show how to enable decompression on existing streams that do not have Lambda processing enabled.
1. Create a Lambda function. You can either create a dummy record pass through or can use this blueprint to create a new Lambda function.
2. Update your current Firehose stream to enable Lambda processing and use the Lambda function that you created for processing.
3. Once you update the stream with new Lambda function, go back to Firehose console and enable decompression.
4. Disable the Lambda processing that you enabled in step 1. You can now delete the function that you created in step 1.
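The "dummy record pass through" function mentioned in the first step could look like the sketch below. It assumes the standard Firehose data transformation contract, in which each incoming record carries a `recordId` and base64-encoded `data`, and each returned record reports a `result`. It is a placeholder for this procedure, not the blueprint referred to in the quote.

```python
def lambda_handler(event, context):
    """Return every incoming Firehose record unchanged."""
    output = []
    for record in event["records"]:
        output.append(
            {
                "recordId": record["recordId"],
                "result": "Ok",  # mark the record as successfully processed
                "data": record["data"],  # base64 payload passed through untouched
            }
        )
    return {"records": output}
```

Because the function does not modify the data, it can be deleted once decompression has been enabled and Lambda processing has been turned off again.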
Enabling decompression when Lambda processing is enabled
If you already have a Firehose stream with a Lambda function, to perform decompression you can replace it with the Firehose decompression feature. Before you proceed, review your Lambda function code to confirm that it only performs decompression or message extraction. The output of your Lambda function should look similar to the examples shown in Fig 1 or Fig 2. If the output looks similar, you can replace the Lambda function using the following steps.
1. Replace your current Lambda function with this blueprint. The new blueprint Lambda function automatically detects whether the incoming data is compressed or decompressed. It only performs decompression if its input data is compressed.
2. Turn on decompression using the built-in Firehose option for decompression.
3. Enable CloudWatch metrics for your Firehose stream if it's not already enabled. Monitor the metric CloudWatchProcessorLambda_IncomingCompressedData and wait until this metric changes to zero. This confirms that all input data sent to your Lambda function is decompressed and the Lambda function is no longer required.
4. Remove the Lambda data transformation because you no longer need it to decompress your stream.
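The "wait until this metric changes to zero" check can be made a little more explicit. The sketch below assumes the datapoints for CloudWatchProcessorLambda_IncomingCompressedData have already been fetched with the `Sum` statistic, in the `Timestamp` / `Sum` shape that CloudWatch returns. The function name and the `required_points` threshold are hypothetical choices, not values from the AWS documentation.

```python
def compressed_input_has_stopped(datapoints: list[dict], required_points: int = 3) -> bool:
    """Return True when the most recent datapoints all report zero compressed input."""
    if len(datapoints) < required_points:
        # Not enough history yet to conclude anything.
        return False
    # Datapoints are not guaranteed to arrive in order, so sort by time first.
    latest = sorted(datapoints, key=lambda point: point["Timestamp"])[-required_points:]
    return all(point["Sum"] == 0 for point in latest)
```

Requiring several consecutive zero datapoints, rather than a single one, guards against removing the Lambda transformation during a brief lull in traffic.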

Footnotes
[1] Enable decompression on a new Firehose stream from console - Amazon Data Firehose
[2] Log group-level subscription filters - Amazon CloudWatch Logs
[3] [Update] Amazon Kinesis Data Firehose delivery streams can now send CloudWatch log data after decompressing it | DevelopersIO
[4] Enable decompression on an existing Firehose stream - Amazon Data Firehose