Amazon Kinesis Data Firehose For S3 Destination Using CloudFormation

2021.08.13

In this blog post I build an Amazon Kinesis Data Firehose delivery stream that delivers data to Amazon Simple Storage Service (S3) using CloudFormation. Before the implementation, let's understand what Amazon Kinesis and Amazon Kinesis Data Firehose are.

Amazon Kinesis

Amazon Kinesis is an AWS service that makes it easy to collect, process, and analyse streaming data in near real time at massive scale. Amazon Kinesis applications can be used to build dashboards, capture exceptions and generate alerts, drive recommendations, and make other near real-time business or operational decisions. Amazon Kinesis provides four streaming data services:

  • Amazon Kinesis Data Streams — To collect and process large streams of data records in near real time.
  • Amazon Kinesis Data Firehose — To deliver near real-time streaming data to destinations such as Amazon S3, Redshift etc.
  • Amazon Kinesis Data Analytics — To process and analyze streaming data using standard SQL.
  • Amazon Kinesis Video Streams — A fully managed service used to stream live video from devices.

For more information, see Kinesis

Amazon Kinesis Data Firehose

Amazon Kinesis Data Firehose is a fully managed AWS service that delivers near-real-time streaming data to destinations. It loads streaming data into data lakes, data stores, and analytics services. The destinations that Amazon Kinesis Data Firehose supports include the following:

  • Amazon S3 — an easy to use object storage
  • Amazon Redshift — petabyte-scale data warehouse
  • Amazon Elasticsearch Service — a managed service for the open source Elasticsearch search and analytics engine
  • Splunk — an operational intelligence tool for analyzing machine-generated data

For more information, see Amazon Kinesis Data Firehose

Amazon Kinesis Data Firehose Delivery Stream

The AWS::KinesisFirehose::DeliveryStream resource creates an Amazon Kinesis Data Firehose delivery stream that delivers near-real-time streaming data to an Amazon Simple Storage Service (Amazon S3), Amazon Redshift, or Amazon Elasticsearch Service (Amazon ES) destination. For more information, see aws-resource-kinesisfirehose-deliverystream

AWS CloudFormation 

AWS CloudFormation simplifies resource provisioning and management for a wide range of AWS services. The CloudFormation service quickly and reliably provisions application architectures (or ‘stacks’) that you model in CloudFormation template files. It is easy to update or replicate the stacks as needed. For more information, see AWS CloudFormation

Create CloudFormation Stack

  1. Go to the CloudFormation Management Console and click Create stack.
  2. Specify the template. Use the CloudFormation template below to create a stack for an Amazon Kinesis Data Firehose delivery stream that delivers data to Amazon Simple Storage Service (S3). A template is a JSON or YAML file that describes your stack's resources and properties. Upload this YAML template file and click Next.

YAML

AWSTemplateFormatVersion: 2010-09-09
Description: Stack for Firehose DeliveryStream S3 Destination.
Resources:
  deliverystream:
    DependsOn:
      - deliveryPolicy
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      ExtendedS3DestinationConfiguration:
        BucketARN: !Join 
          - ''
          - - 'arn:aws:s3:::'
            - !Ref s3bucket
        BufferingHints:
          IntervalInSeconds: 60
          SizeInMBs: 50 
        CompressionFormat: UNCOMPRESSED
        Prefix: firehose/
        RoleARN: !GetAtt deliveryRole.Arn
        ProcessingConfiguration:
          Enabled: true
          Processors:
            - Parameters:
                - ParameterName: LambdaArn
                  ParameterValue: 'arn:aws:lambda:XX (Lambda Arn)'
              Type: Lambda 
  s3bucket:
    Type: AWS::S3::Bucket
    Properties:
      VersioningConfiguration:
        Status: Enabled
  deliveryRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Sid: ''
            Effect: Allow
            Principal:
              Service: firehose.amazonaws.com
            Action: 'sts:AssumeRole'
            Condition:
              StringEquals:
                'sts:ExternalId': 'XXXXXXXX(Your AWS AccountID)'
  deliveryPolicy:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: firehose_delivery_policy
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action:
              - 's3:AbortMultipartUpload'
              - 's3:GetBucketLocation'
              - 's3:GetObject'
              - 's3:ListBucket'
              - 's3:ListBucketMultipartUploads'
              - 's3:PutObject'
            Resource:
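              # The bucket itself (for bucket-level actions such as s3:ListBucket)
              - !Join 
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref s3bucket
              # Every object in the bucket (for object-level actions such as s3:PutObject)
              - !Join 
                - ''
                - - 'arn:aws:s3:::'
                  - !Ref s3bucket
                  - '/*'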
      Roles:
        - !Ref deliveryRole

  3. Specify the stack details. Give the stack a name of your choice and click Next.
  4. Configure the stack options. Add tags (optional) and click Next.
  5. Review and deploy the stack.
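The template above hard-codes two placeholders: the Lambda ARN and the account ID. As a rough sketch, assuming you would rather enter the Lambda ARN on the stack details page, the fragment below declares a template parameter and shows where it would be referenced. The parameter name TransformLambdaArn is hypothetical and is not part of the original template. AWS::AccountId is a built-in CloudFormation pseudo parameter.

```yaml
# Sketch only: TransformLambdaArn is a hypothetical parameter name.
Parameters:
  TransformLambdaArn:
    Type: String
    Description: ARN of the Lambda function that transforms incoming records.

# In the deliverystream resource, reference the parameter
# instead of the hard-coded placeholder:
#
#   - ParameterName: LambdaArn
#     ParameterValue: !Ref TransformLambdaArn
#
# In the deliveryRole trust policy, let CloudFormation fill in the account ID:
#
#   'sts:ExternalId': !Ref 'AWS::AccountId'
```

With a parameter declared, the console asks for its value in the same step where you name the stack, so the template file itself no longer needs editing per account.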

Properties

DeliveryStreamName:

The name of the delivery stream.

DeliveryStreamType:

The delivery stream type. This can be one of the following values:

  • DirectPut: Provider applications access the delivery stream directly.
  • KinesisStreamAsSource: The delivery stream uses a Kinesis data stream as a source.
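The template in this post sets neither property, so CloudFormation generates a name and the type defaults to DirectPut. As an illustration only, the fragment below shows how the two properties might look when a Kinesis data stream is the source. The sourceStream resource and the stream name are hypothetical, and the delivery role would additionally need read access to the source stream, which is not shown here.

```yaml
# Sketch only: sourceStream and the delivery stream name are hypothetical.
Resources:
  sourceStream:
    Type: AWS::Kinesis::Stream
    Properties:
      ShardCount: 1
  deliverystream:
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      DeliveryStreamName: example-delivery-stream
      DeliveryStreamType: KinesisStreamAsSource
      # Required when the type is KinesisStreamAsSource.
      KinesisStreamSourceConfiguration:
        KinesisStreamARN: !GetAtt sourceStream.Arn
        RoleARN: !GetAtt deliveryRole.Arn
      # ExtendedS3DestinationConfiguration goes here, as in the template above.
```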

ExtendedS3DestinationConfiguration:

An Amazon S3 destination for the delivery stream. Conditional. You must specify only one destination configuration.

S3DestinationConfiguration

The S3DestinationConfiguration property type specifies an Amazon Simple Storage Service (Amazon S3) destination to which Amazon Kinesis Data Firehose (Kinesis Data Firehose) delivers data. Conditional. You must specify only one destination configuration.

Ref

When the logical ID of this resource is provided to the Ref intrinsic function, Ref returns the delivery stream name, such as mystack-deliverystream-1ABCD2EF3GHIJ.

Fn::GetAtt

Fn::GetAtt returns a value for a specified attribute of this type. The available attribute is Arn, which returns the Amazon Resource Name (ARN) of the delivery stream.
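One way to see both return values is to expose them as stack outputs. The fragment below is a minimal sketch that could be appended to the template above; the output names are arbitrary choices, not part of the original template.

```yaml
# Sketch only: the output names are arbitrary.
Outputs:
  DeliveryStreamName:
    Description: Name of the delivery stream, as returned by Ref.
    Value: !Ref deliverystream
  DeliveryStreamArn:
    Description: ARN of the delivery stream, as returned by Fn::GetAtt.
    Value: !GetAtt deliverystream.Arn
```

After the stack is created, these values appear on the Outputs tab of the stack in the CloudFormation console.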

Resource

The Resources section of this file defines the resources to be provisioned in the stack.

BucketARN

The Amazon Resource Name (ARN) of the Amazon S3 bucket.

Tags

A tag is a key-value pair that you can define and assign to AWS resources. Tags are metadata. You can specify up to 50 tags when creating a delivery stream.
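The template in this post does not set any tags on the delivery stream. As a small sketch, tags could be added under the Properties of the deliverystream resource like this; the key and value are made-up examples.

```yaml
# Sketch only: the tag key and value are made-up examples.
  deliverystream:
    Type: AWS::KinesisFirehose::DeliveryStream
    Properties:
      Tags:
        - Key: project
          Value: firehose-demo
      # ExtendedS3DestinationConfiguration goes here, as in the template above.
```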

RoleARN

The Amazon Resource Name (ARN) of the IAM role that Kinesis Data Firehose assumes to deliver data. In the template above this is the ARN of deliveryRole.

Prefix

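The YYYY/MM/DD/HH time format prefix is automatically used for delivered Amazon S3 files. With Prefix you can specify an extra prefix that is added in front of the time format prefix. In the template above this is firehose/.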
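To make the effect of the prefix concrete, the annotated fragment below repeats the Prefix line from the template and sketches what the resulting object keys look like. The date shown is only an example, and the exact object name is generated by Firehose.

```yaml
        # With this prefix, delivered objects get keys that look roughly like
        #   firehose/2021/08/13/09/<object name generated by Firehose>
        # where the date and hour parts are in UTC.
        Prefix: firehose/
```

Ending the prefix with a slash makes it show up as a folder in the S3 console.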

CompressionFormat

The compression format. If no value is specified, the default is UNCOMPRESSED. Allowed values: GZIP | HADOOP_SNAPPY | Snappy | UNCOMPRESSED | ZIP

ProcessingConfiguration

The data processing configuration for the Kinesis Data Firehose delivery stream.
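The template enables a Lambda processor, but deliveryPolicy lists only S3 actions. The delivery role will most likely also need permission to invoke the function. The fragment below is a minimal sketch of an extra statement for that policy. It reuses the placeholder ARN from the template, which you would replace with your own function's ARN.

```yaml
          # Sketch only: an additional entry for the Statement list of deliveryPolicy.
          - Effect: Allow
            Action:
              - 'lambda:InvokeFunction'
              - 'lambda:GetFunctionConfiguration'
            # Same placeholder as in the template; replace with your function's ARN.
            Resource: 'arn:aws:lambda:XX (Lambda Arn)'
```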

BufferingHints

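The buffering option. Kinesis Data Firehose buffers incoming data before delivering it to S3, and a delivery is triggered when either SizeInMBs or IntervalInSeconds is reached, whichever comes first. The template above uses 50 MB and 60 seconds.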
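For a quick test with little data, a smaller buffer size can be more convenient. The values below are an illustrative variation on the template's settings, not a recommendation.

```yaml
        # Sketch only: illustrative values for a low-volume test.
        BufferingHints:
          IntervalInSeconds: 60   # deliver once 60 seconds have passed ...
          SizeInMBs: 5            # ... or once 5 MB have accumulated, whichever comes first
```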

Test The Delivery Stream

The Kinesis Data Firehose delivery stream has now been created.

Let’s test it. Click on the delivery stream and open the Test with demo data section.

Click Start sending demo data. This starts sending records to the delivery stream. When you have sent enough, click Stop sending demo data to avoid further cost. It might take a few minutes for new objects to appear in your bucket, depending on the buffering configuration of your delivery stream. Go to the destination S3 bucket and verify that the streaming data has been uploaded to S3. Also check that the delivered records do not contain the Change attribute.

The Amazon Kinesis Data Firehose delivery stream for S3 has now been created with CloudFormation and tested successfully. If you want to send the data to another destination, follow the Amazon Kinesis Data Firehose documentation: create-destination-s3, aws-resource-kinesisfirehose-deliverystream