I tried Bedrock stream responses with API Gateway

The November 2025 update breaks through API Gateway's "29-second wall". We tested response streaming with Bedrock and confirmed that generative AI responses can be returned without hitting the 10 MB payload limit or the integration timeout.
2025.11.22


On November 19, 2025, Amazon API Gateway was updated to support response streaming.

https://aws.amazon.com/jp/blogs/compute/building-responsive-apis-with-amazon-api-gateway-response-streaming/

API Gateway previously imposed a 29-second integration timeout and a 10 MB payload size limit. With this update, generated data can be streamed to the client in chunks as it is produced, bypassing both constraints.

To try the feature, I used CloudFormation to deploy a Lambda function that streams Bedrock (Claude Haiku 4.5) responses behind an API Gateway with response streaming enabled. Here is what I found.

Test Environment

For this test, I deployed the following Lambda function and API Gateway using CloudFormation:

  • Runtime: Node.js 20.x
  • Model: Claude Haiku 4.5 (Bedrock)
  • Feature: API Gateway Response Streaming

Lambda Function

  • Instead of the plain exports.handler = async ... signature, the handler is wrapped in awslambda.streamifyResponse
  • Output chunked data using responseStream.write
// ... (omitted) ...
exports.handler = awslambda.streamifyResponse(async (event, responseStream, context) => {
  const httpResponseMetadata = { /* ... */ };

  // Required: Send metadata
  responseStream = awslambda.HttpResponseStream.from(responseStream, httpResponseMetadata);

  // ... (Bedrock invocation process) ...

  for await (const chunk of response.body) {
    // Important: parse each chunk and write its text delta to the stream
    const text = JSON.parse(new TextDecoder().decode(chunk.chunk.bytes));
    responseStream.write(text.delta.text);
  }

  responseStream.end();
});
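The per-chunk handling inside that loop can be factored into a small helper. Below is a minimal sketch (the helper name `extractDelta` and the fabricated chunk object are illustrative, not part of the deployed template): each stream event carries raw bytes holding one JSON event, and only `content_block_delta` events contain generated text.

```javascript
// Illustrative helper (not part of the template): decode one Bedrock stream
// event and return its text delta, or "" for events that carry no text.
function extractDelta(chunk) {
  if (!chunk.chunk?.bytes) return "";
  const event = JSON.parse(new TextDecoder().decode(chunk.chunk.bytes));
  return event.type === "content_block_delta" && event.delta?.text
    ? event.delta.text
    : "";
}

// Fabricated chunk shaped like the SDK's stream events, for demonstration:
const fakeChunk = {
  chunk: {
    bytes: new TextEncoder().encode(
      JSON.stringify({ type: "content_block_delta", delta: { text: "Hello" } })
    ),
  },
};
console.log(extractDelta(fakeChunk)); // → "Hello"
```

Keeping the decode-and-filter step in one place makes it easy to ignore the other event types (message_start, content_block_stop, and so on) that arrive on the same stream.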

API Gateway

  • Specified ResponseTransferMode: STREAM
  • Added /response-streaming-invocations to the end of the Uri
  StreamMethod:
    Type: AWS::ApiGateway::Method
    Properties:
      RestApiId: !Ref RestApi
      ResourceId: !Ref StreamResource
      HttpMethod: POST
      AuthorizationType: NONE
      Integration:
        Type: AWS_PROXY
        IntegrationHttpMethod: POST
        Uri: !Sub 'arn:aws:apigateway:${AWS::Region}:lambda:path/2021-11-15/functions/${StreamingLambda.Arn}/response-streaming-invocations'
        ResponseTransferMode: STREAM
        TimeoutInMillis: 300000

CloudFormation

CloudFormation template used for testing
AWSTemplateFormatVersion: '2010-09-09'
Description: 'API Gateway Response Streaming with Lambda and Bedrock Haiku 4.5'

Resources:
  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: BedrockInvokePolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - bedrock:InvokeModelWithResponseStream
                Resource: '*'

  StreamingLambda:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: BedrockStreamingFunction
      Runtime: nodejs20.x
      Handler: index.handler
      Role: !GetAtt LambdaExecutionRole.Arn
      Timeout: 300
      Code:
        ZipFile: |
          const { BedrockRuntimeClient, InvokeModelWithResponseStreamCommand } = require("@aws-sdk/client-bedrock-runtime");

          const client = new BedrockRuntimeClient({ region: process.env.AWS_REGION });

          exports.handler = awslambda.streamifyResponse(async (event, responseStream, context) => {
            const httpResponseMetadata = {
              statusCode: 200,
              headers: {
                'Content-Type': 'text/plain',
                'Access-Control-Allow-Origin': '*'
              }
            };

            responseStream = awslambda.HttpResponseStream.from(responseStream, httpResponseMetadata);

            try {
              const body = JSON.parse(event.body || '{}');
              const prompt = body.prompt || "こんにちは";

              const command = new InvokeModelWithResponseStreamCommand({
                modelId: "jp.anthropic.claude-haiku-4-5-20251001-v1:0",
                contentType: "application/json",
                accept: "application/json",
                body: JSON.stringify({
                  anthropic_version: "bedrock-2023-05-31",
                  max_tokens: 100000,
                  messages: [{
                    role: "user",
                    content: prompt
                  }]
                })
              });

              const response = await client.send(command);

              for await (const chunk of response.body) {
                if (chunk.chunk?.bytes) {
                  const text = JSON.parse(new TextDecoder().decode(chunk.chunk.bytes));
                  if (text.type === 'content_block_delta' && text.delta?.text) {
                    responseStream.write(text.delta.text);
                  }
                }
              }

              responseStream.end();
            } catch (error) {
              responseStream.write(`Error: ${error.message}`);
              responseStream.end();
            }
          });

  LambdaInvokePermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref StreamingLambda
      Action: lambda:InvokeFunction
      Principal: apigateway.amazonaws.com
      SourceArn: !Sub 'arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${RestApi}/*'

  RestApi:
    Type: AWS::ApiGateway::RestApi
    Properties:
      Name: BedrockStreamingAPI
      Description: API Gateway with response streaming for Bedrock

  StreamResource:
    Type: AWS::ApiGateway::Resource
    Properties:
      RestApiId: !Ref RestApi
      ParentId: !GetAtt RestApi.RootResourceId
      PathPart: stream

  StreamMethod:
    Type: AWS::ApiGateway::Method
    Properties:
      RestApiId: !Ref RestApi
      ResourceId: !Ref StreamResource
      HttpMethod: POST
      AuthorizationType: NONE
      Integration:
        Type: AWS_PROXY
        IntegrationHttpMethod: POST
        Uri: !Sub 'arn:aws:apigateway:${AWS::Region}:lambda:path/2021-11-15/functions/${StreamingLambda.Arn}/response-streaming-invocations'
        ResponseTransferMode: STREAM
        TimeoutInMillis: 300000

  OptionsMethod:
    Type: AWS::ApiGateway::Method
    Properties:
      RestApiId: !Ref RestApi
      ResourceId: !Ref StreamResource
      HttpMethod: OPTIONS
      AuthorizationType: NONE
      Integration:
        Type: MOCK
        IntegrationResponses:
          - StatusCode: 200
            ResponseParameters:
              method.response.header.Access-Control-Allow-Headers: "'Content-Type,X-Amz-Date,Authorization,X-Api-Key'"
              method.response.header.Access-Control-Allow-Methods: "'POST,OPTIONS'"
              method.response.header.Access-Control-Allow-Origin: "'*'"
            ResponseTemplates:
              application/json: ''
        RequestTemplates:
          application/json: '{"statusCode": 200}'
      MethodResponses:
        - StatusCode: 200
          ResponseParameters:
            method.response.header.Access-Control-Allow-Headers: true
            method.response.header.Access-Control-Allow-Methods: true
            method.response.header.Access-Control-Allow-Origin: true

  Deployment:
    Type: AWS::ApiGateway::Deployment
    DependsOn:
      - StreamMethod
      - OptionsMethod
    Properties:
      RestApiId: !Ref RestApi

  Stage:
    Type: AWS::ApiGateway::Stage
    Properties:
      RestApiId: !Ref RestApi
      DeploymentId: !Ref Deployment
      StageName: prod

Outputs:
  ApiEndpoint:
    Description: API Gateway endpoint URL
    Value: !Sub 'https://${RestApi}.execute-api.${AWS::Region}.amazonaws.com/prod/stream'

  TestCommand:
    Description: Test command using curl
    Value: !Sub |
      curl --no-buffer -X POST https://${RestApi}.execute-api.${AWS::Region}.amazonaws.com/prod/stream \
        -H "Content-Type: application/json" \
        -d '{"prompt":"日本のAWSリージョンについて教えてください"}'

Testing

Execution Command

I sent a request to the deployed API with curl, adding the --no-buffer option so that chunks print as they arrive. The prompt asks for a detailed explanation of AWS regions in Japan across eight aspects (history and opening dates, service differences, Availability Zone configuration, latency and performance, pricing, disaster recovery usage, compliance, and future outlook), with concrete examples, to force a long response.

curl --no-buffer -X POST https://****.ap-northeast-1.amazonaws.com/prod/stream \
  -H "Content-Type: application/json" \
  -d '{"prompt":"日本のAWSリージョンについて、以下の観点から詳しく説明してください:1) 各リージョンの歴史と開設時期、2) 提供されているサービスの違い、3) アベイラビリティゾーンの構成、4) レイテンシーとパフォーマンス特性、5) 料金体系の違い、6) ディザスタリカバリー戦略での活用方法、7) コンプライアンスと規制対応、8) 今後の展望。各項目について具体例を交えて説明してください。"}'

Results

I received about 52KB of text streamed over approximately 1.5 minutes.

100 55068    0 54536  100   532    571      5  0:01:46  0:01:35  0:00:11   472
このような選択により、ビジネス要件に応じた最適なAWS利用戦略が実現できます。
 - Completed in 95.500s

The request ran for about 95 seconds, well past the 29-second limit that would have cut it off in the previous API Gateway, yet thanks to streaming the connection was never dropped and the complete response arrived.
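On the client side, a browser or Node.js (18+) application can consume the same stream incrementally through fetch's ReadableStream interface. The following is a rough sketch; since it cannot reach the real endpoint here, a locally built Response stands in for the actual `await fetch(apiEndpoint, { method: "POST", ... })` call against the deployed /prod/stream URL:

```javascript
// Read a streamed response body chunk by chunk, invoking onChunk for each
// piece of decoded text as it arrives.
async function readStream(body, onChunk) {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // Chunks arrive as they are written by responseStream.write() on Lambda
    onChunk(decoder.decode(value, { stream: true }));
  }
}

// Stand-in for `await fetch(...)`: two chunks, then end of stream.
function makeSimulatedResponse() {
  return new Response(
    new ReadableStream({
      start(controller) {
        controller.enqueue(new TextEncoder().encode("Tokyo "));
        controller.enqueue(new TextEncoder().encode("Osaka"));
        controller.close();
      },
    })
  );
}

let output = "";
readStream(makeSimulatedResponse().body, (text) => (output += text)).then(
  () => console.log(output) // prints "Tokyo Osaka"
);
```

In a real UI you would append each chunk to the page as it arrives, which is what makes the streamed response feel immediate even when the full generation takes over a minute.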

Summary

Previously, when implementing responses from AI generation models that run for extended periods or large data downloads, we often had to choose configurations like ALB (Application Load Balancer) + Fargate to avoid API Gateway limitations.

With this update, these requirements can be met while staying fully serverless (API Gateway + Lambda), so it may be worth revisiting such architectures and simplifying them to a Lambda-based implementation.
