Building Serverless RAG with GenU (Generative AI Use Cases JP) v5.4 and S3 Vectors

2026.02.19

I'm Ishikawa from the Consulting Division, Cloud Business Headquarters. In the GenU (Generative AI Use Cases JP) v5.4 environment I set up recently, you can use Amazon S3 Vectors to build a serverless RAG with much lower running costs. This article shows how to use S3 Vectors, which reached GA at the end of last year, together with AgentCore from GenU.

https://dev.classmethod.jp/articles/20260216-genu54-for-claude46opus/

Prerequisites

  • Vector store: Amazon S3 Vectors
  • RAG: Amazon Bedrock Knowledge Base
  • S3 Vectors and Bedrock Knowledge Base are in the Tokyo Region (ap-northeast-1)
  • Embedding model: Amazon Titan Text Embeddings V2
  • Create and deploy Amazon Bedrock AgentCore that calls Amazon Bedrock Knowledge Base
  • Deploy GenU (Generative AI Use Cases JP) that calls Amazon Bedrock Knowledge Base

Resource Configuration

Preparing the S3 Source Bucket

Place the documents to be ingested into the RAG (Amazon S3 Vectors) in an S3 bucket. This article uses cm-rag-source-bucket with the documents/ prefix.

Creating an S3 Vector Bucket

Create a Vector Bucket to store vector data.

Setting item          Value
Vector bucket name    cm-rag-vector-bucket
Encryption            SSE-S3 (default)

Creating an S3 Vector Index

Create a vector index within the Vector Bucket.

Setting item                    Value
Vector index name               cm-rag-index
Dimension                       512
Distance metric                 Cosine
Non-filterable metadata keys    AMAZON_BEDROCK_TEXT, AMAZON_BEDROCK_METADATA
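If you prefer scripting over the console, the same bucket and index can probably be created with boto3. The following is a minimal sketch, not the procedure used in this article: it assumes a boto3 version that includes the s3vectors client, and the helper names (build_index_request, create_bucket_and_index) are hypothetical. The parameter values mirror the settings above.

```python
VECTOR_BUCKET = "cm-rag-vector-bucket"
VECTOR_INDEX = "cm-rag-index"


def build_index_request(bucket: str = VECTOR_BUCKET, index: str = VECTOR_INDEX) -> dict:
    """Return keyword arguments for s3vectors create_index (hypothetical helper)."""
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "dataType": "float32",
        "dimension": 512,            # must match the embedding dimension chosen later
        "distanceMetric": "cosine",
        "metadataConfiguration": {
            # Keys that Bedrock Knowledge Base writes; stored but not filterable
            "nonFilterableMetadataKeys": [
                "AMAZON_BEDROCK_TEXT",
                "AMAZON_BEDROCK_METADATA",
            ]
        },
    }


def create_bucket_and_index(region: str = "ap-northeast-1") -> None:
    """Create the vector bucket, then the index inside it."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("s3vectors", region_name=region)
    client.create_vector_bucket(vectorBucketName=VECTOR_BUCKET)
    client.create_index(**build_index_request())
```

Calling create_bucket_and_index() requires AWS credentials with the corresponding s3vectors permissions.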

Creating an IAM Service Role

Create a service role (cm-japanese-rag-kb-role) for Bedrock Knowledge Base to access various resources.

Setting up the Trust Policy

Trusted entity type: Select Custom trust policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "bedrock.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "<AccountID>"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:bedrock:<Region>:<AccountID>:knowledge-base/*"
        }
      }
    }
  ]
}
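The <AccountID> and <Region> placeholders have to be filled in before the policy is usable. As a rough sketch of doing that programmatically, the following builds the same trust policy and creates the role with boto3. The function names are hypothetical; iam.create_role and its RoleName / AssumeRolePolicyDocument parameters are standard boto3.

```python
import json


def build_trust_policy(account_id: str, region: str) -> dict:
    """Fill the account ID and Region into the trust policy shown above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:bedrock:{region}:{account_id}:knowledge-base/*"
                    },
                },
            }
        ],
    }


def create_kb_role(account_id: str, region: str,
                   role_name: str = "cm-japanese-rag-kb-role") -> str:
    """Create the service role and return its ARN."""
    import boto3  # imported lazily so the policy builder works without AWS access

    iam = boto3.client("iam")
    response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(build_trust_policy(account_id, region)),
    )
    return response["Role"]["Arn"]
```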

Creating and Attaching Permission Policies

Create and attach the following inline policy (cm-japanese-rag-kb-policy).

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3SourceReadAccess",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::cm-rag-source-bucket",
        "arn:aws:s3:::cm-rag-source-bucket/*"
      ]
    },
    {
      "Sid": "S3VectorsAccess",
      "Effect": "Allow",
      "Action": [
        "s3vectors:PutVectors",
        "s3vectors:GetVectors",
        "s3vectors:DeleteVectors",
        "s3vectors:QueryVectors",
        "s3vectors:GetIndex"
      ],
      "Resource": "arn:aws:s3vectors:<Region>:<AccountID>:bucket/cm-rag-vector-bucket/index/cm-rag-index"
    },
    {
      "Sid": "BedrockModelInvocation",
      "Effect": "Allow",
      "Action": "bedrock:InvokeModel",
      "Resource": "arn:aws:bedrock:<Region>::foundation-model/amazon.titan-embed-text-v2:0"
    }
  ]
}

Creating Amazon Bedrock Knowledge Base + Data Source

Here, we'll create an Amazon Bedrock Knowledge Base.

Step 1: Knowledge Base Details

Enter the knowledge base name, description, and the service role created earlier.

Setting item                  Value
Knowledge base name           cm-japanese-rag-kb
Knowledge base description    Japanese text RAG with S3 Vectors
IAM permissions               cm-japanese-rag-kb-role

Step 2: Setting up Data Source and Chunking Strategy

Set up the data source and chunking strategy. This article uses semantic chunking, which determines split positions from semantic connections in the text rather than from character count alone, so chunks keep meaningful segments while respecting sentence boundaries.

Making this determination requires calling a model (the embedding model), so the additional model calls add cost during data ingestion.
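To give a feel for the idea, here is a toy sketch of percentile-based semantic splitting. It is not how Bedrock implements semantic chunking: the bag-of-words embed function is a stand-in for a real embedding model, and buffer size and token limits are not modeled. All names are hypothetical.

```python
import math
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(sentence.lower().replace(".", "").split())


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / norm if norm else 0.0)


def semantic_chunks(sentences: list[str], percentile: float = 90.0) -> list[list[str]]:
    """Start a new chunk where the adjacent-sentence distance reaches the percentile threshold."""
    if len(sentences) < 2:
        return [sentences] if sentences else []
    vectors = [embed(s) for s in sentences]
    distances = [cosine_distance(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]
    ranked = sorted(distances)
    # Nearest-rank percentile of the observed distances
    threshold = ranked[max(0, math.ceil(percentile / 100 * len(ranked)) - 1)]
    chunks, current = [], [sentences[0]]
    for sentence, distance in zip(sentences[1:], distances):
        if distance >= threshold:  # large semantic jump: close the current chunk
            chunks.append(current)
            current = []
        current.append(sentence)
    chunks.append(current)
    return chunks
```

With four sentences where the first two discuss paid leave and the last two discuss travel expenses, the largest distance falls between the second and third sentence, so the sketch splits there.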

Setting item        Value
Data source name    cm-japanese-rag-kb-s3-source
S3 URI              s3://cm-rag-source-bucket/documents/

Select Semantic chunking for Chunking strategy.

Setting item                                          Value
Maximum buffer size for comparing sentence groups     1
Maximum token size of chunks                          15
Breakpoint threshold for sentence group similarity    90

These values are based on the official AWS blog post below; adjust them for your document types.

https://aws.amazon.com/jp/blogs/machine-learning/amazon-bedrock-knowledge-bases-now-supports-advanced-parsing-chunking-and-query-reformulation-giving-greater-control-of-accuracy-in-rag-based-applications/
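For reference, the console settings above should correspond to the vectorIngestionConfiguration structure of the bedrock-agent CreateDataSource API. The sketch below shows that mapping as I understand it; the helper names are hypothetical, and the values are the ones entered in this article.

```python
def build_vector_ingestion_configuration(buffer_size: int = 1,
                                         max_tokens: int = 15,
                                         breakpoint_threshold: int = 90) -> dict:
    """Semantic chunking settings in the shape expected by create_data_source."""
    return {
        "chunkingConfiguration": {
            "chunkingStrategy": "SEMANTIC",
            "semanticChunkingConfiguration": {
                "bufferSize": buffer_size,                              # sentence-group buffer
                "maxTokens": max_tokens,                                # maximum tokens per chunk
                "breakpointPercentileThreshold": breakpoint_threshold,  # split threshold
            },
        }
    }


def create_data_source(knowledge_base_id: str, region: str = "ap-northeast-1") -> str:
    """Create the S3 data source and return its ID."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("bedrock-agent", region_name=region)
    response = client.create_data_source(
        knowledgeBaseId=knowledge_base_id,
        name="cm-japanese-rag-kb-s3-source",
        dataSourceConfiguration={
            "type": "S3",
            "s3Configuration": {
                "bucketArn": "arn:aws:s3:::cm-rag-source-bucket",
                "inclusionPrefixes": ["documents/"],
            },
        },
        vectorIngestionConfiguration=build_vector_ingestion_configuration(),
    )
    return response["dataSource"]["dataSourceId"]
```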

Step 3: Setting up the Embedding Model and Vector Store

Embedding model

Setting item         Value
Embeddings model     Amazon Titan Text Embeddings V2
Vector dimensions    512

Note: Since the default dimension for Titan v2 is 1024, be sure to explicitly select 512. If it doesn't match the Vector Index dimension (512), you'll get a "Query vector contains invalid values or is invalid for this index" error.
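The Knowledge Base calls the embedding model for you, but it can help to see where the dimension is specified when Titan Text Embeddings V2 is invoked directly. The sketch below builds the request body with an explicit dimensions value and refuses anything that does not match the index. The helper names and the INDEX_DIMENSION constant are hypothetical.

```python
import json

INDEX_DIMENSION = 512                    # dimension of cm-rag-index
TITAN_V2_DIMENSIONS = (256, 512, 1024)   # output sizes Titan Text Embeddings V2 supports


def build_embedding_body(text: str, dimensions: int = INDEX_DIMENSION) -> str:
    """Build the invoke_model request body, rejecting dimensions that would not fit the index."""
    if dimensions not in TITAN_V2_DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimensions}")
    if dimensions != INDEX_DIMENSION:
        raise ValueError(f"index expects {INDEX_DIMENSION} dimensions, got {dimensions}")
    return json.dumps({"inputText": text, "dimensions": dimensions, "normalize": True})


def embed_text(text: str, region: str = "ap-northeast-1") -> list[float]:
    """Return the embedding vector for the given text."""
    import boto3  # imported lazily so the body builder works without AWS access

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=build_embedding_body(text),
    )
    return json.loads(response["body"].read())["embedding"]
```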

Vector database

Setting item                    Value
Vector store creation method    Create from existing vector store
Vector store                    S3 Vectors
S3 Vector bucket                Select cm-rag-vector-bucket
S3 Vector index                 Select cm-rag-index
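For readers who script their setup, the embedding model and vector store settings above roughly map to a bedrock-agent create_knowledge_base request like the following. This is a sketch under assumptions: the S3_VECTORS storage type and the s3VectorsConfiguration field names are what I understand the API to use, so check them against the current boto3 documentation before relying on them. build_kb_request is a hypothetical helper.

```python
def build_kb_request(account_id: str, role_arn: str, region: str = "ap-northeast-1") -> dict:
    """Keyword arguments for bedrock-agent create_knowledge_base (field names are assumptions)."""
    index_arn = (
        f"arn:aws:s3vectors:{region}:{account_id}:"
        "bucket/cm-rag-vector-bucket/index/cm-rag-index"
    )
    return {
        "name": "cm-japanese-rag-kb",
        "description": "Japanese text RAG with S3 Vectors",
        "roleArn": role_arn,
        "knowledgeBaseConfiguration": {
            "type": "VECTOR",
            "vectorKnowledgeBaseConfiguration": {
                "embeddingModelArn": (
                    f"arn:aws:bedrock:{region}::foundation-model/amazon.titan-embed-text-v2:0"
                ),
                "embeddingModelConfiguration": {
                    # Explicit 512 so the embeddings match the index dimension
                    "bedrockEmbeddingModelConfiguration": {"dimensions": 512}
                },
            },
        },
        "storageConfiguration": {
            "type": "S3_VECTORS",
            "s3VectorsConfiguration": {"indexArn": index_arn},
        },
    }
```

The resulting dictionary would be passed as boto3.client("bedrock-agent").create_knowledge_base(**request).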

Step 4: Review and Create

Review your settings and click Create knowledge base.

Syncing the Data Source (Ingestion)

  1. Select cm-japanese-rag-kb-s3-source in the Data Sources section
  2. Click Sync
  3. Wait until the status becomes Available
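The same sync can be triggered from code, which is handy when documents are updated regularly. The sketch below starts an ingestion job and polls until it finishes. sync_data_source is a hypothetical helper; start_ingestion_job and get_ingestion_job are bedrock-agent client methods. Note that the API reports the job status (for example COMPLETE or FAILED) rather than the Available label shown in the console.

```python
import time

TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def sync_data_source(client, knowledge_base_id: str, data_source_id: str,
                     poll_seconds: float = 10.0, sleep=time.sleep) -> str:
    """Start an ingestion job and poll until it reaches a terminal status."""
    job = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )["ingestionJob"]
    while job["status"] not in TERMINAL_STATUSES:
        sleep(poll_seconds)  # injectable so the loop can be exercised without waiting
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job["status"]
```

In real use, pass boto3.client("bedrock-agent", region_name="ap-northeast-1") as client, along with the IDs of your knowledge base and data source.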

Verifying Amazon Bedrock Knowledge Base

Select the knowledge base (cm-japanese-rag-kb), open Test knowledge base, specify a model, enter a prompt, and confirm that a response is returned.
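A similar check can be done outside the console with the bedrock-agent-runtime retrieve_and_generate API. This is a minimal sketch: ask_knowledge_base is a hypothetical helper, and the knowledge base ID and model ARN are values you supply yourself.

```python
def ask_knowledge_base(client, knowledge_base_id: str, model_arn: str, question: str) -> str:
    """Retrieve from the knowledge base and return the generated answer text."""
    response = client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,  # model (or inference profile) used for generation
            },
        },
    )
    return response["output"]["text"]
```

In real use, pass boto3.client("bedrock-agent-runtime", region_name="ap-northeast-1") as client.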

Now that we've created the RAG with S3 Vectors, let's incorporate it into GenU.

Creating and Deploying Amazon Bedrock AgentCore

Create and deploy an Amazon Bedrock AgentCore that calls the Amazon Bedrock Knowledge Base.

Preparing the AgentCore Program

Here, we'll create a Strands Agents agent that calls the Knowledge Base created earlier. The agent retrieves information about company regulations.

Create agent.py in the agentcore_s3vectors directory. Replace the KNOWLEDGE_BASE_ID placeholder with the ID of the Knowledge Base you created earlier, or set it through the KNOWLEDGE_BASE_ID environment variable.

Note: A full explanation of Strands Agents and AgentCore is omitted here because it would make this article too long.

agentcore_s3vectors/agent.py

import os
import boto3
from botocore.config import Config
from strands import Agent, tool
from strands.models import BedrockModel
from bedrock_agentcore.runtime import BedrockAgentCoreApp

# Knowledge Base settings
KNOWLEDGE_BASE_ID = os.getenv("KNOWLEDGE_BASE_ID", "<Enter your KNOWLEDGE_BASE_ID here>")
KNOWLEDGE_BASE_REGION = os.getenv("KNOWLEDGE_BASE_REGION", "ap-northeast-1")

model_id = os.getenv("BEDROCK_MODEL_ID", "global.anthropic.claude-haiku-4-5-20251001-v1:0")
model = BedrockModel(
    model_id=model_id,
    max_tokens=4096,
    temperature=0.1,
    region_name="ap-northeast-1"
)

app = BedrockAgentCoreApp()

@tool
def get_rag(query: str, number_of_results: int = 5) -> str:
    """
    Search for information about company regulations, work rules, benefits, procedure guidelines.
    Always call this function when employees ask about "vacation system", "expense reimbursement", "service", 
    "childcare/nursing care leave", "condolence money", "wages", "travel expenses", "condolence leave", etc.,
    regarding internal rules or public systems.
    Provide accurate company regulations as answers based on the retrieved information.

    Args:
        query: Search query (user's question or content to search for)
        number_of_results: Maximum number of results to retrieve (default: 5)

    Returns:
        Text of search results
    """
    client = boto3.client(
        "bedrock-agent-runtime",
        region_name=KNOWLEDGE_BASE_REGION,
        config=Config(retries={"mode": "standard", "total_max_attempts": 3})
    )

    try:
        response = client.retrieve(
            knowledgeBaseId=KNOWLEDGE_BASE_ID,
            retrievalQuery={'text': query},
            retrievalConfiguration={
                'vectorSearchConfiguration': {
                    'numberOfResults': number_of_results
                }
            }
        )

        results = response.get('retrievalResults', [])
        if not results:
            return "No relevant information found."

        # Format results
        output_lines = []
        for i, result in enumerate(results, 1):
            text = result.get('content', {}).get('text', '')
            score = result.get('score', 0.0)
            source = result.get('location', {}).get('s3Location', {}).get('uri', 'Unknown')
            output_lines.append(f"--- Result {i} (Score: {score:.2f}) ---")
            output_lines.append(f"Source: {source}")
            output_lines.append(text)
            output_lines.append("")

        return "\n".join(output_lines)

    except Exception as e:
        return f"Failed to search Knowledge Base: {str(e)}"

@app.entrypoint
async def entrypoint(payload):
    # Build the agent for this invocation and stream its output back to the caller
    agent = Agent(model=model, tools=[get_rag])
    prompt = payload.get("prompt", "")
    # Non-streaming alternative: return {"result": agent(prompt).message}
    async for event in agent.stream_async(prompt):
        # Forward only the items that carry an "event" key
        if "event" in event:
            yield event

if __name__ == "__main__":
    app.run()

Installing Dependent Modules

Create requirements.txt in the agentcore_s3vectors directory.

agentcore_s3vectors/requirements.txt

strands-agents
strands-agents-tools
bedrock-agentcore
bedrock-agentcore-starter-toolkit

Install the required modules.

pip install -r requirements.txt

Agent Deployment Configuration

Run agentcore configure --entrypoint agent.py to create a configuration file (agentcore_s3vectors/bedrock_agentcore/agent/.bedrock_agentcore.yaml). For this walkthrough, I accepted the default at every prompt.

agentcore_s3vectors % agentcore configure --entrypoint agent.py
Configuring Bedrock AgentCore...
 Using file: agent.py

🏷️  Inferred agent name: agent
Press Enter to use this name, or type a different one (alphanumeric without '-')
Agent name [agent]:
 Using agent name: agent

🔍 Detected dependency file: requirements.txt
Press Enter to use this file, or type a different path (use Tab for autocomplete):
Path or Press Enter to use detected dependency file: requirements.txt
 Using requirements file: requirements.txt

🚀 Deployment Configuration
Select deployment type:
  1. Direct Code Deploy (recommended) - Python only, no Docker required
  2. Container - For custom runtimes or complex dependencies
Choice [1]: 1

Select Python runtime version:
  1. PYTHON_3_10
  2. PYTHON_3_11
  3. PYTHON_3_12
  4. PYTHON_3_13
Note: Current Python 3.14 not supported, using python3.11
Choice [2]: 2
 Deployment type: Direct Code Deploy (python.3.11)

🔐 Execution Role
Press Enter to auto-create execution role, or provide execution role ARN/name to use existing
Execution role ARN/name (or press Enter to auto-create):
 Will auto-create execution role

🏗️  S3 Bucket
Press Enter to auto-create S3 bucket, or provide S3 URI/path to use existing
S3 URI/path (or press Enter to auto-create):
 Will auto-create S3 bucket

🔐 Authorization Configuration
By default, Bedrock AgentCore uses IAM authorization.
Configure OAuth authorizer instead? (yes/no) [no]:
 Using default IAM authorization

🔒 Request Header Allowlist
Configure which request headers are allowed to pass through to your agent.
Common headers: Authorization, X-Amzn-Bedrock-AgentCore-Runtime-Custom-*
Configure request header allowlist? (yes/no) [no]:
 Using default request header configuration
Configuring BedrockAgentCore agent: agent

Memory Configuration
Tip: Use --disable-memory flag to skip memory entirely

 MemoryManager initialized for region: ap-northeast-1
No existing memory resources found in your account

Options:
 Press Enter to create new memory
 Type 's' to skip memory setup

Your choice:
 Short-term memory will be enabled (default)
 Stores conversations within sessions
 Provides immediate context recall

Optional: Long-term memory
 Extracts user preferences across sessions
 Remembers facts and patterns
 Creates session summaries
 Note: Takes 120-180 seconds to process

Enable long-term memory? (yes/no) [no]:
 Using short-term memory only
Will create new memory with mode: STM_ONLY
Memory configuration: Short-term memory only
Network mode: PUBLIC
Setting 'agent' as default agent

Deploy Agent to AgentCore

Run agentcore launch to deploy the agent to AgentCore.

agentcore_s3vectors % agentcore launch
🚀 Launching Bedrock AgentCore (cloud mode - RECOMMENDED)...
 Deploy Python code directly to runtime
 No Docker required (DEFAULT behavior)
 Production-ready deployment

💡 Deployment options:
 agentcore deploy Cloud (current)
 agentcore deploy --local Local development

Launching with direct_code_deploy deployment for agent 'agent'
Creating memory resource for agent: agent
 MemoryManager initialized for region: ap-northeast-1
 Launching Bedrock AgentCore...Creating new STM-only memory...
 Creating memory resource (this may take 30-180 seconds)...
 Launching Bedrock AgentCore...Created memory: agent_mem-Jh9uGT50Zu
Created memory agent_mem-Jh9uGT50Zu, waiting for ACTIVE status...
Waiting for memory agent_mem-Jh9uGT50Zu to return to ACTIVE state and strategies to reach terminal states...
 Launching Bedrock AgentCore...[23:31:07] Memory: CREATING (10s elapsed)                                                                                                                                            manager.py:1029
 Launching Bedrock AgentCore...[23:31:17] Memory: CREATING (20s elapsed)                                                                                                                                            manager.py:1029
:                                                                                   :                                                                                   :                                                                                   

 Launching Bedrock AgentCore...[23:33:21] Memory: CREATING (144s elapsed)                                                                                                                                           manager.py:1029
 Launching Bedrock AgentCore...[23:33:32] Memory: CREATING (155s elapsed)                                                                                                                                           manager.py:1029
 Launching Bedrock AgentCore...Memory agent_mem-Jh9uGT50Zu is ACTIVE and all strategies are in terminal states (took 160 seconds)
[23:33:37]    ✅ Memory is ACTIVE (took 160s)                                                                                                                                              manager.py:1043
 Launching Bedrock AgentCore...ObservabilityDeliveryManager initialized for region: ap-northeast-1, account: 123456789012
 Launching Bedrock AgentCore...Created log group: /aws/vendedlogs/bedrock-agentcore/memory/APPLICATION_LOGS/agent_mem-Jh9uGT50Zu
 Launching Bedrock AgentCore...✅ Logs delivery enabled for memory/agent_mem-Jh9uGT50Zu
 Launching Bedrock AgentCore...Failed to enable observability for memory/agent_mem-Jh9uGT50Zu: ValidationException - X-Ray Delivery Destination is supported with CloudWatch Logs as a Trace Segment Destination. Please enable the CloudWatch Logs destination for your traces using the UpdateTraceSegmentDestination API (https://docs.aws.amazon.com/xray/latest/api/API_UpdateTraceSegmentDestination.html)
⚠️ Failed to enable observability: ValidationException: X-Ray Delivery Destination is supported with CloudWatch Logs as a Trace Segment Destination. Please enable the CloudWatch Logs destination for your
traces using the UpdateTraceSegmentDestination API (https://docs.aws.amazon.com/xray/latest/api/API_UpdateTraceSegmentDestination.html)
Memory created and active: agent_mem-Jh9uGT50Zu
Ensuring execution role...
Getting or creating execution role for agent: agent
Using AWS region: ap-northeast-1, account ID: 123456789012
Role name: AmazonBedrockAgentCoreSDKRuntime-ap-northeast-1-d4f0bc5a29
 Launching Bedrock AgentCore...Role doesn't exist, creating new execution role: AmazonBedrockAgentCoreSDKRuntime-ap-northeast-1-d4f0bc5a29
Starting execution role creation process for agent: agent
✓ Role creating: AmazonBedrockAgentCoreSDKRuntime-ap-northeast-1-d4f0bc5a29
Creating IAM role: AmazonBedrockAgentCoreSDKRuntime-ap-northeast-1-d4f0bc5a29
⠹ Launching Bedrock AgentCore...✓ Role created: arn:aws:iam::123456789012:role/AmazonBedrockAgentCoreSDKRuntime-ap-northeast-1-d4f0bc5a29
⠴ Launching Bedrock AgentCore...✓ Execution policy attached: BedrockAgentCoreRuntimeExecutionPolicy-agent
Role creation complete and ready for use with Bedrock AgentCore
Execution role available: arn:aws:iam::123456789012:role/AmazonBedrockAgentCoreSDKRuntime-ap-northeast-1-d4f0bc5a29
Using entrypoint: agent.py (relative to /Users/ishikawa.satoru/workspaces/cc/rd/generative-ai-use-cases/agentcore_s3vectors)
Creating deployment package...
📦 No cached dependencies found, will build
Building dependencies (this may take a minute)...
Building dependencies for Linux ARM64 Runtime (manylinux2014_aarch64)
Installing dependencies with uv for aarch64-manylinux2014 (cross-compiling for Linux ARM64)...
⠴ Launching Bedrock AgentCore...✓ Dependencies installed with uv
Creating dependencies.zip...
⠏ Launching Bedrock AgentCore...✓ Dependencies cached
Packaging source code...
⠋ Launching Bedrock AgentCore...Creating deployment package...
⠋ Launching Bedrock AgentCore...✓ Deployment package ready: 44.26 MB
Getting or creating S3 bucket for agent: agent
Bucket doesn't exist, creating new S3 bucket: bedrock-agentcore-codebuild-sources-123456789012-ap-northeast-1
 Created S3 bucket: bedrock-agentcore-codebuild-sources-123456789012-ap-northeast-1
 Launching Bedrock AgentCore...S3 bucket available: s3://bedrock-agentcore-codebuild-sources-123456789012-ap-northeast-1
Uploading deployment package to S3...
Uploading to s3://bedrock-agentcore-codebuild-sources-123456789012-ap-northeast-1/agent/deployment.zip...
 Launching Bedrock AgentCore...✓ Deployment package uploaded: s3://bedrock-agentcore-codebuild-sources-123456789012-ap-northeast-1/agent/deployment.zip
Deploying to Bedrock AgentCore Runtime...
 Launching Bedrock AgentCore...✅ Agent created/updated: arn:aws:bedrock-agentcore:ap-northeast-1:123456789012:runtime/agent-s921Yn5WVW
Waiting for agent endpoint to be ready...
 Launching Bedrock AgentCore...Enabling observability...
 Launching Bedrock AgentCore...Created/updated CloudWatch Logs resource policy
 Launching Bedrock AgentCore...Configured X-Ray trace segment destination to CloudWatch Logs
X-Ray indexing rule already configured
Transaction Search configured: resource_policy, trace_destination
🔍 GenAI Observability Dashboard: https://console.aws.amazon.com/cloudwatch/home?region=ap-northeast-1#gen-ai-observability/agent-core
 Deployment completed successfully - Agent: arn:aws:bedrock-agentcore:ap-northeast-1:123456789012:runtime/agent-s921Yn5WVW
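Before wiring the runtime into GenU, it may be worth a quick smoke test from Python. The sketch below is based on my understanding of the boto3 bedrock-agentcore client's invoke_agent_runtime call, so treat the parameter names as assumptions and check them against the current SDK. The helper names are hypothetical; the "prompt" key matches what agent.py reads from the payload.

```python
import json
import uuid


def build_invoke_request(agent_runtime_arn: str, prompt: str) -> dict:
    """Keyword arguments for invoke_agent_runtime (parameter names are assumptions)."""
    return {
        "agentRuntimeArn": agent_runtime_arn,
        # A UUID string is 36 characters, which should satisfy the minimum session ID length
        "runtimeSessionId": str(uuid.uuid4()),
        "payload": json.dumps({"prompt": prompt}).encode("utf-8"),
    }


def invoke_agent(agent_runtime_arn: str, prompt: str, region: str = "ap-northeast-1") -> str:
    """Invoke the deployed runtime and return the raw response body as text."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("bedrock-agentcore", region_name=region)
    response = client.invoke_agent_runtime(**build_invoke_request(agent_runtime_arn, prompt))
    # The agent streams events, so the body is the raw event stream rather than plain text
    return response["response"].read().decode("utf-8")
```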

Call Amazon Bedrock AgentCore from GenU

Edit parameter.ts

Edit generative-ai-use-cases/packages/cdk/parameter.ts and add (or override) agentCoreRegion and agentCoreExternalRuntimes in the envs section so that GenU can call Amazon Bedrock AgentCore. Replace the AgentCore ARN with the ARN shown on the last line of the deployment output.

✅ Deployment completed successfully - Agent: arn:aws:bedrock-agentcore:ap-northeast-1:123456789012:runtime/agent-s921Yn5WVW

  '': {
    selfSignUpEnabled: false,
    agentCoreRegion: 'ap-northeast-1',
    agentCoreExternalRuntimes: [
      {
        name: 'Company Policies (S3 Vectors RAG)',
        arn: '<Replace with your AgentCore ARN>',
        description: 'S3 Vectors for RAG',
      },
    ],
  },

After configuration, deploy again.

npm run cdk:deploy

Testing

[AgentCore Experimental] has been added to the left navigation. Clicking on it will show "Company Policies (S3 Vectors RAG)".

Even for a prompt like "I got married. Do I get any money?", which is hard to match with keywords alone, semantic search retrieves the relevant regulations and the agent answers based on company information.

Conclusion

In this article, we introduced how to build a cost-effective RAG with a serverless configuration by combining GenU v5.4 and Amazon S3 Vectors.

Compared with conventional RAG back ends such as Amazon Kendra or Amazon OpenSearch Serverless, S3 Vectors provides vector search in a simple serverless configuration, which can significantly reduce running costs. In addition, Amazon Bedrock AgentCore lets GenU call Bedrock Knowledge Base seamlessly, which is very practical.

The fact that it can provide appropriate answers to queries like "I got married. Do I get any money?" that are difficult to match by keywords demonstrates the power of RAG through semantic search. I hope you can utilize this for use cases directly related to daily business operations such as company policies and FAQs.
