[Resource Release] We conducted a hands-on workshop for building AI agents with Amazon Bedrock AgentCore Managed Harness!

This article publishes the hands-on procedure for Amazon Bedrock AgentCore's Managed Harness. Working mainly in the AWS console, you build an AI agent and see how it behaves when combining the Browser, Code Interpreter, and RAG tools.
2026.05.26

Introduction

Hello, I'm Kamino from the consulting department, and I love Amazon Bedrock AgentCore.

The other day, I hosted a hands-on event on Amazon Bedrock AgentCore Managed Harness. Participants created an AI agent using only the AWS console, without writing any code. The format is simple: start from a plain chatbot and grow it into an agent by adding tools one step at a time.

Since I had the opportunity, I'd like to publish the hands-on instructions so anyone can try them!

The slides used in the lecture portion are published on SpeakerDeck. They summarize the overall picture of AgentCore and where Managed Harness fits, so skimming them before the hands-on will make things go more smoothly. They are essentially a director's cut of earlier material, so a light read is enough.

Managed Harness

Before getting into the hands-on steps, let me briefly introduce Managed Harness.

Managed Harness gives you a working AI agent simply by configuring the model, connected tools, and system prompt in the console, with no code to implement.

The biggest appeal is that you can quickly create it on the console and quickly test it from the playground screen!

Internally, the service wraps Strands Agents and AgentCore Runtime. You don't write the code yourself, but the resulting structure is nearly the same as if you had.

As configuration items, you can freely combine system prompt / model / Memory / Gateway / Browser / Code Interpreter / Skill / Remote MCP Server.

Hands-on Structure

In this hands-on, we'll give Managed Harness tools such as Browser, Code Interpreter, and RAG tools via Gateway, and experience how the agent autonomously selects the necessary tools.

The hands-on will proceed with the following part breakdown.

Part    Content
Part 1  Create a Harness and interact with it
Part 2  Change behavior with prompts
Part 3  Retrieve information from the Web (Browser)
Part 4  Write code and perform calculations (Code Interpreter)
Part 5  Search internal documents (Gateway / RAG)
Part 6  Check the behind-the-scenes activity (Observability)
Bonus   Evaluate the agent (Evaluations)

Prerequisites

Item               Content
Time required      Approximately 60 minutes (excluding preparation)
Region             us-east-1 (US East - N. Virginia)
Required           AWS account (Administrator permissions recommended)
Local environment  Node.js 24, Docker (required for Gateway deployment in preparation)
CDK                CDK Bootstrap completed (npx cdk bootstrap)

Parts 1–4 can be tried immediately as long as you have an AWS account. To experience Gateway / RAG in Part 5, you need to deploy Gateway and Knowledge Base with CDK in advance.
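Before starting the preparation, it can help to confirm that the local tools listed in the prerequisites are installed. The following is a minimal sketch and not part of the published kit: `check_prereqs` is a hypothetical helper name, and the set of commands to check is an assumption based on the prerequisites above.

```shell
# Hypothetical helper: report whether each given command is available.
# Returns 0 when everything is found, 1 when at least one command is missing.
check_prereqs() {
  prereq_missing=0
  for prereq_cmd in "$@"; do
    if command -v "$prereq_cmd" >/dev/null 2>&1; then
      echo "ok: $prereq_cmd"
    else
      echo "missing: $prereq_cmd"
      prereq_missing=1
    fi
  done
  return "$prereq_missing"
}

# Example usage (not run automatically):
#   check_prereqs node docker aws npx
#   node --version   # the hands-on assumes Node.js 24
#   docker info      # fails if the Docker daemon is not running
#   npx cdk bootstrap "aws://$(aws sts get-caller-identity --query Account --output text)/us-east-1"
```

The helper only checks that the commands exist. Version checks and the bootstrap itself are left as manual steps in the comments.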

Preparation

Building Gateway and Knowledge Base (CDK)

Deploy the Gateway and Knowledge Base used in Part 5 with CDK. The CDK project is published on GitHub. For the hands-on event, I set these up in the account in advance, so participants did not need to do this step. If you are running this for a group, it is probably best to have an administrator or someone familiar with CDK do it beforehand.

https://github.com/yuu551/agentcore-gateway-kit

Deploying this kit makes the following 3 tools available via Gateway.

Target         Tool Name                   Description                          Implementation
kb-retrieve    retrieve_documents          Document search from Knowledge Base  Lambda
web-tools      fetch_webpage               Text retrieval from web pages        FastMCP on Runtime
aws-knowledge  search_documentation, etc.  AWS official documentation search    AWS Hosted MCP Server

CDK Deploy

Deploy command
# Clone the repository
git clone https://github.com/yuu551/agentcore-gateway-kit.git
cd agentcore-gateway-kit

# Install dependencies
npm install

# Check template
npx cdk synth

# Deploy (2 stacks: KnowledgeBase + Gateway)
npx cdk deploy --all

The deployment takes a few minutes. When complete, the CDK Output will display KnowledgeBaseId / DataSourceId / DataSourceBucketName. Please note these down as they will be used in later steps.
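If the terminal output has scrolled away, the same values can usually be read back from CloudFormation. This is a sketch under assumptions: `show_stack_outputs` is a hypothetical helper name, and the stack names are not stated in this article, so look them up first (for example with `npx cdk list`).

```shell
# Hypothetical helper: print OutputKey / OutputValue pairs for one stack.
# Argument: <StackName> (an assumption; use the names your deployment created)
show_stack_outputs() {
  aws cloudformation describe-stacks \
    --region us-east-1 \
    --stack-name "$1" \
    --query 'Stacks[0].Outputs[].[OutputKey,OutputValue]' \
    --output text
}

# Example usage (not run automatically):
#   show_stack_outputs <StackName>
```

Run it once per stack and copy KnowledgeBaseId / DataSourceId / DataSourceBucketName from the output.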

Knowledge Base Sync

Immediately after deployment, documents are not yet synced, so run the sync.

Sync command
aws bedrock-agent start-ingestion-job \
  --knowledge-base-id <KnowledgeBaseId> \
  --data-source-id <DataSourceId>

Sample data in the form of an AgentCore overview document is placed in the S3 bucket, so once the sync is complete you can immediately try RAG search.
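The sync runs asynchronously, so it is worth confirming that the job has finished before moving on. One way to do that is to poll the job status. This is a minimal sketch: `wait_for_ingestion` and the `POLL_SECONDS` variable are hypothetical names introduced here, and the job ID is assumed to come from the output of `start-ingestion-job`.

```shell
# Hypothetical helper: poll an ingestion job until it reaches a terminal state.
# Arguments: <KnowledgeBaseId> <DataSourceId> <IngestionJobId>
# POLL_SECONDS (assumed knob for this sketch) sets the wait between polls.
wait_for_ingestion() {
  while :; do
    ingestion_status=$(aws bedrock-agent get-ingestion-job \
      --knowledge-base-id "$1" \
      --data-source-id "$2" \
      --ingestion-job-id "$3" \
      --query 'ingestionJob.status' --output text) || return 1
    echo "status: $ingestion_status"
    case "$ingestion_status" in
      COMPLETE) return 0 ;;
      FAILED|STOPPED) return 1 ;;
    esac
    sleep "${POLL_SECONDS:-10}"
  done
}

# Example usage (not run automatically):
#   job_id=$(aws bedrock-agent start-ingestion-job \
#     --knowledge-base-id <KnowledgeBaseId> \
#     --data-source-id <DataSourceId> \
#     --query 'ingestionJob.ingestionJobId' --output text)
#   wait_for_ingestion <KnowledgeBaseId> <DataSourceId> "$job_id"
```

The loop has no overall timeout, so interrupt it manually if a job appears stuck.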

If you want to add your own documents, simply upload them to the S3 bucket and re-run the sync.

Add documents
aws s3 cp my-document.txt s3://<DataSourceBucketName>/

aws bedrock-agent start-ingestion-job \
  --knowledge-base-id <KnowledgeBaseId> \
  --data-source-id <DataSourceId>

If you want to know more about building the Gateway CDK, please also refer to the article below.

https://dev.classmethod.jp/articles/agentcore-gateway-cdk-managed-harness/

Preparing the AWS Account

  1. Access https://console.aws.amazon.com/ and sign in
  2. Change the region in the upper right of the screen to "US East (N. Virginia)" (us-east-1)

  3. Enter "AgentCore" in the search bar and click "Amazon Bedrock AgentCore"

That completes the preparation! Let's move into the hands-on from here.

Part 1: Create a Harness and Interact with It

Create an AI agent in just a few clicks and have your first interaction.

1-1. Create a Harness

  1. Click "Harness Preview" under "Build" in the left menu of the Bedrock console
  2. Click the "Quick create harness" button in the upper right of the screen

  3. The harness is generated automatically in about 30 seconds, and you are redirected to the playground screen

If the playground screen with a chat input field is displayed, you've succeeded!

1-2. Try Changing the Model

Let's experience how easily the model can be changed.

  1. Turn ON the "Settings" toggle in the upper right of the playground screen
  2. The settings panel will appear on the right
  3. Click the pencil icon next to the model name in the "Model" section
  4. The model selection dialog will open

You can choose from models by providers such as Anthropic, Amazon, Google, and Meta. This time, select Claude Haiku 4.5 and click "Apply."

Switching model providers between Bedrock / OpenAI / Gemini with a single click is another feature of Managed Harness.

1-3. First Interaction

Type the following in the chat input field and send it.

Hello! What can you do?

You're good if an introductory-style response comes back!

1-4. Try an "Agent-like" Instruction

Next, try sending this.

Please research the latest information on Amazon Bedrock AgentCore and summarize the main features in a table.

How did it go? The agent should only be able to answer from the model's own knowledge and should be unable to retrieve the latest information.

Since we haven't configured any tools for retrieving information yet, this is to be expected.
Let's grow it into an agent that retrieves information from the appropriate sources from here!

Part 2: Change Behavior with Prompts

Experience how the agent's behavior changes by modifying the system prompt.

2-1. Change the System Prompt

  1. Find the "System Prompt" section in the settings panel
  2. Delete the existing content and copy & paste the following
You are an AWS technical consultant named "TechNavi."

## Role
- You provide advice on AWS service selection and architecture design
- You always respond in Japanese, politely and specifically
- You include accurate AWS service names in your responses

## Response Style
- State the conclusion first, then explain the details
- When there are multiple options, create a comparison table
- Include consideration of cost aspects
  3. Once entered, the change takes effect from the next interaction

What you're changing here is a temporary setting on the playground. It won't be saved as the Harness's actual settings, so if you want to persist it, you need to change and save it from the Harness edit screen.

2-2. Feel the Change

Try sending the following.

I want to create a web application. Which services should I use?

How did it change compared to the general response in Part 1? It should answer professionally as "TechNavi," structured from conclusion to details, with a comparison table and mention of cost aspects.

With just one system prompt, the agent's behavior changes!

2-3. Reset the System Prompt

Since Parts 3 and beyond will focus on the tool integration experience, let's delete the contents of the "System Prompt" field and leave it empty.

Part 3: Retrieve Information from the Web — Browser

Add the Browser tool and experience how the agent operates the web in real time.

3-1. Add the Browser Tool

  1. Click the "Add tool" button in the "Tools" section of the settings panel
  2. Select the browser tool under "Select tool type"
  3. Select "AgentCore Browser Tool"
  4. Click "Add"

You're good if "aws.browser.v1" appears in the tools section of the settings panel!

3-2. Have It Look Up the Latest Information on the Web

Please send the following. This time we're specifying a URL directly.

Please access the following URL and summarize the main features and use cases of Amazon Bedrock AgentCore in a table.
https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/what-is-bedrock-agentcore.html

The agent will autonomously open the URL in the browser, read the page content, and organize it. Please confirm that a status like "Opening browser" is displayed.

The information that couldn't be answered in Part 1 is now retrieved from the web via the browser tool and answered!

About the Use Cases for the Browser Tool

There's a reason we specified the URL directly this time. The Browser tool is not well-suited for use cases like "searching a search engine to gather information." Anti-bot measures on the search engine side may trigger CAPTCHAs, and it takes extra time and tokens.

The troubleshooting page in the official documentation also contains the following note:

Structure your agent to avoid search engines and implement the following architecture pattern:

  • Use the Browser tool only for specific page actions, not general web searching
  • Use non-browser MCP tools like Tavily search for general web search operations

https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser-tool-troubleshooting.html

It's more efficient to use the Browser tool for cases where you access a specific URL and read the page content, and to use MCP tools like the AWS Knowledge MCP Server or Tavily Search introduced in Part 5 for information searches.

3-3. Watch the Browser's Activity in Real Time

You can observe in real time as the agent browses the web.

  1. While the task from step 3-2 is running (or after sending a new question), open another browser tab
  2. Bedrock console → Click "Built-in tools" under "Build" in the left menu
  3. Click "Browser"

  4. In the "Browser sessions" section, find the session with a status of "Ready"
  5. Click the "View live session" link

A new window will open, showing in real time how the agent is browsing web pages!

It's fascinating to be able to watch an AI operating a browser right before your eyes. It makes judgments each time and takes appropriate actions.

Part 4: Write Code and Perform Calculations — Code Interpreter

Add the Code Interpreter tool and experience data analysis by executing code.

4-1. Add the Code Interpreter Tool

  1. Click the "Add tool" button in the "Tools" section of the settings panel
  2. Select the code interpreter under "Select tool type"
  3. Select "AgentCore Code Interpreter Tool"
  4. Click "Add"

On first launch, there may be a cold start that takes about 30 seconds. Don't panic, just wait.

4-2. Have It Calculate an AWS Cost Comparison

Please send the following.

Assuming an API with 10 million requests per month, please compare the costs of Lambda + API Gateway vs ECS Fargate. Calculate the monthly cost for each and tell me which is cheaper.

The agent automatically generates and executes Python code and outputs the cost calculation results as a comparison table.
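To get a feel for the kind of arithmetic involved, here is a rough sketch of such a comparison. It is not the code the agent generates (the agent writes Python; this sketch uses awk from the shell), and every unit price, the Fargate task size, and the helper name `monthly_costs` are illustrative assumptions that should be checked against the current AWS pricing pages. Free tiers, data transfer, and a load balancer for Fargate are ignored.

```shell
# Rough monthly cost comparison for an API with a fixed request volume.
# Arguments: <requests per month> <avg Lambda duration in seconds> <Lambda memory in GB>
# All prices are illustrative assumptions (USD, us-east-1), not authoritative values.
monthly_costs() {
  awk -v req="$1" -v dur_s="$2" -v mem_gb="$3" 'BEGIN {
    lambda_per_million = 0.20          # assumed Lambda request price
    lambda_per_gb_s    = 0.0000166667  # assumed Lambda compute price per GB-second
    apigw_per_million  = 1.00          # assumed API Gateway HTTP API price
    fargate_vcpu_hour  = 0.04048       # assumed Fargate vCPU price per hour
    fargate_gb_hour    = 0.004445      # assumed Fargate memory price per GB-hour
    hours              = 730           # one task running for the whole month
    task_vcpu          = 0.25          # assumed task size
    task_gb            = 0.5

    millions      = req / 1000000
    requests_cost = millions * (lambda_per_million + apigw_per_million)
    compute_cost  = req * dur_s * mem_gb * lambda_per_gb_s
    serverless    = requests_cost + compute_cost
    fargate       = hours * (task_vcpu * fargate_vcpu_hour + task_gb * fargate_gb_hour)

    printf "lambda_apigw %.2f\n", serverless
    printf "fargate %.2f\n", fargate
  }'
}

# Example usage (not run automatically):
#   monthly_costs 10000000 0.1 0.5
```

Under these assumed prices, 10 million requests at 100 ms and 512 MB come out to about 20.33 USD for Lambda + API Gateway versus about 9.01 USD for a single small Fargate task. The agent's own answer will differ depending on the assumptions it picks, which is a good reason to read the generated code in the logs.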

Behind the scenes, the Code Interpreter is running code multiple times to perform the calculations. If you open the tool invocation logs, you can see the agent working hard on the calculations!

Expanding the logs, you can also check the actual Python code that was executed. Code Interpreter is a service that safely executes code in a sandbox completely isolated from the agent's environment.

Note that files generated by Code Interpreter (such as CSVs or graph images) cannot be downloaded directly from the playground. If you need file output in earnest, consider a tool that writes to a specified S3 bucket, or implement that part separately in code.

4-3. Experience Automatic Tool Selection

Next, send this.

Please check the following AWS official page and find out if there are ways to further reduce the Lambda costs from before. Also estimate the savings.
https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html

Observe how the agent automatically switches between Code Interpreter (calculations) and Browser (web browsing) as appropriate. No human specifies which tool to use; the agent decides on its own.

Up to this point, we've tried the two tools: Browser and Code Interpreter!

What's interesting is that the agent decides on its own when to use which tool. Even without individually instructing it to go look at a web page or do a calculation, it combines the appropriate tools for a single question to build the answer.

Part 5: Search Internal Documents — Gateway / RAG

Experience an agent that connects to Knowledge Base via Gateway and can reference internal documents.

Connecting a Gateway to the Harness is a single addition that gives you access to multiple tools at once (KB search, AWS documentation search, web retrieval).

5-1. Configure Gateway in Harness

Gateway is configured not from the playground's settings panel, but from the Harness edit screen. Since changes here update the Harness's actual settings, you need to click "Save" at the end.

  1. Click "Harness Preview" under "Build" in the left menu to return to the harness list
  2. Select the harness created this time and click "Edit"
  3. Check "Gateway" in the "Tools & Options" section
  4. Select rag-gateway-kit under "Please select a gateway"
  5. Select "IAM Role" for outbound authentication
  6. Click "Save"

After configuring, click "Test harness in playground" to return to the playground.

5-2. Ask Questions to the Knowledge Base

Please send the following.

Using RAG, please search for information about the deployment procedure for AgentCore Runtime.

You're good if a response comes back that retrieves information from the Knowledge Base via Gateway and cites the document content!

If "Kb-Retrieve Retrieve Documents" is displayed in the tool call log, it is answering from the pre-loaded documents rather than from the web.
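If the agent does not return anything useful, it can help to query the Knowledge Base directly and rule out a sync problem. The following is a sketch under assumptions: `kb_retrieve` is a hypothetical helper name, the Knowledge Base ID is the one noted from the CDK outputs, and the query text must not contain double quotes because it is embedded in a small JSON string.

```shell
# Hypothetical helper: run a retrieval query against a Knowledge Base
# and print only the text of each returned chunk.
# Arguments: <KnowledgeBaseId> <query text without double quotes>
kb_retrieve() {
  aws bedrock-agent-runtime retrieve \
    --region us-east-1 \
    --knowledge-base-id "$1" \
    --retrieval-query "{\"text\": \"$2\"}" \
    --query 'retrievalResults[].content.text' \
    --output text
}

# Example usage (not run automatically):
#   kb_retrieve <KnowledgeBaseId> "deployment procedure for AgentCore Runtime"
```

An empty result here suggests the documents have not been ingested yet, while a populated result points to the Gateway or Harness configuration instead.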

5-3. Try Adding Documents (Optional)

The Knowledge Base documents are stored in an S3 bucket (rag-gw-datasource-*). You can also add your own documents and try them out.

  1. Open the rag-gw-datasource-* bucket in the S3 console and upload a text file
  2. Bedrock console → Left menu "Knowledge bases" → Open "RagGatewayKnowledgeBase"
  3. Select the data source and click the "Sync" button
  4. Once sync is complete, go back to the playground and try asking questions about the content of the added document

5-4. Try Other Tools in Gateway

Gateway also includes the AWS Knowledge MCP Server and URL Fetch tools in addition to KB search.

The AWS Knowledge MCP Server is a tool that can search AWS official documentation. Try sending the following.

Please tell me about AgentCore Payments.

The agent searches the official documentation via the AWS Knowledge MCP Server and responds.

Looking at the tool call logs, you can see that Aws-Knowledge Aws Search Documentation and Aws-Knowledge Aws Read Documentation are being used.

Next, let's also try the URL Fetch tool.

Please retrieve the following URL's page and summarize its content.
https://dev.classmethod.jp/articles/bedrock-agentcore-managed-harness-preview/

The fetch_webpage tool within the Gateway will be used to retrieve the content of the specified web page and summarize it.

As you can see, adding just one Gateway conveniently makes multiple tools available at once. Another advantage is that new features are easy to add on the Gateway side, with no changes to the harness itself.

Going back to the discussion of the Browser tool's use cases mentioned in Part 3, searching AWS official documentation is more efficient using the AWS Knowledge MCP Server as we did this time. You can leverage existing assets, the response is faster, and token consumption is lower.

The right mental model is to limit the Browser tool to situations where you access a specific URL and operate on the page, and to choose an appropriate MCP tool for information searches depending on the purpose.

Part 6: Check Behind-the-Scenes Activity — Observability

The main hands-on experience of building an agent is complete by Part 5. Try this if you have time to spare.

There is a "Show observability" button in the upper right of the playground screen.

Clicking it opens a dashboard where you can check the agent's execution traces and resource consumption.

A list of traces per session and metrics for CPU/memory consumption are displayed. Clicking a trace ID lets you check the agent's internal operations in detail.

The span tree on the left shows in what order the agent called which tools. You can also check the latency and token count for each span, and the tool input/output is displayed on the right.

In this example, you can even check the content of the document retrieved from the Knowledge Base. The trajectory at the bottom visualizes the flow of the agent's thinking → tool calls → response in a flowchart.

In actual operation, you would use this observability feature to identify where time is being spent and improve performance, and to spot unintended actions and revise the process as part of continuous improvement. Being able to see the processing visually makes it easier to gain insights.

Bonus: Evaluate the Agent — Evaluations

A mechanism to automatically evaluate the quality of the agent you've created is also available. You can configure it from "Evaluations" under "Evaluate" in the left menu.

Creating an Evaluation Configuration

Clicking "Create evaluation configuration" opens the evaluation settings screen.

Simply by selecting the agent and endpoint, you can automatically evaluate the traces of that agent. If you check "Activate this evaluation configuration after creation," evaluation will begin immediately.

Selecting Evaluators

Multiple built-in evaluators are available. In this hands-on, I selected the following 5.

  • Correctness
    • Checks whether the information in the response is factually accurate. It can be used to verify whether information retrieved from tools is correctly reflected.
  • Faithfulness
    • Checks whether the response is based on the context retrieved from tools. This checks whether the agent is fabricating the content of documents retrieved via RAG.
  • Response relevance
    • Checks whether the response accurately answers the user's question. This confirms that the response isn't off-topic from the question.
  • Coherence
    • Checks whether the response is logically structured and consistent.
  • Goal success rate
    • Checks whether the user's goal was achieved throughout the entire session. Unlike the other four, which evaluate at the individual response (TRACE) level, this evaluates the entire session.

Please select based on your purpose.

Category          Evaluator                  Scope      Evaluation Content
Response quality  Correctness                TRACE      Evaluates whether the information in the response is factually accurate
Response quality  Faithfulness               TRACE      Evaluates whether the response is based on the provided context/sources
Response quality  Response relevance         TRACE      Evaluates whether the response appropriately answers the user's question
Response quality  Coherence                  TRACE      Evaluates whether the response is logically structured and consistent
Response quality  Helpfulness                TRACE      Evaluates from the user's perspective how useful and valuable the response is
Response quality  Conciseness                TRACE      Evaluates whether the response is appropriately concise without missing key information
Response quality  Instruction following      TRACE      Evaluates how well the system prompt instructions are followed
Response quality  Refusal                    TRACE      Detects whether questions are being avoided or answers directly refused
Task completion   Goal success rate          SESSION    Evaluates whether the user's goal was achieved throughout the entire session
Component level   Tool selection accuracy    TOOL_CALL  Evaluates whether the appropriate tool was selected for the task
Component level   Tool parameter accuracy    TOOL_CALL  Evaluates whether tool parameters were accurately extracted from user input
Safety            Harmfulness                TRACE      Detects whether the response contains harmful content
Safety            Stereotyping               TRACE      Detects generalizations/stereotypes about specific groups
Trajectory        TrajectoryExactOrderMatch  SESSION    Evaluates whether the actual tool call order completely matches expectations
Trajectory        TrajectoryInOrderMatch     SESSION    Evaluates whether the expected tools are included in order (other tools in between are OK)
Trajectory        TrajectoryAnyOrderMatch    SESSION    Evaluates whether all expected tools were used regardless of order

Scope represents the granularity of the evaluation. TRACE evaluates an individual response (one exchange), SESSION evaluates the entire session (multiple exchanges), and TOOL_CALL evaluates an individual tool call (tool selection and parameters).

Trajectory-based evaluations allow you to define the expected tool call order and verify whether the agent made the correct tool selections.
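The difference between the three Trajectory evaluators is easier to see with a small example. The sketch below is my own reading of the descriptions in the table, not the service's implementation: the function names are hypothetical, tool lists are passed as single-space-separated strings, and repeated tool names are not counted.

```shell
# Each function takes two space-separated tool-name lists: <expected> <actual>.
# Exit status 0 means the trajectory matches under that rule.

# Exact order: the actual calls are exactly the expected calls.
exact_order_match() {
  [ "$1" = "$2" ]
}

# In order: the expected tools appear in order; other tools in between are OK.
in_order_match() {
  traj_remaining=$1
  for traj_tool in $2; do
    traj_next=${traj_remaining%% *}
    if [ "$traj_tool" = "$traj_next" ]; then
      case "$traj_remaining" in
        *" "*) traj_remaining=${traj_remaining#* } ;;
        *) traj_remaining="" ;;
      esac
    fi
  done
  [ -z "$traj_remaining" ]
}

# Any order: every expected tool was used at some point.
any_order_match() {
  for traj_want in $1; do
    case " $2 " in
      *" $traj_want "*) ;;
      *) return 1 ;;
    esac
  done
  return 0
}

# Example usage (not run automatically):
#   expected="retrieve_documents fetch_webpage"
#   actual="retrieve_documents search_documentation fetch_webpage"
#   exact_order_match "$expected" "$actual" || echo "exact: no match"
#   in_order_match    "$expected" "$actual" && echo "in order: match"
#   any_order_match   "$expected" "$actual" && echo "any order: match"
```

In the usage example, the extra `search_documentation` call breaks the exact match but still satisfies the in-order and any-order rules.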

This online evaluation scores the traces from the hands-on in real time, whereas batch evaluation lets you prepare a dataset and evaluate it in bulk offline. If you want to measure quality systematically against expected values (ground truth data), as the Trajectory evaluators do, consider batch evaluation as well.

https://dev.classmethod.jp/articles/amazon-bedrock-agentcore-evaluations-batch-evaluation/

You can set the percentage of traces to be evaluated with the sampling rate (100% is fine for the hands-on).

Checking Evaluation Results

After you have asked several questions and the evaluation is complete, you can check scores at both the session and trace levels.

Scores for ResponseRelevance / Correctness / Coherence / Faithfulness, etc., are automatically assigned to each trace.

Evaluation results can also be checked from the observability dashboard introduced in Part 6.

The flow is to check this dashboard, then use the observability feature from earlier to check what kind of behavior is occurring, and examine problematic interactions.

In production operation, you run a cycle of improving the system prompt and tool configuration based on these evaluation results. Of course, since the scores are assigned by AI, it's important not to take them at face value and to check by eye any results that feel off.

Cleanup

If you want to delete the resources created in the hands-on, follow these steps.

  1. Bedrock console → Harness Preview → Select the created harness and click "Delete"
  2. Delete the Gateway / Knowledge Base deployed with CDK
Cleanup command
cd agentcore-gateway-kit
npx cdk destroy --all

Closing

Were you able to experience building an AI agent with just Managed Harness?

Starting from a chatbot and gradually adding tools like Browser / Code Interpreter / Gateway, you can clearly see how the agent autonomously selects tools to complete tasks. I hope you also got a sense of how the agent's capabilities continue to expand by connecting external tools and data sources through Gateway.

The Managed Harness we explored today is just the entry point to AgentCore. The following article provides a summary of what the AgentCore service suite as a whole looks like.

https://dev.classmethod.jp/articles/amazon-bedrock-agentcore-2025-summary/

If you want to develop and deploy agents more flexibly, AgentCore CLI is an option available to you. The implementation in code is not that difficult, so please refer to the following article to get started.

https://dev.classmethod.jp/articles/agentcore-cli-inspector-web-ui/

I hope this article has been helpful in some way. Thank you for reading all the way to the end!

