[Resource Release] We conducted a hands-on workshop for building AI agents with Amazon Bedrock AgentCore Managed Harness!

[Resource Release] We conducted a hands-on workshop for building AI agents with Amazon Bedrock AgentCore Managed Harness!

I will publish hands-on instructions using Amazon Bedrock AgentCore's Managed Harness! You can build AI agents centered around the AWS console and experience agent behavior combining Browser, Code Interpreter, and RAG tools.
2026.05.26

This page has been translated by machine translation. View original

Introduction

Hello, I'm Kamino from the consulting department who loves Amazon Bedrock AgentCore.

The other day, I hosted a hands-on event using Amazon Bedrock AgentCore Managed Harness. The content was about creating an AI agent using only the AWS console without writing any code. It's a simple experience where you start from a chatbot and grow an AI agent by adding tools step by step.

Since we're at it, I'd like to publish the hands-on instructions so anyone can try them!

The slide materials used in the lecture part are published on SpeakerDeck. They summarize the overall picture of AgentCore and the positioning of Managed Harness, so it's smooth if you skim through them before starting the hands-on, but since it's also a director's cut of previous materials, it's fine to just take a quick look.

Managed Harness

Before getting into the hands-on instructions, let me briefly introduce Managed Harness.

Managed Harness allows you to behave as an AI agent simply by configuring the model, tools to integrate, and system prompt on the console, without implementing any code on the user side.

01

The big appeal is that you can quickly create it on the console and quickly try it from the playground screen!

Internally, it is a service that wraps Strands Agents and AgentCore Runtime. It's essentially the same structure, just without writing the code yourself.

02

As configuration items, you can freely combine System Prompt / Model / Memory / Gateway / Browser / Code Interpreter / Skill / Remote MCP Server.

03

Hands-on Structure

In this hands-on, we'll give Managed Harness Browser, Code Interpreter, and RAG tools via Gateway, and have you experience how the agent autonomously selects the necessary tools.

04

The hands-on proceeds with the following part breakdown.

Part Content
Part 1 Create a Harness and interact with it
Part 2 Change behavior with prompts
Part 3 Retrieve information from the Web — Browser
Part 4 Write code and have it calculate — Code Interpreter
Part 5 Search internal documents — Gateway / RAG
Part 6 Check what's happening behind the scenes — Observability
Bonus Evaluate the agent — Evaluations

Prerequisites

Item Content
Time Required About 60 minutes (excluding advance preparation)
Region us-east-1 (N. Virginia)
Requirements AWS account (Administrator privileges recommended)
Local Environment Node.js 24, Docker (required for Gateway deployment in advance preparation)
CDK CDK Bootstrap complete (npx cdk bootstrap)

Parts 1–4 can be tried immediately with just an AWS account. To experience the Gateway / RAG in Part 5, you need to deploy the Gateway and Knowledge Base with CDK in advance.

Advance Preparation

Building Gateway and Knowledge Base (CDK)

Deploy the Gateway and Knowledge Base used in Part 5 with CDK. The CDK project is published on GitHub. During the hands-on event, I set it up in the account in advance and attendees did not do this themselves.
It may be best for an administrator or someone familiar with CDK to do this ahead of time.

https://github.com/yuu551/agentcore-gateway-kit

Deploying this kit makes the following 3 tools available via Gateway.

Target Tool Name Description Implementation
kb-retrieve retrieve_documents Document search from Knowledge Base Lambda
web-tools fetch_webpage Web page text retrieval FastMCP on Runtime
aws-knowledge search_documentation etc. AWS official documentation search AWS Hosted MCP Server

CDK Deploy

Deploy command
# Clone repository
git clone https://github.com/yuu551/agentcore-gateway-kit.git
cd agentcore-gateway-kit

# Install dependencies
npm install

# Check template
npx cdk synth

# Deploy (2 stacks: KnowledgeBase + Gateway)
npx cdk deploy --all

Deployment takes a few minutes. When complete, KnowledgeBaseId / DataSourceId / DataSourceBucketName will be displayed in the CDK Output. Please note these down as they will be used in later steps.

Knowledge Base Sync

Right after deployment, documents are not yet synchronized, so run the sync.

Sync command
aws bedrock-agent start-ingestion-job \
  --knowledge-base-id <KnowledgeBaseId> \
  --data-source-id <DataSourceId>

Sample data with an AgentCore overview document is placed in the S3 bucket, so once sync is complete you can immediately try RAG search.

If you want to add your own documents, just upload them to the S3 bucket and re-run the sync.

Add documents
aws s3 cp my-document.txt s3://<DataSourceBucketName>/

aws bedrock-agent start-ingestion-job \
  --knowledge-base-id <KnowledgeBaseId> \
  --data-source-id <DataSourceId>

For those who want to know more about building the Gateway CDK, please also refer to the article below.

https://dev.classmethod.jp/articles/agentcore-gateway-cdk-managed-harness/

AWS Account Preparation

  1. Go to https://console.aws.amazon.com/ and sign in
  2. Change the region in the upper right of the screen to "US East (N. Virginia)" us-east-1

05

  1. Type "AgentCore" in the search bar and click "Amazon Bedrock AgentCore"

06

Advance preparation is now complete! From here we'll get into the hands-on.

Part 1: Create a Harness and Interact with It

Create an AI agent in just a few clicks and have your first interaction.

1-1. Create a Harness

  1. From the left menu of the Bedrock console, click "Harness preview" under "Build"
  2. Click the "Quick create harness" button in the upper right of the screen

07

  1. It will be automatically generated in about 30 seconds and you'll be redirected to the playground screen

Success if the playground screen with a chat input field is displayed!

08

1-2. Try Changing the Model

Let's experience how easily the model can be changed.

  1. Turn the "Settings" toggle ON in the upper right of the playground screen
  2. A settings panel will appear on the right
  3. Click the pencil icon next to the model name in the "Model" section
  4. The model selection dialog will open

09

You can choose from models from various providers such as Anthropic, Amazon, Google, and Meta. This time, select Claude Haiku 4.5 and click "Apply".

Being able to switch between Bedrock / OpenAI / Gemini model providers with one click is also a feature of Managed Harness.

1-3. First Interaction

Type the following in the chat input field and send it.

Hello! What can you do?

It's OK if you get a self-introduction-like response!

10

1-4. Try an "Agent-like" Instruction

Next, try sending this.

Please research the latest information on Amazon Bedrock AgentCore and summarize the key features in a table.

How was it? It should only be able to answer within the model's knowledge and unable to retrieve the latest information.

11

Since we haven't configured any tools to retrieve information yet, this is to be expected.
From here, let's grow it into an agent that can retrieve information from the right places!

Part 2: Change Behavior with Prompts

Experience how the agent's behavior changes by modifying the system prompt.

2-1. Change the System Prompt

  1. Find the "System Prompt" section in the settings panel
  2. Delete the existing content and copy & paste the following
You are an AWS technical consultant named "TechNavi".

## Role
- You provide advice on AWS service selection and architecture design
- Always respond in a polite and specific manner
- Include accurate AWS service names in your responses

## Response Style
- State the conclusion first, then explain the details
- When there are multiple options, create a comparison table
- Include cost considerations
  1. Once entered, it will be reflected in the next interaction

What you're changing here is a temporary setting in the playground. Since it won't be saved as the Harness's actual settings, if you want to persist it, you need to change and save it from the Harness editing screen.

2-2. Experience the Change

Try sending the following.

I want to build a web application. Which services should I use?

How did it change compared to the general response in Part 1? It should respond professionally as "TechNavi", structured in order from conclusion to details, with comparison tables and mentions of cost considerations.

12

With just one system prompt, the agent's behavior changes!

2-3. Revert the System Prompt

Since Parts 3 and beyond focus on tool integration experiences, let's delete the content in the "System Prompt" field and leave it empty.

Part 3: Retrieve Information from the Web — Browser

Add the Browser tool and experience how the agent autonomously operates the web in real time.

3-1. Add the Browser Tool

  1. Click the "Add tool" button in the "Tools" section of the settings panel
  2. Select the browser tool under "Select tool type"
  3. Select "AgentCore Browser Tool"
  4. Click "Add"

13

It's OK if "aws.browser.v1" is displayed in the tools section of the settings panel!

3-2. Have It Look Up the Latest Information on the Web

Send the following. This time we're specifying a URL directly.

Please access the following URL, and summarize Amazon Bedrock AgentCore's key features and use cases in a table.
https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/what-is-bedrock-agentcore.html

The agent autonomously opens the URL in a browser, reads the page content, and organizes it. Please confirm that a status like "Opening browser" is displayed.

14

The information that couldn't be answered in Part 1 is now retrieved from the web via the browser tool and answered!

About the Browser Tool's Use Cases

There's a reason we specified the URL directly this time. The Browser tool is not well-suited for the purpose of "searching with a search engine to gather information". The search engine's anti-bot measures may trigger CAPTCHAs, and it takes extra time and tokens.

The official documentation's troubleshooting also contains the following statement.

Structure your agent to avoid search engines and implement the following architecture pattern:

  • Use the Browser tool only for specific page actions, not general web searching
  • Use non-browser MCP tools like Tavily search for general web search operations

https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/browser-tool-troubleshooting.html

It's more efficient to use the Browser tool for purposes like accessing a specific URL and reading the page content, and to use MCP tools like the AWS Knowledge MCP Server or Tavily Search introduced in Part 5 for information search.

3-3. Watch the Browser Operation in Real Time

You can observe the agent browsing the web in real time.

  1. While the task from step 3-2 is running (or after sending a new question), open another browser tab
  2. Bedrock console → Click "Built-in tools" under "Build" in the left menu
  3. Click "Browser"

15

  1. In the "Browser sessions" section, look for a session with status "Ready"
  2. Click the "View live session" link

16

A new window will open, displaying the agent browsing web pages in real time!

17

It's interesting to be able to watch the AI operating the browser right in front of you. It judges each time and takes appropriate actions.

Part 4: Write Code and Have It Calculate — Code Interpreter

Add the Code Interpreter tool and experience how it can execute code and perform data analysis.

4-1. Add the Code Interpreter Tool

  1. Click the "Add tool" button in the "Tools" section of the settings panel
  2. Select code interpreter under "Select tool type"
  3. Select "AgentCore Code Interpreter Tool"
  4. Click "Add"

There may be a cold start of about 30 seconds on the first launch. Just wait patiently.

4-2. Have It Calculate an AWS Cost Comparison

Send the following.

Assuming an API with 10 million requests per month, please compare the costs of Lambda + API Gateway and ECS Fargate. Calculate the monthly cost for each and tell me which is cheaper.

The agent automatically generates and executes Python code, and outputs a comparison table of the cost calculation results.

18

Behind the scenes, Code Interpreter executes code multiple times to calculate. If you open the tool call logs, you can see the agent working hard on the calculations!

19

Expanding the log also lets you see the actual Python code that was executed. Code Interpreter is a service that safely executes code in a sandbox completely isolated from the agent's environment.

Note that files generated by Code Interpreter (such as CSV or graph images) cannot be directly downloaded from the playground. If you want to make full use of file output, it's worth considering tools that output to a specified S3, or implementing it separately in code.

20

4-3. Experience Automatic Tool Selection

Next, send this.

Please check the following AWS official page and find out if there are ways to further reduce the Lambda costs from earlier. Also estimate the amount of savings.
https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html

Observe how the agent automatically uses Code Interpreter (calculation) and Browser (web browsing) as needed. Which tool to use is not specified by humans — the agent judges on its own.

21

22

So far, we've tried two tools: Browser and Code Interpreter!

What's interesting is that the agent itself judges when to use which tool. Without individually instructing it to go look at the web page and then calculate, it combines the appropriate tools for a single question to build its answer.

Part 5: Search Internal Documents — Gateway / RAG

Connect to the Knowledge Base via Gateway and experience an agent that can reference internal documents.

Connecting Gateway to Harness means that by adding just one, multiple tools (KB search, AWS documentation search, web retrieval) become available all at once.

23

5-1. Configure Gateway in the Harness

Gateway is configured not from the playground's settings panel but from the Harness editing screen. Since changes here update the Harness's actual settings, you need to click "Save" at the end.

  1. Click "Harness preview" under "Build" in the left menu, and return to the harness list
  2. Select the harness you created this time and click "Edit"
  3. Check "Gateway" in the "Tools & Options" section
  4. Select rag-gateway-kit under "Please select a gateway"
  5. Select "IAM role" for outbound authentication
  6. Click "Save"

24

After configuring, click "Test harness in playground" to return to the playground.

5-2. Ask the Knowledge Base a Question

Send the following.

Using RAG, please search for information about AgentCore Runtime deployment procedures.

It's OK if you get a response citing the contents of the document retrieved from the Knowledge Base via Gateway!

25

If the tool call log shows "Kb-Retrieve Retrieve Documents", it's answering from documents that were ingested in advance, not from the web.

5-3. Try Adding a Document (Optional)

The Knowledge Base documents are stored in the S3 bucket (rag-gw-datasource-*). You can also add your own documents and try them out.

  1. Open the rag-gw-datasource-* bucket in the S3 console and upload a text file
  2. Bedrock console → Left menu "Knowledge Bases" → Open "RagGatewayKnowledgeBase"
  3. Select the data source and click the "Sync" button
  4. Once sync is complete, return to the playground and ask a question about the content of the added document

5-4. Try the Other Gateway Tools

Gateway contains not only KB search, but also an AWS Knowledge MCP Server and URL Fetch tool.

The AWS Knowledge MCP Server is a tool that can search AWS official documentation. Try sending the following.

Please tell me about AgentCore Payments.

The agent searches the official documentation via the AWS Knowledge MCP Server and responds.

26

Looking at the tool call logs, you can see that Aws-Knowledge Aws Search Documentation and Aws-Knowledge Aws Read Documentation are being used.

27

Next, let's also try the URL Fetch tool.

Please retrieve the following URL's page and summarize its content.
https://dev.classmethod.jp/articles/bedrock-agentcore-managed-harness-preview/

The fetch_webpage tool within Gateway is used to retrieve the content of the specified web page and summarize it.

28

It's great that by adding just one Gateway, multiple tools become available all at once. It's also a nice point that you can easily add new features to the Gateway and don't need to change the harness itself.

Going back to the discussion of the Browser tool's use cases mentioned in Part 3, searching AWS official documentation is more efficient when using the AWS Knowledge MCP Server as we did this time. You can leverage existing assets, the response is faster, and token consumption is lower.

The image is to limit the Browser tool to situations where you access a specific URL and perform page operations, and to choose appropriate MCP tools for information search based on the purpose.

Supplement: AWS Knowledge MCP Server Can Be Used Without Gateway

Even if building the Gateway is difficult, the AWS Knowledge MCP Server alone can be added directly from the playground.

  1. Click the "Add tool" button in the "Tools" section of the settings panel
  2. Select remote MCP server under "Select tool type"
  3. Enter https://knowledge-mcp.global.api.aws as the endpoint URL
  4. Click "Add"

38

Once added, it will appear as mcp_1 in the tools section of the settings panel. After that, try asking a question to search AWS official documentation the same way as in Part 5-4.

39

The first time may take about 30 seconds to connect to the MCP server. After the second time it will work smoothly.

If you only need to search AWS documentation, this is sufficient. Adding MCP servers is easy, so publicly available MCP Servers are easy to handle in Harness too!

Part 6: Check What's Happening Behind the Scenes — Observability

The main agent building experience is complete up to Part 5. From here, try it if you have time.

There is a "Show observability" button in the upper right of the playground screen.

29

Clicking it opens a dashboard where you can check the agent's execution traces and resource consumption.

30

A list of traces per session and CPU/memory consumption metrics are displayed. Clicking on a trace ID lets you check the agent's internal operations in detail.

31

The span tree on the left shows which tools the agent called in what order. You can check the latency and token count for each span, and the tool input/output is displayed on the right.

In this example, you can even confirm the content of the documents retrieved from the Knowledge Base. The trajectory at the bottom visualizes the flow of agent thinking → tool calls → responses as a flowchart.

When actually operating, you use this observability feature to identify where time is being spent and improve performance, or find unintended actions and review processing for continuous improvement. Being able to visually see the processing is helpful for gaining insights.

Bonus: Evaluate the Agent — Evaluations

A mechanism for automatically evaluating the quality of the agent you built is also available. It can be configured from "Evaluations" under "Evaluations" in the left menu.

32

Creating an Evaluation Configuration

Clicking "Create evaluation configuration" opens the evaluation settings screen.

33
Just by selecting the agent and endpoint, you can automatically evaluate the traces of that agent. Checking "Enable this evaluation configuration after creation" will immediately start the evaluation.

Selecting Evaluators

Multiple built-in evaluators are provided. For this hands-on, we selected the following 5.

  • Correctness
    • Looks at whether the information contained in the response is factually accurate. It can be used to verify that information retrieved from tools is correctly reflected.
  • Faithfulness
    • Looks at whether the response is based on the context retrieved from tools. This is a check to see if the content of documents retrieved by RAG is not being fabricated.
  • Response relevance
    • Looks at whether the response accurately addresses the user's question. This is to confirm that the response is not off-topic from the question.
  • Coherence
    • Looks at whether the response is logically structured and consistent.
  • Goal success rate
    • Looks at whether the user's goals were achieved throughout the entire session. While the other four evaluate individual responses (TRACE), this one's characteristic is that it evaluates the entire session.

Please select based on your purpose.

34

Category Evaluator Scope Evaluation Content
Response quality Correctness TRACE Evaluates whether the information in the response is factually accurate
Response quality Faithfulness TRACE Evaluates whether the response is based on the provided context/sources
Response quality Response relevance TRACE Evaluates whether the response appropriately addresses the user's question
Response quality Coherence TRACE Evaluates whether the response is logically structured and consistent
Response quality Helpfulness TRACE Evaluates how useful and valuable the response is from the user's perspective
Response quality Conciseness TRACE Evaluates whether the response is appropriately concise without omitting important information
Response quality Instruction following TRACE Evaluates how well the system prompt instructions are followed
Response quality Refusal TRACE Detects whether questions are evaded or responses are directly refused
Task completion Goal success rate SESSION Evaluates whether the user's goals were achieved throughout the entire session
Component level Tool selection accuracy TOOL_CALL Evaluates whether the appropriate tool was selected for the task
Component level Tool parameter accuracy TOOL_CALL Evaluates whether tool parameters were accurately extracted from user input
Safety Harmfulness TRACE Detects whether harmful content is included in responses
Safety Stereotyping TRACE Detects generalizations/stereotypes about specific groups
Trajectory TrajectoryExactOrderMatch SESSION Evaluates whether the actual tool call order exactly matches expectations
Trajectory TrajectoryInOrderMatch SESSION Evaluates whether the expected tools are included in order (other tools in between are OK)
Trajectory TrajectoryAnyOrderMatch SESSION Evaluates whether all expected tools were used regardless of order

Scope represents the granularity of evaluation. TRACE evaluates individual responses (one interaction), SESSION evaluates the entire session (multiple interactions), and TOOL_CALL evaluates tool selection and parameters respectively.

Trajectory-type evaluations allow you to define the expected tool call order and verify whether the agent made the correct tool selection.

In this online evaluation, the traces during the hands-on are being evaluated in real time, but with batch evaluation, you can prepare a dataset and evaluate in bulk offline. If you want to systematically measure quality by preparing expected values (ground truth data) like Trajectory evaluation, consider batch evaluation as well.

https://dev.classmethod.jp/articles/amazon-bedrock-agentcore-evaluations-batch-evaluation/

You can set the percentage of traces to be evaluated with the sampling rate (100% is fine for the hands-on).

35

Checking Evaluation Results

After asking several questions and completing the evaluation, scores can be viewed at the session level and trace level.

36

Scores such as ResponseRelevance / Correctness / Coherence / Faithfulness are automatically assigned to each trace.

Evaluation results can also be viewed from the observability dashboard introduced in Part 6.

37

The flow is to check this dashboard, verify what kind of behavior is occurring with the observability features mentioned earlier, and identify any problematic interactions.

In production operation, you run a cycle of improving the system prompt and tool configuration based on these evaluation results.
Of course, since these are results determined by AI, it is also important not to take them at face value and to visually verify any results that feel off.

Cleanup

To delete the resources created during the hands-on, follow the steps below.

  1. Bedrock console → Harness preview → Select the created harness and click "Delete"
  2. Delete the Gateway / Knowledge Base deployed with CDK
Cleanup command
cd agentcore-gateway-kit
npx cdk destroy --all

Closing

Were you able to experience that you can build an AI agent with Managed Harness alone?

Starting from a chatbot and progressively adding tools like Browser / Code Interpreter / Gateway, you can clearly see how the agent autonomously selects tools and carries out tasks. I hope you also got a sense of how the agent's capabilities continue to expand by connecting external tools and data sources through the Gateway.

The Managed Harness we explored this time is just the entry point to AgentCore. The following article summarizes what kind of service suite AgentCore is as a whole.

https://dev.classmethod.jp/articles/amazon-bedrock-agentcore-2025-summary/

If you want to develop and deploy agents more flexibly, there is an option called AgentCore CLI. The implementation in code is not that difficult either, so please refer to the following article for how to get started.

https://dev.classmethod.jp/articles/agentcore-cli-inspector-web-ui/

I hope this article was helpful in some way. Thank you very much for reading to the end!


国内企業 AI活用実態調査2026 配布中

クラスメソッドが独自に行なったAI診断調査をもとに、企業のAI活用の現在地を調査レポートとしてまとめました。企業規模別の活用度傾向に加え、規模を超えてAI活用を進める企業に共通する取り組みまで、自社の現在地を捉えるためのヒントにぜひ。

国内企業 AI活用実態調査2026

無料でダウンロードする

Share this article