I tried deploying an AI agent to AgentCore using AgentCore CLI

2026.03.22


Introduction

Hello, this is Kanno from the Consulting Department. Lately I've been into the Life supermarket chain.

AWS has released a new CLI tool called AgentCore CLI (@aws/agentcore) as a Public Preview.
It allows you to create projects, develop locally, and deploy to AWS with a single command.

https://github.com/aws/agentcore-cli

You might be thinking, "Wait, wasn't there an AgentCore Starter Toolkit already...?" I was also curious about the difference between these two.
While keeping the differences in mind, I'll introduce the process of creating, deploying, and testing an agent using this AgentCore CLI.

Prerequisites

I used the following environment for this demonstration.

Environment Information

| Item | Version / Information |
| --- | --- |
| Node.js | v25.5.0 |
| Python | 3.13.11 |
| AgentCore CLI | 0.3.0-preview.6.1 |

AgentCore CLI

AgentCore CLI is a command-line tool for creating, developing, and deploying AI agents on Amazon Bedrock AgentCore.

Its key features include:

  • Interactive project template creation (agentcore create)
  • Local development server (agentcore dev)
  • AWS CDK-based infrastructure deployment (agentcore deploy)
  • Adding resources like Memory, Gateway, Identity, Evaluator via agentcore add
  • Interactive TUI (Terminal UI)

Differences from Starter Toolkit

Before AgentCore CLI appeared, there was a Python-based tool called Bedrock AgentCore Starter Toolkit.

https://github.com/aws/bedrock-agentcore-starter-toolkit

The differences can be summarized as follows:

| Aspect | AgentCore CLI | Starter Toolkit |
| --- | --- | --- |
| Implementation | The CLI itself is provided as a Node.js package (@aws/agentcore) and currently generates Python agents | A Python-based toolkit. It provides a quickstart for Python, plus guides for TypeScript agents |
| User experience | Commands like agentcore create plus an interactive terminal UI | CLI-based. Interactive prompts, but without the terminal UI experience of AgentCore CLI |
| Infrastructure | Generated projects include agentcore/cdk/, allowing CDK-based infrastructure management | The create command can automatically generate CDK/Terraform templates (https://dev.classmethod.jp/articles/bedrock-agentcore-starter-toolkit-create-command/), but you need to run separate commands for deployment |
| Recommendation for new users | The official README suggests Starter Toolkit users uninstall it and switch, indicating it's the preferred choice for new users | Still usable, but if you're starting fresh, it's better to check out AgentCore CLI first |

The Starter Toolkit covered similar functionality, but AgentCore CLI provides a more polished developer experience with its terminal UI wizards, rating scale presets, and CDK-based infrastructure management.

So which one should we use? Will they continue to be updated?

https://github.com/aws/agentcore-cli/issues/317#issuecomment-3936726654

When Minorun asked this question in an issue, the response was that AgentCore CLI is the standard going forward and is recommended for use now. It seems reasonable to choose AgentCore CLI for new development or when in doubt.

Terminal UI

Let me introduce this feature first as it helps understand the rest of the content.
One interesting feature of AgentCore CLI is its interactive terminal UI. You can launch it by running agentcore without arguments.

Command
agentcore

[Screenshot]

It looks like we can do various things. Let's try invoke to run an AI agent. (Assuming you've already deployed an AI agent)

[Screenshot]

We got a streaming response!

Now let's try evals to evaluate the AI agent.

[Screenshots]

You can interactively select the Evaluator, target period, and target sessions.
Then you get results like this:

[Screenshot]

In the terminal UI, available commands are listed and you're guided through operations.
Many of the non-interactive commands I'll introduce today can be accessed through the terminal UI. It's nice not having to remember CLI commands and being able to find operations through the terminal UI! I found it convenient that evaluations can be completed entirely within the terminal.

Let's install it and try it out.

Installation

Install globally using npm.

Command
npm install -g @aws/agentcore

After installation, check the version:

Command
agentcore --version
0.3.0-preview.6.1

If you see the version number, you're good to go!

Creating a Project

Let's first understand the management structure of AgentCore CLI:

Project (agentcore.json)
├── Agents (agents[])
├── Memories (memories[])
├── Credentials (credentials[])
├── Custom Evaluators (evaluators[])
├── Online Evaluation Configs (onlineEvalConfigs[])
└── Gateway (mcp.json)
    └── Targets (targets[])

The agentcore create command creates an entire project, which includes one agent by default.

Resources like Memory, Gateway, and Credential are all defined at the project level, and agents reference these resources through environment variables in their code. Deployment with agentcore deploy targets the entire project.

You can add agents to the project with the agentcore add agent command:

Command
agentcore add agent --name SubAgent --framework Strands --model-provider Bedrock --protocol HTTP

The protocol can be HTTP, MCP, or A2A. You can also specify --type byo to bring your own code.

Creating Interactively

Create a project with agentcore create. You can select the framework and model provider interactively.

Command
agentcore create

The interactive wizard will ask for:

  1. Project name
  2. Whether to add an agent to the project
  3. Agent name
  4. Whether to create a new agent or bring your own existing code
  5. Language (currently only Python; TypeScript is marked "Soon", so support looks to be on the way)
  6. Build type (Direct Code Deploy / Container)
  7. Protocol (HTTP / A2A / MCP)
  8. Framework (Strands / LangChain + LangGraph / Google ADK / OpenAI Agents)
  9. Model provider (Bedrock / Anthropic / OpenAI / Gemini)
  10. Memory
  11. Network mode (PUBLIC / VPC)

The wizard is quite thorough.
For this demonstration, I created a project with Strands Agents + Amazon Bedrock and enabled Memory.

I enabled long-term and short-term memory, which activated three long-term memory strategies: SEMANTIC, SUMMARIZATION, and USER_PREFERENCE.

Creating Non-Interactively

For CI/CD or scripting, you can create projects non-interactively with flags:

Command
agentcore create --name MyAgent --framework Strands --model-provider Bedrock --memory longAndShortTerm --defaults

The --defaults flag uses default values for unspecified options.

Generated Project Structure

Let's examine the directory structure after creation:

SampleProject/
├── agentcore/
│   ├── .env.local          # Environment variables for local development (gitignored)
│   ├── agentcore.json      # Resource definitions (agents, memories, etc.)
│   ├── aws-targets.json    # Deployment region settings
│   └── cdk/                # CDK infrastructure code
├── app/
│   └── MyAgent/
│       ├── main.py          # Agent entry point
│       ├── pyproject.toml   # Python dependencies
│       ├── Dockerfile       # For container builds
│       ├── mcp_client/
│       │   └── client.py    # MCP client connection settings
│       ├── memory/
│       │   └── session.py   # Memory session management
│       └── model/
│           └── load.py      # Model loading settings

agentcore/agentcore.json is the core configuration file that contains all agent, memory, credential, and evaluation definitions. The actual agent code is generated under the app/ directory as a framework-specific template.

The generated code is ready to deploy and test.

Configuration File Details

Let's take a closer look at the configuration files in the agentcore/ directory.

agentcore.json

This is the core configuration file. Here's what was generated for our project:

agentcore/agentcore.json
{
  "name": "sampleProject",
  "version": 1,
  "agents": [
    {
      "type": "AgentCoreRuntime",
      "name": "MyAgent",
      "build": "Container",
      "entrypoint": "main.py",
      "codeLocation": "app/MyAgent/",
      "runtimeVersion": "PYTHON_3_12",
      "networkMode": "PUBLIC",
      "modelProvider": "Bedrock",
      "protocol": "HTTP"
    }
  ],
  "memories": [
    {
      "type": "AgentCoreMemory",
      "name": "MyAgentMemory",
      "eventExpiryDuration": 30,
      "strategies": [
        {
          "type": "SEMANTIC",
          "namespaces": [
            "/users/{actorId}/facts"
          ]
        },
        {
          "type": "USER_PREFERENCE",
          "namespaces": [
            "/users/{actorId}/preferences"
          ]
        },
        {
          "type": "SUMMARIZATION",
          "namespaces": [
            "/summaries/{actorId}/{sessionId}"
          ]
        }
      ]
    }
  ],
  "credentials": [],
  "evaluators": [],
  "onlineEvalConfigs": []
}

Here's a summary of the main fields:

| Field | Description |
| --- | --- |
| name | Project name (alphanumeric, up to 23 characters) |
| version | Schema version |
| agents[] | Array of agent definitions. Specify build (Direct Code Deploy / Container), runtimeVersion (PYTHON_3_10 to PYTHON_3_13), networkMode (PUBLIC / VPC), etc. |
| memories[] | Array of Memory resources. Define strategies (SEMANTIC / SUMMARIZATION / USER_PREFERENCE) and expiration |
| credentials[] | Array of API keys or OAuth credentials |
| evaluators[] | Custom evaluator definitions |
| onlineEvalConfigs[] | Online evaluation settings |

When you add agents or Memory with agentcore add, they're automatically reflected in this file.
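Since agentcore.json is plain JSON, it's also easy to inspect from your own scripts. As a rough sketch (the summarize_project helper below is my own, not part of the CLI), here's one way to list the resources a project declares:

```python
import json

def summarize_project(config: dict) -> dict:
    """List the resources declared in an agentcore.json-style dict."""
    return {
        "name": config.get("name"),
        "agents": [a["name"] for a in config.get("agents", [])],
        "memories": [m["name"] for m in config.get("memories", [])],
        "credentials": len(config.get("credentials", [])),
    }

# Trimmed-down version of the generated file shown above
sample = json.loads("""{
  "name": "sampleProject",
  "version": 1,
  "agents": [{"type": "AgentCoreRuntime", "name": "MyAgent"}],
  "memories": [{"type": "AgentCoreMemory", "name": "MyAgentMemory"}],
  "credentials": []
}""")
print(summarize_project(sample))
```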

To pass custom environment variables to an agent, add an envVars field in the agents[] array:

agentcore/agentcore.json (envVars example)
{
  "agents": [
    {
      "type": "AgentCoreRuntime",
      "name": "MyAgent",
      ...
      "envVars": [
        { "name": "LOG_LEVEL", "value": "INFO" },
        { "name": "CUSTOM_PARAM", "value": "my-value" }
      ]
    }
  ]
}

For envVars names, you can only use alphanumeric characters and underscores. These environment variables will be passed to the runtime both during local development (agentcore dev) and after deployment. Note that there's no direct subcommand in agentcore add to add environment variables, so you'll need to edit agentcore.json directly.
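Since you're editing the file by hand, it can be worth checking that rule before deploying. A minimal sketch (the validator is mine, not a CLI feature):

```python
import re

# agentcore.json envVars names may contain only alphanumerics and underscores
ENV_NAME_RE = re.compile(r"[A-Za-z0-9_]+")

def is_valid_env_var_name(name: str) -> bool:
    """Return True if the name uses only alphanumerics and underscores."""
    return bool(ENV_NAME_RE.fullmatch(name))

print(is_valid_env_var_name("LOG_LEVEL"))   # True
print(is_valid_env_var_name("log-level"))   # False: hyphens aren't allowed
```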

mcp.json

This is the Gateway configuration file. It defines MCP (Model Context Protocol) compatible gateways and their targets. It's not generated during project creation but is automatically created when you add a Gateway with agentcore add gateway.

agentcore/mcp.json
{
  "agentCoreGateways": [
    {
      "name": "ToolGateway",
      "description": "Gateway for ToolGateway",
      "targets": [],
      "authorizerType": "NONE",
      "enableSemanticSearch": true,
      "exceptionLevel": "NONE"
    }
  ]
}

| Field | Description |
| --- | --- |
| agentCoreGateways[] | Array of Gateway definitions |
| targets[] | Array of connection targets (add with agentcore add gateway-target) |
| authorizerType | Authorization type (NONE / AWS_IAM / CUSTOM_JWT) |
| enableSemanticSearch | Enable semantic search (default: true) |
| exceptionLevel | Exception level (NONE / DEBUG) |

When you create a Gateway, targets starts as an empty array, and you add targets later with agentcore add gateway-target.

aws-targets.json

This configuration file specifies the AWS account and region for deployment:

agentcore/aws-targets.json
[]

By default, it's an empty array, and account information is read when you run agentcore deploy. You can explicitly configure multiple targets:

agentcore/aws-targets.json (example)
[
  {
    "name": "default",
    "account": "123456789012",
    "region": "us-east-1",
    "description": "Default target (us-east-1)"
  }
]

.env.local

This is an environment variable file for local development. It's included in .gitignore and won't be committed to the repository.

Credentials added with agentcore add identity are stored in this file for local development. Environment variables follow this naming convention:

| Environment Variable | Description |
| --- | --- |
| AGENTCORE_CREDENTIAL_{NAME}=value | Credential value added via Identity |

After deployment, these are managed by Identity, so .env.local is only used for local development.
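For local tooling, you could pull those credentials back out of .env.local with a few lines of Python. A sketch that assumes only the AGENTCORE_CREDENTIAL_{NAME}=value convention described above (the helper itself is mine):

```python
def parse_credentials(env_text: str) -> dict:
    """Extract Identity credentials from .env.local-style text.

    Only lines following the AGENTCORE_CREDENTIAL_{NAME}=value
    convention are picked up; everything else is ignored.
    """
    prefix = "AGENTCORE_CREDENTIAL_"
    creds = {}
    for line in env_text.splitlines():
        line = line.strip()
        if line.startswith(prefix) and "=" in line:
            key, _, value = line.partition("=")
            creds[key[len(prefix):]] = value
    return creds

sample = "AGENTCORE_CREDENTIAL_OPENAI=sk-dummy\nOTHER_VAR=ignored\n"
print(parse_credentials(sample))  # {'OPENAI': 'sk-dummy'}
```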

Examining Agent Code

Let's look at the agent's entry point, app/MyAgent/main.py:

app/MyAgent/main.py
from strands import Agent, tool
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from model.load import load_model
from mcp_client.client import get_streamable_http_mcp_client
from memory.session import get_memory_session_manager

app = BedrockAgentCoreApp()
log = app.logger

# MCP client definition
mcp_clients = [get_streamable_http_mcp_client()]

# Tool definition
tools = []

@tool
def add_numbers(a: int, b: int) -> int:
    """Return the sum of two numbers"""
    return a + b
tools.append(add_numbers)

for mcp_client in mcp_clients:
    if mcp_client:
        tools.append(mcp_client)

def agent_factory():
    cache = {}
    def get_or_create_agent(session_id, user_id):
        key = f"{session_id}/{user_id}"
        if key not in cache:
            cache[key] = Agent(
                model=load_model(),
                session_manager=get_memory_session_manager(session_id, user_id),
                system_prompt="You are a helpful assistant. Use tools when appropriate.",
                tools=tools
            )
        return cache[key]
    return get_or_create_agent
get_or_create_agent = agent_factory()

@app.entrypoint
async def invoke(payload, context):
    log.info("Invoking Agent.....")
    session_id = getattr(context, 'session_id', 'default-session')
    user_id = getattr(context, 'user_id', 'default-user')
    agent = get_or_create_agent(session_id, user_id)

    stream = agent.stream_async(payload.get("prompt"))
    async for event in stream:
        if "data" in event and isinstance(event["data"], str):
            yield event["data"]

if __name__ == "__main__":
    app.run()

BedrockAgentCoreApp is a runtime wrapper, and functions decorated with @app.entrypoint become the agent's entry points.

The generated code modularizes Memory, MCP client, and model loading:

  • model/load.py — Creates a BedrockModel instance
  • memory/session.py — Session management using AgentCoreMemorySessionManager. Reads the Memory ID from the MEMORY_MYAGENTMEMORY_ID environment variable
  • mcp_client/client.py — Connection settings for the Streamable HTTP MCP client

When you add a Gateway and deploy, the Gateway URL is automatically injected into the environment variable AGENTCORE_GATEWAY_{GATEWAY_NAME}_URL. By modifying mcp_client/client.py to read from this environment variable, you can connect to external tools through the Gateway.
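In practice that just means an environment variable lookup in your agent code. A minimal sketch of the lookup side (the helper name and the example URL are mine; only the AGENTCORE_GATEWAY_{GATEWAY_NAME}_URL convention comes from the article):

```python
import os

def gateway_url(gateway_name: str):
    """Return the URL AgentCore injects for a deployed Gateway.

    Follows the AGENTCORE_GATEWAY_{GATEWAY_NAME}_URL naming convention;
    returns None if no such Gateway has been deployed.
    """
    return os.environ.get(f"AGENTCORE_GATEWAY_{gateway_name.upper()}_URL")

# Simulate the variable injected for a gateway named ToolGateway
os.environ["AGENTCORE_GATEWAY_TOOLGATEWAY_URL"] = "https://example.gateway.amazonaws.com/mcp"
print(gateway_url("ToolGateway"))  # https://example.gateway.amazonaws.com/mcp
```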

Adding Resources

With AgentCore CLI, you can declaratively add resources besides agents using the agentcore add command.
The terminal UI allows you to add and create resources interactively, which is convenient. The supported resources seem to cover all AgentCore primitives.

[Screenshots]

The interactive approach seems user-friendly. If you don't need to automate, starting with the interactive mode helps navigate without getting lost.

You can also add resources in non-interactive mode. Here are some command examples:

Adding Memory

Command
agentcore add memory --name SharedMemory --strategies SEMANTIC,SUMMARIZATION --expiry 30

Memory strategies include SEMANTIC (vector search), SUMMARIZATION (conversation summary), and USER_PREFERENCE (remembering user settings). Added Memory is defined in the memories array in agentcore.json and can be referenced in agent code via the environment variable MEMORY_SHAREDMEMORY_ID.

Adding Gateway

Command
agentcore add gateway --name ToolGateway

Gateways are MCP (Model Context Protocol) compatible proxies that manage connections to external tools. You can add targets to a Gateway to connect Lambda functions or external MCP servers. Target types include mcp-server, api-gateway, open-api-schema, smithy-model, and lambda-function-arn.

Adding an MCP server as a target
agentcore add gateway-target \
  --gateway ToolGateway \
  --name WeatherAPI \
  --type mcp-server \
  --endpoint https://mcp.example.com/mcp

To set up JWT authorization for a Gateway, specify --authorizer-type CUSTOM_JWT:

Adding JWT authentication
agentcore add gateway \
  --name SecureGateway \
  --authorizer-type CUSTOM_JWT \
  --discovery-url https://example.auth0.com/.well-known/openid-configuration \
  --allowed-audience my-api-audience

For Lambda targets, pass the tool definition in a JSON file (--tool-schema-file):

Adding a Lambda function as a target
agentcore add gateway-target \
  --gateway ToolGateway \
  --name WeatherLambda \
  --type lambda-function-arn \
  --lambda-arn arn:aws:lambda:us-east-1:123456789012:function:get-weather \
  --tool-schema-file ./tool-schema.json

Adding Identity

This is a mechanism for securely managing API keys or OAuth credentials. During local development, they're stored in .env.local, and after deployment, they're managed by AgentCore Identity.

Adding an API key
agentcore add identity --name OpenAI --api-key sk-...
Adding OAuth credentials
agentcore add identity \
  --type oauth \
  --name MyOAuthService \
  --discovery-url https://example.com/.well-known/openid-configuration \
  --client-id my-client-id \
  --client-secret my-client-secret \
  --scopes "read,write"

During deployment, you can choose whether to use values from .env.local, enter them manually, or skip and address them later.

[Screenshot]

If you're unsure about commands, using the terminal UI to add resources is probably the best approach.
Now that I've introduced various command examples, let's try local development and AWS deployment.

Local Development

Let's run the agent locally.

Command
agentcore dev

This command launches an interactive window. Let's try saying "hello":

[Screenshot]

Great! The agent is working locally! Being able to interact quickly is nice.

Deploying to AWS

Finally, it's time to deploy to AWS.

Command to execute
agentcore deploy

Internally, AWS CDK is executed, and the following resources are created:

  • AgentCore Runtime endpoint
  • ECR repository and container image
  • AgentCore Memory resources (if configured)
  • IAM roles
  • CloudWatch log groups

Additionally, if CloudWatch Transaction Search is not yet enabled in the target region, the deploy flow now enables it automatically (PR #506). This lets you search and analyze agent traces and session details in CloudWatch immediately after deployment.

Until now, you needed to manually enable it once, so this is a welcome update.

After deployment is complete, let's check the status.

Command to execute
agentcore status

[Screenshot]

You can check the agent's status, endpoint information, environment variable settings, and more.

For this deployment, I added Gateway, Identity, Memory, and custom environment variables before deploying. When checking the console, the following environment variables were automatically set:

| Environment Variable | Example Value | Source |
| --- | --- | --- |
| MEMORY_MYAGENTMEMORY_ID | SampleProject_MyAgentMemory-xxx | Memory addition |
| AGENTCORE_GATEWAY_TOOLGATEWAY_URL | https://sampleproject-toolgateway-xxxxx.gateway...amazonaws.com/mcp | Gateway addition |
| AGENTCORE_GATEWAY_TOOLGATEWAY_AUTH_TYPE | AWS_IAM | Gateway addition |
| CREDENTIAL_OPENAI_NAME | OpenAI | Identity addition |
| CUSTOM_PARAM | my-value | Defined in envVars |
| LOG_LEVEL | INFO | Defined in envVars |

Environment variables corresponding to added resources are automatically injected. You can access them in your agent code using os.environ.get("MEMORY_MYAGENTMEMORY_ID"), so there's no need to hardcode connection information.

Testing the Deployed Agent

After deployment is complete, let's send a prompt (こんにちは, "Hello") to the cloud-based agent.

Command to execute
agentcore invoke "こんにちは" --agent MyAgent

When executed, the following response is displayed:

Execution result
Provider: Bedrock
こんにちは!お元気ですか?

何かお手伝いできることはありますか?質問や知りたいことがあれば、お気軽にお聞きください。

Log: /path/to/project/agentcore/.cli/logs/invoke/invoke-MyAgent-20260322-122023.log

We got a response! Provider: Bedrock shows the model provider information, and Log: at the end displays the path to the log file where request and response details are recorded.

Let's look at the contents of the log file.

Contents of invoke log file
================================================================================
AGENTCORE INVOKE LOG
Agent: MyAgent
Runtime ARN: arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/SampleProject_MyAgent-xxxxx
Region: us-east-1
Session ID: none
Started: 2026-03-22T03:20:23.910Z
================================================================================

[12:20:23.911] INVOKE REQUEST (Session: none)

--- REQUEST ---
{
  "timestamp": "2026-03-22T03:20:23.911Z",
  "agent": "MyAgent",
  "runtimeArn": "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/SampleProject_MyAgent-xxxxx",
  "region": "us-east-1",
  "prompt": "こんにちは"
}
--- END REQUEST ---

[12:20:28.746] INVOKE RESPONSE (4835ms)

--- RESPONSE ---
{
  "timestamp": "2026-03-22T03:20:28.746Z",
  "durationMs": 4835,
  "success": true,
  "response": "こんにちは!お元気ですか?..."
}
--- END RESPONSE ---

Requests and responses are recorded in JSON format, and you can also check the response time. This can be useful for debugging and performance analysis.
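If you want to pull numbers like durationMs out of these logs programmatically, the RESPONSE block is easy to extract. A sketch that relies on the marker lines seen above (the log layout is a preview-time observation and may change):

```python
import json
import re

def extract_response(log_text: str) -> dict:
    """Parse the JSON between the RESPONSE markers of an invoke log."""
    match = re.search(
        r"--- RESPONSE ---\n(.*?)\n--- END RESPONSE ---", log_text, re.DOTALL
    )
    if not match:
        raise ValueError("no RESPONSE block found")
    return json.loads(match.group(1))

sample_log = """--- RESPONSE ---
{
  "timestamp": "2026-03-22T03:20:28.746Z",
  "durationMs": 4835,
  "success": true,
  "response": "..."
}
--- END RESPONSE ---"""
print(extract_response(sample_log)["durationMs"])  # 4835
```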

Continuing Conversations with Session IDs

Next, let's try using a session ID to maintain conversation context with Memory. The prompt 私の名前は神野です means "My name is Kanno."

Command to execute
agentcore invoke "私の名前は神野です" --session-id my-session-001-01231233123123123123312312312 --agent MyAgent
Execution result
Provider: Bedrock
こんにちは、神野さん!はじめまして。

お名前を教えていただきありがとうございます。何かお手伝いできることはありますか?質問や知りたいことがあれば、お気軽にお尋ねください。

Log: /path/to/project/agentcore/.cli/logs/invoke/invoke-MyAgent-20260322-122722.log

Let's continue calling with the same session ID, this time asking 記憶しておりますか?先ほどの発言を。 ("Do you remember what I said earlier?").

Command to execute
agentcore invoke "記憶しておりますか?先ほどの発言を。" --session-id my-session-001-01231233123123123123312312312 --agent MyAgent
Execution result
Provider: Bedrock
はい、記憶しております!

先ほど「私の名前は神野です」とおっしゃいましたね。神野さんのお名前はしっかり覚えています。

何かお手伝いできることがありましたら、お気軽にお申し付けください。

Log: /path/to/project/agentcore/.cli/logs/invoke/invoke-MyAgent-20260322-122746.log

It remembered the name properly! We confirmed that using the same session ID allows AgentCore Memory to maintain the conversation context.

Evaluations

The AgentCore CLI includes built-in LLM-as-a-Judge based evaluation features. It's helpful to be able to quantitatively measure the quality of agent responses. Let's create an Evaluator and run an evaluation.

First, let me briefly explain Evaluators.

Built-in Evaluator

AgentCore comes with pre-defined Built-in Evaluators, so you can start evaluating right away without writing custom instructions.

https://docs.aws.amazon.com/ja_jp/bedrock-agentcore/latest/devguide/prompt-templates-builtin.html

Creating a Custom Evaluator

If built-in Evaluators aren't sufficient, you can create custom Evaluators with the AgentCore CLI.

Command to execute
agentcore add evaluator \
  --name ResponseQuality \
  --level SESSION \
  --model us.anthropic.claude-sonnet-4-5-20250929-v1:0 \
  --instructions "Evaluate the quality and helpfulness of the agent response. Context: {context}" \
  --rating-scale 1-5-quality

You can choose from 3 evaluation levels.

| Level | Evaluation Target |
| --- | --- |
| SESSION | Quality of the entire session |
| TRACE | Response accuracy for each turn |
| TOOL_CALL | Appropriateness of individual tool selections |

For --rating-scale, presets include 1-5-quality (default), 1-3-simple, pass-fail, and good-neutral-bad. You can choose between numerical or categorical scales depending on your needs.

Running On-demand Evaluations

After creating and deploying an Evaluator, let's run an evaluation against past traces.

Command to execute
agentcore run evals --agent MyAgent --evaluator ResponseQuality --days 7

This runs an LLM-as-a-Judge evaluation for traces from the specified period (7 days in this case). To use a Built-in Evaluator, specify it like this:

Command to execute
agentcore run evals --agent MyAgent --evaluator Builtin.Helpfulness --days 7

Let's look at the execution results.

Built-in Evaluator execution results
Agent: MyAgent | Mar 22, 2026, 12:28 PM | Sessions: 2 | Lookback: 7d

  Builtin.Helpfulness: 0.50

Results saved to: /path/to/project/agentcore/.cli/eval-results/eval_2026-03-22_03-28-42.json

The evaluation results are displayed in the terminal, and detailed JSON files are also saved. Looking at the JSON contents, you can see scores, labels, and explanations for each session.

eval-results/eval_2026-03-22_03-28-42.json (excerpt)
{
  "timestamp": "2026-03-22T03:28:42.469Z",
  "agent": "MyAgent",
  "evaluators": ["Builtin.Helpfulness"],
  "lookbackDays": 7,
  "sessionCount": 2,
  "results": [
    {
      "evaluator": "Builtin.Helpfulness",
      "aggregateScore": 0.5,
      "sessionScores": [
        {
          "sessionId": "55541c29-a909-48b1-af5f-8815aebf3297",
          "value": 0.5,
          "label": "Neutral/Mixed",
          "explanation": "The user's initial message 'こんにちは' is a greeting..."
        }
      ],
      "tokenUsage": {
        "inputTokens": 1757,
        "outputTokens": 621,
        "totalTokens": 2378
      }
    }
  ]
}

Since our conversation just had greetings like "hello", the Helpfulness score was 0.50 (Neutral/Mixed). This makes sense as we weren't having a particularly useful conversation.
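The saved JSON is also convenient for your own aggregation. In the examples above, aggregateScore appears to be the mean of the per-session values; here's a small sketch of recomputing it (the helper is mine, not part of the CLI):

```python
def mean_session_score(result: dict) -> float:
    """Average the per-session scores in one evaluator result."""
    scores = [s["value"] for s in result["sessionScores"]]
    return sum(scores) / len(scores)

# Shaped like one entry of results[] in the eval-results JSON above
result = {
    "evaluator": "Builtin.Helpfulness",
    "aggregateScore": 0.5,
    "sessionScores": [
        {"sessionId": "a", "value": 0.5, "label": "Neutral/Mixed"},
    ],
}
print(mean_session_score(result))  # 0.5
```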

Let's also try running the custom Evaluator (ResponseQuality).

Custom Evaluator execution results
Agent: MyAgent | Mar 22, 2026, 12:57 PM | Sessions: 2 | Lookback: 7d

  ResponseQuality: 3.00

Results saved to: /path/to/project/agentcore/.cli/eval-results/eval_2026-03-22_03-57-17.json

This also worked well!
Looking at the logs, it received a 3.00 (Good) on the 1-5-quality scale. The evaluation reasons mention "appropriate use of Japanese language and honorifics" and "memory correctly remembering the name", so you can confirm what aspects the Evaluator assessed.

{
  "timestamp": "2026-03-22T03:57:17.969Z",
  "agent": "MyAgent",
  "evaluators": [
    "ResponseQuality"
  ],
  "lookbackDays": 7,
  "sessionCount": 2,
  "results": [
    {
      "evaluator": "ResponseQuality",
      "aggregateScore": 3,
      "sessionScores": [
        {
          "sessionId": "55541c29-a909-48b1-af5f-8815aebf3297",
          "value": 3,
          "label": "Good",
          "explanation": "The user simply greeted the agent with 'こんにちは' (Hello in Japanese). The agent's response is appropriate and well-constructed in several ways: 1) It reciprocates the greeting in Japanese, maintaining language consistency. 2) It asks 'お元気ですか?' (How are you?), which is a natural follow-up to a greeting. 3) It proactively offers assistance by asking if there's anything it can help with. 4) It encourages the user to ask questions freely with 'お気軽にお聞きください' (please feel free to ask). 5) It includes a friendly emoji to create a welcoming tone. The response is culturally appropriate, polite, and conversational. It successfully opens the door for further interaction while being concise and not overwhelming. The agent demonstrates readiness to help without being pushy. This response meets all expectations for handling a simple greeting - it's warm, professional, maintains the user's language choice, and effectively invites continued conversation."
        },
        {
          "sessionId": "my-session-001-01231233123123123123312312312",
          "value": 3,
          "label": "Good",
          "explanation": "The conversation consists of two exchanges. In the first exchange, the user introduces themselves as '神野' (Kamino/Jinno). The agent responds politely in Japanese, acknowledging the name and offering assistance. This is appropriate and courteous.\n\nIn the second exchange, the user asks if the agent remembers the previous statement. The agent confirms it does remember and accurately recalls that the user said '私の名前は神野です' (My name is Kamino/Jinno). The agent demonstrates successful memory retention across the conversation.\n\nStrengths: 1) The agent uses appropriate Japanese language and honorifics (さん suffix), showing cultural awareness. 2) The agent correctly remembers the user's name from the previous exchange, which is the core of what was being tested. 3) The tone is friendly and helpful throughout. 4) The agent explicitly quotes what the user said, confirming accurate memory.\n\nThe responses are appropriate, polite, and functionally correct. The agent successfully demonstrates conversation continuity and memory - a key capability being tested. The quality meets expectations for a basic memory retention test with culturally appropriate responses. There are no significant errors or issues that would warrant a lower score, but the responses are straightforward without exceptional added value that would justify exceeding expectations."
        }
      ],
      "tokenUsage": {
        "inputTokens": 1160,
        "outputTokens": 557,
        "totalTokens": 1717
      }
    }
  ]
}

Setting up Online Evaluation

You can also set up continuous sampling-based evaluation for production traffic.

Command to execute
agentcore add online-eval \
  --name QualityMonitor \
  --agent MyAgent \
  --evaluator ResponseQuality \
  --sampling-rate 10 \
  --enable-on-create

Using the --enable-on-create flag enables online evaluation immediately after deployment. Without this flag, it remains Disabled after deployment, and you need to manually enable it from the management console.

Cleanup

After testing, remove the created resources.

Command to execute
agentcore remove all --force

This command resets the agentcore.json and mcp.json schemas (returns them to empty state). The source code itself isn't changed.

To actually delete AWS resources (AgentCore Runtime, Memory, ECR repository, etc.), run agentcore deploy after resetting the schema.

Command to execute
agentcore deploy

Deploying with empty schemas initiates CloudFormation stack deletion, cleaning up your AWS resources.

Conclusion

While it's still in Public Preview and specifications may change, I found it very helpful that you can follow the terminal UI guide to get started with AgentCore development. Using AgentCore CLI as a foundation for adding resources seems to reduce confusion when starting out.

During testing, I was surprised by how much you can do. There are points I couldn't cover in this article, so I hope to share more use cases and noteworthy aspects in future posts.

I hope this article was helpful. Thank you for reading to the end!
