[Update] Amazon Bedrock AgentCore Runtime now supports Stateful MCP servers

2026.03.12

Introduction

Hello, I'm Jinno from the consulting department, a big fan of the supermarket La Mu.

In a recent update, Amazon Bedrock AgentCore Runtime now supports Stateful MCP servers!

https://aws.amazon.com/jp/about-aws/whats-new/2026/03/amazon-bedrock-agentcore-runtime-stateful-mcp/

At first I wondered what this update actually meant, but the MCP specification defines Elicitation (interactive input collection), Sampling (LLM text-generation requests), and Progress Notifications, and AgentCore Runtime now supports them. I see...

Digging into each feature, they enable the following. The stateful nature seems to make much more interactive behavior possible.

Feature overview:
  • Elicitation: the server interactively requests user input from the client
  • Sampling: the server requests text generation from the client-side LLM
  • Progress Notifications: the server notifies the client of a long-running process's progress in real time

https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation

https://modelcontextprotocol.io/specification/2025-11-25/client/sampling

https://modelcontextprotocol.io/specification/2025-11-25/basic/utilities/progress

Using these features, I imagined a flow like the following for airline ticket booking. This assumes using Elicitation, Sampling, and Progress Notifications.

In this flow, the MCP server can focus on business logic while delegating user interaction and LLM inference to the client side.
Previously, we might have had to implement choice dialogs and session management ourselves on the server, so this mechanism could simplify the server-side implementation.

I'm eager to verify how well this actually works in practice.

For this article, I'll use a simple program as the client rather than an AI agent, and test it based on the official Stateful MCP server features sample, adapting the destinations and data for domestic travel!

https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/mcp-stateful-features.html

Prerequisites

For this project, I used Python and uv as the package manager. The versions used are:

  • Python 3.13.11
  • uv 0.9.26

The library versions used are:

  • fastmcp 3.1.0
  • mcp 1.26.0
  • boto3 1.42.65
  • bedrock-agentcore-starter-toolkit 0.3.2

Initialize the project with uv and install the necessary libraries.

Setup
# Project initialization
uv init --no-readme
uv add fastmcp mcp boto3

# Install Starter Toolkit
uv add bedrock-agentcore-starter-toolkit

Implementation

MCP Server Implementation

First, let's implement a domestic travel planner MCP Server with FastMCP. We'll implement tools using Elicitation / Sampling / Progress Notifications features.

travel_server.py (full code)
travel_server.py
"""
Domestic Travel Planner - Stateful MCP Server
Demo of Elicitation / Sampling / Progress Notifications
"""
import asyncio
import logging
from fastmcp import FastMCP, Context
from enum import Enum

logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
logger = logging.getLogger("travel-planner")

mcp = FastMCP("Japan-Travel-Planner")

class TripType(str, Enum):
    SOLO = "ひとり旅"
    COUPLE = "カップル"
    FAMILY = "家族旅行"
    FRIENDS = "友人旅行"

DESTINATIONS = {
    "京都": {
        "name": "京都",
        "transport": 14000,
        "hotel": 12000,
        "highlights": ["伏見稲荷大社", "嵐山竹林", "清水寺"],
        "gourmet": ["湯豆腐", "抹茶スイーツ", "にしんそば"],
    },
    "沖縄": {
        "name": "沖縄",
        "transport": 40000,
        "hotel": 10000,
        "highlights": ["美ら海水族館", "古宇利島", "首里城"],
        "gourmet": ["ソーキそば", "タコライス", "海ぶどう"],
    },
    "北海道": {
        "name": "北海道",
        "transport": 35000,
        "hotel": 9000,
        "highlights": ["富良野ラベンダー畑", "小樽運河", "旭山動物園"],
        "gourmet": ["海鮮丼", "ジンギスカン", "スープカレー"],
    },
    "福岡": {
        "name": "福岡",
        "transport": 22000,
        "hotel": 8000,
        "highlights": ["太宰府天満宮", "中洲屋台", "糸島"],
        "gourmet": ["博多ラーメン", "もつ鍋", "明太子"],
    },
}

@mcp.tool()
async def plan_trip(ctx: Context) -> str:
    """
    Create a domestic travel plan (using all MCP features):
    1. Elicitation - Inquire about travel preferences
    2. Progress - Search progress for transportation and accommodations
    3. Sampling - Generate recommendations using AI
    """
    # ---- Phase 1: Elicitation ----
    logger.info("[Phase 1] Elicitation start - Asking about destination")
    dest_result = await ctx.elicit(
        message="Where would you like to go?\nOptions: Kyoto, Okinawa, Hokkaido, Fukuoka",
        response_type=str,
    )
    if dest_result.action != "accept":
        logger.info("[Phase 1] User canceled (destination)")
        return "Plan creation canceled."
    dest_key = dest_result.data.strip()
    dest = DESTINATIONS.get(dest_key, DESTINATIONS["京都"])
    logger.info(f"[Phase 1] Destination: {dest['name']}")

    type_result = await ctx.elicit(
        message="What type of trip?\n1. Solo travel\n2. Couple\n3. Family trip\n4. Trip with friends",
        response_type=TripType,
    )
    if type_result.action != "accept":
        logger.info("[Phase 1] User canceled (trip type)")
        return "Plan creation canceled."
    trip_type = type_result.data.value if hasattr(type_result.data, "value") else type_result.data
    logger.info(f"[Phase 1] Trip type: {trip_type}")

    days_result = await ctx.elicit(
        message="How many nights? (1-7 nights)",
        response_type=int,
    )
    if days_result.action != "accept":
        return "Plan creation canceled."
    days = max(1, min(7, days_result.data))

    travelers_result = await ctx.elicit(
        message="How many people?",
        response_type=int,
    )
    if travelers_result.action != "accept":
        return "Plan creation canceled."
    travelers = travelers_result.data
    logger.info(f"[Phase 1] Elicitation complete - {dest['name']}/{trip_type}/{days} nights/{travelers} people")

    # ---- Phase 2: Progress Notifications ----
    logger.info("[Phase 2] Progress Notifications start - Search processing")
    total_steps = 5
    for step in range(1, total_steps + 1):
        await ctx.report_progress(progress=step, total=total_steps)
        await asyncio.sleep(0.4)

    transport_cost = dest["transport"] * travelers
    hotel_cost = dest["hotel"] * days * ((travelers + 1) // 2)
    total_cost = transport_cost + hotel_cost
    logger.info(f"[Phase 2] Progress complete - Cost calculation: ¥{total_cost:,}")

    # ---- Phase 3: Sampling ----
    logger.info("[Phase 3] Sampling start - Requesting AI inference from client")
    ai_tips = f"Enjoy {dest['name']}!"
    try:
        response = await ctx.sample(
            messages=(
                f"Please give me 3 recommendations for {trip_type} to {dest['name']} "
                f"({travelers} people, {days} nights) in 60 characters or less. Please answer in Japanese."
            ),
            max_tokens=200,
        )
        if hasattr(response, "text") and response.text:
            ai_tips = response.text
        logger.info(f"[Phase 3] Sampling complete - Response: {ai_tips[:50]}...")
    except Exception as e:
        logger.warning(f"[Phase 3] Sampling failed: {e}")
        ai_tips = f"{dest['highlights'][0]} is a must-see. Don't miss {dest['gourmet'][0]} too!"

    # ---- Final Confirmation ----
    logger.info("[Phase 4] Final confirmation Elicitation")
    confirm = await ctx.elicit(
        message=f"""
========== Travel Plan Summary ==========
Destination: {dest['name']}
Type: {trip_type}
Schedule: {days} nights {days + 1} days
People: {travelers}

Estimated cost:
  Transportation: ¥{transport_cost:,}
  Accommodation: ¥{hotel_cost:,} ({(travelers + 1) // 2} rooms × {days} nights)
  Total: ¥{total_cost:,}

Would you like to make this reservation?""",
        response_type=["Book", "Cancel"],
    )
    if confirm.action != "accept" or confirm.data == "Cancel":
        logger.info("[Phase 4] User canceled")
        return "Reservation canceled. Search results will be saved for 24 hours."

    logger.info(f"[Phase 4] Reservation confirmed - {dest['name']}/{trip_type}/{days} nights/{travelers} people/¥{total_cost:,}")
    highlights_str = "\n".join(f"  - {h}" for h in dest["highlights"])
    gourmet_str = "\n".join(f"  - {g}" for g in dest["gourmet"])
    return f"""
{'=' * 50}
Reservation confirmed!
{'=' * 50}
Reservation number: TRV-{ctx.session_id[:8].upper()}

Destination: {dest['name']}
Schedule: {days} nights {days + 1} days / {travelers} people
Type: {trip_type}

Transportation: ¥{transport_cost:,}
Accommodation: ¥{hotel_cost:,} ({(travelers + 1) // 2} rooms × {days} nights)
Total: ¥{total_cost:,}

Recommended spots:
{highlights_str}

Local delicacies:
{gourmet_str}

AI recommendations:
{ai_tips}
{'=' * 50}
"""

if __name__ == "__main__":
    mcp.run(
        transport="streamable-http",
        host="0.0.0.0",
        port=8000,
        stateless_http=False,
    )

Let me highlight the key points:

Elicitation: Interactively Collecting User Input

travel_server.py
dest_result = await ctx.elicit(
    message="Where would you like to go?\nOptions: Kyoto, Okinawa, Hokkaido, Fukuoka",
    response_type=str,
)
if dest_result.action != "accept":
    return "Plan creation canceled."

Calling ctx.elicit() requests input from the client. response_type can be str, int, Enum, or a list (choices), and the client's response is returned as an ElicitResult.

If the action is not "accept", it's treated as a cancellation.
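As a side note, the MCP elicitation spec defines three response actions (accept / decline / cancel). Here's a minimal, framework-free sketch of branching on them; the `ElicitOutcome` class is a hypothetical stand-in for the result object fastmcp returns, not a real API:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class ElicitOutcome:
    """Hypothetical stand-in for an elicitation result (illustration only)."""
    action: str        # "accept" | "decline" | "cancel" per the MCP spec
    data: Any = None   # the typed value when action == "accept"

def handle_elicit_result(result: ElicitOutcome) -> Any:
    # Only "accept" carries user input; both "decline" and "cancel"
    # are treated as cancellation, as in travel_server.py.
    if result.action == "accept":
        return result.data
    return None

print(handle_elicit_result(ElicitOutcome("accept", "Kyoto")))  # Kyoto
print(handle_elicit_result(ElicitOutcome("cancel")))           # None
```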

Progress Notifications: Visualizing Progress

travel_server.py
total_steps = 5
for step in range(1, total_steps + 1):
    await ctx.report_progress(progress=step, total=total_steps)
    await asyncio.sleep(0.4)

ctx.report_progress() notifies the current progress and total number of steps.
It's expected that the client will convert this into a UI element such as a progress bar.
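Incidentally, right after the progress loop, Phase 2 of the full code estimates the cost, assuming two travelers per room and rounding up with integer division. A quick standalone check of that arithmetic (the `estimate_cost` helper name is mine, not from the sample):

```python
def estimate_cost(transport_per_person: int, hotel_per_room_night: int,
                  travelers: int, nights: int) -> tuple[int, int, int]:
    # Two travelers share a room; an odd traveler out gets a room alone,
    # hence the round-up via (travelers + 1) // 2.
    rooms = (travelers + 1) // 2
    transport = transport_per_person * travelers
    hotel = hotel_per_room_night * nights * rooms
    return transport, hotel, transport + hotel

# Kyoto rates, 1 traveler, 2 nights -- the same case as the sample run later:
print(estimate_cost(14000, 12000, 1, 2))  # (14000, 24000, 38000)
```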

Sampling: Requesting Text Generation from LLM

travel_server.py
response = await ctx.sample(
    messages=(
        f"Please give me 3 recommendations for {trip_type} to {dest['name']} "
        f"({travelers} people, {days} nights) in 60 characters or less. Please answer in Japanese."
    ),
    max_tokens=200,
)

Using ctx.sample(), the server can request text generation from an LLM available to the client.
In this setup, the MCP server never calls an LLM directly; the client performs the inference (here via a Bedrock call it prepares) and returns the result.

Server Startup Configuration

travel_server.py
mcp.run(
    transport="streamable-http",
    host="0.0.0.0",
    port=8000,
    stateless_http=False,  # This is important
)

According to the documentation, to use Elicitation / Sampling / Progress Notifications, stateless_http=False must be set. This makes it stateful.

https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/mcp-stateful-features.html

Adding logger.info() to each phase allows you to track the flow of phases in CloudWatch Logs when deployed to AgentCore Runtime. HTTP-level logs alone wouldn't show which MCP features are running, so I added these to observe the behavior.

Test Client Implementation

For Stateful MCP, handlers need to be implemented on the client side as well.
We'll prepare handlers for Elicitation, Sampling, and Progress.

test_client.py (full code)
test_client.py
"""
Domestic Travel Planner - Test Client
Implementation of Elicitation / Sampling / Progress handlers

Environment variables:
  LOCAL_TEST=true (default) : Connect to local server
  LOCAL_TEST=false          : Connect to AgentCore Runtime (AGENT_ARN, BEARER_TOKEN required)
  USE_BEDROCK=true          : Use Amazon Bedrock for Sampling
  BEDROCK_MODEL_ID          : Bedrock model ID (default: us.anthropic.claude-haiku-4-5-20251001-v1:0)
"""
import asyncio
import os
import sys
import typing
from fastmcp import Client
from fastmcp.client.transports import StreamableHttpTransport
from fastmcp.client.elicitation import ElicitResult
from mcp.types import CreateMessageResult, TextContent

# --- Elicitation Handler ---

def _extract_options(response_type) -> list[str] | None:
    """Extract options list from response_type.
    fastmcp converts ["Book", "Cancel"] to Literal["Book", "Cancel"] before passing it.
    """
    if isinstance(response_type, list):
        return response_type
    if typing.get_origin(response_type) is typing.Literal:
        args = typing.get_args(response_type)
        if args:
            return list(args)
    return None

def _prompt_choice(options: list[str]) -> str:
    """Select by number. Retry on invalid input."""
    for i, opt in enumerate(options, 1):
        print(f"    {i}. {opt}")
    while True:
        raw = input("    Select a number: ").strip()
        try:
            idx = int(raw)
            if 1 <= idx <= len(options):
                return options[idx - 1]
        except ValueError:
            pass
        print(f"    ※ Enter a number between 1 and {len(options)}")

def _prompt_int(label: str = "Answer (number)") -> int:
    """Prompt for an integer. Retry on invalid input."""
    while True:
        raw = input(f"    {label}: ").strip()
        try:
            return int(raw)
        except ValueError:
            print("    ※ Please enter a number")

async def elicit_handler(message, response_type, params, ctx):
    print(f"\n>>> Question from server: {message}")
    try:
        options = _extract_options(response_type)
        if options:
            response = _prompt_choice(options)
        elif response_type is int:
            response = _prompt_int()
        else:
            response = input("    Answer: ").strip()
        return ElicitResult(action="accept", content={"value": response})
    except (KeyboardInterrupt, EOFError):
        print("\n    (Canceled)")
        return ElicitResult(action="decline", content=None)

# --- Sampling Handler ---

def _extract_prompt_text(messages) -> str:
    """Extract prompt text from a list of SamplingMessage"""
    if isinstance(messages, str):
        return messages
    if isinstance(messages, list):
        parts = []
        for msg in messages:
            if hasattr(msg, "content") and hasattr(msg.content, "text"):
                parts.append(msg.content.text)
            else:
                parts.append(str(msg))
        return "\n".join(parts)
    return str(messages)

def _invoke_bedrock(prompt: str) -> str:
    """Run inference using Amazon Bedrock"""
    import boto3
    model_id = os.getenv("BEDROCK_MODEL_ID", "us.anthropic.claude-haiku-4-5-20251001-v1:0")
    client = boto3.client("bedrock-runtime")
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 200},
    )
    return response["output"]["message"]["content"][0]["text"]

async def sampling_handler(messages, params, ctx):
    prompt_text = _extract_prompt_text(messages)
    print(f"\n>>> AI Sampling Request")
    print(f"    Prompt: {prompt_text[:120]}...")

    use_bedrock = os.getenv("USE_BEDROCK", "false").lower() == "true"
    try:
        if use_bedrock:
            print("    (Inferring with Bedrock...)")
            ai_response = _invoke_bedrock(prompt_text)
            print(f"    Bedrock response: {ai_response}")
        else:
            ai_response = input("    Enter AI response (Enter for default): ").strip()
            if not ai_response:
                ai_response = "1. Visit popular spots early 2. Enjoy local cuisine 3. Check out local-recommended hidden gems"
    except Exception as e:
        print(f"    ※ Sampling error: {e}")
        ai_response = "Could not get recommendations."

    return CreateMessageResult(
        role="assistant",
        content=TextContent(type="text", text=ai_response),
        model="bedrock" if use_bedrock else "manual",
        stopReason="endTurn",
    )

# --- Progress Handler ---

async def progress_handler(progress, total, message):
    pct = int((progress / total) * 100) if total else 0
    bar = "#" * (pct // 5) + "-" * (20 - pct // 5)
    print(f"\r    Progress: [{bar}] {pct}%", end="", flush=True)
    if progress == total:
        print(" Complete!")

# --- Main ---

async def main():
    local_test = os.getenv("LOCAL_TEST", "true").lower() == "true"

    if local_test:
        url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost:8000/mcp"
        headers = {}
    else:
        agent_arn = os.getenv("AGENT_ARN")
        token = os.getenv("BEARER_TOKEN")
        if not agent_arn or not token:
            print("ERROR: AGENT_ARN / BEARER_TOKEN not set")
            sys.exit(1)
        encoded_arn = agent_arn.replace(":", "%3A").replace("/", "%2F")
        endpoint = os.getenv(
            "MCP_ENDPOINT",
            f"https://bedrock-agentcore.{os.getenv('AWS_REGION', 'us-west-2')}.amazonaws.com",
        )
        url = f"{endpoint}/runtimes/{encoded_arn}/invocations?qualifier=DEFAULT"
        headers = {"Authorization": f"Bearer {token}"}

    transport = StreamableHttpTransport(url=url, headers=headers)
    client = Client(
        transport,
        elicitation_handler=elicit_handler,
        sampling_handler=sampling_handler,
        progress_handler=progress_handler,
    )

    try:
        await client.__aenter__()
    except Exception as e:
        print(f"\nERROR: Failed to connect to server: {e}")
        sys.exit(1)

    try:
        print("\nTesting plan_trip tool...")
        print("(Running the full flow: Elicitation → Progress → Sampling)\n")
        result = await client.call_tool("plan_trip", {})
        print("\n" + "=" * 60)
        print("Result:")
        print("=" * 60)
        print(result.content[0].text)
    except KeyboardInterrupt:
        print("\n\nInterrupted")
    except Exception as e:
        print(f"\nERROR: {e}")
    finally:
        try:
            await client.__aexit__(None, None, None)
        except Exception:
            pass

if __name__ == "__main__":
    asyncio.run(main())

On the client side, we implemented three handlers.

The elicit_handler processes user input in response to questions from the server. FastMCP converts a list-like response_type=["Book", "Cancel"] into a Literal type before passing it to the handler, so we use typing.get_origin() to detect this and turn it back into a numbered selection:

test_client.py
def _extract_options(response_type) -> list[str] | None:
    if isinstance(response_type, list):
        return response_type
    if typing.get_origin(response_type) is typing.Literal:
        args = typing.get_args(response_type)
        if args:
            return list(args)
    return None

For Sampling, messages is passed as a list of SamplingMessage objects, so _extract_prompt_text pulls out each .content.text. When USE_BEDROCK=true, the handler then sends the prompt to Amazon Bedrock via the Converse API:

test_client.py
def _invoke_bedrock(prompt: str) -> str:
    import boto3
    model_id = os.getenv("BEDROCK_MODEL_ID", "us.anthropic.claude-haiku-4-5-20251001-v1:0")
    client = boto3.client("bedrock-runtime")
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 200},
    )
    return response["output"]["message"]["content"][0]["text"]

The progress_handler is a simple implementation that displays a progress bar:

test_client.py
async def progress_handler(progress, total, message):
    pct = int((progress / total) * 100) if total else 0
    bar = "#" * (pct // 5) + "-" * (20 - pct // 5)
    print(f"\r    Progress: [{bar}] {pct}%", end="", flush=True)
    if progress == total:
        print(" Complete!")

Deploying to AgentCore Runtime

Now that we've completed the implementation, let's deploy it to AgentCore Runtime. The deployment procedure is also described in the official documentation.

https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/runtime-mcp.html

Creating a Cognito User Pool

JWT tokens are required for authentication with AgentCore Runtime. The starter toolkit provides a setup-cognito command for Cognito setup, so we'll use this.
I also learned about this command for the first time.

Cognito Setup
agentcore identity setup-cognito

Upon completion, credential information is saved to .agentcore_identity_user.env.

Configuration

Using the Cognito information, we configure AgentCore Runtime with authentication.

Configuration Command
agentcore configure \
  -e travel_server.py \
  -p MCP \
  -n japan_travel_planner \
  -ac '{"customJWTAuthorizer": {"allowedClients": ["<client_id>"], "discoveryUrl": "<discovery_url>"}}'

Options used:
  • -e : entry-point Python file
  • -p MCP : use the MCP protocol
  • -n : agent name
  • -ac : OAuth authorizer config (Cognito client ID and OIDC discovery URL)

Use the values for <client_id> and <discovery_url> from the .agentcore_identity_user.env file generated by setup-cognito.

The configuration proceeds in an interactive format, and while default settings are generally fine, note that you should select container deployment for this case.

Deployment

Deployment Command
agentcore deploy

This automatically handles ECR repository creation, the Docker image build and push, and AgentCore Runtime creation. When deployment completes, the Agent ARN is printed; make a note of it.

Obtaining a Bearer Token

The Starter toolkit also provides a command to obtain a Bearer token for accessing the deployed server.

Token Retrieval
# Load .agentcore_identity_user.env into environment variables
export $(grep -v '^#' .agentcore_identity_user.env | xargs)

# Get access token from Cognito
export BEARER_TOKEN=$(agentcore identity get-cognito-inbound-token)

You can get a token simply by loading the environment variables file saved by setup-cognito and passing it to get-cognito-inbound-token. This was also new to me, but it's convenient not having to write custom Python scripts.

Operation Check

Let's run a test client against the deployed MCP server. Adding USE_BEDROCK=true will generate Sampling responses using Amazon Bedrock.

Execution Command
export AGENT_ARN='arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/japan_travel_planner'
USE_BEDROCK=true LOCAL_TEST=false uv run python test_client.py
Execution Results
[1] Testing plan_trip tool...
    (Running the entire flow: Elicitation → Progress → Sampling)

>>> Question from server: Where would you like to go?
Options: Kyoto, Okinawa, Hokkaido, Fukuoka
    Answer: Kyoto

>>> Question from server: What type of trip?
1. Solo travel
2. Couple
3. Family trip
4. Friend trip
    Answer: Solo travel

>>> Question from server: How many nights? (1-7 nights)
    Answer: 2

>>> Question from server: How many people?
    Answer: 1
    Progress: [####################] 100% Complete!

>>> AI sampling request
    Prompt: Please recommend three things for a solo trip to Kyoto (1 person, 2 nights) in under 60 characters. Please answer in Japanese...
    (Inferencing with Bedrock...)
    Bedrock response: # 3 recommendations for a solo trip to Kyoto

1. **Fushimi Inari Taisha**
The spectacular Senbon Torii gates. Visit early in the morning, when few people are around, to soak in the quiet, mystical atmosphere.

2. **Philosopher's Path (Tetsugaku-no-michi)**
A walking path beautiful with cherry blossoms and fresh greenery. Lined with cafes, it is ideal for strolling at your own pace on a solo trip.

3. **Around Kiyomizu-dera**
The old capital's scenery in miniature. After your visit, snack your way through the temple-gate town or freely

>>> Question from server:
========== Travel Plan Summary ==========
Destination: Kyoto
Type: Solo travel
Schedule: 2 nights, 3 days
People: 1 person

Estimated cost:
  Transportation: ¥14,000
  Accommodation: ¥24,000 (1 room × 2 nights)
  Total: ¥38,000

Would you like to confirm this reservation?
    Answer: Book it

============================================================
Result:
============================================================

==================================================
Reservation confirmed!
==================================================
Reservation number: TRV-09A5D4B1

Destination: Kyoto
Schedule: 2 nights, 3 days / 1 person
Type: Solo travel

Transportation: ¥14,000
Accommodation: ¥24,000 (1 room × 2 nights)
Total: ¥38,000

Recommended spots:
  - Fushimi Inari Shrine
  - Arashiyama Bamboo Grove
  - Kiyomizu Temple

Local cuisine:
  - Yudofu (Tofu hot pot)
  - Matcha sweets
  - Nishin soba

AI recommendations:
# 3 recommendations for a solo trip to Kyoto

1. **Fushimi Inari Taisha**
The spectacular Senbon Torii gates. Visit early in the morning, when few people are around, to soak in the quiet, mystical atmosphere.

2. **Philosopher's Path (Tetsugaku-no-michi)**
A walking path beautiful with cherry blossoms and fresh greenery. Lined with cafes, it is ideal for strolling at your own pace on a solo trip.

3. **Around Kiyomizu-dera**
The old capital's scenery in miniature. After your visit, snack your way through the temple-gate town or freely
==================================================

We've confirmed the flow from Elicitation → Progress → Sampling → Final Confirmation on AgentCore Runtime!
The session was maintained statefully, so the exchange continued without losing state throughout the interaction!

After execution, I sometimes saw a log message saying Session termination failed: 404.
According to MCP's Session Management specifications, a 404 may be returned when a session ends or expires.

I believe this end-of-session log falls into that category, but at least within the scope of our test, it didn't affect the tool invocation itself.

Checking CloudWatch Logs

On AgentCore Runtime, the server's standard output is sent to CloudWatch Logs. Looking at the logs we added in travel_server.py, we can see the flow of each MCP phase.

CloudWatch Logs (Application log excerpts)
05:03:49 [INFO] [Phase 1] Destination: Kyoto
05:03:54 [INFO] [Phase 1] Trip type: Solo travel
05:04:01 [INFO] [Phase 1] Elicitation complete - Kyoto/Solo travel/2 nights/1 person
05:04:01 [INFO] [Phase 2] Progress Notifications started - Search processing
05:04:03 [INFO] [Phase 2] Progress complete - Cost calculation: ¥38,000
05:04:03 [INFO] [Phase 3] Sampling started - Requesting AI inference from client
05:04:08 [INFO] [Phase 3] Sampling complete - Response: AI input...
05:04:08 [INFO] [Phase 4] Final confirmation Elicitation
05:04:13 [INFO] [Phase 4] Reservation confirmed - Kyoto/Solo travel/2 nights/1 person/¥38,000

You can track each phase—Elicitation → Progress → Sampling → Final Confirmation—in chronological order!

About Session Management

In Stateful MCP, the server receives an Mcp-Session-Id during initialization and uses the same session ID for subsequent requests.

While the MCP stateful features documentation explains that the server returns this header during initialization, the AgentCore Runtime guide also notes that the platform manages the Mcp-Session-Id. In our case, we didn't have to write any session-ID handling ourselves; relying on FastMCP's client implementation worked fine.

In stateful mode, the server returns an Mcp-Session-Id header during the initialize call. Clients must include this session ID in subsequent requests to maintain session context. If the server terminates or the session expires, requests may return a 404 error, and clients must re-initialize to obtain a new session ID. For more details, see Session Management in the MCP specification.

Cleanup

After testing, we should remove unnecessary resources. Runtime resources can be cleaned up with agentcore destroy, and if you're using Identity, agentcore identity cleanup will take care of Cognito-related resources too.

I was already surprised by the Cognito-related commands, but I was shocked to discover there's even a command for cleaning up resources.

Resource Deletion
agentcore identity cleanup
agentcore destroy --agent japan_travel_planner --force

Conclusion

We've successfully confirmed that MCP specification features like Elicitation, Sampling, and Progress Notifications work on AgentCore Runtime.

I found Elicitation particularly interesting because it allows you to ask users questions during tool execution, enabling you to build workflows that gather information step by step through conversation. I'd like to implement this in combination with AI agents and UIs.

This update was more about understanding the MCP specification itself rather than AgentCore's implementation, which made it challenging but educational! There are still some aspects I don't fully understand, so I want to continue exploring and learning!

I hope this article was helpful. Thank you for reading until the end!
