[Update] Amazon Bedrock AgentCore Runtime now supports Stateful MCP servers

神野雄大

2026.03.12

This page has been translated by machine translation. View original

Introduction

Hello, I'm Jinno from the Consulting Department, and I love the supermarket La Mu.

In a recent update, Amazon Bedrock AgentCore Runtime now supports Stateful MCP servers!

At first I wondered what kind of update this was, but the MCP specification defines Elicitation (interactive input collection), Sampling (text generation requests to LLMs), and Progress Notifications, and AgentCore now supports these features. I see...

Digging deeper into each feature, the following can be achieved.
It seems like the stateful nature enables more interactive actions.

Feature	Overview
Elicitation	Server interactively requests user input from the client
Sampling	Server requests text generation from the client-side LLM
Progress Notifications	Real-time notification of long-running process progress to client

Using these features, I imagined a flow like this for flight booking, assuming the use of Elicitation, Sampling, and Progress Notifications.

With the flow above, it seems the MCP server can focus on business logic while delegating user interaction and LLM inference to the client side.
Previously, we might have had to build dialogue UIs with choices and session management individually, but using this mechanism could simplify server-side implementation.

I'd like to test how well it actually works by writing some code.

For this example, I'll make the client a simple program rather than an AI agent, and I'll build on the official Stateful MCP server features sample, replacing destinations and data for domestic travel in Japan!

Prerequisites

For this example, I used Python with uv as the package manager. The versions used are:

Python 3.13.11
uv 0.9.26

The library versions used are:

Library	Version
fastmcp	3.1.0
mcp	1.26.0
boto3	1.42.65
bedrock-agentcore-starter-toolkit	0.3.2

Initialize the project with uv and install the necessary libraries.

Setup

# Project initialization
uv init --no-readme
uv add fastmcp mcp boto3

# Install Starter Toolkit
uv add bedrock-agentcore-starter-toolkit

Implementation

MCP Server Implementation

First, let's implement a domestic travel planner MCP Server with FastMCP. We'll implement tools using Elicitation / Sampling / Progress Notifications features.

travel_server.py (full code)

travel_server.py

"""
Domestic Travel Planner - Stateful MCP Server
Demonstration of Elicitation / Sampling / Progress Notifications
"""
import asyncio
import logging
from fastmcp import FastMCP, Context
from enum import Enum

logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(levelname)s] %(message)s")
logger = logging.getLogger("travel-planner")

mcp = FastMCP("Japan-Travel-Planner")

class TripType(str, Enum):
    SOLO = "ひとり旅"
    COUPLE = "カップル"
    FAMILY = "家族旅行"
    FRIENDS = "友人旅行"

DESTINATIONS = {
    "京都": {
        "name": "京都",
        "transport": 14000,
        "hotel": 12000,
        "highlights": ["伏見稲荷大社", "嵐山竹林", "清水寺"],
        "gourmet": ["湯豆腐", "抹茶スイーツ", "にしんそば"],
    },
    "沖縄": {
        "name": "沖縄",
        "transport": 40000,
        "hotel": 10000,
        "highlights": ["美ら海水族館", "古宇利島", "首里城"],
        "gourmet": ["ソーキそば", "タコライス", "海ぶどう"],
    },
    "北海道": {
        "name": "北海道",
        "transport": 35000,
        "hotel": 9000,
        "highlights": ["富良野ラベンダー畑", "小樽運河", "旭山動物園"],
        "gourmet": ["海鮮丼", "ジンギスカン", "スープカレー"],
    },
    "福岡": {
        "name": "福岡",
        "transport": 22000,
        "hotel": 8000,
        "highlights": ["太宰府天満宮", "中洲屋台", "糸島"],
        "gourmet": ["博多ラーメン", "もつ鍋", "明太子"],
    },
}

@mcp.tool()
async def plan_trip(ctx: Context) -> str:
    """
    Create domestic travel plan (using all MCP features):
    1. Elicitation - gather travel preferences
    2. Progress - search progress for transportation & accommodation
    3. Sampling - generate AI recommendations
    """
    # ---- Phase 1: Elicitation ----
    logger.info("[Phase 1] Elicitation start - asking about destination")
    dest_result = await ctx.elicit(
        message="Where would you like to go?\nOptions: Kyoto, Okinawa, Hokkaido, Fukuoka",
        response_type=str,
    )
    if dest_result.action != "accept":
        logger.info("[Phase 1] User canceled (destination)")
        return "Plan creation canceled."
    dest_key = dest_result.data.strip()
    dest = DESTINATIONS.get(dest_key, DESTINATIONS["京都"])
    logger.info(f"[Phase 1] Destination: {dest['name']}")

    type_result = await ctx.elicit(
        message="What type of trip?\n1. Solo\n2. Couple\n3. Family\n4. Friends",
        response_type=TripType,
    )
    if type_result.action != "accept":
        logger.info("[Phase 1] User canceled (trip type)")
        return "Plan creation canceled."
    trip_type = type_result.data.value if hasattr(type_result.data, "value") else type_result.data
    logger.info(f"[Phase 1] Trip type: {trip_type}")

    days_result = await ctx.elicit(
        message="How many nights? (1-7 nights)",
        response_type=int,
    )
    if days_result.action != "accept":
        return "Plan creation canceled."
    days = max(1, min(7, days_result.data))

    travelers_result = await ctx.elicit(
        message="How many people?",
        response_type=int,
    )
    if travelers_result.action != "accept":
        return "Plan creation canceled."
    travelers = travelers_result.data
    logger.info(f"[Phase 1] Elicitation complete - {dest['name']}/{trip_type}/{days} nights/{travelers} people")

    # ---- Phase 2: Progress Notifications ----
    logger.info("[Phase 2] Progress Notifications start - search process")
    total_steps = 5
    for step in range(1, total_steps + 1):
        await ctx.report_progress(progress=step, total=total_steps)
        await asyncio.sleep(0.4)

    transport_cost = dest["transport"] * travelers
    hotel_cost = dest["hotel"] * days * ((travelers + 1) // 2)
    total_cost = transport_cost + hotel_cost
    logger.info(f"[Phase 2] Progress complete - cost calculation: ¥{total_cost:,}")

    # ---- Phase 3: Sampling ----
    logger.info("[Phase 3] Sampling start - requesting AI inference from client")
    ai_tips = f"Enjoy {dest['name']}!"
    try:
        response = await ctx.sample(
            messages=(
                f"Please give me 3 recommendations for a {trip_type} trip to {dest['name']} "
                f"({travelers} people, {days} nights) in 60 characters or less each. Please answer in Japanese."
            ),
            max_tokens=200,
        )
        if hasattr(response, "text") and response.text:
            ai_tips = response.text
        logger.info(f"[Phase 3] Sampling complete - response: {ai_tips[:50]}...")
    except Exception as e:
        logger.warning(f"[Phase 3] Sampling failed: {e}")
        ai_tips = f"{dest['highlights'][0]} is a must-see. Don't miss {dest['gourmet'][0]} too!"

    # ---- Final Confirmation ----
    logger.info("[Phase 4] Final confirmation Elicitation")
    confirm = await ctx.elicit(
        message=f"""
========== Travel Plan Summary ==========
Destination: {dest['name']}
Type: {trip_type}
Schedule: {days} nights {days + 1} days
People: {travelers}

Estimated cost:
  Transportation: ¥{transport_cost:,}
  Accommodation: ¥{hotel_cost:,} ({(travelers + 1) // 2} rooms × {days} nights)
  Total: ¥{total_cost:,}

Would you like to book this plan?""",
        response_type=["Book", "Cancel"],
    )
    if confirm.action != "accept" or confirm.data == "Cancel":
        logger.info("[Phase 4] User canceled")
        return "Booking canceled. Search results will be stored for 24 hours."

    logger.info(f"[Phase 4] Booking confirmed - {dest['name']}/{trip_type}/{days} nights/{travelers} people/¥{total_cost:,}")
    highlights_str = "\n".join(f"  - {h}" for h in dest["highlights"])
    gourmet_str = "\n".join(f"  - {g}" for g in dest["gourmet"])
    return f"""
{'=' * 50}
Booking Confirmed!
{'=' * 50}
Reservation number: TRV-{ctx.session_id[:8].upper()}

Destination: {dest['name']}
Schedule: {days} nights {days + 1} days / {travelers} people
Type: {trip_type}

Transportation: ¥{transport_cost:,}
Accommodation: ¥{hotel_cost:,} ({(travelers + 1) // 2} rooms × {days} nights)
Total: ¥{total_cost:,}

Recommended spots:
{highlights_str}

Local cuisine:
{gourmet_str}

AI recommendations:
{ai_tips}
{'=' * 50}
"""

if __name__ == "__main__":
    mcp.run(
        transport="streamable-http",
        host="0.0.0.0",
        port=8000,
        stateless_http=False,
    )

Let me highlight the key points:

Elicitation: Interactive user input collection

travel_server.py

dest_result = await ctx.elicit(
    message="Where would you like to go?\nOptions: Kyoto, Okinawa, Hokkaido, Fukuoka",
    response_type=str,
)
if dest_result.action != "accept":
    return "Plan creation canceled."

Calling ctx.elicit() requests input from the client. You can specify response_type as str, int, Enum, or a list (options), and the client's response is returned as an ElicitResult.

If the action is not "accept", it's treated as a cancellation.

Progress Notifications: Visualizing progress

travel_server.py

total_steps = 5
for step in range(1, total_steps + 1):
    await ctx.report_progress(progress=step, total=total_steps)
    await asyncio.sleep(0.4)

ctx.report_progress() notifies the current progress and total number of steps.
The client is expected to convert this into a UI element like a progress bar.

Sampling: Requesting text generation from LLM

travel_server.py

response = await ctx.sample(
    messages=(
        f"Please give me 3 recommendations for a {trip_type} trip to {dest['name']} "
        f"({travelers} people, {days} nights) in 60 characters or less each. Please answer in Japanese."
    ),
    max_tokens=200,
)

Using ctx.sample() allows the server to request text generation from an LLM available to the client.
In this setup, rather than the MCP server directly calling an LLM, it's using the Bedrock inference set up on the client side.

Server startup configuration

travel_server.py

mcp.run(
    transport="streamable-http",
    host="0.0.0.0",
    port=8000,
    stateless_http=False,  # This is important
)

According to the documentation, to use Elicitation / Sampling / Progress Notifications, you need to set stateless_http=False. This enables stateful operation.

Adding logger.info() to each phase allows you to track the flow of phases in CloudWatch Logs when deployed to AgentCore Runtime. This is useful because HTTP-level logs alone don't show which MCP features are being used.

Test Client Implementation

For Stateful MCP, the client side also needs handler implementations.
We need to prepare handlers for Elicitation, Sampling, and Progress.

test_client.py (full code)

test_client.py

"""
Domestic Travel Planner - Test Client
Implementation of Elicitation / Sampling / Progress handlers

Environment variables:
  LOCAL_TEST=true (default) : Connect to local server
  LOCAL_TEST=false          : Connect to AgentCore Runtime (AGENT_ARN, BEARER_TOKEN required)
  USE_BEDROCK=true          : Use Amazon Bedrock for Sampling
  BEDROCK_MODEL_ID          : Bedrock model ID (default: us.anthropic.claude-haiku-4-5-20251001-v1:0)
"""
import asyncio
import os
import sys
import typing
from fastmcp import Client
from fastmcp.client.transports import StreamableHttpTransport
from fastmcp.client.elicitation import ElicitResult
from mcp.types import CreateMessageResult, TextContent

# --- Elicitation handler ---

def _extract_options(response_type) -> list[str] | None:
    """Extract option list from response_type.
    fastmcp converts ["Book", "Cancel"] to Literal["Book", "Cancel"] when passing.
    """
    if isinstance(response_type, list):
        return response_type
    if typing.get_origin(response_type) is typing.Literal:
        args = typing.get_args(response_type)
        if args:
            return list(args)
    return None

def _prompt_choice(options: list[str]) -> str:
    """Select by number. Retry on invalid input."""
    for i, opt in enumerate(options, 1):
        print(f"    {i}. {opt}")
    while True:
        raw = input("    Select number: ").strip()
        try:
            idx = int(raw)
            if 1 <= idx <= len(options):
                return options[idx - 1]
        except ValueError:
            pass
        print(f"    ※ Enter a number between 1-{len(options)}")

def _prompt_int(label: str = "Answer (number)") -> int:
    """Input an integer. Retry on invalid input."""
    while True:
        raw = input(f"    {label}: ").strip()
        try:
            return int(raw)
        except ValueError:
            print("    ※ Please enter a number")

async def elicit_handler(message, response_type, params, ctx):
    print(f"\n>>> Question from server: {message}")
    try:
        options = _extract_options(response_type)
        if options:
            response = _prompt_choice(options)
        elif response_type is int:
            response = _prompt_int()
        else:
            response = input("    Answer: ").strip()
        return ElicitResult(action="accept", content={"value": response})
    except (KeyboardInterrupt, EOFError):
        print("\n    (Canceled)")
        return ElicitResult(action="decline", content=None)

# --- Sampling handler ---

def _extract_prompt_text(messages) -> str:
    """Extract prompt text from a list of SamplingMessage objects"""
    if isinstance(messages, str):
        return messages
    if isinstance(messages, list):
        parts = []
        for msg in messages:
            if hasattr(msg, "content") and hasattr(msg.content, "text"):
                parts.append(msg.content.text)
            else:
                parts.append(str(msg))
        return "\n".join(parts)
    return str(messages)

def _invoke_bedrock(prompt: str) -> str:
    """Run inference with Amazon Bedrock"""
    import boto3
    model_id = os.getenv("BEDROCK_MODEL_ID", "us.anthropic.claude-haiku-4-5-20251001-v1:0")
    client = boto3.client("bedrock-runtime")
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 200},
    )
    return response["output"]["message"]["content"][0]["text"]

async def sampling_handler(messages, params, ctx):
    prompt_text = _extract_prompt_text(messages)
    print(f"\n>>> AI sampling request")
    print(f"    Prompt: {prompt_text[:120]}...")

    use_bedrock = os.getenv("USE_BEDROCK", "false").lower() == "true"
    try:
        if use_bedrock:
            print("    (Inferring with Bedrock...)")
            ai_response = _invoke_bedrock(prompt_text)
            print(f"    Bedrock response: {ai_response}")
        else:
            ai_response = input("    Enter AI response (Enter for default): ").strip()
            if not ai_response:
                ai_response = "1. Visit popular spots early 2. Enjoy local cuisine 3. Check out local hidden gems"
    except Exception as e:
        print(f"    ※ Sampling error: {e}")
        ai_response = "Could not retrieve recommendations."

    return CreateMessageResult(
        role="assistant",
        content=TextContent(type="text", text=ai_response),
        model="bedrock" if use_bedrock else "manual",
        stopReason="endTurn",
    )

# --- Progress handler ---

async def progress_handler(progress, total, message):
    pct = int((progress / total) * 100) if total else 0
    bar = "#" * (pct // 5) + "-" * (20 - pct // 5)
    print(f"\r    Progress: [{bar}] {pct}%", end="", flush=True)
    if progress == total:
        print(" Complete!")

# --- Main ---

async def main():
    local_test = os.getenv("LOCAL_TEST", "true").lower() == "true"

    if local_test:
        url = sys.argv[1] if len(sys.argv) > 1 else "http://localhost:8000/mcp"
        headers = {}
    else:
        agent_arn = os.getenv("AGENT_ARN")
        token = os.getenv("BEARER_TOKEN")
        if not agent_arn or not token:
            print("ERROR: AGENT_ARN / BEARER_TOKEN not set")
            sys.exit(1)
        encoded_arn = agent_arn.replace(":", "%3A").replace("/", "%2F")
        endpoint = os.getenv(
            "MCP_ENDPOINT",
            f"https://bedrock-agentcore.{os.getenv('AWS_REGION', 'us-west-2')}.amazonaws.com",
        )
        url = f"{endpoint}/runtimes/{encoded_arn}/invocations?qualifier=DEFAULT"
        headers = {"Authorization": f"Bearer {token}"}

    transport = StreamableHttpTransport(url=url, headers=headers)
    client = Client(
        transport,
        elicitation_handler=elicit_handler,
        sampling_handler=sampling_handler,
        progress_handler=progress_handler,
    )

    try:
        await client.__aenter__()
    except Exception as e:
        print(f"\nERROR: Failed to connect to server: {e}")
        sys.exit(1)

    try:
        print("\nTesting plan_trip tool...")
        print("(Running full flow: Elicitation → Progress → Sampling)\n")
        result = await client.call_tool("plan_trip", {})
        print("\n" + "=" * 60)
        print("Result:")
        print("=" * 60)
        print(result.content[0].text)
    except KeyboardInterrupt:
        print("\n\nInterrupted")
    except Exception as e:
        print(f"\nERROR: {e}")
    finally:
        try:
            await client.__aexit__(None, None, None)
        except Exception:
            pass

if __name__ == "__main__":
    asyncio.run(main())

On the client side, we've implemented three handlers.

elicit_handler processes user input in response to questions from the server. Fastmcp converts lists like response_type=["Book", "Cancel"] to Literal types, so we detect this using typing.get_origin() and convert it to numbered options.

test_client.py

def _extract_options(response_type) -> list[str] | None:
    if isinstance(response_type, list):
        return response_type
    if typing.get_origin(response_type) is typing.Literal:
        args = typing.get_args(response_type)
        if args:
            return list(args)
    return None

messages is passed as a list of SamplingMessage objects, so we extract .content.text for use.

test_client.py

def _invoke_bedrock(prompt: str) -> str:
    import boto3
    model_id = os.getenv("BEDROCK_MODEL_ID", "us.anthropic.claude-haiku-4-5-20251001-v1:0")
    client = boto3.client("bedrock-runtime")
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        inferenceConfig={"maxTokens": 200},
    )
    return response["output"]["message"]["content"][0]["text"]

progress_handler is a simple implementation that displays a progress bar.

test_client.py

async def progress_handler(progress, total, message):
    pct = int((progress / total) * 100) if total else 0
    bar = "#" * (pct // 5) + "-" * (20 - pct // 5)
    print(f"\r    Progress: [{bar}] {pct}%", end="", flush=True)
    if progress == total:
        print(" Complete!")

Deployment to AgentCore Runtime

Now that we have our implementation, let's deploy it to AgentCore Runtime. The deployment procedure is also described in the official documentation.

Creating a Cognito User Pool

JWT tokens are required for authentication to AgentCore Runtime. The starter toolkit provides a setup-cognito command for Cognito setup, so we'll use that.
I didn't know this command existed until now.

Cognito Setup

agentcore identity setup-cognito

When completed, credentials are saved to .agentcore_identity_user.env.

Configuration

Using the Cognito information, we configure AgentCore Runtime with authentication.

Configuration Command

agentcore configure \
  -e travel_server.py \
  -p MCP \
  -n japan_travel_planner \
  -ac '{"customJWTAuthorizer": {"allowedClients": ["<client_id>"], "discoveryUrl": "<discovery_url>"}}'

Option	Description
`-e`	Python file entry point
`-p MCP`	Specify MCP protocol
`-n`	Agent name
`-ac`	OAuth auth config (Cognito client ID and OIDC discovery URL)

For <client_id> and <discovery_url>, use the values from .agentcore_identity_user.env generated by setup-cognito.

The configuration proceeds interactively, and default settings are generally fine, but note that we should select container deployment for this example.

Deployment

Deploy Command

agentcore deploy

This automatically creates an ECR repository, builds and pushes the Docker image, and creates the AgentCore Runtime. When deployment is complete, the Agent ARN is output, which we should save.

Obtaining a Bearer Token

We can also get a Bearer token to access the deployed server using the Starter toolkit command.

Get Token

# Load environment variables from .agentcore_identity_user.env
export $(grep -v '^#' .agentcore_identity_user.env | xargs)

# Get access token from Cognito
export BEARER_TOKEN=$(agentcore identity get-cognito-inbound-token)

We just read in the environment variables file saved by setup-cognito and pass it to get-cognito-inbound-token to get the token. I didn't know about this command either, but it's convenient not having to write a Python script to do this.

Testing the Operation

Let's run a test client against the deployed MCP server. Adding USE_BEDROCK=true generates Sampling responses using Amazon Bedrock.

Execution Command

export AGENT_ARN='arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/japan_travel_planner'
USE_BEDROCK=true LOCAL_TEST=false uv run python test_client.py

Execution Result

[1] Testing plan_trip tool...
    (Running the entire flow: Elicitation → Progress → Sampling)

>>> Question from server: Where would you like to go?
Options: Kyoto, Okinawa, Hokkaido, Fukuoka
    Answer: Kyoto

>>> Question from server: What type of trip?
1. Solo travel
2. Couple
3. Family trip
4. Friends trip
    Answer: Solo travel

>>> Question from server: How many nights? (1-7 nights)
    Answer: 2

>>> Question from server: How many people?
    Answer: 1
    Progress: [####################] 100% Complete!

>>> AI sampling request
    Prompt: Please recommend 3 things for a solo trip to Kyoto (1 person, 2 nights) in under 60 characters. Please answer in Japanese...
    (Inferring with Bedrock...)
    Bedrock response: # 京都ひとり旅のおすすめ3つ

1. **伏見稲荷大社**
千本鳥居の壮観さ。朝早く訪れると人が少なく、静寂に包まれた神秘的な雰囲気を満喫できます。

2. **哲学の道**
桜や新緑が美しい散歩道。カフェも多く、ゆったり自分のペースで散策するひとり旅に最適です。

3. **清水寺周辺**
古都の景観が凝縮。参拝後、門前町で食べ歩きしたり、寺社仏閣めぐりを自由

>>> Question from server:
========== Travel Plan Summary ==========
Destination: Kyoto
Type: Solo travel
Schedule: 2 nights, 3 days
Number of people: 1

Estimated cost:
  Transportation: ¥14,000
  Accommodation: ¥24,000 (1 room × 2 nights)
  Total: ¥38,000

Would you like to book this plan?
    Answer: Book it

============================================================
Result:
============================================================

==================================================
Booking confirmed!
==================================================
Booking number: TRV-09A5D4B1

Destination: Kyoto
Schedule: 2 nights, 3 days / 1 person
Type: Solo travel

Transportation: ¥14,000
Accommodation: ¥24,000 (1 room × 2 nights)
Total: ¥38,000

Recommended spots:
  - Fushimi Inari Shrine
  - Arashiyama Bamboo Grove
  - Kiyomizu Temple

Local cuisine:
  - Yudofu (Tofu hot pot)
  - Matcha sweets
  - Nishin soba

AI recommendations:
# 京都ひとり旅のおすすめ3つ

1. **伏見稲荷大社**
千本鳥居の壮観さ。朝早く訪れると人が少なく、静寂に包まれた神秘的な雰囲気を満喫できます。

2. **哲学の道**
桜や新緑が美しい散歩道。カフェも多く、ゆったり自分のペースで散策するひとり旅に最適です。

3. **清水寺周辺**
古都の景観が凝縮。参拿後、門前町で食べ歩きしたり、寺社仏閣めぐりを自由
==================================================

On the AgentCore Runtime as well, I confirmed the Elicitation → Progress → Sampling → Final confirmation flow!
The session is maintained statefully, so I verified that the state wasn't lost in the middle, and the entire exchange continued seamlessly!

After execution, there was sometimes a log message saying Session termination failed: 404.
According to the MCP Session Management specification, a 404 may be returned when a session ends or expires.

I suspect this end-of-session log is related to that, but at least within the scope of this test, it didn't affect the tool invocation itself.

Checking CloudWatch Logs

On AgentCore Runtime, the server's standard output is sent to CloudWatch Logs. Extracting the logs embedded in travel_server.py, we can trace the flow of each MCP phase.

CloudWatch Logs (Application Log Excerpt)

05:03:49 [INFO] [Phase 1] Destination: Kyoto
05:03:54 [INFO] [Phase 1] Travel type: Solo travel
05:04:01 [INFO] [Phase 1] Elicitation completed - Kyoto/Solo travel/2 nights/1 person
05:04:01 [INFO] [Phase 2] Progress Notifications started - Search processing
05:04:03 [INFO] [Phase 2] Progress completed - Cost calculation: ¥38,000
05:04:03 [INFO] [Phase 3] Sampling started - Requesting AI inference from client
05:04:08 [INFO] [Phase 3] Sampling completed - Response: AI input...
05:04:08 [INFO] [Phase 4] Final confirmation Elicitation
05:04:13 [INFO] [Phase 4] Booking confirmed - Kyoto/Solo travel/2 nights/1 person/¥38,000

You can see the Elicitation → Progress → Sampling → Final confirmation phases in chronological order!

About Session Management

In Stateful MCP, sessions are managed using the Mcp-Session-Id header. The server returns a session ID during initialization, and the client includes this ID in subsequent requests to maintain the session.

In this case, I didn't need to write session ID management code myself, as the FastMCP client implementation handled it perfectly.

Cleanup

After testing, let's delete unnecessary resources. The Runtime side can be cleaned up with agentcore destroy, and if you're using Identity, agentcore identity cleanup will take care of Cognito-related resources as well.

I was already surprised by the Cognito-related commands, but I was even more surprised to find there's a cleanup command too.

Resource Deletion

agentcore identity cleanup
agentcore destroy --agent japan_travel_planner --force

Conclusion

I was able to confirm that the MCP specification features—Elicitation, Sampling, and Progress Notifications—actually work on the AgentCore Runtime.

I found Elicitation interesting because it allows you to ask users questions during tool execution, enabling workflows that gather information incrementally through dialogue. I'd like to implement this combined with AI agents and UI.

This update is more about the MCP specification itself rather than AgentCore's structure, which makes it somewhat challenging but educational... There are still some areas I don't fully understand, so I want to explore more to gain a better understanding!

I hope this article was helpful. Thank you for reading to the end!

[Update] Amazon Bedrock AgentCore Runtime now supports Stateful MCP servers

Introduction

Prerequisites

Implementation

MCP Server Implementation

Elicitation: Interactive user input collection

Progress Notifications: Visualizing progress

Sampling: Requesting text generation from LLM

Server startup configuration

Test Client Implementation

Deployment to AgentCore Runtime

Creating a Cognito User Pool

Configuration

Deployment

Obtaining a Bearer Token

Testing the Operation

Checking CloudWatch Logs

About Session Management

Cleanup

Conclusion

AWS Topics

Trending Topics

Products & Services

Features and Series