I tried operating AgentCore Browser with an AI agent
## Introduction
Hello, I'm Jinno from the Consulting Department who supports the supermarket La Mu.
I like the sweat suit set available at La Mu for 1,000 yen as it offers good value for money.
Today I'd like to introduce Amazon Bedrock AgentCore Observability! It's a metrics collection feature.
When I first heard the name, my gut reaction was: observability...? That sounds difficult...? So in this article, I'd like to unpack and explore it!
## AgentCore Observability
AgentCore Observability is a feature that helps trace, debug, and monitor agent performance.
It's a feature that's useful when you want to see behavior while developing and operating agents.
Looking at the official documentation illustration, it appears as follows:
It collects information from AgentCore Runtime, Memory, and Gateway, converts it to OTEL (OpenTelemetry) logs, and makes it visible.
The main telemetry is divided into three categories: metrics, structured logs, and spans and traces. Each has its own characteristics, so let's look at them in order!
First, metrics represent basic indicators such as call counts, latency, duration, token usage, session counts, throttling, user errors, and system error aggregations.
Since they're provided as CloudWatch metrics, I think these are relatively easy to understand items. They're useful for understanding agent performance and usage.
Next, structured logs are JSON-format logs that capture event intake, long-term memory extraction, integration procedures, and various operations. They record detailed operation logs of agents in JSON format compliant with OpenTelemetry standards. They contain detailed information about what decisions the agent made, which tools it used, what prompts it processed, etc., so I think they're very helpful for debugging.
Finally, spans and traces record the complete execution path from agent invocation to response, with spans representing individual operation units within that path. They have a hierarchical structure, so you can visually confirm how much time each process took.
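To make the span hierarchy concrete, here is a minimal sketch that rebuilds a trace tree from flat span records and prints each span's duration. The `Span` fields, the span names, and the timings are all hypothetical and are not the actual AgentCore span schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def render(spans):
    """Return one indented line per span, children ordered by start time."""
    children = defaultdict(list)
    for span in sorted(spans, key=lambda s: s.start_ms):
        children[span.parent_id].append(span)

    lines = []

    def walk(span, depth):
        lines.append(f"{'  ' * depth}{span.name} ({span.duration_ms} ms)")
        for child in children.get(span.span_id, []):
            walk(child, depth + 1)

    for root in children.get(None, []):
        walk(root, 0)
    return lines


# Hypothetical spans for one trace (names and timings are made up).
SAMPLE_SPANS = [
    Span("a", None, "invoke_agent", 0, 1200),
    Span("c", "a", "execute_tool identify_weak_areas", 450, 700),
    Span("b", "a", "chat (model call 1)", 10, 400),
    Span("d", "a", "chat (model call 2)", 720, 1190),
]

if __name__ == "__main__":
    print("\n".join(render(SAMPLE_SPANS)))
```

The indentation mirrors the parent-child relationship, which is roughly what the hierarchical trace view conveys visually.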
In AgentCore Observability, a session is the unit that bundles traces; you can think of it as one entire conversation. It looks like this:
### Dashboard Integration
This is a built-in dashboard. It's a convenient feature that can be viewed from CloudWatch.
It's provided as a feature called GenAI Observability.
### Enabling Transaction Search
To allow span ingestion, enable Transaction Search once per account.
Click the Enable Transaction Search button below to activate it.
### Required Dependencies
When collecting OTEL logs with AgentCore Runtime, you need to include `aws-opentelemetry-distro` in your dependencies. However, when deploying with the AgentCore starter toolkit, `opentelemetry-instrument` is automatically enabled, allowing visualization.
It looks like the image below.
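If you manage dependencies yourself, a quick check like the following can confirm that the distribution is present in the current environment. This is only a sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    name = "aws-opentelemetry-distro"
    version = installed_version(name)
    if version is None:
        print(f"{name} is not installed; add it to your dependencies")
    else:
        print(f"{name} {version} is installed")
```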
### Log Groups
When using Runtime, two types of log streams are output. They look like this:
- runtime-logs
  - OpenTelemetry standard logs
- YYYY/MM/DD/[runtime-logs]xxxx
  - Application logs
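When browsing a log group, it can help to tell the two stream types apart programmatically. The sketch below assumes the stream names follow exactly the two patterns listed above, with the bracketed part read as `[runtime-logs]`; the function name and the returned labels are hypothetical.

```python
import re

# Date-prefixed application log streams, e.g. "2025/09/02/[runtime-logs]abcd".
APP_STREAM = re.compile(r"^\d{4}/\d{2}/\d{2}/\[runtime-logs\].+$")


def classify_stream(stream_name: str) -> str:
    """Label a log stream as OTEL logs, application logs, or unknown."""
    if stream_name == "runtime-logs":
        return "otel"
    if APP_STREAM.match(stream_name):
        return "application"
    return "unknown"


if __name__ == "__main__":
    for name in ["runtime-logs", "2025/09/02/[runtime-logs]abcd", "other"]:
        print(name, "->", classify_stream(name))
```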
There are so many features...!!
This time I'd like to focus on examining the GenAI Observability dashboard!
## Preparation
### Agent to be used this time
I'll be checking what we can see with agents we've created before.
I'll test with an agent that uses the Memory feature.
https://github.com/yuu551/tech-learning-assistant/tree/main

## Try it out
Let's set the `runtime-session-id` in `chat.py` to the same value for every call, so that the invocations are grouped into one session.
runtime_session_id = "sample-id-xxxx"
Let's ask the agent a question right away.
python chat.py "Tell me about S3 bucket security" --user "test_user" --session "session_test"
"Practical advice:\n- Regularly check AWS official documentation\n- Carefully configure security groups and bucket policies\n- Always keep the principle of least privilege in mind\n\nDo you have any questions or would you like to know more about specific implementations?"
python chat.py "I'm not good with security permissions... What specific settings should I make?" --user "test_user" --session "session_test"
"I'll explain the key points of S3 bucket security settings specifically:\n\n1. Bucket access settings\n- Block public access\n - Enable the \"Block all public access\" option\n - Prohibit all public access by default\n\n2. Bucket policy settings\n- Apply the principle of least privilege\n ```json\n {\n \"Version\": \"2012-10-17\",\n \"Statement\": [\n {\n \"Sid\": \"RestrictBucketAccess\",\n \"Effect\": \"Deny\",\n \"Principal\": \"*\",\n \"Action\": \"s3:*\",\n \"Resource\": [\n \"arn:aws:s3:::your-bucket-name\",\n \"arn:aws:s3:::your-bucket-name/*\"\n ],\n \"Condition\": {\n \"NotIpAddress\": {\n \"aws:SourceIp\": \"allowed IP address/range\"\n }\n }\n }\n ]\n }\n ```\n\n3. IAM user/role permission management\n- Allow only specifically needed operations\n- Separate read/write permissions\n- Regular permission reviews\n\n4. Enable encryption\n- Enable server-side encryption\n - SSE-S3 (AWS managed keys)\n - SSE-KMS (Customer managed keys)\n\n5. Versioning and tracking\n- Enable versioning\n- Record access logs with CloudTrail\n\nAs practical advice, I recommend the following:\n- Refer to AWS official documentation\n- Configure security groups step by step\n- Be mindful of the principle of least privilege\n\nDo you have any concerns about specific implementations or settings? Let's check them carefully together."
Let's also ask a question that uses tools. This is the weakness analysis tool.
python chat.py "Analyze my weaknesses" --session "session_test" --user "test_user"
"From the analysis, I can see you have difficulty with security permission settings in cloud computing. Focusing on S3 bucket security, here's specific advice:\n\n1. IAM (Identity and Access Management) basics\n- Set bucket access rights using the \"principle of least privilege\"\n- Grant only minimum necessary permissions to users and roles\n\n2. Key points for bucket policy settings\n- Fine-grained control of read, write, delete permissions\n- Access restrictions from specific IP addresses\n- Enabling encryption\n\n3. Specific configuration example\n```json\n{\n \"Version\": \"2012-10-17\",\n \"Statement\": [\n {\n \"Effect\": \"Allow\",\n \"Principal\": {\"AWS\": \"arn:aws:iam::xxxxxxxxxxxx:user/username\"},\n \"Action\": [\n \"s3:GetObject\",\n \"s3:PutObject\"\n ],\n \"Resource\": \"arn:aws:s3:::your-bucket-name/*\"\n }\n ]\n}\n```\n\nI recommend learning step by step. Start with one setting and gradually deepen your understanding.\n\nPlease let me know if you have specific questions or points you'd like to know more about. We can learn together step by step."
Now let's take a look at the CloudWatch dashboard.
Click GenAI Observability in the left navigation pane, then click the "Bedrock AgentCore" tab.

A pretty cool dashboard has appeared.
There are three Views (Agents view, Sessions view, and Traces view), allowing you to check metrics and behavior from each perspective.
### Agents view
You can check the behavior of deployed Agents.

It's designed to show at a glance the number of Sessions and Traces, and whether there are any errors or throttling issues.
You can click on the endpoints from the agent list, so if you want to see details for a specific agent, click on the link.

This shows more detailed information for each agent.

Also, by clicking on the Sessions tab and Traces tab, you can see the list of Sessions and Traces within this agent.
I think this is the flow when looking at details by agent.
I'll explain Sessions and Traces later.
### Sessions view
You can check the list of Sessions for the agent.
I can see the Session ID with the random name I created earlier. Let's click on it.

When you click, you can see a summary of this Session. You can also see linked Traces.
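The idea of a session summary can be sketched as a simple grouping of traces by session ID. The record fields (`trace_id`, `session_id`, `duration_ms`, `error`) and the values below are invented for illustration and do not reflect how the dashboard computes its figures.

```python
from collections import defaultdict

# Hypothetical trace records; the field names are assumptions for this sketch.
TRACES = [
    {"trace_id": "t1", "session_id": "sample-id-xxxx", "duration_ms": 3200, "error": False},
    {"trace_id": "t2", "session_id": "sample-id-xxxx", "duration_ms": 5400, "error": False},
    {"trace_id": "t3", "session_id": "sample-id-xxxx", "duration_ms": 4100, "error": True},
    {"trace_id": "t4", "session_id": "another-session", "duration_ms": 900, "error": False},
]


def summarize_sessions(traces):
    """Group traces by session ID and compute simple per-session totals."""
    grouped = defaultdict(list)
    for trace in traces:
        grouped[trace["session_id"]].append(trace)
    return {
        session_id: {
            "traces": len(items),
            "errors": sum(1 for t in items if t["error"]),
            "total_ms": sum(t["duration_ms"] for t in items),
        }
        for session_id, items in grouped.items()
    }


if __name__ == "__main__":
    for session_id, stats in summarize_sessions(TRACES).items():
        print(session_id, stats)
```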

Clicking further allows you to see even more details.

### Traces view
Here you can view all traces in a list. Click on the latest trace.

When you click on it, a graphical dashboard appears. It visualizes the Spans within the Traces.

**Trajectory**

The flow view is easy to understand. This makes it clear how the agent starts up, launches tools, and so on.
You can also see the `identify_weak_areas` tool that was executed for the weakness analysis.
Looking at the Spans below, you can easily see what kind of AI actions were taken.

They are arranged in chronological order. You can see the logs by opening one of the Events.
Let's look at Event 1.

```json
{
"resource": {
"attributes": {
"deployment.environment.name": "bedrock-agentcore:default",
"aws.local.service": "my-agent.DEFAULT",
"service.name": "my-agent.DEFAULT",
"cloud.region": "us-west-2",
"aws.log.stream.names": "runtime-logs",
"telemetry.sdk.name": "opentelemetry",
"aws.service.type": "gen_ai_agent",
"telemetry.sdk.language": "python",
"cloud.provider": "aws",
"cloud.resource_id": "arn:aws:bedrock-agentcore:us-west-2:xxxxxxxxxxxx:runtime/my-agent-xxxx/runtime-endpoint/DEFAULT:DEFAULT",
"aws.log.group.names": "/aws/bedrock-agentcore/runtimes/my-agent-xxxx-DEFAULT",
"telemetry.sdk.version": "1.33.1",
"cloud.platform": "aws_bedrock_agentcore",
"telemetry.auto.version": "0.11.0-aws"
}
},
"scope": {
"name": "opentelemetry.instrumentation.botocore.bedrock-runtime",
"schemaUrl": "https://opentelemetry.io/schemas/1.30.0"
},
"timeUnixNano": 1756785750184271639,
"observedTimeUnixNano": 1756785750184290707,
"severityNumber": 9,
"severityText": "",
"body": {
"content": [
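{
"text": "You are an excellent technical learning assistant.\n You support engineers' technical learning, record their level of understanding, and suggest effective learning methods.\n \n The following tools are available:\n - analyze_learning_progress: Analyze learning progress (a specific technical field can also be specified)\n - identify_weak_areas: Identify weak areas\n - suggest_review_topics: Suggest topics to review\n - get_session_summary: Get a summary of the learning session\n \n Please keep the following in mind:\n - Explain technical questions with concrete examples\n - Proceed while checking the level of understanding\n - Once weak areas are identified, suggest learning methods suited to them\n - Provide encouragement and constructive feedback\n - Use tools as needed to understand the learning status\n "
}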
]
},
"attributes": {
"event.name": "gen_ai.system.message",
"gen_ai.system": "aws.bedrock"
},
"flags": 1,
"traceId": "68b66c556c14723d4f7168399c6e91af",
"spanId": "ad1a0ebc825b849e"
}
```
Event 1 was a log showing the system prompt being transmitted to the AI.
Let's also look at Event 2.
```json
{
"resource": {
"attributes": {
"deployment.environment.name": "bedrock-agentcore:default",
"aws.local.service": "my-agent.DEFAULT",
"service.name": "my-agent.DEFAULT",
"cloud.region": "us-west-2",
"aws.log.stream.names": "runtime-logs",
"telemetry.sdk.name": "opentelemetry",
"aws.service.type": "gen_ai_agent",
"telemetry.sdk.language": "python",
"cloud.provider": "aws",
"cloud.resource_id": "arn:aws:bedrock-agentcore:us-west-2:xxxxxxxxxxxx:runtime/my-agent-xxxx/runtime-endpoint/DEFAULT:DEFAULT",
"aws.log.group.names": "/aws/bedrock-agentcore/runtimes/my-agent-xxxx-DEFAULT",
"telemetry.sdk.version": "1.33.1",
"cloud.platform": "aws_bedrock_agentcore",
"telemetry.auto.version": "0.11.0-aws"
}
},
"scope": {
"name": "opentelemetry.instrumentation.botocore.bedrock-runtime",
"schemaUrl": "https://opentelemetry.io/schemas/1.30.0"
},
"timeUnixNano": 1756779796818481971,
"observedTimeUnixNano": 1756779796818488400,
"severityNumber": 9,
"severityText": "",
"body": {
"content": [
{
"text": "Tell me about S3 bucket security"
}
]
},
"attributes": {
"event.name": "gen_ai.user.message",
"gen_ai.system": "aws.bedrock"
},
"flags": 1,
"traceId": "68b6551285d1b1a532aa55b39fa596eb",
"spanId": "a02614dcd6a51732"
}
```
It's followed by a specific question. That's easy to understand. This includes the history of past questions.
Event 6 contained a question about analyzing weaknesses.
```json
{
"resource": {
"attributes": {
"deployment.environment.name": "bedrock-agentcore:default",
"aws.local.service": "my-agent.DEFAULT",
"service.name": "my-agent.DEFAULT",
"cloud.region": "us-west-2",
"aws.log.stream.names": "runtime-logs",
"telemetry.sdk.name": "opentelemetry",
"aws.service.type": "gen_ai_agent",
"telemetry.sdk.language": "python",
"cloud.provider": "aws",
"cloud.resource_id": "arn:aws:bedrock-agentcore:us-west-2:xxxxxxxxxxxx:runtime/my-agent-xxxx/runtime-endpoint/DEFAULT:DEFAULT",
"aws.log.group.names": "/aws/bedrock-agentcore/runtimes/my-agent-xxxx-DEFAULT",
"telemetry.sdk.version": "1.33.1",
"cloud.platform": "aws_bedrock_agentcore",
"telemetry.auto.version": "0.11.0-aws"
}
},
"scope": {
"name": "opentelemetry.instrumentation.botocore.bedrock-runtime",
"schemaUrl": "https://opentelemetry.io/schemas/1.30.0"
},
"timeUnixNano": 1756779796818529300,
"observedTimeUnixNano": 1756779796818534000,
"severityNumber": 9,
"severityText": "",
"body": {
"content": [
{
"text": "Analyze my weaknesses"
}
]
},
"attributes": {
"event.name": "gen_ai.user.message",
"gen_ai.system": "aws.bedrock"
},
"flags": 1,
"traceId": "68b6551285d1b1a532aa55b39fa596eb",
"spanId": "a02614dcd6a51732"
}
```
By tracking this, you can visually investigate what behaviors were performed in the flow, which makes analysis easier.
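If you export events like the ones above, the same chronological reading can be done in code. The following is a rough sketch that relies only on fields visible in the sample events (`timeUnixNano`, `attributes["event.name"]`, and `body.content[].text`); the `transcript` helper and the trimmed-down sample records are assumptions for illustration.

```python
import json

# Trimmed copies of the events shown above; only the fields used here are kept.
SAMPLE_LOGS = """
[
  {"timeUnixNano": 1756779796818529300,
   "attributes": {"event.name": "gen_ai.user.message"},
   "body": {"content": [{"text": "Analyze my weaknesses"}]}},
  {"timeUnixNano": 1756779796818481971,
   "attributes": {"event.name": "gen_ai.user.message"},
   "body": {"content": [{"text": "Tell me about S3 bucket security"}]}}
]
"""


def transcript(records):
    """Return (role, text) pairs ordered by event timestamp."""
    rows = []
    for rec in sorted(records, key=lambda r: r["timeUnixNano"]):
        # "gen_ai.user.message" -> "user", "gen_ai.system.message" -> "system"
        role = rec["attributes"]["event.name"].split(".")[1]
        text = " ".join(part["text"] for part in rec["body"]["content"])
        rows.append((role, text))
    return rows


if __name__ == "__main__":
    for role, text in transcript(json.loads(SAMPLE_LOGS)):
        print(f"[{role}] {text}")
```

Sorting by `timeUnixNano` reproduces the order in which the dashboard lists the events within a span.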
## Conclusion
We've mainly looked at the dashboard feature of Amazon Bedrock AgentCore Observability!
It's convenient that you only need to enable Transaction Search with a button and deploy with the built-in instrumentation: the agent is then traced according to OpenTelemetry standards, and you can visually confirm its behavior.
There are many aspects that can be observed, and I'd like to introduce how to use them in actual operations in the future!
I hope this article was helpful to you. Thank you very much for reading until the end!!