[Update] Trying Out Strictly Consistent Metadata in AgentCore Memory's Long-Term Memory

I tried out the Strictly Consistent metadata recently added to AgentCore Memory. This new feature lets the application directly specify the values it should control, without going through LLM inference.
2026.06.16

Introduction

Hello, I'm Jinno from the Consulting Division, and I'm a huge fan of supermarkets.

Previously, I wrote an article about using custom metadata filters with AgentCore Memory's long-term memory.

https://dev.classmethod.jp/articles/agentcore-long-memory-metadata/

In that article, I showed how to extract metadata automatically from the conversation content of events using LLM extraction (LLM_INFERRED). However, LLM-based extraction is inherently non-deterministic, which makes it a poor fit for values that must be exact.

This is where today's update comes in: Strictly Consistent metadata has been added to AgentCore Memory.

https://aws.amazon.com/jp/about-aws/whats-new/2026/05/agentcore-memory-scmetadata/

This is a feature that allows the application side to directly specify metadata values in short-term memory, and those exact values are reflected in long-term memory records without going through an LLM. The What's New announcement lists department-scoped search, compliance boundaries, and multi-tenant memory implementation as use cases.

This time, while touching on the differences from the previous article, let's actually try out STRICTLY_CONSISTENT metadata!

Prerequisites

Item | Version / Value
Python | 3.13
boto3 | 1.43.29
botocore | 1.43.29
AWS Region | us-east-1
Dependency management | uv

Create a project with uv and install boto3.

Setup
uv init
uv add boto3

All subsequent code is executed with uv run.

STRICTLY_CONSISTENT Metadata

Differences from LLM_INFERRED

LLM_INFERRED, introduced in the previous article, is a mechanism in which the LLM analyzes the conversation content of events to extract metadata values. For example, it infers destination = kyoto from a conversation like "I want to travel to Kyoto."

With the newly added STRICTLY_CONSISTENT, the application directly specifies values via the metadata parameter when creating short-term memory events, and those values are reflected as-is in long-term memory records even after going through the extraction and consolidation process. Since no LLM inference is involved, the values set in short-term memory are guaranteed to reach long-term memory unchanged.

Item | LLM_INFERRED | STRICTLY_CONSISTENT
Value determination | LLM infers from conversation content | Application specifies directly
Determinism | Non-deterministic (depends on LLM interpretation) | Deterministic (specified value is reflected as-is)
Setting limit | Within metadata schema limit (max 20 entries) | Max 3 keys
Supported types | STRING / STRINGLIST / NUMBER | STRING only
Consolidation behavior | Semantically similar records may be consolidated | Records with different values have their consolidation targets separated
Use cases | Content classification / topic extraction | Department scope / compliance boundaries / multi-tenant memory

Let me add a brief note about "Consolidation behavior" in the table. AgentCore Memory's long-term memory has a consolidation process that merges multiple records into one. For example, if a user had conversations across different sessions saying "the invoice amount is wrong" and "I want to change the billing payment method," these might be consolidated into one record because they are semantically close.

When STRICTLY_CONSISTENT is set, consolidation happens separately for each metadata value. Events that share department=sales are extracted and consolidated together, but a department=sales record and a department=engineering record are treated as separate groups and are not consolidated with each other, even if their content is similar.
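As a purely conceptual illustration of this grouping, the idea can be modeled as partitioning candidate records by their STRICTLY_CONSISTENT value before any merging is considered. This is not how the service is implemented; the simplified record shape and the group_for_consolidation name are made up for this sketch.

```python
from collections import defaultdict

# Hypothetical, simplified records: only the fields needed for the illustration.
records = [
    {"text": "The invoice amount is wrong", "department": "sales"},
    {"text": "I want to change the billing payment method", "department": "sales"},
    {"text": "The invoice export API returns an error", "department": "engineering"},
]


def group_for_consolidation(records, key="department"):
    """Partition records by a strictly consistent key.

    Records in different partitions are never merge candidates for each
    other, however similar their text is.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record.get(key)].append(record["text"])
    return dict(groups)


groups = group_for_consolidation(records)
for department, texts in groups.items():
    print(department, texts)
```

Here the two sales records land in one group and may be merged, while the engineering record stays in its own group even though it also mentions invoices.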

While the What's New announcement lists multi-tenant memory and compliance boundaries as use cases, the best practices in the official documentation say "don't rely on metadata alone for tenant isolation." The sensible design seems to be to use namespaces for tenant isolation and to use metadata for filtering within them. I'd like to dig deeper into this.
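One way to picture that split is a helper that always derives the namespace from the tenant and only then applies the department filter. The following is a minimal sketch under assumptions: the tenants/{tenant_id}/... namespace layout and the build_scoped_search name are hypothetical, and the request shape simply mirrors the retrieve_memory_records calls used later in this article.

```python
def build_scoped_search(memory_id, tenant_id, actor_id, department, query, top_k=10):
    """Build keyword arguments for retrieve_memory_records.

    Separation between tenants comes from the namespace; the department
    metadata only narrows results inside that tenant's namespace.
    """
    if "/" in tenant_id or "/" in actor_id:
        # Reject values that could spill into another namespace segment.
        raise ValueError("tenant_id and actor_id must not contain '/'")
    return {
        "memoryId": memory_id,
        # Hypothetical namespace layout, chosen for this sketch only.
        "namespace": f"tenants/{tenant_id}/support/{actor_id}/facts",
        "searchCriteria": {
            "searchQuery": query,
            "topK": top_k,
            "metadataFilters": [
                {
                    "left": {"metadataKey": "department"},
                    "operator": "EQUALS_TO",
                    "right": {"metadataValue": {"stringValue": department}},
                }
            ],
        },
    }


params = build_scoped_search(
    "YOUR_MEMORY_ID", "tenant-a", "customer-001", "sales", "invoice"
)
print(params["namespace"])
# With a boto3 "bedrock-agentcore" client: client.retrieve_memory_records(**params)
```

Because the namespace is computed inside the helper, a caller cannot forget the tenant boundary and rely on the metadata filter alone.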

Supported Strategies

STRICTLY_CONSISTENT is available with the following strategies.

  • Semantic Memory Strategy
  • User Preference Memory Strategy
  • Episodic Memory Strategy

Note that Summarization Strategy is not supported.

Let's Try It

Creating Memory

This time, I'll set department as STRICTLY_CONSISTENT and topic (the topic category) as LLM_INFERRED, configuring both extraction types in a single Memory. Assuming a customer support scenario where the same customer contacts multiple departments, I'll verify that filtering by department works within the same namespace.

create_memory.py
import boto3

control = boto3.client("bedrock-agentcore-control", region_name="us-east-1")

response = control.create_memory(
    name="support_memory_sc",
    description="Multi-department support memory with strictly consistent metadata",
    eventExpiryDuration=30,
    memoryStrategies=[
        {
            "semanticMemoryStrategy": {
                "name": "semantic_strategy",
                "namespaceTemplates": ["support/{actorId}/facts"],
                "memoryRecordSchema": {
                    "metadataSchema": [
                        {
                            "key": "department",
                            "type": "STRING",
                            "extractionType": "STRICTLY_CONSISTENT",
                        },
                        {
                            "key": "topic",
                            "type": "STRING",
                            "extractionType": "LLM_INFERRED",
                            "extractionConfig": {
                                "llmExtractionConfig": {
                                    "definition": "The support topic category discussed in the conversation",
                                    "llmExtractionInstruction": "LATEST_VALUE",
                                    "validation": {
                                        "stringValidation": {
                                            "allowedValues": [
                                                "billing",
                                                "technical",
                                                "account",
                                                "general",
                                            ]
                                        }
                                    },
                                }
                            },
                        },
                    ]
                },
            }
        }
    ],
    indexedKeys=[
        {"key": "department", "type": "STRING"},
        {"key": "topic", "type": "STRING"},
    ],
)

memory_id = response["memoryId"]
print(f"Memory ID: {memory_id}")

For the department key, STRICTLY_CONSISTENT is specified for extractionType. No extractionConfig is needed, and the type is limited to STRING. On the other hand, the topic key uses LLM_INFERRED as before, with definition and allowedValues configured.

In the previous article, even values that the application already knows for certain, like department, were extracted by the LLM, so this is a major difference. You can now clearly separate values that the application should control from values you want the LLM to infer.

Submitting Events

Let's submit events with different metadata values per department in a scenario where the same customer (actorId=customer-001) contacts different departments.

create_events.py
import boto3
from datetime import datetime, timezone

client = boto3.client("bedrock-agentcore", region_name="us-east-1")

memory_id = "YOUR_MEMORY_ID"

def create_support_event(session_id, department, messages):
    payload = []
    for role, text in messages:
        payload.append(
            {"conversational": {"role": role, "content": {"text": text}}}
        )

    response = client.create_event(
        memoryId=memory_id,
        actorId="customer-001",
        sessionId=session_id,
        eventTimestamp=datetime.now(timezone.utc),
        payload=payload,
        metadata={
            "department": {"stringValue": department},
        },
    )
    return response

# Inquiry to sales department (billing-related)
create_support_event(
    session_id="session-001",
    department="sales",
    messages=[
        ("USER", "It seems the amount on last month's invoice is incorrect. Could you check it?"),
        ("ASSISTANT", "Certainly. Could you provide me with the invoice number?"),
        ("USER", "It's INV-2026-0542. It should be 500,000 yen but it shows 550,000 yen."),
        ("ASSISTANT", "I've confirmed it. The extra 50,000 yen is the previous month's unpaid balance that was added."),
    ],
)

# Inquiry to engineering department (technical-related)
create_support_event(
    session_id="session-002",
    department="engineering",
    messages=[
        ("USER", "The API response in the production environment has suddenly become slow."),
        ("ASSISTANT", "Which endpoint is experiencing the delay?"),
        ("USER", "The /api/v2/reports endpoint, which normally takes 200ms, is now taking over 5 seconds."),
        ("ASSISTANT", "The database connection pool may be exhausted. Please check the connection count."),
    ],
)

# Another inquiry to sales department (account-related)
create_support_event(
    session_id="session-003",
    department="sales",
    messages=[
        ("USER", "I'd like to create a new account. Could you explain the procedure?"),
        ("ASSISTANT", "Please select 'Create New Account' from the admin panel and fill in the required information. You'll receive an email after approval."),
    ],
)

print("Events created successfully!")

The department key in the metadata parameter is passed with a value directly via stringValue. This value is reflected as-is in long-term memory records as STRICTLY_CONSISTENT, without going through LLM inference.

topic, on the other hand, is inferred automatically by the LLM from the conversation content: billing for the invoice conversation, technical for the API latency conversation, and account for the account creation conversation.
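Since the department value is stored exactly as sent, a typo or a casing difference would be stored exactly as sent too. A minimal sketch of checking the value on the application side before calling create_event is shown below; ALLOWED_DEPARTMENTS and build_department_metadata are hypothetical names for this example and are not part of the AgentCore API.

```python
# Hypothetical application-side list of valid departments.
ALLOWED_DEPARTMENTS = {"sales", "engineering", "hr"}


def build_department_metadata(department):
    """Return the metadata dict for create_event, or raise on an unknown value."""
    normalized = department.strip().lower()
    if normalized not in ALLOWED_DEPARTMENTS:
        raise ValueError(f"unknown department: {department!r}")
    return {"department": {"stringValue": normalized}}


print(build_department_metadata(" Sales "))
# Pass the result as metadata=... in client.create_event(...)
```

Normalizing once at the point of writing keeps later EQUALS_TO filters from missing records because of notation differences.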

Checking Long-Term Memory Records

After submitting events, the extraction job runs after a while and long-term memory records are generated. Let's check all records within the same namespace.

list_records.py
import boto3

client = boto3.client("bedrock-agentcore", region_name="us-east-1")

memory_id = "YOUR_MEMORY_ID"

response = client.list_memory_records(
    memoryId=memory_id,
    namespace="support/customer-001/facts",
)

for record in response.get("memoryRecordSummaries", []):
    metadata = record.get("metadata", {})
    dept = metadata.get("department", {}).get("stringValue", "N/A")
    topic = metadata.get("topic", {}).get("stringValue", "N/A")
    print(f"Content: {record['content']}")
    print(f"department={dept}, topic={topic}")
    print("---")

Execution result
Content: {'text': 'The API response for the /api/v2/reports endpoint in the production environment suddenly slowed down, taking over 5 seconds when it normally takes 200ms.'}
department=engineering, topic=technical
---
Content: {'text': 'The invoice (INV-2026-0542) for May 2026 showed 550,000 yen instead of the expected 500,000 yen, and it was explained that the 50,000 yen difference was the previous month\'s unpaid balance that was added.'}
department=sales, topic=billing
---
Content: {'text': 'The original amount on invoice INV-2026-0542 was understood to be 500,000 yen.'}
department=sales, topic=billing
---
Content: {'text': 'The user wants to create a new account.'}
department=sales, topic=account
---

Records with department=sales and department=engineering coexist within the same namespace! The department reflects the values passed directly by the application, and topic is correctly inferred by the LLM from the conversation content.

In the previous article, even values the application already knows (department names, tenant attributes, and so on) were extracted by the LLM, so there was a risk of values fluctuating or coming out differently than intended. With STRICTLY_CONSISTENT, the values specified by the application are stored as-is, which is reassuring.
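As noted above, records only appear some time after the events are submitted, so a script that lists records immediately may see nothing yet. A minimal polling sketch follows; the wait_for_records helper, the 300-second timeout, and the 15-second interval are assumptions for illustration, not values taken from the documentation.

```python
import time


def wait_for_records(fetch, timeout_seconds=300, interval_seconds=15, sleep=time.sleep):
    """Call fetch() until it returns a non-empty list or the timeout elapses.

    fetch is any zero-argument callable returning a list of records, for
    example a lambda wrapping client.list_memory_records(...).
    """
    waited = 0
    while True:
        records = fetch()
        if records:
            return records
        if waited >= timeout_seconds:
            raise TimeoutError(f"no records after {waited} seconds")
        sleep(interval_seconds)
        waited += interval_seconds


# Usage with the client from list_records.py (not executed here):
# records = wait_for_records(
#     lambda: client.list_memory_records(
#         memoryId=memory_id, namespace="support/customer-001/facts"
#     ).get("memoryRecordSummaries", [])
# )

# Self-contained demonstration with a fake fetch that succeeds on the third call.
calls = {"n": 0}


def fake_fetch():
    calls["n"] += 1
    return ["record"] if calls["n"] >= 3 else []


result = wait_for_records(fake_fetch, sleep=lambda seconds: None)
print(result, calls["n"])
```

Passing the sleep function as a parameter keeps the helper easy to test without actually waiting.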

Searching with Metadata Filters

STRICTLY_CONSISTENT metadata can also be used as a filter. Let's try filtering by department within the same namespace.

retrieve_by_department.py
import boto3

client = boto3.client("bedrock-agentcore", region_name="us-east-1")

memory_id = "YOUR_MEMORY_ID"
namespace = "support/customer-001/facts"


def search_by_department(department):
    """Retrieve records in the shared namespace whose department matches exactly."""
    response = client.retrieve_memory_records(
        memoryId=memory_id,
        namespace=namespace,
        searchCriteria={
            "searchQuery": "inquiry content",
            "topK": 10,
            "metadataFilters": [
                {
                    "left": {"metadataKey": "department"},
                    "operator": "EQUALS_TO",
                    "right": {"metadataValue": {"stringValue": department}},
                }
            ],
        },
    )

    print(f"=== Filter by department={department} ===")
    for record in response.get("memoryRecordSummaries", []):
        metadata = record.get("metadata", {})
        dept = metadata.get("department", {}).get("stringValue", "N/A")
        topic = metadata.get("topic", {}).get("stringValue", "N/A")
        print(f"Score: {record['score']}")
        print(f"Content: {record['content']}")
        print(f"department={dept}, topic={topic}")
        print("---")


# The namespace is the same for both searches; only the department filter changes
search_by_department("sales")
search_by_department("engineering")

Note where metadataFilters goes: for RetrieveMemoryRecords it is placed inside searchCriteria, whereas for ListMemoryRecords it is a top-level parameter.

Even though the same namespace is specified, the records are correctly filtered by the department value. The actual execution result is as follows.

Execution result
=== Filter by department=sales ===
Score: 0.3785945
Content: {'text': 'The original amount on invoice INV-2026-0542 was understood to be 500,000 yen.'}
department=sales, topic=billing
---
Score: 0.3695521
Content: {'text': 'The invoice (INV-2026-0542) for May 2026 showed 550,000 yen instead of the expected 500,000 yen, and it was explained that the 50,000 yen difference was the previous month\'s unpaid balance that was added.'}
department=sales, topic=billing
---
Score: 0.36409447
Content: {'text': 'The user wants to create a new account.'}
department=sales, topic=account
---
=== Filter by department=engineering ===
Score: 0.35761523
Content: {'text': 'The API response for the /api/v2/reports endpoint in the production environment suddenly slowed down, taking over 5 seconds when it normally takes 200ms.'}
department=engineering, topic=technical
---

The 4 records within the same namespace are correctly separated and returned by department value! Filtering by sales returns 3 records with billing and account, and filtering by engineering returns only 1 record with technical.
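To double-check a split like this without reading every record, the records can be tallied by metadata value. The sketch below runs on a hand-written sample shaped like the list_memory_records output shown earlier (content shortened); count_by_metadata is a hypothetical helper name.

```python
from collections import Counter

# Hand-written sample mirroring the four records from the execution result above.
sample_response = {
    "memoryRecordSummaries": [
        {"content": {"text": "API slowdown"},
         "metadata": {"department": {"stringValue": "engineering"},
                      "topic": {"stringValue": "technical"}}},
        {"content": {"text": "Invoice shows 550,000 yen"},
         "metadata": {"department": {"stringValue": "sales"},
                      "topic": {"stringValue": "billing"}}},
        {"content": {"text": "Original invoice amount"},
         "metadata": {"department": {"stringValue": "sales"},
                      "topic": {"stringValue": "billing"}}},
        {"content": {"text": "New account request"},
         "metadata": {"department": {"stringValue": "sales"},
                      "topic": {"stringValue": "account"}}},
    ]
}


def count_by_metadata(response, key):
    """Count records per string value of one metadata key ("N/A" if missing)."""
    counts = Counter()
    for record in response.get("memoryRecordSummaries", []):
        value = record.get("metadata", {}).get(key, {}).get("stringValue", "N/A")
        counts[value] += 1
    return dict(counts)


print(count_by_metadata(sample_response, "department"))
print(count_by_metadata(sample_response, "topic"))
```

The same function can be pointed at a real response to confirm that the per-department counts match what the filtered searches return.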

Combining Both Extraction Types as Filters

You can also combine STRICTLY_CONSISTENT and LLM_INFERRED metadata as filters.

combined_filter.py
import boto3
import json

client = boto3.client("bedrock-agentcore", region_name="us-east-1")

memory_id = "YOUR_MEMORY_ID"

# Search for records in sales department AND billing topic
response = client.retrieve_memory_records(
    memoryId=memory_id,
    namespace="support/customer-001/facts",
    searchCriteria={
        "searchQuery": "inquiry content",
        "topK": 10,
        "metadataFilters": [
            {
                "left": {"metadataKey": "department"},
                "operator": "EQUALS_TO",
                "right": {"metadataValue": {"stringValue": "sales"}},
            },
            {
                "left": {"metadataKey": "topic"},
                "operator": "EQUALS_TO",
                "right": {"metadataValue": {"stringValue": "billing"}},
            },
        ],
    },
)

print("=== Sales department × billing topic ===")
for record in response.get("memoryRecordSummaries", []):
    print(f"Content: {record['content']}")
    print(f"Metadata: {json.dumps(record.get('metadata', {}), ensure_ascii=False)}")
    print("---")

department can be filtered reliably by the value the application specified, and topic can be filtered by the classification the LLM inferred from the conversation content. Being able to combine one axis that is controlled deterministically with another that is left to the LLM is very convenient.
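The filter entries in these scripts all have the same shape, so a small builder can cut down on the repetition. This is a sketch only: equals_filter and build_search_criteria are hypothetical helper names, and the dictionaries they return simply copy the structure used in combined_filter.py.

```python
def equals_filter(key, value):
    """Build one EQUALS_TO metadata filter entry for a string value."""
    return {
        "left": {"metadataKey": key},
        "operator": "EQUALS_TO",
        "right": {"metadataValue": {"stringValue": value}},
    }


def build_search_criteria(query, top_k=10, **string_filters):
    """Build a searchCriteria dict with one EQUALS_TO filter per keyword argument."""
    return {
        "searchQuery": query,
        "topK": top_k,
        "metadataFilters": [
            equals_filter(key, value) for key, value in string_filters.items()
        ],
    }


criteria = build_search_criteria("inquiry content", department="sales", topic="billing")
print([f["left"]["metadataKey"] for f in criteria["metadataFilters"]])
# Pass as searchCriteria=criteria to client.retrieve_memory_records(...)
```

Keyword arguments keep their order, so the filters are emitted in the order they are written at the call site.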

Comparison with the Previous Article

Let me compare the LLM_INFERRED-only approach introduced in the previous article with the approach using STRICTLY_CONSISTENT this time.

Issues from the Previous Article and Today's Solutions

In the previous article, LLM extraction had the following issues.

  1. If allowedValues were too restrictive, even unrelated conversations would be forced to choose from the allowed values
  2. Without validation, notation variations would occur (tokyo / Tokyo / 東京, etc.)
  3. Values the application already knows for certain, like department codes, also had to be left to the LLM

STRICTLY_CONSISTENT addresses these issues, the third one in particular. Since the application controls the values directly, they are no longer at the mercy of the LLM's interpretation.

Guidelines for Choosing

Nature of the value | Recommended extractionType
Department codes, tenant attributes, compliance levels | STRICTLY_CONSISTENT
Region, environment names (prod/stg/dev) | STRICTLY_CONSISTENT
Conversation topics, categories | LLM_INFERRED
User sentiment, satisfaction | LLM_INFERRED

The rule is simple: use STRICTLY_CONSISTENT for values that are finalized on the application side, and use LLM_INFERRED for values that need to be inferred from conversation content.

Configuration Comparison at Memory Creation

In the previous article, every value of this kind, including the equivalent of department, was configured with LLM extraction.

create_memory.py (diff)
  {
      "key": "department",
      "type": "STRING",
-     "extractionConfig": {
-         "llmExtractionConfig": {
-             "definition": "The department this conversation belongs to",
-             "llmExtractionInstruction": "LATEST_VALUE",
-             "validation": {
-                 "stringValidation": {
-                     "allowedValues": ["sales", "engineering", "hr", "unknown"]
-                 }
-             },
-         }
-     },
+     "extractionType": "STRICTLY_CONSISTENT",
  },

By switching to STRICTLY_CONSISTENT, extractionConfig is no longer needed, and the configuration becomes simpler. There's no need to ask the LLM to "infer the department," and instead you pass the value directly via the metadata parameter when submitting events.

Constraints

STRICTLY_CONSISTENT has several constraints.

Constraint | Details
Maximum keys per strategy | 3
Supported types | STRING only
Declaration in indexedKeys | Required
extractionConfig | Cannot be specified (value is obtained from the event)
Supported strategies | Semantic / User Preference / Episodic (including custom overrides). Summarization is not supported
When value is not specified | That key is omitted from the record

Since there is a limit of 3 keys, you need to limit usage to keys that truly require deterministic control. For other classifications and categorizations, it seems best to leave them to LLM_INFERRED.
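Some of these constraints can be checked locally before calling create_memory. The following is a minimal sketch that encodes only the rules listed in the table above; validate_metadata_schema is a hypothetical helper, and the service itself remains the authority on what it accepts.

```python
MAX_STRICT_KEYS = 3  # per-strategy limit from the constraints table


def validate_metadata_schema(metadata_schema, indexed_keys):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    indexed = {entry["key"] for entry in indexed_keys}
    strict = [
        entry for entry in metadata_schema
        if entry.get("extractionType") == "STRICTLY_CONSISTENT"
    ]

    if len(strict) > MAX_STRICT_KEYS:
        problems.append(
            f"{len(strict)} STRICTLY_CONSISTENT keys (max {MAX_STRICT_KEYS})"
        )
    for entry in strict:
        key = entry["key"]
        if entry.get("type") != "STRING":
            problems.append(f"{key}: type must be STRING")
        if "extractionConfig" in entry:
            problems.append(f"{key}: extractionConfig cannot be specified")
        if key not in indexed:
            problems.append(f"{key}: must be declared in indexedKeys")
    return problems


# Deliberately invalid example: "priority" is a NUMBER and is not indexed.
schema = [
    {"key": "department", "type": "STRING", "extractionType": "STRICTLY_CONSISTENT"},
    {"key": "priority", "type": "NUMBER", "extractionType": "STRICTLY_CONSISTENT"},
]
problems = validate_metadata_schema(schema, [{"key": "department", "type": "STRING"}])
print(problems)
```

Running a check like this in CI can surface a schema mistake before it turns into a failed API call.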

The official documentation contains details on constraints and quotas.

https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/long-term-memory-metadata.html

Conclusion

Letting values that the application already knows, such as department codes and compliance levels, depend on LLM inference is somewhat risky, so it's great that you can now set them directly in short-term memory from the application side and have them carried over to long-term memory as-is. The What's New announcement lists multi-tenant memory and compliance boundaries as use cases, and combining namespace-based separation per principal with STRICTLY_CONSISTENT metadata scoping should enable more practical memory designs.

I hope this article proves helpful to you in some way. Thank you for reading to the end!
