I tried Structured Output with Strands Agents

2025.12.15

Introduction

Hello, I'm Kanno from the consulting department, and I love supermarkets.

When developing applications using LLMs, there are cases where you want to receive output from LLMs as structured data.
So this time, I tried Strands Agents' Structured Output feature!

https://strandsagents.com/latest/documentation/docs/user-guide/concepts/agents/structured-output/

Structured Output

For example, you might want to turn a user's question into a metadata-filtering query for OpenSearch, or into parameters for a downstream search.
To do this, you could embed output instructions in your prompts or add markers to make the reply parsable, but these approaches are fragile and often require fallback handling, which adds cost. Structured output promises to solve exactly this problem.

To be specific: LLM output comes back as raw text, so the application has to handle parsing itself.
If you simply request "Output as JSON," the model sometimes prepends introductory text to the JSON. To cope, you end up adding "Output only JSON" to the system prompt or devising tag-based tricks to extract the structured part.

Me: Name: Tanaka Taro
Age: 21

Output this as JSON.

# This preamble is unnecessary!!!
LLM (Nova 2 Lite): Here's the output in JSON format:

```json
# I only need this one line, code blocks are unnecessary
{"name": "Tanaka Taro", "age": 21}
```
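Without structured output, the application has to strip that extra text itself. Here's a minimal sketch of the kind of brittle fallback parsing this forces on you (the regex and fallback order are my own illustration, not something Strands provides):

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Pull a JSON object out of an LLM reply that may contain
    preamble text and Markdown code fences."""
    # Happy path first: the whole reply is already valid JSON.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Otherwise look for a fenced ```json block, then a bare {...} span.
    match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", raw, re.DOTALL)
    if match is None:
        match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("No JSON object found in model output")
    return json.loads(match.group(1) if match.lastindex else match.group(0))

reply = 'Here\'s the output in JSON format:\n\n```json\n{"name": "Tanaka Taro", "age": 21}\n```'
print(extract_json(reply))  # {'name': 'Tanaka Taro', 'age': 21}
```

Every new output quirk from the model means another branch in a function like this, which is exactly the maintenance burden structured output removes.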

Structured output is useful in such cases - it's a feature that allows you to get responses from LLMs in a format that follows a predefined schema.
You just define the schema with a Pydantic model, and you receive the results as validated Python objects.

The benefits include:

  • Getting typed Python objects instead of raw strings
  • Automatic validation of responses against the schema by Pydantic
  • IDE type hints working for LLM-generated responses

Not having to write parsing logic yourself is really nice!
In the case of Strands Agents, it works with every model provider Strands supports, so you can get structured output even from models that don't natively support it.

As a side note, both OpenAI and Anthropic have models that support this in their respective APIs.

For this article, I'll be trying this feature using Amazon Nova 2 Lite.

Prerequisites / Environment

Here's the environment used for this testing:

  • Python 3.13
  • uv 0.6.12
  • strands-agents 1.19.0
  • strands-agents-tools 0.2.17
  • pydantic 2.12.5
  • boto3 1.42.4

Let's Try It

Now, let's try structured output in practice!

Setting Up the Environment

First, let's create a project using uv and install the necessary libraries.

# Create the project
uv init strands-structured-output
cd strands-structured-output

# Install necessary libraries
uv add strands-agents strands-agents-tools pydantic boto3

Basic Usage

Let's start with a simple example. We'll create an agent that extracts book information from text.

Create main.py:

main.py
from pydantic import BaseModel, Field
from strands import Agent
from strands.models import BedrockModel

# Define Pydantic model
class BookInfo(BaseModel):
    """Model representing book information"""
    title: str = Field(description="Title of the book")
    author: str = Field(description="Author name")
    price: int = Field(description="Price (in yen)")
    genre: str = Field(description="Genre")

# Set up Bedrock model
bedrock_model = BedrockModel(
    model_id="us.amazon.nova-2-lite-v1:0",
    temperature=0.0
)

# Create agent
agent = Agent(model=bedrock_model)

# Extract information from text
text = """
"Cloud Native Architecture Introduction" is a technical book by Taro Tanaka,
explaining modern system design using AWS and Kubernetes.
It costs 3,200 yen and falls under the IT technical book genre.
"""

result = agent(
    text,
    structured_output_model=BookInfo
)

# Access the result
book: BookInfo = result.structured_output
print(f"Title: {book.title}")
print(f"Author: {book.author}")
print(f"Price: {book.price} yen")
print(f"Genre: {book.genre}")

Now let's run it:

uv run main.py

The result will be:

Tool #1: BookInfo
Title: クラウドネイティブアーキテクチャ入門
Author: 田中太郎
Price: 3200 yen
Genre: IT技術書

Wow! We successfully obtained structured data! Being able to access it with dot notation like book.title is convenient.
Having type specifications that enable IDE completion is a nice advantage.

Looking at the logs, we can see that it's structured in the form of tool usage.
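Since the structured output rides on tool use, you can inspect the JSON schema that the tool definition is generated from by dumping it from the Pydantic model. `model_json_schema()` is standard Pydantic v2; the exact tool spec Strands derives from it may differ, so treat this as a way to sanity-check your field descriptions:

```python
from pydantic import BaseModel, Field

class BookInfo(BaseModel):
    """Model representing book information"""
    title: str = Field(description="Title of the book")
    author: str = Field(description="Author name")
    price: int = Field(description="Price (in yen)")
    genre: str = Field(description="Genre")

# The JSON schema that backs the generated tool definition
schema = BookInfo.model_json_schema()
print(schema["required"])                     # ['title', 'author', 'price', 'genre']
print(schema["properties"]["price"]["type"])  # integer
```

The field descriptions end up in this schema, which is why writing them carefully matters: they are effectively part of your prompt.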

Example Using Nested Models

Next, let's try a more complex example using nested models.
Let's extract restaurant review information:

from pydantic import BaseModel, Field
from typing import List, Optional
from strands import Agent
from strands.models import BedrockModel

# Define nested models
class MenuItem(BaseModel):
    """Menu item"""
    name: str = Field(description="Dish name")
    price: Optional[int] = Field(description="Price (in yen)", default=None)

class RestaurantReview(BaseModel):
    """Restaurant review information"""
    restaurant_name: str = Field(description="Restaurant name")
    cuisine_type: str = Field(description="Cuisine type")
    rating: float = Field(description="Rating (1.0-5.0)", ge=1.0, le=5.0)
    recommended_dishes: List[MenuItem] = Field(description="List of recommended dishes")
    overall_impression: str = Field(description="Overall impression (in 1-2 sentences)")

# Create agent
bedrock_model = BedrockModel(
    model_id="us.amazon.nova-2-lite-v1:0",
    region_name="us-west-2",
    temperature=0.0,
)

agent = Agent(model=bedrock_model)

# Extract information from review text
review_text = """
I recently visited "Trattoria Bella" which just opened in front of the station.
It's an authentic Italian restaurant, and the Carbonara (1,400 yen) and
Margherita pizza (1,600 yen) were exquisite!
The atmosphere was nice, the staff was courteous, and I'd give it 4.5 out of 5 points.
I think it's a perfect place for a date.
"""

result = agent(
    review_text,
    structured_output_model=RestaurantReview
)

review: RestaurantReview = result.structured_output
print(f"Restaurant name: {review.restaurant_name}")
print(f"Cuisine type: {review.cuisine_type}")
print(f"Rating: {review.rating}")
print(f"Recommended dishes:")
for dish in review.recommended_dishes:
    price_str = f"({dish.price} yen)" if dish.price else ""
    print(f"  - {dish.name}{price_str}")
print(f"Overall impression: {review.overall_impression}")

The result is:

uv run main.py

Tool #1: RestaurantReview
Restaurant name: トラットリア・ベッラ
Cuisine type: イタリアン
Rating: 4.5
Recommended dishes:
  - カルボナーラ(1400 yen)
  - マルゲリータピザ(1600 yen)
Overall impression: 本格的なイタリアンで、雰囲気も良く、スタッフの対応も丁寧。デートにもぴったりのおしゃれなレストランです。

It's handling nested models correctly! It's nice that it can also handle list types like List[MenuItem].

Using Validation

You can also utilize Pydantic's powerful validation features.
For example, you can detect errors if the character count is less than 10:

from pydantic import BaseModel, Field, field_validator
from strands import Agent
from strands.models import BedrockModel

class ProductRating(BaseModel):
    """Product rating"""

    product_name: str = Field(description="Product name")
    rating: int = Field(description="Rating (integer from 1 to 5)", ge=1, le=5)
    review_comment: str = Field(description="Review comment")

    @field_validator("review_comment")
    @classmethod
    def validate_comment_length(cls, value: str) -> str:
        if len(value) < 10:
            raise ValueError("Review comment must be at least 10 characters long")
        return value

bedrock_model = BedrockModel(
    model_id="us.amazon.nova-2-lite-v1:0",
    region_name="us-west-2",
    temperature=0.0,
)

agent = Agent(model=bedrock_model)

result = agent(
    "I bought an AWS book. 5 stars, very good book that taught me AWS basics well.",
    structured_output_model=ProductRating,
)

product: ProductRating = result.structured_output
print(f"Product name: {product.product_name}")
print(f"Rating: {'★' * product.rating}")
print(f"Comment: {product.review_comment}")

When Successful

Tool #1: ProductRating
Product name: AWSの本
Rating: ★★★★★
Comment: とても良い本でAWSの基礎がしっかり学べました。

When Failed

Let's try running it with just "Great" as the review comment:

uv run main.py
Tool #1: ProductRating
tool_name=<ProductRating> | structured output validation failed | error_message=<Validation failed for ProductRating. Please fix the following errors:
- Field 'review_comment': Value error, Review comment must be at least 10 characters long>
Please make your review comment at least 10 characters long and resubmit your rating. Your current comment "最高" is only 2 characters. For example, write something like "AWSの本がとても分かりやすく、最高の学習リソースになりました!" which is more than 10 characters.
Tool #2: ProductRating
Product name: AWSの本
Rating: ★★★★★
Comment: AWSの本がとても分かりやすく、最高の学習リソースになりました!

Even though validation failed, the agent retried and forced out a structured result!
The official documentation states that when a validation issue is detected, the agent automatically retries to resolve it, so this is the default behavior. It's worth being aware of: when strict validation matters, you may prefer to reject the input outright, and it seems more reliable to validate the structured data with your own logic rather than rely on this in-loop retry.
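One pattern that avoids depending on the in-loop retry is to keep the Pydantic model permissive and apply your own strict checks after the call, rejecting the input rather than letting the model regenerate. A stdlib-only sketch of that idea (the field names mirror the ProductRating example above; the check function is my own illustration):

```python
from dataclasses import dataclass

@dataclass
class ProductRating:
    product_name: str
    rating: int
    review_comment: str

def check_rating(r: ProductRating) -> list[str]:
    """Return a list of problems instead of asking the LLM to retry."""
    problems = []
    if not 1 <= r.rating <= 5:
        problems.append(f"rating {r.rating} is outside 1-5")
    if len(r.review_comment) < 10:
        problems.append("review comment is shorter than 10 characters")
    return problems

result = ProductRating("AWSの本", 5, "最高")
issues = check_rating(result)
if issues:
    # Reject the input instead of letting the model invent a longer comment
    print("rejected:", "; ".join(issues))
```

This way a too-short comment stays a rejection, instead of silently becoming a model-fabricated review.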

This behavior is also reported in GitHub Issue #1108, and currently there's no parameter provided to limit the number of retries. If you want to limit retries, you can use Hooks to forcibly stop it.

https://strandsagents.com/latest/documentation/docs/user-guide/concepts/agents/hooks/#limit-tool-counts

Error Handling

Error handling is important when there's an issue parsing structured output. Strands Agents throws a StructuredOutputException in such cases:

from strands import Agent
from strands.models import BedrockModel
from strands.types.exceptions import StructuredOutputException
from pydantic import BaseModel, Field

class StrictModel(BaseModel):
    """Model with strict constraints"""
    name: str = Field(description="Name", min_length=1)
    age: int = Field(description="Age", ge=0, le=150)

bedrock_model = BedrockModel(
    model_id="us.amazon.nova-2-lite-v1:0",
    region_name="us-west-2",
    temperature=0.0,
)

agent = Agent(model=bedrock_model)

try:
    result = agent(
        "Hello",
        structured_output_model=StrictModel
    )
    data = result.structured_output
    print(f"Name: {data.name}, Age: {data.age}")
except StructuredOutputException as e:
    print(f"Structured output failed: {e}")

Let's run it:

uv run main.py
Hello! I'm here to help. Let me know if you need any specific information or support.
Tool #1: StrictModel
Name: ユーザー, Age: 25

Hmm, it's not detecting the failure...

In this example, for the prompt "Hello" which lacks sufficient information, the LLM creatively generated "Name: User, Age: 25" to avoid an error.

My impression is that the exception handling is more of a safety net than a reliable mechanism. You'll likely need to steer with prompts and add your own validation from other angles in the surrounding process.
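One such extra angle is a groundedness check: verify that an extracted value actually appears in the source text, which would catch the invented "Name: User, Age: 25" seen above. A rough stdlib sketch (this heuristic is my own, not a Strands feature, and exact substring matching is deliberately naive):

```python
def is_grounded(value: str, source: str) -> bool:
    """Very rough check: an extracted string should occur in the input text."""
    return value.strip() != "" and value in source

source = "Hello"
extracted_name = "ユーザー"  # what the model invented for the information-free prompt
if not is_grounded(extracted_name, source):
    print("extraction not supported by the input - treat as a failure")
```

In practice you'd want fuzzier matching (normalization, partial matches), but even this catches values the model conjured out of thin air.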

Trying a Query and Mode Extraction Example

Let's try one more example that I think could be useful in practice - extracting search modes and queries for a mock search pattern:

  1. Have the LLM generate search mode and query using structured output
  2. Execute a search with the output parameters

Here's the code:

from enum import Enum
from typing import List, Optional

from pydantic import BaseModel, Field
from strands import Agent
from strands.models import BedrockModel

# Define search modes
class SearchMode(str, Enum):
    SEMANTIC = "semantic"  # Semantic search (vector search)
    KEYWORD = "keyword"  # Keyword search
    HYBRID = "hybrid"  # Hybrid search

# Structured query model
class StructuredQuery(BaseModel):
    """Search query information extracted from user questions"""

    search_query: str = Field(
        description="Query string to use for search. Converted from user's question into a search-appropriate form"
    )
    search_mode: SearchMode = Field(
        description="Search mode. semantic for semantic search, keyword for specific keyword search, hybrid when both are needed"
    )
    category: Optional[str] = Field(
        description="Search category target (e.g., AWS, Python, Databases, etc.)",
        default=None,
    )
    filters: Optional[List[str]] = Field(
        description="Additional filter conditions (e.g., Latest, Beginner-friendly, Official documents, etc.)",
        default=None,
    )
    time_range: Optional[str] = Field(
        description="Time range specification if any (e.g., After 2024, Last month, etc.)",
        default=None,
    )

# Mock search function
def mock_search(query: StructuredQuery) -> List[dict]:
    """
    Execute a mock search using structured query
    In an actual system, this would query OpenSearch or a vector DB
    """
    print(f"\n{'=' * 50}")
    print("🔍 Executing search...")
    print(f"{'=' * 50}")
    print(f"  Query: {query.search_query}")
    print(f"  Mode: {query.search_mode.value}")
    if query.category:
        print(f"  Category: {query.category}")
    if query.filters:
        print(f"  Filters: {', '.join(query.filters)}")
    if query.time_range:
        print(f"  Period: {query.time_range}")
    print(f"{'=' * 50}\n")

    # Return mock results
    mock_results = [
        {
            "title": f"【{query.category or 'Technology'}】Article about {query.search_query}",
            "score": 0.95,
            "snippet": f"This article explains {query.search_query} in detail...",
        },
        {
            "title": f"{query.search_query} Beginner's Guide",
            "score": 0.87,
            "snippet": f"Covers {query.search_query} from basics to advanced for beginners...",
        },
    ]

    return mock_results

# Create agent
bedrock_model = BedrockModel(
    model_id="us.amazon.nova-2-lite-v1:0",
    temperature=0.0,
)
agent = Agent(model=bedrock_model)

# List of test user questions
user_questions = [
    "Tell me about cold start mitigation strategies for AWS Lambda",
    "How to implement vector search in OpenSearch? I'd prefer beginner-friendly articles",
    "What are the new Bedrock features announced since 2024?",
]

for question in user_questions:
    print(f"\n{'#' * 60}")
    print(f"📝 User question: {question}")
    print(f"{'#' * 60}")

    # Convert question to structured query
    result = agent(
        f"Please structure the following user question as a search query:\n\n{question}",
        structured_output_model=StructuredQuery,
    )

    structured_query: StructuredQuery = result.structured_output

    # Display structured result
    print(f"\n📊 Structured result:")
    print(f"  search_query: {structured_query.search_query}")
    print(f"  search_mode: {structured_query.search_mode.value}")
    print(f"  category: {structured_query.category}")
    print(f"  filters: {structured_query.filters}")
    print(f"  time_range: {structured_query.time_range}")

    # Execute search using structured query
    results = mock_search(structured_query)

    # Display search results
    print("📚 Search results:")
    for i, res in enumerate(results, 1):
        print(f"  {i}. {res['title']} (Score: {res['score']})")

Now let's run the code:

uv run main.py
############################################################
📝 User question: Tell me about cold start mitigation strategies for AWS Lambda
############################################################

Tool #1: StructuredQuery

📊 Structured result:
  search_query: AWS Lambda コールドスタート対策
  search_mode: keyword
  category: AWS
  filters: None
  time_range: None

==================================================
🔍 Executing search...
==================================================
  Query: AWS Lambda コールドスタート対策
  Mode: keyword
  Category: AWS
==================================================

📚 Search results:
  1. 【AWS】Article about AWS Lambda コールドスタート対策 (Score: 0.95)
  2. AWS Lambda コールドスタート対策 Beginner's Guide (Score: 0.87)

############################################################
📝 User question: How to implement vector search in OpenSearch? I'd prefer beginner-friendly articles
############################################################

Tool #2: StructuredQuery

📊 Structured result:
  search_query: OpenSearch ベクトル検索 実装
  search_mode: keyword
  category: OpenSearch
  filters: ['初心者向け']
  time_range: None

==================================================
🔍 Executing search...
==================================================
  Query: OpenSearch ベクトル検索 実装
  Mode: keyword
  Category: OpenSearch
  Filters: 初心者向け
==================================================

📚 Search results:
  1. 【OpenSearch】Article about OpenSearch ベクトル検索 実装 (Score: 0.95)
  2. OpenSearch ベクトル検索 実装 Beginner's Guide (Score: 0.87)

############################################################
📝 User question: What are the new Bedrock features announced since 2024?
############################################################

Tool #3: StructuredQuery

📊 Structured result:
  search_query: Bedrock 新機能
  search_mode: keyword
  category: Bedrock
  filters: None
  time_range: 2024年以降

==================================================
🔍 Executing search...
==================================================
  Query: Bedrock 新機能
  Mode: keyword
  Category: Bedrock
  Period: 2024年以降
==================================================

📚 Search results:
  1. 【Bedrock】Article about Bedrock 新機能 (Score: 0.95)
  2. Bedrock 新機能 Beginner's Guide (Score: 0.87)

It's nicely extracting parameters and passing them to the search function!
It's great that we can force output into the needed form using structured output without having to specify it in the system prompt.

When using this in practice, you should refer to the best practices in the official documentation:

https://strandsagents.com/latest/documentation/docs/user-guide/concepts/agents/structured-output/#best-practices_1

Impressions

It's really nice that you can define a Pydantic model and get structured output via tool use. IDE completion works, the returned data is easy to handle, and it's worth considering whenever you want to enforce a response structure.

On the other hand, the error-handling behavior concerns me: structuring is forced through retries. That suggests adding a separate validation step, or controlling the number of tool calls yourself.

It's probably best to start with simple examples like the search query case. It's also important to analyze logs to see how successful the structuring is and whether it's meeting your intentions.

I'd like to explore the error handling more and share insights on that in future blog posts!

Conclusion

I hope this article has been helpful.
Thank you for reading to the end!
