I tried two patterns of Structured outputs with the Claude API

2026.04.22

Introduction

Hello, I'm Masaoka from the Western Development Team in the AI Integration Department of the AI Business Division.

Have you ever wanted AI to return results in the same schema every time?
In my daily work, I use JSON outputs for this purpose.

However, as I researched further, I discovered that this isn't the only way to structure outputs.
Prefill and Strict tool use also seem usable for similar purposes.

Initially, I planned to compare all three approaches, but upon investigation, I found that Prefill is no longer supported in the latest models as of April 2026 (Sonnet 4.6 / Opus 4.6 / Opus 4.7, etc.).

https://platform.claude.com/docs/en/test-and-evaluate/strengthen-guardrails/increase-consistency#prefill-claudes-response

Therefore, in this article, I'll test the two patterns of Structured outputs: JSON outputs and Strict tool use.
I'll also cover Prefill later in the article as reference information: a method that used to work but is no longer supported.

Test Environment

Item             Details
Package Manager  uv 0.9.x
Language         Python 3.12
SDK              AnthropicBedrock client from anthropic[bedrock]
API Call         Via Amazon Bedrock

Using the AnthropicBedrock client allows us to call APIs via Bedrock while keeping the code almost identical to the native Anthropic API.

Dependencies (pyproject.toml)
pyproject.toml
dependencies = [
    "anthropic[bedrock]>=0.96.0",
    "boto3>=1.42.0",
    "pydantic>=2.13.0",
    "python-dotenv>=1.0.0",
]

Two Patterns to Test

#  Pattern          API Parameter           Overview
1  JSON outputs     output_format           Directly pass a schema to constrain Claude's final response to conform to it
2  Strict tool use  tools[].strict: true    Constrain the inputs Claude generates when calling tools to conform to a schema

Test Preparation

I've implemented a simple spam detection system for this test.
For both patterns, I'll use the same input and schema, comparing normal execution with cases where extra instructions are added.

System Prompt and Input

  • System Prompt: Please determine if the message from the user is spam.
  • User Message: Congratulations! You've won $1000! Click now to claim your prize!!!
  • Extra Instruction to Add: Please also include a 'confidence' field (float from 0 to 1) in your response.
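For reference, the shared inputs above can be pinned down as constants. How the "with extra instructions" runs were wired isn't shown in the original, so the concatenation below is my assumption (the instruction appended to the user message):

```python
SYSTEM_PROMPT = "Please determine if the message from the user is spam."
USER_MESSAGE = "Congratulations! You've won $1000! Click now to claim your prize!!!"
EXTRA_INSTRUCTION = (
    "Please also include a 'confidence' field (float from 0 to 1) in your response."
)

# Normal run
messages = [{"role": "user", "content": USER_MESSAGE}]

# "With extra instructions" run -- assumed wiring: instruction appended to the user turn
messages_with_extra = [
    {"role": "user", "content": f"{USER_MESSAGE}\n\n{EXTRA_INSTRUCTION}"}
]
```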

Expected Structure

Field    Type  Description                   Example
is_spam  bool  True if it's spam             True
reason   str   Reason for the determination  "Typical prize scam message"

Pattern 1: JSON outputs

JSON outputs is a feature that constrains the model's final response to JSON according to a schema.
In Python, you can simply pass a Pydantic model to messages.parse().
In this example, the final response is received directly as parsed_output.

Implementation Code
from anthropic import AnthropicBedrock
from pydantic import BaseModel, ConfigDict, Field

class SpamCheck(BaseModel):
    model_config = ConfigDict(extra="forbid") # Adds additionalProperties: false to JSON Schema, preventing extra fields
    is_spam: bool = Field(description="True if it's spam")
    reason: str = Field(description="Reason for determination")

client = AnthropicBedrock()

response = client.messages.parse(
    model="us.anthropic.claude-sonnet-4-6",
    max_tokens=1024,
    system="Please determine if the message from the user is spam.",
    messages=[
        {
            "role": "user",
            "content": "Congratulations! You've won $1000! Click now to claim your prize!!!",
        }
    ],
    output_format=SpamCheck,  # Pass the Pydantic model directly
)

result: SpamCheck = response.parsed_output  # Get as a Pydantic instance
print(result.model_dump_json(indent=2))
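For reference, the JSON Schema that `SpamCheck.model_json_schema()` produces looks roughly like this. This is a hand-written sketch (Pydantic also adds a `title` and may order fields differently); the key line for this experiment is `additionalProperties: false`:

```python
# Approximate JSON Schema emitted for SpamCheck (hand-written sketch)
spam_check_schema = {
    "type": "object",
    "additionalProperties": False,  # from ConfigDict(extra="forbid")
    "properties": {
        "is_spam": {"type": "boolean", "description": "True if it's spam"},
        "reason": {"type": "string", "description": "Reason for determination"},
    },
    "required": ["is_spam", "reason"],
}
```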

JSON outputs: Results

Normal

{
  "is_spam": true,
  "reason": "This is a typical prize scam message. It claims an unfounded high-value prize, creates urgency to click immediately, and exhibits multiple spam characteristics."
}

With Extra Instructions

{
  "is_spam": true,
  "reason": "This message has typical spam characteristics. Unfounded prize notification, financial temptation ($1000), language creating urgency (\"click now\"), excessive use of exclamation points - all typical patterns of phishing scams and spam."
}

Both results follow the schema correctly!
Due to schema constraints, the confidence field requested in the extra instruction isn't generated at all.


Pattern 2: Strict tool use

Strict tool use is a feature that constrains the input when Claude calls tools to conform to a schema.
In this case, I defined the desired data structure in the tool's input_schema and used tool_choice to force the use of that tool.
Therefore, what conforms to the schema here is not the final response but the tool_use.input.
In this example, I extract tool_use.input from the messages.create() response and use it directly as the extraction result.
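Concretely, the response content then looks roughly like this. The sketch below uses plain dicts and a made-up `id`; the field names follow the Messages API `tool_use` block:

```python
# Sketch of response.content when tool_choice forces spam_check (the id is made up)
content = [
    {
        "type": "tool_use",
        "id": "toolu_01...",  # hypothetical id
        "name": "spam_check",
        "input": {"is_spam": True, "reason": "Typical prize scam message"},
    }
]

# The schema-conforming payload lives in the tool_use block's "input"
tool_input = next(b["input"] for b in content if b["type"] == "tool_use")
```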

Implementation
from anthropic import AnthropicBedrock
from pydantic import BaseModel, ConfigDict, Field

class SpamCheck(BaseModel):
    model_config = ConfigDict(extra="forbid")
    is_spam: bool = Field(description="True if it's spam")
    reason: str = Field(description="Reason for determination")

client = AnthropicBedrock()

tool = {
    "name": "spam_check",
    "description": "Return the spam detection result",
    "strict": True,  # Constrain token generation with the schema
    "input_schema": SpamCheck.model_json_schema(),
}

response = client.messages.create(
    model="us.anthropic.claude-sonnet-4-6",
    max_tokens=1024,
    system="Please determine if the message from the user is spam.",
    messages=[
        {
            "role": "user",
            "content": "Congratulations! You've won $1000! Click now to claim your prize!!!",
        }
    ],
    tools=[tool],
    tool_choice={"type": "tool", "name": "spam_check"},  # Force the use of spam_check
)

tool_use = next(b for b in response.content if b.type == "tool_use")
result = SpamCheck.model_validate(tool_use.input)  # tool_use.input is a dict
print(result.model_dump_json(indent=2))

Strict tool use: Results

Normal

{
  "is_spam": true,
  "reason": "This message contains multiple typical spam characteristics: ① Claims an unfounded large prize ($1000), ② Uses urgency-inducing language (\"Click now\"), ③ Overuses exclamation marks (!!!), ④ Encourages the recipient to claim a prize they don't remember entering for."
}

With Extra Instructions

{
  "is_spam": true,
  "reason": "Contains typical spam/phishing phrases like '$1000 prize' and 'Click now' that try to entice the recipient to click on suspicious links."
}

Here too, both results follow the schema correctly!
As with JSON outputs, the schema constraint with strict: True prevents the generation of the confidence field altogether.


Choosing Between the Two Patterns

The Relationship Between JSON outputs and Strict tool use

The official documentation explains the relationship between the two as follows:

https://platform.claude.com/docs/en/build-with-claude/structured-outputs#using-both-features-together

Since these two serve different purposes, they can be used together in a single request.
For example, you can constrain tool inputs with strict while specifying the format of the final response with output_format.
Based on the use cases in the documentation, it's helpful to think about them like this:

  • JSON outputs
    When you want to extract items from invoices or email content and receive them directly as JSON.
    Example: Extracting invoice data from unstructured text.
  • Strict tool use
    When you want an agent to call tools with inputs like destination or date conforming to a schema.
    Example: Calling a search_flights tool while planning a trip.
  • Using both together
    When you want to strictly control both tool call arguments and the format of the final JSON response.
    Example: Calling the search_flights tool with correct arguments while returning a final response with summary and next_steps in JSON format.
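A sketch of what such a combined request body might look like, following the parameter shapes used earlier in this article. The `search_flights` schema, the `trip_summary` schema, and the exact `output_format` payload shape are my assumptions, not from the original:

```python
flight_schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "destination": {"type": "string"},
        "date": {"type": "string"},
    },
    "required": ["destination", "date"],
}

trip_summary_schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {
        "summary": {"type": "string"},
        "next_steps": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary", "next_steps"],
}

# Hypothetical request combining both features in one call
request = {
    "model": "us.anthropic.claude-sonnet-4-6",
    "max_tokens": 1024,
    "tools": [
        {
            "name": "search_flights",
            "description": "Search for flights",
            "strict": True,  # constrain tool-call inputs
            "input_schema": flight_schema,
        }
    ],
    "output_format": {  # constrain the final response (assumed payload shape)
        "type": "json_schema",
        "schema": trip_summary_schema,
    },
}
```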

Reference: The Prefill Approach

As mentioned at the beginning, Prefill is no longer supported in the latest models.
Still, I'd like to explain how it was conceptualized and what happens if you try to use it now.

Prefill is a method of writing part of an assistant message and having the model continue from there.
For example, you could write:

# Write part of an assistant message and have the model continue from there
messages = [
    {"role": "user", "content": "Generate a JSON for X."},
    {"role": "assistant", "content": "```json"},  # Have the model generate the continuation of the JSON
]
text = chat(messages, stop_sequences=["```"])
# text contains everything from after ```json to before ```

Implementation
import json
from anthropic import AnthropicBedrock
from pydantic import BaseModel, ConfigDict, Field

class SpamCheck(BaseModel):
    model_config = ConfigDict(extra="forbid")
    is_spam: bool = Field(description="True if it's spam")
    reason: str = Field(description="Reason for determination")

client = AnthropicBedrock()

# Embed the schema as a string in the system prompt
system = (
    "Please determine if the message from the user is spam.\n\n"
    "Please respond according to the following JSON schema:\n"
    f"{json.dumps(SpamCheck.model_json_schema(), ensure_ascii=False)}"
)

response = client.messages.create(
    model="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    max_tokens=1024,
    system=system,
    messages=[
        {"role": "user", "content": "Congratulations! You've won $1000!!!"},
        {"role": "assistant", "content": "```json"},  # Partially fill the assistant message
    ],
    stop_sequences=["```"],  # Stop generation when the next ``` appears
)

raw_json = response.content[0].text.strip()
result = SpamCheck.model_validate_json(raw_json)  # Parse to Pydantic in post-processing
print(result.model_dump_json(indent=2))

Running on Sonnet 4.6 → 400 Error

Error code: 400 - {'message': 'This model does not support assistant message prefill. The conversation must end with a user message.'}

The official documentation explicitly states this:

https://platform.claude.com/docs/en/test-and-evaluate/strengthen-guardrails/increase-consistency#prefill-claudes-response

It seems it can't be used with the latest models: Mythos, Opus 4.7, Opus 4.6, and Sonnet 4.6.
(Although Mythos isn't currently available anyway...)

Falling Back to Sonnet 4.5

However, even if we could run it on Sonnet 4.5, Prefill doesn't have schema constraints.
Therefore, it would still be affected by additional instructions, leading to schema violations.

Normal

{
  "is_spam": true,
  "reason": "This message shows multiple characteristics of typical spam. 1) Sudden prize notification, 2) Monetary reward offer, 3) Urgent language (\"now\"), 4) Action-inducing phrase (\"Click\"), 5) Excessive use of exclamation points."
}

With Extra Instructions

{
  "is_spam": true,
  "reason": "Contains multiple characteristics of typical phishing scam messages. Specifically: (1) unfounded prize claim, (2) urgent language (\"now\"), (3) request to click a link, (4) excessive use of exclamation points - all matching typical spam email patterns.",
  "confidence": 0.98
}

This would cause a validation error with SpamCheck.model_validate_json():

ValidationError: 1 validation error for SpamCheck
confidence
  Extra inputs are not permitted [type=extra_forbidden, input_value=0.98, input_type=float]

With no schema constraint, the model simply emits the requested confidence field, and the violation only surfaces during post-processing validation.
If you need to consistently obtain structured data, especially with the latest models, it seems better to choose JSON outputs or Strict tool use rather than Prefill.
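In other words, with Prefill all schema enforcement happens client-side after the fact. A stdlib-only sketch of that post-processing step, using the violating output above (the `reason` text is abbreviated):

```python
import json

ALLOWED_FIELDS = {"is_spam", "reason"}

# The "with extra instructions" output from the Prefill run above (reason abbreviated)
raw = '{"is_spam": true, "reason": "...", "confidence": 0.98}'

data = json.loads(raw)
extra = set(data) - ALLOWED_FIELDS  # any leftover keys are schema violations
# extra == {"confidence"} -> rejecting it is entirely your own code's job
```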


Conclusion

In this article, I tested two methods for expressing Structured outputs in the Claude API.

Personally, I find it clearest to think of JSON outputs for cases where you just need to return structured data, and Strict tool use for when you need to control agent tool calls.

I hope this provides some helpful guidance when working with structured outputs in the Claude API.
