[Update] Amazon Bedrock Data Automation now supports custom output (custom blueprints) for audio files

Takakuni

2025.05.07

Hello! This is Takakuni (@takakuni_) from the Consulting Department of the Cloud Business Division.
Amazon Bedrock Data Automation now lets you create custom output (custom blueprints) for audio files.
https://aws.amazon.com/jp/about-aws/whats-new/2025/05/amazon-bedrock-data-automation-extraction-custom-insights-audio/
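
What's in the update
Custom output (custom blueprints) in Amazon Bedrock Data Automation (BDA from here on) is a feature for extracting or generating fields that standard output does not return.
For audio files, BDA supports the following extraction and generation in standard output.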
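
Extraction
Audio transcript: a transcript of the entire audio
Content moderation: harmful content that was detected

Generation
Audio summary: a summary of the entire audio
Topic summary: a summary of each topic in the audio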

https://docs.aws.amazon.com/bedrock/latest/userguide/audio-processing.html
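For example, suppose you have a recording of a meeting. With standard output alone, you get the full audio transcript and the audio summary and that is all, so information such as next actions has to be dug out again by the user or extracted with a separate LLM call.
With a custom blueprint, you pass the LLM an instruction about the input, such as "List the action items," and can extract or generate fields that standard output does not return.
Before this update, custom blueprints supported only text (document) and image files. With this update, audio files are supported as well.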
https://docs.aws.amazon.com/bedrock/latest/userguide/creating-blueprint-audio.html
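To make the "List the action items" idea concrete, here is a minimal sketch of what a custom audio blueprint could look like when created from code. The field names `action_items` and `decisions`, the blueprint description, and the exact schema keys are my own assumptions for illustration, so please check them against the blueprint documentation linked above before relying on them. The boto3 call is wrapped in a function so nothing touches AWS until you call it.

```python
import json


def build_audio_blueprint_schema() -> dict:
    """Build a blueprint schema with two hypothetical custom fields.

    The field names and instructions are illustrative assumptions,
    not fields taken from the article.
    """
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "description": "Meeting recording with follow-up items",
        "class": "Meeting recording",
        "type": "object",
        "definitions": {},
        "properties": {
            # Inferred: the model reasons over the transcript to produce the value.
            "action_items": {
                "type": "array",
                "inferenceType": "inferred",
                "instruction": "List the action items agreed on in the meeting.",
                "items": {"type": "string"},
            },
            # Explicit: the value should be taken directly from what was said.
            "decisions": {
                "type": "array",
                "inferenceType": "explicit",
                "instruction": "Decisions that the speakers state explicitly.",
                "items": {"type": "string"},
            },
        },
    }


def create_audio_blueprint(name: str, region: str = "us-east-1") -> str:
    """Create the blueprint and return its ARN (requires AWS credentials)."""
    import boto3  # imported lazily so the schema helper works without boto3

    client = boto3.client("bedrock-data-automation", region_name=region)
    response = client.create_blueprint(
        blueprintName=name,
        type="AUDIO",  # assumption: audio blueprints use this type value
        blueprintStage="LIVE",
        schema=json.dumps(build_audio_blueprint_schema()),
    )
    return response["blueprint"]["blueprintArn"]
```

Calling `create_audio_blueprint("meeting-action-items")` (a hypothetical name) would register the blueprint. Building the schema as a plain dict first makes it easy to review or version the fields before anything is sent.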
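
Trying it out
Now let's actually use a blueprint and try custom output for an audio file.
This time I use a sample blueprint to see how well the extraction works. From the custom output settings, select a sample blueprint.
Click General-Audio.
Extraction and generation from the audio file using the blueprint has started.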
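I ran this from the console, but the same job can presumably be started from code as well. The following is a rough sketch under several assumptions: the S3 URIs and the data automation profile ARN are placeholders you must supply, the blueprint ARN is the one that appears in the result output of this article, and the request shape reflects my understanding of the `invoke_data_automation_async` API in boto3 rather than anything I verified in this walkthrough.

```python
# ARN of the General Audio sample blueprint, as it appears in the result output.
GENERAL_AUDIO_BLUEPRINT_ARN = (
    "arn:aws:bedrock:us-east-1:aws:blueprint/"
    "bedrock-data-automation-public-general-audio"
)


def build_invoke_request(
    input_s3_uri: str,
    output_s3_uri: str,
    profile_arn: str,
    blueprint_arn: str = GENERAL_AUDIO_BLUEPRINT_ARN,
) -> dict:
    """Assemble keyword arguments for invoke_data_automation_async.

    All URIs and the profile ARN are caller-supplied placeholders.
    """
    return {
        "inputConfiguration": {"s3Uri": input_s3_uri},
        "outputConfiguration": {"s3Uri": output_s3_uri},
        "blueprints": [{"blueprintArn": blueprint_arn}],
        "dataAutomationProfileArn": profile_arn,
    }


def start_audio_job(
    input_s3_uri: str,
    output_s3_uri: str,
    profile_arn: str,
    region: str = "us-east-1",
) -> str:
    """Start the asynchronous job and return its invocation ARN.

    Requires AWS credentials and access to the S3 locations.
    """
    import boto3  # imported lazily so the request builder works without boto3

    runtime = boto3.client("bedrock-data-automation-runtime", region_name=region)
    request = build_invoke_request(input_s3_uri, output_s3_uri, profile_arn)
    response = runtime.invoke_data_automation_async(**request)
    return response["invocationArn"]
```

The job is asynchronous, so the returned invocation ARN is what you would use to check for completion before reading the output from the S3 location.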
The blueprint comes with fields defined in advance.
The fields it defines are as follows.


Field name	Instruction	Extraction type	Type
transcript_summary	Generate a concise abstractive summary of the conversation, focusing on the main topics and key themes. Ensure accuracy by summarizing only what is explicitly discussed, without adding specific details not present in the conversation. Keeping the response within 100 words.	inferred	string
sentiment_summary	A less than 10-word summary of the speakers' sentiments over the course of the audio transcript. Make sure to include changes in sentiment, if they occur.	inferred	string
topics	The main topics of the audio transcript, listed as single words.	inferred	[string] (Array of strings)
category	The category of the audio (not the topic). Choose from General conversation, Media, Hospitality, Speeches, Meetings, Education, Financial, Public sector, Healthcare, Sales, Audiobooks, Podcasts, 911 calls, Other.	inferred	string
spoken_named_entities	Any named entities (typically proper nouns) explicitly mentioned in the audio transcript including locations, brand names, company names, product names, services, events, organizations, etc. Do not include names of people, email addresses or common nouns.	extractive	[string] (Array of strings)

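The extraction (inference) type changes what BDA is instructed to do, as follows.
Explicit: BDA must extract the value directly from the input. (The `extractive` value shown for spoken_named_entities appears to correspond to this type.)
Inferred: BDA must infer the value based on information present in the input.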
https://docs.aws.amazon.com/bedrock/latest/userguide/idp-cases-extraction.html
After waiting a few tens of seconds, the following result came back.
It includes fields such as category and sentiment_summary, which standard output does not provide.
{
  "matched_blueprint": {
    "arn": "arn:aws:bedrock:us-east-1:aws:blueprint/bedrock-data-automation-public-general-audio",
    "name": "General Audio",
    "confidence": 1
  },
  "inference_result": {
    "topics": [
      "production",
      "quality control",
      "marketing",
      "pricing",
      "features",
      "challenges",
      "customer support"
    ],
    "category": "Meetings",
    "transcript_summary": "John and David discussed the upcoming launch of the Safeguard Pro smart home security systems, covering production, quality control, marketing strategies, pricing, features, potential challenges, and customer support. They agreed on the importance of contingency plans and educational materials for consumers.",
    "sentiment_summary": "Positive and collaborative discussion.",
    "spoken_named_entities": [
      "Safeguard Pro",
      "holiday season",
      "social media",
      "tech review websites",
      "pop up demonstration booths",
      "home improvement stores",
      "$299.99",
      "$399",
      "$499",
      "AI power threat detection",
      "smart home integration",
      "mobile app"
    ]
  }
}
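
If you want to use this result downstream, something like the following sketch could pull out the parts you care about. It assumes the output keeps the shape shown above (`matched_blueprint` and `inference_result` at the top level). The sample string is a trimmed copy of the result above with `transcript_summary` and `sentiment_summary` left out for brevity, and the helper name `summarize_custom_output` is my own.

```python
import json

# Trimmed copy of the result shown above (summaries omitted for brevity).
RESULT_JSON = """
{
  "matched_blueprint": {
    "arn": "arn:aws:bedrock:us-east-1:aws:blueprint/bedrock-data-automation-public-general-audio",
    "name": "General Audio",
    "confidence": 1
  },
  "inference_result": {
    "topics": ["production", "quality control", "marketing", "pricing",
               "features", "challenges", "customer support"],
    "category": "Meetings",
    "spoken_named_entities": ["Safeguard Pro", "holiday season", "social media",
                              "tech review websites", "pop up demonstration booths",
                              "home improvement stores", "$299.99", "$399", "$499",
                              "AI power threat detection", "smart home integration",
                              "mobile app"]
  }
}
"""


def summarize_custom_output(raw: str) -> dict:
    """Parse a custom output document and return a few selected values."""
    doc = json.loads(raw)
    result = doc["inference_result"]
    entities = result["spoken_named_entities"]
    return {
        "blueprint": doc["matched_blueprint"]["name"],
        "confidence": doc["matched_blueprint"]["confidence"],
        "category": result["category"],
        "topic_count": len(result["topics"]),
        # Prices were returned among the named entities, so separate them out.
        "prices": [e for e in entities if e.startswith("$")],
    }


summary = summarize_custom_output(RESULT_JSON)
print(summary["category"], summary["prices"])
```

Note that the prices came back inside spoken_named_entities, so a dedicated field in a custom blueprint may be a cleaner way to collect them.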

Pricing
Finally, pricing. Up to 30 fields, the price was $0.009 per minute. Note that from the 31st field onward, an additional $0.0005 is charged per field.
https://aws.amazon.com/bedrock/pricing/?nc1=h_ls
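As a quick way to estimate cost from the numbers above, here is a small sketch. One point is an assumption on my part: I treat the $0.0005 surcharge for each field beyond 30 as applying per minute of audio, so please confirm the billing unit on the pricing page. The function name and parameters are hypothetical, and the rates are the ones quoted in this article, which may change.

```python
from decimal import Decimal

BASE_RATE_PER_MINUTE = Decimal("0.009")    # USD, covers up to 30 fields
EXTRA_FIELD_RATE = Decimal("0.0005")       # USD per field beyond the 30th
INCLUDED_FIELDS = 30


def estimate_audio_custom_output_cost(minutes, field_count: int) -> Decimal:
    """Estimate the custom output cost in USD for one audio file.

    Assumption: the extra-field surcharge is charged per minute of audio.
    """
    if field_count < 0:
        raise ValueError("field_count must not be negative")
    extra_fields = max(0, field_count - INCLUDED_FIELDS)
    rate_per_minute = BASE_RATE_PER_MINUTE + EXTRA_FIELD_RATE * extra_fields
    return rate_per_minute * Decimal(str(minutes))


# A 10-minute recording with the 5-field General Audio blueprint.
print(estimate_audio_custom_output_cost(10, 5))
# The same recording with a 32-field blueprint (2 fields over the limit).
print(estimate_audio_custom_output_cost(10, 32))
```

Using `Decimal` avoids floating-point rounding noise when adding up many small per-minute charges.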
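
Summary
That was "Amazon Bedrock Data Automation now supports custom output (custom blueprints) for audio files."
Transcribing audio and then running an LLM over the transcript to extract information is a familiar two-step job, and being able to do it in one step is very convenient. It would be even nicer if we could choose the model used for transcription.
This was Takakuni (@takakuni_) from the Consulting Department of the Cloud Business Division!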
