Trying Out Amazon Bedrock Data Automation with Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation

2025.03.10

Hello! This is Takakuni (@takakuni_) from the Consulting Department, Cloud Business Division.

Amazon Bedrock Data Automation (BDA from here on) recently became generally available.

https://dev.classmethod.jp/articles/amazon-bedrock-data-automation-general-availability-overview/

I wanted to try it on a concrete use case, and while looking through the AWS sample repositories I found Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation.

https://github.com/aws-solutions-library-samples/guidance-for-multimodal-data-processing-using-amazon-bedrock-data-automation/tree/main

There is also a Japanese Solutions Library page.

https://aws.amazon.com/jp/solutions/guidance/multimodal-data-processing-using-amazon-bedrock-data-automation/

In this post, I use this guidance to try out BDA.

Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation

As the name suggests, Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation is a repository of guidance built on BDA.

The repository covers the following two use cases.

  • Automated Lending Flow
  • Intelligent Claims Review

Automated Lending Flow

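Automated Lending Flow assumes a financial lending system. It builds a workflow that uses BDA to extract data from W2 forms (the wage and tax statement used in the United States), pay stubs, driver's licenses, 1099 forms (income statements for freelancers), and bank statements.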

image-8.png
Image quoted from Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation

Intelligent Claims Review

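Intelligent Claims Review assumes a health insurance system built on BDA. In this workflow, BDA extracts the relevant data from claim documents and images, and Amazon Bedrock Agents then uses the extracted data to judge eligibility for coverage. That is a fairly full-fledged setup.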

image-8.png
Image quoted from Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation

Trying it out

This time I try Automated Lending Flow, which is serverless and looked quick to get running.

I downloaded the source from GitHub and deployed the resources by following the deployment steps. I used the Oregon region (us-west-2) and ran the deployment from CloudShell.

git clone https://github.com/aws-solutions-library-samples/guidance-for-multimodal-data-processing-using-amazon-bedrock-data-automation.git

cd guidance-for-multimodal-data-processing-using-amazon-bedrock-data-automation/deployment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

pip install --upgrade pip
pip install -r requirements.txt

cd lambda/lending_flow/layer/
pip install -r requirements.txt --target python
cd ../../..

sudo npm install -g aws-cdk

cdk bootstrap
cdk deploy lending-flow --require-approval never --context data_project_name=my-lending-project

https://github.com/aws-solutions-library-samples/guidance-for-multimodal-data-processing-using-amazon-bedrock-data-automation/blob/main/deployment/docs/a_lending_01_deployment.md

The resource deployment finished in roughly 2 minutes.

 ✅  lending-flow

✨  Deployment time: 112.68s

Outputs:
lending-flow.lendingflowbucket = lending-flow-bucket43879c71-nhbtpemiwinc
Stack ARN:
arn:aws:cloudformation:us-west-2:622809842341:stack/lending-flow/d20c7280-fd8a-11ef-bd5f-02311b204b6d

✨  Total time: 129.57s
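The bucket name in the Outputs section is needed later for the upload step. As a minimal sketch, it could also be read programmatically from the stack outputs instead of being copied from the terminal. The helper names below are hypothetical; the stack name lending-flow, the output key lendingflowbucket, and the region are taken from the output above, and fetch_bucket_name assumes boto3 and AWS credentials are available.

```python
def get_stack_output(describe_stacks_response, output_key):
    """Return the OutputValue for output_key from a describe_stacks response."""
    for stack in describe_stacks_response.get("Stacks", []):
        for output in stack.get("Outputs", []):
            if output.get("OutputKey") == output_key:
                return output["OutputValue"]
    raise KeyError(f"stack output not found: {output_key}")


def fetch_bucket_name(stack_name="lending-flow", region="us-west-2"):
    """Look up the lending-flow bucket name via CloudFormation (needs credentials)."""
    import boto3  # imported lazily so get_stack_output has no AWS dependency

    cfn = boto3.client("cloudformation", region_name=region)
    response = cfn.describe_stacks(StackName=stack_name)
    return get_stack_output(response, "lendingflowbucket")
```

Calling fetch_bucket_name() in the same account and region as the deployment should return the value shown next to lending-flow.lendingflowbucket.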

Creating the project

Create a project from the BDA console.

2025-03-10 at 18.05.06-データオートメーション プロジェクト  Amazon Bedrock  us-west-2@2x.png

For the name, use my-lending-project, the value specified in the deployment command.

2025-03-10 at 18.05.16-データオートメーション プロジェクト  Amazon Bedrock  us-west-2@2x.png

For custom output, in addition to the predefined data formats (blueprints), I also use the approach where a prompt determines the data format to extract.

First, specify the predefined data formats. Go to the custom output screen.

2025-03-10 at 18.05.39-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

Select the following blueprints.

  • Payslip
  • US-Driver-License
  • US-Bank-Check
  • Bank-Statement
  • W2-Form

2025-03-10 at 18.05.58-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

Next comes the custom blueprint. Click Create blueprint.

2025-03-10 at 18.08.46-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

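A custom blueprint is defined in one of two ways: from a document plus a prompt, or by defining the data fields you want to extract yourself. This time I let a document plus a prompt determine the fields. First, upload the source document.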

2025-03-10 at 18.10.24-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

An S3 bucket appears to be created automatically at upload time.

2025-03-10 at 18.10.30-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

After uploading the document, define the blueprint prompt. In this prompt, describe what kind of data the document contains and what data you want out of it.

This time I entered "This is an homeowner insurance form. Please extract all the keys and values from the form." in the prompt field.

2025-03-10 at 18.10.45-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

Once the extraction is complete, click Create blueprint.

2025-03-10 at 18.12.03-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

The custom blueprint can now extract 44 fields.

2025-03-10 at 18.17.39-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

If everything looks fine, associate the custom blueprint with the project.

2025-03-10 at 18.17.46-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

I uploaded the sample documents. The appropriate data is extracted for each document.

2025-03-10 at 18.27.52-データオートメーション プロジェクトの詳細  Amazon Bedrock  us-west-2@2x.png

Uploading to S3

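Upload test data to the S3 bucket. This time I uploaded two files.

  • lending_package_w2.pdf
    • A PDF document containing only a W2 (the wage and tax statement used in the United States)
  • lending_package.pdf
    • A PDF document in which a series of documents, such as pay stubs and a driver's license, are concatenated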

2025-03-10 at 18.30.28-オブジェクトをアップロード - S3 バケット lending-flow-bucket43879c71-nhbtpemiwinc  S3  us-west-2@2x.png
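The same upload can be done from a script instead of the console. The following is a minimal sketch: upload_documents is a hypothetical helper, and the prefix parameter is an assumption, since this post does not state whether the workflow expects objects under a particular key prefix. It takes any object with an upload_file(Filename, Bucket, Key) method, such as a boto3 S3 client.

```python
from pathlib import Path


def upload_documents(s3_client, bucket, paths, prefix=""):
    """Upload each local file to the bucket and return the object keys used.

    s3_client is expected to behave like boto3.client("s3").
    prefix is an assumed, optional key prefix (empty by default).
    """
    uploaded_keys = []
    for path in paths:
        # Use the file name (without local directories) as the object key.
        key = f"{prefix}{Path(path).name}"
        s3_client.upload_file(str(path), bucket, key)
        uploaded_keys.append(key)
    return uploaded_keys
```

With credentials configured, it could be called as upload_documents(boto3.client("s3"), "lending-flow-bucket43879c71-nhbtpemiwinc", ["lending_package_w2.pdf", "lending_package.pdf"]), using the bucket name from the deployment output.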

A few minutes later, the output for the test data had been written.

For each field, the output records the extracted value along with where it sits on the page, expressed as X and Y coordinates.
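To work with this output in code, the explainability_info section can be walked recursively. The sketch below uses hypothetical helper names and rests on two assumptions read off the sample output rather than confirmed against the BDA documentation: a leaf field is any object that has both confidence and value, and the boundingBox values are fractions of the page size measured from the top-left corner.

```python
def iter_fields(node, path=""):
    """Yield (path, leaf) for every leaf field in an explainability_info tree."""
    if isinstance(node, dict):
        # Assumption: a leaf is an object carrying both confidence and value.
        if "confidence" in node and "value" in node:
            yield path, node
            return
        for key, child in node.items():
            yield from iter_fields(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_fields(child, f"{path}[{index}]")


def low_confidence(explainability_info, threshold):
    """Return (path, value, confidence) for fields below the threshold."""
    return [
        (path, leaf["value"], leaf["confidence"])
        for path, leaf in iter_fields(explainability_info)
        if leaf["confidence"] < threshold
    ]


def to_pixels(bounding_box, page_width, page_height):
    """Convert a normalized boundingBox to (left, top, right, bottom) pixels."""
    left = bounding_box["left"] * page_width
    top = bounding_box["top"] * page_height
    right = left + bounding_box["width"] * page_width
    bottom = top + bounding_box["height"] * page_height
    return (round(left), round(top), round(right), round(bottom))
```

For example, low_confidence(result["explainability_info"], 0.9) lists the fields worth a manual check, and to_pixels can be applied to each entry in a field's geometry list to draw its box on a rendered page image.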

W2-Form.json
{
  "matched_blueprint": {
    "arn": "arn:aws:bedrock:us-west-2:aws:blueprint/bedrock-data-automation-public-w2-form",
    "name": "W2-Form",
    "confidence": 0.9997794
  },
  "document_class": {
    "type": "W2"
  },
  "split_document": {
    "page_indices": [
      0
    ]
  },
  "inference_result": {
    "employer_info": {
      "employer_address": "100 Main Street, Anytown, USA",
      "control_number": "753951852",
      "employer_name": "John Stiles",
      "ein": "4963147952",
      "employer_zip_code": ""
    },
    "filing_info": {
      "omb_number": "1545-0008",
      "verification_code": ""
    },
    "codes": [
      {
        "amount": 500,
        "code": "A"
      },
      {
        "amount": 1500,
        "code": "C"
      },
      {
        "amount": 500,
        "code": "A"
      },
      {
        "amount": 1000,
        "code": "B"
      }
    ],
    "other": "NA",
    "federal_tax_info": {
      "federal_income_tax": 500,
      "allocated_tips": 150,
      "social_security_tax": 100,
      "medicare_tax": 5000
    },
    "state_taxes_table": [
      {
        "state_name": "Any Town",
        "local_wages_tips": 100,
        "employer_state_id_number": 7414568313,
        "state_wages_and_tips": 50,
        "state_income_tax": 500,
        "local_income_tax": 550,
        "locality_name": "Any Town"
      }
    ],
    "employee_general_info": {
      "employee_name_suffix": "M",
      "employee_address": "123 Any Street, Any Town, USA",
      "employee_last_name": "Desai",
      "employee_zip_code": "",
      "first_name": "Arnav",
      "ssn": "753-95-184"
    },
    "federal_wage_info": {
      "social_security_tips": 500,
      "wages_tips_other_compensation": 100,
      "medicare_wages_tips": 500,
      "social_security_wages": 1000
    },
    "nonqualified_plans_incom": 500
  },
  "explainability_info": [
    {
      "employer_info": {
        "employer_address": {
          "success": true,
          "confidence": 0.8203125,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.1425448047201295,
                "left": 0.14196817506397247,
                "width": 0.22920190634634713,
                "height": 0.012461373525628122
              },
              "vertices": [
                {
                  "x": 0.1419682326138685,
                  "y": 0.1425448047201295
                },
                {
                  "x": 0.3711700814103196,
                  "y": 0.1425466427408256
                },
                {
                  "x": 0.37116998621120517,
                  "y": 0.15500617824575763
                },
                {
                  "x": 0.14196817506397247,
                  "y": 0.1550043520237203
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "100 Main Street, Anytown, USA"
        },
        "control_number": {
          "success": true,
          "confidence": 0.96875,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.2110726215079758,
                "left": 0.18815032410208182,
                "width": 0.07354984704003364,
                "height": 0.009039828008477885
              },
              "vertices": [
                {
                  "x": 0.18815037135748738,
                  "y": 0.2110726215079758
                },
                {
                  "x": 0.26170017114211547,
                  "y": 0.21107319049701453
                },
                {
                  "x": 0.26170011512173585,
                  "y": 0.22011244951645367
                },
                {
                  "x": 0.18815032410208182,
                  "y": 0.22011188327421105
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "753951852"
        },
        "employer_name": {
          "success": true,
          "confidence": 0.93359375,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.12678711041140112,
                "left": 0.1387916843766421,
                "width": 0.0795365760925506,
                "height": 0.010017143136122258
              },
              "vertices": [
                {
                  "x": 0.1387917302227856,
                  "y": 0.12678711041140112
                },
                {
                  "x": 0.2183282604691927,
                  "y": 0.12678775341053292
                },
                {
                  "x": 0.21832820411992376,
                  "y": 0.13680425354752337
                },
                {
                  "x": 0.1387916843766421,
                  "y": 0.1368036138399013
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "John Stiles"
        },
        "ein": {
          "success": true,
          "confidence": 0.94140625,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.09136366114753505,
                "left": 0.2349443348793075,
                "width": 0.082468777029099,
                "height": 0.008978887998067744
              },
              "vertices": [
                {
                  "x": 0.2349443873542955,
                  "y": 0.09136366114753505
                },
                {
                  "x": 0.3174131119084065,
                  "y": 0.09136433992102652
                },
                {
                  "x": 0.31741304967196393,
                  "y": 0.1003425491456028
                },
                {
                  "x": 0.2349443348793075,
                  "y": 0.10034187343119876
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "4963147952"
        },
        "employer_zip_code": {
          "success": true,
          "confidence": 0.94140625,
          "type": "number",
          "value": ""
        }
      },
      "filing_info": {
        "omb_number": {
          "success": true,
          "confidence": 0.96875,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.062263206756885306,
                "left": 0.5229129222276477,
                "width": 0.054734764738409214,
                "height": 0.007237991030641294
              },
              "vertices": [
                {
                  "x": 0.5229129920060516,
                  "y": 0.062263206756885306
                },
                {
                  "x": 0.577647686966057,
                  "y": 0.06226366384163711
                },
                {
                  "x": 0.5776476119650361,
                  "y": 0.0695011977875266
                },
                {
                  "x": 0.5229129222276477,
                  "y": 0.06950074233946638
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "1545-0008"
        },
        "verification_code": {
          "success": true,
          "confidence": 0.953125,
          "type": "string",
          "value": ""
        }
      },
      "codes": [
        {
          "amount": {
            "success": true,
            "confidence": 0.96484375,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.24381432263048575,
                  "left": 0.8361683973942627,
                  "width": 0.052291069842190896,
                  "height": 0.010749749203600112
                },
                "vertices": [
                  {
                    "x": 0.8361685454243932,
                    "y": 0.24381432263048575
                  },
                  {
                    "x": 0.8884594672364536,
                    "y": 0.24381472008624758
                  },
                  {
                    "x": 0.8884593117958847,
                    "y": 0.25456407183408586
                  },
                  {
                    "x": 0.8361683973942627,
                    "y": 0.2545636767006416
                  }
                ],
                "page": 1
              }
            ],
            "type": "number",
            "value": 500
          },
          "code": {
            "success": true,
            "confidence": 0.92578125,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.2430809066358361,
                  "left": 0.7697051779833031,
                  "width": 0.010262828357552323,
                  "height": 0.009039310072809614
                },
                "vertices": [
                  {
                    "x": 0.769705294542765,
                    "y": 0.2430809066358361
                  },
                  {
                    "x": 0.7799680063408554,
                    "y": 0.243080984672305
                  },
                  {
                    "x": 0.779967888558386,
                    "y": 0.2521202167086457
                  },
                  {
                    "x": 0.7697051779833031,
                    "y": 0.2521201390554483
                  }
                ],
                "page": 1
              }
            ],
            "type": "string",
            "value": "A"
          }
        },
        {
          "amount": {
            "success": true,
            "confidence": 0.953125,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.2721535295582447,
                  "left": 0.8371454072444379,
                  "width": 0.06011024835694867,
                  "height": 0.01074979317532615
                },
                "vertices": [
                  {
                    "x": 0.8371455554129705,
                    "y": 0.2721535295582447
                  },
                  {
                    "x": 0.8972556556013865,
                    "y": 0.27215397940864533
                  },
                  {
                    "x": 0.8972554989143194,
                    "y": 0.28290332273357083
                  },
                  {
                    "x": 0.8371454072444379,
                    "y": 0.28290287555274607
                  }
                ],
                "page": 1
              }
            ],
            "type": "number",
            "value": 1500
          },
          "code": {
            "success": true,
            "confidence": 0.81640625,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.2736188161090357,
                  "left": 0.7657951815180352,
                  "width": 0.009774123730187934,
                  "height": 0.009039298009431929
                },
                "vertices": [
                  {
                    "x": 0.7657952976114959,
                    "y": 0.2736188161090357
                  },
                  {
                    "x": 0.7755693052482231,
                    "y": 0.2736188891963082
                  },
                  {
                    "x": 0.7755691879899941,
                    "y": 0.28265811411846764
                  },
                  {
                    "x": 0.7657951815180352,
                    "y": 0.2826580413962151
                  }
                ],
                "page": 1
              }
            ],
            "type": "string",
            "value": "C"
          }
        },
        {
          "amount": {
            "success": true,
            "confidence": 0.92578125,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.3034243242441352,
                  "left": 0.8342127770604216,
                  "width": 0.052779729036450784,
                  "height": 0.010749723089211571
                },
                "vertices": [
                  {
                    "x": 0.8342129248132941,
                    "y": 0.3034243242441352
                  },
                  {
                    "x": 0.8869925060968724,
                    "y": 0.30342471241579466
                  },
                  {
                    "x": 0.8869923508643167,
                    "y": 0.31417404733334675
                  },
                  {
                    "x": 0.8342127770604216,
                    "y": 0.31417366150570136
                  }
                ],
                "page": 1
              }
            ],
            "type": "number",
            "value": 500
          },
          "code": {
            "success": true,
            "confidence": 0.9140625,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.30537827192579386,
                  "left": 0.7692156745449379,
                  "width": 0.010751520052441399,
                  "height": 0.009039296173577516
                },
                "vertices": [
                  {
                    "x": 0.7692157910459699,
                    "y": 0.30537827192579386
                  },
                  {
                    "x": 0.7799671945973793,
                    "y": 0.3053783509110394
                  },
                  {
                    "x": 0.7799670768151035,
                    "y": 0.3144175680993714
                  },
                  {
                    "x": 0.7692156745449379,
                    "y": 0.3144174895156471
                  }
                ],
                "page": 1
              }
            ],
            "type": "string",
            "value": "A"
          }
        },
        {
          "amount": {
            "success": true,
            "confidence": 0.95703125,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.3298091336201488,
                  "left": 0.8449638037925715,
                  "width": 0.060110203283513464,
                  "height": 0.010994065010677123
                },
                "vertices": [
                  {
                    "x": 0.8449639564616249,
                    "y": 0.3298091336201488
                  },
                  {
                    "x": 0.905074007076085,
                    "y": 0.32980956915190474
                  },
                  {
                    "x": 0.9050738456949088,
                    "y": 0.3408031986308259
                  },
                  {
                    "x": 0.8449638037925715,
                    "y": 0.3408027658293095
                  }
                ],
                "page": 1
              }
            ],
            "type": "number",
            "value": 1000
          },
          "code": {
            "success": true,
            "confidence": 0.94921875,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.33225165763053627,
                  "left": 0.7750797243021049,
                  "width": 0.008796719801318131,
                  "height": 0.009283577245226049
                },
                "vertices": [
                  {
                    "x": 0.7750798446694762,
                    "y": 0.33225165763053627
                  },
                  {
                    "x": 0.783876444103423,
                    "y": 0.3322517212781527
                  },
                  {
                    "x": 0.78387632265943,
                    "y": 0.3415352348757623
                  },
                  {
                    "x": 0.7750797243021049,
                    "y": 0.34153517156554153
                  }
                ],
                "page": 1
              }
            ],
            "type": "string",
            "value": "B"
          }
        }
      ],
      "other": {
        "success": true,
        "confidence": 0.92578125,
        "geometry": [
          {
            "boundingBox": {
              "top": 0.32565424368883483,
              "left": 0.6055008043968028,
              "width": 0.020525525221127605,
              "height": 0.009039367175182944
            },
            "vertices": [
              {
                "x": 0.6055009013879056,
                "y": 0.32565424368883483
              },
              {
                "x": 0.6260263296179304,
                "y": 0.32565439275950836
              },
              {
                "x": 0.6260262301808114,
                "y": 0.3346936108640178
              },
              {
                "x": 0.6055008043968028,
                "y": 0.3346934625598849
              }
            ],
            "page": 1
          }
        ],
        "type": "string",
        "value": "NA"
      },
      "federal_tax_info": {
        "federal_income_tax": {
          "success": true,
          "confidence": 0.96484375,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.09118505139547091,
                "left": 0.7960969931019777,
                "width": 0.05277988713220094,
                "height": 0.010688755109531975
              },
              "vertices": [
                {
                  "x": 0.796097134644833,
                  "y": 0.09118505139547091
                },
                {
                  "x": 0.8488768802341786,
                  "y": 0.09118548584849973
                },
                {
                  "x": 0.8488767312540919,
                  "y": 0.10187380650500288
                },
                {
                  "x": 0.7960969931019777,
                  "y": 0.10187337438269675
                }
              ],
              "page": 1
            }
          ],
          "type": "number",
          "value": 500
        },
        "allocated_tips": {
          "success": true,
          "confidence": 0.97265625,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.18176088183930464,
                "left": 0.7990279975953933,
                "width": 0.052779824052286206,
                "height": 0.010749785665361056
              },
              "vertices": [
                {
                  "x": 0.7990281403622645,
                  "y": 0.18176088183930464
                },
                {
                  "x": 0.8518078216476795,
                  "y": 0.1817612965411552
                },
                {
                  "x": 0.8518076714010966,
                  "y": 0.1925106675046657
                },
                {
                  "x": 0.7990279975953933,
                  "y": 0.19251025514684497
                }
              ],
              "page": 1
            }
          ],
          "type": "number",
          "value": 150
        },
        "social_security_tax": {
          "success": true,
          "confidence": 0.9609375,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.12239497379222893,
                "left": 0.7995174872298415,
                "width": 0.05277986521708289,
                "height": 0.010749815415356898
              },
              "vertices": [
                {
                  "x": 0.7995176300661928,
                  "y": 0.12239497379222893
                },
                {
                  "x": 0.8522973524469244,
                  "y": 0.12239540143953916
                },
                {
                  "x": 0.8522972021308497,
                  "y": 0.13314478920758582
                },
                {
                  "x": 0.7995174872298415,
                  "y": 0.13314436390431275
                }
              ],
              "page": 1
            }
          ],
          "type": "number",
          "value": 100
        },
        "medicare_tax": {
          "success": true,
          "confidence": 0.95703125,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.15171144609849258,
                "left": 0.7946300877618817,
                "width": 0.06108776054782328,
                "height": 0.010749866869923269
              },
              "vertices": [
                {
                  "x": 0.794630229905556,
                  "y": 0.15171144609849258
                },
                {
                  "x": 0.855717848309705,
                  "y": 0.15171193366155655
                },
                {
                  "x": 0.8557176975089502,
                  "y": 0.16246131296841584
                },
                {
                  "x": 0.7946300877618817,
                  "y": 0.16246082811835366
                }
              ],
              "page": 1
            }
          ],
          "type": "number",
          "value": 5000
        }
      },
      "state_taxes_table": [
        {
          "state_name": {
            "success": true,
            "confidence": 0.9453125,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.4101792993399344,
                  "left": 0.05992608586969041,
                  "width": 0.04343344317585837,
                  "height": 0.007818002004955082
                },
                "vertices": [
                  {
                    "x": 0.0599261135235259,
                    "y": 0.4101792993399344
                  },
                  {
                    "x": 0.10335952904554878,
                    "y": 0.41017959961707895
                  },
                  {
                    "x": 0.10335949691519096,
                    "y": 0.4179973013448895
                  },
                  {
                    "x": 0.05992608586969041,
                    "y": 0.41799700247060323
                  }
                ],
                "page": 1
              }
            ],
            "type": "string",
            "value": "Any Town"
          },
          "local_wages_tips": {
            "success": true,
            "confidence": 0.96484375,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.4062741397593082,
                  "left": 0.5927936809515684,
                  "width": 0.07037296455353081,
                  "height": 0.014170039542523383
                },
                "vertices": [
                  {
                    "x": 0.592793830617735,
                    "y": 0.4062741397593082
                  },
                  {
                    "x": 0.6631666455050992,
                    "y": 0.40627462741859327
                  },
                  {
                    "x": 0.6631664826928081,
                    "y": 0.4204441793018316
                  },
                  {
                    "x": 0.5927936809515684,
                    "y": 0.4204436957623141
                  }
                ],
                "page": 1
              }
            ],
            "type": "number",
            "value": 100
          },
          "employer_state_id_number": {
            "success": true,
            "confidence": 0.9375,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.40846978914779675,
                  "left": 0.14844226266294547,
                  "width": 0.08149107901047295,
                  "height": 0.009284080150678609
                },
                "vertices": [
                  {
                    "x": 0.14844230633548122,
                    "y": 0.40846978914779675
                  },
                  {
                    "x": 0.22993334167341842,
                    "y": 0.4084703531122496
                  },
                  {
                    "x": 0.2299332880271013,
                    "y": 0.41775386929847536
                  },
                  {
                    "x": 0.14844226266294547,
                    "y": 0.41775330845962133
                  }
                ],
                "page": 1
              }
            ],
            "type": "number",
            "value": 7414568313
          },
          "state_wages_and_tips": {
            "success": true,
            "confidence": 0.95703125,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.40651657001689195,
                  "left": 0.3225421922381328,
                  "width": 0.059132946293216526,
                  "height": 0.014169977981556559
                },
                "vertices": [
                  {
                    "x": 0.32254229141938756,
                    "y": 0.40651657001689195
                  },
                  {
                    "x": 0.3816751385313493,
                    "y": 0.406516979727401
                  },
                  {
                    "x": 0.38167502830364813,
                    "y": 0.4206865479984485
                  },
                  {
                    "x": 0.3225421922381328,
                    "y": 0.4206861417496965
                  }
                ],
                "page": 1
              }
            ],
            "type": "number",
            "value": 50
          },
          "state_income_tax": {
            "success": true,
            "confidence": 0.96484375,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.4062732423300216,
                  "left": 0.4632880686399044,
                  "width": 0.07037301782962863,
                  "height": 0.014414349827531647
                },
                "vertices": [
                  {
                    "x": 0.46328819627684825,
                    "y": 0.4062732423300216
                  },
                  {
                    "x": 0.533661086469533,
                    "y": 0.4062737299898285
                  },
                  {
                    "x": 0.5336609454597789,
                    "y": 0.42068759215755325
                  },
                  {
                    "x": 0.4632880686399044,
                    "y": 0.42068710868854875
                  }
                ],
                "page": 1
              }
            ],
            "type": "number",
            "value": 500
          },
          "local_income_tax": {
            "success": true,
            "confidence": 0.96875,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.40651942105611816,
                  "left": 0.7340279400629971,
                  "width": 0.07037290858511125,
                  "height": 0.014170031111511494
                },
                "vertices": [
                  {
                    "x": 0.7340281161126886,
                    "y": 0.40651942105611816
                  },
                  {
                    "x": 0.8044008486481083,
                    "y": 0.4065199086438038
                  },
                  {
                    "x": 0.8044006594523233,
                    "y": 0.42068945216762965
                  },
                  {
                    "x": 0.7340279400629971,
                    "y": 0.4206889686997068
                  }
                ],
                "page": 1
              }
            ],
            "type": "number",
            "value": 550
          },
          "locality_name": {
            "success": true,
            "confidence": 0.9609375,
            "geometry": [
              {
                "boundingBox": {
                  "top": 0.4074975349501956,
                  "left": 0.8645106996602827,
                  "width": 0.05815532886844066,
                  "height": 0.010261101883707824
                },
                "vertices": [
                  {
                    "x": 0.8645108447954342,
                    "y": 0.4074975349501956
                  },
                  {
                    "x": 0.9226660285287234,
                    "y": 0.40749793765203246
                  },
                  {
                    "x": 0.9226658755267086,
                    "y": 0.4177586368339034
                  },
                  {
                    "x": 0.8645106996602827,
                    "y": 0.41775823659741057
                  }
                ],
                "page": 1
              }
            ],
            "type": "string",
            "value": "Any Town"
          }
        }
      ],
      "employee_general_info": {
        "employee_name_suffix": {
          "success": true,
          "confidence": 0.9453125,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.24637705778292,
                "left": 0.513137102879782,
                "width": 0.011240218714952954,
                "height": 0.008672870445067565
              },
              "vertices": [
                {
                  "x": 0.5131371853781364,
                  "y": 0.24637705778292
                },
                {
                  "x": 0.5243773215947349,
                  "y": 0.24637714309845343
                },
                {
                  "x": 0.5243772378111943,
                  "y": 0.25504992822798755
                },
                {
                  "x": 0.513137102879782,
                  "y": 0.25504984331521063
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "M"
        },
        "employee_address": {
          "success": true,
          "confidence": 0.890625,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.3158787194915841,
                "left": 0.14086779816172376,
                "width": 0.20855373024524396,
                "height": 0.011483788824680041
              },
              "vertices": [
                {
                  "x": 0.14086785103120755,
                  "y": 0.3158787194915841
                },
                {
                  "x": 0.3494215284069677,
                  "y": 0.31588024258088354
                },
                {
                  "x": 0.3494214439669128,
                  "y": 0.3273625083162641
                },
                {
                  "x": 0.14086779816172376,
                  "y": 0.32736099512062705
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "123 Any Street, Any Town, USA"
        },
        "employee_last_name": {
          "success": true,
          "confidence": 0.9375,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.2488184736161942,
                "left": 0.29908562045038717,
                "width": 0.036408386404964355,
                "height": 0.00891737127269776
              },
              "vertices": [
                {
                  "x": 0.29908568010875053,
                  "y": 0.2488184736161942
                },
                {
                  "x": 0.3354940068553515,
                  "y": 0.2488187495973069
                },
                {
                  "x": 0.3354939429168238,
                  "y": 0.25773584488889195
                },
                {
                  "x": 0.29908562045038717,
                  "y": 0.25773557024911087
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "Desai"
        },
        "employee_zip_code": {
          "success": true,
          "confidence": 0.94140625,
          "type": "number",
          "value": ""
        },
        "first_name": {
          "success": true,
          "confidence": 0.890625,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.25064926020277745,
                "left": 0.10183295115118576,
                "width": 0.03854649714996286,
                "height": 0.008795241877087745
              },
              "vertices": [
                {
                  "x": 0.10183298712091159,
                  "y": 0.25064926020277745
                },
                {
                  "x": 0.14037944830114862,
                  "y": 0.2506495520995062
                },
                {
                  "x": 0.14037940786196776,
                  "y": 0.2594445020798652
                },
                {
                  "x": 0.10183295115118576,
                  "y": 0.25944421158378617
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "Arnav"
        },
        "ssn": {
          "confidence": 0.96875,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.06226135188851642,
                "left": 0.30079733092935684,
                "width": 0.0894327933250581,
                "height": 0.008948422120915803
              },
              "vertices": [
                {
                  "x": 0.30079739099409536,
                  "y": 0.06226135188851642
                },
                {
                  "x": 0.39023012425441495,
                  "y": 0.0622620987336156
                },
                {
                  "x": 0.39023005363993124,
                  "y": 0.07120977400943222
                },
                {
                  "x": 0.30079733092935684,
                  "y": 0.07120903047046287
                }
              ],
              "page": 1
            }
          ],
          "type": "string",
          "value": "753-95-184"
        }
      },
      "federal_wage_info": {
        "social_security_tips": {
          "success": true,
          "confidence": 0.9609375,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.18175926526832722,
                "left": 0.5932847914320433,
                "width": 0.052779883331533606,
                "height": 0.010627642773960555
              },
              "vertices": [
                {
                  "x": 0.593284903750808,
                  "y": 0.18175926526832722
                },
                {
                  "x": 0.6460646747635769,
                  "y": 0.18175967997088277
                },
                {
                  "x": 0.6460645550500721,
                  "y": 0.19238690804228778
                },
                {
                  "x": 0.5932847914320433,
                  "y": 0.1923864956571293
                }
              ],
              "page": 1
            }
          ],
          "type": "number",
          "value": 500
        },
        "wages_tips_other_compensation": {
          "success": true,
          "confidence": 0.96484375,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.0920385169909297,
                "left": 0.6015936774477677,
                "width": 0.05326864467117953,
                "height": 0.010566614957136636
              },
              "vertices": [
                {
                  "x": 0.601593790278587,
                  "y": 0.0920385169909297
                },
                {
                  "x": 0.6548623221189472,
                  "y": 0.09203895527919093
                },
                {
                  "x": 0.6548622018677936,
                  "y": 0.10260513194806634
                },
                {
                  "x": 0.6015936774477677,
                  "y": 0.10260469598522869
                }
              ],
              "page": 1
            }
          ],
          "type": "number",
          "value": 100
        },
        "medicare_wages_tips": {
          "success": true,
          "confidence": 0.95703125,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.1520763385466113,
                "left": 0.5986608236716688,
                "width": 0.05277990369432872,
                "height": 0.010749809450550235
              },
              "vertices": [
                {
                  "x": 0.5986609380433685,
                  "y": 0.1520763385466113
                },
                {
                  "x": 0.6514407273659976,
                  "y": 0.15207675972188475
                },
                {
                  "x": 0.6514406055145556,
                  "y": 0.16282614799716152
                },
                {
                  "x": 0.5986608236716688,
                  "y": 0.16282572916592541
                }
              ],
              "page": 1
            }
          ],
          "type": "number",
          "value": 500
        },
        "social_security_wages": {
          "success": true,
          "confidence": 0.96875,
          "geometry": [
            {
              "boundingBox": {
                "top": 0.1208052715842467,
                "left": 0.5869323080186079,
                "width": 0.06108786084087925,
                "height": 0.010749892654233068
              },
              "vertices": [
                {
                  "x": 0.5869324207282345,
                  "y": 0.1208052715842467
                },
                {
                  "x": 0.6480201688594871,
                  "y": 0.12080576694804955
                },
                {
                  "x": 0.6480200474927431,
                  "y": 0.13155516423847977
                },
                {
                  "x": 0.5869323080186079,
                  "y": 0.13155467158768777
                }
              ],
              "page": 1
            }
          ],
          "type": "number",
          "value": 1000
        }
      },
      "nonqualified_plans_incom": {
        "success": true,
        "confidence": 0.94921875,
        "geometry": [
          {
            "boundingBox": {
              "top": 0.24369042491050819,
              "left": 0.6064790665499286,
              "width": 0.053268539980232066,
              "height": 0.010871918675103787
            },
            "vertices": [
              {
                "x": 0.6064791833417267,
                "y": 0.24369042491050819
              },
              {
                "x": 0.6597476065301606,
                "y": 0.2436908298229998
              },
              {
                "x": 0.6597474821035988,
                "y": 0.25456234358561197
              },
              {
                "x": 0.6064790665499286,
                "y": 0.25456194106573354
              }
            ],
            "page": 1
          }
        ],
        "type": "number",
        "value": 500
      }
    }
  ]
}
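
Every extracted field in the custom output above is an object carrying `value`, `confidence`, and usually `geometry`. As a minimal sketch (not part of the guidance repository), the Python below walks such a structure and lists the fields that may deserve a manual look: those under a confidence threshold, or those with an empty value such as `employee_zip_code` above. The function names, the `0.9` threshold, and the trimmed `sample` are assumptions made for illustration.

```python
from typing import Any, Iterator


def iter_fields(node: Any, path: str = "") -> Iterator[tuple[str, dict]]:
    """Yield (path, field) for every dict that has both 'value' and 'confidence'."""
    if isinstance(node, dict):
        if "value" in node and "confidence" in node:
            yield path, node
            return  # do not descend into geometry etc.
        for key, child in node.items():
            yield from iter_fields(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_fields(child, f"{path}[{index}]")


def fields_to_review(result: Any, min_confidence: float = 0.9) -> list[tuple[str, Any, float]]:
    """Return fields whose confidence is below the threshold or whose value is empty."""
    flagged = []
    for path, field in iter_fields(result):
        is_empty = field["value"] in ("", None)
        if field["confidence"] < min_confidence or is_empty:
            flagged.append((path, field["value"], field["confidence"]))
    return flagged


# Trimmed sample: values copied from the output above, geometry omitted.
sample = {
    "employee_general_info": {
        "employee_address": {
            "success": True,
            "confidence": 0.890625,
            "type": "string",
            "value": "123 Any Street, Any Town, USA",
        },
        "employee_last_name": {
            "success": True,
            "confidence": 0.9375,
            "type": "string",
            "value": "Desai",
        },
        "employee_zip_code": {
            "success": True,
            "confidence": 0.94140625,
            "type": "number",
            "value": "",
        },
    },
    "federal_wage_info": {
        "social_security_wages": {
            "success": True,
            "confidence": 0.96875,
            "type": "number",
            "value": 1000,
        },
    },
}

for path, value, confidence in fields_to_review(sample):
    print(f"{path}: value={value!r} confidence={confidence}")
```

With this sample, the address is flagged for its confidence and the ZIP code for its empty value, even though the ZIP code's confidence is above the threshold. This suggests that confidence alone may not be enough to decide which fields to trust.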

Because the output is so large, the guidance also shows how to narrow it down to the key fields with jq.

~ $ find . -path "*/0/result.json" -exec jq '. | {matched_blueprint, document_class, split_document, inference_result} | with_entries(select(.value != null))' {} \;
{}
{
  "matched_blueprint": {
    "arn": "arn:aws:bedrock:us-west-2:622809842341:blueprint/494529f2245a",
    "name": "homeowners-insurance-application",
    "confidence": 0.16679549
  },
  "document_class": {
    "type": "default"
  },
  "split_document": {
    "page_indices": [
      0,
      1,
      2,
      3,
      4,
      5
    ]
  },
  "inference_result": {
    "Expiration Date": "20/10/2025",
    "Purchase Date and Time": "14/06/2009 09.30",
    "Policy Number": "45488257965",
    "Named Insured(s) and Mailing Address": "Alejandro Rosalez alejandrorosalez@example.com",
    "Insurance Company": "XYZ Insurance",
    "Co-Applicant Information": {
      "Drivers License Number": "1935478265",
      "Length of Time with Current Auto Carrier": "5 Years",
      "DL State": "WI",
      "Education Level": "Undergraduate",
      "Currently Insured- Auto": "Home",
      "Length of Time with Prior Auto Carrier": "3 Years",
      "Date of Birth": "16/07/1988",
      "Gender": "Male",
      "Marital Status": "Married",
      "Relationship to Primary Applicant": "Spouse",
      "Name": "Jane Doe"
    },
    "Insured Property": "Home",
    "Auto Claims, Accidents, and Violations": {
      "Major": "02",
      "Number of Comp Claims": "03",
      "Number of Violations": "02",
      "At-Fault": "03",
      "Number of Auto Accidents": "03",
      "Minor": "01",
      "Not-at-Fault": "01"
    },
    "Primary Phone #": "555-157-0100",
    "Effective Date": "20/10/2020",
    "Primary Email": "alejandrorosalez@example.com",
    "Alternate Phone #": "555-758-0100",
    "Primary Applicant Information": {
      "Type of Current Property Policy": "Home",
      "Drivers License Number": "7654825499",
      "Education Level": "Undergraduate",
      "Currently Insured Auto": "Home",
      "Length of Time with Prior Auto Carrier": "3 Years",
      "Gender": "Female",
      "Marital Status": "Married",
      "Name": "Alejandro Rosalez",
      "Length of Time with Current Auto Carrier": "5 Years",
      "Existing Esurance Policy": "Home Insurance",
      "DL State": "WI",
      "Date of Birth": "03/02/1990",
      "Years with Prior Property Company": "5 Years"
    }
  }
}
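
The jq filter above keeps four top-level keys and drops any that are null, which is presumably why a `result.json` containing none of them prints as `{}`. A rough Python equivalent, assuming the same directory layout (`*/0/result.json`) as the `find` command, might look like the following. `summarize`, `summarize_tree`, and the trimmed `sample` (including its `some_other_key` entry) are hypothetical names and data used only for illustration.

```python
import json
from pathlib import Path
from typing import Any

# The four keys selected by the jq filter.
KEYS = ("matched_blueprint", "document_class", "split_document", "inference_result")


def summarize(result: dict[str, Any]) -> dict[str, Any]:
    """Keep only the selected keys, skipping missing or null ones."""
    return {key: result[key] for key in KEYS if result.get(key) is not None}


def summarize_tree(root: str = ".") -> list[dict[str, Any]]:
    """Summarize every result.json whose parent directory is named '0'."""
    summaries = []
    for path in sorted(Path(root).rglob("result.json")):
        if path.parent.name == "0":
            data = json.loads(path.read_text(encoding="utf-8"))
            summaries.append(summarize(data))
    return summaries


# Trimmed sample based on the output above; "some_other_key" is hypothetical.
sample = {
    "matched_blueprint": {
        "name": "homeowners-insurance-application",
        "confidence": 0.16679549,
    },
    "document_class": {"type": "default"},
    "split_document": {"page_indices": [0, 1, 2, 3, 4, 5]},
    "inference_result": {"Policy Number": "45488257965"},
    "some_other_key": "dropped by summarize",
}

print(json.dumps(summarize(sample), indent=2))
print(summarize({}))  # a file with none of the keys yields an empty dict
```

The sample's `matched_blueprint.confidence` is only about 0.17, so it may be worth checking that value before relying on `inference_result`.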

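Summary

That wraps up "Trying out Amazon Bedrock Data Automation with Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation".

I think this is ideal content for anyone who prefers to learn by doing. Working through it sharpened my own understanding of custom blueprints considerably.

I hope this post is useful to someone.

This was Takakuni (@takakuni_) from the Consulting Department, Cloud Business Division!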

© Classmethod, Inc. All rights reserved.