Trying Out Amazon Bedrock Data Automation with Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation
Hello! This is Takakuni (@takakuni_) from the Consulting Department, Cloud Business Division.
Amazon Bedrock Data Automation (hereafter BDA) recently became generally available.
I wanted to try a concrete use case, and while looking through aws-samples I found Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation.
There is also a Japanese-language Solutions Library page for it.
In this post I'll use this guidance to try out BDA.
Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation
As the name suggests, Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation is a repository containing guidance built around BDA.
The repository covers the following two use cases.
- Automated Lending Flow
- Intelligent Claims Review
Automated Lending Flow
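Automated Lending Flow assumes a financial lending system. It uses BDA to build a workflow that extracts data from W2 forms (the wage and tax statement used in the United States), pay stubs, driver's licenses, 1099 forms (income statements for freelancers), and bank statements.
(Image source: Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation)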
Intelligent Claims Review
Intelligent Claims Review assumes a health insurance system. BDA extracts the relevant data from claim documents and images, and Amazon Bedrock Agents then uses the extracted data to determine eligibility for coverage. It's a fairly full-fledged setup.
(Image source: Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation)
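Trying it out
This time I'll try Automated Lending Flow, which is serverless and looked quick to get running.
Download the source from GitHub and deploy the resources by following the documented steps. I used the Oregon region (us-west-2) and ran the deployment from CloudShell.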
git clone https://github.com/aws-solutions-library-samples/guidance-for-multimodal-data-processing-using-amazon-bedrock-data-automation.git
cd guidance-for-multimodal-data-processing-using-amazon-bedrock-data-automation/deployment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install --upgrade pip
pip install -r requirements.txt
cd lambda/lending_flow/layer/
pip install -r requirements.txt --target python
cd ../../..
sudo npm install -g aws-cdk
cdk bootstrap
cdk deploy lending-flow --require-approval never --context data_project_name=my-lending-project
The deployment finished in roughly two minutes.
✅ lending-flow
✨ Deployment time: 112.68s
Outputs:
lending-flow.lendingflowbucket = lending-flow-bucket43879c71-nhbtpemiwinc
Stack ARN:
arn:aws:cloudformation:us-west-2:622809842341:stack/lending-flow/d20c7280-fd8a-11ef-bd5f-02311b204b6d
✨ Total time: 129.57s
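The bucket name in the Outputs section is needed later for the upload step. If you would rather not copy it from the terminal, a minimal sketch like the following could read it back from CloudFormation. It assumes boto3 is installed and credentials are configured. The stack name `lending-flow` and the region come from the output above; the output key `lendingflowbucket` is my reading of the `lending-flow.lendingflowbucket` line, and `get_stack_output` and `main` are hypothetical helper names, not part of the guidance.

```python
from typing import Any


def get_stack_output(cfn_client: Any, stack_name: str, output_key: str) -> str:
    """Return the value of one CloudFormation stack output.

    cfn_client is expected to behave like boto3.client("cloudformation").
    """
    response = cfn_client.describe_stacks(StackName=stack_name)
    outputs = response["Stacks"][0].get("Outputs", [])
    for output in outputs:
        if output["OutputKey"] == output_key:
            return output["OutputValue"]
    raise KeyError(f"{output_key!r} not found in outputs of stack {stack_name!r}")


def main() -> None:
    # Imported lazily so the helper above can be exercised without boto3.
    import boto3

    cfn = boto3.client("cloudformation", region_name="us-west-2")
    # "lendingflowbucket" is assumed from the "lending-flow.lendingflowbucket" line.
    print(get_stack_output(cfn, "lending-flow", "lendingflowbucket"))
```

Calling `main()` from CloudShell should print the bucket name if the assumed output key is right; passing the client in as an argument keeps the lookup easy to test with a stub.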
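Creating the project
Create a project from the BDA console.
For the name, use my-lending-project, the value passed at deploy time.
For custom output, in addition to the predefined data formats (blueprints), I'll also use the approach of deriving the extraction format from a prompt.
First, specify the predefined data formats. Go to the custom output screen.
Select the following blueprints.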
- Payslip
- US-Driver-License
- US-Bank-Check
- Bank-Statement
- W2-Form
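Next comes the custom blueprint. Click Create blueprint.
A custom blueprint is defined either from a document plus a prompt, or by defining the schema of the data you want to extract yourself. This time I'll let a document plus a prompt determine the schema. First, upload the source document.
An S3 bucket appears to be created automatically at upload time.
After uploading the document, define the blueprint prompt. The prompt describes what kind of document this is and what data you want from it.
I entered "This is an homeowner insurance form. Please extract all the keys and values from the form." in the prompt field.
Once extraction completes, click Create blueprint.
The custom blueprint can now extract 44 fields.
If everything looks fine, associate the custom blueprint with the project.
I tried uploading a sample document, and the appropriate data was extracted for each document type.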
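Uploading to S3
Upload the test data to the S3 bucket. I uploaded two files.
- lending_package_w2.pdf: a PDF containing only a W2 form (the wage and tax statement used in the United States)
- lending_package.pdf: a PDF in which pay stubs, a driver's license, and the other documents are bundled one after another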
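I did the upload by hand, but the same step could be scripted. The following is only a sketch: it assumes boto3 with configured credentials, and `upload_documents`, `build_key`, and the `prefix` parameter are hypothetical names of my own. This post does not show which key layout the sample's trigger expects, so check the repository's instructions before choosing a prefix.

```python
from pathlib import Path
from typing import Any, Iterable


def build_key(path: str, prefix: str = "") -> str:
    """Build an object key from a local file name and an optional prefix."""
    name = Path(path).name
    cleaned = prefix.strip("/")
    return f"{cleaned}/{name}" if cleaned else name


def upload_documents(
    s3_client: Any, bucket: str, paths: Iterable[str], prefix: str = ""
) -> list[str]:
    """Upload each file and return the object keys that were written.

    s3_client is expected to behave like boto3.client("s3").
    """
    keys = []
    for path in paths:
        key = build_key(path, prefix)
        # upload_file(Filename, Bucket, Key) is the boto3 S3 client call.
        s3_client.upload_file(str(path), bucket, key)
        keys.append(key)
    return keys


def main() -> None:
    # Imported lazily so the helpers above can be exercised without boto3.
    import boto3

    s3 = boto3.client("s3", region_name="us-west-2")
    # Bucket name taken from the deployment output earlier in this post.
    bucket = "lending-flow-bucket43879c71-nhbtpemiwinc"
    print(upload_documents(s3, bucket, ["lending_package_w2.pdf", "lending_package.pdf"]))
```

Returning the keys makes it easy to look up the matching output objects afterwards.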
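A few minutes later, the results for the test data had been written out.
For each field, the output records the extracted value along with where it appears on the page, expressed as X and Y coordinates.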
W2-Form.json
{
"matched_blueprint": {
"arn": "arn:aws:bedrock:us-west-2:aws:blueprint/bedrock-data-automation-public-w2-form",
"name": "W2-Form",
"confidence": 0.9997794
},
"document_class": {
"type": "W2"
},
"split_document": {
"page_indices": [
0
]
},
"inference_result": {
"employer_info": {
"employer_address": "100 Main Street, Anytown, USA",
"control_number": "753951852",
"employer_name": "John Stiles",
"ein": "4963147952",
"employer_zip_code": ""
},
"filing_info": {
"omb_number": "1545-0008",
"verification_code": ""
},
"codes": [
{
"amount": 500,
"code": "A"
},
{
"amount": 1500,
"code": "C"
},
{
"amount": 500,
"code": "A"
},
{
"amount": 1000,
"code": "B"
}
],
"other": "NA",
"federal_tax_info": {
"federal_income_tax": 500,
"allocated_tips": 150,
"social_security_tax": 100,
"medicare_tax": 5000
},
"state_taxes_table": [
{
"state_name": "Any Town",
"local_wages_tips": 100,
"employer_state_id_number": 7414568313,
"state_wages_and_tips": 50,
"state_income_tax": 500,
"local_income_tax": 550,
"locality_name": "Any Town"
}
],
"employee_general_info": {
"employee_name_suffix": "M",
"employee_address": "123 Any Street, Any Town, USA",
"employee_last_name": "Desai",
"employee_zip_code": "",
"first_name": "Arnav",
"ssn": "753-95-184"
},
"federal_wage_info": {
"social_security_tips": 500,
"wages_tips_other_compensation": 100,
"medicare_wages_tips": 500,
"social_security_wages": 1000
},
"nonqualified_plans_incom": 500
},
"explainability_info": [
{
"employer_info": {
"employer_address": {
"success": true,
"confidence": 0.8203125,
"geometry": [
{
"boundingBox": {
"top": 0.1425448047201295,
"left": 0.14196817506397247,
"width": 0.22920190634634713,
"height": 0.012461373525628122
},
"vertices": [
{
"x": 0.1419682326138685,
"y": 0.1425448047201295
},
{
"x": 0.3711700814103196,
"y": 0.1425466427408256
},
{
"x": 0.37116998621120517,
"y": 0.15500617824575763
},
{
"x": 0.14196817506397247,
"y": 0.1550043520237203
}
],
"page": 1
}
],
"type": "string",
"value": "100 Main Street, Anytown, USA"
},
"control_number": {
"success": true,
"confidence": 0.96875,
"geometry": [
{
"boundingBox": {
"top": 0.2110726215079758,
"left": 0.18815032410208182,
"width": 0.07354984704003364,
"height": 0.009039828008477885
},
"vertices": [
{
"x": 0.18815037135748738,
"y": 0.2110726215079758
},
{
"x": 0.26170017114211547,
"y": 0.21107319049701453
},
{
"x": 0.26170011512173585,
"y": 0.22011244951645367
},
{
"x": 0.18815032410208182,
"y": 0.22011188327421105
}
],
"page": 1
}
],
"type": "string",
"value": "753951852"
},
"employer_name": {
"success": true,
"confidence": 0.93359375,
"geometry": [
{
"boundingBox": {
"top": 0.12678711041140112,
"left": 0.1387916843766421,
"width": 0.0795365760925506,
"height": 0.010017143136122258
},
"vertices": [
{
"x": 0.1387917302227856,
"y": 0.12678711041140112
},
{
"x": 0.2183282604691927,
"y": 0.12678775341053292
},
{
"x": 0.21832820411992376,
"y": 0.13680425354752337
},
{
"x": 0.1387916843766421,
"y": 0.1368036138399013
}
],
"page": 1
}
],
"type": "string",
"value": "John Stiles"
},
"ein": {
"success": true,
"confidence": 0.94140625,
"geometry": [
{
"boundingBox": {
"top": 0.09136366114753505,
"left": 0.2349443348793075,
"width": 0.082468777029099,
"height": 0.008978887998067744
},
"vertices": [
{
"x": 0.2349443873542955,
"y": 0.09136366114753505
},
{
"x": 0.3174131119084065,
"y": 0.09136433992102652
},
{
"x": 0.31741304967196393,
"y": 0.1003425491456028
},
{
"x": 0.2349443348793075,
"y": 0.10034187343119876
}
],
"page": 1
}
],
"type": "string",
"value": "4963147952"
},
"employer_zip_code": {
"success": true,
"confidence": 0.94140625,
"type": "number",
"value": ""
}
},
"filing_info": {
"omb_number": {
"success": true,
"confidence": 0.96875,
"geometry": [
{
"boundingBox": {
"top": 0.062263206756885306,
"left": 0.5229129222276477,
"width": 0.054734764738409214,
"height": 0.007237991030641294
},
"vertices": [
{
"x": 0.5229129920060516,
"y": 0.062263206756885306
},
{
"x": 0.577647686966057,
"y": 0.06226366384163711
},
{
"x": 0.5776476119650361,
"y": 0.0695011977875266
},
{
"x": 0.5229129222276477,
"y": 0.06950074233946638
}
],
"page": 1
}
],
"type": "string",
"value": "1545-0008"
},
"verification_code": {
"success": true,
"confidence": 0.953125,
"type": "string",
"value": ""
}
},
"codes": [
{
"amount": {
"success": true,
"confidence": 0.96484375,
"geometry": [
{
"boundingBox": {
"top": 0.24381432263048575,
"left": 0.8361683973942627,
"width": 0.052291069842190896,
"height": 0.010749749203600112
},
"vertices": [
{
"x": 0.8361685454243932,
"y": 0.24381432263048575
},
{
"x": 0.8884594672364536,
"y": 0.24381472008624758
},
{
"x": 0.8884593117958847,
"y": 0.25456407183408586
},
{
"x": 0.8361683973942627,
"y": 0.2545636767006416
}
],
"page": 1
}
],
"type": "number",
"value": 500
},
"code": {
"success": true,
"confidence": 0.92578125,
"geometry": [
{
"boundingBox": {
"top": 0.2430809066358361,
"left": 0.7697051779833031,
"width": 0.010262828357552323,
"height": 0.009039310072809614
},
"vertices": [
{
"x": 0.769705294542765,
"y": 0.2430809066358361
},
{
"x": 0.7799680063408554,
"y": 0.243080984672305
},
{
"x": 0.779967888558386,
"y": 0.2521202167086457
},
{
"x": 0.7697051779833031,
"y": 0.2521201390554483
}
],
"page": 1
}
],
"type": "string",
"value": "A"
}
},
{
"amount": {
"success": true,
"confidence": 0.953125,
"geometry": [
{
"boundingBox": {
"top": 0.2721535295582447,
"left": 0.8371454072444379,
"width": 0.06011024835694867,
"height": 0.01074979317532615
},
"vertices": [
{
"x": 0.8371455554129705,
"y": 0.2721535295582447
},
{
"x": 0.8972556556013865,
"y": 0.27215397940864533
},
{
"x": 0.8972554989143194,
"y": 0.28290332273357083
},
{
"x": 0.8371454072444379,
"y": 0.28290287555274607
}
],
"page": 1
}
],
"type": "number",
"value": 1500
},
"code": {
"success": true,
"confidence": 0.81640625,
"geometry": [
{
"boundingBox": {
"top": 0.2736188161090357,
"left": 0.7657951815180352,
"width": 0.009774123730187934,
"height": 0.009039298009431929
},
"vertices": [
{
"x": 0.7657952976114959,
"y": 0.2736188161090357
},
{
"x": 0.7755693052482231,
"y": 0.2736188891963082
},
{
"x": 0.7755691879899941,
"y": 0.28265811411846764
},
{
"x": 0.7657951815180352,
"y": 0.2826580413962151
}
],
"page": 1
}
],
"type": "string",
"value": "C"
}
},
{
"amount": {
"success": true,
"confidence": 0.92578125,
"geometry": [
{
"boundingBox": {
"top": 0.3034243242441352,
"left": 0.8342127770604216,
"width": 0.052779729036450784,
"height": 0.010749723089211571
},
"vertices": [
{
"x": 0.8342129248132941,
"y": 0.3034243242441352
},
{
"x": 0.8869925060968724,
"y": 0.30342471241579466
},
{
"x": 0.8869923508643167,
"y": 0.31417404733334675
},
{
"x": 0.8342127770604216,
"y": 0.31417366150570136
}
],
"page": 1
}
],
"type": "number",
"value": 500
},
"code": {
"success": true,
"confidence": 0.9140625,
"geometry": [
{
"boundingBox": {
"top": 0.30537827192579386,
"left": 0.7692156745449379,
"width": 0.010751520052441399,
"height": 0.009039296173577516
},
"vertices": [
{
"x": 0.7692157910459699,
"y": 0.30537827192579386
},
{
"x": 0.7799671945973793,
"y": 0.3053783509110394
},
{
"x": 0.7799670768151035,
"y": 0.3144175680993714
},
{
"x": 0.7692156745449379,
"y": 0.3144174895156471
}
],
"page": 1
}
],
"type": "string",
"value": "A"
}
},
{
"amount": {
"success": true,
"confidence": 0.95703125,
"geometry": [
{
"boundingBox": {
"top": 0.3298091336201488,
"left": 0.8449638037925715,
"width": 0.060110203283513464,
"height": 0.010994065010677123
},
"vertices": [
{
"x": 0.8449639564616249,
"y": 0.3298091336201488
},
{
"x": 0.905074007076085,
"y": 0.32980956915190474
},
{
"x": 0.9050738456949088,
"y": 0.3408031986308259
},
{
"x": 0.8449638037925715,
"y": 0.3408027658293095
}
],
"page": 1
}
],
"type": "number",
"value": 1000
},
"code": {
"success": true,
"confidence": 0.94921875,
"geometry": [
{
"boundingBox": {
"top": 0.33225165763053627,
"left": 0.7750797243021049,
"width": 0.008796719801318131,
"height": 0.009283577245226049
},
"vertices": [
{
"x": 0.7750798446694762,
"y": 0.33225165763053627
},
{
"x": 0.783876444103423,
"y": 0.3322517212781527
},
{
"x": 0.78387632265943,
"y": 0.3415352348757623
},
{
"x": 0.7750797243021049,
"y": 0.34153517156554153
}
],
"page": 1
}
],
"type": "string",
"value": "B"
}
}
],
"other": {
"success": true,
"confidence": 0.92578125,
"geometry": [
{
"boundingBox": {
"top": 0.32565424368883483,
"left": 0.6055008043968028,
"width": 0.020525525221127605,
"height": 0.009039367175182944
},
"vertices": [
{
"x": 0.6055009013879056,
"y": 0.32565424368883483
},
{
"x": 0.6260263296179304,
"y": 0.32565439275950836
},
{
"x": 0.6260262301808114,
"y": 0.3346936108640178
},
{
"x": 0.6055008043968028,
"y": 0.3346934625598849
}
],
"page": 1
}
],
"type": "string",
"value": "NA"
},
"federal_tax_info": {
"federal_income_tax": {
"success": true,
"confidence": 0.96484375,
"geometry": [
{
"boundingBox": {
"top": 0.09118505139547091,
"left": 0.7960969931019777,
"width": 0.05277988713220094,
"height": 0.010688755109531975
},
"vertices": [
{
"x": 0.796097134644833,
"y": 0.09118505139547091
},
{
"x": 0.8488768802341786,
"y": 0.09118548584849973
},
{
"x": 0.8488767312540919,
"y": 0.10187380650500288
},
{
"x": 0.7960969931019777,
"y": 0.10187337438269675
}
],
"page": 1
}
],
"type": "number",
"value": 500
},
"allocated_tips": {
"success": true,
"confidence": 0.97265625,
"geometry": [
{
"boundingBox": {
"top": 0.18176088183930464,
"left": 0.7990279975953933,
"width": 0.052779824052286206,
"height": 0.010749785665361056
},
"vertices": [
{
"x": 0.7990281403622645,
"y": 0.18176088183930464
},
{
"x": 0.8518078216476795,
"y": 0.1817612965411552
},
{
"x": 0.8518076714010966,
"y": 0.1925106675046657
},
{
"x": 0.7990279975953933,
"y": 0.19251025514684497
}
],
"page": 1
}
],
"type": "number",
"value": 150
},
"social_security_tax": {
"success": true,
"confidence": 0.9609375,
"geometry": [
{
"boundingBox": {
"top": 0.12239497379222893,
"left": 0.7995174872298415,
"width": 0.05277986521708289,
"height": 0.010749815415356898
},
"vertices": [
{
"x": 0.7995176300661928,
"y": 0.12239497379222893
},
{
"x": 0.8522973524469244,
"y": 0.12239540143953916
},
{
"x": 0.8522972021308497,
"y": 0.13314478920758582
},
{
"x": 0.7995174872298415,
"y": 0.13314436390431275
}
],
"page": 1
}
],
"type": "number",
"value": 100
},
"medicare_tax": {
"success": true,
"confidence": 0.95703125,
"geometry": [
{
"boundingBox": {
"top": 0.15171144609849258,
"left": 0.7946300877618817,
"width": 0.06108776054782328,
"height": 0.010749866869923269
},
"vertices": [
{
"x": 0.794630229905556,
"y": 0.15171144609849258
},
{
"x": 0.855717848309705,
"y": 0.15171193366155655
},
{
"x": 0.8557176975089502,
"y": 0.16246131296841584
},
{
"x": 0.7946300877618817,
"y": 0.16246082811835366
}
],
"page": 1
}
],
"type": "number",
"value": 5000
}
},
"state_taxes_table": [
{
"state_name": {
"success": true,
"confidence": 0.9453125,
"geometry": [
{
"boundingBox": {
"top": 0.4101792993399344,
"left": 0.05992608586969041,
"width": 0.04343344317585837,
"height": 0.007818002004955082
},
"vertices": [
{
"x": 0.0599261135235259,
"y": 0.4101792993399344
},
{
"x": 0.10335952904554878,
"y": 0.41017959961707895
},
{
"x": 0.10335949691519096,
"y": 0.4179973013448895
},
{
"x": 0.05992608586969041,
"y": 0.41799700247060323
}
],
"page": 1
}
],
"type": "string",
"value": "Any Town"
},
"local_wages_tips": {
"success": true,
"confidence": 0.96484375,
"geometry": [
{
"boundingBox": {
"top": 0.4062741397593082,
"left": 0.5927936809515684,
"width": 0.07037296455353081,
"height": 0.014170039542523383
},
"vertices": [
{
"x": 0.592793830617735,
"y": 0.4062741397593082
},
{
"x": 0.6631666455050992,
"y": 0.40627462741859327
},
{
"x": 0.6631664826928081,
"y": 0.4204441793018316
},
{
"x": 0.5927936809515684,
"y": 0.4204436957623141
}
],
"page": 1
}
],
"type": "number",
"value": 100
},
"employer_state_id_number": {
"success": true,
"confidence": 0.9375,
"geometry": [
{
"boundingBox": {
"top": 0.40846978914779675,
"left": 0.14844226266294547,
"width": 0.08149107901047295,
"height": 0.009284080150678609
},
"vertices": [
{
"x": 0.14844230633548122,
"y": 0.40846978914779675
},
{
"x": 0.22993334167341842,
"y": 0.4084703531122496
},
{
"x": 0.2299332880271013,
"y": 0.41775386929847536
},
{
"x": 0.14844226266294547,
"y": 0.41775330845962133
}
],
"page": 1
}
],
"type": "number",
"value": 7414568313
},
"state_wages_and_tips": {
"success": true,
"confidence": 0.95703125,
"geometry": [
{
"boundingBox": {
"top": 0.40651657001689195,
"left": 0.3225421922381328,
"width": 0.059132946293216526,
"height": 0.014169977981556559
},
"vertices": [
{
"x": 0.32254229141938756,
"y": 0.40651657001689195
},
{
"x": 0.3816751385313493,
"y": 0.406516979727401
},
{
"x": 0.38167502830364813,
"y": 0.4206865479984485
},
{
"x": 0.3225421922381328,
"y": 0.4206861417496965
}
],
"page": 1
}
],
"type": "number",
"value": 50
},
"state_income_tax": {
"success": true,
"confidence": 0.96484375,
"geometry": [
{
"boundingBox": {
"top": 0.4062732423300216,
"left": 0.4632880686399044,
"width": 0.07037301782962863,
"height": 0.014414349827531647
},
"vertices": [
{
"x": 0.46328819627684825,
"y": 0.4062732423300216
},
{
"x": 0.533661086469533,
"y": 0.4062737299898285
},
{
"x": 0.5336609454597789,
"y": 0.42068759215755325
},
{
"x": 0.4632880686399044,
"y": 0.42068710868854875
}
],
"page": 1
}
],
"type": "number",
"value": 500
},
"local_income_tax": {
"success": true,
"confidence": 0.96875,
"geometry": [
{
"boundingBox": {
"top": 0.40651942105611816,
"left": 0.7340279400629971,
"width": 0.07037290858511125,
"height": 0.014170031111511494
},
"vertices": [
{
"x": 0.7340281161126886,
"y": 0.40651942105611816
},
{
"x": 0.8044008486481083,
"y": 0.4065199086438038
},
{
"x": 0.8044006594523233,
"y": 0.42068945216762965
},
{
"x": 0.7340279400629971,
"y": 0.4206889686997068
}
],
"page": 1
}
],
"type": "number",
"value": 550
},
"locality_name": {
"success": true,
"confidence": 0.9609375,
"geometry": [
{
"boundingBox": {
"top": 0.4074975349501956,
"left": 0.8645106996602827,
"width": 0.05815532886844066,
"height": 0.010261101883707824
},
"vertices": [
{
"x": 0.8645108447954342,
"y": 0.4074975349501956
},
{
"x": 0.9226660285287234,
"y": 0.40749793765203246
},
{
"x": 0.9226658755267086,
"y": 0.4177586368339034
},
{
"x": 0.8645106996602827,
"y": 0.41775823659741057
}
],
"page": 1
}
],
"type": "string",
"value": "Any Town"
}
}
],
"employee_general_info": {
"employee_name_suffix": {
"success": true,
"confidence": 0.9453125,
"geometry": [
{
"boundingBox": {
"top": 0.24637705778292,
"left": 0.513137102879782,
"width": 0.011240218714952954,
"height": 0.008672870445067565
},
"vertices": [
{
"x": 0.5131371853781364,
"y": 0.24637705778292
},
{
"x": 0.5243773215947349,
"y": 0.24637714309845343
},
{
"x": 0.5243772378111943,
"y": 0.25504992822798755
},
{
"x": 0.513137102879782,
"y": 0.25504984331521063
}
],
"page": 1
}
],
"type": "string",
"value": "M"
},
"employee_address": {
"success": true,
"confidence": 0.890625,
"geometry": [
{
"boundingBox": {
"top": 0.3158787194915841,
"left": 0.14086779816172376,
"width": 0.20855373024524396,
"height": 0.011483788824680041
},
"vertices": [
{
"x": 0.14086785103120755,
"y": 0.3158787194915841
},
{
"x": 0.3494215284069677,
"y": 0.31588024258088354
},
{
"x": 0.3494214439669128,
"y": 0.3273625083162641
},
{
"x": 0.14086779816172376,
"y": 0.32736099512062705
}
],
"page": 1
}
],
"type": "string",
"value": "123 Any Street, Any Town, USA"
},
"employee_last_name": {
"success": true,
"confidence": 0.9375,
"geometry": [
{
"boundingBox": {
"top": 0.2488184736161942,
"left": 0.29908562045038717,
"width": 0.036408386404964355,
"height": 0.00891737127269776
},
"vertices": [
{
"x": 0.29908568010875053,
"y": 0.2488184736161942
},
{
"x": 0.3354940068553515,
"y": 0.2488187495973069
},
{
"x": 0.3354939429168238,
"y": 0.25773584488889195
},
{
"x": 0.29908562045038717,
"y": 0.25773557024911087
}
],
"page": 1
}
],
"type": "string",
"value": "Desai"
},
"employee_zip_code": {
"success": true,
"confidence": 0.94140625,
"type": "number",
"value": ""
},
"first_name": {
"success": true,
"confidence": 0.890625,
"geometry": [
{
"boundingBox": {
"top": 0.25064926020277745,
"left": 0.10183295115118576,
"width": 0.03854649714996286,
"height": 0.008795241877087745
},
"vertices": [
{
"x": 0.10183298712091159,
"y": 0.25064926020277745
},
{
"x": 0.14037944830114862,
"y": 0.2506495520995062
},
{
"x": 0.14037940786196776,
"y": 0.2594445020798652
},
{
"x": 0.10183295115118576,
"y": 0.25944421158378617
}
],
"page": 1
}
],
"type": "string",
"value": "Arnav"
},
"ssn": {
"confidence": 0.96875,
"geometry": [
{
"boundingBox": {
"top": 0.06226135188851642,
"left": 0.30079733092935684,
"width": 0.0894327933250581,
"height": 0.008948422120915803
},
"vertices": [
{
"x": 0.30079739099409536,
"y": 0.06226135188851642
},
{
"x": 0.39023012425441495,
"y": 0.0622620987336156
},
{
"x": 0.39023005363993124,
"y": 0.07120977400943222
},
{
"x": 0.30079733092935684,
"y": 0.07120903047046287
}
],
"page": 1
}
],
"type": "string",
"value": "753-95-184"
}
},
"federal_wage_info": {
"social_security_tips": {
"success": true,
"confidence": 0.9609375,
"geometry": [
{
"boundingBox": {
"top": 0.18175926526832722,
"left": 0.5932847914320433,
"width": 0.052779883331533606,
"height": 0.010627642773960555
},
"vertices": [
{
"x": 0.593284903750808,
"y": 0.18175926526832722
},
{
"x": 0.6460646747635769,
"y": 0.18175967997088277
},
{
"x": 0.6460645550500721,
"y": 0.19238690804228778
},
{
"x": 0.5932847914320433,
"y": 0.1923864956571293
}
],
"page": 1
}
],
"type": "number",
"value": 500
},
"wages_tips_other_compensation": {
"success": true,
"confidence": 0.96484375,
"geometry": [
{
"boundingBox": {
"top": 0.0920385169909297,
"left": 0.6015936774477677,
"width": 0.05326864467117953,
"height": 0.010566614957136636
},
"vertices": [
{
"x": 0.601593790278587,
"y": 0.0920385169909297
},
{
"x": 0.6548623221189472,
"y": 0.09203895527919093
},
{
"x": 0.6548622018677936,
"y": 0.10260513194806634
},
{
"x": 0.6015936774477677,
"y": 0.10260469598522869
}
],
"page": 1
}
],
"type": "number",
"value": 100
},
"medicare_wages_tips": {
"success": true,
"confidence": 0.95703125,
"geometry": [
{
"boundingBox": {
"top": 0.1520763385466113,
"left": 0.5986608236716688,
"width": 0.05277990369432872,
"height": 0.010749809450550235
},
"vertices": [
{
"x": 0.5986609380433685,
"y": 0.1520763385466113
},
{
"x": 0.6514407273659976,
"y": 0.15207675972188475
},
{
"x": 0.6514406055145556,
"y": 0.16282614799716152
},
{
"x": 0.5986608236716688,
"y": 0.16282572916592541
}
],
"page": 1
}
],
"type": "number",
"value": 500
},
"social_security_wages": {
"success": true,
"confidence": 0.96875,
"geometry": [
{
"boundingBox": {
"top": 0.1208052715842467,
"left": 0.5869323080186079,
"width": 0.06108786084087925,
"height": 0.010749892654233068
},
"vertices": [
{
"x": 0.5869324207282345,
"y": 0.1208052715842467
},
{
"x": 0.6480201688594871,
"y": 0.12080576694804955
},
{
"x": 0.6480200474927431,
"y": 0.13155516423847977
},
{
"x": 0.5869323080186079,
"y": 0.13155467158768777
}
],
"page": 1
}
],
"type": "number",
"value": 1000
}
},
"nonqualified_plans_incom": {
"success": true,
"confidence": 0.94921875,
"geometry": [
{
"boundingBox": {
"top": 0.24369042491050819,
"left": 0.6064790665499286,
"width": 0.053268539980232066,
"height": 0.010871918675103787
},
"vertices": [
{
"x": 0.6064791833417267,
"y": 0.24369042491050819
},
{
"x": 0.6597476065301606,
"y": 0.2436908298229998
},
{
"x": 0.6597474821035988,
"y": 0.25456234358561197
},
{
"x": 0.6064790665499286,
"y": 0.25456194106573354
}
],
"page": 1
}
],
"type": "number",
"value": 500
}
}
]
}
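With this much detail per field, two things seem handy in practice: listing the fields whose confidence falls below some threshold, and turning a bounding box into pixel coordinates for drawing. The sketch below does both. It assumes the geometry values are normalized to the page size (every coordinate in the output above lies between 0 and 1, but I have not confirmed this against the documentation), and that a leaf field is any object carrying both `confidence` and `value`. The function names and the 0.9 threshold are my own choices.

```python
from typing import Any, Iterator


def iter_fields(node: Any, path: str = "") -> Iterator[tuple[str, dict]]:
    """Yield (path, field) for every leaf field under explainability_info."""
    if isinstance(node, dict):
        # Assumption: a leaf field carries both "confidence" and "value".
        if "confidence" in node and "value" in node:
            yield path, node
            return
        for key, child in node.items():
            yield from iter_fields(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_fields(child, f"{path}[{index}]")


def low_confidence_fields(
    explainability_info: Any, threshold: float
) -> list[tuple[str, Any, float]]:
    """Return (path, value, confidence) for fields below the threshold."""
    return [
        (path, field["value"], field["confidence"])
        for path, field in iter_fields(explainability_info)
        if field["confidence"] < threshold
    ]


def to_pixels(
    bounding_box: dict, page_width: int, page_height: int
) -> tuple[int, int, int, int]:
    """Convert a normalized boundingBox to (left, top, right, bottom) pixels."""
    left = bounding_box["left"]
    top = bounding_box["top"]
    return (
        round(left * page_width),
        round(top * page_height),
        round((left + bounding_box["width"]) * page_width),
        round((top + bounding_box["height"]) * page_height),
    )


# Trimmed excerpt of the explainability_info shown above (geometry omitted).
sample = [
    {
        "employer_info": {
            "employer_address": {
                "success": True,
                "confidence": 0.8203125,
                "type": "string",
                "value": "100 Main Street, Anytown, USA",
            },
            "control_number": {
                "success": True,
                "confidence": 0.96875,
                "type": "string",
                "value": "753951852",
            },
        }
    }
]

for path, value, confidence in low_confidence_fields(sample, threshold=0.9):
    print(f"{path}: {value!r} (confidence {confidence})")
```

With the excerpt above, only the employer address falls below 0.9, so it is the one field that would be flagged for manual review.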
Because the output is so large, the guidance also shows a way to narrow it down to the key fields.
~ $ find . -path "*/0/result.json" -exec jq '. | {matched_blueprint, document_class, split_document, inference_result} | with_entries(select(.value != null))' {} \;
{}
{
"matched_blueprint": {
"arn": "arn:aws:bedrock:us-west-2:622809842341:blueprint/494529f2245a",
"name": "homeowners-insurance-application",
"confidence": 0.16679549
},
"document_class": {
"type": "default"
},
"split_document": {
"page_indices": [
0,
1,
2,
3,
4,
5
]
},
"inference_result": {
"Expiration Date": "20/10/2025",
"Purchase Date and Time": "14/06/2009 09.30",
"Policy Number": "45488257965",
"Named Insured(s) and Mailing Address": "Alejandro Rosalez alejandrorosalez@example.com",
"Insurance Company": "XYZ Insurance",
"Co-Applicant Information": {
"Drivers License Number": "1935478265",
"Length of Time with Current Auto Carrier": "5 Years",
"DL State": "WI",
"Education Level": "Undergraduate",
"Currently Insured- Auto": "Home",
"Length of Time with Prior Auto Carrier": "3 Years",
"Date of Birth": "16/07/1988",
"Gender": "Male",
"Marital Status": "Married",
"Relationship to Primary Applicant": "Spouse",
"Name": "Jane Doe"
},
"Insured Property": "Home",
"Auto Claims, Accidents, and Violations": {
"Major": "02",
"Number of Comp Claims": "03",
"Number of Violations": "02",
"At-Fault": "03",
"Number of Auto Accidents": "03",
"Minor": "01",
"Not-at-Fault": "01"
},
"Primary Phone #": "555-157-0100",
"Effective Date": "20/10/2020",
"Primary Email": "alejandrorosalez@example.com",
"Alternate Phone #": "555-758-0100",
"Primary Applicant Information": {
"Type of Current Property Policy": "Home",
"Drivers License Number": "7654825499",
"Education Level": "Undergraduate",
"Currently Insured Auto": "Home",
"Length of Time with Prior Auto Carrier": "3 Years",
"Gender": "Female",
"Marital Status": "Married",
"Name": "Alejandro Rosalez",
"Length of Time with Current Auto Carrier": "5 Years",
"Existing Esurance Policy": "Home Insurance",
"DL State": "WI",
"Date of Birth": "03/02/1990",
"Years with Prior Property Company": "5 Years"
}
}
}
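For anyone who prefers Python to jq, the same filtering can be sketched roughly as follows. It keeps the four keys the jq filter selects and drops any that are missing or null, so a result file with none of them yields an empty object, like the `{}` in the output above. `summarize` and `summarize_tree` are hypothetical names, and the directory layout (`*/0/result.json`) is taken from the `find` command.

```python
import json
from pathlib import Path

# The same four keys that the jq filter above selects.
KEYS = ("matched_blueprint", "document_class", "split_document", "inference_result")


def summarize(result: dict) -> dict:
    """Keep only the selected keys, dropping missing or null values."""
    return {key: result[key] for key in KEYS if result.get(key) is not None}


def summarize_tree(root: Path) -> list[dict]:
    """Summarize every result.json whose parent directory is named "0"."""
    summaries = []
    for path in sorted(root.rglob("result.json")):
        if path.parent.name == "0":
            data = json.loads(path.read_text(encoding="utf-8"))
            summaries.append(summarize(data))
    return summaries


def main() -> None:
    # Run from the directory that holds the downloaded output files.
    for summary in summarize_tree(Path(".")):
        print(json.dumps(summary, indent=2, ensure_ascii=False))
```

Having the summaries as plain dictionaries makes it straightforward to go one step further, for example collecting only `inference_result` per matched blueprint.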
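Summary
That was "Trying Out Amazon Bedrock Data Automation with Guidance for Multimodal Data Processing Using Amazon Bedrock Data Automation".
For anyone who likes to learn by doing, I think this guidance is ideal. Working through it gave me a much clearer picture of custom blueprints in particular.
I hope this post is helpful to someone.
This was Takakuni (@takakuni_) from the Consulting Department, Cloud Business Division!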