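[Update] Amazon Bedrock Data Automation now supports hyperlink recognition
Hello! This is Takakuni (@takakuni_) from the Consulting Department, Cloud Business Division.
Amazon Bedrock Data Automation now supports hyperlink recognition. This post picks out the following update and walks through it.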
What changed
Until now, Amazon Bedrock Data Automation (BDA) could not recognize hyperlinks embedded in a PDF. With this update, BDA recognizes those hyperlinks and can include them in the standard output.
Trying it out
Let's use BDA to extract a PDF file that contains hyperlinks. I used the following PDF file.
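Prepare two S3 buckets: one to store the source data and one to receive the extraction output.
S3 bucket for storing the source data
S3 bucket for the extraction output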
Also create a BDA project with any name you like, and note down its ARN.
Now let's analyze the PDF file that contains hyperlinks by calling the InvokeDataAutomationAsync API.
aws bedrock-data-automation-runtime invoke-data-automation-async \
--region us-east-1 \
--input-configuration "s3Uri=s3://bedrock-bda-us-east-1-hoge/Recommended Books for Golden Week.pdf" \
--output-configuration "s3Uri=s3://kb-bda-output-data-123456789012-use-1/data" \
--data-automation-configuration "dataAutomationProjectArn='arn:aws:bedrock:us-east-1:123456789012:data-automation-project/2abb9bbb5477',stage=LIVE" \
--data-automation-profile-arn "arn:aws:bedrock:us-east-1:123456789012:data-automation-profile/us.data-automation-v1"
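For reference, the same request could probably be issued from Python as well. The following is a minimal sketch, assuming boto3 is installed and credentials are configured; the helper names build_invoke_request and invoke are hypothetical and only mirror the CLI flags above.

```python
def build_invoke_request(input_s3_uri, output_s3_uri, project_arn, profile_arn, stage="LIVE"):
    """Build keyword arguments that mirror the CLI flags used in this post."""
    return {
        "inputConfiguration": {"s3Uri": input_s3_uri},
        "outputConfiguration": {"s3Uri": output_s3_uri},
        "dataAutomationConfiguration": {
            "dataAutomationProjectArn": project_arn,
            "stage": stage,
        },
        "dataAutomationProfileArn": profile_arn,
    }


def invoke(request, region="us-east-1"):
    """Send the request and return the invocation ARN (requires boto3 and AWS credentials)."""
    # boto3 is imported lazily so the request builder can be used without it.
    import boto3

    client = boto3.client("bedrock-data-automation-runtime", region_name=region)
    return client.invoke_data_automation_async(**request)["invocationArn"]


# Same values as the CLI example; nothing is sent until invoke(request) is called.
request = build_invoke_request(
    "s3://bedrock-bda-us-east-1-hoge/Recommended Books for Golden Week.pdf",
    "s3://kb-bda-output-data-123456789012-use-1/data",
    "arn:aws:bedrock:us-east-1:123456789012:data-automation-project/2abb9bbb5477",
    "arn:aws:bedrock:us-east-1:123456789012:data-automation-profile/us.data-automation-v1",
)
```

Keeping the request builder separate from the network call makes it easy to inspect the payload before anything is submitted.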
The command ran successfully (an invocationArn was returned).
~ $ aws bedrock-data-automation-runtime invoke-data-automation-async \
> --region us-east-1 \
> --input-configuration "s3Uri=s3://bedrock-bda-us-east-1-hoge/Recommended Books for Golden Week.pdf" \
> --output-configuration "s3Uri=s3://kb-bda-output-data-123456789012-use-1/data" \
> --data-automation-configuration "dataAutomationProjectArn='arn:aws:bedrock:us-east-1:123456789012:data-automation-project/2abb9bbb5477',stage=LIVE" \
> --data-automation-profile-arn "arn:aws:bedrock:us-east-1:123456789012:data-automation-profile/us.data-automation-v1"
{
"invocationArn": "arn:aws:bedrock:us-east-1:123456789012:data-automation-invocation/762c8024-fcd3-4f41-8a3d-dab253331836"
}
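Because InvokeDataAutomationAsync is asynchronous, the output is not in S3 the moment the invocationArn comes back. Below is a rough polling sketch, not something I ran for this post. It assumes the status is fetched with boto3's get_data_automation_status and that the terminal status values are Success, ServiceError, and ClientError; the function name wait_for_invocation and its parameters are hypothetical.

```python
import time

# Assumed terminal values of the "status" field in the GetDataAutomationStatus response.
TERMINAL_STATUSES = {"Success", "ServiceError", "ClientError"}


def wait_for_invocation(get_status, invocation_arn, interval_seconds=5.0,
                        max_attempts=60, sleep=time.sleep):
    """Poll get_status until the invocation reaches a terminal status.

    get_status is any callable that accepts invocationArn=... and returns a dict
    with a "status" key, for example client.get_data_automation_status.
    """
    for _ in range(max_attempts):
        response = get_status(invocationArn=invocation_arn)
        if response["status"] in TERMINAL_STATUSES:
            return response
        sleep(interval_seconds)
    raise TimeoutError(f"{invocation_arn} did not finish after {max_attempts} checks")


# Possible wiring (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("bedrock-data-automation-runtime", region_name="us-east-1")
#   result = wait_for_invocation(client.get_data_automation_status, invocation_arn)
```

Passing the status function in as an argument keeps the loop independent of boto3, so it can be exercised with a stub before pointing it at a real account.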
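Let's check the result. Looking at pages[0].representation.markdown, we can see that BDA recognized the hyperlinks and emitted them as Markdown links. The statistics blocks also report a hyperlink_count: 7 for the whole document, made up of 5 on page 0 and 2 on page 1.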
{
"metadata": {
"asset_id": "0",
"logical_subdocument_id": "0",
"semantic_modality": "DOCUMENT",
"s3_bucket": "bedrock-bda-us-east-1-hoge",
"s3_key": "Recommended Books for Golden Week.pdf",
"number_of_pages": 2,
"start_page_index": 0,
"end_page_index": 1,
"file_type": "PDF"
},
"document": {
"statistics": {
"element_count": 28,
"table_count": 0,
"figure_count": 0,
"hyperlink_count": 7
}
},
"pages": [
{
"id": "e75d446f-57ff-4a8d-8eb0-c37c112546cb",
"page_index": 0,
"representation": {
+ "markdown": "# Recommended Books for Golden Week\n\nGolden Week is a perfect opportunity to relax and enrich your mind with some great books. Here are a few recommended readings that can inspire, entertain, and broaden your perspective during the holidays.\n\n## 1. [ \"Atomic Habits\" by James Clear ](https://jamesclear.com/atomic-habits)\n\n**Overview:**\n\nThis book provides practical strategies for forming good habits, breaking bad ones, and mastering the tiny behaviors that lead to remarkable results.\n\n**Why Recommended:**\n\nGolden Week is a great time to reflect on your routines and make positive changes. \"Atomic Habits\" offers actionable advice that can be implemented immediately.\n\n## 2. [ \"Ikigai: The Japanese Secret to a Long and Happy Life\" by ](https://www.penguinrandomhouse.com/books/549435/ikigai-by-hector-garcia-and-francesc-miralles/) [ Héctor García and Francesc Miralles ](https://www.penguinrandomhouse.com/books/549435/ikigai-by-hector-garcia-and-francesc-miralles/)\n\n**Overview:**\n\nExploring the Japanese concept of \"Ikigai,\" this book reveals the secrets to living a long, fulfilling, and meaningful life.\n\n**Why Recommended:**\n\nAs a Japanese concept, \"Ikigai\" is especially relevant during Golden Week. The book encourages self-discovery and finding purpose, making it ideal for a reflective holiday.\n\n## 3. [ \"Educated\" by Tara Westover ](https://tarawestover.com/book)\n\n**Overview:**\n\nA memoir about a woman who grows up in a strict and abusive household in rural Idaho but eventually escapes to learn about the wider world through education.\n\n**Why Recommended:**\n\n\"Educated\" is an inspiring story of resilience and the transformative power of learning, perfect for those seeking motivation and new perspectives.\n\n## 4. [ \"The Alchemist\" by Paulo Coelho ](https://www.harpercollins.com/products/the-alchemist-paulo-coelho)\n\n**Overview:**\n\nA philosophical novel about a young shepherd's journey to realize his personal legend and fulfill his dreams.\n\n**Why Recommended:**\n\nThis classic tale encourages readers to pursue their dreams and listen to their hearts, making it a wonderful companion for a holiday break."
},
"statistics": {
"element_count": 22,
"table_count": 0,
"figure_count": 0,
"hyperlink_count": 5
},
"asset_metadata": {
"rectified_image_width_pixels": 2478,
"rectified_image_height_pixels": 3507,
"corners": [
[
0.0000026556125769122557,
-0.000003584238878760135
],
[
1.0000033497906577,
2.0982249996337194E-11
],
[
1.0000000985232547,
1.0000031326855219
],
[
0.00000521979393186446,
0.9999999303847662
]
]
}
},
{
"id": "3dc3928d-336b-4cba-acd4-bb1a67a70ece",
"page_index": 1,
"representation": {
"markdown": "# 5. [ \"Deep Work: Rules for Focused Success in a Distracted ](https://www.hachettebookgroup.com/titles/cal-newport/deep-work/9781455586691/) [ World\" by Cal Newport ](https://www.hachettebookgroup.com/titles/cal-newport/deep-work/9781455586691/)\n\n**Overview:**\n\nThe book explores the benefits of deep, focused work and provides practical advice for achieving greater productivity and satisfaction.\n\n**Why Recommended:**\n\nGolden Week offers a rare chance to disconnect from daily distractions and focus on personal growth, making \"Deep Work\" a timely and valuable read.\n\nEnjoy your Golden Week with these inspiring books!"
},
"statistics": {
"element_count": 6,
"table_count": 0,
"figure_count": 0,
"hyperlink_count": 2
},
"asset_metadata": {
"rectified_image_width_pixels": 2478,
"rectified_image_height_pixels": 3507,
"corners": [
[
-0.000014060586570635065,
1.0551794977566014E-7
],
[
1.0,
-0.000009533358854142084
],
[
1.0000000985232547,
1.0000001392304676
],
[
5.8557916939246546E-8,
1.0000032023007557
]
]
}
}
],
"elements": [
{
"type": "TEXT",
"id": "af00c27f-b23c-4db3-ab98-eeb9e97f077a",
"reading_order": 0,
"page_indices": [
0
],
"representation": {
"markdown": "# Recommended Books for Golden Week"
},
"sub_type": "TITLE"
},
{
"type": "TEXT",
"id": "65ce85bf-d592-4cb3-80d0-3afb3d8f9483",
"reading_order": 1,
"page_indices": [
0
],
"representation": {
"markdown": "Golden Week is a perfect opportunity to relax and enrich your mind with some great books. Here are a few recommended readings that can inspire, entertain, and broaden your perspective during the holidays."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "84615c8c-0c70-47c1-a2a0-71cf49dfcf9b",
"reading_order": 2,
"page_indices": [
0
],
"representation": {
+ "markdown": "## 1. [ \"Atomic Habits\" by James Clear ](https://jamesclear.com/atomic-habits)"
},
"sub_type": "SECTION_HEADER"
},
{
"type": "TEXT",
"id": "e0a8c86f-0ea7-4c7c-8017-2268128ef6f7",
"reading_order": 3,
"page_indices": [
0
],
"representation": {
"markdown": "**Overview:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "95a8c622-623d-43b2-9045-890231cf69d1",
"reading_order": 4,
"page_indices": [
0
],
"representation": {
"markdown": "This book provides practical strategies for forming good habits, breaking bad ones, and mastering the tiny behaviors that lead to remarkable results."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "1d6fd112-aae1-42a2-b289-e1c6f0ce0e0c",
"reading_order": 5,
"page_indices": [
0
],
"representation": {
"markdown": "**Why Recommended:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "e637cd9a-80c3-4c3e-9f98-c355ee234dc5",
"reading_order": 6,
"page_indices": [
0
],
"representation": {
"markdown": "Golden Week is a great time to reflect on your routines and make positive changes. \"Atomic Habits\" offers actionable advice that can be implemented immediately."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "75ada715-381b-4327-a94e-f83d6ebdf51b",
"reading_order": 7,
"page_indices": [
0
],
"representation": {
+ "markdown": "## 2. [ \"Ikigai: The Japanese Secret to a Long and Happy Life\" by ](https://www.penguinrandomhouse.com/books/549435/ikigai-by-hector-garcia-and-francesc-miralles/) [ Héctor García and Francesc Miralles ](https://www.penguinrandomhouse.com/books/549435/ikigai-by-hector-garcia-and-francesc-miralles/)"
},
"sub_type": "SECTION_HEADER"
},
{
"type": "TEXT",
"id": "7ddf9f82-ec69-4245-af5d-703e4c7efab3",
"reading_order": 8,
"page_indices": [
0
],
"representation": {
"markdown": "**Overview:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "ecd50481-21d3-4b59-be08-fd9e665b819e",
"reading_order": 9,
"page_indices": [
0
],
"representation": {
"markdown": "Exploring the Japanese concept of \"Ikigai,\" this book reveals the secrets to living a long, fulfilling, and meaningful life."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "e95403ef-e020-49af-9a60-fd6486e7e1c2",
"reading_order": 10,
"page_indices": [
0
],
"representation": {
"markdown": "**Why Recommended:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "05d4f665-730c-4063-9bc0-0868e62ca9ba",
"reading_order": 11,
"page_indices": [
0
],
"representation": {
"markdown": "As a Japanese concept, \"Ikigai\" is especially relevant during Golden Week. The book encourages self-discovery and finding purpose, making it ideal for a reflective holiday."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "db9a134f-91cd-405d-a270-f1a5529af424",
"reading_order": 12,
"page_indices": [
0
],
"representation": {
+ "markdown": "## 3. [ \"Educated\" by Tara Westover ](https://tarawestover.com/book)"
},
"sub_type": "SECTION_HEADER"
},
{
"type": "TEXT",
"id": "5e21ffc3-6ca2-457f-8967-1daa956cc1de",
"reading_order": 13,
"page_indices": [
0
],
"representation": {
"markdown": "**Overview:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "40b1a0bb-ada5-4545-ba00-30ecb2bb6502",
"reading_order": 14,
"page_indices": [
0
],
"representation": {
"markdown": "A memoir about a woman who grows up in a strict and abusive household in rural Idaho but eventually escapes to learn about the wider world through education."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "8d80cb68-7440-4118-9da7-ecd70bb24e1b",
"reading_order": 15,
"page_indices": [
0
],
"representation": {
"markdown": "**Why Recommended:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "90992931-24a8-45ec-9b1c-766276cd6ae6",
"reading_order": 16,
"page_indices": [
0
],
"representation": {
"markdown": "\"Educated\" is an inspiring story of resilience and the transformative power of learning, perfect for those seeking motivation and new perspectives."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "53428e71-70ad-482c-a1f7-b5dc720ab494",
"reading_order": 17,
"page_indices": [
0
],
"representation": {
+ "markdown": "## 4. [ \"The Alchemist\" by Paulo Coelho ](https://www.harpercollins.com/products/the-alchemist-paulo-coelho)"
},
"sub_type": "SECTION_HEADER"
},
{
"type": "TEXT",
"id": "d7847f9b-7b29-4fdc-9710-116e389309af",
"reading_order": 18,
"page_indices": [
0
],
"representation": {
"markdown": "**Overview:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "32f4144b-610e-4815-9a83-b8357b10e73f",
"reading_order": 19,
"page_indices": [
0
],
"representation": {
"markdown": "A philosophical novel about a young shepherd's journey to realize his personal legend and fulfill his dreams."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "692a36a6-7538-41e3-afb4-549091294ab6",
"reading_order": 20,
"page_indices": [
0
],
"representation": {
"markdown": "**Why Recommended:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "893a9f28-99fa-4dfb-b339-9756c8eb8acf",
"reading_order": 21,
"page_indices": [
0
],
"representation": {
"markdown": "This classic tale encourages readers to pursue their dreams and listen to their hearts, making it a wonderful companion for a holiday break."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "dd23378a-809a-422d-9d7c-d30ec1055964",
"reading_order": 0,
"page_indices": [
1
],
"representation": {
+ "markdown": "# 5. [ \"Deep Work: Rules for Focused Success in a Distracted ](https://www.hachettebookgroup.com/titles/cal-newport/deep-work/9781455586691/) [ World\" by Cal Newport ](https://www.hachettebookgroup.com/titles/cal-newport/deep-work/9781455586691/)"
},
"sub_type": "TITLE"
},
{
"type": "TEXT",
"id": "f5315d4f-b093-4df9-9cf9-cb69f7a7cf63",
"reading_order": 1,
"page_indices": [
1
],
"representation": {
"markdown": "**Overview:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "91b94f89-237f-4521-835d-ab1e52d6018d",
"reading_order": 2,
"page_indices": [
1
],
"representation": {
"markdown": "The book explores the benefits of deep, focused work and provides practical advice for achieving greater productivity and satisfaction."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "b07cbec7-9f80-409d-b334-32c2c55ca45c",
"reading_order": 3,
"page_indices": [
1
],
"representation": {
"markdown": "**Why Recommended:**"
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "aec8d662-3f22-4887-bbe2-d9ff4124608f",
"reading_order": 4,
"page_indices": [
1
],
"representation": {
"markdown": "Golden Week offers a rare chance to disconnect from daily distractions and focus on personal growth, making \"Deep Work\" a timely and valuable read."
},
"sub_type": "PARAGRAPH"
},
{
"type": "TEXT",
"id": "0f545244-6082-4930-b7e1-0a589ebb896d",
"reading_order": 5,
"page_indices": [
1
],
"representation": {
"markdown": "Enjoy your Golden Week with these inspiring books!"
},
"sub_type": "PARAGRAPH"
}
]
}
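In the output above, a title that wraps onto a second line in the PDF (books 2 and 5) comes back as two consecutive Markdown links pointing at the same URL, and each fragment is counted in hyperlink_count. If you want one entry per logical link, something like the following sketch might help. It is a simple regex-based approach, not a full Markdown parser, and the names extract_links and links_per_page are hypothetical.

```python
import re

# Matches inline Markdown links such as [ text ](https://example.com).
# This is a simplification: link text containing "]" or URLs containing ")" are not handled.
LINK_PATTERN = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def extract_links(markdown, merge_split=True):
    """Return (text, url) pairs found in a Markdown string.

    With merge_split=True, a link that directly follows another link to the same
    URL (only whitespace in between) is joined onto the previous entry.
    """
    links = []
    previous_end = 0
    for match in LINK_PATTERN.finditer(markdown):
        text, url = match.group(1).strip(), match.group(2)
        gap = markdown[previous_end:match.start()]
        continues_previous = (
            merge_split and links and links[-1][1] == url and gap.strip() == ""
        )
        if continues_previous:
            links[-1] = (links[-1][0] + " " + text, url)
        else:
            links.append((text, url))
        previous_end = match.end()
    return links


def links_per_page(result, merge_split=True):
    """Map page_index to the links found in pages[n].representation.markdown."""
    return {
        page["page_index"]: extract_links(page["representation"]["markdown"], merge_split)
        for page in result["pages"]
    }


# The page 1 heading from the output above.
sample = (
    '# 5. [ "Deep Work: Rules for Focused Success in a Distracted ]'
    "(https://www.hachettebookgroup.com/titles/cal-newport/deep-work/9781455586691/) "
    '[ World" by Cal Newport ]'
    "(https://www.hachettebookgroup.com/titles/cal-newport/deep-work/9781455586691/)"
)
raw_links = extract_links(sample, merge_split=False)  # two fragments, as counted by BDA
merged_links = extract_links(sample)                  # one logical link
```

With merge_split=False the number of fragments on page 1 lines up with the hyperlink_count of 2 reported for that page, while the merged form gives a single entry for "Deep Work".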
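Summary
That was a quick look at "Amazon Bedrock Data Automation now supports hyperlink recognition."
This is simply a welcome update. I hope you find it useful.
This was Takakuni (@takakuni_) from the Consulting Department, Cloud Business Division!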