[アップデート] Amazon Bedrock で DeepSeek-R1 がサーバーレスモデルとして利用可能になりました

#Amazon Bedrock
#DeepSeek
たかくに
2025.03.11
こんにちは！クラウド事業本部コンサルティング部のたかくに（@takakuni_）です。
Amazon Bedrock で DeepSeek-R1 がサーバーレスモデルとして利用可能になりました。
https://aws.amazon.com/jp/about-aws/whats-new/2025/03/deepseek-r1-fully-managed-amazon-bedrock/
AWS Blog も投稿されています。
https://aws.amazon.com/jp/blogs/aws/deepseek-r1-now-available-as-a-fully-managed-serverless-model-in-amazon-bedrock/
 アップデート内容アップデート前まで DeepSeek-R1 を Amazon Bedrock の機能を使って利用するには Amazon Bedrock Marketplace から利用する必要がありました。Amazon Bedrock Marketplace の裏側では SageMaker のリアルタイム推論が利用されており、p5e.48xlarge などの大きめの GPU インスタンスを起動し続ける必要がありました。
利用規模が小さいケースではコスト感が見合わなかったり、そもそもキャパシティ不足によって、インスタンスが起動しづらい状況であるため、利用できるユーザーは限られていました。
https://aws.amazon.com/jp/blogs/news/deepseek-r1-models-now-available-on-aws/
!DeepSeek には DeepSeek-R1 を蒸留した、DeepSeek-R1-Distill モデルというものがあります。
こちらのモデルは、 DeepSeek-R1 に比べサイズが小さいため Amazon Bedrock の Custom Model Import 機能を利用することができました。
今回のアップデートで、DeepSeek-R1 モデルを Amazon Nova や Anthropic Claude のようなサーバレスな形態で利用可能になりました。
考えられるメリットは次の通りです。
大きい GPU インスタンスを自身で起動しなくてよくなったこと
従量課金で小規模なユースケースの場合、コストメリットがより高くなる

推論サーバーのスケーリングを意識しなくてよくなったこと
リアルタイム推論ではインスタンス数を指定する必要があった

SageMaker のクォータ/キャパシティ不足を意識しなくてよくなった
GPU インスタンスを起動するためのクォータやキャパシティ不足を意識しなくてよくなった
代わりに Amazon Bedrock のスロットリングを意識する必要があります

なお、利用可能な推論パターンは、バージニア北部、オハイオ、オレゴンリージョンで構成されるクロスリージョン推論です。
DeepSeek-R1 is available fully managed in Amazon Bedrock in the US East (N. Virginia), US East (Ohio), and US West (Oregon) AWS Regions via cross-region inference.
https://aws.amazon.com/jp/blogs/news/deepseek-r1-models-now-available-on-aws/
 サポートしている機能機能にフォーカスしてみます。
まず、テキストからテキスト生成するモデルで、ストリーミングをサポートしています。
https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html
続いて API 周りです。こちらも InvokeModel API と Converse API をサポートしていました。
https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-deepseek.html
最後に周辺機能との統合です。こちらは Agents のみサポートされていると記載されていました。
https://docs.aws.amazon.com/bedrock/latest/userguide/models-features.html
 利用に際して今回の DeepSeek-R1 モデルは MIT ライセンス下で公開されているモデルであるものの、実稼働環境に実装する際にはデータプライバシー要件を慎重に考慮し、出力の偏り/結果を監視いただくことが推奨されています。
As is the case for all AI solutions, give careful consideration to data privacy requirements when implementing in your production environments, check for bias in output, and monitor your results. When implementing publicly available models like DeepSeek-R1, consider the following:
https://aws.amazon.com/jp/blogs/aws/deepseek-r1-now-available-as-a-fully-managed-serverless-model-in-amazon-bedrock/
必要に応じて Amazon Bedrock Guardrails で提供される以下の機能の利用をご検討ください。（※ 日本語は未サポートです。）
コンテンツフィルタリング
機密情報フィルタリング
コンテキストグラウンディングチェック（ハルシネーション検出）
自動推論チェック（Automated Reasoning checks）
プライベートプレビュー

 やってみるそれでは DeepSeek-R1 を利用してみます。今回はオレゴンリージョンで機能検証を行います。
まずは、モデルの有効化を行います。
 テキストチャットまずはプレイグラウンドから、チャット形式で質問してみます。DeepSeek-R1 が選べるようになっていますね。
Reasoning は自動で有効になっており、Reasoning したのちに回答が生成されています。（噂によると回答は日本語なのですが、中国語で Reasoning する時もあるようです。）
実行ログは、以下のスキーマで記録されました。
ModelInvocationLog.json
{
    "schemaType": "ModelInvocationLog",
    "schemaVersion": "1.0",
    "timestamp": "2025-03-11T02:48:18Z",
    "accountId": "123456789012",
    "identity": {
        "arn": "arn:aws:sts::123456789012:assumed-role/takakuni/takakuni"
    },
    "region": "us-west-2",
    "requestId": "59410309-360d-492f-a7ed-69d13e585ba3",
    "operation": "ConverseStream",
    "modelId": "arn:aws:bedrock:us-west-2:123456789012:inference-profile/us.deepseek.r1-v1:0",
    "input": {
        "inputContentType": "application/json",
        "inputBodyJson": {
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "text": "今日はリモートワークをしています。昼ごはんを考えてください。"
                        }
                    ]
                }
            ],
            "inferenceConfig": {
                "maxTokens": 4096,
                "temperature": 1,
                "topP": 0.9,
                "stopSequences": []
            },
            "additionalModelRequestFields": {}
        },
        "inputTokenCount": 23,
        "cacheReadInputTokenCount": 0,
        "cacheWriteInputTokenCount": 0
    },
    "output": {
        "outputContentType": "application/json",
        "outputBodyJson": {
            "output": {
                "message": {
                    "role": "assistant",
                    "content": [
                        {
                            "reasoningContent": {
                                "reasoningText": {
                                    "text": "まず、ユーザーがリモートワーク中で昼ごはんを考えたいようです。リモートワークの場合は時間をかけずに簡単に作れるものが良いでしょう。冷蔵庫にあるもので作れるレシピを提案する必要がありますね。栄養バランスも考慮して、野菜、たんぱく質、炭水化物をバランスよく含むものが理想的です。\n\n手軽さを優先するなら、丼ものやサンドイッチが良いかもしれません。例えば、ツナと野菜のサンドイッチや、卵かけごはんなどは短時間で作れます。また、残り物を活用するレシピも喜ばれるでしょう。前日の夕食の残り野菜があれば、それを利用したチャーハンやスープもいいですね。\n\nユーザーの好みを考慮して、和食か洋食かも選べるように複数の選択肢を提示した方が良さそうです。調理時間や手間を考えて、5分でできるものから15分程度のものまでバリエーションを持たせるといいでしょう。さらに、ヘルシーなオプションやボリュームのあるものなど、ニーズに合わせて選べるようにするのもポイントです。\n\n最後に、材料の入手しやすさも考慮します。特別な調味料がなくても作れるレシピを紹介することで、ユーザーがすぐに実践できるようにします。例えば、納豆や豆腐を使った料理は手軽で栄養価も高いのでおすすめです。また、電子レンジを活用した調理法も時間短縮につながるので良いでしょう",
                                    "signature": ""
                                }
                            }
                        },
                        {
                            "text": "\n\nリモートワーク中の昼ごはんにぴったりな、簡単で栄養バランスの良いアイデアをご提案します！時間別・好みに合わせて選べるよう、5パターン用意しました。\n\n**【時短5分レシピ】**\n1. **究極の卵かけごはん丼**\n   - 温かいご飯に生卵をのせ、刻み海苔・ごま油少々・醤油をかける\n   - お好みで納豆や刻みネギを追加◎\n   - 栄養ポイント：タンパク質＋カルシウム＋鉄分\n\n**【10分で作れる】**\n2. **ツナと野菜のレンチン丼**\n   - 耐熱容器に冷凍ご飯→刻んだキャベツ→ツナ缶→マヨネーズ→チーズを乗せ、電子レンジ（600W2分）\n   - 仕上げにポン酢をかけて混ぜる\n   - 片手で食べられるのでPC作業中にも◎\n\n**【ヘルシー選択肢】**\n3. **豆腐とわかめの即席スープ定食**\n   - お湯で溶いた味噌に豆腐・乾燥わかめ・麩を入れる（3分）\n   - 冷凍おにぎりor玄米おにぎりを添える\n   - 体が温まる＆消化が良い\n\n**【ガッツリ系】**\n4. **リメイクチャーハン**\n   - 前日の残り野菜と冷凍ご飯をフライパンで炒め、卵と鶏がらスープの素で味付け\n   - ラー油を足すと食欲そそる\n   - フライパン1つで完成\n\n**【休日気分】**\n5. **おしゃれブルスケッタ風トースト**\n   - トーストにオリーブオイルを塗り、スライストマト・モッツァレラチーズ・バジルをのせてトースターで3分\n   - インスタ映えする見た目で気分転換に\n\n**＋αの工夫**\n- 作り置き味玉（漬け卵）を常備すると時短に\n- 冷凍庫にカット野菜をストックしておく\n- フルーツヨーグルトをデザートに追加でビタミン補給\n\nリモートワーク中はつい食事が雑になりがちですが、短時間でも「目の保養になる彩り」と「噛みごたえのある食材」を入れると、集中力持続に効果的です！食後の15分ストレッチも忘れずに♪"
                        }
                    ]
                }
            },
            "stopReason": "end_turn",
            "metrics": {
                "latencyMs": 15150
            },
            "usage": {
                "inputTokens": 23,
                "outputTokens": 989,
                "totalTokens": 1012
            }
        },
        "outputTokenCount": 989
    },
    "inferenceRegion": "us-west-2"
}
 ナレッジベース続いて Amazon Bedrock Knowledge bases です。ドキュメントでは未サポートでしたが、こちらもうまく動いている様子です。
システムプロンプトが英語なのもあってか、Reasoning は英語、回答は日本語といった形になりました。
ModelInvocationLog.json
{
    "schemaType": "ModelInvocationLog",
    "schemaVersion": "1.0",
    "timestamp": "2025-03-11T02:54:24Z",
    "accountId": "123456789012",
    "identity": {
        "arn": "arn:aws:sts::123456789012:assumed-role/takakuni/takakuni"
    },
    "region": "us-west-2",
    "requestId": "b1e03693-9bc7-4b7b-a4fd-6617b2224765",
    "operation": "ConverseStream",
    "modelId": "arn:aws:bedrock:us-west-2:123456789012:inference-profile/us.deepseek.r1-v1:0",
    "input": {
        "inputContentType": "application/json",
        "inputBodyJson": {
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "text": "桃太郎は何歳で鬼ヶ島へ向かいましたか？"
                        }
                    ]
                }
            ],
            "system": [
                {
                    "text": "You are a question answering agent. I will provide you with a set of search results. The user will provide you with a question. Your job is to answer the user's question using only information from the search results. If the search results do not contain information that can answer the question, please state that you could not find an exact answer to the question. Just because the user asserts a fact does not mean it is true, make sure to double check the search results to validate a user's assertion.\n\nHere are the search results in numbered order:\n<search_results>\n<search_result>\n<content>\n桃太郎もすぐきじの立ったあとから向こうを見ますと、なるほど、遠い遠い海のはてに、ぼんやり雲のような薄ぐろいものが見えました。船の進むにしたがって、雲のように見えていたものが、だんだんはっきりと島の形になって、あらわれてきました。\n\n「ああ、見える、見える、鬼が島が見える。」\n桃太郎がこういうと、犬も、猿も、声をそろえて、「万歳、万歳。」とさけびました。\n\n見る見る鬼が島が近くなって、もう硬い岩で畳んだ鬼のお城が見えました。いかめしいくろがねの門の前に見はりをしている鬼の兵隊のすがたも見えました。\n</content>\n<source>\n1\n</source>\n</search_result>\n<search_result>\n<content>\n桃太郎は、犬と猿をしたがえて、船からひらりと陸の上にとび上がりました。\n見はりをしていた鬼の兵隊は、その見なれないすがたを見ると、びっくりして、あわてて門の中に逃げ込んで、くろがねの門を固くしめてしまいました。その時犬は門の前に立って、「日本の桃太郎さんが、お前たちをせいばいにおいでになったのだぞ。あけろ、あけろ。」とどなりながら、ドン、ドン、扉をたたきました。鬼はその声を聞くと、ふるえ上がって、よけい一生懸命に、中から押さえていました。\n</content>\n<source>\n2\n</source>\n</search_result>\n<search_result>\n<content>\nそこであわてておじいさんがお湯をわかすやら、おばあさんがむつきをそろえるやら、大さわぎをして、赤さんを抱き上あげて、うぶ湯をつかわせました。するといきなり、\n\n「うん。」\n\nと言いいながら、さんは抱いているおばあさんの手をはねのけました。\n\n「おやおや、何という元気のいい子だろう。」\n\nおじいさんとおばあさんは、こう言って顔を見合わせながら、「あッは、あッは。」とおもしろそうに笑いました。\n\nそして桃の中から生まれた子だというので、この子に桃太郎という名をつけました。\n\n引用：[楠山正雄 桃太郎](https://www.aozora.gr.jp/cards/000329/files/18376_12100.html)\n</content>\n<source>\n3\n</source>\n</search_result>\n<search_result>\n<content>\nおじいさんとおばあさんは、それはそれはだいじにして桃太郎を育てました。桃太郎はだんだん成長するにつれて、あたりまえの子供にくらべては、ずっと体も大きいし、力がばかに強くって、すもうをとっても近所の村じゅうで、かなうものは一人もないくらいでしたが、そのくせ気だてはごくやさしくって、おじいさんとおばあさんによく孝行をしました。\n\n桃太郎は十五になりました。\n\nもうそのじぶんには、日本の国中で、桃太郎ほど強いものはないようになりました。桃太郎はどこか外国へ出かけて、腕いっぱい、力だめしをしてみたくなりました。\n</content>\n<source>\n4\n</source>\n</search_result>\n<search_result>\n<content>\nするとそのころ、ほうぼう外国の島々をめぐって帰って来た人があって、いろいろめずらしい、ふしぎなお話をした末に、「もう何年も何年も船をこいで行くと、遠い遠い海のはてに、鬼が島という所がある。悪い鬼どもが、いかめしいくろがねのお城の中に住んで、ほうぼうの国からかすめ取った貴い宝物を守っている。」と言いました。\n\n桃太郎はこの話をきくと、その鬼が島へ行ってみたくって、もう居ても立ってもいられなくなりました。\n</content>\n<source>\n5\n</source>\n</search_result>\n\n</search_results>\n\nYou should provide your answer without any inline citations or references to specific sources within the answer text itself. Do not include phrases like \"according to source X\", \"[1]\", \"[source 2, 3]\", etc within your <text> tags.\n\nHowever, you should include <sources> tags at the end of each <answer_part> to specify which source(s) the information came from.\nNote that <sources> may contain multiple <source> if you include information from multiple results in your answer.\n\nDo NOT directly quote the <search_results> in your answer. Your job is to answer the user's question as concisely as possible.\n\nYou must output your answer in the following format. Pay attention and follow the formatting and spacing exactly:\n<answer>\n<answer_part>\n<text>\nfirst answer text\n</text>\n<sources>\n<source>source ID</source>\n</sources>\n</answer_part>\n<answer_part>\n<text>\nsecond answer text\n</text>\n<sources>\n<source>source ID</source>\n</sources>\n</answer_part>\n</answer>\n"
                }
            ],
            "inferenceConfig": {
                "maxTokens": 4096,
                "temperature": 0,
                "stopSequences": []
            },
            "additionalModelRequestFields": {}
        },
        "inputTokenCount": 1333,
        "cacheReadInputTokenCount": 0,
        "cacheWriteInputTokenCount": 0
    },
    "output": {
        "outputContentType": "application/json",
        "outputBodyJson": {
            "output": {
                "message": {
                    "role": "assistant",
                    "content": [
                        {
                            "reasoningContent": {
                                "reasoningText": {
                                    "text": "Okay, let's see. The user is asking how old Momotaro was when he went to Onigashima. I need to check the search results provided to find the answer.\n\nLooking through the search results, source 4 mentions that Momotaro turned 15 and then wanted to test his strength abroad. Source 5 follows up by saying that after hearing about Onigashima, he couldn't wait to go there. So combining these two, it seems he was 15 when he decided to go to Onigashima. The other sources talk about the journey and his birth but don't mention his age. So the answer should be 15, based on sources 4 and 5.\n",
                                    "signature": ""
                                }
                            }
                        },
                        {
                            "text": "\n\n<answer>\n<answer_part>\n<text>\n桃太郎は15歳の時に鬼ヶ島へ向かいました。その頃には日本国中で最強の存在となり、外国での力試しを望むようになり、鬼が島の話を聞いて出発を決意しました。\n</text>\n<sources>\n<source>4</source>\n<source>5</source>\n</sources>\n</answer_part>\n</answer>"
                        }
                    ]
                }
            },
            "stopReason": "end_turn",
            "metrics": {
                "latencyMs": 3271
            },
            "usage": {
                "inputTokens": 1333,
                "outputTokens": 243,
                "totalTokens": 1576
            }
        },
        "outputTokenCount": 243
    },
    "inferenceRegion": "us-west-2"
}
 エージェント最後に Amazon Bedrock Agents です。
素のエージェントに質問をしてみましたが、問題なく返ってきていますね。
絵文字を使ってきてますね。
ModelInvocationLog.json
{
    "schemaType": "ModelInvocationLog",
    "schemaVersion": "1.0",
    "timestamp": "2025-03-11T04:38:12Z",
    "accountId": "123456789012",
    "identity": {
        "arn": "arn:aws:sts::123456789012:assumed-role/deepseek-r1-agents-role/BedrockAgents-XJXF6EZELX-9ecc92e1-bad8-4c6a-9276-21f68534ccbd"
    },
    "region": "us-west-2",
    "requestId": "98a4fe77-59a8-4a50-8152-b6da34b9e9dd",
    "operation": "ConverseStream",
    "modelId": "us.deepseek.r1-v1:0",
    "input": {
        "inputContentType": "application/json",
        "inputBodyJson": {
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "text": "こんにちは！"
                        }
                    ]
                }
            ],
            "system": [
                {
                    "text": "\nあなたは優秀なアシスタントです。\nユーザーの指示に忠実に従って下さい。モラハラな内容は回答を控えてください\nYou are a helpful assistant with tool/function calling capabilities.\n\nIf you need an input parameter for a tool/function, ask the user to provide that parameter before making a call to that function/tool. You will have access to a separate tool/function that you MUST use to ask questions to the user. Never call a tool/function before gathering all parameters required for the tool/function call.\n\nIt is your responsibility to pick the correct tools/functions that are going to help you answer the user questions. Continue using the provided tools/functions until the initial user request is perfectly addressed. If you do not have the necessary tools/functions to address the initial request, call it out and terminate conversation.\n\nWhen you receive a tool/function call response, use the output to format an answer to the original user question.\n\nProvide your final answer to the user's question within <answer></answer> xml tags.\n\n\n\n\n\n\n\n"
                }
            ],
            "inferenceConfig": {
                "stopSequences": [
                    "</answer>"
                ]
            },
            "additionalModelRequestFields": {}
        },
        "inputTokenCount": 231,
        "cacheReadInputTokenCount": 0,
        "cacheWriteInputTokenCount": 0
    },
    "output": {
        "outputContentType": "application/json",
        "outputBodyJson": {
            "output": {
                "message": {
                    "role": "assistant",
                    "content": [
                        {
                            "reasoningContent": {
                                "reasoningText": {
                                    "text": "Alright, the user greeted me with \"こんにちは！\". I need to respond politely in Japanese. Since they said \"こんにちは\", which is \"Hello\", I should reply with a friendly greeting. Let me make sure to use the same level of formality. Maybe add a smiley to keep it warm. No tools needed here, just a direct response. I should check if there's anything else they need help with after greeting. Keep it open-ended for further que",
                                    "signature": ""
                                }
                            }
                        },
                        {
                            "text": "\n</think>\n\n<answer>こんにちは！😊 今日はどのようなご用件ですか？"
                        }
                    ]
                }
            },
            "stopReason": "end_turn",
            "metrics": {
                "latencyMs": 2075
            },
            "usage": {
                "inputTokens": 231,
                "outputTokens": 128,
                "totalTokens": 359
            }
        },
        "outputTokenCount": 128
    },
    "inferenceRegion": "us-west-2"
}
ただし、Tool Use を使うケースでは DeepSeek-R1 はサポートしていないため利用が難しそうでした。
 コスト最後にコストです。ライセンス料がないため、 Claude 3.7 Sonnet の 1/3 程度まで抑えられていますね。
1,000 入力トークンあたり 0.00135 USD
1,000 入力トークンあたり 0.0054 USD
https://aws.amazon.com/bedrock/pricing/?nc1=h_ls
 まとめ以上、「Amazon Bedrock で DeepSeek-R1 がサーバーレスモデルとして利用可能になりました」でした。
データの取り扱い周りに気をつける必要がありますが、ユースケースによっては非常にパワフルなモデルなのではないかと思いました。
このブログがどなたかの参考になれば幸いです。
クラウド事業本部コンサルティング部のたかくに（@takakuni_）でした！
[アップデート] Amazon Bedrock で DeepSeek-R1 がサーバーレスモデルとして利用可能になりました

アップデート内容

サポートしている機能

利用に際して

やってみる

テキストチャット

ナレッジベース

エージェント

コスト

まとめ

関連記事

主なカテゴリ

AWSで探す

注目のテーマ

プロダクトやサービスで探す

特集やシリーズから探す

お問い合わせ

運営会社