Running DeepSeek on Amazon Bedrock with the Custom Model Import Feature

#DeepSeek
#Amazon Bedrock
森田力
2025.02.08
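Hello, this is Morita.
Introduction
Is everyone using DeepSeek?
DeepSeek is released as open source, so developers can run it in whatever environment they like.
If you are going to run it on AWS, you may as well run it on Amazon Bedrock.
Running it on Amazon Bedrock has the advantage of integrating seamlessly with the features that surround an LLM, such as Guardrails and Agents.
There are several ways to run an arbitrary model on Amazon Bedrock; this time I use the Custom Model Import feature.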
Trying It Out
Prerequisites
This time I use us-east-1 (N. Virginia).
I also tried us-west-2 (Oregon), but the import failed there with the following error.
Amazon Bedrock cannot import the model. Make sure the model files are in Huggingface weights format and that you can load them with the Huggingface method.
Downloading the Model
First, to download the model to my local PC (Mac) via Git, I install Git Large File Storage (LFS).
brew install git-lfs
git lfs install
This time I download DeepSeek's DeepSeek-R1-Distill-Llama-8B.
git clone https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B
Because the files are large, the download took about 10 to 15 minutes.
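Before moving on, it can be worth confirming that the large weight files really arrived. The following is a minimal sketch, assuming the weights are stored as *.safetensors files at the top level of the cloned directory; the helper name find_lfs_pointers is made up for illustration. When Git LFS is not active, a clone leaves small pointer files whose first line starts with a fixed version string, and the check looks for that.

```python
from pathlib import Path

# First line of a Git LFS pointer file, as defined by the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return names of *.safetensors files that are still LFS pointer stubs."""
    stubs = []
    for path in sorted(Path(model_dir).glob("*.safetensors")):
        # Only the first few bytes are needed to recognize a pointer file.
        with path.open("rb") as f:
            head = f.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            stubs.append(path.name)
    return stubs


# Example (directory name as created by the git clone command):
# print(find_lfs_pointers("DeepSeek-R1-Distill-Llama-8B"))
```

An empty list suggests the weights were fetched in full; any names in the list point to files that would need to be pulled again with Git LFS.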
Modifying tokenizer_config.json
Changing chat_template in tokenizer_config.json makes the Converse API usable.
tokenizer_config.json
{
  "add_bos_token": true,
  "add_eos_token": false,
  "bos_token": {
    "__type": "AddedToken",
    "content": "<｜begin▁of▁sentence｜>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "clean_up_tokenization_spaces": false,
  "eos_token": {
    "__type": "AddedToken",
    "content": "<｜end▁of▁sentence｜>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "legacy": true,
  "model_max_length": 16384,
  "pad_token": {
    "__type": "AddedToken",
    "content": "<｜end▁of▁sentence｜>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "sp_model_kwargs": {},
  "unk_token": null,
  "tokenizer_class": "LlamaTokenizerFast",
>  "chat_template": "{{- bos_token }}\n{%- if custom_tools is defined %}\n    {%- set tools = custom_tools %}\n{%- endif %}\n{%- if not tools_in_user_message is defined %}\n    {%- set tools_in_user_message = true %}\n{%- endif %}\n{%- if not date_string is defined %}\n    {%- set date_string = \"26 Jul 2024\" %}\n{%- endif %}\n{%- if not tools is defined %}\n    {%- set tools = none %}\n{%- endif %}\n\n{#- This block extracts the system message, so we can slot it into the right place. #}\n{%- if messages[0]['role'] == 'system' %}\n    {%- set system_message = messages[0]['content']|trim %}\n    {%- set messages = messages[1:] %}\n{%- else %}\n    {%- set system_message = \"\" %}\n{%- endif %}\n\n{#- System message + builtin tools #}\n{{- \"<|start_header_id|>system<|end_header_id|>\\n\\n\" }}\n{%- if builtin_tools is defined or tools is not none %}\n    {{- \"Environment: ipython\\n\" }}\n{%- endif %}\n{%- if builtin_tools is defined %}\n    {{- \"Tools: \" + builtin_tools | reject('equalto', 'code_interpreter') | join(\", \") + \"\\n\\n\"}}\n{%- endif %}\n{{- \"Cutting Knowledge Date: December 2023\\n\" }}\n{{- \"Today Date: \" + date_string + \"\\n\\n\" }}\n{%- if tools is not none and not tools_in_user_message %}\n    {{- \"You have access to the following functions. To call a function, please respond with JSON for a function call.\" }}\n    {{- 'Respond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}.' 
}}\n    {{- \"Do not use variables.\\n\\n\" }}\n    {%- for t in tools %}\n        {{- t | tojson(indent=4) }}\n        {{- \"\\n\\n\" }}\n    {%- endfor %}\n{%- endif %}\n{{- system_message }}\n{{- \"<|eot_id|>\" }}\n\n{#- Custom tools are passed in a user message with some extra guidance #}\n{%- if tools_in_user_message and not tools is none %}\n    {#- Extract the first user message so we can plug it in here #}\n    {%- if messages | length != 0 %}\n        {%- set first_user_message = messages[0]['content']|trim %}\n        {%- set messages = messages[1:] %}\n    {%- else %}\n        {{- raise_exception(\"Cannot put tools in the first user message when there's no first user message!\") }}\n{%- endif %}\n    {{- '<|start_header_id|>user<|end_header_id|>\\n\\n' -}}\n    {{- \"Given the following functions, please respond with a JSON for a function call \" }}\n    {{- \"with its proper arguments that best answers the given prompt.\\n\\n\" }}\n    {{- 'Respond in the format {\"name\": function name, \"parameters\": dictionary of argument name and its value}.' 
}}\n    {{- \"Do not use variables.\\n\\n\" }}\n    {%- for t in tools %}\n        {{- t | tojson(indent=4) }}\n        {{- \"\\n\\n\" }}\n    {%- endfor %}\n    {{- first_user_message + \"<|eot_id|>\"}}\n{%- endif %}\n\n{%- for message in messages %}\n    {%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}\n        {{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\\n\\n'+ message['content'] | trim + '<|eot_id|>' }}\n    {%- elif 'tool_calls' in message %}\n        {%- if not message.tool_calls|length == 1 %}\n            {{- raise_exception(\"This model only supports single tool-calls at once!\") }}\n        {%- endif %}\n        {%- set tool_call = message.tool_calls[0].function %}\n        {%- if builtin_tools is defined and tool_call.name in builtin_tools %}\n            {{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' -}}\n            {{- \"<|python_tag|>\" + tool_call.name + \".call(\" }}\n            {%- for arg_name, arg_val in tool_call.arguments | items %}\n                {{- arg_name + '=\"' + arg_val + '\"' }}\n                {%- if not loop.last %}\n                    {{- \", \" }}\n                {%- endif %}\n                {%- endfor %}\n            {{- \")\" }}\n        {%- else  %}\n            {{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' -}}\n            {{- '{\"name\": \"' + tool_call.name + '\", ' }}\n            {{- '\"parameters\": ' }}\n            {{- tool_call.arguments | tojson }}\n            {{- \"}\" }}\n        {%- endif %}\n        {%- if builtin_tools is defined %}\n            {#- This means we're in ipython mode #}\n            {{- \"<|eom_id|>\" }}\n        {%- else %}\n            {{- \"<|eot_id|>\" }}\n        {%- endif %}\n    {%- elif message.role == \"tool\" or message.role == \"ipython\" %}\n        {{- \"<|start_header_id|>ipython<|end_header_id|>\\n\\n\" }}\n        {%- if message.content is mapping or message.content is iterable %}\n     
       {{- message.content | tojson }}\n        {%- else %}\n            {{- message.content }}\n        {%- endif %}\n        {{- \"<|eot_id|>\" }}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|start_header_id|>assistant<|end_header_id|>\\n\\n' }}\n{%- endif %}\n"
}
The base model of DeepSeek-R1-Distill-Llama-8B is Llama-3.1-8B, so I replaced chat_template with the Llama 3.1 template from the AWS documentation.
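Editing a JSON string of that length by hand is error-prone, so the replacement can also be scripted. This is a rough sketch, not the procedure used in the article: it assumes the Llama 3.1 template has been saved to a separate text file, and both the function name set_chat_template and the file name in the usage comment are hypothetical.

```python
import json
from pathlib import Path


def set_chat_template(config_path, template_text):
    """Overwrite the chat_template key in tokenizer_config.json and return the config."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["chat_template"] = template_text
    # ensure_ascii=False keeps special tokens such as <｜begin▁of▁sentence｜> readable.
    path.write_text(
        json.dumps(config, ensure_ascii=False, indent=2) + "\n", encoding="utf-8"
    )
    return config


# Example (llama31_chat_template.jinja is a hypothetical file holding the template):
# template = Path("llama31_chat_template.jinja").read_text(encoding="utf-8")
# set_chat_template("DeepSeek-R1-Distill-Llama-8B/tokenizer_config.json", template)
```

Letting json.dumps handle the escaping avoids hand-writing the \n and \" sequences inside the template string.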
References
Model card
AWS documentation
Uploading to S3
Model import reads the data from an S3 bucket.
Run the following commands to create the bucket and upload the model.
export AWS_REGION=us-east-1
aws s3 mb s3://import-model-data-xxxxxxxx
aws s3 sync DeepSeek-R1-Distill-Llama-8B  s3://import-model-data-xxxxxxxx/DeepSeek-R1-Distill-Llama-8B/
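A directory created with git clone also contains a .git folder, and with Git LFS that folder can hold additional copies of the large files, so it may be worth leaving it out of the upload. The sketch below shows one way to do that with boto3 instead of the CLI. The function names are made up for illustration, the bucket name is the placeholder used in the commands, and boto3 is imported only inside upload so the key planning can be checked on its own.

```python
from pathlib import Path


def plan_uploads(model_dir, prefix):
    """Map each local file to an S3 key, skipping everything under .git."""
    root = Path(model_dir)
    plan = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if path.is_file() and ".git" not in rel.parts:
            plan.append((str(path), f"{prefix.rstrip('/')}/{rel.as_posix()}"))
    return plan


def upload(model_dir, bucket, prefix):
    """Upload the planned files one by one with the S3 client."""
    import boto3  # imported lazily so plan_uploads works without boto3 installed

    s3 = boto3.client("s3")
    for local_path, key in plan_uploads(model_dir, prefix):
        s3.upload_file(local_path, bucket, key)


# Example (bucket name is a placeholder):
# upload("DeepSeek-R1-Distill-Llama-8B", "import-model-data-xxxxxxxx",
#        "DeepSeek-R1-Distill-Llama-8B")
```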
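Running the Job
From here on, the work is done in the AWS Management Console.
For the import job, specify any names you like for the model name and the import job name.
In the model import settings, specify the Amazon S3 bucket created earlier.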
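The article uses the console, but the same job can presumably be started through the API as well. This is a minimal sketch using the boto3 bedrock client's create_model_import_job call. The job name, model name, and role ARN in the usage comment are hypothetical values; in particular, the IAM service role is not covered in this article and would have to exist with read access to the bucket.

```python
def build_import_job_request(job_name, model_name, role_arn, s3_uri):
    """Build keyword arguments for the Bedrock CreateModelImportJob API."""
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }


def start_import_job(request, region="us-east-1"):
    """Submit the import job and return its ARN."""
    import boto3  # imported lazily so the request builder can be checked offline

    bedrock = boto3.client("bedrock", region_name=region)
    response = bedrock.create_model_import_job(**request)
    return response["jobArn"]


# Example (all names and the role ARN are hypothetical):
# request = build_import_job_request(
#     "deepseek-import-job",
#     "deepseek-r1-distill-llama-8b",
#     "arn:aws:iam::123456789012:role/HypotheticalBedrockImportRole",
#     "s3://import-model-data-xxxxxxxx/DeepSeek-R1-Distill-Llama-8B/",
# )
# print(start_import_job(request))
```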
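Invoking the Model
Once the job succeeds, the imported model appears in the console.
Playground
Invoking the model from the playground right away produces the error "Model is not ready for inference."
Imported models are scaled internally, and the model has not been started yet at the time of the first access, which is why this error occurs.
If you wait a while and run it again, the playground returns output as well.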
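The same wait-and-retry behavior can be automated when calling the model from code. The following is a rough sketch, assuming the not-ready error surfaces as a botocore ClientError whose error code is ModelNotReadyException; the helper names, attempt count, and wait time are arbitrary choices for illustration.

```python
import time


def is_model_not_ready(exc):
    """True if exc looks like a ClientError with code ModelNotReadyException."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ModelNotReadyException"


def call_with_retry(fn, attempts=5, delay_seconds=30.0, sleep=time.sleep):
    """Call fn(); while it fails with a not-ready error, wait and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on unrelated errors, or once the last attempt has failed.
            if not is_model_not_ready(exc) or attempt == attempts:
                raise
            sleep(delay_seconds)


# Example (client and arguments as in a normal Converse API call):
# response = call_with_retry(lambda: client.converse(modelId=model_id, messages=messages))
```

The sleep parameter exists only so the waiting can be replaced in tests; in normal use the default time.sleep applies.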
Running via the API
Finally, I invoke the model with the Converse API, using its streaming variant.
converse.py
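import boto3
from botocore.config import Config

# SDK retry settings, meant to ride out transient errors such as the
# model not being ready right after the import.
config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'standard'
    }
)

# Pass config so that the retry settings above actually take effect.
client = boto3.client("bedrock-runtime", region_name="us-east-1", config=config)

# Replace with the ID of the imported model.
model_id = 'MODEL_ID'

prompt = "hello"

try:
    streaming_response = client.converse_stream(
        modelId=model_id,
        messages=[
            {
                "role": "user",
                "content": [{"text": prompt}],
            }
        ],
        inferenceConfig={"maxTokens": 2048, "temperature": 1.0, "topP": 0.9},
    )

    # Print each text fragment as soon as it arrives.
    for chunk in streaming_response["stream"]:
        if "contentBlockDelta" in chunk:
            text = chunk["contentBlockDelta"]["delta"].get("text", "")
            print(text, end="")
except Exception as e:
    # repr() shows the exception type together with its message.
    print(repr(e))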
I was able to get output without any problems.
Output
hello! how can i help you today?`)
</think>

Hello! How can I assist you today?%  
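In the output above, the text before </think> appears to be the model's reasoning and the text after it the actual answer. If only the answer is needed, the collected text can be split at that marker. This is a small sketch under that assumption; split_reasoning is a hypothetical helper, and it expects the full response text to have been accumulated first rather than printed chunk by chunk.

```python
def split_reasoning(text, marker="</think>"):
    """Split model output into (reasoning, answer) at the last closing marker."""
    reasoning, found, answer = text.rpartition(marker)
    if not found:
        # No marker in the text: treat everything as the answer.
        return "", text.strip()
    # Drop an opening tag if the model emitted one.
    reasoning = reasoning.replace("<think>", "", 1)
    return reasoning.strip(), answer.strip()


# Example:
# reasoning, answer = split_reasoning(full_text)
# print(answer)
```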
Cost
Because the base model is Llama 3.1 8B, you are charged for 2 Custom Model Units plus storage.
For details, see the following page.
https://aws.amazon.com/jp/bedrock/pricing/
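To get a feel for how the 2 Custom Model Units translate into a bill, the arithmetic can be sketched as follows. Everything numeric here is an assumption for illustration: the per-minute price is a placeholder rather than an actual AWS price, the 5-minute billing window should be confirmed on the pricing page, and storage is not included.

```python
import math


def estimate_inference_cost(active_minutes, price_per_cmu_minute,
                            cmu_count=2, window_minutes=5):
    """Estimate cost: active time rounded up to whole windows, times the CMU count."""
    windows = math.ceil(active_minutes / window_minutes)
    billed_minutes = windows * window_minutes
    return billed_minutes * cmu_count * price_per_cmu_minute


# Example with a placeholder price of 0.5 per Custom Model Unit per minute:
# 12 active minutes are rounded up to 15 billed minutes, then multiplied by 2 units.
# print(estimate_inference_cost(12, 0.5))
```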
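Conclusion
This article moved through the steps quickly, but each of the following takes quite a while:
Downloading the model
Uploading the model
Running the import job
Still, running the model on Amazon Bedrock has real advantages in terms of the features surrounding the LLM and security, so please give it a try.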
References
https://dev.classmethod.jp/articles/try-amazon-bedrock-custom-model-import