[BERT] Running mobileBERT in Python

山本紘暉

2020.08.02


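I'm Yamamoto from the CX Business Division.

In this post I try out mobileBERT, a machine learning model for natural language processing. The task is question answering: given a passage and a question about that passage, the model returns an answer to the question.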

Overview

BERT

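BERT is a machine learning model for natural language processing. It was announced in 2018 and drew attention for achieving state-of-the-art results on a variety of tasks at the time. The following page explains it in detail.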

What is BERT? An explanation of the features and mechanics of Google's natural language processing model | Ledge.ai

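The overview quoted from that page is as follows.

BERT stands for Bidirectional Encoder Representations from Transformers. It is a natural language processing model presented in October 2018 in a paper by Jacob Devlin and colleagues at Google. Areas of natural language processing work such as translation, document classification, and question answering are called "(natural language processing) tasks", and BERT achieved the best scores of the time across a wide range of these tasks.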

MobileBERT

MobileBERT is a compressed version of the BERT model, designed so that it can run on mobile devices.

MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices

Empirical studies show that MobileBERT is 4.3x smaller and 5.5x faster than BERT_BASE

The following pages describe an example of running it in the browser.

Exploring ways to use BERT with the browser and Tensorflow.js

tensorflow/tfjs-models

Running mobileBERT

Repository used

This time I use the mobileBERT published in the repository below. It wraps the mobileBERT model so that it can be run from Python.

gemde001/MobileBERT

As the repository's installation instructions also note, the mobileBERT model itself is the one published on the page below.

Question and answer | TensorFlow Lite

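Installation

The commands to run in a terminal are as follows.

I ran this on WSL, so inference runs on the CPU. (The commands below should work on other Linux systems as well, and if a GPU is available, I expect inference will run on it.)

If you do not want to change your existing environment, it is better to use venv or something similar.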

mkdir run_mobileBERT
cd run_mobileBERT

git clone https://github.com/gemde001/MobileBERT.git
cd MobileBERT/mobilebert

wget https://storage.googleapis.com/download.tensorflow.org/models/tflite/bert_qa/mobilebert_qa_vocab.zip
unzip mobilebert_qa_vocab.zip

pip install tensorflow
pip install bert-for-tf2
pip install numpy
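For the venv suggestion mentioned earlier, one option is to create the virtual environment with Python's standard `venv` module before running the `pip install` commands above. This is a minimal sketch; the helper name `create_env` and the directory name `.venv` are assumptions chosen for this example, not something the repository requires.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its pyvenv.cfg.

    Hypothetical helper; env_dir can be any directory name you like.
    """
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir) / "pyvenv.cfg"


# Example usage (uncomment to create ./.venv in the current directory):
# create_env(".venv")
# Then activate it in the shell with: source .venv/bin/activate
```

After activation, the three `pip install` commands install into the virtual environment instead of the system Python.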

Running the sample

Create a Python script in the MobileBERT folder with the following contents and save it.

from mobilebert import MobileBERT

passage = """
Google LLC is an American multinational technology company that specializes in Internet-related services and products, which include online advertising technologies, search engine, cloud computing, software, and hardware. It is considered one of the Big Four technology companies, alongside Amazon, Apple, and Facebook.
Google was founded in September 1998 by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University in California. Together they own about 14 percent of its shares and control 56 percent of the stockholder voting power through supervoting stock. They incorporated Google as a California privately held company on September 4, 1998, in California. Google was then reincorporated in Delaware on October 22, 2002. An initial public offering (IPO) took place on August 19, 2004, and Google moved to its headquarters in Mountain View, California, nicknamed the Googleplex. In August 2015, Google announced plans to reorganize its various interests as a conglomerate called Alphabet Inc. Google is Alphabet's leading subsidiary and will continue to be the umbrella company for Alphabet's Internet interests. Sundar Pichai was appointed CEO of Google, replacing Larry Page who became the CEO of Alphabet.
"""

question = "Who is the CEO of Google?"

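# Load the model, then ask the question against the passage.
m = MobileBERT()
answer = m.run(passage, question)

# Print the answer returned by the model.
print(answer)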

Running it prints output like the following. The last line shows that an answer was obtained. (The warnings appear because I am using WSL and have not set up a GPU.)

2020-08-02 12:39:40.551565: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-08-02 12:39:40.551616: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2020-08-02 12:39:41.486269: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-08-02 12:39:41.499742: E tensorflow/stream_executor/cuda/cuda_driver.cc:314] failed call to cuInit: UNKNOWN ERROR (100)
2020-08-02 12:39:41.499802: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (DESKTOP-A5L0RB9): /proc/driver/nvidia/version does not exist
2020-08-02 12:39:41.500090: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-02 12:39:41.506308: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 4199995000 Hz
2020-08-02 12:39:41.508273: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55c0f3841510 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-02 12:39:41.508319: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
sundar pichai
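If the TensorFlow log lines above are distracting, they can usually be filtered with the `TF_CPP_MIN_LOG_LEVEL` environment variable, which has to be set before TensorFlow is imported. The sketch below wraps that in a small helper; the function name `quiet_tensorflow_logs` is made up for this example, and I am assuming that `from mobilebert import MobileBERT` is what triggers the TensorFlow import in the sample script.

```python
import os


def quiet_tensorflow_logs(level="2"):
    """Set TF_CPP_MIN_LOG_LEVEL and return the value that was set.

    "1" hides INFO, "2" also hides WARNING, "3" also hides ERROR.
    Hypothetical helper; call it before anything imports tensorflow.
    """
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(level)
    return os.environ["TF_CPP_MIN_LOG_LEVEL"]


# "3" is used here because the log above contains an E (error) line
# from the failed cuInit call as well as W and I lines.
quiet_tensorflow_logs("3")

# Import the wrapper only after the variable is set, for example:
# from mobilebert import MobileBERT
```

Note that level "3" also hides genuine errors from TensorFlow's C++ logging, so "2" may be the safer choice outside of a quick experiment.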

I also ran it on English news articles and confirmed that it returns good answers. I cannot publish those results here for copyright reasons, so please try it yourself.
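When trying your own text, it is handy to ask several questions against the same passage and see how long each one takes. The following is a rough sketch that relies only on the `run(passage, question)` call used in the sample script. The helper name `ask_all` is made up here, and I am assuming that a single `MobileBERT` object can be reused across calls, which I have not verified against the repository.

```python
import time


def ask_all(model, passage, questions):
    """Ask each question against one passage.

    Returns a list of (question, answer, seconds) tuples. `model` is any
    object with a run(passage, question) method (hypothetical helper).
    """
    results = []
    for question in questions:
        started = time.perf_counter()
        answer = model.run(passage, question)
        elapsed = time.perf_counter() - started
        results.append((question, answer, elapsed))
    return results


# Example usage with the wrapper from this article (not run here):
# from mobilebert import MobileBERT
# questions = ["Who is the CEO of Google?", "When was Google founded?"]
# for q, a, sec in ask_all(MobileBERT(), passage, questions):
#     print(f"{q} -> {a} ({sec:.2f}s)")
```

Creating the model once and passing it in keeps model loading out of the per-question timing.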