Classifying Screws with Amazon SageMaker

yoshim

2018.08.10


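Overview

Hello, this is yoshim. For certain reasons, this time I tried classifying "screw images" with SageMaker. (Added August 12, 2018: this is a check of how well things go when I train on the images I had on hand without, for now, touching the parameters.)

1. Introduction

This time I followed the tutorial I introduced the other day and applied it to classifying screw images.

The goal is to build a model that classifies a screw as one of three types: pan-head screw, flat-head screw, or wing screw.

・Pan-head screw (nabeneji)

・Flat-head screw (saraneji)

・Wing screw (tyouneji)

I prepared images of these three screw types as labeled training data. Telling pan-head and flat-head screws apart looks difficult. I don't know how far this will get, but let's give it a try.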

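2. Trying it out

For the actual code, please see the blog post I introduced the other day. The only change is that the images it reads are the screw data prepared for this post.

This time I vary the "num_layers" and "epochs" hyperparameters and check the accuracy on the validation dataset. (The dataset is small, though...)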
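The comparison amounts to a small grid over two hyperparameters. A minimal sketch of enumerating the four runs follows; the helper name build_runs is hypothetical and is not part of the tutorial code.

```python
from itertools import product

# Values compared in this post; all other hyperparameters stay unchanged.
NUM_LAYERS_OPTIONS = [18, 152]
EPOCHS_OPTIONS = [10, 50]


def build_runs(num_layers_options, epochs_options):
    """Return one settings dict per (num_layers, epochs) combination."""
    return [
        {"num_layers": num_layers, "epochs": epochs}
        for num_layers, epochs in product(num_layers_options, epochs_options)
    ]


runs = build_runs(NUM_LAYERS_OPTIONS, EPOCHS_OPTIONS)
for run in runs:
    print(run)
```

The runs come out in the same order as sections 2-1 to 2-4.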

2-1. Minimal configuration

I ran with "num_layers" = 18 and "epochs" = 10. Training finished in about 6 minutes on a "p2.xlarge" instance.

・Hyperparameters

# The algorithm supports multiple network depths (number of layers): 18, 34, 50, 101, 152 and 200
# For this training, we will use 18 layers
num_layers = 18
# we need to specify the input image shape for the training data
image_shape = "3,224,224"
# we also need to specify the number of training samples in the training set
num_training_samples = 114
# specify the number of output classes
num_classes = 3
# batch size for training
mini_batch_size = 10
# number of epochs
epochs = 10
# learning rate
learning_rate = 0.01
# report top-k accuracy (k = 1 here, so this is plain top-1 accuracy)
top_k = 1
# resize image before training
resize = 256
# period to store model parameters (in number of epochs); with a value of 2, parameters are saved every 2 epochs
checkpoint_frequency = 2
# Since we are using transfer learning, we set use_pretrained_model to 1 so that weights can be 
# initialized with pre-trained weights
use_pretrained_model = 1
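These variables end up in the training job request. As far as I know, the low-level CreateTrainingJob request takes hyperparameters as a string-to-string mapping, so numeric values have to be converted first. The sketch below only builds that mapping and calls no AWS API; the helper name to_hyperparameters is hypothetical.

```python
def to_hyperparameters(settings):
    """Convert Python values to an all-string hyperparameter mapping."""
    return {name: str(value) for name, value in settings.items()}


# Same values as the block above.
settings = {
    "num_layers": 18,
    "image_shape": "3,224,224",
    "num_training_samples": 114,
    "num_classes": 3,
    "mini_batch_size": 10,
    "epochs": 10,
    "learning_rate": 0.01,
    "top_k": 1,
    "resize": 256,
    "checkpoint_frequency": 2,
    "use_pretrained_model": 1,
}

hyperparameters = to_hyperparameters(settings)
print(hyperparameters)
```

Keeping the settings in one dict also makes it easy to change only "num_layers" and "epochs" between runs.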

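Let's look at how the accuracy evolved.

With so few epochs, the accuracy has not settled yet. For now, let's feed in some actual images and see how the model does. The images used here were not part of either training or validation.

・Flat-head screw

Result: label - saraneji, probability - 0.5205193161964417

The model predicted a flat-head screw with a probability of about 52%.

・Wing screw

Result: label - tyouneji, probability - 0.9986796975135803

The model predicted a wing screw with a probability of over 99%.

・Pan-head screw

Result: label - nabeneji, probability - 0.9971808195114136

The model predicted a pan-head screw with a probability of over 99%.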
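Each Result line above reports a single label and probability. A minimal sketch of how such a line can be derived, assuming the endpoint returns one probability per class and the highest one is reported. The class order and the example probabilities are made up for illustration, and top_prediction is a hypothetical helper.

```python
# Assumed class order. It must match the label indices used when the
# training data was prepared, which this post does not show.
OBJECT_CATEGORIES = ["nabeneji", "saraneji", "tyouneji"]


def top_prediction(probabilities, categories=OBJECT_CATEGORIES):
    """Return (label, probability) for the highest-scoring class."""
    if len(probabilities) != len(categories):
        raise ValueError("expected one probability per class")
    index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return categories[index], probabilities[index]


# Hypothetical endpoint output, not taken from the actual model.
label, probability = top_prediction([0.40, 0.52, 0.08])
print(f"Result: label - {label}, probability - {probability}")
```

Seen this way, a top probability of about 52% in a 3-class problem means the other two classes share the remaining 48%, so the flat-head prediction is a weak one.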

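The flat-head classification looked shaky, but the result is not bad. That said, with so few epochs this may simply be a lucky outcome. Next, I'll increase the number of epochs.

2-2. Increasing the number of epochs

I ran with "num_layers" = 18 and "epochs" = 50. Training finished in about 9 minutes on p2.xlarge.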

# The algorithm supports multiple network depths (number of layers): 18, 34, 50, 101, 152 and 200
# For this training, we will use 18 layers
num_layers = 18
# we need to specify the input image shape for the training data
image_shape = "3,224,224"
# we also need to specify the number of training samples in the training set
num_training_samples = 114
# specify the number of output classes
num_classes = 3
# batch size for training
mini_batch_size = 10
# number of epochs
epochs = 50
# learning rate
learning_rate = 0.01
# report top-k accuracy (k = 1 here, so this is plain top-1 accuracy)
top_k = 1
# resize image before training
resize = 256
# period to store model parameters (in number of epochs); with a value of 2, parameters are saved every 2 epochs
checkpoint_frequency = 2
# Since we are using transfer learning, we set use_pretrained_model to 1 so that weights can be 
# initialized with pre-trained weights
use_pretrained_model = 1

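Let's look at how the accuracy evolved.

These numbers are honestly too good to be true, lol. For now, let's feed in some actual images and see how the model does. The images are the same ones used in "2-1. Minimal configuration".

・Flat-head screw

Result: label - saraneji, probability - 0.905545711517334

・Wing screw

Result: label - tyouneji, probability - 0.9991187453269958

・Pan-head screw

Result: label - saraneji, probability - 0.9510487914085388

Validation accuracy is high, yet the model classified the pan-head screw as a flat-head screw. The training set may be too small, causing the model to overfit.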
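One reason a high validation accuracy can still mislead is that a small validation set leaves a wide margin of error. The sketch below computes a Wilson score interval for an accuracy estimate. The validation set size is not stated in this post, so the 30 images used here are purely hypothetical.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Hypothetical validation set: 30 images, all classified correctly.
low, high = wilson_interval(30, 30)
print(f"{low:.3f} - {high:.3f}")
```

Even with every one of the 30 hypothetical images correct, the lower bound stays below 0.9, so a perfect-looking score on a set this small says less than it seems to.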

2-3. Making the network deeper

I ran with "num_layers" = 152 and "epochs" = 10. Training finished in about 10 minutes on p2.xlarge.

# The algorithm supports multiple network depths (number of layers): 18, 34, 50, 101, 152 and 200
# For this training, we will use 152 layers
num_layers = 152
# we need to specify the input image shape for the training data
image_shape = "3,224,224"
# we also need to specify the number of training samples in the training set
num_training_samples = 114
# specify the number of output classes
num_classes = 3
# batch size for training
mini_batch_size = 10
# number of epochs
epochs = 10
# learning rate
learning_rate = 0.01
# report top-k accuracy (k = 1 here, so this is plain top-1 accuracy)
top_k = 1
# resize image before training
resize = 256
# period to store model parameters (in number of epochs); with a value of 2, parameters are saved every 2 epochs
checkpoint_frequency = 2
# Since we are using transfer learning, we set use_pretrained_model to 1 so that weights can be 
# initialized with pre-trained weights
use_pretrained_model = 1

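Let's look at how the accuracy evolved.

Perhaps because there are only 10 epochs, the accuracy is still far from settling.

・Flat-head screw

Result: label - saraneji, probability - 0.6383892893791199

・Wing screw

Result: label - tyouneji, probability - 0.9999992847442627

・Pan-head screw

Result: label - tyouneji, probability - 1.0

The pan-head screw is not classified correctly...

2-4. Making the network deeper and increasing the number of epochs

I ran with "num_layers" = 152 and "epochs" = 50. Training finished in about 27 minutes on p2.xlarge.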

# The algorithm supports multiple network depths (number of layers): 18, 34, 50, 101, 152 and 200
# For this training, we will use 152 layers
num_layers = 152
# we need to specify the input image shape for the training data
image_shape = "3,224,224"
# we also need to specify the number of training samples in the training set
num_training_samples = 114
# specify the number of output classes
num_classes = 3
# batch size for training
mini_batch_size = 10
# number of epochs
epochs = 50
# learning rate
learning_rate = 0.01
# report top-k accuracy (k = 1 here, so this is plain top-1 accuracy)
top_k = 1
# resize image before training
resize = 256
# period to store model parameters (in number of epochs); with a value of 2, parameters are saved every 2 epochs
checkpoint_frequency = 2
# Since we are using transfer learning, we set use_pretrained_model to 1 so that weights can be 
# initialized with pre-trained weights
use_pretrained_model = 1

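Let's look at how the accuracy evolved.

It has more or less settled down.

・Flat-head screw

Result: label - saraneji, probability - 0.9747077822685242

・Wing screw

Result: label - tyouneji, probability - 0.9123058915138245

・Pan-head screw

Result: label - tyouneji, probability - 0.9999971389770508

As expected, the pan-head screw still isn't classified correctly...

3. Summary

This time I took on screw classification. Given the small dataset and the near absence of hyperparameter tuning, I think the results are passable.

Results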

Parameters	Flat-head screw	Pan-head screw	Wing screw
"num_layers" = 18, "epochs" = 10	○ (52%)	○ (99%)	○ (99%)
"num_layers" = 18, "epochs" = 50	○ (90%)	×	○ (99%)
"num_layers" = 152, "epochs" = 10	○ (63%)	×	○ (99%)
"num_layers" = 152, "epochs" = 50	○ (97%)	×	○ (91%)
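The table can also be read as a small confusion count. The sketch below tallies the 12 test predictions reported in sections 2-1 to 2-4 by true and predicted label; the helper names are hypothetical.

```python
from collections import Counter

# (true label, predicted label) for the 12 test predictions in this post,
# listed in the order of sections 2-1 to 2-4.
PREDICTIONS = [
    ("saraneji", "saraneji"), ("tyouneji", "tyouneji"), ("nabeneji", "nabeneji"),
    ("saraneji", "saraneji"), ("tyouneji", "tyouneji"), ("nabeneji", "saraneji"),
    ("saraneji", "saraneji"), ("tyouneji", "tyouneji"), ("nabeneji", "tyouneji"),
    ("saraneji", "saraneji"), ("tyouneji", "tyouneji"), ("nabeneji", "tyouneji"),
]


def confusion_counts(pairs):
    """Count how often each (true, predicted) pair occurs."""
    return Counter(pairs)


def per_class_correct(pairs):
    """Return {label: (correct, total)} for each true label."""
    totals = Counter(true for true, _ in pairs)
    hits = Counter(true for true, predicted in pairs if true == predicted)
    return {label: (hits[label], totals[label]) for label in totals}


for label, (correct, total) in per_class_correct(PREDICTIONS).items():
    print(f"{label}: {correct}/{total}")
```

With only one test image per class and run, this is an anecdotal tally rather than an accuracy estimate, but it does show that every miss involves the pan-head screw.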

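The main open issue is getting the model to classify pan-head screws correctly. (The "num_layers" = 18, "epochs" = 10 run is left out of consideration here because, with so few epochs, the model had not trained enough.)

I think the biggest cause of these results is the small training dataset. (Training used roughly 100 images in total across 3 classes, so overfitting is quite likely.) It is also possible that the misclassifications on the validation dataset are skewed toward pan-head screws.

Next time I'd like to try again with a larger dataset, automated parameter tuning, and adjustments to parameters such as "optimizer" and "augmentation_type".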
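As a starting point for that follow-up, the two extra parameters could be layered on top of the settings used here. The values "adam" and "crop_color_transform" below are what I understand the built-in image classification algorithm to accept, but they are assumptions that should be checked against the hyperparameter reference before use.

```python
# Settings from section 2-1 that are relevant to this sketch.
baseline = {"num_layers": 18, "epochs": 10, "learning_rate": 0.01}

# Candidate additions for a follow-up run. Both values are assumptions to be
# verified against the algorithm's hyperparameter documentation.
follow_up = {
    **baseline,
    "optimizer": "adam",
    "augmentation_type": "crop_color_transform",
}

print(follow_up)
```

Data augmentation is the more promising of the two for a dataset of around 100 images, since it is aimed at the same problem as collecting more images.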