
Aggregating GlusterFS Logs with fluentd


Introduction

In the recent article 「GlusterFSで高可用性メールサーバを構築する | Developers.IO」 (building a highly available mail server with GlusterFS), I set up a configuration that distributes files with GlusterFS on Amazon Linux.
Because GlusterFS is a distributed file system, its nodes are closely interrelated. In that setup, however, each node accumulates its logs separately, which makes it hard to see causal relationships between nodes (for example, "when one node detected an anomaly, what state were the other nodes in?") from the logs.

What should we do in a case like this? That's right: fluentd, of course.

In this post, I'll use a plugin called fluent-plugin-glusterfs to aggregate GlusterFS logs!

Architecture

There are two GlusterFS cluster nodes, each placed in a different AZ. We stand up a log server to aggregate their logs, have fluentd forward the logs to it, and save them into a single file.

[Architecture diagram: two GlusterFS cluster nodes in separate AZs forwarding their logs via fluentd to a log server]

Configuring the GlusterFS servers

For the GlusterFS setup itself, please refer to the previous article.

First, install fluentd (td-agent).

$ curl -L http://toolbelt.treasuredata.com/sh/install-redhat.sh | sh

Next, install fluent-plugin-glusterfs.

$ sudo /usr/lib64/fluent/ruby/bin/fluent-gem install fluent-plugin-glusterfs
Fetching: fluent-plugin-glusterfs-1.0.0.gem (100%)
Successfully installed fluent-plugin-glusterfs-1.0.0
1 gem installed
Installing ri documentation for fluent-plugin-glusterfs-1.0.0...
Installing RDoc documentation for fluent-plugin-glusterfs-1.0.0...

By default, GlusterFS's log files are readable and writable only by root, so fluentd cannot read them. Change the permissions so that other users can read them as well.

$ ls -alF /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
-rw------- 1 root root 1385  2月  3 07:21 2014 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
$ sudo chmod +r /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
$ ls -alF /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
-rw-r--r-- 1 root root 1385  2月  3 07:21 2014 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
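
One caveat: if these log files are rotated, the replacement file may well be created root-only again, which would silently break tailing. As a hedge, you could add a logrotate drop-in that recreates the file world-readable. This is a sketch under the assumption that your system rotates GlusterFS logs via logrotate; the file name below is hypothetical, and you should adjust it to match any GlusterFS logrotate config your distribution already ships.

```
# /etc/logrotate.d/glusterfs-logs -- hypothetical drop-in
/var/log/glusterfs/*.log {
    weekly
    rotate 4
    missingok
    # recreate the file world-readable so td-agent can keep tailing it
    create 0644 root root
}
```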

Now for the fluentd configuration. It reads the GlusterFS log file and forwards the entries to the log server.

$ sudo vi /etc/td-agent/td-agent.conf

<source>
  type glusterfs_log
  path /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
  pos_file /var/log/td-agent/etc-glusterfs-glusterd.vol.log.pos
  tag glusterfs_log.glusterd
  format /^(?<message>.*)$/
</source>

<match glusterfs_log.**>
  type forward
  send_timeout 60s
  recover_wait 10s
  heartbeat_interval 1s
  phi_threshold 8
  hard_timeout 60s

  <server>
    name logserver
    host 172.31.10.100
    port 24224
    weight 60
  </server>

  <secondary>
    type file
    path /var/log/td-agent/forward-failed
  </secondary>
</match>

Finally, start the fluentd service. While you're at it, enable it to start automatically at boot.

$ sudo service td-agent start
Starting td-agent:                                         [  OK  ]
$ sudo chkconfig td-agent on

Configuring the log aggregation server

Install fluentd the same way as on the GlusterFS servers.

$ curl -L http://toolbelt.treasuredata.com/sh/install-redhat.sh | sh

Edit the fluentd configuration file so that received GlusterFS logs are saved under /var/log/td-agent/glusterd.

<source>
  type forward
  port 24224
  bind 0.0.0.0
</source>

<match glusterfs_log.glusterd>
  type file
  path /var/log/td-agent/glusterd
</match>
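
Note that the file output buffers records and writes them out in time slices, so entries may not appear on disk immediately. If you want tighter control, the slice granularity and flush delay can be tuned. The values below are hypothetical examples, not recommendations; the parameter names follow the standard out_file plugin.

```
<match glusterfs_log.glusterd>
  type file
  path /var/log/td-agent/glusterd
  # hourly files instead of the default daily slices (example value)
  time_slice_format %Y%m%d%H
  # wait 1m after a slice closes before flushing it to disk
  time_slice_wait 1m
</match>
```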

Start the fluentd service.

$ sudo service td-agent start
Starting td-agent:                                         [  OK  ]
$ sudo chkconfig td-agent on

For the Security Group settings, make sure 24224/tcp and 24224/udp are reachable from the GlusterFS servers to the log server (the forward output sends records over TCP and heartbeats over UDP).

Verification

Stop a volume (gluster volume stop VOL) and... logs get saved to /var/log/td-agent/glusterd like this! Notice that hostname carries several different node names!

2014-02-03T11:28:02+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:02","time_usec":"974779","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"866","function_name":"glusterd_handle_cli_get_volume","component_name":"0-glusterd","message":"Received get vol req","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"823180","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"502","function_name":"glusterd_handle_cluster_lock","component_name":"0-glusterd","message":"Received LOCK from uuid: b5fca2b0-d656-4149-8e9e-f29feacefd54","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"823265","log_level":"I","source_file_name":"glusterd-utils.c","source_line":"285","function_name":"glusterd_lock","component_name":"0-glusterd","message":"Cluster lock held by b5fca2b0-d656-4149-8e9e-f29feacefd54","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"823329","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"1322","function_name":"glusterd_op_lock_send_resp","component_name":"0-glusterd","message":"Responded, ret: 0","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"827426","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"1366","function_name":"glusterd_handle_cluster_unlock","component_name":"0-glusterd","message":"Received UNLOCK from uuid: b5fca2b0-d656-4149-8e9e-f29feacefd54","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"827486","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"1342","function_name":"glusterd_op_unlock_send_resp","component_name":"0-glusterd","message":"Responded to unlock, ret: 0","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:16+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:16","time_usec":"155641","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"866","function_name":"glusterd_handle_cli_get_volume","component_name":"0-glusterd","message":"Received get vol req","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"820585","log_level":"I","source_file_name":"glusterd-volume-ops.c","source_line":"354","function_name":"glusterd_handle_cli_stop_volume","component_name":"0-glusterd","message":"Received stop vol reqfor volume gVol0","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"820676","log_level":"I","source_file_name":"glusterd-utils.c","source_line":"285","function_name":"glusterd_lock","component_name":"0-glusterd","message":"Cluster lock held by b5fca2b0-d656-4149-8e9e-f29feacefd54","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"820698","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"463","function_name":"glusterd_op_txn_begin","component_name":"0-management","message":"Acquired local lock","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"823246","log_level":"I","source_file_name":"glusterd-rpc-ops.c","source_line":"548","function_name":"glusterd3_1_cluster_lock_cbk","component_name":"0-glusterd","message":"Received ACC from uuid: 83c8d48a-071e-4934-920f-b0fb8c0acdf4","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"825009","log_level":"E","source_file_name":"glusterd-volume-ops.c","source_line":"909","function_name":"glusterd_op_stage_stop_volume","component_name":"0-","message":"Volume gVol0 has not been started","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"825047","log_level":"E","source_file_name":"glusterd-op-sm.c","source_line":"1999","function_name":"glusterd_op_ac_send_stage_op","component_name":"0-","message":"Staging failed","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"825080","log_level":"I","source_file_name":"glusterd-op-sm.c","source_line":"2039","function_name":"glusterd_op_ac_send_stage_op","component_name":"0-glusterd","message":"Sent op req to 0 peers","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"827376","log_level":"I","source_file_name":"glusterd-rpc-ops.c","source_line":"607","function_name":"glusterd3_1_cluster_unlock_cbk","component_name":"0-glusterd","message":"Received ACC from uuid: 83c8d48a-071e-4934-920f-b0fb8c0acdf4","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00       glusterfs_log.glusterd  {"date":"2014-02-03","time":"11:28:21","time_usec":"827416","log_level":"I","source_file_name":"glusterd-op-sm.c","source_line":"2653","function_name":"glusterd_op_txn_complete","component_name":"0-glusterd","message":"Cleared local lock","hostname":"ip-172-31-12-220"}
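
Since each aggregated record carries the originating hostname field, a quick per-node breakdown needs nothing beyond standard tools. A minimal sketch follows; the inline sample stands in for the JSON payloads above, and in practice you would feed in the files the out_file plugin wrote under /var/log/td-agent/.

```shell
# Tally aggregated records per node by extracting the "hostname" field.
# The $sample variable stands in for the real aggregated log file.
sample='{"message":"Received get vol req","hostname":"ip-172-31-27-73"}
{"message":"Staging failed","hostname":"ip-172-31-12-220"}
{"message":"Cleared local lock","hostname":"ip-172-31-12-220"}'
printf '%s\n' "$sample" \
  | grep -o '"hostname":"[^"]*"' \
  | sort | uniq -c | sort -rn
```

Swap the printf for `cat /var/log/td-agent/glusterd.*.log` to run it against real output.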

Summary

From an operations standpoint, centralized log management and backup are very important, and I think fluentd is the best software for achieving them. This time we simply saved the logs to a file, but you could just as well push them straight into something like Elasticsearch and make them searchable. The possibilities are exciting.
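
As a sketch of that idea: with fluent-plugin-elasticsearch installed on the log server, the file output could be swapped for a match block like the one below. The host and port are placeholders for your own Elasticsearch endpoint, and the parameter names follow that plugin's documentation; treat this as a starting point, not a tested configuration.

```
<match glusterfs_log.**>
  type elasticsearch
  # placeholder endpoint; point this at your Elasticsearch node
  host localhost
  port 9200
  # write logstash-style daily indices so Kibana can pick them up
  logstash_format true
</match>
```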