Getting Started with HashiCorp Consul (Part 3: Creating a Cluster and Health Checks)

I worked through the Consul tutorial (Consul Curriculum).

HashiCorp Consul

加藤諒

2019.04.15


Good morning, this is Kato. This is part 3 of my series working through the Consul tutorial. This time I created a cluster and tried out health checks.

Getting Started with HashiCorp Consul (Part 1: Installation to Startup)

Getting Started with HashiCorp Consul (Part 2: The Connect Feature)

Consul by HashiCorp
The Consul main page
Consul Curriculum - HashiCorp Learn
The Consul tutorial

What we'll do

We start a Consul server and a Consul client, have the client join the cluster (with only a single server), and observe how health checks behave. The environment is built with Docker.

Creating a Consul cluster

In production, Consul runs in a server and client configuration. For this walkthrough we use one server and one client, two nodes in total.

Start the Consul agent on node 1 with the following options.

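-server
Flag that runs the agent in server mode
-bootstrap-expect=1
Number of Consul servers expected to join
Bootstrapping a Datacenter - Consul by HashiCorp
-node=agent-one
Name of the node within the cluster (must be unique)
Defaults to the hostname; set explicitly here for the walkthrough
-bind=172.20.20.10
Address the agent binds to for internal cluster communication
Configuration - Consul by HashiCorp
-enable-script-checks=true
Flag that allows external scripts to be run as health checks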

Since I want to test this with Docker, I bring up both nodes with docker-compose.

Here is the docker-compose.yml I used.

version: "3"

services:
  n1:
    image: consul
    container_name: consul_n1
    environment:
      - CONSUL_BIND_ADDRESS=172.20.20.10
    command: agent -server -bootstrap-expect=1 -node=agent-one -enable-script-checks=true
    networks:
      consul:
        ipv4_address: 172.20.20.10
  n2:
    image: consul
    container_name: consul_n2
    environment:
      - CONSUL_BIND_ADDRESS=172.20.20.11
    command: agent -node=agent-two -enable-script-checks=true
    volumes:
      - "./consul.d:/consul/config"
    depends_on:
      - n1
    networks:
      consul:
        ipv4_address: 172.20.20.11

networks:
  consul:
    driver: bridge
    ipam:
      driver: default
      config:
        - subnet: 172.20.20.0/24

Start the environment.

git clone https://github.com/kmd2kmd/consul_getting_started_chapter5_6.git
cd consul_getting_started_chapter5_6
docker-compose up -d

Enter the server container and join the client to the cluster. If agent-two shows up in the output of consul members, it worked.

docker-compose exec n1 sh

# Confirm the agent started with the intended options
ps | grep consul
#    6 consul     5:14 consul agent -data-dir=/consul/data -config-dir=/consul/config -server -bootstrap-expect=1 -node=agent-one -enable-script-checks=true

# Check the initial state
consul members
# Node       Address            Status  Type    Build  Protocol  DC   Segment
# agent-one  172.20.20.10:8301  alive   server  1.4.4  2         dc1  <all>

# Add a member
consul join 172.20.20.11

consul members
# Node       Address            Status  Type    Build  Protocol  DC   Segment
# agent-one  172.20.20.10:8301  alive   server  1.4.4  2         dc1  <all>
# agent-two  172.20.20.11:8301  alive   client  1.4.4  2         dc1  <default>

# Stay in the container and continue from here

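Here I joined the client manually from the server side, but a Consul agent can also join automatically at startup. When running on AWS, it can discover the EC2 instances of the Consul servers by tag and join them, which supports Auto Scaling of the Consul servers (see Cloud Auto-join - Consul by HashiCorp).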
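As a rough sketch of the automatic join mentioned above, the Python snippet below generates an agent configuration that uses Consul's retry_join setting, so the agent keeps trying to join at startup instead of waiting for a manual consul join. The helper name build_join_config and the AWS tag key and value are assumptions made up for illustration; they are not part of the repository used in this article.

```python
import json


def build_join_config(targets):
    """Return an agent config dict that retries joining the given targets."""
    # retry_join accepts plain addresses as well as cloud auto-join strings.
    return {"retry_join": list(targets)}


# Static address of agent-one, taken from the docker-compose.yml in this article.
static_cfg = build_join_config(["172.20.20.10"])

# Cloud auto-join on AWS. The tag key and value here are assumptions.
aws_cfg = build_join_config(["provider=aws tag_key=consul tag_value=server"])

print(json.dumps(static_cfg, indent=4))
print(json.dumps(aws_cfg, indent=4))
```

Saving the first output as a JSON file under ./consul.d (mounted at /consul/config on n2) should make agent-two try to join agent-one on its own, although I have not verified that with this setup.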

Health checks

Node 2 (agent-two) loads the following two definition files at startup. ping.json configures a health check for the node as a whole. It checks whether a ping to Google succeeds, that is, whether the node can reach the internet.

{
    "check": {
        "name": "ping",
        "args": [
            "ping",
            "-c1",
            "google.com"
        ],
        "interval": "30s"
    }
}

web.json configures a health check for a service. It uses cURL to check whether TCP port 80 on the node itself (localhost) is reachable. Node 2 is not running a web server, so the service health check is expected to fail.

{
    "service": {
        "name": "web",
        "tags": [
            "rails"
        ],
        "port": 80,
        "check": {
            "args": [
                "curl",
                "localhost"
            ],
            "interval": "10s"
        }
    }
}
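Both definitions above are script checks: Consul runs the command in args at each interval and derives the check status from the exit code (0 is passing, 1 is warning, anything else is critical). The following is a minimal sketch of that mapping, not Consul's actual implementation. run_script_check is a hypothetical helper, and treating a timeout or a missing command as critical is an assumption of this sketch.

```python
import subprocess


def status_from_exit_code(code):
    """Map a script check's exit code to a Consul check status."""
    if code == 0:
        return "passing"
    if code == 1:
        return "warning"
    return "critical"


def run_script_check(args, timeout=30):
    """Run a check command (hypothetical helper) and return (status, output)."""
    try:
        proc = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Assumption of this sketch: a command that cannot run counts as critical.
        return "critical", str(exc)
    return status_from_exit_code(proc.returncode), proc.stdout + proc.stderr


# curl exits with code 7 when the connection is refused.
print(status_from_exit_code(7))
```

With nothing listening on port 80, curl localhost fails with a non-zero exit code other than 1, which is why the web check is expected to end up critical rather than warning.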

Let's look at the checks that are failing. The output shows that the check for service web is failing on node 2 (agent-two).

curl -s  http://localhost:8500/v1/health/state/critical | jq
[
  {
    "Node": "agent-two",
    "CheckID": "service:web",
    "Name": "Service 'web' check",
    "Status": "critical",
    "Notes": "",
    "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 80: Connection refused\n",
    "ServiceID": "web",
    "ServiceName": "web",
    "ServiceTags": [
      "rails"
    ],
    "Definition": {},
    "CreateIndex": 23,
    "ModifyIndex": 23
  }
]
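The raw response is a little hard to read because of curl's progress meter in Output. As a small sketch, the response can be reduced to one line per failing check. SAMPLE below is an abbreviated copy of the response above, and summarize is a hypothetical helper name.

```python
import json

# Abbreviated sample shaped like the /v1/health/state/critical response above.
SAMPLE = """
[
  {
    "Node": "agent-two",
    "CheckID": "service:web",
    "Name": "Service 'web' check",
    "Status": "critical",
    "Output": "curl: (7) Failed to connect to localhost port 80: Connection refused\\n",
    "ServiceID": "web",
    "ServiceName": "web"
  }
]
"""


def summarize(checks):
    """Return one line per check: node, check ID, and the last line of output."""
    lines = []
    for check in checks:
        output = check.get("Output", "").strip().splitlines()
        last = output[-1] if output else ""
        lines.append(f"{check['Node']} {check['CheckID']}: {last}")
    return lines


for line in summarize(json.loads(SAMPLE)):
    print(line)
```

Taking only the last line of Output is a design choice for this example: with curl, the actual error message comes after the progress meter.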

Because the health check is failing, trying to resolve the address of service web with dig returns no answer.

apk add bind-tools
dig +short @127.0.0.1 -p 8600 web.service.consul

However, as a node it is passing its health check (the ping to Google), so the node name can be resolved.

dig +short @127.0.0.1 -p 8600 agent-two.node.consul
172.20.20.11
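The empty answer for web.service.consul can be pictured as a filter: a service instance is left out of the DNS response when one of its own checks, or a node-level check on its node, is critical. The sketch below models only that rule with plain dictionaries. The field names and the healthy_addresses helper are simplifications I made up for illustration, not Consul's data model.

```python
def healthy_addresses(instances, checks, service):
    """Return addresses of `service` instances that have no critical checks."""
    results = []
    for inst in instances:
        if inst["service"] != service:
            continue
        # Node-level checks (service == "") and this service's own checks apply.
        relevant = [
            c for c in checks
            if c["node"] == inst["node"] and c["service"] in ("", service)
        ]
        if any(c["status"] == "critical" for c in relevant):
            continue
        results.append(inst["address"])
    return results


instances = [{"node": "agent-two", "service": "web", "address": "172.20.20.11"}]
failing = [
    {"node": "agent-two", "service": "", "status": "passing"},      # ping check
    {"node": "agent-two", "service": "web", "status": "critical"},  # curl check
]
recovered = [dict(c, status="passing") for c in failing]

print(healthy_addresses(instances, failing, "web"))    # []
print(healthy_addresses(instances, recovered, "web"))  # ['172.20.20.11']
```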

Now let's bring service web back. Leave node 1 with exit, then enter the node 2 container and start a dummy web server with socat.

docker-compose exec n2 sh

apk add bind-tools socat
socat -v -T0.05 tcp-l:80,reuseaddr,fork system:"echo 'HTTP/1.1 200 OK'; echo 'Connection: close'; echo; cat" > /dev/null 2>&1 &

Checking the overall health status again shows no failures, and name resolution for service web works again.

curl -s  http://localhost:8500/v1/health/state/critical | jq
# []
dig +short @127.0.0.1 -p 8600 web.service.consul
# 172.20.20.11

Closing thoughts

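I was able to confirm the configuration format and the behavior of health checks. This time I used ping for the node health check, but several other check types are available, such as HTTP checks and TCP checks (see Check Definition - Consul by HashiCorp). Running multiple health checks is easy, so this looks handy when, for example, you run several services on a single server!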
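For reference, here is a small sketch that generates HTTP and TCP check definitions in the same JSON shape as ping.json. The http, tcp, interval, and timeout fields are Consul check definition fields; the helper names, check names, and the 1s timeout are assumptions for illustration, and I did not run these definitions in this walkthrough.

```python
import json


def http_check(name, url, interval="10s", timeout="1s"):
    """Build a check definition that sends an HTTP GET to `url`."""
    return {"check": {"name": name, "http": url,
                      "interval": interval, "timeout": timeout}}


def tcp_check(name, address, interval="10s", timeout="1s"):
    """Build a check definition that opens a TCP connection to `address`."""
    return {"check": {"name": name, "tcp": address,
                      "interval": interval, "timeout": timeout}}


web_http = http_check("web-http", "http://localhost:80/")
web_tcp = tcp_check("web-tcp", "localhost:80")

print(json.dumps(web_http, indent=4))
print(json.dumps(web_tcp, indent=4))
```

Unlike the script checks used in this article, these check types are performed by the agent itself, so they would not need curl to be installed in the container.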
