A brief comparison of health checks between ELB and Route 53

2014.08.29

This post is a partial translation of "ELBとRoute 53のヘルスチェック仕様の違い" ("Differences in health check specifications between ELB and Route 53").

Today, I'd like to briefly compare the health check functions of Elastic Load Balancing (ELB) and Route 53.

As you probably know, ELB uses health checks to determine which instances incoming traffic is forwarded to.
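
To make that decision concrete, here is a minimal sketch of how consecutive check results could drive an instance's state. It is not ELB's actual implementation: the class name `InstanceHealth`, the helper `forwardable`, and the assumption that an instance starts out healthy are all hypothetical, and the two thresholds correspond to the Healthy Threshold and UnHealthy Threshold settings listed in the cheat sheet.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class InstanceHealth:
    """Hypothetical model of one instance's state, driven by consecutive check results."""
    healthy_threshold: int = 2    # consecutive successes needed to become healthy again
    unhealthy_threshold: int = 2  # consecutive failures needed to become unhealthy
    healthy: bool = True          # assumption for this sketch: instances start healthy
    _streak: int = 0              # run of results that disagree with the current state

    def record(self, check_passed: bool) -> bool:
        """Feed in one check result and return the resulting state."""
        if check_passed == self.healthy:
            # The result agrees with the current state, so the opposing run is broken.
            self._streak = 0
        else:
            self._streak += 1
            limit = self.healthy_threshold if check_passed else self.unhealthy_threshold
            if self._streak >= limit:
                self.healthy = check_passed
                self._streak = 0
        return self.healthy


def forwardable(instances: Dict[str, InstanceHealth]) -> List[str]:
    """Names of the instances that would still receive traffic."""
    return sorted(name for name, state in instances.items() if state.healthy)
```

With both thresholds set to 2, a single failed check leaves the instance in service, and only the second consecutive failure removes it from `forwardable`.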

Route 53, on the other hand, uses them to decide which IP address is returned in answer to a DNS query, particularly when weighted round robin or DNS failover is enabled.
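
As a rough illustration, weighted selection among healthy records could look like the sketch below. This is not Route 53's real algorithm: `Record`, `candidates` and `answer` are hypothetical names, the addresses are documentation placeholders, and the fallback to all records when none is healthy is my reading of the note in the cheat sheet. The sketch also assumes a non-empty record list with positive weights.

```python
import random
from typing import List, NamedTuple, Optional


class Record(NamedTuple):
    ip: str
    weight: int    # assumed to be a positive integer
    healthy: bool  # result of the health check attached to this record


def candidates(records: List[Record]) -> List[Record]:
    """Records that are eligible to be returned in a DNS answer."""
    healthy = [r for r in records if r.healthy]
    # Assumption: when nothing is healthy, fall back to every record
    # instead of returning an empty answer.
    return healthy or list(records)


def answer(records: List[Record], rng: Optional[random.Random] = None) -> str:
    """Pick one IP address from the candidates, proportionally to weight."""
    rng = rng or random
    pool = candidates(records)
    chosen = rng.choices(pool, weights=[r.weight for r in pool], k=1)[0]
    return chosen.ip
```

Passing a seeded `random.Random` as `rng` makes the selection reproducible, which is handy when testing weight settings.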

Cheat sheet of health check specifications

Please take a look at the table below.

| Item | ELB | Route 53 |
| --- | --- | --- |
| Health checker | ELB itself | Health checkers spread around the world (each seems to belong to one of the subnets returned by the GetCheckerIpRanges API call) |
| How to access | IP address of the EC2 instance (internal IP address within the VPC) | Global IP address or domain name (selectable) |
| Check interval | 5 sec to 300 sec | 30 sec (default) or 10 sec (option with an additional fee) |
| Trial count | 2 to 10 (set separately as Healthy Threshold and UnHealthy Threshold) | 1 to 10 (set as Failure Threshold) |
| Timeout threshold (TCP) | 2 sec to 60 sec | 4 sec to establish the TCP connection |
| Timeout threshold (HTTP/HTTPS) | 2 sec to 60 sec | 6 sec (4 sec for the TCP connection and 2 sec for the HTTP/HTTPS response) |
| Timeout threshold (HTTP/HTTPS with string matching) | No string matching function | 8 sec (4 sec for the TCP connection, 2 sec for the HTTP/HTTPS response and 2 more sec to receive the HTTP response body) |
| String matching (HTTP/HTTPS) | No string matching function | The string must appear within the first 5 KB of the response body; maximum 255 characters |
| HTTP response code (HTTP/HTTPS) | 200 | 200 or greater and less than 400 |
| User-Agent (HTTP/HTTPS) | ELB-HealthChecker/1.0 | Amazon Route 53 Health Check Service; ref:<Health Check ID> |
| How success or failure is decided | Consecutive successes or failures reaching the trial count, with each check judged by the applicable criteria (timeout, string matching, response code) | Consecutive failures reaching the trial count, as reported by a portion of the health checkers |
| Minimum time to determine unhealthy | 1 minute (UnHealthy Threshold of 2 at a 30-second interval) | 10 seconds plus DNS cache expiration (Failure Threshold of 1 at a 10-second interval) |
| Monitoring | CloudWatch (AWS/ELB namespace): HealthyHostCount, UnHealthyHostCount; ELB API: DescribeInstanceHealth | CloudWatch (AWS/Route53 namespace): HealthCheckStatus, HealthCheckPercentageHealthy |
| How an unhealthy target is isolated | Traffic to the unhealthy instance is cut off | The IP address of the unhealthy target is left out of the DNS query response. Note: if all health checkers report unhealthy, the IP address is still returned |
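
The response code and string matching rows can be restated as two small predicates. This is only a sketch of the rules as written in the table, not code from either service: the function names are hypothetical, and "5KB" is assumed here to mean 5120 bytes.

```python
from typing import Optional

ROUTE53_SEARCH_WINDOW = 5 * 1024  # "first 5KB" of the body; 5120 bytes is an assumption
ROUTE53_MAX_SEARCH_STRING = 255   # maximum length of the search string


def elb_http_check_passes(status_code: int) -> bool:
    """ELB: only a 200 response counts as a success."""
    return status_code == 200


def route53_http_check_passes(
    status_code: int,
    body: bytes = b"",
    search_string: Optional[str] = None,
) -> bool:
    """Route 53: status must be >= 200 and < 400, plus an optional string match."""
    if not 200 <= status_code < 400:
        return False
    if search_string is None:
        return True  # plain HTTP/HTTPS check without string matching
    if len(search_string) > ROUTE53_MAX_SEARCH_STRING:
        raise ValueError("search string is limited to 255 characters")
    # Only the beginning of the response body is searched.
    return search_string.encode("utf-8") in body[:ROUTE53_SEARCH_WINDOW]
```

One practical consequence: a health check endpoint that answers with a redirect (for example 302) would pass under the Route 53 rule but fail under the ELB rule.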

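The minimum detection times in the table follow from multiplying the check interval by the failure threshold. The sketch below just reproduces that arithmetic; the function names are hypothetical, and the 60-second DNS TTL in the usage note is an example value, not something taken from either service.

```python
def min_detection_seconds(interval_sec: int, failure_threshold: int) -> int:
    """Time for `failure_threshold` consecutive failed checks, following the table's arithmetic."""
    if interval_sec <= 0 or failure_threshold <= 0:
        raise ValueError("interval and threshold must be positive")
    return interval_sec * failure_threshold


def client_visible_seconds(detection_sec: int, dns_ttl_sec: int) -> int:
    """For DNS-based failover, cached answers may live on for up to one TTL after detection."""
    return detection_sec + dns_ttl_sec


elb_seconds = min_detection_seconds(interval_sec=30, failure_threshold=2)      # 60
route53_seconds = min_detection_seconds(interval_sec=10, failure_threshold=1)  # 10
```

For instance, `client_visible_seconds(route53_seconds, 60)` gives 70 seconds, which shows why the record TTL matters as much as the check interval when failing over at the DNS level.
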
Reference:

Summary

I have taken a brief look at the differences in health checks between ELB and Route 53. Many of you probably already use the ELB health check. The Route 53 one seems effective for load balancing and high availability at the DNS level. We would like to describe it in more detail in future articles.