I tried the new "Zone-Aware Routing" feature of ECS Service Connect for prioritizing same-AZ service-to-service communication

Zone-Aware routing has been added to ECS Service Connect. In multi-AZ service-to-service communication, it prioritizes endpoints within the same AZ, which is expected to reduce cross-AZ data transfer costs and latency. I actually verified how it works in practice.

suzuki.ryo

2026.07.02


Introduction

On July 1, 2026, Zone-Aware routing functionality was added to ECS Service Connect.
https://aws.amazon.com/jp/about-aws/whats-new/2026/07/ecs-service-connect-zone-aware/
This feature reduces cross-AZ data transfer costs and latency by preferentially routing requests to endpoints within the same AZ for communication between ECS services in a multi-AZ configuration.
Until now, Service Connect distributed traffic evenly across all AZs using round-robin, so cross-AZ communication was unavoidable in multi-AZ configurations. Multi-AZ configurations are necessary for availability, but data transfer between AZs incurs additional charges and adds latency.
With this update, same-AZ priority routing is enabled by default.


| Item | Traditional Service Connect | After Zone-Aware Support |
| --- | --- | --- |
| Routing method | Round-robin (even distribution across all AZs) | Same-AZ priority |
| Cross-AZ data transfer | Inevitably occurs in multi-AZ configurations | Reduced due to priority routing within the same AZ |
| Availability | Automatic failover on AZ failure | Same (automatic redistribution to cross-AZ when capacity is insufficient) |
| Configuration change | - | Not required (enabled by default) |
| Application to existing services | - | Enabled with one redeployment |
| Additional charges | - | None |

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-connect-zone-aware-routing.html

How It Works

Zone-Aware routing uses the zone-aware routing feature of the Service Connect proxy (Envoy sidecar).
The flow of operation is as follows.
1. Endpoint discovery: The proxy identifies all endpoints of the destination service and their AZ placement
2. Same-AZ priority: Requests are preferentially routed to endpoints in the same AZ as the request source
3. Residual capacity routing: Traffic that cannot be accommodated within the same AZ is distributed based on the residual capacity of other AZs
4. Fallback: If endpoints in the same AZ are unhealthy or insufficient, routing automatically switches to another AZ
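
The flow above can be modelled in a few lines of Python. This is only a simplified sketch, loosely following how zone aware routing is described in the upstream Envoy documentation; it is not the Service Connect proxy's actual code. The function name `zone_traffic_split` and the two input dictionaries are hypothetical, and the rule that the proxy compares the local AZ's share of clients with its share of endpoints is an assumption taken from Envoy's description rather than from the ECS documentation. Health checks and the activation threshold are ignored here.

```python
def zone_traffic_split(local_zone, clients_per_zone, endpoints_per_zone):
    """Return the fraction of one client's traffic sent to each zone.

    Simplified model (assumption): compare the local zone's share of clients
    with its share of destination endpoints, as Envoy's docs describe.
    """
    total_clients = sum(clients_per_zone.values())
    total_endpoints = sum(endpoints_per_zone.values())
    if clients_per_zone.get(local_zone, 0) == 0 or total_endpoints == 0:
        raise ValueError("need a client in the local zone and at least one endpoint")

    # Step 1: know every zone and how clients/endpoints are spread over them.
    zones = sorted(set(clients_per_zone) | set(endpoints_per_zone))
    client_share = {z: clients_per_zone.get(z, 0) / total_clients for z in zones}
    endpoint_share = {z: endpoints_per_zone.get(z, 0) / total_endpoints for z in zones}

    # Step 2: the local zone has enough endpoints, so keep all traffic local.
    if endpoint_share[local_zone] >= client_share[local_zone]:
        return {z: 1.0 if z == local_zone else 0.0 for z in zones}

    # Steps 3 and 4: keep what the local zone can absorb and spread the rest
    # over the other zones in proportion to their residual capacity.
    local_fraction = endpoint_share[local_zone] / client_share[local_zone]
    residual = {
        z: max(0.0, endpoint_share[z] - client_share[z])
        for z in zones
        if z != local_zone
    }
    total_residual = sum(residual.values())
    split = {local_zone: local_fraction}
    for zone, capacity in residual.items():
        split[zone] = (1.0 - local_fraction) * capacity / total_residual
    return split


if __name__ == "__main__":
    clients = {"ap-northeast-1a": 1, "ap-northeast-1c": 1}
    # Evenly distributed endpoints: everything stays in the local zone.
    print(zone_traffic_split("ap-northeast-1a", clients,
                             {"ap-northeast-1a": 2, "ap-northeast-1c": 2}))
    # No endpoints left in the local zone: everything goes to the other zone.
    print(zone_traffic_split("ap-northeast-1a", clients,
                             {"ap-northeast-1a": 0, "ap-northeast-1c": 2}))
```

In this model, a balanced layout (the same share of clients and endpoints in each AZ) keeps all traffic local, and traffic only spills to other AZs when the local AZ is under-provisioned relative to its clients.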

Threshold Conditions for Activation

For Zone-Aware routing to take effect, the destination service must have at least 2 × (number of AZs) endpoints.
- 2-AZ configuration: minimum 4 endpoints
- 3-AZ configuration: minimum 6 endpoints
If the threshold is not met, the proxy falls back to normal load balancing that does not consider AZs, and Zone-Aware routing is automatically re-enabled once the number of endpoints reaches the threshold again. This mechanism prevents a single AZ from being overloaded.
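
As a quick way to check a planned task count against this rule, the condition can be written as a one-line comparison. This is a minimal sketch of the threshold as stated above; the function name `zone_aware_eligible` is hypothetical and says nothing about how the proxy evaluates the condition internally.

```python
def zone_aware_eligible(endpoint_count: int, az_count: int) -> bool:
    """Return True when the endpoint count meets the 2 x AZ-count threshold."""
    if az_count <= 0:
        raise ValueError("az_count must be positive")
    return endpoint_count >= 2 * az_count


if __name__ == "__main__":
    # 2-AZ configuration: 4 endpoints meet the threshold, 2 do not.
    print(zone_aware_eligible(4, 2))  # True
    print(zone_aware_eligible(2, 2))  # False
    # 3-AZ configuration: the minimum is 6 endpoints.
    print(zone_aware_eligible(5, 3))  # False
```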

Comparison with Similar Concepts

The approach of prioritizing the same AZ to reduce cross-AZ transfers also exists in other AWS services.


| Mechanism | Target communication | How same-AZ priority is achieved |
| --- | --- | --- |
| Regional NAT Gateway | Outbound communication | AZ affinity via workload detection (dynamic) |
| Service Connect Zone-Aware | Communication between ECS services | Envoy proxy weighting (dynamic) |
| ALB cross-zone disabled | Client → target | ALB node distribution settings |

https://dev.classmethod.jp/articles/aws-nat-gateway-regional-availability/

Verification

We confirmed the behavior of Zone-Aware routing on an actual ECS cluster with a 2-AZ configuration.

Verification Configuration

ECS cluster (Service Connect namespace: "test-sc-zone-aware")
├── Service A (client role, 2 tasks)
│   ├── AZ-a: 1 task (ECS Exec enabled)
│   └── AZ-c: 1 task (ECS Exec enabled)
└── Service B (server role, 4 tasks)
    ├── AZ-a: 2 tasks
    └── AZ-c: 2 tasks
Service B is a simple HTTP server that includes its own task's AZ information in the response. Requests are sent from Service A via Service Connect, and the AZ information in the response is used to verify which AZ's task the request was routed to. Service B has 4 tasks to meet the threshold condition (2 × number of AZs).
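
Service B Application

import os

import requests
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/")
def az():
    # ECS injects the task metadata endpoint (version 4) URI as this environment variable.
    meta_uri = os.environ.get("ECS_CONTAINER_METADATA_URI_V4", "")
    if meta_uri:
        # The /task document includes the Availability Zone the task runs in.
        task_meta = requests.get(meta_uri + "/task", timeout=2).json()
        return jsonify({"az": task_meta.get("AvailabilityZone", "unknown")})
    # Outside ECS (for example, a local run) the variable is not set.
    return jsonify({"az": "no-metadata-uri"})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)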

Service Connect Configuration

Specify name and appProtocol in the port mapping in Service B's task definition, and configure the server-side Service Connect settings when creating the service.
{
  "portMappings": [
    {
      "containerPort": 8080,
      "protocol": "tcp",
      "name": "http",
      "appProtocol": "http"
    }
  ]
}
Service Connect configuration when creating the service:
{
  "enabled": true,
  "namespace": "test-sc-zone-aware",
  "services": [
    {
      "portName": "http",
      "discoveryName": "service-b",
      "clientAliases": [
        {
          "port": 8080,
          "dnsName": "service-b"
        }
      ]
    }
  ]
}
Service A (client side) is configured in client mode only:
{
  "enabled": true,
  "namespace": "test-sc-zone-aware"
}
No additional configuration is required for Zone-Aware routing. It is enabled by default.
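
If the services are created from a script rather than by hand, the two Service Connect JSON documents above can be generated from a few parameters. The following is a small sketch under that assumption; the helper names `server_service_connect_config` and `client_service_connect_config` are hypothetical, and only the keys already shown in the JSON above are used.

```python
import json


def server_service_connect_config(namespace, port_name, discovery_name, dns_name, port):
    """Build a server-side configuration (mirrors the Service B JSON above)."""
    return {
        "enabled": True,
        "namespace": namespace,
        "services": [
            {
                "portName": port_name,
                "discoveryName": discovery_name,
                "clientAliases": [{"port": port, "dnsName": dns_name}],
            }
        ],
    }


def client_service_connect_config(namespace):
    """Build a client-only configuration (mirrors the Service A JSON above)."""
    return {"enabled": True, "namespace": namespace}


if __name__ == "__main__":
    service_b = server_service_connect_config(
        "test-sc-zone-aware", "http", "service-b", "service-b", 8080
    )
    print(json.dumps(service_b, indent=2))
    print(json.dumps(client_service_connect_config("test-sc-zone-aware"), indent=2))
```

With boto3, a dictionary of this shape is what the `serviceConnectConfiguration` parameter of the ECS client's `create_service` call accepts. The call itself is left out here because it needs a real cluster, task definition, and network settings.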

Verification 1: Confirming Zone-Aware Routing Behavior

We connected to tasks in each AZ of Service A via ECS Exec and sent 20 requests to service-b via Service Connect.
# Request from AZ-a task
aws ecs execute-command --cluster test-sc-zone-aware --task <task-id> \
  --container app --interactive \
  --command 'sh -c "for i in $(seq 1 20); do curl -s http://service-b:8080/; echo; done"'
Executed from the AZ-a (ap-northeast-1a) task:
{"az":"ap-northeast-1a"}
{"az":"ap-northeast-1a"}
{"az":"ap-northeast-1a"}
...(all the same below)
20/20 (100%) were routed to endpoints in the same AZ (ap-northeast-1a).
Executed from the AZ-c (ap-northeast-1c) task:
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
...(all the same below)
Here too, 20/20 (100%) were routed to the same AZ (ap-northeast-1c). It was confirmed that when endpoints are evenly distributed, 100% are processed within the same AZ.

Verification 2: Confirming Fallback Behavior

We manually stopped both tasks on the AZ-a side of Service B and confirmed the behavior when requests were made from the AZ-a client.
# Stop service-b tasks in AZ-a
aws ecs stop-task --cluster test-sc-zone-aware --task <az-a-task-id-1> --reason "test fallback"
aws ecs stop-task --cluster test-sc-zone-aware --task <az-a-task-id-2> --reason "test fallback"
Requests from the Service A task in AZ-a (sent 20 seconds after stopping the tasks):
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
10/10 (100%) were routed to AZ-c. With the endpoints on the AZ-a side absent, all requests were sent to the healthy endpoints in AZ-c.
Note that ECS launches replacement tasks to maintain the desired count. Once healthy endpoints are present in the same AZ again and the conditions for Zone-Aware routing are met, routing returns to same-AZ priority (this is the fallback behavior described in the documentation).

Supplementary Note: Behavior Below the Threshold

For reference, the results of the same test conducted when Service B had 2 tasks (one in each AZ) are shown below.
{"az":"ap-northeast-1a"}
{"az":"ap-northeast-1a"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1c"}
{"az":"ap-northeast-1a"}
{"az":"ap-northeast-1c"}
...(14 executions)


| AZ | Count | Ratio |
| --- | --- | --- |
| ap-northeast-1a | 8 | 57% |
| ap-northeast-1c | 6 | 43% |

Traffic was distributed nearly evenly between AZ-a and AZ-c, confirming that same-AZ priority behavior was not in effect. As described in the documentation, when the number of endpoints is below the threshold (2 × number of AZs = 4), the conditions for Zone-Aware routing are not met, and it falls back to normal load balancing without AZ consideration.
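
The counts and ratios in the table above can be reproduced from the raw curl output with a short tally script. This is a minimal sketch: the function name `tally_az` is hypothetical, and the sample input is rebuilt from the counts in the table (8 and 6), so it does not preserve the order of the original 14 responses.

```python
import json
from collections import Counter


def tally_az(lines):
    """Count responses per AZ and return {az: (count, rounded percentage)}."""
    counts = Counter(json.loads(line)["az"] for line in lines if line.strip())
    total = sum(counts.values())
    return {az: (n, round(100 * n / total)) for az, n in counts.items()}


if __name__ == "__main__":
    # Rebuilt from the table counts; the original response order is not preserved.
    responses = ['{"az":"ap-northeast-1a"}'] * 8 + ['{"az":"ap-northeast-1c"}'] * 6
    for az, (count, percent) in sorted(tally_az(responses).items()):
        print(f"{az}\t{count}\t{percent}%")
```

In practice, the output of the curl loop can be saved to a file and its lines passed to the same function.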

Notes

A supplementary note on monitoring in Fargate environments.
The documentation describes how to check the status of Zone-Aware routing using Envoy statistics; the metrics used are lb_zone_routing_cross_zone and lb_zone_cluster_too_small. However, that procedure assumes connecting to an EC2 instance via SSM Session Manager and running docker exec.
In Fargate environments, direct access to the Service Connect agent container is not possible, so Envoy statistics could not be verified. As a means of checking cross-AZ communication trends in Fargate, using VPC Flow Logs (with the az-id field) is an option.
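
As a rough idea of what the VPC Flow Logs approach could look like, the sketch below sums bytes for same-AZ and cross-AZ traffic. It rests on several assumptions that were not part of this verification: a custom log format of `${az-id} ${flow-direction} ${srcaddr} ${dstaddr} ${bytes}`, a hand-maintained subnet-to-AZ-ID mapping (`SUBNET_AZ_IDS`, with made-up CIDRs and AZ IDs), and counting only egress records so that each flow is counted once. All names in the code are hypothetical.

```python
import ipaddress
from collections import defaultdict

# Hypothetical mapping of subnet CIDR -> AZ ID; replace with your own subnets.
SUBNET_AZ_IDS = {
    ipaddress.ip_network("10.0.1.0/24"): "apne1-az4",
    ipaddress.ip_network("10.0.2.0/24"): "apne1-az1",
}


def az_id_for(address):
    """Return the AZ ID of the subnet containing the address, or None."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return None  # Flow logs use "-" when a field has no value.
    for network, az_id in SUBNET_AZ_IDS.items():
        if ip in network:
            return az_id
    return None


def summarize_bytes(records):
    """Sum bytes of egress records, split into same-AZ and cross-AZ traffic."""
    totals = defaultdict(int)
    for line in records:
        fields = line.split()
        if len(fields) != 5:
            continue
        az_id, direction, _src, dst, byte_count = fields
        if direction != "egress" or not byte_count.isdigit():
            continue
        dst_az = az_id_for(dst)
        if dst_az is None:
            continue  # Destination is outside the known subnets.
        key = "same_az" if dst_az == az_id else "cross_az"
        totals[key] += int(byte_count)
    return dict(totals)


if __name__ == "__main__":
    sample = [
        "apne1-az4 egress 10.0.1.10 10.0.1.20 1200",
        "apne1-az4 egress 10.0.1.10 10.0.2.30 800",
        "apne1-az1 ingress 10.0.1.10 10.0.2.30 800",
    ]
    print(summarize_bytes(sample))
```

Comparing the cross-AZ total before and after redeploying a service would give a trend rather than an exact figure, since the flow logs also contain traffic unrelated to Service Connect.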

Summary

With Zone-Aware routing in ECS Service Connect, endpoints within the same AZ are now prioritized for inter-service communication in multi-AZ configurations. No additional configuration is required, and it is enabled by default for Service Connect services that meet the conditions.
In this verification configuration, we confirmed that 100% of traffic was routed to the same AZ when endpoints were evenly distributed across each AZ. Even when endpoints in the same AZ became unavailable, automatic fallback to healthy endpoints in another AZ was confirmed.
In environments using ECS Service Connect with multi-AZ, reductions in cross-AZ communication and decreases in data transfer costs and latency can be expected.

Reference Links

- Amazon ECS Service Connect zone-aware routing - Amazon ECS Developer Guide
- Announcing zone-aware routing in Amazon ECS Service Connect - AWS Containers Blog

