You can now specify thresholds and failure count methods with the Amazon ECS deployment circuit breaker
Introduction
On July 1, 2026, new customization parameters were added to the Amazon ECS deployment circuit breaker.
Previously, the circuit breaker offered only two switches, enable/disable for the breaker itself and enable/disable for rollback, with no way to change the threshold or the failure counting method. This update enables fine-grained control tailored to your environment and requirements.
Comparison with Previous Behavior
| Item | Before | This Update |
|---|---|---|
| Circuit breaker enable/disable | ✅ | ✅ |
| Rollback enable/disable | ✅ | ✅ |
| Failure count model | consecutive (fixed) | consecutive / cumulative (selectable) |
| Threshold type | BOUNDED_PERCENT(50) (fixed) | BOUNDED_PERCENT / UNBOUNDED_PERCENT / COUNT |
| Threshold value | Not configurable | Can specify value according to type |
New Parameters
The two parameters added in this update are as follows.
resetOnHealthyTask (failure count model)
- true (default): consecutive model. The failure counter is reset when a task is determined to be healthy.
- false: cumulative model. The failure counter is not reset, and the threshold is evaluated based on the cumulative count since the start of the deployment.
thresholdConfiguration (threshold settings)
- type: Specifies the threshold type.
  - BOUNDED_PERCENT (default): Percentage relative to desiredCount. Adjusted to a range with a lower limit of 3 and an upper limit of 200.
  - UNBOUNDED_PERCENT: Percentage with no upper or lower limits.
  - COUNT: Fixed number of failures.
- value: The threshold value. Default is 50 (meaning 50% when using BOUNDED_PERCENT).
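The three threshold types can be sketched as a small calculation. This is only an illustration of the description above, not the ECS implementation: the function name `effective_threshold` is hypothetical, rounding the percentage up to a whole task is an assumption, and so is treating UNBOUNDED_PERCENT as the same percentage without the 3/200 bounds.

```python
import math

# Bounds quoted above for BOUNDED_PERCENT.
BOUNDED_MIN = 3
BOUNDED_MAX = 200


def effective_threshold(threshold_type: str, value: int, desired_count: int) -> int:
    """Return the failure count at which the breaker would trip (illustrative only)."""
    if threshold_type == "COUNT":
        # The value is used directly as a fixed number of failures.
        return value
    # Assumption: a percentage of desiredCount is rounded up to a whole task.
    raw = math.ceil(desired_count * value / 100)
    if threshold_type == "UNBOUNDED_PERCENT":
        # Assumption: same percentage, but without the lower/upper bounds.
        return raw
    if threshold_type == "BOUNDED_PERCENT":
        # Clamp the result into the [3, 200] range.
        return max(BOUNDED_MIN, min(BOUNDED_MAX, raw))
    raise ValueError(f"unknown threshold type: {threshold_type}")
```

Under these assumptions, `effective_threshold("BOUNDED_PERCENT", 50, 2)` returns 3 because the lower limit applies, while `effective_threshold("COUNT", 2, 2)` returns 2.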
Verification Environment
Verification was performed in the following environment.
- Region: ap-northeast-1
- ECS Cluster: Newly created
- Launch type: FARGATE
- Task definition: Container that immediately exits with exit 1 (for intentional failure)
- desiredCount: 2
The following Dockerfile was used as a dummy failing application.
FROM alpine:3.20
CMD ["sh", "-c", "echo 'Intentional failure for circuit breaker test' && exit 1"]
Verification Results
Circuit breaker activation was confirmed with 3 different configuration patterns.
Behavior with Default Settings
A service was created with only deploymentCircuitBreaker={enable=true,rollback=true}, without specifying the new parameters.
aws ecs create-service \
--cluster cb-test-cluster \
--service-name cb-test-default \
--task-definition cb-test-fail:1 \
--desired-count 2 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-xxxxxxxxxxxxxxxxx,subnet-yyyyyyyyyyyyyyyyy],securityGroups=[sg-xxxxxxxxxxxxxxxxx],assignPublicIp=ENABLED}" \
--deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true}"
The following is the output of describe-services.
{
  "deploymentCircuitBreaker": {
    "enable": true,
    "rollback": true,
    "resetOnHealthyTask": true,
    "thresholdConfiguration": {
      "type": "BOUNDED_PERCENT",
      "value": 50
    }
  }
}
After approximately 5 minutes, the circuit breaker was triggered.
{
  "rolloutState": "FAILED",
  "failedTasks": 4,
  "desiredCount": 2,
  "rolloutStateReason": "ECS deployment circuit breaker: tasks failed to start."
}
With desiredCount=2, 50% of desiredCount is 1. According to the documentation, a BOUNDED_PERCENT threshold is adjusted to a range with a lower limit of 3 and an upper limit of 200, so under these conditions the lower limit of 3 is applied as the effective threshold, and the behavior observed in this verification was consistent with that. The circuit breaker is triggered when the internal failure count reaches the threshold. Note that the failedTasks value returned by describe-services (4 in this case) depends on when it is retrieved and may not match the threshold.
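To observe when the breaker fires, describe-services can be polled until rolloutState leaves IN_PROGRESS. The following is a minimal sketch rather than the procedure used in this verification. The function name `wait_for_rollout` and its parameters are hypothetical, and it assumes a freshly created service with a single deployment, as in the tests here. The client is expected to behave like a boto3 ECS client (`describe_services` with `cluster` and `services`).

```python
import time


def wait_for_rollout(ecs_client, cluster, service, poll_seconds=15.0, timeout_seconds=900.0):
    """Poll describe_services until the deployment reports FAILED or COMPLETED."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        response = ecs_client.describe_services(cluster=cluster, services=[service])
        # Assumption: a newly created service has exactly one deployment.
        deployment = response["services"][0]["deployments"][0]
        if deployment.get("rolloutState") in ("FAILED", "COMPLETED"):
            return {
                "rolloutState": deployment["rolloutState"],
                # failedTasks is a snapshot at retrieval time, not the threshold.
                "failedTasks": deployment.get("failedTasks", 0),
                "rolloutStateReason": deployment.get("rolloutStateReason", ""),
            }
        time.sleep(poll_seconds)
    raise TimeoutError(f"rollout of {service} did not finish within {timeout_seconds}s")
```

With boto3 installed, the client could be created with `boto3.client("ecs", region_name="ap-northeast-1")`. Because failedTasks is read at whatever moment the poll happens to run, a longer poll interval makes it more likely that the reported count exceeds the threshold.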
COUNT + cumulative (Fast Rollback)
The threshold was set to a fixed count of 2, with the failure count model set to cumulative.
aws ecs create-service \
--cluster cb-test-cluster \
--service-name cb-test-count-cumulative \
--task-definition cb-test-fail:1 \
--desired-count 2 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-xxxxxxxxxxxxxxxxx,subnet-yyyyyyyyyyyyyyyyy],securityGroups=[sg-xxxxxxxxxxxxxxxxx],assignPublicIp=ENABLED}" \
--deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true,resetOnHealthyTask=false,thresholdConfiguration={type=COUNT,value=2}}"
After approximately 4 minutes, the circuit breaker was triggered.
{
  "rolloutState": "FAILED",
  "failedTasks": 3,
  "desiredCount": 2,
  "rolloutStateReason": "ECS deployment circuit breaker: tasks failed to start."
}
With COUNT=2, the specified value is used directly as the threshold. The circuit breaker is triggered when the failure count reaches the threshold (the displayed failedTasks value may not match the threshold depending on retrieval timing).
COUNT + consecutive (For Comparison with Default)
Using the same threshold COUNT=2, the failure count model was set to consecutive (default).
aws ecs create-service \
--cluster cb-test-cluster \
--service-name cb-test-count-consecutive \
--task-definition cb-test-fail:1 \
--desired-count 2 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-xxxxxxxxxxxxxxxxx,subnet-yyyyyyyyyyyyyyyyy],securityGroups=[sg-xxxxxxxxxxxxxxxxx],assignPublicIp=ENABLED}" \
--deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true,resetOnHealthyTask=true,thresholdConfiguration={type=COUNT,value=2}}"
After approximately 3–4 minutes, the circuit breaker was triggered.
{
  "rolloutState": "FAILED",
  "failedTasks": 2,
  "desiredCount": 2,
  "rolloutStateReason": "ECS deployment circuit breaker: tasks failed to start."
}
Summary of Verification Results
In this verification, all tasks failed immediately, meaning no tasks were determined to be healthy and no counter reset was triggered. As a result, no behavioral difference appeared between consecutive and cumulative. The true difference becomes apparent in mixed cases where "some tasks succeed and some fail."
- consecutive (resetOnHealthyTask=true): Since the counter is reset when a task is determined to be healthy, it is harder to reach the threshold in cases where successes and failures are mixed.
- cumulative (resetOnHealthyTask=false): Since the counter is not reset, even sporadic failures will trigger a rollback once they accumulate to the threshold.
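The difference between the two models in a mixed case can be illustrated with a small simulation. This is a simplified sketch of the counting rule described above, not the actual ECS logic. The function name `trips_at` and the sample outcome sequences are made up for illustration, and tasks are assumed to be evaluated one at a time in order.

```python
def trips_at(outcomes, threshold, reset_on_healthy_task):
    """Return the 1-based position of the task at which the breaker trips, or None."""
    failures = 0
    for position, healthy in enumerate(outcomes, start=1):
        if healthy:
            if reset_on_healthy_task:
                # consecutive model: a healthy task clears the counter.
                failures = 0
        else:
            failures += 1
            if failures >= threshold:
                return position
    return None


# True = task became healthy, False = task failed to start (hypothetical data).
mixed = [False, True, False, True, False]
all_failed = [False, False, False]

print(trips_at(mixed, 2, reset_on_healthy_task=True))    # consecutive
print(trips_at(mixed, 2, reset_on_healthy_task=False))   # cumulative
print(trips_at(all_failed, 2, reset_on_healthy_task=True))
print(trips_at(all_failed, 2, reset_on_healthy_task=False))
```

With COUNT=2, the alternating sequence never trips the consecutive model but trips the cumulative model at the third task. When every task fails, both models trip at the second task, which matches the lack of difference seen in this verification.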
| Configuration | Effective Threshold | failedTasks at FAILED | Time Required |
|---|---|---|---|
| Default (BOUNDED_PERCENT 50, consecutive) | 3 (lower limit applied) | 4 | Approx. 5 minutes |
| COUNT=2, cumulative | 2 | 3 | Approx. 4 minutes |
| COUNT=2, consecutive | 2 | 2 | Approx. 3–4 minutes |
The failedTasks value depends on the timing of the circuit breaker's internal evaluation and the timing of the describe-services retrieval, so it does not necessarily match the effective threshold.
Configuration in CloudFormation
The new parameters can also be specified in CloudFormation. Add them under DeploymentConfiguration.DeploymentCircuitBreaker in AWS::ECS::Service.
Resources:
  ECSService:
    Type: AWS::ECS::Service
    Properties:
      ServiceName: my-service
      Cluster: my-cluster
      TaskDefinition: !Ref TaskDefinition
      DesiredCount: 2
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          Subnets:
            - !Ref SubnetA
            - !Ref SubnetB
          SecurityGroups:
            - !Ref SecurityGroup
      DeploymentConfiguration:
        DeploymentCircuitBreaker:
          Enable: true
          Rollback: true
          ResetOnHealthyTask: false
          ThresholdConfiguration:
            Type: COUNT
            Value: 2
When a stack was actually created with this template, the deployment circuit breaker was triggered after the ECS service creation process began. As a result, the CloudFormation stack was rolled back (this is the expected behavior since a failing task definition was used).
{
  "deploymentCircuitBreaker": {
    "enable": true,
    "rollback": true,
    "resetOnHealthyTask": false,
    "thresholdConfiguration": {
      "type": "COUNT",
      "value": 2
    }
  }
}
The following is an excerpt of the stack events. The circuit breaker was triggered approximately 2 minutes after service creation, confirming that the custom settings were applied in the same way as during CLI verification.
| Timestamp (UTC) | Resource | Status |
|---|---|---|
| 01:11:59 | ECSService | CREATE_IN_PROGRESS (Resource creation Initiated) |
| 01:14:00 | ECSService | CREATE_FAILED (ECS Deployment Circuit Breaker was triggered) |
| 01:14:56 | cb-test-circuit-breaker | ROLLBACK_COMPLETE |
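As a quick check of the timing, the interval between events can be computed from the timestamps in the table. This is a small sketch with a hypothetical helper, `elapsed_seconds`, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# CREATE_IN_PROGRESS -> CREATE_FAILED
print(elapsed_seconds("01:11:59", "01:14:00"))  # 121
# CREATE_FAILED -> ROLLBACK_COMPLETE
print(elapsed_seconds("01:14:00", "01:14:56"))  # 56
```

The first interval is 121 seconds, which is the "approximately 2 minutes" mentioned above.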
The naming convention correspondence is as follows.
| CLI Parameter | CloudFormation Property |
|---|---|
| resetOnHealthyTask | ResetOnHealthyTask |
| thresholdConfiguration.type | ThresholdConfiguration.Type |
| thresholdConfiguration.value | ThresholdConfiguration.Value |
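For the properties in this table, the mapping amounts to capitalizing the first letter of each key. A minimal sketch of that conversion follows. The helper `to_cloudformation_keys` is hypothetical, and the rule is only assumed to hold for the circuit breaker settings shown here, not for every ECS property.

```python
def to_cloudformation_keys(value):
    """Recursively capitalize the first letter of every dict key."""
    if isinstance(value, dict):
        return {
            key[:1].upper() + key[1:]: to_cloudformation_keys(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [to_cloudformation_keys(item) for item in value]
    return value


cli_settings = {
    "enable": True,
    "rollback": True,
    "resetOnHealthyTask": False,
    "thresholdConfiguration": {"type": "COUNT", "value": 2},
}
print(to_cloudformation_keys(cli_settings))
```

The result uses the keys Enable, Rollback, ResetOnHealthyTask, and ThresholdConfiguration with Type and Value, matching the template shown earlier.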
Conclusion
The ECS deployment circuit breaker now offers a choice of threshold types and failure count models. It is now possible to specify a fixed-count threshold using COUNT and to evaluate based on cumulative failure counts using the cumulative model, allowing finer adjustment of deployment failure detection conditions.
This verification confirmed that the default configuration operates equivalently to the previous BOUNDED_PERCENT 50 / consecutive behavior, and that COUNT=2 is treated as a fixed-count threshold. In cases where all tasks fail immediately, there are no tasks determined to be healthy, so no clear difference appeared between consecutive and cumulative.
When the new parameters are not explicitly specified, existing services are evaluated in the same way as before. When different behavior is needed, the threshold and failure count model can now be adjusted to the use case, for example to detect failures early or to tolerate a certain number of temporary startup failures.
