You can now specify thresholds and failure count methods with the Amazon ECS deployment circuit breaker

New parameters for specifying the threshold type and the failure count model have been added to the Amazon ECS deployment circuit breaker. I verified the behavior on Fargate services with the default values and with the COUNT threshold type.
2026.07.02

Introduction

On July 1, 2026, new customization parameters were added to the Amazon ECS deployment circuit breaker.

https://aws.amazon.com/jp/about-aws/whats-new/2026/07/amazon-ecs-circuit-breaker-settings/

Previously, the circuit breaker offered only two settings, whether to enable it and whether to roll back on failure, with no way to change the threshold or how failures are counted. This update enables finer control tailored to your environment and requirements.

Comparison with Previous Behavior

| Item | Before | This Update |
|---|---|---|
| Circuit breaker enable/disable | Configurable | Configurable |
| Rollback enable/disable | Configurable | Configurable |
| Failure count model | consecutive (fixed) | consecutive / cumulative (selectable) |
| Threshold type | BOUNDED_PERCENT(50) (fixed) | BOUNDED_PERCENT / UNBOUNDED_PERCENT / COUNT |
| Threshold value | Not configurable | Can be specified according to type |

New Parameters

The two parameters added in this update are as follows.

resetOnHealthyTask (failure count model)

  • true (default): consecutive model. The failure counter is reset when a task is determined to be healthy.
  • false: cumulative model. The failure counter is not reset, and the threshold is evaluated based on the cumulative count since the start of the deployment.
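
To illustrate how the two models can diverge, here is a minimal sketch in Python. It is a toy model of the counter as described above, not the actual ECS implementation; the function name trips_at and the sample outcome sequences are hypothetical.

```python
def trips_at(outcomes, threshold, reset_on_healthy_task):
    """Return the 1-based position at which the failure counter reaches
    the threshold, or None if it never does.

    outcomes: list of booleans, True = task judged healthy, False = task failed.
    """
    failures = 0
    for position, healthy in enumerate(outcomes, start=1):
        if healthy:
            if reset_on_healthy_task:
                failures = 0  # consecutive model: a healthy task clears the count
            continue
        failures += 1
        if failures >= threshold:
            return position
    return None


# Hypothetical sequences: failures interleaved with healthy tasks, and all failures.
mixed = [False, True, False, True, False, False]
all_fail = [False, False, False]

print(trips_at(mixed, 3, reset_on_healthy_task=True))   # None (never 3 in a row)
print(trips_at(mixed, 3, reset_on_healthy_task=False))  # 5 (third failure overall)
print(trips_at(all_fail, 3, True), trips_at(all_fail, 3, False))  # 3 3
```

When every task fails, both models trip at the same point; they differ only once healthy tasks are interleaved with failures.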

thresholdConfiguration (threshold settings)

  • type: Specifies the threshold type.
    • BOUNDED_PERCENT (default): Percentage relative to desiredCount. The resulting count is adjusted to stay within a lower limit of 3 and an upper limit of 200.
    • UNBOUNDED_PERCENT: Percentage with no upper or lower limits.
    • COUNT: Fixed number of failures.
  • value: The threshold value. Default is 50 (meaning 50% when using BOUNDED_PERCENT).
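
The three threshold types can be sketched as a small Python function. This is an illustration of the description above, not AWS code: the function name effective_threshold is hypothetical, and rounding the percentage up to a whole task is an assumption, since the rounding rule is not stated here.

```python
def effective_threshold(threshold_type, value, desired_count):
    """Return the failure count at which the circuit breaker would trip."""
    if threshold_type == "COUNT":
        return value  # fixed number of failures, used as-is

    # Assumption: the percentage of desiredCount is rounded up to a whole task.
    raw = -(-desired_count * value // 100)  # integer ceiling division

    if threshold_type == "UNBOUNDED_PERCENT":
        return raw  # no lower or upper limit applied
    if threshold_type == "BOUNDED_PERCENT":
        return max(3, min(200, raw))  # keep within the 3..200 range
    raise ValueError(f"unknown threshold type: {threshold_type}")


print(effective_threshold("BOUNDED_PERCENT", 50, 2))     # 3 (lower limit applied)
print(effective_threshold("BOUNDED_PERCENT", 50, 1000))  # 200 (upper limit applied)
print(effective_threshold("UNBOUNDED_PERCENT", 50, 2))   # 1
print(effective_threshold("COUNT", 2, 2))                # 2
```

For small services, BOUNDED_PERCENT is dominated by the lower limit of 3, which is why COUNT or UNBOUNDED_PERCENT is needed to trip earlier.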

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/deployment-circuit-breaker.html

Verification Environment

Verification was performed in the following environment.

  • Region: ap-northeast-1
  • ECS Cluster: Newly created
  • Launch type: FARGATE
  • Task definition: Container that exits immediately with exit code 1 (for intentional failure)
  • desiredCount: 2

The following Dockerfile was used as a dummy failing application.

FROM alpine:3.20
CMD ["sh", "-c", "echo 'Intentional failure for circuit breaker test' && exit 1"]

Verification Results

Circuit breaker activation was confirmed with 3 different configuration patterns.

Behavior with Default Settings

A service was created with only deploymentCircuitBreaker={enable=true,rollback=true}, without specifying the new parameters.

aws ecs create-service \
  --cluster cb-test-cluster \
  --service-name cb-test-default \
  --task-definition cb-test-fail:1 \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxxxxxxxxxxxxxxxx,subnet-yyyyyyyyyyyyyyyyy],securityGroups=[sg-xxxxxxxxxxxxxxxxx],assignPublicIp=ENABLED}" \
  --deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true}"

The following is an excerpt of the describe-services output.

{
    "deploymentCircuitBreaker": {
        "enable": true,
        "rollback": true,
        "resetOnHealthyTask": true,
        "thresholdConfiguration": {
            "type": "BOUNDED_PERCENT",
            "value": 50
        }
    }
}

After approximately 5 minutes, the circuit breaker was triggered.

{
    "rolloutState": "FAILED",
    "failedTasks": 4,
    "desiredCount": 2,
    "rolloutStateReason": "ECS deployment circuit breaker: tasks failed to start."
}

With desiredCount=2, BOUNDED_PERCENT 50 gives 50% of desiredCount, which is 1 task. According to the documentation, the result is then adjusted to a range with a lower limit of 3 and an upper limit of 200, so under these conditions the lower limit of 3 applies as the effective threshold. The circuit breaker was triggered once the internal failure count reached that threshold. Note that the failedTasks value returned by describe-services (4 in this case) depends on when it is retrieved and may not match the threshold.
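
Because failedTasks depends on when it is read, it can help to poll the deployment and log each reading. The following is a minimal sketch, not the procedure used in this verification. The function name watch_rollout, the polling interval, and the choice to watch the first deployment in the list (assumed to be the one under test, as for a newly created service) are all assumptions. It relies only on the boto3 ECS describe_services call and the rolloutState and failedTasks fields shown above.

```python
import time


def watch_rollout(ecs_client, cluster, service, poll_seconds=15,
                  max_polls=60, sleep=time.sleep):
    """Poll describe_services until the watched deployment finishes.

    Returns the deployment dict once rolloutState is FAILED or COMPLETED,
    or None if max_polls is reached first.
    """
    for _ in range(max_polls):
        response = ecs_client.describe_services(cluster=cluster, services=[service])
        # Assumption: the first deployment listed is the one being observed.
        deployment = response["services"][0]["deployments"][0]
        state = deployment.get("rolloutState")
        print(state, "failedTasks =", deployment.get("failedTasks"))
        if state in ("FAILED", "COMPLETED"):
            return deployment
        sleep(poll_seconds)
    return None


# Usage against a real account (requires boto3 and AWS credentials):
#   import boto3
#   ecs = boto3.client("ecs", region_name="ap-northeast-1")
#   watch_rollout(ecs, "cb-test-cluster", "cb-test-default")
```

Passing the client in as an argument keeps the function testable with a stub and avoids any AWS call at import time.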

COUNT + cumulative (Fast Rollback)

The threshold was set to a fixed count of 2, with the failure count model set to cumulative.

aws ecs create-service \
  --cluster cb-test-cluster \
  --service-name cb-test-count-cumulative \
  --task-definition cb-test-fail:1 \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxxxxxxxxxxxxxxxx,subnet-yyyyyyyyyyyyyyyyy],securityGroups=[sg-xxxxxxxxxxxxxxxxx],assignPublicIp=ENABLED}" \
  --deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true,resetOnHealthyTask=false,thresholdConfiguration={type=COUNT,value=2}}"

After approximately 4 minutes, the circuit breaker was triggered.

{
    "rolloutState": "FAILED",
    "failedTasks": 3,
    "desiredCount": 2,
    "rolloutStateReason": "ECS deployment circuit breaker: tasks failed to start."
}

With COUNT=2, the specified value is used directly as the threshold. The circuit breaker is triggered when the failure count reaches the threshold (the displayed failedTasks value may not match the threshold depending on retrieval timing).

COUNT + consecutive (For Comparison with Default)

Using the same threshold COUNT=2, the failure count model was set to consecutive (default).

aws ecs create-service \
  --cluster cb-test-cluster \
  --service-name cb-test-count-consecutive \
  --task-definition cb-test-fail:1 \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxxxxxxxxxxxxxxxx,subnet-yyyyyyyyyyyyyyyyy],securityGroups=[sg-xxxxxxxxxxxxxxxxx],assignPublicIp=ENABLED}" \
  --deployment-configuration "deploymentCircuitBreaker={enable=true,rollback=true,resetOnHealthyTask=true,thresholdConfiguration={type=COUNT,value=2}}"

After approximately 3–4 minutes, the circuit breaker was triggered.

{
    "rolloutState": "FAILED",
    "failedTasks": 2,
    "desiredCount": 2,
    "rolloutStateReason": "ECS deployment circuit breaker: tasks failed to start."
}

Summary of Verification Results

In this verification, all tasks failed immediately, meaning no tasks were determined to be healthy and no counter reset was triggered. As a result, no behavioral difference appeared between consecutive and cumulative. The true difference becomes apparent in mixed cases where "some tasks succeed and some fail."

  • consecutive (resetOnHealthyTask=true): Since the counter is reset when a task is determined to be healthy, it is harder to reach the threshold in cases where successes and failures are mixed.
  • cumulative (resetOnHealthyTask=false): Since the counter is not reset, even sporadic failures will trigger a rollback once they accumulate to the threshold.

| Configuration | Effective Threshold | failedTasks at FAILED | Time Required |
|---|---|---|---|
| Default (BOUNDED_PERCENT 50, consecutive) | 3 (lower limit applied) | 4 | Approx. 5 minutes |
| COUNT=2, cumulative | 2 | 3 | Approx. 4 minutes |
| COUNT=2, consecutive | 2 | 2 | Approx. 3–4 minutes |

The failedTasks value depends on the timing of the circuit breaker's internal evaluation and the timing of the describe-services retrieval, so it does not necessarily match the effective threshold.

Configuration in CloudFormation

The new parameters can also be specified in CloudFormation. Add them under DeploymentConfiguration.DeploymentCircuitBreaker in AWS::ECS::Service.

Resources:
  ECSService:
    Type: AWS::ECS::Service
    Properties:
      ServiceName: my-service
      Cluster: my-cluster
      TaskDefinition: !Ref TaskDefinition
      DesiredCount: 2
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: ENABLED
          Subnets:
            - !Ref SubnetA
            - !Ref SubnetB
          SecurityGroups:
            - !Ref SecurityGroup
      DeploymentConfiguration:
        DeploymentCircuitBreaker:
          Enable: true
          Rollback: true
          ResetOnHealthyTask: false
          ThresholdConfiguration:
            Type: COUNT
            Value: 2

When a stack was actually created with this template, the deployment circuit breaker was triggered after the ECS service creation process began. As a result, the CloudFormation stack was rolled back (this is the expected behavior since a failing task definition was used).

{
    "deploymentCircuitBreaker": {
        "enable": true,
        "rollback": true,
        "resetOnHealthyTask": false,
        "thresholdConfiguration": {
            "type": "COUNT",
            "value": 2
        }
    }
}

The following is an excerpt of the stack events. The circuit breaker was triggered approximately 2 minutes after service creation, confirming that the custom settings were applied in the same way as during CLI verification.

| Timestamp (UTC) | Resource | Status |
|---|---|---|
| 01:11:59 | ECSService | CREATE_IN_PROGRESS (Resource creation Initiated) |
| 01:14:00 | ECSService | CREATE_FAILED (ECS Deployment Circuit Breaker was triggered) |
| 01:14:56 | cb-test-circuit-breaker | ROLLBACK_COMPLETE |

The CLI parameter names correspond to the CloudFormation property names as follows.

| CLI Parameter | CloudFormation Property |
|---|---|
| resetOnHealthyTask | ResetOnHealthyTask |
| thresholdConfiguration.type | ThresholdConfiguration.Type |
| thresholdConfiguration.value | ThresholdConfiguration.Value |
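
For the properties listed here, the correspondence amounts to capitalizing the first letter of each key. As a minimal sketch, the hypothetical helper below (to_cloudformation_keys is not part of any AWS tool) converts a CLI-style structure into CloudFormation-style keys. It assumes this simple rule holds, which is true for the circuit breaker settings in this article but is not guaranteed for every ECS property.

```python
def to_cloudformation_keys(value):
    """Recursively capitalize the first letter of every dict key."""
    if isinstance(value, dict):
        return {
            key[:1].upper() + key[1:]: to_cloudformation_keys(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [to_cloudformation_keys(item) for item in value]
    return value  # scalars such as True, 2, or "COUNT" are left unchanged


cli_style = {
    "deploymentCircuitBreaker": {
        "enable": True,
        "rollback": True,
        "resetOnHealthyTask": False,
        "thresholdConfiguration": {"type": "COUNT", "value": 2},
    }
}

print(to_cloudformation_keys(cli_style))
```

The result has the same nesting as the DeploymentCircuitBreaker block in the template above, with Enable, Rollback, ResetOnHealthyTask, and ThresholdConfiguration (Type, Value) as keys.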

Conclusion

The ECS deployment circuit breaker now offers a choice of threshold types and failure count models. It is now possible to specify a fixed-count threshold using COUNT and to evaluate based on cumulative failure counts using the cumulative model, allowing finer adjustment of deployment failure detection conditions.

This verification confirmed that the default configuration operates equivalently to the previous BOUNDED_PERCENT 50 / consecutive behavior, and that COUNT=2 is treated as a fixed-count threshold. In cases where all tasks fail immediately, there are no tasks determined to be healthy, so no clear difference appeared between consecutive and cumulative.

When the new parameters are not explicitly specified, existing services are evaluated in the same way as before. The threshold and failure count model can now be adjusted to the use case, whether the goal is to detect failures early or to tolerate a certain number of temporary startup failures.
