The Story of a Job Being Rejected Three Times with the Same Error on a Windows AWS Deadline Cloud Worker
Introduction
I'd like to share a problem I encountered when configuring AWS Deadline Cloud CMF (Customer-Managed Fleet) Workers on Windows and trying to use WORKER_AGENT_USER mode, which runs jobs under the same OS user as the Worker Agent.
When I submit a job, the session ends in FAILED in under one second, just before input file synchronization, and the following error appears in the Worker Agent log:
ERROR Session.Failed Job cannot run as WORKER_AGENT_USER. Worker Agent is running with Administrator privileges.
The message points at Administrator privileges. However, even after I addressed the permissions, the same message came back twice more. In the end there were three separate root causes, and the last one had nothing to do with privilege levels at all. This article covers the three fixes required.
What is AWS Deadline Cloud?
AWS Deadline Cloud is a managed service for building cloud-based render farms for 3DCG/VFX production. It provides the features needed to operate a render farm, including Worker auto-scaling, job management, and license delivery.
Target Audience
- Those building or considering building AWS Deadline Cloud CMF with Windows Workers
- Those who want to run jobs in WORKER_AGENT_USER mode
- Those hitting configuration requirements that official documentation alone cannot get them past
References
- AWS Deadline Cloud Developer Guide: Customer-managed fleets
- aws-deadline/deadline-cloud-worker-agent
- OpenJobDescription Specifications (job user / session model)
- Microsoft Learn: Replace a process level token (SeAssignPrimaryTokenPrivilege)
Prerequisite Configuration
WORKER_AGENT_USER mode runs jobs under the same OS user as the Worker Agent service. It is used in minimal configurations that do not set up a dedicated user for job execution.
Membership in the Administrators Group
install-deadline-worker creates a non-interactive user called deadline-worker on Windows and starts the service under this user. It also added this user to the Administrators group automatically. Because Deadline Cloud does not allow users in Administrators to run jobs in WORKER_AGENT_USER mode, every job is rejected.
The fix is to remove the user from the group and restart the service.
Remove-LocalGroupMember -Group Administrators -Member deadline-worker
Restart-Service DeadlineWorker -Force
SeAssignPrimaryTokenPrivilege
Even after removing the user from the group, the same error persists. Checking User Rights Assignment with secedit /export reveals that SeAssignPrimaryTokenPrivilege has been granted to deadline-worker. This is the privilege for replacing a process's token.
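The lookup in the exported User Rights Assignment can be sketched in Python as follows. This is a minimal sketch, not part of the Worker Agent: the function name privilege_holders and the sample text are hypothetical, and it assumes the export contains a [Privilege Rights] section with one "Name = principal,principal" line per privilege. How a local account appears in a real export (by name or by SID) may differ from the sample.

```python
def privilege_holders(inf_text: str, privilege: str) -> list[str]:
    """Return the principals listed for `privilege` in the [Privilege Rights] section."""
    in_section = False
    for raw in inf_text.splitlines():
        line = raw.strip()
        # Section headers look like "[Name]"; track whether we are in the right one.
        if line.startswith("[") and line.endswith("]"):
            in_section = line.lower() == "[privilege rights]"
            continue
        if not in_section or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip().lower() == privilege.lower():
            return [p.strip() for p in value.split(",") if p.strip()]
    return []


# Hypothetical export content, for illustration only.
sample = """[Unicode]
Unicode=yes
[Privilege Rights]
SeAssignPrimaryTokenPrivilege = *S-1-5-19,*S-1-5-20,deadline-worker
[Version]
signature="$CHICAGO$"
"""

holders = privilege_holders(sample, "SeAssignPrimaryTokenPrivilege")
granted_to_worker = "deadline-worker" in holders
```

When reading a real export from disk, note that these files are typically UTF-16 encoded, so the text would need to be decoded accordingly before being passed to the function.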
Deadline Cloud's check treats not only members of Administrators but also holders of this privilege as equivalent to Administrator. Because install-deadline-worker grants this privilege automatically, removing the group membership alone does not change the outcome. Removing the privilege lets the check proceed to the next step.
@'
[Unicode]
Unicode=yes
[Version]
signature="$CHICAGO$"
Revision=1
[Privilege Rights]
SeAssignPrimaryTokenPrivilege = *S-1-5-19,*S-1-5-20
'@ | Set-Content -Path C:\Windows\Temp\fix-priv.inf -Encoding Unicode
secedit /configure /db C:\Windows\Temp\fix-priv.sdb /cfg C:\Windows\Temp\fix-priv.inf /areas USER_RIGHTS
Restart-Service DeadlineWorker -Force
One caveat: removing this privilege gets past the check, but the privilege is needed later. Once run_jobs_as_agent_user = true (described below) is enabled, it is required to launch child processes for jobs, so it ultimately has to be restored. For now, remove it and proceed to the next fix.
Hardcoded Check in scheduler.py
Even after removing the privilege, the same error message recurs. At this point, I directly read the Worker Agent source on the Worker EC2 instance and identified the actual cause.
C:\Program Files\Python311\Lib\site-packages\deadline_worker_agent\scheduler\scheduler.py
The relevant section contained the following check:
# For Windows the WA runs as Administrator so fail jobs that were configured to runAs - WORKER_AGENT_USER
if (
    os.name == "nt"
    and self._job_run_as_user_override.job_user is None
    and not self._job_run_as_user_override.run_as_agent
    and job_details.job_run_as_user
    and job_details.job_run_as_user.is_worker_agent_user
):
    err_msg = "Job cannot run as WORKER_AGENT_USER. Worker Agent is running with Administrator privileges."
    self._fail_all_actions(session_spec, err_msg)
Although the error message mentions Administrator privileges, this check does not examine runtime privileges at all. It is hardcoded behavior that rejects WORKER_AGENT_USER mode by default simply because os.name == "nt", that is, because the host is Windows. If you take the error message at face value, no amount of privilege adjustment will get you past this point.
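To make the point concrete, the condition can be reduced to a small standalone model. This is a simplified sketch, not the Worker Agent's code: the Override class and the rejects_worker_agent_user function are hypothetical stand-ins, the job-side check is collapsed into a single boolean, and it assumes that run_as_agent reflects the opt-in setting in worker.toml. Note that no privilege or group information appears among the inputs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Override:
    """Hypothetical stand-in for the agent's job-user override settings."""
    job_user: Optional[str] = None  # an explicitly configured job user, if any
    run_as_agent: bool = False      # assumed to reflect the worker.toml opt-in


def rejects_worker_agent_user(os_name: str, override: Override, job_is_agent_user: bool) -> bool:
    """Mirror the shape of the quoted condition: True means the job is failed."""
    return (
        os_name == "nt"
        and override.job_user is None
        and not override.run_as_agent
        and job_is_agent_user
    )


# Default configuration on Windows: rejected regardless of privileges.
default_windows = rejects_worker_agent_user("nt", Override(), True)
# Same job after opting in: accepted.
opted_in_windows = rejects_worker_agent_user("nt", Override(run_as_agent=True), True)
# Same job on a non-Windows host: the check never fires.
default_posix = rejects_worker_agent_user("posix", Override(), True)
```

Under this model, the only inputs that flip the result on Windows are the override settings, which is why the first two fixes could not change the error message on their own.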
The correct approach is to explicitly opt in via worker.toml.
[os]
run_jobs_as_agent_user = true
Additionally, SeAssignPrimaryTokenPrivilege, which was removed in the second fix, is needed to launch child processes, so restore it to deadline-worker at this point.
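One way to restore the privilege is to reuse the same security template as in the second fix, this time with deadline-worker added back to the list. The sketch below only generates the template text; build_privilege_inf is a hypothetical helper, and listing the account by name rather than by SID is an assumption that should be verified on a test instance before relying on it.

```python
def build_privilege_inf(privilege: str, principals: list[str]) -> str:
    """Build a security template that assigns `privilege` to `principals`."""
    lines = [
        "[Unicode]",
        "Unicode=yes",
        "[Version]",
        'signature="$CHICAGO$"',
        "Revision=1",
        "[Privilege Rights]",
        f"{privilege} = {','.join(principals)}",
    ]
    return "\n".join(lines) + "\n"


# Same built-in SIDs as in the earlier template, plus the service account.
restore_inf = build_privilege_inf(
    "SeAssignPrimaryTokenPrivilege",
    ["*S-1-5-19", "*S-1-5-20", "deadline-worker"],
)
```

The generated text would then be saved with the same Unicode encoding and applied with the same secedit /configure command used earlier, followed by a restart of the DeadlineWorker service.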
Final State
The combination of settings that must be in place after all three fixes are applied is as follows:
| Item | State |
|---|---|
| deadline-worker membership in Administrators | Removed |
| deadline-worker's SeAssignPrimaryTokenPrivilege | Retained (required for run_jobs_as_agent_user) |
| run_jobs_as_agent_user in worker.toml | true |
Only with this combination in place will jobs in WORKER_AGENT_USER mode actually run.
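The final state can also be expressed as a small checklist function. This is an illustrative sketch only: WorkerState and final_state_problems are hypothetical names, and the three booleans are assumed to be gathered by whatever means suits your environment.

```python
from dataclasses import dataclass


@dataclass
class WorkerState:
    """Hypothetical snapshot of the three settings from the table above."""
    in_administrators: bool
    has_assign_primary_token: bool
    run_jobs_as_agent_user: bool


def final_state_problems(state: WorkerState) -> list[str]:
    """Return the remaining actions; an empty list means the final state is met."""
    problems = []
    if state.in_administrators:
        problems.append("remove deadline-worker from Administrators")
    if not state.has_assign_primary_token:
        problems.append("grant SeAssignPrimaryTokenPrivilege to deadline-worker")
    if not state.run_jobs_as_agent_user:
        problems.append("set run_jobs_as_agent_user = true in worker.toml")
    return problems


# State right after a fresh install plus the second fix, as described in the article.
after_second_fix = final_state_problems(WorkerState(False, False, False))
# State after all three fixes.
final = final_state_problems(WorkerState(False, True, True))
```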
Summary
The same error message appeared three times on a Windows Deadline Cloud Worker, and behind it were three different root causes: membership in the Administrators group, SeAssignPrimaryTokenPrivilege, and a hardcoded check that rejects WORKER_AGENT_USER mode on Windows unless you explicitly opt in. Rather than taking error messages at face value, reading the Worker Agent source when needed is an effective way to operate Deadline Cloud on Windows.
I hope this serves as a useful reference for those building Deadline Cloud CMF Workers on Windows.