AWS Top Engineers GameDay: How We Won with AI Agent Kiro Joining as a Team Member

We took 1st place at the AWS Top Engineers GameDay with a lead of roughly 33% over the 2nd-place team. Here's how we commanded and controlled the AI agent "Kiro" as a team member. The key to success was humans focusing on decision-making as "information hubs" while maximizing the AI's execution capability.
2026.02.20

In February 2026, I participated in the AWS GameDay "Microservice Magic" held for Japan AWS Top Engineers and won first place (450,585 points) among 13 teams.

Our team had three members: two AWS Top Engineers from different companies, plus one young next-generation IT specialist. Within the team, I mainly took on the role of commanding and controlling the AI agents.

In this article, while respecting Unicorn Rental's NDA (non-disclosure agreement), I'll share the behind-the-scenes strategy: how we fought alongside AI, and why such a large point gap opened up on this battlefield of top engineers.


Final Scores & Rankings

| Rank | Team Name | Score |
|------|-----------|-------|
| 1 | Amazing13 | 450,585 |
| 2 | 四天王 | 339,659 |
| 3 | マルチクラウドセキュリティの教科書 | 336,546 |
| 4 | Team EHA | 305,584 |
| 5 | We love Dr.Werner | 295,773 |

What is GameDay?

GameDay is a team competition in cloud operations hosted by AWS. The theme and format vary from round to round; some rounds put building skills at the center, others operations or security.

This time, the theme was "Microservice Magic." It involved handling microservices in a provided AWS environment and was a "round that tested operational capabilities."

There were 13 participating teams, and the time limit was approximately 3.5 hours.


Utilizing AI Agent (Kiro)

For this GameDay, Amazon's AI coding assistant Kiro was officially provided. It came pre-installed on a VSCode Server and was ready to use after SSO login. Using personal devices was also supported, and we could use the top-tier Opus 4.6 model freely without worrying about usage fees.

I had used Kiro in the "Security Battle Royale" at the end of last year, but at that time, the top-tier Opus model was not yet available.

https://dev.classmethod.jp/articles/aws-gameday-2025-security-battle-royale/

Without a doubt, Kiro's Opus 4.6 proved its worth in this GameDay. In every aspect, from understanding the situation to estimating causes to devising countermeasures, its performance felt like it was on an entirely different level from previous models.

Individual responses weren't particularly fast. But the depth of reasoning, and the precision of answers that hit the mark on the first try, more than made up for it. The dramatic reduction in trial and error ultimately made our problem-solving overwhelmingly faster.

Toward the end, responses from this model frequently timed out, forcing a switch to Sonnet 4.6. However, by this point, most of our critical work was already complete, so there wasn't much impact. Overall, I believe this competition allowed us to maximize Opus's potential.
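As a rough illustration (not the actual GameDay tooling: the `kiro` invocation, its flags, and the model IDs below are assumptions), this kind of timeout fallback is only a few lines of wrapper code:

```python
import subprocess

# Hypothetical model identifiers; the real IDs depend on the Kiro setup.
PREFERRED_MODEL = "opus-4.6"
FALLBACK_MODEL = "sonnet-4.6"

def ask(prompt: str, model: str, timeout_s: int = 180) -> str:
    """Send a single prompt via an assumed non-interactive `kiro` CLI call."""
    result = subprocess.run(
        ["kiro", "chat", "--model", model, prompt],  # assumed interface
        capture_output=True, text=True, timeout=timeout_s,
    )
    return result.stdout

def ask_with_fallback(prompt: str) -> str:
    """Prefer the stronger model; degrade to the faster one on timeout."""
    try:
        return ask(prompt, PREFERRED_MODEL)
    except subprocess.TimeoutExpired:
        return ask(prompt, FALLBACK_MODEL)
```

In practice we switched models by hand, but the point stands: when the preferred model degrades late in an event, a prepared fallback keeps the pipeline moving.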

Tasks Assigned to AI

  • Infrastructure construction and configuration changes
  • Investigating and responding to anomalies
  • Designing, implementing, and improving countermeasures
  • Devising methods to share context between agents

Why We Chose the CLI Version

While Kiro was available as an IDE integration, we chose the CLI version.
The IDE version excels at interactive assistance, but for the high-density parallel work this competition demanded, the CLI version's portability across execution environments and ease of multiplexing were major advantages. When treating AI not just as a tool but as an independent execution unit, the CLI offers far more freedom of control.
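To make "ease of multiplexing" concrete, here is a minimal sketch, with the caveat that the `kiro` command line shown is an assumption rather than the documented interface: each agent is just another child process with its own log.

```python
import subprocess

# One prompt per agent; incident response splits cleanly this way.
TASKS = {
    "mitigation": "Apply a temporary workaround for the failing service.",
    "investigation": "Find the root cause of the elevated error rate.",
}

procs = {}
for name, prompt in TASKS.items():
    log = open(f"{name}.log", "w")
    # Assumed invocation; each agent runs detached with its own log file.
    procs[name] = subprocess.Popen(
        ["kiro", "chat", prompt],
        stdout=log, stderr=subprocess.STDOUT,
    )

# Humans stay free for decision-making while the agents grind away.
for name, proc in procs.items():
    proc.wait()
    print(f"agent '{name}' exited with code {proc.returncode}")
```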

Parallel Operation of Multiple Agents

We didn't aim for massive parallelization from the start. Early on, we assigned tasks to a single agent and enjoyed the coffee and snacks provided by the organizers while it processed.
(GameDay is an event with plenty of refreshments and hospitality, and the organizers actively encourage breaks.)
However, as the agents' background processing time grew, a natural flow emerged: humans returning from a break would open another terminal, launch a new Kiro instance, share context with it, and send it to the front line.

Generally, "sequential deployment of forces" is considered a poor strategy, but it worked with AI agents because the cost of sharing context with new agents was extremely low.

Our operational cycle, sketched in code after this list, was:

  1. Issue instructions to an agent and move on to another agent or task while it processes
  2. Request status reports from completed agents
  3. When taking a break, have agents with accumulated context run Compact (context compression)
  4. After compaction completes, check progress before resuming work
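Here's that cycle written down as human-side bookkeeping. This is a sketch only: in reality the "agents" were live CLI sessions, and the token numbers below are invented thresholds, not Kiro constants.

```python
from dataclasses import dataclass, field

@dataclass
class AgentSession:
    name: str
    context_tokens: int = 0          # rough sense of how full the context is
    log: list[str] = field(default_factory=list)

COMPACT_AT = 150_000                 # invented threshold, not a Kiro constant

def step(agent: AgentSession, task: str) -> None:
    # 1. Issue instructions, then move on to another agent or task.
    agent.log.append(f"dispatched: {task}")
    # 2. When it finishes, request a status report.
    agent.log.append("status report collected")
    # 3. If context has piled up, run Compact while the humans take a break.
    if agent.context_tokens > COMPACT_AT:
        agent.context_tokens //= 10  # stand-in for the effect of compaction
        agent.log.append("compacted")
    # 4. Check progress, then resume with the next instruction.
```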

We also devised ways to share context between agents, so that each agent could autonomously grasp the situation; one plausible shape of such a mechanism is sketched below.
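The article keeps the exact mechanism vague, but one pattern that fits the description is a shared scratchpad file that every agent is told to read on startup and append to as it works. The file name and JSONL format here are assumptions:

```python
import datetime
import json
from pathlib import Path

# Assumed shared scratchpad; every agent is instructed to read this file
# before acting and to append a line whenever it learns something important.
SCRATCHPAD = Path("team_context.jsonl")

def post(agent: str, kind: str, message: str) -> None:
    """Append one timestamped finding so other agents (and humans) see it."""
    entry = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "agent": agent,
        "kind": kind,  # e.g. "finding", "change", "blocker"
        "message": message,
    }
    with SCRATCHPAD.open("a") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")

def catch_up() -> list[dict]:
    """What a newly launched agent reads to understand the current state."""
    if not SCRATCHPAD.exists():
        return []
    return [json.loads(line) for line in SCRATCHPAD.read_text().splitlines()]
```

A freshly launched agent runs catch_up() (or is simply told to read the file) before taking its first action, which is what kept the cost of adding agents so low.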

Parallelizing the work significantly improved our response speed. When problems occurred, being able to split the basics of incident response, temporary mitigation and root-cause investigation, across agents made us much faster to react. That said, agents occasionally stepped on each other's operations, so our coordination mechanisms left room for improvement; one possible fix is sketched below.
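The article doesn't describe how such conflicts were (or should be) prevented. One simple candidate, sketched here under the assumption that agents can be instructed to run a command before mutating anything, is a coarse lock so only one agent changes shared resources at a time:

```python
import os
import time
from contextlib import contextmanager

# Assumed convention: every agent must hold this lock before any mutating
# operation (deploys, config edits); read-only investigation stays lock-free.
LOCK_DIR = "/tmp/gameday-mutate.lock"

@contextmanager
def mutation_lock(timeout_s: float = 60.0):
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            os.mkdir(LOCK_DIR)  # mkdir is atomic, even across processes
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError("another agent is mid-change; retry later")
            time.sleep(0.5)
    try:
        yield
    finally:
        os.rmdir(LOCK_DIR)

# Usage: with mutation_lock(): apply_config_change()
```

This serializes changes without blocking the parallel investigation work, which is exactly the split we wanted between agents.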

Redundancy in the Work Environment

In an AI-dependent work style, the work environment itself can become a SPOF (single point of failure). Having alternative methods ready let us continue without interruption in the few situations where we needed them. "Work environment redundancy" has become essential in the age of AI-assisted work.

The Human Role — Information Hub and Decision Making

Were we completely dependent on AI agents? Not at all. There were clearly "jobs only humans could do."

First, during the initial setup phase, until the AI was up and stable, my teammates laid the groundwork through manual operations; I could focus on preparing the AI because they covered the early stages. Even after the AI stabilized, members kept functioning as sensors, detecting anomalies and running patrol-style monitoring. The humans were never working solo.

In collaborating with the AI agents, humans had two roles.

AI agents excel at completing the tasks placed in front of them. They're good at reading the "past" (existing code and settings), but they struggle to integrate the "present" (real-time situational changes) and the "future" (anticipating what might happen next). CLI-based agents also can't directly access information shown in a browser.

One role was being an "information hub," appropriately digesting and communicating information that AI couldn't access. The other was making decisions to redefine priorities like "focus on this now" as situations changed.

AI completes a given task with surprising speed and accuracy, but it fell to humans to decide what to show it, when to show it, and which tasks to abandon. One reason for our victory was how well "human decision-making" meshed with "AI execution power" on the foundation our team members had built.


Reflections

Issuing instructions to Kiro felt like having several excellent engineers at hand. At the same time, I realized that coordination between AI agents is even harder than within human teams. Not everything went smoothly: in the heat of the competition, "obvious" steps we skipped cost us points in a few situations.

Nevertheless, having humans orchestrate AI agents proved to be a significant advantage in team competition. I feel we've entered an era where one's "skill level as a user" of AI is being tested.

While GameDay requirements vary by round, I believe utilizing AI agents will become increasingly important. If you have the opportunity, I highly recommend participating.
