[REPORT] Become a network support expert: We break it, you fix it #AWSreInvent #NET302-R1

2024.01.28

I took part in a workshop focused on troubleshooting network issues and investigating their root causes.

Workshop:
NET302-R1 | Become a network support expert: We break it, you fix it

In this workshop, you assume the role of a network support engineer at a fictitious company. Your task is to solve various issues in your AWS network environment and help your colleagues with network tasks. Troubleshoot and deploy fixes to several networking problems, perform root cause analysis, and automate your network using AWS features and services as well as standard network tools. Grow your troubleshooting skills, use AWS services and features in new ways, and learn more about operating a network on AWS. You must bring your laptop to participate.

Report

Agenda

  • What are we playing with? (Overview of the environment and services)
  • How are we playing with it? (Introduction to the labs)
  • Alright, so how do I get started? (Access the workshop environment)
  • Fix the environment!

Workshop

In this session you will assume the role of a network support engineer at the fictitious company AnyCompany. Your task is to help by solving various network-related issues in your AWS network environment.

  • Troubleshooting network issues
  • Using AWS tools to analyze the root cause
  • Reviewing logs to improve troubleshooting skills

This is not a typical step-by-step walkthrough workshop. You tackle the lab challenges using your current knowledge and the hints provided on every lab page.

The Labs

All the labs are located in Amazon CloudWatch. To get started, go to the CloudWatch console and run the custom CloudWatch scripts that AWS provides. After that, you can start working on the labs.

Here is a screenshot of a lab dashboard.

  1. Menu
    You can switch between the labs here. You can do the labs in any order, but AWS recommends working through them in order, from Lab 1 to Lab 7.

  2. Hint
    Each lab provides some hints. During the workshop, AWS recommended trying to solve the labs initially without using hints.

  3. Solution
    This gives you a step-by-step solution for the lab. If you are stuck and cannot find the answer even after researching on your own and checking the hints, refer to it for guidance.

  4. Lab status
    You can check your lab status here. The status changes depending on your progress. Once you have solved the issue, the status changes to LAB COMPLETED.
    If you complete a lab without reading all the hints or the solution, AWS recommends reading them afterward anyway, because they may contain useful information and things you had not thought of.

  5. Overview
    This gives you some background on the application, system, or environment you are going to troubleshoot.

  6. Problem
    This specifies what you need to solve in the lab.

Here are the problems for each lab.

Lab 1: Move to Centralized Egress Pattern

Can you find a way to get information about public IPv4 usage and share it with the cloud economics team? Also, the Network team has identified that moving to a Central Egress solution will save cost. Can you help move to the central egress design, ensuring that instances in private subnets reach the internet via the Central Egress NAT gateway and that there is no NAT gateway in the individual Spoke VPCs?
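To make the target routing concrete, here is a minimal sketch of how a route table picks a target by longest-prefix match. It is not taken from the workshop solution. The CIDR ranges, the `tgw-attachment` label, and the `lookup_route` helper are all assumptions made up for illustration.

```python
import ipaddress


def lookup_route(routes, destination):
    """Return the target of the most specific route that matches destination.

    routes maps CIDR strings to target labels. Returns None if nothing matches.
    """
    dest = ipaddress.ip_address(destination)
    best = None
    for cidr, target in routes.items():
        network = ipaddress.ip_network(cidr)
        # Keep the matching route with the longest prefix (most specific).
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, target)
    return best[1] if best else None


# Hypothetical private-subnet route table in a spoke VPC after the migration:
# local traffic stays in the VPC, everything else goes to the Transit Gateway.
spoke_private_routes = {
    "10.1.0.0/16": "local",
    "0.0.0.0/0": "tgw-attachment",
}

print(lookup_route(spoke_private_routes, "10.1.2.3"))       # local
print(lookup_route(spoke_private_routes, "93.184.216.34"))  # tgw-attachment
```

In a centralized egress design, the default route of the spoke private subnets would point at the Transit Gateway rather than at a NAT gateway inside the spoke VPC, which is what the second lookup illustrates.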

Lab 2: Restrict Access to Specific Domains

The resources in the private subnets in the spoke VPCs can connect to the internet using Transit Gateway and the Centralized Egress VPC, but the latest security regulations in the organization require limiting access to github.com and amazonaws.com and denying access to any other domain. The EC2 instance (Central monitor) in the private subnet in the Centralized Egress VPC, which uses the same centralized NAT and Internet Gateway, also needs to be restricted to the specific domains. Can you figure out how to ensure that instances in private subnets can access only github.com and amazonaws.com and no other domain?
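The allow-list logic in this requirement can be sketched in a few lines. This only illustrates the matching rule (the two domains and their subdomains are allowed, everything else is denied). It does not show how the workshop enforces it, and the `is_allowed` helper is a hypothetical name.

```python
# Domains named in the lab's problem statement.
ALLOWED_DOMAINS = ("github.com", "amazonaws.com")


def is_allowed(hostname, allowed=ALLOWED_DOMAINS):
    """Return True if hostname is an allowed domain or one of its subdomains."""
    host = hostname.strip().rstrip(".").lower()
    # Match the exact domain, or a suffix preceded by a dot, so that
    # "evilgithub.com" does not pass as "github.com".
    return any(host == domain or host.endswith("." + domain) for domain in allowed)


for name in ("api.github.com", "s3.us-east-1.amazonaws.com", "example.com", "evilgithub.com"):
    print(name, is_allowed(name))
```

Matching on a dot-prefixed suffix instead of a plain substring is the important detail: a naive substring check would also let through look-alike domains.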

Lab 3: East West Traffic Inspection

The private instance in the WebApp VPC is unable to reach the IOT backend server in the IOT VPC. Can you figure out the issue and fix it to ensure that the instance in the WebApp VPC can connect to the IOT backend server on port 8080?
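A quick way to confirm whether a fix like this worked is to test the TCP port from the client side. The following is a minimal sketch using only the Python standard library. The `can_connect` helper is a hypothetical name, and it assumes Python is available on the instance you are testing from.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `can_connect("<backend-private-ip>", 8080)` run from the WebApp instance would tell you whether the path to the backend is open. A timeout usually points to something silently dropping the traffic along the path, while an immediate refusal suggests the packets arrive but nothing is listening on the port.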

Lab 4: Encrypt and Decryption problem

Our internal CMS server seems to be able to connect successfully to its database; however, we get some strange errors when querying the data. It seems like the CMS server can't decrypt the data. The engineer who implemented the encryption using AWS KMS is sure that the IAM permissions are correct and that the problem is in the networking layer.

Lab 5: Download Amazon Linux 2 packages from S3 via Centralized S3 Interface Endpoint

The air-gapped Reporting service instance can't access the S3 bucket via the S3 Interface Endpoint in the Central VPC.
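One generic first check for problems like this, not necessarily the cause in this lab, is whether the name the instance uses for the endpoint resolves to private addresses at all. Here is a minimal sketch. The `resolve` and `all_private` helpers are hypothetical names, and the hostname you pass in depends on your environment.

```python
import ipaddress
import socket


def resolve(hostname):
    """Return the sorted set of IP addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def all_private(addresses):
    """Return True if addresses is non-empty and every address is private."""
    return bool(addresses) and all(ipaddress.ip_address(a).is_private for a in addresses)
```

Running `all_private(resolve(name))` on the instance for the endpoint's DNS name shows whether traffic would be sent to a private address inside the network. If it returns False, the DNS configuration is a reasonable place to start looking. If it returns True, the next suspects are routing and security rules.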

Lab 6: Network segmentation

According to Transit Gateway Flow Logs, our resources in the air-gapped VPC can route traffic to VPCs other than the centralized VPC. This should not be possible.
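The kind of check described here can be sketched as a small filter over flow records. The record shape below, a plain `(source, destination)` pair, is a simplification of my own and not the actual Transit Gateway Flow Logs format. The CIDR ranges and the `find_violations` helper are also assumptions for illustration.

```python
import ipaddress


def find_violations(flows, source_cidr, allowed_cidrs):
    """Return flows that leave source_cidr for a destination outside allowed_cidrs."""
    source_net = ipaddress.ip_network(source_cidr)
    allowed_nets = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    violations = []
    for src, dst in flows:
        src_ip = ipaddress.ip_address(src)
        dst_ip = ipaddress.ip_address(dst)
        if src_ip not in source_net:
            continue  # only flows originating in the segmented VPC matter
        if dst_ip in source_net or any(dst_ip in net for net in allowed_nets):
            continue  # traffic inside the VPC or to an allowed VPC is fine
        violations.append((src, dst))
    return violations


# Hypothetical addressing: air-gapped VPC 10.5.0.0/16, centralized VPC 10.0.0.0/16.
sample_flows = [
    ("10.5.1.10", "10.0.1.20"),  # air-gapped -> centralized: expected
    ("10.5.1.10", "10.2.1.30"),  # air-gapped -> another VPC: should not happen
    ("10.1.1.5", "10.2.1.30"),   # unrelated source: ignored
]

print(find_violations(sample_flows, "10.5.0.0/16", ["10.0.0.0/16"]))
# [('10.5.1.10', '10.2.1.30')]
```

After fixing the segmentation, the same check against fresh flow records should return an empty list.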

Lab 7: Network LoadBalancer in Security Group

The Webapp team has created a new NLB security group for the Network Load Balancer, but they are unable to attach the security group to the Network Load Balancer. Can you help attach the NLB security group to the NLB and then update the security group rule for the Webapp instance to allow only traffic coming from the NLB security group?

Run in your own AWS account

This workshop is designed to be run at an AWS event where AWS provides the AWS accounts for the workshop. However, you can also run it self-paced in your own AWS account by deploying an AWS CloudFormation template.

For more details, please refer to the document below.

Welcome to the Become a network expert: We break it, you fix it workshop!

This workshop requires five VPCs, which is the default limit on the number of VPCs per region for an account. The CloudFormation template will also automatically delete your default VPCs across all regions. For these reasons, AWS recommends running this workshop in a dedicated AWS account.

For more information, please refer to the document below.

Run in your own AWS account

Conclusion

This workshop is a great opportunity to learn about AWS services you have never used, and it is also a good way to develop your research skills. If you are interested, I highly recommend giving it a try.