The project requirement was to create an alert system for the client. The purpose of the system is to alert the responsible agents whenever an outage occurs on the server where the client’s application is hosted.
We used the following tools to create the alert system.
- Twilio Studio
- Twilio Functions
- A cloud service (which hosts the application)
- A monitoring service
Use of each component
- Twilio Studio – A visual editor in Twilio for building communication workflows. Because it needs little scripting, issues in a flow are easy to troubleshoot.
- Twilio Functions – A serverless environment in Twilio that runs our Node.js code. It scales with the needs of the application.
- A cloud service – The cloud service hosts the application that the client has created. The application can be hosted on servers (e.g. EC2 in AWS) or in a serverless architecture (e.g. AWS Lambda). For our project we assumed that the cloud service is AWS and that the client application runs on AWS EC2 instances.
- Monitoring service – A monitoring service tracks the server’s performance across parameters such as network throughput and CPU status. When the server goes down, the monitoring service is triggered, which in turn triggers the Twilio setup to notify the agents. For our project we assumed the monitoring service to be Amazon CloudWatch.
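The hand-off from the monitoring service to Twilio can be sketched in Python as a small AWS Lambda handler that forwards the alarm to a Twilio Function over HTTP. This is a minimal sketch, not the project's actual code: the endpoint URL, the `alarm` payload field, the assumed event shape, and the names `build_alert_request` and `lambda_handler` are all illustrative assumptions.

```python
import json
import os
import urllib.request

# Hypothetical endpoint: replace with the URL of your deployed Twilio Function.
TWILIO_FUNCTION_URL = os.environ.get(
    "TWILIO_FUNCTION_URL", "https://example.invalid/alert"
)


def build_alert_request(alarm_name, url=TWILIO_FUNCTION_URL):
    """Build the POST request that tells the Twilio Function about the outage."""
    # The "alarm" field name is an assumption; use whatever your Function expects.
    payload = json.dumps({"alarm": alarm_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def lambda_handler(event, context):
    """Entry point invoked by the alarm; forwards the alarm name to Twilio."""
    # Assumed event shape: the alarm name sits under event["alarmData"]["alarmName"].
    alarm_name = event.get("alarmData", {}).get("alarmName", "unknown-alarm")
    request = build_alert_request(alarm_name)
    with urllib.request.urlopen(request, timeout=10) as response:
        return {"statusCode": response.status}
```

Keeping the request construction in its own function means the payload can be checked locally without making a network call.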
Functioning of the alert system
- Step 1: When the server hosting the client application goes down, the CloudWatch alarm fires and invokes a Lambda function. The Lambda function contains Python code that makes an HTTP request to the Twilio Functions API.
- Step 2: The agents need to be informed of the outage so that they can act quickly. When the Twilio Function’s API is called, the function returns an agent’s number and triggers the Twilio Studio flow.
- Step 3: The Studio flow contains the logic that calls the agent’s number returned by the Twilio Function.
  - If the operator answers the phone, the system verifies whether the operator is ready to take the problem. The operator confirms with a key press.
    - If the operator presses 1, they take the problem. A message tells them that there is an outage, and the system exits. The operator can then take the necessary action to resolve the issue.
    - If the operator presses 2, they are not available to take the issue; go to Step 4.
  - If the operator does not answer the phone, go to Step 4.
- Step 4: The function loops to fetch the number of another operator.
  - The system goes back to Step 1 and runs the steps in sequence again.
  - The loop in Step 4 runs at most three times if none of the operators answers the call.
  - After that, the alert system exits.
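The escalation logic above can be simulated in a few lines of Python. This is only a sketch under stated assumptions: `escalate` and `place_call` are hypothetical names, `place_call` stands in for the Studio flow, the phone numbers are placeholders, and the three-attempt limit is read here as three call attempts in total.

```python
from typing import Callable, Optional, Sequence

MAX_ATTEMPTS = 3  # assumed reading of "at most three times": three calls in total


def escalate(
    operators: Sequence[str],
    place_call: Callable[[str], Optional[str]],
    max_attempts: int = MAX_ATTEMPTS,
) -> Optional[str]:
    """Call operators in order until one presses 1 or the attempts run out.

    place_call is a hypothetical stand-in for the Studio flow: it returns
    the key pressed ("1" or "2"), or None when nobody answers.
    """
    for number in operators[:max_attempts]:
        key = place_call(number)
        if key == "1":
            # Operator accepted: the outage message is played and the system exits.
            return number
        # "2" (unavailable) or None (no answer): move on to the next operator.
    return None


# Usage: the first operator does not answer, the second declines, the third accepts.
responses = {"+15550000001": None, "+15550000002": "2", "+15550000003": "1"}
accepted = escalate(list(responses), responses.get)
print(accepted)
```

If no operator accepts within the limit, `escalate` returns `None`, which corresponds to the alert system exiting without a taker.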
The alert system is a useful addition to the monitoring mechanisms a client has already deployed. It is also compatible with any cloud server or on-premise server and can alert operators immediately when a server goes down.