Hi, I am Akshay Rao, working at Classmethod India. In this blog I will introduce an alert system for cloud servers. You might be wondering why another alert system is needed when every cloud platform already has a built-in one. Trust me, our system covers several gaps that the built-in ones leave open.
This alert system alerts the operator when a server is down. It also comes with additional features that the cloud-native alert systems do not provide. We will discuss these features below.
The system uses two platforms:
1. Your cloud platform where your servers are running
2. The Twilio platform
Some loopholes in cloud-native alert systems:
1. The alert is not looped back (repeated) until one of the operators responds.
2. The alert arrives as an email or SMS, but honestly, how many of us continuously check every message or email?
3. The alerts are sent to all operators at once rather than in a hierarchy. This lowers overall productivity, because two or more operators can end up working on the same task.
The system is built with all of the above loopholes in mind.
Our system is connected to your cloud platform through an API link. When a server goes down, the API is triggered, and our system fetches the phone numbers from the database and calls the operators in the order in which the numbers were entered.
Let's take an example: a rack of servers is assigned to a 10-person team, so all the team members' phone numbers are entered in the database in order. When a server goes down, the first operator is called. On receiving the call, the operator can either accept the task and attend to the problem, or reject it because they are busy with another task or outside working hours. If the operator rejects the call, it is forwarded to the next operator. If all the remaining operators also reject the task, the call loops back to the first operator and keeps looping until the specified number of loops is completed. This makes it far more likely that at least one operator attends to the problem.
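The call flow described above can be sketched in a few lines of Python. This is only an illustration of the ordered, looping escalation, not our production code: the function name `escalate`, the `place_call` callback, and the sample phone numbers are all hypothetical. In a real setup, `place_call` would place the call through Twilio and return whether the operator accepted; here it is a stand-in so the sketch runs on its own.

```python
from typing import Callable, Optional, Sequence


def escalate(
    operators: Sequence[str],
    place_call: Callable[[str], bool],
    max_loops: int,
) -> Optional[str]:
    """Call operators in their stored order until one accepts.

    `place_call` is a hypothetical callback: it calls one number and
    returns True if that operator accepted the task, False if rejected.
    Returns the number that accepted, or None if every operator
    rejected in every loop.
    """
    for _ in range(max_loops):       # loop back up to max_loops times
        for number in operators:     # hierarchy: one operator at a time
            if place_call(number):
                return number        # stop as soon as someone accepts
    return None


# Example with made-up numbers: everyone rejects in the first pass,
# and the second operator accepts when the call loops back.
team = ["+10000000001", "+10000000002", "+10000000003"]
attempts = []


def demo_call(number: str) -> bool:
    attempts.append(number)
    return number == "+10000000002" and len(attempts) > len(team)


accepted_by = escalate(team, demo_call, max_loops=3)
```

Because the function returns as soon as one operator accepts, nobody further down the list is disturbed, and a `None` result tells the caller that every loop was used up without anyone taking the task.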
- All alerts are delivered as phone calls, which helps the operator attend to the problem swiftly.
- The operator can choose to accept the task or forward it, which reduces the response time and increases overall productivity.
- The system is built for multi-cloud use, so there is no need to set up a native alert system on each cloud platform.
- Privileged access to edit the phone numbers in our real-time, secured database.
- The number of loops can be edited, which helps make sure that at least one operator attends to the problem.
- Store and edit any number of phone numbers in the database.
- Call any number of operators with the looping system.
- Helps ensure that at least one person attends to the problem.
Our team identified the loopholes in the built-in alert systems and found a solution. This was possible because the team worked with coordination and enthusiasm. Overall, I felt good about working with the team and about my contributions while developing the system.