Incident Procedure
-
If you are new, add your shadow to any PagerDuty incident (Delegate Incident).
-
If the problem cannot be resolved within 15 minutes and is affecting customers, use the #war-room so that third-level support is notified of the incident.
-
If the problem takes longer than that, create an ongoing incident in the Incident Manager.
-
If you need someone from SYSTEC or another team and no one ACKs your PagerDuty call or @mention in Slack, escalate to your team leader.
-
If there are issues with third-party infrastructure providers (e.g. Heroku or Compose), use Support for External Service and Resource as a guideline.
-
Incidents need a post-mortem. In Vienna it was decided to use PagerDuty post-mortems. The first person to come into contact with the incident is responsible for arranging the post-mortem meeting (involving everyone who took part in the incident management) and for following up on the steps agreed in the post-mortem to improve the system. Either kulo (channel-related) or draven (platform-related) should be invited to the post-mortem.
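The decision points in this procedure can be sketched as a small checklist function. This is only an illustration, not an existing tool: `IncidentState`, `next_steps`, and the step strings are hypothetical names, and the `prolonged` flag is a judgement call by the responder, because the procedure gives no exact threshold for "taking longer". Only the 15-minute limit comes from the procedure itself.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List

# The 15-minute limit is taken from the procedure; all names are hypothetical.
WAR_ROOM_THRESHOLD = timedelta(minutes=15)


@dataclass
class IncidentState:
    elapsed: timedelta                  # time since the incident was noticed
    affects_customers: bool
    resolved: bool = False
    prolonged: bool = False             # responder's judgement: "taking longer"
    needs_other_team: bool = False      # e.g. someone from SYSTEC is required
    other_team_acked: bool = False      # page or Slack @mention was ACKed
    third_party_provider: bool = False  # e.g. Heroku or Compose is involved


def next_steps(state: IncidentState) -> List[str]:
    """Return the procedure steps that apply to the given incident state."""
    steps: List[str] = []
    if (not state.resolved and state.affects_customers
            and state.elapsed >= WAR_ROOM_THRESHOLD):
        steps.append("raise in #war-room")
    if not state.resolved and state.prolonged:
        steps.append("create ongoing incident in Incident Manager")
    if state.needs_other_team and not state.other_team_acked:
        steps.append("escalate to team leader")
    if state.third_party_provider:
        steps.append("follow Support for External Service and Resource")
    # Every incident needs a post-mortem, regardless of how it went.
    steps.append("arrange post-mortem")
    return steps


if __name__ == "__main__":
    state = IncidentState(elapsed=timedelta(minutes=20),
                          affects_customers=True,
                          needs_other_team=True)
    for step in next_steps(state):
        print(step)
```

The function only lists which steps apply. It does not call PagerDuty or Slack, so it is safe to run as a checklist aid.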