The primary goal of incident management is to resolve incidents, with either a temporary or a permanent fix, and restore the IT service. The sections below describe the main objectives and activities involved in the incident process.
Resolve incidents to reduce downtime to the business
Resolving an incident starts with registering it in the ITSM tool under a unique reference number. The incident is then categorized (hardware, software, and so on) and assigned to the appropriate team or person for quick action. Investigation and diagnosis follow, and a resolution is implemented by consulting knowledge articles, reference materials, or the known error database (KEDB). Once the issue is resolved, the incident is closed.
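The lifecycle described above can be sketched as a small state machine. This is only an illustration under stated assumptions: the `Incident` class, the `Status` names, and the `INC000001` reference format are hypothetical and do not correspond to any particular ITSM tool.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count
from typing import Optional


class Status(Enum):
    REGISTERED = "registered"
    CATEGORIZED = "categorized"
    ASSIGNED = "assigned"
    IN_DIAGNOSIS = "in_diagnosis"
    RESOLVED = "resolved"
    CLOSED = "closed"


# Hypothetical source of unique reference numbers; a real ITSM tool issues its own.
_reference_numbers = count(1)


def _next_reference() -> str:
    return f"INC{next(_reference_numbers):06d}"


@dataclass
class Incident:
    summary: str
    reference: str = field(default_factory=_next_reference)
    status: Status = Status.REGISTERED
    category: Optional[str] = None
    assignee: Optional[str] = None
    resolution: Optional[str] = None

    def _advance(self, expected: Status, new: Status) -> None:
        # Enforce the order: register, categorize, assign, diagnose, resolve, close.
        if self.status is not expected:
            raise ValueError(f"cannot move from {self.status.name} to {new.name}")
        self.status = new

    def categorize(self, category: str) -> None:
        self._advance(Status.REGISTERED, Status.CATEGORIZED)
        self.category = category

    def assign(self, assignee: str) -> None:
        self._advance(Status.CATEGORIZED, Status.ASSIGNED)
        self.assignee = assignee

    def investigate(self) -> None:
        self._advance(Status.ASSIGNED, Status.IN_DIAGNOSIS)

    def resolve(self, resolution: str) -> None:
        # The resolution text might cite a knowledge article or KEDB entry.
        self._advance(Status.IN_DIAGNOSIS, Status.RESOLVED)
        self.resolution = resolution

    def close(self) -> None:
        self._advance(Status.RESOLVED, Status.CLOSED)
```

Modelling each step as an explicit transition means a ticket cannot be closed before it has been resolved, which mirrors the ordering in the text.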
Improve the quality of IT services and increase the availability of service operations
Incident management can improve the quality of IT services by identifying recurring incidents and logging problem tickets to find their root cause. If a new incident has no known resolution, a problem ticket is created to identify the root cause and a fix.
Once recurring incidents and their associated configuration items (CIs) have been identified, availability management, capacity management, information security management, or continuity management can revise their respective plans and procedures to improve service delivery.
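One simple way to spot recurring incidents is to count tickets per CI and category. The sketch below assumes each ticket is a dictionary with hypothetical `ci` and `category` keys, and the threshold of three occurrences is an arbitrary illustrative value.

```python
from collections import Counter


def find_recurring(incidents, threshold=3):
    """Return (ci, category) pairs seen at least `threshold` times."""
    # Group tickets by the CI they affect and by their category.
    counts = Counter((i["ci"], i["category"]) for i in incidents)
    return {key: n for key, n in counts.items() if n >= threshold}
```

Each pair returned would be a candidate for a problem ticket and for review by the relevant management process.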
Monitoring of services, detecting and mitigating incidents
Because the incident management team in many organizations is also involved in monitoring, it gets a complete picture of why an incident occurred and which errors, warnings, or exceptions were raised. The monitoring team can then consolidate the information from monitoring and event management tools and pass it to the problem management team for quicker resolution of unknown incidents.
Communicate the progress of major incidents to all stakeholders
The incident management team communicates the progress of major incidents to all necessary stakeholders, from the moment the incident is registered until it is closed.
The team sends notifications at regular intervals, every half hour or at the defined timelines, to all relevant stakeholders, with information such as:
- What is the incident?
- What is the priority?
- When did the incident occur?
- Where is the incident happening, or where did it happen?
- What is the associated CI?
- How many people are impacted?
- Who is working on the issue?
- Estimated time to resolve the incident
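The items in this list map naturally onto a structured status update. The following is a minimal sketch, assuming hypothetical field names and a plain-text message format; the 30-minute default interval reflects the half-hour cadence mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class MajorIncidentUpdate:
    # Field names are illustrative; one field per question in the list above.
    reference: str
    description: str
    priority: str
    occurred_at: datetime
    location: str
    affected_ci: str
    users_impacted: int
    assigned_to: str
    estimated_resolution: datetime

    def render(self) -> str:
        """Build the plain-text body of a stakeholder notification."""
        lines = [
            f"Incident: {self.reference} - {self.description}",
            f"Priority: {self.priority}",
            f"Occurred at: {self.occurred_at:%Y-%m-%d %H:%M}",
            f"Location: {self.location}",
            f"Associated CI: {self.affected_ci}",
            f"People impacted: {self.users_impacted}",
            f"Working on it: {self.assigned_to}",
            f"Estimated resolution: {self.estimated_resolution:%Y-%m-%d %H:%M}",
        ]
        return "\n".join(lines)


def next_update_due(last_sent: datetime,
                    interval: timedelta = timedelta(minutes=30)) -> datetime:
    """Return when the next notification should go out."""
    return last_sent + interval
```

Keeping the update as a single object makes it harder to send a notification that silently omits one of the expected details.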
Ensure SLAs are not breached for any reason
The incident manager and the incident management team must ensure that the SLA is not breached on any incident ticket for any reason, such as third-party involvement, negligence by the incident management team, or a dependency on another problem or change ticket.
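A simple check can flag tickets that are approaching their SLA deadline before a breach happens. This sketch makes several assumptions: the resolution targets in `SLA_TARGETS` are made-up example values, the 80% warning threshold is arbitrary, and the clock is never paused (for example, while waiting on a third party).

```python
from datetime import datetime, timedelta

# Hypothetical resolution targets per priority; real values come from the agreed SLA.
SLA_TARGETS = {
    "P1": timedelta(hours=4),
    "P2": timedelta(hours=8),
    "P3": timedelta(hours=24),
}


def sla_state(opened_at: datetime, priority: str, now: datetime,
              warn_fraction: float = 0.8) -> str:
    """Classify a ticket as 'ok', 'at_risk', or 'breached'."""
    target = SLA_TARGETS[priority]
    elapsed = now - opened_at
    if elapsed >= target:
        return "breached"
    # Warn once a configurable share of the allowed time has been used.
    if elapsed >= target * warn_fraction:
        return "at_risk"
    return "ok"
```

Running such a check periodically across open tickets gives the incident manager time to escalate while the ticket is still only at risk.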
Measure the effectiveness of incident management operations
The incident manager tracks the effectiveness of incident management operations by defining metrics and key performance indicators (KPIs) such as:
- Number of incidents
- Number of major incidents
- Number of recurring incidents
- Average time taken to resolve an incident
- Average time taken to resolve a major incident
- Incidents that triggered problem tickets
- Incidents that triggered change tickets
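The metrics above can be computed from exported ticket data. The sketch below assumes each ticket is a dictionary with hypothetical keys (`opened_at`, `resolved_at`, `major`, `recurring`, `problem_ticket`, `change_ticket`); the sample records are invented purely to show the expected shape.

```python
from datetime import datetime, timedelta


def compute_kpis(incidents):
    """Compute the KPIs listed above from a list of ticket dictionaries."""

    def average_resolution(items):
        # Only resolved tickets contribute to the average.
        durations = [i["resolved_at"] - i["opened_at"]
                     for i in items if i.get("resolved_at")]
        return sum(durations, timedelta()) / len(durations) if durations else None

    major = [i for i in incidents if i.get("major")]
    return {
        "total": len(incidents),
        "major": len(major),
        "recurring": sum(1 for i in incidents if i.get("recurring")),
        "avg_resolution": average_resolution(incidents),
        "avg_major_resolution": average_resolution(major),
        "triggered_problem": sum(1 for i in incidents if i.get("problem_ticket")),
        "triggered_change": sum(1 for i in incidents if i.get("change_ticket")),
    }


# Invented sample data, for illustration only.
sample_incidents = [
    {"opened_at": datetime(2024, 1, 1, 9, 0), "resolved_at": datetime(2024, 1, 1, 11, 0),
     "recurring": True, "problem_ticket": True},
    {"opened_at": datetime(2024, 1, 2, 9, 0), "resolved_at": datetime(2024, 1, 2, 13, 0),
     "major": True, "change_ticket": True},
    {"opened_at": datetime(2024, 1, 3, 9, 0), "resolved_at": None},
]
```

Calling `compute_kpis(sample_incidents)` returns a dictionary with one entry per metric; unresolved tickets count toward the totals but are left out of the averages.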