Standard Operating Procedure Template


Standard Operating Procedure

[Title]

[Version]

[Company Name]

[Street Address]

[City, State Zip Code]

[Creation Date]

Notes:

• The following template is provided for writing a Standard Operating Procedure (SOP) document.

• [Inside each SOP section, text in green font between brackets is included to provide guidance to the author and should be deleted before publishing the final document.]

• Inside each section, text in black font is included to provide a realistic example in which a Standard Operating Procedure is written for the first-line support in an Incident Management process.

• You are free to edit and use this Standard Operating Procedure template and its contents within your organization; however, we do ask that you don't distribute this template on the web without explicit permission from us.

Copyrights: ITIL® is a Registered Trade Mark of the Office of Government Commerce in the United Kingdom and other countries.


Document Control

Preparation

Action Name Date

Prepared by:

Release

Version Date Released Change Notice Pages Affected Remarks

1.0 N/A All First Release

Distribution List


1. INTRODUCTION
1.1 Purpose
1.2 Scope
1.3 Responsibilities
1.4 Summary

2. PROCEDURE
2.1 Receive Call and Create Incident Ticket
2.2 Validate Incident
2.3 Gather Information
2.4 Identify Configuration Items (CI) Affected
2.5 Categorize Incident
2.6 Look for Duplicate Incidents
2.7 Determine Impact
2.8 Determine Urgency
2.9 Calculate Priority
2.10 Process Major Incident
2.11 Perform First Diagnosis
2.12 Escalations
2.13 Get a Workaround or Resolution
2.14 Create a Resolution Plan
2.15 Apply the Resolution Plan
2.16 Check Restoration of Normal Service
2.17 Initiate a Problem Management Process for Recurring Incidents
2.18 Get User Satisfaction
2.19 Close the Incident

3. HANDLING OF EXCEPTIONS
3.1 Major Incidents
3.2 Functional Escalation
3.3 Hierarchical Escalation

4. ANNEX
4.1 Glossary
4.2 List of tables
4.3 Bibliography


1. Introduction

[ITIL Standard Operating Procedures (SOPs) are the documented procedures for routine work, exception response and making changes for every device, system or procedure. SOPs are used by IT Operations Management as part of ITIL Service Operations.

This section is devoted to providing overall information about the document. The example provided in this template is for Incident Management - First Line Support.]

1.1 Purpose

[Specify what the intention of the whole Standard Operating Procedure (SOP) document is.]

The purpose of this document is to describe the procedures that the first-line support must perform as part of the Incident Management process.

1.2 Scope

[Define here the scope and limits on which the procedures are applied.]

This document encompasses all of the activities that the first-line support must perform in handling incidents originating in the applications and IT infrastructure within the organization. It does not include the handling that the team must perform for other types of requests that reach it, such as service requests.

1.3 Responsibilities

[Describe the role or roles performed by the team, department or group targeted by this document. Also list their responsibilities.]

The first-line support performs the roles of Incident Owner and Incident Analyst. Each member of the team is responsible for the handling of assigned incidents. Their responsibilities are:

• Oversee the handling of the incident from start to closure.

• Find the Configuration Items (CI) affected.

• Perform initial diagnosis.

• Escalate the incident to the corresponding skilled team when needed.

• Apply workarounds and permanent solutions when possible and permitted under their knowledge and authorization.

• Ensure the incident information is updated.


The first-line support works under the supervision of the Incident Manager.

1.4 Summary

[Describe here the structure of the rest of the Standard Operating Procedure document.]

The main activities in the document are described in Section 2, Procedure. Section 3, Handling of Exceptions, describes how to perform special activities such as the handling of major incidents and escalations.


2. Procedure

[The Standard Operating Procedure (SOP) describes the routine work that needs to be done for every device, system or procedure. It also outlines the procedures to be followed if an exception is detected or if a change is required. List here the activities that should be performed.]

2.1 Receive Call and Create Incident Ticket

To increase the reliability and effectiveness of the Incident Management process, users are encouraged to report incidents and create the corresponding ticket from the web-based console. Efforts are also made to detect incidents and, when possible, self-heal through the Event Management automated tools. In both cases the incident ticket is created from sources other than the first-line support. In all other cases, such as when the incident is reported only by a call, the incident ticket must be created manually. Do so by using the Incident Management module of the Service Management Automated System. Fill in all the required information (marked with a “*”) and any optional information deemed helpful to the case.
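The rule that every field marked with a “*” must be filled in can be sketched as a simple check. This is only an illustration: the field names and the `missing_required_fields` helper are hypothetical and are not taken from the Service Management Automated System.

```python
# Hypothetical required fields; the real form marks its own with "*".
REQUIRED_FIELDS = ("reporter", "contact", "description", "service_affected")


def missing_required_fields(ticket: dict) -> list:
    """Return the required fields that are absent or blank in the ticket."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = ticket.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


ticket = {
    "reporter": "J. Doe",
    "contact": "",
    "description": "Cannot log in to the payroll application",
}
print(missing_required_fields(ticket))  # ['contact', 'service_affected']
```

A ticket would be saved only when the returned list is empty.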

2.2 Validate Incident

If the purported incident is actually a service request then reclassify it and direct it into the Request Fulfillment module.

Check the incident data. Modify or complete the information wherever needed. Classify the incident into one of the pre-defined types. If a new type is needed, classify the incident as “Generic” and coordinate with the Incident Manager to initiate a Change process.

2.3 Gather Information

Gather as much information as required to understand the causes and solve the incident. You can also import any file deemed relevant into the incident record workspace.

2.4 Identify Configuration Items (CI) Affected

Identify the CIs affected. This includes CIs that are failing, degrading, or at imminent risk of failing or degrading as a result of the incident. Record them in the incident data. Check for dependencies in the appropriate tab. Alert Configuration Management if there is any discrepancy in the Configuration Management System (CMS).

2.5 Categorize Incident

Categorize according to what the incident appears to be. Categorization determines the initial handling of the incident and could possibly be changed later.

2.6 Look for Duplicate Incidents

Use the option “Search duplicated incidents” to find previous incidents that are similar or related. Then use the corresponding option to concatenate the incident with similar incidents or to relate it to related incidents.

2.7 Determine Impact

Assign a value for impact. By default, the system calculates impact according to the CIs affected.
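One way a default impact could be derived from the affected CIs is to take the highest criticality among them. The sketch below assumes that rule purely for illustration; the CI names, the 1 to 3 scale and the `default_impact` function are hypothetical, and the actual system may calculate impact differently.

```python
# Hypothetical criticality ratings per CI (3 = high, 2 = medium, 1 = low).
CI_CRITICALITY = {
    "payroll-db": 3,
    "intranet-web": 2,
    "print-server": 1,
}


def default_impact(affected_cis: list) -> int:
    """Return the highest criticality among the affected CIs.

    Unknown CIs are treated as low (1) in this sketch; a real system
    might instead flag them as a CMS discrepancy.
    """
    if not affected_cis:
        return 1
    return max(CI_CRITICALITY.get(ci, 1) for ci in affected_cis)


print(default_impact(["intranet-web", "print-server"]))  # 2
```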

2.8 Determine Urgency

Assign a value for urgency. By default, urgency equals the value set for the type of incident. Only the Incident Manager role may change this value.

2.9 Calculate Priority

Priority is automatically calculated by the system, combining the impact and urgency according to a pre-defined set of rules. To change the rules for calculating priority, a Change process must be initiated by the Incident Manager. Only the Incident Manager may override the priority calculated by the system.
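The combination of impact and urgency into a priority can be pictured as a lookup table. The following sketch assumes a three-level scale and a 1 to 5 priority range, which are common conventions but are not specified in this document; the rule values, the role name and the `calculate_priority` function are all hypothetical.

```python
# Hypothetical rule table: (impact, urgency) -> priority, where 1 is highest.
PRIORITY_RULES = {
    ("high", "high"): 1,
    ("high", "medium"): 2,
    ("high", "low"): 3,
    ("medium", "high"): 2,
    ("medium", "medium"): 3,
    ("medium", "low"): 4,
    ("low", "high"): 3,
    ("low", "medium"): 4,
    ("low", "low"): 5,
}


def calculate_priority(impact, urgency, override=None, role=None):
    """Look up the priority, or apply an override from the Incident Manager."""
    if override is not None:
        # Only the Incident Manager may override the calculated priority.
        if role != "incident_manager":
            raise PermissionError("Only the Incident Manager may override priority")
        return override
    return PRIORITY_RULES[(impact, urgency)]


print(calculate_priority("high", "medium"))  # 2
print(calculate_priority("low", "low", override=1, role="incident_manager"))  # 1
```

Keeping the rules in a single table mirrors the requirement that changing them goes through a Change process: the table is the only thing that would need to be modified.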

2.10 Process Major Incident

For incidents of the highest impact, the system advises treating them as major incidents. The Incident Manager can select the option “Treat as Major Incident” for any other incident. See section 3.1 Major Incidents.

2.11 Perform First Diagnosis

Investigate the incident to find its causes, its effects and the means of solving it. Look for solutions in the following sources:

• Document “Common incidents and troubleshooting”.

• Web-based knowledge base.

• List of known errors for each application.

• Experiences from related incidents.

• Common sense.

2.12 Escalations

If you cannot solve the incident within the stipulated times for the first-line support, or if the investigation and solution require specialized knowledge, you should perform a functional escalation to the appropriate team at the second or third level. See section 3.2 Functional Escalation.

A hierarchical escalation is also needed when the incident should be treated as a major incident, or when the solution requires authorization from the appropriate level of decision. Most hierarchical escalations go first to the Incident Manager. See section 3.3 Hierarchical Escalation.
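The two escalation conditions described above can be expressed as a small decision function. This is a sketch under stated assumptions: the parameter names, the use of minutes as the time unit and the `required_escalations` function are hypothetical and only restate the rules in this section.

```python
def required_escalations(
    elapsed_minutes: int,
    first_line_limit_minutes: int,
    needs_specialist: bool,
    is_major_incident: bool,
    needs_authorization: bool,
) -> list:
    """Return which escalations apply; both may be needed at once."""
    escalations = []
    # Functional: time limit for the first line exceeded, or specialist knowledge needed.
    if elapsed_minutes > first_line_limit_minutes or needs_specialist:
        escalations.append("functional")
    # Hierarchical: major incident, or the solution requires authorization.
    if is_major_incident or needs_authorization:
        escalations.append("hierarchical")
    return escalations


print(required_escalations(45, 30, False, False, True))
# ['functional', 'hierarchical']
```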


2.13 Get a Workaround or Resolution

Remember that your goal is to restore service as soon as possible. If a permanent solution can be implemented within the agreed response times, apply it. If not, try an effective workaround and recommend an analysis by Problem Management at the end of the process.
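The choice between a permanent solution and a workaround can be sketched as follows. The function name, its parameters and the fallback to escalation when no workaround exists are assumptions made for illustration; the fallback follows the spirit of section 2.12 but is not stated in this section.

```python
def choose_action(fix_minutes: int, remaining_minutes: int, workaround_available: bool):
    """Return (action, recommend_problem_management) for an incident.

    fix_minutes: estimated time to implement the permanent solution.
    remaining_minutes: time left within the agreed response time.
    """
    if fix_minutes <= remaining_minutes:
        return ("apply_permanent_solution", False)
    if workaround_available:
        # A workaround restores service now; Problem Management analyzes later.
        return ("apply_workaround", True)
    # Assumption: with no workaround, fall back to escalation (section 2.12).
    return ("escalate", False)


print(choose_action(20, 60, True))   # ('apply_permanent_solution', False)
print(choose_action(120, 60, True))  # ('apply_workaround', True)
```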

2.14 Create a Resolution Plan

Record in the system the details of how the incident will be solved. Whenever reasonable, include pre-testing, post-testing and backup options.


2.15 Apply the Resolution Plan

Execute the steps in the Resolution Plan. Document any update needed during the implementation. If the user is going to apply the solution, check its effectiveness. When the solution requires a non-standard change, send a change request and monitor the change.

2.16 Check Restoration of Normal Service

Check that the solution succeeded as intended. If not, repeat the process from the diagnosis step.

2.17 Initiate a Problem Management Process for Recurring Incidents

Activate a request for Problem Management if the incident is recurring or is likely to recur.

2.18 Get User Satisfaction

Ask the user to fill in the optional survey. Document any other feedback from the process.

2.19 Close the Incident

Update and close the incident record along with any concatenated incident records.


3. Handling of Exceptions

[You may devote a section of the Standard Operating Procedures to detail how to handle deviations from the normal flow.]

3.1 Major Incidents

Incidents with the highest impact on the business are treated as major incidents. A special team is convened under the direct supervision of the Incident Manager to handle the incident faster than usual. Documenting findings and conclusions at the end of the process is mandatory. Once you escalate a major incident, stay in contact with the handling team and the user, providing any support you are asked for. See section 3.3 Hierarchical Escalation.

3.2 Functional Escalation

If you cannot solve the incident within the stipulated times for the first-line support, or if the investigation and solution require specialized knowledge, you should perform a functional escalation to the appropriate team at the second or third level:

a) First identify the appropriate expert or team to receive the escalation.

b) Escalate the incident.

c) Provide the information required by the expert or team.

d) Provide updates to the user.

e) If the incident is re-routed to another functional area, ensure that the incident is re-classified.

f) Continue normal flow from step 2.17.

3.3 Hierarchical Escalation

A hierarchical escalation is needed when the Incident should be treated as a major incident, or when the solution requires authorization from the appropriate level of decision. At the point where a hierarchical escalation is needed, insert the following steps:

a) Identify that the incident should be hierarchically escalated.

b) Escalate to the appropriate authority, usually to the Incident Manager and, in some cases, to the specific authority supervising the affected area.

c) Provide the information needed for the authority to make a decision.

d) Continue with the regular process. Major incidents are usually handled by


4. Annex

[Insert here anything you may like to attach to support the Standard Operating Procedure (SOP) document.]

4.1 Glossary

[This section of the Standard Operating Procedures provides the definitions of terms, acronyms, and abbreviations required to understand this document.]

Term | Definition
Change | The addition, modification or removal of anything that could have an effect on IT services.
Change Management | The process responsible for controlling the lifecycle of all changes.
Configuration Item (CI) | Any component or other service asset that needs to be managed in order to deliver an IT service.
Configuration Management System (CMS) | A set of tools, data and information that is used to support service asset and configuration management.
Diagnosis | A stage in the incident and problem lifecycles aimed at identifying a workaround for an incident or the root cause of a problem.
Escalation | An activity that obtains additional resources when these are needed to meet service level targets or customer expectations.
Event Management | The process responsible for managing events throughout their lifecycle.
First-line support | The first level in a hierarchy of support groups involved in the resolution of incidents.
Functional escalation | Transferring an incident, problem or change to a technical team with a higher level of expertise to assist in an escalation.
Hierarchic escalation | Informing or involving more senior levels of management to assist in an escalation.
Impact | A measure of the effect of an incident, problem or change on business processes.
Incident | An unplanned interruption to an IT service or reduction in the quality of an IT service.
Incident Management | The process responsible for managing the lifecycle of all incidents.
Incident record | A record containing the details of an incident.
Problem | A cause of one or more incidents.
Problem Management | The process responsible for managing the lifecycle of all problems.
Resolution | Action taken to repair the root cause of an incident or problem, or to implement a workaround.
Restore | Taking action to return an IT service to the users after repair and recovery from an incident.
Root cause | The underlying or original cause of an incident or problem.
Second-line support | The second level in a hierarchy of support groups involved in the resolution of incidents and investigation of problems.
Standard Change | A pre-authorized change that is low risk, relatively common and follows a procedure or work instruction.
Standard Operating Procedure (SOP) | Procedures used by IT operations management.
Third-line support | The third level in a hierarchy of support groups involved in the resolution of incidents and investigation of problems.
Urgency | A measure of how long it will be until an incident, problem or change has a significant impact on the business.
Workaround | Reducing or eliminating the impact of an incident or problem for which a full resolution is not yet available.

Table 1. Glossary.

4.2 List of tables

[This section of the Standard Operating Procedure includes a list of all of the tables in the document.]

Table 1. Glossary

4.3 Bibliography

(n.d.). Common incidents and troubleshooting.

Noname Software Company. (2012). Service Management Automated System's User Guide.
