TERMS AND ABBREVIATIONS
Designated employee – an Employee of HIRETT to whom HIRETT has assigned the obligation to examine Customers’ claims.
Compliance laws, rules and standards – laws and other legislative acts regulating the performance of HIRETT, standards set by self-regulating institutions related to the activity of HIRETT, professional codes of conduct and ethics, and other standards of good practice related to the activity of HIRETT.
Employee – a natural person who has an actual legal relationship with the Agent based on a labour contract or other legal arrangement, including members of the Board.
Internal normative documents – documents that are issued by HIRETT and that regulate the performance of HIRETT, its separate structural units or employees, for instance policies, procedures, regulations and instructions.
Customer – a person who uses one or more of the services provided by HIRETT.
Administration – the structural unit of HIRETT whose responsibilities include the record-keeping function.
Procedure – the Procedure of HIRETT for Examination of Customers’ Claims.
HIRETT – Hirett Ltd.
Claim – any type of document (application, complaint, claim, etc.) that is submitted by the Customer and that contains the claim (complaint, dispute).

1 INTRODUCTION

This document describes HIRETT’s Disaster Recovery Plan and the associated accountabilities.

Policy Purpose and Scope

To define accountabilities for the people, teams or processes that will be responsible in the event of a disaster.

Required Documentation

The conformance of actual procedures and practices to the documentation provided will be periodically checked by Mr. John Anthony, Director, Hirett Ltd and Mrs. Anne Victor, Director, Hirett Ltd. Any changes to these procedures must go through the same review process.

Emergency notification contacts

Name Address Home Phone Mobile/Cell Phone

Emergency response activities

Nr | Action | Who performs
1. | Identify and assess network outage | Lead network administrator
2. | Review with IT management | Lead network administrator, director of network operations
3. | Evacuate area if necessary | Building security
4. | Initiate remedial actions to recover network assets | Lead network administrator or designee
5. | Decision to invoke network DR plan | Director of network operations, CIO
6. | Initiate DR plan activities | Lead network administrator or designee
7. | Contact appropriate vendors and carriers | Lead network administrator or designee
8. | Follow through on recovery procedures | Network administration team
9. | Report to senior IT management | Lead network administrator or designee

2 DISASTER RECOVERY PLAN OBJECTIVES

The main goal of HIRETT's Disaster Recovery plan is to reduce the overall risk and damage to the company. An important part of planning for Disaster Recovery is setting recovery objectives. These objectives have a major impact on the cost and effort of the recovery, and help the company choose among recovery alternatives.

The Recovery Time Objective (RTO) specifies how soon you will be up and running following a disaster; essentially, it is the time within which HIRETT will need to recover. Applications and systems may have different RTOs depending on the data involved and the system's role in the IT systems landscape. For example, one RTO may specify how long before the major functions of the enterprise are back online, whilst a second (longer) RTO will determine how long until everything is fully recovered.

The Recovery Point Objective (RPO) determines how old the recovered data will be. This can be anywhere from a few seconds in the case of a sophisticated (and expensive) remote mirroring system to several hours, or even days, for less critical data. Like the RTO, the RPO is often assigned by function, with critical functions, such as transaction processing, having short RPOs and less immediate functions recovering to a point further back in time.

The Network Recovery Objective (NRO) is, effectively, how long before you appear recovered to your customers. More technically, it is the time needed to recover or fail over network operations. NRO includes such jobs as establishing alternate communications links, reconfiguring Internet servers, setting alternate TCP/IP addresses and everything else needed to get the network working again.
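
As a rough illustration of how the three objectives could be checked after a test exercise or a real incident, the sketch below compares measured times against target values. The class, function and parameter names are hypothetical, and the target values in the example are assumptions chosen for illustration only; none of this is existing HIRETT tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    """Target values; the field names are illustrative, not an existing schema."""
    rto: timedelta  # maximum acceptable time until systems are running again
    rpo: timedelta  # maximum acceptable age of the recovered data
    nro: timedelta  # maximum acceptable time until network operations are restored


def evaluate_recovery(objectives, outage_start, last_good_backup,
                      network_restored, service_restored):
    """Return a dict mapping each objective name to True if it was met."""
    return {
        "RTO": service_restored - outage_start <= objectives.rto,
        "RPO": outage_start - last_good_backup <= objectives.rpo,
        "NRO": network_restored - outage_start <= objectives.nro,
    }


# Example values are assumptions for illustration only.
targets = RecoveryObjectives(rto=timedelta(hours=24),
                             rpo=timedelta(minutes=10),
                             nro=timedelta(hours=4))
start = datetime(2024, 1, 1, 9, 0)
result = evaluate_recovery(
    targets,
    outage_start=start,
    last_good_backup=start - timedelta(minutes=5),
    network_restored=start + timedelta(hours=6),
    service_restored=start + timedelta(hours=20),
)
print(result)
```

In this example the service is back within the RTO and the data is recent enough for the RPO, but the network took longer than the assumed NRO, so only that objective is reported as missed.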

Recovery teams

  • Emergency Management Team (EMT)
  • Disaster Recovery Team (DRT)
  • IT Technical Support (IT) for Networking

See Appendix A for details on the roles and responsibilities of each team.

Team member responsibilities

  • Each team member will designate an alternate/backup.
  • All team members should keep an updated calling list of team members’ work, home and cell phone numbers both at home and at work.
  • All team members should keep this plan for reference at home in case a network disaster happens after normal work hours. All team members should familiarize themselves with the contents of this plan.

3 RESPONSIBILITIES

Incident management response is described by a RACI (Responsible, Accountable, Consulted, Informed) matrix, which sets out the main tasks and responsibilities in the incident resolution process. A RACI matrix is a table that describes the type and degree of involvement that stakeholders have in completing tasks.

The RACI Matrix displays deliverables or tasks along one axis:

  • Incident identification (any kind of malfunction) and addressing to 1st line support
  • Incident escalation to 2nd line support
  • Incident escalation to 3rd line support
  • Incident resolution (recovery)

It displays project roles or stakeholders along the other axis:

  • Hirett Ltd employees
  • Hirett Ltd directors
  • 1st line support
  • 2nd line support
  • 3rd line support

At each intersecting cell the type or degree of involvement is documented (Responsible, Accountable, Consulted, and Informed). Hirett Ltd uses the following definitions for each level of participation in a task or in the creation of an incident ticket or a document.

Accountable – This is the person who is ultimately responsible for ensuring that the deliverable or task has been completed and is thorough and correct. The accountable person directs the work of the responsible person, and there should only be one truly accountable person. This avoids any misunderstandings when something doesn’t get done or is done incorrectly.

Responsible – The responsible person(s) is the one performing the work. It can be one person or a team of people. The responsibility of the person(s) is to obtain the required information and use it to complete the task and/or create the deliverable. They may be reporting to a lead or manager who is accountable for the task or deliverable. However, for smaller tasks or deliverables, when there is only one responsible person listed, they may also be listed as the accountable party.

Consulted – The consulted person(s) is a subject matter expert. The opinions and/or knowledge of the consulted person(s) on a particular system or process are sought. They don’t usually participate in completing a task or deliverable, other than by providing the relevant information that the responsible person needs to achieve their task or deliverable.

Informed – These are the people who need to be kept up to date on a task or deliverable. They may need to track the progress being made, but usually their main interest is the completion of the task or deliverable. Typically, they are either reviewers of the completed document who provide formal sign-off and approval, or they depend on the results of the task or deliverable.

4 INCIDENT MANAGEMENT RACI MATRIX

Incident Identification; Description; allocation to 1st line support Incident Resolution 1st line support Incident Resolution 2nd line support Incident Resolution 3rd line support
Hirett Ltd employees R I I I
Hirett Ltd Directors I, C I, C
1st line support A, R I, C I, C
2nd line support A, R I, C
3rd line support A, R

5 DISASTER RECOVERY & BUSINESS CONTINUITY MATRIX

Disaster: Power loss
Critical resources impacted: Electricity, internet, telephones
Mitigation:
  • Contact electricity company.
  • If delay > 4 hours, use mobile internet.
  • If delay > 24 hours, arrange generator.
Time to reinstate: Depends on supplier (normally less than a few hours)
Risk: MEDIUM

Disaster: Internet fails
Critical resources impacted: Internet
Mitigation:
  • A backup mobile broadband service is maintained and can be switched to instantly through a portable Wi-Fi hotspot if required.
Time to reinstate: Depends on supplier used, typically less than a few hours
Risk: MEDIUM

Disaster: Telephone lines down
Critical resources impacted: Telephones, internet
Mitigation:
  • Contact telecom company.
  • If delay > 4 hours, redirect phone lines via VoIP supplier.
Time to reinstate: Depends on supplier (normally < 1 day)
Risk: MEDIUM

Disaster: Flooding
Critical resources impacted: Business premises
Mitigation:
  • Relocate to a temporary office.
  • Business insurance.
Time to reinstate: 24 hours
Risk: LOW

Disaster: Fire
Critical resources impacted: Electricity, internet, office space, onsite PCs
Mitigation:
  • Relocate to a temporary office.
  • Staff to work from home for the first 3-4 days until the new office is prepared.
  • Calls rerouted as per the routine for telephone lines down.
  • Phone company to reroute numbers.
  • Fire safety standards maintained and staff made aware.
Time to reinstate: 24 hours
Risk: MEDIUM

Disaster: Server failure
Critical resources impacted: Data servers
Mitigation:
  • Data is backed up offsite.
  • A copy is held onsite as well, showing records up to the previous day.
  • Two servers run in parallel, so operations can revert to the other server within 6 hours in the event of catastrophic hardware failure. As such, all payment services (except same-day “express” orders) would be completed in the stated timescale. The back office function is web-based and can therefore operate from anywhere in the world once the data servers are running.
Time to reinstate: Normally < 6 hours
Risk: MEDIUM

Disaster: Burglary
Critical resources impacted: PCs, telephones
Mitigation:
  • The office is in the One Canada Square building, which is highly secured: CCTV throughout, all mail is X-rayed and visits are by appointment only.
  • Telephone loss is handled as per the contingency plan for telephone lines down.
Time to reinstate: 6 hours
Risk: LOW
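
The time-based escalation rules in the matrix (for example, switching to mobile internet after more than 4 hours without power) can be expressed as a small lookup. The following is a minimal sketch: the thresholds and actions are taken from the matrix, while the `ESCALATION` structure and the `actions_due` function are hypothetical names introduced here for illustration.

```python
# Escalation steps per disaster: (hours of delay that must be exceeded, action).
# A threshold of 0 means the action is taken immediately.
ESCALATION = {
    "power loss": [
        (0, "Contact electricity company"),
        (4, "Use mobile internet"),
        (24, "Arrange generator"),
    ],
    "telephone lines down": [
        (0, "Contact telecom company"),
        (4, "Redirect phone lines via VoIP supplier"),
    ],
}


def actions_due(disaster, hours_elapsed):
    """Return every action that should have been started by now, in order."""
    steps = ESCALATION[disaster]
    return [action for threshold, action in steps
            if threshold == 0 or hours_elapsed > threshold]


print(actions_due("power loss", 5))
```

Keeping the thresholds in data rather than in nested conditions makes it easier to keep such a helper in step with the matrix when the plan is reviewed.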

6 EMERGENCY RESPONSE CHECKLIST

The Director of the Company is responsible for putting contingency plans into effect. Not all of the following will be relevant in every situation. These points must be checked before any action is taken.

  • Check all staff available and made aware of problem.
  • Check electricity is working – if not contact supplier and take action as per mitigation above.
  • Check phone lines working – if not contact Telecom Company and take action as per mitigation above.
  • Check internet is working – if not contact supplier and take action as per mitigation above.
  • Check data servers accessible – take action as per mitigation above.
  • Staff to be instructed to work from home where necessary (all have been made aware and are able to do so where required).
  • Ensure internet (or temporary measure) reinstated within 4 hours – using alternative locations
  • Contact lettings companies regarding temporary office rental.

7 INCIDENTS AND PROBLEM MANAGEMENT

This framework should include the service quality definitions and measurements for major vendors:

  • Support hours – HIRETT requires complete service uptime on a 24/7/365 basis. All incidents are to be investigated and steps put in place to resolve them in line with the RTO.

Urgency determination guidelines

Urgency describes the criticality of service function unavailability for users of the service:

Urgency | Service functions
High | At least one of the following service functions is impaired: all online functions; all front office functions
Medium | At least one of the following service functions is impaired: all back office functions
Low | Any other functions

Priority attachment guidelines

Priority is the relationship between the impact and the urgency, which determines the sequence of incident solving:

Urgency \ Impact | For the entire company or online clients | For a user group | For a user
High | Very high | Medium | Low
Medium | Very high | Medium | Low
Low | High | Low | Low

Priority

Priority indicates parameters applied when resolving incidents or performing service support activities:

Priority | Response time | Resolution time
Very high | 30 minutes | 4 hours
High | 30 minutes | 8 hours
Medium | 30 minutes | 12 hours
Low | 30 minutes | 48 hours

Response time – a period of time it takes the service provider to start resolving an incident or performing service support.

Resolution time – a period of time it takes the service provider to resolve an incident or perform service support.
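
The two lookups above (urgency and impact to priority, then priority to response and resolution times) can be combined in a few lines. This is a minimal sketch: the values are copied from the tables in this section, while the dictionary names, the short impact keys (`company`, `group`, `user`) and the function `incident_targets` are hypothetical and used only for illustration.

```python
from datetime import timedelta

# Priority by (urgency, impact), copied from the priority matrix in this section.
PRIORITY = {
    ("high", "company"): "very high",
    ("high", "group"): "medium",
    ("high", "user"): "low",
    ("medium", "company"): "very high",
    ("medium", "group"): "medium",
    ("medium", "user"): "low",
    ("low", "company"): "high",
    ("low", "group"): "low",
    ("low", "user"): "low",
}

# (response time, resolution time) per priority, copied from the priority table.
TARGETS = {
    "very high": (timedelta(minutes=30), timedelta(hours=4)),
    "high": (timedelta(minutes=30), timedelta(hours=8)),
    "medium": (timedelta(minutes=30), timedelta(hours=12)),
    "low": (timedelta(minutes=30), timedelta(hours=48)),
}


def incident_targets(urgency, impact):
    """Return (priority, response time, resolution time) for an incident."""
    priority = PRIORITY[(urgency.lower(), impact.lower())]
    response, resolution = TARGETS[priority]
    return priority, response, resolution


priority, response, resolution = incident_targets("High", "company")
print(priority, response, resolution)
```

An unknown urgency or impact value raises a `KeyError`, which is deliberate in this sketch: an incident that cannot be classified should be flagged rather than silently given a default priority.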

Service quality

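Service activity | Description | Priority | Quality service definition
Average time to respond to incidents | Average time taken for IT to start to resolve an incident | Very high | 30 minutes
Average time to respond to incidents | Average time taken for IT to start to resolve an incident | High | 30 minutes
Average time to respond to incidents | Average time taken for IT to start to resolve an incident | Medium | 30 minutes
Average time to respond to incidents | Average time taken for IT to start to resolve an incident | Low | 30 minutes
Average incident resolution time | Average time taken for IT to resolve an incident | Very high | 4 hours
Average incident resolution time | Average time taken for IT to resolve an incident | High | 8 hours
Average incident resolution time | Average time taken for IT to resolve an incident | Medium | 12 hours
Average incident resolution time | Average time taken for IT to resolve an incident | Low | 48 hours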

8 BACKUP POLICY

Full and incremental backups protect and preserve corporate network information and should be performed on a regular basis for system logs and technical documents that are not easily replaced, have a high replacement cost or are considered critical.  Backup media should be stored in a secure and geographically separate location from the original and isolated from environmental hazards.  Backup network components, cabling and connectors, power supplies, spare parts and relevant documentation should be stored in a secure area on-site as well as at other corporate locations.

Network-specific data and document retention policies specify what records must be retained and for how long.  All network organizations are accountable for carrying out instructions for records management in their organization.

IT Technical Support follows these standards for data backup and archiving, particularly for networks:

Hard drive retention policy

Backup media is stored at locations that are secure, isolated from environmental hazards, and geographically separate from the location housing network components.

System databases

A copy of the most current network and system databases must be made at least twice per month or based on frequency of changes made.

These backups must be stored offsite.

The lead network administrator is responsible for this activity.

Offsite storage procedures

Hard drives, disks and other suitable media are stored in environmentally secure facilities.

Hard drive or disk rotation occurs on a regular schedule coordinated with the storage vendor.

Access to backup databases and other data is tested annually.

9 EMERGENCY MANAGEMENT PROCEDURES

The following procedures are to be followed by network administration and operations personnel and other designated HIRETT employees in the event of a network disruption or related outage.  Where uncertainty exists, the more reactive action should be followed to provide maximum protection and personnel safety.

These procedures are furnished to HIRETT management personnel to take home for reference. Several pages have been included to supply emergency contacts.

In the event of any situation where access to a building housing network infrastructure equipment is denied, personnel should report to alternate locations or contact security for access if the location is not damaged or quarantined.  Primary and secondary locations are listed below.

Alternate Locations    

Workplace:

Attempt to contact your immediate supervisor or management via telephone. Home and cell phone numbers are included in this document.

9.1 IN THE EVENT OF A NATURAL DISASTER

In the event of a major catastrophe affecting HIRETT network operations, immediately notify the < Name or Title of Person>.

STEP ACTION
1 Notify EMT and DRT of impending event as time permits.
2 If impending natural disaster can be tracked, begin launching network within 48 hours as follows:
  • Deploy portable generators with fuel on standby.
  • Deploy network technical and admin personnel on standby.
  • Facilities department on standby for replacement shelters.
  • Basic necessities are acquired by support personnel when deployed:
    • Cash for one week
    • Food and water for one week
    • Gasoline and other fuels
    • Supplies, including chainsaws, batteries, rope, flashlights, medical supplies, etc.
3 24 hours prior to event:
  • Create an image of network and system databases and other relevant files.
  • Back up critical network and system elements.
  • Verify backup generator fuel status and operation.
  • Create backups of PBXs, routers, VoIP systems, e-mail, switches, file servers, etc.
  • Fuel vehicles and emergency trailers.
  • Notify senior management.

9.2 IN THE EVENT OF A FIRE

If fire or smoke is present in the facility where network infrastructure assets are located, evaluate the situation and determine the severity, categorize the fire as a major or minor incident and take the appropriate action as defined in this section. Call ________ or contact your local first responders as soon as possible if the situation warrants it.

Personnel are to attempt to extinguish minor fires (e.g., single hardware component or paper fires) using hand-held fire extinguishers located throughout the facility. Any other fire or smoke situation will be handled by qualified building personnel until the local fire department arrives.

In the event of a major fire, call 999 and immediately evacuate the area.

In any emergency situation, system and network security, site security and personal safety are the major concerns. If possible, the lead network administrator and/or designee should remain present at the facility until the fire department has arrived.

In the event of a major catastrophe affecting the facility, immediately notify senior management.

STEP ACTION
1 Dial ________________ to contact the fire department.
2 Immediately notify all other personnel in the facility of the situation and evacuate the area.
3 Alert emergency personnel on:
  • Provide them with your name, the extension where you can be reached, the building and room number, and the nature of the emergency. Follow all instructions given.
4 Alert the EMT and DRT.
  • Note: During non-staffed hours, security personnel will notify the Senior Executive responsible for the location directly.
5 Notify Building Security.
  • Local security personnel will establish security at the location and not allow access to the site unless notified by the Senior Executive or his/her designated representative.
6 Contact appropriate vendors to aid in the decision regarding the recovery and resumption of network services and protection of equipment as time and events permit.
7 All personnel evacuating the facilities will meet at their assigned outside location (assembly point) and follow instructions given by the designated authority. Under no circumstances may any personnel leave without the consent of a supervisor.

9.3 IN THE EVENT OF A NETWORK SERVICES PROVIDER OUTAGE

In the event of a network service provider outage, the guidelines and procedures in this section are to be followed.

STEP ACTION
1 Notify senior management of outage.
  • Determine cause of outage and timeframe for its recovery.
2 If outage will be greater than one hour, route all calls via alternate services.
  • If it is a major outage and all carriers are down and downtime will be greater than 12 hours, deploy satellite phones, if available.

9.4 IN THE EVENT OF A FLOOD OR WATER DAMAGE

In the event of a flood or broken water pipe near any network infrastructure location, the guidelines and procedures in this section are to be followed.

STEP ACTION
1 Assess the situation and determine if outside assistance is needed; if this is the case, dial _________ immediately.
2 Immediately notify all other personnel of the situation and ask them to be prepared to cease voice and data operations.
3 Notify all other personnel in the facility of the situation and ask them to be prepared to cease operations accordingly.
4 Water detected below the raised floor may have different causes:
  • If water is slowly dripping from an air-conditioning unit and not endangering equipment, contact repair personnel immediately.
  • If water is of a major quantity and flooding beneath the floor (water main break), immediately implement power-down procedures. While power-down procedures are in progress, evacuate the area and follow management’s instructions.

10 RECOVERY PROCEDURES

Work place and Infrastructure:

Desktop Computer Failure

Immediate replacement with a previously configured offline spare will be carried out by the following trained individuals.

Central Office – Mr. John Anthony, Director, Hirett Ltd and Mrs. Anne Victor, Director, Hirett Ltd.

Network Failure

In the event of an office network failure or numerous computer failures, Mr. John Anthony, Director, Hirett Ltd and Mrs. Anne Victor, Director, Hirett Ltd should be informed so that recovery procedures can be initiated:

Director –
Tel.: (+44)
Email:

Director –

Tel.: (+44)
Email:

Business and Office Applications Recovery Procedures

All principal systems and applications used within HIRETT are supported by the following vendors:

__________________- Software (Vendor – ______________):

__________________ is HIRETT main ______________ Software System

1st line support will be performed by specially trained Ambassadors:

User support engineer –

email:

2nd line support will be performed by ______________ System Administrator:

System Administrator –
Tel.:
E-Mail:

3rd line support will be performed by the ____________ software Vendor – _______________:

System Administrator – ________________
Tel.: _________________
E-Mail: __________________

Product Supervisor – ______________
Tel.: _______________
E-Mail:

____________ System Infrastructure is serviced by ___________ Data Centre (located in ________-, ____________-).

1st line support will be performed by:

System Administrator –
Tel.: _____________
E-Mail: _____________

2nd line support will be performed by:

Technical Support – _______________
Tel.: _________________
E-Mail: __________________

Account Manager – _____________________
Tel.: ________________
E-Mail: _______________-

In accordance with the SLA, the ______________'s infrastructure recovery objectives in case of complete data center destruction are defined as:

RTO – full system functions will be recovered in an alternative data center within 24 hours.

RPO – maximum data loss is limited to 10 minutes.

11 PLAN REVIEW AND MAINTENANCE

This network disaster recovery plan must be reviewed semi-annually and exercised on at least an annual basis. The test may be in the form of a walk-through, mock disaster, or component testing.  Additionally, considering the dynamic environment within HIRETT, it is important to review the listing of personnel and phone numbers contained within the network DR plan regularly.

The hard-copy version of the network DR plan will be stored in a common location where it can be viewed by site personnel and the EMT and DRT.  Electronic versions will be available via HIRETT network resources as provided by IT Technical Support.  Each recovery team will have its own directory with change management limited to the recovery plan coordinator.

12 DISASTER RECOVERY PLAN

The Disaster Recovery Plan is designed to keep two key metrics to a minimum in case of an accident:

  • Recovery time objective (RTO) – the maximum admissible duration of service unavailability;
  • Recovery point objective (RPO) – the maximum admissible amount of data loss, measured in time, during a serious accident.

Usually these values are specified and taken into account in the SLA contract. The main events that can occur during an accident:

  1. Inaccessibility of service:
    1. server crash
    2. data centre failure
    3. lack of server processing power as the load on the service increases
  2. Loss or corruption of data:
    1. Loss of, or failure in, the relational database
    2. Loss of event log data (logs) required for analysing the occurrence of an accident, debugging programs, investigating cases of penetration into the system, etc.

Basic means of control for accident prevention:

Using the tools provided by the cloud service on which the technical service provider's infrastructure is based:

  1. Centralized monitoring of system resources (periodic metrics monitoring)
  2. Centralized view of event logs
  3. Notifications about conditions that can potentially disrupt the normal operation of the service (for example, a high load on the processor, lack of disk space, lack of memory, etc.).
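
The notification rule in item 3 amounts to comparing sampled metrics against thresholds. The sketch below is a local stand-in for that idea, not the cloud provider's monitoring API: the threshold values, metric names and function names are all assumptions made for illustration, since the plan does not state concrete limits.

```python
import shutil

# Hypothetical thresholds (percent); the plan does not define concrete values.
THRESHOLDS = {"cpu_percent": 90.0, "memory_percent": 90.0, "disk_percent": 85.0}


def breached(metrics, thresholds=THRESHOLDS):
    """Return the sorted names of metrics that meet or exceed their threshold."""
    return sorted(name for name, value in metrics.items()
                  if name in thresholds and value >= thresholds[name])


def disk_percent_used(path="/"):
    """Percentage of disk space in use on the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


# Sample readings are invented for the example.
sample = {"cpu_percent": 95.0, "memory_percent": 40.0, "disk_percent": 85.0}
print(breached(sample))
```

In practice the readings would come from the monitoring service rather than from hard-coded values, and each breached metric would trigger a notification to 1st line support.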

The disaster recovery plan takes a multi-tier approach to data recovery, i.e. there are several recovery paths, depending on the level and extent of damage:

  1. To ensure uninterrupted operation and instant recovery in the event of an accident, we use a copy of the infrastructure located in a data center in another region of the cloud service on which we build our infrastructure. All traffic from outside is controlled by the HTTPS load balancing service inside our cloud. In turn, the HTTPS load balancer redirects traffic to available backend servers by performing a health check. Accordingly, if the service and backend servers are not available in the main region, traffic is redirected to a working copy in another region. This scenario implements the Hot Standby Server Failover approach.
  2. To restore system disks and the data on them after an accident or the crash of an individual server, regular snapshots of server drives with system and user data can be created. The interval for creating snapshots is set according to the RTO metrics and the value of each server. If a disk contains frequently changing data, for example data from a relational database, then this approach requires a freeze of the system disk on which the relational database data resides. To recover from a snapshot, the following steps are required:
    • A system disk is created from the corresponding snapshot.
    • The virtual server with the broken disk, or the group of servers, is stopped or deleted (depending on the situation and the type of disks; for example, if the main system disk from which the operating system boots is broken, the virtual server is deleted and recreated based on the newly restored disk).
    • The tools provided by the cloud service (___________) make it possible to replace disks holding broken data with disks newly created from snapshots, or to re-create a virtual server based on the newly restored disk, in the shortest possible time.
    • After a successful replacement or re-creation, the virtual server or group of servers is started.
  3. To restore the data of the relational database after a breakdown, it is possible to roll back in time (database point-in-time recovery). We use the tools of the database system itself to create incremental backups. This approach makes it possible to create backups quickly at the smallest time interval and also to roll back in time quickly:
    1. Backups are stored on the persistent disk attached to the database server. For this purpose, a task (cronjob) is configured that creates full and incremental backups on the server at a certain interval, which is set in accordance with the RPO metrics.
    2. The server also has a configured task that archives, encrypts and copies backups to remote Storage in a cloud server, which can be used for restoring. Backups are kept on this Storage for much longer, so it is possible to recover data that no longer exists on the persistent disk of the database server.
    3. The following steps are used to restore the backup:
      1. The backup (full + incremental) is downloaded from the remote Storage service to the persistent disk where the relational database itself resides.
      2. The backup is prepared using special tools (unzipped, decrypted, and transaction logs are rolled into a specially prepared directory with data from the relational database).
      3. The database system is switched to the directory with the recovered data.
  4. Along with the data from the database, a task is created that copies system event logs and application logs to the remote Storage service of our cloud hosting provider (_______________).
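
For the database tier, restoring to a point in time means choosing the right set of backup files before they are downloaded and prepared. The sketch below shows one common way to make that choice, assuming each incremental backup builds on the backups taken before it; the `Backup` class and `restore_chain` function are hypothetical names, and the sketch covers only the selection of files, not the decryption or transaction log replay performed by the database tools.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Backup:
    kind: str            # "full" or "incremental"
    taken_at: datetime   # when the backup was created


def restore_chain(backups, target):
    """Return the backups to apply, in order, to restore the state as of `target`.

    Picks the most recent full backup taken at or before `target`, followed by
    every incremental backup taken after it and at or before `target`.
    """
    candidates = sorted((b for b in backups if b.taken_at <= target),
                        key=lambda b: b.taken_at)
    fulls = [b for b in candidates if b.kind == "full"]
    if not fulls:
        raise ValueError("no full backup available at or before the target time")
    base = fulls[-1]
    increments = [b for b in candidates
                  if b.kind == "incremental" and b.taken_at > base.taken_at]
    return [base] + increments


# Example schedule (invented): a full backup every hour, incrementals every 10 minutes.
def at(hour, minute):
    return datetime(2024, 1, 1, hour, minute)


schedule = [
    Backup("full", at(0, 0)),
    Backup("incremental", at(0, 10)),
    Backup("incremental", at(0, 20)),
    Backup("full", at(1, 0)),
    Backup("incremental", at(1, 10)),
    Backup("incremental", at(1, 20)),
]
chain = restore_chain(schedule, at(1, 15))
print([(b.kind, b.taken_at.strftime("%H:%M")) for b in chain])
```

With this schedule, a restore to 01:15 uses the 01:00 full backup and the 01:10 incremental; the data written between 01:10 and 01:15 is the loss that the backup interval, and therefore the RPO, has to tolerate.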

13 ALERTS/VERIFICATION/DECLARATION PHASE

13.1 PLAN CHECKLIST

Network response and recovery checklists and plan flow diagrams are presented in the following two sections.  The checklists and flow diagrams may be used by Technical Support members as “quick references” when implementing the network DR plan or for training purposes.

Insert checklists and other relevant procedure documents here.


Plan checklists

Initials Task to be completed

13.2 NETWORK DIAGRAMS

Insert network diagrams and other relevant procedure documents here.

13.3 RECOVERY FLOW DIAGRAMS

Insert network recovery flow diagrams and other relevant procedure documents here.

13.4 NOTIFICATION OF INCIDENT AFFECTING THE SITE

On-duty personnel responsibilities

If in-hours:

Upon observation or notification of a potentially serious network disruption at a company location, ensure that personnel on site have enacted standard emergency and evacuation procedures if appropriate and notify the EMT and DRT.

If out of hours:

IT Technical Support personnel should contact the EMT and DRT.

13.5 PROVIDE STATUS TO EMT AND DRT

Contact EMT and/or DRT and provide the following information when any of the following conditions exist: (See Appendix B for contact list)

  • Network performance has sufficiently degraded to where normal operations are not possible for three or more hours.
  • Any problem at any network infrastructure asset, system or location that would cause the above condition to be present or there is certain indication that the above condition is about to occur.

The EMT will provide the following information:

  • Location of incident.
  • Type of incident (e.g., fire, hurricane, flood).
  • Summary of the damage (e.g., minimal, heavy, total destruction).
  • Meeting location that is a safe distance from the disaster scene.
  • An estimated timeframe of when a damage assessment group can enter the facility (if possible).

The EMT and/or DRT will then contact the respective HIRETT team leader and report that a disaster affecting network operations has occurred.

13.6 DECIDE COURSE OF ACTION

Based on the information obtained, the EMT and/or DRT decide how to respond to the event: mobilize IT Technical Support, repair/rebuild existing network operations with network technical and admin staff, or relocate to a new facility.

Inform team members of decision

If a disaster is not declared, the location response team will continue to address and manage the situation through its resolution and provide periodic status updates to the EMT/DRT.

If a disaster is declared, the EMT and/or DRT will notify IT Technical Support immediately for deployment of network DR plans.

Declare a disaster if the situation is not likely to be resolved within predefined time frames.  The person authorized to declare a network disaster must have at least one (1) backup who is also authorized to declare a disaster in case the primary person is unavailable.
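
The backup-authority rule above can be sketched as follows. The function name and the role labels used in the example are hypothetical and are not taken from the contact lists in this plan.

```python
def pick_declarer(primary, backups, available):
    """Return the first authorized person who is available, primary first.

    Raises ValueError if no backup is configured, because the plan requires
    at least one backup who is authorized to declare a disaster.
    Returns None if nobody on the authorized list can be reached.
    """
    if not backups:
        raise ValueError("at least one backup declarer is required")
    for person in [primary, *backups]:
        if person in available:
            return person
    return None
```

Calling pick_declarer("primary", ["backup_1"], {"backup_1"}) selects the backup, because the primary person is not in the set of available people.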

Contact networking and equipment vendors (see Appendix I)

Disaster declared: mobilize incident response/technical support teams/report to command center

Once a network disaster is declared, the Disaster Recovery Team (DRT) is mobilized. This team will initiate and coordinate the appropriate recovery actions.  Network technical and administrative employees should assemble at a designated location as soon as possible.  See Appendix E for emergency locations.

Conduct detailed damage assessment (this should be performed prior to declaring a disaster).

  1. Under the direction of local authorities, IT Technical Support and/or EMT/DRT, assess the damage to the network and related assets. Include vendors/providers of installed network services and equipment so that their expert opinion on the condition of the network is obtained as soon as possible.

Participate in a briefing on assessment requirements, reviewing:

  • Assessment procedures
  • Requirements gathering
  • Safety and security issues

NOTE:  Access to the facility following a fire or potential chemical contamination will likely be denied for 24 hours or longer.

13.7 DOCUMENT ASSESSMENT RESULTS

Building access permitting:

Conduct an on-site inspection of affected areas to assess damage to essential network records (files, manuals, contracts, documentation, etc.) and electronic data.

Obtain information regarding damage to the network (e.g., environmental conditions, physical structure integrity, furniture and fixtures) from the DRT.

Develop a Restoration Priority List, identifying facilities, vital records and equipment needed for resumption of network operations that could be restored and retrieved quickly.

Recommendations for required resources:

Contact DRT: Decide whether to continue to business recovery phase

The EMT and DRT gather information regarding the event, contact senior management and provide them with detailed information on status.

Based on the information obtained, senior management decides whether to continue to the business recovery phase of this network DR plan.  If the situation does not warrant this action, continue to address the situation at the affected site(s).

This section documents the steps necessary to activate network recovery plans to support full restoration of systems and network functionality at either 1) the original company location or 2) an alternate/recovery site that would be used for an extended period of time.  Coordinate resources to re-establish network operations at the primary site or reconstruct them at a temporary/permanent system location, and deactivate recovery teams upon return to normal network operations in either scenario.

HIRETT System and facility operation requirements:

The system and facility configurations for each location are important to re-establish normal network operations.  A list for each location will be included in Appendix F.

Notify IT technical support staff and coordinate return to primary facility/location

See Appendix A for IT Technical Support staff associated with recovery of network operations at the original site.

Secure funding for return to work

Make arrangements in advance with network service carriers and equipment vendors to recover network operations at the primary site.

Notify IT technical support staff/coordinate relocation to new facility/location

See Appendix A for IT Technical Support staff associated with configuring network services at an alternate location (replacement for original site).

Secure funding for relocation

Make arrangements in advance with network service carriers and equipment vendors.  Make arrangements in advance with local banks, credit card companies, hotels, office suppliers, food suppliers and others for emergency support.

Notify EMT and corporate business units of network recovery

Using the call list in Appendix B, notify the appropriate company personnel.  Inform them of any changes to processes or procedures, contact information, hours of operation, etc. (may be used for media information).

Operations recovered

Assuming all relevant network operations have been recovered either to the original location or to an alternate site with employees in place to support network operations, the company can declare that its network is functioning normally.

APPENDIXES

APPENDIX A: HIRETT RECOVERY

Emergency Management Team

Note:  See Appendix B for contact list.  Suggested members include senior management, Human Resources, Corporate Public Relations, Legal Department, IT Technical Support, Risk Management and Operations.

Charter:

Responsible for overall coordination of the network disaster recovery effort, evaluating the situation and determining whether to declare a disaster, and communicating with senior management.

Support activities:

The Emergency Management Team

  • Evaluates which recovery actions should be invoked and coordinates with the corresponding network recovery teams.
  • Analyzes network damage assessment findings.
  • Sets restoration priority based on damage assessment reports in collaboration with IT Technical Support.
  • Provides senior management with ongoing status information.
  • Acts as a communication channel to corporate teams and major customers.
  • Works with vendors, carriers and IT Technical Support to develop a rebuild/repair schedule.

Disaster Recovery Team (DRT)

Note:  See Appendix B for contact list.

Charter:

Responsible for overall coordination of the network disaster recovery effort, establishment of the emergency command area (if needed) and communications with senior management, the Emergency Management Team, and IT Technical Support teams.

Support activities:

  • Coordinate with EMT, senior management and IT Technical Support
  • Assist with determination of network recovery needs with IT Technical Support.
  • Establish command center and assembly areas.
  • Notify all company department heads and advise them to activate their plan(s) if applicable, based upon the disaster situation.
  • If no network disaster is declared, take appropriate action to return to normal network operations using regular network operations staff.
  • Determine if carriers, vendors and other teams are needed to assist with detailed damage assessment.
  • Prepare post-disaster debriefing report.
  • Coordinate the development of revised network recovery plans and ensure they are updated semi-annually.

IT Technical Support

Charter:

IT Technical Support will facilitate network recovery and restoration activities.

Support activities:

Upon notification of disaster declaration, review and provide support as follows:

  • Facilitate network recovery and restoration activities, providing guidance on replacement equipment, systems and network services, as required.
  • Coordinate testing of network operations to ensure the network is functioning normally.

APPENDIX B: RECOVERY TEAM CONTACT LISTS

Emergency Management Team

Name | Address | Home | Mobile

Disaster Recovery Team

Name | Address | Home | Mobile

IT Technical Support

Name | Address | Home | Mobile

APPENDIX C: EMERGENCY NUMBERS

First responders, network carriers, public utility companies and others

Name | Contact Name | Phone

APPENDIX D: CONTACT LIST

Name | Address | Home | Mobile

APPENDIX E: EMERGENCY COMMAND CENTER LOCATIONS

Emergency command center – <Location Name>

Primary:

Address

Room

City

Contact: coordinator of rooms/space – (xxx) xxx-xxxx

Alternate:

Address

Room

City

Contact: coordinator of rooms/space – (xxx) xxx-xxxx

Emergency command center – <Location Name>

Primary:

Address

Room

City, State

Contact: coordinator of rooms/space – (xxx) xxx-xxxx

Alternate:

Address

Room XXX

City, State

Contact: coordinator of rooms/space – (xxx) xxx-xxxx

APPENDIX F: FORMS

Incident/disaster form

Upon notification of a network disruption, the on-duty personnel in Network Operations will make the initial entries into this form.  It will then be forwarded to the Emergency Command Center (ECC) and will be continually updated. This document will be the running log until the network incident/disaster has ended and “normal business” has resumed.
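
Where the form is kept electronically, it could be modelled roughly as shown below. This is a sketch under stated assumptions: the field names mirror the headings of this form, while the class name, the methods and the closing note are hypothetical and not part of the plan.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class IncidentForm:
    # Initial entries made by on-duty personnel.
    time_and_date: datetime
    type_of_event: str
    location: str
    building_access_issues: str = ""
    projected_impact_to_operations: str = ""
    # Running log of (timestamp, note) pairs, updated until the event ends.
    running_log: list = field(default_factory=list)
    closed: bool = False

    def log(self, when, note):
        """Append an entry to the running log while the incident is open."""
        if self.closed:
            raise RuntimeError("form is closed; normal business has resumed")
        self.running_log.append((when, note))

    def close(self, when):
        """Record the final entry and stop further updates."""
        self.log(when, "Normal business resumed")
        self.closed = True
```

Keeping the running log as timestamped entries preserves the order of events for the post-disaster debriefing report.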

TIME AND DATE

TYPE OF EVENT

LOCATION

BUILDING ACCESS ISSUES

PROJECTED IMPACT TO OPERATIONS

RUNNING LOG (ongoing events)