TERMS AND ABBREVIATIONS |
Designated employee – Employee of the HIRETT to whom the HIRETT has determined the obligation to examine Customers’ claims. |
Compliance laws, rules and standards – laws and other legislative acts regulating the performance of the HIRETT, standards set by self-regulating institutions, related to the activity of the HIRETT, professional codes of conduct and ethics and other standards of good practice related to the activity of the HIRETT. |
Employee – a physical person who has actual legal relationship with the Agent based on a labour contract or other legal arrangement, including members of the Board. |
Internal normative documents – documents that are issued by the HIRETT and which regulate the performance of the HIRETT, separate structural units or employees, for instance, policies, procedures, regulations, instructions. |
Customer – person, who is utilizing one or more of the services provided by the HIRETT. |
Administration – the structure of the HIRETT whose responsibilities include the record-keeping function. |
Procedure – the Procedure of the HIRETT for Examination of Customer’s Claims. |
HIRETT – Hirett Ltd. |
Claim – any type of document (application, complaint, claim, etc.) that is submitted by the Customer and that contains the claim (complaint, dispute). |
1 INTRODUCTION
This document describes HIRETT’s Disaster Recovery Plan and accountabilities.
Policy Purpose and Scope
To define accountabilities for the people, teams or processes that will be responsible in the event of a disaster.
Required Documentation
The conformance of actual procedures and practices to the documentation provided will be periodically checked by Mr. John Anthony, Director, Hirett Ltd and Mrs. Anne Victor, Director, Hirett Ltd. Any changes to these procedures must go through the same process.
Emergency notification contacts
Name | Address | Home Phone | Mobile/Cell Phone |
Emergency response activities
Nr | Action | Who Performs |
1. | Identify and assess network outage | Lead network administrator |
2. | Review with IT management | Lead network admin, director of net. ops |
3. | Evacuate area if necessary | Building security |
4. | Initiate remedial actions to recover network assets | Lead network administrator or designee |
5. | Decision to invoke network DR plan | Director of network operations, CIO |
6. | Initiate DR plan activities | Lead network administrator or designee |
7. | Contact appropriate vendors and carriers | Lead network administrator or designee |
8. | Follow through on recovery procedures | Network administration team |
9. | Report to senior IT management | Lead network administrator or designee |
2 DISASTER RECOVERY PLAN OBJECTIVES
The main goal of HIRETT's Disaster Recovery plan is to reduce the overall risk and damage to the company. An important part of planning for Disaster Recovery is setting recovery objectives. These objectives have a major impact on the cost and effort of the recovery and help the company choose among recovery alternatives.
The Recovery Time Objective (RTO) specifies how soon you will be up and running following a disaster; essentially, it is the time in which HIRETT will need to recover. Applications and systems may have different RTOs depending on the data involved and the system's role in the IT systems landscape. For example, one RTO may specify how long before the major functions of the enterprise are back online, whilst a second (longer) RTO will determine how long until everything is fully recovered.
The Recovery Point Objective (RPO) determines how old the recovered data will be. This can be anywhere from a few seconds in the case of a sophisticated (and expensive) remote mirroring system to several hours, or even days, for less critical data. Like the RTO, the RPO is often assigned by function, with critical functions, such as transaction processing, having short RPOs and less immediate functions recovering to a point further back in time.
The Network Recovery Objective (NRO) is, effectively, how long before you appear recovered to your customers. More technically, it is the time needed to recover or fail over network operations. NRO includes such jobs as establishing alternate communications links, reconfiguring Internet servers, setting alternate TCP/IP addresses and everything else needed to get the network working again.
Recovery teams
- Emergency Management Team (EMT)
- Disaster Recovery Team (DRT)
- IT Technical Support (IT) for Networking
See Appendix A for details on the roles and responsibilities of each team.
Team member responsibilities
- Each team member will designate an alternate/backup.
- All team members should keep an updated calling list of team members’ work, home and cell phone numbers both at home and at work.
- All team members should keep this plan for reference at home in case a network disaster happens after normal work hours. All team members should familiarize themselves with the contents of this plan.
3 RESPONSIBILITIES
Incident management response is depicted by a RACI (Responsible, Accountable, Consulted, and Informed) matrix that describes the main tasks and responsibilities in the incident resolution process. A RACI matrix is a table used to describe the type and degree of involvement that stakeholders have in completing tasks.
The RACI Matrix displays deliverables or tasks along one axis:
- Incident identification (any kind of malfunction) and addressing to 1st line support
- Incident escalation to 2nd line support
- Incident escalation to 3rd line support
- Incident resolution (recovery)
It displays project roles or stakeholders along the other axis:
- Hirett Ltd employees
- Hirett Ltd directors
- 1st line support
- 2nd line support
- 3rd line support
At each intersecting cell the type or degree of involvement is documented (Responsible, Accountable, Consulted, and Informed). Hirett Ltd uses the following definitions for each level of participation in a task or in the creation of an incident ticket or a document.
Accountable – This is the person who ultimately ensures that the deliverable or task has been completed and is thorough and correct. The accountable person directs the work of the responsible person and there should only be one truly accountable person. This avoids any misunderstandings when something doesn’t get done or is done incorrectly.
Responsible – The responsible person(s) is the one performing the work. It can be one person, or a team of people. The responsibility of the person(s) is to obtain the required information and utilise this information to complete the task and/or create the deliverable. They may be reporting to a lead or manager who is accountable for the task or deliverable. However, for smaller tasks or deliverables, when there is only one responsible person listed, they may also be listed as the accountable party.
Consulted – The consulted person(s) is a subject matter expert. The opinions and/ or knowledge of the consulted person(s) of a particular system or process are sought. They don’t usually participate in completing a task or deliverable, other than by providing the relevant information that the responsible person needs to achieve their task or deliverable.
Informed – These are the people who need to be kept up to date on a task or deliverable. They may need to track the amount of progress being made, but usually the main focus of the informed personnel is the completion of a task or deliverable. Typically, they are either reviewers of the completed document who provide formal sign-off and approval, or they may be dependent on the results from the task or deliverable.
4 INCIDENT MANAGEMENT RACI MATRIX
Role | Incident Identification; Description; allocation to 1st line support | Incident Resolution 1st line support | Incident Resolution 2nd line support | Incident Resolution 3rd line support |
Hirett Ltd employees | R | I | I | I |
Hirett Ltd Directors | I, C | I, C | ||
1st line support | A, R | I, C | I, C | |
2nd line support | A, R | I, C | ||
3rd line support | A, R |
5 DISASTER RECOVERY & BUSINESS CONTINUITY MATRIX
Disaster | Critical Resources Impacted | Mitigation | Time to Reinstate | Risk |
Power Loss | Electricity, internet, telephones | | Depends on supplier (normally < few hours) | MEDIUM |
Internet fails | Internet | | Depends on supplier used so typically < few hours | MEDIUM |
Telephone lines down | Telephones, internet | | Depends on supplier (normally <1 day) | MEDIUM |
Flooding | Business premises | | 24 hours | LOW |
Fire | Electricity, internet, office space, onsite PCs | | 24 hours | MEDIUM |
Server Failure | Data Servers | | < 6 hours normally. | MEDIUM |
Burglary | PCs, telephones | | 6 hours | LOW |
6 EMERGENCY RESPONSE CHECKLIST
The Director of the Company is responsible for putting contingency plans into effect. Not all of the following will be relevant in every situation. These points must be checked before any action is taken.
- Check that all staff are available and have been made aware of the problem.
- Check electricity is working – if not, contact supplier and take action as per mitigation above.
- Check phone lines are working – if not, contact Telecom Company and take action as per mitigation above.
- Check internet is working – if not, contact supplier and take action as per mitigation above.
- Check data servers are accessible – take action as per mitigation above.
- Staff to be instructed to work from home where necessary (all have been made aware and are able to do so where required):
- Ensure internet (or temporary measure) reinstated within 4 hours – using alternative locations
- Contact lettings companies regarding temporary office rental.
7 INCIDENTS AND PROBLEM MANAGEMENT
This framework should include the service quality definitions and measurements for major vendors:
- Support hours – HIRETT requires complete service uptime on a 24/7, 365-day basis. All incidents are to be investigated and steps put in place to resolve the issues in line with the RTO.
Urgency determination guidelines
Urgency describes the criticality of service function unavailability for users of the service:
Urgency | Service functions |
High | At least one of the following service functions is impaired: all ONLINE Functions; all Front Office Functions |
Medium | At least one of the following service functions is impaired: all Back Office Functions |
Low | Any other Functions |
Priority attachment guidelines
Priority is the relationship between the impact and the urgency, which determines the sequence of incident solving:
Urgency / Impact | For the entire company or online clients | For a user group | For a user |
High | Very high | Medium | Low |
Medium | Very high | Medium | Low |
Low | High | Low | Low |
Priority
Priority indicates parameters applied when resolving incidents or performing service support activities:
Priority | Response time | Resolution time |
Very high | 30 minutes. | 4 hours |
High | 30 minutes. | 8 hours |
Medium | 30 minutes. | 12 hours |
Low | 30 minutes. | 48 hours |
Response time – a period of time it takes the service provider to start resolving an incident or performing service support.
Resolution time – a period of time it takes the service provider to resolve an incident or perform service support.
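The urgency, impact and priority tables above can be combined into a single lookup. The following is a minimal sketch only, not part of HIRETT's tooling: the short impact keys (`company`, `group`, `user`), the function name and the sample timestamp are illustrative assumptions, and both targets are assumed to be measured from the moment the incident is reported.

```python
from datetime import datetime, timedelta

# Priority matrix from the table above: PRIORITY[urgency][impact].
# The impact keys are hypothetical short labels for the three impact columns.
PRIORITY = {
    "high":   {"company": "very high", "group": "medium", "user": "low"},
    "medium": {"company": "very high", "group": "medium", "user": "low"},
    "low":    {"company": "high",      "group": "low",    "user": "low"},
}

# (response time, resolution time) per priority, from the table above.
TARGETS = {
    "very high": (timedelta(minutes=30), timedelta(hours=4)),
    "high":      (timedelta(minutes=30), timedelta(hours=8)),
    "medium":    (timedelta(minutes=30), timedelta(hours=12)),
    "low":       (timedelta(minutes=30), timedelta(hours=48)),
}


def incident_deadlines(urgency, impact, reported_at):
    """Return (priority, respond_by, resolve_by) for one incident.

    Assumption: both targets run from the time the incident is reported.
    """
    priority = PRIORITY[urgency][impact]
    response, resolution = TARGETS[priority]
    return priority, reported_at + response, reported_at + resolution


# Hypothetical incident: low urgency, affecting the entire company.
reported = datetime(2024, 1, 15, 9, 0)
priority, respond_by, resolve_by = incident_deadlines("low", "company", reported)
print(priority, respond_by, resolve_by)
```

Keeping the two tables as plain dictionaries means a change to the policy tables requires only a data change, not a logic change.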
Service quality
Service Activity | Description | Priority | Quality Service Definition |
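Average time to respond to incidents | Average time taken for IT to start to resolve an incident | Very high | 30 minutes |
Average time to respond to incidents | Average time taken for IT to start to resolve an incident | High | 30 minutes |
Average time to respond to incidents | Average time taken for IT to start to resolve an incident | Medium | 30 minutes |
Average time to respond to incidents | Average time taken for IT to start to resolve an incident | Low | 30 minutes |
Average incident resolution time | Average time taken for IT to resolve an incident | Very high | 4 hours |
Average incident resolution time | Average time taken for IT to resolve an incident | High | 8 hours |
Average incident resolution time | Average time taken for IT to resolve an incident | Medium | 12 hours |
Average incident resolution time | Average time taken for IT to resolve an incident | Low | 48 hours |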
8 BACKUP POLICY
Full and incremental backups protect and preserve corporate network information and should be performed on a regular basis for system logs and technical documents that are not easily replaced, have a high replacement cost or are considered critical. Backup media should be stored in a secure and geographically separate location from the original and isolated from environmental hazards. Backup network components, cabling and connectors, power supplies, spare parts and relevant documentation should be stored in a secure area on-site as well as at other corporate locations.
Network-specific data and document retention policies specify what records must be retained and for how long. All network organizations are accountable for carrying out instructions for records management in their organization.
IT Technical Support follows these standards for data backup and archiving, particularly for networks:
Hard drive retention policy
Backup media is stored at locations that are secure, isolated from environmental hazards, and geographically separate from the location housing network components.
System databases
A copy of the most current network and system databases must be made at least twice per month or based on frequency of changes made.
These backups must be stored offsite.
The lead network administrator is responsible for this activity.
Offsite storage procedures
Hard drives, disks and other suitable media are stored in environmentally secure facilities.
Hard drive or disk rotation occurs on a regular schedule coordinated with the storage vendor.
Access to backup databases and other data is tested annually.
9 EMERGENCY MANAGEMENT PROCEDURES
The following procedures are to be followed by network administration and operations personnel and other designated HIRETT employees in the event of a network disruption or related outage. Where uncertainty exists, the more reactive action should be followed to provide maximum protection and personnel safety.
These procedures are furnished to HIRETT management personnel to take home for reference. Several pages have been included to supply emergency contacts.
In the event of any situation where access to a building housing network infrastructure equipment is denied, personnel should report to alternate locations or contact security for access if the location is not damaged or quarantined. Primary and secondary locations are listed below.
Alternate Locations
Workplace:
Attempt to contact your immediate supervisor or management via telephone. Home and cell phone numbers are included in this document.
Workplace:
Attempt to contact your immediate supervisor or management via telephone. Home and cell phone numbers are included in this document.
Workplace:
Attempt to contact your immediate supervisor or management via telephone. Home and cell phone numbers are included in this document.
9.1 IN THE EVENT OF A NATURAL DISASTER
In the event of a major catastrophe affecting HIRETT network operations, immediately notify the < Name or Title of Person>.
STEP | ACTION |
1 | Notify EMT and DRT of impending event as time permits. |
2 | If the impending natural disaster can be tracked, begin launching network within 48 hours as follows:
Deploy portable generators with fuel on standby. Deploy network technical and admin personnel on standby. Facilities department on standby for replacement shelters. Basic necessities are acquired by support personnel when deployed: cash for one week; food and water for one week; gasoline and other fuels; supplies, including chainsaws, batteries, rope, flashlights, medical supplies, etc. |
3 | 24 hours prior to event:
Create an image of network and system databases and other relevant files. Back up critical network and system elements. Verify backup generator fuel status and operation. Create backups of PBXs, routers, VoIP systems, e-mail, switches, file servers, etc. Fuel vehicles and emergency trailers. Notify senior management. |
9.2 IN THE EVENT OF A FIRE
If fire or smoke is present in the facility where network infrastructure assets are located, evaluate the situation and determine the severity, categorize the fire as a major or minor incident and take the appropriate action as defined in this section. Call ________ or contact your local first responders as soon as possible if the situation warrants it.
Personnel are to attempt to extinguish minor fires (e.g., single hardware component or paper fires) using hand-held fire extinguishers located throughout the facility. Any other fire or smoke situation will be handled by qualified building personnel until the local fire department arrives.
In the event of a major fire, call 911 and immediately evacuate the area.
In any emergency situation, system and network security, site security and personal safety are the major concerns. If possible, the lead network administrator and/or designee should remain present at the facility until the fire department has arrived.
In the event of a major catastrophe affecting the facility, immediately notify senior management.
STEP | ACTION |
1 | Dial ________________ to contact the fire department. |
2 | Immediately notify all other personnel in the facility of the situation and evacuate the area. |
3 | Alert emergency personnel on:
Provide them with your name, extension where you can be reached, building and room number, and the nature of the emergency. Follow all instructions given. |
4 | Alert the EMT and DRT.
Note: During non-staffed hours, security personnel will notify the Senior Executive responsible for the location directly. |
5 | Notify Building Security.
Local security personnel will establish security at the location and not allow access to the site unless notified by the Senior Executive or his/her designated representative. |
6 | Contact appropriate vendors to aid in the decision regarding the recovery and resumption of network services and protection of equipment as time and events permit. |
7 | All personnel evacuating the facilities will meet at their assigned outside location (assembly point) and follow instructions given by the designated authority. Under no circumstances may any personnel leave without the consent of a supervisor. |
9.3 IN THE EVENT OF A NETWORK SERVICES PROVIDER OUTAGE
In the event of a network service provider outage, the guidelines and procedures in this section are to be followed.
STEP | ACTION |
1 | Notify senior management of outage.
Determine cause of outage and timeframe for its recovery. |
2 | If outage will be greater than one hour, route all calls via alternate services.
If it is a major outage and all carriers are down and downtime will be greater than 12 hours, deploy satellite phones, if available. |
9.4 IN THE EVENT OF A FLOOD OR WATER DAMAGE
In the event of a flood or broken water pipe near any network infrastructure location, the guidelines and procedures in this section are to be followed.
STEP | ACTION |
1 | Assess the situation and determine if outside assistance is needed; if this is the case, dial _________ immediately. |
2 | Immediately notify all other personnel of the situation and ask them to be prepared to cease voice and data operations. |
3 | Notify all other personnel in the facility of the situation and ask them to be prepared to cease operations accordingly. |
4 | Water detected below raised floor may have different causes:
If water is slowly dripping from an air-conditioning unit and not endangering equipment, contact repair personnel immediately. If water is of a major quantity and flooding beneath the floor (water main break), immediately implement power-down procedures. While power-down procedures are in progress, evacuate the area and follow management’s instructions. |
10 Recovery Procedures
Work place and Infrastructure:
Desktop Computer Failure
Immediate replacement with a previously configured off line spare will be carried out by the following trained individuals.
Central Office – Mr. John Anthony, Director, Hirett Ltd and Mrs. Anne Victor, Director, Hirett Ltd.
Network Failure
In the event of an office network failure or numerous computer failures, Mr. John Anthony, Director, Hirett Ltd, and Mrs. Anne Victor, Director, Hirett Ltd, should be informed so that they can initiate recovery procedures:
Director –
Tel.: (+44)
Email:
Director –
Tel.: (+44)
Email:
Business and Office Applications Recovery Procedures
All principal systems and applications used within HIRETT are supported by the following vendors:
__________________- Software (Vendor – ______________):
__________________ is HIRETT's main ______________ Software System
1st line support will be performed by specially trained Ambassadors:
User support engineer –
email:
2nd line support will be performed by ______________ System Administrator:
System Administrator –
Tel.:
E-Mail:
3rd line support will be performed by the ____________ software Vendor – _______________:
System Administrator – ________________
Tel.: _________________
E-Mail: __________________
Product Supervisor – ______________
Tel.: _______________
E-Mail:
____________ System Infrastructure is serviced by ___________ Data Centre (located in ________-, ____________-).
1st line support will be performed by:
System Administrator –
Tel.: _____________
E-Mail: _____________
2nd line support will be performed by:
Technical Support – _______________
Tel.: _________________
E-Mail: __________________
Account Manager – _____________________
Tel.: ________________
E-Mail: _______________-
In accordance with the SLA, the ______________'s infrastructure recovery objectives in case of complete data center destruction are defined as:
RTO – full system functions will be recovered in an alternative data center within 24 hours
RPO – data loss is limited to a maximum of 10 minutes
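As an illustration of how these two objectives could be checked after a recovery exercise or a real incident, the sketch below compares measured downtime and data loss against the 24-hour RTO and 10-minute RPO stated above. The function name and all timestamps are hypothetical; downtime is assumed to run from the start of the outage to service restoration, and data loss is assumed to be the gap between the last recoverable data and the start of the outage.

```python
from datetime import datetime, timedelta

RTO = timedelta(hours=24)    # recovery time objective stated above
RPO = timedelta(minutes=10)  # recovery point objective stated above


def check_recovery(outage_start, service_restored, last_recoverable_data):
    """Compare one recovery against the RTO and RPO.

    Downtime is the time from outage start to service restoration.
    Data loss is the time between the last recoverable data and the outage.
    """
    downtime = service_restored - outage_start
    data_loss = outage_start - last_recoverable_data
    return {
        "downtime": downtime,
        "rto_met": downtime <= RTO,
        "data_loss": data_loss,
        "rpo_met": data_loss <= RPO,
    }


# Hypothetical timestamps, for illustration only.
result = check_recovery(
    outage_start=datetime(2024, 3, 1, 2, 0),
    service_restored=datetime(2024, 3, 1, 20, 30),
    last_recoverable_data=datetime(2024, 3, 1, 1, 52),
)
print(result)
```

Recording these two figures for every test of the plan gives a simple, comparable record of whether the SLA objectives remain achievable.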
11 PLAN REVIEW AND MAINTENANCE
This network disaster recovery plan must be reviewed semi-annually and exercised on at least an annual basis. The test may be in the form of a walk-through, mock disaster, or component testing. Additionally, considering the dynamic environment within HIRETT, it is important to review the listing of personnel and phone numbers contained within the network DR plan regularly.
The hard-copy version of the network DR plan will be stored in a common location where it can be viewed by site personnel and the EMT and DRT. Electronic versions will be available via HIRETT network resources as provided by IT Technical Support. Each recovery team will have its own directory with change management limited to the recovery plan coordinator.
12 DISASTER RECOVERY PLAN
The Disaster Recovery Plan is designed to keep two key metrics to a minimum in case of an accident:
- Recovery time objective (RTO) – the maximum admissible duration of service unavailability;
- Recovery point objective (RPO) – the maximum admissible period of data loss during a serious accident.
Usually these values are specified and taken into account in the SLA contract. The main events that can occur during an accident are:
- Inaccessibility of service:
- server crash
- data centre failure
- the lack of server processing power with increasing load on the service
- Loss, breakdown of data:
- Loss, failure in the relational database
- Loss of event log data (logs) required for analysing the occurrence of an accident and debugging programs, investigating cases of penetration into the system, etc.
Basic means of control for accidents prevention:
Using the tools provided by the cloud service on which technical service provider infrastructure is based:
- Centralized monitoring of system resources (periodical metrics monitoring)
- Centralized event logs view
- Notifications about conditions that can potentially disrupt the normal operation of the service (for example, a high load on the processor, lack of disk space, lack of memory, etc.).
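The notification rule in the last item can be pictured as a comparison of collected metrics against limits. The sketch below is illustrative only: the metric names and threshold percentages are assumptions, not values taken from this plan or from the cloud provider, and in practice the monitoring tools of the cloud service would perform this check.

```python
# Hypothetical thresholds in percent; not taken from this plan.
THRESHOLDS = {
    "cpu_percent": 90.0,
    "disk_used_percent": 85.0,
    "memory_used_percent": 90.0,
}


def breached(metrics, thresholds=THRESHOLDS):
    """Return the sorted names of metrics at or above their threshold.

    Metrics missing from `metrics` are treated as 0 and never alert.
    """
    return sorted(
        name
        for name, limit in thresholds.items()
        if metrics.get(name, 0.0) >= limit
    )


# Hypothetical sample of collected metrics.
sample = {"cpu_percent": 97.2, "disk_used_percent": 61.0, "memory_used_percent": 91.5}
for name in breached(sample):
    print(f"ALERT: {name} = {sample[name]} (threshold {THRESHOLDS[name]})")
```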
The disaster recovery plan takes a multi-level (multi-tier) approach to data recovery; that is, there are several recovery paths, depending on the level and extent of damage:
- To ensure uninterrupted operation and instant recovery in the event of an accident, we use a copy of the infrastructure located in a data center in another region of the cloud service on which we build our infrastructure. All external traffic is controlled by the HTTPS load balancing service inside our cloud. The HTTPS load balancer in turn redirects traffic to available backend servers by performing a health check. Accordingly, if the service and backend servers are not available in the main region, traffic is redirected to a working copy in another region. This scenario implements the Hot Standby Server Failover approach.
- To restore system disks and their data after an accident or the crash of an individual server, regular snapshots of server drives with system and user data can be created. The interval for creating snapshots is set according to the RTO metrics and the value of each server. If a disk contains frequently changing data, for example data from a relational database, this approach requires a freeze of the system disk on which the relational database data resides.
To recover from a snapshot, the following steps are required:
- A system disk is created from the corresponding snapshot
- A virtual server with a broken disk or a group of servers is stopped or deleted (depending on the situation, and the type of disks, for example, if the main system disk from which the operating system boots is broken, the virtual server is deleted and recreated based on the newly restored disk).
- The tools provided by the cloud service (___________) make it possible to replace disks holding broken data with disks newly created from snapshots, or to re-create a virtual server based on the newly restored disk, in the shortest possible time.
- After a successful replacement or re-creation, the virtual server or group of servers is started again.
- In the event of a breakdown, the data of the relational database can be restored by rolling back in time (database point-in-time recovery). We use the tools of the database system itself to create incremental backups. This approach makes it possible to create backups quickly at the smallest time interval and also to roll back in time quickly:
- Backups are stored on the persistent disk attached to the database server. For this purpose, a task (cronjob) is configured that creates full and incremental backups on the server at a set interval, chosen in accordance with the RPO metrics.
- The server also has a configured task that archives, encrypts and copies backups to remote Storage in a cloud server, which can be used for restoring. Backups are kept on this Storage for much longer, so it is possible to recover data that no longer exists on the persistent disk of the database server.
- The following steps are used to restore the backup:
- Backup download (full + incremental) from the remote Storage service to the persistent disk where the relational database itself is located.
- Backup is prepared using special tools (unzipped, decrypted, transactional logs are rolled into a specially prepared directory with data from a relational database).
- Switching the database system to the directory with the recovered data.
- Along with the data from the database, a task is created that copies system event logs and application logs to the remote Storage service of our cloud hosting provider (_______________).
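The first restore step above requires choosing which full and incremental backups to download. The following sketch shows one way that selection could work; it is an assumption for illustration, not the logic of any particular database tool. The function name, the tuple layout and the sample timestamps are hypothetical.

```python
from datetime import datetime


def restore_chain(backups, target):
    """Pick the backups needed to restore to `target`.

    `backups` is a list of (taken_at, kind) tuples, where kind is "full" or
    "incremental". The result is the latest full backup taken at or before
    `target`, followed by every later incremental taken at or before `target`.
    """
    usable = sorted(b for b in backups if b[0] <= target)
    full_indexes = [i for i, (_, kind) in enumerate(usable) if kind == "full"]
    if not full_indexes:
        raise ValueError("no full backup at or before the target time")
    return usable[full_indexes[-1]:]


# Hypothetical backup catalogue.
backups = [
    (datetime(2024, 5, 1, 0, 0), "full"),
    (datetime(2024, 5, 1, 6, 0), "incremental"),
    (datetime(2024, 5, 2, 0, 0), "full"),
    (datetime(2024, 5, 2, 6, 0), "incremental"),
    (datetime(2024, 5, 2, 12, 0), "incremental"),
]
for taken_at, kind in restore_chain(backups, datetime(2024, 5, 2, 9, 0)):
    print(taken_at, kind)
```

The chain restores the database to the time of the last selected backup; reaching the exact target time would then depend on rolling the transactional logs forward, as described in the preparation step.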
13 ALERTS/VERIFICATION/DECLARATION PHASE
13.1 PLAN CHECKLIST
Network response and recovery checklists and plan flow diagrams are presented in the following two sections. The checklists and flow diagrams may be used by Technical Support members as “quick references” when implementing the network DR plan or for training purposes.
Insert checklists and other relevant procedure documents here.
Plan checklists
Initials | Task to be completed |
13.2 NETWORK DIAGRAMS
Insert network diagrams and other relevant procedure documents here.
13.3 RECOVERY FLOW DIAGRAMS
Insert network recovery flow diagrams and other relevant procedure documents here.
13.4 NOTIFICATION OF INCIDENT AFFECTING THE SITE
On-duty personnel responsibilities
If in-hours:
Upon observation or notification of a potentially serious network disruption at a company location, ensure that personnel on site have enacted standard emergency and evacuation procedures if appropriate and notify the EMT and DRT.
If out of hours:
IT Technical Support personnel should contact the EMT and DRT.
13.5 PROVIDE STATUS TO EMT AND DRT
Contact EMT and/or DRT and provide the following information when any of the following conditions exist: (See Appendix B for contact list)
- Network performance has sufficiently degraded to where normal operations are not possible for three or more hours.
- Any problem at any network infrastructure asset, system or location that would cause the above condition to be present or there is certain indication that the above condition is about to occur.
The EMT will provide the following information:
- Location of incident.
- Type of incident (e.g., fire, hurricane, flood).
- Summarize the damage (e.g., minimal, heavy, total destruction).
- Meeting location that is a safe distance from the disaster scene.
- An estimated timeframe of when a damage assessment group can enter the facility (if possible).
- The EMT will contact the respective team leader and report that a disaster involving network operations has occurred.
- The EMT and/or DRT will contact the respective HIRETT team leader and report that a disaster affecting network operations has occurred.
13.6 DECIDE COURSE OF ACTION
Based on the information obtained, the EMT and/or DRT decide how to respond to the event: Mobilize IT Technical Support, repair/rebuild existing network operations with network technical and admin staff or relocate to a new facility.
Inform team members of decision
If a disaster is not declared, the location response team will continue to address and manage the situation through its resolution and provide periodic status updates to the EMT/DRT.
If a disaster is declared, the EMT and/or DRT will notify IT Technical Support immediately for deployment of network DR plans.
Declare a disaster if the situation is not likely to be resolved within predefined time frames. The person who is authorized to declare a network disaster must also have at least one (1) backup who is also authorized to declare a disaster in the event the primary person is unavailable.
Contact networking and equipment vendors (see Appendix I)
Disaster declared: mobilize incident response/technical support teams/report to command center
Once a network disaster is declared, the Disaster Recovery Team (DRT) is mobilized. This team will initiate and coordinate the appropriate recovery actions. Network technical and administrative employees should assemble at a designated location as soon as possible. See Appendix E for emergency locations.
Conduct detailed damage assessment (This should be performed prior to declaring a disaster).
- Under the direction of local authorities, IT Technical Support and/or EMT/DRT, assess the damage to the network and related assets. Include vendors/providers of installed network services and equipment to ensure that their expert opinion regarding the condition of the network is determined ASAP.
Participate in a briefing on assessment requirements, reviewing:
- Assessment procedures
- Gather requirements
- Safety and security issues
NOTE: Access to the facility following a fire or potential chemical contamination will likely be denied for 24 hours or longer.
13.7 DOCUMENT ASSESSMENT RESULTS
Building access permitting:
Conduct an on-site inspection of affected areas to assess damage to essential network records (files, manuals, contracts, documentation, etc.) and electronic data.
Obtain information from the DRT regarding damage to the network (e.g., environmental conditions, physical structure integrity, furniture and fixtures).
Develop a Restoration Priority List, identifying facilities, vital records and equipment needed for resumption of network operations that could be restored and retrieved quickly.
Recommendations for required resources:
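The Restoration Priority List described in this section could be kept as a simple sortable record. The sketch below is illustrative only: the `RestorationItem` fields, the numeric priority scale (1 = restore first) and the tie-breaking rule are assumptions, not requirements of this plan.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RestorationItem:
    """One entry on the Restoration Priority List (hypothetical fields)."""
    name: str
    category: str              # "facility", "vital record" or "equipment"
    priority: int              # assumed scale: 1 = restore first
    quickly_retrievable: bool  # can be restored and retrieved quickly


def build_priority_list(items: List[RestorationItem]) -> List[RestorationItem]:
    """Order by priority; within a priority, quickly retrievable items come first."""
    return sorted(
        items,
        key=lambda item: (item.priority, not item.quickly_retrievable, item.name),
    )
```

Sorting on a tuple keeps the ordering rule in one place, so a change to the tie-breaking rule needs only one edit.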
Contact DRT: Decide whether to continue to business recovery phase
The EMT and DRT gather information regarding the event, contact senior management and provide them with detailed information on status.
Based on the information obtained, senior management decides whether to continue to the business recovery phase of this network DR plan. If the situation does not warrant this action, continue to address the situation at the affected site(s).
This section documents the steps necessary to activate network recovery plans to support full restoration of systems and network functionality at either 1) the original company location or 2) an alternate/recovery site that would be used for an extended period of time. It covers coordinating resources to re-establish network operations at the primary site or to reconstruct network operations at a temporary/permanent system location, and deactivating recovery teams upon return to normal network operations in either scenario.
HIRETT System and facility operation requirements:
The system and facility configurations for each location are important to re-establish normal network operations. A list for each location will be included in Appendix F.
Notify IT technical support staff and coordinate return to primary facility/location
See Appendix A for IT Technical Support staff associated with recovery of network operations at the original site.
Secure funding for return to work
Make arrangements in advance with network service carriers and equipment vendors to recover network operations at the primary site.
Notify IT technical support staff/coordinate relocation to new facility/location
See Appendix A for IT Technical Support staff associated with configuring network services at an alternate location (replacement for original site).
Secure funding for relocation
Make arrangements in advance with network service carriers and equipment vendors. Make arrangements in advance with local banks, credit card companies, hotels, office suppliers, food suppliers and others for emergency support.
Notify EMT and corporate business units of network recovery
Using the call list in Appendix B, notify the appropriate company personnel. Inform them of any changes to processes or procedures, contact information, hours of operation, etc. (may be used for media information).
Operations recovered
Assuming all relevant network operations have been recovered either to the original location or to an alternate site with employees in place to support network operations, the company can declare that its network is functioning normally.
APPENDIXES
APPENDIX A: HIRETT RECOVERY
Emergency Management Team
Note: See Appendix B for contact list. Suggested members include senior management, Human Resources, Corporate Public Relations, Legal Department, IT Technical Support, Risk Management and Operations.
Charter:
Responsible for overall coordination of the network disaster recovery effort, evaluating the situation and determining whether to declare a disaster, and communicating with senior management.
Support activities:
The Emergency Management Team:
- Evaluates which recovery actions should be invoked and coordinates with the corresponding network recovery teams.
- Analyzes network damage assessment findings.
- Sets restoration priority based on damage assessment reports in collaboration with IT Technical Support.
- Provides senior management with ongoing status information.
- Acts as a communication channel to corporate teams and major customers.
- Works with vendors, carriers and IT Technical Support to develop a rebuild/repair schedule.
Disaster Recovery Team (DRT)
Note: See Appendix B for contact list.
Charter:
Responsible for overall coordination of the network disaster recovery effort, establishment of the emergency command area (if needed) and communications with senior management, the Emergency Management Team, and IT Technical Support teams.
Support activities:
- Coordinate with EMT, senior management and IT Technical Support.
- Assist IT Technical Support in determining network recovery needs.
- Establish command center and assembly areas.
- Notify all company department heads and advise them to activate their plan(s) if applicable, based upon the disaster situation.
- If no network disaster is declared, take appropriate action to return to normal network operations using regular network operations staff.
- Determine if carriers, vendors and other teams are needed to assist with detailed damage assessment.
- Prepare post-disaster debriefing report.
- Coordinate the development of revised network recovery plans and ensure they are updated semi-annually.
IT Technical Support
Charter:
IT Technical Support will facilitate network recovery and restoration activities.
Support activities:
Upon notification of disaster declaration, review and provide support as follows:
- Facilitate network recovery and restoration activities, providing guidance on replacement equipment, systems and network services, as required.
- Coordinate testing of network operations to ensure the network is functioning normally.
APPENDIX B: RECOVERY TEAM CONTACT LISTS
Emergency Management Team
Name | Address | Home | Mobile |
Disaster Recovery Team
Name | Address | Home | Mobile |
IT Technical Support
Name | Address | Home | Mobile |
APPENDIX C: EMERGENCY NUMBERS
First responders, network carriers, public utility companies and others
Name | Contact Name | Phone |
APPENDIX D: CONTACT LIST
Name | Address | Home | Mobile |
APPENDIX E: EMERGENCY COMMAND CENTER LOCATIONS
Emergency command center – <Location Name>
Primary:
Address
Room
City
Contact: coordinator of rooms/space – (xxx) xxx-xxxx
Alternate:
Address
Room
City
Contact: coordinator of rooms/space – (xxx) xxx-xxxx
Emergency command center – <Location Name>
Primary:
Address
Room
City, State
Contact: coordinator of rooms/space – (xxx) xxx-xxxx
Alternate:
Address
Room XXX
City, State
Contact: coordinator of rooms/space – (xxx) xxx-xxxx
APPENDIX F: FORMS
Incident/disaster form
Upon notification of a network disruption, the on-duty personnel in Network Operations will make the initial entries in this form. It will then be forwarded to the Emergency Command Center (ECC) and will be continually updated. This document will be the running log until the network incident/disaster has ended and "normal business" has resumed.
TIME AND DATE
TYPE OF EVENT
LOCATION
BUILDING ACCESS ISSUES
PROJECTED IMPACT TO OPERATIONS
RUNNING LOG (ongoing events)
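If the incident/disaster form is kept electronically, its fields map naturally onto a small record type. The following is a sketch under that assumption only; the class name `IncidentForm`, the `log` method and the field types are hypothetical and simply mirror the headings of the form above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple


@dataclass
class IncidentForm:
    """Electronic counterpart of the incident/disaster form (illustrative)."""
    time_and_date: datetime
    type_of_event: str
    location: str
    building_access_issues: str = ""
    projected_impact_to_operations: str = ""
    # Running log of ongoing events as (timestamp, entry) pairs.
    running_log: List[Tuple[datetime, str]] = field(default_factory=list)

    def log(self, when: datetime, entry: str) -> None:
        """Append one timestamped entry to the running log."""
        self.running_log.append((when, entry))
```

Network Operations would create the record with the initial entries, and the Emergency Command Center would then call `log` for each subsequent event until normal business resumes.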