It won’t be alright on the night: disaster recovery and the importance of rehearsals

Blog by: Phil Curwood, Chief Technology Officer, Adept4 - 08-Mar-2018

Disaster recovery (DR) is a crucial weapon in modern organisations’ IT arsenals. In a world in which both day-to-day and one-off business processes are tied to the availability and operability of IT systems, simply looking the other way and assuming that DR will never be required is a foolish strategy. No organisation can safely make that assumption, since it is impossible to fully mitigate the range of risks that can create a need for DR, from simple human error, to disasters such as fires and floods, to the diverse and dynamic malicious cyber threat landscape.

But what does a comprehensive DR strategy actually involve? Most business leaders are aware of the fundamentals: that part or all of their organisation’s IT infrastructure must be backed up at regular intervals and stored at a separate location, ready to be replicated as required. But this is just one piece of the puzzle. It is the core technology process around which a complicated series of plans needs to be developed, typically with a wide range of roles and procedures carried out by different members of the organisation.

A lesser-discussed, yet still vital, part of disaster recovery is the testing of the above procedures and processes, ensuring that all the people and technologies involved can work as seamlessly as possible when the worst happens. After all, effective DR isn’t just about speed; it’s also about thoroughness. A replicated IT infrastructure is useless if it has failed to spin up an application that is vital to everyday business operations, or has replicated an out-of-date database. Far, far better to identify and mitigate such gaps and inefficiencies before your DR process is required in ‘real life’.

Another reason why DR testing is so important is simply that DR needs evolve just as your personnel, IT infrastructure, business processes and applications do. Disaster recovery has a range of aims, but can be broadly understood to focus on restoring access to information, restoring access to applications or processes, and reconnecting users. As such, it is intrinsically tied to the technology, the processes, and the people within your business. If they change, the chances are that your DR plan needs to change too.

How, then, should you develop an effective DR dress rehearsal? Here are the key steps to consider. 

  1. Draft your plan. This involves creating a set of documents setting out the processes to be followed by all individuals in the event of a disaster – possibly with different versions for different scenarios. It may need to include other documentation too, such as copies of software media, application source code and license keys. All this must be stored securely and separately from your main infrastructure.
  1. Run your dress rehearsal. This should be a full, end-to-end rehearsal running from the moment the incident is announced, through to analysis and understanding of what has occurred, and the full replication process. Remember to schedule it at a time when disruption to normal business operations is minimised, which may mean overnight or at a weekend. Ideally, the plan should be executed by stakeholders who didn’t draft the original documents, and who don’t know the details of the imagined disaster scenario until the last moment – this, of course, makes for a more realistic rehearsal, and can help identify flaws and weaknesses that may go unspotted by more informed parties.
  1. Throughout the dress rehearsal process, focus on identifying and recording any weaknesses, faults, anomalies and process gaps, as well as processes where the action doesn’t quite follow the plan. Remember that people are as integral to DR as technology, so you should focus on elements like the ability of involved parties to understand documentation, as well as technical capabilities of your tools and software. Where is additional staff training needed? Where are technology upgrades needed? 
  1. Schedule an all-party review meeting where stakeholders can, face-to-face, go through the original plan and the findings recorded during the rehearsal, and collectively agree the strengths and weaknesses of the DR approach. Strengths should be emulated, and weaknesses addressed.
  1. Repeat the entire process at least once a year. Meanwhile, update the original DR documentation in line with new systems, upgrades, new technologies, new stakeholders and so on. DR requirements, as outlined above, change frequently, so it is important that your plan changes frequently too. 
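
For teams that want to capture rehearsal findings in a structured way, the sketch below shows one possible approach in Python. It is illustrative only and rests on several assumptions: the `Finding` record, the `check_replica` function, the application names and the 24-hour data-age limit are all hypothetical, and a real rehearsal would gather this information from your own monitoring and replication tools. It checks for the two gaps described earlier, a vital application that failed to start and a database copy that is out of date, and records each one for the review meeting.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Finding:
    """One weakness, fault or gap noted during the rehearsal (hypothetical record)."""
    area: str          # e.g. "people", "process" or "technology"
    description: str


def check_replica(required_apps, running_apps, last_db_sync, now, max_data_age):
    """Return a list of findings for the replicated environment.

    required_apps: names of applications the business needs after recovery
    running_apps:  names of applications actually running in the replica
    last_db_sync:  when the replicated database was last brought up to date
    max_data_age:  the oldest data the business is willing to accept (assumed value)
    """
    findings = []

    # Any vital application that did not start is recorded as a gap.
    for app in sorted(set(required_apps) - set(running_apps)):
        findings.append(Finding("technology", f"{app} did not start in the replica"))

    # A database copy older than the agreed limit is recorded as out of date.
    data_age = now - last_db_sync
    if data_age > max_data_age:
        findings.append(Finding("technology", f"database copy is out of date (age: {data_age})"))

    return findings


if __name__ == "__main__":
    now = datetime(2018, 3, 8, 2, 0)
    results = check_replica(
        required_apps=["payroll", "crm"],       # hypothetical application names
        running_apps=["crm"],
        last_db_sync=now - timedelta(hours=30),
        now=now,
        max_data_age=timedelta(hours=24),       # assumed limit, set your own
    )
    for finding in results:
        print(f"[{finding.area}] {finding.description}")
```

People and process findings, such as documentation that participants could not follow, can be added to the same list by hand, so that the review meeting works from a single record.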


Topics: Cloud, disaster recovery, disaster recovery as a service
