The organization has a documented business continuity plan (BCP) that identifies each department’s needs and requirements to recover in the event of a disaster. Technology is at the center of the business and typically touches every department, but IT has only finite resources: people, equipment, and time. This means IT has to have a comprehensive disaster recovery (DR) plan that is agreed upon by the business, including a plan for incidents such as phishing and ransomware attacks.
Start by determining what all of IT’s assets are (e.g., hardware, software, data). Unless you have large stacks of money, it’s typically not cost-effective for IT to recover and bring up everything at the same time. Instead, recovery needs to be prioritized. One way is to categorize applications as mission critical (minimal downtime of x minutes/hours), essential (downtime of x hours/days), and non-essential (downtime of x days/weeks/months). Each category’s recovery time objective (RTO) carries a cost: in general, the shorter the RTO, the more it costs to achieve.
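The categorization above can be sketched in a few lines of Python. This is only an illustration: the tier thresholds (4 hours and 2 days), the `Asset` class, and the sample systems are all hypothetical, and the real values have to come from the business.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical tier thresholds; the real values are agreed with the business.
MISSION_CRITICAL_MAX = timedelta(hours=4)
ESSENTIAL_MAX = timedelta(days=2)


@dataclass
class Asset:
    name: str
    rto: timedelta  # maximum tolerable downtime for this asset


def tier(asset: Asset) -> str:
    """Map an asset's RTO onto one of the three categories."""
    if asset.rto <= MISSION_CRITICAL_MAX:
        return "mission critical"
    if asset.rto <= ESSENTIAL_MAX:
        return "essential"
    return "non-essential"


def recovery_order(assets):
    """Return assets sorted so the shortest RTO is recovered first."""
    return sorted(assets, key=lambda a: a.rto)


# Made-up inventory, for illustration only.
assets = [
    Asset("intranet wiki", timedelta(weeks=2)),
    Asset("order processing", timedelta(hours=1)),
    Asset("email", timedelta(hours=24)),
]

for a in recovery_order(assets):
    print(f"{a.name}: {tier(a)}")
```

Sorting by RTO is the simplest possible ordering. A real plan would also have to account for dependencies, such as a database that must be up before the application that uses it.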
Preventative Controls To Implement
Now that you know what you have, there are some preventative controls you can implement to protect those assets:
1. Have a surge protector for each laptop/desktop/external monitor in case there is a power spike.
2. Have an uninterruptible power supply (UPS) for desktops/servers so that they can be shut down gracefully in the event of a power outage or kept running until the generator kicks on.
3. Have a generator for when the power goes out and you have equipment/systems that must stay up. You can use colored electrical outlets to designate which are connected to the generator. Make sure you test the generator and have a plan to maintain sufficient fuel.
4. Create backups of email, applications, data, etc. Some backup considerations include:
- Incremental backups vs. full backups
- Real-time, daily, weekly, monthly, quarterly, and annual backups
- The time since your last backup determines how much data you could lose, so your backup frequency has to fit your recovery point objective (RPO)
- Don’t forget that restored passwords will be the ones in effect at the time of the backup
- Periodic testing to ensure you can recover the backed-up data
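To make the link between backup timing and RPO concrete, here is a minimal sketch. The function names and the sample timestamps are hypothetical, and it assumes the worst case is an incident that hits just before the next scheduled backup.

```python
from datetime import datetime, timedelta


def data_at_risk(last_backup: datetime, incident: datetime) -> timedelta:
    """Window of work that would be lost when restoring the last backup."""
    return incident - last_backup


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case loss is one full backup interval, so it must fit the RPO."""
    return backup_interval <= rpo


# Made-up example: nightly backup at 01:00, incident at 15:30 the same day.
last_backup = datetime(2024, 1, 10, 1, 0)
incident = datetime(2024, 1, 10, 15, 30)
print(data_at_risk(last_backup, incident))  # 14:30:00

# A daily backup cannot satisfy a 4-hour RPO; an hourly one can.
print(meets_rpo(timedelta(hours=24), timedelta(hours=4)))  # False
print(meets_rpo(timedelta(hours=1), timedelta(hours=4)))   # True
```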
If there is an incident, the next question is where to recover. Whether the hardware is on premises or in the cloud could make the answer significantly more straightforward. Also, is the incident isolated, such as one server going down where another server can be swapped in? Or is there a fire in the main server room? If so, do you have a hot site or an alternate location? If you’re in the cloud, how easy is it for you to spin up other servers, or do you have disaster recovery as a service (DRaaS)? There could be significant costs depending on the strategy.
Managing Your Disaster Recovery Plan
Make sure you document your DR plan and keep it current. You may want to keep a hard copy of the DR plan at your alternate site (if applicable). The next three critical steps are to test, test, and test some more. Test at least once a year (preferably with the business). Tabletop tests are good, but actual tests are better and more realistic. Document your test results to see what went well, what could be improved, and what didn’t work or meet expectations. The lessons learned from each test will help you refine your DR plan (and BCP), especially with the business’s ever-changing needs and objectives.
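One simple way to record test results is to compare the measured recovery time for each system against its RTO. The sketch below assumes that approach; the `DrillResult` class and the sample figures are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class DrillResult:
    system: str
    rto: timedelta     # target agreed with the business
    actual: timedelta  # recovery time measured during the test

    @property
    def met_rto(self) -> bool:
        return self.actual <= self.rto


def missed_targets(results):
    """Systems whose measured recovery time exceeded the RTO."""
    return [r.system for r in results if not r.met_rto]


# Made-up results from a single DR test.
results = [
    DrillResult("order processing", timedelta(hours=1), timedelta(minutes=50)),
    DrillResult("email", timedelta(hours=24), timedelta(hours=30)),
]
print(missed_targets(results))  # ['email']
```

The systems that come back in this list are the ones to revisit when refining the plan.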
There is a saying that applies: “Hope for the best, but plan for the worst.” Having a written and comprehensive DR plan will put you ahead of the game when you’re trying to recover the organization’s IT assets in a chaotic and stressful disaster situation.