LaShana Lewis

3 Tips for Surviving a Mass Outage, Or, How I Learned to Stop Worrying and Love Disaster Recovery


A blue screen error with the words "A problem has been...to your computer"
You might receive a "blue screen of death" during an outage.

Last night, right before bed, my wife tapped me on the shoulder to show me a recent news article that flashed across our local news channel: A Microsoft outage.


Sleepily, I guessed that the cause was probably a faulty update (I didn’t know at the time how right I was), scanned various news outlets, and cross-referenced with some friends in the industry.


I forwarded an article to a friend I know in the field and hoped she wasn’t as badly affected.

Airlines, hospitals, and news stations were reporting issues.


I conked out a short time later and figured it’d all be fixed by the morning.


After all, these places had backups and disaster recovery/business continuity teams, right??

RIGHT???


I woke up to what seemed like a five-alarm fire in the security realm, affecting everything mentioned above plus banks, which is NOT GOOD!


My friend texted me that this was going to be a mess.


After a few more articles came out, I realized that not only was I right about it being an update issue, but also that many major facilities were starting to recover by rebooting (and probably restoring a previous version of their OS), and that one entity was running such an old system that it wasn’t affected at all.


So, what lessons can be learned from this outage?


Business Continuity: Embracing the What-If Scenario


I spent nearly a decade being thrown into the disaster recovery and business continuity field, basically the “what would you do when your computing systems are a flaming hole in the ground” version of triage management. You might be surprised how often very large institutions don’t have a “what if” plan.


A few years ago, I developed one for a company that had offices all over the world. Their main system bit the dust, and it ended up impacting the executive team, directly.


I had to sit in with the executives since I was the only one with a time-stamped document that detailed everything that was happening from beginning to end.

Did I mention that my job was in the customer service department… a far cry from what I was “supposed” to be doing?


The only reason I knew so much about the issue and the affected parties was that I served as the unofficial tech support person for my department, doing everything from fixing software systems and crafting databases from scratch to performing manual upgrades on the network and desktop computers.


As a result, I was always on the phone with other sectors of the IT department, implementing code and rebooting servers so they didn’t have to make the trek to our location.


“What should we do?” an executive asked me.


I didn’t want to answer because several people on the call outranked me, and I didn’t want to embarrass the wrong person and invite awful repercussions.


I answered anyway, since I had already been told that the chain of command came down to the person asking me, and they had made the decision mine.


This is when I suggested that after we survived the catastrophe, we implement a disaster recovery team since there wasn’t one.


I served on it because I had been part of one at a different company of the same type just a few years before I joined.


Since then, whenever a company found out that I had a background in creating, implementing, and running a disaster recovery team, I was shoehorned in.


Fortunately, I’ve only been through a handful of real-world scenarios, and each time, we were so thankful for our practice exercises which kept us as calm as one could expect.


So, after helping no fewer than three major corporations with worldwide offices manage the terrible “what if” scenario, I’ve decided to speak up: if your company does not have a disaster recovery or business continuity team and plan, you don’t have a secure setup.


3 Tips for Prepping for Disaster Recovery


Your company needs a disaster recovery plan, and if you don't have one ready yet, here are some tips to get started:

  1. Build a Disaster Recovery Team  The first one is obvious — build the team. Ideally, you want to have at least one person from each branch of your IT division, including the Help Desk or Customer Support, and your CTO or their appointed executive.

  2. Launch to Stage, and Wait  Your organization often needs two environments: stage (basically a test, practice, sandbox-like system) and production (where all the live action happens). Push any new implementation to Stage first, then wait a while to make sure nothing bad follows from it before it goes to production.

  3. Have Backup at the Ready  Most good computing implementations have a second setup that mirrors pretty much everything on the main one and can easily be switched over to when the inevitable happens, regardless of how many times you’ve checked and double-checked.
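
To make tip 2 a bit more concrete, here is a minimal Python sketch of a "wait on Stage" gate. Everything in it is an assumption for illustration: the `StageRelease` record, the `can_promote_to_production` function, and the 24-hour `SOAK_PERIOD` are hypothetical names and values, not part of any particular deployment tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StageRelease:
    """A release that has been deployed to Stage (hypothetical record)."""
    version: str
    deployed_to_stage_at: datetime
    failed_health_checks: int = 0


# Assumed waiting period; pick a value that fits your own risk tolerance.
SOAK_PERIOD = timedelta(hours=24)


def can_promote_to_production(release: StageRelease, now: datetime) -> bool:
    """Allow promotion only after a clean waiting period on Stage."""
    soaked_long_enough = now - release.deployed_to_stage_at >= SOAK_PERIOD
    return soaked_long_enough and release.failed_health_checks == 0


if __name__ == "__main__":
    deployed = datetime(2024, 7, 18, 9, 0)
    release = StageRelease(version="1.4.2", deployed_to_stage_at=deployed)
    # Two hours on Stage is too soon; thirty clean hours is enough.
    print(can_promote_to_production(release, deployed + timedelta(hours=2)))   # False
    print(can_promote_to_production(release, deployed + timedelta(hours=30)))  # True
```

The point of the gate is simply that "it deployed fine" and "it has run fine for a while" are two different checks, and only the second one should unlock production.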


And, it should go without saying that your disaster recovery team should run drills just to make sure that their latest setups work without a hitch.


It’s best to run these drills when the network is at a lull just in case your practice turns into a real-life disaster!
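
A drill like this can be rehearsed in code as well as on a call. The sketch below is only an illustration under stated assumptions: `System`, `pick_active`, and `run_failover_drill` are hypothetical names, and the `max_load` threshold of 0.2 is a made-up stand-in for "the network is at a lull".

```python
from dataclasses import dataclass


@dataclass
class System:
    """One copy of the service, main or mirror (hypothetical model)."""
    name: str
    healthy: bool = True


def pick_active(primary: System, mirror: System) -> System:
    """Serve from the primary when it is healthy, otherwise use the mirror."""
    if primary.healthy:
        return primary
    if mirror.healthy:
        return mirror
    raise RuntimeError("primary and mirror are both down")


def run_failover_drill(primary: System, mirror: System,
                       current_load: float, max_load: float = 0.2) -> str:
    """Simulate losing the primary; return 'skipped', 'passed', or 'failed'."""
    if current_load > max_load:
        return "skipped"  # too busy to risk a drill right now
    was_healthy = primary.healthy
    primary.healthy = False  # simulated outage
    try:
        took_over = pick_active(primary, mirror) is mirror
        return "passed" if took_over else "failed"
    except RuntimeError:
        return "failed"  # nothing was left to switch over to
    finally:
        primary.healthy = was_healthy  # always restore the primary's state


if __name__ == "__main__":
    main_site, backup_site = System("primary"), System("mirror")
    print(run_failover_drill(main_site, backup_site, current_load=0.9))  # skipped
    print(run_failover_drill(main_site, backup_site, current_load=0.1))  # passed
```

Returning a distinct "skipped" result matters: a drill that never ran should not be mistaken for a drill that passed.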


If you need a helping hand, we can help. Use our Fractional CIO Services to make sure you have a disaster recovery and business continuity plan in place as soon as possible.
