Did you hear the joke about the day the internet broke? On July 19, 2024, it was no joke for millions of people worldwide who were traveling, trying to conduct banking activities or, worst of all, call 911.
Some airlines lost all connectivity, grounding thousands of flights and leaving travelers in airport limbo. Likewise, some 911 centers lost all computer-aided dispatch (CAD) connectivity, some lost radio system identifier functionality, and still others lost nothing. I spoke to several 911 directors who surmised that 85% of the 911 centers nationally were affected to some extent, with about 10% of them critically affected.
While many questions remain about the outage, what we do know is that the Microsoft/CrowdStrike failure exposed a GLARING vulnerability in our mission-critical systems.
From plan B to COOP
Before we talk about ways to harden those systems, let’s consider the simple tasks you do every day, many of them supported by digital systems and some connected to 911 CAD systems. Several 911 and fire service members I spoke to after the crash confirmed that the outage affected many of these platforms: inspection and preplan systems, call reporting and ambulance billing, to name a few. Some are innocuous enough to be down for a few hours, although certainly not call dispatching itself or even the billing systems. Many departments depend on billing revenue for daily operations, and errors or interruptions in billing could have a significant impact.
Whether or not this outage affected you, every single agency leader should be considering the impact of system failure. What’s your plan B? Do you have a backup digital system or, more likely, the ability to quickly switch to manual functionality? If not, this was your real-life wakeup call!
The entire incident got me thinking about how much we take for granted in the many functional and operational things we do every day, plus all the system failures I’ve witnessed over the past four decades. Have you ever trained with your crews on recovering from a pump failure? How about if the maxi-brake fails on the truck, the cables break on the aerial ladder, or a nozzle or facepiece fails? Broken standpipe and sprinkler Ys, unscheduled road work blocking community entrances, unknown executive protection details causing rerouting, water system failures. I could go on and on with the things I’ve personally faced and that fire departments across the U.S. might have to deal with at any time.
Take it a step further into the continuity of operations planning (COOP) you’re already doing, whether you know it or not. Mayday training is a prime example. We already do a great job of figuring out how to recover when things don’t go the way we expect on the fireground; it’s really no different than all these other situations for which we must plan. So, what’s your COOP? That’s right, I’m asking you to “what-if-it” to death! We already do it all the time, folks; we’re just usually solving the problems for which people call us, not our own.
Bottom line: Mission-critical systems MUST have failsafe systems in place. Things like CAD, radio systems, and even inspections and billing systems should have segregated platforms that don’t allow third-party automatic updates that haven’t been vetted.
One of the jurisdictions I used to work for learned this many years ago, after a computer hack through county systems resulted in a four-day CAD failure. The fix was to build a separate non-public platform for the mission-critical systems, which improved the system resiliency. When they built that, they added a “test platform” where any “automatic” updates or changes sat until they were confirmed in the test environment and then pushed through internally. We know that many municipal systems don’t have this kind of resiliency, and it showed for them and for countless other companies and organizations on July 19.
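For the IT folks supporting your agency, the “test platform” idea above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how that jurisdiction or any vendor actually built it; the names `Update`, `StagedPlatform`, `receive` and `promote` are hypothetical. The point is simply that nothing reaches the live system until it has passed testing and a named person has signed off.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Update:
    """A vendor update waiting to be deployed (hypothetical model)."""
    name: str
    passed_tests: bool = False          # set after checks in the test environment
    approved_by: Optional[str] = None   # internal staff member who signed off


@dataclass
class StagedPlatform:
    """Holds updates in a test environment until they are vetted (assumed design)."""
    staging: List[Update] = field(default_factory=list)
    production: List[Update] = field(default_factory=list)

    def receive(self, update: Update) -> None:
        # "Automatic" updates land here, never directly on the live system.
        self.staging.append(update)

    def promote(self, update: Update) -> bool:
        # Push to production only if the update is staged, tested and approved.
        if update not in self.staging:
            return False
        if not update.passed_tests or not update.approved_by:
            return False
        self.staging.remove(update)
        self.production.append(update)
        return True


# Example: an unvetted update is held back; a vetted one goes through.
platform = StagedPlatform()
patch = Update("vendor-security-patch")
platform.receive(patch)
held_back = not platform.promote(patch)   # still unvetted, so it stays in staging

patch.passed_tests = True
patch.approved_by = "CAD administrator"
went_live = platform.promote(patch)       # vetted, so it moves to production
```

The design choice worth noticing is that the live system has no path for receiving an update except through `promote`, which is the software equivalent of keeping the mission-critical platform segregated.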
Practice the plan
COOP planning isn’t new. Your emergency manager has likely been charged with developing or continually updating these plans for your jurisdiction. These are typically plans that explain how departments will function after major weather events, during disaster declarations, or in the midst of any other emergency. Reach out to your EM if you need help coming up with your own agency COOP, and PLEASE take this beyond the pen and paper (or computer) you use to make the plan. Don’t limit COOP planning to simply the big tent of your organization; dig into the weeds on this one: the little but critical things you do every day.
No matter how you develop the plan, and whatever the plan covers, it won’t do you any good if it sits on a shelf or is buried in a computer folder. Use it on a regular basis!
Don’t think this will ever happen to you? I’ve been there, done that, and I’ve got the proverbial T-shirt. It CAN happen to you, and it WILL likely occur at the moment you least expect it. Will you be able to continue your work without missing a beat? Could YOU rescue YOU from the perils of a similar outage?