Sean Deuby

To contend with the explosion of cybercrime and its impact on business operations, many organizations are updating their disaster recovery plans to include cyber incident response. Many of the processes and guidelines in traditional disaster recovery plans have changed little in years, sometimes in more than a decade, which makes them ill-suited to address cyber disasters. More important, at a business level, disaster recovery is just one aspect of a larger discipline: operational resilience.

Simon Hodgkinson, former Chief Information Security Officer at bp and Strategic Advisor for Semperis, emphasizes the need for organizations to view disaster recovery in the context of the overall viability of the business—including cyberattack prevention, detection, and response. 

“Disaster recovery is fairly narrow in its definition and typically viewed in a small timeframe,” he says. “Operational resilience is much broader, including aspects like the sort of governance you’ve put in place; how you manage operational risk management; your business continuity plans; and cyber, information, and third-party supplier risk management.”  

In other words, disaster recovery plans are chiefly concerned with recovery. Operational resilience looks at the bigger picture: your entire ecosystem and what can be done to keep your business operational during disruptive events.  

Mending a broken chain 

The broader focus of operational resilience requires organization-wide participation. You cannot simply leave it to a single department or team. Instead, everyone needs to be involved, from executives and the board of directors to individual employees in multiple departments.  

“Leadership needs to understand risk and to know the risk tolerance and risk appetite of the company,” says Hodgkinson. “That even includes things such as procurement functions and agreements with third-party suppliers. Resilience must be built into everything down to every-day workflows, and if a single supplier is insufficient to manage risk, then diversity of supply is a must.”  

Hodgkinson speaks to an important reality that many seem to forget: In today’s climate, it’s not just your organization under threat. Your suppliers, partners, and vendors are targets, too. If a major supplier is compromised or taken down, your business might go down with it.

“I’ve seen many interesting cases where a cyber event at a supplier rendered multiple organizations unable to fulfill their business outcomes,” Hodgkinson recalls. “For instance, consider a retail organization that is using a logistics provider to get products to their stations and that provider experiences stockouts. Avoiding such scenarios requires a broader perspective. In the context of operational resilience, every risk management scenario and process must consider the supply chain.”  

Putting the “operation” in operational resilience 

The U.S. Department of Transportation proposed a fine of nearly $1 million against Colonial Pipeline for “control room management failures” in connection with the 2021 cyberattack that disrupted gas delivery in the Eastern U.S., adding to the company’s revenue losses from the attack itself. The government’s take is that the company ignored operational resilience: Instead of planning how to manage and limit the scope of an incident, the organization simply shut down its process control networks the instant malware hit its systems.

“It’s a really interesting scenario,” says Hodgkinson. “But sadly, I don’t think it’s a unique one. I would say many organizations don’t fully understand the impact of operational technology in a cyber incident.  

“Ideally, organizations managing national infrastructure or critical supply would think much more about business continuity management and mitigation controls,” he says. “Such thinking starts with knowing their risk profile and planning appropriately to manage it. Organizations must also test what they can do in terms of shutting down their networks, ensuring they have the capability to sever the connection between information technology and operational technology, so malware doesn’t bring everything to a grinding halt.”  

Bridging the gap between IT and OT  

The technology running systems such as pipelines and refineries is distinctly different from that found in a typical office environment. These systems use different network protocols, take a different approach to the security stack, and place greater emphasis on critical safety issues. According to Hodgkinson, one of the biggest sources of friction for industrial businesses, and the reason operational resilience efforts so often fail, is a disconnect between information technology (IT) and operational technology (OT).

Neither department fully understands the other’s workflows and challenges. This disconnect needs to change. And that begins with a change in perception. 

“I think part of the issue at the moment is that cyber is still seen as special,” Hodgkinson says. “The discussion always seems to conclude with the assumption that the security team or IT department is managing a particular risk, so no one else needs to worry about it. We need to demystify cybersecurity. It’s only with the proper business understanding and risk ownership that you can put proper resilience mechanisms in place. What worked really well in my experience at bp was bringing engineering into cyber and cyber into engineering, giving each team expertise and perspective that it previously lacked.”   

The truth is that different teams have different priorities. The engineering team might be aware of the importance of cybersecurity but needs to prioritize procedural elements and safety-critical matters. By encouraging interdepartmental collaboration, businesses can determine how to facilitate the rollout of controls and strategies across each environment.  

“I think it’s ultimately all about context,” Hodgkinson says. “What is the business trying to achieve, and what outcomes is it trying to fulfill? How does it support those outcomes? What technology does it use? What matters to it in terms of confidentiality, integrity, and availability?”  

The importance of Active Directory security and recovery in building operational resilience  

Active Directory (and Azure AD, in hybrid identity environments) has a central place in the quest for operational resilience.  

“You’ve got a mechanism for prioritization, but interestingly, I think people often forget that the single most important application [across] dimensions is Active Directory,” says Hodgkinson. “Without it, you cannot fulfill any of your business outcomes. Active Directory is at the very core of your ability to operate and deliver business outcomes, and it needs to be part of your operational resilience strategy instead of being treated as an island.”  

Take an active role in operational resilience 

Disaster recovery plans that focus on natural disasters are insufficient for dealing with modern threats to operational resilience. Because the organization’s identity system is critical to keeping operations running—and is the prime target for cyberattacks—protecting it is paramount. By prioritizing identity system defense, organizations can address one of the most serious threats to operational resilience.   
