Last month we talked about “integrated” SOCs that bring together data from both IT and OT environments (IT SOC + OT SOC = an iSOC, December 2018). This is a powerful idea, not only because it can multiply the value of existing IT SOC investments, but also because correlating seemingly unrelated events across IT and OT produces novel indicators of compromise.
Let’s take the iSOC idea a step further. What are the practical challenges in getting one up and running? They come down to some key differences between IT and OT. They involve data collection, data analytics, and workflow. Let’s look at the key steps to take to bring your IT and OT SOC together:
1. Strategy and Implementation
A SOC (Security Operations Center) is there to help you keep your networks secure. But that doesn’t mean much unless you have a security policy in place. In a security-mature organization, the SOCs verify and validate the effectiveness of global and local security policies. This is particularly important in OT, where cybersecurity strategy development is not as advanced as it is in IT.
A security policy consists of a strategy and an implementation. Developing a security policy is a business decision, not a technology decision. It has to be made by balancing risks against costs.
The SOC’s job is twofold: it monitors the events generated by the security policy across the enterprise, and it verifies that the policy is being applied correctly in the face of steady changes to operations. This applies to traditional SOCs as well as iSOCs.
2. Data Collection
Data collection, or network instrumentation, is the first task in building a SOC, and it’s a significantly different challenge in the OT world. You generally can’t install software agents on machines, devices, PLCs, and other industrial equipment. It’s also very problematic to install extensive security-software stacks on engineering workstations or computers running applications supplied by OT vendors.
The general problem here is performance. Security software such as auto-updaters and A/V systems very often introduces unacceptable performance and compatibility issues. These issues are manageable in the IT world, but on the OT side they can compromise safety and process uptime.
What to do? Passively monitor your OT networks, and add some lightweight active monitoring that is tailored for OT. Again, your goal is to start with a sound security policy and make sure it remains effective over time.
This isn’t terribly difficult in principle, but it requires continuous monitoring of production networks, and knowledgeable experts to interpret the results. If you segregate your OT networks into Purdue layers, then it can be cost-effective to monitor only the upper layers. But this has to be evaluated in light of your network design.
3. Data Analytics
What kinds of things is an iSOC practitioner looking for? Basically, network-metadata and application-data patterns that fall outside norms for that environment. Any kind of unusual connectivity is a trigger. So is any kind of unusual data pattern within the applications. It’s essential to actively monitor the segregation boundaries (whether Purdue layers, firewall/DMZs, or both) for changes in network “reachabilities.” It’s also essential to detect recon activity within your OT networks.
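As a rough illustration of the two checks described above, here is a minimal Python sketch. It assumes you already have flow records as (source, destination, port) tuples from passive monitoring, plus a set of flows recorded during a known-good baseline period. The function names, the record format, and the `max_targets` threshold are all hypothetical choices for this example, not features of any particular product.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical flow record: (source host, destination host, destination port).
Flow = Tuple[str, str, int]


def new_reachabilities(baseline: Set[Flow], observed: List[Flow]) -> Set[Flow]:
    """Return observed flows that never appeared during the baseline period."""
    return set(observed) - baseline


def possible_recon(observed: List[Flow], max_targets: int = 20) -> Dict[str, int]:
    """Return sources that contacted more distinct (host, port) targets
    than max_targets, along with the number of targets they contacted."""
    targets = defaultdict(set)
    for src, dst, port in observed:
        targets[src].add((dst, port))
    return {src: len(seen) for src, seen in targets.items() if len(seen) > max_targets}
```

A real deployment would need a baseline that is refreshed as operations change, and a threshold tuned to the environment. A fixed cutoff like the one above is only a starting point.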
Remember, firewalling doesn’t solve your whole problem. The worst exploits against OT come from well-resourced and smart actors who take advantage of the necessary connectivities between IT and OT.
Once you’ve eliminated obvious problems such as engineering workstations with connections to email and messaging servers, then you’re pretty good with regard to nuisance and casual attacks. But if your operations have any kind of safety impact, then you’re a target for better-resourced attackers.
Exfiltration of production data is an important problem but not a very difficult one. You aren’t going to have too much trouble detecting flows of production data going places they shouldn’t be going.
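To show why this is tractable, here is a small Python sketch that flags flows leaving a production network for destinations that are not on an approved list. The address ranges and the (source, destination, byte count) record format are made-up assumptions for illustration. Substitute whatever your own monitoring produces.

```python
import ipaddress

# Hypothetical address ranges, for illustration only.
PRODUCTION_NET = ipaddress.ip_network("10.20.0.0/16")
APPROVED_DESTINATIONS = [
    ipaddress.ip_network("10.20.0.0/16"),  # the production network itself
    ipaddress.ip_network("10.30.5.0/24"),  # e.g. a historian in the DMZ
]


def unexpected_outbound(flows):
    """Yield (src, dst, nbytes) flows that originate in the production
    network and go to a destination outside every approved range."""
    for src, dst, nbytes in flows:
        src_ip = ipaddress.ip_address(src)
        dst_ip = ipaddress.ip_address(dst)
        approved = any(dst_ip in net for net in APPROVED_DESTINATIONS)
        if src_ip in PRODUCTION_NET and not approved:
            yield (src, dst, nbytes)
```

The check is deliberately simple: production traffic patterns tend to be stable, so a short allowlist of legitimate destinations covers most of the normal behavior.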
If you use access control rather than network segmentation to control your production networks, you will have some problems because devices and machines are usually headless, so there is no interactive user to authenticate. All of these issues need to be addressed by an iSOC.
4. Workflow
As an iSOC operator, you have to get the plant manager involved because his team is responsible for remediation, and he also needs to know everything that can potentially impact his ability to deliver. This is a very significant hurdle. Traditional IT SOC operators have the ability and the authority to remediate many if not most security events on their own. That’s generally not the case with OT.
The plant manager, for instance, is not a traditional element of SOC workflow, and he doesn’t necessarily have the knowledge to deal with security events. He needs analytics, forensics and context to make sure you’re not wasting his time. He needs to hear things like: “Host X is exhibiting signs of malware infection, so please re-scan it.” Or, “there’s an unusual device in that network, in this location. What’s it there for?” Or, “a particular process in Building X is showing anomalies that we correlated with a recent software update. You should take a look.”
5. Data in Context
To develop this kind of context, new kinds of analytics are required in the iSOC. Rather than looking only at computer and network metadata, the actual process results (often available in SCADA systems or process historians) need to be considered as well. This is critical to iSOC success, but you won’t get it out of the box with traditional SOC products. You will need to do some significant work over time, with team members from multiple groups within your organization.
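One simple form of this context is matching process anomalies against recent changes to the same asset. The Python sketch below pairs each anomaly with any change recorded for that asset in the preceding window. The record layout (timestamp, asset name, description) and the 24-hour default window are assumptions made for the example. In practice the anomalies might come from a process historian and the changes from a change-management log.

```python
from datetime import timedelta


def correlate(anomalies, changes, window=timedelta(hours=24)):
    """Pair each process anomaly with changes made to the same asset
    during the window immediately before the anomaly.

    Both inputs are lists of (timestamp, asset, description) tuples.
    Returns a list of (asset, change description, anomaly description).
    """
    matches = []
    for a_time, a_asset, a_desc in anomalies:
        for c_time, c_asset, c_desc in changes:
            elapsed = a_time - c_time
            if c_asset == a_asset and timedelta(0) <= elapsed <= window:
                matches.append((a_asset, c_desc, a_desc))
    return matches
```

A match is only a lead, not a diagnosis. It gives the plant team a concrete place to start looking.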
To bring this home, let’s look at the management questions. These are perhaps the biggest problem in standing up an effective iSOC. Who owns the problem of OT cybersecurity? Who’s looking at the logs? Who takes action? Who ensures that the right questions are being asked?
There’s no easy answer to this. Centralized IT and SecOps people have the necessary security and network knowledge, as well as the analytic tools. But the local plant operators have the deep process knowledge.
Just as crucially, the plant people are responsible for remediating issues as they are discovered. This has to be a partnership.
And as with any partnership, the first step is to communicate and build trust. That’s the subject of our next post, in which we’ll explore the OT security strategies used by some advanced industrial companies.