SOC coverage
Expanding coverage and using your data effectively is critically important to any SOC’s ability to carry out its responsibilities. With a lack of coverage or visibility, the SOC will have nothing to monitor or detect, and your organization will have a significantly higher risk level because of it. That’s why it is an important factor within a SOC to determine the current level of coverage, where the gaps are, how to prioritize the gaps, and how to fill those gaps by increasing coverage.
The first stage is to determine what level of coverage you currently have. Regardless of whether your organization is a newer SOC or a more established one, I make it a point to baseline coverage before making any other roadmap decisions. To do so, I take note of all data sources that are being ingested into whatever security information event management (SIEM) is in use, such as Splunk or ELK, and I write down and organize all alerts by function. In addition to this, I review or create a risk registry that categorizes the highest risks that face the organization and, if time permits, conduct a purple team exercise. As explained in the SOC environment responsibilities section for a red team engineer role, and will be further explained in future chapters, a purple team exercise is a joint project where offensive operations are carried out to test the visibility and efficiency of alerts in place. This can also be useful in proving the negative, which means proving that attacks can be conducted without any visibility, which helps push the importance of increasing coverage.
After you have determined where the gaps are, you need to prioritize them and create a plan on how to fix them. One of the strategies for prioritizing is to list out the types of logs and visibility each missing source would give. For example, if you are not capturing logs from your organization’s Virtual Private Network (VPN), and it’s a remote work environment, then you have limited visibility into User browsing logs. You would then write out the risks or potential attacks that could be discovered with those logs and write down any mitigations. For example, while you might not capture user browsing logs but capture endpoint detection and response logs, you’ll still have some visibility into the applications used, and downloads, so that might lower the risk of not capturing those initial logs. Another strategy is after writing down the missing sources and suspected size of the logs, to prioritize them based on an effort to ingest them. For example, if you can download an app on your SIEM, utilize an API key and creds, and everything is managed in-house, then as long as the size of the logs isn’t the problem, you should be able to easily configure the new ingest. I typically recommend picking a mix of harder-to-implement logs to start the process for ingest while also implementing a few that are considered a quick win. Another way to prioritize coverage gaps is based on what you are prepared to write detections for or base it on team expertise. That way, you know whatever source of data that gets added will immediately be used to increase coverage because, in theory, you don’t want to waste licensing space, time, or money if you can’t take advantage of the additional coverage.
Filling any coverage gaps after prioritization can be done by expanding your ingest license for your current SIEM tool or using a data lake to capture additional logs without ingest in your SIEM tool. Therefore, you’ll still be able to run analytics and only ingest absolutely necessary data within your SIEM tool. Expanding your resources and current license seems like the simplest path, but a common issue with that path is that it can be cost-prohibitive. That’s where other solutions such as log tiering or using a data lake come into play. By implementing a different technical solution, such as a data lake through providers such as Snowflake, you are able to have logs transferred and collected in a centralized location, and then run analytics over those logs, and only ingest detections on a subset of the logs. That way, you have the coverage, but your ingest costs are lower. One of the downsides to this approach is that you need to implement a new technical implementation, connectors, and so on, and that, of course, has its own costs associated with it. Of course, any approach that is taken to fill a coverage gap is reliant on proper cross-team collaboration.
Coverage is always an ongoing battle as your company grows and gains capabilities. It’s important to regularly evaluate your coverage for improvements and to document any gaps, as this will help your SOC team mature. As mentioned, there are multiple ways to evaluate coverage and identify gaps, which will allow you to make an action plan for how to remediate those gaps.