A SOC environment can have varying responsibilities. At the core, you have an incident response and alert triage. To accomplish comprehensive incident response, you want to have a strategy for 24/7 coverage, which can be accomplished by having a larger team and having shift patterns, partnering with a managed service security provider (MSSP), implementing a follow-the-sun model, or using an on-call roster. The option that you choose depends on the resources and infrastructure your team has. For example, if the organization has multiple offices located around the world, you would want to implement a follow-the-sun model, where you have multiple SOC analysts located at each office so that they can work their respective 9–5 type shift, and due to the various time zones and the addition of a few weekend shifts, that will give you the 24x7 coverage envisaged by the follow-the-sun model. The advantages of using an MSSP would give you the advantage of having a smaller in-house team while having support for alert triage and 24x7 coverage, but of course, that comes with a typically high price tag.
If you do everything out of one office and in multiple shifts, you would typically see 4 different shifts of 12 hours, A Day, A Night, B Day, B Night, where in 1 week, members assigned to A Day and A Night work for 4 days with 12-hour shifts, and the ones assigned to B Day and B Night work 3 days with 12-hour shifts, Each week, it would flip for the team that worked weekends.
The following chart will help you envision this:
Figure 1.1 – 2-week 4-shift rotation schedule
The benefit of that is that in a 14-day time period, you only work 7 days, but again they are 12-hour shifts. You’ll see this type of setup in a lot of government SOC environments. The final option is to just use an on-call roster. In theory, if a high-severity alert is triggered or someone needs to trigger the incident response process, they would use a call roster or a tool such as Ops Genie, which would page whoever is on call. While this is the cheapest option for coverage, it does come with a higher level of risk that alerts will go untriaged, or incidents might take longer to contain or mitigate. In my experience, I’ll typically start a new SOC environment following the on-call method while processes are established, and then as the organization and SOC mature, I’ll either contract with an MSSP or hire enough staff to use the follow-the-sun method.
In addition to incident response, another core responsibility is alert triage. To have alerts, you need to also establish detections that can be completed by SOC analysts or dedicated detection engineers. Alert triaging can sound boring and monotonous, but it is a situation where you are constantly solving a different puzzle. It’s the responsibility of the SOC to create alerts to be able to effectively analyze the logs to find potentially suspicious actions or traffic. The purpose of that is that it is not effective or scalable to be analyzing raw logs all day; the rate at which suspicious actions would be missed is astronomical. Therefore, in theory, it is a key responsibility to have the ability to create complex alerts, which ideally pull in information from various data sources to essentially complete the first few steps of triage for your analysts. You also need to test alerts and review them regularly for efficiency because you do not want your analysts to spend all day triaging false positive alerts. Another aspect that can help with this responsibility is setting up some type of case management system to ensure that analysts do not duplicate efforts by triaging the same alert, and that would allow you to track metrics such as time to ticket, time to triage, and time to mitigation, which can help you drive initiatives for your team.
Threat intelligence analysts can be like security engineers or other types of roles in the SOC, the same way that a SOC analyst might perform some threat intelligence activities such as conducting operational intelligence research, which would provide contextual information for potentially suspicious IPs, domains, URLs, among other types of activities. A threat intelligence analyst conducts research on a larger scale. Some of their responsibilities might include integrating threat feeds such as ones from Abuse.ch, FireEye, and GoPhish (there are literally hundreds of both public and private threat feeds) or creating a deny list of hashes or domains based on triage findings, which can be applied to your organization’s firewall or endpoint detection and response tools. Another responsibility of a threat intelligence analyst is to stay current on all cyber threats, advanced persistent threat groups, and vulnerabilities, which may include writing threat reports on vulnerabilities that apply to the organization or industry.
The NOC is a part of the SOC or can be its own team. The NOC contains team members called NOC analysts, who have similar responsibilities to SOC analysts. They are responsible for monitoring networks; however, they can be less security-focused. A great example of a NOC is where it monitors smaller managed networks and remediates or investigates changes that might not align with a golden image, or reviews whether a change management process was followed properly for any network changes. If a network change might be suspicious, the NOC would work closely with the SOC to triage and assist where needed in the incident response process.
Security engineers can have a multitude of responsibilities. First, they can assist with creating detections and alerts for analysts. Next, they work heavily on the concept of “trust but verify.” This means that you trust employees with the best of intentions, but review configurations to ensure that they are secure. One common example is that when testing in a development environment, it’s very easy to stand up a new Amazon Elastic Cloud Computer (EC2) instance or create a new security group and in the interest of time, leave its firewall rules wide open to the internet. A security engineer would ideally have alerts set up to trigger when a new overly permissive group is stood up and work with that initial person to restrict the group as needed for the project.
Red team operators similarly can be their own team and work closely with security engineers and the SOC or fully be part of the SOC. Red team engineers or operators typically work on offensive operations tasks such as penetration testing, purple team testing, and assisting with trust-but-verify. The penetration testing responsibility and purple teaming go hand in hand. A purple team exercise is where a penetration test is being conducted via a set of pre-specified use cases, and the SOC analysts are notified when the testing is going to begin, its status, and so on. That way, it is a collaborative effort to test out any detections and coverage and provide an opportunity to improve.
Managers of sub-teams or of the SOC, or really any leadership role as part of the overall SOC, needs to be unique. They of course must handle administrative tasks, such as capturing metrics, creating reports, and personnel management, but they also must have a strong technical background to be effective. They must plan out a roadmap that addresses the risk of the organization, allows the team to grow and follow a scalable model, while also supporting the team members. To provide for proper growth and planning, the managers should be able to project the next 6 months of roadmaps. I recommend planning out no more than 60% of your time for projects depending on the role, to account for interrupts such as incidents or unforeseen complications and be flexible in the amount of work or types of projects projected to allow for necessary pivots.
SOC environments clearly have many varying roles and responsibilities, which really depend on the specifics of your organization and environment. You should identify the expected responsibilities of your SOC and ensure that you have enough personnel with the correct skill set or potentially provide training to develop the needed skill sets to fill any role. For scaling purposes, I typically try to add one SOC analyst and one security engineer per 500 employees, 1 NOC analyst per 5,000 network assets, 1–3 total red team engineers, 1 threat intelligence analyst for every 1,000 employees, 1 manager for every 5–10 SOC employees, and a senior manager or director as needed. Of course, this is just a general rule of thumb and can be tweaked to meet your needs.