Adversarial Tradecraft in Cybersecurity

Preparing for Battle

In this chapter, we will look at various solutions that will prepare us to engage in a highly demanding cyber conflict. In the previous chapter, we noted that the principle of planning is critical to any advanced operation, especially games of conflict. As Benjamin Franklin said, "By failing to prepare, you are preparing to fail." This is especially true when dealing with an active cyber conflict. Effective usage of the required tools and infrastructure requires expertise, which can only be developed through the investment of significant time and practice. This chapter will cover preparatory steps that should be taken on each side before engaging in cyber operations. In this chapter, we will look at the difference between long-term strategic planning and short-term operational planning. We will explore how to break down and measure long-term planning and how to gauge operational efficiency. We want to develop an effective plan, wiki documentation, operational processes, and even code to automate these strategies, ensuring consistency and repeatability. From both the offensive and defensive perspectives, we will examine critical skills and infrastructure that should be included in these plans. This chapter will introduce many effective technologies and options, some you may already be aware of, and hopefully many new solutions. The aim is to reduce the complexity of computer security at scale by planning and using various frameworks to help automate and manage different tasks we may encounter. Our plans will have to be flexible according to operators and developers. As Eisenhower said, "The plan is useless, but planning is essential." In our situation, this means that while the exact actions taken may deviate from the plan, and while the plan should not be entirely prescriptive, a broad roadmap is critical for the direction of the team, especially during a high-stress event or time of crisis.

In this chapter, we will cover the following topics:

Communications
Team building
Long-term planning
Operational planning
Defensive signal collection
Defensive data management
Defensive data analysis
Defensive KPIs
Offensive data collection
Offensive tool development
Offensive KPIs

Essential considerations

Let's look at some potential roadmaps or solutions to include in your roadmap, for either cyber competitions or larger operations. We will start with essential properties on either side of our asymmetric conflict. Whether part of the offense or the defense, both sides rely on fluid communications and information sharing to carry out their operations. Both sides will be required to build and maintain a team for these operations. Furthermore, both sides will partake in strategic and operational planning. In this section, we are focusing on what the offensive and defensive teams have in common, and in later sections, we will focus on the differences between these unique teams.

Communications

As you begin to form your cyber operations team, plans should be documented to ensure the team has a set of broad goals and at a minimum, a shared direction or North Star. These plans should be written down for long-term reference, team collaboration, and development. While planning may seem like a task for managers, even individual contributors can develop their skills or tools by partaking in the shared collaboration and team direction. Planning is a team effort that unites the team in a shared vision. Both offensive and defensive teams benefit from having a wiki to store and share team knowledge, which may have been acquired on behalf of individual team members over a long period of time.

A knowledge base may also be a code repository such as GitLab, or a simple document repository such as an SMB share with documents. It should enable sharing within the team and could be publicly hosted, on a private network, or even ephemerally shifting as a Tor onion service. Ultimately, the intent is that we maintain a common medium where team members can share plans, tools, and information regarding tools, techniques, and policy. This location should be accessible and the solution should be semi-permanent with an emphasis on long-term team support. Choosing a good wiki or note repository is critical. You may want a publicly hosted product with an API to enable automated integrations; you may want a privately hosted service or even something with open-source code that you can review. This decision depends on your risk tolerance and any requirements for confidentiality. You may want a strong authorization feature set, such that you can restrict pages and workspaces from users or groups. Compartmentalizing different development and operational details will help mitigate exploitation or compromise of one of the operators. One feature that I've always appreciated is real-time, cooperative document editing, such as with Google Docs or Etherpad^[1]. Collaborative document editing can be very effective for the real-time editing and review of policy across distributed teams. Another set of compelling features could be integrated alerting and email updates. A good example of a self-hosted, open-source wiki application is DokuWiki, which is a simple and open-source wiki I've used on various engagements^[2]. While I've presented readers with many features and options, wiki solutions should be an easy choice for competition scenarios. In competition environments, focus on a simple, easily accessible solution that includes authentication and confidentiality controls, and promotes team collaboration.

A close second to knowledge-sharing technologies are real-time communication and chat technologies. Communication is the lifeblood of any team. The quicker real-time communications become, the closer they get to chat and the quicker team members can iterate, develop, and collaborate on ideas together. Chat capabilities are critical for your team, so it's important to choose the right infrastructure, or at least leverage what you have. Even if your team has the luxury of all being in person, they will still need to send each other digital information, logs, and files. Generally speaking, chat or communications should be considered as whatever your primary method for digital interaction with your team is, for example, email, IRC, XMPP, Slack, Mattermost, Zoom, or even more ephemeral communications such as Etherpad. One major consideration you will want is the ability to copy/paste directly into operations, so using something like traditional SMS may not work well for primary communications. You can take this a step further and supercharge your team's chat with chat-ops. Having the ability to issue group tasks directly from chat can give your team powerful automation abilities, such as the ability to publicly triage hosts or receive scan data from the networks, and share it in a chat room with the whole group.

I've used chat-ops on an incident response team in the past to quickly interrogate our entire fleet of machines for specific indicators of compromise, with the whole team present. We could also pull artifacts from hosts and quarantine machines directly from chat, making for very fast triage and response times while scoping an incident. It is advised that if you go heavily into chat-ops, you have dedicated rooms for this as the bot traffic can overwhelm human conversation at times. Another feature you may want to consider in your chat application is the ability to encrypt chat logs at rest, something that provides additional confidentiality and integrity to the communication. This is supported in the Slack chat application as a paid feature, known as EKM, or Enterprise Key Management. EKM allows you to encrypt messages and logs with your own cryptographic keys stored in AWS KMS, or Amazon's Key Management Service^[3]. Such features can be a lifesaver if part of your organization or infrastructure is compromised by allowing you to compartmentalize different chat rooms and logs. It can also pay to have a contingency chat solution in place, so that team members have a fallback if their chat is compromised, or they lose availability, for whatever reason. A contingency chat solution would preferably have a strong cryptographic method for proving authentication, such as GPG keys or using a solution such as Signal^[4]. Furthermore, having these pieces of infrastructure in place, including a knowledge base and an effective communication system, will greatly enable the team to develop their plans and further infrastructure cooperatively. These two components will be critical to both offensive and defensive teams alike.

Long-term planning

Long-term planning is some of the most important planning your group can do. It will allow you to set a theme for your group and give the team an overarching direction and avenue to express their innovative ideas. The length of your long-term planning cycle depends on the scope of your operations. For competitions, this could be an annual cycle, or you could start planning with only weeks leading up to the competition. Generally speaking, a long-term plan can be anything that helps you prepare for an operational engagement during your downtime. You can also iterate on these plans over time, such as adding or removing milestones as an operation develops and new needs arise. Some examples of long-term plans are three-year to five-year plans, annual plans, quarterly plans, monthly plans, and can sometimes even be preparations for a single event. As an example, from a competition perspective, this could mean using the months prior to develop a training and hunting plan. Higher-level planning may seem frivolous, but in general, the team should have an idea of its general direction, and it is best to write this down to ensure all are in agreement.

Over time, these larger plans may be broken down into milestone objectives to help team members digest the individual projects involved and to time box the different tasks involved. These milestone objectives will help determine whether progress is being made according to plan and on schedule. Time is one of your most precious resources in terms of economy and planning, which is why starting the planning sooner can help you tackle large tasks and potential time sinks. You will want to use your downtime to develop tools and automations to make your operational practices faster. For example, if your team is spending a lot of time auditing user access and rotating credentials, you could plan to develop a tool to help audit the users of a local machine and domain. Long-term planning should involve the creation of projects, which then encompass the development of infrastructure, tools, or skill improvements you want to make available to the group. Make sure you over budget for time on projects and milestones to allow for pivoting or error along the way. This also means making sure you don't overtask individuals or take on more projects than you have resources for. The benefit of long-term planning is in building up your capabilities over time, so do not rush your project development and burn your team out early. Similarly, if you fail completely at long-term planning, you may find yourself in a cyber conflict technically unprepared, scrambling to get tooling in place, or simply blind to your opponent's actions.

No plans are perfect. You need to be able to measure how close you are getting to your objective, and make course corrections if something is not going according to plan. Contingency plans should be available if goals, objective milestones, or metrics aren't being met. This will be a major theme of this chapter, as we touched on in the principle of planning. Throughout this book, we will be looking for ways to measure and test our techniques and make sure our plans are according to schedule. As we saw with the principle of time, the timing of our plans is absolutely critical when playing against an adversary, so we need to know when to pivot to maintain the advantage. If we start to get data contrary to our plan, such that some techniques may be detected, we need to modify our plans and potentially our tooling to support our new strategies. This is rooted in our principle of innovation: if our strategy is discovered, we will lose our advantage so we should be prepared to pivot our operations in that situation. Former UFC champion George St-Pierre said, "Innovation is very important to me, especially professionally. The alternative, standing pat, leads to complacency, rigidity and eventually failure. Innovation, to me, means progression, the introduction of new elements that are functional and adaptable to what I do"^[5]. As you go through your long-term planning, consider blocking off time for ad-hoc or unspecified research, tool development, or even process refinement. These stopgaps in long-term plans allow for pivots to be incorporated more easily. If a plan goes awry, these flexible gaps can be easily sacrificed for course correction. Otherwise, if the plan succeeds, these flexible gaps can be capitalized on for process improvement.

Expertise

One of the most important things you can prepare is knowledge. Hire for experience and talent, but also passion and team fit. It is important to build a quality team, in terms of expertise, experience, and capabilities, instead of a large quantity of bodies to throw at a problem. One of the unique aspects of computer science is the ability to both automate and scale solutions. This means an innovative engineer could automate a solution or part of a solution to a task that several people would otherwise perform manually. That said, you will absolutely need a team. There are simply too many areas of complex infrastructure and knowledge to manage with only a few people. Long-term plans should include owners in areas of subject-matter expertise. While you should generally be prepared for a wide set of digital environments, especially in regard to competition environments, it helps to know about your target environment and the types of systems that you will encounter there. In this book, we will primarily be focusing on Windows and Linux-based operating systems. Basic examples of expertise you could have on a CCDC team, either on offense or defense, include Windows strengths, Unix capabilities, web application experience, incident response prowess, red team abilities, and even reverse-engineering competencies. Lots of other skills also apply, such as vulnerability scanning, network monitoring, domain hardening, and infrastructure engineering abilities, to name a few. Areas you decide to invest in, in terms of expertise, should mirror your overall strategy and should be stacked toward your desired strengths. This means you should also invest in infrastructure and tooling that supports these areas of expertise and have members of your team cross-trained around your chosen expertise.

Contingency plans, in terms of the team's expertise, mean having the backup team trained in those areas and developing a training plan for cross-training resources. Cross-training can be in the form of weekly educational meetings, brown bags, or even quarterly formal training programs. Your group should be meeting regularly, which is a good time to exchange recent lessons learned. You can follow this up with individual training programs around skills team members are looking to improve. Formal training courses can be some of the best ways to upskill people quickly in areas you are interested in. SANS, for instance, is an incredible resource for cyber education, but the price is significant if you're on a tight budget^[6]. Many free resources also exist in terms of cyber training, but the most important thing is to give employees dedicated time for training. One of my favorite free resources for low-level technical skills is https://opensecuritytraining.info/, which includes over 23 high-quality courses, many with videos^[7]. Another interesting site for free education courses is Cybrary, while these courses aren't as in-depth as OpenSecurityTraining, their Career Paths include many relevant skills and their courses have a high production finish^[8].

You can even turn this into value for the whole team by having them present on the topics to the group after they learn a new technique or skill. Even experienced practitioners will need time to practice new skills and continue their education. While training is great, nothing is a substitute for real experience. Newly trained team members will bring a lot to the table, but you need to make sure they put those skills into practice as soon and for as long as possible. You should have junior team members shadow experienced team members on their operations or in practice, if time allows. I also like to use new members to make sure documents are up to date and have them take additional notes for the wiki during these shadow sessions.

Operational planning

Operational planning is anything that helps operators prepare and navigate through an upcoming engagement. Operational planning can take the form of runbooks to help operators with basic information, workflows, or technical tasks. Operational planning can also be high-level goals and tenants of a mission, such as a rule that operators should abide by. This planning allows for smooth processes and for operators to help themselves when they get stuck. Operational planning can be both generic to all operations or specific to a target engagement. Tailored plans should be crafted per engagement, which includes overall goals and special considerations for that operation. In real operations, this would typically involve a lot of reconnaissance, making sure the target technologies or threat actors are appropriately scoped. In a competition setting, this can look like making a spreadsheet with every host in the environment and highlighting the ones running critical services. You can then assign team members tasks on servers and move through them systematically, either triaging or exploiting them. Operational planning can also be thought of as policy or procedures for the team. This level of planning, creating a policy with supporting runbooks, can also help make sure processes are operationally secure. Automating these individual operations within the plan will be a great innovation in any team. For example, one operational runbook may instruct operators to use a VM for operations, to help reduce endpoint compromise, malware spread, and operator identification. A team member could innovate on this policy by creating that golden VM image for the team, and potentially automating the deployment of these VMs for other team members. Furthermore, these VMs could also come with all of the appropriate tooling and network configurations the operators need. Any of this automation should also be documented, and the original runbook should be updated with the new details of the automation. If the project grows enough it should be considered for turning into a supported long-term project, with a proper development life cycle.

Ultimately though, runbooks should provide guidance on a technique or process to a team member looking for clarification on an operation. The runbook should link to external information that enriches the subject and provides context as to why a tool or process may determine something. Some of the most useful runbooks also provide anecdotal experiences, links to corner cases, or links to references where team members borrowed a previous implementation. Runbooks could also include common flags to look out for if a process is going wrong or if deceptive tactics are at play. Plans should then include contingencies, such as creating an incident if you think there are deceptive practices at work or pivoting to a live response if you think tools aren't reporting properly. Keeping runbooks focused and atomic in terms of scope will help make them flexible and chainable into different operational plans. Maintaining operational goals and runbooks is one way to prepare your team for the high-pressure and fast-paced action of cyber conflict, especially in a competition setting.

Another operational planning consideration is finding a way to measure your team's operational progress. KPIs, or key performance indicators, on the group can help understand how they are working overtime. It is often best that KPIs or metrics are recorded and collected automatically; automation will save the painstaking review process of gathering metrics for management. Because the game of computer security is asymmetric, we will look at individual metrics either offense or defense can use to measure their operations. Even within offense and defense, KPIs can often be very role-specific since you are evaluating role performance and efficiency. That said, later sections in this chapter will have some example KPIs for different roles. It is also worth mentioning again that computer science is extremely complex, so sometimes the KPIs may be capturing other factors and not truly measuring the targeted element due to complexity at play. A good example of this may be a defensive team trying to achieve the fabled 1/10/60 time of detection, investigation, and response speeds^[9]. If they are using a cloud-based EDR service, there may be a delay in the ingestion and processing of the logs with that service, such as three to five minutes to receive and process an alert in their cloud. That means that no matter how finely tuned the defensive team's infrastructure, they will never be able to detect an incident within a minute of it happening while using such a cloud service. It is important to understand what's feasible in your environment when setting metrics and that may even require several rounds of measuring before you determine a baseline.

How your group plans to end a specific engagement is a forethought that should be done during the engagement planning. While we will have a specific chapter on this at the end of the book (Chapter 8, Clearing the Field), we will want to plan for how operations successfully end.

From the defensive perspective, this means planning and implementing capabilities that will allow you to evict an attacker from your environment. The defense needs the ability to do root cause analysis (RCA) and understand how the attacker got in and patch that vulnerability before they can exploit it again. From the offensive perspective, this could help determine when we will exit the target environment. You will also want to plan for the event that the operation takes an unexpected turn or goes in the opponent's favor. For the offense, this means planning how we will respond if the campaign is uncovered, our tools are exposed publicly, or even our operators are identified. This is often thought of as program security (Network Attacks and Exploitation: A Framework, Matthew Monte, page 110). As Monte describes it, "Program security is the principle of containing damage caused during the compromise of an operation. And no matter how good the Attacker is, some operations will get compromised. […] You do not want the failure of one operation impacting another." It is vitally important to consider how the offense will exfiltrate and exit their target environment after they have reached their goal. Similarly, the offense should consider what a successful response by the defense looks like and when they will choose to exit the environment or spend more resources to reengage. This is equally important to consider from a defensive perspective unless you want to relive a past compromise event.

This chapter moves from planning to setting up the infrastructure and tooling that each side should have in place to support their operations. Both sides will have a great deal of standing infrastructure. Tooling is critical to the team's operations, but due to the asymmetric nature of the game, I will cover each in their own section, one for each side. Even if you are not on the other team in your average role, I urge you to understand their tooling and infrastructure. As Sun Tzu says, "If you know the enemy and know yourself, you need not fear the result of a hundred battles." I cannot overstate the importance of understanding your opponent's tools and capabilities, as this outlines the options your opponent has available. I think Dave Cowen, the leader of the National CCDC red team, is a great example of this. For his day job, Dave is an incident response director, aiding defensive operations against real attackers. In his free time, Dave leads the volunteer red team, letting him think like an attacker and explore offensive techniques hands-on. Furthermore, if you can exploit your opponent's security infrastructure, you will gain a massive advantage in the conflict. In the following sections, we will see how much of the technology on both sides involves a great deal of standing infrastructure that in turn becomes a potential target itself.

Defensive perspective

In this section, we will focus on defense-specific planning, skills, tooling, and infrastructure. A lot of these tools may be used in one-off analysis tasks or in conjunction with other tools to achieve a larger team goal. We will see how spending time preparing cooperative infrastructure during the planning and preparation phases before an engagement will save us precious time during live operations. As Leo Tolstoy said, "The two most powerful warriors are patience and time." I interpret this as: if we use our time wisely, patiently setting up defensive systems, we will be far more powerful when we encounter the opponent. I have heard defense referred to as a series of web building, analogous to a spider building its web. Following this analogy, the net must be wide enough to cover all of the space they are tasked with protecting, but also flexible enough to alert them when the net has caught something. While it takes time for the spider to build their net, the result is a greatly improved ability to catch its prey. Still, the net must be maintained, and thus requires expertise and resources to make it a viable strategy. Further bolstering the idea that preparation is key, it is important to remember that the offense only needs to compromise its target once to get a foothold in the network. For the defense to be successful it must defend against 100% of the attacks on it. As we know, this is nigh on impossible, so we invest in preparing response processes to ensure that when we are inevitably compromised, we can identify the threat, then contain and eradicate it effectively. We can elaborate on this by creating a network of machines that identify compromise, buffering the systems we are protecting. This is a callback to the concept of defense in depth we highlighted in the last chapter. If the breach of a single system is near impossible to prevent, by creating a network of hardened systems we can detect the offense as they pivot through the network toward their goal. By invoking multiple defensive technologies in our strategy, we greatly increase the likelihood we will detect the offense at various stages in their attack chain. During planning, it is important to prioritize the infrastructure within the strategy tailored to your group's needs. Knowing that we may lose critical infrastructure at some time during an event, we should keep in mind alternative options in the event this occurs. This is a critical part of the contingency planning mentioned earlier, and in corporate parlance would be part of our business continuity planning strategy. In line with best practice, we should also use these alternative tools and methods to verify our primary tool's results to ensure that these tools aren't being deceived. It is common for an attacker to backdoor a system or deploy deceptive techniques to alter forensic tooling output, with the intent of confusing defenders.

It is well understood that the best investment a defensive team can make early on is in security log generation, aggregation, and alerting. In order to achieve these capabilities, we must generate logs from all critical systems and store them somewhere centrally.

Security logs can later be reviewed during a significant event, alerted on, and utilized for forensic reconstruction. Security collectors or agents are typically used to generate data from active infrastructure. I've always bucketed digital security collection into three categories: network-based telemetry, host-based telemetry, and application-specific or log-based telemetry. We will be considering all three in this book as they each have different strengths and weaknesses. We will use various agents to aggregate information from these sources into a central location for analysts to triage. For example, network monitoring can be helpful for identifying unknown devices operating in your network whereas application-specific logs could reveal detailed protocol information showing fraud or abuse in an application. In competitions, I like to prioritize network-based visibility, then host-based, and lastly application-specific telemetry, for the ability to spot a new compromise. Host-based agents or collectors are extremely helpful for investigating individual compromises, particularly with getting details around the infection and responding on the machine. Application-specific security metrics are likely the most important in a corporate incident as they are likely tied to your core business practices and could show the attacker moving on their goals or abusing your data, even if they exploited the principle of humanity and compromised legitimate users. For example, if your core product were a massive multiplayer online game, adding security logs and metrics to your game would likely uncover direct abuse faster than searching for an internal compromise. That said, this data is less useful in an attack and defense competition as the focus is typically on network penetration with less complex web applications in play. We will begin by looking at security log generation at these various sources, examining host-based, network-based, and application-specific telemetry, then we will cover some additional log aggregation, sorting, and search technologies. The logging journey does not stop there – after alerting on events we will look at post-processing and enrichment, including artifact extraction, storage, and analysis. The following is a slimmed-down list of some high-level projects you may want to consider when planning the toolsets for your defensive team. Within each of these areas, there are a number of techniques and tools that can be implemented. I will primarily focus on free and open-source solutions.

Signal collection

To start, let us look at host-based security event generation and collection. In this space, there are many traditional solutions, such as anti-virus providers like McAfee, Microsoft Defender, Symantec Endpoint Protection (SEP), Kaspersky, and ClamAV to name a few. While often thought of as depreciated, these agents can still produce particularly useful alerts on known malware and offensive techniques. Some platforms, such as SEP and Kaspersky, can also provide alerts on statistical anomalies, like when attackers use a crypter or packer to obfuscate their payloads.

While these solutions can be very useful in a corporate environment to deal with commodity threats, they will be less useful in attack and defense competitions where the offense may leverage custom malware. There are also endpoint detection and response (EDR) platforms, which are a more modern evolution of previous anti-virus scanning solutions. While EDR platforms incorporate many of the same factors as a traditional AV, one major differentiating factor is these tools let operators make arbitrary queries on their data. EDR agents also enable remediation and response actions to be taken on the impacted hosts remotely while the host is still online, which is known as a live response. These capabilities can be extremely effective when dealing with a live attacker, by leveraging the real-time ability to counter the attacker's plans on a specific host. Another core value of these tools is recording at a higher level of granularity all actions taken on the target. For example, out-of-the-box Windows and OS X may not record process creations, command-line parameters, modules loaded, and so on. EDR agents can be configured to record detailed process telemetry and to send this data to a central server, allowing for alerting and reconstruction of the incident. Reconstruction of the breach is key to ensuring we can prevent further occurrences of the threat. This is a key theme when performing incident response and is known as root cause analysis (RCA). As we will see in Chapter 8, Clearing the Field, if we attempt to remediate an intrusion without performing RCA, we risk only partially remediating the compromise, tipping our hand to the attacker and allowing them to change their tactics. With EDR agent data it is easy to investigate a single host, then search for the techniques or malware used there across the rest of the hosts or fleet. Using EDR agents also enables a way to interrogate all hosts with a security hypothesis to help determine if better alerting can be written, a process known as hunting. We will visit these hunting techniques much more in Chapter 7, The Research Advantage, where we look at discovering new alerts, forensic artifacts, and even log sources. EDR agents can also be used to collect rich behavioral data about processes, such as which files, network connections, and handles a process has open. Behavioral data can create some of the strongest alert types by ignoring fungible variables, such as process names, and focusing on metrics including how many files or network connections the program makes. Such behavioral technology could detect abstract techniques like port scanning or encrypting files for ransomware goals, regardless of the tool implementing the technique. Another popular technique for detecting compromise in corporate environments using EDR solutions is known as anomaly detection. This involves sorting all of the processes or executable telemetry in a given environment and going through the outliers. Often starting with the fewest occurrences of a given executable or process in an environment will uncover malicious anomalies. There are many popular commercial offerings in this space, such as Microsoft's Advanced Threat Protection, CrowdStrike, CarbonBlack, and Tanium, to name a few. One of the issues with commercial offerings is they are often configured to alert on as few false positives as possible.

This is important in a long-term deployment as we want to minimize analyst fatigue with unnecessary alerts. However, in a competition setting, where the time frame is shorter and we know we have an attacker in the environment, we will want to configure our host-based security collection to be as verbose as possible. By having sufficiently verbose endpoint collection we should be able to triage more esoteric hacker techniques or debug strange processes we encounter. I like using similar open-source EDR applications such as OSQuery^[10] for extra enrichment and GRR Rapid Response^[11] for additional investigative inquiries. Other very popular open-source EDR frameworks you could consider are Wazuh^[12] or Velociraptor^[13]. Both frameworks have a long tenure in the security space and have evolved over a number of years, making them robust and fully featured. Regardless of the solution you choose, host-based signal enhancement is great for digging into an incident on a specific host or searching the entire fleet for an indicator.

Network monitoring is an immensely powerful source of security data. By setting up strategically placed network taps you can see which devices and protocols regularly communicate over your network. As previously mentioned with host-based data, network telemetry can be used to spot anomalous or blatantly malicious traffic by sorting the protocols or destinations in your traffic. A good network monitoring program can be used to slowly harden the network posture by enabling the sysadmin to understand what normal traffic is, thus enabling them to reduce highly irregular traffic at the firewall. In a competition environment, this can be as simple as allowing the scored protocols through the firewall, reducing the immediate set of traffic your team needs to analyze. Following the principle of physical access further, by controlling inline network traffic using an IPS technology such as Suricata or an inline firewall, you can block all traffic from a compromised machine or isolate it to a containment VLAN. When you quarantine or isolate a host to prevent further lateral movement, you can have preconfigured firewall rules that still allow your team to triage the host. These network monitors can also be used for signal analysis, such that the defense can observe anomalous network transfers even if the offense attempts to tunnel those communications through another protocol or host. Throughout this book, we will be using a combination of Snort, Suricata, Wireshark, and Zeek to look at network traffic. Snort is nice for identifying known malicious patterns of network traffic; we will use Snort much like our traditional AV enhancements^[14]. Suricata is similarly useful for helping us identify malicious behavior patterns in traffic^[15]. Zeek is great for breaking down different protocols and providing detailed logs about the protocol flows^[16]. These core monitoring applications will serve as permanent solutions we can deploy around the network, providing powerful capabilities if we can get the infrastructure in place. Network monitoring is also very good for identifying problems on the network, making it a strong debugging resource.

In a competition setting for example, if a service is showing as down on the scoreboard, the defensive team can use their network monitors to quickly understand if the issue is a routing problem on the network or an endpoint issue on the affected host. While endpoint detection can be like searching for a needle in a haystack, network monitoring is like watching traffic on a highway – even at high speeds, it is often much easier to observe malicious behavior and where it is originating from. While you will occasionally receive firewall and network monitoring appliances in a default network architecture or competition environment, you can almost always rearchitect the network or routing to put one in place. In line with the principle of physical access, if you own the network switches you can likely hook a device onto the SPAN or mirrored ports^[17] to receive traffic on your monitoring interface. Furthermore, you could route traffic through a single host and turn it into a network monitor using a command-line tool like tcpdump^[18]. This can quickly be done with the following one-liner, which will capture all traffic at a given interface, in our case eth0:

$ sudo tcpdump -i eth0 -tttt -s 0 -w outfile.pcap

Granted, you will want to make sure the machine you collect traffic on has sufficient throughput and disk space to store the data. Collecting raw pcap data can build up very quickly, meaning you will need proper storage or to keep an eye on any live collection. A great one-off tool you can use for on-the-fly network traffic analysis is Wireshark^[19]. This tool is very popular because it comes with a GUI that will colorize protocols and allow operators to follow select TCP streams. Wireshark also includes a modular plugin framework, so if you encounter a new protocol you can reverse engineer it, then include the protocol dissector in Wireshark for it to decode^[20]. While you can easily use these quick solutions, you will likely want to invest in infrastructure here to really harness these capabilities over the long term. That said, Wireshark even comes with a command-line alternative called tshark, which is a headless network collection and parsing tool. tshark can do a number of analysis tasks on raw pcaps, but it can also collect network events for you as well. You can even use tshark to perform modified collection and produce special logs like the following, which will give all source IPs, destination IPs, and destination ports regarding traffic to and from a machine^[21]:

$ sudo tshark -i eth0 -nn -e ip.src -e ip.dst -e tcp.dstport -Tfields -E separator=, -Y ip > outfile.txt

Another important source of logs that may be available to you is application-specific security enhancements. Often security isn't thought of with the initial service and instead is added on as a product in-line, as in it is put in the middle of the network route to access the service.

This may look like a custom security appliance in your network, such as an email security gateway or a web application firewall in front of an important web app. These tools will also generate logs and alerts that are critical to your security program. For example, phishing is seen as a prominent vector into many organizations, so those organizations may use a product such as Proofpoint or Agari to screen incoming emails for security information and potentially alert on phishing emails. These tools could also provide application-specific response capabilities, for example with email, it could offer the ability for users to report emails or for network defenders to mass purge selected malicious emails. These security tools also cost a significant investment, both in terms of budget and expertise, so it's important to make them first-class citizens and give them the proper attention if your organization has decided to invest resources in them. Often, they are sold as a license or service subscription and come with vendor support, meaning you should prioritize their configuration and use the support resources if you've made the investment or been given the technology in a competition. A close relative to these security application logs are abuse metrics related to your business's core service. For example, if your organization runs a large custom web application that supports e-commerce or virtual hosting, you will want detailed metrics related to the use or abuse of this service. These could be metrics such as the number of transactions an account makes or top users of your API service. Just as we saw with other log sources, similar behavioral and anomaly detection methods apply. From a behavioral perspective, you could look at how quickly users navigate your pages to determine if automated abuse is occurring. From an anomaly perspective, you could sort data and view login attempts from similar IP addresses, to detect account takeover attempts of your user population. Another important log source to review is internal tooling and applications. Reviewing your own tool's logs for abuse or anomalous logins can help determine if someone in your group was compromised or if you have an insider threat. While auditing internal tool logs will likely not take as high a priority during an active network compromise, overlooking these logs would be considered a grave mistake in ensuring your operational security.

Finally, active defense infrastructure can help us coax the attacker into revealing themselves within the network. Active defense tools are solutions that seek to deceive attackers into thinking a piece of infrastructure is vulnerable, in an attempt to lure the attacker out^[22]. Active defense infrastructure will be a major theme of this text, giving the defense an advantage through setting traps for the offense. We will see how showing the false will help us detect the attacker by deceiving them to think we are vulnerable when we are not. Practically, this means using tools like honeypots, honey tokens, and fake infrastructure to trick our opponent. While this can be thought of as extraneous infrastructure, it comes back to the principle of deception we covered in the last chapter.

By creating fake but believable targets that are easy for our target to hack, we can bait them into divulging themselves and giving the defense the upper hand. This investment is a bet on the effectiveness of the deception. I would consider this solution to be an additional tactic to the collection methods already described, but probably not a great standalone tactic. The real secret to creating effective honeypots is to make sure that there are readily available paths that lead to the honeypot so that if an attacker were to compromise a typical user of a machine, they would naturally discover and be drawn to the trap. There are tons of examples within the Awesome Honeypots GitHub repository (https://github.com/paralax/awesome-honeypots), but the important part is picking an applicable solution to your network. Honeypots or tokens have been made for all kinds of applications and their use in your network should be strategic; otherwise, they will sit undiscovered for years. That said, if you can make a juicy target that is easy to discover, you may find it to be an excellent indicator of an attacker on your network.

Data management

Log aggregation is one of the biggest time-saving tasks a defensive team can focus on. In my opinion, logging pipelines are one of the unsung heroes of modern defensive infrastructure. Simply, logging doesn't get the attention it deserves in most defensive publications. In many corporate IT deployments, logging is ubiquitous and transparent, already happening in the background of most production environments. If your organization can piggyback on an existing logging pipeline it may save your team a great deal of infrastructure management. In a competition environment, this infrastructure is far less likely, and if you do manage centralized logging it will probably be through chaining simple tools. Logging can be as simple as sending everything to a single host, or as complex as deploying a tiered security information and event management (SIEM) service. Often logging pipelines are incorporated into the SIEM application, but it doesn't always have to be the case and logging can benefit from decoupling. Services like Filebeat or Logstash may be used to supplement an all-in-one solution such as Splunk^[23]. Splunk, a vendor solution, can also quickly provide log decoration and normalization benefits before the logs ever reach the SIEM. Regardless of whether you use the full SIEM or not, harnessing a logging pipeline means you can edit your logs and standardize them as you collect them. If you're not using a centralized logging solution such as a SIEM, you can still use a logging pipeline to enrich logs on a single host or send them all to a single location. Centralized logging can even be as simple as using default capabilities such as rsyslog, SMB, or even Windows event log^[24]. The reason I say simple log aggregation is different from sending to a SIEM is that there is a lot of power in indexing, searching, alerting, and even creating rich displays of our data that a SIEM gives us.

From a consulting perspective, this could look like a logging pipeline to support scripts that rapidly collect and index forensic data to scope an incident. Regardless, being able to triage issues across your target environment from a single host is a huge time-saving feature.

A full SIEM is a powerful investment to help sort and search logs. Products like Splunk or Elasticsearch can provide rich capabilities in terms of searching for and combining multiple data sets. In a competition environment, this may be more of a dream unless the hosting organization provides one or allows you the infrastructure to host one. That said, this is a critical piece of technology in any real defensive posture. The ability to index multiple log sources, search them in concert, transform data sets on the fly, combine them with external data sets, and display rich tables or graphs of data is invaluable for this type of analysis. As we briefly touched on earlier, Splunk is a dominant service in this space because of its ability to index and transform data. Splunk has many advanced features such as User Behavior Analytics (UBA), which correlates logs to perform anomaly detection on various user activities and detect account compromise^[25]. Splunk also offers an integration platform where users can write plugins to use data with custom services or provide unique displays in the UI. An open-source alternative to Splunk is HELK^[26], which is a free option providing similar functionality for those on a budget. HELK is a combination of many open-source logging technologies such as ELK, Elasticsearch, Logstash, and Kibana, and shows how the principle of innovation can easily be applied to create security-specific solutions. Throughout our efforts in this book, we will primarily use Elasticsearch with the HELK stack because it is open-source and easily available^[27]. If you are looking for a slimmer deployment, ELK also has built-in alerting functionality as standard. We can also look at using a special SIEM just for indexing and analyzing our network-based logs. A tool such as Vast can both ingest Zeek logs and raw pcap to provide search capabilities on these data sets^[28]. Logs will be the base element we will ingest and work with throughout the network. A SIEM can help normalize those logs by mapping them to common elements, so you can perform intelligent searches across all your data and not just individual log sets.

A nice to have would be a security orchestration, automation, and response (SOAR) application to help automate alert enrichment. In many large deployments, SOAR applications act as the connective tissue tying a myriad of other appliances into the SIEM. Such an application will reach out across your network and correlate the alerts with various information to get more context. This tool could enrich elements of the alert with more data, such as the user in an alert with all of their attributes from Active Directory. A good open-source example of a SOAR platform is Cortex^[29].

These larger applications that tend to integrate lots of infrastructures are a big investment but the reward in enhancing the triage for a professional security operations center (SOC) is unparalleled. This application will act as a central hub that allows analysts to quickly interrogate and act on various pieces of infrastructure throughout the environment. Not only will analysts get more information with every alert, including rich context, but they will also be able to triage incidents quicker and with automated responses, saving time in operations. A single pane of glass for all decorated event triage context is critical during a high-stakes event. Switching between many tools, technologies, or UIs is time consuming and error prone. SOARs help the defense solve this problem in a swift and repeatable way.

A separate component of the SIEM or SOAR should be a set of events or alerts, along with plans to review and update these events regularly. Ideally, these should be decoupled from the SIEM or SOAR application so that the team can review and curate the alert set independently. You can use infrastructure to help you manage this; using a project such as Threat Alert Logic Repository (TALR)^[30] can help manage alerts by organizing them according to features, tactics, or behavior. Using such a project could also help bootstrap your detection logic by giving you some good starting rules. OpenIOCs, or indicators of compromise, were a generic type of alert format invented by Mandiant in 2013 in an attempt to standardize the alerting format^[31]. I bring it up because the OpenIOC format included what I consider an essential feature of alerts, which is combinatory logic. A major failing of traditional antivirus solutions is taking too simplistic of an approach in their detection logic; by not combining multiple sources of data or context they often fail to detect more advanced attacker techniques. OpenIOC logic aims to provide defenders with a rich set of logic to create alerts that can take multiple pieces of evidence into account. Regardless of the event syntax or format you use, it is important to both standardize your detection logic and create robust event logic. This will help with reviewing existing alerts and strategizing future detection initiatives. Playbooks are another set of solutions your group can catalog and review. Playbooks are a technology that can help enhance alerts, by automating the associated actions that should be taken if that alert triggers in your SOAR^[32]. Your alert logic should be essential to your defensive organization, as this is what your operators are trained to look for as malicious activity. This should be written down and codified instead of kept as tribal knowledge, both to help disseminate the information among the team and regularly review its merit in terms of detection logic. By organizing your alert logic, you can begin to assess your gaps and where your team may be weak in terms of detection logic. If you have an offensive operations team, this would be a great place to have them help perform adversary emulation and brainstorm potential detection or alert logic. Reviewing popular techniques or covering gaps in your operating team's detection logic is a great way to prepare for both cyber competitions and real conflict.

With a real defensive operation, you would be remiss to not include an incident response case management system or alert management system. In a corporate deployment, this would be used between shifts and over a long time to track all ongoing cases and to make sure nothing is being dropped. In a competition environment, this can be as simple as a strike list of potentially compromised hosts or hosts you need to triage. Whatever your desired workflow is, rapidly triaging and resolving alerts, escalating alerts into larger incidents potentially for a different team, or having a system where you can track which cases (and steps in a given case) are being actively worked, is a vital system. This can be as simple as a spreadsheet to track infected hosts or remediation tasks. These spreadsheets can include tabs per host regarding who is triaging which pieces of evidence at any given time. Or this can be a standalone system with a rich application where users can upload and tag additional pieces of evidence to a case. ElastAlert comes built into HELK, which makes it an easy choice for deployment and testing ^[33]. We can also use ElastAlert in TheHive for our alert management system, as it comes built in and makes it easy to integrate with other deployed systems. ElastAlert can then send operators emails when they trigger on a known alert, and the alert triage flow can be handled in TheHive^[34]. By using TheHive we can integrate our alerts into other standalone services we may have, including integration to Cortex, allowing us to take actions directly from alerts. Using TheHive, with Cortex enrichment from the rest of our infrastructure, will be a powerful single interface that operators can use for alert investigation and resolution; otherwise, they may have to bounce between many systems in triaging an alert or incident.

A further set of nice to haves would be any form of intelligence aggregation application. Applications such as MISP can take multiple intelligence feeds and integrate them into a single location where your team can curate and track intel indicators^[35]. Collaborative Research Into Threats (CRITS) is another such application that can aggregate multiple intelligence feeds and map connections of artifacts with its internal graphing database^[36]. Professional intelligence services can also be purchased, which manage the intelligence feed curation on your behalf; however, these often cost a significant annual price. Hosted intelligence platforms can then be directly integrated into the SIEM or SOAR application to provide threat enrichment if there is ever an intel indicator match. Such an application could also run artifacts through your malware triage platforms, copy artifacts to your forensic evidence store, and even start an incident response case in your case management system if properly integrated. While aggregating external threat intel is extremely powerful, another useful feature of these applications is that they document your detailed notes and comments about threat data. The knowledge that another team member previously investigated on a specific threat, or saw similar indicators in a different alert, is powerful information to share within a team.

A private forensic evidence management system is another consideration for any defensive team. A natural follow-on to an incident response system is a system to store and catalog forensic artifacts that are discovered. This can help dramatically in post-analysis, attribution, or gaining an advantage over the opponent. This will likely be seen as an extraneous consideration until other systems are in place, but even a simple solution here can pay dividends in years to come with evidence management and malware analysis. Ideally, this should be integrated into the case management system, but it can be as simple as a network share or SFTP server where artifacts are dumped for backup purposes. You could also edit the permissions such that users couldn't update or delete other's evidence, perhaps by making files immutable after they are written. Such a write-once system would make sure artifacts or evidence is not accidentally overwritten or tampered with. These simple innovations could assure the integrity of artifacts and harden the authorization of the application. On Linux this can be done by setting the sticky bit, so only the file's owner or root can edit or delete the file. You can set the sticky bit on a directory or share with: chmod +t dir. You can take this further by making files immutable so that even the owner can't edit or delete the file with chattr +i file.txt. Ideally, you will also want something to hash files when they are uploaded to track and verify their integrity. Some of the most important attributes to store are the data itself, a hash of the data, the date it was written, and potentially the user that wrote it. The following is a quick script to show the reader how easy it is to innovate on these concepts with just a little scripting. In this case, we use Python 3.6 to watch a directory and make any new file added to the directory immutable, as well as adding a timestamp, file path, and hash of the file to our log. This script only runs on Linux because we make use of the native chattr binary. Be careful not to run the script in the directory it's monitoring or else it will enter an infinite loop as it observes itself updating the log file:

import sys
import time
import logging
import hashlib
import subprocess
# Comment 1: Important Watchdog imports
from watchdog.observers import Observer 
from watchdog.events import LoggingEventHandler
# Comment 2: Log file output configuration
logging.basicConfig(filename="file_integrity.txt",
                    filemode='a',
	            level=logging.INFO,
                    format='%(asctime)s - %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S')
hasher = hashlib.sha1()
def main():
  path = input("What is the path of the directory you wish to monitor: ")
  # Comment 3: Starting event handler and observer on target dir
  event_handler = LoggingEventHandler()
  event_handler.on_created = on_created
  observer = Observer()
  observer.schedule(event_handler, path, recursive=True)
  observer.start()
  try:
    while True:
      time.sleep(1)
  except KeyboardInterrupt:
    observer.stop()
  observer.join()
def on_created(event):
  # Comment 4: Action to take when new file is written
  subprocess.Popen(['chattr', '+i', event.src_path], bufsize=1)
  with open(event.src_path, 'rb') as afile:
    buf = afile.read()
    hasher.update(buf)
  logging.info(f"Artifact: %s \nFile SHA1: %s\n", event.src_path, hasher.hexdigest())
  print("New file added: {}\n File SHA1: {}\n".format(event.src_path, hasher.hexdigest()))
if __name__ == "__main__":
  main()

The preceding script is fairly simple but vastly powerful and applicable. You can use the script for almost any file reaction, and it can be chained together to create pipelines of analysis and processing for almost any task. Let us take a deeper look at the code. Below Comment #1, we can see the watchdog imports. watchdog is a critical library that will give us the ability to monitor for and react to events. Operators may need to download the watchdog library with the Python-Pip package manager. Next, below Comment #2, we can see how watchdog is configured to log its results to a text file. In this configuration, we can see the name of the log file and that the log file is in append mode, along with the format of the log messages. Below Comment #3, we can see the event handler is being created. We can also see the default event_handler.on_created event being set to our function on_created.

Next, we see the observer being instantiated, followed by the observer being correlated to our event handler and the target file path, and then starting the observer. Jumping down to below Comment #4, we can see the arbitrary actions that we invoke when the observer sees a new file write. In our case, we are spawning a new process to run chattr +i on the newly written binary, as discussed previously. We also use this method below Comment #4 to open the newly created file, get the file's SHA1 hash, and write this hash to our log file. In the next section, we explore more analysis options we can perform on files we collect.

Analysis tooling

Another set of tools I find are absolutely critical are local analysis and triage tools. These could be tools that help you get more local telemetry, potentially investigating some suspicious processes, or even analyze an artifact you found on the target system. Analysis tools are critical to giving your operators more insight into common operating systems, forensic artifacts, and even unknown data they may discover. Some good examples of Windows local analysis tools are things from SysInternals Suite, such as Autoruns, Process Monitor, and Process Explorer^[37]. Such tools will allow analysts to look at programs that have been locally persisted, various programs and threads that are running, and specific system calls those programs are making. These could also be tools with file, log, or artifact collection and/or parsing capabilities; tools that allow you to investigate different pieces of evidence. For example, tools such as Yara could allow you to quickly search a disk or directory for interesting artifacts in files^[38]. Another set of tools including Binwalk^[39] or Scalpel^[40] could then let you extract embedded files or artifacts that were discovered in Yara scans. By chaining local analysis tools like this, a team could quickly develop hunting routines to find trojaned files or embedded artifacts^[41]. Traditional forensic tools also work wonders here, tools such as TheSleuthKit or RedLine, depending on the systems^[42]. TheSleuthKit is still amazing for analyzing disk images and artifacts within images^[43]. Likewise, tools such as RedLine or Volatility can be useful for doing on-the-fly memory analysis^[44]. This allows for both rapid live response triage of a host, as well as pulling artifacts back for local analysis. On my defensive teams, I like to collect and prepare a standard set of tools team members can use for common analysis tasks, along with runbooks to help analysts use those tools. This practice of tool preparation helps standardize our analysis tools and create experts on the team.

An incredible example of the principle of innovation is the CCDC team representing the University of Virginia's (UVA) development of a tool called BLUESPAWN^[45]. This tool is a Swiss Army knife of existing tools and capabilities that students at UVA previously automated to meet their needs. BLUESPAWN is written in C++ and only targets the Windows operating system but is a powerhouse in terms of functionality.

The UVA team claims BLUESPAWN is a force multiplier, allowing team members with a Linux focus to easily triage Windows systems by using the tool. BLUESPAWN includes several high-level run modes, such as monitor, hunt, scan, mitigate, and react, for a variety of functionalities in one tool. BLUESPAWN is designed to unleash a verbose firehose of information back at operators, with which the defense likely trains on various runbooks to help debug, interpret, and respond to the tool's output. BLUESPAWN can also automate much of the patching and hardening of a system using this tool's mitigation features. BLUESPAWN also allows the defense to monitor and hunt in real time for specific techniques, giving them repeatable actions that they can use for triage. This tool will greatly enhance the capabilities of the group and would work excellently with a little training and some common runbooks^[46]. In the next chapter, you will see how they use this tool to automate tools like PE-Sieve and hunt for process-injected beacons of Cobalt Strike^[47]. In Chapter 3, Invisible is Best (Operating in Memory), we will take an in-depth look at this detection logic, walking through this reaction correspondence at play. Seeing this type of innovation puts the offensive teams on the back foot and gives the defensive teams a powerful advantage in their live response and triage capabilities.

Malware triage platforms, both static and dynamic, can be a powerful asset to any analysis team. These systems can be a cheap substitute for an actual reverse engineer, or a time saver for both reverse engineers and analysts. An open-source and extensible static analysis platform is Viper, where people can write extensions in Python to perform actions on individual forensic artifacts. Such a platform could act as the forensic storage and analysis capabilities all in one^[48]. From here you could have various workers determine if files are executable files, extract data from them such as URLs and IP addresses, and integrate this platform back into your threat intel application for enrichment. This framework can easily be integrated into a dynamic analysis platform such as Cuckoo Sandbox, where analysts can see detailed run information from the binary^[49]. Dynamic analysis can be extremely effective for getting more information via running malware in highly instrumented sandboxes, often revealing details that are obscured from basic static triage. At times, setting up dynamic sandboxing, especially Cuckoo Sandbox, can be exceedingly difficult due to various compatibility issues with supported hypervisors, agents, and virtual machines. If you're looking at Cuckoo, you may consider the GitHub project BoomBox, which will spin up a full Cuckoo deployment in a few simple commands^[50]. BoomBox also deploys a feature in the sandbox infrastructure known as INetSim, which will fake network communications to tease more functionality out of the running malware^[51]. These private infrastructure platforms will not likely be available during a competition environment, but perhaps similar cloud services will be in scope. Services such as VirusTotal^[52], Joe Sandbox^[53], Anyrun^[54], and HybridAnalysis^[55] can give a massive boost in analysis capabilities against a particular piece of malware, but also come with the drawback of using a public service.

With some public services, such as VirusTotal, offensive actors can write their own Yara rules to see when their malware gets uploaded to the platform. If this were the case, then uploading the sample would tip the defenders' hand, letting the offense know that they have acquired a particular sample.

Data transformation utilities such as CyberChef can also be immensely helpful^[56]. These should be considered auxiliary applications as they will not necessarily help in your core goals of detection. That said, hosted utilities can buy your team additional time and operational security in a crunch by giving them a centralized and secure service to perform common data transformations. This is also a great location to practice the principle of innovation. We can easily take local analysis tools such as those we've looked at earlier and create web services or other utilities that wrap those services. A great example of this principle is another homemade web application multitool, Pure Funky Magic (PFM)^[57]. PFM contains many common utilities that analysts would use but via a central location to access and share transformations. Similarly, Maltego or other mind-mapping services can be excellent for sharing intelligence or data about threats or targets among team members^[58]. These tools can be a force multiplier for sharing threat intelligence data and operational capabilities if you have that expertise on your team.

You should also consider offensive components on your blue team. This is essentially vulnerability management and penetration testing expertise, using the skills required to scan your infrastructure for vulnerabilities. You can pull a lot of this infrastructure from the next section on offensive perspectives, though I don't think the persistence or deception tactics apply if your team is just self-assessing for vulnerabilities. On Pros V Joes, an attack and defense competition with up to 10 team members, I have one or two team members focused on offensive operations. Because all of the network designs are the same in that competition, they begin by looking at the team's own infrastructure for vulnerabilities. This has many benefits: the closer infrastructure allows for quicker and more accurate scanning results, it allows us to locally develop and test exploits while protecting operational security, and it allows us to take points away from our opponents. After we've determined that our systems are reasonably hardened, we can automate some regular scanning intervals, and turn our tools on our opponent's infrastructure for exploitation.

As you can see, there is a lot of infrastructure that needs to be set up and in place ideally before a cyber incident, or at least ready to rapidly deploy in the case of an incident. It requires great skill and planning in choosing what technologies to implement first and on what timetable, while also keeping resources available to do basic operations. If you want to play around with some of the technologies I've mentioned, I highly recommend checking out Security Onion 2^[59]. This is an evolution of the very popular Security Onion, refactored with many of the tools we've already mentioned in this chapter.

While Security Onion 2 is designed to be deployed to production, you may also want to deploy dedicated hardware and software as a permanent solution. Many of the pieces of infrastructure I've mentioned will need their own dedicated deployments, potentially even with clustered hosting. This means you should use Security Onion 2 to explore potential solutions, see how they integrate with other services, use it for local triage, develop with it, and even deploy to production in smaller environments, but you should also consider deploying dedicated solutions. Obviously, there are some critical first steps, such as understanding the environment, building out the required talent, and flushing out a development plan, but after that, each component of infrastructure will be a major investment in its own right. It's important to not take on more projects than you are adequately resourced to manage, so choosing your early infrastructure investments wisely is a key decision. Depending on the staffing, I think security telemetry, log aggregation, artifact analysis, and live response capabilities would be some of the most important to prioritize.

Defensive KPIs

It helps to have metrics to measure the operational efficiency of a team^[60]. For that, we can use KPIs. KPIs are small measurable indicators we can use to benchmark how our team performs and gauge differences in their performance over time. For the defensive team, we may want to measure things like 1/10/60 time, or the mean time taken to detect an attack, the mean time taken to respond to an incident, and the mean time taken to resolution per incident. Other metrics may include the number of incidents triaged, the mean time taken to triage an incident, outliers on incident triage, or the number of rules that have been reviewed, to suggest a few. Such metrics will help your team identify gaps or weak points in your process that may be failing silently or need more resource investment. Often security is discussed in white-or-black terms of success or failure, but there is actually a myriad of different outcomes and lots of progress to be made in preparing for a conflict^[61]. Remember, the benefit of long-term planning is improving over time and metrics are your tool to make sure your team is heading in the right direction.

Offensive perspective

Now let's look at some of the skills, tools, and infrastructure the offense can get in place before an operation. John Lambert once tweeted, "If you shame attack research, you misjudge its contribution. The offense and defense aren't peers. Defense is offense's child"^[62]. While I do not think the relationship is as dramatic as defense being offense's child, I do think there is a lot of wisdom to the idea of the defense learning from offensive research. In cybersecurity, defensive systems are often slow-moving, static, and reactionary, waiting for the attacker to make the first move.

Beyond the initial setup, and throughout the rest of this text, we will often see the offense move first or take the initiative. The offense is far more ephemeral in nature compared to the defense's infrastructure. Overall, we have less infrastructure to worry about as we will spend much of our time focusing on the target's infrastructure and keeping our footprint as minimal as possible. Because the offense has less infrastructure, and it is more ephemeral by nature, it is naturally easier to pivot to new solutions or automate ideas with simple scripting. Automated deployment and saving on deployment time will be crucial as the offense pivots deeper and changes their tactics on the fly. If you can move faster than the defense on pivoting to new machines and changing your offensive tooling in the process, you will keep the defense guessing as to where the compromise is spreading. Similar to the defensive tooling, it's important to have alternative tools or contingency infrastructure in the event your team needs to pivot their operations.

Scanning and exploitation

Scanning and enumeration tools are the eyes and hands of the offense. They allow them to explore the target infrastructure and discover technologies they want to exploit. Scanning is the attacker's opening move so they should have these techniques well-honed. Like a good game of chess, there are several desired openings or initial scans the offense can use to understand the environment. Their chosen scanning technology should be well understood and automated to the point that they have different high-level scans or scripts they can run that use honed syntax on the specific tool. The attacker will want network scanning tools, vulnerability analysis tools, domain enumeration tools, and even web application scanning tools just to name a few. Network scanning tools can involve things such as Nmap or masscan. These network scanning tools send low-level TCP/IP data to discover active hosts and services on the network. A lot can be gained by automating and diffing scans from these tools over time, to paint a picture of which ports are opening and closing on target systems. On the National CCDC red team, we use ephemeral Docker instances that will change IP addresses between every scan and send us consolidated scan reports. One really helpful thing is diff'ing scan results over time, to observe what changed in a network posture between two points in time. Tools such as AutoRecon are valuable open-source alternatives that show how innovating on existing automation can continue to give an edge^[63]. Scantron is another investment in scanning technology that offers distributed agents and a UI if the team wants to take it to that level^[64]. The offense also has tools for enumerating specific software for a list of known vulnerabilities and exploits. These vulnerability scanners, tools such as nmap-vulners^[65], OpenVas^[66], or Metasploit^[67], allow attackers to find exploitable software from among that which they've already discovered.

Nmap-vulners allows the offense to chain their port scanning directly into vulnerability enumeration. Similarly, by importing Nmap scans into Metasploit, the offense can chain their scans directly into exploitation. On the National CCDC red team we also make heavy use of Metasploit RC scripting, to automate and chain exploits, callback instructions, and even loading more payloads^[68]. The offense also has plenty of enumeration tools for Windows domains once they've gained access to a target machine. Domain enumeration tools such as PowerView^[69] and BloodHound^[70] allow the offense to continue enumerating trust within a network to potentially privilege escalate between different users. These tools are often built into or can be dynamically loaded by post-exploitation frameworks such as CobaltStrike^[71] or Empire^[72]. While some of these command and control (C2) frameworks will fall into the category of payload development or hosted infrastructure, the capabilities they offer should be well understood on their own. An offensive team should know the underlying techniques the frameworks use and have the expertise to execute the techniques with other tools, in the event the framework becomes exploitable or easily detectable. The offense may also have tools specifically for enumerating web applications and scanning web applications for known vulnerabilities. Tools such as Burp^[73], Taipan^[74], or Sqlmap^[75] can be used to audit various web applications, depending on the applications. The overall goal with these web tools in competitions is to get code execution on the hosts via vulnerabilities in the web applications, steal the data from the database, or generally take over the web application. Next, I want to examine how we can automate several of these tools for easy operational use. It is not enough to prepare the tools, the knowledge to use them effectively also needs to be in place before the conflict. Because of the complexity of tool command-line flags, I prefer automating the syntax of these tools during downtime for easier operational use. For Nmap, such a scan may look like this, titled turbonmap scan:

$ alias turbonmap='nmap -sS -Pn --host-timeout=1m --max-rtt-timeout=600ms --initial-rtt-timeout=300ms --min-rtt-timeout=300ms --stats-every 10s --top-ports 500 --min-rate 1000 --max-retries 0 -n -T5 --min-hostgroup 255 -oA fast_scan_output -iL'
$ turbonmap 192.168.0.1/24

The preceding Nmap scan is highly aggressive and loud on the network. On weaker home routers, it may actually overwhelm the gateway, so knowing the environment and tailoring the scans for the environment is paramount. Let's go over some of the switches so you can tailor the scan yourself when you need to. This Nmap scan will enumerate the top 500 TCP ports, only sends the first half of the TCP handshake, and assumes all hosts are up. There is also a bit of fine-tuned timing, the -T5 takes care of all the base settings, and we drop the rtt-timeout down to 300ms, add a 1m host timeout, do not reattempt any ports, turn the minimum sending rate up to 1000, and scan 255 hosts at a time.

We can also write some simple Python automation to chain multiple tools together, as well as perform more in-depth scanning. The following shows how to use masscan to perform the initial scan and then follow up on those results with Nmap version scanning. The logic for this largely comes from Jeff McJunkin's blog post where he explores ways to speed up large Nmap scans^[76]. The purpose of this automation is to show how easy it is to chain simple tools together with a little bash scripting:

$ sudo masscan 192.168.0.1/24 -oG initial.gnmap -p 7,9,13,21-23,25-26,37,53,79-81,88,106,110-111,113,119,135,139,143-144,179,199,389,427,443-445,465,513-515,543-544,548,554,587,631,646,873,990,993,995,1025-1029,1110,1433,1720,1723,1755,1900,2000-2001,2049,2121,2717,3000,3128,3306,3389,3986,4899,5000,5009,5051,5060,5101,5190,5357,5432,5631,5666,5800,5900,6000-6001,6646,7070,8000,8008-8009,8080-8081,8443,8888,9100,9999-10000,32768,49152-49157 --rate 10000
$ egrep '^Host: ' initial.gnmap | cut -d" " -f2 | sort | uniq > alive.hosts
$ nmap -Pn -n -T4 --host-timeout=5m --max-retries 0 -sV -iL alive.hosts -oA nmap-version-scan

Beyond basic scanning and exploitation, the offensive team should know the current hot exploits or exploits that will reliably work on current popular 0-day or n-day vulnerabilities. This goes beyond vulnerability scanning to preparing several common exploits of the day, with tested implementations and payloads. For example, in April 2017, the EternalBlue exploit was leaked from the NSA, creating an n-day vulnerability lasting several months or years in some organizations^[77]. For a while, there were a number of unstable versions of this exploit available on public sources, as well as a few highly reliable implementations. The National CCDC red team weaponized this in such a reliable way that we had scripts ready to scan all teams for just this vulnerability, exploit it, and drop our post-exploitation. These exploits should also be automated or scripted out with their preferred syntax of exploitation, and already prepared to drop a next-stage tool, the post-exploitation toolkit. Granted, the post-exploitation toolkit should be dynamically compiled per target or host, which means the exploit script should also take a dynamic second-stage payload. Using dynamically generated payloads per exploit target will help reduce the ability to correlate initial compromises. Preferably the exploit will load this second stage directly into memory, to avoid as many forensic logs as possible as we will talk about in later chapters. Exploit scripts should be well tested across multiple versions of target operating systems, and where necessary should consider versions it does not support or are potentially unstable. Risky exploits, in terms of unstable execution or highly detectable techniques, should ideally inform operators when they run the script. On the CCDC red teams, we cross-train each member on our custom scripts or have designated operators with the specific exploit expertise to avoid execution errors.

Payload development

Tool development and obfuscation infrastructure are important roles for any offensive team. Often, offensive teams will require special payloads for target systems that make use of low-level APIs and programming skills. On the National CCDC red team, a lot of our development focus goes into local post-exploitation implants, which gain our team further access, persistence, and a great deal of other features. For example, the CCDC red team has malware that will repeatedly drop local firewall rules, start services, hide other files, and even mess with system users. Payload or implant development is an often underrepresented but critical component to an offensive team. This role crafts the capabilities of various post-exploitation payloads, from on disk searching and encryption functionality, to instrumenting C2 implants. At DEF CON 26, Alex Levinson and myself released a framework we worked on for the CCDC red team called Gscript^[78]. Gscript allowed other operators to quickly wrap and obfuscate any number of existing tools and capabilities inside a single, natively compiled Go executable. The core concept behind Gscript was enabling our team members to have the same rapid implant development functionality, along with a shopping cart of post-exploitation techniques. This is very helpful if an operator is working on an OS they are less familiar with, such as OS X or Windows, by providing them with many tested technique implementations. Gscript also provides operators a safety net of forensic and obfuscation considerations. Obfuscating any payload or artifact that is going into the target environment would fall under this payload development role. General executable obfuscators or packers should also be prepared to protect arbitrary payloads going into the target environment. If we are using Go implants, we will also look at garble for our additional payload obfuscation^[79]. Garble will further help protect our payloads by removing build information, replacing package names, and stripping symbol tables; steps that help obfuscate by further hiding the real.

C2 infrastructure is another critical component in most offensive operations. The C2 infrastructure, while including implants and often maintained by the post-exploitation team, is really a realm of its own. This is because C2 frameworks often incorporate so many different features that deciding which capabilities you want for your operation becomes critical in the planning stage. One big choice is between using an open-source framework or coding your own clandestine tooling. Clandestine tooling can help reduce analysis by not using public code but can also be used against your organization for clandestine attribution. On the National CCDC red team we develop many in-house implants and C2 frameworks, to help reduce the pre-game public analysis teams could perform. While we also use public C2 frameworks, we consider these less OPSEC-safe, due to the fact they lack confidentiality and defenders can easily get source code access once they identify them^[80].

Another such capability you may consider is the ability to load custom modules directly in memory. By loading additional capabilities directly into memory, you can prevent the defender from ever gaining access to those features unless they can capture the memory samples or tease the modules out in a sandbox. Or perhaps you want custom C2 protocols to obfuscate the communications and execution between the implant and the command server. There is an interesting hobby among C2 developers where they find other normal protocols that they can hide their C2 communications within, known as covert C2. This could involve abusing rich applications, such as real-time chat solutions, or non-application protocols, such as hiding C2 data within ICMP data fields. By obfuscating their traffic with covert C2, offensive operators can pretend to be a different, benign protocol communication on the network. One advanced take on this is called domain fronting, where offensive actors can abuse Content Delivery Networks (CDNs) such as Tor or Fastly to route traffic toward trusted hosts in the CDN networks, which will actually be routed to the attacker infrastructure later. This technically works by specifying a different domain in the HTTP Host header than the domain specified in the original query, such that when the request reaches the CDN it will be redirected to the app specified in the Host header^[81]. We will take a deeper look at domain fronting in Chapter 4, Blending In. Another feature you may consider is whether the language your implant is written in can easily be decompiled or read raw, reducing the analysis capabilities required to reverse engineer it. For example, implants that use Python or PowerShell can often be deobfuscated and read natively without having to do any advanced decomplication or disassembly. Even payloads written in languages like C#, which leverages .NET Framework, can easily be decompiled to give a better understanding of their native functionality. To help planners navigate the various features of open-source C2 frameworks you may consider browsing The C2 Matrix, a collection of many modern public C2 frameworks^[82]. For the sake of having a good example in this text, we will be primarily using Sliver, a C2 framework written in Go^[83]. Again, leveraging obfuscation on the implant side of the C2 framework will be key to slowing down defensive analysis. Something you may want to consider when planning your C2 support is multiple concurrent infections using different C2 frameworks on a target network. Sometimes you want different implants, different callback cadences, and even different callback IP spaces to help decouple and protect implants. Often you want these different implant frameworks totally decoupled, such that the discovery of one won't lead to the discovery of another. But sometimes you may even put such different implants on the same infected host, so that in the event that one is compromised, you still have another way back into that target device. It's a popular strategy to make one of these implants an operational implant and the other a form of long-term persistence, which can spawn more operational implants in the event you lose an operational session. On the CCDC red team, we do this very often with collaboration frameworks such as CobaltStrike and Metasploit.

On the CCDC red team, we have given the operator of collaborative and redundant C2 access the nickname of a shell sherpa, for when they guide other team members back to lost shells.

Auxiliary tooling

A hash-cracking server should be considered as a nice-to-have for getting the team more access as they penetrate the target environment. While often thought of as extraneous, this infrastructure can greatly enable an offensive team in an operation. Undoubtedly the team will come across some form of encrypted or hashed secrets during their operation, which they will want to crack to gain further access. By hosting your own solution for this you can both protect the operational security of the program and manage the resources on which jobs you want to crack faster. A good example of such a project for managing cracking infrastructure would be CrackLord^[84]. Analogous to preparing the cracking infrastructure, this team member could also prepare rainbow tables and wordlists in on-hand locations. Preparing such simple word lists can greatly enable a team's cracking and enumeration efforts. If you know enough about your target environment, such as the themes of a company or competition, I highly encourage creating special wordlists just for the target. I like to use a well-supported tool called CeWL to enumerate websites and generate custom wordlists from their contents^[85]. Similar to the defense's ad-hoc infrastructure, hosted data transformation services can also be very helpful here. Services such as CyberChef and PFM can also be very beneficial to an offensive team, as the offensive team analyzes various pieces of data they discover in the target environment. The offense can even use similar SIEM technology to index and sort through data they may discover in the target network. Having hosted auxiliary tools to support your offensive team, such as a hash-cracking server or a data transformation service such as CyberChef, is a price worth paying upfront for the operational efficiency it can bring.

Finally, reporting infrastructure is probably the unsung hero of most offensive teams. Despite being a real offensive engagement or a competition like CCDC, every offensive team has to have something to show for their work. In CCDC or Pros V Joes, scores are calculated as a mix of downtime for the defensive teams and compromises reported by the offensive teams. For the defensive teams, there is a scoring agent, which will regularly check their services to see if they are responding properly. The offensive team also hosts a reporting server where they document their compromises, with the corresponding data stolen and evidence of exploitation. These reporting servers where compromises are documented exist in real operations, from the C2 server holding a botnet to an advanced application showing how much an organization has made and how much individual members have earned.

In the game context, our reporting server has evolved over the years to now have rich dashboards showing compromises as well as tools to help format and auto-document compromises. This is potentially extraneous, but a time saver for a part of the engagement no one wants to think about till later.

While we can use the collection of many common red team tools in the Kali Linux distribution^[86], similar to our use of Security Onion 2, I would recommend not using this for primary operations. I think Kali would work fine for some competition scenarios, but you may want to consider using something custom for real offensive operations. Similar to how we wouldn't want to use Security Onion 2 as an all-in-one solution, it will be easier to clone or set up our favorite tools in a custom repository or dedicated images. That said, Kali is an amazing distro for playing around with various tools and experimenting with different solutions. I recommend creating your own repository of scripts and tools your group uses that you can maintain and clone to whatever image you choose. These tools can then be kept up to date with premade obfuscated builds for operators. This can help mitigate any flaws in the base operating system and provide additional operational security by obfuscating your common toolset.

Offensive KPIs

Offensive KPIs are a good way to measure how effectively your team is performing over time^[87]. Note that, unlike the defensive KPIs, these are probably not good KPIs for a general red team or pentest team, again due to us having different core objectives than the average pentest team (whose goal it is to ultimately help the client, instead of persisting and hiding from the client). On the National CCDC red team, it is good to know how our individual red cells are performing year over year, so we keep detailed metrics on each user's scores and reports, to track the difference between the years. This also helps us track the difference in compromises and the strengths and weaknesses across our red team. Some interesting KPIs we track on our red team are the average time to deploy attack infrastructure, breakout time (the time from compromising a single host to moving laterally), average persistence time, average machines compromised (percentage of total), average total points, average report length, and details of the compromise. Not all of these metrics are captured automatically; for example, we draw many from the manually entered reports, and we simply enter our breakout time for the year. Still, these KPIs help us identify areas we want to improve in the off season, as well as highlight areas our development had noticeable benefits.

Summary

In this chapter, we covered several core planning concepts and technologies each side should look to have in place before engaging in cyber conflict. We examined infrastructure for any team, such as knowledge sharing in the form of a wiki and chat technologies to enhance the team's communication and operations. We explored some long-term planning strategies in terms of building out a cyber operations team, including options for contingency plans and using alternative tools. We delved into the expertise that should exist on both offensive and defensive teams, as well as methods for regularly improving the cyber skills within your team. We also dug into general operational planning, engagement planning, and cultivating operational excellence. We even examined the importance for KPIs for measuring your team's growth, including KPIs that can be collected for both offensive and defensive teams. We probed a great deal of defensive strategy and infrastructure they should probably prepare before engaging in cyber conflict. The chapter covered various forms of security signal collection, including host-based, network-based, and application-based telemetry. We also took a brief detour into active defensive infrastructure, or honeypots, something we will revisit in later chapters. Next, we canvased defensive data management, from alert aggregation and indexing in a SIEM to enrichment with a SOAR application and a myriad of nice to haves to support that SOAR application. We also covered methods of alert logic creation and alert management. Along the defensive perspective, we encountered many frameworks we could leverage to make managing this infrastructure easier. From there, we moved on to common defensive analysis tools, such as forensic tools like TSK. We saw how innovating on and writing local analysis tools can give a large advantage for the defense with BLUESPAWN. This theme of innovation will continue throughout the book, showing users how to innovate on simple detection hypotheses to gain an advantage in the conflict.

On the offensive side, we examined some of their overall goals and tactics. The offense has a wide variety of scanning and enumeration tools at their disposal so that they can assess and exploit the target infrastructure. We saw how fast-moving teams like the CCDC red team have exploits prepared with the majority of their attacks already automated for consistency. We took a deep dive on payload development and how offensive teams should have dedicated considerations when it comes to implants and C2 infrastructure. We also examined auxiliary tooling for offensive teams, such as hash-cracking servers, reporting servers, and even applications for data sharing and manipulation.

Finally, we looked at KPIs specific to offensive teams, things they can measure to help improve their performance in these attack and defense competitions. In the next chapter, we will begin to deep dive into specific kill chain techniques and the escalating reaction correspondence around these techniques. Specifically, we will look at operating in memory, why this is important, and how the defense can respond for increased visibility.

Filter reviews by

All

Amazon verified reviews

Damari Jan 19, 2023

This book has expanded my new knowledge of cybersecurity more than I can ever ask for, definitely recommend this book to anyone trying to get into cybersecurity, it's a must read!!!

Amazon Verified review

Porky Aug 23, 2022

Lots of good nuggets on Refferance material, solid concepts and guidance, easy to understand.

Rich Jun 16, 2021

Where do I begin. I am just about done my journey through this dark arts of a book and I’ve learned how to better scan for vulnerability and collect network traffic than I ever could. The Most experienced hackers wouldn’t even be able to achieve this level of finesse and artistry on their own. Dan Borges hand delivers the tools and techniques necessary to become a titan in security. Really wish I had known about this book sooner but I’m glad i have it now

Andrew Roth Dec 10, 2022

This book goes into great technical depth in the cybersecurity space.

Josh Aug 19, 2021

Dan's senior-level infosec experience becomes very apparent as you navigate the pages of his book. Even for someone just starting out with information security concepts like myself, this book is a crucial read. He outlines a lot of concepts that are often overlooked, including the principle of physical control. This book touches on a lot and has motivated me to seek out an eventual hacking competition. The book is heavy on theory in the first 2 chapters but helps to clarify many of the most popular and even some-lesser known and open source tools out there at a hacker's disposal. The book is full of different red and blue (offensive and defensive) tactics that anyone looking to become a pentester ought to know. You'll find a massive amount of customized linux terminal commands/scripts, as well as later chapter that dives into keylogging, to name just a few. Coming from an IT Support background with only a Security+ and a little hands-on infosec experience, this book really helped me to better visualize the preparations and goals involved in hacking competitions. With the level of time-sensitivity that is involved in hacking competitions, this book is an absolute bible in gearing one up for what to expect. I highly highly recommend this book. Whether you're just delving in to information security or are in a mid to sr level role, you will finish this book with a plethora of new knowledge.

Adversarial Tradecraft in Cybersecurity: Offense versus defense in real-time computer conflict

What do you get with Print?

Adversarial Tradecraft in Cybersecurity

Preparing for Battle

Essential considerations

Communications

Long-term planning

Expertise

Operational planning

Defensive perspective

Signal collection

Data management

Analysis tooling

Defensive KPIs

Offensive perspective

Scanning and exploitation

Payload development

Auxiliary tooling

Offensive KPIs

Summary

References

Page 1 of 6

Key benefits

Description

Who is this book for?

What you will learn

Product Details

What do you get with Print?

Product Details

Frequently bought together

Table of Contents

Recommendations for you

Customer reviews

Filter reviews by

People who bought this also bought

About the author

FAQs

Adversarial Tradecraft in Cybersecurity: Offense versus defense in real-time computer conflict

What do you get with Print?

Contact Details

Shipping Address

Billing Address

Key benefits

Description

Who is this book for?

What you will learn

Product Details

What do you get with Print?

Contact Details

Shipping Address

Billing Address

Product Details

Packt Subscriptions

Frequently bought together

Table of Contents

Recommendations for you

Customer reviews

Filter reviews by

People who bought this also bought

About the author

FAQs