Introducing the case study – splunking the BOTS Dataset v1
In this section, we introduce the case study that we will use throughout this book: exploring the logs in BOTS Dataset v1. Boss of the SOC (BOTS) is a blue-team capture-the-flag competition held during the annual Splunk .conf conference (https://tinyurl.com/39ru8d4b). Participants are given access to realistic network security logs to investigate real-world cybersecurity attacks. The nature of the attacks and the exact attack sequence are beyond the scope of this book; instead, we use the dataset to explore some of the rich features of Splunk. BOTS Dataset v1 was compiled by Ryan Kovar, David Herrald, and James Brodsky in 2016.
The setup
A fictional company, ABC Inc., has observed unusual activity on its network. They think that the problem is centered around three Windows devices (we8105desk, de9041srv, and we1149srv). The very cyber-conscious ABC Inc. also has several network security solutions installed on their network as part of their security infrastructure:
- Suricata: An open source intrusion detection system and intrusion prevention system (https://suricata.io)
- FortiGate: A next-generation firewall (https://www.fortinet.com)
- Internet Information Services (IIS): An extensible web server software created by Microsoft (https://www.iis.net/)
- Nessus: A proprietary vulnerability scanner developed by Tenable (https://www.tenable.com/products/nessus)
- Splunk Stream: A wire data capture solution built into Splunk (https://splunkbase.splunk.com/app/1809/)
The company would like you to investigate an incident that occurred in August 2016. What abnormal activity will you discover?
Our solution is to use Splunk to investigate the logs generated in August 2016. To get the full experience of installing Splunk, we will first deploy a Splunk environment to simulate the environment that generated BOTS Dataset v1. The environment will consist of the following components:
- Three Splunk forwarders running on Windows devices (we8105desk, de9041srv, and we1149srv) deployed using AWS instances
- A dedicated indexer (Splunk Enterprise installed on an AWS instance running Red Hat Linux)
- A dedicated search head (Splunk Enterprise installed on an AWS instance running Red Hat Linux)
- A deployment server (Splunk Enterprise installed on an AWS instance running Red Hat Linux)
This will give us an environment that we can use to explore the important process of setting up and configuring Splunk in Chapter 2, Setting Up the Splunk Environment. This case study requires access to an AWS account, so if you do not have one, you should sign up using the AWS Management Console (https://aws.amazon.com/console/). The case study does not require advanced knowledge of AWS, but it may be helpful to read a tutorial on AWS Cloud such as Learn the Fundamentals (https://tinyurl.com/2p8aj7b7) or watch a YouTube video (https://www.youtube.com/watch?v=r4YIdn2eTm4). You will also need a Splunk account to download the Splunk installation file and Splunk apps (https://www.splunk.com).
BOTS Dataset v1 is available for download from the Splunk Git repository (https://github.com/splunk/botsv1). We will use the version of the dataset that contains only attack logs because of the data volume limits of the free Splunk Enterprise license. The dataset comes in the form of a Splunk app, which we will install on our dedicated search head. Once we have installed and configured the Splunk deployment, we will design a series of Splunk queries, dashboards, reports, and alerts as we investigate the logs.
For this case study, we assume that ABC Inc. has an established security infrastructure that includes firewalls and other security devices. However, monitoring those devices does not fall within the scope of the project.
Once we have deployed and configured the Splunk environment, we will install BOTS Dataset v1 as an app on the search head and continue our exploration there. The dataset consists of various machine and network logs generated by the appliances mentioned in The setup section.
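To give a feel for the kind of query we will write later, here is a minimal Python sketch that assembles a Splunk search string scoped to the three Windows devices and to August 2016. It is an illustration only, not the query used later in the book. The index name botsv1, the helper name build_search, and the choice of a stats summary are assumptions made for this example; the hostnames and the time window come from the setup described previously.

```python
from datetime import datetime

# Hostnames named in the case study setup.
HOSTS = ["we8105desk", "de9041srv", "we1149srv"]

# Assumption: the BOTS v1 app stores its events in an index named "botsv1".
INDEX = "botsv1"

# Timestamp format accepted by Splunk's earliest/latest time modifiers.
SPL_TIME_FORMAT = "%m/%d/%Y:%H:%M:%S"


def build_search(hosts, start, end, index=INDEX):
    """Return an SPL search string limited to the given hosts and time window.

    build_search is a hypothetical helper for this sketch, not a Splunk API.
    """
    host_filter = " OR ".join(f'host="{h}"' for h in hosts)
    earliest = start.strftime(SPL_TIME_FORMAT)
    latest = end.strftime(SPL_TIME_FORMAT)
    return (
        f"index={index} ({host_filter}) "
        f'earliest="{earliest}" latest="{latest}" '
        "| stats count by host, sourcetype"
    )


# Cover all of August 2016: from August 1 up to (not including) September 1.
query = build_search(HOSTS, datetime(2016, 8, 1), datetime(2016, 9, 1))
print(query)
```

The resulting string can be pasted into the Splunk search bar on the search head. Counting events by host and sourcetype is one simple way to see which log sources are present before starting the investigation.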
Now, let’s summarize what we have learned in this chapter.