Creating advanced charts
The charts we have looked at so far have been relatively simple. We did not require any special commands to prepare the data. In this section, we look at three chart types that are more advanced.
Table 6.2 reminds us of the charting options we have covered so far, as well as the three charts that we will cover in this section:
Chart type | Works great for | Search structure
Line chart | Displaying trends over time |
Area chart | Showing changes in aggregated values over time |
Column chart | Comparing fields or calculated values |
Bar chart | Comparing fields or calculated values |
Single-value chart | Displaying numeric changes in trends. Can also be used to display single values. |
Pie chart | Comparing categories |
Scatter chart | Displaying the relationship between discrete values in two dimensions |
Bubble chart | Displaying the relationship between discrete values in three dimensions |
Choropleth map | Displaying aggregated values in a geographic region |
Table 6.2 – Different charts available in Splunk
Now, let’s look at scatter plots.
Scatter plots
A scatter plot is a chart used to plot two numerical values against each other. The position of each point on the x and y axes represents the two values of that data point. In Splunk, a scatter chart requires two aggregate values, such as count, sum(), or avg(). For example, the following search shows the number of events and the total number of bytes for each source host that communicated with the target host (192.168.250.70):
index=botsv1 "192.168.250.70" sourcetype=fortigate_traffic | stats count sum(bytes) as bytes by src
To display a scatter plot in Splunk, we click on the Visualization tab and select Scatter Chart.
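As a hedged aside, the scatter chart reads the results table by column position: the first column labels each point, the second supplies the x value, and the third supplies the y value. The sketch below is the same search with the columns named and ordered explicitly. The names event_count and total_bytes are illustrative and do not appear in the original search.

```spl
index=botsv1 "192.168.250.70" sourcetype=fortigate_traffic
| stats count as event_count sum(bytes) as total_bytes by src
| table src event_count total_bytes
```

Swapping the two aggregates in the table command should swap the axes, which can be handy when one measure has a much wider range than the other.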
One of the advantages of using a scatter plot is that it makes it easy to detect outliers. For example, Figure 6.46 shows that the data point on the extreme right is an outlier:
Figure 6.46 – Scatter plot showing count and sum(bytes)
Interestingly, this is the 23.22.63.114 source host that we have seen before. Do you remember what that host did? Yes, this is the host that executed the brute-force attack on our target host. Its position on the chart shows that it generated many events (count > 1400). On the other hand, host 40.80.148.42 generated fewer events but transferred more bytes. This was possibly the host that dropped the malicious executable after the reconnaissance phase of the attack. In fact, we can look at the data to see whether there were any file transfers between this host and the target host. The following query shows that there was a file transfer signature in the fortigate_traffic logs for the two hosts in question:
index=botsv1 "192.168.250.70" sourcetype=fortigate_traffic src="40.80.148.42" app="File.Upload.HTTP"
Figure 6.47 shows the two "File.Upload.HTTP" events from 40.80.148.42 (srcip) to 192.168.250.70 (tranip):
Figure 6.47 – File transfer signature
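The visual outlier check used with the scatter plot can also be approximated inside the search. The following is a minimal sketch, not a definitive detection rule: it assumes that "more than two standard deviations above the mean" is a reasonable cutoff for this data, and the field names avg_count, stdev_count, avg_bytes, and stdev_bytes are made up for the example.

```spl
index=botsv1 "192.168.250.70" sourcetype=fortigate_traffic
| stats count sum(bytes) as bytes by src
| eventstats avg(count) as avg_count stdev(count) as stdev_count avg(bytes) as avg_bytes stdev(bytes) as stdev_bytes
| where count > avg_count + (2 * stdev_count) OR bytes > avg_bytes + (2 * stdev_bytes)
```

Here eventstats attaches the overall mean and standard deviation to every row, so the where clause can compare each source host against the whole population. The multiplier of 2 is a starting point to tune, since a few extreme hosts can inflate the standard deviation and hide smaller outliers.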
The following query of the suricata sourcetype also helps us identify an outlier. We can identify the 40.80.148.42 host as malicious based on the number of events and the number of distinct signatures detected by Suricata:
index=botsv1 earliest=0 sourcetype="suricata" dest=imreallynotbatman.com | stats count dc(signature) as signatures by src
Figure 6.48 shows the chart. Note that we removed the legend by setting Legend to None in the Format options:
Figure 6.48 – Scatter plot chart shows the count and distinct signatures by src
We look at bubble charts in the next section.
Bubble charts
Bubble charts are like scatter plots, but the size of each point on the chart represents a third aggregate value. For example, the following Splunk query of the suricata data generates the count, total bytes, and distinct count of signatures for all the external traffic in the Suricata logs:
index=botsv1 earliest=0 sourcetype="suricata" NOT (src=192.168* OR src=2001:* OR src=fe80:* OR src=0*) | stats count sum(bytes) as bytes dc(alert.signature) as sig_count by src
We select Visualization and choose Bubble Chart. The resulting bubble chart has three dimensions:
- Count—Indicated by the location of each data point on the x axis
- Total bytes—Indicated by the location of each data point on the y axis
- Distinct signatures—Indicated by the size of the bubble
Figure 6.49 shows a very large bubble on the rightmost side of the chart. Can you guess which host is represented by this large bubble? If you guessed 40.80.148.42, you are correct:
Figure 6.49 – Bubble plot chart shows the count and number of bytes by src
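With many external hosts, a bubble chart can become crowded. The sketch below is one possible way to keep it readable: it keeps only the sources with the most distinct signatures and fixes the column order so that count drives the x axis, bytes the y axis, and sig_count the bubble size. The limit of 20 hosts is an arbitrary assumption chosen for illustration.

```spl
index=botsv1 earliest=0 sourcetype="suricata" NOT (src=192.168* OR src=2001:* OR src=fe80:* OR src=0*)
| stats count sum(bytes) as bytes dc(alert.signature) as sig_count by src
| sort - sig_count
| head 20
| table src count bytes sig_count
```

Sorting on sig_count before trimming means the largest bubbles are retained; sorting on count or bytes instead would favor hosts that are far out on one of the axes.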
Now, let’s look at ways that we can display data on maps.
Choropleth maps
Splunk map visualizations allow us to see where traffic originates geographically. There are two types of maps in Splunk:
- Cluster maps—Show aggregated values in a geographical region
- Choropleth maps—Useful for showing how an aggregated value ranges across a geographic region
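For comparison, a cluster map needs latitude and longitude values rather than region names. The following is a minimal sketch, assuming the src field holds an IP address that the iplocation command can resolve with the location database available to Splunk; after running it, we would select Cluster Map on the Visualization tab.

```spl
index=botsv1 earliest=0 sourcetype=fortigate_traffic srccountry!=Reserved
| iplocation src
| geostats latfield=lat longfield=lon count
```

The iplocation command adds lat and lon fields to each event, and geostats aggregates the events into geographic bins that the map draws as clustered markers.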
The following query shows the source countries for the traffic in our fortigate_traffic sourcetype. We eliminate events where the srccountry value is Reserved (usually indicating a private or otherwise reserved IP address) and any traffic from the United States. We then click on the Visualization tab and select Choropleth Map:
index=botsv1 earliest=0 sourcetype=fortigate_traffic srccountry!=Reserved srccountry!="United States" | stats count by srccountry | geom geo_countries featureIdField=srccountry
To generate a choropleth map, we first need to generate an aggregate of a field. In our preceding example, we count the number of events from each source country. The aggregate could just as well be a distinct count of signatures or a sum of bytes. The next step in the query uses the geom command, which adds the geographic shape data that the choropleth map needs to draw each region. We specify the geo_countries lookup, which is one of the geospatial lookups that ship with Splunk. The featureIdField argument names the field in our results (srccountry) whose values are matched against the country names in the lookup.
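As a sketch of a different aggregate, the following variation shades each country by the total bytes it sent rather than by the number of events. It assumes the bytes field is populated for these events, as it was in the earlier fortigate_traffic search; the name total_bytes is illustrative.

```spl
index=botsv1 earliest=0 sourcetype=fortigate_traffic srccountry!=Reserved srccountry!="United States"
| stats sum(bytes) as total_bytes by srccountry
| geom geo_countries featureIdField=srccountry
```

Only the stats clause changes; the geom step is identical because the map still matches on srccountry. Countries whose names do not match an entry in the lookup would simply not be shaded, so it is worth spot-checking the results table when a region looks unexpectedly empty.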
As with most charts, a choropleth map has multiple formatting options, such as the following:
- Zoom on Scroll—Determines whether using the scroll feature on the mouse triggers a zoom of the map (see Figure 6.50)
- Latitude/Longitude/Zoom—Define the initial display of the map and allow us to specify which area of the world that we want to focus on (see Figure 6.50):
Figure 6.50 – Format options for a map chart
- Colors—Defines which colors and what intensity and scaling are used to represent the aggregate values (see Figure 6.51):
Figure 6.51 – Color options for a map chart
Figure 6.52 shows the resulting choropleth chart:
Figure 6.52 – Final choropleth map
The charts discussed in this section not only enhance the insights that we get from Splunk but also allow us to almost immediately detect outliers. Sometimes, a simple table will suffice for conveying research, but other times, a beautiful choropleth chart makes a much bigger impression.