Data Analytics Using Splunk 9.x

You're reading from Data Analytics Using Splunk 9.x: A practical guide to implementing Splunk's features for performing data analysis at scale

Product type: Paperback
Published in: Jan 2023
Publisher: Packt
ISBN-13: 9781803249414
Length: 336 pages
Edition: 1st Edition
Author: Dr. Nadine Shillingford

Creating advanced charts

The charts we have looked at so far have been relatively simple. We did not require any special commands to prepare the data. In this section, we look at three chart types that are more advanced.

Table 6.2 reminds us of the charting options we have covered so far, as well as the three charts that we will cover in this section:

Line chart
  • Works great for: Displaying trends over time
  • Search structure: | timechart count [by comparison_category]

Area chart
  • Works great for: Showing changes in aggregated values over time
  • Search structure: | timechart count [by comparison_category]

Column chart
  • Works great for: Comparing fields or calculated values
  • Search structure: | stats count by comparison_category

Bar chart
  • Works great for: Comparing fields or calculated values
  • Search structure: | stats count by comparison_category

Single-value chart
  • Works great for: Displaying numeric changes in trends. Can also be used to display single values.
  • Search structure: | timechart count
    OR
    | stats count

Pie chart
  • Works great for: Comparing categories
  • Search structure: | stats count by comparison_category

Scatter chart
  • Works great for: Displaying the relationship between discrete values in two dimensions
  • Search structure: | stats x_value_aggregation y_value_aggregation by name_category [comparison_category]

Bubble chart
  • Works great for: Displaying the relationship between discrete values in three dimensions
  • Search structure: | stats x_value_aggregation y_value_aggregation size_aggregation by name_category [comparison_category]

Choropleth map
  • Works great for: Displaying aggregated values in a geographic region
  • Search structure: | stats count by featureId | geom geo_countries featureIdField=featureId

Table 6.2 – Different charts available in Splunk

Now, let’s look at scatter plots.

Scatter plots

A scatter plot charts two numerical values against each other: the position of each point along the x and y axes represents the two values of that data point. In Splunk, a scatter plot requires two aggregate values, such as count, sum(), or avg(). For example, the following search shows the number of events and the total number of bytes for each source host that communicated with the target host (192.168.250.70):

index=botsv1 "192.168.250.70" sourcetype=fortigate_traffic 
| stats count sum(bytes) as bytes by src

To display a scatter plot in Splunk, we click on the Visualization tab and select Scatter Chart.

One of the advantages of using a scatter plot is that it makes it easy to detect outliers. For example, Figure 6.46 shows that the data point on the extreme right is an outlier:

Figure 6.46 – Scatter plot showing count and sum(bytes)


Interestingly, this is the 23.22.63.114 source host that we have seen before. Do you remember what that host did? Yes, this is the host that executed the brute-force attack on our target host. Its behavior stands out because it generated many events (count > 1400). On the other hand, host 40.80.148.42 generated fewer events but transferred more bytes. This host was possibly the one that dropped the malicious executable after the reconnaissance phase of the attack. In fact, we can look at the data to see whether there were any file transfers between 40.80.148.42 and the target host. The following query shows that there was a file transfer signature in the fortigate_traffic logs for the two hosts in question:

index=botsv1 "192.168.250.70" sourcetype=fortigate_traffic src="40.80.148.42" app="File.Upload.HTTP"

Figure 6.47 shows the two "File.Upload.HTTP" events from 40.80.148.42 (srcip) to 192.168.250.70 (tranip):

Figure 6.47 – File transfer signature


The following query of the suricata sourcetype also helps us identify an outlier. We can identify the 40.80.148.42 host as malicious based on the number of events and the number of distinct signatures that are detected by Suricata:

index=botsv1 earliest=0 sourcetype="suricata" dest=imreallynotbatman.com
| stats count dc(signature) as signatures by src

Figure 6.48 shows the chart. Note that we removed the legend by setting Legend to None in the Format options:

Figure 6.48 – Scatter plot chart shows the count and distinct signatures by src
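
The outliers that stand out visually in the scatter plots above can also be flagged numerically. The following is a minimal sketch, not part of the original searches: it reuses the first scatter plot search and marks any source whose event count is more than two standard deviations above the mean. The avg_count, stdev_count, and is_outlier field names and the two-standard-deviation threshold are assumptions chosen for illustration:

```spl
index=botsv1 "192.168.250.70" sourcetype=fortigate_traffic
| stats count sum(bytes) as bytes by src
| eventstats avg(count) as avg_count stdev(count) as stdev_count
| eval is_outlier=if(count > avg_count + 2 * stdev_count, "yes", "no")
| sort - count
```

The results are easiest to read on the Statistics tab. Whether a particular host crosses the threshold depends on the data, and with only a handful of sources a single extreme value can inflate the standard deviation, so the multiplier may need tuning.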


We look at bubble charts in the next section.

Bubble charts

Bubble charts are like scatter plots, but the size of each point on the chart represents a measure of some aggregate in the data. For example, the following Splunk query of the suricata data generates the count, total bytes, and distinct count of signatures for all the external traffic in the Suricata logs:

index=botsv1 earliest=0 sourcetype="suricata" NOT (src=192.168* OR src=2001:* OR src=fe80:* OR src=0*)
| stats count sum(bytes) as bytes dc(alert.signature) as sig_count by src

We select Visualization and choose Bubble Chart. The resulting bubble chart has three dimensions:

  • Count—Indicated by the location of each data point on the x axis
  • Total bytes—Indicated by the location of each data point on the y axis
  • Distinct signatures—Indicated by the size of the bubble

Figure 6.49 shows a very large bubble on the rightmost side of the chart. Can you guess which host is represented by this large bubble? If you guessed 40.80.148.42, you are correct:

Figure 6.49 – Bubble plot chart shows the count and number of bytes by src
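
Based on the three dimensions listed above, the chart appears to assign the x axis, the y axis, and the bubble size in the order in which the aggregate columns appear in the results. Assuming that holds, a sketch like the following reorders the columns with the table command so that the distinct signature count is on the x axis, the event count is on the y axis, and the total bytes set the bubble size. The column order here is an illustrative choice, not a recommendation from the original text:

```spl
index=botsv1 earliest=0 sourcetype="suricata" NOT (src=192.168* OR src=2001:* OR src=fe80:* OR src=0*)
| stats count sum(bytes) as bytes dc(alert.signature) as sig_count by src
| table src sig_count count bytes
```

Trying more than one arrangement can help, because a host that blends in on one axis may stand out clearly when the same value is shown as bubble size.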


Now, let’s look at ways that we can display data on maps.

Choropleth maps

Splunk map visualizations allow us to see where traffic originates geographically. There are two types of maps in Splunk:

  • Cluster maps: Plot aggregated values as markers at specific geographic coordinates
  • Choropleth maps: Shade geographic regions, such as countries, to show how an aggregated value varies from one region to another
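
As a rough sketch of the cluster map type, which the rest of this section does not cover, the following search uses the iplocation command to look up coordinates for each source address and the geostats command to aggregate the events by location. It assumes that the src field in the fortigate_traffic data holds public IP addresses that iplocation can resolve. The result would be displayed by selecting Cluster Map on the Visualization tab:

```spl
index=botsv1 earliest=0 sourcetype=fortigate_traffic srccountry!=Reserved
| iplocation src
| geostats latfield=lat longfield=lon count
```

Here, lat and lon are the coordinate fields that iplocation adds to each event. Addresses that cannot be resolved get no coordinates and so would not appear on the map.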

The following query shows the source countries for the traffic in our fortigate_traffic sourcetype. We eliminate events where the srccountry value is Reserved (usually indicating a private or otherwise reserved IP address range rather than a real country) and any traffic from the United States. We click on the Visualization tab and select Choropleth Map:

index=botsv1 earliest=0 sourcetype=fortigate_traffic srccountry!=Reserved srccountry!="United States"
| stats count by srccountry 
| geom geo_countries featureIdField=srccountry

To generate a choropleth map, we first need to generate an aggregate of a field. In our preceding example, we are counting the number of events from each source country. This aggregate could be anything from a count to a distinct number of signatures to a total number of bytes. The next step in the query uses the geom command, which adds the geographic boundary data that Splunk needs to draw each region on a choropleth map. We specify the geo_countries lookup, which is one of the geospatial lookups included with Splunk. The featureIdField argument names the field in our results (srccountry) whose values are matched against the regions in the lookup.
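
Because any aggregate can drive the shading, the same pattern works for other measures. As a sketch, assuming that the bytes field is populated for these events (it is used with sum() earlier in this section), the following variation shades each country by the total number of bytes rather than by the number of events:

```spl
index=botsv1 earliest=0 sourcetype=fortigate_traffic srccountry!=Reserved srccountry!="United States"
| stats sum(bytes) as bytes by srccountry
| geom geo_countries featureIdField=srccountry
```

Only the stats line changes. The geom step stays the same because the regions are still matched on srccountry.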

As with most charts, a choropleth map has multiple formatting options, such as the following:

  • Zoom on Scroll—Determines whether using the scroll feature on the mouse triggers a zoom of the map (see Figure 6.50)
  • Latitude/Longitude/Zoom—Define the initial display of the map and allow us to specify which area of the world that we want to focus on (see Figure 6.50):
Figure 6.50 – Format options for a map chart


  • Colors—Defines which colors and what intensity and scaling are used to represent the aggregate values (see Figure 6.51):
Figure 6.51 – Color options for a map chart


Figure 6.52 shows the resulting choropleth map:

Figure 6.52 – Final choropleth map


The charts discussed in this section not only enhance the insights that we get from Splunk but also allow us to spot outliers almost immediately. Sometimes, a simple table will suffice for conveying findings, but other times, a beautiful choropleth map makes a much bigger impression.
