Chapter 5. Data Optimization, Reports, Alerts, and Accelerating Searches
Finding the data that you need in Splunk is relatively easy, as you have seen in the previous chapters. Doing the same thing repeatedly, however, requires that you employ techniques that make data retrieval faster. In Chapter 2, Bringing in Data, you saw how to use data fields and make field extractions. In Chapter 4, Data Models and Pivot, you learned how to create data models. You will continue that journey in this chapter by learning how to classify your data using event types, enrich your data using lookups and workflow actions, and normalize your data using tags.
Once you have all these essentials in place, you will be able to easily create reports, alerts, and dashboards. This is where Splunk really shines and your hard work so far will pay off.
In this chapter, we will cover a wide range of topics that showcase ways to manage, analyze, and get results from data. These topics will help you learn to work more efficiently with data and gather better insights from it:
- Data classification with event types
- Data normalization with tags
- Data enrichment with lookups
- Creating reports
- Creating alerts
- The Custom Cron schedule
- Best practices in scheduling jobs
- Optimizing searches
Data classification with event types
When you begin working with Splunk every day, you will quickly notice that many things are repeatable. In fact, while going through this book, you may have seen that search queries can easily get longer and more complex. One way to make things easier and shorten search queries is to create event types. Event types are not the same as events; an event is just a single instance of data. An event type is a grouping or classification of events that meet the same criteria.
If you took a break between chapters, you will need to open Splunk again before executing a search command:
- Open up Splunk.
- Click on your Destinations app.
- Type in this query:
SPL> index=main http_uri=/booking/confirmation http_status_code=200
This search will return successful booking confirmations. Now say you want to run this same search the next day. Without any data classification, you would have to type the same search string as before. Instead of tedious repetition, you can simplify your work by saving this search now as an event type. Follow these steps:
- In the Save As dropdown, select Event Type.
- Label this new event type good_bookings.
- Select a color that is best suited to the type of event; in this case, we will select green.
- Select 5 as the priority. Priority determines which event type's styling wins when an event matches more than one; 1 is the highest and 10 is the lowest.
- Use the following screenshot as a guide, then click on Save:
Now let's create an event type for bad bookings:
- Change the search query from http_status_code=200 to http_status_code=500. The new query is as shown here:
SPL> index=main http_uri=/booking/confirmation http_status_code=500
- Save this as an event type. This time, name it bad_bookings, opt for the color red, and leave Priority at 5:
We have created the two event types we needed. Now let's see them in action:
- Type the following query in the search input:
SPL> eventtype=*bookings
- Notice that the search results have now been color-coded based on the event types that you created. You can also search for just eventtype=good_bookings or eventtype=bad_bookings to narrow down your search results.
- Examine the following screenshot, which shows the results. The colors we have chosen make it easy to spot the types of booking. Imagine the time this saves a manager, who can instantly look for bad bookings. It's just one more way Splunk can make operations so much easier:
Certain restrictions apply when creating event types. You cannot save a search that contains a piped command or a subsearch as an event type; only a base search can be saved as one.
Since the event type is now part of the search, you can then further manipulate data using piped commands, just like this:
SPL> eventtype=*bookings | stats count by eventtype
Create a few more event types now, using the following table as a guide:
Event type | Search command | Color
good_payment | index=main http_uri=/booking/payment http_status_code=200 | green
bad_payment | index=main http_uri=/booking/payment http_status_code=500 | red
destination_details | index=main http_uri=/destination/*/details | blue
bad_logins | index=main http_uri=/auth http_status_code=500 | purple
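Incidentally, event types saved through the UI are written to an eventtypes.conf file in your app. The following is a rough sketch of what the good_bookings entry might look like (illustrative only; exact attribute values, particularly for color, can vary by Splunk version):

# eventtypes.conf (sketch)
[good_bookings]
# the base search this event type matches
search = index=main http_uri=/booking/confirmation http_status_code=200
# styling and precedence as chosen in the UI
color = green
priority = 5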
Data normalization with tags
Tags in Splunk are useful for grouping events with related field values. Unlike event types, which are based on saved search strings, tags are created and mapped directly to specific field values. You can also assign multiple tags to the same field value, each one capturing a different attribute of it.
The simplest use-case scenario when using tags is for classifying IP addresses. In our Eventgen logs, three IP addresses are automatically generated. We will create tags against these IP addresses that would allow us to classify them based on different conditions:
IP address | Tags
10.2.1.33 | main, patched, east
10.2.1.34 | main, patched, west
10.2.1.35 | backup, east
In our server farm of three servers, we are going to group them by purpose, patch status, and geolocation. We will achieve this using tags, as shown in the following steps:
- Begin by using the following search command:
SPL> index=main server_ip=10.2.1.33
- Expand the first event by clicking on the arrow in the i (information) column, as seen in this screenshot:
- While expanded, look for the server_ip field. Click on the Actions dropdown and select Edit Tags:
- In the Create Tags window, fill in the Tag(s) text area using the following screenshot as a guide. For 10.2.1.33, you will use the following tags: main, patched, east.
- Click on Save when you're done:
- Do the same for the remaining two IP addresses and create tags based on the previous table.
- Now let us make use of this newly-normalized data. Run the search command:
SPL> index=main tag=patched OR tag=east
This will give you all the events that come from the servers that are patched or hypothetically located on the east side of a building. You can then combine these with other search commands or an event type to narrow down the search results.
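Under the hood, tags created through the UI are stored in a tags.conf file, keyed by field=value pairs. As a rough sketch, the entry for the first server would look something like this:

# tags.conf (sketch)
[server_ip=10.2.1.33]
# each tag is simply switched on for this field=value pair
main = enabled
patched = enabled
east = enabled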
Consider a scenario where you need to find all booking payments with errors originating from the servers in the east side of a hypothetical building.
Without event types or tags, you would create a search command that looked something like this:
SPL> index=main server_ip=10.2.1.33 OR server_ip=10.2.1.35
AND (http_uri=/booking/payment http_status_code=500)
Compare that to this much more elegant and shorter search command, which you can try now:
SPL> index=main eventtype=bad_payment tag=east
Here's an additional exercise for you. Create tags for the following fields using this table as a guide and use them in a search query:
Fields | Tags
http_uri = /destination/LAX/details | major_destination
http_uri = /destination/NY/details | major_destination
http_uri = /destination/MIA/details | major_destination
http_status_code = 301 | redirect
http_status_code = 404 | not_found
Now you can use these tags to search for major destinations whose details pages returned a not_found status code. Tags like these can make your searches much easier and more useful. Here is an example of a search command that combines what you have learned in this chapter so far:
- Go ahead and run this now:
SPL> eventtype=destination_details tag=major_destination tag=not_found
- Look through your results and see that you now have data from the destinations LAX, NY, and MIA.
Data enrichment with lookups
Occasionally you will come across pieces of data that you wish were rendered in a more readable manner. A common example is HTTP status codes. Computer engineers are often familiar with status codes as three-digit numbers. Business analysts, however, would not necessarily know the meaning of these codes. In Splunk, you solve this predicament by using lookup tables, which can pair numbers or acronyms with more understandable text classifiers.
A lookup table is a mapping of keys and values that Splunk can query so that it can translate fields into more meaningful information at search time.
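For example, the http_status.csv file you are about to upload pairs each HTTP status code with a readable description and a type. Its contents look something like the following sketch (the column names match the lookup fields used later in this section; only a few representative rows are shown):

status,status_description,status_type
200,OK,Successful
301,Moved Permanently,Redirection
302,Found,Redirection
404,Not Found,Client Error
500,Internal Server Error,Server Error

This is best understood by setting it up yourself. Go through the following steps: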
- From the Destinations app, click on Settings and then Lookups:
- In the Lookups page, click on the Add new option next to Lookup table files, as shown in the following screenshot:
- In the Add new page, make sure that the Destinations app is selected.
- Then, using the following screenshot as your guide, in Upload a lookup file, browse to and choose the following file: C:\splunk-essentials\labs\chapter05\http_status.csv.
- Finally, type in http_status.csv in the Destination filename field.
- Click on Save to complete:
The new lookup table file path will now appear in the main Lookup Table Files page. Change the permissions so that all apps can use it; it will then appear as Global. The entries on the Lookup table files page should be similar to the following screenshot:

Now that we have configured the lookup table file, it is time to define the lookup:
- In the Lookups page under Settings, click on the Add new option next to Lookup definitions:
- Once again, make sure that this is being saved in the context of the Destinations app.
- In the name field, type in http_status.
- Leave the Type as File-based. In the Lookup file dropdown, look for the http_status.csv file and select it.
- Leave the following checkboxes blank:
- Save the definition.
- The new lookup definition will now appear in the table. Change the permission sharing to Global as well.
Let us now try to make use of this new lookup table:
- In the Destinations app search bar, type in:
SPL> eventtype=destination_details | top http_status_code
- The result will show the http_status_code column with the raw status codes. Now extend your search by using the lookup command. Note that this multiline command may not work if you simply copy and paste it; retype it if necessary:
SPL> eventtype=destination_details | top http_status_code | rename http_status_code AS status | lookup http_status status OUTPUT status_description, status_type
- Look at the following output. The steps you took give you a meaningful output showing the description and type of the status codes, all because of the lookup table we first set up:
This is good for a first step, but for it to be a practical tool, the lookup needs to happen automatically with all queries. To do this, take the following steps:
- Go back to Settings and then the Lookups page.
- Click on the Add new option next to Automatic lookups:
- Complete the form with the following information. Click on Save when you're done. Go to Permissions and change the sharing permission to Global by clicking on All Apps:
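For reference, an automatic lookup corresponds to a LOOKUP- entry in props.conf. A minimal sketch is shown here; the sourcetype name is hypothetical, so substitute whatever sourcetype your Eventgen events actually carry:

# props.conf (sketch; the stanza name is a hypothetical sourcetype)
[access_custom]
# match the event's http_status_code against the status column of the
# http_status lookup, then add the description and type fields to the event
LOOKUP-http_status = http_status status AS http_status_code OUTPUTNEW status_description status_type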
Now let's see how these changes can help us out:
- Go back to the Destinations app search bar and type in the following query:
SPL> eventtype=destination_details status_type=Redirection
Tip
Note that you can now filter your search using the lookup information without invoking the lookup command.
- Notice that the search output will match all events where http_status_code equals 301 or 302.
Creating reports
So far in this chapter, you have learned how to do three very important things: classify data using event types, normalize data using tags, and enrich data using lookup tables. All these, in addition to Chapter 4, Data Models and Pivot, constitute the essential foundation you need to use Splunk in an efficient manner. Now it is time to put them all to good use.
Splunk reports are reusable searches that can be shared with others or saved as dashboard panels. Reports can also be scheduled to run periodically and perform an action, such as sending the results out by e-mail. They can be configured to display search results in a statistical table, as well as in visualization charts. You can create a report from the search command line or from a Pivot. Here we will create a report using the search command line:
- In the Destinations app's search page, type in this command:
SPL> eventtype=bad_logins | top client_ip
The search is trying to find all client IP addresses that attempted to log in but got a 500 internal server error.
- To save this as a report for future use, click on Save As | Report, then give it the title Bad Logins:
- Next, click Save.
- Then click on View to go back to the search results.
- Notice that the report is now properly labeled with our title. You can see the report in the following screenshot:
- If you expand the Edit dropdown, you now have additional options to consider while working on this report.
You can modify the permissions so others can use your report. You have done this step a couple of times earlier in the book. This process will be identical to editing permissions for other objects in Splunk.
You can create a schedule to run this report on a timely basis and perform an action on it. The typical action would either be sending the result as an e-mail or running a script. Unfortunately, you would need a mail server to send an e-mail, so you will not be able to do this from your Splunk workstation the way it is currently configured. However, we will show you how it is done:
- Click Edit | Edit Schedule.
- In the pop-up window, click on Schedule Report.
- Change the Schedule option to run Every Day. The time range applies to the scope of the search; the default is to run the report against a 15-minute time range.
Schedule windows are important for production environments. The schedule window you specify should be shorter than the interval between runs. When there are multiple concurrent searches going on in the Splunk system, Splunk checks whether you have a schedule window and delays your report for up to the defined window, or until no other concurrent searches are running. This is one way of optimizing your Splunk system. If you need results computed at an exact time, however, do not use the schedule window option.
- Refer to the following screenshot, then click on Next when you're ready to move on:
- In the next window, check the Send Email box to show advanced e-mail options. Once again, since your workstation does not have a mail server, the scheduled report will not work. But it is worth viewing what the advanced e-mail options look like:
- Uncheck the Send Email option again and click on Save. The report will still run, but it will not perform any action. We can, however, embed the report into an external website and it will always show the results based on the scheduled run. We will reserve further discussion about this advanced option for Chapter 7, Splunk SDK for JavaScript and D3.js.
There is another option that you will commonly use for reports: adding them to dashboards. You can do this with the Add to Dashboard button. We will use this option in Chapter 6, Panes of Glass.
Create a few more reports from SPL using the following guidelines; the searches are built from the event types you created earlier in this chapter. We will use some of these reports in future chapters, so try your best to do all of them. You can always come back to this chapter if you need to:
Search | Schedule | Report name | Time range | Time window
eventtype=bad_payment | Run every hour | Bad payments | Last 24 hrs | 30 mins
eventtype=*bookings | Run every 24 hours | Bookings last 24 hrs | Last 24 hrs | 15 mins
You also have the ability to create reports using Pivot:
- Click on Pivot.
- Create a Pivot table on the Destination Details child object with Last 24 hours as your Filters and Airport Code as your Split Rows.
- Refer to the following screenshot, then save it as a report entitled Destinations by Airport Code. Schedule the report to run every hour, within a 24-hour time range, and with a 30-minute time window:
Creating alerts
Alerts are crucial in IT operations. They provide real-time awareness of the state of your systems, and they enable you to act as soon as an issue is detected rather than waiting for a user to report it. Sure, you can have a couple of data center operators monitor your dashboards, but nothing jolts their vigil more than an informative alert.
Now, alerts are only good if they are controlled and if they provide enough actionable information. Splunk allows you to do just that. In this section, we will walk you through how to create an actionable alert and how to throttle the alerting to avoid flooding your mailbox.
The exercises in this section will show you how to create an alert, but in order to generate the actual e-mail alert, you will need a mail server. This book will not cover mail servers but the process of creating the alert will be shown in full detail.
We want to know when there are instances of a failed booking. The bad_bookings event type was constructed with the 500 HTTP status code, and 5xx status codes are the most serious errors a web application can return, so we want to be aware of them. We will now create an alert that triggers when a bad booking event is detected. Follow these steps:
- To create the alert, start by typing this:
SPL> eventtype=bad_bookings
- Click on Save As | Alert. In the Save As Alert panel, fill in the form using the following screenshot as a guide:
Let us explain some of the different options in this selection:
- Permissions: You should be fairly familiar with permissions by now. These apply to alerts as well.
- Alert type: There are two types of alert, just as there are two ways to run a search: scheduled or in real time. For scheduled alerts, Splunk has predefined schedules that you can easily use, namely:
- Run every hour
- Run every day
- Run every week
- Run every month
- Although the schedules above are convenient, you will likely soon find yourself wanting more granularity for your searches. This is where the fifth option comes in: Run on Cron schedule. We will discuss this in detail later in the chapter.
- Trigger Conditions: These are the conditions or rules that define when the alert will be generated. The predefined conditions that Splunk offers out-of-the-box are:
- Number of Results: Most commonly used, this tells the alert to trigger whenever your search returns a certain number of results.
- Number of Hosts: This is used when you need to know how many hosts are returning events based on your search.
- Number of Sources: This is used when you need to know how many data sources are returning events based on your search.
- Custom: This is used when you want to base your condition on the value of a particular field that is returned in your search result. We will discuss this in detail further into this chapter.
- Trigger Actions: These are the actions that will be invoked when your trigger conditions are met. There are several possible default trigger actions currently included in Splunk Enterprise:
- Add to Triggered Alerts: This will add an entry to the Activity | Triggered Alerts page. This is what we will use in this book, since it is the only readily available option.
- Run a script: You can run a script (such as a Python script) located in the $SPLUNK_HOME/bin/scripts directory whenever this alert is triggered. This is useful for self-repairing issues.
- Send e-mail: Commonly used, but requires a mail server to be configured.
- Webhook: A recently introduced trigger type that allows Splunk to make an HTTP POST to an external application (such as Twitter or Slack).
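Before you save, it is worth knowing where the throttling mentioned at the start of this section lives. Once an alert is saved, its Edit dialog exposes a Throttle option that suppresses repeat triggers for a period you choose; in configuration terms, this maps to the alert.suppress settings in savedsearches.conf. A rough sketch (the stanza name is simply whatever you titled your alert):

# savedsearches.conf (sketch)
[Booking Errors]
# suppress repeat triggers of this alert for 60 minutes
alert.suppress = 1
alert.suppress.period = 60m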
Click on Save to save your first alert. We will come back later to optimize it. Meanwhile, you should now be on the alert summary page, where you can continue to make changes. Note that since we selected the Add to Triggered Alerts action, you will see the history of when this alert has been triggered on your machine. Since the Eventgen data is randomized and the alert is scheduled to run every hour, you may have to wait until the next hour for results:

Search and report acceleration
In Chapter 4, Data Models and Pivot, you learned how to accelerate a data model to speed up retrieval of data. The same principle applies to saved searches or reports:
- Click on the Reports link in the navigation menu of the Destinations app.
- Click on the Edit | Edit Acceleration option in the Bookings Last 24 Hrs report.
- Enable 1 Day acceleration as seen in the following screenshot:
- To check the progress of your report's acceleration, click on Settings | Report Acceleration Summaries:
Scheduling best practices
No matter how advanced and well-scaled your Splunk infrastructure is, if all scheduled searches and reports run at the same time, the system will start experiencing issues. Typically, you will receive a Splunk message saying that you have reached the limit of concurrent or historical searches. Suffice it to say that only a certain number of searches can run per CPU core on each Splunk instance. The very first issue a beginner Splunk admin faces is how to limit the number of searches running concurrently. One way to fix this is to throw more servers into the Splunk cluster, but that is not the most efficient approach.
The trick to establishing a robust system is to properly stagger and budget scheduled searches and reports. This means ensuring that they are not running at the same time. There are two ways to achieve this:
- Time windows: The first way to ensure that searches do not run concurrently is to always set a time window, as you did in the exercises in this chapter. This is not ideal, however, if each run needs to start at an exact time.
- Custom Cron schedule: This is what most advanced users use to create their schedules. Cron is a system daemon, or a computer program that runs as a background process, derived from traditional UNIX systems; it is used to execute tasks at specified times.
Let us see an example of how to use a custom Cron schedule. Begin with this search query, which finds all errors in a payment:
- Type in the following:
SPL> eventtype=bad_payment
- Save it as an alert by clicking on Save As | Alert.
- Name it Payment Errors.
- Change the permissions to Shared in App.
- In the Alert type, change the schedule to Run on Cron Schedule.
- In the Earliest field, enter -15m@m (the last 15 minutes, snapped to the start of the minute; this gives a 15-minute time range that always begins exactly on a minute boundary).
- In the Latest field, type in now.
- In the Cron Expression field, type in */5 * * * *.
- Finally, change the Trigger Actions to Add to Triggered Alerts. Use the following screenshot as a guide:
- Click Save when done.
The Cron expression * * * * * corresponds to minute hour day month day-of-week.
Learning Cron expressions is easiest when you look at examples. The more examples, the simpler it is to understand this method of scheduling. Here are some typical examples:
Cron expression | Schedule
*/5 * * * * | Every 5 minutes
*/15 * * * * | Every 15 minutes
0 */6 * * * | Every 6 hours, on the hour
30 */2 * * * | Every 2 hours, at the 30th minute (for instance, 2:30)
45 14 1,10 * * | Every 1st and 10th of the month, at 2:45 pm
0 * * * 1-5 | Every hour, Monday to Friday
2,17,32,47 * * * * | Every 2nd, 17th, 32nd, and 47th minute of every hour
Now that you know something about Cron expressions, you can fine-tune all your searches to run in precise and different schedules.
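As with most things in Splunk, these schedules land in configuration files. The Payment Errors alert you just saved, for instance, is represented in savedsearches.conf roughly as follows (an abbreviated sketch; the real stanza carries additional attributes):

# savedsearches.conf (abbreviated sketch)
[Payment Errors]
enableSched = 1
# run every 5 minutes over the trailing 15-minute window
cron_schedule = */5 * * * *
dispatch.earliest_time = -15m@m
dispatch.latest_time = now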
Summary indexing
In a matter of days, Splunk will accumulate data and start to move events into the cold bucket. If you recall, the cold bucket is where older data is stored, typically on slower disk. You will still be able to access this data, but you are bound by the speed of that disk. Compound this with the millions of events that are typical of an enterprise Splunk implementation, and you can see why historical searches slow down dramatically.
There are two ways to circumvent this problem, one of which you have already performed: search acceleration and summary indexing.
With summary indexing, you run a scheduled search and output the results into a separate index called summary. The result contains only the computed statistics of the search. This produces a very small subset of data that is much faster to retrieve than searching through the entirety of the events in the cold bucket.
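Under the hood, this is just a search whose results are written into another index. You can preview the mechanics with the collect command, which writes search results to an index of your choice; here is a quick ad hoc sketch (the source label is an arbitrary value chosen for this example), before we set it up properly as a scheduled search below:

SPL> eventtype=bad_payment | stats count | collect index=summary source="payment_errors_adhoc"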
Say, for example, you wish to keep track of all counts of an error in payment and you wish to keep the data in the summary index. Follow these steps:
- From your Destinations app, go to Settings | Searches, reports, and alerts.
- Click on the New button to create a new scheduled search.
- Use the following information as a guide:
  - Destinations app: Destinations
  - Search name: Summary of Payment Errors
  - Search: eventtype=bad_payment | stats count
  - Start time: -2m@m
  - Finish time: now

Now perform the following steps:
- Click on Schedule this search.
- Change Schedule type to Cron.
- Set Cron schedule to */2 * * * *.
- Set Condition to always. (This option, found in the Alert section, can alternatively be set to trigger only when the number of events is greater than 0.)
- Set Expiration to Custom time of 1 hour.
Use the following screenshot as a guide:

Now perform the following steps:
- Click on the Enable checkbox in the Summary indexing section.
- Add a new field in the Add fields section, setting summaryCount equal to count.
Use the following information as a guide:

- Save when you are ready to continue.
- Now go back to the Destinations app's Search page. Type in the following search command and wait about 5-10 minutes:
SPL> index=summary search_name="Summary of Payment Errors"
Notice that the summary events have been stripped of everything except the statistic computed at the time the scheduled search ran (the count of events). We will use this information in later chapters to create optimized dashboards.
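Once a few scheduled runs have accumulated, you can report on the summary data directly. For example, here is a sketch that charts payment errors over time from the summary index (the count field comes from the stats count in the scheduled search):

SPL> index=summary search_name="Summary of Payment Errors" | timechart span=1h sum(count) AS payment_errors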
Summary
In this chapter, you have learned how to optimize data in three ways: classifying your data using event types, normalizing your data using tags, and enriching your data using lookup tables. You have also learned how to create advanced reports and alerts. You have accelerated your searches just as you did with data models. You have been introduced to the powerful Cron expression, which gives you fine-grained control over your scheduled searches, and you have seen how to stagger searches using time windows. Finally, you have created a summary index that allows you to search historical data faster. In the next chapter, Chapter 6, Panes of Glass, you will go on to learn more about visualizations.