Splunk Enterprise has multiple integral components that work together and are primarily divided based on their functions. The list is very comprehensive. A standalone Splunk deployment doesn’t require all the components; however, a distributed and highly available deployment requires almost all of them.
A detailed understanding of standalone versus distributed deployment is covered in the following section of this chapter, Splunk Validated Architectures (SVAs). By the end of this section, you will be familiar with two types of Splunk components—namely, processing components and management components.
Processing components
The following are processing components:
Let’s understand the roles of these components in detail and their association with management components.
Forwarder
As the name suggests, this primarily forwards data from the source to the target indexer. There are two types of forwarders:
- Universal Forwarder (UF)
- Heavy Forwarder (HF)
UF is a software agent typically installed on the source system where data is being generated. It consists of an input configuration (that is, an inputs.conf
file) with a list of absolute file paths along with metadata fields such as index and sourcetype. UF is the preferred approach to monitoring and forwarding file contents to designated indexers. By default, UF makes use of the fishbucket process to forward data for indexing exactly once and avoids data duplication through cyclic redundancy checks (CRCs) and seek pointers. You will find further information about the additional supported data inputs and detailed explanations about the fishbucket concept in Chapter 9, Configuring Splunk Data Inputs.
The following diagram illustrates UF installed on a web server configured to monitor the web server logs and forward them continuously to the indexer as and when the logs get updated:
Figure 1.1: UF forwarding web server logs to indexer
Let us now look at SH, which is a critical user-facing processing component in a distributed deployment.
HF is a Splunk Enterprise instance and doesn't require separate binary for installation. It provides an extended feature set compared to a UF. It not only collects and forwards data, but also includes a Splunk user interface for configuration and management. To operate an HF, a forwarder license is required. Typically, an HF is configured in forwarding mode by disabling local data storage. Splunk Add-ons available on Splunkbase can be installed on an HF to facilitate data collection from various sources. This combination of features makes HFs a versatile choice for preprocessing and forwarding data while benefiting from a user-friendly interface.
SH
The SH component is a Splunk Enterprise instance that is dedicated to search management and provides a number of interfaces for users to interact with. The popular interfaces it offers to users are web, CLI, and RESTful API.
Multiple SHs can be grouped together and form a cluster called a SH cluster (SHC). Members of an SHC share the same baseline configuration, and jobs are allocated to available members by the SH captain.
In a standalone deployment, a single Splunk Enterprise instance (that is, the same instance) works as both the SH and indexer. In a distributed deployment model, the SH or SHC can submit searches to multiple indexers and consolidate the results returned. The results are stored locally in a dispatch directory located in $SPLUNK_HOME/var/run/splunk/dispatch
for later retrieval, and the results will be deleted after the job expires. $SPLUNK_HOME
refers to the installation directory where the Splunk software is installed. For example, ad hoc search results (that is, the search job outcome) are retained for 10 minutes in the dispatch directory, which will be removed after the job expires by a process called the dispatch reaper, which runs every 30 seconds.
SH stores search-time knowledge objects that work directly on raw data and/or fields being returned from the indexer—for example, knowledge objects such as field extractions, alerts, reports, dashboards, and macros are categorized as search-time knowledge objects in Splunk.
The following diagram illustrates a distributed deployment configuration featuring a single dedicated SH that communicates with three separate indexers when executing a search query:
Figure 1.2: SH and indexers interaction
Let us look at another critical processing component—the indexer, which is also called a search peer, as it responds to queries issued by the SH.
Indexer
The indexer accepts and stores the indexed data, which can be retrieved later when requested by the SH. The sources of data transmission can include forwarder agents or inputs without requiring dedicated agents. The indexer(s) can be set up as either standalone instances or as a clustered configuration for HA. The data that has been indexed remains unchangeable and is stored in the form of buckets. More details about buckets are provided in Chapter 5, Splunk Index Management:
Figure 1.3: Indexers receiving data from forwarders and storing it in indexes
So far, we have gone through the processing components and their roles in a Splunk Enterprise deployment. Let us go through the management components in the following section.
Management components
These are management components that support the processing components:
- Deployment Server (DS)
- SHC Deployer (SHC-D)
- Indexer CM
- License Manager (LM)
- MC
We’ll discuss them in the following subsections.
DS
A standalone Splunk Enterprise instance is used to manage the forwarders. The forwarders, which are located at the data source (typically a UF), often need new configurations to monitor new files or changes to an existing configuration followed by an optional restart. Changing them manually is a very time-consuming task in larger infrastructures. That’s where the DS comes to the rescue, by maintaining a central repository of configurations in the form of apps. In addition to UFs, HFs can also be centrally managed using a DS.
Chapter 4, Splunk Forwarder Management, goes through more details on this topic.
SHC-D
The SHC-D manages app configurations and deployments for an SHC in Splunk Enterprise deployment. It distributes app bundles to the SHs, applies configurations, and coordinates rolling restarts if needed.
The SHC-D usually stores all the apps at the following location: $SPLUNK_HOME/etc/shcluster/apps
.
Indexer CM
An indexer cluster incorporates a distinct Splunk Enterprise instance that functions as a Cluster manager, known as a CM. This CM does not engage in typical search operations but rather oversees the indexer cluster, governing it in the following ways:
- The Replication Factor (RF) is met
- The Search Factor (SF) is met
- Deployment of configurations to the cluster
- Responds to SH requests
The Search head indexer clustering overview section of Chapter 7 will explain the RF and SF in detail.
License manager
All components in Splunk Enterprise require a license for commercial use, except for UF, which is a software offered by Splunk that is available for use without requiring a separate license. The LM is loaded with the license file received from Splunk sales by an admin. Multiple license files might exist depending on the agreement with Splunk. The rest of the instances in the deployment, called license peers, are connected to the manager node. The manager node acts as a central license repository for configuring stacks, pools, and license volumes. It stores usage logs in a license_usage.log
file, which tracks all Splunk instances connected to the LM for violations and their usage. Out-of-the-box license reports are dependent on this log. We will discuss this in detail in Chapter 2, Splunk License Management.
Monitoring Console
The MC is a built-in app in Splunk that provides a centralized location for monitoring and managing Splunk deployments. It offers a GUI that allows administrators to monitor and configure various aspects of Splunk, including alerts and dashboards for monitoring indexing, license usage, search, resource usage, forwarders, health checks, and more. We will go through some of these dashboards in detail and set up alerts in later chapters.
Note
Do note that although these components have dedicated roles and activities to perform, some of them can be installed together on the same Splunk instance. A matrix of which components can be combined is provided in the docs: https://tinyurl.com/26f9n5zf.
We have come to the end of the components section. We learned that a UF is preferred for file monitoring and forwarding data to indexers. Depending on the deployment type, whether standalone or distributed, the number of components required to set up differs. Standalone Splunk doesn’t require many components as it functions as both an SH and indexer. A distributed deployment includes a number of additional management components for deployment, cluster management, and license management. The Splunk Enterprise binary utilized for all components remains same; the differentiation lies in the configuration of each binary instance, determining the role of each component such as the SH, indexer, SHC-D, DS, or LM.
As we dive into the chapters associated with both processing and management components, we will look into these topics in more detail, and you will find them discussed a lot throughout the book. So, understanding these components and their role in Splunk Enterprise deployment is quite important to understand the rest of the sections and chapters.