Understanding Splunk indexing and buckets
Much of Splunk's strength comes from the way it indexes data. Logically, a Splunk index is a repository of data stored in a uniform manner to make searching efficient. Physically, an index is a set of subdirectories called buckets. In Splunk, the term indexing refers to the process by which data arriving from multiple sources is organized into indexes. In this section, we will explore the mechanisms used to store data in indexes and buckets.
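To make the physical layout a little more concrete, here is a minimal Python sketch. It assumes that warm and cold bucket directories are named db_<newest_time>_<oldest_time>_<id>, with both times given as epoch seconds. That pattern, the function name parse_bucket_name, and the sample directory names are assumptions made for illustration and are not taken from this chapter; hot buckets and clustered deployments use other names, which this sketch simply skips.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# Assumed naming pattern for warm/cold buckets: db_<newest>_<oldest>_<id>
# (epoch seconds for both times). Other directory names are ignored.
BUCKET_NAME = re.compile(r"^db_(?P<newest>\d+)_(?P<oldest>\d+)_(?P<id>\d+)$")


class BucketInfo(NamedTuple):
    newest: datetime   # timestamp of the latest event in the bucket
    oldest: datetime   # timestamp of the earliest event in the bucket
    bucket_id: int     # local bucket identifier


def parse_bucket_name(name: str) -> Optional[BucketInfo]:
    """Return the time range and ID encoded in a bucket directory name,
    or None if the name does not match the assumed pattern."""
    match = BUCKET_NAME.match(name)
    if match is None:
        return None
    return BucketInfo(
        newest=datetime.fromtimestamp(int(match["newest"]), tz=timezone.utc),
        oldest=datetime.fromtimestamp(int(match["oldest"]), tz=timezone.utc),
        bucket_id=int(match["id"]),
    )


if __name__ == "__main__":
    # Hypothetical directory names, as they might appear inside an index.
    for name in ["db_1700003600_1700000000_12", "hot_v1_3"]:
        print(name, "->", parse_bucket_name(name))
```

If the assumed pattern holds, each bucket directory advertises the time range of the events it contains, so a search over a given time window can skip buckets whose range falls outside it.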
Raw data is forwarded from the source into Splunk, where it is converted into Splunk events that are organized into indexes. An index is an immutable repository of data: once data is added to an index, it cannot be edited. This goes back to the concept of the immutability of big data that we discussed in Chapter 1, Introduction to Splunk and its Core Components. There is no way to delete individual events from an index, but Splunk allows the following:
-
...