Chapter 3. Data Scrubbing
Getting data into Splunk can be a long process, and it is an important step that is often overlooked on a Splunk journey. If we do it poorly, we face a lot of cleanup, which is usually much more complicated than sitting down beforehand to plan how we want our data to get to Splunk and how we want it to look. This preparation is known as data scrubbing, or data cleaning. It covers breaking events at the proper line, extracting some fields, and masking data before and after Splunk writes it to disk.
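To make the idea concrete, here is a minimal conceptual sketch in Python of two of the scrubbing steps just described: breaking raw text into events at the proper line, and masking sensitive values. It is not how Splunk implements these steps; in Splunk itself this work is normally expressed as configuration (for example, in props.conf) rather than code. The sample log lines, the timestamp format, the `card=` field, and the function names are all assumptions made up for illustration.

```python
import re

# Hypothetical raw input: two events, the first spanning two lines.
RAW = (
    "2015-03-01 12:00:01 ERROR payment failed card=4111111111111111\n"
    "    at com.example.Pay.charge(Pay.java:42)\n"
    "2015-03-01 12:00:05 INFO payment ok card=5500005555555559\n"
)

# Assumption: a new event starts on any line that begins with a timestamp.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ", re.MULTILINE)

# Assumption: card numbers are 16 digits; keep only the last four.
CARD = re.compile(r"(card=)\d{12}(\d{4})")


def break_events(raw):
    """Split raw text into events, one per timestamped line.

    Continuation lines (such as stack traces) stay attached to the event
    above them. Text before the first timestamp is dropped.
    """
    starts = [m.start() for m in EVENT_START.finditer(raw)]
    ends = starts[1:] + [len(raw)]
    return [raw[s:e].rstrip("\n") for s, e in zip(starts, ends)]


def mask(event):
    """Replace the first twelve digits of a card number with X characters."""
    return CARD.sub(r"\1XXXXXXXXXXXX\2", event)


events = [mask(e) for e in break_events(RAW)]

for event in events:
    print(event)
    print("---")
```

The design point the sketch is meant to show is ordering: events are broken first, so the stack-trace line stays with its error, and masking is applied to each event before anything is stored.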
Topics that will be covered in this chapter:
- Heavy Forwarder management
  - What is a Heavy Forwarder?
  - Managing the deployment server
  - Installing modular inputs
- Data formatting
  - Event management
  - Knowledge management
- Pre/post indexing techniques to clean data before indexing
  - Pre-indexed field extractions
  - Data masking
We're going to focus on only a few pieces of a Splunk system in this chapter:
- Universal Forwarders
- Thin clients that...