Getting to know the Data Node
Event and flow data are required for security purposes as well as for compliance. The storage available on the Console and processors might not be enough to meet compliance retention requirements.
For example, a central bank may mandate that event and flow data be kept for 2 years, while the storage available on the processors holds only 6 months of data. In such a scenario, multiple Data Nodes can be added to a processor so that the processed data can be retained for the full period.
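The arithmetic behind this scenario can be sketched in a few lines of Python. This is only a rough sizing illustration: the function name data_nodes_needed and the figure of 6 months of retention per Data Node are assumptions made for the example, and real capacity depends on the hardware and on the event and flow rates.

```python
import math


def data_nodes_needed(required_months, processor_months, node_months):
    """Return how many Data Nodes are needed to cover the retention gap.

    All arguments are retention periods in months. node_months is a
    hypothetical per-node capacity, assumed equal for every Data Node.
    """
    if node_months <= 0:
        raise ValueError("node_months must be positive")
    # Months of data that the processor cannot hold on its own.
    gap = max(0, required_months - processor_months)
    # Round up: a partially filled Data Node is still a whole appliance.
    return math.ceil(gap / node_months)


# 2 years required, 6 months held by the processor, and (by assumption)
# 6 months held by each Data Node.
print(data_nodes_needed(24, 6, 6))  # prints 3
```

Under these assumed numbers, three Data Nodes would close the 18-month gap.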
Adding a Data Node to a deployment has two advantages:
- Increases the storage space for event and flow data
- Makes searches more efficient
Multiple Data Nodes can be attached to a single processor, but one Data Node cannot be attached to multiple processors. In other words, each Data Node shares data with exactly one processor.
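The attachment rule can be pictured as a many-to-one mapping from Data Nodes to processors. The following is a minimal sketch of that rule as a data structure; the Deployment class and the host names are hypothetical and are not part of any product API.

```python
class Deployment:
    """Toy model of processor-to-Data-Node attachments (not a product API)."""

    def __init__(self):
        # Maps each Data Node name to the single processor it is attached to.
        self.processor_of = {}

    def attach(self, data_node, processor):
        """Attach a Data Node, refusing a second, different processor."""
        current = self.processor_of.get(data_node)
        if current is not None and current != processor:
            raise ValueError(f"{data_node} is already attached to {current}")
        self.processor_of[data_node] = processor

    def data_nodes(self, processor):
        """List the Data Nodes attached to the given processor."""
        return sorted(n for n, p in self.processor_of.items() if p == processor)


deployment = Deployment()
deployment.attach("datanode-1", "event-processor-1")
deployment.attach("datanode-2", "event-processor-1")  # many nodes, one processor
print(deployment.data_nodes("event-processor-1"))

try:
    deployment.attach("datanode-1", "event-processor-2")  # violates the rule
except ValueError as err:
    print(err)
```

Keying the dictionary by Data Node rather than by processor makes the one-processor-per-node constraint hold by construction.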
When Data Nodes are added to the deployment, a process called data rebalancing takes place: the incoming data on the processor is distributed among the attached Data Nodes.
If a Data Node goes down (or crashes), incoming data is not written to that Data Node. Once the Data Node is back up, data is rebalanced again between the processor and the Data Node. We will cover Data Nodes in more detail when discussing searches in Chapter 6.
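To make the rebalancing behavior concrete, here is a toy simulation. It is not the product's actual algorithm, which is not described here: the even-split policy, the least-loaded write rule, and the names rebalance, ingest, and the host names are all assumptions made for illustration.

```python
def rebalance(stored, online):
    """Spread the volume held by online hosts evenly across those hosts.

    stored maps host name -> units of data held; online is the set of hosts
    that are currently reachable. Offline hosts are left untouched.
    """
    hosts = sorted(h for h in stored if h in online)
    if not hosts:
        return
    total = sum(stored[h] for h in hosts)
    share, extra = divmod(total, len(hosts))
    for i, host in enumerate(hosts):
        # The first `extra` hosts take one additional unit of the remainder.
        stored[host] = share + (1 if i < extra else 0)


def ingest(stored, online, units):
    """Write incoming units only to online hosts, least-loaded host first."""
    for _ in range(units):
        target = min(sorted(online), key=lambda h: stored[h])
        stored[target] += 1


stored = {"processor": 0, "datanode-1": 0}
online = {"processor", "datanode-1"}

ingest(stored, online, 10)     # both hosts up: data is shared between them
online.discard("datanode-1")   # the Data Node goes down
ingest(stored, online, 6)      # new data lands on the processor only
online.add("datanode-1")       # the Data Node comes back up
rebalance(stored, online)      # data is evened out again
print(stored)
```

In this model the processor temporarily absorbs everything that arrives during the outage, and the imbalance is corrected once the Data Node returns.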