Before we dive into the different data ingestion techniques, let's look at how batch processing differs from real-time (stream) processing. The following sections describe each of these two ecosystems.
Batch processing versus real-time processing
Batch processing
The following points describe the batch processing system:
- Very efficient at processing high volumes of data.
- All data processing steps (that is, data collection, data ingestion, data processing, and results presentation) run as a single batch job.
- Throughput matters more than latency. Latency is typically more than a minute.
- Throughput depends directly on the size of the data and the available computational resources.
- Available...
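To make the points above concrete, here is a minimal sketch of a batch job in Python. It is an illustration only: the `run_batch_job` function and the word-count workload are hypothetical and do not come from any particular framework. The sketch runs collection, ingestion, processing, and result preparation as one job, then reports the elapsed time and the throughput.

```python
import time


def run_batch_job(records):
    """Run every stage as one batch job (hypothetical example workload)."""
    start = time.perf_counter()

    # Data collection: gather the whole input before doing anything else.
    collected = list(records)

    # Data ingestion: normalize each raw record.
    ingested = [line.strip() for line in collected]

    # Data processing: count word occurrences across the whole batch.
    word_counts = {}
    for line in ingested:
        for word in line.split():
            word_counts[word] = word_counts.get(word, 0) + 1

    # Results presentation happens only after the entire batch is done,
    # so every record waits for the full job to finish.
    elapsed = time.perf_counter() - start
    throughput = len(collected) / elapsed if elapsed > 0 else float("inf")
    return word_counts, elapsed, throughput


if __name__ == "__main__":
    counts, elapsed, throughput = run_batch_job(["a b", "b c", "c d"])
    print(counts)
    print(f"elapsed: {elapsed:.6f} s, throughput: {throughput:.0f} records/s")
```

Because no result is available until the whole job completes, the latency seen by any single record equals the duration of the job, while throughput grows or shrinks with the input size and the resources available to the job.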