Big data is not just a matter of size; it is a multifaceted phenomenon. According to the 3V model (volume, velocity, and variety), systems operating on big data can be classified using three orthogonal criteria:
- The first criterion is velocity; that is, how fast the system processes the data. A few years ago, it referred to how quickly a system could process a batch; nowadays, it indicates whether a system can provide real-time outputs on streaming data.
- The second criterion is volume; that is, how much information is available to be processed. It can be expressed as the number of rows or features, or simply as a count of bytes. For streaming data, volume indicates the throughput of data arriving in the system.
- The last criterion is variety; that is...