Emerging perspectives & drivers for new age data architectures
Driver 1—BIG data intervention.
We have defined big data and large dataset concepts in Chapter 2, Machine learning and Large-scale datasets. The data that is now being ingested and needs to be processed typically has the following characteristics:
- Source: Depending upon the nature of the information, the source may be a real-time stream of data (for example, trade transactions), or batches of data containing updates since the last sync
- Content: The data may represent different types of information. Often, this information is related to other pieces of data and needs to be connected to them
The following screenshot shows the types of data and different sources that need to be supported:
- Volume: Depending upon the nature of the data, the volume being processed varies. For example, master data and securities definition data are relatively fixed in size, whereas transaction data is enormous compared to the other two...
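The two source styles described under Source can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Record type, the batch_since and stream functions, and the sample keys and dates are all hypothetical names invented for this example, not part of any particular ingestion framework.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Record:
    """Hypothetical unit of ingested data (illustrative only)."""
    key: str
    payload: dict
    updated_at: datetime


def batch_since(records: Iterable[Record], last_sync: datetime) -> List[Record]:
    """Batch source: return only the records updated after the last sync."""
    return [r for r in records if r.updated_at > last_sync]


def stream(records: Iterable[Record]) -> Iterator[Record]:
    """Stream source: hand over records one at a time, in arrival order."""
    for r in records:
        yield r


# Made-up sample data: one relatively fixed securities definition
# and two trade transactions.
store = [
    Record("SEC-1", {"name": "Bond A"}, datetime(2024, 1, 1)),
    Record("TRD-1", {"qty": 100}, datetime(2024, 1, 3)),
    Record("TRD-2", {"qty": 250}, datetime(2024, 1, 4)),
]

# Only the records changed since the last sync are picked up by the batch.
delta = batch_since(store, last_sync=datetime(2024, 1, 2))
print([r.key for r in delta])  # ['TRD-1', 'TRD-2']
```

In this sketch the batch path re-reads the store and filters by timestamp, while the stream path processes each record as it arrives. A real system would typically track the last sync time per source and read the stream from a message queue rather than from an in-memory list.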