In this chapter, we focus on central issues in large-scale data processing and data management, including when to use a DBMS that can perform parallel operations and when to use a Hadoop- or YARN-style system. In this section, we address two contrasting approaches to handling a high volume of data: filesystems and DBMSs.
Non-DBMS-based approach to big data
Filesystems
The filesystem-oriented approach is the traditional method from the early days of data processing, although several applications dealing with small, simple datasets still use it today. In this approach, data is stored and processed in separate files. For example, each user defines and implements their own files that...