Using Spark with relational databases
There is debate about whether relational databases fit into big data processing scenarios. However, it's undeniable that vast quantities of structured data live in such databases, and organizations rely heavily on their existing RDBMSs for critical business transactions.
A vast majority of developers are most comfortable working with relational databases and the rich set of tools available from leading vendors. Increasingly, cloud service providers, such as Amazon Web Services (AWS), have made administration, replication, and scaling simple enough for organizations to move their large relational databases to the cloud.
Some good big data use cases for relational databases include the following:
- Complex OLTP transactions
- Applications or features that need ACID compliance
- Support for standard SQL
- Real-time ad hoc query functionality
- Systems with many complex relationships
Note
For excellent coverage of NoSQL and relational use cases, refer to the blog titled What the heck...