Working with Databases
In this chapter, we will look at how to work with relational databases. Databases remain one of the most common systems that data pipelines read from and write to, so it is important to understand how to work with them efficiently. We will start with the Spark JDBC API and then build a small database library that provides a simple interface for working with databases.
Specifically, we will cover the following topics:
- Understanding the Spark JDBC API
- Working with the Spark JDBC API
- Loading the database configuration
- Creating a database interface
- Performing various database operations