Chapter 3: Loading and Unloading Data
In this chapter, we will delve into the data loading process, which moves transformed data from source systems into the table structure of a target data warehouse. While data can be loaded into Amazon Redshift using an INSERT statement (as with other relational databases), it is more efficient to bulk load the data, given the volumes that a data warehouse handles. For example, in a data warehouse table fed by an ordering system, the entire previous day's worth of data usually needs to be loaded at once rather than as individual orders. Similarly, data from the data warehouse can be exported to other applications in bulk using the unload feature.
There are multiple ways of loading data into an Amazon Redshift cluster. The most common way is using the COPY command to load data from Amazon S3. This chapter will cover all the different ways you can load data into a Redshift cluster from different sources.
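To make the COPY approach concrete, here is a minimal sketch in Python that only assembles the text of a COPY statement for CSV files stored in Amazon S3. The helper name build_copy_statement, the orders table, the bucket path, and the IAM role ARN are all hypothetical placeholders, and the CSV and header-row options are assumptions about the input files rather than anything this chapter prescribes.

```python
def build_copy_statement(table, s3_uri, iam_role_arn):
    """Return the text of a Redshift COPY statement for CSV files in Amazon S3.

    All arguments are illustrative placeholders. The values are interpolated
    directly into the SQL text, so they must come from trusted configuration,
    never from user input.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"       # assumes the files are comma-separated
        "IGNOREHEADER 1;"       # assumes each file starts with one header row
    )


if __name__ == "__main__":
    # Hypothetical table, bucket prefix, and role; replace with real values.
    print(
        build_copy_statement(
            "orders",
            "s3://example-bucket/orders/2024-01-01/",
            "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
        )
    )
```

The resulting statement can be run from any SQL client connected to the cluster. Because the FROM clause names an S3 prefix rather than a single file, one statement of this shape can load a whole day's worth of files in a single bulk operation, which is the pattern the ordering example above describes.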
The following recipes will be...