Chapter 1. Extract, Transform, and Load
Business may focus on profits and sales, but business intelligence (BI) focuses on data. Activities reliant on data require the business analyst to acquire it from diverse sources. Extract, Transform, and Load, commonly referred to as ETL, is a deliberate process to get, manipulate, and store data to meet business or analytic needs. ETL is the starting point for many business analytic projects. Poorly executed ETL can cost a business money and delay its decisions. This chapter covers the following four key topics:
- Understanding big data in BI analytics
- Extracting data from sources
- Transforming data to fit analytic needs
- Loading data into business systems for analysis
This chapter presents each ETL step within the context of the R computational environment. Each step is broken down into finer levels of detail and includes a variety of situations that business analysts encounter when doing BI work in a business world shaped by big data.
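To make the three steps concrete before the detailed sections, here is a minimal sketch in base R. It is an illustration under stated assumptions, not the chapter's own dataset or method: the column names (`order_id`, `region`, `amount`), the inline CSV text, and the temporary output file are all hypothetical stand-ins for a real source and a real business system.

```r
# Extract: read raw records.
# The inline CSV text is a hypothetical stand-in for a real source file.
raw_csv <- "order_id,region,amount
1,east,100.50
2,west,
3,east,75.25"
orders <- read.csv(text = raw_csv, stringsAsFactors = FALSE)

# Transform: drop rows with a missing amount and standardize region labels.
clean <- orders[!is.na(orders$amount), ]
clean$region <- toupper(clean$region)

# Load: write the cleaned data to a destination.
# A temporary file is an assumed stand-in for a business system.
out_path <- tempfile(fileext = ".csv")
write.csv(clean, out_path, row.names = FALSE)

# Read the stored data back to confirm what was loaded.
loaded <- read.csv(out_path, stringsAsFactors = FALSE)
print(loaded)
```

In practice each step is rarely this short: the source may be a database or web service, the transformations depend on the analytic question, and the destination is usually a shared system rather than a local file.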