Using R with Our Database
At this point, you can copy data to and from a database. This gives you the freedom to expand beyond SQL to other data analytics tools (such as Excel) and to incorporate into your pipeline any program that can read a CSV file as input. However, although almost any analytics tool can read a CSV file, you still need to download the data from the database first. Every extra step makes your analytics pipeline more complex, and complexity is undesirable because it requires additional maintenance and increases the number of failure points.
Another approach is to connect to your database directly from your analytics code. In this part of the chapter, we will look at how to do this in R, a programming language designed specifically for statistical computing. Later in the chapter, we will look at integrating our data pipelines with Python as well.
Why Use R?
While we have managed to perform aggregate-level descriptive statistics on our data...