In this section, we will learn to use the dplyr package to access data from database sources. We will also see how to connect to an external database using the DBI package. Finally, we will cover the pool package, which manages database connections, prevents connection leaks, and helps maintain performance.
- dplyr: A popular data-manipulation package that works with both in-memory data frames and external databases. When pointed at a database, its verbs are translated into SQL rather than run in R. It provides a variety of functions for data manipulation:
- filter()
- select()
- arrange()
- rename()
- distinct()
- mutate()
- transmute()
- summarise()
- sample_n()
- sample_frac()
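As a rough illustration of how several of these verbs compose, here is a minimal sketch on the built-in iris data frame. The names sepal_summary, Sepal.Ratio, mean_ratio, and rows are chosen here purely for illustration and are not part of dplyr or the dataset.

```r
library(dplyr)

# Chain a few verbs on the built-in iris data frame.
sepal_summary <- iris %>%
  filter(Species == "setosa") %>%                        # keep only setosa rows
  select(Species, Sepal.Length, Sepal.Width) %>%         # keep three columns
  mutate(Sepal.Ratio = Sepal.Length / Sepal.Width) %>%   # add a derived column (illustrative name)
  arrange(desc(Sepal.Ratio))                             # sort by the new column, largest first

# Inspect the first few rows of the result.
head(sepal_summary, 3)

# Collapse the result to a single summary row.
sepal_summary %>%
  summarise(mean_ratio = mean(Sepal.Ratio), rows = n())
```

Each verb takes a data frame and returns a new one, which is why the steps can be chained with %>% without intermediate variables.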
Let's see an example using some of these functions with the iris dataset:
library(dplyr)
iris %>% filter(Sepal.Length > 4 & Sepal.Length < 5)
In the preceding code, the filter() function keeps only the rows of the iris dataset whose Sepal.Length value is greater than 4 and less than 5. We can also...