Querying DuckDB with dplyr
The dplyr package is highly regarded among data practitioners who use R for performing data analysis and modeling. It provides users with a set of key verbs for manipulating data, such as select, filter, arrange, summarize, and mutate. By enabling users to combine these verbs through a composable grammar of data manipulation, the dplyr API provides an elegant and intuitive interface for constructing analytical queries programmatically.
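As a minimal sketch of how these verbs compose, the following example chains them over R's built-in mtcars data frame. The dataset, the derived column name hp_per_cyl, and the hp > 100 threshold are illustrative assumptions, not part of the original discussion.

```r
library(dplyr)

# Chain the core dplyr verbs into a single analytical query.
result <- mtcars %>%
  select(mpg, cyl, hp) %>%              # keep only the columns of interest
  filter(hp > 100) %>%                  # keep cars with more than 100 horsepower
  mutate(hp_per_cyl = hp / cyl) %>%     # derive a new column from existing ones
  group_by(cyl) %>%                     # aggregate per cylinder count
  summarize(
    mean_mpg = mean(mpg),
    mean_hp_per_cyl = mean(hp_per_cyl),
    n = n()
  ) %>%
  arrange(desc(mean_mpg))               # order groups by fuel efficiency, best first

print(result)
```

Each verb takes a data frame as its first argument and returns a new one, which is what lets the steps be piped together in reading order.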
dplyr can be used to query a range of data backends, including R dataframes, Apache Arrow tables, Apache Spark datasets, and a variety of popular SQL databases. The dataframe backend is the most frequently used, allowing users to query R dataframes and tibbles using the dplyr interface. The dbplyr package provides an alternative backend that enables dplyr to be used as a query interface for a range of SQL-based databases. It works behind the scenes by translating dplyr operations into the SQL dialect of the database you...