Summary
In this chapter, you learned a number of basic functions and several packages for data manipulation. Manipulating data with built-in functions alone can be verbose. Several packages are tailored for filtering and aggregating data, each based on a different technique and philosophy. The sqldf package uses an embedded SQLite database so that we can write SQL statements directly to query data frames in our working environment. On the other hand, data.table provides an enhanced version of data.frame with a powerful syntax, and dplyr defines a grammar of data manipulation by providing a set of pipeline-friendly verb functions. The rlist package provides a set of pipeline-friendly functions for manipulating non-tabular data. No single package is best for all situations. Each of them represents a way of thinking, and which one best fits a certain problem depends on how you understand the problem and on your experience of working with data.
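To make the comparison concrete, the following is a minimal sketch that expresses one small aggregation in each style. It assumes that sqldf, data.table, dplyr, and rlist are installed. The built-in mtcars dataset and the people list are illustrative choices made here, not examples taken from the chapter.

```r
# Assumption: all four packages are installed.
library(sqldf)
library(data.table)
library(dplyr)
library(rlist)

# Built-in functions: average mpg for each number of cylinders
base_result <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# sqldf: the same question written as a SQL query against the data frame
sql_result <- sqldf(
  "SELECT cyl, AVG(mpg) AS avg_mpg FROM mtcars GROUP BY cyl ORDER BY cyl"
)

# data.table: compute j grouped by cyl inside the square brackets;
# keyby also sorts the result by the grouping column
dt <- as.data.table(mtcars)
dt_result <- dt[, .(avg_mpg = mean(mpg)), keyby = cyl]

# dplyr: verb functions chained with the pipe operator
dplyr_result <- mtcars %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg)) %>%
  arrange(cyl)

# rlist: non-tabular data, here a hypothetical list of records
# whose skills field varies in length
people <- list(
  list(name = "Ken", age = 24, skills = c("R", "SQL")),
  list(name = "Ashley", age = 25, skills = c("R", "Python")),
  list(name = "James", age = 23, skills = "Java")
)

# Names of the people who list R among their skills
r_users <- people %>%
  list.filter("R" %in% skills) %>%
  list.mapv(name)
```

The first four results hold the same group averages. What differs is how the question is phrased: as a formula, as SQL, as a bracket expression, or as a chain of verbs. The rlist part handles records that do not fit naturally into a table.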
Processing data and doing simulation require considerable...