First class functions in R
R is a functional language at its core. Functions are treated like any other data type and are considered first-class citizens. The following examples show that in R every operation, including the use of an operator, is a function call.
Here, the operator + is a function in itself:
> 10+20
[1] 30
> "+"(10,20)
[1] 30
Here, the operator ^ is also a function in itself:
> 4^2
[1] 16
> "^"(4,2)
[1] 16
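Because operators are ordinary functions, they can also be bound to names, passed to other functions, and returned from functions. The following is a minimal sketch of that idea; the names add, total, powers, make_power, and cube are hypothetical and chosen only for illustration.

```r
# Operators are ordinary functions, so they can be stored in variables,
# passed as arguments, and returned from other functions.
add <- `+`                      # bind the addition function to a new name
add(10, 20)                     # same as 10 + 20

# Pass an operator to a higher-order function.
total  <- Reduce(`+`, 1:4)      # ((1 + 2) + 3) + 4
powers <- sapply(1:3, `^`, 2)   # 1^2, 2^2, 3^2

# Return a function from a function (make_power is a hypothetical helper).
make_power <- function(p) function(x) x^p
cube <- make_power(3)
cube(2)                         # 2^3
```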
Now, let's dive deep into functional concepts, which are crucial and widely used by R programmers.
Vectorized functions are among the most popular functional concepts. They let the programmer apply an operation to every element of a vector without writing an explicit loop. The vector can also be part of a dataframe, a matrix, or a list. Let's look at this in detail using an example in which we perform an operation on each element of a given vector V_in. The operation is to square each element of the vector and store the result in the vector V_out. We will implement it using three approaches, as follows:
Approach 1: Here, the operation is performed at the element level using a for loop. This is the most primitive of the three approaches: the output vector is grown one element at a time, in the style of the S language:
> V_in <- 1:100000              ## Input Vector
> V_out <- c()                  ## Output Vector
> for(i in V_in)                ## For loop on Input vector
+ {
+   V_out <- c(V_out,i^2)       ## Storing on Output vector
+ }
Approach 2: Here, the vectorized functional concept is used to achieve the same objective. The loops behind vectorized operations are implemented in C and hence run much faster than for loops written in R (Approach 1). This operation completes almost instantaneously:
> V_in <- 1:100000              ## Input Vector
> V_out <- V_in^2               ## Output Vector
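The difference between the first two approaches can be checked with a rough timing sketch such as the one below. It assumes a smaller input vector than the one above so that the loop finishes quickly; the names V_loop, V_vec, loop_time, and vec_time are hypothetical, and the measured times will vary from machine to machine.

```r
V_in <- 1:10000                      # smaller input than above so the loop finishes quickly

# Approach 1: grow the output vector inside a for loop
loop_time <- system.time({
  V_loop <- c()
  for (i in V_in) {
    V_loop <- c(V_loop, i^2)
  }
})

# Approach 2: vectorized operation
vec_time <- system.time(V_vec <- V_in^2)

# Both approaches produce the same values
all.equal(V_loop, V_vec)

# Elapsed seconds for each approach (values depend on the machine)
print(c(loop = loop_time[["elapsed"]], vectorized = vec_time[["elapsed"]]))
```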
Approach 3: Here, higher-order functions (functions that take other functions as arguments) are used to achieve the same objective. As functions are first-class citizens in R, they can be passed as arguments to other functions. The most widely used higher-order functions are those in the apply family. The following table provides a summary of the various types of functions within the apply family:
Table 1.4 Various types of functions in the apply family
Now, let's evaluate these functions through examples. The apply function can be applied to a dataframe, matrix, or array. Let's illustrate it using a matrix:
> x <- cbind(x1 = 7, x2 = c(7:1, 2:5))
> col.sums <- apply(x, 2, sum)
> row.sums <- apply(x, 1, sum)
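The second argument of apply selects the margin: 1 for rows and 2 for columns. The sketch below repeats the matrix example and, as an illustration only, cross-checks it against the base R functions colSums and rowSums; col.ranges is a hypothetical extra result showing that an anonymous function can be supplied as well.

```r
x <- cbind(x1 = 7, x2 = c(7:1, 2:5))   # 11 x 2 matrix; x1 is recycled to 11 rows

col.sums <- apply(x, 2, sum)           # MARGIN = 2: one result per column
row.sums <- apply(x, 1, sum)           # MARGIN = 1: one result per row

col.sums                               # x1 = 77, x2 = 42

# For plain sums, base R also offers dedicated functions
all.equal(col.sums, colSums(x))
all.equal(row.sums, rowSums(x))

# Any function can be supplied, including an anonymous one
col.ranges <- apply(x, 2, function(v) max(v) - min(v))
```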
The lapply function is a higher-order function that applies a function to each element of a vector or list, including the variables (columns) of a dataframe, and returns a list. An example of lapply is shown below:
> x <- list(x1 = 7:1, x2 = c(7:1, 2:5))
> lapply(x, mean)
The use of the sapply function on a vector input with a custom function is shown below:
> V_in <- 1:100000                          ## Input Vector
> V_out <- sapply(V_in,function(x) x^2)     ## Output Vector
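The main practical difference between lapply and sapply is the shape of the result: lapply always returns a list, whereas sapply tries to simplify the result to a vector. The following sketch illustrates this on the list used earlier; the names as_list and as_vec are hypothetical, and a short input vector is assumed for the second part.

```r
x <- list(x1 = 7:1, x2 = c(7:1, 2:5))

as_list <- lapply(x, mean)     # list with one element per input element
as_vec  <- sapply(x, mean)     # simplified to a named numeric vector

is.list(as_list)               # TRUE
is.numeric(as_vec)             # TRUE
names(as_vec)                  # "x1" "x2"

# With a custom function, sapply gives the same values as the vectorized form
V_in  <- 1:10
V_out <- sapply(V_in, function(v) v^2)
all.equal(V_out, V_in^2)
```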
The function mapply is a multivariate version of sapply. In a call to mapply, the function is the first argument, followed by the input vectors, as shown below:
mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)
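As a minimal sketch of how these arguments fit together, the following calls use small hypothetical inputs; the anonymous functions and the names sums, scaled, and as_list are illustrative only.

```r
# Element-wise call over two vectors: FUN receives one element of each
sums <- mapply(function(a, b) a + b, 1:3, 4:6)      # 1+4, 2+5, 3+6

# MoreArgs supplies arguments that stay fixed for every call
scaled <- mapply(function(a, b, k) (a + b) * k, 1:3, 4:6,
                 MoreArgs = list(k = 10))

# SIMPLIFY = FALSE keeps the result as a list instead of a vector
as_list <- mapply(function(a, b) a + b, 1:3, 4:6, SIMPLIFY = FALSE)
```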
An example that uses mapply to replicate the elements of one vector according to a second vector is:
> mapply(rep, 1:6, 6:1)
Here, mapply calls the rep function with each input from 1 to 6 as its first argument and the corresponding count from 6 to 1 as its second argument, so 1 is repeated six times, 2 five times, and so on.
The tapply function applies a function to each cell of a ragged array. For example, let's create a list with a multiple array:
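One way to sketch this, assuming two hypothetical grouping vectors collected in a list named ind (the names and values are illustrative), is the following:

```r
# Two grouping vectors of equal length, collected in a list (illustrative values)
ind <- list(c(1, 2, 2), c("A", "A", "B"))

# Without a function, tapply returns the cell position of each element
# in the 2 x 2 array formed by the groups
cell_index <- tapply(1:3, ind)

# With a function, tapply applies it to the values in each cell;
# cells with no values are NA
cell_sums <- tapply(1:3, ind, sum)
```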
The output describes the relationship between the two vectors, with the position as the value.
The function rapply is a recursive version of lapply, as shown below:
> X <- list(list(a = pi, b = list(c = 1:1)), d = "a test")
> rapply(X, sqrt, classes = "numeric", how = "replace")
The function applies sqrt to every numeric element in the nested list and replaces it with the new value; elements of other classes are left unchanged.
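The how argument controls the shape of the result. The sketch below repeats the call above and adds a second, hypothetical list Y to show how = "unlist", which flattens the transformed elements into a vector; the names replaced, Y, and flat are illustrative.

```r
X <- list(list(a = pi, b = list(c = 1:1)), d = "a test")

# how = "replace": same nested structure, numeric leaves replaced by their square roots
replaced <- rapply(X, sqrt, classes = "numeric", how = "replace")
replaced[[1]]$a                 # square root of pi
replaced$d                      # unchanged character element

# how = "unlist": only the transformed leaves, flattened into a vector
# (Y is a hypothetical list used for illustration)
Y <- list(a = 4, b = list(c = 9, d = "text"))
flat <- rapply(Y, sqrt, classes = "numeric", how = "unlist")
```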