Put simply, functions are ways of manipulating vectors, lists, and data frames. This is perhaps not the most rigorous definition of a function; nevertheless, it captures the essential point: a function takes some inputs, which are vectors (even of one element), lists, or data frames, and returns one output, which is usually a vector, a list, or a data frame.
The exceptions are functions that perform filesystem manipulation or other specific tasks, which in some other languages are called procedures; one example is the file.create() function we encountered before.
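To make the input/output idea concrete, here is a small sketch that uses only base R functions. The object names (a_vector, a_list, a_data_frame, and so on) are made up for illustration and do not come from the text:

```r
# Illustrative objects; the names are hypothetical
a_vector <- c(1, 2, 3)
a_list <- list(first = 1:3, second = letters[1:2])
a_data_frame <- data.frame(id = 1:3, value = c(10, 20, 30))

# A vector in, a vector of one element out
vector_total <- sum(a_vector)

# A list in, a vector out: the length of each list element
element_lengths <- lengths(a_list)

# A data frame in, a data frame out: the first two rows
first_rows <- head(a_data_frame, 2)

print(vector_total)
print(element_lengths)
print(first_rows)
```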
One of the most appreciated features of R is how easily you can explore the definition of the functions available. To do so, submit a command consisting of the function's name alone, without any parentheses. Let's try this with the mode() function and see what happens:
mode
function (x)
{
    if (is.expression(x))
        return("expression")
    if (is.call(x))
        return(switch(deparse(x[[1L]])[1L], `(` = "(", "call"))
    if (is.name(x))
        "name"
    else switch(tx <- typeof(x), double = , integer = "numeric",
        closure = , builtin = , special = "function", tx)
}
<bytecode: 0x102264c98>
<environment: namespace:base>
We are not going to examine this function in detail, but let's note some structural elements:
- We have a call to function(), which, by the way, is a function itself.
- We have the specification of the only argument of the mode function, which is x.
- We have braces surrounding everything coming after the function() call. This is the body of the function and contains all the calculations/computations performed by the function on its inputs.
These are the minimal elements for the definition of a function within the R language. We can summarize this as follows:
function_name <- function(arguments){
  [function body]
}
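These elements can also be retrieved from an existing function with the base R functions formals() and body(). The following is a minimal sketch; halve is a hypothetical function written only for this illustration:

```r
# Hypothetical function used only to illustrate the structural elements
halve <- function(x) {
  x / 2
}

# The arguments: a named list with one entry per argument
print(names(formals(halve)))

# The body: the code between the braces
print(body(halve))

# The object created by function() is itself of class "function"
print(class(halve))
```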
Now that we know the theory, let's try to define a simple and useless function that adds 2 to every number submitted:
adding_two <- function(the_number){
  the_number + 2
}
Does it work? Of course it does. To test it, we first have to execute the lines of code containing the function definition, and then we will be able to use our custom function:
adding_two(the_number = 4)
[1] 6
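Because arithmetic in R is vectorized, the same function also handles a vector of several numbers, which is what "every number submitted" suggests. A short sketch follows, with the definition repeated so that it runs on its own; several_numbers is an illustrative name:

```r
# Same definition as in the text, repeated so the snippet is self-contained
adding_two <- function(the_number) {
  the_number + 2
}

# 2 is added to each element of the input vector
several_numbers <- c(1, 10, 100)
print(adding_two(several_numbers))

# The argument can also be matched by position rather than by name
print(adding_two(4))
```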
Now, let's introduce a slightly more complicated but relevant concept: value assignment within a function. Let's imagine that you are writing a function and want its result stored in a vector named function_result. You would probably write something like this:
my_func <- function(x){
  function_result <- x / 2
}
You may even think that, after running your function with, for instance, x equal to 4, you should find an object function_result equal to 2 (4/2) within your environment.
So, let's try to print it out in the way that we learned some paragraphs earlier:
function_result
This is what happens:
Error: object 'function_result' not found
How is this possible? It follows from the rules governing the assignment of values within a function. We can summarize those rules as follows:
- A function can look up a variable, even if defined outside the function itself
- Variables defined within the function remain within the function
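Both rules can be observed in a short sketch. The names outside_value, scope_demo, and inside_value are hypothetical, and the final check assumes a fresh session in which no object called inside_value already exists:

```r
# Hypothetical names, used only for illustration
outside_value <- 10

scope_demo <- function(x) {
  # outside_value is not defined here, so R looks it up outside the function
  inside_value <- x + outside_value
  inside_value
}

demo_result <- scope_demo(1)
print(demo_result)

# inside_value was created within the function and did not leak out
print(exists("inside_value"))
```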
How, then, can we export the function_result object outside the function? There are two possible ways:
- Employing the <<- operator, the so-called superassignment operator
- Employing the assign() function
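As a minimal sketch of the assign() route, the function below writes function_result into the global environment. The name my_func_assign is hypothetical, and targeting .GlobalEnv through the envir argument is a choice made for this example rather than something prescribed by the text:

```r
# Sketch of the assign() route; my_func_assign is a hypothetical name
my_func_assign <- function(x) {
  # Create function_result in the global environment, not the local one
  assign("function_result", x / 2, envir = .GlobalEnv)
}

my_func_assign(4)
print(function_result)
```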
Here is the function rewritten to employ the superassignment operator:
my_func <- function(x){
  function_result <<- x / 2
}
If you run it with, say, my_func(4), you will now find the function_result object in your environment browser. One last step: exporting an object created within a function is different from returning that object as the result of the function. Let's show this practically:
my_func <- function(x){
  function_result <- x / 2
  function_result
}
If you now try to run my_func(4) once again, your console will print out the result:
[1] 2
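But, within your environment, you will once again not find a function_result object created by this call (if one is still there, it is the copy left behind by the superassignment version and can be removed with rm(function_result)). Why is this? Within the function definition, you specified the value of the function_result object as the final result, or return value, so that value is printed. The object itself, however, was defined with the standard assignment operator, as in the first formulation, and therefore remains inside the function.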
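To keep the returned value for later use, the usual approach is to assign the result of the call to a name in the calling environment. A short sketch follows, in which halved is a name chosen only for illustration:

```r
# Same definition as the last version in the text
my_func <- function(x) {
  function_result <- x / 2
  function_result
}

# Store the returned value under a name of our choosing
halved <- my_func(4)
print(halved)

# The returned value can also be used directly inside another expression
print(my_func(4) + 1)
```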