Function objects and lambda expressions
Lambda expressions, introduced in C++11, and further enhanced with every C++ version since, are one of the most useful features in modern C++. Their versatility comes not only from easily passing functions to algorithms, but also their use in a lot of circumstances where you need to pass the code around, especially as you can store a lambda in a std::function
.
Although lambdas made these programming techniques vastly simpler to work with, everything mentioned in this section is possible to perform without them. A lambda—or, more formally, a lambda expression—is a convenient way of constructing a function object. But instead of using lambda expressions, we could instead implement classes with operator()
overloaded, and then instantiate these to create function objects.
We will explore the lambda's similarities to these kinds of classes later, but first I will introduce lambda expressions in a simple use case.
The basic syntax of a C++ lambda
In a nutshell, lambdas enable programmers to pass functions to other functions, just as easily as a variable is passed.
Let's compare passing a lambda to an algorithm with passing a variable:
// Prerequisite
auto v = std::vector{1, 3, 2, 5, 4};
// Look for number three
auto three = 3;
auto num_threes = std::count(v.begin(), v.end(), three);
// num_threes is 1
// Look for numbers which is larger than three
auto is_above_3 = [](int v) { return v > 3; };
auto num_above_3 = std::count_if(v.begin(), v.end(), is_above_3);
// num_above_3 is 2
In the first case, we pass a variable to std::count()
, and in the latter case we pass a function object to std::count_if()
. This is a typical use case for lambdas; we pass a function to be evaluated many times by another function (in this case, std::count_if()
).
Also, the lambda does not need to be tied to a variable; just as we can put a variable right into an expression, we can do the same with a lambda:
auto num_3 = std::count(v.begin(), v.end(), 3);
auto num_above_3 = std::count_if(v.begin(), v.end(), [](int i) {
return i > 3;
});
The lambdas you have seen so far are called stateless lambdas; they don't copy or reference any variables from outside the lambda and therefore don't need any internal state. Let's make this a little more advanced by introducing stateful lambdas by using capture blocks.
The capture clause
In the previous example, we hard-coded the value 3
inside the lambda so that we always counted the numbers greater than three. What if we want to use external variables inside the lambda? What we do is capture the external variables by putting them in the capture clause; that is, the []
part of the lambda:
auto count_value_above(const std::vector<int>& v, int x) {
auto is_above = [x](int i) { return i > x; };
return std::count_if(v.begin(), v.end(), is_above);
}
In this example, we captured the variable x
by copying it into the lambda. If we want to declare x
as a reference, we put an &
at the beginning, like this:
auto is_above = [&x](int i) { return i > x; };
The variable is now merely a reference to the outer x
variable, just like a regular reference variable in C++. Of course, we need to be very cautious about the lifetime of objects we pass by reference into a lambda since the lambda might execute in a context where the referenced objects have ceased to exist. It's therefore safer to capture by value.
Capture by reference versus capture by value
Using the capture clause for referencing and copying variables works just like regular variables. Take a look at these two examples and see if you can spot the difference:
Capture by value |
Capture by reference |
|
|
In the first example, x
was copied into the lambda and was therefore not affected when x
was mutated; consequently std::count_if()
counts the number of values above 3.
In the second example, x
was captured by reference, and therefore std::count_if()
instead counts the number of values above 4.
Similarities between a lambda and a class
I mentioned earlier that lambda expressions generate function objects. A function object is an instance of a class that has the call operator, operator()()
, defined.
To understand what a lambda expression consists of, you can view it as a regular class with restrictions:
- The class only consists of one member function
- The capture clause is a combination of the class' member variables and its constructor
The following table shows lambda expressions and the corresponding classes. The left column uses capture by value and the right column capture by reference:
A lambda with capture by value… |
A lambda with capture by reference... |
|
|
...corresponds to this class: |
...corresponds to this class: |
|
|
Thanks to lambda expressions, we don't have to manually implement these function object types as classes.
Initializing variables in capture
As seen in the previous example, the capture clause initializes member variables in the corresponding class. This means that we can also initialize member variables inside a lambda. These variables will only be visible from inside the lambda. Here is an example of a lambda that initializes a capture variable called numbers
:
auto some_func = [numbers = std::list<int>{4,2}]() {
for (auto i : numbers)
std::cout << i;
};
some_func(); // Output: 42
The corresponding class would look something like this:
class SomeFunc {
public:
SomeFunc() : numbers{4, 2} {}
void operator()() const {
for (auto i : numbers)
std::cout << i;
}
private:
std::list<int> numbers;
};
auto some_func = SomeFunc{};
some_func(); // Output: 42
When initializing a variable inside a capture, you can imagine that there is a hidden auto
keyword in front of the variable name. In this case, you can think about numbers
as being defined like auto numbers = std::list<int>{4, 2}
. If you want to initialize a reference, you can use an ampersand in front of the name, which would correspond to auto&
. Here is an example:
auto x = 1;
auto some_func = [&y = x]() {
// y is a reference to x
};
Again, you have to be very cautious about lifetimes when referencing (and not copying) objects outside the lambda.
It's also possible to move an object inside a lambda, which is necessary when using move-only types such as std::unique_ptr
. Here is how it can be done:
auto x = std::make_unique<int>();
auto some_func = [x = std::move(x)]() {
// Use x here..
};
This also demonstrates that it is possible to use the same name (x
) for the variable. This is not necessary. Instead, we could have used some other name inside the lambda, for example [y = std::move(x)]
.
Mutating lambda member variables
As the lambda works just like a class with member variables, it can also mutate them. However, the function call operator of a lambda is const
by default, so we explicitly need to specify that the lambda can mutate its members by using the mutable
keyword. In the following example, the lambda mutates the counter
variable every time it's invoked:
auto counter_func = [counter = 1]() mutable {
std::cout << counter++;
};
counter_func(); // Output: 1
counter_func(); // Output: 2
counter_func(); // Output: 3
If a lambda only captures variables by reference, we do not have to add the mutable
modifier to the declaration, as the lambda itself doesn't mutate. The difference between mutable and non-mutable lambdas is demonstrated in the following code snippets:
Capture by value |
Capture by reference |
|
|
Output: |
Output: |
In the example to the right where v
is captured by reference, the lambda will mutate the variable v
, which is owned by the scope of some_func()
. The mutating lambda in the left column will only mutate a copy of v
, owned by the lambda itself. This is the reason why we will end up with different outputs in the two versions.
Mutating member variables from the compiler's perspective
To understand what's going on in the preceding example, take a look at how the compiler sees the previous lambda objects:
Capture by value |
Capture by reference |
|
|
As you can see, the first case corresponds to a class with a regular member, whereas the capture by reference case simply corresponds to a class where the member variable is a reference.
You might have noticed that we add the modifier const
on the operator()
member function of the capture by reference class, and we also do not specify mutable
on the corresponding lambda. The reason this class is still considered const
is that we do not mutate anything inside the actual class/lambda; the actual mutation applies to the referenced value, and therefore the function is still considered const
.
Capture all
In addition to capturing variables one by one, all variables in the scope can be captured by simply writing [=]
or [&]
.
Using [=]
means that every variable will be captured by value, whereas [&]
captures all variables by reference.
If we use lambdas inside a member function, it is also possible to capture the entire object by reference using [this]
or by copy by writing [*this]
:
class Foo {
public:
auto member_function() {
auto a = 0;
auto b = 1.0f;
// Capture all variables by copy
auto lambda_0 = [=]() { std::cout << a << b; };
// Capture all variables by reference
auto lambda_1 = [&]() { std::cout << a << b; };
// Capture object by reference
auto lambda_2 = [this]() { std::cout << m_; };
// Capture object by copy
auto lambda_3 = [*this]() { std::cout << m_; };
}
private:
int m_{};
};
Note that using [=]
does not mean that all variables in the scope are copied into the lambda; only the variables actually used inside the lambda are copied.
When capturing all variables by value, you can specify variables to be captured by reference (and vice versa). The following table shows the result of different combinations in the capture block:
Capture block |
Resulting capture types |
|
Capture |
|
Capture |
|
Capture Capture |
|
Capture Capture |
Although it is convenient to capture all variables with [&]
or [=]
, I recommend capturing variables one by one, as it improves the readability of the code by clarifying exactly which variables are used inside the lambda scope.
Assigning C function pointers to lambdas
Lambdas without captures can be implicitly converted to function pointers. Let's say you are using a C library, or an older C++ library, that uses a callback function as a parameter, like this:
extern void download_webpage(const char* url,
void (*callback)(int, const char*));
The callback is called with a return code and some downloaded content. It is possible to pass a lambda as a parameter when calling download_webpage()
. Since the callback is a regular function pointer, the lambda must not have any captures and you have to use a plus (+
) in front of the lambda:
auto lambda = +[](int result, const char* str) {
// Process result and str
};
download_webpage("http://www.packt.com", lambda);
This way, the lambda is converted into a regular function pointer. Note that the lambda cannot have any captures at all in order to use this functionality.
Lambda types
Since C++20, lambdas without captures are default-constructible and assignable. By using decltype
, it's now easy to construct different lambda objects that have the same type:
auto x = [] {}; // A lambda without captures
auto y = x; // Assignable
decltype(y) z; // Default-constructible
static_assert(std::is_same_v<decltype(x), decltype(y)>); // passes
static_assert(std::is_same_v<decltype(x), decltype(z)>); // passes
However, this only applies to lambdas without captures. Lambdas with captures have their own unique type. Even if two lambda functions with captures are plain clones of each other, they still have their own unique type. Therefore, it's not possible to assign one lambda with captures to another lambda.
Lambdas and std::function
As mentioned in the previous section, lambdas with captures (stateful lambdas) cannot be assigned to each other since they have unique types, even if they look exactly the same. To be able to store and pass around lambdas with captures, we can use std::function
to hold a function object constructed by a lambda expression.
The signature of a std::function
is defined as follows:
std::function< return_type ( parameter0, parameter1...) >
So, a std::function
returning nothing and having no parameters is defined like this:
auto func = std::function<void(void)>{};
A std::function
returning a bool
with an int
and a std::string
as parameters is defined like this:
auto func = std::function<bool(int, std::string)>{};
Lambda functions sharing the same signature (same parameters and same return type) can be held by the same type of std::function
objects. A std::function
can also be reassigned at runtime.
What is important here is that what is captured by the lambda does not affect its signature, and therefore both lambdas with and without captures can be assigned to the same std::function
variable. The following code shows how different lambdas are assigned to the same std::function
object called func
:
// Create an unassigned std::function object
auto func = std::function<void(int)>{};
// Assign a lambda without capture to the std::function object
func = [](int v) { std::cout << v; };
func(12); // Prints 12
// Assign a lambda with capture to the same std::function object
auto forty_two = 42;
func = [forty_two](int v) { std::cout << (v + forty_two); };
func(12); // Prints 54
Let's put the std::function
to use in something that resembles a real-world example next.
Implementing a simple Button class with std::function
Assume that we set out to implement a Button
class. We can then use the std::function
to store the action corresponding to clicking the button, so that when we call the on_click()
member function, the corresponding code is executed.
We can declare the Button
class like this:
class Button {
public:
Button(std::function<void(void)> click) : handler_{click} {}
auto on_click() const { handler_(); }
private:
std::function<void(void)> handler_{};
};
We can then use it to create a multitude of buttons with different actions. The buttons can conveniently be stored in a container because they all have the same type:
auto create_buttons () {
auto beep = Button([counter = 0]() mutable {
std::cout << "Beep:" << counter << "! ";
++counter;
});
auto bop = Button([] { std::cout << "Bop. "; });
auto silent = Button([] {});
return std::vector<Button>{beep, bop, silent};
}
Iterating the list and calling on_click()
on each button will execute the corresponding function:
const auto& buttons = create_buttons();
for (const auto& b: buttons) {
b.on_click();
}
buttons.front().on_click(); // counter has been incremented
// Output: "Beep:0! Bop. Beep:1!"
The preceding example with buttons and click handlers demonstrates some of the benefits of using std::function
in combination with lambdas; even though each stateful lambda will have its own unique type, a single std::function
type can wrap lambdas that share the same signature (return type and arguments).
As a side note, you might have noticed that the on_click()
member function is declared const
. However, it's mutating the member variable handler_
by increasing the counter
variable in one of the click handlers. This might seem like it breaks const-correctness rules, as a const member function of Button
is allowed to call a mutating function on one of its class members. The reason it is allowed is the same reason that member pointers are allowed to mutate their pointed-to value in a const context. Earlier in this chapter, we discussed how to propagate constness for pointer data members.
Performance consideration of std::function
A std::function
has a few performance losses compared to a function object constructed by a lambda expression directly. This section will discuss some of the things related to performance to consider when using std::function
.
Prevented inline optimizations
When it comes to lambdas, the compiler has the ability to inline the function call; that is, the overhead of the function call is eliminated. The flexible design of std::function
make it nearly impossible for the compiler to inline a function wrapped in a std::function
. The prevention of inline optimizations can have a negative impact on the performance if small functions wrapped in std::function
are called very frequently.
Dynamically allocated memory for captured variables
If a std::function
is assigned to a lambda with captured variables/references, the std::function
will, in most cases, use heap-allocated memory to store the captured variables. Some implementations of std::function
do not allocate additional memory if the size of the captured variable is below some threshold.
This means that not only is there a performance penalty due to the extra dynamic memory allocation, but also that it is slower, as heap-allocated memory can increase the number of cache misses (read more about cache misses in Chapter 4, Data Structures).
Additional run-time computation
Calling a std::function
is generally a bit slower than executing a lambda, as a little more code is involved. For small and frequently called std::function
s, this overhead may become significant. Imagine that we have a really small lambda defined like this:
auto lambda = [](int v) { return v * 3; };
The benchmark that follows demonstrates the difference between executing 10 million function calls for a std::vector
of the explicit lambda type versus a std::vector
of a corresponding std::function
. We will begin with the version using the explicit lambda:
auto use_lambda() {
using T = decltype(lambda);
auto fs = std::vector<T>(10'000'000, lambda);
auto res = 1;
// Start clock
for (const auto& f: fs)
res = f(res);
// Stop clock here
return res;
}
We only measure the time it takes to execute the loop inside the function. The next version wraps our lambda in a std::function
, and looks like this:
auto use_std_function() {
using T = std::function<int(int)>;
auto fs = std::vector<T>(10'000'000, T{lambda});
auto res = 1;
// Start clock
for (const auto& f: fs)
res = f(res);
// Stop clock here
return res;
}
I'm compiling this code on my MacBook Pro from 2018 using Clang with optimizations turned on (-O3
). The first version, use_lambda()
, executes the loop at roughly 2 ms, whereas the second version, use_std_function()
, takes almost 36 ms to execute the loop.
Generic lambdas
A generic lambda is a lambda accepting auto
parameters, making it possible to invoke it with any type. It works just like a regular lambda, but the operator()
has been defined as a member function template.
Only the parameters are template variables, not the captured values. In other words, the captured value, v
, in the following example will be of type int
regardless of the types of v0
and v1
:
auto v = 3; // int
auto lambda = [v](auto v0, auto v1) {
return v + v0*v1;
};
If we translate the above lambda to a class, it would correspond to something like this:
class Lambda {
public:
Lambda(int v) : v_{v} {}
template <typename T0, typename T1>
auto operator()(T0 v0, T1 v1) const {
return v_ + v0*v1;
}
private:
int v_{};
};
auto v = 3;
auto lambda = Lambda{v};
Just like the templated version, the compiler won't generate the actual function until the lambda is invoked. So, if we invoke the previous lambda like this:
auto res_int = lambda(1, 2);
auto res_float = lambda(1.0f, 2.0f);
the compiler will generate something similar to the following lambdas:
auto lambda_int = [v](int v0, const int v1) { return v + v0*v1; };
auto lambda_float = [v](float v0, float v1) { return v + v0*v1; };
auto res_int = lambda_int(1, 2);
auto res_float = lambda_float(1.0f, 2.0f);
As you might have figured out, these versions are further handled just like regular lambdas.
A new feature of C++20 is that we can use typename
instead of just auto
for the parameter types of a generic lambda. The following generic lambdas are identical:
// Using auto
auto x = [](auto v) { return v + 1; };
// Using typename
auto y = []<typename Val>(Val v) { return v + 1; };
This makes it possible to name the type or refer to the type inside the body of the lambda.