We’ve already seen that Seastar continuations are lambdas, passed to the then() method of a future. In the examples we’ve seen so far, lambdas have been nothing more than anonymous functions. But C++11 lambdas have one more trick up their sleeve, which is extremely important for future-based asynchronous programming in Seastar: lambdas can capture state. Consider the following example:
#include <seastar/core/sleep.hh>
#include <iostream>

seastar::future<int> incr(int i) {
    using namespace std::chrono_literals;
    return seastar::sleep(10ms).then([i] { return i + 1; });
}

seastar::future<> f() {
    return incr(3).then([] (int val) {
        std::cout << "Got " << val << "\n";
    });
}
The future operation incr(i) takes some time to complete (it needs to sleep a bit first…), and in that duration, it needs to save the i value it is working on. In the early event-driven programming models, the programmer needed to explicitly define an object for holding this state, and to manage all these objects. Everything is much simpler in Seastar, with C++11’s lambdas: the capture syntax “[i]” in the above example means that the value of i, as it existed when incr() was called, is captured into the lambda. The lambda is not just a function: it is in fact an object, with both code and data. In essence, the compiler automatically created the state object for us, and we need neither to define it nor to keep track of it (it gets saved together with the continuation, when the continuation is deferred, and gets deleted automatically after the continuation runs).
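To make the "object with both code and data" point concrete, the sketch below places a capturing lambda next to a hand-written class that does roughly what the compiler generates for it. It uses only standard C++ (no Seastar), and the names incr_state, make_incr and snapshot_demo are hypothetical, chosen just for this illustration.

```cpp
#include <cassert>

// Hand-written equivalent of the closure type the compiler generates
// for [i] { return i + 1; }: one data member plus a call operator.
struct incr_state {
    int i;                                   // the captured copy
    int operator()() const { return i + 1; } // the lambda body
};

// Returns a closure that remembers the value of i at the time of the call.
auto make_incr(int i) {
    return [i] { return i + 1; };
}

// Both objects keep their own copy of i, so later changes to the
// original variable do not affect them.
int snapshot_demo() {
    int i = 3;
    auto lambda = make_incr(i);
    incr_state by_hand{i};
    i = 100;
    assert(lambda() == by_hand());
    return lambda();
}
```

Calling snapshot_demo() yields the value computed from the captured 3, not from the later 100, which is the behavior a deferred continuation relies on.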
One implementation detail worth understanding is that when a continuation has captured state and is run immediately, this capture incurs no runtime overhead. However, when the continuation cannot be run immediately (because the future is not yet ready) and needs to be saved till later, memory needs to be allocated on the heap for this data, and the continuation’s captured data needs to be copied there. This has runtime overhead, but it is unavoidable, and is very small compared to the related overhead in the threaded programming model (in a threaded program, this sort of state usually resides on the stack of the blocked thread, but the stack is much larger than our tiny capture state, takes up a lot of memory and causes a lot of cache pollution on context switches between those threads).
In the above example, we captured i by value - i.e., a copy of the value of i was saved into the continuation. C++ has two additional capture options: capturing by reference and capturing by move:
Using capture-by-reference in a continuation is usually a mistake, and can lead to serious bugs. For example, if in the above example we captured a reference to i, instead of copying it,
seastar::future<int> incr(int i) {
    using namespace std::chrono_literals;
    // Oops, the "&" below is wrong:
    return seastar::sleep(10ms).then([&i] { return i + 1; });
}
this would have meant that the continuation would contain the address of i, not its value. But i is a stack variable, and the incr() function returns immediately, so when the continuation eventually gets to run, long after incr() returns, this address will contain unrelated content.
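The difference between storing a copy and storing an address can be observed safely as long as the variable is still alive. The following is a minimal sketch in plain C++ (no Seastar); the function name value_vs_reference is hypothetical.

```cpp
#include <cassert>

int value_vs_reference() {
    int i = 3;
    auto by_value = [i] { return i + 1; };      // stores a copy of i
    auto by_reference = [&i] { return i + 1; }; // stores only the address of i
    i = 10;
    assert(by_value() == 4);       // still sees the copy made at capture time
    assert(by_reference() == 11);  // sees whatever is at that address now
    return by_reference();
}
```

Here both lambdas run before i goes out of scope, so the reference is still valid. A deferred continuation runs after the enclosing function has returned, at which point the same address no longer holds i.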
An exception to the capture-by-reference-is-usually-a-mistake rule is the do_with() idiom, which we will introduce later. This idiom ensures that an object lives throughout the life of the continuation, and makes capture-by-reference possible, and very convenient.
Using capture-by-move in continuations is also very useful in Seastar applications. By moving an object into a continuation, we transfer ownership of this object to the continuation, and make it easy for the object to be automatically deleted when the continuation ends. For example, consider a traditional function taking a std::unique_ptr<T>.
int do_something(std::unique_ptr<T> obj) {
    // do some computation based on the contents of obj, let's say the result is 17
    return 17;
    // at this point, obj goes out of scope so the compiler delete()s it.
}
By using unique_ptr in this way, the caller passes an object to the function, but tells it the object is now its exclusive responsibility - and when the function is done with the object, it automatically deletes it. How do we use unique_ptr in a continuation? The following won’t work:
seastar::future<int> slow_do_something(std::unique_ptr<T> obj) {
    using namespace std::chrono_literals;
    // The following line won't compile...
    return seastar::sleep(10ms).then([obj] () mutable { return do_something(std::move(obj)); });
}
The problem is that a unique_ptr cannot be passed into a continuation by value, as this would require copying it, which is forbidden because it violates the guarantee that only one copy of this pointer exists. We can, however, move obj into the continuation:
seastar::future<int> slow_do_something(std::unique_ptr<T> obj) {
    using namespace std::chrono_literals;
    return seastar::sleep(10ms).then([obj = std::move(obj)] () mutable {
        return do_something(std::move(obj));
    });
}
Here, std::move() causes obj’s move constructor to be used to move the object from the outer function into the continuation. The notion of move (move semantics), introduced in C++11, is similar to a shallow copy followed by invalidating the source copy (so that the two copies do not co-exist, as forbidden by unique_ptr). After moving obj into the continuation, the top-level function can no longer use it (in this case it’s of course ok, because we return anyway).
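The "shallow copy followed by invalidating the source" description can be checked directly on a unique_ptr. This is a small sketch in standard C++ only; the function name source_is_empty_after_move is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// unique_ptr can be moved but not copied.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must not be copyable");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr must be movable");

// Moves ownership from src to dst and reports whether src ended up empty.
bool source_is_empty_after_move() {
    auto src = std::make_unique<int>(17);
    int* raw = src.get();
    std::unique_ptr<int> dst = std::move(src); // the move constructor runs here
    assert(dst.get() == raw); // same object: only the pointer was transferred
    assert(*dst == 17);
    return src == nullptr;    // the source was reset to null
}
```

For unique_ptr the moved-from state is guaranteed to be null, which is why inspecting src after the move is safe here. Other types only promise a valid but unspecified state.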
The [obj = ...] capture syntax we used here is new to C++14. This is the main reason why Seastar requires C++14, and does not support older C++11 compilers.
The extra () mutable syntax was needed here because by default when C++ captures a value (in this case, the value of std::move(obj)) into a lambda, it makes this value read-only, so our lambda cannot, in this example, move it again. Adding mutable removes this artificial restriction.
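The combination of a move capture and mutable can be tried without Seastar. In the sketch below, consume() is a hypothetical stand-in for do_something(), and defer_consume() and run_deferred() are likewise names invented for this illustration; the closure is simply called by hand instead of being attached to a future.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for do_something(): takes ownership and returns a result.
int consume(std::unique_ptr<int> obj) {
    return *obj + 1;
}

// Builds a callable that owns obj and hands it to consume() when invoked.
auto defer_consume(std::unique_ptr<int> obj) {
    // Without "mutable", the captured obj would be const inside the body,
    // and std::move(obj) could not be passed on to consume().
    return [obj = std::move(obj)] () mutable {
        return consume(std::move(obj));
    };
}

int run_deferred() {
    auto owner = std::make_unique<int>(16);
    auto task = defer_consume(std::move(owner));
    assert(owner == nullptr); // ownership now lives inside the closure
    return task();            // the closure gives up obj here: call it once
}
```

Removing the mutable keyword from defer_consume() should make the example fail to compile, because moving from a const unique_ptr would require the deleted copy constructor.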
C++14 (and below) does not guarantee that lambda captures in continuations will be evaluated after the futures they relate to are evaluated (See https://en.cppreference.com/w/cpp/language/eval_order).
Consequently, avoid the programming pattern below:
return do_something(obj).then([obj = std::move(obj)] () mutable {
    return do_something_else(std::move(obj));
});
In the example above, [obj = std::move(obj)] might be evaluated before do_something(obj) is called, potentially leading to use-after-move of obj.
To guarantee the desired evaluation order, the expression above may be broken into separate statements as follows:
auto fut = do_something(obj);
return fut.then([obj = std::move(obj)] () mutable {
    return do_something_else(std::move(obj));
});
This was changed in C++17. The expression that creates the object the function then is called on (the future) is evaluated before all the arguments to the function, so this style is not required in C++17 and above.
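The two-statement form can be exercised in plain C++ with a toy replacement for a future. Everything in this sketch is hypothetical: ready<T> is a stand-in for an already-resolved future (it is not Seastar's type), and measure() and describe() are names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a future that is already resolved:
// then() simply calls the continuation with the stored value.
template <typename T>
struct ready {
    T value;
    template <typename F>
    auto then(F&& f) { return std::forward<F>(f)(std::move(value)); }
};

// Reads through obj without taking ownership of it.
ready<std::size_t> measure(const std::unique_ptr<std::string>& obj) {
    return ready<std::size_t>{obj->size()};
}

std::string describe(std::unique_ptr<std::string> obj) {
    assert(obj != nullptr);
    auto fut = measure(obj); // statement 1: obj is still valid here
    // statement 2: only now is obj moved into the continuation
    return fut.then([obj = std::move(obj)] (std::size_t n) {
        return *obj + " has " + std::to_string(n) + " characters";
    });
}
```

Because measure(obj) completes in its own statement, the move in the capture list cannot be sequenced before it, whichever language standard the compiler targets.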
TODO: We already saw chaining example in slow() above. talk about the return from then, and returning a future and chaining more thens.