Asynchronous Programming with Seastar

Nadav Har’El - nyh@ScyllaDB.com

Avi Kivity - avi@ScyllaDB.com

Back to table of contents. Previous: 22 Promise objects. Next: 24 Seastar::thread.

23 Memory allocation in Seastar

23.1 Per-thread memory allocation

Seastar requires that applications be sharded, i.e., that code running on different threads operate on different objects in memory. We already saw in the Seastar memory section how Seastar takes over a given amount of memory (often, most of the machine’s memory) and divides it equally between the different threads. Modern multi-socket machines have non-uniform memory access (NUMA), meaning that some parts of memory are closer to some of the cores, and Seastar takes this into account when dividing the memory between threads. Currently, the division of memory between threads is static and equal: the threads are expected to experience roughly equal amounts of load and to require roughly equal amounts of memory.

To achieve this per-thread allocation, Seastar redefines the C library functions malloc(), free(), and their numerous relatives — calloc(), realloc(), posix_memalign(), memalign(), malloc_usable_size(), and malloc_trim(). It also redefines the C++ memory allocation functions, operator new, operator delete, and all their variants (including array versions, the C++14 delete taking a size, and the C++17 variants taking required alignment).
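
The mechanism of replacing the global allocation functions can be illustrated with a minimal sketch in standard C++. This is not Seastar's allocator: it only forwards to malloc() and free() and keeps a per-thread counter, whereas Seastar maintains a separate heap for each thread. The names live_allocations and allocations_seen_by_new_thread() are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <thread>

// Hypothetical per-thread bookkeeping. Each thread gets its own copy of this
// counter; it is signed because a block may be freed on a different thread
// than the one that allocated it.
thread_local std::ptrdiff_t live_allocations = 0;

// Replacement for the global operator new. The default array and nothrow
// forms forward to this function, so they are covered as well.
void* operator new(std::size_t size) {
    void* p = std::malloc(size == 0 ? 1 : size);
    if (p == nullptr) {
        throw std::bad_alloc();
    }
    ++live_allocations;
    return p;
}

// Replacement for the global operator delete.
void operator delete(void* p) noexcept {
    if (p == nullptr) {
        return;
    }
    --live_allocations;
    std::free(p);
}

// The C++14 sized variant simply forwards to the unsized one.
void operator delete(void* p, std::size_t) noexcept {
    ::operator delete(p);
}

// Starts a thread, allocates once on it, and reports how much that thread's
// own counter changed. The allocation functions are called directly because
// a compiler is allowed to elide a matching new/delete expression pair.
std::ptrdiff_t allocations_seen_by_new_thread() {
    std::ptrdiff_t seen = 0;
    std::thread worker([&seen] {
        std::ptrdiff_t before = live_allocations;
        void* p = ::operator new(sizeof(int));
        seen = live_allocations - before;
        ::operator delete(p);
    });
    worker.join();
    return seen;
}
```

In this sketch the worker thread observes exactly one new allocation in its own counter. The counters of other threads can drift when a block is allocated on one thread and freed on another, which the standard library itself may do when it starts a thread.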

It is important to remember that Seastar’s different threads can see memory allocated by other threads, but they are nonetheless strongly discouraged from actually accessing it. Sharing data objects between threads on modern multi-core machines results in stiff performance penalties from locks, memory barriers, and cache-line bouncing. Instead, Seastar encourages applications to avoid sharing objects between threads when possible (by sharding, so that each thread owns a subset of the objects); when threads do need to interact, they do so with explicit message passing, using submit_to(), as we shall see later.

23.2 Foreign pointers

An object allocated on one thread is owned by that thread, and should eventually be freed by the same thread. Freeing memory on the wrong thread is strongly discouraged, but is currently supported (albeit slowly) to accommodate library code beyond Seastar’s control. For example, std::exception_ptr allocates memory, so if we invoke an asynchronous operation on a remote thread and this operation returns an exception, freeing the returned std::exception_ptr happens on the “wrong” core. Seastar allows this, but it is inefficient.

In most cases objects should spend their entire life on a single thread and be used only by this thread. But in some cases we want to reassign ownership of an object which started its life on one thread, to a different thread. This can be done using a seastar::foreign_ptr<>. A pointer, or smart pointer, to an object is wrapped in a seastar::foreign_ptr<P>. This wrapper can then be moved into code running in a different thread (e.g., using submit_to()).

The most common use-case is a seastar::foreign_ptr<std::unique_ptr<T>>. The thread receiving this foreign_ptr will get exclusive use of the object, and when it destroys this wrapper, it will go back to the original thread to destroy the object. Note that the object is not only freed on the original shard - it is also destroyed (i.e., its destructor is run) there. This is often important when the object’s destructor needs to access other state which belongs to the original shard - e.g., unlink itself from a container.
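
The "destroy on the home shard" behaviour can be modelled with a small toy in standard C++. This is only a sketch of the idea, not Seastar's implementation: shards are simulated by task queues drained on a single OS thread so that the example is deterministic, and toy_shard, run_shard(), toy_foreign_ptr, tracked and demo_destroy_on_home_shard() are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// A simulated shard: just a queue of tasks waiting to run on it.
struct toy_shard {
    std::deque<std::function<void()>> inbox;
};

std::vector<toy_shard> shards(2);
std::size_t current_shard = 0;   // which shard we pretend to be running on

// Drain the inbox of shard `id`, pretending to be that shard meanwhile.
void run_shard(std::size_t id) {
    assert(id < shards.size());
    std::size_t saved = current_shard;
    current_shard = id;
    while (!shards[id].inbox.empty()) {
        auto task = std::move(shards[id].inbox.front());
        shards[id].inbox.pop_front();
        task();
    }
    current_shard = saved;
}

// Move-only wrapper that remembers the shard it was created on and sends
// the object back there to be destroyed.
template <typename T>
class toy_foreign_ptr {
    std::unique_ptr<T> _ptr;
    std::size_t _owner;
public:
    explicit toy_foreign_ptr(std::unique_ptr<T> p)
        : _ptr(std::move(p)), _owner(current_shard) {}
    toy_foreign_ptr(toy_foreign_ptr&&) = default;
    toy_foreign_ptr& operator=(toy_foreign_ptr&&) = delete;
    ~toy_foreign_ptr() {
        if (!_ptr) {
            return;                       // moved-from wrapper, nothing to do
        }
        if (current_shard == _owner) {
            _ptr.reset();                 // already at home, destroy directly
            return;
        }
        T* raw = _ptr.release();          // queue the destruction for home
        shards[_owner].inbox.push_back([raw] { delete raw; });
    }
    T* get() const { return _ptr.get(); }
    std::size_t get_owner_shard() const { return _owner; }
};

// Test object that records the shard on which its destructor ran.
struct tracked {
    std::size_t* destroyed_on;
    explicit tracked(std::size_t* out) : destroyed_on(out) {}
    ~tracked() { *destroyed_on = current_shard; }
};

// Returns {was the object destroyed before shard 0 ran?, destroying shard}.
std::pair<bool, std::size_t> demo_destroy_on_home_shard() {
    const std::size_t not_yet = 99;
    std::size_t destroyed_on = not_yet;
    current_shard = 0;
    {
        toy_foreign_ptr<tracked> fp(std::make_unique<tracked>(&destroyed_on));
        current_shard = 1;                // pretend fp travelled to shard 1
        toy_foreign_ptr<tracked> received(std::move(fp));
    }                                     // wrapper dropped on shard 1
    bool destroyed_early = (destroyed_on != not_yet);
    run_shard(0);                         // home shard handles the request
    current_shard = 0;
    return {destroyed_early, destroyed_on};
}
```

In this model, dropping the wrapper on shard 1 only queues a request; the destructor of tracked runs later, when shard 0 drains its inbox, and it observes current_shard == 0.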

Although foreign_ptr ensures that the object’s destructor automatically runs on the object’s home thread, it does not absolve the user from worrying about where to run the object’s other methods. Some simple methods, e.g., methods which just read from the object’s fields, can be run on the receiving thread. However, other methods may need to access other data owned by the object’s home shard, or need to prevent concurrent operations. Even if we’re sure that the object is now used exclusively by the receiving thread, such methods must still be run, explicitly, on the home thread:

    // fp is some foreign_ptr<>
    return smp::submit_to(fp.get_owner_shard(), [p=fp.get()]
        { return p->some_method(); });

So seastar::foreign_ptr<> not only has functional benefits (namely, to run the destructor on the home shard), it also has documentational benefits - it warns the programmer to watch out every time the object is used, that this is a foreign pointer, and if we want to do anything non-trivial with the pointed object, we may need to do it on the home shard.

Above, we discussed the case of transferring ownership of an object to another shard, via seastar::foreign_ptr<std::unique_ptr<T>>. However, sometimes the sender does not want to relinquish ownership of the object. Sometimes, it wants the remote thread to operate on its object and return with the object intact. Sometimes, it wants to send the same object to multiple shards. In such cases, seastar::foreign_ptr<seastar::lw_shared_ptr<T>> is useful. The user needs to watch out, of course, not to operate on the same object from multiple threads concurrently. If this cannot be ensured by program logic alone, some method of serialization must be used, such as running the operations on the home shard with submit_to() as described above.

Normally, a seastar::foreign_ptr cannot be copied, only moved. However, when it holds a smart pointer that can be copied (namely, a shared_ptr), one may want to make an additional copy of that pointer and create a second foreign_ptr. Doing this is inefficient and asynchronous (it requires communicating with the original owner of the object to create the copies), so a method future<foreign_ptr> copy() needs to be explicitly used instead of the normal copy constructor.
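
Why such a copy has to be asynchronous can be sketched with another toy model in standard C++. It assumes a plain, non-atomic reference count that only the home shard may modify, so a copy requested elsewhere must be carried out by the home shard and reported back later. The names home_inbox, run_home_shard(), counted, toy_foreign_shared and demo_copy() are hypothetical, a callback stands in for the future that the real copy() returns, and releasing references is not modelled.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Tasks waiting to run on the object's home shard (a stand-in for the
// message passing that submit_to() provides).
std::deque<std::function<void()>> home_inbox;

// Pretend to be the home shard and process everything queued for it.
void run_home_shard() {
    while (!home_inbox.empty()) {
        auto task = std::move(home_inbox.front());
        home_inbox.pop_front();
        task();
    }
}

// An object with a plain, non-atomic reference count. In this model only
// the home shard is allowed to modify `refs`.
struct counted {
    int value;
    int refs;
};

struct toy_foreign_shared {
    counted* obj;

    // Request another reference. The count is not touched here; the home
    // shard increments it later and then hands the copy to `on_ready`.
    void copy(std::function<void(toy_foreign_shared)> on_ready) const {
        counted* target = obj;
        home_inbox.push_back([target, on_ready = std::move(on_ready)] {
            ++target->refs;
            on_ready(toy_foreign_shared{target});
        });
    }
};

// Returns the reference count seen before and after the home shard ran.
std::pair<int, int> demo_copy() {
    counted c{7, 1};
    toy_foreign_shared first{&c};
    bool got_copy = false;
    first.copy([&got_copy](toy_foreign_shared second) {
        got_copy = (second.obj != nullptr);
    });
    int before = c.refs;     // the request is queued but not yet served
    assert(!got_copy);
    run_home_shard();        // the home shard performs the actual copy
    assert(got_copy);
    return {before, c.refs};
}
```

In this sketch the copy becomes available only after the home shard has processed the request, which mirrors the reason a synchronous copy constructor is not offered.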
