Thread-Per-Core Buffer Management for a modern Kafka-API storage system


SOURCE: https://vectorized.io/blog/tpc-buffers/
Summary

It tells you what your sensitivity to blocking is, which for Redpanda is less than 500 microseconds; otherwise, Seastar's[4] reactor will print a stack trace warning you of the blocking, since a stalled task effectively injects latency on the network poller.

Once you have decided on your threading model, the next step is your memory model and, ultimately, for storage engines, your buffer management. In this post, we'll cover the perils of buffer management in a thread-per-core environment and describe iobuf, our solution for 0-copy memory management in the world of Seastar.

As mentioned earlier, Redpanda uses a single pinned thread-per-core architecture to do everything. The question becomes what executes on the local core versus what is deferred for the destination core, and it's often a function of memory (smaller data structures are good candidates for broadcast), computation (how much time is spent deciding), and frequency of access (very likely operations tend to get materialized on every core).

One question remains: how, exactly, does memory management work in a TpC architecture? How does data actually travel from L-core-0 to L-core-66 safely, using a network of SPSC queues, within a fully asynchronous execution model where things can suspend at any point in time? The sketches below illustrate the moving parts.

To understand iobuf, we need to understand the actual memory constraints of Seastar, our TpC framework. Aside from the thread-per-core architecture itself, memory management would have been our second bottleneck had it not been designed from the ground up for latency.
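To make the blocking sensitivity concrete, here is a minimal sketch that trips the reactor's stall detector; the program name and the 5 ms spin are illustrative, but `--blocked-reactor-notify-ms` is Seastar's actual runtime flag for the warning threshold.

```cpp
#include <seastar/core/app-template.hh>
#include <seastar/core/future.hh>
#include <chrono>

// Build against Seastar and run with, e.g.:
//   ./stall_demo --blocked-reactor-notify-ms 1
// The busy loop holds this shard's reactor for ~5 ms without yielding, so
// the reactor prints a stack-trace warning: the loop is stalling the
// shard's network poller, exactly the latency injection described above.
int main(int argc, char** argv) {
    seastar::app_template app;
    return app.run(argc, argv, [] {
        auto deadline = std::chrono::steady_clock::now()
                      + std::chrono::milliseconds(5);
        while (std::chrono::steady_clock::now() < deadline) {
            // deliberately spin instead of returning to the reactor
        }
        return seastar::make_ready_future<>();
    });
}
```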
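For the cross-core question, Seastar exposes its network of SPSC queues through `smp::submit_to()`: the caller enqueues a lambda for the destination shard and gets back a future that resolves on the calling shard once the destination has run it. A sketch, with `shard_local_bytes` as a hypothetical per-shard counter invented for illustration:

```cpp
#include <seastar/core/smp.hh>
#include <seastar/core/future.hh>

// Hypothetical shard-local state; each shard sees its own copy.
thread_local size_t shard_local_bytes = 0;

// Cross-shard hop: the lambda travels over the SPSC queue to `target`,
// runs pinned on that shard, and the result travels back the same way.
seastar::future<size_t> bytes_on_shard(unsigned target) {
    return seastar::smp::submit_to(target, [] {
        // Executes on `target`. No locks are needed: each shard owns
        // its data exclusively, and the SPSC queues are the only
        // cross-core channel.
        return shard_local_bytes;
    });
}
```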
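And to preview the buffer-management idea, here is a toy fragment list, not Redpanda's actual iobuf, showing how sharing ref-counted fragments can stand in for copying:

```cpp
#include <seastar/core/temporary_buffer.hh>
#include <vector>
#include <utility>

// A toy sketch of the 0-copy idea: a buffer is a sequence of fragments,
// each a ref-counted seastar::temporary_buffer, so "duplicating" the
// buffer shares the underlying memory (a refcount bump per fragment)
// instead of a memcpy.
struct toy_iobuf {
    std::vector<seastar::temporary_buffer<char>> fragments;

    void append(seastar::temporary_buffer<char> buf) {
        fragments.push_back(std::move(buf));
    }

    // 0-copy duplicate via temporary_buffer::share()
    toy_iobuf share() {
        toy_iobuf dup;
        dup.fragments.reserve(fragments.size());
        for (auto& f : fragments) {
            dup.fragments.push_back(f.share());
        }
        return dup;
    }
};
```

One real-world constraint the toy ignores: Seastar's allocator is per-shard, so memory allocated on one core must ultimately be freed on that same core, which is precisely why cross-core buffer travel needs deliberate ownership tracking.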
