Rate Limiter

hx235 edited this page Sep 21, 2021 · 18 revisions

When using RocksDB, users may want to cap the write rate for a number of reasons. For example, flash writes cause severe spikes in read latency if they exceed a certain threshold. RocksDB ships with a native RateLimiter that should be adequate for most use cases.

How to use

Create a RateLimiter object by calling NewGenericRateLimiter. A RateLimiter can be created separately for each RocksDB instance, or shared among RocksDB instances to control the aggregate write rate of flush and compaction.

RateLimiter* rate_limiter = NewGenericRateLimiter(
    rate_bytes_per_sec /* int64_t */, 
    refill_period_us /* int64_t */,
    fairness /* int32_t */);

Params:

  • rate_bytes_per_sec: this is the only parameter you need to set most of the time. It controls the total write rate of compaction and flush in bytes per second. Currently, RocksDB does not enforce the rate limit on anything other than flush and compaction; writes to the WAL, for example, are not limited.

  • refill_period_us: this controls how often tokens are refilled. For example, when rate_bytes_per_sec is set to 10MB/s and refill_period_us is set to 100ms, 1MB is refilled every 100ms internally. A larger value can lead to bursty writes, while a smaller value introduces more CPU overhead. The default value of 100,000 (100ms) should work for most cases.

  • fairness: RateLimiter accepts requests of user-pri, high-pri, mid-pri and low-pri. A lower-pri request is usually blocked in favor of higher-pri ones, with the exception of user-pri. User-pri requests are always satisfied before requests of any other priority and are not subject to the fairness mechanism described below (see further down for more information about user-pri).

    Currently, RocksDB assigns low-pri to requests from compaction and high-pri to requests from flush. Low-pri requests can get blocked if flush requests come in continuously. To avoid starvation, the fairness parameter gives low-pri requests a 1/fairness chance of being granted first even when high-pri requests are pending. In other words, there is a 1/fairness chance that high-pri requests will be blocked by low-pri requests.

    The fairness parameter also works with more priority levels. In general, there is a 1/fairness chance that a higher-pri request (e.g., a high-pri request) will be blocked by all of the lower-pri requests (e.g., mid-pri and low-pri requests). For example, with high-pri, mid-pri and low-pri, the possible orders in which requests of different priorities are satisfied, with their corresponding probabilities, are as follows:

    {high_pri, mid_pri, low_pri} with probability (1 - 1/fairness) * (1 - 1/fairness);

    {high_pri, low_pri, mid_pri} with probability (1 - 1/fairness) * 1/fairness;

    {mid_pri, low_pri, high_pri} with probability 1/fairness * (1 - 1/fairness);

    {low_pri, mid_pri, high_pri} with probability 1/fairness^2.

    The one EXCEPTION is user-pri, which is designed to take precedence over all other priorities without being subject to the fairness mechanism. It represents foreground operations triggered by users, as opposed to operations triggered by RocksDB internals. In other words, user-pri requests will never be blocked by requests of other priorities. For example, with user-pri, high-pri, mid-pri and low-pri, the possible orders in which requests of different priorities are satisfied, with their corresponding probabilities, are as follows:

    {user_pri, high_pri, mid_pri, low_pri} with probability (1 - 1/fairness) * (1 - 1/fairness);

    {user_pri, high_pri, low_pri, mid_pri} with probability (1 - 1/fairness) * 1/fairness;

    {user_pri, mid_pri, low_pri, high_pri} with probability 1/fairness * (1 - 1/fairness);

    {user_pri, low_pri, mid_pri, high_pri} with probability 1/fairness^2.

    You should be fine leaving the fairness parameter at its default of 10.
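
The ordering probabilities listed above for high-pri, mid-pri and low-pri can be checked numerically. The following is a minimal sketch, not RocksDB code: the struct and function names are made up for illustration, and it only evaluates the four formulas given in the list.

```cpp
#include <cassert>

// Illustrative only: these names are not part of the RocksDB API.
// Holds the probability of each service order from the list above.
struct OrderProbabilities {
  double high_mid_low;  // {high_pri, mid_pri, low_pri}
  double high_low_mid;  // {high_pri, low_pri, mid_pri}
  double mid_low_high;  // {mid_pri, low_pri, high_pri}
  double low_mid_high;  // {low_pri, mid_pri, high_pri}
};

// Evaluates the four formulas for a given fairness value.
OrderProbabilities ComputeOrderProbabilities(int fairness) {
  assert(fairness >= 1);
  const double p = 1.0 / fairness;  // chance that a higher pri is blocked
  return OrderProbabilities{(1.0 - p) * (1.0 - p),
                            (1.0 - p) * p,
                            p * (1.0 - p),
                            p * p};
}

// Sums the four cases; they cover every outcome, so the result should be 1.
double TotalProbability(const OrderProbabilities& probs) {
  return probs.high_mid_low + probs.high_low_mid + probs.mid_low_high +
         probs.low_mid_high;
}
```

With the default fairness of 10, ComputeOrderProbabilities(10) gives roughly 0.81, 0.09, 0.09 and 0.01, so the strict priority order is followed about 81% of the time.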

Although tokens are refilled at the interval set by refill_period_us, the number of bytes that can be granted in a single request has to be bounded. Otherwise tokens could accumulate for a long time and then be consumed by one burst request, which defeats the purpose of rate limiting. GetSingleBurstBytes() returns this upper bound.
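
To make the arithmetic concrete, here is a small self-contained sketch. It does not call RocksDB: both helper names are hypothetical. The first reproduces the refill example from the parameter description (10MB/s with a 100ms period gives 1MB per refill). The second shows one way a caller might split a large write into requests that each stay within the bound reported by GetSingleBurstBytes(), which is passed in here as a plain number.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: bytes added to the bucket at each refill.
int64_t RefillBytesPerPeriod(int64_t rate_bytes_per_sec,
                             int64_t refill_period_us) {
  assert(rate_bytes_per_sec > 0 && refill_period_us > 0);
  return rate_bytes_per_sec * refill_period_us / 1000000;
}

// Hypothetical helper: splits total_bytes into request sizes that never
// exceed single_burst_bytes (the value a caller would obtain from
// GetSingleBurstBytes()).
std::vector<int64_t> SplitIntoRequests(int64_t total_bytes,
                                       int64_t single_burst_bytes) {
  assert(total_bytes >= 0 && single_burst_bytes > 0);
  std::vector<int64_t> chunks;
  while (total_bytes > 0) {
    const int64_t chunk = std::min(total_bytes, single_burst_bytes);
    chunks.push_back(chunk);
    total_bytes -= chunk;
  }
  return chunks;
}
```

For instance, a 2.5MB write against a 1MB bound becomes three requests of 1MB, 1MB and 0.5MB.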

Tokens should then be requested each time before a write happens. If the request cannot be satisfied immediately, the call blocks until enough tokens have been refilled to fulfill it. For example,

// Blocks until 1024 bytes' worth of tokens are available.
rate_limiter->Request(1024 /* bytes */, rocksdb::Env::IO_HIGH);
// DB::Flush takes a FlushOptions argument.
Status s = db->Flush(rocksdb::FlushOptions());
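
The blocking behavior can be illustrated with a toy model that uses a virtual clock instead of real sleeping. This is a sketch under simplifying assumptions, not how RocksDB implements it: there is a single priority, requests are served in order, unused tokens do not carry over between periods, and the function name is made up. For each request it returns the index of the refill period in which that request would be granted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model (hypothetical, not RocksDB code). Period 0 starts with a full
// bucket; a request that does not fit "blocks" until the next refill.
std::vector<int64_t> GrantPeriods(const std::vector<int64_t>& requests,
                                  int64_t refill_bytes_per_period) {
  assert(refill_bytes_per_period > 0);
  std::vector<int64_t> granted_in;
  int64_t period = 0;
  int64_t available = refill_bytes_per_period;
  for (int64_t bytes : requests) {
    // Each request must respect the single-burst bound.
    assert(bytes >= 0 && bytes <= refill_bytes_per_period);
    if (bytes > available) {
      // Not enough tokens left: wait for the next refill.
      ++period;
      available = refill_bytes_per_period;
    }
    available -= bytes;
    granted_in.push_back(period);
  }
  return granted_in;
}
```

With 1000 bytes per period, the requests {600, 300, 300, 1000} are granted in periods {0, 0, 1, 2}: the third request does not fit in what is left of the first period, and the fourth needs a full bucket of its own.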

Users can also change the rate limiter's bytes per second dynamically with SetBytesPerSecond(). See include/rocksdb/rate_limiter.h for more API details.

Customization

Users whose requirements go beyond what the native RateLimiter provides can implement their own by extending the RateLimiter class declared in include/rocksdb/rate_limiter.h.
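
The general shape of such an extension is sketched below. To stay self-contained it does not include the RocksDB header: SimpleRateLimiter is a hypothetical, much-reduced stand-in for the real abstract class, and CountingRateLimiter is an invented example. The actual set of virtual methods to override should be taken from include/rocksdb/rate_limiter.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an abstract rate limiter interface.
// This is NOT the RocksDB class; it only mirrors the idea.
class SimpleRateLimiter {
 public:
  virtual ~SimpleRateLimiter() = default;
  virtual void SetBytesPerSecond(int64_t bytes_per_second) = 0;
  virtual void Request(int64_t bytes) = 0;
  virtual int64_t GetSingleBurstBytes() const = 0;
};

// Invented example: never blocks, but records how much traffic was requested.
class CountingRateLimiter : public SimpleRateLimiter {
 public:
  explicit CountingRateLimiter(int64_t single_burst_bytes)
      : single_burst_bytes_(single_burst_bytes) {
    assert(single_burst_bytes > 0);
  }

  void SetBytesPerSecond(int64_t bytes_per_second) override {
    bytes_per_second_ = bytes_per_second;  // stored only; nothing is throttled
  }

  void Request(int64_t bytes) override {
    assert(bytes >= 0 && bytes <= single_burst_bytes_);
    total_bytes_ += bytes;
    ++total_requests_;
  }

  int64_t GetSingleBurstBytes() const override { return single_burst_bytes_; }

  int64_t total_bytes() const { return total_bytes_; }
  int64_t total_requests() const { return total_requests_; }

 private:
  int64_t single_burst_bytes_;
  int64_t bytes_per_second_ = 0;
  int64_t total_bytes_ = 0;
  int64_t total_requests_ = 0;
};
```

A real custom limiter would replace the body of Request with whatever policy is needed, for example blocking on an external quota service.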
