Ristretto is built on three key principles including fast accesses, high concurrency, contention resistance and memory bounding. Let’s discuss more about the principles and how the team achieved them:
The team experimented with store interface within Ristretto and found out that sync.Map performs well for read-heavy workloads but it deteriorates for write workloads. As there was no thread-local storage, the team worked with sharded mutex-wrapped Go maps that gave good performance results. The team used 256 shards to ensure that the performance doesn’t get affected while working with a 64-core server.
With a shard based approach, the team also needed a quick way to understand which shard a key should go in. The long keys consumed too much memory so the team used uint64 for keys, instead of storing the entire key.
There was a need for using the hash of the key in multiple places and to quickly generate a fast hash, the team borrowed runtime.memhash from Go Runtime. The runtime.memhash function uses assembly code for generating a hash, quickly.
The team wanted to achieve high hit ratios but that would require managing metadata about the information that is currently present in the cache and the information that will be needed in it. They took inspiration from the paper BP-Wrapper that explains two ways for mitigating contention: prefetching and batching. The team only used ‘batching’ to lower contention instead of acquiring a mutex lock for every metadata mutation.
While talking about concurrency, Ristretto performs well under heavy concurrent load but it would lose some metadata while delivering better throughput performance.
The page reads, “Interestingly, that information loss doesn’t hurt our hit ratio performance because of the nature of key access distributions. If we do lose metadata, it is generally lost uniformly while the key access distribution remains non-uniform. Therefore, we still achieve high hit ratios and the hit ratio degradation is small as shown by the following graph.”
The workloads usually have variable-sized values where one value ncan cost a few bytes while another will cost few kilobytes and some other value might cost a few megabytes. In this case, it is not possible to have the same memory cost for all of them.
In Ristretto, the cost is attached to every key-value and users can easily specify what that cost is while calling the Set function. This cost is calculated against the MaxCost of the cache.
The page reads, “When the cache is operating at capacity, a heavy item could displace many lightweight items. This mechanism is nice in that it works well for all different workloads, including the naive approach where each key-value costs 1.”
To know more about Ristretto and its key principles in detail, check out the official post.
How Quarkus brings Java into the modern world of enterprise tech
LLVM 9 releases with official RISC-V target support, asm goto, Clang 9, and more
Twitter announces to test ‘Hide Replies’ feature in the US and Japan, after testing it in Canada