To avoid the anticipated limitations of our disk-based cache, we will now build our cache on top of an existing key-value storage system. When crawling, we may need to cache massive amounts of data and will not need any complex joins, so we will use a high-availability key-value store, which is easier to scale than a traditional relational database or even most other NoSQL databases. Specifically, our cache will use Redis, a very popular key-value store.
Key-value storage cache
What is key-value storage?
Key-value storage is very similar to a Python dictionary in that each element in the storage has a key and a value. When we designed the DiskCache, a key-value model lent itself well to the problem. Redis, in fact, stands for REmote DIctionary Server. Redis was...