Which data store should we use for a use case?
Choosing the right data store for a specific use case depends on several factors: the nature of the data, access patterns, scalability requirements, consistency and latency needs, and the overall architecture of the application. Here are some guidelines and examples to help you select the most appropriate data store for different scenarios:
| Use case | Data store |
| --- | --- |
| Scalable and fast key-value store | Redis or Memcached |
| Fast free-text search | Lucene, Elasticsearch, or Solr |
| Fast writes | Write-ahead log (WAL) |
| Fast reads | Caching, replication, in-memory stores, CDNs |
| Blob storage (video and images) | S3/CDNs |
| Complex relations such as in a graph | Graph database (Neo4j) |
| Hot data | In-memory stores, SSDs |
| Cold data | Disk, Amazon Glacier |
| Finding "highly similar" data in a set of unstructured data (such as images, text blobs, and videos), which is particularly needed in AI applications | Vector database |
| Time-series metrics data | Time-series database (OpenTSDB) |
| Proximity or nearby entity search | Geospatial index (quadtrees or geohashing) |
Table 17.1: Choosing the right data store for different use cases
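To make the last row of the table more concrete, here is a minimal sketch of geohashing for proximity search. It uses the standard geohash scheme (interleaved longitude and latitude bits, encoded in base 32). The names `geohash_encode`, `GeoIndex`, `add`, and `same_cell` are hypothetical and chosen only for this illustration; they do not come from any particular library.

```python
from collections import defaultdict

# Geohash base-32 alphabet (digits plus lowercase letters without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


class GeoIndex:
    """Toy in-memory index (illustrative only): buckets entities by geohash cell."""

    def __init__(self, precision: int = 5):
        self.precision = precision
        self.cells = defaultdict(list)

    def add(self, name: str, lat: float, lon: float) -> None:
        self.cells[geohash_encode(lat, lon, self.precision)].append(name)

    def same_cell(self, lat: float, lon: float) -> list:
        """Return entities whose geohash cell matches the query point's cell."""
        return list(self.cells.get(geohash_encode(lat, lon, self.precision), []))
```

Points that share a geohash prefix lie in the same grid cell, so a lookup by prefix narrows the search to a small set of candidates. This sketch checks only the query point's own cell; a real proximity search would typically also check the neighboring cells, because two nearby points can fall on opposite sides of a cell boundary.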
The preceding table maps common use cases to suitable data stores. Now, let’s go over the data structures to be used for different use cases in the next section.