Understanding data sharding
Data sharding refers to splitting the data in a single database into multiple databases or tables along a chosen dimension, in order to improve performance and availability.
Data sharding should not be confused with data partitioning, which divides the data into subgroups while keeping it stored in a single database. Definitions vary across academia and the internet, but the main difference to keep in mind when distinguishing between sharding and partitioning is the number of databases in which the data is stored.
Depending on its granularity, data sharding takes two common forms – database shards and table shards:
- Database shards are partitions of data in a database, with each shard being stored in a different database instance.
- Table shards are the smaller pieces that used to be part of a single table and are now...
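
To make the two granularities concrete, here is a minimal Python sketch of how a sharding-key value might be routed first to a database shard and then to a table shard. Everything in it is an assumption for illustration only: the modulo strategy, the shard counts, the `ds_` and `t_order` names, and the `route` function are hypothetical and do not come from any particular sharding product.

```python
from dataclasses import dataclass

DB_SHARD_COUNT = 2      # hypothetical number of database instances
TABLE_SHARD_COUNT = 4   # hypothetical number of table shards per database


@dataclass(frozen=True)
class ShardTarget:
    database: str  # which database instance holds the row (database shard)
    table: str     # which physical table inside it holds the row (table shard)


def route(user_id: int, logical_table: str = "t_order") -> ShardTarget:
    """Map a sharding-key value to a database shard and a table shard."""
    # Database shard: simple modulo over the number of database instances.
    db_index = user_id % DB_SHARD_COUNT
    # Table shard: divide first so the table index does not simply
    # repeat the database index, then take the modulo.
    table_index = (user_id // DB_SHARD_COUNT) % TABLE_SHARD_COUNT
    return ShardTarget(f"ds_{db_index}", f"{logical_table}_{table_index}")


if __name__ == "__main__":
    for uid in (0, 1, 7, 10):
        print(uid, route(uid))
```

Modulo routing is only one possible strategy; range-based or hash-based rules would fit the same shape, with `route` returning a different target for the same key.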