Why do we need sharding?
In database systems, and in computing systems in general, we have two ways to improve performance. The first is to replace our servers with more powerful ones while keeping the same network topology and system architecture. This is called vertical scaling.
An advantage of vertical scaling is that it is operationally simple, especially with cloud providers such as Amazon making it a matter of a few clicks to replace an r6g.medium server instance with an r6g.xlarge one. Another advantage is that we don’t need to make any code changes, so there is little to no risk of something going catastrophically wrong.
The main disadvantage of vertical scaling is that it has a hard limit: we can only get servers as powerful as the largest ones our cloud provider offers.
A related disadvantage is that getting more powerful servers generally comes with an increase in cost that is not linear but exponential. So, even if...