Load balancing
A balanced Cassandra cluster is one in which each node owns an equal share of the keys. When you run nodetool status, a balanced cluster shows the same percentage for every node under the Owns or Effective Ownership column. Equal ownership alone does not guarantee an even load: if the keys are not uniformly distributed across the token range, some nodes will hold more data than others. We use RandomPartitioner or Murmur3Partitioner, which hash each key to a token, to avoid this sort of lopsided cluster.
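To make equal ownership concrete for a cluster with one token per node, here is a minimal sketch that spaces tokens evenly around the ring and computes the share of the ring each token owns. The function names balanced_tokens and ownership are hypothetical helpers written for illustration, not part of Cassandra or nodetool. The token ranges assumed are 0 to 2^127 - 1 for RandomPartitioner and -2^63 to 2^63 - 1 for Murmur3Partitioner.

```python
# Inclusive token ranges for the two hashing partitioners (assumed as stated above).
RANDOM_PARTITIONER = (0, 2**127 - 1)
MURMUR3_PARTITIONER = (-(2**63), 2**63 - 1)


def balanced_tokens(node_count, token_range):
    """Return node_count evenly spaced tokens across token_range (hypothetical helper)."""
    low, high = token_range
    ring_size = high - low + 1
    return [low + i * ring_size // node_count for i in range(node_count)]


def ownership(tokens, token_range):
    """Return the fraction of the ring owned by each token, in ascending token order."""
    low, high = token_range
    ring_size = high - low + 1
    ordered = sorted(tokens)
    shares = []
    for i, token in enumerate(ordered):
        # A node owns the span from the previous token (exclusive) up to its own
        # token (inclusive); index -1 makes the first node wrap around the ring.
        span = (token - ordered[i - 1]) % ring_size or ring_size
        shares.append(span / ring_size)
    return shares


if __name__ == "__main__":
    tokens = balanced_tokens(4, MURMUR3_PARTITIONER)
    print(tokens)
    print(ownership(tokens, MURMUR3_PARTITIONER))
    # Add a fifth node halfway between two existing tokens and recompute.
    print(ownership(tokens + [2**61], MURMUR3_PARTITIONER))
```

With four evenly spaced tokens every node owns 25 percent of the ring. Placing a fifth token at 2^61, halfway between two existing tokens, leaves three nodes at 25 percent and two at 12.5 percent, which is the kind of uneven ownership that nodetool status would report as unequal percentages.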
Note
This section applies only to a setup that does not use vnodes, that is, a cluster that uses one token per Cassandra instance. If you are using Cassandra 1.2 or later with default settings, you can skip this section.
Whenever a node is added or decommissioned, the token distribution becomes skewed. You generally want a Cassandra cluster to be evenly load balanced to avoid hotspots. Fortunately, load balancing is very easy. The two-step load...