So far, we've developed a model of distribution in which the total data set is distributed among multiple machines, but any given piece of data lives on only one machine. This model carries a big advantage over a single-node configuration, which is that it's horizontally scalable. By distributing data over multiple machines, we can accommodate ever-larger data sets simply by adding more machines to our cluster.
But our current model doesn't solve the problem of fault tolerance. No hardware is perfect; any production deployment must acknowledge the fact that a machine might fail. Our current model isn't resilient to such failures: for instance, if Node 1 in our original three-node cluster were to suddenly catch fire, we would lose all the data on that node, including the row containing the alice user record.
To solve this problem, Cassandra provides replication...