Write complexity and data integrity
The amount of work we need to do to write data in the fully denormalized strategy is basically equal to what we needed to do with the partially denormalized layout. Our storage needs do increase, since we're now storing one full copy of each status update for every follower the author has. However, storage is cheap, and writing data in Cassandra is cheap, so we've managed to make our timeline read pattern far more efficient at low cost.
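To make the write cost concrete, here is a minimal Python sketch of the fan-out: one row for the author's own table plus one copy per follower. It only builds the CQL statements and their parameters, so it runs without a cluster. The function name `fan_out_write` and the column names (`username`, `id`, `timeline_username`, `status_update_id`, `status_update_username`, `body`) are assumptions for illustration and may not match the actual schema.

```python
import uuid
from typing import List, Tuple

# A statement is a CQL string plus its bound parameters.
Statement = Tuple[str, tuple]


def fan_out_write(author: str, update_id: uuid.UUID, body: str,
                  followers: List[str]) -> List[Statement]:
    """Build the writes for one status update (hypothetical column names)."""
    # One canonical row in the author's own table.
    statements: List[Statement] = [(
        "INSERT INTO user_status_updates (username, id, body) "
        "VALUES (%s, %s, %s)",
        (author, update_id, body),
    )]
    # One full copy per follower in the denormalized timeline table.
    for follower in followers:
        statements.append((
            "INSERT INTO home_status_updates "
            "(timeline_username, status_update_id, status_update_username, body) "
            "VALUES (%s, %s, %s, %s)",
            (follower, update_id, author, body),
        ))
    return statements
```

An author with N followers produces N + 1 statements per status update, which is where the extra storage comes from.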
One concern in any sort of denormalized scenario is data integrity. At the Cassandra level, there is nothing stopping us from adding a status update to the user_status_updates table and forgetting to add copies as appropriate to the home_status_updates table, or vice versa. Even worse, if a user deletes a status update and we don't properly remove every copy from the home_status_updates table, the user's followers might see status updates that they aren't supposed to.
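The integrity risk described above can be sketched with a small in-memory model. This is not Cassandra code: the two dictionaries are hypothetical stand-ins for the user_status_updates and home_status_updates tables, and the helper names `find_stale_copies` and `delete_update` are made up for illustration. The point is only to show how a delete that misses a follower leaves a visible stale copy.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical in-memory stand-ins for the two tables:
#   user_updates: author -> ids of that author's status updates
#   home_updates: timeline owner -> (author, update id) copies on that timeline
UserUpdates = Dict[str, Set[int]]
HomeUpdates = Dict[str, Set[Tuple[str, int]]]


def find_stale_copies(user_updates: UserUpdates,
                      home_updates: HomeUpdates) -> List[Tuple[str, str, int]]:
    """Return (timeline_owner, author, update_id) for copies whose
    canonical row no longer exists in the author's own table."""
    stale = []
    for owner, copies in home_updates.items():
        for author, update_id in copies:
            if update_id not in user_updates.get(author, set()):
                stale.append((owner, author, update_id))
    return sorted(stale)


def delete_update(user_updates: UserUpdates, home_updates: HomeUpdates,
                  author: str, update_id: int, followers: List[str]) -> None:
    """Remove the canonical row and the copy on each listed follower's timeline."""
    user_updates.get(author, set()).discard(update_id)
    for follower in followers:
        home_updates.get(follower, set()).discard((author, update_id))
```

If `delete_update` is called with an incomplete follower list, `find_stale_copies` reports the timelines that still show the deleted update. A periodic check along these lines is one possible way for an application to detect such drift.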
For the most part, the responsibility for maintaining data integrity...