Understanding compaction
Cassandra deals with this build-up of SSTables over time through a process called compaction. Compaction merges partitions from multiple files into a single file, and in the process it discards superseded data and purges tombstones. Housekeeping is only one reason to do this; the other objective is to improve read performance by moving the data for a given key into a single SSTable, thereby reducing the disk I/O required to read that key.
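The merge described above can be illustrated with a minimal Python sketch. It is a toy model, not Cassandra's implementation: the dictionary-based SSTable, the compact function, and the purge_before threshold are all hypothetical names introduced for illustration. A real node also waits for the table's gc_grace_seconds to elapse and checks for overlapping data before it drops a tombstone.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: an SSTable maps a key to (write_timestamp, value).
# A value of None stands in for a tombstone (a deletion marker).
Cell = Tuple[int, Optional[str]]
SSTable = Dict[str, Cell]


def compact(sstables: List[SSTable], purge_before: int) -> SSTable:
    """Merge several SSTables into one, keeping the newest cell per key."""
    merged: SSTable = {}
    for table in sstables:
        for key, cell in table.items():
            # Last write wins: keep the cell with the highest timestamp.
            if key not in merged or cell[0] > merged[key][0]:
                merged[key] = cell
    # Drop tombstones older than the purge threshold (an assumed stand-in
    # for the grace period having expired).
    return {
        key: (ts, value)
        for key, (ts, value) in merged.items()
        if not (value is None and ts < purge_before)
    }


older = {"a": (1, "x"), "b": (2, "y")}
newer = {"a": (5, "z"), "b": (6, None)}  # "a" updated, "b" deleted
result = compact([older, newer], purge_before=10)
print(result)
```

After the merge, a read for key "a" touches one file instead of two, and the deleted key "b" no longer occupies space.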
The exact mechanism that governs compaction depends on which compaction strategy you choose. As of versions 3.0.8 and 3.8, which added time-window compaction and deprecated date-tiered compaction, four strategies ship with Cassandra (although you can implement your own):
Size-tiered compaction: This strategy compacts SSTables once enough files of a similar size have accumulated (four by default). In update-heavy workloads, a partition may exist in many SSTables at once, resulting in reduced...