Backing up data
While Cassandra itself goes a long way toward reducing the possibility of data loss, it cannot prevent loss or corruption due to administrative or application-level mistakes. For this reason, it is still advisable to maintain backups of critical tables to allow you to recover to a known good point in the past.
Taking a snapshot
Fundamentally, backing up data in Cassandra involves taking a snapshot of the SSTables for a given keyspace at a moment in time, since a complete set of that keyspace's SSTables is needed to recover properly. You can create a snapshot using nodetool as follows:
nodetool snapshot [keyspace_name]
This will create hard links to the current SSTables in a snapshots directory beneath that keyspace's data directory (/var/lib/cassandra/data/[keyspace_name] by default; recent Cassandra versions keep one snapshots directory per table), under a directory named after the Unix timestamp at the time the snapshot is taken. The advantage of this approach is that the hard link does not require any additional disk...
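To make the hard-link behaviour concrete, here is a minimal Python sketch that imitates it on a temporary directory. It does not call Cassandra or nodetool. The file name example-Data.db, the snapshot directory name, and the helper snapshot_by_hard_link are all hypothetical stand-ins chosen for illustration.

```python
import os
import tempfile


def snapshot_by_hard_link(sstable_path, snapshot_dir):
    """Hard-link one file into snapshot_dir and return the new link's path.

    Hypothetical helper: it only imitates the idea behind a snapshot,
    it is not how nodetool is implemented.
    """
    os.makedirs(snapshot_dir, exist_ok=True)
    link_path = os.path.join(snapshot_dir, os.path.basename(sstable_path))
    os.link(sstable_path, link_path)  # new directory entry, same data blocks
    return link_path


with tempfile.TemporaryDirectory() as data_dir:
    # Stand-in for an immutable SSTable data file (made-up name).
    sstable = os.path.join(data_dir, "example-Data.db")
    with open(sstable, "wb") as f:
        f.write(b"x" * 4096)

    # Made-up snapshot directory named after a millisecond timestamp.
    snapshot_dir = os.path.join(data_dir, "snapshots", "1700000000000")
    link = snapshot_by_hard_link(sstable, snapshot_dir)

    # Both names refer to the same inode, so the data is stored only once.
    same_inode = os.stat(sstable).st_ino == os.stat(link).st_ino
    link_count = os.stat(sstable).st_nlink

    # Deleting the original name does not delete the data while a link remains.
    os.remove(sstable)
    survives = os.path.getsize(link) == 4096
```

Because both names share one inode, the linked copy takes no extra space when it is created. The flip side is that the space is not released while any link to the file remains, so a link kept under snapshots continues to hold the data on disk after the original name is removed.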