Taking a backup of a Cassandra cluster
Cassandra backs up data by taking snapshots of SSTables. While a node is online, we can take a snapshot of the data files stored in the Cassandra data directory. When taking a snapshot, we can choose to cover all keyspaces, a specific keyspace, or a specific column family. The snapshots can then be moved to another location for backup or left at the default location. Snapshots are taken per node, and each snapshot contains all the data written to that node before the snapshot was triggered. A node's snapshot may not be consistent with that of another replica node. However, once the snapshots of all nodes are restored, the data eventually becomes consistent.
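As a rough illustration of the three snapshot scopes described above, the following Python sketch builds the corresponding nodetool snapshot command lines. The helper name build_snapshot_command and the sample keyspace, column family, and tag names are hypothetical. The -t (tag) and -cf (column family) options are assumed to match the nodetool version in use, so check nodetool help snapshot on your installation before relying on them.

```python
from typing import List, Optional


def build_snapshot_command(keyspace: Optional[str] = None,
                           column_family: Optional[str] = None,
                           tag: Optional[str] = None) -> List[str]:
    """Build an argument list for `nodetool snapshot` (hypothetical helper).

    No arguments      -> snapshot of all keyspaces on this node
    keyspace only     -> snapshot of that keyspace
    keyspace + column -> snapshot of a single column family
    """
    if column_family and not keyspace:
        raise ValueError("a column family snapshot needs its keyspace")

    cmd = ["nodetool", "snapshot"]
    if tag:
        # Assumed flag: -t names the snapshot directory.
        cmd += ["-t", tag]
    if column_family:
        # Assumed flag: -cf restricts the snapshot to one column family.
        cmd += ["-cf", column_family]
    if keyspace:
        cmd.append(keyspace)
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them; on a live node they could
    # be passed to subprocess.run(cmd, check=True).
    print(" ".join(build_snapshot_command()))
    print(" ".join(build_snapshot_command("my_keyspace", tag="nightly")))
    print(" ".join(build_snapshot_command("my_keyspace", "my_table")))
```

Because snapshots are taken per node, a command like these would have to be run on every node of the cluster to cover the whole dataset.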
Once we have taken a snapshot of every node, we can configure Cassandra to take incremental backups. With incremental backups enabled, Cassandra automatically backs up each SSTable as soon as it is flushed.
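Incremental backups are typically switched on with the incremental_backups: true setting in cassandra.yaml, after which Cassandra places a hard link to each flushed SSTable in a backups directory beside the column family's data files. The sketch below shows one way to collect those files for copying elsewhere. It assumes a data directory layout of data_dir/keyspace/column_family/backups/, which can differ between Cassandra versions, and the function name find_incremental_backups is hypothetical.

```python
from pathlib import Path
from typing import List


def find_incremental_backups(data_dir: str) -> List[Path]:
    """Return files under <data_dir>/<keyspace>/<column_family>/backups/.

    Assumed layout; adjust the glob pattern if your version differs.
    """
    root = Path(data_dir)
    # Two wildcard levels: keyspace directory, then column family directory.
    return sorted(p for p in root.glob("*/*/backups/*") if p.is_file())


if __name__ == "__main__":
    # The path below is only an example of a common default location.
    for sstable_file in find_incremental_backups("/var/lib/cassandra/data"):
        print(sstable_file)
```

Cassandra does not remove these files on its own, so a backup job would normally copy them to another location and then delete them from the node.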
Note
A snapshot comprises only the data of the column family and doesn...