It is important to back up a cluster and all the data it contains. We have already talked about replica shards in the previous section. Replicas are not appropriate as a backup: they protect against the loss of a node, but not against a catastrophic failure or an accidental deletion, because every change made to the primary shards is applied to the replicas as well. Instead, there is a dedicated API for taking and restoring backups. The snapshot and restore API is a cluster backup mechanism that saves the current state of the cluster, and the data in it, to a repository from which it can later be restored.
We also use the reindex API to make temporary backups before certain bulk document changes, such as _update_by_query and _delete_by_query. Copy all documents to a second index with _reindex first, then run the bulk change.
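As a minimal sketch of the temporary backup described above, the following Python builds the _reindex request that copies every document from one index into another. The index names products and products_backup, the helper names, and the cluster address http://localhost:9200 are assumptions made for illustration; only the /_reindex endpoint and its source/dest body come from Elasticsearch itself.

```python
import json
import urllib.request


def build_reindex_backup(source_index, backup_index):
    """Return (method, path, body) for a _reindex call that copies all documents."""
    body = {
        "source": {"index": source_index},  # index about to be changed in bulk
        "dest": {"index": backup_index},    # temporary backup index
    }
    return "POST", "/_reindex", body


def send(base_url, method, path, body):
    """Send one JSON request to the cluster and return the decoded response."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical index names, used only for this example.
    method, path, body = build_reindex_backup("products", "products_backup")
    print(method, path, json.dumps(body))
    # Against a real cluster (address is an assumption):
    # print(send("http://localhost:9200", method, path, body))
```

Once the copy has finished, the _update_by_query or _delete_by_query can be run against the original index, and the backup index deleted when the result has been checked.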
A repository for backups can be created on several kinds of storage. All of the following repository types are supported:
- Shared file systems: fs repository type
- Read-only URLs: url repository type
- S3: s3 repository type
- HDFS: hdfs repository type
- Azure: azure repository type
- Google Cloud Storage: gcs repository type ...
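To make the repository types concrete, here is a hedged sketch in Python of the three requests involved in a shared file system backup: registering an fs repository, taking a snapshot into it, and restoring from it. The repository name my_backup, the snapshot name snapshot_1, the location /mount/backups/my_backup, and the helper names are all hypothetical. For the fs type, the location is also expected to be listed under path.repo in each node's elasticsearch.yml.

```python
import json


def repository_request(repo_name, location):
    """Build the request that registers a shared file system (fs) repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return "PUT", f"/_snapshot/{repo_name}", body


def snapshot_request(repo_name, snapshot_name):
    """Build the request that takes a snapshot and waits until it completes."""
    path = f"/_snapshot/{repo_name}/{snapshot_name}?wait_for_completion=true"
    return "PUT", path, None


def restore_request(repo_name, snapshot_name):
    """Build the request that restores a previously taken snapshot."""
    return "POST", f"/_snapshot/{repo_name}/{snapshot_name}/_restore", None


if __name__ == "__main__":
    # Hypothetical names and path, used only for this example.
    steps = [
        repository_request("my_backup", "/mount/backups/my_backup"),
        snapshot_request("my_backup", "snapshot_1"),
        restore_request("my_backup", "snapshot_1"),
    ]
    for method, path, body in steps:
        print(method, path, json.dumps(body) if body is not None else "")
```

Each tuple can be sent with any HTTP client. Switching to another repository type means changing the type value and supplying that type's own settings in place of location.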