Chapter 3. Clustering, Sharding, and Replication
As your data grows, it becomes increasingly difficult to keep the entire database in a single physical location, and it is often more efficient to spread the data across more than one machine. RethinkDB is a distributed database. This means that it consists of multiple connected machines, each of which stores part of the data, although to users it appears as a single, centralized database.
This chapter is about scaling RethinkDB by setting up and managing database clusters, which are groups of servers that serve the same database and its associated tables. We will look at how to set up a database cluster and how to add machines to it.
In this chapter, you will also learn about the following topics:
- Managing a RethinkDB cluster
- What replication is, and how to replicate tables
- What sharding is, and how to implement it within RethinkDB
Before we start working on the database, we will give a brief definition of scaling and explain...