MongoDB Cookbook - Second Edition: Modern Database Management Made Easy

Product type: Paperback
Published: Jan 2016
ISBN-13: 9781785289989
Length: 370 pages
Edition: 2nd Edition
Authors (2): Amol Nayak, Cyrus Dasadia

Table of Contents (12 chapters)

Preface
1. Installing and Starting the Server
2. Command-line Operations and Indexes
3. Programming Language Drivers
4. Administration
5. Advanced Operations
6. Monitoring and Backups
7. Deploying MongoDB on the Cloud
8. Integration with Hadoop
9. Open Source and Proprietary Tools
A. Concepts for Reference
Index

Starting multiple instances as part of a replica set

In this recipe, we will look at starting multiple servers on the same host as a cluster. A single mongo server is enough for development or for non-mission-critical applications. Crucial production deployments need high availability: if one server instance fails, another takes over and the data remains available to query, insert, or update. Clustering is an advanced concept, and one recipe cannot do it justice. Here, we only scratch the surface; other recipes in the administration section later in the book go into more detail. We will start multiple mongo server processes on the same machine for testing purposes. In a production environment, they would run on different machines (or virtual machines) in the same or even different data centers.

Let's briefly see what a replica set is. As the name suggests, it is a set of servers that are replicas of each other in terms of data. How they are kept in sync, along with other internals, is deferred to later recipes in the administration section. One thing to remember is that write operations happen only on one node, the primary. By default, all queries are also served by the primary, though we may explicitly permit read operations on secondary instances. An important fact to remember is that replica sets are not meant to achieve scalability by distributing read operations across their nodes. Their sole objective is to ensure high availability.

Getting ready

Though not a prerequisite, taking a look at the Starting a single node instance using command-line options recipe will make things easier if you are not familiar with the various command-line options and their significance when starting a mongo server. Additionally, the binaries and setup described for the single server must be in place before we continue with this recipe. Let's sum up what we need to do.

We will start three mongod processes (mongo server instances) on our localhost.

We will create three data directories, /data/n1, /data/n2, and /data/n3, for Node1, Node2, and Node3, respectively. Similarly, we will redirect the logs to /logs/n1.log, /logs/n2.log, and /logs/n3.log. The following image will give you an idea of how the cluster would look:


How to do it…

Let's take a look at the steps in detail:

  1. Create the /data/n1, /data/n2, and /data/n3 directories for the data of the three nodes and the /logs directory for their logs. On the Windows platform, you can choose the c:\data\n1, c:\data\n2, c:\data\n3, and c:\logs\ directories, or any other directories of your choice. Ensure that the mongo server has write permissions on these directories so that it can write the data and logs.
  2. Start the three servers as follows. Users on the Windows platform need to skip the --fork option as it is not supported:
    $ mongod --replSet repSetTest --dbpath /data/n1 --logpath /logs/n1.log --port 27000 --smallfiles --oplogSize 128 --fork
    $ mongod --replSet repSetTest --dbpath /data/n2 --logpath /logs/n2.log --port 27001 --smallfiles --oplogSize 128 --fork
    $ mongod --replSet repSetTest --dbpath /data/n3 --logpath /logs/n3.log --port 27002 --smallfiles --oplogSize 128 --fork
    
  3. Start the mongo shell and connect to any of the running mongo servers. In this case, we connect to the first one (listening on port 27000). Execute the following command:
    $ mongo localhost:27000
    
  4. Try to execute an insert operation from the mongo shell after connecting to it:
    > db.person.insert({name:'Fred', age:35})
    

    This operation should fail as the replica set has not been initialized yet. More information can be found in the How it works… section.

  5. The next step is to start configuring the replica set. We start by preparing a JSON configuration in the shell as follows:
    cfg = {
      '_id':'repSetTest', 'members':[ {'_id':0, 'host': 'localhost:27000'}, {'_id':1, 'host': 'localhost:27001'}, {'_id':2, 'host': 'localhost:27002'} ]
    }
  6. The last step is to initiate the replica set with the preceding configuration as follows:
    > rs.initiate(cfg)
    
  7. After a few seconds, execute rs.status() in the shell to see the status. By then, one of the servers should have become the primary and the remaining two should have become secondaries.
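
The configuration document from step 5 can also be generated rather than typed by hand. The following is a minimal sketch in plain JavaScript, written so that it should run in the mongo shell as well as in Node.js. Note that buildReplSetConfig is a hypothetical helper name introduced here for illustration and is not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): builds a replica set
// configuration document for members that all run on one host.
function buildReplSetConfig(setName, host, ports) {
  var members = [];
  for (var i = 0; i < ports.length; i++) {
    // Each member gets a unique numeric _id and a 'host:port' address.
    members.push({ _id: i, host: host + ':' + ports[i] });
  }
  return { _id: setName, members: members };
}

// Same set name, host, and ports as used in this recipe.
var cfg = buildReplSetConfig('repSetTest', 'localhost', [27000, 27001, 27002]);
```

In the mongo shell, the resulting cfg object could then be passed to rs.initiate(cfg) exactly as in step 6.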

How it works…

We described the common command-line options in detail earlier, in the Starting a single node instance using command-line options recipe.

As we are starting three independent mongod services, we have three dedicated database paths on the filesystem. Similarly, each process has its own log file. We then start the three mongod processes with the database and log file paths specified. As this setup is for test purposes and runs on a single machine, we use the --smallfiles and --oplogSize options to limit the disk space taken by the data files and the oplog. As these processes run on the same host, we also choose the ports explicitly to avoid port conflicts. The ports we chose here are 27000, 27001, and 27002. When the servers run on different hosts, separate ports are not required, and we can use the default port whenever possible.
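
To make the per-node differences explicit, the three command lines can be derived from the node number and port alone. This is a minimal sketch that only assembles strings from the options shown in this recipe; it does not start any process. The mongodCommand helper is a hypothetical name.

```javascript
// Hypothetical helper: assembles the mongod command line for node n
// (1-based) using only the options shown in this recipe.
function mongodCommand(n, port) {
  return [
    'mongod',
    '--replSet', 'repSetTest',
    '--dbpath', '/data/n' + n,            // dedicated data directory per node
    '--logpath', '/logs/n' + n + '.log',  // separate log file per node
    '--port', String(port),               // explicit port to avoid conflicts
    '--smallfiles',
    '--oplogSize', '128',
    '--fork'                              // omit on Windows
  ].join(' ');
}

var basePort = 27000;
var commands = [];
for (var n = 1; n <= 3; n++) {
  commands.push(mongodCommand(n, basePort + n - 1));
}
```

Only the data path, log path, and port change between nodes; the replica set name must be identical on all three.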

The --fork option demands some explanation. By choosing this option, we start the server as a background process from our operating system's shell and get the control back in the shell where we can then start more such mongod processes or perform other operations. In the absence of the --fork option, we cannot start more than one process per shell and would need to start three mongod processes in three separate shells.

If we take a look at the logs generated in the log directory, we should see lines similar to the following:

[rsStart] replSet can't get local.system.replset config from self or any seed (EMPTYCONFIG)
[rsStart] replSet info you may need to run replSetInitiate -- rs.initiate() in the shell -- if that is not already done

Though we started three mongod processes with the --replSet option, we still haven't configured them to work with each other as a replica set. This command-line option only tells the server on startup that the process will run as part of a replica set; the value passed to the option is the name of that replica set. This also explains why the insert operation executed on one of the nodes failed before the replica set was initialized. In a mongo replica set, there can be only one primary node, where all the inserting and querying happens. In the image shown, the N1 node is the primary and listens on port 27000 for client connections. All the other nodes are secondary instances, which sync themselves with the primary; querying is disabled on them by default. Only when the primary goes down does one of the secondaries take over and become the primary. However, it is possible to query a secondary for data, as shown in the image; we will see how to do so in the next recipe.
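
One way to observe this state from the shell is the db.isMaster() helper, which reports whether the connected node currently accepts writes. The sketch below works on a result document rather than on a live server so that it stays self-contained. Here, describeRole is a hypothetical helper, and the two sample documents are hand-written illustrations that assume only the ismaster and secondary fields; they are not captured server output.

```javascript
// Hypothetical helper: classifies a node from a db.isMaster() result.
function describeRole(isMasterResult) {
  if (isMasterResult.ismaster === true) {
    return 'primary';    // accepts writes
  }
  if (isMasterResult.secondary === true) {
    return 'secondary';  // syncs from the primary
  }
  return 'not ready';    // for example, replica set not yet initiated
}

// Assumed shapes, reduced to the two fields the helper reads.
var beforeInitiate = { ismaster: false, secondary: false };
var afterElection = { ismaster: true, secondary: false };

var roleBefore = describeRole(beforeInitiate);
var roleAfter = describeRole(afterElection);
```

In a shell connected to one of the nodes, describeRole(db.isMaster()) could be used in the same way.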

Well, all that is left now is to configure the replica set by grouping the three processes that we started. This is done by first defining a JSON object as follows:

cfg = {
  '_id':'repSetTest', 'members':[ {'_id':0, 'host': 'localhost:27000'}, {'_id':1, 'host': 'localhost:27001'}, {'_id':2, 'host': 'localhost:27002'} ]
}

There are two fields, _id and members: the unique ID of the replica set and an array of the hostnames and port numbers of the mongod server processes that are part of this replica set, respectively. Using localhost to refer to the host is usually discouraged; however, as we started all the processes on the same machine, it is acceptable here. It is preferable to refer to the hosts by their hostnames even if they are running on the local machine. Note that you cannot mix localhost and hostnames in the same configuration; it is either one or the other. To configure the replica set, we then connect to any one of the three running mongod processes; in this case, we connect to the first one and execute the following from the shell:

> rs.initiate(cfg)
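
The rule about not mixing localhost and hostnames can be checked before the configuration is submitted. The following is a minimal sketch; mixesLocalhostAndHostnames is a hypothetical helper, and it only recognizes the literal name localhost, not loopback IP addresses.

```javascript
// Hypothetical check: returns true when a configuration mixes
// 'localhost' members with members addressed by hostname.
function mixesLocalhostAndHostnames(cfg) {
  var localCount = 0;
  for (var i = 0; i < cfg.members.length; i++) {
    // The host part is the text before the ':' separator.
    var host = cfg.members[i].host.split(':')[0];
    if (host === 'localhost') {
      localCount++;
    }
  }
  // Mixed means some, but not all, members use 'localhost'.
  return localCount > 0 && localCount < cfg.members.length;
}

var consistentCfg = {
  _id: 'repSetTest',
  members: [
    { _id: 0, host: 'localhost:27000' },
    { _id: 1, host: 'localhost:27001' }
  ]
};
var mixedCfg = {
  _id: 'repSetTest',
  members: [
    { _id: 0, host: 'localhost:27000' },
    { _id: 1, host: 'Amol-PC:27001' }
  ]
};

var consistentIsMixed = mixesLocalhostAndHostnames(consistentCfg);
var mixedIsMixed = mixesLocalhostAndHostnames(mixedCfg);
```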

The _id field in the cfg object passed has a value that is the same as the value we gave to the --replSet option on the command prompt when we started the server processes. Not giving the same value would throw the following error:

{
        "ok" : 0,
        "errmsg" : "couldn't initiate : set name does not match the set name host Amol-PC:27000 expects"
}
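
This mismatch can be caught on the client side before the server is contacted. The sketch below is one possible guard, not the server's own validation: initiateIfNameMatches is a hypothetical wrapper, its error message is the wrapper's own wording, and fakeInitiate is a stub standing in for rs.initiate so that the example runs without a server.

```javascript
// Hypothetical wrapper: calls initiateFn only when the _id in cfg
// equals the name that was passed to --replSet.
function initiateIfNameMatches(cfg, replSetName, initiateFn) {
  if (cfg._id !== replSetName) {
    return {
      ok: 0,
      errmsg: "cfg._id '" + cfg._id + "' does not match --replSet '" + replSetName + "'"
    };
  }
  return initiateFn(cfg);
}

// Stub standing in for rs.initiate; it always reports success.
function fakeInitiate(cfg) {
  return { ok: 1 };
}

var good = initiateIfNameMatches({ _id: 'repSetTest', members: [] }, 'repSetTest', fakeInitiate);
var bad = initiateIfNameMatches({ _id: 'wrongName', members: [] }, 'repSetTest', fakeInitiate);
```

In the mongo shell, the stub could be replaced by function (c) { return rs.initiate(c); }.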

If all goes well and the initiate call is successful, we should see something similar to the following JSON response on the shell:

{"ok" : 1}

In a few seconds, the prompt of the shell from which we executed this command should change to show whether the node is now a primary or a secondary. The following is an example of the prompt of a shell connected to the primary member of the replica set:

repSetTest:PRIMARY>

Executing rs.status() should give us some stats on the replica set's status, which we will explore in depth in a later recipe in the administration section. For now, the stateStr field is the important one; it contains PRIMARY, SECONDARY, or another state name for each member.
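
As an illustration of reading stateStr, the following minimal sketch picks the primary out of a status document. The findPrimary helper is a hypothetical name, and sampleStatus is a hand-written document limited to the fields used here; real rs.status() output contains many more.

```javascript
// Hypothetical helper: returns the name of the member whose stateStr is
// 'PRIMARY', or null if no primary has been elected yet.
function findPrimary(status) {
  for (var i = 0; i < status.members.length; i++) {
    if (status.members[i].stateStr === 'PRIMARY') {
      return status.members[i].name;
    }
  }
  return null;
}

// Hand-written sample, not captured output.
var sampleStatus = {
  set: 'repSetTest',
  members: [
    { _id: 0, name: 'localhost:27000', stateStr: 'PRIMARY' },
    { _id: 1, name: 'localhost:27001', stateStr: 'SECONDARY' },
    { _id: 2, name: 'localhost:27002', stateStr: 'SECONDARY' }
  ]
};

var primary = findPrimary(sampleStatus);
```

In the mongo shell, findPrimary(rs.status()) could be called once the replica set has been initiated.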

There's more…

Look at the Connecting to the replica set in the shell to query and insert data recipe to perform more operations from the shell after connecting to a replica set. Replication isn't as simple as we saw here. See the administration section for more advanced recipes on replication.

See also

If you are looking to convert a standalone instance to a replica set, then the instance with the data needs to become a primary first, and then empty secondary instances will be added to which the data will be synchronized. Refer to the following URL on how to perform this operation:

http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/
