In this article by Gabriel A. Canepa, author of the book CentOS High Performance, we will review the basic principles of clustering and show you, step by step, how to set up two CentOS 7 servers as nodes to later use them as members of a cluster.
As part of this process, we will install CentOS 7 from scratch on a brand new server as our first cluster member, along with the necessary packages, and finally configure key-based authentication for SSH access from one node to the other.
In computing, a cluster consists of a group of computers (which are referred to as nodes or members) that work together so that the set is seen as a single system from the outside.
One typical cluster setup involves assigning a different task to each node, thus achieving a higher performance than if several tasks were performed by a single member on its own. Another classic use of clustering is helping to ensure high availability by providing failover capabilities to the set, where one node may automatically replace a failed member to minimize the downtime of one or several critical services. In either case, the concept of clustering implies not only taking advantage of the computing functionality of each member alone, but also maximizing it by complementing it with the others.
As we just mentioned, high-availability (HA) clusters aim to minimize system downtime by failing over services from one node to another when one of them experiences an issue that renders it inoperative. As opposed to switchover, which requires human intervention, a failover procedure is performed automatically by the cluster, with little or no interruption of service. In other words, this operation is largely transparent to end users and clients outside the cluster.
On the other hand, high-performance (HP) clusters use their nodes to perform operations in parallel in order to enhance the performance of one or more applications. High-performance clusters are typically seen in scenarios involving applications that process large collections of data.
As the saying goes, "Every journey begins with a small step," so we will begin our own journey toward clustering by setting up the separate nodes that will make up our system. Our operating system of choice is Linux, with CentOS 7 as the distribution, that being the latest available release of CentOS at the time of writing. Its binary compatibility with Red Hat Enterprise Linux (one of the most widely used distributions in enterprise and scientific environments), along with its well-proven stability, are the reasons behind this decision.
CentOS 7 and the previous versions of the distribution are available for download, free of charge, from the project's website at http://www.centos.org/. In addition, specific details about the release can always be consulted in the CentOS wiki at http://wiki.centos.org/Manuals/ReleaseNotes/CentOS7.
Among the distinguishing features of CentOS 7, I would like to name the following:
To download CentOS, go to http://www.centos.org/download/ and click on one of the three options outlined in the following figure:
Download options for CentOS 7
These options are detailed as follows:
The minimal install is sufficient for our purposes, since we can install any other packages we need with yum later, so that is the recommended .iso file to download.
Here, X indicates the current update number of CentOS 7, and YYMM represents the year and month, both in two-digit notation, when the source code this version is based on was released.
This tells us that the source code this release is based on dates from June 2014.
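As a rough illustration of the naming scheme just described, the following sketch splits an .iso name into its update number and date stamp. The file name used here (CentOS-7.0-1406-x86_64-Minimal.iso) and the `parse_centos_iso` helper are assumptions made for the example; check the actual name of the file you downloaded, as the layout may differ between releases.

```shell
# Hypothetical helper: split an ISO name of the ASSUMED form
# CentOS-7.X-YYMM-<arch>-<variant>.iso into update number and date stamp.
parse_centos_iso() {
    name=$(basename "$1" .iso)
    # Splitting on "-": field 2 is "7.X", field 3 is "YYMM"
    version=$(printf '%s\n' "$name" | cut -d- -f2)
    stamp=$(printf '%s\n' "$name" | cut -d- -f3)
    update=${version#*.}                      # text after the first dot
    yy=$(printf '%s' "$stamp" | cut -c1-2)    # two-digit year
    mm=$(printf '%s' "$stamp" | cut -c3-4)    # two-digit month
    printf 'update=%s year=20%s month=%s\n' "$update" "$yy" "$mm"
}

# Assumed example file name; prints: update=0 year=2014 month=06
parse_centos_iso CentOS-7.0-1406-x86_64-Minimal.iso
```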
Independently of our preferred download method, we will need this .iso file in order to begin with the installation. In addition, feel free to burn it to optical media or a USB drive.
If you do not have dedicated hardware that you can use to set up the nodes of your cluster, you can still create one using virtual machines running on virtualization software such as Oracle VirtualBox or VMware.
The following setup is going to be performed on a VirtualBox VM with 1 GB of RAM and 30 GB of disk space. We will use the default partitioning scheme over LVM, as suggested by the installation process.
The splash screen shown in the following screenshot is the first step in the installation process. Highlight Install CentOS 7 using the up and down arrows and press Enter:
Splash screen before starting the installation of CentOS 7
Select English (or your preferred installation language) and click on Continue, as shown in the following screenshot:
Selecting the language for the installation of CentOS 7
In the following screenshot, you can choose a keyboard layout, set the current date and time, choose a partitioning method, connect the main network interface, and assign a unique hostname for the node. We will name the current node node01 and leave the rest of the settings as default (we will configure the extra network card later). Then, click on Begin installation:
Configure keyboard layout, date and time, network and hostname, and partitioning schema
While the installation continues in the background, we will be prompted to set the password for the root account and create an administrative user for the node. Once these steps have been confirmed, the corresponding warnings no longer appear, as shown in the following screenshot:
Setting the password for root and creating an administrative user account
When the process is completed, click on Finish configuration and the installation will finish configuring the system and devices. When the system is ready to boot on its own, you will be prompted to do so. Remove the installation media and click on Reboot.
Now, we can proceed with setting up our network interfaces.
Our rather basic network infrastructure consists of two CentOS 7 boxes, with the host names node01 [192.168.0.2] and node02 [192.168.0.3], respectively, and a gateway router simply called gateway [192.168.0.1].
In CentOS, network cards are configured using scripts in the /etc/sysconfig/network-scripts directory. This is the minimum content that is needed in /etc/sysconfig/network-scripts/ifcfg-enp0s3 for our purposes:
HWADDR="08:00:27:C8:C2:BE"
TYPE="Ethernet"
BOOTPROTO="static"
NAME="enp0s3"
ONBOOT="yes"
IPADDR="192.168.0.2"
NETMASK="255.255.255.0"
GATEWAY="192.168.0.1"
PEERDNS="yes"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
Note that the HWADDR value (and the UUID value, if your file contains one) will be different in your case. In addition, be aware that cluster machines need to be assigned a static IP address; never leave that up to DHCP! In the preceding configuration file, we used Google's DNS servers, but feel free to use other DNS servers if you wish.
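Since node02 needs the same file with only a few values changed, it can help to generate the content from parameters instead of editing it by hand. The following is a minimal sketch under that assumption: `make_ifcfg` is a hypothetical helper (not part of CentOS), and the MAC address in the example is a placeholder that you must replace with the one reported for your own NIC.

```shell
# Hypothetical helper: print a static ifcfg file for one interface.
# Arguments: interface name, MAC address, IP address, netmask, gateway.
make_ifcfg() {
    cat <<EOF
HWADDR="$2"
TYPE="Ethernet"
BOOTPROTO="static"
NAME="$1"
ONBOOT="yes"
IPADDR="$3"
NETMASK="$4"
GATEWAY="$5"
PEERDNS="yes"
DNS1="8.8.8.8"
DNS2="8.8.4.4"
EOF
}

# Example for node02 (the MAC address is a placeholder). The result is
# printed to standard output so that it can be reviewed first.
make_ifcfg enp0s3 "08:00:27:AA:BB:CC" 192.168.0.3 255.255.255.0 192.168.0.1
```

Once the output looks right, redirect it as root into /etc/sysconfig/network-scripts/ifcfg-enp0s3 on the corresponding node.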
When you're done making changes, save the file and restart the network service in order to apply them:
systemctl restart network.service # Restart the network service
You can verify that the previous changes have taken effect (shown in the Restarting the network service and verifying settings figure) with the following two commands. The first one displays the status of the network service:
systemctl status network.service # Display the status of the network service
The second one lists the IPv4 addresses assigned to the network interfaces:
ip addr | grep 'inet ' # Display the IP addresses
Restarting the network service and verifying settings
You can disregard all error messages related to the loopback interface, as shown in the preceding screenshot. However, you will need to carefully examine any error messages related to the enp0s3 interface and resolve them before proceeding further.
The second interface will be called enp0sX, where X is typically 8. You can verify this with the following command (shown in the following figure):
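If you prefer to check the assigned address from a script instead of reading the output by eye, the one-line-per-address format of `ip -o -4 addr show` is easy to filter. The sketch below is only an illustration: `iface_ipv4` is a hypothetical helper, and a captured sample line stands in for live output so that the example runs anywhere.

```shell
# Hypothetical helper: read `ip -o -4 addr show` output on standard input
# and print the IPv4 address (without prefix length) of one interface.
iface_ipv4() {
    awk -v ifc="$1" '$2 == ifc && $3 == "inet" { sub(/\/.*/, "", $4); print $4 }'
}

# On a real node you would run: ip -o -4 addr show | iface_ipv4 enp0s3
# Here a sample line stands in for the live output.
printf '%s\n' '2: enp0s3    inet 192.168.0.2/24 brd 192.168.0.255 scope global enp0s3' \
    | iface_ipv4 enp0s3
```

On node01, the result should match the IPADDR value set in ifcfg-enp0s3.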
ip link show
Displaying NIC information
As for the configuration file of enp0s8, you can safely create it by copying the contents of ifcfg-enp0s3. Do not forget, however, to change the hardware (MAC) address to the one reported for the new NIC, and leave the IP address field blank for now.
ip link show enp0s8
cp /etc/sysconfig/network-scripts/ifcfg-enp0s3 /etc/sysconfig/network-scripts/ifcfg-enp0s8
Then, restart the network service.
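The manual edits to the copied file can also be expressed as a small filter. This is a sketch, not the author's procedure: `adapt_ifcfg` is a hypothetical helper, the MAC address is a placeholder, and updating the NAME value to match the new interface is an assumption added here for consistency.

```shell
# Hypothetical helper: read an ifcfg file on standard input and print a
# version adapted for another interface.
# Arguments: new interface name, new MAC address.
adapt_ifcfg() {
    sed -e "s/^NAME=.*/NAME=\"$1\"/" \
        -e "s/^HWADDR=.*/HWADDR=\"$2\"/" \
        -e 's/^IPADDR=.*/IPADDR=""/'
}

# On a node (as root, from /etc/sysconfig/network-scripts):
#   adapt_ifcfg enp0s8 "08:00:27:DD:EE:FF" < ifcfg-enp0s3 > ifcfg-enp0s8
# Demonstration on three sample lines:
printf '%s\n' 'HWADDR="08:00:27:C8:C2:BE"' 'NAME="enp0s3"' 'IPADDR="192.168.0.2"' \
    | adapt_ifcfg enp0s8 "08:00:27:DD:EE:FF"
```

Lines that the filter does not match, such as NETMASK or DNS1, pass through unchanged.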
Note that you will also need to set up at least a basic name resolution method. Considering that we will set up a cluster with only two nodes, we will use /etc/hosts for this purpose.
Edit /etc/hosts with the following content:
192.168.0.2 node01
192.168.0.3 node02
192.168.0.1 gateway
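Because the same entries have to be present on both nodes, it can be convenient to add them in a way that is safe to repeat. The following sketch assumes a hypothetical `add_host_entry` helper and works on a scratch file; point it at /etc/hosts (as root) only after you are satisfied with the result.

```shell
# Hypothetical helper: append "IP name" to a hosts file unless a line
# ending in (or containing) that host name is already present.
# Arguments: hosts file, IP address, host name.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    if ! grep -qE "[[:space:]]$name([[:space:]]|\$)" "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Demonstration on a scratch file instead of the real /etc/hosts
scratch=$(mktemp)
add_host_entry "$scratch" 192.168.0.2 node01
add_host_entry "$scratch" 192.168.0.3 node02
add_host_entry "$scratch" 192.168.0.1 gateway
add_host_entry "$scratch" 192.168.0.2 node01   # repeated call adds nothing
cat "$scratch"
rm -f "$scratch"
```

After updating the real file, `getent hosts node02` run on node01 should return 192.168.0.3.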
In this article, we reviewed how to install the operating system and listed the necessary software components to implement the basic cluster functionality.