In this article by Arthur Berezin, the author of the book OpenStack Configuration Cookbook, we will cover how to deploy the OpenStack control plane services in a highly available configuration using Pacemaker, HAProxy, a Galera cluster for MariaDB, and RabbitMQ with mirrored queues.
Many organizations choose OpenStack for its distributed architecture and its ability to deliver an Infrastructure as a Service (IaaS) platform for mission-critical applications. In such environments, it is crucial to configure all OpenStack services in a highly available configuration to provide as much uptime as possible for the control plane services of the cloud. A highly available control plane for OpenStack can be deployed in various configurations, each of which serves a certain set of demands and introduces a growing set of prerequisites.
Pacemaker is used to create active-active clusters that guarantee the services' resilience to possible faults. Pacemaker is also used to create a virtual IP address for each of the services. HAProxy serves as a load balancer for incoming calls to the services' APIs.
This article covers neither high availability of virtual machine instances nor that of the Nova-Compute service running on the hypervisors.
Most OpenStack services are stateless and store their persistent data in a SQL database, which is potentially a single point of failure that we should make highly available. In this article, we will deploy a highly available database using MariaDB and Galera, which implements multimaster replication. To ensure availability of the message bus, we will configure RabbitMQ with mirrored queues.
This article discusses configuring each service separately on a three-controller layout that runs the OpenStack controller services, including Neutron, the database, and the RabbitMQ message bus. All services can be configured on the same set of controller nodes, or each service can be deployed on its own separate set of hosts.
All OpenStack services run as Linux system services. The first step in ensuring the services' availability is to configure Pacemaker clusters for each service, so that Pacemaker monitors the services. In case of failure, Pacemaker restarts the failed service. In addition, we will use Pacemaker to create a virtual IP address for each of OpenStack's services, so that services remain accessible at the same IP address when a failure occurs and the actual service has relocated to another host.
In this section, we will install Pacemaker and prepare it to configure highly available OpenStack services.
To ensure maximum availability, we will install and configure three hosts to serve as controller nodes. Prepare three controller hosts with identical hardware and network layout. We will base our configuration for most of the OpenStack services on the configuration used in a single controller layout, and we will deploy Neutron network services on all three controller nodes.
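The commands and configuration files that follow refer to the controller nodes and the service virtual IPs by name (controller1 to controller3, vip-db, vip-rabbitmq, vip-keystone, and vip-mysql). As a purely illustrative sketch, you could resolve these names locally with /etc/hosts entries on every host; the controller addresses below are placeholders you must replace with your own layout, and vip-mysql is assumed here to refer to the same database virtual IP as vip-db:
[controller1-IP]   controller1
[controller2-IP]   controller2
[controller3-IP]   controller3
192.168.16.200     vip-db vip-mysql
192.168.16.213     vip-rabbitmq
192.168.16.202     vip-keystone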
Run the following steps on all three controller nodes:
[root@controller1 ~]# yum install -y pcs pacemaker corosync fence-agents-all resource-agents
[root@controller1 ~]# systemctl enable pcsd
[root@controller1 ~]# systemctl start pcsd
[root@controller1 ~]# echo 'password' | passwd --stdin hacluster
We will use the hacluster user's password throughout the high availability configuration.
[root@controller1 ~]# pcs cluster auth controller1 controller2 controller3 -u hacluster -p password --force
At this point, you may run pcs commands from a single controller node instead of running commands on each node separately.
You may find the complete Pacemaker documentation, which includes installation instructions, a complete configuration reference, and examples, on the Cluster Labs website at http://clusterlabs.org/doc/.
Addressing high availability for OpenStack also means avoiding overload of any single host and ensuring that incoming TCP connections to all API endpoints are balanced across the controller hosts. We will use HAProxy, an open source load balancer, which is particularly suited for HTTP load balancing as it supports session persistence and layer 7 processing.
In this section, we will install HAProxy on all controller hosts, configure a Pacemaker cluster for the HAProxy service, and prepare for the OpenStack services configuration.
Run the following steps on all controller nodes:
# yum install -y haproxy
# echo net.ipv4.ip_nonlocal_bind=1 >> /etc/sysctl.d/haproxy.conf
# echo 1 > /proc/sys/net/ipv4/ip_nonlocal_bind
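To confirm that the kernel now allows binding to non-local addresses, which HAProxy needs so it can bind the virtual IPs on nodes that do not currently hold them, you can query the parameter; it should report a value of 1:
# sysctl net.ipv4.ip_nonlocal_bind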
Place the following configuration in /etc/haproxy/haproxy.cfg on each controller node:
global
    daemon
defaults
    mode tcp
    maxconn 10000
    timeout connect 2s
    timeout client 10s
    timeout server 10s
frontend vip-db
    bind 192.168.16.200:3306
    timeout client 90s
    default_backend db-vms-galera
backend db-vms-galera
    option httpchk
    stick-table type ip size 2
    stick on dst
    timeout server 90s
    server rhos5-db1 192.168.16.58:3306 check inter 1s port 9200
    server rhos5-db2 192.168.16.59:3306 check inter 1s port 9200
    server rhos5-db3 192.168.16.60:3306 check inter 1s port 9200
frontend vip-rabbitmq
    bind 192.168.16.213:5672
    timeout client 900m
    default_backend rabbitmq-vms
backend rabbitmq-vms
    balance roundrobin
    timeout server 900m
    server rhos5-rabbitmq1 192.168.16.61:5672 check inter 1s
    server rhos5-rabbitmq2 192.168.16.62:5672 check inter 1s
    server rhos5-rabbitmq3 192.168.16.63:5672 check inter 1s
frontend vip-keystone-admin
    bind 192.168.16.202:35357
    default_backend keystone-admin-vms
backend keystone-admin-vms
    balance roundrobin
    server rhos5-keystone1 192.168.16.64:35357 check inter 1s
    server rhos5-keystone2 192.168.16.65:35357 check inter 1s
    server rhos5-keystone3 192.168.16.66:35357 check inter 1s
frontend vip-keystone-public
    bind 192.168.16.202:5000
    default_backend keystone-public-vms
backend keystone-public-vms
    balance roundrobin
    server rhos5-keystone1 192.168.16.64:5000 check inter 1s
    server rhos5-keystone2 192.168.16.65:5000 check inter 1s
    server rhos5-keystone3 192.168.16.66:5000 check inter 1s
This configuration file is an example of configuring HAProxy to load balance the MariaDB, RabbitMQ, and Keystone services.
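Before handing the service over to Pacemaker, it is worth validating the configuration file syntax; HAProxy provides a check mode for this:
# haproxy -c -f /etc/haproxy/haproxy.cfg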
# pcs cluster auth controller1 controller2 controller3 -u hacluster -p password --force
Create a Pacemaker cluster for the HAProxy service as follows:
Note that you can run pcs commands now from a single controller node.
# pcs cluster setup --name ha-controller controller1 controller2 controller3
# pcs cluster enable --all
# pcs cluster start --all
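Note that Pacemaker will not start resources while the stonith-enabled cluster property is true and no fence devices have been configured. If you are working in a test environment and have not set up fencing yet, you can temporarily disable STONITH (not recommended for production clusters):
# pcs property set stonith-enabled=false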
Add HAProxy as a cloned systemd resource so that Pacemaker runs and monitors it on all controller nodes:
# pcs resource create lb-haproxy systemd:haproxy op monitor start-delay=10s --clone
Create the virtual IP addresses that HAProxy binds to for each service:
# pcs resource create vip-db IPaddr2 ip=192.168.16.200
# pcs resource create vip-rabbitmq IPaddr2 ip=192.168.16.213
# pcs resource create vip-keystone IPaddr2 ip=192.168.16.202
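Optionally, you can add ordering and colocation constraints so that each virtual IP is started before, and kept together with, the cloned HAProxy resource. A minimal sketch for the database virtual IP, assuming the default lb-haproxy-clone name generated for the cloned resource, would be:
# pcs constraint order start vip-db then lb-haproxy-clone
# pcs constraint colocation add lb-haproxy-clone with vip-db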
You may use the pcs status command to verify that all resources are running successfully:
# pcs status
Galera is a multimaster cluster for MariaDB based on synchronous replication between all cluster nodes. Effectively, Galera treats a cluster of MariaDB nodes as one single master node that reads from and writes to all nodes. Galera replication happens at transaction commit time, by broadcasting the transaction write set to the cluster for application. Clients connect directly to the DBMS and experience behavior close to that of the native DBMS. The wsrep API (write set replication API) defines the interface between Galera replication and the DBMS.
In this section, we will install the Galera cluster packages for MariaDB on our three controller nodes, and then we will configure Pacemaker to monitor all Galera services.
If Pacemaker is still running from the previous steps, stop it on all cluster nodes as shown:
# pcs cluster stop --all
Perform the following steps on all controller nodes:
# yum install -y mariadb-galera-server xinetd resource-agents
MYSQL_USERNAME="clustercheck" MYSQL_PASSWORD="password" MYSQL_HOST="localhost"
Edit the Galera configuration file (for example, /etc/my.cnf.d/galera.cnf) as follows. Make sure to enter the host's IP address in the bind-address parameter.
[mysqld]
skip-name-resolve=1
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_locks_unsafe_for_binlog=1
query_cache_size=0
query_cache_type=0
bind-address=[host-IP-address]
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name="galera_cluster"
wsrep_slave_threads=1
wsrep_certify_nonPK=1
wsrep_max_ws_rows=131072
wsrep_max_ws_size=1073741824
wsrep_debug=0
wsrep_convert_LOCK_to_trx=0
wsrep_retry_autocommit=1
wsrep_auto_increment_control=1
wsrep_drupal_282555_workaround=0
wsrep_causal_reads=0
wsrep_notify_cmd=
wsrep_sst_method=rsync
You can learn more about each of Galera's options on the documentation page at http://galeracluster.com/documentation-webpages/configuration.html.
Create an xinetd service definition for the Galera monitoring script (typically /etc/xinetd.d/galera-monitor):
service galera-monitor
{
    port = 9200
    disable = no
    socket_type = stream
    protocol = tcp
    wait = no
    user = root
    group = root
    groups = yes
    server = /usr/bin/clustercheck
    type = UNLISTED
    per_source = UNLIMITED
    log_on_success =
    log_on_failure = HOST
    flags = REUSE
}
# systemctl enable xinetd
# systemctl start xinetd
# systemctl enable pcsd
# systemctl start pcsd
# pcs cluster auth controller1 controller2 controller3 -u hacluster -p password --force
Now commands can be run from a single controller node.
# pcs cluster setup --name controller-db controller1 controller2 controller3
# pcs cluster enable --all
# pcs cluster start --all
# pcs resource create galera galera enable_creation=true wsrep_cluster_address="gcomm://controller1,controller2,controller3" meta master-max=3 ordered=true op promote timeout=300s on-fail=block --master
mysql -e "CREATE USER 'clustercheck'@'localhost' IDENTIFIED BY 'password';"
You can find the complete Galera documentation, which includes installation instructions, a complete configuration reference, and examples, on the Galera Cluster website at http://galeracluster.com/documentation-webpages/.
RabbitMQ is used as the message bus for inter-service communication. By default, queues are located on a single node, which makes the RabbitMQ service a single point of failure. To avoid this, we will configure RabbitMQ to use mirrored queues across multiple nodes. Each mirrored queue consists of one master and one or more slaves, with the oldest slave being promoted to the new master if the old master disappears for any reason. Messages published to the queue are replicated to all slaves.
In this section, we will install the RabbitMQ packages on our three controller nodes and configure RabbitMQ to mirror its queues across all controller nodes, and then we will configure Pacemaker to monitor all RabbitMQ services.
Perform the following steps on all controller nodes:
# yum -y install rabbitmq-server
Start and then stop the RabbitMQ service once so that it generates the Erlang cookie:
# systemctl start rabbitmq-server
# systemctl stop rabbitmq-server
All nodes of a RabbitMQ cluster must share the same Erlang cookie, so copy it from controller1 to the other controllers:
[root@controller1 ~]# scp /var/lib/rabbitmq/.erlang.cookie root@controller2:/var/lib/rabbitmq/
[root@controller1 ~]# scp /var/lib/rabbitmq/.erlang.cookie root@controller3:/var/lib/rabbitmq/
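After copying the cookie, make sure it keeps the ownership and permissions RabbitMQ expects on controller2 and controller3, since RabbitMQ refuses to start if the cookie is readable by other users:
[root@controller2 ~]# chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
[root@controller2 ~]# chmod 400 /var/lib/rabbitmq/.erlang.cookie
Repeat the same two commands on controller3.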
Start and enable Pacemaker on all nodes:
# systemctl enable pcsd
# systemctl start pcsd
Since we already authenticated all nodes of the cluster in the previous section, we can now run the following commands on controller1.
Create a new Pacemaker cluster for RabbitMQ service as follows:
[root@controller1 ~]# pcs cluster setup --name rabbitmq controller1 controller2 controller3
[root@controller1 ~]# pcs cluster enable --all
[root@controller1 ~]# pcs cluster start --all
Add a systemd resource for the RabbitMQ service to the Pacemaker cluster:
[root@controller1 ~]# pcs resource create rabbitmq-server systemd:rabbitmq-server op monitor start-delay=20s --clone
Since the RabbitMQ nodes must join the cluster one at a time, stop the RabbitMQ application on controller2 and controller3:
[root@controller2 ~]# rabbitmqctl stop_app
[root@controller3 ~]# rabbitmqctl stop_app
Join controller2 to the cluster and start RabbitMQ on it:
[root@controller2 ~]# rabbitmqctl join_cluster rabbit@controller1
[root@controller2 ~]# rabbitmqctl start_app
Now join controller3 to the cluster as well and start RabbitMQ on it:
[root@controller3 ~]# rabbitmqctl join_cluster rabbit@controller1
[root@controller3 ~]# rabbitmqctl start_app
At this point, the cluster is formed, and we need to set RabbitMQ's HA policy to mirror the queues to all RabbitMQ cluster nodes as follows:
[root@controller1 ~]# rabbitmqctl set_policy HA '^(?!amq.).*' '{"ha-mode": "all"}'
The RabbitMQ cluster should now be configured with all the queues mirrored to all controller nodes. To verify the cluster's state, you can use the rabbitmqctl cluster_status and rabbitmqctl list_policies commands from any of the controller nodes as follows:
[root@controller1 ~]# rabbitmqctl cluster_status
[root@controller1 ~]# rabbitmqctl list_policies
To verify Pacemaker's cluster status, you may use the pcs status command as follows:
[root@controller1 ~]# pcs status
For complete documentation on how RabbitMQ implements the mirrored queues feature and additional configuration options, refer to the project's documentation pages at https://www.rabbitmq.com/clustering.html and https://www.rabbitmq.com/ha.html.
Most OpenStack services are stateless web services that keep their persistent data in a SQL database and use a message bus for inter-service communication. We will use Pacemaker and HAProxy to run the OpenStack services in an active-active highly available configuration, so that traffic for each of the services is load balanced across all controller nodes and the cloud can easily be scaled out to more controller nodes if needed. We will configure Pacemaker clusters for each of the services that run on all controller nodes. We will also use Pacemaker to create a virtual IP address for each of OpenStack's services, so rather than addressing a specific node, services are addressed by their corresponding virtual IP address. We will use HAProxy to load balance incoming requests to the services across all controller nodes.
In this section, we will use the virtual IP addresses we created for the services with Pacemaker and HAProxy in the previous sections. We will also configure the OpenStack services to use the highly available Galera-clustered database and RabbitMQ with mirrored queues.
This is an example for the Keystone service; refer to the Packt website for the complete configuration of all OpenStack services.
Perform the following steps on all controller nodes:
# yum install -y openstack-keystone openstack-utils openstack-selinux
[root@controller1 ~]# export SERVICE_TOKEN=$(openssl rand -hex 10)
[root@controller1 ~]# echo $SERVICE_TOKEN > ~/keystone_admin_token
[root@controller1 ~]# scp ~/keystone_admin_token root@controller2:~/keystone_admin_token
[root@controller1 ~]# scp ~/keystone_admin_token root@controller3:~/keystone_admin_token
[root@controller2 ~]# export SERVICE_TOKEN=$(cat ~/keystone_admin_token)
[root@controller3 ~]# export SERVICE_TOKEN=$(cat ~/keystone_admin_token)
Note: Perform the following commands on all controller nodes.
# openstack-config --set /etc/keystone/keystone.conf DEFAULT admin_token $SERVICE_TOKEN
# openstack-config --set /etc/keystone/keystone.conf DEFAULT rabbit_host vip-rabbitmq
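Depending on your OpenStack release and oslo.messaging setup, you may also want to tell the service that the RabbitMQ queues are mirrored, so failover to another queue master is handled cleanly; a hedged example using an option that exists in oslo.messaging:
# openstack-config --set /etc/keystone/keystone.conf DEFAULT rabbit_ha_queues true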
# openstack-config --set /etc/keystone/keystone.conf DEFAULT admin_endpoint 'http://vip-keystone:%(admin_port)s/'
# openstack-config --set /etc/keystone/keystone.conf DEFAULT public_endpoint 'http://vip-keystone:%(public_port)s/'
# openstack-config --set /etc/keystone/keystone.conf database connection mysql://keystone:keystonetest@vip-mysql/keystone
# openstack-config --set /etc/keystone/keystone.conf database max_retries -1
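The connection string above assumes that a keystone database and a keystone database user already exist on the Galera cluster. If they do not, you can create them from any controller; the following sketch simply mirrors the credentials used in the connection string:
# mysql -e "CREATE DATABASE keystone;"
# mysql -e "GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'%' IDENTIFIED BY 'keystonetest';"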
[root@controller1 ~]# keystone-manage pki_setup --keystone-user keystone --keystone-group keystone
[root@controller1 ~]# chown -R keystone:keystone /var/log/keystone /etc/keystone/ssl/
[root@controller1 ~]# su keystone -s /bin/sh -c "keystone-manage db_sync"
Copy the Keystone SSL certificates from controller1 to controller2 and controller3:
[root@controller1 ~]# rsync -av /etc/keystone/ssl/ controller2:/etc/keystone/ssl/
[root@controller1 ~]# rsync -av /etc/keystone/ssl/ controller3:/etc/keystone/ssl/
Make sure that the keystone user owns the newly copied files on controller2 and controller3:
[root@controller2 ~]# chown -R keystone:keystone /etc/keystone/ssl/
[root@controller3 ~]# chown -R keystone:keystone /etc/keystone/ssl/
Add a cloned systemd resource for the Keystone service to the Pacemaker cluster:
[root@controller1 ~]# pcs resource create keystone systemd:openstack-keystone op monitor start-delay=10s --clone
Create the Keystone service and endpoint using the Keystone virtual IP, and create the admin user, role, and tenant:
[root@controller1 ~]# export SERVICE_ENDPOINT="http://vip-keystone:35357/v2.0"
[root@controller1 ~]# keystone service-create --name=keystone --type=identity --description="Keystone Identity Service"
[root@controller1 ~]# keystone endpoint-create --service keystone --publicurl 'http://vip-keystone:5000/v2.0' --adminurl 'http://vip-keystone:35357/v2.0' --internalurl 'http://vip-keystone:5000/v2.0'
[root@controller1 ~]# keystone user-create --name admin --pass keystonetest
[root@controller1 ~]# keystone role-create --name admin
[root@controller1 ~]# keystone tenant-create --name admin
[root@controller1 ~]# keystone user-role-add --user admin --role admin --tenant admin
Create a keystonerc_admin credentials file for the admin user:
cat > ~/keystonerc_admin << EOF
export OS_USERNAME=admin
export OS_TENANT_NAME=admin
export OS_PASSWORD=keystonetest
export OS_AUTH_URL=http://vip-keystone:35357/v2.0/
export PS1='[\u@\h \W(keystone_admin)]\$ '
EOF
Source the keystonerc_admin credentials file to be able to run the authenticated OpenStack commands:
[root@controller1 ~]# source ~/keystonerc_admin
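With the credentials sourced, you can quickly confirm that Keystone answers through its virtual IP by listing the users and endpoints you just created:
[root@controller1 ~]# keystone user-list
[root@controller1 ~]# keystone endpoint-list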
[root@controller1 ~]# keystone tenant-create --name services --description "Services Tenant"
In this article, we covered the installation of Pacemaker and HAProxy, the configuration of a Galera cluster for MariaDB, the installation of RabbitMQ with mirrored queues, and the configuration of highly available OpenStack services.