Storage nodes within a MySQL Cluster store all the data either in memory or on disk; they store indexes in memory and carry out a significant portion of the SQL query processing. The single-threaded storage node process is called ndbd, and either this or the multi-threaded version (ndbmtd) must be installed and executed on each storage node.
Firstly, download the two files required on each storage node (that is, complete this on all storage nodes simultaneously):
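The exact package names and download location depend on the MySQL Cluster version and platform you are using, so treat the following as a placeholder sketch rather than the literal commands; the URL and filenames here are illustrative only, assuming the Cluster storage and tools RPMs used in the previous recipe:
# Run on every storage node; substitute the real download URL and the
# package filenames for your version and platform.
cd /tmp
wget http://example.com/mysql-cluster/MySQL-Cluster-gpl-storage-<version>.rpm
wget http://example.com/mysql-cluster/MySQL-Cluster-gpl-tools-<version>.rpm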
Once both files are downloaded, install these two packages using the same command that was used in the previous recipe:
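Assuming RPM packages as in the previous recipe, the install step is a single rpm command; the wildcards below are only a convenience and assume no other MySQL-Cluster RPMs are present in the directory:
# Install the storage and tools packages downloaded above
rpm -Uvh MySQL-Cluster-gpl-storage-*.rpm MySQL-Cluster-gpl-tools-*.rpm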
Now, using your favorite text editor, insert the following into the /etc/my.cnf file, replacing 10.0.0.5:1186 with the hostname or IP address of the already installed management node:
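As a minimal sketch (your my.cnf may already contain other sections; only the cluster connect string section is shown here), the entry looks like this:
[mysql_cluster]
# Tell cluster processes on this host where to find the management node
ndb-connectstring=10.0.0.5:1186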
Note
Ensure that you have completed the above steps on all storage nodes before continuing. This is because (unless you force it) a MySQL Cluster will not start without all storage nodes, and it is best practice to start all storage nodes at the same time, if possible.
Now that we have installed the storage node client and configured the /etc/my.cnf file to allow a starting storage node process to find its management node, we can start our cluster.
To join storage nodes to our cluster, the following requirements must be met:
All storage nodes must be ready to join the cluster (this can be overridden, if really required)
A config.ini file must be prepared with the details of the storage nodes in the cluster, and then a management node must be started based on this file
The storage nodes must be able to communicate with (that is, not be firewalled from) the management node; otherwise, the storage nodes will fail to connect (a quick connectivity check is sketched after this list)
The storage nodes must be able to communicate freely with each other; problems here can cause clusters to fail to start or, in some cases, truly bizarre behavior
There must be enough memory on the storage nodes to start the process (with the configuration in this example, that is, the defaults, total RAM usage will be approximately 115 MB per storage node)
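A quick way to confirm the connectivity requirement is to check, from each storage node, that port 1186 on the management node is reachable; telnet is used here purely as an illustration and any port-checking tool will do:
# From each storage node; a "Connected to 10.0.0.5" message means the
# management node is reachable and not blocked by a firewall.
telnet 10.0.0.5 1186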
Note
When you start ndbd, you will notice that two processes are started. One is an angel process, which monitors the other, the main ndbd process. The angel process is generally configured to automatically restart the main process if a problem is detected that causes that process to exit. This can cause confusion if you attempt to send a KILL signal to just the main process, as the angel process can create a replacement process.
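You can see both processes on a storage node with a standard ps command; this is only an illustrative check, not part of the recipe itself:
# Lists the angel process and the main ndbd process it monitors
# (the main process shows the angel's PID as its parent)
ps -C ndbd -o pid,ppid,cmd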
To start our storage nodes, on each node create the data directory that was configured for each storage node in the config.ini file on the management node, and run ndbd with the --initial flag:
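On each storage node that looks something like the following (the DataDir path is the one used in the previous recipe):
# Create the DataDir configured in config.ini, then do an initial start
mkdir -p /var/lib/mysql-cluster
ndbd --initial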
If you fail to create the data directory (/var/lib/mysql-cluster in our example in the previous recipe), you may well get a Cannot become daemon: /var/lib/mysql-cluster/ndb_3.pid: open for write failed: No such file or directory error. If you get this, run the mkdir command again on the relevant storage node.
Once you have started ndbd on each node, you can run the management client, ndb_mgm, from any machine as long as it can talk to port 1186 on the management node with which it will work.
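For example, on the management node itself (or any host that can reach port 1186 on it), the following starts the client, and the SHOW command then displays the cluster's nodes:
# Start the management client and display the cluster configuration
ndb_mgm
ndb_mgm> SHOW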
Note
ndb_mgm, like all MySQL Cluster binaries, reads the [mysql_cluster] section in /etc/my.cnf to find the management node's IP address to connect to. If you are running ndb_mgm on a node that does not have this set in /etc/my.cnf, you should pass a cluster connect string to ndb_mgm (see the final recipe in this chapter, Cluster Concepts).
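As a sketch, a connect string can be passed on the command line like this (using the management node address from this example):
# Point ndb_mgm at the management node explicitly
ndb_mgm --ndb-connectstring=10.0.0.5:1186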
The --initial flag tells ndbd to initialize the DataDir on the local disk, overwriting any existing data (or, in this case, creating it for the first time). You should only use --initial the first time you start a cluster or if you are deliberately discarding the local copy of the data. If you use it at other times, you risk losing all the data held in the cluster if no other node in the same nodegroup stays online long enough to update the starting node.
Note
Be aware that starting ndbd with --initial does not always delete all of the logfiles in the DataDir; you should delete these manually if you need to remove them.
The cluster will go through various stages as it starts. If you run the ALL STATUS command in the management client while the nodes are starting, you will see that they start off as unconnected, then go through the startup phases, and finally are marked as started.
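For instance, you can watch the nodes move through the phases by running ALL STATUS repeatedly, either inside the client or non-interactively with the -e option:
# One-off, non-interactive status check of all storage nodes
ndb_mgm -e "ALL STATUS"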
Because a starting node must often apply a large number of database transactions, either from other nodes or from its local disk, this process can take some time in clusters that already contain data; in the case of an initial start, however, it should be relatively fast. The following output shows the management client when all the storage nodes have started:
At this point, this cluster has one management node and two storage nodes that are connected. You are now able to start the SQL nodes.
If you have any problems, look at the following:
Cluster error log: in the DataDir on the management node, with a filename similar to DataDir/ndb_1_cluster.log (the number is the sequence number). MySQL Cluster has an inbuilt rotation system: when a file reaches 1 MB, a new one with a higher sequence number is created
Storage node error log: in the DataDir on the relevant storage node, with a filename similar to DataDir/ndb_4_out.log (the number is the cluster node ID)
These two logfiles should give you a pretty good idea of what is causing a problem.
If one node remains in phase 1 when others are in phase 2, this likely indicates a network problem; normally, a firewall between storage nodes causes this issue. In such cases, double-check that there are no software or hardware firewalls between the nodes.
For convenience (particularly when writing scripts to manage large clusters), you may want to start the ndbd process on all storage nodes only to the point where they obtain configuration data from the management node and can be controlled by it, without starting them completely. This is achieved with the --nostart or -n flag:
On the storage nodes (and assuming that ndbd is not already running):
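That is, something like the following on each storage node (adding --initial as described in the note below if this is the node's very first start):
# Connect to the management node and wait, but do not actually start
ndbd --nostart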
Note
If required (for example, the first time you start a node), you could add the --initial flag as you would for a normal start of the storage node process.
Then, on the management node you should see that the nodes are not started (this is different from not connected) as shown here:
It is then possible to start all the nodes at the same time from the management client with the command <nodeid> START as follows:
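For example, assuming the two storage nodes were allocated node IDs 3 and 4 (the IDs in your cluster may differ), you would issue:
ndb_mgm> 3 START
ndb_mgm> 4 START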
You can start all storage nodes that are in a not started state with the command ALL START:
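In the management client this is simply:
ndb_mgm> ALL START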
During the startup of storage nodes, you may find the following list of phases useful if nodes fail or hang during a certain startup phase:
Note
MySQL Clusters with a large amount of data in them will be slow to start; it is often useful to look at CPU usage and network traffic to reassure yourself that the cluster is actually still doing something.
Setup and initialization (Phase -1): During this phase, each storage node is initialized: it obtains a node ID from the management node, fetches the configuration data (effectively the contents of the config.ini file), allocates ports for inter-cluster communication, and allocates memory.
Phase 0: If the storage node is started with --initial, the cluster kernel is initialized, and in all cases certain parts of it are prepared for use.
Phase 1: The remainder of the cluster kernel is started and nodes start communicating with each other (using heartbeats).
Phase 2: Nodes check the status of each other and elect a Master node.
Phase 3: Additional parts of the cluster kernel used for communication are initialized.
Phase 4: For an initial start or initial node restart, the redo logfiles are created; the number of these files is equal to the NoOfFragmentLogFiles parameter in the config.ini file (a config.ini sketch showing this parameter follows this list). In the case of a restart, nodes read schemas and apply local checkpoints, until the last restorable global checkpoint has been reached.
Phase 5: A local checkpoint is executed, followed by a global checkpoint and a memory usage check.
Phase 6: Establish node groups.
Phase 7: The arbitrator node is selected and begins to function. At this point, nodes are shown as started in the management client, and SQL nodes may join the cluster.
Phase 8: In the case of a restart, indexes are rebuilt.
Phase 9: Internal node startup variables are reset.
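For reference, the parameter mentioned in Phase 4 lives in the [ndbd default] section of config.ini on the management node; the value shown here is illustrative only, and the default applies if the parameter is not set:
[ndbd default]
# Number of redo logfile sets created during an initial start
NoOfFragmentLogFiles=16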