MariaDB High Performance

You're reading from MariaDB High Performance: Familiarize yourself with the MariaDB system and build high-performance applications

Product type: Paperback
Published: Sep 2014
ISBN-13: 9781783981601
Length: 298 pages
Edition: 1st Edition
Author: Pierre Mavro
Table of Contents (13 chapters)

Preface
1. Performance Introduction
2. Performance Analysis
3. Performance Optimizations
4. MariaDB Replication
5. WAN Slave Architectures
6. Building a Dual Master Replication
7. MariaDB Multimaster Slaves
8. Galera Cluster – Multimaster Replication
9. Spider – Sharding Your Data
10. Monitoring
11. Backups
Index

Choosing the appropriate hardware

Choosing the correct hardware is not an easy task. MariaDB has the following hardware requirements:

  • Disk performance
  • RAID and acceleration cards
  • RAM
  • CPU

Some types of software do not need many resources, but this is not the case for MariaDB. Of course, it depends on what you want to use your MariaDB instance for. For example, for a small website with low traffic, you do not need a powerful configuration; a 10-year-old PC should be enough. However, for a high-load website, the requests should be analyzed to determine which kind of hardware is needed.

Disks

Storage is one of the most important choices, as several factors must be taken into consideration and the storage is, in most cases, the bottleneck. Of course, everything depends on the write access you need. That's why you're going to see the several solutions that exist to provide fast access for latency-sensitive workloads.

SATA magnetic drives

SATA Hard Disk Drives (HDDs) are the slowest solution commonly found in servers. Generally, there are two rotation speeds, expressed in revolutions per minute (rpm):

  • 5400 rpm: These disks have the slowest performance but the highest density
  • 7200 rpm: These disks are faster than 5400 rpm drives and still have high density

10K rpm SATA HDDs exist but are not designed for production use. A good way to reduce access time is to choose drives with the largest disk cache.

Disks are available on the market in 2.5-inch and 3.5-inch form factors. Servers are now generally shipped with 2.5-inch drives because more of them fit in a chassis than 3.5-inch drives. For instance, it's common to see 1U servers with eight 2.5-inch drive bays. On 3U servers, manufacturers can fit up to 25 disks. With Redundant Array of Independent Disks (RAID) mechanisms, it becomes interesting to get as many drives as possible to speed up the storage.

SAS magnetic drives

SAS magnetic drives are faster than SATA drives and are generally used with a dedicated RAID card to enhance performance. Like the SATA HDDs, there are two rotation speeds:

  • 10K rpm: These disks have the highest SAS density but are slower
  • 15K rpm: These disks have the lowest SAS density and are faster but less robust
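
To put the rotation speeds listed for SATA and SAS drives in perspective, the average rotational latency (the time the platter needs for half a revolution) can be derived directly from the rpm value. The following is only back-of-the-envelope arithmetic: it ignores seek and transfer times, and the function name is illustrative.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: time for half a revolution, in milliseconds."""
    seconds_per_revolution = 60.0 / rpm
    return seconds_per_revolution / 2 * 1000


for rpm in (5400, 7200, 10_000, 15_000):
    print(f"{rpm:>6} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
```

A 15K rpm drive waits about 2 ms on average for the data to pass under the head, against roughly 5.6 ms for a 5400 rpm drive.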

The disk choice is important, but there is another thing to take into account: the form factor. As with SATA drives, 3.5-inch SAS drives exist, but they are hard to find now, so let's stick with 2.5-inch drives.

Hybrid drives

Hybrid drives are becoming more common because their performance approaches that of Solid State Drives (SSDs) while offering the capacity of SATA HDDs. This is a really good alternative to the high cost of SSDs. Hybrid drives bridge the gap between SSDs and SATA drives.

Hybrid drives combine NAND flash drives (SSDs) with HDDs. The NAND flash of the drive is used to store data as cache to quickly deliver often-accessed files. The HDD part of the drive stores all the information, but the access is slower.

The hybrid drives that we can find on the market today have, for example, 1 TB of magnetic storage with 8 GB, 16 GB, or 24 GB of NAND flash.

SSDs

SSDs are the fastest disks on the market! They give the best disk performance that we can find today. However, SSD (NAND flash) drives are expensive, so a disk array built entirely from SSDs is very expensive.

Tip

SSDs are more expensive and more prone to failure than other disk drives. They have a limited lifetime, so you should use them in a RAID configuration.

RAID and acceleration cards

Having an overview of what kinds of disks exist is generally not enough to get maximum fault tolerance and speed performance. That's why additional mechanisms such as RAID and acceleration cards exist. We'll see their pros and cons in the following sections.

RAID cards and levels

RAID cards were mentioned earlier: the disks are plugged into the card, which embeds fast cache memory. Today, we can commonly find cards with 512 MB, 1 GB, or 2 GB of cache. The more cache the card has, the faster the transactions. Depending on the card model, you can generally configure two kinds of cache: read and write caches.

There are two types of read cache:

  • Demand caching: This helps to quickly serve the same information if requested multiple times. In this case, it significantly improves disk I/O performance.
  • Look-ahead caching: If the required data is sequentially stored in blocks, this will store the next requested blocks in the cache to serve them faster when they are asked for.

The best performance solution for MariaDB is demand caching, as database reads are mostly random rather than sequential.
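
The difference between the two read cache types can be illustrated with a toy simulation. This is a minimal sketch and not how any real controller firmware works: the `ReadCache` class, the LRU eviction policy, the cache size of 64 blocks, and the two synthetic access traces are all assumptions made for the example.

```python
import random
from collections import OrderedDict


class ReadCache:
    """Toy LRU block cache; `lookahead` extra blocks are prefetched on a miss."""

    def __init__(self, capacity: int, lookahead: int = 0):
        self.capacity = capacity
        self.lookahead = lookahead
        self.blocks = OrderedDict()  # block number -> None, ordered by recency
        self.hits = 0
        self.reads = 0

    def _store(self, block: int) -> None:
        self.blocks[block] = None
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block

    def read(self, block: int) -> None:
        self.reads += 1
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)
            return
        # Miss: fetch the block, plus the following ones if look-ahead is enabled.
        for b in range(block, block + 1 + self.lookahead):
            self._store(b)


def hit_ratio(accesses, lookahead: int) -> float:
    cache = ReadCache(capacity=64, lookahead=lookahead)
    for block in accesses:
        cache.read(block)
    return cache.hits / cache.reads


rng = random.Random(42)
sequential = list(range(1000))                               # each block read once, in order
scattered = [rng.randrange(50) * 1000 for _ in range(1000)]  # 50 hot blocks, far apart

for name, trace in (("sequential", sequential), ("scattered", scattered)):
    print(name, "demand:", hit_ratio(trace, 0), "look-ahead:", hit_ratio(trace, 7))
```

On the sequential trace, only look-ahead produces hits, because no block is ever requested twice. On the scattered trace, which is closer to database reads, look-ahead cannot do better than demand caching: the prefetched neighbours are never requested and only take up cache space.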

There are two types of write cache:

  • Write-back caching: When a write request is issued, data is quickly written to the cache and the write is acknowledged to the system immediately. When the bus is idle or when the buffer does not have enough space to store new data, the cached data is written to the disk.
  • Write-through caching: This is the same as the write-back caching method, except that the data is transferred from the cache to the disk before the write is acknowledged to the system.

In the case of a system crash, the write-back caching method is of course the most dangerous option, since acknowledged data may still exist only in the cache. To avoid losing data, a Battery Backup Unit (BBU) is present on the cards to preserve the cache content during a power cut. When the system powers up again and the SAS RAID card boots, the preserved cache content is written to disk.
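
The trade-off between the two write cache types, and the role of the BBU, can be sketched with a small model. The `CachedDisk` class and its method names are hypothetical and exist only for this illustration; a real controller is far more involved.

```python
class CachedDisk:
    """Toy model of a RAID controller write cache (names are illustrative)."""

    def __init__(self, write_back: bool, battery_backed: bool = False):
        self.write_back = write_back
        self.battery_backed = battery_backed
        self.cache = {}  # block -> data acknowledged but not yet on disk
        self.disk = {}   # block -> data safely persisted

    def write(self, block: int, data: str) -> str:
        if self.write_back:
            self.cache[block] = data  # acknowledge as soon as the cache holds the data
        else:
            self.disk[block] = data   # write-through: persist before acknowledging
        return "ack"

    def flush(self) -> None:
        """Write cached data to disk (done when the bus is idle or the buffer is full)."""
        self.disk.update(self.cache)
        self.cache.clear()

    def power_cut(self) -> None:
        """Simulate a power loss followed by a reboot."""
        if self.battery_backed:
            self.flush()        # the BBU preserved the cache; it is written out at boot
        else:
            self.cache.clear()  # volatile cache content is lost


for label, device in (
    ("write-through", CachedDisk(write_back=False)),
    ("write-back, no BBU", CachedDisk(write_back=True)),
    ("write-back with BBU", CachedDisk(write_back=True, battery_backed=True)),
):
    device.write(1, "row data")
    device.power_cut()
    print(f"{label}: data survived = {1 in device.disk}")
```

Only the write-back configuration without a battery loses a write that the system already considered complete.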

When using a BBU, it is recommended to disable the learning cycle. During a learning cycle, the battery is discharged and recharged, and the write cache method switches from write-back to write-through, which lowers write performance while the cycle lasts.

Depending on the card manufacturer, some other options can be configured to customize those cache types.

Regarding the RAID levels, multiple solutions exist, and here are the common ones:

RAID level | Description
---------- | -----------
0 | Block Level (BL) striping without parity; this provides fast read and write but no security
1 | BL mirroring without parity; this provides security and fast read but slow write access
5 | BL striping with distributed parity; this provides more security but slow read and write access
6 | This is the same as RAID 5 but with double distributed parity; this is the slowest but it provides high security
10 | This is also called 1+0: mirroring without parity but with BL striping; this is fast and provides security

RAID 0 is not a good solution for production use as there is no redundancy: if a disk crashes, there is no way to recover its data. RAID 1 is only mirroring! Even if we add more than two disks, the same information is replicated on each of them. So, it is not a good fit for MariaDB data, but it is generally fine for OS disks. RAID 5 has been a really good solution for several years because of its good security guarantee. But we lose performance here because of the parity calculation, and we lose capacity because the parity takes up the equivalent of one disk. It's not recommended to create a very big RAID 5 array, because if you lose more than one disk, all your data is lost. RAID 6 tolerates the loss of up to two disks at once! However, the parity calculation is doubled and the performance is not what we expect.
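
The parity mentioned for RAID 5 is an XOR across the data blocks of a stripe, which is why any single lost block can be rebuilt from the others. Here is a minimal sketch of that idea. It is a simplification: a real RAID 5 array rotates the parity blocks across all disks, and the block contents below are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"MARIA", b"DB-HI", b"PERF!"]  # one stripe spread over three data disks
parity = xor_blocks(data)              # parity block stored on a fourth disk

# The disk holding data[1] fails: rebuild its block from the survivors and the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)  # b'DB-HI'
```

If two blocks of the same stripe are lost, a single XOR parity is no longer enough to recover them, which is the limit RAID 6 addresses with a second parity.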

RAID 10 is a better solution! RAID 10 stripes mirrors; it's as simple as that! We have security, as we can lose more than one disk as long as the failed disks do not belong to the same mirror, and we have speed thanks to striping. The major problem of this solution is the cost, as you can only use half of the total capacity of your disks. For example, with 12 disks in a server, you can consider that six disks are mirrored against the other six. The six mirrored pairs are then striped together, or they can be divided once again to get smaller stripes (of three).
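
The capacity cost of each level can be worked out with simple arithmetic. The helper below is a sketch under stated assumptions: identical disks, RAID 10 built from two-way mirrors, and a hypothetical disk size of 1 TB. The second value it returns is the number of disk failures that is always survived, whichever disks fail.

```python
def raid_summary(level: int, disks: int, disk_tb: float):
    """Return (usable capacity in TB, disk failures always survived)."""
    if level == 0:
        return disks * disk_tb, 0
    if level == 1:
        return disk_tb, disks - 1        # every disk holds the same data
    if level == 5:
        return (disks - 1) * disk_tb, 1  # one disk's worth of parity
    if level == 6:
        return (disks - 2) * disk_tb, 2  # two disks' worth of parity
    if level == 10:
        return disks / 2 * disk_tb, 1    # more can be survived if they hit different mirrors
    raise ValueError(f"unsupported RAID level: {level}")


for level in (0, 5, 6, 10):
    usable, failures = raid_summary(level, disks=12, disk_tb=1.0)
    print(f"RAID {level:>2}: {usable:4.1f} TB usable, survives {failures} failure(s)")
```

With the 12 disks from the example, RAID 10 leaves 6 TB usable, against 11 TB for RAID 5 and 10 TB for RAID 6.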

Fusion-io direct acceleration cards

Fusion-io direct acceleration cards are PCI Express (PCIe) cards that are faster than classic SSD solutions, with better and more consistent I/O throughput, giving up to 85 percent more transactions. How? Simply because they require fewer hardware components to access data and use high-speed hardware to do it.

When you use SSD/HDD SAS drives, I/O requests from the CPU need to pass through the RAID card and are then transferred to the disks. This is the bottleneck! When the SAS RAID card is under heavy load, the performance degrades gradually because of the connectivity to the disks.

To avoid this, Fusion-io direct access cards embed NAND flash directly on the PCIe card, which gives the drive a big cache system (up to 5 TB per card). The high bandwidth of the PCIe bus lets the CPU reach the data quickly and greatly reduces latency.

The Fusion-io company provides other cheaper solutions to speed up server performance, but the fastest solution remains the Fusion-io direct access card anyway. Moreover, MariaDB is a partner with Fusion-io and has created special parameters to double the I/O capacity on those cards (available since MariaDB 5.5.31).

Disk arrays

Disk arrays have long been the solution for getting maximum performance, and the only way to host a huge amount of data. However, requests sent from the server(s) to the disk array (DAS, NAS, and SAN) take too much time to process, as they pass through several kinds of components, including networks in the worst case.

Even if it's a good solution in several cases, it's unfortunately not the fastest one. The recommendation for high performance is to store data locally. Multiple solutions exist for replication and high availability, so you don't have to worry about it.

RAM

In MariaDB, RAM availability is very important. The more RAM you have, the more data from your database can be kept in memory. For instance, with the InnoDB/XtraDB engine, to get maximum performance, it's recommended to have at least as much free RAM as the size of the database. RAM is also used to store table caches and so on.
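
As a rough check of that recommendation, the total size of the data and indexes can be compared with the free RAM. The sketch below uses hypothetical table names and sizes; on a real server, such figures could be taken from the DATA_LENGTH and INDEX_LENGTH columns of information_schema.TABLES.

```python
GIB = 1024 ** 3


def fits_in_ram(table_sizes, free_ram_bytes):
    """Return (fits, total_bytes) for a mapping of table -> (data_bytes, index_bytes)."""
    total = sum(data + index for data, index in table_sizes.values())
    return total <= free_ram_bytes, total


# Hypothetical sizes, for illustration only.
tables = {
    "orders": (18 * GIB, 6 * GIB),
    "customers": (3 * GIB, 1 * GIB),
}

fits, total = fits_in_ram(tables, free_ram_bytes=32 * GIB)
print(f"dataset: {total / GIB:.0f} GiB, fits in RAM: {fits}")
```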

Of course, if you have terabytes of database data, it will be hard to get that much RAM. However, solutions exist to avoid those problems.

Another important thing is to look at your server architecture. You should pay attention to the motherboard's memory bus frequency and keep it as high as possible. In most cases, if you fill all the RAM slots that the motherboard offers, the bus frequency will decrease and the result will be higher latency between the CPU and RAM. If you want to get the maximum RAM capacity of your server without losing any performance, look at the server manufacturer's documentation to fill the correct number of RAM slots with the highest RAM size per slot.

The last important point is not specific to MariaDB: Error-Correcting Code memory (ECC memory). It's a type of RAM that can detect and correct the most common kinds of internal data corruption. ECC may lower memory performance by around two to three percent. This is not a big performance loss, and in exchange your data is better protected from corruption.
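
To give an idea of how a memory module can detect and correct a flipped bit, here is a toy Hamming(7,4) code, which protects 4 data bits with 3 parity bits. This is an illustration of the principle only: ECC modules use wider codes implemented in hardware, and the function names are made up for the example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4  # 0 means no error was detected
    if error_position:
        c[error_position - 1] ^= 1         # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1  # simulate a single bit flip in memory
print(hamming74_decode(stored) == word)  # True
```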

CPU

Depending on the CPU model and manufacturer, having a lot of cores is of course useful for multi-threaded operations. A high processor clock speed allows faster calculation.

The L1, L2, and L3 processor cache sizes are very important as well. The more data that can be kept in the processor caches, the fewer round trips to RAM are needed and the faster the transactions will be.

To get maximum dedicated performance, you can use the Linux cgroup feature to bind CPUs/cores to a MariaDB instance. This is also called CPU pinning.
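
Setting up a cgroup is done outside of MariaDB and is not shown here. A related and simpler mechanism is CPU affinity, which Python exposes on Linux through os.sched_setaffinity. The sketch below uses that call, not cgroups, and the `pin_process` helper is a hypothetical name. Note that the call applies to the thread whose ID is given (threads created afterwards inherit the setting), whereas a cgroup applies to every thread placed in it.

```python
import os


def pin_process(pid: int, cpus: set) -> set:
    """Restrict `pid` (0 = the calling process) to `cpus` and return the new affinity."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity calls are not available on this platform")
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


if hasattr(os, "sched_getaffinity"):
    print("CPUs currently allowed:", sorted(os.sched_getaffinity(0)))
```

For example, `pin_process(mariadb_pid, {0, 1, 2, 3})` would restrict a process to the first four cores, where `mariadb_pid` stands for the process ID of the MariaDB server and is an assumption of this example.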
