Professional SQL Server High Availability and Disaster Recovery

Implement tried-and-true high availability and disaster recovery solutions with SQL Server

Product type Paperback
Published in Jan 2019
Publisher Packt
ISBN-13 9781789802597
Length 564 pages
Edition 1st Edition
Author Ahmad Osama
Table of Contents

Preface
1. Getting Started with SQL Server HA and DR
2. Transactional Replication
3. Monitoring Transactional Replication
4. AlwaysOn Availability Groups
5. Managing AlwaysOn Availability Groups
6. Configuring and Managing Log Shipping
Appendix

What is High Availability and Disaster Recovery?


High Availability

High availability refers to providing an agreed level of system or application availability by minimizing the downtime caused by infrastructure or hardware failure.

When hardware fails, the practical remedy is to switch the application to a different computer so that the hardware failure doesn't cause application downtime.
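
An "agreed level" of availability is often stated as a percentage, and it can help to translate that percentage into a downtime budget. The following is a minimal sketch of that arithmetic, not something taken from the book: the function name `allowed_downtime_minutes` and the sample targets are assumptions chosen for illustration, and a non-leap year of 365 days is assumed.

```python
# Hypothetical helper: convert an availability target into a yearly downtime budget.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Illustrative targets only; real targets come from the agreement with the business.
for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, a 99.9% target leaves roughly 525.6 minutes (about 8.8 hours) of downtime per year, which shows how quickly the budget shrinks as the target rises.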

Disaster Recovery

Business continuity and disaster recovery, though often used interchangeably, are different concepts.

Disaster recovery refers to re-establishing application or system connectivity or availability at an alternate site, commonly known as a DR site, after an outage at the primary site. The outage can be caused by a site-wide (data center-wide) infrastructure failure or a natural disaster.

Business continuity is a strategy that ensures that a business is up and running with minimal or zero downtime or service outage. For example, as a part of business continuity, an organization may plan to decouple an application into small, standalone applications and deploy each one in a different region. Let's say that a financial application is deployed in region one and a sales application is deployed in region two. If a disaster hits region one, the financial application will go down, and the company will follow the disaster recovery plan to recover it. However, the sales application in region two will remain up and running.

High availability and disaster recovery are not only required during hardware failures; you also need them in the following scenarios:

  • System upgrades: Critical upgrades to software, hardware, network, or storage often require the system to be rebooted, and configuration changes may even cause application downtime after the upgrade. With an HA setup in place, the upgrade can be applied to one node while another serves the application, so it can be done with minimal or zero downtime.

  • Human errors: As it's rightly said, to err is human. We can't avoid human errors; however, we can have a system in place to recover from human errors. An error in deployment or an application configuration or bad code can cause an application to fail. An example of this is the GitLab outage on January 31, 2017, which was caused by the accidental removal of customer data from the primary database server, resulting in an overall downtime of 18 hours.

    Note

    You can read more about the GitLab outage post-mortem here: https://about.gitlab.com/2017/02/10/postmortem-of-database-outage-of-january-31/.

  • Security breaches: Cyber-attacks are a lot more common these days and can result in downtime while you find and fix the issue. In most cases, moving the application to a secondary database server may help reduce the downtime while you fix the security issue.

Let's look at an example of how high availability and disaster recovery work to provide business continuity in the case of outages.

Consider the following diagram:

Figure 1.1: A simple HA and DR example

The preceding diagram shows a common HA and DR implementation with the following configuration:

  • The primary and secondary servers (SQL Server instances) are in Virginia. This is for high availability (having a standby system available).

  • The primary and secondary servers are in the same data center and are connected over LAN.

  • A DR server (a third SQL Server instance) is in Ohio, which is far away from Virginia. The third SQL Server instance is used as a DR site.

  • The DR site is connected to the primary site over the internet. The connection usually runs over a private network for added security.

  • The primary SQL Server (node 1) is active and is currently serving user transactions.

  • The secondary and DR servers are inactive or passive and are not serving user transactions.

Let's say there is a motherboard failure on node 1 and it crashes. Node 2 then becomes active automatically and starts serving user transactions. This is shown in the following diagram:

Figure 1.2: A simple HA and DR example – Node 1 crashes

This is an example of high availability where the system automatically switches to the secondary node within the same data center or a different data center in the same region (Virginia here).

The system can fail back to the primary node once it's fixed and up and running.

Note

A data center is a facility that's typically owned by a third-party organization, which allows customers to rent or lease infrastructure. A node here refers to a standalone physical computer. A disaster recovery site is a data center in a different geographical region than that of the primary site.

Now, let's say that while the primary server, node 1, was being recovered, there was a region-wide failure that caused the secondary server, node 2, to go down. At this point, the region is down; therefore, the system will fail over to the DR server, node 3, and it'll start serving user transactions, as shown in the following diagram:

Figure 1.3: A simple HA and DR example – Nodes 1 and 2 crash

This is an example of disaster recovery. Once the primary and secondary servers are up and running, the system can fail back to the primary server.
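
The sequence in Figures 1.1 to 1.3 can be modeled as a simple priority rule: serve from the primary if it is healthy, otherwise from the secondary, otherwise from the DR server. The sketch below is only a conceptual model of that rule and does not represent how SQL Server detects failures or performs failover. The `Node` class, the `ROLE_PRIORITY` table, and the `pick_active` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical description of one server in the example topology."""
    name: str
    region: str
    role: str  # "primary", "secondary", or "dr"
    healthy: bool = True


# Assumed preference order: primary first, then the local secondary, then the DR site.
ROLE_PRIORITY = {"primary": 0, "secondary": 1, "dr": 2}


def pick_active(nodes: List[Node]) -> Optional[Node]:
    """Return the healthy node with the best priority, or None if every node is down."""
    healthy = [node for node in nodes if node.healthy]
    return min(healthy, key=lambda node: ROLE_PRIORITY[node.role], default=None)


nodes = [
    Node("node 1", "Virginia", "primary"),
    Node("node 2", "Virginia", "secondary"),
    Node("node 3", "Ohio", "dr"),
]

print("All healthy:", pick_active(nodes).name)

nodes[0].healthy = False  # node 1 crashes (high availability case)
print("Node 1 down:", pick_active(nodes).name)

nodes[1].healthy = False  # region-wide failure takes node 2 down (disaster recovery case)
print("Region down:", pick_active(nodes).name)

nodes[0].healthy = True   # node 1 is repaired, so the system can fail back
print("Node 1 repaired:", pick_active(nodes).name)
```

Because the rule always prefers the primary, marking node 1 healthy again is enough for the model to select it, which mirrors the failback described above.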

Note

Organizations periodically perform DR drills (mock DR) to make sure that the DR solution is working fine and to estimate downtime that may happen in the case of an actual DR scenario.
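
As a rough illustration of how drill results might feed a downtime estimate, the sketch below computes the outage duration for each drill and reports the average and the worst case. The timestamps are invented purely for illustration, and `downtime_minutes` is a hypothetical helper, not part of any SQL Server tooling.

```python
from datetime import datetime
from statistics import mean

# Hypothetical drill log: (primary site taken offline, DR site began serving users).
drills = [
    (datetime(2019, 1, 12, 2, 0), datetime(2019, 1, 12, 2, 25)),
    (datetime(2019, 4, 13, 2, 0), datetime(2019, 4, 13, 2, 18)),
    (datetime(2019, 7, 13, 2, 0), datetime(2019, 7, 13, 2, 31)),
]


def downtime_minutes(outage_start: datetime, service_restored: datetime) -> float:
    """Return the minutes between the start of the outage and service restoration."""
    if service_restored < outage_start:
        raise ValueError("service_restored must not be earlier than outage_start")
    return (service_restored - outage_start).total_seconds() / 60


durations = [downtime_minutes(start, restored) for start, restored in drills]
print(f"Average drill downtime: {mean(durations):.1f} minutes")
print(f"Worst drill downtime: {max(durations):.1f} minutes")
```

The worst case is usually the more cautious figure to plan around, since an actual disaster is unlikely to go more smoothly than a rehearsed drill.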
