Oracle Data Guard architecture
The main architecture of Oracle Data Guard 11gR2 includes a primary database, up to 30 standby databases, redo transport services (which automatically ship redo log data from the primary to the standby server), and apply services (which apply the changes in the redo to the standby database). A Data Guard configuration also has its own background processes, which run these services.
In a Data Guard configuration, the switchover and failover concepts are also very important. A switchover exchanges the roles of the primary and standby databases and reverses the direction of redo shipping. Failover is the option we must use to open a standby database to user connections in read/write mode when the primary database is inaccessible.
The last Data Guard components that we'll mention in this chapter are the user interfaces used to monitor and administer a Data Guard configuration. These are SQL*Plus, Oracle Enterprise Manager Cloud Control, and the Data Guard broker command-line interface (DGMGRL).
Data Guard services
These services are the core of a Data Guard configuration. Database administrators should choose the configuration that meets the business needs and tune these services to comply with SLAs.
Redo transport services
On a primary database, when a user commits a transaction, the relevant redo data is written from memory (the Redo Log Buffer) to the online redo logfiles. When an online redo log group fills up, a log switch occurs and the group is archived into an archived redo logfile. Data Guard can be configured to send redo data to standby databases either from the log buffer as transactions are committed (by the LGWR process) or from the online redo logfiles as they are archived (by the ARCn processes). Shipping redo with ARCH risks more data loss if the primary database fails, because the change information in the primary's current online log will be lost.
The following diagram shows the Data Guard configuration with ARCH transportation mode:
Here are the important properties of the log transport with the ARCH attribute:
Logs are sent by the ARCH process; the LNS process is not in use
Standby redo logs are not mandatory on the standby database
Data in the unarchived online redo log will be lost in a failover
If LGWR is used for redo transport, it's possible to guarantee zero data loss failovers by creating a Data Guard configuration in which the primary database waits for confirmation from the standby database that redo has been received before it reports the commit as complete. This configuration is called synchronous redo transport (SYNC). However, it may affect the performance of the primary database.
The following diagram shows the Data Guard configuration with LGWR and SYNC transportation mode:
The following points summarize the diagram:
Redo is read and sent to the standby database directly from the log buffer by the LNS process
An acknowledgment from the standby database (RFS to LNS, then LNS to LGWR) is needed before COMMIT ACK is sent to the database user
It's mandatory to use standby redo logs
Zero data loss in failover can be guaranteed with this configuration
There may be slower response times on the primary database
The primary database may stop serving transactions if the network between the primary and standby is disrupted, depending on the protection mode
Tip
If SYNC redo transport is chosen in an 11g Data Guard configuration, the performance decrease on the primary database will be smaller than in earlier releases. Previously, the primary database finished writing to the online redo log first and then sent redo to the standby database, so there were two consecutive I/O operations that the primary database needed to wait for in order to complete the commit. In 11g these two operations run in parallel: the primary database sends the redo data to the standby at the same time as it writes to the online redo log, without waiting for the local write to finish.
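The effect described in this tip can be illustrated with a little arithmetic. The following Python sketch is a simplified model only: the function names and the timings are hypothetical values chosen for illustration, not measurements, and the model ignores every cost except the two waits named in the tip.

```python
def commit_wait_serial(local_write_ms, standby_round_trip_ms):
    """Earlier releases, as described in the tip: write locally, then ship."""
    return local_write_ms + standby_round_trip_ms


def commit_wait_parallel(local_write_ms, standby_round_trip_ms):
    """11g, as described in the tip: the two operations overlap."""
    return max(local_write_ms, standby_round_trip_ms)


# Hypothetical timings (assumptions for illustration only).
local_write_ms = 2
standby_round_trip_ms = 5

serial = commit_wait_serial(local_write_ms, standby_round_trip_ms)
parallel = commit_wait_parallel(local_write_ms, standby_round_trip_ms)
print(f"serial: {serial} ms, parallel: {parallel} ms")
```

In this model the commit wait is bounded by the slower of the two operations rather than by their sum, which is why the gain is largest when the local write and the standby round trip take a similar amount of time.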
The other option is asynchronous redo transport (ASYNC), which avoids the impact on primary database performance. With this method, the primary database never waits for an acknowledgment from the standby database in order to complete a commit. ASYNC redo transport gives better performance, but it cannot guarantee zero data loss failovers, because at any given moment some committed transactions may not yet have been received by the standby database.
The following points summarize the ASYNC configuration:
No acknowledgment needed from standby to send the COMMIT ACK to the database user
Redo is read and sent to the standby from the Redo Log Buffer or the online redo logs by the LNS process. If LNS cannot read the redo from the Redo Log Buffer before it is recycled, it automatically reads and sends the redo data from the online redo log.
Committed transactions that haven't been shipped to the standby yet may be lost in a failover
The potentially slower response time on the primary database seen with SYNC mode does not apply here
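As a rough illustration of how the three transport methods in this section differ in configuration, the following Python sketch assembles a value for a LOG_ARCHIVE_DEST_n initialization parameter. It is a sketch under assumptions: the service name `standby_tns`, the unique name `stby`, and the helper name `build_destination` are hypothetical, and the attribute keywords follow the ARCH/LGWR style used in this section, so they should be checked against the documentation for your release before use.

```python
# Transport methods named in this section, mapped to destination attributes.
# The keyword combinations are an assumption to verify for your release.
TRANSPORT_ATTRIBUTES = {
    "ARCH": "ARCH",
    "SYNC": "LGWR SYNC AFFIRM",
    "ASYNC": "LGWR ASYNC NOAFFIRM",
}


def build_destination(service, db_unique_name, transport):
    """Return a LOG_ARCHIVE_DEST_n value for the given transport method."""
    if transport not in TRANSPORT_ATTRIBUTES:
        raise ValueError(f"unknown transport method: {transport}")
    return (
        f"SERVICE={service} {TRANSPORT_ATTRIBUTES[transport]} "
        f"VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) "
        f"DB_UNIQUE_NAME={db_unique_name}"
    )


# Hypothetical names, used only to show the shape of the statement.
for method in ("ARCH", "SYNC", "ASYNC"):
    value = build_destination("standby_tns", "stby", method)
    print(f"ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='{value}' SCOPE=BOTH;")
```

Generating the string in one place keeps the parts that never change (service, role validity, unique name) separate from the one attribute group that selects the transport method.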
Protection modes
Data Guard offers three data protection modes, which serve different business needs in terms of data protection and performance. You can find the properties of these modes in the following comparison table:
Mode | Redo transport | Action with no standby database connection | Risk of data loss
---|---|---|---
Maximum Protection | SYNC and LGWR | The primary database needs to write redo to at least one standby database. Otherwise it will shut down. | Zero data loss is guaranteed.
Maximum Availability | SYNC and LGWR | Normally works with SYNC redo transport. If the primary database cannot write redo to any of its standby databases, it continues processing transactions as in ASYNC mode. | Zero data loss in normal operation, but not guaranteed.
Maximum Performance | ASYNC and LGWR/ARCH | Never expects acknowledgment from the standby database. | Potential for minimal data loss in a normal operation.
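The behavior summarized in the comparison table can be restated as a small decision function. This Python sketch only models the table. The enum, the function name, and the returned strings are hypothetical and do not correspond to any Oracle API.

```python
from enum import Enum


class ProtectionMode(Enum):
    MAXIMUM_PROTECTION = "Maximum Protection"
    MAXIMUM_AVAILABILITY = "Maximum Availability"
    MAXIMUM_PERFORMANCE = "Maximum Performance"


def primary_action(mode, standby_reachable):
    """Describe how the primary handles a commit, following the table."""
    if mode is ProtectionMode.MAXIMUM_PERFORMANCE:
        # ASYNC transport: the standby is never waited for.
        return "commit without waiting for the standby"
    if standby_reachable:
        # SYNC transport in both remaining modes.
        return "commit after standby acknowledgment"
    if mode is ProtectionMode.MAXIMUM_PROTECTION:
        return "shut down"
    # Maximum Availability with no reachable standby.
    return "continue as in ASYNC mode"


for mode in ProtectionMode:
    for reachable in (True, False):
        print(f"{mode.value}, standby reachable={reachable}: "
              f"{primary_action(mode, reachable)}")
```

Only the combination of Maximum Protection and an unreachable standby stops the primary, which is the price of the zero data loss guarantee.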
Apply services
Data Guard automatically transfers redo data from the primary to the standby database and applies it there. Redo transport services work independently of apply services and never wait for Redo Apply. However, if there's a problem with redo transport, apply services normally stop and wait for new redo to arrive. Apply services come in two forms, Redo Apply and SQL Apply, which are the foundations of physical and logical standby databases respectively.
With both Redo Apply and SQL Apply, the standby database validates the redo data in order to prevent physical corruptions that occur on the primary database from propagating to the standby database. By default, the standby database writes received redo data into the standby redo logfiles, and apply services do not apply the redo until the standby redo log has been archived as an archived redo log. If we use the real-time apply feature, which became available with 10g, apply services don't wait for the archival operation and apply the redo data as it's received and written into the standby redo logs.
It's also possible to specify a delay, in minutes, to keep the standby database behind the primary database. This can prevent erroneous operations on the primary database from being applied to the standby immediately. However, as we discussed previously, now that Flashback Database is supported, a delay is usually unnecessary in a Data Guard configuration.
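To make the apply options concrete, the following Python sketch builds the statement that starts Redo Apply with or without real-time apply, plus the destination attribute that requests an apply delay. The helper names are hypothetical, and the statement text and the `DELAY` attribute should be treated as assumptions to confirm against the SQL reference for your release. In particular, check how a delay interacts with real-time apply there.

```python
def redo_apply_statement(real_time_apply):
    """Return a statement that starts Redo Apply in the background."""
    clauses = ["ALTER DATABASE RECOVER MANAGED STANDBY DATABASE"]
    if real_time_apply:
        # Apply redo from the standby redo logs as it is received,
        # instead of waiting for the log to be archived.
        clauses.append("USING CURRENT LOGFILE")
    clauses.append("DISCONNECT FROM SESSION")
    return " ".join(clauses) + ";"


def delay_attribute(minutes):
    """Return a LOG_ARCHIVE_DEST_n attribute requesting an apply delay."""
    if minutes < 0:
        raise ValueError("delay must not be negative")
    return f"DELAY={minutes}"


print(redo_apply_statement(real_time_apply=False))
print(redo_apply_statement(real_time_apply=True))
print(delay_attribute(30))  # hypothetical 30-minute delay
```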
Redo Apply (physical standby databases)
Redo Apply keeps a block-by-block copy of the primary database. By default, Redo Apply runs a number of parallel apply processes equal to the number of CPUs on the standby database server minus one. These parallel recovery processes are coordinated by the MRP process, which is the background process responsible for applying redo data.
Redo Apply has the following benefits for its users:
There are no unsupported data types, objects, and DDLs
Redo Apply has higher performance when compared with SQL Apply or any other replication solutions
It offers simple management by keeping the database structure exactly the same as the primary database with its fully automated architecture
It's possible to take advantage of Active Data Guard and snapshot standby for reporting and testing
Backups taken from physical standby databases are ready to be restored to the primary, so we can offload backups from the primary
Redo Apply offers a strong corruption detection and prevention mechanism
It's possible to use physical standby databases for the rolling upgrades of the database software, which is known as transient logical standby
The real-time apply feature applies the redo as it's received. This feature makes it possible to query real-time or near real-time data from the standby database
By offering these features, Redo Apply (physical standby database) has become a very popular and widely used technology for the high availability and disaster recovery of Oracle databases.
Monitoring Redo Apply
While Redo Apply runs on the standby database, administrators need to monitor the status of the apply process and check that it's working in accordance with the selected configuration. As mentioned, the MRP process is responsible for Redo Apply, and monitoring the status of this process gives us valuable information about what's going on with Redo Apply.
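One way to check on the MRP process is to query the V$MANAGED_STANDBY view on the standby database and pick out the MRP row. The Python sketch below shows only the filtering step. The helper name and the sample rows are hypothetical stand-ins for what a database driver would return, and no database connection is made.

```python
# Query to run on the standby database (shown for reference only).
APPLY_QUERY = "SELECT PROCESS, STATUS, SEQUENCE# FROM V$MANAGED_STANDBY"


def mrp_status(rows):
    """Return (status, sequence) for the first MRP row, or None."""
    for process, status, sequence in rows:
        if process.startswith("MRP"):
            return status, sequence
    # No MRP row usually means Redo Apply is not running.
    return None


# Hypothetical rows in the shape returned by APPLY_QUERY.
sample_rows = [
    ("ARCH", "CONNECTED", 0),
    ("RFS", "IDLE", 152),
    ("MRP0", "APPLYING_LOG", 152),
]

print(mrp_status(sample_rows))
print(mrp_status([("ARCH", "CONNECTED", 0)]))
```

A status such as `APPLYING_LOG` or `WAIT_FOR_LOG` indicates that Redo Apply is running, while a missing MRP row suggests that it has not been started or has stopped.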