Time for action – monitoring Redo Apply
We'll install the Data Guard configuration starting with Chapter 2, Configuring Oracle Data Guard Physical Standby Database, so you will not yet be able to perform the actions in this chapter on the test environment. Just read through the steps to consolidate the theoretical information given earlier.
We'll query the v$managed_standby view on the standby database for monitoring. The Data Guard configuration is in the Maximum Performance mode with the ASYNC and LGWR attributes. We'll change the redo transport and apply characteristics and monitor the behavior of Data Guard.
For our first test, a one-hour apply delay is defined. Let's check this by running the following query on the primary database:
SQL> select name, value from v$parameter where name like 'log_archive_dest_2';

NAME                 VALUE
-------------------- ----------------------------------------
log_archive_dest_2   SERVICE=TEST_STANDBY LGWR ASYNC
                     VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
                     DB_UNIQUE_NAME=TEST DELAY=60
We can see that a 60-minute delay is defined on the primary database. This doesn't mean that the redo data will be sent with a 60-minute delay. The redo data is still sent immediately, but the standby database waits 60 minutes after receiving an archived log before applying it.
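If you ever need the standby to catch up without waiting for the delay to expire, Redo Apply can also be started with the NODELAY clause, which overrides the DELAY attribute coming from the primary. A minimal sketch, run on the standby database (shown here only as an aside to this test):

-- Stop Redo Apply and restart it ignoring the configured DELAY
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY DISCONNECT FROM SESSION;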
Tip
Downloading the example code
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
So let's see what's happening on the standby side by running the following query on the standby database. (Note: we can connect to the standby database on the standby database server with the sqlplus / as sysdba command. This connects us to the database as the SYS user using operating system authentication.)

SQL> select process, status, thread#, sequence#, block#, blocks from v$managed_standby;

PROCESS   STATUS       THREAD#    SEQUENCE#  BLOCK#     BLOCKS
--------- ------------ ---------- ---------- ---------- ----------
ARCH      CONNECTED    0          0          0          0
ARCH      CONNECTED    0          0          0          0
MRP0      WAIT_FOR_LOG 1          461        0          0
RFS       IDLE         0          0          0          0
RFS       IDLE         1          469        1727085    40
The output shows that the log with sequence number 469 is being received from the primary, but the MRP process is still waiting for the log with sequence number 461. Let's check if this log has been received:

SQL> select name, archived from v$archived_log where sequence#=461;

NAME                                                         ARC
------------------------------------------------------------ ---
+FRA/test/archivelog/2012_08_08/thread_1_seq_461.2606.7908   YES
So the log sequence 461 was received, but MRP is not applying it because of the configured 60-minute delay on the primary database. We can see this situation more clearly in the alert log:

RFS[1]: Archived Log: '+FRA/test/archivelog/2012_08_08/thread_1_seq_461.2606.790810199'
Wed Aug 8 22:31:28 2012
RFS[1]: Archive log thread 1 sequence 461 available in 60 minute(s)
Wed Aug 8 23:14:48 2012
Media Recovery Log +FRA/test/archivelog/2012_08_08/thread_1_seq_460.2841.790809291
Media Recovery Delayed for 60 minute(s)
The RFS line in the previous output shows that the log sequence 461 was received at 22:31 but will be available for recovery only after 60 minutes.

Now let's cancel the delay on the media recovery and monitor again. On the primary database, perform the following:
SQL> alter system set log_archive_dest_2='SERVICE=TEST_STANDBY LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=TEST';

System altered.
After a few minutes, perform the following on the standby database:
SQL> select process, status, thread#, sequence#, block#, blocks from v$managed_standby;

PROCESS   STATUS       THREAD#    SEQUENCE#  BLOCK#     BLOCKS
--------- ------------ ---------- ---------- ---------- ------
ARCH      CONNECTED    0          0          0          0
ARCH      CLOSING      1          470        3432448    403
MRP0      WAIT_FOR_LOG 1          471        0          0
RFS       IDLE         0          0          0          0
RFS       IDLE         1          471        878728     2
We can see that the MRP is not waiting for any old sequence; it's waiting for the log sequence that is on the way from the primary to the standby. (Because the LGWR attribute is used for redo transport, this log is the current log sequence on the primary.) Let's look at the alert log again:
Thu Aug 09 00:27:16 2012
Media Recovery Log +FRA/test/archivelog/2012_08_09/thread_1_seq_470.515.790820745
Thu Aug 09 00:27:57 2012
Media Recovery Waiting for thread 1 sequence 471 (in transit)
As you can see, there's no text in the alert log about the delay, because it was cancelled. The MRP process applied the log sequence 470 and started to wait for the next log (471) to completely arrive and be archived. It also indicates that the next log is in transit, which means it is currently being received by RFS.

Let's convert the Redo Apply mode to real-time apply and see how Data Guard applies the redo as it is received from the primary database. First we'll stop Redo Apply on the standby database and start it again in the real-time apply mode:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Database altered.

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;

Database altered.
After a few minutes we will check the status of the processes:
SQL> select process, status, thread#, sequence#, block#, blocks from v$managed_standby;

PROCESS   STATUS       THREAD#    SEQUENCE#  BLOCK#     BLOCKS
--------- ------------ ---------- ---------- ---------- -------
ARCH      CONNECTED    0          0          0          0
ARCH      CLOSING      1          472        3432448    403
MRP0      APPLYING_LOG 1          473        1985328    4096000
RFS       IDLE         0          0          0          0
RFS       IDLE         1          473        1985957    11
Now it's obvious that MRP is applying the log as it arrives at the standby. The RFS process is transferring the log sequence 473, which is the current log on the primary side, and at the same time the MRP process is applying the same log sequence. Look at the block number column; we can see that MRP is applying the redo blocks that have just arrived.
Tip
You should also know that even if there is a DELAY value specified on the primary database, it will be ignored when the apply mode on the standby database is real-time apply. You'll see the following lines in the standby alert log in such a case:

Managed standby recovery started with USING CURRENT LOGFILE
Ignoring previously specified DELAY 60 minutes
What just happened?
You have just seen the Redo Apply behavior in different Data Guard configurations: delayed, non-delayed, and real-time apply. You also learned how to query the status of the important Data Guard processes MRP and RFS on the standby database.
Pop quiz – real-time apply consideration
Q1. What's the risk of using real-time apply and how can we overcome this risk?
SQL Apply (logical standby databases)
The SQL Apply technology relies on mining the standby redo logs, building SQL transactions that apply the changes in question, and finally executing this SQL on the standby database, which is read/write accessible. This process is naturally more expensive in terms of hardware resource usage. The LSP process manages the application of changes to a logical standby database.
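As an aside, SQL Apply on a logical standby is started and monitored with statements like the following; this is only a sketch using the standard views and processes, not part of our physical standby test environment:

-- On the logical standby: start SQL Apply in real-time mode
SQL> ALTER DATABASE START LOGICAL STANDBY APPLY IMMEDIATE;

-- List the SQL Apply processes (READER, BUILDER, ANALYZER, APPLIER, and so on)
SQL> select type, status from v$logstdby_process;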
The general purpose of building a logical standby database is to meet reporting needs that require read/write access. SQL Apply is not as suitable for disaster recovery and high availability as Redo Apply, because of unsupported data types and the logically different database infrastructure.
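Before deciding on a logical standby, you can check on the primary database which tables hold data types that SQL Apply cannot maintain. A simple query for this, using the standard DBA_LOGSTDBY_UNSUPPORTED view:

-- Tables with columns that SQL Apply cannot replicate
SQL> select distinct owner, table_name from dba_logstdby_unsupported order by owner, table_name;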
SQL Apply offers the following benefits to its users:
The logical standby database is always read/write accessible while SQL Apply is running, so users may run reports and create temporary tables and indexes for performance purposes. It's also possible to create objects and keep data on the standby database that do not exist on the primary.
The logical standby database is open for read/write activity, but normally no writes are possible on the standby objects that also exist on the primary. This feature maintains the consistency of the replicated primary data (see the database guard sketch after this list).
It's possible to upgrade the Oracle database software version with almost no downtime using a logical standby database.
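The read/write behavior described in the previous list is controlled by the database guard. A brief sketch of the related statements on the logical standby; note that the guard level in your environment may differ:

-- GUARD STANDBY protects only the objects maintained by SQL Apply,
-- so users can still create and modify their own reporting objects
SQL> ALTER DATABASE GUARD STANDBY;

-- A privileged user can temporarily bypass the guard for its own session
SQL> ALTER SESSION DISABLE GUARD;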
Role transitions
Role transitions basically enable users to change the roles of the databases in a Data Guard configuration. There are two role transition options in Data Guard, which are switchover and failover.
Switchover
In a basic Data Guard configuration with one primary and one standby database, a switchover operation changes the roles of these databases, and thus the direction of redo shipping. In a correctly designed configuration, redo shipping in the opposite direction starts immediately after the switchover and clients do not need to change their connection descriptions in order to connect to the new primary database.
If there is more than one standby database in a Data Guard configuration, it's possible to perform switchover between the primary and any of the standby databases. After the switchover, the new primary database can continue to send redo to all of the standby databases in the configuration.
Regardless of the configuration of Data Guard, a switchover operation always guarantees zero data loss. This makes switchover highly reliable and thus it's widely used for planned maintenance operations such as hardware or operating system upgrades, database software rolling upgrades, and other infrastructure maintenance. Switchover significantly reduces the downtime of these maintenance operations.
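For reference, a switchover to a physical standby can be performed with SQL*Plus statements similar to the following. This is only a sketch of the main steps, assuming a healthy, synchronized configuration; the broker reduces the whole procedure to a single command:

-- On the primary: convert it to a standby, then restart it in mount mode
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY WITH SESSION SHUTDOWN;
SQL> SHUTDOWN IMMEDIATE
SQL> STARTUP MOUNT

-- On the old standby: convert it to the new primary and open it
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
SQL> ALTER DATABASE OPEN;

-- On the new standby (the old primary): start Redo Apply again
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;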
Failover
Failover is the operation of converting a standby database into a primary database because of a failure of the original primary database. If Flashback Database is disabled on the primary database, failover is an operation of no return. In other words, we have to flash back the failed primary database to a state before the failover in order to re-establish the configuration; without flashback, the Data Guard configuration needs to be built from scratch.
A manual database failover may be performed in the case of a failure, at the discretion of the database owner. However, this requires extra outage for the decision making. If fast-start failover, a 10g Release 2 feature, is used, the failover operation is performed automatically.
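As an illustration, a manual failover of a physical standby with SQL*Plus looks roughly like the following; this is only a sketch, and the SCN used in the flashback is a hypothetical value (it corresponds to the standby_became_primary_scn recorded on the new primary):

-- On the standby: apply all available redo, then convert it to a primary
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
SQL> ALTER DATABASE OPEN;

-- Later, if Flashback Database was enabled, the failed primary can be reinstated
-- by flashing it back to the point before the failover and converting it to a standby
SQL> FLASHBACK DATABASE TO SCN 1234567;   -- hypothetical SCN
SQL> ALTER DATABASE CONVERT TO PHYSICAL STANDBY;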
Fast-start failover
This property of automating the failover operation can only be used in a Data Guard broker-enabled configuration. The observer process, which runs on a server separate from the primary and standby database servers, continuously monitors the accessibility of the primary database. If both the observer and the standby database cannot reach the primary database for a predefined length of time, a fully automated failover process is started. With 11g Release 2 we call it fully automated, because this process includes changing the role of the standby to primary, starting the database services on the new primary database, disconnecting clients from the failed primary database, and redirecting them to the new primary database.
If the observer establishes a connection with the original primary database again after the failover, it informs the database that a failover was performed and automatically reinstates it as a standby using flashback. In order to configure fast-start failover, we need to specify the fast recovery area and enable flashback on both the primary and standby databases.
Keep in mind that in version 11g, Data Guard must be in the Maximum Availability or Maximum Performance mode in order to use fast-start failover; in 10g Release 2, only the Maximum Availability mode is supported for fast-start failover.
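A brief sketch of enabling fast-start failover with the broker command-line interface; the protection mode, the threshold value, and the assumption that a broker configuration already exists are only examples, and START OBSERVER is issued from a DGMGRL session on the observer host:

DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MAXAVAILABILITY;
DGMGRL> EDIT CONFIGURATION SET PROPERTY FastStartFailoverThreshold = 30;
DGMGRL> ENABLE FAST_START FAILOVER;
DGMGRL> START OBSERVER;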
User interfaces for administering Data Guard
There are three options for a database administrator to manage a Data Guard environment: the SQL*Plus command-line interface, Oracle Enterprise Manager, and the Data Guard broker command-line interface (DGMGRL). As with almost every IT infrastructure management interface, the command-line tools offer great flexibility and detailed options, while the graphical interface is user friendly, simple, and automated.
SQL*Plus
SQL*Plus provides all kinds of administration and monitoring operations for the administrator, but you'll need to access each server in the Data Guard configuration and perform the operations separately. It can also be painful to get easily readable output from SQL*Plus.
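For example, basic role and synchronization information has to be queried on each database separately with statements like these (standard views, shown here only as typical monitoring queries):

-- Which role does this database currently have, and in which mode is it open?
SQL> select database_role, protection_mode, open_mode from v$database;

-- Recent Data Guard related messages on this instance
SQL> select timestamp, message from v$dataguard_status order by timestamp;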
DGMGRL
The Data Guard broker command-line interface (DGMGRL) is the Data Guard broker tool that automates and centralizes Data Guard management. Using DGMGRL, we can run multi-step operations such as switchover and failover with just one command. Also, the status of the Data Guard configuration can be queried with special Data Guard broker commands via DGMGRL, and the output is designed to be easily readable.
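For example, after connecting to the configuration, its status can be checked and a switchover started with commands like the following; the connect string and the database name TEST_STANDBY are placeholders for your own broker configuration:

$ dgmgrl sys/password@TEST
DGMGRL> SHOW CONFIGURATION;
DGMGRL> SHOW DATABASE 'TEST_STANDBY';
DGMGRL> SWITCHOVER TO 'TEST_STANDBY';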
Enterprise Manager
Enterprise Manager offers an integrated graphical user interface for Data Guard broker-enabled configurations. It's possible to graphically monitor the general configuration information, performance, and synchronization status of Data Guard, and also to perform administration tasks such as switchover, failover, and adding or removing standby databases from the configuration.