Dataguard Cheat Sheet
Log Files
There are many options; see the broker section for more information.
Troubleshooting (Monitoring commands and log files)
configuration
  DGMGRL> show configuration;
  DGMGRL> show database prod1;
  DGMGRL> show database prod1dr;

database
  # There are a number of specific information commands; here are the most used
  DGMGRL> show database prod1 statusreport;
  DGMGRL> show database prod1 inconsistentProperties;
  DGMGRL> show database prod1 inconsistentlogxptProps;
  DGMGRL> show database prod1 logxptstatus;
  DGMGRL> show database prod1 latestlog;
  # change the database name to reflect the one you have chosen
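If a broker command reports an error, the broker's own log can also help; as a pointer (assuming the standard ADR layout), it is the drc<SID>.log file in the same trace directory as the alert log:
  # find the ADR base; the broker log is drc<SID>.log under .../diag/rdbms/<db>/<SID>/trace
  sql> show parameter diagnostic_dest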
There are a number of commands that you can use to change the state of the database

turn off/on the redo transport service for all standby databases (Primary)
  DGMGRL> edit database prod1 set state=transport-off;
  DGMGRL> edit database prod1 set state=transport-on;

turn off/on the apply state (Standby)
  DGMGRL> edit database prod1dr set state=apply-off;
  DGMGRL> edit database prod1dr set state=apply-on;

put a database into real-time query mode (Standby)
  DGMGRL> edit database prod1dr set state=apply-off;
  sql> alter database open read only;
  DGMGRL> edit database prod1dr set state=apply-on;

change the protection mode (Primary)
  # Choose what level of protection you require
  sql> alter database set standby database to maximize performance;
  sql> alter database set standby database to maximize availability;
  sql> alter database set standby database to maximize protection;
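To confirm which mode is actually in effect, a quick check against v$database:
  sql> select protection_mode, protection_level from v$database;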
Redo Processing
LGWR - the log writer process flushes redo from the SGA to the ORL (online redo log) files
LNS  - the LogWriter Network Service reads the redo being flushed from the redo buffers by the LGWR and performs a network send of that redo to the standby
ARCH - archives the ORL files to archive logs, which are also used to fulfill gap resolution requests; one ARCH process is dedicated to local redo log activity only and never communicates with a standby database
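A rough way to see how redo is being shipped to a given destination from the primary (dest_id 2 is assumed to be the standby destination, as elsewhere in this sheet):
  sql> select dest_id, status, archiver, transmit_mode, error from v$archive_dest where dest_id = 2;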
Real-Time Apply

Enable real-time apply
  sql> alter database recover managed standby database using current logfile disconnect;

Determine if real-time apply is enabled
  sql> select recovery_mode from v$archive_dest_status where dest_id = 2;

  RECOVERY_MODE
  --------------------------
  MANAGED REAL-TIME APPLY
Tools and views to monitor redo

  ## primary (example)
  select process, client_process, thread#, sequence#, status from v$managed_standby;
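On a physical standby the apply progress and rate can also be watched; a sketch using v$recovery_progress:
  ## physical standby: media recovery progress (apply rate, active time, etc.)
  sql> select item, units, sofar, total from v$recovery_progress;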
Logical Standby

schemas that are not maintained by SQL Apply
  select owner from dba_logstdby_skip where statement_opt = 'INTERNAL SCHEMA' order by owner;

  Note: the system and sys schemas are not replicated, so don't go creating tables in these
  schemas; the above command should return about 17 schemas (Oracle 11g) that are not replicated.
Check tables with unsupported data types
  select distinct owner, table_name from dba_logstdby_unsupported;
  select owner, table_name from logstdby_unsupported_tables;
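To see which columns make a particular table unsupported, the same view can be queried at column level (HR.EMPLOYEE is just an example name):
  select column_name, data_type from dba_logstdby_unsupported where owner = 'HR' and table_name = 'EMPLOYEE';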
skip replication of tables
  ## Syntax
  dbms_logstdby.skip (
    stmt        in varchar2,
    schema_name in varchar2 default null,
    object_name in varchar2 default null,
    proc_name   in varchar2 default null,
    use_like    in boolean  default true,
    esc         in char1    default null
  );

  ## Examples
  execute dbms_logstdby.skip(stmt => 'DML', schema_name => 'HR', object_name => 'EMPLOYEE');
  execute dbms_logstdby.skip(stmt => 'SCHEMA_DDL', schema_name => 'HR', object_name => 'EMPLOYEE');
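SQL Apply must be stopped while skip rules are changed; a minimal sketch of the usual wrap-around, plus a check of the rules in place:
  sql> alter database stop logical standby apply;
  execute dbms_logstdby.skip(stmt => 'DML', schema_name => 'HR', object_name => 'EMPLOYEE');
  sql> alter database start logical standby apply immediate;

  ## review the skip rules currently configured
  select statement_opt, owner, name from dba_logstdby_skip;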
displaying the barrier
  SC      STATUS
  ------  -----------------------------------------------------------------------------
  44604   BARRIER SYNCHRONIZATION ON DDL WITH XID 1.15.256 (WAITING ON 17 TRANSACTIONS)
Tuning SQL Apply

MAX_SERVERS
  # Set MAX_SERVERS to 8 x the number of cores
  execute dbms_logstdby.apply_set ('MAX_SERVERS', 64);

MAX_SGA
  # Set MAX_SGA to 200MB
  execute dbms_logstdby.apply_set ('MAX_SGA', 200);

_HASH_TABLE_SIZE
  # Set the hash table size to 10 million
  execute dbms_logstdby.apply_set ('_HASH_TABLE_SIZE', 10000000);

DDL
  defer DDLs to off-peak hours

Preserve commit order
  # Set PRESERVE_COMMIT_ORDER to false
  execute dbms_logstdby.apply_set ('PRESERVE_COMMIT_ORDER', FALSE);
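To review what is currently set, or to put a parameter back to its default (a sketch using apply_unset):
  select name, value from dba_logstdby_parameters order by name;
  execute dbms_logstdby.apply_unset('MAX_SERVERS');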
lagging SQL Apply
  # apply lag: indicates how current the replicated data at the logical standby is
  # transport lag: indicates how much redo data that has already been generated is missing
  #                at the logical standby in terms of redo records
  select name, value, unit from v$dataguard_stats;

SQL Apply component bottleneck
  NAME                       VALUE
  -------------------------  -----
  TRANSACTIONS APPLIED       3764
  TRANSACTIONS MINED         4985

  The mined transactions should be roughly twice the applied transactions; if this ratio
  decreases or stays at a low value, start looking at the mining engine.
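The two counters above come out of the SQL Apply statistics; a sketch of the query (UPPER used because stat-name case can vary between versions):
  select name, value from v$logstdby_stats where upper(name) like 'TRANSACTIONS%';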
Make sure all preparers are busy
  select count(1) as idle_preparers from v$logstdby_process where type = 'PREPARER'
  and status_code = 16166;

  IDLE_PREPARERS
  ----------------------------
  0
Make sure the peak size is well below the amount allocated
  select used_memory_size from v$logstdby_session where session_id = (select value
  from v$logstdby_stats where name = 'LOGMINER SESSION ID');

  USED_MEMORY_SIZE
  ----------------------------
  32522244
verify whether the preparers are supplying enough work for the applier processes
  select (available_txn - pinned_txn) as pipeline_depth from v$logstdby_session where
  session_id = (select value from v$logstdby_stats where name = 'LOGMINER SESSION ID');

  PIPELINE_DEPTH
  ----------------------------
  8

  select count(*) as applier_count from v$logstdby_process where type = 'APPLIER';

  APPLIER_COUNT
  ----------------------------
  20

  Here the pipeline depth (8) is lower than the applier count (20), so the preparers are not
  producing enough work to keep the appliers busy.
Setting max_servers and preparers
  execute dbms_logstdby.apply_set('MAX_SERVERS', 36);
  execute dbms_logstdby.apply_set('PREPARE_SERVERS', 3);
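A quick sketch to see how the server pool is currently split between reader, builder, preparer, analyzer and applier processes:
  select type, count(*) as servers from v$logstdby_process group by type order by type;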
display the pageout activity
  ## Run this first, then run it again a little later and compare the values
  select name, value from v$logstdby_stats where name like '%PAGE%' or name like
  '%UPTIME%' or name like '%IDLE%';

  select value from v$logstdby_stats where name = 'LARGE TXNS WAITING TO BE ASSIGNED';

  VALUE
  ---------------------------
  12
Monitoring
archive gap logs
  # Use the thread# when using RAC and detect missing sequences
  select thread#, low_sequence#, high_sequence# from v$archive_gap;
  select max(sequence#), thread# from v$archived_log group by thread#;

  Note: if using a RAC environment make sure you check each instance
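On the standby it can also help to compare the highest sequence received with the highest sequence actually applied; a sketch using v$archived_log's APPLIED column:
  select thread#, max(sequence#) as last_applied from v$archived_log where applied = 'YES' group by thread#;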
check that redo has been applied 2 (physical)
  ## check that the MRP (applying_log) matches the RFS process; if the MRP line is missing then
  ## you need to start the apply process. You may also see a status of wait_for_gap, so wait
  ## until the gap has been resolved first
  sql> select client_process, process, sequence#, status from v$managed_standby;
check that redo has been applied 3 (logical)
  ## if you are using a logical standby then you need to check the following to confirm the redo
  ## has been applied
  sql> select applied_scn, latest_scn, mining_scn from v$logstdby_progress;

  ## if the mining scn is behind you may have a gap; check this using the sketch below
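The original gap-check query is not preserved here; one possible sketch uses dba_logstdby_log to list the registered logs and whether they have been applied:
  select thread#, sequence#, first_change#, next_change#, applied from dba_logstdby_log order by thread#, sequence#;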
switchover (primary) 9
  sql> alter database commit to switchover to physical standby with session shutdown;

  ## now check on the primary, we should be one in front (run on the primary)
  sql> select thread#, sequence#, status from v$log;

  Note: if using a RAC environment make sure you check each instance
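Once the old primary has been restarted and mounted after the switchover, a quick sanity check of its new role against v$database:
  sql> select database_role, switchover_status from v$database;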
check that redo has been applied 2 (physical)
  ## check that the MRP (applying_log) matches the RFS process; if the MRP line is missing then
  ## you need to start the apply process. You may also see a status of wait_for_gap, so wait
  ## until the gap has been resolved first
  sql> select client_process, process, sequence#, status from v$managed_standby;
check that redo has been applied 3 (logical)
  ## if you are using a logical standby then you need to check the following to confirm the redo
  ## has been applied
  sql> select applied_scn, latest_scn, mining_scn from v$logstdby_progress;

  ## if the mining scn is behind you may have a gap; check this as shown earlier
Prepare the logical standby 10
  ## confirm that the prepare has started to happen, you should see "preparing dictionary"
  sql> select switchover_status from v$database;

  ## wait a while until the dictionary is built and sent, and you should see "preparing switchover"
  sql> select switchover_status from v$database;
Check primary database state 11
  ## you should now see it is in the state of "to logical standby"
  sql> select switchover_status from v$database;
the last chance to CANCEL the switchover (no going back after this) 12
  ## On the primary
  sql> alter database prepare to switchover cancel;

  ## On the logical standby
  sql> alter database prepare to switchover cancel;
switchover the primary to a logical standby 13
  sql> alter database commit to switchover to logical standby;
switchover the logical standby to a primary 14
  ## check that it is ready to become the primary, you should see "to primary"
  sql> select switchover_status from v$database;

  ## Complete the switchover
  sql> alter database commit to switchover to primary;
start the apply process 15
  sql> alter database start logical standby apply immediate;
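A quick check that SQL Apply really is running after the restart (v$logstdby_state also shows whether real-time apply is on):
  sql> select state, realtime_apply from v$logstdby_state;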
Check redo applied 1
  select name, value, time_computed from v$dataguard_stats where name like '%lag%';

bring back the old primary (physical standby) 1
  ## You can also use the SCN number; on the new primary the SCN at which the standby became
  ## the primary is recorded in v$database (standby_became_primary_scn)
  sql> select standby_became_primary_scn as failover_scn from v$database;

  FAILOVER_SCN
  -----------------------------------------------
  7658841

  ## Now flashback the old primary to this SCN and start in mount mode
  startup mount;
  flashback database to scn 7658841;
  alter database convert to physical standby;
  shutdown immediate;
  startup mount;

  ## hopefully the old primary will start to resolve any gap issues at the next log switch, which
  ## means we can start the MRP process to get this standby to catch up as fast as possible
  alter database recover managed standby database using current logfile disconnect;

  ## eventually the missing redo will be sent to the standby and applied, bringing us back to
  ## synchronization again
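Both reinstatement procedures rely on Flashback Database having been enabled on the old primary beforehand; a quick check before you start:
  sql> select flashback_on from v$database;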
bring back the old primary (logical standby) 2
  ## again we need to obtain the SCN
  select merge_change# as flashback_scn, processed_change# as recovery_scn from
  dba_logstdby_history where stream_sequence# = (select max(stream_sequence#)-1 from
  dba_logstdby_history);

  FLASHBACK_SCN   RECOVERY_SCN
  ---------------------------------------------------------
  7658941         7659568

  ## Now flashback the old primary to the flashback SCN and start in mount mode
  startup mount;
  flashback database to scn 7658941;
  alter database convert to physical standby;
  shutdown immediate;
  startup mount;
  ## Now we need to hand feed the archive logs from the primary to the standby (old primary) into
  ## the MRP process, so let's get those logs (run on the primary)

  ## Now you will hopefully have a short list of the files you need; register them with the
  ## standby database (old primary)

  ## Now you can recover up to, but not including, the SCN that you specify
  recover managed standby database until change 7659568;

  ## The standby database now becomes a logical standby, as up to this point it has been a
  ## physical one
  alter database activate standby database;
  ## Lastly you need to tell your new logical standby to ask the primary for a new copy of the
  ## dictionary and all the redo in between. SQL Apply will connect to the new primary using the
  ## database link and retrieve the LogMiner dictionary; once the dictionary has been built, SQL
  ## Apply will apply all the redo sent from the new primary and get itself synchronized
  create public database link reinstatelogical connect to system identified by
  password using 'service_name_of_new_primary_database';
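Assuming the link above, the documented way to kick this off is to start SQL Apply pointing at the new primary through that link:
  sql> alter database start logical standby apply new primary reinstatelogical;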
Role Transitions
A database operates in one of the following mutually exclusive roles: primary or standby. Data Guard enables you to
change these roles dynamically by issuing the SQL statements described in this cheat sheet, or by using either of the
Data Guard broker's interfaces. Oracle Data Guard supports the following role transitions:
Switchover
Allows the primary database to switch roles with one of its standby databases. There is no data loss during a
switchover. After a switchover, each database continues to participate in the Data Guard configuration with its new
role.
Failover
Changes a standby database to the primary role in response to a primary database failure. If the primary database
was not operating in either maximum protection mode or maximum availability mode before the failure, some data
loss may occur. If Flashback Database is enabled on the primary database, it can be reinstated as a standby for the
new primary database once the reason for the failure is corrected.
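With the broker in place both transitions reduce to single DGMGRL commands; a sketch using the database names from earlier in this sheet (quote the database name if your version requires it):
  DGMGRL> switchover to prod1dr;
  DGMGRL> failover to prod1dr;
  # once the failed primary is mounted again (and flashback was enabled)
  DGMGRL> reinstate database prod1;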
Startup commands
-----------------------------------------------------------------------
startup nomount
alter database mount standby database;
alter database recover managed standby database disconnect;
alter database recover managed standby database;
alter database recover managed standby database using current logfile;
recover managed standby database delay 60;
Register a missing log file
  alter database register physical logfile '';
  ## If FAL doesn't work and it says the log is already registered
  alter database register or replace physical logfile '';