Recover physical standby database after loss of archive log(2)
上次写过一篇DG丢失归档后的处理过程,总体来说就是使用增量备份覆盖gap数据从而跳过gap的archivelog 这里再阐述另一种情况
[oracle@db61 orcl]$ ora si
SQL*Plus: Release 10.2.0.5.0 – Production on Wed Jun 20 14:35:01 2012
Copyright (c) 1982, 2010, Oracle. All Rights Reserved.
Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 – 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select database_role from v$database;
DATABASE_ROLE
—————-
PHYSICAL STANDBY
SQL> recover standby database;
ORA-00279: change 40103914365 generated at 05/23/2012 09:26:36 needed for thread 3
ORA-00289: suggestion : /data/oracle/oradata/orcl/arch/3_8658_657561562.dbf
ORA-00280: change 40103914365 for thread 3 is in sequence #8658
Specify log: {
ORA-00308: cannot open archived log ‘/data/oracle/oradata/orcl/arch/3_8658_657561562.dbf’
ORA-27037: unable to obtain file status Linux-x86_64 Error: 2: No such file or directory Additional information: 3
这个库的大概情况为丢失了6月4号至今的所有归档,很容易想到使用standby 的current_scn去作为 起始scn增量备份,对于这里的增量备份出现了一个有趣的现象。看上面的操作,可以知道还是需要scn为40103914365的归档文件,but why ?既然已经使用增量备份recover database,这里就不兜圈子了,DG开启redo apply之后 oracle 会寻找file header最低的scn开始apply 我们可以查询下当前的file header scn:
SQL> select file#,to_char(checkpoint_change#) from v$datafile_header;
FILE# TO_CHAR(CHECKPOINT_CHANGE#)
———- —————————————-
1 42501726792
2 42501726792
3 42501726801
4 42501726801
5 42501726801
6 42501726801
7 42501726792
8 40103914365
9 42501726792
10 42501726801
11 42501726801
FILE# TO_CHAR(CHECKPOINT_CHANGE#)
———- —————————————-
12 42501726801
13 42501726792
14 42501726801
15 42501726792
16 42501726792
17 42501726801
18 42501726801
…
看到file 8的scn正是oracle需要的scn 对应上面的操作:change 40103914365 generated at 05/23/2012 09:26:36 needed for thread 3 这里的05/23/2012 09:26:36足以说明问题。查看主库的file 8文件发现 change time 为 05/23/2012,从这里可以说明file 8自从2012-05-23之后从来没有change过,对于这种file – BLOCK change为0 ,也就是说change scn为上一次的05/23/2012 09:26:36之前的change scn,即所有的块都不满足以上条件,所以对于从6月4号开始的增量备份,oracle将忽略这个文件的所有blocks从而导致recover之后file header checkpoint scn没有发生变化,当开启redo apply之后oracle仍然从最小的scn开始尝试恢复,从而导致这个诡异的现象,当然这种极端情况是很少出现的,这里我们可以采用rman copy这个file到standby端从而解决这个问题。
eg:
RMAN> copy datafile ‘+DATA/pri/datafile/udata01_16.dbf’ to ‘/data/xxx.dbf’;
Starting backup at 20-JUN-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile fno=00008 name=+DATA/pri/datafile/udata01_16.dbf
output filename=/data/temp.dbf tag=TAG20120620T145721 recid=172 stamp=786467766
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:18:46
Finished backup at 20-JUN-12
Starting Control File and SPFILE Autobackup at 20-JUN-12
piece handle=/data/oracle/product/10.2/db1/dbs/c-1185342296-20120620-00 comment=NONE
Finished Control File and SPFILE Autobackup at 20-JUN-12
scp ….
[oracle@db61 orcl]$ ora dgstats
NAME VALUE UNIT TIME_COMPUTED
—————————— ——————– —————————— ——————————
apply finish time +00 00:09:07.7 day(2) to second(1) interval 20-JUN-2012 18:54:58
apply lag +00 00:57:49 day(2) to second(0) interval 20-JUN-2012 18:54:58
estimated startup time 116 second 20-JUN-2012 18:54:58
standby has been open N 20-JUN-2012 18:54:58
transport lag +00 00:42:10 day(2) to second(0) interval 20-JUN-2012 18:54:58
4个小时之后DG 追上了16天的gap 恢复速度还是不错的 总体来说通过增量备份恢复丢失归档的DG是一个很常规的手法 16天的归档>=4T 如果从带库恢复归档 将是一个漫长的过程