Fractured blocks when RMAN backup is running

The alert log on the primary database showed the error below. The OS is Linux 5.5.

Stopping background process CJQ0
Sat Feb 11 03:41:07 2012
Hex dump of (file 9, block 561424) in trace file /data/oracle/diag/rdbms/yhdstd/yhddb1/trace/yhddb1_ora_11327.trc
Corrupt block relative dba: 0x02489110 (file 9, block 561424)
Fractured block found during backing up datafile
Data in bad block:
type: 6 format: 2 rdba: 0x02489110
last change scn: 0x0007.a9d2831f seq: 0x1 flg: 0x06
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xe6b90601
check value in block header: 0x8c60
computed block checksum: 0x82af
Reread of blocknum=561424, file=/data/oracle/oradata/yhddb1/md_data01.dbf. found valid data
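
The rdba in the message can be checked against the reported file and block numbers. The sketch below assumes the common layout of a 32-bit relative DBA (upper 10 bits for the relative file number, lower 22 bits for the block number); the function name `decode_rdba` is made up for illustration.

```python
def decode_rdba(rdba):
    """Split a 32-bit relative DBA into (relative file number, block number).

    Assumes the 10-bit file / 22-bit block layout; older releases or
    other encodings may not follow this split.
    """
    file_no = rdba >> 22          # upper 10 bits
    block_no = rdba & 0x3FFFFF    # lower 22 bits
    return file_no, block_no


# Value taken from the alert log excerpt above
print(decode_rdba(0x02489110))  # (9, 561424)
```

For this excerpt the result agrees with "file 9, block 561424" in the alert log.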

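This situation was fairly common in 9i. When RMAN backs up a datafile that is under heavy I/O at that moment (large batch writes, for example), Oracle flags the block as fractured, but this is not corruption in the true sense.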

Oracle checks the block again ('Reread of blocknum=561424, file=/data/oracle/oradata/yhddb1/md_data01.dbf. found valid data') and finds that it is valid. The database version is 11.2.0.3.
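
To tell harmless fractured-block reports from ones that deserve a closer look, the alert log can be scanned for the reread outcome. This is a minimal sketch: the regular expressions are written against the two message formats quoted in this post only, and the function name `summarize` is hypothetical.

```python
import re

# Patterns modelled on the messages quoted in this post
FRACTURED = re.compile(
    r"Corrupt block relative dba: (0x[0-9a-fA-F]+) \(file (\d+), block (\d+)\)"
)
REREAD = re.compile(
    r"Reread of blocknum=(\d+), file=(\S+?)\. found (valid|same corrupt) data"
)


def summarize(alert_text):
    """Return {block number: reread outcome} for blocks reported in the text."""
    reported = {int(m.group(3)) for m in FRACTURED.finditer(alert_text)}
    outcomes = {block: "no reread message" for block in reported}
    for m in REREAD.finditer(alert_text):
        outcomes[int(m.group(1))] = m.group(3)  # "valid" or "same corrupt"
    return outcomes


sample = """Corrupt block relative dba: 0x02489110 (file 9, block 561424)
Fractured block found during backing up datafile
Reread of blocknum=561424, file=/data/oracle/oradata/yhddb1/md_data01.dbf. found valid data
"""
print(summarize(sample))  # {561424: 'valid'}
```

A block whose outcome is "valid" matches the case described here; "same corrupt" or a missing reread message would call for further checks.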

[oracle@rac03 ~]$ crontab -l
0 03 * * 2 sh /home/oracle/monitor/script/rman_level0.sh >> /home/oracle/monitor/script/rman_level0.log
0 03 * * 0,1,3,4,5,6 sh /home/oracle/monitor/script/rman_level1.sh >> /home/oracle/monitor/script/rman_level1.log

The RMAN backups are indeed scheduled to start at 03:00, which is consistent with the 03:41 timestamp in the alert log.
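
The match between the crontab and the alert timestamp can also be checked mechanically. The sketch below hard-codes the day-of-week fields from the two crontab entries and assumes an English locale for the day and month names; `scheduled_backup_for` is a made-up helper. It only shows which job was scheduled that day and how long after its start the message was logged, not that the job was still running.

```python
from datetime import datetime

# Day-of-week fields copied from the crontab above (cron: 0 = Sunday ... 6 = Saturday)
SCHEDULE = {
    "rman_level0.sh": {2},
    "rman_level1.sh": {0, 1, 3, 4, 5, 6},
}
START_HOUR = 3  # both entries start at 03:00


def scheduled_backup_for(alert_stamp):
    """Return (script scheduled that day, minutes elapsed since its 03:00 start)."""
    ts = datetime.strptime(alert_stamp, "%a %b %d %H:%M:%S %Y")
    cron_dow = ts.isoweekday() % 7  # isoweekday: Mon=1..Sun=7 -> cron: Sun=0..Sat=6
    script = next(name for name, days in SCHEDULE.items() if cron_dow in days)
    minutes_in = (ts.hour - START_HOUR) * 60 + ts.minute
    return script, minutes_in


print(scheduled_backup_for("Sat Feb 11 03:41:07 2012"))  # ('rman_level1.sh', 41)
```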

So even on 11g, we still recommend keeping RMAN backups out of the database's busy periods wherever possible.

reference:

fact: Oracle Server – Enterprise Edition 8
fact: Oracle Server – Enterprise Edition 9
fact: Recovery Manager (RMAN)
symptom: Fractured block found during backing up datafile
symptom: Reread of blocknum found some corrupt data
symptom: Analyze table validate structure cascade returns no errors
change: NOTE ROLE:

The messages are of the form

Reread of blocknum=36256, file=/pdscdata/pdsclive/data1/dispatch_data_large2.dbf. found same corrupt data
***
Corrupt block relative dba: 0xfc008dc0 (file 63, block 36288)
Fractured block found during backing up datafile
Data in bad block –
type: 0 format: 0 rdba: 0x00000000
last change scn: 0x0000.00000000 seq: 0x0 flg: 0x00
consistency value in tail: 0x53494e53
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
cause: RMAN backups of datafile are being performed while the datafile is
involved in heavy I/O.

RMAN reads Oracle blocks from disk. If it finds that the block is fractured,
which means it is being actively used, it performs a reread of the block. If
that fails again then the block is assumed to be corrupt.

By identifying the object that these blocks belong to (following the note
"Handling Oracle Block Corruptions in Oracle7/8/8i") and performing an
analyze .. validate structure cascade on the object involved, you can confirm
that the object is not corrupt.
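
The two steps in the note can be sketched as follows. This is only an illustration: it builds the SQL text and bind values and leaves running them to whatever Oracle driver or SQL*Plus session is in use. It assumes the file number in the alert message is the absolute file number stored in `dba_extents.file_id` and that the segment found is a table; the helper names are hypothetical.

```python
# Find the segment that owns a given block (dba_extents is a standard dictionary view)
SEGMENT_LOOKUP_SQL = """
SELECT owner, segment_name, segment_type
  FROM dba_extents
 WHERE file_id = :file_id
   AND :block_no BETWEEN block_id AND block_id + blocks - 1
"""


def segment_lookup(file_id, block_no):
    """Return the lookup query and its bind values for one reported block."""
    return SEGMENT_LOOKUP_SQL, {"file_id": file_id, "block_no": block_no}


def validate_statement(owner, table_name):
    """Build the ANALYZE statement for a table returned by the lookup.

    Identifiers cannot be passed as bind variables, so they are quoted
    exactly as the dictionary view returns them.
    """
    for name in (owner, table_name):
        if '"' in name:
            raise ValueError("unexpected double quote in identifier")
    return f'ANALYZE TABLE "{owner}"."{table_name}" VALIDATE STRUCTURE CASCADE'


# File and block numbers from the note's example; SCOTT.EMP is a placeholder
sql, binds = segment_lookup(63, 36288)
print(binds)
print(validate_statement("SCOTT", "EMP"))
```

If the ANALYZE statement returns without errors, the block report came from the backup reading the block mid-write rather than from real corruption in the object.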

fix:

Run the backups when the tablespace has less I/O activity.