Our Exadata has recently been reporting I/O errors frequently. Here is a summary of what we found.
data node alert
ORA-27603: Cell storage I/O error, I/O failed on disk o/192.168.10.5/DATA_DM01_CD_08_dm01cel03 at offset 17331625984 for data length 253952
ORA-27626: Exadata error: 201 (Generic I/O error)
WARNING: Read Failed. group:1 disk:32 AU:4132 offset:761856 size:253952
path:o/192.168.10.5/DATA_DM01_CD_08_dm01cel03
incarnation:0x802360d9 asynchronous result:'I/O error'
subsys:OSS iop:0x2b8c42c03640 bufp:0x2b8c42fc4e00 osderr:0xc9 osderr1:0x0
Exadata error:'Generic I/O error'
IO elapsed time: 18021514 usec Time waited on I/O: 18013517 usec
WARNING: failed to read mirror side 1 of virtual extent 2039 logical extent 0 of file 274 in group [1.540250240] from disk DATA_DM01_CD_08_DM01CEL03 allocation unit 4132 reason error; if possible, will try another mirror side
NOTE: successfully read mirror side 2 of virtual extent 2039 logical extent 1 of file 274 in group [1.540250240] from disk DATA_DM01_CD_05_DM01CEL02 allocation unit 4133
ASM alert
Wed Jun 19 08:45:30 2013
Errors in file /u01/app/grid/diag/asm/+asm/+ASM1/trace/+ASM1_r000_76832.trc:
ORA-27603: Cell storage I/O error, I/O failed on disk o/192.168.10.4/DATA_DM01_CD_07_dm01cel02 at offset 1140850688 for data length 1048576
ORA-27626: Exadata error: 201 (Generic I/O error)
WARNING: Read Failed. group:1 disk:19 AU:272 offset:0 size:1048576
Sun Jul 28 23:05:07 2013
NOTE: repairing group 1 file 274 extent 2039
SUCCESS: extent 2039 of file 274 group 1 repaired - all online mirror sides found readable, no repair required
storage node alert
Jul 28 23:05:07 dm01cel03 kernel: sd 0:2:8:0: SCSI error: return code = 0x00070002
Jul 28 23:05:07 dm01cel03 kernel: end_request: I/O error, dev sdi, sector 33916368
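To see which disks keep turning up in these warnings, the alert log can be scanned for the "failed to read mirror side" lines. The following is only a minimal sketch: the function names are hypothetical, the pattern is modeled on the single warning quoted above, and it assumes each warning sits on one line of the log.

```python
import re
from collections import Counter

# Pattern modeled on the "failed to read mirror side" warning quoted above.
# Assumption: the whole warning is on a single line in the alert log.
MIRROR_FAIL = re.compile(
    r"WARNING: failed to read mirror side (?P<side>\d+) "
    r"of virtual extent (?P<vext>\d+) logical extent (?P<lext>\d+) "
    r"of file (?P<file>\d+) in group \[(?P<group>[\d.]+)\] "
    r"from disk (?P<disk>\S+) allocation unit (?P<au>\d+)"
)


def failed_reads(lines):
    """Yield one dict of captured fields per failed-mirror-read warning."""
    for line in lines:
        match = MIRROR_FAIL.search(line)
        if match:
            yield match.groupdict()


def failures_per_disk(lines):
    """Count failed mirror reads per ASM disk name."""
    return Counter(rec["disk"] for rec in failed_reads(lines))


# The two lines below are copied from the DB node alert log above.
sample = [
    "WARNING: failed to read mirror side 1 of virtual extent 2039 "
    "logical extent 0 of file 274 in group [1.540250240] from disk "
    "DATA_DM01_CD_08_DM01CEL03 allocation unit 4132 reason error; "
    "if possible, will try another mirror side",
    "NOTE: successfully read mirror side 2 of virtual extent 2039 "
    "logical extent 1 of file 274 in group [1.540250240] from disk "
    "DATA_DM01_CD_05_DM01CEL02 allocation unit 4133",
]

for disk, count in failures_per_disk(sample).items():
    print(disk, count)
```

In practice the lines would come from the alert log file rather than a list; a disk whose count keeps climbing is the one to check on the storage cell.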
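For the I/O errors reported on both the DB side and the storage side, Oracle simply relies on ASM's default handling: it first reads the block from the secondary extent,
and then attempts a repair on the primary extent. The repair has two possible outcomes, as seen in ASM alert logs: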
1. SUCCESS: extent 4753 of file 502 group 1 repaired by relocating to a different AU on the same disk or the disk is offline
ASM uses the mirrored copy, which allows the disk to re-allocate data around any bad blocks in the physical disk media. In other words, a new AU-sized physical region is allocated.
2. SUCCESS: extent 2039 of file 274 group 1 repaired - all online mirror sides found readable, no repair required
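Here ASM initiated a rewrite of the same region; since all online mirror sides were then found readable, no relocation was needed.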
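The two outcomes can be told apart mechanically when going through a long ASM alert log. A minimal sketch, assuming the SUCCESS messages always have the wording quoted above (the function name and the outcome labels are made up for illustration):

```python
import re

# Matches the "SUCCESS: extent ... repaired ..." lines quoted above.
REPAIR = re.compile(
    r"SUCCESS: extent (?P<extent>\d+) of file (?P<file>\d+) "
    r"group (?P<group>\d+) repaired(?P<detail>.*)"
)


def classify_repair(line):
    """Return extent/file/group and an outcome label, or None if no match."""
    match = REPAIR.search(line)
    if match is None:
        return None
    detail = match.group("detail")
    if "relocating to a different AU" in detail:
        outcome = "relocated"   # case 1: data moved to a new AU
    elif "no repair required" in detail:
        outcome = "readable"    # case 2: all mirror sides readable
    else:
        outcome = "unknown"     # wording not covered by this sketch
    return {
        "group": int(match.group("group")),
        "file": int(match.group("file")),
        "extent": int(match.group("extent")),
        "outcome": outcome,
    }


messages = [
    "SUCCESS: extent 4753 of file 502 group 1 repaired by relocating to a "
    "different AU on the same disk or the disk is offline",
    "SUCCESS: extent 2039 of file 274 group 1 repaired - all online mirror "
    "sides found readable, no repair required",
]

for msg in messages:
    print(classify_repair(msg))
```

A growing share of "relocated" results for the same disk would fit the picture of a disk accumulating bad blocks.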
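These errors indicate that the storage disk is steadily wearing out. As the number of physical bad blocks grows, once the disk reaches its critical threshold it should be replaced (with ASM fast mirror resync used to resynchronize it).
It is also worth noting that this kind of error is rarely seen on traditional storage. With the external redundancy we commonly use there, the redundancy provided at the storage layer is generally safe enough. So Exadata (XD) is not as impressive on the storage side as the features of its software would suggest. (Can we say that traditional storage is much safer than XD's Sun storage? Perhaps that is a bit rash. Maybe..)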
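For more on the ASM automatic repair behavior described above, see the earlier article.
While we are here, a word on Req_mir_free_MB and Usable_file_MB in a normal redundancy environment.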
[grid@dm01db01 trace]$ asmcmd lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  15593472  6102196          5197824          452186              0             N  DATA_DM01/
MOUNTED  NORMAL  N         512   4096  4194304    894720   893432           298240          297596              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304   3896064  1717684          1298688          209498              0             N  RECO_DM01/
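Why is Total_MB/3 = Req_mir_free_MB? Req_mir_free_MB can be thought of as the equivalent of a hot spare disk. In normal redundancy mode here, the ASM disks are in effect divided into three equal parts (presumably one per storage cell, i.e. per failure group), so that the capacity counted in Req_mir_free_MB can stand in for any disk holding primary or secondary extents. The space in Req_mir_free_MB can still be used, though: once Usable_file_MB is used up, ASM keeps writing data into the Req_mir_free_MB space.
However, only Req_mir_free_MB/2 is really writable, because normal redundancy must write two copies of the data. Once Req_mir_free_MB is exhausted, there is effectively no hot spare left; if the primary and secondary extents then fail at the same time, data will be lost. A case to illustrate: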
[grid@dm01db01 ~]$ asmcmd -p
ASMCMD [+] > lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  15593472  9918184          5197824         2360180              0             N  DATA_DM01/
MOUNTED  NORMAL  N         512   4096  4194304    894720   893432           298240          297596              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304   3896064    28248          1298688         -635220              0             N  RECO_DM01/
For RECO_DM01, Usable_file_MB = -635220, i.e. (Free_MB - Req_mir_free_MB)/2 = (28248 - 1298688)/2. Nearly all of Req_mir_free_MB/2 has already been consumed.
After recovery:
ASMCMD [+] > lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  NORMAL  N         512   4096  4194304  15593472  9918184          5197824         2360180              0             N  DATA_DM01/
MOUNTED  NORMAL  N         512   4096  4194304    894720   893432           298240          297596              0             Y  DBFS_DG/
MOUNTED  NORMAL  N         512   4096  4194304   3896064  3860220          1298688         1280766              0             N  RECO_DM01/
Measured from the negative value before recovery, the usable space actually regained for RECO_DM01 is (1280766 + 635220) MB = 1915986 MB.
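The lsdg figures can be checked with a few lines of arithmetic. This sketch uses the relation Usable_file_MB = (Free_MB - Req_mir_free_MB) / 2 for normal redundancy, which every row quoted in this post is consistent with; the function name and the row labels are hypothetical, and the numbers are copied from the lsdg output shown earlier.

```python
def usable_file_mb(free_mb, req_mir_free_mb, copies=2):
    """Free space left after the mirror reserve, divided by the number of
    copies written (2 for normal redundancy)."""
    return (free_mb - req_mir_free_mb) / copies


# (Total_MB, Free_MB, Req_mir_free_MB, Usable_file_MB) from the lsdg output.
rows = {
    "DATA_DM01": (15593472, 6102196, 5197824, 452186),
    "DBFS_DG": (894720, 893432, 298240, 297596),
    "RECO_DM01 before recovery": (3896064, 28248, 1298688, -635220),
    "RECO_DM01 after recovery": (3896064, 3860220, 1298688, 1280766),
}

for name, (total, free, reserve, reported) in rows.items():
    computed = usable_file_mb(free, reserve)
    # In this environment the reserve is exactly one third of Total_MB.
    print(name, computed == reported, total / 3 == reserve)

# Usable space regained by the recovery, measured from the negative value.
before = usable_file_mb(28248, 1298688)
after = usable_file_mb(3860220, 1298688)
print(after - before)
```

A negative result simply means writes have eaten into the reserve: redundancy could no longer be restored after a failure until space is freed.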