昨天的logic standby 事故处理:

夜里12点左右 2台logic standby 全部停止日志apply 查看logic standby 相关进程发现:

1.select spid, type, status from v$logstdby_process no rows returns

2.2012-03-12 23:36:16 update “PROD_DATA2″.”PM_SUPPLIER” set xxx…..

ORA-01403: no data found

报出如下错误,这种错误一般是由于standby 端的表数据与primary端出现了不一致导致的,最大的麻烦莫过于这张表是一个LOB字段的表

通过数据的对比,发现源表与目标表 已经出现了很大的不一致,初步判断是由于skip操作导致的 transaction 丢失导致的数据异常

准备使用DBMS_LOGSTDBY.INSTANTIATE_TABLE 进行表的重新同步:

INSTANTIATE_TABLE Procedure

This procedure creates and populates a table in the standby database from a corresponding table in the primary database. The table requires the name of the database link (dblink) as an input parameter. If the table already exists in the logical standby database, it will be dropped and re-created based on the table definition at the primary database. This procedure only brings over the data associated with the table, and not the associated indexes and constraints.

Use the INSTANTIATE_TABLE procedure to:

Add a table to a standby database.

Re-create a table in a standby database.

Syntax

DBMS_LOGSTDBY.INSTANTIATE_TABLE (
schema_name IN VARCHAR2,
table_name IN VARCHAR2,
dblink IN VARCHAR2);
Parameters

Table 56-5 INSTANTIATE_TABLE Procedure Parameters

Parameter Description
schema_name
Name of the schema
table_name
Name of the table to be created or re-created in the standby database
dblink
Name of the database link account that has privileges to read and lock the table in the primary database

SQL*Plus: Release 10.2.0.5.0 – Production on Wed Mar 14 00:42:21 2012

Copyright (c) 1982, 2010, Oracle. All Rights Reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 – 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> alter session set db_file_multiblock_read_count = 128;

Session altered.

SQL> alter session set sort_area_size =804857600;

Session altered.

SQL> alter session set workarea_size_policy = manual;

Session altered.

SQL> exec DBMS_LOGSTDBY.INSTANTIATE_TABLE(‘PROD_DATA2′,’PM_SUPPLIER’,’dblinkprimary’);

ERROR at line 1:
ORA-20001: ORA-31693: Table data object “PROD_DATA2″.”PM_SUPPLIER” failed to
load/unload and is being skipped due to error:
ORA-12161: TNS:internal error: partial data received
ORA-02063: preceding from
ORA-06512: at “SYS.KUPW$WORKER”, line 113
ORA-02063: preceding from
ORA-01555: snapshot too old: rollback segment number with name “” too small
ORA-22924: snapshot too old
ORA-02063: preceding 2 lines from DBLINKPRIMARY
ORA-01555: snapshot too old: rollback segment number with name “” too small
ORA-22924: snaps
ORA-06512: at “SYS.DBMS_LOGSTDBY”, line 664
ORA-06512: at line 1

报出如下错误 这种错误一般是由于LOB 回滚段不足所导致的:

关于LOB的undo做如下说明:

RETENTION – time-based: this specifies how long older versions are to be retained.

PCTVERSION – space-based: this specifies what percentage of the LOB segment is to be used to hold older versions.

Points to Note:

RETENTION is a keyword in the LOB column definition. No value can be specified for RETENTION. The RETENTION value is implicit. If a LOB is created with database compatibility set to 9.2.0.0 or higher, undo_management=TRUE and PCTVERSION is not explicitly specified, time-based retention is used. The LOB RETENTION value is always equal to the value of the UNDO_RETENTION database instance parameter.

You cannot specify both PCTVERSION and RETENTION.

PCTVERSION is applicable only to LOB chunks/pages allocated in LOBSEGMENTS. Other LOB related data in the table column and the LOBINDEX segment use regular undo mechanism.

PCTVERSION=0: the space allocated for older versions of LOB data in LOBSEGMENTS can be reused by other transactions and can cause“snapshot too old” errors.

PCTVERSION=100: the space allocated by older versions of LOB data can never be reused by other transactions. LOB data storage space is never reclaimed and it always increases.

RETENTION and PCTVERSION can be altered Recommendation Time-based retention using the RETENTION keyword is preferred. A high value for RETENTION or PCTVERSION may be needed to avoid “snapshot too old” errors in environments with high concurrent read/write LOB access.

查看”PROD_DATA2″.”PM_SUPPLIER”这张表的LOB属性:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 – 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL>
SQL> select bitand(flags,32) from sys.lob$ where OBJ#= (select OBJECT_ID from dba_objects where
OWNER=’PROD_DATA2′ and OBJECT_NAME=’PM_SUPPLIER’); 2

BITAND(FLAGS,32)
—————-
0

SQL> select table_name, segment_name, pctversion, retention
from dba_lobs where table_name = ‘PM_SUPPLIER’;

TABLE_NAME SEGMENT_NAME PCTVERSION RETENTION
—————————— —————————— ———- ———-
PM_SUPPLIER SYS_LOB0000017052C00041$$ 10

修改为90

alter table prod_data2.PM_SUPPLIER modify lob(PRODUCT_DESCRIPTION) (pctversion 90);

sys@item> select table_name, segment_name, pctversion, retention
from dba_lobs where table_name = ‘PM_SUPPLIER’; 2

TABLE_NAME SEGMENT_NAME PCTVERSION RETENTION
—————————— —————————— ———- ———-
PM_SUPPLIER SYS_LOB0000017052C00041$$ 90

重新开启同步,继续报错:

ERROR at line 1:
ORA-20001: ORA-39082: Object type
TRIGGER:”PROD_DATA2″.”TRIG_PRODUCT_SUPPLIER_INSERT” created with compilation
warnings
ORA-06512: at “SYS.DBMS_LOGSTDBY”, line 664
ORA-06512: at line 1

使用 DBMS_LOGSTDBY.INSTANTIATE_TABLE同步时 所依赖的对象会一并同步过来,这个trigger 在源端我们已经禁用了,直接drop掉这个trigger继续同步:

SQL> alter session set db_file_multiblock_read_count = 128;

Session altered.

SQL> alter session set sort_area_size =804857600;

Session altered.

SQL> alter session set workarea_size_policy = manual;

Session altered.

SQL> exec DBMS_LOGSTDBY.INSTANTIATE_TABLE(‘PROD_DATA2′,’PM_SUPPLIER’,’dblinkprimary’);

PL/SQL procedure successfully completed.

经历了漫长的等待终于同步成功,由于此前失败了好几次,总共大概花了7个小时,可见LOB表对于logic standby的影响,一旦出现问题,同步花费的时间将大大的超过普通表,一张记录相同的普通表 使用DBMS_LOGSTDBY.INSTANTIATE_TABLE同步大概只需要10分钟左右,相比较而言差了几百倍之多!

同步完成后 进度很快追了上来:

SPID TYPE STATUS
———— ————— —————————————————————————————————-
29989 COORDINATOR ORA-16116: no work available
29940 READER ORA-16116: no work available
29942 BUILDER ORA-16116: no work available
29944 PREPARER ORA-16116: no work available
29946 PREPARER ORA-16116: no work available
29948 PREPARER ORA-16116: no work available
29952 ANALYZER ORA-16116: no work available
29954 APPLIER ORA-16116: no work available
29956 APPLIER ORA-16116: no work available
29958 APPLIER ORA-16116: no work available
29960 APPLIER ORA-16116: no work available
29962 APPLIER ORA-16116: no work available
29964 APPLIER ORA-16116: no work available
29966 APPLIER ORA-16116: no work available
29968 APPLIER ORA-16116: no work available
29970 APPLIER ORA-16116: no work available
29972 APPLIER ORA-16116: no work available
29974 APPLIER ORA-16116: no work available
29976 APPLIER ORA-16116: no work available

19 rows selected.

APPLIED_SCN APPLIED_TIME RESTART_SCN RESTART_TIME LATEST_SCN LATEST_TIME MINING_SCN MINING_TIME
———– ——————- ———– ——————- ———- ——————- ———- ——————-
3.5694E+10 2012-03-13 06:31:33 3.5686E+10 2012-03-12 19:07:22 3.5694E+10 2012-03-13 06:31:56 3.5694E+10 2012-03-13 06:31:34

终于处理完了这次故障,仰头一看 东方已经微微既白。对于大型的OLTP系统LOB表的存在无疑增加的很多维护的难度,对于要求性极高的系统,LOB字段建议不要存放在数据库系统,无论对于DG还是后期的例如OGG的同步 都是一个极大的隐患。