evmd and cssd services fail to start on a 10.2.0.5 RAC system
CRS version
[oracle@ptdb02 oracle]$ crsctl query crs softwareversion
CRS software version on node [ptdb02] is [10.2.0.5.0]
h1:3:respawn:/sbin/init.d/init.evmd run >/dev/null 2>&1
h2:3:respawn:/sbin/init.d/init.cssd fatal >/dev/null 2>&1
[crsd(21292)]CRS-1012:The OCR service started on node ptdb01.
2012-09-08 18:48:24.980
[evmd(21873)]CRS-1401:EVMD started on node ptdb01.
2012-09-08 18:48:25.271
[crsd(22023)]CRS-1005:The OCR upgrade was completed. Version has changed from 169870592 to 169870592. Details in /u01/app/oracle/product/10.2.0/crs_1/log/ptdb01/crsd/crsd.log.
2012-09-08 18:48:25.271
[crsd(22023)]CRS-1012:The OCR service started on node ptdb01.
2012-09-08 18:48:26.679
[evmd(22605)]CRS-1401:EVMD started on node ptdb01.
CSSD reconfiguration never completes. The only active node is ptdb01, and the evmd and cssd processes fail to start, so the next step is to check the evmd log:
Oracle Database 10g CRS Release 10.2.0.5.0 Production Copyright 1996, 2007, Oracle. All rights reserved
2012-09-08 18:59:03.765: [ EVMD][999623216]0Initializing OCR
2012-09-08 18:59:03.773: [ EVMD][999623216]0Active Version from OCR:10.2.0.5.0
2012-09-08 18:59:03.773: [ EVMD][999623216]0Active Version and Software Version are same
2012-09-08 18:59:03.773: [ EVMD][999623216]0Initializing Diagnostics Settings
2012-09-08 18:59:03.773: [ EVMD][999623216]0ENV Logging level for Module: allcomp 0
2012-09-08 18:59:03.773: [ EVMD][999623216]0ENV Logging level for Module: default 0
2012-09-08 18:59:03.773: [ EVMD][999623216]0ENV Logging level for Module: COMMCRS 0
2012-09-08 18:59:03.773: [ EVMD][999623216]0ENV Logging level for Module: COMMNS 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: EVMD 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: EVMDMAIN 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: EVMCOMM 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: EVMEVT 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: EVMAPP 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: EVMAGENT 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: CRSOCR 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: CLUCLS 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: OCRRAW 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: OCROSD 0
2012-09-08 18:59:03.774: [ EVMD][999623216]0ENV Logging level for Module: OCRAPI 0
2012-09-08 18:59:03.775: [ EVMD][999623216]0ENV Logging level for Module: OCRUTL 0
2012-09-08 18:59:03.775: [ EVMD][999623216]0ENV Logging level for Module: OCRMSG 0
2012-09-08 18:59:03.775: [ EVMD][999623216]0ENV Logging level for Module: OCRCLI 0
2012-09-08 18:59:03.775: [ EVMD][999623216]0ENV Logging level for Module: CSSCLNT 0
[ clsdmt][1108588864]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=ptdb01DBG_EVMD))
2012-09-08 18:59:03.777: [ EVMD][999623216]0Creating pidfile /u01/app/oracle/product/10.2.0/crs_1/evm/init/ptdb01.pid
2012-09-08 18:59:03.781: [ EVMD][999623216]0Authorization database built successfully.
2012-09-08 18:59:04.209: [ OCRAPI][999623216]procr_open: Node Failure. Attempting retry #0
..
2012-09-08 18:59:04.210: [ OCRCLI][999623216]oac_reconnect_server: Could not connect to server. clsc ret 9
2012-09-08 18:59:19.177: [ OCRCLI][999623216]oac_reconnect_server: Could not connect to server. clsc ret 9
2012-09-08 18:59:19.227: [ OCRAPI][999623216]procr_open: Node Failure. Attempting retry #298
2012-09-08 18:59:19.228: [ OCRCLI][999623216]oac_reconnect_server: Could not connect to server. clsc ret 9
2012-09-08 18:59:19.278: [ OCRAPI][999623216]procr_open: Node Failure. Attempting retry #299
...
2012-09-08 18:59:19.635: [ OCRCLI][999623216]oac_reconnect_server: Could not connect to server. clsc ret 9
2012-09-08 18:59:19.636: [ EVMAPP][999623216][PANIC]0Unable to open local accept socket - errno 13
2012-09-08 18:59:19.636: [ EVMD][999623216][PANIC]0EVMD exiting
2012-09-08 18:59:19.636: [ EVMD][999623216]0Done.
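OCR initialization fails after 299 retries, so the evmd process cannot start. This behavior differs from 11gR2: from 11gR2 onward, CRS reads the ASM information from the OLR (Oracle Local Registry) and uses it to bring the OCR online. See this article for reference.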
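A long evmd log is easier to judge from a summary than by scrolling through hundreds of retry lines. The function below is a minimal sketch, not an Oracle tool: the name `summarize_ocr_failures` is hypothetical, and it only counts the three message patterns that appear in the log excerpt above.

```shell
# Summarize OCR connection failures in an evmd log.
# Usage: summarize_ocr_failures /path/to/evmd.log
# (function name and output format are illustrative, not part of Oracle Clusterware)
summarize_ocr_failures() {
    log_file=$1
    # Count procr_open retry lines and failed reconnect attempts.
    retries=$(grep -c 'procr_open: Node Failure' "$log_file")
    refused=$(grep -c 'oac_reconnect_server: Could not connect' "$log_file")
    # Did evmd give up and exit?
    if grep -q 'PANIC.*EVMD exiting' "$log_file"; then
        panic=yes
    else
        panic=no
    fi
    echo "procr_open_retries=$retries connect_failures=$refused evmd_panic=$panic"
}
```

Run it against the evmd log of the affected node. The exact log path depends on the installation; by analogy with the crsd.log path shown earlier, it is likely under the CRS home's log/ptdb01/evmd directory, but that location is an assumption.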
We read the OCR location from the ocr.loc file (under /etc):
[oracle@ptdb01 oracle]$ cat ocr.loc
ocrconfig_loc=/u01/app/oracle/product/10.2.0/db_1/cdata/localhost/local.ocr
local_only=TRUE
[oracle@ptdb01 oracle]$ strings /u01/app/oracle/product/10.2.0/db_1/cdata/localhost/local.ocr
root
root
SYSTEM
DATABASE
local_only
ORA_CRS_HOME
versionstring
version
language
AMERICAN_AMERICA.WE8ISO8859P1
activeversion
node_numbers
10G Release 2
/u01/app/oracle/product/10.2.0/db_1
true
node0
10.2.0.5.0
hostnames
privatenames
node_numbers
node_names
configured_node_map
clustername
localhost
ptdb01
nodenum
node0
nsendpoint
hostname
privatename
nodename
ptdb01
127.0.0.1
nodenum
ptdb01
nodenum
ptdb01
(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=0))
10.2.0.5.0
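local_only=TRUE shows that this is clearly a local (single-instance) configuration, pointing at the file "/u01/app/oracle/product/10.2.0/db_1/cdata/localhost/local.ocr". But local.ocr contains no information about the cluster OCR location, so the OCR cannot be initialized and the evmd process cannot come online. After manually setting ocrconfig_loc=/dev/raw/raw2, start CRS: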
2012-09-08 19:04:20.912: [ EVMD][934681136]0Authorization database built successfully.
2012-09-08 19:04:21.270: [ EVMEVT][934681136][ENTER]0EVM Listening on: 54720654
2012-09-08 19:04:21.274: [ EVMAPP][934681136]0EVMD Started
2012-09-08 19:04:21.277: [ EVMEVT][1199147328]0Listening at (ADDRESS=(PROTOCOL=tcp)(HOST=ptdb01-priv)(PORT=0)) for P2P evmd connections requests
2012-09-08 19:04:21.281: [ EVMD][934681136]0Authorization database built successfully.
2012-09-08 19:04:21.395: [ EVMEVT][1230616896][ENTER]0Establishing P2P connection with node: ptdb02
2012-09-08 19:04:21.397: [ EVMEVT][1241106752]0Private Member Update event for ptdb01 received by clssgsgrpstat
2012-09-08 19:00:07.083
[cssd(13795)]CRS-1601:CSSD Reconfiguration complete. Active nodes are ptdb01 ptdb02 .
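Both ptdb01 and ptdb02 are now active, and cssd started successfully. We learned afterwards that the engineer had mistakenly chosen single-instance when installing ASM and then simply deleted that instance, without completing the follow-up cleanup.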
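The check that exposed the problem, reading ocr.loc and noticing local_only=TRUE, can be sketched as a small shell function. This is an illustration under assumptions: `check_ocr_loc` is a hypothetical name, and it only understands the two keys shown in the ocr.loc listing above.

```shell
# Report what an ocr.loc file points at and flag a local-only (single-instance) OCR.
# Usage: check_ocr_loc /path/to/ocr.loc
# Returns 1 when local_only is TRUE, 0 otherwise.
check_ocr_loc() {
    ocr_loc_file=$1
    # Extract the value after '=' for each key; empty if the key is missing.
    loc=$(sed -n 's/^ocrconfig_loc=//p' "$ocr_loc_file")
    local_only=$(sed -n 's/^local_only=//p' "$ocr_loc_file")
    echo "ocrconfig_loc=$loc"
    case "$local_only" in
        [Tt][Rr][Uu][Ee])
            echo "WARNING: local_only=TRUE, this is a single-instance OCR, not a cluster OCR"
            return 1 ;;
        *)
            echo "local_only=$local_only"
            return 0 ;;
    esac
}
```

On a RAC node, a non-zero return from this check is a hint that ocr.loc was left over from, or overwritten by, a single-instance configuration.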
----------------------------------------
DBCA reports an error when creating a database:
CRS version:
[grid@db-42 bin]$ crsctl query crs softwareversion
Oracle Clusterware version on node [db-42] is [11.2.0.3.0]
[grid@db-42 bin]$ id oracle
uid=502(oracle) gid=501(oinstall) groups=501(oinstall),502(dba),506(asmdba)
[grid@db-42 bin]$ crs_getperm ora.ARCH.dg
Name: ora.DATA.dg
owner:grid:rwx,pgrp:oinstall:rwx,other::r--
[grid@db-42 bin]$ crs_getperm ora.DATA.dg
Name: ora.DATA.dg
owner:grid:rwx,pgrp:asmadmin:rwx,other::r--
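The oracle user is not in the asmadmin group that owns ora.DATA.dg, so it only gets the read-only "other" permission. There are two ways to fix this:
1 Add oracle to the asmadmin group
2 Modify the permissions on ora.DATA.dg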
[grid@db-42 bin]$ crs_setperm ora.DATA.dg -u user:oracle:rwx
[grid@db-42 bin]$ crs_getperm ora.DATA.dg
Name: ora.DATA.dg
owner:grid:rwx,pgrp:asmadmin:rwx,other::r--,user:oracle:rwx
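Both fixes can be verified with plain string checks on the output of `id` and `crs_getperm`. The two helpers below are a hedged sketch: `in_group` and `acl_has_user` are hypothetical names, and they assume the comma-separated ACL format shown in the crs_getperm output above.

```shell
# in_group "<space-separated group names>" <group>
# Returns 0 if <group> is in the list (e.g. the output of `id -nG oracle`).
in_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# acl_has_user "<acl string>" <user>
# Returns 0 if the ACL contains an explicit user:<user>: entry.
acl_has_user() {
    case ",$1," in
        *",user:$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `in_group "$(id -nG oracle)" asmadmin` checks the first fix, and passing the permission line printed by `crs_getperm ora.DATA.dg` to `acl_has_user` with `oracle` as the user checks the second.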