Rebootless Node Fencing

In versions prior to 11.2.0.2, Oracle Clusterware prevented a split-brain by performing a fast reboot (more precisely, a reset) of the offending server(s), without waiting for ongoing I/O operations to complete or for the file systems to be synchronized. This mechanism changed in version 11.2.0.2 (the first 11g Release 2 patch set). After deciding which node to evict, the clusterware takes the following steps (a short sketch of this decision flow appears after the list):


. attempts to shut down all Oracle resources and processes on the server, especially processes generating I/O

. stops the Clusterware stack itself on that node

. afterwards the Oracle High Availability Services daemon (OHASD) tries to restart the Cluster Ready Services (CRS) stack. Once the cluster interconnect is back online, all relevant cluster resources on that node start automatically

. kills the node only if stopping the resources or the processes generating I/O is not possible (for example, processes hanging in kernel mode or stuck in the I/O path)
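To make this decision flow concrete, here is a minimal Python sketch of the sequence. It is only an illustration of the logic described in the list above, not Oracle code; the callback names (stop_oracle_resources, stop_crs_stack, reset_node, restart_via_ohasd) are hypothetical stand-ins.

```python
def fence_node(stop_oracle_resources, stop_crs_stack, reset_node, restart_via_ohasd):
    """Model of the 11.2.0.2+ eviction sequence on the node chosen for eviction."""
    # 1. Try a graceful, rebootless fence: stop everything that can generate I/O,
    #    then stop the Clusterware stack itself.
    if stop_oracle_resources() and stop_crs_stack():
        # 2. OHASD stays up and restarts the CRS stack once the interconnect returns.
        restart_via_ohasd()
        return "rebootless fence"
    # 3. Fall back to the pre-11.2.0.2 behaviour if processes cannot be stopped
    #    (e.g. hanging in kernel mode or stuck in the I/O path).
    reset_node()
    return "node reset"


if __name__ == "__main__":
    # Example: a node whose I/O-generating processes can be stopped cleanly.
    print(fence_node(lambda: True, lambda: True,
                     lambda: print("resetting node"),
                     lambda: print("OHASD restarting CRS stack")))
    # Example: a node with a process stuck in the I/O path.
    print(fence_node(lambda: False, lambda: True,
                     lambda: print("resetting node"),
                     lambda: print("OHASD restarting CRS stack")))
```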

This behavior change is particularly useful for non-cluster-aware applications. The alert log excerpt below shows the message sequence on the evicted node:

[cssd(3713)]CRS-1610:Network communication with node rac1 (1) missing for 90% of timeout interval. Removal of this node from cluster in 2.190 seconds

[cssd(3713)]CRS-1652:Starting clean up of CRSD resources.

[cssd(3713)]CRS-1654:Clean up of CRSD resources finished successfully.
[cssd(3713)]CRS-1655:CSSD on node rac2 detected a problem and started to shutdown.

[cssd(5912)]CRS-1713:CSSD daemon is started in clustered mode
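A small sketch of how one might spot this sequence in a Clusterware alert log follows; it simply filters for the CRS message codes shown above. The log path passed on the command line is just a placeholder, so adjust it for your environment (typically under the Grid Infrastructure home or ADR base).

```python
import re
import sys

# CRS message codes from the excerpt above, in the order they appear.
FENCING_CODES = ("CRS-1610", "CRS-1652", "CRS-1654", "CRS-1655", "CRS-1713")


def fencing_messages(path):
    """Yield alert-log lines that contain one of the fencing-related CRS codes."""
    pattern = re.compile("|".join(FENCING_CODES))
    with open(path, errors="replace") as log:
        for line in log:
            if pattern.search(line):
                yield line.rstrip()


if __name__ == "__main__":
    # Usage: python scan_alert_log.py /path/to/alert.log
    for message in fencing_messages(sys.argv[1]):
        print(message)
```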

Unlike the pre-11gR2 mechanism, Oracle no longer kills the node outright; instead it kills the relevant processes, and only if that attempt fails does it kill the node. For a big cluster this is valuable protection, because it avoids the resource freeze caused by re-mastering resources after a node reboot. The following passage explains this point in more detail:


Prior to 11g R2, during voting disk failures the node was rebooted to protect the integrity of the cluster. But the underlying problem is not necessarily just a communication issue: the node may be hanging, or an I/O operation may be hanging, so the decision to reboot can potentially be the wrong one. Oracle Clusterware will therefore fence the node without rebooting it. This is a major achievement and changes the way the cluster is designed.

The reason to avoid the reboot is that during reboots resources need to be re-mastered and the cluster has to be re-formed among the remaining nodes. In a big cluster with many nodes this can be a very expensive operation, so Oracle fences the node by killing the offending processes: the clusterware stack on that node shuts down, but the node itself does not. Once the I/O path or the network heartbeat is available again, the cluster stack is started again. The data is still protected, but without the pain of rebooting nodes. In cases where a reboot is needed to protect the integrity of the cluster, the clusterware will still decide to reboot the node.

Reference: RAC_System_Test_Plan_Outline_11gr2_v2_0