Next, the SGeRAC package manager shuts down Oracle Clusterware via the Oracle Clusterware MNP, and then shuts down the storage needed by Oracle Clusterware. If that storage is managed by CFS, this requires subsequently shutting down the mount point and disk group MNPs. The package manager can do this because the dependent RAC database instance MNP is already down. Before shutting itself down, Oracle Clusterware shuts down the ASM instance, if one is configured, and then the node applications. Lastly, SGeRAC itself shuts down.
Note that the stack can be brought up or down manually, package by package, by using cmrunpkg and cmhaltpkg in the proper dependency order. To disable automatic startup of the stack, partially or wholly, when a node joins the cluster, set the AUTO_RUN attribute to NO on each package that should not be started automatically.
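The "proper dependency order" for a manual start or halt can be derived mechanically from the package dependencies. The following is a minimal sketch, not part of the toolkit: the package names (crs-mnp, asmdg-mnp, racdb-mnp) and the dependency map are hypothetical, and the sketch only prints the cmrunpkg and cmhaltpkg command lines rather than running them.

```python
from graphlib import TopologicalSorter

# Hypothetical package names. Each package maps to the packages it depends on.
DEPENDS_ON = {
    "crs-mnp": [],
    "asmdg-mnp": ["crs-mnp"],
    "racdb-mnp": ["asmdg-mnp"],
}


def start_order(depends_on):
    """Return packages so that every package follows the ones it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


def halt_order(depends_on):
    """Return packages so that dependents are halted before their dependencies."""
    return list(reversed(start_order(depends_on)))


def command_lines(depends_on, halting=False):
    """Build the command line to issue for each package, in order."""
    if halting:
        return ["cmhaltpkg " + pkg for pkg in halt_order(depends_on)]
    return ["cmrunpkg " + pkg for pkg in start_order(depends_on)]


if __name__ == "__main__":
    for line in command_lines(DEPENDS_ON):
        print(line)
    for line in command_lines(DEPENDS_ON, halting=True):
        print(line)
```

Halting is simply the reverse of starting, which matches the shutdown sequence described above: the RAC database instance MNP goes down before the packages it depends on.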
How Serviceguard Extension for RAC starts, stops and checks Oracle Clusterware
Having discussed how the toolkit manages the overall control flow of the combined stack during startup and shutdown, we will now discuss how the toolkit interacts with Oracle Clusterware and RAC database instances. We begin with the toolkit interaction with Oracle Clusterware.
The MNP for Oracle Clusterware provides start and stop functions for Oracle Clusterware and has a service for checking the status of Oracle Clusterware.
The start function starts Oracle Clusterware using crsctl start crs. To ensure that Oracle Clusterware has started successfully, the function runs crsctl check every 10 seconds until the command output indicates that the CSS, CRS, and EVM daemons are healthy. If Oracle Clusterware does not start successfully, the start function keeps looping until the package start timer expires, at which point SGeRAC fails the instance of the Oracle Clusterware MNP on that node.
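The start function's polling behavior can be illustrated with a short sketch. This is not the toolkit's code: the function name wait_until is hypothetical, the health check is passed in as a callable instead of parsing real crsctl output, and the timeout parameter only stands in for the package start timer, which in the real stack is enforced by SGeRAC rather than by the function itself.

```python
import time


def wait_until(check, interval=10.0, timeout=300.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() every `interval` seconds until it returns True.

    Returns True as soon as check() succeeds, or False once `timeout`
    seconds have elapsed without success. The sleep and clock arguments
    are injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated health check: unhealthy twice, then healthy.
    answers = iter([False, False, True])
    print(wait_until(lambda: next(answers), sleep=lambda seconds: None))
```

In a real script, check would run crsctl check and inspect its output for all three daemons; that parsing is deliberately left out here because the output format depends on the Oracle Clusterware version.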
The stop function stops Oracle Clusterware using crsctl stop crs. Then, every 10 seconds, it runs ps until the command output indicates that the evmd.bin, crsd.bin, and ocssd.bin processes no longer exist.
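The test that the stop function applies to the ps output can be sketched as follows. The function name remaining_daemons is hypothetical, and the sketch assumes a ps listing whose last column is the command name or path; the actual ps options and parsing used by the toolkit are not stated here.

```python
import posixpath

# Daemon process names taken from the description above.
DAEMONS = ("evmd.bin", "crsd.bin", "ocssd.bin")


def remaining_daemons(ps_output, names=DAEMONS):
    """Return the sorted daemon names that still appear in a ps listing.

    Assumes the last whitespace-separated field of each line is the
    command name or path, and compares only its base name.
    """
    found = set()
    for line in ps_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        base = posixpath.basename(fields[-1])
        if base in names:
            found.add(base)
    return sorted(found)


if __name__ == "__main__":
    # Illustrative listing, not captured from a real system.
    listing = (
        "  PID TTY   TIME COMMAND\n"
        " 4211 ?     0:03 /opt/crs/bin/ocssd.bin\n"
        " 4300 ?     0:01 evmd.bin\n"
        " 4410 pts/0 0:00 sh\n"
    )
    print(remaining_daemons(listing))
```

The stop is complete when the returned list is empty; until then the caller would sleep 10 seconds and run ps again.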
The check function runs ps to determine the process ID of the ocssd.bin process. Then, in a continuous loop driven by a configurable timer, it uses kill
When the Oracle Clusterware MNP is in maintenance mode, the check function pauses Oracle Clusterware health checking. Otherwise, if the check function finds that the process has died, it means that Oracle Clusterware has either failed or been inappropriately shut
How Serviceguard Extension for RAC mounts, dismounts and checks ASM disk groups
We now discuss the toolkit interaction with the ASM disk groups.
The MNP for the ASM disk groups needed by a RAC database provides mount and dismount functions for those disk groups and has a service for checking whether they are mounted.
The start function executes su to switch to the Oracle software owner user ID. It then determines the ASM instance ID on the current node for the specified disk group using crsctl status resource ora.asm, and stores it in a variable for later use. Finally, it mounts the ASM disk groups specified in that ASMDG MNP by connecting to the ASM instance using sqlplus.
The stop function executes su to switch to the Oracle software owner user ID. It then dismounts the ASM disk groups specified in that ASMDG MNP by connecting to the ASM instance using sqlplus.
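As an illustration of what the mount and dismount functions send through sqlplus, the sketch below builds one ALTER DISKGROUP statement per disk group. It is an assumption that the toolkit issues statements of exactly this form; the function name diskgroup_sql and the disk group names DATA and FRA are hypothetical, and the sketch does not connect to an ASM instance.

```python
import re

# Conservative pattern for a disk group name (an assumption for this sketch).
_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def diskgroup_sql(names, action):
    """Build ALTER DISKGROUP statements that mount or dismount disk groups.

    `action` must be "MOUNT" or "DISMOUNT". Names are validated before they
    are placed in the SQL text, so arbitrary input cannot be injected.
    """
    if action not in ("MOUNT", "DISMOUNT"):
        raise ValueError("action must be MOUNT or DISMOUNT")
    statements = []
    for name in names:
        if not _NAME.match(name):
            raise ValueError("invalid disk group name: %r" % (name,))
        statements.append("ALTER DISKGROUP %s %s;" % (name, action))
    return "\n".join(statements)


if __name__ == "__main__":
    # Hypothetical disk group names.
    print(diskgroup_sql(["DATA", "FRA"], "MOUNT"))
    print(diskgroup_sql(["DATA", "FRA"], "DISMOUNT"))
```

The resulting text would be fed to sqlplus running as the Oracle software owner, with the environment set for the ASM instance determined in the start function.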
The check function determines the status of the ASM disk groups specified in the ASMDG MNP. When the ASMDG MNP is in maintenance mode, ASM disk group status checking is paused. Otherwise, in a continuous loop driven by a configurable timer, the check function monitors the status of those disk groups. If one or more ASM disk groups are in a dismounted state, the check function will report