failure and there are no other nodes on the local site that it can run on with package switching enabled. The workload packages can be halted and restarted using the cmhaltpkg and cmrunpkg commands when the Site Controller Package is running. The Site Controller Package is not affected when the workload packages are administratively halted using the cmhaltpkg command.
Site Failover
The Site Controller Package initiates a site failover when the site is lost or when the complex workload has failed. To perform the failover, the Site Controller Package first fails itself over to a node in the remote site. Before preparing the replicated storage, it ensures that all the packages in the failed site have halted cleanly. Then, on the node in the remote site, it prepares the replicated storage and starts the packages of the complex workload’s redundant configuration.
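The order of these steps can be sketched in Python. This is only an illustrative model, not Serviceguard code: the function `site_failover` and its callback parameters are hypothetical names, and stopping when a package has not halted cleanly is an assumption, since the text above only states that the clean-halt check precedes storage preparation.

```python
from typing import Callable, Iterable, List


def site_failover(
    failed_site_packages: Iterable[str],
    is_halted_clean: Callable[[str], bool],
    prepare_storage: Callable[[], None],
    start_remote_packages: Callable[[], None],
) -> List[str]:
    """Model the documented step order and return a log of the steps taken."""
    # Step 1: the Site Controller Package itself moves to a remote-site node.
    steps = ["site controller failed over to remote site"]

    # Step 2: every package in the failed site must have halted cleanly
    # before the replicated storage is prepared.
    not_clean = [p for p in failed_site_packages if not is_halted_clean(p)]
    if not_clean:
        # Assumption: this sketch simply stops; the real behavior is not
        # described in this passage.
        steps.append("stopped: not halted cleanly: " + ", ".join(not_clean))
        return steps

    # Step 3: prepare the replicated storage on the remote site.
    prepare_storage()
    steps.append("replicated storage prepared")

    # Step 4: start the packages of the redundant configuration.
    start_remote_packages()
    steps.append("remote workload packages started")
    return steps
```

Passing the checks and actions as callbacks keeps the sketch free of any real cluster commands; only the ordering is modeled.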
An MNP package that is down is considered to have halted cleanly only if all its instances have run their halt scripts successfully. A failover package is considered to have halted cleanly only if it has successfully executed the halt script on the node where it last went down.
When an MNP package instance has not halted cleanly, Serviceguard will not allow the corresponding node to be removed. To remove the node from the cluster, clean up any resource of the instance that may still be online on the node, and enable the package's node switching flag for that node.
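The two clean-halt rules above can be expressed as small predicates. The following Python sketch is illustrative only: the function names and the dictionary of per-node halt results are hypothetical, and treating a package with no recorded halt result as not cleanly halted is a conservative assumption, not something the text states.

```python
from typing import Dict, Optional


def mnp_halted_clean(halt_ok_by_node: Dict[str, bool]) -> bool:
    """MNP package: clean only if every instance ran its halt script successfully."""
    # Assumption: with no recorded instances, report "not clean".
    return bool(halt_ok_by_node) and all(halt_ok_by_node.values())


def failover_halted_clean(
    last_node: Optional[str], halt_ok_by_node: Dict[str, bool]
) -> bool:
    """Failover package: clean only if the halt script succeeded on the
    node where the package last went down."""
    # Assumption: an unknown last node or a missing result means "not clean".
    return last_node is not None and halt_ok_by_node.get(last_node, False)
```

For example, an MNP package whose halt script failed on one of two nodes is not considered cleanly halted, even though the other instance halted successfully.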
The following sample shows how a typical disaster tolerant RAC database is configured in its Site Controller Package configuration file:
site san_francisco
critical_package sfo_app
critical_package sfo_hrdb
managed_package sfo_hrdb_mp
managed_package sfo_hrdb_dg

site san_jose
critical_package sjc_app
critical_package sjc_hrdb
managed_package sjc_hrdb_mp
managed_package sjc_hrdb_dg
In this example, the Site Controller Package initiates and performs a site failover to the san_jose site when either of the packages configured as a critical_package on the san_francisco site, sfo_app or sfo_hrdb, has failed and halted cleanly in the cluster.
The following example shows a Site Controller Package configuration file in which all the packages in the workload are configured using the managed_package attribute:
site san_francisco
managed_package sfo_app
managed_package sfo_hrdb
managed_package sfo_hrdb_mp
managed_package sfo_hrdb_dg

site san_jose
managed_package sjc_app
managed_package sjc_hrdb
managed_package sjc_hrdb_mp
managed_package sjc_hrdb_dg
In this example, the Site Controller Package initiates and performs a site failover to the san_jose site only when all the managed packages configured on the san_francisco site (sfo_app, sfo_hrdb, sfo_hrdb_mp, and sfo_hrdb_dg) have failed and halted cleanly in the cluster.
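The difference between the two sample configurations can be summarized as a trigger rule: with critical packages configured, any one of them failing and halting cleanly triggers the failover; with only managed packages configured, all of them must have failed and halted cleanly. The Python sketch below models just that rule as described in the two examples. The names `SiteWorkload`, `PackageState`, and `should_fail_over` are hypothetical, and the sketch does not cover site loss or any other condition.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PackageState:
    failed: bool
    halted_clean: bool


@dataclass
class SiteWorkload:
    critical: List[str] = field(default_factory=list)
    managed: List[str] = field(default_factory=list)


def should_fail_over(site: SiteWorkload, states: Dict[str, PackageState]) -> bool:
    """Return True when the documented trigger condition is met."""

    def down_clean(name: str) -> bool:
        state = states[name]
        return state.failed and state.halted_clean

    if site.critical:
        # Any one critical package failing and halting cleanly is enough.
        return any(down_clean(p) for p in site.critical)
    # No critical packages: every managed package must be down and clean.
    return bool(site.managed) and all(down_clean(p) for p in site.managed)


# The san_francisco stanzas from the two samples above.
with_critical = SiteWorkload(
    critical=["sfo_app", "sfo_hrdb"],
    managed=["sfo_hrdb_mp", "sfo_hrdb_dg"],
)
managed_only = SiteWorkload(
    managed=["sfo_app", "sfo_hrdb", "sfo_hrdb_mp", "sfo_hrdb_dg"],
)
```

With only sfo_app failed and cleanly halted, `should_fail_over` returns True for `with_critical` but False for `managed_only`, which mirrors the contrast between the two configurations.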