200 IBM Certification Study Guide AIX HACMP
handling membership and event management by using heartbeats. On the
SP, the original High Availability infrastructure was built on this technology,
and HACMP/ES Version 4.3 is now another product that relies on it. As of AIX
4.3.2 and PSSP 3.1, the High Availability infrastructure, which previously was
tightly coupled to PSSP, was externalized into a package called RISC System
Cluster Technology (RSCT). This package can be installed and run not only
on SP nodes but also on regular RS/6000 systems, which makes HACMP/ES
available on non-SP RS/6000s as of Version 4.3.
10.2.1 IBM RISC System Cluster Technology (RSCT)
The High Availability services previously packaged with the IBM PSSP for
AIX Availability Services, also known as the ssp.ha fileset, are now an
integral part of the HACMP/ES software. The IBM RS/6000 Cluster
Technology (RSCT) services provide greater scalability, notify distributed
subsystems of software failure, and coordinate recovery and synchronization
among all subsystems in the software stack.
Packaging these services with HACMP/ES makes it possible to run this
software on all RS/6000s, not just on SP nodes.
RSCT Services include the following components:
Event Manager      A distributed subsystem providing a set of high
                   availability services. It creates events by matching
                   information about the state of system resources with
                   information about resource conditions of interest to
                   client programs. Client programs, in turn, can use event
                   notifications to trigger recovery from system failures.
Group Services     A system-wide, fault-tolerant, and highly available
                   facility for coordinating and monitoring changes to the
                   state of an application running on a set of nodes. Group
                   Services helps both in the design and implementation of
                   fault-tolerant applications and in the consistent recovery
                   of multiple applications. It accomplishes these two
                   distinct tasks in an integrated framework.
Topology Services  A facility for generating heartbeats over multiple
                   networks and for providing information about adapter
                   membership, node membership, and routing. Adapter
                   and node membership provide indications of adapter
                   and node failures, respectively. Reliable Messaging uses
                   the routing information to route messages between
                   nodes around adapter failures.
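
The way the components above fit together can be illustrated with a small
model. The following Python sketch is not RSCT code and does not use any RSCT
interface. It only shows the general idea of deriving adapter and node
membership from heartbeats and then generating an event when the observed
state matches a condition that a client registered. The class names
(HeartbeatMonitor, EventMatcher), the node and adapter names, the heartbeat
interval, and the missed-heartbeat limit are all assumptions made for this
illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Illustrative values only; these are not RSCT defaults.
HEARTBEAT_INTERVAL = 1.0   # seconds between heartbeats
MISSED_LIMIT = 4           # missed heartbeats before an adapter counts as down


@dataclass
class HeartbeatMonitor:
    """Hypothetical model: last heartbeat time per (node, adapter) pair."""
    last_seen: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def record(self, node: str, adapter: str, now: float) -> None:
        self.last_seen[(node, adapter)] = now

    def adapter_up(self, node: str, adapter: str, now: float) -> bool:
        seen = self.last_seen.get((node, adapter))
        if seen is None:
            return False
        return now - seen <= HEARTBEAT_INTERVAL * MISSED_LIMIT

    def node_up(self, node: str, now: float) -> bool:
        # A node remains a member while any one of its adapters is still heard.
        return any(
            self.adapter_up(n, a, now) for (n, a) in self.last_seen if n == node
        )


@dataclass
class EventMatcher:
    """Hypothetical model: match observed state against client conditions."""
    conditions: List[Tuple[str, Callable[[bool], bool], Callable[[str], None]]] = field(
        default_factory=list
    )

    def register(
        self,
        resource: str,
        predicate: Callable[[bool], bool],
        callback: Callable[[str], None],
    ) -> None:
        self.conditions.append((resource, predicate, callback))

    def update(self, resource: str, value: bool) -> None:
        # Notify every client whose condition on this resource now holds.
        for name, predicate, callback in self.conditions:
            if name == resource and predicate(value):
                callback(resource)


# Two adapters on one node; en0 stops sending heartbeats after t=0.
monitor = HeartbeatMonitor()
monitor.record("nodeA", "en0", now=0.0)
monitor.record("nodeA", "tr0", now=0.0)
monitor.record("nodeA", "tr0", now=5.0)

# A client asks to be notified when nodeA is no longer a member.
events: List[str] = []
matcher = EventMatcher()
matcher.register("nodeA", lambda up: not up, events.append)

# At t=6 only en0 has timed out, so the node is still up and no event fires.
matcher.update("nodeA", monitor.node_up("nodeA", now=6.0))
# At t=12 both adapters have timed out, so the client is notified.
matcher.update("nodeA", monitor.node_up("nodeA", now=12.0))
```

In this model, losing one adapter changes adapter membership only. The node
is reported as failed, and the client notified, only once heartbeats have
stopped on every network. This is the reason for sending heartbeats over
multiple networks.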