2.6.2.1 Single Point-of-Failure Hardware Component Recovery

As described in 2.2.1.2, “Special Network Considerations” on page 12, the HPS Switch network is a resource that must be treated as a single point of failure. Because a node can support only one switch adapter, a failure of that adapter disables the switch network for the node. If the switch network is critical to your operations, it is strongly recommended that you promote such a failure into a node failure.

A critical failure of the switch adapter writes an entry to the AIX error log. Error labels such as HPS_FAULT9_ER and HPS_FAULT3_ER are considered critical, and you can register them with AIX Error Notification so that an action is taken whenever they are logged.

HACMP provides a SMIT screen for setting up an error notification object, which is much easier than the traditional AIX way of adding a template file to the ODM class. Under smit hacmp > RAS Support > Error Notification > Add a Notify Method, you will find the menu for adding these objects to the ODM. An example of the SMIT panel is shown below:

                           Add a Notify Method

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

                                            [Entry Fields]
* Notification Object Name                  [HPS_ER9]
* Persist across system restart?             Yes                    +
  Process ID for use by Notify Method       []                      +#
  Select Error Class                         All                    +
  Select Error Type                          PERM                   +
  Match Alertable errors?                    All                    +
  Select Error Label                        [HPS_FAULT9_ER]         +
  Resource Name                             [All]                   +
  Resource Class                            [All]                   +
  Resource Type                             [All]                   +
* Notify Method                             [/usr/sbin/cluster/utilities/clstop -grsy]

F1=Help        F2=Refresh      F3=Cancel       F4=List
F5=Reset       F6=Command      F7=Edit         F8=Image
F9=Shell       F10=Exit        Enter=Do

Figure 8. Sample Screen for Add a Notification Method
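
For comparison, the traditional ODM route mentioned above might look like the following sketch. It writes a stanza for the errnotify object class that carries the same values as the sample screen, then prints the commands you would run to load and check it. The file path /tmp/hps_er9.add is a hypothetical name chosen for this example, and the sketch assumes that the fields left out of the stanza (error class, resource name, class, and type) behave like the All selections in the panel. Treat it as an illustration, not as the exact object that SMIT creates.

```shell
#!/bin/sh
# Sketch only: build an errnotify stanza that mirrors the SMIT panel values.
# /tmp/hps_er9.add is a hypothetical file name used for illustration.
STANZA_FILE=/tmp/hps_er9.add

# Quoted heredoc delimiter so nothing inside the stanza is expanded by the shell.
cat > "$STANZA_FILE" <<'EOF'
errnotify:
        en_name = "HPS_ER9"
        en_persistenceflg = 1
        en_type = "PERM"
        en_label = "HPS_FAULT9_ER"
        en_method = "/usr/sbin/cluster/utilities/clstop -grsy"
EOF

# The script deliberately does not modify the ODM itself; it only prints
# the commands an administrator would run as root on the AIX node.
echo "Stanza written to $STANZA_FILE"
echo "To add it to the ODM:   odmadd $STANZA_FILE"
echo "To verify the object:   odmget -q \"en_name=HPS_ER9\" errnotify"
```

Because the notify method stops cluster services on the node, it is worth reviewing the stanza before running odmadd on a production node.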

46 IBM Certification Study Guide AIX HACMP
