HP EMS HARDWARE S B6191-90028 manual Tips for Hardware Monitoring

Page 16

Introduction

Hardware Monitoring Overview

Tips for Hardware Monitoring

Here are some tips for using hardware monitoring.

Keep hardware monitoring enabled to protect your system from undetected failures. Hardware monitoring is an important tool for maintaining high availability on your system. In a high-availability environment, the failure of a hardware resource leaves the system vulnerable to a second failure: until the failed hardware is repaired, the backup hardware resource is a single point of failure. Without hardware monitoring, you may not be aware of the failure. With hardware monitoring, you are alerted to it, so you can repair the failure and restore high availability as quickly as possible.

Integrate the peripheral status monitor (PSM) into your MC/ServiceGuard strategy. An important feature of hardware monitoring is its ability to communicate with applications responsible for maintaining system availability, such as MC/ServiceGuard. The PSM lets you integrate hardware monitoring into MC/ServiceGuard by failing over a package based on an event detected by hardware monitoring. If you are using MC/ServiceGuard, consider using the PSM to include your system hardware resources in the MC/ServiceGuard strategy. In addition, hardware monitoring provides the notification methods needed to communicate with network management applications such as HP OpenView.

Use the many notification methods available. The notification methods in hardware monitoring give you a great deal of flexibility in designing a strategy to keep you informed of how well your system hardware is working. The default monitoring configuration was selected to provide a variety of notifications for all supported hardware resources. As you become familiar with hardware monitoring, you may want to customize the monitoring to meet your individual requirements.

Use email and/or textfile notification methods for all your requests. Both of these methods, which are included in the default monitoring, receive the entire content of the message so you can read it immediately. Methods such as console and syslog alert you to the occurrence of an event but do not deliver the entire message; you must then retrieve the message with the resdata utility, which is an additional step.
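The distinction between the two groups of notification methods can be written down as a small check. This is only an illustrative sketch: the method names come from the text above, but the function and the way a request's methods are represented (a plain list of strings) are assumptions for illustration, not part of the hardware monitoring tools.

```python
# Methods the text says receive the entire content of the message.
FULL_MESSAGE_METHODS = {"email", "textfile"}


def delivers_full_message(methods):
    """Return True if at least one method in `methods` carries the whole message.

    `methods` is a hypothetical list of notification method names for one
    monitoring request. If this returns False, the message would have to be
    retrieved afterwards with the resdata utility.
    """
    normalized = {m.strip().lower() for m in methods}
    return bool(normalized & FULL_MESSAGE_METHODS)


if __name__ == "__main__":
    for request in (["console", "syslog"], ["syslog", "email"]):
        if delivers_full_message(request):
            print(request, "-> full message delivered")
        else:
            print(request, "-> alert only; retrieve the message with resdata")
```

A check like this could be run over a planned set of requests to spot any that rely only on console or syslog.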

Use the 'All monitors' option when creating a monitoring request. This option applies the monitoring request to all monitors, which ensures that a new class of hardware resource added to your system is monitored automatically. New hardware is therefore protected from undetected failure with no effort on your part.

Easily replicate your hardware monitoring on all your systems. Once you have implemented a hardware monitoring strategy on one of your systems, you can replicate that same monitoring on other systems. Simply copy all of the hardware monitor configuration files to each system that will use the same monitoring. The monitor configuration files are located in /var/stm/config/tools/monitor. You must install hardware event monitoring on each system before you copy the configuration files to it. Be sure to enable monitoring on all systems.
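The copy step can be planned with a short script. The following is a minimal sketch, not a procedure from this manual: the manual does not name a copy tool, so the use of scp (with its standard -r and -p options) is an assumption, and the host names are hypothetical. The script only builds the commands so they can be reviewed before anything is run.

```python
from pathlib import PurePosixPath

# Directory named in the text as holding the monitor configuration files.
MONITOR_CONFIG_DIR = PurePosixPath("/var/stm/config/tools/monitor")


def build_copy_commands(hosts, copy_tool="scp"):
    """Return one argument list per target host.

    Each command copies the whole configuration directory into the same
    parent directory on the target, so the files end up at the same path.
    The copy tool and its flags are assumptions; substitute whatever your
    site uses.
    """
    source = str(MONITOR_CONFIG_DIR)
    remote_parent = str(MONITOR_CONFIG_DIR.parent)
    return [
        [copy_tool, "-rp", source, f"{host}:{remote_parent}"]
        for host in hosts
    ]


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for command in build_copy_commands(["hostb", "hostc"]):
        print(" ".join(command))
```

As the text notes, hardware event monitoring must already be installed on each target system before the files are copied, and monitoring must be enabled afterwards.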

