Slurm Monitor                 1-Ok   0-Warn   0-Crit   0-Pend   0-Unk
Slurm Status                 10-Ok   0-Warn   0-Crit   0-Pend   0-Unk
Supermon Metrics Monitor      1-Ok   0-Warn   0-Crit   0-Pend   0-Unk
Switch                        2-Ok   0-Warn   0-Crit   0-Pend   0-Unk
Switch Data Collection        1-Ok   0-Warn   0-Crit   0-Pend   0-Unk
Syslog Alert Monitor          1-Ok   0-Warn   0-Crit   0-Pend   0-Unk
Syslog Alerts                10-Ok   0-Warn   0-Crit   0-Pend   0-Unk
System Event Log              9-Ok   1-Warn   0-Crit   0-Pend   0-Unk
System Event Log Monitor      1-Ok   0-Warn   0-Crit   0-Pend   0-Unk
System Free Space            10-Ok   0-Warn   0-Crit   0-Pend   0-Unk

Totals:                     115-Ok   1-Warn   0-Crit   0-Pend   0-Unk

If one or more warnings are reported, use the analyze option to obtain an analysis of the problem. When possible, the command output provides potential corrective action or the reasons for a given state. For example:

# nrg --mode analyze

Nodelist  Description
-----------------------------------------------------------------------------
nh        [System Event Log - NOSUCHHOST] The check_sel plug-in failed
          to find the console port for this node, a common cause is the
          console device cp-xxxxx, is not reachable. If this is the
          head node and the head node is externally connected, you may
          be able to define cp-xxxxx in /etc/hosts using the external
          IP to allow connectivity. Sensor collection may not be
          possible when using externally connected console ports for
          head nodes on platforms that use IPMI to gather sensor
          information. If this is not the head node then it may
          indicate a communication problem with the associated console
          device 'cp-{nodename}'.
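
The check-then-analyze workflow described above can be scripted. The following is a minimal sketch, not part of the HP XC software: the function name needs_analysis is hypothetical, and the sketch assumes that the status summary prints each service on one row, with a Totals: row in the form N-Ok N-Warn N-Crit N-Pend N-Unk. It reads the summary from standard input, so it makes no assumption about which command produced it.

```shell
#!/bin/sh
# Sketch only: decide whether a Nagios status summary warrants running
# "nrg --mode analyze".
# Assumption: the summary has a row such as
#   Totals:   115-Ok   1-Warn   0-Crit   0-Pend   0-Unk

# needs_analysis (hypothetical helper): reads the summary on stdin and
# returns 0 when the Totals row has a non-zero Warn or Crit count,
# 1 otherwise.
needs_analysis() {
    awk '
        $1 == "Totals:" {
            for (i = 2; i <= NF; i++) {
                # Each field looks like COUNT-STATE, for example 1-Warn.
                split($i, part, "-")
                if ((part[2] == "Warn" || part[2] == "Crit") && part[1] + 0 > 0)
                    found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example using the Totals row shown earlier in this section.
if printf 'Totals: 115-Ok 1-Warn 0-Crit 0-Pend 0-Unk\n' | needs_analysis; then
    echo "Warnings or critical states reported; run: nrg --mode analyze"
fi
```

On a live system, pipe the saved summary output into needs_analysis in place of the printf sample.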

5.14 Creating a Baseline Copy of the Database

After you run the OVP to verify the successful installation and configuration of the system, HP recommends that you take a snapshot of the configuration and management database to create a baseline version. You can use a baseline copy of the database to restore the database to its original state.

Enter the following command to back up the configuration and management database to a file. If you do not specify a directory, the default location for the backup file is the /var/hptc/database directory. Consider adding a date and time stamp to the file name so that you can tell at a glance when the backup file was created. For example:

# managedb backup your_filename
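
One way to add the date and time stamp is to build the file name with the date command. This is a minimal sketch: the base name baseline_db and the time stamp layout are illustrative assumptions, and only the managedb backup syntax comes from this guide.

```shell
#!/bin/sh
# Sketch only: build a time-stamped backup file name.
# "baseline_db" is a hypothetical base name; choose your own.
backup_name="baseline_db_$(date +%Y%m%d_%H%M%S)"
echo "Backup file name: $backup_name"

# Run the backup only where managedb is available (the head node).
if command -v managedb >/dev/null 2>&1; then
    managedb backup "$backup_name"
else
    echo "managedb not found; run this script on the HP XC head node" >&2
fi
```

The year-first stamp makes backup files sort chronologically in a directory listing.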

Depending on your corporate security policy for managing system and database backups, consider storing the database backup file on a remote system as an additional precaution.

The HP XC System Software Administration Guide provides information about basic database management commands. For more information about managing the configuration and management database, see the MySQL Reference Manual, which is available at the following website:

http://dev.mysql.com/

5.15 Creating a Baseline Report of the System Configuration

The sys_check utility is a data collection tool you can use to diagnose system errors and problems. Use the sys_check utility now to create a baseline report of the system configuration (software and hardware).

The sys_check utility collects configuration data only for the node on which it is run. To collect configuration data for all nodes in the HP XC system, set and export the SYS_CHECK_SYSWIDE variable before you run the utility.

Use the following commands to run the sys_check utility in its simplest form:
