Appendix - Fault Isolation

deleted.

MTBF    Below MTBF Threshold                  The CRU/FRU’s rate of transient and hard failures became too great.
NOPWR   No Power                              The CRU/FRU lost power.
OVERRD  Cabinet Fan Speed Override Active     The fan override (setting fans to full power from the normal 70%) was activated.
PC Hi   Power Controller Over Voltage         An over-voltage condition was detected by the power controller.
PCIOPN  PCI Card Bay Door Open                The PCI card-bay door is open.
PCLOW   Power Controller Under Voltage        An under-voltage condition was detected by the power controller.
PCVOTE  Power Controller Voter Fault          A voter fault was detected by the power controller.
PSBAD   Invalid Power Supply Type             The power supply ID bits do not match that of any supported unit.
PSU OK  Cabinet Power Supply Unit(s) OK       The cabinet power supply unit(s) are OK.
PSUs    Multiple Power Supply Unit Faults     Multiple power supply units faulted in a cabinet.
PWR     Breaker Tripped                       The circuit breaker for the PCIB power supply tripped.
REGDIF  ACU Registers Differ                  A comparison of the registers on both ACUs showed a difference.
SOFT    Soft Error                            The driver reported a transient error. A transient error occurs when a hardware fault is detected, but the problem is corrected by the system. Look at the syslog for related error messages.
SPD OK  Cabinet Fan Speed Override Completed  The cabinet-fan speed override completed.
SPR OK  Cabinet Spare (PCU) OK                The cabinet spare (PCU) is OK.
SPRPCU  Cabinet Spare (PCU) Fault             The power control unit spare line faulted.
TEMPOK  Cabinet Temperature OK                The cabinet temperature is OK.
USER    User Reported Error                   A user issued ftsmaint disable to disable the hardware device.

A.6 Troubleshooting Procedures

When a fault occurs, the outcome depends on its severity. After a non-critical fault, the system continues to process data. After a critical fault, the system (or a subsystem) may be inoperative. When troubleshooting the system, first identify the fault using the LEDs and screen messages. Then verify that the component is out of service by using the ftsmaint ls command (PA-8500 Continuum Series 400 Technical Service Guide, HP-UX systems, Section 2.6.3), the error logs, and diagnostic tests. This flow is shown in the Troubleshooting Flow below.
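When identifying a fault from a screen message, it can help to decode the status code against the fault-isolation table in this appendix. The following is a minimal sketch in Python, not part of the system software: the names FAULT_CODES, STATUS_ONLY, and decode are hypothetical, the dictionary covers only the codes listed on this page of the table, and the split between "fault" and "status" codes is an assumption based on which entries report an OK or completed state.

```python
# Codes and meanings transcribed from the fault-isolation table in this
# appendix. Only the codes on this page are included (assumption: the
# full table contains more entries).
FAULT_CODES = {
    "MTBF": "Below MTBF Threshold",
    "NOPWR": "No Power",
    "OVERRD": "Cabinet Fan Speed Override Active",
    "PC Hi": "Power Controller Over Voltage",
    "PCIOPN": "PCI Card Bay Door Open",
    "PCLOW": "Power Controller Under Voltage",
    "PCVOTE": "Power Controller Voter Fault",
    "PSBAD": "Invalid Power Supply Type",
    "PSU OK": "Cabinet Power Supply Unit(s) OK",
    "PSUs": "Multiple Power Supply Unit Faults",
    "PWR": "Breaker Tripped",
    "REGDIF": "ACU Registers Differ",
    "SOFT": "Soft Error",
    "SPD OK": "Cabinet Fan Speed Override Completed",
    "SPR OK": "Cabinet Spare (PCU) OK",
    "SPRPCU": "Cabinet Spare (PCU) Fault",
    "TEMPOK": "Cabinet Temperature OK",
    "USER": "User Reported Error",
}

# Hypothetical grouping: codes whose table entry reports a healthy or
# completed state rather than a fault.
STATUS_ONLY = {"PSU OK", "SPD OK", "SPR OK", "TEMPOK"}


def decode(code: str) -> str:
    """Return a one-line explanation for a status code string."""
    # Collapse runs of whitespace so " PSU  OK " matches "PSU OK".
    key = " ".join(code.split())
    meaning = FAULT_CODES.get(key)
    if meaning is None:
        return f"{key}: not listed on this page of the table"
    kind = "status" if key in STATUS_ONLY else "fault"
    return f"{key} ({kind}): {meaning}"


if __name__ == "__main__":
    for seen in ("PCLOW", "TEMPOK", "SOFT"):
        print(decode(seen))
```

A decoded code only names the condition. Whether the component is actually out of service still has to be confirmed with ftsmaint ls, the error logs, and diagnostic tests, as described above.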

Lucent Technologies PA-8500 manual Troubleshooting Procedures