IBM 755 manual Serviceability

PCI extended error handling

Adapters enabled for PCI extended error handling (EEH) respond to a special data packet generated by the affected PCI slot hardware by calling system firmware. The firmware examines the affected bus, allows the device driver to reset it, and lets operation continue without a system reboot. For Linux, EEH support extends to the majority of frequently used devices, although some third-party PCI devices may not provide native EEH support.
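
The recovery sequence described above can be pictured as a small state machine. The following Python sketch is illustrative only: the state names, the `recover_slot` function, and the retry limit are hypothetical and do not come from the firmware or any operating system driver interface.

```python
from enum import Enum, auto


class SlotState(Enum):
    # Hypothetical state names chosen for this illustration.
    NORMAL = auto()
    ERROR_REPORTED = auto()   # slot hardware has signalled an error
    BUS_EXAMINED = auto()     # firmware has examined the affected bus
    RESETTING = auto()        # device driver is resetting the bus
    FAILED = auto()           # assumed outcome when every reset fails


def recover_slot(reset_succeeds_on, max_resets=3):
    """Return the list of states a slot passes through during recovery.

    reset_succeeds_on: 1-based attempt on which the reset works,
                       or None if it never works.
    max_resets:        assumed retry limit (not stated in the source).
    """
    trace = [SlotState.NORMAL, SlotState.ERROR_REPORTED, SlotState.BUS_EXAMINED]
    for attempt in range(1, max_resets + 1):
        trace.append(SlotState.RESETTING)
        if reset_succeeds_on is not None and attempt >= reset_succeeds_on:
            # Operation continues without a system reboot.
            trace.append(SlotState.NORMAL)
            return trace
    trace.append(SlotState.FAILED)
    return trace
```

Calling `recover_slot(1)` models the common case in which a single reset returns the slot to normal operation.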

Predictive failure and dynamic component deallocation

Servers with POWER processors have long been able to perform predictive failure analysis on certain critical components such as processors and memory. When these components exhibit symptoms indicating that a failure is imminent, the system can dynamically deallocate the failing part and call home about it before the error propagates system-wide. In many cases, the system first attempts to reallocate resources in a way that avoids unplanned outages. If insufficient resources exist to maintain full system availability, these servers attempt to maintain partition availability according to user-defined priority.
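
As a rough illustration of maintaining partition availability by priority, the Python sketch below keeps the highest-priority partitions that still fit in the resources remaining after a deallocation. It is not the firmware's algorithm: the `Partition` type, the `partitions_to_keep` function, the use of processor cores as the only resource, and the convention that a higher number means a higher priority are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    priority: int  # assumed convention: higher value = more important
    cores: int     # processor cores the partition needs to keep running


def partitions_to_keep(partitions, available_cores):
    """Greedily keep partitions in descending priority order.

    A partition is kept only if its full core requirement still fits
    in the cores that remain after a failing part is deallocated.
    """
    kept = []
    remaining = available_cores
    for p in sorted(partitions, key=lambda p: p.priority, reverse=True):
        if p.cores <= remaining:
            kept.append(p.name)
            remaining -= p.cores
    return kept


parts = [
    Partition("db", priority=200, cores=4),
    Partition("web", priority=100, cores=4),
    Partition("test", priority=10, cores=2),
]
print(partitions_to_keep(parts, available_cores=6))
```

With six cores left, the sketch keeps `db` first, skips `web` because it no longer fits, and then keeps `test` in the remaining two cores.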

Uncorrectable error recovery

When the auto-restart option is enabled, the system can automatically restart following an unrecoverable software error, a hardware failure, or an environmentally induced (AC power) failure.

Serviceability

The purpose of serviceability is to repair the system while minimizing or eliminating service cost (within budget objectives) and maintaining high customer satisfaction. Serviceability includes system installation, MES (system upgrades/downgrades), and system maintenance/repair. Depending upon the system and warranty contract, service may be performed by the customer, an IBM representative, or an authorized warranty service provider.

The serviceability features delivered in this system provide a highly efficient service environment by incorporating the following attributes:

Design for Customer Set Up (CSU), Customer Installed Features (CIF), and Customer Replaceable Units (CRU)

Error Detection and Fault Isolation (ED/FI)

First Failure Data Capture (FFDC)

Converged service approach across multiple IBM server platforms

Service environments

The HMC is a dedicated server that provides functions for configuring and managing servers, in either partitioned or full-system-partition mode, using a GUI or command-line interface (CLI). An HMC attached to the system allows support personnel (with client authorization) to log in remotely to review error logs and perform remote maintenance if required.

The POWER7 processor-based platforms support two main service environments:

Attachment to one or more HMCs. This is the default configuration for servers supporting logical partitions with dedicated or virtual I/O. In this case, all servers have at least one logical partition.

No HMC. The server runs as a full system partition: a single partition owns all the server resources and only one operating system may be installed.

IBM United States Hardware Announcement 110-008

IBM is a registered trademark of International Business Machines Corporation
