IBM 750 manual Mutual surveillance, Environmental monitoring functions


Disk drive fault tracking is designed to alert the system administrator to an impending disk drive failure before it affects customer operation.

Mutual surveillance

The Service Processor monitors the operation of the firmware during the boot process, and also monitors the Hypervisor(TM) for termination. The Hypervisor monitors the Service Processor and will perform a reset/reload if it detects the loss of the Service Processor. If the reset/reload does not correct the problem with the Service Processor, the Hypervisor will notify the operating system and the operating system can take appropriate action, including calling for service.
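The escalation path on the Hypervisor side of mutual surveillance can be sketched as a small decision function. This is only an illustrative model of the behavior described above, not IBM firmware: the names `Action` and `surveil_service_processor`, and the idea of representing detection as a boolean heartbeat, are assumptions made for the example.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes of one surveillance check."""
    NONE = "none"
    RESET_RELOAD = "reset_reload"
    NOTIFY_OS = "notify_os"


def surveil_service_processor(heartbeat_ok: bool, reset_attempted: bool) -> Action:
    """Decide what the Hypervisor does after one check of the Service Processor.

    heartbeat_ok:     assumed signal that the Service Processor is still present.
    reset_attempted:  whether a reset/reload has already been tried for this loss.
    """
    if heartbeat_ok:
        # Service Processor is responding; nothing to do.
        return Action.NONE
    if not reset_attempted:
        # First response to a detected loss: reset/reload the Service Processor.
        return Action.RESET_RELOAD
    # The reset/reload did not correct the problem: hand off to the operating
    # system, which can take appropriate action such as calling for service.
    return Action.NOTIFY_OS
```

A caller would invoke this once per surveillance interval and record whether a reset/reload has already been issued for the current loss.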

Environmental monitoring functions

POWER7-based servers include a range of environmental monitoring functions:

Temperature monitoring warns the system administrator of potential environmental-related problems by monitoring the air inlet temperature. When the inlet temperature rises above a warning threshold, the system initiates an orderly shutdown. When the temperature exceeds the critical level or if the temperature remains above the warning level for too long, the system will shut down immediately.

Fan speed is controlled by monitoring actual temperatures on critical components and adjusting accordingly. If internal component temperatures reach critical levels, the system will shut down immediately, regardless of fan speed. When a redundant fan fails, the system calls out the failing fan and continues running. When a nonredundant fan fails, the system shuts down immediately.
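The temperature and fan-failure rules above amount to a small policy table, sketched here in Python. The threshold values and the time limit are placeholders invented for illustration (the announcement does not state them), and the function names are hypothetical.

```python
def thermal_action(inlet_c: float,
                   seconds_above_warning: float,
                   warning_c: float = 35.0,              # assumed value, not from IBM
                   critical_c: float = 45.0,             # assumed value, not from IBM
                   max_warning_seconds: float = 600.0    # assumed value, not from IBM
                   ) -> str:
    """Map an air inlet temperature reading to the action described in the text."""
    if inlet_c > critical_c:
        # Critical level exceeded: shut down immediately.
        return "immediate_shutdown"
    if inlet_c > warning_c:
        if seconds_above_warning > max_warning_seconds:
            # Above the warning level for too long: shut down immediately.
            return "immediate_shutdown"
        # Above the warning threshold: begin an orderly shutdown.
        return "orderly_shutdown"
    return "normal"


def fan_failure_action(fan_is_redundant: bool) -> str:
    """Map a fan failure to the action described in the text."""
    if fan_is_redundant:
        # A redundant fan failed: report it and keep running.
        return "call_out_and_continue"
    # A nonredundant fan failed: shut down immediately.
    return "immediate_shutdown"
```

For example, `thermal_action(38.0, 10.0)` yields `"orderly_shutdown"` under the placeholder thresholds, while the same reading sustained past the assumed time limit yields `"immediate_shutdown"`.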

Availability enhancement functions

The POWER7 family of systems continues to offer and introduce significant enhancements designed to increase system availability.

POWER7 processor functions

As in POWER6, the POWER7 processor has the ability to do processor instruction retry and alternate processor recovery for a number of core-related faults. This significantly reduces exposure to both hard (logic) and soft (transient) errors in the processor core. Soft failures in the processor core are transient (intermittent) errors, often due to cosmic rays or other sources of radiation, and generally are not repeatable. When an error is encountered in the core, the POWER7 processor will first automatically retry the instruction. If the source of the error was truly transient, the instruction will succeed and the system will continue as before. On IBM systems prior to POWER6, this error would have caused a checkstop.

Hard failures are more difficult to handle because they are true logic errors that recur each time the instruction is repeated. Retrying the instruction on the same core will not help because it will continue to fail. As in POWER6, POWER7 processors can, for a number of faults, extract the failing instruction from the faulty core and retry it elsewhere in the system, after which the failing core is dynamically deconfigured and called out for replacement. The entire process is transparent to the partition owning the failing instruction. These systems are designed to avoid a full system outage.
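The two-stage recovery described in this section (retry on the same core, then move to an alternate core) can be modeled in a few lines. This is a software analogy under stated assumptions, not a description of the hardware mechanism: `recover_instruction`, the `execute_on` callback, and the outcome labels are all hypothetical names introduced for the example.

```python
from typing import Callable, List, Optional, Tuple


def recover_instruction(execute_on: Callable[[int], bool],
                        failing_core: int,
                        spare_cores: List[int]
                        ) -> Tuple[str, Optional[int], List[int]]:
    """Model the recovery sequence after an error is detected on failing_core.

    execute_on(core) is an assumed callback that returns True if the
    instruction completes successfully on that core.
    Returns (outcome, core_that_completed, deconfigured_cores).
    """
    # Stage 1: processor instruction retry on the same core.
    # A soft (transient) error is expected to clear here.
    if execute_on(failing_core):
        return ("retry_succeeded", failing_core, [])

    # Stage 2: alternate processor recovery. The fault is hard, so the
    # instruction is retried elsewhere and the faulty core is deconfigured.
    for core in spare_cores:
        if execute_on(core):
            return ("moved_to_alternate", core, [failing_core])

    # No spare core could complete the instruction in this model.
    return ("unrecovered", None, [failing_core])
```

With a callback that fails only on core 0, `recover_instruction(lambda core: core != 0, 0, [1, 2])` moves the work to core 1 and reports core 0 as deconfigured.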

POWER7 single processor checkstopping

As in POWER6, POWER7 provides single processor checkstopping. This significantly reduces the probability of any one processor affecting total system availability.

Partition availability priority

Also available is the ability to assign availability priorities to partitions. If an alternate processor recovery event requires spare processor resources in order to protect a workload, when no other means of obtaining the spare resources is available, the system will determine which partition has the lowest priority and attempt to claim the needed resource. On a properly configured POWER7 processor-

IBM United States Hardware Announcement 110-009

IBM is a registered trademark of International Business Machines Corporation
