IBM 710, 730 Memory error correction extensions, Fault monitoring functions, Mutual surveillance


The system cache and memory offer ECC (error checking and correcting) fault-tolerant features. ECC is designed to correct environmentally induced, single-bit, intermittent memory failures and single-bit hard failures. With ECC, the likelihood of memory failures will be reduced. ECC also provides double-bit memory error detection that helps protect data in the event of a double-bit memory failure.

The AIX and IBM i operating systems provide disk drive mirroring and disk drive controller duplexing. The Linux operating system supports disk drive mirroring (RAID 1) through software, while other RAID protection schemes are provided via hardware RAID adapters.

The Journaled File System, also known as JFS or JFS2, helps maintain file system consistency and reduces the likelihood of data loss when the system is abnormally halted due to a power failure. JFS, the recommended file system for 32-bit kernels, now supports extents on the Linux operating system. This feature is designed to reduce or eliminate fragmentation. Its successor, JFS2, is the recommended file system for 64-bit kernels.

With 64-bit addressing, a maximum file system size of 32 TB and maximum file size of 16 TB, JFS2 is highly recommended for systems running the AIX operating system.

Memory error correction extensions

The memory has single-bit-error correction and double-bit-error detection ECC circuitry. The ECC code is also designed such that the failure of any one specific memory module within an ECC word by itself can be corrected absent any other fault.
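To make the idea of single-bit-error correction and double-bit-error detection concrete, the following is a minimal sketch using a toy extended Hamming (8,4) code. It is an illustration only: the actual ECC word width and code used in the memory subsystem are not given in this announcement, the function names `encode` and `decode` are hypothetical, and the sketch does not model the correction of a whole failed memory module.

```python
# Toy SECDED (single-error-correct, double-error-detect) code.
# Assumption: an extended Hamming (8,4) code stands in for the real,
# much wider hardware ECC word, whose layout is not described here.

def encode(nibble):
    """Encode 4 data bits into an 8-bit codeword (list of 0/1)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity; 1..7: Hamming positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4,5,6,7
    code[0] = sum(code[1:]) % 2             # makes the total number of 1s even
    return code

def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of set positions is 0 for a valid word
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treat as a single-bit error at index `syndrome`
        # (index 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

In this sketch, flipping any one bit of `encode(0b1011)` and passing the result to `decode` yields `('corrected', 11)`, while flipping any two bits yields `('uncorrectable', None)`.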

Memory protection features include scrubbing to detect errors, a means to call for the deallocation of memory pages for a pattern of correctable errors detected, and signaling deallocation of a logical memory block when an error occurs that cannot be corrected by the ECC code.
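The call for page deallocation after a pattern of correctable errors can be pictured as a per-page counter with a threshold. The sketch below is hypothetical: the class name `CorrectableErrorTracker`, the threshold of 3, and the simple counting policy are assumptions for illustration, not the firmware's actual criteria.

```python
from collections import Counter

class CorrectableErrorTracker:
    """Hypothetical tracker: request page deallocation after repeated correctable errors."""

    def __init__(self, threshold=3):        # threshold value is an assumption
        self.threshold = threshold
        self.counts = Counter()              # correctable errors seen per page
        self.deallocation_requested = set()  # pages already flagged

    def record_correctable(self, page):
        """Record one corrected error; return True if this call triggers a deallocation request."""
        self.counts[page] += 1
        if (self.counts[page] >= self.threshold
                and page not in self.deallocation_requested):
            self.deallocation_requested.add(page)
            return True
        return False
```

An error that ECC cannot correct would bypass this counting entirely and lead directly to the signaled deallocation of the affected logical memory block, as described above.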

Fault monitoring functions

When a POWER7 processor-based system is initially powered on, BIST (built-in self-test) and POST (power-on self-test) check processor, cache, memory, and associated hardware required for proper booting of the operating system. If a noncritical error is detected or if the errors occur in resources that can be removed from the system configuration, the restarting process is designed to proceed to completion. The errors are logged in the system nonvolatile RAM (NVRAM).

Disk drive fault tracking is designed to alert the system administrator of an impending disk drive failure before it impacts customer operation.

Mutual surveillance

The Service Processor monitors the operation of the firmware during the boot process, and also monitors the Hypervisor™ for termination. The Hypervisor monitors the Service Processor and will perform a reset/reload if it detects the loss of the Service Processor. If the reset/reload does not correct the problem with the Service Processor, the Hypervisor will notify the operating system and the operating system can take appropriate action, including calling for service.
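The Hypervisor side of this surveillance follows a simple escalation sequence: check, reset/reload, check again, then notify. The sketch below models one such pass. All names (`supervise`, `FakeServiceProcessor`, the status strings) are hypothetical and chosen for illustration; the real interfaces between the Hypervisor, the Service Processor, and the operating system are not described in this announcement.

```python
class FakeServiceProcessor:
    """Hypothetical stand-in for a Service Processor, used only to exercise the sketch."""

    def __init__(self, alive, recovers_on_reset):
        self.alive = alive
        self.recovers_on_reset = recovers_on_reset
        self.resets = 0

    def is_alive(self):
        return self.alive

    def reset_reload(self):
        self.resets += 1
        self.alive = self.recovers_on_reset


def supervise(is_alive, reset_reload, notify_os):
    """One surveillance pass: returns 'healthy', 'recovered' or 'escalated'."""
    if is_alive():
        return "healthy"
    reset_reload()                      # first response to a lost Service Processor
    if is_alive():
        return "recovered"
    # Reset/reload did not help: hand the problem to the operating system,
    # which can take further action such as calling for service.
    notify_os("service processor unresponsive after reset/reload")
    return "escalated"
```

Passing the check, reset, and notification steps as callables keeps the escalation order separate from how each step is actually carried out.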

Environmental monitoring functions

POWER7 based servers include a range of environmental monitoring functions:

Temperature monitoring warns the system administrator of potential environmental-related problems by monitoring the air inlet temperature. When the inlet temperature rises above a warning threshold, the system initiates an orderly shutdown. When the temperature exceeds the critical level or if the temperature remains above the warning level for too long, the system will shut down immediately.
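The temperature rules above amount to a small decision function over the inlet temperature and the time spent above the warning level. The following sketch restates them in code. The numeric thresholds, the `Thresholds` class, and the function name `thermal_action` are assumptions for illustration; the announcement does not state the actual values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Thresholds:
    # All three values are hypothetical placeholders, not published limits.
    warning_c: float = 35.0
    critical_c: float = 45.0
    max_warning_seconds: float = 600.0

def thermal_action(inlet_c, seconds_above_warning, t=Thresholds()):
    """Map an inlet temperature reading to 'normal', 'orderly_shutdown' or 'immediate_shutdown'."""
    if inlet_c > t.critical_c:
        return "immediate_shutdown"         # critical level exceeded
    if inlet_c > t.warning_c:
        if seconds_above_warning > t.max_warning_seconds:
            return "immediate_shutdown"     # above warning level for too long
        return "orderly_shutdown"           # above warning level
    return "normal"
```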

IBM Europe, Middle East, and Africa Hardware

IBM is a registered trademark of International Business Machines Corporation

Announcement ZG10-0214

 
