IBM P5 570 manual Serviceability, Error indication and LED indicators


If the output shows CPU Guard as disabled, enter the following command to enable it:

chdev -l sys0 -a cpuguard='enable'
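The check-then-enable step can be sketched as a small shell function. This is a minimal sketch, not a supported procedure: it assumes AIX, root authority, and that the cpuguard attribute reports the values enable or disable. The function name ensure_cpuguard is hypothetical.

```shell
# Sketch only: read the CPU Guard attribute of sys0 and enable it if needed.
# Assumes AIX with root authority; ensure_cpuguard is a hypothetical name.
ensure_cpuguard() {
    # -F value prints only the attribute's current value
    state=$(lsattr -El sys0 -a cpuguard -F value) || return 1
    if [ "$state" = "enable" ]; then
        echo "cpuguard already enabled"
        return 0
    fi
    echo "cpuguard is '$state'; enabling"
    chdev -l sys0 -a cpuguard=enable
}
```

Calling ensure_cpuguard from a root shell runs chdev only when the attribute is not already set to enable.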

Cache or cache-line deallocation is aimed at performing dynamic reconfiguration to bypass potentially failing components. This capability is provided for both L2 and L3 caches. Dynamic run-time deconfiguration is provided if a threshold of L1 or L2 recovered errors is exceeded.

In case of an L3 cache run-time array single-bit solid error, the spare chip resources are used to perform an L3 cache line delete on the failing line.

PCI hot-plug slot fault tracking helps prevent slot errors from causing a system machine check interrupt and subsequent reboot. This provides superior fault isolation, and the error affects only the single adapter. Run-time errors on the PCI bus that are caused by failing adapters will result in recovery action. If this is unsuccessful, the PCI device will be gracefully shut down. Parity errors on the PCI bus itself will result in bus retry and, if uncorrected, the bus and any I/O adapters or devices on that bus will be deconfigured.

The p5-570 supports PCI Extended Error Handling (EEH) if it is supported by the PCI-X adapter. In the past, PCI bus parity errors caused a global machine check interrupt, which eventually required a system reboot in order to continue. In the p5-570 system, hardware, system firmware, and AIX interaction has been designed to allow transparent recovery of intermittent PCI bus parity errors and graceful transition to the I/O device available state in the case of a permanent parity error in the PCI bus.

EEH-enabled adapters respond to a special data packet that is generated from the affected PCI slot hardware by calling system firmware, which examines the affected bus, allows the device driver to reset it, and continues without a system reboot.

Persistent deallocation functions include:

- Processor

- Memory

- Deconfigure or bypass failing I/O adapters

- L3 cache

Following a hardware error that has been flagged by the service processor, the subsequent reboot of the system invokes extended diagnostics. If a processor or L3 cache has been marked for deconfiguration by persistent processor deallocation, the boot process will attempt to proceed to completion with the faulty device automatically deconfigured. Failing I/O adapters will be deconfigured or bypassed during the boot process.
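After such a reboot, an administrator may want to confirm how many processors AIX actually configured. The following is a minimal sketch, assuming AIX and the usual lsdev column layout (device name, state, location, description). The function name count_available_procs is hypothetical, and comparing the count against the expected number of processors is left to the operator.

```shell
# Sketch only: count the processors that AIX reports in the Available state.
# Assumes AIX; count_available_procs is a hypothetical name.
count_available_procs() {
    # Column 2 of "lsdev -Cc processor" is the device state
    lsdev -Cc processor | awk '$2 == "Available" { n++ } END { print n + 0 }'
}
```

A count lower than the installed number of processors suggests that a processor was not configured at boot and that the error log is worth reviewing.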

Note: The auto-restart (reboot) option, when enabled, can reboot the system automatically following an unrecoverable software error, software hang, hardware failure, or environmentally induced failure (such as loss of power supply).

3.2.8 Serviceability

Increasing service productivity keeps the system up and running for a longer time. The p5-570 improves service productivity by providing the following functions.

Error indication and LED indicators

The p5-570 is designed to be installed by an IBM service representative. The addition of most hardware features after installation is a customer setup task. To help the customer and the IBM service representative, the p5-570 provides internal LED diagnostics that identify parts that require service. Indication of an error is provided through a series of light attention signals,

Chapter 3. Capacity on Demand, RAS, and manageability 59


P5 570 specifications

The IBM p5-570 is a server designed for enterprise-scale computing. Part of IBM's POWER5-based p5 server line, it combines strong processing capability with a modular, scalable design for businesses that need reliable and efficient computing.

At the heart of the p5-570 is the IBM POWER5 processor, which supports simultaneous multi-threading (SMT). With SMT, each processor core runs two hardware threads at the same time, which improves throughput for workloads that are well suited to multi-threading, although it does not double it. The server scales up to a 16-way POWER5 configuration, providing compute power for demanding applications ranging from databases to complex enterprise resource planning (ERP) systems.

The p5-570 architecture supports a wide range of memory configurations, with a maximum memory capacity of 512 GB. Memory is protected by ECC and IBM's Chipkill technology, which improves tolerance of memory-chip failures. The memory controller is integrated into the POWER5 chip, which helps keep memory latency low for memory-intensive applications.

Scalability is a key characteristic of the P5 570, with the ability to expand processing power and memory capacity as an organization’s needs grow. The server supports various operating systems, including AIX, Linux, and IBM i, which provides flexibility for diverse IT environments. This versatility ensures that companies can run their preferred applications without the need for substantial system overhauls.

In terms of storage, the P5 570 utilizes advanced RAID technology and supports a variety of disk configurations, ensuring that data integrity and availability are maintained. Coupled with built-in security features, such as the IBM Trusted Foundation, which establishes a secure boot environment, the P5 570 offers a reliable platform for mission-critical workloads.

Finally, the IBM P5 570 is designed for high availability and redundancy. Features like hot-swappable components and advanced error detection and recovery mechanisms minimize downtime, making it a dependable choice for businesses that operate around the clock. Combined with its powerful hardware and versatile software support, the IBM P5 570 remains a formidable player in the high-performance server arena.