[Figure: 72-bit DIMM word divided into 64 bits of data, 6 bits of ECC, and 2 spare bits]

Figure 1-11 Memory ProteXion

If memory scrubbing detects a chip failure on the DIMM, the memory controller can re-route data around the failed chip through the spare bits (similar to the hot-spare drive of a RAID array). It does this automatically, without issuing a Predictive Failure Analysis (PFA) or Light Path Diagnostics alert to the administrator. After the second DIMM failure, PFA and Light Path Diagnostics alerts would occur on that DIMM as normal.
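The re-routing described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not how the memory controller is actually implemented: the class `SteeredWord`, the method `steer_around`, and the idea that a failed chip takes out two bit lanes of a 72-bit word are all hypothetical, chosen so that one failure exactly consumes the two spare bits shown in Figure 1-11.

```python
# Toy model of spare-bit steering on one 72-bit word (illustrative only).
DATA_BITS, ECC_BITS, SPARE_BITS = 64, 6, 2
LOGICAL_BITS = DATA_BITS + ECC_BITS          # 70 bits that carry data + ECC
WORD_BITS = LOGICAL_BITS + SPARE_BITS        # 72 physical bit lanes


class SteeredWord:
    """Hypothetical model: logical bits are mapped onto physical lanes."""

    def __init__(self):
        # lane_of[i] is the physical lane currently carrying logical bit i.
        self.lane_of = list(range(LOGICAL_BITS))
        # Lanes 70 and 71 start out as unused spares.
        self.free_spares = list(range(LOGICAL_BITS, WORD_BITS))
        self.lanes = [0] * WORD_BITS
        self.failed = set()

    def write(self, bits):
        if len(bits) != LOGICAL_BITS:
            raise ValueError("expected 70 logical bits")
        for i, bit in enumerate(bits):
            lane = self.lane_of[i]
            # Assumption: a failed lane is stuck at 0.
            self.lanes[lane] = 0 if lane in self.failed else bit

    def read(self):
        return [self.lanes[lane] for lane in self.lane_of]

    def steer_around(self, bad_lanes):
        """Remap logical bits off the failed lanes onto free spares.

        Returns True if the failure was absorbed silently, False if there
        are not enough spares left (the point at which an alert is raised).
        """
        bad = set(bad_lanes)
        affected = [i for i, lane in enumerate(self.lane_of) if lane in bad]
        self.failed.update(bad)
        self.free_spares = [s for s in self.free_spares if s not in bad]
        if len(affected) > len(self.free_spares):
            return False
        for i in affected:
            self.lane_of[i] = self.free_spares.pop(0)
        return True


word = SteeredWord()
pattern = [1] * LOGICAL_BITS
first = word.steer_around([10, 11])   # first failure: absorbed by the spares
word.write(pattern)                   # data is rewritten through the new mapping
second = word.steer_around([20, 21])  # second failure: no spares remain
```

In this model the first failure is handled without any alert, and the second returns `False`, which corresponds to the point where PFA and Light Path Diagnostics alerts would be raised.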

Memory scrubbing

Memory scrubbing is an automatic daily test of all the system memory that detects and reports memory errors that might be developing before they cause a server outage.

Memory scrubbing and Memory ProteXion work in conjunction with each other, but they do not require memory mirroring (as described below) to be enabled to work properly.

When a bit error is detected, memory scrubbing determines whether the error is recoverable. If it is, Memory ProteXion is enabled and the data that was stored in the damaged locations is rewritten to a new location. The error is then reported so that preventive maintenance can be performed. As long as there are enough good locations for the server to operate properly, no further action is taken other than recording the error in the error logs.

If the error is not recoverable, memory scrubbing sends an error message to Light Path Diagnostics, which then turns on the appropriate lights and LEDs to guide you to the defective DIMM. If memory mirroring is enabled, the mirrored copy of the data in the damaged DIMM is used until the system is powered down and the DIMM is replaced.
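The decision flow in the two paragraphs above can be summarized as a short sketch. The names `Action` and `handle_scrub_error` are hypothetical and exist only for illustration; the real logic runs in the memory controller and system firmware, and this code models only the outcomes described in the text.

```python
from enum import Enum, auto
from typing import List


class Action(Enum):
    """Outcomes named in the text (labels are illustrative)."""
    REWRITE_TO_NEW_LOCATION = auto()   # Memory ProteXion moves the data
    LOG_ERROR = auto()                 # error recorded for later maintenance
    LIGHT_PATH_ALERT = auto()          # LEDs guide you to the defective DIMM
    USE_MIRRORED_COPY = auto()         # run from the mirror until replacement


def handle_scrub_error(recoverable: bool, mirroring_enabled: bool) -> List[Action]:
    """Return the actions taken when scrubbing finds a bit error."""
    if recoverable:
        # Recoverable: data is rewritten and the error is only logged.
        return [Action.REWRITE_TO_NEW_LOCATION, Action.LOG_ERROR]
    # Not recoverable: alert the administrator through Light Path Diagnostics.
    actions = [Action.LIGHT_PATH_ALERT]
    if mirroring_enabled:
        # The mirrored copy is used until the DIMM is replaced.
        actions.append(Action.USE_MIRRORED_COPY)
    return actions
```

Note that memory mirroring only affects the unrecoverable branch, which matches the statement that scrubbing and Memory ProteXion do not require mirroring to be enabled.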

IBM eServer xSeries 440 Planning and Installation Guide
