data in the damaged DIMM is used until the system is powered down and the DIMM replaced.
Certain restrictions exist with respect to placement and size of memory DIMMs when memory mirroring is enabled. These are discussed in “Memory mirroring” on page 67.
Chipkill memory
Chipkill is integrated into the
If a memory chip error does occur, Chipkill is designed to automatically take the inoperative memory chip offline while the server keeps running. The memory controller provides memory protection similar in concept to disk array striping with parity, writing the memory bits across multiple memory chips on the DIMM. The controller is able to reconstruct the “missing” bit from the failed chip and continue working as usual.
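The parity analogy in the preceding paragraph can be illustrated with a minimal sketch. It models only the disk-array-style concept of striping with a single XOR parity value; it is not the actual Chipkill ECC algorithm, which the text does not describe. The function names and sample values are hypothetical.

```python
from functools import reduce
from operator import xor


def stripe_with_parity(data_chips):
    """Return the data values plus one XOR parity value (conceptual model only)."""
    parity = reduce(xor, data_chips, 0)
    return list(data_chips) + [parity]


def reconstruct(stripe, failed_index):
    """Rebuild the value held by the failed chip from all surviving values."""
    survivors = [v for i, v in enumerate(stripe) if i != failed_index]
    return reduce(xor, survivors, 0)


# Hypothetical example: four data "chips", each holding a 4-bit value.
stripe = stripe_with_parity([0b1011, 0b0110, 0b1100, 0b0001])

# Pretend chip 2 has failed; its contents are recovered from the others.
recovered = reconstruct(stripe, failed_index=2)
```

Because XOR is its own inverse, combining every surviving value (including the parity value) yields exactly the missing one, which is why a single failed chip can be taken offline without losing data in this model.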
Chipkill support is provided in the memory controller and implemented using standard ECC DIMMs, so it is transparent to the operating system.
In addition, to maintain the highest levels of system availability, if a memory error is detected during POST or memory configuration, the server can automatically disable the failing memory bank and continue operating with reduced memory capacity. You can manually
Memory mirroring, Chipkill, and Memory ProteXion provide multiple levels of redundancy to the memory subsystem. Combining Chipkill with Memory ProteXion allows the x440 to tolerate up to two memory chip failures per memory port (8 DIMMs). An
1. The first failure detected by the Chipkill algorithm on each port doesn’t generate a Light Path Diagnostics error, since Memory ProteXion recovers from the problem automatically.
2. Each memory port could then sustain a second chip failure without shutting down.
3. Provided that memory mirroring is enabled, the third chip failure on that port would send the alert and take the DIMM offline, but keep the system running out of the redundant memory bank.
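The three-step sequence above can be summarized as a small decision function. This is a sketch of the behavior as the list states it, not of any real firmware interface: the `Outcome` type, the field names, and `handle_chip_failure` are hypothetical. Where the text does not say what happens, the sketch returns `None` rather than guessing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    mechanism: str            # feature that absorbs the failure
    alert: Optional[bool]     # Light Path Diagnostics alert; None = not stated in the text
    dimm_offline: bool        # whether the DIMM is taken offline


def handle_chip_failure(failure_number: int, mirroring_enabled: bool) -> Optional[Outcome]:
    """Map the nth chip failure on one memory port to the outcome described in the list."""
    if failure_number < 1:
        raise ValueError("failure_number starts at 1")
    if failure_number == 1:
        # Step 1: Memory ProteXion recovers automatically, with no Light Path error.
        return Outcome("Memory ProteXion", alert=False, dimm_offline=False)
    if failure_number == 2:
        # Step 2: the port sustains a second failure without shutting down.
        return Outcome("Chipkill", alert=None, dimm_offline=False)
    if failure_number == 3 and mirroring_enabled:
        # Step 3: alert is sent, DIMM goes offline, system runs from the mirrored bank.
        return Outcome("memory mirroring", alert=True, dimm_offline=True)
    # Cases the text does not cover (for example, a third failure without mirroring).
    return None


for n in (1, 2, 3):
    print(n, handle_chip_failure(n, mirroring_enabled=True))
```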