Figure 5. ECC Memory Operation

As the data is read from memory, the ECC circuit performs the scan again and compares the resulting pattern with the pattern stored in the check bits. If a single-bit error has occurred (the most common form of error), the scan will always detect it, automatically correct it, and record its occurrence. In this case, system operation is not affected.

The scan will also detect all double-bit errors, though they are much less common. It cannot correct them. With a double-bit error, the ECC unit detects the error, records its occurrence in NVRAM, and then halts the system to avoid data corruption. The data in NVRAM can then be used to isolate the defective component.
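The read-side behavior described above (correct any single-bit error, detect but not correct any double-bit error) can be illustrated with a short Python sketch. The guide does not say which ECC algorithm the controller uses; the sketch assumes a textbook extended Hamming code with 64 data bits and 8 check bits, and the names encode, decode, and the status strings are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: a textbook extended Hamming (72,64) code.
# The actual algorithm in the IBM controller is not given in this guide.

CHECK_POSITIONS = [1, 2, 4, 8, 16, 32, 64]          # 7 Hamming check bits
DATA_POSITIONS = [p for p in range(1, 72) if p not in CHECK_POSITIONS]
# Bit 0 holds an overall parity bit, giving 8 check bits in total.


def _syndrome(word):
    """XOR together the positions (1..71) of all bits that are set."""
    syndrome = 0
    for pos in range(1, 72):
        if (word >> pos) & 1:
            syndrome ^= pos
    return syndrome


def encode(data):
    """Return the 72-bit codeword for a 64-bit data word."""
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (data >> i) & 1:
            word |= 1 << pos
    # Set the check bits so that the syndrome of the whole word is zero.
    syndrome = _syndrome(word)
    for cp in CHECK_POSITIONS:
        if syndrome & cp:
            word |= 1 << cp
    # Overall parity bit: make the total number of 1 bits even.
    if bin(word).count("1") % 2:
        word |= 1
    return word


def decode(word):
    """Return (data, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    syndrome = _syndrome(word)
    overall_odd = bin(word).count("1") % 2 == 1
    if not overall_odd:
        if syndrome != 0:
            # Parity looks fine but the syndrome does not: two bits flipped.
            return None, "uncorrectable"
        status = "ok"
    else:
        if syndrome > 71:
            return None, "uncorrectable"
        # Single-bit error: the syndrome is the position of the bad bit
        # (0 means the overall parity bit itself was hit).
        word ^= 1 << syndrome
        status = "corrected"
    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> pos) & 1:
            data |= 1 << i
    return data, status
```

In this sketch, a "corrected" status corresponds to the case where the error is fixed and logged without affecting operation, and "uncorrectable" corresponds to the case where the system would record the error and halt.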

In order to implement an ECC memory system, you need an ECC memory controller and ECC SIMMs. ECC SIMMs differ from standard memory SIMMs in that they have additional storage space to hold the check bits.

The IBM PC Servers 500 and 720 have ECC circuitry and provide support for ECC memory SIMMs to give protection against memory errors.

1.4.3 Error Correcting Code-Parity Memory (ECC-P)

Previous IBM servers such as the IBM Server 85 were able to use standard memory to implement what is known as ECC-P. ECC-P takes advantage of the fact that a 64-bit word needs 8 parity bits (one parity bit per byte of data) in order to detect single-bit errors. Since an ECC algorithm can also protect 64 bits of data using 8 check bits, IBM designed a memory controller which implements the ECC algorithm using the standard memory SIMMs.
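The storage arithmetic behind ECC-P can be seen in a small Python sketch of conventional byte parity: a 64-bit word carries one parity bit per byte, 8 bits in all, which is the same number of extra bits an ECC algorithm needs for 64 bits of data. The function names and the choice of even parity are assumptions made for illustration; they are not taken from the guide.

```python
# Illustrative sketch only: per-byte even parity over a 64-bit word.

def byte_parity(word):
    """Return 8 parity bits, one per byte of a 64-bit word."""
    bits = 0
    for byte_index in range(8):
        byte = (word >> (8 * byte_index)) & 0xFF
        # Parity bit is 1 when the byte holds an odd number of 1 bits.
        bits |= (bin(byte).count("1") & 1) << byte_index
    return bits


def parity_error_bytes(word, stored_parity):
    """Return the indexes of bytes whose parity no longer matches."""
    mismatch = byte_parity(word) ^ stored_parity
    return [i for i in range(8) if (mismatch >> i) & 1]
```

Parity used this way only reports which byte is affected and cannot say which bit to fix, and two flipped bits in the same byte cancel out and go unnoticed. Spending the same 8 bits on ECC check bits is what lets the controller correct single-bit errors.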

10 NetWare Integration Guide

IBM SG24-4576-00 manual Error Correcting Code-Parity Memory ECC-P, ECC Memory Operation