Intel NetStructure® MPCBL0001 High Performance Single Board Computer


3.10.2 Error Reporting

The MCH handles error reporting from the memory subsystem. Errors consist of correctable and uncorrectable bit errors. The ECC algorithm corrects any number of bit errors contained within a single 4-bit nibble and detects any number of bit errors contained within two 4-bit nibbles. The MCH communicates these errors to the ICH3 via special cycles over the hub link interface. Each special cycle tells the ICH3 that an MCH-detected error has occurred and which type of event to generate; an SERR, SMI, or SCI event can be selected. Status for the reported errors is then available in the MCH DRAM_FERR (first error) and DRAM_NERR (next error) status registers. Refer to the MCH data sheet for more information (see Appendix A, “Reference Documents”).
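The nibble-based coverage described above can be illustrated with a short sketch. It is not the MCH's ECC algorithm; it only classifies a pattern of flipped bits by how many 4-bit nibbles it touches. The function names and the assumption that nibbles are aligned on 4-bit boundaries of the error mask are hypothetical.

```python
def nibbles_touched(error_mask: int) -> int:
    """Count the aligned 4-bit nibbles that contain at least one flipped bit."""
    count = 0
    while error_mask:
        if error_mask & 0xF:
            count += 1
        error_mask >>= 4
    return count


def classify(error_mask: int) -> str:
    """Classify an error pattern using the coverage stated in the text.

    Assumption: nibbles are aligned on 4-bit boundaries of the mask.
    """
    touched = nibbles_touched(error_mask)
    if touched == 0:
        return "none"
    if touched == 1:
        return "correctable"             # any errors within one nibble
    if touched == 2:
        return "uncorrectable-detected"  # any errors within two nibbles
    return "not-guaranteed"              # beyond the stated coverage
```

For example, a mask of 0x0F0 (four flipped bits in one nibble) classifies as correctable, while 0x18 (two adjacent bits that straddle a nibble boundary) classifies as detected but uncorrectable.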

Correctable memory errors generate an SMI and are logged via IPMI in the System Event Log (SEL). Non-correctable errors first generate an SMI (which logs a SEL entry) and then an NMI.
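The ordering of events for the two error classes can be written out as a minimal sketch. The enum and function names are hypothetical; only the event order comes from the text.

```python
from enum import Enum
from typing import List


class DramError(Enum):
    CORRECTABLE = "correctable"
    UNCORRECTABLE = "uncorrectable"


def events_for(error: DramError) -> List[str]:
    """Return the events raised for a memory error, in order."""
    # Both classes raise an SMI, and the SMI path logs a SEL entry via IPMI.
    events = ["SMI", "SEL"]
    if error is DramError.UNCORRECTABLE:
        # Uncorrectable errors are escalated to an NMI after logging.
        events.append("NMI")
    return events
```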

Each P64H2 device reports the PCI errors that occur on the buses to which it is attached. These consist of the PCI error assertions of the PERR# or SERR# signals. The errors are reported by sending the DO_SERR special cycle to the MCH on the Hub Interface. The MCH forwards the error to the ICH3, which generates the appropriate error condition to the processor(s) such as NMI, SMI, or SCI.

PCI address parity errors are considered catastrophic and may abort further data transfers by the P64H2 if that is the programmed response. Parity/ECC is checked on both the Hub Interface and PCI bus transactions. PCI data parity errors are considered less severe and allow transactions to continue. Data parity errors cause the “Detected Parity Error” status to be logged and, if enabled, the DO_SERR special cycle is transmitted. In a transaction where a data error occurs, the data being forwarded to the next bus is “poisoned” to ensure the error follows the data to its destination. Poisoned data has bad parity or multi-bit ECC errors introduced before being forwarded to the next bus.
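The parity form of poisoning can be sketched as follows. The sketch assumes the standard PCI convention of even parity across AD[31:0], C/BE[3:0]#, and PAR; the function names are hypothetical, and the P64H2's actual hardware behavior is defined in its data sheet, not here.

```python
from typing import Tuple


def even_parity_bit(ad: int, cbe: int) -> int:
    """PAR value that makes the number of 1s across AD, C/BE#, and PAR even."""
    ones = bin(ad & 0xFFFFFFFF).count("1") + bin(cbe & 0xF).count("1")
    return ones & 1


def forward(ad: int, cbe: int, upstream_error: bool) -> Tuple[int, int, int]:
    """Forward a data phase to the next bus, poisoning it if it arrived bad."""
    par = even_parity_bit(ad, cbe)
    if upstream_error:
        par ^= 1  # deliberately wrong parity so the error follows the data
    return ad, cbe, par


def parity_ok(ad: int, cbe: int, par: int) -> bool:
    """Check performed by the receiver on the next bus."""
    return even_parity_bit(ad, cbe) == par
```

A receiver that runs parity_ok on a poisoned data phase sees a parity error, so the fault stays attached to the data all the way to its destination.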

PCI assertions of the SERR# signal also result in the DO_SERR special cycle being generated on the hub interface when enabled. Other potential causes for a DO_SERR special cycle include:

Parity errors on the target bus during a write.

A master timeout on a delayed transaction.

The occurrence of a PCI master abort cycle.
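The causes listed above can be modeled as a set of flags gated by an enable mask. This is only a sketch: the flag names and the idea of a single per-cause enable mask are assumptions for illustration, not the P64H2 register layout.

```python
from enum import Flag, auto


class SerrCause(Flag):
    # Hypothetical names for the causes described in the text.
    PCI_SERR = auto()                # SERR# asserted on the PCI bus
    DATA_PARITY = auto()             # PCI data parity error
    TARGET_WRITE_PARITY = auto()     # parity error on the target bus during a write
    DELAYED_MASTER_TIMEOUT = auto()  # master timeout on a delayed transaction
    MASTER_ABORT = auto()            # PCI master abort cycle


def should_send_do_serr(pending: SerrCause, enabled: SerrCause) -> bool:
    """Send DO_SERR only if at least one pending cause is also enabled."""
    return bool(pending & enabled)
```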

Refer to the P64H2 Data Sheet, section 4.9, for more information on error handling. For details on obtaining this document, see Appendix A, “Reference Documents.”

The ICH3 device can report PCI and hub link errors directly to the processors. When a PERR# or SERR# occurs on the ICH3 local PCI bus, the ICH3 can be programmed to generate an NMI or SMI. The ICH3 also fields messages from the MCH and its attached hub devices to indicate errors to the processors on their behalf. The messages may request that SMI#, SCI, NMI, or SERR# be asserted. Software must check the MCH and attached hub devices to determine the exact cause of the error. Refer to the ICH3 Data Sheet for more information on error handling and generation. For details on obtaining this document, see Appendix A, “Reference Documents.”
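Because the ICH3 only relays the event, a handler has to poll each upstream device to find the source. The sketch below shows that polling loop in the abstract. The reader callables stand in for platform-specific register reads, and treating any nonzero status as "error logged" is an assumption; the real bit definitions are in the MCH, P64H2, and ICH3 data sheets.

```python
from typing import Callable, Dict, List


def find_error_sources(status_readers: Dict[str, Callable[[], int]]) -> List[str]:
    """Return the names of the devices whose status register reports an error.

    Assumption: a nonzero status value means an error has been logged.
    """
    return [name for name, read in status_readers.items() if read() != 0]


# Usage with stand-in readers (values are made up for illustration).
readers = {
    "MCH DRAM_FERR": lambda: 0x0,
    "MCH DRAM_NERR": lambda: 0x0,
    "P64H2 bus status": lambda: 0x8000,
}
sources = find_error_sources(readers)
```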

Technical Product Specification


Order #273817

 
