Intel® NetStructure™ ZT 7102 Chassis Management Module

Built-In Self Test

At OS start-up, the CMM will read the contents of BIST results in the reserved event log area and store the errors as entries in the CMM SEL. This will allow the CMM application to take the appropriate action based upon the SEL events as a result of RedBoot BIST tests. If there is not enough space to log the events in the CMM SEL, no results are logged to the CMM SEL.

The BIST event log will be erased only after the event log is stored into the CMM SEL. Event strings for BIST events are listed in Section 11.0, “Health Event Strings” on page 67.
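The transfer rules above (log all BIST events or none, and erase the BIST log only after the events are stored) can be illustrated with a minimal Python sketch. The `Sel` class, its slot-based capacity, and `transfer_bist_events` are hypothetical names invented for this illustration; they are not the CMM's actual interfaces. The sketch also assumes that the BIST log is kept when the transfer does not happen.

```python
from dataclasses import dataclass, field


@dataclass
class Sel:
    """Hypothetical stand-in for the CMM SEL with a fixed number of slots."""
    capacity: int
    entries: list = field(default_factory=list)

    def free_slots(self):
        return self.capacity - len(self.entries)


def transfer_bist_events(bist_log, sel):
    """Copy every BIST event into the SEL, or none if they do not all fit.

    Returns True when the events were stored and the BIST log was erased.
    """
    if len(bist_log) > sel.free_slots():
        # Not enough room: log nothing and leave the BIST log untouched
        # (assumption: the log is retained when nothing is stored).
        return False
    sel.entries.extend(bist_log)
    # Erase the BIST log only after the SEL holds every event.
    bist_log.clear()
    return True
```

Checking the free space before writing anything is what keeps the SEL from ending up with a partial set of BIST results.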

6.7 OS Flash Corruption Detection and Recovery Design

The OS is responsible for flash content integrity at runtime. Flash monitoring under the OS environment can be divided into two parts: monitoring static images and monitoring dynamic images.

Static images refer to the RedBoot image, FPGA image, and BlueCat image in flash. These images should not change throughout the lifetime of the CMM unless they are purposely updated or become corrupted. The CRCs for these images are written into flash when the images are uploaded.

The dynamic image refers to the OS Flash File System (JFFS2). This image changes dynamically throughout the runtime of the OS.

6.7.1 Monitoring the Static Images

A static test is run at specified time intervals during CMM operation. The interval is specified on the command line in the CMM startup script. The default interval is 24 hours. A value of zero turns off the test. The static test reads each static image (RedBoot, FPGA, BlueCat), calculates the image checksum, and compares it with the checksum stored in the RedBoot configuration area (FIS). If the checksum test fails, the error is logged to the CMM SEL.
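The comparison step of the static test can be sketched as follows. This is an illustration only: it assumes a CRC-32 as computed by Python's `zlib.crc32`, which may differ from the checksum algorithm the CMM actually uses, and the function name and dictionary inputs are hypothetical stand-ins for reading the images and the FIS entries from flash.

```python
import zlib


def check_static_images(images, stored_checksums):
    """Return the names of images whose computed checksum does not match.

    images: maps an image name to its raw bytes (hypothetical input).
    stored_checksums: maps an image name to the checksum recorded at upload.
    """
    failed = []
    for name, data in images.items():
        computed = zlib.crc32(data)  # assumption: CRC-32, for illustration
        if computed != stored_checksums.get(name):
            # In the CMM, a mismatch would be logged to the SEL.
            failed.append(name)
    return failed
```

A caller would run this once per configured interval and skip it entirely when the interval is zero.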

6.7.2 Monitoring the Dynamic Images

For monitoring the dynamic images, the CMM leverages the corruption detection ability from the JFFS(2) flash file system. At OS start-up, the CMM executes an initialization script to mount the JFFS(2) flash partitions (/etc and /home). If a flash corruption is detected, an event will be logged to the CMM SEL.

During normal OS operation, flash corruption during file access can also be detected by the JFFS(2) and/or the flash driver. If a flash corruption is detected, an event will be logged to the CMM SEL.
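The start-up portion of this behavior can be sketched in Python under loud assumptions. The sketch treats a mount failure, raised as `OSError` by a caller-supplied `mount_fn`, as the corruption signal; how JFFS(2) actually reports corruption on the CMM is not described here. The function name and the event text are hypothetical placeholders, not the event strings listed in Section 11.0.

```python
def mount_flash_partitions(mount_points, mount_fn, sel_events):
    """Try to mount each partition and record an event for each failure.

    mount_fn: hypothetical callable that raises OSError when the file
    system reports corruption while mounting the given mount point.
    sel_events: list standing in for the CMM SEL.
    """
    mounted = []
    for mount_point in mount_points:
        try:
            mount_fn(mount_point)
            mounted.append(mount_point)
        except OSError as exc:
            # Placeholder text only; real event strings are defined elsewhere.
            sel_events.append(f"flash corruption detected on {mount_point}: {exc}")
    return mounted
```

Passing the mount operation in as a parameter keeps the sketch independent of any particular platform call.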

6.7.3 CMM Failover

If a critical error, such as a flash corruption, occurs on the active CMM during normal OS operation, the standby CMM is checked to see whether it is in a healthier state. If it is, a failover occurs.
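The failover decision can be sketched as a single comparison. How the CMM measures "healthier" is not defined in this section, so the sketch assumes a hypothetical measure: a count of outstanding critical errors on each CMM, where a lower count means a healthier state.

```python
def should_fail_over(active_critical_errors, standby_critical_errors):
    """Decide whether the standby CMM should take over.

    Both arguments are hypothetical health measures (counts of critical
    errors). Failover happens only when the active CMM has a critical
    error and the standby CMM is strictly healthier.
    """
    if active_critical_errors == 0:
        return False  # nothing critical happened on the active CMM
    return standby_critical_errors < active_critical_errors
```

Requiring the standby to be strictly healthier avoids a failover to a CMM that is in the same or a worse condition.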

Technical Product Specification
