Chapter 9. Troubleshooting

A sector repair message indicates the presence of grown defects on a particular drive. While typical modern disk drives are designed to allow several hundred grown defects, pay special attention to any drive in a unit that begins to generate sector repair messages, because this may indicate that the drive is beginning to fail. You may wish to replace the drive, especially if the number of sector repair errors exceeds 3 per month.
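The 3-per-month guideline above can be checked with a short script once sector repair messages have been collected. The following is a minimal sketch, not a 3ware tool: the `(date, drive_id)` event shape, the drive names, and the reading of "per month" as a calendar month are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date

# Guideline from the text: more than 3 sector repair errors per month.
REPAIRS_PER_MONTH_LIMIT = 3


def drives_to_review(events, limit=REPAIRS_PER_MONTH_LIMIT):
    """Return sorted drive ids that exceed `limit` repairs in any calendar month.

    `events` is an iterable of (date, drive_id) pairs. This shape is
    hypothetical; it is not a 3ware log format.
    """
    # Count repairs per (drive, year, month) bucket.
    per_month = Counter((drive, day.year, day.month) for day, drive in events)
    return sorted({drive for (drive, _, _), n in per_month.items() if n > limit})


# Example data (made up): port0 has 4 repairs in May, port1 has 3.
events = [
    (date(2024, 5, 2), "port0"),
    (date(2024, 5, 9), "port0"),
    (date(2024, 5, 17), "port0"),
    (date(2024, 5, 28), "port0"),
    (date(2024, 5, 3), "port1"),
    (date(2024, 5, 12), "port1"),
    (date(2024, 5, 30), "port1"),
    (date(2024, 6, 1), "port0"),
]

flagged = drives_to_review(events)
print(flagged)
```

With this sample data only `port0` is flagged, because `port1` reaches but does not exceed the limit.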

0024 Sbuf memory test failed

The 3ware RAID controller, as part of its data integrity features, performs diagnostics on its internal RAM devices. Once a day, a non-destructive test is performed on the cache memory. Failure of the test indicates a failure of a hardware component on the 3ware RAID controller. This message is sent to notify you of the problem. If the controller is still under warranty, contact 3ware Technical Support for a replacement controller.

0025 Cache flush failed; some data lost

To improve performance, the 3ware RAID controller firmware includes a caching layer. For write commands, this means that the controller acknowledges a write operation as complete before the data is committed to disk. If the 3ware RAID controller cannot commit the data to the media after it has acknowledged the write to the host, this message is posted.

Typically, the Cache Flush Failed notification would be an indication of a catastrophic failure of the drives in the unit, such as loss of power to multiple drives in a unit.
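The write-back behavior described above can be illustrated with a toy model. This is only a sketch of the general technique, assuming a hypothetical `Disk` class and `WriteBackCache` class; it does not represent the 3ware firmware or any 3ware API.

```python
class Disk:
    """Hypothetical backing store that can go offline (for example, power loss)."""

    def __init__(self):
        self.blocks = {}
        self.online = True

    def commit(self, lba, data):
        if not self.online:
            raise OSError("drive unavailable")
        self.blocks[lba] = data


class WriteBackCache:
    """Toy write-back cache: acknowledge first, commit to disk later."""

    def __init__(self, disk):
        self.disk = disk
        self.dirty = {}  # blocks acknowledged to the host but not yet on disk

    def write(self, lba, data):
        # The host gets its acknowledgement before the data reaches the disk.
        self.dirty[lba] = data
        return "ack"

    def flush(self):
        """Commit all dirty blocks; return the LBAs that could not be committed."""
        lost = []
        for lba, data in list(self.dirty.items()):
            try:
                self.disk.commit(lba, data)
            except OSError:
                # Already acknowledged, but never reached the media.
                lost.append(lba)
            # In this toy model the block leaves the cache either way.
            del self.dirty[lba]
        return lost


disk = Disk()
cache = WriteBackCache(disk)
status = cache.write(7, b"payload")  # acknowledged immediately
disk.online = False                  # simulate the drive disappearing
lost = cache.flush()
if lost:
    print("Cache flush failed; some data lost:", lost)
```

The point of the sketch is the ordering: the acknowledgement happens in `write`, so a failure in `flush` affects data the host already believes is safe.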

To troubleshoot the reasons for the failure, collect the logs for your system and contact 3ware Technical Support at http://www.3ware.com/support/index.asp. For information on what error logs are and how to collect them, see http://www.3ware.com/KB/article.aspx?id=12278.

0026 Drive ECC error reported

This message may be sent when a drive returns an ECC error response to a 3ware RAID controller command. The message may or may not be associated with a host command. Internal operations such as Verify post this message whenever drive ECC errors are detected.

Drive ECC errors are an indication of a problem with grown defects on a particular drive. For redundant units, this typically means that dynamic sector repair would be invoked (see message “0023 Sector repair completed” on page 121). For non-redundant units (RAID 0 and degraded units), drive ECC errors result in the 3ware RAID controller returning failed status to the associated host command.
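The two outcomes described above can be summarized as a small decision function. This is a simplified sketch for illustration only: the function name, its parameters, and the `Outcome` labels are hypothetical, and it treats a unit as non-redundant exactly when it is RAID 0 or degraded, as the text does.

```python
from enum import Enum


class Outcome(Enum):
    SECTOR_REPAIR = "dynamic sector repair invoked"
    HOST_COMMAND_FAILED = "failed status returned to the host command"


def ecc_error_outcome(raid_level, degraded):
    """Return the typical outcome of a drive ECC error for a unit.

    Assumption: following the text, a unit is non-redundant if it is
    RAID 0 or if it is currently degraded.
    """
    non_redundant = raid_level == 0 or degraded
    if non_redundant:
        return Outcome.HOST_COMMAND_FAILED
    return Outcome.SECTOR_REPAIR


print(ecc_error_outcome(raid_level=5, degraded=False).value)
print(ecc_error_outcome(raid_level=0, degraded=False).value)
print(ecc_error_outcome(raid_level=1, degraded=True).value)
```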

122

3ware Serial ATA RAID Controller User Guide for the Power Mac G5
