HP manual Monitoring IO Accelerator health, Nand flash and component failure, Health metrics

Page 54

Monitoring IO Accelerator health

NAND flash and component failure

The IO Accelerator is a highly fault-tolerant storage subsystem that provides many levels of protection against component failure and the inherently lossy nature of solid-state storage. However, as in all storage subsystems, component failures might occur.

When enough data blocks have been retired because of errors, the NAND flash media is considered worn out. By selecting NAND flash media appropriate to the hosted application and proactively monitoring device age and health, you can ensure reliable performance over the intended product life.

Health metrics

The IO Accelerator driver manages block retirement using predetermined retirement thresholds. The IO Accelerator Management Tool and the fio-status utility show a health indicator that starts at 100 and counts down to 0. As the indicator crosses certain thresholds, the following actions are taken.

At the 10% healthy threshold, a one-time warning is issued. For ways to capture this alarm event, see "Health monitoring techniques" (on page 54).

At 0%, the device is considered unhealthy. It enters write-reduced mode, which somewhat prolongs its lifespan so data can be safely migrated. In this state, the IO Accelerator behaves normally except for the reduced write performance.
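The two percentage thresholds described above can be sketched as a small classification function. This is only an illustration of the documented behavior, not the driver's implementation: the names `HealthAction` and `action_for_health` are hypothetical, and treating each threshold as inclusive is an assumption. The later read-only and failure modes are not tied to a specific percentage in the text, so they are left out.

```python
from enum import Enum


class HealthAction(Enum):
    """Actions named in the text; the enum itself is a hypothetical helper."""
    NONE = "no action"
    WARN = "one-time warning"
    WRITE_REDUCED = "write-reduced mode"


WARNING_THRESHOLD = 10   # percent, from the text
UNHEALTHY_THRESHOLD = 0  # percent, from the text


def action_for_health(percent: float) -> HealthAction:
    """Map a health indicator (100 counting down to 0) to the documented action.

    Assumption: a threshold counts as crossed when the value is at or below it.
    """
    if not 0 <= percent <= 100:
        raise ValueError(f"health must be between 0 and 100, got {percent}")
    if percent <= UNHEALTHY_THRESHOLD:
        return HealthAction.WRITE_REDUCED
    if percent <= WARNING_THRESHOLD:
        return HealthAction.WARN
    return HealthAction.NONE
```

A monitoring script could call `action_for_health` on each reading and raise its own alert the first time the result changes, since the device itself issues the 10% warning only once.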

At some point after the 0% threshold, the device enters read-only mode. Any attempt to write to the IO Accelerator causes an error. Some file systems might require special mount options to mount a read-only block device, beyond specifying that the mount should be read-only. For example, under Linux, ext3 requires the mount options -o ro,noload. The noload option tells the file system not to try to replay the journal.
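As a rough sketch of the ext3 case above, the following helper assembles a read-only mount command. The function name, the device path /dev/fioa, and the mount point /mnt/iodrive are placeholders chosen for illustration; only the ro and noload options come from the text. The helper builds the argument list without running it, so it can be inspected before being passed to a process launcher with suitable privileges.

```python
import shlex
from typing import List


def readonly_mount_command(device: str, mount_point: str,
                           fs_type: str = "ext3") -> List[str]:
    """Build (but do not run) a mount command for a read-only block device."""
    options = ["ro"]
    if fs_type == "ext3":
        # noload tells ext3 not to replay the journal, which would need writes.
        options.append("noload")
    return ["mount", "-t", fs_type, "-o", ",".join(options), device, mount_point]


# Hypothetical device and mount point, for illustration only.
command = readonly_mount_command("/dev/fioa", "/mnt/iodrive")
print(shlex.join(command))
```

Other file systems might need different options, so the ext3 branch should be read as one example rather than a general rule.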

Treat read-only mode as a final opportunity to migrate data off the device, because device failure becomes more likely with continued use.

The IO Accelerator might enter failure mode. In this case, the device is offline and inaccessible. This can be caused by an internal catastrophic failure, improper firmware upgrade procedures, or device wear-out.

Health monitoring techniques

fio-status

Output from the fio-status utility shows the health percentage and drive state. These items are in bold in the following sample output.

Found 1 ioDrive in this system

Fusion-io driver version: 2.1.0 build 19032

Adapter: ioDrive

HP 160GB SLC PCIe ioDrive for ProLiant Servers, Product Number:600278-B21
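A script that captures fio-status output could begin by reading the header lines shown in the sample above. The sketch below parses only those lines; the function name `parse_fio_status_header` is hypothetical, and the patterns are inferred from this one sample rather than from a documented output format. The health and drive-state lines are not included in the sample, so they are not parsed here.

```python
import re
from typing import Dict, Optional, Union

# Header lines copied from the sample output in this section.
SAMPLE = """\
Found 1 ioDrive in this system
Fusion-io driver version: 2.1.0 build 19032
"""


def parse_fio_status_header(text: str) -> Dict[str, Optional[Union[int, str]]]:
    """Extract device count and driver version from captured output.

    Assumption: the header lines look like the sample above.
    Fields that are not found are returned as None.
    """
    found = re.search(r"^Found (\d+) ioDrive", text, re.MULTILINE)
    driver = re.search(r"^Fusion-io driver version: (\S+) build (\d+)",
                       text, re.MULTILINE)
    return {
        "device_count": int(found.group(1)) if found else None,
        "driver_version": driver.group(1) if driver else None,
        "driver_build": int(driver.group(2)) if driver else None,
    }


print(parse_fio_status_header(SAMPLE))
```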
