HP c-Class Performance Tuning manual Benchmarking through a filesystem

Benchmarking through a filesystem

Issue

Although a filesystem is necessary for most storage deployments, it adds work to every access to data stored on the IO Accelerator. The additional lookups reduce maximum system performance compared with the results achieved by benchmarking directly on the block device.

Solution

When running micro-benchmarks to vet system performance, benchmark by accessing the block device directly. For application testing, use the filesystem that the application natively works with, and test several filesystems where more than one is available. HP testing has shown XFS to be reasonably fast under most circumstances, and HP recommends XFS on Linux.
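The comparison between raw device access and access through a filesystem can be sketched as a single sequential-read timer that accepts either kind of path. This is a minimal illustration, not HP's test method: the function name and defaults are hypothetical, it does not use direct I/O (so the OS page cache can inflate repeated runs), and a dedicated tool such as fio is a better fit for real measurements.

```python
import time


def read_throughput(path, block_size=1024 * 1024, max_bytes=256 * 1024 * 1024):
    """Sequentially read up to max_bytes from path.

    path may be a regular file on a mounted filesystem or a block device
    node (reading a device node normally requires root privileges).
    Returns a tuple: (bytes_read, megabytes_per_second).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffer; the OS page cache still applies.
    with open(path, "rb", buffering=0) as f:
        while total < max_bytes:
            chunk = f.read(min(block_size, max_bytes - total))
            if not chunk:
                break  # end of file or device
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = (total / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
    return total, rate
```

To compare the two access paths, call `read_throughput` once with the device node and once with a large file on the mounted filesystem, using the same `block_size` and `max_bytes` for both. Both paths are placeholders that depend on the system under test.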

Slow performance using RAID5 on Linux

Issue

The native Linux implementation of RAID5 is verified to have performance issues with the IO Accelerator. The Linux RAID5 implementation is believed to use a single thread on a single CPU for the parity calculations that RAID5 requires, which can limit throughput on fast storage.

Solution

An alternate RAID stack might provide better performance. Alternatively, IO Accelerators can be configured in a RAID10 arrangement, which requires no parity calculations.
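To make the cost of parity concrete: RAID5 parity is the byte-wise XOR of the data blocks in a stripe, so every stripe written requires a pass over all of its data. The sketch below illustrates only that arithmetic. It is not the Linux RAID5 code, and the function names are hypothetical.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_parity(list(surviving_blocks) + [parity])
```

Because the work grows with the amount of data written, a parity computation confined to one thread can become the bottleneck when the underlying device is very fast. RAID10 avoids the computation entirely by mirroring and striping.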

Using cp and other system utilities

Issue

Most traditional system utilities, such as cp and rsync, were built with slow legacy storage in mind. They do not extract the performance from the IO Accelerator that well-tuned applications do.

This is not to say that the IO Accelerator does not work well with standard utilities. It is still much faster than traditional storage using the same utilities, and additional performance benefits will be available in the future as these utilities are optimized for high-performance storage.

Solution

Avoid using traditional system utilities for general benchmarking purposes, as they are not a good representation of peak performance.

ext4 in Kernel.org 2.6.33 or earlier might silently corrupt data when discard (trim) is enabled

CAUTION: HP does not support the use of ext4 in Kernel.org 2.6.33 or earlier. Ext4 in Kernel.org 2.6.33 or earlier might silently corrupt data when discard is enabled.

The ext4 filesystem in Kernel.org kernels 2.6.33 and earlier contains a bug in which the data in a portion of a file might be improperly discarded (set to all 0x00) under some workloads. Use kernel version 2.6.34 or newer to avoid this issue.
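A quick way to spot the risky combination is to compare the running kernel release against 2.6.34 and look for ext4 mounts that carry the discard option. The following is a rough sketch with hypothetical helper names. It assumes the `/proc/mounts` field layout, and a version comparison is only a heuristic because distribution kernels may carry backported fixes.

```python
import re


def parse_kernel_version(release):
    """Extract (major, minor, patch) from a release string such as '2.6.33-5-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def ext4_discard_mounts(mounts_text):
    """Return mount points of ext4 filesystems mounted with the 'discard' option.

    mounts_text is assumed to use the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type == "ext4" and "discard" in options.split(","):
            hits.append(mount_point)
    return hits


def affected_mounts(release, mounts_text):
    """List ext4+discard mounts if the kernel predates 2.6.34, else an empty list."""
    if parse_kernel_version(release) >= (2, 6, 34):
        return []
    return ext4_discard_mounts(mounts_text)
```

On a live Linux system the inputs could come from `platform.release()` and the contents of `/proc/mounts`. A non-empty result means at least one ext4 filesystem is mounted with discard on a kernel older than 2.6.34.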
