HP c-Class Performance Tuning manual Debugging performance issues, Improperly configured benchmark

Debugging performance issues

Improperly configured benchmark

Issue

The most common obstacle to achieving the expected performance with the IO Accelerator is an improperly configured micro benchmark.

Solution

Start with the benchmarks described in the previous sections to ensure that the system performs properly with a known benchmark.

Oversubscribed bus

Issue

It is possible to install multiple IO Accelerators and other PCIe peripherals in a way that causes unequal performance from each of the drives. In networking, this is frequently called over-subscription, and it is common in all but the latest generation of systems. For high-performance peripherals on the PCIe bus, an over-subscribed topology can drastically reduce performance, especially if the drives are set up in a RAID configuration.

An example of a balanced topology, which maximizes performance, is a system that has 16 PCIe lanes connected from IO Accelerators to a switch chip, and 16 lanes from the switch chip to the root complex.

An example of an over-subscription condition, which decreases performance, is a system that has 12 PCIe lanes connected from IO Accelerators to a switch chip, but only 8 lanes from the switch chip to the root complex.

Other PCIe devices, such as high performance network cards and graphic cards, can also create an over-subscription condition.
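The lane arithmetic in the two examples above can be sketched as a short Python calculation. This is an illustrative sketch only, not part of any HP tool: the function names are hypothetical, and it assumes every lane on both sides of the switch runs at the same speed, so that lane counts alone determine the ratio.

```python
from fractions import Fraction


def oversubscription_ratio(device_lanes: int, uplink_lanes: int) -> Fraction:
    """Ratio of lanes on the device side of a switch to lanes on the root-complex side.

    Hypothetical helper; assumes all lanes run at the same speed.
    """
    if device_lanes <= 0 or uplink_lanes <= 0:
        raise ValueError("lane counts must be positive")
    return Fraction(device_lanes, uplink_lanes)


def is_oversubscribed(device_lanes: int, uplink_lanes: int) -> bool:
    """True when the devices can offer more traffic than the uplink can carry."""
    return oversubscription_ratio(device_lanes, uplink_lanes) > 1


# The two topologies described in the text.
for label, device_lanes, uplink_lanes in [
    ("balanced", 16, 16),
    ("over-subscribed", 12, 8),
]:
    ratio = oversubscription_ratio(device_lanes, uplink_lanes)
    print(
        f"{label}: {device_lanes} device lanes, {uplink_lanes} uplink lanes, "
        f"ratio {float(ratio):.2f}, over-subscribed: "
        f"{is_oversubscribed(device_lanes, uplink_lanes)}"
    )
```

Under the same assumption, a ratio of 1.5 means that when all drives are busy at once, the uplink can carry at most two thirds of the bandwidth the drives could otherwise deliver together.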

Solution

To verify that there are no bandwidth bottlenecks in the PCIe bus, run the fio-pci-check utility and look for errors.

NOTE: The fio-pci-check utility is not fully functional on all operating systems.

Issue

A related common performance problem, which is harder to diagnose, occurs when the PCIe lanes are run off the south bridge. Running off the south bridge is inherently slower than running off the north bridge. This can create an oversubscribed configuration that the fio-pci-check utility might not be able to diagnose.

Solution

Diagnosing this issue is beyond the scope of this document. For assistance in determining if your system suffers from this problem, run the fio-bugreport utility and collect its output together with the results of the system-vetting benchmarks.
