Write IOPS test on Linux:

$ fio --filename=/dev/fioa --direct=1 --rw=randwrite --bs=4k --size=5G --numjobs=64 --runtime=10 --group_reporting --name=file1
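In this command, --filename=/dev/fioa targets the IO Accelerator block device directly, --direct=1 bypasses the page cache so the card rather than system memory is measured, --rw=randwrite issues random writes, --bs=4k uses a 4K block size, --numjobs=64 runs 64 concurrent jobs, --runtime=10 caps the run at 10 seconds, and --group_reporting aggregates the per-job statistics into a single summary.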

These tests are also available as fio job input files, which can be requested from HP support (http://www.hp.com/go/support).
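The contents of the HP job files are not reproduced here, but as a rough sketch, a job file equivalent to the command above might look like the following (the file name write-iops.fio and the split between the [global] and [file1] sections are illustrative assumptions, not the contents of the HP files):

; write-iops.fio -- hypothetical job file mirroring the command-line test above
[global]
filename=/dev/fioa
direct=1
bs=4k
size=5G
numjobs=64
runtime=10
group_reporting

[file1]
rw=randwrite

It would be run with $ fio write-iops.fio.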

The latest expected performance numbers for your card type can be found in the HP PCIe IO Accelerator for ProLiant Servers Data Sheet (http://h20195.www2.hp.com/V2/GetPDF.aspx/4AA0-4235ENW.pdf). Your results should exceed those on the data sheet. The sample IOPS test uses a 4K block size and 64 threads; the sample bandwidth test uses a 1MB block size and four threads. For multi-card runs, aggregate IOPS are calculated by summing the per-card bandwidth per second and dividing the total by the block size. For sample output from each of the tests and how to validate that your system is performing properly, see "Write bandwidth test (on page 7)." The key data points in the output are the aggregate bandwidth (bw=) and I/O rate (iops=) figures.
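To see the bandwidth-to-IOPS arithmetic at work, take the sample read IOPS output below: 1,038,929 I/Os issued over a 10,036 ms run works out to 1,038,929 ÷ 10.036 s ≈ 103,500 IOPS, matching the reported iops=104K. Equivalently, dividing the roughly 414 MiB/s of bandwidth by the 4 KiB block size (414 × 256 ≈ 106,000) lands in the same range, with the small difference attributable to how fio averages per-job bandwidth.

As a sketch, a bandwidth test matching the parameters above could be built from the IOPS command by swapping in a 1MB block size and four jobs (the exact flags HP uses for its bandwidth test are not shown in this section, so treat this as an illustration):

$ fio --filename=/dev/fioa --direct=1 --rw=write --bs=1m --size=5G --numjobs=4 --runtime=10 --group_reporting --name=file1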

Be sure to run the system-vetting tests in the order they appear in the following section. The initial write bandwidth test might need to be run twice, with a short pause between the runs, to set up the card for best performance, as in the sketch below.
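A minimal sketch of such a double run, assuming the illustrative bandwidth-test command above (the 30-second pause is an arbitrary choice, since the guide does not specify a duration):

$ fio --filename=/dev/fioa --direct=1 --rw=write --bs=1m --size=5G --numjobs=4 --runtime=10 --group_reporting --name=file1
$ sleep 30   # short pause; duration is an assumption
$ fio --filename=/dev/fioa --direct=1 --rw=write --bs=1m --size=5G --numjobs=4 --runtime=10 --group_reporting --name=file1

If two runs are needed, the second run's results are the ones to compare against the data sheet.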

Write bandwidth test

Read IOPS test on Linux:

file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=sync, iodepth=1
Starting 64 processes
Jobs: 64 (f=64): [rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr] [100.0% done] [420482/0 kb/s] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=27861
  read : io=4,058MiB, bw=414MiB/s, iops=104K, runt= 10036msec
    clat (usec): min=44, max=36,940, avg=620.03, stdev=50.53
    bw (KiB/s) : min=5873, max=15664, per=1.55%, avg=6560.25, stdev=54.21
  cpu : usr=0.50%, sys=4.01%, ctx=1163152, majf=0, minf=704
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued r/w: total=1038929/0, short=0/0
     lat (usec): 50=0.01%, 100=0.43%, 250=11.05%, 500=35.74%, 750=22.88%
     lat (usec): 1000=14.99%
     lat (msec): 2=14.88%, 4=0.02%, 10=0.01%, 50=0.01%

Run status group 0 (all jobs):
