Basics of Backup Performance

Backup performance is always limited by one or more bottlenecks in the system, of which the tape drive is just one part. The goal is to make the tape drive the bottleneck; only then will the system achieve the performance figures advertised on the drive's specification sheet.

Note that backup jobs can stress hardware resources to their limits in a way that normal application load never does. This puts pressure on the rest of the system and can expose failures, because many components are involved in a time-critical handoff of data.

The flow of data throughout the system must be fast enough to supply the tape drive with data at its desired rate. High-speed tape drives, such as the Ultrium 960, are so fast that this can be a difficult goal to achieve. If backup performance does not match the tape drive's data sheet, the bottleneck is somewhere else in the system.

A single component, such as a 100BASE-T network, can reduce SDLT or LTO tape drive performance to a very low transfer rate (a good use case for first staging the data on disk and then backing it up to tape).
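The effect of a single slow component can be sketched as a min-over-stages calculation. The throughput figures below are illustrative assumptions (100BASE-T has a theoretical peak of about 12.5 MB/s; 80 MB/s is the Ultrium 960's native rate), not measurements:

```python
# Sketch: end-to-end backup throughput is bounded by the slowest
# component in the data path. All figures are assumed, in MB/s.
pipeline = {
    "disk_array_read": 120.0,
    "network_100base_t": 12.5,   # ~theoretical peak of 100 Mbit/s Ethernet
    "backup_server": 200.0,
    "tape_drive_native": 80.0,   # Ultrium 960 native rate
}

bottleneck = min(pipeline, key=pipeline.get)
effective = pipeline[bottleneck]
print(f"bottleneck: {bottleneck} at {effective} MB/s")
# → bottleneck: network_100base_t at 12.5 MB/s
```

With these assumed figures the tape drive runs at under a sixth of its native rate, which is why staging to disk over the network and then writing to tape locally can pay off.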

All components must be considered to estimate the theoretical backup performance. Practical performance data can only be obtained from benchmarks.

Factors that critically affect backup speed:

Multiplexing

This allows better bandwidth utilization of the tape drive during backup, but it can slow down restore performance because the data is interleaved all the way down the tape. A single-stream restore therefore takes longer, since the other streams must also be read (and potentially discarded).
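The restore penalty can be sketched with a simple model (stream count, stream size, and tape read rate below are all hypothetical):

```python
# Sketch: restoring one stream from a multiplexed backup means roughly
# all interleaved data must pass under the head, not just the wanted stream.

streams = 4            # assumed number of interleaved backup streams
stream_size_gb = 50.0  # size of the one stream we want to restore
tape_read_mb_s = 80.0  # assumed sustained tape read rate

# Non-multiplexed restore: read only the wanted data.
t_single = stream_size_gb * 1024 / tape_read_mb_s

# Multiplexed restore: read past the other streams' blocks as well.
t_multiplexed = streams * stream_size_gb * 1024 / tape_read_mb_s

print(f"non-multiplexed: {t_single / 60:.0f} min, "
      f"multiplexed: {t_multiplexed / 60:.0f} min")
```

In this simplified model the single-stream restore time scales linearly with the number of interleaved streams.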

Disk and Tape Buffers

DP offers a set of advanced options for backup devices and disk agents. The default settings are device-based and match most backup environments. The Ultrium 960 is an exception and requires a modification as described in the chapter Tuning Recommendations.

Data File Size

The smaller the files and the more of them there are, the greater the overhead associated with backing them up. The worst-case scenario for backup is a large number of small files, because of the system overhead of file access.

Data compressibility

Incompressible data will back up more slowly than highly compressible data. JPEG files, for example, are not very compressible, whereas database files can be highly compressible. The accepted standard for quoting tape backup specifications revolves around an arbitrary figure of 2:1 compressible data.
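The difference is easy to demonstrate in software; here zlib merely stands in for the drive's hardware compression, and the sample data is synthetic:

```python
# Sketch: already-compressed data (like JPEG) resembles random bytes and
# compresses poorly, while repetitive data (like many database pages)
# compresses very well.

import os
import zlib

random_like = os.urandom(1 << 20)                 # stand-in for JPEG data
repetitive = b"customer record\x00" * (1 << 16)   # stand-in for database data

ratio_random = len(random_like) / len(zlib.compress(random_like))
ratio_repetitive = len(repetitive) / len(zlib.compress(repetitive))

print(f"random-like: {ratio_random:.2f}:1, repetitive: {ratio_repetitive:.2f}:1")
```

Random-like data yields a ratio of about 1:1 (or slightly worse), so the drive falls back to its native rate, while repetitive data can far exceed the quoted 2:1 figure.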

Disk Array Performance

It is often overlooked that you cannot put data on tape any faster than you can read it from disk. From a disk array access perspective, backup is more sequential than random in nature. Disk array performance depends on the number of disks, the RAID configuration, the number of Fibre Channel ports used to access the array, the available queue depth, and so on. HP has written several utilities that read data from disk arrays and report a performance value. These enable users to determine the throughput of their disk arrays when operating in backup mode and performing the file-system traversals typical of this activity.
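In the same spirit as such utilities (but unrelated to them), a crude sequential-read throughput check can be sketched as follows; the file path, test size, and block size are arbitrary assumptions, and cached reads will inflate the result:

```python
# Crude sketch of a sequential-read throughput measurement.

import os
import time

def sequential_read_mb_s(path, block_size=1 << 20):
    """Read a file sequentially and return observed throughput in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(block_size):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / (1 << 20) / elapsed

# Example: create a small test file and measure it.
with open("testfile.bin", "wb") as f:
    f.write(os.urandom(8 << 20))
print(f"{sequential_read_mb_s('testfile.bin'):.0f} MB/s")
os.remove("testfile.bin")
```

A meaningful measurement needs a file much larger than the host's cache and should be compared against the tape drive's native rate: if the array cannot sustain that rate, no amount of drive tuning will help.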

These Performance Assessment Tools (PAT utilities) can be downloaded from http://www.hp.com/support/pat. The performance tools are also embedded within the HP industry-leading Library and Tape Tools diagnostics, which can be downloaded from http://www.hp.com/support/tapetools.
