
1.5.4 Bonding module

The Linux kernel provides network interface aggregation capability through the bonding driver.
The bonding driver is device independent, although device-specific aggregation drivers exist as
well. It supports the IEEE 802.3ad link aggregation specification in addition to its own
load-balancing and fault-tolerance modes, and it can be used to achieve both higher availability
and better network performance. Refer to the kernel documentation in
Documentation/networking/bonding.txt for details.
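As a brief illustration, the following sketch shows one common way to set up a bonded interface
on a 2.6 kernel. The interface names, IP address, and bonding mode are examples only, and the
exact configuration file (shown here as /etc/modprobe.conf) differs between distributions.

   # /etc/modprobe.conf (example entries; file location varies by distribution)
   alias bond0 bonding
   options bonding mode=balance-rr miimon=100   # round-robin balancing, link check every 100 ms

   # Bring up the bond device and enslave two physical interfaces
   # (the address and interface names are examples)
   ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
   ifenslave bond0 eth0 eth1
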
1.6 Understanding Linux performance metrics

Before we look at the various tuning parameters and performance measurement utilities in the
Linux operating system, it makes sense to discuss the available metrics and their meaning with
regard to system performance. Because Linux is an open source operating system, a large
number of performance measurement tools are available. The tool you ultimately choose
depends on your personal preference and on the amount of data and detail you require. Even
though numerous tools are available, all performance measurement utilities report the same
underlying metrics, so understanding these metrics enables you to use whatever utility you
come across. Therefore, we cover only the most important metrics; many more detailed values
are available that might be useful for analysis beyond the scope of this paper.

1.6.1 Processor metrics

- CPU utilization
This is probably the most straightforward metric. It describes the overall utilization per
processor. On IBM System x architectures, if the CPU utilization exceeds 80% for a
sustained period of time, a processor bottleneck is likely.
- User time
Depicts the CPU percentage spent on user processes, including nice time. High values in
user time are generally desirable because, in this case, the system performs actual work.
- System time
Depicts the CPU percentage spent on kernel operations, including IRQ and softirq time. High
and sustained system time values can point you to bottlenecks in the network and driver
stack. A system should generally spend as little time as possible in kernel mode.
- Waiting
Total amount of CPU time spent waiting for an I/O operation to occur. Like the blocked
value, a system should not spend too much time waiting for I/O operations; otherwise you
should investigate the performance of the respective I/O subsystem.
- Idle time
Depicts the CPU percentage the system was idle waiting for tasks.
- Nice time
Depicts the CPU percentage spent running user processes whose priority has been changed
with nice, that is, re-niced processes executing at an adjusted priority.
- Load average
The load average is not a percentage, but the rolling average (typically reported over 1, 5,
and 15 minutes) of the sum of the following (see the example that follows):
– the number of processes in the run queue waiting to be processed
– the number of processes waiting for an uninterruptible task to complete
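Most of the processor metrics listed above are derived from counters that the kernel exposes
through the /proc file system. The following sketch shows where the raw values can be read;
the field layout is that of 2.6 kernels, and monitoring utilities such as top and vmstat simply
convert these counters into the percentages described above.

   # Per-CPU time counters in USER_HZ ticks: user, nice, system, idle,
   # iowait, irq, softirq (and steal on newer kernels)
   head -1 /proc/stat

   # The 1-, 5-, and 15-minute load averages, followed by the number of
   # currently runnable processes, the total number of processes, and the
   # PID of the most recently created process
   cat /proc/loadavg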